Article

Harmonic Phenology Mapping: From Vegetation Indices to Field Delineation

Faculty of Geodesy, University of Zagreb, Kačićeva 26, 10000 Zagreb, Croatia
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(7), 1011; https://doi.org/10.3390/rs18071011
Submission received: 4 February 2026 / Revised: 16 March 2026 / Accepted: 25 March 2026 / Published: 27 March 2026

Highlights

What are the main findings?
  • A phenology-encoded HSV composite enables zero-shot parcel delineation without retraining.
  • Among the eleven tested indices, performance is tiered: MSAVI, EVI, EVI2, and SAVI yield the best results, with errors also being dependent on parcel area.
What are the implications of the main findings?
  • A simple, scalable and interpretable methodology is defined for operations; further improvements are easily implementable.
  • The same harmonic descriptors used for the segmentation also support crop mapping.

Abstract

Operational agricultural monitoring in the Central European lowlands requires timely parcel boundaries; however, unmarked field edges produce minimal spectral contrast in single-date imagery. Previous works demonstrated that harmonic NDVI encoding enables zero-shot field delineation using foundation models, but the influence of the spectral index choice on temporal boundaries remained unquantified. This study systematically evaluates eleven vegetation indices (NDVI, GNDVI, NDRE, EVI, EVI2, SAVI, MSAVI, NDWI, CIg, CIre, and NDYVI) within a fixed harmonic phenology encoding pipeline. A one-year PlanetScope time series (15 × 15 km, Slavonija, Croatia) was decomposed via annual sinusoidal regression to extract per-pixel phase, amplitude, and mean parameters. These harmonic descriptors were mapped to HSV colour channels and segmented using the Segment Anything Model without fine-tuning. Official agricultural parcels (PAAFRD, 2025) provided ground truth for pixel-wise, object-wise, and size-stratified evaluation. Performance stratified into three tiers based on object-wise metrics. Soil-adjusted and enhanced-greenness indices (MSAVI, EVI, EVI2, and SAVI) achieved F1 = 0.51–0.52 and mIoU = 0.70–0.71, statistically outperforming standard ratio formulations (NDVI: F1 = 0.49) and chlorophyll indices (CIg, CIre: F1 = 0.45–0.47). Pixel-wise scores remained compressed (F1 > 0.88 across all indices), indicating consistent interior coverage but index-dependent boundary precision. Error analysis revealed scale-dependent patterns: merging dominated small parcels (<10,000 m²), while fragmentation increased with parcel size. Results demonstrate that spectral formulation is a systematic design factor in phenology-based delineation, with soil background correction and dynamic range compression improving seasonal trajectory separability.
The harmonic parameters generated by this framework provide feature-ready input for crop classification, suggesting that integrated boundary extraction and crop mapping workflows merit further investigation.

1. Introduction

Accurate delineation of agricultural field boundaries underpins land administration, subsidy control, yield monitoring, crop statistics, and environmental compliance. However, boundaries are often weakly expressed, particularly in smallholder and strip-field systems where hedgerows, tracks, or ditches are discontinuous or absent. Recent surveys of agricultural parcel and boundary delineation (APBD) highlight a surge in techniques, from edge and region-growing pipelines to modern deep learning networks, but persistent challenges in fragmented landscapes and a lack of temporal cues in single-date inputs are also noted [1,2]. Although satellite time series are now widely available, the use of seasonal phenology as a primary signal for boundary detection remains limited in practice. Traditional field delineation methods evolved from manual digitisation and semi-automated workflows based on aerial photogrammetry and cadastral maps, which are accurate but labour-intensive, slow, and costly, especially in fragmented smallholder landscapes or rapidly changing areas [1]. With satellite remote sensing, large-area timely monitoring became possible, and modern approaches are broadly divided into pixel-based, edge-based, region-based, and hybrid methods that fuse boundaries with regional homogeneity [1]. Recent deep learning systems still rely on spatial–spectral contrasts from a single date, with phenological dynamics treated as an auxiliary input rather than central.
Foundation models for segmentation have transformed the field by enabling zero-shot mask proposals without task-specific training. The Segment Anything Model (SAM) is promptable and broadly transferable, trained on more than one billion masks across eleven million images; it can generalize to new, unseen domains, including Earth observation, with one of its input modes [3]. In parallel, geospatial adaptations have emerged: frameworks that tailor inputs, tiling, and non-maximum suppression (NMS) to parcel geometry report competitive object-wise scores across heterogeneous agricultural regions [4], while early-season delineation seeks operational timelines with limited temporal depth [5].
Harmonic analysis summarizes seasonal trajectories into mean, amplitude, and phase using a compact wave function, enabling the interpolation of sparse or irregular acquisitions without extensive gap filling [6]. Studies using Landsat and MODIS data have shown that harmonic regression robustly captures phenology and can be mapped to downstream tasks where timing (phase) and seasonal range (amplitude) have agronomic significance [6,7]. When these descriptors are mapped to perceptual channels, ambiguous adjacent parcels can be separated through temporal contrast, even when single-date reflectance is similar [6,7].
Recent work by our group [8] addressed the temporal encoding challenge through perceptual colour mapping: harmonic Normalized Difference Vegetation Index (NDVI) parameters (phase, amplitude, and mean) were projected into cylindrical colour spaces (Hue–Saturation–Value—HSV, Hue–Whiteness–Blackness—HWB, and Luminance–Chroma–Hue—LCH) and segmented via SAM. That proof of concept, tested on a compact Northern Croatia site (5 × 5 km), established the viability of training-free, phenology-based delineation but examined only a single spectral formulation. The question of whether index physics—soil sensitivity, red-edge incorporation, and dynamic range—systematically affects boundary detectability when encoded through identical temporal decomposition and colour space projection remained open.
Spectral index choice matters mechanistically. Soil-adjusted formulations (MSAVI, SAVI) suppress within-field heterogeneity that could trigger spurious fragmentation. Enhanced-greenness variants (EVI, EVI2) normalize amplitude distributions across canopy densities, potentially stabilizing saturation values in colour space encodings. Red-edge indices (NDRE, CIre) respond to nitrogen variability, which may either sharpen crop-type boundaries or amplify management noise. Yet no study has isolated index formulation as the sole experimental variable within a fixed harmonic-to-segmentation pipeline, controlling for decomposition method, colour space, and segmentation architecture.
We address this gap through a controlled comparison across eleven vegetation indices spanning greenness, soil adjustment, red-edge, and water formulations. By holding the harmonic model, colour encoding, segmentation backbone, and evaluation protocol constant, we attribute performance differences directly to index-specific characteristics. Testing on a larger, more heterogeneous agricultural landscape (15 × 15 km, Slavonija, Croatia) enables size-stratified error analysis and the statistical validation of index rankings.
Many new SAM-derived delineation frameworks provide additional context. FieldSeg combines Sentinel temporal composites with SAM and careful patch management, reporting scalable extraction across eight regions and clarifying practical factors such as input normalization, mask filtering, and tiling overlaps [4]. Other studies enhance boundaries using detail-enhancement or edge-aware filters on SAM embeddings (Boundary SAM) or employ hybrid prompts (DeepLabV3+) with SAM blocks (fabSAM) to reduce merging and fragmentation errors common in dense strip fields [9,10]. Meanwhile, Delineate Anything reframes the task as instance segmentation trained on the FBIS-22M dataset, demonstrating resolution-agnostic generalization and strong metrics, setting a high standard for supervised approaches [2]. This highlights that input representation and post-processing remain crucial, even with strong backbones [2,4,9,10].
Considering all this, we present a method-development study that keeps the segmenter, tiling, thresholds, and evaluation protocol fixed, while varying the vegetation index used for harmonic to HSV recolouring. This study addresses the open question of how the choice of vegetation index affects boundary detectability when phenology is summarized via harmonic descriptors and mapped to a cylindrical colour space. The objective is to determine whether index-specific encodings produce systematically different delineation outcomes under an otherwise fixed, training-free segmentation pipeline, and to quantify any differences using standard pixel-wise and object-wise metrics.

2. Materials and Methods

2.1. Study Area and PlanetScope Imagery

The 15 × 15 km study area was chosen to complement the Northern Croatia site, providing contrasting parcel-size distribution and pedological characteristics. Whereas the earlier compact area of interest highlighted smallholder fragmentation, this Slavonija site contains intensive arable blocks interspersed with peri-urban strips, allowing for stratified evaluation by parcel area (0–10,000 m², 10,000–100,000 m², and >100,000 m²). To further assess generalization, we additionally evaluated the workflow on a second area of interest (AOI) in Northern Croatia representing a smallholder-dominated landscape; the results are provided in Appendix C.
No ancillary layers were used; all inputs were derived from PlanetScope satellite imagery. The dense PlanetScope revisit increases the likelihood of capturing asynchronous phenology among neighbouring fields, which is expected to enhance temporal contrast at parcel boundaries in the HSV composites.
A one-year PlanetScope SuperDove time series from October 2024 to October 2025 was assembled. Scenes were cloud-free according to provider flags and passed visual quality control. No additional per-pixel haze or shadow correction was applied beyond the provider quality masks and manual screening; therefore, residual thin haze, undetected cloud edges, and cast shadows may persist in parts of the imagery and may introduce localized artefacts in the vegetation index time series and the subsequent harmonic descriptors. Missing observations were not interpolated; harmonic parameters were fitted using the available valid samples only. The images were received orthorectified, atmospherically corrected, and harmonized to Sentinel-2 data as surface reflectance at native 3 m resolution. All scenes were delivered in EPSG:32634. The scenes were obtained through Planet’s Education and Research Program.
The spatial extent of the AOI is shown in Figure 1, using a PlanetScope RGB composite. The landscape is predominantly agricultural, organized into elongated, rectangular field units. A transportation corridor traverses the AOI diagonally, dividing it into two areas. The northern sector has a finer spatial grain, consisting of smaller agricultural parcels, linear settlement patterns aligned with road networks, a meandering river, and scattered woodland features. The southern sector is dominated by larger arable fields, with forested patches occurring intermittently. Vegetated surfaces are represented by darker green while bare or sparsely vegetated soils appear as lighter, reddish-brown tones. Field boundaries are mainly inferred from spectral and tonal discontinuities associated with crop phenological stages and management practices rather than from clearly defined physical boundaries.
SuperDove’s eight bands—coastal blue, blue, green, green I, yellow, red, red-edge (RE), and near-infrared (NIR)—enable the calculation of indices that exploit the green and red-edge regions in addition to classic red and NIR contrasts, supporting chlorophyll-sensitive formulations [11,12,13]. Instrument characterisations and validation studies confirm the added spectral utility of the yellow and red-edge bands in terrestrial and aquatic applications, motivating tests beyond traditional indices towards green and red-edge variants that may strengthen boundary contrast where crops diverge in canopy structure or chemistry [11,12,13]. The band set is used for per-index calculations to build the harmonic descriptors used to create a false colour composite. Leveraging all bands provides an index-agnostic basis for subsequent harmonic recolouring and segmentation.

2.2. Indices Calculation

Vegetation indices provide compact, interpretable summaries of canopy status. In this study, eleven indices are evaluated: the Normalized Difference Vegetation Index (NDVI) [14], the Green Normalized Difference Vegetation Index (GNDVI) [14], the Normalized Difference Red-Edge Index (NDRE) [15], the Enhanced Vegetation Index (EVI) [16] and the Two-band Enhanced Vegetation Index (EVI2) [16], the Soil-Adjusted Vegetation Index (SAVI, L = 0.5) [17] and the Modified Soil-Adjusted Vegetation Index (MSAVI) [18], the Normalized Difference Water Index (NDWI) [19], the Chlorophyll Index—Green (CIg) [15] and the Chlorophyll Index—Red-Edge (CIre) [15], and the Normalized Difference Yellow Vegetation Index (NDYVI) [20]. Collectively, these indices span complementary sensitivities: structural greenness and canopy vigour, pigment and nitrogen proxies via green and red-edge leverage, robustness to soil background in high biomass, explicit soil suppression in sparse cover, separation of water influences, and band-specific contrasts that exploit yellow and red-edge sensitivity. The diversity of indices is expected to yield different boundary expressions at field edges once their harmonic phase, amplitude, and mean are encoded into the HSV colour space.
All vegetation indices were computed per date, per pixel from PlanetScope surface reflectance on a common 3 m grid. The same scene set was used for every index. Spectral symbols follow SuperDove naming: BLUE, GREEN, GREEN-I, YELLOW, RED, RED-EDGE (RE), and NIR. Formulas and sources for all indices are provided in Table 1.
These eleven indices serve as inputs for the harmonic analysis, with each index modelled independently to extract phase, amplitude, and mean per pixel before HSV recolouring and segmentation.
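As an illustration of the per-date, per-pixel index computation, the following minimal sketch implements four of the eleven indices in their commonly published forms. The exact formulations used in this study are those listed in Table 1, so the expressions below (in particular the closed-form MSAVI variant and the EVI2 coefficients) should be read as assumptions rather than as the study's definitive formulas. The function names and the `safe_ratio` helper are hypothetical, and reflectance is assumed to be scaled to the 0–1 range.

```python
import numpy as np


def safe_ratio(num, den):
    """Element-wise num / den that returns NaN where the denominator is zero."""
    num = np.asarray(num, dtype=float)
    den = np.asarray(den, dtype=float)
    out = np.full(np.broadcast(num, den).shape, np.nan)
    np.divide(num, den, out=out, where=den != 0)
    return out


def ndvi(nir, red):
    # Normalized difference of NIR and red reflectance.
    return safe_ratio(nir - red, nir + red)


def savi(nir, red, soil_factor=0.5):
    # Soil-adjusted index with L = 0.5, as stated in the text.
    return (1.0 + soil_factor) * safe_ratio(nir - red, nir + red + soil_factor)


def evi2(nir, red):
    # Two-band EVI in its commonly cited form (assumed coefficients).
    return 2.5 * safe_ratio(nir - red, nir + 2.4 * red + 1.0)


def msavi(nir, red):
    # Closed-form MSAVI (often called MSAVI2); assumed variant.
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    root = np.sqrt((2.0 * nir + 1.0) ** 2 - 8.0 * (nir - red))
    return (2.0 * nir + 1.0 - root) / 2.0


# Two example pixels: one vegetated, one with equal NIR and red reflectance.
nir = np.array([0.5, 0.4])
red = np.array([0.1, 0.4])
print(ndvi(nir, red), savi(nir, red), evi2(nir, red), msavi(nir, red))
```

Applying each function to every acquisition in the temporal stack would yield one index time series per pixel, which is the input expected by the harmonic analysis.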

2.3. Harmonic Analysis

The harmonic analysis follows the framework validated in Papić et al. (2025) [8] and is presented as an end-to-end processing pipeline in Figure 2, applied independently to each of the eleven indices in Table 1. This section provides implementation details required for reproducibility across all workflow stages. Starting from PlanetScope imagery, scenes are quality-screened (provider cloud flags and manual quality control), reprojected to a common grid, and assembled into a temporal stack. Vegetation indices are computed per date and are normalized temporally. The constructed time series are summarized by fitting a single annual harmonic to obtain three descriptors, phase (timing), amplitude (seasonality), and mean (baseline), which are then mapped to the HSV colour space and exported as a recoloured false colour composite optimized for field boundary delineation and used as inputs for SAM. The resulting instance masks are vectorized, filtered, and evaluated using pixel-wise and object-wise accuracy metrics.
Per-pixel index trajectories $x_k$ ($k = 1, \dots, n$ acquisitions) undergo two-stage modelling. First, a linear trend is removed by regression on time, expressed in years since the first observation, which suppresses slow cross-scene drift that could bias a single sinusoid. The time $s_k$ in years since the first image is defined as:
$$ s_k = \frac{t_k - t_1}{365.2422}, $$
where $t_1$ is the ordinal day of the first valid acquisition, $t_k$ is the ordinal day of acquisition $k$, and 365.2422 is the mean tropical year length in days. The model is fit as:
$$ x_k = \beta_0 + \beta_1 s_k + \varepsilon_k, $$
where $\beta_0$ is the intercept, $\beta_1$ is the linear trend, and $\varepsilon_k$ is a zero-mean random error term that captures unexplained variability at acquisition $k$. Parameters $\beta_0$ and $\beta_1$ are estimated using ordinary least squares (OLS), and the detrended residual at acquisition $k$ is then formed:
$$ r_k = x_k - \left(\hat{\beta}_0 + \hat{\beta}_1 s_k\right). $$
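Seasonal timing is represented on a centred calendar angle. The mean ordinal day $\bar{t}$ is defined over the set of valid dates $\nu$, which contains $n_v$ acquisitions, and the annual angular time $\theta_k$, in radians, is formulated as:
$$ \bar{t} = \frac{1}{n_v} \sum_{k \in \nu} t_k, \qquad \theta_k = \frac{2\pi}{365.2422}\left(t_k - \bar{t}\right). $$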
The detrended series is then approximated by:
$$ r_k = a_0 + a_1 \cos\theta_k + b_1 \sin\theta_k, $$
where $a_0$ is a constant bias that can remain after detrending, and $a_1$ and $b_1$ are the cosine and sine coefficients of the first harmonic. The coefficients $a_0$, $a_1$, and $b_1$ are estimated by OLS independently for each pixel and each index. Only the fundamental component is used; thus, the phase is directly interpretable on the calendar circle, and descriptors are comparable across indices. Amplitude $A$ summarizes the seasonal range; phase $\phi$, in radians, encodes the calendar timing of the seasonal maximum on the annual circle; and $\bar{x}$ is the empirical mean of the original index over the valid dates. They are derived as:
$$ A = \sqrt{a_1^2 + b_1^2}, \qquad \phi = \operatorname{atan2}(b_1, a_1), \qquad \bar{x} = \frac{1}{n_v} \sum_{k \in \nu} x_k, $$
$$ A \ge 0, \qquad \phi \in [-\pi, \pi]. $$
For $\bar{x}$, the empirical mean is used, not the harmonic intercept. Detrending and annual harmonic fitting are executed independently for each index, which preserves index-specific seasonality. A one-year index trace at each pixel is thus summarized by a single annual harmonic. To further contextualize the performance of the proposed harmonic-HSV encoding, two RGB-based baselines were also evaluated using the same SAM configuration and post-processing pipeline, and the results are shown in Appendix D.
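The two-stage fit described above can be sketched for a single pixel as follows. This is a minimal NumPy illustration under stated assumptions, not the study's implementation: the function name `harmonic_descriptors`, the use of NaN to mark missing observations, and the minimum of three valid samples are all choices made for the example.

```python
import numpy as np

YEAR_DAYS = 365.2422  # mean tropical year length in days, as in the text


def harmonic_descriptors(t_days, x):
    """Return (amplitude, phase, empirical mean) for one pixel time series.

    t_days: ordinal day of each acquisition; x: index values (NaN = missing).
    """
    t = np.asarray(t_days, dtype=float)
    x = np.asarray(x, dtype=float)
    valid = np.isfinite(x)
    t, x = t[valid], x[valid]
    if t.size < 3:  # assumed guard: too few samples for three coefficients
        return np.nan, np.nan, np.nan

    # Stage 1: remove a linear trend in years since the first valid date.
    s = (t - t.min()) / YEAR_DAYS
    design_lin = np.column_stack([np.ones_like(s), s])
    beta, *_ = np.linalg.lstsq(design_lin, x, rcond=None)
    residual = x - design_lin @ beta

    # Stage 2: fit one annual harmonic on the centred calendar angle.
    theta = 2.0 * np.pi / YEAR_DAYS * (t - t.mean())
    design_har = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
    (a0, a1, b1), *_ = np.linalg.lstsq(design_har, residual, rcond=None)

    amplitude = np.hypot(a1, b1)
    phase = np.arctan2(b1, a1)
    return amplitude, phase, x.mean()  # empirical mean, not the intercept a0


# Synthetic series: one acquisition every five days, pure cosine seasonality.
t_demo = np.arange(0, 365, 5)
theta_demo = 2.0 * np.pi / YEAR_DAYS * (t_demo - t_demo.mean())
x_demo = 0.5 + 0.3 * np.cos(theta_demo)
print(harmonic_descriptors(t_demo, x_demo))
```

The synthetic series is a pure cosine centred on the mean acquisition date, for which the linear detrend has no slope to remove, so the recovered amplitude matches the simulated value of 0.3 and the phase is close to zero.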

2.4. Perceptual Recolouring

The three harmonic descriptors, phase $\phi$, amplitude $A$, and mean $\bar{x}$, are converted into a Hue–Saturation–Value (HSV) image used as input for segmentation. The mapping follows the perceptual recolouring rationale established by Papić et al. [8], with AOI-wide scaling to preserve cross-tile consistency.
Hue $H$ encodes relative seasonal timing and is defined from the annual phase $\phi$ as:
$$ H = \frac{\phi + \pi}{2\pi}, \qquad H \in [0, 1]. $$
Because hue is cyclic, $H = 0$ and $H = 1$ coincide and represent the same timing on the annual circle.
To stabilize contrast across indices with different dynamic ranges, AOI-wide winsorization is applied to the amplitude $A$ for the current index [21]. The bounds $p_L$ and $p_H$ are defined as the 2nd and 98th percentiles of $A$ over all finite pixels for that index:
$$ p_L = \operatorname{perc}_{2}(A), \qquad p_H = \operatorname{perc}_{98}(A). $$
Subsequently, the winsorized amplitude $A^*$ is defined as:
$$ A^* = \min\left(\max\left(A, p_L\right), p_H\right), $$
which is then normalized to saturation $S$:
$$ S = \frac{A^* - p_L}{\max\left(p_H - p_L\right)}. $$
Intuitively, values below $p_L$, which have very low seasonality, are mapped to $S \approx 0$ and rendered in grey, values above $p_H$ are mapped to $S \approx 1$, and the majority of pixels are spread linearly in between. This step stabilizes indices without suppressing parcel-scale contrast.
Value $V$ is given by the empirical mean $\bar{x}$. To keep the mapping index-agnostic and robust, $\bar{x}$ is clipped to [−1, 1] and then mapped to [0, 1] as:
$$ \bar{x}^* = \min\left(\max\left(\bar{x}, -1\right), 1\right), \qquad V = \frac{\bar{x}^* + 1}{2}. $$
This preserves rank order for naturally bounded indices, e.g., NDVI, while also preventing CIg and CIre from over-brightening due to large numeric ranges.
The $(H, S, V)$ triplet is then converted to RGB using the standard HSV transform. This perceptual encoding mirrors the strategy proposed by Papić et al. [8], while generalizing it across various indices.
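The recolouring steps can be sketched as follows, assuming the three descriptors are available as equally shaped arrays for the whole AOI. The function name `harmonic_to_rgb`, the small `eps` guard against a zero percentile range, and the choice to render non-finite pixels as black are assumptions of this example and are not taken from the study.

```python
import colorsys

import numpy as np


def harmonic_to_rgb(phase, amplitude, mean, eps=1e-12):
    """Map phase/amplitude/mean rasters to an RGB array with values in [0, 1]."""
    phase = np.asarray(phase, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    mean = np.asarray(mean, dtype=float)
    finite = np.isfinite(phase) & np.isfinite(amplitude) & np.isfinite(mean)

    # Hue from phase; the modulo folds H = 1 onto H = 0 (same calendar timing).
    hue = ((phase + np.pi) / (2.0 * np.pi)) % 1.0

    # Saturation from amplitude winsorized at the AOI-wide 2nd/98th percentiles.
    p_low, p_high = np.percentile(amplitude[finite], [2, 98])
    a_star = np.clip(amplitude, p_low, p_high)
    sat = (a_star - p_low) / max(p_high - p_low, eps)  # eps guard is assumed

    # Value from the empirical mean, clipped to [-1, 1] and rescaled to [0, 1].
    val = (np.clip(mean, -1.0, 1.0) + 1.0) / 2.0

    # Assumed handling of invalid pixels: render them black.
    hue, sat, val = (np.where(finite, c, 0.0) for c in (hue, sat, val))

    # Standard HSV-to-RGB transform from the standard library, applied per pixel.
    r, g, b = np.vectorize(colorsys.hsv_to_rgb)(hue, sat, val)
    return np.stack([r, g, b], axis=-1)


phase_demo = np.array([[0.0, 0.0], [np.pi / 2, np.pi]])
amplitude_demo = np.array([[0.0, 0.1], [0.2, 0.3]])
mean_demo = np.array([[0.0, 0.5], [1.0, 2.0]])
print(harmonic_to_rgb(phase_demo, amplitude_demo, mean_demo))
```

In the demo, the pixel with the lowest amplitude is rendered in grey, while the pixel with the highest amplitude and a mean above 1 is rendered fully saturated at maximum brightness. The per-pixel `colorsys` call is slow on large rasters; a vectorized transform would be preferable in practice.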
Figure 3, Panel (a) illustrates a multitemporal vegetation-index stack: each greyscale sheet represents the same AOI on a different acquisition date. Stacking these sheets yields a one-year time series for each pixel for all indices. Figure 3, Panel (b) shows the per-pixel harmonic modelling of that series. The blue trace depicts the observed index values over time. A simple linear trend in years since the first image is removed, producing residuals. These residuals are then fitted with a single annual harmonic. Figure 3, Panel (c) depicts the mapping to the HSV colour space: the phase is mapped to the Hue so that parcels with different peak times receive different colours, while parcels with similar calendars appear in nearby Hues along the colour wheel. The amplitude is mapped to saturation: weak seasonality desaturates towards grey, while strongly seasonal crops are rendered vividly. The empirical index mean is mapped to the value, making low mean surfaces darker than high mean surfaces. Scaling is applied per index at the AOI level, using fixed saturation percentiles and conservative value clipping to keep colours consistent across tiles. The resulting output is a single three-band image per index, which is the direct input to SAM segmentation.

2.5. Segmentation with Segment Anything Model

Each HSV composite is segmented using the Segment Anything Model (SAM) in automatic, prompt-free mode. All runs use the same configuration; no fine-tuning is performed. The HSV composites are used as RGB inputs, and images are partitioned into 512 × 512 px tiles. Tiling, multi-scale crop generation, and scene reconstruction using tile-wise masks were handled by the Segment Geospatial (0.12.3) wrapper built around SAM’s Automatic Mask Generator. The hyperparameters remain unchanged from Papić et al. (2025) [8]:
crop_n_layers = 2, crop_overlap_ratio = 0.35, crop_n_points_downscale_factor = 1, points_per_side = 32, pred_iou_thresh = 0.75, stability_score_thresh = 0.80.
All runs used the Vision Transformer Huge (ViT-H) backbone, with weights sam_vit_h_4b8939.pth. Duplicates created by overlaps were suppressed using SAM’s internal non-maximum suppression and deduplication, and the results from each vegetation index input were kept separate, resulting in eleven distinct prediction sets. The instance masks were converted to polygons without post-processing. Predictions smaller than 350 m² were removed. To check whether the tiling and overlap handling introduced systematic edge effects, the performance was quantified in seam-adjacent versus interior regions. Tile borders were expanded by 16 pixels and rasterized to delineate seam zones; the remaining area was treated as interior. Ground truth and predictions were rasterized at 3 m resolution, and IoU, precision, recall, and F1 scores were computed independently in both zones to test the effects of tiling on accuracy.
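The seam-versus-interior split can be illustrated with a small raster sketch. It assumes a regular, non-overlapping 512 px tile grid aligned with the raster origin and interprets the 16-pixel expansion as 16 pixels on each side of every internal tile border; both points are assumptions, since the exact tile layout produced by the wrapper is not specified here. The function names are hypothetical.

```python
import numpy as np


def _near_internal_borders(size, tile, halfwidth):
    """Flag indices within `halfwidth` pixels of an internal tile border."""
    flags = np.zeros(size, dtype=bool)
    for border in range(tile, size, tile):
        # A border at position b separates pixel b - 1 from pixel b.
        flags[max(border - halfwidth, 0): border + halfwidth] = True
    return flags


def seam_mask(height, width, tile=512, halfwidth=16):
    """Boolean raster: True in seam zones, False in tile interiors."""
    row_flags = _near_internal_borders(height, tile, halfwidth)
    col_flags = _near_internal_borders(width, tile, halfwidth)
    return row_flags[:, None] | col_flags[None, :]


mask = seam_mask(1024, 1024)
print(mask.sum(), mask.size)  # seam pixels vs. all pixels
```

Zone-wise scores would then follow by restricting the rasterized ground truth and prediction masks to `mask` and to `~mask` before counting true positives, false positives, and false negatives.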

2.6. Validation and Accuracy Metrics

The ground-truth (GT) parcels are from the Paying Agency for Agriculture, Fisheries, and Rural Development (PAAFRD, 2025), delivered as a shapefile. Figure 4 depicts the ground-truth parcels overlaid on an RGB composite of the study area. The pixel grid used for pixel-based metrics is built from the GT extent at 3 m resolution, so all rasterizations are perfectly co-registered. Prior to evaluation, polygons with an area less than 350 m² were removed. An Intersection-over-Union (IoU)-based screening step was also applied prior to evaluation. For each predicted polygon, the maximum IoU with any GT parcel was computed, and only predictions with IoU > 0.50 were retained. This screening step is ground truth-aware and is therefore not a deployable post-processing strategy; consequently, all reported metrics should be interpreted as conditional segmentation quality for sufficiently overlapping candidates. We retain this GT-dependent screening to keep the pipeline fixed and isolate the effect of vegetation index encoding across variants. In a deployable workflow, this step would be replaced by a GT-agnostic acceptance/rejection module (e.g., a lightweight classifier trained on a small, labelled subset, shape heuristics, or thresholding), which is outside the scope of the present study. All metrics were computed over the AOI extent, and an identical evaluation workflow and threshold values were applied for each vegetation index variant.
For object-based scoring, a one-to-one matching procedure was performed between the retained predictions and the GT parcels. GT polygons were processed in order of occurrence, and candidate predictions were retrieved by querying a spatial index using bounding-box intersection as a coarse filter. Among the candidates, the polygon with the highest IoU was selected; when this value exceeded 0.50, a match was registered, and both objects were marked as used. Under this constraint, each GT parcel was matched to one prediction at most, and each prediction was allowed to participate in one match at most.
After matching, true positives (TP) were defined as predictions assigned to a GT parcel with IoU > 0.50. False positives (FP) were defined as predictions without an assigned GT parcel, and false negatives (FN) were defined as GT parcels without an assigned prediction. These counts were then used to compute the object-based performance metrics.
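$$ \mathrm{Precision} = \frac{TP}{TP + FP}, \qquad \mathrm{Recall} = \frac{TP}{TP + FN}, \qquad F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}, $$
$$ \mathrm{IoU} = \frac{TP}{TP + FP + FN}, \qquad \mathrm{mIoU} = \frac{1}{N} \sum_{n} \mathrm{IoU}_n. $$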
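The one-to-one matching and the resulting object-based scores can be sketched in pure Python. To keep the example self-contained, parcels are simplified to axis-aligned boxes `(xmin, ymin, xmax, ymax)` and all predictions are scanned instead of querying a spatial index; real parcels would need polygon intersection. Skipping already used predictions and averaging mIoU over matched pairs are assumptions of this sketch, and all function names are hypothetical.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_one_to_one(gt_boxes, pred_boxes, threshold=0.5):
    """Process GT parcels in order; match each to its best unused prediction."""
    used = set()
    matches = []  # (gt index, prediction index, IoU)
    for gi, gt in enumerate(gt_boxes):
        best_iou, best_pi = 0.0, None
        for pi, pred in enumerate(pred_boxes):
            if pi in used:
                continue
            iou = box_iou(gt, pred)
            if iou > best_iou:
                best_iou, best_pi = iou, pi
        if best_pi is not None and best_iou > threshold:
            used.add(best_pi)
            matches.append((gi, best_pi, best_iou))
    return matches


def object_scores(n_gt, n_pred, matches):
    """Object-based precision, recall, F1, IoU, and mIoU from match counts."""
    tp = len(matches)
    fp = n_pred - tp
    fn = n_gt - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    miou = sum(m[2] for m in matches) / tp if tp else 0.0  # matched pairs only
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou, "miou": miou}


gt_demo = [(0, 0, 10, 10), (20, 0, 30, 10), (40, 0, 50, 10)]
pred_demo = [(0, 0, 10, 9), (21, 0, 30, 10), (100, 100, 110, 110), (40, 0, 44, 10)]
matches_demo = match_one_to_one(gt_demo, pred_demo)
print(object_scores(len(gt_demo), len(pred_demo), matches_demo))
```

In the demo, two of the three GT boxes are matched, the third has a best IoU of 0.4 and remains a false negative, and the two unmatched predictions count as false positives.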
To quantify the performance of each index with respect to area, the previously mentioned metrics were also categorized by the area of the ground truth parcel, divided into three categories:
  • Parcels with an area less than 10,000 m²;
  • Parcels with an area between 10,000 m² and 100,000 m²;
  • Parcels with an area greater than 100,000 m².
Within each parcel-size category, the following object-based counts were defined and counted:
  • TP: GT parcels in the category that were matched to exactly one prediction with IoU > 0.50.
  • FN: GT parcels in the category for which no matching prediction was assigned.
  • FP: Predictions that were not assigned to any GT parcel.
These counts were used to calculate precision, recall, F1 score, and mean IoU. True negatives (TNs) were not reported, as the evaluation was performed in an object-based, GT-anchored manner, and TNs would correspond to background area and were not informative for the metrics considered.
Pixel-wise metrics are calculated by rasterizing both the ground truth and the prediction set for each index. The vector polygons were burnt as binary masks, where parcels had a value of 1, and the background had a value of 0. On the resulting grid, the following counts were computed:
  • TP: Cells where both GT and prediction masks were equal to 1 (correct parcel coverage).
  • FP: Cells where the GT mask was 0 and the prediction mask was 1 (non-parcel area labelled as parcel).
  • FN: Cells where the GT mask was 1, and the prediction mask was 0 (missed parcel area).
TNs are not counted, as they represent the background. Using these counts, precision, recall, F1 score, and IoU were calculated.
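The pixel-wise counts and scores can be sketched directly on the two binary rasters. The function name `pixel_scores` is hypothetical, and returning zero when a denominator is empty is an assumption of this example.

```python
import numpy as np


def pixel_scores(gt_mask, pred_mask):
    """Pixel-wise precision, recall, F1, and IoU from two binary rasters."""
    gt = np.asarray(gt_mask, dtype=bool)
    pred = np.asarray(pred_mask, dtype=bool)
    tp = int(np.sum(gt & pred))    # parcel in both masks
    fp = int(np.sum(~gt & pred))   # predicted parcel over background
    fn = int(np.sum(gt & ~pred))   # missed parcel area
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


gt_demo = np.array([[1, 1, 0], [1, 0, 0]])
pred_demo = np.array([[1, 0, 0], [1, 1, 0]])
print(pixel_scores(gt_demo, pred_demo))
```

True negatives never enter the computation, which is consistent with the background being excluded from the reported metrics.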
Fragmentation was quantified by counting the number of predicted polygons associated with each GT parcel. A GT parcel was classified as fragmented when it was linked to more than one prediction, and fragmentation statistics were reported by parcel-size category.
Additional object-level geometric errors were computed, including Global Over-Classification (GOC), Global Under-Classification (GUC), and Global Total Classification (GTC) error. Let Si denote the i-th predicted parcel and Oi denote the GT parcel with which Si had the largest area of overlap. The over-classification (OC) error was defined following [1]:
$$ OC(S_i) = 1 - \frac{\operatorname{area}(S_i \cap O_i)}{\operatorname{area}(O_i)}, $$
and the under-classification (UC) error was defined as:
$$ UC(S_i) = 1 - \frac{\operatorname{area}(S_i \cap O_i)}{\operatorname{area}(S_i)}, $$
and the total classification (TC) error as:
$$ TC(S_i) = \sqrt{\frac{OC(S_i)^2 + UC(S_i)^2}{2}}. $$
To quantify spatial overreach, omission, and overall geometric fidelity, global error means were computed as area-weighted averages across all $n$ predicted polygons:
$$ GOC = \sum_{i}^{n} w_i \, OC(S_i), \qquad GUC = \sum_{i}^{n} w_i \, UC(S_i), \qquad GTC = \sum_{i}^{n} w_i \, TC(S_i), $$
$$ w_i = \frac{\operatorname{area}(S_i)}{\sum_{k}^{n} \operatorname{area}(S_k)}. $$
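A short numerical sketch of the area-weighted error means follows. It assumes that, for every predicted polygon, the area of the prediction, the area of its best-overlapping GT parcel, and the area of their intersection have already been computed; the function name and argument names are hypothetical, and all areas are assumed to be strictly positive.

```python
import numpy as np


def global_geometric_errors(area_pred, area_gt, area_inter):
    """Area-weighted OC, UC, and TC error means over all predicted polygons."""
    s = np.asarray(area_pred, dtype=float)    # area(S_i)
    o = np.asarray(area_gt, dtype=float)      # area(O_i)
    i = np.asarray(area_inter, dtype=float)   # area(S_i ∩ O_i)

    oc = 1.0 - i / o                          # over-classification, per polygon
    uc = 1.0 - i / s                          # under-classification, per polygon
    tc = np.sqrt((oc ** 2 + uc ** 2) / 2.0)   # total classification error

    w = s / s.sum()                           # weights by predicted area
    return {
        "GOC": float(np.sum(w * oc)),
        "GUC": float(np.sum(w * uc)),
        "GTC": float(np.sum(w * tc)),
    }


# Two predictions: one perfect match and one that overshoots its GT parcel.
print(global_geometric_errors([100.0, 300.0], [100.0, 200.0], [100.0, 150.0]))
```

Because the weights depend on predicted area, large predictions dominate the global means; in the demo the second polygon carries a weight of 0.75.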
To explicitly evaluate the spatial agreement between predicted and reference parcel outlines, boundary-sensitive metrics are also computed and reported. Metrics such as IoU and F1 primarily reflect the amount of shared area between polygons and are less sensitive to local boundary displacements. For this reason, Boundary F1 (bF1) and Boundary IoU (bIoU) are included as boundary-focused complements to the main accuracy metrics. The evaluation was performed on a common raster grid derived from the reference data extent. Both the ground truth parcel layer and the predicted parcel polygons were rasterized to the same 3 m spatial resolution. After rasterization, parcel boundaries were extracted from the label images by identifying pixels located at transitions between different labels. Let $L_{GT}$ denote the rasterized ground truth label image and $L_{PR}$ the rasterized prediction label image. Their corresponding binary boundary maps are denoted by $B_{GT}$ and $B_{PR}$, where a value of 1 marks a boundary pixel, and 0 marks a non-boundary pixel. Because the exact pixel-level coincidence of two boundaries is unrealistic in parcel delineation, a tolerant boundary-matching strategy was adopted, in which both boundary masks were dilated by a buffer $\delta$. In this study, the tolerance $\delta$ was set to two pixels, which corresponds to 6 m. The buffered boundary supports are defined as:
$$ B_{GT}^{\delta} = \operatorname{dilate}(B_{GT}, \delta), \qquad B_{PR}^{\delta} = \operatorname{dilate}(B_{PR}, \delta). $$
Boundary precision and recall were computed by testing whether predicted boundary pixels fell within the tolerance neighbourhood of the ground truth boundary and vice versa. Boundary precision is defined as:
$$ P_b = \frac{\left| B_{PR} \cap B_{GT}^{\delta} \right|}{\left| B_{PR} \right|}, $$
which measures the proportion of predicted boundary pixels that lie within the tolerated neighbourhood of the reference boundary. Boundary recall is defined as:
$$ R_b = \frac{\left| B_{GT} \cap B_{PR}^{\delta} \right|}{\left| B_{GT} \right|}, $$
which measures the proportion of reference boundary pixels that are recovered within the tolerated neighbourhood of the predicted boundary. Using these two quantities, the Boundary F1 score was computed as the harmonic mean:
$$ bF1 = \frac{2 P_b R_b}{P_b + R_b}. $$
In addition to bF1, we also compute boundary Intersection-over-Union to provide a direct overlap measure between the tolerated boundary supports of the reference and prediction. Boundary IoU was defined as:
$$ bIoU = \frac{\left| B_{GT}^{\delta} \cap B_{PR}^{\delta} \right|}{\left| B_{GT}^{\delta} \cup B_{PR}^{\delta} \right|}. $$
Unlike bF1, which is derived from separate precision and recall terms, bIoU summarizes the symmetric overlap between the dilated boundary sets in a single ratio. Higher values indicate stronger geometric consistency between predicted and reference contours after accounting for the predefined tolerance. As it is based exclusively on the boundary regions, bIoU is more sensitive to contour placement than the standard IoU.
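The boundary metrics can be sketched on two small label rasters using NumPy only. Two details are assumptions of this example and are not specified in the text: a boundary pixel is any pixel whose horizontal or vertical neighbour carries a different label, and the dilation uses a square neighbourhood of radius δ. The function names are hypothetical.

```python
import numpy as np


def label_boundaries(labels):
    """Mark pixels that touch a different label horizontally or vertically."""
    lab = np.asarray(labels)
    out = np.zeros(lab.shape, dtype=bool)
    diff_h = lab[:, :-1] != lab[:, 1:]
    diff_v = lab[:-1, :] != lab[1:, :]
    out[:, :-1] |= diff_h
    out[:, 1:] |= diff_h
    out[:-1, :] |= diff_v
    out[1:, :] |= diff_v
    return out


def dilate(mask, delta):
    """Binary dilation with a (2*delta + 1) square window (assumed shape)."""
    m = np.asarray(mask, dtype=bool)
    padded = np.pad(m, delta, mode="constant", constant_values=False)
    out = np.zeros_like(m)
    height, width = m.shape
    for dy in range(2 * delta + 1):
        for dx in range(2 * delta + 1):
            out |= padded[dy:dy + height, dx:dx + width]
    return out


def boundary_scores(labels_gt, labels_pr, delta=2):
    """Boundary precision, recall, bF1, and bIoU with tolerance delta (pixels)."""
    b_gt, b_pr = label_boundaries(labels_gt), label_boundaries(labels_pr)
    d_gt, d_pr = dilate(b_gt, delta), dilate(b_pr, delta)
    p_b = np.sum(b_pr & d_gt) / max(np.sum(b_pr), 1)
    r_b = np.sum(b_gt & d_pr) / max(np.sum(b_gt), 1)
    bf1 = 2 * p_b * r_b / (p_b + r_b) if p_b + r_b else 0.0
    biou = np.sum(d_gt & d_pr) / max(np.sum(d_gt | d_pr), 1)
    return {"P_b": float(p_b), "R_b": float(r_b), "bF1": float(bf1), "bIoU": float(biou)}


# Two parcels side by side; the predicted boundary is shifted by one pixel.
gt_labels = np.ones((8, 12), dtype=int)
gt_labels[:, 6:] = 2
pr_labels = np.ones((8, 12), dtype=int)
pr_labels[:, 7:] = 2
print(boundary_scores(gt_labels, pr_labels))
```

With a one-pixel shift and a two-pixel tolerance, boundary precision and recall both remain at 1, whereas bIoU drops below 1 because the dilated supports no longer coincide, illustrating the higher sensitivity of bIoU to contour placement.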
The identical encoding, segmentation, post-processing, and evaluation (pixel-wise and object-wise) settings were applied to the second AOI to enable a cross-site comparison; results are summarized in Appendix C.

2.7. Statistical Analysis

The indices were compared on a within-parcel basis using non-parametric tests for repeated measures. For each ground truth polygon G, candidate predictions were retrieved via a spatial index and the IoU was computed with each; the maximum IoU found was recorded for that parcel-index pair, forming a per-parcel matrix with one row per parcel and one column per index. The same approach was used for the F1 score. A Friedman omnibus test was then used to assess whether at least one index exhibited a different rank distribution across parcels [22]. Subsequently, paired Wilcoxon signed-rank tests were applied to all index pairs, and the Holm correction was applied to control the family-wise error rate [22,23]. For the Friedman omnibus test, the test statistic and p-value are reported; for the Wilcoxon tests, the median difference, the lower and upper 95% confidence limits, and the Holm-adjusted p-value are reported for each pair and metric.
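The omnibus step can be illustrated with a short Python sketch that ranks the indices within each parcel and computes the Friedman statistic from the rank sums. This is a minimal version under stated assumptions: it uses the textbook formula without a tie correction, the function names are hypothetical, and the p-value (from a chi-square distribution with k − 1 degrees of freedom) is left to a statistics library such as `scipy.stats.friedmanchisquare`.

```python
def average_ranks(values):
    """Ascending ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def friedman_statistic(matrix):
    """matrix[p][k] is the best IoU of parcel p under index k; returns Q (no tie correction)."""
    n, k = len(matrix), len(matrix[0])
    rank_sums = [0.0] * k
    for row in matrix:
        for col, rank in enumerate(average_ranks(row)):
            rank_sums[col] += rank
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)


# Hypothetical per-parcel matrix: three parcels (rows) by three indices (columns).
best_iou = [
    [0.61, 0.66, 0.70],
    [0.55, 0.58, 0.64],
    [0.72, 0.75, 0.79],
]
print(friedman_statistic(best_iou))
```

Because the test operates on within-parcel ranks, it stays insensitive to parcels that are uniformly easy or hard for every index.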

3. Results

3.1. Harmonically Recoloured Outputs

Figure 5, Figure 6 and Figure 7 illustrate the harmonically recoloured composites generated from the fitted annual harmonic descriptors for all evaluated vegetation indices. In these false colour images, Hue encodes the timing of peak vegetation activity (phase), saturation encodes the strength of the seasonal dynamics (amplitude), and value represents the baseline vegetation level (empirical mean). As a consequence, parcels with similar crop calendars share similar Hues, while strongly seasonal fields appear more saturated than weakly seasonal or mixed-cover areas.
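The channel mapping described above can be sketched in a few lines of Python. The example fits a first-order annual harmonic to one pixel's time series and converts phase, amplitude, and mean to an RGB triple via HSV. Several details are assumptions for illustration only: the samples are taken to be evenly spaced over exactly one year (in which case the closed-form sums below coincide with the least-squares fit), the amplitude is scaled by a hypothetical `amp_max`, and saturation and value are clamped to [0, 1].

```python
import colorsys
import math


def fit_annual_harmonic(values):
    """Fit y(t) = mean + amplitude * cos(2*pi*t - phase) to evenly spaced samples, t in [0, 1)."""
    n = len(values)
    mean = sum(values) / n
    a = 2.0 / n * sum(v * math.cos(2 * math.pi * i / n) for i, v in enumerate(values))
    b = 2.0 / n * sum(v * math.sin(2 * math.pi * i / n) for i, v in enumerate(values))
    phase = math.atan2(b, a) % (2 * math.pi)
    return mean, math.hypot(a, b), phase


def harmonic_to_rgb(mean, amplitude, phase, amp_max=0.5):
    """Map phase to Hue, amplitude to saturation and mean to value (scaling is an assumption)."""
    hue = phase / (2 * math.pi)
    saturation = min(1.0, max(0.0, amplitude / amp_max))
    value = min(1.0, max(0.0, mean))
    return colorsys.hsv_to_rgb(hue, saturation, value)


# Synthetic monthly index series peaking a quarter of the way through the year.
series = [0.5 + 0.3 * math.cos(2 * math.pi * (i / 12 - 0.25)) for i in range(12)]
mean, amplitude, phase = fit_annual_harmonic(series)
print(harmonic_to_rgb(mean, amplitude, phase))
```

Two pixels with the same mean and amplitude but different peak timing differ only in Hue, which is the property that separates neighbouring parcels with different crop calendars.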
Across vegetation indices, the composites show consistent large-scale phenological patterns but differ in how clearly they separate adjacent parcels and how strongly non-vegetated elements are suppressed. Indices designed to reduce soil background effects typically produce higher saturation within cultivated parcels while maintaining comparatively uniform value in bare or sparsely vegetated zones, which visually improves parcel-to-parcel contrast. Indices more sensitive to canopy greenness emphasize dense vegetation but may compress contrast in areas where vegetation is uniformly high, making neighbouring parcels appear more similar. Red-edge and chlorophyll-related indices often highlight crop vigour differences more strongly within the same general phenological timing, which can increase within-field texture and may either aid or hinder boundary delineation, depending on local heterogeneity. Finally, NDWI behaves differently from greenness indices by emphasizing moisture-related variation; this can accentuate drainage patterns and non-crop-related features and therefore changes the visual relationship between parcel interiors and boundaries.
For additional context, we evaluated three commonly used classical segmentation baselines (Felzenszwalb, SLIC, and Quickshift); their pixel-wise and object-wise results are reported in Appendix B.
Figure 5 and Figure 6 summarize these behaviours for the main greenness and soil-adjusted indices (NDVI, GNDVI, NDRE, NDYVI, EVI, EVI2, SAVI, and MSAVI), while Figure 7 shows indices with different physical sensitivity (NDWI) and chlorophyll proxies (CIg, CIre).

3.2. Segmentation Outputs

Each HSV composite is segmented using the Segment Anything Model (SAM) on a fixed 512 × 512 px grid. SAM’s native output is a binary raster mask, which is then vectorized for further analysis and presentation. All panels use an identical scale, symbology, and base map.
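To make the tiling step concrete, the sketch below enumerates 512 × 512 px windows over a composite. It assumes non-overlapping tiles with clipped edge tiles and a 5000 × 5000 px raster (15 km at 3 m per pixel); neither the overlap policy nor the raster size is stated in this section, and `tile_windows` is a hypothetical helper.

```python
def tile_windows(height, width, tile=512):
    """Return (row_offset, col_offset, tile_height, tile_width) windows covering the raster."""
    windows = []
    for row in range(0, height, tile):
        for col in range(0, width, tile):
            windows.append((row, col, min(tile, height - row), min(tile, width - col)))
    return windows


# Assumed raster size: a 15 x 15 km AOI sampled at 3 m gives 5000 x 5000 pixels.
windows = tile_windows(5000, 5000)
print(len(windows), windows[0], windows[-1])
```

Under these assumptions the tile seams fall at multiples of 512 px, which is where parcels straddling two tiles can be cut.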
Figure 8 presents the segmentation outputs for four indices over the previously defined AOI: Panel (a) NDVI, Panel (b) GNDVI, Panel (c) NDRE, and Panel (d) NDYVI. Panels show the vectorized instance polygons derived from SAM on the previously shown HSV inputs.
Figure 9 depicts the segmentation outputs for four indices over the AOI: Panel (a) EVI, Panel (b) EVI2, Panel (c) SAVI (L = 0.5), and Panel (d) MSAVI.
Figure 10 shows the segmentation outputs for three additional indices: Panel (a) NDWI, Panel (b) CIg, and Panel (c) CIre.

3.3. Validation Metrics

Object-based metrics were used to indicate whether a distinct geometry was produced for each reference parcel, while the size-stratified tables were used to describe how performance varied with parcel area. Pixel-based metrics were used to quantify spatial agreement on a shared 3 m raster grid.
Across vegetation indices, three recurring segmentation failure modes were observed, each of which substantially affected the object-level scores:
  • Over-segmentation: Multiple predicted objects were assigned to a single GT parcel. Precision was reduced due to the increase in FPs, while recall was generally preserved. An example is shown in Figure 11a.
  • Under-segmentation: A single predicted polygon was found to cover two or more GT parcels. Recall was reduced because fewer reference parcels were matched, while precision was typically less affected. An example is shown in Figure 11b.
  • Fragmentation: A parcel was represented by numerous small predicted components. When at least one component exceeded IoU > 0.50, a true positive was recorded alongside multiple false positives, leading to reduced precision. An example is shown in Figure 11c.
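The effect of fragmentation on object-level scores can be reproduced with a small Python sketch. Parcels are represented as sets of pixel coordinates rather than polygons, and a greedy one-to-one matching at IoU > 0.50 is assumed; the exact matching procedure is not detailed in this section, so the helper names and the matching rule are illustrative only.

```python
def rect(r0, r1, c0, c1):
    """Pixel set covering rows r0..r1-1 and columns c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def iou(a, b):
    """Intersection-over-Union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def object_scores(gt_parcels, predictions, threshold=0.5):
    """Greedy one-to-one matching (assumed): a GT parcel is a TP if an unused prediction has IoU > threshold."""
    used = set()
    tp = 0
    for gt in gt_parcels:
        best_idx, best_iou = None, threshold
        for idx, pred in enumerate(predictions):
            if idx in used:
                continue
            score = iou(gt, pred)
            if score > best_iou:
                best_idx, best_iou = idx, score
        if best_idx is not None:
            used.add(best_idx)
            tp += 1
    fp = len(predictions) - len(used)
    fn = len(gt_parcels) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"TP": tp, "FP": fp, "FN": fn, "precision": precision, "recall": recall, "F1": f1}


# One 10 x 10 px parcel predicted as three fragments; only the largest exceeds IoU 0.50.
gt_parcels = [rect(0, 10, 0, 10)]
fragments = [rect(0, 10, 0, 6), rect(0, 5, 6, 10), rect(5, 10, 6, 10)]
print(object_scores(gt_parcels, fragments))
```

In this example the parcel is still recovered (recall of 1), but the two leftover fragments count as false positives and pull precision down to one third.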

3.3.1. Pixel-Wise Metrics

Table 2 shows the pixel-wise metrics. All indices achieve high scores, confirming that parcel interiors are well covered regardless of the vegetation index. NDWI achieves the highest pixel-wise metrics overall, with NDVI practically indistinguishable from it, although most differences between indices are marginal. EVI2, EVI, MSAVI, and SAVI form a tight cluster just behind. NDRE trails the greenness family, and CIg and CIre perform the worst.
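For reference, the pixel-wise scores can be computed from two binary masks as in the following Python sketch, which uses the standard confusion-count definitions; the mask layout (nested lists of 0/1 on a shared grid) and the function name are assumptions for illustration.

```python
def pixel_metrics(gt_mask, pr_mask):
    """Precision, recall, F1 and IoU from two binary masks on the same grid."""
    tp = fp = fn = 0
    for gt_row, pr_row in zip(gt_mask, pr_mask):
        for g, p in zip(gt_row, pr_row):
            if p and g:
                tp += 1
            elif p:
                fp += 1
            elif g:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    iou = tp / (tp + fp + fn) if tp else 0.0
    return {"precision": precision, "recall": recall, "F1": f1, "IoU": iou}


# Hypothetical 2 x 4 masks: 1 marks parcel pixels, 0 marks background.
gt_mask = [[1, 1, 1, 0], [1, 1, 1, 0]]
pr_mask = [[1, 1, 0, 0], [1, 1, 0, 1]]
print(pixel_metrics(gt_mask, pr_mask))
```

With these definitions, pixel-wise IoU and F1 are tied by IoU = F1 / (2 − F1). As a check, the SAM row of Table A3 reports F1 = 0.8842, which gives approximately 0.7924, in line with the reported IoU.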

3.3.2. Object-Wise Metrics

Table 3 presents the object-related metrics, with columns representing the individual metrics: precision, recall, F1 score, and mean IoU. MSAVI leads in precision, recall, and F1, with EVI2 and SAVI effectively tied just behind it. For mean IoU, EVI ranks first with NDWI second, while EVI2 and MSAVI are practically indistinguishable. Greenness-based indices sit mid-pack, NDRE trails them, and CIg and CIre produce the lowest results.
Table 4 shows the object-wise metrics for parcels smaller than 10,000 m2. MSAVI, EVI, and SAVI lead in precision, recall, and F1, while NDRE, CIg, and CIre fall behind. NDWI achieves the highest mean IoU, but the margin is tiny and does not change the overall ordering.
Table 5 shows the results for parcels greater than 10,000 m2 and smaller than 100,000 m2. Performance improves over small parcels, with higher recall and F1 across all indices. A stable cluster of MSAVI, EVI2, EVI, and SAVI leads in precision, recall and F1, with NDWI close behind. NDVI, GNDVI, NDYVI, and NDRE form a middle tier, while CIg and CIre trail behind. For mIoU, EVI produces the best results but by an insignificant margin.
Table 6 shows the results for parcels larger than 100,000 m2. Recall dominates across indices, being in the 0.8 range, which indicates that the method’s coverage is broad for large fields, while precision is comparatively lower, consistent with boundary overreaches. NDWI leads in F1 and recall, with EVI and EVI2 being close behind.

3.3.3. Tile Boundary Errors

Table 7 reports the differences between the seam band and the overall AOI. The dominant effect is a recall drop at the seams across all indices, while precision is essentially unchanged. F1 and IoU are lower in the seam for every index. The largest IoU penalty occurs for NDRE, followed by CIg and NDVI, with the smallest penalties affecting MSAVI, EVI, and SAVI.

3.3.4. Fragmentation Metrics

Table 8, Table 9 and Table 10 report the fragmentation ratio, defined as the number of predicted polygons normalized by the number of GT parcels, and stratified by parcel-size class. Values below 1 were interpreted as evidence of under-segmentation, while values above 1 indicate over-segmentation.
Table 8 reports the results for small parcels; ratios are well below one across all indices, indicating merging and under-segmentation at the smallest scale. NDVI achieves the highest result, while CIre achieves the lowest.
Table 9 reports the results for medium parcels; all ratios are slightly above one, indicating mild fragmentation and over-segmentation. NDWI produces a ratio closest to one, while CIg and NDVI are towards the higher end of the results.
Table 10 reports the ratios for large parcels; ratios rise to nearly 2.5, suggesting the fragmentation of large, homogeneous fields. NDWI again yields the lowest fragmentation ratio, while CIre and CIg produce the highest fragmentation ratios.
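The fragmentation ratio can be sketched in Python as follows. The example assumes that each predicted polygon has already been assigned to one GT parcel (for instance, the one it overlaps most) and that it inherits that parcel's size class; this assignment rule, the half-open bin edges, and all names are assumptions, since the text gives only the ratio definition and the three area classes.

```python
from collections import Counter

# Parcel-size classes in square metres (treatment of the exact bin edges is assumed).
SIZE_BINS = (
    ("small", 0, 10_000),
    ("medium", 10_000, 100_000),
    ("large", 100_000, float("inf")),
)


def size_class(area_m2):
    """Name of the size class whose half-open interval contains the area."""
    for name, lower, upper in SIZE_BINS:
        if lower <= area_m2 < upper:
            return name
    raise ValueError("area must be non-negative")


def fragmentation_ratios(gt_areas, pred_to_gt):
    """Predicted polygons per GT parcel, by size class.

    gt_areas: dict mapping GT parcel id to area in square metres.
    pred_to_gt: one GT parcel id per predicted polygon (assumed assignment).
    """
    gt_counts = Counter(size_class(area) for area in gt_areas.values())
    pred_counts = Counter(size_class(gt_areas[gt_id]) for gt_id in pred_to_gt)
    return {name: pred_counts[name] / gt_counts[name] for name in gt_counts}


# Hypothetical data: four GT parcels and six predicted polygons.
gt_areas = {1: 5_000, 2: 8_000, 3: 50_000, 4: 250_000}
pred_to_gt = [1, 3, 3, 4, 4, 4]
print(fragmentation_ratios(gt_areas, pred_to_gt))
```

In the toy data, the two small parcels share a single prediction (ratio below 1, pointing to merging), while the large parcel is split into three polygons (ratio above 1, pointing to fragmentation).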

3.3.5. Over-Segmentation and Under-Segmentation Metrics

Table 11, Table 12 and Table 13 report the Global Over-Classification Error (GOC), Global Under-Classification Error (GUC), and their combined measure, the Global Total Classification Error (GTC), grouped by parcel area.
The largest errors occur in the bin with the smallest parcels (Table 11), where both GOC and GUC are high, resulting in GTC values in the high 0.3 range across all indices. The best performers are MSAVI, EVI, EVI2, NDWI, and SAVI, while NDRE, CIg, and CIre are at the high end of the error spectrum. GUC and GOC are relatively balanced, indicating that small parcels are prone to both spillover and missed interiors due to their tight boundaries.
Errors decrease notably for the bin with medium parcels (Table 12). Here, GOC clearly dominates GUC, indicating that medium parcels are more affected by boundary spillover than by omission. EVI, EVI2, MSAVI, SAVI, and NDWI produce the best results, while CIg and CIre remain among the weakest. Overall, GTC is in the mid 0.2 range, which is significantly lower than for small parcels.
Errors are lowest for the bin with large parcels (Table 13). GOC greatly exceeds GUC, as large fields exhibit overreach but with very little omission; their interiors are mostly complete, as supported by the high mIoU for large parcels reported in Table 6. NDWI achieves the lowest GTC in this bin, with EVI, EVI2, MSAVI, and NDRE close behind. CIre and CIg are the most error-prone, and GTC sits roughly between 0.15 and 0.20.
GTC decreases monotonically with parcel size, and over-classification dominates under-classification across all bin sizes. Regarding the indices, a stable top cluster is formed by EVI, EVI2, MSAVI, SAVI, and NDWI, which consistently yield the lowest total error. NDRE, NDVI, GNDVI, and NDYVI form a middle tier, while CIg and CIre consistently yield the greatest errors.

3.3.6. Boundary-Sensitive Metrics

Boundary-sensitive evaluation with a two-pixel, 6 m tolerance showed clear differences in contour agreement among the tested indices. The results are shown in Table 14. Overall, the strongest boundary performance was obtained from MSAVI, which achieved the highest bF1 (0.4741) and bIoU (0.3724), closely followed by EVI (bF1 = 0.4725, bIoU = 0.3262) and EVI2 (bF1 = 0.4712, bIoU = 0.3257). SAVI also performed strongly, confirming that soil-adjusted and enhanced vegetation formulations produced the most accurate parcel contours in this setting. By contrast, lower boundary agreement was observed for chlorophyll- and red-edge-oriented indices. Standard greenness indices such as NDVI and GNDVI occupied an intermediate position, while NDWI performed slightly better than NDVI in terms of boundary overlap.

3.3.7. Statistical Significance Testing

Across 11 indices, the Friedman test rejects the null hypothesis of equal performance. IoU and F1 produced identical rank orders and decisions, so only one set of values is reported. The results are shown in Table 14. Accordingly, the Friedman test statistic, Q, is equal to 2504.13, and the p-value is less than 0.001.
We applied paired Wilcoxon signed-rank tests with Holm correction on per-parcel comparisons across the 11 indices. Tests were run across all parcels. The main text reports a reduced, representative set of pairs for IoU in Table 15 and F1 in Table 16, while complete pairwise tables for both metrics are in Appendix A, Table A1 (IoU) and Table A2 (F1). These pairs capture the overall pattern: a top cluster contains MSAVI, EVI, EVI2, and SAVI, which are mutually indistinguishable. A middle cluster contains NDVI and NDWI, which are indistinguishable within the pair, followed by GNDVI and NDYVI, then NDRE, with CIg and CIre consistently performing the worst.
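The Holm step-down adjustment applied to the pairwise p-values can be written compactly, as in the Python sketch below. It implements the standard procedure (sort the p-values, scale the i-th smallest by the number of remaining hypotheses, and enforce monotonicity); the function name and the example p-values are hypothetical, and the raw Wilcoxon p-values are assumed to come from a statistics library.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=p_values.__getitem__)
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The rank-th smallest p-value is scaled by the number of hypotheses still in play.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for three index pairs.
print(holm_adjust([0.01, 0.04, 0.03]))
```

With 11 indices there are 55 index pairs, so the smallest raw p-value is multiplied by 55 and the largest by 1.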

4. Discussion

Across eleven vegetation indices under a fixed recolouring scheme and zero-shot segmentation pipeline, the delineations were consistently strong at the pixel level, while parcel level precision and recall were noticeably lower. Index choice influenced the ability to properly delineate parcels, with soil-adjusted and enhanced-greenness indices tending to stabilize boundaries, while chlorophyll ratio indices consistently provided the worst results. Repeated measures tests indicated systematic differences among indices under identical preprocessing and segmentation pipelines, reinforcing that index physics, not tuning, drives the observed shift in results.
These findings align with prior evidence that temporal encodings can produce parcel-coherent masks with zero-shot SAM: a compact annual harmonic projected into cylindrical colour spaces yields stable interiors and sharper inter-parcel contrast. Under HSV, Hue separates neighbours by timing, saturation scales with the seasonal range, and value reflects the baseline level. The dominant errors were over-segmentation, under-segmentation at the parcel level, and boundary under-reach with otherwise correct masks, which explains strong pixel coverage but lower object-wise precision. Hue can split large uniform fields when subtle management or moisture gradients affect timing; high saturation improves detectability but can also encourage duplicate fragments; and value tracks empirical mean greenness, which dampens short-lived events. Index interactions are consistent with these mechanics: soil-adjusted and enhanced-greenness indices (MSAVI, SAVI, EVI, and EVI2) generally reduce merges and spurious splits, chlorophyll-ratio indices (CIg, CIre) are more sensitive to edge instability, and NDWI covers interiors well but may reduce inter-parcel chromatic contrast.
Papić et al. [8] demonstrated that recolouring multitemporal NDVI into cylindrical colour spaces and segmenting with a zero-shot segmenter yields parcel-coherent fields without retraining. HSV, HWB, and LCH each emphasized different aspects of the seasonal signal and produced distinct precision–recall trade-offs. Our results preserve this core mechanism: contrast from time using a compact annual harmonic is mapped to perceptual channels, while showing that index choice affects the strength of these cues. Compared to classic delineation methods based on gradients, region-growing or spectral-similarity merging, this approach derives contrast from seasonal trajectories rather than single-date spatial patterns [1].
Limitations include evaluation in a limited set of regions and a single season with one sensor family, geometry and timing mismatches in administrative ground truth, and sensitivity of time series encodings to residual atmospheric effects. In addition, the evaluation uses a ground truth-aware IoU screening step prior to computing metrics; therefore, the reported results reflect conditional quality for sufficiently overlapping candidates and may overestimate performance in a fully automated setting. The single-harmonic assumption captures the dominant seasonal mode but not double cropping or rapid cut regrowth. Mapping amplitude to the saturation and mean to value stabilizes the seasonality but may also mute diagnostically brief events. Index ordering shows AOI dependence in the smallholder setting, indicating that relative performance is not fully stable across contrasting parcel regimes.
Strengths include a controlled study design attributing differences to index physics, eleven indices spanning greenness, soil-adjusted, red-edge, and water-sensitive variants over a full year series, training-free and interpretable encoding that is both index- and sensor- agnostic, and rigorous evaluation with multiple metric types and statistical testing.
Index selection is a significant factor for boundary quality in temporally encoded, zero-shot delineation. The same harmonic descriptors used for delineation also carry crop-specific information, enabling the development of a unified methodology for field delineation and crop mapping from a single interpretable representation, well aligned with CAP-style monitoring and seasonal change analysis [8,11,12]. Recent reviews of crop mapping and yield prediction highlight the benefits of multitemporal inputs and modern deep networks but also underline label demands and generalization issues [24]. In smallholder mosaics, large-scale delineation with transfer learning and weak supervision has achieved strong boundary quality while drastically reducing manual labels yet still requires supervised training, which positions our zero-shot methodology as a fast, interpretable starting point that can seed supervised refinements [25]. For crop classification, crop-specific spectro-temporal feature selection improves map accuracy by tailoring features per class [26]. Our harmonic descriptors provide compact, discriminative features that can be integrated into such methodologies or guide class-wise feature subsets.
Future work should proceed along two complementary tracks. First, zero-shot upgrades, pairing the same recoloured inputs with specialized SAM derivatives, such as Delineate Anything [2], FieldSeg [4], BoundarySAM [9], fabSAM [10], and applying boundary-aware refinements, such as Principal Component Analysis (PCA), high-frequency enhancement, guided filtering, or light prompt/decoder tuning to tighten edges and suppress spurious splits without altering the temporal encoding [9,10]. Second, a supervised path, training compact segmenters such as U-Net, SegNet, or DeepLabV3+ directly on HSV composites, using a frozen DINOv3 backbone plus a lightweight decoder for data-efficient fine-tuning, size-aware sampling to reduce fragmentation, and implementing simple topology repair post-inference [27,28]. In parallel, the harmonic descriptors embedded in the HSV image can be used for parcel-level crop mapping.

5. Conclusions

This work proposes a simple, training-free method that converts annual harmonic summaries of vegetation index time series into HSV composites and feeds them to a zero-shot segmenter for parcel delineation over a 15 × 15 km AOI in Slavonija.
Harmonically summarizing vegetation index time series and mapping them to HSV, then segmenting with a zero-shot segmenter, provides a solid and transparent baseline for field delineation at scale. The pipeline is fast, training-free, and index-agnostic. The same phase–amplitude–mean descriptors embedded in the composites are also discriminative for crop mapping, enabling a single, scalable workflow that delineates fields and assigns crop types, while leaving a clear path to supervised heads and multi-sensor extensions.

Author Contributions

Conceptualization, F.P. and M.M.; methodology, F.P.; software, F.P.; validation, L.R. and D.M.; formal analysis, F.P.; investigation, F.P. and M.M.; resources, F.P. and L.R.; data curation, F.P.; writing—original draft preparation, F.P.; writing—review and editing, M.M. and D.M.; visualization, F.P.; supervision, M.M.; project administration, L.R.; funding acquisition, L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Croatian Science Foundation for the FORMAT project: “Fusion of multitemporal optical and radar microsatellite data for land cover change detection”, Grant Number IP-2022-10-2639. This research was funded by the University of Zagreb, Faculty of Geodesy, through the institutional research projects Primjena spektralnih modela za segmentaciju zemljišnog pokrova—PRISMA (2025–2029) and Prostorno modeliranje urbanih šuma korištenjem LiDAR podataka i otvorenih GIS alata—GeoUrbanBio (2025–2029), under the Call for Institutional Research Project Funding, financed by the European Union—NextGenerationEU. The views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or the European Commission. Neither the European Union nor the European Commission can be held responsible for them.

Data Availability Statement

Restrictions apply to the availability of this data. The PlanetScope imagery analyzed in this study was obtained under a research licence from Planet Labs PBC and cannot be redistributed. Derived data (figures) are included in the article. The ground truth data obtained from PAAFRD also cannot be redistributed.

Acknowledgments

Planet Team (2025). Planet Application Program Interface: In Space for Life on Earth. San Francisco, CA. https://api.planet.com, accessed on 5 February 2026. Paying Agency for Agriculture, Fisheries, and Rural Development (2026). Zagreb, Croatia. https://www.apprrr.hr/, accessed on 19 January 2026. During the preparation of this work the authors used ChatGPT 5.2 in order to improve the quality of writing. After using this tool, the authors reviewed and edited the content as needed and took full responsibility for the content of the publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
| Abbreviation | Definition |
|---|---|
| APBD | Agricultural Parcel and Boundary Delineation |
| SAM | Segment Anything Model |
| NDVI | Normalized Difference Vegetation Index |
| GNDVI | Green Normalized Difference Vegetation Index |
| NDRE | Normalized Difference Red Edge |
| EVI | Enhanced Vegetation Index |
| EVI2 | Two-band Enhanced Vegetation Index |
| SAVI | Soil-Adjusted Vegetation Index |
| MSAVI | Modified Soil-Adjusted Vegetation Index |
| NDWI | Normalized Difference Water Index |
| CIg | Chlorophyll Index Green |
| CIre | Chlorophyll Index Red-Edge |
| NDYVI | Normalized Difference Yellow Vegetation Index |
| HSV | Hue-Saturation-Value |
| HWB | Hue-Whiteness-Blackness |
| LCH | Luminance-Chroma-Hue |
| AOI | Area-of-Interest |
| CAP | Common Agricultural Policy |
| NIR | Near Infra-Red |
| OLS | Ordinary Least Squares |
| IoU | Intersection-over-Union |
| GT | Ground Truth |
| PAAFRD | Paying Agency for Agriculture, Fisheries and Rural Development |
| TP | True Positive |
| FP | False Positive |
| FN | False Negative |
| GOC | Global Over-Classification Error |
| GUC | Global Under-Classification Error |
| GTC | Global Total Classification Error |
| mIoU | Mean Intersection-over-Union |
| PCA | Principal Component Analysis |

Appendix A

This appendix provides the full set of post hoc pairwise comparisons among vegetation-index variants in support of the main results. After statistically significant differences among indices were established using a Friedman test (Q = 2504.13, p < 0.001), Wilcoxon signed-rank tests were carried out for all index pairs, and the Holm correction was applied to control the family-wise error rate.
The outcomes are summarized in Table A1 and Table A2 for IoU and F1, respectively. For every index pair, the median difference in scores is reported, together with a 95% bootstrap confidence interval and the Holm-adjusted p-value. Positive median differences indicate higher performance of the first-listed index. In addition to the full pairwise comparisons, per-parcel comparisons against NDVI showed that the top-tier indices MSAVI, EVI2, EVI, and SAVI significantly outperformed NDVI, with all Holm-adjusted p-values below 0.001. These indices exceeded NDVI on 62.9–63.4% of parcels, with median IoU improvements of 0.012–0.0147. In contrast, NDWI did not significantly outperform NDVI (adjusted p = 0.817), with a zero median difference and a win rate of 48.6%. Overall, the statistical results indicate a consistent advantage of several top-tier formulations over NDVI, whereas NDWI did not show a significant per-parcel improvement.
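The per-pair summaries described above (median difference, bootstrap confidence interval, and win rate) can be illustrated with the following Python sketch. The percentile bootstrap, the number of resamples, the fixed seed, and the function name are assumptions; the appendix states only that a 95% bootstrap interval was used.

```python
import random
from statistics import median


def paired_summary(scores_a, scores_b, n_boot=2000, seed=0):
    """Median paired difference, percentile bootstrap 95% CI (assumed variant) and win rate of A over B."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    rng = random.Random(seed)
    # Resample parcels with replacement and record the median difference of each resample.
    boot = sorted(median(rng.choices(diffs, k=len(diffs))) for _ in range(n_boot))
    lower = boot[int(0.025 * n_boot)]
    upper = boot[int(0.975 * n_boot) - 1]
    win_rate = sum(d > 0 for d in diffs) / len(diffs)
    return {"median_diff": median(diffs), "ci": (lower, upper), "win_rate": win_rate}


# Hypothetical per-parcel best-IoU values for two indices over five parcels.
index_a = [0.60, 0.70, 0.80, 0.55, 0.90]
index_b = [0.58, 0.69, 0.80, 0.56, 0.85]
print(paired_summary(index_a, index_b))
```

A zero median difference combined with a win rate near 50% corresponds to the pattern reported for NDWI against NDVI.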
Table A1. Full pairwise Wilcoxon signed-rank tests for IoU scores across indices.
| Index A | Index B | Median IoU Difference | Lower 95% CI ¹ | Upper 95% CI ¹ | Adjusted p-Value ² |
|---|---|---|---|---|---|
| CIre | MSAVI | −0.02329 | −0.02519 | −0.02061 | <0.001 |
| CIre | EVI2 | −0.02288 | −0.02571 | −0.02064 | <0.001 |
| CIre | EVI | −0.02352 | −0.02622 | −0.02130 | <0.001 |
| CIre | SAVI | −0.02236 | −0.02496 | −0.01950 | <0.001 |
| EVI | NDRE | 0.01004 | 0.00882 | 0.01181 | <0.001 |
| EVI2 | NDRE | 0.01029 | 0.00880 | 0.01187 | <0.001 |
| MSAVI | NDRE | 0.01031 | 0.00896 | 0.01200 | <0.001 |
| CIg | MSAVI | −0.01577 | −0.01833 | −0.01354 | <0.001 |
| CIg | EVI2 | −0.01648 | −0.01897 | −0.01409 | <0.001 |
| NDRE | SAVI | −0.00972 | −0.01115 | −0.00836 | <0.001 |
| CIg | EVI | −0.01683 | −0.01908 | −0.01413 | <0.001 |
| EVI2 | NDYVI | 0.00852 | 0.00723 | 0.00986 | <0.001 |
| CIg | SAVI | −0.01519 | −0.01781 | −0.01288 | <0.001 |
| CIre | NDVI | −0.01476 | −0.01747 | −0.01245 | <0.001 |
| MSAVI | NDYVI | 0.00821 | 0.00667 | 0.00977 | <0.001 |
| CIre | NDWI | −0.01462 | −0.01673 | −0.01218 | <0.001 |
| EVI | NDYVI | 0.00836 | 0.00705 | 0.00962 | <0.001 |
| EVI2 | GNDVI | 0.00826 | 0.00704 | 0.00984 | <0.001 |
| EVI | GNDVI | 0.00875 | 0.00744 | 0.01023 | <0.001 |
| CIre | NDYVI | −0.01041 | −0.01218 | −0.00825 | <0.001 |
| GNDVI | MSAVI | −0.00895 | −0.01011 | −0.00753 | <0.001 |
| GNDVI | SAVI | −0.00747 | −0.00869 | −0.00620 | <0.001 |
| NDYVI | SAVI | −0.00698 | −0.00799 | −0.00592 | <0.001 |
| EVI2 | NDVI | 0.00585 | 0.00465 | 0.00703 | <0.001 |
| CIre | GNDVI | −0.00911 | −0.01143 | −0.00674 | <0.001 |
| EVI | NDVI | 0.00575 | 0.00454 | 0.00698 | <0.001 |
| CIre | NDRE | −0.00617 | −0.00807 | −0.00434 | <0.001 |
| MSAVI | NDVI | 0.00615 | 0.00454 | 0.00745 | <0.001 |
| NDVI | SAVI | −0.00485 | −0.00603 | −0.00387 | <0.001 |
| NDRE | NDVI | −0.00309 | −0.00415 | −0.00207 | <0.001 |
| CIg | NDVI | −0.00775 | −0.00998 | −0.00577 | <0.001 |
| CIg | NDWI | −0.00664 | −0.00901 | −0.00458 | <0.001 |
| NDRE | NDWI | −0.00301 | −0.00420 | −0.00173 | <0.001 |
| EVI2 | NDWI | 0.00454 | 0.00329 | 0.00604 | <0.001 |
| EVI | NDWI | 0.00481 | 0.00327 | 0.00622 | <0.001 |
| MSAVI | NDWI | 0.00486 | 0.00303 | 0.00625 | <0.001 |
| CIg | CIre | 0.00074 | 0.00000 | 0.00173 | <0.001 |
| CIg | NDYVI | −0.00352 | −0.00524 | −0.00188 | <0.001 |
| NDRE | NDYVI | −0.00087 | −0.00133 | −0.00039 | <0.001 |
| NDWI | SAVI | −0.00343 | −0.00514 | −0.00216 | <0.001 |
| CIg | GNDVI | −0.00398 | −0.00582 | −0.00208 | <0.001 |
| GNDVI | NDVI | −0.00084 | −0.00157 | −0.00006 | <0.001 |
| GNDVI | NDWI | −0.00085 | −0.00180 | 0.00000 | <0.001 |
| NDWI | NDYVI | 0.00133 | 0.00014 | 0.00221 | <0.001 |
| EVI | SAVI | 0.00023 | 0.00000 | 0.00047 | <0.001 |
| GNDVI | NDRE | 0.00086 | 0.00016 | 0.00162 | <0.001 |
| EVI2 | SAVI | 0.00000 | 0.00000 | 0.00019 | <0.001 |
| NDVI | NDYVI | 0.00062 | 0.00000 | 0.00137 | <0.001 |
| MSAVI | SAVI | 0.00000 | 0.00000 | 0.00010 | <0.001 |
| CIg | NDRE | −0.00028 | −0.00199 | 0.00000 | <0.001 |
| EVI2 | MSAVI | 0.00000 | 0.00000 | 0.00000 | 0.032 |
| EVI | EVI2 | 0.00000 | 0.00000 | 0.00000 | 0.591 |
| EVI | MSAVI | 0.00000 | 0.00000 | 0.00000 | 0.817 |
| GNDVI | NDYVI | 0.00000 | 0.00000 | 0.00000 | 0.817 |
| NDVI | NDWI | 0.00000 | 0.00000 | 0.00000 | 0.817 |

¹ Confidence interval; ² Holm-adjusted p-value.
Table A2. Full pairwise Wilcoxon signed-rank tests for F1 scores across indices.
| Index A | Index B | Median F1 Difference | Lower 95% CI ¹ | Upper 95% CI ¹ | Adjusted p-Value ² |
|---|---|---|---|---|---|
| CIre | MSAVI | −0.02047 | −0.02247 | −0.01814 | <0.001 |
| CIre | EVI2 | −0.02027 | −0.02224 | −0.01787 | <0.001 |
| CIre | EVI | −0.02070 | −0.02285 | −0.01881 | <0.001 |
| CIre | SAVI | −0.01971 | −0.02169 | −0.01743 | <0.001 |
| EVI2 | NDRE | 0.00914 | 0.00811 | 0.01044 | <0.001 |
| EVI | NDRE | 0.00918 | 0.00793 | 0.01024 | <0.001 |
| MSAVI | NDRE | 0.00921 | 0.00774 | 0.01038 | <0.001 |
| CIg | MSAVI | −0.01378 | −0.01550 | −0.01210 | <0.001 |
| CIg | EVI2 | −0.01413 | −0.01600 | −0.01228 | <0.001 |
| NDRE | SAVI | −0.00847 | −0.00954 | −0.00728 | <0.001 |
| CIg | EVI | −0.01398 | −0.01585 | −0.01251 | <0.001 |
| EVI2 | NDYVI | 0.00732 | 0.00628 | 0.00824 | <0.001 |
| CIre | NDVI | −0.01297 | −0.01486 | −0.01120 | <0.001 |
| CIg | SAVI | −0.01342 | −0.01529 | −0.01169 | <0.001 |
| CIre | NDWI | −0.01279 | −0.01447 | −0.01107 | <0.001 |
| MSAVI | NDYVI | 0.00736 | 0.00614 | 0.00837 | <0.001 |
| EVI | NDYVI | 0.00730 | 0.00625 | 0.00845 | <0.001 |
| EVI2 | GNDVI | 0.00733 | 0.00623 | 0.00852 | <0.001 |
| EVI | GNDVI | 0.00770 | 0.00651 | 0.00881 | <0.001 |
| CIre | NDYVI | −0.00903 | −0.01061 | −0.00738 | <0.001 |
| GNDVI | MSAVI | −0.00767 | −0.00906 | −0.00649 | <0.001 |
| GNDVI | SAVI | −0.00662 | −0.00760 | −0.00556 | <0.001 |
| NDYVI | SAVI | −0.00604 | −0.00673 | −0.00511 | <0.001 |
| EVI2 | NDVI | 0.00505 | 0.00415 | 0.00614 | <0.001 |
| CIre | GNDVI | −0.00820 | −0.01009 | −0.00673 | <0.001 |
| CIre | NDRE | −0.00567 | −0.00695 | −0.00413 | <0.001 |
| EVI | NDVI | 0.00502 | 0.00420 | 0.00594 | <0.001 |
| MSAVI | NDVI | 0.00521 | 0.00401 | 0.00619 | <0.001 |
| NDVI | SAVI | −0.00438 | −0.00524 | −0.00324 | <0.001 |
| NDRE | NDVI | −0.00289 | −0.00383 | −0.00201 | <0.001 |
| CIg | NDVI | −0.00700 | −0.00894 | −0.00479 | <0.001 |
| CIg | NDWI | −0.00636 | −0.00829 | −0.00429 | <0.001 |
| NDRE | NDWI | −0.00269 | −0.00369 | −0.00183 | <0.001 |
| EVI2 | NDWI | 0.00414 | 0.00310 | 0.00558 | <0.001 |
| EVI | NDWI | 0.00436 | 0.00309 | 0.00542 | <0.001 |
| MSAVI | NDWI | 0.00451 | 0.00325 | 0.00601 | <0.001 |
| CIg | CIre | 0.00072 | 0.00000 | 0.00167 | <0.001 |
| CIg | NDYVI | −0.00309 | −0.00453 | −0.00202 | <0.001 |
| NDRE | NDYVI | −0.00082 | −0.00122 | −0.0004 | <0.001 |
| CIg | GNDVI | −0.00375 | −0.00545 | −0.00194 | <0.001 |
| NDWI | SAVI | −0.00336 | −0.00440 | −0.00210 | <0.001 |
| GNDVI | NDVI | −0.00068 | −0.00128 | −0.00011 | <0.001 |
| GNDVI | NDWI | −0.00088 | −0.00163 | 0.00000 | <0.001 |
| NDWI | NDYVI | 0.00116 | 0.00012 | 0.00210 | <0.001 |
| EVI | SAVI | 0.00020 | 0.00000 | 0.00043 | <0.001 |
| GNDVI | NDRE | 0.00089 | 0.00014 | 0.00151 | <0.001 |
| EVI2 | SAVI | 0.00000 | 0.00000 | 0.00020 | <0.001 |
| NDVI | NDYVI | 0.00060 | 0.00000 | 0.00118 | <0.001 |
| CIg | NDRE | −0.00029 | −0.00173 | 0.00000 | <0.001 |
| MSAVI | SAVI | 0.00000 | 0.00000 | 0.00011 | <0.001 |
| EVI2 | MSAVI | 0.00000 | 0.00000 | 0.00000 | 0.043 |
| EVI | EVI2 | 0.00000 | 0.00000 | 0.00000 | 0.709 |
| EVI | MSAVI | 0.00000 | 0.00000 | 0.00000 | 0.899 |
| GNDVI | NDYVI | 0.00000 | 0.00000 | 0.00000 | 0.899 |
| NDVI | NDWI | 0.00000 | 0.00000 | 0.00000 | 0.899 |

¹ Confidence interval; ² Holm-adjusted p-value.

Appendix B

To contextualize the performance of the proposed SAM-based approach, we additionally evaluated three commonly used classical segmentation algorithms, Felzenszwalb graph-based segmentation, SLIC superpixels, and Quickshift superpixels, which were applied directly to the same harmonic composite MSAVI input. The goal is to provide a representative non-learning reference of the kind typically used for image segmentation.
Baseline segmentations were polygonized and filtered using the previously described algorithm, and accuracy was quantified using pixel-wise metrics and object-wise metrics, while matching at an IoU threshold of 0.50. The results are summarized in Table A3 and Table A4, and the segmentation outputs are depicted in Figure A1.
Figure A1. (a) Segmentation output of harmonic Felzenszwalb, (b) segmentation output of harmonic Quickshift, and (c) segmentation output of harmonic SLIC.
As shown in Table A3, superpixel-based methods achieved high pixel-wise scores on this AOI, whereas Felzenszwalb yielded a substantially lower pixel-wise IoU and F1. This behaviour is expected, as pixel-wise metrics can be inflated when methods generate extensive contiguous masks that overlap the parcel interiors, even if parcel boundaries are imprecise or instances are not well separated.
Table A3. Pixel-wise metrics for different segmentation methods.
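| Method | Precision | Recall | F1 | IoU |
|---|---|---|---|---|
| Felzenszwalb | 0.9691 | 0.4863 | 0.6476 | 0.4789 |
| Quickshift | 0.9490 | 0.8911 | 0.9192 | 0.8504 |
| SLIC | 0.9611 | 0.8399 | 0.8965 | 0.8124 |
| SAM | 0.9797 | 0.8057 | 0.8842 | 0.7924 |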
Object-wise evaluation in Table A4 reveals a different picture. Despite high pixel-wise scores, SLIC and Quickshift show limited parcel-level delineation quality at IoU > 0.50 due to over-segmentation and fragmentation. Quickshift achieved the highest object-wise recall amongst the three, while SLIC produced lower object-wise F1. Felzenszwalb comparatively produced higher precision but very low recall, which indicates that it missed many parcels.
Table A4. Object-wise metrics for different segmentation methods.
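| Method | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| Felzenszwalb | 0.492 | 0.1036 | 0.1712 | 0.7420 |
| Quickshift | 0.2770 | 0.3942 | 0.3254 | 0.6995 |
| SLIC | 0.2053 | 0.2775 | 0.2360 | 0.6589 |
| SAM | 0.5198 | 0.4869 | 0.5028 | 0.6998 |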
These results highlight that pixel-wise agreement does not necessarily translate to accurate parcel delineation, where correct instance separation and boundary placement are essential. This motivates the usage of SAM, which is designed for instance segmentation and can leverage learned priors to better respect object boundaries, while our harmonic-HSV encoding focuses the input representation on phenological separability rather than raw texture alone. Although parcel interiors exhibit distinct textural and phenological patterns in the composites, delineation remains challenging in cases of adjacent parcels sharing similar seasonal signatures; within-field heterogeneity also introduces internal edges that can attract classical segmentation boundaries; and narrow field margins at 3 m resolution can also be sub-pixel or mixed, which limits boundary precision.

Appendix C

The main experiment quantifies how different vegetation indices affect segmentation performance when encoded via the proposed method. However, parcel delineation is sensitive to landscape structure, most notably field size and boundary density. To assess whether the conclusions drawn from the primary AOI are transferable to a more challenging setting, an additional evaluation was performed on an independent agricultural region dominated by smallholder parcels in Northern Croatia.
The same processing steps were applied as in the main study, without changing any parameters. Figure A2 shows the location and extent of the second AOI in Northern Croatia, while Tables A5 and A6 report object-wise and pixel-wise metrics for all index variants.
Table A5. Northern Croatia object-wise metrics.

| Index | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| CIg | 0.1490 | 0.2340 | 0.1821 | 0.6549 |
| CIre | 0.1252 | 0.2480 | 0.1664 | 0.6463 |
| EVI2 | 0.0835 | 0.1905 | 0.1161 | 0.6387 |
| EVI | 0.1260 | 0.2740 | 0.1726 | 0.6371 |
| GNDVI | 0.1292 | 0.2475 | 0.1698 | 0.6449 |
| MSAVI | 0.1307 | 0.2680 | 0.1757 | 0.6437 |
| NDRE | 0.1240 | 0.2510 | 0.1660 | 0.6481 |
| NDVI | 0.1319 | 0.2620 | 0.1754 | 0.6465 |
| NDWI | 0.1460 | 0.2430 | 0.1824 | 0.6466 |
| NDYVI | 0.1331 | 0.2565 | 0.1752 | 0.6458 |
| SAVI | 0.1299 | 0.2745 | 0.1764 | 0.6405 |
Table A6. Northern Croatia pixel-wise metrics.

| Index | Precision | Recall | F1 | IoU |
|---|---|---|---|---|
| CIg | 0.9601 | 0.4925 | 0.6510 | 0.4826 |
| CIre | 0.9594 | 0.5400 | 0.6910 | 0.5279 |
| EVI2 | 0.9651 | 0.5186 | 0.6746 | 0.5090 |
| EVI | 0.9695 | 0.5444 | 0.6973 | 0.5352 |
| GNDVI | 0.9671 | 0.5185 | 0.6751 | 0.5095 |
| MSAVI | 0.9693 | 0.5599 | 0.7098 | 0.5502 |
| NDRE | 0.9639 | 0.5554 | 0.7047 | 0.5441 |
| NDVI | 0.9675 | 0.5555 | 0.7058 | 0.5454 |
| NDWI | 0.9688 | 0.5267 | 0.6824 | 0.5180 |
| NDYVI | 0.9681 | 0.5377 | 0.6914 | 0.5284 |
| SAVI | 0.9711 | 0.5644 | 0.7139 | 0.5551 |
Figure A2. Secondary AOI overlaid with GT parcels.
Table A5 summarizes object-wise results in the smallholder AOI. Overall object-wise performance is substantially lower than in the primary AOI, which is expected in landscapes with numerous small parcels where mixed pixels along narrow parcel margins produce a disproportionate penalty under the defined IoU threshold. The differences between indices are also compressed within this AOI: object-wise F1 spans 0.12–0.18, and all indices except EVI2 fall within 0.166–0.182, which implies that parcel geometry and boundary ambiguity dominate the error budget more strongly than index choice.
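The size dependence of this penalty can be illustrated with a back-of-the-envelope calculation. The sketch below assumes an idealized square parcel whose predicted mask is shrunk by a fixed pixel margin on every edge; the square shape, the margin values, and the function name are illustrative assumptions rather than part of the evaluation pipeline.

```python
def eroded_square_iou(side_px, margin_px=1):
    """IoU between a square parcel and the same square shrunk by margin_px per edge."""
    inner = max(side_px - 2 * margin_px, 0)
    return inner ** 2 / side_px ** 2

PIXEL_SIZE_M = 3  # ground sampling distance of the imagery, in metres

for side_m in (30, 60, 150, 300):
    side_px = side_m // PIXEL_SIZE_M
    one_px = eroded_square_iou(side_px, margin_px=1)
    two_px = eroded_square_iou(side_px, margin_px=2)
    print(f"{side_m:>3} m side: IoU {one_px:.3f} (1 px margin), {two_px:.3f} (2 px margin)")
```

Under these assumptions, a one-pixel boundary error costs a 30 m parcel about a third of its IoU and a two-pixel error pushes it below the 0.50 threshold, while a 300 m parcel barely registers the same error.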
Despite the compressed ranges, a small upper tier is observable: NDWI and CIg yield the highest object-wise F1 values, followed closely by the soil-adjusted indices (SAVI and MSAVI) and the greenness indices. At the same time, several indices exhibit notable rank shifts relative to the primary AOI, which indicates that the full index ordering is not strictly preserved under smallholder conditions. Table A6 reports the pixel-wise metrics for the same AOI; they are generally higher than their object-wise counterparts, reflecting that many predicted masks overlap parcel interiors even when instance separation and boundary placement are imperfect. In summary, evaluation of the second site indicates that absolute segmentation accuracy degrades substantially in a smallholder landscape at 3 m resolution and that index-dependent effects persist but become harder to separate due to the narrower performance spread.

Appendix D

To contextualize the proposed harmonic-HSV encoding, two RGB-based baselines were evaluated using the same SAM configuration and post-processing pipeline: a single-date RGB image selected from the peak growing season and a naive multi-date RGB composite constructed without harmonic modelling. The naive multi-date RGB composite was generated by taking the per-pixel median of the red, green, and blue bands across three dates, producing a single three-band image that summarizes the temporal stack in a simple, non-phenological manner. The purpose of this comparison was to assess whether the proposed harmonic-HSV representation provides a measurable benefit over more straightforward three-channel inputs.
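The naive composite described above amounts to a per-pixel, per-band median over the three dates. The following is a minimal sketch in plain Python on a tiny synthetic stack; the nested-list layout, the function name, and the pixel values are assumptions for illustration, and an array library would normally be used on real imagery.

```python
from statistics import median

def median_composite(stack):
    """Per-pixel, per-band median of a list of images.

    stack[d][row][col] is an (R, G, B) tuple for acquisition date d.
    """
    n_rows, n_cols = len(stack[0]), len(stack[0][0])
    return [
        [
            tuple(median(image[r][c][band] for image in stack) for band in range(3))
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]

# Three synthetic 1 x 2 pixel acquisitions; the second pixel of date 1 is an
# unusually bright outlier (for example, haze) that the median suppresses.
date_1 = [[(10, 20, 30), (200, 200, 200)]]
date_2 = [[(12, 22, 28), (90, 80, 70)]]
date_3 = [[(50, 60, 70), (95, 85, 75)]]

composite = median_composite([date_1, date_2, date_3])
print(composite)
```

Because the median only ranks values per band, the composite carries no information about when a value occurred, which is the sense in which it is non-phenological.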
Object-wise results are summarized in Table A7. Among the tested inputs, the MSAVI harmonic-HSV composite achieved the highest object-wise F1 score (0.5028), indicating the best balance between precision and recall at the parcel level. The naive multi-date RGB composite yielded a lower object-wise F1 (0.4411) despite a much higher object-wise precision (0.7417 versus 0.5198), suggesting that it generated fewer false positive parcel matches but missed a larger proportion of reference parcels (recall of 0.3138). The single-date RGB baseline performed the worst: its precision was marginally the highest (0.7421), but its recall (0.1481) and F1 (0.2469) were substantially lower. This indicates that relying on a single acquisition substantially reduces parcel detectability relative to both temporally informed representations.
Table A7. Object-wise metrics including two RGB baselines.

| Type | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| Naive multi-date | 0.7417 | 0.3138 | 0.4411 | 0.7137 |
| Single date | 0.7421 | 0.1481 | 0.2469 | 0.7281 |
| MSAVI | 0.5198 | 0.4869 | 0.5028 | 0.6998 |
Pixel-wise results are provided in Table A8. In this case, the MSAVI harmonic-HSV also achieved the strongest performance overall, with the highest recall (0.8057), F1 (0.8842), and IoU (0.7924). The naive multi-date RGB composite produced a lower pixel-wise F1 (0.7480) and IoU (0.5975), while the single-date RGB baseline again showed the weakest performance (F1 = 0.5381, IoU = 0.3681). Although both RGB baselines attained very high precision, this was accompanied by reduced recall, indicating under-segmentation relative to the harmonic-HSV encoding.
Table A8. Pixel-wise metrics including two RGB baselines.

| Type | Precision | Recall | F1 | IoU |
|---|---|---|---|---|
| Naive multi-date | 0.9903 | 0.6010 | 0.7480 | 0.5975 |
| Single date | 0.9869 | 0.3699 | 0.5381 | 0.3681 |
| MSAVI | 0.9797 | 0.8057 | 0.8842 | 0.7924 |
Overall, these baseline experiments show that the proposed harmonic-HSV representation improves segmentation performance relative to both a single-date RGB input and a naive multi-date RGB composite, with the largest gains observed in recall-sensitive metrics. This suggests that encoding the annual phenological signal into a compact three-channel representation benefits parcel delineation beyond what can be obtained from straightforward RGB compositing alone.

References

  1. Zheng, J.; Ye, Z.; Wen, Y.; Huang, J.; Zhang, Z.; Li, Q.; Hu, Q.; Xu, B.; Zhao, L.; Fu, H. A Comprehensive Review of Agricultural Parcel and Boundary Delineation from Remote Sensing Images: Recent Progress and Future Perspectives. IEEE Geosci. Remote Sens. Mag. 2026; early access.
  2. Lavreniuk, M.; Kussul, N.; Shelestov, A.; Yailymov, B.; Salii, Y.; Kuzin, V.; Szantoi, Z. Delineate Anything: Resolution-Agnostic Field Boundary Delineation on Satellite Imagery. arXiv 2025, arXiv:2504.02534.
  3. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 4015–4026.
  4. Ferreira, L.B.; Martins, V.S.; Aires, U.R.V.; Wijewardane, N.; Zhang, X.; Samiappan, S. FieldSeg: A scalable agricultural field extraction framework based on the Segment Anything Model and 10-m Sentinel-2 imagery. Comput. Electron. Agric. 2025, 232, 110086.
  5. Amin, G.; Oberlin, T.; Demarez, V. Early-season delineation of agricultural fields using a fully convolutional multi-task network and satellite images. Sci. Remote Sens. 2025, 12, 100256.
  6. Wilson, B.T.; Knight, J.F.; McRoberts, R.E. Harmonic regression of Landsat time series for modeling attributes from national forest inventory data. ISPRS J. Photogramm. Remote Sens. 2018, 137, 29–46.
  7. Ben Abbes, A.; Bounouh, O.; Farah, I.R.; de Jong, R.; Martínez, B. Comparative study of three satellite image time-series decomposition methods for vegetation change detection. Eur. J. Remote Sens. 2018, 51, 607–615.
  8. Papić, F.; Rumora, L.; Medak, D.; Miler, M. Turning Seasonal Signals into Segmentation Cues: Recolouring the Harmonic Normalized Difference Vegetation Index for Agricultural Field Delineation. Sensors 2025, 25, 5926.
  9. Awad, B.; Erer, I. Boundary SAM: Improved parcel boundary delineation using SAM’s image embeddings and detail enhancement filters. IEEE Geosci. Remote Sens. Lett. 2025, 22, 2502905.
  10. Xie, Y.; Wu, H.; Tong, H.; Xiao, L.; Zhou, W.; Li, L.; Wanger, T.C. fabSAM: A Farmland Boundary Delineation Method Based on the Segment Anything Model. arXiv 2025, arXiv:2501.12487.
  11. PlanetScope|Planet Documentation. Available online: https://docs.planet.com/data/imagery/planetscope/#psbsd (accessed on 24 November 2025).
  12. Chasles, R.G.; Maciel, D.A.; Barbosa, C.C.F.; Novo, E.M.L.M.; Martins, V.S.; Paulino, R.; Wanderley, R.; Júnior, R.F.; Lima, T.M.; Bacellar, P.; et al. Accuracy assessment of PlanetScope SuperDove products for aquatic reflectance retrieval over Brazilian inland and coastal waters. ISPRS J. Photogramm. Remote Sens. 2025, 227, 678–690.
  13. Vanhellemont, Q. Evaluation of eight-band SuperDove imagery for aquatic applications. Opt. Express 2023, 31, 13851–13872.
  14. Yang, C.; Everitt, J.H.; Bradford, J.M.; Murden, D. Airborne Hyperspectral Imagery and Yield Monitor Data for Mapping Cotton Yield Variability. Precis. Agric. 2004, 5, 445–461.
  15. Hunt, E.R., Jr.; Daughtry, C.S.T.; Eitel, J.U.H.; Long, D.S. Remote sensing leaf chlorophyll content using a visible band index. Agron. J. 2011, 103, 1090–1099.
  16. Jiang, Z.; Huete, A.R.; Didan, K.; Miura, T. Development of a two-band enhanced vegetation index without a blue band. Remote Sens. Environ. 2008, 112, 3833–3845.
  17. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
  18. Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A modified soil adjusted vegetation index. Remote Sens. Environ. 1994, 48, 119–126.
  19. Gao, B.-C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
  20. Wei, Y.; Lu, M.; Yu, Q.; Li, W.; Wang, C.; Tang, H.; Wu, W. The normalized difference yellow vegetation index (NDYVI): A new index for crop identification by using GaoFen-6 WFV data. Comput. Electron. Agric. 2024, 226, 109417.
  21. Pringle, M.J. Robust prediction of time-integrated NDVI. Int. J. Remote Sens. 2013, 34, 4791–4811.
  22. García, S.; Fernández, A.; Luengo, J.; Herrera, F. Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: Experimental analysis of power. Inf. Sci. 2010, 180, 2044–2064.
  23. Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 2006, 7, 1–30.
  24. Joshi, A.; Pradhan, B.; Gite, S.; Chakraborty, S. Remote-Sensing Data and Deep-Learning Techniques in Crop Mapping and Yield Prediction: A Systematic Review. Remote Sens. 2023, 15, 2014.
  25. Wang, S.; Waldner, F.; Lobell, D.B. Unlocking Large-Scale Crop Field Delineation in Smallholder Farming Systems with Transfer Learning and Weak Supervision. Remote Sens. 2022, 14, 5738.
  26. Yin, L.; You, N.; Zhang, G.; Huang, J.; Dong, J. Optimizing Feature Selection of Individual Crop Types for Improved Crop Mapping. Remote Sens. 2020, 12, 162.
  27. Luo, Z.; Yang, W.; Yuan, Y.; Gou, R.; Li, X. Semantic segmentation of agricultural images: A survey. Inf. Process. Agric. 2024, 11, 172–186.
  28. Siméoni, O.; Vo, H.V.; Seitzer, M.; Baldassarre, F.; Oquab, M.; Jose, C.; Khalidov, V.; Szafraniec, M.; Yi, S.; Ramamonjisoa, M.; et al. DINOv3. arXiv 2025, arXiv:2508.10104.
Figure 1. PlanetScope RGB composite of the study area, Imagery © 2025 Planet Labs.
Figure 2. Workflow of the proposed method.
Figure 3. (a) Vegetation index time series, where each greyscale slice corresponds to one acquisition date; stacking the slices forms a per-pixel annual index time series. (b) Per-pixel harmonic modelling of the time series: observed index values (blue points) are detrended with a linear term and the residuals are fitted with a single annual harmonic. The fit yields three descriptors: phase (timing of the seasonal peak), amplitude (seasonality strength), and mean (baseline level). (c) Mapping of harmonic descriptors to the HSV colour space to form the false colour composite: hue = phase (interpreted as a circular timing axis over the annual cycle, where similar hues indicate similar peak timing); saturation = amplitude (low seasonality desaturates towards grey); and value = mean (low baseline appears darker). Scaling is applied per index at the AOI level using fixed saturation percentiles and conservative value clipping to ensure consistent colour ranges.
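The encoding summarized in the Figure 3 caption can be sketched for a single pixel as follows. This is a minimal illustration in plain Python, not the processing code used in the study: the function names, the 365-day period, the synthetic series, and in particular the fixed `amp_range` and `mean_range` scaling bounds are assumptions standing in for the percentile-based scaling mentioned in the caption.

```python
import colorsys
import math

def fit_harmonic(doy, values, period=365.0):
    """Linear detrend, then fit one annual harmonic to the residuals."""
    n = len(doy)
    t_mean = sum(doy) / n
    v_mean = sum(values) / n
    # Step 1: ordinary least-squares line through the series.
    sxx = sum((t - t_mean) ** 2 for t in doy)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(doy, values)) / sxx
    resid = [v - (v_mean + slope * (t - t_mean)) for t, v in zip(doy, values)]
    # Step 2: resid ~ a*cos(wt) + b*sin(wt), solved from the 2x2 normal equations.
    w = 2 * math.pi / period
    c = [math.cos(w * t) for t in doy]
    s = [math.sin(w * t) for t in doy]
    scc = sum(x * x for x in c)
    sss = sum(x * x for x in s)
    scs = sum(x * y for x, y in zip(c, s))
    rc = sum(r * x for r, x in zip(resid, c))
    rs = sum(r * x for r, x in zip(resid, s))
    det = scc * sss - scs ** 2
    a = (rc * sss - rs * scs) / det
    b = (rs * scc - rc * scs) / det
    amplitude = math.hypot(a, b)
    phase = math.atan2(b, a) % (2 * math.pi)  # seasonal peak where w*t == phase
    return phase, amplitude, v_mean

def harmonic_to_rgb(phase, amplitude, mean, amp_range=(0.0, 0.5), mean_range=(0.0, 1.0)):
    """Map phase -> hue, amplitude -> saturation, mean -> value (bounds are assumed)."""
    def clip01(x):
        return min(max(x, 0.0), 1.0)
    hue = phase / (2 * math.pi)
    sat = clip01((amplitude - amp_range[0]) / (amp_range[1] - amp_range[0]))
    val = clip01((mean - mean_range[0]) / (mean_range[1] - mean_range[0]))
    return colorsys.hsv_to_rgb(hue, sat, val)

# Synthetic index series sampled every 5 days, peaking on day 180 (mid-series,
# so the fitted linear trend is close to zero in this example).
doy = list(range(0, 365, 5))
values = [0.4 + 0.3 * math.cos(2 * math.pi * (t - 180) / 365) for t in doy]

phase, amplitude, mean = fit_harmonic(doy, values)
peak_day = phase / (2 * math.pi) * 365
rgb = harmonic_to_rgb(phase, amplitude, mean)
print(round(peak_day, 1), round(amplitude, 3), round(mean, 3), rgb)
```

Pixels with the same peak timing receive the same hue regardless of how strong their seasonality is, which is what lets neighbouring parcels with different crop calendars appear as distinct colours in the composite.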
Figure 4. The ground truth parcels overlaid on an RGB composite of the study area, Imagery © 2025 Planet Labs.
Figure 5. (a) NDVI harmonic composite, (b) GNDVI harmonic composite, (c) NDRE harmonic composite, and (d) NDYVI harmonic composite.
Figure 6. (a) EVI harmonic composite, (b) EVI2 harmonic composite, (c) SAVI harmonic composite, and (d) MSAVI harmonic composite.
Figure 7. (a) NDWI harmonic composite, (b) CIg harmonic composite, and (c) CIre harmonic composite.
Figure 8. (a) NDVI segmentation output, (b) GNDVI segmentation output, (c) NDRE segmentation output, and (d) NDYVI segmentation output.
Figure 9. (a) EVI segmentation output, (b) EVI2 segmentation output, (c) SAVI segmentation output, and (d) MSAVI segmentation output.
Figure 10. (a) NDWI segmentation output, (b) CIg segmentation output, and (c) CIre segmentation output.
Figure 11. Representative examples of the main segmentation error types relative to GT. GT parcel boundaries are shown in red, and predicted polygons are shown in grey: (a) over-segmentation, (b) under-segmentation, and (c) fragmentation.
Table 1. Vegetation indices, alongside their formula and source.

| Index | Formula | Reference |
|---|---|---|
| NDVI | $\frac{NIR - RED}{NIR + RED}$ | [14] |
| GNDVI | $\frac{NIR - GREEN}{NIR + GREEN}$ | [14] |
| NDRE | $\frac{NIR - RE}{NIR + RE}$ | [15] |
| EVI | $2.5\,\frac{NIR - RED}{NIR + 6\,RED - 7.5\,BLUE + 1}$ | [16] |
| EVI2 | $2.5\,\frac{NIR - RED}{NIR + 2.4\,RED + 1}$ | [16] |
| SAVI | $\frac{(1 + L)(NIR - RED)}{NIR + RED + L}$, $L = 0.5$ | [17] |
| MSAVI | $\frac{2\,NIR + 1 - \sqrt{(2\,NIR + 1)^2 - 8\,(NIR - RED)}}{2}$ | [18] |
| NDWI | $\frac{GREEN - NIR}{GREEN + NIR}$ | [19] |
| CIg | $\frac{NIR}{GREEN} - 1$ | [15] |
| CIre | $\frac{NIR}{RE} - 1$ | [15] |
| NDYVI | $\frac{NIR - YELLOW - RE}{NIR + YELLOW + RE}$ | [20] |
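For readers who want to compute the indices, most of the formulas in Table 1 translate directly into code. The sketch below covers ten of the eleven indices for a single pixel in plain Python; the function names and the reflectance values are illustrative assumptions, and inputs are taken to be surface reflectance on a 0–1 scale.

```python
import math

def normalized_difference(a, b):
    """Generic (a - b) / (a + b) band ratio."""
    return (a - b) / (a + b)

def vegetation_indices(blue, green, red, re, nir, L=0.5):
    """Indices from Table 1 for one pixel; `re` is the red-edge band."""
    return {
        "NDVI": normalized_difference(nir, red),
        "GNDVI": normalized_difference(nir, green),
        "NDRE": normalized_difference(nir, re),
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
        "EVI2": 2.5 * (nir - red) / (nir + 2.4 * red + 1),
        "SAVI": (1 + L) * (nir - red) / (nir + red + L),
        "MSAVI": (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2,
        "NDWI": normalized_difference(green, nir),
        "CIg": nir / green - 1,
        "CIre": nir / re - 1,
    }

# Hypothetical reflectances for a moderately vegetated pixel.
vi = vegetation_indices(blue=0.05, green=0.10, red=0.10, re=0.25, nir=0.50)
for name, value in vi.items():
    print(f"{name:6s} {value: .4f}")
```

Note that the chlorophyll indices (CIg, CIre) are unbounded ratios, whereas the normalized-difference forms stay within −1 to 1, so their dynamic ranges differ considerably before any scaling is applied.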
Table 2. Pixel-wise precision, recall, F1, and IoU.

| Index | Precision | Recall | F1 | IoU |
|---|---|---|---|---|
| CIg | 0.9758 | 0.7693 | 0.8603 | 0.7549 |
| CIre | 0.9738 | 0.7724 | 0.8615 | 0.7567 |
| EVI2 | 0.9796 | 0.8116 | 0.8877 | 0.7981 |
| EVI | 0.9790 | 0.8075 | 0.8851 | 0.7938 |
| GNDVI | 0.9789 | 0.7954 | 0.8776 | 0.7819 |
| MSAVI | 0.9797 | 0.8057 | 0.8842 | 0.7924 |
| NDRE | 0.9775 | 0.7815 | 0.8686 | 0.7677 |
| NDVI | 0.9794 | 0.8152 | 0.8898 | 0.8015 |
| NDWI | 0.9819 | 0.8161 | 0.8913 | 0.8040 |
| NDYVI | 0.9799 | 0.7972 | 0.8792 | 0.7844 |
| SAVI | 0.9785 | 0.8018 | 0.8814 | 0.7879 |
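The four columns of Table 2 are not independent: when all metrics derive from one aggregated confusion matrix, F1 and IoU follow from precision and recall alone. The short check below uses the standard identities F1 = 2PR/(P + R) and IoU = PR/(P + R − PR); the function names are illustrative, and small deviations from the tabulated values are expected because the inputs are rounded to four decimals.

```python
def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def iou_from_pr(precision, recall):
    """IoU = TP / (TP + FP + FN), rewritten in terms of precision and recall."""
    return precision * recall / (precision + recall - precision * recall)

# Precision and recall as reported in Table 2.
rows = {"MSAVI": (0.9797, 0.8057), "NDWI": (0.9819, 0.8161), "CIg": (0.9758, 0.7693)}

for index, (p, r) in rows.items():
    print(index, round(f1_from_pr(p, r), 4), round(iou_from_pr(p, r), 4))
```

The same identities imply IoU = F1/(2 − F1), which explains why the pixel-wise F1 and IoU columns always rank the indices identically.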
Table 3. Object-wise precision, recall, F1, and mIoU.

| Index | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| CIg | 0.4693 | 0.4437 | 0.4562 | 0.6883 |
| CIre | 0.4592 | 0.4229 | 0.4403 | 0.6848 |
| EVI2 | 0.5194 | 0.4856 | 0.5019 | 0.6999 |
| EVI | 0.5157 | 0.4815 | 0.4980 | 0.7014 |
| GNDVI | 0.4846 | 0.4540 | 0.4688 | 0.6921 |
| MSAVI | 0.5198 | 0.4869 | 0.5028 | 0.6998 |
| NDRE | 0.4924 | 0.4473 | 0.4688 | 0.6942 |
| NDVI | 0.4786 | 0.4621 | 0.4702 | 0.6956 |
| NDWI | 0.5145 | 0.4641 | 0.4880 | 0.7000 |
| NDYVI | 0.4948 | 0.4594 | 0.4764 | 0.6931 |
| SAVI | 0.5113 | 0.4818 | 0.4961 | 0.6982 |
Table 4. Object-wise precision, recall, F1, and mIoU for small parcels (<10,000 m²).

| Index | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| CIg | 0.3116 | 0.1695 | 0.2196 | 0.6079 |
| CIre | 0.2968 | 0.1527 | 0.2017 | 0.5963 |
| EVI2 | 0.3769 | 0.2177 | 0.2760 | 0.6078 |
| EVI | 0.3654 | 0.2123 | 0.2685 | 0.6064 |
| GNDVI | 0.3203 | 0.1832 | 0.2331 | 0.5994 |
| MSAVI | 0.3803 | 0.2195 | 0.2784 | 0.6067 |
| NDRE | 0.3075 | 0.1677 | 0.2171 | 0.6051 |
| NDVI | 0.3143 | 0.1877 | 0.2351 | 0.6028 |
| NDWI | 0.3509 | 0.1936 | 0.2496 | 0.6096 |
| NDYVI | 0.3331 | 0.1855 | 0.2382 | 0.6015 |
| SAVI | 0.3689 | 0.2168 | 0.2731 | 0.6042 |
Table 5. Object-wise precision, recall, F1, and mIoU for medium parcels (10,000 m²–100,000 m²).

| Index | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| CIg | 0.5824 | 0.7083 | 0.6392 | 0.6922 |
| CIre | 0.5719 | 0.6818 | 0.6220 | 0.6886 |
| EVI2 | 0.6329 | 0.7455 | 0.6846 | 0.7096 |
| EVI | 0.6293 | 0.7420 | 0.6810 | 0.7111 |
| GNDVI | 0.6015 | 0.7134 | 0.6527 | 0.6978 |
| MSAVI | 0.6368 | 0.7455 | 0.6869 | 0.7107 |
| NDRE | 0.6147 | 0.7160 | 0.6615 | 0.6971 |
| NDVI | 0.5960 | 0.7231 | 0.6535 | 0.7015 |
| NDWI | 0.6223 | 0.7175 | 0.6665 | 0.7048 |
| NDYVI | 0.6096 | 0.7221 | 0.6611 | 0.6987 |
| SAVI | 0.6252 | 0.7374 | 0.6766 | 0.7096 |
Table 6. Object-wise precision, recall, F1, and mIoU for large parcels (larger than 100,000 m²).

| Index | Precision | Recall | F1 | mIoU |
|---|---|---|---|---|
| CIg | 0.3437 | 0.8246 | 0.4852 | 0.8001 |
| CIre | 0.3359 | 0.8097 | 0.4748 | 0.7985 |
| EVI2 | 0.3836 | 0.8545 | 0.5295 | 0.8311 |
| EVI | 0.3945 | 0.8582 | 0.5405 | 0.8328 |
| GNDVI | 0.3765 | 0.8470 | 0.5212 | 0.8216 |
| MSAVI | 0.3726 | 0.8619 | 0.5203 | 0.8259 |
| NDRE | 0.3924 | 0.8433 | 0.5355 | 0.8214 |
| NDVI | 0.3762 | 0.8731 | 0.5258 | 0.8234 |
| NDWI | 0.4335 | 0.8993 | 0.5850 | 0.8315 |
| NDYVI | 0.3810 | 0.8545 | 0.5270 | 0.8217 |
| SAVI | 0.3802 | 0.8582 | 0.5269 | 0.8218 |
Table 7. Tile boundary errors.

| Index | ΔPrecision | ΔRecall | ΔF1 | ΔIoU |
|---|---|---|---|---|
| CIg | −0.0007 | −0.0222 | −0.0143 | −0.0218 |
| CIre | 0.0014 | −0.0121 | −0.0071 | −0.0108 |
| EVI2 | −0.0031 | −0.0088 | −0.0066 | −0.0106 |
| EVI | 0.0005 | −0.0062 | −0.0036 | −0.0057 |
| GNDVI | 0.0012 | −0.0176 | −0.0103 | −0.0162 |
| MSAVI | −0.0032 | −0.0019 | −0.0025 | −0.0039 |
| NDRE | −0.0009 | −0.0263 | −0.0169 | −0.0259 |
| NDVI | −0.0020 | −0.0174 | −0.0113 | −0.0182 |
| NDWI | 0.0008 | −0.0099 | −0.0056 | −0.0091 |
| NDYVI | 0.0007 | −0.0138 | −0.0082 | −0.0130 |
| SAVI | 0.0010 | −0.0070 | −0.0039 | −0.0061 |
Table 8. Fragmentation metrics for parcels smaller than 10,000 m².

| Index | Number of Ground Truth Polygons | Number of Predicted Polygons | Ratio |
|---|---|---|---|
| CIg | 2200 | 1156 | 0.5255 |
| CIre | 2200 | 1089 | 0.4950 |
| EVI2 | 2200 | 1253 | 0.5695 |
| EVI | 2200 | 1243 | 0.5650 |
| GNDVI | 2200 | 1217 | 0.5532 |
| MSAVI | 2200 | 1236 | 0.5618 |
| NDRE | 2200 | 1161 | 0.5277 |
| NDVI | 2200 | 1278 | 0.5809 |
| NDWI | 2200 | 1181 | 0.5368 |
| NDYVI | 2200 | 1190 | 0.5409 |
| SAVI | 2200 | 1267 | 0.5759 |
Table 9. Fragmentation metrics for parcels between 10,000 m² and 100,000 m².

| Index | Number of Ground Truth Polygons | Number of Predicted Polygons | Ratio |
|---|---|---|---|
| CIg | 1961 | 2410 | 1.2290 |
| CIre | 1961 | 2360 | 1.2035 |
| EVI2 | 1961 | 2313 | 1.1795 |
| EVI | 1961 | 2329 | 1.1877 |
| GNDVI | 1961 | 2348 | 1.1973 |
| MSAVI | 1961 | 2312 | 1.1790 |
| NDRE | 1961 | 2306 | 1.1759 |
| NDVI | 1961 | 2402 | 1.2249 |
| NDWI | 1961 | 2282 | 1.1637 |
| NDYVI | 1961 | 2342 | 1.1943 |
| SAVI | 1961 | 2327 | 1.1866 |
Table 10. Fragmentation metrics for parcels larger than 100,000 m².

| Index | Number of Ground Truth Polygons | Number of Predicted Polygons | Ratio |
|---|---|---|---|
| CIg | 268 | 659 | 2.4590 |
| CIre | 268 | 667 | 2.4888 |
| EVI2 | 268 | 612 | 2.2836 |
| EVI | 268 | 601 | 2.2425 |
| GNDVI | 268 | 622 | 2.3209 |
| MSAVI | 268 | 638 | 2.3806 |
| NDRE | 268 | 593 | 2.2127 |
| NDVI | 268 | 635 | 2.3694 |
| NDWI | 268 | 568 | 2.1194 |
| NDYVI | 268 | 617 | 2.3022 |
| SAVI | 268 | 617 | 2.3022 |
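The Ratio column in Tables 8–10 is the number of predicted polygons divided by the number of ground truth polygons in each size class. The snippet below recomputes it for MSAVI from the tabulated counts; the dictionary layout and function name are illustrative.

```python
def polygon_ratio(n_ground_truth, n_predicted):
    """Predicted-to-reference polygon count ratio for one size class."""
    return n_predicted / n_ground_truth

# (ground truth polygons, predicted polygons) for MSAVI, from Tables 8-10.
msavi_counts = {
    "small (<10,000 m2)": (2200, 1236),
    "medium (10,000-100,000 m2)": (1961, 2312),
    "large (>100,000 m2)": (268, 638),
}

ratios = {size: round(polygon_ratio(*counts), 4) for size, counts in msavi_counts.items()}
for size, ratio in ratios.items():
    print(f"{size:28s} {ratio:.4f}")
```

A ratio below 1 means fewer predicted polygons than reference parcels, consistent with merging or missed parcels in the small class, whereas ratios above 1 indicate that reference parcels are split into several predictions, which becomes more pronounced as parcel size increases.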
Table 11. Over-segmentation and under-segmentation metrics for parcels smaller than 10,000 m².

| Index | GOC | GUC | GTC |
|---|---|---|---|
| CIg | 0.3402 | 0.3179 | 0.3871 |
| CIre | 0.3453 | 0.3249 | 0.3927 |
| EVI2 | 0.3400 | 0.3019 | 0.3756 |
| EVI | 0.3365 | 0.3035 | 0.3757 |
| GNDVI | 0.3491 | 0.3183 | 0.3907 |
| MSAVI | 0.3333 | 0.3072 | 0.3754 |
| NDRE | 0.3427 | 0.3336 | 0.3936 |
| NDVI | 0.3485 | 0.3044 | 0.3847 |
| NDWI | 0.3259 | 0.3152 | 0.3767 |
| NDYVI | 0.3363 | 0.3248 | 0.3881 |
| SAVI | 0.3394 | 0.3033 | 0.3770 |
Table 12. Over-segmentation and under-segmentation metrics for parcels between 10,000 m² and 100,000 m².

| Index | GOC | GUC | GTC |
|---|---|---|---|
| CIg | 0.2672 | 0.1336 | 0.2482 |
| CIre | 0.2693 | 0.1456 | 0.2561 |
| EVI2 | 0.2440 | 0.1245 | 0.2289 |
| EVI | 0.2433 | 0.1274 | 0.2296 |
| GNDVI | 0.2578 | 0.1338 | 0.2421 |
| MSAVI | 0.2443 | 0.1248 | 0.2289 |
| NDRE | 0.2547 | 0.1361 | 0.2398 |
| NDVI | 0.2633 | 0.1217 | 0.2393 |
| NDWI | 0.2492 | 0.1298 | 0.2344 |
| NDYVI | 0.2588 | 0.1295 | 0.2399 |
| SAVI | 0.2492 | 0.1260 | 0.2320 |
Table 13. Over-segmentation and under-segmentation metrics for parcels larger than 100,000 m².

| Index | GOC | GUC | GTC |
|---|---|---|---|
| CIg | 0.2483 | 0.0372 | 0.1925 |
| CIre | 0.2521 | 0.0515 | 0.2037 |
| EVI2 | 0.2214 | 0.0309 | 0.1708 |
| EVI | 0.2118 | 0.0313 | 0.1642 |
| GNDVI | 0.2257 | 0.0372 | 0.1767 |
| MSAVI | 0.2215 | 0.0312 | 0.1710 |
| NDRE | 0.2066 | 0.0359 | 0.1620 |
| NDVI | 0.2221 | 0.0324 | 0.1723 |
| NDWI | 0.2009 | 0.0245 | 0.1515 |
| NDYVI | 0.2214 | 0.0261 | 0.1678 |
| SAVI | 0.2295 | 0.0327 | 0.1767 |
Table 14. Boundary metrics.

| Index | Boundary Precision | Boundary Recall | Boundary F1 | Boundary IoU |
|---|---|---|---|---|
| CIg | 0.4094 | 0.4009 | 0.4051 | 0.2791 |
| CIre | 0.3953 | 0.3872 | 0.3912 | 0.2681 |
| EVI2 | 0.4712 | 0.4711 | 0.4712 | 0.3257 |
| EVI | 0.4745 | 0.4704 | 0.4725 | 0.3262 |
| GNDVI | 0.4328 | 0.4242 | 0.4285 | 0.2975 |
| MSAVI | 0.4743 | 0.4740 | 0.4742 | 0.3274 |
| NDRE | 0.4281 | 0.4101 | 0.4189 | 0.2910 |
| NDVI | 0.4373 | 0.4391 | 0.4382 | 0.3045 |
| NDWI | 0.4488 | 0.4303 | 0.4393 | 0.3064 |
| NDYVI | 0.4295 | 0.4174 | 0.4234 | 0.2954 |
| SAVI | 0.4670 | 0.4662 | 0.4666 | 0.3219 |
Table 15. Results of the pairwise Wilcoxon signed-rank tests for IoU scores across indices.

| Index A | Index B | Median IoU Difference | Lower 95% CI ¹ | Upper 95% CI ¹ | Adjusted p-Value ² |
|---|---|---|---|---|---|
| CIre | MSAVI | −0.02329 | −0.02519 | −0.02061 | <0.001 |
| CIg | EVI2 | −0.01648 | −0.01897 | −0.01409 | <0.001 |
| MSAVI | NDRE | 0.01031 | 0.00896 | 0.01200 | <0.001 |
| EVI2 | NDVI | 0.00585 | 0.00465 | 0.00703 | <0.001 |
| NDVI | NDWI | 0.00000 | 0.00000 | 0.00000 | 0.817 |
| EVI | EVI2 | 0.00000 | 0.00000 | 0.00000 | 0.591 |

¹ Confidence interval; ² Holm-adjusted p-value.
Table 16. Results of the pairwise Wilcoxon signed-rank tests for F1 scores across indices.

| Index A | Index B | Median F1 Difference | Lower 95% CI ¹ | Upper 95% CI ¹ | Adjusted p-Value ² |
|---|---|---|---|---|---|
| CIre | MSAVI | −0.02047 | −0.02247 | −0.01814 | <0.001 |
| CIg | EVI2 | −0.01413 | −0.01600 | −0.01228 | <0.001 |
| MSAVI | NDRE | 0.00921 | 0.00774 | 0.01038 | <0.001 |
| EVI2 | NDVI | 0.00505 | 0.00415 | 0.00614 | <0.001 |
| NDVI | NDWI | 0.00000 | 0.00000 | 0.00000 | 0.899 |
| EVI | EVI2 | 0.00000 | 0.00000 | 0.00000 | 0.709 |

¹ Confidence interval; ² Holm-adjusted p-value.
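The Holm adjustment referred to in the footnotes of Tables 15 and 16 is a step-down correction for multiple comparisons and can be written in a few lines. The sketch below implements only the adjustment step in plain Python; the raw p-values are hypothetical placeholders, since the unadjusted values are not reported here, and the signed-rank test itself is not reproduced.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment; returns adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1); the running
        # maximum keeps the adjusted values monotone in the sorted order.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(running_max, 1.0)
    return adjusted

# Hypothetical raw p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print([round(p, 4) for p in adjusted])
```

In this toy input, the largest raw p-value (0.04) would be left unchanged by its own multiplier of 1, but the monotonicity step raises it to match the adjusted value of the comparison ranked before it.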
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Papić, F.; Miler, M.; Medak, D.; Rumora, L. Harmonic Phenology Mapping: From Vegetation Indices to Field Delineation. Remote Sens. 2026, 18, 1011. https://doi.org/10.3390/rs18071011