Article

Quantitative Analysis of NDVI Temporal Data Using Artificial Neural Networks: A Decision-Making Approach for Precision Agriculture

by Constantin Ilie 1,*, Margareta Ilie 2, Kamer Ainur Aivaz 2, Cristina Duhnea 2 and Silvia Ghiță-Mitrescu 2
1 Faculty of Mechanical, Industrial and Maritime Engineering, Ovidius University of Constanta, 900527 Constanta, Romania
2 Faculty of Economic Sciences, Ovidius University of Constanta, 900527 Constanta, Romania
* Author to whom correspondence should be addressed.
Mathematics 2026, 14(10), 1741; https://doi.org/10.3390/math14101741
Submission received: 6 April 2026 / Revised: 11 May 2026 / Accepted: 15 May 2026 / Published: 19 May 2026

Abstract

The integration of quantitative mathematical methods and artificial intelligence into agricultural monitoring systems represents a critical pathway toward data-driven decision-making in the contemporary precision agriculture economy. This study applies mathematical modeling and quantitative analysis to temporal NDVI (Normalized Difference Vegetation Index) raster datasets from six agricultural parcels in the Dobrogea region of Romania (2017 growing season), with the objective of supporting agronomic performance evaluation and operational decision-making. Higher-order statistical descriptors—variance, kurtosis, and skewness—were extracted from XML raster files and subjected to comprehensive visual analytics using kernel density estimation, three-dimensional surface modeling, and polynomial regression in Python. A feedforward Artificial Neural Network (ANN) with a 4-15-9-3-1 architecture was trained under four activation function and solver combinations (tanh/ReLU × Adam/SGD) to classify satellite sensing-date authenticity (is_sensing_date), a key data-quality indicator for operational crop monitoring workflows. Permutation-based feature importance analysis confirmed that variance is the dominant mathematical predictor (~35.8%), followed by kurtosis (~31.5%) and skewness (~26.6%), while the temporal month variable contributed least (~6.1%). The tanh–SGD configuration yielded the best training–test error balance for most individual datasets, while tanh–Adam performed optimally on the combined dataset. The inverse mathematical relationship between variance and kurtosis, and the direct co-variation between kurtosis and skewness, were consistent across all parcels, demonstrating the universality of these quantitative patterns in agricultural remote sensing data. 
These findings establish a replicable mathematical modeling framework applicable to predictive analytics, risk assessment of data quality, and performance evaluation in agricultural decision-making systems, with direct relevance to digital transformation strategies in the agri-economy sector.

1. Introduction

Precision agriculture increasingly relies on the integration of remote sensing data, computational statistics, and machine learning to support informed decision-making at the field level [1,2]. Among remote sensing indices, the Normalized Difference Vegetation Index (NDVI) is one of the most widely applied metrics for assessing crop health, biomass accumulation, and phenological progression [3,4]. Modern Earth observation platforms generate dense temporal NDVI time series, typically delivered as raster-based datasets, offering an unprecedented volume of spatio-temporal information for agricultural monitoring [5].
Despite their potential, raw NDVI time series present significant challenges for direct interpretation. Cloud contamination, atmospheric interference, and sensor noise introduce systematic and random variability that complicates the identification of true vegetation dynamics [6]. Higher-order statistical descriptors—specifically variance, kurtosis, and skewness—offer a condensed mathematical representation of the distributional characteristics of NDVI pixel populations within a parcel for each temporal observation. These metrics quantify the spread, tail behavior, and asymmetry of the pixel distribution, respectively, thereby providing diagnostic information about within-field spatial heterogeneity and its temporal evolution [7,8]. Their mathematical definitions are rooted in moment theory: variance corresponds to the second central moment, skewness to the standardized third central moment, and kurtosis to the standardized fourth central moment.
Artificial Neural Networks (ANNs) have been successfully applied to agricultural remote sensing problems, including crop type classification, yield estimation, and growth stage prediction [9,10]. Their capacity to model nonlinear relationships between multi-dimensional inputs and categorical or continuous outputs makes them particularly suitable for classifying the quality or origin of satellite-derived observations, such as distinguishing actual sensing dates from synthetic or interpolated values [11]. Feature importance analysis within trained ANNs further provides explainability regarding which statistical descriptors are most informative for the prediction task [12].
Despite the growing body of work on NDVI time series classification using machine learning, several gaps remain insufficiently addressed. First, most ANN-based classification studies operate on scene-level or pixel-level imagery rather than on pre-aggregated parcel-level statistical descriptors, limiting applicability in operational farm management contexts. Second, the automated discrimination of actual versus interpolated sensing dates has received limited attention at the statistical descriptor level, with existing approaches relying on satellite-operator quality flags rather than learned distributional signatures. Third, the XML-based PAMDataset format represents a lightweight, vendor-neutral, human-readable data exchange standard increasingly adopted by national Land Parcel Identification System (LPIS) implementations across Europe, yet its exploitation using machine learning has not been systematically documented.
The use of XML-structured datasets offers specific advantages over raster-native formats such as GeoTIFF stacks (a georeferenced image format based on standard TIFF) or NetCDF (Network Common Data Form) archives: (i) per-band statistical metadata including variance, kurtosis, and skewness are pre-computed and directly accessible without pixel-level image processing; (ii) the hierarchical XML structure naturally encodes parcel identity, temporal sequencing, and quality flags within a single file; and (iii) XML files are directly importable into standard data science environments (Python, R, Excel) without specialized Geographic Information System (GIS) software, lowering the technical barrier for agronomic end-users.
The present study addresses the interdisciplinary analysis of agricultural NDVI datasets provided in XML format. Three main objectives were pursued: (i) definition and formulation of the statistical descriptors employed; (ii) comprehensive visual exploration of relationships between descriptors and temporal variables; and (iii) training, evaluation, and feature importance analysis of feedforward ANNs for classifying sensing date authenticity.
The remainder of the article is organized as follows. Section 2 provides a literature review covering statistical descriptors for NDVI time series analysis, remote sensing data quality and sensing date classification, the application of artificial neural networks in agricultural remote sensing, and visual analytics methods for agricultural data. Section 3 describes the materials and methods, including the dataset characteristics, the mathematical definitions of the statistical descriptors employed, the data processing and visual analytics pipeline, and the ANN architecture and training procedure. Section 4 presents the results, comprising the temporal relationships between statistical descriptors, the distribution of kurtosis and skewness stratified by sensing date flag and calendar month, the KDE and polynomial regression analyses of descriptor relationships, the ANN training error curves and performance metrics, and the permutation-based feature importance analysis. Section 5 discusses the findings in the context of the existing literature and identifies the principal limitations and future research directions. Section 6 summarizes the principal conclusions of the study.

2. Literature Review

2.1. Statistical Descriptors for NDVI Time Series Analysis

The use of higher-order statistical moments for characterizing remote sensing pixel distributions has a well-established theoretical basis. DeCarlo [8] provided a foundational treatment of kurtosis, clarifying its interpretation as a measure of tail heaviness and demonstrating its sensitivity to distributional outliers—properties that are directly relevant when assessing the quality of NDVI raster bands contaminated by cloud shadows or atmospheric noise. Similarly, Moors [13] demonstrated that kurtosis is intimately linked to the presence of bimodal structures within a distribution, a feature that can arise in heterogeneous agricultural parcels containing both vegetated and bare soil areas.
The application of variance as a measure of within-field NDVI heterogeneity has been extensively documented in precision agriculture research. Pettorelli et al. [4] showed that spatial variance in NDVI is strongly correlated with vegetation structural diversity, a finding that supports the use of variance as the primary predictor variable in ANN-based classification tasks. More recently, Delegido et al. [14] demonstrated that the temporal evolution of variance in multi-temporal NDVI stacks can effectively discriminate between crop types at field parcel scale, with variance reaching peak values during periods of canopy closure. These findings are consistent with the inverse relationship between variance and kurtosis observed in the present study: as canopy becomes more uniform (lower variance), pixel distributions become sharply peaked (high kurtosis).
The skewness of NDVI pixel distributions has received comparatively less attention, though its utility as a crop condition indicator has been documented by Vibhute et al. [15], who showed that left-skewed NDVI distributions are characteristic of stressed or senescing crops, while right-skewed distributions correspond to vigorous, actively growing canopies. The interaction between skewness and the seasonal sensing date flag investigated in the present study adds a novel temporal dimension to this body of work.

2.2. Remote Sensing Data Quality and Sensing Date Classification

Temporal compositing and gap-filling are standard operations in satellite NDVI time series production, and the distinction between actual sensing dates and interpolated or composited values is a recurrent challenge in operational crop monitoring systems. Kandasamy et al. [6] systematically evaluated seven smoothing and gap-filling algorithms applied to MODIS LAI time series, concluding that the choice of method substantially affects phenological metrics derived from the smoothed series. Their work established that observations flagged as non-sensing dates (i.e., gap-filled values) introduce biases in phenology extraction that are difficult to correct post hoc.
Shen et al. [11] reviewed missing data reconstruction approaches for remote sensing time series, categorizing methods into temporal interpolation, spatial interpolation, spectral reconstruction, and hybrid approaches. They noted that temporally interpolated values tend to underestimate variance and overestimate kurtosis compared to actual acquisition dates—a pattern that is consistent with the distributional differences observed between True and False is_sensing_date observations in the present study. Verger et al. [16] specifically addressed the Sentinel-2 constellation and showed that the effective temporal resolution in cloud-prone regions such as the Black Sea coast of Romania can drop to below one actual observation per month during spring, making gap-filling a dominant data generation mechanism for winter cereal monitoring.
The automated classification of remote sensing observation quality using machine learning has emerged as an active research direction. Mateo-García et al. [17] trained convolutional neural networks to classify Sentinel-2 scenes by cloud fraction and surface reflectance quality, achieving classification accuracies exceeding 95%. However, their approach operated at the scene level rather than at the statistical descriptor level of individual parcels. The present study’s ANN architecture, which classifies individual band observations using within-parcel statistical descriptors, represents a complementary and computationally lighter approach suited to operational parcel-level monitoring.

2.3. Artificial Neural Networks in Agricultural Remote Sensing

The application of ANNs to agricultural remote sensing problems has grown substantially following the availability of free, cloud-computing accessible satellite data. Kamilaris and Prenafeta-Boldú [9] reviewed 40 deep learning studies in agriculture, finding that convolutional neural networks (CNNs) dominated crop classification tasks while feedforward MLPs remained competitive for regression and time series classification tasks with small datasets. Their meta-analysis identified dataset size as the most critical factor determining model performance, with studies using fewer than 1000 training samples frequently reporting high variance in test accuracy—a finding directly applicable to the limited dataset used in the present study.
Pantazi et al. [10] applied competitive learning ANNs to predict within-field wheat yield variability from multi-source sensor data, including satellite-derived vegetation indices. They reported that network architectures with 2–3 hidden layers consistently outperformed single-layer networks, and that the hyperbolic tangent activation function yielded more stable convergence than sigmoid or linear functions on heterogeneous agricultural datasets. These architectural preferences align with the results of the present study, where configurations with at least two hidden layers and the tanh activation function produced the best training metrics.
Recurrent neural network approaches, particularly Long Short-Term Memory (LSTM) networks, have demonstrated strong performance for temporal NDVI classification. Russwurm and Körner [18] applied LSTM networks to multi-temporal Sentinel-2 time series for crop type mapping, achieving overall accuracies of 83% with substantially smaller training sets than equivalent CNN approaches. Their demonstration that LSTM networks can leverage the sequential structure of time series data—treating observations in temporal order rather than as independent samples—suggests a clear upgrade path from the feedforward MLP architecture employed in the present study.
The use of feature importance analysis for ANN explainability in remote sensing has been addressed by Zhong et al. [19], who compared permutation-based importance scores with gradient-based saliency maps for crop type classification. Their results showed that variance-derived features consistently ranked highest in importance across multiple datasets and classification architectures, corroborating the findings of the present study. They also noted that temporal features (month, day-of-year) contributed disproportionately less than spectral or statistical features, consistent with the low importance of the month variable (~6.1%) observed here.

2.4. Visual Analytics for Agricultural Data

The integration of visual analytics tools into agricultural data workflows has been recognized as essential for bridging the gap between data scientists and domain experts. Benos et al. [20] reviewed the application of data visualization in precision agriculture, emphasizing that exploratory visual analysis—including distribution plots, correlation matrices, and temporal heatmaps—substantially reduces the time required to identify relevant features for subsequent machine learning modeling. Their framework for “visually guided feature engineering” is methodologically aligned with the approach taken in the present study, where violin plots, joint KDE plots, and 3D surface representations were used to identify the inverse kurtosis–variance relationship and the seasonal clustering of sensing date types before ANN training.
The Python scientific stack employed in the present study (NumPy, Pandas, Matplotlib, Seaborn, Plotly, Scikit-learn) has become the de facto standard for agricultural data analysis pipelines. Kluyver et al. [21] documented the advantages of Jupyter Notebook-based workflows for reproducibility in agricultural remote sensing, and the libraries referenced in the present study are all natively supported in this environment. The specific use of sns.catplot with violin representation for is_sensing_date analysis parallels the approach of Inglada et al. [22], who used violin plots to characterize the inter-annual variability of NDVI statistics across crop type classes in the Sentinel-2 time series archive.

3. Materials and Methods

3.1. Dataset Description

The data were provided, during a research contract, in XML format (PAMDataset structure, see Figure 1), containing NDVI-derived statistical descriptors for individual agricultural parcels registered in Romania’s Land Parcel Identification System (LPIS). Each XML file corresponds to a single parcel and contains a sequence of raster bands, each representing one daily observation. Parcel-level metadata includes the declared area (area_decla), block identifier (bloc_nr), land use category (cat_use), crop code (crop_code), and a unique geographic identifier (gid). The temporal extent covers the 2017 growing season from 1 March to 31 October.
The statistical descriptors stored in the PAMDataset XML files are computed on the raw digital number (DN; examples are shown in Table 1) or reflectance-scaled pixel values prior to final NDVI normalization to the [−1, 1] interval. The PAMDataset format stores band statistics on the internal numeric representation of the raster, which may correspond to integer-scaled reflectance values (e.g., scaled by a factor of 10,000 for Sentinel-2 Level-2A products) rather than the normalized floating-point NDVI. Consequently, variance values reported in the XML files are in the units of the underlying raster data type and may substantially exceed values expected from a [−1, 1] normalized index. All descriptor values were extracted directly from the XML metadata without additional re-normalization, and were subsequently scaled by division by $10^n$ for graphical display purposes only, as described in Section 3.3.
The binary flag is_sensing_date is set to True when a band corresponds to an actual satellite overpass containing valid surface reflectance data acquired over the parcel, and to False when the band represents a synthetic estimate produced by temporal linear interpolation to fill gaps in the acquisition series. Unlike true satellite measurements, gap-filled values are inserted solely to preserve continuity in the daily time series and do not reflect any real observation. This distinction carries significant implications for downstream analyses, as interpolated values tend to systematically suppress variance and may introduce distortions in kurtosis and skewness estimates relative to the distributions characteristic of genuine observational data.
The NDVI values from which the statistical descriptors were derived were computed from Sentinel-2 MultiSpectral Instrument (MSI) Level-2A surface reflectance imagery, processed through the Sen2Cor atmospheric correction algorithm. For each parcel and each acquisition date, values were aggregated to produce the statistical descriptors stored in the PAMDataset XML.
The theoretical temporal revisit frequency of the combined Sentinel-2A and Sentinel-2B constellation over Romania is 5 days under cloud-free conditions; however, effective temporal coverage was substantially reduced by cloud contamination, particularly during March and May, as documented in the results (Section 4.2). Individual parcel boundaries were extracted from Romania’s Land Parcel Identification System (LPIS) as provided by the Agency for Payments and Intervention in Agriculture (APIA, https://www.apia.org.ro, accessed: 1 March 2017). The Sentinel-2 Level-2A products used in this study are publicly accessible through the Copernicus Open Access Hub (https://www.copernicus.eu/en/access-data/conventional-data-access-hubs, accessed: 1 March 2017) or the Copernicus Data Space Ecosystem (https://dataspace.copernicus.eu, accessed: 1 March 2017) using the parcel geographic coordinates and the temporal range specified above.
The six agricultural parcels are located in the Dobrogea region of southeastern Romania, a semi-arid plateau bounded by the Danube River to the west and north and the Black Sea coast to the east (Figure 2).
The region is characterized by a continental climate with sub-Mediterranean influences, receiving annual precipitation of 350–450 mm predominantly concentrated in the autumn-winter period, making spring and early summer the critical growth windows for winter cereals. The parcels were selected to represent a diversity of field sizes, crop types, and spatial NDVI heterogeneity present in the LPIS registry for this region.
For each daily band, the following descriptors are pre-computed and stored: count (number of valid pixels), mean, variance, kurtosis, skewness, minimum, and maximum (see Figure 3). A binary flag is_sensing_date indicates whether the band corresponds to an actual satellite observation or an interpolated estimate.
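The per-band layout described above can be read with the Python standard library alone. The following is a minimal sketch, not the parser used in the study: the element names follow the usual PAMDataset convention (PAMRasterBand, Metadata, MDI), but the key names, the `date` field, and all numeric values in the sample are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample file content. The tag layout follows the common
# PAMDataset convention; every key name and value below is an assumption.
SAMPLE_XML = """<PAMDataset>
  <PAMRasterBand band="1">
    <Metadata>
      <MDI key="date">2017-04-03</MDI>
      <MDI key="is_sensing_date">True</MDI>
      <MDI key="count">412</MDI>
      <MDI key="mean">5321.4</MDI>
      <MDI key="variance">184512.7</MDI>
      <MDI key="kurtosis">2.91</MDI>
      <MDI key="skewness">-0.44</MDI>
    </Metadata>
  </PAMRasterBand>
  <PAMRasterBand band="2">
    <Metadata>
      <MDI key="date">2017-04-04</MDI>
      <MDI key="is_sensing_date">False</MDI>
      <MDI key="count">412</MDI>
      <MDI key="mean">5344.0</MDI>
      <MDI key="variance">171020.3</MDI>
      <MDI key="kurtosis">3.20</MDI>
      <MDI key="skewness">-0.38</MDI>
    </Metadata>
  </PAMRasterBand>
</PAMDataset>"""

NUMERIC_KEYS = ("count", "mean", "variance", "kurtosis", "skewness")


def parse_pam_bands(xml_text):
    """Return one dict per raster band holding its descriptor values."""
    root = ET.fromstring(xml_text)
    records = []
    for band in root.iter("PAMRasterBand"):
        # Collect all key/value metadata items of this band.
        items = {mdi.get("key"): (mdi.text or "").strip() for mdi in band.iter("MDI")}
        record = {"band": int(band.get("band")), "date": items.get("date")}
        record["is_sensing_date"] = items.get("is_sensing_date", "").lower() == "true"
        for key in NUMERIC_KEYS:
            record[key] = float(items[key]) if key in items else None
        records.append(record)
    return records


bands = parse_pam_bands(SAMPLE_XML)
```

The resulting list of dictionaries can be passed directly to `pandas.DataFrame(bands)` for the tabular processing described in Section 3.3.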
Six datasets—coded 343, 314, 977, 978, 686, and 306—were selected for visual analysis. Three of these (343, 977, 978) were further used for ANN modeling, alongside a combined dataset (“all”) formed by aggregating all three.

3.2. Statistical Descriptors

Let $\{x_1, x_2, \ldots, x_N\}$ be the set of $N$ valid NDVI pixel values within a given parcel for a single daily observation, and let $\bar{x}$ be the sample mean. The three main statistical descriptors analyzed in this study are formally defined as follows.
The variance ($\sigma^2$) quantifies the heterogeneity of NDVI within the parcel as the second central moment, normalized by the number of pixels:
$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N},$$
where:
$\sigma^2$: the population variance.
$N$: the total number of observations (pixels or data points in the parcel).
$x_i$: the individual value of each observation.
$\bar{x}$: the arithmetic mean of all values,
$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}.$$
$(x_i - \bar{x})^2$: the square of the deviation of each value from the mean.
$\sigma$: the standard deviation of the data set.
Skewness (SK) quantifies distributional asymmetry around the mean as the standardized third central moment:
$$SK = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^3}{(N - 1) \cdot \sigma^3}.$$
Kurtosis (K) describes the heaviness of distributional tails relative to a normal distribution, defined via the standardized fourth central moment:
$$K = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^4}{N \cdot \sigma^4}.$$
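The three definitions translate directly into a few lines of NumPy. The sketch below follows the formulas as written in this section (population variance with divisor $N$, skewness with divisor $(N - 1) \cdot \sigma^3$, and non-excess kurtosis with divisor $N \cdot \sigma^4$); the function name `ndvi_descriptors` is introduced here for illustration only.

```python
import numpy as np


def ndvi_descriptors(pixels):
    """Variance, skewness and kurtosis of a pixel sample, per Section 3.2."""
    x = np.asarray(pixels, dtype=float)
    n = x.size
    dev = x - x.mean()                      # centered deviations (x_i - mean)
    variance = np.sum(dev**2) / n           # second central moment
    sigma = np.sqrt(variance)
    skewness = np.sum(dev**3) / ((n - 1) * sigma**3)
    kurtosis = np.sum(dev**4) / (n * sigma**4)
    return variance, skewness, kurtosis


# Small symmetric sample: deviations are -2, -1, 0, 1, 2.
variance, skewness, kurtosis = ndvi_descriptors([1, 2, 3, 4, 5])
```

With this definition a normal distribution has a kurtosis of about 3. The function assumes at least two pixels and a non-zero variance; a constant-valued parcel would need separate handling.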
The mathematical relationship between these descriptors stems from their common dependence on $\sigma^2$ and the centered deviations $(x_i - \bar{x})$. Specifically, since both skewness and kurtosis are functions of the same standardized deviations, raised to the third and fourth power respectively, they exhibit codirectional variations, but kurtosis generates larger inflection amplitudes due to its higher exponent. The inverse relationship between variance and kurtosis results from the fact that as the dispersion of pixels within the parcel increases (high $\sigma^2$), the distributional mass is spread more evenly across the tails, flattening the peak and reducing $K$.
Consider a pixel population in which a scaling perturbation increases the spread of values uniformly: each deviation $(x_i - \bar{x})$ is multiplied by a factor $c > 1$. Under this transformation, the variance scales as:
$$\sigma'^2 = c^2 \cdot \sigma^2,$$
while the fourth central moment, defined as:
$$\mu_4 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^4}{N},$$
scales as $c^4 \cdot \mu_4$. Since kurtosis ($K$) is defined as:
$$K = \frac{\mu_4}{\sigma^4},$$
the kurtosis transforms as:
$$\frac{c^4 \cdot \mu_4}{(c^2 \cdot \sigma^2)^2} = \frac{c^4 \cdot \mu_4}{c^4 \cdot \sigma^4} = \frac{\mu_4}{\sigma^4},$$
remaining invariant under uniform scaling. This invariance implies that kurtosis is sensitive not to the absolute spread of the distribution, but to the relative concentration of mass in the tails versus the peak.
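The invariance can be checked numerically. The sketch below draws an arbitrary skewed sample (a gamma distribution chosen purely for illustration, not taken from the study data), stretches every deviation from the mean by a factor $c$, and compares variance and kurtosis before and after.

```python
import numpy as np


def kurtosis(x):
    """Non-excess kurtosis mu_4 / sigma^4 with population moments."""
    dev = x - x.mean()
    sigma2 = np.mean(dev**2)
    return np.mean(dev**4) / sigma2**2


rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=1.0, size=1000)   # illustrative sample only

c = 3.0
scaled = x.mean() + c * (x - x.mean())           # multiply each deviation by c

var_ratio = scaled.var() / x.var()               # expected to equal c**2
k_before, k_after = kurtosis(x), kurtosis(scaled)
```

The variance ratio equals $c^2$ while the two kurtosis values agree to floating-point precision, which is the behavior derived above.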
In agricultural NDVI pixel populations, an increase in within-parcel variance typically arises from the co-presence of spectrally heterogeneous surface types (e.g., vegetated and bare-soil pixels). This broadens the distribution by adding mass to intermediate values rather than to the extreme tails. Formally, if the pixel distribution is represented as a mixture of two unimodal components with means $\mu_1$ and $\mu_2$ and equal variances $\sigma_0^2$, the kurtosis of the mixture decreases as the separation $(\mu_1 - \mu_2)$ increases. Consequently, as $\sigma^2$ increases due to greater spectral heterogeneity, the distribution transitions from leptokurtic to platykurtic, producing the inverse proportionality between variance and kurtosis observed empirically across the six parcels in this study.
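A worked special case may make the mixture argument concrete. It rests on two assumptions that go beyond the text: the two components are Gaussian, and they carry equal weights of 1/2. Writing $\Delta = \mu_1 - \mu_2$ for the separation, the variance and fourth central moment of the mixture, and hence its kurtosis, are:

```latex
\sigma^2 = \sigma_0^2 + \frac{\Delta^2}{4}, \qquad
\mu_4 = 3\sigma_0^4 + \frac{3}{2}\,\sigma_0^2\,\Delta^2 + \frac{\Delta^4}{16},
\qquad
K = \frac{\mu_4}{\sigma^4}
  = 3 - 2\left(\frac{\Delta^2}{4\sigma_0^2 + \Delta^2}\right)^{2}.
```

Under these assumptions $K$ equals 3 at $\Delta = 0$ and decreases monotonically toward 1 as $\Delta$ grows, while $\sigma^2$ increases, so variance and kurtosis move in opposite directions. With strongly unequal weights the behavior can differ (a small, distant component can raise kurtosis), so this case illustrates the mechanism rather than proving it in general.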
These three descriptors, together with the month of observation, constitute the input feature vector $X = (\sigma^2, SK, K, \text{month})$ for the ANN models.

3.3. Data Processing and Visual Analytics

Raw XML files were parsed and converted to Microsoft Excel format (.xlsx) to facilitate accessibility for both expert and non-expert stakeholders. Descriptor values were normalized to sub-unity scale by dividing by $10^n$ ($n \in \mathbb{N}$) to harmonize dynamic ranges for graphical display. Observation months were extracted and appended as a categorical variable to enable temporal stratification.
The normalization factor $10^n$ was selected independently for each descriptor to bring its values into a sub-unity or single-digit range suitable for graphical comparison. The values of $n$ applied in the present study are summarized in Table 2.
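The scaling and month-extraction steps can be sketched with Pandas as follows. The exponents in `SCALE_EXPONENTS`, the column names, and the sample rows are hypothetical; the exponents actually used are those listed in Table 2.

```python
import pandas as pd

# Hypothetical exponents n per descriptor (the real values are in Table 2).
SCALE_EXPONENTS = {"variance": 4, "kurtosis": 1, "skewness": 1}


def scale_for_display(df, exponents):
    """Divide each descriptor by 10**n and append the observation month."""
    out = df.copy()                                   # leave the raw table untouched
    for column, n in exponents.items():
        out[column] = out[column] / 10**n
    out["month"] = pd.to_datetime(out["date"]).dt.month
    return out


# Made-up rows standing in for two parsed raster bands.
raw = pd.DataFrame({
    "date": ["2017-03-05", "2017-05-19"],
    "variance": [184512.7, 92310.0],
    "kurtosis": [2.9, 11.4],
    "skewness": [-0.4, 3.1],
})
scaled = scale_for_display(raw, SCALE_EXPONENTS)
```

Working on a copy keeps the unscaled descriptors available, which matters because the scaling is described as serving graphical display only.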
Visual analytics were performed in Python (v3.x) using: NumPy v2.4.0 and Pandas v3.0.0 for data manipulation; Matplotlib v3.10 and Seaborn v0.13.2 for 2D visualization; Plotly v6.7.0 for interactive 3D representations; and Scikit-learn 1.8.0 for regression and ANN modeling. Visualization types included: (1) temporal line charts of statistical descriptors; (2) violin plots of kurtosis and skewness stratified by is_sensing_date flag and calendar month (sns.catplot); (3) joint kernel density estimation plots of kurtosis/skewness against variance (sns.jointplot, KDE fill); (4) 3D triangulated surface plots of kurtosis and skewness as functions of variance and mean (ax.plot_trisurf); (5) 3D scatter plots colored by month (px.scatter_3d); and (6) polynomial regression plots (order 3 for kurtosis, order 4 for skewness) versus variance, stratified by is_sensing_date.

3.4. Artificial Neural Network Architecture and Training

A feedforward multilayer perceptron (MLP) was implemented using Scikit-learn’s MLPClassifier. The fixed network configuration comprised: 4 input neurons (variance, kurtosis, skewness, month); three hidden layers with 15, 9, and 3 neurons, respectively; and 1 output neuron (is_sensing_date, binary classification), giving the 4-15-9-3-1 architecture. Four structural variants were obtained by crossing two activation functions (tanh, the hyperbolic tangent; ReLU, the Rectified Linear Unit) with two optimization solvers (adam, Adaptive Moment Estimation; sgd, Stochastic Gradient Descent): tanh&adam, tanh&sgd, relu&adam, and relu&sgd.
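The two activation functions are defined as:
$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}},$$
where:
sinh(x): hyperbolic sine.
cosh(x): hyperbolic cosine.
e: Euler’s number (approximately 2.718).
$$\mathrm{ReLU}(x) = \max(0, x) \quad \text{or} \quad \mathrm{ReLU}(x) = \begin{cases} x, & \text{if } x \geq 0 \\ 0, & \text{otherwise,} \end{cases}$$
where:
x: input.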
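The two activation functions and the four activation-solver variants can be written down compactly. The sketch below is an illustration under stated assumptions rather than the training script of the study: the hidden-layer tuple follows the 4-15-9-3-1 architecture named in the abstract, while `max_iter`, `random_state`, and the helper name `build_variants` are choices made here.

```python
from itertools import product

import numpy as np
from sklearn.neural_network import MLPClassifier


def tanh(x):
    """Hyperbolic tangent written out from its exponential definition."""
    x = np.asarray(x, dtype=float)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))


def relu(x):
    """Rectified Linear Unit: max(0, x) applied element-wise."""
    return np.maximum(0.0, np.asarray(x, dtype=float))


HIDDEN_LAYERS = (15, 9, 3)   # from the 4-15-9-3-1 architecture in the abstract


def build_variants(max_iter=3500, seed=0):
    """One untrained classifier per activation/solver pair (assumed settings)."""
    return {
        f"{activation}&{solver}": MLPClassifier(
            hidden_layer_sizes=HIDDEN_LAYERS,
            activation=activation,
            solver=solver,
            max_iter=max_iter,
            random_state=seed,
        )
        for activation, solver in product(("tanh", "relu"), ("adam", "sgd"))
    }


variants = build_variants()
```

The input and output layer sizes are not passed explicitly, because MLPClassifier infers them from the shape of the training data and the number of classes.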
Each configuration was trained and evaluated on the three individual datasets and on the combined “all” dataset. Performance was quantified using: Mean Absolute Error (MAE) and Mean Squared Error (MSE) for training and testing partitions, and the model score ($R^2$). Feature importance was estimated by a permutation-based method (feature_importances attribute), returning the percentage contribution of each input variable to the training outcome. Computations were performed on an Intel Core i5-9400 processor (2.90 GHz, 6 cores, 8 GB RAM). Training iterations ranged from 300 to 3500.
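The evaluation step can be sketched end to end on synthetic data. Everything about the data below is an assumption made for illustration (random features, a hand-written labeling rule, a 75/25 split), and permutation importance is computed with `sklearn.inspection.permutation_importance`, swapped in here because, as far as we are aware, MLPClassifier does not itself expose a feature-importance attribute.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the four inputs: variance, skewness, kurtosis, month.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)     # hand-written labeling rule

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = MLPClassifier(hidden_layer_sizes=(15, 9, 3), activation="tanh",
                      solver="adam", max_iter=2000, random_state=0)
model.fit(X_train, y_train)


def errors(fitted, features, labels):
    """MAE and MSE between true and predicted 0/1 labels."""
    predicted = fitted.predict(features)
    return mean_absolute_error(labels, predicted), mean_squared_error(labels, predicted)


train_mae, train_mse = errors(model, X_train, y_train)
test_mae, test_mse = errors(model, X_test, y_test)
score = model.score(X_test, y_test)   # for a classifier this is mean accuracy

# Permutation importance: accuracy drop when one column is shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
raw_importance = np.clip(result.importances_mean, 0.0, None)
share = 100.0 * raw_importance / raw_importance.sum()   # percentage per feature
```

Two properties are worth noting when reading such output. For 0/1 labels, MAE and MSE are identical, since both reduce to the misclassification rate. In scikit-learn, `score` on a classifier returns mean accuracy, so it should be interpreted accordingly when compared with a coefficient of determination.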
Table 3 specifies the hyperparameters for all four ANN configurations. Values in parentheses indicate Scikit-learn defaults retained without modification. Weight initialization follows the Glorot uniform scheme (also known as Xavier initialization) applied by default in Scikit-learn’s MLPClassifier to all weight matrices. The adaptive learning rate schedule for Adam halves the learning rate when training loss does not improve for two consecutive epochs. For SGD and Adam the default momentum of 0.9 applied to parameter updates was used.
To provide a comprehensive overview of the proposed methodology, a flowchart illustrating the analytical pipeline is shown below in Figure 4.
The workflow originates with the ingestion of raw XML PAMDataset files, which are subsequently parsed and converted into structured spreadsheet format (.xlsx). The input features (variance, kurtosis, skewness, and month) are then extracted, and the statistical descriptors are normalized through division by the appropriate power of ten prior to model training. In parallel, six types of visual analytics are generated to support exploratory data analysis. The normalized descriptors are subsequently used to train an Artificial Neural Network under four activation-solver configurations, all sharing the same 4-15-9-3-1 architecture. Model performance is evaluated using standard regression metrics (MAE, MSE, and $R^2$), followed by permutation-based feature importance analysis to assess the relative contribution of each input variable. The terminal step of the pipeline produces a binary classification output determining the authenticity of the sensing date (is_sensing_date). Taken together, this workflow reflects a structured and reproducible methodological framework, spanning data ingestion, feature engineering, visual analytics, model training, and interpretability analysis.

4. Results

4.1. Temporal Relationships Between Statistical Descriptors

Initial temporal plotting of the descriptor time series across all six datasets revealed consistent structural relationships (Figure 5 and Figure 6). Variance exhibits an inversely proportional temporal pattern with respect to kurtosis, though with varying multiplicative factors and non-uniform periodicity of value inflections. Conversely, skewness evolves in direct proportion to kurtosis, albeit with a smaller propagation factor and similarly irregular periodicity.
These relationships are mathematically predictable: the third-power dependence in skewness and the fourth-power dependence in kurtosis produce inflection points of smaller amplitude in the former.

4.2. Kurtosis and Skewness Distributions by Month and Sensing Date

Violin plots of kurtosis stratified by is_sensing_date flag and calendar month (Figure 7; the corresponding plots for skewness are shown in Figure 8) revealed that May is the month in which false sensing dates most often outnumber true sensing dates, a pattern present in 5 of the 6 analyzed datasets. A similar pattern was observed for March (datasets 343 and 978) and October (dataset 306). The months of April, June, and July exhibited the highest total number of observations, irrespective of sensing date flag. Datasets 343 and 314 notably lacked October records.
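The monthly comparison of true and false sensing dates amounts to a cross-tabulation of the flag against the month. A minimal sketch with Pandas is given below; the nine rows are made up for illustration and do not reproduce any of the six datasets.

```python
import pandas as pd

# Made-up observations: one row per daily band.
obs = pd.DataFrame({
    "month": [3, 3, 3, 4, 4, 5, 5, 5, 5],
    "is_sensing_date": [False, False, True, True, True, False, False, False, True],
})

# Readable labels avoid indexing columns by boolean values.
obs["source"] = obs["is_sensing_date"].map({True: "actual", False: "interpolated"})

# Rows: month, columns: observation source, cells: number of bands.
counts = pd.crosstab(obs["month"], obs["source"])

# Months in which interpolated bands outnumber actual acquisitions.
false_dominant = counts.index[counts["interpolated"] > counts["actual"]].tolist()
```

Applied to a real parcel table, the same `counts` frame gives the per-month totals that the violin widths represent visually.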
The violin plots for skewness confirmed the monthly pattern, with false sensing dates predominating in May and March (4 of 6 datasets) and June (2 of 6 datasets). Three-dimensional surface and scatter plots showed that peak values of both kurtosis and skewness cluster predominantly in April, June, and July for kurtosis, and April, March, and May for skewness. These seasonal concentrations likely reflect the agronomic dynamics of winter wheat and other spring–summer crops cultivated in the Dobrogea region.
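The month-level comparison of false and true sensing dates amounts to a simple tally per month and flag value. The sketch below uses a small set of hypothetical (month, is_sensing_date) records, not the study data, and lists the months in which false flags outnumber true ones.

```python
from collections import Counter

# Hypothetical (month, is_sensing_date) records for one parcel
records = [
    (3, False), (3, False), (3, True),
    (4, True), (4, True), (4, False),
    (5, False), (5, False), (5, False), (5, True),
    (6, True), (6, True),
]

counts = Counter(records)
months = sorted({month for month, _ in records})

# Months where interpolated (False) records outnumber real (True) ones
months_false_dominant = [
    m for m in months if counts[(m, False)] > counts[(m, True)]
]
print(months_false_dominant)  # [3, 5]
```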

4.3. Descriptor Relationships: KDE and Regression Analysis

Figure 9 and Figure 10 present the bivariate kernel density estimation (KDE) joint plots for the pairs kurtosis–variance and skewness–variance, respectively. Each panel aggregates all daily observations from a single parcel, with contour density encoding the joint probability of co-occurring descriptor values. The marginal distributions are shown along each axis, enabling simultaneous assessment of the univariate spread and the bivariate concentration region. This representation was selected because it captures the full distributional geometry of the descriptor pairs without imposing any parametric assumption, making it particularly suited to the heterogeneous and temporally mixed nature of the NDVI raster datasets analyzed here.
Joint KDE plots of kurtosis against variance revealed that the majority of observations concentrate within variance values of 0–50 and kurtosis values of −1 to 10 across most datasets. Exceptions were observed for dataset 314 (kurtosis values exceeding 10), dataset 977 (kurtosis concentrated near zero), and dataset 306 (kurtosis up to 3). Joint KDE plots of skewness versus variance showed a primary concentration in variance 0–50 and skewness −2 to 4, with dataset 314 exhibiting skewness above 5 and variance up to 80, and dataset 306 showing variance up to 210.
Polynomial regression analysis confirmed that regression errors for the True is_sensing_date variant are consistently larger than for the False variant, a pattern replicated for skewness. This indicates that actual satellite acquisition dates introduce greater statistical complexity into the descriptor distributions, likely due to genuine vegetation dynamics captured at the moment of sensing.
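The comparison of regression errors between the two flag values can be sketched as follows. The polynomial degree is not stated in this section, so degree 2 is an assumption, and the data are synthetic: the True subset is deliberately generated with larger noise, so the outcome illustrates the computation rather than reproducing the reported result.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
variance = rng.uniform(0.0, 50.0, size=n)
flag = rng.random(n) < 0.5  # synthetic is_sensing_date values

# Synthetic kurtosis: decreasing in variance, noisier where flag is True
noise = np.where(flag, 1.5, 0.5) * rng.normal(size=n)
kurtosis = 8.0 / (1.0 + 0.2 * variance) + noise


def poly_rmse(x, y, degree=2):
    """Fit a polynomial of the given degree and return the residual RMSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.sqrt(np.mean(residuals ** 2)))


rmse_true = poly_rmse(variance[flag], kurtosis[flag])
rmse_false = poly_rmse(variance[~flag], kurtosis[~flag])
print(f"RMSE True: {rmse_true:.3f}  RMSE False: {rmse_false:.3f}")
```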
Figure 11 presents overlaid three-dimensional triangulated surface plots of kurtosis and skewness as simultaneous functions of variance and mean, constructed for all six agricultural parcels. The kurtosis surface is rendered with a transparency factor of α = 0.75, allowing the skewness surface beneath it to remain visible throughout the overlapping regions. Both surfaces were generated using ax.plot_trisurf with the inferno colormap, where warmer tones indicate higher descriptor values. This dual-surface representation was designed to expose the topographic similarity and divergence between the two higher-order descriptors across the joint variance–mean space, providing a direct visual test of the proportionality relationships identified in the temporal line plots.
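A minimal sketch of the dual-surface construction is given below, using ax.plot_trisurf with the inferno colormap and α = 0.75 for the kurtosis surface as described. The data are synthetic, the axis assignment (variance on x, mean on y) is an assumption, and the Agg backend is selected only so that the example runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; an assumption for portability
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
n = 80
mean_ndvi = rng.uniform(0.1, 0.9, n)
variance = rng.uniform(0.0, 50.0, n)
# Synthetic descriptors: kurtosis falls with variance, skewness scales with it
kurtosis = 10.0 / (1.0 + 0.2 * variance) + rng.normal(scale=0.3, size=n)
skewness = 0.4 * kurtosis

fig = plt.figure(figsize=(6, 5))
ax = fig.add_subplot(projection="3d")

# Skewness surface first, then the semi-transparent kurtosis surface on top
ax.plot_trisurf(variance, mean_ndvi, skewness, cmap="inferno")
ax.plot_trisurf(variance, mean_ndvi, kurtosis, cmap="inferno", alpha=0.75)

ax.set_xlabel("variance")
ax.set_ylabel("mean NDVI")
ax.set_zlabel("descriptor value")
```

plot_trisurf triangulates the scattered (x, y) points directly, so no regular grid is needed, which suits irregularly sampled daily observations.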
Across the majority of parcels (343, 977, 978, 686, 306), the two response surfaces exhibit broadly similar topographic profiles: both kurtosis and skewness reach their highest values in the low-to-moderate variance domain combined with intermediate mean NDVI values, consistent with transitional canopy states where pixel heterogeneity is spatially constrained yet the distribution remains asymmetric. The transparency of the kurtosis surface reveals that it consistently sits above the skewness surface in peak regions, reflecting the steeper inflection points produced by the fourth-power standardization in kurtosis relative to the third-power basis of skewness. The most pronounced divergences between the two surfaces are observed for parcels 314 and 686, where the kurtosis surface develops an elevated secondary lobe at high variance values—a feature absent in the corresponding skewness surface—suggesting the presence of extreme pixel outliers that disproportionately amplify kurtosis without a corresponding increase in distributional asymmetry. These inter-parcel differences underscore the complementary, non-redundant informational content carried by kurtosis and skewness as ANN input features, and support their combined use in the classification of sensing date authenticity.
Figure 12 displays, individually, the three-dimensional triangulated surface plots of kurtosis as a joint function of mean NDVI and variance, for each of the six agricultural parcels. Each surface was constructed using ax.plot_trisurf applied to the complete set of daily observations for a given parcel, with mesh triangulation performed directly on the scattered (mean, variance, kurtosis) point cloud. The inferno colormap encodes kurtosis magnitude, with high values rendered in yellow–white and low values in dark purple–black, allowing rapid visual identification of the descriptor’s concentration regions in the mean–variance parameter space.
Figure 13 complements the surface representation of Figure 12 by displaying the same kurtosis data as three-dimensional scatter plots, with each observation point colored according to its calendar month. The plots were generated using plotly.scatter_3d with axes corresponding to mean NDVI (x), variance (y), and kurtosis (z). The monthly color coding—using a discrete palette spanning March through October—enables the temporal dimension to be overlaid on the three-dimensional descriptor space, thereby linking the statistical structure of individual observations to their seasonal context without collapsing the temporal information into a surface average.
Across all parcels, kurtosis peaks are concentrated in the low-to-moderate variance domain (variance roughly 0–50), regardless of mean NDVI level, confirming the inverse proportionality between variance and kurtosis established in the temporal analysis. The highest kurtosis values correspond to periods in which pixel reflectance distributions are sharply peaked—most notably during April, June, and July—when uniform canopy cover reduces within-parcel dispersion and concentrates pixel values near the parcel mean. Notable differences in surface morphology are apparent for parcel 314, which displays a markedly elevated and narrow kurtosis ridge extending into the high-variance region, and for parcel 306, where the variance axis extends considerably further than in the remaining parcels, producing a flattened surface topography that reflects the greater spatial heterogeneity of that parcel’s NDVI pixel population.
The scatter plots confirm that high-kurtosis observations cluster predominantly in the months of April, June, and July across most parcels, consistent with the active vegetative growth phases of winter wheat and spring–summer crops typical of the Dobrogea region. March and October observations, by contrast, tend to occupy the low-kurtosis, high-variance region of the scatter cloud, reflecting early-season and post-harvest conditions where bare-soil and residue pixels coexist with emerging or senescing vegetation, broadening the pixel distribution and reducing peak sharpness. The seasonal stratification is most distinct in parcels 343 and 977, where monthly color clusters are well separated along the kurtosis axis, and least distinct in parcel 306, where the high-variance observations of multiple months overlap considerably, suggesting persistent within-parcel heterogeneity that attenuates the seasonal kurtosis signal.
Figure 14 presents the three-dimensional triangulated surface plots of skewness as a joint function of mean NDVI and variance, structured identically to the kurtosis surfaces in Figure 12 to allow direct visual comparison. The inferno colormap is applied consistently, with warm tones encoding high positive skewness and cool-dark tones encoding near-zero or negative skewness values. The common axis scaling between Figure 12 and Figure 14 was maintained wherever the data range permitted, reinforcing the comparability of the two descriptor response topographies.
Figure 15 presents the three-dimensional scatter plots of skewness colored by calendar month, mirroring the structure of Figure 13 for the skewness descriptor. Axes correspond to mean NDVI (x), variance (y), and skewness (z), with monthly color coding applied using the same discrete palette as in Figure 13. This parallel construction allows a direct side-by-side reading of the seasonal behavior of kurtosis and skewness in the three-dimensional descriptor space, highlighting both similarities and divergences in their temporal patterning.
The skewness surfaces display a broadly similar spatial organization to their kurtosis counterparts, with peak values occurring in the low-variance, intermediate-mean NDVI region. However, the amplitude of skewness peaks is consistently lower than that of the corresponding kurtosis peaks, in agreement with the mathematical relationship between the two descriptors: the third-power standardization in skewness is less sensitive to extreme pixel values than the fourth-power basis of kurtosis, producing shallower surface ridges and less pronounced inflection points. The most notable inter-parcel difference involves parcels 314 and 686, where the skewness surface exhibits a broader, lower-amplitude response compared to the sharply peaked kurtosis surface observed for the same parcels in Figure 12. This dissociation indicates that these parcels contain a subset of daily observations with extreme NDVI outlier pixels that elevate kurtosis selectively, without producing a proportionate increase in distributional asymmetry. Such a pattern is a diagnostic signature of non-Gaussian contamination rather than systematic directional stress.
In contrast to kurtosis, which reached its seasonal peaks predominantly in April, June, and July, skewness peak values occur primarily in April, March, and May across most parcels. This earlier seasonal concentration reflects the asymmetric NDVI distributions characteristic of early-season conditions, when partially vegetated pixels coexist with bare-soil responses and the pixel distribution is right-skewed due to the emerging high-NDVI tail. As the growing season progresses into June and July, canopy closure reduces distributional asymmetry while simultaneously sharpening the peak—raising kurtosis while lowering skewness—producing the seasonal offset between the two descriptors visible when comparing Figure 13 and Figure 15. This complementary seasonal behavior reinforces the non-redundancy of kurtosis and skewness as ANN input features and provides a mechanistic agronomic rationale for the feature importance values, where both descriptors contribute substantially and independently to the classification of sensing date authenticity.

4.4. ANN Training Errors and Model Performance

Figure 16, Figure 17, Figure 18 and Figure 19 present the network error curves recorded during the training of the feedforward ANN for each dataset and activation–solver configuration. Figure 16 corresponds to dataset 343, Figure 17 to dataset 977, Figure 18 to dataset 978, and Figure 19 to the combined dataset ‘all’. Within each figure, panel (a) shows the tanh–Adam configuration, panel (b) tanh–SGD, panel (c) ReLU–Adam, and panel (d) ReLU–SGD. The horizontal axis represents the number of training iterations and the vertical axis the mean squared training error, both rendered on a linear scale. The curves were recorded at each iteration step using the loss_curve_ attribute of Scikit-learn’s MLPClassifier, providing a complete trace of the optimization trajectory from parameter initialization to convergence or maximum iteration limit.
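The four activation–solver configurations and the recording of their loss curves can be sketched as below. Only the 4-15-9-3-1 layout, the activation and solver names, and the use of loss_curve_ come from the text; the synthetic data, the iteration cap of 300, and the suppression of convergence warnings are assumptions made to keep the example short.

```python
import warnings
from itertools import product

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)
# Synthetic stand-in for the four inputs: variance, kurtosis, skewness, month
X = rng.random((120, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0.75).astype(int)  # synthetic binary flag

loss_curves = {}
for activation, solver in product(("tanh", "relu"), ("adam", "sgd")):
    model = MLPClassifier(
        hidden_layer_sizes=(15, 9, 3),  # hidden layers of the 4-15-9-3-1 net
        activation=activation,
        solver=solver,
        max_iter=300,                   # assumed cap, not from the article
        random_state=42,
    )
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", ConvergenceWarning)
        model.fit(X, y)
    # One loss value per training iteration
    loss_curves[(activation, solver)] = model.loss_curve_

for config, curve in loss_curves.items():
    print(config, len(curve), round(curve[-1], 4))
```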
The loss curves reveal substantial differences in convergence speed, final error level, and trajectory smoothness across datasets and configurations. For dataset 343 (Figure 16), the tanh–Adam configuration (panel (a)) achieves the most gradual and stable descent, requiring approximately 3000 iterations to reduce the error from an initial value of ~0.12 to a final plateau near 0.04. The tanh–SGD variant (panel (b)) converges substantially faster, within ~500 iterations, but produces a markedly oscillatory curve with persistent fluctuations in the range 0.085–0.110, indicative of the high gradient variance inherent to stochastic gradient descent on a small dataset. The ReLU–Adam configuration (panel (c)) exhibits the steepest initial drop—from ~0.45 to below 0.10 in the first 250 iterations—followed by a slow asymptotic approach to ~0.05 over ~1750 total iterations, a pattern consistent with the large initial weight updates characteristic of ReLU networks with Adam momentum. Dataset 977 (Figure 17) yields the most compact loss curves overall: all four configurations converge within 350–1600 iterations, and the final error values (~0.04–0.05) are the lowest among the three individual datasets, confirming that dataset 977 is the most tractable for ANN classification and the closest in error profile to the combined ‘all’ dataset. The ReLU–Adam curve for dataset 977 (panel (c)) is notably compressed, spanning only ~250 iterations with a very narrow error range (0.049–0.055), suggesting near-immediate convergence to a local minimum. Dataset 978 (Figure 18) shows intermediate convergence behavior: tanh–Adam (panel (a)) and ReLU–SGD (panel (d)) require the most iterations (~1750 and ~2000, respectively), while ReLU–Adam (panel (c)) again converges fastest (~400 iterations) with a sharp monotonic decline from ~0.18 to ~0.04. 
The combined dataset ‘all’ (Figure 19) produces the smoothest loss curves of all four figures: both tanh configurations (panels (a) and (b)) converge within 700–800 iterations with limited oscillation and final errors in the range 0.050–0.055, while the ReLU–Adam curve (panel (c)) replicates the steep initial drop observed in individual datasets but stabilizes earlier and at a lower final error. The generally smoother and lower-error trajectories of the ‘all’ dataset reflect the stabilizing effect of the larger training sample on gradient estimation, reducing the sensitivity to initialization and providing a more reliable loss landscape for optimization—consistent with the superior training scores reported for this dataset in Table 4.
Table 4 presents the complete error metrics for all 16 ANN configurations (4 activation–solver combinations × 4 datasets). Training loss curves showed convergence within 300–3500 iterations, with occasional pronounced inflections, indicating sensitivity to learning rate and weight update momentum. The tanh–SGD configuration produced the best results for datasets 343 and 977, ReLU–Adam was optimal for dataset 978, and tanh–Adam achieved the best overall balance for the combined ‘all’ dataset. Training scores consistently exceeded 0.65, whereas testing scores were frequently negative, indicating overfitting related to limited data volume.

4.5. Feature Importance Analysis

Figure 20, Figure 21, Figure 22 and Figure 23 display the permutation feature importance results for each ANN configuration and dataset, expressed as percentage contributions of the four input variables—variance, kurtosis, skewness, and month—to the overall classification outcome.
Permutation-based feature importance estimates the marginal contribution of each input variable by measuring the increase in classification error when that variable’s values are randomly shuffled across the test set, thereby breaking its association with the target while leaving all other variables intact. A higher score indicates that the model relies more heavily on that variable; a near-zero score indicates that permuting it causes no measurable loss in accuracy.
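A minimal sketch of this procedure with scikit-learn's permutation_importance is shown below. The data are synthetic (only the first column is informative), the learning rate and repeat count are arbitrary choices, and the conversion of raw scores to percentages by clipping negatives and rescaling to a sum of 100 is an assumption about how the reported ratios were obtained.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

feature_names = ["variance", "kurtosis", "skewness", "month"]

rng = np.random.default_rng(0)
X = rng.random((400, 4))
y = (X[:, 0] > 0.5).astype(int)  # synthetic target driven by column 0 only

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42
)

model = MLPClassifier(
    hidden_layer_sizes=(15, 9, 3),
    activation="tanh",
    solver="adam",
    learning_rate_init=0.01,  # assumed; chosen so the toy model trains quickly
    max_iter=1000,
    random_state=42,
)
model.fit(X_train, y_train)

# Mean drop in test accuracy when each column is shuffled independently
result = permutation_importance(
    model, X_test, y_test, n_repeats=20, random_state=42
)

# Assumed normalization: clip negative scores, rescale to percentages
raw = np.clip(result.importances_mean, 0.0, None)
percent = 100.0 * raw / raw.sum()

for name, value in zip(feature_names, percent):
    print(f"{name}: {value:.1f}%")
```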
Figure 20 corresponds to dataset 343, Figure 21 to dataset 977, Figure 22 to dataset 978, and Figure 23 to the combined dataset ‘all’. Within each figure, panels (a) through (d) correspond to the tanh–Adam, tanh–SGD, ReLU–Adam, and ReLU–SGD configurations, respectively. Each panel presents a horizontal bar chart in which bar length encodes the percentage importance of each input feature, with the x-axis labeled as Ratio, % and values annotated directly on the bars.
The four input features (variance, kurtosis, skewness, and month) were selected according to three guiding criteria: their mathematical derivability directly from the pre-computed per-band statistics stored in the XML structure, without requiring additional processing or external data sources; their theoretical complementarity, as variance, skewness, and kurtosis correspond to the second, third, and fourth central moments of the NDVI pixel distribution, respectively; and parsimony, given that the limited sample size necessitates a low-dimensional feature space to mitigate overfitting. Additionally, minimum, maximum, and mean were deliberately excluded, as they are embedded in the computation of the retained variables, and their inclusion would have introduced redundancy into the neural network training process. The month variable was further included as a proxy for seasonal phenological context.
Across the four datasets and sixteen ANN configurations, the feature importance hierarchy is broadly consistent: variance generally ranks first, followed by kurtosis and skewness at comparable levels, with month contributing least in every case. For dataset 343 (Figure 20), variance importance ranges from 34.83% (ReLU–SGD, panel (d)) to 38.14% (ReLU–Adam, panel (c)), while kurtosis spans 28.53–34.54% and skewness 22.75–27.82%; month remains below 10% in all panels, with its highest value of 9.14% recorded for ReLU–Adam. Dataset 977 (Figure 21) shows the most notable departure from the standard hierarchy: in panels (a) and (c), kurtosis (36.88% and 33.83%, respectively) exceeds variance (31.48% and 33.00%), clearly in panel (a) and marginally in panel (c), suggesting that for this parcel the tail-heaviness descriptor carries information not fully captured by variance alone. In panels (b) and (d) of Figure 21, variance reasserts its dominance (35.85% and 36.06%), and the kurtosis–variance gap widens again. For dataset 978 (Figure 22), the ordering of kurtosis and skewness is inverted: panels (a) through (d) all show skewness ranking above kurtosis (for instance, panel (a) gives skewness 34.54% vs. kurtosis 25.55%), indicating that for this parcel distributional asymmetry carries greater discriminative power than tail-heaviness, a pattern consistent with a pixel population dominated by systematic spatial gradients rather than isolated extreme values. The combined dataset ‘all’ (Figure 23) restores the canonical hierarchy across all four configurations: variance 35.76–38.34%, kurtosis 28.28–30.62%, skewness 26.16–27.84%, and month 6.01–6.82%. The narrower inter-configuration spread observed in Figure 23 compared to the individual dataset figures reflects the homogenizing effect of the larger and more diverse training sample, which reduces sensitivity to the specific optimization path taken by each activation–solver combination and produces a more stable importance estimate.
The persistently low month importance across all figures confirms that the ANN learned to discriminate between actual and interpolated sensing dates primarily on the basis of the instantaneous distributional structure of NDVI pixels, rather than on the calendar position of the observation—a desirable property for an operational sensing-quality classifier intended to function year-round without seasonal recalibration.
Table 5 summarizes the feature importance values (percentage contribution to training outcome) for the three individual datasets across all configurations.
Variance consistently ranked as the most important input feature (~35.8% average), followed by kurtosis (~31.5%) and skewness (~26.6%). The temporal variable month contributed least (~6.1%). This hierarchy is theoretically consistent: variance serves as the computational foundation for both kurtosis and skewness, and its dominance in feature importance is therefore mathematically justified. The low importance of month indicates that the ANN appropriately learned that distributional properties of the pixel population, rather than the calendar acquisition date, are the primary determinants of sensing date authenticity.

5. Discussion

The inverse proportionality between variance and kurtosis observed across all datasets is consistent with established statistical theory: as pixel-level variability increases (high variance), distributional tails become relatively lighter compared to the peak, reducing kurtosis. This behavior is particularly relevant in agricultural contexts, where high within-field variance typically indicates transitional phenological states—crop emergence or senescence—characterized by diverse pixel responses [23]. Conversely, periods of uniform canopy development yield low variance, highly peaked (leptokurtic) distributions.
The seasonal concentration of false sensing dates (is_sensing_date = False) in May and March across multiple datasets warrants particular attention. These months coincide with the most dynamic phases of winter cereal growth in the Dobrogea region (tillering, stem elongation, and the transition to reproductive stages), which are also periods of frequent cloudiness and atmospheric disturbance along the Black Sea coast [24]. Satellite operators frequently resort to temporal interpolation during these windows, artificially inflating the ratio of synthetic to real observations. This has direct implications for the reliability of phenology-based analyses that do not distinguish actual from interpolated records.
The superiority of tanh–SGD for most individual datasets is consistent with the machine learning literature: the hyperbolic tangent function provides symmetric output and avoids the dying-neuron problem of ReLU in small networks, while SGD’s stochastic updates help escape shallow local minima that Adam may converge to prematurely with small datasets [25]. The requirement for at least two hidden layers suggests that the classification boundary between actual and interpolated sensing dates is not linearly separable in the four-dimensional descriptor space.
The ANN classification task proved more challenging than anticipated, as reflected by consistently negative test scores for individual datasets. Negative R² values indicate that the model performs worse than a naive mean predictor on the test set: the model learns the specific statistical signatures of the training partition but fails to generalize to the test partition, a consequence of the limited training sample size (below 5% of available data). The variability in ANN performance across datasets can be attributed to two interacting factors: dataset size and class imbalance in the is_sensing_date flag. The poor test generalization observed for individual datasets is primarily a data scarcity problem rather than an architectural deficiency, as evidenced by the substantially improved test scores on the combined ‘all’ dataset. This interpretation is supported by the learning curve analysis: training scores consistently exceed 0.65 across all configurations, confirming that the models learn meaningful discriminative patterns within the training partition. The transition to positive test generalization observed for the ‘all’ dataset demonstrates that the 4-15-9-3-1 architecture is capable of generalization when provided with sufficient and diverse training examples. Dataset 977, which yielded the lowest test errors among the individual parcels, is the most temporally complete and contains the most balanced ratio of True to False sensing dates, providing the ANN with representative examples of both classes during training. By contrast, datasets 343 and 314, which lack October records, present a truncated seasonal coverage that reduces the diversity of distributional states available for learning, particularly for post-harvest conditions characterized by high variance and low kurtosis.
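The meaning of a negative R² can be made concrete with a short worked example. R² is defined as 1 − SS_res/SS_tot, so it equals zero for a predictor that always returns the mean of the targets and becomes negative as soon as the residual sum of squares exceeds the total sum of squares. The labels and predictions below are hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1, 0, 1, 1, 0]          # hypothetical test labels
y_mean = [0.6] * 5                # naive predictor: mean of y_true
y_overfit = [0, 1, 0, 1, 0]       # hypothetical predictions, 3 of 5 wrong

r2_mean = r_squared(y_true, y_mean)        # 0.0 for the mean predictor
r2_overfit = r_squared(y_true, y_overfit)  # about -1.5, worse than the mean

print(r2_mean, r2_overfit)
```

Here SS_tot is 1.2 and the three wrong predictions give SS_res = 3, so R² = 1 − 3/1.2 = −1.5.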
The principal sources of classification error in the present study are:
  • class overlap in the descriptor space, where interpolated and actual observations can produce statistically similar variance, kurtosis, and skewness values, particularly during stable phenological periods in June and July;
  • the deterministic nature of linear temporal interpolation, which produces gap-filled values with artificially smooth descriptor trajectories that may coincidentally match the distributional characteristics of real acquisitions; and
  • the absence of spatial contextual features (e.g., neighboring parcel statistics, soil type covariates) that might help to disambiguate cases where the instantaneous statistical descriptors alone are insufficient.
From a decision-support perspective, the methodology addresses a concrete operational need in precision agriculture: the automated identification of interpolated versus actual satellite observations within NDVI time series used for agronomic monitoring. Crop managers and agri-advisory systems that rely on NDVI time series for growth stage detection, irrigation scheduling, or yield forecasting risk systematic errors if gap-filled observations are treated as equivalent to real acquisitions. By providing a lightweight, parcel-level classifier operating exclusively on pre-computed statistical descriptors—requiring no additional satellite data access or image processing—the approach is deployable within existing farm management information systems with minimal computational overhead. The feature importance results further offer actionable insight: since variance is the dominant discriminator (~35.8%), operational systems may prioritize variance monitoring as a first-pass quality screening step. These findings are directly relevant to digital transformation strategies in the agri-economy sector, particularly in the context of EU Common Agricultural Policy satellite-based parcel verification requirements.
From a broader agronomic perspective, this methodology contributes to the growing literature on automated quality assessment of satellite time series for agricultural applications. Reliable discrimination between actual and synthetic observations is a prerequisite for accurate phenology extraction, crop mapping, and yield forecasting systems increasingly demanded by national and European agricultural monitoring programs [26]. Future work should explore alternative distributional descriptors (e.g., entropy, coefficient of variation), recurrent neural network architectures (RNN, LSTM) for temporal modeling, and the incorporation of ancillary variables such as soil type, irrigation status, and meteorological data.

5.1. Positioning in the Context of Related Literature

The results of the present study can be contextualized within the broader literature on NDVI statistical analysis, ANN-based classification, and remote sensing data quality. The observed inverse proportionality between variance and kurtosis across all six datasets replicates the theoretical predictions of DeCarlo [8] and is consistent with the empirical findings of Delegido et al. [14], who reported the same relationship for multi-temporal Sentinel-2 NDVI stacks over Spanish vineyards. However, the present study extends these findings by demonstrating that the relationship holds across a diverse set of Romanian agricultural parcels with different crop types, parcel sizes, and temporal data completeness, thereby broadening the ecological and geographic scope of previously documented descriptor relationships.
The feature importance hierarchy—variance (35.8%) > kurtosis (31.5%) > skewness (26.6%) > month (6.1%)—is broadly consistent with the findings of Zhong et al. [19], who reported variance-derived features as the top contributors to crop classification accuracy in their permutation importance analysis of Sentinel-2 time series. The particularly low importance of the month variable in the present study contrasts with studies that use temporal position as a primary feature for phenology-based crop mapping (e.g., Inglada et al. [22], who assigned ~30% importance to seasonal timing features). This contrast reflects the fundamental difference in classification tasks: phenology-based crop type mapping relies heavily on the timing of vegetation development stages, whereas sensing date authentication should ideally be independent of the time of year.
The transition from feedforward MLP to sequential architectures (LSTM, temporal CNN) is identified as the most impactful single improvement, building on demonstrated advantages of temporal ordering exploitation in NDVI time series classification [18].

5.2. Limitations and Future Challenges

The present study is subject to several limitations. First, the training datasets represent less than 5% of the available NDVI observations, severely constraining model generalization; expansion to the full dataset is the most critical improvement pathway.
Second, computational resources (Intel Core i5-9400, 8 GB RAM) restricted the number of parcels analyzed and the range of hyperparameter configurations explored. It should be noted that the performance metrics are based on a single stratified 85/15 train-test split (random_state = 42), rather than a full k-fold cross-validation procedure. This choice was imposed by the computational constraints of the available hardware: a single MLPClassifier training run required approximately 10–45 min depending on the dataset and configuration, making 5-fold cross-validation across 16 configurations computationally prohibitive within the project timeline. Full k-fold cross-validation is identified as a priority for future work, contingent on access to high-performance computing resources. Cloud-based platforms that would make this scaling straightforward include Google Earth Engine (GEE), which provides direct API access to Sentinel-2 time series alongside scalable Python-based machine learning workflows; Google Colab Pro, which offers GPU/TPU acceleration and up to 80 GB RAM at minimal cost; and the Copernicus DIAS ecosystem (CREODIAS, MUNDI), which co-locates Sentinel-2 Level-2A archives with configurable compute instances, eliminating data transfer bottlenecks. Transitioning the present pipeline to any of these platforms is identified as a high-priority next step that would resolve both the data scarcity and computational constraints simultaneously.
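The evaluation protocol described above, and the cross-validation alternative deferred to future work, can be sketched as follows. The 85/15 proportion, stratification, and random_state = 42 come from the text; the synthetic data, the 30% positive rate, and the use of StratifiedKFold with shuffling are assumptions.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split

rng = np.random.default_rng(0)
X = rng.random((200, 4))              # synthetic descriptors
y = np.array([1] * 60 + [0] * 140)    # synthetic flag, 30% positive

# Single stratified 85/15 split, as used for the reported metrics
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42
)
print(len(y_train), len(y_test), y_test.mean())

# Assumed 5-fold alternative: every observation is tested exactly once
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X, y)]
print(fold_sizes)
```

With five folds each model is trained five times, which is the fivefold cost increase that the reported per-run training times made impractical on the available hardware.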
Third, the feedforward MLP architecture does not exploit the temporal ordering of observations, treating each daily band as an independent sample—a structural limitation relative to recurrent architectures such as LSTM. Fourth, the study is geographically confined to six parcels in a single growing season (2017), limiting generalizability. Fifth, the four input features represent only a subset of potentially informative descriptors; additional metrics such as entropy or coefficient of variation may improve classification accuracy.
Future work should prioritize:
  • scaling the training database to include the full dataset and multiple growing seasons;
  • transitioning to sequential architectures (LSTM, temporal CNN);
  • incorporating auxiliary variables such as soil type, irrigation status, and meteorological data;
  • extending geographic scope to diverse agro-climatic regions; and
  • evaluating swarm-based optimization algorithms as alternative training strategies.
Strategies to mitigate the overfitting observed in the present study could include:
  • L2 weight regularization (alpha in [0.001, 0.1]) to constrain model complexity;
  • early stopping based on a held-out validation set rather than a fixed iteration limit;
  • data augmentation through bootstrap resampling of the training partition; and
  • dimensionality reduction (e.g., Principal Component Analysis on the four input features) to reduce the effective parameter-to-sample ratio.
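The first two strategies in the list above map onto existing MLPClassifier parameters, as the following sketch suggests. It is an illustration under assumptions: the data are synthetic (make_classification), and the values of validation_fraction, n_iter_no_change, and max_iter are placeholders chosen for speed, not settings from the study. Only the alpha value is taken from the suggested range [0.001, 0.1].

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in data with four input features (an assumption).
X, y = make_classification(n_samples=300, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=42)

clf = MLPClassifier(
    hidden_layer_sizes=(15, 9, 3),
    activation="tanh",
    solver="adam",
    alpha=0.01,                # L2 penalty, inside the suggested [0.001, 0.1]
    early_stopping=True,       # hold out part of the training data
    validation_fraction=0.15,  # assumed size of the validation set
    n_iter_no_change=20,       # assumed patience, in epochs
    max_iter=500,              # placeholder, not the study's 5000
    random_state=42,
)

with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    clf.fit(X_train, y_train)

print("epochs run:", clf.n_iter_)
print("best validation accuracy:", round(clf.best_validation_score_, 3))
print("test accuracy:", round(clf.score(X_test, y_test), 3))
```

With early_stopping enabled, training halts once the validation score stops improving for n_iter_no_change consecutive epochs, instead of running to a fixed iteration limit.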

6. Conclusions

This study demonstrated the feasibility of combining statistical descriptor extraction, visual analytics, and ANN-based classification for the automated analysis of agricultural NDVI time series. The principal conclusions are as follows:
  • Variance and kurtosis exhibit a consistent inverse relationship across all analyzed datasets, while skewness and kurtosis co-vary directly. These structural relationships are rooted in the mathematical definitions of the descriptors.
  • May shows the highest proportion of interpolated sensing dates relative to real observations in 5 of 6 datasets, followed by March and October. April, June, and July contain the highest absolute number of observations, reflecting their agronomic significance.
  • ANN feature importance analysis confirmed that variance is the most informative input feature (~35.8%), followed by kurtosis (~31.5%) and skewness (~26.6%), with temporal month contributing least (~6.1%). This hierarchy is theoretically consistent and agronomically meaningful.
  • The tanh–SGD configuration produced the best results for most individual datasets; tanh–Adam was optimal for the combined training set. A minimum of two hidden layers was required to capture the nonlinear decision boundary.
  • Data scarcity (less than 5% of the available dataset was used) and limited computational resources were the primary constraints on model performance. Expanding the training database and using high-performance computing environments are the most critical improvement pathways.
Future research should investigate recurrent neural architectures (RNN, LSTM), convolutional neural networks, swarm-based optimization, and data preprocessing pipelines for outlier detection, alongside substantially larger and geographically diversified training datasets.

Author Contributions

Conceptualization, C.I., M.I., K.A.A., C.D. and S.G.-M.; Methodology, C.I. and K.A.A.; Data curation, M.I. and S.G.-M.; Formal analysis, C.I. and K.A.A.; Investigation, M.I., C.D. and S.G.-M.; Writing—original draft preparation, C.I., M.I. and S.G.-M.; Writing—review and editing, C.I., K.A.A. and C.D.; Visualization, C.I., M.I. and K.A.A.; Supervision, C.I., K.A.A. and C.D.; Project administration, C.I.; Funding acquisition, C.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Interdisciplinary research of agricultural data using visual data analysis and artificial neural networks” project, Contract no. 7896/2022.

Data Availability Statement

The original datasets are the property of Contract no. 7896/2022. Anonymized derived statistical data may be made available upon reasonable request to the corresponding author, subject to the data owner’s approval. The underlying Sentinel-2 Level-2A satellite imagery from which the NDVI values were derived is freely and publicly accessible through the Copernicus Open Access Hub (https://www.copernicus.eu/en/access-data/conventional-data-access-hubs, accessed: 1 March 2017) or the Copernicus Data Space Ecosystem (https://dataspace.copernicus.eu, accessed: 1 March 2017). The parcel boundaries are registered in Romania’s Land Parcel Identification System (LPIS), administered by the Agency for Payments and Intervention in Agriculture (APIA, https://www.apia.org.ro, accessed: 1 March 2017).

Acknowledgments

During the preparation of this manuscript/study, the authors used Claude AI for the purposes of translation and text correction. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Weiss, M.; Jacob, F.; Duveiller, G. Remote sensing for agricultural applications: A meta-review. Remote Sens. Environ. 2020, 236, 111402. [Google Scholar] [CrossRef]
  2. Liakos, K.G.; Busato, P.; Moshou, D.; Pearson, S.; Bochtis, D. Machine learning in agriculture: A review. Sensors 2018, 18, 2674. [Google Scholar] [CrossRef]
  3. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec. Publ. 1974, 351, 309–317. Available online: https://repository.exst.jaxa.jp/dspace/handle/a-is/570457 (accessed on 1 September 2022).
  4. Pettorelli, N.; Vik, J.O.; Mysterud, A.; Gaillard, J.M.; Tucker, C.J.; Stenseth, N.C. Using the satellite-derived NDVI to assess ecological responses to environmental change. Trends Ecol. Evol. 2005, 20, 503–510. [Google Scholar] [CrossRef] [PubMed]
  5. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sens. Environ. 2012, 120, 25–36. [Google Scholar] [CrossRef]
  6. Kandasamy, S.; Baret, F.; Verger, A.; Neveux, P.; Weiss, M. A comparison of methods for smoothing and gap filling time series of remote sensing observations: Application to MODIS LAI products. Biogeosciences 2013, 10, 4055–4071. [Google Scholar] [CrossRef]
  7. Jonsson, P.; Eklundh, L. Seasonality extraction by function fitting to time-series of satellite sensor data. IEEE Trans. Geosci. Remote Sens. 2002, 40, 1824–1832. [Google Scholar] [CrossRef]
  8. DeCarlo, L.T. On the meaning and use of kurtosis. Psychol. Methods 1997, 2, 292–307. [Google Scholar] [CrossRef]
  9. Kamilaris, A.; Prenafeta-Boldú, F.X. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90. [Google Scholar] [CrossRef]
  10. Pantazi, X.E.; Moshou, D.; Alexandridis, T.; Whetton, R.L.; Mouazen, A.M. Wheat yield prediction using machine learning and advanced sensing techniques. Comput. Electron. Agric. 2016, 121, 57–65. [Google Scholar] [CrossRef]
  11. Shen, H.; Li, X.; Cheng, Q.; Zeng, C.; Yang, G.; Li, H.; Zhang, L. Missing information reconstruction of remote sensing data: A technical review. IEEE Geosci. Remote Sens. Mag. 2015, 3, 61–85. [Google Scholar] [CrossRef]
  12. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  13. Moors, J.J.A. A quantile alternative for kurtosis. J. R. Stat. Soc. Ser. D Stat. 1988, 37, 25–32. [Google Scholar] [CrossRef]
  14. Delegido, J.; Verrelst, J.; Alonso, L.; Moreno, J. Evaluation of Sentinel-2 red-edge bands for empirical estimation of green LAI and chlorophyll content. Sensors 2011, 11, 7063–7081. [Google Scholar] [CrossRef]
  15. Vibhute, A.; Gawali, B.W. Analysis and modeling of agricultural land use using remote sensing and geographic information system: A review. Int. J. Eng. Res. Appl. 2013, 3, 81–91. Available online: https://www.researchgate.net/publication/312704684 (accessed on 1 September 2022).
  16. Verger, A.; Baret, F.; Weiss, M. Near real-time vegetation monitoring at global scale. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 3473–3481. [Google Scholar] [CrossRef]
  17. Mateo-García, G.; Laparra, V.; López-Puigdollers, D.; Gómez-Chova, L. Transferring deep learning models for cloud detection between Landsat-8 and Proba-V. ISPRS J. Photogramm. Remote Sens. 2020, 160, 1–17. [Google Scholar] [CrossRef]
  18. Russwurm, M.; Körner, M. Multi-temporal land cover classification with sequential recurrent encoders. ISPRS Int. J. Geo-Inf. 2018, 7, 129. [Google Scholar] [CrossRef]
  19. Zhong, L.; Hu, L.; Zhou, H. Deep learning based multi-temporal crop classification. Remote Sens. Environ. 2019, 221, 430–443. [Google Scholar] [CrossRef]
  20. Benos, L.; Tagarakis, A.C.; Dolias, G.; Berruto, R.; Kateris, D.; Bochtis, D. Machine learning in agriculture: A comprehensive updated review. Sensors 2021, 21, 3758. [Google Scholar] [CrossRef] [PubMed]
  21. Kluyver, T.; Ragan-Kelley, B.; Pérez, F.; Granger, B.; Bussonnier, M.; Frederic, J.; Kelley, K.; Hamrick, J.; Grout, J.; Corlay, S.; et al. Jupyter Notebooks: A publishing format for reproducible computational workflows. In Positioning and Power in Academic Publishing: Players, Agents and Agendas; IOS Press: Amsterdam, The Netherlands, 2016; pp. 87–90. [Google Scholar] [CrossRef]
  22. Inglada, J.; Vincent, A.; Arias, M.; Marais-Sicre, C. Improved early crop type identification by joint use of high temporal resolution SAR and optical image time series. Remote Sens. 2016, 8, 362. [Google Scholar] [CrossRef]
  23. Sellers, P.J. Canopy reflectance, photosynthesis and transpiration. Int. J. Remote Sens. 1985, 6, 1335–1372. [Google Scholar] [CrossRef]
  24. Lazăr, C.; Lazăr, I. Simulation of temperature increase influence on winter wheat yields and development in South-Eastern Romania. Ann. Univ. Craiova—Agric. Mont. Cadastre Ser. 2010, 40, 149–155. Available online: https://www.incda-fundulea.ro/rar/nr27/rar27.2.pdf (accessed on 1 September 2022).
  25. Heaton, J. Ian Goodfellow, Yoshua Bengio, and Aaron Courville: Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar] [CrossRef]
  26. Fritz, S.; See, L.; McCallum, I.; You, L.; Bun, A.; Moltchanova, E.; Duerauer, M.; Albrecht, F.; Schill, C.; Perger, C.; et al. Mapping global cropland and field size. Glob. Change Biol. 2015, 21, 1980–1992. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Representative XML raster data structure (PAMDataset format) showing per-band statistical metadata for a single agricultural parcel. Dataset 343, excerpt from the 2017 growing season.
Figure 2. Geographic location of the Dobrogea region in southeastern Romania, within the broader context of South-East Europe. Source: https://gisgeography.com/high-resolution-europe-map/ (accessed on 1 March 2017) and https://earth.google.com/web/ (accessed on 1 March 2017).
Figure 3. Excel (.xlsx) representation of the NDVI statistical data after normalization. Dataset 343. The first 9 data sets and the simple representation of all 343 data points.
Figure 4. Overall methodological workflow.
Figure 5. Initial temporal representation of all statistical descriptors (variance, kurtosis, skewness, mean) for Dataset 343. Each band corresponds to one daily observation over the March–October 2017 growing season.
Figure 6. Temporal representation after normalization of statistical descriptor values by division by 10^n. Dataset 343.
Figure 7. Violin plots of kurtosis stratified by is_sensing_date and calendar month for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Panels generated using sns.catplot (kind = ‘violin’).
Figure 8. Violin plots of skewness stratified by is_sensing_date and calendar month for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Panels generated using sns.catplot (kind = ‘violin’).
Figure 9. Bivariate KDE joint plots of kurtosis versus variance for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Generated using sns.jointplot (kind = ‘kde’, cmap = ‘brg_r’). The “warmer” or more intense the color, the more densely the data cluster at that point.
Figure 10. Bivariate KDE joint plots of skewness versus variance for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Generated using sns.jointplot (kind = ‘kde’, cmap = ‘brg_r’). The “warmer” or more intense the color, the more densely the data cluster at that point.
Figure 11. Overlaid three-dimensional surface plots of kurtosis (α = 0.25—transparency) and skewness as functions of variance and mean for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Generated using ax.plot_trisurf (cmap = ‘inferno’).
Figure 12. Three-dimensional triangulated surface plots of kurtosis as a function of mean and variance for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Generated using ax.plot_trisurf.
Figure 13. Three-dimensional scatter plots of kurtosis colored by calendar month for all six parcels: (a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306. Generated using plotly.scatter_3d.
Figure 14. Three-dimensional triangulated surface plots of skewness as a function of mean and variance for all six parcels ((a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306). Generated using ax.plot_trisurf.
Figure 15. Three-dimensional scatter plots of skewness colored by calendar month for all six parcels: (a): 343, (b): 314, (c): 977, (d): 978, (e): 686, (f): 306. Generated using plotly.scatter_3d.
Figure 16. ANN training loss curves for Dataset 343: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 17. ANN training loss curves for Dataset 977: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 18. ANN training loss curves for Dataset 978: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 19. ANN training loss curves for the combined dataset ‘all’: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 20. Permutation feature importance bar plots for Dataset 343: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 21. Permutation feature importance bar plots for Dataset 977: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 22. Permutation feature importance bar plots for Dataset 978: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Figure 23. Permutation feature importance bar plots for the combined dataset ‘all’: (a) tanh–Adam, (b) tanh–SGD, (c) ReLU–Adam, (d) ReLU–SGD.
Table 1. PAMDataset XML metadata fields and their agronomic interpretations.
Field | Type | Description
area_decla | Numeric | Declared parcel area (hectares) as registered in LPIS
bloc_nr | String | LPIS block identifier grouping adjacent parcels
cat_use | String | Land use category (e.g., arable, permanent grassland)
crop_code | String | Declared crop type code for the growing year
gid | Integer | Unique geographic identifier for the parcel
count | Integer | Number of valid NDVI pixels within the parcel boundary
mean | Float | Mean NDVI pixel value for the observation band
variance | Float | Population variance of NDVI pixel values
kurtosis | Float | Standardized fourth central moment of pixel distribution
skewness | Float | Standardized third central moment of pixel distribution
minimum | Float | Minimum NDVI pixel value within the parcel
maximum | Float | Maximum NDVI pixel value within the parcel
is_sensing_date | Boolean | True = actual satellite acquisition; False = interpolated gap-fill
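To make the statistical fields of Table 1 concrete, the following sketch computes them from a list of pixel values using only the standard library. The function name ndvi_descriptors and the sample pixel values are hypothetical. The kurtosis is returned as the plain standardized fourth moment, as described in the table; whether the source files store that form or the excess form (minus 3) is not stated here, so that choice is an assumption.

```python
def ndvi_descriptors(pixels):
    """Return the per-band statistics of Table 1 for a list of pixel values.

    Population formulas (divisor n) are used throughout. The input must
    contain at least two distinct values, otherwise the variance is zero
    and the standardized moments are undefined.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((p - mean) ** 2 for p in pixels) / n
    m3 = sum((p - mean) ** 3 for p in pixels) / n
    m4 = sum((p - mean) ** 4 for p in pixels) / n
    return {
        "count": n,
        "mean": mean,
        "variance": m2,               # population variance
        "skewness": m3 / m2 ** 1.5,   # standardized third central moment
        "kurtosis": m4 / m2 ** 2,     # standardized fourth central moment
        "minimum": min(pixels),
        "maximum": max(pixels),
    }


# Hypothetical pixel values, for illustration only.
stats = ndvi_descriptors([1.0, 2.0, 3.0, 4.0, 5.0])
for name, value in stats.items():
    print(f"{name}: {value}")
```

Because the variance appears in the denominator of both standardized moments, small variances tend to inflate kurtosis and skewness for a given spread of extreme pixels, which is one way to read the inverse variance–kurtosis pattern reported in the study.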
Table 2. Normalization factors applied to each statistical descriptor for graphical display.
Descriptor | Typical raw value range (approx.) | Normalization factor (10^n) | Normalized range (approx.)
Variance (σ²) | 0 to 500 | 10^3 | 0 to 1
Kurtosis (K) | −1 to 100 | 10^2 | −1 to 1
Skewness (SK) | −10 to 10 | 10^1 | −1 to 1
Mean | 0 to 300 | 10^3 | 0 to 1
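The scaling in Table 2 amounts to dividing each descriptor by a fixed power of ten. A minimal sketch is given below; the exponents are taken from the table, while the function name normalize_for_display and the raw record are hypothetical.

```python
# Exponents n from Table 2: each descriptor is divided by 10**n for display.
EXPONENTS = {"variance": 3, "kurtosis": 2, "skewness": 1, "mean": 3}


def normalize_for_display(record):
    """Divide each descriptor by its 10**n factor; other keys pass through."""
    return {
        key: value / 10 ** EXPONENTS[key] if key in EXPONENTS else value
        for key, value in record.items()
    }


# Hypothetical raw values, chosen inside the typical ranges of Table 2.
raw = {"variance": 250.0, "kurtosis": 50.0, "skewness": -5.0,
       "mean": 150.0, "month": 5}
scaled = normalize_for_display(raw)
print(scaled)
```

A fixed power-of-ten divisor keeps the shape of each series unchanged and only brings the descriptors onto a comparable axis for plotting.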
Table 3. Hyperparameter specification for the four MLPClassifier configurations.
Hyperparameter | tanh and Adam | tanh and SGD | ReLU and Adam | ReLU and SGD
Architecture (layers) | 4-15-9-3-1 | 4-15-9-3-1 | 4-15-9-3-1 | 4-15-9-3-1
Activation function | tanh | tanh | ReLU | ReLU
Solver | Adam | SGD | Adam | SGD
Initial learning rate | 0.001 | 0.001 | 0.001 | 0.001
Learning rate schedule | constant | constant | constant | constant
Beta_1 (Adam momentum) | 0.9 (default) | N/A | 0.9 (default) | N/A
Beta_2 (Adam RMS term) | 0.999 (default) | N/A | 0.999 (default) | N/A
SGD momentum | N/A | 0.9 (default) | N/A | 0.9 (default)
Weight initialization | Glorot uniform | Glorot uniform | Glorot uniform | Glorot uniform
Max iterations | 5000 | 5000 | 5000 | 5000
Batch size | 2 | 2 | 2 | 2
Tolerance (tol) | 10^−6 | 10^−6 | 10^−6 | 10^−6
Train/test split | 85%/15% | 85%/15% | 85%/15% | 85%/15%
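The four configurations in Table 3 could be instantiated along the following lines. This is a sketch of how the listed hyperparameters map onto scikit-learn's MLPClassifier arguments, not the authors' script; the random_state value on the estimator is an assumption, and the models are only constructed here, not trained.

```python
from itertools import product

from sklearn.neural_network import MLPClassifier

# Settings shared by all four configurations (values from Table 3).
SHARED = dict(
    hidden_layer_sizes=(15, 9, 3),  # hidden part of the 4-15-9-3-1 layout
    learning_rate_init=0.001,
    learning_rate="constant",
    max_iter=5000,
    batch_size=2,
    tol=1e-6,
    random_state=42,                # assumption: estimator seed is not listed
)

configs = {}
for activation, solver in product(("tanh", "relu"), ("adam", "sgd")):
    # Solver-specific terms: Adam decay rates or SGD momentum.
    if solver == "adam":
        extra = {"beta_1": 0.9, "beta_2": 0.999}
    else:
        extra = {"momentum": 0.9}
    configs[f"{activation}&{solver}"] = MLPClassifier(
        activation=activation, solver=solver, **SHARED, **extra)

for name, model in configs.items():
    print(name, model.hidden_layer_sizes, model.solver, model.activation)
```

The input width (4) and the single output unit are not passed explicitly: MLPClassifier infers them from the training data when fit is called, and it applies Glorot-style weight initialization internally.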
Table 4. Training and testing error metrics for all ANN configurations across datasets.
Structure | MAE Test | MSE Test | MAE Train | MSE Train | Score Test | Score Train | Dataset
tanh&adam | 0.4347 | 0.3288 | 0.1593 | 0.0627 | −0.1908 | 0.7887 | 343
tanh&sgd | 0.3010 | 0.1314 | 0.3399 | 0.1697 | −0.2966 | 0.8020 | 343
relu&adam | 0.3007 | 0.1871 | 0.2164 | 0.1078 | −0.6513 | 0.8156 | 343
relu&sgd | 0.3549 | 0.2249 | 0.2428 | 0.1165 | −0.3705 | 0.8057 | 343
tanh&adam | 0.2255 | 0.1386 | 0.1503 | 0.0609 | −0.3826 | 0.6533 | 977
tanh&sgd | 0.1713 | 0.0752 | 0.1707 | 0.0875 | −0.6856 | 0.7420 | 977
relu&adam | 0.1321 | 0.0334 | 0.1971 | 0.0983 | −1.5397 | 0.6793 | 977
relu&sgd | 0.1543 | 0.0715 | 0.1718 | 0.0851 | −0.4837 | 0.6435 | 977
tanh&adam | 0.1947 | 0.1356 | 0.1030 | 0.0446 | −0.0740 | 0.7368 | 978
tanh&sgd | 0.1649 | 0.1088 | 0.1175 | 0.0546 | −0.3457 | 0.7556 | 978
relu&adam | 0.1312 | 0.0522 | 0.1581 | 0.0791 | −0.3984 | 0.7938 | 978
relu&sgd | 0.1393 | 0.0589 | 0.1461 | 0.0674 | −0.1792 | 0.7897 | 978
tanh&adam | 0.1923 | 0.1089 | 0.1875 | 0.0981 | 0.1995 | 0.8303 | all
tanh&sgd | 0.2310 | 0.1508 | 0.1737 | 0.0967 | 0.8447 | 0.2594 | all
relu&adam | 0.2164 | 0.1199 | 0.1853 | 0.0885 | 0.2164 | 0.8517 | all
relu&sgd | 0.2043 | 0.0849 | 0.2165 | 0.0997 | 0.1221 | 0.8370 | all
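For readers who want to reproduce metrics of the kind shown in Table 4, the sketch below gives plain-Python definitions of MAE and MSE. The "Score" column is not defined in this part of the text; since several values are negative, it cannot be plain accuracy, so the sketch assumes it is a coefficient of determination (R²). That reading, the function names, and the toy labels are all assumptions.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination; negative when the predictions are
    worse than always predicting the mean of y_true."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy binary labels (1 = real acquisition, 0 = interpolated), illustration only.
y_true = [1, 0, 1, 0]
one_error = [1, 1, 1, 0]   # one of four labels wrong
all_wrong = [0, 1, 0, 1]   # every label wrong

print(mae(y_true, one_error), mse(y_true, one_error), r2(y_true, one_error))
print(mae(y_true, all_wrong), mse(y_true, all_wrong), r2(y_true, all_wrong))
```

Under this reading, a negative test score alongside a high training score indicates that the model generalizes worse than a constant predictor on the held-out 15%.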
Table 5. Feature importance values (%) for ANN input variables across datasets and configurations.
Structure | Variance (%) | Kurtosis (%) | Skewness (%) | Month (%) | Dataset
tanh&adam | 35.09 | 34.54 | 25.55 | 4.82 | 343
tanh&sgd | 37.59 | 28.53 | 26.86 | 7.02 | 343
relu&adam | 38.14 | 29.97 | 22.75 | 9.14 | 343
relu&sgd | 34.83 | 29.45 | 27.82 | 7.90 | 343
tanh&adam | 36.88 | 31.48 | 23.15 | 8.50 | 977
tanh&sgd | 35.85 | 29.77 | 28.56 | 5.82 | 977
relu&adam | 33.83 | 33.00 | 27.27 | 5.89 | 977
relu&sgd | 36.06 | 30.43 | 26.59 | 6.92 | 977
tanh&adam | 35.09 | 34.54 | 25.55 | 4.82 | 978
tanh&sgd | 37.14 | 33.60 | 26.04 | 3.22 | 978
relu&adam | 33.79 | 31.40 | 29.85 | 4.96 | 978
relu&sgd | 34.96 | 31.76 | 28.95 | 4.32 | 978
Mean (ind.) | 35.77 | 31.54 | 26.58 | 6.11 | 343/977/978
tanh&adam | 37.39 | 29.70 | 26.16 | 6.75 | all
tanh&sgd | 37.20 | 28.28 | 27.84 | 6.69 | all
relu&adam | 38.34 | 29.24 | 26.42 | 6.01 | all
relu&sgd | 35.76 | 30.62 | 26.80 | 6.82 | all
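Percentages of the kind reported in Table 5 can be obtained from permutation importance roughly as sketched below. The example assumes scikit-learn's permutation_importance on synthetic stand-in data; clipping negative importances to zero and rescaling so that the four values sum to 100 is one plausible convention, not necessarily the one used for the table.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.exceptions import ConvergenceWarning
from sklearn.inspection import permutation_importance
from sklearn.neural_network import MLPClassifier

FEATURES = ["variance", "kurtosis", "skewness", "month"]

# Synthetic stand-in data; the column names are attached only for display.
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=1, random_state=42)

clf = MLPClassifier(hidden_layer_sizes=(15, 9, 3), activation="tanh",
                    solver="adam", max_iter=300, random_state=42)
with warnings.catch_warnings():
    warnings.simplefilter("ignore", ConvergenceWarning)
    clf.fit(X, y)

# Shuffle one column at a time and record the mean drop in accuracy.
result = permutation_importance(clf, X, y, n_repeats=10, random_state=42)

# Assumed convention: clip negative drops to zero, then rescale to percent.
raw = [max(0.0, float(m)) for m in result.importances_mean]
total = sum(raw)
percent = [100.0 * r / total if total > 0 else 0.0 for r in raw]

for name, share in zip(FEATURES, percent):
    print(f"{name}: {share:.2f}%")
```

Because each feature is shuffled independently, correlated inputs can share or mask importance, which is worth keeping in mind when comparing variance, kurtosis, and skewness.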
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Ilie, C.; Ilie, M.; Aivaz, K.A.; Duhnea, C.; Ghiță-Mitrescu, S. Quantitative Analysis of NDVI Temporal Data Using Artificial Neural Networks: A Decision-Making Approach for Precision Agriculture. Mathematics 2026, 14, 1741. https://doi.org/10.3390/math14101741
