Deep Learning Improves Planting Year Estimation of Macadamia Orchards in Australia

Clark, Andrew; Brinkhoff, James; Robson, Andrew; Shephard, Craig

doi:10.3390/agriculture15222346

Applied Agricultural Remote Sensing Centre, University of New England, Armidale, NSW 2350, Australia

Agriculture 2025, 15(22), 2346; https://doi.org/10.3390/agriculture15222346

Submission received: 7 October 2025 / Revised: 7 November 2025 / Accepted: 9 November 2025 / Published: 11 November 2025

(This article belongs to the Special Issue Remote Sensing in Crop Protection)

Abstract

Deep learning reduced macadamia planting year error at a national scale, achieving a pixel-level Mean Absolute Error (MAE) of 1.2 years and outperforming a vegetation index threshold baseline (MAE 1.6 years) and tree-based models—Random Forest (RF; MAE 3.02 years) and Gradient Boosted Trees (GBT; MAE 2.9 years). Using Digital Earth Australia Landsat annual geomedians (1988–2023) and block-level, industry-supplied planting year data, models were trained and evaluated at the pixel level under a strict Leave-One-Region-Out cross-validation (LOROCV) protocol; a secondary block-level random split (80/10/10) is reported only to illustrate the more optimistic setting, where shared regional conditions yield lower errors (0.6–0.7 years). Predictions reconstruct planting year retrospectively from the full historical record rather than providing real-time forecasts. The final model was then applied to all Australian Tree Crop Map (ATCM) macadamia orchard polygons to produce wall-to-wall planting year estimates. The approach enables fine-grained mapping of planting patterns to support yield forecasting, resource allocation, and industry planning. Results indicate that sequence-based deep models capture informative temporal dynamics beyond thresholding and conventional machine learning baselines, while remaining constrained by regional and temporal data sparsity. The framework is scalable and transferable, offering a pathway to planting year mapping for other perennial crops and to more resilient, data-driven agricultural decision-making.

Keywords:

macadamia; planting year; deep learning; machine learning

1. Introduction

Accurate estimation of orchard block planting year, and consequently age, is crucial for various aspects of agricultural management, environmental assessment, and economic planning. The age of an orchard block significantly influences its productivity, carbon sequestration potential, water requirements, and susceptibility to pests and diseases [1,2,3]. In the Australian macadamia industry specifically, reliable planting year data underpins accurate yield forecasting, resource allocation, and market planning [4]. By integrating tree census records (e.g., planting dates) with models that account for climatic variability, stakeholders can better predict annual yield fluctuations, stabilize prices, and strengthen industry resilience [5]. As global demand for macadamia nuts continues to rise, there is an increasing need for reliable, large-scale methods to map and monitor orchards across diverse landscapes.

Traditional methods for estimating orchard age, such as field surveys and manual interpretation of aerial imagery, are time-consuming, costly, and often impractical for large areas. Earth observation data provides extensive temporal and spatial coverage, making it a powerful tool for broad-scale orchard mapping. Satellite systems such as MODIS, Landsat, and Sentinel-2 offer varying spatial resolutions and historical records. MODIS provides daily global coverage at 250–1000 m resolution since 2000, which is valuable for regional-scale vegetation monitoring. However, its coarse spatial resolution limits its use for individual orchard mapping [6]. Landsat has provided a continuous global record since 1972 (60–80 m MSS initially), with 30 m multispectral observations from Landsat-4/5 TM, Landsat-7 ETM+, and Landsat-8/9 OLI; this archive is extensively used for land-cover mapping, including orchard delineation [7]. For instance, Brinkhoff & Robson [8] utilised over 30 years of Landsat imagery to estimate macadamia orchard planting years in Australia, demonstrating the potential of long-term satellite time series for orchard age estimation at a national scale. Sentinel-2, launched in 2015, provides 10–20 m resolution imagery with 5-day revisit, making it suitable for detailed orchard mapping, although its shorter historical record limits long-term studies [9].

Integrating multiple data sources has emerged as a strategy to overcome limitations of individual sensors and improve mapping accuracy. Claverie et al. [10] demonstrated the potential of combining high-resolution Sentinel-2 data with moderate-resolution MODIS data for improved crop-type mapping. For predicting tree planting dates from earth observation data, Chen et al. [11] used a time series of Landsat NDVI over almond orchards in California and achieved high predictive accuracy, with a mean absolute error of less than half a year and an $R^{2}$ of 0.96 at the orchard block level. This approach was facilitated by the availability of detailed block-level planting data, which supported precise model training and validation within a geographically limited area. A further study by Zhou et al. [12] identified that young plantations often lack sufficient canopy cover to be detected as tree categories in land cover maps, leading to underestimation of newly planted areas.

The spatial resolution of satellite imagery can result in mixed pixels, particularly for smaller blocks or those with wide row spacing, complicating accurate mapping [8]. Distinguishing between different tree crops with similar spectral and phenological characteristics also presents a challenge when attempting to predict the tree planting of orchards at a regional or national scale [12]. Additionally, seasonal vegetation changes, especially during critical growth periods, may not be adequately captured by current satellite revisit times, affecting the accuracy of phenological assessments and age estimations [13].

Deep learning models have demonstrated their capability to detect subtle temporal trends in multi-temporal datasets by leveraging automated feature extraction to capture complex dynamics without relying on predefined rules or manual engineering. Zhong et al. [14] highlighted the effectiveness of CNNs (Conv1D) in identifying fine-grained temporal patterns in Enhanced Vegetation Index (EVI) time series, where lower layers of the network detect small-scale temporal variations while upper layers summarise broader seasonal trends. Additionally, Zhong et al. [14] demonstrated that deep neural networks applied to multi-temporal datasets can capture nuanced spectral and temporal trends in crop classification tasks, suggesting that faint signals—such as planting rows or emerging saplings—could likewise be identified in newly planted orchards. Similarly, Kussul et al. [15] emphasised the ability of deep learning to discern smaller distinctions in land cover through extensive feature extraction and pattern recognition, illustrating how these methods can adapt to diverse agricultural contexts. Together, these findings indicate that deep learning approaches need not wait for clear canopy signatures or dramatic land-cover changes but can instead harness subtle indicators in remote sensing data to detect the planting and early growth of orchard crops. This adaptability underscores their potential to improve orchard monitoring and management by identifying transitions related to orchard establishment before dense canopies fully develop, offering significant advantages for agricultural applications.

Despite significant progress in broad-scale orchard mapping and planting year estimation, several challenges remain unaddressed. Accurate detection and age estimation of young orchard blocks are difficult due to insufficient canopy cover and subtle spectral signatures, leading to underrepresentation in mapping efforts. Existing models often struggle to generalise across different geographical regions and orchard types because of variability in environmental conditions and management practices [16]. Additionally, combining different data sources introduces complexities related to data alignment, scaling, and fusion methodologies. Whilst in Australia, the extent (location and area) of all commercial macadamias has been mapped in recent years [17], a similar comprehensive dataset of block-level information such as planting date, variety etc. is yet to be developed. Without consistent nation-wide producer-supplied block-level data to apply model predictions to, alternative approaches must be employed, such as pixel-level analysis using available satellite data.

Orchard-level (block-aggregated) targets offer interpretability aligned with management units and can smooth within-block variability, but aggregation can mask sub-block heterogeneity and can inflate apparent accuracy when training and testing share regional conditions [18]. In this context, per-pixel temporal modelling at the observation scale treats each pixel as an independent sequence, preserving fine-scale dynamics relevant to planting year signals while avoiding block averaging and increasing the potential information content available [19]. However, this approach assumes pixel independence, which may not hold due to spatial autocorrelation within blocks. Balancing these trade-offs is crucial for developing robust models that generalise well across diverse orchard conditions.

This study therefore bridges the gap between prior block-level analyses and fine-grained, national-scale mapping by formulating planting year estimation at the pixel level. Constrained by the absence of uniform, producer-supplied block labels nationwide and the generalised nature of ATCM polygons, the approach treats each Landsat pixel as an independent temporal sequence (1988–2023 DEA annual geomedians) within mapped orchard extents. This design preserves sub-orchard heterogeneity while enabling large-sample training across regions. Generalisation is assessed under a strict region-held-out (LOROCV) protocol to reflect out-of-region deployment, and block-level summaries (medians of pixel predictions) are reported as secondary, management-facing aggregates.

In this context, this study compares machine learning and deep learning models for macadamia orchard planting year estimation across Australia. Traditional machine learning approaches, such as Random Forests and Gradient Boosting Machines, are evaluated and compared with advanced deep learning models, including Long Short-Term Memory (LSTM) and Temporal Convolutional Networks (TCN). By utilising multi-temporal satellite imagery from Landsat, the study leverages the long-term historical record provided by the sensors. Hyperparameter tuning techniques, including the use of Keras Tuner [20], are employed to optimise model architectures for each growing region. The study also aims to improve the accuracy of planting year predictions, particularly for recently planted orchards under three years old, by incorporating both Landsat spectral data and temporal pattern analysis. The findings will provide valuable insights for industry stakeholders, enhancing yield forecasting, resource management, and agricultural sustainability. Additionally, the research has the potential to inform decision-making at both farm and industry levels, supporting biosecurity preparedness, natural disaster response and recovery, and long-term agricultural planning.

2. Materials and Methods

In this study, multiple machine learning models were developed and evaluated to predict crop planting years from multi-temporal satellite imagery. The methodology included data preparation, hyperparameter tuning, model training, evaluation, and comparative analysis of model configurations. The top-performing models were subsequently applied to the Australian Tree Crop Map (ATCM) [17] to derive nation-wide statistics on planted area by year. An overview of the end-to-end workflow is provided in Figure 1.

2.1. Study Area

The study area encompassed all macadamia orchards defined by ATCM, from Tropical Queensland to New South Wales Mid North Coast in the east and South West Western Australia (Figure 2). The spatial extent of the industry-defined growing regions was built from Local Government Area (LGA) and postcode boundaries. The regions vary in climate, soil type, and management practices, influencing the growth and development of macadamia orchards. The geographic diversity of these regions provides an ideal setting for evaluating model performance across different environmental conditions and management practices.

2.2. Input Data

Compiling the input data required information about existing macadamia block planting years (labels) and a time series of satellite data (imagery). The input data labels were derived from polygon vectors with a planting year attribute, which were supplied by growers and industry bodies. For each block, the planting year was converted into a block age, with 1 representing the most recent year and increasing sequentially for earlier years. The block age was then used as the label for training the models.
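The label encoding described above can be sketched in a few lines. This is a minimal illustration, assuming the "most recent year" is 2023 (the last year of the imagery record); the constant `REFERENCE_YEAR` and both function names are hypothetical and do not come from the study's code.

```python
REFERENCE_YEAR = 2023  # assumption: last year of the 1988-2023 imagery record


def planting_year_to_age(planting_year: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Encode a planting year as an age label, where 1 is the most recent year."""
    if planting_year > reference_year:
        raise ValueError("planting year is later than the reference year")
    return reference_year - planting_year + 1


def age_to_planting_year(age: int, reference_year: int = REFERENCE_YEAR) -> int:
    """Invert the encoding to recover a planting year from an age label."""
    return reference_year - age + 1
```

Under this assumption a block planted in 2023 receives the label 1, and the inverse function converts a predicted age back into a calendar year for mapping.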

The Landsat yearly geometric medians, spanning 1988 to 2023, were extracted from Digital Earth Australia’s (DEA) Data Cube [21] for all macadamia blocks and formed the basis of the training, validation, and test data. These data, offering temporally consistent median composites, are useful for tracking long-term land cover changes, including orchard establishment. The data consist of six spectral bands (blue, green, red, near-infrared, shortwave infrared 1, and shortwave infrared 2). The Normalised Difference Vegetation Index (NDVI) [22] and Green Normalised Difference Vegetation Index (GNDVI) [23] values were calculated for each pixel. The training data were segmented by growing region, shown in Figure 2. Table 1 shows the number of blocks with planting date information for each region. In total, there were 422 blocks with known planting dates.
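For reference, both indices are normalised differences of the near-infrared band against a visible band. The sketch below uses the standard definitions; the band key names in `BANDS` and the choice to stack the two indices after the six bands are assumptions made for illustration, not details stated in the text.

```python
BANDS = ("blue", "green", "red", "nir", "swir1", "swir2")  # hypothetical key names


def normalised_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or NaN when the denominator is zero."""
    denominator = a + b
    if denominator == 0:
        return float("nan")
    return (a - b) / denominator


def ndvi(nir: float, red: float) -> float:
    """NDVI = (NIR - Red) / (NIR + Red)."""
    return normalised_difference(nir, red)


def gndvi(nir: float, green: float) -> float:
    """GNDVI = (NIR - Green) / (NIR + Green)."""
    return normalised_difference(nir, green)


def pixel_year_features(reflectance: dict) -> list:
    """Return the six band values followed by NDVI and GNDVI for one pixel-year."""
    features = [reflectance[band] for band in BANDS]
    features.append(ndvi(reflectance["nir"], reflectance["red"]))
    features.append(gndvi(reflectance["nir"], reflectance["green"]))
    return features
```

Applying `pixel_year_features` to each of the 36 annual composites would give one fixed-length sequence per pixel under these assumptions.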

In this study, individual pixel time series within the training-block boundaries were used. Pixels were included only when the pixel centre lay within the block polygon (i.e., no edge-touch pixels were taken), which functions as an implicit edge filter to reduce boundary mixing and label noise at block margins. Deep learning models typically require substantial data to learn generalisable patterns; using per-pixel sequences increases the number of training instances from 422 blocks to 60,405 pixels (Table 1). However, due to spatial autocorrelation among pixels within the same block, this increase does not translate linearly into independent information [19]. This granular sampling strategy aligns with recommendations to maximise sample size to improve reliability and generalisability [24]. To ensure adequate data coverage while limiting regional imbalance, regions with fewer than 3000 pixels (Lismore, Macksville, Maclean, South East Queensland, and Western Australia) were grouped into an “Other Regions” category for modelling.
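The centre-in-polygon rule can be illustrated with a plain ray-casting test. This is a sketch only: it assumes a north-up grid in projected coordinates with a known top-left corner, and the function names and arguments are hypothetical. A production workflow would more likely rely on a geospatial library for the same test.

```python
def point_in_polygon(x: float, y: float, polygon: list) -> bool:
    """Ray-casting test; polygon is a list of (x, y) vertices without holes."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def pixels_with_centre_inside(polygon, x_origin, y_origin, pixel_size, n_cols, n_rows):
    """Return (row, col) pairs for pixels whose centre lies inside the polygon.

    x_origin and y_origin give the top-left grid corner; rows increase southwards.
    """
    selected = []
    for row in range(n_rows):
        for col in range(n_cols):
            centre_x = x_origin + (col + 0.5) * pixel_size
            centre_y = y_origin - (row + 0.5) * pixel_size
            if point_in_polygon(centre_x, centre_y, polygon):
                selected.append((row, col))
    return selected
```

A pixel that merely overlaps the block edge is dropped whenever its centre falls outside the polygon, which is the implicit edge filter described above.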

Features are drawn from the full historical sequence (1988–2023) to reconstruct the planting year retrospectively; the system is not designed as an as-of-year operational forecaster. Annual DEA geomedians are computed per calendar year and then stacked temporally; no future information is introduced within-year, but the multi-year context includes post-planting observations.

2.3. Data Sampling

A Leave-One-Region-Out Cross-Validation (LOROCV) approach was implemented to enhance the model’s ability to generalise across diverse geographies. As highlighted by Lyons et al. [25], splitting the data this way ensures robust cross-validation essential for accurate assessment of a model’s ability to generalise geographically. Each model was trained on all regions except the one reserved for validation or testing [26]. The withheld region’s pixels were split into two datasets: a validation dataset, which was used exclusively for the ‘early stopping’ and ‘reduce learning rate on plateau’ callbacks to determine when to stop training or reduce the learning rate, and a test dataset, which was used for final model evaluation. While the test dataset remained geographically distinct from the training data, its independence from the validation set depends on the degree of spatial autocorrelation within the withheld region. This evaluation approach aligns with best practices in spatial machine learning, ensuring that model performance is assessed on data not used for weight updates or direct optimisation. Table 2 shows the number of training, validation, and test pixels across models, detailing pixel allocation for each region in the LOROCV process, which adheres to best practices in evaluating spatial datasets.
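A compact sketch of the splitting logic follows. The text does not state how the withheld region's pixels were divided between validation and test, so the random 50/50 pixel split used here (`val_fraction=0.5`) is an assumption, as are the function and argument names.

```python
import random
from collections import defaultdict


def loro_splits(regions, val_fraction=0.5, seed=0):
    """Yield (held_out_region, train_idx, val_idx, test_idx) for each region.

    regions: one region name per pixel.
    val_fraction: share of held-out pixels used for validation (assumption).
    """
    by_region = defaultdict(list)
    for idx, region in enumerate(regions):
        by_region[region].append(idx)

    rng = random.Random(seed)
    for held_out in sorted(by_region):
        # Training pixels come from every region except the held-out one.
        train_idx = [i for name, idxs in by_region.items() if name != held_out for i in idxs]
        held = list(by_region[held_out])
        rng.shuffle(held)
        n_val = int(len(held) * val_fraction)
        yield held_out, train_idx, held[:n_val], held[n_val:]
```

Because validation and test pixels are drawn from the same withheld region, neighbouring pixels can land on both sides of that split, which is the spatial autocorrelation caveat noted above.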

2.4. Machine Learning Models

To determine the best approach to predict the age of macadamia orchard pixels, a range of machine learning and deep learning models were evaluated. The models encompass both traditional and advanced techniques, each with unique mechanisms for handling temporal data. Table 3 lists the ten model types trialled in this study. This comprehensive evaluation aims to identify the most effective model for accurate and reliable prediction of macadamia orchard planting years.

2.5. Hyperparameter Tuning

Hyperparameter tuning plays a critical role in this process, ensuring models do not overfit or underfit to training data. In this application, the model configurations are tuned to ensure spatial generalisability. By exploring the hyperparameter space for each region, the most effective settings for each model type can be determined, ensuring robust and reliable predictions that account for geographic and temporal diversity. This tailored approach aims to maximise model performance in mapping orchard planting dates with precision across varied landscapes. Details of the search space for each of the model types can be found in Appendix A.

2.5.1. Thresholding Approach

As a simple baseline for comparison with more complex algorithms, a method similar to [8] was implemented at the pixel level (rather than the block level) to align with the study objective of pixel-level planting year estimation and with the other model types. For each pixel, the planting year was estimated as the year in which a vegetation index (VI) first crossed a specified threshold in its time series. Threshold values from -0.2 to 0.8 (step 0.02) were explored for both NDVI and GNDVI. Recognising that specific thresholds may correspond to orchards of differing ages, integer “delta” adjustments from -4 to +4 years were applied to the crossing year. For each combination of VI, threshold, and delta, planting year predictions were generated and evaluated against planting records using standard metrics; the combination minimising root mean squared error (RMSE) was selected.
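The threshold search can be sketched as a small grid search over one vegetation index. This is an illustrative version under stated assumptions: a "crossing" is taken as the first year the index is at or above the threshold, and pixels that never cross are skipped when computing RMSE, since the text does not say how such pixels were handled. All names are hypothetical.

```python
import math

THRESHOLDS = [round(-0.2 + 0.02 * i, 2) for i in range(51)]  # -0.2 to 0.8, step 0.02
DELTAS = range(-4, 5)  # integer adjustments from -4 to +4 years


def first_crossing_year(years, vi_series, threshold):
    """Return the first year the index is at or above the threshold, else None."""
    for year, value in zip(years, vi_series):
        if value >= threshold:
            return year
    return None


def grid_search(years, pixel_series, true_years, thresholds=THRESHOLDS, deltas=DELTAS):
    """Return (rmse, threshold, delta) for the combination with the lowest RMSE."""
    best = None
    for threshold in thresholds:
        crossings = [first_crossing_year(years, s, threshold) for s in pixel_series]
        for delta in deltas:
            # Assumption: pixels that never cross the threshold are ignored.
            squared = [
                (crossing + delta - truth) ** 2
                for crossing, truth in zip(crossings, true_years)
                if crossing is not None
            ]
            if not squared:
                continue
            rmse = math.sqrt(sum(squared) / len(squared))
            if best is None or rmse < best[0]:
                best = (rmse, threshold, delta)
    return best
```

Running the search once on NDVI series and once on GNDVI series, then keeping the lower RMSE, would cover the VI dimension of the grid. A negative delta corresponds to a threshold that is only reached some years after planting.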

2.5.2. Machine Learning Approach

Hyperparameter tuning was conducted to enhance model performance for macadamia orchard planting year estimation. Tree-based models were optimised with scikit-learn’s RandomizedSearchCV [36], and deep learning models with Keras Tuner [20]. For tree-based models, parameter distributions included the number of estimators, maximum depth, learning rate, and regularisation parameters (Appendix A). The search spanned 50 iterations with threefold cross-validation to promote robust selection.

For neural network architectures, Keras Tuner HyperModel classes targeted key hyperparameters, including the number of layers, units per layer, dropout rates, activation functions, and optimiser settings (Appendix A). Bayesian optimisation was used to explore the hyperparameter space, with up to 50 trials initialised by five random points. During tuning, performance was assessed on validation data with early stopping (patience 5) to avoid unnecessary training. The total number of trained models is summarised in Table 4. This systematic procedure refined model performance and supported accurate predictions across regions.

2.6. Model Evaluation

Evaluation followed a Leave-One-Region-Out cross-validation (LOROCV) scheme in which, for each fold, one growing region was held out entirely for testing and all remaining regions supplied training/validation data. This setup assesses spatial generalisation to unseen geographic conditions and reduces overfitting to regional characteristics. Hyperparameters were tuned using only the non-held-out regions for each fold; the held-out region remained untouched until the final test.

Performance was quantified using mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination ($R^{2}$) [37]. For each model type and region, the top five hyperparameter settings (selected by the lowest validation MAE) were retrained for up to 100 epochs with early stopping (patience 10) and learning-rate reduction on plateau (factor 0.1, patience 5). Each region-model-hyperparameter combination was trained five times with different random initialisations to characterise variability due to stochastic optimisation. The resulting five test scores per configuration were averaged, and the best hyperparameter setting and checkpoint were retained for each region-model pair.
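For readers who want the three metrics spelled out, a minimal sketch using their standard definitions is given below. It is written in plain Python for clarity and is not the study's evaluation code; the function names are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Because RMSE squares the residuals, a few large misses raise it faster than MAE, which is why the two are reported together.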

To summarise overall behaviour, metrics were aggregated to report the mean and standard deviation of MAE, RMSE, and $R^{2}$ across runs and regions. For like-for-like comparison among model types, regional test predictions were also concatenated to form a unified test set per model. Consistent with the retrospective formulation, evaluation uses the full multi-year context; region-held-out (LOROCV) metrics are treated as the primary measure of spatial generalisation, with block-level random-split results reported only as supplementary context because shared regional conditions can yield optimistic error estimates.

2.7. Model Application

To generate national planting year maps, Landsat annual geometric medians from Digital Earth Australia (1988–2023) were extracted for all macadamia orchard areas delineated in the ATCM. Pixels were included only when the pixel centre lay within an ATCM orchard polygon (no edge-touch pixels), providing an implicit edge filter that reduces boundary mixing. As the ATCM polygons are generalised with a minimum mapping unit of approximately 1 ha and may aggregate neighbouring blocks/orchards into a single feature [8], planting year estimation was performed at the pixel level to retain sub-orchard heterogeneity within the mapped extent, acknowledging that polygon boundaries may not coincide exactly with management units.

The top-performing model type was identified from the LOROCV comparison. For deployment, a single model was then trained on the combined multi-region dataset using a block-level randomised split (80/10/10 for train/validation/test), ensuring validation and test blocks were spatially independent from training blocks while maximising training volume. Training proceeded for up to 200 epochs with early stopping (patience 10; minimum MAE improvement 0.001) and learning-rate reduction on plateau (factor 0.1; patience 5). The best checkpoint on the validation set was retained.
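The key property of the block-level split is that all pixels from one block stay in the same subset. A minimal sketch is shown below; the rounding rule, the seed, and the function name are assumptions, since the text only gives the 80/10/10 proportions.

```python
import random


def block_level_split(block_ids, fractions=(0.8, 0.1), seed=0):
    """Assign pixel indices to train/val/test so that no block spans two subsets.

    block_ids: one block identifier per pixel.
    fractions: (train, validation) shares of blocks; the remainder is the test set.
    """
    unique_blocks = sorted(set(block_ids))
    random.Random(seed).shuffle(unique_blocks)
    n_train = round(fractions[0] * len(unique_blocks))
    n_val = round(fractions[1] * len(unique_blocks))
    train_blocks = set(unique_blocks[:n_train])
    val_blocks = set(unique_blocks[n_train:n_train + n_val])

    split = {"train": [], "val": [], "test": []}
    for idx, block in enumerate(block_ids):
        if block in train_blocks:
            split["train"].append(idx)
        elif block in val_blocks:
            split["val"].append(idx)
        else:
            split["test"].append(idx)
    return split
```

With 422 blocks, this rounding yields 338 training, 42 validation, and 42 test blocks. Splitting by block rather than by pixel keeps neighbouring pixels of one block out of both training and test sets, although blocks from the same region still appear on both sides.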

The final model was applied to every eligible Landsat pixel within ATCM orchard boundaries nationwide, producing pixel-level planting year estimates. Regional rasters were mosaicked and merged into a single, spatially explicit dataset covering all growing regions. For management-facing summaries, pixel-wise estimates within each orchard block were aggregated by the median to provide a block-level value while preserving the underlying pixel-level maps. Outputs were vectorised and clipped to ATCM boundaries so that partial pixels intersecting orchard edges were split in the final vector product. Regional summaries and cumulative planted-area curves were then generated to visualise spatio-temporal patterns, and an interactive web map was produced for exploratory access to the results.
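The block-level summary step is a grouped median over pixel predictions. A minimal sketch, with hypothetical names and toy inputs, could look like this:

```python
from collections import defaultdict
from statistics import median


def block_medians(block_ids, pixel_predictions):
    """Return the median predicted planting year for each block."""
    grouped = defaultdict(list)
    for block, prediction in zip(block_ids, pixel_predictions):
        grouped[block].append(prediction)
    return {block: median(values) for block, values in grouped.items()}
```

The median is less sensitive than the mean to a few edge or mixed pixels with outlying predictions, which suits a management-facing summary value.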

2.8. Computing Infrastructure

All model hyperparameter tuning, training and evaluation were conducted on a high-performance computing (HPC) system running a 64-bit Linux-based operating environment, featuring an Intel Xeon Processor (Icelake) with 16 physical cores (2793.35 MHz). The system contained 251 GB of RAM and was equipped with an NVIDIA A100 PCIe 40 GB GPU (CUDA Version 12.6), optimised for deep learning computations. The Python environment was built around Python 3.9.19, with essential packages including TensorFlow (Version 2.15.1) for deep learning model construction, Rasterio (Version 1.3.11) and Geopandas (Version 1.0.1) for geospatial data processing, and Matplotlib (Version 3.9.2) and Seaborn (Version 0.13.2) for visualisation. The environment also included libraries such as Scikit-Learn (Version 1.5.2) and Pandas (Version 2.3.1) for machine learning and data handling, respectively.

2.9. Use of Generative AI Tools

During the preparation of this manuscript, the authors used ChatGPT 4 (OpenAI, 2024) to assist with language refinement, including grammar and phrasing. The authors reviewed and edited the content produced by the tool and take full responsibility for the final text.

3. Results

The following section presents the evaluation of the macadamia planting date prediction models. The analysis includes an examination of the data used for training, validation, and testing the models, a summary of overall model performance, an evaluation of errors over time and across regions, and insights into the application of the models for national-scale predictions. Together, these analyses provide a comprehensive assessment of the model effectiveness and highlight region-specific challenges and accuracies in predicting orchard block planting year.

3.1. Training Data

The analysis of the available data (Table 2) indicates an uneven geographic distribution of training and validation samples across regions. Bundaberg had the most available data, resulting in fewer training pixels and more validation and test data for the Bundaberg model. The North Coast NSW region had the least data available, resulting in a high number of training pixels and fewer validation and test pixels for this region’s models. Figure 3 illustrates the temporal distribution of available data at the pixel level by year and region. The temporal distribution of training data across regions is notably sparse in earlier years, with no data available for 1992–1996, 2002, and 2012.

3.2. Comparison of Model Types

Figure 4 presents a comparative evaluation of various machine learning and deep learning models in predicting the planting year of macadamia orchard pixels. Each subplot corresponds to a specific model and visualises the relationship between predicted and true planting years using a hex-bin plot. The BiRNN, GRU, and LSTM models demonstrate the highest accuracy, achieving $R^{2}$ scores above 0.9, MAE values below 1.4 years, and RMSEs under 2.3 years. The deep learning models show a tighter clustering along the diagonal, indicating high prediction accuracy closely aligned with true planting years. In contrast, the RF and GBT models exhibit greater variability, achieving an $R^{2}$ of 0.51, suggesting these methods may be less precise in this context. Notably, the thresholding approach demonstrates moderate clustering along the diagonal but exhibits some variability for recent planting years. This variability is due to the method’s requirement of a minimum age of two to three years to detect new plantings. Nonetheless, the thresholding approach performs well, achieving a MAE of 1.62 years, RMSE of 2.59 years, and an $R^{2}$ of 0.88, which is similar to the accuracy reported by Brinkhoff & Robson [8].

Figure 5 summarises the distribution of absolute errors via cumulative distribution functions (CDFs). Curves that rise more steeply indicate a larger fraction of predictions falling within smaller error tolerances (better performance). Deep-learning models, particularly GRU and LSTM, consistently dominate the CDFs across the low-error range, while RF/GBT form an intermediate tier and the thresholding baseline shows a heavier tail. TCN and Transformer lag across most thresholds, indicating a higher proportion of large errors relative to the other deep models. These distributional patterns are consistent with the MAE/RMSE summaries and the predicted-versus-true plots.
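The quantity plotted in such a curve is the share of predictions whose absolute error is at or below each tolerance. A minimal sketch of that calculation, with hypothetical names and toy inputs, is:

```python
def fraction_within(abs_errors, tolerance):
    """Share of predictions whose absolute error is at or below the tolerance."""
    return sum(error <= tolerance for error in abs_errors) / len(abs_errors)


def error_cdf(y_true, y_pred, tolerances):
    """Return the empirical CDF of absolute error evaluated at each tolerance."""
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return [fraction_within(abs_errors, tol) for tol in tolerances]
```

Reading a curve at a tolerance of one year, for example, gives the fraction of pixels predicted to within one year of the recorded planting year, which is often easier to communicate than a single MAE value.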

3.3. Temporal Analysis

Figure 6 presents the MAE for each planting year across the ten evaluated models. Each line captures the variation in prediction accuracy per model from 1988 to 2023, showing errors per planting year. Missing data points indicate years with no available test data. Notably, high error levels are observed for the RF and GBT models, particularly for more recent years, indicating challenges in accurately predicting planting years with these models. In contrast, the deep learning models, except for TCN, maintain a more consistent and lower error across the time span, reflecting their capacity to capture temporal patterns effectively. The thresholding model produced the highest errors in the most recent planting year, where its performance diverges from the deep learning models due to the method’s inability to accurately predict young plantings.
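A per-year error curve of this kind amounts to grouping absolute errors by the true planting year. The sketch below shows one way to compute it; the function name is hypothetical, and the same helper could group by region instead of year.

```python
from collections import defaultdict


def mae_by_group(groups, y_true, y_pred):
    """Return {group: MAE}, for example with the true planting year as the group."""
    errors = defaultdict(list)
    for group, truth, prediction in zip(groups, y_true, y_pred):
        errors[group].append(abs(truth - prediction))
    return {group: sum(values) / len(values) for group, values in sorted(errors.items())}
```

Years with no test pixels simply do not appear in the result, which corresponds to the gaps in the plotted lines.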

3.4. Regional Analysis

Figure 7 shows the MAE of each model across the macadamia growing regions in Australia. The models demonstrate substantial variability in prediction accuracy depending on the region. For example, the Other Regions region consistently has high validation errors, suggesting lower prediction accuracy. In contrast, regions such as North QLD and Gympie display narrower error distributions, indicating more consistent and potentially more accurate predictions. The figure underscores the significant influence of regional characteristics on model performance. Certain regions consistently show higher MAE values, which highlights the importance of region-specific model tuning or tailored approaches to enhance predictive accuracy across diverse geographic contexts.

3.5. Summary of Training Outcomes

The top-performing model type and associated hyperparameters for each model number are detailed in Table 5. The hyperparameters used for each model were selected based on the lowest MAE achieved during hyperparameter tuning. The average MAE values for each model are also included in the table, providing insights into the optimal configurations for each model type. These hyperparameters were instrumental in enhancing model performance and ensuring accurate predictions across diverse regional datasets. A complete per-region breakdown of held-out performance is provided in Appendix B.

3.6. Selecting the Top-Performing Model

Although the top-performing model varied by region, the GRU model emerged as the overall best performer, ranking first or second in every region-held-out model. Consequently, the GRU model, configured with the hyperparameters from the model with the Gympie region excluded (Table 5), was chosen as the final model and trained using a block-level randomised split of the data as described in Section 2.6. For clarity, the region-held-out (LOROCV) results are treated as the primary measure of spatial generalisation. Block-level random split results are reported only as supplementary context because they are susceptible to optimistic error estimates when training and testing share regional conditions. The final model was trained for a total of 123 epochs, with the learning rate decreased to $2.69 \times 10^{-4}$ at epoch 39 and $2.69 \times 10^{-5}$ at epoch 119; the model did not improve after the second reduction, and the early stopping callback was triggered at epoch 123. The lowest error was achieved at epoch 113, and the corresponding model was saved for evaluation. On the test data (not regionally independent), the model achieved an MAE of 0.59 years and an RMSE of 1.09 years. The final model consisted of 141,793 trainable parameters.
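
The interaction between learning-rate reduction, checkpointing, and early stopping described above can be illustrated with a plain-Python replay of a validation-loss history. This is a minimal sketch, not the training code used in the study: the function name `replay_schedule`, the patience values (5 and 10 epochs), the tenfold decay factor, the starting rate of 2.69e-3, and the synthetic loss history are all assumptions chosen for illustration.

```python
import math


def replay_schedule(val_losses, initial_lr, factor=0.1, lr_patience=5, stop_patience=10):
    """Replay a validation-loss history through plateau-based LR decay and early stopping.

    The decay factor and both patience values are illustrative assumptions.
    """
    lr = initial_lr
    best_loss = math.inf
    best_epoch = 0
    stale = 0          # epochs without improvement since the last improvement or LR cut
    lr_changes = []    # (epoch, new learning rate)
    stopped_epoch = None

    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # A checkpoint of the best model would be saved at this point.
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1

        # Early stopping: no improvement for `stop_patience` epochs.
        if epoch - best_epoch >= stop_patience:
            stopped_epoch = epoch
            break

        # Plateau detected: cut the learning rate and restart the plateau counter.
        if stale >= lr_patience:
            lr *= factor
            lr_changes.append((epoch, lr))
            stale = 0

    return {
        "best_epoch": best_epoch,
        "lr_changes": lr_changes,
        "stopped_epoch": stopped_epoch,
        "final_lr": lr,
    }


# Synthetic history (hypothetical): ten improving epochs followed by a plateau.
history = [1.0 - 0.05 * i for i in range(10)] + [0.6] * 20
result = replay_schedule(history, initial_lr=2.69e-3)
print(result)
```

In this toy history the best epoch precedes both the learning-rate cut and the stop, which mirrors the reported pattern in which the saved model (epoch 113) is earlier than the final reduction (epoch 119) and the stopping epoch (123).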

To better illustrate differences between the threshold-based method and the deep learning approach, Figure 8 presents a scatter plot comparing predicted versus true planting years on a per-pixel basis, evaluated using a test dataset comprising 5863 pixels. Figure 9 displays a scatter plot of predicted versus true planting years based on per-block medians across 42 blocks. The GRU model achieved an MAE of 0.71 years and an RMSE of 1.07 years. In contrast, the threshold method achieved an MAE of 2.1 years and an RMSE of 4.18 years. The GRU model’s close alignment with the diagonal demonstrates its high overall accuracy and ability to capture the subtle temporal signals of orchard establishment. Figure 10 visualises the intra-block variability of the GRU planting year predictions. Blocks with narrow boxes indicate consistent predictions, while wider spreads suggest more heterogeneity, whether due to orchard management practices or simply differences in pixel-level spectral signatures.
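
The relationship between pixel-level and block-level error can be sketched as follows. The example assumes each pixel carries a block identifier, a true planting year, and a predicted planting year; the helper names and the six toy pixels are hypothetical and are not drawn from the study data.

```python
import math
from collections import defaultdict
from statistics import median


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the inputs (years)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the inputs (years)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def block_medians(block_ids, values):
    """Collapse pixel values to one median value per block."""
    grouped = defaultdict(list)
    for block, value in zip(block_ids, values):
        grouped[block].append(value)
    return {block: median(vals) for block, vals in grouped.items()}


# Toy data (hypothetical): two blocks of three pixels each.
block_ids = ["A", "A", "A", "B", "B", "B"]
true_year = [2005, 2005, 2005, 2015, 2015, 2015]
pred_year = [2005, 2005, 2009, 2014, 2015, 2015]

pixel_mae = mae(true_year, pred_year)
pixel_rmse = rmse(true_year, pred_year)

true_by_block = block_medians(block_ids, true_year)
pred_by_block = block_medians(block_ids, pred_year)
blocks = sorted(true_by_block)
block_mae = mae([true_by_block[b] for b in blocks], [pred_by_block[b] for b in blocks])

print(pixel_mae, pixel_rmse, block_mae)
```

Because the median discards a minority of outlying pixels within a block, the block-level error in this toy case is lower than the pixel-level error, which is the smoothing effect that per-block aggregation is expected to have.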

3.7. Australian Macadamia Predicted Planting Year

Figure 11 presents the cumulative macadamia planting area by year and growing region. The plot shows a clear trend of increasing macadamia planting area over time, with significant growth from the early 2000s onward, particularly in Bundaberg. This trend indicates the expansion of macadamia orchards in Australia, reflecting rising demand for macadamias. Planting accelerates noticeably from around 2018, marking a period of rapid expansion in the macadamia industry.
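
A cumulative area curve of this kind can be derived from pixel-level predictions by counting pixels per predicted planting year and accumulating. The sketch below assumes a 30 m Landsat pixel (0.09 ha) and a simple list of `(region, planting_year)` records; the function name and the toy records are hypothetical, and the actual workflow used to build Figure 11 may differ.

```python
from collections import Counter, defaultdict
from itertools import accumulate

PIXEL_AREA_HA = 0.09  # assumption: 30 m x 30 m pixel = 900 m^2 = 0.09 ha


def cumulative_area_by_region(records, first_year, last_year):
    """Return {region: {year: cumulative planted area in ha}}.

    `records` is an iterable of (region, predicted_planting_year) pairs, one per pixel.
    """
    counts = defaultdict(Counter)
    for region, year in records:
        counts[region][year] += 1

    years = range(first_year, last_year + 1)
    return {
        region: dict(zip(years, accumulate(c[y] * PIXEL_AREA_HA for y in years)))
        for region, c in counts.items()
    }


# Toy records (hypothetical pixels).
records = [("Bundaberg", 2000), ("Bundaberg", 2000), ("Bundaberg", 2002), ("Gympie", 2001)]
areas = cumulative_area_by_region(records, 2000, 2003)
for region, series in areas.items():
    print(region, {year: round(area, 2) for year, area in series.items()})
```

A curve built this way counts only pixels inside currently mapped orchards, so orchards that were planted and later removed do not contribute to it.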

AARSC developed, as part of this study, the Australian Macadamia Society Predicted Planting Year Dashboard (https://experience.arcgis.com/experience/4365ba21a9e14d03ae988e9ba333f549/; accessed on 10 February 2025), which presents the planting year predictions produced by the deployment described in Section 2.6. The application serves the pixel-level national outputs derived from DEA Landsat annual geomedians (1988–2023) constrained to ATCM orchard polygons and provides interactive summaries by planting year within the current map extent (Figure 12). The dashboard is intended for visualisation and industry engagement; all quantitative results reported in this paper are computed offline from the same model outputs.

4. Discussion

This study presents a method to improve macadamia orchard planting year predictions using satellite-image time series and deep learning. Accurate planting year data are essential for yield forecasting and resource planning at both orchard and industry scales; knowing precisely when trees were established can significantly enhance yield models and market forecasts [4]. The final model was applied to predict the planting year for all pixels within Australian macadamia orchards mapped in the ATCM [17]; the resulting estimates reveal region-specific planting trends over time. The cumulative planting area over time (Figure 11) indicates substantial increases in planting areas across all regions in recent years (2018–2023), especially in the Bundaberg and Gympie growing regions. However, this analysis only considers macadamia orchards represented in the ATCM. Macadamia blocks that have been removed over time are not accounted for in the plot, meaning the cumulative areas shown do not reflect historical orchards that were planted and subsequently removed. The plot would therefore underestimate total planting activity if it were interpreted as a record of all historical plantings.

Deep learning models demonstrated significant advantages over traditional methods like thresholding, RF, and GBT in predicting planting dates from satellite imagery. A key strength of deep learning is its ability to learn complex, abstract features directly from the imagery, capturing not only visible land-clearing events but also subtle changes in vegetation dynamics indicative of orchard establishment. Unlike thresholding, which relies on simple, predefined rules, and traditional machine learning models that depend heavily on engineered features, deep learning models can automatically identify intricate temporal and spatial patterns in the data [38]. Across regions held out entirely for testing (LOROCV), all deep models achieved MAE < 1.5 years except TCN (Figure 4); GRU and LSTM performed best at 1.2 years, improving on the macadamia-specific block-level benchmark of ∼1.7 years reported by Brinkhoff & Robson [8]. When predictions are aggregated to blocks and evaluated under a block-level random split, MAE decreases to 0.6–0.7 years, approaching the <0.5 years reported by Chen et al. for almonds in a geographically compact, densely labelled setting [11]; this alignment is consistent with the effects of target aggregation and evaluation design (random splits admit regional feature sharing, whereas out-of-region tests do not). The consistently strong performance across deep architectures suggests that modelling temporal trends is central to success, yet the observed plateau indicates limits imposed by current data coverage. Expanding the spatial and temporal diversity of training data is therefore likely to yield further gains and more reliable estimates across diverse growing conditions.

The thresholding approach, which relied on identifying specific GNDVI values to determine orchard age, performed remarkably well. This simple approach proved effective for detecting orchards within an age range (>3 years), particularly when strong spectral signals were present. Nonetheless, it is inherently limited in its ability to generalise across diverse planting scenarios or account for more subtle temporal changes. In contrast, deep learning models utilised multi-year spectral patterns and temporal context. This suggests that deep learning models may accurately determine orchard age and planting year, even in cases where complex or subtle temporal dynamics were involved. Tree-based methods like RF and GBT offered moderate performance but were inherently constrained by their inability to model temporal dependencies effectively for this type of application.
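
To make the contrast concrete, a threshold rule of the general kind discussed here can be sketched in a few lines. GNDVI is computed as (NIR − Green)/(NIR + Green); everything else in the example is an assumption for illustration only, including the threshold value of 0.6, the three-year lag between planting and detection, the rule that the index must stay above the threshold to the end of the series, and the function names. It is not the calibrated rule used in this study or in Brinkhoff & Robson [8].

```python
def gndvi(nir, green):
    """Green Normalised Difference Vegetation Index from NIR and green reflectance."""
    return (nir - green) / (nir + green)


def threshold_planting_year(years, gndvi_series, threshold=0.6, lag_years=3):
    """Estimate planting year from an annual GNDVI series with a single threshold.

    Finds the first year from which GNDVI stays at or above `threshold` until the
    end of the series, then subtracts an assumed canopy-development lag.
    Returns None if no such year exists. Threshold and lag are placeholders.
    """
    crossing = None
    for year, value in zip(years, gndvi_series):
        if value >= threshold:
            if crossing is None:
                crossing = year
        else:
            crossing = None  # index fell below the threshold again, so reset
    return None if crossing is None else crossing - lag_years


# Toy annual series (hypothetical pixel).
years = list(range(2000, 2008))
series = [0.30, 0.35, 0.40, 0.50, 0.62, 0.66, 0.70, 0.72]
estimate = threshold_planting_year(years, series)
print(estimate)
```

A rule like this uses a single crossing event per pixel, whereas a sequence model receives the full multi-year trajectory, which is one way to understand why the two approaches diverge when the spectral signal is weak or noisy.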

The hyperparameter optimisation in this study was limited to a maximum of 50 trials, which, while sufficient to identify a set of effective hyperparameters, may not have fully explored the parameter space. It is possible that other hyperparameter combinations, beyond those tested, could yield improved model performance. Additionally, the training process may not have allowed sufficient time for some models, particularly larger and more complex architectures like the Temporal Convolutional Network (TCN), to converge and minimise error. The TCN’s consistently poor performance could be attributed to this limitation, as its greater complexity likely requires extended training to fully leverage its capacity to model temporal patterns [33]. Future work should consider increasing the number of hyperparameter trials and allocating additional training time to ensure that models achieve optimal performance.
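
The scale of the search space relative to the 50-trial budget can be illustrated by sampling configurations from the recurrent-model rows of Table A1. The sketch below uses plain random sampling as a stand-in; the tuning strategy actually used in the study is not specified here, a single `units` value is assumed to apply to all layers, and the dictionary keys and function name are hypothetical.

```python
import math
import random

# Discrete choices transcribed from the biRNN/LSTM/GRU rows of Table A1.
DISCRETE_SPACE = {
    "num_layers": [1, 2, 3],
    "units": list(range(32, 257, 32)),
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "recurrent_dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
    "activation": ["relu", "tanh"],
    "batch_norm": [True, False],
    "optimiser": ["adam", "sgd", "rmsprop", "nadam"],
    "kernel_initialiser": ["glorot_uniform", "he_normal", "lecun_normal"],
    "l2": [0.0, 0.002, 0.004, 0.006, 0.008, 0.01],
}
LOG10_LR_RANGE = (-4.0, -2.0)  # learning rate sampled on a log scale
MAX_TRIALS = 50


def sample_config(rng):
    """Draw one configuration uniformly at random (an assumed search strategy)."""
    config = {name: rng.choice(options) for name, options in DISCRETE_SPACE.items()}
    config["learning_rate"] = 10 ** rng.uniform(*LOG10_LR_RANGE)
    return config


rng = random.Random(0)
trials = [sample_config(rng) for _ in range(MAX_TRIALS)]

# Number of distinct discrete combinations, ignoring the continuous learning rate.
grid_size = math.prod(len(options) for options in DISCRETE_SPACE.values())
coverage = MAX_TRIALS / grid_size
print(grid_size, f"{coverage:.4%}")
```

Under these assumptions the discrete part of the space alone contains 248,832 combinations, so 50 trials sample only a small fraction of it, which is consistent with the caution expressed above about unexplored configurations.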

Accurately modelling planting dates for macadamia orchards across Australia’s diverse growing regions depends on a balanced and representative training dataset. The limited data availability before the 2000s (Figure 3), both temporally and geographically, presents challenges for predicting planting years in these earlier periods. Model performance is constrained by limited historical data, especially in regions with fewer pixels, which can reduce accuracy for years and regions that are sparsely represented in our dataset. Geographic and temporal imbalances identified in this study have influenced model performance, highlighting the need to address these challenges. Models with smaller validation and test datasets, such as the model with Other Regions withheld, exhibited high prediction inaccuracies even when overall errors were low (Figure 7), limiting the ability to learn region-specific features from other regions and thereby impacting generalisability. The merging of smaller regions appears to have created a class with greater geographical variability, leading to higher errors when validated and tested in these areas. Similarly, the concentration of available data in recent years (2017–2023) and earlier peaks (2004–2008) (Figure 3) creates a temporal imbalance: MAE is lower in these well-sampled periods and higher for planting years with sparse data (Figure 6). Enhancing data collection for under-represented regions and historical periods is essential to improve predictive accuracy and robustness. Collaborative efforts with industry are underway to address data gaps, with future work aiming to incorporate a larger, more comprehensive dataset that better represents growing regions both spatially and temporally. 
No explicit temporal or spatial reweighting was applied in this study, so estimates reflect the empirical sampling distribution (skewed toward recent years and larger regions), which helps explain higher errors for sparsely represented pre-2000 planting years. To mitigate this in future work—particularly if additional data further over-represents recent plantings or specific regions—we will evaluate (i) temporal reweighting to equalise the effective sample size per planting year bin, (ii) region-aware weighting to reduce dominance by data-rich regions, and (iii) stratified mini-batching so each update includes a balanced mix of years and regions. These adjustments are designed to improve reliability for under-represented years and locales without altering the retrospective formulation or the LOROCV evaluation protocol.
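
Option (i), temporal reweighting, can be sketched as inverse-frequency sample weights computed per planting year bin. This is one common formulation rather than a method evaluated in the study; the function name, the `bin_width` parameter, and the toy labels are hypothetical.

```python
from collections import Counter


def inverse_frequency_weights(planting_years, bin_width=1):
    """Per-sample weights that give every planting year bin the same total weight.

    weight_i = N / (K * n_bin(i)), where N is the number of samples, K the number
    of non-empty bins, and n_bin(i) the size of the bin containing sample i.
    The weights sum to N, so the overall loss scale is unchanged.
    """
    bins = [year // bin_width for year in planting_years]
    counts = Counter(bins)
    n_samples, n_bins = len(bins), len(counts)
    return [n_samples / (n_bins * counts[b]) for b in bins]


# Toy labels (hypothetical): recent plantings dominate.
labels = [2020, 2020, 2020, 1995]
weights = inverse_frequency_weights(labels)
print(weights)
```

The same function applied to region labels instead of year bins would give a simple form of option (ii), and the resulting weights could equally be used as sampling probabilities for stratified mini-batches as in option (iii).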

The lower errors in Figure 8 compared to Figure 4 can be explained by differences in data splitting. In Figure 4, the data were split at the regional level, so the model was tested on entire growing regions held out from training. This forces the model to generalise to unseen geographical conditions, often increasing error. By contrast, Figure 8 used block-level splitting, meaning the model still saw data from the same region during training (just not from those specific blocks). Consequently, the model encounters less overall variation at test time, leading to lower error estimates. This optimism arises from regional feature sharing (e.g., climate regimes, soils, management practices, sensor/viewing geometry) that the model can exploit under random splits, but which is intentionally broken by region-held-out evaluation.
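
The two evaluation designs can be contrasted with a minimal splitting sketch. It assumes each pixel has a region label and a block identifier; the function names, the toy labels, and the seed are hypothetical, and the 80/10/10 proportions follow the block-level split reported for the final model.

```python
import random


def leave_one_region_out(regions):
    """Yield (held_out_region, train_indices, test_indices), one fold per region."""
    for held_out in sorted(set(regions)):
        train_idx = [i for i, r in enumerate(regions) if r != held_out]
        test_idx = [i for i, r in enumerate(regions) if r == held_out]
        yield held_out, train_idx, test_idx


def block_random_split(block_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Randomly assign whole blocks to train/val/test and return pixel indices."""
    blocks = sorted(set(block_ids))
    random.Random(seed).shuffle(blocks)
    n_train = round(fractions[0] * len(blocks))
    n_val = round(fractions[1] * len(blocks))

    partition = {}
    for position, block in enumerate(blocks):
        if position < n_train:
            partition[block] = "train"
        elif position < n_train + n_val:
            partition[block] = "val"
        else:
            partition[block] = "test"

    split = {"train": [], "val": [], "test": []}
    for i, block in enumerate(block_ids):
        split[partition[block]].append(i)
    return split


# Toy labels (hypothetical): 20 pixels, 10 blocks of 2 pixels, 3 regions.
regions = ["Bundaberg"] * 8 + ["Gympie"] * 8 + ["QLD North"] * 4
block_ids = [f"block{i // 2}" for i in range(20)]

folds = list(leave_one_region_out(regions))
split = block_random_split(block_ids)
print([(name, len(tr), len(te)) for name, tr, te in folds])
print({part: len(idx) for part, idx in split.items()})
```

In the first design no test pixel shares a region with any training pixel, whereas in the second design test blocks are unseen but their regions usually are not, which is the source of the more optimistic error estimates.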

Figure 8 compares the GRU model against the threshold-based method initially described by Brinkhoff & Robson [8]. In that previous work, orchard-level (i.e., block-averaged) data were used to define NDVI/GNDVI thresholds, whereas this study derived them from pixel-level data for a more direct comparison with the deep learning approaches. An immediate consequence of moving to pixel-level analysis is that orchards planted in a single year can exhibit varied reflectance signals across different pixels, reflecting factors like row spacing, canopy density, or management zones. Figure 9 shows the performance of each model when each block value is the median of all pixel-level predictions. This block-level perspective helps to smooth out within-orchard variability. Consequently, the block-level results (Figure 9) tend to cluster more tightly around the diagonal and show a slightly improved $R^{2}$ value, indicating robust orchard-wide estimates. However, as noted, this aggregation can mask important sub-orchard differences, such as new rows or partial replanting within older blocks, which are clearly captured in pixel-level analysis (Figure 8). Thus, although a block-level summary may align well with farm management units and simplify interpretation, the richer pixel-level approach offers finer spatial resolution of canopy development and highlights heterogeneity arising from row spacing, uneven planting, or distinct management zones within the same orchard block.

In this study, a single GRU model was utilised for planting year predictions. Future research should consider implementing probabilistic modelling techniques, as outlined by Lyons et al. [25]. By training multiple models on various data subsets using methods such as k-fold cross-validation or bootstrapping, it is possible to generate probabilistic estimates for each pixel’s planting year. Aggregating these estimates would create a map indicating the likelihood of planting in a given year for each pixel, providing both predictions and a measure of prediction confidence. This approach addresses current limitations by highlighting areas with prediction uncertainty, particularly in regions with sparse or imbalanced data. It could guide data collection efforts to improve model accuracy and help mitigate data biases, offering more stable estimates of macadamia planting years across diverse regions. Incorporating these techniques aligns with best practices for remote sensing accuracy assessment, thereby enhancing the reliability and interpretability of planting year predictions for improved agricultural management.
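
The aggregation step described above can be sketched as follows, assuming an ensemble of models has already been trained on bootstrap resamples and that each model returns a continuous planting year for a pixel. The function names and the five example predictions are hypothetical; no such ensemble was trained in this study.

```python
import random
from collections import Counter


def bootstrap_indices(n_samples, rng):
    """Indices for one bootstrap resample (sampling with replacement)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


def planting_year_distribution(ensemble_predictions):
    """Empirical probability of each (rounded) planting year across ensemble members."""
    years = [round(p) for p in ensemble_predictions]
    counts = Counter(years)
    return {year: count / len(years) for year, count in sorted(counts.items())}


def most_likely_year(distribution):
    """Return (year, probability) for the modal planting year."""
    return max(distribution.items(), key=lambda item: item[1])


# One resample of a 10-sample training set, as each ensemble member would receive.
resample = bootstrap_indices(10, random.Random(0))

# Hypothetical predictions for a single pixel from five ensemble members.
pixel_predictions = [2010.2, 2010.4, 2009.8, 2011.3, 2010.1]
distribution = planting_year_distribution(pixel_predictions)
best_year, confidence = most_likely_year(distribution)
print(distribution, best_year, confidence)
```

Mapping the modal probability per pixel would give the confidence layer described above, with low values flagging pixels where ensemble members disagree.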

Implementing data augmentation strategies [39] could enhance model performance across the entire time series. Furthermore, generating synthetic satellite imagery and corresponding planting year labels to mimic the spectral and temporal characteristics of historical orchards could expand the training dataset, thus reducing bias and improving model generalisability [40,41]. Moreover, moving from a pixel-based to a patch-based analysis offers promising improvements in prediction accuracy by incorporating the broader landscape context surrounding each pixel [42]. This patch-based approach captures spatial relationships and patterns crucial for orchard age estimation, which may lead to more robust and reliable planting year predictions.

Future work could also examine simpler, transparent baselines alongside deep sequence models to contextualise national-scale feasibility. In particular, ordinary least squares and Elastic Net would provide like-for-like comparators under the same region-held-out (LOROCV) protocol. A targeted feature-set ablation could further quantify the contribution of (i) indices-only (NDVI/GNDVI), (ii) raw bands only, and (iii) bands+indices. Given the complementarity between broadband reflectances and vegetation indices, indices-only may underperform relative to bands+indices; empirical confirmation under identical training and evaluation settings would clarify these contributions.

5. Conclusions

This study demonstrated the effectiveness of using time-series Landsat imagery and advanced deep learning models to predict the planting years of macadamia orchards across multiple growing regions in Australia. By leveraging extensive satellite data spanning from 1988 to 2023 and state-of-the-art modelling techniques, a robust methodology has been developed for accurately mapping the age of macadamia orchards, addressing limitations in traditional orchard age estimation methods.

The predictive models developed in this study offer significant potential for enhancing agricultural planning, yield forecasting, and informed decision-making within the macadamia industry. Accurate orchard age information is invaluable for optimising resource allocation, pest and disease control, and harvest and processing scheduling. The integration of these insights into a dashboard-style mapping application shared with industry stakeholders provides a practical tool for real-world applications, facilitating ongoing collaboration and data expansion.

Future work should first address data limitations by expanding the geographic and temporal coverage of training samples, particularly in underrepresented regions and earlier planting years. To further refine planting year predictions, approaches such as probabilistic modelling [25] could be employed, providing uncertainty estimates that highlight areas where predictions are less reliable. Implementing data augmentation strategies—for instance, generating synthetic satellite imagery—can increase the diversity of training examples and improve model robustness [39,40,41]. Meanwhile, a patch-based analysis, which examines clusters of pixels rather than individual pixels, could capture broader spatial context and further enhance orchard age estimation [42]. Finally, exploring advanced architectural designs (e.g., ensemble or hybrid deep learning) could further boost both accuracy and interpretability [5].

Ultimately, this research contributes a scalable, adaptable framework for predictive modelling in precision agriculture. The methodologies and insights presented support the sustainability, profitability, and resilience of the macadamia industry in Australia and offer a foundation for broader applications across other perennial crops and countries. By addressing key challenges and leveraging advances in remote sensing and machine learning, this framework can help the agricultural sector prepare for future environmental and economic challenges.

Author Contributions

Conceptualisation, A.C., J.B. and A.R.; Data curation, A.C.; Formal analysis, A.C.; Investigation, A.C.; Methodology, A.C. and J.B.; Project administration, A.C.; Software, A.C.; Validation, A.C.; Visualisation, A.C.; Writing—original draft, A.C.; Writing—review and editing, A.C., J.B., A.R. and C.S.; Funding acquisition, A.R. All authors have read and agreed to the published version of the manuscript.

Funding

The “Spatially enabling tree crop production practice” project (AS23000) is funded by the Hort Frontiers Advanced Production Practice, part of the Hort Frontiers strategic partnership initiative developed by Hort Innovation, with co-investment from University of New England (UNE), Australia Avocados Ltd (AAL), Australian Banana Growers Council (ABGC), Citrus Australia (CA), Australian Macadamia Society (AMS), and the CRC for Future Food Systems (FFSCRC) and contributions from the Australian Government.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to commercial sensitivities.

Acknowledgments

The authors would like to acknowledge the National Committee on Land Use and Management Information (NCLUMI), the National Computing Infrastructure, and Digital Earth Australia for data access and resources to support the initial investigations as part of this study. The authors would also like to acknowledge the support of the Australian Macadamia Society (AMS) and Horticulture Innovation Australia for their guidance and funding, which were instrumental in the development and execution of this research. Special thanks are extended to industry stakeholders for their valuable insights and feedback, which have helped shape the application and practical relevance of this work. The ATCM (which is updated and maintained by researchers at AARSC) has been essential to this research at this scale (national).

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ATCM	Australian Tree Crop Map
biRNN	Bidirectional Recurrent Neural Network
DEA	Digital Earth Australia
GBT	Gradient Boosted Trees
GNDVI	Green Normalised Difference Vegetation Index
GRU	Gated Recurrent Unit
HPC	High-Performance Computing
LGA	Local Government Area
LOROCV	Leave-One-Region-Out Cross-Validation
LSTM	Long Short-Term Memory
MAE	Mean Absolute Error
NDVI	Normalised Difference Vegetation Index
RF	Random Forest
RMSE	Root Mean Square Error
ResNet	Residual Network
TCN	Temporal Convolutional Network
TCNN	Temporal Convolutional Neural Network

Appendix A. Hyperparameter Search Spaces

This appendix contains the hyperparameter search spaces for each model evaluated in the study.

Table A1. Hyperparameter search spaces for each model.

Model	Hyperparameter	Search Space
TCN	Number of stacks	1 to 3
	Number of filters per layer	32 to 256 (step 32)
	Kernel size	{2, 3, 4, 5}
	Dropout rate	0.0 to 0.5 (step 0.1)
	Activation function	ReLU, SELU, Tanh
	Use batch normalisation	True or False
	Use layer normalisation	True or False
	Use skip connections	True or False
	Dilations	[1,2,4,8], [1,2,4,8,16], [1,2,4,8,16,32]
	Optimiser	Adam, SGD, RMSprop, Nadam
	Learning rate	$1 \times 10^{- 4}$ to $1 \times 10^{- 2}$ (log scale)
	Kernel initialiser	Glorot Uniform, He Normal, Lecun Normal
	L2 regularisation	0.0 to $1 \times 10^{- 2}$ (step $2 \times 10^{- 3}$ )
TCNN	Number of layers	1 to 5
	Number of filters per layer	32 to 256 (step 32)
	Kernel size	{3, 5, 7}
	Dropout rate	0.0 to 0.5 (step 0.1)
	Activation function	ReLU, SELU, Tanh
	Use batch normalisation	True or False
	Optimiser	Adam, SGD, RMSprop, Nadam
	Learning rate	$1 \times 10^{- 4}$ to $1 \times 10^{- 2}$ (log scale)
	Kernel initialiser	Glorot Uniform, He Normal, Lecun Normal
	L2 regularisation	0.0 to $1 \times 10^{- 2}$ (step $2 \times 10^{- 3}$ )
Transformer	Number of layers	1 to 6
	Embedding dimension ( $d_{m o d e l}$ )	{64, 128, 256}
	Number of attention heads	{2, 4, 8}
	Feedforward dimension ( $d_{f f}$ )	{128, 256, 512}
	Use layer normalisation	True or False
	Activation function	ReLU, SELU, Tanh, Linear
	Dropout rate	0.0 to 0.5 (step 0.1)
	Optimiser	Adam, SGD, RMSprop, Nadam
	Learning rate	$1 \times 10^{- 4}$ to $1 \times 10^{- 3}$ (log scale)
	Kernel initialiser	Glorot Uniform, He Normal, Lecun Normal
	L2 regularisation	0.0 to $1 \times 10^{- 2}$ (step $2 \times 10^{- 3}$ )
biRNN, LSTM, GRU	Number of layers	1 to 3
	Units per layer	32 to 256 (step 32)
	Dropout rate	0.0 to 0.5 (step 0.1)
	Recurrent dropout	0.0 to 0.5 (step 0.1)
	Activation function	ReLU, Tanh
	Use batch normalisation	True or False
	Optimiser	Adam, SGD, RMSprop, Nadam
	Learning rate	$1 \times 10^{- 4}$ to $1 \times 10^{- 2}$ (log scale)
	Kernel initialiser	Glorot Uniform, He Normal, Lecun Normal
	L2 regularisation	0.0 to $1 \times 10^{- 2}$ (step $2 \times 10^{- 3}$ )
ResNet	Number of residual blocks	1 to 4
	Number of filters	32 to 128 (step 32)
	Filter size	{3, 5, 7}
	Dropout rate	0.0 to 0.5 (step 0.1)
	Activation function	ReLU, SELU, Tanh
	Optimiser	Adam, SGD, RMSprop, Nadam
	Learning rate	$1 \times 10^{- 4}$ to $1 \times 10^{- 3}$ (log scale)
	Kernel initialiser	Glorot Uniform, He Normal, Lecun Normal
	L2 regularisation	0.0 to $1 \times 10^{- 2}$ (step $2 \times 10^{- 3}$ )
Random Forest (RF)	Number of estimators	{100, 200, 500}
	Maximum depth	{None, 10, 20, 30}
	Minimum samples split	{2, 5, 10}
	Minimum samples leaf	{1, 2, 4}
	Bootstrap	True or False
HistGradientBoosting (GBT)	Maximum iterations	{100, 200, 500}
	Maximum depth	{None, 10, 20, 30}
	Learning rate	{0.01, 0.1, 0.2}
	Minimum samples leaf	{1, 2, 4}
	L2 regularisation	{0.0, 0.1, 0.5}

Appendix B. Held-Out Performance by Region and Model

Table A2. Region-held-out (LOROCV) performance by region and model. MAE and RMSE are in years. Pixel count is the number of test pixels used for that region–model evaluation.

Region	Model	MAE	RMSE	$R^{2}$	Pixel Count
Bundaberg	LSTM	1.16	1.86	0.89	18,304
	GRU	1.17	1.88	0.88	18,304
	biRNN	1.19	2.03	0.86	18,304
	Transformer	1.25	1.94	0.88	18,304
	Thresholding	1.33	1.91	0.88	18,304
	ResNet	1.43	2.07	0.86	18,304
	TCNN	1.49	2.18	0.84	18,304
	TCN	2.42	3.95	0.48	18,304
	GBT	3.65	5.99	−0.19	18,304
	RF	3.70	5.89	−0.15	18,304
Gympie	GRU	1.05	2.65	0.90	3916
	biRNN	1.11	2.86	0.88	3916
	LSTM	1.13	2.76	0.89	3916
	ResNet	1.24	2.80	0.89	3916
	TCNN	1.32	3.18	0.85	3916
	Thresholding	1.35	2.81	0.87	3916
	TCN	1.58	3.57	0.81	3916
	Transformer	1.58	3.97	0.77	3916
	RF	1.63	3.92	0.78	3916
	GBT	1.66	4.11	0.75	3916
North Coast NSW	Transformer	1.36	2.80	0.93	1805
	GRU	1.54	2.67	0.94	1805
	TCNN	1.55	2.79	0.93	1805
	ResNet	1.67	2.89	0.93	1805
	LSTM	1.72	2.82	0.93	1805
	biRNN	1.90	3.07	0.92	1805
	GBT	2.28	3.72	0.88	1805
	Thresholding	2.28	3.77	0.88	1805
	RF	2.43	3.77	0.88	1805
	TCN	3.67	5.26	0.77	1805
Other Regions	ResNet	2.15	3.15	0.91	2595
	GRU	2.15	3.33	0.89	2595
	biRNN	2.22	3.65	0.87	2595
	LSTM	2.25	3.53	0.88	2595
	TCNN	2.38	3.71	0.87	2595
	Transformer	2.43	3.79	0.86	2595
	Thresholding	2.86	4.17	0.84	2595
	GBT	3.06	4.62	0.80	2595
	RF	3.25	5.07	0.75	2595
	TCN	3.86	5.42	0.72	2595
QLD North	TCNN	0.74	1.73	0.93	3583
	GRU	0.75	1.87	0.92	3583
	LSTM	0.76	1.91	0.92	3583
	biRNN	0.85	1.70	0.93	3583
	Transformer	0.93	1.83	0.93	3583
	GBT	1.04	1.90	0.92	3583
	ResNet	1.12	1.82	0.93	3583
	RF	1.29	2.10	0.90	3583
	TCN	1.75	2.70	0.84	3583
	Thresholding	2.16	3.09	0.79	3583

References

1. Sylvaine, S.; Jean-Charles, B.; Jean-François, D.; Benoît, S. Biodiversity and pest management in orchard systems. A review. Agron. Sustain. Dev. 2010, 30, 139–152.
2. Wu, T.; Wang, Y.; Yu, C.; Chiarawipa, R.; Zhang, X.; Han, Z.; Wu, L. Carbon Sequestration by Fruit Trees – Chinese Apple Orchards as an Example. PLoS ONE 2012, 7, e38883.
3. Zanotelli, D.; Montagnani, L.; Manca, G.; Scandellari, F.; Tagliavini, M. Net ecosystem carbon balance of an apple orchard. Eur. J. Agron. 2015, 63, 97–104.
4. Mayer, D.; Stephenson, R.; Jones, K.; Wilson, K.; Bell, D.; Wilkie, J.; Lovatt, J.; Delaney, K. Annual forecasting of the Australian macadamia crop – integrating tree census data with statistical climate-adjustment models. Agric. Syst. 2006, 91, 159–170.
5. Mayer, D.G.; Chandra, K.A.; Burnett, J.R. Improved crop forecasts for the Australian macadamia industry from ensemble models. Agric. Syst. 2019, 173, 519–523.
6. Justice, C.; Townshend, J.; Vermote, E.; Masuoka, E.; Wolfe, R.; Saleous, N.; Roy, D.; Morisette, J. An overview of MODIS Land data processing and product status. Remote Sens. Environ. 2002, 83, 3–15.
7. Wulder, M.A.; Coops, N.C.; Roy, D.P.; White, J.C.; Hermosilla, T. Current status of Landsat program, science, and applications. Remote Sens. Environ. 2019, 225, 127–147.
8. Brinkhoff, J.; Robson, A.J. Macadamia orchard planting year and area estimation at a national scale. Remote Sens. 2020, 12, 2245.
9. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P. Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sens. Environ. 2012, 120, 25–36.
10. Claverie, M.; Ju, J.; Masek, J.G.; Dungan, J.L.; Vermote, E.F.; Roger, J.C.; Skakun, S.V.; Justice, C. The harmonized Landsat and Sentinel-2 surface reflectance data set. Remote Sens. Environ. 2018, 219, 145–161.
11. Chen, B.; Jin, Y.; Brown, P. Automatic mapping of planting year for tree crops with Landsat satellite time series stacks. ISPRS J. Photogramm. Remote Sens. 2019, 151, 176–188.
12. Zhou, X.X.; Li, Y.Y.; Luo, Y.K.; Sun, Y.W.; Su, Y.J.; Tan, C.W.; Liu, Y.J. Research on remote sensing classification of fruit trees based on Sentinel-2 multi-temporal imageries. Sci. Rep. 2022, 12, 11549.
13. Wang, Y.; Hollingsworth, P.M.; Zhai, D.; West, C.D.; Green, J.M.; Chen, H.; Hurni, K.; Su, Y.; Warren-Thomas, E.; Xu, J.; et al. High-resolution maps show that rubber causes substantial deforestation. Nature 2023, 623, 340–346.
14. Zhong, L.; Hu, L.; Zhou, H. Deep learning based multi-temporal crop classification. Remote Sens. Environ. 2019, 221, 430–443.
15. Kussul, N.; Lavreniuk, M.; Skakun, S.; Shelestov, A. Deep learning classification of land cover and crop types using remote sensing data. IEEE Geosci. Remote Sens. Lett. 2017, 14, 778–782.
16. Meghraoui, K.; Sebari, I.; Pilz, J.; Ait El Kadi, K.; Bensiali, S. Applied Deep Learning-Based Crop Yield Prediction: A Systematic Analysis of Current Developments and Potential Challenges. Technologies 2024, 12, 43.
17. Robson, A.; Walsh, K.B.; Yaakobi, R.; Empson, M.; Mazhar, M.S.; Dickinson, G.; Schultz, A.; Gillett, S.; Young, K. Multi-Scale Monitoring Tools for Managing Australian Tree Crops – Phase 2; Final Report No. ST19001; Horticulture Innovation Australia: Sydney, Australia, 2023.
18. Wu, H.; Li, Z.L. Scale Issues in Remote Sensing: A Review on Analysis, Processing and Modeling. Sensors 2009, 9, 1768–1793.
19. Bishop, T.; McBratney, A.; Whelan, B. Measuring the quality of digital soil maps using information criteria. Geoderma 2001, 103, 95–111.
20. O’Malley, T.; Bursztein, E.; Long, J.; Chollet, F.; Jin, H.; Invernizzi, L.; de Marmiesse, G.; Fu, Y.; Podivín, J.; Schäfer, F.; et al. Keras Tuner. 2019. Available online: https://github.com/keras-team/keras-tuner (accessed on 8 November 2025).
21. Roberts, D.; Mueller, N.; Mcintyre, A. High-dimensional pixel composites from earth observation time series. IEEE Trans. Geosci. Remote Sens. 2017, 55, 6254–6264.
22. Rouse, J.W.; Haas, R.H.; Schell, J.; Deering, D.W. Monitoring Vegetation Systems in the Great Plains with ERTS; NASA Special Publication; NASA: Washington, DC, USA, 1974; Volume 351, p. 309.
23. Gitelson, A.A.; Merzlyak, M.N. Remote sensing of chlorophyll concentration in higher plant leaves. Adv. Space Res. 1998, 22, 689–692.
24. Foody, G.M. Status of land cover classification accuracy assessment. Remote Sens. Environ. 2002, 80, 185–201.
25. Lyons, M.B.; Keith, D.A.; Phinn, S.R.; Mason, T.J.; Elith, J. A comparison of resampling methods for remote sensing classification and accuracy assessment. Remote Sens. Environ. 2018, 208, 145–153.
26. Foody, G.M. Thematic map comparison. Photogramm. Eng. Remote Sens. 2004, 70, 627–633.
27. Schuster, M.; Paliwal, K.K. Bidirectional recurrent neural networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681.
28. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
29. Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; pp. 1724–1734.
30. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
31. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
32. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
33. Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
34. Pelletier, C.; Webb, G.I.; Petitjean, F. Temporal convolutional neural network for the classification of satellite image time series. Remote Sens. 2019, 11, 523.
35. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
36. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
37. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
38. Amato, F.; Guignard, F.; Robert, S.; Kanevski, M. A novel framework for spatio-temporal prediction of environmental data using deep learning. Sci. Rep. 2020, 10, 22243.
39. Clark, A.; Phinn, S.; Scarth, P. Pre-processing training data improves accuracy and generalisability of convolutional neural network based landscape semantic segmentation. Land 2023, 12, 1268.
40. Hao, X.; Liu, L.; Yang, R.; Yin, L.; Zhang, L.; Li, X. A review of data augmentation methods of remote sensing image target recognition. Remote Sens. 2023, 15, 827.
41. Iglesias, G.; Talavera, E.; González-Prieto, Á.; Mozo, A.; Gómez-Canaval, S. Data augmentation techniques in time series domain: A survey and taxonomy. Neural Comput. Appl. 2023, 35, 10123–10145.
42. Song, H.; Kim, Y.; Kim, Y.I. A patch-based light convolutional neural network for land-cover mapping using landsat-8 images. Remote Sens. 2019, 11, 114.

Figure 1. An overview of the planting year modelling pipeline, highlighting data inputs, sampling strategy, model development, and evaluation workflow.

Figure 2. Geographic distribution of macadamia growing regions across Australia.

Figure 3. Temporal distribution of the planting year pixels coloured according to region.

Figure 4. Hex-bin scatter plots comparing predicted (y-axis) versus true (x-axis) planting years for various predictive models. Darker hex bins represent a higher density of data points and lighter bins fewer data points. The diagonal red line indicates perfect agreement (1:1). Marginal histograms contextualise the distributions of true and predicted planting years.

Figure 5. Cumulative distribution function (CDF) plot showing the cumulative probabilities of absolute errors for various predictive models. The x-axis represents the absolute prediction error in years, while the y-axis shows the cumulative probability of errors being less than or equal to the values on the x-axis.
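
A curve of the kind described for Figure 5 can be computed directly from paired true and predicted planting years. The following is a minimal sketch, not the authors' code: the function names and the five example planting years are hypothetical and chosen only for illustration.

```python
import numpy as np

def absolute_error_cdf(y_true, y_pred):
    """Return sorted absolute errors and their empirical cumulative probabilities."""
    abs_err = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    x = np.sort(abs_err)
    # Fraction of samples with an error at or below each sorted value.
    p = np.arange(1, x.size + 1) / x.size
    return x, p

def fraction_within(y_true, y_pred, tolerance_years):
    """Share of predictions whose absolute error is <= tolerance_years."""
    abs_err = np.abs(np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float))
    return float(np.mean(abs_err <= tolerance_years))

# Hypothetical planting years, for illustration only.
true_years = [2001, 2005, 2010, 2015, 2019]
pred_years = [2001, 2006, 2008, 2015, 2022]

x, p = absolute_error_cdf(true_years, pred_years)
within_one_year = fraction_within(true_years, pred_years, 1)
print(x, p, within_one_year)
```

Plotting `p` against `x` as a step function gives the CDF; a curve that rises faster towards 1 indicates a model with smaller errors.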

Figure 6. Mean absolute error (MAE) per true planting year for various machine learning models. The plot compares the MAE for different models across planting years from 1988 to 2023. Each line represents the fluctuation in prediction accuracy for each model over time. Missing years indicate that no test samples were available for those years in the held-out folds.

Figure 7. Mean absolute error (MAE) for predictive models tested on various excluded macadamia producing regions in Australia. Each boxplot represents the distribution of MAE values for a specific model in each region, highlighting the variability and performance of each model across different geographic settings.

Figure 8. Scatter plot comparing predicted (y-axis) versus true (x-axis) orchard planting years for both the top-performing GRU model and the thresholding approach, evaluated on a per-pixel basis. Each point corresponds to a single orchard pixel; perfect agreement lies on the dashed red 1:1 line.

Figure 9. Scatter plot comparing predicted (y-axis) versus true (x-axis) orchard planting years for both the top-performing GRU model and the thresholding approach, evaluated on a per-block basis (block median of pixel-level predictions). Each point corresponds to a single orchard block; perfect agreement lies on the dashed red 1:1 line.
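
The per-block aggregation named in the Figure 9 caption (block median of pixel-level predictions) can be sketched as follows. This is an illustrative example only: the block identifiers and predicted years are hypothetical, and the helper `block_median_predictions` is not taken from the paper.

```python
from collections import defaultdict
from statistics import median

def block_median_predictions(block_ids, pixel_predictions):
    """Group pixel-level predicted years by block and take the median per block."""
    grouped = defaultdict(list)
    for block_id, predicted_year in zip(block_ids, pixel_predictions):
        grouped[block_id].append(predicted_year)
    return {block_id: median(years) for block_id, years in grouped.items()}

# Hypothetical pixels from two orchard blocks.
block_ids = ["A", "A", "A", "B", "B", "B", "B"]
pixel_preds = [2004, 2005, 2011, 1998, 1999, 1999, 2000]

block_pred = block_median_predictions(block_ids, pixel_preds)
print(block_pred)
```

The median is less sensitive than the mean to a few outlying pixels (such as the 2011 value in block A), which is one reason it suits block-level summaries of noisy pixel predictions.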

Figure 10. Per-block distribution of predicted planting years (x-axis) for individual orchard blocks (y-axis; sorted by pixel count). For each block, the black box spans the interquartile range (Q1–Q3) of pixel-level predictions; the red centre line is the median; whiskers extend to the most extreme values within 1.5 × IQR; black points beyond are outliers. A × and adjacent red label denote the recorded planting year for that block. Alignment of the median/box with the red × indicates agreement; box width reflects within-block heterogeneity.

Figure 11. Cumulative macadamia planting area by year and growing region (1988–2023) as predicted by the GRU model. The x-axis represents planting years from 1988 to 2023, while the y-axis shows the cumulative planting area in hectares. The plot includes only orchards shown in the Australian Tree Crop Map (ATCM) as at 27 November 2023.

Figure 12. Screenshot of the Australian Macadamia Society Predicted Planting Year Dashboard. The dashboard can be accessed at https://experience.arcgis.com/experience/4365ba21a9e14d03ae988e9ba333f549/ (accessed on 10 February 2025).

Table 1. The number of macadamia blocks and pixels in each growing region with planting year data.

Region	Number of Blocks	Number of Pixels
Bundaberg	231	36,607
North Coast NSW	51	3610
Gympie	51	7832
Macksville *	41	2272
North QLD	23	7167
Lismore *	10	2249
South East QLD *	8	162
Maclean *	6	496
Western Australia *	1	10
Total	422	60,405

* Regions with fewer than 3000 pixels were combined into “Other Regions”.

Table 2. Distribution of training, validation, and testing pixels used to train models for different held-out regions in the study.

Withheld Region	Training Pixels	Validation Pixels	Test Pixels
Bundaberg	23,798	18,304	18,303
Gympie	52,573	3916	3916
North QLD	53,238	3584	3583
North Coast NSW	56,795	1805	1805
Other Regions	55,216	2595	2594
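
The counts in Table 2 are consistent with a simple rule applied to the pixel totals in Table 1: all pixels outside the withheld region are used for training, and the withheld region's pixels are divided about evenly between validation and testing. The sketch below reproduces the table under that reading. The even split with the extra pixel going to validation is an assumption inferred from the numbers, and the sketch says nothing about how individual pixels are assigned.

```python
# Pixel counts per region from Table 1; starred regions are pooled as "Other Regions".
REGION_PIXELS = {
    "Bundaberg": 36607,
    "Gympie": 7832,
    "North QLD": 7167,
    "North Coast NSW": 3610,
    "Other Regions": 2272 + 2249 + 162 + 496 + 10,
}

def loro_split_counts(region_pixels, withheld):
    """Return (training, validation, test) pixel counts for one withheld region."""
    total = sum(region_pixels.values())
    held_out = region_pixels[withheld]
    validation = (held_out + 1) // 2  # assumed: larger half when the count is odd
    test = held_out // 2
    training = total - held_out       # every pixel from the remaining regions
    return training, validation, test

split_counts = {region: loro_split_counts(REGION_PIXELS, region) for region in REGION_PIXELS}
for region, counts in split_counts.items():
    print(region, counts)
```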

Table 3. Machine learning models with descriptions and references.

Model Name	Brief Description	Reference
Bidirectional Recurrent Neural Network (biRNN)	A type of neural network architecture designed for sequential data analysis, where information is processed in both forward and backward directions to capture contextual dependencies. Used in natural language processing (NLP).	[27]
Gradient Boosted Trees (GBT)	A machine learning technique for regression and classification, creating a prediction model by fitting successive trees to the residuals of previous trees, thereby minimising errors iteratively.	[28]
Gated Recurrent Unit (GRU)	A type of RNN that uses gating mechanisms to control information flow. It is used in sequence prediction, time-series analysis, and NLP.	[29]
Long Short-Term Memory (LSTM)	An RNN variant that uses input, output, and forget gates, making it effective for sequence prediction tasks.	[30]
Random Forest (RF)	An ensemble learning method that constructs multiple decision trees and outputs the mean of the predictions.	[31]
Residual Network (ResNet)	Uses residual connections to enable the training of very deep neural networks by addressing the problem of vanishing gradients.	[32]
Temporal Convolutional Network (TCN)	Used for sequence modelling, using dilated convolutions to create large receptive fields with fewer parameters.	[33]
Temporal Convolutional Neural Network (TCNN)	Uses 1D convolutions for sequence modelling, suitable for time-series classification and anomaly detection.	[34]
Thresholding	NDVI and GNDVI thresholds are used to predict the planting year for macadamia orchards, with a restriction on the minimum age that can be predicted.	[8]
Transformer	Uses self-attention mechanisms, with an encoder consisting of multi-head self-attention and position-wise feed-forward networks.	[35]

Table 4. Total number of models trained and evaluated. “Full Training Runs” denotes post-tuning retraining of the top five hyperparameter sets per region and model type, repeated five times with different random initialisations; the thresholding baseline is excluded from this count.

Algorithm	Num Regions	Num Models	Num (Hyper-) Parameter Sets	CV Folds/Runs	Total Trained
Thresholding	5	2 (NDVI, GNDVI)	51 × 9	–	4590
RF and GBT	5	2	50	3-fold CV	1500
Neural Networks	5	7	50	1-run CV	1750
Full Training Runs	5	9	5	5 runs	1125
Total					8965
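
Each "Total Trained" value in Table 4 is the product of the other columns in its row. A short check of that arithmetic is given below; it assumes the "–" in the thresholding row stands for a single run, and the dictionary layout is illustrative only.

```python
from math import prod

# (regions, models, parameter sets, CV folds or runs) for each row of Table 4.
ROWS = {
    "Thresholding": (5, 2, 51 * 9, 1),   # "–" treated as a single run (assumption)
    "RF and GBT": (5, 2, 50, 3),
    "Neural Networks": (5, 7, 50, 1),
    "Full Training Runs": (5, 9, 5, 5),
}

totals = {name: prod(factors) for name, factors in ROWS.items()}
grand_total = sum(totals.values())

for name, value in totals.items():
    print(f"{name}: {value}")
print(f"Total: {grand_total}")
```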

Table 5. Top-performing model and hyperparameters used for each region (average MAE shown). For full per-region metrics see Appendix B.

Excluded Region	Model Type	Average MAE	Hyperparameters Used
Bundaberg	LSTM	1.16	Layers: 1, Units per layer: 32, Dropout: 0.4, Recurrent dropout: 0.0, Activation: tanh, Batch normalisation: No, Optimiser: Adam, Learning rate: $1.00 \times 10^{-2}$, Kernel initialiser: Glorot Uniform, L2 regularisation: 0.0
Gympie	GRU	1.05	Layers: 3, Units per layer: 96, Dropout: 0.0, Recurrent dropout: 0.0, Activation: ReLU, Batch normalisation: No, Optimiser: Nadam, Learning rate: $2.69 \times 10^{-3}$, Kernel initialiser: LeCun Normal, L2 regularisation: 0.0
North Coast NSW	Transformer	1.36	Layers: 1, Embedding dimension: 128, Attention heads: 8, Feed-forward dimension: 256, Layer normalisation: Yes, Activation: ReLU, Dropout rate: 0.2, Optimiser: Nadam, Learning rate: $1.00 \times 10^{-3}$, Kernel initialiser: He Normal, L2 regularisation: $2.00 \times 10^{-3}$
Other Regions	ResNet	2.15	Residual blocks: 3, Filters: 96, Filter size: 5, Dropout: 0.1, Activation: SELU, Optimiser: RMSprop, Learning rate: $5.13 \times 10^{-4}$, Kernel initialiser: Glorot Uniform, L2 regularisation: $6.00 \times 10^{-3}$
North QLD	TCNN	0.74	Layers: 4, Filters per layer: 192, Kernel size: 7, Dropout: 0.0, Activation: ReLU, Batch normalisation: No, Optimiser: Nadam, Learning rate: $4.34 \times 10^{-4}$, Kernel initialiser: LeCun Normal, L2 regularisation: 0.0

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

