Detection of Banana Diseases Based on Landsat-8 Data and Machine Learning

Retkute, Renata; Crew, Kathleen S.; Thomas, John E.; Gilligan, Christopher A.

doi:10.3390/rs17132308

Open Access Technical Note

¹ Epidemiology and Modelling Group, Department of Plant Sciences, University of Cambridge, Downing Street, Cambridge CB2 3EA, UK

² Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, GPO Box 267, Brisbane, QLD 4001, Australia

* Author to whom correspondence should be addressed.

Remote Sens. 2025, 17(13), 2308; https://doi.org/10.3390/rs17132308

Submission received: 31 March 2025 / Revised: 26 June 2025 / Accepted: 3 July 2025 / Published: 5 July 2025

(This article belongs to the Special Issue Plant Disease Detection and Recognition Using Remotely Sensed Data)

Abstract

Banana is an important cash and food crop worldwide. Recent outbreaks of banana diseases are threatening the global banana industry and smallholder livelihoods. Remote sensing data offer the potential to detect the presence of disease, but formal analysis is needed to compare inferred disease data with observed disease data. In this study, we present a novel remote-sensing-based framework that combines Landsat-8 imagery with meteorology-informed phenological models and machine learning to identify anomalies in banana crop health. Unlike prior studies, our approach integrates domain-specific crop phenology to enhance the specificity of anomaly detection. We used a pixel-level random forest (RF) model to predict 11 key vegetation indices (VIs) as a function of historical meteorological conditions, specifically daytime and nighttime temperature from MODIS and precipitation from NASA GES DISC. By training on periods of healthy crop growth, the RF model establishes expected VI values under disease-free conditions. Disease presence is then detected by quantifying the deviations between observed VIs from Landsat-8 imagery and these predicted healthy VI values. The model demonstrated robust predictive reliability in accounting for seasonal variations, with forecasting errors for all VIs remaining within 10% when applied to a disease-free control plantation. Applied to two documented outbreak cases, the results show strong spatial alignment between flagged anomalies and historical reports of banana bunchy top disease (BBTD) and Fusarium wilt Tropical Race 4 (TR4). Specifically, for BBTD in Australia, a strong correlation of 0.73 was observed between infection counts and the discrepancy between predicted and observed NDVI values at the pixel with the highest number of infections. Notably, VI declines preceded reported infection rises by approximately two months. 
For TR4 in Mozambique, the approach successfully tracked disease progression, revealing clear spatial spread patterns and correlations as high as 0.98 between VI anomalies and disease cases in some pixels. These findings support the potential of our method as a scalable early warning system for banana disease detection.

Keywords: remote sensing; plant disease detection; Landsat; machine learning

1. Introduction

Bananas are one of the most widely consumed fruits globally [1] as well as one of the world’s most valuable primary agricultural commodities [2]. However, banana production is affected by numerous diseases, including black leaf streak (Black Sigatoka) [3,4], yellow Sigatoka [5], banana streak [6], Xanthomonas wilt [7], Panama disease (Fusarium wilt) [8], and banana bunchy top [9]. These diseases can cause substantial yield losses, threatening livelihoods and food availability in affected regions [10,11]. Novel and rapid methods for the timely detection of diseases will enable more effective surveillance and facilitate the development of control measures with increased efficiency [12].

Traditional methods for detecting banana diseases, such as field inspections and laboratory testing, are often labor-intensive, time-consuming, and costly [13]. Moreover, ground-based approaches are limited in their capacity to monitor large plantations efficiently, especially in remote or hard-to-access areas. Advances in remote sensing technologies and machine learning can offer new possibilities to address these challenges [14].

The primary objective of this study is to evaluate the detection of banana diseases based on satellite observations. Landsat-8, a satellite launched as part of the Landsat program, provides high-resolution, multispectral imagery [15]. The spectral bands of Landsat-8 capture critical information on plant reflectance in visible, near-infrared (NIR), and shortwave infrared (SWIR) regions, which are sensitive to changes in vegetation health. Landsat imagery has been used to monitor plant diseases such as wheat powdery mildew [16], sudden oak death [17], and pine wilt disease [18].

To date, unmanned aerial vehicles (UAVs) have been employed for detecting a range of banana diseases, including Fusarium wilt [19], Xanthomonas wilt [20,21], banana bunchy top virus (BBTV) [22,23], and banana blood disease [19]. While UAV-based approaches enable high-resolution monitoring and disease detection at the individual plant level, their operational scalability remains limited due to the significant costs, labor, and time associated with acquiring and annotating extensive training datasets [24]. Furthermore, the restricted spatial coverage of UAV platforms presents challenges for large-scale plantation monitoring, as multiple flights are required to cover extensive areas, increasing logistical complexity [25,26]. In contrast, satellite-based remote sensing offers a scalable alternative capable of detecting and mapping plant diseases over broad geographic regions. However, to our knowledge, no study has yet demonstrated the use of satellite data for the detection of banana diseases—a gap that our study addresses for the first time.

We investigate two banana diseases that pose major threats to banana production globally: banana bunchy top disease (BBTD) and Fusarium wilt Tropical Race 4 (TR4) [27]. Banana bunchy top disease is caused by banana bunchy top virus (BBTV), which is transmitted from infected to healthy plants by an aphid vector Pentalonia nigronervosa with additional transmission to field sites by the use of infected propagation material [28]. The most characteristic symptom is the “bunchy” appearance of banana plants, where new leaves become narrow, upright, with chlorotic margins, and crowded at the top of the plant, giving it a rosette or “bunchy top” look [29]. When plants are infected with BBTV, their leaves may become smaller, firmer, and more brittle, often developing wavy or crinkled edges [30]. The appearance of BBTV symptoms and symptom expression can vary among cultivars [31]. The second disease, widely known as Foc TR4, is a form of Fusarium wilt or Panama disease, caused by the fungus Fusarium oxysporum f. sp. cubense Tropical Race 4 [32]. The most common symptom of Fusarium TR4 is the yellowing of older leaves, which starts at the leaf margins and progresses toward the midrib [33]. Affected leaves wilt, droop, and collapse, often hanging down along the pseudostem, eventually drying out and dying. Fusarium TR4 causes 100% losses of infected banana plants [2].

In this study, we present an innovative approach for detecting banana diseases using high-resolution remote sensing data in combination with non-parametric statistical methods within a machine learning framework. Our model predicts VIs as a function of meteorological conditions at specific locations, effectively capturing seasonal dynamics in VI variations. By generating expected VI values under healthy conditions, the model enables the identification of anomalies indicative of disease outbreaks. To validate our approach, we analyzed two case studies: a BBTD outbreak in a banana plantation located in the Northern Rivers Region of New South Wales, Australia, and a Tropical Race 4 (TR4) outbreak in a banana plantation in Nampula Province in northern Mozambique, both selected due to the availability of published epidemiological data. We included an analysis of an additional plantation in the Northern Rivers Region of New South Wales with no reported presence of disease as a control. Our findings reveal a correlation between VI anomalies and the number of infected cases, demonstrating the potential of this methodology for large-scale, early disease surveillance and precision agriculture applications.

2. Materials and Methods

2.1. Case Studies

2.1.1. Banana Bunchy Top Disease

The first case study involves the spread of BBTD in a banana plantation located in the Northern Rivers region of New South Wales, Australia, which we call the ‘NSW1 plantation’ [34]. The plantation covers an area of approximately 12 hectares. Surveys for recording the occurrence of disease were conducted at monthly intervals from December 2014 to January 2018. In addition to the NSW1 plantation, we selected a second plantation 20 km away in the same region, which had no reported presence of banana diseases from April 2013 to March 2016, to serve as a control for testing the methodology. We call this plantation ‘NSW2’. The NSW1 and NSW2 plantations are situated in a subtropical climate zone characterized by warm, humid summers and mild winters, with an average annual rainfall of approximately 1200 mm, predominantly concentrated between November and March [35].

2.1.2. Fusarium TR4

The second case study concerns Panama disease, caused by Fusarium oxysporum f. sp. cubense Tropical Race 4 (TR4), at a plantation in Nampula Province in northern Mozambique, which we call the ‘NM plantation’ [36]. The NM plantation lies within a tropical savanna climate zone, marked by higher average temperatures year-round and a distinct wet season, receiving around 1000–1200 mm of rainfall annually, primarily between December and April [37]. The plantation covers an area of approximately 265 hectares. Between April 2013 and October 2015, the number of affected plants was recorded weekly, with total cases aggregated every three months. Plant counting was discontinued in October 2015 because the overwhelming number of dead plants made it difficult to accurately determine infections in newly affected plants [36].

2.2. Methodological Approach

Figure 1 presents the workflow of our study, which consists of three key stages. First, we compile data for each plantation area, incorporating eight surface reflectance bands from Landsat-8, meteorological variables (temperature and precipitation), and disease surveillance records. Following the pre-processing of the surface reflectance time series, we derive 11 VIs to serve as key indicators of plant health. In the second stage, we develop pixel-level random forest (RF) models to predict VIs from historical meteorological data specific to each location. The RF models are tuned through hyperparameter optimization, guided by multiple performance evaluation metrics, so that they predict the VI values of healthy crops accurately. The optimized RF models are then employed to estimate expected VI values under healthy, disease-free conditions across the plantation area. In the final stage, disease presence is detected by quantifying the deviations between observed and predicted VIs. This allows for the identification of potential outbreak zones.

2.3. Datasets

2.3.1. Remote Sensing Imagery

We used Landsat 8 OLI surface reflectance data. The surface reflectance images were collected from 1 April 2013 to 1 January 2019, as this period corresponds to the available outbreak data. The data have a 30 m spatial resolution with a 16-day revisit cycle. The cloud-free mask algorithm was used to remove bad-quality observations that were identified as clouds, cloud shadows, and snow. All data were exported from Google Earth Engine [38]. The image data ID for the surface reflectance data on Google Earth Engine was ‘LANDSAT/LC08/C02/T1_L2’.

2.3.2. Temperature Data

The MOD11A1 V6 product provides daytime land surface temperature (band ‘LST_Day_1km’) and nighttime land surface temperature (band ‘LST_Night_1km’) [39]. The data have a 1 km spatial resolution. Data were extracted for the period from 1 January 2010 to 1 January 2019 and converted from Kelvin to Celsius.

2.3.3. Precipitation Data

We used the NASA GES DISC product ‘NASA/GPM_L3/IMERG_MONTHLY_V06’, which provides merged satellite–gauge precipitation estimates [40] at 10 km spatial and monthly temporal resolution. Data were extracted for the period from 1 January 2010 to 1 January 2019 and converted from mm/hr to mm/d.
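The unit conversions described for the temperature and precipitation data can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and the 0.02 scale factor applied to raw MOD11A1 values is an assumption about how the stored integers map to Kelvin.

```python
# Illustrative unit conversions for the meteorological inputs (hypothetical helper names).

MODIS_LST_SCALE = 0.02  # assumption: raw MOD11A1 LST integers are scaled by 0.02 to give Kelvin


def kelvin_to_celsius(t_kelvin: float) -> float:
    """Convert a temperature from Kelvin to degrees Celsius."""
    return t_kelvin - 273.15


def modis_lst_to_celsius(raw_value: float) -> float:
    """Convert a raw MOD11A1 LST value to degrees Celsius (scale factor is an assumption)."""
    return kelvin_to_celsius(raw_value * MODIS_LST_SCALE)


def mm_per_hour_to_mm_per_day(rate_mm_hr: float) -> float:
    """Convert a precipitation rate from mm/hr to mm/d."""
    return rate_mm_hr * 24.0
```

For example, a precipitation rate of 0.5 mm/hr corresponds to 12 mm/d.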

2.4. Pre-Processing and Analysis

2.4.1. Plantation Boundary Delineation

We used the coordinates provided in [34,36] as the locations of the centers of the NSW1 and NM plantations, respectively (Figure 2A,C). Plantation boundaries, together with the location of the center of the NSW2 plantation, were delineated manually from high-spatial-resolution airborne imagery in Google Earth Engine [38]. The boundaries of the NSW1, NSW2, and NM plantations are shown in Figure 2B,C,E.

2.4.2. Smoothing of Time Series Data

A Whittaker smoother was applied to the surface reflectance and temperature time series [41,42]. We set the smoothing parameter to 1 for the surface reflectance time series [43] and to 100 for the temperature time series [42]. Precipitation data were left unsmoothed to preserve the fine-scale temporal variability critical for modeling rainfall-driven growth responses and stress signals, given its episodic nature and the importance of rainfall intensity and timing in banana phenology.
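To make the role of the smoothing parameter concrete, the following is a minimal sketch of a Whittaker smoother, which solves $(I + \lambda D^{T} D) z = y$ for the smoothed series $z$, where $D$ is a difference matrix. It assumes evenly spaced observations without gaps and second-order differences; neither detail is stated in the text, and the function names are hypothetical. It uses a dense solver for readability, so it is only suitable for short series.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def whittaker_smooth(y, lam, order=2):
    """Whittaker smoother: minimise sum (y - z)^2 + lam * sum (D z)^2.

    Assumes evenly spaced data with no missing values (illustrative only).
    """
    n = len(y)
    # Start from the identity and difference its rows `order` times to obtain D.
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(order):
        D = [[D[i + 1][j] - D[i][j] for j in range(n)] for i in range(len(D) - 1)]
    # Normal equations: (I + lam * D^T D) z = y
    A = [[(1.0 if i == j else 0.0) + lam * sum(row[i] * row[j] for row in D)
          for j in range(n)] for i in range(n)]
    return solve_linear(A, [float(v) for v in y])


def roughness(z, order=2):
    """Sum of squared differences of the given order (the penalised quantity)."""
    for _ in range(order):
        z = [b - a for a, b in zip(z, z[1:])]
    return sum(v * v for v in z)
```

A larger smoothing parameter (such as the value of 100 used for temperature) penalizes roughness more heavily than a small one (such as the value of 1 used for surface reflectance), so the smoothed temperature curve follows the raw observations less closely.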

2.5. Vegetation Indices

In banana crops, healthy plants absorb the majority of red light while strongly reflecting near-infrared (NIR) wavelengths, whereas diseased or stressed plants exhibit increased visible light reflectance and reduced NIR reflectance, indicative of vegetation decline [19]. For our analysis, we selected 11 VIs, encompassing those previously applied to banana mapping and disease detection using UAV-based multispectral imagery, alongside indices with demonstrated potential for improved sensitivity. The normalized difference vegetation index (NDVI) has been shown to reliably map the distribution of banana blood disease and Fusarium wilt at plantation scales [19]. The recently proposed kernel NDVI (kNDVI), a nonlinear extension of the NDVI, enhances sensitivity to canopy structural and physiological variations while addressing saturation issues typical in high-biomass crops [44].

The ratio vegetation index (RVI) captures variation in red and NIR reflectance effectively, ranking among the top features in distinguishing banana classes from non-banana classes [22]. Similarly, the difference vegetation index (DVI) excels at minimizing soil background effects while maximizing sensitivity to canopy health, making it particularly suitable for detecting nuanced stress symptoms in banana foliage [45]. The enhanced vegetation index (EVI) correlates strongly with the leaf area index (LAI) across diverse crops, including banana, by reducing atmospheric and soil noise and improving responsiveness to canopy structure and leaf density [46].

The soil-adjusted vegetation index (SAVI) has demonstrated remarkable accuracy—up to 99%—in classifying Fusarium wilt presence [20]. Further refining soil effect correction, the modified soil-adjusted vegetation index (MSAVI) and the optimized soil-adjusted vegetation index (OSAVI) enhance the dynamic range of vegetation signals [47]. The normalized difference phenology index (NDPI), which is sensitive to changes in canopy structure, water content, and phenological stage, holds strong potential for the early detection of disease-induced stress [48]. The near-infrared reflectance of vegetation (NIRV) index offers a robust measure of vegetation productivity by isolating canopy signals from soil and background noise, enabling the detection of subtle declines in canopy health and photosynthetic function associated with crop diseases [49]. Finally, the global environment monitoring index (GEMI) leverages spectral signatures of vegetation to provide superior resilience to atmospheric and soil background variability compared to traditional indices such as the NDVI, enhancing its suitability for disease monitoring applications [50].

The formulas for these VIs, along with their respective references, are provided in Table 1. After calculating the VIs, their values were linearly interpolated at regular 10-day intervals.
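As an illustration of how the 11 indices can be computed from surface reflectance, the sketch below uses commonly published formulations. These are assumptions in the sense that Table 1 is the authoritative source for the exact forms and constants used in this study; the function name and argument names are hypothetical, and inputs are taken to be reflectances on a 0–1 scale.

```python
import math


def vegetation_indices(blue, red, nir, swir1, L=0.5):
    """Compute 11 vegetation indices from reflectances (0-1 scale).

    Formulas are standard published forms and may differ in detail
    from those listed in Table 1 of the article.
    """
    ndvi = (nir - red) / (nir + red)
    red_swir_mix = 0.74 * red + 0.26 * swir1  # NDPI weighting of red and SWIR
    eta = (2 * (nir ** 2 - red ** 2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return {
        "NDVI": ndvi,
        "kNDVI": math.tanh(ndvi ** 2),
        "RVI": nir / red,
        "DVI": nir - red,
        "EVI": 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1),
        "SAVI": (1 + L) * (nir - red) / (nir + red + L),
        "MSAVI": (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red))) / 2,
        "OSAVI": (nir - red) / (nir + red + 0.16),
        "NDPI": (nir - red_swir_mix) / (nir + red_swir_mix),
        "NIRv": ndvi * nir,
        "GEMI": eta * (1 - 0.25 * eta) - (red - 0.125) / (1 - red),
    }


vi = vegetation_indices(blue=0.05, red=0.10, nir=0.50, swir1=0.20)
```

With these illustrative reflectances, the NDVI is 0.4 / 0.6, or about 0.67, which falls within the range reported for healthy banana canopy in Section 3.3.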

Alignment of Datasets to a Common Baseline

The alignment of datasets, despite their significant differences in resolution and units, was achieved through a multi-faceted approach integrated into the model construction:

Spatial alignment with Landsat-8 resolution. A central component of the methodology was the development of pixel-specific ML models, wherein an independent model was trained for each individual 30 m × 30 m Landsat-8 pixel. While the meteorological predictors—MODIS-derived temperature (1 km resolution) and NASA GES DISC precipitation (10 km resolution)—originate from coarser spatial scales, they were systematically aligned to the Landsat grid to enable pixel-level modeling.
Temporal alignment. Landsat-8 imagery, with its 16-day revisit cycle, was used to derive VIs, which were linearly interpolated to regular 10-day intervals to enhance temporal resolution. Daily temperature and monthly precipitation data were incorporated through a lag-window approach, where meteorological conditions from day $d - K$ to $d - 1$ informed each VI prediction.
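The temporal alignment steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: days are integer offsets from a common start date, the meteorological series are daily lists indexed by that offset (with monthly precipitation assumed to have been expanded to daily values beforehand), and the lag window is passed to the model as raw daily values. All function names are hypothetical.

```python
def interpolate_linear(days, values, step=10):
    """Linearly interpolate (day, value) pairs onto a regular `step`-day grid.

    `days` must be strictly increasing integers with at least two entries.
    """
    grid = list(range(days[0], days[-1] + 1, step))
    out = []
    j = 0
    for d in grid:
        # Advance to the observation interval [days[j], days[j + 1]] containing d.
        while days[j + 1] < d:
            j += 1
        d0, d1 = days[j], days[j + 1]
        w = (d - d0) / (d1 - d0)
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out


def lag_window(series, d, K):
    """Return the values of a daily series for days d-K to d-1 inclusive."""
    if d - K < 0:
        raise ValueError("not enough history for the requested lag window")
    return series[d - K:d]


def build_features(day_temp, night_temp, precip, d, K):
    """Concatenate the lag windows of the three predictors into one feature vector."""
    return lag_window(day_temp, d, K) + lag_window(night_temp, d, K) + lag_window(precip, d, K)
```

For a 16-day acquisition sequence on days 0, 16, and 32, the interpolation yields values on days 0, 10, 20, and 30, and a lag of $K = 30$ gives a feature vector of 90 values per VI observation.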

2.6. Machine Learning Algorithms

2.6.1. Decision Tree (DT) Model

Decision trees recursively partition the data into subsets based on feature values [56]. In this study, we implemented decision tree models using the R package caret v7.0-1, which provides a unified framework for training, validating, and tuning machine learning algorithms, including the rpart method for recursive partitioning [57].

2.6.2. Support Vector Machines (SVMs)

Support vector machines (SVMs) operate by identifying an optimal hyperplane that maximizes the margin, that is, the perpendicular distance between the decision boundary and the nearest data points, known as support vectors [58]. We implemented the support vector machine using the svm() function from the R package e1071 v1.7 [59].

2.6.3. Random Forest (RF) Model

The random forest (RF) model is an ensemble learning method that generates multiple decision trees using randomly selected subsets of training samples and features [60]. The RF model was constructed using the R package ‘randomForest’ v4.7 [61]. Model construction requires choosing the number of trees to grow ($n_{trees}$), which could influence model performance, accuracy, and computational efficiency.

2.6.4. Training and Testing of ML Models

To train and test the three ML models, we used VI values from April 2013 to March 2015 for the training of models and from April 2015 to December 2016 for the testing of models. The starting date was limited by the availability of Landsat-8 data. We used daytime temperature, nighttime temperature, and precipitation from $d - K$ to $d - 1$ days preceding each VI observation. The parameter $K$ was treated as a hyperparameter. We set the lag period of meteorological data, $K$, to vary on a regular grid at 30-day intervals, ranging from one month to one year. For RF models applied to remote sensing data, the number of trees ($n_{trees}$) typically ranges between 10 and 5000 [60]. We selected tree counts of 100, 200, 300, 500, and 1000 for evaluation. The machine learning models were fitted independently for each 30 m × 30 m pixel corresponding to the Landsat-8 tiles.
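The hyperparameter grid described above can be enumerated as in the sketch below. The upper bound of 360 days for the lag is an assumption that follows from 30-day steps up to roughly one year, and the selection helper simply encodes the rule in Section 2.7 (lowest RMSE, with the highest correlation used here as a tie-breaker). The names are hypothetical and no model is fitted.

```python
from itertools import product

LAG_DAYS = list(range(30, 361, 30))    # 30, 60, ..., 360 days (upper bound is an assumption)
N_TREES = [100, 200, 300, 500, 1000]   # tree counts evaluated for the RF model


def rf_grid():
    """Enumerate every (K, n_trees) combination to evaluate for one pixel."""
    return [{"K": k, "n_trees": n} for k, n in product(LAG_DAYS, N_TREES)]


def select_best(results):
    """Pick the entry with the lowest RMSE, breaking ties by the highest correlation.

    `results` is a list of dicts with keys "K", "n_trees", "rmse", and "rho".
    """
    return min(results, key=lambda r: (r["rmse"], -r["rho"]))
```

With these settings, the grid contains 12 lag values and 5 tree counts, or 60 combinations per pixel.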

2.7. Model Accuracy Assessment

VIs were randomly split into training and testing datasets with an 80%:20% ratio. The training dataset was used to optimize the random forest regression model, while the testing dataset was used to evaluate the model’s predictive reliability. We varied the value of the parameter $K$ (days of lag) for each of the three ML algorithms, and the number of trees in the random forest. The performance of each model was assessed by calculating four metrics between the observed and predicted VIs from the testing dataset: (i) the root mean square error ($RMSE$), (ii) the Spearman correlation coefficient ($\rho$), (iii) the coefficient of determination ($R^{2}$), and (iv) bias analysis ($b$). The model with the lowest $RMSE$ and highest correlation was selected as the final model.
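The four metrics can be computed as in the following sketch. The exact definitions used in the study are not given in the text, so two assumptions are made here: $R^{2}$ is taken as one minus the ratio of residual to total sum of squares, and bias is taken as the mean of predicted minus observed values. The function names are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def bias(obs, pred):
    """Mean signed error (predicted minus observed); definition is an assumption."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Coefficient of determination as 1 - SS_res / SS_tot (definition is an assumption)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def ranks(x):
    """Ranks starting at 1, with tied values receiving their average rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    result = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(obs, pred):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(obs), ranks(pred))
```

Because the Spearman coefficient depends only on ranks, a prediction that preserves the ordering of the observed VI values scores 1 even when it is biased, which is why it is reported alongside the RMSE and bias rather than instead of them.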

3. Results

3.1. Year-to-Year Dynamics of Temperature and Precipitation

First, we analyzed the differences in meteorological conditions among seasons at each of the two (northern NSW and northern Mozambique) locations. This step was necessary to ensure that any anomalies observed in VIs were not caused by unusual weather patterns. Daytime and nighttime temperatures showed very similar profiles within sites across different years at both the northern NSW location (Figure 3A) and northern Mozambique location (Figure 3C). There was some variability in monthly precipitation in the NSW location but, overall, the heaviest rainfall occurred during the first quarter of the year (Figure 3B). The monthly precipitation profile for the region occupied by northern Mozambique was more consistent, with very low rainfall levels between April and November (Figure 3D).

3.2. Selection of ML Model and Hyperparameters

Vegetation indices data from the NSW1 banana plantation were used for machine learning model selection and hyperparameter tuning. Among the evaluated algorithms, the decision tree model consistently demonstrated the poorest performance across all four evaluation metrics and lag periods of meteorological data ($K$) (Figure 4). The support vector machine model exhibited moderate improvement over the DT but remained inferior to the random forest (RF). Notably, increasing $K$ values yielded only marginal gains in SVM performance, suggesting a limited sensitivity of this algorithm to the temporal context. In contrast, the RF model achieved the highest average performance across all metrics and lag configurations larger than one month, confirming its suitability as the most robust and reliable model for the predictive task. There was no significant difference in model performance when the number of trees was varied from 100 to 1000. The root mean square error decreased when the number of days of meteorological data used as covariates was increased from one month ($K = 30$) to one year ($K = 365$). The correlation coefficient between observed VIs and VIs predicted by RF also increased when the length of the covariate window was extended (Figure 4). Therefore, we chose the RF model with fixed $n_{trees} = 100$ and $K = 120$ (approximately four months) for the main analyses.

3.3. Effect of Seasonal Variation on VIs

Seasonal variations in VIs could potentially obscure the presence of diseases or interfere with its accurate detection by introducing fluctuations unrelated to disease symptoms. These variations may be caused by changes in environmental factors such as temperature, rainfall, and sunlight availability, which influence plant growth and spectral reflectance patterns. To account for these potential confounding effects, we applied the method to the NSW2 plantation, where no banana disease presence was reported during 2013–2016.

To assess vegetation health dynamics, we quantified the deviation between observed VIs from Landsat-8 surface reflectance imagery and expected VI values in the absence of disease. Data were partitioned into one year for training an RF model, as this showed the best performance, and two years for forecasting. Substantial temporal and spatial variability in VI values was observed over the study period. For instance, the NDVI ranged from 0.6 to 0.9 (Figure 5A).

The model performance was evaluated using the normalized error metric $er = (pred - obs) / obs$, that is, the difference between the predicted and observed VI values divided by the observed VI value. The relative error remained within ±10% during the training phase and below ±4% for the forecasting period for the NDVI (Figure 5B). Notably, the highest normalized errors corresponded to grid cells with a lower observed NDVI, likely linked to abiotic stress rather than seasonal effects. A consistent trend was observed across all VIs, with forecasting errors remaining within ±10% (Figure 5C), demonstrating the robustness of the approach for vegetation monitoring and disease impact assessment.
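The normalized error and the ±10% check can be written out directly. The sketch below is illustrative only; the function names and the example values are hypothetical.

```python
def normalized_error(pred, obs):
    """Normalized error er = (pred - obs) / obs for one VI value."""
    return (pred - obs) / obs


def within_tolerance(preds, obss, tol=0.10):
    """True if every normalized error lies within +/- tol."""
    return all(abs(normalized_error(p, o)) <= tol for p, o in zip(preds, obss))


# Example: a predicted NDVI of 0.84 against an observed NDVI of 0.80 gives er = 0.05 (5%).
example_error = normalized_error(0.84, 0.80)
```

A positive value means the model predicted a higher VI than was observed, which is the direction expected when a crop is under stress.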

3.4. Detecting BBTD Presence at the NSW1 Banana Plantation

We used the optimized RF model to predict VI values from April 2013 to December 2019, where the period of January 2016 to December 2019 was used for forecasting (see Table 2). The presence of BBTD was assessed by calculating the difference between the observed VI values and the VI values predicted using the RF model. In the generated curves, yellow shading represents a negative difference, indicating that observed VI values exceed the predicted values, which may suggest areas of unusually high vegetation vigor (Figure 6A). In contrast, deep purple hues highlight regions where the observed VI values are substantially lower than expected, potentially indicating plant stress or disease presence. We defined the anomaly threshold as twice the maximum difference observed in disease-free crops. All of the VIs showed a decline in their values above the threshold from around the middle of 2016. For example, the difference in the NDVI has values of up to 0.5, indicating a mild to very severe reduction in photosynthetic activity in the plants. There was a low correlation (up to 0.15) between VI anomalies and precipitation (Supplementary Figure S1).
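The thresholding rule described above can be sketched as follows. Two assumptions are made that the text does not spell out: the "maximum difference" in disease-free crops is taken as the maximum absolute difference, and the difference is computed as predicted minus observed, so that a positive value indicates an observed VI lower than expected. The function names and the example values are hypothetical.

```python
def anomaly_threshold(healthy_pred, healthy_obs, factor=2.0):
    """Threshold = factor x the largest absolute predicted-observed difference
    seen in disease-free data (absolute value is an assumption)."""
    return factor * max(abs(p - o) for p, o in zip(healthy_pred, healthy_obs))


def flag_anomalies(pred, obs, threshold):
    """Flag time points where the observed VI falls below the prediction by more than the threshold."""
    return [(p - o) > threshold for p, o in zip(pred, obs)]


# Disease-free reference differences of 0.02, -0.03, and 0.05 give a threshold of about 0.10.
threshold = anomaly_threshold([0.80, 0.80, 0.80], [0.78, 0.83, 0.75])
flags = flag_anomalies([0.80, 0.80, 0.80], [0.75, 0.68, 0.30], threshold)
```

In this toy example, an NDVI shortfall of 0.05 stays below the threshold, whereas shortfalls of 0.12 and 0.50 are flagged.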

The BBTD outbreak in the NSW1 banana plantation dataset included spatially documented locations of infected plants [34]. The pixel with the highest number of infected plants ($n = 109$) exhibited one of the strongest correlations (0.73) between infection count and the discrepancy between predicted and observed NDVI values. Assuming a planting density of 2000 plants per hectare, this suggests that approximately 60% of plants within the 30 m × 30 m area were affected. Notably, at this location, the decline in NDVI values preceded the observed rise in infection cases by approximately two months (11 August 2016 vs. 15 October 2016; Figure 6A), highlighting the potential of remote sensing for early disease detection.
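The two figures quoted above (the share of affected plants and the lead time) follow from simple arithmetic, reproduced here using only the values given in the text. The variable names are illustrative.

```python
from datetime import date

PIXEL_SIDE_M = 30             # Landsat-8 pixel side length in metres
PLANTS_PER_HECTARE = 2000     # planting density assumed in the text
INFECTED_PLANTS = 109         # infected plants in the most affected pixel

pixel_area_m2 = PIXEL_SIDE_M * PIXEL_SIDE_M                    # 900 m^2
plants_per_pixel = PLANTS_PER_HECTARE * pixel_area_m2 / 10000  # 1 ha = 10,000 m^2
affected_fraction = INFECTED_PLANTS / plants_per_pixel

# Lead time between the NDVI decline and the rise in recorded infections.
lead_days = (date(2016, 10, 15) - date(2016, 8, 11)).days
```

A 900 m² pixel holds about 180 plants at this density, so 109 infected plants correspond to roughly 60% of the pixel, and the two dates are 65 days apart, or approximately two months.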

3.5. Detecting TR4 Presence

We used the optimized RF model to predict VI values from April 2013 to October 2015. This period overlaps with the available surveillance data for the NM plantation (Figure 7, top left panel). Similar to the BBTD case, all of the VIs showed a decline in their values from a critical time—in the case of TR4, around the middle of 2014—but the curves exhibited more complex variability (Figure 7A) than was the case for BBTD.

A key insight derived from the TR4 analysis is the ability to track disease spread over time. We identified clusters of pixels with similar trajectories in vegetation index (VI) changes between 2014 and 2015. One group exhibited a decline in VI values beginning in June 2014, while another showed a reduction starting in October 2014 (Figure 7B). The spatial mapping of these pixel groups revealed a clear pattern of TR4 spread, originating from the top-left corner of the plantation and progressing outward (Figure 7C). This is consistent with the on-farm disease development described by Viljoen et al. [36], where managers began to observe an increasing number of symptomatic plants in the area corresponding to the blue region in Figure 7C. This demonstrates the potential of remote sensing for monitoring disease progression involving Fusarium TR4 in a banana plantation at a fine spatial scale.
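The text does not specify how pixels with similar VI trajectories were grouped. One simple possibility, shown here purely as a hypothetical illustration and not as the authors' method, is to group pixels by the first date on which their anomaly (predicted minus observed VI) exceeds a threshold. All names and values are invented for the example.

```python
def onset_index(anomaly_series, threshold):
    """Index of the first time point where the anomaly exceeds the threshold, or None."""
    for i, value in enumerate(anomaly_series):
        if value > threshold:
            return i
    return None


def group_by_onset(pixel_series, dates, threshold):
    """Group pixel identifiers by the date of their first threshold exceedance.

    `pixel_series` maps a pixel identifier to its anomaly time series;
    pixels that never exceed the threshold are grouped under the key None.
    """
    groups = {}
    for pixel_id, series in pixel_series.items():
        idx = onset_index(series, threshold)
        key = dates[idx] if idx is not None else None
        groups.setdefault(key, []).append(pixel_id)
    return groups


dates = ["2014-06", "2014-08", "2014-10"]
pixels = {"p1": [0.2, 0.3, 0.4], "p2": [0.0, 0.05, 0.3], "p3": [0.0, 0.0, 0.0]}
groups = group_by_onset(pixels, dates, threshold=0.1)
```

Mapping the resulting groups back to pixel coordinates would show whether earlier onset dates cluster in one part of the plantation, which is the kind of pattern reported in Figure 7C.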

3.6. Performance of VIs

To assess changes in crop health, we employed 11 VIs, each varying in sensitivity to different physiological traits. While some VIs are more responsive to alterations in leaf structure, others are primarily influenced by variations in chlorophyll or water content [62]. As illustrated in Figure 6 and Figure 7, all VIs exhibited a decline in observed values compared with predicted values characteristic of healthy crops during both BBTD and TR4 outbreaks. For BBTD, the strongest correlations between disease incidence and deviations in predicted versus observed values were observed with the ratio vegetation index (RVI) and the normalized difference phenology index (NDPI) (Figure 8). The RVI, which reflects changes in the red to near-infrared reflectance ratio, is known for its sensitivity to leaf structural damage and stress, such as that caused by brown planthopper infestations in rice [63]. The characteristic leaf deformation and stunted growth associated with BBTD likely trigger similar structural changes, explaining the RVI’s strong performance. Although the precise physiological processes captured by the NDPI remain less well defined, its sensitivity to alterations in canopy structure and pigment composition, via its use of NIR, red, and SWIR bands, makes it well suited for detecting BBTD-induced stress. For TR4, correlations between VI anomalies and disease cases were consistently high, reaching up to 0.98 in some pixels (Figure 8). This strong and uniform performance across multiple VIs reflects the extensive physiological damage caused by TR4, including chlorophyll loss, yellowing, wilting, and reduced water content, all of which are processes that most vegetation indices, such as the NDVI and EVI, are designed to detect.

4. Discussion

This study investigated the use of Landsat-8 data and machine learning to detect two banana diseases: BBTD and TR4. Early detection of these diseases is critical for implementing timely control measures and mitigating yield losses. By leveraging the high spatial resolution of Landsat-8 imagery, along with machine learning algorithms, our study aimed to identify disease-induced changes in VIs.

4.1. VIs, Phenology, and the Challenge of Disease Detection

The NDVI, along with other VIs, is commonly used to derive key phenological markers of the seasonal cycles of banana plant growth, such as the start of the season, the peak, and the end of the season [64]. Therefore, it can be difficult to distinguish whether a reduction in VI values is due to the end of the growth season or the presence of a disease. We introduced a novel approach for banana disease detection based on a non-parametric statistical method to predict VIs as a function of meteorological conditions at a given location. Daytime temperature, nighttime temperature, and precipitation were the three main factors used as inputs for the VI prediction models. Temperature influences banana crop growth, as confirmed in several studies [65].

4.2. Pixel-Level Modeling to Account for Phenological and Structural Variability

The model was applied to generate the expected values of VIs in the absence of disease and to investigate the performance of 11 VIs. We found large variations in VI values among 30 m × 30 m pixels at the NSW2 plantation under disease-free conditions. For example, on 3 October 2014, the NDVI values in the NSW2 plantation ranged from 0.6 to 0.8. This variation could be related to plant spacing in the field [66]. Another reason is the asynchronicity of banana crops. After establishment, the first flowering of a banana plantation is relatively synchronized among plants. However, as flowering cycles progress, individual plants establish their own development stages [67]. This results in a diversity of phenological stages within a field [68]. Therefore, we fitted an RF model for each individual pixel to account for the diversity in planting density as well as the asynchronicity in developmental stages.

4.3. Linking VI Anomalies to Observed Disease Dynamics

To highlight the advantages of the methodology, the resulting curves were compared with published epidemiological data from Australia and Mozambique. The results show that all the VI values declined from around the middle of 2016 for the NSW1 plantation (BBTD) and the middle of 2014 for the NM plantation (TR4). For example, the difference in the NDVI reached values as low as −0.5, indicating reductions in photosynthetic activity on the ground that ranged from mild to very severe. There was a strong correlation between the number of disease cases and the deviation in VI values from those expected in the absence of disease, suggesting that remote sensing data can be used to detect the presence of banana diseases. At the pixel with the largest number of infected plants in the NSW1 plantation, the reduction in NDVI values preceded the increase in the number of cases by approximately two months. Our analysis also demonstrated that remote sensing data can be used to track the spread of banana diseases. The locations of pixels with similar profiles of changes in VI values from the NM banana plantation were consistent with the spread of TR4 from the top-left corner of the plantation [36]. This approach holds promise for large-scale disease surveillance, offering a cost-effective and scalable method for monitoring plantation health over time.
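One way to examine whether a VI decline leads the rise in cases is to correlate the VI deviation with case counts shifted by a range of lags. The sketch below uses the Pearson correlation and monthly time steps; both are assumptions, as is the sign convention (deviation taken as predicted minus observed, so larger values mean the observed VI lies further below the disease-free expectation). The data are made up so that cases follow the deviation by exactly two steps.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(
    deviation: Sequence[float], cases: Sequence[float], max_lag: int
) -> Dict[int, float]:
    """Correlate deviation[t] with cases[t + lag] for lag = 0..max_lag."""
    if len(deviation) != len(cases):
        raise ValueError("series must have equal length")
    result: Dict[int, float] = {}
    for lag in range(max_lag + 1):
        d = deviation[: len(deviation) - lag]  # drop the last `lag` steps
        c = cases[lag:]                        # drop the first `lag` steps
        result[lag] = pearson(d, c)
    return result


# Toy monthly series (illustrative only).
deviation = [0.0, 0.0, 0.1, 0.3, 0.5, 0.4, 0.2, 0.1, 0.0, 0.0]
cases = [0, 0, 0, 0, 1, 3, 5, 4, 2, 1]

corr = lagged_correlation(deviation, cases, max_lag=4)
best_lag = max(corr, key=corr.get)
print(best_lag, round(corr[best_lag], 3))
```

A rank-based coefficient could be substituted for `pearson` without changing the lag-scanning logic.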

4.4. Limitations and Future Opportunities

We applied the method retrospectively to two documented outbreak cases, which defined the temporal scope of the analysis. A key limitation of this study was the availability of remote sensing data. The launch of Landsat-8 in 2013 coincided with the TR4 outbreak in Mozambique, which restricted the volume of data available for training machine learning models for the NM banana plantation. Future research may benefit from integrating data from the Sentinel-2 and Sentinel-1 constellations. Sentinel-2 provides higher spatial resolution (10–20 m) and enhanced spectral capabilities, especially in the red-edge and shortwave infrared bands, which are well suited for detecting early signs of vegetation stress and disease [69]. Its 5-day revisit cycle also supports timely monitoring for early warning and rapid response systems. Recently, Sentinel-2 has been used to map banana production in Tanzania [70].

5. Conclusions

Timely disease detection is critical for ensuring the economic sustainability of banana production. Early identification enables rapid intervention, minimizing disease impact and mitigating the risk of widespread crop loss. This study demonstrates the potential of high-resolution remote sensing for detecting BBTD and TR4 at the plantation scale. Future research will focus on developing operational frameworks for leveraging remote sensing data in the detection and continuous monitoring of banana diseases.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs17132308/s1, Figure S1: The correlation between inter-annual variability of meteorological data and VI anomalies at the NSW1 location. Boxes indicate the interquartile range, with the line marking the median. Whiskers extend to 1.5 × IQR, and dots represent outliers beyond this range.

Author Contributions

Conceptualization, R.R. and C.A.G.; methodology, R.R.; software, R.R.; validation, R.R.; formal analysis, R.R.; investigation, R.R.; resources, C.A.G.; data curation, R.R.; writing—original draft preparation, R.R., K.S.C., J.E.T., and C.A.G.; writing—review and editing, R.R., K.S.C., J.E.T., and C.A.G.; visualization, R.R.; project administration, C.A.G.; funding acquisition, C.A.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Gates Foundation grant INV070408. The funder had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. We thank the Australian Banana Growers’ Council (ABGC) and banana inspectors for collecting the original data.

Data Availability Statement

Analysis code and data used in this study can be accessed at the following URL: https://github.com/rretkute/BananaDiseasesRS (accessed on 3 July 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Voora, V.; Larrea, C.; Bermudez, S. Global Market Report: Bananas; International Institute for Sustainable Development: Winnipeg, MB, Canada, 2020.
2. Ploetz, R.C. Management of Fusarium wilt of banana: A review with special reference to tropical race 4. Crop Prot. 2015, 73, 7–15.
3. De Bellaire, L.d.L.; Fouré, E.; Abadie, C.; Carlier, J. Black Leaf Streak Disease is challenging the banana industry. Fruits 2010, 65, 327–342.
4. Marin, D.H.; Romero, R.A.; Guzmán, M.; Sutton, T.B. Black Sigatoka: An increasing threat to banana cultivation. Plant Dis. 2003, 87, 208–222.
5. Cook, D.; Liu, S.; Edwards, J.; Villalta, O.; Aurambout, J.P.; Kriticos, D.; Drenth, A.; De Barro, P. An assessment of the benefits of yellow Sigatoka (Mycosphaerella musicola) control in the Queensland Northern Banana Pest Quarantine Area. NeoBiota 2013, 18, 67–81.
6. Dahal, G.; Hughes, J.; Lockhart, B. Status of banana streak disease in Africa: Problems and future research needs. Integr. Pest Manag. Rev. 1998, 3, 85–97.
7. Tripathi, L.; Mwangi, M.; Abele, S.; Aritua, V.; Tushemereirwe, W.K.; Bandyopadhyay, R. Xanthomonas wilt: A threat to banana production in East and Central Africa. Plant Dis. 2009, 93, 440–451.
8. Kema, G.H.; Drenth, A.; Dita, M.; Jansen, K.; Vellema, S.; Stoorvogel, J.J. Fusarium wilt of banana, a recurring threat to global banana production. Front. Plant Sci. 2021, 11, 628888.
9. Dale, J.L. Banana bunchy top: An economically important tropical plant virus disease. Adv. Virus Res. 1987, 33, 301–325.
10. Jeger, M.; Eden-Green, S.; Thresh, J.; Johanson, A.; Waller, J.; Brown, A. Banana diseases. In Bananas and Plantains; Springer: Berlin/Heidelberg, Germany, 1995; pp. 317–381.
11. Drenth, A.; Kema, G. The vulnerability of bananas to globally emerging disease threats. Phytopathology 2021, 111, 2146–2161.
12. Christaki, E. New technologies in predicting, preventing and controlling emerging infectious diseases. Virulence 2015, 6, 558–565.
13. Gupta, S.; Tripathi, A.K. Fruit and vegetable disease detection and classification: Recent trends, challenges, and future opportunities. Eng. Appl. Artif. Intell. 2024, 133, 108260.
14. Kashyap, B.; Kumar, R. Sensing methodologies in agriculture for monitoring biotic stress in plants due to pathogens and pests. Inventions 2021, 6, 29.
15. Roy, D.P.; Wulder, M.A.; Loveland, T.R.; Woodcock, C.E.; Allen, R.G.; Anderson, M.C.; Helder, D.; Irons, J.R.; Johnson, D.M.; Kennedy, R.; et al. Landsat-8: Science and product vision for terrestrial global change research. Remote Sens. Environ. 2014, 145, 154–172.
16. Ma, H.; Jing, Y.; Huang, W.; Shi, Y.; Dong, Y.; Zhang, J.; Liu, L. Integrating Early Growth Information to Monitor Winter Wheat Powdery Mildew Using Multi-Temporal Landsat-8 Imagery. Sensors 2018, 18, 3290.
17. Wang, H.; Pu, R.; Zhu, Q.; Ren, L.; Zhang, Z. Mapping health levels of Robinia pseudoacacia forests in the Yellow River delta, China, using IKONOS and Landsat 8 OLI imagery. Int. J. Remote Sens. 2015, 36, 1114–1135.
18. Long, L.; Chen, Y.; Song, S.; Zhang, X.; Jia, X.; Lu, Y.; Liu, G. Remote Sensing Monitoring of Pine Wilt Disease Based on Time-Series Remote Sensing Index. Remote Sens. 2023, 15, 360.
19. Wikantika, K.; Ghazali, M.F.; Dwivany, F.M.; Susantoro, T.M.; Yayusman, L.F.; Sunarwati, D.; Sutanto, A. A Study on the Distribution Pattern of Banana Blood Disease (BBD) and Fusarium Wilt Using Multispectral Aerial Photos and a Handheld Spectrometer in Subang, Indonesia. Diversity 2023, 15, 1046.
20. Zhang, S.; Li, X.; Ba, Y.; Lyu, X.; Zhang, M.; Li, M. Banana Fusarium Wilt Disease Detection by Supervised and Unsupervised Methods from UAV-Based Multispectral Imagery. Remote Sens. 2022, 14, 1231.
21. Mora, J.J.; Selvaraj, M.G.; Alvarez, C.I.; Safari, N.; Blomme, G. From pixels to plant health: Accurate detection of banana Xanthomonas wilt in complex African landscapes using high-resolution UAV images and deep learning. Discov. Appl. Sci. 2024, 6, 377.
22. Gomez Selvaraj, M.; Vergara, A.; Montenegro, F.; Alonso Ruiz, H.; Safari, N.; Raymaekers, D.; Ocimati, W.; Ntamwira, J.; Tits, L.; Omondi, A.B.; et al. Detection of banana plants and their major diseases through aerial images and machine learning methods: A case study in DR Congo and Republic of Benin. ISPRS J. Photogramm. Remote Sens. 2020, 169, 110–124.
23. Alabi, T.R.; Adewopo, J.; Duke, O.P.; Kumar, P.L. Banana Mapping in Heterogenous Smallholder Farming Systems Using High-Resolution Remote Sensing Imagery and Machine Learning Models with Implications for Banana Bunchy Top Disease Surveillance. Remote Sens. 2022, 14, 5206.
24. Moradi, S.; Bokani, A.; Hassan, J. UAV-based Smart Agriculture: A Review of UAV Sensing and Applications. In Proceedings of the 2022 32nd International Telecommunication Networks and Applications Conference (ITNAC), Wellington, New Zealand, 30 November–2 December 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 181–184.
25. Reddy Maddikunta, P.K.; Hakak, S.; Alazab, M.; Bhattacharya, S.; Gadekallu, T.R.; Khan, W.Z.; Pham, Q.V. Unmanned Aerial Vehicles in Smart Agriculture: Applications, Requirements, and Challenges. IEEE Sensors J. 2021, 21, 17608–17619.
26. Tagami, G.N.; van Westerhoven, A.C.; Seidl, M.F.; van der Sluis, J.; Cozzarelli, M.; Balarezo Camminati, D.; Pflücker, R.; Márquez Rosillo, C.; Clercx, L.; Kema, G.H.J. Aerial Mapping of the Peruvian Chira Valley Banana Production System to Monitor the Expansion of Fusarium Wilt Caused by Tropical Race 4. PhytoFrontiers 2024, 4, 196–204.
27. Food and Agriculture Organization of the United Nations (FAO). Strengthening the Preparedness of West and Central Africa to Control Banana Bunch Top Virus and Banana Fusarium Wilt (TR4); FAO: Rome, Italy, 2024; Available online: https://www.fao.org/transboundary-plant-pests-diseases/news/detail/strengthening-the-preparedness-of-west-and-central-africa-to-control-banana-bunch-top-virus-and-banana-fusarium-wilt-(tr4)/en (accessed on 3 July 2025).
28. Qazi, J. Banana bunchy top virus and the bunchy top disease. J. Gen. Plant Pathol. 2016, 82, 2–11.
29. Hooks, C.; Wright, M.; Kabasawa, D.; Manandhar, R.; Almeida, R. Effect of banana bunchy top virus infection on morphology and growth characteristics of banana. Ann. Appl. Biol. 2008, 153, 1–9.
30. Nelson, S.C.; Messing, R.; Hamasaki, R.; Gaskill, D.; Nishijima, W. Banana bunchy top: Detailed signs and symptoms. In Knowledge Creation Diffusion Utilization; Cooperative Extension Service, College of Tropical Agriculture and Human Resources, University of Hawai’i at Mānoa: Honolulu, HI, USA, 2004; pp. 1–22.
31. Chabi, M.; Dassou, A.G.; Adoukonou-Sagbadja, H.; Thomas, J.; Omondi, A.B. Variation in Symptom Development and Infectivity of Banana Bunchy Top Disease among Four Cultivars of Musa sp. Crops 2023, 3, 158–169.
32. EFSA Panel on Plant Health (PLH); Bragard, C.; Baptista, P.; Chatzivassiliou, E.; Di Serio, F.; Gonthier, P.; Jaques Miret, J.A.; Justesen, A.F.; MacLeod, A.; Magnusson, C.S.; et al. Pest categorisation of Fusarium oxysporum f. sp. cubense Tropical Race 4. EFSA J. 2022, 20, e07092.
33. Garcia-Bastidas, F. Fusarium oxysporum f. sp. cubense Tropical Race 4 (Foc TR4); CABI Compendium: Oxfordshire, UK, 2022.
34. Varghese, A.; Drovandi, C.; Mira, A.; Mengersen, K. Estimating a novel stochastic model for within-field disease dynamics of banana bunchy top virus via approximate Bayesian computation. PLoS Comput. Biol. 2020, 16, e1007878.
35. Australian Bureau of Meteorology. New South Wales Climate Averages and Extremes. 2023. Available online: http://www.bom.gov.au/climate/current/annual/nsw/summary.shtml (accessed on 3 July 2025).
36. Viljoen, A.; Mostert, D.; Chiconela, T.; Beukes, I.; Fraser, C.; Dwyer, J.; Murray, H.; Amisse, J.; Matabuana, E.L.; Tazan, G.; et al. Occurrence and spread of the banana fungus Fusarium oxysporum f. sp. cubense TR4 in Mozambique. S. Afr. J. Sci. 2020, 116, 1–11.
37. Food and Agriculture Organization of the United Nations. Mozambique: Climate and Agro-Ecological Zones. 2022. Available online: https://www.fao.org/gaez/en/pages/national-zones (accessed on 3 July 2025).
38. Gorelick, N.; Hancher, M.; Dixon, M.; Ilyushchenko, S.; Thau, D.; Moore, R. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sens. Environ. 2017, 202, 18–27.
39. Wan, Z.; Hook, S.; Hulley, G. MOD11A1 MODIS/Terra Land Surface Temperature/Emissivity Daily L3 Global 1km SIN Grid V006; NASA: Washington, DC, USA, 2015.
40. Huffman, G.J.; Bolvin, D.T.; Braithwaite, D.; Hsu, K.; Joyce, R.; Xie, P.; Yoo, S.H. NASA global precipitation measurement (GPM) integrated multi-satellite retrievals for GPM (IMERG). Algorithm Theor. Basis Doc. (ATBD) Version 2015, 4, 2020–05.
41. Frasso, G.; Eilers, P.H. L- and V-curves for optimal smoothing. Stat. Model. 2014, 15, 91–111.
42. Eilers, P.H.; Pesendorfer, V.; Bonifacio, R. Automatic smoothing of remote sensing data. In Proceedings of the 2017 9th International Workshop on the Analysis of Multitemporal Remote Sensing Images (MultiTemp), Brugge, Belgium, 27–29 June 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 1–3.
43. Retkute, R.; Thurston, W.; Cressman, K.; Gilligan, C.A. A framework for modelling desert locust population dynamics and large-scale dispersal. PLoS Comput. Biol. 2024, 20, e1012562.
44. Camps-Valls, G.; Campos-Taberner, M.; Moreno-Martínez, Á.; Walther, S.; Duveiller, G.; Cescatti, A.; Mahecha, M.D.; Muñoz-Marí, J.; García-Haro, F.J.; Guanter, L.; et al. A unified vegetation index for quantifying the terrestrial biosphere. Sci. Adv. 2021, 7, eabc7447.
45. Tucker, C.J. Red and photographic infrared linear combinations for monitoring vegetation. Remote Sens. Environ. 1979, 8, 127–150.
46. Kang, Y.; Özdoğan, M.; Zipper, S.; Román, M.; Walker, J.; Hong, S.; Marshall, M.; Magliulo, V.; Moreno, J.; Alonso, L.; et al. How Universal Is the Relationship between Remotely Sensed Vegetation Indices and Crop Leaf Area Index? A Global Assessment. Remote Sens. 2016, 8, 597.
47. Qi, J.; Chehbouni, A.; Huete, A.R.; Kerr, Y.H.; Sorooshian, S. A modified soil adjusted vegetation index. Remote Sens. Environ. 1994, 48, 119–126.
48. Wang, C.; Chen, J.; Wu, J.; Tang, Y.; Shi, P.; Black, T.A.; Zhu, K. A snow-free vegetation index for improved monitoring of vegetation spring green-up date in deciduous ecosystems. Remote Sens. Environ. 2017, 196, 1–12.
49. Badgley, G.; Anderegg, L.D.; Berry, J.A.; Field, C.B. Terrestrial gross primary production: Using NIRV to scale from site to globe. Glob. Change Biol. 2019, 25, 3731–3740.
50. Pinty, B.; Verstraete, M. GEMI: A non-linear index to monitor global vegetation from satellites. Vegetatio 1992, 101, 15–20.
51. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring vegetation systems in the Great Plains with ERTS. NASA Spec. Publ. 1974, 351, 309.
52. Pearson, R.L.; Miller, L.D. Remote mapping of standing crop biomass for estimation of the productivity of the shortgrass prairie, Pawnee National Grasslands, Colorado. In Proceedings of the Eighth International Symposium on Remote Sensing of Environment, Ann Arbor, MI, USA, 2–6 October 1972.
53. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
54. Huete, A.R. A soil-adjusted vegetation index (SAVI). Remote Sens. Environ. 1988, 25, 295–309.
55. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107.
56. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
57. Kuhn, M. Building Predictive Models in R Using the caret Package. J. Stat. Softw. 2008, 28, 1–26.
58. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
59. Meyer, D.; Dimitriadou, E.; Hornik, K.; Weingessel, A.; Leisch, F.; Chang, C.C.; Lin, C.C. e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071). R Package Version 1.7-16. 2024. Available online: https://CRAN.R-project.org/package=e1071 (accessed on 3 July 2025).
60. Belgiu, M.; Drăguţ, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31.
61. Liaw, A.; Wiener, M. Classification and Regression by randomForest. R News 2002, 2, 18–22.
62. Zou, X.; Mõttus, M. Sensitivity of Common Vegetation Indices to the Canopy Structure of Field Crops. Remote Sens. 2017, 9, 994.
63. Tan, Y.; Sun, J.Y.; Zhang, B.; Chen, M.; Liu, Y.; Liu, X.D. Sensitivity of a Ratio Vegetation Index Derived from Hyperspectral Remote Sensing to the Brown Planthopper Stress on Rice Plants. Sensors 2019, 19, 375.
64. Aeberli, A.; Phinn, S.; Johansen, K.; Robson, A.; Lamb, D.W. Characterisation of Banana Plant Growth Using High-Spatiotemporal-Resolution Multispectral UAV Imagery. Remote Sens. 2023, 15, 679.
65. Allen, R.; Dettmann, E.; Johns, G.; Turner, D. Estimation of leaf emergence rates of bananas. Aust. J. Agric. Res. 1988, 39, 53.
66. Gonçalves, L.R.; Oliveira, C.W.; Meireles, A.C.M. Spatial distribution of evapotranspiration by fractional vegetation cover index on irrigated cropland banana (Musa Spp.) in the semiarid. Remote Sens. Appl. Soc. Environ. 2023, 29, 100878.
67. Lamour, J.; Le Moguédec, G.; Naud, O.; Lechaudel, M.; Taylor, J.; Tisseyre, B. Evaluating the drivers of banana flowering cycle duration using a stochastic model and on farm production data. Precis. Agric. 2020, 22, 873–896.
68. Lamour, J.; Naud, O.; Lechaudel, M.; Le Moguédec, G.; Taylor, J.; Tisseyre, B. Spatial analysis and mapping of banana crop properties: Issues of the asynchronicity of the banana production and proposition of a statistical method to take it into account. Precis. Agric. 2019, 21, 897–921.
69. Drusch, M.; Del Bello, U.; Carlier, S.; Colin, O.; Fernandez, V.; Gascon, F.; Hoersch, B.; Isola, C.; Laberinti, P.; Martimort, P.; et al. Sentinel-2: ESA’s Optical High-Resolution Mission for GMES Operational Services. Remote Sens. Environ. 2012, 120, 25–36.
70. Retkute, R.; Gilligan, C.A. Developing a spatio-temporal model for banana bunchy top disease: Leveraging remote sensing and survey data. Front. Plant Sci. 2025, 16, 1521620.

Figure 1. Conceptual framework for banana disease detection using remote sensing data. The framework involves three phases: database construction, model construction, and model application.

Figure 2. (A) Locations of the NSW1 and NSW2 banana plantations in Australia (red dots). (B) Boundaries of the NSW1 banana plantation (red polygon). (C) Boundaries of the NSW2 banana plantation (red polygon). (D) Location of the NM banana plantation in Mozambique (red dot). (E) Boundaries of the NM banana plantation (red polygon). Satellite images in (B,C,E) were obtained from Google Earth Engine (Imagery ©2024 Airbus, CNES/Airbus, Landsat/Copernicus, Maxar Technology).

Figure 3. Meteorological data during the study period. Monthly temperature (A) and precipitation (B) in northern NSW. Monthly temperature (C) and precipitation (D) in northern Mozambique.

Figure 4. Performance of three ML algorithms as a function of length of window for covariate observation (K): decision tree (DT), support vector machines (SVMs), and random forest (RF). We assessed five values of hyperparameters for random forest model: $n_{trees} = 100$ (RF1), $n_{trees} = 200$ (RF2), $n_{trees} = 300$ (RF3), $n_{trees} = 500$ (RF4), and $n_{trees} = 1000$ (RF5). Calculated metrics are (from top to bottom rows) the Spearman correlation coefficient, the coefficient of determination ($R^{2}$), the root mean square error, and bias coefficient.

Figure 5. Comparing observed and predicted VIs at the NSW2 plantation to test the methodology with respect to seasonality in the absence of disease. (A) Observed values of NDVI over time. (B) Relative error between predicted and observed NDVI. Color shows observed values of NDVI. (C) Distribution of relative errors for forecasting period for all VIs. In (A,B), gray area shows period used to train the RF models.

Figure 6. (A) Number of new monthly cases in the NSW1 banana plantation [34] (red curve) and dynamics of difference between predicted and observed VIs at all pixels. The analyses are conducted for the range of VIs defined in Table 1. Yellow shading denotes a negative difference, where observed VI values exceed predicted values, while deep purple hues suggest a potential presence of disease. (B) Difference between predicted and observed NDVI and number of new monthly cases at the pixel with the largest number of infections.

Figure 7. (A) Number of new monthly cases in the NM banana plantation [36] (red curve) and dynamics of difference between predicted and observed VIs at all pixels. (B) Grouping pixels with similar profiles. (C) Locations of grouped pixels from (B).

Figure 8. Correlation between number of cases and difference between predicted and observed VIs. Boxes indicate the interquartile range, with the line marking the median. Whiskers extend to 1.5 × IQR, and dots represent outliers beyond this range.

Table 1. VIs used in the analyses, where BLUE, RED, NIR, and SWIR correspond to the blue (‘SR_B2’), red (‘SR_B4’), near-infrared (‘SR_B5’), and shortwave infrared (‘SR_B6’) spectral bands, respectively.

Index	Formula	Ref.
Normalized difference vegetation index	$NDVI = \frac{NIR - RED}{NIR + RED}$	[51]
Kernel NDVI	$kNDVI = \tanh\left(\left(\frac{NIR - RED}{NIR + RED}\right)^{2}\right)$	[44]
Ratio vegetation index	$RVI = NIR / RED$	[52]
Difference vegetation index	$DVI = NIR - RED$	[45]
Enhanced vegetation index	$EVI = \frac{2.5 (NIR - RED)}{NIR + 6 RED - 7.5 BLUE + 1}$	[53]
Soil-adjusted vegetation index	$SAVI = \frac{1.5 (NIR - RED)}{NIR + RED + 0.5}$	[54]
Modified soil-adjusted vegetation index	$MSAVI = 0.5 \left(2 NIR + 1 - \sqrt{(2 NIR + 1)^{2} - 8 (NIR - RED)}\right)$	[47]
Optimized soil-adjusted vegetation index	$OSAVI = \frac{NIR - RED}{NIR + RED + 0.16}$	[55]
Normalized difference phenology index	$NDPI = \frac{NIR - (0.74 RED + 0.26 SWIR)}{NIR + (0.74 RED + 0.26 SWIR)}$	[48]
Near-infrared reflectance of vegetation	$NIRv = NIR \times NDVI$	[49]
Global environment monitoring index	$GEMI = \eta (1 - 0.25 \eta) - \frac{RED - 0.125}{1 - RED}$, where $\eta = \frac{2 (NIR^{2} - RED^{2}) + 1.5 NIR + 0.5 RED}{NIR + RED + 0.5}$	[50]
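The index definitions in Table 1 translate directly into code. The sketch below implements them as plain Python functions for a single pixel; the function names, and the assumption that the inputs are surface reflectance values already scaled to the 0–1 range, are illustrative choices rather than part of the published analysis code.

```python
import math


def ndvi(nir: float, red: float) -> float:
    return (nir - red) / (nir + red)


def kndvi(nir: float, red: float) -> float:
    return math.tanh(ndvi(nir, red) ** 2)


def rvi(nir: float, red: float) -> float:
    return nir / red


def dvi(nir: float, red: float) -> float:
    return nir - red


def evi(nir: float, red: float, blue: float) -> float:
    return 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)


def savi(nir: float, red: float) -> float:
    return 1.5 * (nir - red) / (nir + red + 0.5)


def msavi(nir: float, red: float) -> float:
    return 0.5 * (2 * nir + 1 - math.sqrt((2 * nir + 1) ** 2 - 8 * (nir - red)))


def osavi(nir: float, red: float) -> float:
    return (nir - red) / (nir + red + 0.16)


def ndpi(nir: float, red: float, swir: float) -> float:
    mix = 0.74 * red + 0.26 * swir  # weighted red/SWIR term shared by both parts
    return (nir - mix) / (nir + mix)


def nirv(nir: float, red: float) -> float:
    return nir * ndvi(nir, red)


def gemi(nir: float, red: float) -> float:
    eta = (2 * (nir**2 - red**2) + 1.5 * nir + 0.5 * red) / (nir + red + 0.5)
    return eta * (1 - 0.25 * eta) - (red - 0.125) / (1 - red)


# Made-up reflectances for one densely vegetated pixel.
nir, red, blue, swir = 0.5, 0.1, 0.05, 0.2
for name, value in [
    ("NDVI", ndvi(nir, red)),
    ("EVI", evi(nir, red, blue)),
    ("MSAVI", msavi(nir, red)),
    ("NDPI", ndpi(nir, red, swir)),
    ("GEMI", gemi(nir, red)),
]:
    print(f"{name}: {value:.3f}")
```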

Table 2. Division of the observation data into training, testing, and forecasting sets. Size is the number of datapoints.

Case	Split	Date Range	Size
No disease	Training	Apr 2013 to Mar 2014	25
No disease	Forecasting	Apr 2014 to Mar 2016	50
BBTD	Training	Apr 2013 to Mar 2015	100
BBTD	Testing	Apr 2015 to Dec 2015	31
BBTD	Forecasting	Jan 2016 to Dec 2019	200
Fusarium TR4	Training	Jun 2013 to Apr 2014	48
Fusarium TR4	Forecasting	May 2014 to Oct 2015	79
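A date-based split such as the one in Table 2 can be sketched as follows, using the BBTD date ranges as an example. Treating each range as inclusive of whole calendar months is an assumption, and `split_by_date` and the toy observations are hypothetical.

```python
from datetime import date
from typing import Dict, List, Tuple

Observation = Tuple[date, float]  # (acquisition date, VI value)

# BBTD date ranges from Table 2 (month boundaries assumed inclusive).
BBTD_SPLITS: Dict[str, Tuple[date, date]] = {
    "training": (date(2013, 4, 1), date(2015, 3, 31)),
    "testing": (date(2015, 4, 1), date(2015, 12, 31)),
    "forecasting": (date(2016, 1, 1), date(2019, 12, 31)),
}


def split_by_date(
    observations: List[Observation],
    splits: Dict[str, Tuple[date, date]],
) -> Dict[str, List[Observation]]:
    """Assign each observation to the first split whose date range contains it."""
    parts: Dict[str, List[Observation]] = {name: [] for name in splits}
    for obs_date, value in observations:
        for name, (start, end) in splits.items():
            if start <= obs_date <= end:
                parts[name].append((obs_date, value))
                break  # observations outside every range are dropped
    return parts


# Toy observations (made-up values); the last one falls outside all ranges.
observations = [
    (date(2013, 6, 15), 0.71),
    (date(2015, 5, 2), 0.69),
    (date(2016, 8, 20), 0.55),
    (date(2020, 1, 5), 0.60),
]
parts = split_by_date(observations, BBTD_SPLITS)
print({name: len(items) for name, items in parts.items()})
```

Splitting by time rather than at random keeps the forecasting period strictly after the data used to fit the model.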

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Retkute, R.; Crew, K.S.; Thomas, J.E.; Gilligan, C.A. Detection of Banana Diseases Based on Landsat-8 Data and Machine Learning. Remote Sens. 2025, 17, 2308. https://doi.org/10.3390/rs17132308

