Article

Comparative Analysis of Machine Learning Algorithms for Object-Based Crop Classification Using Multispectral Imagery

1. State Key Laboratory of Cotton Bio-Breeding and Integrated Utilization, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, China
2. Department of Biodiversity, Ecology, and Evolution, Faculty of Sciences and Engineering, Sorbonne University, 75005 Paris, France
3. Key Laboratory of Crop Water Requirement and Regulation of Ministry of Agriculture and Rural Affairs, Institute of Farmland Irrigation, Chinese Academy of Agricultural Sciences, Xinxiang 453002, China
4. Institute of Cash Crops, Xinjiang Academy of Agricultural Sciences, Urumqi 830091, China
* Author to whom correspondence should be addressed.
Drones 2025, 9(11), 763; https://doi.org/10.3390/drones9110763
Submission received: 8 September 2025 / Revised: 30 October 2025 / Accepted: 4 November 2025 / Published: 5 November 2025
(This article belongs to the Special Issue UAS in Smart Agriculture: 2nd Edition)

Highlights

What are the main findings?
  • Unmanned aerial vehicle imagery combined with object-based image analysis can identify crop types.
  • Five machine learning algorithms exhibit high accuracy for crop-type classification.
What is the implication of the main finding?
  • The Ensemble Learning method outperformed every single model.
  • Index and grey-level co-occurrence matrix features are important for crop identification.

Abstract

Unmanned Aerial Vehicles (UAVs) offer enhanced spatial and temporal resolution for agricultural remote sensing, surpassing traditional satellite-based methods. Given the abundance of evolving machine learning (ML) methods for crop recognition, this study evaluates and compares five ML algorithms and tests an Ensemble Learning method as a sixth approach, integrated with object-based image analysis (OBIA) for crop-type classification using UAV multispectral imagery, aiming to identify the most effective model and produce a classification map based on the best-performing method. Image segmentation was performed in eCognition software, and spectral, index, and grey-level co-occurrence matrix (GLCM) features were extracted from the segmented objects. A machine learning framework integrating multiple classification algorithms (SVM, ANN, RF, XGBoost, KNN, Ensemble Learning) with automated hyperparameter optimization was developed and executed in Google Colab using Python 3.10. All classifiers achieved accuracies exceeding 80% and Area Under the Curve (AUC) values above 0.9. SVM and ANN were the best single classifiers, with identical accuracy (94%), followed by XGBoost (93%), RF (92%), and KNN (89%). The Ensemble Learning method (SVM + ANN), as a sixth approach, outperformed all single models, with an accuracy of 95%. Cotton, maize, peanut, and soybean were classified with the highest accuracy, with index and GLCM features contributing most significantly, followed by spectral features. The integration of high-resolution UAV imagery with ML and OBIA demonstrates strong potential for automated crop-type classification, offering valuable support for precision agriculture applications.

1. Introduction

Accurate crop classification mapping plays a vital role in agricultural production, food security, and policy development, providing essential reference data for agricultural decision-making and for creating crop management plans [1]. Remote sensing is a critical tool for the agricultural sector, as it provides reliable and localized data for monitoring and thematic mapping of crops [2].
One primary source of data for crop classification in today’s remote sensing community is satellite remote sensing. However, acquiring images with high temporal and spatial resolution from satellites and airborne sensors is expensive, requires long collection cycles, is susceptible to weather and cloud effects, and cannot guarantee acquisition at the required time, so the data quality often fails to satisfy application requirements [3]. Unmanned aerial vehicle (UAV) remote sensing, by contrast, uses low-altitude imaging to overcome these drawbacks, producing aerial images with better resolution, richer spatial information, and more distinct ground object characteristics. UAV use nonetheless has limitations, particularly its restricted coverage area, which is a disadvantage in extensive agriculture; the experiment performed here covers only a small area (about 0.5 hectares, approximately 45 × 120 m). Moreover, UAV remote sensing provides a cost-effective solution, rapid collection of agricultural information on the ground, and flexible data acquisition methods [3].
Recent research applies UAV remote sensing with a particular focus on visible-light images for crop information extraction; however, this approach yields inadequate spectral information, which poses significant challenges for accurate crop classification [4,5], especially for individual crop plants. Hyperspectral and multispectral imaging address this challenge by providing more precise data for individual crop classification. Hyperspectral imaging delivers extensive spectral data, but its high cost restricts its wider use [6]. Multispectral imaging, by contrast, balances cost-effectiveness and information richness, underscoring the value of developing crop identification studies with UAV multispectral imagery [7].
Generally, remote sensing imagery classification methods can be divided into two categories according to classification units: pixel-based and object-based [8]. Pixel-based classification methods face difficulties with spectral heterogeneity and similarity in high-resolution remote sensing, which frequently results in salt-and-pepper noise and decreased accuracy [9]. On the other hand, object-based image classification methods use segments with consistent characteristics as the fundamental classification units rather than individual pixels. For every segment in the classification process, this method allows the integration of spectral, textural, geometric, and contextual data [9]. This method significantly increases the dependability of the data and improves classification accuracy [10,11].
In recent years, owing to the distinct benefits of the object-based approach, several researchers have combined it with diverse machine learning algorithms for classifying remote sensing images. Ameslek et al. [12] used Object-Based Image Analysis (OBIA) with a Convolutional Neural Network (CNN) to automatically identify and count olive trees from Phantom 4 Advanced drone imagery and achieved an overall accuracy of 97%, which improved to 99% after OBIA refinement. Anderson et al. [13] investigated the efficacy of uncrewed aircraft systems (UAS) with a three-band (red, green, blue; RGB) sensor for identifying invasive Phragmites australis using OBIA combined with the machine learning algorithms support vector machine (SVM), random forest (RF), and artificial neural network (ANN). The findings showed that all three algorithms achieved a classification accuracy of 90% after applying the post-ML OBIA workflow. Feng et al. [3] combined OBIA with random forest (RF), support vector machine (SVM), decision tree (DT), and K-nearest neighbor (KNN) algorithms to extract and analyze weed information from visible-light UAV images of farmland areas with two distinct weed densities; among these techniques, RF performed best in both experimental regions, increasing total accuracy by 1.74–12.14% in densely covered areas and 7.51–11.56% in sparsely covered areas. Deng et al. [7] assessed a technique for identifying crops from multispectral UAV images that combines an object-oriented method with the random forest (RF) algorithm, achieving a kappa coefficient of 0.92 and an overall accuracy (OA) of 92.76%, and demonstrating that integrating spectral features with Grey-Level Co-Occurrence Matrix and index features (SPEC + GLCM + INDEX) produced the best results, whereas the combination of geometric and spectral features (SPEC + GEOM) performed worst. That study produced satisfactory results in crop detection; however, given the wealth of evolving machine-learning methods for crop recognition [14], it used only the random forest algorithm, restricting the ability to evaluate the efficacy of machine learning in classifying crops.
Therefore, this study proposes an approach for classifying crop types that combines object-based image analysis with five frequently used machine learning algorithms, Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), XGBoost, and K-Nearest Neighbor (KNN), applied to UAV multispectral images and considering index, spectral, and textural features. This work aims to evaluate and compare the performance of these machine learning algorithms to identify the most suitable model, to test an Ensemble Learning model as a sixth approach, and to generate a map using the best-performing classifier. This approach will enhance agricultural remote sensing techniques, making it a significant contribution to precision agriculture.

2. Materials and Methods

2.1. Study Area

The experimental station is located at the East Field of the Cotton Research Institute of the Chinese Academy of Agricultural Sciences in Anyang City, Henan Province, China (36°07′ N, 116°22′ E) (Figure 1). The crops were planted in April 2024 in an intercropping system covering an area of 4990.4 m² and harvested in October 2024. The experimental site has a clay loam soil of medium fertility and a semi-humid subtropical monsoon climate with an average annual precipitation of around 544 mm [15]. With a diverse range of crops, including cotton, maize, soybean, wheat, peanut, and others, the site provides suitable conditions for agriculture and meets the requirements for testing this study’s methodology.

2.2. Framework of This Study

In this work, we used OBIA-ML algorithms to classify crop types from UAV images; the framework of the whole process is presented in Figure 2: (1) UAV image acquisition and preprocessing, involving image acquisition and orthographic image generation; (2) image segmentation using multiresolution segmentation; (3) selection and extraction of spectral, index, and textural (GLCM) features from the segmented objects; (4) image classification performed with six machine learning algorithms (SVM, ANN, Random Forest, XGBoost, KNN, Ensemble Learning) and training samples; (5) accuracy assessment of the different machine learning models and crop mapping of the whole study area based on the best classifier.

2.3. UAV Image Data Acquisition and Preprocessing

A Matrice 350 RTK UAV from Feitoun Icont (Beijing) Technology Co., Ltd. (Beijing, China) (Figure 3), equipped with an FT10-2512L camera, was used for multispectral imaging of the study area on 1 August 2024. The FT10-2512L camera includes 12 single-band lenses with wavelengths of 410 nm, 430 nm, 450 nm, 550 nm, 560 nm, 570 nm, 630 nm, 650 nm, 685 nm, 710 nm, 850 nm, and 900 nm.
The UAV positioning system is based on the Global Navigation Satellite System-Real-Time Kinematic (GNSS-RTK) method, providing high-accuracy location data for the camera. The FT10-2512L’s onboard navigation system integrates this GNSS-RTK data with its own attitude data, calculating high-precision position and orientation information. During each data acquisition session, the FT10-2512L also measures ambient light using its built-in spectroradiometer. Data, including spectral images, pose information, and ambient light intensity, are processed and fused by FT-PreData software (developed by Feitoun Icont (Beijing) Technology Co., Ltd.). This processing includes spectral calibration, spectral data extraction, and normalized index calculation, resulting in high-resolution multispectral photogrammetric products.
During operations, the UAV flew at an altitude of 70 m under clear and windless weather conditions at noon (12:00), with the camera lens maintained in a vertically downward position. A total of 300 TIFF images were collected in a 10 min flight session. Using FT-PreData software, the data were processed to produce an orthophotomosaic covering 0.4 square kilometers, with a spatial resolution of approximately 0.008 m.

2.4. Image Segmentation

Image segmentation is an essential phase in object-based image processing, because the quality of segmentation directly impacts the effectiveness of object-based feature extraction [16]. In this study, we used OBIA’s multiresolution segmentation algorithm. Multiresolution segmentation, a frequently used algorithm, fuses pixels from the bottom up into larger, clustered, homogeneous patches based on the homogeneity principle; the segmentation scale determines the heterogeneity threshold that halts the expansion of an object [17]. Selecting the right scale parameters in eCognition software is a crucial factor when segmenting remotely sensed imagery [8]. Based on the determined optimal segmentation scale, image segmentation was conducted using eCognition Developer 10.3 (Trimble Germany GmbH, Munich, Germany). After testing different scale parameters, the parameters were configured as follows: the weight of each band was set to 1, the scale parameter to 10, the shape factor to 0.1, and the compactness to 0.5.

2.5. Feature Selection and Extraction

Yang et al. [18] pointed out that classification accuracy is strongly correlated with the number of features. Deng et al. [7] assessed the significance of features for segmented objects across different classification schemes and found that combining spectral features with indices and textural (GLCM) features outperformed other combinations, such as spectral plus geometric features. Based on these results, we selected and extracted 64 features (12 spectral, 12 index, and 40 GLCM features) for each segmented object. These features were selected, calculated, and extracted using eCognition Developer 10.3 (Trimble Germany GmbH, Munich, Germany) (Table 1).
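The study computed these features in eCognition; purely as a hedged illustration of what the GLCM measures in Table 1 involve, the sketch below derives contrast, dissimilarity, homogeneity, and correlation at 0°, 45°, 90°, and 135°, plus an all-directions entropy, for one object's pixels with scikit-image. The helper name, the 32-level quantization, and the input array are our assumptions, not the authors' implementation.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

def glcm_features(band: np.ndarray, levels: int = 32) -> dict:
    """Texture measures for one segmented object (hypothetical helper).

    `band` is a 2-D array of one spectral band clipped to the object,
    quantized to `levels` grey levels to keep the co-occurrence matrix small.
    """
    bins = np.linspace(band.min(), band.max(), levels)
    img = (np.digitize(band, bins) - 1).astype(np.uint8)
    # Distance of 1 pixel; the four orientations used for the GLCM features.
    glcm = graycomatrix(img, distances=[1],
                        angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                        levels=levels, symmetric=True, normed=True)
    feats = {}
    for prop in ("contrast", "dissimilarity", "homogeneity", "correlation"):
        for j, ang in enumerate((0, 45, 90, 135)):
            feats[f"{prop}_{ang}"] = graycoprops(glcm, prop)[0, j]
    # graycoprops provides no entropy, so compute it from the
    # direction-averaged, normalized co-occurrence matrix.
    p = glcm[:, :, 0, :].mean(axis=2)
    feats["entropy_all_dir"] = float(-np.sum(p * np.log2(p + 1e-12)))
    return feats
```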

2.6. Model Development and Implementation

We developed a comprehensive machine learning framework integrating multiple classification algorithms (support vector machine, artificial neural network, random forest, XGBoost, K-nearest neighbor) with automated hyperparameter optimization. The implementation was executed in Google Colab using Python 3.10. The analytical pipeline was built using scikit-learn 1.5.2, XGBoost 2.1.2, NumPy 2.1.3, Pandas 2.2.3, and Optuna 4.1.0 for the hyperparameter optimization framework.
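For reference, a Colab cell pinning the stack to the versions reported above might look as follows; this install line is our reconstruction, since the study does not publish its notebook.

```python
# Pin the analysis stack to the reported versions (run once in a Colab cell;
# the leading "!" is IPython shell syntax).
!pip install scikit-learn==1.5.2 xgboost==2.1.2 numpy==2.1.3 pandas==2.2.3 optuna==4.1.0
```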

2.7. Machine Learning Models and Optimization

Six classification algorithms were systematically evaluated: Support Vector Machine (SVM), Artificial Neural Network (ANN), Random Forest (RF), Extreme Gradient Boosting (XGBoost), K-Nearest Neighbors (KNN), and an Ensemble Learning model combining the two best-performing classifiers. These approaches were chosen because they are robust and widely used in agricultural remote sensing studies for crop-type classification.
SVM is a binary classifier that separates training samples by locating a hyperplane in a high-dimensional feature space. It works well for problems involving small sample sizes, nonlinearity, and high-dimensional classification [17]. SVM has the following hyperparameters: regularization parameter C (0.01–100), kernel function type (“rbf”, “linear”), and kernel coefficient gamma for the rbf kernel (0.001–1.0). SVM has been used to classify vegetation based on its health status [29], classify trees by type [30], identify and classify weeds to generate weed maps [31], and segment crop rows [32].
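To make the search space concrete, the sketch below shows how the SVM hyperparameters and ranges just listed could be expressed as an Optuna objective; the data names (X_train, y_train) and the 5-fold cross-validated scoring are our assumptions, not the authors' published code. The same pattern extends to the ANN, RF, XGBoost, and KNN spaces described below.

```python
import optuna
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def svm_objective(trial: optuna.Trial) -> float:
    # Search space mirroring the reported SVM ranges.
    params = {
        "C": trial.suggest_float("C", 0.01, 100.0, log=True),
        "kernel": trial.suggest_categorical("kernel", ["rbf", "linear"]),
        "probability": True,  # class probabilities, needed later for soft voting
    }
    if params["kernel"] == "rbf":
        params["gamma"] = trial.suggest_float("gamma", 0.001, 1.0, log=True)
    model = make_pipeline(StandardScaler(), SVC(**params))
    # X_train / y_train: the 70% stratified training split (assumed names).
    return cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy").mean()
```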
ANN has been widely used for classifying crops and their traits [33,34]. An ANN has an input layer, multiple hidden layers, and an output layer; the neurons in each hidden layer perform mathematical computations to identify complex connections between the input and output data [35]. ANN has the following hyperparameters: network architecture options (hidden layer sizes), activation function (“relu”, “tanh”), L2 regularization term alpha (1 × 10−5 to 1 × 10−2), and initial learning rate (1 × 10−5 to 1 × 10−2).
RF is a tree-based classifier that increases the variety of classification trees through bootstrapping. RF creates several independent decision tree models by leveraging the speed and accuracy of the decision tree technique; these models collaboratively reduce errors, resulting in more accurate and reliable classification outcomes [34,36]. RF includes hyperparameters such as the number of trees in the forest (100–500) and the maximum tree depth (5–50). Demonstrating its effectiveness in classification studies, RF has been used to classify crop types [7] and to distinguish sugar beet crops from weeds [37].
The XGBoost method was applied to evaluate soybean lodging using UAV imagery [35]. XGBoost improves on previous boosting methods by reducing overfitting through regularized model formalization [38]. Its hyperparameters include the number of boosting rounds (100–1000), maximum tree depth (3–12), step-size shrinkage (0.001–0.1), the fraction of samples used for tree building (0.6–1.0), the fraction of features used per tree (0.6–1.0), and the minimum sum of instance weight needed in a child (1–7).
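For concreteness, these ranges map onto xgboost.XGBClassifier parameter names roughly as follows; the mapping is our assumption based on standard XGBoost terminology, not a published configuration.

```python
# Assumed correspondence between the ranges above and XGBClassifier parameters.
xgb_search_space = {
    "n_estimators":     (100, 1000),   # number of boosting rounds
    "max_depth":        (3, 12),       # maximum tree depth
    "learning_rate":    (0.001, 0.1),  # step-size shrinkage (eta)
    "subsample":        (0.6, 1.0),    # fraction of samples per tree
    "colsample_bytree": (0.6, 1.0),    # fraction of features per tree
    "min_child_weight": (1, 7),        # minimum sum of instance weight in a child
}
```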
KNN is a non-parametric supervised machine learning technique widely used in data processing and modeling [39]. KNN has the following hyperparameters: the number of neighbors to consider (3–30), the method of weighting neighbor votes (“uniform”, “distance”), and the distance metric (“Euclidean”, “Manhattan”, “Minkowski”). The literature shows that KNN has a wide range of applications in precision agriculture, including land-cover classification [40], sugarcane planting-line detection [41], and crop-row segmentation [32].
The Ensemble Learning Model was constructed by combining the two best-performing individual classifiers using soft voting, where class probabilities from constituent models are averaged to produce final predictions. This approach leverages the complementary strengths of different algorithms to improve overall classification accuracy. For the Ensemble model, optimization determined the optimal voting mechanism (“soft”, “hard”) and relative weights for the two constituent models (0.5–2.0 each) to maximize classification performance.
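A minimal sketch of such a weighted soft-voting ensemble with scikit-learn follows; best_svm, best_ann, and the weight variables are assumed names for the tuned constituent models and their Optuna-selected weights.

```python
from sklearn.ensemble import VotingClassifier

# best_svm (SVC with probability=True) and best_ann (MLPClassifier) are the two
# tuned top performers (assumed names). Soft voting averages their predicted
# class probabilities, here with per-model weights from the 0.5-2.0 range.
ensemble = VotingClassifier(
    estimators=[("svm", best_svm), ("ann", best_ann)],
    voting="soft",
    weights=[w_svm, w_ann],
)
ensemble.fit(X_train, y_train)
```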
Hyperparameter optimization was conducted using the Tree-structured Parzen Estimator (TPE) sampler implemented in the Optuna framework. The TPE algorithm employs Bayesian optimization principles to efficiently explore the hyperparameter space by modeling the relationship between parameter configurations and objective values, thereby reducing the number of trials required compared to grid or random search approaches. The optimization protocol executed 50 trials per individual algorithm (SVM, ANN, RF, XGBoost, KNN) and 20 trials for the Ensemble Learning model. These trial allocations were determined through preliminary convergence analysis, which revealed that objective function values reached asymptotic behavior—indicating convergence to near-optimal hyperparameter configurations—within these iteration limits. Extending trials beyond these thresholds yielded negligible improvements in validation performance, suggesting that the search space had been adequately explored and local optima had been identified for each algorithm within the defined parameter ranges.
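Running such a search then reduces to a few lines, sketched here for the SVM objective above; the fixed seed is our addition for reproducibility.

```python
import optuna
from optuna.samplers import TPESampler

# Bayesian (TPE) search maximizing cross-validated accuracy.
study = optuna.create_study(direction="maximize", sampler=TPESampler(seed=42))
study.optimize(svm_objective, n_trials=50)  # 20 trials were used for the Ensemble
print(study.best_params, study.best_value)
```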

2.8. Data Processing and Performance Evaluation

In this study, visual interpretation and ground verification of the high-resolution UAV images were used to select sample data for image classification. The total number of samples for the nine object types was determined based on the distribution characteristics observed in the study area (Figure 4). Sample counts for the different objects ranged from 11 to 3490 (Table 2), for a total of 12,274 samples. The dataset underwent systematic preprocessing, including feature extraction and encoding of categorical variables. A stratified sampling method was employed, allocating 70% of the data for training the classification model and 30% for testing to validate classification accuracy. This approach preserved the inherent characteristics of the data while minimizing potential bias during model evaluation.
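In scikit-learn terms, the encoding and the stratified 70/30 split correspond to the sketch below; the file and column names are hypothetical, since the sample table itself is not published.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder

# samples.csv: one row per segmented object with the 64 features plus a
# "class" column holding the nine object types (hypothetical names).
df = pd.read_csv("samples.csv")
X = df.drop(columns=["class"]).to_numpy()
y = LabelEncoder().fit_transform(df["class"])  # encode categorical labels

# Stratified 70% training / 30% testing split preserving class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)
```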
Model assessment employed a three-tier evaluation approach. First, confusion matrices enabled detailed examination of classification patterns and error distributions. Second, feature importance analysis identified key predictive variables through model-specific importance extraction methods for tree-based models and permutation importance for others, with scores normalized for cross-model comparison. Third, ROC curve analysis provided insights into classification performance across different thresholds, with micro-averaging for multi-class scenarios.
The evaluation protocol emphasized comparative analysis across models using multiple performance metrics, including Accuracy, Cohen’s Kappa coefficient, and F1-score, calculated using Equations (1)–(5). Accuracy reflects the model’s overall performance. Cohen’s Kappa coefficient measures the agreement between predicted and actual classes beyond what would be expected by chance. The F1-score balances precision and recall and ranges from 0 to 1, with 1 being the best possible score, indicating perfect precision and recall.
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (1)$$
$$\text{Kappa} = \frac{P_0 - P_e}{1 - P_e} \quad (2)$$
$$\text{Precision} = \frac{TP}{TP + FP} \quad (3)$$
$$\text{Recall} = \frac{TP}{TP + FN} \quad (4)$$
$$\text{F1 Score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (5)$$
where P0 is the observed agreement (the overall accuracy of the model); Pe is the agreement between the model predictions and the actual class values expected by chance; and TP, TN, FP, and FN are the True Positive, True Negative, False Positive, and False Negative cases of the result data, respectively.
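With scikit-learn, Equations (1)–(5) and the micro-averaged ROC analysis reduce to library calls, sketched below for the fitted ensemble from the previous steps; the weighted F1 averaging is our assumption for the multi-class case.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score, confusion_matrix,
                             f1_score, roc_auc_score)
from sklearn.preprocessing import label_binarize

y_pred = ensemble.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))          # Eq. (1)
print("Kappa:", cohen_kappa_score(y_test, y_pred))          # Eq. (2)
print("F1:", f1_score(y_test, y_pred, average="weighted"))  # Eqs. (3)-(5)
print(confusion_matrix(y_test, y_pred))                     # per-class error patterns

# Micro-averaged multi-class ROC AUC: binarize labels, score probabilities.
y_bin = label_binarize(y_test, classes=np.unique(y_test))
print("AUC (micro):", roc_auc_score(y_bin, ensemble.predict_proba(X_test),
                                    average="micro"))
```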

2.9. Visualization Framework

We developed a custom visualization system to monitor optimization progress and compare model performance. This system comprises optimization trajectory analysis, model performance assessment, and distribution analysis. This integrated approach enabled systematic evaluation of model behavior and performance characteristics.

3. Results

3.1. Classification Performance of Machine Learning Models

Figure 5 shows that SVM, ANN, RF, XGBoost, and KNN all exhibit high accuracy values. The top two individual methods were SVM (accuracy: 94%) and ANN (accuracy: 94%), which were combined to build the Ensemble Learning model. Looking at the classes, cotton, maize, peanut, and soybean were predicted most accurately in all the models, followed by soil, whereas infrastructure, roads, and solar panels were predicted with the least accuracy.
Figure 6 and Table 3 illustrate the feature importance rankings for the different classification models (SVM, ANN, RF, XGBoost, KNN, and Ensemble). Notably, the top 20 ranked features of every model were dominated by index and GLCM features; Max-diff was the only spectral feature to appear.
For the SVM model, MSWI retained the first position, but GLCM texture features dominated the subsequent rankings, with Dissimilarity 90°, Entropy (all directions), Contrast 45°, and Contrast 135° occupying positions two through five. The vegetation indices DVI, NDVI, and IPVI appeared at the eighth, tenth, and eleventh positions, respectively, and the spectral feature Max-diff ranked fourteenth. The SVM’s top 20 features comprised seven vegetation indices, twelve GLCM features, and one spectral feature, demonstrating the model’s substantial reliance on spatial texture information for classification.
In the ANN model, the texture feature GLCM Dissimilarity 90° achieved the highest importance, followed closely by the vegetation index MSWI. Additional GLCM features (Contrast 135°, Contrast 45°, Entropy all directions) occupied positions three through five, with the vegetation indices DVI and IPVI at sixth and seventh. The spectral feature Max-diff ranked tenth. For the ANN, seven vegetation indices, twelve GLCM features, and one spectral feature were represented among the top 20, highlighting the network’s sensitivity to spatial texture patterns.
For the Random Forest model, vegetation indices dominated the feature hierarchy, with MSWI occupying the first position, followed by NIRRR, NDVI, and IPVI. The first texture feature, GLCM Homogeneity 90°, appeared in fourth position, interspersed with additional index features maintaining prominent positions within the top ten. The spectral feature Max-diff ranked fifteenth. In total, twelve vegetation indices, seven GLCM texture features, and one spectral feature comprised the RF model’s top 20, indicating a pronounced preference for vegetation-based metrics.
In the XGBoost model, the index feature MSWI demonstrated exceptional discriminative capacity, with substantially higher importance than all other features. It was followed sequentially by the vegetation indices IPVI and GNDVI, with the spectral feature Max-diff fourth. Additional index features (DVI, NDRE) maintained elevated rankings, whilst the first texture feature (GLCM Homogeneity 90°) appeared seventh. In total, eleven vegetation indices, eight GLCM features, and one spectral feature constituted the XGBoost model’s top 20, revealing a clear algorithmic preference for spectral-based vegetation metrics.
In the KNN model, vegetation indices were similarly prominent, with MSWI first, followed consecutively by NDRE, IPVI, NDVI, NIRGR, and GNDVI in positions two through six. The spectral feature Max-diff appeared seventh, earlier than in the other models, and GLCM texture features commenced at the ninth position with Homogeneity 90°. In total, ten vegetation indices, nine GLCM features, and one spectral feature comprised the KNN model’s top 20, suggesting a preferential use of spectral-based discriminatory information.
For the Ensemble model, a more balanced distribution between GLCM and vegetation index features was observed. GLCM Dissimilarity 90° achieved the highest ranking, followed by the GLCM Contrast features (45° and 135°), with the vegetation index MSWI positioned third. GLCM and index features alternated throughout the ranking, with the vegetation indices DVI, NDVI, IPVI, and DVIGRE at the seventh, eleventh, twelfth, and fourteenth ranks, respectively, and the spectral feature Max-diff thirteenth. In total, seven vegetation indices, twelve GLCM features, and one spectral feature constituted the Ensemble model’s top 20, demonstrating the complementary and synergistic nature of texture and vegetation information in the integrated classification approach.
Across all models, MSWI consistently demonstrated substantial discriminative capacity, appearing within the top three features for five of the six models evaluated. GLCM texture features, particularly the Dissimilarity, Contrast, and Correlation metrics computed at various angular orientations (0°, 45°, 90°, 135°), exhibited strong classificatory power, especially in the SVM, ANN, and Ensemble models. Max-diff was the sole spectral feature to appear consistently within the top 20 across all models, though it was generally positioned between the seventh and fifteenth ranks. The predominance of derived vegetation indices and spatial texture features over raw spectral bands suggests that engineered features capturing vegetation physiological status and spatial heterogeneity possess superior discriminative capability for this classification task.
The area under the curve (AUC) is the area beneath a ROC curve, which plots an algorithm’s true positive rate (TPR) against its false positive rate (FPR). An ideal ROC curve hugs the top left corner, so a higher AUC indicates a better classifier. Figure 7 shows that all the models had AUC values greater than 0.90, close to the maximum of 1, from which we can deduce that all the models were good classifiers.

3.2. Performance Metrics Comparison

Figure 8 shows that the models used all performed well. Their accuracy, F1-score, and Cohen’s kappa values exceeded 0.8. The Ensemble model performed best, with accuracy, F1-Score, and Cohen’s kappa values of 0.95, 0.95, and 0.93, respectively. These values were slightly higher than those of the other models. In second place were the ANN and SVM models, which demonstrated identical accuracy and F1-Score values of 0.94, with Cohen’s kappa of 0.93 and 0.92, respectively. The XGBoost and Random Forest models achieved comparable performance levels, with XGBoost recording accuracy and F1-Score values of 0.93, and a Cohen’s kappa of 0.91, whilst Random Forest yielded accuracy and F1-Score values of 0.92, accompanied by a Cohen’s kappa coefficient of 0.89. The KNN algorithm exhibited the lowest performance amongst the evaluated models, attaining accuracy and F1-Score values of 0.89, with a Cohen’s kappa coefficient of 0.85. Nevertheless, it is noteworthy that even the lowest-performing model maintained accuracy levels approaching 0.9.

3.3. F1-Score Comparison Across Different Classes by Model

Figure 9 and Table 4 demonstrate the class-specific F1-score performance across all evaluated models. The Ensemble model achieved optimal classification accuracy for the majority of target classes, including cotton, maize, peanut, road, and soybean, with F1-Score values of 0.93, 0.92, 0.95, 0.93, and 0.91, respectively. The soil class was most accurately predicted by ANN, attaining F1-Score values of 0.96. For the infrastructure class, ANN, Random Forest, and XGBoost models demonstrated superior performance with F1-Score values of 0.94. The shrub class exhibited the most variable performance across models, with SVM achieving the highest F1-Score of 0.69, whilst Random Forest recorded notably lower accuracy at 0.37. The solar panels class was most accurately classified by ANN with an F1-Score of 0.82, though this class demonstrated substantially reduced classification accuracy across all algorithms.
Among the agricultural crop classes, peanut exhibited the most consistent and robust classification performance, with F1-Score values ranging from 0.93 to 0.97 across all models, indicating high discriminability of this crop type. Cotton similarly demonstrated relatively elevated accuracy, with F1-Score values spanning 0.91 to 0.95 across different algorithms. Maize maintained comparably high classification accuracy, with F1-Score values ranging from 0.89 to 0.95 for all models. Soybean achieved satisfactory performance, with F1-score values between 0.89 and 0.93 across the evaluated classification approaches. These results suggest that the spectral and textural characteristics of major crop classes—cotton, maize, peanut, and soybean—provide sufficient discriminatory information for reliable classification across multiple algorithmic frameworks.
Conversely, certain non-agricultural classes demonstrated more variable performance. The shrub class exhibited consistently lower classification accuracy across all models, with F1-Score values ranging from 0.25 to 0.69, suggesting potential spectral confusion with other vegetation types or high intra-class variability. Similarly, the solar panels class presented classification challenges, with F1-Score values between 0.43 and 0.82, potentially attributable to limited training samples or spectral similarity with infrastructure features. The soil, road, and infrastructure classes achieved robust classification performance across all models, with F1-Score values consistently exceeding 0.85, indicating strong separability of these land cover types from vegetated classes.

3.4. Optimization History

Figure 10 shows the hyperparameter optimization process for all classification models using the Tree-structured Parzen Estimator (TPE) algorithm implemented through the Optuna framework, with accuracy as the optimization objective. The optimization trajectories reveal distinct convergence patterns and performance characteristics across the evaluated models.
The Random Forest model attained an optimized accuracy of 0.9167 after 50 optimization trials. Its convergence behavior was characterized by consistent performance across most trials, with accuracy values predominantly clustering between 0.90 and 0.92; this relatively stable trajectory indicates limited sensitivity to hyperparameter variations, suggesting robust default performance of the ensemble learning approach.
The XGBoost model attained an optimized accuracy of 0.9298 after 50 trials, exhibiting the most efficient convergence behavior among all models. The optimization curve showed steep initial improvement, reaching near-optimal accuracy within the first 10 trials, with subsequent trials exploring the parameter space whilst maintaining consistently high performance above 0.92. The relatively narrow performance variance across trials suggests robust stability across diverse hyperparameter combinations.
The SVM model achieved an optimized accuracy of 0.9404 following systematic hyperparameter tuning across 50 trials. Its trajectory demonstrated rapid initial convergence, with accuracy stabilizing above 0.90 after approximately 5 trials, followed by marginal incremental improvements. Notable variability in trial performance was observed, with accuracy values ranging from approximately 0.65 to 0.94, indicating substantial sensitivity to hyperparameter configurations.
The ANN model reached an optimized accuracy of 0.9455 across 50 trials, demonstrating a gradual convergence pattern with stepwise improvement and distinct performance plateaus at approximately 0.942 and 0.944 before achieving the final optimized value. The scattered trial points suggest moderate sensitivity to architectural and training hyperparameters.
The KNN model achieved an optimized accuracy of 0.8897, the lowest among all evaluated models. Its optimization curve exhibited a gradual ascending trend throughout the trial sequence, with accuracy progressively improving from approximately 0.88 to 0.89. The smooth convergence without significant fluctuations suggests that KNN performance is primarily constrained by the intrinsic characteristics of the feature space rather than by hyperparameter configuration.
The Ensemble model achieved an optimized accuracy of 0.9492, the highest among all evaluated approaches, and required fewer trials (approximately 20) than the other models. Its convergence showed a characteristic two-phase behavior, with rapid initial improvement followed by a plateau with minimal further gains, indicating efficient identification of optimal parameter configurations.
Across all models, the TPE algorithm demonstrated effective exploration of the hyperparameter space, consistently identifying configurations that maximized classification accuracy. The varying numbers of trials used (20 for Ensemble Learning, 50 for the other models) reflect differences in hyperparameter sensitivity and model complexity. The optimization process successfully enhanced model performance, with final accuracies exceeding 0.88 for all algorithms.

3.5. Crop Mapping

Figure 11 shows the best mapping result for crops in the study area, based on the 64 features and the Ensemble model classifier. The target crops were successfully identified in the mapping results, and the spatial distribution of each category was confirmed. The accuracy of all categories corresponded to the prior analysis of the selected samples. In particular, the non-crop categories, such as shrub, soil, infrastructure, and solar panels, had distinct characteristics, were identified consistently with the actual ground situation, and were completely covered. Figure 11 also demonstrates the accurate mapping of cotton, maize, soybean, and peanut crops grown in a monoculture system on a single plot. However, the accuracy of crop mapping declined significantly in intercropped and associated plots: soybean intercropped with maize or cotton, as well as peanut intercropped with maize or cotton, were underestimated and frequently misclassified. This misinterpretation primarily arises from the spectral similarities between peanut and soybean and between cotton and maize. Consequently, mapping crops in intercropping systems or association plots yields lower estimation accuracy, whereas monoculture plots achieve higher precision.

4. Discussion

This work compares common machine learning algorithms combined with OBIA, applied to crop-type classification on multispectral UAV images. A systematic literature review was carried out to find the methods used for this task, and five individual classification models plus one Ensemble Learning method were selected to perform the classification. All models were tested using a combination of spectral, index, and texture features. Analyzing the performance of each machine learning model, we can conclude that the six algorithms used in this study (SVM, ANN, RF, XGBoost, KNN, and Ensemble) gave good classification results (Figure 5 and Figure 7). These results support those of [10,13,42,43], who combined object-based image analysis with machine learning to classify crops, extract urban impervious surfaces, and detect shrubs and weeds, achieving high accuracy with various machine learning models. Additionally, the Ensemble Learning model outperformed all single models, with an accuracy of 95%. This result is in line with Lei et al. [44], who used SVM and ANN (along with logistic regression) to classify paddy rice using satellite and aerial images and obtained an accuracy of 97.27%, better than single models such as SVM and ANN. Also, Hao et al. [45] compared single classifiers (SVM, RF, C5.0) with two ensemble methods (multiple voting and probabilistic fusion) and observed that the ensemble strategies achieved accuracies of 80.88% and 81.34%, outperforming the best single model (SVM, 79.26%).
In terms of the features used in the classification process, the index and GLCM features ranked among the top 20 most important features across all models (SVM, ANN, RF, XGBoost, KNN, and Ensemble); these two feature types effectively contribute to improving classifier performance. These results are consistent with the previous study of [7]. Likewise, Chen et al. [46] found that incorporating indices and GLCM features alongside spectral bands improved class separability and achieved greater accuracy than using the bands alone; the optimal performance was observed with the combined “bands + indices + GLCM” feature set, highlighting that index and GLCM features significantly enhance performance regardless of the classifier used. In addition, Kwak & Park [47] showed that classification using texture features extracted from the GLCM with a larger kernel size significantly improved the classification accuracy of the support vector machine classifier compared with classification based solely on spectral information. Among the 20 most important features (Figure 6), the main index features cover the near-infrared and red-edge bands, highlighting the usefulness of these bands for crop identification. Gao et al. [48] have highlighted the importance of the near-infrared band for detecting weeds in maize crops, and Guo et al. [42] and Abdollahnejad & Panagiotidis [49] have emphasized the significance of the red-edge band in plant identification.
Analyzing the classes (Figure 9 and Table 4), cotton, maize, peanut, and soybean were predicted most accurately across all the models, compared with the other classes (soil, shrubs, infrastructure, solar panels, and road). This increased accuracy can be attributed to the phenotypes, quantities, and qualities of the sample data collected. Pádua et al. [10] reported that a small number of class-related pixels present in some objects can lead to classification errors. Cotton, maize, peanut, and soybean often exhibit stronger, more distinctive spectral signatures and more consistent phenological patterns (growth stages, leaf structure, canopy development) than the non-crop classes, which improves their separability in feature space and hence enhances classification accuracy. Yang et al. [50] and Yousaf et al. [51] found that employing high-resolution imagery and well-defined crop parcels can achieve accuracies above 90%.
The optimization trajectories (Figure 10) showed that the tested models have different convergence patterns and performance characteristics. The TPE optimization algorithm effectively explored the hyperparameter space across all models, consistently identifying configurations that optimized classification accuracy. These results indicate that machine learning-based crop-type classification methods perform well with default hyperparameters, but hyperparameter optimization can significantly improve their performance. Zhu et al. [36] likewise observed that optimizing the hyperparameters of an RF intrusion detection model achieves superior detection performance, with an accuracy of 98%.

Limitation of This Study

The choice of scale parameters in eCognition software is a crucial factor when segmenting a remote sensing image to classify objects [8]. While the results were satisfactory, the limitation of this study lies in the determination of scale parameters during image segmentation; further research into automating OBIA analysis with statistically based scale-parameter determination should therefore be considered. Additionally, this research validates the efficacy of various machine learning algorithms combined with OBIA for accurately classifying crop types using UAV multispectral images. However, there are still methods that could enhance crop-type classification accuracy, such as object-based image analysis (OBIA) combined with deep learning models (e.g., CNNs). Future research will combine OBIA with CNN to classify crop types using UAV imagery and eCognition software.

5. Conclusions

This study evaluates and compares five machine learning models and tests an Ensemble Learning method as a sixth approach, combined with object-based image analysis, for crop-type classification, addressing the limitation of previous work that relied on a single machine learning model. An automated hyperparameter optimization method for each machine learning model is proposed to improve accuracy. The results indicate that the five machine learning models (SVM, ANN, RF, XGBoost, KNN) exhibit high classification performance, with accuracies above 80% and AUC values above 0.9. Notably, SVM and ANN are the best single classifiers, with the same accuracy (94%), followed by XGBoost (93%), RF (92%), and KNN (89%). Additionally, the Ensemble Learning method (SVM + ANN), as a sixth approach, outperformed all single models, with an accuracy of 95%. The hyperparameter optimization history for these six models makes it easy to visualize the detailed performance of each model. The most important features were the index and GLCM features, followed by the spectral features. This research validates the efficacy of these six machine learning models combined with OBIA for accurately classifying crop types using UAV multispectral images. However, there are still methods that could improve crop-type classification accuracy, such as OBIA combined with deep learning models (CNNs); future research will combine OBIA with CNN to classify crop types using UAV imagery.

Author Contributions

Conceptualization, L.T., Y.L. (Yabing Li) and M.C.B.; methodology, M.C.B., A.S.R. and Y.L. (Yabing Li); software, M.C.B. and A.S.R.; validation, Y.H., Y.L. (Yaping Lei), X.Z., S.X., Y.J., S.S., Y.M. and B.Y.; formal analysis, M.C.B. and A.S.R.; investigation, Y.H., Y.L. (Yaping Lei), X.Z., S.X., Y.J., S.S. and Y.M.; resources, Y.L. (Yabing Li); data curation, M.C.B.; writing—original draft preparation, M.C.B. and A.S.R.; writing—review and editing, J.E.K., Y.H., Y.L. (Yaping Lei), X.Z., S.X., Y.J., S.S. and Y.M.; visualization, Y.L. (Yabing Li); supervision, L.T.; project administration, Y.H., Y.L. (Yaping Lei), X.Z., S.X., Y.J., S.S., Y.M. and B.Y.; funding acquisition, Y.L. (Yabing Li). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Key Research and Development Program of Xinjiang Uygur Autonomous Region of China (grant 2022B02049).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Acknowledgments

The authors would like to acknowledge the Smart Agriculture Management Team headed by Yabing Li at the Institute of Cotton Research of the Chinese Academy of Agricultural Sciences, Anyang, Henan Province, China.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN: Artificial Neural Network
KNN: K-Nearest Neighbor
SVM: Support Vector Machine
XGBoost: Extreme Gradient Boosting
RF: Random Forest
OBIA: Object-Based Image Analysis
UAV: Unmanned Aerial Vehicle
ML: Machine Learning
GLCM: Grey-Level Co-Occurrence Matrix
AUC: Area Under the Curve
ROC: Receiver Operating Characteristic
OA: Overall Accuracy
NDVI: Normalized Difference Vegetation Index
NDRE: Normalized Difference Vegetation Index of Red-Edge
GNDVI: Green Normalized Difference Vegetation Index
NIRRR: NIR-Red Ratio Vegetation Index
NIRGR: NIR-Green Ratio Vegetation Index
DVI: Difference Vegetation Index
DVIGRE: Difference Vegetation Index of Green
MSWI: Modified Shade Water Index
OSAVI: Optimized Soil-Adjusted Vegetation Index
IPVI: Infrared Percentage Vegetation Index
EVI: Enhanced Vegetation Index
BI: Brightness Index

References

  1. Wang, Y.; Zhang, Z.; Feng, L.; Du, Q.; Runge, T. Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States. Remote Sens. 2020, 12, 1232. [Google Scholar] [CrossRef]
  2. Kim, Y.; Park, N.-W.; Lee, K.-D. Self-Learning Based Land-Cover Classification Using Sequential Class Patterns from Past Land-Cover Maps. Remote Sens. 2017, 9, 921. [Google Scholar] [CrossRef]
  3. Feng, C.; Zhang, W.; Deng, H.; Dong, L.; Zhang, H.; Tang, L.; Zheng, Y.; Zhao, Z. A Combination of OBIA and Random Forest Based on Visible UAV Remote Sensing for Accurately Extracted Information about Weeds in Areas with Different Weed Densities in Farmland. Remote Sens. 2023, 15, 4696. [Google Scholar] [CrossRef]
  4. Veramendi, W.N.C.; Cruvinel, P.E. Method for maize plants counting and crop evaluation based on multispectral images analysis. Comput. Electron. Agric. 2024, 216, 108470. [Google Scholar] [CrossRef]
  5. Bai, Y.; Shi, L.; Zha, Y.; Liu, S.; Nie, C.; Xu, H.; Yang, H.; Shao, M.; Yu, X.; Cheng, M.; et al. Estimating leaf age of maize seedlings using UAV-based RGB and multispectral images. Comput. Electron. Agric. 2023, 215, 108349. [Google Scholar] [CrossRef]
  6. Liu, D.; Yang, F.; Liu, S. Estimating wheat fractional vegetation cover using a density peak k-means algorithm based on hyperspectral image data. J. Integr. Agric. 2021, 20, 2880–2891. [Google Scholar] [CrossRef]
  7. Deng, H.; Zhang, W.; Zheng, X.; Zhang, H. Crop Classification Combining Object-Oriented Method and Random Forest Model Using Unmanned Aerial Vehicle (UAV) Multispectral Image. Agriculture 2024, 14, 548. [Google Scholar] [CrossRef]
  8. Ventura, D.; Napoleone, F.; Cannucci, S.; Alleaume, S.; Valentini, E.; Casoli, E.; Burrascano, S. Integrating low-altitude drone based-imagery and OBIA for mapping and manage semi natural grassland habitats. J. Environ. Manag. 2022, 321, 115723. [Google Scholar] [CrossRef]
  9. Prince, A.; Franssen, J.; Lapierre, J.-F.; Maranger, R. High-resolution broad-scale mapping of soil parent material using object-based image analysis (OBIA) of LiDAR elevation data. CATENA 2020, 188, 104422. [Google Scholar] [CrossRef]
  10. Pádua, L.; Matese, A.; Di Gennaro, S.F.; Morais, R.; Peres, E.; Sousa, J.J. Vineyard classification using OBIA on UAV-based RGB and multispectral data: A case study in different wine regions. Comput. Electron. Agric. 2022, 196, 106905. [Google Scholar] [CrossRef]
  11. Rodriguez Gonzalez, C.; Guzman, C.; Andreo, V. Using VHR satellite imagery, OBIA and landscape metrics to improve mosquito surveillance in urban areas. Ecol. Inform. 2023, 77, 102221. [Google Scholar] [CrossRef]
  12. Ameslek, O.; Zahir, H.; Latifi, H.; Bachaoui, E.M. Combining OBIA, CNN, and UAV imagery for automated detection and mapping of individual olive trees. Smart Agric. Technol. 2024, 9, 100546. [Google Scholar] [CrossRef]
  13. Anderson, C.J.; Heins, D.; Pelletier, K.C.; Knight, J.F. Improving Machine Learning Classifications of Phragmites australis Using Object-Based Image Analysis. Remote Sens. 2023, 15, 989. [Google Scholar] [CrossRef]
  14. Garg, R.; Kumar, A.; Prateek, M.; Pandey, K.; Kumar, S. Land cover classification of spaceborne multifrequency SAR and optical multispectral data using machine learning. Adv. Space Res. 2022, 69, 1726–1742. [Google Scholar] [CrossRef]
  15. Wu, F.; Qiu, Y.; Huang, W.; Guo, S.; Han, Y.; Wang, G.; Li, X.; Lei, Y.; Yang, B.; Xiong, S.; et al. Water and heat resource utilization of cotton under different cropping patterns and their effects on crop biomass and yield formation. Agric. For. Meteorol. 2022, 323, 109091. [Google Scholar] [CrossRef]
  16. Gao, H.; He, L.; He, Z.; Bai, W. Early landslide mapping with slope units division and multi-scale object-based image analysis —A case study in the Xianshui River basin of Sichuan, China. J. Mt. Sci. 2022, 19, 1618–1632. [Google Scholar] [CrossRef]
  17. Ye, Z.; Yang, K.; Lin, Y.; Guo, S.; Sun, Y.; Chen, X.; Lai, R.; Zhang, H. A comparison between Pixel-based deep learning and Object-based image analysis (OBIA) for individual detection of cabbage plants based on UAV Visible-light images. Comput. Electron. Agric. 2023, 209, 107822. [Google Scholar] [CrossRef]
  18. Yang, K.; Zhang, H.; Wang, F.; Lai, R. Extraction of Broad-Leaved Tree Crown Based on UAV Visible Images and OBIA-RF Model: A Case Study for Chinese Olive Trees. Remote Sens. 2022, 14, 2469. [Google Scholar] [CrossRef]
  19. Holland, K.H.; Lamb, D.W.; Schepers, J.S. Radiometry of Proximal Active Optical Sensors (AOS) for Agricultural Sensing. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 1793–1802. [Google Scholar] [CrossRef]
  20. Barnes, E.M.; Clarke, T.R.; Richards, S.E.; Colaizzi, P.D.; Haberland, J.; Kostrzewski, M.; Waller, P.; Choi, C.; Riley, E.; Thompson, T.; et al. Coincident Detection of Crop Water Stress, Nitrogen Status and Canopy Density Using Ground-Based Multispectral Data. In Proceedings of the Fifth International Conference on Precision Agriculture, Bloomington, MN, USA, 16–19 July 2000; Volume 1619. [Google Scholar]
  21. Gitelson, A.A.; Kaufman, Y.J.; Merzlyak, M.N. Use of a green channel in remote sensing of global vegetation from EOS-MODIS. Remote Sens. Environ. 1996, 58, 289–298. [Google Scholar] [CrossRef]
  22. Jordan, C.F. Derivation of Leaf-Area Index from Quality of Light on the Forest Floor. Ecology 1969, 50, 663–666. [Google Scholar] [CrossRef]
  23. Vinciková, H.; Hais, M.; Brom, J.; Procházka, J.; Pecharová, E. Use of remote sensing methods in studying agricultural landscapes—A review. J. Landsc. Stud. 2010, 3, 53–63. [Google Scholar]
  24. Merzlyak, M.N.; Gitelson, A.A.; Chivkunova, O.B.; Rakitin, V.Y. Non-destructive optical detection of pigment changes during leaf senescence and fruit ripening. Physiol. Plant. 1999, 106, 135–141. [Google Scholar] [CrossRef]
  25. Rondeaux, G.; Steven, M.; Baret, F. Optimization of soil-adjusted vegetation indices. Remote Sens. Environ. 1996, 55, 95–107. [Google Scholar] [CrossRef]
  26. Kandare, K.; Ørka, H.O.; Dalponte, M.; Næsset, E.; Gobakken, T. Individual tree crown approach for predicting site index in boreal forests using airborne laser scanning and hyperspectral data. Int. J. Appl. Earth Obs. Geoinf. 2017, 60, 72–82. [Google Scholar] [CrossRef]
  27. Huete, A.; Didan, K.; Miura, T.; Rodriguez, E.P.; Gao, X.; Ferreira, L.G. Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sens. Environ. 2002, 83, 195–213. [Google Scholar] [CrossRef]
  28. Khan, N.M.; Rastoskuev, V.V.; Sato, Y.; Shiozawa, S. Assessment of hydrosaline land degradation by using a simple approach of remote sensing indicators. Agric. Water Manag. 2005, 77, 96–109. [Google Scholar] [CrossRef]
  29. Tendolkar, A.; Choraria, A.; Manohara Pai, M.M.; Girisha, S.; Dsouza, G.; Adithya, K.S. Modified crop health monitoring and pesticide spraying system using NDVI and Semantic Segmentation: An AGROCOPTER based approach. In Proceedings of the 2021 IEEE International Conference on Autonomous Systems (ICAS), Montreal, QC, Canada, 11–13 August 2021; pp. 1–5. [Google Scholar] [CrossRef]
  30. Natividade, J.; Prado, J.; Marques, L. Low-cost multi-spectral vegetation classification using an Unmanned Aerial Vehicle. In Proceedings of the 2017 IEEE International Conference on Autonomous Robot Systems and Competitions (ICARSC), Coimbra, Portugal, 26–28 April 2017; pp. 336–342. [Google Scholar] [CrossRef]
  31. Pérez-Ortiz, M.; Peña, J.M.; Gutiérrez, P.A.; Torres-Sánchez, J.; Hervás-Martínez, C.; López-Granados, F. A semi-supervised system for weed mapping in sunflower crops using unmanned aerial vehicles and a crop row detection method. Appl. Soft Comput. 2015, 37, 533–544. [Google Scholar] [CrossRef]
  32. César Pereira Júnior, P.; Monteiro, A.; Da Luz Ribeiro, R.; Sobieranski, A.C.; Von Wangenheim, A. Comparison of Supervised Classifiers and Image Features for Crop Rows Segmentation on Aerial Images. Appl. Artif. Intell. 2020, 34, 271–291. [Google Scholar] [CrossRef]
  33. Murthy, C.S.; Raju, P.V.; Badrinath, K.V.S. Classification of wheat crop with multi-temporal images: Performance of maximum likelihood and artificial neural networks. Int. J. Remote Sens. 2003, 24, 4871–4890. [Google Scholar] [CrossRef]
  34. Wang, H.; Zhang, J.; Xiang, K.; Liu, Y. Classification of Remote Sensing Agricultural Image by Using Artificial Neural Network. In Proceedings of the 2009 International Workshop on Intelligent Systems and Applications, Wuhan, China, 25–26 April 2009; pp. 1–4. [Google Scholar] [CrossRef]
  35. Sarkar, S.; Zhou, J.; Scaboo, A.; Zhou, J.; Aloysius, N.; Lim, T.T. Assessment of Soybean Lodging Using UAV Imagery and Machine Learning. Plants 2023, 12, 2893. [Google Scholar] [CrossRef]
  36. Zhu, N.; Zhu, C.; Zhou, L.; Zhu, Y.; Zhang, X. Optimization of the Random Forest Hyperparameters for Power Industrial Control Systems Intrusion Detection Using an Improved Grid Search Algorithm. Appl. Sci. 2022, 12, 10456. [Google Scholar] [CrossRef]
  37. Lottes, P.; Khanna, R.; Pfeifer, J.; Siegwart, R.; Stachniss, C. UAV-based crop and weed classification for smart farming. In Proceedings of the 2017 IEEE International Conference on Robotics and Automation (ICRA), Singapore, 29 May–3 June 2017; pp. 3024–3031. [Google Scholar] [CrossRef]
  38. Cisty, M.; Soldanova, V. Flow Prediction Versus Flow Simulation Using Machine Learning Algorithms. In Machine Learning and Data Mining in Pattern Recognition; Perner, P., Ed.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2018; Volume 10935, pp. 369–382. ISBN 978-3-319-96132-3. [Google Scholar]
  39. Uddin, S.; Haque, I.; Lu, H.; Moni, M.A.; Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 2022, 12, 6256. [Google Scholar] [CrossRef] [PubMed]
  40. Rodriguez-Garlito, E.C.; Paz-Gallardo, A. Efficiently Mapping Large Areas of Olive Trees Using Drones in Extremadura, Spain. IEEE J. Miniat. Air Space Syst. 2021, 2, 148–156. [Google Scholar] [CrossRef]
  41. Rocha, B.M.; Da Silva Vieira, G.; Fonseca, A.U.; Pedrini, H.; De Sousa, N.M.; Soares, F. Evaluation and Detection of Gaps in Curved Sugarcane Planting Lines in Aerial Images. In Proceedings of the 2020 IEEE Canadian Conference on Electrical and Computer Engineering (CCECE), London, ON, Canada, 30 August–2 September 2020; pp. 1–4. [Google Scholar] [CrossRef]
  42. Guo, Q.; Zhang, J.; Guo, S.; Ye, Z.; Deng, H.; Hou, X.; Zhang, H. Urban Tree Classification Based on Object-Oriented Approach and Random Forest Algorithm Using Unmanned Aerial Vehicle (UAV) Multispectral Imagery. Remote Sens. 2022, 14, 3885. [Google Scholar] [CrossRef]
  43. Li, Z.; Ding, J.; Zhang, H.; Feng, Y. Classifying Individual Shrub Species in UAV Images—A Case Study of the Gobi Region of Northwest China. Remote Sens. 2021, 13, 4995. [Google Scholar] [CrossRef]
  44. Lei, T.C.; Wan, S.; Wu, S.-C.; Wang, H.-P. A New Approach of Ensemble Learning Technique to Resolve the Uncertainties of Paddy Area through Image Classification. Remote Sens. 2020, 12, 3666. [Google Scholar] [CrossRef]
  45. Hao, P.; Wang, L.; Niu, Z. Comparison of Hybrid Classifiers for Crop Classification Using Normalized Difference Vegetation Index Time Series: A Case Study for Major Crops in North Xinjiang, China. PLoS ONE 2015, 10, e0137748. [Google Scholar] [CrossRef]
  46. Chen, Q.; Shen, C.; Du, H.; Tang, D. Comparing supervised classification algorithm–feature combinations for Spartina alterniflora extraction: A case study in Zhanjiang, China. Front. Remote Sens. 2025, 6, 1606549. [Google Scholar] [CrossRef]
  47. Kwak, G.-H.; Park, N.-W. Impact of Texture Information on Crop Classification with Machine Learning and UAV Images. Appl. Sci. 2019, 9, 643. [Google Scholar] [CrossRef]
  48. Gao, J.; Nuyttens, D.; Lootens, P.; He, Y.; Pieters, J.G. Recognising weeds in a maize crop using a random forest machine-learning algorithm and near-infrared snapshot mosaic hyperspectral imagery. Biosyst. Eng. 2018, 170, 39–50. [Google Scholar] [CrossRef]
  49. Abdollahnejad, A.; Panagiotidis, D. Tree Species Classification and Health Status Assessment for a Mixed Broadleaf-Conifer Forest with UAS Multispectral Imaging. Remote Sens. 2020, 12, 3722. [Google Scholar] [CrossRef]
  50. Yang, R.; Qi, Y.; Zhang, H.; Wang, H.; Zhang, J.; Ma, X.; Zhang, J.; Ma, C. A Study on the Object-Based High-Resolution Remote Sensing Image Classification of Crop Planting Structures in the Loess Plateau of Eastern Gansu Province. Remote Sens. 2024, 16, 2479. [Google Scholar] [CrossRef]
  51. Yousaf, W.; Ahmad, S.R.; Shahzad, N.; Ramzan, A.; Javaid, A. An Object-Based Crop Classification Using Optimum Remotely Sensed Phenological and Multi-Spectral Data in Pakistan. Remote Sens. Earth Syst. Sci. 2025, 8, 945–964. [Google Scholar] [CrossRef]
Figure 1. The geographical location of the study area.
Figure 2. The framework of OBIA-ML using a UAV multispectral image.
Figure 3. UAV Matrice 350 RTK equipped with a multispectral camera.
Figure 4. Spatial distribution of the classes.
Figure 5. Confusion matrices of six machine learning models: (a) SVM, (b) ANN, (c) RF, (d) XGBoost, (e) KNN, (f) Ensemble.
Figure 6. Top 20 most important features for each model: (a) SVM, (b) ANN, (c) RF, (d) XGBoost, (e) KNN, (f) Ensemble.
Figure 7. Receiver operating characteristic (ROC) curve analysis; AUC denotes the area under the ROC curve.
Figure 8. Performance metrics comparison across models.
Figure 9. F1-score comparison across classes by model.
Figure 10. Model performance during hyperparameter optimization: (a) Random Forest, (b) XGBoost, (c) SVM, (d) ANN, (e) KNN, (f) Ensemble.
Figure 11. Classification results based on 64 features using the Ensemble Model.
Table 1. Formulas of the selected features.

| Feature Type | Feature Name | Formula ¹ | Reference |
|---|---|---|---|
| Spectral | Blue (B), green (G), red (R), red-edge (RE), and near-infrared (NIR) bands; the mean and standard deviation of each band; the maximum difference; and the total brightness | — (raw band statistics; no index formula) | [7] |
| Index | NDVI | (NIR − R)/(NIR + R) | [19] |
| | NDRE | (NIR − RE)/(NIR + RE) | [20] |
| | GNDVI | (NIR − G)/(NIR + G) | [21] |
| | NIRRR | NIR/R | [22] |
| | NIRGR | NIR/G | [23] |
| | DVI | NIR − R | [24] |
| | DVIGRE | NIR − G | [7] |
| | MSWI | (B − NIR)/NIR | [7] |
| | OSAVI | (NIR − R)/(NIR + R + 0.16) | [25] |
| | IPVI | NIR/(NIR + R) | [26] |
| | EVI | 2.5 × (NIR − R)/(NIR + 6 × R − 7.5 × B + 1) | [27] |
| | BI | (R² + NIR²)^0.5 | [28] |
| GLCM | Angular second moment, contrast, correlation, dissimilarity, entropy, homogeneity, mean, and standard deviation, each in five directions (0°, 45°, 90°, 135°, and all directions) | Computed automatically in eCognition Developer | |

¹ B, G, R, RE, and NIR represent the blue, green, red, red-edge, and near-infrared bands, respectively.
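The index formulas in Table 1 reduce to simple band arithmetic. As a minimal illustrative sketch (not the authors' code), assuming each band is a NumPy reflectance array and using an added `eps` safeguard that is not part of the published formulas, a few of the indices could be computed as:

```python
import numpy as np

def vegetation_indices(blue, green, red, red_edge, nir, eps=1e-9):
    """Selected Table 1 indices from per-band reflectance arrays.

    `eps` guards divisions against zero-reflectance pixels (an added
    safeguard, not part of the published formulas).
    """
    return {
        "NDVI":  (nir - red) / (nir + red + eps),
        "NDRE":  (nir - red_edge) / (nir + red_edge + eps),
        "GNDVI": (nir - green) / (nir + green + eps),
        "OSAVI": (nir - red) / (nir + red + 0.16),
        "EVI":   2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1 + eps),
        "BI":    np.sqrt(red**2 + nir**2),
    }

# Toy example on a 2 x 2 patch; real inputs would be per-object band values.
b, g, r, re_, nir = (np.full((2, 2), v) for v in (0.05, 0.10, 0.08, 0.30, 0.45))
print(vegetation_indices(b, g, r, re_, nir)["NDVI"])
```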
Table 2. Training and testing samples for various classes.

| Class Name | Total Samples | Training Samples | Testing Samples |
|---|---|---|---|
| cotton | 3490 | 2443 | 1047 |
| infrastructure | 78 | 55 | 23 |
| maize | 2975 | 2083 | 893 |
| peanut | 1624 | 1137 | 487 |
| road | 55 | 39 | 17 |
| shrub | 48 | 34 | 14 |
| soil | 517 | 362 | 155 |
| solar panels | 11 | 8 | 3 |
| soybean | 3476 | 2433 | 1043 |
| Total | 12,274 | | |
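The per-class counts in Table 2 correspond to roughly a 70/30 train/test split within each class. Below is a hedged sketch of how such a class-stratified split can be reproduced with scikit-learn; the synthetic data are a stand-in for the real object-level feature table, and the seed is illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the object-level feature table (one row per segment).
X, y = make_classification(n_samples=12274, n_features=64, n_informative=20,
                           n_classes=9, n_clusters_per_class=1, random_state=0)

# ~70/30 split stratified by class, mirroring the proportions in Table 2.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)
print(len(y_train), len(y_test))  # roughly 70% / 30% of 12,274
```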
Table 3. Number of top 20 features by model.

| Feature | SVM | ANN | RF | XGBoost | KNN | Ensemble |
|---|---|---|---|---|---|---|
| Spectral | 1 | 1 | 1 | 1 | 1 | 1 |
| Index | 7 | 6 | 11 | 10 | 10 | 6 |
| GLCM | 12 | 13 | 8 | 9 | 9 | 13 |
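Table 3 tallies, for each model, how many of its 20 most important features are spectral, index, or GLCM features. One way to produce such a tally (an illustration under stated assumptions, not necessarily the authors' exact procedure) is to rank features by permutation importance and count the top 20 by type; the 11/13/40 split of feature types below is purely illustrative.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic 64-feature stand-in; the type assigned to each column is assumed.
X, y = make_classification(n_samples=500, n_features=64, n_informative=20,
                           random_state=0)
feature_types = np.array(["Spectral"] * 11 + ["Index"] * 13 + ["GLCM"] * 40)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
imp = permutation_importance(rf, X_te, y_te, n_repeats=5, random_state=0)

top20 = np.argsort(imp.importances_mean)[::-1][:20]  # indices of the top 20
print(Counter(feature_types[top20]))                 # count per feature type
```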
Table 4. Precision, recall, and F1-score across classes by model (P = precision, R = recall).

| Class | SVM P | SVM R | SVM F1 | ANN P | ANN R | ANN F1 | RF P | RF R | RF F1 |
|---|---|---|---|---|---|---|---|---|---|
| cotton | 0.94 | 0.95 | 0.95 | 0.95 | 0.96 | 0.95 | 0.92 | 0.94 | 0.93 |
| infrastructure | 0.88 | 0.88 | 0.88 | 0.96 | 0.92 | 0.94 | 0.91 | 0.96 | 0.94 |
| maize | 0.94 | 0.94 | 0.94 | 0.95 | 0.95 | 0.95 | 0.92 | 0.92 | 0.92 |
| peanut | 0.98 | 0.95 | 0.97 | 0.96 | 0.96 | 0.96 | 0.96 | 0.92 | 0.94 |
| road | 0.91 | 0.89 | 0.90 | 0.96 | 0.93 | 0.94 | 0.94 | 0.91 | 0.93 |
| shrub | 0.81 | 0.60 | 0.69 | 0.71 | 0.60 | 0.65 | 1.00 | 0.23 | 0.37 |
| soil | 0.95 | 0.96 | 0.95 | 0.95 | 0.96 | 0.96 | 0.93 | 0.96 | 0.94 |
| solar panels | 0.88 | 0.64 | 0.74 | 0.82 | 0.82 | 0.82 | 1.00 | 0.64 | 0.78 |
| soybean | 0.92 | 0.93 | 0.92 | 0.93 | 0.93 | 0.93 | 0.89 | 0.89 | 0.89 |

| Class | XGBoost P | XGBoost R | XGBoost F1 | KNN P | KNN R | KNN F1 | Ensemble P | Ensemble R | Ensemble F1 |
|---|---|---|---|---|---|---|---|---|---|
| cotton | 0.94 | 0.95 | 0.94 | 0.91 | 0.91 | 0.91 | 0.93 | 0.95 | 0.94 |
| infrastructure | 0.92 | 0.95 | 0.94 | 0.94 | 0.92 | 0.93 | 0.91 | 0.95 | 0.93 |
| maize | 0.93 | 0.92 | 0.93 | 0.91 | 0.87 | 0.89 | 0.93 | 0.92 | 0.93 |
| peanut | 0.96 | 0.94 | 0.95 | 0.97 | 0.89 | 0.93 | 0.96 | 0.93 | 0.95 |
| road | 0.96 | 0.93 | 0.94 | 0.90 | 0.80 | 0.85 | 0.96 | 0.91 | 0.93 |
| shrub | 0.86 | 0.52 | 0.65 | 0.78 | 0.15 | 0.25 | 0.95 | 0.44 | 0.60 |
| soil | 0.94 | 0.97 | 0.95 | 0.89 | 0.95 | 0.91 | 0.93 | 0.97 | 0.95 |
| solar panels | 1.00 | 0.36 | 0.53 | 1.00 | 0.27 | 0.43 | 1.00 | 0.55 | 0.71 |
| soybean | 0.90 | 0.91 | 0.91 | 0.83 | 0.89 | 0.89 | 0.90 | 0.91 | 0.91 |
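Per-class precision, recall, and F1-scores like those in Table 4 can be read directly off a scikit-learn classification report. The sketch below also builds a soft-voting SVM + ANN ensemble analogous to the paper's sixth approach; the synthetic data and hyperparameters are illustrative stand-ins, not the study's configuration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the object-level feature table.
X, y = make_classification(n_samples=1000, n_features=20, n_classes=3,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y,
                                          random_state=0)

# Soft voting averages the SVM and ANN class-probability estimates.
ensemble = VotingClassifier(
    estimators=[("svm", SVC(probability=True, random_state=0)),
                ("ann", MLPClassifier(max_iter=1000, random_state=0))],
    voting="soft").fit(X_tr, y_tr)

# Per-class precision, recall, and F1, as tabulated in Table 4.
print(classification_report(y_te, ensemble.predict(X_te), digits=2))
```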