Distinguishing Planting Structures of Different Complexity from UAV Multispectral Images

This study explores the classification potential of a multispectral classification model for farmland with planting structures of different complexity. Unmanned aerial vehicle (UAV) remote sensing technology is used to obtain multispectral images of three study areas with low-, medium-, and high-complexity planting structures, containing three, five, and eight types of crops, respectively. The feature subsets of the three study areas are selected by recursive feature elimination (RFE). Object-oriented random forest (OB-RF) and object-oriented support vector machine (OB-SVM) classification models are established for the three study areas. After training the models with the feature subsets, the classification results are evaluated using a confusion matrix. The OB-RF and OB-SVM models' classification accuracies are 97.09% and 99.13%, respectively, for the low-complexity planting structure. The equivalent values are 92.61% and 99.08% for the medium-complexity planting structure and 88.99% and 97.21% for the high-complexity planting structure. As the planting structure complexity increased from low to high, both models' overall accuracy decreased: that of the OB-RF model fell by 8.1%, whereas that of the OB-SVM model fell by only 1.92%. For farmland with fragmentary plots and a high-complexity planting structure, OB-SVM achieved an overall classification accuracy of 97.21% and a single-crop extraction accuracy of at least 85.65%. Therefore, UAV multispectral remote sensing can be used for classification applications in highly complex planting structures.


Introduction
According to statistics published by the United Nations, the world population is expected to reach about 10 billion by 2050 [1,2]. Population expansion brings new challenges to the maintenance of food production security. Mastering the area and spatial distribution of regional crops is the prerequisite for accurately obtaining regional crop yields and rationally allocating regional water resources. However, smallholders and family farms, which still prevail in some developing countries, are responsible for a large share of the world's food production. The scattered farmland and discrete crops of smallholders make cropland mapping and monitoring more difficult, affecting the accurate estimation of regional crop yields and the rational allocation of water resources. The emergence of remote sensing technology has moved agricultural production and research from the traditional stage to a stage of refinement, quantification, and mechanistic understanding. High-quality remote sensing images, especially those of high resolution, can be used to extract feature information from the ground, making the fine classification and monitoring of ground details possible [3,4]. Agricultural information at the farmland scale can be applied directly.

In general, the information extraction of single farmland features based on UAV remote sensing data is relatively mature, and the extraction accuracy is high. However, there is still some confusion in the classification of many kinds of crops. In addition, multiple-crop classification has mainly focused on three to five different crops, and there have been few comparative studies in which the cropping structures have different levels of complexity. Therefore, this paper describes the use of UAV remote sensing technology to classify farmland features in study areas with different levels of planting structure complexity.
The aims of this study are as follows: (1) explore the applicability of UAV multispectral remote sensing recognition algorithms for farmland feature classification with planting structures that have different degrees of complexity; and (2) analyze the potential for UAV multispectral remote sensing technology to be used for complex planting structure extraction.

Overview of the Study Area
The study areas are located in Wuyuan County, part of the Inner Mongolia Autonomous Region of China, which has a typical mid-temperate continental monsoon climate. The geographic location of the study areas is shown in Figure 1. The study areas are arid and receive plenty of sunshine. The annual rainfall is only 130-285 mm, and the annual total solar radiation is as high as 6424.23 MJ·m⁻². The rich water resources in these areas benefit from the Yellow River diversion irrigation system and can completely satisfy the needs of local crops. This study considers three areas in Wuyuan County. Study area 1 (SA1) is in Taerhu (49.99°N, 107.83°E), study area 2 (SA2) is in Fuxing (41.12°N, 107.96°E), and study area 3 (SA3) is in Yindingtu (41.18°N, 107.84°E). SA1 contains three types of crops (corn, sunflower, and zucchini), and is selected as a district with low planting structure complexity. SA2 contains five types of crops (corn, sunflower, zucchini, hami melon, and pepper), and is selected as a district with medium planting structure complexity. SA3 contains eight types of crops (sunflower, zucchini, hami melon, pepper, sapling, watermelon, cherry tomato, and tomato), and is selected as a district with high planting structure complexity. During the experimental period, the corn was in the jointing stage, the sunflower was in the budding stage, and the zucchini, hami melon, pepper, watermelon, cherry tomato, and tomato were all in the fruiting stage.

The Collection of UAV Remote Sensing Data
An information collection system based on a UAV (S900, DJI Technology Co., Shenzhen, China) was adopted to collect the multispectral remote sensing images. The system integrated UAV flight control and the corresponding position and orientation system (POS) data acquisition, and could stably obtain UAV multispectral images without distortion. The multispectral camera (MicaSense RedEdge-M, MicaSense, Seattle, WA, USA) could obtain red, green, blue, near-infrared, and red edge band data. Detailed information on the UAV and multispectral camera is presented in Table 1. The spectral characteristics of crops vary significantly under different phenological periods and light conditions. The UAV remote sensing tests were conducted on 26 and 29 July and 1 August 2020, days with similar meteorological conditions. The meteorological data obtained from the local weather bureau were the average values from 11 a.m. to 2 p.m. during the test period (shown in Table 2). The three experimental days were sunny, with low wind speed, few air pollutants, and high illuminance, all suitable for UAV flight operations. The flight altitude was set to 100 m above the ground, the forward overlap was 70%, and the side overlap was 65%. The RAW format images were exported and converted to TIFF format using the PixelWrench2 software supplied with the camera. The spectral reflectivity was calculated using ENVI (v. 5.1, Exelis Visual Information Solutions, Boulder, CO, USA) combined with the standard whiteboard data. Ground control points (GCPs) are vital in verifying the accuracy of terrain information obtained by UAV. The 3D coordinates of the GCPs in this study were accurately measured by a real-time kinematic (RTK) product (Zhonghui i50, CHCNAV, Shanghai, China), which has a high precision of 2 mm. According to the actual terrain conditions and the control point layout principle, six GCPs were selected in each study area.
Among them, the three odd-numbered points were used as calibration points and the three even-numbered points were used as check points. The control points were set at the intersections of hardened roads; they were easy to distinguish and had good stability. Images were stitched using Pix4DMapper (v. 3.2, Pix4D, Prilly, Switzerland) based on the TIFF multispectral images and POS data collected by the UAV remote sensing system. In this study, 1540 multispectral remote sensing images were obtained from the three study areas. The data contained grayscale information for the red, blue, green, near-infrared, and red edge bands, and the spatial resolution was 7 cm. The mosaicked UAV images of the study areas are shown in Figure 2.
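The whiteboard-based reflectance calibration described above amounts to scaling each band's digital numbers by the known panel reflectance. A minimal sketch of that scaling is shown below; the function name, the example digital numbers, and the panel reflectance value are illustrative assumptions, not values from the study.

```python
import numpy as np

def dn_to_reflectance(image_dn, panel_dn, panel_reflectance=0.99):
    """Scale raw digital numbers (DN) to reflectance for one band, using the
    mean DN of the standard whiteboard imaged under the same illumination.
    The default panel reflectance of 0.99 is an illustrative assumption."""
    return np.asarray(image_dn, dtype=float) * (panel_reflectance / panel_dn)

# Hypothetical DN values for one band; the panel averaged 4000 DN here.
band = dn_to_reflectance([1000, 2000, 3000], panel_dn=4000, panel_reflectance=0.8)
```

In practice this is done per band, since illumination and sensor response differ across the five channels.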

The Collection of Ground Data
During the experiment, we collected the ground distribution data and ground spectral data of crops. The ground distribution data of crops is the basis for selecting training samples and verification samples. This can help to evaluate the classification results visually. Ground spectral data of crops can help us explore the differences in crop spectral characteristics better, provide a theoretical basis for the classification results, and analyze the error sources in the classification results effectively.

The Ground Distribution Data of Crops
The types of crops were determined based on field surveys, and the location of each crop was recorded using portable RTK in units of plots. Combining the ground data and UAV images, the ground crop distribution maps (Figure 3) were drawn.

Crop-Ground Spectral Curves
The crop-ground spectral curves in this study were obtained with a FieldSpec HandHeld spectrometer (ASD, Westborough, CO, USA) on a sunny day (1 August, 11:00-14:00). The specific parameters of the FieldSpec HandHeld are shown in Table 3. As shown in Figure 4, the field experimenters had to wear dark clothes and face the sun when collecting ground spectral data. First, the optical fiber probe was aimed at the whiteboard for correction, and then at the vegetation canopy to collect reflectance spectra. Six samples were randomly selected from each type of crop, and ten spectral curves were measured for each plant sample; these were arithmetically averaged to obtain the final reflectance spectral data of the sample.
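The averaging step above (ten curves per plant sample, six samples per crop) can be sketched as a simple array reduction. The array shape and the 751-point wavelength grid below are illustrative assumptions, not the spectrometer's actual output format.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Synthetic stand-in for the measurements of one crop:
# 6 plant samples × 10 spectral curves each × 751 wavelength points
# (a hypothetical 1 nm grid; the real instrument grid may differ).
curves = rng.uniform(0.05, 0.60, size=(6, 10, 751))

# Ten curves per plant sample are arithmetically averaged, as in the text,
# yielding one reflectance spectrum per sample.
sample_spectra = curves.mean(axis=1)            # shape (6, 751)

# A per-crop mean curve can then be taken over the six samples.
crop_mean_spectrum = sample_spectra.mean(axis=0)  # shape (751,)
```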

Research Procedure and Method
The workflow of planting structure extraction is shown in Figure 5. There are seven main stages: (1) the acquisition and preprocessing of UAV remote sensing images, including the construction of the UAV multispectral system, the selection of an aerial photography path, and the stitching and geometric correction of orthophoto images; (2) the collection of ground data, including the investigation of the true distribution of crops on the ground and the collection of crop-ground spectral curves; (3) the selection of training and verification samples of UAV multispectral images; (4) multiscale segmentation of UAV images; (5) the extraction of features and the determination of feature subsets, including the extraction of spectral features and texture features, and the selection of the best feature band based on recursive feature elimination (RFE); (6) the use of object-oriented RF (OB-RF) and object-oriented SVM (OB-SVM) classification models to classify farmland crops; (7) the use of confusion matrices to evaluate and compare the classification accuracy of each model.

Sample Selection
The types of crops in the three study areas were determined through field research, and RTK was used to calibrate each crop's geographic location. We randomly generated samples based on the ground standard crop distribution maps (Figure 3). In the three study areas, the reference samples were randomly split into two disjoint sets of training samples (TS) and validation samples (VS) via the sample function in R (v. 4.0.3). The selection results of the samples are shown in Table 4.
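The random TS/VS split performed with R's sample function can be mirrored in Python as below. The 70/30 split fraction and the seed are assumptions for illustration; the study does not state its split ratio.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility

def split_samples(n_samples, train_fraction=0.7):
    """Randomly split sample indices into disjoint training (TS) and
    validation (VS) index sets, analogous to R's sample(); the 70/30
    fraction is an assumption, not taken from the text."""
    order = rng.permutation(n_samples)
    n_train = int(round(n_samples * train_fraction))
    return order[:n_train], order[n_train:]

ts_idx, vs_idx = split_samples(100)
```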

Construction of Spectral Features and Texture Features
Vegetation indices can magnify the spectral information between ground objects, and are one of the simplest and most effective means of studying vegetation characteristics. In this study, eight common vegetation indices were obtained from band calculations (shown in Table 5): the normalized difference vegetation index (NDVI) [33], the ratio vegetation index (RVI) [34], the difference vegetation index (DVI) [35], excess green (EXG) [36], the visible-band difference vegetation index (VDVI) [37], the normalized green-blue difference index (NGBDI) [38], the normalized green-red difference index (NGRDI) [39], and the Woebbecke index (WI) [40]. Texture features can reflect the homogeneity characteristics of images, and are unaffected by image color and brightness. The common texture features include the mean, variance, homogeneity, contrast, dissimilarity, information entropy, second moment, and correlation. This study obtained 40 texture features of crops in five bands (red, green, blue, near-infrared, red edge) by applying co-occurrence measures, which calculate texture values using the grey tone spatial dependence matrix. This process was implemented in ENVI 5.1 (Exelis Visual Information Solutions, Boulder, CO, USA). The size of the filtering window was 3×3, the default value in ENVI.
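Several of the Table 5 indices are simple band arithmetic and can be sketched directly from their standard definitions. The sketch below covers five of the eight indices (the VDVI, NGBDI, and WI formulas are omitted); the small epsilon guard is an implementation choice, not part of the published formulas.

```python
import numpy as np

def vegetation_indices(red, green, blue, nir):
    """Compute five of the common band-ratio vegetation indices from
    reflectance arrays of equal shape; eps avoids division by zero."""
    red, green, blue, nir = (np.asarray(b, dtype=float)
                             for b in (red, green, blue, nir))
    eps = 1e-12
    return {
        "NDVI": (nir - red) / (nir + red + eps),    # normalized difference VI
        "RVI": nir / (red + eps),                    # ratio VI
        "DVI": nir - red,                            # difference VI
        "EXG": 2.0 * green - red - blue,             # excess green
        "NGRDI": (green - red) / (green + red + eps) # norm. green-red diff.
    }

# Hypothetical single-pixel reflectances for a green canopy.
vi = vegetation_indices(red=0.10, green=0.15, blue=0.08, nir=0.50)
```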

Screening of Characteristic Parameters
To improve the operation speed and prediction accuracy of the model and avoid overfitting, the feature parameters of the images were screened to eliminate features with a low correlation to the model prediction. Recursive feature elimination (RFE) is an efficient algorithm that combines classifiers to find the optimal feature subset [41]. It builds the model repeatedly, retaining the best features or removing the worst features in each iteration, and in subsequent iterations builds the next model from the features not yet eliminated, until all features are exhausted. Finally, RFE ranks the features according to the order in which they were retained or removed, and selects the best subset. This study performed feature optimization based on the RFE module in scikit-learn, a free machine learning library for the Python programming language. The RF classifier was used to evaluate the RFE model, and ten-fold cross-validation was adopted to evaluate the model parameters' accuracy.
RFE was used to screen features in five spectral bands, seven vegetation indices, and 40 texture features. The importance rankings of the features are shown in Tables A1-A3. The features ranked first, second, third (and so on) in their corresponding feature sets were denoted as B1, B2, B3 (and so on). The feature parameters were accumulated one by one in order of importance, and the images were pre-classified based on the accumulated feature subset. The classification accuracy is shown in Figures A1-A3. According to the importance rankings of all features and the pre-classification results in Figures A1-A3, the feature subset was constructed by retaining the features that contributed significantly to the classification and eliminating those that contributed little or nothing. The final filtering results are presented in Table 6.
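The screening procedure above can be sketched with scikit-learn's cross-validated RFE, using an RF estimator and ten-fold cross-validation as in the study. The synthetic data below merely stands in for the image feature table; its dimensions and seed are assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV

# Synthetic stand-in for the candidate image features
# (bands + vegetation indices + texture features).
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

rfe = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,                 # eliminate one feature per iteration
    cv=10,                  # ten-fold cross-validation, as in the study
    scoring="accuracy",
)
rfe.fit(X, y)

best_subset_size = rfe.n_features_  # size of the selected feature subset
ranking = rfe.ranking_              # 1 = retained; larger = eliminated earlier
```

The `ranking_` array corresponds to the importance ordering used to build the accumulated feature subsets (B1, B2, B3, and so on).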

Multiresolution Segmentation
OBIA makes full use of the spatial, textural, contextual, and other geometric features and structural information of remote sensing images. It is superior to pixel-oriented analysis for crop extraction because it efficiently solves the problems of "same substance with a different spectrum", "same spectrum with a foreign substance", and the "salt and pepper effect" [24]. OBIA uses an iterative algorithm to segment remote sensing images into uniform and continuous image objects. OBIA mainly has two independent modules: object generation and image information extraction. A good segmentation effect is the prerequisite for achieving excellent classification results [25]. Generally, ground feature information is complex and mixed, making it challenging to obtain an ideal segmentation effect using a single-scale segmentation method. Therefore, multiresolution segmentation is commonly adopted for land use information extraction. This method creates image polygon objects with arbitrary scales and similar attribute information. Through multiresolution segmentation, adjacent similar pixels gather to form objects, and the classifier uses these homogeneous objects as the basic processing units to extract information from images. In this study, remote sensing images were first segmented into image objects at different scales, based on the multiscale segmentation method. Then, target crop extraction was accomplished using the spectral and textural features of the objects. The data processing was carried out in eCognition Developer (v. 9.2.1, Trimble Geospatial). The segmentation parameters were adjusted through multiple segmentation experiments based on expert knowledge, with the principle of hyper-parameter selection being that the segmentation result should best fit the ridge lines.
The optimal segmentation parameters for the remote sensing images were determined through these segmentation experiments: the segmentation scale was set to 200, the shape weight to 0.2, and the compactness weight to 0.5. The final segmentation results are shown in Figure 6.

RF
RF is a nonparametric machine learning algorithm composed of multiple decision trees. This algorithm has high prediction accuracy, good tolerance to outliers and noise, a wide range of applications, and is not easily overfitted [42]. According to statistical learning theory, RF uses the bootstrap resampling method to extract multiple samples from the original data, and then performs decision tree modeling for each sample. The prediction results from the various decision trees are synthesized, and finally a random forest with a mass of classification trees is constructed [43]. Two parameters need to be defined to generate prediction models: the number of classification trees (ntree) and the number of features considered when nodes are split (mtry). The implementation of the RF model in this study was based on the random forest module in scikit-learn. Setting ntree to 50 produced an error that gradually converged and became stable, while mtry was set to the square root of the total number of features.
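In scikit-learn terms, the two parameters named above map to `n_estimators` (ntree) and `max_features` (mtry). A minimal sketch with the study's settings follows; the synthetic data and the train/test split are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for per-object features of the segmented image objects.
X, y = make_classification(n_samples=300, n_features=16, n_informative=6,
                           random_state=0)

# ntree = 50 and mtry = sqrt(number of features), as stated in the text.
ob_rf = RandomForestClassifier(n_estimators=50, max_features="sqrt",
                               random_state=0)
ob_rf.fit(X[:200], y[:200])

rf_accuracy = ob_rf.score(X[200:], y[200:])  # accuracy on held-out objects
```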

SVM
SVM is based on the Vapnik-Chervonenkis dimension theory of statistical learning and the principle of structural risk minimization. It is often used to solve small-sample, nonlinear, and high-dimensional pattern recognition problems [44]. Under the condition of limited sample information, SVM provides a good balance between model complexity and learning ability, and has good generalization ability. The common kernel functions in the SVM algorithm are the linear, polynomial, radial basis, and sigmoid kernel functions. The radial basis kernel function is the most widely used, as it has fewer parameters and better performance than the others regardless of the number of samples [45]. The implementation of the SVM model in this study was based on the support vector machines module in scikit-learn.
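A minimal scikit-learn sketch with the radial basis kernel named above is shown below. The feature scaling step and the `C`/`gamma` values are assumptions, since the study does not report its SVM hyperparameters; the synthetic data stands in for the object features.

```python
from sklearn.datasets import make_classification
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for per-object features of the segmented image objects.
X, y = make_classification(n_samples=300, n_features=16, n_informative=6,
                           random_state=0)

# Radial basis kernel, as recommended in the text; scaling and the
# C/gamma settings are illustrative assumptions.
ob_svm = make_pipeline(StandardScaler(),
                       SVC(kernel="rbf", C=1.0, gamma="scale"))
ob_svm.fit(X[:200], y[:200])

svm_accuracy = ob_svm.score(X[200:], y[200:])  # accuracy on held-out objects
```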

Classification Accuracy Assessment
Based on the verification sample data, a confusion matrix was used to calculate the user accuracy (UA), production accuracy (PA), extraction accuracy (F), overall accuracy (OA), and Kappa coefficient. UA and PA can be used to quantitatively evaluate misclassification and omission errors, while the overall accuracy and the Kappa coefficient (K) are commonly used to evaluate the overall classification effect. In addition, F is used to evaluate the extraction accuracy of each type of ground object under the various methods.
The overall accuracy is computed as P_0 = T_c / A_c, where P_0 represents the overall classification accuracy, T_c represents the number of pixels correctly classified by method c, and A_c represents the total number of pixels classified by method c.
The Kappa coefficient is computed as K = (P_0 − P_e) / (1 − P_e), with the chance agreement P_e = (a_1 b_1 + a_2 b_2 + · · · + a_c b_c) / n², where P_0 represents the overall classification accuracy, assuming that the true number of samples of each category is a_1, a_2, · · · , a_c, the predicted number of samples of each category is b_1, b_2, · · · , b_c, and the total number of samples is n.
The extraction accuracy is computed as F = 2 P_Am U_Am / (P_Am + U_Am), where F represents the extraction accuracy, P_Am represents the production accuracy of category m, and U_Am represents the user accuracy of category m.
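All of these measures can be derived from a single confusion matrix. The sketch below assumes the conventional layout with reference classes in rows and predicted classes in columns; the example matrix is hypothetical.

```python
import numpy as np

def accuracy_metrics(cm):
    """Compute OA, Kappa, per-class PA/UA, and F from a confusion matrix
    with reference classes in rows and predicted classes in columns."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    oa = np.trace(cm) / n                                # overall accuracy P_0
    pe = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2  # chance agreement P_e
    kappa = (oa - pe) / (1.0 - pe)                       # Kappa coefficient
    pa = np.diag(cm) / cm.sum(axis=1)                    # production accuracy
    ua = np.diag(cm) / cm.sum(axis=0)                    # user accuracy
    f = 2.0 * pa * ua / (pa + ua)                        # extraction accuracy
    return oa, kappa, pa, ua, f

# Hypothetical two-class matrix: 40/45 correct, 15 confused pixels.
oa, kappa, pa, ua, f = accuracy_metrics([[40, 10], [5, 45]])
```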

Results
The crop planting information in the three study areas with different planting structure complexities was extracted using OB-RF and OB-SVM (Figures 7 and 8), based on the multispectral remote sensing images obtained by the UAV. The confusion matrix was used to evaluate the accuracy of the classification results. It was assumed that pixels at the reference locations could be assigned to single classes, and accuracy measures based on the proportion of area correctly classified were then calculated from the number of correctly classified pixels [46]. The accuracy evaluation results are presented in Tables 7-9.
In SA1, both OB-RF and OB-SVM achieved good classification results, with an overall accuracy greater than 97% and an extraction accuracy for every crop greater than 92%. The accuracy of SA2 was slightly lower, but the overall accuracy was still above 92%. The extraction accuracy of the OB-RF model for pepper and hami melon was low (84.86% for pepper, 75.65% for Hami melon), while the extraction accuracy of the OB-SVM model for all crops remained at a high level (extraction accuracy greater than 94%). In SA3, the overall accuracy and extraction accuracy based on the OB-SVM model remained high (overall accuracy of 97.21%, extraction accuracy greater than 85.65%). However, the overall accuracy and extraction accuracy given by the OB-RF model decreased significantly. Among all study areas, corn had the highest extraction accuracy, and saplings had the lowest extraction accuracy.

Classification Error Analysis
By comparing the classification results obtained by OB-SVM and OB-RF (Figures 7 and 8) with the standard crop distribution maps obtained through field investigation (Figure 3), classification error detail maps (Figure 9) were made. In SA1, the primary source of error was confusion between zucchini and sunflower. In SA2, the primary sources of error were confusion between hami melon and zucchini, and between pepper and hami melon. In SA3, the primary sources of error were confusion between hami melon and cherry tomato, between pepper and cherry tomato, and between zucchini and sunflower. In general, five crops are easily confused: hami melon, pepper, zucchini, cherry tomato, and sunflower. To explore the reasons for this confusion, we analyzed the spectral curves of these five crops, which are shown in Figure 9. In the spectral range of 400-900 nm, the spectral reflectance of the five easily confused crops is stable in the near-infrared band from 770 to 800 nm, where the differences between crops are most apparent. Additionally, there are apparent reflection peaks in the green band from 540 to 560 nm, with some differences in peak height between crops. However, in both the 770-800 nm and 540-560 nm ranges, the six spectral curves of hami melon overlap with those of pepper and zucchini, which is one reason why hami melon is easily confused with pepper and zucchini. In addition, hami melon, pepper, and zucchini are all grown by strip cultivation in the study areas, and were in the same phenological period (fruit setting) when the experimental images were obtained, which weakens the differences in their texture features.
Interestingly, although the reflectance of cherry tomato is obviously higher than that of hami melon, and the reflectance of sunflower is obviously higher than that of zucchini, in the near-infrared band of 770-800 nm, there is still confusion between hami melon and cherry tomato, and between sunflower and zucchini, in SA3. One possible explanation is that cherry tomato and sunflower are densely planted in the study area, with many overlapping leaves. Compared with single-leaf plants, multiple leaves can produce higher reflectivity in the near-infrared band due to additional reflectivity [47]. Therefore, cherry tomato and sunflower have a higher reflectivity than other crops in the near-infrared band. However, in Figure 10, the cherry tomato in area m and the sunflower in area n grow poorly and are sparsely planted, decreasing their reflectivity in the green band of 540-560 nm and the near-infrared band of 770-800 nm. Thus, the difference between them is decreased. Moreover, hami melon and cherry tomato are vine plants, which have similar textural features. The big leaves of sunflower and zucchini also weaken the differences in their respective textural features.

Model Performance under Different Planting Structure Complexity
The classification results for the three study areas were produced using the OB-RF and OB-SVM models. The overall accuracy values for SA1 given by OB-RF and OB-SVM are 97.09% and 99.13%, respectively. For SA2, the overall accuracy values are 92.61% and 99.08%, respectively, and for SA3 they are 88.99% and 97.21%, respectively. These results indicate that the classification accuracies of both OB-RF and OB-SVM decrease as the complexity of the planting structure increases. In particular, the overall accuracy of OB-RF decreased by 8.1%, while that of OB-SVM only decreased by 1.92%. In general, the classification accuracy advantage of OB-SVM becomes more prominent as the number of ground features increases. Judging from the differences in extraction accuracy between the two methods, OB-SVM was clearly better than OB-RF in SA3.
The occurrence of classification errors in this study is related to the limited sample size, such as for the saplings and Hami melon in SA3. Comparing area j with area o, and area m with area q, in Figure 10, it is clear that the classification error of OB-SVM is smaller than that of OB-RF in small-sample areas. As a representative ensemble learning algorithm, the RF classifier has achieved good results in the automatic extraction of remote sensing information [42,43]. However, the RF classifier is better suited to large samples and high-dimensional data, and thus requires a sufficient number of samples [44]. The SVM classifier, in contrast, specializes in analyzing small numbers of samples [39,45]. In this study, the number of feature classes in the three test areas gradually increased, while the number of available training samples shrank as the plots became more fragmented. In this situation, how well a classifier learns from the training samples directly determines the accuracy of the classification results. Therefore, the classification accuracy of OB-SVM was superior to that of OB-RF in this study, because of the high sensitivity of the SVM classifier to the samples. These results indicate that the OB-SVM method is more suitable for the classification of crops in fragmented areas with highly complex planting structures.
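The evaluation workflow discussed above — training RF and SVM classifiers on per-object features and scoring them via a confusion matrix — can be sketched as follows. This is an illustrative sketch only: the per-object features, labels, and hyperparameters are hypothetical stand-ins, not the authors' code or data.

```python
# Illustrative sketch (not the authors' implementation): train RF and SVM
# classifiers on hypothetical per-object feature vectors, then compute the
# overall accuracy from the confusion matrix, as done in this study.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical per-object features (e.g., mean band reflectance, DVI, GLCM
# entropy) and crop labels; small sample sizes mimic fragmented plots.
X_train = rng.normal(size=(60, 5))
y_train = rng.integers(0, 3, size=60)   # three crop classes
X_test = rng.normal(size=(30, 5))
y_test = rng.integers(0, 3, size=30)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
svm = SVC(kernel="rbf", C=10, gamma="scale").fit(X_train, y_train)

def overall_accuracy(y_true, y_pred):
    """Overall accuracy = trace of the confusion matrix / total sample count."""
    cm = confusion_matrix(y_true, y_pred)
    return cm.trace() / cm.sum()

oa_rf = overall_accuracy(y_test, rf.predict(X_test))
oa_svm = overall_accuracy(y_test, svm.predict(X_test))
```

With the study's real training objects in place of the random arrays, the same two calls to `overall_accuracy` yield the per-area figures compared in the text.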

Classification Potential of UAV Multispectral Remote Sensing Technology under Complex Planting Structures
The OB-SVM model achieved superior classification performance in extracting crop information in areas with low-, medium-, and high-complexity planting structures, based on UAV multispectral remote sensing. The overall accuracies for the three study areas were 99.13%, 99.08%, and 97.21%, and the single-crop extraction accuracies were better than 92.59%, 94.81%, and 85.65%, respectively. As the planting structure complexity increased, the classification accuracy and extraction accuracy decreased, but the overall accuracy was only reduced by 1.92%. Using UAV visible-light images, Park et al. [30] applied an object-oriented method to classify cabbage and radish and obtained an accuracy of 84.7%; in this study, the overall accuracy reached 97.21% even under a complex classification environment with eight different crops. Chen et al. [48] pointed out that UAV visible-light images lead to lower interpretation accuracy than multispectral images in agricultural land classification. In addition, Ishida et al. [49] used UAV hyperspectral remote sensing technology to classify 14 ground objects with an overall accuracy of 94.00%; the classification results in this paper are not inferior to this.
It can be seen from the spectral curves of the ground objects (Figure 9) that the most remarkable difference in the reflectivity of each crop was in the near-infrared band, which made a significant contribution to the classification. Additionally, the importance ranking of the multispectral bands (Table A1) suggests that the near-infrared band played an essential role in the classification results of each study area (importance ranked third, first, and second for SA1, SA2, and SA3, respectively). The vegetation indices (DVI and RVI) obtained using the near-infrared band as an input variable, and the texture features obtained from the second-order matrix probability operation (near-infrared homogeneity, near-infrared information entropy, near-infrared correlation, near-infrared contrast, and near-infrared dissimilarity), also played essential roles in the classification results (Tables A2 and A3). Thus, it can be concluded that the near-infrared band provides essential features that improve the extraction accuracy of the planting structure and enable the fine classification of crops. This is the main advantage of multispectral remote sensing compared with visible-light remote sensing. Besides this, although UAV hyperspectral remote sensing offers a higher spectral resolution than multispectral remote sensing, its high price limits its applicability to agricultural production; indeed, multispectral remote sensing satisfies the requirements as far as crop classification is concerned. In general, multispectral remote sensing technology has a higher spectral resolution than visible-light remote sensing, and offers better cost performance than hyperspectral remote sensing. Thus, it offers a wider range of applications for the fine classification of farmland features under highly complex planting structures.
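For reference, the two near-infrared vegetation indices named above have standard definitions: DVI is the difference and RVI the ratio of near-infrared and red reflectance. A minimal sketch, using hypothetical band values rather than the study's imagery:

```python
# Minimal sketch of the DVI and RVI vegetation indices used as classification
# features; the reflectance arrays below are hypothetical examples, not data
# from the study areas.
import numpy as np

nir = np.array([[0.55, 0.60], [0.48, 0.52]])  # near-infrared reflectance
red = np.array([[0.10, 0.12], [0.15, 0.08]])  # red reflectance

dvi = nir - red   # Difference Vegetation Index: NIR - Red
rvi = nir / red   # Ratio Vegetation Index: NIR / Red
```

Because healthy, densely leafed vegetation reflects strongly in the near-infrared band, both indices amplify the crop-to-crop reflectivity differences described in the text.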
Using UAV multispectral remote sensing images, we applied the OB-SVM and OB-RF models to extract crops in areas with highly complex planting structures, and verified the application potential of this approach for the extraction of complex planting structures. The conclusions can provide new ideas for obtaining accurate crop distribution maps in areas with complex planting structures, as well as technical support for stabilizing food security and protecting water resources.

Conclusions
This study has described the analysis and classification of multispectral images obtained using UAV remote sensing technology. RFE was used to screen the spectral and textural features of crops in the images, allowing the feature subsets of the three study areas to be successfully constructed. The OB-RF and OB-SVM models were then used for the fine classification of crops based on the above procedures. Field observations and visual interpretation were used to evaluate the accuracy of the classification results through the confusion matrix method. The main conclusions of this study are as follows:
(1) The OB-SVM model's classification accuracy in areas with low-, medium-, and high-complexity planting structures was respectively 1.99%, 4.60%, and 8.22% higher than that of the OB-RF model. As the planting structure complexity increased, the classification advantages of the OB-SVM model became more evident. This indicates that the OB-SVM model offers higher classification accuracy under land fragmentation and highly complex planting structures, and is more suitable for the fine classification of farmland features with highly complex agricultural planting patterns;
(2) Based on UAV multispectral remote sensing technology and the OB-SVM classification model, the overall accuracies of the study areas with low-, medium-, and high-complexity planting structures were as high as 99.13%, 99.08%, and 97.21%, respectively. The extraction accuracy of each crop was at least 92.59%, 94.81%, and 85.65% in the three study areas, respectively. As the planting structure complexity increased from low to high, the classification accuracy and extraction accuracy decreased, but the overall accuracy only decreased by 1.92%. Therefore, UAV multispectral remote sensing technology has vast application potential for the fine classification of farmland features under highly complex planting structures.
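The RFE screening step summarized in these conclusions can be sketched as follows. The data, base estimator, and number of retained features below are illustrative assumptions, not the study's actual configuration:

```python
# Hedged sketch of recursive feature elimination (RFE): rank candidate
# spectral/texture features with a random-forest base estimator and keep a
# subset, analogous to the feature screening described in this study.
# The data and the number of features to keep are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 12))    # 12 candidate features per image object
y = rng.integers(0, 4, size=100)  # hypothetical crop labels

selector = RFE(
    RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=5,  # illustrative subset size
    step=1,                  # drop one feature per elimination round
).fit(X, y)

selected = np.flatnonzero(selector.support_)  # indices of retained features
```

The retained columns (`X[:, selected]`) then form the feature subset passed to the OB-RF or OB-SVM classifier.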
The conclusions can provide new ideas for accurately obtaining crop distribution maps in areas with complex planting structures, and thus provide technical support for protecting food security and the rational allocation of water resources.

Institutional Review Board Statement: Not applicable.

Informed Consent Statement: Not applicable.
Data Availability Statement: The data are not publicly available due to [privacy].

Acknowledgments:
We are very grateful to Shide Dong, a student at the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, for his guidance and corrections regarding the English.

Conflicts of Interest:
The authors declare no conflict of interest.
Appendix A

Figure A1. Relationship between accumulated multispectral bands and overall accuracy.

Figure A3. Relationship between accumulated textural features and overall accuracy.