Remote Sens. 2022, 14, 1392
Cotton Cultivated Area Extraction Based on Multi-Feature Combination and CSSDI under Spatial Constraint

Abstract: Cotton is an important economic crop, but large-scale field extraction and estimation can be difficult, particularly in areas where cotton fields are small and discretely distributed. Moreover, cotton and soybean are cultivated together in some areas, further increasing the difficulty of cotton extraction. In this paper, an innovative method for cotton area estimation using Sentinel-2 images, land use status data (LUSD), and field survey data is proposed. Three areas in Hubei province (i.e., Jingzhou, Xiaogan, and Huanggang) were used as research sites to test the performance of the proposed extraction method. First, the Sentinel-2 images were spatially constrained using LUSD categories of irrigated land and dry land. Seven classification schemes were created based on spectral features, vegetation index (VI) features, and texture features, which were then used to generate the SVM classifier. To minimize misclassification between cotton and soybean fields, the cotton and soybean separation index (CSSDI) was introduced based on the red band and red-edge band of Sentinel-2. The configuration combining VI and spectral features yielded the best cotton extraction results, with F1 scores of 86.93%, 80.11%, and 71.58% for Jingzhou, Xiaogan, and Huanggang. When CSSDI was incorporated, the F1 score for Huanggang increased to 79.33%. An alternative approach using LUSD for non-target sample augmentation was also introduced. The method was used for Huangmei county, resulting in an F1 score of 78.69% and an area error of 7.01%. These results demonstrate the potential of the proposed method to extract cotton cultivated areas, particularly in regions with smaller and scattered plots.


Introduction
China is the world's largest cotton producer, and cotton, the country's second-largest crop after grain, is a crucial strategic material for the national economy and people's livelihood [1,2]. China's cotton is mainly produced in Xinjiang, the Yellow River Basin, and the Yangtze River Basin. Located in central China, Hubei is the most important cotton-producing province in the Yangtze River Basin [3]. Statistical data on cotton cultivated area are commonly used for cotton yield estimation, economic index monitoring, and agricultural management [4]. Owing to changes in regional land use [5] and cotton subsidy policies, together with the high labor costs of time-consuming planting methods, Hubei's cotton cultivated area has decreased significantly in recent years. Moreover, because cotton fields are typically small and fragmented, accurately extracting and estimating cotton planting areas remains extremely challenging, particularly over large regions.
Nowadays, satellite remote sensing technology has been widely used in various agricultural production applications [6,7]. The analysis, collection, processing, and visual display of remote sensing data can be used to classify, extract, and estimate cultivated areas, which is vital in agricultural production management, particularly in growth monitoring, pest control, and yield estimation [8][9][10][11].
A number of studies have employed remote sensing technology and developed various approaches for cotton area extraction and yield estimation. For example, Ahmad et al. [12] combined multi-temporal MODIS data (with a resolution up to 250 m) with Landsat7 TM/ETM+ data (30 m) to extract cotton cultivated area and estimate the yield based on NDVI index, demonstrating the economics and feasibility of large-scale crop yield estimation. With the continuous development of satellite technology, high-resolution satellite image data has been widely used in agricultural production and applications [13,14]. Yi et al. [15] constructed LAI estimation models at different development and growth stages of cotton in northern Xinjiang for various applications, such as cotton yield estimation, growth monitoring, and fertilization monitoring. However, other areas have varying circumstances that make popular RS estimation approaches unsuitable. For example, unlike Xinjiang, which has large and mainly contiguous cotton fields, Hubei's cotton production comprises smaller and scattered plots. Xu et al. [16] used high spatial resolution satellite images of GF-2 (up to 0.81 m) and QuickBird (up to 0.61 m) for accurate extraction of farmland based on image texture features using object-oriented multi-scale hierarchical partitioning and various local segmentation algorithms. While their approach can accurately extract farmlands in complex landscapes, it is limited by the revisit period and imaging quality of high-resolution satellites. Using an unmanned aerial vehicle (UAV) equipped with a hyperspectral sensor to capture low-altitude images, Liu et al. [17] classified cotton fields by object-oriented segmentation method for yield estimation. However, the cost of acquiring UAV data of large-scale fields is relatively high.
In data processing, algorithms such as deep learning, transfer learning, and reinforcement learning are widely used in remote sensing data interpretation, improving the extraction of crop spatial distribution [18,19]. Zhu et al. [20] used a deep learning semantic segmentation model for cotton ridge road recognition and utilized improved U-Net networks (i.e., Half-U-Net and Quarter-U-Net), providing a technical basis for the development of intelligent navigation equipment for cotton field machinery. Chen et al. [21] built an improved Faster R-CNN model incorporating dynamic mechanisms to identify the top buds of cotton in the field and verified the feasibility of deep learning image processing algorithms in UAV remote sensing for agriculture. Crane-Droesch [22] utilized a semi-parametric variant of a deep neural network to predict annual corn yield in the Midwest of the United States, which outperformed classical statistical and other methods. However, these deep learning algorithms have mainly been applied to small target areas and require large sample libraries. They are therefore not suitable for cotton field extraction in areas with limited sample sizes, such as Hubei.
In terms of remote sensing data analysis, time-series images and optimized results can be obtained using various approaches, such as combining auxiliary data (e.g., land-use planning vectors) [23,24] and establishing spatial-temporal data fusion models [25]. Zhang et al. [26] used an SVM extraction algorithm and a cultivated land mask on high-resolution GF series satellite data to differentiate cotton from other crops, considerably improving the efficiency of ground data survey collection. Zhang et al. [27] proposed a new PMI index to monitor the spatial changes in rice, given the difficulties of rice field extraction, particularly during the rice flooding period. A number of studies have also developed extraction approaches based on vegetation index (VI) from GF-5 AHSI satellite data, using texture features extracted from GF-6 PMS and topographic factors from DEM, and have tested different classification schemes employing Nearest Neighbor, Support Vector Machine, and Random Forest algorithms for regional tree species identification [28][29][30][31]. These studies can inform new extraction approaches for cotton cultivated areas that address the limitations of current methods.
To address the major challenges in RS field extraction for small cultivated plots, this study proposes an innovative remote sensing monitoring approach for cotton cultivated areas at the regional scale using Sentinel-2 images, land use status data (LUSD), and field survey data. The study site is Hubei Province, where cotton plantations are fragmented and irregular, and where cotton and soybean are cultivated together in some areas. Given that the growth cycles of cotton and soybean are similar, cotton extraction can be extremely confusing and highly prone to large errors. We utilized LUSD data as spatial constraint and constructed seven schemes that use spectral features, vegetation index (VI), and texture features. Based on SVM classification algorithm, we incorporated a new cotton and soybean separation difference index (CSSDI) to separate cotton and soybean in adjacent planting plots. For specific areas with few cotton samples, a non-object sample augmentation based on LUSD categories was developed to improve the accuracy of cotton extraction. The results of this study can be used to optimize regional cotton growth monitoring and cultivation management, which are crucial for sustainable agricultural development.

Study Area
Hubei Province in central China is located between 29°01′53″ N-33°6′47″ N and 108°21′42″ E-116°07′50″ E, with a total area of 185,900 km². Outside its mountainous region, most of the area has a humid subtropical monsoon climate. For this study, three major cotton production areas in Hubei were selected for field sampling and cotton extraction (see Figure 1): Jingzhou, Xiaogan, and Huanggang. These regions have different topography. Jingzhou is mainly composed of plain areas with altitudes ranging from 20 to 50 m. Xiaogan is mostly hilly, with some mountainous areas in the north and plains in the south. Huanggang is mountainous in the north, with hills and plains in the south.

Crop Phenology
Cotton is an important economic crop in Hubei. Its main growing period runs from April to September. After the crop matures in mid-September, cotton is harvested in batches, and harvesting lasts until October. The other dominant crops in Hubei, including rice (Table 1), share a similar growth period with cotton. In particular, the blooming stage of cotton, which is the key stage for feature selection, largely coincides with the maturity stage of soybean, which hampers cotton extraction. The phenology of cotton and the other main crops from April to October in Hubei is summarized in Table 2.
Note: Early refers to the first ten days of the month, middle refers to the next ten days, and late refers to the remaining days.

Satellite Data
The satellite images used in this study were Sentinel-2 Level-2A data covering 13 spectral bands, with a temporal resolution of five days and spatial resolutions of 10 m, 20 m, and 60 m. Based on the phenological characteristics of cotton, images from late August and late September 2020 were selected. The visible red (Band 4), green (Band 3), and blue (Band 2) bands and the near-infrared band (Band 8), all at 10 m resolution, were used for crop classification, while the vegetation red-edge bands (Bands 5, 6, and 7) at 20 m resolution were used to determine the CSSDI. The pre-processing procedure included band combination, mosaicking, clipping, and cloud masking. The red-edge band images were resampled to 10 m using the nearest-neighbor method. All of these steps were carried out in Google Earth Engine.
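As a rough illustration of the nearest-neighbor resampling step, the sketch below upsamples a 20 m grid to 10 m by repeating each pixel in a 2 × 2 block. It assumes the 20 m and 10 m grids are perfectly aligned; the function name and the reflectance values are hypothetical, and the study itself performed this step in Google Earth Engine rather than with code like this.

```python
def upsample_nearest(band, factor=2):
    """Nearest-neighbor upsampling of a 2-D grid given as a list of rows.

    Assumption: each coarse pixel covers exactly factor x factor fine pixels.
    """
    out = []
    for row in band:
        # Repeat every value horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


# Hypothetical 2 x 2 red-edge tile at 20 m resolution.
red_edge_20m = [[0.31, 0.42],
                [0.27, 0.35]]
red_edge_10m = upsample_nearest(red_edge_20m)  # 4 x 4 tile at 10 m
```

Nearest-neighbor resampling leaves the original reflectance values unchanged, which matters when an index is later thresholded on those values.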

Land Use Status Data
The land-use status data (LUSD) used in this study were acquired in 2017-2019 from China's third nationwide land and resources survey. The dataset included information on cultivated land, forest land, residential land, and other primary categories. Each primary category was further decomposed into secondary categories. For instance, cultivated lands were subcategorized into paddy fields, irrigated land, and dry land. In general, cotton, soybean, and corn belong to irrigated or dry land categories, while rice is in the paddy field grouping. LUSD can provide a spatial constraint on the original image, reducing the impact of non-crop and rice on cotton extraction.

Field Sampling Data
The field survey was conducted during the main growing season for cotton, from June to September, in Jingzhou, Huanggang, and Xiaogan. The sampling route was designed using expert knowledge and statistical information from statistical yearbooks and the local Academy of Agricultural Sciences. A station description, including location information, Sentinel and Google Earth Map images, and photos of the sampling points, was obtained for each site. Table 3 shows the specific sample information, and Figure 2 shows the sampling distribution in Jingzhou, Xiaogan, and Huanggang.

Methods
The Sentinel-2 satellite images for Jingzhou, Xiaogan, and Huanggang, acquired in late August and late September 2020, were selected for cotton area extraction. First, pre-processing was performed on the original images, including band combination, mosaicking, clipping, and cloud masking. The pre-processed images were then spatially constrained using specific LUSD categories. In a separate experiment, LUSD was used for non-target sample augmentation so that the samples were balanced across classes and evenly distributed in each region.
The multi-dimensional features calculated from four bands (i.e., the red, green, blue, and near-infrared bands) were selected and combined into seven schemes, which were then used in the SVM for crop classification. To better separate soybean and cotton fields in Huanggang, CSSDI was established using the red-edge and red bands.
To balance the distribution of training and test samples, a five-fold cross-validation was conducted to test the stability of the proposed model. The technical route of the study is shown in Figure 3.
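The five-fold cross-validation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the shuffling seed, the sample count of 100, and the function name are assumptions, and no stratification by crop class is applied here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# With 100 hypothetical field samples, every split is 80 training / 20 test (4:1).
splits = list(k_fold_indices(100))
```

Each sample appears in exactly one test fold, so the five accuracy estimates together cover all field samples once.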

Spatial Constraint
In this study, the spatial constraint method uses geographic information data, which contain more detailed plot and attribute information, to assist object classification on satellite images. According to the technical regulation for category identification of LUSD, cotton fields generally fall into the irrigated land and dry land categories. The data were collected in 2019, and land types may have changed between 2019 and 2020. Therefore, this study analyzed the distribution of the field-collected cotton samples across the LUSD categories to verify the reliability of the data. Figure 4 shows that more than 90% of cotton samples in the three regions belong to the irrigated land and dry land categories, which indicates that the LUSD is reliable. Thus, the Sentinel-2 images were clipped using the irrigated and dry land categories as masks before crop classification.
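The masking and the reliability check described above can be sketched in a few lines. The category codes, function names, and toy values below are hypothetical; the sketch only assumes that the LUSD has been rasterized onto the same grid as the image.

```python
# Hypothetical integer codes for LUSD secondary categories.
IRRIGATED, DRY, PADDY, FOREST = 1, 2, 3, 4
NODATA = None


def apply_lusd_mask(image, lusd, keep=(IRRIGATED, DRY)):
    """Keep pixels whose LUSD category is in `keep`; set the rest to NODATA."""
    return [[pix if cat in keep else NODATA
             for pix, cat in zip(img_row, cat_row)]
            for img_row, cat_row in zip(image, lusd)]


def share_in_categories(sample_categories, keep=(IRRIGATED, DRY)):
    """Fraction of field samples that fall inside the kept categories."""
    inside = sum(1 for cat in sample_categories if cat in keep)
    return inside / len(sample_categories)


image = [[0.1, 0.2], [0.3, 0.4]]            # toy single-band image
lusd = [[IRRIGATED, PADDY], [FOREST, DRY]]  # toy category raster
masked = apply_lusd_mask(image, lusd)

# Toy cotton samples: 9 of 10 lie on irrigated or dry land.
cotton_sample_cats = [1, 2, 1, 3, 2, 1, 1, 2, 2, 1]
share = share_in_categories(cotton_sample_cats)
```

A share above the 90% level reported in the text would support using the two categories as a mask.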

Features Selection
Each Sentinel-2 image provides red, green, blue, and near-infrared bands, so the images from the two periods together yield eight spectral bands. A VI combines the reflectance of two or more wavelengths to enhance features or details of vegetation. To distinguish the two kinds of features in this paper, we refer to the original bands as spectral features and the band combinations as VI features. In this study, 18 VIs were calculated, and the equations used are summarized in Appendix A (Table A1). There were strong correlations among some VIs and spectral bands. Therefore, correlation analysis was performed, and highly correlated features were removed to reduce data redundancy. The correlation coefficient threshold was set to 0.9, which yielded a smaller set of VIs and spectral features with more pronounced differences. Figure 5 shows the correlation matrix for the vegetation indices.
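One simple way to apply a 0.9 correlation threshold is a greedy filter, sketched below. The text does not say which member of a highly correlated pair was discarded, so keeping the feature that comes first is an assumption, as are the function names and the toy values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold with all kept ones."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy per-sample feature values (hypothetical numbers).
features = {
    "NDVI": [0.2, 0.4, 0.6, 0.8],
    "SAVI": [0.15, 0.30, 0.45, 0.61],   # nearly collinear with NDVI
    "green": [0.08, 0.04, 0.08, 0.04],
}
selected = drop_correlated(features)
```

In this toy case SAVI is dropped because it is almost perfectly correlated with NDVI, while the green band is retained.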

Texture features provide supplementary information about object properties and can be helpful for the discrimination of heterogeneous crop fields [36]. In this study, two widely used texture features for image classification, Entropy and Second Moment [31,37], were used to assist in cotton extraction. To reduce data dimensionality, Principal Component Analysis (PCA) was performed for the two Sentinel-2 images, and only the first principal component was used in calculating co-occurrence measures for each texture. The co-occurrence shift included four directions, (1, 0), (1, 1), (0, 1), and (−1, 1), which represent 0°, 45°, 90°, and 135°, respectively. The results of the co-occurrence shifts were averaged, producing the final texture features for Entropy and Second Moment.
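The co-occurrence textures described in this section can be sketched as below. This is a simplified illustration under several assumptions: the input is already quantized to a small number of gray levels, one matrix is built for the whole tile rather than for a moving window, the matrix is not symmetrized, and the natural logarithm is used for Entropy. The shift-to-angle mapping follows the text; all names are hypothetical.

```python
import math

# (dx, dy) shifts listed in the text for 0, 45, 90, and 135 degrees.
SHIFTS = [(1, 0), (1, 1), (0, 1), (-1, 1)]


def glcm(image, shift, levels):
    """Normalized gray-level co-occurrence matrix for one shift."""
    dx, dy = shift
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def entropy(p):
    return -sum(v * math.log(v) for row in p for v in row if v > 0)


def second_moment(p):
    return sum(v * v for row in p for v in row)


def texture_features(image, levels):
    """Entropy and Second Moment averaged over the four shifts."""
    mats = [glcm(image, s, levels) for s in SHIFTS]
    return (sum(entropy(m) for m in mats) / len(mats),
            sum(second_moment(m) for m in mats) / len(mats))


flat = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]    # homogeneous tile
mixed = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # heterogeneous tile
flat_entropy, flat_asm = texture_features(flat, levels=2)
mixed_entropy, mixed_asm = texture_features(mixed, levels=2)
```

The homogeneous tile gives zero Entropy and a Second Moment of 1, while the heterogeneous tile gives higher Entropy and a lower Second Moment, which is the contrast these features are meant to capture.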

Feature Combination
Due to the diversity of crops and the limitation of spectral information acquisition, different objects could be in the same spectrum, and objects in the same category could be in different spectrums. In order to improve cotton extraction accuracy, seven classification schemes were generated according to different combinations of spectral, texture, and VI features selected. The different classification schemes are presented in Table 4.
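Table 4 is not reproduced here, but seven schemes is exactly the number of non-empty combinations of three feature groups. The sketch below enumerates them under the assumption that the seven schemes correspond to these combinations; the group labels are illustrative.

```python
from itertools import combinations

GROUPS = ("spectral", "VI", "texture")

# All non-empty subsets of the three feature groups: 3 + 3 + 1 = 7 schemes.
schemes = [combo
           for size in range(1, len(GROUPS) + 1)
           for combo in combinations(GROUPS, size)]

for number, combo in enumerate(schemes, start=1):
    print(number, " + ".join(combo))
```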

SVM Algorithm
The SVM classifier provides a powerful supervised classification method [38]. In this study, we selected the radial basis function (RBF) kernel, where the gamma parameter defines how far the influence of a single training example reaches; a low gamma value means the influence reaches 'far', while a high value means it stays 'near'. After parameter adjustment, the value of gamma was set to 0.1.
The penalty parameter C of the error term trades off the correct classification of training examples against the maximization of the decision function margin. For larger values of C, a smaller margin is accepted if the decision function is better at classifying all training points correctly. A lower C encourages a larger margin, which means a simpler decision function at the cost of training accuracy. After exploration, the penalty parameter C was set to 1.0. Four categories were used as input: cotton, soybean, corn, and other crops.
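To make the role of gamma concrete, the sketch below evaluates the RBF kernel, exp(−gamma · ‖x − y‖²), for two hypothetical feature vectors. It illustrates the kernel only and is not a full SVM; the text does not name the software used. In scikit-learn, for instance, the reported settings would correspond to `SVC(kernel="rbf", gamma=0.1, C=1.0)`, which is mentioned here as an assumption about tooling.

```python
import math

GAMMA = 0.1  # value reported in the text


def rbf_kernel(x, y, gamma=GAMMA):
    """RBF kernel: similarity decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-feature samples at Euclidean distance 5.
a, b = [0.0, 0.0], [3.0, 4.0]
low_gamma = rbf_kernel(a, b, gamma=0.1)   # exp(-2.5): influence still reaches b
high_gamma = rbf_kernel(a, b, gamma=1.0)  # exp(-25): influence is essentially gone
```

With the lower gamma the distant sample still contributes noticeably to the decision function, which matches the 'far' versus 'near' description above.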

Cotton and Soybean Separation Difference Index
In some regions of Hubei, particularly in Huanggang, cultivated cotton areas have been changed into soybean fields, resulting in mixed cotton and soybean cultivation. Due to similar growth cycles and similar features in the visible and near-infrared bands, cotton and soybean are difficult to separate using the SVM classifier. The red-edge bands are closely related to the vegetation growth state. Sentinel-2 has three vegetation red-edge bands, i.e., Band 5 (VRE1), Band 6 (VRE2), and Band 7 (VRE3), with central wavelengths of 705 nm, 740 nm, and 783 nm, respectively.
This study compared the red band and the three red-edge bands of cotton and soybean. As shown in Figure 6a,b, there were significant differences in the red band and VRE3 between cotton and soybean. Therefore, an index was developed based on the red band and VRE3 of Sentinel-2 to differentiate between cotton and soybean. The index, termed the cotton and soybean separation difference index (CSSDI), is calculated from VRE3 and Red, which are Band 7 and Band 4 of Sentinel-2, respectively.
After calculating the CSSDI, the t-test for cotton and soybean was performed. The resulting p-value was less than 0.01, indicating that there is a significant difference in CSSDI values for cotton and soybean. The CSSDI and pixel count values for cotton and soybean were plotted to define the separation threshold (see Figure 6c). The minimum overlapping values of CSSDI ranged from 0.275 to 0.344. An increment value of 0.003 was applied from the lower limit of 0.275 to the upper limit of 0.344. The accuracy was calculated for each threshold value to find the optimal threshold value for segmentation. The final threshold of 0.299 gave the maximum separation level and highest accuracy.
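The threshold search described above can be sketched as a simple sweep. The CSSDI values below are hypothetical, and the rule that cotton lies above the threshold and soybean at or below it is an assumption, since the text does not state the direction.

```python
def sweep_threshold(cotton, soybean, low=0.275, high=0.344, step=0.003):
    """Return the first threshold with the highest separation accuracy."""
    n_steps = round((high - low) / step)
    best_t, best_acc = None, -1.0
    for i in range(n_steps + 1):
        t = round(low + i * step, 3)
        # Assumption: cotton has CSSDI above the threshold, soybean at or below it.
        correct = sum(v > t for v in cotton) + sum(v <= t for v in soybean)
        acc = correct / (len(cotton) + len(soybean))
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Hypothetical CSSDI values for labelled pixels.
cotton_cssdi = [0.301, 0.33, 0.35, 0.31]
soybean_cssdi = [0.25, 0.27, 0.285, 0.297]
best_threshold, best_accuracy = sweep_threshold(cotton_cssdi, soybean_cssdi)
```

With a step of 0.003 starting at 0.275, the candidate grid contains 0.299, the final threshold reported in the text; the toy data are chosen so that the sweep also lands there.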

Evaluation Index
The field samples were divided into a training set and a test set at a 4:1 ratio, and a five-fold cross-validation was conducted. A confusion matrix was established, and the accuracy of the crop classification model was assessed in terms of overall accuracy (OA) and Kappa coefficient. Producer's accuracy, user's accuracy, and F1 score (F1) were used to assess the results of the cotton area extraction. F1 is the harmonic mean of producer's accuracy and user's accuracy.
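The measures named above follow directly from a confusion matrix. The sketch below uses the standard definitions, with rows as reference classes and columns as predicted classes; the orientation, the function name, and the toy matrix are assumptions for illustration.

```python
def classification_metrics(matrix, target):
    """OA, Kappa, and per-class PA, UA, and F1 from a confusion matrix.

    matrix[i][j] counts samples of reference class i predicted as class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(n))
    row_sums = [sum(matrix[i]) for i in range(n)]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]

    oa = diagonal / total
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (oa - expected) / (1 - expected)

    pa = matrix[target][target] / row_sums[target]  # producer's accuracy
    ua = matrix[target][target] / col_sums[target]  # user's accuracy
    f1 = 2 * pa * ua / (pa + ua)                    # harmonic mean of PA and UA
    return {"OA": oa, "Kappa": kappa, "PA": pa, "UA": ua, "F1": f1}


# Toy two-class matrix: class 0 = cotton, class 1 = everything else.
toy_matrix = [[40, 10],
              [20, 30]]
metrics = classification_metrics(toy_matrix, target=0)
```

In the toy matrix PA (0.80) exceeds UA (about 0.67), meaning more non-cotton samples are labelled as cotton than cotton samples are missed.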
Another evaluation index for cotton extraction is area error (Equation (2)), where A_image is the extracted cotton cultivated area and A_statistics is the actual cotton cultivated area. The data used for calculating the area error were the 2020 data for Jingzhou, Xiaogan, and Huanggang obtained from official government statistics (http://tjj.hubei.gov.cn/tjsj/, accessed on 20 October 2021).
Figure 7 shows the results of feature selection, which include one spectral feature (green band), one VI (RVI), and two texture features (Entropy and Second Moment) for the August image, and one spectral feature (near-infrared band) and three VIs (NDVI, MCARI2d, and RGBVI) for the September image. At the end of August, cotton is in the blooming stage and exhibits its typical features. At the end of September, cotton and other crops with similar growth cycles are being harvested. However, the cotton harvest is carried out in batches and lasts until October, so cotton can show different features in images compared to other crops.
The three kinds of features were combined into seven classification schemes. The results for the cotton extraction using the SVM classifier are shown in Figure 8. Among the different schemes, the F1 values of the Spectral + VI configuration were the highest in the three regions, at 86.93%, 80.11%, and 71.58% for Jingzhou, Xiaogan, and Huanggang, respectively. The producer's accuracy for cotton was greater than the user's accuracy by 12-17%. This suggests that the errors in the cotton area extraction were mainly due to the misclassification of other crops as cotton. In addition, the three evaluation indices for Jingzhou were higher than those for Xiaogan and Huanggang, primarily because of Jingzhou's flat terrain, uniform planting patterns, and consistent growth cycle for cotton. The confusion matrices of the optimal results in the three regions are listed in Appendix B (Tables A2-A6).
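As a small worked illustration of the area error used as an evaluation index, the sketch below assumes the common relative form, the absolute difference between the extracted and statistical areas divided by the statistical area, expressed in percent. The exact form of Equation (2) is not shown in this text, so this formula is an assumption, and the areas are hypothetical.

```python
def area_error(a_image, a_statistics):
    """Relative area error in percent (assumed form of the area-error index)."""
    return abs(a_image - a_statistics) / a_statistics * 100.0


# Hypothetical areas in arbitrary but identical units.
error_percent = area_error(a_image=107.01, a_statistics=100.0)
```

Under this assumed definition, an extracted area of 107.01 units against a statistical area of 100 units gives an error of about 7.01%.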

Improved Results of Cotton Extraction Based on CSSDI
As shown in Figure 8, the three evaluation indices for Huanggang were much lower than those for the other two regions, mainly because of misclassification between cotton and soybean. This study proposed a CSSDI based on the red and red-edge bands to improve the classification results by further separating cotton and soybean. Appendix C (Figure A1) shows that the same field can exhibit different characteristics in the images, leading to serious misclassification between cotton fields and soybean fields. As shown in Figure 9, the introduction of CSSDI improved the accuracy of cotton area extraction. Producer's accuracy, user's accuracy, and F1 increased by 6.67%, 8.41%, and 7.75%, respectively, and the final value of F1 was 79.33%. CSSDI also improved the accuracy of soybean extraction. For Xiaogan and Jingzhou, where there were fewer mixed planting areas, the introduction of CSSDI resulted in only marginal improvements in cotton area extraction.

Comparison of Different Spatial Constraint Methods
To further investigate the performance and explore the advantage of spatial constraint using LUSD, we used two other cotton extraction methods for comparison: (1) without mask and (2) using Globeland30 (G30) data as mask. G30 data provides a global geo-information product available online (http://www.globallandcover.com/, accessed on 10 September 2021). The cultivated land categories of G30 were used as a mask before classification. Aside from cotton, soybean, and corn, the SVM classifier includes a category for rice, which was also collected in the field. For the method without mask, three additional classes were added using visual interpretation to avoid misclassification of non-crops: water, building, and forest.
The cotton extraction accuracy of the three methods was evaluated in two respects: against samples collected in the field and against the cotton area reported in statistical data. The cotton extraction results of the three methods are shown in Figure 10. The concentration of cotton cultivated areas was similar for the three methods. However, the method without mask produced a much larger cotton area compared to the statistical data. This suggests that masking can significantly reduce extracted areas. The LUSD method was closest to the statistical data, with area errors of 13.89%, 40.77%, and 20.66%. Accuracy assessment using samples collected in the field was also performed, and the summary of results is presented in Table 5. The LUSD approach produced higher F1 scores than the G30 method, especially in Xiaogan. While the maskless approach generated the highest OA values, the LUSD method can provide a balance between the two evaluation indices of actual area and actual samples.
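The effect of masking on the area comparison can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: it assumes a 10 m x 10 m pixel (0.01 ha), defines the area error as the absolute difference relative to the statistical area, and uses hypothetical class codes, rasters, and reference area.

```python
# Assumption: each pixel covers 10 m x 10 m (100 m^2 = 0.01 ha).
PIXEL_AREA_HA = 0.01
COTTON = 1  # hypothetical class code for cotton


def apply_mask(class_map, cultivated_mask, nodata=0):
    """Keep class labels only where the mask marks cultivated land."""
    return [
        [label if keep else nodata for label, keep in zip(row, mask_row)]
        for row, mask_row in zip(class_map, cultivated_mask)
    ]


def class_area_ha(class_map, label):
    """Area of one class, counted in pixels and converted to hectares."""
    return sum(row.count(label) for row in class_map) * PIXEL_AREA_HA


def relative_area_error(extracted, reference):
    """Absolute area difference as a percentage of the reference area."""
    return abs(extracted - reference) / reference * 100.0


# Hypothetical 3 x 4 classification result and cultivated-land mask.
class_map = [[1, 1, 2, 1],
             [2, 1, 1, 3],
             [1, 3, 1, 1]]
cultivated_mask = [[1, 1, 1, 0],
                   [1, 1, 0, 0],
                   [0, 0, 1, 1]]
reference_ha = 0.04  # hypothetical statistical cotton area

masked = apply_mask(class_map, cultivated_mask)
err_unmasked = relative_area_error(class_area_ha(class_map, COTTON), reference_ha)
err_masked = relative_area_error(class_area_ha(masked, COTTON), reference_ha)
print(f"without mask: {err_unmasked:.1f}%  with mask: {err_masked:.1f}%")
```

In this toy case the mask removes three cotton-labeled pixels outside cultivated land, so the extracted area moves closer to the reference value.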

Necessity of Feature Selection and Combination
The performance of different cotton extraction schemes with varying multi-feature combinations was analyzed in this study. As shown in Figure 11, cotton extraction schemes using a single texture feature or spectral feature can result in more over-segmentation or under-segmentation. Schemes with VI features can effectively reduce the misclassification between cotton and corn (Figure 11b), achieving relatively high accuracy of cotton extraction. Although VIs are calculated from multiple spectral bands, the green and near-infrared spectral bands can provide crop information that differs from that of the VI features. The results suggest that the spectral + VI scheme provides higher accuracy than methods using only a single VI feature.
However, texture features had a relatively smaller role in cotton extraction. Texture features indicate the distribution function statistics of the local crop properties in the image, while spectral and VI features reflect the crop-based features of pixels. In this study, the small plot size and coarse spatial resolution of the Sentinel image limited the extraction of representative texture features, causing the texture-feature-based methods to have lower accuracy than spectral + VI. Some studies have also introduced texture features to help crop classification. Using WorldView-2 imagery with a 2 m spatial resolution, Wan et al. [39] found that combining texture and spectral features can slightly improve crop classification compared to using spectral bands alone. Kwak et al. [37] explored the impact of texture information on crop classification based on UAV images with much finer resolution. They concluded that GLCM-based texture features obtain the most accurate classification. These studies suggest the usefulness of texture features for crop classification in high-resolution images. In this study, the combination of features was found to increase crop classification accuracy.
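The count of seven schemes is consistent with taking every non-empty combination of the three feature groups (three single groups, three pairs, and one triple). The sketch below enumerates them and shows how a per-pixel feature vector could be assembled for a given scheme; the group labels, the helper name, and the feature values are hypothetical.

```python
from itertools import combinations

FEATURE_GROUPS = ("Spectral", "VI", "Texture")

# Every non-empty subset of the three groups: 3 + 3 + 1 = 7 schemes.
schemes = [
    combo
    for size in range(1, len(FEATURE_GROUPS) + 1)
    for combo in combinations(FEATURE_GROUPS, size)
]


def stack_features(pixel_features, scheme):
    """Concatenate the feature values of the groups named in the scheme."""
    vector = []
    for group in scheme:
        vector.extend(pixel_features[group])
    return vector


# Hypothetical feature values for a single pixel.
pixel = {
    "Spectral": [0.04, 0.07, 0.05, 0.42],
    "VI": [0.79, 0.48],
    "Texture": [0.31],
}

for scheme in schemes:
    print(" + ".join(scheme), len(stack_features(pixel, scheme)))
```

Each stacked vector would then be passed to the classifier, so the schemes differ only in which feature groups the classifier sees.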

Advantages of CSSDI for Separating the Cotton and Soybean
To improve crop differentiation and field extraction, we compared the spectral characteristics of cotton and soybean. The biggest spectral differences between the two crops were found in the red-edge band and the red band (Figure 6). Therefore, a new vegetation index, CSSDI, was proposed. With CSSDI, the F1 score in Huanggang increased by 7.75%. Red-edge is the spectral feature corresponding to the maximum slope in the reflectance profile of green vegetation [36]. Several studies that evaluated the capabilities of Sentinel-2 for vegetation classification have concluded that the red-edge band contributes significantly to accurate crop classification [40,41]. Xiao et al. developed red-edge indices (RESI) by normalizing three red-edge bands in Sentinel-2 and applied them to map rubber plantations [42]. The index was sensitive to changes in moisture content and canopy density of rubber plantations, with an overall accuracy of 92.50% and a kappa coefficient of 0.91. Kang et al. combined NDVI time series and NDRE red-edge time series and used a Random Forest algorithm for crop classification [43], resulting in better crop classification than with a single NDVI time series.
Due to limitations in spatial resolution, the use of red-edge bands for crop classification would be difficult for the entire Huanggang area. However, the results of this study suggest that red-edge bands play a key role in the further separation of soybean and cotton cultivated areas, and that combining the visible, near-infrared, and red-edge bands would result in better cotton area extraction.
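For readers unfamiliar with the two indices mentioned in the cited work, the sketch below computes NDVI and NDRE using their standard normalized-difference forms. It does not reproduce CSSDI, whose formula is defined elsewhere in the paper, and the reflectance values are hypothetical.

```python
def normalized_difference(a, b):
    """(a - b) / (a + b), returning 0.0 when the denominator is zero."""
    denominator = a + b
    return (a - b) / denominator if denominator != 0 else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index: near-infrared vs. red."""
    return normalized_difference(nir, red)


def ndre(nir, red_edge):
    """Normalized difference red-edge index: near-infrared vs. red-edge."""
    return normalized_difference(nir, red_edge)


# Hypothetical surface reflectance values for one vegetated pixel.
nir, red, red_edge = 0.45, 0.05, 0.15
print(f"NDVI={ndvi(nir, red):.2f}  NDRE={ndre(nir, red_edge):.2f}")
```

Because red-edge reflectance typically lies between red and near-infrared reflectance for green vegetation, NDRE is usually lower than NDVI for the same pixel, which is why the two can carry complementary information.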

Balance between F1 Scores and Area Errors Using LUSD
The spatial constraint method with LUSD generated high F1 scores in Jingzhou, with area estimates comparable to the statistical data. For Huanggang and Xiaogan, the cotton area estimates were 21-41% higher than the statistical data, which means that other categories were misclassified as cotton.
Huangmei county, which had a larger area error, was selected as the test area. Non-object sample augmentation was performed based on LUSD categories (e.g., water, building, forest, shrub, and bare land), with the LUSD samples evenly distributed throughout the study area. The cotton and other crop classes used samples collected in the field. The classification results are shown in Figure 12. Although the F1 score of the sample augmentation method was slightly lower than that of the mask method, the cotton area estimate was closer to the statistical value, with a difference of 0.49 kha (see Table 6). The SVM algorithm produced a serious salt-and-pepper effect in the classified images, especially at object boundaries [44]. The small number of sample types resulted in more noise in the cotton category. Although post-processing of classified images with morphological methods can reduce noise, such methods were not suitable for this study because the small cotton plots in the research area could themselves be treated as noise. Therefore, given the small cotton fields and limited samples available in this study, increasing the number of sample types using LUSD can reduce the number of pixels misclassified as cotton, maintaining the F1 score for cotton while reducing the area error relative to the statistical data.
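One way to draw evenly distributed non-object samples from a land-use raster is sketched below. The text does not state how the LUSD samples were drawn, so the tiling strategy, the function name, and the toy raster are all assumptions made for illustration: the raster is divided into square tiles and at most one pixel per non-object class is taken from each tile.

```python
def sample_per_tile(lusd, classes, tile=2):
    """Return {class: [(row, col), ...]} with at most one sample
    per class per tile, so samples are spread over the whole raster."""
    samples = {name: [] for name in classes}
    n_rows, n_cols = len(lusd), len(lusd[0])
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            seen = set()  # classes already sampled in this tile
            for r in range(r0, min(r0 + tile, n_rows)):
                for c in range(c0, min(c0 + tile, n_cols)):
                    label = lusd[r][c]
                    if label in samples and label not in seen:
                        samples[label].append((r, c))
                        seen.add(label)
    return samples


# Hypothetical 4 x 4 land-use raster; "crop" pixels are not sampled here
# because crop classes use samples collected in the field.
lusd = [
    ["water", "water", "crop", "crop"],
    ["water", "forest", "crop", "building"],
    ["crop", "crop", "forest", "forest"],
    ["bare land", "crop", "forest", "water"],
]
non_object = ("water", "building", "forest", "shrub", "bare land")

augmented = sample_per_tile(lusd, non_object, tile=2)
for name, points in augmented.items():
    print(name, points)
```

Capping the samples per tile keeps a large homogeneous class (for example, a lake) from dominating the training set, at the cost of ignoring within-tile variability.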

Conclusions
This study used the spatial constraint method and a multi-feature combination based on the SVM algorithm to extract cotton-cultivated areas in Hubei. The present work demonstrates a promising method for cotton extraction in areas with small plots and limited field samples. The main contributions of this paper are as follows:
1. Seven feature combination schemes were established, and the optimal scheme was selected for cotton extraction;
2. The CSSDI was established to improve the extraction accuracy of cotton in areas where cotton and soybean are planted together;
3. LUSD was used for spatial constraint, which serves two purposes: (a) LUSD provides accurate land type information to reduce the influence of non-crop classes on cotton extraction; (b) non-object sample augmentation addresses the problem of a small number of samples.
For the multi-feature combination, the scheme with VI and spectral features produced the optimal extraction accuracy, with F1 scores of 86.93%, 80.11%, and 71.58% for Jingzhou, Xiaogan, and Huanggang, respectively. In addition, the CSSDI was used to further differentiate cotton and soybean, increasing the F1 score in Huanggang to 79.33%. The spatial constraint method using LUSD can effectively reduce area errors for cotton extraction. The relative error for the cotton area in Jingzhou was 13.89%; however, the area errors in the other two regions were larger. An alternative approach (i.e., non-object sample augmentation) was tested for Huangmei county, where the area error decreased from 58.51% to 7.01%.
The CSSDI proposed in this paper is only for mixed planting of cotton and soybean in Huanggang. However, in the field investigation, we found mixed planting of cotton with other crops (e.g., watermelon, sesame, and peanut). Therefore, other index model sets need to be constructed in the next stage of research. Moreover, in the follow-up study, we will consider adding auxiliary data with social attribute information to further synthesize the cotton area extraction method. At the same time, we will collect more samples and optimize the feature combination strategy to improve the model and achieve higher crop classification accuracy.
Acknowledgments: All the authors are thankful to the public, private, and government sectors for providing help in field data collection. We are also grateful to the anonymous reviewers for their time and support.

Appendix B
The confusion matrices in Tables A3-A6 were the results of five-fold cross validation.
Table A6. Confusion matrix of Spectral + VI in Huanggang using CSSDI.