Automatic Cotton Mapping Using Time Series of Sentinel-2 Images

Large-scale crop mapping is essential for agricultural management. Phenological variation often exists within the same crop because of differences in climatic region or management practice, so current classification models require sufficient training samples from different regions. However, sample collection is time-consuming, costly, and labor-intensive, so it is necessary to develop automatic crop mapping models that require only a few samples and can be extended to a large area. In this study, a new white bolls index (WBI) based on the unique canopy of cotton at the bolls opening stage is proposed, which characterizes the intensity of bolls opening. The value of the WBI increases as more bolls open. As a result, the WBI can be used to automatically distinguish cotton from other crops. Four study areas in different regions were used to evaluate the WBI performance. The overall accuracy (OA) for the four study sites was more than 82%. Additionally, the date when the bolls opening stage begins can be determined from the WBI time series. The results demonstrate the potential of the proposed approach for cotton mapping using Sentinel-2 time series.


Introduction
Agriculture is a main contributor to the global economy, food security, and the ecological environment, and it provides grain, non-staple food, and industrial raw materials for human beings [1][2][3]. Cotton is one of the world's most important fiber crops, supplying some 79% of the world's natural fiber and supporting one of the largest textile industries, worth at least $600 billion annually worldwide [4][5][6]. More than 100 countries produce cotton, and the total area for cotton production worldwide was 33 M ha in 2014. Of these countries, India, China, the United States, Pakistan, Brazil, Australia, Uzbekistan, Turkey, Turkmenistan, and Burkina Faso are the top ten cotton producers [6]. Different cotton varieties have been developed, for example, genetically engineered Bacillus thuringiensis (Bt) cotton, which has rapidly increased cotton production [7]. Therefore, it is necessary to map cotton distribution automatically and in a timely manner over wide areas to keep pace with the significantly accelerated breeding process.
Satellite Image Time Series (SITS) have been widely used to identify crops in large regions because of their global coverage and short revisit periods to capture canopy coverage of crops at various growth stages [8][9][10][11]. As one crop type, cotton can be mapped with crop classification technologies based on remote sensing images.
Generally, classification technologies involve two steps. Firstly, time-series features of the original spectral bands [12] or vegetation indices [13][14][15] are extracted, including statistical characteristics [16][17][18] (mean, variance, maximum values, transformed components, etc.) and phenological features [19][20][21][22][23][24] (the start of greening/season, the peak of the season, the end of the season, fitted curve parameters, and so forth). Secondly, classification models are trained on these features with training samples. The most widely used classifiers include random forest (RF), decision tree (DT), support vector machine (SVM), K-nearest neighbor (KNN), and advanced deep learning models [25][26][27]. The trained models can then predict the crop classes.
As is well known, phenology varies within the same crop because of differences in climatic region or management practice [28], so models trained with samples from one region fit new regions poorly. Updating the training samples can solve this problem, but it is time-consuming, costly, and labor-intensive. Recently, several studies have explored unique phenological characteristics to map crops automatically without large amounts of training data. For example, Ashourloo et al. (2018) developed a new feature for identifying alfalfa from its phenological patterns in a multi-harvest one-year time series of Landsat 8 data, which detected alfalfa with an overall accuracy better than 90% [29]. Qiu et al. (2017) found that the growth period of winter wheat was longer than that of other crops and, on this basis, developed a temporal index within an EVI time series to map winter wheat [30]. Other studies have derived identifying characteristics of the crop canopy at a certain phenological stage to map crops. For example, Andrimont et al. (2020) utilized a yellow flower index in the flowering stage to detect canola [31]. Ashourloo et al. (2019) developed a Canola Index to characterize the yellow flowers and automatically map canola [32].
Observations from previous studies indicate that most crops display a particular phenological or canopy pattern at a certain phenological stage. Compared with phenological patterns, canopy patterns are bound to appear at some point in the season, even under different climatic regions or management practices. Hence, mapping crops automatically with a unique canopy feature is the more robust approach.
Similar to canola with its yellow flowers in the flowering period, cotton also has a distinctive canopy trait that has received little attention in current research. The growth stages of cotton are planted (when the seeds are placed in the ground), squaring (when a small triangular leaf-like structure appears on the growing tip of the main stem and/or branches), setting bolls (when one bloom or boll is visible), bolls opening (when white fibers are visible on at least one boll), and harvested (when the cotton is cut or gathered from the field) [33]. Cotton is known as "white gold" in some countries [34], which vividly describes the cotton canopy in the bolls opening stage. Since open bolls are full of white fiber, the cotton canopy appears white, which is a distinctive trait that differs from other crops.
In this research, a white bolls index (WBI) that characterizes the intensity of bolls opening during cotton growth is constructed with few training samples. The value of the WBI increases as more bolls open and decreases otherwise. Once constructed, the WBI can be used to map cotton under various cultivation practices and climates without additional training samples. Additionally, the date on which the bolls opening stage begins can be identified from the WBI time series.
The rest of the paper is organized as follows. The materials and methods of the study are described in Section 2. Results and discussion are presented in Sections 3 and 4, respectively. Section 5 provides the conclusions of the study. Figure 1 shows the research stages and flowchart of this study. Firstly, the 4 study sites are described and the datasets, including satellite image time series and ground references, are presented. Then, the WBI of cotton is constructed based on temporal-spectral conjoint analysis with training samples. Thirdly, cotton is mapped in the 4 study sites with the empirical threshold method. Fourthly, the start date of the bolls opening stage is derived from the WBI time series based on the cotton maps. Finally, we evaluate the classification results with test samples and the start date of bolls opening with the phenological data from the United States Department of Agriculture National Agricultural Statistics Service (USDA-NAS) and ground surveys.

Study Sites
Four study sites were chosen to test the proposed index under different environmental conditions. These included Missouri's 8th District (MO-08), California's 21st District (CA-21), Georgia's 8th District (GA-08), and Shihezi and its surrounding areas (Shihezi reclamation area) in China's North Xinjiang (XJ). Cotton at all 4 study sites is cultivated mechanically, including sowing, fertilization, irrigation, weed removal, and insect management during the cotton growth period. Additionally, defoliation should be carried out to ensure optimum yield and fiber quality. However, the specific practices performed depend on the field situation. The cotton planting area ranges from smaller to larger in these areas. These areas, which have various agro-climatic conditions, climates, and crop types, are situated from west to east. The location of the study sites is shown in Figure 2, and the climatological and agricultural conditions at each site are shown in Table 1.

MO-08 is one of eight congressional districts in the state of Missouri. The district encompasses rural Southeast Missouri and South Central Missouri as well as some counties in Southwest Missouri. Dunklin and Stoddard counties rank in the top 3 in the state for cotton; hence, these 2 counties were selected as the study site. Other major crops include soybean, rice, and corn. Cotton is planted in late May, the bolls open in mid to late September, and the crop is harvested in mid to late October and early November. Figure 3 shows the phenological stages of cotton near the study site from the ecosystem phenology camera network (PhenoCam) (https://phenocam.sr.unh.edu/webcam/, accessed on 1 March 2021).
CA-21 is a congressional district in the U.S. state of California. It is located in the San Joaquin Valley and includes Kings County and portions of Fresno, Kern, and Tulare counties. This paper selected Fresno and Kings as the main study areas, since more cotton was planted here than in other counties. Cotton is the second major crop in CA-21, while the first major crop is fruit trees. There are various fruit trees, such as almonds, grapes, and pistachios. Cotton in CA-21 is planted in late April, the bolls opening stage starts in late September and early October, and cotton is then harvested in mid to late November.
GA-08 is one of the congressional districts in the U.S. state of Georgia. Our study site here includes 9 counties, half of which rank in the top 10 in the state for cotton planting. Other major crops here include peanuts and corn. Cotton in GA-08 is planted in late April, the bolls opening stage starts in late September and early October, and cotton is harvested after mid-November.
Xinjiang is the biggest cotton-growing province in China, where 36% of cropland is cotton. The reclamation area of Shihezi is one of the major production bases for high-quality cotton in Xinjiang, where 70% of cropland is covered by cotton. Hence, it is selected as the study site, and other crops include wheat, corn, and fruit trees. Cotton here is planted from early to mid April, bolls are opening from late August to early September, and cotton is harvested in late October.

Satellite Images
Time series of Sentinel-2 images covering the study sites from April to November 2019 were used to train the white bolls index. Cloud percentage was limited to less than 10%. The corresponding years and Julian days are shown in Table 2. As can be seen in Table 2, CA-21 has the most images, with a total of 39 scenes; MO-08, GA-08, and XJ have 20, 11, and 22 scenes, respectively. Ten bands were used to extract the white bolls index, namely B2 (Blue), B3 (Green), B4 (Red), B5 (RedEdge1), B6 (RedEdge2), B7 (RedEdge3), B8 (NIR), B8A (narrow NIR or RedEdge4), B11 (SWIR1), and B12 (SWIR2), taken from the Google Earth Engine (GEE) surface reflectance products, in which reflectance is scaled by 10,000.
Surface reflectance products were computed by running sen2cor on Level-1C products. B2, B3, B4, and B8 have a 10 m resolution and were resampled to 20 m with the bilinear algorithm in GEE. Additionally, temporal interpolation was applied to remove clouds from the time series. This pre-processing yields cloud-free 20 m surface reflectance time series.
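The temporal interpolation step can be illustrated with a small sketch. The interpolation method is not specified in the text, so linear interpolation between the nearest clear observations is an assumption here, and the function name `fill_cloud_gaps` is hypothetical.

```python
def fill_cloud_gaps(days, values):
    """Fill cloud-masked observations in one pixel's time series.

    days   -- acquisition days of year, strictly increasing
    values -- reflectance per acquisition; None marks a cloud-masked value
    Assumption: gaps are filled linearly between the nearest clear dates.
    """
    valid = [(d, v) for d, v in zip(days, values) if v is not None]
    if not valid:
        raise ValueError("no clear observation in the series")
    filled = []
    for d, v in zip(days, values):
        if v is not None:
            filled.append(float(v))
            continue
        before = [(vd, vv) for vd, vv in valid if vd < d]
        after = [(vd, vv) for vd, vv in valid if vd > d]
        if before and after:
            (d0, v0), (d1, v1) = before[-1], after[0]
            filled.append(v0 + (v1 - v0) * (d - d0) / (d1 - d0))
        else:
            # Gap at the start or end of the series: repeat the nearest clear value.
            nearest = before[-1] if before else after[0]
            filled.append(float(nearest[1]))
    return filled


# Made-up reflectance values (scaled by 10,000) on four acquisition dates.
print(fill_cloud_gaps([100, 110, 120, 130], [1000, None, 2000, None]))
```

In this toy series the second date is filled halfway between its neighbors, and the trailing gap repeats the last clear value.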

Ground References
Crop types of 2058 fields located in the 4 study sites of MO-08 (851 fields), CA-21 (734 fields), GA-08 (254 fields), and XJ (219 fields) were collected. In MO-08, 239, 244, 187, 152, and 29 field samples for cotton, soybean, corn, rice, and peanuts were included in the collected dataset. In CA-21, 159, 415, 59, 57, and 44 field samples were included for cotton, fruit trees (grape, almond, pistachio, walnut), corn, rice, and peanuts. In GA-08, there were 159, 42, 14, and 39 field samples for cotton, peanuts, pecans, and corn. In XJ, there were 122 and 97 field samples for cotton and other crops including corn, grapes, wheat, and sorghum. The size of the fields in the study areas varied widely, between 0.3 and 50 ha.
The U.S. ground cotton phenological stages are from PhenoCam and the United States Department of Agriculture National Agricultural Statistics Service (USDA-NAS). Field investigation, meteorological stations, and references [35,36] were used to derive the ground phenological stages in XJ. Table 3 shows the training and test samples. Of the 851 field samples of MO-08, 5% were used as training samples to construct the WBI. The rest of the samples were left to test the model. The reflectance of the samples over the whole growth period was analyzed to produce the white bolls index.

Temporal and Spectral Conjoint Analysis
The Sentinel-2 time series was used to study the reflectance values of cotton and other crops from planting to harvest. The training site is MO-08, where the growth period of cotton is similar to that of corn, sorghum, soybean, rice, and peanuts. Figure 4 shows the normalized difference vegetation index (NDVI) of various crops. As shown in Figure 4, cotton, peanuts, and rice have almost the same cultivation dates, which makes them difficult to distinguish based on NDVI alone.
Remote Sens. 2021, 13, x FOR PEER REVIEW 7 of 20
As is well known, the canopy of most crops turns from green to yellow during the growth period. However, cotton has a unique canopy feature compared with other crops: the canopy turns white at the bolls opening stage. The unique white bolls of cotton provide the clue for automatically distinguishing cotton from other crops.
To extract the characteristic of white bolls, the training samples are divided into 2 classes: white bolls (class C, the bolls opening stage of cotton) and others (class O, including other crops and the other phenological stages of cotton). According to USDA-NAS, in MO the bolls open in mid to late September and the cotton is harvested in mid to late October and early November; therefore, Julian days 261 (September 18) to 291 (October 18) were selected as the bolls opening stage. The scatter plot of the reflectance in different temporal images is shown in Figure 5.
The Blue, Green, RedEdge3, and narrow NIR bands were combined to demonstrate the distinctions between white bolls and others. Figure 5a shows the various phenological stages of cotton. On days 261~291 of the bolls opening stage, the spectral scatter points gather in the top-left corner, and most of the points are smaller than 200 in the X-direction and larger than 300 in the Y-direction. The points in the other phenological stages of cotton, including planted, squaring, setting bolls, and harvested (the color of the canopy is brown, green, and brown in turn), are distributed at values larger than 100 in the X-direction and smaller than 400 in the Y-direction. Similarly, Figure 5b shows the same distribution, where non-cotton crops such as corn, sorghum, soybean, rice, and peanuts on days 106~331 are represented by the gray points.
The above analysis of reflectance on different dates shows that Sentinel-2 time series can potentially be used to extract the white bolls of cotton.

Discriminate the Cotton Bolls from Other Crops
To find the optimal index representing the characteristic of white bolls, the first step is maximizing the discrepancy between the reflectance values of white bolls (class C) and others (class O). The second step is constructing the white bolls index, indicating the number of white bolls. The larger the index value is, the more the bolls open up.
In the first step, linear discriminant analysis (LDA) [37] is used to separate the white bolls from the others:
$$\max_{w} J(w) = \frac{\left| w^{T} S_b w \right|}{\left| w^{T} S_w w \right|} \qquad (1)$$
where $S_b$ is the between-class scatter matrix, $S_w$ is the within-class scatter matrix, and $w$ is the transformation matrix. $S_b$ and $S_w$ are computed from the sample reflectance of class C and class O. After the optimization procedure, the optimal $w$ is obtained. The transformed reflectance then gives the maximum discrepancy between class O and class C:
$$P = w^{T} R \qquad (2)$$
where $R = [r_1, r_2, \ldots, r_n]^{T}$ is the reflectance of one sample, and $P = [p_1, p_2, \ldots, p_n]^{T}$ is the transformed reflectance.
In the second step, the index indicating the number of opening bolls is constructed as a function whose value changes with the intensity of opening bolls:
$$\mathrm{WBI} = f(p_1, p_2, \ldots, p_n) \qquad (3)$$
where $f(x)$ is a function of the transformed reflectance.
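As a rough illustration of the first step, the sketch below computes a two-class Fisher discriminant direction, proportional to the inverse within-class scatter times the difference of the class means, for samples with only two bands. This is a simplification and not the ten-band transformation used in the paper; the function names and the toy reflectance values are hypothetical.

```python
def mean_vector(rows):
    """Mean of a list of 2-band samples."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(2)]


def scatter_matrix(rows, mu):
    """2x2 scatter matrix of samples around their class mean."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fisher_direction(class_c, class_o):
    """Direction w proportional to S_w^{-1} (u_C - u_O) for two classes."""
    mu_c, mu_o = mean_vector(class_c), mean_vector(class_o)
    sc, so = scatter_matrix(class_c, mu_c), scatter_matrix(class_o, mu_o)
    sw = [[sc[i][j] + so[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu_c[0] - mu_o[0], mu_c[1] - mu_o[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]


def project(w, r):
    """Transformed reflectance p = w^T r for one sample."""
    return w[0] * r[0] + w[1] * r[1]


# Made-up two-band reflectance: class C (white bolls) and class O (others).
white_bolls = [(100, 400), (120, 420), (110, 390), (130, 430)]
others = [(300, 200), (320, 180), (310, 220), (290, 210)]
w = fisher_direction(white_bolls, others)
print([round(project(w, r), 2) for r in white_bolls])
print([round(project(w, r), 2) for r in others])
```

With these toy clusters, every class C sample projects to a larger value than every class O sample, which is the separation the index construction relies on.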

Compared with Supervised Classifier
To evaluate the potential of the proposed index, it was compared with two popular classifiers, namely SVM and a 1-dimensional convolutional neural network (1DCNN). SVM is a non-parametric classifier that makes no assumption about the distribution of the underlying data. SVM has recently been applied in many remote sensing studies [38,39]. The code we used is from the module sklearn.svm.
Deep learning models have been developed for land cover classification in recent years and achieve very high accuracy. 1DCNN has been the most popular model for crop classification based on satellite time series [25,40]. The code we used is from the GitHub repository bhavesh907-Crop-Classification.
In this paper, all the algorithms, including WBI, SVM, and 1DCNN, were trained with the same training samples and evaluated on the same test samples (as Table 3 shows). Because the number of images differs among the study sites, SVM and 1DCNN models trained on MO-08 cannot be applied directly to the other sites. Therefore, we selected the same number of images with nearby dates from each study site, trained SVM and 1DCNN on MO-08, and then predicted the other sites. We categorized the training and test data into two classes: cotton and other crops. The overall accuracy and kappa coefficient were used to evaluate the results.
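For reference, the two evaluation metrics can be computed as in the following minimal sketch. It uses the standard definitions of overall accuracy and Cohen's kappa; the function name and the toy labels are illustrative only and do not come from the study.

```python
def overall_accuracy_and_kappa(truth, predicted):
    """Return (overall accuracy, Cohen's kappa) for two label sequences."""
    n = len(truth)
    if n == 0 or n != len(predicted):
        raise ValueError("label sequences must be non-empty and equally long")
    labels = set(truth) | set(predicted)
    # Observed agreement: share of samples classified correctly.
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    # Chance agreement: product of the marginal class frequencies.
    expected = sum((truth.count(c) / n) * (predicted.count(c) / n)
                   for c in labels)
    if expected == 1:
        # Kappa is undefined when only one class occurs; reported as 0 here.
        return observed, 0.0
    return observed, (observed - expected) / (1 - expected)


# Toy two-class example: 4 cotton fields and 6 fields of other crops.
truth = ["cotton"] * 4 + ["other"] * 6
predicted = ["cotton"] * 3 + ["other"] * 6 + ["cotton"]
oa, kappa = overall_accuracy_and_kappa(truth, predicted)
print(round(oa, 3), round(kappa, 3))
```

Kappa discounts the agreement expected by chance, which matters here because the two classes are unbalanced at some sites.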

Extract the Start of Bolls Opening Time
Based on the white bolls index time series, the start date of bolls opening can be extracted with a phenological algorithm. The WBI time series is first filtered with a Savitzky-Golay filter [41] to remove noise. Because the value of the WBI starts to increase when the bolls open, the start of season time (SOS) of the WBI series can be taken as the bolls opening date.
The midpoint method [42] was applied to extract the SOS from the WBI series. The SOS threshold is set at 50% of the VI amplitude, and the ratio is computed as follows:
$$\mathrm{ratio} = \frac{VI - VI_{\min}}{VI_{\max} - VI_{\min}} \qquad (4)$$
where $VI_{\max}$ is the maximum value of the WBI time series, $VI_{\min}$ is the minimum value of the WBI time series, and $VI$ is the current value of the WBI time series.
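The midpoint rule can be sketched as follows, assuming the WBI series has already been smoothed (for example with a Savitzky-Golay filter). The function name, the linear interpolation between acquisition dates, and the toy values are assumptions made for illustration.

```python
def start_of_bolls_opening(days, wbi, ratio_threshold=0.5):
    """Day on the rising limb where the amplitude ratio reaches the threshold.

    days -- acquisition days of year, strictly increasing
    wbi  -- smoothed WBI values on those days
    """
    vi_min, vi_max = min(wbi), max(wbi)
    amplitude = vi_max - vi_min
    if amplitude == 0:
        raise ValueError("flat series: no amplitude to threshold")
    ratios = [(v - vi_min) / amplitude for v in wbi]
    peak = wbi.index(vi_max)
    # Walk back from the peak to the closest upward crossing of the threshold.
    for i in range(peak, 0, -1):
        lo, hi = ratios[i - 1], ratios[i]
        if lo < ratio_threshold <= hi:
            frac = (ratio_threshold - lo) / (hi - lo)
            return days[i - 1] + frac * (days[i] - days[i - 1])
    # The series is at or above the threshold from the first date onward.
    return days[0]


# Made-up smoothed WBI values around the bolls opening stage.
days = [230, 240, 250, 260, 270, 280]
wbi = [20, 20, 60, 140, 220, 200]
print(start_of_bolls_opening(days, wbi))
```

For this toy series the ratio passes 0.5 between days 250 and 260, so the estimated start date falls between those two acquisitions.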

Construction of White Bolls Index
In this study, LDA was used to separate the white bolls from the others. By optimizing Equation (1), the transformation matrix $w^{T}$ is obtained as shown in Equation (5), where $u_j$ corresponds to the mean value of class j. With the LDA transformation, white bolls and others were maximally separated. Figure 6 shows the scatter plot of the first two transformed reflectances, namely LDA1 and LDA2.
As shown in Figure 6, the values of white bolls were centered at −200 and those of other crops were concentrated at 0 in the projected direction. The white bolls index can be constructed based on the first two LDA components. Several combinations of the two LDA bands were tried for the separation of white bolls from other crops. Figure 7 shows the box plot for the different combinations.
Based on Figure 7, the combination $p_2 - p_1$ showed the highest separation potential for the identification of white cotton bolls from others. Therefore, the white bolls index was proposed as follows:
$$\mathrm{WBI} = p_2 - p_1 \qquad (6)$$
The WBI values for different crops on Julian days 106~331 in the study sites are shown in Figure 8. During the bolls opening stage (days 261~291), the mean WBI values of cotton are clearly larger than those of other crops: the WBI values of other crops are always lower than 100 in all growing stages, while those of white bolls are far higher than 100. This significant difference in WBI between cotton and other crops is sufficient to distinguish cotton from other crops.

Automatically Mapping the Cotton with White Bolls Index and Accuracy Assessment
In several remote sensing studies, empirical methods have been successfully employed for threshold determination [21,29], and such a method was used in this study to select acceptable threshold values. We used the training samples in MO to measure the kappa coefficients at various thresholds. According to Figure 7, the range of thresholds is set from 70 to 270 and the search step is 10. Figure 9 shows that the best accuracy occurs for WBI values between 100 and 220, which is an acceptable threshold range for distinguishing cotton from other crops. Additionally, before using the threshold to map cotton, it is better to mask non-cropland to avoid interference from other white ground objects.
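The empirical threshold search can be sketched as a simple sweep that scores each candidate threshold by its kappa coefficient. The text does not state which WBI value per field is thresholded, so using one representative WBI value per field is an assumption here, and the function names and toy data are hypothetical.

```python
def binary_kappa(truth, predicted):
    """Cohen's kappa for two sequences of booleans (True = cotton)."""
    n = len(truth)
    po = sum(t == p for t, p in zip(truth, predicted)) / n
    pt, pp = sum(truth) / n, sum(predicted) / n
    pe = pt * pp + (1 - pt) * (1 - pp)
    return (po - pe) / (1 - pe) if pe < 1 else 0.0


def best_thresholds(field_wbi, is_cotton, start=70, stop=270, step=10):
    """Sweep thresholds and return (thresholds with the best kappa, kappa)."""
    results = []
    for thr in range(start, stop + 1, step):
        # A field is labeled cotton when its WBI reaches the threshold.
        predicted = [v >= thr for v in field_wbi]
        results.append((thr, binary_kappa(is_cotton, predicted)))
    best = max(k for _, k in results)
    return [t for t, k in results if k == best], best


# Made-up per-field WBI values: four cotton fields, four fields of other crops.
field_wbi = [250, 230, 300, 240, 40, 90, 60, 95]
is_cotton = [True] * 4 + [False] * 4
thresholds, kappa = best_thresholds(field_wbi, is_cotton)
print(thresholds[0], thresholds[-1], kappa)
```

As in the text, the sweep typically yields a plateau of equally good thresholds rather than a single value, so any threshold inside the plateau is acceptable.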
Table 4 displays the accuracy of classification based on WBI, SVM, and 1DCNN. The models were trained with the same samples from MO-08 and applied to the test samples of MO-08 and the other study sites (CA-21, GA-08, and XJ). As the table shows, SVM and 1DCNN achieve very high accuracy in MO-08, above 97%, but their accuracy in CA-21, GA-08, and XJ is lower than that of WBI. The best results are shown in bold.

Figure 10 illustrates the derived cotton maps of the four study sites. The background is the true color composite of the R, G, and B bands.

Identification of Bolls Opening Date
The window of the SG filter is set to 5~11, and the ratio is set to 0.5. The SOS of the WBI series can be seen as the bolls opening date. Figure 11 shows the procedure of the extraction for the bolls opening date and the corresponding NDVI time series. As shown in Figure 11, the day that the bolls start to open is the day the NDVI value begins to decline.

Figure 11. (a) The WBI time series, SG fit, and the midpoint extraction for bolls opening date. (b) The NDVI time series and the start date of bolls opening.
Figure 12 shows the start date of bolls opening at the four study sites, and the ratios of different bolls opening dates are compared with USDA phenological data. The start dates of bolls opening estimated by the WBI and reported by USDA are generally consistent in the U.S. The ground surveys and references showed that the start date of bolls opening always occurred between days 238 and 252 in the north of XJ, which is consistent with the WBI estimation.

Discussion
In this study, a new white bolls index (WBI) was developed for automatic cotton mapping; it is mathematically simple and inexpensive to compute. The WBI indicates the degree of bolls opening and increases as more white bolls appear. Recently, a growing number of spectral indices for crop detection have been introduced based on remotely sensed time series, such as the yellow flower indices of canola, showing the potential of spectral indices in crop mapping. However, there have been no reports of cotton mapping with a spectral index. The WBI performed well both for mapping cotton and for determining the bolls opening date, which can be considered an important contribution of this paper. It is worth noting that the WBI does not require training data to map cotton fields.
In this paper, we applied LDA to 10 bands of Sentinel-2 (B2: blue, B3: green, B4: red, B5: redEdge-1, B6: redEdge-2, B7: redEdge-3, B8: NIR, B8A: narrow NIR, B11: SWIR-1, B12: SWIR-2) to derive the WBI. To assess the importance of each band, we computed the correlation coefficient between the WBI and the WBI obtained with that band removed. The lower the correlation coefficient, the more the band contributes. As Figure 13 shows, the blue, green, red, redEdge-3, and wide NIR bands carry the highest weights in the construction of the WBI, while the two SWIR bands contribute almost nothing.
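The band-removal test can be sketched as follows. This is only an illustration under stated assumptions: the index is treated as a weighted sum of band reflectances, and "removing a band" is taken to mean dropping its term without refitting the LDA, which may differ from the procedure actually used. The weights, the reflectance values, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")  # correlation is undefined for a constant series
    return cov / sqrt(vx * vy)


def linear_index(pixel, weights, skip=None):
    """Weighted sum of band reflectances, optionally leaving one band out."""
    return sum(w * pixel[b] for b, w in weights.items() if b != skip)


def band_importance(pixels, weights):
    """Correlation between the full index and the index without each band.

    A lower correlation means the removed band contributed more.
    """
    full = [linear_index(p, weights) for p in pixels]
    return {
        b: pearson(full, [linear_index(p, weights, skip=b) for p in pixels])
        for b in weights
    }


# Hypothetical three-band example (weights and reflectances are made up).
weights = {"B2": 0.8, "B4": 0.6, "B11": 0.01}
pixels = [
    {"B2": 0.10, "B4": 0.08, "B11": 0.30},
    {"B2": 0.25, "B4": 0.12, "B11": 0.22},
    {"B2": 0.18, "B4": 0.30, "B11": 0.28},
    {"B2": 0.32, "B4": 0.27, "B11": 0.20},
]
for band, r in band_importance(pixels, weights).items():
    print(band, round(r, 3))
```

A band with a near-zero weight leaves the index almost unchanged when removed, so its correlation stays close to 1, which matches the reading of the SWIR bands in the text.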
Figure 13. The correlation coefficient of WBI and WBI without a selected band.
The new feature is composed mainly of visible and near-infrared (NIR) bands. During the bolls opening stage, more and more white bolls open, which changes the canopy reflectance. The color of the white bolls is captured by the visible bands, and the senescence of cotton is indicated by the NIR bands. As shown in Figure 14a, the ground reflectance of white bolls is flatter across the visible bands, which corresponds to the color white. Meanwhile, the decreasing slope of the red edge and the lower NIR reflectance show that vegetation senescence is beginning, marking the bolls opening stage. Figure 14b displays a Google Earth image of cotton at the harvest stage in Xinjiang on 2014-10-16, which gives a satellite view of the white bolls in the canopy.
The MSI sensor records the radiation from white bolls, green or yellow leaves, and the soil background. Because cotton bolls are white, they enhance the reflection of visible wavelengths, which leads to a higher and flatter reflectance of cotton in the blue, green, and red wavelengths compared with other crops. Therefore, the blue, green, and red spectral bands contribute more to the separation of cotton from other crops. Additionally, during the bolls opening stage, the growth of cotton weakens markedly and nearly stops: the leaves gradually senesce and photosynthesis decreases. The red, redEdge-3, and narrow NIR bands are sensitive to the growth and health of vegetation, so they can detect senescence and play important roles in the construction of the WBI.
The proposed index was developed for the bolls opening stage, which lasts 15 to 30 days. A larger number of cloud-free images in this period leads to better WBI performance. Because of differences in climate and field management, the start date and rate of progress of bolls opening vary among regions; together with differences in the acquisition dates of the available images, this results in subtle variation in the thresholds. For our four study sites, we used thresholds of 150, 170, 150, and 200 for MO-08, CA-21, GA-08, and XJ, respectively, which achieved the best accuracy. According to our threshold tests at the study sites, the range [100, 220] can be considered suitable for detecting white bolls within cropland.
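The thresholding rule can be expressed as a short sketch. The site thresholds are those reported above; taking the maximum WBI over the cloud-free observations of the bolls opening stage, the use of `None` for cloudy observations, and the function name are assumptions made for illustration.

```python
# Site thresholds reported in the text.
SITE_THRESHOLDS = {"MO-08": 150, "CA-21": 170, "GA-08": 150, "XJ": 200}


def is_cotton(wbi_observations, site):
    """Label a pixel as cotton if its peak cloud-free WBI during the
    bolls opening stage reaches the site threshold.

    Taking the maximum over observations is an assumption of this
    sketch; None marks a cloudy (missing) observation.
    """
    clear = [v for v in wbi_observations if v is not None]
    if not clear:
        return False  # no usable image in the period
    return max(clear) >= SITE_THRESHOLDS[site]


# Hypothetical pixel with one cloudy date (values are made up).
print(is_cotton([120, None, 180], "CA-21"))
```

With more cloud-free dates in the period, the peak is more likely to be observed, which is consistent with the remark that more clear images improve performance.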
It is worth noting that the WBI describes the white bolls of cotton, a feature that is almost unaffected by regional climate. As Figure 12 shows, the four study sites have different dates for the bolls opening stage. However, the WBI responds only to white bolls, which are a common trait of cotton across regions and are not exhibited by other crops. Hence, cotton can be readily mapped from the WBI in the bolls opening period, and the SOS extracted from the WBI time series by phenological methods can be taken as the beginning of bolls opening.
Unlike the WBI classification, SVM and 1DCNN need new training samples for new regions; otherwise, they do not work satisfactorily. As shown in Table 4, SVM and 1DCNN produce very high accuracy at the training site (MO-08) because the test and training samples come from the same site, but they perform worse in CA-21, GA-08, and XJ. This shows that machine learning models often depend on local training samples and are hard to extend to large-scale crop mapping when such samples are lacking.
The results demonstrate that the proposed index has potential for automated cotton mapping at new locations without retraining. The WBI is very effective for Upland and Pima cotton, whose white bolls are prominent in the bolls opening stage, especially after defoliation. Its performance for tree cotton, whose bolls are not exposed on the canopy, may be weaker. The capability of the WBI should be further assessed in future studies.

Conclusions
In this study, the white bolls index was developed based on the spectral characteristics of cotton during the bolls opening stage. White bolls, which are specific to cotton and are not exhibited by other crops, are captured by the white bolls index, derived by linear discriminant analysis from a Sentinel-2 time series. Rapid cotton mapping without training samples was performed with the WBI, and the dates of bolls opening were identified from the WBI time series. The performance of the proposed index was assessed and compared with advanced classifiers such as SVM and 1DCNN, and the WBI performed well without training samples. Given the encouraging results of this study, the capability of the WBI should be further assessed for cotton types with less prominent white bolls.