Cotton Fiber Quality Estimation Based on Machine Learning Using Time Series UAV Remote Sensing Data

Abstract: As an important factor determining the competitiveness of raw cotton, cotton fiber quality has received more and more attention. The results of traditional detection methods are accurate, but the sampling cost is high and the results lag behind, which makes it difficult to measure cotton fiber quality parameters in real time and at a large scale. The purpose of this study is to use time-series UAV (Unmanned Aerial Vehicle) multispectral and RGB remote sensing images combined with machine learning to model four main quality indicators of cotton fibers. A deep learning algorithm is used to identify and extract cotton boll pixels in the remote sensing images and improve the accuracy of quantitative spectral feature extraction. To simplify the input parameters of the model, a stepwise sensitivity analysis method is used to eliminate redundant variables and obtain the optimal input feature set. The results of this study show that the R2 of the prediction model established by a neural network is 29.67% higher than that of the model established by linear regression. When the spectral index used for prediction is calculated after removing the soil pixels, R2 improves by 4.01% compared with the ordinary method. The prediction model predicts the upper half mean length, uniformity index, and micronaire value well, with R2 of 0.8250, 0.8014, and 0.7722, respectively. This study provides a method to predict cotton fiber quality over a large area without manual sampling, which offers a new idea for variety breeding and commercial decision-making in the cotton industry.


Introduction
Nondestructive and low-cost large-scale prediction of crop quality parameters is of great help to governments and farmers in making correct agronomic decisions [1]. Obtaining crop trait data early helps to efficiently screen high-quality varieties from a large number of candidates, improve crop quality, and reap subsequent economic benefits [2]. The quality of cotton fiber directly determines the quality of cotton yarn and is a decisive factor in the value of cotton [3]. Although the traditional indoor detection method has high precision, it is time-consuming, labor-intensive, and expensive; therefore, it is necessary to explore new methods to predict cotton quality quickly and on a large scale [4]. At present, the indexes widely used to evaluate the quality of cotton fiber mainly include the upper half mean length, uniformity, breaking strength, maturity, and micronaire value. Taking fiber length as an example, current measurement methods can be divided into manual measurement and machine measurement [5,6].
Accurately extracting cotton boll pixels from remote sensing images can eliminate the influence of soil pixels when establishing the quality prediction model and make the model's predictions more accurate. At present, there is some research on extracting cotton bolls using machine learning, mainly based on traditional classification methods such as object-oriented classification [31] and random forest [32]. The appearance of the Fully Convolutional Network (FCN) brought deep learning models to the field of semantic segmentation and has continually spawned various optimized models. U-Net adopts a completely different feature fusion method [33]: instead of summing the corresponding points of the feature maps as FCN does, it concatenates features along the channel dimension to form higher-dimensional feature maps, which allows image features to be extracted at more scales [34]. At present, U-Net is widely used in the field of remote sensing; the ENVINet-5 used in this study is a semantic segmentation network with U-Net at its core. The purpose of this study is to (1) reveal the response of cotton fiber quality differences in remote sensing data, (2) achieve efficient acquisition of cotton fiber quality parameters at field scale, and (3) generate a visual heat map of the field distribution of key quality parameters of cotton fiber.
Figure 1 shows the geographical location of the experimental site, which is located in Binzhou City, Shandong Province, China. The fields belong to Shandong Lvfeng Agriculture Group Co. Ltd., with a total area of about 61,000 square meters and a planting density of 11 plants per square meter. A flowchart outlining the overall methodology is presented in Figure 2; subsequent sections describe each step in detail.

Field Sample Collection
Four fields were selected and marked A, B, C, and D, and 150 square sampling points with a side length of 0.9 m were set and marked with a wire frame and white cloth (Figure 3a) for the identification of the ROI (region of interest). After collection, all samples were put into air-permeable gauze bags and labeled (Figure 3b). A small sawtooth gin (Figure 3c) was used to remove the cotton seeds and obtain the lint, which was air-dried in a ventilated, dry environment for 5 days and sent for inspection. The quality parameter testing was completed by a professional organization holding a qualification certificate, using the HVI physical property test method for cotton fiber with GB/T 20392-2006 as the test standard. Five indexes are measured: the upper half mean length, uniformity index, breaking specific strength, micronaire value, and elongation. Table 1 shows the maximum, minimum, mean, and coefficient of variation (CV) for the five parameters.

It can be seen from the data in the table that elongation shows no significant difference among the collected samples and its distribution is relatively concentrated; therefore, only the upper half mean length, uniformity index, fracture specific strength, and micronaire value are predicted in this study.

Data Acquisition Equipment
Data collection used two types of drones to capture visible light and multispectral remote sensing data, as shown in Figure 4a: a DJI Phantom 4 RTK for the visible light data and a DJI Phantom 4 Multispectral for the multispectral data. The parameters of the two aircraft and the flight parameters during data collection are shown in Table 2. To obtain centimeter-level positioning accuracy, a time synchronization system reduces the clock error between the flight control system, the camera, and the RTK module to the microsecond level. Combined with the equipment attitude information, the positions of the camera lens center and the antenna center are compensated in real time so that the images carry accurate geographical position information. Data were collected at 2 p.m. on a cloudless day, and a calibration plate was photographed to obtain the reflectance (Figure 4b).

The preprocessing of the remote sensing data was completed on a Power Leader PR490P computing server. DJI Terra was used to stitch the images taken by the UAV; the software automatically reads the EXIF (exchangeable image file format) information of the images and finds same-name points across the individual images to generate an orthophoto of the research area.


Pixel-Level Fusion of Time Series RGB and Multispectral Data
In this study, 13 periods of remote sensing image data of cotton, from the bud stage to full boll opening, were collected, and the pixels were stacked along the channel direction to realize time-series data fusion (Figure 5). The most important step in the fusion process is to reduce the image registration error to less than 1 pixel, which is the premise of accurate feature extraction. A polynomial correction model was used for image registration, and the coefficients (a_ij and b_ij) in the polynomial were calculated by the least squares method using the image coordinates of the ground control points and the ground coordinates of their same-name points. Because the data has centimeter-level positioning accuracy, only a few control points are needed to meet the accuracy requirements. The update process of the image coordinates is shown in Formula (1), where (x, y) is the original coordinate and (u, v) is the coordinate of the same-name point. Formula (2) gives the relationship between the number of polynomial coefficients N and the polynomial order n; in general, a second-order polynomial model can handle most image registrations.
A second-order polynomial model needs at least six control points. When the number of control points exceeds six, the error is calculated according to Formula (3), and control points are added until the error is smaller than one pixel.
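Formulas (1)-(3) did not survive extraction. Under the standard polynomial correction model, which the surrounding description matches (a reconstruction, not the authors' verbatim notation), they would read:

```latex
% Formula (1): polynomial mapping from the original image coordinates
% (x, y) to the same-name point coordinates (u, v), order n
u = \sum_{i=0}^{n}\sum_{j=0}^{n-i} a_{ij}\, x^{i} y^{j},
\qquad
v = \sum_{i=0}^{n}\sum_{j=0}^{n-i} b_{ij}\, x^{i} y^{j}

% Formula (2): number of coefficients N per polynomial of order n;
% for n = 2, N = 6, hence at least six control points
N = \frac{(n+1)(n+2)}{2}

% Formula (3): registration error over m control points (RMSE form)
\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{k=1}^{m}
  \left[(u_k-\hat{u}_k)^{2} + (v_k-\hat{v}_k)^{2}\right]}
```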

Cotton Boll Pixel Recognition Based on Deep Learning
In this study, the ENVI deep learning module is used to extract cotton boll pixels from the remote sensing images, with the feature extraction part based on a U-Net network model (Figure 6a). Before convolution, pixels with a value of 0 are usually padded around the image to keep the size the same before and after convolution; however, as the number of model layers increases, the feature maps become more abstract and the error introduced by this artificial padding accumulates. The U-Net model does not perform this operation, which makes it difficult to restore the original resolution of the image. The "overlap tile" scheme solves this problem by reflecting and filling pixels at the edge of the image (Figure 6b); this method preserves the context of edge features while maintaining the original resolution of the image.
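The "overlap tile" edge handling can be illustrated with NumPy's reflect-mode padding (a minimal sketch of the idea, not the ENVI implementation):

```python
import numpy as np

# 4x4 single-band "image tile"
tile = np.arange(16, dtype=float).reshape(4, 4)

# Zero padding surrounds the tile with the artificial value 0...
zero_padded = np.pad(tile, 1, mode="constant", constant_values=0)

# ...whereas the overlap-tile scheme mirrors real pixels across the edge,
# so convolutions at the border still see plausible image context.
reflect_padded = np.pad(tile, 1, mode="reflect")

print(zero_padded.shape, reflect_padded.shape)  # (6, 6) (6, 6)
```

Both pads restore the spatial size lost by a 3x3 convolution, but only the mirrored version fills the border with values drawn from the image itself.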

Training Model
Once a set of ROIs is defined for each training raster (Figure 7a), the Deep Learning Labeling Tool automatically creates a label raster in the form of a binarized image (Figure 7b). In this study, 500 ROI regions were extracted, and the training set and the test set were divided at a ratio of 4 to 1.


In this study, the cross-entropy loss function is used to evaluate the classification error of the model; in the semantic segmentation task, it is calculated pixel by pixel. Initially, the model generates a random class activation grid, computes the loss between the grid and the label, and then updates the model parameters through gradient descent until the model converges. To avoid excessive computational load, the image is divided into tiles of a given size before being fed to the network.
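The per-pixel cross-entropy described above can be sketched in a few lines of NumPy, assuming a raster of predicted boll probabilities p and a binarized label raster y (toy values, not the study's data):

```python
import numpy as np

def pixel_cross_entropy(p, y, eps=1e-7):
    """Mean binary cross-entropy computed over every pixel of a label raster."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

# 2x2 toy prediction grid vs. binarized label raster (1 = boll pixel)
p = np.array([[0.9, 0.2], [0.8, 0.1]])
y = np.array([[1, 0], [1, 0]])
loss = pixel_cross_entropy(p, y)
print(round(loss, 4))
```

Gradient descent drives this quantity down; a perfect prediction (p equal to y) gives a loss near zero.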

Cotton Boll Opening Pixel Percentage
The extraction of cotton boll pixels is a binary classification task. The boll opening pixel percentage (BOP) is obtained by binarizing the output and calculating the average value within the sampling area, as shown in Figure 8.
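Because the binarized output is 1 for boll pixels and 0 otherwise, the BOP of a sampling area is simply the mean of the mask, as in this sketch (the 3x3 mask is a hypothetical stand-in for a 0.9 m x 0.9 m window):

```python
import numpy as np

def boll_opening_percentage(mask):
    """BOP: share of boll pixels (value 1) in a binarized sampling area."""
    return float(np.mean(mask))

# toy sampling window: 3 boll pixels out of 9
mask = np.array([[1, 0, 0],
                 [0, 1, 0],
                 [0, 0, 1]])
print(boll_opening_percentage(mask))
```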


Bayesian Regularized BP Neural Network
In this study, a two-layer BP neural network was built to explore the mapping relationship between cotton fiber quality and the spectral indexes (Figure 9). The Bayesian regularization training algorithm introduces a regularization coefficient into the traditional loss function and uses the Gaussian distribution as the prior probability density, which mitigates overfitting during model training. The traditional loss function is the sum of squared errors, calculated according to Formula (4), where t_i is the true value and x_i is the predicted value of the model.
Formula (6) is the regularization performance function of the network, where E_W is the mean square sum of all weight parameters in the network, and α and β affect the complexity and smoothness of the network, respectively; they can be calculated by Formula (9) when the effective number of network parameters is γ, and they are continually updated as training proceeds. All weights and biases in the network are given random values before the first training iteration. Assuming that the noise in D (the dataset) and the weight vector in M (the neural network model) obey Gaussian probability density distributions, the posterior probability density of the network parameters after an iteration can be calculated with the Bayesian formula.
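The referenced formulas were lost in extraction. In the standard Bayesian regularization formulation (MacKay; Foresee and Hagan), which the description above matches, they read as follows (a reconstruction, not the paper's verbatim equations):

```latex
% Formula (4): sum of squared errors over the N training samples
E_D = \sum_{i=1}^{N} (t_i - x_i)^2

% Formula (6): regularized performance function
F = \beta E_D + \alpha E_W

% Formula (9): update of the regularization coefficients given the
% effective number of network parameters \gamma
\alpha = \frac{\gamma}{2 E_W},
\qquad
\beta = \frac{N - \gamma}{2 E_D}
```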
Figure 12 shows the convergence of the loss function on the training set and the validation set; when the epoch reaches 24, the model has almost completely converged. A total of 500 points were randomly selected in the image as the test set, and the confusion matrix method was used to test the accuracy of the model; the results are shown in Table 3. Formula (10) is the calculation method for overall accuracy, where n is the number of correctly classified samples and N is the total number of samples.
Formula (12) is the calculation method for the Kappa coefficient, where P_o is the overall accuracy and P_c is the proportion of chance agreement. The overall accuracy and Kappa coefficient are 92.4% and 0.8204, respectively, which shows that the classification results of the model are highly consistent with the actual situation.
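Overall accuracy and Kappa can both be read off the confusion matrix, as in this sketch (the 2x2 matrix below is hypothetical, not the study's Table 3):

```python
import numpy as np

def overall_accuracy(cm):
    """Formula (10): correctly classified samples over total samples."""
    return float(np.trace(cm) / cm.sum())

def kappa(cm):
    """Formula (12): (P_o - P_c) / (1 - P_c), with P_c from the marginals."""
    n = cm.sum()
    p_o = np.trace(cm) / n
    p_c = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / n**2
    return float((p_o - p_c) / (1 - p_c))

# hypothetical boll vs. background confusion matrix for 500 test points
cm = np.array([[230, 20],
               [18, 232]])
print(overall_accuracy(cm), kappa(cm))
```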

All Parameters Participate in Modeling
Using the upper half mean length prediction as an example, this section explains how the prediction model is built; the modeling and prediction steps for the uniformity index, fracture specific strength, and micronaire value are identical. Two methods were used to calculate the spectral index in the sampling area: the first calculates the average value over the whole region; the second uses the U-Net segmentation results to calculate the average after removing the non-boll pixels in the region. The dataset is divided into a training set and a validation set at a ratio of 4:1, and the model is evaluated by ten-fold cross-validation.

Least Squares Modeling
A preliminary correlation analysis on the upper half mean length is performed using a multivariate linear equation. The dataset is divided into 10 equal parts and verified with a ten-fold cross-validation method. Figure 13 shows the ten-fold cross-validation results of the correlation between the estimated results of the linear regression model and the measured results of the sampling.
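The ten-fold cross-validation of a least-squares model can be sketched with NumPy alone; the synthetic data below (150 points, 12 features) merely stands in for the study's sampling points and feature set:

```python
import numpy as np

rng = np.random.default_rng(0)

# synthetic stand-in: 12 predictors -> upper half mean length
X = rng.normal(size=(150, 12))
true_w = rng.normal(size=12)
y = X @ true_w + rng.normal(scale=0.3, size=150)

def ten_fold_r2(X, y, k=10):
    """Least-squares fit on k-1 folds, R^2 on the held-out fold, averaged."""
    idx = np.arange(len(y))
    folds = np.array_split(idx, k)
    scores = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        Xtr = np.c_[X[train], np.ones(len(train))]  # add intercept column
        w, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        pred = np.c_[X[fold], np.ones(len(fold))] @ w
        ss_res = np.sum((y[fold] - pred) ** 2)
        ss_tot = np.sum((y[fold] - y[fold].mean()) ** 2)
        scores.append(1 - ss_res / ss_tot)
    return float(np.mean(scores))

print(ten_fold_r2(X, y))
```

On this perfectly linear synthetic data the held-out R^2 is high; on the real sampling data the same procedure yielded the much lower values reported below.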

According to the results of the cross-validation, R2 is generally low (average 0.56) and fluctuates greatly (MSE is 0.21). This may be because the relationship between the cotton fiber quality parameters and the spectral and texture features is complex, making it difficult for a traditional linear regression model to express the mapping between them.


BP Neural Networks Modeling
The hyperparameters of the neural network (such as the number of hidden layers and the number of neurons in each hidden layer) need to be set before training. In general, a two-layer network with sigmoid activation functions in the hidden layer and linear output neurons can fit a multi-dimensional mapping problem well, given a sufficient amount of data; with more hidden neurons and layers, the model's performance increases on the training set but decreases on the test set. This study uses a trial-and-error method to determine the number of neurons in the hidden layer. Figure 14 shows R2 for the different network structures. As can be seen from Figure 14, R2 fluctuates as the number of neurons increases, with peaks at four and nine neurons. In principle, the more neurons a network has, the more complex the data features it can fit; at the same time, it may learn noise or irrelevant features in the dataset, degrading performance on the test set. Comparing the accuracy on the test set with four and nine hidden neurons shows that the latter is higher.
There are twelve nodes in the input layer, which take the time-series spectral indexes and BOP as inputs. The hidden layer contains nine neurons connected to the twelve input nodes, and the upper half mean length is output through the output layer, which uses a linear function. Figure 15 shows the ten-fold cross-validation results of the correlation between the estimates of the BP neural network model and the measured sampling results.

Comparison of Results
It can be seen from the results that the prediction performance of neural network modeling is much better than that of linear regression, so a BP neural network was selected as the framework for the prediction model. In addition, in order to explore the impact of soil pixels on fiber quality prediction performance, 10 repeated modeling experiments were carried out in which, when calculating the spectral indexes, the average value of the cotton boll pixels segmented by U-Net was used instead of the average value over the whole region. Table 4 shows the accuracy of the estimation model in the three cases; the estimation accuracy is higher when the spectral indexes are calculated from pixels excluding soil, with R² increasing by 4.01%.
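The soil-removal step amounts to averaging a spectral index only over the pixels the segmentation labels as cotton boll, instead of over the whole plot. A minimal NumPy sketch (array names and the toy 2×2 plot are illustrative assumptions):

```python
# Minimal sketch of soil-pixel removal when computing a mean spectral index.
import numpy as np

def mean_ndvi(nir, red, boll_mask=None):
    ndvi = (nir - red) / (nir + red)
    if boll_mask is None:            # ordinary method: plot-wide average
        return ndvi.mean()
    return ndvi[boll_mask].mean()    # soil pixels excluded

nir = np.array([[0.5, 0.1], [0.5, 0.1]])          # toy 2x2 plot
red = np.array([[0.1, 0.3], [0.1, 0.3]])
mask = np.array([[True, False], [True, False]])   # U-Net boll pixels

print(mean_ndvi(nir, red))         # mixed with soil
print(mean_ndvi(nir, red, mask))   # boll pixels only
```

Because soil pixels drag the plot-wide mean toward the soil spectrum, masking them changes the index value that enters the model, which is the effect Table 4 quantifies.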

Remove Redundant Variables
In order to eliminate redundant variables, a sensitivity analysis was conducted on all input parameters of the model using a one-by-one elimination method. After each parameter is removed, the model is cross-validated and its R² and MSE are recorded (Table 5). A one-sided test (95% confidence interval) is used to evaluate whether the influence of each removed variable on the model is significant (Table 6). When using the simplified input variable set for prediction, the accuracy of the model remains essentially unchanged on the training and test sets. The result of the stepwise sensitivity analysis shows that each of the remaining input parameters has a significant impact on the performance of the estimation model.
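The one-by-one elimination can be sketched as dropping each feature in turn and measuring the change in cross-validated R². This Python sketch uses a plain linear model and synthetic data for brevity (the feature count and the 0.01 drop threshold are illustrative assumptions, not the study's test statistic):

```python
# Hedged sketch of one-by-one feature elimination with cross-validation.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.random((100, 5))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(0, 0.05, 100)  # only 2 informative

baseline = cross_val_score(LinearRegression(), X, y, cv=10, scoring="r2").mean()
drops = {}
for j in range(X.shape[1]):
    X_reduced = np.delete(X, j, axis=1)            # remove feature j
    r2 = cross_val_score(LinearRegression(), X_reduced, y,
                         cv=10, scoring="r2").mean()
    drops[j] = baseline - r2                       # large drop => not redundant

redundant = [j for j, d in drops.items() if d < 0.01]  # illustrative threshold
print(sorted(redundant))
```

Features whose removal barely changes R² are candidates for elimination; the study additionally applies a one-sided significance test before discarding them.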

Summary of Cotton Fiber Quality Parameter Prediction Models
Using the same method, prediction models for the uniformity index, fracture-specific strength, and micronaire value were established. The optimal input variables of the four prediction models are the same, which suggests that the physiological processes underlying the formation of cotton fiber quality parameters are consistent. RSI, NDVI, MTCI, OSAVI, and BOP can be used to predict cotton fiber quality parameters; the model performance parameters are shown in Table 7. The prediction models perform well for the upper-half mean length, uniformity index, and micronaire value, with R² around 0.8, but poorly for fracture-specific strength, with an average R² of only 0.7264. It is speculated that features sensitive to changes in fracture-specific strength were not included among the model's input variables. The average errors of the four indexes in the one-square-meter validation are 0.5502, 0.6625, 0.5674, and 0.1595, respectively.

Visualization of Quality Prediction
The image is input into MATLAB as a matrix, the output matrix is visualized as an image, and the visible-light orthophoto is used as the base map to obtain the distribution map of cotton fiber quality parameters. Figure 16 shows the distribution of cotton fiber quality parameters in the field intuitively. In practice, it is meaningless to predict fiber quality for pixels with a BOP of 0, so the model eliminates pixels with a BOP of 0.
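The BOP-zero masking step can be sketched as replacing those pixels with a no-data value before rendering. The study used MATLAB; this NumPy version is illustrative, and the toy arrays are assumptions:

```python
# Hedged sketch: mask out pixels with BOP = 0 before drawing the quality map.
import numpy as np

bop = np.array([[0.0, 0.4],
                [0.7, 0.0]])          # boll pixel proportion per pixel
quality = np.array([[28.1, 29.3],
                    [30.2, 27.5]])    # predicted fiber length (mm), toy values

masked = np.where(bop > 0, quality, np.nan)   # NaN pixels are left undrawn
print(masked)
```

With e.g. matplotlib, `plt.imshow(masked)` leaves the NaN cells transparent, so only boll-bearing pixels are colored over the RGB orthophoto base map.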

Discussion
This study used a time series of UAV remote sensing images to estimate cotton fiber quality indexes, and good accuracy was achieved. High-resolution visible-light images can compensate for the limited spatial resolution of traditional multispectral sensors to a certain extent. Inputting time-series fused features improves the accuracy of the model, and machine learning models improve it further. By combining machine learning with multi-source data fusion, efficient and accurate monitoring of cotton fiber quality parameters is realized.

Correlation between Reflectivity and Quality Parameters of Cotton Fiber
The spectral and texture information of UAV images has been widely used in crop phenotyping research. Absorption in the near-infrared spectral region mainly comprises the combination and overtone vibrational absorption of high-energy chemical bond groups (C-H, N-H, O-H, S-H, C=O, and C=C). It therefore contains almost all the information on hydrogen-containing groups in organic matter, which is why multispectral vegetation indexes correlate strongly with phenotypic traits [35][36][37]. Compared with multispectral images, RGB images usually have a higher spatial resolution, which helps to extract richer texture information and enables fixed-point, quantitative, and accurate collection of spectral features. However, a single data source has great limitations [38]: RGB images cannot obtain spectral information in the NIR band, while multispectral images lose many texture features. In this research, features derived from both sources, such as BOP and the soil pixel mask, contributed substantially to improving the monitoring accuracy of cotton fiber quality (Tables 5 and 6). Moreover, Thompson et al. showed that vegetation indexes based on the red-edge band can grade cotton fiber maturity well, and the correlation was not breed-specific, suggesting that spectral monitoring of cotton fiber quality may apply to multiple cotton genotypes [39].
NIR spectroscopy is generally considered a cost-effective alternative to traditional laboratory methods and systems [40]. Because the near-infrared reflectance spectrum is sensitive to cellulose content, there is a strong correlation between fiber quality and the reflectance spectrum. Near-infrared spectroscopy has been widely studied for the qualitative analysis of textile fibers, traceability, the impurity content of raw cotton, the qualitative analysis of blended fabrics, etc. The accuracy of using near-infrared spectroscopy to estimate the micronaire value exceeds 97% [41], which is consistent with the correlation analysis of this study: all spectral indexes that significantly influence the estimation results are calculated from the NIR band (Figure 10). However, the relationships between cotton fiber quality and the spectra, structural parameters, and texture information are complex, and traditional linear regression models may struggle to express these mapping relationships. Compared with traditional least-squares linear regression (Figure 13), machine learning can achieve higher-precision regression through trainable parameters (Figure 15). In this study, we used deep learning to segment cotton bolls to obtain BOP and removed soil pixels to obtain more accurate spectral indexes. The results show that more accurate spectral indexes, a neural network algorithm, and time-series data together improve the estimation accuracy of cotton fiber quality parameters.
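For reference, three of the NIR-based indexes named in this work have standard textbook definitions, sketched below. The band choices follow the common definitions (Rouse et al. for NDVI, Rondeaux et al. for OSAVI, Dash and Curran for MTCI) and are not necessarily the exact bands of the study's sensor; the reflectance values are toy numbers.

```python
# Hedged sketch: textbook formulas for NDVI, OSAVI, and MTCI from per-band
# reflectance arrays. Band assignments are standard definitions, assumed here.
import numpy as np

def ndvi(nir, red):
    # Normalized Difference Vegetation Index
    return (nir - red) / (nir + red)

def osavi(nir, red):
    # Optimized Soil-Adjusted Vegetation Index (soil-adjustment term 0.16)
    return (nir - red) / (nir + red + 0.16)

def mtci(nir, red_edge, red):
    # MERIS Terrestrial Chlorophyll Index, built on the red-edge band
    return (nir - red_edge) / (red_edge - red)

nir, red_edge, red = np.array([0.45]), np.array([0.30]), np.array([0.10])
print(ndvi(nir, red)[0], osavi(nir, red)[0], mtci(nir, red_edge, red)[0])
```

All three draw on the NIR (and, for MTCI, red-edge) reflectance, consistent with the observation above that the influential indexes are NIR-based.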

Limitations and Prospects
Because this study did not carry out large-scale, cross-regional sampling, regional differences were not considered. Therefore, the next step of this study is to collect more time series data from different regions and constantly adjust and optimize the model. In addition, the hyperspectral full-frame imaging system can obtain canopy spectral reflection data with a wider band range and higher spectral resolution, which may help to improve the accuracy of fiber quality monitoring. The research results have important practical significance for improving the efficiency of cotton breeding research, especially in breeding experiments involving thousands of samples.

Conclusions
In this study, time-series RGB and multispectral UAV images were used to obtain relevant cotton canopy attributes and establish a machine learning model for monitoring cotton fiber quality parameters. Combining high-resolution RGB images with multispectral information gives strong predictive power for cotton fiber quality. The model predicts the upper-half mean length, uniformity index, and micronaire value well, with R² values of 0.8250, 0.8014, and 0.7722, respectively. Redundant input variables were eliminated through sensitivity analysis to obtain the optimal subset of input variables, which reduces the cost of collecting multispectral data. In addition, when the spectral indexes used for prediction were calculated after removing soil pixels, R² was 4.01% higher than when they were calculated from plot-average values. This research establishes large- and small-scale cotton fiber quality parameter prediction models, providing a valuable tool for cotton breeding research. The results enable UAVs to replace most manual inspection work, greatly improving the efficiency of cotton breeding research and accelerating the breeding of high-quality cotton varieties.
Author Contributions: Conceptualization, data curation, formal analysis, methodology, supervision, investigation, writing-original draft, review and editing, W.X.; data curation, formal analysis, methodology, software, validation, visualization, writing-original draft, review and editing, W.Y.; conceptualization, methodology, supervision, L.Z.; methodology and investigation, P.C.; methodology and investigation, Y.Z.; conceptualization, funding acquisition, project administration, software, supervision, writing-review and editing, Y.L.; equal contribution to this work, W.X. and W.Y. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The data used in this study are available from the corresponding author upon reasonable request.