UAV Data as an Alternative to Field Sampling to Monitor Vineyards Using Machine Learning Based on UAV/Sentinel-2 Data Fusion

Pests and diseases directly affect the yield and quality of grapes and cause noteworthy economic losses. Diagnosing lesions on vines as soon as possible and dynamically monitoring symptoms caused by pests and diseases at larger scales are essential to pest control. This study appraised the capabilities of high-resolution unmanned aerial vehicle (UAV) data as an alternative to manual field sampling for obtaining canopy sampling sets and for supplementing satellite-based monitoring, using machine learning models including partial least squares regression (PLSR), support vector regression (SVR), random forest regression (RFR), and extreme learning regression (ELR) with a new activation function. UAV data were acquired from two flights in Turpan to determine disease severity (DS) and disease incidence (DI) and were compared with field visual assessments. The UAV-derived canopy structure, including canopy height (CH) and vegetation fraction cover (VFC), as well as satellite-based spectral features calculated from Sentinel-2A/B data, were analyzed to evaluate the potential of UAV data to replace manual sampling data and to predict DI. SVR slightly outperformed the other methods, with a root mean square error (RMSE) of 1.89%. Moreover, the combination of canopy structure (CS) and vegetation indices (VIs) improved prediction accuracy compared with single-type features (RMSE_CS of 2.86% and RMSE_VIs of 1.93%). This study tested the ability of UAV sampling to replace manual sampling at large scale and discussed the opportunities and challenges of fusing different features to monitor vineyards using machine learning. Within this framework, disease incidence can be estimated efficiently and accurately for large-area monitoring operations.


Introduction
China, which has the second largest grape planting area in the world, has seen steady growth in planted area. The largest wine-grape growing region of China is Xinjiang province, with 371,152 acres planted in 2018. Especially in Turpan city, given the high economic benefits of this commodity crop and its enormous planted area and production, there is significant interest in developing a strategy to ensure grape quality and yield [1]. Previous studies have found that the quality and yield of grapes depend on their content of sugars, acids, and phenols, and that the accumulation of these substances during grape development and maturation is influenced by the health of the vines [2,3]. However, the incidence of pests and diseases seriously affects vine health [4]. Studies have shown that environmental (climate and rainfall) conditions and vineyard management programs play a vital role in the occurrence of pests and diseases. In particular, increases in temperature and humidity with climate change have caused increasing occurrences of pests and diseases [5]. Pests and diseases are a major threat to grape yield and composition, because they often dramatically affect the spongy tissues of leaves, change the ratio between different kinds of pigments, and damage photosynthetic pigments [6,7], thus affecting photosynthesis during the growth stage. The use of pesticides is still the main means of pest and disease control [8]. However, food security and sustainability are of major importance to agricultural production [9]. In recent years, many studies have shown that scientific pest management programs are important because overuse of pesticides can be harmful to human health and the environment [8,10]. Nevertheless, an in-depth understanding of the incidence and distribution of pests and diseases in vineyards for the purpose of pest control remains difficult for local governments and growers to achieve.
It is necessary to evaluate infection indicators as important evidence to predict the development of pests and diseases, so as to develop scientific and organic protection plans [9]. Monitoring vineyards affected by pests is an initial and crucial step during pest control, because it can provide reference information and valuable parameters for government and growers to generate a strategy for pesticide purchases [1].
To provide important information on vineyard status at different scales all year round for different managers, remote sensing is an outstanding choice [11]. Traditional and classical crop monitoring provides two main methods to monitor the status of crops and changes caused by various factors such as plant stress, pests, and diseases. One is measuring leaves and shoots under field conditions based on the polarization characteristics of reflected radiation, using a hand-held spectrometer and fluorometer to estimate plant stress [12]. In addition to leaf-scale measurements, vegetation indices, leaf water indices, and chlorophyll fluorescence can be measured for the entire canopy and crop [12,13]. Both traditional methods share the weakness that they do not lend themselves to large-scale monitoring operations [14]. Recent advances in remote sensing techniques provide an additional tool to monitor plants at the canopy scale, which has facilitated the discrimination of crops affected by pest infestation [15,16]. Among remote sensing platforms, satellites and unmanned aerial vehicle (UAV) platforms are the most widely used sensor carriers in present-day research. Although satellite optical data are widely used, their limitations due to spatial resolution and atmospheric effects cannot be ignored [17]. When separating the contributions of different land-surface components such as canopy and soil, the accuracy of satellite data cannot meet application requirements due to mixed pixels [18,19]. In addition, one crucial step of monitoring with Sentinel-2 imagery is collecting validation and reference data. Such data for remote sensing applications are traditionally acquired through manual field surveys, which carry a number of limitations and risks [20]. Moreover, the lack of three-dimensional canopy information impedes the accuracy of crop monitoring applications in precision agriculture [21].
In remote sensing monitoring applications, considering the weaknesses of satellite platforms and the fact that high-resolution data from UAV platforms can be easily obtained, more and more small commercial UAVs with various types of sensors are being used to provide measurements. Over the past decade, a growing number of researchers have used multi-sensor data to monitor crops [14]. In previous studies, many researchers have proposed various methods of combining canopy 3D structures extracted from UAV data with spectral information from Sentinel-2 data to estimate biophysical crop parameters such as chlorophyll (Chl) a content, Chl b content, and leaf nitrogen concentration (N) [1,22]. Researchers have also fused data from low-cost RGB, multispectral, and thermal sensors to estimate biophysical and biochemical parameters [14,23], such as nitrogen concentration and Chl a content [14]. For large-scale monitoring, UAV and Sentinel-2 data have been used to estimate the initial biomass of green algae in the sea [24], and the combination of UAV and Sentinel-2A data has been used to evaluate plant physiological status under water stress in vineyards [25]. Several researchers have used low-cost sensors integrated onto a UAV together with satellite imaging for stress detection [15]. Previous studies have shown that combining UAV-extracted information with Sentinel-2-based vegetation indices is effective for crop monitoring and disaster assessment [25][26][27].
However, as demand for precision agriculture increases from growers and governments, providing physiological monitoring results alone is not enough. It is necessary to provide visual results when monitoring crops to help growers and government workers visualize crop conditions. Unfortunately, it is difficult to provide intuitive quantitative data on disease incidence for growers, because quantitative studies on disease incidence are still lacking, especially for pergola crops such as vineyards. In addition, field sampling, the common method of assessing vine disease row by row in existing pest incidence studies, is a difficult process for several reasons: (1) it is laborious, expensive, and time-consuming [28]; (2) the reliability of ground-based positioning data may be affected by the density of canopy cover [29,30]; (3) field assessment is restricted by topography [31,32]; (4) given that vine canopy vigor is closely related to vine disease status, field assessment cannot observe whole vine canopies from a bird's-eye view [18]. To overcome these limitations, UAV data are used here as an alternative source of reference data to field assessment. In addition, the fusion of high-resolution UAV-derived information and Sentinel-2-based vegetation indices is used for monitoring vineyards and quantifying disease incidence and severity.
The fusion of satellite data with UAV data provides a wealth of information as explanatory variables to improve the estimation of grapevine quality. Nevertheless, the health status of the grapevine is determined by the interaction of many factors, and the relationships among them are not always linear, so it cannot be predicted well by linear statistical methods [33]. To handle the nonlinearity inherent in a large number of variables, machine learning is usually used for estimation. Over the past decade, the growing number of studies on modern agricultural applications of remote sensing using different machine learning (ML) methods has shown the capability of ML methods for classification and regression analysis. Approaches such as partial least squares regression (PLSR) [34], random forest regression (RFR) [35], support vector regression (SVR) [36], and the extreme learning machine (ELM) and its variants [37] have been used for a series of remote sensing-based agricultural analyses. PLSR has performed well in yield prediction of drought-stressed spring crops [38]. In particular, ELM and extended ELM algorithms, with the sigmoid activation function replaced by various new activation functions, have been applied to berry yield and quality prediction [33]. These previous studies demonstrated that ML methods can achieve accurate yield predictions by overcoming drawbacks of remote sensing datasets such as nonlinearity and spatial autocorrelation [39].
This study investigated the ability to combine Sentinel-2 data with UAV-derived canopy structure data for monitoring pest damage caused by Lycorma delicatula, using UAV data as an alternative to field data. This study assessed the following: (i) the feasibility of replacing field samples with UAV data to monitor pest incidence and severity; (ii) the capability of VIs calculated from Sentinel-2 data to accurately evaluate disease severity (DS) and disease incidence (DI) levels in pest-infected vineyards in Turpan; (iii) the potential for combining canopy spectral and structural features from UAV data with temporal Sentinel-2 data to monitor and predict DI in vineyards using ML methods.

Study Site and Field Data Collection
The study was conducted in a grape-growing area in Turpan (northwestern China, 87°6′E, 41°12′N), where various kinds of grape are widely planted and have been affected by pests and diseases in recent years, as shown in Figure 1. The region has a continental warm temperate desert climate with a mean annual precipitation of 16.4 mm and a mean annual temperature of 13.9 °C. As studied by Serrano [25], grapevine canopies are more affected by pests when high temperatures and heavy rainfall occur during the vegetative period and the initial growth period of the berries. Therefore, the Turpan grape plantation is a suitable research area. The field survey was conducted from 16 May to 18 May 2019, observing 25 fields. According to the information provided by grape growers and local professionals, different grades of symptoms and damaged area were classified as shown in Table 1, and individual rows within a sampling area were graded as 0 (Healthy), 1 (Initial infestation), 2 (Medium infestation), 3 (High infestation), or 4 (Very high infestation).
In the next section, these criteria were also used to determine whether a vine was diseased based on high-resolution UAV data. Using UAV data as an alternative to field data, high-resolution multispectral data were collected on 16 July 2018 and 25 May 2019 from an airborne vehicle carrying a Parrot Sequoia+ agricultural camera. The camera consists of two components. The first is a multispectral imager comprising four 1.2-megapixel global-shutter monochrome sensors registering bands in the spectral ranges Green (530-570 nm, central wavelength (CWL) at 550 nm), Red (640-680 nm, CWL at 660 nm), Red Edge (730-740 nm, CWL at 735 nm), and Near Infrared (770-810 nm, CWL at 790 nm), with a horizontal field of view (HFOV) of 61.9° and a vertical field of view (VFOV) of 48.5°, together with an RGB camera. When collecting each image, the four monochrome sensors were triggered simultaneously to produce four raw images in 12-bit TIFF format; each image also recorded the latitude, longitude, and ellipsoidal height in the WGS84 coordinate system, as well as the rotation angle, position accuracy, and rotation accuracy in the Exchangeable Image File (EXIF) information. Triggering all four sensors simultaneously and recording EXIF information solves the lens parallax problem and ensures accurate geographical registration of the four-band images during preprocessing. The second component is a Sunshine Sensor, which has the same interference filters as the four monochrome sensor bands and is equipped with a Global Positioning System (GPS) receiver, an Inertial Measurement Unit (IMU), and a magnetometer. The irradiance and the incidence angle at the moment of recording were also logged and later used in the radiometric calibration.
The Parrot Sequoia+ was set to automatic exposure in flight, with an ISO (International Organization for Standardization) sensitivity of 100. More sensor details can be found in the official Parrot Store.
In this study, the RGB map mosaicked from UAV data had a ground resolution of 6 cm, which made it possible to distinguish canopy from background components and to assess disease severity (DS) and disease incidence (DI) for every vineyard quickly. DS indicates the severity of the disease, and DI is the proportion of diseased rows in a plot [40,41]. A value of 0 was used to represent no incidence and 1 to represent incidence. DS and DI were estimated as follows:

DS = Σ(x_i × n_i) / (x_max × N) × 100%

DI = x / N × 100%

where x_i represents the DS level shown in Table 1, n_i the number of diseased rows at DS level x_i, x_max the highest DS level (4), x the number of diseased rows, and N the number of rows within a plot [41]. Two variables, ∆DS and ∆DI, were defined to express the increase or decrease in pests and disease in vineyards between the two years:

∆DS = DS_2019 − DS_2018

∆DI = DI_2019 − DI_2018
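As a minimal sketch of these definitions (the function name and example grades are invented here, and the grade-weighted form of DS with x_max = 4 is an assumption based on the grading scheme of Table 1), DS and DI for one plot can be computed from per-row grades:

```python
def disease_metrics(row_grades, max_grade=4):
    """Compute disease severity (DS) and disease incidence (DI) for one plot.

    row_grades: per-row infestation grades, 0 (healthy) .. max_grade (very high).
    Returns DS and DI as percentages.
    """
    N = len(row_grades)                          # total rows in the plot
    diseased = [g for g in row_grades if g > 0]  # rows showing any incidence
    x = len(diseased)                            # number of diseased rows
    DI = x / N * 100                             # percent of rows with incidence
    DS = sum(diseased) / (max_grade * N) * 100   # grade-weighted severity
    return DS, DI

# Example: a 10-row plot with two rows at grade 2 and one row at grade 4
# DS = (2 + 2 + 4) / (4 * 10) * 100 = 20%, DI = 3 / 10 * 100 = 30%
ds, di = disease_metrics([0, 0, 2, 0, 2, 0, 0, 4, 0, 0])
```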

Sentinel-2 Data
Sentinel-2 is a high-resolution multispectral imaging mission carrying a multispectral instrument (MSI). The Sentinel-2 system is made up of two satellites, 2A and 2B, each of which acquires data once every 10 days [17]. Sentinel-2 images have been freely available since 2015 [17]. Sentinel-2 improved the feasibility of satellite-based crop monitoring because its spectral bands in the red-edge region greatly increased the estimation accuracies of chlorophyll content, the fractional cover of forest canopies, and leaf area index (LAI) [11]. Furthermore, the short revisit interval (every 2-3 days) of the Sentinel-2 constellation at moderate latitudes provides abundant short-term information on crop status over large areas.
A multi-temporal Sentinel-2 dataset was used to analyze the correlation between VIs derived from Sentinel-2 data and the DS and DI acquired from UAV images, as well as the feasibility of detecting pests and disease using VI trends. The dataset consists of 22 cloud-free Sentinel-2 (2A and 2B) scenes covering the vegetative period and initial berry growth period over two years, from May to July 2018 and from May to July 2019. Level-1C (L1C) data covering May to July 2018 were obtained from Google Earth Engine (GEE) (https://earthengine.google.com/), and Level-2A (L2A) data covering May to July 2019 were downloaded from the European Space Agency (ESA) (https://scihub.copernicus.eu/dhus/#/home).

UAV Data Preprocessing
Previous studies have provided evidence that Pix4Dmapper is more user-friendly than other similar software [42]. We used Pix4Dmapper 4.3 for photogrammetric and radiometric processing of the raw UAV images in three steps: initial processing; point cloud densification; and generation of the digital surface model (DSM), orthomosaic, and index map. Raw images were radiometrically calibrated in a target-less automatic workflow [43]; the cooperation between the Parrot Sequoia+ and Pix4Dmapper provides absolute reflectance measurements without the need for a radiometric calibration target. The typical workflow and details are described in official technical papers [44]. In addition, an RGB orthomosaic with 6 cm resolution was produced from the RGB camera images in Pix4Dmapper for this study.

Sentinel-2 Data Preprocessing
The multi-temporal Sentinel-2 dataset included L1C and L2A data. The L1C data were atmospherically corrected to L2A, expressed in terms of bottom-of-atmosphere reflectance, with the Sen2Cor atmospheric correction processor (version 2.5.5) in ESA SNAP. Resampling was performed using a tool specifically designed for Sentinel-2 in ESA SNAP before calculating the VIs.
Meanwhile, the ability of UAV data to substitute for field data was tested using field samples and spectral analysis. Previous studies have shown that the spectral bands of the Parrot Sequoia+ are very similar to those of an ASD field spectrometer [45]. Therefore, this study tested the correlation between VIs derived from Sentinel-2 and from the Parrot Sequoia+ and compared the results with the appearance of the high-resolution UAV and Sentinel-2 images.
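The plot-by-plot agreement between platform-derived VIs can be checked with a plain Pearson correlation; below is a minimal numpy sketch in which the per-plot NDVI values are invented for illustration (the study's actual comparison uses the 144 vineyard plots):

```python
import numpy as np

# Hypothetical per-plot mean NDVI from each platform (one value per vineyard)
ndvi_uav = np.array([0.62, 0.55, 0.71, 0.48, 0.66, 0.59])
ndvi_s2 = np.array([0.60, 0.52, 0.69, 0.50, 0.63, 0.58])

# Pearson correlation coefficient between the two platforms
r = np.corrcoef(ndvi_uav, ndvi_s2)[0, 1]
r2 = r ** 2  # coefficient of determination (R^2), as reported in the text
```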

UAV Imagery-Based Canopy Feature Extraction
Canopy height is an important characteristic of canopy structure. Alessandro [46] investigated the correlation between canopy height (CH) and vine vigor by comparing CH with a vigor map obtained from the Normalized Difference Vegetation Index (NDVI).
A digital elevation model (DEM) was generated from the UAV point cloud using Pix4Dmapper (v.4.3), and a digital surface model (DSM) was generated from the UAV point cloud using ArcGIS. The canopy height (CH) was obtained by subtracting the DEM from the DSM [47,48].
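The CH computation is a per-pixel raster difference. A minimal sketch with numpy (the toy elevation values are invented; in practice the DSM and DEM would be read from the exported GeoTIFF rasters):

```python
import numpy as np

# Toy 3x3 elevation rasters in metres (stand-ins for the DSM/DEM GeoTIFFs)
dsm = np.array([[102.1, 102.3, 101.9],
                [102.5, 102.8, 102.2],
                [101.8, 102.0, 101.7]])
dem = np.array([[100.9, 101.0, 100.8],
                [101.1, 101.2, 101.0],
                [100.7, 100.9, 100.6]])

ch = dsm - dem             # canopy height model: surface minus terrain
ch = np.clip(ch, 0, None)  # clamp negative values (co-registration noise) to zero
```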
In addition, two indicators that reflect the density and vigor of vines have been used in previous studies: canopy coverage (CC) and vegetation fraction cover (VFC). Both are based on computing the percentage of green vegetation area per plot, using different methods. In this study, vegetation was extracted from the high-resolution RGB images using an SVM classifier [49][50][51], as shown in Figure 2. The classification result was tested using 288 randomly selected samples, with an overall accuracy of 96.9% and a Kappa coefficient of 0.936. CC was then calculated by dividing the area classified as grapes by the total plot area, and VFC was calculated from the multispectral information as follows:

VFC = (NDVI − NDVI_soil) / (NDVI_veg − NDVI_soil)

where NDVI_soil and NDVI_veg represent the NDVI of bare soil and of full vegetation coverage [52][53][54], respectively; these are normally replaced by the minimum and maximum NDVI in the test area. The two indicators are compared in the next section.
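The VFC computation can be sketched as a small numpy helper (the function name and the NDVI patch below are invented for illustration; NDVI_soil and NDVI_veg default to the scene minimum and maximum, as in the text):

```python
import numpy as np

def vfc(ndvi, ndvi_soil=None, ndvi_veg=None):
    """Vegetation fraction cover from an NDVI raster, scaled to [0, 1]."""
    ndvi = np.asarray(ndvi, dtype=float)
    if ndvi_soil is None:
        ndvi_soil = ndvi.min()  # bare-soil NDVI: scene minimum by default
    if ndvi_veg is None:
        ndvi_veg = ndvi.max()   # full-cover NDVI: scene maximum by default
    f = (ndvi - ndvi_soil) / (ndvi_veg - ndvi_soil)
    return np.clip(f, 0.0, 1.0)

# Example: VFC for a small NDVI patch; min (0.15) maps to 0, max (0.80) to 1
patch = [[0.15, 0.80], [0.45, 0.60]]
f = vfc(patch)
```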

Sentinel-2 Imagery-Based VI Feature Extraction
To find suitable VIs that correlate with changes in pest incidence and severity, this study selected 17 VIs that are sensitive to changes in vegetation-related physiological and biochemical parameters and that can be calculated from the Sentinel-2 spectral bands, as shown in Table 2. For the 144 vineyards, the average of the accurate values was used to fill gaps, and daily VIs were reconstructed with a Savitzky-Golay (S-G) filter to reduce the effect of atmospheric conditions and other factors [55]. The time-series VI consists of data from three weeks before and after the UAV sampling date, and the ratio VI_2019/VI_2018 was calculated to determine the strength of the statistical relationships among VI, ∆DI, and ∆DS through Pearson correlations and p-values.
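The smoothing step can be sketched with a minimal Savitzky-Golay-style smoother, i.e., a local polynomial fit over a sliding window (the window length, polynomial order, and NDVI series below are illustrative assumptions, not the study's settings):

```python
import numpy as np

def savgol_smooth(y, window=7, order=2):
    """Minimal Savitzky-Golay-style smoother: fit a polynomial of the given
    order inside a sliding window and take its value at the window centre."""
    y = np.asarray(y, dtype=float)
    half = window // 2
    padded = np.pad(y, half, mode="edge")  # repeat edge values at the ends
    out = np.empty_like(y)
    x = np.arange(window)
    for i in range(len(y)):
        coeffs = np.polyfit(x, padded[i:i + window], order)
        out[i] = np.polyval(coeffs, half)
    return out

# Smooth a noisy daily NDVI series (values are illustrative)
ndvi = np.array([0.42, 0.45, 0.40, 0.48, 0.50, 0.47, 0.53, 0.55, 0.52, 0.58])
smooth = savgol_smooth(ndvi)
```

In production one would typically use `scipy.signal.savgol_filter`, which implements the same idea with precomputed convolution coefficients.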

Table 2 lists the 17 selected VIs, including the Renormalized Difference Vegetation Index (RDVI), Optimized Soil-Adjusted Vegetation Index (OSAVI), Modified Soil-Adjusted Vegetation Index (MSAVI), Chlorophyll Index (CI = R750/R710) [63], Inverted Red-Edge Chlorophyll Index (IRECI), Sentinel-2 Red Edge Position (S2REP), Transformed Chlorophyll Absorption Ratio Index (TCARI) and the ratio TCARI/OSAVI [60], and Plant Senescence Reflectance Index (PSRI = (R678 − R500)/R750) [67].

To connect these features (CH and VIs) with DI, the averages of CH and of the VIs were computed for every plot. A binary mask layer was applied to exclude soil/shadow and weeds from the background components. Figure 3 shows the workflow, including data preprocessing, feature extraction, machine learning modeling, and analysis.


Modeling Methods
ML methods have been efficiently applied in remote sensing studies because they have the potential to monitor crops and estimate vegetation parameters and crop yield using spectral information and canopy structure derived from satellite and UAV data; the machine learning regression algorithms used here are PLSR, RFR, SVR, and extreme learning regression (ELR). Several studies have indicated that these methods are useful for addressing the nonlinearity of remote sensing datasets. PLSR is a popular method in crop monitoring because of its ability to reduce loss of the information contained in the input variables; it is similar to Principal Component Regression (PCR) in that it uses a statistical rotation [68]. Random forest (RF) grows "trees" using the Classification and Regression Trees (CART) methodology; RFR is the regression version of RF, and the difference between random forests for regression and for classification is that the former's predictor and output values are numerical [69]. SVR is an important branch of the support vector machine (SVM); its basic concept is to transform the original input features into a new hyperspace using kernel functions [70]. ELR is the implementation of ELM for regression; the classic ELM is a simple and fast learning algorithm for the Single-hidden-Layer Feedforward Neural Network (SLFN) that can be easily implemented and decreases training error. In this study, an extension of the classic ELM was used that has a different activation function and has been applied to crop monitoring in previous studies [37]. The activation function of this new ELM, proposed by Maimaitiyiming [33] and called TanhRe, is a combination of two frequently used functions, the rectified linear unit (ReLU) and the hyperbolic tangent (Tanh), with the goal of better fitting input patterns.
TanhRe takes the following form: if x > 0, the nonlinear activation f (x) = x; if x ≤ 0, f (x) = c·tanh(x), where the reasonable range of the constant c is from 0 to 1.
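A direct numpy rendering of TanhRe as defined above (a sketch; the default c = 0.5 is an arbitrary choice within the stated 0-1 range, not a value from the study):

```python
import numpy as np

def tanhre(x, c=0.5):
    """TanhRe activation: identity for x > 0, scaled tanh for x <= 0 (0 < c < 1)."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0, x, c * np.tanh(x))

# Positive inputs pass through unchanged; negative inputs are squashed by c*tanh
y_pos = tanhre(2.0)   # identity branch
y_neg = tanhre(-1.0)  # 0.5 * tanh(-1)
```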
One hundred vineyard samples (70% of the total) including VIs (ARVI, OSAVI, and GNDVI), VFC, CH, and DI were used as input features to train the models; the remaining samples were used as validation samples to assess the performance and reliability of the prediction models. In implementing the ML methods, grid search was used to determine the number of principal components for PLSR, the parameter C, and the coefficient c for ELR. For the SVR method, a polynomial kernel was used, and the parameter C was determined by tuning. For the RFR method, the number of trees was 400. To assess the performance of these methods, the coefficient of determination (R²), the root mean square error (RMSE), and the coefficient of variance of the RMSE (CV-RMSE) were calculated [71]:

R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)²

RMSE = √( Σ(y_i − ŷ_i)² / n )

CV-RMSE = RMSE / ȳ × 100%

where y_i and ŷ_i are the measured and predicted parameters, respectively, ȳ is the mean of the measured parameters, and n is the number of samples. These machine learning methods and accuracy assessments were implemented with the sklearn package in Python.
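The three accuracy metrics can be written as a small numpy helper (a sketch, not the study's code; the measured/predicted DI values in the example are invented):

```python
import numpy as np

def regression_scores(y_true, y_pred):
    """R^2, RMSE, and CV-RMSE as defined in the text."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    resid = y_true - y_pred
    rmse = np.sqrt(np.mean(resid ** 2))                              # root mean square error
    r2 = 1.0 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    cv_rmse = rmse / y_true.mean() * 100.0                           # RMSE relative to the mean, %
    return r2, rmse, cv_rmse

# Example with hypothetical measured vs. predicted DI values (%)
r2, rmse, cv = regression_scores([10, 20, 30, 40], [12, 18, 29, 43])
```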

Assessment of Sampling

Validation of Canopy Height
CH derived from UAV imagery was validated by comparison with field samples in 22 vineyards, yielding an R² of 0.82 and an RMSE of 0.047 m, as shown in Figure 4. The slight error can usually be attributed to the ground sample distance (GSD), the corresponding DSM calculation, and deviation of the edge pixels [72].


Assessment of the Relevance of UAV Data to Sentinel-2 Data
The next step was to validate the ability of UAV high-spatial-resolution data to serve as an alternative to field samples for monitoring vineyards. Figure 5 shows very high-resolution images taken on 16 May 2019 for two health conditions, healthy and medium pest incidence, together with the corresponding Sentinel-2 images. Through visual interpretation, the difference between the two conditions can be clearly observed from a bird's-eye and a global perspective, and the contrast between the appearance of the high-resolution UAV image and the Sentinel-2 image is obvious. VIs were acquired from UAV and Sentinel-2 data over the 144 vineyards, and Figure 6 shows a strong correlation between the two (R² = 0.85, p < 0.001 for NDVI and R² = 0.71, p < 0.001 for OSAVI). NDVI slightly outperformed OSAVI. Compared with NDVI, OSAVI is more sensitive to vegetation and more susceptible to changes in atmospheric correction results and observation angle, which also explains the slightly poorer correlation between UAV-derived and satellite-derived OSAVI [73].

Correlation between DI and VIs
In addition, the potential of UAV sampling data to monitor vineyards was assessed on two scales: the relationship between temporal DI trends and the rate of change in vegetation indices was examined, and DI was related to VIs in 2019. All DI and DS were derived from images acquired by Parrot Sequoia+ in July 2018 and May 2019 from the surveyed vineyards (Figures 7 and 8). The correlation between DS and DI was significant (r 2 =0.728, p<0.001), as well as that between temporal change rates ∆DS and ∆DI (r 2 =0.489, p<0.001). Vineyards affected by pests show different rates of change in incidence and severity, and vineyards that had already been affected lightly in 2018 showed an obvious increase in 2019 (e.g., A66 and A102).

The x-axis in Figures 7 and 8 indicates the sampled vineyards, which were labeled A1 to A148; for lack of space, not all labels are shown.

Figure 9 shows that 12 of the 17 VIs correlated significantly with both DI and DS. Among the VIs, ARVI, OSAVI, and GNDVI produced higher correlations and coefficients of determination (R²) with ∆DI than the others (R² = 0.44 for ARVI, 0.42 for OSAVI, and 0.43 for GNDVI). These moderate R² values show that fitting ∆DI with the temporal rate of change of a VI is not very accurate. It can be speculated that pest and disease outbreaks driven by early-spring climate variation are only weakly related to the temporal change in vegetation condition; this speculation needs further confirmation. Notably, the results showed that ∆DI decreased as the increase in VI became smaller (Figure 10). In addition, the VIs and DI in 2019 were correlated for every plot in Figure 10 and showed a strong relationship, and three coefficients of determination (R² = 0.55 for ARVI, 0.57 for OSAVI, and 0.51 for GNDVI) outperformed the others, as shown in Figure 11.
Machine Learning Modeling
Table 3 shows the relationship between the training and reference samples, presenting the mean, median, minimum, maximum, coefficient of variation (CV), kurtosis, and skewness of the samples. ML methods including PLSR, SVR, RFR, and ELR were used to predict DI from the UAV-derived canopy structure, the satellite-based VIs, and the combination of the two. The performance of the UAV-derived canopy structure alone was not ideal, with R² ranging from 0.305 to 0.432 and CV-RMSE from 0.163 to 0.147. Satellite-based VIs performed better than the UAV-derived information regardless of the regression model used, with R² ranging from 0.682 to 0.728 and CV-RMSE from 0.11 to 0.10. The combination of UAV-derived canopy structure and satellite-based VIs outperformed both, with R² ranging from 0.69 to 0.736 and CV-RMSE from 0.109 to 0.10. Table 4 provides details of the validation metrics for the DI2019 estimation. In addition, canopy coverage extracted with different methods was combined with the spectral information separately. UAV-derived VFC yielded superior performance to UAV-derived CC, with R² ranging from 0.69 to 0.736 and CV-RMSE from 0.109 to 0.10, whereas CC features performed more poorly, with R² varying from 0.68 to 0.716 and CV-RMSE from 0.11 to 0.104. Although the margin was small, VFC features outperformed CC features regardless of the regression model. In addition, Figure 12 compares the DI2018 and DI2019 predictions by PLSR, SVR, RFR, and ELR; R² and CV-RMSE are reported to show the accuracy of each model. Taken as a whole, the performance of SVR was superior to that of the other models in this study, with higher R² and lower CV-RMSE.

Overall Potential of UAV Data as Alternative to Field Sampling
This study has shown that UAV data have the potential to replace field sampling in vineyards affected by pests and diseases. Since leaf damage and branch wilt are the most direct manifestations of pests and diseases on individual plants, high-resolution UAV RGB images can clearly capture dying canopy leaves and branches and thus be used to evaluate the presence of grape diseases. It is therefore reasonable to distinguish onset from incidence by visual interpretation. There was a strong correlation between the incidence rate and the vegetation indices, with incidence decreasing as the vegetation index increased. In addition, DI calculated from UAV image interpretation can be predicted over large areas by machine learning methods.
It is worth noting that high-spatial-resolution UAV data can replace field data as a more economical and convenient way to acquire canopy information over large vineyard areas. Specifically, the Parrot Sequoia+ agricultural camera integrates four single-band spectral sensors, which mitigates the robustness problems caused by parallax during the rectification of image pairs; this sensor is therefore a good choice for canopy structure extraction.
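As a sketch of the kind of canopy-structure extraction this enables, VFC can be estimated as the fraction of pixels classified as canopy in a UAV raster. The NDVI-threshold rule and the 0.4 cutoff below are illustrative assumptions, not the paper's calibrated method, and the tile is synthetic:

```python
import numpy as np

def vegetation_fraction_cover(ndvi_map, threshold=0.4):
    """VFC: share of pixels classified as canopy by an NDVI threshold.
    The 0.4 threshold is illustrative, not a calibrated value."""
    ndvi_map = np.asarray(ndvi_map)
    return float(np.count_nonzero(ndvi_map > threshold)) / ndvi_map.size

# Tiny synthetic NDVI tile: vine rows (high NDVI) over bare soil (low NDVI).
tile = np.array([
    [0.1, 0.7, 0.8, 0.1],
    [0.2, 0.6, 0.7, 0.2],
    [0.1, 0.1, 0.2, 0.1],
])
vfc = vegetation_fraction_cover(tile)  # 4 of 12 pixels exceed the threshold
print(vfc)
```

In practice the threshold (or a classifier replacing it) must be tuned per sensor and growth stage, which is one reason understory vegetation complicates VFC estimation, as discussed below.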

Comparison of Machine Learning Models
As evidenced by high R² and low CV-RMSE, SVR produced the most accurate models for DI prediction. The SVR model has been reported in previous studies to handle high-dimensional datasets well and to resist overfitting [74]. In second place was ELR, which has shown a strong ability to identify plant traits and predict yield, especially for vineyards, using a new activation function combining Tanh and the ReLU function [33]; however, its performance was slightly inferior to SVR for DI estimation. RFR and PLSR were slightly inferior to the other models. RFR has been comparable to SVR in most previous studies because both can handle high data dimensionality, a strength that was also apparent in this study [75]. Although RFR has better noise tolerance, SVR has provided higher accuracies than RFR when using Sentinel-2 imagery in several studies [76]. PLSR is limited in dealing with nonlinear relationships between target and features, which has also been demonstrated elsewhere [77].

Contributions of Different Types of Features Extracted from Multiplicity Sensors
SVR yielded the best performance for predicting DI compared with the other ML models (Figure 9). Therefore, it was used to generate a prediction of DI2018 based on various input features, including UAV-derived information, satellite-based VIs, and the fusion of UAV and satellite-based features.
It has already been shown that UAV-derived canopy information can improve the accuracy of DI prediction [78] (Table 5 and Figure 13). Canopy features reflect canopy growth status and supplement satellite-based spectral information. Specifically, the contribution of VFC is twofold: DI increases as VFC decreases, and adding VFC as a supplement can reduce the noise that background soil reflectance introduces into satellite-based spectral information in crop monitoring. Among the spectral variables, the most important VIs included ARVI, OSAVI, and the red-edge bands, which have also been found to be important in other studies on crop monitoring [79]. Furthermore, these variables have been reported as predictors because they are related to changes in DI and to the growth status of canopies affected by pests and diseases.
Notably, the test results showed that the SVR model using the UAV-based canopy structure performed worse for DI2019 than for DI2018. In addition, spectral information slightly outperformed canopy structure when a single feature type was used. This may have been due to the relatively sparse canopy in areas of high incidence and severity in 2019. Understory crops beneath a sparse canopy interfere with vegetation coverage estimates; especially for pergola-trained crops such as grapes, there are many understory crops under the canopy, which makes it difficult to distinguish the canopy from the background [22]. In addition, when grapevines are seriously damaged, not only do branches and leaves die and understory vegetation grow, but some physiological parameters of the canopy also change, making CH unstable. The unstable performance of CH at different growth stages in crop monitoring has also been reported by Näsi et al. [79,80]. This may explain the weaker correlation between UAV-derived features and DI in 2019 than in 2018, when the grapes were growing well. The error caused by understory crops and the unstable performance of CH reduced the performance of the high-resolution UAV data in 2019. By fusing UAV and satellite data, the canopy structure can reduce the influence of soil reflectance and improve performance [22,79]. However, experiments are still needed on how to accurately remove understory vegetation when calculating canopy coverage, and further investigation should examine the potential of canopy structure features for crop monitoring at different development stages, over different crop species, and in different environments.
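The CH feature discussed above is typically derived as the per-pixel difference between a photogrammetric digital surface model (DSM) and a digital terrain model (DTM). A minimal numpy sketch with synthetic rasters (the heights and the 0.2 m canopy cutoff are hypothetical values for illustration):

```python
import numpy as np

# Synthetic 3x4 elevation rasters in metres: the DSM includes the vine
# canopy, while the DTM represents bare ground under it.
dsm = np.array([
    [101.0, 102.1, 102.3, 101.0],
    [101.1, 102.0, 102.4, 101.15],
    [101.0, 101.1, 101.0, 101.1],
])
dtm = np.full_like(dsm, 101.0)

# Canopy height model: surface minus terrain, clipped at zero to
# suppress small negative photogrammetric artefacts.
chm = np.clip(dsm - dtm, 0.0, None)

# Mean height over pixels treated as canopy (above a 0.2 m cutoff).
mean_ch = float(chm[chm > 0.2].mean())
print(round(mean_ch, 3))
```

Because the DTM under a dense pergola canopy must itself be interpolated, and understory vegetation inflates the apparent surface, this CH estimate inherits exactly the instabilities the text describes.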

Conclusions
This study has demonstrated the potential of high-resolution UAV data acquired by the Parrot Sequoia+ to replace manual field samples and to supplement satellite-based crop monitoring. With its ability to provide high-resolution canopy features and multispectral information, the Parrot Sequoia+ camera showed itself to be a promising tool for replacing field samples. It has also been shown that fusing UAV-derived canopy information with essential Sentinel-2-derived VIs can improve the accuracy of DI estimation using machine learning models; among these models, SVR outperformed the others in DI prediction. Additionally, to improve the availability of canopy structure information, it may be feasible to extract accurate canopy structure through 3D transfer models or vegetation biophysical variables to reduce the errors caused by understory vegetation when monitoring grapevines, and tree heights under different growth states can be studied to improve monitoring accuracy.