Experimental Evaluation and Consistency Comparison of UAV Multispectral Minisensors

Abstract: In recent years, the use of unmanned aerial vehicles (UAVs) has received increasing attention in remote sensing, vegetation monitoring, vegetation index (VI) mapping, precision agriculture, etc. It has many advantages, such as high spatial resolution, instant information acquisition, convenient operation, high maneuverability, freedom from cloud interference, and low cost. Nowadays, different types of UAV-based multispectral minisensors are used to obtain either surface reflectance or digital number (DN) values. Both the reflectance and DN values can be used to calculate VIs. The consistency and accuracy of spectral data and VIs obtained from these sensors have important application value. In this research, we analyzed the earth observation capabilities of the Parrot Sequoia (Sequoia) and DJI Phantom 4 Multispectral (P4M) sensors using different combinations of correlation coefficients and accuracy assessments. The research method was mainly focused on three aspects: (1) consistency of spectral values, (2) consistency of VI products, and (3) accuracy of normalized difference vegetation index (NDVI). UAV images in different resolutions were collected using these sensors, and ground points with reflectance values were recorded using an Analytical Spectral Devices handheld spectroradiometer (ASD). The average spectral values and VIs of those sensors were compared using different regions of interest (ROIs). Similarly, the NDVI products of those sensors were compared with ground point NDVI (ASD-NDVI). The results show that Sequoia and P4M are highly correlated in the green, red, red edge, and near-infrared bands (correlation coefficient (R²) > 0.90). The results also show that Sequoia and P4M are highly correlated in different VIs; among them, NDVI has the highest correlation (R² > 0.98). In comparison with ground point NDVI (ASD-NDVI), the NDVI products obtained by both of these sensors have good accuracy (Sequoia: root-mean-square error (RMSE) < 0.07; P4M: RMSE < 0.09). This shows that the performance of different sensors can be evaluated from the consistency of spectral values, consistency of VI products, and accuracy of VIs. It is also shown that different UAV multispectral minisensors can have similar performances even though they have different spectral response functions. The findings of this study could be a good framework for analyzing the interoperability of different sensors for vegetation change analysis.


Introduction
An unmanned aerial vehicle (UAV) is an unmanned aircraft operated by radio remote control equipment and self-provided program control device [1]. The combination of a UAV and a remote [...]
Remote Sens. 2020, 12, x FOR PEER REVIEW 3 of 19
UHD185 Firefly and Rikola hyperspectral camera (RHC) to introduce their performance in precision agriculture. The results showed that they both worked well, and the flight campaigns successfully delivered hyperspectral data. Nebiker et al. [41] compared three sensors (Canon s110 NIR, multi-SEPC 4C Prototype, and multi-SEPC 4C commercial) to investigate their characteristics and performance in agronomical research. The investigations showed that the SEPC 4C (multi-SEPC 4C Prototype and multi-SEPC 4C commercial) matched very well with ground-based field spectrometer measurements, while the Canon s110 NIR showed significant biases. Deng et al. [42] systematically compared the vegetation observation capabilities of MCA and Sequoia based on reflectance and VI. It was found that the reflectance of the MCA camera had higher accuracy in the near-infrared band, and the reflectance accuracy of the Sequoia camera was more stable in each band. The MCA camera can obtain an NDVI product with a higher accuracy after using a more precise nonlinear calibration method.
In recent years, UAV minisensors have begun to show an end-to-end (user to product) development trend, which simplifies the data processing and VI calculation and thus improves the user experience. It is necessary to further explore the performance of different UAV multispectral minisensors based on previous research. To address this, the main objective of this paper was to experimentally evaluate different UAV multispectral minisensors and compare them in terms of consistency. To meet this objective, we focused on (1) analyzing the consistency of spectral values, (2) analyzing the consistency of VI products, and (3) assessing the accuracy of NDVI products between two sensors. This research will suggest whether vegetation observations from different sensors complement each other, thereby further broadening their application in different fields.

Study Area
The study area is located in Fangshan District, Beijing, China (39°33′34.93″ N, 115°47′40.97″ E), which has a warm temperate humid monsoon climate (Figure 1). It covers an area of 0.03 km² with flat terrain and an average altitude of 95 m. The annual average temperature is 11.6 °C, and the annual average precipitation is 602.5 mm. There are a variety of surface types in the area, mainly grassland.

Multispectral Sensors and UAV Platforms
Two types of multispectral sensors were compared in this experiment. The Sequoia camera [43] has a total of five imaging sensors, including four multispectral sensors and one RGB sensor (Figure 2). The spectral response function of Sequoia is shown as the solid line in Figure 3, which was provided by the manufacturer. The focal length of the Sequoia camera is 3.98 mm, the image size is 1280 × 960 pixels, and the sensor size is 4.8 mm × 3.6 mm. It is equipped with a sunshine sensor that can record the illumination information of each image, facilitating the calibration of multispectral images. The self-provided calibration panel can be used for radiometric calibration, and the reflectance data can be obtained directly.
The other multispectral sensor considered in the study is the P4M (Figure 2). The P4M camera [44] has a total of six imaging sensors, including five multispectral sensors and one RGB sensor. The spectral response function of P4M is shown as the dashed line in Figure 3, which was provided by the manufacturer. The focal length of the P4M camera is 5.74 mm, the image size is 1600 × 1300 pixels, and the sensor size is 4.87 mm × 3.96 mm. The P4M camera is also equipped with a sunshine sensor, but the reflectance data cannot be obtained directly. Table 1 shows the band information for Sequoia and P4M.
Table 1. Band information for Sequoia and P4M (center wavelength and bandwidth in nm).
Sequoia band    Center  Width    P4M band        Center  Width
-               -       -        blue            450     32
green           550     40       green           560     32
red             660     40       red             650     32
red edge        735     10       red edge        730     32
near-infrared   790     40       near-infrared   840     52
During the data collection, the sensors were carried on different UAV platforms. The Sequoia was mounted on a hexarotor UAV called EM6-800 which has the advantages of low cost and high stability. Its payload is 800 g, and the maximum flight time is 40 min; under the maximum load of 1.2 kg, its maximum flight time is 25 min [45]. The UAV is equipped with an onboard flight controller, which includes a compass; an inertial unit; and gyroscopic, barometric, and global positioning system sensors. The onboard flight controller can be used to control the flight missions through flight route and measurement point settings. It can also record and access the data obtained by the mounted sensors for postprocessing. Unlike the Sequoia, the P4M has its own UAV platform, so it can complete the data collection task independently without the help of other aircraft. It has a takeoff weight of 1487 g, and the average flight time is 27 min.

Sequoia and P4M Data
The UAV flights were conducted under sunny, cloudless conditions from 11:00 to 13:00 on 22 August 2019. During data collection, the Sequoia sensor was mounted on the EM6-800 hexarotor UAV, while the P4M was mounted on its own aircraft. For the Sequoia, a calibration target provided by the manufacturer was recorded to perform radiometric calibration in postprocessing.
In this experiment, the two sensors acquired a total of three sets of image data. The Sequoia images were collected while flying at 56 m height with 5 cm resolution and at 100 m height with 10 cm resolution. The P4M images were collected while flying at 100 m height with 5 cm resolution. All of these flights were started within a one-hour period (11:27 to 12:22) to maintain similar illumination among the sets of images. The images were acquired with 80% overlap. Sequoia acquired a total of 1715 images and P4M acquired 960 images. The specific parameter settings for the sensors are shown in Table 2.
Table 2. Parameters of sensors used for comparison.
During the data collection process, the P4M camera failed to acquire image data with a resolution of 10 cm. The 10 cm image data used in the comparative experiment were obtained by resampling the 5 cm image.
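The nominal 5 and 10 cm resolutions can be roughly cross-checked against the camera geometry reported above. The following is a minimal sketch, assuming an ideal pinhole camera pointing straight down over flat terrain; the function name `ground_sample_distance` is illustrative and not part of any processing software used in the study.

```python
def ground_sample_distance(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Approximate nadir ground sample distance (metres per pixel).

    Assumes a pinhole camera and flat terrain:
    GSD = flight height * pixel pitch / focal length.
    """
    pixel_pitch_mm = sensor_width_mm / image_width_px
    return altitude_m * pixel_pitch_mm / focal_length_mm


# Camera parameters as stated in the text (focal length, sensor width, image width).
sequoia_56m = ground_sample_distance(56, 3.98, 4.8, 1280)
sequoia_100m = ground_sample_distance(100, 3.98, 4.8, 1280)
p4m_100m = ground_sample_distance(100, 5.74, 4.87, 1600)

for label, gsd in [("Sequoia, 56 m", sequoia_56m),
                   ("Sequoia, 100 m", sequoia_100m),
                   ("P4M, 100 m", p4m_100m)]:
    print(f"{label}: {gsd * 100:.1f} cm per pixel")
```

Under these assumptions the three flights come out at roughly 5.3, 9.4, and 5.3 cm per pixel, which is in line with the nominal 5 and 10 cm resolutions.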

ASD Data
We used the FieldSpec HandHeld 2 field spectroradiometer produced by Analytical Spectral Devices to measure the ground object spectral data. The coordinates and photos of the measured points were collected for ground object identification and visual interpretation of images. The spectroradiometer can perform continuous spectrum measurement in the wavelength range of 325-1075 nm, with a spectral resolution of <3.0 nm at 700 nm, a wavelength accuracy of ±1 nm, and a field angle of 25°. It can measure the reflection, transmission, radiance, or irradiance in real time and obtain the continuous spectral curve of the measured object.
The spectrum measurement was carried out in sunny and cloudless weather between 11:00 and 13:00. Ground observation was carried out synchronously with the UAV flight so that the solar elevation angle, zenith angle, and weather conditions of the ground spectral measurements were consistent with those of the UAV data. During data collection, the surveyors wore black clothing to absorb sunlight and reduce spectral interference. The measurement was carried out under natural light conditions, the spectroradiometer was held vertically downward at 1 m above the ground, and the sensor covered about 0.68 m² of ground area. To improve the accuracy of measurement, each ground point was measured ten times, and the average value was taken. The spectroradiometer was calibrated every 10 min to reduce the interference of weather change on the spectrum measurement.
Considering the small area, similar vegetation species, and relatively uniform ground surface in the study area, a total of eight ground points were selected (Figure 1). These points were selected randomly. Finally, according to Equation (1), the radiance of the ground object was converted into reflectance using the calibration coefficient provided by the reference plate.
where R_i is the reflectance of band i (i = 1, 2, 3, 4), λ_imax and λ_imin are the maximum and minimum wavelengths of band i, C_λ is the transmittance at wavelength λ, and R_λ is the reflectance at wavelength λ.
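One common form that is consistent with these symbol definitions is a response-weighted average of the continuous spectrum over each band, R_i = ∫ C_λ R_λ dλ / ∫ C_λ dλ, with both integrals taken from λ_imin to λ_imax. The sketch below illustrates this form only; it is an assumption based on the definitions above, and the boxcar response used in the example is hypothetical (in practice the manufacturer curves in Figure 3 would be used).

```python
import numpy as np


def _trapezoid(y, x):
    """Trapezoidal integration of samples y over coordinates x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def band_reflectance(wavelength_nm, reflectance, response, lam_min, lam_max):
    """Response-weighted mean reflectance of one band.

    wavelength_nm, reflectance and response are equally long 1-D sequences;
    response plays the role of C_lambda and reflectance that of R_lambda.
    """
    wavelength_nm = np.asarray(wavelength_nm, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    response = np.asarray(response, dtype=float)

    in_band = (wavelength_nm >= lam_min) & (wavelength_nm <= lam_max)
    lam = wavelength_nm[in_band]
    weights = response[in_band]

    numerator = _trapezoid(weights * reflectance[in_band], lam)
    denominator = _trapezoid(weights, lam)
    return numerator / denominator


# Illustrative input: a 1 nm grid over the spectroradiometer range and a
# hypothetical boxcar response for a red band centred at 660 nm, 40 nm wide.
wavelengths = np.arange(325.0, 1076.0, 1.0)
boxcar = np.ones_like(wavelengths)
flat_spectrum = np.full_like(wavelengths, 0.3)
sloped_spectrum = wavelengths / 1000.0

flat_band = band_reflectance(wavelengths, flat_spectrum, boxcar, 640, 680)
sloped_band = band_reflectance(wavelengths, sloped_spectrum, boxcar, 640, 680)
print(flat_band, sloped_band)
```

A flat spectrum returns its own value and a linear spectrum returns its value at the band center, which is a quick sanity check on the weighting.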
The reflectance data can be used to calculate the true NDVI of the ground point (ASD-NDVI). The area of each ground point is about 0.68 m². Therefore, when comparing the NDVI obtained by the two sensors with the ASD-NDVI, the average NDVI of 272 pixels was taken from the 5 cm resolution image, and the average NDVI of 68 pixels was taken from the 10 cm resolution image.
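The pixel counts follow from dividing the ground point area by the area of one pixel. A short sketch of this arithmetic, with an illustrative function name:

```python
def pixels_in_footprint(footprint_area_m2, pixel_size_m):
    """Number of square pixels needed to cover a ground footprint."""
    return round(footprint_area_m2 / pixel_size_m ** 2)


pixels_5cm = pixels_in_footprint(0.68, 0.05)   # 0.68 m² / 0.0025 m²
pixels_10cm = pixels_in_footprint(0.68, 0.10)  # 0.68 m² / 0.01 m²
print(pixels_5cm, pixels_10cm)
```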

GCP
Five ground control points (GCPs), marked with printed white crosses, were evenly distributed across the field to ensure the overlap between the Sequoia and P4M imagery acquired at different times (Figure 1). The GCP and ASD coordinates were measured with 0.025 m horizontal accuracy and 0.035 m vertical accuracy. A geodetic dual-frequency global navigation satellite system (GNSS) receiver was used in a rapid-static manner (approximately 4 min for each measurement) using the relative positioning approach from a master station located at a point with known coordinates.

Image Resampling
To standardize the spatial resolution of images acquired by different sensors, it is necessary to resample the images so that they are suitable for experimental comparison [46]. To avoid results that hold only by chance at a single resolution, and given the practical constraints of the UAV flights, we compared the images of Sequoia and P4M at two spatial resolutions (5 and 10 cm). Therefore, we used ENVI software to resample the P4M images with a spatial resolution of 5 cm to obtain images with a spatial resolution of 10 cm. The pixel aggregate method was adopted in the resampling process [47].
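For an integer resampling factor, pixel aggregation amounts to averaging each block of input pixels that falls inside one output pixel. The sketch below illustrates the idea with NumPy for a 5 cm to 10 cm resampling (factor 2); it is a stand-in for illustration, not the ENVI implementation, and it ignores georeferencing and fractional pixel overlaps.

```python
import numpy as np


def aggregate_pixels(band, factor=2):
    """Downsample a 2-D band by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are trimmed.
    """
    band = np.asarray(band, dtype=float)
    rows = band.shape[0] - band.shape[0] % factor
    cols = band.shape[1] - band.shape[1] % factor
    trimmed = band[:rows, :cols]
    # Split each axis into (block index, position inside block), then
    # average over the two "inside block" axes.
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))


fine = np.arange(16).reshape(4, 4)       # stand-in for a 5 cm band
coarse = aggregate_pixels(fine, factor=2)  # stand-in for the 10 cm band
print(coarse)
```

Because every output pixel is a plain mean of its inputs, the overall mean of the band is preserved when the image dimensions are multiples of the factor.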

Image Preprocessing
For preprocessing, Sequoia and P4M images were imported into Pix4D mapper [48] and DJI Terra software, respectively. The initial processing included point cloud processing, 3D model construction, feature extraction, feature construction, and orthophoto generation. As the Sequoia images directly yield the reflectance data of the study area after processing, the VIs were calculated from the reflectance data using the VI equations. As the P4M images directly yield the VIs, there was no need to obtain reflectance data for these images. Then, the processed images were imported into ENVI software to clip, match, and select different ROIs in a single band for comparison.
Figure 5 shows the processed 5 cm spatial resolution images. There is a slight difference between panels a and b. For example, the building in the bottom left corner appears white in the P4M image but red in Sequoia, the stones on the right are white in P4M but red in Sequoia, and some roads which are white in P4M are yellow in Sequoia. These differences may be caused by the saturation of the red band in the Sequoia sensor. There are also some small differences between panels c and d. The Sequoia-derived NDVI (Sequoia-NDVI) is greater than the P4M-derived NDVI (P4M-NDVI); the range of Sequoia-NDVI is −0.19 to 0.93, and the range of P4M-NDVI is −0.43 to 0.85. There are some visible errors in Sequoia-NDVI: some buildings in the bottom left corner have an NDVI of about 0.7 (yellow part), but buildings do not have such a large NDVI value in reality.

Figure 6 shows the processed 10 cm spatial resolution images of Sequoia. Compared with the 5 cm spatial resolution results, both the false color RGB images (Figure 5a) and NDVI products (Figure 5c) are different. In the 10 cm resolution RGB image, the red part of the building in the bottom left corner still exists, but it is significantly smaller than in the 5 cm resolution image; the stones on the right are shown in white instead of red, and the road is shown in white instead of yellow. The problem of red band saturation does not seem to be obvious in 10 cm resolution images. Similarly, the NDVI value of the building in the bottom left corner seems normal.

ROI Selection
For the comparison of different sensors, the most common method is to compare the spectrum information or VI of each corresponding band by statistical regression [49]. In this experiment, four commonly used VIs were compared between Sequoia and P4M. In the spectrum information comparison, the reflectance of Sequoia cannot be directly compared with the DN value of P4M, so we used linear regression to characterize the spectral difference between these two sensors. Additionally, eight ground points were measured by ASD, but the number of points was too limited to establish a fitting relationship between the ASD and the sensors. Therefore, we compared the NDVI between ASD and each sensor point by point.
For this experiment, we selected ROIs at the same locations in the images of the two sensors and then compared the average value in each ROI [50]. The images of the study area were divided evenly into a 10 × 8 grid, and one ROI was selected from each grid cell. A total of 80 homogeneous ROIs (including vegetation and nonvegetation) were selected in the experiment. The selected ROIs were on flat terrain, properly sized, and homogeneous (no other object features were included) [51]. The relation function between Sequoia and P4M was fitted using ordinary least squares (OLS) regression. The goodness of fit was measured by the correlation coefficient (written as R²) [52]. Root-mean-square error (RMSE) was used to measure the deviation between the Sequoia-NDVI or P4M-NDVI and the ASD-NDVI, as shown in Equation (2).
RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}    (2)
where y_i is the ASD-NDVI, \hat{y}_i is the average Sequoia-NDVI or P4M-NDVI, and n is the total number of ground points (n = 8).
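The ROI comparison and the two statistics can be sketched as follows. This is a minimal illustration under stated assumptions: the ROIs are placed automatically at the center of each grid cell (in the study they were chosen for homogeneity), the two images are synthetic stand-ins rather than Sequoia or P4M data, and all function names are hypothetical.

```python
import numpy as np


def grid_roi_means(image, n_rows=8, n_cols=10, roi_half=2):
    """Mean value of a small square ROI at the centre of each grid cell."""
    image = np.asarray(image, dtype=float)
    height, width = image.shape
    means = []
    for r in range(n_rows):
        for c in range(n_cols):
            cy = int((r + 0.5) * height / n_rows)
            cx = int((c + 0.5) * width / n_cols)
            roi = image[max(cy - roi_half, 0):cy + roi_half + 1,
                        max(cx - roi_half, 0):cx + roi_half + 1]
            means.append(roi.mean())
    return np.array(means)


def ols_fit(x, y):
    """Ordinary least squares line y = slope * x + intercept, with R²."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    r2 = 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)
    return float(slope), float(intercept), float(r2)


def rmse(reference, estimate):
    """Root-mean-square error between reference and estimated values."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((reference - estimate) ** 2)))


# Synthetic stand-ins for two co-registered single-band images.
rng = np.random.default_rng(0)
image_a = rng.random((80, 100))
image_b = 0.9 * image_a + 0.05

means_a = grid_roi_means(image_a)
means_b = grid_roi_means(image_b)
slope, intercept, r2 = ols_fit(means_a, means_b)
example_rmse = rmse([0.5, 0.7], [0.6, 0.6])
print(len(means_a), slope, intercept, r2, example_rmse)
```

With real data, `means_a` and `means_b` would hold the 80 ROI averages of the two sensors for one band or VI, and `rmse` would be applied to the eight ASD-NDVI values and the matching sensor NDVI averages.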

VI Selection
VIs can reflect the growth status of vegetation. Different VIs may differ in how they reflect vegetation characteristics [53]. In vegetation studies, among the many existing VIs, NDVI, green normalized difference vegetation index (GNDVI), optimized soil-adjusted vegetation index (OSAVI), and leaf chlorophyll index (LCI) are commonly used. These four VIs were compared in this experiment, as shown in Table 3. NDVI is currently the most widely used VI in the world. In agriculture, NDVI is one of the most important tools for crop yield estimation, biomass estimation, and so on [54]. Using the unique response characteristics of vegetation in the near-infrared band, NDVI combines the spectral values of the red band and near-infrared band to quantitatively describe the vegetation coverage in the study area. Compared with NDVI, GNDVI is more sensitive to the change in vegetation chlorophyll content [55]. It combines the spectral values of the green band and the near-infrared band. OSAVI can reduce the interference of soil and vegetation canopy [15]. It also combines the spectral values of the red band and the near-infrared band. LCI is a sensitive indicator of chlorophyll content in leaves and is less affected by scattering from the leaf surface and internal structure variation [56]. It combines the spectral values of the red band, red edge band, and the near-infrared band. The VIs were selected so that their equations involve different bands (Figure 3).
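The four indices can be sketched in code using their commonly cited forms. These forms are an assumption here, since they are not restated in this section: in particular, OSAVI is written with the usual soil adjustment constant of 0.16 and without the optional 1.16 scaling factor that some authors include, so the exact equations should be checked against Table 3.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green normalized difference vegetation index."""
    nir, green = np.asarray(nir, dtype=float), np.asarray(green, dtype=float)
    return (nir - green) / (nir + green)


def osavi(nir, red, soil_factor=0.16):
    """Optimized soil-adjusted vegetation index (commonly cited form)."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + soil_factor)


def lci(nir, red_edge, red):
    """Leaf chlorophyll index (commonly cited form)."""
    nir = np.asarray(nir, dtype=float)
    red_edge = np.asarray(red_edge, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red_edge) / (nir + red)


# Illustrative reflectance values for a single vegetated pixel (not measured data).
green, red, red_edge, nir = 0.1, 0.1, 0.3, 0.5
print(ndvi(nir, red), gndvi(nir, green), osavi(nir, red), lci(nir, red_edge, red))
```

The functions accept scalars or whole band arrays; pixels where a denominator is zero are not handled and would need masking in practice.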

Consistency of Spectral Values
To obtain more reliable experimental results, we compared the images with 5 and 10 cm spatial resolution using scatter plots of the Sequoia and P4M spectral values for the approximately equivalent spectral bands (green, red, red edge, and near-infrared). In the experiment, the spectral reflectance was used for Sequoia and the DN value of the image was used for P4M.
In the first experiment (5 cm spatial resolution), the spectral values of Sequoia and P4M were highly correlated (Figure 7) in all four approximately equivalent bands, and the correlation coefficient of the fitting function was not less than 0.90. The two sensors had the highest correlation in the red band (R² = 0.9709), followed by the green band (R² = 0.9699) and the red edge band (R² = 0.9208); the correlation for the near-infrared band was the lowest (R² = 0.9042). Thus, the spectral values of the two sensors had an excellent correlation in the green and red bands (R² > 0.96) and a somewhat lower, though still high, correlation in the red edge and near-infrared bands (R² between 0.90 and 0.93).
In the second experiment (10 cm spatial resolution), the spectral values of Sequoia and P4M were also well correlated (Figure 8). In all four bands, the correlation coefficient of the fitting equation was not less than 0.91, showing a strong correlation. As in the 5 cm spatial resolution results, the two sensors had the highest correlation in the red band (R² = 0.9793), followed by the green band (R² = 0.9727) and the red edge band (R² = 0.9436); the correlation for the near-infrared band was the lowest (R² = 0.9199). Thus, the spectral values of the two sensors again had a high correlation in the green and red bands (R² > 0.97) and a somewhat lower correlation in the red edge and near-infrared bands (R² between 0.91 and 0.95).
In short, the spectral values of Sequoia and P4M were highly correlated in both the green and red bands (R² > 0.96), but the correlation was slightly lower in the red edge and the near-infrared bands (R² < 0.96). The correlation of spectral values for the two sensors at 10 cm spatial resolution was slightly higher than that at 5 cm. Thus, if both images are to be used together, the 10 cm spatial resolution image may be the better choice. Although these two sensors were highly correlated, there was also a slight difference, which may be caused by a combination of factors, including the difference in spectral response function (Figure 3). Among the compared bands, the center wavelength and the bandwidth of these two sensors were both close in the green and red bands. Although the center wavelengths of the two sensors were close in the red edge band, there was a big difference in the bandwidth. In the near-infrared band, both the center wavelength and the bandwidth of the two sensors were significantly different. This may explain why the spectral values differ between Sequoia and P4M.

Consistency of VI Products
The VI products of Sequoia and P4M were highly correlated (Figure 9). Four VIs were compared in this paper, namely NDVI, GNDVI, OSAVI, and LCI. In Figure 9, the results on the left were obtained with the 5 cm spatial resolution images, and those on the right were obtained with the 10 cm spatial resolution images. The black dotted lines are the 1:1 reference lines, and the solid lines are the fitting functions of these sensor-derived VIs (using OLS regression).
Among the four VIs, NDVI had the highest correlation, followed by OSAVI, GNDVI, and LCI. In the comparison of 5 cm spatial resolution images, the correlation of NDVI was the highest (R² = 0.9863), followed by OSAVI (R² = 0.9859), while GNDVI and LCI were lower (GNDVI: R² = 0.9595; LCI: R² = 0.9516). In the comparison of 10 cm spatial resolution images, the correlation of NDVI was the same as at 5 cm (R² = 0.9863). Sequoia-NDVI was generally higher than P4M-NDVI (most of the scattered points were distributed above the 1:1 line, and only a very small part were located on or below it). The legend in Figure 5 also shows that Sequoia-NDVI was slightly higher than P4M-NDVI. The fitting results of GNDVI and LCI were similar to that of NDVI, with some differences. In both resolutions, Sequoia-GNDVI was higher than P4M-GNDVI (all scattered points were distributed above the 1:1 line), but the points were more dispersed than those of NDVI. The fitting result of OSAVI was different from those of the previous three indices. In both resolutions, Sequoia-OSAVI was only partially higher than P4M-OSAVI (the scattered points were evenly distributed above, below, and on the 1:1 line).
Table 4 shows the Sequoia and P4M VI transformation functions derived by OLS regression of the data shown in Figure 9. The transformation functions for the 5 and 10 cm spatial resolution images are listed separately, where S represents the VI of Sequoia and P represents the VI of P4M. Both sensors had good consistency in those four indices.
Table 4. Sequoia and P4M VI transformation functions derived by OLS regression of the data illustrated in Figure 9.
NDVI, GNDVI, OSAVI, and LCI are computed from different combinations of surface reflectance, so their values are partly determined by the reflectance of the green, red, red edge, and near-infrared bands. There was a certain difference in the spectral response functions of the sensors, which led to slight differences between the VI products.
Although users cannot directly obtain reflectance from P4M images, they can still obtain high-quality VI products. P4M, which integrates aircraft, cameras, and data processing software, thus optimizes the user's experience and improves working efficiency by providing good VI products.
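To make the band combinations behind the four indices concrete, the sketch below computes them from one set of band values. It uses commonly cited definitions; the OSAVI soil-adjustment constant of 0.16 and the LCI form are assumptions here, since the exact formulas used in this study are defined outside this section. The function name and the input values are hypothetical.

```python
def vegetation_indices(green, red, red_edge, nir):
    """Return NDVI, GNDVI, OSAVI, and LCI for one pixel or one ROI mean."""
    return {
        "NDVI": (nir - red) / (nir + red),
        "GNDVI": (nir - green) / (nir + green),
        # Assumed soil-adjustment constant of 0.16.
        "OSAVI": (nir - red) / (nir + red + 0.16),
        # Assumed LCI form: red edge in the numerator, red in the denominator.
        "LCI": (nir - red_edge) / (nir + red),
    }


# Hypothetical band values for a vegetated ROI (illustrative only).
vis = vegetation_indices(green=0.08, red=0.05, red_edge=0.25, nir=0.45)
for name, value in vis.items():
    print(f"{name}: {value:.4f}")
```

Every index uses the near-infrared band, so a deviation in that band propagates to all four products, while the green and red edge bands each affect only one index.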

Accuracy of NDVI
Both Sequoia-NDVI and P4M-NDVI had high accuracy, with not only a small deviation from ASD-NDVI but also a good correlation (Figure 10). Two sets of spatial resolution data (5 and 10 cm) are compared in Figure 10: the left part shows the scatter plot of Sequoia-NDVI against ASD-NDVI, while the right part shows the scatter plot of P4M-NDVI against ASD-NDVI (blue dots correspond to 5 cm resolution and orange triangles correspond to 10 cm resolution). Sequoia-NDVI was highly consistent with ASD-NDVI, and the correlation was high: RMSE = 0.0622 and R² = 0.8523 for the 5 cm spatial resolution images, and RMSE = 0.0684 and R² = 0.8497 for the 10 cm spatial resolution images. Similar to Sequoia, P4M-NDVI was also highly consistent with ASD-NDVI, maintaining a good correlation: RMSE = 0.0886 and R² = 0.8785 for the 5 cm spatial resolution images, and RMSE = 0.0842 and R² = 0.8785 for the 10 cm spatial resolution images. This indicates that both Sequoia and P4M can provide NDVI products with high accuracy. Furthermore, there was no big difference between the VI products obtained from these sensors.
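The accuracy figures above are root-mean-square errors between sensor-derived NDVI and ASD-NDVI at the ground points. A minimal sketch of that calculation is given below; the function name and the eight sample values are hypothetical and do not reproduce the measurements of this study.

```python
from math import sqrt


def rmse(predicted, reference):
    """Root-mean-square error between paired predicted and reference values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    n = len(predicted)
    return sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)


# Hypothetical NDVI values at eight ground points (illustrative only).
asd_ndvi = [0.21, 0.35, 0.48, 0.55, 0.62, 0.70, 0.78, 0.85]
uav_ndvi = [0.25, 0.33, 0.52, 0.60, 0.58, 0.75, 0.80, 0.90]

print(f"RMSE = {rmse(uav_ndvi, asd_ndvi):.4f}")
```

RMSE measures the absolute deviation from the reference, whereas R² measures how well the two series co-vary, which is why both are reported.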

Differences between Sequoia and P4M
Different sensors may have different spectral response functions [59], and such differences cause systematic deviations in the spectral values of the images. The consistency of spectral values between the two sensors studied showed a clear difference in the near-infrared band (Figures 7 and 8). This might be due to their different spectral response functions (Figure 3). In the near-infrared band, the bandwidth of P4M was wider than that of Sequoia (Sequoia: 40 nm; P4M: 52 nm), and the center wavelengths were also different (Sequoia: 790 nm; P4M: 840 nm). The results also showed some difference in the red edge band. Although the center wavelengths of the two sensors were close in the red edge band (Sequoia: 735 nm; P4M: 730 nm), the bandwidths differed considerably (Sequoia: 10 nm; P4M: 32 nm). In contrast, the center wavelength and the bandwidth of the two sensors were both close in the green (Sequoia: 550 nm and 40 nm; P4M: 560 nm and 32 nm) and red bands (Sequoia: 660 nm and 40 nm; P4M: 650 nm and 32 nm). The different spectral response functions may explain the difference in the spectral values between Sequoia and P4M.
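A rough way to see how far the bands of the two sensors coincide is to compare their nominal ranges. The sketch below uses the center wavelengths and widths quoted above and assumes, as a simplification, that each band is a rectangular passband whose quoted width is its full width; real spectral response functions are not rectangular, so the numbers are only indicative. The function names are hypothetical.

```python
def band_range(center_nm, width_nm):
    """Return (lower, upper) band edges, treating the band as rectangular."""
    half = width_nm / 2
    return center_nm - half, center_nm + half


def overlap_nm(band_a, band_b):
    """Width in nm shared by two (center, width) bands; 0 if they are disjoint."""
    lo_a, hi_a = band_range(*band_a)
    lo_b, hi_b = band_range(*band_b)
    return max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))


# (center wavelength, width) in nm, as quoted in the text.
SEQUOIA = {"green": (550, 40), "red": (660, 40), "red_edge": (735, 10), "nir": (790, 40)}
P4M = {"green": (560, 32), "red": (650, 32), "red_edge": (730, 32), "nir": (840, 52)}

for band in SEQUOIA:
    print(band, overlap_nm(SEQUOIA[band], P4M[band]))
```

Under this simplification the green and red bands share 26 nm each, the narrow Sequoia red edge band lies entirely inside the P4M red edge band, and the two near-infrared bands do not overlap at all, which is in line with the ordering of the band correlations.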
In addition, other factors such as the spectral reflection characteristics of the ground object, the nonuniformity of the ground surface, the observation time, and the solar elevation angle can also increase the randomness and uncertainty of this systematic deviation [60][61][62]. In this experiment, the ground surface of the study area was uniform, and the observation time was similar for both sensors, as was the solar elevation angle. Therefore, the difference in spectral values was probably related to the reflection characteristics of the ground target features. The spectral value of a single pixel may be influenced by both the spectral response function and the reflection characteristics of the target feature. Consequently, objects with high reflectance in a specific spectral band may be more affected by the difference in spectral response function.
The acquisition of the VI usually requires a series of conversion processes on the spectral values; thus, if there is a deviation in the spectral values, the VI may also be affected. In the analysis of the consistency of the VIs between the two sensors, the four VI products of Sequoia and P4M were found to be highly correlated, but there were still some differences. These differences may be closely related to the differences in spectral values between the sensors, which, as explained above, are attributed to the spectral response function. Therefore, the difference between VIs may be caused by the spectral response function and the reflection characteristics of the target features. Because the VI is obtained by combining the spectral values of different bands, the way in which the spectral values are combined may also affect the quality of the VI.

Sensitivity of VIs to Spectral Deviation
The calculation of the VI involved spectral values of different spectral bands. NDVI, GNDVI, OSAVI, and LCI were compared in this study, and their calculation included the spectral values of the red, green, red edge, and near-infrared bands. Therefore, small changes in the spectral values of each band may have a relatively large impact on the VI results. In addition, the band combination method may also change the sensitivity of the VI to small changes in the spectral values. The experimental results showed that although the spectral values of Sequoia and P4M were significantly different in the near-infrared band, this difference did not have a significant impact on the VI products. The correlation coefficients of the VI products obtained by these two sensors were greater than 0.95. The NDVI products of the two sensors were also compared with the ASD-NDVI. The results showed that Sequoia-NDVI and P4M-NDVI both have high accuracy. The normalized calculation method of the VI eliminated the influence of the difference in spectral values to a certain extent, thus reducing the sensitivity of the VI to such spectral deviations [63].
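The effect of normalization can be illustrated with a short numerical example. In a normalized difference, a gain common to both bands cancels exactly, and a gain in one band only is damped rather than passed on in full. The sketch below uses hypothetical reflectance values and a hypothetical 10% gain; it illustrates the arithmetic and is not an analysis of the data in this study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


nir, red = 0.45, 0.05  # hypothetical reflectances
base = ndvi(nir, red)

# A common gain applied to both bands cancels in the ratio.
scaled = ndvi(1.10 * nir, 1.10 * red)

# A gain applied to the near-infrared band only does not cancel,
# but here the relative change in NDVI is smaller than the 10% band change.
nir_only = ndvi(1.10 * nir, red)

print(f"base = {base:.4f}, common gain = {scaled:.4f}, NIR gain only = {nir_only:.4f}")
print(f"relative NDVI change (NIR gain only) = {(nir_only - base) / base:.2%}")
```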
Poncet et al. [64] found that the error of VIs was correlated with different radiometric calibration methods. In this experiment, for Sequoia, we used a calibration target provided by the manufacturer to perform radiometric calibration in postprocessing. The calibration method might have affected the reflectance, which would have affected the VI.

Selection of Optimal Spatial Scale
The pixel is the smallest unit of a remote sensing digital image. It reflects the features of the image and can be used to characterize the ground conditions in the study area. The pixel size determines the spatial resolution of a digital image and the amount of information it can contain. After resampling from high to low spatial resolution, the resultant image loses spectral information and spectral variation [65]. As the image scale (pixel size) increases, the spectral features of several different ground objects may appear simultaneously in a single pixel, producing a mixed pixel. In that case, the signal intensity of the ground object features in the pixel tends to be stable, and the pixel signals received by different sensors tend to be similar.
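The loss of spectral variation with coarser pixels can be sketched with a simple aggregation. The example below averages non-overlapping 2 x 2 blocks, which halves the resolution (for example, from 5 to 10 cm); block averaging is an assumption chosen for illustration and may differ from the resampling method used in this study. The patch values and function names are hypothetical.

```python
from statistics import pvariance


def block_mean_2x2(image):
    """Aggregate a 2D list by averaging non-overlapping 2 x 2 blocks."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1] + image[r + 1][c] + image[r + 1][c + 1]) / 4
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def flatten(image):
    """Return all pixel values of a 2D list as one flat list."""
    return [value for row in image for value in row]


# Hypothetical 4 x 4 reflectance patch mixing bright and dark surfaces.
fine = [
    [0.10, 0.50, 0.12, 0.48],
    [0.52, 0.08, 0.46, 0.14],
    [0.11, 0.49, 0.09, 0.51],
    [0.47, 0.13, 0.53, 0.07],
]
coarse = block_mean_2x2(fine)

print("variance, fine:  ", pvariance(flatten(fine)))
print("variance, coarse:", pvariance(flatten(coarse)))
```

In this constructed patch every coarse pixel mixes bright and dark surfaces, so the coarse values are nearly identical and the variance drops sharply, which mirrors the mixed-pixel effect described above.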
The correlation between the spectral values of Sequoia and P4M in each band (green, red, red edge, and near-infrared) seemed to be related to the image scale. When the image scale was small (5 cm), the correlation was lower; when the image scale was large (10 cm), the correlation was higher (Table 5). Fawcett et al. [66] found that the NDVI consistency of a multispectral sensor was similar at different spatial resolutions. Our results also show that the NDVI consistencies of Sequoia and P4M at 5 and 10 cm resolutions are similar (5 cm: R² = 0.9863; 10 cm: R² = 0.9863).
Table 5. Sequoia and P4M spectral value transformation functions and the values of their correlation coefficients (R²).

Limitations
The study was carried out on a single date in a single study area with uniform vegetation species; it would be better if different study areas with different vegetation species in different periods were used. When assessing the suitability of UAV sensors for determining VIs, it would be important to include agricultural land, preferably with different nutrient treatments or crop species, and future VI-related research should do so. Similarly, eight ASD points were used to assess the accuracy of NDVI because the terrain was uniform; still, better results could be obtained if more points were used. The use of more than two multispectral minisensors could also make the analysis of the consistency of spectral values, consistency of VI products, and accuracy of NDVI more meaningful. Therefore, detailed research is needed in the future to obtain improved results and conclusions.

Conclusions
Different UAV multispectral minisensors have been developed for applications in various fields, but their experimental performance and consistency need to be determined before their application. As a preliminary work towards consistency evaluation, different UAV images from Sequoia and P4M sensors with multispectral bands were acquired and preprocessed for ROI creation and VI calculation. The main objective of this research was to experimentally evaluate different UAV multispectral minisensors and compare them in terms of consistency. Using a combined method of consistency of spectral values, consistency of VI products, and accuracy of NDVI, we came to the following conclusions: First, the data acquisition capability of the Sequoia is similar to that of the P4M; both the spectral values and VIs of the two sensors have good correlation (R² > 0.90). Second, the VI products obtained from both sensors have good precision, and they are suitable for vegetation remote sensing monitoring. Third, both sensors have similar characteristics, and they may be used interchangeably for large area coverage with high spatial resolution and for daily time series science and applications.