Yield predictions of four hybrids of maize ( Zea mays ) using multispectral images obtained from RPAS in the coast of Peru

: Early assessment of crop development is a key aspect of precision agriculture. Shortening the time of response before a deficit of irrigation, nutrients and damage by diseases is one of the usual concerns in agriculture. Early prediction of crop yields can increase profitability in the farmer's economy. In this study we aimed to predict the yield of four maize commercial hybrids (Dekalb7508, Advanta9313, MH_INIA619 and Exp_05 P MLM ) using remotely sensed spectral vegetation indices (VI). A total of 10 VI (NDVI, GNDVI, GCI, RVI, NDRE, CIRE, CVI, MCARI, SAVI, and CCCI) were considered for evaluating crop yield and plant cover at 31, 39, 42, 46 and 51 days after sowing (DAS). A multivariate analysis was applied using principal component analysis (PCA), linear regression, and r-Pearson correlation. In the present study, highly significant correlations were found between plant cover with VIs at 46 (GNDVI, GCI, RVI, NDRE, CIRE and CCCI) and 51 DAS (GNDVI, GCI, NDRE, CIRE, CVI, MCARI and CCCI). The PCA indicated a clear discrimination of the dates evaluated with VIs at 31, 39 and 51 DAS. The inclusion of the CIRE and NDRE in the prediction model contributed to estimate the performance, showing greater precision at 51 DAS. The use of RPAS to monitor crops allows optimizing resources and helps in making timely decisions in agriculture in Peru.


Introduction
World population growth is constant over time, with estimates of 9.7 billion people by 2050 and 11 billion by 2100 [1]. Therefore, it is essential to strengthen food security, increasing crop production through the efficient use of resources for its sustainability. In this sense, corn is one of the most important cereals in the world and a staple food in many households. It is also a source in animal feed and a fundamental product in the food industry [2,3]. World production is estimated at 1,192 Mt, with the largest producers being the United States, China, Brazil and Mexico [4].
In the last decade, the use of technologies in agriculture has also increased significantly through the usage of geographic information systems (GIS), and global navigation satellite systems (GNSS), remote sensing, RPAS, machinery and other technologies that have supported precision agriculture [5][6][7]. The incorporation of these disciplines allows the collection, processing and analysis of temporal, spatial and individual data and combines them with other data for the implementation of adequate solutions in the use of resources, productivity, quality, profitability and sustainability of agricultural production [8][9][10][11]. A wide range of RPAS and satellite-mounted sensors have been used in phenotyping studies to obtain aerial images and monitor crop development [12,13]. Landsat and Sentinel-2 satellites collect images in the red wavelength and near infrared (NIR) to assess the health of crop development on a regional and global scale [14][15][16][17]. However, the spatial resolution has not been fine enough to meet the phenotypic measurement needs of various research projects in crops and in small areas at the level of small agricultural producers [12,18]. For this reason, the use of RPAS is currently gaining prestige as an integral part of precision agriculture, guaranteeing successful harvests [19].
On the other hand, RPAS with remote sensors can collect detailed information on the phenological development of crops through high spatial and temporal resolution images, which greatly reduces labor and time costs [20][21][22]. These sensors can acquire bands such as thermal infrared, RGB band, NIR band and red edge (RE) band [19]. These bands allow studying biomass growth, nitrogen content, yield, water stress and chlorophyll measurement in citrus, corn, wheat, soybean and grapevine crops [11,[22][23][24][25][26][27], through the application of vegetation indices (VI) such as the normalized difference vegetation index (NDVI) and others based on reflectance [12].
Precise estimation of maize yield at a local or regional scale helps improve food security and develop more supportive models [28]. In Peru, the cultivated area of corn during the 2019 -2020 season was 237,000 hectares with a production of 1.1 Mt per hectare (SIEA, 2021). Recently, maize cultivation has become highly susceptible to climate change conditions with strong variations in yield over the years [29]. It is also subject to the limited availability of technologies that help in the detection, monitoring and analysis of the crop. In this context, the use of RPAS and multispectral sensors are an excellent option to evaluate and estimate the production of this crop [30]. Consequently, in this study we evaluate the performance of four maize hybrids in the Peruvian coast, by applying VI calculated from multispectral images obtained from RPAS.

Study area
The data collection was carried out at the Centro Experimental La Molina (CELM) of the Instituto Nacional de Innovación Agraria (INIA) (-12° 4' W, -76° 56' S) (Figure 1), which is located in the district of La Molina, province and department of Lima (Peru). This area is characterized by a semi-arid climate, presenting an annual rainfall of 5.7 mm and an average temperature of 17.  A drip irrigation system was used, with a drip flow rate of 3.7 l/h and a distance between drippers of 0.2 m. Management practices such as weed and pest control were carried out manually and the use of herbicides as part of the agronomic management of the field.

Data collection
Yield data was obtained from a representative area of 32.8 m 2 in each experimental unit, which was then expressed as t/ha. The process consisted of weighting the total corn grains of each plot, and then extracting a 200 g sample to be dried in an oven at 60°C for an interval of 72 hours, reaching a grain moisture of approximately 10% to estimate yield per hectare.
Images obtained from the RPAS covered the different stages of maize development, which were taken at a height of 30 meters. They were collected between 11:00 a.m. and 2:00 p.m. to minimize changes in the solar zenith angle in cloudless weather conditions [31]. Five dates were selected for the acquisition of the images at 31, 39, 42, 46 and 51 DAS between the months of January to March 2021. The RPAS Phanthon 4 Pro (https://www.dji.com/phantom-4-pro?site=brandsite&from=nav, Shenzhen, China, accessed on 22 March 2022) coupled with Parrot Sequoia multispectral camera (Parrot SA, París, France) was used to acquire the images of the 48 study units.
Four bands were acquired in the wave ranges from 530 to 570 nm (Green); 640 to 680 nm (Red); 730 to 740 nm (Red edge); 770 to 810 nm (Near-Infrared), all at a multispectral image´ spatial resolution of 1 megapixels [32].
For the acquisition of precise images, a luminosity sensor located in the upper part of the RPAS was used. The flight plan was designed with a 75% overlap between images. On the other hand, for the georeferencing of the images, seven ground control points (GCPs) were used, which were measured using a high-precision GNSS, which were marked with topographic targets [33].

Figure 2.
Equipment used in data collection. RPAS and radio control, and Parrot Sequoia camera.

Canopy cover estimation
To calculate the canopy cover, the Image Classification, Editor, and spatial analysis tools (Spatial Analyst Tools) of the ArcMap software (ArcGiS 10.4.1) were used. Manual classification of the images (Achicanoy et al., 2018) was carried out in three classes (vegetation cover, soil and shade). From these, an output surface map with the vegetation cover was generated, which allowed calculating the corn cover percentages from the mosaic of photos obtained with the Phantom 4 drone for each date of images generated in the field.

Vegetation indices estimation
The multispectral images of the RPAS missions were made with the software Pix4D Capture (flight plan management) and processed in the Pix4Dmapper (V4.5.6, Pix4D S.A., Prilly, Switzerland) that allowed to generate the orthomosaic. We also performed the geometric correction and obtained the reflectance values [25]. Before calculating the VI, the supervised classification was applied identifying i) maize, ii) weeds, iii) shade, and iv) soil, which allowed determining the maize cover [34]. The VI were estimated within the area of the corn cover that was previously extracted through spatial mask extraction processing in the software ArcGIS 10.5. In the Table 1 shows the indices evaluated on the five study dates.

Data analysis and model development
Firstly, agronomical yield measurement for each corn hybrid were estimated based on the weights of dry grain of corn expressed in tons per hectare. The averages and standard error for each hybrid and the comparison of means was carried out by a Duncan.test with alpha=0.05.
The canopy cover and VIs were estimated from the multispectral at 31, 39, 42, 46 and 51 DAS. A Duncan.test means comparison was performed among corn hybrids on each date evaluated. With the data over time, box plot graphs were constructed over the five dates for each variable evaluated in the experiment.
A principal component analysis (PCA) was performed to determine the variations between each VI over time and determine the most relevant index in predicting yield. Subsequently, the r-Pearson correlation was applied to the indices with greater performance accuracy. Finally, the yield means between the four maize varieties were compared using Duncan's test with con α=0.05. We used the following libraries within R, [44] fac-toMineR [45], ggplot2 [46], factoextra [47], GGally [48], Hmisc [49] and agricolae [50]. Figure 3 shows the results of applying the Duncan test to compare the means of yield per hybrid at 51 DAS and the percentage of canopy cover, according to DAS and VI. Two groups without significant differences were identified, the first consisting of Advanta9313 (9.91 ± 2.15 t/ha) and Dekalb7508 (8.85 ± 1.38 t/ha), and the second MH_INIA619 (6.23 ± 1.51 t/ha) and Exp_05MLM (5.81 ± 1.21 t/ha) (Figure 3a). The yield varies from 5.81 to 9.91 t/ha. The hybrids Advanta9313 and Decalb7508 presented the highest performance and hybrid Exp_05MLM reported the lowest (Figure 3b). At the level of canopy coverage at 31 DAS, the hybrid Exp_05MLM presented greater coverage, followed by Dekalb7508 and MH_INIA619. Dekalb7508 presented greater canopy coverage at 39 and 46 DAS. However, at 51 DAS the canopy cover for the four hybrids were similar.

Development of prediction models to calculate crop yields
The relationship between crop yield and VI is shown in Table 3. The NDVI, MCARI and SAVI indices reported highly significant correlations for all evaluation days. In turn, correlation for the GNDVI, GCI, RVI indices at 51 DAS varies from very significant to medium importance. However, for the CVI and CCCI indices, the correlation increases according to the DAS. The results of the PCA are presented in Figure 5 for the five dates evaluated. There is a variability of the VI according to the temporality, the greater the distance from the calculation, the greater the difference between them. For comparison, on days 31, 39 and 51 DAS, there is no group overlap (Figure 5a), indicating a clear discrimination of the groups in this multivariate analysis. However, the opposite occurs at 39, 42 and 46 DAS, where there is an overlap because they are very close dates that do not allow a clear discrimination of the indices generated on the maize plant cover.
When performing the PCA for the canopy cover and yield indices at each date of the generated images (Figures 5b-e), we observe that there is no clear discrimination of groups with respect to the maize hybrids evaluated, but at 51 DAS (Figure 5f) two groups are observed. The corn yield at 51 DAS goes in the same direction as most VI, which is also reflected in the correlations with greater significance for that date (Table 2). Based on the indices that presented significant correlations with a P value > 0.45 (GCI, NDRE, CIRE and CCCI), two crop yield prediction models were built at 46 and 51 DAS. Model 1 was built based on the NDRE, CIRE and CCCI indices at 46 DAS, which reported a coefficient of determination (R 2 ) of 0.34 ( Figure 6a) and does not show a significant increase per index (Figures 6b-d). In turn, model 2 was built from the GCI, NDRE and CIRE indices at 51 DAS with an R 2 of 0.62 (Figure 6e). However, at the index level, the GCI Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 17 May 2022 doi:10.20944/preprints202205.0231.v1 reports the lowest R2 (R 2 = 0.33) (Figure 6f) with respect to the NDRE and CIRE, which presented R 2 values of 0.61 (Figures 6g-h).

Figure 6.
Maize yield prediction models with multiple linear regression; a) Model 1 for performance prediction using the NDRE, CIRE and CCCI indices at 46 DAS; b) Model 1 for performance prediction using NDRE at 46 DAS; c) Model 1 for performance prediction using CIRE at 46 DAS; d) Model 1 for performance prediction using CCCI at 46 DAS; e) Model 2 for performance prediction using the GCI, NDRE and CIRE indices at 51 DAS; f) Model 2 for performance prediction using GCI at 51 DAS; g) Model 2 for performance prediction using NDRE at 51 DAS; h) Model 2 for performance prediction using CIRE at 51 DAS.

Discussion
The use of multispectral images obtained from RPAS allowed the prediction of the yield of the maize crop in this study. One of the advantages of using RPAS in the monitoring of experimental plots or crops is that it can be controlled remotely and generates lower maintenance costs and acquisition of high-resolution images [51]. Hybrid Ad-vanta9313 (9.91 t/ha) presented the highest yield at 51 DAS, a value higher than the national average of 4.77 t/ha [52] and similar to those reported by Gavilánez-Luna & Gómez-Vargas [53]. This superiority in yield performance could be due to its wide adaptability to the maize areas of Peru and its good production stability [54].
For the estimation of canopy cover in the experimental plot, a total of 10 VIs were selected (Table 1). VIs were calculated using multispectral reflectance measurements at visible and near-infrared wavelengths. This range of lengths have been used in different precision agriculture applications such as plant counting, growth monitoring, phenology and chlorophyll measurement. [24,31,51,55,56]. At the VI level, it is observed that at 46 and 51 DAS, there are high significant correlations, since, at this stage, the chlorophyll content also increases significantly, as does the canopy cover. In nine indices, values increased cumulatively with advancing growing season. Only the MCARI showed a decrease at 42 DAS. The NDVI values ranged from 0.75 to 0.83 throughout the evaluation, unlike the GNDVI that went from 0.65 to 0.75, the SAVI being very similar to the NDVI on the first date evaluated. The average values were 0.4 and for the 51 DAS they oscillated around 0.70.
In regards to calculated linear regression models, better results were obtained at 51 DAS and only when they were generated from a single vegetation index. Indices NDRE and CIRE presented the highest R 2 (0.61) with the following models: 92.454*NDRE-26.393 and 17.752*CIRE-13.544. On the other hand, the R 2 values are lower than those obtained by Barzin et al. [57], who used the index OSAVI y SCCCI. At the same time Sunoj et al. [58] used exponential and nonlinear NDVI models for yield prediction and obtained R 2 values greater than 0.90. The lower correlations in the early stage may be due to the fact that the physiological characteristics of maize do not yet show significant differences. In other studies, they used satellite images such as MODIS and Landsat 8, where they found high performance predictions at 65-75 and 60-62 DAS, respectively [26,59].
The NDVI showed a low correlation (0.15) for yield estimation. However, higher NDVI values (0.53) have been reported in other studies, this may be due to the location and range of the electromagnetic spectrum taken by the Parrot Sequoia camera [27]. These results are also different from those obtained in wheat crops where the NDVI values fluctuated from 0.40, 0.49 and 0.45, for the early, intermediate and late grain-filling stages for full irrigation treatment [60]. In another study carried out with Landsat images, the indices that best predicted corn crop yield were Enhanced Vegetation Index (EVI), SAVI and Optimized Soil-Adjusted Vegetation Index (OSAVI), which were different from the NDVI [61]. Through the use of Landsat-7 ETM+ and Spot 5 images, they found high correlation values of the NDVI index in the yield of sugarcane, sugar and barley [62,63].
At small spatial scales, our study provided insight into the potential of using remotely sensed spectral VI from RPAS' multispectral images for monitoring and estimating maize field yield. RPAS and multispectral cameras can provide substantial spatial data on crop yield and quality at low cost [64]. In addition, they provided an opportunity to monitor farmers' plots, where crop monitoring is limited. Finally, this study opens new doors for the development of research and practical applications of drones in agriculture for all the three regions of Peru: costa, sierra and selva (coast, highlands and Amazon).

Conclusions
Results indicated a highly significant correlation between canopy cover and 10 VIs derived from RPAS multispectral images. Performance showed high correlations at 46 DAS with six indices (GNDVI, GCI, RVI, NDRE, CIRE and CCCI) and at 51 DAS with seven indices (GNDVI, GCI, NDRE, CIRE, CVI, MCARI and CCCI). Prediction models for performance from multiple correlations at 46 and 51 DAS were similar when three indices or just one were used. The PCA indicated a clear discrimination of the dates evaluated with the VI at 31, 39 and 51 DAS. The corn hybrids Dekalb7508 and Advanta9313 presented better performance than MH_INIA619 and Exp_05PMLM. Corn yield showed a high correlation during the reproductive stage (46 and 51 DAS) with the indices (GNDVI, GCI, RVI, NDRE, CIRE, CVI, MCARI and CCCI). In short, when compared to manual evaluation, VIs will allow timely decisions to be made when monitoring corn crops.