A Novel Vegetation Index for Coffee Ripeness Monitoring Using Aerial Imagery

Abstract: Coffee ripeness monitoring is a key indicator for defining the moment to start the harvest, especially because coffee quality is related to the degree of fruit ripeness. The most widely used method to define the start of harvesting is visual inspection, which is time-consuming, labor-intensive, and does not provide information on the entire area. There is a lack of new techniques or alternative methodologies that provide faster measurements to support harvest planning. Based on that, this study aimed to develop a vegetation index (VI) for coffee ripeness monitoring using aerial imagery. For this, an experiment was set up in five arabica coffee fields in Minas Gerais State, Brazil. During the coffee ripeness stage, four flights were carried out to acquire spectral information on the crop canopy using two quadcopters, one equipped with a five-band multispectral camera and another with an RGB (Red, Green, Blue) camera. Prior to the flights, manual counts of the percentage of unripe fruits were carried out using irregular sampling grids on each day for validation purposes. After image acquisition, the coffee ripeness index (CRI) and five other VIs were obtained. The CRI was developed by combining reflectance from the red band and from a ground-based red target placed in the study area. The effectiveness of the CRI was compared with that of traditional VIs under different analyses. The CRI showed a higher sensitivity to discriminate coffee plants ready for harvest from those not ready for harvest in all coffee fields. Furthermore, the highest R² and lowest RMSE values for estimating coffee ripeness were also presented by the CRI (R²: 0.70; RMSE: 12.42%), whereas the other VIs showed R² and RMSE values ranging from 0.22 to 0.67 and from 13.28 to 16.50%, respectively. Finally, the study demonstrated that the time-consuming fieldwork can be replaced by the methodology based on VIs.


Introduction
Coffee is the second most traded commodity worldwide, playing an important role in the economy of several Latin American, African, and Asian countries [1]. Brazil is by far the world's largest producer and exporter of coffee beans, accounting for 35% of global production in the 2019/2020 season [2]. About 70% of Brazilian production is of the arabica type (Coffea arabica L.), and Minas Gerais is the state with the largest production. These facts underline the considerable economic and social importance of this crop for Brazil.
The growing global demand for specialty coffee makes it necessary to develop production strategies that differentiate the Brazilian product. The price of coffee depends on the beverage quality, which in turn is influenced by the level of fruit ripeness at harvest and, among other things, by environmental and soil conditions, crop management, and post-harvest practices [3][4][5]. Beverage quality is higher when obtained from ripe fruits (cherries), whereas unripe and overripe fruits degrade the beverage quality, as well as the color and size uniformity of the grains [4][5][6]. Therefore, knowing

Study Area
This study was carried out in five fields of arabica coffee (Coffea arabica L.) located on the Jatobá farm, municipality of Paula Cândido, Minas Gerais State, Brazil (42° 39.997 S-lower right corner, and 753 m above sea level) (Figure 1). The relief in the area is mountainous (slope varies from 0 to 45%), and the climate is classified as "Cwa" (humid subtropical with dry winter and hot summer) according to the Köppen-Geiger climate classification [29].
The UAV campaigns were carried out under clear-sky conditions between 11:00 and 13:00 local time using a flight plan previously defined with the DroneDeploy software (DroneDeploy Inc., San Francisco, CA, USA). Before the UAV flights, 20 ground control points (GCPs) were distributed around the study area for later geometric correction of the orthomosaic map. The GCP coordinates were then obtained using a topographic GNSS (Global Navigation Satellite System) receiver, model Trimble ProXT (Trimble Inc., Sunnyvale, CA, USA). The data collection timeline and flight specifications of both UAV platforms are summarized in Table 2.
Before and after each flight, images of the reflectance target provided by MicaSense were taken at 1 m height to perform the radiometric calibration of the images during post-processing. The RGB camera onboard the Phantom 4, on the other hand, does not have a specific calibration target or system. Therefore, for this study, four grayscale reflectance targets (85, 27, 12, and 7%) made of plywood and covered with synthetic nappa leather of polyvinyl chloride (PVC) were placed in the field during all flight campaigns and used to calibrate the RGB bands (Figure 2). Additionally, a red target made of the same material, with dimensions of 0.5 × 0.5 m, was used to obtain the coffee ripeness index. The reflectance of all targets was measured using a portable spectroradiometer, ASD Handheld 2 (Analytical Spectral Devices, Inc., Boulder, CO, USA), which operates in the wavelength range from 325 nm to 1075 nm with a resolution of ±1 nm. A Spectralon plate was used as the white reference for spectroradiometer calibration.

UAV Imagery Pre-Processing
All images were recorded in RAW format and, after processing, converted to the Tagged Image File Format (TIFF). The images were processed in the Agisoft™ Metashape software, version 1.5.3 (Agisoft LLC, St. Petersburg, Russia). The RedEdge MX images were radiometrically calibrated using the correction factors of the MicaSense calibration target, and the RGB camera images were calibrated using the vicarious method [31][32][33]. For that, the pixels of the four PVC targets were manually clipped. Then, a linear regression model was fitted for each of the three bands using the average pixel digital number (DN) of the targets and their reflectance values obtained with the spectroradiometer. The DN values of the whole images were converted to reflectance using the regression equations.
Subsequently, all images were processed to create the orthomosaics. First, the five bands from the RedEdge MX needed to be aligned, since the sensor acquires one image per channel. Thus, the five bands were aligned and grouped into a single file for each multispectral image. Then, the orthomosaics of both cameras were created through the following steps: (1) image alignment using data from the UAV's GPS unit; (2) construction of a three-dimensional point cloud using the Structure from Motion (SfM) technique [34,35]; (3) creation of the dense point cloud, which served as the basis for (4) creating the Digital Surface Model (DSM); (5) projection of every image pixel onto the DSM to generate the orthomosaics [36,37]; and finally, (6) georeferencing of all orthomosaics with the GCP coordinates using QGIS, version 3.2 [38] (Figure 3).
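The per-band regression used in the vicarious calibration can be illustrated with a minimal sketch. It assumes an ordinary least-squares line relating mean target DN to measured target reflectance; the function names (`fit_line`, `dn_to_reflectance`) and all DN values are hypothetical placeholders and are not taken from the study, which does not report them here.

```python
# Illustrative sketch of a per-band linear (DN -> reflectance) calibration.
# The DN values are hypothetical placeholders, NOT data from the study.

def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def dn_to_reflectance(dn, intercept, slope):
    """Apply a fitted calibration line to a single pixel DN."""
    return intercept + slope * dn


# Nominal reflectance of the four grayscale targets (85, 27, 12, and 7%),
# expressed as fractions.
TARGET_REFLECTANCE = [0.85, 0.27, 0.12, 0.07]

# Hypothetical mean DN of the clipped target pixels, one list per band.
mean_dn_by_band = {
    "red": [231.0, 96.0, 51.0, 37.0],
    "green": [228.0, 99.0, 55.0, 40.0],
    "blue": [225.0, 101.0, 58.0, 43.0],
}

# One regression model per band, as described in the text.
models = {
    band: fit_line(dn, TARGET_REFLECTANCE)
    for band, dn in mean_dn_by_band.items()
}

# Convert one hypothetical red-band pixel to reflectance.
red_intercept, red_slope = models["red"]
pixel_reflectance = dn_to_reflectance(120.0, red_intercept, red_slope)
```

In practice the same line would be applied to every pixel of the corresponding band; a raster library would normally handle that step.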

Laboratory Experiment for Coffee Fruit Ripeness Spectra Characterization
The spectral reflectance patterns of coffee fruits help to explain the main differences and challenges of monitoring ripeness at the field level, compared with what is observed in the UAV images. Based on that, a laboratory reflectance spectroscopy experiment was performed to characterize the spectral pattern of fruit ripeness. Five samples of coffee fruits (500 fruits per sample) with different percentages of unripe and ripe fruits were arranged on a flat surface, 8 cm from the spectroradiometer and 80 cm from two halogen lamps (300 W) (Figure 4). Five measurements per sample were taken with the spectroradiometer.

Extraction of the Vegetation Indices and Field Assessments of the Coffee Ripeness
From the GeoTIFF orthomosaics, the following VIs were obtained for the study area: the Coffee Ripeness Index (CRI); the Green-Red Ratio Ripeness Index (GRRI); the Modified Chlorophyll Absorption in Reflectance Index 1 (MCARI1); the Normalized Difference Vegetation Index (NDVI); the Normalized Difference RedEdge Index (NDRE); and the Green Normalized Difference Vegetation Index (GNDVI) (Table 3). The GRRI was selected because it was previously created to assess coffee ripeness using UAV imagery, whereas the MCARI1, NDVI, NDRE, and GNDVI were selected due to their good relationship with vegetation pigments, as well as their extensive use in crop monitoring studies. These five VIs were chosen for comparison with the CRI.

Table 3. References and equations of the spectral vegetation indices evaluated in this study.

Vegetation Index    Equation                 Reference
CRI                 (R / R_target) × 100

N, near-infrared; R, red; G, green; RE, RedEdge; and R_target, average reflectance value of the red target in the red band.
The VIs were obtained at the sampling point level, where each sampling point was defined as three plants in the same cultivation row. After calculating the VIs, polygonal masks were manually created for each sampling point using the QGIS software. Then, the average values of the pixels within each polygon were extracted using the zonal statistics tool.
For the validation of the VIs, manual measurements of coffee ripeness were carried out on the same dates as image acquisition (Table 2). An irregular grid with 20 samples per hectare was set up on each measurement day for fields A, C, D, and E. For field B, only 10 samples per hectare were considered due to its lower fruit load. On each plant, four plagiotropic branches were randomly chosen in the plant's middle third, one branch per quadrant. The unripe fruits and the total number of fruits on the branches of each plant were then counted, and the average value was used to represent each sampling point. Finally, using the percentage of unripe fruits, the sampling points were classified into two ripeness classes: not ready for harvest, with more than 30% unripe fruits; and ready for harvest, with less than 30% unripe fruits. This proportion has been used by farmers in the region to begin the harvest.
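As a rough illustration of how the CRI and the per-polygon averaging fit together, the sketch below computes CRI = (R / R_target) × 100 on a tiny raster and averages it inside a boolean mask. It is not the QGIS zonal statistics tool; the function names, the nested-list raster, and all reflectance values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Minimal sketch: CRI per pixel and its mean inside a sampling-point mask.
# Rasters are plain nested lists; all values are hypothetical placeholders.

def mean(values):
    return sum(values) / len(values)


def cri(red, red_target):
    """CRI for one pixel: red reflectance relative to the red target, in %."""
    return red / red_target * 100.0


def cri_map(red_band, red_target_pixels):
    """Apply the CRI to every pixel, using the mean red-band reflectance
    of the red target pixels as R_target."""
    r_target = mean(red_target_pixels)
    return [[cri(value, r_target) for value in row] for row in red_band]


def zonal_mean(raster, mask):
    """Average of the raster pixels where the mask is True
    (a stand-in for a zonal statistics tool)."""
    selected = [
        value
        for raster_row, mask_row in zip(raster, mask)
        for value, inside in zip(raster_row, mask_row)
        if inside
    ]
    return mean(selected)


# Hypothetical 2 x 2 red-band reflectance raster and red target pixels.
red_band = [[0.05, 0.10], [0.15, 0.20]]
red_target_pixels = [0.48, 0.52]  # mean = 0.50

# Hypothetical sampling-point polygon rasterized as a boolean mask.
sampling_point_mask = [[True, False], [False, True]]

cri_raster = cri_map(red_band, red_target_pixels)
sampling_point_cri = zonal_mean(cri_raster, sampling_point_mask)
```

With these placeholder numbers the two masked pixels have CRI values of about 10 and 40, so the sampling-point value is about 25.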
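The field classification rule can be sketched as follows. The sketch assumes that the percentage of unripe fruits is first computed per plant (pooling its four branches) and then averaged over the plants of a sampling point; this is one reading of the procedure above, not a confirmed detail. The text does not say how a value of exactly 30% is treated, so assigning it to "ready for harvest" is also an assumption. All counts are hypothetical.

```python
# Sketch of the ripeness classification from manual fruit counts.
# Assumptions (not stated in the source): per-plant pooling of branches,
# then averaging across plants; exactly 30% is treated as "ready".

HARVEST_THRESHOLD = 30.0  # % of unripe fruits used by farmers in the region


def plant_percent_unripe(branches):
    """branches: list of (unripe_count, total_count), one tuple per branch."""
    unripe = sum(u for u, _ in branches)
    total = sum(t for _, t in branches)
    return 100.0 * unripe / total


def sampling_point_percent_unripe(plants):
    """plants: list of per-plant branch lists; returns the plant average."""
    values = [plant_percent_unripe(branches) for branches in plants]
    return sum(values) / len(values)


def ripeness_class(percent_unripe):
    if percent_unripe > HARVEST_THRESHOLD:
        return "not ready for harvest"
    return "ready for harvest"


# Hypothetical counts: three plants, four branches each (unripe, total).
plants = [
    [(10, 40), (5, 20), (0, 20), (5, 20)],     # 20% unripe
    [(10, 25), (10, 25), (10, 25), (10, 25)],  # 40% unripe
    [(15, 25), (15, 25), (15, 25), (15, 25)],  # 60% unripe
]

point_value = sampling_point_percent_unripe(plants)
point_class = ripeness_class(point_value)
```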

Statistical Analysis
Initially, an analysis of variance (ANOVA) was carried out between the ripeness classes to evaluate the potential of the six VIs to discriminate plants that were ready for harvest from those that were not ready. This analysis was first performed for each coffee field; then, all data were grouped into a single dataset to evaluate the influence of the different cultivars, crop canopy, and yield on the performance of the VIs.
Next, linear regression (Y = β₀ + β₁X) was used to model the relationship between the VIs and the coffee ripeness measured in the field. In addition, the significance of the regression coefficients was evaluated using the t-test at 1% probability (p < 0.01). Finally, to determine which VI best fitted the coffee ripeness data, the following statistical metrics were calculated: the coefficient of determination (R²) and the root mean square error (RMSE). All statistical analyses were performed using the R software, version 3.6.1 [42].
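For readers unfamiliar with the test, the following sketch shows how a one-way ANOVA F statistic is obtained for one VI split into the two ripeness classes. The study ran its analyses in R; Python is used here only for illustration, the function name is hypothetical, and the VI values are invented placeholders.

```python
# Sketch of a one-way ANOVA F statistic for one VI across ripeness classes.
# The VI values are hypothetical placeholders, NOT data from the study.

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means)
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical CRI values at sampling points of each ripeness class.
cri_not_ready = [18.2, 19.5, 17.8, 20.1, 18.9]
cri_ready = [24.6, 26.3, 25.1, 27.0, 25.8]

f_stat, df_between, df_within = one_way_anova_f([cri_not_ready, cri_ready])
```

The p-value follows from the F distribution with (df_between, df_within) degrees of freedom; a statistics package (for example `scipy.stats.f_oneway` in Python, if SciPy is available) would normally provide it directly.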
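The regression and its two metrics can be sketched as below. The sketch assumes the usual definitions, R² = 1 − SS_res / SS_tot and RMSE = sqrt(SS_res / n); the text does not state which RMSE denominator was used, so dividing by n is an assumption. The function names and data are hypothetical.

```python
import math

# Sketch of Y = b0 + b1 * X with R^2 and RMSE.
# Assumption: RMSE divides the residual sum of squares by n.
# The data are hypothetical placeholders, NOT results from the study.


def fit_simple_linear(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def r_squared(y, y_hat):
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def rmse(y, y_hat):
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return math.sqrt(ss_res / len(y))


# Hypothetical VI values (X) and percentage of unripe fruits (Y).
vi_values = [18.0, 20.0, 22.0, 24.0, 26.0]
unripe_percent = [72.0, 55.0, 41.0, 30.0, 12.0]

b0, b1 = fit_simple_linear(vi_values, unripe_percent)
predicted = [b0 + b1 * x for x in vi_values]
fit_r2 = r_squared(unripe_percent, predicted)
fit_rmse = rmse(unripe_percent, predicted)
```

Because Y is a percentage of unripe fruits, the RMSE is expressed in percentage points, which matches how the values are reported in the text.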

Spectral Characterization of Coffee Fruits Ripeness
Laboratory measurements showed that the spectra of coffee fruits vary markedly with the degree of fruit ripeness (Figure 5). As expected, unripe fruits presented a reflectance peak in the green band (560 ± 20 nm), whereas in the red band (650 ± 30 nm) they showed an absorption feature, possibly related to the higher amount of chlorophyll absorbing in this region [7]. The reflectance then increased up to 900 nm in the NIR region. Conversely, as the percentage of unripe fruits decreased (Figure 5A-E), the samples tended to present a flat spectral behavior up to the red band, where they presented a reflectance peak, which is related to the reduction of chlorophyll pigments and the accumulation of anthocyanins [7]. After that, there was an increase in reflectance, as observed in the sample with 0% unripe fruits (Figure 5E).
In addition, the results showed that the R, G, and NIR bands discriminated unripe from ripe coffee fruits better, while in the RedEdge band (717 ± 10 nm) the spectra of both classes tended to overlap around 720 nm. The R, G, and NIR bands are therefore important for discriminating plants with unripe fruits from those with ripe fruits, and these differences can be detected using the VIs. For application under field conditions, however, the R band should be preferred over the G and NIR bands, since the coffee crop reflects a higher proportion of G and NIR radiation. This characteristic can make it difficult to differentiate coffee fruits from leaves at 60 m height (UAV imagery). Thus, the R band is better suited, since a subtle increase in its reflectance (i.e., an increase in fruit ripeness) is easier to detect than a reduction in G and NIR reflectance, which cannot discriminate coffee ripeness due to the spectral confusion between leaves and unripe fruits.

Potential of VIs for Discrimination of Coffee Ripeness Classes
The characterization of the arabica coffee fields used in this study is shown in Table 4. The study area comprises different coffee cultivars, and each coffee field presented a different canopy volume, plant density, and crop yield. Additionally, the terrain of the region is mountainous, and the slope varied across the coffee fields. Regarding crop yield, field B, which had the largest cultivated area, showed the lowest yield among all fields. This lower yield results from the biennial yield effect, a peculiar characteristic of this crop, which exhibits high and low yields in alternate years.
Results of the ANOVA showed that only the CRI (both cameras), the GRRI (RGB camera), and the MCARI1 were able to discriminate the coffee plants ready for harvest from those not ready for harvest (i.e., plants with ripe fruits from plants with unripe fruits) in all coffee fields (Table 5). Conversely, the GRRI obtained from the RedEdge MX showed significant differences only in fields A, B, and E. This result can be associated with different factors, such as the difference in dataset size between cameras, the temporal variability of the VI, and especially the specific characteristics of each coffee field. Among the other VIs, the NDRE presented the best results for discriminating coffee ripeness, with only field C showing no significant difference between ripeness classes. Conversely, the NDVI showed significant differences only for fields D and E, whereas the GNDVI was significant only in field B. Moreover, the coffee fields where the ripeness classes were not discriminated by the VIs presented either higher canopy volumes or higher plant density (Table 4).
A second ANOVA was performed considering the data from all coffee fields as a single dataset. This analysis aimed to evaluate the influence of the different crop characteristics on the performance of the VIs. Results showed that only the CRI, GRRI, MCARI1, and NDRE presented significant (p < 0.001) differences between the ripeness classes. Furthermore, Figure 6, especially the two circles, reinforces that the VI pixels on the crop canopy could effectively discriminate plants with unripe fruits from those with ripe fruits. In addition, the boxplots make clear that the CRI and MCARI1 presented the widest separation between the ripeness classes, whereas for the GRRI and NDRE there was some overlap between classes.

Relationship between VIs and Coffee Ripeness
The statistical metrics obtained with the VIs for estimating coffee ripeness are presented in Figure 7. In general, the linear models used to estimate coffee ripeness showed satisfactory fits (i.e., R² and RMSE) considering the characteristics of the fields. Overall, the highest R² and lowest RMSE values were obtained in fields C (R²: 0.70; RMSE: 12.42%) and A (R²: 0.68; RMSE: 12.86%) by the CRI using the RedEdge MX, followed by the MCARI1 (fields A and C, R²: 0.66 and 0.67; RMSE: 13.77 and 13.28%) and the CRI from the RGB camera [...] 16.50, 14.92, 13.42, 14.74, and 15.88%, respectively, for the GRRI (RedEdge MX and RGB camera), NDRE, NDVI, and GNDVI. When all coffee fields were considered as a single dataset, the linear models from all VIs performed worse. Moreover, it is worth mentioning that the differences in the statistical metrics between fields are related to the specific characteristics and dataset size of each coffee field.
Due to the large amount of data, only the linear models that ranked best on the R² and RMSE metrics are presented in Figure 8. Overall, the CRI (RedEdge MX) performed best in all fields except field E, where the MCARI1 fitted the data better. In addition, the CRI derived from the RedEdge MX performed better than its version from the RGB camera in all coffee fields. Even so, these VIs showed good results considering the variability between fields and the non-uniform fruit ripeness presented by the crop, which makes it harder to develop a universal model for coffee ripeness estimation.

Discussion
Coffee fruit ripeness monitoring is a crucial indicator for defining the optimal harvest time, especially because, unlike unripe and overripe fruits, ripe fruits (cherries) tend to produce beverages of higher quality [5]. Remote sensing is an effective approach that has been widely used to investigate crop parameters, and several studies have resulted in the development of vegetation indices [11,26,43,44]. In this study, a simple and effective VI based on a single spectral band was developed using UAV imagery for coffee fruit ripeness monitoring.
The CRI outperformed traditional VIs such as the MCARI1, NDVI, NDRE, and GNDVI, and also the GRRI, which was previously developed for coffee ripeness monitoring [23]. However, field characteristics such as plant density, canopy volume, and especially crop yield directly influenced the performance of the VIs. The spectral response of a crop canopy tends to be similar to that of a single leaf, but it is modified by many factors such as plant tissue optical properties, canopy structure, plant physiology, and climatic conditions [45][46][47]. For the coffee crop, there is an additional factor, unequal fruit ripening, which is practically inevitable under natural conditions because coffee blossoming in non-equatorial regions, such as south-central Brazil, occurs at different times in the same season (e.g., from August to November) in most production areas [28]. These factors made it more challenging to differentiate coffee fruits (unripe and ripe) from leaves when the VIs were used.
As stated before, at the laboratory level, the spectra of unripe and ripe fruits can be easily differentiated as the percentage of unripe fruits decreases (Figure 5). However, it can be very difficult to differentiate them using aerial imagery, especially under highly dense crop canopies that can induce spectral confusion. In the initial analysis, the CRI, GRRI (RGB camera), and MCARI1 were the only VIs capable of discriminating the ripeness classes in all coffee fields. The GRRI (RedEdge MX) and NDRE presented satisfactory performances, yet they were still influenced by field characteristics. On the other hand, the worst performance was presented by the NDVI and GNDVI, which saturated, especially in the fields with higher plant density and canopy volume (Table 4). This problem has been addressed in other studies, whose authors reported the influence of several factors related to crop species, leaf area, crop biomass, foliage, and others [13,[48][49][50].
When the data from the five fields were grouped, only the CRI, GRRI, MCARI1, and NDRE showed significant differences between the ripeness classes. This result is related to the change in fruit color over time, which is caused by the disappearance of chlorophyll pigments and the accumulation of anthocyanins [7] and which altered the spectral response of the crop canopy (Figure 6). Besides that, from fruit filling to the ripeness stage, coffee plants present a high nutritional demand, especially for NPK, which leads to an increase in nutrient translocation from the leaves to the fruits [51]. This can result in nutrient deficiency and in changes in leaf reflectance in the visible (400-700 nm) and NIR (700-1100 nm) wavelengths [52], which can be better detected with VIs that are more sensitive to chlorophyll pigments [53][54][55]. Together, these factors gave these VIs, and especially the CRI due to its higher sensitivity to changes in the red wavelength, a better capability to discriminate plants with unripe fruits from those with ripe fruits.
In the regression analysis, the CRI outperformed the other VIs in most coffee fields, except for field E, where the MCARI1 showed better performance (Figures 7 and 8). By comparison, the authors of [23] developed and tested the GRRI for ripeness monitoring in nine coffee fields and obtained an R² of 0.43. The authors of [9], on the other hand, reported a much better correlation (R²: 0.81) for seven coffee fields using the same index. However, these authors presented their results at the field level, which does not entirely represent the spatial variability of fruit ripeness. Moreover, all of their coffee fields presented high fruit display on the canopy exterior. In this study, by contrast, each sampling point comprised three plants, which better represented the spatial variability of fruit ripeness. In addition, not all five coffee fields presented high yield, which, as discussed before, played a major role in the performance of the VIs. In this sense, our results were more satisfactory than those presented above, especially due to the higher sensitivity shown by the CRI in detecting changes in fruit ripeness.
Regarding the results obtained with the CRI and GRRI from both cameras, the differences in the ANOVA and linear regression analyses are related to the radiometric calibration method and mainly to the dataset size of each camera. Besides having an individual CMOS sensor for each band, the RedEdge MX has a more complex calibration system composed of its downwelling light sensor and the factory-calibrated reference target (for more details, see [56,57]). For the RGB camera, on the other hand, the absence of a calibration system led us to use low-cost targets and vicarious calibration, a simple and effective method [58], but not as robust as the one provided with the RedEdge MX.
Apart from that, the dataset size was the main factor influencing the results of the VIs. The RGB camera was used in all four weeks, whereas the RedEdge MX was used only in the third and fourth weeks due to availability (Table 2). During the first two weeks, there was a higher percentage of unripe fruits than in the last two weeks. This increased the temporal variability of fruit ripeness and resulted in lower performance by the CRI from the RGB camera. Conversely, the version derived from the RedEdge MX presented a better result due to the higher percentage of ripe fruits in the last weeks, which could be detected from the crop canopy. The GRRI (RGB camera), in contrast, was positively influenced by the higher number of samples, unlike its RedEdge MX version (Table 5 and Figure 7).
In addition, the magnitude of the CRI and GRRI values from the RGB camera differed from that of the values derived from the RedEdge MX. This result is possibly associated with the sensitivity of the CMOS sensors, which can vary among wavelengths [59]. For the RGB camera, the same ISO and shutter aperture settings are used for all bands, since a single sensor registers the information from the R, G, and B bands. Conversely, the RedEdge MX does not have this limitation, since there is one CMOS sensor for each band, whose settings are individually adjusted according to the band's specifications. Regardless, both VIs obtained with the RGB camera presented satisfactory results, especially the CRI. In this sense, the RGB camera can be a feasible alternative for monitoring coffee ripeness, especially on small farms, due to its lower cost compared with the RedEdge MX.
Overall, our findings corroborate those of other authors [9,23], who stated that the use of remote sensing (RS) for fruit ripeness monitoring in the coffee crop is still challenging due to the plant architecture and the high canopy volume presented by most cultivars. In addition, another characteristic that plays a major role in the performance of RS-based methodologies in the coffee crop is the crop yield. Due to the biennial effect, the crop yield alternates between low and high every year [13]. This characteristic has a direct influence on the number of fruits displayed on the crop canopy, which ends up affecting the performance of the VIs. Despite that, the study demonstrated that the time-consuming manual fruit counts made on a few plants can be replaced by remote sensing approaches. Furthermore, these results fill a gap in the remote sensing literature on coffee fruit ripeness monitoring, which is a key factor in defining the beverage quality. Lastly, a recommendation for future studies is to use the CRI together with other variables (e.g., solar radiation, degrees Brix, canopy temperature) to predict fruit ripeness and beverage quality using machine learning algorithms.

Conclusions
In this study, a simple and effective vegetation index (VI) was proposed for coffee ripeness estimation using aerial imagery. The Coffee Ripeness Index (CRI) was developed by combining reflectance from the red band and from a ground-based red target. The effectiveness of the CRI was compared, in different analyses, with that of traditional VIs such as the MCARI1, NDVI, NDRE, GNDVI, and the GRRI in five coffee fields with distinct cultivation characteristics.
The CRI showed a higher sensitivity to discriminate coffee plants ready for harvest from those not ready for harvest in all coffee fields. However, field characteristics such as plant density, canopy volume, and especially crop yield played a key role in the performance of the VIs. Therefore, the methodology based on VIs, especially the CRI, can yield better results in coffee fields with higher fruit display on the canopy exterior.
Both cameras evaluated presented satisfactory results. The RGB camera can therefore be a feasible alternative for monitoring coffee ripeness, especially on small farms, due to its lower cost compared with the RedEdge MX. Finally, the study demonstrated that the time-consuming fieldwork can be replaced by the methodology based on VIs.