Feasibility of Combining Deep Learning and RGB Images Obtained by Unmanned Aerial Vehicle for Leaf Area Index Estimation in Rice

Leaf area index (LAI) is a vital parameter for predicting rice yield. Unmanned aerial vehicle (UAV) surveillance with an RGB camera has been shown to have potential as a low-cost and efficient tool for monitoring crop growth. Simultaneously, deep learning (DL) algorithms have attracted attention as a promising tool for the task of image recognition. The principal aim of this research was to evaluate the feasibility of combining DL and RGB images obtained by a UAV for rice LAI estimation. In the present study, an LAI estimation model developed by DL with RGB images was compared to three other practical methods: a plant canopy analyzer (PCA); regression models based on color indices (CIs) obtained from an RGB camera; and vegetation indices (VIs) obtained from a multispectral camera. The results showed that the estimation accuracy of the model developed by DL with RGB images (R² = 0.963 and RMSE = 0.334) was higher than those of the PCA (R² = 0.934 and RMSE = 0.555) and the regression models based on CIs (R² = 0.802–0.947 and RMSE = 0.401–1.13), and comparable to that of the regression models based on VIs (R² = 0.917–0.976 and RMSE = 0.332–0.644). Therefore, our results demonstrated that the estimation model using DL with an RGB camera on a UAV could be an alternative to the methods using PCA and a multispectral camera for rice LAI estimation.


Introduction
Leaf area index (LAI), which represents one half of the total green leaf area (i.e., half of the total area of both sides of all green leaves) per unit horizontal ground surface area [1], is a key vegetation parameter for assessing the mass balance between plants and the atmosphere [2,3], and plays an important role in crop growth estimation and yield prediction [4-6]. The efficiency of light capture and utilization is the ultimate factor limiting crop canopy photosynthesis, and LAI primarily determines the interception rate of solar radiation by a crop. Hence, accurate LAI estimation is important for evaluating crop productivity. Direct LAI measurements have been performed by destructive sampling, but this approach requires a great deal of labor and time. Moreover, it is often difficult to obtain a representative value for a plot because only a part of the plot can be surveyed by direct sampling. Therefore, in recent years, various indirect methods for LAI estimation have been developed to solve these problems.
One of the widely used estimation methods is an indirect measurement using optical measuring devices. The plant canopy analyzer (PCA) is a typical optical measuring device used for this purpose. The PCA can measure LAI in a non-destructive manner by observing the transmitted light in the plant community with a special lens and measuring the rate of its attenuation [7]. It has come to be widely used as an efficient measurement method.
[Figure caption, beginning not recovered] The data were taken in the direction of the arrows from the position of the enclosed numbers: 4 points were taken from between the plants in each of the 2 rows at a 45-degree angle to the direction of the rows towards the inside of the canopy (from No. 1 to 4), and 6 points were taken from between the rows parallel to the rows (from No. 5 to 10).

Generation of Ortho-Mosaic Images
The coordinates of the reference points installed at the four corners of the test field were obtained by closed traverse surveying with reference to the Japan Geodetic System 2011 Plane Cartesian Coordinate System 9 as a map projection method, and these four points were used as ground control points (GCPs) (Figure 2).

Figure 2. Examples of ortho-mosaic images (9 July). The green triangles in the four corners of the field represent ground control points (GCPs): (a) a multispectral ortho-mosaic image (near-infrared (NIR)); (b) an RGB ortho-mosaic image.
Ortho-mosaic images were created from the 5-band multispectral images and the RGB images taken with the UAV. Tie points were automatically detected in the overlapping areas between aerial images, and camera calibration (correction of the lens focal length, principal point position, and radial and tangential distortion) was performed with the tie points. After that, the parameters of external orientation (camera position and tilt angle) were estimated using the detected tie points and the four installed GCPs, and a 3D model was developed. This processing was performed so that the accuracy of the GCPs was within 1 pixel. Ortho-mosaic images (orthophoto images) for the 5-band multispectral camera and the RGB camera were generated from the respective 3D models (Figure 2). The resolutions of these images were 12 mm and 9 mm, respectively. When the multispectral ortho-mosaic images were generated, the attached light-intensive sensor automatically converted the digital numbers (DNs) into the reflectance of each band, and this reflectance was used for the calculation of VIs. For the RGB ortho-mosaic images, the DNs were used for the calculation of CIs. Metashape (Agisoft) was used for the above processing.


Calculation of Vegetation Indices and Color Indices
The average reflectance of the eight hills (60 cm × 60 cm) in each plot, which were taken for destructive LAI measurements, was extracted from the multispectral ortho-mosaic images of each band (blue, green, red, rededge and near-infrared) using polygons (15 cm × 30 cm) including each hill in a geographical information system (ArcMap, Esri) (Figure 3). Various VIs whose relationships with LAI have been reported were calculated from the multispectral reflectance. In this study, four types of VIs, namely the normalized difference vegetation index (NDVI), simple ratio (SR), modified simple ratio (MSR), and soil adjusted vegetation index (SAVI), were calculated from the reflectance of two bands (λ1, λ2) (Table 2). These VIs are usually calculated by substituting the reflectance of near-infrared and red for λ1 and λ2, respectively. In the present experiments, in addition to this substitution, we also substituted the reflectance of near-infrared and rededge, and the reflectance of rededge and red, for λ1 and λ2, respectively. In total, 12 VIs were calculated from the reflectance obtained from the multispectral ortho-mosaic images (Table 2).
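The 4 indices × 3 band pairs described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the formulas are the commonly cited definitions of NDVI, SR, MSR, and SAVI (with a soil adjustment factor of 0.5), which are assumed here because the formulas of Table 2 are not reproduced in the text, and the reflectance values are made up.

```python
import math

# Commonly cited two-band definitions (assumed; Table 2 is not available here).
def ndvi(b1, b2):
    return (b1 - b2) / (b1 + b2)

def sr(b1, b2):
    return b1 / b2

def msr(b1, b2):
    ratio = b1 / b2
    return (ratio - 1.0) / (math.sqrt(ratio) + 1.0)

def savi(b1, b2, soil_factor=0.5):
    # soil_factor = 0.5 is the usual default, assumed here.
    return (1.0 + soil_factor) * (b1 - b2) / (b1 + b2 + soil_factor)

# (lambda1, lambda2) substitutions described in the text.
BAND_PAIRS = [("nir", "red"), ("nir", "rededge"), ("rededge", "red")]
INDICES = {"NDVI": ndvi, "SR": sr, "MSR": msr, "SAVI": savi}

def all_vegetation_indices(reflectance):
    """Return the 12 VIs for one polygon's mean band reflectance."""
    out = {}
    for name, fn in INDICES.items():
        for b1, b2 in BAND_PAIRS:
            out[f"{name} ({b1}, {b2})"] = fn(reflectance[b1], reflectance[b2])
    return out

# Hypothetical mean reflectance of one hill polygon.
example = {"nir": 0.45, "rededge": 0.30, "red": 0.05}
vis = all_vegetation_indices(example)
```

Keeping the band pair as an argument rather than hard-coding NIR and red makes the three substitutions a loop over `BAND_PAIRS`.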
For the RGB ortho-mosaic images containing the DNs of three colors (red, green, and blue), small images containing eight hills (60 cm × 60 cm) were cut out at a resolution of 100 × 100 pixels (these cut-out RGB images were also used as input data for DL) (Figure 4), and the DNs of the three colors (red (R), green (G), and blue (B)) were acquired using an image processing software package (ImageJ; Wayne Rasband). The normalized DNs of the three colors, r, g, and b, were calculated by dividing the original DNs of red (R), green (G), and blue (B) by the sum of these three original DNs as follows:
$r = R / (R + G + B)$ (1)
$g = G / (R + G + B)$ (2)
$b = B / (R + G + B)$ (3)

Table 2. Columns: Index, Formula, Reference (table body not recovered). Note: the reflectance of NIR and red, NIR and rededge, and rededge and red obtained from the multispectral camera were substituted for λ1 and λ2; the normalized digital numbers (DNs) of red, green, and blue obtained from the RGB camera were substituted for r, g, and b (Equations (1)-(3)), respectively.
In this study, nine types of CIs whose relationships with LAI have been reported were calculated from the normalized DNs (Equations (1)-(3)) obtained from the RGB camera (Table 2): visible atmospherically resistant index (VARI), excess green vegetation index (ExG), excess red vegetation index (ExR), excess blue vegetation index (ExB), normalized green-red difference index (NGRDI), modified green red vegetation index (MGRVI), green leaf algorithm (GLA), red green blue vegetation index (RGBVI), and vegetativen (VEG).
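As an illustration of how CIs follow from the normalized DNs, the sketch below computes five of the nine indices. The formulas are the forms commonly given in the literature and are assumed here, since the formulas of Table 2 are not reproduced in the text; the exponent `a = 0.667` for VEG and the input DNs are likewise assumptions.

```python
def normalize_rgb(R, G, B):
    """Divide each DN by the sum of the three DNs (Equations (1)-(3))."""
    total = R + G + B  # a pure black pixel (total == 0) would need special handling
    return R / total, G / total, B / total

def color_indices(R, G, B, a=0.667):
    """Return a subset of CIs using commonly cited definitions (assumed)."""
    r, g, b = normalize_rgb(R, G, B)
    return {
        "ExG": 2 * g - r - b,
        "NGRDI": (g - r) / (g + r),
        "VARI": (g - r) / (g + r - b),
        "GLA": (2 * g - r - b) / (2 * g + r + b),
        "VEG": g / (r ** a * b ** (1 - a)),
    }

# Hypothetical mean DNs of one 100 x 100 pixel cut-out image.
cis = color_indices(R=90, G=140, B=70)
```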

Estimation Model Development and Accuracy Assessment
Replications 1 and 2 were used as training data for model construction (n = 48), and replication 3 was used as validation data to verify the model accuracy (n = 24). The RGB images used as training data for DL were augmented 12-fold (n = 576) by flipping them horizontally and vertically and by changing the brightness (×0.7 and ×1.4) (Figure 5).
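One way to arrive at a 12-fold increase is to combine four flip states (none, horizontal, vertical, both) with three brightness factors (1.0, 0.7, 1.4). The sketch below assumes this combination, which the text does not spell out, and uses plain nested lists of (R, G, B) tuples in place of real image arrays.

```python
from itertools import product

def flip_horizontal(img):
    return [row[::-1] for row in img]

def flip_vertical(img):
    return img[::-1]

def scale_brightness(img, factor):
    # Scale every channel and clip to the 8-bit maximum.
    return [[tuple(min(255, round(c * factor)) for c in px) for px in row]
            for row in img]

# Assumed combination: 4 flip states x 3 brightness factors = 12 variants.
FLIPS = [
    lambda im: im,
    flip_horizontal,
    flip_vertical,
    lambda im: flip_vertical(flip_horizontal(im)),
]
BRIGHTNESS = [1.0, 0.7, 1.4]

def augment(img):
    """Return all flip/brightness variants, the original included."""
    return [scale_brightness(flip(img), f) for flip, f in product(FLIPS, BRIGHTNESS)]

# Tiny 2 x 2 stand-in for a 100 x 100 pixel cut-out image.
tiny = [[(10, 20, 30), (40, 50, 60)],
        [(70, 80, 90), (200, 210, 220)]]
variants = augment(tiny)
```

With 48 training images, 12 variants per image gives the 576 images stated in the text.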

Figure 5 (panel labels recovered: Original; Flip horizontal).
Based on each of the calculated VIs and CIs, regression models of the ground-measured LAI were developed by the least-squares method, and their accuracy was verified.
In a previous study, CIs were used as inputs to machine-learning algorithms to develop LAI estimation models [22]. Therefore, in this research, CIs were also used as input data for the machine-learning algorithms and DL, in addition to RGB images, for relative evaluation. In total, three patterns of input datasets (nine types of CIs; RGB images; and nine types of CIs plus RGB images) were prepared for the machine-learning algorithms and DL to assess the potential estimation accuracy of the RGB camera. LAI estimation models were then constructed by the machine-learning algorithms and DL using each of the input datasets, and their accuracy was assessed.
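The fit-then-validate procedure can be sketched as below: a regression is fitted on training data by least squares and scored on held-out data with the coefficient of determination and RMSE. A straight-line model and R² computed as 1 − SS_res/SS_tot are assumptions made for illustration (the regression forms of Table 4 and the exact R² definition are not given in the text), and all data values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def r_squared(observed, predicted):
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical index values and ground-measured LAI.
train_index, train_lai = [2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0]
valid_index, valid_lai = [3.0, 5.0, 7.0], [1.6, 2.4, 3.6]

slope, intercept = fit_line(train_index, train_lai)
valid_pred = [slope * v + intercept for v in valid_index]
valid_rmse = rmse(valid_lai, valid_pred)
valid_r2 = r_squared(valid_lai, valid_pred)
```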

For comparison with DL, four machine-learning algorithms were used in this study: artificial neural network (ANN), partial least squares regression (PLSR), random forest (RF), and support vector regression (SVR). Scikit-learn, an open-source Python library, was used for model development. The main tuning parameters of each algorithm (ANN: the number of hidden-layer neurons and the maximum number of iterations; PLSR: the number of explanatory variables; RF: the number of trees (ntree) and the number of features to consider (mtry); SVR: gamma, C, and epsilon) were adjusted by grid search before the estimation models were developed.
The Neural Network Console (SONY), an integrated development tool for DL, was used to develop the estimation models. For the model using the nine types of CI data, we designed a simple neural network with 5 fully connected layers (Figure 6). The convolutional neural network (CNN), which enables area-based feature extraction and recognition that is robust against image movement and deformation, is known to be effective for the task of image recognition [33]. Deepening the CNN plays an important role in accurate image recognition, because each successive layer extracts more sophisticated and complex features from the images. ResNet is a network structure that has successfully stacked up to 152 layers, and it achieved much higher accuracy than conventional network structures, and higher accuracy than humans, for this purpose [45]. However, ResNet has the disadvantage of a complex architecture. ResNeXt achieved better accuracy than ResNet and reduced the calculation cost by introducing grouped convolution into ResNet [46]. Moreover, ResNeXt has shown high potential in agricultural research [47,48]. In this research, ResNeXt was modified to accept our datasets and was used to develop LAI estimation models from the input datasets containing images (RGB images, and nine types of CIs plus RGB images) (Figure 6). Hyperparameters were determined as shown in Table 3.
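The cost reduction that grouped convolution brings can be illustrated by counting weights. In a grouped convolution each group connects only its own slice of input channels to its own slice of output channels, so the weight count falls by a factor equal to the number of groups. The channel counts, kernel size, and group count below are hypothetical and are not taken from the network in Figure 6 or Table 3.

```python
def conv_weights(c_in, c_out, kernel, groups=1):
    """Number of weights (bias excluded) in a 2D convolution layer."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each group maps c_in/groups input channels to c_out/groups output channels.
    return (c_in // groups) * (c_out // groups) * kernel * kernel * groups

# Hypothetical layer: 128 -> 128 channels, 3 x 3 kernel.
dense = conv_weights(128, 128, 3)               # ordinary convolution
grouped = conv_weights(128, 128, 3, groups=32)  # grouped convolution
reduction = dense // grouped
```

For this hypothetical layer the ordinary convolution has 147,456 weights and the grouped one 4,608, a 32-fold reduction.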
Figure 7 shows the seasonal variations of the ground-measured LAI under each condition (three rice varieties and two nitrogen management levels) observed in this study. LAI gradually increased after transplanting and ranged from 0.135 to 6.71 during the growth period (Figure 7). Significant differences were observed between fertilizer management levels from the 1st to the 3rd sampling, and between varieties at the 1st sampling.


Regression Models Using Each of VIs and CIs
Regression equations of the LAI estimation models based on each index are shown in Table 4, and the estimation accuracy of each model is compared in Figure 8. Correlations between the ground-measured LAI and the LAI estimated by the regression models based on each of the VIs and CIs are shown in Figures 9 and 10, respectively. The estimation accuracy of the regression models differed depending on the index: the coefficient of determination (R²) ranged from 0.802 to 0.976 and the root mean squared error (RMSE) ranged from 0.332 to 1.13 (Figures 8-10). In the models based on VIs, R² ranged from 0.906 to 0.976 and RMSE from 0.332 to 0.644 (Figures 8 and 9). In the models based on CIs, R² ranged from 0.802 to 0.947 and RMSE from 0.401 to 1.13 (Figures 8 and 10). In general, the models based on VIs acquired from the multispectral camera were more accurate than those based on CIs acquired from the RGB camera (Figure 8). SR (NIR, Red) achieved the highest accuracy of all indices (R² = 0.976 and RMSE = 0.332), followed by NDVI (NIR, Red) (R² = 0.959 and RMSE = 0.475) and SAVI (NIR, Red) (R² = 0.959 and RMSE = 0.478) (Figures 8 and 9a,d,j). VEG showed the highest accuracy of all CIs (R² = 0.947 and RMSE = 0.401), followed by ExG (R² = 0.937 and RMSE = 0.440) and GLA (R² = 0.935 and RMSE = 0.444) (Figures 8 and 10b,g,i).
Table 5 shows the accuracy of the LAI estimation models developed by the four machine-learning algorithms using three patterns of input datasets obtained from the RGB camera: nine types of CIs, RGB images, and nine types of CIs plus RGB images. For ANN, PLSR, and SVR, the highest accuracy was achieved when the input data were CIs (R² = 0.940 and RMSE = 0.401; R² = 0.939 and RMSE = 0.422; and R² = 0.945 and RMSE = 0.399, respectively). RF achieved its highest accuracy when the input data were nine types of CIs and RGB images, which was the highest accuracy of all combinations (R² = 0.957 and RMSE = 0.342).
Table 6 and Figure 11 show the accuracy for the training and validation data of the LAI estimation models constructed by DL using the same three patterns of input datasets. The training data were fitted with R² = 0.900 and RMSE = 0.605 for CIs, R² = 0.979 and RMSE = 0.280 for images, and R² = 0.989 and RMSE = 0.203 for CIs + images (Table 6). For the validation data, R² ranged from 0.946 to 0.964 and RMSE from 0.322 to 0.434. The model using nine types of CIs as input data (R² = 0.946 and RMSE = 0.434) underestimated the ground-measured LAI (Figure 11a); its accuracy was lower than that of the other two models and showed no improvement over the regression model based on VEG, which achieved the highest accuracy among the CIs. Higher accuracy was achieved by the model using RGB images as input data (R² = 0.963 and RMSE = 0.334) (Figure 11b), and only a slight further improvement was observed in the model using nine types of CIs and RGB images as input data (R² = 0.964 and RMSE = 0.322) (Figure 11c). These two models containing RGB images as input data achieved almost the same accuracy as the regression model based on SR (NIR, Red), which achieved the highest accuracy among the VIs (Figure 9d).
Figure 11. Correlations between ground-measured LAI and estimated LAI of validation data with models developed by DL with three patterns of input datasets: (a) nine types of CIs, (b) RGB images, (c) nine types of CIs and RGB images.
Figure 12 shows the relationship between the ground-measured LAI and the values measured by PCA under each variety and fertilization condition. The values measured with PCA explained the ground-measured LAI with an accuracy of R² = 0.934 and RMSE = 0.308, without significant differences among varieties or fertilization levels (Figure 12). However, PCA underestimated the ground-measured LAI by 12% (Figure 12).

Discussion
In this study, we attempted to improve the accuracy of LAI estimation in rice using an RGB camera mounted on a UAV by developing an estimation model using DL with the images as input data, and then compared the estimation accuracy of the resulting model with that of other practical approaches. The model using DL with RGB images could explain, with high accuracy, the large variation in LAI of different rice varieties grown under different fertilizer conditions. The range of LAI in the present study (from 0.135 to 6.71) was sufficiently large to cover the variation of rice leaf area in irrigated fields under various environments [49]. Our results thus demonstrated that the model using DL with RGB images could provide an alternative to the methods using a multispectral camera and PCA for the estimation of rice LAI. Figure 13 summarizes the LAI estimation accuracy of five methods: the regression models based on SR (NIR, Red) and VEG, which showed the highest accuracy among all indices and among the CIs, respectively; the estimation model developed by RF using nine types of CIs and RGB images, which showed the highest accuracy among the machine-learning algorithms; the estimation model developed by DL using RGB images; and PCA (LAI-2200). We discuss these results in detail below.
First, we examined the accuracy of LAI estimation by the regression models based on each index. Since spectral information is affected by various factors, including plant morphology, soil background, and the shooting environment [14,50,51], the optimal index for LAI estimation depends on prior information [52-54]. Under the conditions of this experiment, SR (NIR, Red) was the most favorable of the indices (R² = 0.976 and RMSE = 0.332), and VEG was the most accurate of the CIs (R² = 0.947 and RMSE = 0.401) (Table 4, Figures 8 and 13). Although the estimation accuracy varied depending on the index, the VIs obtained from the multispectral camera generally performed better than the CIs obtained from the RGB camera (Table 4, Figure 8), which agrees with the results of Gupta et al. [19]. The reflectance of near-infrared light is more responsive to an increase in leaf area than the reflectance of visible light, because the former is more easily affected by changes in the vegetation structure [37]. This is considered to be why the VIs that include the reflectance of near-infrared light acquired from the multispectral camera showed relatively high estimation accuracy.
Then, we assessed the estimation accuracy of the four machine-learning algorithms using three patterns of input datasets: nine types of CIs, RGB images, and nine types of CIs plus RGB images. Compared to VEG, which showed the highest estimation accuracy among the CIs acquired from the RGB camera (R² = 0.947 and RMSE = 0.401), the estimation model developed by RF using nine types of CIs and images as input data showed an improvement (R² = 0.957 and RMSE = 0.342) (Figures 10 and 13, Table 5). Several previous studies have indicated that RF is a suitable algorithm for improving the estimation accuracy of LAI [21,22,25], and the results of this study were consistent with these reports.
Next, we tried to improve the accuracy of LAI estimation using an RGB camera on a UAV by means of a DL technique. Compared to RF, a further improvement was observed in the estimation model developed by DL using RGB images (R² = 0.963 and RMSE = 0.334), and its accuracy was comparable to that of SR (NIR, Red) (R² = 0.976 and RMSE = 0.332), which showed the highest estimation accuracy among the VIs acquired from the multispectral camera (Figures 9d, 11b and 13). The results suggest that although the RGB camera is inferior when only CIs are used, it can achieve performance equivalent to that of the multispectral camera simply by constructing an estimation model by DL with the images incorporated as input data. In conventional machine-learning algorithms, the features must be specified in advance. In contrast, DL has the major advantage of being able to identify the characteristics of the images automatically [55]. In this research, since the training data for DL included images with a resolution of 100 × 100 pixels, which contain much more information than the CIs, the characteristics of plant morphology were recognized in greater detail. These factors are considered to be the reason for the high estimation accuracy achieved by DL with images.
Additionally, DL is known to be a promising approach for developing robust models that are applicable to various conditions [56]. Because the cultivars and management practices used in this study were limited, there is still room to improve the robustness of the model under various agricultural conditions. As training data for DL, the number of RGB images was increased 12-fold by flipping and changing brightness. Augmenting the images with more diverse methods could also contribute to further improvement in estimation accuracy. In addition, although we performed DL using images with a resolution of 9 mm/pixel in this study, it is necessary to consider how far the resolution can be reduced while maintaining the explanatory accuracy of the model. Lastly, because applying RGB images to DL gave better estimation accuracy than the conventional methods, a further improvement can be expected from using multispectral images in addition to RGB images. By combining these findings with, for example, fixed-wing UAVs and high-resolution satellite sensors, this model could be applied to a wider area.
Finally, we examined the estimation accuracy of PCA, which has been widely used to obtain ground-truth data of LAI in the field of remote sensing. Although the values measured with PCA could explain the ground-measured LAI with an accuracy of R² = 0.934 and RMSE = 0.555, this was lower than the accuracies obtained with a multispectral camera (the regression model based on SR (NIR, Red): R² = 0.976 and RMSE = 0.332) and an RGB camera (the estimation model developed by DL using RGB images: R² = 0.963 and RMSE = 0.334) (Figures 9d, 11b, 12 and 13). This is because plants other than the sampled eight hills entered the field of view of the PCA sensor, even though a view cap was installed. In addition, PCA underestimated the ground-measured values by 12% (Figure 12). This result is consistent with the previous studies by Maruyama et al. [57] and Fang et al. [58], which reported that PCA underestimates the LAI of rice canopies throughout the growth stages. LAI estimation with PCA is based on the assumption that the leaves are randomly distributed in space. For this reason, two factors have been reported to affect PCA-based measurement of LAI: the first is clumping, in which parts of a plant are concentrated in one place, thereby violating the assumption of random distribution and causing underestimation of LAI; the second is the entry of plant components other than leaves into the field of view of the sensor, which causes overestimation of LAI [59]. Especially in rice canopies, leaf overlap [57] and the presence of stems, which are inherently spatially aggregated [58], have been reported as factors leading to clumping, and these factors are considered to be the main cause of the underestimation in this study. This underestimation could be mitigated by using four-ring data instead of five-ring data of the PCA [58]. In any case, PCA can measure the canopy LAI non-destructively and rapidly, and will certainly remain a useful tool for acquiring ground-truth data. To use the PCA effectively, sufficient attention should be paid to how well the PCA readings correspond to the ground-measured values.

Conclusions
In the present research, we examined whether models developed by DL using RGB images as input data could be an alternative to existing approaches for the estimation of LAI in rice using a UAV equipped with an RGB camera. Our results demonstrated that the model using DL with RGB images could estimate rice LAI as accurately as the regression models based on VIs obtained from the multispectral camera and more accurately than PCA. However, the model was developed under limited cultivation conditions, varieties, and area. DL has the potential to adapt to various circumstances using big data. Therefore, it should be possible to build a more robust model for a wider area by accumulating data under more diverse agricultural and shooting conditions in the future.