1. Introduction
Pasture areas cover 21% of the territory of Brazil (170 million hectares); however, a large part of these pastures is degraded [1], leading to lower livestock productivity. The current average Brazilian productivity (73.5 kg CWE ha⁻¹ yr⁻¹) is lower than the potential productivity of 294 kg CWE ha⁻¹ yr⁻¹ [2]. This production gap represents a great challenge to be overcome by livestock-producing countries. On one hand, the increase in the world population leads to increased demand for protein. On the other hand, policies to combat climate change require more conservation of natural environments, thus leaving less area for animal protein production. In this scenario, increasing the productivity of areas already used for animal protein production is essential to meet the growing demand and to comply with policies for reducing greenhouse gas emissions, without increasing pasture area. To achieve this goal, the development of more productive cultivars by efficient forage breeding methodologies can help reduce the productivity gap [3].
Tillers are small units of forage grass plants responsible for pasture production. After defoliation of the pasture (e.g., grazing by animals), the regrowth of tillers is crucial to maintain pasture stability and productivity [4,5]. The tillers that effectively contribute to productivity are those that regrow up to eight days after mechanical defoliation or grazing by animals [6]. Thus, one way to measure productivity is to estimate regrowth seven days after defoliation [7]. However, in situ measurements of this trait are time-consuming, labor-intensive, and subjective. Thus, the development of low-cost technologies for automated plant phenotyping could help scientists and professionals in forage breeding programs. Machine and deep learning combined with mobile devices, such as smartphones, are powerful and low-cost tools for this purpose. Such tools could reduce labor and time and improve accuracy in the phenotyping process, increasing the efficiency of forage breeding programs and contributing to the release of improved cultivars that help close the productivity gap.
Many machine learning methods, such as Support Vector Machine (SVM) and K-nearest neighbors (KNN), have been employed and show outstanding results, indicating their potential role in the future of High-Throughput Phenotyping (HTP) [8,9]. Deep learning is a subset of machine learning known as a versatile tool capable of automatically extracting features and assimilating complex data using deep neural networks. Convolutional Neural Networks (CNNs) have made remarkable achievements in computer-vision-related tasks [10]. CNN-based approaches have been widely applied to plant phenotyping because of their ability to create robust models that can be embedded in remote sensors [11,12]. The literature often neglects the use of simpler and faster digital image processing approaches. For the problem tackled in this study, however, several papers have already compared digital image processing and deep learning on grass-like plants, especially between 2018 and 2019, and in most cases deep learning performed better [13,14,15,16].
Regarding tiller estimation, Zhifeng et al. [17] showed that Magnetic Resonance Imaging (MRI) could be used to measure rice tillers, as well as the conventional X-ray computed tomography system. Yet, an image processing procedure is still necessary. Fang et al. [18] proposed an automatic wheat tiller counting method under field conditions with terrestrial Light Detection and Ranging (LiDAR) using adaptive layering and hierarchical clustering. Boyle et al. [19] conducted experiments using RGB images of wheat taken on different days and at three different angles and used a computer vision algorithm based on the Frangi filter. Deng et al. [20] trained a Faster R-CNN with three different backbones (ZFNet, VGGNet16, and VGG-CNN-M-1024) and evaluated the detection of productive rice tillers using mobile images. They achieved good accuracy compared to manual counting. Kristsis et al. [21] present a plant identification dataset with 125 classes of vascular plants in Greece, which include leaf, flower, fruit, and stem images of plants in tree, herb, and fern-like form. Their proposal focused on finding deep learning architectures to deploy on mobile devices. This problem has a different goal from our study: we are not concerned with finding a lightweight architecture. Our proposal aims to help HTP find the best genetic material using mobile images, where computational cost is significant but not a critical factor for our application. In addition, they report their results on validation sets rather than on a test set [22]. Another interesting result from grass-like image input can be found in Fujiwara et al. [23]. The authors use a CNN to estimate legume coverage with Unmanned Aerial Vehicle (UAV) imagery. Their study samples image patches and estimates the coverage of timothy, white clover, and background for each patch using a fine-tuned model. They evaluate only GoogLeNet [24].
Although the literature on deep learning for grass-like plants is rich, to the best of our knowledge, no studies have investigated deep-learning-based methods to estimate the regrowth density of tillers in tropical forages using mobile phone images. Mobile phones are more accessible to most researchers than the sensors used in previous works (e.g., MRI and LiDAR). Furthermore, while other studies count the number of tillers [17,18,19,20], we use a score between 10 and 100 to represent the percentage of regrown tillers and to select the top-k best genetic material.
The selection of top-k genotypes requires a scoring function that defines a total order. Therefore, the natural choice is to treat the task as a regression problem. If we instead trained the model as a classifier over the classes 10, 20, 30, and so on up to 100, plots falling in the same class would receive tied scores, and we would lose the fine granularity that is essential to select the top-k plants. Treating the task as classification would thus throw away the total ordering that continuous scores make possible. Furthermore, evaluating the use of mobile phones involves two problems: (1) mobile images and (2) small models. In the first, image quality, lighting, and resolution can vary greatly. In the second, small models often compromise accuracy to obtain a lighter model. We compared small models with bigger models to verify whether the loss in accuracy is acceptable in these applications.
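The effect of ties on top-k selection can be illustrated with a minimal sketch. The plot identifiers, the score values, and the helper names `top_k`, `to_class`, and `n_tied` are hypothetical and are not taken from the study; the example only assumes that a classifier would snap each score to the nearest class in {10, 20, ..., 100}.

```python
def top_k(scores, k):
    """Return the k plot ids with the highest scores (ties broken by id)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [plot_id for plot_id, _ in ranked[:k]]


def to_class(score, step=10):
    """Snap a continuous score to the nearest class in {10, 20, ..., 100}."""
    return min(100, max(10, step * round(score / step)))


def n_tied(scores):
    """Count how many plots share their score with at least one other plot."""
    values = list(scores.values())
    return sum(1 for v in values if values.count(v) > 1)


# Hypothetical regression outputs for five plots (not data from the study).
regression_scores = {"A": 83.6, "B": 78.2, "C": 81.9, "D": 76.4, "E": 52.0}
class_scores = {plot: to_class(s) for plot, s in regression_scores.items()}

print(top_k(regression_scores, 2), n_tied(regression_scores))
print(top_k(class_scores, 2), n_tied(class_scores))
```

In this toy case four of the five plots collapse into the same class, so the second selected plot is decided by the tie-breaking rule rather than by the estimated regrowth.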
Our objective is to explore deep learning regression-based methods on mobile phone images to assess the regrowth of tillers. Furthermore, unlike other studies that directly count the number of tillers, we propose a methodology to assess the percentage of regrown tillers using scores from 10 to 100. We collected 1124 images with two distinct mobile phones and labeled them manually. Six different architectures were evaluated using 10-fold cross-validation with and without transfer learning. We present a quantitative and qualitative analysis for regression. Our work indicates the potential of the proposed methodology for tiller regrowth estimation, which will be useful in increasing the efficiency of breeding programs. It can be used to build powerful tools for scientists and researchers to evaluate and select the best cultivar candidates in forage breeding programs and contribute to increasing animal protein productivity.
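As a rough illustration of the 10-fold protocol, the sketch below partitions 1124 image indices into ten folds so that each image is tested exactly once. The function name `k_fold_indices`, the shuffling seed, and the interleaved fold assignment are assumptions made for illustration; this section does not describe how the folds were actually built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 1124 is the number of images reported in the text.
splits = list(k_fold_indices(1124, k=10))
fold_sizes = [len(test_idx) for _, test_idx in splits]
print(fold_sizes)
```

Each architecture would then be trained on `train_idx` and scored on `test_idx` for every fold, and the fold-level metrics averaged.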
The rest of this paper is organized as follows. Section 2 presents the materials and methods adopted in this study. Section 3 presents the results obtained in the experimental analysis. Section 4 discusses our achievements. Finally, Section 5 summarizes the main conclusions and points to future work.
4. Discussion
This study estimates the regrowth density of tropical forages using mobile phone images. To achieve this goal, we evaluated a series of standard and state-of-the-art deep learning methods, from a simpler model such as AlexNet, with only five convolutional layers, to a more complex model such as ResNeXt101, with 101 layers. These models were adapted to treat the task as a regression problem.
For the first time, we report that deep learning methods can deliver correlations from 0.81 to 0.89 in estimating regrowth density from mobile phone images. We believe that this result is acceptable and has the potential to speed up data collection of regrowth density and consequently increase the efficiency of forage breeding programs. The closest approach found in the literature was the study conducted by Deng et al. [20] for rice tillers. The authors used a completely different approach, which required harvesting the rice and evaluating the cross-sections of rice tillers. Using object detection, they estimated the number of productive tillers. Our approach requires just a plot image obtained from a mobile phone, without harvesting or other labor-intensive intervention.
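For readers who want to reproduce this kind of metric, the sketch below computes a correlation between manual scores and model predictions. It assumes Pearson's r, which this section does not state explicitly, and the score values and the function name `pearson_r` are hypothetical.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between manual scores and model predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical manual scores and predictions (not data from the study).
manual = [10, 40, 60, 80, 100]
biased = [s - 8 for s in manual]  # every prediction is 8 points too low

print(pearson_r(manual, biased))
```

A constant offset leaves the correlation unchanged, so a high correlation shows that the ranking of plots is preserved but does not by itself rule out systematic under- or over-estimation.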
Deeper neural networks perform better than shallower versions of the same architecture in most problems [31]. In HTP, however, we found some controversy, since the deeper model did not always produce the best result. The study conducted by Oliveira et al. [40] using aerial images taken by an Unmanned Aerial Vehicle (UAV) showed results in which the best-performing model among AlexNet, ResNeXt50, MaCNN, LF-CNN, and DarkNet53 was a simple AlexNet. Intrigued by these results, we evaluated a broader range of deep learning architectures with a more diverse number of layers. Interestingly, a 50-layer network (ResNet50) achieved our best-performing result. In a traditional computer vision task, we would have expected the 101-layer network to give the best result, which did not occur.
The analysis using RROC indicated that all models were below the descending diagonal, suggesting that deep learning models tend to underestimate the regrowth density in the problem setting of this paper. Castro et al. [41] also plotted RROC in a biomass prediction problem using deep learning and aerial images, and in their results this tendency did not exist. We believe that this tendency is due to the data distribution being skewed toward higher values (Figure 9).
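The reading of the RROC plot can be made concrete with a small sketch. It assumes the usual RROC-space convention, in which a model is plotted at the point (OVER, UNDER), where OVER is the sum of positive errors and UNDER is the sum of negative errors, and the descending diagonal is UNDER = -OVER. The score values and the function names `rroc_point` and `tends_to_underestimate` are hypothetical, and the exact procedure used in the study is not shown in this section.

```python
def rroc_point(y_true, y_pred, shift=0.0):
    """Return (OVER, UNDER) for predictions shifted by a constant."""
    errors = [p + shift - t for t, p in zip(y_true, y_pred)]
    over = sum(e for e in errors if e > 0)   # total over-estimation (>= 0)
    under = sum(e for e in errors if e < 0)  # total under-estimation (<= 0)
    return over, under


def tends_to_underestimate(y_true, y_pred):
    """True when the point lies below the diagonal UNDER = -OVER."""
    over, under = rroc_point(y_true, y_pred)
    return over + under < 0


# Hypothetical manual scores and predictions (not data from the study).
manual = [90, 80, 100, 60]
predicted = [84, 79, 91, 63]

print(rroc_point(manual, predicted))
print(tends_to_underestimate(manual, predicted))
```

Sweeping `shift` over a range of values traces the full RROC curve; a point below the diagonal at `shift=0.0` means the summed error is negative, that is, the model underestimates on average.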
The heatmap results shed light on where the network is “looking” to predict the regrowth density. To the best of our knowledge, this is the first study to address the interpretability of deep learning models on regrowth. The results indicate that, for images with lower regrowth, the models focus mainly on the circular region, whereas for images with higher regrowth the center of the plot is the most characteristic area.
Compared to similar works, ours differs in not using any complex sensor technology, such as MRI or LiDAR, which is expensive and excessive compared to a mobile phone. In addition, there is no need for a scheme to take pictures on different days and at different rotations, nor for handcrafted features. Furthermore, the main distinction from other works is the estimated trait: we calculated a score representing the regrowth percentage of the tillers instead of counting the number of tillers.
Machine learning must be used with care. Although the proposed approach can give valuable estimates of tiller regrowth, it is not advisable to completely replace the manual field labeling of regrowth density. It is always good to collect smaller validation sets to check whether the learned models still give good estimates. Therefore, the proposed approach is not intended to completely replace the manual labeling of fields but rather to allow HTP research to multiply the number of plots while reducing the need for manual labeling.
5. Conclusions
To the best of our knowledge, this is the first study to evaluate CNN-based architectures for estimating regrowth density using RGB images collected by mobile phones. From our perspective, this study also presents the following contributions according to our results: (1) deep learning can deliver correlations from 0.81 to 0.89 in estimating the regrowth density using mobile phone images; (2) the best-performing architecture is not always the deepest model for this problem; (3) the deep learning models tend to underestimate the predictions in our problem setting; and (4) the heatmap indicates the patterns that deep learning models use to predict regrowth density.
Previous works focus on estimating the tiller number. We used a score that represents the percentage of regrown tillers, and we collected a dataset with images of forages taken on different days, locations, phones, and genotypes, promoting more generalized models.
Our results indicate that our methods might succeed in predicting new data. To develop new cultivars, researchers need to evaluate and select for multiple traits in the breeding program. The phenotyping step therefore consumes considerable time and money, sometimes with low accuracy. Training new algorithms to estimate traits such as disease and insect damage, mineral deficiencies, seed number, and others is the next step of this work on deep learning combined with low-cost mobile devices.
In future work, we will evaluate lightweight deep learning architectures in order to deploy the model on the mobile phone itself. In this way, annotators can speed up their labeling process, and their task becomes more a matter of validating predictions and collecting images than of labeling plots. We also plan to evaluate the problem using learning-to-rank algorithms and to evaluate the use of UAV-based images.