Sensors
  • Communication
  • Open Access

26 November 2021

Super Resolution Generative Adversarial Network (SRGANs) for Wheat Stripe Rust Classification

1 School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
2 Department of Engineering and Technology, School of Computing and Engineering, University of Huddersfield, Queensgate, Huddersfield HD1 3DH, UK
* Author to whom correspondence should be addressed.
This article belongs to the Collection Machine Learning in Agriculture

Abstract

Wheat yellow rust is a common agricultural disease that affects the crop every year across the world. The disease not only degrades the quality of the yield but also its quantity, with adverse impacts on the economy and the food supply. Fast and accurate methods for detecting yellow rust in the wheat crop are therefore highly desirable; however, high-resolution images are not always available, which hinders the performance of trained models on detection tasks. The approach presented in this study harnesses the power of super-resolution generative adversarial networks (SRGAN) to upsample images before they are used to train deep learning models for the detection of wheat yellow rust. After the data are preprocessed for noise removal, SRGANs are used to upsample the images to a higher resolution, which helps the convolutional neural network (CNN) learn high-quality features during training. This study empirically shows that SRGANs can be used effectively to improve image quality and produce significantly better results than models trained on low-resolution images. This is evident from the results obtained on upsampled images, i.e., 83% overall test accuracy, which is substantially better than the 75% overall test accuracy achieved on low-resolution images. The proposed approach can be used in other real-world scenarios where images are of low resolution due to the unavailability of high-resolution cameras in edge devices.

1. Introduction

Wheat is one of the most important crops in the world and the main source of nutrients for around 40% of the global population. It also provides 20% of the daily protein and food calories consumed by human beings. Economically, the importance of wheat is evident from the fact that its global trade is greater than that of all other crops combined [1]. Owing to its critically important place in human life, there is a need for proactive approaches to combating the pests and diseases that affect the wheat crop on an annual basis. One of these diseases is yellow rust, which can have a devastating impact on wheat production, causing up to a 40% yield loss in a total harvest depending on the severity of the disease.
Attempts to counter the impact of yellow rust on wheat include spraying whole fields with fungicides, which can be counterproductive and have adverse effects on healthy portions of the crop. Early and accurate detection of the disease is therefore highly desirable, and various approaches have been adopted for detecting wheat yellow rust in recent years, including machine learning and deep learning methods [2,3,4,5]. However, all of these approaches have very specific requirements: carefully curated data in the case of machine learning, and clear, high-resolution imagery in the case of deep learning.
Requirements such as clean and high-quality data can be met in experimental and lab settings, but they can be very hard and expensive to meet in real-world conditions. This is especially true for image data, where not all commonly available devices are able to capture high-resolution images. Generative adversarial networks (GAN) [6] have been highly successful in relaxing various limitations imposed on data, including its volume, quality and variety, and their usage has increased across different fields. Some variants of GANs are able to produce realistic images after training on similar sample data [6], while another variant, the super-resolution GAN (SRGAN), can produce high-resolution images from low-resolution ones [7].
Convolutional neural networks (CNN) have proven to be extremely potent on image-based learning tasks. However, the performance of a CNN is highly dependent on the quality of the images used for training: high-resolution images yield better performance than low-resolution ones. In real-world situations where high-resolution image-capturing devices are not readily available, this becomes a challenge. The challenge also arises when high-resolution training data are available but, upon deployment, the image-capturing devices cannot capture data at a similar resolution, resulting in degraded system performance. These issues are even more prominent in agricultural applications of CNNs, where data acquisition is affected by various environmental factors.
In light of these issues, this paper presents a system that tackles the lack of high-resolution images by harnessing the power of super-resolution GANs for wheat yellow rust detection. All the images used in this paper contain a single leaf roughly in the center of the image, which keeps the focus of the paper on studying the feasibility of SRGANs for wheat yellow rust detection. The main contributions of this study are:
  • End-to-end pipeline for wheat yellow rust detection in low-resolution images
  • Empirical analysis of effectiveness of SRGANs for wheat yellow rust detection
The rest of the paper is organized as follows: related work is discussed in Section 2; the methodology is presented in Section 3; Section 4 presents the results and discussion; and Section 5 concludes the paper and discusses future work.

3. Methodology

This section provides a detailed account of all the image-processing steps, along with the deep learning model trained for wheat yellow rust detection. The complete pipeline of the methodology used in this study is displayed in Figure 1.
Figure 1. Complete pipeline of the methodology used in this study.

3.1. Data Acquisition

The data for this study were acquired using smartphone and digital cameras from experimental fields maintained by the National Agriculture Research Council (NARC). The fields were sown with different varieties of wheat, which allowed the collection of a wide variety of data. The images were collected during March–April 2021, since that is when yellow rust appears in wheat crops in Pakistan. A total of 1922 images belonging to three target classes, namely healthy, resistant, and susceptible, were collected against a uniform background and then annotated under the supervision of domain experts from NARC. The images collected under these conditions have a high resolution, averaging around 3000 × 2500 pixels. A sample image from the raw dataset is shown in Figure 2.
Figure 2. Image sample from the raw dataset.

3.2. Image Processing

Image processing is an essential part of the wheat yellow rust detection system. A number of different image processing steps are performed to prepare data suitable for training a deep neural network for the learning task.

3.2.1. Image Segmentation

The raw dataset contains a high amount of noise due to environmental factors such as sunlight, shadows and varying weather conditions. This noise proved detrimental to the learning task, i.e., wheat yellow rust detection, so all the additional background had to be removed because the data in their raw form are not suitable for use. The first task in the image-processing pipeline was therefore to segment the region of interest, i.e., the leaf portion of the images. This was performed using Otsu’s method [27], which applies maximum likelihood thresholding for segmentation. The resultant image after applying Otsu’s segmentation is shown in Figure 3.
Figure 3. Output of Otsu’s segmentation.
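A minimal sketch of this segmentation step is given below, using OpenCV’s built-in implementation of Otsu’s method. The file name is a placeholder, and the inverted thresholding (which assumes the leaf is darker than the uniform background) is an illustrative assumption.

    import cv2

    # Load the raw image (placeholder file name) and convert it to grayscale.
    img = cv2.imread("leaf.jpg")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Otsu's method selects the threshold automatically; the 0 passed as the
    # threshold value is ignored. THRESH_BINARY_INV assumes the leaf is darker
    # than the background; use THRESH_BINARY if the opposite holds.
    _, mask = cv2.threshold(gray, 0, 255,
                            cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

    # Zero out everything except the leaf pixels.
    segmented = cv2.bitwise_and(img, img, mask=mask)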

3.2.2. Leaf Cropping

After segmenting the images using Otsu’s method, the leaf portion of each image was cropped and the rest of the image was discarded, as it contained no information critical to the learning task. This was performed by exploiting the fact that pixels belonging to the leaf have positive values, while the remaining pixels are zero after segmentation and can be discarded. After cropping, the images were in the most suitable form for further processing, as shown in Figure 4.
Figure 4. Leaf cropping in the images.
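The cropping logic can be sketched as follows, assuming the segmented image is a NumPy array in which all background pixels are exactly zero; the variable name continues from the segmentation sketch above.

    import numpy as np

    # True wherever any color channel is nonzero, i.e., wherever the leaf is.
    leaf = segmented.any(axis=2)

    rows = np.where(leaf.any(axis=1))[0]  # row indices that contain leaf pixels
    cols = np.where(leaf.any(axis=0))[0]  # column indices that contain leaf pixels

    # Crop to the tight bounding box around the leaf.
    cropped = segmented[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]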

3.2.3. Image Downsampling

After cropping, the images contained the least possible amount of noise. The next step in the image-processing pipeline was lowering the resolution of the raw images. This was necessary because the original resolution of the images was already high, and increasing it further would have been computationally very expensive. The images were therefore downsampled to study the effectiveness of SRGANs on the classification of wheat yellow rust. The resolution of all the images was reduced manually to avoid adding unnecessary complexity to the methodology and to keep the focus of the study on the feasibility of SRGANs for wheat rust detection. The final resolution of each image depends on the orientation of the leaf in the image: as a general rule, the larger axis of the leaf was reduced to 600 pixels, and the other axis was reduced in proportion to the original resolution of the image. A sample output image after the reduction in resolution is shown in Figure 5.
Figure 5. Low-resolution image.
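The downsampling rule described above (longer axis to 600 pixels, shorter axis scaled proportionally) can be sketched with Pillow as follows; the file names and the bicubic interpolation are assumptions, since the text only states that the resolution was reduced manually.

    from PIL import Image

    img = Image.open("cropped_leaf.jpg")  # placeholder file name

    # Scale so the longer axis becomes 600 pixels; img.size is (width, height).
    scale = 600 / max(img.size)
    new_size = (round(img.size[0] * scale), round(img.size[1] * scale))

    low_res = img.resize(new_size, Image.BICUBIC)  # interpolation is assumed
    low_res.save("low_res_leaf.jpg")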

3.2.4. Image Upsampling

The resolution of these images was then improved using a super-resolution GAN (SRGAN). The images were upsampled by a factor of 4× using the photorealistic SRGAN presented in [7]; this factor was chosen so as not to exceed the capacity of the available computational resources, and a higher upsampling factor may produce even better results. The result of applying the SRGAN to a low-resolution image belonging to the resistant class is shown in Figure 6. The difference between the low-resolution and high-resolution images is evident to the naked eye, as seen in Figures 5 and 6: the fine details of the image texture are retained, producing a photorealistic image from the low-resolution input. The generative adversarial approach, following the idea proposed in [6], uses a deep generator network with identical residual blocks. The discriminator consists of eight convolutional layers and is built in a similar fashion to the VGG model. The generator and discriminator architectures are shown in Figure 7.
Figure 6. High-resolution image output of SRGAN.
Figure 7. Generator and discriminator architecture of SRGAN inspired by [7].
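For illustration, the two core building blocks of the SRGAN generator from [7], the identity-skip residual block and the PixelShuffle upsampling stage (two 2× stages giving the 4× factor), can be sketched in PyTorch as below. The channel counts are the defaults from [7], not necessarily the exact configuration used here.

    import torch.nn as nn

    class ResidualBlock(nn.Module):
        # One of the identical residual blocks in the SRGAN generator.
        def __init__(self, ch=64):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.PReLU(),
                nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

        def forward(self, x):
            return x + self.body(x)  # identity skip connection

    def upsample_stage(ch=64):
        # Conv -> PixelShuffle doubles the spatial resolution; stacking two
        # of these stages yields the 4x upsampling factor used in this study.
        return nn.Sequential(nn.Conv2d(ch, ch * 4, 3, padding=1),
                             nn.PixelShuffle(2), nn.PReLU())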

3.3. Model Architecture

For the classification of wheat yellow rust into the three target classes, i.e., healthy, resistant and susceptible, a three-layered CNN model was used. A shallow network was selected due to the limited availability of data. The network consists of two convolutional layers with 20 kernels of size 3 × 3 each, followed by a max pooling layer and a classification layer that identifies the class each image belongs to. The same architecture was used to train on both versions of the data, i.e., low resolution and high resolution. The model architecture is shown in Figure 8.
Figure 8. CNN model architecture used in the study.
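A sketch of this shallow classifier in Keras is given below. The input size and the activation functions are assumptions, since the text does not state them; only the layer counts, kernel configuration and number of classes come from the description above.

    from tensorflow.keras import layers, models

    model = models.Sequential([
        # Two convolutional layers with 20 kernels of size 3 x 3 each.
        layers.Conv2D(20, (3, 3), activation="relu", input_shape=(224, 224, 3)),
        layers.Conv2D(20, (3, 3), activation="relu"),
        # One max pooling layer.
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        # Classification layer: healthy / resistant / susceptible.
        layers.Dense(3, activation="softmax"),
    ])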

4. Results and Discussion

The models were trained using training and validation sets, with the validation set created by setting aside 15% of the training data. The model described in Section 3.3 was trained to study the effectiveness of images upsampled using SRGANs for wheat yellow rust classification. All models were optimized with Adam at a learning rate of 0.001 to ensure that no other factors interfered with the comparison of results. The results obtained after training models on both versions of the data are discussed in this section.
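Continuing the sketch above, the training configuration stated in the text (Adam, learning rate 0.001, 15% validation split) might look as follows; the loss function and batch size are assumptions, and x_train/y_train are placeholders for the training data.

    from tensorflow.keras.optimizers import Adam

    model.compile(optimizer=Adam(learning_rate=0.001),
                  loss="sparse_categorical_crossentropy",  # assumed loss
                  metrics=["accuracy"])

    # The number of epochs was varied between 7 and 15 across simulations.
    model.fit(x_train, y_train, validation_split=0.15,
              epochs=8, batch_size=32)  # batch size is an assumption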

4.1. Low-Resolution Images

The low-resolution images obtained after downsampling were used to train the shallow three-layered CNN model. Several simulations were run for different numbers of epochs to obtain the best possible performance on the holdout test set; this number was kept between 7 and 15 due to the small amount of data, since training for more epochs produced worse results that contributed no useful information to the discussion and is therefore omitted. The results for this set of simulations are listed in Table 1, and the progress of accuracy and loss on the training and validation sets for low-resolution images is shown in Figure 9.
Table 1. Results for models trained on low-resolution images.
Figure 9. Accuracy and loss graphs for training and validation on low-resolution images.
The overall performance on test data is similar across all the simulations, but a deeper look into other performance metrics shows that most models are unable to classify the resistant class with a high degree of precision. This can be seen in Table 1, which breaks down the models’ performance on the resistant class and makes it evident that poor performance on this class causes the models to have worse overall performance on test data. Compared with the other models, the best-performing model (confusion matrix in Figure 10a) produces balanced results in terms of both precision and recall for the resistant class.
Figure 10. Confusion matrices of the best results for test data. (a) Confusion matrix for low-resolution data. (b) Confusion matrix for high-resolution data.
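The per-class precision and recall figures and the confusion matrices in Figure 10 can be reproduced from model predictions along the following lines; x_test and y_test are placeholders for the holdout test set.

    from sklearn.metrics import classification_report, confusion_matrix

    # Predicted class = index of the highest softmax output.
    y_pred = model.predict(x_test).argmax(axis=1)

    print(classification_report(y_test, y_pred,
          target_names=["healthy", "resistant", "susceptible"]))
    print(confusion_matrix(y_test, y_pred))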

4.2. High-Resolution Images

Similar to the approach taken with low-resolution images, models on high-resolution images were also trained for different numbers of epochs, over the same range, to obtain the best possible results. The results of these training runs are shown in Table 2. The confusion matrix of the best-performing model, trained for eight epochs, is shown in Figure 10b, and the training and validation accuracy graphs for this eight-epoch run, the best-performing model on test images, are shown in Figure 11.
Table 2. Results for models trained on high-resolution images.
Figure 11. Accuracy and loss graphs for training and validation on high-resolution images.
The results documented in Table 2 show that the performance of models trained on high-resolution images varies, with the best overall accuracy on the test set reaching 83%. The reason for the poorer performance of the other training instances is their inability to detect the resistant class correctly in most cases and, in some cases, poor performance in detecting the healthy class as well. This trend is visible in Table 2, which shows precision and recall for the resistant class for all the models. These results help us draw strong conclusions about the impact of SRGANs on wheat yellow rust detection, which are discussed in the subsequent section.

4.3. Impact of SRGANs

The results discussed in Section 4.1 and Section 4.2 show that the models for high-resolution images outperformed the low-resolution models in terms of best performance on test data: the best overall accuracy for the low-resolution models was 75%, while the high-resolution models reached a maximum overall accuracy of 83%. Models trained on high-resolution data also provide more balanced predictions in terms of precision and recall for all three classes when compared with the models trained on low-resolution images. This is a considerable difference in performance between the two versions of the data, and it shows that SRGANs can provide a significant boost to the performance of deep learning models, especially in the case of low-resolution data.

4.4. Comparison of SR Approaches

The approach presented in this paper is also compared with other state-of-the-art super-resolution approaches. Two deep residual network-based approaches, enhanced deep residual networks for single-image super-resolution (EDSR) [13] and wide activation for efficient and accurate image super-resolution (WDSR) [14], were used for this analysis. The low-resolution images of the dataset were upsampled using both approaches, and separate models were trained on the resulting super-resolution images. The architecture used for these models is the same as the one discussed in Section 3.3, and all the hyperparameters were kept the same for a fair comparison between the approaches. All the approaches produced very close results, both overall and in the precision and recall splits for the three classes, with SRGAN performing marginally better than the other two. The best results on test data obtained using these approaches are listed in Table 3, along with the best results of the approach used in this paper’s methodology. This shows that wheat yellow rust classification does not depend strongly on the particular approach used for upsampling images, since the choice does not produce a significant difference.
Table 3. Best results obtained using different super resolution approaches.

5. Conclusions and Future Work

This study indicates that super-resolution generative adversarial networks have the potential to aid in the detection of wheat yellow rust disease. It empirically shows that SRGANs provide a highly capable and effective approach in real-world scenarios where high-resolution images are not always available. Digital image processing to remove as much noise as possible is highly advantageous, as it ensures the purity of the information in the data.
A comparison of the results shows that improving the resolution of the images directly improves the performance of the deep learning models trained for wheat yellow rust detection. The results for low-resolution images max out at 75%, while the results for the high-resolution images, upsampled using SRGANs, improve to 83%. These results could be improved even further if the computing resources had the capacity to handle and upsample images of higher resolution than the ones used in this study.

Author Contributions

M.H.M.: Conceptualization, Methodology, Investigation, Writing—Review & Editing; R.M.: Conceptualization, Methodology, Validation, Investigation, Writing—Original Draft, Writing—Review & Editing, Visualization, Supervision; I.U.H.: Conceptualization, Validation, Data Curation, Investigation, Writing—Review & Editing; U.S.: Software, Validation, Data Curation, Writing—Original Draft, Writing—Review & Editing, Visualization; S.M.H.Z.: Methodology, Software, Validation, Investigation, Writing—Original Draft; M.H.: Investigation, Methodology, Writing—Review & Editing, Validation, Visualization. All authors have read and agreed to the published version of the manuscript.

Funding

This research work is a part of the funded project No. RF-NCAI-023, supported by National Center for Artificial Intelligence (NCAI), National University of Sciences and Technology (NUST), Islamabad, Pakistan.

Data Availability Statement

The data may be requested by reaching out to the authors through email.

Acknowledgments

The research and development of this work is conducted in IoT Lab, NUST-SEECS, Islamabad, Pakistan in collaboration with NARC, Islamabad, Pakistan.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Giraldo, P.; Benavente, E.; Manzano-Agugliaro, F.; Gimenez, E. Worldwide research trends on wheat and barley: A bibliometric comparative analysis. Agronomy 2019, 9, 352.
  2. Johannes, A.; Picon, A.; Alvarez-Gila, A.; Echazarra, J.; Rodriguez-Vaamonde, S.; Navajas, A.D.; Ortiz-Barredo, A. Automatic plant disease diagnosis using mobile capture devices, applied on a wheat use case. Comput. Electron. Agric. 2017, 138, 200–209.
  3. Azadbakht, M.; Ashourloo, D.; Aghighi, H.; Radiom, S.; Alimohammadi, A. Wheat leaf rust detection at canopy scale under different LAI levels using machine learning techniques. Comput. Electron. Agric. 2019, 156, 119–128.
  4. Lu, J.; Hu, J.; Zhao, G.; Mei, F.; Zhang, C. An in-field automatic wheat disease diagnosis system. Comput. Electron. Agric. 2017, 142, 369–379.
  5. Picon, A.; Alvarez-Gila, A.; Seitz, M.; Ortiz-Barredo, A.; Echazarra, J.; Johannes, A. Deep convolutional neural networks for mobile capture device-based crop disease classification in the wild. Comput. Electron. Agric. 2019, 161, 280–290.
  6. Goodfellow, I. NIPS 2016 tutorial: Generative adversarial networks. arXiv 2016, arXiv:1701.00160.
  7. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
  8. Brock, A.; Donahue, J.; Simonyan, K. Large scale GAN training for high fidelity natural image synthesis. arXiv 2018, arXiv:1809.11096.
  9. Zhang, H.; Xu, T.; Li, H.; Zhang, S.; Wang, X.; Huang, X.; Metaxas, D.N. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 5907–5915.
  10. Xu, L.; Skoularidou, M.; Cuesta-Infante, A.; Veeramachaneni, K. Modeling tabular data using conditional GAN. arXiv 2019, arXiv:1907.00503.
  11. Park, N.; Mohammadi, M.; Gorde, K.; Jajodia, S.; Park, H.; Kim, Y. Data synthesis based on generative adversarial networks. arXiv 2018, arXiv:1806.03384.
  12. Bourou, S.; El Saer, A.; Velivassaki, T.H.; Voulkidis, A.; Zahariadis, T. A review of tabular data synthesis using GANs on an IDS dataset. Information 2021, 12, 375.
  13. Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
  14. Yu, J.; Fan, Y.; Yang, J.; Xu, N.; Wang, Z.; Wang, X.; Huang, T. Wide activation for efficient and accurate image super-resolution. arXiv 2018, arXiv:1808.08718.
  15. Zhang, Z.; Shi, Y.; Zhou, X.; Kan, H.; Wen, J. Shuffle block SRGAN for face image super-resolution reconstruction. Meas. Control 2020, 53, 1429–1439.
  16. Ha, H.; Hwang, B.Y. Enhancement method of CCTV video quality based on SRGAN. J. Korea Multimed. Soc. 2018, 21, 1027–1034.
  17. Cherian, A.K.; Poovammal, E.; Rathi, Y. Improving image resolution on surveillance images using SRGAN. In Inventive Systems and Control; Springer: Singapore, 2021; pp. 61–76.
  18. Kim, J.; Lee, J.; Song, K.; Kim, Y.S. Vehicle model recognition using SRGAN for low-resolution vehicle images. In Proceedings of the 2nd International Conference on Artificial Intelligence and Pattern Recognition, Beijing, China, 16–18 August 2019; pp. 42–45.
  19. Jiang, X.; Xu, Y.; Wei, P.; Zhou, Z. CT image super-resolution based on improved SRGAN. In Proceedings of the 2020 5th International Conference on Computer and Communication Systems (ICCCS), Shanghai, China, 22–24 February 2020; pp. 363–367.
  20. Vinothini, D.S.; Bama, B.S. Attention-based SRGAN for super resolution of satellite images. In Machine Learning, Deep Learning and Computational Intelligence for Wireless Communication; Springer: Singapore, 2021; pp. 407–423.
  21. Demiray, B.Z.; Sit, M.; Demir, I. D-SRGAN: DEM super-resolution with generative adversarial networks. SN Comput. Sci. 2021, 2, 1–11.
  22. Liu, X.; Liu, Q.; Wang, Y. Remote sensing image fusion based on two-stream fusion network. Inf. Fusion 2020, 55, 1–15.
  23. Barrero, O.; Perdomo, S.A. RGB and multispectral UAV image fusion for Gramineae weed detection in rice fields. Precis. Agric. 2018, 19, 809–822.
  24. Song, Z.; Zhang, Z.; Yang, S.; Ding, D.; Ning, J. Identifying sunflower lodging based on image fusion and deep semantic segmentation with UAV remote sensing imaging. Comput. Electron. Agric. 2020, 179, 105812.
  25. Liu, Y.; Chen, X.; Ward, R.K.; Wang, Z.J. Medical image fusion via convolutional sparsity based morphological component analysis. IEEE Signal Process. Lett. 2019, 26, 485–489.
  26. Ma, J.; Yu, W.; Chen, C.; Liang, P.; Guo, X.; Jiang, J. Pan-GAN: An unsupervised pan-sharpening method for remote sensing image fusion. Inf. Fusion 2020, 62, 110–120.
  27. Kurita, T.; Otsu, N.; Abdelmalek, N. Maximum likelihood thresholding based on population mixture models. Pattern Recognit. 1992, 25, 1231–1240.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
