Electronics
  • Feature Paper
  • Review
  • Open Access

9 February 2024

A Review of Generative Adversarial Networks for Computer Vision Tasks

Faculty of Automatic Control and Computer Science, National University of Science and Technology Politehnica Bucharest, 060042 Bucharest, Romania
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Selected Papers from Young Researchers in AI for Computer Vision

Abstract

In recent years, computer vision tasks have gained a lot of popularity, accompanied by the development of numerous powerful architectures that consistently deliver outstanding results when applied to well-annotated datasets. However, acquiring a high-quality dataset remains a challenge, particularly in sensitive domains such as medical imaging, where cost and ethical concerns pose additional obstacles. Generative adversarial networks (GANs) offer a possible solution by artificially expanding datasets, providing a valuable resource for applications requiring large and diverse data. This work presents a thorough review and comparative analysis of the most promising GAN architectures. The review is intended to serve as a reference for selecting the most suitable architecture for a given project, diminishing the challenges posed by limited and constrained datasets. Furthermore, we carried out practical experiments focused on augmenting a medical dataset derived from a colonoscopy video. We also applied one of the GAN architectures outlined in our work to a dataset consisting of histopathology images. The goal was to illustrate how GANs can enhance and augment datasets, showcasing their potential to improve overall data quality. Through this research, we aim to contribute to the broader understanding and application of GANs in scenarios where dataset scarcity poses a significant obstacle, particularly in medical imaging applications.

1. Introduction

It is widely acknowledged that generative adversarial networks (GANs) [1] were a major breakthrough in the field of artificial intelligence. The idea behind the GAN was first introduced in 2014 by Ian J. Goodfellow and his team, and years later it remains one of the most relevant and promising methods for tackling generative problems in computer vision and many other fields. GANs are well suited to generating many kinds of data, not only images: they have also been used to generate text, tabular data, music and audio, 3D models, and more. The GAN was also the first deep learning architecture able to produce such high-quality results on most datasets it was trained on, regardless of the domain. In this work, we provide a comprehensive overview of the most widely recognized and commonly used advancements and approaches that have emerged since the inception of the GAN framework.
A GAN is composed of two parts: the generator and the discriminator. The generator produces new instances of data, while the discriminator evaluates the authenticity of the generated data. At the beginning of training, the generator does not produce good images, but the discriminator provides feedback that allows it to improve. In order to evaluate the authenticity of the generated data, the discriminator is trained on real data. The discriminator then receives the image from the generator and assigns a probability that the generated image is real. In essence, the discriminator is a standard CNN used for classification: it checks whether the generated data falls into the real or the fake category. The generator works in the opposite way. The discriminator downsamples the image to obtain that probability, whereas the generator takes its input and upsamples it until it becomes a sufficiently convincing piece of data. Both networks try to optimize their specific loss functions. During the training process, the two networks change and influence one another, hence the name “adversarial”: the networks compete to fool each other. The final goal is to obtain a generator good enough that its generated distribution approximates the distribution of the real images. The original paper shows results for the MNIST dataset [2] and CIFAR-10 [3], both of which consist of simple images. MNIST is a dataset that contains 60,000 small images of handwritten digits. Each image is 28 × 28 pixels, with a white digit on a black background. Since the images are so small and contain only two colors, the authors of [1] used noise as input for the generator network. Using random noise as input for the generator of a GAN is a common way to generate data; however, depending on the complexity and specificity of more advanced tasks, a more sophisticated approach may be necessary [1].
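To make this adversarial setup concrete, the following is a minimal training-loop sketch in PyTorch. The layer sizes, learning rates, and the use of simple fully connected networks are illustrative choices only, not the configuration used in the original paper.

```python
import torch
import torch.nn as nn

# Illustrative dimensions (not from the paper): 100-d noise vector, 28*28 MNIST-like images.
LATENT_DIM, IMG_DIM = 100, 28 * 28

generator = nn.Sequential(                 # upsamples noise into an image-sized vector
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),
)
discriminator = nn.Sequential(             # downsamples an image into a real/fake probability
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images):
    """real_images: (batch, IMG_DIM) tensor with values in [-1, 1]."""
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1) Train the discriminator: real images should score 1, generated images 0.
    fake_images = generator(torch.randn(batch, LATENT_DIM)).detach()
    loss_d = bce(discriminator(real_images), real_labels) + \
             bce(discriminator(fake_images), fake_labels)
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Train the generator: it tries to make the discriminator output 1 for its fakes.
    loss_g = bce(discriminator(generator(torch.randn(batch, LATENT_DIM))), real_labels)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()
```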
For some scenarios, using random noise can be sufficient: the generator takes this noise and produces the corresponding synthetic data. However, there are situations where it is necessary to provide more meaningful or structured inputs to the generator. For example, when generating realistic images, purely random noise can lead to inconsistent or blurry results, and it may be beneficial to provide the generator with a structured latent vector or additional conditioning information. We will explore the various inputs used for the generator in the specialized literature.
The pioneers of generative models were hidden Markov models (HMMs) and Gaussian mixture models (GMMs), developed in the 1950s to produce sequential data. In the field of natural language processing (NLP), recurrent neural networks (RNNs) together with long short-term memory (LSTM) networks were a breakthrough, able to model longer dependencies and therefore to generate longer sentences. In 2013, the variational autoencoder (VAE) was introduced, but its disadvantage compared to the GAN is that it produces blurred and unclear images. The focus of this paper, the GAN, was introduced in 2014 and was a breakthrough because it could generate high-quality images. In 2015, diffusion models were introduced [4]; their basic principle is to add noise to the existing training data and then reverse the process to restore the data. In 2017, transformers were proposed, first with applications in NLP and later in computer vision as well. In 2021, Stable Diffusion was introduced, an important model for text-to-image translation [5]. In this work, we keep our focus on GANs and their evolution, with the purpose of providing a strong technical and practical guide for future research in this field.
The focus of this paper is on how GANs can be used to solve problems related to medical data; however, it is important to mention that GANs have proven to be versatile and powerful tools in various domains, such as image/video editing [6], generating original data for the entertainment industry, image-to-image translation, text-to-image translation, image/video quality enhancement, face aging [7], and human pose generation [8] for security applications. In the software industry, for editing photos and videos, GANs can be used to improve the resolution of older images or of those taken from a very large distance, such as images from space. Style transfer methods can be employed to create new scenarios. There are GANs specifically developed for video editing effects, such as changing the background or adding/removing objects from a frame. In the entertainment industry, GANs can be utilized for the automatic generation of characters or backgrounds [9]. Additionally, simple sketches can be transformed into more detailed objects and used in design-related productions (cartoons, games, etc.). For this industry, GANs that generate images based on text can be helpful, rapidly creating characters or objects suitable for the intended scene. Generating the appearance of specific individuals based on age and body pose can be useful for security-related issues. As we will see in the next section, the discriminator can also be used for one-class classification problems. For instance, in cases of banking fraud, where most transactions in existing datasets are valid, the discriminator can be very useful in identifying transactions that stand out and fall into the category of fraudulent transactions [10].

3. Dataset and Methods

As an experiment, we focused on testing one of the previously presented GAN methods on a dataset containing medical images. The dataset, CVC-ClinicDB [28], contains 612 images with a resolution of 384 × 288, extracted from a colonoscopy video. It was created by the Computer Vision Center, Barcelona, Spain, based on data from the Clinic Hospital of Barcelona. We chose to generate new enhanced-resolution images using different variations of the ESRGAN, with the purpose of finding out how the distribution of the image would differ. First, we generated an image with increased resolution using the GAN; then, we resized the image back to the original size; and finally, we plotted the distribution of the image in comparison with the original one. To do this, we plotted a histogram containing the frequency of the pixel values from 0 to 255.
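As an illustration, this comparison can be reproduced with a short script along the following lines; the file names are hypothetical, and the exact paths depend on how the dataset and the generated outputs are stored.

```python
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical file names for one CVC-ClinicDB frame and its ESRGAN output.
original = cv2.imread("cvc_clinicdb/original_001.png", cv2.IMREAD_GRAYSCALE)
generated = cv2.imread("outputs/esrgan_001.png", cv2.IMREAD_GRAYSCALE)

# Bring the super-resolved image back to the original 384 x 288 resolution.
generated = cv2.resize(generated, (original.shape[1], original.shape[0]),
                       interpolation=cv2.INTER_AREA)

# Frequency of each pixel intensity (0-255) for both images.
for img, label in [(original, "original"), (generated, "ESRGAN, resized")]:
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 255))
    plt.plot(hist, label=label)
plt.xlabel("pixel value")
plt.ylabel("frequency")
plt.legend()
plt.show()
```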
If we increase the size of an image using an ESRGAN and then resize it back to the original size, the resulting image is not considered a completely new image. It is still derived from the original image, but with enhanced resolution.
The purpose of using ESRGAN is to generate a high-resolution version of the input image by extrapolating details and enhancing fine textures. However, when we resize the image back to the original size, some of the details and textures that were added by the ESRGAN may be lost or altered.
In the context of a segmentation task, using the ESRGAN-enhanced and resized image can have both advantages and limitations. The advantages include the potential for improved segmentation accuracy, due to the increased resolution and enhanced details. The enhanced textures and sharper edges produced by the ESRGAN have the potential to improve the accuracy of object boundary detection.
However, it is important to note the potential limitations as well. The resized image may not perfectly retain all the original details, and artifacts or distortions may be introduced during the resizing process. Additionally, if the ESRGAN introduces any unrealistic or inaccurate features, these could affect the results of the downstream computer vision task.
We chose to employ a pretrained ESRGAN for several compelling reasons. Firstly, pretrained models have already undergone extensive training on large-scale datasets, which helps them capture rich and diverse features from the data. This pretraining phase enables the model to learn complex patterns and representations that are beneficial for enhancing the quality of our output.
Secondly, pretrained ESRGAN models have demonstrated remarkable performance in single-image super-resolution tasks. By building on the knowledge and insights gained during their training, we can leverage their ability to generate high-resolution and visually appealing images from low-resolution inputs.
Moreover, employing a pretrained ESRGAN allows us to save significant computational resources and time. Rather than training a model from scratch, we can build upon the existing knowledge encoded in the pretrained model, accelerating our development process and reducing the need for extensive data and computation.
For our experiments, we first used an ESRGAN. Compared to the original SRGAN, its generator uses residual-in-residual dense blocks and removes the batch normalization layers. The idea is to generate new images with increased resolution (1152 × 1536) and then resize them back to the original dimensions of the image (288 × 384). The objective is to obtain similar-looking images, so that the discriminator performs its role well, but the images do not need to have exactly the same distributions, since the goal is to obtain new data. It can be seen in Figure 1 and Figure 2 that the distributions of the original image and the generated image are similar, but not identical.
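For orientation, the sketch below shows a simplified residual-in-residual dense block (RRDB) in the spirit of ESRGAN [21]: dense convolutions without batch normalization, combined through residual connections with a small scaling factor. The channel counts and the 0.2 residual scale are commonly used values shown purely for illustration; the actual ESRGAN generator stacks many such blocks.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Dense convolutions without batch normalization, fused by a local residual connection."""
    def __init__(self, channels=64, growth=32):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels + i * growth, growth, 3, padding=1) for i in range(4)]
        )
        self.fuse = nn.Conv2d(channels + 4 * growth, channels, 3, padding=1)
        self.act = nn.LeakyReLU(0.2, inplace=True)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(self.act(conv(torch.cat(feats, dim=1))))
        return x + 0.2 * self.fuse(torch.cat(feats, dim=1))   # local residual connection

class RRDB(nn.Module):
    """Residual-in-residual dense block: three dense blocks wrapped in an outer residual."""
    def __init__(self, channels=64):
        super().__init__()
        self.blocks = nn.Sequential(DenseBlock(channels), DenseBlock(channels), DenseBlock(channels))

    def forward(self, x):
        return x + 0.2 * self.blocks(x)                        # residual-in-residual connection
```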
Figure 1. Original image histogram.
Figure 2. Generated image histogram using ESRGAN.
We also used the pretrained model Real-ESRGAN to generate the figures in the next chapter. Usually, a super-resolution model is trained on datasets containing good-resolution (ground truth) images and corresponding images with diverse types of degradation. One challenge is simulating the degradation that occurs in the real world, which can be caused by many factors: camera blur, sensor noise, JPEG compression, sharpening artifacts, image editing, file transfer over the Internet, etc. To overcome this, the authors of [29] introduced a high-order degradation modeling process, with the purpose of simulating complex real-world degradations.
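As a rough sketch of what such a degradation pipeline can look like, several randomized degradations (blur, noise, JPEG compression) are applied more than once before downscaling. The kernel sizes, noise levels, and JPEG qualities below are illustrative assumptions and not the values used in [29].

```python
import cv2
import numpy as np

def degrade_once(img):
    """Apply one round of randomized blur, sensor-like noise, and JPEG compression."""
    img = cv2.GaussianBlur(img, (5, 5), sigmaX=np.random.uniform(0.5, 2.0))   # camera blur
    noise = np.random.normal(0, np.random.uniform(1, 10), img.shape)          # sensor noise
    img = np.clip(img.astype(np.float32) + noise, 0, 255).astype(np.uint8)
    quality = int(np.random.uniform(30, 95))                                  # JPEG artifacts
    _, enc = cv2.imencode(".jpg", img, [cv2.IMWRITE_JPEG_QUALITY, quality])
    return cv2.imdecode(enc, cv2.IMREAD_COLOR)

def high_order_degradation(ground_truth, order=2, scale=4):
    """Repeat the degradation process and downscale to obtain the low-resolution training input."""
    img = ground_truth
    for _ in range(order):          # repeating the process gives the "high-order" model
        img = degrade_once(img)
    h, w = ground_truth.shape[:2]
    return cv2.resize(img, (w // scale, h // scale), interpolation=cv2.INTER_AREA)
```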
The model was trained using the DIV2K [30], Flickr2K [31], and OutdoorSceneTraining [32] datasets. The authors used the Adam optimizer, and the loss was a combination of L1 loss, perceptual loss, and GAN loss, with the weights {1, 1, 0.1}. Some limitations are twisted lines and unpleasant artifacts caused by GAN training, as well as unknown and out-of-distribution degradations [29]. Another such limitation is the inability to preserve the natural edges of nuclei, which could affect the model’s generalization capabilities. However, for a task in which the data only needs to be easier for an expert to visualize, these artifacts may not be a major disadvantage. Improving the resolution of medical images could also be very important in scenarios where image quality is constrained by the quality of the equipment and the time needed to produce a high-quality image. For example, magnetic resonance imaging (MRI) and computed tomography (CT) scans are two very common types of medical imaging data, and their quality depends on the equipment and the time allocated for performing the scan. The quality of these images is crucial for building an accurate decision model. In [33], the authors give, as an example, tissues that are small and hard to identify within the eye’s fundus. Elements like soft exudates, microaneurysms, or hemorrhages could potentially be identified better from an enhanced-resolution image. The most significant drawback remains that artifacts may appear during the resolution-increasing procedure, and this may affect model performance. To take advantage of all types of data and combine them to obtain higher-quality datasets, the authors of [34] have proposed a multimodal image fusion method, based on a multi-discriminator hierarchical wavelet GAN, with the purpose of creating a clear image without artifacts caused by the scanning process. Methods like [34] could be used to overcome imperfections in medical data and combine the strong points of different types of data, such as CT, MRI, positron emission tomography (PET), etc.

4. Results

Another way to show that the images are not the same is to compute the structural similarity index (SSIM) and the mean squared error (MSE). SSIM values range from −1 to 1, where 1 indicates perfect similarity, and MSE measures the average squared difference between corresponding pixel values in the two images; a lower MSE indicates less difference between the images. The SSIM and MSE scores for Figure 3 and Figure 4 are SSIM: 0.9185 and MSE: 0.0006. These values show that, while the two images are highly similar, as expected and confirmed by visual inspection, they are not exactly the same.
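Both metrics can be computed, for example, with scikit-image. The file names below are hypothetical, and the images are normalized to [0, 1], which is consistent with an MSE on the order of 0.0006.

```python
import cv2
import numpy as np
from skimage.metrics import structural_similarity, mean_squared_error

# Hypothetical file names for the two compared outputs (Figures 3 and 4).
img_a = cv2.imread("outputs/psnr_model_001.png", cv2.IMREAD_GRAYSCALE).astype(np.float64) / 255.0
img_b = cv2.imread("outputs/real_esrgan_001.png", cv2.IMREAD_GRAYSCALE).astype(np.float64) / 255.0

ssim = structural_similarity(img_a, img_b, data_range=1.0)  # 1.0 would mean identical images
mse = mean_squared_error(img_a, img_b)                      # 0.0 would mean identical images
print(f"SSIM: {ssim:.4f}, MSE: {mse:.4f}")
```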
Figure 3. Generated image using PSNR model.
Figure 4. Generated image using Real-ESRGAN.
In Figure 1 and Figure 2, the image pixel distribution for the original image and the ESRGAN generated image can be observed.
One downside of generating higher-resolution images is the appearance of artifacts around edges. One way to reduce these artifacts is to use a PSNR-oriented model. PSNR, or peak signal-to-noise ratio, is used in ESRGAN as a metric to evaluate the quality of super-resolved images: it measures the difference between two images by relating the peak signal power (the maximum possible pixel value) to the amount of noise or distortion introduced during upscaling or image enhancement. A higher PSNR value indicates a smaller difference between the generated image and the original image, implying better image quality. We also generated a set of images from the same input dataset with a PSNR-oriented model. The differences might not be immediately apparent, yet the distributions vary notably. In Figure 3, an example of an image generated with the PSNR model is displayed.
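For reference, PSNR can be computed directly from the MSE between the two images; a minimal sketch:

```python
import numpy as np

def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means the generated image is closer to the reference."""
    mse = np.mean((reference.astype(np.float64) - generated.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")          # identical images
    return 10.0 * np.log10((max_value ** 2) / mse)
```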
Lastly, we used Real-ESRGAN to generate new images, as can be seen in Figure 4, along with the corresponding segmentation mask in Figure 5. The segmentation mask is provided as part of the dataset. We apply the same augmentation to the original image and to its corresponding mask, making sure that features such as the edges of the area of interest remain correctly aligned between the image and the mask. Real-ESRGAN is a state-of-the-art solution for increasing resolution. Compared to the original ESRGAN, it proposes a U-Net discriminator with spectral normalization, to increase discriminator performance and stabilize the training dynamics [29].
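A hedged sketch of keeping the image and its mask aligned is shown below. The `upscale()` wrapper is an assumed interface rather than a specific library call, and nearest-neighbor interpolation is used for the mask so that its labels stay binary and its edges stay crisp.

```python
import cv2

def enhance_pair(image, mask, esrgan_model, scale=4):
    """Upscale an image with a (hypothetical) ESRGAN wrapper and resize its mask to match."""
    enhanced = esrgan_model.upscale(image)                      # e.g., Real-ESRGAN 4x output
    target = (image.shape[1] * scale, image.shape[0] * scale)   # (width, height) for cv2.resize
    mask_up = cv2.resize(mask, target, interpolation=cv2.INTER_NEAREST)
    assert enhanced.shape[:2] == mask_up.shape[:2], "image and mask must stay aligned"
    return enhanced, mask_up
```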
Figure 5. Generated segmentation mask using Real-ESRGAN.
To provide a more suggestive visual qualitative analysis, we also applied Real-ESRGAN to a breast histopathology dataset [35]. The original images are 50 × 50 pixels in size and show invasive ductal carcinoma pathology. Annotating images of this size for a task such as semantic segmentation would be tedious, if not impossible. Simply resizing the images produces blurry results, whereas the resolution-enhanced images present visibly better quality. In Figure 6, four of the original images from the dataset are presented. In Figure 7, Figure 8, Figure 9 and Figure 10, the simply resized image is shown on the left and the Real-ESRGAN-enhanced image on the right.
Figure 6. Breast Histopathology Images 1, 2, 3, 4.
Figure 7. Breast histopathology image 1 resized (left) and Real_ESRGAN generated (right).
Figure 8. Breast histopathology image 2 resized (left) and Real_ESRGAN generated (right).
Figure 9. Breast histopathology image 3 resized (left) and Real_ESRGAN generated (right).
Figure 10. Breast histopathology image 4 resized (left) and Real_ESRGAN generated (right).
We can observe a similarity in characteristics between Figure 7 and Figure 10, and between Figure 8 and Figure 9; this similarity is also reflected in the results in Table 2, where the SSIM and MSE values are closer together for images that share similar characteristics. This also shows that the chosen model is capable of consistency and generalization.
Table 2. SSIM and MSE values computed between resized and Real_ESRGAN images.

5. Conclusions

This review has provided a comprehensive overview of the current state-of-the-art generative adversarial network architectures within the domain of computer vision. The experiment with enhanced super-resolution generative adversarial networks revealed their capability to generate additional data for diverse datasets without simply duplicating existing information. Looking ahead, GANs hold significant potential in semi-supervised tasks across various domains. Their ability to generate realistic data expands the possibilities for training models with limited labeled data, making them invaluable in scenarios where obtaining large labeled datasets is challenging.
For future work, we want to extend the applicability of ESRGANs. In particular, we plan to investigate how semantic segmentation tasks perform on datasets enriched with additional generated images, compared to the original dataset. Furthermore, we want to apply ESRGAN to augment patches from histopathology images. In histopathology imaging, the scanned tissue image has a very high resolution, but only a small part of it actually contains areas of interest. This is why, for histopathology image segmentation, the image is divided into patches, keeping only those relevant to our classes [36]. Consequently, there is a need to increase the resolution of these small patches, where ESRGAN could prove beneficial. Additionally, in the case of these images, a GAN could be employed to alter the colors, as this represents the most common augmentation technique for this type of medical image.

Author Contributions

Conceptualization, A.-M.S., Ș.R. and A.M.F.; formal analysis, A.-M.S., Ș.R. and A.M.F.; investigation, A.-M.S. and Ș.R.; methodology, A.M.F.; project administration, A.M.F.; software, A.-M.S.; supervision, Ș.R. and A.M.F.; validation, Ș.R. and A.M.F.; visualization, A.-M.S. and Ș.R.; writing—original draft, A.-M.S., Ș.R. and A.M.F.; writing—review & editing, A.-M.S., Ș.R. and A.M.F. All authors have read and agreed to the published version of the manuscript.

Funding

This paper is supported by the Romania’s Recovery and Resilience Plan under grant agreement 760009, project “Creation, Operationalization and Development of the National Center of Competence in the field of Cancer”, PNRR-III-C9-2022—I5.

Data Availability Statement

The data presented in this study are available in this article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. arXiv 2014, arXiv:1406.2661v1. [Google Scholar]
  2. Deng, L. The mnist database of handwritten digit images for machine learning research. IEEE Signal Process. Mag. 2012, 29, 141–142. [Google Scholar] [CrossRef]
  3. The CIFAR-10 Dataset. Available online: https://www.cs.toronto.edu/~kriz/cifar.html (accessed on 22 August 2023).
  4. Toloka. History of Generative AI. Toloka Team. Available online: https://toloka.ai/blog/history-of-generative-ai/ (accessed on 22 August 2023).
  5. Rombach, R.; Blattmann, A.; Lorenz, D.; Esser, P.; Ommer, B. High-Resolution Image Synthesis with Latent Diffusion Models. arXiv 2021, arXiv:2112.10752. [Google Scholar]
  6. Ling, H.; Kreis, K.; Li, D.; Kim, S.W.; Torralba, A.; Fidler, S. EditGAN: High-Precision Semantic Image Editing. arXiv 2021, arXiv:2111.03186. [Google Scholar]
  7. Antipov, G.; Baccouche, M.; Dugelay, J.-L. Face Aging with Conditional Generative Adversarial Networks. arXiv 2017, arXiv:1702.01983. [Google Scholar]
  8. Siarohin, A.; Lathuiliere, S.; Sangineto, E.; Sebe, N. Appearance and Pose-Conditioned Human Image Generation using Deformable GANs. arXiv 2019, arXiv:1905.00007v2. [Google Scholar]
  9. Ruan, S. Anime Characters Generation with Generative Adversarial Networks. In Proceedings of the 2022 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA), Dalian, China, 20–21 August 2022. [Google Scholar] [CrossRef]
  10. Mamaghani, M.; Ghorbani, N.; Dowling, J.; Bzhalava, D.; Ramamoorthy, P.; Bennett, M.J. Detecting Financial Fraud Using GANs at Swedbank with Hopsworks and NVIDIA GPUs. NVIDIA Developer Blog. Available online: https://developer.nvidia.com/blog/detecting-financial-fraud-using-gans-at-swedbank-with-hopsworks-and-gpus/ (accessed on 26 March 2021).
  11. Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2016, arXiv:1511.06434v2. [Google Scholar]
  12. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875v3. [Google Scholar]
  13. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. arXiv 2017, arXiv:1704.00028v3. [Google Scholar]
  14. Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784v1. [Google Scholar]
  15. Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. arXiv 2020, arXiv:1703.10593v7. [Google Scholar]
  16. Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. arXiv 2018, arXiv:1611.07004v3. [Google Scholar]
  17. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. arXiv 2018, arXiv:1710.10196v3. [Google Scholar]
  18. Karras, T.; Laine, S.; Aila, T. A Style-Based Generator Architecture for Generative Adversarial Networks. arXiv 2019, arXiv:1812.04948v3. [Google Scholar]
  19. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114. [Google Scholar] [CrossRef]
  20. Papers with Code. Available online: https://paperswithcode.com/method/relativistic-gan (accessed on 1 June 2023).
  21. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Loy, C.C.; Qiao, Y.; Tang, X. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. arXiv 2018, arXiv:1809.00219v2. [Google Scholar]
  22. Armanious, K.; Jiang, C.; Fischer, M.; Küstner, T.; Hepp, T.; Nikolaou, K.; Gatidis, S.; Yang, B. MedGAN: Medical Image Translation using GANs. arXiv 2019, arXiv:1806.06397v2. [Google Scholar] [CrossRef] [PubMed]
  23. Baowaly, M.K.; Lin, C.-C.; Liu, C.-L.; Chen, K.-T. Synthesizing electronic health records using improved generative adversarial networks. J. Am. Med. Inform. Assoc. 2018, 26, 228–241. [Google Scholar] [CrossRef] [PubMed]
  24. Xie, H.; Lei, H.; Zeng, X.; He, Y.; Chen, G.; Elazab, A.; Wang, J.; Zhang, G.; Lei, B. AMD-GAN: Attention encoder and multi-branch structure based generative adversarial networks for fundus disease detection from scanning laser ophthalmoscopy images. Neural Netw. 2020, 132, 477–490. [Google Scholar] [CrossRef] [PubMed]
  25. Li, G.; Yun, I.; Kim, J.; Kim, J. DABNet: Depth-wise asymmetric bottleneck for real-time semantic segmentation. arXiv 2019, arXiv:1907.11357. [Google Scholar]
  26. Yang, Y.; Hou, C.; Lang, Y.; Yue, G.; He, Y. One-Class Classification Using Generative Adversarial Networks. IEEE Access 2019, 7, 37970–37979. [Google Scholar] [CrossRef]
  27. Khan, S.S.; Madden, M.G. One-class classification: Taxonomy of study and review of techniques. Knowl. Eng. Rev. 2014, 29, 345–374. [Google Scholar] [CrossRef]
  28. Kaggle. Available online: https://www.kaggle.com/datasets/balraj98/cvcclinicdb (accessed on 14 December 2023).
  29. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. arXiv 2021, arXiv:2107.10833. [Google Scholar]
  30. Agustsson, E.; Timofte, R. Ntire 2017 challenge on single image super-resolution: Dataset and study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
  31. Timofte, R.; Agustsson, E.; Van Gool, L.; Yang, M.; Zhang, L.; Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M.; et al. Ntire 2017 challenge on single image super-resolution: Methods and results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
  32. Wang, X.; Yu, K.; Dong, C.; Loy, C.C. Recovering realistic texture in image super-resolution by deep spatial feature transform. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
  33. Umirzakova, S.; Mardieva, S.; Muksimova, S.; Ahmad, S.; Whangbo, T. Enhancing the Super-Resolution of Medical Images: Introducing the Deep Residual Feature Distillation Channel Attention Network for Optimized Performance and Efficiency. Bioengineering 2023, 10, 1332. [Google Scholar] [CrossRef] [PubMed]
  34. Zhao, C.; Yang, P.; Zhou, F.; Yue, G.; Wang, S.; Wu, H.; Chen, G.; Wang, T.; Lei, B. MHW-GAN: MultiDiscriminator Hierarchical Wavelet Generative Adversarial Network for Multimodal Image Fusion. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–15. [Google Scholar] [CrossRef] [PubMed]
  35. Janowczyk, A.; Madabhushi, A. Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. J. Pathol. Inform. 2016, 7, 29. [Google Scholar] [CrossRef]
  36. Liu, Y.; He, Q.; Duan, H.; Shi, H.; Han, A.; He, Y. Using Sparse Patch Annotation for Tumor Segmentation in Histopathological Images. Sensors 2022, 22, 6053. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
