Systematic Review

Generative Adversarial Networks in Histological Image Segmentation: A Systematic Literature Review

by Yanna Leidy Ketley Fernandes Cruz 1, Antonio Fhillipi Maciel Silva 2, Ewaldo Eder Carvalho Santana 1,3 and Daniel G. Costa 4,*
1 Graduate Program in Electrical Engineering, Federal University of Maranhão (UFMA), São Luís 65080-805, Brazil
2 Computer Science Department, State University of Piauí (UESPI), Floriano 64800-000, Brazil
3 Graduate Program in Computer and Systems Engineering, State University of Maranhão (UEMA), São Luís 65081-400, Brazil
4 SYSTEC-ARISE, Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(14), 7802; https://doi.org/10.3390/app15147802
Submission received: 20 June 2025 / Revised: 6 July 2025 / Accepted: 10 July 2025 / Published: 11 July 2025
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract

Histological image analysis plays a crucial role in understanding and diagnosing various diseases, but manually segmenting these images is often complex, time-consuming, and heavily reliant on expert knowledge. Generative adversarial networks (GANs) have emerged as promising tools to assist in this task, enhancing the accuracy and efficiency of segmentation in histological images. This systematic literature review aims to explore how GANs have been utilized for segmentation in this field, highlighting the latest trends, key challenges, and opportunities for future research. The review was conducted across multiple digital libraries, including IEEE, Springer, Scopus, MDPI, and PubMed, with combinations of the keywords “generative adversarial network” or “GAN”, “segmentation” or “image segmentation” or “semantic segmentation”, and “histology” or “histological” or “histopathology” or “histopathological”. We reviewed 41 GAN-based histological image segmentation articles published between December 2014 and February 2025. We summarized and analyzed these papers based on the segmentation regions, datasets, GAN tasks, segmentation tasks, and commonly used metrics. Additionally, we discussed advantages, challenges, and future research directions. The analyzed studies demonstrated the versatility of GANs in handling challenges like stain variability, multi-task segmentation, and data scarcity—all crucial challenges in the analysis of histopathological images. Nevertheless, the field still faces important challenges, such as the need for standardized datasets, robust evaluation metrics, and better generalization across diverse tissues and conditions.

1. Introduction

The analysis of histological images has become increasingly prominent as a critical component in supporting clinical decision-making [1]. This growing relevance is largely driven by the integration of machine learning algorithms, which have shown great potential in automating complex visual tasks, for example, by enabling the precise identification of cellular and tissue structures such as nuclei, cytoplasm, glands, or tumor regions [2].
During histopathological evaluation, hematoxylin and eosin (H&E) staining is widely used to highlight cellular and tissue structures. Hematoxylin stains cell nuclei blue or purple, while eosin colors the cytoplasm and extracellular matrix in shades of pink [3]. Although effective, the staining process is highly sensitive to variations in protocols, reagent concentrations, and incubation times, which can result in considerable differences in color and contrast between samples. In addition, image acquisition may be affected by differences in slide scanners, lighting conditions, and resolution settings [4]. These sources of variability can lead to inconsistencies in visual appearance, making automated analysis, especially tasks like segmentation, more challenging and less generalizable across datasets.
In response to the challenges posed by variability in staining and image acquisition, generative neural networks, such as generative adversarial networks (GANs), have emerged as promising tools in the field. These models are capable of learning complex data distributions and producing realistic synthetic representations, making them particularly valuable for enhancing model generalization across heterogeneous datasets [5]. Building upon the growing interest in the use of GANs for histological image segmentation, recent research has explored their integration not only as primary segmentation models but also in auxiliary roles to address specific limitations in medical imaging workflows. Generative networks have been employed for tasks such as data augmentation [6], domain adaptation [7], and label refinement [8]. For example, Hou et al. [6] leveraged GANs to synthesize realistic histopathological patches, enabling improved segmentation performance in datasets with limited annotations. In the context of domain adaptation, Vo and Khan [7] proposed an edge-preserving GAN to bridge domain gaps caused by differences in staining or scanning protocols, thus ensuring consistent segmentation across diverse datasets. Furthermore, Qu et al. [8] demonstrated how GANs can refine weak annotations by learning the underlying structural distribution of histological components. Their ability to simulate consistent, high-quality training data and adapt models to new domains directly addresses key limitations associated with manual staining and imaging inconsistencies, contributing to more robust and accurate segmentation in histological image analysis.
However, despite their potential, GANs in particular present significant challenges that may limit their practical adoption in histological imaging workflows. These include the need for large amounts of training data, sensitivity to hyperparameters, and the lack of standardized evaluation metrics to assess the quality of generated outputs [9]. In order to gain a comprehensive understanding of GANs and their applications in image segmentation, a systematic review is necessary to consolidate current knowledge, evaluate the methodologies employed, and identify trends, limitations, and gaps in the literature.
This review summarizes published articles on GAN-based architectures for histological image segmentation. The reviewed studies are categorized according to the specific tasks addressed by the GANs, such as data augmentation, stain normalization, and virtual staining, as well as the organs targeted by segmentation, including the breast, kidney, liver, stomach, prostate, and other regions.
Therefore, this review study is expected to make valuable contributions by guiding researchers in selecting relevant research topics within this broad domain and by supporting the development of more effective methodological approaches. Although the use of Generative Adversarial Networks (GANs) is not new and has been explored in several review studies from different perspectives [10,11,12], existing reviews do not comprehensively address the application of GANs to histological images. This gap can pose challenges for researchers entering this field. Thus, this systematic literature review aims to fill a significant research gap by providing a focused and thorough analysis of GAN-based approaches in the context of histological imaging.
The main contributions of this article can be summarized as follows:
  • A thorough synthesis of the current state of the art regarding the use of generative models, particularly GANs, in histological image segmentation;
  • Important perspectives on the use of GANs across different aspects, including yearly trends in GAN applications, the specific tasks addressed, the organs involved, the datasets utilized, and the evaluation criteria employed in the studies analyzed;
  • Practical guidance for future research, helping to shape the direction of generative model development in histological imaging.
The remainder of this article is organized as follows: Section 2 discusses previous literature reviews in related research domains, aiming to identify existing gaps and trace the field’s evolution. Section 3 introduces the fundamental principles of histological image segmentation and explains the operational concepts of GANs. The review methodology, including the research questions, search strategy, and study selection criteria, is detailed in Section 4. Section 5 presents the review’s main findings, such as standard evaluation metrics for segmentation models and the specific contributions of GANs in medical image segmentation. Section 6 discusses the implications of the results, offering a critical analysis of the reviewed literature. Finally, Section 7 presents the study’s conclusions, followed by the references used throughout the article.

2. Related Review Works

While systematic literature reviews are widely recognized as essential for understanding a research domain, it is equally important to identify prior studies with similar objectives. Doing so helps uncover existing gaps in the literature and provides insight into how the field has evolved. This section presents previous literature reviews in related areas, serving as a foundation to support the discussions throughout this article.
In recent years, medical image segmentation has significantly transformed from traditional rule-based algorithms to more advanced deep learning-based approaches. In particular, histological image segmentation, a domain characterized by complex tissue structures, staining variability, and subtle morphological patterns, has greatly benefited from data-driven methods. These modern techniques address challenges such as low contrast, staining artifacts, and overlapping cellular structures, enabling more accurate and robust segmentation of tissue components and cellular features [13,14,15].
Among the most promising approaches are GANs, which have demonstrated significant potential in histopathological image segmentation. GANs operate through adversarial training, allowing them to model complex distributions and generate realistic segmentation outputs even when working with limited or imbalanced datasets. One of the key advantages of GANs in this domain is their ability to handle domain adaptation and data bias by learning the underlying data distribution in a supervised, semi-supervised, or unsupervised fashion [16,17].
Several studies have comprehensively reviewed the application of GANs in medical imaging, highlighting their promise and limitations. For instance, Sultan et al. [16] and Alhumaid et al. [18] focused on the role of GANs in medical image segmentation, emphasizing their potential to improve image quality and structure delineation. These reviews cover various medical imaging modalities without specializing in histological images. They are helpful as a general foundation but do not delve into critical aspects such as histopathology-specific challenges, staining heterogeneity, or cellular-level image characteristics; segmentation is treated only conceptually, and the authors do not conduct a systematic analysis of datasets or performance metrics. In contrast, Islam et al. [19] deliver a comprehensive review of GAN applications across the medical imaging domain, covering tasks such as synthesis, augmentation, enhancement, and segmentation. However, segmentation is treated as only one of several applications, and histological images are not a central concern.
From a theoretical perspective, Gui et al. [17] offer an in-depth review of GAN algorithms, models, and training strategies. While this work is essential for understanding foundational and architectural aspects of GANs, including advanced variants such as PGGAN and WGAN, it does not focus on medical imaging tasks and completely omits segmentation-specific discussions. Therefore, it serves as a solid technical reference but does not bridge the gap toward domain application.
Specific to histopathology, Jose et al. [20] explored the use of GANs in digital pathology, focusing on whole-slide imaging and nuclear segmentation. This work highlights the variability of histological data and the potential of GANs to normalize and adapt images across domains. Nonetheless, it provides limited quantitative analysis and no comparison of GAN variants. Other works, such as Chen et al. [5] and Zhao et al. [21], centered on data augmentation and attention mechanisms, respectively. While they broaden the understanding of GAN applications, they generally do not address regulatory or ethical implications.
A more recent and comprehensive contribution by Hussain et al. [22] approached the use of GANs from the perspective of medical image reconstruction while also considering segmentation as part of the broader landscape of image enhancement tasks. The review distinguishes itself by offering a well-structured analysis of application domains, model architectures, and performance outcomes, providing valuable insights into the evolution of GAN-based methods.
Focusing specifically on image segmentation, Xun et al. [23] presented an in-depth review of GAN-based methods and their architectures tailored for medical segmentation tasks. While the review covers many techniques, it does not systematically evaluate the datasets or compare performance across benchmarks, limiting its utility for reproducibility.
Unlike previous reviews that broadly address the use of GANs in general medical imaging or focus on specific applications such as data augmentation or stain normalization, this review provides a focused and structured analysis of GAN-based architectures, specifically within the context of histological image segmentation. It distinguishes itself by not only examining segmentation models but also exploring the auxiliary roles GANs play, such as domain adaptation and label refinement, within segmentation pipelines. Furthermore, the review categorizes studies according to both the technical tasks performed and the anatomical regions analyzed, offering a multidimensional perspective that is currently lacking in the literature. This level of granularity aims to support researchers and practitioners in identifying methodological gaps, understanding current trends, and selecting appropriate GAN approaches tailored to specific histological challenges.
Table 1 presents a comparative analysis of recent and relevant studies in the literature focused on using generative adversarial networks for histological image segmentation. The comparison includes key aspects such as the types of images addressed (including histopathological slides and whole-slide images), datasets, investigated tasks, GAN models discussed, evaluation metrics, and objectives, following well-recognized standards for comparing scientific works in SLR studies [24,25]. Moreover, the selected metrics align with those commonly used in AI-driven imaging reviews across multiple domains.
Overall, these studies underscore the increasing interest in GAN-based methods for enhancing analytical accuracy in complex tissue structures. They address critical challenges such as stain variability, scarcity of annotated data, and domain shifts across heterogeneous datasets. Based on the analyses performed, we identified that while GANs have shown promising results in histological segmentation tasks, the current literature remains sparse and fragmented. Most reviews still concentrate on broader medical imaging domains (e.g., MRI, CT) and provide limited discussion on histology-specific applications. This reveals a clear gap in domain-focused synthesis, particularly concerning the validation of GANs. As presented in this study, a focused, systematic review is thus essential for understanding the capabilities, limitations, and emerging directions of GANs in histological image segmentation.
This review article focuses exclusively on the use of GANs for the image segmentation task in histological imaging. It examines papers published from the introduction of GANs in 2014 to February 2025. Although the analysis may not be exhaustive, these studies are expected to provide sufficient knowledge to aid both novice and experienced researchers in selecting research topics and refining methodological approaches within the studied subject.

3. Fundamentals and Basic Definitions

3.1. Histological Image Segmentation

Histology is the science dedicated to the study of biological tissues at a microscopic level. Histological images are digital representations of tissue slides, usually obtained by optical microscopy after application of specific dyes, such as H&E. These images capture a complex cellular and tissue architecture, and are fundamental for understanding tissue morphology and diagnosing diseases, especially in the field of pathology. In biomedical practice, histological images allow the identification of cellular patterns, anatomical structures and alterations that may indicate pathological processes, such as inflammation, dysplasia or cancer [27,28].
The role of histological images in medical diagnosis is irreplaceable. In diagnostic pathology, they are a basis for the diagnosis of several diseases, guiding therapeutic and prognostic decisions. Furthermore, they are essential in biomedical research, for example, to understand molecular mechanisms of diseases or evaluate responses to experimental treatments [29].
However, manual analysis of these images, traditionally performed by experienced pathologists, faces significant challenges. This process is time-consuming, subject to human fatigue, and often presents inter- and intra-observer variability, i.e., different specialists may interpret the same image differently [30]. Furthermore, the complexity of histological images, with their color variations, morphological patterns, and heterogeneous textures, makes the task even more challenging. The increase in the volume of exams due to the growing demand for health services also overloads professionals, highlighting the need for automated methods to assist in the analysis [28].
In this context, the segmentation of histological images emerges as a fundamental step in automated analysis. Segmentation aims to identify and delimit specific structures within images, such as cell nuclei, glands, collagen fibers, or tumor regions. This step is essential to extract quantitative information that complements the qualitative analysis performed by experts [27]. Practical applications of segmentation include tumor detection in histological sections [31,32], quantification of cell nuclei for tumor health studies [33], assessment of degrees of inflammation or fibrosis [30], and prognostic analysis based on morphological characteristics [34].
Despite its potential, segmentation of histological images presents significant technical difficulties. Color and intensity variations introduced by different staining protocols, texture patterns, and overlapping cellular structures make the problem challenging for traditional algorithms. In addition, artifacts generated during slide preparation, such as folds or bubbles, can interfere with the accuracy of segmentations [29]. These difficulties have motivated the development of more robust approaches, such as the use of deep neural networks and, more recently, models based on generative adversarial networks (GANs).

3.2. Introduction to Generative Adversarial Networks

Introduced in [35], GANs are a class of deep learning models designed to generate realistic synthetic data by learning the underlying distribution of a given dataset. The architecture of a generative adversarial network is composed of two main components: the generator $G$ and the discriminator $D$, which are trained simultaneously in an adversarial process. Figure 1 illustrates the basic architecture of a GAN model.
During training, the GAN uses two distributions: a real data distribution $P_{\text{data}}(X)$, from which a dataset $X$ is sampled, and a noise distribution $P_z(Z)$, from which random latent variables $Z$ are drawn.
Initially, the generator $G$ receives as input $M$ noise vectors $z_1, z_2, \ldots, z_M \sim P_z(Z)$ and transforms them into synthetic samples $\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_M$, denoted collectively as $\hat{X}$. Simultaneously, $M$ real samples $x_1, x_2, \ldots, x_M \sim P_{\text{data}}(X)$ are used to train the discriminator.
The discriminator $D$ is trained to distinguish real samples from fake ones using a gradient ascent approach. It acts as a binary classifier, aiming to output $D(x) = 1$ for real data and $D(\hat{x}) = 0$ for generated data. Once $D$ has been trained for several iterations, the generator is updated via gradient descent, typically with a lower learning rate to ensure training stability.
The ultimate goal is for the generator $G$ to produce samples that the discriminator $D$ cannot distinguish from real ones, i.e., $D(\hat{x}) \approx 0.5$ for $\hat{x} = G(z)$. Mathematically, the value function $V(G, D)$ represents the adversarial game in a GAN, formulated as the minimax optimization problem [36] shown in Equation (1):
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_z(z)}[\log(1 - D(G(z)))] \quad (1)$$
Here, the generator seeks to minimize $\log(1 - D(G(z)))$, effectively attempting to fool the discriminator, while the discriminator is trained to maximize its ability to distinguish between real and fake samples by minimizing the binary cross-entropy (BCE) loss.
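To make this adversarial dynamic concrete, the sketch below shows one discriminator update and one generator update for a toy GAN in PyTorch. The network sizes, learning rates, and random data are illustrative assumptions made for this review and are not taken from any of the studies analyzed.

```python
# Minimal sketch of one adversarial training step (illustrative toy MLP GAN).
import torch
import torch.nn as nn

latent_dim, data_dim, batch_size = 64, 784, 32

G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)  # lower learning rate for G, as noted above
bce = nn.BCELoss()

real = torch.randn(batch_size, data_dim)   # stand-in for a real batch x ~ P_data
z = torch.randn(batch_size, latent_dim)    # latent noise z ~ P_z

# Discriminator step: push D(x) toward 1 and D(G(z)) toward 0 (gradient ascent on V via BCE).
fake = G(z).detach()
loss_D = bce(D(real), torch.ones(batch_size, 1)) + bce(D(fake), torch.zeros(batch_size, 1))
opt_D.zero_grad(); loss_D.backward(); opt_D.step()

# Generator step: fool the discriminator, i.e., push D(G(z)) toward 1.
loss_G = bce(D(G(z)), torch.ones(batch_size, 1))
opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```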
Despite their theoretical soundness and success in various applications, generative adversarial networks (GANs) face several practical challenges during training. Among the most frequently encountered issues are convergence difficulties, where the adversarial training dynamic between the generator $G$ and discriminator $D$ fails to reach a stable equilibrium [37,38]. Another key problem is that GANs are highly sensitive to hyperparameters and network architectures; minor adjustments can lead to vastly different outcomes, making the training process brittle and difficult to tune [35,39]. Additionally, overfitting can occur when the discriminator becomes overly strong, memorizing the training data and failing to provide useful feedback to the generator [40].
Addressing these challenges has motivated a rich body of research proposing extensions, regularization techniques, alternative architectures, and novel training algorithms aimed at stabilizing GAN training and improving output quality.
To mitigate these issues, numerous variants and extensions of GANs have been proposed, which will be explored in subsequent sections.

3.3. Types of GANs

Since their introduction by Goodfellow et al. [35], generative adversarial networks have transformed the field of data generation and image synthesis. The original GAN architecture, despite its conceptual simplicity, faced challenges related to training instability, convergence issues, and mode collapse. In response, numerous GAN variants have been proposed to address these limitations and adapt the architecture to specific applications, such as medical image segmentation, and more recently, histological image analysis. In this subsection, we will introduce the representative GAN variants.

3.3.1. cGAN

Conditional GANs (cGANs), proposed by Mirza and Osindero [41], are an extension of the traditional GAN architecture in which both the generator and the discriminator are conditioned on auxiliary information, such as class labels or more detailed tags associated with the data. This conditional input, denoted as $c$, is provided to both networks as an additional input to guide the data generation process. As a result, the generator learns to produce samples that not only resemble real data but also conform to the specified condition, rather than generating generic samples from an unknown noise distribution. Equation (2) formally represents the loss function of a cGAN:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x \mid c)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z \mid c)))] \quad (2)$$
where $D(x \mid c)$ and $G(z \mid c)$ represent the discriminator and generator conditioned on the auxiliary information $c$, respectively; $x$ denotes the real data sample, and $p_z(z)$ refers to the prior noise distribution used as input to the generator [20].
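As a minimal illustration of the conditioning mechanism in Equation (2), the following sketch concatenates a learned label embedding to both the generator and discriminator inputs. The dimensions and the embedding-based conditioning scheme are assumptions chosen for clarity, not details of any reviewed model.

```python
# Illustrative cGAN conditioning: append a label embedding c to the inputs of G and D.
import torch
import torch.nn as nn

latent_dim, data_dim, n_classes, emb_dim = 64, 784, 10, 16

label_emb = nn.Embedding(n_classes, emb_dim)
G = nn.Sequential(nn.Linear(latent_dim + emb_dim, 256), nn.ReLU(), nn.Linear(256, data_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(data_dim + emb_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

z = torch.randn(8, latent_dim)
c = torch.randint(0, n_classes, (8,))                   # e.g., a tissue-type or class label

x_fake = G(torch.cat([z, label_emb(c)], dim=1))         # G(z | c)
d_fake = D(torch.cat([x_fake, label_emb(c)], dim=1))    # D(x_fake | c)
```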

3.3.2. CycleGAN

For scenarios where paired data are unavailable, CycleGAN offers a powerful alternative. Proposed by Zhu et al. [42], it is designed to perform image-to-image translation between two domains. Traditionally, image translation tasks rely on supervised learning with paired datasets, where each input image in one domain has a corresponding target image in the other. However, in many real-world scenarios, such paired training data are not available. CycleGAN addresses this limitation by enabling unpaired image translation, learning to map the underlying structure and style from one domain to another without the need for aligned image pairs.
The CycleGAN architecture consists of two generators, $G: X \rightarrow Y$ and $F: Y \rightarrow X$, and two discriminators, $D_X$ and $D_Y$. The discriminator $D_X$ aims to distinguish between real images from domain $X$ and synthetic images generated by $F(Y)$, while $D_Y$ differentiates between real images in domain $Y$ and generated images $G(X)$. To ensure consistency and preserve key features during translation, the model enforces a cycle consistency loss, encouraging $F(G(X)) \approx X$ and $G(F(Y)) \approx Y$ [36].
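The cycle consistency idea can be summarized in a few lines. The sketch below computes the CycleGAN generator objective for generic callables G, F, D_X, and D_Y, assuming least-squares adversarial terms and an L1 cycle term; the weighting factor is an illustrative choice.

```python
# Sketch of the CycleGAN generator loss: adversarial terms plus cycle consistency.
import torch
import torch.nn.functional as F_loss  # aliased to avoid clashing with the generator F

def cycle_gan_generator_loss(G, F, D_X, D_Y, real_x, real_y, lambda_cyc=10.0):
    fake_y = G(real_x)   # X -> Y translation
    fake_x = F(real_y)   # Y -> X translation

    # Adversarial terms: generators try to make the discriminators output "real" (1).
    adv = F_loss.mse_loss(D_Y(fake_y), torch.ones_like(D_Y(fake_y))) + \
          F_loss.mse_loss(D_X(fake_x), torch.ones_like(D_X(fake_x)))

    # Cycle-consistency terms: F(G(x)) should reconstruct x, and G(F(y)) should reconstruct y.
    cyc = F_loss.l1_loss(F(fake_y), real_x) + F_loss.l1_loss(G(fake_x), real_y)

    return adv + lambda_cyc * cyc
```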

3.3.3. StyleGAN

StyleGAN [43,44] is a generative adversarial network architecture specifically designed to improve the control and quality of image synthesis. Unlike traditional GANs, which map a latent vector $z$ directly to an image, StyleGAN introduces an intermediate latent space $\mathcal{W}$, allowing for greater disentanglement of image features. The latent vector is first passed through a mapping network to produce $w \in \mathcal{W}$, which is then used to modulate the layers of the generator via adaptive instance normalization (AdaIN). This mechanism enables fine-grained control over various visual attributes, such as texture, pose, and structure, at different levels of the network.
The generator in StyleGAN progressively synthesizes high-resolution images through a coarse-to-fine architecture, while the discriminator learns to distinguish real images from those generated. The result is the production of highly realistic images, especially in human face synthesis, with improved variability and control over style compared to earlier GAN models.
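The following sketch illustrates the AdaIN modulation described above: an intermediate latent w is projected to per-channel scale and bias values that re-style an instance-normalized feature map. The affine projections and tensor shapes are simplifying assumptions, not a faithful reproduction of the full StyleGAN generator.

```python
# Minimal AdaIN sketch: style latent w modulates a normalized feature map per channel.
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    def __init__(self, w_dim: int, channels: int):
        super().__init__()
        self.to_scale = nn.Linear(w_dim, channels)  # per-channel style "scale"
        self.to_bias = nn.Linear(w_dim, channels)   # per-channel style "bias"
        self.norm = nn.InstanceNorm2d(channels)

    def forward(self, feat: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
        # feat: (N, C, H, W) feature map; w: (N, w_dim) intermediate latent.
        scale = self.to_scale(w).unsqueeze(-1).unsqueeze(-1)
        bias = self.to_bias(w).unsqueeze(-1).unsqueeze(-1)
        return scale * self.norm(feat) + bias
```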

3.3.4. PGGAN

Progressive Growing of GANs (PGGAN), proposed by Karras et al. [45], introduced a novel training strategy for generative adversarial networks that significantly enhances the quality, stability, and variation of high-resolution image synthesis. Unlike traditional GANs, which are trained with fixed-size networks on full-resolution images, PGGAN starts with low-resolution images (e.g., 4 × 4 ) and progressively increases the resolution by adding layers to both the generator and the discriminator during training. This incremental growth allows the network to first learn global structures before focusing on fine-grained details.
The standard adversarial objective remains the same as in Equation (1). However, PGGAN introduces two core innovations to stabilize training and improve output quality (a code sketch of both follows this list):
Progressive architecture growth: new layers are gradually added to the generator and discriminator. During transitions, a linear interpolation (fade-in) smoothly blends the outputs of the lower- and higher-resolution paths:
$$x_{\text{output}} = (1 - \alpha) \cdot x_{\text{low}} + \alpha \cdot x_{\text{high}} \quad (3)$$
where $\alpha \in [0, 1]$ increases over time as the network adapts to the new resolution.
Improved normalization: instead of traditional batch normalization, PGGAN employs pixelwise feature vector normalization in the generator and minibatch standard deviation in the discriminator to reduce mode collapse and stabilize training.
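A compact sketch of the two mechanisms just described is given below: the fade-in blend of Equation (3) and pixelwise feature vector normalization. Function names and tensor shapes are illustrative assumptions rather than details of a specific implementation.

```python
# Sketches of the PGGAN fade-in blend and pixelwise feature vector normalization.
import torch

def fade_in(x_low: torch.Tensor, x_high: torch.Tensor, alpha: float) -> torch.Tensor:
    # x_low: upsampled output of the previous resolution path; x_high: new higher-resolution path.
    return (1.0 - alpha) * x_low + alpha * x_high

def pixel_norm(feat: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # Normalize each pixel's feature vector across the channel dimension (dim=1).
    return feat / torch.sqrt(torch.mean(feat ** 2, dim=1, keepdim=True) + eps)
```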
Furthermore, Karras et al. [43,45] evaluate the performance using the Fréchet Inception Distance (FID), a perceptual quality metric that measures both the fidelity and diversity of generated images. PGGAN showed significant improvements over prior GAN variants, especially for complex datasets.
In medical imaging, particularly in histopathology, the ability to synthesize realistic, high-resolution tissue structures is highly valuable. PGGAN has been adopted for synthetic data generation and augmentation in tasks such as tumor detection, gland segmentation, and classification of histological patterns, helping to improve the robustness of machine learning models where real annotated data is scarce [46,47,48].

3.3.5. WGAN

The Wasserstein GAN (WGAN), introduced by Arjovsky et al. [49], addresses key training instabilities and mode collapse issues observed in traditional GANs by replacing the Jensen–Shannon (JS) divergence with the Earth Mover (EM) distance, also known as the Wasserstein-1 distance.
Given two probability distributions $P_r$ (real data) and $P_g$ (generated data), the Wasserstein-1 distance is defined as:
$$W(P_r, P_g) = \inf_{\gamma \in \Pi(P_r, P_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\, \lVert x - y \rVert \,\right]$$
where $\Pi(P_r, P_g)$ denotes the set of all joint distributions $\gamma(x, y)$ whose marginals are $P_r$ and $P_g$.
Using the Kantorovich–Rubinstein duality, the WGAN objective becomes:
$$\max_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{z \sim P_z}[D(G(z))]$$
Here, $D$ (often referred to as the critic rather than a discriminator) is constrained to be 1-Lipschitz, with $\mathcal{D}$ denoting the set of 1-Lipschitz functions; this constraint was originally enforced via weight clipping. The generator $G$ is trained to minimize this objective.
The use of the Wasserstein distance provides a smoother and more meaningful loss metric that correlates better with the quality of generated samples. As a result, WGAN improves convergence and offers more stable training, even in cases of poor discriminator performance.
In practice, the WGAN critic loss is:
$$\mathcal{L}_D = -\mathbb{E}_{x \sim P_r}[D(x)] + \mathbb{E}_{z \sim P_z}[D(G(z))]$$
and the generator loss is:
$$\mathcal{L}_G = -\mathbb{E}_{z \sim P_z}[D(G(z))]$$
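The critic and generator losses above translate directly into code. The sketch below assumes a simple training loop with weight clipping to approximate the 1-Lipschitz constraint; the clipping bound and optimizer handling are illustrative choices, not taken from a specific reviewed implementation.

```python
# Sketch of WGAN critic and generator updates with weight clipping.
import torch

def wgan_critic_step(D, G, real, z, opt_D, clip_value=0.01):
    # Critic loss: -E[D(x)] + E[D(G(z))]; minimizing it maximizes the Wasserstein estimate.
    loss_D = -D(real).mean() + D(G(z).detach()).mean()
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()
    # Approximate the 1-Lipschitz constraint by clipping critic weights.
    for p in D.parameters():
        p.data.clamp_(-clip_value, clip_value)
    return loss_D.item()

def wgan_generator_step(D, G, z, opt_G):
    # Generator loss: -E[D(G(z))].
    loss_G = -D(G(z)).mean()
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_G.item()
```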
In summary, WGAN and its variants have become valuable tools in medical image processing, enabling artifact correction, anomaly detection, and synthetic data generation, ultimately contributing to improved diagnostic performance and model reliability [50,51]. These models address key challenges in the field such as the scarcity of labeled data, noise suppression, and the synthesis of realistic medical images.

3.3.6. Pix2Pix

Pix2Pix is a conditional generative adversarial network (cGAN) architecture proposed by Isola et al. [52], specifically designed for image-to-image translation tasks. Unlike traditional GANs that learn to generate images from random noise, Pix2Pix learns a mapping from an input image to an output image, making it suitable for tasks such as image colorization, edge-to-photo translation, and semantic segmentation.
The core idea of Pix2Pix is to train a generator $G$ that produces an output image $G(x, z)$, given an input image $x$ and optional noise $z$, such that the output resembles a target image $y$. A discriminator $D$ is simultaneously trained to distinguish between real image pairs $(x, y)$ and fake pairs $(x, G(x, z))$. The conditional adversarial loss that guides this process is defined as:
$$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x, y}[\log D(x, y)] + \mathbb{E}_{x, z}[\log(1 - D(x, G(x, z)))]$$
To further encourage the generator to produce outputs close to the ground truth, a traditional $L_1$ loss is added:
$$\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}\left[\, \lVert y - G(x, z) \rVert_1 \,\right]$$
The final objective function for the Pix2Pix model is a weighted combination of the adversarial loss and the $L_1$ loss:
$$G^* = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D) + \lambda \mathcal{L}_{L1}(G)$$
where $\lambda$ is a hyperparameter that balances the two loss terms.
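Putting the two loss terms together, the sketch below computes the Pix2Pix generator objective for a discriminator that receives the concatenated (input, output) pair. The logit-based BCE, the channel-wise concatenation, and the value of the weighting factor are assumptions for illustration only.

```python
# Sketch of the Pix2Pix generator objective: conditional adversarial loss + weighted L1 term.
import torch
import torch.nn.functional as F

def pix2pix_generator_loss(D, x, y, fake_y, lam=100.0):
    # Adversarial term: the generator wants D(x, G(x)) to be classified as real.
    d_out = D(torch.cat([x, fake_y], dim=1))   # discriminator sees the (input, output) pair
    adv = F.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    # L1 term: keep the translated image close to the ground-truth target y.
    l1 = F.l1_loss(fake_y, y)
    return adv + lam * l1
```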

4. Review Methodology

This systematic literature review was conducted following a structured methodology based on the PRISMA guidelines [25,53] (see Supplementary File PRISMA checklist). The review process included the definition of research questions, the construction of a comprehensive search strategy, the selection of relevant studies through predefined inclusion and exclusion criteria, and the extraction of data. The quality of the included studies was assessed using a standardized checklist to ensure methodological rigor and transparency. The overall process was documented using a PRISMA flow diagram to provide a clear overview of the study selection process.
To ensure the novelty of our systematic review, we consulted the PROSPERO database [54], which serves as a registry for systematic review protocols. This platform is crucial in promoting research transparency and minimizing redundancy by publicly making information on ongoing and planned reviews available. Our search did not identify any registered reviews that addressed the same research question, indicating a gap in the existing literature.

4.1. Research Questions

This review was structured around a set of research questions formulated to examine the application of generative adversarial networks in histological image segmentation. The objective was to analyze the breadth of applications, methodological approaches, and evaluation strategies reported in the existing body of literature. To guide this investigation, the following Research Questions (RQs) were established:
  • (RQ1) Which types of histological tissues are used in studies applying generative adversarial networks (GANs) for image segmentation tasks?
  • (RQ2) Which datasets are used in the studies?
  • (RQ3) What are the main goals of applying GANs to histological segmentation tasks?
  • (RQ4) Which GAN architectures are used in the studies?
  • (RQ5) What metrics are used to evaluate the performance of GANs in the proposed tasks?
  • (RQ6) What is the specific goal of the segmentation performed in the developed studies?
  • (RQ7) Which segmentation architecture is used in the studies?
  • (RQ8) What metrics are used to evaluate the segmentation quality?

4.2. Search Strategy

A systematic search for primary studies was conducted on 4 February 2025, using the following digital libraries: Springer Link, IEEE Xplore (Institute of Electrical and Electronics Engineers), Scopus, Multidisciplinary Digital Publishing Institute (MDPI), and PubMed. The search process was carried out by one researcher and validated by two additional researchers to ensure consistency and reliability. The search terms were defined based on the main components of the research, technology, task, and domain, and included the following combination of keywords and their synonyms: (“generative adversarial network” OR “GAN”) AND (“segmentation” OR “image segmentation” OR “semantic segmentation”) AND (“histology” OR “histological” OR “histopathology” OR “histopathological”).

4.3. Selection Criteria

The study selection process was conducted in three main stages, following predefined inclusion and exclusion criteria, as outlined in Table 2. In the first stage, duplicate records retrieved from different databases were removed. In the second stage, during the title and abstract screening, independent reviewers evaluated whether the studies met the inclusion criteria. In the third stage, the full texts of the remaining articles were assessed in detail to confirm their eligibility. In cases of disagreement, a third reviewer was consulted or the issue was resolved through discussion.

4.4. Data Extraction and Synthesis

To ensure consistency, reproducibility, and reliability in the data extraction process, a structured form was developed using Google Forms. This instrument was carefully designed to encompass all elements necessary to answer the research questions, comprehensively characterize the selected studies, and identify gaps and promising trends for future investigations. Each article was read in full by the reviewers, and the relevant data were systematically and consistently extracted. Subsequently, Microsoft Excel (version 2504, Build 16327.20264) was employed as a supporting tool for organizing and systematizing the extracted data. The information collected from each article included in this systematic review covered key aspects essential for a detailed comparative analysis of GAN-based approaches applied to histological image segmentation. Among the extracted data were the year of publication, the organ analyzed in the study, the specific segmentation domain, the nature of the dataset used (public or private), the dataset name, the GAN technique employed, the metrics used to evaluate GAN performance along with their respective results, the specific task assigned to the GAN, the segmentation technique applied, the evaluation metrics for segmentation and their results, the experimental setting or environment, the availability of source code or algorithm URLs, the type of learning employed (supervised, unsupervised, or semi-supervised), and the overall objective of the study as reported by the authors. The synthesized results are presented in Supplementary File S1.

5. Results

5.1. Search Results

The study selection process was conducted in three phases, as illustrated by the PRISMA flow diagram in Figure 2. After searching five databases, we retrieved a total of 1227 articles. In the first phase, identification, 41 duplicate articles were excluded. In the next phase, screening, we performed title and abstract reviews, excluding 965 articles that did not meet the initial criteria. Subsequently, 221 reports were sought for retrieval, of which 4 could not be accessed in full text and were therefore excluded. In the final phase, eligibility, 217 articles were assessed through full-text reading. After detailed analysis, 175 articles were discarded, resulting in a final total of 41 articles included in the review.

5.2. Applications and Trends of GANs in Histopathological Image Segmentation

The application of generative adversarial networks in medical image analysis has gained significant momentum in recent years due to their ability to synthesize high-quality, realistic data and enhance image characteristics across various clinical tasks. In histopathological image processing, GANs have been particularly valuable for tasks such as data augmentation, stain normalization, and image-to-image translation, offering robust solutions to data scarcity and variability challenges [51,55]. Furthermore, segmentation tasks have also benefited from GAN-based approaches, which often incorporate adversarial losses to refine boundary accuracy and structural consistency [50]. The following figures provide a quantitative overview of how GANs have been utilized across different segmentation tasks and target regions in histological imaging.
Figure 3 and Figure 4 provide a detailed quantitative overview of the distribution of GAN-related research in histological image segmentation. As illustrated in Figure 3, the most frequently addressed task using GANs was data augmentation, appearing in 20 studies, followed by stain normalization with 12 studies. Other relevant applications included virtual staining, color normalization, and synthetic image generation, each represented in 4 studies. Less frequently explored were domain adaptation and reconstruction. This distribution highlights a clear research emphasis on preprocessing and data enrichment tasks, particularly those aimed at improving dataset variability and quality in histopathological imaging.
Figure 4 categorizes the segmented regions targeted by these studies. The majority focused on nuclei segmentation, accounting for 20 studies, indicating a strong interest in cellular-level analysis. Segmentation of combined structures such as tissue and cells (8 studies), tissue alone (6 studies), and tissue and nuclei (2 studies) was also present, reflecting the complexity and layered structure of histological data. Fewer works addressed cell-only segmentation (1 study), cell and nuclei combined (2 studies), and the segmentation of area, count, and distance between cells (1 study). These findings suggest that nuclei-focused segmentation remains a priority, while more integrative or structural analyses involving multiple tissue components are less commonly pursued.
Together, these results reinforce the role of GANs as versatile tools in histological image segmentation, particularly in tasks that support dataset augmentation and normalization, and in anatomically detailed applications such as nuclei delineation.

Yearly Distribution

The analysis presented in Figure 5 highlights the yearly progression of GAN-based research focused on histological image analysis. From the first identified study in 2018 through early 2025, a total of 41 publications met the selection criteria. The data indicate a growing interest in leveraging GANs for histopathology applications, starting with a single publication in 2018, gradually increasing to 3 in 2019 and 2 in 2020. A more pronounced growth can be observed from 2021 onward, with 6 publications in 2021, 9 in 2022, and peaking at 10 in 2023. Although there was a slight decrease in 2024 (7 publications) and 2025 (3 publications as of the analysis date), the trend overall reflects the expanding role of GANs in supporting histological image processing tasks such as stain normalization, synthetic image generation, and segmentation. This upward trajectory underscores the increasing adoption of GANs as powerful tools to enhance diagnostic workflows, improve image quality, and support data augmentation in histopathological research.

5.3. Datasets

The studies included in this review used a variety of datasets to train and evaluate GAN-based models applied to histological image segmentation. Table 3 provides a comprehensive overview of the datasets employed in histopathological image studies, offering a structured analysis based on tissue type, image size, magnification, annotation, and availability criteria.
First, various tissue types are represented across the datasets, supporting the development of generalizable models. Dataset sizes range from fewer than 100 to hundreds of thousands of images, with resolutions spanning small patches to full whole-slide images (WSIs). Most images were acquired at 20× or 40× magnification, critical for capturing histological detail, though some datasets lack magnification metadata, which may hinder reuse. High-quality annotations are available in many datasets, enabling precise model training, though some (e.g., TCGA, KIRC) lack detailed labels, restricting their use to unsupervised or semi-supervised approaches. Public availability of most datasets promotes reproducibility and algorithm benchmarking, but access to some remains restricted, limiting transparency and broader validation in clinical settings.

5.4. GANs in Histological Image

The application of GANs in digital pathology has expanded rapidly, serving crucial roles in tasks such as data augmentation, stain normalization, virtual staining, synthetic image generation, domain transformation and adaptation, and high-resolution image reconstruction. An extensive review of recent studies reveals the most commonly used models, the specific tasks they address, the experimental settings, and their quantitative performance metrics. Table 4 presents a wide range of applications of GANs in the field of digital histology image processing.

5.4.1. Data Augmentation

Employing GANs for data augmentation enables the generation of histological images with significant morphological diversity, thereby enhancing the reliability and adaptability of segmentation, classification, and detection models [96]. For example, in a study by Kweon et al. [86], PGGAN was utilized on an RTX 3090 GPU to produce synthetic images, achieving FID = 34.96 and IS = 2.91, indicating satisfactory visual fidelity and diversity. Conversely, Hossain et al. [73] utilized CycleGAN but with lower performance metrics (MSE = 5.664, SSIM = 0.204), showcasing the variation in image quality depending on the GAN architecture and implementation specifics.
Noteworthy is the study by Baykal et al. [76] which combined CycleGAN and RRAGAN, achieving superior results with SSIM = 0.933 and PSNR = 24.109, utilizing Google Colab for training. These outcomes underscore how the selection of architecture and computational resources impacts the effectiveness of data augmentation tasks. Apart from enhancing model performance, data augmentation with GANs aids in preventing overfitting, enhancing class balance, and facilitating deep neural network training in resource-constrained settings, thereby serving as both a technological advancement and a solution to systemic challenges in medical data availability. This positions data augmentation as a fundamental application of GANs in the realms of digital pathology and biomedical imaging.
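Because FID is the metric most often reported by these augmentation studies, a hedged sketch of its computation is included below: the Fréchet distance between Gaussians fitted to real and generated feature embeddings. Extraction of the embeddings (typically InceptionV3 activations) is assumed to have been performed beforehand, and the function name is illustrative.

```python
# Sketch of the Fréchet Inception Distance between two sets of feature embeddings.
import numpy as np
from scipy import linalg

def frechet_inception_distance(feats_real: np.ndarray, feats_fake: np.ndarray) -> float:
    # feats_*: (n_samples, n_features) arrays of embedding vectors.
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean, _ = linalg.sqrtm(cov_r @ cov_f, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard small imaginary parts from numerical error
    return float(np.sum((mu_r - mu_f) ** 2) + np.trace(cov_r + cov_f - 2.0 * covmean))
```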

5.4.2. Stain Normalization

Stain normalization has emerged as one of the most critical and frequently addressed applications of GANs in digital pathology. To overcome this challenge, numerous studies have adopted GAN-based approaches that aim to harmonize the visual characteristics of histological slides while preserving the morphological integrity of the tissues [67,68,97]. Among the various architectures, CycleGAN has become one of the most commonly used models due to its ability to perform unpaired image-to-image translation, a crucial advantage when paired samples across different staining domains are unavailable. For instance, Purma et al. [68] applied CycleGAN to normalize slides from diverse datasets, achieving strong visual fidelity as indicated by Fréchet Inception Distance (FID) scores of 15.32 on MoNuSeg and 6.80 on head and neck cancer datasets.
Other notable implementations of CycleGAN include its use by Song et al. [67], who reported an AJI of 0.6570 and PQ of 0.625 in downstream nucleus segmentation, as well as Gadermayr et al. [97], who explored its utility in unsupervised stain adaptation. Additionally, Taieb and Hamarneh [75] pioneered early stain normalization with GANs, achieving Inception Scores (IS) of 0.700 for MITOSIS, 0.950 for COLON, and 0.260 for OVARY datasets using adversarial training on an NVIDIA Titan X GPU.
Table 4. Comparative analysis of GAN models for histological image processing tasks.

| Work | Task | Model | Setting | Performance |
|---|---|---|---|---|
| Azam et al. (2024) [84] | Virtual staining | Pix2Pix-GAN, CUT | - | Pix2Pix: FID = 11.16; CUT: FID = 38.52 |
| Bouteldja et al. (2022) [69] | Data augmentation, stain normalization | CycleGAN | NVIDIA A100 GPU, 10 GB VRAM | - |
| De Bel et al. (2021) [93] | Data augmentation, stain normalization | CycleGAN | - | - |
| Deshpande et al. (2022) [66] | Synthetic image generation | GAN | Single NVIDIA GTX TITAN X GPU, 12 GB RAM | FID = 142.84 |
| Deshpande et al. (2024) [62] | Data augmentation | SynCLay | - | CoNiC: FID = 60.15 ± 1.07, SSIM = 0.86 ± 0.002; PanNuke: FID = 100.80 ± 1.83, SSIM = 0.79 ± 0.02 |
| Du et al. (2025) [70] | Stain normalization | P2P-GAN, DSTGAN | NVIDIA RTX 4090 GPU | TUPAC: SSIM = 0.985, PCC = 0.992, PSNR = 39.130 ± 0.844; MITOS: SSIM = 0.984, PCC = 0.991, PSNR = 37.698 ± 2.635; ICIAR: SSIM = 0.984, PCC = 0.992, PSNR = 38.399 ± 1.750; MICCAI: SSIM = 0.975, PCC = 0.990, PSNR = 34.174 ± 0.892 |
| Falahkheirkhah et al. (2023) [95] | Synthetic image generation | Pix2Pix-GAN | NVIDIA 2080 GPU | Mean = 4.9379, SD = 0.7177 |
| Fan et al. (2024) [65] | Color normalization | CycleGAN | - | - |
| Gadermayr et al. (2019) [97] | Stain normalization | CycleGAN | NVIDIA GTX 1080 Ti GPU | - |
| Guan et al. (2024) [56] | Stain normalization and virtual staining | GramGAN | NVIDIA RTX 3090 GPU, Intel i5-10400 CPU, 16 GB RAM | CSS = 0.714, FID = 52.086, KID = 5.815 |
| He et al. (2022) [61] | Synthetic image generation | cGAN | NVIDIA TITAN RTX GPU, 24 GB VRAM | SSIM = 0.9066 ± 0.065, MSE = 0.0076 ± 0.003 |
| Hou et al. (2020) [87] | Data augmentation | GAN | - | - |
| Hossain et al. (2024) [73] | Data augmentation | CycleGAN | - | MSE = 5.664, PSNR = 10.610, SSIM = 0.204 |
| Hossain et al. (2023) [91] | Data augmentation | CycleGAN | - | SSIM = 0.335, MSE = 5.094, PSNR = 11.111 |
| Hu et al. (2018) [59] | Data augmentation and stain normalization | WGAN-GP | NVIDIA Tesla V100 GPU (32 GB), Intel Xeon CPU, 128 GB RAM | - |
| Juhong et al. (2022) [81] | High-resolution image reconstruction | SRGAN-ResNeXt | NVIDIA RTX 2060 GPU, Intel Core i7-9750H CPU, 16 GB RAM | PSNR = 32.13, SSIM = 0.93, MSE = 2.75 |
| Kablan (2023) [76] | Data augmentation, stain normalization | CycleGAN, RRAGAN | Google Colaboratory, Tesla K80 GPU, 12 GB RAM | FSIM = 0.921, PSNR = 24.109, SSIM = 0.933, MSE = 2.88 × 10^2, MS-SSIM = 0.965, RMSE = 8.3576, ERGAS = 4.29 × 10^3, UQI = 0.987, RASE = 6.18 × 10^2 |
| Kapil et al. (2021) [83] | Data augmentation and domain transformation | CycleGAN, DASGAN | NVIDIA V100 GPU (32 GB) and NVIDIA K80 GPU | - |
| Kweon et al. (2022) [86] | Data augmentation | PGGAN | Intel Core i7-10700 CPU, NVIDIA RTX 3090 GPU, 24 GB RAM | FID = 34.9635, IS = 2.91609 ± 0.0387 |
| Lafarge et al. (2019) [77] | Stain normalization and data augmentation | DANN | - | - |
| Lahiani et al. (2020) [85] | Virtual staining | CycleGAN | NVIDIA Tesla V100 GPU | CWSSIM = 0.880 ± 0.170 |
| Li, Hu and Kak (2023) [63] | Data augmentation | G-SAN | AMD Ryzen 7 5800X CPU, 32 GB RAM, NVIDIA RTX 3090 GPU (24 GB) | - |
| Liu, Wagner and Peng (2022) [57] | Data augmentation | GAN | Tesla V100 GPU, Intel Xeon 6230 CPU, 128 GB RAM | - |
| Lou Wei et al. (2022) [79] | Data augmentation | CSinGAN | NVIDIA Tesla V100 GPU | - |
| Mahapatra and Maji (2023) [90] | Color normalization | LSGAN, TredMiL | - | NMI = 0.800, BiCC = 0.750, WsCC = 0.590 |
| Mahmood et al. (2019) [88] | Stain normalization | cGAN | Intel Xeon E5-2699 v4 CPU, 256 GB RAM, NVIDIA Tesla V100 GPU | - |
| Naglah et al. (2022) [92] | Data augmentation, color normalization, domain transformation | cGAN, CycleGAN | - | cGAN: MMI = 0.26 ± 0.07, NMI = 0.10 ± 0.03, HC = 0.75 ± 0.13, BCD = 0.31 ± 0.06; CycleGAN: MMI = 0.24 ± 0.08, NMI = 0.09 ± 0.03, HC = 0.61 ± 0.26, BCD = 0.35 ± 0.15 |
| Purma et al. (2024) [68] | Stain normalization | CycleGAN | NVIDIA Tesla V100 GPU (16 GB), Intel Xeon E5-2680 v4 CPU | FID: ClaS = 82.44, MoNuSeg = 15.32, HN cancer = 6.80, combined = 7.63 |
| Razavi et al. (2022) [71] | Data augmentation and stain normalization | cGAN, MiNuGAN | NVIDIA 2080 Ti GPU, 11 GB RAM | - |
| Rong et al. (2023) [89] | Color normalization | Restore-GAN | NVIDIA P100 GPU, 16 GB RAM | SSIM = 0.949 ± 0.020, PSNR = 30.678 ± 1.624 |
| Shahini et al. (2025) [78] | Synthetic image generation | ViT-P2P | - | - |
| Song et al. (2023) [67] | Stain normalization | CycleGAN | NVIDIA RTX 3090 GPU (24 GB), Intel Xeon CPU, 64 GB RAM | - |
| Taieb and Hamarneh (2017) [75] | Stain normalization | GAN | NVIDIA Titan X GPU | IS: MITOSIS = 0.700, COLON = 0.950, OVARY = 0.260 |
| Vasiljevic et al. (2021) [94] | Data augmentation | CycleGAN, StarGAN | - | CycleGAN: FID = 5.359; StarGAN: FID = 11.572 |
| Vasiljevic et al. (2023) [72] | Data augmentation and virtual staining | CycleGAN, HistoStarGAN | - | FID = 56.00, SSIM = 0.81 |
| Wang et al. (2021) [82] | Stain normalization | ACCP-GAN | NVIDIA RTX GPU, Intel Xeon Silver 4110 CPU, 128 GB RAM | FSIM = 0.9657, MS-SSIM = 0.9694, PSNR = 27.5058, VIF = 0.9031, FID = 27.1231 |
| Wang et al. (2022) [80] | Data augmentation | GAN | NVIDIA GTX 1070 GPU, 16 GB RAM | - |
| Ye et al. (2025) [60] | Stain normalization | SASSL | - | - |
| Yoon et al. (2024) [74] | Virtual staining | CUT, E-CUT, CycleGAN, E-CycleGAN | NVIDIA RTX 3090 GPU, AMD EPYC 7302 CPU, 346 GB RAM | FID: CycleGAN = 67.76, E-CycleGAN = 61.91, CUT = 54.87, E-CUT = 50.91; KID: CycleGAN = 2.290, E-CycleGAN = 1.631, CUT = 0.642, E-CUT = 0.245 |
| Zhang et al. (2021) [64] | Data augmentation | MASG-GAN | NVIDIA RTX 2080 Ti GPU, Intel i9-10850K CPU, 32 GB RAM | - |
| Zhang et al. (2022) [58] | Color normalization and virtual staining | CSTN | NVIDIA RTX 3090 GPU | FID = 101.37, KID = 6.129, AHR = 1.98 |
Beyond CycleGAN, more advanced architectures have demonstrated improvements in both chromatic coherence and structural fidelity. Du et al. [70] employed Pix2Pix GAN (P2P-GAN) and DSTGAN across datasets such as TUPAC, MITOS, and MICCAI, attaining high performance with SSIM > 0.97 and PSNR values exceeding 34. Similarly, Restore-GAN introduced by Rong et al. (2023) [89] reached SSIM = 0.949 and PSNR = 30.678, validating the efficacy of more customized GAN variants for stain normalization tasks.
Complementary approaches such as WGAN-GP (Hu et al. [59]) and cGAN (Mahmood et al. [88], Razavi et al. [71]) have further diversified the GAN landscape, offering benefits in both stain transfer and data augmentation. For example, Hu et al. leveraged WGAN-GP on high-memory V100 GPUs for combined stain normalization and augmentation, while Mahmood et al. used cGANs with Tesla V100 to handle large-scale stain harmonization pipelines.
Other innovative models such as GramGAN (Guan, Li and Zhang [56]) and ACCP-GAN (Wang et al. [82]) have been proposed to improve visual quality metrics like FSIM, MS-SSIM, and VIF. Meanwhile, hybrid frameworks like RRAGAN and CycleGAN as explored by Kablan [76], and self-supervised methods such as SASSL [60], underscore the evolution of stain normalization into a field marked by methodological diversity and increasing performance benchmarks.
Foundational contributions by Lafarge et al. [77], De Bel et al. [93], and Bouteldja et al. [69] further demonstrate how stain normalization is often coupled with data augmentation to boost training generalizability.

5.4.3. Virtual Staining

Virtual staining is an innovative and rapidly advancing task within computational pathology. GANs are employed to simulate histological staining on unstained tissue images, such as those acquired via fluorescence or autofluorescence microscopy [98]. This approach aims to replace traditional, labor-intensive, and costly staining protocols with automated computational transformations, enabling faster and more cost-effective tissue analysis [99].
A range of GAN architectures has been proposed for this task, each showing varying degrees of success. For example, Yoon et al. [74] used a combination of CycleGAN and E-CUT architectures, reporting an FID of 50.91 and a KID of 0.2451, demonstrating a moderately effective simulation of staining. Similarly, Guan, Li and Zhang [56] introduced GramGAN, obtaining FID = 52.086, KID = 5.815, and CSS = 0.714, suggesting a reasonable approximation of stain distribution with somewhat higher KID values.
A noteworthy contribution is from Zhang et al. [58], who proposed the CSTN model for both stain normalization and virtual staining, but reported a high FID of 101.37, indicating room for improvement in image fidelity and realism. Vasiljevic et al. [72] introduced HistoStarGAN, a domain-adaptive method that generalizes across multiple stain types. Their model reached FID = 56.0 and SSIM = 0.81 across seven domains, indicating solid performance and robustness, although with slightly reduced structural similarity compared to state-of-the-art models.
On the other hand, Lahiani et al. [85] explored the application of CycleGAN for virtual staining on an NVIDIA Tesla V100 GPU. They reported CWSSIM = 0.880 ± 0.170 and SNR = 1.132, indicating strong structural consistency and acceptable signal quality. In a more advanced approach, Azam et al. [84] developed a method combining Pix2Pix and CUT, achieving FID = 11.16 on colorectal datasets and FID = 38.52 on breast cancer data.

5.4.4. Synthetic Image Generation

Synthetic image generation is a strategy that is particularly valuable in pathology, where obtaining annotated images is resource-intensive [100]. Building on this concept, He et al. [61] employed a Conditional GAN (cGAN) model for both image generation and transformation tasks using semantic label maps. Implemented on an NVIDIA TITAN RTX GPU with 24 GB VRAM, their approach achieved strong quantitative results: Structural Similarity Index (SSIM) of 0.9066 ± 0.065 and Mean Squared Error (MSE) of 0.0076 ± 0.003. Shahini et al. [78] introduced a novel Vision Transformer-based model, ViT-P2P, specifically for semantic-to-image generation. The study evaluated image realism using alternative perceptual quality metrics: Natural Image Quality Evaluator (NIQE) = 8.94 ± 1.62, Multi-scale Aesthetic Image Quality Assessment (MANIQA) = 0.51 ± 0.03, and Perceptual Aesthetics Quality Predictor (PAQ2PIQ) = 70.22 ± 1.76.
Deshpande et al. [66] proposed a GAN-based model trained on an NVIDIA GTX TITAN X GPU with 12 GB RAM. Their model, designed for synthetic data augmentation, reported a Fréchet Inception Distance (FID) of 142.84. Falahkheirkhah et al. [95] developed a high-definition image-to-image translation model (Pix2Pix-GAN) for synthetic histological image generation. Running on an NVIDIA 2080 GPU, the model achieved a mean perceptual quality score of 4.9379 with a standard deviation of 0.2692.

5.4.5. Color Normalization

Color normalization standardizes image appearance and significantly enhances downstream analytical outcomes, as shown by Fan et al. [65]. In that study, CycleGAN was applied for color normalization, and the reported metrics were directly linked to segmentation accuracy: PQ = 0.5527 and AJI = 0.5797.
In a broader application, Naglah et al. [92] compared cGAN and CycleGAN for color normalization, domain transformation, and data augmentation. The results showed that cGAN achieved Mutual Information (MI) = 0.26 ± 0.07, Normalized MI (NMI) = 0.10 ± 0.03, Hierarchical Clustering (HC) = 0.75 ± 0.13, and Bhattacharyya Distance (BCD) = 0.31 ± 0.06. In contrast, CycleGAN yielded MI = 0.24 ± 0.08, NMI = 0.09 ± 0.03, HC = 0.61 ± 0.26, and BCD = 0.35 ± 0.15, suggesting cGAN slightly outperformed CycleGAN in this context for preserving domain-specific features.
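For context, mutual information quantifies how much of the intensity structure shared by the original and normalized images is retained: MI(X; Y) = Σ_{x,y} p(x, y) · log[p(x, y) / (p(x) p(y))], computed over the joint intensity histogram, and NMI rescales MI by the marginal entropies (one common choice being MI / sqrt(H(X) · H(Y))). The specific normalization used in [92] is not restated here, so this should be read as a general definition rather than the authors' exact formulation.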
Moving toward more advanced models, Zhang et al. [58] developed a Cross-Stain Transfer Network (CSTN) for both color normalization and virtual staining. Implemented on an NVIDIA RTX 3090 GPU, CSTN produced an FID of 101.37, Kernel Inception Distance (KID) of 6.129, and Average Hue Ratio (AHR) of 1.98, reflecting moderate visual alignment between synthetic and reference staining styles.
Likewise, Rong et al. [89] introduced Restore-GAN, which achieved outstanding quantitative performance with SSIM = 0.949 ± 0.020 and PSNR = 30.678 ± 1.624, confirming its effectiveness in preserving morphological structure while correcting stain variability.
Further contributing to this field, Mahapatra and Maji [90] proposed a combined LSGAN and TredMiL approach, obtaining NMI = 0.800, BCC = 0.750, and WsCC = 0.590, reinforcing the feasibility of integrating adversarial learning with structural constraints to ensure consistent chromatic representation.

5.4.6. Domain Transformation and High-Resolution Image Reconstruction

Domain adaptation and high-resolution image reconstruction are essential for improving cross-domain generalization, as they align heterogeneous data distributions and recover fine structural details. Kapil et al. [83] explored domain adaptation using the CycleGAN and DASGAN architectures. Supported by powerful hardware, including NVIDIA V100 (32 GB) and K80 GPUs, their approach demonstrated the feasibility of unsupervised domain adaptation for harmonizing histopathological data from diverse sources. However, specific quantitative metrics were not reported, limiting the objective assessment of the model’s performance.
Complementing this, Naglah et al. [92] performed a comprehensive comparison between cGAN and CycleGAN for domain transformation alongside color normalization and data augmentation. Their results indicated that cGAN marginally outperformed CycleGAN concerning mutual information (MI = 0.26 ± 0.07 vs. 0.24 ± 0.08), normalized mutual information (NMI = 0.10 ± 0.03 vs. 0.09 ± 0.03), hierarchical clustering (HC = 0.75 ± 0.13 vs. 0.61 ± 0.26), and Bhattacharyya distance (BCD = 0.31 ± 0.06 vs. 0.35 ± 0.15).
In high-resolution image reconstruction, Juhong et al. [81] proposed a super-resolution approach combining SRGAN with a ResNeXt backbone to enhance histopathological image quality. Executed on a system with an NVIDIA RTX 2060 GPU and an Intel Core i7-9750H CPU, their model achieved a Peak Signal-to-Noise Ratio (PSNR) of 32.13, a Structural Similarity Index Measure (SSIM) of 0.93, and a Mean Squared Error (MSE) of 2.75.
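As a reminder of how these reconstruction metrics relate, PSNR is derived directly from the MSE: PSNR = 10 · log_10(MAX_I^2 / MSE), where MAX_I is the maximum possible pixel value (e.g., 255 for 8-bit images), so reducing the reconstruction MSE raises the PSNR in decibels; the exact pixel scale used in [81] is not specified here.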

5.5. Segmentation Task in Histological Image Analysis

Segmentation in histopathological images is a crucial step for both quantitative and qualitative analysis of biological tissues, extensively applied in detecting cell nuclei, tissue structures, and morphometric analysis [101]. Numerous studies have proposed deep learning architectures and approaches to perform these tasks, utilizing different models and learning strategies, including supervised, unsupervised, semi-supervised, and self-supervised learning. In the following subsections, GAN-based segmentation approaches are systematically categorized according to the specific segmentation tasks, such as nuclei, cells, and tissues. A comparative analysis of segmentation tasks in histological image analysis is summarized in Table 5, emphasizing the models used, the diversity of learning strategies, and the reported performance. Moreover, to promote transparency and reproducibility, we include information on code availability as reported in the original studies. For each reviewed article, we identified whether the authors had shared their resources through public platforms. This information is summarized in a dedicated column (Data available) in Table 5, enabling readers to quickly assess the reproducibility potential of each work.

5.5.1. Nuclei Segmentation

Nuclei segmentation is the most common task among the reviewed studies, as evidenced by the predominance of models specifically applied to this cellular structure. Hou et al. [87] developed an analysis pipeline that segments nuclei in whole-slide tissue images from multiple cancer types, utilizing a multi-level quality control process (WSI level and image patch level) to evaluate the quality of segmentation results. In Hossain et al. [73], an application based on deep learning networks was developed: the system performs image preprocessing and generates synthetic images with a CycleGAN to address the scarcity of pathological data. The generated images are fed into CNN, Mask R-CNN, and modified U-Net networks to segment nuclear regions and separate overlapping kidney nuclei.
The study by Zhang et al. [40] addresses the challenging task of nuclei segmentation in histopathological images by proposing a Cross-Boosted Multi-Target Domain Adaptation (CB-MTDA) framework. This method tackles the significant issue of domain variability, which often obstructs the generalization of segmentation models across different staining protocols and imaging modalities.
Unsupervised domain adaptation (UDA) methods have been proposed by Fan et al. [65] to mitigate the distributional gap between imaging modalities for unsupervised nuclei segmentation in histopathology images. Moreover, a dual-branch nucleus shape and structure-preserving module was proposed to prevent nucleus over-generation and deformation in the synthesized images of liver, kidney, and colon cancer. Li et al. [63] introduced a Generative Stain Augmentation Network (G-SAN) for stain augmentation of H&E-stained histological images. G-SAN can augment an input cell image with selected and realistic stains by disentangling the morphological and stain-related representations. Through the downstream tasks of patch classification and nucleus segmentation, the authors quantitatively demonstrated that the quality of G-SAN-augmented images surpasses that of images produced by existing stain augmentation approaches.
GANs are also widely employed as supportive tools in histopathological image analysis. For instance, BenTaieb and Hamarneh [75] leveraged GANs to perform stain normalization via adversarial stain transfer. This pre-processing step mitigates variability in staining protocols and improves the consistency of nuclei segmentation algorithms. Similarly, Mahmood et al. [88] introduced a deep adversarial training framework specifically designed for multi-organ nuclei segmentation. Their architecture employs a segmentation network guided by a discriminator that evaluates the morphological realism of the segmented masks. This adversarial learning paradigm improved generalization across different tissue types and staining variations.
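To make the adversarial segmentation paradigm concrete, the sketch below pairs a segmentation network with a discriminator that judges whether an (image, mask) pair looks plausible, in the spirit of the approach of Mahmood et al. [88]. It is a minimal PyTorch illustration only: the tiny stand-in networks, the 0.1 adversarial weight, and the optimizer settings are assumptions for readability and do not reproduce the authors' architecture.

```python
# Minimal sketch of adversarial training for nuclei segmentation:
# a discriminator scores (image, mask) pairs, and its feedback is added
# to the pixel-wise segmentation loss. Networks here are placeholders.
import torch
import torch.nn as nn

segmenter = nn.Sequential(                      # stand-in for a U-Net-style model
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 1), nn.Sigmoid())
discriminator = nn.Sequential(                  # sees image (3 ch) + mask (1 ch)
    nn.Conv2d(4, 16, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(16, 1, 3, stride=2, padding=1))

adv_loss = nn.BCEWithLogitsLoss()
pix_loss = nn.BCELoss()
opt_s = torch.optim.Adam(segmenter.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)

def train_step(image, true_mask):
    # 1) Discriminator update: real (image, ground-truth mask) pairs vs.
    #    fake (image, predicted mask) pairs.
    pred_mask = segmenter(image).detach()
    real_logit = discriminator(torch.cat([image, true_mask], dim=1))
    fake_logit = discriminator(torch.cat([image, pred_mask], dim=1))
    d_loss = adv_loss(real_logit, torch.ones_like(real_logit)) + \
             adv_loss(fake_logit, torch.zeros_like(fake_logit))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Segmenter update: pixel-wise loss plus an adversarial term that
    #    pushes predictions toward morphologically plausible masks.
    pred_mask = segmenter(image)
    adv_logit = discriminator(torch.cat([image, pred_mask], dim=1))
    s_loss = pix_loss(pred_mask, true_mask) + \
             0.1 * adv_loss(adv_logit, torch.ones_like(adv_logit))
    opt_s.zero_grad(); s_loss.backward(); opt_s.step()
    return d_loss.item(), s_loss.item()

# Dummy batch: two 64x64 RGB patches with binary nuclei masks.
image = torch.rand(2, 3, 64, 64)
mask = (torch.rand(2, 1, 64, 64) > 0.5).float()
print(train_step(image, mask))
```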
Song et al. [67] proposed a self-supervised pretraining method using CycleGANs to translate histological images between unpaired domains; the framework aims to capture nuclear morphology and information distribution by translating between histopathological images and pseudo-masked images. While the approaches above rely on supervised learning, other studies explore unsupervised and semi-supervised paradigms. Hu et al. [59] demonstrated the use of unified GANs with a novel loss formulation to perform robust cell-level visual representation learning in an unsupervised setting; the model is not only label-free and easily trained but also capable of unsupervised segmentation of bone marrow cellular components. Lou et al. [79] explored label-efficient nuclei segmentation by choosing only a few image patches to annotate, augmenting the training set from the selected samples, and achieving nuclei segmentation in a semi-supervised manner.
Zhang et al. [64] presented MASG-GAN, a novel architecture that combines multi-view attention and superpixel-guided mechanisms. This framework not only performs nuclei segmentation but also classifies tissue types, leveraging shared features for both tasks. The model capitalizes on the semantic interdependencies between segmentation and classification, enhancing overall performance in complex tissue environments. Building upon this, Wang et al. [80] proposed a multi-task GAN architecture that simultaneously learns nuclei segmentation and image reconstruction. Their model enhances spatial coherence and focuses on the most informative regions by integrating dual attention mechanisms and recurrent convolutional modules.
Data augmentation and synthetic data generation represent other important applications of GANs. Hossain et al. [91] proposed generating annotated synthetic data from small labeled samples. Synthetic backgrounds are created using base image patches, with GANs preserving texture characteristics. Nuclear shapes are extracted from original images using ground truth masks and then transformed with image processing techniques. These shapes are placed on filtered backgrounds, maintaining nuclear morphology and distribution. The synthetic image patches, along with their corresponding masks, are used to train a modified U-Net for nuclear region segmentation. Lafarge et al. [77] conducted a comparative analysis involving stain normalization and data augmentation across two distinct tasks: generalization to images obtained in unseen pathology laboratories for mitosis detection and generalization to unseen organs for nuclei segmentation.
GANs have also been used to restore image quality before segmentation. Rong et al. [89] developed Restore-GAN, which improves image quality by restoring blurred regions, enhancing low resolution, and normalizing colors; when the Mask R-CNN segmentation model was applied to these restored images, its performance improved significantly. Similarly, Juhong et al. [81] demonstrated a super-resolution generative adversarial network based on aggregated residual transformations (SRGAN-ResNeXt), which combines generator and discriminator networks to enhance low-resolution hematoxylin and eosin (H&E)-stained breast cancer histopathological images and to characterize both cells and nuclei, facilitating cancer diagnosis in low-resource settings.
Shahini et al. [78] proposed SENSE (SEmantic Nuclear Synthesis Emulator), a novel framework for synthesizing realistic histological images with precise control over cellular distributions. The approach introduces three main innovations: a statistical modeling system that captures class-specific nuclear features from specialized annotations, a hybrid ViT-Pix2Pix GAN architecture that effectively translates semantic maps into high-fidelity histological images, and a modular design that allows independent control of cellular properties, including type, count, and spatial distribution.
Mahapatra and Maji [90] presented a new deep generative model (LSGAN) to reduce the color variation found among histological images. The proposed model assumes that the latent color appearance information, extracted through a color appearance encoder, and the constraint information, extracted via a patch density encoder, are independent of one another. The research demonstrated that the performance of nuclei segmentation improves on the image patch after color normalization compared to the image patch before normalization.
Style augmentation using multi-modal microscopy data is the focus of [57]. The authors developed a technique that generates diverse training samples by simulating different imaging modalities. The concept of style augmentation is strongly connected to image-to-image translation models, such as the use of a GAN, which can simulate various staining conditions. These augmentations assist nuclei segmentation models in generalizing across datasets.
Finally, Azam et al. [84] evaluated how the cell-label derivation strategy affects the performance of H&E-based models applied to lung cancer tissue. Specifically, two virtual staining models based on the Pix2Pix (P2P-GAN) architecture were compared: one trained with cell labels obtained from the same tissue section and another trained using cell labels from an adjacent tissue section. The H&E image was then converted into a nuclear mask through nuclear segmentation using the H&E-trained StarDist model. The cells identified in the mIHC image were matched to the nearest nuclei in the generated H&E image.

5.5.2. Cell Segmentation

Cell segmentation is essential for quantitatively analyzing cell populations and tumor characteristics in histopathology. Studies such as Hu et al. [59] take an essential early step in using GANs for segmenting cells in histopathology images. Their approach is particularly noteworthy for being unsupervised; by leveraging GANs, they generate realistic-looking images that assist the model in learning cell-level features without relying on extensive manual annotations. This represents a significant contribution, as annotating histological data can be time-consuming and costly. However, while their work concentrates on learning visual representations and extracting cell-level features, the segmentation accuracy is limited since the model primarily aims to capture cellular structures rather than produce precise segmentation masks. Similarly, Guan et al. [56] introduced a more advanced unsupervised framework that emphasizes stain style transfer using a multi-domain progressive GAN architecture. While their primary goal is stain normalization and harmonization across datasets, they also implicitly address segmentation challenges by enhancing the consistency in the visual appearance of histological images, which is crucial for robust cell segmentation.
On the supervised side, Deshpande et al. [62] introduced SynCLay, a novel interactive synthesis framework that enables users to create synthetic histology images from customized cellular layouts. Although their focus is on image generation rather than direct segmentation, the ability to generate realistic cellular structures proves highly valuable for training segmentation models. By simulating diverse and complex cellular arrangements, SynCLay generates augmented datasets that help segmentation models learn to identify and delineate cells in challenging situations, such as overlapping or morphologically varied nuclei.

5.5.3. Tissue Segmentation

The use of GANs for segmenting and processing histological tissue images has encompassed diverse approaches, each addressing unique challenges in digital pathology. Wang et al. [82] introduce a novel end-to-end framework named the automatic consecutive context perceived transformer GAN (ACCP-GAN) for fully automatic blind inpainting of serial sectioning images. The first-stage network (auto-detection module) detects and roughly repairs broken areas and then guides the second-stage network (refined inpainting module) to generate the expected patches precisely, thereby integrating the segmentation and restoration parts. A transformer module (SPTransformer) based on a self-attention mechanism enables the refined inpainting module to focus on features from neighboring images, helping to correct the inpainting results.
De Bel et al. [93] propose a Residual CycleGAN for domain adaptation across different histology stains. This work is particularly valuable for segmentation tasks as it tackles the common issue of stain variability, ensuring that segmentation models trained on one stain generalize more effectively across datasets. By transforming tissue images between domains while preserving underlying structures, their method enables robust segmentation across diverse staining protocols.
Ye et al. [60] introduced a novel stain-adaptive self-supervised (SASSL) learning method framework for histopathological image analysis. The method focuses on harmonizing stain variations, which, like de Bel et al. [93], indirectly strengthens segmentation pipelines by improving data consistency without requiring manual labels.
Lahiani et al. [85] take a creative angle with a perceptual embedding consistency-driven GAN for generating seamless virtual whole-slide images (WSIs). This approach enables the synthesis of large-scale, realistic tissue images while maintaining fine structural details. Although their focus is image synthesis, the ability to generate high-fidelity tissue representations provides valuable data augmentation for segmentation tasks, particularly in rare or complex tissue structures.
Supervised learning is key for segmentation in histology, and several studies use GANs to enhance data quality. He et al. [61] present a UNet+/seg-cGAN model for translating label-free images into histology-like images, facilitating visual reviews by pathologists. Meanwhile, Falahkheirkhah et al. [95] explore the generation of deepfake histological images using a Pix2Pix-GAN model. The work highlights the potential of synthetic data to augment and diversify training datasets for segmentation models in prostate cancer histological images using simple semantic labeling.

5.5.4. Multiclass Segmentation: Tissues + Nuclei or Cells

Some studies have sought to integrate the segmentation of multiple structures simultaneously, aiming for a more comprehensive understanding of the tissue microenvironment. Kweon et al. [86] propose a supervised GAN model with an integrated segmentation module for generating oligodendroglioma images, effectively incorporating segmentation into the synthesis pipeline. Yoon et al. [74] present a multi-task deep learning framework for virtual staining, segmentation, and classification of label-free photoacoustic images, demonstrating the potential of multimodal learning. Gadermayr et al. [97] introduce a GAN-based approach for stain-independent segmentation, facilitating generalization across domains. Purma et al. [68] leverage generative diffusion models with self-supervision for histology segmentation, highlighting progress in self-supervised learning for segmentation tasks.
Other studies focus on specific challenges: Naglah et al. [92] use cGANs for fibrosis detection and quantification. Baykal Kablan [76] proposes a region-aware GAN architecture for stain normalization, addressing key issues for segmentation consistency. Vasiljevic et al. [72] introduce HistoStarGAN, a unified framework for stain normalization, stain transfer, and stain-invariant segmentation in renal histopathology, while Vasiljevic et al. [94] focus on domain generalization through unsupervised domain augmentation with CycleGAN.
Du et al. [70] present a deeply supervised two-stage GAN for stain normalization, while Bouteldja et al. [69] use CycleGAN-based augmentation to tackle stain variability. Razavi et al. [71] developed MiNuGAN, a conditional GAN model for simultaneous mitosis and nuclei segmentation on multi-center breast H&E images, demonstrating GANs' versatility for specific segmentation tasks. Deshpande et al. [66] introduce SAFRON, a synthesis framework for colorectal cancer histology images, which also supports segmentation models. Lastly, Kapil et al. [83] apply domain adaptation techniques for automated tumor cell scoring and survival analysis on PD-L1 stained tissue images, linking segmentation with clinical outcome analysis.
The field of histopathological image segmentation is advancing towards increasingly generalizable, efficient, and robust solutions, with potential direct impact on clinical applications and biomedical research.

5.6. Overview of Evaluation Metrics

Table 6 provides a comprehensive overview of the most commonly used evaluation metrics for GANs and segmentation models, particularly in the context of histopathological image analysis. What immediately stands out is that these metrics are not applied generically; they are directly aligned with the distinct goals of each model type. For GANs, the focus lies in the visual fidelity of generated images, aiming to make them as realistic as possible. In contrast, segmentation models are evaluated based on their precision in identifying and delineating specific structures, such as cell nuclei, which are essential for accurate pathological assessment.
Among the metrics used for evaluating GANs, the Fréchet Inception Distance (FID) emerges as one of the most prevalent, cited in numerous studies [56,58,62,66,68,74,82,86,94]. Another widely adopted metric is the Structural Similarity Index (SSIM), found in works such as [73,76,81], which assesses structural fidelity, an important factor for ensuring that fine-grained histological details are preserved.
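For reference, FID compares the two image sets in the feature space of a pretrained Inception network, modeling each set as a multivariate Gaussian: FID = ‖μ_r − μ_g‖^2 + Tr(Σ_r + Σ_g − 2(Σ_r Σ_g)^{1/2}), where (μ_r, Σ_r) and (μ_g, Σ_g) are the feature means and covariances of the real and generated sets; lower values indicate more similar distributions.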
Other metrics, such as Peak Signal-to-Noise Ratio (PSNR) and Mean Squared Error (MSE), continue to be used due to their simplicity, though they are less effective in capturing perceptual quality. More specialized metrics, including Between-image Color Constancy (BiCC) and Within-set Color Constancy (WsCC), appear in studies like [76,90] and address color consistency, a crucial consideration in histological images, where subtle color differences can significantly impact interpretation.
On the segmentation side, the Dice Similarity Coefficient (DSC) stands out as the leading metric, referenced in more than twenty papers (e.g., [58,65,67,81,87]), underscoring its importance for quantifying the overlap between predicted and annotated segmentations. This metric is particularly vital in nucleus segmentation tasks, where even minor pixel-level inaccuracies can affect diagnostic outcomes. Complementary metrics such as the F1-score, precision, and recall/sensitivity are also common [82,90,97], offering a deeper understanding of a model's ability to handle false positives and false negatives, critical aspects in clinical applications.
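In confusion-matrix terms, DSC = 2·TP / (2·TP + FP + FN) = 2|A ∩ B| / (|A| + |B|) for predicted and reference masks A and B, which coincides with the F1-score on binary masks, while precision = TP / (TP + FP) and recall = TP / (TP + FN) expose the two error types separately.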
The Intersection over Union (IoU), used in works such as [57,68,81], provides a strong measure of segmentation accuracy by comparing the predicted and ground-truth masks. Metrics such as Pixel Accuracy (PA) and the Jaccard Index (JI), the latter being mathematically equivalent to IoU, are also employed to offer broader performance insights. Additionally, spatial metrics like the Hausdorff Distance (HD), which measures the greatest distance from a point on one segmentation boundary to the closest point on the other, are important when the shape and positioning of segmented nuclei have clinical implications.
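Formally, IoU (the Jaccard index) is |A ∩ B| / |A ∪ B| and relates monotonically to Dice through IoU = DSC / (2 − DSC), whereas the Hausdorff distance HD(A, B) = max{ sup_{a ∈ ∂A} inf_{b ∈ ∂B} d(a, b), sup_{b ∈ ∂B} inf_{a ∈ ∂A} d(a, b) } captures the worst-case deviation between the two segmentation boundaries ∂A and ∂B.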

5.7. Study Summary

Table 7 presents a summary that compares key details of each study.

6. Discussion

This systematic review investigated the growing application of GANs in histological image segmentation, aiming to answer a set of research questions ranging from the types of tissues and datasets used to the model architectures and evaluation metrics employed. The findings underscore the versatility of GANs in addressing the inherent challenges of histopathological image analysis, such as staining variability, data scarcity, and the need for accurate segmentation of complex structures.
In addressing the research questions concerning histological tissues and datasets (RQ1 and RQ2), the analysis reveals a wide diversity in the types of histological tissues used, reflecting the medical research focus on pathologies related to vital organ systems. Organs such as the kidney, breast, colon, lung, and prostate are frequently represented, highlighting their high clinical relevance. In addition, less commonly studied tissues, including the thymus, testis, pleura, mediastinum, and adrenal gland, have also been investigated, indicating an effort to develop more generalizable models. Datasets such as BBBC-data, PanNuke, and MoNuSeg, which contain samples from multiple organs, are particularly valuable for training models with broader generalization capabilities.
Regarding the datasets, a significant variation in size was observed, ranging from small collections of 30 to 100 images (e.g., CryoNuSeg or in-house datasets) to large-scale repositories containing tens or even hundreds of thousands of images (e.g., TCGA, KIRC, Kather). Image dimensions vary widely, from small patches (e.g., 256 × 256 pixels) to whole slide images (WSIs), which can exceed 100,000 × 100,000 pixels. Image magnification, most commonly 20× or 40×, is a critical factor that directly affects resolution and the level of detail available, thereby impacting segmentation outcomes. Most datasets provide detailed annotations, which are essential for supervised learning tasks. However, some (e.g., KIRC, TCGA, MUC1) are released without comprehensive labeling, limiting their applicability to unsupervised or semi-supervised approaches or requiring additional, resource-intensive annotation efforts. The public availability of most datasets is a significant strength, fostering scientific reproducibility and advancing AI research in pathology, although access to specific datasets remains restricted.
Regarding the research questions on the objectives of applying GANs and the architectures used (RQ3 and RQ4), the studies analyzed reveal various purposes and technical approaches, highlighting the versatility of these networks. The most prevalent application of GANs identified in the review is data augmentation, reported in 20 studies. In this context, GAN architectures such as CycleGAN, PGGAN, cGAN, and WGAN-GP synthesize realistic histological images, enriching training datasets and enhancing the robustness and adaptability of segmentation models.
The second most frequent application is stain normalization, essential for mitigating the bias introduced by staining variability. Models like CycleGAN, Pix2Pix GAN (P2P-GAN), DSTGAN, Restore-GAN, WGAN-GP, cGAN, and GramGAN have been widely used to harmonize the visual appearance of images while preserving their underlying morphological structure. Following this, color normalization, although overlapping with stain normalization, aims explicitly to adjust color variations caused by different scanners and acquisition settings, directly influencing analytical consistency and segmentation accuracy [102,103]. Another important application is virtual staining, which simulates histological staining on unstained images, enabling faster and more cost-effective analyses while preserving the original tissue samples [104]. Architectures such as CycleGAN, E-CUT, GramGAN, CSTN, HistoStarGAN, Pix2Pix, and CUT have been employed for this purpose.
Synthetic image generation, which involves creating artificial histological images from semantic maps or structural sketches, is another valuable application, especially in contexts where annotation is resource-intensive [105]. cGAN and ViT-P2P are examples of models applied to this task. Lastly, domain transformation and high-resolution image reconstruction play a key role in overcoming dataset heterogeneity and improving image quality, facilitating cross-domain generalization and restoring fine structural details [106]. Notable models in these areas include CycleGAN and SRGAN-ResNeXt, respectively.
The most frequently used GAN architectures in histological image processing include CycleGAN, which emerged as the most prominent model, especially for stain normalization and data augmentation, due to its ability to perform image-to-image translation without requiring paired data. Other prominent architectures include cGAN, Pix2Pix, PGGAN, WGAN, and their variants, each tailored to address specific domain challenges. More recent and specialized models like DST-GAN, HistoStarGAN, and ViT-P2P indicate a trend toward customized GAN architectures designed for specific histopathological tasks.
In addressing the research questions regarding the objectives of segmentation and the segmentation architectures employed (RQ6 and RQ7), this review shows that the primary goal of segmentation in histopathological images is to identify and delineate specific structures—such as cell nuclei, glands, collagen fibers, or tumor regions—in order to extract quantitative information that complements qualitative analysis. The most common segmentation task is nuclear segmentation, reported in 20 studies, due to the clinical relevance of nuclear morphology in cancer diagnosis and prognosis [107]. Models such as U-Net, Mask R-CNN, and their variants are frequently used, often enhanced by GAN-based techniques for data augmentation, stain normalization, or domain adaptation.
Another relevant task is cell segmentation, essential for quantitatively analyzing cell populations and tumor features. Both supervised and unsupervised approaches are explored, often leveraging GANs for synthetic data generation or style enhancement. Tissue segmentation, which targets larger and more complex regions, addresses challenges such as staining variability and enhances model generalization. Common architectures used include U-Net, ResNet, and SegNet.
In multiclass segmentation (e.g., tissues + nuclei or cells), some studies simultaneously segment multiple structures to provide a more comprehensive understanding of the tissue microenvironment, often using U-Net-based models and integrating GANs for data generation or enhancement. Due to their effectiveness in medical image segmentation, the dominant segmentation architectures remain U-Net and Mask R-CNN, particularly under data-limited scenarios. Other architectures such as CNN, ResNet, HoVer-Net, and SegNet are also mentioned and frequently integrated with or supported by GANs.
In the research questions addressing GAN performance evaluation and segmentation quality assessment (RQ5 and RQ8), a distinct set of metrics is employed, each aligned with the specific goals of the respective model types. For GANs, the primary focus lies in visual fidelity and the statistical similarity between generated images and real data. The Fréchet Inception Distance (FID) is one of the most commonly used metrics, as it measures the distance between feature distributions extracted from real and synthetic images, indicating how convincingly realistic the generated images are. The Structural Similarity Index (SSIM) evaluates structural fidelity, which is essential for preserving fine histological details. Additional metrics such as Peak Signal-to-Noise Ratio (PSNR) and Mean Squared Error (MSE) are also employed, along with more specialized measures for color consistency, such as Between-image Color Constancy (BiCC) and Within-set Color Constancy (WsCC).
For segmentation models, the evaluation prioritizes the accuracy in identifying and delineating structures. The Dice Similarity Coefficient (DSC) is the primary metric, quantifying the overlap between predicted and annotated segmentations, and is particularly critical in tasks involving nuclear segmentation. Complementary metrics such as F1-score, precision, and recall are commonly used to assess the model's ability to handle false positives and false negatives. The Intersection over Union (IoU) offers a robust measure of segmentation precision by comparing predicted and ground truth masks. Additionally, spatial metrics like the Hausdorff Distance (HD), which measures the greatest boundary deviation between the predicted and reference masks, are essential when the shape and positioning of segmented nuclei have clinical implications. While the metrics used for GANs and segmentation focus on different aspects, there is an increasing need to consider them in a complementary fashion, particularly in hybrid approaches where synthetic images are used to improve the training of segmentation models or where segmentation maps guide image generation.
The growing interest in GAN-based approaches is primarily driven by their ability to synthesize realistic images, address data scarcity issues, and facilitate tasks such as stain normalization, data augmentation, and domain adaptation. Several studies have demonstrated the potential of GANs in improving segmentation accuracy, especially in challenging scenarios involving small datasets, class imbalance, and variations in staining protocols. These contributions have enabled progress in key segmentation tasks, including nuclei, cellular, and tissue segmentation, ultimately supporting downstream applications such as tumor detection, quantification of cellular structures, and prognostic assessments.
Current research trends reveal a strong focus on developing hybrid models that integrate GANs with other architectures, such as U-Net, attention mechanisms, and transformers, to enhance the accuracy and generalization of segmentation models. Moreover, there is an increasing interest in exploring self-supervised and weakly supervised learning strategies, aiming to reduce the dependency on costly pixel-level annotations. The integration of multi-modal data, such as combining histological images with clinical or genomic information, is also emerging as a promising avenue to support holistic analysis and improve clinical decision-making.
Besides exploring technical and methodological aspects, it is important to consider the ethical and clinical implications of using GAN-based approaches in histopathology. Creating synthetic data, while helpful for training models and improving generalization, also risks introducing artifacts that may not match real biological structures, which could lead to misinterpretations or diagnostic mistakes. Therefore, thorough validation of these models in clinical settings is essential to guarantee reliability and patient safety. Additionally, incorporating GAN-based tools into clinical workflows involves navigating complex regulatory frameworks that often lag behind technological progress. Addressing these issues requires not only technological innovation but also collaboration across disciplines with pathologists, ethicists, and regulatory agencies to ensure AI solutions meet healthcare standards and ethical responsibilities.

7. Conclusions

This review offers an original perspective by systematically categorizing the use of GANs in histological image segmentation according to their specific applications. The studies analyzed demonstrate the versatility of GANs in handling challenges such as stain variability, multi-task segmentation, and data scarcity, which are critical factors in histopathological analysis. While promising, the field still faces substantial challenges, including the need for standardized datasets, robust evaluation metrics, and improved generalization across diverse tissues and conditions. Continued research and collaboration across disciplines will be essential to fully harness the potential of GANs and drive impactful advancements in digital pathology. By bridging the gap between deep learning innovation and clinical application, GANs promise to transform how we analyze and interpret histological data, ultimately contributing to more accurate, efficient, and accessible diagnostics in the future.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/app15147802/s1, Supplementary File S1: PRISMA 2020 Checklist; Supplementary File S2: Data extraction and synthesis-Generative adversarial network in histological image segmentation: A systematic literature review.

Author Contributions

Conceptualization, Y.L.K.F.C. and A.F.M.S.; methodology, Y.L.K.F.C. and A.F.M.S.; validation, Y.L.K.F.C. and D.G.C.; formal analysis, Y.L.K.F.C., D.G.C., and E.E.C.S.; investigation, Y.L.K.F.C.; writing—original draft preparation, Y.L.K.F.C., A.F.M.S., E.E.C.S., and D.G.C.; writing—review and editing, Y.L.K.F.C., D.G.C., A.F.M.S., and E.E.C.S.; supervision, E.E.C.S. and D.G.C.; funding acquisition, E.E.C.S. and D.G.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Brazilian Government funding agencies Coordination for the Improvement of Higher Education Personnel (CAPES) under Grant 88881.982645/2004-01, Foundation for Research Support and Scientific and Technological Development of Maranhão (FAPEMA), Foundation for Research Support of Piauí (FAPEPI) and Foundation of the State University of Piauí (UESPI).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All relevant data are within this manuscript and its Supplementary Files.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
CNN: Convolutional Neural Network
GAN: Generative Adversarial Network
WSI: Whole Slide Image
CUT: Contrastive Unpaired Translation
LSGAN: Least Squares Generative Adversarial Network
DASGAN: Domain Adaptation and Stain GAN
cGAN: Conditional Generative Adversarial Network
SRGAN: Super-Resolution Generative Adversarial Network
DANN: Domain-Adversarial Neural Network
WGAN: Wasserstein Generative Adversarial Network
ACCP-GAN: Automatic Consecutive Context Perceived Transformer GAN
RRAGAN: Residual and Recurrent Attention GAN
SASSL: Stain-Adaptive Self-Supervised Learning
E-CUT: Enhanced Contrastive Unpaired Translation
PGGAN: Progressive Growing of GANs
G-SAN: Generative Stain Augmentation Network
CSTN: Cycle-consistent Stain Transfer Network
UDA: Unsupervised Domain Adaptation
CB-MTDA: Cross-Boosted Multi-Target Domain Adaptation
H&E: Hematoxylin and Eosin

References

  1. Mezei, T.; Kolcsár, M.; Joó, A.; Gurzu, S. Image Analysis in Histopathology and Cytopathology: From Early Days to Current Perspectives. J. Imaging 2024, 10, 252. [Google Scholar] [CrossRef] [PubMed]
  2. Ali, M.; Benfante, V.; Basirinia, G.; Alongi, P.; Sperandeo, A.; Quattrocchi, A.; Giannone, A.G.; Cabibi, D.; Yezzi, A.; Raimondo, D.D.; et al. Applications of Artificial Intelligence, Deep Learning, and Machine Learning to Support the Analysis of Microscopic Images of Cells and Tissues. J. Imaging 2025, 11, 59. [Google Scholar] [CrossRef] [PubMed]
  3. Chan, J. The Wonderful Colors of the Hematoxylin–Eosin Stain in Diagnostic Surgical Pathology. Int. J. Surg. Pathol. 2014, 22, 12–32. [Google Scholar] [CrossRef] [PubMed]
  4. Hoque, M.Z.; Keskinarkaus, A.; Nyberg, P.; Seppänen, T. Stain normalization methods for histopathology image analysis: A comprehensive review and experimental comparison. Inf. Fusion 2024, 102, 101997. [Google Scholar] [CrossRef]
  5. Chen, Y.; Yang, X.H.; Wei, Z.; Heidari, A.A.; Zheng, N.; Li, Z.; Chen, H.; Hu, H.; Zhou, Q.; Guan, Q. Generative Adversarial Networks in Medical Image augmentation: A review. Comput. Biol. Med. 2022, 144, 105382. [Google Scholar] [CrossRef]
  6. Hou, L.; Samaras, D.; Kurc, T.M.; Gao, Y.; Davis, J.E.; Saltz, J.H. Robust Histopathology Image Analysis: To Label or to Synthesize? In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 8533–8542. [Google Scholar]
  7. Vo, T.; Khan, N. Edge-preserving image synthesis for unsupervised domain adaptation in medical image segmentation. In Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, UK, 11–15 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 3753–3757. [Google Scholar]
  8. Qu, H.; Liu, L.; Zhang, D.; Li, B.; Wang, W.; Gorriz, J.M. Weakly supervised deep nuclei segmentation using points annotation in histopathology images. Med Image Anal. 2020, 63, 101699. [Google Scholar] [CrossRef]
  9. Saad, M.; Hossain, M.A.; Tizhoosh, H.R. A Survey on Training Challenges in Generative Adversarial Networks for Biomedical Image Analysis. arXiv 2022, arXiv:2201.07646. [Google Scholar] [CrossRef]
  10. Lee, M. Recent Advances in Generative Adversarial Networks for Gene Expression Data: A Comprehensive Review. Mathematics 2023, 11, 3055. [Google Scholar] [CrossRef]
  11. Liu, L.; Xia, Y.; Tang, L. An overview of biological data generation using generative adversarial networks. In Proceedings of the 2020 IEEE Conference on Telecommunications, Optics and Computer Science (TOCS), Shenyang, China, 11–13 December 2020; pp. 141–144. [Google Scholar] [CrossRef]
  12. Iqbal, A.; Sharif, M.; Yasmin, M.; Raza, M.; Aftab, S. Generative adversarial networks and its applications in the biomedical image segmentation: A comprehensive survey. Int. J. Multimed. Inf. Retr. 2022, 11, 333–368. [Google Scholar] [CrossRef]
  13. Hörst, F.; Rempe, M.; Heine, L.; Seibold, C.; Keyl, J.; Baldini, G.; Ugurel, S.; Siveke, J.; Grünwald, B.; Egger, J.; et al. CellViT: Vision Transformers for Precise Cell Segmentation and Classification. arXiv 2023, arXiv:2306.15350. [Google Scholar] [CrossRef]
  14. Obeid, A.; Boumaraf, S.; Sohail, A.; Hassan, T.; Javed, S.; Dias, J.; Bennamoun, M.; Werghi, N. Advancing Histopathology with Deep Learning Under Data Scarcity: A Decade in Review. arXiv 2024, arXiv:2410.19820. [Google Scholar]
  15. Lahreche, F.; Moussaoui, A.; Oulad-Naoui, S. Medical Image Semantic Segmentation Using Deep Learning: A Survey. In Proceedings of the International Conference on Emerging Intelligent Systems for Sustainable Development (ICEIS 2024), Aflou, Algeria, 26–27 June 2024; pp. 324–345. [Google Scholar] [CrossRef]
  16. Sultan, B.; Rehman, A.; Riyaz, L. Generative Adversarial Networks in the Field of Medical Image Segmentation. In Deep Learning Applications in Medical Image Segmentation: Overview, Approaches, and Challenges; Wiley: Hoboken, NJ, USA, 2024; pp. 185–225. [Google Scholar] [CrossRef]
  17. Gui, J.; Sun, Z.; Wen, Y.; Tao, D.; Ye, J. A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. IEEE Trans. Knowl. Data Eng. 2023, 35, 3313–3332. [Google Scholar] [CrossRef]
  18. Alhumaid, M.; Alhumaid, M.; Fayoumi, A. Transfer Learning-Based Classification of Maxillary Sinus Using Generative Adversarial Networks. Appl. Sci. 2024, 14, 3083. [Google Scholar] [CrossRef]
  19. Islam, S.; Aziz, M.T.; Nabil, H.R.; Jim, J.R.; Mridha, M.F.; Kabir, M.M.; Asai, N.; Shin, J. Generative Adversarial Networks (GANs) in Medical Imaging: Advancements, Applications, and Challenges. IEEE Access 2024, 12, 35728–35753. [Google Scholar] [CrossRef]
  20. Jose, L.; Liu, S.; Russo, C.; Nadort, A.; Ieva, A.D. Generative Adversarial Networks in Digital Pathology and Histopathological Image Processing: A Review. J. Pathol. Inform. 2021, 12, 43. [Google Scholar] [CrossRef]
  21. Zhao, J.; Hou, X.; Pan, M.; Zhang, H. Attention-based generative adversarial network in medical imaging: A narrative review. Comput. Biol. Med. 2022, 149, 105948. [Google Scholar] [CrossRef]
  22. Hussain, J.; Båth, M.; Ivarsson, J. Generative adversarial networks in medical image reconstruction: A systematic literature review. Comput. Biol. Med. 2025, 191, 110094. [Google Scholar] [CrossRef]
  23. Xun, S.; Li, D.; Zhu, H.; Chen, M.; Wang, J.; Li, J.; Chen, M.; Wu, B.; Zhang, H.; Chai, X.; et al. Generative adversarial networks in medical image segmentation: A review. Comput. Biol. Med. 2022, 140, 105063. [Google Scholar] [CrossRef]
  24. Harari, M.B.; Parola, H.R.; Hartwell, C.J.; Riegelman, A. Literature searches in systematic reviews and meta-analyses: A review, evaluation, and recommendations. J. Vocat. Behav. 2020, 118, 103377. [Google Scholar] [CrossRef]
  25. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372. [Google Scholar] [CrossRef]
  26. Ibrahim, M.; Khalil, Y.A.; Amirrajab, S.; Sun, C.; Breeuwer, M.; Pluim, J.; Elen, B.; Ertaylan, G.; Dumontier, M. Generative AI for synthetic data across multiple medical modalities: A systematic review of recent developments and challenges. Comput. Biol. Med. 2025, 189, 109834. [Google Scholar] [CrossRef] [PubMed]
  27. Haggerty, J.; Wang, X.; Dickinson, A.; O’Malley, C.; Martin, E. Segmentation of epidermal tissue with histopathological damage in images of haematoxylin and eosin stained human skin. BMC Med. Imaging 2014, 14, 7. [Google Scholar] [CrossRef] [PubMed]
  28. Mahbod, A.; Schaefer, G.; Dorffner, G.; Hatamikia, S.; Ecker, R.; Ellinger, I. A dual decoder U-Net-based model for nuclei instance segmentation in hematoxylin and eosin-stained histological images. Front. Med. 2022, 9, 978146. [Google Scholar] [CrossRef]
  29. He, L.; Long, L.; Antani, S.; Thoma, G. Histology image analysis for carcinoma detection and grading. Comput. Methods Programs Biomed. 2012, 107, 538–556. [Google Scholar] [CrossRef]
  30. Fu, X.; Liu, T.; Xiong, Z.; Smaill, B.; Stiles, M.; Zhao, J. Segmentation of histological images and fibrosis identification with a convolutional neural network. Comput. Biol. Med. 2018, 98, 147–158. [Google Scholar] [CrossRef]
  31. Wesdorp, N.J.; Zeeuw, J.M.; Postma, S.C.; Roor, J.; van Waesberghe, J.H.T.; van den Bergh, J.E.; Nota, I.M.; Moos, S.; Kemna, R. Deep learning models for automatic tumor segmentation and total tumor volume assessment in patients with colorectal liver metastases. Eur. Radiol. Exp. 2023, 7, 75. [Google Scholar] [CrossRef]
  32. Schoenpflug, L.A.; Lafarge, M.W.; Frei, A.L.; Koelzer, V.H. Multi-task learning for tissue segmentation and tumor detection in colorectal cancer histology slides. arXiv 2023, arXiv:2304.03101. [Google Scholar]
  33. Gudhe, N.R.; Kosma, V.M.; Behravan, H.; Mannermaa, A. Nuclei instance segmentation from histopathology images using Bayesian dropout based deep learning. BMC Med. Imaging 2023, 23, 162. [Google Scholar] [CrossRef]
  34. Li, Y.J.; Chou, H.H.; Lin, P.C.; Shen, M.R.; Hsieh, S.Y. A novel deep learning-based algorithm combining histopathological features with tissue areas to predict colorectal cancer survival from whole-slide images. J. Transl. Med. 2023, 21, 731. [Google Scholar] [CrossRef]
  35. Goodfellow, I.J.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. Adv. Neural Inf. Process. Syst. 2014, 27. [Google Scholar] [CrossRef]
  36. Heng, Y.; Ma, Y.; Khan, F.G.; Khan, A.; Ali, F.; AlZubi, A.A.; Hui, Z. Survey: Application and analysis of generative adversarial networks in medical images. Artif. Intell. Rev. 2025, 58, 39. [Google Scholar] [CrossRef]
  37. Kodali, N.; Abernethy, J.; Hays, J.; Kira, Z. On Convergence and Stability of GANs. arXiv 2017, arXiv:1705.07215. [Google Scholar]
  38. Fedus, W.; Goodfellow, I.; Dai, A.M. Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence at Every Step. arXiv 2023, arXiv:2002.05616. [Google Scholar]
  39. Lucic, M.; Kurach, K.; Michalski, M.; Gelly, S.; Bousquet, O. Are GANs Created Equal? A Large-Scale Study. Adv. Neural Inf. Process. Syst. NeurIPS 2018, 31. [Google Scholar]
  40. Zhang, H.; Yu, Y.; Jojic, N.; Xiao, H.; Wang, Y.; Liang, Y.; Hsieh, C.J. Overfitting in Adversarially Robust Deep Learning. In Proceedings of the 37th International Conference on Machine Learning (ICML), Virtual, 13–18 July 2020; PMLR: Cambridge, MA, USA, 2020; pp. 11134–11143. [Google Scholar]
  41. Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784. [Google Scholar] [CrossRef]
  42. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar] [CrossRef]
  43. Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar] [CrossRef]
  44. Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 8110–8119. [Google Scholar] [CrossRef]
  45. Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar] [CrossRef]
  46. Han, C.C.; Rundo, L.; Araki, R.; Tang, Y.C.; Peng, S.L.; Nakayama, H.; Lin, C.C.; Li, C.H. Infinite Brain MR Images: PGGAN-based Data Augmentation for Tumor Detection. arXiv 2019, arXiv:1903.12564. [Google Scholar]
  47. Korkinof, D.; Rijken, T.; O’Neill, M.; Matthew, J.; Glocker, B. High-Resolution Mammogram Synthesis using Progressive Generative Adversarial Networks. arXiv 2018, arXiv:1807.03401. [Google Scholar]
  48. Xue, Y.; Ye, J.; Zhou, Q.; You, H.; Ouyang, W.; Do, Q.V. Selective Synthetic Augmentation with HistoGAN for Improved Histopathology Image Classification. arXiv 2021, arXiv:2111.06399. [Google Scholar] [CrossRef]
  49. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875. [Google Scholar] [CrossRef]
  50. Kazeminia, S.; Baur, C.; Kuijper, A.; van Ginneken, B.; Navab, N.; Albarqouni, S.; Mukhopadhyay, A. GANs for medical image analysis. Artif. Intell. Med. 2020, 109, 101938. [Google Scholar] [CrossRef]
  51. Yi, X.; Walia, E.; Babyn, P. Generative adversarial network in medical imaging: A review. Med Image Anal. 2019, 58, 101552. [Google Scholar] [CrossRef] [PubMed]
  52. Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134. [Google Scholar] [CrossRef]
  53. Hillis, C.; Bagheri, E.; Marshall, Z. The Role of Protocol Papers, Scoping Reviews, and Systematic Reviews in Responsible AI Research. IEEE Technol. Soc. Mag. 2025, 44, 72–76. [Google Scholar] [CrossRef]
  54. PROSPERO. International Prospective Register of Systematic Reviews. Centre for Reviews and Dissemination, University of York. 2025. Available online: https://www.crd.york.ac.uk/prospero/ (accessed on 10 February 2025).
  55. Frid-Adar, M.; Klang, E.; Amitai, M.; Goldberger, J.; Greenspan, H. GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 2018, 321, 321–331. [Google Scholar] [CrossRef]
  56. Guan, X.; Wang, Y.; Lin, Y.; Li, X.; Zhang, Y. Unsupervised multi-domain progressive stain transfer guided by style encoding dictionary. IEEE Trans. Image Process. 2024, 33, 767–779. [Google Scholar] [CrossRef]
  57. Liu, Y.; Wagner, S.J.; Peng, T. Multi-modality microscopy image style augmentation for nuclei segmentation. J. Imaging 2022, 8, 71. [Google Scholar] [CrossRef]
  58. Zhang, H.; Liu, J.; Wang, P.; Yu, Z.; Liu, W.; Chen, H. Cross-boosted multi-target domain adaptation for multi-modality histopathology image translation and segmentation. IEEE J. Biomed. Health Inform. 2022, 26, 3197–3208. [Google Scholar] [CrossRef]
  59. Hu, B.; Tang, Y.; Eric, I.; Chang, C.; Fan, Y.; Lai, M.; Xu, Y. Unsupervised learning for cell-level visual representation in histopathology images with generative adversarial networks. IEEE J. Biomed. Health Inform. 2018, 23, 1316–1328. [Google Scholar] [CrossRef]
  60. Ye, H.; Yang, Y.y.; Zhu, S.; Wang, D.H.; Zhang, X.Y.; Yang, X.; Huang, H. Stain-adaptive self-supervised learning for histopathology image analysis. Pattern Recognit. 2025, 161, 111242. [Google Scholar] [CrossRef]
  61. He, Y.; Li, J.; Shen, S.; Liu, K.; Wong, K.K.; He, T.; Wong, S.T. Image-to-image translation of label-free molecular vibrational images for a histopathological review using the UNet+/seg-cGAN model. Biomed. Opt. Express 2022, 13, 1924–1938. [Google Scholar] [CrossRef]
  62. Deshpande, S.; Dawood, M.; Minhas, F.; Rajpoot, N. SynCLay: Interactive synthesis of histology images from bespoke cellular layouts. Med. Image Anal. 2024, 91, 102995. [Google Scholar] [CrossRef]
  63. Li, F.; Hu, Z.; Chen, W.; Kak, A. A laplacian pyramid based generative h&e stain augmentation network. IEEE Trans. Med. Imaging 2023, 43, 701–713. [Google Scholar]
  64. Zhang, H.; Liu, J.; Yu, Z.; Wang, P. MASG-GAN: A multi-view attention superpixel-guided generative adversarial network for efficient and simultaneous histopathology image segmentation and classification. Neurocomputing 2021, 463, 275–291. [Google Scholar] [CrossRef]
  65. Fan, J.; Liu, D.; Chang, H.; Cai, W. Learning to generalize over subpartitions for heterogeneity-aware domain adaptive nuclei segmentation. Int. J. Comput. Vis. 2024, 132, 2861–2884. [Google Scholar] [CrossRef]
  66. Deshpande, S.; Minhas, F.; Graham, S.; Rajpoot, N. SAFRON: Stitching across the frontier network for generating colorectal cancer histology images. Med. Image Anal. 2022, 77, 102337. [Google Scholar] [CrossRef]
  67. Song, Z.; Du, P.; Yan, J.; Li, K.; Shou, J.; Lai, M.; Fan, Y.; Xu, Y. Nucleus-aware self-supervised pretraining using unpaired image-to-image translation for histopathology images. IEEE Trans. Med. Imaging 2023, 43, 459–472. [Google Scholar] [CrossRef]
  68. Purma, V.; Srinath, S.; Srirangarajan, S.; Kakkar, A.; Prathosh, A. GenSelfDiff-HIS: Generative Self-Supervision Using Diffusion for Histopathological Image Segmentation. IEEE Trans. Med. Imaging 2024, 44, 618–631. [Google Scholar] [CrossRef]
  69. Bouteldja, N.; Hölscher, D.L.; Bülow, R.D.; Roberts, I.S.; Coppo, R.; Boor, P. Tackling stain variability using CycleGAN-based stain augmentation. J. Pathol. Inform. 2022, 13, 100140. [Google Scholar] [CrossRef]
  70. Du, Z.; Zhang, P.; Huang, X.; Hu, Z.; Yang, G.; Xi, M.; Liu, D. Deeply supervised two stage generative adversarial network for stain normalization. Sci. Rep. 2025, 15, 7068. [Google Scholar] [CrossRef]
  71. Razavi, S.; Khameneh, F.D.; Nouri, H.; Androutsos, D.; Done, S.J.; Khademi, A. MiNuGAN: Dual segmentation of mitoses and nuclei using conditional GANs on multi-center breast H&E images. J. Pathol. Inform. 2022, 13, 100002. [Google Scholar]
  72. Vasiljević, J.; Feuerhake, F.; Wemmert, C.; Lampert, T. HistoStarGAN: A unified approach to stain normalisation, stain transfer and stain invariant segmentation in renal histopathology. Knowl.-Based Syst. 2023, 277, 110780. [Google Scholar] [CrossRef]
  73. Hossain, M.S.; Armstrong, L.J.; Cook, D.M.; Zaenker, P. Application of histopathology image analysis using deep learning networks. Hum.-Centric Intell. Syst. 2024, 4, 417–436. [Google Scholar] [CrossRef]
  74. Yoon, C.; Park, E.; Misra, S.; Kim, J.Y.; Baik, J.W.; Kim, K.G.; Jung, C.K.; Kim, C. Deep learning-based virtual staining, segmentation, and classification in label-free photoacoustic histology of human specimens. Light. Sci. Appl. 2024, 13, 226. [Google Scholar] [CrossRef] [PubMed]
  75. BenTaieb, A.; Hamarneh, G. Adversarial stain transfer for histopathology image analysis. IEEE Trans. Med. Imaging 2017, 37, 792–802. [Google Scholar] [CrossRef] [PubMed]
  76. Baykal Kablan, E. Regional realness-aware generative adversarial networks for stain normalization. Neural Comput. Appl. 2023, 35, 17915–17927. [Google Scholar] [CrossRef]
  77. Lafarge, M.W.; Pluim, J.P.; Eppenhof, K.A.; Veta, M. Learning domain-invariant representations of histological images. Front. Med. 2019, 6, 162. [Google Scholar] [CrossRef]
  78. Shahini, A.; Gambella, A.; Molinari, F.; Salvi, M. Semantic-driven synthesis of histological images with controllable cellular distributions. Comput. Methods Programs Biomed. 2025, 261, 108621. [Google Scholar] [CrossRef]
  79. Lou, W.; Li, H.; Li, G.; Han, X.; Wan, X. Which pixel to annotate: A label-efficient nuclei segmentation framework. IEEE Trans. Med. Imaging 2022, 42, 947–958. [Google Scholar] [CrossRef]
  80. Wang, H.; Xu, G.; Pan, X.; Liu, Z.; Lan, R.; Luo, X. Multi-task generative adversarial learning for nuclei segmentation with dual attention and recurrent convolution. Biomed. Signal Process. Control 2022, 75, 103558. [Google Scholar] [CrossRef]
  81. Juhong, A.; Li, B.; Yao, C.Y.; Yang, C.W.; Agnew, D.W.; Lei, Y.L.; Huang, X.; Piyawattanametha, W.; Qiu, Z. Super-resolution and segmentation deep learning for breast cancer histopathology image analysis. Biomed. Opt. Express 2022, 14, 18–36. [Google Scholar] [CrossRef]
  82. Wang, L.; Zhang, S.; Gu, L.; Zhang, J.; Zhai, X.; Sha, X.; Chang, S. Automatic consecutive context perceived transformer GAN for serial sectioning image blind inpainting. Comput. Biol. Med. 2021, 136, 104751. [Google Scholar] [CrossRef]
  83. Kapil, A.; Meier, A.; Steele, K.; Rebelatto, M.; Nekolla, K.; Haragan, A.; Silva, A.; Zuraw, A.; Barker, C.; Scott, M.L.; et al. Domain adaptation-based deep learning for automated tumor cell (TC) scoring and survival analysis on PD-L1 stained tissue images. IEEE Trans. Med. Imaging 2021, 40, 2513–2523. [Google Scholar] [CrossRef] [PubMed]
  84. Azam, A.B.; Wee, F.; Väyrynen, J.P.; Yim, W.W.Y.; Xue, Y.Z.; Chua, B.L.; Lim, J.C.T.; Somasundaram, A.C.; Tan, D.S.W.; Takano, A.; et al. Training immunophenotyping deep learning models with the same-section ground truth cell label derivation method improves virtual staining accuracy. Front. Immunol. 2024, 15, 1404640. [Google Scholar] [CrossRef] [PubMed]
  85. Lahiani, A.; Klaman, I.; Navab, N.; Albarqouni, S.; Klaiman, E. Seamless virtual whole slide image synthesis and validation using perceptual embedding consistency. IEEE J. Biomed. Health Inform. 2020, 25, 403–411. [Google Scholar] [CrossRef] [PubMed]
  86. Kweon, J.; Yoo, J.; Kim, S.; Won, J.; Kwon, S. A novel method based on GAN using a segmentation module for oligodendroglioma pathological image generation. Sensors 2022, 22, 3960. [Google Scholar] [CrossRef]
  87. Hou, L.; Gupta, R.; Van Arnam, J.S.; Zhang, Y.; Sivalenka, K.; Samaras, D.; Kurc, T.M.; Saltz, J.H. Dataset of segmented nuclei in hematoxylin and eosin stained histopathology images of ten cancer types. Sci. Data 2020, 7, 185. [Google Scholar] [CrossRef]
  88. Mahmood, F.; Borders, D.; Chen, R.J.; McKay, G.N.; Salimian, K.J.; Baras, A.; Durr, N.J. Deep adversarial training for multi-organ nuclei segmentation in histopathology images. IEEE Trans. Med. Imaging 2019, 39, 3257–3267. [Google Scholar] [CrossRef]
  89. Rong, R.; Wang, S.; Zhang, X.; Wen, Z.; Cheng, X.; Jia, L.; Yang, D.M.; Xie, Y.; Zhan, X.; Xiao, G. Enhanced pathology image quality with restore–generative adversarial network. Am. J. Pathol. 2023, 193, 404–416. [Google Scholar] [CrossRef]
  90. Mahapatra, S.; Maji, P. Truncated normal mixture prior based deep latent model for color normalization of histology images. IEEE Trans. Med. Imaging 2023, 42, 1746–1757. [Google Scholar] [CrossRef]
  91. Hossain, M.S.; Armstrong, L.J.; Abu-Khalaf, J.; Cook, D.M. The segmentation of nuclei from histopathology images with synthetic data. Signal Image Video Process. 2023, 17, 3703–3711. [Google Scholar] [CrossRef]
  92. Naglah, A.; Khalifa, F.; El-Baz, A.; Gondim, D. Conditional GANs based system for fibrosis detection and quantification in Hematoxylin and Eosin whole slide images. Med. Image Anal. 2022, 81, 102537. [Google Scholar] [CrossRef]
  93. de Bel, T.; Bokhorst, J.M.; van der Laak, J.; Litjens, G. Residual cyclegan for robust domain transformation of histopathological tissue slides. Med. Image Anal. 2021, 70, 102004. [Google Scholar] [CrossRef] [PubMed]
  94. Vasiljević, J.; Feuerhake, F.; Wemmert, C.; Lampert, T. Towards histopathological stain invariance by unsupervised domain augmentation using generative adversarial networks. Neurocomputing 2021, 460, 277–291. [Google Scholar] [CrossRef]
  95. Falahkheirkhah, K.; Tiwari, S.; Yeh, K.; Gupta, S.; Herrera-Hernandez, L.; McCarthy, M.R.; Jimenez, R.E.; Cheville, J.C.; Bhargava, R. Deepfake histologic images for enhancing digital pathology. Lab. Investig. 2023, 103, 100006. [Google Scholar] [CrossRef]
  96. Ruiz-Casado, J.L.; Molina-Cabello, M.A.; Luque-Baena, R.M. Enhancing Histopathological Image Classification Performance through Synthetic Data Generation with Generative Adversarial Networks. Sensors 2024, 24, 3777. [Google Scholar] [CrossRef]
  97. Gadermayr, M.; Gupta, L.; Appel, V.; Boor, P.; Klinkhammer, B.M.; Merhof, D. Generative adversarial networks for facilitating stain-independent supervised and unsupervised segmentation: A study on kidney histology. IEEE Trans. Med. Imaging 2019, 38, 2293–2302. [Google Scholar] [CrossRef]
  98. Bai, B.; Yang, X.; Li, Y.; Zhang, Y.; Pillar, N.; Ozcan, A. Deep Learning-enabled Virtual Histological Staining of Biological Samples. arXiv 2022, arXiv:2211.06822. [Google Scholar] [CrossRef]
  99. Xu, Z.; Huang, X.; Fernández Moro, C.; Bozóky, B.; Zhang, Q. GAN-based Virtual Re-Staining: A Promising Solution for Whole Slide Image Analysis. arXiv 2019, arXiv:1901.04059. [Google Scholar]
  100. Smith, J.; Doe, J. High-Resolution Generative Adversarial Neural Networks Applied to Histological Images Generation. In Proceedings of the ICANN 2018, Rhodes, Greece, 4–7 October 2018; pp. 195–202. [Google Scholar]
  101. Vu, Q.D.; Graham, S.; Kurc, T.; To, M.N.N.; Shaban, M.; Qaiser, T.; Koohbanani, N.A.; Khurram, S.A.; Kalpathy-Cramer, J.; Zhao, T.; et al. Methods for Segmentation and Classification of Digital Microscopy Tissue Images. Front. Bioeng. Biotechnol. 2019, 7, 53. [Google Scholar] [CrossRef]
  102. Zhang, W.; Li, M.; Chen, Y. Color Normalization Techniques to Improve Consistency in Histopathological Image Analysis. J. Med. Imaging Anal. 2023, 45, 102341. [Google Scholar]
  103. Shaban, M.T.; Baur, C.; Navab, N.; Albarqouni, S. Staingan: Stain style transfer for digital histological images. Comput. Methods Programs Biomed. 2020, 184, 105245. [Google Scholar]
  104. Zhang, Y.; Huang, L.; Pillar, N.; Li, Y.; Migas, L.G.; Van de Plas, R.; Spraggins, J.M.; Ozcan, A. Virtual Staining of Label-Free Tissue in Imaging Mass Spectrometry. arXiv 2024, arXiv:2411.13120. [Google Scholar]
  105. Taneja, N.; Zhang, W.; Basson, L.; Aerts, H.; Orlov, N. High-resolution histopathology image generation and segmentation through adversarial training. In Proceedings of the European Conference on Computer Vision (ECCV), Montreal, QC, Canada, 11 October 2021. [Google Scholar]
  106. Chen, J.; Yu, L.; Ma, Z. Unsupervised domain adaptation for histopathological image segmentation with adversarial learning. Pattern Recognit. 2023, 135, 109150. [Google Scholar]
  107. Fahoum, I.; Tsuriel, S.; Rattner, D.; Greenberg, A.; Zubkov, A.; Naamneh, R.; Greenberg, O.; Zemser-Werner, V.; Gitstein, G.; Hagege, R.; et al. Automatic analysis of nuclear features reveals a non-tumoral predictor of tumor grade in bladder cancer. Diagn. Pathol. 2024, 19, 1–10. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The basic structure of a GAN.
Figure 2. PRISMA flow diagram showing the selection, evaluation, and inclusion of studies.
Figure 3. Distribution of GAN-based research focused on histological image segmentation analysis.
Figure 4. Distribution of segmented target regions across the studies.
Figure 5. Annual distribution of GAN-based research.
Table 1. Comparative analysis of review articles focused on GANs in the medical area.
Study | Year | Image Type | Datasets | Task Investigated | GAN Types | Metrics | Objective
Jose et al. [20] | 2021 | Histopathology (WSI) | TCGA, Camelyon | Histological image processing | CycleGAN, Pix2Pix, CGAN | Qualitative | GANs in digital pathology and histology
Chen et al. [5] | 2022 | Radiology, microscopy | BRATS, LUNA, ISIC | Augmentation | CycleGAN, StyleGAN, DCGAN | Limited | GANs for data augmentation
Xun et al. [23] | 2022 | General medical imaging | ACDC, DRIVE | Segmentation | CGAN, U-GAN, SegAN | Limited | Review of segmentation using GANs
Zhao et al. [21] | 2022 | Radiology, pathology | Not detailed | Attention-enhanced segmentation | AttentionGAN, TransGAN | Limited | Emerging GAN architectures with attention
Gui et al. [17] | 2023 | General (incl. medical) | Not detailed | General overview | DCGAN, WGAN, InfoGAN, BigGAN | Theoretical | Broad review of GANs including healthcare
Islam et al. [19] | 2024 | Radiology, pathology | Public/private datasets | Augmentation, synthesis, segmentation | DCGAN, CGAN, CycleGAN, StyleGAN | Limited | Comprehensive review on GANs in medical imaging
Sultan et al. [16] | 2024 | MRI, CT, ultrasound | Not specified | Segmentation | Vanilla GAN, CGAN, CycleGAN | Limited | Conceptual overview of GANs in medical image segmentation
Hussain et al. [22] | 2025 | CT, MRI | Multiple (e.g., BraTS) | Reconstruction (incl. segmentation) | Progressive GANs, CycleGAN | Performance evaluation | Systematic review on GANs in image reconstruction
Ibrahim et al. [26] | 2025 | Multi-modal (MRI, histology, EHR) | Multimodal datasets | Synthetic data generation | Diffusion, GANs, VAEs | Limited | Synthetic data generation in medicine
This review | 2025 | Histology and histopathology | Multiple datasets | Segmentation | GAN, CycleGAN, PGGAN, Pix2Pix, StyleGAN, StarGAN, cGAN, WGAN | Qualitative and quantitative | Systematic review focused on GANs in histological image segmentation
Table 2. Inclusion and exclusion criteria for studies.
Inclusion Criteria (IC) | Exclusion Criteria (EC)
(IC1) Studies using GANs for segmentation of histological images. | (EC1) Studies using GANs only to synthesize images, without a focus on segmentation.
(IC2) Studies with experimental validation (quantitative or qualitative). | (EC2) Works that do not involve histological or microscopic images.
(IC3) Published between 2014 and 2025. | (EC3) Studies using GANs only for classification or detection, not segmentation.
(IC4) Articles in English. | (EC4) Articles in a language other than English.
(IC5) Full articles. | (EC5) Reviews, surveys, case studies, reports, short papers, conference abstracts, communications, theses, and dissertations.
Table 3. Overview of the datasets used in the included studies.
Dataset | Paper | Type of Tissue | Size | Magnification | Labeled | Availability
ANHIR | [56] | Kidney, breast, colon, lung and gastric mucosa | 231 WSIs (100,000 × 200,000) | 20× and 40× | Yes | Public
BBBC-data | [57,58] | Liver, brain, colon, lung, ovary, kidney, blood, immune system, embryo, heart and nervous system | 700 images | 20× | Yes | Public
BMCD-FGCD | [59] | Bone marrow | 29 WSIs (1500 × 800) | – | Yes | Public
CAMELYON16 | [60] | Breast (lymph nodes) | 400 WSIs (100,000 × 200,000) | 20× and 40× | Yes | Public
CARS | [61] | Thyroid | 200 images (256 × 256) | 40× | Yes | Private
CoNiC | [62] | Colon | 4981 patches (256 × 256) | – | Yes | Public
CoNSep | [58,63,64] | Adrenal, larynx, lymph node, mediastinum | 30 images (512 × 512) | 40× | Yes | Public
CPM17 | [58,63,64,65] | Colon | 64 images (500 × 500) | 20× and 40× | Yes | Public
Crag | [66,67] | Colorectal carcinoma | 213 images (1512 × 1512) | 20× | Yes | Public
CryoNuSeg | [63] | Adrenal gland, larynx, lymph node, mediastinum, pancreas, pleura, testis, thymus and thyroid gland | 30 images (512 × 512) | 20× | Yes | Public
Digestpath | [66] | Colorectal carcinoma | 46 images (5000 × 5000) | 20× | Yes | Public
Glomeruli | [56] | Kidney | 200 WSIs (1000 × 1000) | 40× | Yes | Public
HN Cancer | [68] | Head and neck | 1562 images (1024 × 1280) | 10× | Yes | Private
HuBMAP | [69] | Kidney | 9 WSIs | 40× | Yes | Public
ICIAR-BACH | [70] | Breast | 400 images (2048 × 1536) | 20× | Yes | Public
In-House | [67] | Colorectal | 1093 WSIs (512 × 512) | 40× | Yes | Public
In-House-MIL | [67] | Kidney, lung, breast, prostate, endometrium | 260 WSIs (10,000 × 10,000) | 40× | Yes | Public
ICPR14 and 12 | [59,71] | Breast | 163 WSIs (2000 × 2000) | 40× | Yes | Public
Kather | [67] | Large intestine (colon and rectum) | 25,058 images (224 × 224) | 20× | Yes | Public
KidneyArtPathology | [72] | Kidney | 5000 images (512 × 512) | 20× and 40× | Yes | Public
KIRC | [58,63,64,73] | Kidney | >190,000 WSIs (100,000 × 100,000) | 20× | No | Public
KPMP | [69] | Kidney | 85 WSIs | 40× | Yes | Public
Kumar | [58,64,65,67,74] | Breast, liver, kidney, prostate, bladder, colon and stomach | 30 images (1000 × 1000) | 40× | Yes | Public
Lizard | [67] | Large intestine (colon and rectum) | 133 images (1016 × 917) | 20× | Yes | Public
Lung dataset | [56] | Lung | 23,744 patches | 40× | – | Private
LYON19 (IHC) | [58] | Breast, colon, prostate | 871 images (256 × 256) | 40× | Yes | Public
MICCAI'16 GlaS | [68,70,75,76,77] | Large intestine (colon) | 165 images (775 × 522) | 20× | Yes | Public
MITOS-ATYPIA 14 | [70,76] | Breast | 1200 images (1539 × 1376) | 10×, 20× and 40× | Yes | Public
MoNuSAC | [63,78] | Lung, prostate, kidney and breast | 294 images (from 113 × 81 to 1398 × 1956) | 40× | Yes | Public
MoNuSeg | [63,68,79,80] | Prostate, lung, kidney, colon, breast, pancreas, oral cavity, stomach, liver and bladder | 44 images (1000 × 1000) | 40× | Yes | Public
MThH | [58] | Colon | 36,000 images (256 × 256) | 20× and 40× | Yes | Public
MUC1 | [81] | Breast | 13,000 images (256 × 266) | 40× | No | Private
N7, E17, N5 | [82] | Kidney (mouse) | 1917 patches (128 × 128) | – | – | Private
NSCLC | [83] | Lung | 269,000 patches (128 × 128) | 10× | Yes | Public
Onco-SG | [84] | Lung | 57 images (1792 × 768) | 20× | Yes | Public
PanNuke | [62,67,80] | Breast, lung, prostate, kidney, brain, colon and liver | 7901 images (256 × 256) | 20× and 40× | Yes | Public
Roche | [85] | Liver (colorectal) | 50 WSIs (512 × 512) | 10× | – | Private
RWTH Aachen | [69] | Kidney | 1009 WSIs | 40× | Yes | Private
TCGA | [73,77,79,81,86,87,88,89,90,91] | Breast, liver, kidney, prostate, bladder, brain, colon and stomach | >20,000 WSIs (100,000 × 100,000) | 20× and 40× | No | Public
TNBC | [58,63,64,74,79] | Breast | 50 images (512 × 512) | 20× and 40× | Yes | Public
TUPAC16 | [70,71,77] | Breast | 821 WSIs | 40× | Yes | Public
VALIGA | [69] | Kidney | 648 WSIs | 40× | Yes | Private
– | [92] | Liver | 64, 128, 256, 512 and 1024 patches | 40× | Yes | Private
– | [93] | Colon and kidney | 100 WSIs | – | Yes | Private
– | [94] | Kidney | 10 images | 40× | Yes | Private
– | [95] | Prostate and colon | 102 WSIs (1024 × 1024) | 10× | Yes | Public
Table 5. Comparative analysis of segmentation tasks in histological image analysis.
Paper | Task | Model | Learning | Data Available * | Performance
Azam et al. (2024) [84] | Nuclei | U-Net | Supervised | http://github.com/abubakrazam/Pix2Pix_TIL_H-E.git | Pearson correlation = 0.95, Acc = 0.982
Bouteldja et al. (2022) [69] | Cell and tissue | U-Net | Unsupervised | https://github.com/NBouteldja/KidneyStainAugmentation | IDSC = 94.6
De Bel et al. (2021) [93] | Tissue | U-Net | Unsupervised | – | DSC = 0.850
Deshpande et al. (2022) [66] | Cell and tissue | U-Net | Supervised | http://warwick.ac.uk/TIALab/SAFRON | DSC = 0.97 ± 0.03
Deshpande et al. (2024) [62] | Cell | HoVer-Net | Supervised | https://github.com/Srijay/SynCLay-Framework | –
Du et al. (2025) [70] | Tissue and nuclei | U-Net | Semi-supervised | – | MICCAI: DSC = 0.860 ± 0.009, IoU = 0.760 ± 0.012, PA = 0.864 ± 0.008
Falahkheirkhah et al. (2023) [95] | Tissue | U-Net, ResNet | Supervised | https://github.com/kiakh93/Synthesizing-histological-images | Real + synthesized images: PA = 0.9267, mPA = 0.9522, IoU = 0.7265
Fan et al. (2024) [65] | Nuclei | Mask R-CNN | Unsupervised | – | Kumar: DSC = 0.7930 ± 0.0446, AJI = 0.5797 ± 0.0740, SQ = 0.7373 ± 0.0307, PQ = 0.5527; CPM17: DSC = 0.8237 ± 0.0471, AJI = 0.6090 ± 0.0867, SQ = 0.7624 ± 0.0337, PQ = 0.6274 ± 0.0867
Gadermayr et al. (2019) [97] | Cells and nuclei | U-Net | Supervised and unsupervised | – | Supervised: F1-score = 0.900, Precision = 0.890, Recall = 0.920
Guan, Li and Zhang (2024) [56] | Cell | Mask R-CNN | Unsupervised | https://github.com/xianchaoguan/GramGAN | mAP@[0.50:0.95] = 0.617
He et al. (2022) [61] | Tissue | U-Net+ | Supervised | – | DSC = 0.8751 ± 0.079
Hou et al. (2020) [87] | Nuclei | Mask R-CNN | – | – | DSC = 0.797
Hossain et al. (2023) [91] | Nuclei | U-Net | Supervised | – | Acc = 1.0, DSC = 0.999, JI = 0.999
Hossain et al. (2024) [73] | Nuclei | Mask R-CNN, CNN, U-Net | – | – | Real images: Acc = 0.979, DSC = 0.875, AJI = 0.791; synthetic images: Acc = 1.0, DSC = 0.999, AJI = 0.999
Hu et al. (2018) [59] | Cell | – | Unsupervised | https://github.com/bohu615/nu_gan | IoU = 0.560, F1-score = 0.700
Juhong et al. (2022) [81] | Nuclei | Inception U-Net | Supervised | – | IoU = 0.869, DSC = 0.893
Kablan (2023) [76] | Cell and tissue | U-Net | Unsupervised | https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix | DSC = 0.881, Acc = 0.887, Precision = 0.910, Recall = 0.867
Kapil et al. (2021) [83] | Cell and tissue | SegNet | Supervised | – | Two-class: F1-score = 0.916; three-class: F1-score = 0.899
Kweon et al. (2022) [86] | Cell and tissue | U-Net | – | – | –
Lafarge et al. (2019) [77] | Nuclei | CNN | Supervised | – | F1-score = 0.851 ± 0.011
Lahiani et al. (2020) [85] | Tissue | ResNet | Unsupervised | – | Tumor: F1-score = 0.81; non-tumor: F1-score = 0.940
Li, Hu and Kak (2023) [63] | Nuclei | CNN | Unsupervised | https://github.com/lifangda01/GSAN-Demo | PQ = 0.4885, AP = 0.5035, AJI = 0.4693
Liu, Wagner and Peng (2022) [57] | Nuclei | Mask R-CNN | Supervised | http://www.kaggle.com/c/data-science-bowl-2018/submit | IoU = 0.570
Lou et al. (2022) [79] | Nuclei | Mask R-CNN | Semi-supervised and supervised | – | AJI = 0.5374, DSC = 0.7536
Mahapatra and Maji (2023) [90] | Nuclei | Watershed | Unsupervised | – | DSC = 0.7042
Mahmood et al. (2019) [88] | Nuclei | U-Net | Supervised | https://github.com/mahmoodlab/NucleiSegmentation | aHD = 4.291, F1-score = 0.866, AJI = 0.721
Naglah et al. (2022) [92] | Cell and tissue | U-Net | Unsupervised | – | Acc = 0.86 ± 0.08, DSC = 0.75 ± 0.09
Purma et al. (2024) [68] | Tissue and nuclei | U-Net | Supervised and self-supervised | https://github.com/suhas-srinath/GenSelfDiff-HIS | GlaS: AJI = 0.8470, IoU = 0.8470, HD = 6.3517, F1-score = 0.9026; MoNuSeg: AJI = 0.6545, IoU = 0.6545, HD = 7.3261, F1-score = 0.7895; HN Cancer: AJI = 0.7913, IoU = 0.8105, HD = 4.6855, F1-score = 0.8413
Razavi et al. (2022) [71] | Cell and nuclei | – | Supervised | – | Mitosis: DSC = 0.721; nuclei: DSC = 0.784, F1-score = 0.854
Rong et al. (2023) [89] | Nuclei | Mask R-CNN | Unsupervised | http://pytorch.org/get-started/previous-versions/#v1101 | Acc = 0.8204
Shahini et al. (2025) [78] | Nuclei | – | Unsupervised | – | DSC = 84.86
Song et al. (2023) [67] | Nuclei | Mask R-CNN | Self-supervised | https://github.com/zhiyuns/UNITPathSSL | Kumar: F1-score = 0.9560, DSC = 0.853, AJI = 0.657, PQ = 0.625
BenTaieb and Hamarneh (2017) [75] | Nuclei | AlexNet | Unsupervised | – | MITOSIS: Acc = 0.900 ± 0.010; COLON: Acc = 0.820 ± 0.090; OVARY: Acc = 0.617 ± 0.130
Vasiljević et al. (2021) [94] | Cell and tissue | U-Net | Unsupervised | – | Precision = 0.838, Recall = 0.919, F1-score = 0.875
Vasiljević et al. (2023) [72] | Cell and tissue | U-Net | Supervised | – | Precision = 0.864, Recall = 0.877, F1-score = 0.870
Wang et al. (2021) [82] | Tissue | U-Net | Unsupervised | – | N5: Acc = 0.9995, Recall = 0.9996, Precision = 0.9994, DSC = 0.9927, IoU = 0.9990
Wang et al. (2022) [80] | Nuclei | RCSAU-Net | Supervised | https://github.com/antifen/Nuclei-Segmentation | MoNuSeg: Acc = 0.919, F1-score = 0.867, DSC = 0.820, AJI = 0.619, HD = 4.334, Recall = 0.893; PanNuke: Precision = 0.9203, Recall = 0.9101, F1-score = 0.8661, DSC = 0.8482, AJI = 0.6550, HD = 4.3768
Ye et al. (2025) [60] | Tissue | PRANet | Self-supervised | http://github.com/YeahHighly/SASSL_PR_2024 | PA = 0.948, DSC = 0.946, IoU = 0.898
Yoon et al. (2024) [74] | Cell area, count and distance | U-Net | – | https://github.com/YoonChiHo/DL-based-frameworkfor-automated-HIA-of-label-free-PAH-images | –
Zhang et al. (2021) [64] | Nuclei | U2-Net | Supervised | – | TNBC: PA = 0.9152, Recall = 0.8539, Specificity = 0.8620, F1-score = 0.8556, IoU = 0.7908, DSC = 0.9301, PRI = 0.9889
Zhang et al. (2022) [58] | Nuclei | DSCN | Supervised and unsupervised | http://github.com/wangpengyu0829 | PA = 0.8735, DSC = 0.7256, AJI = 0.5997, MAE = 0.0882
* Accessed on 19 June 2025.
Table 6. Summary of common evaluation metrics of GAN and segmentation models.
GAN Metric | Paper | Segmentation Metric | Paper
Fréchet Inception Distance (FID) | [56,58,62,66,68,74,82,86,94] | Accuracy (Acc) | [73,75,76,80,82,84,89,91,92]
Structural Similarity Index (SSIM) | [61,62,70,73,76,81,82,89,91] | Dice Similarity Coefficient (DSC) | [58,60,61,64,65,66,67,69,70,71,76,78,79,80,81,82,87,90,91,92,93]
Peak Signal-to-Noise Ratio (PSNR) | [70,73,76,81,82,89,91] | F1-score | [59,64,67,68,71,72,77,80,83,85,88,90,94,97]
Mean Squared Error (MSE) | [61,73,76,81,91] | Intersection over Union (IoU) | [57,59,60,64,68,70,81,82,95]
Average Human Rank (AHR) | [58] | Aggregated Jaccard Index (AJI) | [58,63,65,68,73,79,80,88]
Pearson Correlation Coefficient (PCC) | [70] | Jaccard Index (JI) | [91]
Root Mean Squared Error (RMSE) | [76] | Pixel Accuracy (PA) | [58,60,64,70,95]
Kernel Inception Distance (KID) | [56,58,74] | Precision (Pre) | [63,72,73,76,82,94,97]
Inception Score (IS) | [75,86] | Recall (Rec)/Sensitivity (Sen) | [64,72,76,82,94,97]
Contrast-Structure Similarity (CSS) | [56] | Average Pompeiu–Hausdorff Distance (aHD) | [88]
Normalized Mutual Information (NMI) | [90,92] | Hausdorff Distance (HD) | [68,80]
Complex Wavelet Structural Similarity (CWSSIM) | [85] | Mean Absolute Error (MAE) | [58]
Bhattacharyya Distance (BCD) | [92] | Panoptic Quality (PQ) | [63,65]
Histogram Correlation (HC) | [92] | Probabilistic Rand Index (PRI) | [64]
Feature Similarity Index (FSIM) | [76,82] | Mean Average Precision (mAP) | [56,95]
Multi-Scale Structural Similarity Index (MS-SSIM) | [76,82] | Pearson Correlation Coefficient (PCC) | [84]
Erreur Relative Globale Adimensionnelle de Synthèse (ERGAS) | [76] | Segmentation Quality (SQ) | [65]
Universal Quality Index (UQI) | [76] | Average Precision (AP) | [63]
Erreur Relative Globale Adimension (Misep) | [76] | – | –
Erreur Relative Globale Adimension (Miss) | [76] | – | –
Relative Average Spectral Error (RASE) | [76] | – | –
Between-Image Color Constancy (BiCC) | [90] | – | –
Within-Set Color Constancy (WsCC) | [90] | – | –
Measuring Mutual Information (MMI) | [92] | – | –
Visual Information Fidelity (VIF) | [82] | – | –
Standard Deviation (SD) | [95] | – | –
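For reference, the segmentation metrics most frequently reported in Table 6 (DSC, IoU, and pixel accuracy) can be computed directly from binary masks. The snippet below is a minimal illustrative sketch in Python/NumPy; the function names and toy masks are ours and are not taken from any reviewed study.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice Similarity Coefficient (DSC) between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return float((2.0 * intersection + eps) / (pred.sum() + target.sum() + eps))

def iou(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Intersection over Union (IoU / Jaccard index) between two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float((intersection + eps) / (union + eps))

def pixel_accuracy(pred: np.ndarray, target: np.ndarray) -> float:
    """Pixel Accuracy (PA): fraction of pixels with matching labels."""
    return float((pred == target).mean())

if __name__ == "__main__":
    # Toy 4x4 ground-truth and predicted masks (illustrative only)
    gt = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]])
    pr = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]])
    print(f"DSC = {dice_coefficient(pr, gt):.3f}, "
          f"IoU = {iou(pr, gt):.3f}, "
          f"PA = {pixel_accuracy(pr, gt):.3f}")
```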
Table 7. Summary of publications for GAN in histological image segmentation.
Paper | GAN Model | Segmentation Model | Datasets | GAN Metrics | Segmentation Metrics
Azam et al. (2024) [84] | Pix2Pix-GAN, CUT | U-Net | Onco-SG | FID | Pearson correlation, Acc
Bouteldja et al. (2022) [69] | CycleGAN | U-Net | RWTH Aachen, VALIGA | – | IDSC
De Bel et al. (2021) [93] | CycleGAN | U-Net | – | – | DSC
Deshpande et al. (2022) [66] | GAN | U-Net | – | FID | DSC
Deshpande et al. (2024) [62] | SynCLay | HoVer-Net | Crag, Digestpath | FID, SSIM | –
Du et al. (2025) [70] | P2P-GAN, DSTGAN | U-Net | ICIAR-BACH, MICCAI'16 GlaS, MITOS-ATYPIA 14, TUPAC16 | SSIM, PCC, PSNR | DSC, IoU, PA
Falahkheirkhah et al. (2023) [95] | Pix2Pix-GAN | U-Net, ResNet | – | Mean, SD | PA, mPA, IoU
Fan et al. (2024) [65] | CycleGAN | Mask R-CNN | CPM17, Kumar | – | DSC, AJI, SQ, PQ
Gadermayr et al. (2019) [97] | CycleGAN | U-Net | – | – | F1-score, Precision, Recall
Guan, Li and Zhang (2024) [56] | GramGAN | Mask R-CNN | ANHIR, Glomeruli, Lung dataset | CSS, FID, KID | mAP
He et al. (2022) [61] | cGAN | U-Net+ | CARS | SSIM, MSE | DSC
Hou et al. (2020) [87] | GAN | Mask R-CNN | TCGA | – | DSC
Hossain et al. (2023) [91] | CycleGAN | U-Net | TCGA | SSIM, MSE, PSNR | Acc, DSC, JI
Hossain et al. (2024) [73] | CycleGAN | Mask R-CNN, CNN, U-Net | KIRC, TCGA | SSIM, MSE, PSNR | Acc, DSC, AJI
Hu et al. (2018) [59] | WGAN-GP | – | BMCD-FGCD, Datasets A, B, C and D | – | IoU, F1-score
Juhong et al. (2022) [81] | SRGAN-ResNeXt | Inception U-Net | MUC1, TCGA | SSNR, SSIM, MSE | IoU, DSC
Kablan (2023) [76] | CycleGAN, RRAGAN | U-Net | MICCAI'16 GlaS, MITOS-ATYPIA 14 | FSIM, PSNR, SSIM, MSE, MS-SSIM, RMSE, ERGAS, UQI, RASE | DSC, Acc, Precision, Recall
Kapil et al. (2021) [83] | CycleGAN, DASGAN | SegNet | NSCLC | – | F1-score
Kweon et al. (2022) [86] | PGGAN | U-Net | TCGA | FID, IS | –
Lafarge et al. (2019) [77] | DANN | CNN | MICCAI'16 GlaS, TCGA, TUPAC16 | – | F1-score
Lahiani et al. (2020) [85] | CycleGAN | ResNet | Roche | CWSSIM | F1-score
Li, Hu and Kak (2023) [63] | G-SAN | CNN | CPM17, CoNSep, CryoNuSeg, KIRC, MoNuSAC, MoNuSeg, TNBC | – | PQ, AP, AJI
Liu, Wagner and Peng (2022) [57] | GAN | Mask R-CNN | BBBC-data | – | IoU
Lou et al. (2022) [79] | CSinGAN | Mask R-CNN | MoNuSeg, TCGA, TNBC | – | AJI, DSC
Mahapatra and Maji (2023) [90] | LSGAN, TredMiL | Watershed | TCGA | NMI, BiCC, WsCC | DSC
Mahmood et al. (2019) [88] | cGAN | U-Net | TCGA | – | aHD, F1-score, AJI
Naglah et al. (2022) [92] | cGAN, CycleGAN | U-Net | – | MMI, NMI, HC, BCD | Acc, DSC
Purma et al. (2024) [68] | CycleGAN | U-Net | HN Cancer, MICCAI'16 GlaS, MoNuSeg | FID | AJI, IoU, HD, F1-score
Razavi et al. (2022) [71] | cGAN, MiNuGAN | – | ICPR14 and 12, TUPAC16 | – | DSC, F1-score
Rong et al. (2023) [89] | Restore-GAN | Mask R-CNN | TCGA | SSIM, PSNR | Acc
Shahini et al. (2025) [78] | ViT-P2P | – | MoNuSAC | – | DSC
Song et al. (2023) [67] | CycleGAN | Mask R-CNN | Kather, Kumar, Lizard, In-House, In-House-MIL, PanNuke | – | F1-score, DSC, AJI, PQ
BenTaieb and Hamarneh (2017) [75] | GAN | AlexNet | MICCAI'16 GlaS | IS | Acc
Vasiljević et al. (2021) [94] | CycleGAN, StarGAN | U-Net | – | FID | Precision, Recall, F1-score
Vasiljević et al. (2023) [72] | CycleGAN, HistoStarGAN | U-Net | KidneyArtPathology | FID, SSIM | Precision, Recall, F1-score
Wang et al. (2021) [82] | ACCP-GAN | U-Net | N7, E17, N5 | FSIM, MS-SSIM, PSNR, VIF, FID | Acc, Recall, Precision, DSC, IoU
Wang et al. (2022) [80] | GAN | RCSAU-Net | MoNuSeg, PanNuke | – | Acc, F1-score, DSC, AJI, HD, Recall
Ye et al. (2025) [60] | SASSL | PRANet | CAMELYON16 | – | PA, DSC, IoU
Yoon et al. (2024) [74] | CUT, E-CUT, CycleGAN, E-CycleGAN | U-Net | CPM17, Kumar, TNBC | FID, KID | –
Zhang et al. (2021) [64] | MASG-GAN | U2-Net | CoNSep, CPM17, KIRC, Kumar, TNBC | – | PA, Recall, Specificity, F1-score, IoU, DSC, PRI
Zhang et al. (2022) [58] | CSTN | U-Net | BBBC-data, CoNSep, CPM17, KIRC, Kumar, LYON19 (IHC), MThH, TNBC | FID, KID, AHR | PA, DSC, AJI, MAE
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
