Lineage Plasticity and Stemness Phenotypes in Prostate Cancer: Harnessing the Power of Integrated “Omics” Approaches to Explore Measurable Metrics

Simple Summary
Prostate cancer remains the most frequent cause of cancer morbidity and the second most frequent cause of cancer mortality in men in the developed world, and is an exemplar of a heterogeneous disease. Stemness phenotypes and lineage plasticity have been highlighted as key drivers of the heterogeneity observed both across patients and within the same patient. However, markers that indicate the presence or absence of these events remain to be identified. Next-generation sequencing has proven to be a beneficial approach to distinguish predictive and prognostic biomarkers in various diseases, including prostate cancer. This review explores measurable metrics that can reliably reflect lineage plasticity at the genomic, transcriptomic, and epigenomic levels, as well as bioinformatic tools that can be used to identify measures of lineage plasticity in prostate cancer, in order to inform preclinical and clinical research.

Abstract
Prostate cancer (PCa), the most frequent and second most lethal cancer type in men in developed countries, is a highly heterogeneous disease. PCa heterogeneity, therapy resistance, stemness, and lethal progression have been attributed to lineage plasticity, which refers to the ability of neoplastic cells to undergo phenotypic changes under microenvironmental pressures by switching between developmental cell states. What remains to be elucidated is how to identify measurements of lineage plasticity, how to implement them to inform preclinical and clinical research, and, further, how to classify patients and inform therapeutic strategies in the clinic. Recent research has highlighted the crucial role of next-generation sequencing technologies in identifying potential biomarkers associated with lineage plasticity. Here, we review the genomic, transcriptomic, and epigenetic events that have been described in PCa and highlight those with significance for lineage plasticity. We further focus on their relevance in PCa research and their benefits in PCa patient classification. Finally, we explore ways in which bioinformatic analyses can be used to determine lineage plasticity based on large omics analyses and algorithms that can shed light on upstream and downstream events. Most importantly, an integrated multiomics approach may soon allow for the identification of a lineage plasticity signature, which would revolutionize the molecular classification of PCa patients.


Introduction
Prostate cancer (PCa) remains the most frequent and second most lethal cancer in men in the developed world [1], and is an exemplar of a heterogeneous disease. According to the American Cancer Society, over 280,000 new cases of PCa and over 34,000 deaths due to this disease are anticipated in 2023 in the United States. The clinical course of PCa varies from indolent behavior (which requires minimal, if any, therapeutic intervention) to aggressive disease that progresses rapidly and is resistant to therapy. Morphologically, inter- and intratumor heterogeneity has been observed in PCa. The latter is best described with the Gleason score, which represents the sum of the two most prevalent histologic patterns, with tertiary patterns frequently present and separately accounted for in the pathology report [2,3].
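As a minimal illustration of the scoring rule described above, the sketch below sums the two most prevalent patterns into a Gleason score. The mapping to grade groups 1 to 5 follows the widely used ISUP convention, which is not spelled out in this review, and the restriction to patterns 3 to 5 is an assumption reflecting contemporary reporting. The function names are hypothetical.

```python
def gleason_score(primary: int, secondary: int) -> int:
    """Sum of the two most prevalent histologic patterns."""
    for pattern in (primary, secondary):
        # Assumption: only patterns 3-5 are assigned in contemporary practice.
        if pattern not in (3, 4, 5):
            raise ValueError(f"unsupported Gleason pattern: {pattern}")
    return primary + secondary


def grade_group(primary: int, secondary: int) -> int:
    """Map a pattern pair to an ISUP grade group (assumed convention)."""
    score = gleason_score(primary, secondary)
    if score == 6:
        return 1
    if score == 7:
        # 3+4 and 4+3 share a score but differ in grade group.
        return 2 if primary == 3 else 3
    if score == 8:
        return 4
    return 5  # scores 9 and 10


if __name__ == "__main__":
    for pair in [(3, 3), (3, 4), (4, 3), (4, 4), (4, 5)]:
        print(pair, gleason_score(*pair), grade_group(*pair))
```

The example makes explicit why the order of the two patterns matters: 3+4 and 4+3 yield the same score but different grade groups. Tertiary patterns, which the pathology report lists separately, are deliberately left out of this sketch.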
Therapeutic options vary depending on the disease stage, Gleason score, and serum PSA levels, as well as patient age, comorbidities, and preferences [4]. In most cases, PCa is an AR-dependent tumor; thus, androgen deprivation therapy (ADT) and AR signaling inhibition (ARSI) therapy persist as the mainstay systemic therapies for patients with recurrent or metastatic PCa [5][6][7]. While most patients have a long-term response to ADT, many cancers do recur, leading to castrate-resistant prostate cancer (CRPC). While the majority of CRPC tumors remain AR-driven through various mechanisms, including the acquisition of activating AR mutations, AR gene amplifications, ligand-independent AR splice variants, or ligand promiscuity, up to 20% of CRPC tumors adapt to or lose AR dependence as a means to evade AR-targeted therapy. In these patients, aggressive, atypical clinical features ensue, including lytic bone metastases, visceral dissemination, and low PSA levels for disease burden. Histologically, a transformation from adenocarcinoma to small-cell (neuroendocrine) PCa has been seen in this setting in some, but not all, cases [8,9]. The term aggressive variant prostate cancer (AVPC) is used to describe this disease state, which has limited therapeutic options and accounts for 30% of PCa deaths [10,11]. Many clinical and preclinical efforts have been undertaken to elucidate the underlying biology of disease evolution toward the AR-indifferent cell state in order to identify biomarkers that could facilitate the early recognition of these patients, as well as potential therapeutic targets.
Lineage plasticity, defined as the ability of cells to change their differentiation state, has emerged as a significant hallmark of cancer progression and treatment resistance, and has been proposed as a source of intratumoral heterogeneity [12][13][14]. Based on this ability, neoplastic cells can adapt by switching from one committed developmental pathway to another [12]. The best-studied phenotype of cancer lineage plasticity is the epithelial-to-mesenchymal transition (EMT), which allows neoplastic cells to transform to a less differentiated state with enhanced tumorigenic and metastatic properties [15][16][17]. In EMT, the epithelial phenotype of the cell changes dynamically toward a mesenchymal phenotype, while the tumor progresses in order to bypass either immunologic or therapeutic barriers [18]. Similarly, lineage plasticity has been implicated in the development of aggressive and treatment-resistant phenotypes, such as neuroendocrine and small-cell PCa [19]. It has also been hypothesized to account for the phenotypic heterogeneity of AVPCs and the bidirectional transition of cancer cells between two morphologic and molecular states: AR-driven adenocarcinoma cells and AR-indifferent cells of small-cell and various other morphologies [20]. Hence, lineage plasticity may be used as an additional classifier for patients with PCa, with patients belonging to one of two groups: a group with lineage-constrained disease, which includes patients who respond well to current treatment strategies and whose cancer may expand with consistent histology or slow phenotypic changes and better prognosis [21][22][23][24], and a group with lineage-plastic disease, which includes patients with short or no response to current therapies, whose cancer progresses rapidly with diverse histology and poor outcomes [10,19,25–29]. Such classification would represent a fundamental milestone, enabling clinicians to predict the response to AR inhibition, identify aggressive variants, and facilitate the emergence of new therapeutic targets. However, lineage plasticity measures remain to be defined.
In routine clinicopathological evaluation, histology (morphology and Gleason score/grade group), imaging, and clinical TNM staging are applied to characterize PCa heterogeneity and identify potential lineage plasticity from a histological/clinical perspective [30][31][32][33]. These methods provide valuable information about the tumor heterogeneity, differentiation status, and aggressiveness, which can help researchers and clinicians better understand the clinical implications of lineage plasticity in cancer. However, finer molecular characterizations of tumor heterogeneity and lineage plasticity are needed. In recent years, research has increasingly shifted toward molecular insights driven by next-generation sequencing (NGS) technologies [34]. These advanced methods offer significant opportunities for the in-depth analysis of lineage plasticity in cancer. In particular, single-cell technologies, such as single-cell RNA sequencing (scRNA-seq), enable researchers to perform detailed analyses of tumors at the individual cell level, offering a more profound understanding of cellular heterogeneity and lineage plasticity.
Lineage plasticity is reported to be enabled through genomic and epigenetic events [35], encompassing two differentiation scenarios: (a) dedifferentiation, which refers to a transition from a fully differentiated cell state to a less differentiated cell state, and (b) transdifferentiation, which refers to a transition from a fully differentiated cell state to an alternative fully differentiated cell state [36]. In PCa, lineage plasticity tracing studies have demonstrated that neuroendocrine prostate cancer (NEPC) cells originate from luminal cells in response to ADT, supporting the transdifferentiation scenario [37]. While genetic alterations, such as ETS (E26 transformation-specific) gene fusions, have been identified as drivers of PCa initiation and progression, recent evidence has highlighted the importance of epigenetic and transcriptomic alterations in promoting lineage plasticity [27,38]. As bidirectional inherited changes in chromatin structures that modify gene expression and cell phenotype without any genomic changes, epigenetic alterations represent an ideal mechanism for the development of lineage plasticity [39,40]. Indeed, DNA methylation and histone modifications have been shown to alter gene expression programs and promote lineage plasticity [41]. Similarly, changes in transcription factor expression and activity can promote lineage plasticity by driving cells toward alternative differentiation states [42]. Therefore, epigenetic changes can enhance the switch between different developmental states in accordance with the microenvironmental pressures that occur under various therapeutic strategies.
Cancer stem cells (CSCs) have been proposed as key factors accounting for intratumoral heterogeneity, tumor progression, and evolution [43]. However, the more specific transformation of a neoplastic cell to a stem-like state is a plasticity event that may reveal the true driver in specific contexts [35,44]. Lineage plasticity is associated with the stem-like behavior of neoplastic cells. While stem cells can give rise to different developmental cell states, lineage-plastic cells adapt to environmental changes for the sake of their survival by switching between different cellular conditions. A recent study identified a mesenchymal and stem-like PCa cell state as a result of an ARSI-therapy-induced lineage plasticity response [45]. A detailed investigation into relevant molecular and epigenetic factors, as well as the identification of a cell population that shows stem-like or lineage plasticity-like characteristics, would help elucidate the underlying biology of tumor progression, tumor heterogeneity, therapy resistance, and metastasis.
In PCa, the Yamanaka pluripotent factor SOX2 has been associated with aggressive disease [27,46,47]. The FOXC2 protein has also been identified as a candidate stem cell marker in aggressive NEPC. The gain of function of this marker has been linked to therapy resistance (enzalutamide and docetaxel) and the epithelial-to-neuroendocrine transdifferentiation, while the loss of function has been shown to restore ADT sensitivity and the neuroendocrine-to-epithelial transformation both in vitro and in vivo [48]. Various stem cell-associated signaling cascades, including EMT and EMT-related pathways (TGF-β pathway, Wnt signaling, and Hedgehog pathway), PI3K-AKT-mTOR signaling, JAK/STAT, and others, have been reported to play key roles in the development of aggressive PCa phenotypes and are potentially linked to lineage plasticity [49,50]. EMT and mesenchymal-to-epithelial transition (MET) pathways enable neoplastic cells to transform from an epithelial to a mesenchymal morphology, and metastasize from the site of origin to distant sites returning to an epithelial morphology [51][52][53]. These pathways are best described as "flavors" of lineage plasticity that have been able to elucidate how neoplastic cells metastasize and grow in distant sites. In addition, the N-Myc and Aurora kinase-A pathways have been shown to be upregulated in aggressive PCa phenotypes and have been suggested as candidate pathways that fuel lineage plasticity events, potentially serving as targets for therapeutic strategies [54][55][56][57][58][59][60].
In recent years, "omics" has revolutionized cancer research and treatment by enabling a more comprehensive understanding of the molecular complexity underlying cancer development and progression. Approaches such as genomic, transcriptomic, and epigenomic analyses have showcased the potential to decipher the intricate genetic and epigenetic alterations and molecular pathways that drive tumorigenesis [61], including the identification of key oncogenic drivers and candidate therapeutic targets towards facilitating the development of personalized medicine approaches [62]. As the field continues to evolve, the integration of omics data into clinical practice holds tremendous promise in transforming cancer care and, ultimately, increasing the chances of achieving successful cancer management.
In this review, we discuss the current research advances in the evaluation of lineage plasticity in PCa, including genomic and epigenetic data in support of PCa evolution and progression through various plastic cell states. We discuss the need to identify a lineage plasticity signature using transcriptomic and epigenomic techniques, and address candidate measures that have been studied for other types of cancer that could potentially be leveraged in PCa. We further discuss bioinformatic tools that could be employed in the development of lineage plasticity signatures.

Genomic Drivers of PCa Progression and Evolution
Significant advancements in genomic technologies have enabled researchers to uncover the genetic landscape of PCa. Genome-wide association studies have identified numerous single-nucleotide polymorphisms (SNPs) associated with an increased risk of developing PCa, while NGS has revealed a diverse array of somatic genomic alterations present in PCa, including point mutations, gene fusions, copy number alterations, and structural variations [63]. Table 1 summarizes the most frequent genomic events observed in prostate cancer. Here, we describe these genetic events and their role in prostate cancer development, progression, and evolution, and we highlight the events that correlate with lineage plasticity in PCa.

Table 1 (surviving fragment of the final row): Overexpression; Not determined; Affects histone post-translational modifications and associated with poor recurrence-free survival [64,91,131–137].

Gene fusions of androgen-regulated genes and members of the ETS family of transcription factors are the most common genetic events in primary PCa, with radical prostatectomy and biopsy specimens showing ETS fusions in 27% to 79% of cases [65,66]. In addition, mutations in the SPOP gene, which encodes an E3 ubiquitin ligase, are the second most common genetic event and the most common mutation in PCa. SPOP mutations have been associated with alterations in AR signaling and DNA damage repair pathways, as well as resistance to BET inhibitors through the stabilization of BRD4 [82–85]. SPOP mutations and ETS fusions are mutually exclusive, and studies have identified molecular subtypes of PCa based on the genetic driver present; however, these genetic events have not conferred any prognostic or predictive information to date, nor could they be used to guide therapy selection.
Secondary genetic events include the loss of the PTEN tumor suppressor gene, which is located on chromosome 10. PTEN loss is present in approximately 15% to 20% of primary PCa cases and in 40% to 60% of cases upon disease progression, with greater frequency in ETS-rearranged cases, and has been associated with a higher tumor stage, metastasis, and recurrence [86,98–101]. PTEN loss leads to the increased activity of the PI3K/AKT/mTOR pathway, which promotes cell survival and proliferation [88,102]. TP53 loss has also been seen upon disease progression, with mutations observed in approximately 6% to 8% of patients with primary PCa and >28% of patients with metastatic PCa [104,105]. TP53 is a tumor suppressor gene that plays a critical role in maintaining genomic stability, and its loss or inactivation can lead to the accumulation of additional genetic alterations and increased tumor aggressiveness [103,106–109]. Another defect associated with PCa aggressiveness and poor prognosis is the loss of the tumor suppressor retinoblastoma (RB1) gene, which regulates the cell cycle by inhibiting E2F transcription factor activity [89][90][91][92]. Despite the challenges in targeting RB1 loss in PCa, there is growing interest in developing therapeutic strategies that specifically target this pathway.
Although PCa has been considered an AR-driven disease, AR gene alterations are very rare in primary PCa, and emerge only after androgen deprivation as a mechanism of castration resistance. In that setting, AR amplification has been reported in up to 30% to 40% of cases, and AR gene mutations have been identified in approximately 10% to 15% of cases [69]. AR splice variants are also frequently observed in CRPC, especially after treatment with second-generation antiandrogens. The most common AR variant associated with enzalutamide (ENZA) and abiraterone resistance is AR-V7, an AR isoform that lacks the ligand-binding domain [110].
Recent studies have linked combined defects in the tumor suppressors TP53, RB1, and PTEN to aggressive PCa phenotypes and have identified them as potential drivers of lineage plasticity in PCa [26][27][28]. An integrative analysis showed that the null expression of TP53 through a genomic copy loss or biallelic mutation is seen in ~40% to 50% of metastatic PCa specimens, while biallelic RB1 inactivation, primarily due to a genomic copy loss, occurs in ~12% of them. Only ~4% of metastatic PCa have been reported to have combined defects at both TP53 and RB1 [93,94]. However, combined defects in these genes are more frequent in AVPC patients. Indeed, defects in at least two out of these three genes are included in the NCCN criteria for identifying AVPC patients [95], and can be used to predict benefits from adding carboplatin to cabazitaxel [11], marking them as key lineage-associated markers that are used in clinical practice for decision making. In addition, TP53 mutations, as defined by a diffuse and intense TP53 expression, and RB1 loss, as defined by a lack of expression on immunohistochemistry, have been associated with small-cell carcinoma morphology [96], which is the prototype of AVPC. RB1 loss is frequently seen in both the adenocarcinoma and small-cell carcinoma components of mixed tumors [138]. Using PCa mouse models with PTEN mutations followed by RB1 loss, Ku et al. [26] showed that PTEN and RB1 serve as lineage plasticity markers that enhance tumor metastasis, while the additional loss of tumor suppressor TP53 allows tumors to resist antiandrogen therapy. Using in vitro and in vivo models of human PCa, Mu et al. [27] showed that the loss of function of tumor suppressors TP53 and RB1 is mediated by increased SOX2 levels. SOX2 is a reprogramming transcription factor, one of the four Yamanaka factors that play crucial roles in differentiation processes [47,97]. The inhibition of SOX2 expression led to the restoration of TP53 and RB1 function, resulting in the increased expression of basal (CK5, CK14, and TP63) and neuroendocrine (SYP, CHGA, and NSE) lineage markers.
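The "at least two out of three" rule mentioned above lends itself to a short worked example. The sketch below is only an illustration of that counting rule, not a clinical tool: the function name, the input format (a list of genes called as defective by sequencing or immunohistochemistry), and the case-insensitive matching are assumptions made for this example.

```python
# Tumor suppressors named in the text as the combined-defect panel.
TUMOR_SUPPRESSORS = frozenset({"TP53", "RB1", "PTEN"})


def has_combined_defects(defective_genes, minimum=2):
    """Return True when at least `minimum` of TP53, RB1, and PTEN are defective.

    `defective_genes` is a hypothetical per-sample list of gene symbols
    already called as altered; how a defect is called is outside this sketch.
    """
    hits = {gene.upper() for gene in defective_genes} & TUMOR_SUPPRESSORS
    return len(hits) >= minimum


if __name__ == "__main__":
    print(has_combined_defects(["TP53", "RB1"]))          # two of three
    print(has_combined_defects(["PTEN", "SPOP"]))         # one of three
    print(has_combined_defects(["tp53", "rb1", "pten"]))  # all three
```

Keeping the threshold as a parameter makes it easy to contrast the two-gene rule with stricter definitions, such as requiring all three defects.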
A recent examination of a vast repository of PCa patient-derived xenografts (PDXs) [139] that reflects the spectrum of lethal PCa used various high-throughput techniques, including whole-genome sequencing, targeted sequencing, and RNA sequencing (RNA-seq), to gain a better understanding of potential genomic alterations that may contribute to the development of PCa. The heterozygous deletion or amplification of specific genes did not seem to impact gene expression, while most homozygous deletions resulted in null expression. Several known fusions (TMPRSS2-ERG, TMPRSS2-ETV4, and SLC45A3-ELK4) were observed. The combined defects in tumor suppressors RB1, TP53, and PTEN seemed to be the only key players for PCa aggressiveness that could be linked to lineage plasticity events.
Biomarkers that can be used to guide therapy selection are sparse in PCa. AR-V7 has been proposed as such due to its association with abiraterone and ENZA resistance [112]. However, its use is limited by the rarity of its expression in the pre-abiraterone/ENZA era and its induced expression after exposure to one of these agents. Mutations in DNA homologous recombination repair genes BRCA2 (the most common), BRCA1, ATM, and CHEK2 are early genetic events associated with an increased risk of developing aggressive PCa [73][74][75]87], and are predictive of the response to PARP inhibitors [86]. Mutations in DNA mismatch repair (MMR) genes are less frequent, usually seen in the sporadic setting [76][77][78], and are predictive of the response to the immune checkpoint inhibitor pembrolizumab [79].
The genomic alterations observed in PCa are diverse and complex, with many alterations linked to androgen signaling and DNA damage repair pathways. Understanding these genomic alterations is critical for the development of novel diagnostic and therapeutic strategies for PCa. Indeed, the predictive molecular biomarkers currently in use for therapy selection in PCa represent genomic alterations, i.e., homologous recombination gene variants, microsatellite instability, and tumor suppressor gene defects, the latter two frequently identified through immunohistochemistry, an easy, low-cost, and widely available technique. However, genetic alterations alone have failed to fully describe the heterogeneity and complexity of PCa progression, and additional predictive markers for therapy selection are needed to improve patient outcomes. Thus, we and others have hypothesized that a complementary epigenetic network drives lethal PCa progression [140][141][142].

Epigenetic Changes in PCa Evolution
Whereas genomic alterations can modify gene expression and function through changes in the DNA sequence, epigenetic alterations modulate gene expression without any changes in the DNA sequence. Similar to genomic alterations, epigenetic changes can be inherited by daughter cells following cell division. However, in contrast to genomic alterations, epigenetic alterations can be reversed (e.g., due to pharmacologic inhibition or as a response to environmental stimuli) and are, thus, bidirectional, inheritable regulators of gene expression [39]. Epigenetic changes include: (1) DNA methylation, (2) chromatin remodeling through histone post-translational modifications (PTMs) (methylation, acetylation, phosphorylation, etc.), and (3) the effects of noncoding RNAs (ncRNAs), primarily microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) [143,144]. Epigenetic reprogramming has emerged as an important contributor to cellular processes that drive cancer initiation, progression, and therapy response [145]; thus, epigenetic discoveries can contribute to a better understanding of the underlying events that lead to lineage plasticity and PCa lethality. Table 1 summarizes the most frequently observed epigenetic events in prostate cancer.

DNA Methylation
DNA methylation, the best-studied epigenetic mechanism, refers to the addition of methyl groups to the cytosine residues of CpG dinucleotides [146]. DNA methylation is a dynamic process catalyzed by enzymes called DNA methyltransferases (DNMTs). In general, CpG dinucleotides are methylated in CpG islands (CpG-dense regions above a threshold of an observed versus expected frequency) in noncoding areas and in the promoters of silenced genes, whereas promoters of expressed genes are unmethylated [147,148]. Here, we describe the most studied DNA methylation changes observed in PCa and highlight those that could potentially be linked to lineage-plastic disease.
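The "observed versus expected frequency" threshold used to define CpG islands can be made concrete with a short calculation. The sketch below computes the observed/expected CpG ratio as (number of CpG × sequence length) / (number of C × number of G). The cutoffs used here (length of at least 200 bp, GC content above 50%, ratio above 0.6) are the commonly cited Gardiner-Garden and Frommer criteria; they are an assumption of this example rather than values stated in this review, and the function names are hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("C") + seq.count("G")) / len(seq) if seq else 0.0


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC-content, and obs/exp cutoffs."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)


if __name__ == "__main__":
    cpg_rich = "CG" * 100   # toy sequence, not a real promoter
    at_rich = "AT" * 100
    print(cpg_obs_exp(cpg_rich), is_cpg_island(cpg_rich))
    print(cpg_obs_exp(at_rich), is_cpg_island(at_rich))
```

In practice the test is applied in sliding windows across a genome rather than to a single fixed sequence; the single-sequence form is used here only to keep the arithmetic visible.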
Across the altered DNA methylation patterns described in PCa, the growth suppressor genes APC and RARβ and cell adhesion genes CDH1 and CD44 are among the most frequently hypermethylated genes [116][117][118][119][120][121][123][124][125]. The detection of the hypermethylation of the promoter region of the DNA repair gene GSTP1 (36-100% sensitivity) and cell-cycle-associated gene RASSF1a (53-96% sensitivity) [115,128] in biopsies and body fluids (serum, plasma, urine, and ejaculates) has been suggested as a sensitive and specific marker for detecting PCa. Moreover, the hypermethylation of a three-gene classifier (GAS6/GSTP1/HAPLN3) has also been proposed as a biomarker for more accurate PCa diagnosis [149]. Another gene frequently hypermethylated in PCa is PITX2, which is a transcription factor involved in several cellular processes, including cell proliferation and differentiation. The aberrant DNA methylation of this gene has been associated with a higher risk of recurrence and metastasis in PCa [150,151]. The writers of DNA methylation (DNA methyltransferases, DNMTs) also represent a relevant object of investigation in PCa. A gradient of DNMT expression levels from low- to high-grade PCa has been reported by our group [152]. In addition, global hypermethylation has been linked to metastatic PCa, and hypomethylation at pericentromeric regions and repetitive sequences has been observed in the same patients [153]. The latter refers to the hypomethylation of certain repetitive sequences, such as long interspersed nuclear element-1 (LINE-1) and satellite 2 (Sat2), both of which have been linked to genomic instability and poor prognosis in PCa [129,130]. However, none of these have been translated into clinical practice, nor have any of them shown predictive significance.
Recently, Loyfer et al. [154] released a DNA methylation atlas of 39 normal cell types. For each cell type, 25 markers were highlighted as being uniquely unmethylated compared to the other cell types. These markers could potentially be used as biomarkers for cell type identification in liquid biopsies and, combined with PCa-specific hypermethylated markers, as presented earlier, could result in better noninvasive tests for tumor detection and classification. In addition, the investigation of the methylation status of these markers in the whole spectrum of PCa (primary, metastatic, and AVPC) could shed light on those that could be used as lineage-plasticity-specific markers that drive the progression of the disease. In 2016, Beltran et al. [9] performed a differential methylation analysis between neuroendocrine (CRPC-NE) and adenocarcinoma (CRPC-Adeno) CRPC subtypes to elucidate the potential epigenetic drivers of PCa evolution. They highlighted four genes (CCND1, GATA2, MAPKAPK3, and SPDEF) that were observed to be both hypermethylated and downregulated in the CRPC-NE cohort compared to CRPC-Adeno. Interestingly, Loyfer et al. listed SPDEF in the top 1000 markers that seemed to be significantly unmethylated in normal prostate tissue. It is also known that SPDEF regulates cell differentiation and has been associated with tumor metastasis in PCa [155]. In addition, Beltran et al. observed eight genes (ASXL3, CAND2, ETV5, GPX2, JAKMIP2, KIAA0408, SOGA3, and TRIM9) that were hypomethylated and overexpressed in the CRPC-NE cohort compared to CRPC-Adeno [9]. These findings could potentially be linked to lineage plasticity in PCa and potential epigenetic markers.
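The logic of pairing differential methylation with differential expression, as in the comparison described above, can be sketched in a few lines. This is a generic illustration and not the pipeline used in the cited study: the inputs (a per-gene methylation difference and a per-gene log2 fold change between two cohorts), the cutoffs, the placeholder gene names, and the numeric values are all assumptions made up for the example.

```python
def concordant_genes(delta_beta, log2_fc, meth_cutoff=0.2, expr_cutoff=1.0):
    """Split genes into hypermethylated/downregulated and
    hypomethylated/upregulated sets.

    delta_beta: gene -> methylation difference (cohort A minus cohort B).
    log2_fc:    gene -> log2 expression fold change (cohort A versus B).
    Both cutoffs are illustrative assumptions.
    """
    hyper_down, hypo_up = [], []
    # Only genes measured in both assays can be classified.
    for gene in sorted(delta_beta.keys() & log2_fc.keys()):
        meth, expr = delta_beta[gene], log2_fc[gene]
        if meth >= meth_cutoff and expr <= -expr_cutoff:
            hyper_down.append(gene)
        elif meth <= -meth_cutoff and expr >= expr_cutoff:
            hypo_up.append(gene)
    return hyper_down, hypo_up


if __name__ == "__main__":
    # Placeholder genes and invented values, for demonstration only.
    methylation = {"GENE_A": 0.35, "GENE_B": -0.40, "GENE_C": 0.30, "GENE_D": 0.05}
    expression = {"GENE_A": -2.1, "GENE_B": 1.8, "GENE_C": 0.2, "GENE_E": 3.0}
    print(concordant_genes(methylation, expression))
```

Requiring agreement between the two data types narrows a long list of differentially methylated loci to those with a plausible effect on expression, which is the property that makes such genes candidate markers.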
The DNA methyltransferase inhibitor agent 5-azacytidine has been shown to reverse the global hypermethylation patterns that are observed during cancer development and evolution. This drug has been FDA approved (May 2022) for therapy in newly diagnosed juvenile myelomonocytic leukemia (NCT02447666) [156,157], and >200 clinical trials are currently recruiting patients to test 5-azacytidine alone or in combination for various cancer types, including PCa (https://clinicaltrials.gov/, accessed on 31 May 2023).

Histone PTMs
Chromatin remodeling through histone PTMs represents another epigenetic mechanism frequently altered in PCa. Chromatin can be packed as an accessible euchromatin, which enables gene expression, or as a heterochromatin, which induces gene suppression [158]. This chromatin structure is mainly mediated by histone PTMs, which include methylation, acetylation, ubiquitylation, SUMOylation, and phosphorylation on specific residues of the N-terminal tails of histones [159,160]. In contrast to DNA methylation, which is associated with gene silencing, histone modifications are linked to either gene activation or repression, depending on which residues are modified and the type of modifications present [160,161]. For instance, H3K27ac (the acetylation of lysine 27 on histone H3) and H3K4me3 (the trimethylation of lysine 4 on histone H3) are present at the promoters of transcriptionally active genes, whereas H3K27me3 (the trimethylation of lysine 27 on histone H3) is enriched at repressed gene promoters. Histone methyltransferases (HMTs) and histone demethylases (HDMs) are responsible for adding and removing methyl groups, respectively, and histone acetyltransferases (HATs) and histone deacetylases (HDACs) mediate the addition and removal of acetyl groups to/from histones, respectively [162][163][164]. For example, PRC1 and PRC2 polycomb complexes mediate the trimethylation of histone H3 at lysine 27 residues (marker H3K27me3), resulting in gene silencing and chromatin condensation [165].
The deregulation of histone PTMs modulates gene expression and plays a crucial role in chromatin remodeling. Enzymes that add and remove histone PTMs have been reported to be of clinical relevance in PCa, including the enhancer of zeste homolog 2 (EZH2), which catalyzes the addition of methyl groups to histone H3 at lysine 27 (H3K27); lysine-specific demethylase 1A (KDM1A, also known as LSD1), which catalyzes the demethylation of mono- and dimethylated lysines, specifically histone H3 at lysines 4 and 9 (H3K4 and H3K9); and lysine-specific demethylase 7B (KDM7B, also known as PHF8), which is selective for mono- and dimethylated states [137,166,167]. EZH2 has been shown to be overexpressed upon PCa progression [44] and >50 clinical trials using EZH2 inhibitors are ongoing (https://clinicaltrials.gov, accessed on 31 May 2023). KDM1A and KDM7B are also highly expressed in patients with lethal CRPC [136], and numerous clinical trials using KDM1A inhibitors are in progress, though none have been reported for KDM7B. HDACs are often overexpressed in PCa as well, but while HDAC inhibitors seem to have promising results in hematological malignancies, phase II clinical trials of HDAC inhibitors (vorinostat, pracinostat, panobinostat, and romidepsin) in PCa have failed due to toxicity or disease progression [168]. Among the aforementioned histone-modifying enzymes, EZH2 overexpression has been found to lead to AR silencing in AR-indifferent PCa, transdifferentiation from adenocarcinoma to NEPC, and the activation of lineage-plasticity-related factors [169,170]. The clinical trials of epigenetic modulators that have been or are being tested in PCa can be found in Supplementary Table S1.

Chromatin Remodeling through ncRNAs
Noncoding RNAs (ncRNAs) are RNA transcripts not translated into proteins and can be divided into two groups according to size: miRNAs, which comprise transcripts 18-to 200-nucleotides long, and lncRNAs, which comprise transcripts longer than 200 nucleotides.Gene regulation through ncRNA relies on the binding of ncRNAs to the 3 UTRs of their target mRNAs, resulting in the RNA degradation or inhibition of translation [171,172].Aberrant ncRNA expression has been documented in various types of cancer, including PCa [173].Mapping the ncRNAs of the human genome, as well as their targets, is an ongoing and rapidly expanding effort.
Gene regulation through ncRNA is a promising discovery that could lead to new biomarker and therapeutic approaches. Abnormal miRNA and lncRNA expression has been well documented in most cancer types [174]. Several studies have shown the importance of lncRNAs as modulators of key cellular processes in cancer, and it is believed that many of these transcripts could serve as potential cancer biomarkers [175]. Recent studies indicate that the lncRNAs HOX transcript antisense RNA (HOTAIR), growth arrest-specific 5 (GAS5), PCa gene expression marker 1 (PCGEM1), PCa ncRNA-1 (PRNCR1), and PCa antigen 3 (PCA3) interact with AR signaling during CRPC progression [173,176-179]. Another extensively studied lncRNA in cancer is PCAT-1, which has been shown to be upregulated in PCa and to promote cancer cell proliferation, migration, and invasion [180][181][182][183]. Moreover, PCAT-1 has been reported to be associated with poor prognosis in PCa patients and could potentially be linked to lineage plasticity [184,185]. Additionally, in a recent publication, Singh et al. [186] highlighted the importance of lncRNA H19 and its association with NEPC, suggesting that upregulated H19 levels can be used as a candidate diagnostic and predictive marker of NEPC and a putative marker of biochemical recurrence and metastatic disease in patients receiving ADT.
Besides lncRNAs and miRNAs, other classes of ncRNAs that have been implicated in PCa include circular RNAs (circRNAs), small nucleolar RNAs (snoRNAs), and PIWI-interacting RNAs (piRNAs). The circRNA circHIPK3 has been reported to be upregulated in PCa and to promote cancer cell proliferation and invasion [196][197][198][199]. Moreover, the circRNA circSMARCA5 has been shown to be downregulated in PCa, and its function correlates with the suppression of PCa metastasis [200]. The snoRNA SNORA42 has been reported to be downregulated in PCa and to inhibit cancer cell proliferation and migration [201]. It was also found that piRNA piR-31470 plays a crucial role in the hypermethylation of the promoter of GSTP1 in PCa [202], while piR-001773 and piR-017184 promote PCa progression by downregulating PCDH9 expression [203]. While these findings are promising, longitudinal studies across the spectrum of PCa progression are necessary in order to identify biomarkers with high sensitivity and specificity.
It is clear that DNA methylation, histone PTMs, and ncRNAs are important regulators of gene expression in PCa, and that their dysregulation aids in tumor development and progression. Some of the aforementioned markers could potentially be associated with lineage plasticity and serve as candidate epigenetic markers for lineage-plastic PCa. Even though various studies have identified one or a combination of epigenetic players or events as having a prognostic role, none of them are currently used in routine practice. This may be attributed to the complexity of epigenetic regulation and the notion that a single or even multiple epigenetic markers would not be able to fully describe this complexity. Targeting these modifications may represent a promising therapeutic strategy (Figure 1). The lack of success of epigenetic modulators in solid tumors in general, and PCa in particular, may be attributed to two (not mutually exclusive) factors: (a) the lack of predictive biomarkers for patient selection (which would likely include a network of markers rather than just one or a few) or (b) the need for combination therapy to effectively alter the epigenome. Hence, there is an urgent need to identify epigenetic networks that could serve as both candidate biomarkers and potential therapeutic targets for lineage-plastic PCa.

PCa Heterogeneity as Defined by Transcriptomic Profiles
Transcriptomics has become a cornerstone in unraveling the intricate heterogeneity of prostate cancer. On a small scale, specific RNA expression profiles can be analyzed with various commercially available platforms to prognosticate specific sets of patients and aid clinical decision making [204]. However, by scrutinizing thousands of genes simultaneously, transcriptomics provides a comprehensive snapshot of the gene expression patterns that underlie the diverse characteristics of cancer cells within the prostate tumor microenvironment [205][206][207][208][209]. This may hold promise for even better patient risk stratification in the future.
For example, using a large-scale transcriptomic dataset of 19,470 patients, Spratt et al. were able to identify a low AR-active subgroup in treatment-naïve primary PCa that exhibited molecular characteristics similar to mCRPC [208]. Han et al. were able to identify two luminal (luminal A and luminal S) and two aggressive (AVPC-I and AVPC-M) subtypes, as well as a subtype with mixed transcriptional profiles, with the aggressive subtypes (AVPC-I and AVPC-M) more likely to show docetaxel resistance [210]. Sutera and collaborators performed RNA expression profiling of the primary tumors of patients with mCRPC (stratified as synchronous versus metachronous metastatic disease) [209] and showed that patients who progressed more slowly had a more hormone-dependent transcriptional profile compared to those with synchronous metastases. These findings strengthen the idea that patients who are destined to follow a more aggressive disease course show unique transcriptional profiles, and that the identification of these profiles could inform clinicians' therapy selection (e.g., earlier use of chemotherapy).
These findings enable a new approach for patient stratification based on transcriptomic subtypes. This stratification offers a more refined framework for personalized treatment strategies, allowing clinicians to tailor interventions according to the specific molecular characteristics of each patient's tumor. As transcriptomic technologies continue to advance, the exploration of prostate cancer transcriptomic subtypes promises to provide deeper insights into the complexity of the disease, ultimately paving the way for more effective therapeutic interventions and improved patient outcomes.
However, there are some caveats in the use of transcriptomic analysis in routine practice. Salami et al. modified commercially available molecular scores (cell cycle progression score [211], genomic classifier score [212], and genomic prostate score [213]) by including molecular characteristics of the cellular organization (FLNC, GSN, TPM2, and GSTM2), stroma component (BGN, COL1A1, and SFRP4), and others, and showed that scores differed between grade groups in different tumor foci from the same patient, highlighting PCa tumor heterogeneity at the transcriptomic level [205]. Similarly, Wei et al. performed both genomic and transcriptomic analyses and showed that significant genetic diversity was observed both between different tumor foci from the same patient and between different cores from the same tumor focus, underscoring both the intertumoral and intratumoral heterogeneity at the genomic and transcriptomic level for any single patient [206]. These findings have significant implications for using genomic classifiers in precision medicine, especially in the biopsy setting, as a single core from the prostate may not accurately predict the patient's prognosis or therapy response. Instead, the range of genomic alterations from multiple cores from the index focus, which is the focus with the most aggressive characteristics, as well as from additional potentially aggressive lesions, may be more informative for each patient.
In addition to risk stratification, spatial transcriptomics allows researchers to identify unique gene signatures associated with distinct cellular subpopulations. It also enables the transcriptomic subtyping of tumor subpopulations, shedding light on the underlying molecular diversity that influences disease progression and therapeutic responses [207,208,210]. Spatial transcriptomics has provided a new approach for unraveling the intricate molecular landscape within the context of tissue architecture, providing a spatially resolved understanding of how genes are expressed across different regions of the tumor microenvironment [214][215][216]. In prostate cancer, spatial transcriptomics offers the opportunity to uncover the heterogeneous distribution of cellular populations, including cancer cells, stromal cells, immune cells, and more. By preserving the spatial context, this method enables the identification of distinct molecular signatures associated with various tumor regions, unveiling potential interplays between different cell types. A recent study showed that a spatial transcriptomic approach enabled the identification of the gene expression heterogeneity observed in a PCa specimen with de novo neuroendocrine PCa and coexisting adenocarcinoma [217]. In addition, using spatially resolved metabolic network modeling, Wang and collaborators analyzed the complexity of the metabolic microenvironment of PCa and showed that malignant-cell-specific metabolic vulnerabilities may serve as candidate targets [218].
In 2018, Berglund et al. [219] used spatial transcriptomics, aiming to map the prostate cancer microenvironment and adjacent areas at a transcriptomic level, and revealed that cancer gene expression could be seen beyond the histologic boundaries of the tumor and that changes in the microenvironment may precede cancer-related genetic changes. These findings have important implications, as abnormal transcriptomic profiles in histopathologically normal areas may alert clinicians to an adjacent or future tumor formation.
Almost 50 years ago, Cunha and Lung, in their stromal-epithelial recombination experiments, showed that prostate epithelial development is dependent on stromal AR signaling [220]. Now, it is well established that stromal-epithelial interactions maintain the homeostasis of prostate tissue, with stromal AR signaling mediating epithelial growth and differentiation and epithelial AR signaling mediating luminal cell function [221]. The stromal microenvironment is altered during prostate cancer development. Cancer-associated fibroblasts (CAFs) have been observed in the tumor stroma, where a microenvironment emerges that enables disease progression, therapy resistance, and metastasis [222][223][224]. Stromal-epithelial crosstalk in prostate cancer has been under investigation for the last several years in order to better understand its role in disease progression and metastasis. Altered stroma exhibits unique molecular profiles that have also been associated with metastasis [225,226]. Low AR expression in stromal cells has been linked to disease progression and/or worse outcome (biochemical relapse, ADT resistance, etc.) [227,228], indicating a protective role of stromal AR. A recent review highlighted the changes observed in AR signaling in tumor stroma that could influence the tumor's behavior [229]. Transcriptomic analyses have identified changes in the gene expression profile of the stroma adjacent to tumors, with prognostic implications for the patient, indicating that the molecular profile of tumor-adjacent stroma could reveal valuable information regarding PCa diagnosis, progression, and evolution [219,230,231].
In the context of prostate cancer, transcriptomics has been instrumental in revealing the intricate interplay between cancer cells, stromal cells, and immune cells, shedding light on their contributions to disease progression and treatment resistance. As transcriptomic techniques evolve, including single-cell RNA sequencing and spatial transcriptomics, the implementation of these advanced methodologies in studying prostate cancer heterogeneity promises to uncover deeper insights into the molecular dynamics within tumors, enabling more targeted therapeutic strategies and, ultimately, advancing precision oncology approaches.

Computational and Molecular Perspectives on Lineage Plasticity
Lineage plasticity has long been recognized as a key feature of organ development and tissue regeneration. In recent years, advances in single-cell genomics and transcriptomics have been used to expand our understanding of the mechanisms that underlie lineage plasticity. Most studies have used lineage tracing methods, which label specific cell populations and track their fate over time using in vitro culture systems, genetic manipulation, and transplantation assays [232][233][234][235][236][237][238][239]. While lineage tracing measures allow us to trace longitudinal lineage changes, measures that predict the ability of a cell to switch its differentiation program, undergo dedifferentiation, revert to a more stem-like state, or transdifferentiate into a different cell type remain to be developed. Recent studies have introduced NGS technologies and multiomics as the most promising tools to provide such measures [240][241][242][243][244][245]. Here, we discuss next-generation methods that can be used to develop candidate measures to predict whether a tumor sample shows lineage plasticity features.
Epimutation clocks are heritable epigenetic alterations that fluctuate during the progression and evolution of cancer and have been studied in various cancer types. Gabbutt et al. introduced markers that can be used as a fluctuating DNA methylation clock [246], based on CpG sites that "flip-flop" between methylated and unmethylated states, in colorectal cancer. They further applied this approach to whole blood samples to detect fluctuating DNA methylation clocks and distinguish between acute and chronic leukemias [246], supporting the idea that fluctuating methylation clocks can provide a powerful tool to quantify somatic cell evolution in human tissues. The investigation of epimutation clocks in PCa could give rise to potential markers that could be linked to aggressive disease and lineage plasticity events.
Recent studies in lung adenocarcinoma (LUAD) have shown that multiomics comprising single-cell transcriptomics combined with single-cell epigenomics can reveal distinct and well-described cell states during cancer development and evolution [247,248]. Marjanovic et al. revealed the emergence of a "high-plasticity cell state" (HPCS) with a distinct transcriptional and chromatin profile during the development of LUAD. They analyzed single-cell transcriptomes across the spectrum of LUAD development (seven stages from preneoplastic hyperplasia to LUAD) using genetically engineered mouse models (GEMMs) and showed that a cluster of cells with a highly mixed AT1/AT2 lineage signature was prevalent from early adenomas to fully formed LUAD. These HPCS cells were predicted to have the most numerous and strongest connections to other cell states, and they indeed gave rise to numerous cell states and substantial trajectories when cultured in 3D tumor spheres. In addition, the HPCS expression signature differed from the molecular signature of cancer and normal stem cells [247]. Additionally, a cluster-based pan-cancer analysis across a TCGA collection suggested that the HPCS signature may define more aggressive cancer types associated with drug resistance [247].
Chan et al. [50] analyzed the emergence of HPCS using PCa GEMMs that recapitulated the transition from adenocarcinoma to NEPC with prostate-specific deletion of TP53, RB1, and PTEN, and identified a lineage plasticity signature of mixed luminal-basal gene markers that was unique to highly plastic cells. They demonstrated that the combined defects in these tumor suppressors led to lineage-plastic cell states with a unique mixed luminal-basal molecular signature. In addition, they highlighted the emergence of JAK-STAT and FGFR pathway activation among the programs associated with lineage plasticity and showed that the inhibition of JAK and FGFR in highly plastic organoids resulted in normal acinar morphology. The authors also compared their findings to human disease using scRNA-seq of patient samples with CRPC and organoids derived from human CRPC cells, confirming the relevance of their results [50]. Therefore, they showed that the increased activity of JAK and FGFR was associated with lineage plasticity events. Taken together, these findings strengthen the idea that HPCS represents a lineage plasticity property that is present from the early stages of the disease and can give rise to diverse phenotypic lineages when the tumor's survival is threatened (e.g., through therapies), leading to poor outcomes. Therefore, measures that can predict the presence of HPCS could be used as candidate lineage plasticity biomarkers and potential therapeutic targets.
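As an illustration of how a "mixed luminal-basal" state might be turned into a per-cell number, the following minimal sketch scores each cell by the normalized Shannon entropy of its luminal versus basal marker expression. The marker lists and the scoring rule are assumptions chosen for illustration only; they are not the signature or method reported in [50].

```python
from math import log2

# Hypothetical marker sets, chosen only for illustration (assumption)
LUMINAL = ["KRT8", "KRT18", "NKX3-1"]
BASAL = ["KRT5", "KRT14", "TP63"]


def mean_expr(cell, genes):
    """Mean normalized expression of a marker set; missing genes count as 0."""
    return sum(cell.get(g, 0.0) for g in genes) / len(genes)


def mixed_lineage_score(cell):
    """Entropy of luminal vs. basal program usage, ranging from 0 to 1.

    0 means the cell expresses only one program; 1 means both programs
    are expressed at equal levels (a fully mixed state).
    """
    lum = mean_expr(cell, LUMINAL)
    bas = mean_expr(cell, BASAL)
    total = lum + bas
    if total == 0:
        return 0.0
    score = 0.0
    for value in (lum, bas):
        p = value / total
        if p > 0:
            score -= p * log2(p)
    return score  # two categories, so the maximum entropy is already 1


# Toy cells: gene -> normalized expression (made-up values)
luminal_cell = {"KRT8": 3.0, "KRT18": 2.5, "NKX3-1": 1.0}
mixed_cell = {"KRT8": 2.0, "KRT5": 2.0}
print(mixed_lineage_score(luminal_cell), mixed_lineage_score(mixed_cell))
```

A score of this kind would only flag candidate cells; whether high-scoring cells truly correspond to an HPCS would still need to be validated against chromatin and functional data.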
Blanco et al. [249] showed that chromatin remodeling represents an epigenetic "memory", creating inherited chromatin dynamics that give rise to cell states that result in lineage plasticity, which can be described as an inherent cell property rather than as a specific event. Memory cell states are defined by genetic and epigenetic alterations that can be triggered through diverse environmental stimuli, leading to chromatin remodeling and the emergence of the most appropriate cell state at each particular time [250]. In their recent publication, Tang et al. [251] combined an assay for transposase-accessible chromatin with sequencing (ATAC-seq), chromatin immunoprecipitation sequencing (ChIP-seq), and RNA-seq analyses in PCa cell lines and patient-derived organoids and xenografts to identify four CRPC subtypes with unique chromatin and transcriptional profiles. Those included AR-dependent (CRPC-AR), neuroendocrine (CRPC-NE), Wnt-dependent with low AR expression (CRPC-WNT), and stem-like with low AR expression (CRPC-SCL) subtypes. This study also showed, in agreement with others, that combined defects in the tumor suppressors RB1, TP53, and PTEN were associated with lineage plasticity and aggressive PCa phenotypes [26][27][28]251,252]. In addition, they identified master transcription factors for each CRPC subtype, with AR and FOXA1 being prevalent for CRPC-AR; NEUROD1 and ASCL1 for CRPC-NE; TCF7L2 for CRPC-WNT; and FOSL1 for CRPC-SCL [251]. A pathologic, genomic, and marker gene expression analysis provided validation of the four subgroups, with CRPC-AR showing high levels of AR expression and AR score, CRPC-NE having high SYP expression and a NE-morphologic score, CRPC-WNT specimens showing elevated AXIN2 expression, and the CRPC-SCL subtype being defined by high CD44 expression levels.
Formaggio et al. [253] implicated the overexpression of three (SOX2, OCT4, and MYC) out of the four Yamanaka factors in the dedifferentiation process of lineage plasticity observed in PCa, while the loss of the Yamanaka factor KLF4 was associated with tumor evolution [253].
Thus, early data support the notion that PCa may be classified based on lineage plasticity through the presence of HPCS, as well as epigenetic markers that may characterize these states. The presence of HPCS in a tumor sample may potentially be linked to lineage plasticity and could indicate aggressive disease with poor outcomes. The identification of epigenetic markers and mechanisms that enable lineage plasticity in some, but not all, patients could represent a fundamental milestone for diagnosis, prognosis, and targeted therapy. The development of bioinformatic tools that focus on HPCS identification is likely to be critical in this effort.

Bioinformatic Tools for Lineage Plasticity Signatures and Measures
Bioinformatic tools have played a pivotal role in the identification of biomarkers, as well as the development of molecular signatures in cancer [254][255][256][257][258]. Through the analysis of large datasets, these tools enable the identification of genes and pathways that are dysregulated in cancer cells. The identification of biomarkers through bioinformatic analyses enhances the development of targeted therapies and personalized medicine [259][260][261]. In addition, enrichment analyses provide mechanistic insights into the underlying biology of the development and evolution of the disease [262,263]. Overall, the integration of bioinformatics into cancer research has significantly improved our understanding of the molecular mechanisms underlying cancer and has the potential to improve patient outcomes. Here, we focus on bioinformatic tools that could be incorporated into a multiomics approach to identify lineage plasticity measures and signatures. Table 2 provides a list of bioinformatic tools that could potentially be used to generate measures of lineage plasticity.
Table 2. Bioinformatic tools that could be used for genomic, transcriptomic, and epigenetic enrichment and downstream analysis.

Columns: Tool; Description; GitHub Link (If Available)

Genomics
A variety of genomic bioinformatic tools are available to describe mutations, including copy number alterations (CNAs), which refer to changes in the copy number of genomic regions, such as amplifications and deletions, and copy number variations (CNVs), which are more comprehensive and encompass a broader range of structural alterations in the genome, including CNAs as well as duplications and complex rearrangements, across bulk and single-cell data. These tools enable the identification of specific genomic drivers of cancer [267,269,271,272,301]. Importantly, an integrated multiomics approach would be needed to associate these genomic drivers with lineage plasticity. Here, we describe some of the most well-known tools used for CNV identification in genomic data, highlighting those that could be used for single-cell sequencing analyses. Single-cell DNA sequencing (scDNA-seq) allows for single-cell resolution, but it is limited by the small quantity of DNA in a single cell (approximately 6 pg), which is insufficient for direct whole-genome sequencing [302]. Amplification methods (e.g., multiple displacement amplification and multiple annealing and looping-based amplification cycles) can overcome this limitation, although they may introduce amplification bias. To identify reliable CNAs, specifications including genomic uniformity, depth of coverage, and throughput are important parameters. A higher depth of coverage enables the detection of smaller CNAs with a higher resolution of CNA boundaries [302]. The throughput of scDNA-seq refers to the number of cells that can be simultaneously sequenced, as well as the time needed to complete the sequencing procedure. A high throughput enables a large number of cells to be sequenced, resulting in a more detailed understanding of the cell population.
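The relationship between depth of coverage and CNA resolution can be made concrete with simple arithmetic. The sketch below assumes uniformly distributed reads and an approximate human genome size of 3.1 Gb; the read counts and bin size are hypothetical and serve only to show why shallow per-cell coverage forces the use of large genomic bins.

```python
GENOME_SIZE_BP = 3.1e9  # approximate haploid human genome size (assumption)


def mean_depth(n_reads, read_length_bp, genome_size_bp=GENOME_SIZE_BP):
    """Average depth of coverage: sequenced bases divided by genome size."""
    return n_reads * read_length_bp / genome_size_bp


def expected_reads_per_bin(n_reads, bin_size_bp, genome_size_bp=GENOME_SIZE_BP):
    """Expected read count in one bin if reads fall uniformly on the genome."""
    return n_reads * bin_size_bp / genome_size_bp


# Hypothetical single cell: 3.1 million reads of 100 bp
print(mean_depth(3_100_000, 100))                  # about 0.1x coverage
print(expected_reads_per_bin(3_100_000, 500_000))  # about 500 reads per 500 kb bin
```

Under these assumptions, halving the bin size halves the expected reads per bin, so the counting noise relative to the signal grows as bins shrink; this is the trade-off behind the statement that deeper coverage resolves smaller CNAs.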
MuTect [264] is a powerful computational tool in cancer genomics. Developed by the Broad Institute of MIT and Harvard, MuTect is specifically designed for the detection of somatic mutations using tumor-normal paired samples obtained from NGS data. It compares the genetic profiles of tumor and normal samples to identify and differentiate true somatic mutations from sequencing artifacts and germline variants. This process includes four key steps: the removal of low-quality sequence data, variant detection, filtering and the removal of false-positive results, and the identification of somatic versus germline mutations. This tool has been used for the detection of somatic mutations in various cancer types, including PCa [83,139,265,303]. In PCa, MuTect has been used to identify genomic variations in African populations, which showed an elevated tumor mutational burden in African men with treatment-naïve, high-risk PCa [304]. In addition, using the MuTect package, Hong et al. [305] showed that enrichment of TP53 mutations was linked with metastatic potential in blood samples from patients with metastatic PCa. Therefore, MuTect is a solid tool that can be used for mutational screening and revealing potential mutational drivers of lineage-plastic PCa.
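The logic of tumor-normal comparison can be sketched with a deliberately simplified rule-based filter. This is not MuTect's statistical model; the thresholds and category names below are hypothetical and only mirror the four conceptual steps described above (quality removal, detection, false-positive filtering, and somatic versus germline assignment).

```python
def classify_variant(tumor_alt, tumor_depth, normal_alt, normal_depth,
                     min_depth=10, min_tumor_vaf=0.05, max_normal_vaf=0.02):
    """Toy tumor-normal classification of one candidate site.

    All thresholds are illustrative assumptions, not MuTect defaults.
    """
    # Step 1: discard sites without enough reads in either sample
    if tumor_depth < min_depth or normal_depth < min_depth:
        return "low_quality"
    tumor_vaf = tumor_alt / tumor_depth
    normal_vaf = normal_alt / normal_depth
    # Steps 2 and 3: require a minimum variant allele fraction in the tumor
    if tumor_vaf < min_tumor_vaf:
        return "artifact_or_absent"
    # Step 4: alternate reads in the matched normal suggest a germline variant
    if normal_vaf > max_normal_vaf:
        return "germline_or_contamination"
    return "somatic_candidate"


print(classify_variant(12, 60, 0, 40))   # somatic_candidate
print(classify_variant(30, 60, 20, 40))  # germline_or_contamination
```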
Maftools is an R (Bioconductor) package that provides a comprehensive suite of tools for the analysis and visualization of mutations in cancer genomics data [267]. It provides functions (e.g., "plotmafSummary" and "extractSignatures") to summarize the mutational burden and analyze mutation signatures. Mutation annotations can also include additional information, such as gene annotations, functional impact predictions, and known cancer driver genes. In addition, the package includes advanced visualization functions to generate high-quality plots, such as oncoplots, waterfall plots, and heatmaps, to aid in the identification and interpretation of driver mutations and their associated clinical outcomes. By providing a range of visualization and analysis functions, Maftools allows researchers to gain insights into cancer's genetic mechanisms and to identify potential therapeutic targets. While Maftools was originally designed for bulk DNA-seq analyses, it can potentially be applied to the analysis of somatic mutations in scDNA-seq data, particularly if the data have been aggregated to generate a mutation frequency matrix or mutation annotation format (MAF) file. Maftools has also been used to detect mutations in PCa specimens, as well as for meta-analyses [268,306]. Therefore, it provides an additional tool for identifying mutations that could be linked to lineage-plastic PCa. To achieve this, careful selection of the most appropriate cohorts is essential to enhance the reliability of the results at the genomic level.
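To show what a mutational burden summary computes from MAF-style records, the sketch below counts nonsynonymous variants per sample and divides by the size of the sequenced region. It is a plain Python illustration of the concept rather than a call into Maftools; the column names follow the MAF format, while the 50 Mb capture size and the toy records are assumptions.

```python
# Variant classes treated as nonsynonymous in this sketch (assumption)
NONSYNONYMOUS = {
    "Missense_Mutation", "Nonsense_Mutation", "Nonstop_Mutation",
    "Frame_Shift_Del", "Frame_Shift_Ins", "In_Frame_Del", "In_Frame_Ins",
    "Splice_Site", "Translation_Start_Site",
}


def tumor_mutational_burden(maf_rows, capture_size_mb):
    """Return nonsynonymous mutations per megabase for each sample."""
    counts = {}
    for row in maf_rows:
        sample = row["Tumor_Sample_Barcode"]
        counts.setdefault(sample, 0)  # keep samples with only silent variants
        if row["Variant_Classification"] in NONSYNONYMOUS:
            counts[sample] += 1
    return {sample: n / capture_size_mb for sample, n in counts.items()}


# Toy MAF-like records (made-up data)
rows = [
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "TP53",
     "Variant_Classification": "Missense_Mutation"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "RB1",
     "Variant_Classification": "Nonsense_Mutation"},
    {"Tumor_Sample_Barcode": "S1", "Hugo_Symbol": "PTEN",
     "Variant_Classification": "Silent"},
    {"Tumor_Sample_Barcode": "S2", "Hugo_Symbol": "AR",
     "Variant_Classification": "Silent"},
]
print(tumor_mutational_burden(rows, capture_size_mb=50))
```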
CopyKit is an R package designed to preprocess and analyze single-cell CNV genomic data for the detection and visualization of CNVs, including those that occur at low allele frequencies and in subclonal populations [269]. CopyKit enables the analysis of the copy number substructures of tumor samples and can further the investigation of the intratumoral heterogeneity that is frequently seen in PCa [307]. It also provides a quality control module that marks euploid cells and filters out low-quality cells, so that high-quality aneuploid cells are retained for downstream analyses. CopyKit employs a Bayesian framework for CNV detection, which allows for the accurate estimation of the copy number and allele frequency, as well as the assessment of uncertainty and false discovery rates. The package includes a range of visualization tools, such as heatmaps and scatter plots, which enable the exploration and interpretation of CNV data at different scales. CopyKit is a user-friendly tool that can facilitate the analysis and interpretation of CNV data in a range of genomic sequencing applications, including single-cell sequencing and tumor heterogeneity studies. It provides the advantage of detecting CNVs even at low allele frequencies and in subclonal populations, enabling a better characterization of heterogeneous PCa samples.
HMMcopy is a hidden Markov model (HMM)-based package that provides a wide range of tools for the preprocessing, analysis, visualization, and downstream analysis of genomic data [301]. HMMcopy provides a set of functions and algorithms for detecting and quantifying CNVs from sequencing data, particularly in the context of scDNA-seq. The main advantage of HMMcopy is its ability to accurately detect low-frequency CNVs and mosaic events, which can be missed using other methods. In addition, it enables the simultaneous inference of the segmentation and absolute copy number [302]. After reading the sequencing data, segmentation takes place; then, based on the segment data, an HMM is trained to infer the most likely copy number states. While HMMcopy has mainly been used for other types of data (CGH data), it has also been applied to large-scale scDNA-seq data. The main limitation of HMMcopy is the manual calibration of many parameters and its unreliable detection of ploidy, which often results in an inaccurate copy number estimation [270]. HMMcopy is a frequently used tool for studying CNVs in prostate cancer and has been used to reveal potential biomarkers of lethal outcomes in patients with PCa [308][309][310].
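The idea of inferring the most likely copy number states with an HMM can be sketched with a small Viterbi decoder over binned log2 read-depth ratios. This is a generic textbook formulation, not HMMcopy's implementation: the Gaussian emission model, the fixed standard deviation, the stay probability, and the assumption of a diploid baseline are all illustrative choices.

```python
from math import log, log2, pi


def viterbi_copy_number(log_ratios, states=(1, 2, 3), sd=0.2, stay=0.95):
    """Most likely copy number per bin under a simple Gaussian HMM.

    Each state c emits log2 ratios centered at log2(c / 2), i.e. relative
    to a diploid baseline. All parameters are assumptions for illustration.
    """
    if not log_ratios:
        return []

    def log_emit(x, c):
        mu = log2(c / 2)
        return -0.5 * ((x - mu) / sd) ** 2 - log(sd * (2 * pi) ** 0.5)

    n = len(states)
    log_stay = log(stay)
    log_switch = log((1 - stay) / (n - 1))

    def log_trans(i, j):
        return log_stay if i == j else log_switch

    # Initialization with a uniform prior over states
    score = [log(1 / n) + log_emit(log_ratios[0], c) for c in states]
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = [], []
        for j, c in enumerate(states):
            best_i = max(range(n), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(score[best_i] + log_trans(best_i, j) + log_emit(x, c))
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Traceback from the best final state
    j = max(range(n), key=lambda i: score[i])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [states[k] for k in path]


# Toy profile: neutral region, a gain, neutral again, then a loss
ratios = [0.0, 0.05, -0.03, 0.6, 0.55, 0.62, 0.02, -0.04, -0.9, -1.1, -1.0]
print(viterbi_copy_number(ratios))
```

The fixed diploid baseline in this sketch is exactly the kind of ploidy assumption that the limitation noted above refers to: if the true ploidy differs, every inferred state shifts accordingly.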
CHISEL (Copy number Haplotype Inference in Single cells using Evolutionary Links) is the first tool for allele-specific and haplotype-specific copy number inference in scDNA-seq data [271]. Using a matched normal or pseudonormal sample derived from diploid cells, CHISEL can overcome the low coverage of scDNA-seq data to detect CNAs by amplifying the weak SNP signal. It can also calculate the B-allele frequency (BAF) in genomic regions of approximately 5 Mb by combining a reference-based algorithm with a novel algorithm to phase short haplotype blocks in each cell. CHISEL provides a hierarchical clustering of cells with similar genomic characteristics, as well as tools for gene enrichment and downstream analyses. It can also be integrated with other single-cell sequencing data types, such as RNA-seq, to better understand tumor evolution and lineage plasticity. CHISEL offers different Python commands to run either the entire pipeline with all the steps or only some specific steps (chisel, chisel_nonormal, etc.).
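The benefit of pooling SNP reads over a large bin can be illustrated with a short calculation of the BAF and the allele-specific copy numbers it implies. This sketch is not CHISEL's algorithm; it assumes that SNP alleles have already been phased into haplotypes A and B, and the read counts are made up.

```python
def b_allele_frequency(snp_counts):
    """Pooled BAF for one bin from (a_reads, b_reads) pairs of phased SNPs.

    Returns None when no reads cover any SNP in the bin.
    """
    a_total = sum(a for a, _ in snp_counts)
    b_total = sum(b for _, b in snp_counts)
    covered = a_total + b_total
    return b_total / covered if covered else None


def allele_specific_copies(total_copy_number, baf):
    """Split a total copy number into (A copies, B copies) using the BAF."""
    b_copies = round(total_copy_number * baf)
    return total_copy_number - b_copies, b_copies


# Each SNP has only a read or two, but pooling across the bin gives a usable estimate
counts = [(1, 0), (0, 1), (2, 1), (1, 1)]
print(b_allele_frequency(counts))         # 3 of 7 reads support haplotype B
print(allele_specific_copies(3, 0.33))    # (2, 1): two copies of A, one of B
```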

Transcriptomics
Bioinformatic approaches that aim to analyze scRNA-seq data represent the vast majority of tools for characterizing lineage plasticity, either through tracing trajectories, using machine learning algorithms, or identifying HPCSs. The Waddington optimal transportation (Waddington-OT) model is a well-known algorithm [273] based on the idea that cells are randomly drawn from a probability distribution of gene expression and that each cell has a distribution of likely origins and possible fates. This framework uses longitudinal scRNA-seq data to understand how these probability distributions change over time. It applies the mathematical approach of optimal transport to investigate the process of cellular reprogramming after a transient overexpression of transcription factors [47] to answer various questions, such as what types of cells arise during reprogramming, which developmental paths lead to reprogramming and alternative fates, and what intrinsic factors and cell-cell interactions play a role in this process. The insights of this framework could potentially improve the efficiency of cell reprogramming toward a desired outcome. Regarding the application of the Waddington-OT model to scRNA-seq data, a typical workflow first loads and normalizes the patient data. Then, highly variable genes are selected and the parameters of the model are defined. Next, random initial cell states are generated and the developmental landscape and gradient are identified. Lastly, the "OTclust" function is used to find cell trajectories based on the Waddington-OT model. The Waddington-OT model was used in a recent publication based on the identification of HPCSs in lung carcinoma [247], and can also be used for longitudinal studies in PCa to highlight cell populations with HPCS characteristics. However, it requires sequential scRNA-seq data collection, which elevates the experimental cost.
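The core computation behind optimal-transport trajectory models, coupling cells at an earlier time point to their likely descendants at a later one, can be sketched with entropy-regularized optimal transport solved by Sinkhorn iterations. This is a minimal balanced formulation for illustration; the published Waddington-OT framework additionally models cell growth and uses unbalanced transport. The one-dimensional expression values, the regularization strength, and the iteration count are assumptions.

```python
from math import exp


def sinkhorn(cost, a, b, epsilon=0.5, n_iter=200):
    """Entropy-regularized transport plan between mass vectors a and b.

    cost[i][j] is the cost of moving mass from early cell i to late cell j.
    """
    n, m = len(a), len(b)
    kernel = [[exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the two marginals
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Toy data: one expression coordinate per cell at two time points
early = [0.0, 1.0]
late = [0.1, 0.9, 1.1]
cost = [[(x - y) ** 2 for y in late] for x in early]
a = [1 / len(early)] * len(early)  # uniform mass on early cells
b = [1 / len(late)] * len(late)    # uniform mass on late cells

plan = sinkhorn(cost, a, b)
for row in plan:
    print([round(p, 3) for p in row])
```

Each row of the resulting plan can be read as the distribution of possible fates of one early cell, and each column as the distribution of likely origins of one late cell, which is the interpretation described above.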
Similar to the Waddington-OT model, Forrow et al. [242] developed an algorithm called Lineage-OT that aims to combine lineage tracing and trajectory inference in a unified manner. The framework utilizes mathematical tools from graphical models and optimal transport to reconstruct developmental trajectories from time courses with snapshots of both cell states and lineages. According to the findings, incorporating lineage data into the framework improves the accuracy of tracking complex state transitions, even with fewer measured time points. Furthermore, the integration of lineage tracing with trajectory inference could enable the accurate reconstruction of developmental pathways that are difficult to recover using state-based methods alone. Therefore, optimal transportation models could potentially be used not only to define lineage tracing in longitudinal samples, but also to develop models that predict whether a tumor is destined to progress to aggressive phenotypes or is likely to retain a consistent lineage even after its expansion.
Monocle 2 is an R package that focuses on cell fate identification via reversed graph embedding (RGE), a machine learning approach for a more accurate reconstruction of single-cell trajectories [274,275]. The pipeline works on scRNA-seq data and includes: (a) differentially expressed gene identification for each cluster using t-distributed stochastic neighbor embedding (t-SNE) dimension reduction followed by density peak clustering [311]; (b) pseudotime trajectory reconstruction using the DDRTree RGE algorithm, which yields a "principal graph" [312,313]. The principal graph is shaped as a curve with branches, where the breakpoints are the "intermediate" datasets and the branches are different cellular states/outcomes [274]. In addition, a branch expression analysis modeling (BEAM) algorithm is used to identify genes with significant branch-dependent expression in order to determine the intermediate datasets [314]. After reading and preprocessing the data, the developmental trajectory is generated using Monocle 2's DDRTree algorithm. Monocle 3 has now been released, and further information on installing and using this version is available at https://cole-trapnell-lab.github.io/monocle3/ (accessed on 1 July 2023). The publication of the updated Monocle version is not yet available, as it is currently in the beta phase of its development. However, this tool could be powerful for the identification of cellular states within the same dataset compared to the Waddington-OT and Lineage-OT models. Monocle 2 offers the significant advantage of lineage tracing within the same sample, enabling the investigation of tumor heterogeneity within a single specimen. This would be compelling for the identification of HPCS populations in PCa specimens that could determine the presence or absence of lineage plasticity events.
The Seurat R package is a popular toolkit for the analysis, visualization, and exploration of scRNA-seq data [276]. This tool could be employed to identify and quantify lineage plasticity signatures by providing a comprehensive toolkit for single-cell data analyses, enabling researchers to uncover cellular heterogeneity and developmental trajectories. It includes many techniques and methods for data transformation, the detection and filtering of doublets (scDblFinder), and data normalization. For a downstream analysis, various Seurat-supported techniques are available, such as the principal components analysis (PCA), clustering, and UMAP, which simplifies data visualization by condensing the data into two dimensions. Cell cluster identification is achieved through the "FindClusters" function, employing a shared nearest neighbor (SNN) modularity optimization-based clustering algorithm. SNN compares the nearest neighbors of each cell and defines clusters based on the similarity of their local neighborhoods. Thus, the "FindClusters" function can effectively group cells with similar gene expression profiles, which often represent distinct cell types or states. Additionally, the Seurat package offers an accessible and computationally efficient gene signature function called "AddModuleScore" [278,280]. The module score represents the average expression of a group of genes (usually related to a specific biological function or pathway) in each cell, adjusted for the overall gene expression level in that cell. This allows researchers to compare the activity of certain gene sets between different cell types or conditions. AddModuleScore has been extensively used to test molecular signatures in PCa [315][316][317] and could be further used for molecular signatures linked with lineage-plastic disease.
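The logic of a module score can be sketched as follows: the mean expression of the signature genes in a cell minus the mean expression of control genes drawn from the same average-expression bins. This is a simplified Python illustration of the idea, not Seurat's R implementation; the function name, default parameters, gene labels G3 to G8, and expression values are hypothetical.

```python
import random


def module_score(expr, genes, signature, n_bins=4, n_ctrl=2, seed=0):
    """Signature mean minus expression-matched control mean, per cell.

    expr: list of cells, each a list of expression values aligned with `genes`.
    """
    rng = random.Random(seed)
    n_cells = len(expr)
    mean_expr = {g: sum(cell[k] for cell in expr) / n_cells
                 for k, g in enumerate(genes)}
    # Bin genes by average expression so that controls match the signature.
    ranked = sorted(genes, key=mean_expr.get)
    bin_of = {g: r * n_bins // len(ranked) for r, g in enumerate(ranked)}
    controls = []
    for g in signature:
        pool = [h for h in genes if bin_of[h] == bin_of[g] and h not in signature]
        controls.extend(rng.sample(pool, min(n_ctrl, len(pool))))
    idx = {g: k for k, g in enumerate(genes)}
    scores = []
    for cell in expr:
        sig_mean = sum(cell[idx[g]] for g in signature) / len(signature)
        ctrl_mean = (sum(cell[idx[g]] for g in controls) / len(controls)
                     if controls else 0.0)
        scores.append(sig_mean - ctrl_mean)
    return scores


genes = ["SOX2", "EZH2", "G3", "G4", "G5", "G6", "G7", "G8"]
cells = [
    [5.0, 4.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # signature genes high
    [1.0, 0.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # signature genes low
]
scores = module_score(cells, genes, ["SOX2", "EZH2"], n_bins=2)
```

Subtracting expression-matched controls is what makes the score comparable between cells with different overall expression levels.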
Unlike the "AddModuleScore" function in the Seurat package, which normalizes its scores using the dataset's average expression, UCell [279] uses the Mann-Whitney U statistic to calculate gene signature scores. This package allows researchers to investigate the activity of specific gene sets in individual cells, facilitating the identification of cellular subpopulations and uncovering biological processes or pathways that are active in distinct cell types or states. Thus, gene signature scoring algorithms are accessible and help to develop signatures as potential biomarkers for various conditions, such as lineage plasticity. A molecular signature indicative of lineage plasticity could significantly improve prognostic and therapeutic markers for PCa.
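A rank-based signature score of this kind can be sketched in a few lines. The example below follows the general form of a Mann-Whitney U score computed from within-cell gene ranks with a rank cap. It is an illustrative approximation rather than the UCell package itself: the function name and gene labels are hypothetical, and ties are broken by sort order instead of being averaged.

```python
def rank_signature_score(cell_expr, signature, max_rank=1500):
    """Mann-Whitney U based signature score for a single cell, in [0, 1].

    cell_expr: dict mapping gene -> expression in one cell.
    Genes are ranked from highest (rank 1) to lowest expression, and ranks
    are capped at max_rank so that the uninformative tail is ignored.
    """
    ordered = sorted(cell_expr, key=cell_expr.get, reverse=True)
    rank = {g: min(r + 1, max_rank) for r, g in enumerate(ordered)}
    n = len(signature)
    u = sum(rank.get(g, max_rank) for g in signature) - n * (n + 1) / 2
    u_max = n * max_rank - n * (n + 1) / 2
    return 1.0 - u / u_max


cell = {"A": 9.0, "B": 7.0, "C": 5.0, "D": 3.0, "E": 1.0, "F": 0.0}
top_score = rank_signature_score(cell, ["A", "B"], max_rank=5)
bottom_score = rank_signature_score(cell, ["E", "F"], max_rank=5)
mixed_score = rank_signature_score(cell, ["A", "F"], max_rank=5)
```

Because the score depends only on the ranks within each cell, it does not change when other cells are added to or removed from the dataset.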
CytoTRACE [281] is one of the first tools that aims to measure the presence of lineage plasticity and improve our understanding of cellular dynamics in cancer progression. It is a computational method designed to analyze scRNA-seq data to predict the differentiation potential of individual cell clusters. It also leverages single-cell gene expression data to rank cells based on their differentiation states, from undifferentiated stem cells to more differentiated cell types. CytoTRACE can potentially serve as an independent measure of lineage plasticity, as it calculates the number of expressed genes per cell using single-cell transcriptomic data. With its extensive coverage, including over 18,000 annotated gene sets, it allows for the identification of 52 experimentally determined cell states. This comprehensive approach enables researchers to investigate cellular hierarchies and differentiation potentials, making CytoTRACE a valuable tool for assessing lineage plasticity in various biological contexts. Furthermore, using single-cell transcriptomic and bulk ATAC-seq of human paraxial mesoderm lineage phenotypes, Gulati et al. determined that less differentiated cells possess larger gene counts, which also reflects a more open chromatin accessibility profile compared to well-differentiated cell states. This algorithm could potentially be applied to identify HPCS clusters during the evolution of PCa and set a fundamental milestone in the HPCS identification of lineage-plastic PCa.
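The observation that gene counts track differentiation potential can be illustrated with a minimal sketch that counts the expressed genes per cell and rescales the counts to a 0 to 1 range. This covers only the first ingredient of the idea and is not the CytoTRACE algorithm; the function names, the detection threshold of zero, and the toy matrix are assumptions.

```python
def gene_counts(expr):
    """Number of detectably expressed genes (value > 0) in each cell."""
    return [sum(1 for x in cell if x > 0) for cell in expr]


def scaled_gene_counts(expr):
    """Rescale gene counts to [0, 1]; higher suggests less differentiated."""
    counts = gene_counts(expr)
    lo, hi = min(counts), max(counts)
    if hi == lo:
        return [0.5] * len(counts)
    return [(c - lo) / (hi - lo) for c in counts]


# Rows are cells, columns are genes (toy values).
toy = [
    [3, 1, 2, 5, 1, 4],  # many genes detected
    [0, 2, 0, 6, 1, 0],  # intermediate
    [0, 0, 0, 9, 0, 0],  # few genes detected
]
potency = scaled_gene_counts(toy)
```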

Epigenetics
Epigenetic studies aim to describe the epigenomic landscape of a tumor to better elucidate the underlying biology of cancer evolution. ChIP-seq of histone marks (e.g., H3K27ac and H3K4me1) or ATAC-seq can reveal the chromatin accessibility of a cell state during tumor progression and evolution, while DNA methylation sequencing (DNAme-seq) can be used to identify hypermethylated and hypomethylated regions that can be correlated with disease state or other properties, including lineage plasticity [281,[285][286][287][288][289][290][291][292][318][319][320]. These analyses can reveal candidate epigenetic biomarkers that are linked to lineage plasticity. Here, we discuss the most common packages that are used for ChIP-seq, DNAme-seq, and ATAC-seq analyses.

ChIP-Seq Analysis Tools
The model-based analysis of ChIP-seq (MACS) remains the most well-known tool for identifying enriched regions of transcription factor binding and histone modifications from ChIP-seq data [282,283]. The updated version, MACS2, uses a combination of modeling and peak merging strategies to accurately identify enriched regions [284]. In addition to a Poisson distribution, MACS2 incorporates a local lambda model to account for local biases in the data and a dynamic threshold to control for false positives. MACS2 can detect both broad and sharp peaks, in contrast with MACS, which was designed for sharp peaks. Additionally, MACS2 provides options for conducting a downstream analysis, such as peak annotation and motif discovery. MACS2 has been used in PCa to identify chromatin regions that are altered upon the acquisition of ENZA resistance, which enabled the selection of an appropriate therapy to target ENZA-resistant CRPC [321]. In addition, recent studies have used MACS2 to describe the epigenetic landscape of primary and aggressive subtypes of PCa, enabling the identification of candidate therapeutic targets [322,323]. Further investigations of the epigenetic landscape of lineage-plastic PCa using MACS2 could inform candidate epigenetic markers and therapeutic targets.
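The role of the local lambda in a Poisson peak test can be sketched as follows: the expected read count in a candidate window is taken from the largest of several background rate estimates, and the observed count is tested against it. This is a simplified illustration, not the MACS2 code; the function names and all numerical rates are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 minus a finite sum.

    Adequate for a sketch; precision degrades for extremely small tails.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)
    cdf = term
    for i in range(1, k):
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(genome_rate, control_rates, window_len):
    """Expected reads in a window: the largest of the genome-wide rate and
    the control-derived local rates (reads per bp), scaled to the window."""
    return max([genome_rate] + list(control_rates)) * window_len


lam = local_lambda(genome_rate=0.01,
                   control_rates=[0.02, 0.015, 0.012],
                   window_len=200)
p_value = poisson_sf(15, lam)  # 15 ChIP reads observed in the window
```

Taking the maximum of the background estimates makes the test conservative in regions where the control sample is locally enriched.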
SICER (spatial clustering for identification of ChIP-enriched regions) is a widely used peak-calling method for ChIP-seq data [285,324]. SICER uses a clustering approach to identify enriched regions based on spatial proximity: it divides the genome into nonoverlapping windows of size w and identifies windows of enrichment. Then, it applies a clustering algorithm to group these windows into larger enriched domains. The algorithm is implemented in Python and is controlled by the window size and a threshold; this approach allows SICER to identify smaller, closely spaced enriched regions, which leads to high sensitivity and specificity. SICER also incorporates an FDR estimation method to control for multiple testing, providing a measure of statistical significance for the identified regions. Additionally, SICER can detect both broad and sharp peaks, whereas MACS is optimized for sharp peaks. Using SICER, Coleman et al. [325] determined the epigenetic landscape of BRD4 binding sites and identified BET bromodomain inhibitor sensitivity through MYC suppression, while Dhar et al. [326] introduced the MTA1/Epi-miR-22/E-cadherin axis as an important metastasis-promoting epigenetic signaling pathway in PCa.
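The window-and-cluster strategy can be sketched in plain Python: reads are counted in nonoverlapping windows, windows passing a read threshold are marked as eligible, and eligible windows separated by small gaps are merged into islands. This is a conceptual sketch and not SICER's code or scoring scheme; the function name, parameter names, and default values are assumptions.

```python
def call_islands(read_positions, window=200, min_reads=2, max_gap_windows=1):
    """Merge eligible windows separated by short gaps into enriched islands.

    Returns a list of (start, end, reads_in_eligible_windows) tuples.
    """
    counts = {}
    for pos in read_positions:
        w = pos // window
        counts[w] = counts.get(w, 0) + 1
    eligible = sorted(w for w, c in counts.items() if c >= min_reads)
    islands = []
    for w in eligible:
        # Extend the current island if the gap of ineligible windows is small.
        if islands and w - islands[-1][1] - 1 <= max_gap_windows:
            islands[-1][1] = w
            islands[-1][2] += counts[w]
        else:
            islands.append([w, w, counts[w]])
    return [(s * window, (e + 1) * window, n) for s, e, n in islands]


reads = [10, 50, 120, 450, 480, 2010, 2020, 2030]
islands = call_islands(reads)
```

Allowing gaps is what lets this kind of method recover broad domains, such as those of diffuse histone marks, that a purely local test would split into fragments.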
ChIPseeker is an R Bioconductor package that is also well known for the annotation and visualization of ChIP-seq data [286]. ChIPseeker annotates ChIP-seq peaks to genomic features, including genes, exons, introns, promoters, and enhancers. It also provides custom annotation functions that can be used to annotate user-defined genomic features (e.g., "annotatePeak"). Its visualization functions include heatmaps ("plotHeatmap"), profiles, and genome browser tracks, allowing users to explore and visualize ChIP-seq data in a variety of ways. ChIPseeker also provides the opportunity for the downstream and enrichment analysis of the genes annotated by the ChIP-seq peaks, which could provide insight into the biological processes and pathways regulated by the transcription factors or histone modifications of interest. The main difference between MACS and ChIPseeker is that MACS focuses on peak calling and the identification of enriched regions of ChIP-seq data, while ChIPseeker is primarily used for the annotation and visualization of ChIP-seq peaks. Recent studies used ChIPseeker to screen chromatin alterations upon drug administration in PCa samples [169,327].

DNAme-Seq Analysis Tools
Bismark is a widely known tool for the alignment and analysis of DNAme data, providing simultaneous read mapping and methylation calling in a single command [287]. It is run from the command line to align bisulfite-treated reads to a reference genome, either provided by the user or obtained from public databases. It then discriminates the methylation status of cytosine residues in CpG, CHG, and CHH contexts, enabling the visualization of methylation data to interpret the results. Bismark is a highly configurable tool that allows users to adjust its parameters, such as alignment stringency, quality filtering, and read trimming. It also provides options for filtering out PCR duplicates and calculating differential methylation between samples. Bismark is available as a command-line tool and can be run on Linux, MacOS, and Windows operating systems. It requires the Perl programming language [328,329], as well as the installation of either the Bowtie [330,331] or Bowtie 2 [332,333] alignment programs. Bismark can work with whole-genome bisulfite sequencing (WGBS) data and reduced representation bisulfite sequencing (RRBS) data. It has been extensively used to study the epigenetic landscape of the whole spectrum of PCa, providing important insights into hypomethylated and hypermethylated genes in each cell state [334][335][336][337].
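The principle behind methylation calling from bisulfite-treated reads can be sketched as follows: at each reference cytosine, a C in the read indicates protection from conversion (methylated) and a T indicates conversion (unmethylated), while the two downstream bases define the CpG, CHG, or CHH context. This toy example assumes a gapless forward-strand alignment and is not Bismark's implementation; the function names and sequences are hypothetical.

```python
def cytosine_context(ref, i):
    """Classify the cytosine at ref[i] as CpG, CHG, or CHH (H = A, C, or T)."""
    nxt = ref[i + 1] if i + 1 < len(ref) else None
    nxt2 = ref[i + 2] if i + 2 < len(ref) else None
    if nxt == "G":
        return "CpG"
    if nxt is None or nxt2 is None:
        return None  # too close to the end of the sequence to classify
    return "CHG" if nxt2 == "G" else "CHH"


def call_methylation(ref, read, offset=0):
    """Methylation calls for a forward-strand read aligned at ref[offset:]."""
    calls = []
    for j, base in enumerate(read):
        i = offset + j
        if ref[i] != "C":
            continue
        if base == "C":
            state = "methylated"    # protected from bisulfite conversion
        elif base == "T":
            state = "unmethylated"  # converted and sequenced as T
        else:
            continue                # mismatch or sequencing error
        calls.append((i, cytosine_context(ref, i), state))
    return calls


reference = "ACGTCAGCTTA"
bs_read = "ACGTTAGTTTA"
calls = call_methylation(reference, bs_read)
```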
BS Seeker is another tool for the alignment of bisulfite-treated reads to a reference genome. It can also identify methylated and unmethylated cytosines at a single-base resolution [288]. To perform these functions, the user supplies the command and its arguments to run BS Seeker for alignment or methylation calling. BS Seeker also uses Bowtie to map the bisulfite reads generated from WGBS or RRBS data. It includes the post-procedure removal of low-quality mappings based on the number of mismatches. While Bismark uses a bisulfite-aware alignment algorithm that accounts for the effects of bisulfite treatment on the DNA sequence, BS Seeker uses a two-step alignment approach that first aligns the reads to an unconverted reference genome and then uses a bisulfite conversion algorithm to generate a converted genome for the alignment of the bisulfite-treated reads. This approach may provide greater flexibility in the choice of a reference genome and alignment algorithm, but may also introduce some biases or inaccuracies in the conversion process.
MethylKit is an R package that has been designed for the analysis and visualization of DNA methylation data [289]. It provides a variety of tools for the identification of differentially methylated regions (DMRs) from bisulfite sequencing data, as well as functions for the downstream analysis of DMRs. It includes algorithms for data normalization and visualization tools, such as heatmaps, scatter plots, and density plots, for the interpretation of findings. In addition, MethylKit enables the genomic annotation and gene ontology analysis of the DMRs. In contrast to Bismark and BS Seeker, which focus on alignment, MethylKit focuses on the analysis and visualization of the results. MethylKit has been used in PCa for the identification and annotation of differentially methylated sites that could reveal potential epigenetic markers linked to different disease states [9,335,338].
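A per-site test for differential methylation between two samples without replicates can be sketched with Fisher's exact test on the methylated and unmethylated read counts. This is an illustrative Python version of the statistical idea, not the MethylKit R code; the function name and the read counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are samples; columns are methylated and unmethylated read counts.
    """
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x methylated reads in sample 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


# One CpG site: 18 of 20 reads methylated in sample 1 versus 5 of 20 in sample 2.
meth_diff = 18 / 20 - 5 / 20
p_site = fisher_exact_two_sided(18, 2, 5, 15)
```

In practice, such per-site p-values would still need correction for multiple testing across all tested sites.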

ATAC-Seq Analysis Tools
Many tools that can be used for ChIP-seq analyses are also applicable to bulk ATAC-seq analyses, including MACS2 and ChIPseeker [284,286]. Here, we highlight a number of tools that could be used to analyze single-cell ATAC-seq data.
Single-Nucleus Analysis Pipeline for ATAC-seq (snapATAC) is one of the few packages designed for conducting a comprehensive single-cell ATAC-seq (scATAC-seq) analysis [291]. SnapATAC provides a variety of tools, including the alignment of reads to a reference genome, quality control, peak calling, visualization, and clustering. It allows for the identification of cellular heterogeneity by comparing chromatin accessibility profiles between cells. In addition, the tool supports the integration of single-cell gene expression data, which allows for the identification of cell types and states based on chromatin accessibility and gene expression patterns. It can also predict enhancer-promoter interactions and enables batch correction, differential accessibility analysis, and the identification of lineage trajectories and key transcription factors. In PCa, snapATAC has been used to study chromatin sites that are shared in low-grade PCa and lost in high-grade samples [339].
Cellcano is a recently developed tool for the inference of cellular hierarchies in scATAC-seq data [292]. It uses a two-round supervised learning algorithm to identify cell types. First, it uses the reference dataset to train a multilayer perceptron to identify anchor cells in the target dataset. Then, using these anchor cells, it trains a knowledge distiller model (KD model) [340] to learn the relationships between chromatin accessibility profiles and cell types. The trained KD model is then applied to predict the cell types of non-anchor cells. Cellcano is a recent tool, and no studies applying it to PCa have been published yet. However, it is a promising tool for cell type annotation in scATAC-seq data, and could be used to provide insights into tumor heterogeneity observed at the chromatin level.
Signac is a recent R tool that has been designed for the analysis and visualization of scATAC-seq data [293]. It provides a variety of tools, including peak calling, quality control, visualization, clustering, and integration with scRNA-seq data. It also enables the identification of differentially accessible peaks, enriched motifs, key transcription factors, and gene annotation of the peaks. Importantly, Signac has been designed to interact with the Seurat package, enabling multiomics analyses. In PCa, Signac has been used to identify epigenetic markers of metastatic potential, which aligns with the goal of identifying markers of lineage-plastic disease [341,342].
EpiAnno is a Python tool that has recently been developed for scATAC-seq data analyses using a probabilistic generative model and a Bayesian neural network [294]. The model is designed to embed cells into a latent space where each cell type corresponds to a Gaussian mixture distribution. EpiAnno characterizes cell heterogeneity and has shown accurate results for within-dataset and cross-dataset annotations. The trained EpiAnno and learned cell-embedding parameters are interpretable and can reveal biological insights through a tissue-specific expression enrichment analysis, partitioned heritability analysis, cell type-specific enhancer identification, and cell type-specific cis-coaccessibility analysis. Since EpiAnno is a recently developed tool, no publications in PCa are available. However, it provides important features for studying intratumoral heterogeneity, which is one of the most important characteristics of PCa.

Enrichment Analysis
Gene enrichment analyses provide descriptions of upstream and downstream regulatory pathways and associated molecules, which are necessary to elucidate the biological background of cancer development and evolution. Gene enrichment analyses can be performed using computational software (e.g., GSEA and IPA), while publicly available databases (e.g., NCBI/NIH, GEO, and TCGA) provide an essential repository of well-defined data and molecular signatures that could be used in this setting [295,296,343,344].
The gene set enrichment analysis (GSEA v. 4.3.2) is a powerful global tool for the characterization of cellular functions and pathway enrichment, as well as endogenous and exogenous changes and the relations between the genes of individual datasets. GSEA requires a list of genes ranked by differential expression, as well as the selection of the window of the ranked list and the preferred parameters for analysis [345]. In this way, GSEA analyzes an extended list of gene sets and reports the expression status (upregulation or downregulation) of each in the input datasets. The vast majority of publications have used GSEA to perform pathway enrichment analyses, and it has provided important insights into the underlying mechanisms of PCa development and evolution [346][347][348][349][350][351][352][353].
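The running-sum statistic at the heart of a gene set enrichment analysis can be sketched as follows: walking down the ranked list, the sum increases at genes in the set (weighted by their ranking metric) and decreases otherwise, and the enrichment score is the maximum deviation from zero. This is a minimal sketch of the statistic only, without the permutation-based significance testing, and it is not the GSEA software; the function name, gene labels, and ranking values are hypothetical.

```python
def enrichment_score(ranked_genes, weights, gene_set, p=1.0):
    """Signed maximum deviation of the weighted running-sum statistic.

    ranked_genes: genes ordered from most up- to most downregulated.
    weights: ranking metric for each gene, in the same order.
    """
    hits = [g in gene_set for g in ranked_genes]
    n_miss = len(ranked_genes) - sum(hits)
    hit_norm = sum(abs(w) ** p for w, h in zip(weights, hits) if h)
    if hit_norm == 0 or n_miss == 0:
        raise ValueError("gene set must partially overlap the ranked list")
    running, best = 0.0, 0.0
    for w, h in zip(weights, hits):
        running += abs(w) ** p / hit_norm if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_up = enrichment_score(ranked, metric, {"g1", "g2"})
es_down = enrichment_score(ranked, metric, {"g5", "g6"})
```

A positive score indicates that the gene set is concentrated among the upregulated genes, and a negative score indicates concentration among the downregulated genes.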
Ingenuity Pathway Analysis (IPA Summer Release (2023)) is another software package that is widely used for gene set analyses, aiming to elucidate upstream and downstream biological events. In contrast to GSEA, IPA predicts the master regulators for each upstream or downstream regulatory pathway and potential targets for drug development and experiments [296]. Both tools are easy to use and come with detailed tutorials; GSEA is free of charge, whereas IPA requires a subscription fee. IPA has been extensively used in PCa to study canonical pathways and molecule interactions in different disease states [354][355][356][357][358][359].
Improvements in this field are continuous, and online platforms for enrichment analysis have already emerged. Enrichr is an integrative web-based tool for enrichment analysis that includes one of the largest collections of gene set libraries [297,298]. Enrichr visualizes the enrichment results as clustergrams and includes information about differentially expressed genes after drug, gene, disease, and pathogen perturbations. Another web-based tool, FLAME [300], allows for the input of multiple gene lists with a parallel exploration and analysis, and utilizes STRING's API [360] to generate interactive protein-protein interaction (PPI) networks. FLAME provides a visual analytics approach with adjustments and parameter options in addition to heatmaps, bar charts, Manhattan plots, networks, and tables.

Conclusions and Future Directions
The identification of the determinants of lineage plasticity and the definition of a measurable metric of such signatures at the molecular level in PCa are predicted to translate into prognostic and predictive biomarkers of the disease, as well as new therapeutic strategies, particularly with the goal of addressing chemoresistance. NGS technologies are promising tools for the development of such measures. Single-cell NGS technologies studying GEMMs, human tissues, and PDXs allow us to perform longitudinal lineage tracing across the different cell states that arise during tumor development, therapy resistance, migration, and metastasis [361][362][363]. As presented in this review, NGS assays have been used to provide high-resolution information relevant to intratumoral heterogeneity and the tumor microenvironment at genetic, transcriptional, and epigenetic levels and to identify crucial factors and cell states that promote tumor progression, therapy resistance, and migration [364][365][366][367][368].
Historically, microscopy methods have been tried and tested for their efficacy as a robust tool for analyzing cancer-related challenges. The evolution of NGS, the emergence of big data, and the plethora of machine learning tools create a highly promising molecular avenue for cancer analysis. Once technological and analytical hurdles have been resolved, it is predicted that the application of DNA-seq in molecular pathology evaluation of tumors could eventually rival that of the microscope [369]. Moving forward, the research community should focus on integrating these two aspects to achieve a system-level understanding of lineage plasticity, which would yield more reliable and comprehensive results.
Combined defects in the tumor suppressors RB1, TP53, and PTEN seem to be significant for PCa lineage plasticity events. In addition, epigenetic alterations, including the overexpression of epigenetic modulators such as EZH2 and SOX2, seem to be involved in tumor evolution as components of lineage plasticity. Epimutation clocks similar to those proposed in recent studies [246,370] remain to be characterized in PCa. Furthermore, the presence of HPCS clusters could be used as a candidate biomarker for lineage plasticity, which is linked to aggressive phenotypes. The characterization of lineage-plasticity-associated chromatin remodeling could also represent a fundamental milestone for understanding and targeting lineage-plastic PCa. Unique signatures identified through enrichment analyses and signature scores could inform the characterization of lineage plasticity, revealing additional targets to disrupt the driver events.
Author Contributions: S.L.: conceptualization and writing-original draft preparation; E.P.: conceptualization and writing-original draft preparation; V.Z.: writing-review and editing; C.L.: funding acquisition and writing-review and editing; A.G.V.: writing-review and editing; R.S.: conceptualization, funding acquisition, and writing-review and editing; V.T.: conceptualization, funding acquisition, and writing-review and editing. All authors have read and agreed to the published version of the manuscript.

Figure 1. Pipeline of current and future therapy selection approaches in prostate cancer.

Funding:
The authors acknowledge the financial support of the MD Anderson Prostate Cancer SPORE Grant P50 CA140388 (R.S. (CEP award), C.L.), the MD Anderson Institutional Research Grant (R.S.), and the Research Committee of the University of Patras (82151) "Epigenetic Research in Prostate Cancer" grant (V.T.; C.L.) by MD Anderson Cancer Center.

Table 1. Frequent gene alterations in prostate cancer.