Dysregulation of Placental Functions and Immune Pathways in Complete Hydatidiform Moles

Gene expression studies of molar pregnancy have been limited to a small number of candidate loci. We analyzed high-dimensional RNA and protein data to characterize molecular features of complete hydatidiform moles (CHMs) and corresponding pathologic pathways. CHMs and first trimester placentas were collected, histopathologically examined, then flash-frozen or paraffin-embedded. Frozen CHMs and control placentas were subjected to RNA-Seq, with resulting data and published placental RNA-Seq data subjected to bioinformatics analyses. Paraffin-embedded tissues from CHMs and control placentas were used for tissue microarray (TMA) construction, immunohistochemistry, and immunoscoring for galectin-14. Of the 14,022 protein-coding genes expressed in all samples, 3,729 were differentially expressed (DE) in CHMs, of which 72% were up-regulated. DE genes were enriched in placenta-specific genes (OR = 1.88, p = 0.0001), of which 79% were down-regulated, imprinted genes (OR = 2.38, p = 1.54 × 10−6), and immune genes (OR = 1.82, p = 7.34 × 10−18), of which 73% were up-regulated. DNA methylation-related enzymes and histone demethylases were dysregulated. “Cytokine–cytokine receptor interaction” was the most impacted of 38 dysregulated pathways, among which 17 were immune-related pathways. TMA-based immunoscoring validated the lower expression of galectin-14 in CHM. In conclusion, placental functions were down-regulated, imprinted gene expression was altered, and immune pathways were activated, indicating complex dysregulation of placental developmental and immune processes in CHMs.


Introduction
Gestational trophoblastic disease (GTD) is characterized by abnormal trophoblastic proliferation and includes hydatidiform mole (complete and partial) and gestational trophoblastic neoplasia (invasive mole, choriocarcinoma, placental site trophoblastic tumor, and epithelioid trophoblastic tumor) [1,2]. Diagnosis relies on clinical presentation and histologic assessment, and treatment is aimed at uterine evacuation with chemotherapy typically reserved for gestational trophoblastic neoplasia (GTN) [2,3]. Post-molar human chorionic gonadotropin (hCG) monitoring is an essential part of management for evaluating the development of GTN, which follows complete hydatidiform moles (CHMs) in~15-20% of cases [4]. Although GTN is considered among the most curable of all solid tumors with cure rates over 90%, unrecognized and misdiagnosed GTD can result in unnecessarily increased maternal morbidity and mortality [3].
The reported incidence of GTD varies by geographic location, race or ethnicity, maternal age, and histopathologic subtype. Hydatidiform mole reportedly complicates up to two per 1000 pregnancies in Southeast Asia and Japan, nearly twice that proportion reported in North America, Australia, New Zealand, and Europe [5]. This geographic variation has been partially attributed to racial and ethnic differences, as the prevalence of GTD is elevated in American Indians, Asians, and Hispanics across the world [6,7]. Asian and American Indian women have also been shown to have more aggressive disease, with increased risk of developing GTN compared to other ethnic groups [8,9]. Extremes of maternal age are also correlated with higher rates of CHMs, with an increased incidence among women over 40 years of age and under 20 years of age [10]. Beyond maternal age and ethnicity, prior GTD is the strongest risk factor for GTD, with an incidence of 1.3% [11].
These described risk factors (race/ethnicity and prior GTD) are in accord with an underlying genetic etiology of GTD [1,2,12]. The pathophysiology of hydatidiform moles involves overrepresentation of the paternal genome. The biparental diandric triploid karyotype of partial moles accords with dispermic fertilization, while androgenetic diploid karyotype of most CHMs is consistent with monospermic or dispermic fertilization. Recently, some women with recurrent androgenetic CHMs were shown to have bi-allelic deleterious mutations in MEI1 (meiotic double-stranded break formation protein 1), TOP6BL (type 2 DNA topoisomerase 6 subunit B-like), and REC114 (meiotic recombination protein REC114), leading to meiotic double-strand break formation and extrusion of all maternal chromosomes [13]. Absence of maternal imprinting of gene expression in hydatidiform moles has also been observed in the rare biparental hydatidiform moles due to NLRP7 (NLR family pyrin domain containing 7) or KHDC3L (KH domain containing 3 like) mutations, suggesting a common endpoint of pathogenesis [12,14,15]. However, for the more common sporadic CHMs, little is known regarding mechanisms responsible for either pathogenesis or progression to GTN.
The few targeted gene expression studies on molar tissue and a recent meta-analysis of these studies showed that the main genes differentially expressed (DE) in molar tissues may be those involved in villous trophoblast differentiation [16]. However, these findings were based on a limited set of molecules, and these studies mostly targeted placenta-or trophoblast-specific transcripts that were known to be differentially expressed during trophoblast differentiation. A more comprehensive approach to identifying genes and pathways involved in the development of molar disease would be a genome-wide gene expression analysis using either microarrays or RNA-Seq, followed by protein-level validation of DE transcripts.
We sought to implement such a high-dimensional and systems biology approach, similar to that used in our recent study on the pathophysiological processes in preeclampsia [17], to gain more in-depth insight into CHM pathogenesis at RNA and protein levels. This high-dimensional, agnostic study is the first to evaluate gene expression levels in CHMs using RNA-Seq followed by protein level validation of selected DE transcripts by immunostaining of tissue microarrays (TMA) and immunoscoring. The aim of our study is to identify genes with expression levels that differ in molar tissue from CHMs in comparison to placental chorionic tissue from uncomplicated pregnancies at similar stages of gestation. More complete understanding of the molecular pathways perturbed in CHMs may inform future efforts to improve procedures for early diagnosis and prognostication.

Differential Gene Expression in CHMs
Among transcripts of 14,022 protein-coding genes analyzed with RNA-Seq, a total of 3,729 (27%) were found to be DE in CHMs in comparison to normal first trimester placental tissues. Of these, 2,667 (72%) were up-regulated while 1,062 (28%) were down-regulated in CHM tissues (Supplementary Table S1, Figure 1). Of note, there were seven genes with placenta-specific expression among the top 20 transcripts most down-regulated but not up-regulated in CHMs (Table 3). Figure 1. Differential gene expression in complete hydatidiform moles. All 14,022 expressed proteincoding genes are represented in terms of their measured differences in transcript abundance (x-axis) and the significance of the difference (y-axis) on a volcano plot. The significance is represented as negative log (base 10) of the adjusted p-value so that more significant differences in expression are plotted higher on the y-axis. Dotted lines represent the thresholds used to select the differentially expressed (DE) genes: <−1 and >1 for the magnitude of differential expression and pFDR < 0.05 for statistical significance. According to these criteria, of the 3,729 DE transcripts, 2,667 were up-regulated (depicted with red), while 1,062 were down-regulated (depicted with blue) in molar tissues. Enrichment analysis revealed significant enrichment (OR = 1.88, p = 0.0001) of placenta-specific genes (Supplementary Table 2, Figure 2A) among DE genes. Of interest, 50 out of 63 (79%) placentaspecific DE genes, found to be expressed mainly by the trophoblast, were down-regulated. Among functions of products of these genes were growth hormones (CSHL1, chorionic somatomammotropin hormone-like 1; CSH1, CSH2), immune response (LGALS14, lectin, galactoside-binding, soluble, 14; PSG4), cell attachment (EGFL6, EGF-like domain multiple 6; SMAGP, small cell adhesion glycoprotein; SVEP1, Sushi, Von Willebrand factor type A, EGF and pentraxin domain containing protein 1), coagulation and blood pressure regulation (AGTR1, angiotensin II receptor type 1; F13A1, coagulation factor XIII A chain), and cell differentiation and developmental processes (PAGE4, PAGE family member 4; PLAC1, placenta specific 1; RASA1, RAS P21 protein activator 1). Figure 1. Differential gene expression in complete hydatidiform moles. All 14,022 expressed protein-coding genes are represented in terms of their measured differences in transcript abundance (x-axis) and the significance of the difference (y-axis) on a volcano plot. The significance is represented as negative log (base 10) of the adjusted p-value so that more significant differences in expression are plotted higher on the y-axis. Dotted lines represent the thresholds used to select the differentially expressed (DE) genes: <−1 and >1 for the magnitude of differential expression and pFDR <0.05 for statistical significance. According to these criteria, of the 3,729 DE transcripts, 2,667 were up-regulated (depicted with red), while 1062 were down-regulated (depicted with blue) in molar tissues. Enrichment analysis revealed significant enrichment (OR = 1.88, p = 0.0001) of placenta-specific genes (Supplementary Table S2, Figure 2A) among DE genes.

Dysregulated Biological Processes and Pathways in CHMs
Use of iPathwayGuide to investigate biological processes and pathways dysregulated in CHMs revealed that among 665 gene ontology (GO) biological processes, the most impacted were "cell adhesion", "biological adhesion", "multicellular organismal process", and "signaling" (Supplementary Table S5). Applying Elim pruning that iteratively removes genes mapped to a significant GO term from more general higher level GO terms, identified "calcium ion binding", "growth factor activity", "extracellular matrix structural constituent", and "G protein-coupled receptor activity" to be the most impacted among 150 dysregulated biological processes (Table 4, Supplementary Table S6). We found the most impacted GO molecular functions to be "signaling receptor activity", "molecular transducer activity", and "gated channel activity" among 105 dysregulated molecular functions (Supplementary Table S7). Applying Elim pruning, "regulation of signaling receptor activity", "cell adhesion", "chemical synaptic transmission", and "extracellular matrix organization" were identified as the most impacted among 628 dysregulated molecular functions (Table 5, Supplementary Table S8). The most impacted Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways included "cytokine-cytokine receptor interaction", "cell adhesion molecules", "protein digestion and absorption", and "neuroactive ligand-receptor interaction", all important for placental functions (Table 6,  Supplementary Table S9, Figure 3). An unanticipated finding was the most extensive dysregulation of "cytokine-cytokine receptor interaction" pathway ( Figure 4) and "cell adhesion" pathway, both required for immune cell influx and activation. In addition, representation of 17 immune-related pathways among 38 dysregulated pathways ( Table 6, Supplementary Table S9) reflects a strong immune component of molar pathogenesis.  Immune-related pathways are shown in bold blue. False discovery rate, pFDR.

Validation of RNA-Seq Results at the Protein Level
First, we immunostained TMA slides for cyclin-dependent kinase inhibitor p57 (p57) expression to confirm the histopathology diagnosis of CHM at the molecular level. Out of 26 samples with the histopathology diagnosis of CHM, we detected cytotrophoblastic p57 staining in three samples, while 23 (88%) samples were devoid of p57 expression (Supplementary Figure 1), confirming the histopathological diagnosis of CHM in 23 samples. For validation of RNA-level findings, we conducted immunostaining of galectin-14 (gal-14), which is encoded by LGALS14, one of the genes most down-regulated in CHMs according to our RNA-Seq study. Average gal-14 immunoscores were 43% lower in CHMs than in gestational age-matched controls (1.47 ± 0.08 and 2.60 ± 0.06, respectively, p < 0.001). Additionally, the percentage of low-intensity staining (1+) was higher in molar than in control tissues (58% and 3%, respectively), while the percentage of high-intensity staining (3+) was

Validation of RNA-Seq Results at the Protein Level
First, we immunostained TMA slides for cyclin-dependent kinase inhibitor p57 (p57) expression to confirm the histopathology diagnosis of CHM at the molecular level. Out of 26 samples with the histopathology diagnosis of CHM, we detected cytotrophoblastic p57 staining in three samples, while 23 (88%) samples were devoid of p57 expression (Supplementary Figure S1), confirming the histopathological diagnosis of CHM in 23 samples. For validation of RNA-level findings, we conducted immunostaining of galectin-14 (gal-14), which is encoded by LGALS14, one of the genes most down-regulated in CHMs according to our RNA-Seq study. Average gal-14 immunoscores were 43% lower in CHMs than in gestational age-matched controls (1.47 ± 0.08 and 2.60 ± 0.06, respectively, p < 0.001). Additionally, the percentage of low-intensity staining (1+) was higher in molar than in control tissues (58% and 3%, respectively), while the percentage of high-intensity staining (3+) was lower in molar than in control tissues (6% and 61%, respectively), resulting in a significant difference in the distribution of gal-14 immunoscores (p < 0.001), consistent with RNA-Seq results for this locus ( Figure 5). lower in molar than in control tissues (6% and 61%, respectively), resulting in a significant difference in the distribution of gal-14 immunoscores (p < 0.001), consistent with RNA-Seq results for this locus ( Figure 5). CHMs (n = 23) are displayed on the left and right graphs, respectively. An unpaired t-test was used to compare mean immunoscores between control and CHM groups. Fisher's exact test was used to test the difference in frequency of gal-14 immunostaining between the two groups. *** p < 0.001. Cytotrophoblast, CTB; syncytiotrophoblast, STB; villous stroma, VS.

Principal Findings of This Study
High-dimensional transcriptomic analysis identified numerous distinctions between CHM and normal placenta, from which noteworthy patterns can be discerned. 1) the most highly expressed genes in first trimester normal placentas are those previously shown to have placenta-specific or predominantly placental expression; 2) in CHMs, overall gene expression levels are higher, while expression of placenta-specific transcripts are lower than in first trimester normal placentas; 3) the pathogenesis of CHMs involves the dysregulation of 27% of protein-coding genes expressed in both normal placentas and CHMs; 4) most DE genes (72%) in CHMs are up-regulated; 5) placental functions appear to be down-regulated in CHMs, since placenta-specific genes are enriched in DE (C) Gal-14 immunoscores (mean ± SEM) and proportion of staining intensities in control placentas (n = 29) and CHMs (n = 23) are displayed on the left and right graphs, respectively. An unpaired t-test was used to compare mean immunoscores between control and CHM groups. Fisher's exact test was used to test the difference in frequency of gal-14 immunostaining between the two groups. *** p < 0.001. Cytotrophoblast, CTB; syncytiotrophoblast, STB; villous stroma, VS.

Principal Findings of This Study
High-dimensional transcriptomic analysis identified numerous distinctions between CHM and normal placenta, from which noteworthy patterns can be discerned. (1) the most highly expressed genes in first trimester normal placentas are those previously shown to have placenta-specific or predominantly placental expression; (2) in CHMs, overall gene expression levels are higher, while expression of placenta-specific transcripts are lower than in first trimester normal placentas; (3) the pathogenesis of CHMs involves the dysregulation of 27% of protein-coding genes expressed in both normal placentas and CHMs; (4) most DE genes (72%) in CHMs are up-regulated; (5) placental functions appear to be down-regulated in CHMs, since placenta-specific genes are enriched in DE genes, and most are down-regulated; (6) epigenetic reprogramming of the zygotic DNA and parental imprinting is dysregulated in CHMs; and (7) immune pathways are activated in CHMs as immune-related genes are enriched in DE genes and 17 immune-related pathways are impacted, mostly up-regulated, among 38 dysregulated pathways.

Placental Gene Expression and Functions are Down-Regulated in CHMs
In accord with previous studies [17,23], here we found genes with placenta-specific or predominant placental expression among the most highly expressed genes-12 out of 20-in first trimester normal placentas. These genes are involved in unique placental functions, including hormones, hormone synthesizing machinery, proteases, and immune proteins, which are pivotal for placental development, growth, signaling, and maternal-fetal immune tolerance. In CHMs, however, in spite of the overall median gene expression level being~13% higher than in normal first trimester placentas, the overall median placenta-specific transcript level was 23% lower. This is also substantiated by the lower number of placenta-specific genes among the 20 most highly expressed transcripts in CHMs.
When investigating differential gene expression, we found 27% of the expressed protein-coding genes dysregulated in CHMs, of which 72% were up-regulated, underlining the generally higher gene expression levels in CHMs. Placenta-specific genes were enriched among DE genes, and there were much more down-regulated (79%) than up-regulated placenta-specific DE genes. In fact, the 20 most down-regulated genes included seven placenta-specific genes, while the 20 most up-regulated genes were devoid of these. Based on the functions of down-regulated placenta-specific genes, we can infer that placental growth and development, cell attachment, signaling, and immune defense are the most highly impacted processes in CHM. This was also supported by the classical pathway analysis, which revealed among the most impacted GO biological processes "cell adhesion", "signaling", "hormone activity", "growth factor activity", and "extracellular matrix structural constituent", while the most impacted KEGG pathways included "cell adhesion molecules", "neuroactive ligand-receptor interaction", and "ECM-receptor interaction".

Possible Causes of the Dysregulation of Placental Gene Expression in CHMs
Targeted gene expression studies on molar tissues and a recent meta-analysis of these studies showed that placenta-specific genes involved in villous trophoblast differentiation are also differentially expressed in molar tissues, and concluded that molar pathogenesis is, indeed, rooted in problems with trophoblast differentiation [16]. However, these studies were focusing on just a limited set of placenta-specific transcripts known to be differentially expressed during trophoblast differentiation, thus did not give a comprehensive insight into this important question. We recently discovered that abnormal villous trophoblast differentiation is at the root of the pathogenesis of early-onset preeclampsia, also reflected by profound placental dysregulation of placenta-specific genes [17,19]. Here we applied a similar non-targeted approach to investigate this issue. First, we intersected the list of DE genes in CHMs with the list of DE genes during villous trophoblast differentiation from our parallel study [19] and found only minimal enrichment for trophoblast differentiation genes in dysregulated genes in CHMs, much less than for placenta-specific genes. This suggests that trophoblast differentiation is moderately affected in CHMs, while the expression of placenta-specific genes is more extensively impacted. Among the affected placenta-specific genes we validated the down-regulation of LGALS14 at the protein level. Since histopathological evidence shows the generally two-layered structure of villous trophoblasts in CHMs as in normal first trimester placentas, in spite of the histological and molecular evidence of locally more proliferative and or immortalized trophoblasts in CHMs [24][25][26][27][28][29][30][31], the histopathology of CHMs also underlines the generally moderate problem with villous trophoblast differentiation in CHMs.
A more compelling explanation would be that there is a problem with placental gene regulation due to the changes in DNA methylation and imprinting [32][33][34] since CHMs only contain paternal but not maternal genomes [35][36][37]. To test this hypothesis, we intersected the list of DE genes with the list of imprinted genes obtained from the GeneImprint database, and we found a significant enrichment of imprinted genes, underlining the link between CHM pathogenesis and affected paternal imprinting. In addition, we detected the down-regulation of the de novo DNA methyltransferase DNMT3A in CHMs, similar to the down-regulation of DNMT3A in absent fetal development, which is also the characteristics of Dnmt3a or Dnmt3b knockout mice [38]. Moreover, TET3, which is enriched specifically in the male pronucleus and responsible for paternal-genome conversion of 5mC into 5hmC [21], was two-fold up-regulated in CHMs. This is in good accordance with the paternal origin of all chromosomes in CHMs. Furthermore, four histone demethylases, which are active in the regulation of chromatin remodeling and gene expression and in influencing cellular differentiation, development, tumorigenesis, and inflammatory diseases [22,39], were found to be up-regulated in CHMs. These findings suggest that the epigenetic reprogramming of the zygotic DNA and parental imprinting are dysregulated in CHMs.

Immune Functions are Widely Dysregulated in CHMs
The pathogenesis of familial, biparental hydatidiform moles is caused by the inactivating mutations of NLRP7 and KHDC3L [40][41][42], genes involved in imprinting and inflammation. Inactivating mutations in NLRP7 inhibit cytokine (IL-1, interleukin-1; TNF, tumor necrosis factor) secretion by interfering with their trafficking, resulting in an in utero environment tolerogenic for the growth of these complete paternal allografts [41]. In this context, it is important that we found a wide dysregulation of immune pathways at the maternal-fetal interface in CHMs, which may also be at the heart of disease pathogenesis. To understand how this would be possible, we need to look first at immune processes in normal pregnancies. In healthy pregnant women, maternal-fetal immune tolerance mechanisms are complex and dynamic since embryo implantation in the first trimester is a pro-inflammatory process at the maternal-fetal interface, while the second trimester brings an anti-inflammatory milieu for the growing fetus in the womb, and the initiation of parturition at the end of pregnancy is characterized by a transition back towards a pro-inflammatory state [43][44][45][46]. These dynamic alterations in immune states are driven by the changing placental expression of molecules that regulate maternal immune responses [47][48][49][50][51][52][53][54][55][56][57][58]. Importantly, the dysregulated expression of immunoregulatory molecules at the maternal-fetal interface and the consequent disturbances in maternal-fetal immune regulation are strongly linked with alterations in the invasiveness of the trophoblast [59,60] and the development of severe pregnancy complications, such as miscarriages [61][62][63][64][65], preterm labor [66][67][68][69][70][71][72], or preeclampsia [73][74][75][76][77][78][79].
Herein, we detected the enrichment of genes involved in immune-related functions among DE genes in CHMs, and there were much more up-regulated than down-regulated immune-related genes. Of note, we noticed that IL1A and IL33 were down-regulated while FOXP3 (forkhead box P3), IL7, IL10, LIF (leukemia inhibitory factor), and TGFB1 were up-regulated, which could theoretically induce a tolerogenic environment, similarly to familial hydatidiform moles. Indeed, previous studies described the infiltration of regulatory T cells, CD3+ T cells, and NK cells in CHMs at the implantation site [80][81][82]. This is in line with the finding herein on the up-regulation of chemokines and the activation of "cytokine-cytokine receptor interaction" and "cell adhesion molecules" pathways as well as biological processes such as "cell adhesion", "biological adhesion", and "signaling" needed for immune cell influx and immune responses.
On the other hand, we also detected the down-regulation of LGALS13, LGALS14, the up-regulation of various HLA-I (A, B, C, E, F, G) and HLA-II (DMA, DMB, DPB1, DRA) genes, complement pathway genes (e.g., C1S, C3, C5, CR1) and interleukins (e.g., IL6, IL8/CXCL8, IL15, IL16, IL17D), and the overall involvement of 17 immune-related pathways among 38 dysregulated pathways. Of note, eight out of 10 (e.g., CD14, CD36, FCGRT, neonatal Fc receptor; LYVE1, lymphatic vessel endothelial hyaluronan receptor 1) Hofbauer cell marker genes [83] were down-regulated in CHMs in our study in line with the relative lack of Hofbauer cells in CHM villi [84,85]. These findings suggest an enhanced maternal, over fetal, antigen-presenting and pro-inflammatory environment at the implantation site, which is also supported by the activation of the "antigen processing and presentation", "phagosome", "complement and coagulation cascades", "allograft rejection", and "graft-versus-host disease" pathways. This is in line with the fact that CHMs express only paternal antigens and thus represent complete allografts that could stimulate a vigorous alloimmune response from the maternal host [85], including local synthesis of complement C3 and complement activation [86]. Not only paternal alloantigens but also increased necrosis and apoptosis, due to the higher trophoblastic cell proliferation and turnover rate in CHMs [87,88], may trigger the production of several pro-inflammatory cytokines [89] and the activation of the complement system [89,90].
Based on all of these findings, we suggest that there is a complex immune dysregulation at the implantation site in CHMs, which include the parallel induction of anti-and pro-inflammatory processes due to the presentation of excess paternal antigens, aponecrotic trophoblastic debris, and the dysregulation of upstream transcription factors and other gene regulatory mechanisms. These are probably the consequences of abnormal paternal genome dosage and imprinting, resulting in impaired gene expression and placental development. To delineate the exact sequence of events, one would need to collect multiple samples from the implantation site in a time-series from patients and controls, and then to perform the extensive characterization of the localization and expression of RNA and protein signals. However, this type of sample collection is not possible due to ethical and technical reasons in spite of the fact that cellular and molecular characterization of such time-series materials would now be possible using, for example, RNA-Seq after laser capture microdissection or single-cell RNA-Seq as well as mass spectrometry-based tissue imaging. However, these techniques were not available for our current study.

Strengths and Limitations of the Study
According to our knowledge, this is the first non-hypothesis testing study that utilized various tools of high-dimensional biology to gain in-depth insights into the pathogenesis of CHMs. The strengths of the study are as follows: (1) strict clinical definitions and homogenous patient groups; (2) standardized, quick placental sample collection during surgeries and pregnancy terminations; (3) standardized histopathological examination of molar pregnancies and placentas based on international criteria; (4) high-quality sample processing and state-of-the-art high-dimensional methods for RNA and protein level analysis; (5) parallel expression profiling of various proteins on large tissue sets with tissue microarray and immunostaining followed by semi-quantitative immunoscoring and statistical analysis; and (6) the use of leading bioinformatics tools for RNA-Seq, gene ontology, and pathway analyses.
Limitations of the study are as follows: (1) the relatively modest number of cases due to the strict clinical and histopathological inclusion criteria used for patient enrollment. Paradoxically, this was also one of the most important strengths of our study; (2) data for cases and controls were retrieved from various sources in the RNA-Seq discovery study; (3) TMAs did not allow us the cellular compositional analysis of the investigated tissues due to the small core diameter size; and (4) it is possible that the differences in gene expression levels may originate from differential gene expression of particular cell types, differential cellular composition of CHMs and normal placentas, or the surgical/methodological differences in tissue sampling during elective terminations or CHM surgeries.

Concluding Remarks
In conclusion, our data shows that placental functions are down-regulated, expression of imprinted genes is altered, and immune pathways are activated in CHMs. Taken together, the results indicate that complex dysregulation of placental developmental and immune processes are pivotal for the pathogenesis of CHMs, providing new biological insight likely to inform molecular translational research addressing multiple stages of the natural history of CHM, such as assays for improved detection and refined diagnosis in early pregnancy [91], risk stratification with respect to malignant potential [92], and differential diagnosis of unexplained pregnancy loss [93]. . Eligible subjects were identified upon presentation to Los Angeles County (LAC) and USC Medical Center. CHMs were identified clinically by a combination of ultrasonographic features ("snowstorm" or "cluster of grapes" appearance) with elevated blood hCG levels. Molar tissues were obtained between 8 2/7 and 14 0/7 weeks of gestational age by therapeutic surgical dilation and curettage. Confirmation of CHM diagnosis was determined by the histopathologic examination of surgical material. One subject who had CHM on the basis of clinical features was subsequently reclassified as having choriocarcinoma after pathological examination and was excluded from the study. Another subject with CHM was excluded due to pooling of several samples from various curettages into one sample, leaving the analyzed number of CHMs at four. Gestational age-matched control chorionic villous tissues were obtained at the Reproductive Options Clinic at LAC + USC Medical Center from surgical elective pregnancy terminations. To increase control group size, we also obtained RNA-Seq data of first trimester control chorionic villous tissues from a published study [23] (Table 7).

Subjects in the Tissue Microarray Validation Study
GTD tissue samples had been collected during usual care of women who underwent treatment for GTD at the Department of Obstetrics and Gynecology, Keck School of Medicine, USC (Los Angeles, CA, USA). After approvals (protocols HS-14-00808 and HS-15-00720 for IRB approval, Lab agreement: 16-03-03) were obtained from the USC IRB, the institutional pathology database at the LAC Medical Center, Keck Medical Center of USC, and Norris Comprehensive Cancer Center was utilized to identify GTD cases by searching for the keywords "gestational trophoblastic disease", "molar pregnancy", "hydatidiform mole", "complete mole", and "invasive mole" between 1 January 2000 and 11 September 2017. Inclusion criteria included GTD cases from whom archived histopathology specimens were available. GTD cases without salient clinical information were excluded.
For eligible GTD cases, data describing clinical and demographic features, tumor characteristics and markers, treatment course, and survival outcomes were abstracted from the medical record. Clinical and demographic features included patient age at diagnosis of GTD, ethnicity, pregnancy history including interval months from the last menstrual period of index pregnancy, body mass index, and medico-surgical history. Tumor characteristics and markers included the World Health Organization (WHO) score, histology subtype, tumor size, pretreatment beta-hCG, metastatic sites, and number. Treatment course included surgical intervention (dilation and curettage, or hysterectomy) and chemotherapy (type, cycle number, and response). Survival outcomes included progression-free survival (PFS) and overall survival. Progression-free survival was determined as the time interval between treatment initiation for GTD and the date of first recurrence or last follow-up date if there was no recurrence. Overall survival was determined as the time interval between treatment initiation for GTD and the date of death related to GTD or last follow-up date if the patient was alive.
Of 311 cases that were initially identified, 44 CHM cases were chosen for TMA based on tissue availability (Tables 8 and 9). We placed no restriction on the age of gestation and found pregnancies to be dated according to ultrasound scans between 5 and 15 weeks. Patients with multiple pregnancies were excluded, and specimens and data were de-identified for use in research according to procedures approved by the Institutional Review Boards of USC. From the 44 GTD tissues identified between 4 2/7 and 36 1/7 weeks of gestational age, the TMA project included 26 samples obtained from therapeutic surgical dilation and curettage within the first trimester (13 6/7 weeks) and having good tissue quality. The 26 samples were obtained from 23 patients. One patient provided two specimens at the same sampling time, while another patient provided three specimens at different time points. Control samples of first trimester placental tissue (n = 29) matched by gestational age to GTD samples were collected prospectively at the Maternity Private Clinic (Budapest, Hungary) and deposited into the Perinatal Biobank of the Research Centre for Natural Sciences, Hungarian Academy of Sciences (Budapest, Hungary). Informed consent for use of the material in research was obtained from women prior to sample collection, and specimens and data were stored anonymously. The research was approved by the Health Science Board of Hungary (TUKEB 4834-0/2011-1018EKU), and all use of tissue and data in this research conformed to the principles set out in the World Medical Association (WMA) Declaration of Helsinki. Placentas were collected from pregnancies voluntarily terminated by surgical dilation and curettage between 5 and 14 weeks of gestation according to ultrasound scans, excluding multiple pregnancies (Table 8).

Sample Collection and Preparation for RNA-Seq
Tissue samples were rinsed with sterile saline and examined under a lightbox for content confirmation and selection of molar tissue or chorionic villi. Samples were then stored in RNAlater RNA stabilization reagent (approximately 10 µL reagent per 1mg of tissue, Qiagen, Germantown, MD, USA) at 37 • C for a maximum of 24 h until collected for dissection. Samples were then dissected intõ 1 g tissue aliquots and stored at −80 • C. Stored samples were then thawed at room temperature and homogenized with electric homogenizer. Total RNA was isolated using the protocol provided with the Qiagen RNeasy Mini Kit. RNA quantity was measured by a Nanodrop ND-8000 analyzer (Thermo Scientific, Waltham, MA, USA). Samples were then stored at −80 • C until used.

RNA Sequencing
Library preparation and paired-end sequencing were performed by the USC Epigenome Core Laboratory. Double-stranded cDNA fragments were synthesized from mRNA, ligated with adapters, and size-selected for library construction according to Illumina protocol [94]. RNA sequencing was carried out on Illumina Genome Analyzer (Illumina, San Diego, CA, USA).

Histopathological Examinations
All slides cut from formalin-fixed, paraffin-embedded (FFPE) molar tissue blocks were retrieved from our USC + LAC histology archives. The slides were reviewed for confirmation of the diagnosis and regions were identified for TMA construction. The blocks from the matching slides were retrieved and submitted for TMA construction.
Samples of tissue from first trimester placentas had been fixed in 10% neutral-buffered formalin and then embedded in paraffin. Five µm sections were cut from tissue blocks, stained with hematoxylin and eosin (H&E), and examined using light microscopy. A perinatal pathologist blinded to patients' clinical information, except for the gestational age, examined the histopathology of placental samples using standard perinatal pathological protocol and previously-published diagnostic criteria [95][96][97], and identified regions of each tissue block from which to sample cores.

Tissue Microarray Construction
A material transfer agreement between the Hungarian Academy of Sciences and USC (No: 1996/2017) enabled the construction of TMAs containing tissues collected at USC and at Maternity Private Clinic and deposited into the Perinatal Biobank. Four TMAs were constructed, each containing three cylindrical cores two mm in diameter from each sample of first trimester FFPE placental (n = 29) or GTD (n = 26) tissue. Cores from the same sample were transferred into recipient paraffin blocks adjacent to each other using an automated tissue arrayer (TMA Master II, 3DHISTECH, Budapest, Hungary). Each paraffin block contained three cores from all tissues, with liver included as negative control and third trimester placenta as positive control material.

Immunohistochemistry and Immunoscoring
To validate the histological diagnosis of CHM, we conducted immunostaining for p57, which is expressed from a paternally imprinted gene and is a recognized diagnostic marker in CHM [12,98,99]. To validate RNA level findings at the protein level for one down-regulated gene, five-µm-thick sections were cut from each TMA onto adhesive glass slides (SuperFrost Ultra Plus, TS Labor, Budapest, Hungary), and immunostained for p57 and galectin-14 using antibodies and conditions listed in Table 10. Briefly, sections were dewaxed using xylene and rehydrated in graded alcohol series. For p57 immunostaining, endogen peroxidases were blocked using 10% H 2 O 2 in methanol for 20 min, then antigen retrieval was performed in Tris-EDTA pH 9.0 buffer for 20 min at 96 • C. For galectin-14 immunostaining, endogen peroxidases were blocked using 1% H 2 O 2 in methanol for 20 min, then antigen retrieval was performed in Tris-EDTA pH 9.0 buffer for 32 min at 100 • C. In all staining procedure, after 10 min Novolink protein blocking (Leica-Novocastra, Wetzlar, Germany), the sections were incubated at room temperature with antibodies and then with reagents of the Novolink Polymer Detection System (Leica-Novocastra). Between incubation steps, the sections were washed trice for 5 min in Tris-buffered saline (pH 7.4). Finally, sections were counterstained with hematoxylin, and these were mounted (DPX Mountant; Sigma-Aldrich, St. Louis, MO, USA) after dehydration.
Immunostained TMA images were semi-quantitatively scored by two examiners blinded to the clinical information with an immunoreactive score modified from one previously used in our studies [18,65,74,[100][101][102]. Trophoblastic immunostaining intensity was graded as 0 = negative, 1 = weak, 2 = intermediate, and 3 = strong. For each specimen, all villi and trophoblastic tissues in a random field of each of the cores were evaluated by both examiners using Panoramic Viewer 1.15.4 (3DHISTECH Ltd., Budapest, Hungary).
RNA-Seq count data for additional late first trimester control placentas (22 XY and 17 XX) were downloaded from GEO (AccNo: GSE109082) [23]. Differential gene expression analysis was performed using the R package DESeq2 [106,107].
Because data included differing amounts of non-coding RNA leading to high numbers of DE genes presumed to be spurious, we restricted the differential gene expression analysis to 19,690 protein-coding genes based on the ENSEMBL "biotype" annotation. We further excluded genes with zero read count in any sample, resulting in a final analytic set of data on 14,022 genes. As a set of DE transcripts, we selected those with a fold change of ≥2 between CHM and control, while holding the false discovery rate (pFDR) to <0.05.
To estimate absolute levels of expression, we normalized RNA-Seq count data by the geometrical mean of measured counts of four housekeeping genes (TBP, TATA-box binding protein; CYC1, cytochrome C1; UBC, ubiquitin C; TOP1, topoisomerase (DNA) I), and calculated the mean expression both in CHM and control groups of samples.

Pathway Analysis
To identify significantly impacted pathways, biological processes, and molecular functions, we compared expression between CHM and control groups using Advaita Bio's iPathwayGuide (https://www.advaitabio.com/ipathwayguide). This software analysis tool implements the "impact analysis" approach that takes into consideration the direction and type of all signals on a pathway and the position, role and type of every gene, as described in [108].

Enrichment Analyses
We used Fisher's exact test to test enrichment of the DE transcript set with genes previously demonstrated to have expression patterns of particular relevance: (1) genes with placenta-specific expression (n = 164, Supplementary Table S2) [17]; (2) genes involved in villous trophoblast differentiation (n = 1,937; GEO: GSE130339, the list was provided by authors of an unpublished work [19]); (3) genes imprinted in humans (n = 209; imprinted gene databases, www.geneimprint.com, Supplementary Table S3 and 4 genes related to immune processes (n = 1,449; www.innatedb.com; the final list is the manual curation and combination of Immport, Immunogenetic Related Information Source, and Immunome Database Gene Lists, Supplementary Table S4).

Analysis of TMA Data
For each core, immunostain was scored by each of two examiners using a scale of 0 to 3, 0 representing none and 3 strong immunostaining. Quantitative immunoscores were determined as follows. For each core, immunoscores from examiners were first averaged to represent that core, and resulting values for each of the three cores representing one sample were averaged to represent that sample. Mean immunoscores between control and CHM groups were compared using the unpaired t-test. The Fisher's exact test was performed to test the distribution of gal-14 immunoscores between the control and CHM groups. Results were considered statistically significant at * p < 0.05, ** p < 0.01, and *** p < 0.001.

Data Availability
USC RNA-Seq data was deposited to the Gene Expression Omnibus (GEO) database according to the MIAME guidelines (accession number: GSE138250).