Construction of Copy Number Variation Map Identifies Small Regions of Overlap and Candidate Genes for Atypical Female Genitalia Development

Abstract: Copy number variations (CNVs) have been implicated in various conditions of differences of sexual development (DSD). Generally, larger genomic aberrations are more often considered disease-causing or clinically relevant, but over time, smaller CNVs have been associated with various forms of DSD. The main objective of this study is to identify small CNVs and the smallest regions of overlap (SROs) in patients with atypical female genitalia (AFG) and build a CNV map of AFG. We queried the DECIPHER database for recurrent duplications and/or deletions detected across the genome of AFG individuals. From these data, we constructed a chromosome map consisting of SROs and investigated such regions for genes that may be associated with the development of atypical female genitalia. Our study identified 180 unique SROs (7.95 kb to 45.34 Mb) distributed among 22 chromosomes. The most SROs were found in chromosomes X, 17, 11, and 22. None were found in chromosome 3. From these SROs, we identified 22 genes as potential candidates. Although none of these genes are currently associated with AFG, a literature review indicated that almost half were potentially involved in the development and/or function of the reproductive system, and only one gene was associated with a disorder that reported an individual patient with ambiguous genitalia. Our data regarding novel SROs require further functional investigation to determine the role of the identified candidate genes in the development of atypical female genitalia, and this paper should serve as a catalyst for downstream molecular studies that may eventually affect the genetic counseling, diagnosis, and management of these DSD patients.


Introduction
The development of female genital structures is controlled by a set of intricate biological processes. Disruptions at the genetic, endocrine, structural, and/or environmental level can lead to atypical or differences of sexual development (DSD), conditions associated with malformation and/or dysfunction of the reproductive system.
At the genetic level, development of the internal female reproductive tract depends on three finely regulated processes: development of the Müllerian ducts, regression of the Wolffian ducts, and differentiation of the Müllerian ducts [1][2][3]. The reproductive tract is initially undifferentiated in the form of the Wolffian duct, which is derived from the mesonephros of the urinary tract and present in all embryos [4]. Expression of genes such as WNT4 induces formation of the Müllerian duct, which uses the preexisting Wolffian duct as a scaffold [2]. The presence of a Y chromosome, specifically the SRY gene, determines whether the primitive gonads differentiate into ovaries or testes. Without SRY, no anti-Müllerian hormone (AMH) or testosterone is produced, resulting in the maintenance of the Müllerian ducts and regression of the Wolffian ducts [2][3][4][5]. Maturation of the Müllerian ducts further depends on additional genetic pathways which result in development of the uterus, oviducts, cervix, and upper third of the vagina. Improper fusion of the Müllerian ducts or a failure of primitive tissue regression can lead to atypical structures such as bicornuate uterus or uterine or vaginal septum [1,4]. Development of the external female genitalia is less understood; however, TP63 has been identified as a major genetic player in the differentiation of urogenital mucosa and septation of the cloaca [6].
Despite tight genetic regulation, environmental factors can throw female reproductive tract development into disarray. Endocrine disrupting chemicals, which mimic or antagonize estrogens and androgens, are found throughout the environment, from nature and wildlife to commerce and laboratories. These xenoestrogens can affect fetal development in utero and even during pre-conception. For example, in the mid-1900s, physicians prescribed diethylstilbestrol (DES) to prevent miscarriage; it was later discovered that females exposed to DES in utero developed "T-shaped" uteruses, which in turn predisposed them to worse pregnancy outcomes [2,7]. Furthermore, changes at the epigenetic level (e.g., DNA methylation, histone acetylation, etc.) can affect gene expression pre- and postnatally, leading to atypical reproductive structures and function [8].
While the rate of molecular diagnosis is higher in 46, XX DSD patients compared to 46, XY DSD patients (64% vs. 12% respectively), the majority, if not all, of those are diagnosed with CAH [11,15,16]. The remaining 46, XX DSD patients often do not know the genetic etiology of their conditions, and historically, such patients have been diagnosed using phenotypic presentation. Indeed, geneticists have begun focusing on patient comorbidities and malformative syndromes with documented DSD for clues into the genetic origin of atypical genital development [17,18].
Copy Number Variations. Chromosomal microarray analysis (CMA) has been utilized to uncover copy number variants (CNVs) that are potentially significant in the manifestation of DSD. For example, previous studies have shown that duplications of SOX9 [19,20] and NR0B1 [21,22] are relevant in the development of 46, XX and 46, XY DSD, respectively. Throughout the years, an abundance of CNV data has been published and stored in various public databases such as DECIPHER [23]. Utilization of CNV data from these resources allowed for the characterization and delineation of the smallest regions of overlap (SROs) that often result in the identification of candidate genes that may have clinical relevance.
In this retrospective study, we queried the DECIPHER database for genomic regions containing recurrent duplications and/or deletions in patients with atypical development of female external and internal genital structures. From these data, we constructed a whole genome map of SROs and further investigated such regions for candidate genes that may be associated with various forms of atypical female genitalia. We also expanded our phenotype analysis to include other non-genital anomalies that may be comorbid with atypical female sex development.
To the best of our knowledge, this is the first study to create a whole genome CNV map of atypical female genitalia and to identify corresponding candidate genes. Data from this study can be utilized for downstream research, and findings from all above-mentioned efforts may provide a better understanding of the genetic etiology of atypical female genitalia, facilitate diagnosis and genetic counseling, and improve patient management. Lastly, our data will further contribute to the development of a whole genome map for DSD overall.

Data Source and Collection
We searched DECIPHER's open-access database (accessed 16 June 2020) to identify patients with documented "abnormality of the female genitalia" (hereafter referred to as atypical female genitalia when possible). Additionally, we categorized the phenotypes by whether they affected internal genitalia, external genitalia, or both. Some patients were categorized as "male" because they exhibited phenotypes such as hypospadias, cryptorchidism, and male "pseudohermaphroditism" (Supplementary Table S1).

Delineation of SROs
Using the chromosome map function "Browser" in DECIPHER, recurrent and overlapping copy number variable regions across the entire genome were identified. Within each CNV cluster (≥2 CNVs), an SRO map was created. Singular CNVs were disregarded. SROs were further classified as unabridged or extrapolated (Supplementary Figure S1). Unabridged SROs are defined by one patient's full CNV and are completely contained within the confines of CNVs from all other patients in a given interval. Extrapolated SROs are not defined by one complete CNV; instead, CNVs from two individual patients establish the boundaries of the SRO (i.e., one CNV establishes the beginning of the SRO while another establishes the end).
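The unabridged/extrapolated distinction can be illustrated with a minimal Python sketch. This is not the procedure used in the study, which relied on manual inspection in the DECIPHER browser; it assumes that all CNVs passed in belong to one interval on the same chromosome, and the `CNV` and `SRO` types, the patient labels, and the coordinates are hypothetical.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class CNV(NamedTuple):
    patient: str  # hypothetical patient label
    start: int    # start coordinate in bp
    end: int      # end coordinate in bp


class SRO(NamedTuple):
    start: int
    end: int
    kind: str                    # "unabridged" or "extrapolated"
    defined_by: Tuple[str, ...]  # patient(s) whose CNV(s) set the boundaries


def smallest_region_of_overlap(cnvs: Sequence[CNV]) -> Optional[SRO]:
    """Return the region shared by all CNVs in one interval, or None."""
    if len(cnvs) < 2:
        return None  # singular CNVs are disregarded
    start_cnv = max(cnvs, key=lambda c: c.start)  # latest start
    end_cnv = min(cnvs, key=lambda c: c.end)      # earliest end
    start, end = start_cnv.start, end_cnv.end
    if start > end:
        return None  # the CNVs do not share a common region
    # Unabridged: a single patient's full CNV coincides with the SRO.
    for c in cnvs:
        if c.start == start and c.end == end:
            return SRO(start, end, "unabridged", (c.patient,))
    # Extrapolated: one CNV sets the beginning, another sets the end.
    return SRO(start, end, "extrapolated", (start_cnv.patient, end_cnv.patient))
```

For example, three CNVs spanning 100-500, 200-300, and 150-400 would give an unabridged SRO at 200-300, whereas two CNVs spanning 100-300 and 200-500 would give an extrapolated SRO at 200-300 bounded by two different patients.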
Locus and size of each SRO were further determined using the UCSC Genome Browser (https://genome.ucsc.edu/; accessed on 16 June 2020) [24]. Chromosomal locations for each SRO were manually entered into Chromosome Analysis Suite 4.1 (ChAS; Affymetrix, Inc., Santa Clara, CA, USA) to generate the SRO CNV map. Gene content was determined by accessing each patient profile and their respective gene list. For extrapolated SROs, we obtained gene content by organizing genes by location and manually counting those, including non-protein coding, within the boundaries of the SRO. Based on DECIPHER data, genes were categorized as candidates for atypical female genitalia if they displayed a high likelihood of haploinsufficiency (%HI; 0.00-10.00) and/or the probability of loss-of-function (LoF) intolerance (pLI; 0.90-1.00).
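The dosage-sensitivity criterion above can be sketched as a simple filter. The thresholds (%HI between 0.00 and 10.00, pLI between 0.90 and 1.00, combined with "and/or") come from the text; the `Gene` type, the placeholder gene symbols, and the handling of missing scores as "criterion not met" are assumptions made for illustration only.

```python
from typing import NamedTuple, Optional


class Gene(NamedTuple):
    symbol: str                  # placeholder symbol, not a real gene
    hi_percent: Optional[float]  # %HI score; None if not reported
    pli: Optional[float]         # pLI score; None if not reported


def is_candidate(gene: Gene, hi_max: float = 10.0, pli_min: float = 0.90) -> bool:
    """True if the gene meets the %HI and/or the pLI threshold."""
    hi_ok = gene.hi_percent is not None and 0.0 <= gene.hi_percent <= hi_max
    pli_ok = gene.pli is not None and pli_min <= gene.pli <= 1.0
    return hi_ok or pli_ok


# Illustrative values only; these are not scores from DECIPHER.
genes = [
    Gene("GENE_A", 5.0, 0.20),    # passes on %HI
    Gene("GENE_B", 50.0, 0.95),   # passes on pLI
    Gene("GENE_C", 50.0, 0.10),   # passes on neither
    Gene("GENE_D", None, None),   # scores missing, so it cannot be selected
]
candidates = [g.symbol for g in genes if is_candidate(g)]
```

Treating a missing score as a failed criterion means that genes without %HI or pLI data are silently excluded, which is one way a selection of this kind can miss potential candidates.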

Definition and Identification of Candidate Genes
In this study, we defined a candidate gene as one that has yet to undergo functional studies that prove causality between it and the development of AFG. The initial list of candidates contained hundreds of genes. To further narrow our list of candidate genes, we focused on small SROs (≤500 kb) that contained five or fewer haploinsufficient and/or LoF intolerant genes. We investigated whether these genes had previously been associated with DSD or the development/function of the reproductive system. PubMed was searched to identify studies that had associated the candidate genes with atypical development of the female genitalia. We searched using the following terms: (gene name) and (disorders of sexual development OR DSD OR intersex). A similar approach was used with Google. To confirm each gene's novelty status, we accessed their individual OMIM phenotypes through their respective DECIPHER profiles. Candidate regions are SROs that contain no transcribed genes and may hold regulatory sequences, such as promoters, enhancers, silencers, etc.
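As a rough sketch of the narrowing step and the literature search, the following Python applies the two limits stated above (SRO size of at most 500 kb and five or fewer haploinsufficient and/or LoF intolerant genes) and assembles the search string. The dictionary layout, the SRO labels, and the gene symbols are hypothetical, 1 kb is taken as 1000 bp, and the uppercase `AND` is an assumption about how the terms were joined.

```python
from typing import Dict, List


def shortlist_sros(sros: List[Dict], max_size_bp: int = 500_000,
                   max_genes: int = 5) -> List[Dict]:
    """Keep SROs that are small and carry few dosage-sensitive genes."""
    return [
        s for s in sros
        if s["size_bp"] <= max_size_bp and len(s["candidate_genes"]) <= max_genes
    ]


def literature_query(gene_symbol: str) -> str:
    """Build the search string for one candidate gene."""
    return f"({gene_symbol}) AND (disorders of sexual development OR DSD OR intersex)"


# Hypothetical SRO records used only to exercise the functions.
sros = [
    {"name": "SRO_example_1", "size_bp": 120_000, "candidate_genes": ["GENE_A"]},
    {"name": "SRO_example_2", "size_bp": 2_300_000, "candidate_genes": ["GENE_B"]},
    {"name": "SRO_example_3", "size_bp": 400_000,
     "candidate_genes": ["G1", "G2", "G3", "G4", "G5", "G6"]},
]
kept = shortlist_sros(sros)
queries = [literature_query(g) for s in kept for g in s["candidate_genes"]]
```

Read literally, "five or fewer" also admits SROs with no dosage-sensitive genes, so this sketch keeps them; a stricter filter would additionally require at least one such gene.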

Comorbidities
Within SROs, similarities in patients' DSD phenotypes were assessed. Phenotypes were classified based on the structure/organ affected. Certain phenotypes, such as uterine neoplasm, primary amenorrhea, etc., received individual categories. The proportion of patients that exhibited the same phenotypes of abnormal female genitalia as the reference patient, whose CNV determined the SRO, was noted. For extrapolated SROs, of the two CNVs defining its boundaries, the patient who had the fewest number of overall phenotypes was selected as the reference patient. DSD phenotypes not exhibited by the reference patient but exhibited by non-reference patients were also recorded. An analogous process was used for non-DSD comorbidities which were listed in order of prevalence. Atypical female genitalia phenotypes were then grouped by affected region and assessed for recurring comorbidities.
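The comparison against a reference patient can be sketched as follows. Several details are assumptions rather than the authors' stated method: "exhibiting the same phenotypes" is read here as sharing at least one DSD phenotype with the reference patient, the reference patient is excluded from the denominator, and all function names and patient labels are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def pick_reference(boundary_patients: List[str],
                   all_phenotypes: Dict[str, Set[str]]) -> str:
    """For an extrapolated SRO, choose the boundary patient with the fewest phenotypes."""
    return min(boundary_patients, key=lambda p: len(all_phenotypes[p]))


def compare_to_reference(reference: str,
                         dsd: Dict[str, Set[str]]) -> Tuple[float, Set[str]]:
    """Return the share of other patients overlapping the reference, plus extra phenotypes."""
    ref_set = dsd[reference]
    others = {p: s for p, s in dsd.items() if p != reference}
    sharing = [p for p, s in others.items() if s & ref_set]
    proportion = len(sharing) / len(others) if others else 0.0
    # DSD phenotypes seen in non-reference patients but not in the reference.
    extra = set().union(*others.values()) - ref_set
    return proportion, extra


def rank_comorbidities(non_dsd: Dict[str, Set[str]]) -> List[Tuple[str, int]]:
    """List non-DSD comorbidities from most to least prevalent."""
    counts = Counter(ph for phenotypes in non_dsd.values() for ph in phenotypes)
    return counts.most_common()
```

With three patients in an SRO, where the reference has an abnormality of the uterus, a second patient has abnormalities of the uterus and labia, and a third has an abnormality of the vagina, the sketch reports a proportion of 0.5 and records the labial and vaginal abnormalities as additional phenotypes.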

Results
Genotypic and Phenotypic Characterization. We identified 300 patients (access date: 16 June 2020) with a spectrum of phenotypic findings including, but not limited to, aplasia/hypoplasia of reproductive structures, endocrine dysfunction, neoplasms, and cysts (Supplementary Table S1). There were 191 patients who exhibited only one DSD phenotype, while 108 exhibited two or more. One patient did not have any apparent DSD phenotypes. Additionally, 137 patients had DSD phenotypes that solely affected the internal genitalia, while 94 patients had DSD phenotypes that only affected the external genitalia (Figure 1). Both internal and external genitalia were atypical in 25 patients. Very few patients (n = 5) exhibited atypical male phenotypes. Thirty-nine patients were classified as unspecified due to ambiguity in their reported phenotypes. Most of the patients (n = 264) who exhibited atypical female genitalia had a 46, XX sex chromosome complement, while 23 individuals were 46, XY (Figure 1). The remaining 13 did not have a known sex (n = 9) or were classified as "other" (n = 4). Patients exhibited a variety of phenotypes, with abnormality of the labia, abnormality of the uterus, and abnormality of the female genitalia being the most common (Figure 2).

Delineation of SROs
We identified 56 CNV clusters across the genome. Within those clusters, we identified 180 SROs throughout 22 pairs of chromosomes (Figure 3). The X chromosome contained the highest number of SROs (n = 35), while chromosomes 14 and Y had the fewest SROs, with one each (Figure 4). Chromosome 3 was the only chromosome that did not contain any SROs. The majority (64%) of SROs were extrapolated, while 65 were unabridged (Supplementary Figure S1). SROs tended to consist of a combination of deletions and duplications, with 64% containing both. Of the remaining SROs, 33% only contained deletions while a mere 3% solely contained duplications (Figure 5). SROs ranged from 7.95 kb to 45 Mb in size. Most SROs were larger than 1 Mb (70.00%), with only 10.56% (n = 19) being less than 250 kb (Figure 5).

Previously Described Genes and Regions
There were 25 SROs that overlapped with genes or regions that are known to cause or are associated with atypical female genitalia (Table 1). Only one region (CNV 22q11.21), containing six adjacent SROs, overlapped with candidate genes identified in our analysis.

Candidate Genes and Regions
Within the 180 SROs, 335 genes displayed a high probability of haploinsufficiency while 660 genes were considered extremely loss-of-function (LoF) intolerant (Table 2). Applying the SRO size and gene quantity limitations, we were left with 22 candidates to further investigate. Although none of the genes were widely associated with DSD, a literature review indicated that almost half were potentially involved in the development and/or function of the reproductive system (Table 3). Most of these genes did not exist in regions that were linked to DSD. As exceptions, GGNBP2, MAZ, SCARF2, MED15, UBE2L3, and MAPK1 overlapped with CNVs that are associated with MRKH syndrome types I and II [25,26]; however, none of these genes correlated to phenotypes/syndromes in Online Mendelian Inheritance in Man (OMIM). Additionally, six candidate genes had associated OMIM phenotypes, while the remaining 16 had no associated phenotypes. Of the six genes with OMIM phenotypes, only one was associated with a disorder that reported an individual patient with ambiguous genitalia [27].

Gene-Desert SROs
Four SROs did not contain any genes, including one that was considered for candidate genes (SRO087). They ranged from 29.28 kb to 684.32 kb. Seventy-five percent of these SROs were extrapolated and consisted of both deletions and duplications. The most common DSD phenotype seen in these patients was abnormality of the external genitalia (50%).

Comorbidities
The majority of our DSD patient population exhibited additional phenotypes unrelated to the reproductive system. Intellectual disability (47%), short stature (35%), micrognathia (27%), microcephaly (24%), and low-set ears (23%) were frequently encountered (Supplementary Table S2); however, certain female reproductive organs presented with unique comorbidities. Not infrequently, abnormality of the pinna, hypertelorism, atrial septal defect, short neck, and premature birth were seen with abnormalities of the uterus, while anal atresia and hypertelorism were common in patients with abnormalities of the vagina. Patients with abnormalities of the ovaries tended to be small for gestational age and have hypertelorism, while patients with abnormalities of the breasts and nipples tended to have muscular hypotonia. Comorbidities in patients with labial abnormalities and clitoral abnormalities aligned with those of the overall patient population, and there was only one patient with an abnormality of the fallopian tubes, so we could not determine the significance of the present comorbidities.

Discussion
Human sexual development requires a complicated synchronization of many biological elements that affect female reproductive structures and endocrine function. Dysregulation of a single gene or gene networks can lead to the atypical development and function of the reproductive system. Historically, clinicians relied on visible phenotypes to diagnose patients with 46, XX DSD. As advanced genomic technologies evolved, diagnoses of various forms of DSD have accelerated, resulting in a wealth of CNV data. Such data are used for further investigation of candidate genes and regions as well as downstream functional studies to elucidate clinical relevance.

DECIPHER
The database proved to be instrumental in the creation of our SRO map and our search for novel candidate genes. DECIPHER is unique in that persons and institutions can freely access its data and submit genetic information that is available for subsequent use. This quality makes it invaluable not only for research of DSD but also for research of all genetic variations. Additionally, DECIPHER's inclusion of sex chromosome information that is more expansive than "male" and "female" takes away one of the obstacles faced by DSD researchers while also being affirmative to patient sex identity [28].

Delineation of SROs
It has already been shown that small CNVs are potentially involved in the development of DSD [29][30][31]. Therefore, we focused on SROs less than 500 kb in size because they are often overlooked when assessing the genome for candidate genes. Although many SROs were large in size (>500 kb) and therefore outside our range of focus, these data can be used in future investigations. With further analysis, these larger CNVs can be partitioned into smaller regions that eventually pinpoint the etiology of various AFG.

Candidate Genes and Regions
Clinical medicine has often excluded small CNVs in the genomic analysis of DSD. Over the past decade, the identification of small deletions and duplications has improved diagnostics given that approximately 30% of all DSD patients cannot be classified as sex chromosome, 46, XY, or 46, XX DSD [32]. Furthermore, 19-21% of 46, XX individuals receive a molecular diagnosis [33,34]. Previous studies have shown that small CNV analysis can reveal genes/regions that are relevant in sex development [29,30]. For instance, a 5.2 kb region upstream of SOX9 was found to contain regulatory elements for sex development in both 46, XX and 46, XY individuals [31]. Our study further demonstrates the promise of small (<500 kb) CNVs when performing genetic testing for patients with atypical female genitalia. We would argue that size does not matter, but gene content does.
We identified 36 small SROs (<500 kb) that contained 22 candidate genes with %HI < 10% and pLI scores > 0.90 (Table 3). The majority of these genes did not overlap with genes or regions that were previously documented in cases of 46, XX DSD. Identification of these regions further endorses the importance of deletions and duplications in sexual development, and given the use of HI and pLI as predictive tools, our data can steer scientists towards relevant genes for functional studies. Some SROs (SRO102, SRO115, SROs135-141) overlapped with CNVs that have been identified as potentially pathogenic [25,26,35]; however, studies do not often address these novel genes, and the candidates have yet to be associated with DSD phenotypes in OMIM. For example, GGNBP2 is located within the region 17q12, which is commonly associated with MRKH types I and II [36][37][38], but remains to be further explored as a candidate gene despite its role in testes morphology and spermatogenesis [39]. Other dosage-sensitive genes housed within common CNVs have also yet to be thoroughly investigated. MAZ, located in region 16p11.2, is a dosage-sensitive transcription factor that is responsible, in part, for genitourinary (GU) development [40]. While one CNV study briefly mentions MAZ in its methodology [37], it exists in the periphery of the study's focus, and the literature is otherwise devoid of information on MAZ's role in GU development.
Within regions 22q11.21 and 16p11.2 exist two genes, MAPK1 and MAPK3, respectively, that have the potential to become more explicitly associated with DSD. MAPK1 and MAPK3 function as kinases downstream of MAP3K1, which has an established role in DSD. Increased phosphorylation of MAPK1 and MAPK3 due to gain-of-function mutations in MAP3K1 leads to upregulation of FOXL2 and FST [41,42], additional genes with roles in ovary formation [43][44][45]. It is also known that MAPK1/3 plays an important role in luteinizing hormone signal transduction during ovulation [46]. The dosage-sensitivity of MAPK1 and MAPK3 further supports evidence that CNVs in these regions lead to the development of ambiguous genitalia.
SCARF2 presents an interesting case: despite its presence on an individual academic center's DSD panel [47], there is no evidence explicitly stating the gene's role in development of AFG. However, like GGNBP2, it is located in a region (22q11.21) that commonly exhibits CNVs that are associated with ambiguous genitalia and MRKH types I and II [26,35]. Currently, it is known that mutations in SCARF2 cause the rare, autosomal recessive disorder Van Den Ende-Gupta syndrome (VDEGS; OMIM: 600920), which is mainly characterized by skeletal and craniofacial phenotypes [48]; however, a single case of VDEGS reportedly exhibited ambiguous genitalia [27]. Interestingly, VDEGS maps to distal 22q11.2, which contains the critical region responsible for DiGeorge syndrome (OMIM: 188400). Similar to VDEGS, patients with DiGeorge syndrome have distinct abnormal facies but with added cardiac defects, thymic hypoplasia, and hypocalcemia [49]. Less frequently, 46, XX patients may exhibit genitourinary anomalies such as absent uterus or uterine didelphys [50,51]. While DiGeorge syndrome's distinguishing phenotypes can be attributed to a haploinsufficiency of TBX1 [49], the genetic etiology of the genitourinary phenotypes is not well understood. Further investigation into the function of SCARF2 may resolve the association between CNV 22q11.2 and abnormalities of female genitalia while also shedding light on why 46, XX patients with VDEGS and DiGeorge syndrome exhibit genital anomalies.
Outside of identified but unexplored CNVs, there still exist genes that are prime for inquiry. PAPPA is an insulin-like growth factor binding protein (IGFBP) protease that increases bioavailable insulin-like growth factor which, in turn, promotes the development of a dominant ovarian follicle [52,53]. Predictably, when PAPPA is knocked out, follicular development and ovarian function in female murine models are disrupted, resulting in decreased fertility [54] and suggesting that the gene plays a significant role in the ovary's functional integrity. Nonetheless, to our knowledge, PAPPA is not yet under consideration as a potential candidate gene for 46, XX DSD.
SENP3 and EIF4A1 are jointly transcribed and eventually spliced into their respective, individual transcripts. Despite their different functions at the molecular level, they both seem to play a unique role in fertility. During meiosis, SENP3, a deSUMOylation protease, is required for the G2-M transition. Downregulation of the gene disrupts spindle assembly and germinal vesicle breakdown, which prevents the first polar body extrusion and, therefore, proper oocyte maturation [55]. Unlike SENP3, it is suggested that EIF4A1 plays a critical role post-fertilization. As a translation initiation factor, EIF4A1 produces proteins that are necessary for early embryonic cell division. Without EIF4A1, there is inadequate growth of the blastocyst, leading to implantation failure and decreased fertility [56]. Additionally, it should be noted that point mutations in the EIF4B transcription factor family can result in gonadal dysgenesis and amenorrhea in 46, XX individuals [57]. SENP3 and EIF4A1 flank the point of fertilization and seem to play significant roles in oocyte integrity. The potential of these genes leads us to suggest further investigation into their effects on female reproductive system development.

Gene-Desert SROs
SROs lacking expressed genes should also not be ignored, for they may contain DNA sequences that are necessary for the regulation of genes involved in the development of the female reproductive tract. In 2011, Benko et al. identified a 78 kb non-coding regulatory region upstream of SOX9. A few years later, this sex-determining region was whittled down to approximately 5.2 kb [31,58]. Similarly, rearrangements in the regulatory region of SOX3 have been found to be associated with sex reversal in 46, XX males [59]. Despite the landscape being devoid of genes, non-coding regions of the genome hold untapped potential for understanding human sex development.

Comorbidities
Although anomalies affecting every organ system were revealed, the most common comorbidities were intellectual disability, short stature, and micrognathia. Interestingly, we found that intellectual disability and other developmental delays occurred notably less frequently in other studies, and when separated by sex, 46, XX patients were found to have no incidences of short stature [11,17]. The prevalence of these comorbidities in our patient population is important to note, given that DSD diagnosis has historically relied on clinical presentation. Turner syndrome is often the first diagnosis considered in female-presenting DSD patients when short stature, intellectual disability, and cardiac anomalies are present; karyotypes are typically used to diagnose this condition [60,61]. Turner patients also tend to exhibit smaller mandible size, along with retrognathia and mandibular posterior rotation [62][63][64]. Additionally, most individuals with alpha-thalassemia X-linked intellectual disability (ATRX) syndrome have comorbid undifferentiated streak gonads [18]. Furthermore, MRKH phenotypes can occur in Silver-Russell syndrome, which is characterized by abnormalities of the skeletal system such as short stature and micrognathia with narrow chin [65][66][67]. Given the concurrence of these phenotypes, clinicians should perhaps approach DSD with a wider scope and consider other whole genome technologies, such as CMA, when assessing patients and performing targeted analysis of DSD genes and regions.
Although our study focused on patients with atypical female genitalia who were presumably 46, XX, a small proportion (7.67%) of patients were 46, XY. These patients displayed a wide range of DSD phenotypes, but most common were abnormalities of the ovary, uterus, and labia. Five of these patients had what would be considered male phenotypes such as cryptorchidism, hypospadias, and male DSD ("pseudohermaphroditism"). The remaining 18 patients only exhibited female DSD phenotypes. Interestingly, despite the fair number of patients with a 46, XY karyotype, we found only one SRO on chromosome Y, and it contained no genes with high likelihood of haploinsufficiency or loss-of-function intolerance. Regardless of the presence of candidate genes, it is nonetheless necessary to follow up with these patients. 46, XY females with gonadal dysgenesis are at higher risk of developing germ cell tumors (e.g., gonadoblastoma, dysgerminoma, Sertoli cell tumor, etc.) [68,69]. Endocrine, gastrointestinal, and congenital comorbidities are also frequent, and these patients should be assessed and counseled accordingly [70].

Limitations and Recommendations
The generation of our SRO map depended on the properties of the CNVs reported in the DECIPHER database. Despite our expectations, many of the deletions and/or duplications found in patients were large, which in turn led to massive SROs outside the scope of this study. There are a few likely reasons for the skewed presentation of size. First, the current standard for CMA tends to disregard smaller CNVs, potentially leading to underreporting. Additionally, DECIPHER collects data from institutions internationally, which utilize different array technologies and techniques when collecting patients' genetic information. It is for these reasons that we present our SRO map not as a complete and conclusive coordinate system but as a guide towards regions of the genome that are promising for future study. Furthermore, we would argue that these larger regions often contain an overabundance of data that is difficult to sift through. Larger CNVs must be resolved at finer detail, and we suggest that subsequent studies focus on smaller regions in order to pinpoint significant candidate genes.
Not infrequently, DECIPHER was lacking in data regarding the haploinsufficiency and loss of function effects of genes. Therefore, it is possible that potential candidate genes were missed given our method of selection. Regardless, our study provides adequate fodder for inquiry, allowing scientists to utilize our data to predict gene HI and pLI in future functional studies.
When combing the literature for previously established associations with atypical female genitalia, it was impossible to perform a complete and total search on each candidate gene. There is a slight possibility that we overlooked a study that tied a candidate gene to DSD, especially if the study indicated a CNV region but did not address the genes within. However, given our search protocol, we believe we adequately assessed the literature for existing evidence.
It is still necessary to discuss an apparent disconnect between studies on specific genes and those on CNVs. We found quite a few studies that investigated large CNVs that are associated with the manifestation of MRKH [36][37][38], yet there existed no follow-up studies probing potential candidate genes found in those regions. These CNV regions proved valuable in better understanding the etiology of the disorder; nonetheless, it is important to resolve the genes in said regions so as to understand why certain CNVs lead to pathological states.
Our final recommendation regards neither genes nor CNVs. Instead, it addresses DECIPHER's classification system. While analyzing the database, we came across DSD phenotypes that included the term "hermaphroditism". Only a few patients were cataloged using this terminology; nonetheless, using "hermaphroditism" to describe patients is both unhelpful and pejorative. Many intersex patients consider the term outdated and stigmatizing [14], and clinically, "hermaphroditism" is problematic given that it focuses on gonadal anatomy and does not consider other aspects of atypical genital development nor gender identity [9,71,72]. Classifications centered on "hermaphroditism" should be removed from DECIPHER, and instead, clinicians should record phenotypes that are specific to organ morphology and dysfunction [28]. This follows for other classifications such as general "abnormality of the female genitalia" and "ambiguous genitalia, female". Since these classifications provide no insight into the patient's biology or lived experience, we recommend a more affirmative term such as "atypical" instead of "abnormal". Understandably, it may be impossible to dispose of these classifications altogether, depending on the extent of data submitted; nevertheless, physicians should be encouraged to enter more detailed phenotypes when adding patient information into DECIPHER.

Conclusions
This study successfully identified small regions of overlap in patients with atypical development of female genitalia. Our findings suggest that CMA can be used to identify smaller, clinically significant CNVs which have historically been disregarded. The genes within these regions have an untapped potential which, upon further investigation, may provide a better understanding of the genetic etiology of the development of atypical female genitalia. We are hopeful that this study will inspire exploration of these candidate genes and lead to better diagnosis and management of individuals with AFG.
Author Contributions: I.E.A. conceptualized and designed the study and reviewed manuscript; A.U.A. collected and analyzed data and wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to a data usage agreement with the DECIPHER Project.