The Genetics of Polycystic Ovary Syndrome: An Overview of Candidate Gene Systematic Reviews and Genome-Wide Association Studies

Polycystic Ovary Syndrome (PCOS) is a complex condition with mechanisms likely to involve the interaction between genetics and lifestyle. Familial clustering of PCOS symptoms is well documented, providing evidence for a genetic contribution to the condition. This overview aims firstly to systematically summarise the current literature surrounding genetics and PCOS, and secondly, to assess the methodological quality of current systematic reviews and identify limitations. Four databases were searched to identify candidate gene systematic reviews, and quality was assessed with the AMSTAR tool. Genome-wide association studies (GWAS) were identified by a semi-structured literature search. Of the candidate gene systematic reviews, 17 were of high to moderate quality and four were of low quality. A total of 19 gene loci have been associated with risk of PCOS in GWAS, and 11 of these have been replicated across two different ancestries. Gene loci were located in the neuroendocrine, metabolic, and reproductive pathways. Overall, the gene loci with the most robust findings were THADA, FSHR, INS-VNTR, and DENND1A, which now require validation. This overview also identified limitations of the current literature and important methodological considerations for future genetic studies. Much work remains to identify causal variants and the functional relevance of genes associated with PCOS.


Introduction
Polycystic ovary syndrome (PCOS) is a major public health concern affecting 6-10% of reproductive-aged women [1]. PCOS is exacerbated by obesity and has significant metabolic, reproductive, and psychological features, including an increased risk of Type 2 Diabetes Mellitus (T2DM) with an earlier age of onset, subfertility, and an increased risk of depression and anxiety symptoms [2][3][4]. At present, the internationally accepted criteria for diagnosis of PCOS are the revised Rotterdam criteria [5], which require exclusion of other causes of hyperandrogenism, such as adrenal or pituitary dysfunction, and the presence of two of the following three characteristics: oligo- or anovulation, clinical and/or biochemical signs of hyperandrogenism, and polycystic ovaries on ultrasound. The Rotterdam criteria yield four phenotypes of PCOS, and there is evidence that the different PCOS phenotypes have varying degrees of adiposity and may differ in metabolic and reproductive profiles [6]. The proposed pathophysiology of PCOS is a synergistic relationship between perturbed gonadotrophin-releasing hormone (GnRH) pulsatility and hyperandrogenism, probably accompanied by hyperinsulinemia, insulin resistance, and inflammation. However, the nuances of these relationships are yet to be fully elucidated (Figure 1) [7][8][9].
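The "two of three" rule can be made concrete with a short sketch showing why the Rotterdam criteria give rise to exactly four phenotypes. This is an illustration only: the function names and the boolean inputs are hypothetical, and real diagnosis depends on clinical assessment of each characteristic.

```python
from itertools import combinations

# The three Rotterdam characteristics named in the text.
FEATURES = ("hyperandrogenism", "oligo- or anovulation", "polycystic ovaries")


def rotterdam_phenotypes():
    """Return every combination containing at least two of the three features."""
    phenotypes = []
    for size in (3, 2):
        phenotypes.extend(combinations(FEATURES, size))
    return phenotypes


def meets_rotterdam(hyperandrogenism, ovulatory_dysfunction, polycystic_ovaries,
                    other_causes_excluded=True):
    """Hypothetical helper: True when at least two features are present and
    other causes of hyperandrogenism have been excluded."""
    present = sum([hyperandrogenism, ovulatory_dysfunction, polycystic_ovaries])
    return other_causes_excluded and present >= 2


# List the feature combinations that satisfy the criteria.
for combo in rotterdam_phenotypes():
    print(" + ".join(combo))
```

One combination contains all three features and three contain exactly two, which accounts for the four phenotypes; only one of them lacks hyperandrogenism.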
J. Clin. Med. 2019, 8, 1606
In addition, PCOS pathophysiology appears to have a polygenic predisposition that is exacerbated by environmental factors, especially obesity [2]. The polygenic predisposition is well documented with familial clustering of PCOS symptoms, providing evidence for a genetic contribution to the pathophysiology [10]. Both female and male family members of women diagnosed with PCOS share common characteristics of the syndrome. Moreover, they seem to be more prone to develop T2DM and metabolic syndrome [11]. Finally, mono- and dizygotic twin studies have demonstrated the heritability of PCOS to be approximately 70% [12].
Two complementary types of genetic studies are genome-wide association studies (GWAS) and candidate gene studies. GWAS look for associations between common genetic polymorphisms and the trait or disease without a predefined hypothesis about the possible role of genetic variants in the pathophysiology. However, a common misunderstanding about GWAS is that they identify specific genes. They merely provide information about a genetic region (gene locus) that is significantly associated with the trait. The identified gene loci might be directly involved in gene function if located in or near a gene, or they might have a regulatory function for genes up- or downstream. Hence, genetic loci detected by GWAS provide ideal a priori candidate genes, located within these loci, to investigate. GWAS have been conducted in Chinese, Korean, and European cohorts, and have identified up to 19 distinct genetic loci in, or near, known genes that are associated with PCOS [13][14][15][16][17][18][19]. Candidate gene studies are particularly useful to validate and decipher the functional impact of gene loci identified by GWAS to contextualise clinical relevance [20]. A multitude of candidate gene studies have been conducted in PCOS, identifying single nucleotide polymorphisms (SNPs) that may contribute to the genetic basis of PCOS [14]. These studies provide an effective approach for detecting genetic variants that are either causative or belong to a shared haplotype that is causative [21][22][23]. However, candidate gene studies have many common limitations, such as small sample size and selection bias from confounding variables such as ancestry, diagnostic criteria, BMI, and source of participants, which can limit statistical power and result in different findings [24,25].
In all fields of science, systematic evaluations are crucial for establishing the consistency and significance of the current evidence base [24]. Such evaluations may take the form of systematic reviews of individual studies or overviews of systematic reviews. An overview of systematic reviews aims to assess the methodological quality of systematic reviews on a given topic and the consistency of evidence contained in them [26]. The aim of this overview was to systematically evaluate the current evidence regarding the genetics of PCOS from candidate gene systematic reviews and GWAS. We also explored the methodological quality of the systematic reviews to identify how future genetic studies can align the findings of candidate gene and GWAS studies by validating gene loci. A comprehensive understanding of the current evidence and its limitations is necessary to ultimately improve our understanding of the biological origins of PCOS.

Protocol and Registration
This review was designed and reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [27]. The protocol was registered in the international prospective register of systematic reviews PROSPERO (CRD42016052649). The search (Table S2) was modified for EMBASE and CINAHL using their subject headings instead of the MeSH subject headings. Two independent reviewers (L.J.M. or A.M.-A. (update of search) and D.H.), who were not blinded to the names of investigators or sources of publication, identified and selected the systematic reviews that met the inclusion criteria.

Eligibility Criteria and Inclusion Criteria
The Participant, Intervention, Comparison, Outcomes, and Studies (PICOs) framework was used for this Overview of Systematic Reviews (OSR) (Table S1). Briefly, the population was women with PCOS, and the intervention was a systematic review with or without a meta-analysis with a primary focus on genetic associations. Systematic reviews not on the genetics of PCOS (e.g., focusing on assessment or treatment) were excluded and are the focus of separate OSRs [28,29]. The specific inclusion criteria for systematic reviews were a publication date from 2009 onwards, description of a search strategy containing at least key words or terms, inclusion of the number of identified and included articles, and quality appraisal of the articles. The comparison term was not applicable in this context. The outcomes included the methodology, results, and quality of each systematic review. Only articles published in English were included.

Data Extraction
Data from all eligible systematic reviews (SRs) were extracted independently by two reviewers (D.H. and A.M.-A.). The data extracted included information on authors, publication date, inclusion criteria, SR outcomes, number of participants in the SR, and quality of identified articles in each SR. Methodological variables specific to genetic association studies were also extracted: source of participants, whether the control group was in Hardy-Weinberg Equilibrium (HWE), and the method by which the control group was dealt with in the SR if it departed from HWE. Briefly, the HWE principle states that allele and genotype frequencies in a population remain constant across generations in the absence of other influences; healthy and therefore "disease-free" control groups are expected to be in equilibrium [30]. Departures from HWE can indicate a number of methodological issues, including poor study design or genotyping errors. There is no consensus on which method is most appropriate to deal with deviations from HWE, but common procedures are excluding any studies that have a significant deviation from HWE before conducting meta-analysis, conducting sensitivity analysis to examine whether meta-analysis results are altered when studies containing control groups not in HWE are excluded, or correcting the pooled odds ratio [31]. For each SR, the diagnostic criteria for PCOS and the criteria for defining the control group were extracted.
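As an illustration of how a control group can be checked against HWE, the following is a minimal sketch and not the procedure used by any of the included reviews. It assumes a biallelic SNP and a Pearson chi-square goodness-of-fit test with one degree of freedom; the function name `hwe_chi_square` and the genotype counts in the usage lines are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg Equilibrium.

    Takes observed genotype counts (AA, AB, BB) for a biallelic SNP and
    returns the chi-square statistic and its p-value (1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under HWE: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    # Skip cells with zero expectation (monomorphic SNP).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control group genotype counts.
chi2, p_value = hwe_chi_square(298, 489, 213)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A small p-value (conventionally below 0.05) would flag the control group as departing from HWE, at which point a review would apply one of the procedures listed above. For small samples or rare alleles an exact test is usually preferred over the chi-square approximation.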

Quality Assessment of Systematic Reviews
All included systematic reviews were evaluated by two independent reviewers (D.H. and A.M.-A.) using the Assessing the Methodological Quality of Systematic Reviews 2 (AMSTAR2) tool, which contains 16 items to appraise the methodological aspects of systematic reviews [32]. Systematic reviews were ranked as high, moderate, low, or critically low quality based on the number of critical domains (items 2, 4, 7, 9, 11, 13, and 15) missed [32]. At all stages of data extraction and quality assessment, disagreements between the two reviewers (D.H. and A.M.-A.) were discussed and resolved by consensus or arbitration (M.G.-H.).
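The ranking step can be sketched as follows. This is an illustration under a stated assumption: it applies the overall-confidence scheme published with AMSTAR2 [32] (no critical flaw and at most one non-critical weakness: high; more than one non-critical weakness: moderate; one critical flaw: low; more than one critical flaw: critically low). The function name and the `failed_items` input are hypothetical, and the reviewers' actual judgements may have involved partial ratings that this sketch ignores.

```python
# Critical AMSTAR2 domains as listed in the text.
CRITICAL_ITEMS = frozenset({2, 4, 7, 9, 11, 13, 15})
ALL_ITEMS = frozenset(range(1, 17))


def amstar2_rating(failed_items):
    """Map the set of AMSTAR2 items a review failed to an overall rating."""
    failed = set(failed_items)
    unknown = failed - ALL_ITEMS
    if unknown:
        raise ValueError(f"not AMSTAR2 items: {sorted(unknown)}")

    critical_flaws = len(failed & CRITICAL_ITEMS)
    non_critical = len(failed - CRITICAL_ITEMS)

    if critical_flaws > 1:
        return "critically low"
    if critical_flaws == 1:
        return "low"
    if non_critical > 1:
        return "moderate"
    return "high"


# Hypothetical review that missed two non-critical items and no critical item.
print(amstar2_rating({3, 10}))
```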

Semi-Structured Search of Genome-Wide Association Studies (GWAS)
GWAS are extremely informative in providing candidate genetic loci but were not included in our original systematic search and PICO. Therefore, a semi-structured search was performed to identify all GWAS to date (April 2019) to provide a robust and comprehensive overview of genetics and PCOS. Keywords searched were "PCOS", "Polycystic Ovary Syndrome", "GWAS", and "genome-wide association studies". Narrative reviews and pathway analyses were excluded as these were not the focus of the paper.

Metabolism
The genes investigated for their role in metabolic function were INSR, Adiponectin, TCF7L2, IRS-1, IRS-2, Calpain-10, CYP1A1, CYP11A1, PON1, DENND1A, and Insulin gene-variable number of tandem repeats (INS-VNTR) (Figure 3). In two systematic reviews, the A allele of the SNP rs1801278 in IRS-1 was associated with an increased risk of PCOS in a combined cohort of women from different ancestries [39,44]. The III allele in INS-VNTR was associated with an increased risk of PCOS compared to the I allele [52], also in a combined cohort of multiple ancestries. The SNP rs4646903 in CYP1A1 and the CYP11A1 microsatellite [TTTA]n repeat polymorphism were associated with increased risk of PCOS, although not across all ancestries [41,42]. In adiponectin, the T allele of the rs1501299 SNP was associated with a decreased risk of PCOS; however, ancestry played a major role, with a significant association found only in a mixed population of "East Asian" ancestry. Three SNPs in Calpain-10 were associated with increased risk of PCOS, with ancestry again playing a major role: a significant association was found only in a mixed population of "Asian" ancestry [40]. The two SNPs in PON1 were associated with increased risk of PCOS overall but, again, this was restricted to specific ancestries and diagnostic criteria [49]. Two SNPs in DENND1A (rs10818854 and rs10986105) were associated with increased risk of PCOS [48], while the SNP rs2479106 in DENND1A was associated only in women of mixed "Asian" ancestry [48]. In summary, the genes IRS-1, INS-VNTR, Calpain-10, PON1, CYP1A1, CYP11A1, DENND1A, and Adiponectin were found to be associated with risk of PCOS. Some of the SNPs in these genes were not consistent across ancestries, indicating that future genetic studies should include larger sample sizes and investigate other SNPs within these genes to uncover the causal SNP variant associated with PCOS in different ancestries.

Androgens and Gonadotrophins
Four systematic reviews focused on CYP17, FSHR, AMH, AMH receptor (AMHR II), and androgen receptor (AR) genes and their association with androgens and gonadotrophins (Figure 4). The rs6166 SNP in FSHR was associated with PCOS, with the Asn allele showing a protective effect, but this was only significant in a mixed population of women with European ancestry [37]. CAG repeat length polymorphism in AR was positively associated with plasma testosterone concentration, but not with PCOS per se [53]. There was no clear association between SNPs in CYP17, AMH, or AMHR II and PCOS [36,51]; however, it is possible that other SNPs within these gene loci are related to PCOS and/or specific clinical characteristics of PCOS [54]. In summary, SNPs in the genes FSHR and AR were found to be associated with risk of PCOS.

Inflammation
Four systematic reviews focused on inflammation and investigated cytokine genes: TNF-α, IL-6, IL-1β, IL-10, and IL-18 (Figure 5). Of the three SNPs investigated within TNF-α, only one (rs1799964) was positively associated with PCOS. All four systematic reviews concurred that the C allele (rs1800795) in IL-6 was a protective factor for PCOS risk. However, in three of the systematic reviews, when only primary studies with control groups in HWE were included, the association was no longer significant [34,45,46]. In contrast, the most recent systematic review [47], which had the largest sample size, found that this association held when controls were in HWE, indicating that rs1800795 in IL-6 may be associated with PCOS. SNPs in the genes IL-1β, IL-10, and IL-18 were not associated with PCOS. In summary, evidence suggests that TNF-α (rs1799964) and IL-6 (rs1800795) may be associated with PCOS, but larger sample sizes and further studies with appropriate control groups are required to confirm these findings.
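The kind of sensitivity analysis referred to here, in which a pooled estimate is recomputed after removing studies whose controls depart from HWE, can be sketched as below. This is a minimal illustration assuming a fixed-effect, inverse-variance pooled odds ratio; the included reviews may have used other models. All study names and allele counts are invented for the example and are not data from any cited review.

```python
import math


def log_or_and_variance(a, b, c, d):
    """Log odds ratio and its variance from a 2x2 allele-count table:
    cases with/without the allele (a, b), controls with/without (c, d)."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def pooled_or(studies):
    """Fixed-effect inverse-variance pooled OR with a 95% confidence interval."""
    weighted_sum = 0.0
    weight_total = 0.0
    for study in studies:
        log_or, variance = log_or_and_variance(*study["table"])
        weight = 1.0 / variance
        weighted_sum += weight * log_or
        weight_total += weight
    pooled = weighted_sum / weight_total
    se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se))


# Hypothetical primary studies (invented counts, for illustration only).
studies = [
    {"name": "Study A", "table": (60, 140, 90, 110), "controls_in_hwe": False},
    {"name": "Study B", "table": (85, 115, 95, 105), "controls_in_hwe": True},
    {"name": "Study C", "table": (150, 250, 160, 240), "controls_in_hwe": True},
]

all_or, all_lo, all_hi = pooled_or(studies)
hwe_or, hwe_lo, hwe_hi = pooled_or([s for s in studies if s["controls_in_hwe"]])

print(f"All studies:        OR {all_or:.2f} (95% CI {all_lo:.2f}-{all_hi:.2f})")
print(f"Controls in HWE:    OR {hwe_or:.2f} (95% CI {hwe_lo:.2f}-{hwe_hi:.2f})")
```

If the confidence interval excludes 1 with all studies pooled but includes 1 once studies with controls out of HWE are dropped, the association is sensitive to those studies and should be interpreted cautiously.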


Genome-Wide Association Studies
Six GWAS have been conducted to identify gene loci that are associated with PCOS. Two GWAS were conducted in women with Han Chinese ancestry (both North and South) [15,16,19], two in women with Korean ancestry [18,19], and two in women with European ancestry, determined by genetic analysis of local ancestry [13,17]. To date, only one meta-analysis of GWAS has been conducted in women with European ancestry [17]. In total, 19 genetic loci associated with risk of PCOS have been identified in three biogeographical ancestries: Korean, Han Chinese, and European. Eleven of the 19 loci are common to Han Chinese and European ancestry. Table 1 summarises the findings of the six GWAS. Table 2 summarises the SNPs identified from the meta-analysis, including three novel loci [17].

Table 2. SNPs identified from a GWAS meta-analysis in women of European biogeographical ancestry.

Study Diagnostic Criteria Gene Locus SNPs Nearest Gene
Day et al. [17] Rotterdam NIH
* Gene loci that have been found in common in women of Han Chinese ancestry.
Only four systematic reviews (25%) described in detail the inclusion criteria for the control groups: absence of irregular cycles, subfertility, polycystic ovarian morphology and signs of hyperandrogenism, or healthy with proven fertility (Supplementary Table S5). The remaining systematic reviews either did not describe any criteria for control group inclusion or described women in the control groups as healthy but without including any further detail.

Assessment of Systematic Review Quality Using the AMSTAR Tool
None of the systematic reviews met all 16 AMSTAR criteria (Table S4) [32]. Of the 21 systematic reviews, 24% (5/21) were of high quality, indicating that the reviews provided accurate and comprehensive results, and 57% (12/21) were of moderate quality, indicating that weaknesses were identified but the reviews were still able to provide meaningful results. The remaining four systematic reviews were of low quality and should be treated with caution [32]. Of the systematic reviews relating to metabolic dysfunction, 3/13 were of high quality, 8/13 were of moderate quality, and 2/13 were of low quality. Of the systematic reviews relating to imbalances in androgens and gonadotrophins, 2/4 were of moderate quality and 2/4 were of low quality. Of the systematic reviews relating to inflammation, 2/4 were of high quality and 2/4 were of moderate quality.

Discussion
This systematic overview, encompassing both candidate gene systematic reviews and GWAS, highlights gene loci that have robust associations with PCOS. These include the genes DENND1A, INS-VNTR, and INSR, which are related to metabolic dysfunction, and THADA and FSHR, which relate to imbalances in androgen and gonadotrophin pathways. However, SNPs in inflammatory genes seem unrelated to PCOS and require additional investigation, especially in the context of obesity and PCOS.

Metabolic Dysfunction
Metabolic dysfunction is involved in the aetiology of PCOS [9,29]. Much research has been conducted in this area, supported by our finding that over half of the gene loci identified were concerned with metabolic dysfunction. More specifically, most studies examined genetic variants within genes that regulate insulin resistance, which is strongly implicated in the aetiology and the reproductive and metabolic health consequences of PCOS [6,7]. The genetic variants from candidate gene systematic reviews, including CYP1A1, CYP11A1, Adiponectin, Calpain-10, and PON1, provide promising candidate genes, but the SNPs investigated were not significant across all ancestries. This could indicate that while these genetic loci could be associated with PCOS, the genotypic variations (SNPs) within the gene loci may vary depending on the ancestry of the population being studied [56][57][58]. This is clearly illustrated by the DENND1A gene, whereby the rs10818854, rs2479106, and rs10986105 SNPs were associated with PCOS in those of Han Chinese ancestry and mixed "Asian" ancestry [15,16,48]. In European ancestry, by contrast, it was the SNP rs9696009 that was associated with risk of PCOS [17], highlighting the important role that ancestry plays in the genetics of PCOS.
SNPs in INS-VNTR, which are implicated in the development of T2DM, were associated in two candidate gene systematic reviews with increased risk of PCOS [40,52]. Further, the Shi et al. GWAS identified the SNP rs2059807 in INSR as associated with PCOS in women of Han Chinese ancestry [15]. This was followed up by Feng et al. [33], who investigated multiple SNPs in INSR, including rs2059807. While they were unable to perform a meta-analysis for this SNP due to insufficient data, three of the four genetic studies found a significant association with PCOS [33]. This SNP provides an ideal candidate to investigate in other ancestries, and further reinforces the importance of insulin resistance in the aetiology of PCOS.
In summary, robust candidate gene studies of the following loci are required in different ancestries:

Dysregulation of Androgens and Gonadotrophins
Follicular arrest, menstrual dysfunction, and anovulation are commonly observed in PCOS [59][60][61], and are linked to excess androgens, FSH and LH imbalances, and elevated AMH levels. Therefore, candidate gene studies have focused on SNPs in genes such as CYP17, AMH, or AMHR that are associated with these pathways. From this overview, we cannot at this time establish or refute any of these candidate genes, as the current candidate gene systematic reviews were of moderate quality [37,51] or low quality [36,53].
The FSHR locus is more promising. It was identified in the GWAS conducted by Chen et al. and Shi et al. [15,16], who examined women of Han Chinese ancestry and found that the SNPs rs2268361 and rs2349415 were associated with PCOS. In women of European ancestry, the candidate gene systematic review conducted by Qiu et al. indicates that the SNP rs6166 in the FSHR gene may be the more relevant genetic variant [37]. The FSHR locus has also been significantly associated, in phenotypic studies, with levels of gonadotrophins (FSH and LH), indicating this may be a causal gene in the development of PCOS [62]. Interestingly, however, the GWAS [13,14,17] that were conducted in European ancestry did not uncover any association with the SNP rs6166 that was identified by Qiu et al. in the candidate gene systematic review. We speculate this may be due to the different diagnostic criteria used [37]. The GWAS included only women with PCOS diagnosed by the NIH criteria, which potentially biased the gene loci towards the more severe phenotype of hyperandrogenism and anovulation, while the candidate gene systematic review conducted by Qiu et al. used the more inclusive Rotterdam criteria, which include the polycystic ovarian morphology (PCOM) phenotype [37]. A recent review has highlighted this conflicting relationship between the SNPs in the FSHR locus and risk of PCOS [63], emphasising that the SNPs may be correlated with higher basal FSH serum levels and the PCOM phenotype rather than PCOS per se, hence the discrepancy between the GWAS findings and the candidate gene systematic review.
The THADA locus has been replicated across multiple ancestries and diagnostic criteria, providing another promising candidate gene in PCOS aetiology. While the functional role of THADA has yet to be well characterised, SNPs within the THADA locus have been identified as candidates in T2DM, a comorbidity of PCOS [64]. Subsequent genotype-phenotype correlational analysis found that the THADA gene contributes to hyperandrogenism in women with PCOS of Han Chinese ancestry [65]. In summary, robust markers related to imbalances in androgens and gonadotrophins have been identified, and future studies need to (a) identify the underlying molecular mechanism of the THADA and FSHR genes in PCOS, and (b) uncover the causal SNP variant across different ancestries and phenotypes.

Inflammation
Inflammation potentially acts as a link between insulin resistance and hyperandrogenism in PCOS and is associated with both [66,67]. Two candidate gene systematic reviews focused on SNPs in the gene for TNF-α, a pro-inflammatory cytokine that has been associated with PCOS, ovarian function, and ovulation, and is a known mediator of insulin resistance [34,46,66]. Neither reported significant associations between the rs1800629 SNP and PCOS [34,46]. However, the SNP rs1799964 was positively associated with PCOS, suggesting this may be the causal polymorphism in the TNF-α gene for susceptibility to PCOS [46]. Four systematic reviews examined the SNP rs1800795 in the IL-6 gene and found a decreased risk of PCOS when carrying the C allele. Caution is warranted because, when only primary studies with control groups in HWE were included, this association was no longer significant in three of the candidate gene systematic reviews [34,45,46]. However, the most recent systematic review [47], which had the largest sample size, found that this association held when controls were in HWE, indicating that rs1800795 in IL-6 may be associated with PCOS, but this requires further investigation before conclusions can be made [34,45,46,47].
The conflicting findings regarding SNPs in inflammatory genes could be explained by selection bias introduced by the source of recruitment. Compared to community-based recruitment, recruitment from hospitals would lead to selection bias towards the more severe hyperandrogenic-anovulatory phenotype, which has greater rates of obesity and inflammation [6]. Whether low-grade inflammation is intrinsic to PCOS or a consequence of PCOS-related obesity is contentious. Some, but not all, studies demonstrate that inflammation is independent of BMI in women with PCOS [68,69]. Unfortunately, only one of the included systematic reviews investigated the confounding influence of BMI on inflammation gene variants [34]. Obesity is known to exacerbate many of the symptoms of PCOS, and it would be prudent for future systematic reviews to investigate the role BMI may play in the association between inflammation-related gene loci and PCOS. In fact, the impact of environmental factors on PCOS is demonstrated by weight management being the first-line treatment for the condition [70]. The contribution of these gene-environment interactions and epigenetics to PCOS is an emerging field, with recent findings revealing specific epigenetic reprogramming of genes involved in reproductive function in women with PCOS [71]. Therefore, it is important that future studies acknowledge and investigate the role of gene-environment interactions in the context of PCOS.

Recommendations for Future Genetic Analysis
We found little overlap between gene loci identified from GWAS and those identified from the candidate gene systematic reviews. This could be due to multiple reasons. Firstly, some of the gene loci identified from the candidate gene systematic reviews may have had real but weak associations in GWAS that were lost in the adjustment for multiple comparisons [72][73][74]. A second reason may be that the GWAS were not designed to be sensitive enough to detect rare genetic variants that may have a larger effect on genetic risk. Instead, the GWAS focused on common variants, and in general, these alleles may only have a small effect on genetic risk [74,75]. This was recently shown in a family-based association study that identified multiple rare genetic variants in DENND1A that could not have been detected in GWAS; these rare variants were found to contribute to PCOS and, in particular, to hormonal imbalances [76]. GWAS have explained less than 10% of PCOS heritability, while twin and family studies indicate that heritability may be up to 70%. This emphasises that a combined approach of GWAS, candidate gene association, and family-based studies is required to fully elucidate the genetic contribution to the origins of PCOS [77].
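The first of these reasons can be illustrated with a short sketch of how a multiple-comparison adjustment changes the significance threshold. It assumes a Bonferroni correction and the conventional figure of roughly one million independent common variants, which gives the widely used genome-wide threshold of 5 × 10⁻⁸; the number of candidate gene tests and the p-value are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_significant(p_value, alpha, n_tests):
    """True when p_value survives the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


ALPHA = 0.05
GENOME_WIDE_TESTS = 1_000_000  # assumption: ~1e6 independent common variants
CANDIDATE_TESTS = 5            # hypothetical number of SNPs in a candidate gene study

p_value = 1e-5  # hypothetical association p-value for one SNP

print("Candidate gene study:", is_significant(p_value, ALPHA, CANDIDATE_TESTS))
print("GWAS:                ", is_significant(p_value, ALPHA, GENOME_WIDE_TESTS))
```

Under these assumptions the same association passes the candidate gene threshold but falls well short of the genome-wide one, which is one way a real but weak signal can appear in candidate gene reviews and not in GWAS.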
Future genetic studies should consider performing Phenome-Wide Association Studies (PheWAS), which examine many different phenotypes to see which, if any, are associated with a given genetic variant [78]. The findings of this overview provide potential pathways that could be used in PheWAS, firstly to determine the functional relevance of the identified genes, and secondly to explore associations between genetic polymorphisms and different PCOS phenotypes. The importance of this has been recently highlighted in a GWAS meta-analysis where it was identified that the SNP rs804279 in the GATA4/NEIL2 gene locus was strongly associated with the NIH diagnostic criteria, which encompasses only hyperandrogenic PCOS phenotypes, compared to the Rotterdam criteria, which also encompasses the non-hyperandrogenic PCOS phenotypes [17]. Rigorous reporting and examination of the differing diagnostic criteria, and therefore PCOS phenotypes, will be particularly important to both genetic studies and PheWAS to elucidate whether the four phenotypes of PCOS may have different molecular origins [6].
This overview is a timely reminder of important methodological considerations for future genetic studies in PCOS and in complex diseases more generally. Genetic studies need to be more meticulous in reporting ancestry, which is essential for characterising human genetic variation. Many studies used the term "race/ethnicity", which is not appropriate in a genetic study, as commonly used ethnicity terms including Caucasian, Asian, White, African, or Latino are poor predictors of human genetic variation or similarity [58,79]. Instead, it would be more accurate to calculate ancestry, or the geographical origins of individual participants' ancestry (biogeographical ancestry), to uncover genetic variation in PCOS. Ancestry Informative Markers (AIMs) can estimate biogeographical ancestry, especially in individuals of mixed ancestral background, and provide an improved understanding of the impact of genes in PCOS [75,79].
Criteria for control groups need to be clearly defined, as most of the systematic reviews simply stated that they included healthy women or did not define the relevant inclusion criteria. This may affect the strength of association [75,80], as suggested by two systematic reviews reported in this overview that included the same primary studies but came to different conclusions [38,43]. While Ramos et al. (2015) excluded any controls from their meta-analysis that were not considered healthy, Shen et al. (2014) did not describe the inclusion criteria for the control group. It is therefore difficult to compare the meta-analyses and to understand where this variation in findings originated. Another contentious issue is whether to include or exclude individual primary studies whose control groups did not conform to Hardy-Weinberg Equilibrium (HWE) [31,81]. We note that a variety of methods were used to deal with the primary studies that departed from HWE, and some of the systematic reviews did not consider this issue at all. Almost all systematic reviews acknowledged they were limited by a small sample size, highlighting the need for larger primary studies and systematic reviews, with the candidate genes informed by family-based studies and GWAS. Although most systematic reviews were at low risk of bias, there was a lack of consistent methodological rigour regarding clear definitions of cases and controls, the reporting of ancestry, and the handling of deviations from HWE. These issues should be addressed to progress our knowledge of the role of genetics in the aetiology of PCOS.
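For readers less familiar with how conformity to HWE is assessed in a control group, the following is a minimal sketch using a chi-square goodness-of-fit test for a biallelic SNP. It is not the method of any particular review included here. The function name and genotype counts are hypothetical, and an exact test may be preferable when genotype counts are small.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg Equilibrium.

    Takes observed genotype counts for a biallelic SNP (AA, AB, BB) and
    returns the chi-square statistic and its p-value.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")

    # Allele frequencies estimated from the observed genotypes.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p

    # Genotype counts expected under HWE: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)

    # Terms with zero expected count (monomorphic SNP) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical control-group genotype counts, for illustration only.
chi2, p_value = hwe_chi_square(n_aa=298, n_ab=489, n_bb=213)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

A small p-value would indicate departure from HWE in the controls. Whether such studies are then excluded, down-weighted, or examined in a sensitivity analysis is the methodological choice that varied across the systematic reviews.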

Conclusions
This overview of systematic reviews and GWAS identified several PCOS candidate gene loci that are located in the neuroendocrine, metabolic, and reproductive pathways. Additionally, we described the limitations and important methodological considerations that should inform and complement future genetic studies. We have provided a comprehensive catalogue of gene loci, with work now required to identify causal variants and functional relevance to the biological origins and established pathophysiology of PCOS.