A novel germline mutation in the POT1 gene predisposes to familial non-medullary thyroid cancer

Non-medullary thyroid cancer (NMTC) is a common endocrine malignancy with a genetic basis that has yet to be unequivocally established. In a recent whole genome sequencing study of five families with recurrence of NMTC, we shortlisted promising variants with the help of bioinformatics tools. Here, we report in silico analyses and in vitro experiments on a novel germline variant (p.V29L) in the highly conserved oligonucleotide/oligosaccharide binding domain of the Protection of Telomeres 1 (POT1) gene in one of the families. The results showed reduced levels of telomere-bound POT1 for the mutant protein compared with its wild-type counterpart. HEK293T cells carrying POT1 V29L showed increased telomere length in comparison to wild-type cells, strongly suggesting that the mutation causes telomere dysfunction and may play a role in predisposition to NMTC in this family. This study reports the first germline POT1 mutation in a family with a predominance of thyroid cancer, thereby expanding the spectrum of cancers associated with mutations in the shelterin complex.

The risk of NMTC is about three-fold higher when a first-degree relative is diagnosed with NMTC compared to those without affected family members (Fallah, Pukkala et al., 2013). Apart from the rare syndromic forms of familial NMTC (FNMTC), including familial adenomatous polyposis, Gardner syndrome, Cowden syndrome, Carney complex type 1, Werner syndrome, and DICER1 syndrome, the genetic basis of FNMTC is largely unknown (Hincza et al., 2019, Peiling Yang & Ngeow, 2016). FNMTC has been associated with an earlier age of onset, a higher incidence of multifocality and more aggressive disease compared to its sporadic counterpart (El Lakis, Giannakou et al., 2019, Fallah et al., 2013). Thus, it is important to identify genetic factors behind the familial disease to facilitate genetic counseling and clinical management of the patients.

Various approaches, including genome-wide association studies, linkage analyses, targeted sequencing, and whole exome sequencing, have been employed to gain insight into the genetic basis of FNMTC. Several genes and loci, including mainly low-penetrance variants near or in FOXE1, SRGAP1, TITF-1/NKX2-1, DIRC3, and CHEK2, have been suggested to affect non-syndromic FNMTC susceptibility (Hincza et al., 2019, Peiling Yang & Ngeow, 2016). In addition, an imbalance of the telomere-telomerase complex has been demonstrated in the peripheral blood of familial papillary thyroid cancer patients (Capezzone, Cantara et al., 2008).

Recently, we performed whole genome sequencing (WGS) on five families with documented recurrence of NMTC and analyzed these samples with our in-house developed variant prioritization pipeline (FCVPPv2) along with other in silico tools (Srivastava, Kumar et al., 2019). This allowed us to identify a novel missense variant (p.V29L) in the protection of telomeres 1 (POT1) gene in one of the families.
POT1 is a critical component of the shelterin complex, which binds and protects telomeres by modulating telomere capping, replication, and extension by telomerase (de Lange, 2018

The subject of this study was an Italian family with reported recurrence of NMTC (Fig. 1). Five members of this family were affected by papillary thyroid cancer (PTC), Hürthle cell cancer, micro-PTC or a combination of two subtypes (II-2, II-3, II-5, II-8, II-9). Three members were possible carriers affected by benign nodules (I-1, II-4, II-6) and two were unaffected (II-1, II-7). WGS was performed on eight of these family members. The variants were filtered based on pedigree data, considering family members diagnosed with NMTC or micro-PTC as cases, those with benign nodules or goiter as potential variant carriers, and unaffected members as controls.

A total of 101,081 variants with mean allele frequencies below 0.1% was reduced to 2,708 by pedigree-based filtering. We did not identify any deleterious loss-of-function variants; however, six non-synonymous variants in six genes (EPYC, SPOCK1, MYBPC1, ACSS3, NRP1, and POT1) segregated with the disease in the family and passed the filters of the FCVPPv2 (Srivastava et al., 2019). An overview of the process leading to the selection of a candidate variant is outlined in Figure 1B. Given the importance of POT1 in various cancers, we selected the POT1 variant as our candidate for further in silico analyses and functional validation. A list of all shortlisted variants and their scores is available in the supplementary data (Table S1).
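As a rough illustration of the pedigree-based filtering step described above, the following Python sketch keeps rare variants that are carried by every sequenced case and by no sequenced control, while leaving members with benign nodules unconstrained. It is not the FCVPPv2 implementation: the variant records, allele frequencies, the choice of which eight members were sequenced, and all function names are hypothetical placeholders.

```python
from typing import Dict, List, Set

# Pedigree roles as given in the text.
CASES: Set[str] = {"II-2", "II-3", "II-5", "II-8", "II-9"}
POSSIBLE_CARRIERS: Set[str] = {"I-1", "II-4", "II-6"}
CONTROLS: Set[str] = {"II-1", "II-7"}

# Assumption: the text does not say which eight members were sequenced,
# so this set is a placeholder for illustration only.
SEQUENCED: Set[str] = (CASES | POSSIBLE_CARRIERS | CONTROLS) - {"I-1", "II-7"}


def segregates(variant_carriers: Set[str], sequenced: Set[str]) -> bool:
    """True if all sequenced cases carry the variant and no sequenced control does.

    Possible carriers (benign nodules or goiter) may or may not carry it.
    """
    cases = CASES & sequenced
    controls = CONTROLS & sequenced
    return cases <= variant_carriers and not (controls & variant_carriers)


def filter_variants(variants: List[Dict], sequenced: Set[str],
                    max_af: float = 0.001) -> List[str]:
    """Return IDs of variants below the allele-frequency cap (0.1%) that segregate."""
    return [
        v["id"]
        for v in variants
        if v["af"] < max_af and segregates(set(v["carriers"]), sequenced)
    ]


# Hypothetical toy variants, not data from the study.
VARIANTS = [
    {"id": "var_A", "af": 0.0002, "carriers": CASES | {"II-4"}},  # passes
    {"id": "var_B", "af": 0.0002, "carriers": CASES | {"II-1"}},  # in a control
    {"id": "var_C", "af": 0.0100, "carriers": CASES},             # too common
    {"id": "var_D", "af": 0.0001, "carriers": {"II-2", "II-3"}},  # misses cases
]

if __name__ == "__main__":
    print(filter_variants(VARIANTS, SEQUENCED))
```

Treating the benign-nodule members as unconstrained reflects their description as potential carriers; a stricter or looser rule would change which variants survive.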

In silico studies predict the importance of the p.V29L mutation to POT1 protein function
Comparative sequence analysis of the p.V29L position showed it to be highly conserved across selected representative species within the phylogeny (Fig. 2A). The p.V29L variant is located in the OB1 domain of the protein, as are several other germline and somatic variants reported in a wide spectrum of other human cancers (Fig. 2B, Table S2). It is also evident that the region around the position p.V29L is highly conserved. The tolerance of POT1 protein function to single amino acid substitutions was calculated by SNAP2 and accessed using PredictProtein. The heat map representation of the resulting data shows a highly deleterious effect of almost all substitutions in the position p.V29. An aggregation of highly deleterious effects of any amino acid change can be seen in the selected range (1-72 amino acids; Fig. 2C). These predictions reinforce the biological importance of the OB folds.

We obtained the crystal structure of the N-terminal domain of POT1 (aa 1-185) in complex with ssDNA from the RCSB PDB database (1XJV) (Lei, Podell et al., 2003). This domain binds G-rich telomeric ssDNA with the same specificity and higher affinity than the full-length protein, suggesting that this segment encompasses the entire DNA-binding region of the protein (Lei et al., 2003). The OB-fold shown in this crystal structure consists of a highly curved, five-stranded anti-parallel β-barrel. The interaction of the ssDNA with the concave groove of the OB folds, along with the position of our variant (p.V29L), can be seen in Figure 2D. Moreover, we predicted the change in protein stability caused by the p.V29L substitution using the mutation Cutoff Scanning Matrix approach (mCSM), which relies on graph-based signatures to predict the impact of missense mutations on protein stability. The thermodynamic change in free energy caused by the p.V29L mutation was predicted to be destabilizing (ΔΔG = -0.886 kcal/mol) (Fig. 2D).

The V29L mutation aggravates DNA dependent functions of POT1
[...] protein levels between POT1 WT and POT1 V29L transfected cells (Fig. 3A). We then performed chromatin immunoprecipitation (ChIP) assays to examine the effect of the POT1 V29L variant on the binding of POT1 to telomeric chromatin. Our results showed significantly weakened binding of telomeric DNA to POT1 V29L as compared to POT1 WT (Fig. 3B).

Furthermore, to confirm our findings from the ChIP assay, we performed an electrophoretic mobility shift assay (EMSA) using constructs containing cDNA for wild-type and mutant POT1 that were translated in vitro and incubated with radiolabeled telomeric ssDNA. EMSA results confirmed that the p.V29L alteration affected the ability of POT1 to bind to the 3' end of the G-rich telomeric overhang, whereas wild-type POT1 was able to efficiently bind to telomeric ssDNA (Fig. 3C).
To assess the effect of the POT1 V29L variant on telomere length, we measured telomere length in the family members using WGS data and found no significant difference. This could be due to naturally occurring variance in telomere length within the human population, as well as to age differences and the limited number of samples. The situation is different in cell lines, where these sources of variation are absent. We demonstrated this experimentally by passaging HEK293T cells transfected with POT1 WT and POT1 V29L 40 times and subsequently re-analyzing the telomere lengths. The results showed that telomeres were significantly longer in mutated cells than in wild-type cells (two-tailed Student's t-test, P<0.005; Fig. 3D).
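For the qPCR-based measurement of relative telomere length (RTL) described in the Methods, the calculation can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis script: it assumes ΔCt is the telomere Ct minus the albumin (single-copy gene) Ct, that replicate T/S values are averaged before normalization, and all Ct values and function names are hypothetical.

```python
from statistics import mean
from typing import List, Tuple

# Each replicate is a (Ct_telomere, Ct_albumin) pair.
CtPair = Tuple[float, float]


def ts_ratio(ct_telomere: float, ct_albumin: float) -> float:
    """T/S = 2^(-dCt), assuming dCt = Ct(telomere) - Ct(albumin)."""
    delta_ct = ct_telomere - ct_albumin
    return 2.0 ** (-delta_ct)


def relative_telomere_length(sample: List[CtPair],
                             reference: List[CtPair]) -> float:
    """RTL = mean sample T/S divided by mean reference-DNA T/S."""
    sample_ts = mean(ts_ratio(t, a) for t, a in sample)
    reference_ts = mean(ts_ratio(t, a) for t, a in reference)
    return sample_ts / reference_ts


# Hypothetical triplicate Ct values, chosen only to make the arithmetic easy.
REFERENCE = [(15.0, 16.0)] * 3   # T/S = 2^1 = 2
SAMPLE = [(14.0, 16.0)] * 3      # T/S = 2^2 = 4

if __name__ == "__main__":
    print(relative_telomere_length(SAMPLE, REFERENCE))
```

With these placeholder values the sample T/S is twice the reference T/S, so the RTL is 2; a lower telomere Ct relative to albumin corresponds to more telomeric template and hence a larger T/S.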

In our study, we identified a novel germline POT1 missense mutation that segregated with thyroid cancer in an Italian family. As the scope of personalized therapy and medical genetics advances, the importance of identifying mutations and pathways affected in different cancers is heightened. Next-generation sequencing has emerged as the state-of-the-art tool for the identification of driver mutations in tumors and novel cancer-predisposing genes in Mendelian diseases. The heritability of thyroid cancer can be attributed to both rare, high-penetrance mutations and common, low-penetrance variants. Our approach was focused on identifying the former in a familial pedigree.

[...] The assay showed a significant decrease in the mutant POT1 protein's ability to bind to ssDNA as compared to its wild-type counterpart (p = 0.01, Student's t-test [...] with poor prognoses (Jegerlehner, Bulliard et al., 2017). This is only possible if there is a strong understanding of predictive germline variants and their underlying pathways. We acknowledge that one of the limitations of this study is that the proposed disease-causing variant was found in only one family. However, when dealing with rare, high-penetrance variants, it is a challenging task to locate more than one family with a mutation in the same gene. Nonetheless, this draws attention to two aspects. First, it is evident that many other disease-causing loci have yet to be discovered; second, there is a certain ambiguity associated with the selection of one causal variant in a family, as other deleterious variants that are shared amongst patients in the family could also be important in the pathogenesis of the studied phenotype.

In conclusion, the POT1 mutation reported in this study plays a role in NMTC predisposition.

[...] data, were included to evaluate the intolerance of genes to functional mutations.
The ExAC consortium has developed two additional scoring systems using large-scale exome sequencing data, including intolerance scores (pLI) for loss-of-function variants and Z-scores for missense and synonymous variants. These were used for nonsense and missense variants, respectively. However, all the intolerance scores were used to rank and prioritize the genes and not as cut-offs for selection.

After shortlisting variants according to the aforementioned criteria, we performed a literature review on the prioritized candidates and checked if coding variants in important oncogenes, tumor suppressor genes or autosomal dominant familial syndrome genes had been missed by the cut-offs of the pipeline. These variants were handled leniently with regard to conservation and deleteriousness cut-offs and were included in the further analysis.

Candidate variant selection and validation
After filtering the variants based on the FCVPPv2, we visually inspected the WGS data for correctness using the Integrative Genomics Viewer (IGV) (

Measurement of relative telomere length
Telomere length was measured on DNA extracted from HEK293T POT1 WT and POT1 V29L cells after 40 passages using real-time PCR as described earlier by others and in our lab (Hosen, Rachakonda et al., 2015). Telomere and albumin primer sequences (5′ to 3′) were: ACACTAAGGTTTGGGTTTGGGTTTGGGTTTGGGTTAGTGT (Telg), TGTTAGGTATCCCTATCCCTATCCCTATCCCTATCCCTAACA (Telc), CGGCGGCGGGCGGCGCGGGCTGGGCGGCCATGCTTTTCAGCTCTGCAAGTC (Albugcr2) and GCCCGGCCCGCCGCGCCCGTCCCGCCGAGCATTAAGCTCTTTGGCAACGTAGGTTTC (Albdgcr2). Telomere/single-copy gene (T/S) values were calculated as 2^(−ΔCt), and relative T/S values (i.e., RTL values) were generated by dividing sample T/S values by the T/S value of a reference DNA sample (genomic DNA pooled from 10 healthy individuals). All experiments were done in triplicate and repeated twice.

Western Blot
Protein lysates were prepared and quantified using the BCA protein assay kit (Pierce, Darmstadt, Germany). 20 μg of protein was then blotted onto 0.2 μm nitrocellulose membranes and blocked with 5% milk. Membranes were incubated overnight at 4°C with the target Anti-Myc tag antibody [9E10] - ChIP Grade (ab32). Immune complexes were detected with the corresponding HRP-conjugated secondary antibody (Anti-rabbit IgG, HRP-linked Antibody, Cell Signaling, 7074). The loading control was incubated with the Anti-beta-Actin antibody [AC-15] (HRP) (ab49900) overnight at 4°C. Blots were developed using ECL Western blot substrate (EMD Millipore, Darmstadt, Germany).

Chromatin Immunoprecipitation (ChIP) assay and telomere dot-blots
[...] [9E10] - ChIP Grade ab32). The lysates were sonicated for 5 × 5 min with 5 s on/off intervals (Bioruptor, Diagenode) to obtain DNA fragments between 200 and 1000 bp. The immunoprecipitated DNA was purified with the iPure kit (Diagenode, c03010014).
Purified DNA was slot blotted onto a Hybond N+ membrane with the help of a dot-blot apparatus (Bio-Rad, 170-6545) and subsequently hybridized with a biotin-labeled (TTAGGG)3 probe synthesized by Sigma. The North2South® Chemiluminescent Hybridization and Detection Kit (Thermo Fisher: 17097) was used to detect the biotin signal with the help of a CCD camera. Signals were then quantified with ImageJ, and the fold enrichment was calculated. The amount of telomeric DNA after ChIP was normalized to the total input telomeric DNA.

Electrophoretic mobility shift assay (EMSA)

The gel shift assay of POT1 wt and POT1 V29L was performed as described previously (Ramsay et al., 2013). In brief, a 20 µl reaction was prepared in EMSA buffer (25 mM HEPES-NaOH (pH 7.5), 100 mM NaCl, 1 mM EDTA and 5% glycerol) supplemented with 1 µg of poly(dI-dC) and around 30-40 ng of γ-32P-labeled ds telomere probe (GGTTAGGGTTAGGGTTAGGG) per reaction. The reactions were incubated for 30 min at 25°C. POT1 wt and POT1 V29L were immunoprecipitated from HEK293T cells ectopically expressing the respective POT1 proteins by lysing them in 20 mM Tris pH 7.5, 40 mM NaCl, 2 mM MgCl2, 0.5% NP40, 50 U/ml Benzonase, supplemented with protease and phosphatase inhibitors. After 15 min of incubation on ice, the NaCl concentration was adjusted to 450 mM and the incubation was continued for another 15 min. Lysates were clarified by centrifugation (13,200 rpm, 20 min, 4°C) and 1.0 mg of total protein was used per immunoprecipitation in IP buffer (25 mM Tris-Cl (pH 7.5), 150 mM NaCl, 1.5 mM DTT, 10% glycerol, 0.5% NP40) supplemented with protease and phosphatase inhibitors. Endogenous proteins were captured onto protein G-magnetic beads (NEB; #S1430S), washed extensively in IP buffer and used as the source of POT1 wt and POT1 V29L. After gel shift incubation, the reaction contents were loaded onto a pre-electrophoresed 5% acrylamide/bis (37.5:1) gel in 0.5x TBE and run at 100 V at 25°C. The gels were dried and analyzed by autoradiography. The labeled probe alone served as a negative control for the EMSA.

[...] for first-degree relatives is increased 3-5-fold compared to the general population and the genetic alterations responsible for this disease are hardly known. Several low-penetrance predisposing loci have been identified using GWAS in recent years, but no high-penetrance mutations have been confirmed so far.

We identified a novel germline variant in the POT1 gene, a component of the shelterin complex. Functional in vitro studies showed that the mutation weakens the binding of POT1 to telomeric DNA, leading to abnormal telomere elongation. This is the first POT1 mutation identified in a family with a predominance of thyroid cancers, thereby expanding the spectrum of cancers associated with mutations in the shelterin complex.