Spectrum of Causative Mutations in Patients with Hemophilia A in Russia

Hemophilia A (HA) is one of the most widespread, X-linked, inherited bleeding disorders, which results from defects in the F8 gene. To date, more than 3500 different pathogenic variants leading to HA have been described. Mutation analysis in HA is essential for accurate genetic counseling of patients and their relatives. We analyzed patients from 273 unrelated families with different forms of HA. The analysis consisted of testing for intron inversions (inv22 and inv1), and then sequencing all functionally important F8 gene fragments. We identified 101 different pathogenic variants in 267 patients, among which 35 variants had never been previously reported in international databases. We found inv22 in 136 cases and inv1 in 12 patients. Large deletions (1–8 exons) were found in 5 patients, and we identified a large insertion in 1 patient. The remaining 113 patients carried point variants involving either a single nucleotide or several consecutive nucleotides. We report herein the largest genetic analysis of HA patients conducted in Russia.


Introduction
Hemophilia A (HA, MIM no. 306700) is an inherited, recessive, X-linked bleeding disorder caused by a wide spectrum of mutations in the gene encoding coagulation factor VIII (F8 gene). HA affects 1 in 5000 males. The F8 gene spans approximately 186 kb on chromosome X at locus q28 and consists of 26 exons [1]. The FVIII protein consists of a signal peptide (19 residues) and a sequence of domains (A1a1A2a2Ba3A3C1C2) that contains 2332 residues, for a total of 2351 residues. The mature FVIII molecule is a heterodimer that circulates as heavy (A1A2B domains) and light (A3C1C2 domains) chains bound non-covalently by a divalent metal bridge. Cleavage, prior to secretion and at activation, results in a coagulant heterotrimer consisting of domains A1 + A2 + A3C1C2 [2]. Depending on the plasma procoagulant level of FVIII (FVIII:C, %), HA is classified into three clinical phenotypes: severe (FVIII:C <1%), moderate (FVIII:C 1–5%) and mild (FVIII:C >5%).
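The three-way classification above can be written as a small function. This is a minimal sketch under stated assumptions: the function name is hypothetical, a value of exactly 5% is treated as moderate, and no upper bound for the mild category is modeled because none is stated in the text.

```python
def classify_ha_severity(fviii_c_percent: float) -> str:
    """Map plasma FVIII:C (% of normal) to the clinical phenotype.

    Thresholds follow the text: severe < 1%, moderate 1-5%, mild > 5%.
    Assumption: exactly 5% counts as moderate.
    """
    if fviii_c_percent < 0:
        raise ValueError("FVIII:C cannot be negative")
    if fviii_c_percent < 1:
        return "severe"
    if fviii_c_percent <= 5:
        return "moderate"
    return "mild"


if __name__ == "__main__":
    # Illustrative values only, not patient data from the study.
    for value in (0.5, 3.0, 12.0):
        print(value, classify_ha_severity(value))
```

Because exact FVIII:C values were often unavailable in this study, the severe and moderate categories returned here would be merged into a single severe/moderate group in practice.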
HA was the first inherited disease to be controlled by replacement treatment (i.e., the infusion of blood or blood products containing FVIII), but some patients develop antibodies against therapeutic FVIII, called inhibitors, which seriously compromise the patient's prognosis. Understanding the factors that predispose a patient to such an adverse reaction is important for the management of HA [3][4][5][6].
More than 3500 different disease-causing pathogenic variants of HA have been identified and reported in international databases: Factor VIII Variant Database (f8-db.eahad.org (accessed on 25 December 2022)), Human Gene Mutation Database (www.hgmd.cf.

Patient Samples
In this study, we included patients from 273 unrelated families with different forms of HA, recruited from 1990 to 2022. They originated from different regions of Russia, and not all of them consulted with hematologists at our center. That is why clinical and laboratory data (including FVIII clotting activity and presence of inhibitors) were not available for all patients. The available patient information is provided in Table S1.
As we frequently did not have exact information about FVIII clotting activity, all severe and moderate patients were grouped together (severe/moderate patients).
We obtained material from the affected male proband in 252 families and from only the asymptomatic female carrier in 21 families.
The study was carried out according to the Principles of the Declaration of Helsinki and informed consent was obtained from all participants. Patients were considered to have HA according to the international consensus of the 2001 International Society on Thrombosis and Haemostasis (ISTH) Factor VIII and Factor IX Subcommittee [28].

DNA Collection and Extraction
Genomic DNA was isolated from EDTA-treated whole blood samples using phenol-chloroform extraction and ethanol precipitation [29]. Genomic DNA was dissolved in TE buffer and frozen until genotyping. The severe/moderate patients were examined for inv22 and inv1. Inv22 was detected by modified long-range polymerase chain reaction (LD-PCR) [30] using Promega GoTaq® Long PCR Master Mix (Promega Corporation, Madison, WI, USA). Later in the study, we began using the method described in ref. [31], because it was more reliable. Inv1 was detected using an established method [8].

Amplification of the F8 Gene
For patients without inversion of introns 1 and 22, we sequenced the entire F8 coding region, including all exons and flanking intronic regions, using primers developed in our laboratory [32]. The PCR reactions were carried out on a Tercik™ programmable thermocycler (DNK-Technology, Moscow, Russia) with PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA) using 10 pmol of each oligonucleotide primer (Syntol, Moscow, Russia) and 50–100 ng template DNA in 25 µL reaction mixture. We analyzed the obtained PCR fragments with electrophoresis in 6% polyacrylamide gel (PAAG) or in 0.75% agarose gel (for LD-PCR products), followed by staining with ethidium bromide and visualization under UV light. Amplified DNA fragments were purified using the Wizard® PCR Preps DNA Purification System (Promega Corporation, Madison, WI, USA) and subjected to direct cycle sequence analysis using the ABI PRISM® BigDye™ Terminator v.3.1 Cycle Sequencing Kit (Thermo Fisher Scientific, Waltham, MA, USA) on an ABI PRISM 3100-Avant genetic analyzer (Applied Biosystems, Foster City, CA, USA) at Genome CCU (Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia).

Large Deletions/Insertions Detection
Large F8 gene deletions were identified by consistent failure of PCR amplification of a single exon or adjacent F8 exons, as indicated by missing or altered bands upon electrophoresis of PCR products. At least three separate attempts to amplify missing fragments from the subject's genomic DNA were performed using the same primers and PCR conditions, alongside successful amplification and sequencing of the exons flanking the suggested deletion. For detection of deletion breakpoints, we carried out LD-PCR using primers for amplification of the exons framing the deletion and GoTaq® Long PCR Master Mix following the manufacturer's protocol. We were able to identify deletion breakpoints for two patients. To accomplish this, we designed primers flanking the copies of the Alu element that could be involved in the deletion formation. For patient A375, we used primers F8-delF (5′-GTTTGTTTACATTTGTCCCAACT-3′, c.787+2045_2067, intron 6) and F8-delR (5′-TGCAACTCAAAGGACTAAACA-3′, c.1903+1569_1589, intron 12). For patient A469, we used primers F8-5D [32] and Del6R (5′-CAGTTGACTCTTGAACAATACA-3′, c.787+2976_2995, intron 6).
Large duplications that could not be identified using routine PCR-based analysis methods were tested with multiplex ligation-dependent probe amplification (MLPA). MLPA was carried out using the F8 SALSA MLPA kit P178 (MRC Holland, Amsterdam, The Netherlands) according to the manufacturer's instructions. Exon dosage was calculated using Coffalyser.Net software (MRC Holland).

Results
Molecular analysis of the F8 gene in 273 unrelated patients allowed us to identify pathogenic variants in 267 patients (97.8%). In the remaining six patients (2.2%), we did not find causal variants in the F8 gene. One patient had pathogenic variants in the vWF gene, so his diagnosis was changed from HA to von Willebrand disease (vWD) type 2N [42]. The remaining five patients did not have pathogenic variants in the vWF gene.
Among the 267 patients with successfully identified F8 gene alterations, we found 101 different pathogenic variants, 35 of which had never been previously reported in international databases. A summary of the identified F8 gene defects and clinical features of patients is presented in Table S1. The distribution of different pathogenic variant types in our sample population is given in Figure 1. We found inv22 in 136 cases (50.9% of patients with genetically verified HA) and inv1 in 12 patients (4.5%). Large deletions spanning one to eight exons were found in five patients (1.9%), and we identified a large insertion in one patient (0.4%). The remaining 113 (42.3%) patients carried point variants involving either single nucleotides or several consecutive nucleotides.
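The shares quoted above follow directly from the reported counts. The short sketch below recomputes them as a consistency check; the dictionary keys and the function name are illustrative labels, not identifiers used by the authors.

```python
# Counts of patients per variant type, as reported in the text.
VARIANT_COUNTS = {
    "inv22": 136,
    "inv1": 12,
    "large deletion": 5,
    "large insertion": 1,
    "point variant": 113,
}


def variant_shares(counts: dict) -> dict:
    """Return each category's share of all genotyped patients, in percent."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


if __name__ == "__main__":
    print("patients with an identified variant:", sum(VARIANT_COUNTS.values()))
    for name, share in variant_shares(VARIANT_COUNTS).items():
        print(f"{name}: {share}%")
```

The categories are mutually exclusive, so their counts should add up to the 267 patients with an identified F8 variant.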

Inversions
We found inv22 in 136 out of 267 unrelated patients (50.9%, Table 1), the majority of whom had severe/moderate HA (122 out of 124, 98% of cases with known HA severity) and 20 of whom had developed inhibitors.
We found inv1 in 12 out of 267 patients (4.5%, Table 1). Patients with inv1 also had predominantly severe/moderate HA (11 out of 12, 91.7% of cases with known HA severity) and one patient had developed inhibitors. Notably, three patients had abnormal inv1 with additional deletions or duplications of adjacent regions; these patients were described in more detail in ref. [43] along with several similar cases from our population.

Large Deletions and Insertions
In five patients with severe/moderate HA, we found large deletions. They were initially identified by failure to amplify certain exons and then confirmed using MLPA.
All five large deletions were unique and affected different exons (Table S1). In one patient, PCR of exon 14 of the F8 gene yielded a fragment approximately 1500 bp longer than expected from the reference sequence. Sanger sequencing allowed us to identify a large insertion between nucleotides c.3117 and c.3118.
For the exon 6 deletion, we performed LD-PCR using primers flanking exons 5 and 7. As a result, we obtained a fragment of 15,000 bp instead of the normal fragment of 18,000 bp. We suggested that the deletion might be caused by homologous recombination between different copies of Alu elements. Our calculations showed that the obtained length of the PCR product could be achieved if the involved Alu element from intron 6 was AluYe5 (c.787+2221 to c.787+2484). Therefore, we specifically designed primer Del6R flanking this Alu element to use it as a reverse primer in the PCR system with the forward primer flanking exon 5. PCR yielded a fragment nearly 2000 bp in length, while the distance between the primers in the whole F8 gene was approximately 5600 bp. Sequencing of this fragment enabled us to determine the deletion breakpoints. One deletion breakpoint was located in intron 5 (c.671−1051) between two Alu elements: AluJb (c.670+656 to c.670+948) and AluSz (c.671−659 to c.671−351). The other deletion breakpoint (c.787+2580) was close to the abovementioned AluYe5 element, inside an AluSg element (c.787+2496 to c.787+2781) in intron 6 (Figure 2a,c). Using the same PCR system with primers F8-5D/Del6R, we also detected this deletion in family members of the patient. His mother and sister appeared to be carriers of this gene defect, while his cousin was not.
In the case of the ex7-12 deletion, we used LD-PCR with the exon 6 forward primer and exon 13 reverse primer (the distance between the primers in the whole F8 gene was approximately 37,000 bp) and obtained a single PCR fragment that was nearly 7500 bp in length. As in the previous case, we suggested that the deletion might involve Alu elements. We chose the same AluYe5 in intron 6 (c.787+2221 to c.787+2484) and AluY (c.1903+1051 to c.1903+1348) in intron 12, since our calculations showed that a fragment 7500 bp in length could be obtained in LD-PCR only if those copies of the Alu repeat were involved in the formation of the deletion. We specifically designed primers F8-delF and F8-delR flanking those Alu elements. Amplification of the patient's DNA using those primers yielded a fragment approximately 1300 bp in length. Sequencing of this fragment enabled us to determine the deletion breakpoints. One of the deletion breakpoints was 273 bp upstream from the AluYe5 element in intron 6 (c.787+1949) and the other breakpoint was inside the AluY element in intron 12 (position c.1903+1113) (Figure 2b,c). Using the PCR system with primers F8-delF/F8-delR, we also detected this deletion in family members of the patient, where seven out of nine women appeared to be carriers of this gene defect.
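The reasoning used to narrow down the breakpoints rests on simple arithmetic: the amount of deleted sequence is roughly the wild-type distance between the primers minus the length of the product actually obtained. The sketch below illustrates this with the approximate sizes quoted in the text. The function name and the labels are hypothetical, and because the fragment lengths were read from gels, the estimates are only approximate and need not agree exactly between PCR systems.

```python
def estimated_deletion_size(wild_type_bp: int, observed_bp: int) -> int:
    """Estimate deleted sequence as wild-type amplicon length minus observed length."""
    if observed_bp > wild_type_bp:
        raise ValueError("observed product is longer than wild type; not a deletion")
    return wild_type_bp - observed_bp


# (wild-type distance between primers, observed product), approximate values from the text
PCR_SYSTEMS = {
    "exon 6 deletion, LD-PCR across exons 5-7": (18_000, 15_000),
    "exon 6 deletion, F8-5D/Del6R": (5_600, 2_000),
    "ex7-12 deletion, LD-PCR across exons 6-13": (37_000, 7_500),
}

if __name__ == "__main__":
    for label, (wild_type, observed) in PCR_SYSTEMS.items():
        size = estimated_deletion_size(wild_type, observed)
        print(f"{label}: ~{size} bp deleted")
```

Such an estimate only indicates which repeat copies lie at a compatible distance; the exact breakpoints still have to be confirmed by sequencing the junction fragment, as was done here.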
Two out of five indels resulted from the simple substitution of two consecutive nucleotides, while three indels were more complex in nature. Among the three observed complex indels, one case was associated with triplication of a 12-nucleotide fragment, but deleted and inserted sequences did not have anything in common in the remaining two cases (Figure 3). Two indels with two consecutive nucleotide changes and 20 single nucleotide substitutions resulted in nonsense mutations. In total, they affected 26 out of 267 patients and represented 22 different pathogenic variants (9.7% of the sample population).
Five nonsense and eight frameshift variants were not previously described. The majority of patients with loss-of-function mutations had severe/moderate HA (39 out of 42 patients, 92.8% of cases with known severity). Two patients with frameshift mutations and 7 patients with nonsense mutations had developed inhibitors (Table 1).

Splicing Variants
In 10 out of 267 unrelated patients (3.7%), we revealed 10 different genetic alterations affecting splicing. Eight of the ten variants altered canonical splice-site dinucleotides (positions ±1-2 from an exon), one substitution was in the +4 position, and the last variant affected the +5 position. Three splicing variants were not previously reported. In another paper exploring Russian HA patients, the authors described splicing mutations c.6901-1G>C and c.6901-2A>C [27], which were also identified in our patients but have not been described in other countries.
All 9 patients with splicing mutations for whom FVIII:C (%) was known had severe/moderate HA, and one patient had developed inhibitors (Table 1).

Missense Mutations and Inframe Deletion
We identified 44 different missense variants in 56 out of 267 patients (20.2%), 16 of which were not previously described.
Among patients with missense variants, those with severe/moderate HA (29 out of 50 patients, 58% of cases with known severity) were slightly more prevalent. However, this was obviously a result of our sample population being skewed towards severe/moderate HA and not the characteristics of the Russian population. Two patients with missense mutations (one with severe/moderate HA and another with mild HA) had developed inhibitors (Table 1).
In our sample population, one inframe deletion (0.4%) was detected: the known pathogenic variant c.5142_5144delACG p.(Arg1715del) [44]. Although the influence of this type of genetic alteration on protein function is usually disputable, it was evaluated by all deleteriousness prediction scoring methods as pathogenic. Variant p.(Arg1715del) was identified in a patient with severe HA without data about the presence of inhibitor antibodies.

Previously Undescribed Variants Assessment
Among the studied patients, we identified 29 variants that affected one to several nucleotides and had not been previously reported (Table 2). This included 4 deletions of 1-2 nucleotides, 1 nucleotide duplication, 19 substitutions of a single nucleotide, and 5 indels. These variants resulted in frameshift (N = 8), nonsense (N = 5), splicing (N = 2), and missense (N = 14) mutations. Only one of the new variants was recurrent. According to the ACMG/AMP Variant Curation Guidelines [41], we classified 24 out of 29 variants as pathogenic or likely pathogenic, while a verdict of uncertain significance was obtained for 5 missense variants. Deleteriousness prediction software results are given in Table S2.
All large deletions and the large insertion were also previously undescribed, but their pathogenicity was undoubted.

Clinical Manifestations of Studied Patients
We had clinical data for the majority of our patients; however, we were unable to determine the severity of HA for 23 out of 267 families (8.6%) and no information about the presence of inhibitors was available for 165 cases (61.3%). Most families had the severe/moderate form of HA: 217 cases (81.3% of all 267 HA patients, 88.9% of all 244 patients with known severity). This made our sample population strongly skewed towards severe/moderate HA, which is sometimes observed in studies [9,44], although the proportions of severe/moderate/mild HA in the population are believed to be approximately 50%/10%/40% [45]. Indeed, in broader studies, especially those based on national registers, these expected proportions are often met [16,[46][47][48]. The prevalence of more clinically prominent cases in our data can be explained by the specifics of patient recruitment. Patients with severe HA experience more everyday inconveniences than those with milder forms of HA, and therefore are more interested in genetic diagnostics, including female relative carrier detection.
The prevalence of severe forms of HA in our sample population was concordant with the observed proportions of inv1 and inv22 that jointly covered approximately 55% of the sample [7,8].

Patients without Genetic Variants in F8 Gene
We were unable to find the genetic cause of the clinical presentation in five cases out of 273 (1.8%). For those patients, we excluded common inversions (inv1 and inv22), nucleotide substitutions in all F8 gene exons and adjacent intronic regions, large deletions, and insertions. Our methods did not include mRNA analysis and whole gene sequencing, so we cannot exclude deep intronic variants leading to alternative splicing, pathogenic variants in regulatory regions, or unique inversions that could not be detected by Sanger sequencing and MLPA.
It is also worth noting that although deep intronic mutations are sometimes identified in patients with severe or moderate HA [16,52,57], they are generally found in patients with the milder form of HA [51,53,55,57]. In contrast, in this study, four out of five patients without identified causative variants had severe HA (FVIII:C < 1%).

Patients with Two Genetic Variants
Two unrelated patients (A429, A384) simultaneously had two variants in the F8 gene. The DNA of probands' mothers was available, so we verified that these variants did not appear de novo.
In patient A429, we identified two variants that have never been previously described: a single nucleotide substitution leading to the missense amino acid change p.(Pro1265Leu) and a substitution of two consecutive nucleotides (classified as indel according to the HGVS guidelines) leading to the nonsense amino acid change p.(Met1363Ile*). The missense variant was located 98 codons before the premature termination codon. However, the p.(Met1363Ile*) variant clearly had a predominant influence on the patient's phenotype, so it is impossible to say whether the p.(Pro1265Leu) variant was pathogenic or benign. Moreover, this missense variant was evaluated by most of the used deleteriousness prediction scoring methods as benign.
In patient A384, we identified two substitutions, both leading to missense amino acid changes: p.(Arg301Cys), reported in the EAHAD database as pathogenic, and p.(His336Arg), which was new. Both missense variants were predicted to be pathogenic according to the PolyPhen-2, MutationTaster, SIFT, and CADD scoring programs, but VarSite and Missense3D yielded different results. The p.(Arg301Cys) variant was evaluated as likely damaging: the change in amino acid sidechain size is large, the Arg>Cys substitution is very highly unfavored, a buried charged amino acid is replaced with an uncharged one, and the p.301 position is highly conserved. The p.(His336Arg) variant was better tolerated: the change in amino acid sidechain size is not large, the His>Arg substitution is neutral and without any structural consequences, and the p.336 position is fairly conserved. According to ACMG/AMP guidelines, p.(His336Arg) was classified as a variant of uncertain significance (Table 1). To sum up, all predictions led us to suggest that the patient's phenotype was influenced by p.(Arg301Cys), while p.(His336Arg) seemed likely to be benign.
Therefore, despite the fact that we identified two patients with more than one genetic alteration in the F8 gene, we cannot confirm with certainty that both variants in both cases were pathogenic.
Among our patients with SNVs, we identified 12 recurrent variants found in 2-7 patients apiece and 78 unique variants found in a single patient each. Deletions, duplications, and substitutions were represented among recurrent as well as among unique variants. All indels were unique. Deletions and duplications were slightly more represented among recurrent variants than among unique variants: 25% (3 out of 12) and 17% (13 out of 78), respectively.
The recurrent variants were 9 single nucleotide substitutions (1 new), two known deletions, and one known duplication. We compared haplotypes of patients with recurring variants using polymorphisms in the F8 gene and were able to prove a founder effect for only one substitution [32], while the remaining cases were independent occurrences.
Seven out of nine recurring nucleotide substitutions were CpG substitutions, all of which were previously described. Another two recurring substitutions (1 new) were not located at CpG sites, both of which were found only in the Russian population. Fourteen out of 65 unique substitutions were in CpG sites, all of which were previously described. The remaining 41 unique substitutions (16 new) were not in CpG sites.
It is widely believed that CpG sites are hotspots for mutagenesis and that recurrent mutations usually occur in these locations, but in fact, not all CpG sites are equally mutagenic, which is also true for the F8 gene (Table S3). Notably, the substitutions in CpG sites identified in our population fell mostly into the intermediate activity group rather than the most active CpG sites (according to the Factor VIII Variant Database; Table S3). These results suggest that our HA population has a unique pattern of CpG mutagenesis.

Large Deletions
There are three main types of molecular mechanisms of genomic rearrangements: replication-based mechanisms (RBMs), non-allelic homologous recombination (NAHR), and non-homologous non-replicative DNA repair mechanisms. RBMs result from replication slippage and template switching during DNA replication and produce mostly microhomology at breakpoint junctions. NAHR between two genomic regions with high sequence homology (>99%) results in extensive sequence homology at breakpoint junctions. Examples of NAHR are the most prevalent mutations in HA: inv22 and inv1. There are two types of non-homologous non-replicative DNA repair mechanisms: non-homologous end joining (NHEJ), which results mostly in blunt ends, sometimes with a short insertion of random nucleotides at breakpoint junctions, and alternative end-joining (Alt-EJ), which can generate short microhomologies. Large deletions in the F8 gene usually appear as a result of NHEJ [60].
In our case, both large deletions with identified breakpoints resulted from NHEJ involving Alu repeats, but in both cases, only one of the breakpoints was exactly in an Alu repeat, while the other breakpoint was outside it (Figure 2c). Interestingly, in both cases, one of the breakpoints was located in intron 6, which contains two Alu repeats: AluSg and AluYe5. This genomic region appeared to be a DNA breakage hotspot that was previously noted by other researchers [60].

De Novo Origin Evaluation
The probability of de novo origin of the identified variants in probands was evaluated on the basis of family HA history and direct verification of the identified variants in female relatives.
In the case of a family history of HA, de novo origin of the genetic variants in our patients was excluded. The presence of family HA history was verified if family members were able to identify another HA patient in the same or earlier generations. In all other cases, family history was classified as "no data" because knowledge about relatives beyond the third generation in our country is frequently absent due to the prevalence of nuclear families.
Another way to clarify the status of mutation was to test for the presence of the mutation in female carriers of the same or ascending generations (e.g., mother or sister of proband). If the mutation was identified in these relatives, then de novo origin was excluded. If the mutation in the same or ascending generations was absent, then de novo origin was confirmed. If only descending female relatives (e.g., daughter of proband) or no female carriers were available, then the mutation origin in family was deemed undeterminable.
We were able to assess de novo origin of the pathogenic variants in 159 out of 267 apparently unrelated patients. In 81 families, de novo origin was excluded due to family history, and it was ruled out in 73 families because of verified mutation carrier status in female relatives. We confirmed de novo origin in only 5 cases, as patients' mothers did not carry the F8 gene variants identified in the proband. This gives a proportion of de novo pathogenic variants in HA of approximately 3.1% (5 out of 159). The five confirmed de novo cases involved four single nucleotide variants (two nonsense and two missense amino acid changes) and one complex indel.
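The decision rules described in this section can be summarized in a few lines. This is a minimal sketch with hypothetical function and parameter names; it simplifies the procedure to two inputs (documented family history, and the result of testing a female relative of the same or an ascending generation) and then recomputes the de novo proportion from the counts given above.

```python
from typing import Optional


def classify_origin(family_history: bool, relative_is_carrier: Optional[bool]) -> str:
    """Classify variant origin following the rules described in the text.

    family_history: another HA patient is known in the same or earlier generations.
    relative_is_carrier: result of testing a mother or sister of the proband;
        None when only descending relatives or no female relatives were available.
    """
    if family_history:
        return "inherited"
    if relative_is_carrier is None:
        return "undeterminable"
    return "inherited" if relative_is_carrier else "de novo"


# Counts reported in the text.
EXCLUDED_BY_HISTORY = 81
EXCLUDED_BY_TESTING = 73
CONFIRMED_DE_NOVO = 5
ASSESSED = EXCLUDED_BY_HISTORY + EXCLUDED_BY_TESTING + CONFIRMED_DE_NOVO

if __name__ == "__main__":
    share = 100 * CONFIRMED_DE_NOVO / ASSESSED
    print(f"assessed families: {ASSESSED}")
    print(f"de novo proportion: {share:.1f}%")
```

Families classified as undeterminable are left out of the denominator, which is why the proportion refers to the assessed families rather than to all 267.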

Inhibitor Development
The numbers of patients with different pathogenic variant types, including their clinical presentation and inhibitor history, are summarized in Table 2. We had data about inhibitor status for 102 patients, 37 of whom had developed inhibitors (36.3%). Thirty-six patients with inhibitors had severe/moderate HA. This prevalence was concordant with the literature data [16,61,62].
Integration of data obtained by different research groups led to the following distribution of the types of mutations in the F8 gene in relation to the risk of inhibitor development:
1. High risk of inhibitor development: large deletions (several exons), nonsense mutations in the light chain (A3C1C2 domains).
Overall, our results corresponded with published data [3][4][5]63]. Patients with large deletions were in the high-risk group (4 out of 5, or 80%, had developed inhibitors). According to the classification, nonsense mutations are present in the high- and moderate-risk groups, which held true for our sample population. Additionally, as in the literature data, it seems that nonsense mutations in the heavy chain of the FVIII protein were rarely associated with the development of inhibitors (5 out of 7, or 71%, of patients without inhibitors had nonsense mutations in this chain), compared to nonsense mutations in the light chain. Patients with inv22 and inv1 were in the intermediate-risk group. The frequency of inhibitor development in this group did not differ significantly from the overall frequency in the HA population. Missense mutations occurred in the low-risk group, as only 7.7% (2 out of 26) of patients with this mutation type had developed inhibitors. As for the influence of frameshift and splicing mutations on inhibitor development, this could not be evaluated owing to the lack of inhibitor status information for those groups.

Conclusions
We report herein the largest genetic analysis of HA patients conducted in Russia. Although the overall mutation spectrum of the F8 gene in the Russian population reflected the tendencies revealed in other populations, some interesting characteristics were also found, such as a different activity pattern of CpG sites, a mutation with a founder effect, and splicing mutations that have been described twice in the Russian population but nowhere else. These data could improve the genetic diagnostics for female carriers and prenatal testing of this disease in our country. Russia has a vast and complex geographic distribution and history, so this study will undoubtedly need to be continued in order to increase the coverage of HA patients.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/genes14020260/s1, Table S1: Patients included in the study and the identified pathogenic variants; Table S2: Results of pathogenicity assessment for the previously undescribed nucleotide variants; Table S3

Institutional Review Board Statement:
The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of National Medical Research Center for Hematology (protocol code 167, 09.11.2022).

Informed Consent Statement:
Written informed consent was obtained from the patients to publish this paper.