Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience

Glotov, Oleg S.; Chernov, Alexander N.; Glotov, Andrey S.

doi:10.3390/jpm13081236

Open AccessReview

Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience

by

Oleg S. Glotov

^1,2,*

,

Alexander N. Chernov

^1,3,* and

Andrey S. Glotov

¹

Department of Genomic Medicine, D. O. Ott Research Institute of Obstetrics, Gynecology and Reproductology, 199034 St. Petersburg, Russia

²

Department of Experimental Medical Virology, Molecular Genetics and Biobanking of Pediatric Research and Clinical Center for Infectious Diseases, 197022 St. Petersburg, Russia

³

Department of General Pathology and Pathological Physiology, Institute of Experimental Medicine, 197376 St. Petersburg, Russia

^*

Authors to whom correspondence should be addressed.

J. Pers. Med. 2023, 13(8), 1236; https://doi.org/10.3390/jpm13081236

Submission received: 28 June 2023 / Revised: 25 July 2023 / Accepted: 26 July 2023 / Published: 8 August 2023

(This article belongs to the Section Methodology, Drug and Device Discovery)

Download

Browse Figures

Review Reports Versions Notes

Abstract

Today, whole-exome sequencing (WES) is used to conduct the massive screening of structural and regulatory genes in order to identify the allele frequencies of disease-associated polymorphisms in various populations and thus detect pathogenic genetic changes (mutations or polymorphisms) conducive to malfunctional protein sequences. With its extensive capabilities, exome sequencing today allows both the diagnosis of monogenic diseases (MDs) and the examination of seemingly healthy populations to reveal a wide range of potential risks prior to disease manifestation (in the future, exome sequencing may outpace costly and less informative genome sequencing to become the first-line examination technique). This review establishes the human genetic passport as a new WES-based clinical concept for the identification of new candidate genes, gene variants, and molecular mechanisms in the diagnosis, prediction, and treatment of monogenic, oligogenic, and multifactorial diseases. Various diseases are addressed to demonstrate the extensive potential of WES and consider its advantages as well as disadvantages. Thus, WES can become a general test with a broad spectrum pf applications, including opportunistic screening.

Keywords:

whole-exome sequencing; personalized medicine; human monogenic diseases; oligogenic and multifactorial diseases

1. Introduction

In recent decades, scientific and technological advances in biology and medicine have produced new high-tech methods for early diagnostics and paved the way to the identification of genetic markers and the introduction of new screening strategies in clinical practice. This has enabled clinicians to precisely identify the causes of rare MDs, improve prevention, and boost the efficiency of treatment for multifactorial socially significant diseases, ultimately contributing to better health quality and life expectancy among the populations of economically developed countries [1]. All of these achievements have given an impetus to a paradigm shift in the overall healthcare system and enabled the transition from group-based to predictive or preventive personalized medicine (PM) and therapy that rely on a disease’s clinical diagnosis and stage, a patient’s gender and age, as well as the individual molecular genetic biomarker profiles associated with pathology development, prognosis, outcome, and treatment efficacy. Advances in genetics and information technology allowed for the emergence of new disciplines, such as genomics (proteomics, metabolomics, transcriptomics, and pharmacogenomics), and shaped the set of standard criteria for large dataset processing using bioinformatics, high-throughput techniques, and new-generation DNA sequencing (NGS) to study gene structure [1].

The identification of single-nucleotide variants (SNVs, SNPs) raised further challenges in annotating pathological variants and associating them with diseases. The HapMap and 1000 Genomes projects data were used to develop a methodology for genome-wide association studies (GWASs) [2]. GWASs allowed for the characterization of population frequencies of multiple SNVs/SNPs associated with multifactorial diseases (MFDs), such as type 1 diabetes (T1D) [3] and type 2 diabetes (T2D) [4].

In 2009, Shendure’s research group pioneered the use of whole-exome NGS to detect genetic aberrations and eventually discovered Miller–Fisher syndrome, a rare recessive inflammatory (autoimmune) demyelinating polyradiculoneuropathy [5]. Since 2012, it has become clear that as soon as we understand the type of inheritance, disease pathogenesis, and SNV/SNP population frequencies, the annotation and clinical interpretation of NGS-identified pathogenic gene variants are no longer impossible [6]. Notably, the contemporary identification of genes and their variants associated with a particular disease (condition) enables us to study their pathogenetic role.

NGS applications in medical research are versatile. Via their intended use, high-throughput sequencing techniques can be split into the following groups: (1) analysis of entire genome (whole-genome sequencing, WGS); (2) analysis of protein-coding genes in a genome (whole-exome sequencing, WES); (3) analysis of particular disease-causing gene sequences (from clinical exomes embracing some 4000–5000 clinically relevant genes, to kits for small target regions of one–three genes or loci); (4) transcriptome sequencing (RNA-seq); and (5) analysis of bacterial microbiome biological diversity, Table 1 [1,7].

Considering that the purpose is to limit the examination to protein-coding nucleotide sequences to identify rare pathological SNPs, insertions, deletions that may underlie a disease, or the identification of new genetic markers of oligogenic diseases and MFDs, WES is a more cost-efficient technique than WGS [8]. Versatile investigations show that the performance of exome tests is only 2% inferior than that of WGS tests, which detect 44 to 50% of all pursued mutations [9,10]. This is largely conditioned by the fact that exome analyses utilize longer fragments and more advanced probe designs [11]. WES occupies a special place among NGS tests. Considering its relatively low cost, WES is currently more attractive in clinical settings. Furthermore, WES allows for the significant minimization of the size of the analyzed database to 6 GB, in contrast to WGS (90 GB) [12]. For example, WES is 10-fold more efficient in diagnosing MODY mutations [13]. In addition, WES is becoming more and more famous, thanks to various tools for its analysis. For instance, in combination with a unique bioinformatic approach, this method enabled us to identify over 10 new markers for T2D [14,15,16].

Today, WES is becoming a first-line examination procedure for sophisticated investigatory as well as practical purposes. WES/WGS allows for the identification of new unknown or rare variants that are associated with diseases or very rare pathologies, or are present in some populations, but have been overlooked in differential diagnoses so far without being considered to cause disease. Sherri Bale et al. utilized WES to diagnose disorders of the central nervous system (31%, N = 1082), multiple congenital anomalies (36%, N = 729), and cardiovascular (28%, N = 54), skeletal (39%, N = 54), musculoskeletal (40%, N = 43), vision (47%, N = 60), skin (32%, N = 31), and hearing (55%, N = 11) abnormalities in 3040 patients, with a WES total yield of 28.8%. These included 2091 cases, where 6.2% (N = 129) had pathogenic variants [17]. Using WES, Suyun Qian et al. diagnosed metabolic, neuromuscular, and multiple deformities (MDs) in 25% (43 of 169) of critically ill children [18]. In people with severe developmental delays, microcephaly, seizures, dysmorphic facial features, and poor muscle mass, 29,860 variants in 19,160 genes have been studied via the use of WES. A pathogenic variant was detected in the ASXL transcriptional regulator-3 (ASXL3) gene, which causes Bainbridge–Ropers syndrome [19].

A meta-analysis published in 2023, including 50,417 probands, provides similar diagnostic rates for WES and WGS and shows the greater clinical utility of WGS compared to WES. With the recent downward trend in the cost of VUS found in noncoding regions of the genome and WGS, it is expected that WGS will be widely used in clinical research [20].

However, not all populations have been well studied so far. Recent years have seen the publication of numerous studies demonstrating a variety of population characteristics for allele frequencies of functionally significant polymorphisms. It is therefore apparent that each population has its own spectrum of SNPs. It is thus critical to investigate the population frequencies of functionally significant gene alleles in early stages of research. The importance of population studies is explained through frequency differences in functional polymorphic gene variants across different populations, which may depend on geographical conditions, regions of residence, dietary features, race and ethnicity, and many other factors [21,22]. Therefore, there is little doubt that, in different populations or ethnic groups, the way that identical genetic polymorphisms affect the etiology and pathogenesis of a particular disease is never the same [22,23,24].

WES, like any other technology, shows a set of drawbacks due its technical characteristics. Thus, the processing and analysis of a large amount of WES data becomes a bottleneck, making it difficult to differentiate small mutations from random errors that occur during sequencing [25]. In addition, the main disadvantage of WES is the uneven coverage of DNA code reads over target genes, which results in many low-coverage regions that prevent the accurate annotation and interpretation of variants [26]. WES data may include inconsistencies, such as outliers and anomalies, or inconsistent speeds at which data are loaded onto the repository, GC bias, the association of variants with biological traits, and phenotype. The interpretation of sequencing results for clinical diagnosis is another limitation [8]. Therefore, all of this requires the clear systematization of existing knowledge to establish the concept of the application of WES in clinical practice.

The purpose of our review is to study the risk factors for socially significant diseases based on WES results and to develop methods allowing for the identification of clinically relevant gene variants in order to assess the risk of monogenic, oligogenic, and multifactorial pathologies.

2. Human Monogenic Diseases: Population Genetics Research

2.1. Human Monogenic Diseases

The OMIM database (as of 1 June 2023) includes entries for 7377 hereditary diseases and syndromes, as well as their molecular associations [27]. These include 6305 phenotypes associated with one single gene, i.e., showing the monogenic nature of a genetic trait or syndrome. This was largely achieved due to the active implementation of WES and the exome consortium [28,29,30].

2.2. Population Genetic Researches for Monogenic Diseases

The genetic structure of human populations has been extensively studied worldwide. Nonreference (i.e., non-wild type) allele frequency in s particular population is a most important factor influencing the clinical interpretation of a genetic variant. Genetic variability in many regions of the world is poorly understood despite the very large number of variants (125,748) in the genome aggregation database (gnomAD, version v. 2.1). Wenhao Zhou et al. analyzed the prevalence of cystic fibrosis (CF) using 30,951 WES (20,909 pediatric and 10,042 parent) samples and compared these with those of Caucasians [31]. After filtration, 477 variants of the cystic fibrosis transmembrane regulator (CFTR) gene were left, and 53 variants were annotated as pathogenic/probably pathogenic (P/LP). The authors used the annotated variants to evaluate the prevalence of CF in China to be 1/128,434. Only 39.6% (21/53) of the variants were used to screen for CF in Caucasians, producing underestimated values for the prevalence of CF in China among children (1/143,171 vs. 1/1,387,395, p = 5 × 10⁻²⁴) and an adult population (1/110,127 versus 1/872,437, p = 7 × 10⁻¹⁰). The allele frequencies of six (L88X, M469V, G622D, G970D, D979A, and 1898+5G->T) pathogenic variants were higher in a Chinese population compared with a gnomAD non-Finland European population (all p < 0.1). Using haplotype analysis, the researchers showed greater diversity in haplotypes in a Chinese population compared to Caucasians. The founder mutations of the Chinese and Caucasians were G970D and F508del, with two SNPs (rs213950–rs1042077) identified as related genotypes in an exon region.

Our investigations did not identify prevalent pathogenic SNPs missing from ClinVar or dbSNP in autosomal recessive disease-causing genes. This indicates that the majority of disease alleles are common for Russian and European populations, at least for disorders with recessive inheritance patterns. These results allowed us to suggest preliminary estimates for the prevalence of monogenic disorders, based on the identified exome variants for the region (Table 2).

Although the small sample size does not allow us to reliably determine the extent of discordance, our results for CF and phenylketonuria are consistent with estimates for these genes [32]. Remarkably, our findings show that, in Northwest Russia, Stargardt disease is more prevalent than cystic fibrosis, as has been the belief [32].

The research also looks into pathogenic variants for a number of human diseases. The results are formulated in Table 1, showing the diseases with the highest prevalence: Stargardt disease caused by mutations in the ABCA4 (MIM#601691) gene, which has also been previously reported [33]. Our results are concordant with earlier large-scale research into the incidences of pathogenic alleles associated with cystic fibrosis in a non-Finnish European population [42]. Our estimates of CF, phenylketonuria, and galactosemia prevalence were concordant with those of other genetic studies [32,43].

Thus, our results indicate the need to create genetic population databases for the interpretation of variants and the identification of disease risk factors.

2.3. WES Application to Identify New Variants in the Genomes of Patients

WES allows for the identification of new gene variants in patients with MDs. Doctors Daniel Trujillano, Rami Abou Jamra, et al., using WES, sequenced 2819 samples of 1000 patients from 54 countries with a wide phenotypic spectrum. Overall, they determined 320 pathogenic (P) or likely pathogenic (LP) and 303 unique variants from 1000 patients undergoing clinical WES, 307 (30.7%) of which had a positive gene finding. In addition, other findings included ethylmalonic encephalopathy (ETHE1), Niemann–Pick disease type C2 (NPC2), Temtamy syndrome, pyruvate dehydrogenase E1-alpha deficiency (PDHA1), galactosemia (GALT), propionic acidemia (PCCA), homocystinuria (CBS), CF, long QT syndrome, and polycystic kidney disease. This justifies the idea that highly heterogeneous pathologies can be effectively detected using WES. Among other findings, new genes were detected, such as non-receptor protein tyrosine phosphatase type 23 (PTPN23) associated with brain developmental delay and atrophy, potassium channel tetramerization domain containing 3 (KCTD3) causing severe intellectual disability and seizures, alpha three subunit of sodium voltage-gated channel (SCN3A) associated with autosomal dominant encephalopathy, protoporphyrinogen oxidase (PPOX) causing variegate porphyria and developmental delay, and FERM and PDZ domain-containing 4 protein (FRMPD4) implicated in X-linked intellectual disability as well as recessive Dravet syndrome. The total WES diagnostic rate stands at 31% [44]. In another study, Joanne Trinh et al. sequenced 26,119 exome samples from 4351 patients with neurodevelopmental disorders (NDDs), such as global developmental and motor delay, macrocephaly, microcephaly, seizures, and delayed speech and language development. Researchers determined 65 rare variants in 14 genes. The 14 detected variants were classified as P or LP and included cyclin dependent kinase 13 (CDK13), chromodomain helicase DNA binding protein 4 (CHD4), potassium voltage-gated channel subfamily Q member 3 (KCNQ3), lysine methyltransferase 5B (KMT5B), transcription factor 20 (TCF20), and C2H2-type zinc finger protein (ZBTB18). The 51 remaining variants (78%) belonged to the VUS category. Two of the patients had multiple molecular diagnoses, including P/LP variants in forkhead box G1 transcription factor (FOXG1), CDK13, and the transmembrane protein 237 (TMEM237) and KMT5B genes. The total WES diagnostic rate was 31% [45]. Zhang Q et al. sequenced 1360 patients to identify 604 genetic pathologies associated with 150 genetic syndromes, 510 genes, and 718 variants. In this cohort, the overall WES positive identification rate for disease-related gene alteration was 44.41%. Investigators detected growth abnormalities in 49.37% (118/239), seizures in 44.54% (102/229), autism spectrum disorder in 32.76% (38/116), global developmental delay in 54.84% (51/93), motor deterioration in 48.06% (99/206), abnormalities of the respiratory system in 40.61% (67/165), cerebral palsy in 41.26% (59/143), and abnormalities of the head or neck in 55.52% (161/290), the skin in 53.70 (58/108), the endocrine system in 49.78 (112/225), hearing or vision in 58.51% (55/94), the skeletal system in 53.95% (116/215), and the cardiovascular system in 43.20% (54/125) of samples [46].

WES allows for the identification of new, very different variants in various populations. WES enabled us to identify new variants in the low-density lipoprotein receptor (LDLR) gene in 59 Russian patients with a history of familial hypercholesterolemia (FH) [47]. FH results from genetic variants in the LDLR, apolipoprotein B (APOB), and subtilisin/kexin proprotein convertase type 9 (PCSK9) genes [48]. FH-associated variants were determined in 25 children and 18 adults, showing mutation detection rates of 89 and 58% for the children and adults, respectively. In the adults, 13 patients had variants in the LDLR gene, 3 patients had APOB variants, and 2 had ATP-binding cassette transporter 5 (ABCG5)/G8 mutations. Twenty-one children had FH-associated variants in the LDLR gene; see Table 2. Our study identified seven novel pathogenic or likely pathogenic LDLR variants (Table 3). Among them, four missense variants were located in the protein coding regions, and two were frameshift mutations responsible for the production of truncated proteins. These mutations were only reported in one patient, whereas an intron 6 splicing variant (c.940+1_c.940+4delGTGA) was detected in four unrelated individuals. Variant p.Gly592Glu in the LDLR gene was identified in six (10%) Russian patients and may presumably constitute the main FH variant in the Russian population.

FH is a common, underdiagnosed, and untreated genetic disease worldwide [50]. Therefore, WES sequencing data can be used to detect new candidate genes.

Current sequencing methods allow for the detection of a bundle of hereditary diseases in an individual, thus gaining unprecedented significance. Such cases are not as rare as they may seem. For instance, we would like to refer to a case of the coinheritance of X-linked and dominant forms of ichthyosis [51]. This information may be valuable for genetic counseling because of similar clinical symptoms. It is therefore necessary to analyze both steroid sulfatase (STS) and filaggrin (FLG) genes to exclude combined forms of ichthyosis. Notably, NGS allows us to identify P or LP SNPs in genes that were earlier believed to possess mutations of a single type [52].

For a set of disorders, adequate therapy is the most critical outcome of NGS examination. In male probands with delayed growth and bone age, intellectual impairment, skeletal and facial features, and partial responses to hormone treatment, we identified a c.7466C>G (p.Ser2489*) heterozygous pathogenic mutation in the last exon of the SRCAP (Snf2 related CREBBP activator protein) gene, thus suggesting a new model of floating harbor syndrome (FHS) pathogenesis. These genetic mutations have dominant-negative effects that explain the limited efficacy of growth hormone treatment in FHS [53].

2.4. General Strategy and Algorithm of WES Implementation in Human Genetic Pathology Diagnostics

WES provides a robust technique for MD diagnosis in humans. Yingchao Liu et al. utilized WES to study 169 children with critical disorders (median age = 10.5 months) and MDs [18]. Monogenic disorders were diagnosed in 43 (25%) patients. Pathologies with the highest incidences included metabolic (33%) and neuromuscular (19%) diseases, as well as multiple deformities (14%). The efficacy of diagnoses in children with metabolic disorders, growth impairment, or ocular abnormalities improved once thorough clinical data were available. WES data enabled adjustments in 30 (70%) cases, including disease monitoring initiation in 41.9% (18 cases), rehabilitation and palliative care in 27.9% (12 cases), the modification of ongoing treatment in 25.6% (11 cases), other comprehensive evaluation procedures in 7% (3 cases), and family intervention in 4.7% (2 cases).

Tasja Scholz et al. studied the diagnostic efficacy of WES for MDs to identify phenotypes in 61 infants with critical idiopathic disorders [54]. Investigators performed one single WES, two duo-WES, and fifty-nine trio-WES. The overall diagnostic rate was 46% (28/61) and 50% (15/30) in neonate subgroups. The yielded data showed that WES is a noninvasive diagnostic tool with a high rate of MD identification in neonates and infants. Thus, the evidence justifies the application of WES as a first-line examination for preconception genetic diagnosis and in idiopathic disorders in probands with a “blurred” phenotype.

To ensure efficiency (see Table 4), the following cost-effective strategy is suggested for the genetic diagnosis of MODY, WD, and other MDs associated with major mutations. We also show additional benefits of the application of WES in disease diagnosis.

The obtained data are concordant with the global assumptions (7.7 × 10⁻⁶) [31].

It should be noted that NGS does not always suffice to formulate a diagnosis; hence, in some cases, concurrent or subsequent Sanger sequencing is required to detect the other pathogenic variant. In patients with a blurred clinical picture, differential diagnosis with WES is necessary to identify the root cause of a disease. For example, NGS was used to analyze the hotspot region in the RNA processing endoribonuclease (RMRP) gene promoter in a proband with extremely rare autosomal recessive skeletal chondrodysplasia (anauxetic dysplasia, AD). Heterozygous rs387906533 (n.91_92delinsGC) variants of the nucleotide sequence (chr9:35657924-35657925delCTinsGC) were detected in exon 1 of the RMRP gene and an unknown n.–6_–5insTCTCAGCTTCAC substitution (chr9:g.35658020 35658021insTCTCAGCTTCAC) in the gene promoter region; see Figure 1. The variant is a 12-nucleotide insertion between the TATA box and the transcription start site [61].

It was found that the n.–6_–5insTCTCTCAGCTTCAC mutation was of paternal origin and the n.91_92delinsGC mutation was of maternal origin. No prior evidence has ever been reported regarding the insertion in the RMRP gene promoter region as a cause of AD with no extraskeletal manifestations (typical of carriers of similar mutations) [61].

3. New-Generation Sequencing, Phenotypic Screening, Oligogenic and Multifactorial Diseases

Hereditary diseases caused by pathogenic variants in a few genes are much more prevalent—i.e., the so-called oligogenic hereditary diseases [63]. Oligogenic diseases are an interim condition between MDs, associated with one specific defected gene, and polygenic diseases, caused by several genes and exogenic factors.

3.1. Oligogenic Etiology of Cardiomyopathies

Understanding the disease origin is therefore critical. Whether the disorder is mono- or oligogenic, or multifactorial, the answer is often far from evident. WES is a promise to answer this question. Pak-Chung Sham et al. utilized WES to identify six new P or LP variants in 40 patients with hypertrophic (HCM, n = 14) and dilated cardiomyopathy (DCM, n = 26) [63]. Hypertrophic cardiomyopathy caused by gene variants coding sarcomeric proteins—β-myosin heavy chain (MYH7) and myosin binding protein C (MYBPC3)—account for up to 50% of all clinical cases: myosin light chain 2 (MYL2), myosin light chain 3 (MYL3), and cardiac troponin T (TNNT2) in 5–10%, cardiac troponin I (TNNI3) in 5%, cardiac troponin C (TNNC1) in < 1%, cardiac α-actin (ACTC1) in < 1%, α-tropomyosin (TPM1) in 1.5%, and cysteine and glycine rich protein 3 (CSRP3) [64]. Authors established that frameshift (11:47372858, c.A224insG+) mutations in the MYBPC3 gene and missense (rs193922390, c.5135 G>A, p.R1712Q) mutations in the MYH7 gene were pathogenic variants for HCM. Missense variants, such as rs138049878 (c.2608 C>T, p.R870C), rs727503260 (c.2302 G>C, p.G768R), and rs397516088 (c.1063 G>A, p.A355T) in the MYH7 gene, and rs199476306 (c.188 C>T, p.A63V) in the TPM1 gene, were rated as likely pathogenic for HCM. The diagnostic WES yield for HCM produced 43% (six variants from fourteen patients). The missense variant rs121964856 (c.260 G>A, p.R87Q) in the TNNT2 gene was evaluated as pathogenic for DCM. The frameshift 2:179423322 (c.A60242del, p.S20082) and splicing 2:179549632 (c.13859) variants in the TTN gene were likely pathogenic for DCM. The diagnostic WES yield for DCM produced 12% (three variants from twenty-six patients).

Our study provides a follow-up to the research carried out by fellow investigators, with the aim of shedding light on the mode of inheritance of cardiomyopathies and to identify potential risk markers of the disease. The study subjects therefore include patients and conditionally healthy donors with different (favorable or unfavorable) histories [65,66,67]. According to the SNP SIFT analysis, substitutions in the TNNT2 gene were the most remarkable variants (Table 5).

We therefore suggest that HCM is a polygenic, rather than a monogenic, disease.

3.2. Monogenic Diabetes Mellitus

Monogenic maturity onset diabetes of the young (MODY) is yet another disorder with a multifactorial etiology, which constitutes 1–6% of diabetes mellitus (DM) cases in children and adolescents [69]. To date, 13 types of MODY are known, associated with the expression of 13 genes causing moderate or manifest hyperglycemia [70]. Due to the wide variety of clinical forms induced by numerous MODY-associated gene variants, various treatment approaches are used, from diet and physical activity to insulin therapy.

Zanchao Liu et al. applied WES and Sanger sequencing to analyze genetic markers in 200 MODY patients from Northern China [71]. The researchers found a rare rs535471991 (c. T1895G) mutation in the phosphorylated domain of the forkhead box M1 transcription factor (FOXM1) gene, which impairs the functionality and proliferation of pancreatic β-cells. These results suggest that the FOXM1 gene is involved in the pathogenesis of MODY. These findings can pave the way towards future target therapy for MODY patients.

Philippe Froguel et al. performed WES for the identification of mutations in three affected and one healthy relative in the MODY-X family, compared to 406 controls [72]. In total, 324 variants were detected in the study family and controls. It was c.679G>A (p.Glu227Lys) substitution in potassium inwardly rectifying channel subfamily J member 11 (KCNJ11), however, that was responsible for MODY-X in the study family (LOD score of 3.68).

Today, WES enables investigators to identify pathogenic variants in other non-glucokinase (GCK) genes [13]. Three patients were identified as having different variants of target genes. Patient 1 had a GCK in-frame deletion that was associated with a hepatocyte nuclear factor-1 alpha (HNF1A) missense mutation (patient #226). Patient 2 had two missense substitutions in GCK and Src family tyrosine kinase proto-oncogene (BLK, patient #529). Patient 3 (#662) reported splicing in GCK and a missense mutation in BLK in the presence of wolframin ER transmembrane glycoprotein (WFS1) variants. Clinical symptoms in patient #226 were more typical for MODY2. Both patients #529 and #662 had clinical manifestations of GCK-MODY. Our data show the absence of severe pathogenic effects caused by detected non-GCK variants [13]. Figure 2 shows the spectrum of P variants associated with neonatal diabetes mellitus (NDM) and MODY.

Pathogenic variants in the GCK gene can be associated with different clinical disease manifestations [13].

Monogenic diabetes comprises an amount of non-MODY transient forms associated with 20 genes [73]. Neonatal diabetes can be inherited in a dominant or recessive manner and manifests as various syndromes [74]. Hyperglycemia is often diagnosed prior to the manifestation of other symptoms due to the extremely early onset of diabetes. The treatment approaches for neonatal diabetes without MODY are based on the presence of a specific genetic mutation that causes the diabetic phenotype. Our data regarding pathogenic variants in a single gene and their link with various diseases or DM manifestations (MODY and NDM) emphasize the importance and cautious implementation of “penetrance” and “expressiveness” as concepts describing such genetic syndromes in clinical practice. Nosology varieties could also originate from different variants of transformed function in the same genes, exacerbated by environmental factors.

3.3. Multifactorial Diseases (MFDs): Type 2 Diabetes Mellitus

Multifactorial diseases (MFDs) include almost all most prevalent chronic disorders, i.e., atherosclerosis, diabetes, obesity, bronchial asthma, osteoporosis, endometriosis, malignant tumors, and neuropsychiatric as well as cardiovascular diseases, arising from the interaction of many genes with adverse environmental factors [75]. Currently, the International Classification of Diseases (ICD) includes more than 55,000 nosological units [76]. The vast majority of them belong to MFDs. As of 11 March 2023, over 12,000 human diseases, 220,322 SNPs, and 493,105 genome associations were registered [77]. Today, there are three main approaches to the identification of candidate genes: the functional mapping method (candidate gene analysis), genetic linkage in high-risk families, and GWASs, including genome sequencing. GWASs are actively used to analyze and test samples from different national biobanks, including the UK Biobank [78].

According to the current level of knowledge, the influence of genetic factors on the expression and penetrance of phenotypic traits is due to the presence of genetic polymorphisms of point mutations with strong effects, on the one hand, or frequent SNPs with weak effects, on the other hand [79]. To understand the genotype–phenotype correlation, the preferable approach is to investigate candidate genes associated with MFDs [80].

Insulin-resistant T2D and obesity are very frequent chronic pathologies with multifactorial etiology with 128 and more than 700 genetic markers, respectively [81,82]. The genetic framework of T2DM and obesity has been studied using GWASs [65]. In this study, Mark McCarthy, using the 1000 Genomes multiethnic reference panel, conducted a GWAS analysis on 26,676 T2D patients and 132,532 control European individuals to detect 13 novel T2D-associated loci (p < 5 × 10⁻⁸) near the glucagon like peptide 2 receptor (GLP2R), gastric inhibitory polypeptide (GIP), and human leukocyte antigen (HLA-DQA1) genes [82]. SNVs included the following: rs60780116 (4:185708807, T>C, p = 7.38 × 10⁻⁸) in the long-chain fatty-acid-coenzyme A ligase (ACSL1) gene, rs2292626 (10:124186714, C>T, p = 1.75 × 10⁻¹²) in pleckstrin homology domain containing A1 (PLEKHA1), rs1061810 (11:43877934, A>C, p = 5.29 × 10⁻⁹) in the hydroxysteroid 17-beta dehydrogenase 12 (HSD17B12) gene, rs2925979 (16:81534790,T>C, p = 2.72 × 10⁻⁸) in the C-Maf inducing protein (CMIP) gene, rs78761021 (17:9780387, G>A, p = 5.49 × 10⁻⁸) in the glucagon like peptide 2 receptor (GLP2R) gene, and rs79349575 (17:46967038, A>T, p = 2.61 × 10⁻⁷) in the GIP gene. In general, 128 SNVs at 113 loci were associated with T2D.

In another study, John Chambers et al. carried out a GWAS analysis on 16,677 T2D patients from South Asian and 33,856 controls, and observed 21 new genetic loci for T2D with a significant association (p = 4.7 × 10⁻⁸–5.2 × 10⁻¹²) [83]. Among them, the most statistical significances with T2D were as follows: rs10916784 (1: 20729451, G, p = 5.2 × 10⁻¹²) in the von Willebrand factor A domain containing 5B1 (VWA5B1) gene, rs74790763 (5:122675214, C, p = 3.2 × 10⁻¹¹) in the centrosomal protein 120 (CEP120) gene, rs62486442 (8:12623463, A, p = 2.4 × 10⁻¹⁰) in the LON peptidase N-terminal domain and ring finger 1 protein (LONRF1) gene, rs13257283 (8:105608497, G, p = 4.2 × 10⁻⁸) in the LDL receptor related protein 12 (LRP12) gene, and rs9568861 (13:54079446, T, p = 2.5 × 10⁻⁸) in the olfactomedin 4 (OLFM4) gene.

Our investigation displays that WES can serve as a technique to detect novel genes, associated with T2D and obesity [48]. The research included the detection of novel SNPs and loci for T2D and obesity in 110 Russian patients based on biologically meaningful filtering criteria. We have determined SNPs that serve as markers for T2D (rs1126930, rs9379084), obesity (rs11960429), and body mass index (rs1956549, rs11553746, and rs7195386). Using this approach, we detected rs11863726 in hemoglobin subunit theta 1 (HBQ1), rs328 in lipoprotein lipase (LPL), and rs112984085 in Vav guanine nucleotide exchange factor 3 (VAV3) for T2D and obesity. We also identified rs6271 in dopamine beta-hydroxylase (DBH), rs34042554 in protocadherin alpha 1 (PCDHA1), rs144183813 in pleckstrin homology domain containing A5 (PLEKHA5), rs62618693 in glutamine and serine rich 1 (QSER1), and rs61758785 in RAD51 paralog B (RAD51B) for obesity, in addition to rs685523 in ADAM metallopeptidase with thrombospondin type 1 motif 13 (ADAMTS13), rs2233984 in chromosome 6 open reading frame 15 (C6ORF15), rs17801742 in collagen type II alpha 1 chain (COL2A1), rs61737764 in integrin subunit beta 6 (ITGB6), and rs9379084 in Ras responsive element binding protein 1 (RREB1) for T2D in Russian patients [14,84,85,86].

Notably, in our investigation we employed a multiperspective approach (Figure 2) to detect candidate SNPs of T2D and obesity in a Russian population, which turned out to be a practical approach for limited-cohort studies (Figure 3). We used SNP and gene association tests based on the filtering of protein-altering SNPs and the prioritization of case- or control-specific genetic substitutions. We discovered that this approach prioritizes SNPs of middle and low frequency with a higher OR.

In conclusion, our research displays that WES is a rational approach that allows for the identification of MFD-associated genetic markers in limited populations. Thus, this strategy facilitates the identification of disease genes for polygenic traits.

3.4. Prospects of Comprehensive Individualized Screening for MFD Polygenic Factors

Although the polygenic inheritance of most common SNPs may exert certain mild underlying effects contributing to most common pathologies, the main effects belong to rare and common variants’ interactions [87]. It is currently unknown as to whether the application of a genome-wide polygenic score (GPS) to polygenic predictors would enable the detection of individuals at an increased clinical risk on a level comparable to rare monogenic substitutions [88].

Earlier attempts to develop such a GPS had limited success due to insufficient risk stratification for clinical utility. These efforts were hindered by three problems: (i) the small size of GWASs; (ii) computational limitations; and (iii) the lack of big datasets required for GPS validation. Khera and colleagues studied whether a GPS can detect groups of individuals bearing a level of risk exceeding that of monogenic variants. They analyzed five frequent pathologies: breast cancer, T2D, CAD, atrial fibrillation, and inflammatory bowel disease. For each of the illnesses, a GPS was developed based on extensive GWAS data from European participants and the UK Biobank genotype data. The predictors showed an AUC of between 0.79 and 0.81 in the validation dataset, with the best predictor (GPSCAD) involving 6,630,150 mutations. This predictor was well manifested in the test cohort with an AUC of 0.81 (Table 6).

The advantage of GPSCAD is that it can be assessed starting from birth, well before the discriminative capacity of risk factors used in clinical practice to predict CAD. Information on high GPSCAD values and individual hereditary predisposition may facilitate prevention efforts. For example, we showed that a high CAD risk could be offset by a healthy lifestyle or cholesterol-lowering treatment with statins [88]. Similar results were obtained for four other conditions. T2D is a main factor of cardiovascular and renal pathologies. The polygenic predictor determined 3.5% of the population as having a 3-fold risk and the top 1% as having a 3.30-fold risk [88]. Both drugs and cardinal lifestyle interventions prevent progression to T2D [89]. Polygenic scores make a quantitative metric of personal hereditary risk based on the cumulative effect of multiple common SNPs. Risk prediction accuracy shall improve considerably with the advancement of exome and whole-genome data. There is no doubt that, within a few years, the combination of genome-wide sequencing data and genome-wide searches for allelic associations for all major MFDs is bound to dramatically increase the predictive value of presymptomatic hereditary predisposition testing, considering the rapid progress in molecular medicine [1].

However, WES cannot achieve a 100% diagnostic rate. Therefore, assistance will need to be sought from undiagnosed disease networks or collaborating with researchers to further characterize VUS variants.

3.5. Clinical Genetic Passport (CGP)

WES enables investigators to study the structure, genetic polymorphisms, and functions of different variants in the genome within specific populations, providing insights into the hereditary mechanisms of a particular monogenic disease or MFD and contributing to their diagnosis, prevention, and treatment, in anticipation of the future revised classification of human diseases. This, in turn, leads to a paradigm shift and propels the advent, as well as rapid development, of molecular predictive medicine and genetic passport frameworks as steppingstone towards a clinical genetic passport (CGP), resulting from exome sequencing data [90,91]. A comprehensive approach, based on an entire array of molecular, genetic, cytogenetic, and embryological methods, in pregnancy planning is desperately demanded today. CGPs, genetic mapping, and NGS are present in all areas of medical science, enabling clinical medicine to solve reproductive problems, among other issues [1,23], such as the following: noninvasive prenatal testing (NIPT) or screening for monogenic and oligogenic diseases via the detection of pathogenic variants in probands and high-risk families; pregnancy planning via the use of preimplantation genetic diagnostics (PGDs) and treatment; and diagnosis confirmation, prospectively extending into MFD and infectious (COVID-19) disease risk assessment in addition to the identification of phenotypic traits in humans [92,93,94] (Figure 4).

The above being the case, efficient CGP implementation requires ample population databases (including on a national level), reinforced by advanced bioinformatic and statistical protocols for sequencing data processing and analysis, in order to evaluate the incidence of SNPs associated with hereditary and other pathologies.

It is important to note that many genetic substitutions previously identified as pathogenic occur increasingly frequently in healthy persons, causing Mendelian inheritance diseases. This was the strongest factor in reducing false-positive associations of variants with phenotypes [95]. Of note are critically important updates in genetic terminology regarding the newly emerged methods. Instead of the common terms “mutation” and “polymorphism”, in 2015 and 2017 the American College of Medical Genetics and Genomics (ACMG) and the Russian Society of Medical Genetics, respectively, recommended the use of the term “nucleotide sequence variant” with the following modifiers: (1) P, (2) LP, (3) uncertain significance, (4) likely benign, or (5) benign [96,97]. Today, there is a clear understanding that genetic variants are the main carriers of predictive information on disease pathogenicity and possess two main characteristics:

-: Penetrance (the percentage of carriers of the corresponding genotype that exhibit the trait);
-: Expressivity (varying manifestation of the trait in individuals with the same genotype) [98].

This being the case, the terms proposed back in 1925 by Timofeev-Resovsky have turned out to be of such massive importance and far ahead of their time [98], such that today they offer an explanation as to why “mutation” is receding into obscurity as a term, replaced by the term “variant” with its five modifications [97].

Moreover, WES allows for the instantaneous identification of several hereditary diseases in any individual. Reported clinical cases refer to jointly inherited X-linked and autosomal dominant forms of ichthyosis [51], Wilson disease, and hemochromatosis [32,99]. By knowing the molecular defects that lead to the development of the disease, patients may benefit from the most adequate follow-up.

Most diseases are not monogenic; therefore, prior to risk assessment, the nature of a disease shall be properly understood (i.e., monogenic, oligogenic, or multifactorial condition), which is not always easy. The ability to determine persons at a high genetic risk of the most frequent pathologies (diabetes, cardiovascular disease, etc.) at any age presents both opportunities and challenges in clinical medicine. Our studies on hereditary cardiomyopathy [65,66], familial hypercholesterolemia [47], and MODY [13] are an attempt to step up to this challenge. The situation with MFDs is somewhat more complicated, since changes in the genome affect disease etiology, with a large set of genes predisposing to disease (the additivity phenomenon). The disease predisposition is shaped by a large number of environmental factors, while inheritance is not subject to explanation by Mendelian laws only [1]. Indeed, identical diagnoses may be triggered by different risk factors and etiologies in different individuals. Our study shows that WES offers rational algorithms with which to identify genetic markers of complicated diseases, even in limited samples. Genomic medicine can also help identify rare conditions concealed behind a complicated multistep and multicomponent disease diagnosis. Moreover, various common diseases have reported rare genetic variants that increase the MFD risk by several times in heterozygous carriers, e.g., the presence of risk variants for familial hypercholesterolemia in 0.4% of the population, which increase the CAD risk three-fold [100]. Therefore, the risk of monogenic pathologies and MFDs is best assessed via the use of WES.

Molecular medicine and its main areas (predictive medicine, gene therapy, pharmacogenomics, etc.) are to shape a diverse landscape of applied human sciences in the 21st century and potentially the third millennium. Future horizons in genetic testing applications include the following: presymptomatic (pre-emptive) genetic testing (GT) in high-risk families, prospective GT with mandatory follow-up in high-risk individuals based on test findings, and randomized predictive testing [1]. Thus, the concept of predictive medicine—CGPs for solving the problems of preconceptional screening, PGDs, the birth of healthy offspring, diagnostics, and the prevention of MFDs as well as infectious diseases—shall rely on NGS technologies as a fundamental tool using specialized proprietary databases, algorithms, bioinformatics, and genetic custom concepts of expression and penetrance (Figure 5).

Thus, the current advances in PM and their practical value depend on genome sequencing quality and functional analyses within a systems genetics paradigm. The EPMA program suggests a PM roadmap that predominantly relies on the mass sequencing of individual genomes to elucidate their population, ethnic, social, and even interstitial features. The genetic analysis of gene expression allows individual omics profiles to be compared with a patient’s clinical and lab data. Based on such data, integrated gene networks are identified for a patient’s organs and systems most susceptible to pathological processes, and different future developmental scenarios are analyzed. As a result, patients are both the source of information and PM data users [1].

4. Conclusions

In conclusion, we would like to emphasize that the scientific foundations of precision medicine and advances in the diagnostics as well as treatment of monogenic, oligogenic, multifactorial, and infectious diseases are driven by the efficient application of NGS technologies jointly with modern analysis algorithms and genetic custom concepts of expression and penetrance. Future horizons to apply genetic testing include the following: presymptomatic (pre-emptive) genetic testing (GT) in high-risk families, prospective GT with mandatory follow-up in high-risk individuals based on test findings, and randomized predictive testing [1]. Knowledge of gene structure, the peculiarities of genetic polymorphisms, and the functions of different variants in the genome with regard to population specificity provide insights into the hereditary nature of a particular monogenic disease or MFD, contributing to their diagnosis, prevention, and treatment, as well as demanding the revision of human diseases’ classification; this, in turn, leads to a paradigm shift and propels the advent and rapid development of molecular medicine as a new science [100].

Author Contributions

O.S.G. and A.N.C. drafted the manuscript. A.S.G. and O.S.G. completed and critically revised the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This study was funded by the Ministry of Science and Higher Education of the Russian Federation (project Multicenter research bioresource collection “Human Reproductive Health” contract № 075-15-2021-1058, 28 September 2021).

Institutional Review Board Statement

Not available.

Informed Consent Statement

Consent was obtained from all authors.

Data Availability Statement

The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

Baranov, V.S.; Baranova, E.V.; Aseev, M.V. Evolution of Predictive Medicine; Baranov, V.S., Ed.; Eko-Vektor Publisher: St. Petersburg, Russia, 2021; 359p. [Google Scholar]
Frazer, K.A.; Ballinger, D.G.; Cox, D.R.; Hinds, D.A.; Stuve, L.L.; Gibbs, R.A.; Belmont, J.W.; Boudreau, A.; Hardenbol, P.; Leal, S.M.; et al. The International Hapmap Consortium. A second-generation human haplotype map of over 3.1 million SNPs. Nature 2007, 449, 851–861. [Google Scholar] [CrossRef]
Stankov, K.; Benc, D.; Draskovic, D. Genetic and epigenetic factors in etiology of diabetes mellitus type 1. Pediatrics 2013, 132, 1112–1122. [Google Scholar] [CrossRef]
Sladek, R.; Rocheleau, G.; Rung, J.; Dina, C.; Shen, L.; Serre, D.; Boutin, P.; Vincent, D.; Belisle, A.; Hadjadj, S.; et al. A genome-wide association study identifies novel risk loci for type 2 diabetes. Nature 2007, 445, 881–885. [Google Scholar] [CrossRef]
Ng, S.B.; Turner, E.H.; Robertson, P.D.; Flygare, S.D.; Bigham, A.W.; Lee, C.; Shaffer, T.; Wong, M.; Bhattacharjee, A.; Eichler, E.E.; et al. Targeted capture and massively parallel sequencing of 12 human exomes. Nature 2009, 461, 272–276. [Google Scholar] [CrossRef]
Rabbani, B.; Mahdieh, N.; Hosomichi, K.; Nakaoka, H.; Inoue, I. Next-generation sequencing: Impact of exome sequencing in characterizing Mendelian disorders. J. Hum. Genet. 2012, 57, 621–632. [Google Scholar] [CrossRef]
Goloshchapov, O.V.; Bakin, E.A.; Kucher, M.A.; Stanevich, O.V.; Suvorova, M.A.; Gostev, V.V.; Glotov, O.S.; Eismont, Y.A.; Polev, D.E.; Lobenskaya, A.Y.; et al. Bacteroides fragilis is a potential marker of effective microbiota transplantation in acute graft-versus-host disease treatment. Cell. Ther. Transplant. 2020, 9, 47–59. [Google Scholar] [CrossRef]
Suwinski, P.; Ong, C.K.; Ling, M.H.T.; Poh, Y.M.; Khan, A.M.; Ong, H.S. Advancing Personalized Medicine through the Application of Whole Exome Sequencing and Big Data Analytics. Front. Genet. 2019, 10, 49. [Google Scholar] [CrossRef]
Belkadi, A.; Bolze, A.; Itan, Y.; Cobat, A.; Vincent, Q.B.; Antipenko, A.; Shang, L.; Boisson, B.; Casanova, J.-L.; Abel, L. Whole-genome sequencing is more powerful than whole-exome sequencing for detecting exome variants. Proc. Natl. Acad. Sci. USA 2015, 112, 5473–5478. [Google Scholar] [CrossRef]
Alfares, A.; Aloraini, T.; Subaie, L.A.; Alissa, A.; Qudsi, A.A.; Alahmad, A.; Mutairi, F.A.; Alswaid, A.; Alothaim, A.; Eyaid, W.; et al. Whole-genome sequencing offers additional but limited clinical utility compared with reanalysis of whole-exome sequencing. Genet. Med. 2018, 20, 1328–1333. [Google Scholar] [CrossRef]
Barbitoff, Y.A.; Polev, D.E.; Glotov, A.S.; Serebryakova, E.A.; Shcherbakova, I.V.; Kiselev, A.M.; Kostareva, A.A.; Glotov, O.S.; Predeus, A.V. Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage. Sci. Rep. 2020, 10, 2057. [Google Scholar] [CrossRef]
AllSeq. WGS vs. WES. Available online: http://allseq.com/kb/wgsvswes (accessed on 6 January 2023).
Glotov, O.S.; Serebryakova, E.A.; Turkunova, M.E.; Efimova, O.A.; Glotov, A.S.; Barbitoff, Y.A.; Nasykhova, Y.A.; Predeus, A.V.; Polev, D.E.; Fedyakov, M.A.; et al. Whole-exome sequencing for monogenic diabetes in Russian children reveals wide spectrum of genetic variants in MODY-related and unrelated genes. Mol. Med. Rep. 2019, 20, 4905–4914. [Google Scholar]
Barbitoff, Y.A.; Serebryakova, E.A.; Nasykhova, Y.A.; Predeus, A.V.; Polev, D.E.; Shuvalova, A.R.; Vasiliev, E.V.; Urazov, S.P.; Sarana, A.M.; Scherbak, S.G.; et al. Identification of Novel Candidate Markers of Type 2 Diabetes and Obesity in Russia by Exome Sequencing with a Limited Sample Size. Genes 2018, 9, 415. [Google Scholar] [CrossRef]
Lu, Y.; Li, Y.; Li, G.; Lu, H. Identification of potential markers for type 2 diabetes mellitus via bioinformatics analysis. Mol. Med. Rep. 2020, 22, 1868–1882. [Google Scholar] [CrossRef]
Alur, V.; Raju, V.; Vastrad, B.; Vastrad, C.; Kavatagimath, S.; Kotturshetti, S. Bioinformatics Analysis of Next Generation Sequencing Data Identifies Molecular Biomarkers Associated with Type 2 Diabetes Mellitus. Clin. Med. Insights Endocrinol. Diabetes 2023, 16, 11795514231155635. [Google Scholar] [CrossRef]
Retterer, K.; Juusola, J.; Cho, M.T.; Vitazka, P.; Millan, F.; Gibellini, F.; Vertino-Bell, A.; Smaoui, N.; Neidich, J.; Monaghan, K.G.; et al. Clinical application of whole-exome sequencing across clinical indications. Genet. Med. 2016, 18, 696–704. [Google Scholar] [CrossRef]
Liu, Y.; Hao, C.; Li, K.; Hu, X.; Gao, H.; Zeng, J.; Guo, R.; Liu, J.; Guo, J.; Li, Z.; et al. Clinical Application of Whole Exome Sequencing for Monogenic Disorders in PICU of China. Front. Genet. 2021, 12, 677699. [Google Scholar] [CrossRef]
Hegde, M.; Santani, A.; Mao, R.; Ferreira-Gonzalez, A.; Weck, K.E.; Voelkerding, K.V. Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease. Arch. Pathol. Lab. Med. 2017, 141, 798–805. [Google Scholar] [CrossRef]
Chung, C.C.Y.; Hue, S.P.Y.; Ng, N.Y.T.; Doong, P.H.L.; Chu, A.T.W.; Chung, B.H.Y. Meta-analysis of the diagnostic and clinical utility of exome and genome sequencing in pediatric and adult patients with rare diseases across diverse populations. Genet. Med. 2023, 25, 100896. [Google Scholar] [CrossRef]
Khusnutdinova, E.K.; Fatkhlislamova, R.I.; Khidiiatova, I.M.; Viktorova, T.V.; Grinchuk, O.V. Restriction-deletion polymorphism of V-region of mitochondrial DNA in populations of peoples of Volga-Ural region. Genetics 1997, 33, 996–1000. [Google Scholar]
Stepanov, V.A. Ethnogenomics of the Population of Northern Eurasia; Pechatnaya Manufaktura: Tomsk, Russia, 2002; 244p. [Google Scholar]
Baranov, V.S. Genome Paths: A Way to Personalized and Predictive Medicine. Acta Naturae 2009, 1, 70–80. [Google Scholar] [CrossRef][Green Version]
Glotov, O.S.; Glotov, A.S.; Tarasenko, O.A.; Ivashchenko, T.E.; Baranov, V.S. Study of functionally significant polymorphism of ACE, AGTR1, ENOS, MTHFR, MTRR and APOE genes in population of North-West region of Russia. Ecol. Genet. 2004, 2, 32–35. [Google Scholar] [CrossRef]
Hofmann, A.L.; Behr, J.; Singer, J.; Kuipers, J.; Beisel, C.; Schraml, P.; Moch, H.; Beerenwinkel, N. Detailed simulation of cancer exome sequencing data reveals differences and common limitations of variant callers. BMC Bioinform. 2017, 18, 8. [Google Scholar] [CrossRef] [PubMed]
Wang, Q.; Shashikant, C.S.; Jensen, M.; Altman, N.S.; Girirajan, S. Novel metrics to measure coverage in whole exome sequencing datasets reveal local and global non-uniformity. Sci. Rep. 2017, 7, 885. [Google Scholar] [CrossRef]
OMIM Gene Map Statistics. Available online: https://www.omim.org/statistics/geneMap (accessed on 6 January 2023).
Zhao, Y.; Fang, L.T.; Shen, T.w.; Choudhari, S.; Talsania, K.; Chen, X.; Shetty, J.; Kriga, Y.; Tran, B.; Zhu, B.; et al. Whole genome and exome sequencing reference datasets from a multi-center and cross-platform benchmark study. Sci. Data 2021, 8, 296. [Google Scholar] [CrossRef]
Janicki, E.; De Rademaeker, M.; Meunier, C.; Boeckx, N.; Blaumeiser, B.; Janssens, K. Implementation of Exome Sequencing in Prenatal Diagnostics: Chances and Challenges. Diagnostics 2023, 13, 860. [Google Scholar] [CrossRef]
Szustakowski, J.D.; Balasubramanian, S.; Kvikstad, E.; Khalid, S.; Bronson, P.G.; Sasson, A.; Wong, E.; Liu, D.; Wade Davis, J.; Haefliger, C.; et al. Advancing human genetics research and drug discovery through exome sequencing of the UK Biobank. Nat. Genet. 2021, 53, 942–948. [Google Scholar] [CrossRef]
Ni, Q.; Chen, X.; Zhang, P.; Yang, L.; Lu, Y.; Xiao, F.; Wu, B.; Wang, H.; Zhou, W.; Dong, X. Systematic estimation of cystic fibrosis prevalence in Chinese and genetic spectrum comparison to Caucasians. Orphanet J. Rare Dis. 2022, 17, 129. [Google Scholar] [CrossRef]
Barbitoff, Y.A.; Skitchenko, R.K.; Poleshchuk, O.I.; Shikov, A.E.; Serebryakova, E.A.; Nasykhova, Y.A.; Polev, D.E.; Shuvalova, A.R.; Shcherbakova, I.V.; Fedyakov, M.A.; et al. Whole exome sequencing provides insights into monogenic disease prevalence in Northwest Russia. Mol. Genet. Genom. Med. 2019, 7, e964. [Google Scholar] [CrossRef]
Sheremet, N.L.; Zhorzholadze, N.V.; Ronzina, I.A.; Grushke, I.G.; Kurbatov, S.A.; Chukhrova, A.L.; Loginova, A.N.; Shcherbakova, P.O.; Tanas, A.S.; Polyakov, A.V.; et al. Molecular genetic diagnosis of Stargardt disease. Vestn. Oftalmol. 2017, 133, 4–11. [Google Scholar] [CrossRef]
Al-Khuzaei, S. An Overview of the Genetics of ABCA4 Retinopathies, an Evolving Story. Genes 2021, 12, 1241. [Google Scholar] [CrossRef]
Abramov, D.D.; Kadochnikova, V.V.; Yakimova, E.G.; Belousova, M.V.; Maerle, A.V.; Sergeev, I.V.; Ragimov, A.A.; Donnikov, A.E.; Trofimov, D.Y. High carrier frequency of CFTR gene mutations associated with cystic fibrosis, and PAH gene mutations associated with phenylketonuria in Russian population. Bull. Russ. State Med. Univ. 2015, 4, 32–35. [Google Scholar]
Guo, J.; Garratt, A.; Hill, A. Worldwide rates of diagnosis and effective treatment for cystic fibrosis. J. Cyst. Fibros. 2022, 21, 456–462. [Google Scholar] [CrossRef] [PubMed]
Mannucci, P.M.; Duga, S.; Peyvandi, F. Recessively inherited coagulation disorders. Blood 2004, 104, 1243–1253. [Google Scholar] [CrossRef]
Hillert, A. The Genetic Landscape and Epidemiology of Phenylketonuria. Am. J. Hum. Genet. 2020, 107, 234–250. [Google Scholar] [CrossRef] [PubMed]
Ala, A.; Walker, A.P.; Ashkan, K.; Dooley, J.S.; Schilsky, M.L. Wilson’s disease. Lancet 2007, 369, 397–408. [Google Scholar] [CrossRef] [PubMed]
Collet, C. High genetic carrier frequency of Wilson’s disease in France: Discrepancies with clinical prevalence. BMC Med. Genet. 2018, 19, 143. [Google Scholar] [CrossRef] [PubMed]
Berry, G.T.; FFACMG. Classic Galactosemia and Clinical Variant Galactosemia Synonyms: Galactose-1-Phosphate Uridylyltranserase Deficiency, GALT Deficiency; Initial Posting: 4 February 2000, Last Update: 11 March 2021; National Library of Medicine: Bethesda, MD, USA, 2021.
Lazarin, G.A.; Haque, I.S.; Nazareth, S.; Iori, K.; Patterson, A.S.; Jacobson, J.L.; Marshall, J.R.; Seltzer, W.K.; Patrizio, P.; Evans, E.A.; et al. An empirical estimate of carrier frequencies for 400+ causal Mendelian variants: Results from an ethnically diverse clinical sample of 23,453 individuals. Genet. Med. 2013, 15, 178–186. [Google Scholar] [CrossRef]
Abramov, D.D.; Belousova, M.V.; Kadochnikova, V.V.; Ragimov, A.A.; Trofimov, D.Y. Carrier frequency of GJB2 and GALT mutations associated with sensorineural hearing loss and galactosemia in the Russian population. Bull. Russ. State Med. Univ. 2017, 6, 20–23. [Google Scholar] [CrossRef][Green Version]
Trujillano, D.; Bertoli-Avella, A.M.; Kumar Kandaswamy, K.; Weiss, M.E.; Köster, J.; Marais, A.; Paknia, O.; Schröder, R.; Garcia-Aznar, J.M.; Werber, M.; et al. Clinical exome sequencing: Results from 2819 samples reflecting 1000 families. Eur. J. Hum. Genet. 2017, 25, 176–182. [Google Scholar] [CrossRef]
Trinh, J.; Kandaswamy, K.K.; Werber, M.; Weiss, M.E.R.; Oprea, G.; Kishore, S.; Lohmann, K.; Rolfs, A. Novel pathogenic variants and multiple molecular diagnoses in neurodevelopmental disorders. J. Neurodev. Disord. 2019, 11, 11. [Google Scholar] [CrossRef]
Zhang, Q.; Qin, Z.; Yi, S.; Wei, H.; Zhou, X.Z.; Su, J. Clinical application of whole-exome sequencing: A retrospective, singlecenter study. Exp. Ther. Med. 2021, 22, 753. [Google Scholar] [CrossRef]
Miroshnikova, V.V.; Romanova, O.V.; Ivanova, O.N.; Fedyakov, M.A.; Panteleeva, A.A.; Barbitoff, Y.A.; Muzalevskaya, M.V.; Urazgildeeva, S.A.; Gurevich, V.S.; Urazov, S.P.; et al. Identification of novel variants in the LDLR gene in Russian patients with familial hypercholesterolemia using targeted sequencing. Biomed. Rep. 2021, 14, 15. [Google Scholar] [CrossRef]
EAS Familial Hypercholesterolaemia Studies Collaboration; Vallejo-Vaz, A.J.; Marco, M.D.; Stevens, C.A.T.; Akram, A.; Freiberger, T.; Hovingh, G.K.; Kastelein, J.J.P.; Mata, P.; Raal, F.J.; et al. Overview of the current status of familial hypercholesterolaemia care in over 60 countries-The EAS Familial Hypercholesterolaemia Studies Collaboration (FHSC). Atherosclerosis 2018, 277, 234–255. [Google Scholar] [CrossRef]
Barbitoff, Y.A.; Bezdvornykh, I.V.; Polev, D.E.; Serebryakova, E.A.; Glotov, A.S.; Glotov, O.S.; Predeus, A.V. Catching hidden variation: Systematic correction of reference minor alleles in clinical variant calling. Genet. Med. 2018, 20, 360–364. [Google Scholar] [CrossRef] [PubMed]
Di Taranto, M.D.; Giacobbe, C.; Fortunato, G. Familial hypercholesterolemia: A complex genetic disease with variable phenotypes. Eur. J. Med. Genet. 2020, 63, 103831. [Google Scholar] [CrossRef]
Alaverdian, D.A.; Fedyakov, M.; Polennikova, E.; Ivashchenko, T.; Shcherbak, S.; Urasov, S.; Tsay, V.; Glotov, O.S. X-linked and autosomal dominant forms of the ichthyosis in coinheritance. Drug Metab. Pers. Ther. 2019, 34, 20190008. [Google Scholar] [CrossRef] [PubMed]
Koshevaya, Y.S.; Kusakin, A.V.; Buchinskaia, N.V.; Pechnikova, V.V.; Serebryakova, E.A.; Koroteev, A.L.; Glotov, A.S.; Glotov, O.S. Description of first registered case of the Lopes-Maciel-Rodan syndrome in Russia. Int. J. Mol. Sci. 2022, 23, 12437. [Google Scholar] [CrossRef]
Turkunova, M.E.; Barbitoff, Y.A.; Serebryakova, E.A.; Polev, D.E.; Berseneva, O.S.; Bashnina, E.B.; Baranov, V.S.; Glotov, O.S.; Glotov, A.S. Molecular Genetics and Pathogenesis of the Floating Harbor Syndrome: Case Report of Long-Term Growth Hormone Treatment and a Literature Review. Front. Genet. 2022, 13, 846101. [Google Scholar] [CrossRef]
Scholz, T.; Blohm, M.E.; Kortüm, F.; Bierhals, T.; Lessel, D.; van der Ven, A.T.; Lisfeld, J.; Herget, T.; Kloth, K.; Singer, D.; et al. Whole-Exome Sequencing in Critically Ill Neonates and Infants: Diagnostic Yield and Predictability of Monogenic Diagnosis. Neonatology 2021, 118, 454–461. [Google Scholar] [CrossRef] [PubMed]
Balashova, M.S.; Tuluzanovskaya, I.G.; Glotov, O.S.; Glotov, A.S.; Barbitoff, Y.A.; Fedyakov, M.A.; Alaverdian, D.A.; Ivashchenko, T.E.; Romanova, O.V.; Sarana, A.M.; et al. The spectrum of pathogenic variants of the ATP7B gene in Wilson disease in the Russian Federation. J. Trace Elem. Med. Biol. 2020, 59, 126420. [Google Scholar] [CrossRef]
Dillon, O.J.; Lunke, S.; Stark, Z.; Yeung, A.; Thorne, N.; Gaff, C.; White, S.M.; Tan, T.Y. Exome sequencing has higher diagnostic yield compared to simulated disease-specific panels in children with suspected monogenic disorders. Eur. J. Hum. Genet. 2018, 26, 644–651. [Google Scholar] [CrossRef] [PubMed]
Monaghan, K.G.; Leach, N.T.; Pekarek, D.; Prasad, P.; Rose, N.C.; ACMG Professional Practice and Guidelines Committee. The use of fetal exome sequencing in prenatal diagnosis: A points to consider document of the American College of Medical Genetics and Genomics (ACMG). Genet. Med. 2020, 22, 675–680. [Google Scholar] [CrossRef] [PubMed]
Petrovski, S.; Aggarwal, V.; Giordano, J.L.; Stosic, M.; Wou, K.; Bier, L.; Spiegel, E.; Brennan, K.; Stong, N.; Jobanputra, V.; et al. Whole-exome sequencing in the evaluation of fetal structural anomalies: A prospective cohort study. Lancet 2019, 393, 758–767. [Google Scholar] [CrossRef]
Lord, J.; McMullan, D.J.; Eberhardt, R.Y.; Rinck, G.; Hamilton, S.J.; Quinlan-Jones, E.; Prigmore, E.; Keelagher, R.; Best, S.K.; Carey, G.K.; et al. Prenatal exome sequencing analysis in fetal structural anomalies detected by ultrasonography (PAGE): A cohort study. Lancet 2019, 393, 747–757. [Google Scholar] [CrossRef]
Best, S.; Wou, K.; Vora, N.; van der Veyver, I.B.; Wapner, R.; Chitty, L.S. Promises, pitfalls and practicalities of prenatal whole exome sequencing. Prenat. Diagn. 2018, 38, 10–19. [Google Scholar] [CrossRef] [PubMed]
Fedyakov, M.A. Anauksetic Dysplasia: Clinic, Molecular Genetic Diagnosis and Treatment. Molecular Biological Technologies in Medical Practice; Maslennikov, A.B., Ed.; Akademizdat: Novosibirsk, Russia, 2021; Volume 32, pp. 81–92. [Google Scholar]
Heng Mak, T.S.; Lee, Y.-K.; Tang, C.S.; Hai, J.J.S.H.; Ran, X.; Sham, P.-C.; Tse, H.-F. Coverage and diagnostic yield of Whole Exome Sequencing for the Evaluation of Cases with Dilated and Hypertrophic Cardiomyopathy. Sci. Rep. 2018, 8, 10846. [Google Scholar]
Kousi, M.; Katsanis, N. Genetic modifiers and oligogenic inheritance. Cold Spring Harb. Perspect. Med. 2015, 5, a017145. [Google Scholar] [CrossRef] [PubMed]
Kim, K.-H. Genetics of Cardiomyopathy: Clinical and Mechanistic Implications for Heart Failure. Korean Circ. J. 2021, 51, 797836. [Google Scholar] [CrossRef]
Glotov, A.S.; Kazakov, S.V.; Zhukova, E.A.; Alexandrov, A.V.; Glotov, O.S.; Pakin, V.S.; Danilova, M.M.; Poliakova, I.V.; Niyazova, S.S.; Chakova, N.N.; et al. Targeted next-generation sequencing (NGS) of nine candidate genes with custom AmpliSeq in patients and a cardiomyopathy risk group. Clin. Chim. Acta 2015, 446, 132–140. [Google Scholar] [CrossRef] [PubMed]
Komissarova, S.M.; Chakova, N.; Niyazova, S.S.; Kazakov, S.V.; Zhukova, E.A.; Aleksandrov, A.V.; Glotov, O.S.; Glotov, A.S. The specifics of hypertrophic cardiomyopathy clinical presentation in patients with various mutations of sarcomere genes. Russ. J. Cardiol. 2016, 1, 20–25. [Google Scholar] [CrossRef]
Teekakirikul, P.; Kelly, M.A.; Rehm, H.L.; Lakdawala, N.K.; Funke, B.H. Inherited cardiomyopathies: Molecular genetics and clinical genetic testing in the postgenomic era. J. Mol. Diagn. 2013, 15, 158–170. [Google Scholar] [CrossRef] [PubMed]
Jääskeläinen, A.; Nevanperä, N.; Remes, J.; Rahkonen, F.; Järvelin, M.-R.; Laitinen, J. Stress-related eating, obesity and associated behavioural traits in adolescents: A prospective population-based cohort study. BMC Public Health 2014, 14, 321. [Google Scholar] [CrossRef]
Hattersley, A.T.; Greeley, S.A.W.; Polak, M.; Rubio-Cabezas, O.; Njølstad, P.R.; Mlynarski, W.; Castano, L.; Carlsson, A.; Raile, K.; Chi, D.V.; et al. ISPAD clinical practice consensus guidelines 2018: The diagnosis and management of monogenic diabetes in children and adolescents. Pediatr. Diabetes 2018, 19, 47–63. [Google Scholar] [CrossRef] [PubMed]
Barbetti, F.; D’Annunzio, G. Genetic causes and treatment of neonatal diabetes and early childhood diabetes. Pest Pract. Res. Clin. Endocrinol. Metab. 2018, 32, 575–591. [Google Scholar] [CrossRef] [PubMed]
Zhong, L.; Zhao, Z.; Hu, Q.; Li, Y.; Zhao, W.; Li, C.; Xu, Y.; Rong, R.; Zhang, J.; Zhang, Z.; et al. Identification of Maturity-Onset Diabetes of the Young Caused by Mutation in FOXM1 via Whole-Exome Sequencing in Northern China. Front. Endocrinol. 2020, 11, 534362. [Google Scholar] [CrossRef]
Bonnefond, A.; Philippe, J.; Durand, E.; Dechaume, A.; Huyvaert, M.; Montagne, L.; Marre, M.; Balkau, B.; Fajardy, I.; Vambergue, A.; et al. Whole-Exome Sequencing and High Throughput Genotyping Identified KCNJ11 as the Thirteenth MODY Gene. PLoS ONE 2012, 7, e37423. [Google Scholar] [CrossRef]
Lemelman, M.B.; Letourneau, L.; Greeley, S. Neonatal diabetes mellitus: An update on diagnosis and management. Clin. Perinatol. 2018, 45, 41–59. [Google Scholar] [CrossRef] [PubMed]
Greeley, S.A.; Naylor, R.N.; Philipson, L.H.; Bell, G.I. Neonatal diabetes: An expanding list of genes allows for improved diagnosis and treatment. Curr. Diab. Rep. 2011, 11, 519–532. [Google Scholar] [CrossRef]
100,000 Genomes Project. Available online: https://www.genomicsengland.co.uk (accessed on 6 January 2023).
Petersen, I. Classification and Treatment of Diseases in the Age of Genome Medicine Based on Pathway Pathology. Int. J. Mol. Sci. 2021, 22, 9418. [Google Scholar] [CrossRef]
The NHGRI-EBI GWAS Catalog. Available online: https://www.ebi.ac.uk/gwas (accessed on 6 January 2023).
Genetic Data-UK Biobank. Available online: https://www.ukbiobank.ac.uk/scientists-3/uk-biobank-axiom-array (accessed on 6 January 2023).
Reich, D.E.; Lander, E.S. On the allelic spectrum of human disease. Trends Genet. 2001, 17, 502–510. [Google Scholar] [CrossRef]
Baranov, V.S.; Glotov, A.S.; Glotov, O.S.; Ivaschenko, T.E.; Aseev, M.V.; Shved, N.Y.; Kozlovskaya, M.A.; Moskalenko, M.V.; Demin, G.S.; Bespalova, O.N.; et al. Genetic Passport the Basis of Individual and Predictive Medicine; Baranov, V.S., Ed.; SPb: ‘N-L’, Ltd.: Moscow, Russia, 2009; 527p. [Google Scholar]
Wang, X.; Strizich, G.; Hu, Y.; Wang, T.; Kaplan, R.C.; Qi, Q. Genetic markers of type 2 diabetes: Progress in genome-wide association studies and clinical application for risk prediction. J. Diabetes 2016, 8, 24–35. [Google Scholar] [CrossRef]
Scott, R.A.; Scott, L.J.; Mägi, R.; Marullo, L.; Gaulton, K.J.; Kaakinen, M.; Pervjakova, N.; Pers, T.H.; Johnson, A.D.; Eicher, J.D.; et al. An expanded genome-wide association study of type 2 diabetes in Europeans. Diabetes 2018, 66, 2888–2902. [Google Scholar] [CrossRef]
Loh, M.; Zhang, W.; Ng, H.K.; Schmid, K.; Lamri, A.; Tong, L.; Ahmad, M.; Lee, J.J.; Ng, M.C.Y.; Petty, L.E.; et al. Identification of genetic effects underlying type 2 diabetes in South Asian and European populations. Commun. Biol. 2022, 5, 329. [Google Scholar] [CrossRef]
Zabetian, C.P.; Romero, R.; Robertson, D.; Sharma, S.; Padbury, J.F.; Kuivaniemi, H.; Kim, K.-S.; Kim, C.-H.; Köhnke, M.D.; Kranzler, H.R.; et al. A revised allele frequency estimate and haplotype analysis of the DBH deficiency mutation IVS1+2T->C in African- and European-Americans. Am. J. Med. Genet. Part A 2003, 123, 190–192. [Google Scholar] [CrossRef]
Shungin, D.; Winkler, T.W.; Croteau-Chonka, D.C.; Ferreira, T.; Locke, A.E.; Mägi, R.; Strawbridge, R.J.; Pers, T.H.; Fischer, K.; Justice, A.E.; et al. New genetic loci link adipose and insulin biology to body fat distribution. Nature 2015, 518, 187–196. [Google Scholar] [CrossRef] [PubMed]
Flajollet, S.; Poras, I.; Carosella, E.D.; Moreau, P. RREB-1 is a transcriptional repressor of HLA-G. J. Immunol. 2009, 183, 6948–6959. [Google Scholar] [CrossRef]
Gibson, G. Rare and common variants: Twenty arguments. Nat. Rev. Genet. 2012, 18, 135–145. [Google Scholar] [CrossRef] [PubMed]
Khera, A.V.; Chaffin, M.; Aragam, K.G.; Haas, M.E.; Roselli, C.; Choi, S.H.; Natarajan, P.; Lander, E.S.; Lubitz, S.A.; Ellinor, P.T.; et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat. Genet. 2018, 50, 1219–1224. [Google Scholar] [CrossRef] [PubMed]
Knowler, W.C.; Barrett-Connor, E.; Fowler, S.E.; Hamman, R.F.; Lachin, J.M.; Walker, E.A.; Nathan, D.M.; Diabetes Prevention Program Research Group. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. N. Engl. J. Med. 2002, 346, 393–403. [Google Scholar]
Collins, F.S.; McKusick, V.A. Implication of Human Genome Project for Medical Science. JAMA 2001, 285, 540–544. [Google Scholar] [CrossRef]
Peltonen, L.; McKusick, V.A. Genomics and medicine. Dissecting human disease in the postgenomic era. Science 2001, 91, 1224–1229. [Google Scholar]
Pendina, A.A.; Shilenkova, Y.V.; Talantova, O.E.; Efimova, O.A.; Chiryaeva, O.G.; Malysheva, O.V.; Dudkina, V.S.; Petrova, L.I.; Serebryakova, E.A.; Shabanova, E.S.; et al. Reproductive History of a Woman with 8p and 18p Genetic Imbalance and Minor Phenotypic Abnormalities. Front. Genet. 2019, 10, 1164. [Google Scholar] [CrossRef]
Saifitdinova, A.F.; Glotov, O.S.; Poliakova, I.V.; Bichevaya, N.K.; Loginova, J.A.; Kuznetsova, R.A.; Leonteva, O.A.; Nevskaia, E.E.; Pavlova, O.A.; Puppo, I.L.; et al. Mosaicism in preimplantation human embryos. Integr. Physiol. 2020, 1, 225–230. [Google Scholar] [CrossRef]
Shcherbak, S.G.; Anisenkova, A.Y.; Mosenko, S.V.; Glotov, O.S.; Chernov, A.N.; Apalko, S.V.; Urazov, S.P.; Garbuzov, E.Y.; Khobotnikov, D.N.; Klitsenko, O.A.; et al. Basic Predictive Risk Factors for Cytokine Storms in COVID-19 Patients. Front. Immunol. 2021, 12, 745515. [Google Scholar] [CrossRef]
Lek, M.; Karczewski, K.J.; Minikel, E.V.; Samocha, K.E.; Banks, E.; Fennell, T.; O’Donnell-Luria, A.H.; Ware, J.S.; Hill, A.J.; Cummings, B.B.; et al. Analysis of Protein-Coding Genetic Variation in 60,706 Humans. Nature 2016, 536, 285–291. [Google Scholar] [CrossRef]
Richards, S.; Aziz, N.; Bale, S.; Bick, D.; Das, S.; Gastier-Foster, J.; Grody, W.W.; Hegde, M.; Lyon, E.; Spector, E.; et al. Standards and guidelines for the interpretation of sequence variants: A joint consensus recommendation of the American College of Medical Genetics and Genomics and the Association for Molecular Pathology. Genet. Med. 2015, 17, 405–424. [Google Scholar] [CrossRef]
Ryzhkova, O.P.; Kardymon, O.L.; Prohorchuk, E.B.; Konovalov, F.A.; Maslennikov, A.B.; Stepanov, V.A.; Afanasyev, A.A.; Zaklyazminskaya, E.V.; Kostareva, A.A.; Pavlov, A.E.; et al. Guidelines for interpretation of human DNA sequence data obtained by massively parallel sequencing (MPS) (revision 2018, version 2). Med. Genet. 2019, 18, 3–23. [Google Scholar]
Inge-Vechtomov, S.G. Genetics with the Basics of Selection; Publishing House H-J: St. Petersburg, Russia, 2010; 720p. [Google Scholar]
Tulzunovskaya, I.G.; Zhuchenko, N.A.; Balashova, M.; Filimonov, M.I.; Rosina, T.P.; Glotov, O.S.; Asanov, A.Y. Wilson-Conovalov disease: Intrafamilial clinical polymorphism. Pediatria J. GN SPERANSKY 2017, 96, 215–216. [Google Scholar]
Abul-Husn, N.S.; Manickam, K.; Jones, L.K.; Wright, E.A.; Hartzel, D.N.; Gonzaga-Jauregui, C.; O’Dushlaine, C.; Leader, J.B.; Kirchner, H.L.; Lindbuchler, D.M.; et al. Genetic identification of familial hypercholesterolemia within a single U.S. health care system. Science 2016, 354, aaf7000. [Google Scholar] [CrossRef] [PubMed]

Figure 1. Hotspot RMRP gene promoter region in the proband [62].

Figure 2. Differences in the prevalence of genetic variants causative of NDM and MODY. Abbreviations: GATA6, GATA binding protein 6; ABCC8, member 8 of the ATP binding cassette subfamily C; EIF2AK3, eukaryotic translation initiation factor 2 alpha kinase 3; FOXP3, P3 forkhead box transcriptional factor; GCK, glucokinase; WFS1, Wolframin ER transmembrane glycoprotein; HNF1A, HNF1 homeobox A transcriptional factor; and PAXA, paired box A transcriptional factor.

Figure 3. Schematic application of WES data in T2D research (A) [16] and data analysis pipeline in a study on COVID-19 (B) [14]. MAF, minor allele frequency.

Figure 4. Block diagram for a clinical genetic passport, to aid reproduction.

Figure 5. Exome study within the predictive medicine framework of a clinical genetic human health passport. Different colors reflect the sequence of genetic analysis: green for sequencing data, yellow for bioinformatics, gray for the risk analysis of monogenic and oligogenic diseases, and white for MFDs as well as infection diseases.

Table 1. Targeted NGS technologies by intended use.

NGS Technology	Study Aims	Working Principle	Data Size, GB	Research and Clinical Applications
WGS	Identification of genetic mutations and SNPs in coding and noncoding genome regions	DNA extraction, fragmentation, sequencing, data analysis, and identification of relevant variants	90–120	Research, diagnosis, and treatment
WES	Identification of variants in protein coding loci and genes (exome)	DNA extraction, fragmentation, target gene identification, sequencing, data analysis, and variant annotation	6–12	Clinical diagnosis, disease-causing genes
TS	Screening for DNA variants affecting numerous genes	DNA extraction, fragmentation, target gene sequence capture, sequencing, data analysis, and variant annotation	0.5–3	Treatment
RNA-seq	Identification of gene expression, new protein isoforms, detection of merging genes, SNPs, insertions, deletions, and small noncoding RNAs	RNA and cDNA extraction, fragmentation, sequencing, data analysis, and variant annotation	3–6	Research and therapy, biomarker discovery, and drug resistance
ChIP-seq		DNA fragmentation, binding beaded antibodies to target proteins, DNA purification, sequencing, and identification of gene variants	1–2	The influence of transcription factors on phenotype-affecting mechanisms and diseases

Table 2. MD prevalence in Russia and globally determined by the frequencies of pathogenic SNPs [32].

Disease/Condition	Gene	Allele Count	Carrier Frequency (Lower/Upper CI)	Disease Frequency (Lower/Upper CI)	Known Frequency	References
Retinal dystrophy, Stargardt disease	ABCA4	13 (23)	0.0350 (0.0206/0.0589)	3.1 × 10⁻⁴ (1.1 × 10⁻⁴/8.8 × 10⁻⁴)	1 in 10,000 1 in 8000	[33] [34]
Cystic fibrosis	CFTR	11 (19)	0.0296 (0.0167/0.0522)	2.2 × 10⁻⁴ (6.9 × 10⁻⁵/6.9 × 10⁻⁴)	1 in 10,000 1 in 3000–16,000	Reported carrier frequency of 0.032 [35] [36]
Phenylketonuria	PAH	11 (18)	0.0296 (0.0167/0.0522)	2.2 × 10⁻⁴ (6.9 × 10⁻⁵/6.9 × 10⁻⁴)	1 in 10,000 1 in 4500 [Italy]–1 in 125,000 [Japan]	Reported carrier frequency of 0.029 [37] [38]
Wilson disease	ATP7B	4 (6)	0.0108 (0.0042/0.0274)	2.9 × 10⁻⁵ (4.3 × 10⁻⁶/1.9 × 10⁻⁴)	1 in 30,000 1 in 30,000	Similar global incidence reported [39,40]
Galactosemia	GALT	4 (5)	0.0108 (0.0042/0.0274)	2.9 × 10⁻⁵ (4.3 × 10⁻⁶/1.9 × 10⁻⁴)	1 in 20,000 1 in 48,000	Reported carrier frequency of 0.006 [35,41]

Abbreviations: ABCA4, ATP binding cassette subfamily A member 4; ATP7B, P-type cation transport ATPase family 7B; CFTR, cystic fibrosis transmembrane regulator; and GALT, galactose-1-phosphate uridylyltransferase.

Table 3. Pathogenicity of novel LDLR gene variants.

Gene	Patient ID	Exon/Intron	Variant	Allele Frequency in GnomAD	Allele Frequency in [49]	Variant Pathogenicity Classification by ACMG
LDLR	G31	4	c.316_328delCCCAAGACGTGCT p.(Lis107Argfs*95)	Not found	Not found	P (PVS1 PS1 PM1 PM2 PP3)
LDLR	G29	4	c.325T>G p.(Cys109Gly)	Not found	Not found	LP (PS1 PM1 PM2 PM5 PP3)
LDLR	G36	4	c.401G>C (p.Cys134Ser)	Not found	Not found	LP (PS1 PM1 PM2 PM5 PP3)
LDLR	1	4	c.433_434insG p(Val145Glyfs*35)	Not found	Not found	P (PVS1 PM2 PP3)
LDLR	G18	4	c.616A>C (p.Ser206Arg)	Not found	Not found	Uncertain significance (PM2 PP1 PP3)
LDLR	G21	IVS6	c.940+1_c.940+4 delGTGA (g.18154_18157delGTGA)	Not found	Not found	P (PVS1 PM1 PM2 PP3)
LDLR	32	8	c.1186G>C p.(Gly396Arg)	Not found	Not found	P (PVS1 PM1 PM2 PM5 PP3)
LDLR	G26	IVS8	c.1186+1G>T (g.22279G>T)	Not found	Not found	P (PVS1 PM2 PP3)
LDLR	G17	11	c.1684_1691delTGGCCCAA p.(Pro563Hisfs*14)	Not found	Not found	P (PVS1 PM1 PM2 PP3)

Abbreviations: LDLR, low-density lipoprotein receptor; ACMG, American College of Medical Genetics; and P, pathogenic.

Table 4. Most efficient diagnostic strategies for hereditary diseases.

Nosology	Efficiency of Diagnostics Prior to NGS, %	Efficiency of Diagnostics after WES, %	Efficiency of Diagnostics with Novel Variants Considered, %	Reference
Cystic fibrosis	45–55 (1 mutation) 58 (35 mutations)	67–80	-	Unpublished
WD	Up to 75 (4 mutations) Up to 86 (12 mutations)	Up to 96	97	[55]
MODY	15–35	40–50	55	[13]
Genetically heterogeneous condition	28%	65%		[56]
Neurometabolic disorder	24%	35%		[56]
Single anomalyy of the fetuses		6%		[57]
Two and more anomalies of the fetuses		35%		[57]
Anomalies of the fetuses		10.3–18.9%		[58]
Anomalies of the fetuses		8.5–15.4%		[59]
Anomalies of the fetuses		6.2–80%		[60]

Abbreviations: MODY, maturity onset diabetes of the young; WD, Wilson’s disease; WES, whole-exome sequencing; and NGS, next-generation sequencing.

Table 5. Main HCM genetic variants determined in patients and control group [65].

Gene	Nucleotide Change	Diseased/Risk/Healthy, %	Risk	Risk2	p-Value	Polyphen 2	SIFT	Clinical Verification
MYBPC3	c.977G>A (NM_000256.3)	5/4/0	19	−99	0.41	Benign	Damaging	[68]
MYBPC3	c.2678G>T (NM_000256.3)	5/17/0	16	−96	-	Probably damaging	Damaging	-
CASQ2	c.1014+12delG (NM_001232.3)	13/4/0	49	−249	8.62 × 10⁻⁵	-	-	-
TNNT2	c.97+151delC (NM_000364.3)	0/0/10	−100	20	1.80 × 10⁻⁵	-	-	-
TNNT2	c.223+92G>C (NM_000364.3)	0/0/29	−300	60	1.902 × 10⁻⁷	-	-	-
TNNT2	c.223+93C>G (NM_000364.3)	0/0/33	−350	70	2.535 × 10⁻⁴	-	-	-

Abbreviations: MYBPC3, cardiac myosin binding protein C; CASQ2, Calsequestrin 2; TNNT2, cardiac troponin T; and SIFT, scale-invariant feature transform.

Table 6. GPS derivation and testing for five common MFDs [84].

Disease	Discovery GWAS, Case/Control	Prevalence in the Validation Dataset	Prevalence in the Testing Dataset	No. of SNPs in GPS	Tuning Parameter	AUC (95% CI) in the Validation Dataset	AUC (95% CI) in the Testing Dataset
CAD	60,801/123,504	3963/120,280 (3.4%)	8676/288,978 (3.0%)	6,630,150	LDPred (ρ = 0.001)	0.81 (0.80–0.81)	0.81 (0.81–0.81)
Atrial fibrillation	17,931/115,142	2024/120,280 (1.7%)	4576/288,978 (1.6%)	6,730,541	LDPred (ρ = 0.003)	0.77 (0.76–0.78)	0.77 (0.76–0.77)
T2D	26,676/132,532	2785/120,280 (2.4%)	5853/288,978 (2.0%)	6,917,436	LDPred (ρ = 0.01)	0.72 (0.72–0.73)	0.73 (0.72–0.73)
Inflammatory bowel disease	12,882/21,770	1360/120,280 (1.1%)	3102/288,978 (1.1%)	6,907,112	LDPred (ρ = 0.1)	0.63 (0.62–0.65)	0.63 (0.62–0.64)
Breast cancer	122,977/105,974	2576/63,347 (4.1%)	6586/1,576,895 (4.2%)	5218	Pruning and thresholding (r/2 < 0.2; p < 5 × 10⁻⁴	0.68 (0.67–0.69)	0.69 (0.68–0.69)

Abbreviations: CAD, coronary artery disease; T2D, type 2 diabetes; GPS, genome-wide polygenic score; GWAS, genome-wide association study; SNP, single-nucleotide polymorphism; and AUC, area under the ROC Curve.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Glotov, O.S.; Chernov, A.N.; Glotov, A.S. Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience. J. Pers. Med. 2023, 13, 1236. https://doi.org/10.3390/jpm13081236

AMA Style

Glotov OS, Chernov AN, Glotov AS. Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience. Journal of Personalized Medicine. 2023; 13(8):1236. https://doi.org/10.3390/jpm13081236

Chicago/Turabian Style

Glotov, Oleg S., Alexander N. Chernov, and Andrey S. Glotov. 2023. "Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience" Journal of Personalized Medicine 13, no. 8: 1236. https://doi.org/10.3390/jpm13081236

APA Style

Glotov, O. S., Chernov, A. N., & Glotov, A. S. (2023). Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience. Journal of Personalized Medicine, 13(8), 1236. https://doi.org/10.3390/jpm13081236

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Human Exome Sequencing and Prospects for Predictive Medicine: Analysis of International Data and Own Experience

Abstract

1. Introduction

2. Human Monogenic Diseases: Population Genetics Research

2.1. Human Monogenic Diseases

2.2. Population Genetic Researches for Monogenic Diseases

2.3. WES Application to Identify New Variants in the Genomes of Patients

2.4. General Strategy and Algorithm of WES Implementation in Human Genetic Pathology Diagnostics

3. New-Generation Sequencing, Phenotypic Screening, Oligogenic and Multifactorial Diseases

3.1. Oligogenic Etiology of Cardiomyopathies

3.2. Monogenic Diabetes Mellitus

3.3. Multifactorial Diseases (MFDs): Type 2 Diabetes Mellitus

3.4. Prospects of Comprehensive Individualized Screening for MFD Polygenic Factors

3.5. Clinical Genetic Passport (CGP)

4. Conclusions

Author Contributions

Funding

Institutional Review Board Statement

Informed Consent Statement

Data Availability Statement

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI