Article

A Gene-Based Algorithm for Identifying Factors That May Affect a Speaker’s Voice

Center for Voice Intelligence and Security, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Entropy 2023, 25(6), 897; https://doi.org/10.3390/e25060897
Submission received: 17 April 2023 / Revised: 26 May 2023 / Accepted: 28 May 2023 / Published: 2 June 2023
(This article belongs to the Special Issue Information-Theoretic Approaches in Speech Processing and Recognition)

Abstract

Over the past decades, many machine-learning- and artificial-intelligence-based technologies have been created to deduce biometric or bio-relevant parameters of speakers from their voice. These voice profiling technologies have targeted a wide range of parameters, from diseases to environmental factors, based largely on the fact that they are known to influence voice. Recently, some have also explored the prediction of parameters whose influence on voice is not easily observable through data-opportunistic biomarker discovery techniques. However, given the enormous range of factors that can possibly influence voice, more informed methods for selecting those that may be potentially deducible from voice are needed. To this end, this paper proposes a simple path-finding algorithm that attempts to find links between vocal characteristics and perturbing factors using cytogenetic and genomic data. The links represent reasonable selection criteria for use by computational profiling technologies only, and are not intended to establish any unknown biological facts. The proposed algorithm is validated using a simple example from medical literature: the clinically observed effects of specific chromosomal microdeletion syndromes on the vocal characteristics of affected people. In this example, the algorithm attempts to link the genes involved in these syndromes to a single example gene (FOXP2) that is known to play a broad role in voice production. We show that in cases where strong links are exposed, vocal characteristics of the patients are indeed reported to be correspondingly affected. Validation experiments and subsequent analyses confirm that the methodology could be potentially useful in predicting the existence of vocal signatures in naïve cases where their existence has not been otherwise observed.

1. Introduction

Aside from diseases that affect the biological structures and processes involved in voice production, myriad other factors are known to influence voice. Some simple examples include age, exhaustion, smoking, which often makes the voice sound hoarse, or alcohol, which makes the voice sound slurred. The ensuing changes in the voice signal can be thought to be the “biomarkers” that give us information about the corresponding causative factors and allow us to infer their nature through voice analysis. Such relationships form the basis for artificial-intelligence (AI)-based voice profiling techniques that attempt to deduce a speaker’s bio-relevant and environmentally related parameters from voice. However, virtually all research on voice profiling, diagnostics, and biometrics is currently predicated on clinically observed or statistically inferred relationships between changes in voice and the corresponding factors that are thought to cause them. The relationships that are chanced upon in this manner provide the basis for building predictive AI (machine-learning- or rule-based) mechanisms that can deduce the underlying factors that potentially influence voice through voice analysis.
For example, it is known that smoking affects voice. To establish this, a human-observation-based approach would be: (a) an audiological one based on hearing the voices of smokers to determine if they deviate from those of non-smokers in an acoustic sense, and/or (b) a visual one where an analyst studies the spectrogram (or some other visual representation) of the speech signal to find patterns that distinguish one class of recordings from another. The spectrogram in this case is a “feature representation”. A statistical approach, on the other hand, would gather examples of speech recordings from people who smoke and those who do not, extract feature representations from the recordings, and find significant differences in the statistics of these features obtained from the two sets of recordings. Alternatively, a classifier model may be trained to discriminate between voice samples from smokers and non-smokers. If high test accuracies are achieved in this task, the existence of a biomarker for smoking in voice is indicated. This is a purely data-driven approach for establishing the existence of biomarkers in voice.
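As an illustration of the statistical approach described above, the following minimal Python sketch runs a two-sample permutation test on a single scalar voice feature. The feature values, the function name `permutation_p_value`, and the choice of a permutation test are assumptions made for illustration only; the text does not prescribe a particular test or feature.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference of group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of the recordings
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-speaker feature values (not real data).
smokers = [2.0, 2.1, 1.9, 2.2, 2.05, 1.95]
non_smokers = [0.9, 1.0, 1.1, 1.2, 1.0, 0.95]
p_value = permutation_p_value(smokers, non_smokers)
```

A small `p_value` would suggest that the feature differs between the two groups, which is the kind of evidence the data-driven approach relies on.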
The problem with these approaches is that neither is scalable. The number of factors that can influence the human persona is virtually infinite. Human observations are limited to the effects that are perceptually discernible in voice, and data-based discovery is confined and limited by the availability of representative data. This paper provides a more formal methodology for establishing the existence of biomarkers and for identifying which factors are likely to affect voice and which are not. The methodology is based on genomic considerations, as explained below.
Before we proceed, however, it must be noted that this is not an algorithm for computational biology research. Its use to aid biological discovery or establish biological facts in its current form has not been tested. It is only meant to establish the tentative links that are needed to justify computational profiling efforts.

1.1. A Genomic-Based Approach to Detect the Existence of Biomarkers

The working hypothesis for this paper, and one that has also recently been proposed in the context of voice profiling [1], is that if a given factor exerts an influence on the speaker, and if pathways of biological effects can be traced from that influence to the speaker’s voice production system, then voice must be affected (and must carry biomarkers for the factor). The methodology proposed herein is a literal test of this hypothesis in that it traces biological pathways from cause to effect to establish the existence of biomarkers.
For this, we begin with the genetic underpinnings of human vocal capabilities. In this context, it is important to differentiate between voice production and speech production. The former refers to the production of acoustic energy in any form within the vocal tract, and the latter refers to the modulation of the acoustic signals thus produced to form words and sentences in a language used for interpersonal communication. We will use the term “vocal production” to refer to both.
Vocal production in humans is a complex and multifaceted process and involves interactions between multiple genes and environmental factors. The genetic basis of vocal production is not fully understood. Nevertheless, a number of genes have been found to be involved in the process. Some that are now known to influence vocal production include:
  • FOXP2: This gene codes for a protein called “Forkhead Box P2”, which is involved in the development and function of the brain, including the areas responsible for language and speech.
  • TAFT: This gene codes for a protein called TAFT1, which is involved in the development and function of the larynx, a structure in the throat that is involved in vocal production.
  • OTOF: This gene codes for a protein called Otoferlin, which is involved in the development and function of the auditory system, including the inner ear. Feedback from this system greatly influences vocal production.
  • MYO15A: This gene codes for a protein called Myosin XVa, which is involved in the development and function of the auditory system, including the hair cells in the inner ear.
  • SEMA3A: This gene codes for a protein called Semaphorin 3A, which is involved in the development and function of the auditory system, including the auditory nerve.
Recent efforts to identify and delineate the genes responsible for functional speech in humans have especially highlighted the importance of FOXP2, one of the protein-coding genes mentioned above. It is involved in a variety of biological pathways and cascades that are thought to regulate language development. It is autosomal dominant, and mutations in it cause speech and language disorders (OMIM: SPCH1). In this paper, we choose FOXP2 as an example gene to work with. This choice was only made for illustrative purposes. The methodology presented in its context is itself a generic one and can be applied to any of the genes listed above (and possibly others that exist) whose functions are relevant to the analysis at hand.
For illustrative purposes, we proceeded with the broad and simplifying assumption that any influence on speech and language is ultimately the phenotypic expression of FOXP2. The objective was thus to:
  • Formalize the methodology to find a link between an influencing factor and this gene;
  • Validate the methodology;
  • Demonstrate its predictive potential.
To accomplish these goals, for simplicity, we chose the example of a category of medical conditions for which the underlying genetic causes are known; the effects of the conditions are observed and reported in medical literature, and the effects can involve problems with vocal production.
The medical conditions we choose are chromosomal microdeletion syndromes, which result from the deletion of specific genes in specific cytogenetic locations on human chromosomes. We propose an algorithm to find links from the genes in the microdeletion regions to FOXP2. The connections in these links are derived using a path-search algorithm applied to a graph composed from known biological pathways that involve these (and other) genes. The “strength” of these connections is then defined in terms of the characteristics of the linkage discovered. In the validation stage, our goal is to show that there is a direct correlation between the strength of linkages found and the extent of vocal symptoms experienced by the affected individuals.
Before we proceed with the algorithm, in the paragraphs below, we first provide a working categorization of speech disorders, as reported in various medical literature. This is necessary for the clarity of the results presented later in this paper.

1.2. Anomalies in Speech Production

From a bio-mechanical perspective, human speech is the result of two complex processes that happen simultaneously: one that produces sound (the pressure wave that we sense as the voice signal) and another that modulates this signal (through articulator movements) to produce speech, thus altering the voice signal’s frequency characteristics and shaping it into sounds with unique identities that are uttered sequentially to form words and sentences in a language. The overall process of voice and speech production is driven and controlled by neuromuscular and cognitive factors to different degrees. It is also moderated to different degrees by feedback obtained through auditory pathways. Generally, diseases that affect these functions also influence speech and alter the characteristics of the voice signal to proportional degrees. In most cases, when reporting such changes, references to “speech” implicitly include voice, as we see below.
Changes in speech are categorically described in terms of six major aspects of speech production: respiration, phonation, articulation, resonance, evolution, and prosody. In addition, terminology that relates to voice quality is often used to describe speech. Voice quality is, however, a subjective term and comprises many constituents, or sub-qualities (e.g., nasal, breathy, rough, twangy, etc.), that refer to the perceptual flavor of speech (or how a speaker’s voice sounds to the listener). Physical anomalies that affect the shape and tissue structure of the vocal tract cause changes in all of these aspects. Speech delays and language difficulties result from cognitive and learning disabilities. These and other intellectual disabilities affect articulation, evolution, and prosody. Their effect on voice also manifests as changes in voice quality. Craniofacial anomalies affect the physical dimensions of the vocal tract structures, often restricting the movement of the articulators as a result and causing speaking impairments. Motor problems affect articulation, phonation, and respiration. These cause speech aberrations and also affect voice quality. Hearing problems disturb the feedback mechanisms involved in controlling speech production and often lead to difficulties in prosody and articulation of speech.
In this paper, we do not focus explicitly on voice acoustic or quality characteristics, focusing instead on problems with speech (that subsume voice characteristics to some extent) as described under the OMIM (referring to the catalog Online Mendelian Inheritance in Man: https://www.omim.org/ (accessed on 21 September 2022)) category “Speech and Language disorders (SPCH1)”. Even this category, however, is too broad and encompasses a wide range of speech problems, such as delays in acquisition of speech abilities, retardation in speech development with age, speech anomalies resulting from language delay, expression, and articulation.
For the purpose of this paper, it is necessary to make finer distinctions between these categories. The problem in doing so is that the language used in the literature to describe voice- and speech-related problems in the context of genetic syndromes is not standardized. For example, the terms “speech disorder”, “speech disturbance”, “speech anomalies”, “speech aberrations”, and “speech impairment” may each refer to a range of symptoms that may be overlapping to various degrees. For the purpose of this paper, it is therefore useful to map the broad range of speech problems into the following categories that are sufficiently discriminatory in terms of the different aspects of speech production mentioned above while still being limited in number:
  • Absence of speech: Phrases (in clinical/scientific literature) referring to (a) no development of speech capabilities, (b) no expressive speech, which is mostly limited to vocalizations, or (c) almost absent speech with a severely limited vocabulary (0–4 words).
  • Apraxia: Phrases referring to difficulty using language correctly while speaking, leading to speaking and communication difficulties.
  • Delayed speech: Phrases referring to developmental delay, the retarded development of the ability to speak, or the retarded acquisition of language skills and communication skills (ability to use a vocabulary correctly to communicate in a cogent manner).
  • Dysarthric speech or dysarthria: Phrases referring to speaking problems resultant from damaged, paralyzed, or weakened muscles of the articulators caused by motor problems. Dysarthria results in slurred words, poor phonation, etc. The speaker uses vocabulary as in normal speech but finds it difficult to move the articulators (tongue, lips, jaw, etc.) correctly to form the proper sounds to utter the words.
  • Idiosyncratic speech: Phrases referring to poor conformance to cogent language or incoherent language with articulation abnormalities.
  • Impaired speech: Phrases referring to poor articulation and phonation, as well as difficulties that result in sparse and disfluent speech.

2. Methodology

As mentioned earlier, we consider the example of syndromes resulting from chromosomal microdeletions and focus on their symptoms relating to speech abilities.
Chromosomal microdeletions are structural anomalies of chromosomes in which small sections of a chromosome are deleted or missing. The loss of the specific set of genes from the deleted section often results in phenotypic changes. An implicated gene is a gene in the deleted region of a chromosome that is known to cause much of the observed effects of the syndrome in affected individuals. These are identified through microarray and other studies. For most microdeletion syndromes, some such genes have been identified and reported in the medical literature. We use this information from the medical literature as-is in the sections below.
Although the human genome is very large in comparison to a typical chromosomal microdeletion region, microdeletions often cause serious problems. In fact, only a small set of deletions are compatible with life or fetal survival. This set continues to expand with the addition of newly discovered deletions in surviving individuals who have the means to reach genetic testing facilities. However, it is still a very small set and can be exhaustively studied. Most known deletions are well documented in the literature, both from the genetic and medical perspectives. Information about the genes associated with them is readily available through well-curated publicly accessible repositories. Thus, they are good example cases for this paper.
The methodology proposed herein analyzes ensembles of biological pathway chains, each of which connects a specific gene in the cytogenetic region of chromosomal microdeletion to the FOXP2 gene. A “biological pathway” here is defined as in standard terminology, referring to a physiological process at the cellular level that is enabled by the action of multiple genes that perform specific functions within the process.
We define a pathway chain as the sequential linkage of pathways where links between pathways are shared genes (implicitly meaning that the molecules resultant from genes are shared—we will use the term “gene” with this implicit meaning in the context of pathways for generality going forward). For example, consider a pathway that signals for a cell to stop dividing when an injury to the nuclear DNA strand is being repaired. It may involve the coordinated chemical action of molecules that are formed by the transcription of multiple genes that perform different functions. It would also be connected to a repair pathway by necessity. Thus, the two pathways can be considered to be links in a single pathway chain (they must share some genes in a functional sense) that perform the function of relaying messages from one pathway to another. Such genes may also perform other functions that are essential to both pathways.
The hypothesis we make here is that for a gene, if a chain from its pathway(s) (i.e., from the biological pathways it contributes to directly) extends to pathways that influence voice production, then the phenotype resultant from the absence or aberrant functioning of the gene can be expected to include anomalies in speech production and voice characteristics.

2.1. Voice Chains

Our definition of a voice chain extends our definition of a pathway chain in that the head of the chain must now necessarily be a pathway that includes a gene that influences voice or speech production, while the termination of the chain is not necessarily a biological pathway but could include any given set of genes with a common characterization (such as a common cytogenetic location or function).
In this paper, the voice-related gene chosen is FOXP2, but in other analyses, voice chains could involve other genes (e.g., as in [2]) without loss of generality. The terminal link in the chain is taken to be a genetic microdeletion syndrome. A voice chain thus establishes a relationship between an influencing factor (a genetic microdeletion syndrome in this case) and a corresponding effect on voice/speech production. We refer to a voice chain that includes a sequence of $\alpha$ pathways from the microdeletion region to the FOXP2 gene as a level-$\alpha$ voice chain (Figure 1). The specific genes on the microdeletion that link it to the voice chain are referred to as “chainlink” genes. We represent the set of chainlink genes that connect a microdeletion to level-$\alpha$ voice chains as the chainlink set $V_N^\alpha$. Since we aim to analyze the genetic basis of the effect of microdeletions on voice, these chainlink sets are the focus of our analysis. Note that the subscript $N$ in $V_N^\alpha$ denotes the manner in which overlaps or linkages between biological pathways are defined. This subscript is fixed for the purpose of this paper, for which the exact manner in which pathway overlaps are defined is described in Figure 1 and Figure 2 below. We, however, leave the $N$ in place to facilitate future differentiations and variations of the proposed algorithm based on how the pathway intersections (or unions) may be defined.
In order to trace the genetic links between a microdeletion syndrome and voice, we first attempt to identify voice chains of different lengths that link to the genes in the microdeletion region. For this, we must find voice chains that link the FOXP2 gene to the syndrome, and identify the specific genes from the syndrome through which they are linked. We do so using the graph-search algorithm described below.
From our perspective, a biological pathway $B$ is represented by the set of genes it involves: $B := \{g : g \text{ is a gene in the specified pathway}\}$. Two pathways, $B_1$ and $B_2$, are linked if there are genes that are common to both pathways, i.e., $B_1 \cap B_2 \neq \emptyset$. Thus, the set of all pathways can be represented as a graph where the nodes are biological pathways, and two nodes are linked only if the corresponding pathways have common genes, as illustrated in Figure 2a.
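The graph construction described above can be sketched in a few lines of Python. The pathway names and the placeholder gene symbols `G1` to `G5` are hypothetical toy data, and `build_pathway_graph` is an illustrative helper name rather than anything defined in the paper.

```python
from itertools import combinations

# Hypothetical toy pathways, each represented by its set of genes.
pathways = {
    "P_voice": {"FOXP2", "G1", "G2"},
    "P_a": {"G2", "G3"},
    "P_b": {"G3", "G4"},
    "P_c": {"G5"},
}

def build_pathway_graph(pathways):
    """Return adjacency sets: two pathways are linked iff they share a gene."""
    graph = {name: set() for name in pathways}
    for a, b in combinations(pathways, 2):
        if pathways[a] & pathways[b]:  # non-empty intersection
            graph[a].add(b)
            graph[b].add(a)
    return graph

graph = build_pathway_graph(pathways)
```

In this toy example, `P_voice` is linked only to `P_a` (through `G2`), `P_a` is linked to `P_b` (through `G3`), and `P_c` is isolated.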
A pathway chain is any non-repeating sequence of pathways $B_1 \rightarrow B_2 \rightarrow B_3 \rightarrow \cdots \rightarrow B_N$ such that $B_i \cap B_{i+1} \neq \emptyset$ and $B_i \neq B_j$ for $i \neq j$, i.e., where every pair of adjacent pathways has common genes, and there are no closed loops in the chain. In terms of the graph (see Figure 2a), a pathway chain is any path between any two nodes in the graph. A voice chain $V$ is any chain $V = B_V \rightarrow B_2 \rightarrow B_3 \rightarrow \cdots \rightarrow B_N \rightarrow S$ where the head node $B_V$ (and the head node alone) is a pathway that includes the FOXP2 gene, i.e., $\mathrm{FOXP2} \in B_V$, and the terminal node $S$ is a set of genes with common characterization, as mentioned earlier. The length of the chain $|V|$ is the number of nodes $\alpha$ in the chain, not counting the terminal node $S$. For the purpose of this paper, we will assume $S$ to be the set of genes in a microdeletion region associated with a syndrome. Thus, $S = \{g : g \text{ is a gene in the microdeletion region}\}$. All voice chains of length $\alpha$ form the set of level-$\alpha$ voice chains, and the chainlink genes that connect $S$ to the level-$\alpha$ voice chains form the chainlink set $V_N^\alpha$.
To find voice chains of the form $B_V, B_1, S$ arising from the microdeletion region $S$ (which we will refer to as the “syndrome” for brevity), we introduce the microdeletion region in the pathway graph (Figure 2b). Voice chains are now the paths from $B_V$ to $S$ (Figure 2c,d). A breadth-first algorithm, described in Algorithm 1, is used to extract the chainlink sets $V_N^\alpha$ for voice chains of multiple levels. The outcome of the algorithm is the set of chainlink genes $V_N^\alpha[S]$ that connect each syndrome $S$ to voice chains of level $\alpha$, for $1 \leq \alpha \leq 2$. We restrict ourselves to chains of lengths of up to 2 since, at greater lengths, the chained influences cannot be disambiguated, as indicated by prior studies in the (highly related) context of protein–protein interactomes, e.g., [3]. Another reason for restricting ourselves to level 2 chains is that for the specific example chosen in this paper, there are not enough data that allow us to build deeper chains meaningfully (without resorting to self-loops, which may lead to incorrect conclusions).
Algorithm 1: Pseudocode for a breadth-first algorithm for computing the set of chainlink genes that form level 1 and level 2 voice chains for FOXP2.
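Because the pseudocode itself is not reproduced in the text here, the following Python sketch is a reconstruction from the definitions in this section, not the author's Algorithm 1. It assumes that level-1 chainlink genes are the syndrome genes found in a FOXP2-containing pathway, that level-2 chainlink genes are the syndrome genes found in a non-FOXP2 pathway sharing at least one gene with a FOXP2-containing pathway, and that a gene may appear at both levels. The function name `chainlink_sets` and the toy gene symbols are hypothetical.

```python
def chainlink_sets(pathways, syndrome_genes, voice_gene="FOXP2"):
    """Return {1: level-1 chainlink genes, 2: level-2 chainlink genes}."""
    syndrome = set(syndrome_genes)

    # Layer 1: head pathways, i.e., those that contain the voice gene.
    heads = {name for name, genes in pathways.items() if voice_gene in genes}
    level1 = set()
    for name in heads:
        level1 |= pathways[name] & syndrome

    # Layer 2: non-head pathways that share at least one gene with a head.
    second = {
        name
        for name, genes in pathways.items()
        if name not in heads and any(genes & pathways[h] for h in heads)
    }
    level2 = set()
    for name in second:
        level2 |= pathways[name] & syndrome

    return {1: level1, 2: level2}

# Hypothetical toy data: G4 is only reachable through a chain of three pathways, G5 not at all.
pathways = {
    "B_V": {"FOXP2", "G1", "G2"},
    "B_2": {"G2", "G3"},
    "B_3": {"G3", "G4"},
    "B_4": {"G5"},
}
syndrome = {"G1", "G3", "G4", "G5"}
result = chainlink_sets(pathways, syndrome)
```

Here `result[1]` contains `G1` and `result[2]` contains `G3`; `G4` and `G5` are not chainlink genes at either level, which mirrors the restriction to chains of length at most 2.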

2.2. Ensemble Analysis

In the methodology we propose, for any microdeletion region $S$, we derive the set of chainlink genes within it for which $\alpha$-level chains exist. The size and composition of this set can then be used in conjunction with the level of the voice chain to indicate the effect on voice (in a later analysis). In general, we can work with any level-$\alpha$ voice chains in such an analysis; however, we restrict ourselves to $\alpha = 1$ and $\alpha = 2$.

3. Analysis

The chainlink sets $V_N^\alpha$, where $\alpha = 1, 2$, were computed for a total of 82 microdeletion syndromes of chromosomes 1–20, 22, X, and Y. Genomic information, including gene names, was obtained from the HUGO Gene Nomenclature Committee’s (HGNC) human genome database, comprising 42,764 gene symbols and names and 3245 gene families and sets as of the time of conducting this analysis. Information about the phenotypes and the specific genes implicated in a syndrome was obtained from a survey of the current literature on medical genetics and genomics and from the Online Mendelian Inheritance in Man (OMIM) repository for authoritative information about human genes and genetic phenotypes.
The FOXP2 gene chosen for this analysis has been strongly implicated in speech and language disorders [4,5], including monogenic speech disorders. The cytogenetic location (chromosome locus) of this gene is 7q31.1. Mutations in this gene are known to cause speech and language disorder Type 1, also called “Autosomal dominant speech and language disorder with orofacial dyspraxia”. The phenotype description and known molecular basis for this disorder can be found under OMIM entry SPCH1:602081. The FOXP2 gene encodes for the protein “Forkhead Box Protein P2” [6]. This protein is a transcription factor; it controls the activity of other genes. It binds to the DNA of the genes that it controls through a region known as a Forkhead Domain. It thus plays a critical role in several protein-coding and other biological pathways and has been well studied [7]. A more detailed summary of this gene can be obtained from the Human Protein Atlas [8].
The ensemble of pathways used for this analysis was obtained from the Carcinogenic Potency Database (CPDB), described on its website as “a single standardized resource of the results of 45 years of chronic, long-term carcinogenesis bioassays”. Its current database of human biological pathways contains 4319 pathways and their gene compositions. This database has been used extensively in the medical literature and was chosen in this case for illustrative purposes since there is (importantly) no inherent bias towards the speech phenotype in it. In this database, there is only one pathway that contains the gene FOXP2. This is the Adenoid Cystic Carcinoma (ACC) pathway, which contains 63 genes, listed below for reference:
Gene membership of the ACC pathway:
AKT1 ARID1A ARID4B ARID5B ATM ATRX BCOR BCORL1 BRCA1 BRD1 CEBPA CMTR2 CNTN6 CREBBP CTBP1 DTX4 EP300 ERBB2 ERBIN FBXW7 FGF16 FGFR4 FOXO3 FOXP2 H1-4 H2AC16 HRAS IL17RD INSRR JMJD1C KANSL1 KAT6A KDM6A KDM6B KMT2C MAGI1 MAGI2 MAML3 MAP2K2 MAX MGA MORF4L1 MYB MYBL1 MYC MYCBP MYCN NCOR1 NFIB NOTCH1 NSD1 PIK3CA PRKDC PTEN RAF1 SETD2 SMARCA2 SMARCE1 SMC1A SRCAP TLK1 TP53 UHRF1
Table A1, given in Appendix A, documents the voice chains found for a set of 75 documented microdeletion syndromes. This range excludes chromosome 21, for which sufficient documentation was not found in the literature. Only voice chains up to level 2 are shown in this table and used in the analysis presented in this paper. This is sufficient to demonstrate the viability of the methodology for the discovery of voice chains proposed in this paper. The entries in the rows and columns of this table are explained in detail in Appendix A.
Table 1 summarizes some of the information in Table A1 to help understand the analysis given in the next section. The information given in Table 1 includes, for each syndrome listed in it, the corresponding implicated genes that are also discovered to be chainlink genes by the algorithm proposed in this paper; the overall counts of level-1 and level-2 chainlink genes for each syndrome, along with the number of additional pathways they collectively connect to (in parentheses); and the corresponding phenotypic effects on speech that have been reported in the scientific literature.

4. Inferences

A wealth of conclusions can be drawn from Table 1 (and from its more detailed version, Table A1 in Appendix A). However, we focus only on those that help validate the usefulness of the proposed algorithm.

4.1. Voice Chains as Predictors of Speech Characteristics

Of the 76 syndromes in Table A1, voice chains were found to exist for all. By our hypothesis, this would imply that in all cases, there is a potential for voice to be affected. The syndromes 15q11–q13 and Xq28 have two versions each, divided in the medical literature based on symptoms, rather than gene composition of the microdeletion region. We can therefore combine them for analysis, leaving us with 74 syndromes to be analyzed. For the syndrome 16q22, no information about the speech issues was found in the medical literature. Only the remaining 73 syndromes are considered in the analysis below.
The incidence of speech pathologies (including all forms of pathologies) among the general population is reported to be about 5% [65], and between 2.3% and 24.6% among children [66]. Of the 73 syndromes, 17 syndromes had both level-1 and level-2 voice chains, while 56 had only level-2 chains. The occurrence of speech aberrations was reported for all 17 syndromes with level-1 chains and for all but 6 of the 56 syndromes with only level-2 voice chains. Thus, voice chains correlate highly with the existence of speech anomalies.

4.2. Voice Chains as Information-Carrying Entities

Let us study how voice chains correlate with the presence or absence of specific voice problems. Such correlations would show that voice chains carry information about how the voice may be affected. This information is expected to be coarse-grained since we only take the presence or absence of any gene into consideration and consider no other cytogenetic information related to it.
From Table A1, we observe the following.
Level-1 chains:
The number of level-1 chainlink genes is limited to 1 or 2 in all cases and is not amenable to statistical analysis. However, we make the following observations:
  • Level-1 voice chains co-occur with speech problems 100% of the time.
  • For all instances where level-1 voice chains are present, severe symptoms occur 100% of the time (impaired, delayed or absent speech).
  • For all instances of syndromes with no effect on speech (i.e., where normal is not just one of a range of other speech symptoms), level-1 chains are absent 100% of the time.
Level-2 Chains:
Level-2 voice chains are present in all cases and co-occur with speech disorders in all but 6 cases; thus, in only 6 cases has speech been reported to be normal. Therefore, level-2 voice chains co-occur with speech problems 91.8% of the time.
We note that a syndrome may have level-2 voice chains through many chainlink genes, which could number in the tens or even hundreds. Each of the chainlink genes could, in turn, also be associated with multiple other pathways, in addition to the one connecting it to FOXP2. We refer to the total number of pathways that include the chainlink genes of a syndrome as its “chainlink connectivity”.
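As a minimal sketch of this definition, the Python snippet below counts the pathways that contain at least one chainlink gene of a syndrome. It assumes that each pathway is counted once even if it contains several chainlink genes; the helper name `chainlink_connectivity` and the toy data are hypothetical.

```python
def chainlink_connectivity(pathways, chainlink_genes):
    """Number of distinct pathways that include at least one chainlink gene."""
    genes = set(chainlink_genes)
    return sum(1 for members in pathways.values() if members & genes)

# Hypothetical toy pathways and chainlink genes.
pathways = {
    "B_V": {"FOXP2", "G1", "G2"},
    "B_2": {"G2", "G3"},
    "B_3": {"G3", "G4"},
    "B_4": {"G5"},
}
chainlink_genes = {"G1", "G3"}

connectivity = chainlink_connectivity(pathways, chainlink_genes)
# Per-chainlink-gene (normalized) connectivity.
normalized = connectivity / len(chainlink_genes)
```

In the toy data, `G1` appears in `B_V` and `G3` appears in `B_2` and `B_3`, so three pathways are counted.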
Table 2 presents some statistics of syndromes, chainlink genes, and chainlink connectivity associated with speech disorders of different severity. The problems considered are: absent speech (the most severe symptom), impaired speech (a symptom that is less severe than absent), delayed speech (a cognitive symptom that is also less severe than absent and comparable in severity to impaired speech, which is a physical symptom), dysarthric speech (a symptom related to physical issues), and apraxic speech (due to CNS disorders; this is less severe compared to absent speech and often subsumes idiosyncratic speech). Each row of the table represents one type of speech problem and shows the number of syndromes associated with it, the mean and median of the counts of chainlink genes for the syndromes, and the mean and median of the chainlink connectivities of the syndromes. From an inspection of Table 2, a distinct pattern emerges. Rank ordering the symptoms by ascending order of the means of the counts of chainlink genes, we see that their connectivities also fall in almost the same order:
$$\text{normal}\,(32, 404) < \text{apraxia}\,(43, 675) < \text{dysarthria}\,(43, 725) < \text{impaired}\,(46, 734) < \text{delayed}\,(47, 682) < \text{absent}\,(60, 902)$$
This rank ordering is consistent with the rank ordering of symptom severity based on the descriptions in the medical literature. In general, statistically speaking, the number of chainlink genes and the chainlink connectivity both appear to relate monotonically to the severity of the speech disorder.
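As a rough check of the phrase "almost the same order", the sketch below counts how many pairs of symptom types are ordered the same way by both quantities. It reads each quoted pair as (mean chainlink-gene count, mean chainlink connectivity); that reading, and the function name `pair_agreement`, are assumptions for illustration.

```python
from itertools import combinations

# Pairs as quoted in the ordering above, read here as
# (mean chainlink-gene count, mean chainlink connectivity).
stats = {
    "normal": (32, 404),
    "apraxia": (43, 675),
    "dysarthria": (43, 725),
    "impaired": (46, 734),
    "delayed": (47, 682),
    "absent": (60, 902),
}


def pair_agreement(table):
    """Count concordant, discordant, and tied pairs across the two quantities."""
    concordant = discordant = tied = 0
    for a, b in combinations(table, 2):
        d_count = table[a][0] - table[b][0]
        d_conn = table[a][1] - table[b][1]
        product = d_count * d_conn
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        else:
            tied += 1  # equal counts or equal connectivities
    return concordant, discordant, tied


print(pair_agreement(stats))
```

Of the 15 pairs, only those involving delayed speech against dysarthria and against impaired speech are ordered differently by the two quantities, and apraxia and dysarthria tie on count.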
Figure 3 shows scatter plots for counts of chainlink genes, chainlink connectivity, and a scatter of chainlink gene counts vs. normalized (per-chainlink-gene) chainlink connectivity for different severities of voice problems. Once again, it is apparent from the figures that the distributions of chainlink counts and chainlink connectivity are predictive of the type of speech problem. In particular, as is evident from Figure 3c, the distribution for normal speech stands out distinctly, as does that for absent speech, although the latter is not as distinctive as the former. Among the other levels, the distributions for apraxic and dysarthric speech appear similar to each other, as do those for impaired and delayed speech.
In order to quantify these differences, we modeled the distributions of chainlink counts and chainlink connectivities for the different severity levels. These distributions have the general characteristics of over-dispersed Poisson distributions and can be modelled as Conway–Maxwell–Poisson (CMP) distributions [67]. The CMP distribution is a two-parameter exponential-family PMF over non-negative integers that takes the form
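$$P(n) = \frac{\lambda^n}{(n!)^\nu} \, \frac{1}{Z(\lambda, \nu)}$$
where $\lambda, \nu > 0$ are the parameters of the distribution and $Z(\lambda, \nu) = \sum_{k=0}^{\infty} \lambda^k / (k!)^\nu$ is the normalizing constant. Given a set of integers, $\lambda$ and $\nu$ can be obtained through a maximum likelihood estimator [68].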
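A minimal sketch of the CMP log-PMF and a crude maximum likelihood fit is given below. It is not the estimator of [68]: the normalizing constant is approximated by a truncated series, and the fit is a plain grid search. The function names, the truncation length, the grids, and the sample counts are all assumptions chosen for illustration.

```python
import math


def cmp_log_z(lam, nu, terms=500):
    """Log of the CMP normalizer, truncated to `terms` terms (log-sum-exp)."""
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms))


def cmp_log_pmf(n, lam, nu):
    """log P(n) = n*log(lam) - nu*log(n!) - log Z(lam, nu)."""
    return n * math.log(lam) - nu * math.lgamma(n + 1) - cmp_log_z(lam, nu)


def fit_cmp(data, lam_grid, nu_grid):
    """Grid-search maximum likelihood using the two sufficient statistics."""
    s1 = sum(data)                                   # sum of n
    s2 = sum(math.lgamma(n + 1) for n in data)       # sum of log(n!)
    best = None
    for lam in lam_grid:
        for nu in nu_grid:
            # Truncation underestimates Z when lam**(1/nu) nears `terms`;
            # such grid points already have very low likelihood for small counts.
            ll = s1 * math.log(lam) - nu * s2 - len(data) * cmp_log_z(lam, nu)
            if best is None or ll > best[0]:
                best = (ll, lam, nu)
    return best[1], best[2]


data = [3, 5, 2, 8, 4, 6, 3, 7]                  # hypothetical counts
lam_grid = [0.5 * k for k in range(1, 41)]       # 0.5 .. 20.0
nu_grid = [0.1 * k for k in range(1, 21)]        # 0.1 .. 2.0
lam_hat, nu_hat = fit_cmp(data, lam_grid, nu_grid)
print(lam_hat, nu_hat)
```

With $\nu = 1$ the CMP distribution reduces to the Poisson distribution, which gives a convenient sanity check on the implementation. A numerical optimizer would normally replace the grid search in practice.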
Figure 4 shows the maximum likelihood estimates of the probability distributions of chainlink counts (Figure 4a) and connectivities (Figure 4b) for syndromes associated with speech problems of different severity levels. These, too, follow the visible trends of Figure 3, where the distributions of both chainlink counts and chainlink connectivities for syndromes associated with the two extreme conditions, normal and absent speech, are distinct from those for other types of problems.
To quantify the differences in the distributions, we define the code distance between two sets of integers $C_i = \{n_1, \ldots, n_i\}$ and $C_j = \{m_1, \ldots, m_j\}$ as the excess number of bits required to encode them if each set is encoded using the optimal code for the other set rather than itself:
$$D(P_i, P_j) = \sum_{n \in C_i} \log_2 \frac{P_i(n)}{P_j(n)} + \sum_{n \in C_j} \log_2 \frac{P_j(n)}{P_i(n)}$$
where $P_i(\cdot)$ and $P_j(\cdot)$ are the estimates of the PMFs for $C_i$ and $C_j$, respectively. In our case, we choose the maximum likelihood estimates of the CMP distributions for the sets to compute this metric.
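The code distance can be sketched as follows, taking the verbal definition literally: for each set, sum the extra bits paid per element when the other set's model is used in place of its own. To keep the example self-contained, a maximum likelihood Poisson fit stands in for the CMP fit used in the paper; that substitution, the function names, and the sample values are assumptions for illustration.

```python
import math


def poisson_log2_pmf(n, rate):
    """log2 of the Poisson PMF at n."""
    return (n * math.log(rate) - rate - math.lgamma(n + 1)) / math.log(2)


def fitted_poisson(sample):
    """ML Poisson fit (rate = sample mean), returned as a log2-PMF function."""
    rate = sum(sample) / len(sample)
    return lambda n: poisson_log2_pmf(n, rate)


def code_distance(c_i, c_j, log2_p_i, log2_p_j):
    """Excess bits when each set is coded with the other set's model."""
    excess_i = sum(log2_p_i(n) - log2_p_j(n) for n in c_i)  # C_i coded with P_j
    excess_j = sum(log2_p_j(n) - log2_p_i(n) for n in c_j)  # C_j coded with P_i
    return excess_i + excess_j


c_i = [30, 35, 28, 40]   # hypothetical chainlink counts for one symptom type
c_j = [55, 62, 70, 58]   # hypothetical chainlink counts for another
log2_p_i = fitted_poisson(c_i)
log2_p_j = fitted_poisson(c_j)
d = code_distance(c_i, c_j, log2_p_i, log2_p_j)
print(round(d, 2))
```

Because each model is the maximum likelihood fit to its own set within the same family, neither excess term can be negative, and the measure is symmetric in its two arguments.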
Table 3a shows the code distances between the chainlink counts for different types of speech problems. Table 3b shows the same for their chainlink connectivities. In both cases, we observe that the distributions for normal speech stand clearly apart from those for the other types of speech problems. The distributions for fully absent speech, too, are distinct from those for other problem types. Among apraxic, dysarthric, impaired, and delayed speech, the differences between the distributions of adjacent degrees of severity are minimal; however, the distances show a distinct increasing trend with increases in the degree of impairment.
Overall, from the above analysis, the properties of the level-2 chains of a syndrome appear predictive of the degree of the speech problems associated with it. Our analysis has considered the chainlink counts and connectivities independently, and each of them shows this behavior. A joint analysis of both may show stronger dependencies.
Most importantly, note that in all of the analysis above, we have ignored the secondary effects of other issues, such as intellectual disability and craniofacial anomalies, a highly simplifying assumption. A more correct information measure that takes these into account is expected to show even stronger relationships between the level and degree of connectivity of a syndrome to the FOXP2 pathway and its effect on speech.

4.3. Why Are There No Instances of Missing Voice Chains?

Are voice chains redundant? The fact that there are no missing voice chains is easily explainable. The reason is linked to the size of the syndromic regions. To understand this, consider the following facts.
Our database comprises 4319 unique pathways. A total of 1205 of these pathways are linked to the pathway that carries the FOXP2 gene, and collectively, these include 11,746 genes. Thus, a randomly chosen gene from the entire human genome of 42.7k genes (as in the HGNC Human Genome database) has a 27.5% chance (11,746 out of about 42,700) of being on a pathway that links to the FOXP2-containing pathway, i.e., of being a level-2 chainlink gene.
The shortest microdeletion considered (2q23.1) includes 9 genes, each of which has about a 27% chance of being a level-2 chainlink gene. The syndrome itself then has a 94.44% probability of having a level-2 voice chain purely by chance, that is, one minus the probability that none of its 9 genes is a chainlink gene. The second shortest microdeletion includes 26 genes and has a 99.98% probability of having a level-2 voice chain by chance. The remaining microdeletions are larger (in terms of the number of genes), and it is virtually impossible for them not to have a level-2 chain.
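The chance calculation can be reproduced with a few lines. The sketch assumes, as the argument does, that each gene in a region is independently a level-2 chainlink gene with the same probability; the function name is hypothetical, and the per-gene probability is taken from the gene counts quoted above.

```python
def p_chain_by_chance(n_genes, p_gene):
    """Probability that at least one of n_genes is a chainlink gene,
    assuming each gene is one independently with probability p_gene."""
    return 1.0 - (1.0 - p_gene) ** n_genes


p_gene = 11746 / 42700           # about 0.275, from the counts quoted above
p_9 = p_chain_by_chance(9, p_gene)
p_26 = p_chain_by_chance(26, p_gene)
print(f"{100 * p_9:.2f}%  {100 * p_26:.2f}%")
```

The results land close to the percentages quoted in the text; small differences in the second decimal place depend on how the per-gene probability is rounded.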
As a result, it is realistic to expect that, as a consequence of the density with which the FOXP2-containing pathway is linked to other pathways, any syndrome arising from genetic aberrations that include even a moderately sized set of genes will have an effect on voice. It remains a plausible hypothesis that any factor that influences gene function has at least some chance of ultimately affecting voice (for example, at least a 27% chance within the boundaries of the example presented in this paper).
The above argument assumes that the genes in a microdeletion region are randomly chosen. The mean of the fraction of genes in a microdeletion that appear in any voice chain is observed to be 28.79% with a variance of 0.014, indicating concordance with the assumption of randomness. A secondary implication is that whether a gene is a chainlink gene is independent of whether adjacent genes in the same cytogenetic region are chainlink genes.

4.4. Ancillary Observations

Some ancillary observations that emerge from this study are worth noting. These are mentioned briefly below.
  • For each syndrome, some genes have been identified as largely responsible for the syndrome’s effect on the individual; these are the implicated genes. Of the syndromes for which there is information about implicated genes, we see that in only 8 syndromes (2p16.1–p15, 2q23.1, 9p24.3, 11q23, 13q12.3, 17q23.1–q23.2, 19p13.13, and Yq11) do none of the implicated genes appear in the two levels of voice chains shown. In all other cases, the implicated genes impact FOXP2 pathways and are likely to have a bearing on speech anomalies. We have noted earlier that FOXP2 is not the only gene known to be related to voice production. If we had chosen some other gene as an example in this paper (instead of FOXP2), it is likely that the implicated genes for the 8 exceptions mentioned above would appear as chainlink genes (while some others may not). This hypothesis can be easily tested in corresponding experiments.
  • Identifying candidate genes for further investigation: Using only chainlink genes that appear on level-1 chains as illustrative examples (see Table A1 in Appendix A for reference), we see that voice chains can be useful in identifying candidate genes for further investigation in the context of speech issues. Some examples are given below. The likely candidates are written in parentheses, while the already implicated genes are indicated in bold:
    • 1p36 (ARID1A): Although not implicated for this syndrome in studies so far, ARID1A is located in 1p36.11, a region frequently deleted in human cancers [69]. Disruption in its function may lead to the co-occurrence of oncological and speech issues. This hypothesis is verifiable.
    • 5q35.3 (NSD1): The gene NSD1 appears in a level-1 chain and is also an implicated gene. Ideally, this should not be a candidate for further investigation. However, paradoxically, while effects on speech are expected, the literature reports normal speech for some subjects for this case. This may be a result of biased sampling (the more severe cases may not be conducive to life due to other concurrent severe symptoms, which is a common occurrence in microdeletion syndromes; in some cases, only mosaic individuals survive). This warrants some investigation.
    • 11p15.5 (HRAS): Although not implicated, and although two studies cited under OMIM: 130650 for this syndrome explicitly mention HRAS as not significant, HRAS has nevertheless been independently found to be extremely significant in RASopathy and cancer studies, e.g., [70]. Its role in this syndrome needs to be re-evaluated given its influence on 347 biological pathways and its strong influence on speech.
    • 16p11.2 and 16p12.2–16p11.2 (SRCAP): Although SRCAP is not implicated, it connects to only one other pathway in the ensemble, and that is the ACC pathway of FOXP2. The effects on speech are expected to be strong if this gene is aberrant, and further investigations may find it to be implicated.
    • 17p13.1 (KDM6B): Speech is absent in this syndrome. The gene TP53 is implicated, which also appears at level-1 and is associated with 206 pathways. KDM6B is the only other gene in the level-1 voice chains and connects to only 8 other pathways. It is likely that this gene also plays a strong role in influencing speech and merits investigation.
    • 17q12 (ERBB2): The gene ERBB2 is associated with 124 pathways. It is a well-known oncogene [71], in that perturbations in its function have been observed to have deleterious effects. If it is also connected to FOXP2, then its appearance in the voice chain allows a surprising hypothesis—that biomarkers of some oncological conditions may also be present in voice.
    • 19p13.3 (MAP2K2, UHRF1): MAP2K2 and UHRF1 are not implicated. However, their appearance as level-1 chainlink genes warrants investigation, especially for MAP2K2, which influences 257 pathways. Prompted by this, a literature search did reveal that MAP2K2 has been implicated in this syndrome recently [72], although this is not on the OMIM records, which were largely consulted for this study.
    • 22q13.3 (BRD1): The gene BRD1 is not implicated and appears in 9 pathways only, but the effect on speech is severe in this syndrome. This warrants the investigation of BRD1 independently in relation to speech characteristics. A literature search reveals that BRD1 is indeed strongly associated with brain development and susceptibility to both schizophrenia and bipolar affective disorder [73], and consequent effects on speech are highly likely.
    • Xp11.22 (SMC1A): Although SMC1A is not implicated, it appears in 33 pathways. The speech issues are severe and the gene warrants investigation for this effect. A recent report in the literature has implicated it in severe intellectual disability and therapy-resistant epilepsy in females [74]. The former is known to be associated with severe speech anomalies.
    • Xp11.3 (KDM6A): Although not implicated, KDM6A warrants investigation. In the literature, it is independently known to be associated with delayed speech and psychomotor development [75].
  • Expression of speech characteristics: The observation that deletions of genes on all chromosomes ultimately result in the expression of speech anomalies carries significance. From a much broader perspective, this suggests that the effect on speech may be supported by the action of multiple concurrent biological pathways. There may be no single gene or genes (on select chromosomes) that code for speech capabilities per se, and FOXP2 may be one of a few genes that consolidate and regulate the speech- and language-related emergent effects. It may be that genes directly code for structural elements in the range of phenotypes, while other properties, such as speech and language abilities, are emergent from the coordination of these (and epigenetic) factors.
    A more prosaic argument for this can also be presented. Within the ensemble of syndromes analyzed, there are three kinds of cause-and-effect relationships: (a) syndromes that alter physical structures of the vocal tract (e.g., craniofacial anomalies that include cleft palate, changes in lip shape, etc.), which adversely affect the biomechanical aspects of voice and speech production, (b) syndromes in which auditory and motor functions are compromised, and (c) syndromes that affect the normal functions of the brain, causing cognitive, learning, memory, and other issues that are, in turn, likely to lead to speech problems. In no case do we see speech aberrations in isolation from these factors. The associations between speech and other expressed factors have, in fact, been ubiquitously observed, e.g., [48]. This may support the hypothesis that speech abilities are likely to be emergent from an ensemble of factors (including other phenotypes), rather than expressed directly by any “speech” gene.

5. Conclusions

The hypothesis that the existence of voice chains is correlated with speech characteristics is adequately validated by the statistical analysis presented in this paper. The analysis also shows that the level of voice chains is correlated positively with the severity of speech problems. Based on this, a simple information measure has been suggested to rank-order the effects of specific sets of voice chains on speech. We also see how the methodology presented can potentially provide leads to specific genes that might be candidates for further investigation in the context of speech issues and microdeletion syndromes. While the example of chromosomal microdeletion syndromes used for this paper is very specific, the methodology itself may be easily generalized and extended to reveal the potential effects on speech and voice of other diseases with a genetic basis and of other factors that influence gene function in some manner, as well as (in further refinements of the analysis) their effects on specific voice qualities and characteristics. As a specific suggestion, one exercise that would allow for a more comprehensive analysis would be to explore the entire human genome database to identify which genes are connected via voice chains (and to what level), as well as whether or not there have been corresponding effects on voice reported in the biomedical literature. In cases where large amounts of data are available, one could also explore such connections in an entirely data-driven manner, using AI-based biomarker discovery mechanisms.

Funding

This research was funded in part by the U.S. Army Research Office and the U.S. Army Futures Command Grant No. W911NF-20-D-0002. Its content does not reflect the position or the policy of the U.S. Army and no official endorsement should be inferred.

Institutional Review Board Statement

Ethical review and approval were waived for this study since the data used for this research was obtained from public sources. No human subjects research was explicitly carried out for this work.

Data Availability Statement

The code required to reproduce the results in this paper is archived for public use at https://datadryad.org/ (accessed on 21 September 2022) under the title “Data for Connecting voice profiling to genomics”.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

This table lists the voice chains found for a set of 76 documented microdeletion syndromes. This range excludes chromosome 21, for which sufficient documentation was not found in the literature. The analysis was conducted for voice chains up to level 2. The format of each row in this table is as follows:
First column:
  • In each row, the top left entry is the microdeletion syndrome.
  • Below this, on the left, is the OMIM record for the cytogenetic region of the syndrome.
  • Below the OMIM record, the common names by which the syndrome is referred to in the medical literature are listed.
  • Below the list of common names, the sets of chainlink genes that form level-1 and level-2 voice chains (denoted {VN_1} and {VN_2}, respectively), as found by the proposed algorithm, are listed. These sets are listed only if they are present.
  • For each voice chain listed, the first entry denotes the level of the voice chain. Following this is the list of chainlink genes that belong to the corresponding microdeletion region, which are linked to FOXP2 through the corresponding voice chain level.
  • For each gene, the number of pathways that a gene connects to (in general, inclusive of connections to the ACC pathway of FOXP2) is written as its subscript.
  • Not all chainlink genes are named. For brevity, the first 10 genes with the greatest number of links are listed, and the number of remaining genes is indicated in parentheses. At the end, the genes with a single pathway link are explicitly listed; they are counted among the remaining genes. Thus, for example, the level-2 voice chain for the syndrome 2p16.1–p15 has a total chainlink gene count of 17 (the 10 genes listed first plus 7 more), including all genes that are explicitly listed in the table.
  • In each row, the genes in the voice chains that have been implicated for the syndrome’s phenotypic expression in prior studies are shown in parentheses on the top right. The genes that are also present as chainlink genes in the voice chains found for the syndrome are shown in bold font.
Second column:
  • The second column in this table lists the corresponding speech characteristics, with references. Where no reference is cited, the information is found in the OMIM record for the syndrome (where possible, OMIM references are used for brevity).
Table A1. Chainlink genes for level-1 and level-2 voice chain ensembles for 76 chromosomal microdeletion syndromes. Each chain forms a link from the gene shown to the ACC biological pathway of FOXP2 and has been automatically derived. The total number of pathways that each gene influences independently is shown as a subscript to its name. Observed phenotypic effects on speech are given in the last column. When references are not cited, the information reflects that in the OMIM records for the syndrome.
Syndrome (Implicated Genes from Microarray Studies) Reported Effects on Speech
Other Information
Voice Chains and Their Member Chainlink Genes
Chromosome 1 deletions
1p36(Multiple incl. RERE, SPEN) Delayed [9] or absent speech
OMIM: 607872; Cytogenetic location: 1p36; Genomic coordinates (GRCh38): 1:23,600,000–27,600,000
Names: Chromosome 1p36 deletion syndrome; monosomy 1p36 syndrome
{VN_1}: ARID1A_{10}
{VN_2}: PIK3CD_{168} MTOR_{146} GNB1_{140} CDC42_{140} CASP9_{108} RPS6KA1_{93} PRKCZ_{75} CNKSR1_{57} SFN_{55} DVL1_{48} (216 more) SSU72_{1} RBP7_{1} PLEKHN1_{1} PLEKHM2_{1} PEX10_{1} NBL1_{1} MTCO3P12_{1} MMP23B_{1} MMP23A_{1} HSPB7_{1} HMGN2_{1} GPR3_{1} EXOSC10_{1} DISP3_{1} CROCCP2_{1} CAMTA1_{1} AHDC1_{1} ACAP3_{1}
1q21.1–q21.2(Multiple incl. RBM8A, GJAS) Delayed [10] or impaired [11] speech
OMIM: 612474/274000; Cytogenetic location: 1q21.1 Genomic coordinates (GRCh38): 1:143,200,000–147,500,000
Names: Thrombocytopenia with absent radius (TAR) syndrome (OMIM 274000)
{VN_2}: H4C14_{67} H4C15_{66} H2BC21_{63} PRKAB2_{57} H3C15_{56} H3C14_{56} H3C13_{56} H2AC20_{53} H2AC19_{53} H2AC18_{53} (31 more)
1q41–q42(Multiple incl. DISP1, HPE10, LEFTY1, LEFTY2, WDR26, TSEN2, BPNT1) Apraxia [12]
OMIM: 612530; Cytogenetic location: 1q41–q42 Genomic coordinates (GRCh38): 1:214,400,000–236,400,000
Names: Chromosome 1q41–q42 deletion syndrome; holoprosencephaly 10; HPE10
{VN_2}: NUP133_{88} H2BU1_{63} TGFB2_{62} DUSP10_{54} H3-3A_{50} WNT3A_{47} ARF1_{46} PARP1_{41} MIR3620_{38} PSEN2_{36} (50 more) DISP1_{2} MIR215_{1} MIR194-1_{1} DNAH14_{1} CDC42BPA_{1}
1q43–q44(AKT3, ZBTB18) Delayed, impaired, or absent speech
OMIM: 612337
Names: Mental retardation, autosomal dominant 22; MRD22; chromosome 1q43–q44 deletion syndrome (included); chromosome 1qter deletion syndrome (included)
{VN_2}: AKT3_{169} RYR2_{74} ACTN2_{69} CHRM3_{42} MTR_{38} FH_{31} ADSS2_{30} EXO1_{26} RGS7_{16} KMO_{15} (68 more) FMN2_{1}
Chromosome 2 deletions
2p16.1–p15(Multiple incl. BCL11A) Dysarthria, apraxia, or impaired speech [13]
OMIM: 612513; Cytogenetic location: 2p16.1–p15 Genomic coordinates (GRCh38): 2:54,700,000–63,900,000
Names: Chromosome 2p16.1–p15 deletion syndrome
{VN_2}: RPS27A_{293} XPO1_{56} UGP2_{33} MDH1_{27} REL_{23} CCT4_{14} RTN4_{13} B3GNT2_{12} VRK2_{10} USP34_{7} (7 more) OTX1_{1} CCDC88A_{1}
2p21(Multiple incl. SLC3A1, PREPL) Apraxia or idiosyncratic speech [14]
OMIM: 606407; Cytogenetic location: 2p21 Genomic coordinates (GRCh38): 2:41,500,000–47,500,000
Names: Hypotonia–cystinuria syndrome; cystinuria with mitochondrial disease; homozygous 2p21 deletion syndrome
{VN_2}: CALM2_{234} PRKCE_{73} SLC3A1_{36} HAAO_{22} ATP6V1E2_{19} ABCG5_{18} MSH2_{17} ABCG8_{17} EPAS1_{16} COX7A2L_{15} (14 more) SIX3_{1} EML4_{1}
2q23.1(MBD5) Delayed or impaired speech
OMIM: 156200
Names: Mental retardation, autosomal dominant 1; MRD1; chromosome 2q23.1 deletion syndrome
{VN_2}: ORC4_{29} KIF5C_{2}
2q32–q33(SATB2) (HOXD cluster and regulatory elements, COL3A1, COL5A2, GTF3C3, CASP8, CASP10) Absent speech
OMIM: 612313
Names: Glass syndrome; GLASS; chromosome 2q32–q33 deletion syndrome; SATB2-associated syndrome
{VN_2}: CREB1_{203} STAT1_{130} CASP8_{103} NUP35_{79} ITGAV_{64} CD28_{60} SUMO1_{52} AOX1_{50} FZD5_{43} FZD7_{37} (52 more) TMEFF2_{1} KLF7_{1} HSPE1_{1} DUSP19_{1} DNAH7_{1}
2q37.3(HDAC4) Impaired speech [15]
OMIM: 600430 (2q37); Cytogenetic location: 2q37 Genomic coordinates (GRCh38): 2:236,400,000–242,193,529
Names: Chromosome 2q37 deletion syndrome, brachydactyly–intellectual disability syndrome; Albright hereditary osteodystrophy-like syndrome Type 3
{VN_2}: HDAC4_{33} GPC1_{27} AGXT_{25} COL6A3_{19} ACKR3_{14} NEU4_{13} NDUFA10_{12} PRLH_{11} PER2_{11} DTYMK_{11} (28 more) TWIST2_{1} ILKAP_{1}
Chromosome 3 deletions
3p13(FOXP1) Delayed, idiosyncratic, and impaired speech and dysarthria (all severe), apraxia [16]
OMIM: 613670
Names: Mental retardation with language impairment with or without autistic features
{VN_2}: MITF_{19} PROK2_{8} PPP4R2_{5} GPR27_{5} EIF4E3_{5} FOXP1_{3} GXYLT2_{2} RYBP_{1}
3q13.31(DRD3, ZBTB20, GAP43, LSAMP) Impaired [17] or absent speech
OMIM: 615433; Cytogenetic location: 3q13.31 Genomic coordinates (GRCh38): 3:113,700,000–117,600,000
Names: Chromosome 3q13.31 deletion syndrome
{VN_2}: DRD3_{22} ATP6V1A_{22} GAP43_{6} QTRT2_{3} LSAMP_{3}
3q29(Multiple incl. PAK2, DLG1) Delayed speech
OMIM: 609425; Cytogenetic location: 3q29 Genomic coordinates (GRCh38): 3:192,600,000–198,295,559
Names: Chromosome 3q29 deletion syndrome; microdeletion 3q29 syndrome; 3qter deletion syndrome
{VN_2}: PAK2_{74} DLG1_{71} NCBP2_{53} HES1_{39} RPL35A_{22} TFRC_{21} RNF168_{18} BDH1_{17} PCYT1A_{15} MUC20_{13} (14 more) FBXO45_{1}
Chromosome 4 deletions
4p16.3(Multiple incl. FGFR3, MSX1) Delayed [18] or absent speech
OMIM: 194190; Cytogenetic location: 4p16.3 Genomic coordinates (GRCh38): 4:0–4,500,000
Names: Wolf–Hirschhorn syndrome; Pitt–Rogers–Danks syndrome; Pitt syndrome; Wittwer syndrome; Dillan 4p syndrome
{VN_1}: CTBP1_{25}
{VN_2}: FGFR3_{109} TNIP2_{32} NELFA_{27} MIR943_{26} CTBP1_{25} ADRA2C_{22} NSD2_{19} GRK4_{15} DGKQ_{14} PDE6B_{13} HAUS3_{13} SLC26A1_{12} SLBP_{12} ATP5ME_{12} RNF4_{11} CPLX1_{11} RGS12_{10} (27 more) ZFYVE28_{1} PCGF3_{1} NSG1_{1} MXD4_{1} ABCA11P_{1}
4q21(Multiple) Delayed or absent speech; impaired speech [19,20]
OMIM: 613509; Cytogenetic location: 4q21 Genomic coordinates (GRCh38): 4:86,000,000–87,100,000
Names: Chromosome 4q21 deletion syndrome
{VN_2}: MAPK10_{122} FGF5_{118} NUP54_{79} PAQR3_{47} PRKG2_{27} CXCL10_{22} CXCL9_{18} ABRAXAS1_{18} SEC31A_{16} CXCL11_{16} (31 more) SHROOM3_{1} SEPTIN11_{1} HNRNPDL_{1} G3BP2_{1} BMP2K_{1} AFF1_{1}
Chromosome 5 deletions
5p (5p15.2 and/or (5p15.3 or 5p15.33))(Multiple incl. TERT, CTNND2) Delayed or absent speech and apraxia [21]
OMIM: 123450
Names: Chromosome 5p deletion syndrome; cri-du-chat syndrome; cat cry syndrome; Lejeune syndrome; 5p monosomy; partial monosomy 5
{VN_2}: ADCY2_{116} SLC6A3_{73} SDHA_{37} TERT_{24} TRIO_{23} MTRR_{19} LPCAT1_{15} CEP72_{15} SRD5A1_{14} SLC9A3_{13} (27 more) OTULINL_{1} CTNND2_{1} CLPTM1L_{1}
5q14.3(MEF2C) Absent speech
OMIM: 612881 (distal version 5q14.3–q15); Cytogenetic location: 5q14.3–q15 Genomic coordinates (GRCh38): 5:83,500,000–98,900,000
Names: Distal chromosome 5q14.3 deletion syndrome; periventricular heterotopia associated with chromosome 5q deletion; periventricular nodular heterotopia 5; PVNH5
{VN_2}: RASA1_{89} CCNH_{74} MEF2C_{70} POLR3G_{17} COX7C_{15} MIR9-2_{7} EDIL3_{3}
5q33.1(RPS14) Dysarthria [22]
OMIM: 153550
Names: Chromosome 5q deletion syndrome; 5q syndrome; refractory macrocytic anemia due to 5q deletion; MAR
{VN_2}: RPS14_{29} GPX3_{18} DCTN4_{13} SPARC_{12} SLC36A1_{12} NMUR2_{11} CD74_{10} NDST1_{9} GM2A_{8} TNIP1_{6} (7 more) SYNPO_{1} IRGM_{1}
5q35.3(NSD1) Normal [23] or delayed speech
OMIM: 117550
Names: Sotos syndrome 1; Sotos1; Nevo syndrome; cerebral gigantism, Nevo type; chromosome 5q35 deletion syndrome
{VN_1}: NSD1_{8}
{VN_2}: MAPK9_{167} LTC4S_{53} F12_{33} SQSTM1_{30} MAML1_{29} RACK1_{25} FLT4_{23} CANX_{22} GRK6_{20} GRM6_{16} (37 more) CLK4_{1}
Chromosome 6 deletions
6pter–p24(Multiple incl. FOXC1, GMDS) Delayed speech
OMIM: 612582; Cytogenetic location: 6pter–p24 Genomic coordinates (GRCh38): 6:0–13,400,000
Names: Chromosome 6pter-p24 deletion syndrome
{VN_2}: RIPK1_{83} EDN1_{43} F13A1_{34} TFAP2A_{19} DSP_{18} BMP6_{17} IRF4_{15} TUBB2A_{14} GMDS_{13} ELOVL2_{13} (31 more) PAK1IP1_{1} MAK_{1} FOXQ1_{1} DUSP22_{1} C6orf201_{1}
6q25.3(Multiple incl. ARID1B) Delayed speech, apraxia, dysarthria [24]
OMIM: N.A.
{VN_2}: SLC22A2_{53} GTF2H5_{50} ACAT2_{49} SLC22A1_{24} EZR_{24} SOD2_{20} IGF2R_{20} SYNJ2_{18} SLC22A3_{16} TCP1_{13} (11 more)
Chromosome 7 deletions
7p21(TWIST1 (7p21.1)) Delayed speech [25]
OMIM: 101400
Names: Saethre–Chotzen syndrome; acrocephalosyndactyly III; ACS3; ACS III; Chotzen syndrome; acrocephaly, skull asymmetry, and mild syndactyly
{VN_2}: RPA3_{74} PRPS1L1_{30} HDAC9_{26} ITGB8_{18} AHR_{18} POLR1F_{16} NDUFA4_{15} DGKB_{14} TWIST1_{9} THSD7A_{4} (8 more) MIOS_{1} MEOX2_{1} ARL4A_{1}
7q11.23(Multiple incl. ELN, LIMK1, GTF2IRD1, GTF2I) Normal or delayed speech
OMIM: 194050; Cytogenetic location: 7q11.23 Genomic coordinates (GRCh38): 7:72,700,000–77,900,000
Names: Williams syndrome; WS; WMS; chromosome 7q11.23 deletion syndrome, 1.5- to 1.8-MB; Williams–Beuren syndrome; WBS
{VN_2}: YWHAG_{64} POM121_{59} RFC2_{57} MDH2_{42} LIMK1_{34} HSPB1_{31} STX1A_{30} FZD9_{30} NCF1_{29} POR_{15} (25 more) ABHD11_{1}
Chromosome 8 deletions
8p23.1(GATA4) No significant speech anomaly reports
OMIM: 179613 (not exclusive to syndrome)
Names: Recombinant chromosome 8 syndrome; REC8 syndrome; chromosome 8q22.1-qter duplication and 8pter-p23.1 deletion; San Luis Valley syndrome
{VN_2}: FDFT1_{44} CTSB_{17} BLK_{17} TNKS_{16} GATA4_{15} AGPAT5_{14} ANGPT2_{12} MIR124-1_{10} NEIL2_{9} CLDN23_{8} (39 more) PINX1_{1} MIR598_{1}
8q22.1(CCNE2, TMEM67, FAM92A1) Delayed speech
OMIM: 608156; Cytogenetic location: 8q22.1 Genomic coordinates (GRCh38): 8:92,300,000–97,900,000
Names: Nablus mask-like facial syndrome; NMLFS; chromosome 8q22.1 deletion syndrome
{VN_2}: CCNE2_{45} SDC2_{34} TP53INP1_{13} GDF6_{12} UQCRB_{11} PTDSS1_{9} ESRP1_{8} PDP1_{5} NDUFAF6_{5} CDH17_{4} (3 more)
8q24.11-q24.13(TRPS1, EXT1) Delayed speech
OMIM: 150230; Cytogenetic location: 8q24.11–q24.13 Genomic coordinates (GRCh38): 8:116,700,000–126,300,000
Names: Langer–Giedion syndrome; LGS; chromosome 8q24.1 deletion syndrome; trichorhinophalangeal syndrome type II; TRPS2
{VN_2}: SQLE_{43} TAF2_{20} RAD21_{20} MIR3610_{19} TNFRSF11B_{15} DERL1_{13} NDUFB9_{12} EXT1_{12} SLC30A8_{10} FBXO32_{8} (11 more) WASHC5_{1} FAM91A1_{1} DSCC1_{1}
Chromosome 9 deletions
9p24.3(DMRT1, DMRT2) Delayed speech, dysarthria, apraxia [26]
OMIM: 154230; Cytogenetic location: 9p24.3 Genomic coordinates (GRCh38): 9:0–2,200,000
Names: 46,XY sex reversal 4; SRXY4; 46,XY gonadal dysgenesis, partial or complete, with 9p24.3 deletion; chromosome 9p24.3 deletion syndrome
{VN_1}: SMARCA2_{9}
{VN_2}: SMARCA2_{9} DOCK8_{3} WASHC1_{1}
9q34.3(EHMT1) Delayed speech [27], Apraxia [28] and Absent speech
OMIM: 610253
Names: Kleefstra syndrome; 9q syndrome; 9q subtelomeric deletion syndrome; chromosome 9q34.3 deletion syndrome
{VN_1}: NOTCH1_{62}
{VN_2}: GRIN1_{139} TRAF2_{78} NOTCH1_{62} PTGDS_{51} ANAPC2_{37} TUBB4B_{28} NELFB_{27} ENTPD8_{25} CACNA1B_{20} AGPAT2_{20} (36 more) UAP1L1_{1} MIR602_{1} FUT7_{1} CYSRT1_{1}
Chromosome 10 deletions
10pter–p13 or 10p14–p15.1(Multiple incl. GATA3) Sensorineural hearing loss
OMIM: 146255
Names: Barakat syndrome; hypoparathyroidism, sensorineural deafness, and renal disease/dysplasia syndrome; HDRS; nephrosis, nerve deafness, and hypoparathyroidism
{VN_2}: IL2RA_{83} PRKCQ_{71} AKR1C3_{70} IDI1_{40} CALML5_{36} AKR1C4_{35} CALML3_{33} GATA3_{27} NUDT5_{22} TAF3_{21} (39 more) ZMYND11_{1} CELF2_{1}
10q23(PTEN, BMPR1A) Delayed or absent speech; impaired speech [29]
OMIM: 612242 (10q22.3–q23.2)
Names: Chromosome 10q22.3–q23.2 deletion syndrome
{VN_1}: PTEN_{107}
{VN_2}: PTEN_{107} CYP2C8_{100} NRG3_{85} CYP2C9_{78} FAS_{70} CYP2C19_{66} GLUD1_{46} LIPA_{39} BMPR1A_{28} PLCE1_{25} (49 more) PDLIM1_{1} PCGF5_{1} MMRN2_{1} IFIT2_{1}
10q26(HMX3, DOCK1, C10ORF90) Impaired speech; delayed speech [30]
OMIM: 609625; Cytogenetic location: 10q26 Genomic coordinates (GRCh38): 10:128,800,000–133,797,422
Names: Chromosome 10q26 deletion syndrome; terminal chromosome 10q26 deletion syndrome
{VN_2}: FGFR2_{113} CYP2E1_{91} ECHS1_{88} ACADSB_{41} DOCK1_{33} BUB3_{28} OAT_{21} RPL21P16_{20} EIF3A_{19} GRK5_{17} (38 more) RAB11FIP2_{1} MKI67_{1} EBF3_{1} CUZD1_{1} CALY_{1}
Chromosome 11 deletions
11p11.2–p12(EXT2, ALX4) Idiosyncratic speech, dysarthria, delayed speech, apraxia [31]
OMIM: 601224 (11p11.2); Cytogenetic location: 11p11.2 Genomic coordinates (GRCh38): 11:43,400,000–48,800,000
Names: Potocki–Shaffer syndrome; PSS; chromosome 11p11.2 deletion syndrome; proximal 11p deletion syndrome; DEFECT11 syndrome
{VN_2}: TRAF6_{134} NUP160_{88} F2_{85} PSMC3_{30} NR1H3_{26} CREB3L1_{26} CKAP5_{25} DDB2_{22} SPI1_{19} HSD17B12_{19} (36 more)
11p13–p12(Multiple incl. WT1, PAX6, BDNF) (SLC1A2, PRRG4) Impaired speech [32]
OMIM: 612469; Cytogenetic location: 11p13–p12 Genomic coordinates (GRCh38): 11:31,000,000–43,400,000
Names: Wilms tumor, aniridia, genitourinary anomalies, mental retardation and obesity syndrome; WAGRO; WAGRO syndrome; WAGR syndrome with obesity; chromosome 11p13-p12 deletion syndrome
{VN_2}: TRAF6_{134} CAT_{37} CD44_{27} CD59_{19} CSTF3_{16} SLC1A2_{14} EIF3M_{14} PDHX_{12} WT1_{10} APIP_{10} (13 more) PRR5L_{1} ELF5_{1}
11p15.5(Multiple incl. CDKN1C, H19, IGF2) Impaired speech [33]
OMIM: 130650
Names: Beckwith–Wiedemann syndrome; BWS; exomphalos–macroglossia–gigantism syndrome; EMG syndrome; Wiedemann–Beckwith syndrome; WBS
{VN_1}: HRAS_{347}
{VN_2}: HRAS_{347} INS_{103} POLR2L_{85} KCNQ1_{69} DUSP8_{53} TNNT3_{52} TNNI2_{52} AP2A2_{47} IRF7_{39} TH_{30} (40 more) RIC8A_{1} RASSF7_{1} NLRP6_{1} MIR483_{1} KRTAP5-4_{1} KRTAP5-1_{1} DEAF1_{1} CRACR2B_{1} BRSK2_{1}
11q13.3(FGF4, FGF3, FADD) Delayed speech [34]; Impaired speech [35]
OMIM: 166750
Names: Chromosome 11q13 deletion syndrome; otodental syndrome; otodental dysplasia and coloboma due to 11q13.3 microdeletion
{VN_2}: FGF4_{135} CCND1_{122} FGF3_{103} FGF19_{95} FADD_{73} CPT1A_{38} CTTN_{22} PPFIA1_{8} TPCN2_{5} SHANK2_{4} ANO1_{3}
11q23(Multiple incl. FLI1) Dysarthric speech, absent speech [36]; impaired speech [37]
OMIM: 188025; Cytogenetic location: 11q23 Genomic coordinates (GRCh38): 11:114,600,000–121,300,000
Names: Chromosome 11q23 deletion syndrome; thrombocytopenia, Paris–Trousseau type; TCPT; Paris–Trousseau syndrome; 11q terminal deletion syndrome
{ V N 2 } : FXYD2 158 PPP2R1B 145 CBL 104 H2AX 62 NCAM1 56 CD3G 43 CD3D 40 SC5D 39 DLAT 39 APOA1 39 (67 more) ZPR1 1 TMPRSS4 1 TMPRSS13 1 TAGLN 1 SIK3 1 POU2F3 1 MPZL2 1 C11orf52 1
11q23.3-q25(FLI1, BSX, NRGN, FRA11B, JAM3) Impaired speech, delayed speech, apraxia [38]
OMIM: 147791; Cytogenetic location: 11q23.3–q25 Genomic coordinates (GRCh38): 11:114,600,000–135,086,622
Names: Jacobsen syndrome; JBS; Del(11)(qter); distal deletion of 11q; distal monosomy 11q; monosomy 11qter
{ V N 2 } : FXYD2 158 CBL 104 KCNJ5 70 H2AX 62 HSPA8 51 CHEK1 48 CD3G 43 CD3D 40 SC5D 39 APOA1 39 (109 more) ZPR1 1 VPS26B 1 TMPRSS4 1 TMPRSS13 1 TAGLN 1 SLC37A2 1 SIK3 1 POU2F3 1 MPZL2 1 IGSF9B 1 GRAMD1B 1 EI24 1
Chromosome 12 deletions
12q14.3(HMGA2) Absent, impaired [38] or delayed speech [39]
OMIM: 618908
Names: Silver–Russell syndrome 5; SRS5
{ V N 2 } : IRAK3 18 WIF1 15 MIR6502 15 GNS 15 MSRB3 12 LEMD3 10 GRIP1 9  HMGA2 7 CAND1 7
Chromosome 13 deletions
13q12.3(POMP) Delayed speech [40]
OMIM: 601952
Names: Keratosis linearis with ichthyosis congenita and sclerosing keratoderma syndrome; KLICK syndrome
{ V N 2 } : HMGB1 43 FLT1 36 ALOX5AP 13 SLC7A1 11 HSPH1 11 B3GLCT 5
13q14(RB1) Normal speech [41]
OMIM: 613884; Cytogenetic location: 13q14 Genomic coordinates (GRCh38): 13:50,300,000–54,700,000
Names: Chromosome 13q14 deletion syndrome
{ V N 2 } : RB1 94 FOXO1 91 GTF2F2 51 KBTBD7 50 SLC25A15 24 HTR2A 23 TNFSF11 18 LPAR6 16 CYSLTR2 15 DGKH 14 (36 more) WDFY2 1 ITM2B 1 DLEU2 1 DLEU1 1 AKAP11 1
13q22.3(EDNRB) Normal speech [41]
OMIM: 277580
Names: Waardenburg syndrome, Type 4A; WS4A; Waardenburg syndrome with Hirschsprung disease, Type 4A; Waardenburg–Shah syndrome; Shah–Waardenburg syndrome; WS4
{ V N 2 } : EDNRB 26 FBXL3 10 MYCBP2 1
13q33–q34(SOX1, ARHGEF7) Apraxia
OMIM: 619148; Cytogenetic location: 13q33–q34 Genomic coordinates (GRCh38): 13:106,400,000–114,364,328
Names: Chromosome 13q33–q34 deletion syndrome
{ V N 2 } : IRS2 108 RASA3 52 TFDP1 48 F7 46 F10 45 CDC16 37 COL4A1 36 COL4A2 33 ARHGEF7 31 ATP4B 27 (20 more) ING1 1
Chromosome 14 deletions
14q11–q22(Multiple incl. PAX9, SUPT16H, CHD8, RALGAPA1) Delayed [42] or absent speech
OMIM: 613457; Cytogenetic location: 14q11–q22 Genomic coordinates (GRCh38): 14:18,200,000–57,600,000
Names: Chromosome 14q11-q22 deletion syndrome
{ V N 2 } : NFKBIA 168 ADCY4 99 GNG2 89 SOS2 75 POLE2 47 PNP 45 SLC7A7 40 SLC7A8 39 BMP4 33 GZMB 28 (163 more) ZNF219 1 WDHD1 1 TRD 1 TEP1 1 RALGAPA1 1 PAX9 1 NDRG2 1 MIR208A 1 DLGAP5 1 CEBPE 1 BAZ1A 1 AKAP6 1
14q22.1–q23.1(Multiple incl. PTGDR, BMP4) Impaired speech [43]
OMIM: 609640; Cytogenetic location: 14q22.1–q22.3 Genomic coordinates (GRCh38): 14:50,400,000–57,600,000
Names: Frias syndrome; Chromosome 14q22 deletion syndrome; growth deficiency, facial anomalies, and brachydactyly
{ V N 2 } : GNG2 89 MNAT1 70 PRKCH 35 BMP4 33 PSMA3 26 GNPNAT1 25 PELI2 24 PPM1A 23 PYGL 21 PTGER2 20 (31 more) WDHD1 1 SIX6 1 SIX1 1 DLGAP5 1
14q32.2(DLK1, MEG3, RTL1) Delayed speech; idiosyncratic speech [44]
OMIM: 608149 (paternal)/ 616222 (maternal); Cytogenetic location: 14q32 Genomic coordinates (GRCh38): 14:103,500,000–107,043,718
Names: 14q32.2 Kagami–Ogata syndrome; uniparental disomy, paternal, chromosome 14/14q32.2; Temple syndrome; uniparental disomy, maternal, chromosome 14
{ V N 2 } : BDKRB2 32 CCNK 25 BDKRB1 24 YY1 21 MIR6764 20 CYP46A1 19 PAPOLA 18 DEGS2 13 EVL 12 WARS1 10 (12 more) MIR345 1 MIR342 1 MEG3 1
Chromosome 15 deletions
15q11.2(NIPA1, NIPA2, CYFIP1, TUBGCP5) Delayed speech
OMIM: 615656; Cytogenetic location: 15q11.2 Genomic coordinates (GRCh38): 15:20,500,000–25,500,000
Names: Burnside–Butler syndrome; 15q11.2 BP1–BP2 microdeletion
{ V N 2 } : CYFIP1 16 UBE3A 15  TUBGCP5 7 SNURF 7 OR4N4 6 SNRPN 5 OR4N4C 5 OR4M2 5  NIPA2 2  NIPA1 2 NDN 2
15q11–q13(NDN, SNRPN) Delayed or impaired speech
OMIM: 176270 (for 15q11.2)
Names: Prader-Willi syndrome; Prader-Labhart-Willi syndrome; Labhart-Willi syndrome; Prader’s syndrome; Prader-Labhart-Willi-Fanconi syndrome
{ V N 2 } : HERC2 22 TJP1 20 RYR3 16 GABRA5 16 CYFIP1 16 UBE3A 15 GABRB3 15 CHRNA7 14 GABRG3 13 TUBGCP5 7 (20 more) MTMR10 1 HMGN2P5 1 FMN1 1
15q11–q13(UBE3A) Absent speech; impaired speech [45]
OMIM: 105830 (for 15q11.2)
Names: Angelman syndrome; happy puppet syndrome
–Same as above–
15q13.3(CHRNA7, CHRFAM7A, OTUD7A) Impaired or idiosyncratic speech
OMIM: 612001; Cytogenetic location: 15q13.3 Genomic coordinates (GRCh38): 15:30,900,000–33,400,000
Names: Chromosome 15q13.3 microdeletion syndrome
{ V N 2 } : RYR3 16  CHRNA7 14 TRPM1 4 OTUD7A 4 FAN1 4 ARHGAP11A 3 MTMR10 1 FMN1 1
15q24(Multiple incl. SIN3A) Delayed or impaired speech
OMIM: 613406
Names: Witteveen–Kolk syndrome; WITKOS; chromosome 15q24 deletion syndrome (included)
{ V N 2 } : CSK 89 NRG4 82 CYP1A2 79 HCN4 56 CYP1A1 48 PML 32 CYP11A1 32 SIN3A 25 COX5A 16 MPI 15 (28 more) SNX33 1 PTPN9 1 CLK3 1
Chromosome 16 deletions
16p11.2(SH2B1, TBX6, CORO1A) Apraxia [46]; dysarthria [47]; delayed or impaired speech
OMIM: 611913; Cytogenetic location: 16p11.2 Genomic coordinates (GRCh38): 16:28,500,000–35,300,000
Names: Chromosome 16p11.2 deletion syndrome (593 kb) or 16p11.2 deletion syndrome (220 kb)
{ V N 1 } : SRCAP 1
{ V N 2 } : MAPK3 447 ALDOA 46 CD19 38 ITGAM 32 VKORC1 31 SULT1A1 29 ITGAL 25 STX4 22 PYCARD 19 CDIPT 19 (60 more) SRCAP 1 RNF40 1 PYDC1 1 ORAI3 1 MAZ 1 HIRIP3 1 AHSP 1
16p12.2–p11.2(Multiple incl. SH2B1) * [76] Delayed or impaired speech
OMIM: 613604; Cytogenetic location: 16p12.2–p11.2 Genomic coordinates (GRCh38): 16:21,200,000–35,300,000
Names: Chromosome 16p12.2–p11.2 deletion syndrome, 7.1 to 8.7 Mb
{ V N 1 } : SRCAP 1
{ V N 2 } : MAPK3 447 PRKCB 184 PLK1 65 ALDOA 46 CD19 38 SCNN1G 35 SCNN1B 35 TNRC6A 33 ITGAM 32 VKORC1 31 (96 more) XPO6 1 USP31 1 SRCAP 1 RNF40 1 PYDC1 1 ORAI3 1 MAZ 1 IGSF6 1 HIRIP3 1 CLN3 1 CHP2 1 AHSP 1
16p12.1(Multiple) Delayed speech
OMIM: 136570; Cytogenetic location: 16p12 Genomic coordinates (GRCh38): 16:24,200,000–28,500,000
Names: Chromosome 16p12.1 deletion syndrome, 520 kb
{ V N 2 } : TNRC6A 33 IL4R 24 CACNG3 20 EIF3CL 14 HS3ST4 8 GTF3C1 8 SLC5A11 7 IL27 7 AQP8 7 NSMCE1 5 (7 more) XPO6 1 CLN3 1
16p13.11(Multiple incl. MYH11) Delayed speech [48]
OMIM: 619351
Names: 16p13.1 microdeletion predisposing to autism and/or ID; megacystis–microcolon–intestinal hypoperistalsis syndrome 2; MMIHS2
{ V N 2 } : ABCC1 41 NDE1 23 MIR484 23  MYH11 18 RRN3 7 ABCC6 4
16p13.3(CREBBP, DNASE1, TRAP1) Apraxia, dysarthria, impaired or delayed speech [49]
OMIM: 610543; Cytogenetic location: 16p13.3 Genomic coordinates (GRCh38): 16:0–7,800,000
Names: Chromosome 16p13.3 deletion syndrome, proximal; severe Rubinstein–Taybi syndrome (RTS); broad thumb–hallux syndrome; Rubinstein syndrome; Rubinstein–Taybi deletion syndrome; RSTS deletion syndrome
{ V N 1 } : CREBBP 140
{ V N 2 } : PDPK1 173 CREBBP 140 ADCY9 102 TSC2 78 MLST8 69 GNG13 68 CACNA1H 60 AXIN1 49 UBE2I 44 ELOB 40 RPS2 35 (111 more) WDR24 1 TFAP4 1 SOX8 1 SEPTIN12 1 RHBDL1 1 RBFOX1 1 PGP 1 NPRL3 1 NAGPA 1 MIR3176 1 HAGHL 1 GNPTG 1 GLIS2 1 FAHD1 1 E4F1 1 CHTF18 1 BAIAP3 1
16q22(CBFB) –Not available–
OMIM: 614541; Cytogenetic location: 16q22 Genomic coordinates (GRCh38): 16:72,800,000–74,100,000
Names: Chromosome 16q22 deletion syndrome
{ V N 1 } : CMTR2 2
{ V N 2 } : CDH1 60 TRADD 50 SNTB2 50 SLC7A6 42 NFATC3 30 TAT 29 PARD6A 27 E2F4 26 NQO1 25 ST3GAL2 24 (83 more) WWP2 1 PSKH1 1 PKD1L3 1 NUTF2 1 NOB1 1 CHTF8 1 CDH16 1
16q24.3–q24.2(CDH15, ZNF778, ANKRD11, ZFPM1) ∗∗ Delayed or impaired speech [50]
{ V N 2 } : SLC7A5 48 MVD 42 CYBA 28 APRT 28 RPL13 22 CDT1 20 TUBB3 13 MC1R 11 DPEP1 10 TRAPPC2L 9 (18 more) SPIRE2 1 CHMP1A 1
Chromosome 17 deletions
17p11.2(LLGL1, RAI1, UBB) Delayed speech; dysarthria [51]
OMIM: 182290
Names: Smith–Magenis syndrome; chromosome 17p11.2 deletion syndrome
{ V N 2 } : UBB 280 MAP2K3 108 MAPK7 72 ALDH3A1 50 SHMT1 49 ALDH3A2 46 SREBF1 27 TOP3A 25 MIR6778 20 KCNJ12 17 (26 more)
17p13.1(Multiple incl. KCNAB3, GUCY2D, TP53, TRAPPC1, MPDU1, CDG1F, FXR2, FMRP, EFNB3) Absent speech
OMIM: 613776; Cytogenetic location: 17p13.1 Genomic coordinates (GRCh38): 17:6,500,000–10,800,000
Names: Chromosome 17p13.1 deletion syndrome
{ V N 1 } : TP53 206 KDM6B 8
{ V N 2 } : TP53 206 ATP1B2 161 PIK3R5 104 DLG4 76 POLR2A 71 ALOX12 55 ALOX15B 51 ALOX12B 48 DVL2 45 SLC2A4 38 VAMP2 37 (63 more) TNK1 1 SHBG 1 MYH4 1 MYH13 1 MYH1 1 MIR497 1 MIR324 1 DNAH2 1
17p13.3(LIS1, PAFAH1B1, YWHAE) †‡ Delayed speech [52]
OMIM: 247200; Cytogenetic location: 17p13.3 Genomic coordinates (GRCh38): 17:0–3,400,000
Names: Miller–Dieker lissencephaly syndrome; MDLS; chromosome 17p13.3 deletion syndrome (included)
{ V N 2 } : CRK 98 RPA1 86 YWHAE 83 PAFAH1B1 34 INPP5K 16 ABR 12 SERPINF2 10 SLC43A2 9 OR3A1 8 OR1G1 8 (27 more) SMYD4 1 SERPINF1 1
17q11.2(NF1) ‡ No significant speech issues
OMIM: 162200
Names: Neurofibromatosis type I; NF1; Von Recklinghausen disease; neurofibromatosis, peripheral type; Morbus–Recklinghausen
{ V N 2 } : SLC6A4 73 KSR1 66 NF1 62 NOS2 41 PSMD11 30 VTN 28 NLK 27 CDK5R1 26 RPL23A 22 ALDOC 19 (30 more) RAB11FIP4 1 MIR451A 1 MIR423 1
17q12(Multiple incl. HNF1B, LHX1, CCL3L3, SNIP) Delayed or impaired speech
OMIM: 614527; Cytogenetic location: 17q12 Genomic coordinates (GRCh38): 17:33,500,000–39,800,000
Names: Chromosome 17q12 deletion syndrome
{ V N 1 } : ERBB2 124
{ V N 2 } : CACNB1 129 ERBB2 124 AP2B1 49 PIP4K2B 43 ACACA 41 CCL5 30 CCL2 30 PSMB3 25 CCL4 24 RPL23 23 (49 more) PEX12 1 MMP28 1 DUSP14 1
17q21.31(KANSL1, MAPT, CRHR1) Delayed or absent speech
OMIM: 610443
Names: Koolen–De Vries syndrome; KDVS; chromosome 17q21.31 deletion syndrome; Microdeletion 17q21.31 syndrome
{ V N 1 } : KANSL1 4 BRCA1 64
{ V N 2 } : ITGA2B 95 BRCA1 64 MAP3K14 59 G6PC1 43 FZD2 38 WNT3 36 DUSP3 31 HDAC5 28 NSF 27 AOC3 24 (36 more) TMEM106A 1 RND2 1 HEXIM1 1
17q23.1-q23.2(TBX4) Delayed speech [53]
OMIM: 613355; Cytogenetic location: 17q23.1–q23.2 Genomic coordinates (GRCh38): 17:59,500,000–63,100,000
Names: Chromosome 17q23.1–q23.2 deletion syndrome
{ V N 2 } : RPS6KB1 91 CLTC 54 BRIP1 24 MIR21 19 SKA2 12 MRC2 12 PTRH2 7 CA4 7 MED13 5 DHX40 4 (5 more) BCAS3 1 APPBP2 1
17q24.3–q24.2(Multiple incl. ABCA5, MAP2K6, SOX9) Delayed speech [54]
OMIM: 135400; Cytogenetic location: 17q24.2–q24.3 Genomic coordinates (GRCh38): 17:66,200,000–72,900,000
Names: Hypertrichosis, congenital generalized, with or without gingival hyperplasia; HTC3; fibromatosis, gingival, with hypertrichosis; chromosome 17q24.2–q24.3 deletion syndrome
{ V N 2 } : PRKCA 237 MAP2K6 131 PRKAR1A 122 KCNJ2 71 PSMD12 31 CACNG4 21 KPNA2 16 KCNJ16 13 (14 more) BPTF 1
Chromosome 18 deletions
18q(Multiple incl. MBP, TSHZ1) Delayed [55] or impaired [56] speech
OMIM: 601808; Cytogenetic location: 18q Genomic coordinates (GRCh38): 18:18,500,000–80,373,285
Names: Chromosome 18q deletion syndrome; 18q syndrome
{ V N 2 } : BCL2 100 SMAD4 79 ROCK1 78 SMAD2 66 ACAA2 60 NFATC1 52 PIK3C3 43 SMAD7 36 LAMA3 35 SLC14A2 34 (112 more) SS18 1 SETBP1 1 SERPINB4 1 SERPINB13 1 RIOK3 1 NETO1 1 MIR122 1 MBD1 1 LINC-ROR 1 GAREM1 1 CELF4 1
Chromosome 19 deletions
19p13.13(Multiple incl. NFIX, MAST1, CALR) Delayed speech; impaired speech [57]
OMIM: 613638; Cytogenetic location: 19p13.13 Genomic coordinates (GRCh38): 19:12,600,000–13,800,000
Names: Chromosome 19p13.13 deletion syndrome
{ V N 1 } : MAP2K2 257 UHRF1 6
{ V N 2 } : MAP2K2 257 FGF22 102 VAV1 100 POLR2E 85 SHC2 81 GNG7 75 PIP5K1C 58 GTF2F1 53 GNA11 48 PSPN 47 (121 more) ZBTB7A 1 TJP3 1 SEMA6B 1 REXO1 1 PLIN4 1 MYDGF 1 MIR7-3 1 DPP9 1 DAZAP1 1
19q13.11(Multiple incl. LSM14A, UBA2, WTIP, TSHZ3) Delayed or absent speech; impaired speech [58]
OMIM: 613026 (distal)/ 617219 (proximal); Cytogenetic location: 19q13.11 Genomic coordinates (GRCh38): 19:31,900,000–35,100,000
Names: Chromosome 19q13.11 deletion syndrome, distal; chromosome 19q13.11 deletion syndrome, proximal
{ V N 1 } : CEBPA 23
{ V N 2 } : SCN1B 62 GPI 51 SLC7A9 39 PSMC4 28 CEBPA 23 SLC7A10 13 UBA2 8 CHST8 7 WTIP 6 RGS9BP 6 (12 more)
Chromosome 20 deletions
20p12.3(Multiple incl. BMP2) Delayed or impaired speech [59]
{ V N 2 } : PLCB1 159 PCNA 79 PLCB4 68 BMP2 33 MCM8 25 PROKR2 9 CRLS1 9 CDS2 9 GPCPD1 8 RPS18P1 6 HAO1 6 TRMT6 3 LRRN4 1
Chromosome 22 deletions
22q11.2(Multiple incl. TBX1, COMT, INI1, TOP3B) Apraxia, dysarthria, delayed, or impaired speech [60]
OMIM: 611867; Cytogenetic location: 22q11.2 Genomic coordinates (GRCh38): 22:23,100,000–25,500,000
Names: Chromosome 22q11.2 deletion syndrome; distal chromosome 22q11.2 deletion syndrome
{ V N 2 } : MAPK1 469 CRKL 68 GGT1 62 BID 56 ADORA2A 35 COMT 33 GNAZ 32 UPB1 24 ATP6V1E1 23 IGL 22 (56 more) PIWIL3 1 MIR185 1
22q12.2(NF2) Delayed or impaired speech [61]
OMIM: 101000
Names: Neurofibromatosis type II; neurofibromatosis, central type; acoustic schwannomas, bilateral; bilateral acoustic neurofibromatosis; BANF; acoustic neurinoma, bilateral; ACN
{ V N 2 } : LIF 26 PLA2G3 20 AP1B1 20 LIMK2 19 INPP5J 18 OSM 15 PISD 14 SFI1 13 RNF185 13 MTMR3 12 (17 more) SEC14L2 1 PIK3IP1 1 MIR3200 1 DEPDC5 1
22q13.3(ARSA, SHANK3) Delayed or absent speech
OMIM: 606232
Names: Phelan–McDermid syndrome; PHMDS; chromosome 22q13.3 deletion syndrome; telomeric 22q13 monosomy syndrome
{ V N 1 } : BRD1 9
{ V N 2 } : MAPK11 140 MAPK12 84 NUP50 79 PRR5 38 PPARA 35 WNT7B 31 TYMP 23 HDAC10 19 CHKB 19 ARSA 19 (35 more) UPK3A 1 PLXNB2 1 GRAMD4 1 CELSR1 1
Chromosome X deletions
Xp11.3(Multiple incl. RP2, ZNF674) Impaired speech [62]
OMIM: 300578; Cytogenetic location: Xp11.3 Genomic coordinates (GRCh38): X:42,500,000–47,600,000
Names: Chromosome Xp11.3 deletion syndrome; mental retardation, X-linked, with retinitis pigmentosa
{ V N 1 } : KDM6A 6
{ V N 2 } : ARAF 106 MAOA 72 MAOB 37 UBA1 18 TIMP1 16 NDUFB11 10 MIR221 10 CHST7 10 MIR222 9 USP11 8 (10 more)
Xp21(GK, DMD, NR0B1) †‡ Delayed speech [63]
OMIM: 300679; Cytogenetic location: Xp21 Genomic coordinates (GRCh38): X:31,500,000–37,800,000
Names: Chromosome Xp21 deletion syndrome; Complex glycerol kinase deficiency
{ V N 2 } : TAB3 52 CYBB 26  GK 16  DMD 12  NR0B1 8 XK 5 ARX 2
Xq28 (a)(ABCD1, BCAP31, SLC6A8) Delayed speech [64]
OMIM: 300475
Names: Deafness, dystonia, and cerebral hypomyelination; DDCH; contiguous ABCD1/DXS1375E deletion syndrome, included; CADDS, included
{ V N 2 } : IKBKG 161 IRAK1 85 MIR718 54 DUSP9 52 F8 46 H2AB1 43 NSDHL 35 G6PD 33 FLNA 28 IDH3G 26 (54 more) MPP1 1 MIR224 1 MIR105-1 1 MAGEA11 1
Xq28 (b)(MECP2) Normal to absent speech
OMIM: 312750
Names: Rett syndrome; RTT; RTS; autism, dementia, ataxia, and loss of purposeful hand use
–same as above–
Chromosome Y deletions
Yq11(USP9Y, BPY2, CDY1) Normal speech
OMIM: 415000
Names: Spermatogenic failure, Y-linked, 2; SPGFY2; Sertoli-cell-only syndrome; Del Castillo syndrome; germ cell aplasia, spermatogenic failure
{ V N 2 } : RPS4Y2 25 UTY 4 NLGN4Y 4 KDM5D 3 CD24P4 3 TMSB4Y 1
Sources: †: [77], ‡: [78], †‡: From OMIM records, ⋆: [32], ∗∗: [50].

References

  1. Singh, R. Profiling Humans from Their Voice; Springer-Nature: Singapore, 2019.
  2. Sataloff, R.T. Genetics of the voice. J. Voice 1995, 9, 16–19.
  3. Ganapathiraju, M.K.; Thahir, M.; Handen, A.; Sarkar, S.N.; Sweet, R.A.; Nimgaonkar, V.L.; Loscher, C.E.; Bauer, E.M.; Chaparala, S. Schizophrenia interactome with 504 novel protein–protein interactions. NPJ Schizophr. 2016, 2, 16012.
  4. Morgan, A.; Fisher, S.E.; Scheffer, I.; Hildebrand, M. FOXP2-Related Speech and Language Disorders. In GeneReviews®; University of Washington: Seattle, WA, USA, 2017.
  5. Fisher, S.E.; Scharff, C. FOXP2 as a molecular window into speech and language. Trends Genet. 2009, 25, 166–177.
  6. FOXP2. Online Mendelian Inheritance in Man (OMIM) Entry 605317. Available online: https://omim.org/entry/605317 (accessed on 21 September 2022).
  7. Den Hoed, J.; Fisher, S.E. Genetic pathways involved in human speech disorders. Curr. Opin. Genet. Dev. 2020, 65, 103–111.
  8. FOXP2. The Human Protein Atlas. Available online: https://www.proteinatlas.org/ENSG00000128573-FOXP2 (accessed on 21 September 2022).
  9. Bac, C. Investigation of Speech Delay in Individuals with 1p36 Deletion Syndrome. Ph.D. Thesis, University of Cincinnati, Cincinnati, OH, USA, 2015.
  10. Pang, H.; Yu, X.; Kim, Y.M.; Wang, X.; Jinkins, J.K.; Yin, J.; Li, S.; Gu, H. Disorders Associated With Diverse, Recurrent Deletions and Duplications at 1q21.1. Front. Genet. 2020, 11, 577.
  11. Brazil, A.; Stanford, K.; Smolarek, T.; Hopkin, R. Delineating the phenotype of 1p36 deletion in adolescents and adults. Am. J. Med. Genet. Part A 2014, 164, 2496–2503.
  12. He, J.; Xie, Y.; Kong, S.; Qiu, W.; Wang, X.; Wang, D.; Sun, X.; Sun, D. Psychomotor retardation with a 1q42.11–q42.12 deletion. Hereditas 2017, 154, 6.
  13. Peter, B.; Matsushita, M.; Oda, K.; Raskind, W. De novo microdeletion of BCL11A is associated with severe speech sound disorder. Am. J. Med. Genet. Part A 2014, 164, 2091–2096.
  14. Eggermann, T.; Spengler, S.; Venghaus, A.; Denecke, B.; Zerres, K.; Baudis, M.; Ensenauer, R. 2p21 Deletions in hypotonia–cystinuria syndrome. Eur. J. Med. Genet. 2012, 55, 561–563.
  15. Chen, C.P.; Lin, S.P.; Chern, S.R.; Tsai, F.J.; Wu, P.C.; Lee, C.C.; Chen, L.F.; Lee, M.S.; Wang, W. Deletion 2q37.3 ⟶ qter and duplication 15q24.3 ⟶ qter characterized by array CGH in a girl with epilepsy and dysmorphic features. Genet. Couns. 2010, 21, 263.
  16. Palumbo, O.; D’Agruma, L.; Minenna, A.F.; Palumbo, P.; Stallone, R.; Palladino, T.; Zelante, L.; Carella, M. 3p14.1 de novo microdeletion involving the FOXP1 gene in an adult patient with autism, severe speech delay and deficit of motor coordination. Gene 2013, 516, 107–113.
  17. Lowther, C.; Costain, G.; Melvin, R.; Stavropoulos, D.J.; Lionel, A.C.; Marshall, C.R.; Scherer, S.W.; Bassett, A.S. Adult expression of a 3q13.31 microdeletion. Mol. Cytogenet. 2014, 7, 23.
  18. Van Borsel, J.; De Grande, S.; Van Buggenhout, G.; Fryns, J.P. Speech and language in Wolf-Hirschhorn syndrome: A case-study. J. Commun. Disord. 2004, 37, 21–33.
  19. Bonnet, C.; Andrieux, J.; Beri-Dexheimer, M.; Leheup, B.; Boute, O.; Manouvrier, S.; Delobel, B.; Copin, H.; Receveur, A.; Mathieu, M.; et al. Microdeletion at chromosome 4q21 defines a new emerging syndrome with marked growth restriction, mental retardation and absent or severely delayed speech. J. Med. Genet. 2010, 47, 377–384.
  20. Tran, T.M.; Sherwood, J.K.; Doolittle, M.J.; Sathler, M.F.; Hofmann, F.; Stone-Roy, L.M.; Kim, S. Loss of cGMP-dependent protein kinase II alters ultrasonic vocalizations in mice, a model for speech impairment in human microdeletion 4q21 syndrome. Neurosci. Lett. 2021, 759, 136048.
  21. Kristoffersen, K.E. Speech and language development in cri du chat syndrome: A critical review. Clin. Linguist. Phon. 2008, 22, 443–457.
  22. Flax, J.F.; Hare, A.; Azaro, M.A.; Vieland, V.J.; Brzustowicz, L.M. Combined linkage and linkage disequilibrium analysis of a motor speech phenotype within families ascertained for autism risk loci. J. Neurodev. Disord. 2010, 2, 210–223.
  23. Rauch, A.; Beese, M.; Mayatepek, E.; Dörr, H.G.; Wenzel, D.; Reis, A.; Trautmann, U. A novel 5q35.3 subtelomeric deletion syndrome. Am. J. Med. Genet. Part A 2003, 121, 1–8.
  24. Peter, B.; Lancaster, H.; Vose, C.; Fares, A.; Schrauwen, I.; Huentelman, M. Two unrelated children with overlapping 6q25.3 deletions, motor speech disorders, and language delays. Am. J. Med. Genet. Part A 2017, 173, 2659–2669.
  25. Bianchi, E.; Aricò, M.; Podestà, A.F.; Grana, M.; Fiori, P.; Beluffi, G.; Opitz, J.M.; Reynolds, J.F. A family with the Saethre-Chotzen syndrome. Am. J. Med. Genet. 1985, 22, 649–658.
  26. Vanzo, R.J.; Martin, M.M.; Sdano, M.R.; South, S.T. Familial KANK1 deletion that does not follow expected imprinting pattern. Eur. J. Med. Genet. 2013, 56, 256–259.
  27. Yatsenko, S.; Cheung, S.; Scott, D.; Nowaczyk, M.; Tarnopolsky, M.; Naidu, S.; Bibat, G.; Patel, A.; Leroy, J.; Scaglia, F.; et al. Deletion 9q34.3 syndrome: Genotype-phenotype correlations and an extended deletion in a patient with features of Opitz C trigonocephaly. J. Med. Genet. 2005, 42, 328–335.
  28. Samango-Sprouse, C.; Lawson, P.; Sprouse, C.; Stapleton, E.; Sadeghin, T.; Gropman, A. Expanding the phenotypic profile of Kleefstra syndrome: A female with low-average intelligence and childhood apraxia of speech. Am. J. Med. Genet. Part A 2016, 170, 1312–1316.
  29. Septer, S.; Zhang, L.; Lawson, C.E.; Cocjin, J.; Attard, T.; Ardinger, H.H. Aggressive juvenile polyposis in children with chromosome 10q23 deletion. World J. Gastroenterol. WJG 2013, 19, 2286.
  30. Nishi, E.; Uehara, T.; Yanagi, K.; Hasegawa, Y.; Ueda, K.; Kaname, T.; Yamamoto, T.; Kosaki, K.; Okamoto, N. Clinical spectrum of individuals with de novo EBF3 variants or deletions. Am. J. Med. Genet. Part A 2021, 185, 2913–2921.
  31. Kim, H.G.; Rosenfeld, J.A.; Scott, D.A.; Bénédicte, G.; Labonne, J.D.; Brown, J.; McGuire, M.; Mahida, S.; Naidu, S.; Gutierrez, J.; et al. Disruption of PHF21A causes syndromic intellectual disability with craniofacial anomalies, epilepsy, hypotonia, and neurobehavioral problems including autism. Mol. Autism 2019, 10, 35.
  32. Xu, S.; Han, J.; Morales, A.; Menzie, C.; Williams, K.; Fan, Y.S. Characterization of 11p14-p12 deletion in WAGR syndrome by array CGH for identifying genes contributing to mental retardation and autism. Cytogenet. Genome Res. 2008, 122, 181–187.
  33. Borsel, J.V.; Morlion, B.; Snick, K.V.; Leroy, J.S. Articulation in Beckwith-Wiedemann syndrome: Two case studies. Am. J. Speech-Lang. Pathol. 2000, 9, 202–213.
  34. Kim, Y.S.; Kim, G.H.; Byeon, J.H.; Eun, S.H.; Eun, B.L. Chromosome 11q13 deletion syndrome. Korean J. Pediatr. 2016, 59, S10.
  35. Chilian, B.; Abdollahpour, H.; Bierhals, T.; Haltrich, I.; Fekete, G.; Nagel, I.; Rosenberger, G.; Kutsche, K. Dysfunction of SHANK2 and CHRNA7 in a patient with intellectual disability and language impairment supports genetic epistasis of the two loci. Clin. Genet. 2013, 84, 560–565.
  36. Takahashi, I.; Takahashi, T.; Sawada, K.; Shimojima, K.; Yamamoto, T. Jacobsen syndrome due to an unbalanced translocation between 11q23 and 22q11.2 identified at age 40 years. Am. J. Med. Genet. Part A 2012, 158, 220–223.
  37. Penny, L.A.; Dell’Aquila, M.; Jones, M.C.; Bergoffen, J.; Cunniff, C.; Fryns, J.P.; Grace, E.; Graham, J.M.; Kousseff, B.; Mattina, T.; et al. Clinical and molecular characterization of patients with distal 11q deletions. Am. J. Hum. Genet. 1995, 56, 676.
  38. Manolakos, E.; Orru, S.; Neroutsou, R.; Kefalas, K.; Louizou, E.; Papoulidis, I.; Thomaidis, L.; Peitsidis, P.; Sotiriou, S.; Kitsos, G.; et al. Detailed molecular and clinical investigation of a child with a partial deletion of chromosome 11 (Jacobsen syndrome). Mol. Cytogenet. 2009, 2, 26.
  39. Lynch, S.A.; Foulds, N.; Thuresson, A.C.; Collins, A.L.; Annerén, G.; Hedberg, B.O.; Delaney, C.A.; Iremonger, J.; Murray, C.M.; Crolla, J.A.; et al. The 12q14 microdeletion syndrome: Six new cases confirming the role of HMGA2 in growth. Eur. J. Hum. Genet. 2011, 19, 534–539.
  40. Bartholdi, D.; Stray-Pedersen, A.; Azzarello-Burri, S.; Kibaek, M.; Kirchhoff, M.; Oneda, B.; Rødningen, O.; Schmitt-Mechelke, T.; Rauch, A.; Kjaergaard, S. A newly recognized 13q12.3 microdeletion syndrome characterized by intellectual disability, microcephaly, and eczema/atopic dermatitis encompassing the HMGB1 and KATNAL1 genes. Am. J. Med. Genet. Part A 2014, 164, 1277–1283.
  41. Tüysüz, B.; Collin, A.; Arapoğlu, M.; Suyugül, N. Clinical variability of Waardenburg–Shah syndrome in patients with proximal 13q deletion syndrome including the endothelin-B receptor locus. Am. J. Med. Genet. Part A 2009, 149, 2290–2295.
  42. Fonseca, D.J.; Prada, C.F.; Siza, L.M.; Angel, D.; Gomez, Y.M.; Restrepo, C.M.; Douben, H.; Rivadeneira, F.; de Klein, A.; Laissue, P. A de novo 14q12q13.3 interstitial deletion in a patient affected by a severe neurodevelopmental disorder of unknown origin. Am. J. Med. Genet. Part A 2012, 158, 689–693.
  43. Martínez-Fernández, M.L.; Bermejo-Sánchez, E.; Fernández, B.; MacDonald, A.; Fernández-Toral, J.; Martínez-Frías, M.L. Haploinsufficiency of BMP4 gene may be the underlying cause of Frias syndrome. Am. J. Med. Genet. Part A 2014, 164, 338–345.
  44. Huang, H.; Mikami, Y.; Shigematsu, K.; Uemura, N.; Shinsaka, M.; Iwatani, A.; Miyake, F.; Kabe, K.; Takai, Y.; Saitoh, M.; et al. Kagami–Ogata syndrome in a fetus presenting with polyhydramnios, malformations, and preterm delivery: A case report. J. Med. Case Rep. 2019, 13, 340.
  45. Murthy, S.; Nygren, A.; El Shakankiry, H.; Schouten, J.; Al Khayat, A.; Ridha, A.; Al Ali, M. Detection of a novel familial deletion of four genes between BP1 and BP2 of the Prader-Willi/Angelman syndrome critical region by oligo-array CGH in a child with neurological disorder and speech impairment. Cytogenet. Genome Res. 2007, 116, 135–140.
  46. Mei, C.; Fedorenko, E.; Amor, D.J.; Boys, A.; Hoeflin, C.; Carew, P.; Burgess, T.; Fisher, S.E.; Morgan, A.T. Deep phenotyping of speech and language skills in individuals with 16p11.2 deletion. Eur. J. Hum. Genet. 2018, 26, 676–686.
  47. Demopoulos, C.; Kothare, H.; Mizuiri, D.; Henderson-Sabes, J.; Fregeau, B.; Tjernagel, J.; Houde, J.F.; Sherr, E.H.; Nagarajan, S.S. Abnormal speech motor control in individuals with 16p11.2 deletions. Sci. Rep. 2018, 8, 1274.
  48. Sahoo, T.; Theisen, A.; Rosenfeld, J.A.; Lamb, A.N.; Ravnan, J.B.; Schultz, R.A.; Torchia, B.S.; Neill, N.; Casci, I.; Bejjani, B.A.; et al. Copy number variants of schizophrenia susceptibility loci are associated with a spectrum of speech and developmental delays and behavior problems. Genet. Med. 2011, 13, 868–880.
  49. Hennekam, R.C.; Baselier, A.C.; Beyaert, E.; Bos, A.; Blok, J.; Jansma, H.; Thorbecke-Nilsen, V.; Veerman, H. Psychological and speech studies in Rubinstein-Taybi syndrome. Am. J. Ment. Retard. 1992, 96, 645–660.
  50. Novara, F.; Rinaldi, B.; Sisodiya, S.M.; Coppola, A.; Giglio, S.; Stanzial, F.; Benedicenti, F.; Donaldson, A.; Andrieux, J.; Stapleton, R.; et al. Haploinsufficiency for ANKRD11-flanking genes makes the difference between KBG and 16q24.3 microdeletion syndromes: 12 new cases. Eur. J. Hum. Genet. 2017, 25, 694–701.
  51. Gropman, A.L.; Duncan, W.C.; Smith, A.C. Neurologic and developmental features of the Smith-Magenis syndrome (del 17p11.2). Pediatr. Neurol. 2006, 34, 337–350.
  52. Schiff, M.; Delahaye, A.; Andrieux, J.; Sanlaville, D.; Vincent-Delorme, C.; Aboura, A.; Benzacken, B.; Bouquillon, S.; Elmaleh-Berges, M.; Labalme, A.; et al. Further delineation of the 17p13.3 microdeletion involving YWHAE but distal to PAFAH1B1: Four additional patients. Eur. J. Med. Genet. 2010, 53, 303–308.
  53. Schönewolf-Greulich, B.; Ronan, A.; Ravn, K.; Baekgaard, P.; Lodahl, M.; Nielsen, K.; Rendtorff, N.D.; Tranebjaerg, L.; Brøndum-Nielsen, K.; Tümer, Z. Two new cases with microdeletion of 17q23.2 suggest presence of a candidate gene for sensorineural hearing loss within this region. Am. J. Med. Genet. Part A 2011, 155, 2964–2969.
  54. Vergult, S.; Dauber, A.; Delle Chiaie, B.; Van Oudenhove, E.; Simon, M.; Rihani, A.; Loeys, B.; Hirschhorn, J.; Pfotenhauer, J.; Phillips, J.A.; et al. 17q24.2 microdeletions: A new syndromal entity with intellectual disability, truncal obesity, mood swings and hallucinations. Eur. J. Hum. Genet. 2012, 20, 534–539.
  55. Cody, J.D.; Sebold, C.; Malik, A.; Heard, P.; Carter, E.; Crandall, A.; Soileau, B.; Semrud-Clikeman, M.; Cody, C.M.; Hardies, L.J.; et al. Recurrent interstitial deletions of proximal 18q: A new syndrome involving expressive speech delay. Am. J. Med. Genet. Part A 2007, 143, 1181–1190.
  56. Marseglia, G.; Scordo, M.R.; Pescucci, C.; Nannetti, G.; Biagini, E.; Scandurra, V.; Gerundino, F.; Magi, A.; Benelli, M.; Torricelli, F. 372 kb Microdeletion in 18q12.3 causing SETBP1 haploinsufficiency associated with mild mental retardation and expressive speech impairment. Eur. J. Med. Genet. 2012, 55, 216–221.
  57. Bonaglia, M.C.; Marelli, S.; Novara, F.; Commodaro, S.; Borgatti, R.; Minardo, G.; Memo, L.; Mangold, E.; Beri, S.; Zucca, C.; et al. Genotype–phenotype relationship in three cases with overlapping 19p13.12 microdeletions. Eur. J. Hum. Genet. 2010, 18, 1302–1309.
  58. Melo, J.B.; Estevinho, A.; Saraiva, J.; Ramos, L.; Carreira, I.M. Cutis Aplasia as a clinical hallmark for the syndrome associated with 19q13.11 deletion: The possible role for UBA2 gene. Mol. Cytogenet. 2015, 8, 21.
  59. Amasdl, S.; Natiq, A.; Sbiti, A.; Zerkaoui, M.; Lyahyai, J.; Amzazi, S.; Liehr, T.; Sefiani, A. 20p12.3 deletion is rare cause of syndromic cleft palate: Case report and review of literature. BMC Res. Notes 2016, 9, 5.
  60. Solot, C.B.; Sell, D.; Mayne, A.; Baylis, A.L.; Persson, C.; Jackson, O.; McDonald-McGinn, D.M. Speech-language disorders in 22q11.2 deletion syndrome: Best practices for diagnosis and management. Am. J. Speech-Lang. Pathol. 2019, 28, 984–999.
  61. Davidson, T.B.; Sanchez-Lara, P.A.; Randolph, L.M.; Krieger, M.D.; Wu, S.Q.; Panigrahy, A.; Shimada, H.; Erdreich-Epstein, A. Microdeletion del (22)(q12.2) encompassing the facial development-associated gene, MN1 (meningioma 1) in a child with Pierre-Robin sequence (including cleft palate) and neurofibromatosis 2 (NF2): A case report and review of the literature. BMC Med. Genet. 2012, 13, 19.
  62. Hayashi, S.; Mizuno, S.; Migita, O.; Okuyama, T.; Makita, Y.; Hata, A.; Imoto, I.; Inazawa, J. The CASK gene harbored in a deletion detected by array-CGH as a potential candidate for a gene causative of X-linked dominant mental retardation. Am. J. Med. Genet. Part A 2008, 146, 2145–2151.
  63. Fries, M.H.; Lebo, R.V.; Schonberg, S.A.; Golabi, M.; Seltzer, W.K.; Gitelman, S.E.; Golbus, M.S. Mental retardation locus in Xp21 chromosome microdeletion. Am. J. Med. Genet. 1993, 46, 363–368.
  64. Gedeon, A.; Meinänen, M.; Ades, L.; Kääriäinen, H.; Gecz, J.; Baker, E.; Sutherland, G.; Mulley, J. Overlapping submicroscopic deletions in Xq28 in two unrelated boys with developmental disorders: Identification of a gene near FRAXE. Am. J. Hum. Genet. 1995, 56, 907.
  65. National Institute on Deafness and Other Communication Disorders. Statistics on voice, speech, and language. IEEE/ACM Trans. Audio Speech Lang. Process. 2008, 25, 2098–2111.
  66. American Speech-Language-Hearing Association. Speech Sound Disorders: Articulation and Phonology. Practice Portal. 2017. Available online: www.asha.org/Practice-Portal/Clinical-Topics/Articulation-and-Phonology (accessed on 21 September 2022).
  67. Shmueli, G.; Minka, T.P.; Kadane, J.B.; Borle, S.; Boatwright, P. A useful distribution for fitting discrete data: Revival of the Conway–Maxwell–Poisson distribution. J. R. Stat. Soc. Ser. C (Appl. Stat.) 2005, 54, 127–142.
  68. Sellers, K.F.; Borle, S.; Shmueli, G. The COM-Poisson model for count data: A survey of methods and applications. Appl. Stoch. Model. Bus. Ind. 2012, 28, 104–116.
  69. Huang, J.; Zhao, Y.L.; Li, Y.; Fletcher, J.A.; Xiao, S. Genomic and functional evidence for an ARID1A tumor suppressor role. Genes Chromosom. Cancer 2007, 46, 745–750.
  70. Sheffels, E.; Sealover, N.E.; Theard, P.L.; Kortum, R.L. Anchorage-independent growth conditions reveal a differential SOS2 dependence for transformation and survival in RAS-mutant cancer cells. Small GTPases 2021, 12, 67–78.
  71. Bertucci, F.; Borie, N.; Ginestier, C.; Groulet, A.; Charafe-Jauffret, E.; Adélaïde, J.; Geneix, J.; Bachelart, L.; Finetti, P.; Koki, A.; et al. Identification and validation of an ERBB2 gene expression signature in breast cancers. Oncogene 2004, 23, 2564–2575.
  72. Siggberg, L.; Olsén, P.; Näntö-Salonen, K.; Knuutila, S. 19p13.3 aberrations are associated with dysmorphic features and deviant psychomotor development. Cytogenet. Genome Res. 2011, 132, 8–15. [Google Scholar] [CrossRef]
  73. Severinsen, J.; Bjarkam, C.R.; Kiar-Larsen, S.; Olsen, I.M.; Nielsen, M.M.; Blechingberg, J.; Nielsen, A.L.; Holm, I.E.; Foldager, L.; Young, B.D.; et al. Evidence implicating BRD1 with brain development and susceptibility to both schizophrenia and bipolar affective disorder. Mol. Psychiatry 2006, 11, 1126–1138. [Google Scholar] [CrossRef] [Green Version]
  74. Jansen, S.; Kleefstra, T.; Willemsen, M.; De Vries, P.; Pfundt, R.; Hehir-Kwa, J.; Gilissen, C.; Veltman, J.; de Vries, B.; Vissers, L. De novo loss-of-function mutations in X-linked SMC1A cause severe ID and therapy-resistant epilepsy in females: Expanding the phenotypic spectrum. Clin. Genet. 2016, 90, 413–419. [Google Scholar] [CrossRef]
  75. Porntaveetus, T.; Abid, M.F.; Theerapanon, T.; Srichomthong, C.; Ohazama, A.; Kawasaki, K.; Kawasaki, M.; Suphapeetiporn, K.; Sharpe, P.T.; Shotelersuk, V. Expanding the oro-dental and mutational spectra of Kabuki syndrome and expression of KMT2D and KDM6A in human tooth germs. Int. J. Biol. Sci. 2018, 14, 381. [Google Scholar] [CrossRef] [Green Version]
  76. Sagi-Dain, L.; Maya, I.; Peleg, A.; Reches, A.; Banne, E.; Baris, H.N.; Tenne, T.; Singer, A.; Ben-Shachar, S. Microarray analysis in pregnancies with isolated unilateral kidney agenesis. Pediatr. Res. 2018, 83, 825–828. [Google Scholar] [CrossRef]
  77. Žilina, O.; Teek, R.; Tammur, P.; Kuuse, K.; Yakoreva, M.; Vaidla, E.; Mölter-Väär, T.; Reimand, T.; Kurg, A.; Õunap, K. Chromosomal microarray analysis as a first-tier clinical diagnostic test: Estonian experience. Mol. Genet. Genom. Med. 2014, 2, 166–175. [Google Scholar] [CrossRef]
  78. Hsu, F.; Kent, W.J.; Clawson, H.; Kuhn, R.M.; Diekhans, M.; Haussler, D. The UCSC known genes. Bioinformatics 2006, 22, 1036–1046. [Google Scholar] [CrossRef] [Green Version]
Figure 1. Voice chains of different levels. In this reversed perspective ideogram, a link depicts a pathway, the black dots on it are genes, and the chromosomes that they belong to are shown as rods where relevant. The lines connecting genes are only meant for visual clarity. Chains are formed with respect to genes on a chromosome (a microdeletion region in this case, shown shaded in yellow). In this ideogram, Gene-1 and FOXP2 lie on the same pathway, contributing to a level-1 voice chain. In a level-2 chain, FOXP2 and the microdeletion region are on different pathways, but the pathways share a set of genes. Gene 2 and Gene 3 have level-2 voice chains.
Figure 2. Voice chains of different types. (a) Shows the different components of a graph comprising biological pathways. Each node in this graph represents a biological pathway. The genes $g_i$ that contribute to each pathway are listed within the node. Nodes are linked to each other through edges that represent the set of shared genes. If two nodes have no gene in common, no edge exists between them. Genes that link pathways (nodes) are explicitly shown on the edges. (b) Explains what a chainlink gene means in the context of a microdeletion region and the target voice gene (FOXP2 in the example used in this paper). To compose voice chains, the set of genes of interest (e.g., from a microdeletion region) is added to the graph as a node (shaded pink). The “chainlink” genes that link the set to the graph are also shown. The topmost (left) node represents a biological pathway that contains the FOXP2 gene ($g_1$, highlighted in yellow). (c) A Level-1 voice chain is the edge shown in red. The “chainlink” gene $g_2$ that links the set (in pink) to the voice chain is also shown. (d) Exemplifies Level-2 chains. The colored edges represent Level-2 chains. The “chainlink” genes are also shown.
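The chain levels described in the captions of Figures 1 and 2 can be sketched in a few lines of Python. This is only an illustrative reading of those captions, not the paper's implementation: the function name `voice_chainlinks`, the toy pathway names `P1` to `P3`, and the gene labels are all hypothetical, and the sketch assumes that a gene may be counted at both levels if it lies on several pathways.

```python
from typing import Dict, Set, Tuple


def voice_chainlinks(
    pathways: Dict[str, Set[str]],
    region: Set[str],
    target: str = "FOXP2",
) -> Tuple[Set[str], Set[str]]:
    """Return (level-1, level-2) chainlink genes of `region` w.r.t. `target`.

    Assumed reading of the captions (hypothetical helper):
      level 1: a region gene lies on a pathway that also contains the target;
      level 2: a region gene lies on a pathway that does not contain the
               target but shares at least one gene with a pathway that does.
    """
    # Pathways (graph nodes) that contain the target voice gene.
    target_paths = {name for name, genes in pathways.items() if target in genes}

    level1: Set[str] = set()
    level2: Set[str] = set()
    for name, genes in pathways.items():
        hits = region & genes  # region genes that sit on this pathway
        if not hits:
            continue
        if name in target_paths:
            level1 |= hits
        elif any(genes & pathways[t] for t in target_paths):
            # An edge (shared gene set) exists to a target-containing pathway.
            level2 |= hits
    return level1, level2


# Toy data, invented purely for illustration.
toy_pathways = {
    "P1": {"FOXP2", "g2", "g3"},  # contains the target gene
    "P2": {"g3", "g4"},           # shares g3 with P1
    "P3": {"g5", "g6"},           # no gene in common with P1
}
toy_region = {"g2", "g4", "g6", "g7"}
l1, l2 = voice_chainlinks(toy_pathways, toy_region)
print(sorted(l1), sorted(l2))
```

In this toy graph, `g2` forms a level-1 chain, `g4` forms a level-2 chain through the shared gene `g3`, and `g6` and `g7` form no chain at either level.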
Figure 3. (a) Chainlink counts: Scatter of chainlink gene counts for syndromes associated with voice problems of different severities. Each cross represents a single syndrome. The x-axis value indicates the number of chainlink genes for the syndrome. The y-axis is a dummy axis. (b) Chainlink connectivity: Scatter of chainlink connectivities for syndromes associated with voice problems of different severities. Each panel shows the aggregate number of pathways that all chainlink genes connect to (x-axis) for each syndrome (denoted by a cross) that exhibits the labeled speech characteristic. The y-axis is a dummy axis. (c) Count vs. normalized connectivity: Chainlink gene count (x-axis) vs. normalized chainlink gene connectivity (y-axis). The normalized connectivity is the average connectivity per chainlink gene.
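As a small worked example of the normalized connectivity plotted in Figure 3c, the sketch below divides the aggregate pathway count by the chainlink gene count. The function name `normalized_connectivity` is hypothetical, and the numbers are taken from the 11q23 row of Table 1 (77 chainlink genes, 1421 pathways), on the assumption that the parenthesized value in that table is the aggregate connectivity.

```python
def normalized_connectivity(chainlink_count: int, total_pathways: int) -> float:
    """Average number of pathways per chainlink gene (hypothetical helper)."""
    if chainlink_count <= 0:
        raise ValueError("chainlink_count must be positive")
    return total_pathways / chainlink_count


# 11q23 row of Table 1: 77 chainlink genes connecting to 1421 pathways.
value_11q23 = normalized_connectivity(77, 1421)
print(round(value_11q23, 1))  # about 18.5 pathways per chainlink gene
```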
Figure 4. Conway–Maxwell–Poisson models for the distributions of chainlink counts and chainlink connectivities for different types of speech problems. (a) Distribution of chainlink counts. (b) Distribution of chainlink connectivities.
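For readers unfamiliar with the Conway–Maxwell–Poisson (COM-Poisson) distribution used in Figure 4, its standard probability mass function is P(X = x) = λ^x / ((x!)^ν Z(λ, ν)), where Z(λ, ν) = Σ_j λ^j / (j!)^ν. The sketch below evaluates it with a truncated normalizing sum. It is a generic illustration of the distribution, not the fitting procedure used in the paper; the function name and the truncation length `max_terms` are assumptions.

```python
import math


def com_poisson_pmf(x: int, lam: float, nu: float, max_terms: int = 1000) -> float:
    """COM-Poisson pmf with a truncated normalizing constant (illustrative).

    Assumes lam > 0 and nu > 0; for very small nu with lam >= 1 the series
    converges slowly or not at all, so the truncation would be inaccurate.
    """
    # log of each term lam**j / (j!)**nu, computed in the log domain.
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]
    # Log-sum-exp for a numerically stable normalizing constant Z(lam, nu).
    peak = max(log_terms)
    log_z = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return math.exp(x * math.log(lam) - nu * math.lgamma(x + 1) - log_z)


# With nu = 1 the distribution reduces to an ordinary Poisson distribution.
p = com_poisson_pmf(2, lam=3.0, nu=1.0)
poisson_p = math.exp(-3.0) * 3.0**2 / math.factorial(2)
print(p, poisson_p)
```

Values of ν above 1 give under-dispersed counts relative to the Poisson case, and values below 1 give over-dispersed counts, which is why the family is flexible for count data such as chainlink counts.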
Table 1. Information about level-1 and level-2 voice chain ensembles for 76 chromosomal microdeletion syndromes. For each syndrome, the corresponding implicated gene (culled from medical literature, with details given in Appendix A) that was also discovered to be a chainlink gene by the proposed algorithm in either level-1 or level-2 chains is listed in the second column. The number of chainlink genes (i.e., genes that have chains that connect to the ACC biological pathway of FOXP2) is shown in the third and fourth columns for level-1 and level-2 chains, respectively. The total number of pathways that they collectively influence is indicated next to each count, in parentheses. The observed phenotypic effects on speech are given in the last column. Del: delayed speech; Imp: impaired speech; Norm: normal speech; Abs: absent speech; Apr: apraxia; Dys: dysarthria; Idio: idiosyncratic.
| Syndrome | Implicated Genes from Microarray Studies That Also Form Voice Chains | No. of Genes in $\{V_N^1\}$ (level 1) | No. of Genes in $\{V_N^2\}$ (level 2) | Reported Effects on Speech |
|---|---|---|---|---|
| 1p36 | SPEN | 1 (10) | 226 (3152) | Del [9], or Abs |
| 1q21.1–q21.2 | RBM8A, GJAS | | 41 (906) | Del [10] or Imp [11] |
| 1q41–q42 | DISP1, LEFTY1, LEFTY2, BPNT1 | | 60 (1020) | Apr [12] |
| 1q43–q44 | AKT3 | | 78 (903) | Del, Imp, or Abs |
| 2p16.1–p15 | | | 17 (508) | Dys, Apr, or Imp [13] |
| 2p21 | SLC3A1 | | 24 (557) | Apr or Idio [14] |
| 2q23.1 | | | 2 (31) | Del or Imp |
| 2q32–q33 | COL3A1, COL5A2, GTF3C3, CASP8, CASP10 | | 62 (1429) | Abs |
| 2q37.3 | HDAC4 | | 38 (330) | Imp [15] |
| 3p13 | FOXP1 | | 8 (48) | Del, Idio, Imp and Dys (all severe), Apr [16] |
| 3q13.31 | DRD3, GAP43, LSAMP | | 5 (56) | Imp [17] or Abs |
| 3q29 | PAK2, DLG1 | | 24 (413) | Del |
| 4p16.3 | FGFR3 | 1 (25) | 44 (495) | Del [18] or Abs |
| 4q21 | | | 41 (696) | Del or Abs; Imp [19,20] |
| 5p (5p15.2 and/or (5p15.3 or 5p15.33)) | TERT, CTNND2 | | 37 (500) | Del, Abs, and Apr [21] |
| 5q14.3 | MEF2C | | 7 (275) | Abs |
| 5q33.1 | RPS14 | | 17 (153) | Dys [22] |
| 5q35.3 | NSD1 | 1 (8) | 47 (653) | Norm [23] or Del |
| 6pter–p24 | FOXC1, GMDS | | 41 (416) | Del |
| 6q25.3 | ARID1B | | 21 (363) | Del, Apr, Dys [24] |
| 7p21 | TWIST1 | | 18 (243) | Del [25] |
| 7q11.23 | ELN, LIMK1, GTF2IRD1, GTF2I | | 35 (536) | Norm or Del |
| 8p23.1 | GATA4 | | 49 (337) | No significant anomaly reports |
| 8q22.1 | CCNE2 | | 13 (152) | Del |
| 8q24.11–q24.13 | EXT1 | | 21 (211) | Del |
| 9p24.3 | | 1 (9) | 3 (13) | Del, Dys, Apr [26] |
| 9q34.3 | EHMT1 | 1 (62) | 46 (718) | Del [27], Apr [28] and Abs |
| 10pter–p13 or 10p14–p15.1 | GATA3 | | 49 (796) | Sensorineural hearing loss |
| 10q23 | PTEN, BMPR1A | 1 (107) | 59 (1003) | Del or Abs; Imp [29] |
| 10q26 | DOCK1 | | 48 (708) | Imp; Del [30] |
| 11p11.2–p12 | EXT2 | | 46 (737) | Idio, Dys, Del, Apr [31] |
| 11p13–p12 | PAX6, SLC1A2, PRRG4 | | 23 (342) | Imp [32] |
| 11p15.5 | IGF2 | 1 (347) | 50 (1243) | Imp [33] |
| 11q13.3 | FGF4, FGF3, FADD | | 11 (608) | Del [34] Imp [35] |
| 11q23 | | | 77 (1421) | Dysarthric, Abs [36]; Imp [37] |
| 11q23.3–q25 | FLI1, JAM3 | | 119 (1522) | Imp, Del, Apr [38] |
| 12q14.3 | HMGA2 | | 9 (108) | Abs, Imp [38] or Del [39] |
| 13q12.3 | | | 6 (119) | Del [40] |
| 13q14 | RB1 | | 46 (595) | Norm [41] |
| 13q22.3 | EDNRB | | 3 (37) | Norm [41] |
| 13q33–q34 | SOX1, ARHGEF7 | | 30 (635) | Apr |
| 14q11–q22 | PAX9, SUPT16H, CHD8, RALGAPA1 | | 173 (2021) | Del [42] or Abs |
| 14q22.1–q23.1 | PTGDR, BMP4 | | 41 (530) | Imp [43] |
| 14q32.2 | DLK1, MEG3 | | 23 (242) | Del; Idio [44] |
| 15q11.2 | NIPA1, NIPA2, CYFIP1, TUBGCP5 | | 11 (72) | Del |
| 15q11–q13 | NDN, SNRPN | | 30 (230) | Del or Imp |
| 15q11–q13 | UBE3A | | Same as above | Abs; Imp [45] |
| 15q13.3 | CHRNA7, OTUD7A | | 8 (47) | Imp or Idio |
| 15q24 | SIN3A | | 38 (623) | Del or Imp |
| 16p11.2 | SH2B1, TBX6, CORO1A | 1 (1) | 70 (1067) | Apr [46]; Dys [47], Del or Imp |
| 16p12.2–p11.2 | SH2B1 | 1 (1) | 106 (1682) | Del or Imp |
| 16p12.1 | | | 17 (157) | Del |
| 16p13.11 | MYH11 | | 6 (116) | Del [48] |
| 16p13.3 | CREBBP, TRAP1 | 1 (140) | 122 (1532) | Apr, Dys, Imp or Del [49] |
| 16q22 | CBFB | 1 (2) | 79 (830) | Not available |
| 16q24.3–q24.2 | CDH15, ZNF778, ZFPM1 | | 28 (307) | Del or Imp [50] |
| 17p11.2 | LLGL1, UBB | | 36 (845) | Del; Dys [51] |
| 17p13.1 | KCNAB3, GUCY2D, TP53, TRAPPC1, MPDU1, FXR2, EFNB3 | 2 (214) | 74 (1387) | Abs |
| 17p13.3 | PAFAH1B1, YWHAE | | 37 (487) | Del [52] |
| 17q11.2 | NF1 | | 40 (620) | No significant issues |
| 17q12 | HNF1B, LHX1, CCL3L3 | 1 (124) | 59 (890) | Del or Imp |
| 17q21.31 | KANSL1, MAPT, CRHR1 | 2 (68) | 46 (740) | Del or Abs |
| 17q23.1–q23.2 | | | 28 (307) | Del [53] |
| 17q24.3–q24.2 | ABCA5, MAP2K6, SOX9 | | 22 (720) | Del [54] |
| 18q | MBP | | 122 (1666) | Del [55] or Imp [56] |
| 19p13.13 | | 2 (263) | 131 (2080) | Del; Imp [57] |
| 19q13.11 | UBA2, WTIP | 1 (23) | 22 (283) | Del or Abs; Imp [58] |
| 20p12.3 | BMP2 | | 13 (415) | Del or Imp [59] |
| 22q11.2 | TBX1, COMT, TOP3B | | 66 (1294) | Apr, Dys, Del or Imp [60] |
| 22q12.2 | NF2 | | 27 (253) | Del or Imp [61] |
| 22q13.3 | ARSA, SHANK3 | 1 (9) | 45 (670) | Del or Abs |
| Xp11.3 | RP2 | 1 (6) | 20 (335) | Imp [62] |
| Xp21 | GK, DMD, NR0B1 | | 7 (121) | Del [63] |
| Xq28 (a) | ABCD1, BCAP31, SLC6A8 | | 64 (1053) | Del [64] |
| Xq28 (b) | MECP2 | | Same as above | Norm to Abs |
| Yq11 | | | 6 (40) | Norm |
Table 2. Row-wise: Statistics showing the number of syndromes (count) associated with each voice disorder, the number of chainlink genes associated with the corresponding set of syndromes, and the connectivity of chainlink genes for the set.
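| Speech Type | Count | Chainlink Genes (Mean) | Chainlink Genes (Median) | Chainlink Connectivity (Mean) | Chainlink Connectivity (Median) |
|---|---|---|---|---|---|
| Normal | 6 | 32 | 40 | 404 | 595 |
| Apraxic | 19 | 43 | 30 | 675 | 557 |
| Dysarthric | 11 | 43 | 36 | 725 | 737 |
| Impaired | 31 | 46 | 38 | 734 | 608 |
| Delayed | 51 | 47 | 37 | 682 | 500 |
| Absent | 19 | 60 | 46 | 902 | 718 |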
Table 3. (a) Code distance between the (sets of) chainlink counts for different types of speech disorders. (b) Code distance between the (sets of) chainlink connectivities for different types of speech disorders.
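(a)

| | Normal | Apraxic | Dysarthric | Impaired | Delayed | Absent |
|---|---|---|---|---|---|---|
| Normal | 0 | 2.1 | 2.3 | 7.6 | 14.7 | 15.3 |
| Apraxic | | 0 | 0.2 | 1.1 | 2.9 | 6.8 |
| Dysarthric | | | 0 | 0.2 | 1.2 | 3.7 |
| Impaired | | | | 0 | 0.5 | 3.7 |
| Delayed | | | | | 0 | 4.6 |
| Absent | | | | | | 0 |

(b)

| | Normal | Apraxic | Dysarthric | Impaired | Delayed | Absent |
|---|---|---|---|---|---|---|
| Normal | 0 | 6.3 | 7.0 | 17.4 | 22.5 | 22.2 |
| Apraxic | | 0 | 0.7 | 1.5 | 1.9 | 5.9 |
| Dysarthric | | | 0 | 0.0 | 0.4 | 1.7 |
| Impaired | | | | 0 | 0.9 | 2.4 |
| Delayed | | | | | 0 | 5.5 |
| Absent | | | | | | 0 |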
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Singh, R. A Gene-Based Algorithm for Identifying Factors That May Affect a Speaker’s Voice. Entropy 2023, 25, 897. https://doi.org/10.3390/e25060897

