Pathogenic Copy Number Variations Involved in the Genetic Etiology of Syndromic and Non-Syndromic Intellectual Disability—Data from a Romanian Cohort

The investigation of unexplained global developmental delay (GDD)/intellectual disability (ID) is challenging. In low-resource settings, patients may not follow a standardized diagnostic process that makes use of the benefits of advanced technologies. Our study aims to explore the contribution of chromosome microarray analysis (CMA) in identifying the genetic etiology of GDD/ID. A total of 371 Romanian patients with syndromic or non-syndromic GDD/ID, without epilepsy, were routinely evaluated in tertiary clinics. A total of 234 males (63.07%) and 137 females (36.93%), with ages ranging from 6 months to 40 years (median age of 5.5 years), were referred for genetic diagnosis between 2015 and 2022; testing options included CMA and/or karyotyping. Agilent Technologies and Oxford Gene Technology CMA workflows were used. Pathogenic/likely pathogenic copy number variations (pCNVs) were identified in 79 patients (21.29%). Diagnostic yield was comparable between mild ID (17.05%, 22/129) and moderate/severe ID (23.55%, 57/242). Higher rates were found in cases where facial dysmorphism (22.97%, 71/309), autism spectrum disorder (ASD) (19.11%, 26/136) and finger anomalies (20%, 27/96) were associated with GDD/ID. GDD/ID plus multiple congenital anomalies (MCA) accounted for the highest detection rate at 27.42% (17/62). pCNVs represent a significant proportion of the genetic causes of GDD/ID. Our study confirms the utility of CMA in assessing GDD/ID with an uncertain etiology, especially in patients with associated comorbidities.


Introduction
Neurodevelopmental disorders (ND), including global developmental delay (GDD)/intellectual disability (ID) and/or autism spectrum disorders (ASD), affect 1-3% of the world's population [1]. In developed countries, severe ID is reported in 2.5 to 5 in 1000 children, while mild ID has a higher prevalence, especially among children with low socioeconomic status [2].
The cited causes of GDD/ID include prenatal and perinatal infections or trauma, genetic abnormalities, environmental factors, metabolic anomalies, nutritional deficits and toxic exposure, but in 75% of cases, the etiology is unknown [3]. Various genetic causes can lead to GDD/ID; the most frequent is Down syndrome [4], while Fragile X syndrome is the most common inherited cause [5]. Genetic abnormalities involved include aneuploidies, copy number variations (CNVs), tandem repeats, indels and short variation [1,6].
Romanian data are scarce. To our knowledge, there is a single publication about CMA in a small cohort of 36 patients with GDD/ID and obesity from a northwestern region of our country [25].
Our study thus aims to present the results of aCGH and karyotype testing from a Romanian cohort of 371 patients with GDD/ID, contributing to the existing reports on the utility of aCGH in GDD/ID diagnosis and the involvement of various chromosomal regions in the etiology of this complex pathology.

Patient Inclusion and Evaluation
This study includes 371 patients evaluated for GDD and/or ID in pediatric, child neurology or medical genetics departments from different regions of Romania. They had been referred to the Regional Centre for Medical Genetics (CRGM), Dolj, Craiova for genetic testing between 2015 and 2022. A total of 234 of the patients were boys (63.07%) and 137 (36.93%) were girls, with ages ranging from 6 months to 40 years (median age was 5.5 years) at their first medical assessment.
Inclusion criteria for the current study were the presence of GDD in children under 5 years old or ID in children over 5 years old [26], and the absence of epilepsy. Epilepsy can itself cause ID, and a different diagnostic approach may be better suited to such patients [27].
Most of the patients showed syndromic involvement, with the presence of dysmorphic features and/or various malformations.
Clinical evaluations, including personal history, psychomotor and behavioral development, ID severity, the presence of dysmorphic features, neuroimaging and EEG studies, were obtained from referring clinicians, neurologists or pediatric neurologists.
ID severity was classified as mild, moderate, severe or profound by DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th Edition) criteria, following the evaluation of conceptual, social and practical impairment.
Ethical approval for the study was granted by the local research ethics committees of the involved institutions. A written informed consent form was signed by the parents or legal guardians of the patients.

Genetic Testing
Resource limitations at times dictated the choice of first-line genetic test, despite recommendations. For 169 patients, karyotyping was used as the first choice. All 371 cases, irrespective of karyotyping results, were run through CMA as soon as it became available.
High-resolution aCGH was performed using testing options from Agilent Technologies, Santa Clara, CA, USA: Agilent SurePrint G3 CGH ISCA v2 8 × 60K (141 patients), 4 × 180K (37 patients), and Oxford Gene Technology Operations Ltd.: CytoSure ISCA V2 CGH 8 × 60K microarrays (193 patients), following the protocols provided by the manufacturer [28]. A feature extraction program was used to obtain post-hybridization data. Subsequent analysis was performed with the recommended software: CytoGenomics software from Agilent and Cytosure Interpret Software from OGT, respectively.
CMA, standard karyotyping and/or multiplex ligation-dependent probe amplification (MLPA) were used to confirm the CMA findings and perform segregation analysis where commercial kits to cover the region of interest were available, and/or parents consented.

Variant Interpretation and Reporting
Quality criteria included a coverage of at least 4 hybridized probes with a minimum average ratio of 0.5 and a software-reported p value of less than 0.05. No size cut-off was applied to deletions [29], whereas only duplications larger than 250 kb were reported, as it is more difficult to assign a pathogenic role to copy number gains. Variants with >50% overlap with the CNVs detected in healthy individuals in the Database of Genomic Variants (DGV) and those without gene content were eliminated from the analysis. Reported genomic regions use GRCh37/hg19 for reference purposes.
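The reporting rules above can be written as a small filter. What follows is a minimal sketch and not the pipeline used in the study: the `CnvCall` record, its field names and the `passes_reporting_filters` function are hypothetical, the "minimum average ratio of 0.5" is assumed to mean an absolute mean log2 ratio, and the DGV overlap is assumed to be precomputed as a fraction of the call length.

```python
from dataclasses import dataclass


@dataclass
class CnvCall:
    """Hypothetical record for one CNV call (all field names are assumptions)."""
    kind: str              # "loss" or "gain"
    size_kb: float         # length of the call in kilobases
    n_probes: int          # hybridized probes supporting the call
    mean_log_ratio: float  # average log2 ratio across those probes
    p_value: float         # p value reported by the analysis software
    dgv_overlap: float     # fraction (0 to 1) overlapping DGV CNVs of healthy individuals
    n_genes: int           # number of genes inside the call


def passes_reporting_filters(call: CnvCall) -> bool:
    """Apply the quality and reporting thresholds described in the text."""
    # Quality: at least 4 probes, |mean ratio| >= 0.5 (assumed absolute), p < 0.05.
    if call.n_probes < 4 or abs(call.mean_log_ratio) < 0.5 or call.p_value >= 0.05:
        return False
    # No size cut-off for deletions; duplications must be larger than 250 kb.
    if call.kind == "gain" and call.size_kb <= 250:
        return False
    # Drop calls with >50% DGV overlap and calls without gene content.
    if call.dgv_overlap > 0.5 or call.n_genes == 0:
        return False
    return True


small_loss = CnvCall("loss", 70, 5, -0.9, 0.001, 0.0, 2)
small_gain = CnvCall("gain", 200, 12, 0.6, 0.001, 0.0, 3)
print(passes_reporting_filters(small_loss), passes_reporting_filters(small_gain))
```

In this sketch a 70 kb deletion with adequate probe support is kept, while a 200 kb duplication of similar quality is dropped by the size rule for gains.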
For the current study, only clinically relevant findings, pathogenic/likely pathogenic CNVs (pCNVs), are reported. Incidental findings were not disclosed, as per patient consent.
Univariate analysis (odds ratio, OR, with confidence intervals, CI) indicated predictive phenotypes for a higher diagnostic yield (a higher chance of carrying a pCNV) in our cohort with unexplained GDD/ID. Data were analyzed using SPSS Statistics for Windows, Version 22.0 (IBM Corp., Armonk, NY, USA).

Diagnosis Rate
In our cohort of 371 Romanian individuals with unexplained GDD/ID with syndromic or non-syndromic presentation, 79 patients were found to carry at least one rare exonic pCNV, as seen in Figure 1 below. This places the overall diagnostic yield of aCGH in the current study at 21.29%.
Conventional karyotyping had been performed prior to CMA for 169 patients. Performing CMA on the 19 cases where karyotyping identified pCNVs was requested to better define the regions involved. The diagnostic rate for karyotyping is 11.24% (19/169). pCNVs larger than 5 Mb could have been detected by classic karyotyping. If we are to assume the same accuracy as CMA for >5 Mb deletions/duplications, then, in our study, the karyotyping diagnosis rate would become 8.62% (32/371).
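The yields quoted in this section are simple proportions, and the arithmetic can be checked with a short sketch. The function name `diagnostic_yield` is an illustrative assumption; the counts are those given in the text.

```python
def diagnostic_yield(positive: int, tested: int) -> float:
    """Return the diagnostic yield as a percentage of tested patients."""
    if tested <= 0:
        raise ValueError("tested must be a positive number of patients")
    return 100.0 * positive / tested


# Counts taken from the text above.
cma_yield = diagnostic_yield(79, 371)            # whole cohort tested by CMA
karyotype_yield = diagnostic_yield(19, 169)      # patients karyotyped before CMA
projected_karyotype = diagnostic_yield(32, 371)  # pCNVs > 5 Mb over the whole cohort

print(f"CMA: {cma_yield:.2f}%")
print(f"Karyotyping, observed: {karyotype_yield:.2f}%")
print(f"Karyotyping, projected: {projected_karyotype:.1f}%")
```

The projected figure rests on the assumption stated in the text, namely that karyotyping would detect every imbalance larger than 5 Mb that CMA detected.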

Genetic Findings
Detailed findings are described in Table 1: absolute coordinates and the size of the pCNVs, the number of OMIM genes reported, the major phenotypic characteristics, the diagnosed syndrome and de novo status. pCNV findings were not significantly different between males and females: 49 (20.94%) male patients and 30 (21.90%) female patients. Five of our cases had two pCNVs concurrently.
The size of pCNVs ranged from 0.07 Mb to 55.02 Mb, with a median of 0.65 Mb. Similar to previously published studies, the absolute size of losses was generally smaller than that of gains: a median size of 5.31 Mb compared to 8.44 Mb, respectively. pCNVs were distributed across almost all chromosomes; none were identified on chromosomes 4, 6 and Y in our cohort. An enrichment of pCNVs was found on chromosomes 22 and 15, mainly due to a few common syndromes identified in our cohort: Prader-Willi/Angelman syndrome (3 patients), 15q11.2 duplication syndrome (6 patients), 22q11.2 deletion syndrome (1 patient) and 22q11.2 duplication syndrome (3 patients). The chromosomes with less frequent CNVs (fewer than 3 cases) were chromosomes 5, 10, 11, 12, 14, 19, 20 and 21. As Table 1 describes, in these 79 cases with clinically relevant findings, we found 84 rare exonic pCNVs (35 gains and 49 losses), assigned as follows: (i) pCNVs overlapping with known genomic disorders accounted for 34/84 (40.48%) of CNVs, of which 15 were gains and 19 were losses; these are associated with known microduplication or microdeletion syndromes, allowing genetic diagnosis for 9.16% of the patients in this study; (ii) pCNVs not associated with any known syndrome, but already reported in the literature, accounted for 50/84 (59.52%) of CNVs, with 20 gains and 30 losses.
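The breakdown of the 84 pCNVs can be cross-checked with a short tabulation. This is only an arithmetic sketch using the counts quoted above; the dictionary keys are illustrative labels, not categories defined by the study.

```python
# (gains, losses) per pCNV category, as reported in the text.
pcnv_counts = {
    "known genomic disorder": (15, 19),
    "reported, no known syndrome": (20, 30),
}

total_gains = sum(gains for gains, _ in pcnv_counts.values())
total_losses = sum(losses for _, losses in pcnv_counts.values())
total = total_gains + total_losses

# Share of each category among all pCNVs, in percent.
shares = {
    category: 100 * (gains + losses) / total
    for category, (gains, losses) in pcnv_counts.items()
}

for category, share in shares.items():
    print(f"{category}: {share:.2f}% of {total} pCNVs")
print(f"gains: {total_gains}, losses: {total_losses}")
print(f"loss-to-gain ratio: {total_losses / total_gains:.1f}")
```

The totals reproduce the 35 gains and 49 losses stated in the text, and the category shares match the reported 40.48% and 59.52%.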
Where possible, we aimed to evaluate the de novo status and/or segregation patterns of the detected genetic anomalies in trios of mother-father-affected individual; in 2 cases siblings were also analyzed. We classified 35 aberrations as de novo; one of these was also present in the proband's sibling, a case with an otherwise similar clinical phenotype (Table 1, cases #220 and #225). A total of 49 CNVs were reported to be of unknown inheritance where testing results were not available at the time of the article being published.

Clinical Findings
The major phenotypic characteristics of the cases are reported in Table 1. Most patients presented additional features, including ASD, MCA, psychiatric or behavioral issues, cranio-facial dysmorphism, skeletal and muscular anomalies, and variations in height or body weight. The relationship between these and GDD/ID remains unclear. Many cases presented syndromic features, as suggested by the high frequency of MCA and atypical facial appearance. Table 2 summarizes the cohort characteristics and provides statistical calculations for each clinical feature, testing whether it is a predictive phenotype for a higher diagnostic yield in our cohort with unexplained GDD/ID. All the patients in our study had GDD at the time of the study or at an earlier age, with 89.22% considered intellectually disabled. Divided into subgroups based on ID severity, the mild ID group had a lower diagnostic yield (17.05%) than the moderate, severe and profound ID group (23.55%).
Facial dysmorphisms, though mostly minor findings, were reported in 83.28% of cases, microcephaly or macrocephaly in 37.46%, congenital anomalies in 16.71%, ASD in 36.65%, ADHD in 20.21% and speech/language delay in 64.15%. Other phenotypes had lower frequencies. Most patients had more than one associated feature.
We found associations between positive findings and the following clinical features: hearing impairment (OR = 2.58), dysmorphic facial features (OR = 2.01), finger abnormalities (OR = 1.68), congenital anomalies (OR = 1.50), ADHD (OR = 1.21) and psychiatric disturbance (OR = 0.48, i.e., lower odds of a positive finding). There was no significantly higher diagnostic yield by CMA for the other phenotypes.
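An odds ratio of this kind comes from a 2x2 table of feature status against pCNV status. The sketch below is illustrative and is not the SPSS procedure used in the study: the function name `odds_ratio_ci` is hypothetical, the interval uses the standard Wald (Woolf) log-odds formula, which is an assumption about the interval type, and the cells for patients without facial dysmorphism are derived from the cohort totals rather than taken from Table 2.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Wald (Woolf) confidence interval for a 2x2 table.

    a: feature present, pCNV found     b: feature present, no pCNV
    c: feature absent,  pCNV found     d: feature absent,  no pCNV
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this formula")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Facial dysmorphism: 71 of 309 patients with the feature carried a pCNV.
# Remaining cells are derived from the cohort totals (79 positives of 371).
with_feature_pos = 71
with_feature_neg = 309 - 71
without_feature_pos = 79 - 71
without_feature_neg = (371 - 309) - without_feature_pos

or_value, ci_low, ci_high = odds_ratio_ci(
    with_feature_pos, with_feature_neg, without_feature_pos, without_feature_neg
)
print(f"OR = {or_value:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

With these derived counts the point estimate reproduces the OR reported for dysmorphic facial features; the interval depends on the method assumed here and may differ from the one in Table 2.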

Diagnosis Rate and Choice of Test
The diagnostic yield that CMA provides is reported to be between 10 and 25%, higher than karyotype testing, as also shown by our findings. A large cohort study has reported 118 rare de novo CNVs associated with ID [38]. Further analysis of the respective regions identified 10 genes for which a loss of function could lead to ID [39]. In a group of 342 children with unexplained GDD or ID, aCGH detected pCNVs in 13.2% of the patients [40]. In Table 3, we present the diagnostic yield from recent microarray studies on European cohorts with GDD/ID. CMA remains the first-choice diagnostic tool for the detection of causative CNVs in human diseases in many health systems. Despite this recommendation, conventional karyotyping is still widely used in clinical practice in low-resource settings. In Romania, although CMA has proved its role in identifying genetic causes of neurodevelopmental disorders, conventional karyotyping remains the predominant genetic test. There are only a few publications on CMA in cohorts of ND patients. Miclea et al. analyzed 36 patients with GDD/ID and obesity from northern Romania between 2015 and 2017, using the iScan System (Illumina, San Diego, CA, USA), with a diagnostic yield of 33.3% [25]. Our study analyzed a cohort of 371 patients from Romania who underwent microarray testing for diagnostic purposes between 2015 and 2022. This is among the first CMA studies of Romanian patients with unexplained GDD/ID and additional comorbidities. In the current study, a total of 84 pathogenic changes were detected among 79 patients with syndromic GDD/ID (21.29%). Our diagnostic rates are in line with previous reports from multicenter studies [7,23,24].
The CMA diagnostic yield reported in the literature varies widely, depending on the selection criteria for patient inclusion, CNV classification and/or the inclusion of control groups, as well as the preliminary exclusion of large genomic imbalances [42]. In our study, we did not apply strict selection criteria: the only requirement was the presence of GDD/ID, associated or not with MCA, ASD, or dysmorphic features. Patients with seizures or epilepsy were not included in this cohort.
Existing publications recommend a minimum resolution of 200-400 kb for postnatal analyses [7,45,46]. CMA was performed in our center using several types of DNA microarrays (180K and 60K), without any notable difference in their diagnostic reliability. Although 180K platforms offer a three-fold higher resolution, we find that both 60K and 180K microarray platforms comply with the existing requirements regarding resolution, similar to previous reports [22].
The detection of submicroscopic gains and losses of genetic material known as CNVs is the major advantage of CMA over G-banded karyotyping. Due to the higher resolution and whole-genome coverage CMA offers, it can precisely identify the chromosomal breakpoints, the size, and gene content of CNVs. This aids the identification of clinically relevant microdeletions/microduplications in the context of the patient's phenotype. In contrast to aCGH, SNP-aCGH allows the detection of triploidy, low-level mosaicism, loss of heterozygosity (LOH) and uniparental disomy (UPD).
CMA is not only a highly reliable confirmatory test of chromosomal aberrations detected through conventional G-banding karyotyping, but it is also able to specify the size and gene content. In the subgroup of patients with karyotyping offered as a first-choice test, all identified microdeletion and microduplication syndromes had been already reported, whereas the 13 additional ones identified by CMA were novel.
We need to emphasize that a complete genetic diagnosis may require complementary methods, e.g., both CMA and karyotyping [47,48]. CMA did not make karyotyping obsolete. Balanced rearrangements remain undetectable regardless of the type of array used.
For patients #315 and #328, we used a combined strategy including G-banding karyotyping and CMA. Conventional karyotyping offered the global and complex analysis of chromosomal organization and CMA confirmed and characterized the chromosomal rearrangements at molecular level. This stands as proof that CMA cannot entirely replace the standard conventional G-banding karyotyping due to its inability to detect balanced chromosomal rearrangements or to specify gains of DNA sequences in the karyotype.
Standardization of the diagnostic process in GDD/ID with unexplained etiology is an endeavor that has to continually adapt to diagnostic challenges and technical solutions. G-banding karyotyping was frequently utilized in the past, but has slowly transitioned to being an adjunct to microarray technology [49]. Best practices for the assessment of unexplained GDD recommend microarray as the first-line means of genetic investigation [50]. There is a wealth of studies on CMA use in GDD/ID associated, or not, with ASD, MCA and facial dysmorphism [7,24,51-53]. The most recent position statement maintains the recommendation, adding whole-exome (WES) or whole-genome sequencing (WGS) to increase identification of causal variants in up to 40% of patients with severe ID, as they can detect most CNVs and additional gene variations [54-56].

Cohort Findings
The recommended criteria to establish CNV pathogenicity include chromosome rearrangement size, gene content, inheritance pattern and the use of information from databases and the relevant literature to check for overlap with dosage-sensitive regions [7,57].
The pCNVs identified in our cohort were typically very large, with a mean size of 2.46 Mb (median: 0.65 Mb), and most of them contained multiple genes. Some of the benign CNVs situated in gene-poor regions, such as those close to centromeres, unreported here, were very large.
In our cohort, among submicroscopic pCNVs, we observed a roughly 1.4-fold higher frequency of microdeletions than microduplications (58.33% vs. 41.67%). It is well known that, because they can contain dosage-sensitive haploinsufficient genes, microdeletions have a higher pathogenicity than their reciprocal microduplications [44,58,59]. The clinical interpretation of microduplications requires the assessment of their de novo status, gene content analysis and interrogation of public CNV databases [60]. Usually, microduplications larger than 1 Mb are expected to be likely pathogenic. In our cohort, we observed that CNVs larger than 1 Mb are more likely to have a pathogenic/likely pathogenic effect on the phenotype. We detected a total of 12 CNVs smaller than 500 kb with a defined clinical impact: 25% of them were de novo (3/12) and 75% of them (9/12) lacked this assessment.
Similar to previous publications [19,61,62], chromosomal imbalances involving the 15q11.2, 16p11.2 and 22q11.2 loci were the most common findings in our cohort, with 6, 4 and 3 CNVs, respectively. These CNVs are characterized by incomplete penetrance and have been associated with a wide variety of phenotypes, including GDD/ID, ASD, psychiatric disorders, and epilepsy [63-65]. In these cases, clinical interpretation and family genetic counseling are challenging, especially if an inheritance assessment of these chromosomal imbalances is not possible or they are inherited from a healthy parent [66,67]. In our group, parental testing could be performed for all patients within this category: all of them presented de novo CNVs.
In our cohort, we identified three cases of 15q11.2 microdeletion. The literature reports 15q11.2 microdeletion as one of the most common chromosomal abnormalities involved in the pathogenesis of ASD [44,68]. We also identified two cases of proximal 16p11.2 microdeletion (both de novo). This region is characterized by variable penetrance. Rosenfeld et al. reported in 2013 that the 16p11.2 microdeletion has a penetrance of 46.8%, which is compatible with its powerful adverse impact on the phenotype [64]. In our study, all the CNVs detected in these regions were de novo, having a higher general penetrance. The penetrance analysis also supports the role of other CNVs detected in our cohort with low penetrance (<20%), such as proximal 1q21.1 microduplication and 15q11.2 microdeletion, as "risk" or "susceptibility" loci in the pathogenesis of GDD/ID, ASD and MCA.
The implementation of CMA in genetic testing practice has rapidly increased the diagnostic yield of idiopathic GDD/ID associated with MCA, ASD and/or facial dysmorphism by allowing the identification of novel submicroscopic rearrangements involved in the pathogenesis of these clinical phenotypes. This is a critical and challenging point in CMA data interpretation.
Based on the clinical data, the most frequently reported phenotypes are also the main reasons for referral: GDD/ID, MCA and/or dysmorphia, and ASD (Table 2). For instance, congenital anomalies, along with facial dysmorphisms, were reported in more than 16.71% of our cohort. Univariate analysis showed a significant association between the presence of pCNVs and dysmorphic facial features, psychiatric disorders, MCA and finger anomalies. Furthermore, secondary phenotypes, namely ASD, speech/language delay, finger abnormalities and hearing impairment, were shown to be associated with higher rates of pCNVs in our patients with GDD/ID. However, larger sample sizes would be crucial to confirming these findings.

Limitations and Perspectives
There are several limitations to our current study, which we acknowledge and have tried to address to the best of our abilities: (i) the sample size is relatively small; (ii) subjective assessment of some clinical features cannot be excluded; (iii) validation by other molecular genetic assays is lacking for some of the patients; (iv) inheritance status is missing for the interpretation of some rare variants; (v) balanced abnormalities, small-scale mutations and low-level mosaicism cannot be detected by CMA and need to be further evaluated; (vi) sequencing, which could reveal additional variation that may explain this complex pathology, was not performed.
CMA represents a standard method in the genetic diagnostic multistep algorithm, serving as a first-tier test, confirmatory test or test following conventional G-banding karyotyping as well. Despite the good diagnostic yield reported, most of our cases remain undiagnosed and more complex genetic tests based on NGS methods are required to identify the genetic etiology. This we recognize as a limitation of the genetic evaluation we were able to provide, which we hope to address in future diagnostic workflows. A lack of sequence information does not enable us to identify relevant causal genetic variations or refine the diagnosis by identifying recessive conditions revealed by deletions in our cohort, for instance.

Conclusions
There are only a few publications on CMA testing in cohorts of GDD/ID patients with associated comorbidities from East European countries. Our study is among the first CMA evaluation reports of a Romanian patient cohort with unexplained GDD/ID and additional comorbidities.
The reported diagnostic rate for CMA in this study was 21.29%, in line with reports in the literature. We defined the pCNV yields and profiles of a Romanian cohort of patients with unexplained GDD/ID associated with different features in the context of other European studies.
Our study reinforces CMA as an effective diagnostic tool for both detection and precise characterization of clinically relevant CNVs in patients with GDD/ID, ASD and MCA. CMA can characterize the regions involved in structural abnormalities detected by conventional karyotyping. A correct and complete diagnosis dictates that CMA and conventional karyotyping should be used complementarily in certain instances. Parental analysis is essential for genetic counselling, particularly when the patient has terminal deletion/duplication or large CNVs.
The main reasons for referral for CMA testing in our study were GDD/ID, MCA, dysmorphic facial features, and ASD. Dysmorphic facial features and ASD (as a main or secondary feature) and secondary phenotypes such as micro/macrocephaly, MCA, psychiatric disorders, ADHD or speech/language delay are possible predictive phenotypes of a higher diagnostic rate through CMA.
Due to its wide application and clear cost effectiveness, CMA is now the most efficient cytogenetic screening method routinely used in genetic diagnostics, and it is likely to be replaced only when the costs of NGS-based methods are significantly reduced.