Identification of Extremely Rare Pathogenic CNVs by Array CGH in Saudi Children with Developmental Delay, Congenital Malformations, and Intellectual Disability

Chromosomal imbalance is implicated in developmental delay (DD), congenital malformations (CM), and intellectual disability (ID); thus, precise identification of copy number variations (CNVs) is essential. We therefore aimed to investigate the genetic heterogeneity in Saudi children with DD/CM/ID. High-resolution array comparative genomic hybridization (array CGH) was used to detect disease-associated CNVs in 63 patients. Quantitative PCR was performed to confirm the detected CNVs. Giemsa banding-based karyotyping was also performed. Array CGH identified chromosomal abnormalities in 24 patients; distinct pathogenic CNVs and/or variants of uncertain significance (VUS) were found in 19 patients, and aneuploidy was found in 5 patients, including 47,XXY (n = 2), 45,X (n = 2), and a patient with trisomy 18 who carried a balanced Robertsonian translocation. CNVs including 9p24p13, 16p13p11, and 18p11 had gains/duplications, and CNVs including 3p23p14, 10q26, 11p15, 11q24q25, 13q21.1q32.1, 16p13.3p11.2, and 20q11.1q13.2 had losses/deletions only, while CNVs including 8q24, 11q12, 15q25q26, 16q21q23, and 22q11q13 were found as either gains or losses in different individuals. In contrast, standard karyotyping detected chromosomal abnormalities in ten patients. The diagnosis rate of array CGH (28%, 18/63 patients) was around two-fold higher than that of conventional karyotyping (15.87%, 10/63 patients). We herein report, for the first time, extremely rare pathogenic CNVs in Saudi children with DD/CM/ID. The reported prevalence of CNVs in Saudi Arabia adds value to clinical cytogenetics.


Introduction
Children under the age of 5 years are categorized as individuals with global developmental delay (DD) if they are slow to reach at least two of the following milestones: gross or fine motor activity, speech or language, cognition or mental activity, and social or personal activities of daily living [1,2]. Individuals with congenital malformations (CM) have had a problem in the heart, kidney, brain, muscles, or skeleton since birth, and individuals with intellectual disability (ID) have problems with general mental abilities: (i) intellectual functioning (such as learning, reasoning, problem-solving) and/or (ii) adaptive functioning (such as language, number concept, time calculation, memory, social responsibility, communication, and independent living) [3,4]. The Saudi Population Registry (statistics authority) reported a combined disabilities population of around discuss the characteristic features and clinical significance of the detected CNVs including pathogenic and VUS and compare the diagnostic yields of array CGH with previous reports.

Patients and Ethical Approval
We recruited 63 children with DD, CM, and/or ID after obtaining informed consent from their parents/guardians and research approval from the institutional ethics committee (approval Code # 012-CEGMR-ETH-0), and the work was performed in accordance with the Declaration of Helsinki. Children below the age of 18 years who were diagnosed with distinct features of DD, CM, and ID and were residents of the Western region of Saudi Arabia were included in the study. Patients for whom informed consent was refused were excluded. Patients were referred by the KAU Hospital (Jeddah), the Maternity and Child Hospital (Jeddah), and the Pediatrics Clinic of Taif Hospital (Taif), all in Saudi Arabia, to the Center of Excellence in Genomic Medical Research, where clinical examination and molecular cytogenetic testing were conducted. Clinical information and family history were recorded to establish the DD etiology and to elucidate the diagnostic process of unexplained DD/CM/ID.

Cytogenetics Analyses
Karyotyping by G-banding using trypsin and Giemsa (GTG banding) was performed by microscopic examination of at least 20 metaphases per case. Chromosomes were analyzed using Applied Imaging Karyotyping software (Applied Imaging, Santa Clara, CA, USA), and karyotypes were described according to the International System for Human Cytogenomic Nomenclature (ISCN, 2020) [24,32,33].

DNA Preparation and Whole-Genome Array CGH
Genomic DNA was extracted from 5 mL of each patient's blood using the QIAamp DNA Blood Mini Kit (Qiagen, Hilden, Germany) and purified using the QIA-Miniprep Kit (Qiagen). The concentration and quality of DNA were determined using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA).
To investigate genome defects, we applied high-density array CGH using the SurePrint G3 Human CGH Microarray Kit in 1 × 244 K (AMADID Number: 014693) and 2 × 400 K (AMADID Number: 021850) formats, consisting of 244,000 and 400,000 copy number probes, respectively (Agilent Technologies, Santa Clara, CA, USA), with UCSC hg18 as the reference genome. The overall median probe spacing of the 1 × 244 K and 2 × 400 K chips was 8.9 kb and 5.3 kb, respectively, whereas the spacing within RefSeq genes was 7.4 kb and 4.6 kb, respectively. Microarray analysis was conducted according to Agilent's assay procedures, with modifications. Commercial human reference DNA was used (Agilent Technologies, Santa Clara, CA, USA). After enzymatic digestion with AluI and RsaI, the patient DNA samples were labeled with cyanine 3-deoxyuridine triphosphate (Cy3-dUTP) using the SureTag DNA Labeling Kit (Agilent Technologies), whereas sex-matched reference DNA samples were labeled with Cy5-dUTP. The labeled DNA was purified before being mixed with Cot-1 DNA, 10× array CGH blocking agent, and 2× HI-RPM hybridization buffer (Agilent Technologies); this mixture was dispensed onto a microarray slide. Hybridization was performed in an Agilent hybridization chamber at 67 °C and 20 rpm for 24 h, after which the slides were washed stringently with wash buffer 1 and wash buffer 2 (Agilent Technologies). Microarray slide images were captured using an Agilent SureScan Microarray Scanner G2505C.

Interpretation of CNVs
CNV analysis was performed using Agilent Cytogenomics v5.2.0.2 and human genome build hg18. A CNV was called as a gain or a loss if the region contained at least three consecutive probes with a mean log2 ratio of +0.25 or −0.25, respectively. A mean log2 ratio > 0.58 was considered a gain, whereas a ratio < −1 indicated a loss. Following the recommended guidelines for detecting pathogenic variants, CNVs < 300 kb were excluded from further analysis. In addition, CNVs were classified as benign if the corresponding regions did not harbor genes or were present in healthy normal controls (Database of Genomic Variants, DGV; http://projects.tcag.ca/variation, accessed on 25 November 2022).
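The calling criteria above can be illustrated with a minimal sketch. It is not the algorithm implemented in the Agilent software; the `Segment` class, the function names, the example segments, and the choice of inclusive cut-offs are all assumptions made for illustration. The helper `expected_log2` shows why values near 0.58 and −1 are natural reference points: they are the theoretical log2 ratios for three copies and one copy, respectively, against a two-copy reference.

```python
import math
from dataclasses import dataclass

# Thresholds quoted in the text; the constant names are illustrative only.
MIN_PROBES = 3           # at least three consecutive probes
CALL_LOG2 = 0.25         # |mean log2 ratio| needed to call a gain or a loss
MIN_SIZE_BP = 300_000    # CNVs smaller than 300 kb are excluded


@dataclass
class Segment:
    """A hypothetical aberrant interval, as might be exported from array software."""
    chrom: str
    start: int           # bp
    end: int             # bp
    n_probes: int
    mean_log2: float


def expected_log2(copies: int, reference_copies: int = 2) -> float:
    """Theoretical log2 ratio for a copy number >= 1 in a non-mosaic sample."""
    return math.log2(copies / reference_copies)


def call_segment(seg: Segment) -> str:
    """Return 'gain', 'loss', or 'no_call' (inclusive cut-offs are an assumption)."""
    if seg.n_probes < MIN_PROBES:
        return "no_call"
    if seg.mean_log2 >= CALL_LOG2:
        return "gain"
    if seg.mean_log2 <= -CALL_LOG2:
        return "loss"
    return "no_call"


def passes_size_filter(seg: Segment) -> bool:
    """Keep only segments of at least 300 kb for further interpretation."""
    return (seg.end - seg.start) >= MIN_SIZE_BP


if __name__ == "__main__":
    # Made-up segments, not patient data.
    segments = [
        Segment("chr11", 128_000_000, 134_000_000, 650, -0.95),
        Segment("chr18", 72_000_000, 72_150_000, 20, 0.55),
        Segment("chr3", 40_000_000, 41_000_000, 2, 0.70),
    ]
    for seg in segments:
        print(seg.chrom, call_segment(seg), passes_size_filter(seg))
    print(round(expected_log2(3), 3), expected_log2(1))
```

Keeping the size filter separate from the gain/loss call mirrors the text, in which small CNVs are first detected and then excluded from further analysis.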

Quantitative Real-Time PCR
Quantitative real-time PCR (qPCR) was used to validate the deletions and duplications detected by array CGH. Primer sets were designed for selected genomic regions of the target genes, including FLI1, SHANK3, and MBP, and for the endogenous GAPDH gene as an internal control, using Primer-3 Software (V.0.4.0). Each reaction was run in a final volume of 10 µL, comprising 5 µL SYBR-Green qPCR master mix (KAPA Biosystems, Wilmington, NC, USA), 10 pmol of each primer, and 20 ng genomic DNA. Reactions were performed in triplicate in a 96-well plate. Raw data were generated by the StepOnePlus™ Real-Time PCR System and DataAssist software. qPCR data were analyzed by the ΔΔCT (Livak) method, and GraphPad Prism software was used for presentation.
In a few cases, the chromosomal abnormalities overlapped and included contiguous genes essential to the same syndrome. Sizes of the disease-associated CNVs varied and can be grouped as: <5 Mb (17.8%), 5-10 Mb (13.5%), 10-20 Mb (Figure 4).
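For readers unfamiliar with the ΔΔCT calculation, the following is a minimal sketch of how a relative copy number can be derived from triplicate CT values. It assumes roughly 100% amplification efficiency for both the target and GAPDH amplicons (the usual assumption of the Livak method) and a two-copy control sample; the function names and the CT values are hypothetical and do not come from this study.

```python
from statistics import mean


def delta_delta_ct(target_patient, ref_patient, target_control, ref_control):
    """ddCT = (CT_target - CT_ref) in patient minus (CT_target - CT_ref) in control.

    Each argument is a list of replicate CT values (e.g. triplicates).
    """
    d_ct_patient = mean(target_patient) - mean(ref_patient)
    d_ct_control = mean(target_control) - mean(ref_control)
    return d_ct_patient - d_ct_control


def relative_copy_number(ddct, control_copies=2):
    """Estimated copy number, assuming the control carries `control_copies` copies."""
    return control_copies * 2 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical triplicate CT values: target gene vs. GAPDH reference.
    ddct = delta_delta_ct(
        target_patient=[25.5, 26.0, 26.5],
        ref_patient=[24.0, 24.0, 24.0],
        target_control=[24.5, 25.0, 25.5],
        ref_control=[24.0, 24.0, 24.0],
    )
    print(ddct, relative_copy_number(ddct))
```

Under these assumptions, an estimate near one copy is consistent with a heterozygous deletion and an estimate near three copies with a duplication; values in between would call for repeat testing rather than interpretation.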

Confirmation of CNVs by Quantitative Real-Time PCR (qPCR)
qPCR results confirmed the CNVs detected by array CGH. We found a significant decrease in the gene copy number of FLI1 (11q24q25 deletion) and SHANK3 (22q11q13 deletion), and a significant increase in MBP (18q23 duplication) (Figure 5).

Discussion
In this study, we identified the CNVs associated with DD/CM/ID, and the diagnostic yield of array CGH (28%) was in accordance with previous findings [1,19,34]. The clinical significance of the identified CNVs was classified into three categories: established clinical significance (syndromic pathogenic and non-syndromic pathogenic), VUSs, and without clinical significance (benign). A pathogenic CNV is directly associated with the patient phenotype, a non-syndromic CNV encompasses diverse disorders that are indirectly associated with the individual phenotype, and a VUS is linked to indistinct disease vulnerability. Rare chromosomal abnormalities reported in only a limited number of cases without systematic analysis, for which common clinical manifestations remain elusive, were also classified as VUSs.
The human genome is diploid and is expected to contain two copies of each chromosome, with the exception of the sex chromosomes in males. In reality, however, genetic variations are common, ranging from large chromosomal anomalies and copy number variations to single nucleotide changes. CNVs are DNA segments, usually ranging in size from 1 kb to 10 Mb, that are present at variable copy numbers in comparison with a reference genome. Chromosomal abnormalities are less frequent, but their presence leads to different diseases/syndromes, whereas CNVs and SNPs are frequent in the genome and mostly benign, although a few are pathogenic [35,36]. In general, humans harbor 10-100 CNVs, which are mostly benign [37]. Herein, we detected 2537 unfiltered CNVs among 63 individuals (~40 CNVs per person), most of which were benign. Only 42 CNVs across 19 patients (~2 CNVs per person) were pathogenic or VUSs.
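The per-person figures above, and the diagnosis rates given in the abstract, follow from simple ratios. The short sketch below reproduces that arithmetic from the counts reported in this paper; the function names are illustrative only.

```python
def per_person(total_cnvs: int, n_individuals: int) -> float:
    """Average number of CNVs per individual."""
    return total_cnvs / n_individuals


def yield_percent(positive: int, cohort: int) -> float:
    """Diagnostic yield as a percentage of the cohort."""
    return 100 * positive / cohort


if __name__ == "__main__":
    # Counts as reported in the text.
    print(round(per_person(2537, 63), 1))   # unfiltered CNVs per individual
    print(round(per_person(42, 19), 1))     # pathogenic/VUS CNVs per affected patient
    array_cgh = yield_percent(18, 63)
    karyotype = yield_percent(10, 63)
    print(round(array_cgh, 2), round(karyotype, 2), round(array_cgh / karyotype, 1))
```

The ratio of the two yields is 1.8, which corresponds to the "around two-fold" difference stated in the abstract.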

Our data, along with previous research, demonstrate that array CGH is efficient in identifying known and novel disease-associated CNVs. Extensive CNV analysis of developmental disabilities identified eight VUSs (3p23p14, 11q12.1q14, 13q21.1q32.1, 16p13p11, 16q21q23, 18q23, 20q11.1q13.2, and 22q11q13) [61]. This study has some limitations. The cohort was not large enough to truly represent the population. Clinical pictures and images were not available for publication because of confidentiality and patient privacy policy. A discrepancy was found between the G-banding and array CGH results: a couple of large CNVs (>10 Mb) were detected by array CGH but not by G-banding, despite the use of a 550-band resolution, which might be because the traditional technique lacked accuracy. Array CGH confirmed all karyotyping results, but it is limited in detecting balanced translocations or inversions, ring chromosomes, and low-level mosaicism. An additional challenge lies in interpreting VUSs found through array CGH and validating their clinical significance. These considerations lead us to recommend that diagnoses employ both array CGH and conventional karyotyping to confirm positive cases and identify CNVs in negative cases. In the future, validation of the detected CNVs, especially VUSs, and their confirmation in a larger cohort will help to overcome the limitations of the current study.

Conclusions
This is the first comprehensive array CGH-based study from Saudi Arabia, and we herein report, for the first time, extremely rare pathogenic CNVs/genes (8q24, 10q26, 11q24q25, 18p11, 15q25q26, and 16p13p11) among Saudi individuals with DD, CM, and ID that may contribute to their genetic etiology. Additionally, our results showed a couple of potentially causative CNVs that may be re-classified as pathogenic after detailed validation and functional characterization. Our results enhance the knowledge of the copy number variants underlying DD, CM, and ID in the Saudi population, and array technology will potentially help to improve the genetic diagnosis of CNVs and novel syndromes in neonatal and prenatal cases.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/children10040662/s1, Table S1: Clinical information of pediatric developmental delay, congenital malfunction, and intellectual disability patients included in this study.

Institutional Review Board Statement:
The research was approved by the institutional ethics committee (approval Code # 012-CEGMR-ETH-0).

Informed Consent Statement:
All patients and/or their parents were informed about research participation and were included in the study after giving their consent.

Data Availability Statement:
The raw datasets used in this study are available in the GEO repository under accession numbers GSE182101 (super series), GSE181995 (subseries), and GSE182081 (subseries).