Association of Polymorphisms of MASP1/3, COLEC10, and COLEC11 Genes with 3MC Syndrome

The Malpuech, Michels, Mingarelli, Carnevale (3MC) syndrome is a rare, autosomal recessive genetic disorder associated with mutations in the MASP1/3, COLEC11, or COLEC10 genes. The number of 3MC patients with known mutations in these three genes reported so far remains very small. To date, 16 mutations in MASP1/3, 12 mutations in COLEC11, and three in COLEC10 associated with 3MC syndrome have been identified. The products of these genes play an essential role as factors involved in the activation of complement via the lectin or alternative (MASP-3) pathways. Recent data indicate that mannose-binding lectin-associated serine protease-1 (MASP-1), MASP-3, collectin kidney-1 (collectin-11) (CL-K1), and collectin liver-1 (collectin-10) (CL-L1) also participate in the correct migration of neural crest cells (NCC) during embryogenesis. This is supported by relationships between MASP1/3, COLEC10, and COLEC11 gene mutations and the incidence of 3MC syndrome, associated with craniofacial abnormalities such as radioulnar synostosis, high-arched eyebrows, cleft lip/palate, hearing loss, and ptosis.


Introduction
The Malpuech, Michels, Mingarelli and Carnevale syndrome [1][2][3][4] is commonly called 3MC syndrome. In 1989, Carnevale et al. reported a phenotype consisting of downslanting palpebral fissures, ptosis of the eyelids, periumbilical depression, hypertelorism, radioulnar synostosis, and developmental delay in two Italian siblings (MIM 265050). In 1996, two sisters with similar ocular, facial, skeletal, and abdominal defects, but with normal intelligence, were reported by Mingarelli et al. (also MIM 265050). Since the clinical picture of patients suffering from Carnevale and Mingarelli syndromes overlapped with Michels (MIM 257920) and Malpuech syndromes (MIM 248340), it was suggested that all four disorders should be reclassified into one "3MC syndrome". It is a rare, autosomal recessive genetic disorder, characterized by a wide spectrum of developmental abnormalities that can include high-arched eyebrows, cleft lip/palate, hearing loss, short stature, umbilical hernias/omphalocele, and urogenital abnormalities. The prevalence of 3MC syndrome is unknown. The largest number of affected persons is located in the Middle East. The syndrome is caused by mutations in the mannose-binding lectin-associated serine protease (MASP)1/3 [5], COLEC11 [6], or COLEC10 [7] genes. The number of 3MC patients carrying known mutations in these three genes reported so far remains rather small, but the disease is more likely to occur in families with consanguineous parents. In total, 46 3MC patients from 34 families with mutations in the above-mentioned genes have been diagnosed. Among them, 26 persons from 20 families had mutations in the MASP1/3 gene (Table 1), 17 individuals from 12 families had mutations in the COLEC11 gene (Table 2), and three patients from two families had mutations in the COLEC10 gene (Table 3).
Those mutations abolish or impair the function of the corresponding proteins, resulting in defective control of cell migration at an early stage. This in turn interferes with the ontogenesis of tissues and organs and leads to the various abnormalities manifested. The proteins expressed are also factors involved in the activation of complement via the lectin pathway, an important branch of the innate immune system [6,8]. It was suggested that dysfunction of the lectin pathway can be compensated by other defense mechanisms, which seems to explain why immune system dysfunctions are not part of the 3MC syndrome.
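The patient and family totals quoted in the Introduction can be cross-checked against the per-gene figures with a few lines of arithmetic. The following is only an illustrative sketch; the dictionary layout and variable names are assumptions introduced here and are not part of the source data tables.

```python
# Per-gene counts of 3MC patients and families, as stated in the Introduction.
# The structure and names below are illustrative assumptions.
counts = {
    "MASP1/3": {"patients": 26, "families": 20},
    "COLEC11": {"patients": 17, "families": 12},
    "COLEC10": {"patients": 3, "families": 2},
}

total_patients = sum(c["patients"] for c in counts.values())
total_families = sum(c["families"] for c in counts.values())

print(total_patients, total_families)  # 46 34

# Share of all reported patients attributable to each gene.
for gene, c in counts.items():
    share = 100 * c["patients"] / total_patients
    print(f"{gene}: {c['patients']} patients ({share:.1f}%), {c['families']} families")
```

The per-gene figures sum to the 46 patients and 34 families given in the text.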
MASP-1 and MASP-3 are proteases composed of six well-characterized domains, of which five constitute the heavy chain while the sixth (SP) constitutes the light chain. Upon activation of the proenzymes, the peptide bond between the two chains is cleaved, and both chains remain connected via a disulphide bond [11,12]. The CUB (C1r/C1s-Uegf (urchin epidermal growth factor)-BMP (bone morphogenetic protein)) and EGF (epidermal growth factor) domains are responsible for forming MASP/MAp44 complexes with pattern-recognition molecules (PRM) such as collectins and ficolins [13]. It should be stressed that most of the proteins possessing the CUB domain are involved in developmental processes, such as bone morphogenetic protein 1, the dorso-ventral patterning protein tolloid, a family of spermadhesins, and the neuronal recognition molecule A5 [8,14]. The CCP (complement control protein) domains are common to a variety of complement factors [15]. The serine protease (SP) domain with catalytic activity is characteristic of chymotrypsin-like proteases [8]. Several single nucleotide polymorphisms of the MASP1/3 gene are strongly associated with protein serum levels. Among them, homozygosity in intron 8 (rs3774275, G/G) was associated with decreased MASP-3 but increased MASP-1 and MAp44. Variant alleles for rs698090 and rs67143992 are associated with an increase in MASP-1 and MAp44 and a decrease in MASP-3, while variant alleles of rs72549154 and rs35089177 result in decreases of MASP-1 and MAp44 and an increase of MASP-3 [9] (Figure 1).
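The reported directions of effect for the five MASP1/3 polymorphisms can be summarized in a small lookup table. The sketch below encodes direction only (the text gives no magnitudes), and the names `EFFECTS` and `groups` are illustrative assumptions rather than anything defined in the cited work.

```python
from collections import defaultdict

# +1 = higher serum level, -1 = lower serum level, as described in the text.
# Direction only; effect sizes are not given in the source.
EFFECTS = {
    "rs3774275": {"MASP-1": +1, "MASP-3": -1, "MAp44": +1},  # reported for G/G homozygotes
    "rs698090": {"MASP-1": +1, "MASP-3": -1, "MAp44": +1},
    "rs67143992": {"MASP-1": +1, "MASP-3": -1, "MAp44": +1},
    "rs72549154": {"MASP-1": -1, "MASP-3": +1, "MAp44": -1},
    "rs35089177": {"MASP-1": -1, "MASP-3": +1, "MAp44": -1},
}

# Group polymorphisms that share the same pattern of effects.
groups = defaultdict(list)
for snp, effect in EFFECTS.items():
    groups[tuple(sorted(effect.items()))].append(snp)

for pattern, snps in groups.items():
    print(dict(pattern), snps)

# In every listed case, MASP-3 changes in the opposite direction to MASP-1 and MAp44.
inverse = all(e["MASP-3"] == -e["MASP-1"] == -e["MAp44"] for e in EFFECTS.values())
print(inverse)  # True
```

Grouping the entries makes the pattern explicit: in all five cases the MASP-3 level moves in the opposite direction to MASP-1 and MAp44, which are products of the same gene.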

Tissue Expression of MASP-1, MASP-3, and MAp44
MASP-1 is mainly expressed in the liver, although specific mRNA was also found in the brain and cervix. Similarly, the major site of MASP-3 synthesis is the liver, but its expression was also noted in a variety of other sites, including the colon, bladder, and uterus. MAp44 is synthesized mainly in the heart, while weaker expression takes place in the liver, brain, and cervix [10,17,18]. The expression of MASP-3 has been considered ubiquitous compared with other isoforms [10]. These results emphasize the importance of alternative splicing mechanisms in the regulation of expression in different tissues. The widespread expression of the MASP1/3 gene may indicate local functions of its products MASP-1, MASP-3, and MAp44. The median serum levels of MASP-1, MASP-3, and MAp44 in healthy adults were found to equal 10.8, 6.7, and 2.2 µg/mL, respectively. Concentrations of MASP-1 and MASP-3 showed gender differences: the former was higher in women, while the latter was higher in men [19]. Moreover, MASP-3 and MAp44 levels also correlate with age [19].
MASP-3 is believed to contribute to the activation of complement via the alternative pathway through pro-factor D cleavage [20]. Another substrate is insulin-like growth factor-binding protein 5 (IGFBP-5), which modulates the effects of IGF on cell survival, differentiation, and proliferation [30].
MAp44 functions as an inhibitor of the complement pathway by competing with MASP for PRM-binding sites [31]. It also contributes to the regulation of cardiac development [32].
Some clinical associations of MASP-1, MASP-3, and MAp44 other than 3MC have been reported. For example, Weinschutz Mendes et al. (2020) found that polymorphisms influencing serum concentrations of MASP-3/MAp44 modulate susceptibility to leprosy [33]. Larsen et al. (2019) showed that low MASP-1 concentrations were associated with clotting disorders in patients with septic shock [34]. Michalski et al. (2019) demonstrated that children undergoing surgical correction of congenital heart disease with low pre-operative MAp44 concentration were more likely to suffer from post-operative complications such as systemic inflammatory response syndrome (SIRS), renal failure, multiorgan dysfunction (MODS), or low cardiac output syndrome (LCOS), while high MASP-1 and low MASP-3 were often associated with fatal outcome [35].

Genes and Protein Structures
Collectins constitute a group of 400-800 kDa oligomeric, calcium-dependent lectins consisting of basic subunits that are triplets of 30-35 kDa polypeptide chains. Collectin liver-1 (collectin-10) (CL-L1) and collectin kidney-1 (collectin-11) (CL-K1) are closely related, both structurally and functionally [36]. The collectin polypeptide chain consists of an N-terminal region containing cysteine, a collagen-like domain with Gly-Xaa-Yaa repeats (where Xaa and Yaa represent any amino acid residues), a neck region, and a C-terminal carbohydrate-recognition domain (CRD). The CRD is responsible for interactions with sugar residues on microbial surfaces or aberrant host cells, whereas the collagen-like domain forms complexes with proteins of the MASP family and interacts with cell receptors. CL-L1 and CL-K1 form heterocomplexes called CL-LK, which circulate in blood. Each molecule is stabilized by a disulphide bridge (created by Cys residues present in N-terminal domains) and consists of two CL-K1 and one CL-L1 polypeptide chains [37]. CL-K1 is present in species ranging from the zebrafish to humans (with protein identities of the full-length protein of 72-98%). The protein identity between the CRDs of CL-K1 and CL-L1 reaches 54%. This compares with only 25-32% for the corresponding domains of other collectins [38].
The human COLEC10 gene is localized to chromosome 8 (8q23-q24.1) and includes six exons. The first one encodes the untranslated region, the N-terminal domain, and the first Gly-Xaa-Yaa sequence of the collagen-like region. The rest of the collagen-like domain is encoded by exons 2-5. Exon 5 also encodes the alpha-helical coiled-coil neck region, while exon 6 encodes the CRD [39]. Twenty polymorphic sites were identified in the promoter, exons, and flanking regions of the COLEC10 gene [40]. None of the promoter polymorphisms were found to influence CL-L1 serum level. The +3654 C>T polymorphism (rs149331285, exon 5) affects the protein structure due to an amino acid exchange (p.Arg125Trp). Heterozygosity was demonstrated to be associated with significantly higher CL-L1 serum concentration, compared with C/C homozygosity. Moreover, the -161/-157AAAATdel (rs148350292) may disturb the binding of several transcription factors essential for liver development or immune response modulation [40].
The COLEC11 gene is localized to chromosome 2 (2p25.3). It includes eleven exons, of which seven encode the dominant isoform CL-11-1. With the exception of the first one, which encodes the untranslated region, the other six exons are arranged similarly to those of the COLEC10 gene and encode similar regions [39]. Several variations in the promoter, exons, and introns have been identified. Among them, promoter polymorphism -9570 C>T (rs3820897) was shown to influence CL-K1 serum level. The non-synonymous SNP (p.His219Arg) in exon 7 (+39618 C>G, rs7567833) has no impact on protein concentration in serum; however, it is suspected to influence ligand binding [40] (Figure 2).

Tissue Expression of Collectin Liver-1 and Collectin Kidney-1
The tissue expression of both CL-K1 and CL-L1 is ubiquitous. The highest expression of CL-K1 was found in the liver, kidney, adrenal gland, ovary and gallbladder (and to a lower extent, in the lung, ovary, testis, and retina) whereas expression of CL-L1 was found mainly in the liver and adrenal gland [38,41,42]. In the adrenal gland, CL-K1 was reported to be expressed in cells of all three layers of the cortex (glomerular, fascicular, and reticular), whereas in the kidney, it is expressed in the distal tubules, as well as the glomeruli and proximal tubules [38]. In the ovary, it is associated with granulosa and theca lutein cells, while in the testis, the main expression sites are seminiferous tubules [38]. In the liver, the expression of both CL-K1 and CL-L1 is associated with hepatocytes [38,43]. Furthermore, high expression of both collectins was detected in placenta [42]. The average human CL-L1 serum level was estimated to be 0.5 µg/mL, while that of CL-K1 was estimated to be approximately 0.4 µg/mL. Their concentrations are strongly correlated [19]. No significant differences with sex or age (18-70 years), and no diurnal variations, were found [19,40,44,45].
The microbial structures recognized by CL-L1 are not yet established, but formation of heterodimers with CL-K1 could possibly lead to an increased range of interactions with microorganisms, as well as to higher affinity. Thus, CL-LK is characterized by high affinity to D-mannose (D-Man), N-acetyl-D-glucosamine (D-GlcNAc), D-galactose (D-Gal), L- and D-fucose (L-Fuc, D-Fuc), and N-acetyl-D-mannosamine (D-ManNAc) [41]. CL-LK, in cooperation with MASP, may initiate activation of complement via the lectin pathway (and cross-talk with the coagulation cascade). It may also aggregate and opsonize microorganisms, contributing to enhanced phagocytosis [48].
Through the collagen-like domain, but also via its CRD, CL-K1 binds to mammalian and bacterial DNA in a calcium-independent manner [49]. It also binds to DNA exposed on apoptotic/necrotic cells and may facilitate their clearance, preventing formation of anti-DNA antibodies. The CL-K1-DNA interaction may lead to complement activation. However, it is much weaker compared to other collectins, probably due to different degrees of protein oligomerization [50].
Only a few reports concerning CL-LK disease associations and the clinical significance of CL-L1/CL-K1 have been published to date. Increased blood CL-L1 levels were observed at early stages of acute liver failure and cirrhosis. Smedbråten et al. (2017) reported that high plasma concentrations of CL-L1 and CL-K1 at the time of transplantation are correlated with increased mortality in kidney transplant recipients [51]. Troldborg et al. (2018) found lower CL-L1 and CL-K1 levels in systemic lupus erythematosus (SLE) patients, compared with healthy controls; however, no association with SLE disease activity index (SLEDAI) score was found [52]. Storm et al. (2014) reported that CL-L1 may significantly distinguish between patients with colorectal cancer (CRC), patients with adenomas, and individuals without neoplastic bowel lesions [53]. Higher levels of CL-L1 were associated with lower odds of CRC [53]. Świerzko et al. (2018) reported higher serum levels of CL-LK complex in multiple myeloma patients compared with healthy individuals [45]. Farrar et al. (2016) showed that deficiency of CL-K1 is protective in the case of ischemia, preventing tubular damage and loss of renal function [54]. Similar conclusions were also drawn by Wu et al. (2018), who demonstrated that CL-K1 plays a harmful role in the development of tubulointerstitial fibrosis through leukocyte chemotaxis and its impact on the proliferation of renal fibroblasts [55]. Koipallil Gopalakrishnan Nair et al. (2015) observed an approximately 3-fold increase in CL-K1 and CL-L1 transcript expression in human ectopic endometriotic mesenchymal stem cells (MSCs) in comparison with eutopic MSCs [56].

Associations of COLEC10, COLEC11, and MASP1/3 Gene Mutations with 3MC Syndrome
An association of MASP1/3 gene polymorphisms with 3MC syndrome was first reported by Sirmaci et al. (2010) in three individuals from two related Turkish families. They found a missense c.2059 G>A (p.Gly687Arg) mutation specific to the MASP-3 isoform and a nonsense c.870 G>A (p.Trp290*) mutation in exon 6, which is shared by all gene products. In silico results suggested that the p.Gly687Arg mutation may be damaging for the MASP-3 catalytic domain [5]. Other abnormalities of the same gene were identified by Rooryck et al. (2011). They described three homozygous missense mutations in patients affected by 3MC syndrome, all located within exon 12, which encodes the MASP-3 protease domain (c.1489C>T p.His497Tyr, c.1888T>C p.Cys630Arg, and c.1997G>A p.Gly666Glu). All of them are predicted to affect protein activity [6]. Atik [58]. Carriers of the above-mentioned mutations are suspected to be functionally CL-K1-deficient. Two novel non-synonymous homozygous COLEC11 mutations in 3MC patients were later described by Munye et al. (2017). One of them, c.309delT (p.Gly104Valfs29 in exon 4), is again predicted to introduce a premature stop codon, while the other, c.G496A (p.Ala166Thr in exon 6), causes a structural change affecting protein activity. Additionally, they found a deletion of 10 nucleotides (89_98del ATGACGCCTG in exon 2), which is predicted to cause a frameshift and the introduction of a premature stop codon (p.Asp30Alafs68) [7]. To summarize, 11 mutations have so far been described in the COLEC11 gene in 17 3MC patients (Table 2). Six of them affect the CRD region of the collectin.
Three mutations of the COLEC10 gene were found to be associated with 3MC syndrome: c.25 C>T (exon 1), c.226delA (exon 3), and c.528 C>G (exon 6). The last-mentioned results in severe impairment of protein secretion, whereas the two others lead to nonsense-mediated decay of transcripts [7]. All mutations in the COLEC10 gene occurring in patients with 3MC syndrome are shown in Table 3.
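The MASP1/3 substitutions listed above follow a regular "coding change plus protein change" notation, so they can be turned into structured records for tabulation. The following is a minimal sketch that handles only single-nucleotide substitutions written as in the text; it is not a full HGVS parser, and the names `VARIANT_RE`, `Substitution`, and `parse` are hypothetical helpers introduced for illustration.

```python
import re
from dataclasses import dataclass

# Simplified pattern for substitutions such as "c.1489C>T p.His497Tyr" or
# "c.870 G>A (p.Trp290*)". Deletions and frameshifts are deliberately not covered.
VARIANT_RE = re.compile(
    r"c\.(?P<pos>\d+)\s*(?P<ref>[ACGT])>(?P<alt>[ACGT])"
    r"\s*\(?p\.(?P<aa_ref>[A-Z][a-z]{2})(?P<aa_pos>\d+)(?P<aa_alt>[A-Z][a-z]{2}|\*)\)?"
)


@dataclass(frozen=True)
class Substitution:
    cdna_pos: int
    ref: str
    alt: str
    aa_ref: str
    aa_pos: int
    aa_alt: str  # three-letter amino acid code, or "*" for a stop codon

    @property
    def kind(self) -> str:
        if self.aa_alt == "*":
            return "nonsense"
        return "synonymous" if self.aa_alt == self.aa_ref else "missense"


def parse(text: str) -> Substitution:
    """Parse one substitution written in the simplified notation above."""
    m = VARIANT_RE.search(text)
    if m is None:
        raise ValueError(f"unrecognised variant notation: {text!r}")
    return Substitution(
        int(m["pos"]), m["ref"], m["alt"], m["aa_ref"], int(m["aa_pos"]), m["aa_alt"]
    )


# MASP1/3 substitutions quoted in the text [5,6].
masp1_3 = [
    "c.2059 G>A (p.Gly687Arg)",
    "c.870 G>A (p.Trp290*)",
    "c.1489C>T p.His497Tyr",
    "c.1888T>C p.Cys630Arg",
    "c.1997G>A p.Gly666Glu",
]

for v in map(parse, masp1_3):
    print(v.cdna_pos, f"{v.ref}>{v.alt}", f"{v.aa_ref}{v.aa_pos}{v.aa_alt}", v.kind)
```

Frameshift and deletion variants such as c.309delT or c.226delA would need additional patterns, which is why the sketch raises an error for any notation it does not recognise instead of guessing.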

Concluding Remarks
The craniofacial disruptions observed in patients suffering from 3MC syndrome are similar to those seen in neural crest migration disorders. The proper migration of neural crest cells (NCC) is essential for the formation of bones, cartilage, ganglia, and muscles in the head [64]. Control and regulation of NCC migration is complex and involves many genetic pathways. Thus, there is a possibility that proteins with collagen-like regions linked to a CRD play dual roles, contributing to both immunity and development. Certain complement components have previously been shown to play an essential role in cell migration. C3a and its receptor C3aR mediate co-attraction of crest cells, coordinating migration in the first stages of NCC regulation [65]. Furthermore, surfactant protein D [66,67] and surfactant protein A [68], both collectins (although lacking the ability to activate complement), have also been described as chemoattractants. Studies employing zebrafish [6] demonstrated that loss of CL-K1 and MASP-1/-3 is associated with craniofacial abnormalities. Both proteins were proposed to act as guidance cues for neural crest cell migration [47]. Moreover, CL-L1 was shown to regulate development of craniofacial structures, acting as a migratory chemoattractant [7]. Three CL-K1 gene mutations associated with 3MC syndrome, resulting in Ser169Pro and Gly204Ser substitutions and Ser217 deletion, prevent normal secretion from mammalian cells due to structural changes caused by the failure to bind Ca2+ during biosynthesis. The destabilization of the CRD probably leads to elimination of the protein via the endoplasmic-reticulum-associated protein degradation pathway [47]. Gorelik et al. (2017) also reported that MASP-1 plays an important role in radial neuronal migration in the development of the cerebral cortex. Deficiency of MASP-1 and other components of the lectin pathway (C3 and MASP-2) leads to impairments in radial migration, resulting in improper positioning of neurons and disorganized cortical layers [69].
Because CL-LK heterocomplexes were found to bind MASP-1 or MASP-3 homodimers via their collagen-like regions [47,70], it is possible that correct migration of neural crest cells requires cooperation between CL-LK and MASP-1/-3. This may be supported by the relationship between MASP1/3, COLEC10, and COLEC11 mutations and the incidence of 3MC syndrome, associated with craniofacial abnormalities. During embryogenesis, these three genes are strongly expressed in the craniofacial cartilage, palatal structures, bronchi, heart, and kidneys, and the corresponding proteins act as chemoattractants for the cranial neural crest cells (crucial for the formation of the head skeleton), recognizing certain endogenous carbohydrate epitopes [6]. It should be mentioned, however, that mice with knockout of the MASP1/3 or COLEC11 genes developed normally and no defects similar to 3MC syndrome were noticed [71][72][73]. Moreover, it remains to be clarified which defense pathways may compensate for MASP-1/3, CL-L1, and CL-K1 dysfunctions and why this rare disease is generally not accompanied by an impaired immune response.
As mentioned, so far 11 mutations in the COLEC11 gene, three in the COLEC10 gene, and 16 in the MASP1/3 gene associated with 3MC syndrome have been described. Further studies, however, are necessary to better understand the mechanisms by which dysfunction of MASP1/3, COLEC10, and COLEC11 genes may lead to the 3MC syndrome.

Conflicts of Interest:
The authors declare no conflict of interest.