Studying the Structural Significance of Galectin Design by Playing a Modular Puzzle: Homodimer Generation from Human Tandem-Repeat-Type (Heterodimeric) Galectin-8 by Domain Shuffling

Tissue lectins are emerging (patho)physiological effectors with broad significance. The capacity of adhesion/growth-regulatory galectins to form functional complexes with distinct cellular glycoconjugates is based on molecular selection of matching partners. Engineering of variants by changing the topological display of carbohydrate recognition domains (CRDs) provides tools to understand the inherent specificity of the functional pairing. We here illustrate its practical implementation in the case of human tandem-repeat-type galectin-8 (Gal-8). It is termed Gal-8 (NC) due to presence of two different CRDs at the N- and C-terminal positions. Gal-8N exhibits exceptionally high affinity for 3′-sialylated/sulfated β-galactosides. This protein is turned into a new homodimer, i.e., Gal-8 (NN), by engineering. The product maintained activity for lactose-inhibitable binding of glycans and glycoproteins. Preferential association with 3′-sialylated/sulfated (and 6-sulfated) β-galactosides was seen by glycan-array analysis when compared to the wild-type protein, which also strongly bound to ABH-type epitopes. Agglutination of erythrocytes documented functional bivalency. This result substantiates the potential for comparative functional studies between the variant and natural Gal-8 (NC)/Gal-8N.


Introduction
Glycosylation of lipids and proteins is a complex, non-random event that establishes a molecular fingerprint on cells and in tissues [1][2][3][4][5][6][7]. The more we learn about glycan structures and their changes upon differentiation or disease processes through technical advances [8][9][10][11], the higher becomes the likelihood of the validity of the concept of a biological meaning of glycan determinants (sugar code; [12]). The implied potential of glycans as signals can be realized by recognition processes. Fittingly, the application of sugar receptors (lectins) from plants and invertebrates as tools for glycan detection attests the operativeness of glycan-protein interactions [13][14][15][16][17][18][19][20].
The broad success of this application underscores the manifold opportunities residing in functional pairing between glycans and endogenous lectins in (patho)physiology, and, indeed, human lectins are emerging as potent effectors [21][22][23]. Apparently, as is the case for other types of receptors, the interaction brings matching lectin and glycoconjugate together. Its specificity stems from features on the levels of the glycan, the type of scaffold (protein or sphingolipid), and the lectin, so that pairs of distinct cellular glycoconjugates and their lectins find each other to form functional complexes. In addition to their direct contact involving molecular complementarity between sugars and peptide regions [24], topological factors on both sides are assumed to underlie the exquisite selectivity of mutual target selection in order to translate glycan-encoded information into cellular activities [25]. They encompass presentation of lectin-binding epitopes in clusters by branching or by spatial vicinity in microdomains as well as of corresponding display of carbohydrate recognition domains (CRDs). Of note, a lectin's modular architecture thus appears as crucial for bringing the binding partners together. Although this sounds perfectly reasonable, the given hypothesis requires solid experimental backing.
Looking at the sugar, the preparation of multivalent glycosides by chemical synthesis has made valuable tools available [26][27][28][29][30], and these efforts teach a salient lesson on what to do on the protein side. The rational engineering of variants of tissue lectins is the equivalent route toward the given end. With focus on topological aspects, a change in lectin design, i.e., its mode of CRD presentation, can become as powerful a means to delineate structure-function relationships as glycoclusters already are. Our report exemplifies this strategy by turning a human tandem-repeat-type lectin with two different CRDs into a covalently linked homodimer. If the ensuing work with the variant reveals an interesting glycan specificity, then respective applications in glycoconjugate monitoring can come as added value.
We here focus on a member of the family of adhesion/growth-regulatory galectins, which share the fold (β-sandwich), the ligand class (β-galactosides), and a sequence signature for ligand contact. Three classes of their structural organization are commonly found in vertebrates (Figure 1) [31][32][33]. In contrast to non-covalently associated homodimers shown in the top part of Figure 1, two different CRDs are combined in one protein in tandem-repeat-type family members (middle part). Galectin-8 (Gal-8) is such a protein [34]. Physiologically, it is a matricellular protein with a broad range of activities on immune, endothelial, and bone cells, and is also present in tumor cells (for recent examples, please see [35][36][37][38]). The sequence differences between its two CRDs (Gal-8N, Gal-8C, forming the Gal-8 (NC) wild-type protein) place them rather far apart in the phylogenetic tree [32,39]. As a further consequence, these deviations translate into a preference for sialylated/sulfated β-galactosides for Gal-8N, reaching an affinity in the nM range [40][41][42][43][44][45][46][47][48]. Obviously, the engineering of a homodimeric (proto-type-like) Gal-8 variant with two Gal-8N units, termed Gal-8 (NN) in contrast to Gal-8 (NC) used for the wild-type protein, would generate a tool that can enable the determination of whether and then which lectin properties are affected by turning the NC into an NN form. An illustrative example is the comparative characterization of counterreceptors, hereby addressing the fundamental issue of the significance of combining two different CRDs in contrast to the proto-type design of homodimers (Figure 1). This work on human Gal-8 extends previous efforts on the murine homologue used as platelet activator and T cell mitogen [36,49].
In this report, we first describe cDNA tailoring and production of the human homodimeric Gal-8 (NN) variant protein, followed by presenting results of the analysis of its biochemical characteristics, its binding profile in a glycan array, as well as activity in aggregation and tumor cell proliferation assays. This study thus serves two purposes: (i) to turn human tandem-repeat-type Gal-8 (with two different CRDs) into a homodimeric display, and, more generally; (ii) to illustrate the feasibility and perspectives of generation of new tools by structural re-design of an endogenous lectin.

Figure 1. Illustration of the structural design of the three classes of vertebrate galectins, i.e., non-covalently associated homodimer (proto-type), covalently linked heterodimer (with natural variability of linker length by alternative splicing, here given in number of amino acids, for human Gal-8; tandem-repeat-type) and trimodular combination of a C-terminal CRD with nine non-triple helical collagenous repeats and an N-terminal peptide with two sites for serine phosphorylation (chimera-type).

Protein Production and Characterization

Engineering and Yield
cDNA amplifications were directed to engineer sequences that encode Gal-8 (NN) variants with two N-type CRDs separated by the standard (short: S, 33 aa) or long (L, 75 aa) linkers or connected directly. Screening of production of the human proteins directed from these cDNAs was done in extracts of transfected bacteria by gel electrophoresis and Western blotting. Soluble protein was found exclusively in the case of Gal-8 (NN) (without linker). Its purification under conditions found to be optimal (pET-24a, 22 °C, 100 µM isopropyl-β-D-thiogalactopyranoside (IPTG)) yielded about 3 mg/L. Compared to quantities of about 17-20 mg/L for wild-type Gal-8S/L (NC) proteins and 25 mg/L for Gal-8N, this yield is rather small. As done for these proteins, purification could be based on affinity chromatography with resin-presented lactose so that the canonical binding activity was obviously maintained in the variant.

Structural Characterization
Recombinant proteins obtained by extensive re-design require rigorous analytical processing to verify their assumed biochemical nature. Respective analysis of this variant protein started with determination of its mass. Peaks for singly and doubly charged molecular ions without evidence for any major contamination were recorded (Figure 2a). The measured value of 33,781.9 Da of the singly charged protein is rather close to the calculated mass (33,777.8 Da), as is also the case for wild-type Gal-8S (NC) processed as control (measured: 35,805.4 Da; calculated: 35,807.8 Da) (Figure 2b). Next, mass spectrometric fingerprinting was performed to ascertain the correspondence of protein to cDNA sequence. To obtain a high degree of sequence coverage, two rounds of peptide profiling were done after independent treatment of Gal-8 (NN) with trypsin (Figure 3) or with chymotrypsin (Figure 4). The combination of the results of both experiments reached 97% representation of the sequence in resulting peptides. Tryptic digestion of the (wild-type) Gal-8S (NC) protein, performed as control in parallel, covered 92% of the sequence (Supplementary Materials Figure S1). These results extended the evidence for absence of any deviation from the expected structure.
As an additional means to collect mass information on N- and C-terminal regions by stepwise sequencing, starting from the smallest obtained c- or (z + 2) ions, respectively, spectra of reflectron in-source decay (reISD) (Figure 5a) and linISD (Figure 5b) were recorded. At the C-terminus, the peptide ladder started at (z + 2) 22/23, at the N-terminus at c36/c37 (reISD) and c42/43, reaching the c71/72 positions (lin(ear)ISD). Listing of the calculated and measured masses of the c, z + 2, and y-ions is given in Supplementary Materials Table S1, solidifying the evidence for the assumed error/substitution-free product nature.
The presented collective experimental information excluded both any deviation in the sequenced stretch from the cDNA-based template and a post-translational modification except for the iodoacetamide-dependent covalent modification of cysteine and oxidation of methionine residues. Determination of the isoelectric point at 9.1 (calculated: 9.06) corroborated this conclusion (Supplementary Materials Figure S2). In gel filtration, the variant protein gave a single peak at the position predicted for the monomer status irrespective of the presence of lactose (40 mM) (Figure 6). These data documented the purity and quaternary structure and enabled us to proceed with study of the variant's profile of glycan specificity in an array and in FACScan analysis, as well as its activity as agglutinin and a cell growth regulator.
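The agreement between measured and calculated masses reported in this section can be put in numbers. The following is a minimal sketch that only uses the four mass values given in the text; the function and variable names are illustrative and not part of the original work.

```python
# Mass values (Da) for the singly charged ions, as reported in the text.
MASSES = {
    "Gal-8 (NN)": {"measured": 33781.9, "calculated": 33777.8},
    "Gal-8S (NC)": {"measured": 35805.4, "calculated": 35807.8},
}


def mass_deviation(measured, calculated):
    """Return (absolute deviation in Da, relative deviation in ppm)."""
    delta = measured - calculated
    ppm = delta / calculated * 1e6
    return delta, ppm


# Deviation per protein, keyed by the protein name used in the text.
deviations = {name: mass_deviation(**values) for name, values in MASSES.items()}

for name, (delta, ppm) in deviations.items():
    print(f"{name}: {delta:+.1f} Da ({ppm:+.0f} ppm)")
```

Both deviations are a few Da on proteins of more than 33 kDa, which is the sense in which the measured values are "rather close" to the calculated ones.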

Glycan Specificity Profile
Following biotinylation under activity-preserving conditions, the variant Gal-8 (NN) was first tested in a solid-phase assay with a glycoprotein without/with N-glycan α2,3-sialylations (fetuin/asialofetuin). The ligand was adsorbed to the plastic surface of microtiter plate wells, and assays with biotinylated lectin revealed saturable and carbohydrate-inhibitable binding at KD-values of 148 ± 29 nM (asialofetuin) and 93 ± 10 nM (fetuin with α2,3-sialylation in the α1,6-arm and the β1,4-branch of the α1,3-arm [50]). These results, as the resin-based purification did, revealed the ability of the labeled variant to bind to surface-presented glycocompounds so that performing array-based analysis was possible. As shown in Table 1, strong signals were observed mostly for the sulfated trisaccharide Neu5Acα2,3Galβ1,3(6-O-Su)-GlcNAc, 3′-sialyllactose, the GD3 tetrasaccharide, and other 3′-sialylated saccharides. These glycans exhibited binding properties for the wild-type protein, too, under these conditions. The exchange of the N/C-CRDs, however, led to a nearly complete abrogation of binding of Gal-8 (NN) to histo-blood group ABH determinants, a typical feature of the C-terminal CRD of Gal-8 (NC).
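To illustrate what the two KD-values mean for saturable binding, the sketch below assumes a simple one-site (Langmuir-type) binding model. This model is an assumption made here for illustration; the text only reports the KD-values. The lectin concentrations are hypothetical.

```python
# KD-values (nM) from the solid-phase assay, as reported in the text.
KD_NM = {"asialofetuin": 148.0, "fetuin": 93.0}

# Hypothetical lectin concentrations (nM), chosen only for illustration.
CONCENTRATIONS_NM = [10.0, 50.0, 100.0, 500.0, 1000.0]


def fractional_saturation(lectin_nm, kd_nm):
    """Fraction of sites occupied under an assumed one-site binding model."""
    return lectin_nm / (kd_nm + lectin_nm)


for ligand, kd in KD_NM.items():
    row = ", ".join(
        f"{c:g} nM: {fractional_saturation(c, kd):.2f}" for c in CONCENTRATIONS_NM
    )
    print(f"{ligand} (KD = {kd:g} nM) -> {row}")
```

Under this assumption, half-maximal binding is reached at a lectin concentration equal to the KD, so the lower KD for fetuin corresponds to a higher occupancy at any given concentration.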

Cell Binding and Aggregation
Moving from glycan presentation on an array to cell surfaces (CHO cells with their abundant α2,3-sialylation), respective processing with the labeled variant protein led to a signal (Figure 7a).

The variant thus retains binding to cells, a typical feature of wild-type galectins, and changes in the glycan profile can affect binding, e.g., in the status of sialylation [51]. The intensity of cell staining by this variant was reduced by enzymatic desialylation (Figure 7a). This process does not abolish binding but makes terminal N-acetyllactosamine (LacNAc) accessible which can still act as a ligand. Their presentation engendered enhanced binding of the wild-type protein, especially seen in mean fluorescence intensity (Figure 7b). Association to the surface of a cell can also make bridging (in trans) possible. In this assay type, the label-free protein is tested for its ability to aggregate cells. As the wild-type protein does, the variant acts as an agglutinin. Rabbit erythrocytes were aggregated at the minimal concentration of 0.3 µg/50 µL (1.25 µg/mL for the wild-type protein), with 12.5 mM lactose blocking the lectins' activity. When testing mixtures of the two CRDs (8N + 8C) to show dependence of activity on bivalency, a concentration of 20 µg/mL was required for agglutination. In the case of human erythrocytes, positivity was observed at 1.25 µg/50 µL for both bivalent proteins. Likely reflecting the difference in signal intensity in FACScan analysis, the minimal concentrations for aggregate formation of the CHO cells were 1.7 µg/50 µL for the variant and 0.78 µg/50 µL for Gal-8S. As a measure of post-binding activity, testing of Gal-8S (WT/F19Y) had delineated a negative effect of the natural single nucleotide polymorphism-based Gal-8S (F19Y) form on proliferation of human colon cancer lines (SW480, HCT116) [47]. This single-site deviation from the common sequence thus led to a growth inhibition, posing the question of an impact after domain shuffling. When assayed under identical conditions at 100 µg/mL, presence of the wild-type and variant proteins had no significant influence on cell growth (not shown). 
In contrast, Gal-8N reduced the cell number by about 40% under identical conditions, and the significant release of lactate dehydrogenase revealed toxicity exerted by the N-type CRD but not its homodimer.
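Several minimal concentrations for aggregation are given per 50 µL, whereas others are given per mL. As a reading aid, the sketch below converts the per-50 µL quantities quoted in the text to µg/mL. It assumes that 50 µL is the full assay volume, which the text does not state explicitly, and the labels are illustrative.

```python
# Assumed assay volume (µL); the text quotes quantities "per 50 µL".
WELL_VOLUME_UL = 50.0

# Minimal amounts (µg per 50 µL) as quoted in the text.
PER_WELL_UG = {
    "rabbit erythrocytes, Gal-8 (NN)": 0.3,
    "human erythrocytes, both bivalent proteins": 1.25,
    "CHO cells, Gal-8 (NN)": 1.7,
    "CHO cells, Gal-8S": 0.78,
}


def per_well_to_ug_per_ml(amount_ug, volume_ul=WELL_VOLUME_UL):
    """Convert an amount in µg per assay volume (µL) to µg/mL."""
    return amount_ug / volume_ul * 1000.0


converted = {label: per_well_to_ug_per_ml(ug) for label, ug in PER_WELL_UG.items()}

for label, conc in converted.items():
    print(f"{label}: {conc:.1f} µg/mL")
```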


Discussion
The emerging role of lectins as readers and interpreters of glycan-encoded determinants with biomedical relevance provides ample incentive to delineate structure-activity relationships in detail. The initial focus of engineering structural aspects has been given to altering the quaternary structure. Examined as a role model, tetrameric concanavalin A had first been turned into dimers by succinylation or acetylation [52], which were later made monovalent by partial photoaffinity labeling [53,54]. A different route, that is selective reduction of disulfide bridges between subunits followed by alkylation of the sulfhydryl groups, also led to monomers. They could favorably be employed in flow cytometry due to the loss of capacity for mediating cell aggregation [55,56]. In these cases, the valency of the lectin was deliberately reduced by chemical modification. The same aim was reached by a single-site mutation, and the resulting variant had become instrumental to separate glycan binding from glycan cross-linking, with biomedical potential for blocking viral infection without activating T cells [57]. The growing realization that tissue lectins pair with a fairly small set of functional counterreceptors gives such efforts a direct physiological scope and impact.
As shown in Figure 1 for galectins, two structural parameters define each family member: (i) overall design of the protein, classified into three groups; and (ii) the contact site for glycans in the CRD. These properties are the main features that should underlie the recognition process, a challenge for devising variants. Each type of design has already invited us to take the first step along this way. In the case of homodimeric Gal-1, covalent bridging of the two CRDs by insertion of linkers between them and the combination of the CRD of Gal-1 with a CRD from Gal-9 have turned the proto-type into a tandem-repeat-type protein [58][59][60][61][62]. The N-terminal tail of chimera-type Gal-3 had been shortened stepwise or used as a platform to present the Gal-8N CRD [63][64][65][66]. Homodimer formation had been performed for Gal-9 (Gal-9 (NN) and Gal-9 (CC)) and revealed similar eosinophil chemoattractant capacities and activities to induce apoptosis in Jurkat T cells for the wild-type and variant proteins [67,68]. The two CRDs of Gal-9, however, do not present such a marked disparity of glycan specificity for 3′-sialylation as Gal-8N/C do [40]. The exceptionally high affinity in the nM range of Gal-8N for sialylated/sulfated β-galactosides was a reason to embark on the engineering of a Gal-8 variant with two 8N CRDs arranged in tandem. Tandem-repeat-type Gal-4 had been engineered with respect to the length of the linker [69][70][71].
Production of the variant of human Gal-8 as soluble protein was possible for the version without linker. The lectin could be purified by affinity chromatography on lactose-bearing resin. The thorough mass spectrometric analysis of the basic protein excluded presence of any post-translational or chemical modification, for example, formation of an adduct with lactose by glycation, found for Gal-3 at Lys176 recently [72]. In solution, Gal-8 (NN) behaved as a monomer under the conditions of gel filtration, its lactose-inhibitable cross-linking activity revealing that both CRDs are active in the product. This molecular architecture led to a profile of strongly binding glycans with sialylated/sulfated β-galactosides and a comparatively drastic reduction of positivity for ABH histo-blood group epitopes. These glycans preferentially interact with Gal-8's C-CRD [44]. As a consequence, opposite responses were seen in assays on fluorescent cell binding after sialidase treatment, with a decrease for Gal-8 (NN) and increase for Gal-8 (NC). The marked contribution of 6-O-sulfation of the GlcNAc moiety in type I LacNAc in the most active glycan for the variant reflects its special role to enhance affinity and selectivity against galectin-1, when testing sulfated LacNAc derivatives with both galectins [73].
In conclusion, the availability and documented activity in cell binding and as agglutinin open the door to define in detail the impact of combining two CRDs with special preferences while maintaining affinity to the canonical β-galactoside LacNAc. Explicitly, comparative testing in assays on cellular uptake and intracellular sorting [74], contact formation between cells, e.g., myeloma/endothelial cells and the extracellular matrix [75,76], and counterreceptor characterization [38,49] are now possible. Furthermore, such experimental work can also help to answer the question why Gal-8N is toxic, whereas Gal-8 (NN) has no such activity on the tested human colon cancer cells. Considering the versatility of branch-end sialylation/sulfation as a recognition signal, generating new tools for detection and isolation of distinct negatively charged glycans has principal merit [77][78][79]. Toward this aim, this variant can serve as platform for further mutational adaption of the human galectin. Interestingly, such a site-specific process implemented strong affinity for α2,6-sialylated N-glycans into a galactoside-specific β-trefoil lectin from the earthworm [80], underscoring the promising perspective of lectin engineering [81], from single-site mutations to altering the modular architecture, as shown here.

cDNA Engineering and Protein Production
Establishing the tandem-repeat arrangement of two Gal-8N CRDs on the level of the cDNA started by separate amplifications of cDNAs for Gal-8N without sequence extension, and the Gal-8N CRD was extended by sequences encoding the Gal-8S linker (33 aa) or the Gal-8L linker (75 aa). Following cDNA amplification, sequences were cloned into a bacterial vector so that artificial generation of restriction sites could be exploited to yield the new homodimeric display. In a final step, site-directed mutagenesis was applied to reconstitute wild-type codons at the position of the artificial restriction sites. In detail, PCR amplification was directed for the first N-CRD by the sense primer 5′-CATATGATGTTGTCCTTAAACAACCTAC-3′ with an internal NdeI restriction site and the antisense primer 5′-GGTACCAATTGAGTGAATATTCACTTTG-3′ with an internal KpnI restriction site to generate the Gal-8 (NN). Variants with linkers contained respective extensions (33 aa linker: 5′-AGATCTAAGCTGGGGCGTGC-3′, 75 aa linker: 5′-AGATCTTGACACATAGTTCATAGGTG-3′, both sense primers with an internal BglII restriction site). For the second N-CRD, the sense primer 5′-GGTACCTTGTCCTTAAACAACCTAC-3′ with an internal KpnI restriction site (Gal-8 (NN); variants with linker: 5′-AGATCTTTGTCCTTAAACAACCTACA-3′ with an internal BglII restriction site) and the antisense primer 5′-GTCGACTCAACCAATTGAGTGAATATT-3′ with an internal SalI restriction site were used.
The amplification products were then propagated in the pGEM-T easy vector (EcoRV-linearized with single 3′ T overhangs; Promega, Mannheim, Germany), digestion was performed at the 5′/3′ ends with the respective pair of restriction enzymes (NdeI/KpnI, KpnI/SalI, NdeI/BglII, BglII/SalI), and gel extraction led to vector-released cDNAs with sticky ends. Respective cDNAs were combined making use of the artificial restriction site (Gal-8 (NN): KpnI; linker versions: BglII) and ligated into a pET-24a plasmid (Novagen, Darmstadt, Germany). In the final step of engineering, the pET-24a plasmids containing the complete cDNAs encoding the homodimeric Gal-8 variants (900 bp for Gal-8 (NN), 999 bp for this protein with the 33 aa-long linker and 1125 bp for the protein version with the 75 aa-long linker) were then used as template in a modified QuikChange® site-directed mutagenesis procedure (Agilent Technologies, Waldbronn, Germany) to revert codon sequences of the artificial restriction sites to the wild-type codons. Resulting plasmids were isolated from kanamycin-resistant colonies grown on LB agar plates and correct sequences ascertained by commercial DNA sequencing. Recombinant protein production was done in the pET-24a (pGEMEX-1; Promega)/Escherichia coli strain BL21(DE3)pLysS/Rosetta™(DE3)pLysS system with TB medium (Roth, Karlsruhe, Germany), systematically testing parameters (after an initial growth phase of 4-5 h at 37 °C up to an optical density at 600 nm of 0.6-0.8), i.e., the temperature at 22 °C, 30 °C and 37 °C and the final IPTG concentrations of 75 µM, 100 µM, and 200 µM. Presence of the protein in the soluble fraction was monitored by analysis using gel electrophoresis and Western blotting with a home-made polyclonal anti-Gal-8 antibody preparation after extract separation into soluble and pellet fractions as described [82].
Soluble protein was purified by affinity chromatography on lactose-Sepharose 4B as crucial step, as previously described for human and chicken galectin-8 [47,82].
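The primer design described above can be sanity-checked in a few lines. The sketch below confirms that each primer listed in the text carries the recognition sequence of the restriction enzyme assigned to it, and that the stated cDNA lengths agree with adding three base pairs per linker residue to the 900 bp of the linker-free construct. The recognition sequences are the standard ones for these enzymes; the primer labels and helper names are illustrative and not taken from the original work.

```python
# Standard recognition sequences of the restriction enzymes named in the text.
RESTRICTION_SITES = {
    "NdeI": "CATATG",
    "KpnI": "GGTACC",
    "BglII": "AGATCT",
    "SalI": "GTCGAC",
}

# Primer sequences (5' to 3') as given in the text, with the assigned enzyme.
PRIMERS = [
    ("first N-CRD, sense", "CATATGATGTTGTCCTTAAACAACCTAC", "NdeI"),
    ("first N-CRD, antisense", "GGTACCAATTGAGTGAATATTCACTTTG", "KpnI"),
    ("33 aa linker extension", "AGATCTAAGCTGGGGCGTGC", "BglII"),
    ("75 aa linker extension", "AGATCTTGACACATAGTTCATAGGTG", "BglII"),
    ("second N-CRD, sense", "GGTACCTTGTCCTTAAACAACCTAC", "KpnI"),
    ("second N-CRD, sense (linker variants)", "AGATCTTTGTCCTTAAACAACCTACA", "BglII"),
    ("second N-CRD, antisense", "GTCGACTCAACCAATTGAGTGAATATT", "SalI"),
]


def has_site(primer, enzyme):
    """True if the primer contains the enzyme's recognition sequence."""
    return RESTRICTION_SITES[enzyme] in primer.upper()


site_check = {label: has_site(seq, enzyme) for label, seq, enzyme in PRIMERS}

# Length of the linker-free Gal-8 (NN) cDNA (bp) as stated in the text.
CDNA_BP_NN = 900


def expected_length(linker_aa):
    """Expected cDNA length (bp) when a linker of linker_aa residues is inserted."""
    return CDNA_BP_NN + 3 * linker_aa


for label, ok in site_check.items():
    print(f"{label}: site present = {ok}")
print("33 aa linker:", expected_length(33), "bp")
print("75 aa linker:", expected_length(75), "bp")
```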

Analytical Procedures
Matrix-assisted laser desorption/ionization time-of-flight (TOF) mass spectrometry on an Ultraflex TOFTOF I instrument (Bruker Daltonik, Bremen, Germany) equipped with a nitrogen laser (20 Hz) was performed for the intact protein in the positive-ion linear mode with ion acceleration voltage at 25 kV and first extraction plate at 23 kV. Peptide fingerprinting was done in the positive-ion reflectron mode at reflector voltage of 26.3 kV and 21.75 kV at the first extraction plate. Proteolytic cleavage by trypsin and chymotrypsin was carried out in 40 mM NH4HCO3 or 100 mM Tris-HCl (pH 7.8), respectively, starting with 10 µg of protein dissolved in 10 µL digestion buffer. Following routine treatment for reduction of disulfide bridges by dithiothreitol (DTT) and alkylation of any resulting thiol groups by iodoacetamide, 100 ng trypsin (overnight at 37 °C) or 100 ng chymotrypsin (3 h at 25 °C) were applied, followed by desalting the solution using zip-tip C18 (Merck Millipore, Darmstadt, Germany) according to the manufacturer's instructions. The peptides were eluted with 2 µL saturated solution of α-cyano-4-hydroxy-cinnamic acid in 50% acetonitrile in 0.1% TFA (TFA50), 1 µL pipetted on the MALDI target followed by 1 µL of the TFA50 solution. The top-down approach of protein characterization by ISD used sinapinic acid as matrix, as described [62,66]. Settings for linISD in the positive-ion linear mode were 25 kV for ion acceleration and 23.2 kV at the first extraction plate, for reISD 21.75 kV at the first extraction plate and a reflector voltage at 26.3 kV. Data acquisition following up to 5000 individual laser shots, calibration procedures including instrument control, and data analysis including processing annotated spectra by BioTools 3.0 (Bruker Daltonik) were done as described [62,65].
The isoelectric point was determined by two-dimensional gel electrophoresis after dissolving 10 µg protein in 155 µL of a solution of 8 M urea, 20 mM DTT and 2% CHAPS, loading the sample on an IEF strip (ZOOM IPG strip, pH 6-10; Thermo Fisher Scientific, Dreieich, Germany), and running the gel in a ZOOM IPG Runner Cell. For the second dimension, a NuPAGE Novex 4-12% Bis-Tris gel (Thermo Fisher Scientific) was applied. Finally, the gel was stained with Coomassie. The theoretical pI value was calculated with the "ExPASy Compute pI tool" (ExPASy, http://web.expasy.org/compute_pi/). Gel filtration (100 µg of protein in 50 µL buffer) was performed on a calibrated Superose HR10/30 column using an ÄKTA purifier 10 system (GE Healthcare, Munich, Germany) at 4 °C and a flow rate of 0.5 mL/min.

Glycan Array
Arrays produced by printing glycans (50 µM; total of 416 oligosaccharides) were from Semiotik LLC (Moscow, Russia). Gal-8 (NN/NC), labeled by conjugation of biotin using the N-hydroxysuccinimide ester derivative (Sigma, Munich, Germany) under activity-preserving conditions as described [83][84][85], was tested at a concentration of 50 µg/mL in phosphate-buffered saline (PBS) containing 0.1% Tween-20 and 1% bovine serum albumin. This solution was incubated for 1 h at 37 °C in a humidified chamber on chips that had been pretreated with PBS containing 0.1% Tween-20 for 15 min. After thorough washing to remove unbound labeled protein, probing with streptavidin labeled with the Alexa Fluor® 555 dye (Thermo Fisher Scientific) followed for 45 min at 22 °C. Washing with PBS containing 0.001% Tween-20 and then with deionized water removed the fluorescent reagent before chips were scanned on an Innoscan 1100 AL scanner (Innopsys, Carbonne, France) using an excitation wavelength of 543 nm at 10 µm resolution. Data on six spots per test compound on the chip surface were processed using ScanArray Express 4.0 software and the fixed 70 µm-diameter circle method as well as Microsoft Excel, as described [86,87]. The results are reported as median RFU (relative fluorescence units) of replicates. Median deviation was measured as interquartile range. Any signal whose fluorescence intensity exceeded the background value by a factor of five (signals from ligand-free areas were taken as background) was considered significant.
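The per-glycan data reduction described above (median of six replicate spots, interquartile range, and a fivefold-over-background criterion) can be sketched as follows. This is a minimal Python illustration and not the ScanArray Express/Excel workflow actually used; the function name `summarize_spots`, the example RFU values, and the background value are hypothetical. Two further assumptions are made: the median signal is the quantity compared with the background, and quartiles are computed with the "inclusive" interpolation method.

```python
from statistics import median, quantiles


def summarize_spots(rfu_values, background, fold_threshold=5.0):
    """Summarize replicate spot intensities for one glycan.

    rfu_values     : fluorescence readings (RFU) of the replicate spots
    background     : signal from ligand-free areas of the chip
    fold_threshold : a signal counts as significant if it exceeds
                     background by this factor (five in the text)
    """
    med = median(rfu_values)
    # Quartiles by inclusive interpolation (an assumption; the original
    # spreadsheet calculation may have used a different convention)
    q1, _, q3 = quantiles(rfu_values, n=4, method="inclusive")
    return {
        "median_rfu": med,
        "iqr": q3 - q1,
        "significant": med > fold_threshold * background,
    }


if __name__ == "__main__":
    # Hypothetical readings for six spots of one glycan
    spots = [100, 120, 110, 130, 90, 115]
    print(summarize_spots(spots, background=20))
```

Applying the threshold to the median rather than to individual spots keeps a single outlier spot from deciding whether a glycan is scored as a binder.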

Solid-Phase and Cell Assays
Dissociation constants of binding of biotinylated Gal-8 (NN) to the N-glycans of glycoproteins (fetuin and the chemically desialylated asialofetuin) were determined in microtiter plate wells presenting surface-adsorbed ligand (after overnight incubation at 4 °C of solution at 0.5 µg/50 µL) by Scatchard analysis, as described [88]. Carbohydrate-dependent galectin binding to the surface of parental Chinese hamster ovary (CHO) cells, kindly provided by P. Stanley (Albert Einstein College of Medicine, Bronx, NY, USA), was determined by flow cytofluorometry using streptavidin/R-phycoerythrin as fluorescent indicator (1:40; Sigma, Munich, Germany) without/with treatment with C. perfringens neuraminidase (0.01 U in 50 µL PBS for 2 × 10⁵ cells at 37 °C for 1 h; Roche, Mannheim, Germany), as described [51,89]. Haemagglutination of trypsin-treated, glutaraldehyde-fixed rabbit and human erythrocytes was analyzed in 96-well (V-shaped) microtiter plates using 2-fold serial dilutions, as described [90]. Aggregation of CHO cells was analyzed by microscopic assessment. Growth of cells of the human colon adenocarcinoma lines HCT116 and SW480 in Dulbecco's minimal essential medium containing 10% fetal bovine serum was quantitated in parallel assays using a commercial kit (CellTiter 96, Promega), as described [47].
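As an illustration of the Scatchard analysis mentioned for the solid-phase assay, the following minimal Python sketch fits a straight line to bound/free versus bound. It assumes a single class of independent binding sites, for which B/F = (Bmax − B)/Kd, so that the slope equals −1/Kd and the x-intercept equals Bmax. The function name `scatchard_fit` and the data are hypothetical; the values were generated from the model itself (Kd = 2, Bmax = 10, arbitrary units) and are not measurements from this study.

```python
def scatchard_fit(bound, free):
    """Estimate Kd and Bmax from a Scatchard plot by least squares.

    bound, free : sequences of bound and free lectin concentrations
                  (same units). Assumes one class of independent sites.
    """
    ratio = [b / f for b, f in zip(bound, free)]  # y-axis: bound/free
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratio))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope            # slope = -1/Kd
    bmax = -intercept / slope    # x-intercept = Bmax
    return kd, bmax


if __name__ == "__main__":
    # Synthetic data from B = Bmax * F / (Kd + F) with Kd = 2, Bmax = 10
    free = [0.5, 1.0, 2.0, 4.0, 8.0]
    bound = [10.0 * f / (2.0 + f) for f in free]
    kd, bmax = scatchard_fit(bound, free)
    print(f"Kd = {kd:.3f}, Bmax = {bmax:.3f}")
```

With real data, curvature in the Scatchard plot would indicate that the single-site assumption does not hold, and the linear fit should then not be used to report a single Kd.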
Supplementary Materials: The following are available online. Figure S1: Mass spectrometric fingerprinting of peptides obtained by treatment of Gal-8S (NC) with trypsin; Table S1a: Calculated and experimental masses of c-ions observed in the reISD spectra for Gal-8 (NN); Table S1b: Calculated and experimental masses of z + 2-ions observed in the reISD spectra for Gal-8 (NN).