Arabinogalactan Structures of Repetitive Serine-Hydroxyproline Glycomodule Expressed by Arabidopsis Cell Suspension Cultures

Arabinogalactan-proteins (AGPs) are members of the hydroxyproline-rich glycoprotein (HRGP) superfamily. They are heavily glycosylated with arabinogalactans, which are usually composed of a β-1,3-linked galactan backbone with 6-O-linked galactosyl, oligo-1,6-galactosyl, or 1,6-galactan side chains that are further decorated with arabinosyl, glucuronosyl, rhamnosyl, and/or fucosyl residues. Here, the structures of Hyp-O-polysaccharides isolated from (Ser-Hyp)32-EGFP (enhanced green fluorescent protein) fusion glycoproteins overexpressed in transgenic Arabidopsis suspension culture are consistent with the common structural features of AGPs isolated from tobacco. In addition, this work confirms the presence of the β-1,6-linkage on the galactan backbone identified previously in AGP fusion glycoproteins expressed in tobacco suspension culture. Furthermore, the AGPs expressed in Arabidopsis suspension culture lack terminal rhamnosyl residues and have a much lower level of glucuronosylation than those expressed in tobacco suspension culture. These differences not only suggest the presence of distinct glycosyl transferases for AGP glycosylation in the two systems, but also indicate the existence of minimum AG structures for type II AG functional features.


Introduction
Arabinogalactan-proteins (AGPs) are diverse members of the hydroxyproline-rich glycoprotein (HRGP) superfamily and are broadly implicated in many aspects of plant growth and development, ranging from cell proliferation to plant-microbe interactions [1-8]. AGP family members include the hyperglycosylated classical AGPs; AGP chimeras that combine non-HRGP domains with AGP motifs; and AGP hybrids that mix AGP motifs with signature motifs of two other types of HRGPs, namely the extensins and/or proline-rich proteins [9]. Finally, there are many proteins that are not hydroxyproline-rich, yet contain a few short sequences [10], or 'glycomodules', that direct local glycosylation typical of AGPs [11].
AGPs are distinguished from other HRGP family members [12] by the presence of type II, type III (Mugwort) [13], or type I (Ginkgo) [14] arabinogalactan polysaccharides (AGs), which consist of a β-1,3-, β-1,6-, or β-1,4-linked galactan backbone, respectively, and are O-linked to hydroxyproline (Hyp) residues; traditionally, however, the term AGP refers to HRGPs that are highly glycosylated with type II AGs. Carbohydrate, occurring mainly as Hyp-O-linked AGs, can account for as much as 95% of an AGP's dry weight [1] and forms most of the interactive surface. As such, the AG substituents are major determinants of AGP molecular function.
Type II AGs are branched heteropolysaccharides containing D-Galp, L-Araf, and often D-GlcpA and L-Rhap [15] attached to non-contiguous Hyp residues [16,17] that are often clustered in AGP glycomodule sequences X-Hyp-X-Hyp, where X is usually Ser or Ala, although lone Hyp residues can be targets as well [18]. In contrast, short O-linked arabinoside substituents decorating contiguous Hyp residues are minor components of AGPs but are the major glycans of the extensins and PRPs [18], which lack arabinogalactan polysaccharides. These differences in HRGP glycosylation give rise to different structures and chemistry, and thus different types of networks and biological functions. For example, extensins and PRPs form covalent wall networks through tyrosine crosslinking [19,20], whereas classical AGPs occur primarily at the membrane-wall interface, intercellularly or in exudates [21,22], and participate in non-covalent and covalent networks through glycosidic bonds with pectin and xylan [23,24].
In our quest to relate structure to function, we previously used high-resolution one-dimensional (1D) and two-dimensional (2D) NMR techniques to elucidate the detailed glycan structure of four Hyp-O-AGs isolated from AGP fusion glycoproteins expressed in tobacco BY2 cells. Two Hyp-O-glycans, designated AHP1 (Ala-Hyp polysaccharide 1) and AHP2, were isolated from (Ala-Hyp)51-EGFP, consisting of 51 repeats of an "Ala-Hyp" motif fused with enhanced green fluorescent protein (EGFP) [15,17], and two more were isolated from the fusion glycoprotein human interferon α-2b (hIFNα-2b)-(Ser-Hyp)20 (20 repeats of a "Ser-Hyp" motif) [25]. All possessed common composition and linkage patterns, essentially differing only in the number and size of the side chains attached to the galactan main chain, with the maximum side chain size being the six-residue unit proposed decades ago by Defaye and Wong (Figure 1) [26,27].
With the discovery of AGP-specific degrading enzymes in recent years, it is now possible to selectively remove specific side chain sugar residues or cleave the 1,3-galactan backbone [28-30]. Using this approach combined with gel electrophoresis and mass spectrometry analysis, Dupree and coworkers identified long β-1,6-galactan side chains in a variety of AGP or AG samples. This long side chain feature is corroborated by the identification of galactosyltransferases that can elongate the AG β-1,6-galactan side chain [31]. Thus, current data suggest that the AGP β-1,3-galactan backbone is elaborated with 6-O-linked galactosyl, oligo-1,6-galactosyl, or 1,6-galactan residues [32], which, at the periphery, carry arabinosyl, glucuronosyl, rhamnosyl, and/or fucosyl residues.
AGP glycosylation is species- and tissue-dependent owing to the presence of genes encoding the various glycosyltransferases and the corresponding regulators that control their temporospatial expression [33]. However, biochemical verification of the different AGP glycosylation patterns is lacking, which limits our understanding of the relationship between glycosylation and AGP function. Here, we used 1D and 2D NMR techniques together with glycosyl composition and glycosyl linkage analyses to characterize two Hyp-O-AGs isolated from the AGP fusion glycoprotein, (Ser-Hyp)32-EGFP, expressed in an Arabidopsis cell suspension culture [11]. The glycosylation of the two Hyp-O-AGs, designated Arabidopsis Ser-Hyp polysaccharides 1 and 2 (AtSPHP-1 and AtSPHP-2), was compared to that of Hyp-O-AGs expressed in Nicotiana, which is only distantly related to Arabidopsis.

Hyp-O-Polysaccharides from the Arabidopsis Fusion Glycoprotein (Ser-Hyp)32-EGFP
To analyze the glycosylation of Arabidopsis AGPs, a simple repetitive AGP glycomodule-EGFP fusion protein, (Ser-Hyp)32-EGFP, was expressed in transgenic Arabidopsis cells and purified from the culture media [11]. The fusion glycoprotein co-precipitated with Yariv reagent, demonstrating that the glycomodules were glycosylated with arabinogalactan polysaccharides. Glycosyl composition analysis of (Ser-Hyp)32-EGFP showed that it contained Ara, Gal, and GlcA, which are common residues in type II AGs. The fusion glycoproteins were hydrolyzed under mild base conditions to cleave the peptide backbone and release the Hyp-O-attached AG polysaccharides, which were fractionated on a size exclusion column that yielded a single peak containing Hyp and sugar residues [11]. The major fraction 17 and a later fraction 19, designated AtSPHP-1 and AtSPHP-2, respectively, were chosen for further detailed structural analyses.

Structure of AtSPHP-1
AtSPHP-1 size and composition. Our previous measurements of Hyp and monosaccharides of the fractionated Hyp-O-polysaccharides showed that they contained 18-27 glycosyl residues per Hyp [11], suggesting that these AGs ranged from 18 to 27 in degree of polymerization (DP). TMS glycosyl composition analysis yielded 45.1% Ara, 52.5% Gal, and 2.4% GlcA, while glycosyl linkage analysis produced linkages of 47.9% Araf, 47.1% Galp, and 5.0% t-GlcpA (Table 1). The higher mol% of GlcA in the glycosyl linkage analysis is consistent with the different sample hydrolysis conditions used in the two analyses (1 M methanolic HCl at 80 °C for 18 h for TMS glycosyl composition analysis vs. 2 M TFA at 121 °C for 2 h for glycosyl linkage analysis); the harsher hydrolysis presumably released more GlcA for derivatization and subsequent detection. The linkage data also suggest that AtSPHP-1 is composed of Ara, Gal, and GlcA residues in a ratio of 10:10:1. In summary, our analyses indicate that AtSPHP-1 was a 21-residue glycan O-linked to Hyp.
A set of NMR spectra of AtSPHP-1, including Correlated Spectroscopy (COSY), Total Correlation Spectroscopy (TOCSY), Heteronuclear Single Quantum Coherence (HSQC), and Heteronuclear Multiple Bond Correlation (HMBC) spectra, was collected and analyzed to elucidate the chemical structure. The 1H signals of individual glycosyl or Hyp residues were identified in the COSY and TOCSY spectra (Figure 2A), while the corresponding 13C signals were found in the HSQC spectrum (Figure 2B). We confirmed the 1H/13C assignments from the spectra and the anomeric configurations of the saccharide residues (Table 2). The linkages between these residues were established based on correlations in the HMBC spectrum (Figure 2C).
L-Hyp and allo-Hyp. The presence of two sets of characteristic cross peaks in the HSQC spectrum (cross peaks G, H, and I in Figure 2B) indicated that the Hyp residues were a mixture of L-Hyp and allo-Hyp isomers formed during base hydrolysis of the polypeptide backbone [15,34].
Side chain characterization. Based on the glycosyl composition data, AtSPHP-1 contained 10 Ara residues. The glycosyl linkage analysis further showed that these 10 Ara residues included four t-Araf, three 5-Araf, and three 3-Araf residues. This is consistent with the 6:4 integral ratio of anomeric signals A and B in Figure 2B, which corresponded to the six 3-Araf and 5-Araf residues and the four t-Araf residues, respectively. The presence of three 5-Araf residues in the molecule was also supported by a 3:7 integral ratio of Araf C/H-5 signals between 67.3/3.81, 3.88 ppm (5-Araf C/H-5) and 62.0/3.82, 3.71 ppm (t- and 3-Araf C/H-5). The HMBC correlations B1-5A (Araf H-1 at 5.09 ppm to Araf C-5 at 67.3 ppm) and B1-3F (Araf H-1 at 5.09 ppm to Galp C-3 at 80.9 ppm) showed that some t-Araf residues were 1→5 linked to 5-Araf and that the remaining one or more were 1→3 linked to side chain Galp residue(s) (Gal sc) (Figure 2C). In addition, HMBC correlations A1-3F (Araf H-1 at 5.24 ppm to Galp C-3 at 80.9 ppm) and A1-3A (Araf H-1 at 5.24 ppm to Araf C-3 at 83.0 ppm) demonstrated that some Araf residues were 1→3 linked to 3-Araf and the others were 1→3 linked to Gal sc residues. Thus, our results suggest that the Araf residues formed four structural units that substituted Gal sc residues: three units of t-Araf-(1→5)-Araf-(1→3)-Araf-(1→3)-Gal sc and one unit of t-Araf-(1→3)-Gal sc. The number of Araf substitutions also indicates that there were four Gal sc residues in this polysaccharide and, correspondingly, six Galp residues on the galactan backbone (Gal bb). Furthermore, the correlation E1-6F (GlcpA H-1 at 4.51 ppm to side chain Galp C-6 at 70.0 ppm) confirmed that the lone GlcpA residue was 1→6 linked to a Gal sc residue.
However, we could not determine which of the four Gal sc residues carried the GlcpA substituent, so we arbitrarily chose one for the structure shown in Figure 3A. Likewise, we assigned the α-L-Araf-(1→3) side chain to another Gal sc residue, although any of the four Gal sc residues was a candidate. Therefore, the Gal sc residues included three 3-Galp and one 3,6-Galp that were 1→6 linked to the backbone, and the galactan backbone consisted of one t-Galp, one 6-Galp, and four 3,6-Galp residues.
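The 10:10:1 ratio can be checked with simple arithmetic. The following is a minimal sketch, not the authors' procedure: it scales the linkage-analysis mol% values quoted above to a 21-residue glycan and rounds to whole residues. The function names are hypothetical and introduced only for illustration.

```python
# Linkage-analysis mol% values quoted in the text (Table 1).
LINKAGE_MOL_PERCENT = {"Ara": 47.9, "Gal": 47.1, "GlcA": 5.0}


def residues_per_glycan(mol_percent, total_residues):
    """Scale mol% values to fractional residue counts for a glycan of a given size."""
    total = sum(mol_percent.values())
    return {sugar: pct / total * total_residues for sugar, pct in mol_percent.items()}


def rounded_counts(mol_percent, total_residues):
    """Round the fractional counts to whole residues (hypothetical helper)."""
    scaled = residues_per_glycan(mol_percent, total_residues)
    return {sugar: round(n) for sugar, n in scaled.items()}


counts = rounded_counts(LINKAGE_MOL_PERCENT, 21)
print(counts)
```

With these inputs the fractional counts are close to whole numbers (about 10.06 Ara, 9.89 Gal, and 1.05 GlcA), so rounding does not hide a large discrepancy, and the total of 21 lies within the 18-27 residues per Hyp reported for the fractions.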
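The residue bookkeeping in this section can be tallied explicitly. The sketch below is only an accounting aid under the assignments stated in the text (three tri-arabinosyl units, one mono-arabinosyl unit, four side chain Gal, six backbone Gal, and one GlcpA); the variable names are illustrative assumptions, and the code does not model the NMR data themselves.

```python
from collections import Counter

# Arabinosyl units as described in the text, listed from the non-reducing end
# toward the side chain galactose (Gal sc) that each unit substitutes.
TRI_ARA_UNIT = ["t-Araf", "5-Araf", "3-Araf"]  # t-Araf-(1->5)-Araf-(1->3)-Araf-(1->3)-Gal sc
MONO_ARA_UNIT = ["t-Araf"]                     # t-Araf-(1->3)-Gal sc

ara = Counter()
for unit, copies in ((TRI_ARA_UNIT, 3), (MONO_ARA_UNIT, 1)):
    for residue in unit:
        ara[residue] += copies

# Galactose and glucuronic acid linkage types as assigned in the text.
gal_side_chain = Counter({"3-Galp": 3, "3,6-Galp": 1})
gal_backbone = Counter({"t-Galp": 1, "6-Galp": 1, "3,6-Galp": 4})
glca = Counter({"t-GlcpA": 1})

total_residues = sum(
    sum(group.values()) for group in (ara, gal_side_chain, gal_backbone, glca)
)

# Ratios expected for the integrals discussed in the text.
anomeric_ratio = (ara["3-Araf"] + ara["5-Araf"], ara["t-Araf"])  # signal A : signal B
c5_ratio = (ara["5-Araf"], ara["t-Araf"] + ara["3-Araf"])        # 5-Araf C/H-5 : others

print(dict(ara), total_residues, anomeric_ratio, c5_ratio)
```

The tally reproduces the four t-Araf, three 5-Araf, and three 3-Araf residues from the linkage analysis, the 6:4 and 3:7 integral ratios, and the overall size of 21 residues.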

Structure of AtSPHP-2
AtSPHP-2 size and composition. As a later fraction from the size exclusion column, AtSPHP-2 was smaller than AtSPHP-1, differing in the extent of side chain elaboration. TMS glycosyl analysis of AtSPHP-2 indicated the Gal/Ara molar ratio was 10:9 (Table 1).

AtSPHP-1 and AtSPHP-2 Are Variations on a Conserved Theme
As previously reported, arabinogalactan polysaccharides isolated from endogenous AGPs can contain hundreds of glycosyl residues [11,37]. We also observed that the sizes of Hyp-O-polysaccharides released from native AGPs purified from the culture media of Arabidopsis suspension culture ranged from 65 to 142 glycosyl residues per Hyp [11]. However, Hyp-O-polysaccharides generated from the overexpression of AGP glycomodule- or AGP-recombinant proteins fused with EGFP or hIFNα-2b, produced by either transgenic tobacco or Arabidopsis suspension culture, were much smaller, with an average size of around 15 to 22 sugar residues. These Hyp-O-polysaccharides include AHP-1 and -2 from (Ala-Hyp)51-EGFP, Hyp-polysaccharide-1 and -2 from hIFNα-2b-(Ser-Hyp)20, and Hyp-glycan from rhGH-(Ser-Hyp)10, all overexpressed by tobacco cells [11,15,38], as well as AtSPHP-1 and -2 from this work. The smaller AG size is most likely due to the overexpression of the corresponding fusion glycoproteins/glycomodules and the inability of the glycosylation enzymes to completely glycosylate all of the abundant polypeptides produced by the translational machinery under the control of the CaMV 35S promoter. In other words, the limited supply of glycosylation enzymes and sugar donors resulted in short polysaccharides attached to the polypeptides. Future co-expression of AGP-specific glycosyltransferases and nucleotide sugar interconverting enzymes with (Ser-Hyp)32-EGFP may produce fusion glycoproteins with longer arabinogalactan polysaccharides.
Recent work demonstrated the existence of long β-1,6-galactan side chains 1→6-linked to the AGP galactan backbone [29,30]. The results showed that up to 30% of AG side chains were long β-1,6-galactans decorated with Ara, GlcA, and other minor glycosyl residues. However, it is hard to identify long β-1,6-galactan side chains in AtSPHP-1 and -2, mainly because of the short average length of the AGs. Furthermore, NMR methods provide only an average picture of all of the polysaccharides in the sample, which makes it hard to distinguish possible minor long β-1,6-galactans from the whole population.
Our HMBC spectra clearly show the existence of a β-1,6-linkage on the AG galactan backbone. The simple patterns of the HSQC spectra suggest that tobacco AHP-1 and AHP-2 and Arabidopsis AtSPHP-1 and AtSPHP-2 share a repeating structural motif. They have a galactan backbone composed of two β-1→3 trigalactosyl blocks connected by a β-1→6 linkage, although one block of AHP-1 is truncated, lacking a Gal at the non-reducing end of the backbone. This kinked feature of the AG galactan backbone is reminiscent of the backbone repeats of mixed-linkage glucan (MLG) from Poales [39]. The biosynthetic mechanism of MLG may have some relevance for AGP galactan backbone assembly and warrants future investigation. In addition, side chains, when present on the trigalactosyl blocks, initiate with Gal sc residues β-1→6 linked to the first and second Gal bb residues in the trigalactosyl blocks, numbering from the reducing end. When present, α-linked arabinosyl chains ranging in size from one to three residues are α-1→3-linked to the Gal sc residues, while β-linked GlcpA occupies the O-6 position. Thus, such trigalactosyl units with their carbohydrate decorations may serve as the building blocks for large AG polysaccharides. This is supported by the fact that the short AtSPHP-1 and -2 and the large Hyp-O-AGs of native AGPs isolated from Arabidopsis suspension culture media are composed of Ara, Gal, and GlcA in similar molar ratios [11].
Our recent work on an RG-I-AGP complex released from cell walls of Arabidopsis suspension cultured cells by endopolygalacturonase suggests that the terminal Rha of AGPs may serve as the attachment site of RG-I [24]. Notably, the Arabidopsis AGs lacked the terminal α-L-rhamnopyranosyl residues 4-linked to GlcpA in tobacco AHP-1 and AHP-2. Is the lack of Rha due to the lack of corresponding rhamnosyl transferase activity in the suspension cultured Arabidopsis cells? Where does the AGP Rha of RG-I-AGP come from? This apparent contradiction is likely related to how the RG-I-AGP complex is synthesized. One possibility is that Rha addition in Arabidopsis is regulated and only occurs when the assembly of RG-I-AGP complexes is needed, which requires precisely controlled expression of AGP-specific rhamnosyl transferases. Another possibility is that the intermolecular Rha between the AGP and RG-I originates from the Rha at the reducing end of an RG-I glycan that is transferred to the terminal GlcpA of an AGP by possible pectin transglycanases. Nevertheless, it is worth confirming whether any genes encoding AGP-specific rhamnosyl transferases and the corresponding enzyme activity are present in these cells when sequences of such rhamnosyl transferases become available.
Another observation is that AtSPHP-1 and -2 from Arabidopsis cell culture underwent less glucuronosylation than AHP-1 and -2 from tobacco cell culture. A similar trend was also observed by comparing GlcA amounts in EGFP-LeAGP-1 overexpressed by Arabidopsis and tobacco cells, which showed a significantly lower amount of GlcA in the major Hyp-O-polysaccharide fractions prepared from Arabidopsis EGFP-LeAGP-1 [11]. However, although GlcA is a minor component, its conserved presence in bifurcated calcium-binding [40] AG side chains might be directly related to their global biological significance, as demonstrated recently [8,32], which suggests that binding and releasing apoplastic calcium is a function of AGPs.
A conserved backbone structure with different side chain decoration seems to be the major theme of AGP glycosylation in different plant species and tissues. Indeed, co-precipitation with β-Gal Yariv reagent, a common feature of AGPs, only requires a conserved β-1,3-galactan with a degree of polymerization (DP) greater than five [41]. Here, the average galactan backbone length of Arabidopsis (Ser-Hyp)32-EGFP able to co-precipitate with Yariv reagent is six, which is above the minimal requirement of five and close to seven, the DP that leads to more efficient Yariv precipitation. On the other hand, the difference in side chain modifications not only suggests the presence of distinct glycosyl transferases and possibly glycosyl hydrolases for AGP glycosylation and maturation between the two species, but also indicates the existence of minimum AG structures required for their in vivo functions [41]. Future identification and classification of AGP glycosyl transferases in different plant species and tissues will help us understand the evolution and divergence of AGP glycosylation, and thus the functions of AGP glycosylation in plants.

Isolation of (Ser-Hyp)32-EGFP from Arabidopsis Suspension Cultured Cells
Arabidopsis cells were transformed with the (Ser-Pro)32-EGFP gene cassette, and cell lines expressing the fusion glycoprotein were selected and cultured as described earlier [11,18]. (Ser-Hyp)32-EGFP was isolated from culture media by a combination of hydrophobic interaction chromatography and reverse-phase chromatography, as described earlier [11,18]. In brief, culture media of the transformed Arabidopsis cells were filtered and concentrated via rotary evaporation at room temperature. Solid NaCl was added to the concentrated media to a final concentration of 2 M, and the solution was loaded on a hydrophobic interaction column (Phenyl Sepharose 6 Fast Flow, 16 × 700 mm, Amersham-Pharmacia Biotech, Amersham, UK). The bound materials were eluted from the column with distilled water, dialyzed against distilled water using 3.5 kDa MWCO dialysis tubing (Spectrum Chemical, New Brunswick, NJ, USA), concentrated via lyophilization, and separated on a semipreparative reverse-phase column (10 µm, PRP-1, 7 × 305 mm, Hamilton) at a flow rate of 1 mL/min using a gradient of 100% (v/v) buffer A (0.1% [v/v] aqueous trifluoroacetic acid) increasing to 70% (v/v) buffer B (80% [v/v] acetonitrile in 0.1% aqueous trifluoroacetic acid) over 105 min. The eluent was monitored at 220 nm on a 1050 HPLC system (Hewlett-Packard, Novi, MI, USA). The (Ser-Hyp)32-EGFP eluted at about 50% (v/v) buffer B.
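To relate the gradient description to an approximate elution position, the sketch below assumes a strictly linear gradient from 0% to 70% buffer B over 105 min and ignores system dwell volume; both are assumptions, since the text gives only the end points and duration. The function names are hypothetical.

```python
def percent_b(t_min, start_b=0.0, end_b=70.0, duration_min=105.0):
    """Buffer B percentage at time t for an assumed linear gradient."""
    t = min(max(t_min, 0.0), duration_min)  # clamp to the gradient window
    return start_b + (end_b - start_b) * t / duration_min


def time_at_percent_b(target_b, start_b=0.0, end_b=70.0, duration_min=105.0):
    """Time (min) at which the assumed linear gradient reaches target_b."""
    if not start_b <= target_b <= end_b:
        raise ValueError("target lies outside the gradient range")
    return (target_b - start_b) / (end_b - start_b) * duration_min


def acetonitrile_percent(b_percent, acn_in_b=80.0):
    """Acetonitrile content (% v/v) of the mixed mobile phase."""
    return b_percent * acn_in_b / 100.0


elution_time = time_at_percent_b(50.0)
elution_acn = acetonitrile_percent(50.0)
print(round(elution_time, 1), round(elution_acn, 1))
```

Under these assumptions, elution at about 50% buffer B corresponds to roughly 75 min into the gradient and about 40% (v/v) acetonitrile in the mobile phase.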

Co-Precipitation with Yariv Reagent
We assayed the reactivity of reverse-phase chromatography isolated fractions with β-galactosyl Yariv reagent using tobacco AGPs as a standard, as described earlier [17,18]. In brief, one hundred micrograms of (Ser-Hyp)32-EGFP was dissolved in 300 µL of ddH2O, followed by the addition of 300 µL of (β-D-galactosyl)3-Yariv reagent at a concentration of 1 mg/mL in 2% [w/v] NaCl aqueous solution. The mixture was incubated at room temperature for 1 h, followed by centrifuging at 6000 × g to collect the precipitates. The precipitates were washed with 2% NaCl aqueous solution and dissolved in 0.1 M NaOH. The absorbance of the solution was measured at 420 nm and compared to the AGP standard.

Isolation of Hyp-O-Glycans
The (Ser-Hyp)32-EGFP (40 mg) was dissolved in 2 mL of 0.44 N NaOH (aqueous) and heated at 108 °C for 20 h. The hydrolysate was cooled on ice and titrated to pH 7.8 with 1 M HCl. The neutralized solution was then freeze-dried.
The hydrolysate was dissolved in 1 mL of distilled water and fractionated on an analytical Superdex peptide column (Amersham Biosciences, Piscataway, NJ, USA) eluted with 20% acetonitrile (aqueous) at a flow rate of 0.3 mL/min [11]. Fractions (0.6 mL per fraction) were assayed for Hyp and neutral sugar. High-molecular-weight fractions containing Hyp and sugar from the (Ser-Hyp)32-EGFP hydrolysate were selected and then re-fractionated on the Superdex peptide column before further analyses.
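The fraction size and flow rate fix the collection time per fraction, and they allow fraction numbers such as 17 and 19 (AtSPHP-1 and AtSPHP-2) to be converted into approximate elution windows. The sketch below assumes that fraction collection starts at injection and that fractions are numbered from 1; neither detail is stated in the text, so the absolute volumes are illustrative only.

```python
FRACTION_VOLUME_ML = 0.6   # from the text
FLOW_RATE_ML_MIN = 0.3     # from the text


def minutes_per_fraction(volume_ml=FRACTION_VOLUME_ML, flow_ml_min=FLOW_RATE_ML_MIN):
    """Collection time for a single fraction."""
    return volume_ml / flow_ml_min


def elution_window_ml(fraction_number, volume_ml=FRACTION_VOLUME_ML):
    """(start, end) elution volume of a fraction, assuming numbering starts at 1
    and collection starts at injection (both assumptions)."""
    if fraction_number < 1:
        raise ValueError("fraction numbers start at 1")
    return ((fraction_number - 1) * volume_ml, fraction_number * volume_ml)


for n in (17, 19):
    start, end = elution_window_ml(n)
    print(f"fraction {n}: {start:.1f}-{end:.1f} mL")
```

Each fraction takes about 2 min to collect; under the stated assumptions, fractions 17 and 19 span about 9.6-10.2 mL and 10.8-11.4 mL, respectively.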

Sugar Analyses and Hyp Assay
Neutral sugars were analyzed as alditol acetate derivatives by gas chromatography (GC) using a 6-foot × 2 mm PEG succinate 224 column programmed from 130 °C to 180 °C at 4 °C/min [11,17]. Data were captured by Hewlett-Packard ChemStation software. One hundred micrograms of glycoprotein was used for each analysis. We assayed the uronic acid content of 70 µg of each sample via the specific colorimetric assay based on the reaction with m-hydroxydiphenyl [17]. Glucuronic acid was the standard. Glycosyl compositions were also analyzed by GC/mass spectrometry (GC/MS) of trimethylsilyl derivatives of methyl glycosides, as previously described [42]. Glycosyl linkage analysis was performed by combined GC/MS of the partially methylated alditol acetate (PMAA) derivatives produced from the sample, using a procedure that was slightly modified from a previously described method [43], in which uronic acid components of each sample were methylated and reduced using lithium aluminum deuteride (LiAlD4) prior to PMAA derivatization.
Hydroxyproline was assayed colorimetrically using Kivirikko's method [17], which involves alkaline hypobromite oxidation and subsequent coupling with acidic Ehrlich's reagent and monitoring at 560 nm.

NMR Analyses
All NMR spectra were collected at the Campus Chemical Instrument Center, Ohio State University. Two milligrams of each Hyp-O-glycan isolated from (Ser-Hyp)32-EGFP was dissolved in 0.7 mL of D2O. Because the water signal occurred in the anomeric region of the spectrum, 1D 1H NMR spectra and 2D TOCSY, COSY, HSQC, and HMBC spectra were recorded at 55 °C on a Bruker 800 MHz DRX spectrometer outfitted with a cryoprobe [15,34]. HSQC experiments were carried out with spectral widths of 8 kHz for 1H and 30 kHz for 13C. The one-bond coupling constant (1JCH) was set to 150 Hz with a relaxation delay of 1.5 s. For HMBC spectra, the dataset was acquired at 4 × 1 K data points with the multiple-bond JCH coupling set to 8 Hz. DQCOSY spectra were collected with a spectral width of 8 kHz in both dimensions with a relaxation delay of 2 s. For TOCSY experiments, the mixing time was set at 90 ms. All data were processed using NMRPipe and analyzed using NMRView [15,34].
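The acquisition parameters can be put on more familiar scales with a few textbook conversions. The sketch below is illustrative and does not reproduce the authors' pulse-program settings: it assumes a nominal 1H frequency of exactly 800 MHz, a 13C/1H frequency ratio of about 0.25145, and the standard delay relations 1/(4J) for one-bond INEPT transfer and 1/(2J) for long-range evolution in HMBC.

```python
H1_FREQ_MHZ = 800.0            # nominal spectrometer frequency (assumed exact)
C13_TO_H1_RATIO = 0.25145      # approximate 13C/1H Larmor frequency ratio
C13_FREQ_MHZ = H1_FREQ_MHZ * C13_TO_H1_RATIO


def spectral_width_ppm(width_hz, carrier_mhz):
    """Convert a spectral width in Hz to ppm at a given carrier frequency."""
    return width_hz / carrier_mhz


def inept_delay_ms(j_hz):
    """Textbook one-bond INEPT transfer delay, 1/(4J), in milliseconds."""
    return 1000.0 / (4.0 * j_hz)


def long_range_delay_ms(j_hz):
    """Textbook HMBC long-range evolution delay, 1/(2J), in milliseconds."""
    return 1000.0 / (2.0 * j_hz)


h1_width_ppm = spectral_width_ppm(8000.0, H1_FREQ_MHZ)
c13_width_ppm = spectral_width_ppm(30000.0, C13_FREQ_MHZ)
hsqc_delay = inept_delay_ms(150.0)
hmbc_delay = long_range_delay_ms(8.0)
print(h1_width_ppm, round(c13_width_ppm, 1), round(hsqc_delay, 2), hmbc_delay)
```

With these assumptions, the 8 kHz 1H window is 10 ppm wide and the 30 kHz 13C window is about 149 ppm wide, which comfortably covers the anomeric and ring carbon regions discussed in the results; a 150 Hz one-bond coupling corresponds to a delay of about 1.67 ms, and an 8 Hz long-range coupling to 62.5 ms.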

Conclusions
Compared with the AG structures of AGP glycomodules expressed by tobacco suspension cultured cells, the AGP glycomodules produced by Arabidopsis suspension cultures adopt a conserved theme on the AG backbone, while the side chain has different patterns of glycosyl additions. Specifically, Arabidopsis AGPs lack terminal rhamnose decoration and have a much lower level of glucuronosylation.

Data Availability Statement: Data will be made available upon request.