NMR-based Structural Studies of the Glycosylated MUC1 Tandem Repeat Peptide

MUC1 is a glycoprotein that plays an important role in cancer pathogenesis. To study the effect of glycosylation on the conformational propensities of the tandem repeat domain of MUC1, we have determined the structure of the MUC1 tandem repeat peptide AHGVTSAPDTRPAPGSTAPP, O-glycosylated with the trisaccharide ( -Glc-1,4-Glc-1,4-GalNAc-) at Thr5. This glycopeptide was synthesized to model a heavily O-glycosylated threonine residue in the tandem repeat domain. The NMR experiments used in this study included TOCSY, NOESY, ROESY, DQF-COSY, HSQC and 1D NMR. The peak volumes determined using the program SPARKY were converted into distance constraints using the program CALIBA. The programs FiSiNOE and HABAS were used to generate angle constraints. Using the conformational restraints obtained from NMR, the program DYANA was used to determine the structures of the peptide. Finally, structural refinement was performed within the SYBYL software package using GLYCAM parameters and Kollman all-atom types. The presence of strong sequential αN connectivities suggested an extended conformation of the peptide backbone. Strong sequential αδ connectivities were indicative of a trans conformation of the Ala-Pro peptide bonds. In addition, the presence of sequential NN connectivities in the peptide segments Gly3-Val4-Thr5-Ser6, Asp9-Thr10-Arg11 and Gly15-Ser16 was indicative of twist-like conformations of the peptide backbone in these segments.


Introduction
MUC1 [1] is a glycoprotein expressed on the cell surface of normal and cancer cells. The large extracellular fragment of MUC1 consists of tandem repeats of a 20 amino acid motif. Each tandem repeat AHGVTSAPDTRPAPGSTAPP includes two serine and three threonine residues that are potential sites of O-glycosylation. In normal secretory epithelial cells, MUC1 is heavily glycosylated and expressed on the apical surface, whereas in cancer cells MUC1 is aberrantly glycosylated and contributes to reconfiguration of cell-cell interactions. The altered properties of MUC1 on tumor cells modulate adhesion to receptors on adjoining cells and tissues, thereby facilitating metastasis [2]. Thus, glycosylation of MUC1 plays an important role in the pathophysiology of various cancers, including pancreatic adenocarcinoma and lung, breast and colon cancer.
The initiation of O-glycosylation of MUC1 in the tandem repeat domain is catalyzed by a family of enzymes called UDP-GalNAc: polypeptide N-acetylgalactosaminyltransferases (GalNAc transferases) [3][4][5][6][7]. Previous in vitro studies on the initiation of glycosylation of the tandem repeat domain of MUC1 have shown that attachment of GalNAc residues to specific sites within the tandem repeat peptide of MUC1 facilitates glycosylation at other sites and leads to high-density glycosylation [8][9][10]. It has been hypothesized that, following initial glycosylation, the GalNAc transferases bind to the GalNAc residue through their lectin domain, which results in conformational changes in their catalytic domain or in the acceptor substrate peptide backbone that facilitate glycosylation at proximal and distant sites [9]. To characterize the conformational changes in acceptor substrates that take place following initial glycosylation, previous studies focused on MUC1-based tandem repeat peptides monoglycosylated with a single GalNAc residue at Thr5 [11]. We extended those studies to tandem repeat peptides glycosylated with bulky carbohydrate moieties. Using NMR, we have investigated the structure of a MUC1-based tandem repeat peptide O-glycosylated with the trisaccharide ( -Glc-1,4--Glc-1,4--GalNAc-) at Thr5. The trisaccharide -Glc-1,4--Glc-1,4--GalNAc was designed to mimic natural glycoforms that glycosylate the tandem repeat domain of MUC1.

Materials and Methods
Total correlation spectroscopy (TOCSY), nuclear Overhauser enhancement spectroscopy (NOESY), double quantum filtered correlation spectroscopy (DQF-COSY), 1D NMR, and heteronuclear single quantum coherence (HSQC) NMR experiments were used to obtain structural data on the peptide. All data were acquired on a 600 MHz Varian INOVA spectrometer operating at 14.1 Tesla. The sample was dissolved in 90:10 H2O:D2O to a concentration of 3 mM at pH 4.5, and the data were acquired at a sample temperature of 5 °C. Water suppression was accomplished by presaturation (50 Hz field) except for the HSQC experiment, which used pulsed field gradient solvent suppression. NMR data were apodized with a Gaussian function and processed using VNMR software (Varian, Palo Alto, CA).
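As a quick consistency check on the stated field strength, the proton Larmor frequency can be worked out from the magnetic field. The value of the proton gyromagnetic ratio used here (about 42.577 MHz/T) is a standard physical constant and is not taken from the original text.

```latex
\nu_0 \;=\; \frac{\gamma_{\mathrm{H}}}{2\pi}\,B_0
\;\approx\; 42.577\ \mathrm{MHz\,T^{-1}} \times 14.1\ \mathrm{T}
\;\approx\; 600.3\ \mathrm{MHz}
```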
Sequential resonance assignments of protons [12] in the glycopeptide were obtained using the TOCSY and NOESY spectra. 3J(HN,Hα) vicinal coupling constants were obtained from the DQF-COSY spectra. In addition to data obtained from the NOESY and TOCSY spectra, the HSQC spectrum was used to assist resonance assignments of protons in the trisaccharide. The peak intensities in the NOESY spectrum (200 ms) were integrated using the Gaussian fit method implemented in the program SPARKY [13]. The peak integrals were input into the program CALIBA [14] to obtain upper distance constraints. The distance constraints and coupling constants were used by the programs FiSiNOE [15] and HABAS [16] to generate φ, ψ, and χ1 torsion angle constraints and stereospecific assignments.
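The conversion of NOESY peak volumes into upper distance limits rests on the approximate inverse sixth-power dependence of the NOE on the interproton distance. The following is a minimal sketch of that idea only, not a reproduction of CALIBA, which applies its own calibration classes. The function names, the single reference peak, and the 2.4 Å and 5.5 Å caps are illustrative assumptions.

```python
def calibrate_constant(ref_volume: float, ref_distance: float) -> float:
    """Return k in V = k / r**6 from one reference cross peak.

    ref_volume and ref_distance are hypothetical inputs: the volume of a
    cross peak between two protons whose separation (angstroms) is known.
    """
    if ref_volume <= 0 or ref_distance <= 0:
        raise ValueError("reference volume and distance must be positive")
    return ref_volume * ref_distance ** 6


def upper_distance_limit(volume: float, k: float,
                         lower_cap: float = 2.4,
                         upper_cap: float = 5.5) -> float:
    """Convert a peak volume into an upper distance bound in angstroms.

    The caps are assumed values that keep bounds in a physically
    plausible range; they are not taken from the paper.
    """
    if volume <= 0:
        raise ValueError("peak volume must be positive")
    r = (k / volume) ** (1.0 / 6.0)
    return min(max(r, lower_cap), upper_cap)


# Hypothetical calibration: a peak of volume 1000 at a known 2.5 angstroms.
k = calibrate_constant(1000.0, 2.5)
# A peak 64 times weaker corresponds to twice the distance (2**6 == 64).
weak_peak_limit = upper_distance_limit(1000.0 / 64.0, k)
```

Because of the sixth-power dependence, even large errors in peak volume translate into modest errors in distance, which is one reason the constraints are used as upper bounds rather than as exact distances.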
Additional stereospecific assignments for β-methylene protons were obtained using the program GLOMSA [14]. A model of the peptide AHGVTSAPDTRPAPGSTAPPA, O-glycosylated with the trisaccharide ( -Glc-1,4--Glc-1,4--GalNAc-) at Thr5, was created within the software package SYBYL (version 6.6) [17] using the Kollman force field [18] and GLYCAM parameters [19]. The torsion angle constraints, distance constraints, and stereospecific assignments were used by the program DYANA [20] within SYBYL to generate 100 structures of the glycopeptide, which were further subjected to constrained energy minimization. To determine specific conformational features of the peptide segment GVTSA, which contains the glycosylated Thr and flanking residues, the 100 energy-minimized structures were clustered using a root-mean-square-deviation (RMSD) criterion of 0.6 Å for the backbone N, Cα and C atoms. The conformational propensities of individual amino acid residues in the peptide segment GVTSA were determined by their relative occupancy in different regions of the Ramachandran plot [21,22].
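The clustering step can be illustrated with a small sketch. The text does not specify the clustering algorithm, so this example makes two substitutions that should be read as assumptions: it uses a simple greedy "leader" scheme, and it compares structures by distance-matrix RMSD (dRMSD), which needs no superposition, rather than by the coordinate RMSD after superposition used in the study. A 0.6 Å dRMSD cutoff is therefore not numerically equivalent to the 0.6 Å criterion above, and dRMSD cannot distinguish mirror images.

```python
import math
from itertools import combinations


def pairwise_distances(coords):
    """All interatomic distances for one structure (list of (x, y, z))."""
    return [math.dist(a, b) for a, b in combinations(coords, 2)]


def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD between two structures with matching atoms."""
    da = pairwise_distances(coords_a)
    db = pairwise_distances(coords_b)
    if len(da) != len(db) or not da:
        raise ValueError("structures must have the same atoms (at least 2)")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


def leader_cluster(structures, cutoff=0.6):
    """Greedy clustering: each structure joins the first cluster whose
    leader (first member) lies within the cutoff, else starts a new one.
    Returns a list of clusters, each a list of structure indices."""
    clusters = []
    for i, coords in enumerate(structures):
        for members in clusters:
            if drmsd(structures[members[0]], coords) <= cutoff:
                members.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Toy three-atom "backbones" (hypothetical coordinates, in angstroms).
extended = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(5.0, 5.0, 5.0), (6.5, 5.0, 5.0), (8.0, 5.0, 5.0)]  # same shape
bent = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 3.0, 0.0)]
clusters = leader_cluster([extended, shifted, bent], cutoff=0.6)
```

The fraction of conformers in each cluster is then the cluster size divided by the number of structures.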

NOE Connectivities and Coupling Constants
Figure 1 summarizes the sequential connectivities observed in the peptide backbone. These data indicate strong to medium sequential αN NOE connectivities between the majority of the amino acid residues. Strong sequential αδ connectivities within the dipeptide segments Xxx-Pro and the absence of sequential αα connectivities were indicative of a trans conformation for these dipeptide segments [12].
The abundant sequential αN connectivities suggest that a substantial fraction of the conformers of the glycopeptide in solution have extended conformations of the peptide backbone. However, the presence of weak to medium sequential NN connectivities in the peptide segments Gly3-Val4-Thr5*-Ser6 (* = -Glc-1,4--Glc-1,4--GalNAc-), Asp9-Thr10-Arg11 and Gly15-Ser16 (Figures 1 and 2) indicates that a significant population of conformers have twist-like conformations of the peptide backbone in these segments. The absence of medium-range NOEs in the peptide excludes the presence of tight turns or α-helical structures in the peptide backbone. Thus, the majority of conformers of the glycopeptide exist in solution in a dynamic equilibrium between extended and twist-like conformations for the peptide segments GVTSA, DTR and GS. The degenerate chemical shifts of the Hα and NH protons of His2, Val4, Thr5, Ser6, Ala7, and Thr10 precluded the determination of 3J(HN,Hα) coupling constants for these amino acid residues. The coupling constants for the remaining amino acid residues ranged from 6.2 to 8.5 Hz and did not provide any evidence for the presence of ordered secondary structures in the peptide backbone.
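To put the measured range of coupling constants in context, the vicinal HN-Hα coupling can be related to the backbone torsion angle φ through a Karplus-type relation. The sketch below uses the commonly cited parameterization of Pardi, Billeter and Wüthrich (A = 6.4, B = -1.4, C = 1.9 Hz, with θ = φ - 60°). That choice is an assumption here, since the text does not state which relation, if any, was applied.

```python
import math

# Karplus coefficients in Hz (Pardi et al. parameterization; assumed here).
A, B, C = 6.4, -1.4, 1.9


def karplus_j_hn_ha(phi_deg: float) -> float:
    """Predicted 3J(HN,H-alpha) in Hz for a backbone torsion angle phi."""
    theta = math.radians(phi_deg - 60.0)
    return A * math.cos(theta) ** 2 + B * math.cos(theta) + C


j_beta = karplus_j_hn_ha(-120.0)   # idealized extended beta-strand phi
j_helix = karplus_j_hn_ha(-57.0)   # idealized alpha-helix phi
```

With these coefficients, an idealized β-strand gives about 9.7 Hz and an idealized α-helix about 3.9 Hz. The observed 6.2 to 8.5 Hz range falls between these two limits, as expected when the measured value is an average over interconverting conformers.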

Hα Chemical Shift Deviations for the Glycopeptide
The Hα proton chemical shift deviations of amino acid residues from their corresponding random coil values [23] provide an indication of specific secondary structures in peptides and proteins. A summary of the chemical shift deviations of the amino acid residues in the glycopeptide is shown in Figure 3.

Cluster Analysis of the Peptide Segment GVTSA
To determine structural features of the peptide segment GVTSA, we clustered all conformers of the glycopeptide at the segment GVTSA using an RMSD criterion of 0.6 Å for the backbone atoms N, Cα, and C. The clustering resulted in three distinct clusters (Clusters 1, 2, and 3) containing 25%, 45%, and 16% of the conformers, respectively, and several small clusters with 1-4 conformers each. To determine distinct conformational propensities of individual amino acid residues in the peptide segment GVTSA, the average structures of the conformers in each cluster were calculated. The φ and ψ torsion angle values (in degrees) for each amino acid residue in the average structure of the peptide segment GVTSA in each cluster were determined (Table 1).
In the peptide segment GVTSA, Val4 showed conformational propensities for the extended β-strand-like and inverse turn-like regions of the Ramachandran plot. Thr5 and Ala7 showed conformational propensities for regions corresponding to extended β-strand-like conformations, whereas Ser6 showed conformational propensities for regions corresponding to extended β-strand-like or polyproline II-like conformations.
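The assignment of residues to regions of the Ramachandran plot, and the relative occupancy of each region, can be sketched as follows. The region centers are approximate textbook values chosen for illustration, and the nearest-center rule is a simplification. Neither is taken from the paper, which relies on the region definitions of refs. [21,22]. In particular, treating the "inverse turn-like" region as an inverse γ-turn is an assumption.

```python
import math
from collections import Counter

# Approximate (phi, psi) centers in degrees; illustrative assumptions only.
REGION_CENTERS = {
    "beta-strand": (-120.0, 130.0),
    "polyproline II": (-75.0, 145.0),
    "inverse gamma-turn": (-80.0, 70.0),
    "alpha-helix": (-60.0, -45.0),
}


def angular_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def classify(phi: float, psi: float) -> str:
    """Name of the region whose center is nearest to (phi, psi)."""
    def distance(name: str) -> float:
        c_phi, c_psi = REGION_CENTERS[name]
        return math.hypot(angular_difference(phi, c_phi),
                          angular_difference(psi, c_psi))
    return min(REGION_CENTERS, key=distance)


def occupancy(angles):
    """Fraction of (phi, psi) pairs falling in each region."""
    counts = Counter(classify(phi, psi) for phi, psi in angles)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


# Hypothetical torsion angles for one residue across four conformers.
example = [(-125.0, 135.0), (-118.0, 128.0), (-70.0, 150.0), (-85.0, 75.0)]
fractions = occupancy(example)
```

Applied to the φ and ψ values of one residue across all conformers, `occupancy` yields the kind of per-region fractions from which a conformational propensity can be read off.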
To determine the effects of extensive glycosylation on the conformational propensities of the peptide backbone, further studies will focus on peptides glycosylated with trisaccharides at other potential sites of O-glycosylation in the tandem repeat domain, including Thr10 in the PDTR region and Thr17 in the GSTA region. These conformational studies, together with in vitro kinetic and affinity studies of these glycopeptides, will provide important insights into the regulation of glycosylation of the tandem repeat domain of MUC1.

Figure 1. Summary of sequential NOE connectivities and coupling constants observed between amino acid residues in the peptide backbone.

Figure 2. A section of the 200 ms NOESY spectrum of the glycopeptide showing sequential amide-amide cross peaks (HN = amide proton of the amino acid residues, HAC = amide proton of the N-acetyl group in the -GalNAc residue).

Figure 3. Val4 and Thr5 showed significant downfield deviations (~0.3 ppm) in Hα chemical shifts from random coil values. These deviations in chemical shifts from random coil values were probably caused by local conformational effects induced by the glycosylation at Thr5. In general, despite the presence of amino acids having significant deviations of chemical shifts from the random coil values, the lack of a dense grouping of amino acids with significant downfield or upfield shifts suggests the absence of ordered secondary structures, such as α-helices or β-strands, in the peptide backbone.

Figure 4. Schematic diagram showing connectivities between protons of the -GalNAc residue in the trisaccharide and protons of the amino acid residues in the peptide segment VTSA (H1, H2, H3, H4, H5, H61, H62 = protons of the -GalNAc residue, Q3AC = methyl group pseudo atom in the N-acetyl group, QQG = pseudo atom at the center of the two methyl groups in Val4, QG2 = pseudo atom at the center of the methyl group in Thr5, HAC = amide proton in the N-acetyl group, HA = α proton in the peptide backbone, H = amide proton in the peptide backbone).

Table 1. Values of the torsion angles φ and ψ (in degrees) of amino acid residues in the average structure of each cluster for the peptide segment GVTSA.