O-Glycan-Dependent Interaction between MUC1 Glycopeptide and MY.1E12 Antibody by NMR, Molecular Dynamics and Docking Simulations

Anti-mucin1 (MUC1) antibodies have been widely used for breast cancer diagnosis and treatment. This is based on the fact that MUC1 undergoes aberrant glycosylation upon cancer progression, and anti-MUC1 antibodies differentiate changes in glycan structure. MY.1E12 is a promising anti-MUC1 antibody with a distinct specificity toward MUC1 modified with an immature O-glycan (NeuAcα(2-3)Galβ(1-3)GalNAc) on a specific Thr. However, the structural basis for the interaction between MY.1E12 and MUC1 remains unclear. The aim of this study is to elucidate the mode of interaction between MY.1E12 and MUC1 O-glycopeptide by NMR, molecular dynamics (MD) and docking simulations. NMR titration using MUC1 O-glycopeptides suggests that the epitope is located within the O-linked glycan and near the O-glycosylation site. MD simulations of the MUC1 glycopeptide showed that O-glycosylation significantly limits the flexibility of the peptide backbone and of the side chain of the O-glycosylated Thr. Docking simulations using modeled MY.1E12 Fv and MUC1 O-glycopeptide suggest that VH mainly contributes to the recognition of the MUC1 peptide portion, while VL mainly binds to the O-glycan part. The VH/VL-shared recognition mode of this antibody may be used as a template for the rational design and development of anti-glycopeptide antibodies.


Introduction
It has been established that malignant transformation of cells involves abnormal glycosylation of cell surface molecules [1]. Mucin1 (MUC1) was discovered as a carcinoma-associated mucin-like glycoprotein antigen and found to be peanut-agglutinin-reactive [2]. So far, many anti-MUC1 antibodies have been developed, and some of them are specific to cancer progression [3,4,5,6,7]. This specificity likely originates from the fact that abnormal O-glycosylation occurs at MUC1 on cancer cells. MUC1 is therefore considered a prime target of specific immunotherapy, including antibody-drug conjugates and CAR-T therapy [8,9]. Currently, some anti-MUC1 monoclonal antibodies are widely used as a clinical tool to detect and monitor breast cancer [10].
Information on the precise epitope and binding specificity is currently rather limited for many of the antibodies developed. So far, the binding modes of anti-MUC1 glycopeptide antibodies have mainly been revealed by X-ray crystallography. A crystal structure of SM3 in complex with a MUC1 glycopeptide revealed that the antibody mainly recognizes the peptide part, and that the GalNAc residue points towards the solvent and has only limited interaction with the SM3 antibody [11]. Another example is the anti-MUC1 antibody AR20.5, whose co-crystal structure shows that the sugar moiety of the MUC1 glycopeptide does not directly contact the antibody [12]. It seems that O-glycosylation induces a preferred conformation, which is recognized by AR20.5.
MY.1E12 (mouse IgG2a) is an antibody that was developed by immunizing mice with human milk fat membrane. It has been shown that this MY.1E12 antibody binds to MUC1 in an O-glycan-dependent manner [13,14]. Yoshimura et al. synthesized a series of MUC1 peptides with different glycosylation patterns as ligands for anti-MUC1 antibodies and investigated their affinity using an ELISA assay [15]. It was shown that MY.1E12 recognizes NeuAcα(2-3)Galβ(1-3)GalNAc only when attached to Thr8 of MUC1. This suggests that the binding of MY.1E12 is highly specific. Therefore, this antibody is a promising candidate for the development of cancer therapeutics. However, the structural basis of this unique binding mode has not yet been characterized.
In this study, we used NMR, MD simulations and docking simulations to analyze the mode of interaction between MUC1 O-glycopeptide and MY.1E12.

NMR Titration Study
First, we conducted NMR titration studies using MY.1E12 and MUC1 O-glycopeptides, both to experimentally detect the interaction and to obtain information on the epitope region. We used three O-glycopeptides with different peptide chain lengths, i.e., [...] Figure S16). From this observation, binding between MY.1E12 and MUC1 (27AA) in solution was experimentally confirmed. The NH signals from the C-terminus remained sharp even in the presence of excess MY.1E12 (antibody:ligand = 1:0.5), suggesting that the C-terminal region of MUC1 (27AA) is not involved in the interaction with MY.1E12. Line broadening of His side chains is also indicative of an antibody-binding region. There are two His residues in MUC1 (27AA), H5 and H25. The H5 side-chain signals are broader than those of H25. This implies that the H5 side chain is at or near the antibody binding site, while that of H25 is not.
The binding was also monitored by 2D CLIP-COSY experiments to avoid signal overlap (Figure 2b). In the presence of antibody, signals from the C-terminal region of the glycopeptide (S19, T20, A21 and V27) were clearly observed and sharp (Figure 2b). Therefore, it is likely that the C-terminal region of MUC1 (27AA) is not included in the binding epitope of this antibody. To analyze the data quantitatively, the peak height of each signal in the CLIP-COSY spectra was measured, and the ratio of peak heights (MUC1 + antibody / MUC1 alone) was plotted (Figure 2c). This result supports the conclusion that the N-terminal region of the MUC1 glycopeptide is indeed involved in binding to the antibody.
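The peak-height ratio described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the 0.5 cut-off for calling a signal "broadened", and all peak heights are hypothetical and are not values measured in this study.

```python
from typing import Dict, List


def peak_height_ratios(free: Dict[str, float], bound: Dict[str, float]) -> Dict[str, float]:
    """Ratio (MUC1 + antibody) / (MUC1 alone) for signals seen in both spectra."""
    ratios = {}
    for residue, h_free in free.items():
        # Skip signals missing from the bound spectrum or with no reference height.
        if residue in bound and h_free > 0:
            ratios[residue] = bound[residue] / h_free
    return ratios


def broadened_residues(ratios: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Residues whose ratio falls below an (assumed) threshold, sorted by label."""
    return sorted(r for r, v in ratios.items() if v < threshold)


# Hypothetical peak heights in arbitrary units (NOT data from this study).
free = {"V7": 100.0, "T8": 80.0, "S19": 90.0, "T20": 120.0}
bound = {"V7": 20.0, "T8": 8.0, "S19": 81.0, "T20": 114.0}

ratios = peak_height_ratios(free, bound)
print(ratios)
print(broadened_residues(ratios))
```

A low ratio indicates line broadening on antibody binding, so residues with small ratios are candidates for the epitope region, whereas ratios near 1 indicate signals that remain sharp.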
We performed a similar NMR titration experiment using a shorter glycopeptide, MUC1 (20AA), that still contained the putative epitope (Figure 3, Supplementary Figures S17 and S18). We observed that the signal from V7 Hγ is significantly broadened, while the T20 Hγ signal remains sharp in the presence of an equimolar amount of antibody (antibody:MUC1 = 1:4) (Figure 3). This suggests that the C-terminal region of MUC1 (20AA) is less involved in MY.1E12 binding than the N-terminal region of the glycopeptide.
[...] Figure 5). The distributions of the φ and ψ torsion angles between the NeuAc and Gal residues were previously reported as φ = 69° ± 14° and ψ = −125° ± 16° [20]. The distributions of the torsion angles between the Gal and GalNAc residues are reported in the glycan fragment database as φ = −77° ± 18° and ψ = −150° ± 49° [21]. Therefore, the distributions of torsion angles from the MD simulation are broadly consistent with previous reports.
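Because torsion angles are periodic, a comparison with reference values such as ψ = −150° ± 49° is best done with circular statistics. The following sketch shows one way to do this. The function names and the sampled angles are hypothetical, and the criterion "mean within one reference standard deviation" is an assumption made for illustration, not the analysis used in this study.

```python
import math
from typing import Sequence, Tuple


def circular_mean_sd(angles_deg: Sequence[float]) -> Tuple[float, float]:
    """Circular mean and circular standard deviation, both in degrees."""
    n = len(angles_deg)
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    mean = math.degrees(math.atan2(s, c))
    r = min(1.0, math.hypot(s, c))  # mean resultant length, clamped for rounding
    sd = math.degrees(math.sqrt(-2.0 * math.log(r))) if r > 0 else float("inf")
    return mean, sd


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Signed difference a - b wrapped into the interval [-180, 180)."""
    return (a_deg - b_deg + 180.0) % 360.0 - 180.0


def within_reference(mean_deg: float, ref_mean: float, ref_sd: float) -> bool:
    """True if the mean lies within one reference SD (assumed criterion)."""
    return abs(angular_difference(mean_deg, ref_mean)) <= ref_sd


# Hypothetical psi samples that straddle the +/-180 degree boundary
# (NOT values taken from the MD trajectory of this study).
psi_samples = [170.0, -175.0, -160.0, 178.0, -170.0]
mean_psi, sd_psi = circular_mean_sd(psi_samples)

# Reference for Gal-GalNAc psi quoted in the text: -150 +/- 49 degrees [21].
print(within_reference(mean_psi, -150.0, 49.0))
```

A plain arithmetic mean of these samples would be about −31°, which is far from any sampled value; the circular mean stays near −175°, which is why wrap-around handling matters here.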

Modeling of MY.1E12
To perform the docking study, we built a 3D model of the MY.1E12 Fv domain using Discovery Studio 2021. CDRs were identified using the Annotate Sequence tool in Discovery Studio 2021 (Figure 6a), and the numbering scheme was based on IMGT [22]. A chimeric antibody (PDB ID: 3MBX, IgG1) [23] was used as the template. It comprises an H chain from an anti-human IL-13 antibody (IgG) and an L chain from an anti-human EMMPRIN antibody (IgG). MY.1E12 shares 66.7% CDR sequence identity with this template. A Ramachandran plot of the 3D model (Figure 6b,c) [24] shows that most of the amino acid residues, except for Gly, are located within the favored region, supporting the 3D model in terms of the main-chain dihedral angles.
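As an illustration of how a figure such as 66.7% sequence identity arises, the sketch below computes percent identity over a pre-aligned pair of sequences. The definition used here (identical positions divided by aligned columns, ignoring columns that are gaps in both sequences) is an assumption and may differ from the one applied by Discovery Studio. The two strings are toy sequences, not the CDRs of MY.1E12 or 3MBX.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy aligned strings (hypothetical): 4 identical positions out of 6.
print(round(percent_identity("ARDYGS", "ARDWGT"), 1))
```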

Docking of MUC1 Glycopeptide to MY.1E12
Docking poses of the MUC1 O-glycopeptide and the MY.1E12 antibody were built using ZDOCK software [25]. MUC1 (9AA), which is known to bind to MY.1E12, was used as the ligand. Docking was performed under conditions such that the entire ligand and the CDRs of MY.1E12 are involved in binding.
Since ZDOCK applies a rigid docking procedure, pseudo-flexible docking was performed using three MUC1 conformers that were extracted from the MD trajectory. The conformers were chosen at simulation times of 8 ns, 9 ns, and 10 ns, and 30 docking poses were obtained. Of these, 10 (33.3%) were categorized into one group sharing a similar docking topology (Figure 7a). The solvent-accessible surface area (ASA) was calculated from these 10 docking poses to identify the binding sites of the receptor and ligand. For the analysis of binding sites (epitope and paratope), solvent accessibility was calculated in the presence and absence of the binding partner. The results show that MY.1E12 uses CDR H1, H2 and H3 for binding to MUC1 O-glycan, while CDR L1, L2, L3 and H3 are involved in binding to the MUC1 peptide (Figure 7b). Paratope analysis shows that the N-terminal region of the MUC1 peptide (A1-S6) interacts with the VH domain, while NeuAc is recognized by the VL domain (Figure 7c). This NeuAc-VL interaction is consistent with previous reports that sialic acid is essential for binding to MY.1E12 [14,15]. A 2D plot analysis indicates that the heavy-chain CDR loops (H1, H2 and H3) interact with the peptide region, while the light-chain CDR1 loop (L1) binds to the glycan part (Figure 7d). [...] exhibits a positively charged area associated with a lysyl residue located near the NeuAc residue (Figure 8).
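The binding-site analysis described above, in which solvent accessibility is compared in the presence and absence of the binding partner, can be sketched as follows. This is a minimal sketch, not the Discovery Studio procedure: the function names, the 1.0 Å² burial cut-off, the residue labels and all ASA values are hypothetical.

```python
from typing import Dict, List


def buried_residues(asa_free: Dict[str, float],
                    asa_complex: Dict[str, float],
                    min_burial: float = 1.0) -> Dict[str, float]:
    """Residues whose ASA drops by at least min_burial (in A^2) on complex formation."""
    buried = {}
    for residue, free in asa_free.items():
        # A residue missing from the complex table is treated as unchanged.
        delta = free - asa_complex.get(residue, free)
        if delta >= min_burial:
            buried[residue] = delta
    return buried


def contact_frequency(per_pose: List[Dict[str, float]]) -> Dict[str, float]:
    """Fraction of docking poses in which each residue is buried."""
    counts: Dict[str, int] = {}
    for buried in per_pose:
        for residue in buried:
            counts[residue] = counts.get(residue, 0) + 1
    return {residue: n / len(per_pose) for residue, n in counts.items()}


# Hypothetical ASA values in A^2 for the free ligand and two docking poses
# (NOT results from this study).
asa_free = {"NeuAc": 250.0, "Gal": 120.0, "Thr": 60.0, "Ala": 90.0}
pose_1 = {"NeuAc": 100.0, "Gal": 80.0, "Thr": 59.5, "Ala": 90.0}
pose_2 = {"NeuAc": 130.0, "Gal": 120.0, "Thr": 40.0, "Ala": 90.0}

per_pose = [buried_residues(asa_free, pose) for pose in (pose_1, pose_2)]
print(contact_frequency(per_pose))
```

Applying the same calculation to the antibody side, with the antibody ASA computed with and without the ligand, gives the paratope; averaging over the poses of one docking group reduces the dependence on any single pose.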

MD Simulation of MUC1
Coordinates of the MUC1 glycan were created using the Carbohydrate Builder in GLYCAM (https://dev.glycam.org/) (accessed on 6 June 2021). Peptide coordinates were created using Discovery Studio 2021 [24]. The peptide and glycan were then attached using the Glycoprotein Builder in GLYCAM. The coordinates of MY.1E12 were created by homology modeling. CHARMm [26] was assigned as the force field. The simulation time was set to 10 ns. An explicit periodic boundary solvation model with an orthorhombic cell shape was used. The minimum distance from the periodic boundary was set to 7.0 Å. For each MUC1 coordinate, 439–1045 water molecules were explicitly placed, and TIP3 [27] was used as the water model. Minimization of the initial coordinates was done in two steps. The first step eliminated the distortion of the entire structure with the steepest descent algorithm. In the second step, minimization was performed with the adopted basis Newton-Raphson (ABNR) algorithm. The system was then heated to 310 K. After equilibration, the time step was set to 2 fs and the production simulation was run with NAMD under the NPT ensemble [28].

Modeling of MY.1E12 Fv Domain
The 3D structure of the antibody was generated by homology modeling. The amino acid sequences were as follows, with CDRs underlined: [...] Suitable homologous template structures for modeling the target protein were identified using BLAST. CDRs were identified using the Annotate Sequence tool. Then, Identify Framework Templates was used to search for candidate templates. The CDR numbering scheme was based on IMGT [22]. As a result, a chimeric antibody against IL-13 and EMMPRIN (PDB ID: 3MBX) was adopted as the template. Modeling was performed with the Model Antibody Framework tool. Model Antibody Loop was subsequently used to rebuild the CDRs.

Docking Simulation of Glycopeptide and Antibody
Docking simulation of the antibody-glycopeptide complex was performed using ZDOCK. For MUC1-MY.1E12 docking, MY.1E12 was used as the receptor and MUC1 (9AA) as the ligand. Ligand structures for docking were taken from the end of the MD simulation. Because ZDOCK is a rigid-body docking algorithm, we selected three MUC1 conformers from the final part of the MD trajectory, at 8 ns, 9 ns, and 10 ns, to provide alternative ligand structures. In the docking simulations, all glycan regions and all peptide regions were considered as active sites based on the NMR and MD simulation results. The active site of MY.1E12 was defined as the CDR region, and the other sites were defined as blocking sites. The contact surface area was calculated from the ASA.
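For readers who wish to extract conformers at given simulation times, the following sketch converts the times quoted above (8 ns, 9 ns and 10 ns) into MD step numbers for the 2 fs time step, and then into trajectory frame indices. The frame-saving interval of 5000 steps is purely an assumption for illustration, as the saving interval is not stated here, and the function names are hypothetical.

```python
def time_to_step(time_ns: float, timestep_fs: float = 2.0) -> int:
    """MD step number reached at time_ns (1 ns = 1,000,000 fs)."""
    return round(time_ns * 1_000_000 / timestep_fs)


def step_to_frame(step: int, save_interval_steps: int) -> int:
    """Index of the saved frame written at a given step (frame 0 is step 0)."""
    if step % save_interval_steps != 0:
        raise ValueError("no frame was saved at this step")
    return step // save_interval_steps


SAVE_INTERVAL = 5000  # assumed: one frame every 5000 steps (10 ps at 2 fs)

for t_ns in (8, 9, 10):
    step = time_to_step(t_ns)
    print(t_ns, step, step_to_frame(step, SAVE_INTERVAL))
```

With a 2 fs time step, the 10 ns simulation corresponds to 5,000,000 steps, so the three conformers are taken from the last fifth of the trajectory.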

Conclusions
To gain an understanding of the structural basis of the O-glycan-dependent interaction between MUC1 and MY.1E12 antibody, we used several approaches. NMR titration locates the epitope to the O-glycan and nearby amino acid residues. MD simulation suggests O-glycosylation limits the conformational flexibility of the O-glycosylation site. In silico docking implicates both the O-glycan and peptide of the MUC1 ligand in binding to the MY.1E12 antibody. The elucidation of the likely mode of recognition will help develop novel glycopeptide-specific antibodies with desired sequence specificity. We are continuing to clarify the detailed binding mode of this antibody.