Identification of Sumoylated Proteins in the Silkworm Bombyx mori

Small ubiquitin-like modifier (SUMO) modification (SUMOylation) is an important and widely used reversible modification system in eukaryotic cells. It regulates various cellular processes, including protein targeting, transcriptional regulation, signal transduction, and cell division. To understand its role in the model lepidopteran insect Bombyx mori, a recombinant baculovirus was constructed to express an enhanced green fluorescent protein (eGFP)-SUMO fusion protein along with ubiquitin carrier protein 9 of Bombyx mori (BmUBC9). SUMOylation substrates from Bombyx mori cells infected with this baculovirus were isolated by immunoprecipitation and identified by LC-ESI-MS/MS. A total of 68 candidate SUMOylated proteins were identified, of which 59 were functionally categorized to gene ontology (GO) terms. Analysis of Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways showed that 46 of the identified proteins were involved in 76 pathways that mainly play a role in metabolism, spliceosome and ribosome functions, and RNA transport. Furthermore, SUMOylation of four candidates (polyubiquitin-C-like isoform X1, 3-hydroxyacyl-CoA dehydrogenase, cyclin-related protein FAM58A-like, and GTP-binding nuclear protein Ran) was verified by co-immunoprecipitation in Drosophila Schneider 2 cells. In addition, 74% of the identified proteins were predicted to have at least one SUMOylation site. The data presented here shed light on the crucial process of protein SUMOylation in Bombyx mori.

In B. mori, it has been reported that BmSUMO participates in the immune response by regulating the expression of v-rel avian reticuloendotheliosis viral oncogene homolog A of Bombyx mori (BmRelA) [20]. However, several other functions of BmSUMO and the identity of the other substrate proteins that are targeted by the SUMOylation system of B. mori remain largely unknown. Here, a proteomic approach, based on immunoprecipitation, is reported for identification of substrates modified by SUMOylation, which sheds light on the crucial process of protein SUMOylation in B. mori.

Subcellular Localization of Small Ubiquitin-Like Modifier of Bombyx mori (BmSUMO)
It has been reported that in the HeLa cell line, SUMO-1, SUMO-2, and SUMO-3 proteins are all localized in the nuclear membrane, in the nuclear bodies, and in the cytoplasm [4,21]. To investigate the subcellular localization of SUMO in BmN cells, immunofluorescence analysis was carried out using a confocal laser-scanning microscope. Although fluorescence was observed across the entire cell, it was mainly observed in the cytoplasm and appeared as aggregated dots in the nucleus (Figure 1). The latter observation is in accordance with a previous study, which found that many SUMOylated proteins are located in the nucleus [22]. Additionally, SUMO is enriched in the nuclei of cultured Drosophila S2 and pole cells, and repression of SUMOylation shifts its distribution to the cytoplasm [23,24]. In the control set, no fluorescence was detected (Figure 1).

Figure 1.
As a control, pre-immune serum was used as the primary antibody. The nuclei were stained with 4',6-diamidino-2-phenylindole (DAPI). The samples were viewed using a confocal laser fluorescence microscope. Scale bars = 20 and 5 µm, respectively.

Isolation of SUMOylated Proteins
To identify the SUMOylated (modified by SUMO protein) proteins of B. mori, a baculovirus vector, which expresses the fusion protein of enhanced green fluorescent protein (eGFP)-SUMO and the ubiquitin carrier protein 9 of Bombyx mori (BmUBC9), was constructed. BmN cells were infected with the recombinant virus. The SUMOylated proteins were isolated with anti-GFP microbeads. After elution with the supplied elution buffer, the eluate was subjected to SDS-PAGE, followed by western blotting with an anti-GFP antibody. The control was prepared by using the vector expressing eGFP only, and an identical isolation process was performed in parallel. As shown in Figure 2, only the band of eGFP was observed in the control set, while a number of distinct bands, besides that of the 37-kDa eGFP-BmSUMO fusion protein, were observed in the eluate from cells infected with eGFP-SUMO and BmUBC9-expressing recombinant virus. This observation strongly indicated that several SUMOylated candidates had been successfully isolated.

Figure 2.
Isolation of SUMO conjugates by immunoprecipitation. Enhanced green fluorescent protein (eGFP), eGFP-SUMO and ubiquitin carrier protein 9 of Bombyx mori (BmUBC9), and eGFP-SUMO were expressed in baculovirus vectors and subjected to GFP immunoprecipitation. The immunoprecipitated proteins were then separated by SDS-PAGE and subjected to western blot analysis with an anti-eGFP antibody. Lane 1, eGFP; Lane 2, eGFP-SUMO and BmUBC9; Lane 3, eGFP-SUMO only.

Identification of SUMOylated Proteins by LC-ESI-MS/MS
The isolated SUMOylated proteins from eGFP-SUMO and BmUBC9 preparation were identified by LC-ESI-MS/MS. The eluted proteins were electrophoresed for a short time on SDS-PAGE, following which the bands were cut and processed for LC-ESI-MS/MS analysis. The same protocol was used for the control sample. In total, 68 proteins that were identified by two or more peptides in the eGFP-SUMO and BmUBC9 preparation, and absent from eGFP control preparation (Table 1) were considered candidates for SUMOylation. It should be noted that the control eGFP and eGFP-SUMO preparations have some overlapping proteins, which may have been observed owing to nonspecific interactions. In addition, some proteins were found to be unique to the control sample and may have been purified owing to the complexity of the sample preparation as well as the limited sequencing time in the mass spectrometer [25]. This possibility was partially confirmed with the MS results of the SUMO proteins present in both control eGFP and eGFP-SUMO preparations; hence, these were excluded from Table 1. Data regarding the proteins identified by a single peptide are presented in Table S1.
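The candidate selection rule described above (at least two peptides in the eGFP-SUMO and BmUBC9 preparation, and no identification in the eGFP control) can be sketched as a simple filter. This is only an illustration of the stated criterion: the function name, the dictionary layout, and the peptide counts are hypothetical and are not data from this study.

```python
from typing import Dict, List


def select_candidates(
    sample: Dict[str, int],
    control: Dict[str, int],
    min_peptides: int = 2,
) -> List[str]:
    """Return protein IDs identified by >= min_peptides peptides in the
    sample preparation and not identified at all in the control."""
    return sorted(
        protein_id
        for protein_id, n_peptides in sample.items()
        if n_peptides >= min_peptides and protein_id not in control
    )


# Hypothetical peptide counts per protein ID (illustrative only).
sample_counts = {"P1": 3, "P2": 1, "P3": 5, "P4": 2}
control_counts = {"P3": 4, "P5": 2}

candidates = select_candidates(sample_counts, control_counts)
print(candidates)
```

In this toy input, P2 is dropped because it has a single peptide (the kind of hit reported separately in Table S1), and P3 is dropped because it also appears in the control.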

Confirmation of SUMOylated Proteins by Co-Immunoprecipitation (co-IP)
To validate our proteomics approach, four candidate proteins, namely BGIBMGA001549 (polyubiquitin-C-like isoform X1, PCLIS), BGIBMGA000511 (3-hydroxyacyl-CoA dehydrogenase, HCD), BGIBMGA004023 (cyclin-related protein FAM58A-like, CRP), and BGIBMGA006751 (GTP-binding nuclear protein Ran, GNPRan), were selected to verify their modification by SUMO using co-immunoprecipitation (co-IP). Flag-tagged SUMO and His-tagged candidates were co-expressed in Drosophila S2 cells. Two days after expression was induced with CuSO4, the cells were processed for immunoprecipitation with anti-Flag magnetic beads and analyzed by western blotting using an anti-His antibody. As shown in Figure 3, in all four cases examined, the His-tagged candidate proteins could be immunoprecipitated from S2 cells co-transfected with the candidate protein and SUMO; the apparent molecular weights are compatible with the attachment of one (Figure 3B,D,E) or two (Figure 3C) SUMO proteins. No proteins were observed in the control set of S2 cells transfected with GFP and SUMO.

Functional Annotation of SUMOylated Proteins
To understand the functions of the identified proteins, the protein sequences were queried against the InterPro databases. The resulting proteins were functionally categorized based on universal gene ontology (GO) annotation terms by using the online tool WEGO (Web Gene Ontology Annotation Plot) and were classified into cellular component, molecular function, and biological process categories according to the GO hierarchy. The majority of the proteins (59 of 68) were functionally categorized to GO terms (Figure 4). Under the "biological process" category, the proteins were involved in "cellular process" (19.39%), "metabolic process" (16.48%), "single organism process" (9.2%), and "biological regulation" (8.05%). Under the "cellular component" category, the proteins were classified as "cell" (21.56%), "cell part" (21.56%), "organelle" (16.5%), and "macromolecular complex" (14.37%). Under the "molecular function" category, the identified proteins were mainly involved in "binding" (48.15%) and "catalytic activity" (35.8%) (Figure 4).
Analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) tool showed that 46 of the identified proteins were involved in 76 pathways, which were broadly classified into metabolism, genetic information processing, environmental information processing, cellular processes, organismal systems, and human disease. Pathways that play a role in metabolism, spliceosome and ribosome functions, and RNA transport showed the highest number of matches.

Discussion
As an important post-translational modification (PTM), SUMOylation regulates many cellular pathways. However, identifying new SUMO targets and understanding the functions of protein SUMOylation have been largely limited by the low level of SUMOylation. It has been reported that the SUMO-UBC9 fusion protein has a higher conjugating activity for SIM-containing targets such as Sp100 and human cytomegalovirus immediate-early 2 (IE2) protein [26]. Hence, we used a special baculovirus vector, which can express foreign proteins under different promoters, to express eGFP-SUMO and BmUBC9. A control construct that expressed eGFP-SUMO but lacked BmUBC9 was also used for immunoprecipitation (Figure 2, lane 3). The data presented here show that, compared to eGFP-SUMO alone, eGFP-SUMO together with BmUBC9 bound to a much larger number of diverse cellular proteins and interacted with some SIM-containing proteins with higher affinities. Thus, the results confirmed that the SUMO-UBC9 fusion construct could be useful for identifying or purifying SUMO-modified proteins and for modulating the cellular SUMO pathway. However, it should be noted that over-expression of both SUMO and Ubc9 may lead to the pull-down of contaminants, non-relevant SUMO targets, or proteins that interact non-covalently with SUMO but are not SUMOylated themselves. Of the 68 candidate SUMOylated proteins, some may not be truly modified by SUMOylation in vivo. Additional experiments, such as immunoblotting and/or SUMOylation site identification, should be performed to confirm the status and dynamics of SUMOylation of each individual target.
A number of proteomic studies have been performed to identify the substrates of SUMOylation in both yeast and higher eukaryotes [27-30]. On comparing the potential substrates of silkworm with the yeast data [25], some homologs such as budding proteins (Buds), transcriptional activator Gcns, Uba2, and transcript elongation factors (Spts) were identified. In this study, all the components of the multiprotein complex involved in SUMOylation, including SAE1, SAE2, and PIAS, were found to coimmunoprecipitate in the presence of eGFP-BmSUMO and BmUBC9. Furthermore, BmUBC9 and BmSUMO were identified in both eGFP and eGFP-SUMO plus BmUBC9 preparations (data not shown). Some of the homologous proteins identified in this study have been previously reported to be SUMOylated. For example, the endoplasmic reticulum-resident chaperone calreticulin is known to be SUMOylated, and abolishing SUMOylation enhanced calreticulin expression in an X Box binding protein 1 (XBP-1) dependent manner, resulting in increased calreticulin-counteracted endoplasmic reticulum stress [31]. The human transcription factor II D (TFIID) subunits TBP-associated factors 5 (TAF5) and TAF12 are modified by SUMO-1 both in vitro and in vivo, resulting in dynamic regulation of the promoter-binding activity of TFIID [32]. The endoribonuclease Dicer, which cleaves the stem-loop structure from pre-miRNAs, allowing them to dissociate into their mature, 20-22-nucleotide single-stranded form, has also been shown to be SUMOylated [33]. Moreover, as shown in Table 1, a number of ribosomal proteins and heat shock proteins (HSPs) have been identified as SUMOylation substrates. It is worth mentioning that several ribosomal proteins and HSPs isolated from human sperm have also been found to be SUMOylated [34]. SUMOylation of the human ribosomal protein S3 (rpS3) increases its stability [35]. In addition, HSP27 has been reported to be associated with the SUMOylation system [36-39].
In this study, a recombinant baculovirus of B. mori nucleopolyhedrovirus (BmNPV) was used to express eGFP-SUMO plus BmUBC9. Baculoviridae is a family of enveloped, double-stranded DNA (81.7-178.7 kb) viruses that infect invertebrates, particularly insects of the order Lepidoptera. Numerous reports have confirmed that viral infection is closely associated with SUMOylation [40,41]. It is known that some viral proteins such as the chicken embryo lethal orphan (CELO) adenovirus Gam1 protein [42,43], the human papillomavirus (HPV) E6 protein [44,45], and the Epstein-Barr virus (EBV) latent membrane protein 1 (LMP1) [46] can target SUMOylation enzymes, causing broad effects on host SUMO substrates. In addition, some viral proteins such as the human cytomegalovirus (HCMV) IE1 protein [47] and the HPV E1 protein [48] can exploit the host SUMOylation mechanism because they need to be SUMO-modified in order to exert their functions. Some other viral proteins such as human herpes virus-6 (HHV6) IE1, EBV Zta, EBV Rta, human immunodeficiency virus (HIV) P6 Gag, and the N protein of severe acute respiratory syndrome coronavirus (SARS-CoV) have all been shown to take advantage of the host SUMOylation system [5,49,50]. In addition, some viruses encode SUMOylation-mimicking enzymes, such as the Kaposi's sarcoma-associated herpes virus (KSHV) K-bZIP protein and the adenovirus early region 1B 55-kDa protein (E1B-55K), both of which possess SUMO ligase activities [51,52]. During our investigations, several proteins from BmNPV were also identified (Table S2). It has been reported that infection of Spodoptera frugiperda (Sf9) cells by Autographa californica multiple nuclear polyhedrosis virus (AcMNPV) leads to a decrease in the amount of free SUMO, coupled with an increase in the amount of SUMO-conjugated products [53]. Further studies will focus on the additional roles that SUMOylation may play during baculovirus replication.
To identify the isolated SUMOylation substrates conclusively, the sequence of each protein was analyzed using the GPS-SUMO program with the threshold set to MEDIUM [54,55]. Of the 203 proteins identified with one or more peptides, ~74% contained at least one SUMOylation site, and ~13% contained at least one SUMO-interaction motif (SIM) (Table S3). An analysis of all the identified SUMOylation sites showed that approximately 41% (400 out of 983 sites) do not conform to the ΨKxE motif. In this regard, the current understanding of SUMOylation recognition is still inadequate [44]. Thus, although no canonical SUMOylation sites or SIMs were identified in the remaining 13% of proteins, their presence cannot be conclusively ruled out.
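The ΨKxE consensus mentioned above can be illustrated with a short pattern scan. This is a minimal sketch and not the GPS-SUMO algorithm: it assumes that Ψ stands for one of the large hydrophobic residues V, I, L, M, or F (definitions vary between tools), the helper names are hypothetical, and the test sequence is invented. The last line only re-derives the non-consensus fraction from the counts quoted in the text.

```python
import re
from typing import List

# Assumption: Psi = V, I, L, M or F; x = any residue.
# The lookahead lets overlapping motifs be reported as well.
CONSENSUS = re.compile(r"(?=[VILMF]K.E)")


def consensus_lysines(sequence: str) -> List[int]:
    """Return 1-based positions of lysines that sit in a Psi-K-x-E motif."""
    # m.start() is the 0-based index of Psi, so the lysine is at
    # 0-based index m.start() + 1, i.e. 1-based position m.start() + 2.
    return [m.start() + 2 for m in CONSENSUS.finditer(sequence.upper())]


def percent(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


toy_sequence = "MAVKSEAAIKQEGG"  # hypothetical sequence, not a B. mori protein
print(consensus_lysines(toy_sequence))  # [4, 10]
print(percent(400, 983))                # 40.7, i.e. approximately 41%
```

A predictor such as GPS-SUMO also scores non-consensus sites, which a plain pattern match like this one cannot do.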

Plasmid Construction and Cell Transfection
The BmSUMO gene was amplified with the forward primer BmsumoF (5'-GTCGACGCTGATGAAAAGAAGGGA-3') and the reverse primer BmsumoR (5'-CTGCAGTTACTCCTCCGGTCTGCTG-3'). The BmUBC9 gene was amplified with primers UBC9FA (5'-CTCGAGATGTCAGGGATAGCAAGT-3') and UBC9RA (5'-GCATGCGATATTTACTCAGCAGCA-3'). Subsequently, the BmSUMO and BmUBC9 fragments were cloned into a modified pFastBac Dual vector (Invitrogen, Carlsbad, CA, USA), which contains the eGFP coding region under the polyhedrin promoter, and were then transferred into wild-type bacmid DNA by homologous recombination to construct the recombinant baculovirus bacmid eGFP-BmSUMO + BmUBC9. After blue-white selection, the positive colonies were selected and analyzed by PCR with M13 universal primers. The empty vector was used as a negative control.
The recombinant bacmid was transfected into BmN cells for amplification. The third-generation virus (MOI = 10 pfu/cell) was further used to infect BmN cells for subsequent protein expression.
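Each of the four primers listed above begins with a six-base sequence that matches a common restriction enzyme recognition site. The sketch below checks this. The text does not name the enzymes used for cloning, so the enzyme assignment is an inference from the standard recognition sequences, and the function name is hypothetical.

```python
from typing import Dict, Optional

# Standard recognition sequences of four common restriction enzymes.
SITES: Dict[str, str] = {
    "SalI": "GTCGAC",
    "PstI": "CTGCAG",
    "XhoI": "CTCGAG",
    "SphI": "GCATGC",
}

# Primer sequences as given in the text (5' to 3').
PRIMERS: Dict[str, str] = {
    "BmsumoF": "GTCGACGCTGATGAAAAGAAGGGA",
    "BmsumoR": "CTGCAGTTACTCCTCCGGTCTGCTG",
    "UBC9FA": "CTCGAGATGTCAGGGATAGCAAGT",
    "UBC9RA": "GCATGCGATATTTACTCAGCAGCA",
}


def five_prime_site(primer: str) -> Optional[str]:
    """Return the enzyme whose site starts the primer, or None if no match."""
    for enzyme, site in SITES.items():
        if primer.upper().startswith(site):
            return enzyme
    return None


primer_sites = {name: five_prime_site(seq) for name, seq in PRIMERS.items()}
for name, enzyme in primer_sites.items():
    print(f"{name}: {enzyme}")
```

Under this reading, the BmSUMO primers carry SalI and PstI sites and the BmUBC9 primers carry XhoI and SphI sites, which would allow the two fragments to be inserted at separate positions of a dual-promoter vector.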

Western Blotting
Aliquots of each input lysate and both the eGFP-BmSUMO + BmUBC9 and control preparations were subjected to SDS-PAGE. After electrophoresis, the protein samples were transferred onto a PVDF membrane (Immobilon-P, Millipore, Merck KGaA, Darmstadt, Germany) in cold Towbin buffer (0.025 M Tris, 0.19 M Glycine, and 20% methanol) by using the Trans-Blot Cell apparatus (Bio-Rad, Shanghai, China). After blocking with 5% skimmed milk in PBS-T (1× PBS and 0.1% Tween-20), the membrane was washed three times with PBS-T for 5 min and was incubated with either the eGFP-epitope (Beyotime, Beijing, China) antibody or in PBS-T with 5% skimmed milk at 37 °C for 1 h. Subsequently, either anti-rabbit or anti-goat HRP-conjugated secondary antibodies (Pierce, Rockford, IL, USA) were utilized to detect the reactive band(s). The results were visualized with the ECL detection system (Amersham Biosciences, Piscataway, NJ, USA).

Antibody Preparation and Confocal Laser-Scanning Microscopy
The BmSUMO fragment was gel-purified and subcloned into the expression vector pET28a with a 6× His tag at its N-terminus. The fusion protein was expressed in E. coli BL21 by induction with 1 mM isopropyl β-D-thiogalactoside (IPTG) at 30 °C. The polyclonal antibody was raised using standard procedures. The purified SUMO protein was mixed with complete Freund's adjuvant and injected into New Zealand White rabbits, followed by two injections in incomplete Freund's adjuvant.
The BmN cells were washed three times with 1× PBS and fixed in methanol:acetone (1:1) on ice for 15 min; this was followed by three washes with 1× PBS. The cells were then incubated with BmSUMO polyclonal antibody for 1 h at room temperature. After washing, the cells were incubated with Protein-G fused with enhanced green fluorescent protein (eGFP) and stained with the nucleus (DNA)-specific stain DAPI (Sigma, Shanghai, China) for 1 h. The cells were directly observed using a Leica TCS SP5 confocal laser-scanning microscope (Leica, Shanghai, China). The control was prepared as described above, except that the antibody used was pre-immune serum.

LC-ESI-MS/MS Analysis
After adjusting the pH to 8.5 with 1 M ammonium bicarbonate, the total protein (100 μg) extracted from each sample was chemically reduced for 1 h at 60 °C by adding dithiothreitol (DTT) to a final concentration of 10 mM and was carboxyamidomethylated in 55 mM iodoacetamide for 45 min at room temperature in the dark. Then, Trypsin Gold (Promega, Madison, WI, USA) was added to a final substrate/enzyme ratio of 30:1 (w/w). Trypsin digestion was carried out at 37 °C for 16 h. After digestion, the peptide mixture was acidified with 10 μL of formic acid for MS analysis. Each peptide sample was desalted using a Strata X column (Phenomenex, Guangzhou, China), vacuum-dried, and then resuspended in 200 μL of buffer A (2% acetonitrile and 0.1% formic acid). After centrifugation at 20,000× g for 10 min, the supernatant was recovered to obtain a peptide solution with a final concentration of ~0.5 μg/μL. In total, 10 μL of the supernatant was loaded onto a 2-cm C18 trap column in an LC-20AD nanoHPLC (Shimadzu, Kyoto, Japan) by using the autosampler. Then, the peptides were eluted onto an in-house packed 10-cm analytical C18 column (inner diameter, 75 μm). The samples were loaded at 8 μL/min for 4 min. Then, a gradient of 2% to 35% buffer B (98% acetonitrile and 0.1% formic acid) was run for 44 min at 300 nL/min, followed by a 2-min linear gradient to 80% buffer B and maintenance at 80% buffer B for 4 min, after which the composition was returned to 5% buffer B in 1 min. The peptides were subjected to nanoelectrospray ionization followed by tandem mass spectrometry (MS/MS) in a Q Exactive (ThermoFisher Scientific, San Jose, CA, USA) coupled online to the HPLC. Peptides were detected in the Orbitrap at a resolution of 70,000. The peptides were selected for MS/MS using the higher-energy collisional dissociation (HCD) operating mode with a normalized collision energy setting of 27.0; ion fragments were detected in the Orbitrap at a resolution of 17,500.
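The chromatography steps above can be laid out as a cumulative timeline to check the total run length. This is a sketch that assumes the steps run back to back in the order given in the text and that the 4-min loading step counts toward the run time. The injected peptide amount is derived from the stated volumes and ignores any loss during desalting.

```python
from itertools import accumulate
from typing import List, Tuple

# (step description, duration in minutes), in the order given in the text.
STEPS: List[Tuple[str, int]] = [
    ("load sample at 8 uL/min", 4),
    ("2% to 35% buffer B at 300 nL/min", 44),
    ("linear ramp to 80% buffer B", 2),
    ("hold at 80% buffer B", 4),
    ("return to 5% buffer B", 1),
]


def timeline(steps: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Return (description, start_min, end_min) for consecutive steps."""
    ends = list(accumulate(duration for _, duration in steps))
    starts = [0] + ends[:-1]
    return [(name, s, e) for (name, _), s, e in zip(steps, starts, ends)]


for name, start, end in timeline(STEPS):
    print(f"{start:>2} to {end:>2} min: {name}")

# 100 ug of protein resuspended in 200 uL, of which 10 uL is injected.
injected_ug = 100 / 200 * 10
print(f"peptide amount injected: {injected_ug} ug")
```

Under these assumptions, one run takes 55 min and about 5 μg of peptides is injected.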
A data-dependent procedure that alternated between one MS scan and 15 MS/MS scans was applied to the 15 most abundant precursor ions above a threshold ion count of 20,000 in the MS survey scan, with a dynamic exclusion duration of 15 s. The electrospray voltage applied was 1.6 kV. Automatic gain control (AGC) was used to optimize the spectra generated by the Orbitrap. The AGC target was 3 × 10^6 for full MS and 1 × 10^5 for MS2. For MS scans, the m/z scan range was 350 to 2000. For MS2 scans, the m/z scan range was 100 to 1800.
Drosophila S2 cells were grown in Schneider's insect medium (Sigma-Aldrich, Shanghai, China) supplemented with 10% FBS. A total of 1 × 10^7 S2 cells were transfected with 1 µg each of the SUMO and SUMOylation candidate constructs. Twenty-four hours after transfection, 0.5 mM CuSO4 was added to the medium to induce protein expression. Forty-eight hours after induction, the cells were collected and washed with PBS. The washed cells were then lysed on ice for 1 h in IP buffer (25 mM Tris-HCl pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% NP-40, and 5% glycerol) supplemented with 10 mM N-ethylmaleimide and 1× protease inhibitor cocktail. After centrifugation at 12,000 rpm for 10 min, the supernatants were collected and incubated with Anti-FLAG M2 Magnetic Beads (Sigma-Aldrich, Shanghai, China) on ice for 3 h. The beads were finally collected and washed three times with IP buffer. Proteins bound to the beads were eluted by adding 20 µL of 1× electrophoresis sample buffer and analyzed by western blotting with an anti-His antibody (Sigma-Aldrich, Shanghai, China). The whole cell lysates were used as input samples.
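The data-dependent selection described above (the 15 most abundant precursors above 20,000 counts, with a 15-s dynamic exclusion) can be sketched as follows. This is a simplified illustration and not the instrument's acquisition logic. The function name and peak list are hypothetical, and matching excluded precursors by m/z rounded to two decimals is an assumption made only to keep the example short.

```python
from typing import Dict, List, Tuple


def pick_precursors(
    peaks: List[Tuple[float, float]],
    now_s: float,
    last_selected: Dict[float, float],
    top_n: int = 15,
    min_count: float = 20_000.0,
    exclusion_s: float = 15.0,
) -> List[float]:
    """Choose precursor m/z values for MS/MS from one survey scan.

    peaks: (m/z, ion count) pairs; now_s: scan time in seconds;
    last_selected: rounded m/z -> time it was last chosen (updated in place).
    """
    eligible = [
        (mz, count)
        for mz, count in peaks
        if count > min_count
        and now_s - last_selected.get(round(mz, 2), float("-inf")) >= exclusion_s
    ]
    # Most abundant first, then keep at most top_n precursors.
    eligible.sort(key=lambda peak: peak[1], reverse=True)
    chosen = [mz for mz, _ in eligible[:top_n]]
    for mz in chosen:
        last_selected[round(mz, 2)] = now_s
    return chosen


# Hypothetical survey scan: the 600.30 peak is below the count threshold.
survey = [(500.25, 5.0e4), (600.30, 1.0e4), (700.35, 9.0e4)]
history: Dict[float, float] = {}
first = pick_precursors(survey, 0.0, history)
second = pick_precursors(survey, 10.0, history)   # both still excluded
third = pick_precursors(survey, 20.0, history)    # exclusion has expired
print(first, second, third)
```

Dynamic exclusion keeps the instrument from fragmenting the same abundant ions repeatedly, so that lower-abundance precursors get a chance to be sequenced.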

Data Analysis
Raw data files acquired from the Orbitrap were converted into MGF files using Proteome Discoverer 1.2 (PD 1.2, Thermo, Pittsburgh, PA, USA), and the MGF files were searched. Protein identification was performed using the Mascot search engine (Matrix Science, London, UK; version 2.3.02) against a database containing 21,893 sequences. For protein identification, a mass tolerance of 20 ppm was permitted for intact peptide masses and 0.6 Da for fragment ions, with allowance for one missed cleavage in the tryptic digests. Gln->pyro-Glu (N-term Q), Oxidation (M), and Deamidated (NQ) were set as potential variable modifications, and Carbamidomethyl (C) was set as the fixed modification. The charge states of the peptides were set to +2 and +3, and an automatic decoy database search was performed in Mascot by selecting the decoy checkbox, in which a random sequence database is generated and searched with the raw spectra alongside the real database. To reduce the probability of false peptide identification, only peptides with significant scores (≥20) at the 99% confidence interval generated by a Mascot probability analysis greater than "identity" were counted as identified. Every confident protein identification involved at least one unique peptide. Functional annotations of the proteins were conducted using the Blast2GO program against the non-redundant protein database (NR; NCBI). The KEGG database [56,57] and the COG database [58,59] were used to classify and group the identified proteins.
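The two mass tolerances quoted above are expressed in different units: the 20 ppm peptide tolerance is relative, whereas the 0.6 Da fragment tolerance is absolute. The sketch below shows how each check could be applied. The function names and example masses are hypothetical, and this is not how Mascot is implemented internally.

```python
def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def peptide_mass_matches(
    observed: float, theoretical: float, tol_ppm: float = 20.0
) -> bool:
    """Relative tolerance check for intact peptide masses."""
    return abs(ppm_error(observed, theoretical)) <= tol_ppm


def fragment_mass_matches(
    observed: float, theoretical: float, tol_da: float = 0.6
) -> bool:
    """Absolute tolerance check for fragment ion masses."""
    return abs(observed - theoretical) <= tol_da


# Hypothetical masses in Da, for illustration only.
print(peptide_mass_matches(1000.01, 1000.0))   # True: about 10 ppm off
print(peptide_mass_matches(1000.05, 1000.0))   # False: about 50 ppm off
print(fragment_mass_matches(500.5, 500.0))     # True: 0.5 Da off
print(fragment_mass_matches(501.0, 500.0))     # False: 1.0 Da off
```

Because the peptide tolerance is relative, the allowed window widens with mass: 20 ppm corresponds to about 0.02 Da at 1000 Da and about 0.04 Da at 2000 Da.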

Conclusions
SUMOylation represents a vital PTM that pervades numerous aspects of cell biology, including protein targeting, transcriptional regulation, signal transduction, and cell division. To provide further insight into this complex process, a proteomics approach was undertaken to identify the targets of SUMOylation in the model lepidopteran B. mori. A total of 68 candidate SUMOylated proteins were identified from the B. mori proteome, of which 59 proteins were functionally categorized to GO terms. KEGG pathways analysis showed that 46 proteins were involved in 76 pathways that mainly play a role in metabolism, spliceosome and ribosome functions, and in RNA transport. In total, 74% of the identified proteins were predicted to have at least one SUMOylation site. These data shed light on the crucial process of protein SUMOylation in B. mori.