Identification of 35 C-Type Lectins in the Oriental Armyworm, Mythimna separata (Walker)

Simple Summary
The oriental armyworm Mythimna separata is a lepidopteran agricultural pest that causes serious damage to many crops, such as maize, wheat, and sorghum. To control this pest, it is advisable to take comprehensive measures, including the use of chemical pesticides, microbial pesticides, and cultural practices. However, microbial pesticides (entomopathogens) can be eliminated by the insect immune system. C-type lectins (CTLs) are a family of pattern-recognition receptors that recognize carbohydrates and mediate immune responses. C-type lectins in the oriental armyworm have not yet been identified and characterized. In this study, a transcriptome of M. separata larvae was constructed and a total of 35 CTLs containing single or dual carbohydrate-recognition domains (CRDs) were identified from unigenes. Phylogenetic analyses, sequence alignments, and structural predictions were performed. Gene expression profiles in different developmental stages, naïve larval tissues, and bacteria/fungi-challenged larvae were analyzed. Overall, our findings indicate that most dual-CRD CTLs are expressed in mid-late-stage larvae, pupae, and adults. Bacterial and fungal challenges can stimulate the expression of many CTLs in larval hemocytes, fat body, and midgut. Our data suggest the importance of CTLs in immune responses of M. separata.

Abstract
Insect C-type lectins (CTLs) play vital roles in modulating humoral and cellular immune responses. The oriental armyworm, Mythimna separata (Walker) (Lepidoptera: Noctuidae), is a migratory pest that causes significant economic loss in agriculture. CTLs have not yet been systematically identified in M. separata. In this study, we first constructed a transcriptome of M. separata larvae, generating a total of 45,888 unigenes with an average length of 910 bp. Unigenes were functionally annotated in six databases: NR, GO, KEGG, Pfam, eggNOG, and Swiss-Prot. Unigenes were enriched in functional pathways, such as those of signal transduction, endocrine system, cellular community, and immune system. Thirty-five unigenes encoding C-type lectins were identified, including CTL-S1~CTL-S6 (single CRD) and IML-1~IML-29 (dual CRD). Phylogenetic analyses showed dramatic lineage-specific expansions of IMLs. Sequence alignment and structural modeling identified potential ligand-interacting residues. Real-time qPCR revealed that CTL-Ss are mainly expressed in eggs and early stage larvae, while IMLs are mainly expressed in mid-late-stage larvae, pupae, and adults. In naïve larvae, hemocytes, fat body, and epidermis are the major tissues that express CTLs. In larvae challenged by Escherichia coli, Staphylococcus aureus, or Beauveria bassiana, the expression of different CTLs was stimulated in hemocytes, fat body, and midgut. The present study will help further explore functions of M. separata CTLs.


Introduction
Insects depend on the innate immune system to recognize and eliminate pathogens [1]. Germline-encoded pattern recognition receptors (PRRs) can recognize pathogen-associated

RNA Sample Preparation, Library Construction, and Sequencing
Fourth instar larvae were frozen in liquid nitrogen and stored at −80 °C. Total RNA was isolated using the Trizol Reagent (Invitrogen Life Technologies, Carlsbad, CA, USA). RNA concentration was determined using a NanoDrop spectrophotometer (Thermo Scientific, Waltham, MA, USA). RNA quality and integrity were determined by RNA agarose gel electrophoresis and an Agilent Bioanalyzer 2100 system. Three micrograms of RNA were used as input material for the RNA sample preparations. Sequencing libraries were generated using the TruSeq RNA Sample Preparation Kit (Illumina, San Diego, CA, USA). To select cDNA fragments of the preferred length of 200 bp, the library fragments were purified using the AMPure XP system (Beckman Coulter, Beverly, CA, USA). DNA fragments with ligated adaptor molecules on both ends were selectively enriched using the Illumina PCR Primer Cocktail in a 15-cycle PCR. Products were purified and quantified using the Agilent high sensitivity DNA assay on a Bioanalyzer 2100 system. The library was sequenced on a NovaSeq 6000 platform (Illumina) by Personal Biotechnology Co., Ltd. (Nanjing, Jiangsu, China).

De Novo Transcriptome Analysis Flow
Raw data were filtered to remove low-quality reads using Cutadapt v2.7 to generate clean data (>10 bp overlap: AGATCGGAAG; 20% base error rate was allowed) [28]. Trinity v2.5.1 with the default settings was used to assemble clean reads into transcript sequences [29]. The longest transcript of each gene (unigene) was extracted as the representative sequence of the gene. Databases used in gene annotation include NR (NCBI non-redundant protein sequences), GO (Gene Ontology), KEGG (Kyoto Encyclopedia of Genes and Genomes), eggNOG (evolutionary genealogy of genes: Non-supervised Orthologous Groups), Swiss-Prot, and Pfam.
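The unigene selection step described above can be illustrated with a minimal Python sketch. It assumes Trinity-style identifiers in which the isoform suffix `_i<number>` follows the gene identifier; the function names and the toy sequences are hypothetical and are not taken from the study's pipeline.

```python
import re

# Assumption: Trinity-style isoform IDs end in "_i<number>";
# stripping that suffix gives the gene ID.
ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def parse_fasta(text):
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def longest_per_gene(records):
    """Keep the longest isoform of each gene (the first one wins on ties)."""
    best = {}
    for name, seq in records:
        gene = ISOFORM_SUFFIX.sub("", name)
        if gene not in best or len(seq) > len(best[gene][1]):
            best[gene] = (name, seq)
    return best


# Toy input (hypothetical sequences, for illustration only).
example = """>TRINITY_DN10_c0_g1_i1 len=6
ATGAAA
>TRINITY_DN10_c0_g1_i2 len=9
ATGAAACCC
>TRINITY_DN11_c0_g1_i1 len=4
ATGC
"""

unigenes = longest_per_gene(parse_fasta(example))
for gene, (isoform, seq) in unigenes.items():
    print(gene, isoform, len(seq))
```

Here each gene maps to a single representative isoform, which mirrors the "longest transcript per gene" criterion stated in the text.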

Analyses of the Expression Profiles by Real-Time qPCR (RT-qPCR)
To explore the expression profile in different developmental stages, samples from six stages (eggs, early stage larvae, mid-stage larvae, late-stage larvae, pupae, and adults) were ground in liquid nitrogen and stored at −80 °C. To explore the expression profile in naïve larval tissues, fifth instar larvae were anesthetized on ice and dissected to collect hemocytes, fat body, midgut, epidermis, and Malpighian tubules. Collected tissues were immediately homogenized in SparkZol reagent (Sparkjade Biotechnology Co., Ltd., Jinan, Shandong, China). For the gene induction analysis, E. coli, S. aureus, and B. bassiana conidia were inactivated with 3% formaldehyde. A total of 10^4 bacteria were resuspended in PBS (10 mM phosphate buffer, 37 mM NaCl, 2.7 mM KCl, pH 7.4) and injected into fifth instar larvae with a microsyringe. PBS was injected as the negative control. B. bassiana conidia were resuspended in PBS with 0.05% Tween-80, and 4 × 10^4 conidia were injected into larvae. PBS with 0.05% Tween-80 was injected as the negative control. Hemocytes, fat body, and midgut were collected 6 and 24 h post-injection. Three larvae were used in each group, and all experiments were performed in three replicates. cDNA was synthesized from 1 µg total RNA using the SPARKscript II RT Plus Kit (with gDNA Eraser). RT-qPCR (95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s) was performed using MonAmp™ ChemoHS qPCR Mix (Monad Biotech Co., Ltd., Suzhou, Jiangsu, China) with the CFX96 real-time PCR detection system (Bio-Rad, Hercules, CA, USA). The relative gene expression level was calculated by the 2^−ΔΔCT method. Primer information is provided in Table S1.
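For readers unfamiliar with the 2^−ΔΔCT calculation, the following Python sketch shows the arithmetic. It assumes approximately 100% amplification efficiency for both primer pairs; the reference gene is not named in this section, so the argument names and the Ct values below are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample against the control (calibrator) sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 3 cycles earlier in the
# challenged sample while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)
```

A ΔΔCT of −3 corresponds to a 2^3-fold increase, so each cycle of difference represents a two-fold change under the efficiency assumption.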

Transcriptome Sequencing, Unigene Assembly, and Functional Annotation
A cDNA library was constructed for M. separata larvae and sequenced using the Illumina platform. This run produced 47,594,876 raw reads and 44,966,148 clean reads (clean reads: 94.47%, Q20: 97.45%, Q30: 93.07%). The clean reads were assembled into 81,837 transcripts with a mean length of 1135 bp. From these transcripts, 45,888 unigenes with a mean length of 910 bp (N50 = 1687 bp) were obtained. The transcriptome dataset was deposited in the Sequence Read Archive (PRJNA702891).
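The N50 statistic reported above can be computed as in the following Python sketch, which uses the common definition (the contig length at which the cumulative sum of lengths, sorted from longest to shortest, first reaches half of the total). The function name and the toy lengths are hypothetical; the actual unigene lengths are not reproduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy lengths for illustration only (total = 32 bases).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(toy_n50, toy_mean)
```

Because N50 is weighted toward long sequences, it is typically larger than the mean length, as is the case for the unigene set described here (1687 bp versus 910 bp).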

Spatial and Temporal Expression Profiles
To explore the possible functions of CTLs, the expression profile in different developmental stages was analyzed by RT-qPCR. The hierarchical clustering analysis shows distinct expression patterns: CTL-S1, CTL-S2, CTL-S4, CTL-S5, and CTL-S6 are mainly expressed in eggs and early stage larvae; IML-4, IML-5, and IML-18 are expressed in adults; IML-3,  Figure 7A). Larval tissues can produce many immune factors to resist the invasion of pathogens. Thus, the expression profile in larval hemocytes, fat body, midgut, Malpighian tubules, and epidermis was analyzed. The clustering analysis shows that hemocytes, fat body, and epidermis are the major tissues expressing CTL genes. Notably, IML-14 and IML-29 are mostly expressed in the midgut (Figure 7B).

Discussion
With the advancement of next-generation sequencing (NGS) techniques, RNA-seq has become an indispensable tool for studying the transcriptome of non-model organisms, including some agricultural pests [39]. In this study, transcriptome sequencing generated 44,966,148 clean reads and 45,888 unigenes with a mean length of 910 bp, which is comparable to the previous M. separata transcriptomes [40–47]. Most homologs of M. separata transcripts were found in lepidopteran species, especially in A. transitella and B. mori. By Gene Ontology (GO) classification and KEGG pathway annotation, unigenes were classified into a variety of biological processes, cellular components, molecular functions, and pathways.
Animal C-type lectin-like domain-containing proteins can be classified into 16 groups based on domain architecture and phylogenetic relationships [48]. In insects, CTLs can be classified based on domain architecture into CTL-S, IML, and CTL-X. CTL-S exist in several insect orders: Lepidoptera, Coleoptera, Hymenoptera, Diptera, and Hemiptera. The numbers of CTL-S vary in different species: Bombyx mori (12), Manduca sexta (8), Tribolium castaneum (10), Drosophila melanogaster (30), Anopheles gambiae (21), Aedes aegypti (37), Acyrthosiphon pisum (2), and Plutella xylostella (5). Immulectins have been found almost exclusively in Lepidoptera: Bombyx mori (6) and Manduca sexta (19) [5]. Here, we identified 6 CTL-S and 29 immulectins from the unigenes of M. separata larvae. The phylogenetic analysis showed that CTL-S genes were duplicated in the common ancestor before speciation (Figure 5A), while most IML genes were duplicated after speciation (Figure 5B). Similar phylogenetic relationships were also found in B. mori and M. sexta [7,8]. Since immulectins broadly participate in regulating the innate immune responses, the expansion of immulectins may greatly improve the survival rates of lepidopteran pests in the natural environment.
The expression profile across developmental stages shows that most IMLs are expressed in larvae and pupae. Only three IMLs (IML-4, IML-5, and IML-18) are expressed in adults. Interestingly, most CTL-S are expressed in eggs and early stage larvae (Figure 7A). These results suggest that CTL-S may be important for the development of and immunity in embryos and early stage larvae, while IMLs are critical for immunity in larvae, pupae, and adults. A Periplaneta lectin participates in the organization or stabilization of the epidermis during leg regeneration [49]. H. armigera CTL3 maintains normal larval growth and development by maintaining ecdysone and juvenile hormone signaling and suppressing the abundance of Enterococcus mundtii [50]. CTLs also show a specific spatial expression pattern in naïve larval tissues: hemocytes, fat body, and epidermis are responsible for expressing CTLs (Figure 7B). These are the major larval tissues generating immune molecules. Bacterial and fungal infections induced dramatic changes in the expression of some CTLs (Figures 8 and 9). Our findings are consistent with previous transcriptomic studies. In the cotton bollworm H. armigera, a transcriptome-based analysis showed that most CTL genes did not undergo any significant changes in the second instar larvae after B. bassiana infection, while most of them were upregulated in the fat body of the fifth instar larvae [17]. Although we did not compare the induction of CTLs between early stage and late-stage M. separata larvae, we found that most IMLs were expressed in mid-late-stage larvae and pupae. These data suggest that IMLs are important for immune responses in these stages. In the Japanese pine sawyer beetle, Monochamus alternatus, infected with the entomopathogenic fungus Metarhizium anisopliae, several differentially expressed unigenes were CTLs [51]. Twelve CTL genes were identified among the immune-responsive genes of Adelphocoris suturalis (Hemiptera: Miridae) against fungal and bacterial pathogens [52]. Fourteen CTLs were identified among the immunity-related genes of Ostrinia furnacalis against entomopathogenic fungi [53].
A simple rule to predict the ligand specificity of CRDs is based on some key residues: 'EPN'-motif CRDs can recognize mannose-type ligands; 'QPD'-motif CRDs usually recognize galactose-type ligands. Mutating 'EPN' in MBP-A to 'QPD' caused a shift from mannose to galactose ligands [54]. Conversely, mutating 'QPD' in sea cucumber CEL-I to 'EPN' led to a weak binding affinity for mannose [55]. Some surrounding residues or structures can also affect ligand selection. An additional mutation of Trp105 to His in CEL-I further increased the affinity to mannose [55]. A glycine-rich loop helps to exclude mannose and accommodate galactose in Gal-type CTLs [56,57]. However, there are exceptions to this rule. CEL-IV, a CTL in the sea cucumber Cucumaria echinata, contains the 'EPN' motif but binds galactose [58]. TC14, a CTL from the tunicate Polyandrocarpa misakiensis, contains 'EPS' but binds galactose [59]. Some CRDs bind carbohydrates in the absence of the 'EPN/QPD' motif or Ca2+. The CRD of eosinophil major basic protein (EMBP) binds to heparin and heparan sulfate at a different contact site through electrostatic interactions and hydrogen bonds [60]. Bivalve lectins SPL-1 and SPL-2, which contain 'RPD' and 'KPD' motifs, showed Ca2+-independent binding affinity for GlcNAc or GalNAc [61]. Structural studies have elucidated how these residues interact with Ca2+ and carbohydrates. For monosaccharide ligands, steric restrictions are imposed by the coordination bonds and hydrogen bonds formed between 3-OH/4-OH, Ca2+, and 'EPN/QPD' motifs. Mannose has equatorial 3-OH and equatorial 4-OH, while galactose has equatorial 3-OH and axial 4-OH. In addition, the hydrogen donors and acceptors are reversed between E(acceptor)-P-N(donor) and Q(donor)-P-D(acceptor) (Figure 6D,E). Because of the exceptions noted above, the predicted ligand specificity needs to be verified experimentally.
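The motif rule discussed above can be written as a small lookup, sketched here in Python. This is only a first-pass illustration: the function name and return labels are hypothetical, the tripeptide is assumed to have been located already (for example, from a sequence alignment), and the exceptions cited in the text (such as CEL-IV) show that the output is a hypothesis rather than a result.

```python
# First-pass rule only: exceptions exist (e.g., an 'EPN' lectin that binds
# galactose), so every prediction needs experimental verification.
MOTIF_RULE = {
    "EPN": "mannose-type",
    "QPD": "galactose-type",
}


def predict_ligand(motif):
    """Return the ligand class suggested by a CRD tripeptide motif.

    Motifs outside the simple rule (e.g., 'EPS', 'RPD', 'KPD') are
    reported as 'undetermined' rather than guessed.
    """
    return MOTIF_RULE.get(motif.strip().upper(), "undetermined")


for motif in ("EPN", "QPD", "EPS"):
    print(motif, "->", predict_ligand(motif))
```

Returning 'undetermined' for non-canonical motifs keeps the sketch from overstating what the sequence alone can support.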
In summary, this study built a de novo transcriptome assembly of M. separata larvae, from which 6 'S-type' and 29 'IML-type' CTLs were identified. Sequence features, phylogenetic relationships, ligand specificity, and expression profiles were studied. Further studies are required to explore the function of each CTL in the oriental armyworm.