Deciphering Novel Antimicrobial Peptides from the Transcriptome of Papilio xuthus

Simple Summary Insects live in a pathogenic microorganism rich environment. Thus, insects develop a stronger defense in terms of innate immunity. Antimicrobial peptides (AMPs) are one of the key tools in the insect’s innate immune system, which kills the invading pathogens. In this study, we used Papilio xuthus, the Asian swallowtail butterfly, to identify the AMPs from their genomic product. We utilized next generation sequencing technology and in silico analysis tools to decipher the possible novel AMPs. The obtained novel AMPs were then tested for the antibacterial and antifungal activities. Seven novel peptides were identified as the antimicrobial agent, and these can be used as a lead candidate in the process of antibacterial therapy development against various infectious diseases. Abstract An insect’s innate immune system is the front line of defense against many invading microorganisms. One of the important components of this defense system is antimicrobial peptides (AMPs). Papiliocin is a well-studied antimicrobial peptide (AMP) isolated from the swallowtail butterfly, Papilio xuthus, and it was previously reported to be effective against Gram-positive bacteria, Gram-negative bacteria, and fungi, particularly in drug resistant Gram-negative bacteria. Hence, we aimed to identify novel AMPs from Papilio xuthus using its transcriptome. We immunized the swallowtail butterfly with Escherichia coli, Staphylococcus aureus, Candida albicans, and the total RNA was isolated. De novo transcriptome assembly and functional annotations were conducted, and AMPs were predicted using an in-silico pipeline. The obtained 344,804,442 raw reads were then pre-processed to retrieve 312,509,806 (90.6%) total clean reads. A total of 38,272 unigenes were assembled with the average length of 1010 bp. Differential gene expression analysis identified 584 and 1409 upregulated and downregulated genes, respectively. The physicochemical, aggregation, and allergen propensity were used as filtration criteria. A total of 248 peptides were predicted using our in-house pipeline and the known AMPs were removed, resulting in 193 novel peptides. Finally, seven peptides were tested in vitro and three peptides (Px 5, 6, and 7) showed stronger antimicrobial activity against Gram-negative bacteria and yeast. All the tested peptides were non-allergens. The identified novel AMPs may serve as potential candidates for future antimicrobial studies.


Introduction
The innate immune system is the front line of defense against an invading microorganism in all living organisms, especially insects. One of the important components of this defense system is antimicrobial peptides (AMPs). AMPs are effective against Gram-positive bacteria, Gram-negative bacteria, fungi, viruses, and parasites. The first insect AMP (cecropin) was isolated and studied in 1980s and since then, 318 insect AMPs have been cataloged to date in the Antimicrobial Peptide Database [1]. Based on the sequence and structure, insect AMPs are broadly classified into three groups: peptides with α-helices (e.g., Cecropins), β-sheet peptides with conserved cysteine residues (e.g., Defensins), and linear proline or glycine rich peptides (e.g., Drosocin, Attacins) [2][3][4][5]. Although insect AMPs are diversified in sequence and structural features, they share some common functionality such as a net positive charge and inclusion of approximately 50% hydrophobic residues. These properties contribute majorly to their bactericidal and antifungal activity. These features can be a key factor in the screening and development of AMP-based antimicrobial therapeutics [6].
Papilio xuthus (P. xuthus), commonly known as the swallowtail butterfly, is found in the far east Asian regions of Korea, Japan, China, Taiwan, and Myanmar. Papiliocin is the 37 residue α-helical cecropin-like AMP isolated from an immune challenged larvae of swallowtail butterfly [7]. Papiliocin has exhibited the bactericidal activity against both Gram-positive and Gram-negative bacteria, but the effect is more pronounced in Gram-negative strains. Additionally, the antibacterial and antifungal activity of papiliocin is observed in multidrug-resistant bacterial and fungal strains, respectively. Also, it is reported to possess an anti-inflammatory activity and non-toxicity toward mammalian cells [8][9][10]. These reported properties of papiliocin renders it as a potential antimicrobial from swallowtail butterfly that can be used as an alternative or a supportive for classical antibiotics. AMPs may serve as an effective antimicrobial against antimicrobial resistance, a growing concern in clinical settings.
In this study we have aimed to discover novel AMPs from the swallowtail butterfly. In this context, we utilized the next generation sequencing (NGS) technology in deciphering the transcriptome of immunized P. xuthus. We used our AMP prediction pipeline, a step-by-step screening of physiochemical activity, which also included allergen testing. After the initial screening, the peptides were blasted with protein databases and the known AMPs were removed. Finally, the screened peptides were evaluated for the antimicrobial activity and we found seven novel AMPs from P. xuthus.

Microorganisms and Growth Conditions
The bacterial strains Escherichia coli (KACC 13821, ATCC 11775) and Staphylococcus aureus (KACC 10768, ATCC 25923), and the yeast strain Candida albicans (KCTC 7121, ATCC 14053) were purchased from the Korean Agricultural Culture Collection (KACC) and Korean Collection for Type Cultures (KCTC). Both the bacteria and yeast were grown overnight in tryptic soy broth (TSB; Difco, Thermo Fisher Scientific, Waltham, MA, USA) at 37 • C, 200 rpm to the stationary phase. Bacteria were re-inoculated in fresh TSB medium and cultivated for 3 h to the log phase. The strains were stored with 15% glycerol at −70 • C.

Insect and Immunization
Papilio xuthus were reared at room temperature with 60% ± 5% relative humidity and a 16/8 light/dark photoperiod cycle. For immunization, each last instar larva was injected with the mixture of Escherichia coli, Staphylococcus aureus, and Candida albicans (5 × 10 3 colony-forming units) suspended in 5 µL sodium phosphate buffer (10 mM; pH 7.4). For the non-immunized control, each larva of swallowtail butterfly was injected with the 5 µL sodium phosphate buffer alone (10 mM; pH 7.4). After maintaining both the immunized and non-immunized larvae at 25 • C ± 1 • C for 18 h, total RNA was isolated.

Transcriptome Sequencing, De Novo Assembly, and Functional Annotations
Briefly, total RNA from three non-immunized and three immunized larvae were isolated for whole transcriptome sequencing. For the total RNA isolation, the larvae were washed with PBS and sterilized with 70% ethanol and then anesthetized on ice. RNA was isolated from the whole body using TRIZOL reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions. The total RNA quality and quantity of each sample were analyzed by a Bioanalyzer 2100 system using an RNA 6000 Nano Kit (Agilent technologies, Inc., Santa Clara, CA, USA). Approximately 2 µg of total RNA ((RNA Integrity Number (RIN) >7.0) of each sample was used to construct transcriptome cDNA libraries using a Truseq Stranded mRNA Prep Kit (Illumina Technologies, San Diego, CA, USA), according to the manufacturer's instructions. Further, cDNA libraries were sequenced using an Illumina HiSeq 2500 platform to generate 100 bp paired end reads. An initial raw data quality was checked by the FastQC (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Then, Trimmomatic [11] was used to remove low quality reads and adaptor contents using default parameters. Further, the Trinity program [12] was used for the de novo assembly of high-quality reads with default parameters. The redundancy transcripts were identified through Cluster Database at High Identity with Tolerance CD-HIT-EST [13] with an identify threshold of 95% sequence similarity. In addition, TransDecoder [12] was used to identify the longest ORF per transcript and protein fragment for the peptide prediction. Furthermore, functional annotations of the filtered unigenes were identified through a homology search against the Swiss-Prot databases by BLASTX with an E-value cut-off of 10 −5 . Meanwhile, domains were identified by using InterProScan [14]. The assembled unigenes were subjected to a GO enrichment analysis by the GOseq Bioconductor package [15] based on the Wallenius non-central hypergeometric distribution with the corrected p-value less than 0.05.

Expression Analysis
To identify the differentially expressed genes (DEG), initially cleaned raw reads were mapped to the assembled unigenes with Bowtie [16] using default parameters. The gene transcript abundance was calculated as transcripts per kilobase million (TPM) [17] using the RNA-Seq by Expectation-Maximization (RSEM) package [18]. The generated read counts of all genes were given as input for the identification of differential expressed genes (DEG) using EdgeR [19]. The cut off was set as p-value < 0.05 and log2 Fold Change (log2FC) ≥ 1 for the downstream analysis.

Antimicrobial Peptide (AMP) Prediction and Classification
A modified in silico strategy was employed to predict AMPs from the deduced amino acid sequences. Briefly, peptides were screened for physiochemical propensity, followed by mapping and prediction in the peptide database and CAMP classifier, respectively. The in silico pipelines were run with parameters defined previously [20] to predict the physiochemical property. The AMPs were then mapped with the CAMP database [21] and classified as novel or known AMPs. Two programs, PatMatch (no mismatch) for sequences ≤20 residues in length [22] and BLASTP (1E-05) for sequences ≥20 residues in length were used. The BLAST results were filtered with a similarity score ≥90. Sequences with the observed similarity at the given cutoff values were considered as known AMPs, and the others were considered as novel AMPs. Finally, the novel AMPs were manually validated for continuous stretches of amino acids to account for the low-complexity regions and assembly artifacts.

Peptide Synthesis
All the novel peptides selected based on the various prediction tools above-mentioned [23] were synthesized using solid-phase peptide synthesis methods at AnyGen Co. Ltd. (Gwangju, Korea). Then, each peptide was purified to >95% by high-performance liquid chromatography, and the purity was confirmed by mass spectrometry analysis. The peptides were dissolved in acidified distilled water (0.01% acetic acid) and stored at −20 • C until used in subsequent experiments.

Antimicrobial Activity Assay
The radial diffusion assay was performed to test the antimicrobial activities of peptides, as described previously with slight modifications [24]. In brief, bacteria and yeast strains were grown to the mid-logarithmic phase in TSB at 37 • C and then washed twice with 10 mM sodium phosphate buffer (pH 7.4). A total of 4 × 10 6 CFU was added to 10 mL of an underlay agarose gel [9 mM sodium phosphate, 1 mM sodium citrate, pH 7.4, 1% (w/v) Type I (low electroendosmosis) agarose (Sigma, St. Louis, MO, USA), and 3% TSB (Difco, USA)]. The underlay gel was poured into a 100-mm INTEGRIDTM Petri dish. After the agarose solidification, 3-mm-diameter wells were punched and 5 µL of each peptide solution was added to each well. Buffer alone was used as a negative control. Plates were incubated at 37 • C for 3 h to allow for diffusion of the peptides. The underlay gel was then covered with the 10 mL of nutrient-rich agar overlay (6% TSB and 1% agarose). The antimicrobial activity of a peptide was measured as the diameter of the cleared zone around each well after 12 h of incubation at 37 • C. This experiment was repeated at least three times independently.

Hemolytic Assay
This experiment was approved by the Institutional Animal Care and Use Committee (IACUC) of the National Institute of Agricultural Sciences (approval number: NAS-202014). The hemolytic activity of the peptides was determined by monitoring the release of hemoglobin from mouse erythrocytes at 540 nm. For the hemolytic assay, 20 µL of each peptide solution at a predetermined concentration was added to 180 µL of a 2.5% (v/v) suspension of mouse erythrocytes in phosphate-buffered saline (PBS). Melittin (Sigma, USA), a hemolytic and α-helical peptide isolated from bee venom, was used as the positive control. This mixture was incubated for 30 min at 37 • C, and 600 µL of PBS was then added to each tube. After 3 min of centrifugation at 10,000× g, the supernatant was removed, and the absorbance was measured at 540 nm. Evaluations were made from the results of at least three independent experiments, each carried out in triplicate.

Data Availability
All the raw data utilized in this study were deposited at the National Center for Biotechnology (NCBI) Sequence Read Archive (SRA) under the accession number SRA1096681.

Illumina Sequencing and De Novo Transcriptome Assembly
Three biological replicates for each group, a total of six samples (non-immunized vs. immunized) were sequenced and the total number of raw reads obtained in each group is summarized in Table S1. Overall workflow of P. xuthus de novo transcriptome assembly and the AMP prediction are illustrated in Figure 1. First, adapter and low-quality sequences were removed, and the resultant total clean reads were 90.6% (312,509,806) and the individual clean reads ranged between 89.1% and 93.48%. (Table S1). An Illumina quality control score of Q30 (incorrect base call accuracy was 1 in 1000) was observed to be high in all sequenced samples, which was indicative of good quality data (Table S1). The obtained clean reads were then assembled de novo using "Trinity", a software package for transcriptome assembly. A total of 38,272 unigenes were assembled with the average length of 1010 bp. Unigenes with the length varying from 200 to >2000 bp were obtained and the unigene with >200 bp length was considered for further processing.

Functional Annotations and Differential Gene Expression Analysis
The databases Swiss-Prot, GO, and InterProscan were used to annotate the assembled data. A total of 19,900 (52%) unigenes were annotated by the BLASTX search against the Swiss-Prot and about 13,842 (36%) unigenes were annotated using InterProscan (Table S1). The gene function prediction by GO analysis resulted in 13,945 annotated genes distributed in three major categories, biological process, cellular component, and molecular function (Figure 2a). To induce the immune system, we immunized P. xuthus with the mixture of Gram-negative, Gram-positive bacteria, and fungus. DEG analysis was carried out to find the differential expression pattern of the immunized and nonimmunized P. xuthus. We obtained a total of 1993 differentially expressed transcripts based on the cutoff log2FC ≥ 1 and p-value < 0.05. Among these, 584 transcripts were upregulated, and 1409 transcripts were downregulated in the immunized group when compared with the control. The DEGs were subjected to GO enrichment analysis and the top five subcategories were metabolic process, catalytic activity, small molecule metabolic process, oxidoreductase activity, and organic substance metabolic process (Figure 2b).

Functional Annotations and Differential Gene Expression Analysis
The databases Swiss-Prot, GO, and InterProscan were used to annotate the assembled data. A total of 19,900 (52%) unigenes were annotated by the BLASTX search against the Swiss-Prot and about 13,842 (36%) unigenes were annotated using InterProscan (Table S1). The gene function prediction by GO analysis resulted in 13,945 annotated genes distributed in three major categories, biological process, cellular component, and molecular function (Figure 2a). To induce the immune system, we immunized P. xuthus with the mixture of Gram-negative, Gram-positive bacteria, and fungus. DEG analysis was carried out to find the differential expression pattern of the immunized and non-immunized P. xuthus. We obtained a total of 1993 differentially expressed transcripts based on the cutoff log2FC ≥ 1 and p-value < 0.05. Among these, 584 transcripts were upregulated, and 1409 transcripts were downregulated in the immunized group when compared with the control. The DEGs were subjected to GO enrichment analysis and the top five subcategories were metabolic process, catalytic activity, small molecule metabolic process, oxidoreductase activity, and organic substance metabolic process (Figure 2b).

AMP Prediction
AMP prediction from the insect transcriptome using the computational approach is our long-standing idea from which we have continuously thrived in testing [20,25,26]. The natural environment of insect livelihood is rich in harmful bacteria, fungus, parasites, and viruses, thus there is a need for an efficient defense system [3]. Several AMPs identified from insects have been reported to possess antibacterial, antifungal, and antiviral activities that can be utilized for the therapeutic design of antimicrobials. Papiliocin is an AMP with antimicrobial and anti-inflammatory activities and was previously isolated from P. xuthus [10]. To retrieve a biological insight of the other promising AMPs from swallowtail butterfly, we immunized and screened its transcriptome using a state of art in silico pipeline (Figure 1). Continuing expansion of NGS technology and related data analysis was advantageously utilized in understanding the immune related genes or peptides in the insects [27,28]. P. xuthus AMP prediction involves several physiochemical measures such as residue length, charge, aggregation propensity, and allergen prediction. AMPs' modes of action primarily involve membrane destabilization for which the positively charged AMP is critical for binding with the negatively charged membrane components. Thus, the bactericidal or antifungal activities of AMPs depend on its physiochemical properties such as length, isoelectric point (pI), and charge. In the case of aggregation property, the AMPs are desired not to possess in vitro aggregation because it decreases its activity and availability. However, in vivo aggregation on bacterial cell surfaces helps in bacterial clearance from the host. Additionally, AMP containing the protease cleavage site was removed to avoid degradation in the human body. The allergen prediction parameter was also included in this study to avoid the extended usage of AMP in in-vitro and clinical settings. All the predicted AMPs were non-allergen. Collectively, all the peptides were passed through the step-by-step physiochemical filtration schema. The detailed cutoff values are mentioned in the Materials and Methods Section 2.5 and Table S2. The peptides that pass all the physiochemical propensity were then mapped to the CAMP database to identify novel AMPs. In total, 248 AMPs passed through all filters were predicted to be novel putative AMP.

Experimental Validation of Putative and Novel AMPs
After the in silico prediction, 193 novel AMPs with potentially high activity were ultimately selected. Among them, 14 peptides were synthesized according to the results of the secondary structure prediction [29]. We chose the α-helix regions based on secondary structure due to the efficiency and cost of peptide synthesis for the development of antimicrobial agents. After an initial screening of the synthesized peptide, we finally selected seven AMPs (Tables 1 and 2) and tested their antimicrobial activities against E. coli, S. aureus, and C. albicans using a radial diffusion assay (Figure 3). We found antimicrobial activity in seven synthetic peptides (Px 3, 4, 5, 6, 7, 11, 13), which increased in a dose-dependent manner. Melittin was used as a positive control. Remarkably, the peptides Px 4, 6, 7 showed stronger antimicrobial activity than melittin and Px 5, 13 showed similar antimicrobial activity with melittin in E. coli. The antimicrobial activities of Px 3 showed similar antimicrobial activity with melittin in S. aureus. Correspondingly, Px 3, 6, 7 showed stronger antimicrobial activity than melittin and Px 4, 5 showed similar antimicrobial activity with melittin against C. albicans. Altogether, the results showed that Px 5, 6, and 7 had a broad range of activities toward Gram-negative bacteria and yeast.   The seven selected synthetic peptides showing antimicrobial activity in the radial diffusion assay were studied for hemolysis effect (Figure 4). Melittin lysed 92% of mouse red blood cells at a concentration of 12.5 μg/mL, whereas no hemolytic activity was observed for the seven synthetic peptides at this concentration and up to 100 μg/mL. Melittin has strong and broad antimicrobial spectrum, but the peptide lacks selectivity toward microbes. The purpose of this experimental study was to find novel peptides that have potent antimicrobial activities with little or no cytotoxicity. Thus, these data indicate that the selected peptides are useful for the development of novel antimicrobial agents.  The seven selected synthetic peptides showing antimicrobial activity in the radial diffusion assay were studied for hemolysis effect (Figure 4). Melittin lysed 92% of mouse red blood cells at a concentration of 12.5 µg/mL, whereas no hemolytic activity was observed for the seven synthetic peptides at this concentration and up to 100 µg/mL. Melittin has strong and broad antimicrobial spectrum, but the peptide lacks selectivity toward microbes. The purpose of this experimental study was to find novel peptides that have potent antimicrobial activities with little or no cytotoxicity. Thus, these data indicate that the selected peptides are useful for the development of novel antimicrobial agents.
Insects 2020, 11, x 8 of 10  The seven selected synthetic peptides showing antimicrobial activity in the radial diffusion assay were studied for hemolysis effect (Figure 4). Melittin lysed 92% of mouse red blood cells at a concentration of 12.5 μg/mL, whereas no hemolytic activity was observed for the seven synthetic peptides at this concentration and up to 100 μg/mL. Melittin has strong and broad antimicrobial spectrum, but the peptide lacks selectivity toward microbes. The purpose of this experimental study was to find novel peptides that have potent antimicrobial activities with little or no cytotoxicity. Thus, these data indicate that the selected peptides are useful for the development of novel antimicrobial agents.

Conclusions
In this study, we aimed at revealing novel AMPs from the immunized P. xuthus transcriptome. We obtained seven peptides from in silico screening, which were also tested in vitro for the antimicrobial activity and toxicity toward normal cells. Our work will be a resource for both the transcriptome data and AMP discovery in swallowtail butterfly.