Transcriptome Analysis of Larval Segment Formation and Secondary Loss in the Echiuran Worm Urechis unicinctus

The larval segment formation and secondary loss in echiurans is a special phenomenon, which is considered to be one of the important characteristics in the evolutionary relationship between the Echiura and Annelida. To better understand the molecular mechanism of this phenomenon, we revealed the larval transcriptome profile of the echiuran worm Urechis unicinctus using RNA-Seq technology. Twelve cDNA libraries of U. unicinctus larvae, late-trochophore (LT), early-segmentation larva (ES), segmentation larva (SL), and worm-shaped larva (WL) were constructed. Totally 243,381 unigenes were assembled with an average length of 1125 bp and N50 of 1836 bp, and 149,488 unigenes (61.42%) were annotated. We obtained 70,517 differentially expressed genes (DEGs) by pairwise comparison of the larval transcriptome data at different developmental stages and clustered them into 20 gene expression profiles using STEM software. Based on the typical profiles during the larval segment formation and secondary loss, eight signaling pathways were enriched, and five of which, mTOR, PI3K-AKT, TGF-β, MAPK, and Dorso-ventral axis formation signaling pathway, were proposed for the first time to be involved in the segment formation. Furthermore, we identified 119 unigenes related to the segment formation of annelids, arthropods, and chordates, in which 101 genes were identified in Drosophila and annelids. The function of most segment polarity gene homologs (hedgehog, wingless, engrailed, etc.) was conserved in echiurans, annelids, and arthropods based on their expression profiles, while the gap and pair-rule gene homologs were not. Finally, we verified that strong positive signals of Hedgehog were indeed located on the boundary of larval segments using immunofluorescence. Data in this study provide molecular evidence for the understanding of larval segment development in echiurans and may serve as a blueprint for segmented ancestors in future research.


Introduction
Echiurans are a group of marine benthic invertebrates including approximately 230 species, which all inhabit marine environments from the intertidal zone to thousands of meters in the deep sea [1][2][3], such as Bonellia viridis in coastal sediment and Hydroides elegans in the deep sea. Academically, it has been a controversial issue whether the echiurans belong to Annelida or a separate phylum Echiura [4]. One view is that echiurans are a subtaxon of annelid based on their corresponding morphological information, such as the ultrastructure of cuticle and cilium [5], and ladder-like nervous system [6], as well as recent molecular phylogenetic analyses [1,[7][8][9]. However, other researchers considered that The gastrula hatches and develops into a swimming larva, the trochophore, in which a prototroch separates the larva into the episphere (upper hemisphere) and the hyposphere (lower hemisphere) to form the head and the trunk, respectively. The segments occur firstly in the early-segmentation larva and then maintain through the late-segmentation larva. The secondary loss of the segments appears in the worm-shaped larva whose inhabit mode changes from free-living into burrowing in the sediment. A body plan without segments is kept in adult U. unicinctus.

Illumina Sequencing, de novo Assembly and Functional Annotation
To obtain an overview of the U. unicinctus larval transcriptome, twelve cDNA libraries were constructed from the larvae of four developmental stages, late-trochophore (LT), early-segmentation larva (ES), segmentation larva (SL), and worm-shaped larva (WL). A total of 609.53 Mb raw reads were generated and 591.58 Mb clean reads were obtained (Table 1). These clean reads were then de novo assembled by Trinity software and generated 243,381 unigenes (gene and unigene used hereinafter all represent unigene) with an average length of 1,125 bp and N50 of 1,836 bp ( Table 2). The PCA analysis showed that three biological replicates of each developmental stage segregated together, respectively, and LT and ES samples tended to cluster as well ( Figure S1), which indicated the reliability of the data. All RNA-Seq data had been submitted to the NCBI Sequence Read Archive under the accession number of SRP156975. The gastrula hatches and develops into a swimming larva, the trochophore, in which a prototroch separates the larva into the episphere (upper hemisphere) and the hyposphere (lower hemisphere) to form the head and the trunk, respectively. The segments occur firstly in the early-segmentation larva and then maintain through the late-segmentation larva. The secondary loss of the segments appears in the worm-shaped larva whose inhabit mode changes from free-living into burrowing in the sediment. A body plan without segments is kept in adult U. unicinctus.

Illumina Sequencing, De Novo Assembly and Functional Annotation
To obtain an overview of the U. unicinctus larval transcriptome, twelve cDNA libraries were constructed from the larvae of four developmental stages, late-trochophore (LT), early-segmentation larva (ES), segmentation larva (SL), and worm-shaped larva (WL). A total of 609.53 Mb raw reads were generated and 591.58 Mb clean reads were obtained (Table 1). These clean reads were then de novo assembled by Trinity software and generated 243,381 unigenes (gene and unigene used hereinafter all represent unigene) with an average length of 1125 bp and N50 of 1836 bp ( Table 2). The PCA analysis showed that three biological replicates of each developmental stage segregated together, respectively, and LT and ES samples tended to cluster as well ( Figure S1), which indicated the reliability of the data. All RNA-Seq data had been submitted to the NCBI Sequence Read Archive under the accession number of SRP156975.  All the unigenes were aligned against seven public databases with a cutoff E-value < 1 × 10 −5 , including the NCBI non-redundant protein/nucleotide database (NR and NT), the Kyoto Encyclopedia of Genes and Genomes Ortholog (KO), the Swiss-Prot protein database (SwissPort), the Protein family (Pfam), the Gene Ontology (GO), and the euKaryotic Ortholog Groups (KOG) databases. Totally 149,488 unigenes (61.42% of all the 243,381 unigenes) were annotated, and 7,704 (3.16%) unigenes were simultaneously annotated by the seven databases (Table 3). The distribution of the best blast hits over species showed that the top five matched species of the unigene numbers were Capitella teleta (34.4%), Oxytricha trifallax (6.1%), Stylonychia lemnae (5.5%), Crassostrea gigas (5.4%), and Vitrella brassicaformis (5.4%) ( Figure 2). GO analysis showed that a total of 111,479 unigenes (45.8% of the total annotated sequences) were assigned at least one GO term. These unigenes were sorted into 56 level-2 GO terms under three main GO categories: biological processes, molecular functions, and cellular components ( Figure S2A). Based on the assigned GO terms, unigenes were found to be highly enriched in "cellular process" GO analysis showed that a total of 111,479 unigenes (45.8% of the total annotated sequences) were assigned at least one GO term. These unigenes were sorted into 56 level-2 GO terms under three main GO categories: biological processes, molecular functions, and cellular components ( Figure S2A). Based on the assigned GO terms, unigenes were found to be highly enriched in "cellular process" (62,603, 56.2%), "metabolic process" (54,080, 48.5%), and "single-organism process" (50,858,45.6%) under the biological process category; "binding" (60,844, 54.6%), "catalytic activity" (46,129, 41.4%), and "transporter activity" (9605, 8.6%) under the molecular function. In the cellular component category, the top three GO terms were "cell" (32,853,29.5%), "cell part" (32,844,29.4%) and "organelle" (22,612,20.3%).

Identification, Classification, and Validation of the Differentially Expressed Genes (DEGs)
Pairwise comparisons (LT vs. ES, ES vs. SL, SL vs. WL, LT vs. SL, LT vs. WL, and ES vs. WL) were performed to screen the DEGs by using DEGseq with the criteria of FDR ≤ 0.05 and |log2(foldchange)| ≥ 1. A total of 70,517 DEGs were obtained (Figure 3), of which the most were assigned to pairwise comparisons of SL vs. WL, which may be related to U. unicinctus larvae experiencing a series of transformations in living habit (from swimming larvae to benthic juveniles) and especially in morphology (disappearance of the body segments and peritroch, shrink of the upper hemisphere, etc.). In contrast, few DEGs were assigned to pairwise comparisons of ES vs. SL. During this period, there are few changes in U. unicinctus morphology, except that the body segments are further matured, which is consistent with the actual developmental process.  To get dynamic expression patterns of DEGs during the development process, the STEM software program (Short Time-series Expression Miner, designed for clustering, comparing, and visualizing gene expression data [30]) was utilized to classify all the DEGs according to their abundance changes. Twenty profiles were produced according to their FPKM values, of which eight significant expression profiles (profile 0, profile 6, profile 9, profile 10, profile 11, profile 16, profile 18, and profile 19) were identified, and marked with different background colors ( Figure 4). Profile 16 and profile 9 were the two represented patterns as most segment formation-related genes in U. unicinctus were assigned to these two profiles. To get dynamic expression patterns of DEGs during the development process, the STEM software program (Short Time-series Expression Miner, designed for clustering, comparing, and visualizing gene expression data [30]) was utilized to classify all the DEGs according to their abundance changes. Twenty profiles were produced according to their FPKM values, of which eight significant expression profiles (profile 0, profile 6, profile 9, profile 10, profile 11, profile 16, profile 18, and profile 19) were identified, and marked with different background colors ( Figure 4). Profile 16 and profile 9 were the two represented patterns as most segment formation-related genes in U. unicinctus were assigned to these two profiles.
To validate the RNA-Seq results, eight genes (wnt11, smo, nk1, post2, abca3, nlk, rp-sae, and odd) which represented different gene expression patterns were selected for further confirmation with qRT-PCR. The results showed that the vast majority of the chosen unigenes displayed concordant expression tendencies as in RNA-Seq ( Figure 5), confirming that these RNA-Seq data were reliable to quantify gene expression accurately.
To get dynamic expression patterns of DEGs during the development process, the STEM software program (Short Time-series Expression Miner, designed for clustering, comparing, and visualizing gene expression data [30]) was utilized to classify all the DEGs according to their abundance changes. Twenty profiles were produced according to their FPKM values, of which eight significant expression profiles (profile 0, profile 6, profile 9, profile 10, profile 11, profile 16, profile 18, and profile 19) were identified, and marked with different background colors ( Figure 4). Profile 16 and profile 9 were the two represented patterns as most segment formation-related genes in U. unicinctus were assigned to these two profiles.
To validate the RNA-Seq results, eight genes (wnt11, smo, nk1, post2, abca3, nlk, rp-sae, and odd) which represented different gene expression patterns were selected for further confirmation with qRT-PCR. The results showed that the vast majority of the chosen unigenes displayed concordant expression tendencies as in RNA-Seq ( Figure 5), confirming that these RNA-Seq data were reliable to quantify gene expression accurately.   Different characters indicate the significant difference (p < 0.05), with the uppercase letter for qRT-PCR and the lowercase for RNA-Seq. smo belongs to profile11, wnt11 and post2 belong to profile18, nk1 belongs to profile16, abca3 belongs to profile10, nlk and odd belong to profile9 and rp-sae belongs to profile2, respectively.

Expression Profiles of Genes Related to Segment Formation
In the larval transcriptomes of U. unicinctus, a total of 119 unigenes related to the segment formation of annelids, arthropods, and chordates were identified. Out of these genes, 101 genes Figure 5. Validation of the RNA-Seq data using qRT-PCR. The blue columns represent the qRT-PCR results; the red lines show the FPKM value. Different characters indicate the significant difference (p < 0.05), with the uppercase letter for qRT-PCR and the lowercase for RNA-Seq. smo belongs to profile11, wnt11 and post2 belong to profile18, nk1 belongs to profile16, abca3 belongs to profile10, nlk and odd belong to profile9 and rp-sae belongs to profile2, respectively.

Expression Profiles of Genes Related to Segment Formation
In the larval transcriptomes of U. unicinctus, a total of 119 unigenes related to the segment formation of annelids, arthropods, and chordates were identified. Out of these genes, 101 genes (Table S1) were known to be involved in the segment formation in Drosophila and annelids, including 7 gap genes (krüppel, hunchback, giant, tailless, etc.), 6 pair-rule genes (hairy, even-skipped, runt, etc.), 44 segment polarity like genes (hedgehog, wingless, engrailed, etc.) and 44 homeotic genes (Hox, Para-hox, and NK genes). We performed hierarchical clustering of the 101 segment formation-related gene homologs to examine the similarity and diversity of expression profiles ( Figure 6). The results showed that all the 7 gap genes and most of the pair-rule genes (5 of 6, except odd) were expressed stably (p > 0.05) during the progress of segment formation (from LT to ES). However, 13 of 44 segment polarity genes were up-regulated from LT to ES, while down-regulated from SL to WL. One of 10 Hox genes (lox5), 1 of 3 ParaHox genes (pdx), and 8 of 22 NK genes (NK1, NK4, Lbx, Msx, Tlx, NK5, Dbx, and Vax) were down-regulated from SL to WL.  (Table S1) were known to be involved in the segment formation in Drosophila and annelids, including 7 gap genes (krüppel, hunchback, giant, tailless, etc.), 6 pair-rule genes (hairy, even-skipped, runt, etc.), 44 segment polarity like genes (hedgehog, wingless, engrailed, etc.) and 44 homeotic genes (Hox, Para-hox, and NK genes). We performed hierarchical clustering of the 101 segment formation-related gene homologs to examine the similarity and diversity of expression profiles ( Figure 6). The results showed that all the 7 gap genes and most of the pair-rule genes (5 of 6, except odd) were expressed stably (p > 0.05) during the progress of segment formation (from LT to ES). However, 13 of 44 segment polarity genes were up-regulated from LT to ES, while down-regulated from SL to WL. One of 10 Hox genes (lox5), 1 of 3 ParaHox genes (pdx), and 8 of 22 NK genes (NK1, NK4, Lbx, Msx, Tlx, NK5, Dbx, and Vax) were down-regulated from SL to WL.

Expression Characteristics of Hedgehog (HH) in U. unicinctus Larvae
The U. unicinctus larva transcriptional data showed that hedgehog was stable and highly expressed in ES and SL, but the expression level was significantly decreased in WL. To explore the possible role of HH in processes of U. unicinctus segment formation and secondarily loss, we detected the expression pattern of HH using immunohistochemistry technology. The results showed that HH positive signals were widely distributed throughout the body (Figure 7). However, the strong HH signals appeared in the boundary of larval segments in the ES and SL ( Figure 7B1, C1), and then the

Expression Characteristics of Hedgehog (HH) in U. unicinctus Larvae
The U. unicinctus larva transcriptional data showed that hedgehog was stable and highly expressed in ES and SL, but the expression level was significantly decreased in WL. To explore the possible role of HH in processes of U. unicinctus segment formation and secondarily loss, we detected the expression pattern of HH using immunohistochemistry technology. The results showed that HH positive signals were widely distributed throughout the body (Figure 7). However, the strong HH signals appeared in the boundary of larval segments in the ES and SL ( Figure 7B1,C1), and then the HH signals weakened and lost its original expression pattern in the WL, an unsegmented larva ( Figure 7D). Negative controls were shown in Figure S3. HH signals weakened and lost its original expression pattern in the WL, an unsegmented larva ( Figure 7D). Negative controls were shown in Figure S3.

Function of Gap and Pair-rule Genes in U. unicinctus was not Consistent with Drosophila during Segmentation
According to the morphological feature in U. unicinctus larvae, the visible larval segments occur initially in the developmental process from LT to ES, and are maintained from ES to SL, and then the segments lose secondarily in WL [31]. As seen in hierarchical clustering ( Figure 6A) that most of the gap and pair-rule genes (12 of 13) expressed stably (p > 0.05) in U. unicinctus larvae during the segment formation from LT to ES, except odd, implying that the regulation mode of genes involved in the larval segment formation of U. unicinctus may not be completely consistent with Drosophila. This result is consistent with the pair-rule orthologs examined in the Annelid Capitella teleta that none of the pair-rule genes exhibit segmental or pair-rule stripes of expression in the ectoderm or

Function of Gap and Pair-Rule Genes in U. unicinctus Was Not Consistent with Drosophila during Segmentation
According to the morphological feature in U. unicinctus larvae, the visible larval segments occur initially in the developmental process from LT to ES, and are maintained from ES to SL, and then the segments lose secondarily in WL [31]. As seen in hierarchical clustering ( Figure 6A) that most of the gap and pair-rule genes (12 of 13) expressed stably (p > 0.05) in U. unicinctus larvae during the segment formation from LT to ES, except odd, implying that the regulation mode of genes involved in the larval segment formation of U. unicinctus may not be completely consistent with Drosophila. This result is consistent with the pair-rule orthologs examined in the Annelid Capitella teleta that none of the pair-rule genes exhibit segmental or pair-rule stripes of expression in the ectoderm or mesoderm [23]. By contrast, 13 of 44 segment polarity genes (en, hh, wg, gli, β-catenin, dpp, dsh, ck1, apc, lrp5/6, shrp4, frizzled1/2/7, and wnt11, Figure 6B) were up-regulated from LT to ES, and down-regulated from SL to WL, suggesting that these genes may be involved in function maintenance of larval segments in U. unicinctus, which was similar to the model of Drosophila and Annelid Platynereis dumerilii segments maintenance [13,19,32]. Meanwhile, 8 NK genes (NK1, NK4, Lbx, Msx, Tlx, NK5, Dbx, and Vax) of 44 homeotic genes showed significant difference (p < 0.05) from SL to WL, but only 1 Hox gene (lox5) and 1 ParaHox gene (pdx) showed significant difference (p < 0.05) ( Figure 6C), indicating that these eight NK genes may also be involved in function maintenance of larval segments in U. unicinctus, like segment polarity genes. To date, research has shown that NK1, NK4, Tlx, Msx, and Lbx are expressed in a segment polarity-like pattern early in the development of the Onychophoran Euperipatoides rowelli [33] and the Annelid P. dumerilii [25]. Those results imply that NK genes may be involved in the formation of the body segment in both animals, although there are no functional studies on them in onychophorans and annelids currently. The above results proved that the segment polarity mechanism in Drosophila is conserved in echiurans, annelids, and arthropods, but the gap genes and pair-rule genes do not support a role in U. unicinctus segmentation. Furthermore, NK genes may be involved in the formation and maintenance of larval segments in U. unicinctus.

Key Genes and Pathways Involved in U. unicinctus Larval Segment Formation and Secondary Loss
As consistent with the trend of larval segment formation and secondary loss in the echiuran worm U. unicinctus, the expression levels of DEGs of profile 16 ( Figure 4) were firstly increased and reached to top in ES and SL, and then decreased. Most of the segment formation-related genes can be found in this profile, such as en, hh and wg, the three major segment polarity genes. Moreover, dsh and smo were also enriched in this profile. The dsh gene encodes a protein that is an essential component of Wnt/β-catenin signal and it plays a key role in segment polarity in the early embryo of Drosophila [34,35]. The smo gene is also a segment polarity gene required for correct patterning of every segment in Drosophila and the Annelid P. dumerilii, which encodes a seven-pass membrane protein, a receptor for the Hedgehog signal [36,37]. The expression levels of DEGs in the profile 9 were expressed stably during the stages of LT to SL, and then decreased at the stage of WL. Similarly, many segment formation-related genes were enriched in this profile like ptc and β-catenin. These two genes were regarded as segment polarity genes in Drosophila, belonging to the Hedgehog and Wnt/β-catenin pathways, respectively [38,39]. Therefore, it was considered that DEGs of profile 16 and profile 9 were closely involved in larval segment formation and secondary loss in U. unicinctus.
To find new pathways from profile 16 and profile 9, the KEGG pathway annotation and enrichment were conducted. Finally, 8 key pathways (Table 4) were identified, out of which three (Hedgehog, Wnt, and Notch) pathways were known to be involved in the segment (or somite) formation in annelids, arthropods, or chordates [22]. Hedgehog signaling plays a central role in the development of most metazoans, which is originally identified as a mutation that causes a "segment polarity" phenotype in Drosophila [40]. Recent research indicates that the function of the Hedgehog pathway in "segment polarity" is conserved in other insects [20] or noninsect arthropods [41,42] and probably also in most annelids [13,43]. Wnt signaling is a highly conserved cellular communication system that regulates a wide range of developmental processes, including axis elongation and segmentation [44]. Studies of wg-en regulatory system in arthropods and annelids indicated that their delineation of segmental boundaries is an ancestral feature [13,45]. Delta/Notch signaling controls a wide spectrum of developmental processes, including body and leg segmentation in arthropods and segmentation clocks in vertebrates [12,46]. However, the other five pathways obtained in our study have not been confirmed to be associated with the segment formation, and the expression tendencies of the genes in these five pathways are consistent with the segment polarity genes in U. unicinctus. The TGF-β pathway can regulate diverse processes as cell proliferation, differentiation, motility, adhesion, organization, and programmed cell death [47]. The PI3K/Akt and AKT/mTOR pathways are critical for cellular proliferation, growth, survival, and mobility [48]. Dorso-ventral axis formation, during Drosophila embryogenesis, is required for the establishment of dorsoventral cell fates, determination of segmental identity, maintenance of amnioserosa and ventral neuroectodermal cells, germ band retraction, and production of cuticle [49]. MAPKs are serine-threonine protein kinases that play an important role in the regulation of many cellular processes including cell growth and proliferation, differentiation, and apoptosis [50]. In the future, it is necessary to conduct functional studies on whether these pathways participate in the formation and maintenance of segments in U. unicinctus.

Hedgehog Is a Conservative Gene for the Larval Segment Boundary Definition in U. unicinctus
Researchers show that the segmental subdivision of the D. melanogaster embryo along its anteroposterior axis is regulated by the interaction of a cascade of factors. The para-segmental boundaries are defined and maintained by the establishment of hedgehog, engrailed, and wingless expression domains and their mutual interaction. Loss of Hedgehog activity leads to the breakdown of segment boundaries [51]. Functional studies confirmed a conserved role for Hedgehog in the maintenance and patterning of segments in the Arthropod Tribolium castaneum [20] and the Annelid Platynereis dumerilii [19]. However, knowledge of its functional role in echiurans is still fragmentary. Our results of HH expressional profiles (Figure 7) suggested that HH may participate in the defining of the larval segment boundary in U. unicinctus and its function is evolutionarily conserved with annelids and arthropods.

Ethics Statement
The collection and handling of the U. unicinctus were performed in accordance with the Institutional Animal Care and Use Committee of the Ocean University of China (OUC-IACUC) and the local government on 15 June 2016.

Animals and Sampling
Adult U. unicinctus was collected from a coastal intertidal flat in Yantai, China. Mature sperms and ova were obtained by dissecting the nephridia (gonoducts) of male and female worms. Artificial insemination was conducted by mixing the sperms and ova with a ratio of 10:1 in filtered seawater (FSW). The fertilized eggs were reared in FSW (17 • C, pH 7.7, and salinity 30), and the hatched larvae were fed with single-cell algae (Isochrysis galbana, Chlorella vulgaris, Chaetoceros muelleri). The larvae at different stages were sampled and fixed in 4% paraformaldehyde for 15 h, and then dehydrated with serial methanol (25%, 50%, 75%, and 100%) and stored in 100% methanol at -30 • C for immunofluorescent histochemistry analysis. The larvae from four developmental stages, late-trochophore (LT, 25 days after hatching), early-segmentation larva (ES, 30 days after hatching), segmentation larva (SL, 35 days after hatching) and worm-shaped larva (WL, 42 days after hatching) were collected, frozen with liquid nitrogen immediately and then stored at −80 • C, respectively for total RNA extraction. Three biological replicates from each developmental stage were prepared.

RNA Extraction, RNA-Seq Library Construction, and Sequencing
Total RNA was extracted from each stored larval sample using Trizol (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's instructions, and treated with RNase-free DNase I (TaKaRa, Dalian, China) for 30 min at 37 • C to remove residual DNA. The concentration, purity, and integrity of the RNAs were assessed using NanoDrop 2000 (Thermo Scientific, Wilmington, DE, USA), and agarose gel electrophoresis were performed to assess the quality of total RNAs. Five µg total RNA per sample was used for RNA-Seq library preparation. The mRNA was isolated using oligo-dT beads (Qiagen, Hilden, Germany) and then broken into short fragments by adding fragmentation buffer. The first-strand cDNA was generated using random hexamer-primer with the short fragments as templates. The second-strand cDNA was synthesized using dNTPs, RNase H, and DNA polymerase I. The cDNA fragments were purified using a QiaQuick PCR extraction kit (Qiagen, Hilden, Germany). These purified fragments were washed with EB buffer for end reparation and poly(A) addition and then ligated to sequencing adapters. After agarose gel electrophoresis, the suitable fragments were selected as templates for the PCR amplification to construct the cDNA library. Finally, these libraries (total 12 libraries from 12 RNA samples) were respectively sequenced using Illumina HiSeq X Ten system (Illumina, San Diego, CA, USA) with the paired-end sequencing of 150 bp by Novogene Company (Beijing, China).

De Novo Assembly and Functional Annotation
The raw reads were firstly filtered to obtain high-quality sequences (clean reads) by removing the reads containing adapter sequences, ambiguous bases ('N' < 5%), and the low-quality reads with the Phred quality score < 20. All clean reads from the twelve libraries were jointly assembled into unigenes using Trinity software [52]. Finally, unigenes were annotated with NR, NT, Swissprot, KOG, Pfam, GO, and KO databases. The sequence direction of the unigenes was determined according to the best alignment results. When the results were conflicted among databases, the direction was determined consecutively by NR, NT, Swissprot, KOG, Pfam, KOG, and KO. When a unigene was not aligned to any database, ESTScan (http://myhits.isb-sib.ch/cgi-bin/estscan) was used to predict coding regions and determine sequence direction. GO annotation was performed by Blast2GO software (https://www.blast2go.com/). Functional classification of the unigenes was performed using WEGO software (http://wego.genomics.org.cn/).

Enrichment and Dynamic Expression Profile of Differentially Expressed Genes (DEGs)
The mapped fragments were normalized according to the expected number of fragments per kilobase of transcript sequence per million base pairs sequenced (FPKM) for each gene, which facilitated the comparison of transcript levels between samples. Differential expression genes (DEGs) between each of the larval stages (LT vs. ES, LT vs. SL, LT vs. WL, ES vs. SL, ES vs. WL, and SL vs. WL) were identified by the DEseq2 R package, and DEGs were determined as the FDR (False discovery rate) < 0.05 and |log2(foldchange)| ≥ 1. Go enrichment analysis of the DEGs was performed using GOSeq R package with the Wallenius non-central hypergeometric distribution model to adjust gene length bias in DEGs. KEGG pathway enrichment analysis of the DEGs was done using KOBAS [53] with the hypergeometric distribution model. The enrichment p-values were adjusted using the Benjamin and Hochberg method and significance was determined with adjusted p < 0.05. In addition, the expression patterns of DEGs during the larval development were determined with clusters generated by STEM [30].

Validation of RNA-Seq Data with Quantitative RT-PCR (qRT-PCR)
To examine the reliability of the transcriptome data, 8 differently expressed genes, wnt11, smo, nk1, post2, abca3, nlk, rp-sae, and odd were validated by qRT-PCR with the 12 RNA samples used for the transcriptional analysis. The cDNA was synthesized for each sample using Prime-ScriptTM RT reagent Kit (TaKaRa, Dalian, China) with the gene-specific primers (Table S2) designed using Primer Premier 5.0 according to their predicted CDS sequences. The amplifications were performed with SYBR Premix Ex Taq kit (TaKaRa, Dalian, China) in LightCycler 480 Real-Time PCR system. The PCR mixture consisted of 10 µL SYBR Premix Ex Taq II, 2 µL template cDNA, 2 µL forward primer (10 µM), 2 µL reverse primer (10 µM), and 4 µL ddH 2 O. The qRT-PCR condition was: denature at 95 • C for 30 s, followed by 39 cycles of 5 s at 95 • C, and 60 • C for 30 s. Each sample was run in 3 technical replicates. The relative expression levels were normalized to the reference gene ATPase, and expression ratios were calculated using the 2 -∆∆Ct method. The experimental data were presented as the mean ± standard deviation from three samples with three parallel repetitions, and all qRT-PCR assays were validated in compliance with "the MIQE guidelines" [54]. Significant differences between means were tested using a one-way analysis of variance (ANOVA) followed by Tukey's HSD test with SPSS software 18.0 (SPSS Inc., Chicago, IL, USA). The significance level was set at p < 0.05.

Immunofluorescence Histochemistry
The larvae samples were rehydrated in a gradient methanol series (100%, 75%, 50%, and 25%), and blocked with 3% bovine serum albumin (BSA) (Shanghai biotechnology Technology, Shanghai, China) diluted by PBT (pH 7.4). Subsequently, the samples were transferred into preimmune serum, or primary antibody (the specific anti-HH polyclonal antibody prepared with the HH recombinant protein detailed in Figure S4) diluted 1:200 in BSA and incubated overnight at 4 • C on a nutator. Afterward, the samples were rinsed in PBT for 2 h and incubated subsequently with secondary fluorochrome-conjugated antibodies (donkey anti-rabbit Alexa Fluor 488, Invitrogen, CA, USA) diluted 1:300 in PBT for 2 h. At last, the larvae were washed six times in PBT and incubated in PBT with 2.5% DAPI (Solarbio, Beijing, China) in the dark for 2 h to label cell nuclei. Negative controls were obtained by preimmune serum in order to check for antibody specificity. All the samples were analyzed with the confocal laser-scanning microscope Nikon A1RSi (Nikon, Tokyo, Japan). Drawings and final panels were designed using Adobe Photoshop (Adobe, San Jose, CA, USA).

Conclusions
Our study presented the first transcriptome analysis focusing on the gene expression profiles of the segment formation and secondary loss in U. unicinctus, in which 101 genes were identified involving in the segment formation as in Drosophila and annelids. Most of the segment polarity genes were conserved in echiurans, annelids, and arthropods, while the gap genes and pair-rule genes were not. Besides, NK genes may be involved in the formation and maintenance of larval segments in U. unicinctus. Moreover, eight key pathways were identified to be involved in U. unicinctus larval segment formation and secondary loss. We also verified that HH might participate in the defining of the larval segment boundary in U. unicinctus and its function is evolutionarily conserved with annelids and arthropods. This study provides a basic understanding of the molecular mechanism of larval segment formation in echiurans.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/20/8/1806/ s1. Table S1. Expression trends of the 101 gene homologs involved in segment formation in Urechis unicinctus transcriptome; Table S2. Primer sequences of the DEGs for qRT-PCR; Figure S1. PCA analysis of U. unicinctus larval samples used for transcriptional analysis; Figure S2. Functional annotation of all unigenes in U. unicinctus larval transcriptome; Figure S3. Immunofluorescence assay with preimmune serum as a negative control in U. unicinctus larvae during the segment formation and secondary loss; Figure S4. Expression and purification of U. unicinctus Hedgehog recombinant protein analyzed by SDS-PAGE and specificity analysis of the anti-Hedgehog polyclonal antibody detected by Western blotting.
Author Contributions: Z.Z., X.H., and Z.Q. conceived the study, designed the experiment, assisted in writing, and provided financial support for the project; M.W. assisted with the experimental design, participated in sampling and RNA extraction; Q.L., T.Z., Y.X., D.Z. and D.K. contributed to animal treatment and sampling; Z.Q. helped to analyze the qRT-PCR results. All authors read and approved the final manuscript.
Funding: This work was support by the National Natural Science Foundation of China (31572601).