ceRNA Network Regulation of TGF-β, WNT, FOXO, Hedgehog Pathways in the Pharynx of Ciona robusta

The transforming growth factor-β (TGF-β) family of cytokines performs a multifunctional signaling, which is integrated and coordinated in a signaling network that involves other pathways, such as Wintless, Forkhead box-O (FOXO) and Hedgehog and regulates pivotal functions related to cell fate in all tissues. In the hematopoietic system, TGF-β signaling controls a wide spectrum of biological processes, from immune system homeostasis to the quiescence and self-renewal of hematopoietic stem cells (HSCs). Recently an important role in post-transcription regulation has been attributed to two type of ncRNAs: microRNAs and pseudogenes. Ciona robusta, due to its philogenetic position close to vertebrates, is an excellent model to investigate mechanisms of post-transcriptional regulation evolutionarily highly conserved in immune homeostasis. The combined use of NGS and bioinformatic analyses suggests that in the pharynx, the hematopoietic organ of Ciona robusta, the Tgf-β, Wnt, Hedgehog and FoxO pathways are involved in tissue homeostasis, as they are in human. Furthermore, ceRNA network interactions and 3′UTR elements analyses of Tgf-β, Wnt, Hedgehog and FoxO pathways genes suggest that different miRNAs conserved (cin-let-7d, cin-mir-92c, cin-mir-153), species-specific (cin-mir-4187, cin-mir-4011a, cin-mir-4056, cin-mir-4150, cin-mir-4189, cin-mir-4053, cin-mir-4016, cin-mir-4075), pseudogenes (ENSCING00000011392, ENSCING00000018651, ENSCING00000007698) and mRNA 3′UTR elements are involved in post-transcriptional regulation in an integrated way in C. robusta.


Introduction
The transforming growth factor-β (TGF-β) is a cytokine involved in regulation of cell fate and behavior in all tissues of the body [1]. In human hematopoiesis, TGF-β plays an important role in regulating hematopoietic stem cell (HSC) behavior, particularly in quiescence and self-renewal [2]. TGF-β signals are activated through a canonical small mother against decapentaplegic (SMAD)-mediated) pathway. Following the binding with the ligand, type II receptors and type I receptors trigger transmit intracellular signaling through the phosphorylation of downstream effector SMADSs [3][4][5]. The SMAD pathway is fully integrated into the intracellular signaling network, to address the expression and activities of ligands, antagonists, receptors by interactions with intracellular signaling such as Wintless (WNT), Hedgehog (HH), phosphoinositide 3-kinase (PI3K)-Akt [6], nuclear factor kB (NF-kB), and Janus kinases (JAKs), signal transducer and activator of transcription proteins (STATs) signaling pathways [7,8].
are involved in post-transcriptional regulation of genes mechanisms that have been highly conserved during the evolution of animal species.
National Center of Biotechnology Information (ncbi) data (ncbi.nlm.nih.gov/genome/ annotation_euk/Ciona_intestinalis/102; release October 2014) shows that 6% of transcripts in C. robusta genome are ncRNAs, including also RNA transfer (tRNAs) and long noncoding RNAs (lncRNAs). The different percentage of ncRNAs in our study (1% compared to 6%) is probably justified by the fact that we investigated only one organ (pharynx) and not the whole C. robusta transcriptome. Coding RNAs from NGS sequencing were analyzed through the functional classification system of a Gene Ontology (GO) tool (panther.org) (PANTHER 16.0, release April 2020), Supplementary Material, functional classification MF, BP and CC sheets ( Figure 1). involved in post-transcriptional regulation of genes mechanisms that have been highly conserved during the evolution of animal species.
National Center of Biotechnology Information (ncbi) data (ncbi.nlm.nih.gov/genome/annotation_euk/Ciona_intestinalis/102; release October 2014) shows that 6% of transcripts in C. robusta genome are ncRNAs, including also RNA transfer (tRNAs) and long noncoding RNAs (lncRNAs). The different percentage of ncRNAs in our study (1% compared to 6%) is probably justified by the fact that we investigated only one organ (pharynx) and not the whole C. robusta transcriptome. Coding RNAs from NGS sequencing were analyzed through the functional classification system of a Gene Ontology (GO) tool (panther.org) (PANTHER 16.0, release April 2020), Supplementary Material, functional classification MF, BP and CC sheets ( Figure 1). */** Chart tooltips are read as: category name (accession); * percent of gene hit against total # genes; **percent of gene hit against total # function hits. Information is reported just for the three most representative subclasses for each GO class. . */** Chart tooltips are read as: category name (accession); * percent of gene hit against total # genes; ** percent of gene hit against total # function hits. Information is reported just for the three most representative subclasses for each GO class.
The GO analysis showed that the most representative categories for the molecular function subclass were: catalytic activity (GO:0003824), binding (GO:0005488), and molecular function regulator (GO:0098772). The most represented categories for the biological process subclass were: cellular process (GO:0048511), metabolic process (GO:0008152), and biological regulation (GO:0065007). For the cellular component subclass the most representative categories were: cellular anatomical entity (GO:0110165) and intracellular (GO:0005622). As shown in Figure 2, miRNAs and pseudogenes are the most representative classes of noncoding transcripts. The GO analysis showed that the most representative categories for the molecular function subclass were: catalytic activity (GO:0003824), binding (GO:0005488), and molecular function regulator (GO:0098772). The most represented categories for the biological process subclass were: cellular process (GO:0048511), metabolic process (GO:0008152), and biological regulation (GO:0065007). For the cellular component subclass the most representative categories were: cellular anatomical entity (GO:0110165) and intracellular (GO:0005622). As shown in Figure 2, miRNAs and pseudogenes are the most representative classes of noncoding transcripts.

miRNAs Analysis
Of the 90 miRNAs identified by NGS, 54 were annotated and divided into two classes: 11 conserved miRNAs (Table 1) and 43 (Table 2) not conserved through the species (Supplementary Material, miRNA list sheet). Thirty-six of the 90 miRNAs were not annotated in the Ensembl database. Since the NGS sequences identified miRNA precursors (sequences from 59 to 77 base pairs long), we decided to align them using the miRBase Blastn tool (mirbase.org, release October 2018) to find a match with annotated mature miRNA sequences within the precursors (analysis results are in Supplementary Material, mirbase precursor and mature sheet). Only Blast results reporting an E-value < e − 5 = (10 − 5) were considered significant results and five potential mature miRNAs were identified: cin-mir-4105-5p, cin-mir-4198-5p, cin-mir-5596a-3p, cin-mir-5598-3p, cin-mir-5606-5p. Although just one miRNA sequence has an E-value score < e − 5 (cin-miR-5598-3p) and all precursor sequences identified with Blast research had E-values < e − 7; this difference is explained by the different length of sequence pairing when aligning precursor query sequences with mature miRNAs. A pipeline used to find miRNA annotations is shown in Supplementary Material ( Figure S1).

miRNAs Analysis
Of the 90 miRNAs identified by NGS, 54 were annotated and divided into two classes: 11 conserved miRNAs (Table 1) and 43 (Table 2) not conserved through the species (Supplementary Material, miRNA list sheet). Thirty-six of the 90 miRNAs were not annotated in the Ensembl database. Since the NGS sequences identified miRNA precursors (sequences from 59 to 77 base pairs long), we decided to align them using the miRBase Blastn tool (mirbase.org, release October 2018) to find a match with annotated mature miRNA sequences within the precursors (analysis results are in Supplementary Material, mirbase precursor and mature sheet). Only Blast results reporting an E-value < e − 5 = (10 − 5) were considered significant results and five potential mature miRNAs were identified: cin-mir-4105-5p, cin-mir-4198-5p, cin-mir-5596a-3p, cin-mir-5598-3p, cin-mir-5606-5p. Although just one miRNA sequence has an E-value score < e − 5 (cin-miR-5598-3p) and all precursor sequences identified with Blast research had E-values < e − 7; this difference is explained by the different length of sequence pairing when aligning precursor query sequences with mature miRNAs. A pipeline used to find miRNA annotations is shown in Supplementary Material ( Figure S1). To study the evolution pattern of the 11 conserved C. robusta miRNA homologues in animal genomes (Table 3), we used the ZooMir (Supplementary Material ZooMir sheet) (insr.ibms.sinica.edu.tw/ZooMir/index.php, release April 2019), and MirGeneDB 2.0 (https://mirgenedb.org/, release 2015) database of homologous miRNA genes to search animal genomes that have been validated and annotated. Although cin-mir-375 is not present in the ZooMir list and cin-mir-141 is not present in MirGeneDB, both are present in miRBase database (http://www.mirbase.org/, release October 2018). The cin-mir-92c-5p, cin-mir-92a-3p, cin-mir-375-3p, cin-mir-153-5p and cin-let-7d-5p were highly conserved in chordate species (Table 3). miR-375 was highly conserved from C. elegans to H. sapiens, with one isoform except for D. rerio where there are two. Moreover, the miR-92 was conserved from D. melanogaster to H. sapiens and miR-let7 from C. elegans to H. sapiens. The let-7 miRNA was one of the first miRNAs discovered in C. elegans, and results high conserved from nematode to the human. According to miRBase, C. elegans, D. melanogaster, S. purpuratus, C. robusta, B. floridae, X. tropicalis, D. rerio, G. gallus, M. musculus and H. sapiens, all express a version of let-7 (Table 3). let-7 miRNA orthologues share a consensus sequence called the 'seed sequence' GAGGUAG ( Figure 3), that spans from nucleotides 2 through 8 of mature miRNA (data not shown), suggesting that their targets and functions may be similar in diverse animal species. The C. elegans and D. melanogaster have a single isoform, whereas chordates including C. robusta (mir-let-7a, a2, b, c, d, f) have multiple let-7 isoforms. Indeed D. rerio and H. sapiens have diverse let-7 family members (let-7a, -7b, -7c, -7d, -7e, -7f, -7g, -7h, -7i, -7j, -7k) although some differences may be observed (for example, let-7h exists in D. rerio but not in H. sapiens). According to miRBase, in human the let-7 family is composed of nine mature let-7 miRNAs encoded by 12 different genomic loci, some of which are clustered together (data not shown), in D. rerio the let-7 family is composed of eighteen mature let-7 miRNAs.

Pseudogene Characterization and Analysis
NGS technologies identified 27 pseudogenes in C. robusta that resulted annotated in Ensembl database and that were analyzed through Basic Local Alignment Search Tool (Blastn tool) (blast.ncbi.nlm.nih.gov/Blast.cgi, release October 2020), to find their parental gene and percentage of identity with parental genes (paralogues) ( Table 4) (Supplemetary Material, pseudogene sheet). Pseudogenes showed a high percentage of identity with paralogue genes (94-100%) and some pseudogenes only had a few nucleotides that differed in sequence from their parental genes. Eleven of 27 pseudogene sequences were "predicted" by the Blast prediction algorithm, and the remaining 16 sequences were "validated". All of them resulted in the same chromosome of parental genes and had lost their ability to encode proteins.

RNA-RNA Predictions
To investigate ceRNA interactions and possible involvement in the post-transcriptional regulation of pathway involved in pharynx tissue homeostasis, we tested all the transcripts produced with NGS technology, using the miRNATIP prediction tool.
To consider, the most significant interactions, we filtered obtained results according to the lowest free energy values. All data of both interaction types are shown in Supplementary Materials. Figure 4 showed the RNA-RNA prediction pipeline. For miRNA-mRNA prediction, miRNATIP tool evidenced 341 interactions (228 unique genes with 20 different miRNAs), and five miRNAs were conserved through evolution (cin-mir-92c-5p, cin-mir-92a-3p, cin-mir-375-3p, cin-mir-153-5p and cin-let-7d-5p). miRNA-pseudogenes prediction evidenced 84 predictions (23 different pseudogenes with 50 different miRNAs). All miRNAtarget interactions and miRNA-pseudogene interactions are in Supplemental Material (Supplementary Material miRNA-target prediction filter sheet and miRNA-pseudogene prediction sheet). These interactions that considered all annotated coding and noncoding transcripts had free energy < of −12 kcal/mol. To investigate ceRNA interactions and possible involvement in the post-transcriptional regulation of pathway involved in pharynx tissue homeostasis, we tested all the transcripts produced with NGS technology, using the miRNATIP prediction tool.
To consider, the most significant interactions, we filtered obtained results according to the lowest free energy values. All data of both interaction types are shown in Supplementary Materials. Figure 4 showed the RNA-RNA prediction pipeline. For miRNA-mRNA prediction, miRNATIP tool evidenced 341 interactions (228 unique genes with 20 different miRNAs), and five miRNAs were conserved through evolution (cin-mir-92c-5p, cin-mir-92a-3p, cin-mir-375-3p, cin-mir-153-5p and cin-let-7d-5p). miRNA-pseudogenes prediction evidenced 84 predictions (23 different pseudogenes with 50 different miRNAs). All miRNA-target interactions and miRNA-pseudogene interactions are in Supplemental Material (Supplementary Material miRNA-target prediction filter sheet and miRNApseudogene prediction sheet). These interactions that considered all annotated coding and noncoding transcripts had free energy < of −12 kcal/mol.

GO and Pathway Analysis
To investigate Gene Ontology (GO) annotations of targets which were predicted to interact with specific miRNAs and pseudogenes obtained by NGS sequences, gene enrichment and pathway analyses were performed using two different web servers: the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool (david.ncifcrf.gov, release October 2016) and the Protein Analysis Through Evolutionary Relationships (PANTHER) tool (pantherdb.org, release April 2020).

GO and Pathway Analysis
To investigate Gene Ontology (GO) annotations of targets which were predicted to interact with specific miRNAs and pseudogenes obtained by NGS sequences, gene enrichment and pathway analyses were performed using two different web servers: the Database for Annotation, Visualization and Integrated Discovery (DAVID) tool (david.ncifcrf.gov, release October 2016) and the Protein Analysis Through Evolutionary Relationships (PAN-THER) tool (pantherdb.org, release April 2020).
Gene enrichment analysis was performed according to the three GO annotation classes: biological processes (BP), molecular functions (MF), and cellular component (CC) (all complete GO and pathway analysis are in Supplementary Material, BP, MF, CC, pathway kegg sheet). All GO results were filtered for the p-value statistical test (p-value < 0.05) and the Benjamini correction test (Benjamini < 0.05). Most representative terms were linked to WNT signaling (canonical and noncanonical pathway), MAP kinase activity, transcription regulation and protein binding. Figure 5 shows GO analysis of both BP, MF and CC classes. All GO results were filtered for p-value statistical test (p-value < 0.05) and Benjamini correction test (Benjamini < 0.05). Pathway analysis evidenced four pathways linked to homeostasis: the Tgf-β, Wnt, FoxO, and Hh pathways (Supplementary Material kegg pathway, Wnt gene list, FoxO gene list, Tgfβ gene list, Hh gene list sheets). Pathway results were filtered for the p-value statistical test (p-value < 0.05) and the Benjamini correction test (Benjamini < 0.05). Pathway selection was done filtering for the highest correction values of pathway analysis. Gene enrichment analysis was performed according to the three GO annotation classes: biological processes (BP), molecular functions (MF), and cellular component (CC) (all complete GO and pathway analysis are in Supplementary Material, BP, MF, CC, pathway kegg sheet). All GO results were filtered for the p-value statistical test (p-value < 0.05) and the Benjamini correction test (Benjamini < 0.05). Most representative terms were linked to WNT signaling (canonical and noncanonical pathway), MAP kinase activity, transcription regulation and protein binding. Figure 5 shows GO analysis of both BP, MF and CC classes. All GO results were filtered for p-value statistical test (p-value < 0.05) and Benjamini correction test (Benjamini < 0.05). Pathway analysis evidenced four pathways linked to homeostasis: the Tgf-β, Wnt, FoxO, and Hh pathways (Supplementary Material kegg pathway, Wnt gene list, FoxO gene list, Tgfβ gene list, Hh gene list sheets). Pathway results were filtered for the p-value statistical test (p-value < 0.05) and the Benjamini correction test (Benjamini < 0.05). Pathway selection was done filtering for the highest correction values of pathway analysis.
See Table 5 for a summary of pathway analyses and ceRNA interactions in selected pathways. (All GO and pathway analyses are shown in Supplementary Material, BP, MF, CC, kegg pathway sheet).   See Table 5 for a summary of pathway analyses and ceRNA interactions in selected pathways. (All GO and pathway analyses are shown in Supplementary Material, BP, MF, CC, kegg pathway sheet).

Identification of Structural Elements in the mRNA 3 -UTR of Tgf-β, Wnt, Hedgehog and FoxO Pathways Genes
In silico analyses were performed using the RegRNA 2.0 tool (http://regrna2.mbc. nctu.edu.tw/ release February 2021) to investigate the cis-elements of the 3 -UTR involved in post-transcriptional regulation of Tgf-β, Wnt, Hedgehog and FoxO pathways genes ( Figure 6).

ceRNA Network Reconstruction
Gene enrichment and pathway analyses from NGS data, together with RNA-RNA interaction predictions through the miRNATIP predictor, evidenced a ceRNA interaction network involving four pathways linked to pharynx tissue homeostasis (Figures 7 and 8) (Supplementary Material, network sheet). Table 5 shows a detailed list of these interactions. Indeed, different miRNAs interact with genes belonging to Tgf-β, Wnt, FoxO and Hh pathways, and they are both conserved and species-specific miRNAs. Conserved miR-NAs are cin-let-7d-5p, cin-mir-153-5p, and cin-mir-92c-5p (Table 1). Moreover, three pseudogenes were shown to interact with miRNAs in these pathways: EN-SCING00000011392, ENSCING00000018651, ENSCING00000007698 (Figures 7 and 8). Ensembl annotation analysis showed that they could belong, respectively, to the annotation categories of C. robusta zinc finger protein (zf(c2h2)-31), Not4 protein (not4), and Rrp12like protein (LOC100184722) ( Table 4). Almost all mRNAs showed musashi-binding elements (MBE), and a γ-interferon activated inhibitor of translation (GAIT) element. In addition, RegRNA 2.0 identified in 3 -UTR of Smad4, Smad2/3b, secreted frizzled-related protein (sfrp2), p38 kinase, transforming growth factor β receptor IIa (tgfbr-iia), transcription factor protein(Gli) a cytoplasmic polyadenylylation element (CPE), involved in Poly(A) tail elongation after the export of the mRNA to the cytoplasm, though a mechanism called cytoplasmic polyadenylation; in Nemo-like kinase, Glycogen synthase kinase α/β(Gsk) a polyadenylation response element (MOS-PRE); an AU found in Smad4, Wnt ligand, p38 kinase; in Smad 4, Tgfbr-iib, Smoothened protein, a GU-rich destabilization element, identified in human as a conserved sequence, involved in rapid mRNA turnover; in Wnt signaling ligand, sfrp2, Gli, there was a UNR binding site. The RNA binding protein UNR can act as an activator or inhibitor of translation initiation, promote mRNA turnover, or stabilizes mRNA.

ceRNA Network Reconstruction
Gene enrichment and pathway analyses from NGS data, together with RNA-RNA interaction predictions through the miRNATIP predictor, evidenced a ceRNA interaction network involving four pathways linked to pharynx tissue homeostasis (Figures 7 and 8) (Supplementary Material, network sheet). Table 5 shows a detailed list of these interactions. Indeed, different miRNAs interact with genes belonging to Tgf-β, Wnt, FoxO and Hh pathways, and they are both conserved and species-specific miRNAs. Conserved miRNAs are cin-let-7d-5p, cin-mir-153-5p, and cin-mir-92c-5p (Table 1). Moreover, three pseudogenes were shown to interact with miRNAs in these pathways: ENSCING00000011392, ENSC-ING00000018651, ENSCING00000007698 (Figures 7 and 8). Ensembl annotation analysis showed that they could belong, respectively, to the annotation categories of C. robusta zinc finger protein (zf(c2h2)-31), Not4 protein (not4), and Rrp12-like protein (LOC100184722) ( Table 4). The four pathways are represented as networks formed by nodes connected by edges, and they are colored as follows: Wnt-green, Tgf-β-red and FoxO-yellow, Hh-purple. In RNA networks are also shown miRNAs and pseudogenes. Interacting ceRNAs are colored in green as they are interactors of Wnt pathway. The couple cin-mir-92c and ENSCING00000011392 is also ceRNA couple for HH2. Pink nodes represent proteins that are shared by two or more pathways. Figure 7. The four pathways are represented as networks formed by nodes connected by edges, and they are colored as follows: Wnt-green, Tgf-β-red and FoxO-yellow, Hh-purple. In RNA networks are also shown miRNAs and pseudogenes. Interacting ceRNAs are colored in green as they are interactors of Wnt pathway. The couple cin-mir-92c and ENSCING00000011392 is also ceRNA couple for HH2. Pink nodes represent proteins that are shared by two or more pathways.

Discussion
In humans, numerous studies show the crucial role of transforming growth factor β (TGF-β) to regulate the self-renewal capacity of hematopoietic stem cells (HSCs) in vivo, precisely, in the bone marrow (BM) it maintains the homeostasis of normal hematopoiesis through the precise generation of mature blood cells [2,[34][35][36][37][38]. Furthermore, the TGF-β

Discussion
In humans, numerous studies show the crucial role of transforming growth factor β (TGF-β) to regulate the self-renewal capacity of hematopoietic stem cells (HSCs) in vivo, precisely, in the bone marrow (BM) it maintains the homeostasis of normal hematopoiesis through the precise generation of mature blood cells [2,[34][35][36][37][38]. Furthermore, the TGF-β signaling has an essential function to maintain the self-renewal capacity of HSCs, even when the tissue context changes [39,40].
In Ciona robusta the pharynx, its hematopoietic organ, consists of two epithelial monolayers aligned in transversal and longitudinal bars, where flows the haemolymph containing hemocytes, mainly lymphocyte-like cells, hyaline amoebocytes, morula cells, unilocular refractile granulocytes, signet ring cells [34]. In C. robusta following LPS challenge, Tgf-β cytokine was transcriptionally upregulated and expressed by the hemocytes that are found to flow inside the pharynx vessels. In upregulation of Tgf-β mRNA time profile (1-72 h) two phases can be recognized, the first (1-4 h) when the inflammatory response starts end the second (48-72 h) in which homeostasis is reached and maintained [30].
In this study, NGS and bioinformatics suggest that combinatorial and crosstalk at multiple levels, between TGF-β, WNT, FOXO, and Hedgehog (HH) signaling pathways are involved in pharynx tissue homeostasis.
The HH signaling pathway is evolutionarily conserved and is required for embryonic patterning, tissue repair, and regeneration. In vertebrates, HH signaling is controlled by polypeptide ligand, an intracellular signaling molecule called Hedgehog (HH). If the ligand is not present, two cell-surface transmembrane proteins called the patched receptor (PTCH1 or PTCH2) repress the 7-membrane-spanning receptor-like protein smoothened (SMO) leading to the phosphorylation of GLI (glioma-associated oncogene homolog) transcription factors by several protein kinases such as kinase A (PKA), GSK-3b, and CK1, and subsequent cleavage of GLI into truncated forms that act as repressors of HH target genes [41]. The binding of the HH ligand delete the inhibition of SMO leading to the activation and translocation of GLI proteins into the nucleus to control the expression of HH target genes [41].
In vertebrates, the WNT signaling pathways is involved in regulating many aspects of development and play important roles in cell-fate determination, self-renewal, and maintenance of stem cells [2]. Complete sequences of animal genomes have revealed that signaling pathways, such as TGF-β, WNT and FOXO [42][43][44][45] can be integrated to build an higher order networks and coordinated in signaling network that allows cells to read a limited set of cues and mount the diverse responses connected to the developmental processes and homeostasis [46].
In C. robusta gene enrichment and pathway analyses from NGS data, together with RNA-RNA interaction predictions evidenced a ceRNA (mRNA, miRNA and pseudogene) interaction network involved in post-transcriptional regulation of four pathways (Tgf-β, Wnt, FoxO, and Hh) linked to pharynx tissue homeostasis.
MicroRNAs (miRNAs), found in diverse eukaryotes from plants to animals, inhibit gene expression largely in a posttranscriptional manner, by interacting with miRNA response elements (MREs) in the mRNA 3 -UTRs of the target genes. Pseudogenes which have the similar MREs target genes of miRNAs [47], leading to direct competition for miRNAs and allowing pseudogenes to modulate gene expression [48,49], participating in the physiological maintenance of endogenous homeostasis and the pathological process of disease [50]. In C. robusta NGS identified 15,074 transcripts and of these 1% (201 elements) was classified as noncoding RNA (ncRNA). Of these 201, 90 were miRNAs and 27 pseudogenes. Gene enrichment and pathway analyses from NGS data, together with RNA-RNA interaction predictions by miRNATIP predictor, evidenced a ceRNA interaction network involving the four pathways linked to pharynx tissue homeostasis. Both conserved (cinlet-7d-5p, cin-mir-153-5p, and cin-mir-92c-5p) and species-specific miRNAs were found in ceRNA interaction network. Moreover, only three pseudogenes were found to interact with miRNAs involved in gene expression regulation of Wnt and Hh pathways.
After transcription, mRNAs are exported to the cytoplasm [51] to be translated. In the cytoplasm mRNAs can be turned over through regulated decay [52] by post-transcriptional regulatory steps important for gene expression. Studies show that interaction between RNA-binding proteins (RBPs) and specific cis-regulatory elements in target transcripts in 3 -UTR is the basis for post-transcriptional regulation of gene expression [53]. RegRNA analyses of the 3 -UTR mRNA of genes involved in C. robusta Tgf-β, Wnt, FoxO, and Hh signaling pathways showed a variety of cis-regulatory elements, including the CPE, ARE, MBE, GU-rich destabilization element, UNR binding site (an important cis-element for RNA regulation) and GAIT elements that, in humans, are implicated in post transcriptional regulation of several immune-related mRNAs, and have an important role in gene specific translation control in innate immunity [54].
All these data suggest that in C. robusta, as in humans, Tgf-β, FoxO, Wnt, and Hh signaling extensively cross-talk at many levels, and that multiple signaling inputs are integrated into the core transcription factor network to regulate target gene expression cooperatively in tissue homeostasis and the genes of these signaling pathways are subjected to a complex system of post-transcriptional regulations that involve in an integrated way miRNAs, pseudogenes and 3 UTR elements.

Tunicates
In this study, the model organism Ciona robusta, was formerly classified as Ciona intestinalis. Molecular studies have confirmed that C. intestinalis constitutes a compilation of species rather than a single species [55][56][57][58].
C. robusta specimens were collected from Sciacca harbor (Sicily, Italy) and were acclimatized and maintained as reported in Arizza et al. [33] Fragments of pharynx tissue (200 mg) were immediately soaked in RNAlater tissue collection solution (AMBION, Austin, TX, USA) and Total RNA extraction was performed as reported in in Arizza et al. [33].

Gene Functional Enrichment
Gene Ontology enrichment analysis: C.robusta transcripts produced by NGS were analyzed through The Clusters of Orthologous Genes (COG) database (geneontology. org, release February 2021). All the three different Gene Ontology subcategories were investigated: (i) molecular functions (MF); (ii) cellular components (CC); (iii) and biological pathways (BP). The results for the three GO categories were filtered for p-value and for adjusted p-value (<0.05), and for Benjamini correction test (Benjamini < 0.05).
The Protein Analysis Through Evolutionary Relationships (PANTHER GO-slim analysis tool) system connected to the COG database was used to expand the annotation data of the gene and protein families obtained from GO. The "GO-slim" analysis mode was set, as it is more reliable and accurate than the GO annotation mode "GO-complete".

Analysis of Not-Annotated miRNAs
The sequences of the 36 miRNA precursors were aligned using miRBase Blastn alignment tool (mirbase.org, release October 2018). This tool allows us to find similar protein or nucleotide sequences to the target sequence under investigation. Each sequence was taken from the Ensembl genome browser (release 101, August 2020). Only "query" sequences with E-value < 0.05 and S score > 100 were considered as significant results.
E-value is defined as the probability due to chance, that there is another alignment with a similarity greater than the given S score. The S score is a measure of the similarity of the query to the sequence shown. As the typical threshold for a good E-value from a Blast search is e − 5 = (10 − 5) or lower, just a few sequences were considered as potential good results.
Sequence alignment was performed both with miRNA precursor sequences than with mature sequences. This choice allowed us to compare E-values results for each alignment to increase the power of analysis (sequence alignment with precursor and mature miRNAs are shown in supplementary files, miRBase blast mature and precursor sheets). Search sequence was done selecting stem-loop sequence, and miRNA mature; Blastn tool was applied as alignment, and E-value cutoff of 10 and species filter on C. robusta were selected to blast the "query" sequence.

Analysis of Pseudogenes
The pseudogenes annotated by Ensembl database were analyzed through Basic Local Alignment Search Tool (Blastn tool) (blast.ncbi.nlm.nih.gov/Blast.cgi, release October 2020), to find regions of similarity between biological sequences and to see if pseudogenes produced by NGS derived by known genes and if they were conserved through evolution. The pseudogenes were analyzed according to Ensembl "genebuild" algorithm. Two parameters were used to evaluate the significance of the match: the expected value (E), a parameter that describes the number of hits one can "expect" to see by chance when searching a database of a particular size; and the score S (S), that is inversely correlated with the E value.

RNA-RNA Interaction Prediction
RNA-RNA interaction predictions (i.e., miRNA-pseudogene and miRNA-mRNA target) were performed through the miRNA target interaction predictor (miRNATIP) algorithm [59]. For each RNA pair, miRNATIP exploits an approach that combines the advantages of an artificial neural network and the effectiveness of the relative binding free energy computation for each putative interaction.
The first step of the algorithm is the training of a self-organizing map (SOM) neural network [60]. This network is able to map high-dimensional datasets into a smaller dimensional space. The structure of the SOM is typically made by a two-dimensional lattice of interconnected neurons. By using competitive learning, all the sequences corresponding to miRNA seeds will be projected into this lattice (map), and all of them will be clustered according to their structural similarity. In other words, each miRNA seed will be as close to other miRNA seeds as they are similar, whereas different miRNA seeds will be as distant as their sequence are dissimilar. As a result of this phase, the map can be divided into areas where the input patterns share structural feature values. Each area represents a cluster. The second step deals with the projection of mRNA/pseudogene fragments on the trained map, in order to check if they can match a cluster. If fragments are compatible with a cluster, they will be considered for the next step. As a result, for each cluster, a list of miRNA seed-mRNA fragments is obtained, representing a preliminary list of putative interactions.
The third step, take care of clustered elements. For each pair of miRNA seed and mRNA/pseudogene fragment, miRNATIP compute the dissimilarity between the miRNA tail and an extended version of mRNA/pseudogene fragment (i.e., will be considered some nucleotides near to the fragment). The algorithm also considers potential bulges between miRNA seed and tail. If the computed dissimilarity is under a specific threshold, the putative interaction is considered for the next step. The last step filters out wrongly predicted interactions, by means of the computation of the binding free energy.

Analysis of 3 UTR mRNA Target
To characterize the 3 UTR elements of mRNAs, a computational analysis was performed using the regulatory RNA motifs and elements finder tool (http://regrna.mbc.nctu. edu.tw/html/prediction.html, release February 2021).

Pathway Enrichment Analysis
Pathway analysis was performed using the DAVID tool. First, functional annotation analysis was selected. The C. robusta gene list was compared with the reference C. robusta list present by the tool as background. Pathway annotation chart report was then selected, through the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway chart. DAVID pathway viewer displays user genes on static pathway maps generated by BioCarta and KEGG. EASE (Exact Annotation Significance Enrichment) score threshold, a modified Fisher exact p-value used for functional annotation analysis, was set as 0.1. Results were considered statistically significant if p-value and adjusted p-value were < 0.05. The updated version of DAVID tool was 20 August 2020, DAVID release 6.7.
All statistical assessments of GO term enrichment and pathway analyses were performed by Fisher's exact test in combination with a robust Benjamini correction test, for multiple testing. The row p-value and Benjamini thresholds were set as <0.05.

ceRNA Network Construction and Visualization
To build ceRNA network evidenced by RNA-RNA interaction prediction, C. robusta pathways (Wnt, FoxO, Hh and Tgf-β) were downloaded from KEGG database (https://www.genome.jp/kegg/, release January 2021) in the Cytoscape tool (cytoscape. org, release March 2017). Cytoscape is an open-source software platform for visualizing molecular interaction networks and biological pathways and integrating these networks with annotations. The Cytokegg app was used to load KGML files, previously downloaded from KEGG, in Cytoscape db, and the STRING app was then used to convert the networks into STRING networks (STRINfy networks).
In particular, "igraph" package of R was used to analyses networks produced by STRING and to generate the image of integrated networks