Population and Transcriptomic Changes of the Tropical Fish Parasite Caligus confusus (Copepoda: Caligidae) with Seasonal Variations in Sea Temperature

Abstract: Fish–parasite systems could be subject to two scenarios under climate change: (i) increased water temperature might enhance parasite metabolism, allowing the parasite to spread rapidly; (ii) parasitism could decrease if the optimal temperature for growth and transmission is exceeded. Sea lice are parasitic copepods commonly found on marine fish in tropical regions, yet their biology remains poorly investigated. In this study, we analyzed the changes in infection levels and the transcriptomic response of the tropical sea louse Caligus confusus to two seasonal seawater temperatures (30 °C, “warm”, and 21 °C, “cold”). The prevalence of C. confusus was significantly higher in the colder water. A de novo transcriptomic analysis of C. confusus, the first for a tropical sea louse, revealed 426 over-expressed and 1402 down-expressed transcripts at the lower temperature. In particular, we observed over-expression of transcripts encoding vitellogenins (vit-1, vit-2, vit-4, and vit-6) and matrix metalloproteinases (mmp-2 and mmp-9), which are involved in reproduction and development. These results suggest that the cold tropical season physiologically favors C. confusus and that low temperature favors embryo development, which might ultimately lead to a higher prevalence. It is possible, therefore, that climate change could reduce some tropical sea lice populations during extreme warming events.


Introduction
Copepods are a diverse group of crustaceans comprising free-living species, which are the primary components of marine zooplankton, and symbiotic species, most of which have a direct life cycle and are commonly ectoparasitic on fish, although they have been found on practically all marine animals [1]. Members of the family Caligidae, also called sea lice, which includes 30 genera and more than 450 species, are the most common parasitic copepods on marine fishes [2]. The most speciose genera in this family are Caligus and Lepeophtheirus, comprising 278 and 124 species, respectively [1]. Sea lice, mainly Caligus rogercresseyi and Lepeophtheirus salmonis, have become widely known due to their negative impact on fish aquaculture, causing skin irritation, ulcerations, anemia, lethargy, weight loss, secondary infections, and mortality in fish [3][4][5].
Fishes 2023, 8, 475
These general scenarios for parasitic copepods, especially at tropical latitudes, have been scarcely investigated, and the molecular mechanisms involved are not well understood. Whereas transcriptomic research has shed some light on the complex biological processes of C. rogercresseyi and L. salmonis [13][14][15][16], to date, transcriptomes are not available for tropical sea lice.
The aims of this study were to perform a de novo transcriptome assembly of C. confusus, and to document changes in infection levels and the transcriptomic responses of this parasite at contrasting seawater temperatures. This sea louse species is commonly found in fish of the family Carangidae (jacks) from the tropical waters of the Eastern Pacific and Indo-Pacific regions [17,18]. In particular, we focused on genes that are involved in reproduction and development, which might be associated with observable changes in the prevalence of C. confusus. Such data could be useful for better understanding the effect of climate change on sea lice, and parasitic copepods in general, in tropical regions.

Fish Sampling
Fish (Caranx caninus) were captured in the coastal waters of Mazatlán (23°12′48″ N, 106°26′20″ W), on the eastern side of the entrance to the Gulf of California (Mexico). The average sea surface temperature in this zone reaches a maximum of 31 °C between July and October and a minimum of 21 °C from January to March [19]. Thus, to determine the population and molecular changes occurring in C. confusus at contrasting temperatures, field samples were collected during October 2021 (warm season) and February 2022 (cold season).

Changes in Infection Levels
A total of 93 and 26 fish specimens were collected during October 2021 (average 30 °C) and February 2022 (average 21.8 °C), respectively. The fish were acquired fresh, directly from local fishermen. In the laboratory, the gills of the fish were removed and examined under a stereomicroscope (Motic, Richmond, BC, Canada) for the presence of sea lice, which were carefully transferred to labeled vials and preserved in 96% alcohol for later identification. Specimens of C. confusus were identified based on the morphological characteristics described by Ho and Lin [20]. To that end, copepods were cleared in lactic acid in order to observe their appendages in detail under a compound microscope (Leica DMLB, Leica Microsystems, Wetzlar, Germany). The prevalence and mean intensity of infection [21], with 95% confidence intervals (CIs), were calculated and compared between sampling months by using Fisher's exact test and bootstrap t-testing, respectively, in the software Quantitative Parasitology (Qpweb 1.0.15) [22].
[...] °C, oxygen 98%, salinity 34.6 psu). On these dates, the fish were caught together at the same site by local fishermen at a depth of 2 m, and immediately inspected for the purpose of recovering live sea lice. To that end, gills were removed, and sea lice were carefully collected under a stereomicroscope using sterile forceps. Specimens of C. confusus were transferred to a Petri dish containing filtered seawater (to remove excess mucus) for 1 min, preserved in RNAlater (Sigma-Aldrich) at room temperature for 24 h, and then stored at −20 °C until RNA extraction. Sixty female copepods were pooled for each sampling month, that is, October (warm) and February (cold).
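As an illustration of the infection-level comparison described above, the following minimal sketch computes prevalence, mean intensity, and a two-sided Fisher's exact test on a 2 × 2 table. It is not the Qpweb implementation, and the bootstrap CIs and bootstrap t-test are omitted. The sample sizes (93 and 26 fish) come from the text; the numbers of infected fish and the function names are hypothetical and serve only to make the example runnable.

```python
from math import comb


def prevalence(infected: int, examined: int) -> float:
    """Percentage of examined hosts carrying at least one parasite."""
    return 100.0 * infected / examined


def mean_intensity(parasites_per_host: list[int]) -> float:
    """Mean number of parasites per infected host (uninfected hosts excluded)."""
    infected = [n for n in parasites_per_host if n > 0]
    return sum(infected) / len(infected)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    probs = (prob(x) for x in range(lowest, highest + 1))
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs if p <= p_observed * (1 + 1e-9))


# Hypothetical infected counts; only the sample sizes (93, 26) are from the text.
warm_infected, warm_examined = 20, 93
cold_infected, cold_examined = 15, 26

p_value = fisher_exact_two_sided(
    warm_infected, warm_examined - warm_infected,
    cold_infected, cold_examined - cold_infected,
)
print(prevalence(warm_infected, warm_examined),
      prevalence(cold_infected, cold_examined), p_value)
```

With real data, the per-host parasite counts would also feed `mean_intensity`, and CIs would be obtained by resampling hosts.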

RNA Extraction, Library Preparation, and Sequencing
The total RNA of each pool of copepods was extracted using Trizol reagent (Molecular Research Center, Inc., Cincinnati, OH, USA) according to the manufacturer's instructions. The remaining genomic DNA was removed with a DNA-free kit (Life Technologies Corporation, South San Francisco, CA, USA). The quantity and quality of the total RNA were measured at 230, 260, and 280 nm with a NanoDrop 2100 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). Total RNA was loaded onto a 1% agarose gel containing ethidium bromide. After electrophoresis, the gel was visualized using a Bio-Rad Universal Hood II Gel Doc System (Bio-Rad Laboratories, Segrate, Milan, Italy). The final quality of total RNA was verified by capillary electrophoresis using a 2100 Agilent Bioanalyzer (Agilent, Santa Clara, CA, USA) with a Nano Kit RNA chip. The RNA integrity number (RIN) was 7.5.
The RNA sequencing (RNA-seq) libraries were prepared using the MGI Easy RNA Library Prep Kit (MGI Tech Co., Ltd., Shenzhen, China). Sequencing of each library was performed on the MGISEQ-2000 platform (MGI Tech Co., Ltd., Shenzhen, China). Extraction of RNA, cDNA library construction, and sequencing were performed at the Genomic Services Laboratory (UGA-Langebio, CINVESTAV, Irapuato, Mexico). Raw sequence reads of C. confusus from cold and warm temperatures were deposited in the NCBI Sequence Read Archive (SRA) database under the accession numbers SRR25611633 and SRR25611632, respectively (BioProject PRJNA1004606).

De Novo Transcriptome Assembly and Functional Annotation
The quality of the raw RNA-seq library data was assessed with FastQC 0.11.9, and reports were gathered with MultiQC 1.14 [23]. The RNA-seq data showed neither adapter bias nor low quality at the 3′ end, so no processing of the fragments was necessary. Phred scores per cycle for each library are reported in Supplementary File S1. In total, 893 million RNA-seq fragments from the two libraries (warm and cold) were used to assemble a de novo transcriptome in Trinity 2.6.6 [24] with default parameters for paired-end reads. Transcripts with 96% identity were clustered using the CD-HIT 4.6 software [25] to reduce redundant sequences. The open reading frames (ORFs) and putative proteins were predicted from the assembled transcriptome using TransDecoder 5.5.0 [26], in combination with Swiss-Prot and Pfam searches (options: --retain_pfam_hits, --retain_blastp_hits). The functional annotation was performed with Trinotate [27] to gather the results of Blastn 2.6.0, Blastp 2.6.0 [28], Hmmer 3.1b2 [29], SignalP 4.1 [30], and TMHMM 2.0 [31] for each assembled transcript. This Transcriptome Shotgun Assembly project has been deposited at DDBJ/ENA/GenBank under the accession GKOC00000000. The version described in this paper is the first version, GKOC01000000.

Evaluation of Assembly
Although pooling many individuals is useful for obtaining sufficient genomic material from small organisms, a disadvantage is the increased likelihood of contamination, which compromises the quality of the assembly. Therefore, to confirm that the genomic material was representative of a single species, that is, C. confusus, the assembled transcriptome in this study was aligned against the 28S ribosomal gene of L. chilensis (NCBI ID: JX896325.2). Hit sequences were retrieved and aligned against the NCBI nucleotide collection (nt) database using Blastn, with default parameters. The results revealed the presence of sequences of more than one caligid species. It is possible that a few specimens other than C. confusus were inadvertently collected, because individuals were not examined with absolute rigor in order to avoid subjecting them to handling stress. Because sequences of closely related species cannot be separated computationally, we continued the subsequent analyses using this transcriptome.
To eliminate sequences of fish and possible contamination by bacteria, viruses, fungi, and other species, strict filtering was performed by aligning the putative proteins against the UniRef90 database (accessed on 19 June 2022) [32] using Diamond 2.0.15 [33] with the option "--sensitive" (e-value < 1 × 10⁻³). Following the approach of Caña-Bozada et al., sequences whose best matches were to non-Protostomia species were considered contaminants and eliminated from the transcriptome and putative proteins [34]. Completeness of the transcriptome was assessed in BUSCO 5.3.2 [35] using the dataset of orthologous genes conserved across arthropods, which contains 1013 BUSCO groups.
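The taxonomic filter described above can be sketched as follows. This is only an illustration of the decision rule (keep a protein when the lineage of its best hit includes Protostomia), not the pipeline actually used. The data layout, the function name `filter_contaminants`, and the `keep_unmatched` switch are assumptions, as is the handling of proteins with no hit, which the text does not specify.

```python
def filter_contaminants(
    protein_ids: list[str],
    best_hit_lineage: dict[str, list[str]],
    keep_unmatched: bool = True,
) -> list[str]:
    """Return the protein IDs retained after taxonomic filtering.

    best_hit_lineage maps a protein ID to the taxonomic lineage of its
    best database hit (hypothetical layout). Proteins whose best hit is
    outside Protostomia are treated as contaminants and dropped.
    """
    retained = []
    for protein_id in protein_ids:
        lineage = best_hit_lineage.get(protein_id)
        if lineage is None:
            # No hit at all: retention here is an assumption, not from the text.
            if keep_unmatched:
                retained.append(protein_id)
        elif "Protostomia" in lineage:
            retained.append(protein_id)
    return retained


# Hypothetical example: one copepod-like hit, one fish hit, one bacterial hit.
lineages = {
    "p1": ["Eukaryota", "Metazoa", "Protostomia", "Arthropoda", "Copepoda"],
    "p2": ["Eukaryota", "Metazoa", "Deuterostomia", "Chordata"],
    "p3": ["Bacteria", "Proteobacteria"],
}
print(filter_contaminants(["p1", "p2", "p3", "p4"], lineages))
```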

Differentially Expressed Transcripts and Enrichment Analysis
For the differential expression analysis, the RNA-seq reads were first mapped to the assembled transcriptome using the package Kallisto 0.42.4 [36] to obtain a matrix of transcript counts and abundances. The matrix of transcript abundances was used as input for the Bioconductor package edgeR 3.40.2 [37] to identify differentially expressed transcripts (DETs) by comparing the transcripts obtained from the cold group against those from the warm group. In this analysis, transcripts with expression levels below 4 counts per million reads (cpm) were removed, and transcript counts were normalized using the "calcNormFactors" function. Transcripts with an absolute log fold change (LFC) ≥ 2.0 and p < 0.01 were considered DETs.
Functional enrichment analysis was performed using two approaches: Gene Ontology (GO) enrichment analysis and gene set enrichment analysis (GSEA). Enrichment analysis of the GO terms was performed with topGO 2.52.0 [38], using Fisher's exact test and the weight algorithm. Term enrichment was considered at p < 0.01. To avoid redundancy, only those GO terms with fewer than 400 annotated proteins were considered.
GSEA was performed using Fast Gene Set Enrichment Analysis (FGSEA 1.24.0) [39]. This analysis was performed with the following parameters: nperm = 10,000; minSize = 14; maxSize = 400. Gene sets with p < 0.01 were considered differentially enriched. GSEA results were collapsed with the collapsePathways function, implemented in the fgsea package, to reduce redundancy.
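The filtering and thresholding rules above can be made concrete with a short sketch. It is not edgeR: the normalization performed by calcNormFactors and the statistical test itself are omitted, and the cpm here is the plain count divided by library size. The function names are hypothetical, and the rule that a transcript passes when it reaches 4 cpm in at least one library is an assumption, because the text does not say how the cutoff was applied across libraries.

```python
def counts_per_million(counts: list[float]) -> list[float]:
    """Scale raw counts of one library to counts per million reads."""
    library_size = sum(counts)
    return [count * 1e6 / library_size for count in counts]


def passes_cpm_filter(cpm_per_library: list[float], min_cpm: float = 4.0) -> bool:
    """Assumed rule: keep a transcript that reaches min_cpm in any library."""
    return any(value >= min_cpm for value in cpm_per_library)


def classify_transcript(
    log_fc: float, p_value: float, lfc_cutoff: float = 2.0, p_cutoff: float = 0.01
) -> str:
    """Label a transcript for the cold-versus-warm comparison.

    A positive log fold change means higher expression in the cold group.
    """
    if p_value >= p_cutoff or abs(log_fc) < lfc_cutoff:
        return "not significant"
    return "over-expressed" if log_fc > 0 else "down-expressed"


# Hypothetical values for three transcripts.
for log_fc, p_value in [(2.5, 0.001), (-3.1, 0.004), (2.5, 0.05)]:
    print(log_fc, p_value, classify_transcript(log_fc, p_value))
```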
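For a single GO term, the enrichment test described above reduces to a one-sided Fisher's exact test, that is, the upper tail of a hypergeometric distribution. The sketch below shows that basic calculation only; it does not reproduce the weight algorithm of topGO, which additionally accounts for the GO graph structure. The function name and the example numbers are hypothetical.

```python
from math import comb


def go_enrichment_p_value(
    n_universe: int, n_annotated: int, n_selected: int, n_overlap: int
) -> float:
    """One-sided Fisher's exact test for over-representation of a GO term.

    n_universe:  transcripts tested
    n_annotated: transcripts in the universe annotated with the term
    n_selected:  differentially expressed transcripts
    n_overlap:   differentially expressed transcripts carrying the term
    Returns P(X >= n_overlap) for X ~ Hypergeometric.
    """
    upper = min(n_annotated, n_selected)
    favourable = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_universe, n_selected)


# Hypothetical term: 50 of 5000 transcripts annotated, 8 among 100 selected.
print(go_enrichment_p_value(5000, 50, 100, 8))
```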
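The statistic underlying GSEA is a running-sum enrichment score computed over a ranked gene list. The sketch below shows the classic weighted score (weight exponent 1) for a single gene set. It is a simplified illustration, not the fgsea algorithm: permutation-based p-values, the minSize and maxSize limits, and pathway collapsing are not included, and the function name and example values are hypothetical.

```python
def enrichment_score(
    ranked_genes: list[str], scores: list[float], gene_set: set[str]
) -> float:
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes must be sorted by decreasing score (for example, log fold
    change). The running sum rises by |score| / (sum of |score| over set
    members) at each member and falls by 1 / (number of non-members)
    otherwise. The score is the running-sum value farthest from zero.
    Assumes the set contains some, but not all, of the ranked genes.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    hit_weight = sum(abs(s) for s, hit in zip(scores, in_set) if hit)
    n_miss = len(ranked_genes) - sum(in_set)

    running = 0.0
    extreme = 0.0
    for score, hit in zip(scores, in_set):
        running += abs(score) / hit_weight if hit else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical ranking of four genes by log fold change (cold vs. warm).
genes = ["g1", "g2", "g3", "g4"]
log_fc = [3.0, 1.0, -1.0, -3.0]
print(enrichment_score(genes, log_fc, {"g1", "g2"}))
```

A positive score indicates that the set members concentrate at the top of the ranking, a negative score that they concentrate at the bottom.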

Transcriptomic Responses

De Novo Transcriptome Assembly of C. confusus
The de novo transcriptome of C. confusus combining the two MGI-seq libraries (warm and cold) generated 230,861 transcripts and 182,298 genes. The N10 and N50 results were 3841 and 1000, respectively, with an average length of 1012.3 bp (Table 1). We obtained an average of 35% mapping for all the libraries. A total of 101,640 ORF and protein sequences were predicted from the transcriptome. After contaminant sequences were filtered out, 55,530 ORF and protein sequences were retained. The ORF and protein sequences in fasta format are available in Supplementary File S2.
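For readers unfamiliar with the N10 and N50 statistics reported above, the following minimal sketch shows how such length-weighted metrics are commonly defined: the Nx value is the length of the shortest transcript in the set of longest transcripts that together cover at least x% of the total assembled bases. The function name and the toy lengths are hypothetical; the values in the text were produced by the assembly tools, not by this code.

```python
def nx_statistic(lengths: list[int], x: float = 50.0) -> int:
    """Length L such that transcripts of length >= L hold at least x% of all bases."""
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100.0
    covered = 0
    for length in ordered:
        covered += length
        if covered >= threshold:
            return length
    raise ValueError("lengths must be a non-empty list of positive integers")


# Toy assembly of five transcripts (lengths in bp).
toy_lengths = [200, 300, 400, 500, 600]
print(nx_statistic(toy_lengths, 10), nx_statistic(toy_lengths, 50))
```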

DET and Enrichment Analysis
DET analysis was conducted on a subset of 20,640 genes that exhibited cpm values greater than 4. This analysis showed that, under contrasting temperatures, 1828 transcripts were differentially expressed (LFC ≥ 2.0 and p < 0.01); of these, 426 were over-expressed and 1402 were down-expressed in colder sea temperature (Supplementary File S3).
The GO enrichment analysis revealed that 66 processes were up-regulated in cold conditions, of which 35 are associated with biological process (BP), 17 with molecular function (MF), and 14 with cellular component (CC). Conversely, 111 processes were down-regulated, of which 62 relate to BP, 27 to MF, and 22 to CC. The most prominent processes are depicted in Figure 3.
FGSEA identified 59 enriched GO terms (p < 0.01), 41 of which correspond to BP, 15 to MF, and 3 to CC (Figure 4). Particularly in processes related to reproduction and development, we observed transcripts encoding vitellogenin (vit-1, vit-2, vit-4, and vit-6) in the GO term nutrient reservoir activity, and metalloproteinases (mmp-2 and mmp-9) in the GO terms endodermal cell differentiation and positive regulation of vascular-associated smooth muscle cell proliferation. Other processes involving metalloproteinases were also found, including cellular response to UV-A, response to amyloid-beta, and negative regulation of cation channel activity (Figure 3 and Supplementary File S4).

Changes in Infection Levels and Transcriptomic Response
The higher prevalence of C. confusus on Ca. caninus in cold seas is similar to that observed in another tropical sea louse, C. omissus on Scomberomorus sierra [11]. These findings contrast with the population dynamics of sea lice at higher latitudes. Lõhmus and Björklund reported that the occurrence of parasites in general in temperate regions has strong seasonality, with the occurrence and transmission being higher during warm periods [12]. This seasonality has been observed in the sea lice L. salmonis, C. epinepheli, and C. rogercresseyi from temperate latitudes [40][41][42]. This is because sea lice follow the temperature-dependent seasonal patterns common in ectotherms, with their development being faster in summer than in winter [7].
Conversely, in the tropics, it seems that the increase in temperature compromises the reproduction and infectivity of sea lice, as suggested by the low prevalence of C. confusus observed in the warm season. In general, animals in shallow tropical marine ecosystems subsist near their upper temperature limits and are highly vulnerable to increases in temperature [43]. This vulnerability has been shown for the free-living copepod Pseudodiaptomus annandalei, which exhibits prolonged development, reduced size at maturity, smaller clutch sizes, lower hatching success, and reduced naupliar production under the extreme warming conditions (34 °C) that occur frequently during summer periods [44]. In sea lice larvae, body size and food reserves are reduced at higher temperatures, so that the infective copepodid stage remains viable for a shorter period of time, whereas at lower temperatures viability is higher because sea lice produce more and larger eggs and have more time to find a host [7,9,45].

It is possible that in the (tropical) cold season C. confusus produces larger eggs with greater energy reserves, which leads to higher prevalence. This hypothesis is supported by our transcriptomic analysis, which revealed over-expression of vitellogenin and metalloproteinase transcripts in C. confusus sampled in the cold season relative to specimens from the warm season. Vitellogenins are a group of proteins that play a crucial role in the reproductive biology of many animals. In fish, warm temperatures can reduce vitellogenin levels, which might be reflected in poor survival of ova and infertility [46]. In particular, in egg-laying species such as copepods, vitellogenin serves as a reserve of nutrients for the embryos as they develop inside the egg, and on which they survive after hatching until they feed autonomously [47,48]. Semmouri et al. observed that the expression of vitellogenins was positively correlated with the density of free-living copepods [49]. Something similar could occur in C. confusus, such that vitellogenins provide energy for embryonic development and food for larval stages, leading to higher infection rates and, consequently, higher prevalence.
In C. confusus, the over-expression of the nutrient reservoir activity process was driven by the vitellogenin genes vit-1, vit-2, vit-4, and vit-6. vit-1 and vit-2 have been previously identified and characterized in the sea lice L. salmonis and C. rogercresseyi [13,16,50], which is consistent with our results. Conversely, vit-4 and vit-6 have not been previously reported in copepods. In nematodes, increased expression of vit-4 and vit-6 is associated with aging [51,52]. It would be interesting to carry out more studies to validate and characterize other possible vitellogenin precursor genes in parasitic copepods, such as vit-4 and vit-6. The identification and understanding of vitellogenins might provide valuable insights into the reproductive biology and ecological dynamics of sea lice.
Other important transcripts over-expressed in C. confusus in colder tropical seas were the matrix metalloproteinases (MMPs) mmp-2 and mmp-9, which, according to the GO annotation, are involved in different biological processes and molecular functions such as endodermal cell differentiation and positive regulation of vascular-associated smooth muscle cell proliferation. MMPs are involved in cell-cell adhesion, cell and basement membrane remodeling, and the degradation of the extracellular matrix [53]. In invertebrates, MMPs are important for tissue remodeling [54]. To the best of our knowledge, to date there have been no reports of the role of metalloproteinases in copepods. In Drosophila melanogaster, only two MMP genes have been identified, mmp-1 and mmp-2, which participate in various processes during embryogenesis, including histolysis, tissue remodeling, nervous system development, and neuronal dendritic remodeling [54]. In other arthropods, such as the crab Eriocheir sinensis, molting processes and innate immune responses are regulated by MMPs [55], whereas in the silkworm, Bombyx mori, MMPs are functionally required for fat body cell dissociation and ovary development in female pupae [56]. In the parasitic nematode Haemonchus contortus, silencing of mmp-12 significantly reduced egg count, larval hatchability, and adult worm count and size [57]. Based on this evidence, and on the over-expression of MMPs observed herein, we hypothesize that the cold tropical season physiologically favors C. confusus, and that low sea temperatures favor embryo development, which might ultimately lead to higher prevalence.
Climate change is causing an increase in ocean temperatures, and some studies propose that the effects of climate change might be more severe in coastal areas of tropical regions [19,43]. It is projected that by the year 2100, sea temperatures will increase by 2 °C [58]. This increase can have negative consequences on sea lice populations, primarily because a 2 °C rise might exceed the tolerance limits of certain species [8,11,59]. Based on the present study, we could expect that climate change will reduce some tropical sea lice populations during extreme warming events. This could alter ecosystem processes in general because parasites are vital for regulating host abundance [60]; parasite extinctions might have unforeseen costs that impact the health and abundance of a large number of free-living species [61]. However, it would be interesting to investigate whether parasites can develop adaptive molecular mechanisms to counteract the detrimental effects of sea warming, as has been proposed for free-living species [62]. Thus, it is important to perform more detailed studies on the molecular biology of parasites.

De Novo Transcriptome Quality
This study presents a first draft of the de novo transcriptome for the tropical sea louse C. confusus. Only a few de novo transcriptomes are available for parasitic copepods (Table 2). The transcriptome of C. confusus presented herein has good quality metrics, one of which is the guanine-cytosine (GC) content of 36.96%. In free-living and parasitic copepods, GC content is typically around 30-40% [63,64]. The completeness (94.7%) indicated by BUSCO suggests that the majority of the expected genes were successfully recovered, indicating a high-quality assembly and accurate sequencing. Furthermore, the BUSCO score for C. confusus is higher than that of other assemblies of parasitic copepods such as C. rogercresseyi, L. salmonis, and Lernaeocera branchialis (Figure 2).
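As a point of reference for the GC content metric mentioned above, the following minimal sketch computes the percentage of G and C bases across a set of sequences. The function name and the toy sequences are hypothetical, and excluding ambiguous bases (such as N) from the denominator is an assumption; assembly statistics tools may handle them differently.

```python
def gc_content(sequences: list[str]) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    gc_bases = 0
    counted_bases = 0
    for sequence in sequences:
        upper = sequence.upper()
        gc_bases += upper.count("G") + upper.count("C")
        counted_bases += sum(upper.count(base) for base in "ACGT")
    return 100.0 * gc_bases / counted_bases


# Toy transcripts, not real C. confusus sequences.
print(gc_content(["ATGC", "ggcc", "ATNAT"]))
```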

Conclusions
The present study provides the first transcriptome for a tropical sea louse species (C. confusus). The analysis of this transcriptome indicated over-expression of vitellogenin and metalloproteinase transcripts in specimens obtained from colder tropical waters (21 °C) compared with those from warmer waters (30 °C). These proteins may favor growth, development, and survival in the specimens obtained. This might help to explain the higher prevalence of C. confusus on its host Ca. caninus during February, one of the coldest months of the year in the coastal Mazatlán region.

Institutional Review Board Statement: This study did not involve experiments with live animals. All fish were obtained from commercial catches. The fish used to determine the prevalence and mean intensity of C. confusus were already deceased, and the fish from which C. confusus specimens were obtained for transcriptomics studies were handled according to the American Veterinary Medical Association (AVMA) Guidelines for the Euthanasia of Animals (AVMA 2020). None of the species involved are subject to conservation measures.

Figure 1. Gene Ontology processes of annotated transcripts of C. confusus de novo assembly. BP, biological process; CC, cellular component; MF, molecular function. # on the X axis; number

Figure 2. BUSCO completeness analysis of C. confusus transcriptome and other copepods of the order Siphonostomatoida. T, transcriptome; EST, expressed sequence tag. Completeness was assessed using the dataset of orthologous genes conserved across arthropods, which contains 1013 BUSCO groups.

Figure 3. Results of the GO terms enrichment analysis. The graphs report some GO terms assigned to biological process (BP), molecular function (MF), or cellular component (CC) and up-regulated (A) or down-regulated (B) in C. confusus taken from cold waters.


Figure 4. FGSEA results. The graph reports the 24 most enriched GO terms in C. confusus taken from cold waters.


Funding:
The National Council of Humanities, Science and Technology of Mexico (CONAHCYT) provided a postdoctoral scholarship to C.A.P.-A. and Ph.D. scholarships to V.H.C.-B. and J.M.O.-C. The National Autonomous University of Mexico (UNAM), through the Institute of Marine Sciences and Limnology (ICML), funded the laboratory research and analysis.

Table 1. Summary statistics for de novo assembly and annotation of the Caligus confusus transcriptome.

Table 2. Metrics of the de novo assembly transcriptomes of selected parasitic copepods.
* Data not provided.