The Gut Microbiota of Naturally Occurring and Laboratory Aquaculture Lytechinus variegatus Revealed Differences in the Community Composition, Taxonomic Co-Occurrence, and Predicted Functional Attributes

: Sea urchins, in many instances, are collected from the wild, maintained in the laboratory aquaculture environment, and used as model animals for various scientiﬁc investigations. It has been increasingly evident that diet-driven dysbiosis of the gut microbiome could affect animal health and physiology, thereby impacting the outcome of the scientiﬁc studies. In this study, we compared the gut microbiome between naturally occurring (ENV) and formulated diet-fed laboratory aquaculture (LAB) sea urchin Lytechinus variegatus by amplicon sequencing of the V4 region of the 16S rRNA gene and bioinformatics tools. Overall, the ENV gut digesta had higher taxa richness with an abundance of Propionigenium , Photobacterium , Roseimarinus , and Flavobacteriales . In contrast, the LAB group revealed fewer taxa richness, but noticeable abundances of Arcobacter , Agarivorans , and Shewanella . However, Campylobacteraceae, primarily represented by Arcobacter spp., was commonly associated with the gut tissues of both ENV and LAB groups whereas the gut digesta had taxa from Gammaproteobacteria, particularly Vibrio spp. Similarly, the co-occurrence network displayed taxonomic organizations interconnected by Arcobacter and Vibrio as being the key taxa in gut tissues and gut digesta, respectively. Predicted functional analysis of the gut tissues microbiota of both ENV and LAB groups showed a higher trend in energy-related metabolisms, whereas amino acids, carbohydrate, and lipid metabolisms heightened in the gut digesta. This study provides an outlook of the laboratory-formulated diet-fed aquaculture L. variegatus gut microbiome and predicted metabolic proﬁle as compared to the naturally occurring animals, which should be taken into consideration for consistency, reproducibility, and translatability of scientiﬁc studies. in both ENV and LAB gut tissue, with Alphaproteobacteria have the highest degrees (2 and 3), and LAB gut digesta revealed Arcobacter as having the highest degrees (6 and 7). Additionally, the metabolisms of macronutrients in the gut digesta were consistently higher than the gut tissues in both LAB and ENV groups. These results indicate that the


Introduction
Organisms from all domains of life cohabit in the nearshore marine ecosystems worldwide, where they thrive by modulating their community structure with the fluctuating biotic and abiotic factors, food sources, and other perturbances, such as natural calamities and human activities [1][2][3]. The resiliency of these ecosystems in such changing environments is often contingent upon the composition and metabolic activities of microbial communities that help sustain some of the crucial trophic functions [2]. However, oftentimes organisms from their natural habitat are collected and transferred to laboratories and Adult L. variegatus were collected from Saint Joseph Bay Aquatic Preserve, Florida (29.80 • N 85.36 • W), and transported to the University of Alabama at Birmingham (UAB) and held for six months in a recirculating saltwater tank system before tissue collection and processed for the microbiome analysis (LAB group; n = 3), as described elsewhere [41]. The LAB urchins were fed ad libitum once every 24-48 h with a formulated feed consisting of 6% lipid, 28% protein, and 36% carbohydrate relative percentages [32], and the aquaria conditions were maintained at 22 ± 2 • C with a pH of 8.2 ± 0.2 and salinity of 32 ± 1 parts per thousand (ppt.). Similarly, adult L. variegatus (n = 3) from the same location (29.80 • N 85.36 • W) with 1 m 2 area were collected and transported to UAB (ENV group), and tissues were retrieved and processed for microbiome analysis within 7 ± 1 h after collection. The conditions of the water were recorded as follows: 20 ± 2 • C with a pH of 7.8 ± 0.2 and salinity of 28 ± 1 ppt. during sample collection [42]. The gut tissues and the digesta were collected from urchins using the procedures described elsewhere [24]. Briefly, a radial incision was made around the Aristotle's Lantern mastication apparatus of the urchins using sterile instruments. The pharynx enclosed by the lantern was separated from the digestive tract, collected intact, without tearing, and rinsed with sterile phosphate-buffered saline water (1× PBS, pH 7.4) (Fisher Scientific, Hampton, NH, USA). The gut tissues were then carefully opened by incision and the digesta were separated by gently rinsing with sterile 1× PBS (pH 7.4). The gut tissues were examined under a dissecting scope to make sure no residual digesta associated with the tissues before subjected to DNA purification. Similarly, the gut digesta were gently rinsed multiple times in sterile 1× PBS (pH 7.4) to remove any gut tissue-associated residual bacteria. All procedures were followed according to the Animal Care and Use Committee ( The metacommunity DNA samples from urchins were purified using the Zymo Research kit (Irvine, CA, USA). Then high throughput amplicon sequencing (HTS) was performed on an Illumina MiSeq sequencing platform using the 250 bp paired-end kits (Illumina, Inc., San Diego, CA, USA) and by targeting the V4 hypervariable region of the bacterial 16S rRNA gene [41,42]. The resultant sequences were demultiplexed and FASTQ formatted [43,44] and then deposited on the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) under Bioproject #PRJNA291441 and #PRJNA326427 for the LAB and ENV groups, respectively, as described previously [45]. The subgroups for the LAB group were relabeled for this study as LAB.Gut.Tissue (n = 3) and LAB.Gut.Digesta (n = 3), and for the ENV group the samples were relabeled as ENV.Gut.Tissue (n = 3) and ENV.Gut.Digesta (n = 3).
To determine the taxa with significant differential abundance between gut tissue and gut digesta samples in the compartmentalized gut ecosystem, the ASV table of all the gut tissue and gut digesta samples were further analyzed by the Linear Discriminant Analysis (LDA) Effect Size (LEfSe) (v1.0.8.post1) and visualized via python3 package Matplotlib (v3.1.0) and Seaborn (v0.9.0). Briefly, the non-parametric Kruskal-Wallis sumrank test was used between classes to determine significant differential abundance set at a significance of p = 0.05 [59], followed by the pairwise Wilcoxon signed-rank test between the subclasses at a significance of p = 0.05 [60]. The resultant data was used for LDA analysis using the log(10) values at an inclusion threshold of ±3.6 [61,62]. Those taxa with a significant effect size were also listed in a table format, to show the LDA effect size and average relative abundance in each group with standard deviations determined through the statistical analysis of metagenomic profiles. The heatmap was generated using the attribute clustermap of Seaborn (v0.9.0) with average linkage and the Euclidean distance metric of Z-score normalized relative abundance for hierarchical clustering. The representative sequence corresponding to the highly abundant Campylobacteraceae ASV determined in this study was further analyzed by using NCBI Basic Local Alignment Search Tool (BLAST) [63] against the non-redundant nucleotide collection (nr/nt) database optimized for highly similarity sequences via MEGABLAST (http://blast.ncbi.nlm.nih.gov, accessed on 28 August 2020).

Co-Occurrence Analysis of Microbial Taxa
Co-occurrence Network interference (CoNet v1.1.1) [64][65][66] was used to determine significant co-occurrence patterns between the microbial communities of the gut tissue and the gut digesta. To accomplish this, the ASV table data was uploaded into Cytoscape (v3.8.0) [64,65] through the CoNet (v1.1.1) plugin. A parent-child exclusion was applied, and the links between higher-level taxa were not explored. The gut digesta and gut tissue taxonomic entries with a cumulative group sum of 200 and at least 2/3 of samples containing non-zero values were kept with a 10-8 pseudo-count to determine the significant co-occurrences between taxa [64,[66][67][68][69][70][71][72][73]. The 200 highest (most positive) and lowest (most negative) edges were chosen and combined via the union approach using the mean value [68]. Multi-edge scores were then shuffled row-wise at 100 permutations (for the randomization). The brown method [74] was used to merge node pairs, which were assigned via the p-values of the multi-edges. The unstable edges were filtered out, and the threshold was set to a p < 0.05 for significance [64,65] to determine the q-value (the corrected significance value). The final network was assembled in Cytoscape (v3.8.0). The radial layout algorithm was used from the yFiles plugin (v1.0) [74], and NetworkAnalyzer (v2.7) [75] determined the topological parameter (undirected approach). The node sizes were scaled according to their group abundance size, and edges were scales via q-value. Edges were colored via their positive (co-presence; green) and negative (co-exclusion; red) association. The nodes which had a significant number of edges (high degree), closeness centrality, and low betweenness centrality (determined via Network Analyzer (v2.7)) have been described elsewhere as key taxa [68,[76][77][78]. Microsoft Excel software (Seattle, WA, USA) was used to plot these features as a scatter plot (y = closeness centrality; x = betweenness centrality). The top 5 nodes were then selected as likely to be key taxa based on closeness centrality scores.

Predicted Functional Analysis
The predicted functional capacity of gut microbial communities was determined through Phylogenetic Investigation of Communities by Reconstruction of Unobserved States (PICRUSt2) [79]. This was accomplished using the PICRUSt2 picrust2_pipeline.py single script. The script runs sequence placement, hidden-state prediction of genomes, metagenome prediction, pathway-level predictions, and the weighted Nearest Sequenced Taxon Index (NSTI) values. This single script outputs the unstratified EC number metagenome predictions, KO metagenome predictions, and predicted pathway abundances and coverages per sample. To add descriptions of each pathway the "add_descriptions.py" command was used to describe each functional category. This was further analyzed using linear discriminant analysis (LDA) effect size (LEFSe) against predicted functional profiles corresponding to the microbial communities of L. variegatus gut tissues and gut digesta. The visualization was performed using BURRITO software (a visualization tool for exploratory data analysis of metagenomic data) (http://borenstein-lab.github.io/burrito/, accessed on 24 August 2020) estimated functional abundances using the QIIME2 16S rRNA ASV table, and the PICRUSt2 KEGG_metagenome_output table to compute species abundance, function abundance, and the share of each function linked to each species, which then displayed metabolic pathways of amino acid, carbohydrate, energy, membrane transport, cell motility, and cofactors, which were selected for bar plot analysis (plotted using R ggplot package) [80].

Read Quality and Sample Statistics
The raw sequence count of the V4 segment of the 16S rRNA generated from the paired-end Illumina MiSeq across all samples yielded 1,433,598 reads following quality checking (Supplementary Table S1). A total of 834 observed ASVs were identified following quality filtering through QIIME2 (v.2020.8) ( Table 1). The observed taxonomic distribution is listed in Supplementary Table S2.

Taxonomic Distribution across Samples
At the phylum level, the gut tissue of both the LAB and ENV L. variegatus was dominated by Proteobacteria, which is represented by an almost exclusive abundance of the class Epsilonproteobacteria (data not shown). At the highest achievable taxonomic resolution determined through the described bioinformatics tools, these taxa were identified as order Campylobacterales, primarily represented by the Campylobacteraceae family of bacteria, comprising >90% of the relative abundance in all gut tissues ( Figure 1; Supplementary Table S2). Further analysis using the NCBI BLAST alignment (http://ncbi.nlm.nih.gov, accessed on 28 August 2020) of the representative sequence provided an additional resolution to this taxon. From the top 100 assigned identities, 34% were related to Uncultured Arcobacter sp., 8% to Arcobacter sp., 7% as Arcobacter bivalviorum, and 3% to Sulfuricurvum sp., all with an E-value < 5E −83 and percent identity > 89.76% (data not shown). The ENV gut tissue showed a noticeable abundance of Candidatus Hepatoplasma (~2.5-7.5%). However, the relative abundance of this taxon was negligible in the LAB group (<1%).
The gut digesta of both the LAB and ENV groups showed taxa assigned to Gammaproteobacteria to be the most abundant. From this class, Vibrio was found to be the more dominant taxon in the LAB digesta (~35-65%), as compared to the ENV digesta (~9-18%). However, the LAB gut digesta showed a unique abundance of taxa that were not noticeable in the ENV digesta, which included Agarivorans (~2-24%) and Shewanella algae (~2-8%) from Gammaproteobacteria, Rhodobacteraceae from Alphaproteobacteria (~3-7%), and order Campylobacterales (~13-40%). In contrast, the ENV digesta showed Photobacterium of Gammaproteobacteria (~9-11%), Propionigenium of Fusobacteria (~9-12%), Roseimarinus of Bacteroidetes (~9-11%), and a noticeable abundance of Flavobacteriales (~8-13%) (Figure 1; Supplementary Table S2). Taxonomic identities were based on their assignment through the (SILVA v138) database as determined by the Quantitative Insights into Microbial Ecology (QIIME_2, v2020.8) and graphed using R (ggplot package). Sample designations are as follows: EnvGutTissue = naturally occurring L. variegatus gut tissues; LabGutTissue = laboratory-maintained L. variegatus gut tissues; EnvGutDigesta = naturally occurring L. variegatus gut digesta; LabGutDigesta = laboratory-maintained L. variegatus gut digesta. The relative abundance plot was created using Microsoft Excel Software (Seattle, WA, USA). Some of the major taxa are indicated within the graph. Due to the insufficient space within some of the stacked column bars, the common taxa are indicated by the connecting lines. A list of taxa and their abundances is presented in Supplementary Table S1.

Alpha Diversity
The alpha diversity for both LAB gut tissue and digesta showed lower taxonomic diversity as compared to the ENV group (Table 1). Overall, the LAB and ENV gut tissue had the least number of ASVs compared to the gut digesta samples. The ENV gut digesta had the highest diversity and ASV count, followed by the LAB gut digesta that showed a Taxonomic identities were based on their assignment through the (SILVA v138) database as determined by the Quantitative Insights into Microbial Ecology (QIIME_2, v2020.8) and graphed using R (ggplot package). Sample designations are as follows: EnvGutTissue = naturally occurring L. variegatus gut tissues; LabGutTissue = laboratory-maintained L. variegatus gut tissues; EnvGutDigesta = naturally occurring L. variegatus gut digesta; LabGutDigesta = laboratory-maintained L. variegatus gut digesta. The relative abundance plot was created using Microsoft Excel Software (Seattle, WA, USA). Some of the major taxa are indicated within the graph. Due to the insufficient space within some of the stacked column bars, the common taxa are indicated by the connecting lines. A list of taxa and their abundances is presented in Supplementary Table S1.

Alpha Diversity
The alpha diversity for both LAB gut tissue and digesta showed lower taxonomic diversity as compared to the ENV group (Table 1). Overall, the LAB and ENV gut tissue had the least number of ASVs compared to the gut digesta samples. The ENV gut digesta had the highest diversity and ASV count, followed by the LAB gut digesta that showed a comparatively moderate alpha diversity and ASV count. A t-test comparison between the alpha diversity values of the gut tissues from the LAB and ENV groups showed no significant (p > 0.05) differences using the Shannon (p = 0.58) and Simpson (p = 0.227) metrics. However, a comparison between the LAB and ENV digesta showed significant differences in the Shannon (p = 0.02) and Simpson (p = 0.05) values between the two groups.

Beta Diversity
The microbial taxonomic distribution patterns determined through Bray-Curtis metrics across all samples determined that the gut tissues from both the LAB and ENV groups cluster strongly together (Figure 2a). For the gut digesta, distinct subclustering according to group assignment was observed. These cluster patterns were also elaborated in a dendrogram ( Figure 2b). ANOSIM and Adonis also supported the low within-group variation shown by the cluster patterns, revealing R and R 2 values of 0.778 and 0.913, respectively (p = 0.001), thus indicating significant grouping based on biological replicates. significant (p > 0.05) differences using the Shannon (p = 0.58) and Simpson (p = 0.227) metrics. However, a comparison between the LAB and ENV digesta showed significant differences in the Shannon (p = 0.02) and Simpson (p = 0.05) values between the two groups.

Beta Diversity
The microbial taxonomic distribution patterns determined through Bray-Curtis metrics across all samples determined that the gut tissues from both the LAB and ENV groups cluster strongly together (Figure 2a). For the gut digesta, distinct subclustering according to group assignment was observed. These cluster patterns were also elaborated in a dendrogram ( Figure 2b). ANOSIM and Adonis also supported the low within-group variation shown by the cluster patterns, revealing R and R 2 values of 0.778 and 0.913, respectively (p = 0.001), thus indicating significant grouping based on biological replicates.  LEfSe analysis performed between the collective gut tissue and gut digesta showed those taxa that contributed most to the effect size (Figure 3a,b). For the gut tissue samples, the highest effect size was attributed to the abundant Campylobacteraceae taxon (LDA score = 5.65) followed by Candidatus_Hepatoplasma. For the gut digesta, Vibrio showed the highest effect size (LDA score = 5.11). This taxon was more abundant in the LAB digesta (44.66 ± 5.04), as compared to the ENV digesta (16.21 ± 4.99). This was followed by Flavobacteriales, Propionigenium, and Photobacterium, which were noticeably abundant in the ENV digesta, whereas Agarivorans and Rhodobacteraceae in the LAB digesta. Few taxa that were represented at low abundances in the LAB digesta, particularly, Alteromonadales and OM60 presented in Supplementary Table S2 and withing phylum Gammaproteobacteria (Figure 3a,b). Similarly, order Marinilabiliaceae of phylum Bacteroidetes and family Pirellulaceae of phylum Planctomycete were also found in reasonably low abundances in the gut ENV gut digesta. These taxa were negligibly abundant in the gut tissues (Supplementary Table S2).
Appl. Microbiol. 2021, 1, FOR PEER REVIEW 8 LEfSe analysis performed between the collective gut tissue and gut digesta showed those taxa that contributed most to the effect size (Figure 3a,b). For the gut tissue samples, the highest effect size was attributed to the abundant Campylobacteraceae taxon (LDA score = 5.65) followed by Candidatus_Hepatoplasma. For the gut digesta, Vibrio showed the highest effect size (LDA score = 5.11). This taxon was more abundant in the LAB digesta (44.66 ± 5.04), as compared to the ENV digesta (16.21 ± 4.99). This was followed by Flavobacteriales, Propionigenium, and Photobacterium, which were noticeably abundant in the ENV digesta, whereas Agarivorans and Rhodobacteraceae in the LAB digesta. Few taxa that were represented at low abundances in the LAB digesta, particularly, Alteromonadales and OM60 presented in Supplementary Table S2 and withing phylum Gammaproteobacteria (Figure 3a,b). Similarly, order Marinilabiliaceae of phylum Bacteroidetes and family Pirellulaceae of phylum Planctomycete were also found in reasonably low abundances in the gut ENV gut digesta. These taxa were negligibly abundant in the gut tissues (Supplementary Table S2). The effect size was visualized as a bar graph of two classes, one class representing the gut tissue samples (n = 6; green bars) that comprised the subclass laboratory-maintained sea urchin gut tissue (n = 3), and naturally occurring sea urchin gut tissue (n = 3); and the other class representing the gut digesta samples (n = 6; red bars) that comprised the subclass laboratory-maintained sea urchin gut digesta (n = 3) and naturally occurring sea urchin gut digesta (n = 3). The values shown on the x-axis correspond to the log(10) effect size values at an inclusion threshold of ±3.6. (b) Heatmap of the 17 differentially abundant taxa revealed by LEfSe between sea urchin gut digesta and gut tissues. Green bar = gut tissues (n = 6); red bar = gut digesta (n = 6). The relative abundances were converted to Z-scores by taxa, shown in blue color. Relative abundances are also indicated through black track lines. Dendrograms represent a clustering of taxa (columns) and samples (rows) based on hierarchical clustering with Euclidean distance metric and average linkage. (* = Taxonomic classification beyond the level could not be identified in the reference database).

Co-Presence, Co-Exclusion, and Key Taxa in ENV and LAB Digesta
The co-occurrence network representing potential interactions occurring among the microbial taxa from ENV urchin digesta produced 39 nodes and 254 edges (Figure 4a). NetworkAnalyzer (v2.7) determined the network properties and showed an average number of neighbors of 13.026, the characteristic path length of 1.693, with a network density of 0.171, and a clustering coefficient of 0.387. The bacterium with the largest degree was Congregibacter (22 total), with most of these associations shown as co-presence (16 total). Closeness centrality values were plotted against betweenness centrality values to display trends via scatter plot analysis (Figure 4b). The top five candidate key taxa in ENV The effect size was visualized as a bar graph of two classes, one class representing the gut tissue samples (n = 6; green bars) that comprised the subclass laboratory-maintained sea urchin gut tissue (n = 3), and naturally occurring sea urchin gut tissue (n = 3); and the other class representing the gut digesta samples (n = 6; red bars) that comprised the subclass laboratory-maintained sea urchin gut digesta (n = 3) and naturally occurring sea urchin gut digesta (n = 3). The values shown on the x-axis correspond to the log(10) effect size values at an inclusion threshold of ±3.6. (b) Heatmap of the 17 differentially abundant taxa revealed by LEfSe between sea urchin gut digesta and gut tissues. Green bar = gut tissues (n = 6); red bar = gut digesta (n = 6). The relative abundances were converted to Z-scores by taxa, shown in blue color. Relative abundances are also indicated through black track lines. Dendrograms represent a clustering of taxa (columns) and samples (rows) based on hierarchical clustering with Euclidean distance metric and average linkage. (* = Taxonomic classification beyond the level could not be identified in the reference database).

Co-Presence, Co-Exclusion, and Key Taxa in ENV and LAB Digesta
The co-occurrence network representing potential interactions occurring among the microbial taxa from ENV urchin digesta produced 39 nodes and 254 edges (Figure 4a). NetworkAnalyzer (v2.7) determined the network properties and showed an average number of neighbors of 13.026, the characteristic path length of 1.693, with a network density of 0.171, and a clustering coefficient of 0.387. The bacterium with the largest degree was Congregibacter (22 total), with most of these associations shown as co-presence (16 total). Closeness centrality values were plotted against betweenness centrality values to display trends via scatter plot analysis (Figure 4b). The top five candidate key taxa in ENV urchins digesta were chosen based on the descriptions of the topological qualities of taxonomic nodes and ranked via their closeness centrality described elsewhere [77]. The key taxa were represented by Desulfobulaceae, Flavobacteriales, Spirochaetes, Propionigenium, and Agarivorans (Figure 4b).
In contrast, the network of the LAB urchin digesta yielded 35     The network analysis displays edges based on the q-value, which were merged via the brown method at p < 0.05 and are shown as red (co-exclusion) and green (co-presence). The radial layout algorithm was used from the yFiles plugin (v1.0), and NetworkAnalyzer (v2.7) was used to determine the topological parameters (undirected approach). The node size was scaled according to their group abundance size, and edges were scaled via the q-value. (b) A scatter plot analysis was conducted based on the topological metrics selected by NetworkAnalyzer (v2.7) to reveal patterns of key (keystone) species between the naturally occurring L. variegatus gut digesta taxonomy based on the closeness and betweenness centrality scores, and the degree (the number of co-exclusion and co-presence edges). Microsoft Excel software (Seattle, WA, USA) was used to determine the linear regression. The linear regression between closeness and betweenness centrality was displayed as logarithmic (R 2 value = 0.2079). The top 5 taxonomic entries are shown and ranked via their closeness centrality.
Appl. Microbiol. 2021, 1, FOR PEER REVIEW 10 positive) and lowest (most negative) edges were chosen and combined with the union approach using the mean value. Multi-edge scores were then shuffled row-wise at 100 permutations (for the randomization). The network analysis displays edges based on the q-value, which were merged via the brown method at p < 0.05 and are shown as red (co-exclusion) and green (co-presence). The radial layout algorithm was used from the yFiles plugin (v1.0), and NetworkAnalyzer (v2.7) was used to determine the topological parameters (undirected approach). The node size was scaled according to their group abundance size, and edges were scaled via the q-value.   (a) Co-occurrence Network interference (CoNet v1.1.1) was used to determine significant co-occurrence patterns between the microbiota of laboratory maintained (LAB) Lytechinus variegatus gut digesta, as analyzed through Cytoscape (v3.8.0). The gut digesta taxonomic entries with a cumulative group sum of 200 and at least 2/3 of samples containing non-zero values were kept with a 10-8 pseudo-count to determine the significant co-occurrences between taxa. The 200 highest (most positive) and lowest (most negative) edges were chosen and combined with the union approach using the mean value. Multi-edge scores were then shuffled row-wise at 100 permutations (for the randomization). The network analysis displays edges based on the q-value, which were merged via the brown method at p < 0.05 and are shown as red (co-exclusion) and green (co-presence). The radial layout algorithm was used from the yFiles plugin (v1.0), and NetworkAnalyzer (v2.7) determined the topological parameters (undirected approach). The node size was scaled according to the group abundance size, and edges were scaled via the q-value. (b) A scatter plot analysis was conducted based on the topological metrics selected by NetworkAnalyzer (v2.7), to reveal patterns of key species between the LAB L. variegatus gut digesta based on closeness and betweenness centrality scores, and with the degree (the number of co-exclusion and co-presence edges). Microsoft Excel software (Seattle, WA, USA) was used to determine the linear regression. The linear regression between closeness and betweenness centrality was displayed as logarithmic (R 2 value = 0.4528). The top 5 taxonomic entries are shown and ranked via their closeness centrality.
The network of LAB urchin gut tissues yielded 15 nodes, 53 edges (Supplementary Figure S1a). NetworkAnalyzer (v2.7) showed an average number of neighbors of 7.1, a characteristic path length of 1.433, a network density of 0.252, and a clustering coefficient of 0.377. The largest abundance of the taxon was Campylobacteraceae (>90%). Closeness centrality values plotted against betweenness centrality values displayed the top five candidate key taxa, Luteolibacter, Arcobacter, Vibrio, Ruegeria, and Comamonadaceae (Supplementary Figure S1b).
The network of ENV urchin gut tissues yielded 10 nodes, 17 edges (Supplementary Figure S2a). NetworkAnalyzer (v2.7) showed the average number of neighbors of 3.4, the characteristic path length of 1.047, with a network density of 0.189, and a clustering coefficient of 0.305. Similar to the LAB urchin gut tissues, the largest abundance of the taxon was Campylobacteraceae (>90%). The closeness centrality and betweenness centrality values represented in the scatter plot showed Campylobacteraceae, Alphaproteobacteria, Bacteroidales, Flavobacteriales, and Propionigenium as the top five taxa (Supplementary Figure S2b). The LAB and ENV gut tissue networks are provided in the supplementary material, due to the large abundance of Campylobacteraceae (>90%), and minimal taxa diversity. Figure 5. (a) Co-occurrence Network interference (CoNet v1.1.1) was used to determine significant co-occurrence patterns between the microbiota of laboratory maintained (LAB) Lytechinus variegatus gut digesta, as analyzed through Cytoscape (v3.8.0). The gut digesta taxonomic entries with a cumulative group sum of 200 and at least 2/3 of samples containing non-zero values were kept with a 10-8 pseudo-count to determine the significant co-occurrences between taxa. The 200 highest (most positive) and lowest (most negative) edges were chosen and combined with the union approach using the mean value. Multi-edge scores were then shuffled row-wise at 100 permutations (for the randomization). The network analysis displays edges based on the q-value, which were merged via the brown method at p < 0.05 and are shown as red (co-exclusion) and green (co-presence). The radial layout algorithm was used from the yFiles plugin (v1.0), and NetworkAnalyzer (v2.7) determined the topological parameters (undirected approach). The node size was scaled according to the group abundance size, and edges were scaled via the q-value. (b) A scatter plot analysis was conducted based on the topological metrics selected by NetworkAnalyzer (v2.7), to reveal patterns of key species between the LAB L. variegatus gut digesta based on closeness and betweenness centrality scores, and with the degree (the number of co-exclusion and co-presence edges). Microsoft Excel software (Seattle, WA, USA) was used to determine the linear regression. The linear regression between closeness and betweenness centrality was displayed as logarithmic (R 2 value = 0.4528). The top 5 taxonomic entries are shown and ranked via their closeness centrality.
The network of LAB urchin gut tissues yielded 15 nodes, 53 edges (Supplementary Figure S1a). NetworkAnalyzer (v2.7) showed an average number of neighbors of 7.1, a characteristic path length of 1.433, a network density of 0.252, and a clustering coefficient of 0.377. The largest abundance of the taxon was Campylobacteraceae (>90%). Closeness centrality values plotted against betweenness centrality values displayed the top five candidate key taxa, Luteolibacter, Arcobacter, Vibrio, Ruegeria, and Comamonadaceae (Supplementary Figure S1b).
The network of ENV urchin gut tissues yielded 10 nodes, 17 edges (Supplementary Figure S2a). NetworkAnalyzer (v2.7) showed the average number of neighbors of 3.4, the characteristic path length of 1.047, with a network density of 0.189, and a clustering coefficient of 0.305. Similar to the LAB urchin gut tissues, the largest abundance of the taxon was Campylobacteraceae (>90%). The closeness centrality and betweenness centrality values represented in the scatter plot showed Campylobacteraceae, Alphaproteobacteria, Bacteroidales, Flavobacteriales, and Propionigenium as the top five taxa (Supplementary Figure S2b). The LAB and ENV gut tissue networks are provided in the supplementary material, due to the large abundance of Campylobacteraceae (>90%), and minimal taxa diversity.

Figure 6. The distribution of Amplicon Sequence Variants (ASV) read counts with different Nearest
Sequenced Taxon Index (NSTI) values of Lytechinus variegatus gut microbiota using a method described elsewhere [79]. The x-axis indicates the sample category, and the y-axis indicates the cumulative percentage of ASV read counts. The color key indicates the NSTI value for ASVs. The mean NSTI values for ENV gut digesta, ENV gut tissue, LAB gut digesta, and LAB gut tissue were 0.29 ± 0.04 (mean ± SD), 0.20 ± 0.03, 0.08 ± 0.02, and 0.14 ± 0.004, respectively. NSTI < 0.15 indicates a moderate to high quality of metagenomic content prediction.
Overall, the trends of KEGG-Level-2 categories were consistent among sample replicates, irrespective of habitat (Figure 7a). The LAB and ENV gut tissues showed heightened cell motility, replication and repair, translation, and other amino acid metabolism pathways when compared to the gut digesta. In contrast, the gut digesta of both groups showed a heightened abundance of carbohydrate metabolism, lipid metabolism, membrane transport, metabolism of terpenoids [81,82], and xenobiotics biodegradation as compared to the gut tissues. Moreover, these metabolic categories were noticeably enriched in the ENV gut digesta as compared to the LAB gut digesta. KEGG-level-3 observations showed a preferential abundance of methionine metabolism, oxidative phosphorylation, glyoxylate and dicarboxylate metabolism, porphyrin and chlorophyll metabolism, purine metabolism, and pyrimidine metabolism in the gut tissue as well as categories related to transport systems (membrane transport, ABC transport, membrane transporters) (Figure 7b). The gut digesta displayed categories related to peptidase metabolism, folate biosynthesis, glycan and lipopolysaccharide biosynthesis proteins, and histidine metabolism. Gut digesta also displayed a higher abundance of categories related to bacterial motility and chaperones. Other categories that were enriched in the ENV group in contrast to the LAB group included peptidase metabolism, chlorophyll metabolism, purine metabolism, tryptophan metabolism, glutamate metabolism, methionine metabolism, lysine biosynthesis, pyruvate metabolism, and translation pathways (Figures 7a,b and 8). When LAB and ENV gut digesta were compared, the enriched pathways in LAB gut digesta were bacterial chemotaxis and flagellar assembly, both of which were related to cell mobility; on the other hand, the ENV gut digesta showed higher differential abundance in streptomycin biosynthesis and other glycan degradation (Supplementary Figure S3). Figure 6. The distribution of Amplicon Sequence Variants (ASV) read counts with different Nearest Sequenced Taxon Index (NSTI) values of Lytechinus variegatus gut microbiota using a method described elsewhere [79]. The x-axis indicates the sample category, and the y-axis indicates the cumulative percentage of ASV read counts. The color key indicates the NSTI value for ASVs. The mean NSTI values for ENV gut digesta, ENV gut tissue, LAB gut digesta, and LAB gut tissue were 0.29 ± 0.04 (mean ± SD), 0.20 ± 0.03, 0.08 ± 0.02, and 0.14 ± 0.004, respectively. NSTI < 0.15 indicates a moderate to high quality of metagenomic content prediction.
Overall, the trends of KEGG-Level-2 categories were consistent among sample replicates, irrespective of habitat (Figure 7a). The LAB and ENV gut tissues showed heightened cell motility, replication and repair, translation, and other amino acid metabolism pathways when compared to the gut digesta. In contrast, the gut digesta of both groups showed a heightened abundance of carbohydrate metabolism, lipid metabolism, membrane transport, metabolism of terpenoids [81,82], and xenobiotics biodegradation as compared to the gut tissues. Moreover, these metabolic categories were noticeably enriched in the ENV gut digesta as compared to the LAB gut digesta. KEGG-level-3 observations showed a preferential abundance of methionine metabolism, oxidative phosphorylation, glyoxylate and dicarboxylate metabolism, porphyrin and chlorophyll metabolism, purine metabolism, and pyrimidine metabolism in the gut tissue as well as categories related to transport systems (membrane transport, ABC transport, membrane transporters) (Figure 7b). The gut digesta displayed categories related to peptidase metabolism, folate biosynthesis, glycan and lipopolysaccharide biosynthesis proteins, and histidine metabolism. Gut digesta also displayed a higher abundance of categories related to bacterial motility and chaperones. Other categories that were enriched in the ENV group in contrast to the LAB group included peptidase metabolism, chlorophyll metabolism, purine metabolism, tryptophan metabolism, glutamate metabolism, methionine metabolism, lysine biosynthesis, pyruvate metabolism, and translation pathways (Figure 7a,b and Figure 8). When LAB and ENV gut digesta were compared, the enriched pathways in LAB gut digesta were bacterial chemotaxis and flagellar assembly, both of which were related to cell mobility; on the other hand, the ENV gut digesta showed higher differential abundance in streptomycin biosynthesis and other glycan degradation (Supplementary Figure S3). When LAB and ENV gut tissue were compared, only carbon fixation pathways were identified as differentially abundant pathways in the LAB group (data not shown).
Appl. Microbiol. 2021, 1, FOR PEER REVIEW 13 When LAB and ENV gut tissue were compared, only carbon fixation pathways were identified as differentially abundant pathways in the LAB group (data not shown).

Discussion
The sequence dataset used in this study was analyzed using the QIIME2/ASV tools, which have been described as superior tools on multiple fronts, such as the recovery of sequences, by avoiding spurious taxa assignments as well as providing more accurate diversity estimates as compared to previously reported tools, including QIIME1 (v1.9.1) and others [45,47,83,84]. Based upon the current analysis, the microbiota in the L. variegatus gut ecosystem using the rarified HTS data revealed that Arcobacter spp. belonging to Epsilonproteobacteria were the dominant taxon in the gut tissues of both LAB and ENV samples (Figure 1). In general, these findings are comparable qualitatively to other studies [24,25,31,45,85]. Besides Arcobacter, other members of the Epsilonproteobacteria are commonly associated with other marine Echinoderms, such as unculturable Helicobacter in sea stars Patiria pectinifera and Asterias amurensis [86], and Helicobacter, Sulfurospirillum, and Sulfuricurvum among the sea cucumber [87]. The presence of Epsilonproteobacteria (33.6%) has been reported on the surface of a red urchin Loxechinus albus from a fish farming of aquaculture environment [88]. The source of the reported Epsilonproteobacteria in a closed aquaculture environment [88] could be the shedding of the gut contents. Interestingly, the Epsilonproteobacteria was not detected in the gut microbiota of an Antarctic heart urchin (Spatangoida) Abatus agassizii [89], although the explanation for such variation is currently unknown. In contrast, a comparison between the LAB and ENV gut digesta showed noticeable differences in microbial community composition. Overall, both LAB and ENV gut digesta showed Vibrio to be dominant, which is consistent with previously reported urchins Strongylocentrotus droebachiensis, Tripneustes ventricosus [30], Strongylocentrotus intermedius, Strongylocentrotus nudus [90], Echinus esculentus [91], L. variegatus [85], Hemicentrotus pulcherrimus [92], and Strongylocentrotus purpuratus [41], as well as in the coelomic fluid of Paracentrotus lividus [93].
However, the higher relative abundance of Vibrio in the LAB group could be due to the relatively higher salt concentration (32 ± 1 ppt.), pH (~8.4), and temperature (22 ± 2 • C) used in the laboratory aquaculture as compared to the conditions of their natural habitat (salinity = 28 ± 1 ppt.; pH = 7.8 ± 0.2; temperature = 20 ± 2 • C) [94]. Other differences included a prevalence of Photobacterium in the ENV group, certain species of which have been reported to perform lipid metabolisms in another urchin species, Paracentrotus lividus [95][96][97]. Additionally, the prevalence of a strictly anaerobic genus Propionigenium of phylum Fusobacteria in the ENV groups likely due to the higher fraction of non-digestible carbohydrates in seagrass compared to the laboratory-formulated feed. Propionigenium spp. generate the fatty acid propionate from succinate [98][99][100] and may benefit urchins through such health-related effects as mitigation of inflammation [99].
Although the gut tissues in both ENV and LAB groups had comparable taxonomic diversity, the differences in the alpha diversity observed could be due to the diverse bacterial taxa in the nearshore Gulf of Mexico marine habitat (Table 1). Additionally, fluctuations of the abiotic factors such as pH, temperature, photoperiod, and salinity could also promote differences in the microbial diversity on a temporal scale [101][102][103][104]. Moreover, it has been reported that diet plays a significant role in the gut microbial composition in a wide range of organisms [105,106]. Although L. variegatus generally grazes seagrass in their habitat, they also consume alternate food sources such as detritus materials, various algae, and their epibionts [33,35]. In contrast, the defined laboratory-maintained aquaculture conditions could have contributed to the low alpha diversity in the gut digesta of the LAB group. This could be due to the small number of species outcompeting other species due to excess nutrient loading in laboratory-maintained conditions [107]. Although a similar reduction of the alpha diversity has been reported [8,108,109], this is not a universal trend in laboratory-maintained animals, as had been observed in the zebrafish [5], baboon [110], hydra [111], fruit flies [112], and birds [113][114][115], at least at the phylum and class levels. Thus, in our study, the replacement of the diet from their natural habitat with the defined nutritionally balanced feed used ad libitum for the LAB group likely restructured the gut microbial community but may need further investigation. However, it is important to note that the differences in the gut microbiota in experimental animals when maintained and fed a formulated diet as compared to animals used directly from the wild could impact the interpretations of the outcomes of various scientific studies [12,15].
The ANOSIM and Adonis statistics (p = 0.001) supported distinct cluster patterns of the four groups comprising the gut tissue and gut digesta of the LAB (n = 3) and ENV (n = 3) urchin's biological replicates used in this study. The microbial communities of the gut tissue and gut digesta showed unique cluster patterns of biological replicates as determined by beta diversity analysis (Figure 2a,b). Notably, the gut tissues of both the LAB and ENV groups clustered together, indicating a gut tissue-specific microbiota likely maintained following the transition to the laboratory aquaculture environment. Although the LAB and ENV gut digesta showed common higher taxonomic classifications, such as Gammaproteobacteria, differences at the lower taxonomic levels were noted. Such results suggest that habitat may influence the microbial community composition in the gut digesta environment. The LEfSe analysis between the gut tissue and gut digesta across both groups predicted Campylobacteraceae and Vibrio to contribute to the uniquely separated microbial ecology in the L. variegatus gut ecosystem (Figure 3a,b).
To gain further insight into the interactive aspects of the microbiota, we used CoNet (v1.1.1) [64,66,67] analysis for theoretical modeling of relationships occurring at the taxonomic level in the gut digesta of ENV and LAB urchins. Overall, the CoNet identified highly abundant Propionigenium as the key taxon in the ENV gut digesta (45%) (Figure 4a,b), with a high degree of positive associations with other taxa, while it exhibited relatively low abundance in the LAB gut digesta (~2%). In addition, the Propionigenium also exhibited a significant degree of positive associations (six out of seven) with taxa in the LAB digesta other than the ones in the ENV digesta (Figure 5a,b). Interestingly, Propionigenium was also found in the purple urchins, S. purpuratus, suggesting their dominant influence in structuring the gut microbial communities in urchins [41,76]. Propionigenium is known for energy metabolism primarily through membrane-associated energy transduction and ATP synthesis [116]. Thus, a high degree of association and abundance of Propionigenium in the gut digesta microbial community likely support the CoNet results.
Two of the taxa identified in both LAB and ENV gut digesta were Desulfobacteraceae and Desulfovibrio. The highest resolution in the ENV gut digesta, which were included in the top 15 most likely key taxa in the gut digesta, were sulfur-reducing bacteria (<1%), which represented a total degree of 9 and 18 associations with the majority of those indicated as coexclusion relationships. These taxa belong to the phylum Deltaproteobacteria and represent species known to use sulfate as electron acceptors [117]. However, in LAB gut digesta, the resolution significantly decreased. The CoNet (v1.1.1) [64][65][66] analysis was used to perform the theoretical modeling of the relationships occurring between microbial taxa in the gut tissue between ENV and LAB urchins. Campylobacteraceae (~90%) abundance was responsible for the majority of LAB and ENV tissue, and this resulted in a reduced network structure. Overall, the CoNet identified Arcobacter as the key taxon (<1%) for the LAB gut tissue, with a high degree of positive associations (7 out of 13) (Supplementary Figure S1a). Alphaproteobacteria was identified as a key taxon for the ENV gut tissue, with a high degree of positive associations (3 out of 5) (Supplementary Figure S1b). It is important to note that this network will be influenced by Campylobacteraceae, as well as the reduced number of taxa to work with.
Although Vibrio comprises a noticeably abundant taxon in both the ENV and LAB groups, low-abundance taxa such as Congregibacter (<1%), a member of Gammaproteobacteria, in the ENV group exhibited a high closeness centrality compared to betweenness centrality, as well as a large number of edges (22 total), with 16 positive associations to other taxa (Figure 4a,b). The Congregibacter is an aerobic, bacteriochlorophyll a-producing, a photosynthetic bacterium found in marine environments, and contributes to marine carbon cycling [118]. Thus, it is reasonable to predict that this bacterium, although found in relatively low abundance, will exhibit a high degree of interaction with other taxa and potentially play an important role in energy metabolism in the gut digesta of the ENV sam-ples. In the LAB group, Octadecabacter (<1%), a member of Proteobacteria, was observed to have a high closeness centrality compared to betweenness centrality, and a large number of edges (22 total), with 20 positive associations to other taxa (Figure 5a,b). Octadecabacter has been described as part of the marine Roseobacter clade, which is found as a dominant symbiont in the brittle star Amphipholis squamata inhabiting shallow intertidal cold coastal waters, and which might play a possible role in nutrient uptake [119]. Octadecabacter has also been found in extremely cold Arctic and Antarctic Sea ice ecosystems, with the possible role of nitrogen metabolism [120]. However, the presence of this bacterium and its potential metabolic role in LAB urchins in our study is unclear and may need further investigation in future studies.
The predicted functional analysis of the gut microbial communities using PICRUSt2 (v2.3.0-b) indicated carbohydrate, amino acid, and lipid metabolisms to be more dominant in the gut digesta than in the gut tissue in both LAB and ENV groups (Figure 7a,b). These results indicate that the gut digesta is the primary location for the microbial-driven metabolism of environmental and laboratory-prepared dietary macromolecules [121][122][123]. Moreover, these metabolic categories were enriched in the ENV gut digesta, suggesting a higher metabolic capacity. Conversely, the gut tissues of both groups showed energy metabolism to be significantly heightened as compared to the gut digesta ( Figure 8). This category includes nitrogen and sulfur metabolisms, which have been attributed to the microbial communities of other urchins [29,30,93,124,125]. Additionally, Arcobacter spp. has been described as a chemolithoautotrophic bacterium [78], performing crucial biochemical processes in the marine environment, such as sulfur oxidation in hydrothermal vents [126] and nitrogen metabolisms [127]. However, whether these metabolisms are of any benefit to their host's health and nutrition, including the specific metabolic input of the dominant Epsilonproteobacteria of the gut tissue, remains to be clarified.
Bacterial motility is an essential virulence factor in pathogens, including Vibrio, which can affect a vast range of aquatic organisms [128]. In our study, as compared to the ENV digesta, the LAB gut digesta showed higher differential abundance in cell motility (bacterial chemotaxis and flagellar assembly in Supplementary Figure S3). This could have caused adverse outcomes reflecting the lower alpha diversity in the LAB gut digesta (Table 1) [107,129]. The glycan degradation pathway, on the other hand, was the most differentially abundant in ENV gut digesta group (Supplementary Figure S3). It has been reported that gut microbiota can forage glycans from the host as a nutrient source [130,131]. However, further studies are needed to better understand the nutrient exchange between the host and the gut digesta to maintain a steady-state metabolic benefit for this animal.

Conclusions
In conclusion, L. variegatus maintained a distinct microbial community representing primarily Arcobacter spp. in the gut tissues. Predicted functional roles indicated that this taxon is involved in the energy metabolisms irrespective of the laboratory conditions (LAB) or their natural habitat (ENV). While a comparison of the microbiota at the most resolved level exhibited distinct differences between the gut digesta of the LAB and ENV groups, consistencies were observed at the phylum or the class level. The co-presence, co-exclusion, and the key taxa determined through CoNet network revealed an abundance of Vibrio in both ENV and LAB gut digesta, with ENV gut digesta having an abundance of Flavobacteriales, Propionigenium, Photobacterium, and Campylobacteraceae. However, the LAB gut digesta displayed only an abundance of Vibrio and Arcobacter. The ENV gut digesta revealed Congregibacter and Rhodobacteraceae as having the highest degrees (22 and 21), and LAB gut digesta revealed Octadecabacter and Ruegeria as having the highest degrees (22 and 21). The CoNet analysis revealed an abundance of Campylobacteraceae in both ENV and LAB gut tissue, with Alphaproteobacteria have the highest degrees (2 and 3), and LAB gut digesta revealed Arcobacter as having the highest degrees (6 and 7). Additionally, the metabolisms of macronutrients in the gut digesta were consistently higher than the gut tissues in both LAB and ENV groups. These results indicate that the gut digesta is potentially the primary location in the L. variegatus gut ecosystem at which maximum microbial energy metabolisms occur.
The results from this provide an insight into the potential impact of the diet on structuring and predicted metabolic functions of the gut microbial communities of laboratory aquaculture vs. naturally occurring L. variegatus. Such changes likely contribute to the host's metabolism and health, further emphasizing the importance of gut microbiome composition and the selection of diet, which may affect the reproducibility, consistency, and interpretation of scientific data when L. variegatus are used directly from their natural habitat, compared to a laboratory aquaculture environment, in various laboratory experiments. However, we realize that future studies may be needed to further elaborate the significance of the close association of Epsilonproteobacteria with the gut tissues and the potential role of habitat-specific or laboratory-formulated diet in restructuring the microbiota, particularly in the mucous-encapsulated high-energy gut digesta that contribute to the host's metabolism and health.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/ 10.3390/applmicrobiol1020016/s1, Figure S1: Co-occurrence Network (CoNet) patterns of the laboratory maintained (LAB) Lytechinus variegatus gut tissues displaying (A) edges with taxonomic co-exclusion and co-presence, and (B) scatter plot analysis showing top 5 taxa ranked via their closeness centrality, Figure S2: Co-occurrence Network (CoNet) patterns of the naturally occurring (ENV) Lytechinus variegatus gut tissues displaying (A) edges with taxonomic co-exclusion and co-presence, and (B) scatter plot analysis showing top 5 taxa ranked via their closeness centrality, Figure S3: Linear discriminant analysis (LDA) effect size (LEfSe) predicted functional profiles corresponding to the microbial communities of the Lytechinus variegatus gut digesta between LAB and ENV samples. Table  S1: The QIIME2 (v.2020.8) ASV table data generated through DADA2 (v1.10) for all samples in this study with taxonomy assigned; Table S2: A list of Amplicon Sequence Variant (ASV) found by QIIME2 output in the analyzed samples.