Metatranscriptomic Analysis of Oil-Exposed Seawater Bacterial Communities Archived by an Environmental Sample Processor (ESP)

The use of natural marine bacteria as “oil sensors” for the detection of pollution events can be suggested as a novel way of monitoring oil occurrence at sea. Nucleic acid-based devices generically called genosensors are emerging as potentially promising tools for in situ detection of specific microbial marker genes suited for that purpose. Functional marker genes are particularly interesting as targets for oil-related genosensing but their identification remains a challenge. Here, seawater samples, collected in tanks with oil addition mimicking a realistic oil spill scenario, were filtered and archived by the Environmental Sample Processor (ESP), a fully robotized genosensor, and the samples were then used for post-retrieval metatranscriptomic analysis. After extraction, RNA from ESP-archived samples at start, Day 4 and Day 7 of the experiment was used for sequencing. Metatranscriptomics revealed that several KEGG pathways were significantly enriched in samples exposed to oil. However, these pathways were highly expressed also in the non-oil-exposed water samples, most likely as a result of the release of natural organic matter from decaying phytoplankton. Temporary peaks of aliphatic alcohol and aldehyde dehydrogenases and monoaromatic ring-degrading enzymes (e.g., ben, box, and dmp clusters) were observed on Day 4 in both control and oil-exposed and non-exposed tanks. Few alkane 1-monooxygenase genes were upregulated on oil, mostly transcribed by families Porticoccaceae and Rhodobacteraceae, together with aromatic ring-hydroxylating dioxygenases, mostly transcribed by Rhodobacteraceae. Few transcripts from obligate hydrocarbonoclastic genera of Alcanivorax, Oleispira and Cycloclasticus were significantly enriched in the oil-treated exposed tank in comparison to control the non-exposed tank, and these were mostly transporters and genes involved in nitrogen and phosphorous acquisition. This study highlights the importance of seasonality, i.e., phytoplankton occurrence and senescence leading to organic compound release which can be used preferentially by bacteria over oil compounds, delaying the latter process. As a result, such seasonal effect can reduce the sensitivity of genosensing tools employing bacterial functional genes to sense oil. A better understanding of the use of natural organic matter by bacteria involved in oil-biodegradation is needed to develop an array of functional markers enabling the rapid and specific in situ detection of anthropogenic pollution.


Introduction
The use of ecogenomic sensors and devices that can automatize molecular analytical techniques are promising for in situ marine applications. Examples of these applications include understanding fundamental processes in complex and dynamic environments, detection of episodic events such as oil leakages from subsea operational infrastructures and monitoring of ecosystem recovery [1][2][3][4]. The development of microbial target-based genosensors that specifically target the genetic material of microorganisms that respond to oil for use as oil sensors and their automatization for in situ monitoring of oil pollution events at sea is novel [5,6]. Microbial communities inhabiting the water column of the global ocean are appropriate for the development of genosensors as these communities are ubiquitous and respond rapidly to changes in their environment [7]. In a recent study, bacteria were found to be the most responsive to drilling and production operations in the South Taranaki Bight region of the Tasman Sea, further emphasizing the need for genosensing targets based on this domain of life [8]. Next-generation sequencing (NGS)-based approaches targeting microbes, or more specifically bacteria in aquatic environments, have been suggested and explored as the new frontier in water quality assessment [9]. The assessment and evaluation of this type of novel approach towards integration into routine monitoring practices has been increasing in the past decade [10][11][12]. NGS-based analysis of microbial response can also be used to identify potential bioindicator species and genes. Quantitative molecular assays targeting indicator bacterial species can then be developed and integrated into ecogenomic devices. Ecogenomic devices equipped with molecular assays targeting specific biomarker genes, could have a significant impact for application in industrialized areas of the ocean where legal requirements governing anthropogenic activities call for novel technologies with quick response times to mitigate their effect on the ocean ecosystem.
One off-the-shelf genosensing device is the Environmental Sampling Processor (ESP), a fully developed and automated ecogenomic sensor capable of performing different analytical assays at sea for remote detection of marine microbes, microalgae and small invertebrates. This unique device is the product of many years of development, during which the ESP has evolved from a so-called 2G-ESP (large, fixed on a buoy or a drifter) to the present 3G-ESP (smaller, mobile in a LR-AUV) [13,14]. While the ESP device cannot perform in situ RNA sequencing, metatranscriptomic analysis of ESP archived samples have been performed to uncover in situ patterns of microbial activity [15]. Although the ESP was not developed for the purpose of environmental monitoring of offshore industrial activities such as those related to oil and gas, two recent studies indicated it could potentially be used in that context by quantifying taxonomic markers of oil pollution using specific 16S rRNA genes [16,17].
Oil contamination is known to induce characteristic shifts in the composition of pelagic marine bacterial communities resulting in higher abundance of some species within the Gamma-and Alphaproteobacteria classes [18][19][20][21][22]. There are several bacterial groups which have been suggested as indicators of oil pollution at sea. Typical microbial genera associated with marine oil spills in different environments are alkane-degrading microbes such as Alcanivorax, Oleispira and at least one species of Marinobacter (M. hydrocarbonoclasticus), which are succeeded by PAH-degrading groups such as Cycloclasticus and some species of Colwellia [20,[23][24][25]. Among them, Oleispira and uncultured Oceanospirillaceae were identified as potentially good indicators for oil pollution in cold waters [16,26] responding quickly and increasingly to oil load in water. In general, 16S rRNA gene-based amplicon sequencing studies are used to characterize the composition of microbial communities and to follow their temporal succession. In comparison to metagenomics or metatranscriptomics, the bioinformatic steps of a 16S rRNA-based approach are better standardized and biostatistical analysis of taxa count tables are also better established. However, the full potential of the phylogenetic marker-based analysis for oil detection can only be explored when a significant increase in obligate hydrocarbonoclastic bacteria (OHCB) is observed [27]. This is because OHCB can utilize only a very narrow range of carbon substrates (i.e., hydrocarbons), and hence their bloom can directly be linked to the occurrence of oil in water. Besides oil-degraders, there are hydrocarbon-tolerant opportunistic bacteria which can also benefit from the occurrence of oil in seawater, even though they are not directly involved in oil biodegradation. The drawback of the phylogenetic marker-based analysis is the lack of a direct link to active metabolic pathways which provide a more precise picture about ongoing processes. Therefore, functional markers of such specific pathways related to petroleum degradation are better suited to track occurrence of oil in water directly and can be explored for oil sensing [28]. Some studies have used quantitative functional gene-based biomarker approaches such as qPCR to follow bioremediation processes (mostly soil and sediment) using primers designed for specific hydrocarbon degradation-relevant genes, such as alkane 1-monooxygenase and ring-hydroxylating dioxygenases [29][30][31]. Other functional gene marker-based environmental assessments have been mainly limited to DNA-based GeoChip-microarray analysis or other microarray systems [32,33]. The potential of using mRNA transcripts of genes involved in oil biodegradation as biomarkers of oil pollution has not been fully explored [34]. Only a few studies have attempted to pursue a metatranscriptomic approach in relation to marine hydrocarbon pollution [35][36][37][38][39][40].
In the present study, two 2G-ESP instruments were used in a laboratory-controlled setup to analyze and preserve bacterioplankton samples from seawater collected during an oil exposure experiment mimicking realistic oil spill condition. The samples were used to perform post-retrieval metatranscriptomic analyses. The main aim of this study was to identify transcripts differentiating oil-exposed seawater samples from non-exposed, with the long-term vision of developing functional gene-based, ESP compatible assays for real-time in situ oil pollution detection. The metatranscriptomics analysis was carried out to perform: (1) a taxonomic investigation of the transcripts to identify the pool of active bacteria increasing significantly in oil treated seawater; and (2) a differential expression analysis of genes between control and oil exposed water samples to identify a pool of functional genes directly or indirectly involved in oil compound biodegradation. The main hypotheses tested in this study were that in the presence of oil: (1) there is a pool functional bacterial genes that characterize the active OHCB communities in oil-exposed seawater, which can be explored to find specific oil markers used for genosensing; (2) specific changes such as increase in expression of hydrocarbon specific initial oxygenase genes involved in the activation of aliphatic and aromatic hydrocarbons can be used for early detection of oil by genosensing; and (3) there is an increase in the abundance of genes from OHCB. The outcome of this study is a milestone to support further development toward the use of novel ecogenomic sensors such as ESP for monitoring transient oil pollution events from offshore industrial activities.

Experimental Setup
Seawater was collected through a pipeline from approximately 20 m depth (temperature and salinity were approximately 8 • C and 33 PSU, respectively) from a pristine coastal area near Kvitsøy (59 • 03 44" N 05 • 24 42" E) (North Sea, outer edges of Boknafjorden, Norway) in a 1 m 3 container in April 2016 (spring season). The water sample was transported to the NORCE facility (Randaberg, Norway) within approximately 2 h. Clean 150 L tanks were filled with the Kvitsøy seawater and allowed to acclimate in the laboratory for 24 h at 5 • C in the dark prior to the start of the experiment. The experimental exposure system was based on the setup described in [41]. Oil treated tanks (O) were prepared by adding a pre-mixed stock solution of oil (a naphthalenic crude oil from the Troll oil field) and seawater to the tanks to obtain a nominal oil concentration of 10 mg·L −1 at the start of the experiment. At the center of each tank, a funnel attached to a water pump (Aquaclear 70, Rolf C. Hagen (USA) Corp., MA, USA) was used to pull water and oil downwards in the tank, allowing for sufficient energy to mix water and oil. There was no lid on the tanks to allow for natural weathering from evaporation of the most volatile compounds to mimic processes as they would occur in the open ocean. Control tanks (C) were subsequently prepared with the same funnel/pump system but without addition of oil and were placed in a separate climate room also set at 5 • C preventing contamination from the exposed room. For each control and oil exposed treatment, three tanks were used but only one for each treatment was dedicated to the present ESP metatranscriptomic study. The experiment was conducted in the dark.

Bacterioplankton Analysis
Seawater samples were collected manually in autoclaved glass bottles from the control and oil treated tanks daily for 7 days. One sample of 1000 mL of water was collected using a pipe placed in the middle of each tank at the start of the experiment (T0, after 24 h of acclimation to laboratory conditions and prior to oil addition) and then 700 mL was collected each following day for 7 days. Seawater samples were immediately brought to two ESPs placed in a separate room, one dedicated to processing exposed and one dedicated to processing samples from non-exposed tanks. The glass bottles were connected to the ESP instruments which processed further the samples automatically. The ESPs filtered the samples onto 25 mm diameter, 0.22 µm pore size hydrophilic Durapore filters (Millipore, Burlington, MA, USA) and archived the filters after addition of RNAlater™ stabilization solution (Ambion, Austin, TX, USA) for RNA preservation [15,42]. After each operation, the filters were retrieved from the ESP and stored at −80 • C until use. In total, 15 filters were collected during the experiment (1 filter/day/condition and the initial seawater), of which 5 were selected for the metatranscriptomics analysis.

Chemical Analysis
Seawater was collected from each of the tanks for total hydrocarbon content (THC) and polycyclic aromatic hydrocarbon (PAH) composition by GC-FID and GC-MS analysis, respectively, on Days 1, 3, 5 and 7. PAH analysis was performed for the 16 EPA (Environmental Protection Agency, USA) priority aromatic compounds [43]. All analyses were performed by Intertek West Lab AS (Tananger, Norway) according to ISO 28540:2011. The limit of quantification for THC analysis was 0.4 mg·L −1 and for PAHs, this ranged from 0.01 to 0.02 µg·L −1 .

Nucleic Acid Extractions and Sequencing
Total RNA and DNA was extracted from five archived filters for metatranscriptomics and amplicon sequencing analysis collected after 24 h acclimation (T0), 4 days (T4C and T4O) and 7 days (T7C and T7O). The extraction was carried out using the AllPrep Bacterial DNA/RNA/Protein Mini Kit (Qiagen, Hilden, Germany) following the manufacturer's instructions. All samples were eluted in 50 µL volume. RNA concentration was quantified on a Qubit instrument (Thermo Fisher Scientific, Carlsbad, CA, USA) and the quality of the samples was evaluated on agarose gel.

Metatranscriptomics
Total RNA (2-8 µg) (details in Table S1) was sent to the sequencing facility at the University of Cambridge (Cambridge, UK) where it was subjected to Ribo-Zero ribosomal RNA removal prior to library preparation using Illumina TruSeq Stranded mRNA kit (Illumina, CA, USA). Paired-end (75 bp) sequencing was carried out on a NextSeq500 platform (Illumina, CA, USA) using the 150-cycle kit.

Amplicon Sequencing
The SSU rRNA gene was amplified using forward Bakt_341F (5'-CCTACGGGNGGCWGCAG-3') and reverse primer Bakt_805R (5'-GACTACHVGGGTATCTAATCC-3') primers, which target the V3 and V4 region [44,45]. The PCR master mix consisted of 12.5 µL of AmpliTaq Gold®360 PCR Master Mix (Thermo Fisher Scientific, Carlsbad, CA, USA), 1 µL of each primer to the final concentration of 250 nM, 2.5 µL of GC enhancer (Thermo Fisher Scientific, Carlsbad, CA, USA), 2 µL of DNA (1:10 diluted) and double-distilled water (ddH 2 O) to the final volume of 25 µL. The reaction cycling conditions were: 95 • C for 10 min, followed by 31 cycles of 95 • C for 30 s, 54 • C for 45 s and 72 • C for 7 min, with a final extension step at 72 • C for 7 min. To ensure amplification of uncontaminated products, all PCRs included negative controls (no template samples). The amplicons were prepared in triplicates per individual sample and then pooled, purified using PureLink™ Quick Gel Extraction and PCR Purification Combo Kit (Thermo Fisher Scientific, Carlsbad, CA, USA) and finally quantified using a Qubit®Fluorometer (Thermo Fisher Scientific, Carlsbad, CA, USA). The prepared amplicons were sequenced from both ends at BaseClear BV (Leiden, Netherlands) using the Illumina MiSeq platform (Illumina, CA, USA). The raw sequencing reads are publicly available in the Sequence Read Archive (SRA) NCBI within the PRJEB32487 study under the accession numbers ERS3402758-ERS3402761.
High-quality (quality score > 20 according to FastQC) non-ribosomal RNA reads from all five samples were then combined in one file and de novo assembled with IDBA-Tran using default parameters [51]. Only those reads which mapped back to these contigs were used in downstream analysis in order to decrease the probability of spurious annotations of short reads [52]. Contigs were used to assess the taxonomic composition of the metatranscriptome. Protein coding DNA sequences (CDS, i.e., genes-these terms are used interchangeably in the text) were predicted from contigs using MetaGeneMark [53]. The read counts of assembled transcripts and putative CDS were determined by mapping the quality-filtered reads of each sample against the assembled contigs and CDS using BWA [54]. Reads which did not map back to contigs or CDS were not used in further analysis. The read counts of transcripts and CDS sequences were normalized as transcripts per million (TPM) according to Wagner et al. [55]. The statistics of each read processing step are shown in Table S2. The raw sequencing reads are publicly available in the Sequence Read Archive (SRA) NCBI under the project number PRJNA556155 with the following individual SRA Run IDs: T0, SRR9734925; T4C, SRR9734976; T4O,  SRR9940697; T7C, SRR9940698; and T7O, SRR9940650. For taxonomic classification, contigs were aligned to the NCBI non-redundant (nr) protein database (ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz, downloaded in July 2019) using DIAMOND v0.9.9.110 [56] and blastx with default parameters. The blastx results were analyzed using the longReads LCA algorithm of MEGAN6 with parameters of percent to cover = 80.0, min score = 50.0, max expected = 0.001, top percent = 10.0, min support percent = 0.0 and min support = 1 [57]. MEGAN assignments were then exported using the "taxonName to readName" separator to obtain a list of contig IDs that aligned to each taxon. To obtain the abundance for each taxon in each sample, contig IDs ("readName") were replaced by TPM counts using a custom script. STAMP was used to generate a principal component analysis (PCA) plot based on relative abundance of families with sum relative abundance > 0.001% [58].
Functional annotations of all CDS were generated with DIAMOND against the NCBI nr database and using the online KEGG GhostKOALA annotation pipeline [59]. STAMP was used to generate a PCA plot based on the TPM counts of unique KEGG orthologs (6548 KOs). To identify oil-specific KOs, Venn diagram analysis was performed using the R package VennDiagram (https://CRAN.R-project. org/package=VennDiagram) [60]. Briefly, a list of KOs with TPM > 0 was generated for each sample, then unique and shared KOs were identified and retrieved.
To identify CDS that were significantly more abundant in oil treated samples, differential expression analysis was carried out. The raw read count table for CDS was used as input for the DESeq2 available in R/Bioconductor package [61]. As differential expression analysis requires a minimum of 2 replicates for each condition, the Day 4 and Day 7 datasets for each condition (control and oil exposed) were treated as replicates, i.e., oil samples (T4O and T7O) were compared with no-oil samples (T4C and T7C). This decision was based on additional amplicon sequencing data (not shown), which suggested that the temporal variation between Day 4 and Day 7 was similar to the variation between biological replicates, i.e., the different tanks. CDS with log2 fold change (log2 FC) values of ≥ 1 and p-values of < 0.05 were defined as significantly differentially abundant/expressed. The putative function of upregulated CDS was determined using the online KEGG GhostKOALA annotation pipeline [59]. The result of this analysis was a table of genes, annotated with KEGG Orthology (KO) terms organized according to the KEGG PATHWAY Database (https://www.genome.jp/kegg/pathway.html). KO terms that were not in the KEGG pathways but were relevant to oil degradation, Bacterial motility proteins (ko02035), Secretion system (ko02044) and Transporters (ko02000), were manually collected according to the KEGG BRITE Database (https://www.genome.jp/kegg/brite.html). A custom script was used to obtain the TPM count table, based on the list of CDS IDs corresponding to each KO term. Upregulated CDS with functions linked to hydrocarbon degradation were taxonomically annotated using blastp search with DIAMOND using the combination of the best blast hit and the LCA algorithm (top parameter set to 2%).

Amplicon Sequencing
The raw sequences were processed using USEARCH (v10.0.240_i86linux32) according to the recommended settings [62,63]. The sequences were merged and filtered according to length and quality criteria (quality threshold of Q20) and sequencing reads that did not meet the quality requirements were discarded. Unique sequences were extracted, then OTU clustering (at 97% similarity) and a de novo chimera identification step was performed using the UPARSE-OTU algorithm. Singletons were removed and, finally, reads were mapped to the final OTUs list to assign abundances to each OTU. The OTU table was then rarefied to a depth of 20,000 reads per sample. Unique bacterial OTUs we classified using The Ribosomal Database Project (RDP) Naive Bayesian [64] database (Edgar R., rdp_16s_v16.fa) and OTUs taxonomy was merged with the rarefied OTU table. The resulting OTU table was normalized by converting read counts into relative abundances.

Total Hydrocarbon Content and PAH Concentration
THC was below detection limit in the acclimated seawater (T0) and in all the control seawater samples. Hydrocarbon analyses confirmed the presence of petroleum hydrocarbons in the seawater exposed to oil on Day 1 (1.9 mg·L −1 THC; 6.04 µg·L −1 sum of 16 EPA PAHs; and 54.79 µg·L −1 sum naphthalenes, phenanthrenes and dibenzothiophenes (NPDs)). Thereafter, the concentration of hydrocarbons appeared to decrease rapidly over time with the largest drop in concentrations taking place between Days 1 and 3 ( Figure 1 and Table S3).

Amplicon Sequencing
The raw sequences were processed using USEARCH (v10.0.240_i86linux32) according to the recommended settings [62,63]. The sequences were merged and filtered according to length and quality criteria (quality threshold of Q20) and sequencing reads that did not meet the quality requirements were discarded. Unique sequences were extracted, then OTU clustering (at 97% similarity) and a de novo chimera identification step was performed using the UPARSE-OTU algorithm. Singletons were removed and, finally, reads were mapped to the final OTUs list to assign abundances to each OTU. The OTU table was then rarefied to a depth of 20,000 reads per sample. Unique bacterial OTUs we classified using The Ribosomal Database Project (RDP) Naive Bayesian [64] database (Edgar R., rdp_16s_v16.fa) and OTUs taxonomy was merged with the rarefied OTU table. The resulting OTU table was normalized by converting read counts into relative abundances.

Total Hydrocarbon Content and PAH Concentration
THC was below detection limit in the acclimated seawater (T0) and in all the control seawater samples. Hydrocarbon analyses confirmed the presence of petroleum hydrocarbons in the seawater exposed to oil on Day 1 (1.9 mg· L −1 THC; 6.04 µ g· L −1 sum of 16 EPA PAHs; and 54.79 µ g· L −1 sum naphthalenes, phenanthrenes and dibenzothiophenes (NPDs)). Thereafter, the concentration of hydrocarbons appeared to decrease rapidly over time with the largest drop in concentrations taking place between Days 1 and 3 ( Figure 1 and Table S3).  ; and (C) the sum of naphthalenes, phenanthrenes and dibenzothiophenes (NPDs, µg·L −1 ) determined by gas chromatography in the oil exposed tank over time.  Figures S2 and S3). As much as 88% and 92% of sum PAHs and sum NPDs disappeared at Day 7, respectively, while THC decreased by 73%.

RNA-Sequencing
Illumina sequencing of the total mRNA from the five samples resulted in a total of 127 million quality-filtered reads with~17-29 million reads passing QC and rRNA filtering per sample (Table S2). Co-assembly of metatranscriptomic reads produced 628,164 contigs with sizes ranging from 200 to 33,000 bp and with an N50 of 521 bp. On average, 69% (67.7-71.1%) of reads did map back onto these contigs. The only exception was the acclimated seawater (T0), for which this proportion was much lower (34.7%).

Amplicon Sequencing
The SSU rRNA amplicon sequencing resulted in 37,064-42,712 high quality reads per sample. Low quality sequencing reads represented < 5% and chimeric sequences represented < 0.1%. In total, 200 OTUs belonging to the domain Bacteria were identified.

Taxonomic Analysis of the Microbial Communities
According to the taxonomic analysis of contigs from the metatranscriptome, the seawater communities were primarily dominated by bacteria, except for the acclimated seawater sample (T0), which had a high relative abundance of eukaryotic (~25%) and unclassified (~32%) sequences ( Figure 2). The fraction of unclassified sequences was below 11% for all other samples.  Figures S2 and S3). As much as 88% and 92% of sum PAHs and sum NPDs disappeared at Day 7, respectively, while THC decreased by 73%.

RNA-Sequencing
Illumina sequencing of the total mRNA from the five samples resulted in a total of 127 million quality-filtered reads with ~17-29 million reads passing QC and rRNA filtering per sample (Table  S2). Co-assembly of metatranscriptomic reads produced 628,164 contigs with sizes ranging from 200 to 33,000 bp and with an N50 of 521 bp. On average, 69% (67.7-71.1%) of reads did map back onto these contigs. The only exception was the acclimated seawater (T0), for which this proportion was much lower (34.7%).

Amplicon Sequencing
The SSU rRNA amplicon sequencing resulted in 37,064-42,712 high quality reads per sample. Low quality sequencing reads represented < 5% and chimeric sequences represented < 0.1%. In total, 200 OTUs belonging to the domain Bacteria were identified.

Taxonomic Analysis of the Microbial Communities
According to the taxonomic analysis of contigs from the metatranscriptome, the seawater communities were primarily dominated by bacteria, except for the acclimated seawater sample (T0), which had a high relative abundance of eukaryotic (~25%) and unclassified (~32%) sequences ( Figure  2). The fraction of unclassified sequences was below 11% for all other samples. Principal component analysis (PCA) of the relative abundance of families suggested a clear separation of the acclimated seawater from all the other samples along PC1 (81.1% of the variation), while control and oil exposed samples separated along PC2 (17.8% of variation) ( Figure 3). The temporal aspect of the differences between the two control and the two oil exposed samples seemed to be less pronounced (PC3, 0.6% of variation). Non-metric multidimensional scaling (NMDS) based on Bray-Curtis distance showed similar grouping patters on the genus and CDS level ( Figure S4). Principal component analysis (PCA) of the relative abundance of families suggested a clear separation of the acclimated seawater from all the other samples along PC1 (81.1% of the variation), while control and oil exposed samples separated along PC2 (17.8% of variation) ( Figure 3). The temporal aspect of the differences between the two control and the two oil exposed samples seemed to be less pronounced (PC3, 0.6% of variation). Non-metric multidimensional scaling (NMDS) based on Bray-Curtis distance showed similar grouping patters on the genus and CDS level ( Figure S4).
The diversity and taxonomic composition of the seawater bacterial communities based on assembled contigs was overall very similar to that based on 16S amplicon sequences ( Figure 4 and Tables S4 and S5). The acclimated seawater (T0) was dominated by families Rhodobacteraceae (24.1%), Flavobacteriaceae (17.8%), other unclassified Bacteria (16.4%), unclassified Gammaproteobacteria (9.3%), OMG group (7.5%) and unclassified Bacteroidetes (5.3%). Compared to the T0 sample, major changes at the family level were revealed by the enrichment of seawater with Colwelliaceae (mainly Colwellia), Alteromonadaceae and Campylobacteraceae (mainly Arcobacter) related sequences ( Figure 4). The diversity and taxonomic composition of the seawater bacterial communities based on assembled contigs was overall very similar to that based on 16S amplicon sequences ( Figure 4 and Tables S4 and S5). The acclimated seawater (T0) was dominated by families Rhodobacteraceae (24.1%), Flavobacteriaceae (17.8%), other unclassified Bacteria (16.4%), unclassified Gammaproteobacteria (9.3%), OMG group (7.5%) and unclassified Bacteroidetes (5.3%). Compared to the T0 sample, major changes at the family level were revealed by the enrichment of seawater with Colwelliaceae (mainly Colwellia), Alteromonadaceae and Campylobacteraceae (mainly Arcobacter) related sequences ( Figure 4). All control and oil-exposed samples collected on Days 4 and 7 were dominated by Colwelliaceae (23.3-30.7%), Campylobacteraceae (3.1-14.9%), Alteromonadaceae (9.8-13.7%), Rhodobacteraceae (15.8-19.6%) and Flavobacteriaceae (4.7-5.4%) ( Figure 5). Families Colwelliaceae, Campylobacteraceae and Alteromonadaceae were nearly undetectable in the acclimated seawater (T0 sample) but appeared to bloom in both control and oil exposed incubations. These abundant families were generally dominated by a single or a few genera, i.e., genus Colwellia dominated family Colwelliaceae; genus Arcobacter dominated family Campylobacteraceae; genera Glaciecola, Paraglaciecola and Alteronomas dominated family Alteromonadaceae; genus Planktomarina and unclassified Rhodobacteraceae dominated family Rhodobacteraceae; and genera Polaribacter, unclassified Flavobacteraceae, Formosa and Flavobacterium dominated family Flavobacteriaceae ( Figure S5). Differences between oil-treated and control samples were rather small with only one of the dominant families, Campylobacteraceae (mainly the genus Arcobacter), showing a rather clear preference for unexposed seawater (difference in means of control vs. oil: 14.8%, excluding T0). Flavobacteriaceae was slightly more abundant in T4O sample,  The diversity and taxonomic composition of the seawater bacterial communities based on assembled contigs was overall very similar to that based on 16S amplicon sequences ( Figure 4 and Tables S4 and S5). The acclimated seawater (T0) was dominated by families Rhodobacteraceae (24.1%), Flavobacteriaceae (17.8%), other unclassified Bacteria (16.4%), unclassified Gammaproteobacteria (9.3%), OMG group (7.5%) and unclassified Bacteroidetes (5.3%). Compared to the T0 sample, major changes at the family level were revealed by the enrichment of seawater with Colwelliaceae (mainly Colwellia), Alteromonadaceae and Campylobacteraceae (mainly Arcobacter) related sequences ( Figure 4). . Microbial community composition in seawater exposed to oil and seawater control observed at the start of exposure (T0) and at Days 4 and 7 of the exposure determined by metatranscriptomic and 16S rRNA amplicon sequencing analysis. The community composition identified based on the metatranscriptome was derived from the family level taxonomy of annotated contigs (proportion (%) of TPM).
All control and oil-exposed samples collected on Days 4 and 7 were dominated by Colwelliaceae (23.3-30.7%), Campylobacteraceae (3.1-14.9%), Alteromonadaceae (9.8-13.7%), Rhodobacteraceae (15.8-19.6%) and Flavobacteriaceae (4.7-5.4%) ( Figure 5). Families Colwelliaceae, Campylobacteraceae and Alteromonadaceae were nearly undetectable in the acclimated seawater (T0 sample) but appeared to bloom in both control and oil exposed incubations. These abundant families were generally dominated by a single or a few genera, i.e., genus Colwellia dominated family Colwelliaceae; genus Arcobacter dominated family Campylobacteraceae; genera Glaciecola, Paraglaciecola and Alteronomas dominated family Alteromonadaceae; genus Planktomarina and unclassified Rhodobacteraceae dominated family Rhodobacteraceae; and genera Polaribacter, unclassified Flavobacteraceae, Formosa and Flavobacterium dominated family Flavobacteriaceae ( Figure S5). Differences between oil-treated and control samples were rather small with only one of the dominant families, Campylobacteraceae (mainly the genus Arcobacter), showing a rather clear preference for unexposed seawater (difference in means of control vs. oil: 14.8%, excluding T0). Flavobacteriaceae was slightly more abundant in T4O sample, Figure 4. Microbial community composition in seawater exposed to oil and seawater control observed at the start of exposure (T0) and at Days 4 and 7 of the exposure determined by metatranscriptomic and 16S rRNA amplicon sequencing analysis. The community composition identified based on the metatranscriptome was derived from the family level taxonomy of annotated contigs (proportion (%) of TPM).
All control and oil-exposed samples collected on Days 4 and 7 were dominated by Colwelliaceae (23.3-30.7%), Campylobacteraceae (3.1-14.9%), Alteromonadaceae (9.8-13.7%), Rhodobacteraceae (15.8-19.6%) and Flavobacteriaceae (4.7-5.4%) ( Figure 5). Families Colwelliaceae, Campylobacteraceae and Alteromonadaceae were nearly undetectable in the acclimated seawater (T0 sample) but appeared to bloom in both control and oil exposed incubations. These abundant families were generally dominated by a single or a few genera, i.e., genus Colwellia dominated family Colwelliaceae; genus Arcobacter dominated family Campylobacteraceae; genera Glaciecola, Paraglaciecola and Alteronomas dominated family Alteromonadaceae; genus Planktomarina and unclassified Rhodobacteraceae dominated family Rhodobacteraceae; and genera Polaribacter, unclassified Flavobacteraceae, Formosa and Flavobacterium dominated family Flavobacteriaceae ( Figure S5). Differences between oil-treated and control samples were rather small with only one of the dominant families, Campylobacteraceae (mainly the genus Arcobacter), showing a rather clear preference for unexposed seawater (difference in means of control vs. oil: 14.8%, excluding T0). Flavobacteriaceae was slightly more abundant in T4O sample, while Alteromonadaceae, Rhodobacteraceae, and Colwelliaceae had higher relative abundance in both T4O and T7O in comparison to controls. In addition to the most dominant families, unclassified Gammaproteobacteria, OMG group, unclassified Rhodobacterales, Pelagibacteraceae, Rhodospirillaceae, Cryomorphaceae and other unclassified Bacteria also had slightly higher relative abundance in oil treated seawater at both timepoints. Unclassified Bacteroidetes and Oceanospirillaceae were slightly more abundant in control seawater.
while Alteromonadaceae, Rhodobacteraceae, and Colwelliaceae had higher relative abundance in both T4O and T7O in comparison to controls. In addition to the most dominant families, unclassified Gammaproteobacteria, OMG group, unclassified Rhodobacterales, Pelagibacteraceae, Rhodospirillaceae, Cryomorphaceae and other unclassified Bacteria also had slightly higher relative abundance in oil treated seawater at both timepoints. Unclassified Bacteroidetes and Oceanospirillaceae were slightly more abundant in control seawater.

Functional Analysis of the Microbial Communities
MetaGeneMark detected 674,434 putative coding sequences (CDS or genes), of which 568,144 were successfully annotated against the NCBI protein database (308,553 unique NCBI IDs) and 176,757 against the KEGG Ortholog Database (6548 unique KEGG orthologs) ( Table S6). PCA of the unique KEGG orthologs revealed similar patterns as observed from the taxonomical analysis, with PC3 (representing temporal separation between samples) explaining 6.4% of the variation ( Figure 6). Functional differences between the Day 4 and the Day 7 oil exposed samples seemed to be more pronounced than taxonomical differences, according to their larger distance along PC3. Color codes: blue circles, oil exposed; yellow squares, control tank; green triangle, acclimated seawater.

Functional Analysis of the Microbial Communities
MetaGeneMark detected 674,434 putative coding sequences (CDS or genes), of which 568,144 were successfully annotated against the NCBI protein database (308,553 unique NCBI IDs) and 176,757 against the KEGG Ortholog Database (6548 unique KEGG orthologs) ( Table S6). PCA of the unique KEGG orthologs revealed similar patterns as observed from the taxonomical analysis, with PC3 (representing temporal separation between samples) explaining 6.4% of the variation ( Figure 6). Functional differences between the Day 4 and the Day 7 oil exposed samples seemed to be more pronounced than taxonomical differences, according to their larger distance along PC3. while Alteromonadaceae, Rhodobacteraceae, and Colwelliaceae had higher relative abundance in both T4O and T7O in comparison to controls. In addition to the most dominant families, unclassified Gammaproteobacteria, OMG group, unclassified Rhodobacterales, Pelagibacteraceae, Rhodospirillaceae, Cryomorphaceae and other unclassified Bacteria also had slightly higher relative abundance in oil treated seawater at both timepoints. Unclassified Bacteroidetes and Oceanospirillaceae were slightly more abundant in control seawater.

Functional Analysis of the Microbial Communities
MetaGeneMark detected 674,434 putative coding sequences (CDS or genes), of which 568,144 were successfully annotated against the NCBI protein database (308,553 unique NCBI IDs) and 176,757 against the KEGG Ortholog Database (6548 unique KEGG orthologs) ( Table S6). PCA of the unique KEGG orthologs revealed similar patterns as observed from the taxonomical analysis, with PC3 (representing temporal separation between samples) explaining 6.4% of the variation ( Figure 6). Functional differences between the Day 4 and the Day 7 oil exposed samples seemed to be more pronounced than taxonomical differences, according to their larger distance along PC3. Color codes: blue circles, oil exposed; yellow squares, control tank; green triangle, acclimated seawater. Color codes: blue circles, oil exposed; yellow squares, control tank; green triangle, acclimated seawater.
Based on the most abundant functions, control samples (T4C and T7C) were distinguished from oil exposed ones (T4O and T7O) and the T0 sample by a number of phosphorous acquisition related genes (alkaline phosphatase, phosphate-binding protein, phosphate ABC transporter and phosphonate ABC transporter), carbon and energy storage related functions (phasin family protein and isocitrate lyse) and by the higher abundance of cation acetate symporter genes expressed mainly by Arcobacter, Glaciecola, Sulfitobacter and Colwellia species. The later (Day 7) control sample was particularly enriched in these types of genes. Abundant functions differentiating the oil exposed samples (T4O and T7O) from the rest were mainly PQQ-dependent dehydrogenases (methanol/ethanol family) and aldehyde dehydrogenases from different Colwellia species. In addition, a TonB−dependent receptor from Colwellia polaris and a porin from Candidatus Colwellia aromaticivorans was also highly enriched in oil-exposed samples. These transcripts were also present in the control samples, with TPM values 1-13 times lower than in the oil exposed samples. TPM levels in T0 for these transcripts was however very low (TPM < 2). Most of the abundant functions observed in the oil exposed samples showed decreasing TPM values between Day 4 and Day 7, except for a number of hypothetical proteins and phage associated sequences which increased substantially in the oil exposed tank by Day 7.

Upregulated Functions
Differential expression analysis (log2 fold-change ≥ 1, p ≤ 0.05) resulted in 13,427 CDS (~2% of all the predicted CDS) that were significantly upregulated in seawater with oil in comparison to control (Table S7). As a result of using the Day 4 and Day 7 datasets for each condition (non-exposed and oil-exposed) as pseudo-replicates, approximately twice as many of the DESeq2-picked genes had smaller temporal change in the control tank than in the oil-exposed one. It is also interesting to point out that DESeq2 picked several genes as upregulated which had TPM of 0 in one of the two control samples (1902 genes) while it selected relatively few (74 genes) that had TPM of 0 in one of the two oil-exposed samples. As many as 5507 upregulated genes had higher TPM values in the T0 sample than in the control T4C and 5794 upregulated genes had a TPM of 0 in the T0 sample. Using the KEGG database, 7067 differentially expressed (DE) genes (52.6%) were successfully annotated to 1581 unique KOs, while the BLAST search against the NCBI protein database annotated 11,950 DE genes (89%) to 9057 unique NCBI IDs.
KEGG mapping of the DE genes revealed that several pathways potentially involved in response to petroleum hydrocarbons were significantly enriched (p < 0.05), including "Xenobiotics biodegradation and metabolism", "Fatty acid degradation" and "Degradation of aromatic compounds" (Figure 7 and Table S8). Analysis of the genes within the "Xenobiotics biodegradation and metabolism" (283 genes) and "Degradation of aromatic compounds" (84 genes) revealed gene clusters for catabolism of benzoate (ben, box and pca genes), benzene (dmp), catechol (dmp, bph and cat) and homoprotocatechuate (hpa). Within the pathway "Fatty acid degradation" (170 genes), gene clusters involved in fatty acid beta-oxidation (fad) were expressed (Figure 7). In total, 1619 genes (12.1% of DE genes) were annotated by KEGG as transporter proteins. Among this large number of transporters, we found those involved in mineral and organic ion transport, ion channels, especially those involved in ammonium transport, as well as various amino acid, phosphate and sugar transporters. Bacterial motility genes were represented by as many as 219 genes (1.6% of DE genes). The majority of these genes coded for proteins involved in flagella and pilus assembly. Secretion system (107 genes, 0.8%) showed that proteins were secreted mainly through the Sec system, Types I and II. Genes associated with two-component system (100 genes, 0.7%) were involved in nitrogen regulation, polar flagellar synthesis, central carbon metabolism, phosphate starvation response, swarming activity and biofilm formation. Biofilm formation genes (87 genes, 0.6%) were associated with transcriptional regulators such as rpoS and oxyR. However, as Figure 7 shows, these functions were also expressed in the control tank, and, with the exception of transporters (ko02000), all of them increased in both the control and oil exposed tank in comparison to the T0 sample. Moreover, expression levels of these DE functions largely followed a similar pattern of peaking on Day 4 and declining by Day 7 in both control and oil exposed tanks.
To investigate whether taxonomic profiles could uncover more oil-specific responses, proteins based on KEGG identifiers and genes identified in the literature were selected and further investigated (Figure 8). To investigate whether taxonomic profiles could uncover more oil-specific responses, proteins based on KEGG identifiers and genes identified in the literature were selected and further investigated ( Figure 8). The most abundant motility/flagella-related gene, fliC, was expressed mainly by the same bacterial families when comparing control and oil exposed seawater; however, the taxonomical profile of these four samples were clearly distinct from that of T0 ( Figure 8). Notably, this gene was the most taxonomically diverse among the selected motility-related proteins. While T0 flagellins were dominated by Gammaproteobacteria, Porticoccaceae, Alphaproteobacteria and other unclassified bacteria, all other flagellins found in incubated samples were dominated by Colwelliaceae, Figure 7. TPM abundances of selected KEGG pathways and BRITE hierarchies of upregulated genes (log2 fold-change ≥ 1, p ≤ 0.05) between control and oil exposed samples.  Figure 7. TPM abundances of selected KEGG pathways and BRITE hierarchies of upregulated genes (log2 fold-change ≥ 1, p ≤ 0.05) between control and oil exposed samples.
To investigate whether taxonomic profiles could uncover more oil-specific responses, proteins based on KEGG identifiers and genes identified in the literature were selected and further investigated (Figure 8). The most abundant motility/flagella-related gene, fliC, was expressed mainly by the same bacterial families when comparing control and oil exposed seawater; however, the taxonomical profile of these four samples were clearly distinct from that of T0 ( Figure 8). Notably, this gene was the most taxonomically diverse among the selected motility-related proteins. While T0 flagellins were dominated by Gammaproteobacteria, Porticoccaceae, Alphaproteobacteria and other unclassified bacteria, all other flagellins found in incubated samples were dominated by Colwelliaceae, Figure 8. Taxonomic analysis of selected oil degrading genes. TPM values of upregulated genes (log2 fold-change ≥ 1, p ≤ 0.05) between control and oil exposed samples were grouped according to their gene function and colored by their taxonomic assignment.

Upstream Pathways: Motility, Secretion, and Transporters
The most abundant motility/flagella-related gene, fliC, was expressed mainly by the same bacterial families when comparing control and oil exposed seawater; however, the taxonomical profile of these four samples were clearly distinct from that of T0 ( Figure 8). Notably, this gene was the most taxonomically diverse among the selected motility-related proteins. While  (Table S7). Other functions related to motility, e.g., mcp, pilY, CheB/CheR and motA, were all nearly absent from the T0 sample while being mostly expressed by the abundant Colwelliaceae family in all other samples. Among the selected secretion-related genes, secB, tatB and gspC exhibited the most dramatic shift in taxonomic composition with the increase in Colwelliaceae and Alteromonadaceae classified genes by Day 4. All transporters annotated by KEGG showed a taxonomic shift from T0 to Day 4 represented by the same two families, both in control and in oil exposed samples.

Alkane Degradation
The transcript abundance of alkane 1-monooxygenase (alkB) genes selected by the differential expression analysis was generally low (TPM sum < 40) and already detectable in the T0 sample. Out of the 11 DE genes annotated as alkB or rubB, eight were classified as SAR92 clade bacterium with an average log2-fold change of 2.33. This Porticoccaceae family bacterium contained both genes responsible for initial oxidation of n-alkanes (alkane 1-monooxygenase and rubredoxin NAD+ reductase) and was highly active in the acclimated seawater (T0) already. The highest log2-fold change was observed in case of a Gammaproteobacteria bacterium and a Halieaceae bacterium, but only the former showed an increasing trend from Day 4 to Day 7. However, no alkane monooxygenases from OHCB were found among the DE genes ( Figure 9). Only four upregulated cytochrome P450 genes were found, with an average log2-fold change of 1.96 (classified as Bacteria and Gammaproteobacteria bacterium species). Pseudoalteromonadaceae and Pseudomonadaceae. Flagellins with the highest log2-fold change (> 5) and highest increase in TPM from Day 4 to Day 7 (T7O/T4O > 5) were from less abundant Methylophaga species, Burkholderiaceae, Thiotrichaceae and Halomonadaceae bacterium (Table S7). Other functions related to motility, e.g., mcp, pilY, CheB/CheR and motA, were all nearly absent from the T0 sample while being mostly expressed by the abundant Colwelliaceae family in all other samples. Among the selected secretion-related genes, secB, tatB and gspC exhibited the most dramatic shift in taxonomic composition with the increase in Colwelliaceae and Alteromonadaceae classified genes by Day 4. All transporters annotated by KEGG showed a taxonomic shift from T0 to Day 4 represented by the same two families, both in control and in oil exposed samples.

Alkane Degradation
The transcript abundance of alkane 1-monooxygenase (alkB) genes selected by the differential expression analysis was generally low (TPM sum < 40) and already detectable in the T0 sample. Out of the 11 DE genes annotated as alkB or rubB, eight were classified as SAR92 clade bacterium with an average log2-fold change of 2.33. This Porticoccaceae family bacterium contained both genes responsible for initial oxidation of n-alkanes (alkane 1-monooxygenase and rubredoxin NAD+ reductase) and was highly active in the acclimated seawater (T0) already. The highest log2-fold change was observed in case of a Gammaproteobacteria bacterium and a Halieaceae bacterium, but only the former showed an increasing trend from Day 4 to Day 7. However, no alkane monooxygenases from OHCB were found among the DE genes ( Figure 9). Only four upregulated cytochrome P450 genes were found, with an average log2-fold change of 1.96 (classified as Bacteria and Gammaproteobacteria bacterium species). Many genes participating in the subsequent intermediary steps, e.g., dehydrogenation of alcohols (90 genes), dehydrogenation of aldehydes (67 genes) and β-oxidation (134 genes, mainly from Colwellia sp.), were upregulated in oil exposed samples. Among these genes, β-oxidation enzymes, fadABJ and ACSL were represented by the highest TPM counts, approximately 10-fold higher than the 11 oil-enriched alkane 1-monooxygenase genes. While the upregulated initial oxygenases were expressed almost exclusively by members of the family Porticoccaceae (mainly SAR92 clade bacterium) and unclassified Gammaproteobacteria, the β-oxidation genes were mainly transcribed by Rhodobacteraceae and Colwelliaceae. Abundant alcohol dehydrogenases frmA, adh1 and also qheDH (last one not shown in Figure 8) showed taxonomic affiliation to Alphaproteobacteria (Rhodobacterales), Pelagibacteraceae and SAR92 clade, respectively. The highest TPM counts for these genes were observed in the T0 samples. The same was observed for the selected aldehyde dehydrogenase, ALDH, represented by taxa within Alphaproteobacteria. Both types of enzymes also maintained their taxonomic profile over time regardless of the presence of oil. Aldehyde dehydrogenases belonging to KO K00138 (aldB), however, were mostly absent from T0 (except for those associated with Rhodobacterales and Pelagibacter) and became enriched under laboratory conditions, in particular in the oil exposed tank (average log2-fold change = 2.2) (Table S7). These aldB genes were mainly associated with Colwellia species, Arcobacter, Rhodobacterales, Gammaproteobacteria and Candidatus Pelagibacter. Many genes participating in the subsequent intermediary steps, e.g., dehydrogenation of alcohols (90 genes), dehydrogenation of aldehydes (67 genes) and β-oxidation (134 genes, mainly from Colwellia sp.), were upregulated in oil exposed samples. Among these genes, β-oxidation enzymes, fadABJ and ACSL were represented by the highest TPM counts, approximately 10-fold higher than the 11 oil-enriched alkane 1-monooxygenase genes. While the upregulated initial oxygenases were expressed almost exclusively by members of the family Porticoccaceae (mainly SAR92 clade bacterium) and unclassified Gammaproteobacteria, the β-oxidation genes were mainly transcribed by Rhodobacteraceae and Colwelliaceae. Abundant alcohol dehydrogenases frmA, adh1 and also qheDH (last one not shown in Figure 8) showed taxonomic affiliation to Alphaproteobacteria (Rhodobacterales), Pelagibacteraceae and SAR92 clade, respectively. The highest TPM counts for these genes were observed in the T0 samples. The same was observed for the selected aldehyde dehydrogenase, ALDH, represented by taxa within Alphaproteobacteria. Both types of enzymes also maintained their taxonomic profile over time regardless of the presence of oil. Aldehyde dehydrogenases belonging to KO K00138 (aldB), however, were mostly absent from T0 (except for those associated with Rhodobacterales and Pelagibacter) and became enriched under laboratory conditions, in particular in the oil exposed tank (average log2-fold change = 2.2) (Table S7). These aldB genes were mainly associated with Colwellia species, Arcobacter, Rhodobacterales, Gammaproteobacteria and Candidatus Pelagibacter.

Aromatic Degradation
KEGG mapping of DE genes did not suggest the presence of aromatic ring-hydroxylating dioxygenases (RHDO) in the dataset reported here; however, BLAST annotated 29 DE genes as RHDO ( Figure 10). The most abundant one was from Rhodobacteraceae bacterium and Rhodobacterales bacterium HTCC2255, both showing similar or somewhat higher TPM counts in oil exposed samples in comparison to the acclimated seawater (T0). Only a Colwellia sp. RHDO was oil-specific; however, this gene had a very low TPM count.

Aromatic Degradation
KEGG mapping of DE genes did not suggest the presence of aromatic ring-hydroxylating dioxygenases (RHDO) in the dataset reported here; however, BLAST annotated 29 DE genes as RHDO ( Figure 10). The most abundant one was from Rhodobacteraceae bacterium and Rhodobacterales bacterium HTCC2255, both showing similar or somewhat higher TPM counts in oil exposed samples in comparison to the acclimated seawater (T0). Only a Colwellia sp. RHDO was oilspecific; however, this gene had a very low TPM count. KEGG analysis revealed an abundance of oxygenases and hydroxylases acting on monoaromatic compounds and highly expressed intermediary enzymes involved in channeling activated monoaromatic compounds towards the tricarboxylic acid (TCA) cycle. Enzymes responsible for initial oxidation, demethylation and subsequent conversion of monoaromatic rings were nearly absent from the T0 sample, except for boxA and boxB (two epoxidases involved in benzoate degradation, ring opening steps). Nearly complete pathways for several meta-cleavage routes of catechol degradation were reconstructed from the set of DE genes. Gene clusters for two different benzoate catabolic pathways were found: (1) ben pathway (benABCD) transcribed by Rhodobacteraceae; and (2) box pathway (boxABC) transcribed mainly by Rhodobacteraceae and Colwelliaceae. Moreover, genes involved in aerobic benzene/toluene degradation (dmpLMNOP) transcribed mainly by Colwelliaceae were also enriched in the oil tank. Except for the box cluster, these aromatic compound converting enzymes were absent from the acclimated seawater but increased in the control tank as well. Moreover, by Day 7, the number of transcripts responsible for aromatic compound degradation were reduced to levels similar to that in T4C. Contrary to other enzymes, benzoate/toluate 1,2-dioxygenase components and dihydroxy-cyclohexadiene carboxylate dehydrogenase (benABCD), and benzoyl-CoA converting enzymes (boxABC) maintained elevated TPM level in T7O. These enzymes cover three consecutive steps of the benzoate degradation pathway. Their taxonomic profiles were consistently dominated by the two most abundant families, Colwelliaceae and Alteromonadaceae.
To uncover highly oil-specific responses, the DE genes were further explored. First, differentially abundant genes were ordered according to T7O/T4O ratio and only genes with a ratio > 10 were kept, corresponding to a ten-fold increase over time in oil exposed samples. Unannotated genes, hypothetical proteins, ribosomal proteins and phage or virus transcripts were then removed leaving 294 genes for further evaluation. A large majority of the selected genes belonged to either Methylophaga species (96 genes) or Gammaproteobacteria bacterium (82 genes). The most abundant Methylophaga genes were 3-hexulose-6-phosphate synthase, F0F1 ATP synthase, elongation factor Tu, formaldehyde-activating enzyme and an ammonium transporter. For Gammaproteobacteria bacterium, a copper-binding protein, isocitrate lyase genes, F0F1 ATP synthase genes, formaldehydeactivating enzyme, benzoyl-CoA 2,3-epoxidase, transketolase genes, ammonium transporters, a porin, a cold-shock protein and a peroxiredoxin had the highest TPM in the T7O sample. Among the KEGG analysis revealed an abundance of oxygenases and hydroxylases acting on monoaromatic compounds and highly expressed intermediary enzymes involved in channeling activated monoaromatic compounds towards the tricarboxylic acid (TCA) cycle. Enzymes responsible for initial oxidation, demethylation and subsequent conversion of monoaromatic rings were nearly absent from the T0 sample, except for boxA and boxB (two epoxidases involved in benzoate degradation, ring opening steps). Nearly complete pathways for several meta-cleavage routes of catechol degradation were reconstructed from the set of DE genes. Gene clusters for two different benzoate catabolic pathways were found: (1) ben pathway (benABCD) transcribed by Rhodobacteraceae; and (2) box pathway (boxABC) transcribed mainly by Rhodobacteraceae and Colwelliaceae. Moreover, genes involved in aerobic benzene/toluene degradation (dmpLMNOP) transcribed mainly by Colwelliaceae were also enriched in the oil tank. Except for the box cluster, these aromatic compound converting enzymes were absent from the acclimated seawater but increased in the control tank as well. Moreover, by Day 7, the number of transcripts responsible for aromatic compound degradation were reduced to levels similar to that in T4C. Contrary to other enzymes, benzoate/toluate 1,2-dioxygenase components and dihydroxy-cyclohexadiene carboxylate dehydrogenase (benABCD), and benzoyl-CoA converting enzymes (boxABC) maintained elevated TPM level in T7O. These enzymes cover three consecutive steps of the benzoate degradation pathway. Their taxonomic profiles were consistently dominated by the two most abundant families, Colwelliaceae and Alteromonadaceae.
To uncover highly oil-specific responses, the DE genes were further explored. First, differentially abundant genes were ordered according to T7O/T4O ratio and only genes with a ratio > 10 were kept, corresponding to a ten-fold increase over time in oil exposed samples. Unannotated genes, hypothetical proteins, ribosomal proteins and phage or virus transcripts were then removed leaving 294 genes for further evaluation. A large majority of the selected genes belonged to either Methylophaga species (96 genes) or Gammaproteobacteria bacterium (82 genes). The most abundant Methylophaga genes were 3-hexulose-6-phosphate synthase, F0F1 ATP synthase, elongation factor Tu, formaldehyde-activating enzyme and an ammonium transporter. For Gammaproteobacteria bacterium, a copper-binding protein, isocitrate lyase genes, F0F1 ATP synthase genes, formaldehyde-activating enzyme, benzoyl-CoA 2,3-epoxidase, transketolase genes, ammonium transporters, a porin, a cold-shock protein and a peroxiredoxin had the highest TPM in the T7O sample. Among the remaining genes, a phosphate starvation-inducible protein (PhoH), branched-chain amino acid ABC transporter substrate-binding protein of Cycloclasticus, together with TRAP transporter from Colwellia, ABC transporter and benzoate 1,2-dioxygenase from Phaeobacter inhibens had the highest TPMs (Table S7).
Next, we focused on the OHCB associated DE genes and found the previously mentioned phosphate starvation-inducible protein (PhoH), a branched-chain amino acid ABC transporter substrate-binding protein of Cycloclasticus, ammonium transporters, flagellin, glutamine synthetase, lipoprotein NlpD, nitrate/nitrite transport system substrate-binding protein, nitrite reductase, nitrogen regulatory protein and urea transport protein of Oleispira antarctica and allophanate hydrolase from Alcanivorax. Finally, Venn diagram analysis of KEGG orthologs revealed that tmoA (toluene monooxygenase system protein), benD (dihydroxycyclohexadiene carboxylate dehydrogenase) and kefB (glutathione-regulated potassium-efflux system protein) were orthologs uniquely present in oil exposed seawater, albeit at very low TPM levels.

Rapid Decrease of Petroleum Hydrocarbon Concentrations
Chemical analysis of seawater samples showed a rapid decrease in both aliphatic and aromatic hydrocarbon concentrations during the incubation. This likely resulted from a combination of abiotic and biotic processes, such as volatilization and coating to tank wall surface. Volatilization likely contributed to a significant portion of the loss of short-chain and monoaromatic hydrocarbons and naphthalenes. Adsorption to tank wall was observed. Compared to a study of Ribicic et al. [23] the loss of PAHs occurred much earlier in our study (between Days 1 and 3). They observed significant PAH biodegradation only between Days 13 and 16 using North Sea, Norwegian fjord seawater, at a similar concentration of a chemically dispersed paraffinic oil (Statfjord crude, 3 mg·L -1 ). Possibly, a well-adapted microbial community could respond to oil exposure as quickly as suggested by the chemistry data in our experiment. Indeed, it has been shown that due to chronic influx of ubiquitous anthropogenic dissolved organic carbon (ADOC, including hydrocarbons) into the global ocean, natural bacterial communities evolved to quickly respond and taxa with recognized oil-degrading capabilities was shown to increase in relative abundance within 24 h to ADOC amendment [65]. The microbial community of the seawater used in this experiment could have been partly adapted to low levels of oil contamination, although THC level was below detection limit in the acclimated seawater (T0). Nevertheless, the following three interesting signatures may suggest some level of adaptation (Table S6) Other explanations for the rapid loss of both types of hydrocarbon compounds could include additional weathering processes such as absorption to biotic particles, e.g. phytoplankton. Phytoplankton were shown to be able to adsorb PAHs at high concentrations (up to 16 µ g·g -1 ) into their biomass [67] and phytoplankton absorption has recently been suggested as an important mechanism of PAH transport to sediments after their death [68]. The metatranscriptomic analysis revealed that the acclimated seawater contained a large percentage of eukaryotic transcripts. As much as 8.4% of all BLASTX annotated predicted genes (48,273 out of 573,163) were of eukaryotic The very low relative abundance of OHCB genera, not only in the T0 sample but throughout the experiment, suggests that they did not represent significant biomass within the community, making it unlikely that they were responsible for rapid alkane and PAH biodegradation. Moreover, only Oleispira and Alcanivorax-associated alkane monooxygenases were present in the dataset, and only Alcanivorax (Alcanivorax pacificus) was expressing its alkane monooxygenase genes in the acclimated seawater sample, albeit at very low level (TPM of 1.021). Regarding ring-hydroxylating dioxygenases, none of these were classified into known OHCB genera. As Figure 11 shows, all the OHCB genera remained active or even became more active in the control tank compared to the oil exposed tank over time. Besides nearly identical activities in control and oil exposed seawater, their lack of increase in alkane monooxygenase relative expression in the presence of oil suggests that they were most likely utilizing carbon sources other than petroleum hydrocarbons [66].
Other explanations for the rapid loss of both types of hydrocarbon compounds could include additional weathering processes such as absorption to biotic particles, e.g., phytoplankton. Phytoplankton were shown to be able to adsorb PAHs at high concentrations (up to 16 µg·g -1 ) into their biomass [67] and phytoplankton absorption has recently been suggested as an important mechanism of PAH transport to sediments after their death [68]. The metatranscriptomic analysis revealed that the acclimated seawater contained a large percentage of eukaryotic transcripts. As much as 8.4% of all BLASTX annotated predicted genes (48,273 out of 573,163) were of eukaryotic origin, with Viridiplantae being most dominant (61% of eukaryotic transcripts) followed by Stramenopiles (14%), Opisthokonta (10%), Cryptophyta (7%), Haptophyceae (5%) and Alveolata (2%). In comparison, 13% of transcripts were found to bin to similar eukaryotic groups during a phytoplankton bloom in the Gulf of Mexico [69]. Although the absolute numbers of these organisms were not determined, their dominance in the T0 dataset suggests that influence of algal biomass should be taken into consideration when interpreting the chemistry results.

Microbial Community and Activities in the Acclimated Seawater
On the taxonomy side, the most abundant taxa in the acclimated seawater was the genus Micromonas, a green alga, while on the functional side, photosystem proteins from various eukaryotic phytoplankton, e.g., Aureoumbra lagunensis, Teleaulax amphioxeia and Guillardia theta, were the most relatively abundant proteins. Several bacterial taxa, often associated with phytoplankton-rich coastal marine communities, had comparable abundances to the dominant green algae and diatom groups in the acclimated seawater, e.g., Polaribacter, Porticoccaceae (possibly including SAR92), Cryomorphaceae, Planktomarina, Formosa, SAR86, Crocinitomicaceae and Ulvibacter (Table S5). These groups have been found during a diatom bloom in Monterey Bay, towards the end of a diatom-dominated spring phytoplankton bloom in the German Bight, in the bacterioplankton community of a Phaeocystis globosa spring bloom in the North Sea and in the bacterial community of a Phaeocystis antarctica bloom in the Amundsen Sea [70][71][72][73]. Hence, the seawater collected for this experiment was indeed a typical example of a spring coastal marine microbial community. Interestingly, at least two of these genera, i.e., Polaribacter and Ulvibacter, are also considered to be typical examples of cold-adapted oil degrading bacteria [74].
It is well known that phytoplankton have a significant role in providing a habitat for hydrocarbon degrading bacteria (HCB) in the so-called photosphere [75,76]. Phytoplankton can create hydrocarbon enriched zones in surface seawater as they can produce PAHs, volatile hydrocarbons, isoprene and hydrocarbon-like substances such as alkenones [77,78]. This phenomenon could in turn make phytoplankton rich seawater more prepared to biodegrade accidentally spilled or leaked petrogenic hydrocarbons. While most green algae transcripts in the acclimated seawater belonged to typical coastal seawater green algae, e.g., Micromonas and Bathycoccus, we also found three genes of Botryococcus braunii, a well-known hydrocarbon-producing microalga [79,80]. This suggests the possibility that naturally occurring hydrocarbons could have been inducing the expression of the alkane monooxygenases observed in the T0 sample. Crocinitomicaceae, SAR92 clade and Bacteroidetes, which dominated the alkane monooxygenases expressed in the acclimated seawater (and other samples as well), and Rhodobacterales bacterium HTCC2255, Halieaceae and Candidatus Pelagibacter ubique, which dominated the RHDO transcripts in the acclimated seawater, have all been shown to be active in phytoplankton dominated communities and involved in the succession of phytoplankton blooms.

Dominant Generalists and Phytoplankton-Associated Taxa
Within three days of incubation, the entire microbial community changed dramatically both in the control and oil exposed tank, characterized by orders of magnitude decreases in algae and diatom transcripts concurrent with: (1) a significant decrease in mainly Bacteroidetes (e.g., Flavobacteriaceae, Cryomorphaceae and Polaribacter) and virus (e.g., Phycodnaviridae) transcripts; (2) a transient decrease in Rhodobacteraceae; and (3) a clear bloom of Colwelliaceae, Campylobacterceae (mainly Arcobacter) and Alteromonadaceae. The decline in algal and diatom transcript relative abundance could be associated with decreased activity due to insufficient light conditions or other limiting parameters in the incubation tanks. However, it is likely to have been a result of cell mortality due to viral infection. Taxonomic analysis of the viral contigs showed that, excluding the unclassified viruses, the Siphoviridae and Myoviridae phage families dominated the viral community, followed by Phycodnaviridae, a family of viruses that infect eukaryotic algae (Table S9). Phycodnaviridae were particularly abundant in the acclimatized sample (T0), suggesting that they might have had a role in the decrease of Micromonas transcripts. If such a viral lysis event occurred, a significant amount of phytoplankton-derived organic material was released in both the control and oil exposed tanks, similar to what could be observed at the end of a phytoplankton bloom-leading to the succession of opportunistic bacteria taking advantage of the abundant carbon source.
Indeed, one of the blooming bacterial taxa, Colwellia, has been previously associated with the degradation of phytoplankton-derived organic matter of a Phaeocystis bloom [73]. Colwellia represents a group of heterotrophic cold-adapted bacteria able to degrade a wide range of organic compounds [81]. Through isolation from an enrichment culture, it has been demonstrated that some Colwellia have the capacity to degrade petroleum hydrocarbons in the presence of Corexit dispersant [82]. The genome of this isolate has not been sequenced; therefore, the exact metabolic pathways are not known. Laboratory studies using stable isotope probing (SIP) showed that Colwellia can oxidize gaseous hydrocarbons (ethane and butane) [83] and naphthalene [84]. Colwellia species have been suggested to be important players in the response to oil spills; however, it is well-known that there is a large variation among Colwellia species regarding their functional potential [52]. This genus is often observed as the first responder in laboratory experiments, particularly under cold conditions [83,85]. More commonly, however, Colwellia species co-occur with obligate hydrocarbonoclastic species, such as Oleispira or in later stages of oil degradation, with Cycloclasticus [23,84,85]. In this study, the nearly identical increase in the relative abundance of Colwellia transcripts in control and oil exposed seawater suggests that this bacterium, represented mainly by Colwellia psychrerythraea and Candidatus Colwellia aromaticivorans, utilized carbon from sources other than the added petroleum hydrocarbons, i.e., carbon released from evanescent phytoplankton. A detailed species-level Colwellia composition is shown in Figure S6. Most recently, Candidatus Colwellia aromaticivorans was reported as a novel hydrocarbon-degrading bacterium identified through metagenomic analysis of a microcosm-scale oil exposure experiment [86,87]. However, isolation and confirmation of this strain being able to grow on petroleum hydrocarbons has not been performed, only in silico predictions of its metabolic profile have been reported [86].
While Campylobacteraceae is also considered to be a group containing cold-adapted oil degrading bacteria, we observed a significantly lower relative abundance of Campylobacteraceae (in particular, Arcobacter) in the oil exposed tank in comparison to the control tank. This might suggest that some Campylobacteraceae species were susceptible to toxic compounds present in crude oil rather than being involved in their degradation [74,88].
Alteromonadaceae represents a large group of motile Proteobacteria, which have been observed to form a transient bloom in response to manipulations during laboratory mesocosm set-up, where responses of bacterioplankton to either phosphorous or glucose addition were examined [89].
Alteromonadaceae also gathers a group of bacteria which have previously been reported as oil-degraders. On the one hand, the genus Glaciecola has been suggested as an early marker (responder) of oil pollution and an oil degrading Glaciecola strain has previously been isolated [85,90]. On the other hand, this genus has also been suggested to be one of the dominant consumers of phytoplankton-derived organic matter during a cold-water spring bloom [91].
Surprisingly, none of the families with known OHCB genera became dominant during the incubation period of this study. In fact, our datasets showed an overall decline in abundance of the Oceanospirillaceae family, a group that has been often observed in oil spill response. However, one OHCB genus, Oleispira (Oceanospirillaceae), was slightly more abundant in oil-treated seawater by Day 7. Krolicka et al. [17] have suggested Oleispira to be a good bioindicator of oil occurrence in water based on the abundance and rapid response of this taxa shortly after oil contamination. The observed increase of Oleispira at Day 7 in this study could suggest that the degradation of the added petroleum hydrocarbons was delayed until after the phytoplankton-derived organic matter was utilized by the opportunistic families. Alternatively, other non-OHCB taxa must have contributed to the initial alkane oxidation, such as SAR92 clade bacterium (Porticoccaceae). However, a previous study that analyzed metagenomes of microbial communities after exposure to oil showed that the family Oceanospirillaceae was contributing the most to initial n-alkanes degradation, whereas Colwellia and Porticoccaceae were involved in secondary n-alkane catabolism and beta-oxidation [23].

Dominant Functions Reflect the "Tank Effect"
Phosphorous acquisition dominated the microbial activity in the control tank (T4C and T7C samples), through uptake of dissolved inorganic phosphate and through transport of dissolved organic phosphate in the form of phosphonate. Phosphonate is an important component of the marine dissolved organic phosphorous pool and was suggested to represent a significant phosphorous source in the oceans [92]. For the utilization of other forms of organic phosphorous, likely originating from dead phytoplankton, alkaline phosphatase was expressed [93]. This enzyme was found to be the most responsive process in a study where the effect of organic matter amendment was examined in a marine mesocosm [94]. Dominance of phasin, a polyhydroxyalkanoate (PHA) granule-associated protein, possibly playing a role in PHA-related stress response, suggested the presence of excess organic matter which was likely utilized by bacteria for the synthesis of PHA for carbon and energy storage [95]. The high expression of the glyoxylate-shunt enzyme, isocitrate lyase, also suggests that organic carbon was rather utilized by incorporation into the biomass of rapidly growing opportunistic species (biosynthetic pathways) than by respiration to carbon-dioxide [96]. Overrepresentation of isocitrate lyase, together with the other glyoxylate shunt enzyme malate synthase, in a plume sample of the Deepwater Horizon spill was found by [39]. According to the authors, this indicated the ongoing assimilation of hydrocarbon-derived intermediary metabolites (acetyl-CoA) into biomass. The oil exposed tank (T4O and T7O samples) was characterized by the dominance of PQQ-dependent dehydrogenases (methanol/ethanol family) and aldehyde dehydrogenases (mainly from different Colwellia species). The TPM counts of these were at least an order of magnitude higher than initial oxygenases in the reported metatranscriptome (Table S6). An increase in PQQ-containing alcohol dehydrogenase expression was reported by Rivers et al. [39] in samples retrieved from the plume of the DWH oil spill. They also reported higher expression of aldehyde dehydrogenases in the plume samples. However, in contrast with our results, this coincided with alkane monooxygenase expression and in general the levels of initial monooxygenase enzymes exceeded the levels of aldehyde dehydrogenases. Such co-occurrence was not observed in our dataset. Some PQQ-dependent ethanol dehydrogenases (exaA) and aldehyde dehydrogenases (exaC) taking part in the ethanol oxidation pathway were shown to be essential for survival and growth at cold temperatures [97]. Hence, these dominant proteins could also reflect an essential aspect of the cold-adapted lifestyle of the active bacterioplankton in the oil exposed tank. Some TonB-dependent receptors (TBDRs), such as OmpS of Alcanivorax dieselolei, have been shown to be important in hydrocarbon transport across the bacterial cell envelop [98]. However, most of them are outer membrane proteins known for the active transport of iron siderophore complexes and acquisition of other large molecules, e.g., vitamin B12 [99]. In addition, TBDRs could also be involved in transporting sugar oligomers (i.e., partly degraded polysaccharides) with the help of sugar-binding proteins [100].
The major distinguishing characteristic of the T7O sample was the large amount of hypothetical proteins and phage/viral sequences. Although the total abundance of viral transcripts was similar over the course of the experiment, several phage-associated sequences showed a preferential increase in T7O. It is known that hydrocarbon pollution can increase virus abundance through prophage induction in marine bacteria [101,102], leading to regulation of microbial population density by selective killing of abundant bacteria types [103,104], subsequently releasing back into the water dissolved organic carbon, which is then utilized by bacterial populations [105,106]. Bacteriophages might also influence oil degradation by transferring hydrocarbon degradation genes between bacteria [107]. Phage associated genes have been found to be more abundant in oil-contaminated seawater compared with uncontaminated water [32]. A recent study showed that phages that infect known degraders can be highly abundant in hydrocarbon polluted water [108]. Taxonomic analysis of the viral contigs at the species level showed abundance of the uncultured Mediterranean phage uvMED, abundant in the marine environment [109,110]. Analysis of the contigs that showed >2-fold increase in transcription in oil exposed samples, showed that most of the identified potential phage hosts belonged to Pseudoalteromonas, Vibrio and Xanthomonas. Their transcription increased from 4-8-fold on Day 4 to 31-62-fold on Day 7 (Table S9).

Upregulated Functions
Oil pollution in the marine environment quickly triggers a number of microbial responses, including a broad range of activities: (1) chemotaxis, either away from toxic hydrocarbons or towards them; (2) efflux mechanisms to facilitate secretion of small hydrocarbons that may enter bacterial cells by passive transport, together with secretion systems to transport, proteins (toxins and extracellular enzymes) or biosurfactants outside the cell; (3) death and lysis of sensitive organisms which releases a pool of organic material available for the oil-tolerant and primary or secondary oil-degrading organisms; (4) adhesion to and biofilm formation on oil droplets and signaling between the biofilm forming cells; (5) an increase in nutrient transport and processes to generate available nitrogen, phosphorous and sulfur for growth on carbon-rich and nutrient-poor oil components; and (6) biosurfactant synthesis [38,39,[111][112][113]. Indeed, most of these processes, also reported in recent large scale metatranscriptomic studies [36,37], were represented in the upregulated genes in our dataset (Table S7). However, most of them were also increasing in expression levels in the control tank and many of them had higher TPM values in the T0 samples than in the Day 4 control (T4C) ( Figure 8). As mentioned in Section 3.5, our differential expression analysis approach did to some extent preferentially select genes with small temporal change in the control tank. Moreover, it is likely that more upregulated genes could have been detected with a larger number of true biological replicates. Our aim was to use this approach to narrow down the pool of genes to be examined, rather than to use it to comprehensively identify all significantly upregulated genes.
A straightforward explanation for the simultaneous increase in abundance in control and oil exposed tank could be that most of the upregulated genes were involved in the processing of phytoplankton-derived material released as a result of algal senescence during incubation. Many of the responses reported from bacteria in response to oil are similar to those reported in phytoplankton bloom succession studies and in experiments where anthropogenic dissolved organic carbon (ADOC) or deep-sea water (DSW) is added to seawater [65,71,114]. Increased expression in response to these environmental perturbations typically includes the same broad categories of motility, chemotaxis, transporters, etc. Therefore, it can be challenging to identify truly specific indicators of anthropogenic oil contamination.
Functions associated with a declining phytoplankton bloom community were found in the presented dataset, including 283 differentially abundant genes encoding TRAP transporters (mainly from Alphaproteobacteria, Rhodobacteraceae, but also Candidatus Colwellia aromaticivorans and Pelagibacteraceae), 170 genes encoding branched-chain amino acid transporters (mainly from Alphaproteobacteria, Rhodobacteraceae, but also Gammaproteobacteria, Colwellia and Roseobacter) and 55 genes encoding carbohydrate ABC-type transporters (mainly from Alphaproteobacteria, Rhodobacteraceae and Oceanospirillaceae). All these dissolved organic matter (DOM) uptake-associated functions were found to be significantly differentially represented in a bloom community and are strongly linked to the consumption of carboxylic acids, amino acids and carbohydrates released during bloom senescence [69]. The presence of lysozyme genes from Candidatus Colwellia aromaticivorans, alginate lyases belonging to Colwellia, two glycoside hydrolases from Glaciecola and Alteromonadales and other enzymes, such methylamine dehydrogenases, taurine and sulfonate utilization genes among the upregulated genes further supports the hypothesis that some of the differentially abundant activities were likely reflecting response to phytoplankton-derived material [115][116][117].
Alkane 1-monooxygenases, rubredoxin and P450 cytochromes have been suggested as important and specific functional markers of aliphatic hydrocarbon degradation [29,118]. However, the use of alkane 1-monooxygenase as marker for anthropogenic oil pollution in the marine environment remains controversial. Alkane 1-monooxygenase genes were found to be abundant in various seawater environments as determined by degenerate PCR [119] and metagenomic studies [36,37,120]. One study that quantified alkB gene using qPCR found no clear distinction between the chronically oil polluted Baltic Sea coastal water compared to a nearby less exposed site [121]. Similarly, metatranscriptomic analysis showed that alkane 1-monooxygenases had similar expression levels in plume samples in comparison to the non-plume following the Deepwater Horizon oil spill [39], while GeoChip-based functional gene array hybridization data showed significantly higher alkB gene numbers in the oil contaminated samples (19-26 genes) in comparison to non-contaminated ones (11-15 genes) [32]. Another field study found higher alkB gene expression, measured using qPCR, in oil-contaminated seawater of the Yellow sea in comparison to uncontaminated seawater [122]. In a laboratory experiment, Paisse et al [123] observed that expression of alkB genes (reverse transcription-qPCR) was detected immediately after oil addition (after 1 h) but diminished after two days, even though the alkane degradation was observed during the 14 days of incubation.
Aromatic ring-hydroxylating dioxygenases (RHDO, alpha and beta subunits) are used as robust markers of PAH biodegradation [68]. Primers designed to amplify various RHDOs from Gram-negative and Gram-positive bacteria have been successfully used to differentiate oil-contaminated seawater of the Yellow sea from uncontaminated [122]. In our study, upregulated functions related to aromatic degradation were mostly associated with monooxygenases and dioxygenases involved in degradation of monoaromatic compounds (benzene, toluene, and phenol) and central intermediates (benzoate, catechol, protocatechuate, etc.). Monooxygenases and dioxygenases involved in aromatic degradation were also significantly enriched in the plume transcriptomes analyzed after the Deepwater Horizon oil spill [39]. Among the enriched genes reported by Rivers et al. was the gene cluster benABC, also found to be significantly upregulated on oil in our study (transcribed by Rhodobacteraceae). Most other upregulated aromatic monooxygenases (dmpLMNOP and pheA) and dioxygenases (dmpB, catA and hpaD) found in our study were transcribed by Colwellia, a genus that was abundant in both oil exposed and control incubations.
To establish how alkB genes or RHDOs could be used in a robust manner as oil pollution markers, further work is needed to consolidate the above-mentioned findings, particularly for alkB. It is important to keep in mind, that metatranscriptomics provides a taxonomic and functional profile of actively growing organisms and their expressed genes, resulting in knowledge about the relative contributions of active taxa and genes at a given time. However, an increase or a decrease in relative abundance of an organism or TPM of a gene does not necessarily mean that the absolute values of its biomass or its expression levels have changed. Quantitative approaches such as qPCR are necessary to establish the absolute numbers of organisms or genes of interest. While metatranscriptomics is helpful in identifying gene targets, the robustness of these targets needs to be validated through quantitative assays and through field measurements thereafter. Further research needs to focus on linking together profiling-based approaches with quantitative ones and DNA-based methods with RNA-focused ones. In addition, quantitative analysis and profiling of dissolved organic matter in the seawater subjected to laboratory experiment or collected in the field, together with phytoplankton cell counts, should be implemented in future studies in order to establish better control of these confounding factors.

Conclusions
This work was performed based on the assumption that the presence of oil in water can be rapidly evaluated using a set of functional genes expressed by microorganisms in response to oil. Such specific indicators of oil contamination could then be used on ecogenomic devices, such as the ESP, for real-time detection of oil contamination and also to monitor recovery. Indeed, the metatranscriptome of the five samples archived by ESP instruments revealed some distinction between control and oil-exposed samples: (1) Campylobacteraceae (genus Arcobacter) was relatively less abundant in the oil treatment with the majority of the downregulated transcripts belonging to Arcobacter species; (2) a dominance (peak) of intermediary alcohol and aldehyde dehydrogenases, and monoaromatic ring-degrading enzymes on Day 4 with a subsequent decline by Day 7 except for benzoate/toluate 1,2-dioxygenase components and dihydroxy-cyclohexadiene carboxylate dehydrogenase (benABCD), and benzoyl-CoA converting enzymes (boxABC); and (3) a significant increase in phage-associated sequences, as well as Methylophaga and Gammaproteobacteria bacterium transcripts on Day 7 in the oil treatment. The robustness of these gene transcripts to distinguish non-exposed from oil-exposed seawater needs to be evaluated with a larger set of samples and for different seasons before they can be developed into more targeted quantitative assays for oil occurrence detection with ESP.
Furthermore, upregulated genes of OHCB, particularly Cycloclasticus (i.e., phosphate starvation inducible protein, branched-chain amino acid ABC transporter substrate-binding protein, ammonium transporter and glutamate synthetase), Oleispira (i.e., nitrite reductase, glutamine synthetase, urea transport and ammonium transporter) and Alcanivorax (allophanate hydrolase) were identified. The suitability of these transcripts for oil spill monitoring requires further testing to clarify whether an intense nutrient (nitrogen and phosphorous) acquisition phase does always precede proliferation of OHCB or our observation was specific to a situation where excess nutrients became available as a result of phytoplankton senescence.
Both types of initial oxygenase enzymes, e.g., alkane monooxygenases and ring-hydroxylating dioxygenases responsible for the activation of aliphatic and aromatic hydrocarbons, were present already in the acclimated seawater. Only 11 out of 86 alkane monooxygenases and 29 out of 427 aromatic ring-hydroxylating dioxygenases were identified as upregulated in oil exposed samples in comparison to control using a differential expression analysis approach that selected for genes with small temporal variation in the control samples. Initial monooxygenases from OHCB were found, but only for alkane monooxygenase (Oleispira and Alcanivorax) and not among the upregulated genes.
Despite these differences in gene transcripts, we conclude that the effect of the relatively low oil concentration became largely masked and/or delayed, likely due to the simultaneous release and degradation of algae-derived organic matter. Abundant genes, e.g., alcohol and aldehyde dehydrogenases, which differentiated oil exposed seawater from control, were likely involved in processing of phytoplankton-derived material. Hence, these transcripts cannot be considered specific and suitable markers for oil occurrence in seawater based on these findings.
Our results highlight the importance of considering the overall taxonomic context and interactions within all domains of life present in an ecosystem, when interpreting the effect of oil exposures on bacterial communities. Moreover, experimental designs for discovering and testing functional markers need to incorporate appropriate controls not only for potential abiotic confounding factors (temperature, light, nutrients, etc.) but also biotic ones, e.g., the response of microeukaryotes and viruses to laboratory enclosure. This study emphasizes that careful considerations in experimental design are needed when studying the effects of contaminants on community level functional profiles, and that there is a need to validate laboratory findings with actual field conditions. Furthermore, given the complexity of the interactions between ecosystem members, network analysis could uncover an array of functional genes, which, together implemented as a multitarget assay, could be used in sensitive and robust warning systems. Further insights into taxonomic and functional changes of microbial communities inferred from experimental work that consider environmental factors such as seasonality and organic matter content are needed. Thereafter, a robust array of targets best appropriate for integration into genosensor technologies, such as ESP, can be selected and used for sensing community changes in response to oil contamination.
The datasets generated for this study can be found in the NCBI's Sequence Read Archive (SRA). The raw sequencing reads generated through amplicon sequencing are publicly available under the project number PRJEB32487 (ERS3402758-ERS3402761). The raw sequencing reads generated through metatranscriptomics are publicly available under the project number PRJNA556155 with the following individual SRA Run IDs: T0, SRR9734925; T4C, SRR9734976; T4O, SRR9940697; T7C, SRR9940698; and T7O, SRR9940650.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/2076-2607/8/5/744/s1. Figure S1 Flowchart of the bioinformatic analysis; Figure S2 Concentrations of individual polycyclic aromatic hydrocarbons measured on day1, 3, 5 and 7 of the experiment-shown on a logarithmic scale due to orders of magnitude differences between the different compounds; Figure Table S4. Diversity metrics calculated on family level from the RNA sequencing (RNAseq) and amplicon sequencing (16S) datasets. H-Shannon index and J-Pielou's evenness; Table S5 Uploaded as an Excel Worksheet. Contains: Table S5.1. Taxonomic classification of microbial community in seawater exposed to oil and seawater control observed at the start of exposure (T0), at the day 4 and at the last day of exposure (day 7) determined by metatranscriptomic and 16S rRNA amplicon,