Collection of Environmental Variables and Bacterial Community Compositions in Marian Cove, Antarctica, during Summer 2018

Abstract: Marine bacteria, which are known as key drivers of marine biogeochemical cycles and Earth's climate system, are mainly responsible for the decomposition of organic matter and the production of climate-relevant gases (i.e., CO₂, N₂O, and CH₄). However, further research is required to fully understand the correlation between environmental variables and bacterial community composition. Marine bacteria living in Marian Cove, where the inflow of freshwater has been rapidly increasing due to substantial glacial retreat, are likely undergoing significant environmental changes. During the summer of 2018, we conducted a hydrographic survey to collect environmental variables and bacterial community composition data at three different layers (i.e., the seawater surface, middle, and bottom layers) from 15 stations. In total, 17 phyla and 21 classes of bacteria were identified; Proteobacteria was the most abundant phylum (50.3%), followed by Bacteroidetes. Gammaproteobacteria and Alphaproteobacteria, which belong to Proteobacteria, accounted for the largest proportions among the identified classes. Gammaproteobacteria showed the highest relative abundance in all three seawater layers. The collection of environmental variables and bacterial composition data contributes to improving our understanding of the relationships between the Antarctic marine environment and the marine bacteria that live there.
Dataset: http://doi.org/10.5281/zenodo.4549854
Dataset License: CC-BY


Summary
The collection of environmental variables and bacterial composition data will help clarify the correlation between environmental variables and bacterial community composition in the rapidly changing Antarctic marine environment. Environmental variables and bacterial data were collected at three different layers (seawater surface, middle, and bottom layers) from 15 different stations in Marian Cove, Antarctica, in 2018. The environmental data are provided in Supplementary Table S1, which comprises temperature (T), salinity (S), dissolved oxygen (DO), dissolved inorganic nitrogen (DIN), phosphate (PO₄³⁻), and silicate (SiO₂), with nutrient concentrations in units of µmol L⁻¹ (µM). The bacterial composition data contain 559 different bacterial taxa at the species level (Supplementary Table S2). The obtained bacterial data were quality filtered by removing ambiguous and chimeric DNA sequences and by denoising. Relative abundance (%) was calculated from the operational taxonomic unit (OTU) counts of each sample. In total, 8,022,571 sequence reads were obtained, and Gammaproteobacteria had the highest relative abundance in the surface, middle, and bottom layers at all 15 stations, followed by Alphaproteobacteria (Supplementary Table S2). All bacterial sequence data were deposited with the National Center for Biotechnology Information (NCBI).

Study Area
The study area was Marian Cove, a small cove in the Antarctic located between the Weaver and Barton Peninsulas. Marian Cove is approximately 4.5 km long and 1.5 km wide, with a depth of up to 120 m (Figure 1) [1,2]. As a result of global warming, Marian Cove is experiencing rapid environmental changes [3][4][5][6][7][8], such as increased glacial retreat [9], which causes an increased inflow of freshwater [10]. Seawater flows into Marian Cove through Maxwell Bay in the Bellingshausen Sea. Horizontally, Marian Cove is characterized by the continuous inflow of seawater and freshwater [11][12][13][14]. Vertically, Marian Cove is composed of three different water masses: Surface Glacier Water (SGW; relatively low S and nutrients), Surface Maxwell Bay Water (SMBW; relatively high T and nutrients), and Subsurface Marian Cove Water (SMCW; relatively high S and low nutrients) [15,16].

Marine Environmental Variables
To investigate the differences in bacterial community composition among water masses with different environmental characteristics, we conducted a hydrographic survey to collect environmental variables and bacterial community composition data at three different layers (seawater surface, middle, and bottom) from 15 stations during January 2018. The seawater temperature, salinity, and dissolved oxygen were measured in situ using a conductivity, temperature, and depth (CTD) instrument (Supplementary Table S1). Seawater sampling for nutrients (i.e., dissolved inorganic nitrogen (DIN), phosphate (PO₄³⁻), and silicate (SiO₂)) and for bacterial composition analysis was performed using a 5 L Niskin bottle. All seawater samples were passed through the filtering system, and the bacterial and nutrient samples were then immediately frozen in deep freezers at −80 °C and −20 °C, respectively, until analysis. DIN, phosphate, and silicate were measured using an autoanalyzer (Quaatro, Seal Analytical, Norderstedt, Germany). The total number of collected environmental data values (T, S, DO, DIN, phosphate, and silicate) was 264 (Supplementary Table S1).
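The total of 264 values can be checked with simple arithmetic. The sketch below assumes that the environmental data cover the same 44 samples reported for the bacterial data and that each sample carries one value for each of the six variables; neither assumption is stated explicitly for Supplementary Table S1, and the function name is hypothetical.

```python
# Six environmental variables named in the text.
VARIABLES = ["T", "S", "DO", "DIN", "phosphate", "silicate"]

# Assumption: 44 samples, the sample count quoted for the bacterial data.
N_SAMPLES = 44


def total_measurements(n_samples: int, variables: list[str]) -> int:
    """Number of values if every sample has exactly one value per variable."""
    return n_samples * len(variables)


print(total_measurements(N_SAMPLES, VARIABLES))  # 264
```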

Bacterial Community Compositions
As a quality check of the bacterial data, the Q20 score (the percentage of bases for which the Phred quality score is above 20) was higher than 97% for all 44 samples, with a minimum of 97.56%, a maximum of 98.3%, and an average of 97.92%. The relative abundance (%) was calculated from the OTU counts of each sample. For all 44 samples, the total number of sequence reads and the number of OTUs were 8,022,571 and 1,924,384, respectively. The bacterial data (Supplementary Table S2) show that Proteobacteria (50.3%) was the most abundant phylum, followed by Bacteroidetes (30.4%), Firmicutes (6.4%), and Actinobacteria (3.4%). At the class level, unclassified bacteria (35.6%) formed the largest category, followed by Gammaproteobacteria (27%), Alphaproteobacteria (17%), and Clostridia (4%). Gammaproteobacteria showed the highest relative abundance in the surface, middle, and bottom layers (Supplementary Table S2).
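The quality score quoted above (the percentage of bases with a Phred score above 20, commonly called Q20) can be illustrated with a minimal sketch. It assumes FASTQ quality strings in the Phred+33 encoding and counts scores of 20 or higher, which is the usual convention; the function name and the example read are hypothetical, and this is not the tool used to produce the reported values.

```python
def q20_percent(quality: str, offset: int = 33, threshold: int = 20) -> float:
    """Percentage of bases whose Phred score is at least `threshold`.

    Assumption: `quality` is a FASTQ quality string in Phred+33 encoding.
    """
    if not quality:
        raise ValueError("quality string is empty")
    # Decode each ASCII character into its Phred score.
    scores = [ord(ch) - offset for ch in quality]
    passing = sum(1 for q in scores if q >= threshold)
    return 100.0 * passing / len(scores)


# Hypothetical read: 'I' encodes Phred 40 and '#' encodes Phred 2.
print(q20_percent("IIIIIIII##"))  # 80.0
```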

Collection and Measurement of Marine Environmental Variables
In January 2018, seawater was collected at three different layers (surface, middle, and bottom) from 15 different stations located between the glacier inside Marian Cove and Maxwell Bay using 5 L Niskin bottles (Figure 1 and Supplementary Table S1). The three layers were determined from the CTD profile: the shallowest depth was defined as the surface layer, the deepest depth as the bottom layer, and the depth midway between the surface and bottom layers as the middle layer. For bacterial community analysis, 2 L of seawater from each station was filtered through a 0.2 µm membrane (Whatman 47 mm polycarbonate membrane). Filtered samples were then immediately frozen and stored at −80 °C until DNA extraction. For nutrient analysis (dissolved inorganic nitrogen (DIN = NH₄⁺ + NO₂⁻ + NO₃⁻), phosphate (PO₄³⁻), and silicate (SiO₂)), seawater was filtered through a 0.2 µm syringe filter (Sartorius, Cat. No 16532, Göttingen, Germany), and the filtered seawater was placed in a 50 mL conical tube and stored at −20 °C until analysis.
Marine environmental variables of the seawater samples were measured at each of the 15 stations. Seawater T, S, and DO were measured using a conductivity-temperature-depth instrument (CTD; RBR Ltd., Ottawa, ON, Canada). After the frozen seawater for nutrient analysis was thawed, the DIN, phosphate, and silicate concentrations were measured using an autoanalyzer (Quaatro, Seal Analytical, Norderstedt, Germany).
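The layer definition and the DIN sum described in this section can be sketched as follows. The text does not state how the midpoint between the surface and bottom layers was matched to a measured depth, so taking the profile depth closest to the midpoint is an assumption; the function names and the example values are hypothetical.

```python
def select_layers(depths_m: list[float]) -> dict[str, float]:
    """Pick surface, middle, and bottom depths from a CTD depth profile."""
    if not depths_m:
        raise ValueError("depth profile is empty")
    surface = min(depths_m)  # shallowest depth
    bottom = max(depths_m)   # deepest depth
    midpoint = (surface + bottom) / 2.0
    # Assumption: the middle layer is the profile depth nearest the midpoint.
    middle = min(depths_m, key=lambda d: abs(d - midpoint))
    return {"surface": surface, "middle": middle, "bottom": bottom}


def din(nh4: float, no2: float, no3: float) -> float:
    """Dissolved inorganic nitrogen as the sum NH4+ + NO2- + NO3- (same units)."""
    return nh4 + no2 + no3


# Hypothetical profile depths (m) and nutrient concentrations (µM).
print(select_layers([1.0, 10.0, 20.0, 30.0, 41.0]))
print(din(1.5, 0.25, 20.0))
```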

DNA Extraction and Sequencing
To assess the bacterial community composition, DNA was extracted from the filters using the PowerSoil® DNA Isolation Kit (Cat. No 12888, MOBIO, Carlsbad, CA, USA). Quantitative and qualitative DNA analyses were carried out using PicoGreen and Nanodrop, respectively. To amplify the V3-V4 regions of the extracted environmental DNA, a polymerase chain reaction (PCR) was carried out using primers 341F and 805R. The Illumina (San Diego, CA, USA) MiSeq™ platform was used for sequencing.
The paired-end reads generated by sequencing were merged using FLASH (v. 1.2.11, Center for Computational Biology, College Park, MD, USA) [17] to obtain single long sequences. Low-quality, chimeric, and ambiguous sequences were treated as sequencing errors and removed. Denoising was performed using the rDNA tools of CD-HIT-OTU, an OTU analysis program for rRNA sequences. Sequences with 97% or greater similarity were then clustered into species-level OTUs [18,19]. QIIME (v. 1.8.0, San Diego, CA, USA) was used to analyze the microbial communities [20,21]. Each OTU was compared against the National Center for Biotechnology Information 16S microbial database. Raw sequence data were deposited in the NCBI Sequence Read Archive (SRA) under accession number PRJNA533713. Further details can be found in S. Kim et al. [22].
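Once reads have been clustered into OTUs and assigned a taxonomy, the relative abundance (%) reported for each sample follows from the OTU counts. The sketch below shows that calculation for one sample; the function name and the counts are hypothetical and are not taken from Supplementary Table S2.

```python
def relative_abundance(otu_counts: dict[str, int]) -> dict[str, float]:
    """Convert per-taxon OTU counts for one sample into percentages."""
    total = sum(otu_counts.values())
    if total == 0:
        raise ValueError("sample has no OTU counts")
    return {taxon: 100.0 * count / total for taxon, count in otu_counts.items()}


# Hypothetical class-level counts for a single sample.
sample = {
    "Gammaproteobacteria": 540,
    "Alphaproteobacteria": 340,
    "Clostridia": 80,
    "Unclassified": 40,
}

for taxon, percent in relative_abundance(sample).items():
    print(f"{taxon}: {percent:.1f}%")
```

The percentages for one sample sum to 100, so values are comparable between samples with different sequencing depths.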

Data Availability Statement:
The data presented in this study are available within the supplementary material at www.mdpi.com/2306-5729/6/3/27/s1.