Comparative Analysis of Sequence Polymorphism in Complete Organelle Genomes of the ‘Golden Tide’ Seaweed Sargassum horneri between Korean and Chinese Forms

: Drifting and inundating brown seaweed Sargassum horneri biomass is called “golden tide”, as it resembles golden massive algal blooms like green tides. This phenomenon occurs globally and its serious ecological impacts on coastal ecosystems have recently begun to be paid attention to. In the present study, by sequencing whole organelle genomes of Korean indigenous S. horneri , we aimed to develop novel molecular markers that can be used for di ﬀ erentiating indigenous from nonindigenous individuals. To this end, we analyzed sequence polymorphisms in mitochondrial (mt) and chloroplast (cp) genomes of two Korean benthic samples in comparison to Chinese ones as a reference. We mapped mt genomes of 34,620~34,628 bp and cp genomes of 123,982~124,053 bp for the Korean samples. In comparative analyses, mtDNA cytochrome c oxidase subunit II ( cox2 ) gene showed the highest number of single nucleotide polymorphisms (SNPs) between Korean and Chinese individuals. NADH dehydrogenase subunit 7 ( Nad7 )-proline tRNA ( trnP ) intergenic spacer (IGS) in the mt genome showed a 14 bp insertion or deletion (indel) mutation. For the cp genome, we found a total of 54 SNPs, but its overall evolution rate was approximately four-fold lower than the mt genome. Interestingly, analysis of Ka / Ks ratio in the cp genome revealed a signature of positive selection on several genes, although only negative selection prevalent in mt genome. The ‘candidate’ genetic markers that we found can be applied to discriminate between Korean indigenous and nonindigenous individuals. This study will assist in developing a molecular-based early detection method for e ﬀ ectively managing nonindigenous S. horneri in Korean waters.


Introduction
Drifting and inundating brown macroalgae Sargassum biomass followed by its local blooming is called "golden tide", as it resembles golden massive shoaling algal blooms like green tides. This phenomenon takes place globally and annually, and thus the urgency of its damaging impacts on The genus Sargassum C. Agardh (Sargassaceae, Phaeophyceae) is comprised of 537 species worldwide [10] and they typically occur from subtidal to intertidal environments. A majority of these species undertake a benthic life with holdfasts attached to the substrate. In a natural state, S. horneri and other Sargassum species provide valuable ecosystem functions and services for a diverse array of other coexisting marine species, serving as nurseries and spawning grounds and refuges. Sargassum horneri is dioecious and sexually reproduce with an obligate annual life cycle [11]. A recent demographic study of the two Korean populations of S. horneri showed that thalli recruited in autumn, grew in winter and senesced in summer [9]. When individuals of S. horneri detached from The genus Sargassum C. Agardh (Sargassaceae, Phaeophyceae) is comprised of 537 species worldwide [10] and they typically occur from subtidal to intertidal environments. A majority of these species undertake a benthic life with holdfasts attached to the substrate. In a natural state, S. horneri and other Sargassum species provide valuable ecosystem functions and services for a diverse array of other coexisting marine species, serving as nurseries and spawning grounds and refuges. Sargassum horneri is dioecious and sexually reproduce with an obligate annual life cycle [11]. A recent demographic study of the two Korean populations of S. horneri showed that thalli recruited in autumn, grew in winter Sustainability 2020, 12, 7280 3 of 14 and senesced in summer [9]. When individuals of S. horneri detached from the substrate by the drag force of waves and/or currents, they can float and drift on the surface of the ocean by gas vesicles [12]. High mortality rate driven by S. horneri thalli detached from the substrate was found in winter and also in early spring [9]. Once detached, drifting individuals could damage coastal ecosystems if they form massive patches through rapid growth and propagation [13]. Nevertheless, drifting massive seaweed rafts in the open ocean sometimes offer refuge habitats for pelagic fishes and invertebrate fauna [14].
For possible management and control efforts of golden tides in Korean waters, early and accurate detection of nonindigenous individuals will be crucial for the rapid removal, in order to prevent potential hybridization incidences between these nonindigenous and Korean indigenous S. horneri individuals. Developing a molecular-based early detection technique would therefore be one of the options for their possible control. In our previous study [4], we found that there were significant differences in the genetic structure between floating (nonindigenous) and Korean benthic (indigenous) populations based on seven polymorphic microsatellite loci. Nevertheless, analyses of mitochondrial (mt) DNA cytochrome c oxidase subunit III (cox3) gene could not differentiate indigenous from nonindigenous individuals, due to the fact that this marker showed insufficient polymorphism. By considering geographic distribution of cox3 haplotypes determined in previous studies [4,15] along with the direction of major oceanic currents (e.g., the Kuroshio Current) surrounding this region, we were only able to speculate that S. horneri biomass floating to the Korean coast was likely to originate from Zhoushan, Zhejiang province of China [4]. Since the geographic origin(s) of floating S. horneri could not be firmly determined with mtDNA-based phylogeographic analysis, we, here, set out exploring genetic variation at whole organelle-genomic levels by comparing mt and chloroplast (cp) genomes between Korean and Chinese S. horneri. These analyses will help to differentiate between indigenous and nonindigenous individuals and further determine the origins of nonindigenous individuals more precisely.
The mt and cp genomes of the genus Sargassum have provided important insights into understanding their inter-and intraspecific phylogenetic relationships and evolutionary histories [16,17]. Given uniparental inheritance and accelerated evolutionary rates, mtDNA may provide a higher resolution in elucidating phylogenetic relationships among closely related taxa in brown algae [18]. The cpDNA would also help to advance our understanding of phylogenetic relationships and evolutionary processes that have not been fully resolved, particularly for terrestrial plants and seaweeds [19]. Sequence polymorphisms in mt and cpDNA genomes have been successfully applied to identifying different strains and tracking evolutionary origins of golden tide seaweed biomass in the Yellow Sea [20].
In this study, we analyzed sequence polymorphisms in the whole organelle genomes between two Korean indigenous samples, including one from Jindo Island in the South Sea and the other from Sokcho in the East Sea and nonindigenous Chinese sample (which was collected from Nanji Islands, Zhejiang Province) [21,22]. The aims of the present study were two-fold. First, we constructed and mapped the whole mt and cp genomes for the two Korean indigenous S. horneri individuals. Second, by comparing these organelle genomes with the previously published Chinese ones, we sought to develop novel genetic markers, such as indels (insertion or deletion mutations) and/or single nucleotide polymorphisms (SNPs) that can distinguish Korean indigenous from nonindigenous populations. This study will inform control and management efforts of golden tides and help to develop a "molecular-based early detection method" for nonindigenous S. horneri individuals in Korean waters, which is particularly essential for the rapid removal.

Sample Collection and DNA Extraction
Sargassum horneri individuals were sampled by scuba-diving from the rocky bed in approximately 3 m water depth at Jindo Island (JD; 34 • Figure 2). The sampled specimens were transported to the laboratory in ice box with frozen ice packs (approximately 4-10 • C) that blocked out light until DNA extraction. Using fresh leaf tissue, genomic DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Germany) according to the manufacturer's directions after ground to fine power in liquid nitrogen. The DNA quality and quantity were checked by electrophoresis on 1 % agarose gel and also by a Qubit ® 2.0 Fluorometer (Invitrogen, USA). (approximately 4-10 °C) that blocked out light until DNA extraction. Using fresh leaf tissue, genomic DNA was extracted using a DNeasy Plant Mini Kit (Qiagen, Germany) according to the manufacturer's directions after ground to fine power in liquid nitrogen. The DNA quality and quantity were checked by electrophoresis on 1 % agarose gel and also by a Qubit ® 2.0 Fluorometer (Invitrogen, USA). (SC) in the East Sea of Korea, and a sample of reference genome (filled circles in dark red) [21,22] was collected at Nanji Island (CH), Zhejiang province in China. Areas in red denote Zhoushan region, Zhejiang province in China, which was assumed to be geographic/genetic origin(s) of floating populations in our previous study [4]. Areas in blue denote Jeju Island and Haenam in the South Sea, respectively, which were main coastal regions in Korea influenced by floating S. horneri populations.

Genome Sequencing, Assembly, and Mapping
Paired-end sequencing for the total organelle genomes (mt and cpDNA genomes) of two Korean S. horneri samples was performed on the MiSeq (Illumina Inc., USA) platform with a library insert size of 300-500 bp at LabGenomics (http://www.labgenomics.co.kr/) in Korea. We obtained 168,510 and 3,407,076 raw read pairs with a length of 301 bp for JD and SC samples, respectively. These raw paired-end read pairs were trimmed using Geneious Prime (Biomatters Ltd., New Zealand) by error probability limit set to 0.05. The trimmed reads were aligned using Geneious in reference with previously determined S. horneri mt (GenBank accession no. KJ938300.1) and cp (accession no. KP881334.1) genome sequences sampled from Nanji Islands, Zhejiang province in China (CH; 27°27′N, 121°04′E) [21,22]. The number of mapped reads to the reference genomes was found to be 4,885 (JD)-12,832 (SC) and 11,949 (JD)-34,730 (SC) for mt and cp genomes, respectively. Only using the mapped reads, de novo assembly was then performed with the default setting (assembler: Geneious, sensitivity: medium sensitivity/fast) of the Geneious. The Geneious de novo assembly is based on an overlap assembler with a blast-like algorithm.
The protein coding genes (PCGs), transfer RNA genes (tRNAs), and ribosomal RNA genes (rRNAs) in the organelle genomes were predicted and annotated using Dual Organellar GenoMe  [21,22] was collected at Nanji Island (CH), Zhejiang province in China. Areas in red denote Zhoushan region, Zhejiang province in China, which was assumed to be geographic/genetic origin(s) of floating populations in our previous study [4]. Areas in blue denote Jeju Island and Haenam in the South Sea, respectively, which were main coastal regions in Korea influenced by floating S. horneri populations.

Genome Sequencing, Assembly, and Mapping
Paired-end sequencing for the total organelle genomes (mt and cpDNA genomes) of two Korean S. horneri samples was performed on the MiSeq (Illumina Inc., USA) platform with a library insert size of 300-500 bp at LabGenomics (http://www.labgenomics.co.kr/) in Korea. We obtained 168,510 and 3,407,076 raw read pairs with a length of 301 bp for JD and SC samples, respectively. These raw paired-end read pairs were trimmed using Geneious Prime (Biomatters Ltd., New Zealand) by error probability limit set to 0.05. The trimmed reads were aligned using Geneious in reference with previously determined S. horneri mt (GenBank accession no. KJ938300.1) and cp (accession no. KP881334.1) genome sequences sampled from Nanji Islands, Zhejiang province in China (CH; 27 • 27 N, 121 • 04 E) [21,22]. The number of mapped reads to the reference genomes was found to be 4885 (JD)-12,832 (SC) and 11,949 (JD)-34,730 (SC) for mt and cp genomes, respectively. Only using the mapped reads, de novo assembly was then performed with the default setting (assembler: Geneious, Sustainability 2020, 12, 7280 5 of 14 sensitivity: medium sensitivity/fast) of the Geneious. The Geneious de novo assembly is based on an overlap assembler with a blast-like algorithm.
The protein coding genes (PCGs), transfer RNA genes (tRNAs), and ribosomal RNA genes (rRNAs) in the organelle genomes were predicted and annotated using Dual Organellar GenoMe Annotator (DOGMA) using default parameters and then manually edited through a comparison with the published genome database [23]. tRNAs were confirmed using tRNAscan-SE [24]. The physical maps of the circular mt and cp genomes were drawn using OGDRAW program [25]. Mt and cp genome sequences of the Korean S. horneri samples can be accessed via GenBank with accession numbers MT795186-MT795189, respectively.

Comparative Analyses of Nucleotide Diversity and Synonymous (Ks) and Non-Synonymous Substitution (Ka) Rates
To explore genomic regions that can distinguish nonindigenous floating and indigenous Korean benthic populations, three mt and three cp genomes (JD, SC and CH) were aligned using MAFFT v1.4.0 alignment tool by set to default parameters (algorithm: auto, scoring matrix: 200PAM/K = 2, gap open penalty = 1.53, offset value = 0.123) [26]. We then manually searched for SNP or indel sites between two Korean and one Chinese genomes for each of the mt and cpDNA.
The mt and cpDNA genomes of three samples (JD, SC and CH) were aligned in a pairwise manner using a Geneious Alignment tool. Sliding window analysis of the mt and cp genomes was then conducted to calculate nucleotide diversity (Pi) values using DnaSP 5 [27] with a window length of 600 bp and a step size of 200 bp. To determine genomic regions that might have undergone a positive selection, we estimated synonymous (Ks) and non-synonymous (Ka) substitution rates for a total of 35 and 137 PCGs of mt and cpDNA, respectively using Geneious Alignment. The aligned total length, Pi, and Ka/Ks ratio for each separate gene were calculated in DnaSP 5.

Comparative Analysis of Genome Sequences between Korean and Chinese Individuals
By comparing the sequenced mt and cp genomes of Korean samples with the published genome database for a Chinese sample, the number and order of genes were observed to be the same as the reference genomes, suggesting no structural variation such as insertions or rearrangements exists. We identified 14 bp indel (insertion/deletion) mutation in nad7 (NADH dehydrogenase subunit 7)-trnP (proline tRNA) intergenic spacer (IGS) ( Table 2). Genetic variation between JD and CH genomes revealed the highest number of seven SNPs on cytochrome c oxidase subunit II (cox2) in a total length of 3,147 bp. For a comparison of SC with CH genomes, cox2 gene also showed the highest number of 140 SNPs in a total of 3,154 bp. Although the mt genome had a shorter length in size than the cp genome, it had considerably more variable sites between Korean and Chinese genomes.  For the cp genome, we found 90 bp deletion in trnW (tryptophan tRNA) gene only for JD. While the mt genomes between JD and CH had a total of 52 SNPs in 34,620 bp, the cp genomes had only a total of 54 SNPs in 123,982 bp (Table 2). Overall, the mt genome (0.15%) had an approximately four-fold greater polymorphism than the cp genome (0.04%).

Nucleotide Diversity and Ka/Ks Ratio
We found 38 Korean unique SNPs in the mt genome and 45 SNPs in the cp genome, based on our comparative analysis of JD, SC, and CH genomes ( Table 2).
The Ka/Ks ratios in the 20 PCGs of mt genomes were always lower than 1, while Ka/Ks ratios in some of the 94 PCGs of the cp genomes were > 1 (Supplementary Material Tables S1 and S2). The Ka/Ks ratios of cp genomes between JD and SC showed always < 1 (Table S2), whereas that between JD and CH showed > 1 in only three genes, including dnaB (replicative DNA helicase), orf467 (open reading frame), secA (protein translocase subunit), and that between SC and CH showed > 1 in two genes, including petF (ferredoxin) and psaI (photosystem I reaction center subunit VIII) ( Table 3). These results suggest that the genes with Ka/Ks ratio > 1 might be under a positive selection, in response to different regional environments (e.g., Korea vs. China).       pairwise Pi values between the two genomes. Individual abbreviations as in Figure 1. Table 3. Results of synonymous (Ks) and non-synonymous (Ka) substitution rates for the protein coding genes (PCGs) of the chloroplast genomes between two Korean benthic and Chinese reference genomes. Although Ka/Ks ratios were always < 1 between Korean samples, two (petF, psaI) and three (dnaB, orf467, secA) genes showed Ka/Ks ratios >1 between Korean and Chinese comparisons. Sample abbreviations as in Figure 1.

Discussion
We successfully constructed the whole mt and cp genomes of two Korean benthic individuals using a reference genome available. From our organelle genomes-based comparative analyses between the Korean and Chinese Sargassum horneri forms, we identified several candidate markers, such as an indel and SNPs that can potentially be applied to distinguish Korean indigenous from Chinese nonindigenous populations. Although the mtDNA cox3 gene is known to harbor relatively high polymorphism and has often been used as a useful genetic marker for population genetic or phylogeographic studies of Sargassum species e.g., [15,28,29], we could not identify any sequence polymorphism between nonindigenous floating and indigenous Korean benthic populations of S. horneri in our previous study [4]. In the present investigation, we therefore sought to explore sequence variation between Korean indigenous and Chinese (that were sampled from Nanji Islands, Zhejiang Province where floating populations in Korean waters are presumed to be originated from there) at the whole organelle genome levels [4]. Excessive levels of within-species variation in DNA sequences are sometimes large challenges to the delineation of 'species' [30], while we sought to find a variation for the 'hotspot' region in organelle genomes that can distinguish Korean indigenous and nonindigenous individuals.
Although the results of intraspecific comparisons of S. horneri in our study showed fairly conserved levels of polymorphism in both mtDNA and cpDNA sequences, the cpDNA genomes showed Pi values approximately four times lower than the mtDNA genomes, supporting the previous findings that cpDNA is more conserved than mtDNA in the genus Sargassum [20]. At the species level, a three-fold greater evolutionary rate was detected in mtDNA relative to cpDNA [20]. In a previous study comparing cpDNA genomes of brown algae, the Order Fucales including the genus Sargassum was the most conservative taxa, especially for photosynthetic-related genes [21]. Depending on the importance of gene functions in connection with a functional constraint, mutations in the cpDNA genome may readily be eliminated if they occur, but mutations in mtDNA genome may accumulate as they would likely be selectively neutral [31], otherwise its non-neutral evolution is also feasible. We planned to test those 'candidate' genetic markers (an indel and SNPs) on both the mt and cp genomes, with an attempt to genetically discriminate between Korean indigenous and Chinese nonindigenous samples. The variable sites that we found can be exploited to identify differences between benthic and floating populations (e.g., application of SNPs for only Korean or nonindigenous samples). Although only two Korean benthic samples were analyzed, we suggest that those markers can be applied for distinguishing between nonindigenous floating and indigenous benthic samples.
We found that the highest number of SNPs was present in mtDNA cox2 gene, which was shown to accompany the largest changes in gene lengths when compared to other Phaeophyceae [32]. The cytochrome c oxidase (CO) is an enzyme that activates an enzyme that transfers electrons directly to molecular oxygen. It produces large amounts of ATP through cellular respiration in mitochondria. Therefore, it is thought that this particular gene has the highest mutation rate, among which cox2, a second subunit of CO genes can be strongly influenced by chemically unstable environments by cellular respiration. Also, we found an indel mutation in nad7-trnP IGS on mtDNA. An IGS is noncoding DNA sequences spanned between coding regions and is therefore known to have faster mutation rates than any other sites. Therefore, indel sites of IGS have frequently been used to identify invasive or alien species (populations) or lineages [33].
Comparison of SNPs on organelle genomes across PCGs can often provide important information for determining whether particular gene(s) may be a target by action of natural selection [34]. Our observed Ka/Ks ratio < 1 in all the genes of mt genomes suggests that purifying selection acts on (slightly) deleterious mutations. However, cp genomes showed a Ka/Ks ratio > 1 only in dnaB, orf467 and secA gene between JD and CH, and in the petF and psaI genes between SC and CH genomes, although Korean comparisons always showed a signal of negative selection. These results suggest the possibility that beneficial mutations might have evolved in these particular genes with different functions as a result of positive natural selection due to different habitat environments between Korea and China. DnaB's function is known to unwind double-stranded DNA from chromosomal replication fork. Detected ORFs are regions of unknown functions that are potentially translated [35]; further analysis is needed to clarify their functions but the elevated Ka/Ks in the JD-CH comparison of orf467 may suggest that their functions are important to fitness. SecA gene delivers proteins through the cell membrane through ATP hydrolysis. PetF gene transports electrons in a variety of metabolisms. PsaI gene is involved in photosynthesis. The occurrence of natural selection, such as positive selection identified in five genes, can be under rapid evolutionary radiation, and is thus likely to influence speciation [36]. Among them, the psaI gene revealed the highest Ka/Ks ratio with a value close to six, and it is known that genes related to photosystem I are mostly conservative [37]. However, beneficial non-synonymous substitutions remain in the psaI gene of the SC and CH cp genomes, which might have been beneficial for survival and reproduction in different regional environments, but further studies are needed to verify this interesting hypothesis. However, there is a possibility that the observed signature of positive selection might be false inferences resulting from simultaneous non-independent substitutions within short stretches of DNA, called multinucleotide mutations (MNMs), which are known to prevail in eukaryote genomes [38,39]. Therefore, it is necessary to analyze more plastid genomes to test for the presence of MNMs to ensure whether our findings of positive selection are valid.
We have identified various SNP regions by conducting comparative analysis of genomes, however, nonetheless given that mtDNA and cpDNA of seaweeds are sometimes relatively more conservative, we should consider looking for variation in the nuclear genome [40] using more advanced genome sequencing technologies, such as double digest restriction-site associated DNA-sequencing (ddRAD-Seq) [41]. A recent study suggested that developed SNP markers in the nuclear genome of Sargassum muticum with ddRAD-Seq technique provide much higher marker resolution than any other markers, such as mtDNA and microsatellites [42]. It could also be possible that physiological or ecological characteristics (e.g., stable isotopic signature, morphology), rather than DNA sequences per se can allow us to differentiate Korean indigenous from nonindigenous populations. For example, the notion that the longer the drifting period, the more changes accompany in the thickness of thallus and stem, or even in color, needs to be studied. This would yield insights into how long plants have been floating around the sea, and also how the developmental morphology differs between floating and benthic populations [32].
Although only two Korean samples of S. horneri were analyzed and there will still be further efforts for the development of novel genetic markers, the obtained genome information will serve as reference database and thus will help to more precisely identify the origin of the floating S. horneri biomass that flows into the Korean coast. In the future, our study will inform management efforts including the development of "a DNA-based early detection technique for nonindigenous S. horneri", which helps to mitigate ecological and economic damages on the Korean coastal ecosystem.
Supplementary Materials: The following are available online at http://www.mdpi.com/2071-1050/12/18/7280/s1, Table S1: The Ka/Ks ratio calculated by pairwise protein-coding gene of each three mitochondrial genomes; Table S2: The Ka/Ks ratio calculated by pairwise protein-coding gene of each three chloroplast genomes. Sample abbreviations as in Figure 1.