Complete Mitogenome Sequencing, Annotation, and Phylogeny of Grateloupia turuturu, a Red Alga with Intronic cox1 Gene

The mitochondrial genome (mitogenome) is essential for identifying species and for studying genetic variation, gene arrangement, and evolution. Here, the mitogenome of Grateloupia turuturu was sequenced on the Illumina sequencing platform. This circular mitogenome (28,265 bp) contains 49 genes, including three rRNAs, twenty transfer RNAs (tRNAs), and twenty-six protein-coding genes (PCGs). The nucleotide composition is AT-biased (68.8% A + T). A group II intronic sequence was identified between two exons of the cox1 gene, and this sequence contains an open reading frame (ORF) that encodes a hypothetical protein. The gene content, annotation, and genetic makeup are identical to those of Halymeniaceae members. The complete mitogenome sequences of the Grateloupia and Polyopes species were used in a phylogenetic analysis, which revealed that these two genera are monophyletic and that G. turuturu and G. elliptica are closely related. This newly constructed mitogenome will help us better understand the general trends in the development of cox1 introns in Halymeniaceae, as well as the evolution of red algal mitogenomes within the Rhodophyta and among diverse algal species.


Introduction
Rhodophyta (red algae) are an evolutionarily significant eukaryotic lineage that inhabits marine and freshwater environments. Rhodophyta species are mostly multicellular and photoautotrophic; about 98% are marine, freshwater species are rare, and a few are terrestrial or sub-aerial [1]. Red algae have the photosynthetic pigments chlorophyll a and d, and their characteristic red color is due to the pigment phycoerythrin. In the evolutionary sense, red algae are plant-like because they share a common ancestor with green algae (Chlorophyta) and higher plants (Embryophyta) [2,3]. Rhodophytes are divided into seven classes with around 7538 species; among them, the class Florideophyceae contains the most species (7141), which are mostly marine, multicellular algae including seaweeds [4].
Relatively little is known about the mitogenomes of Rhodophytes, but owing to advances in software and molecular technologies, increasingly detailed studies are being reported. In fact, red algal mitogenomes are more complete than previously reported [12], and it has also been reported that red algae of the class Stylonematophyceae contain multiple minicircular mitochondrial genomes that each encode one or two genes [13]. These studies were made possible by combining several software tools. Red algal mitogenomes have a lower molecular weight than those of other algae, and because of their maternal inheritance, they are a useful tool for evolutionary and phylogenetic studies. In addition, mitogenome sequences give reliable data for studying the gene order, makeup, contents, and secondary structures of the encoded RNA [14,15], and they are also useful for developing molecular kits (barcoding markers) for identifying economically important species [16]. Grateloupia species contain a characteristic intronic cox1 gene (Table 1), and such features are useful for evolutionary and phylogenetic studies [3,6–9,17]. Algal mitogenomes contain introns in the genic regions, tandem repeats, and large intergenic repeats, which create challenges for assembling complete circular mitogenomes [15], but advances in sequencing technologies and bioinformatics tools allow such issues to be overcome. Therefore, using next-generation sequencing methods and bioinformatics tools, we provide here the complete mitochondrial genome of a red alga as well as a phylogenetic relationship based on the complete mitogenome sequence. In this study, we sequenced the complete circular mitogenome of G. turuturu on the Illumina platform and assembled it de novo. Gene annotation, genetic makeup, and gene order were confirmed using several bioinformatics tools, and the phylogenetic analysis was based on complete mitogenome sequences. The data from this study were submitted to NCBI GenBank and will be useful for future research on the evolution and phylogeny of red algal species.

Sample Collection and DNA Isolation
A deep-sea diver from the Marine Eco-Technology Institute in Busan, South Korea, collected Grateloupia turuturu from the coast of Gijang (35°28′ N, 129°25′ E) and then deposited it there under the voucher number PU-T01-S-MA-04 (contact person: Dr. Young-Ryun Kim, yykim@marine-eco.co.kr). Total DNA was extracted using the QIAGEN DNeasy Blood and Tissue Kit (QIAGEN, Hilden, Germany) as per the manufacturer's protocol, and the purity and concentration of DNA were confirmed via a NanoDrop spectrophotometer (Thermo Fisher Scientific D1000, Waltham, MA, USA). Purified total genomic DNA samples were kept at −20 °C until required.

Whole Genome Sequencing
The G. turuturu genome was sequenced using the Illumina platform (Illumina Inc., San Diego, CA, USA). The library preparation and sequencing processes were carried out by the Macrogen Company in Daejeon, South Korea. Sequencing libraries were prepared using the TruSeq Nano DNA Kit according to the manufacturer's protocol, and sequencing was performed on the Illumina HiSeq 2500 platform in paired-end 150 bp mode. Before downstream analysis, the raw data underwent quality checks to obtain clean reads. Low-quality bases (Phred quality score, Q < 20), empty reads, and Illumina adapters were removed with Trimmomatic [18] to mitigate analytical bias. After filtering, 12,903,396 reads (GC = 40.05%, Q20 = 99.26%) were retained from a total of 14,873,050 raw reads (GC = 40.23%, Q20 = 97.33%). The overall quality of the sequencing reads was verified using FastQC v0.11.5 (Babraham Institute, Bioinformatics) [19], and de novo assembly of the mitogenome was performed using various k-mers [20] and the SPAdes v3.13.0 program [21].
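The Q < 20 threshold and the Q20 percentages quoted above can be illustrated with a minimal sketch. It assumes Phred+33 quality encoding and uses hypothetical helper names (`phred_scores`, `q20_fraction`, `trim_trailing_low_quality`); it is a simplified illustration of quality-based filtering, not a reimplementation of Trimmomatic or of the parameters used in this study.

```python
def phred_scores(quality: str, offset: int = 33) -> list[int]:
    """Decode a FASTQ quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality]


def q20_fraction(quality_strings: list[str], threshold: int = 20) -> float:
    """Fraction of all bases whose Phred score is at least `threshold`."""
    total = 0
    passing = 0
    for quality in quality_strings:
        scores = phred_scores(quality)
        total += len(scores)
        passing += sum(score >= threshold for score in scores)
    return passing / total if total else 0.0


def trim_trailing_low_quality(seq: str, quality: str, threshold: int = 20) -> tuple[str, str]:
    """Remove bases from the 3' end while their Phred score is below `threshold`."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < threshold:
        end -= 1
    return seq[:end], quality[:end]


if __name__ == "__main__":
    # Toy read: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
    print(trim_trailing_low_quality("ACGT", "II##"))
    print(q20_fraction(["II##", "IIII"]))
```

Applied to the read counts reported above, 12,903,396 of 14,873,050 raw reads corresponds to roughly 86.8% of reads retained after filtering.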
The following formulas were used to calculate the asymmetric base composition of the mitochondrial genome: GC skew = (G − C)/(G + C) and AT skew = (A − T)/(A + T).
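As a worked illustration of the skew calculation, the sketch below assumes the usual definitions, AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). The function names `base_skews` and `skew` are hypothetical and chosen for this example only; they are not taken from the tools used in the study.

```python
from collections import Counter


def skew(x: float, y: float) -> float:
    """Generic skew (x - y) / (x + y); returns 0.0 if both values are zero."""
    return (x - y) / (x + y) if (x + y) else 0.0


def base_skews(seq: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) computed from base counts in a nucleotide sequence."""
    counts = Counter(seq.upper())
    at_skew = skew(counts["A"], counts["T"])
    gc_skew = skew(counts["G"], counts["C"])
    return at_skew, gc_skew


if __name__ == "__main__":
    # Toy sequence: A=3, T=1, G=2, C=1.
    print(base_skews("AAATGGC"))
    # The same formula applied to the reported A and T percentages.
    print(round(skew(36.1, 32.7), 3))
```

Because the formula is a ratio, it can be evaluated on counts or on percentages; with the reported A = 36.1% and T = 32.7%, the AT skew comes out at about 0.049, and a positive value indicates more A than T.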

Phylogenetic Analysis
The phylogenetic tree was constructed using the complete circular mitogenome sequences of eight red algae from the family Halymeniaceae (Table 1) and one alga from the family Glaucocystaceae (Glaucocystis nostochinearum, GenBank accession number HQ908425) as an out-group member. All mitogenomes used in this investigation were obtained from NCBI GenBank. The dataset was first aligned with ClustalW for multiple sequence alignment in MEGA11 [32]. The aligned dataset was then used to generate a maximum-likelihood phylogenetic tree using the Tamura-Nei model and 1000 bootstrap replicates in MEGA11 with the default parameters [29,33].
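The idea behind bootstrap replicates can be sketched as resampling alignment columns with replacement. The example below is a minimal illustration under that assumption, with hypothetical function names and a toy alignment; it is not MEGA11's implementation, and the tree inference that would be run on each replicate is omitted.

```python
import random


def bootstrap_alignment(alignment: dict[str, str], rng: random.Random) -> dict[str, str]:
    """Build one pseudo-replicate by sampling alignment columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # The same sampled column indices are applied to every sequence,
    # so each resampled column is an intact column of the original alignment.
    columns = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def bootstrap_replicates(alignment: dict[str, str], n_replicates: int = 1000,
                         seed: int = 0) -> list[dict[str, str]]:
    """Generate `n_replicates` resampled alignments with a seeded generator."""
    rng = random.Random(seed)
    return [bootstrap_alignment(alignment, rng) for _ in range(n_replicates)]


if __name__ == "__main__":
    # Toy alignment with made-up sequences, for illustration only.
    toy = {"taxon_a": "ACGTAC", "taxon_b": "ACGTTC", "taxon_c": "ACCTAC"}
    replicates = bootstrap_replicates(toy, n_replicates=3)
    for replicate in replicates:
        print(replicate)
```

In a full analysis, a tree would be inferred from each replicate, and the support for a clade would be the percentage of replicate trees that contain it.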

Data Availability
The mitogenome sequence and related data were submitted to NCBI GenBank (http://www.ncbi.nlm.nih.gov/, accessed on 12 May 2023 and 16 June 2023). The complete mitogenome sequence is publicly available under the accession number OQ972988, along with the associated BioProject, BioSample, and Sequence Read Archive (SRA) records under the assigned numbers PRJNA984428, SAMN35767756, and SRR24947511, respectively.

Genome Size and Organization
The contig with a length of 28,265 bp was identified as the mitochondrial genome; based on BlastN analysis, it matches the reference species of Grateloupia, and the mitogenome size is comparable to that of other red algal mitogenomes (Table 1). The mitogenome sequence of Grateloupia turuturu is available in GenBank with accession number OQ972988. The complete circular mitogenome map with gene arrangement is shown in Figure 1. The sequence is composed of A = 36.1%, T = 32.7%, G = 16.1%, and C = 15.5%, with a biased A + T content of 68.8%. The G. turuturu mitogenome contains 3 rRNAs, 20 tRNAs, and 26 PCGs (including intronic and hypothetical protein genes), including 14 respiratory chain subunits (complexes 1-4), four ATP synthase subunits (complex 5), two each of LSU and SSU ribosomal proteins, one independent protein translocase (tatC), and two hypothetical protein genes (orf641 and orf173). Among these genes, 24 (12 PCGs, 10 tRNAs, and 2 rRNA genes) are found on the heavy strand (H-strand), while the rest (14 PCGs, 10 tRNAs, and 1 rRNA gene) are found on the light strand (L-strand). A positive AT skew (0.049) and GC skew (0.032) were observed in this study, reflecting the presence of more A than T and more G than C, respectively (Table 1). In comparison to Grateloupia [6–9] and Polyopes [10,11] species with complete mitogenome features, the mitogenome of G. turuturu shows no significant gene losses; among these species, G. elliptica (OP479979) [7] has the most similar mitogenome features in terms of nucleotide composition, AT bias, and gene composition. In Halymeniales, the typical complete mitogenome is circular and approximately 25 to 30 kb in length with correspondingly conserved gene content, encoding 24 PCGs (excluding intronic and hypothetical genes), 2-3 rRNAs, and 18-23 tRNAs with an A + T nucleotide bias (Table 1) [6–11].

Protein-Coding Gene Features
The PCG region, which included intronic and hypothetical genes, made up 71.53% of the G. turuturu mitogenome and was 20,220 bp long. nad5 is the longest PCG with 1998 bp, while atp9 is the shortest with 231 bp. Each PCG was initiated by a canonical ATG codon, except for tatC, which was initiated by a TTG codon (Table 2). Similar results have been demonstrated in G. cornea (OQ910480), G. elliptica [7], and P. affinis [10]. Furthermore, of the 26 PCGs, 21 terminated with the TAA codon, while 5 (sdh2, cox2, atp8, a and rps11) terminated with the TAG codon, which was typical for Grateloupia [6 and P. lancifolius [11]. The G. turuturu mitogenome was analyzed for intergenic nucleotides, and it was noted that the junctions of three gene pairs overlap: 1 bp each between trnL (number 2)-nad6 and trnH-sdh2, and 51 bp between cox3-ymf39. Furthermore, intergenic gaps range from 1 bp to 650 bp in length, with the longest gap of 650 bp between

Figure 1. Gene map of the Grateloupia turuturu (OQ972988) mitochondrial genome. Different categories of genes are represented by abbreviations and arrows outside and inside the circle, which indicate the direction of gene transcription. A gene (cox1) containing a group II intron is denoted with an asterisk. The map was drawn using OrganellarGenomeDRAW (https://chlorobox.mpimp-golm.mpg.de/OGDraw.html, accessed on 15 June 2023).

Table 1. An overview of the complete mitogenomes utilized in this study.

Table 4. Mitochondrial rRNA and tRNA in Halymeniaceae.