Plastid Genome Evolution of Two Colony-Forming Benthic Ochrosphaera neapolitana Strains (Coccolithales, Haptophyta)

Coccolithophores are well-known haptophytes that produce small calcium carbonate coccoliths, which in turn contribute to carbon sequestration in the marine environment. Despite their important ecological role, only two of eleven haptophyte plastid genomes are from coccolithophores, and those two belong to the order Isochrysidales. Here, we report the plastid genomes of two strains of Ochrosphaera neapolitana (Coccolithales) from Spain (CCAC 3688 B) and the USA (A15,280). The newly constructed plastid genomes are the largest in size (116,906 bp and 113,686 bp, respectively) among all the available haptophyte plastid genomes, primarily due to the increased intergenic regions. These two plastid genomes possess a conventional quadripartite structure with a long single copy and short single copy separated by two inverted ribosomal repeats. These two plastid genomes share 110 core genes, six rRNAs, and 29 tRNAs, but CCAC 3688 B has an additional CDS (ycf55) and one tRNA (trnL-UAG). Two large insertions at the intergenic regions (2 kb insertion between ycf35 and ycf45; 0.5 kb insertion in the middle of trnM and trnY) were detected in the strain CCAC 3688 B. We found the genes of light-independent protochlorophyllide oxidoreductase (chlB, chlN, and chlL), which convert protochlorophyllide to chlorophyllide during chlorophyll biosynthesis, in the plastid genomes of O. neapolitana as well as in other benthic Isochrysidales and Coccolithales species, putatively suggesting an evolutionary adaptation to benthic habitats.


Introduction
Haptophytes, along with cryptophytes, alveolates, and stramenopiles, belong to the group of taxa that underwent secondary endosymbiosis (CASH lineage), possessing a red algal-derived chlorophyll-a,c containing plastid [1,2], and they thrive predominantly in marine environments. Haptophytes are named for the haptonema, a flagellum-like filamentous appendage involved in prey capture and cell attachment. Marine haptophytes contribute up to 10% of global carbon cycling by fixing CO2 through photosynthesis and biomineralization [3,4]. Furthermore, they account for 50% of the calcium carbonate precipitation in the oceans [5,6].
Within the 28S rRNA tree (Figure 1B, Supplementary Figure S1), the monophyletic Ochrosphaera clade was grouped with the Hymenomonas clade, as in the 18S rRNA tree.
The 28S rRNA tree also showed that the Ochrosphaera clade was subdivided into two clades, with CCAC 3688 B in one clade and A15,280 in the other (Figure 1B).
Morphologically, the cells of CCAC 3688 B and A15,280 each had two parietal golden-yellow plastids with pyrenoids; however, there were some morphological differences between strain A15,280 and the other Ochrosphaera strains (Figure 2). Under the same cultivation conditions, A15,280 produced naked cells that were covered with mucilage (Figure 2D-F), while the other strains typically formed colonies whose cells were covered with coccoliths (Figure 2A-C, G-L).
Ochrosphaera neapolitana was originally described using isolates collected from the coast of Naples, Italy [33]. Two additional species, O. verrucosa and O. rovignensis, were described by the same author using samples collected from the Adriatic Sea coastline in what is now Croatia [34]. The three species were separated based on cell size (O. neapolitana, 4.6-6 µm; O. verrucosa, 8-11.5 µm; and O. rovignensis, 11-13 µm) [20]. Thus, all the Ochrosphaera strains in this study fit the cell size for O. neapolitana, the type species.
The SEM observations showed that strains CCAC 3688 B, NIES-1395, and NIES-1964 contained both pulley-shaped and vase-shaped coccoliths with six to eight elements (Figure 2). These two types of coccoliths were not detected in A15,280, which produced incomplete coccoliths (Figure 2E,F). Gayral and Fresnel-Morange (1971) studied O. neapolitana based on samples from Luc-sur-Mer Marine Station, France, and they suggested that the vase-shaped and pulley-shaped coccoliths were key characteristics of the type species [19,20].
Molecular phylogenetic analyses showed that A15,280 grouped with NIES-1395, NIES-1964, UTEX LB 1722, and other strains in both the 18S and 28S rRNA trees, and we concluded that all four strains should be identified as O. neapolitana. Therefore, we assume that A15,280 secondarily lost its ability to form complete coccoliths.

General Features of the Plastid Genomes
The complete circular plastid genomes of O. neapolitana CCAC 3688 B and A15,280 had assembled lengths of 116.9 kb and 113.7 kb, respectively (Figure 3A). They had a quadripartite structure containing a long single copy (LSC) and a short single copy (SSC) separated by two ribosomal inverted repeats (IRa, IRb). The genomes shared a core set of 110 protein-coding genes, six rRNAs, and 29 tRNAs, but strain CCAC 3688 B had an additional coding sequence (CDS) (ycf55) and one additional tRNA (trnL-UAG). These two gene deletions affected synteny: one inversion (trnF to ycf55) and one gene transposition (ycf45) distinguished the two plastid genomes (Figure 3A). High divergence rates were found between the genomes in the intergenic regions (IGRs), owing to numerous short sequence insertions and deletions (indels) as well as single nucleotide polymorphisms (SNPs) (Figure 3B). In particular, two large insertions in the IGRs (a 2 kb insertion between ycf35 and ycf45 and a 0.5 kb insertion between trnM and trnY) were detected in CCAC 3688 B; together with the two additional genes, these insertions made the CCAC 3688 B plastid genome (116.9 kb) about 3 kb larger than that of A15,280 (113.7 kb) (Figure 3).
Divergence rates between the two strains of O. neapolitana were calculated based on the DNA sequence alignment (Figure 3B). The average DNA sequence variation between the two strains was 25.2%, while that of the genic regions alone was 11.9%. Divergence rates higher than 30% were mostly found in the IGRs and the two transposition regions; the one exceptional genic region was ycf80 (39.7%). These genetic features, along with incomplete coccolith formation in A15,280, represent infraspecific variation of O. neapolitana.
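A divergence rate of this kind can be read as the percentage of alignment columns at which the two sequences differ. The following Python sketch illustrates that reading under stated assumptions: the function name `percent_divergence`, the option to count gap columns as differences, and the toy sequences are hypothetical and are not taken from the pipeline used in this study.

```python
def percent_divergence(seq_a: str, seq_b: str, count_gaps: bool = True) -> float:
    """Percentage of aligned columns at which two sequences differ.

    Assumes both strings come from one pairwise alignment (equal length,
    '-' marks a gap). Columns where both sequences carry a gap are skipped.
    With count_gaps=False, columns with a gap in either sequence are skipped
    too, so only substitutions (SNP-like differences) are counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    different = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue
        if not count_gaps and (a == "-" or b == "-"):
            continue
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * different / compared


# Toy alignment (hypothetical): one substitution and a 2-column indel.
toy_a = "ACGTACGTAC"
toy_b = "ACGAAC--AC"
with_indels = percent_divergence(toy_a, toy_b)                       # 3 of 10 columns
snps_only = percent_divergence(toy_a, toy_b, count_gaps=False)       # 1 of 8 columns
```

Applying such a function in sliding windows, or separately to genic and intergenic coordinates, would give per-region values comparable in spirit to those reported above, although the exact windowing and gap handling used for Figure 3B are not specified here.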
The two new O. neapolitana plastid genomes and eleven publicly available haptophyte plastid genomes were compared (Figure 4). The genome sizes ranged from 95.281 kb (Diacronema lutheri) to 116.9 kb (O. neapolitana CCAC 3688 B) (Figure 4A). The number of protein-coding genes ranged from 108 to 119, the GC content ranged between 35.4 and 36.9%, and no introns were found in the plastid genomes. The average size of the IGRs ranged between 89.4 and 143.4 bp, but the two O. neapolitana plastid genomes fell outside this range, with average IGR sizes of 167.6-176.1 bp (Table 1). Among the plastid genomes of rhodophytes and the CASH lineage (cryptophytes, alveolates, stramenopiles, and haptophytes), the cumulative IGR sizes were largely correlated with the plastid genome size (Supplementary Figure S3). In the case of the haptophytes, the two O. neapolitana plastid genomes contained more than 20% IGRs (21.1-22.0%), whereas other species with smaller genomes had smaller proportions of IGRs (13.2-18.9%), supporting the hypothesis that IGR size is correlated with genome size.
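The IGR proportion and the average IGR size can both be derived from gene coordinates on a circular genome. The sketch below shows one way to do so in Python. It assumes 1-based inclusive coordinates, that no gene spans the origin, and that overlapping or abutting genes are merged before gaps are measured; the function name `intergenic_summary` and the toy coordinates are hypothetical, and the values in Table 1 may have been computed with different conventions.

```python
from typing import List, Tuple


def intergenic_summary(genome_length: int,
                       genes: List[Tuple[int, int]]) -> Tuple[float, float]:
    """Return (percent of genome that is intergenic, mean IGR size in bp).

    genes holds (start, end) pairs, 1-based and inclusive, on a circular
    genome. Assumption: no gene crosses the origin (position 1).
    """
    if not genes:
        raise ValueError("at least one gene is required")
    # Merge overlapping or directly adjacent genes into genic blocks.
    merged: List[List[int]] = []
    for start, end in sorted(genes):
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    # Gaps between consecutive genic blocks.
    gaps = [nxt[0] - cur[1] - 1 for cur, nxt in zip(merged, merged[1:])]
    # The genome is circular, so the stretch after the last block and the
    # stretch before the first block form a single intergenic region.
    wrap = (merged[0][0] - 1) + (genome_length - merged[-1][1])
    if wrap > 0:
        gaps.append(wrap)
    total_igr = sum(gaps)
    percent = 100.0 * total_igr / genome_length
    mean_size = total_igr / len(gaps) if gaps else 0.0
    return percent, mean_size


# Toy genome of 1,000 bp with four genes, two of which overlap (hypothetical).
toy_genes = [(11, 200), (251, 500), (481, 700), (801, 1000)]
igr_percent, igr_mean = intergenic_summary(1000, toy_genes)
```

In the toy case the three IGRs measure 50, 100, and 10 bp, so 16% of the genome is intergenic. The same arithmetic applied to annotated gene coordinates would yield figures of the kind compared across haptophytes above.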
Although the plastid genome structure was conserved within each order of haptophytes, structural conservation was low between the orders, especially in the copy number of ribosomal repeats and their constituents (tRNAs and rRNAs) (Figure 4B,C). The Pavlovales had one copy of the ribosomal operon, which was formed by 16S rRNA-trnL-trnA-26S rRNA, but the other haptophyte taxa (i.e., Rappephyceae and Coccolithophyceae) had a pair of inverted repeats that created a quadripartite structure in the genome. Pavlomulina ranunculiformis (Pavlomulinales) contained symmetric repeats with two intact IRs. Within the Coccolithophyceae, two Phaeocystales species (Phaeocystis antarctica and P. globosa) contained asymmetric repeats, i.e., the IRa had an intact ribosomal repeat, but the IRb lacked two tRNAs [31]. In the case of Chrysochromulina tobinii and Chrysochromulina parva (Prymnesiales), the two ribosomal repeats had only one tRNA each (trnL in IRa and trnA in IRb) [35]. Two strains of Emiliania huxleyi (Isochrysidales) had two IRs; however, they had the shortest SSC between two IRs among all haptophyte plastid genomes [30]. In Tisochrysis lutea and Isochrysis galbana (Isochrysidales), two ribosomal repeats were aligned in the same direction [31,32], and the 5S rRNA was not found in the T. lutea plastid genome. Finally, the two O. neapolitana strains had a canonical quadripartite plastid genome structure (i.e., two IRs with two tRNAs, SSC, and LSC).

Distribution of Genes in the Haptophyte Plastid Genomes
The majority of the genes in the haptophyte plastid genomes were conserved, but some differences in gene content were observed (Figure 5 and Supplementary Table S3). For example, 97 genes, including those involved in photosynthesis, DNA synthesis, DNA repair, protein export, and sulfur-related functions, were well conserved in the haptophyte plastid gene inventories. The three LIPOR genes (chlB, chlL, and chlN) were present only in the two O. neapolitana plastid genomes, whereas the psaK and psbX genes were absent in Ochrosphaera. The protein-export-related genes secG and ycf47 and the membrane translocator gene ycf80 were absent in two Pavlovales strains. The thiS and thiG genes were absent in species of Pavlovales, Pavlomulinales, and Phaeocystales. No horizontal gene transfer (HGT) candidates were found in the plastid genomes except for the rpl34 gene, which is a well-known case of bacterial gene transfer to the ancestor of the haptophyte and cryptophyte plastid genomes [36].
Light-independent protochlorophyllide oxidoreductases (LIPOR; chlL, chlN, and chlB) are a group of enzymes that convert protochlorophyllide to chlorophyllide irrespective of the presence of light. Unlike in other red algal plastid descendants, including cryptophytes (in pseudogenized form), alveolates, and stramenopiles, LIPOR genes have not been previously reported for haptophytes; therefore, it has been suggested that haptophytes lost these genes secondarily [27-29]. Here, we unexpectedly found the LIPOR genes in the plastid genomes of two O. neapolitana strains. To test for the presence of these genes in other haptophyte taxa, we searched for the chlL, chlN, and chlB genes in 11 selected haptophyte strains using specific PCR primers. Although this taxon sampling overrepresented three haptophyte genera with well-established benthic stages (i.e., four Chrysotila, three Ruttnera, and three Ochrosphaera strains), we included taxa from all seven orders of the Haptophyta. Consistent with the plastid genome data, the chlL and chlN genes were found in all Ochrosphaera and Chrysotila strains (Coccolithales) and in three Ruttnera strains (Isochrysidales). The chlB gene was absent in Chrysotila stipitata strain A13,147 and Ruttnera lamellosa strain A12,715 (Supplementary Figure S4). The presence of LIPOR genes in the Ruttnera strains is interesting, because other Isochrysidales (two Emiliania huxleyi strains, Isochrysis galbana, and Tisochrysis lutea) lacked the three LIPOR genes. A common feature of Ochrosphaera, Chrysotila, and Ruttnera is that they have a dominant benthic palmelloid stage (e.g., [7,14], and this study). In contrast, LIPOR genes were not found in haptophyte species that had a predominantly or entirely planktonic stage: Diacronema, Pavlomulina, Phaeocystis, Chrysochromulina, and Emiliania species (Supplementary Table S4 and Supplementary Figure S4).
The presence of LIPOR genes in some Isochrysidales and Coccolithales species suggests two possibilities. First, the genes may have been acquired during two independent horizontal gene transfer events, because all the basal taxa lack these genes. Second, ancestral genes may have been retained independently in the benthic Isochrysidales and Coccolithales but lost in other haptophyte lineages. To test these two possibilities, we compiled the LIPOR gene sequences from two O. neapolitana strains as well as other taxa from the NCBI database. The concatenated gene tree revealed that red algae and the red algal-derived plastid-containing lineages formed a highly supported cluster (BS = 100%) (Supplementary Figure S5). Individual gene trees with more taxa were consistent with the concatenated gene tree (Supplementary Figure S6). This suggests that the LIPOR genes in haptophytes, cryptophytes, alveolates, and stramenopiles were inherited from a red-algal plastid ancestor [27-29,37].
In addition, the chlL and chlN genes are co-localized, and this structural feature is conserved in both red algae and the secondary endosymbiotic CASH lineage, as well as in Viridiplantae and glaucophytes combined, the sister groups of red algae (Supplementary Figure S7, Supplementary Table S4). This also supports the independent gene retention hypothesis, because two independent HGTs are unlikely to have placed the chlL and chlN genes side by side.
Based on these results, independent LIPOR gene retention within three genera, with frequent gene losses in other haptophyte lineages, is a more plausible hypothesis than independent gene acquisitions. This hypothesis is supported by the following observations. Horizontal gene transfer to plastid genomes is rarely observed compared to transfers into the nucleus; one exceptional case is the rpl34 gene in haptophyte and cryptophyte plastid genomes. It has been reported that the plastid genome is highly resistant to the uptake of intracellular DNA [38]. On the other hand, the independent loss of LIPOR genes has been reported in cryptophyte plastid genomes [28]. Some cryptophyte species (e.g., Cryptomonas curvata and Storeatula sp. CCMP 1868) possess three LIPOR genes, some taxa (e.g., Rhodomonas salina, Chroomonas placoidea, and Chroomonas mesostigmatica) have pseudogenized genes, while other taxa (e.g., Guillardia theta, Teleaulax amphioxeia, and Cryptomonas paramecium) have completely lost them. Based on this finding, Kim and colleagues [28] suggested that LIPOR genes are undergoing deletion in cryptophytes.
It is too early to explain why some haptophytes retain LIPOR genes, but we cautiously postulate a correlation between LIPOR genes and benthic habitats. The biosynthesis of the LIPOR protein, which contains an Fe-S cluster, could be metabolically disadvantageous to algae in iron-depleted ocean environments [27,39,40]. Moreover, LIPOR enzymes, similar to the nitrogenases from which they evolved, are very sensitive to oxygen, perhaps explaining why their genes were lost from algae living in oxygen-saturated environments (e.g., oxygenated layers of the open ocean) [24,26,27,41]. Conversely, in microbial mats, which are largely microaerophilic or anoxic, the retention of LIPOR genes is favored. Sassenhagen and Rengefors [40] found LIPOR genes in planktonic raphidophytes that are not benthic but migrate into the anoxic hypolimnion to access phosphorus. In contrast, the POR (light-dependent protochlorophyllide oxidoreductase) protein can perform the same function without the Fe-S cluster; therefore, POR is advantageous at the surface of the oxygen-rich open ocean [27]. In addition, both por and LIPOR genes can be expressed under light conditions, but only LIPOR genes are expressed under dark conditions. Therefore, benthic haptophytes growing in shady habitats (e.g., under macroalgae or within algal mats) may benefit from the expression of LIPOR genes when producing chlorophylls [22,27]. Analogously, the wild type of the green alga Chlamydomonas reinhardtii produces chlB and chlN proteins independent of light, but the chlL protein is only produced under dark or low light conditions (less than 15 µmol m⁻² s⁻¹). Interestingly, Chlamydomonas mutant cells cannot produce the chlL protein under any light conditions, and as a consequence, mutant cells turn yellow under dark conditions because protochlorophyllide accumulates [42].

Algal Cultures and DNA Preparation
Ochrosphaera neapolitana CCAC 3688 B was isolated from Gran Canaria, Canary Islands, Spain (27° ... [14] were also investigated: Chrysotila stipitata A13,112, A12,964, A13,147, and A13,110 and Ruttnera lamellosa A12,715. The marine strains were cultivated in 2× L1 medium at pH 7.8-8.0 [43]. The freshwater Diacronema sp. A13,432 was cultivated in DY-V medium at pH 6.8-7.0 [44]. In addition, deep-frozen cells of Ruttnera lamellosa A13,109 and A12,964 were used (see [14]). The strains used in our experiment are listed in Supplementary Table S5. A modified 2× CTAB method was used to extract DNA from the 14 haptophyte strains. Briefly, samples were plunged directly into liquid nitrogen for 1 min and thawed at 96 °C for 1 min. This freeze-thaw cycle was repeated three times, and the samples were then ground using a sterile micropestle. After this homogenization step, a CTAB DNA extraction method was followed using a 2× CTAB lysis solution as described in Stewart (1997) [45].

Species Identification and Phylogenetic Analysis
For species identification, partial 18S rRNA was amplified from 11 haptophyte strains, whereas an 18S rRNA sequence of Chrysotila stipitata A13,110 was downloaded from NCBI (Accession number: KF696663.1), and sequences for O. neapolitana A15,280 and CCAC 3688 B were acquired from nuclear genome data (Supplementary Table S1). The PCR primers hapto18S-337F (CTACCATGGCGTTAACGGGT) and hapto18S-1423R (TTGCCGCAAACTTCCACTTG) were newly designed. The PCR conditions were an initial denaturation at 95 °C for 2 min, followed by 30 cycles of denaturation at 95 °C for 20 s, annealing of each primer set at 58 °C for 40 s, and extension at 72 °C for 1 min. The PCR products were purified using a LaboPass™ Gel Extraction Kit (Cosmo GeneTech Inc., Seoul, Republic of Korea) and then sent for Sanger sequencing (Macrogen Inc., Seoul, Republic of Korea).
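Basic properties of the two primers can be checked with a few lines of Python. The sketch below computes length, GC content, and a rough melting temperature with the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C). This rule is a swapped-in approximation for short oligonucleotides, not the method the authors report using for primer design, and the helper names are hypothetical.

```python
def gc_percent(primer: str) -> float:
    """GC content of a primer as a percentage of its length."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule:
    2 degrees per A/T plus 4 degrees per G/C (approximation only)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def reverse_complement(primer: str) -> str:
    """Reverse complement, e.g. to locate a reverse primer on the sense strand."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(primer.upper()))


# Primer sequences as given in the text above.
primers = {
    "hapto18S-337F": "CTACCATGGCGTTAACGGGT",
    "hapto18S-1423R": "TTGCCGCAAACTTCCACTTG",
}

for name, seq in primers.items():
    print(name, len(seq), "nt", f"GC={gc_percent(seq):.0f}%", f"Tm~{wallace_tm(seq)}C")
```

Both primers are 20 nt long with 50-55% GC, and the rule-of-thumb estimates (about 60-62 °C) sit a few degrees above the 58 °C annealing temperature, which is a common relationship between the two values.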
Partial 18S rRNA sequences from PCR were used as queries for a BLASTn search (e-value ≤ 1 × 10⁻⁵) to collect homologous sequences, which were aligned using MAFFT (V.8.3.10) with default options for a total of 77 haptophyte taxa. Geneious Prime (V.2020.2.4) was used for manual sequence trimming. The maximum likelihood phylogenetic analysis was conducted using IQ-TREE (V.1.6.8) with 1000 replicates for the ultrafast bootstrap analysis [46]. The best evolutionary model was selected with the default model selection function of IQ-TREE, and the TN+F+I+G4 model for 18S rRNA and the GTR+F+G4 model for 28S rRNA were chosen as the best models. The 28S rRNA sequences for O. neapolitana CCAC 3688 B and A15,280 were obtained from their genome sequencing data, and the other 28S rRNA sequences were downloaded from NCBI (Supplementary Table S1).
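The alignment and tree-building steps correspond to two command-line invocations. The Python sketch below only assembles the argument lists and does not run anything. The file names are hypothetical, and the flags shown (`-s`, `-m MFP`, `-bb` for IQ-TREE 1.x; MAFFT called with no extra options) are one plausible reading of "default options", "model selection", and "1000 ultrafast bootstrap replicates", not a record of the exact commands used in this study.

```python
from typing import List


def mafft_command(unaligned_fasta: str) -> List[str]:
    """MAFFT with default options; the alignment is written to standard output,
    so the caller is expected to redirect it to a file."""
    return ["mafft", unaligned_fasta]


def iqtree_command(alignment_fasta: str, ufboot_replicates: int = 1000) -> List[str]:
    """IQ-TREE 1.x call: -s gives the alignment, -m MFP runs model selection
    followed by tree inference, -bb sets the ultrafast bootstrap replicates."""
    if ufboot_replicates < 1000:
        raise ValueError("ultrafast bootstrap needs at least 1000 replicates")
    return ["iqtree", "-s", alignment_fasta, "-m", "MFP",
            "-bb", str(ufboot_replicates)]


if __name__ == "__main__":
    # Hypothetical file names, printed as shell-style commands for inspection.
    print(" ".join(mafft_command("hapto_18S.fasta")), "> hapto_18S.aln.fasta")
    print(" ".join(iqtree_command("hapto_18S.aln.fasta")))
```

Keeping the commands as argument lists makes it straightforward to pass them to `subprocess.run` later, with the MAFFT output redirected to the alignment file and a manual trimming step in between.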

Morphological Observations: Light Microscopy and Scanning Electron Microscopy
The cell morphology of four O. neapolitana strains (CCAC 3688 B, A15,280, NIES-1395, and NIES-1964) was observed using inverted and upright compound microscopes (Leica DMI3000, Leica Camera AG, Wetzlar, Germany; Olympus BX53, Olympus Corporation, Tokyo, Japan). The scale morphology was observed with a scanning electron microscope (SEM) using cultured cells that were fixed for 20 min with a few drops of 2% (v/v) osmium tetroxide dissolved in the 2× L1 medium. After fixation and centrifugation, the pellet was rinsed three times with autoclaved distilled water, and the cells were transferred to SEM specimen stubs covered with aluminum foil. Specimens were placed in a 65 °C dry oven for one day and then kept in desiccators filled with silica gel for three days. Dried specimens were coated with iridium and observed with a JEOL JSM-6700F Scanning Electron Microscope (JEOL Ltd., Tokyo, Japan) located at the Cooperative Center for Research Facilities, Sungkyunkwan University (Suwon, Republic of Korea).

Validation of the Gains and Losses of Genes
Publicly available haptophyte plastid genomes were retrieved from NCBI (Supplementary Table S2), and protein sequences were extracted to map gene gains or losses. Orthologous gene families (OGFs) of haptophyte proteins were clustered using OrthoFinder (v1.1.8). Each orthologous protein was confirmed by phylogenetic analysis and an NCBI batch CD search. For phylogenetic analyses, each individual gene was searched using BLASTp against the NCBI RefSeq (CXV_ver_201606) database. These homologous clusters were aligned using Clustal Omega (v.1.2.1). In total, 88 plastid genes were concatenated (41,508 amino acid sequences), and the alignment was manually confirmed. Phylogenetic analysis of the concatenated sequences was carried out using IQ-TREE with LG+F+I+G4 as the best evolutionary model.

Conclusions
This study provided complete plastid genomes for two O. neapolitana strains, which represent the first two plastid genome reports for the Coccolithales. Due to large intergenic regions, the two plastid genomes were the largest among the published haptophyte plastid genomes. These two plastid genomes share little of the canonical ribosomal repeat structure with other haptophyte plastid genomes. Meanwhile, these two plastid genomes show infraspecific variation, including the additional ycf55 and trnL-UAG in CCAC 3688 B, along with incomplete coccolith formation in strain A15,280. LIPOR genes were retained in some haptophyte species that have a dominantly benthic form, suggesting a correlation with the benthic life form (e.g., anoxic conditions in microbial mats and dark conditions under macroalgae). Further study is needed to test this hypothesis.