Genomic Insights on the Carbon-Negative Workhorse: Systematical Comparative Genomic Analysis on 56 Synechococcus Strains

Synechococcus, a type of ancient photosynthetic cyanobacteria, is crucial in modern carbon-negative synthetic biology due to its potential for producing bioenergy and high-value products. With its high biomass, fast growth rate, and established genetic manipulation tools, Synechococcus has become a research focus in recent years. Abundant germplasm resources have been accumulated from various habitats, including temperature and salinity conditions relevant to industrialization. In this study, a comprehensive analysis of the complete genomes of the 56 Synechococcus strains currently available in public databases was performed to clarify their genetic relationships, the adaptability of Synechococcus to the environment, and how this adaptability is reflected at the genomic level. This was carried out via pan-genome analysis and a detailed comparison of the functional gene groups. The results revealed an open-genome pattern, with 275 core genes and variable genome sizes among these strains. The KEGG annotation and orthology composition comparisons showed that the cold-adapted and thermophilic strains have 32 and 84 unique KO functional units, respectively, in their shared core gene functional units. Each KO functional unit reflects unique gene families and pathways. In terms of salt tolerance, there are 65 unique KO functional units in freshwater-adapted strains and 154 in strictly marine strains. By examining these aspects, this study deepens our understanding of the metabolic potential of Synechococcus and promotes the development and industrial application of cyanobacterial biotechnology.


Introduction
The oxygen-producing photosynthesis of cyanobacteria is one of the most profound physiological roles of microorganisms within the Earth's environment and the evolution of life. Synechococcus and Prochlorococcus are the two most dominant groups of cyanobacteria [1,2]. Compared to the latter, Synechococcus is a more versatile responder and a more widely distributed genus, with diverse metabolic forms [3]. It plays a pivotal role in resource utilization and environmental protection, such as wastewater treatment, high-value product extraction, and biomass transformation [4], thus contributing to a dual harvest of economic and ecological benefits.
Synechococcus is a genus of cyanobacteria defined by its morphological characteristics [5]. Its size ranges from 0.2 to 2 µm. In 1979, Paul W. Johnson officially recognized it as "small unicellular cyanobacteria with ovoid-to-cylindrical cells that reproduce through binary traverse fission in a single plane and lack sheaths" [6]. Synechococcus exhibits extensive diversity in ecological habitats and can be found in various ecosystems, including some of the most extreme environments such as hot springs, the equator, and polar regions [7]. Many Synechococcus species have key advantages such as high biomass, fast growth rate, genetic editing capability, and potential for conversion into biofuels [8], making them ideal model organisms for developing new synthetic biology chassis. Synechococcus strain UTEX 2973 is a fast-growing cyanobacterial strain with good resistance to high temperature and great potential for the production of carbohydrate feedstocks, accumulating glycogen contents as high as 50% of dry cell weight independent of nitrogen depletion [9]. It represents a promising candidate for use as a synthetic biology chassis. Previously, Synechococcus was tested and optimized as a chassis for PHB production [10]. Furthermore, Synechococcus elongatus PCC 7942 was genetically modified to include the heterologous pathway for PLA production, making it a suitable chassis for bioplastic synthesis [2]. Of note, synthetic biology provides powerful technical support for the industrialization of Synechococcus. Emerging synthetic biology tools, such as clustered regularly interspaced short palindromic repeats (CRISPR)/Cpf1, riboswitches, and metabolic network reprogramming circuits, have accelerated its industrial application [11]. By improving the host characteristics, researching and implementing pathway engineering strategies, and enhancing the yield of target products, synthetic biology technology is expected to push the industrialization of Synechococcus to a new level [12].
Since the sequencing of the first strain, Synechococcus sp. WH 8102, in 2003 [13], complete genomic sequences and annotation information for this genus have continued to emerge. In 2009, scientists successfully assembled the first complete genome, that of Synechococcus strain CC9902, with a genome size of 2.24 Mb. In 2020, the sequencing of the strain CBW1006 was completed; this strain had the largest genome among the sequenced strains, at 3.86 Mb. Synechococcus genomes are relatively small and simple [14], consisting of a single circular chromosome without plasmids that carries the genes necessary for survival and photosynthesis, which provides adaptive advantages for survival [15].
However, due to the complexity and instability of their growing environments, industrial production faces many challenges. Synthetic biology attempts to achieve design goals by reconstructing metabolic networks, but the complexity of metabolic networks increases the uncertainty of this process. Comparative genomic analysis can identify key nodes and regulatory factors in metabolic networks, guiding the rational reconstruction of metabolic networks and providing theoretical support for improving the environmental adaptability of Synechococcus.
In this study, 56 Synechococcus strains were compared on the basis of comprehensive genomic sequence analysis to clarify their genetic relationships, the adaptability of Synechococcus to the environment, and how this adaptability is reflected at the genomic level, using pan-genome analysis and a comparison of functional genes. Moreover, six representative strains with different temperature and salinity preferences were selected, and some relevant key metabolic pathways and genes were identified, laying the foundation for research on salt and temperature adaptation mechanisms.

Synechococcus Strains and Database Accession Numbers
The complete genome assemblies of 56 Synechococcus strains were accessed using the accession numbers listed in Table 1, which also summarizes their genome characteristics.

Construction of Phylogenetic Trees
The phylogenetic analysis was conducted using the complete genome sequences of 56 Synechococcus strains as operational taxonomic units and established on GTDB (https://gtdb.ecogenomic.org/, accessed on 18 August 2023).
The relative evolutionary tree was constructed using MAFFT V7 (https://mafft.cbrc.jp/alignment/software/, accessed on 18 August 2023) with 16S rRNA, peroxiredoxin, and cytochrome C oxidase subunit I as evolutionary markers. During the alignment process, the "automated1" parameter option was selected, and the software automatically calculated and selected the optimal evolutionary distance model to obtain a tree file. The tree file was visualized using the IQ-TREE web server (http://iqtree.cibiv.univie.ac.at/) [...] using USEARCH to perform power law regression analysis and generate a pan-core gene trend plot. The functional annotation module of BPGA was then utilized to visualize the COG functional annotation results by inputting genome sequences and COG data as GenBank format files, with specific parameters set, such as an E-value threshold of 10⁻⁵ and the USEARCH clustering algorithm with an identity value of 0.5.
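As an illustration of the power law regression mentioned above, the following minimal Python sketch fits a curve of the form y = a·x^b to pan-genome sizes by least squares on log-transformed values. The functional form, the function names, and the input values are assumptions made for illustration; they are not the BPGA implementation and not data from this study. Under this form, an exponent between 0 and 1 is commonly interpreted as an open pan-genome.

```python
import math


def fit_power_law(n_genomes, pan_sizes):
    """Least-squares fit of y = a * x**b on log-log axes; returns (a, b)."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(s) for s in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                       # slope of the log-log line
    a = math.exp(mean_y - b * mean_x)   # intercept, back-transformed
    return a, b


def looks_open(b):
    """Assumed reading: an exponent strictly between 0 and 1 suggests an open pan-genome."""
    return 0.0 < b < 1.0


# Hypothetical, noise-free input generated from a known curve (not study data).
genome_counts = list(range(1, 57))
pan_sizes = [2500.0 * n ** 0.6 for n in genome_counts]

a, b = fit_power_law(genome_counts, pan_sizes)
print(f"a = {a:.1f}, b = {b:.2f}, open = {looks_open(b)}")
```

Real pan-genome curves are usually averaged over many random orderings of the genomes before fitting; that step is omitted here for brevity.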

Analysis of Genes Exclusively Related to Temperature and Salinity
Six representative, highly studied strains that are well characterized in terms of temperature and salinity were selected for comparative analysis. These strains exhibit exceptional performance and robust tolerance, providing strong support for further research and applications. After enrichment and KEGG pathway analyses, the sequence information of the core genome was extracted and annotated in the KEGG database. The KO identifiers for common gene functional units were obtained and mapped to the KEGG pathway database to find metabolic pathways and the key genes related to environmental adaptation. The KO identifiers for strains with different temperature and salinity preferences were organized into text files and uploaded to the online tool Venny (https://bioinfogp.cnb.csic.es/tools/venny/, accessed on 24 August 2023). This tool displays the intersections and differences of multiple data sets in a visual and concise manner. The difference sets represent the gene functional units unique to the temperature- and salinity-adaptive strain types and are used to predict possible associations with environmental adaptation.
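The intersection and difference sets described above can be sketched with plain Python sets. This is only an illustration of the set logic, not the Venny tool itself; the function names and the placeholder identifiers are assumptions and do not correspond to real KO numbers or to results from this study.

```python
def unique_units(groups):
    """For each group, return the identifiers absent from every other group."""
    unique = {}
    for name, kos in groups.items():
        # Union of all identifiers that belong to the other groups.
        others = set().union(*(v for k, v in groups.items() if k != name))
        unique[name] = set(kos) - others
    return unique


def shared_units(groups):
    """Identifiers present in all groups (the central region of the Venn diagram)."""
    return set.intersection(*(set(v) for v in groups.values()))


# Placeholder identifiers, not real KO numbers or study results.
core_ko = {
    "cold": {"KO_A", "KO_B", "KO_C"},
    "warm": {"KO_A", "KO_B", "KO_D"},
    "thermophilic": {"KO_A", "KO_D", "KO_E"},
}

unique = unique_units(core_ko)
shared = shared_units(core_ko)
print(unique, shared)
```

In practice, each set would be read from the same text file of KO identifiers that is uploaded to the online tool, one identifier per line.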

General Features of 56 Synechococcus Strains
The genome size of these strains ranges from 2.11 to 3.86 Mb, as shown in Table 1. Their G + C contents vary widely, ranging from 40.6% in PCC 7502 to 68% in RSCF101, suggesting high genomic diversity. The number of marine strain genomes sequenced is much greater than that of the freshwater strains. Compared with marine isolates (≈2.63 Mb, 57.57%), freshwater isolates (≈3.1 Mb, 50.25%) have relatively larger genomes but a lower GC content. The number of protein coding sequences (CDS) ranges from 2288 (Synechococcus sp. WH8109) to 3654 (Synechococcus sp. BMK-MC-1), and there is a positive linear relationship between CDS number and genome size (Figure 1). Compared to other genera in cyanobacteria, the genome size of Synechococcus is relatively small but the gene density is relatively high.
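For readers who want to reproduce basic genome features such as those summarized above, the following minimal sketch computes genome size in Mb and G + C content from a nucleotide string. The function name and the toy sequence are assumptions for illustration; a real analysis would read the sequence from a FASTA or GenBank file.

```python
def genome_stats(sequence):
    """Return genome size (Mb) and G + C content (%) for a nucleotide string."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    # Only unambiguous bases are counted in the denominator.
    acgt = gc + seq.count("A") + seq.count("T")
    gc_percent = 100.0 * gc / acgt if acgt else 0.0
    return {"size_mb": len(seq) / 1e6, "gc_percent": gc_percent}


# Toy sequence for illustration only (not from any Synechococcus genome).
stats = genome_stats("ATGCGCGCTA")
print(stats)
```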

Phylogenetic and Comparative Genome Analyses
Information on the 56 Synechococcus strains, including their name, geographical origin, and collection depth, is shown in Table 2. These strains can be divided into subclusters, with different species having distinct geographical distribution patterns (Table S1).
The evolutionary trees constructed using different molecular markers showed a certain correlation between the strains and their geographical environment, with strains sharing the same habitat tending to cluster together (Figure 2). Confidence levels were very high throughout, and cross-comparison of the trees showed consistent genetic relationships, indicating that conserved markers are reliable for evolutionary analysis. Among them, the evolutionary tree based on peroxidase highlighted the decisive role of intrinsic features in the protein sequence [16]. These trees depicted different aspects of the influence of the environment and genetic factors on the evolution of the strains and specific enzymes, revealing the complex evolutionary history of Synechococcus.

Core and Pan Genomic Analyses of the 56 Synechococcus Strains
In pan-genomic studies, the power law model [19] can be used to determine whether the genomic data of a group of strains belong to a closed (conservative) pan-genome or an open pan-genome. The power law regression analysis of the 56 Synechococcus strains (Figure 3) showed that the functional adaptability value of the 56 strains was 0.69 (<1), indicating that the pan-genome is still open. This suggests that the genomes of Synechococcus strains are highly plastic and may acquire new genes more easily, making them more adaptable to various complex environments and leading to a wider distribution. Their core genomes, on the other hand, are relatively conservative: as the number of analyzed genomes increases, more gene acquisition and loss events occur among different strains, resulting in a gradually decreasing number of core genes. When the number of analyzed strains increases to 56, the total number of gene families reaches around 30,000, and the final number of core gene families stabilizes at around 275 (Table S2). The COG functional annotation of the specific genomes (Figure 4) showed that the most common functions in the core genome of Synechococcus are related to metabolism (52.8%) and to information storage and processing (34.1%), while the distribution of accessory and unique gene functions was similar, with metabolic and information-processing functions accounting for 32.46% and 26.49%, respectively. Intracellular biological processes and signaling mechanisms accounted for 15.81% and 19.14%, respectively, and poorly characterized functions accounted for 27.61% and 27.41%, respectively.
Based on their minimum tolerance for growth temperature, these strains could be further classified into cold-adapted, warm-adapted, and thermophilic strains (Table 3). The growth temperature range and optimal growth temperature of warm-adapted strains are moderate; they cannot survive at temperatures below 10 °C, and the optimal growth temperature is generally around 30-40 °C. The lowest growth temperature of cold-adapted strains is below 16 °C. Compared to warm-adapted strains (3.4 Mb, GC 49%), thermophilic strains (3.0 Mb) and cold-adapted strains (2.6 Mb) have relatively small genomes but higher GC contents (about 60% for thermophilic strains and about 58% for cold-adapted strains). Almost all cold-adapted strains can survive or be cultured at around 20 °C, suggesting that Synechococcus strains generally have a certain degree of heat resistance.

Temperature Adaptation Mechanisms
The total number of core KO identifiers is 2305 for thermophilic strains, 2704 for warm-adapted strains, and 1383 for cold-adapted strains, as displayed in Figure 5. Further comparison of the core KO identifiers of the different temperature types showed that 32 and 84 gene functional units were found only in cold-adapted and thermophilic strains, respectively.
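One way to picture how group-level core KO identifiers are obtained is as the intersection of the KO sets of all strains in a temperature group, followed by a comparison between groups. The sketch below assumes this reading; the function name, strain labels, and identifiers are placeholders and do not come from the study.

```python
def core_units(strain_kos):
    """KO identifiers present in every strain of one group."""
    sets = [set(kos) for kos in strain_kos.values()]
    return set.intersection(*sets) if sets else set()


# Placeholder strains and identifiers (hypothetical, for illustration only).
groups = {
    "cold": {
        "strain_1": {"KO_A", "KO_B", "KO_C"},
        "strain_2": {"KO_A", "KO_B", "KO_F"},
    },
    "thermophilic": {
        "strain_3": {"KO_A", "KO_D", "KO_E"},
        "strain_4": {"KO_A", "KO_D"},
    },
}

core = {name: core_units(strains) for name, strains in groups.items()}
cold_only = core["cold"] - core["thermophilic"]
thermo_only = core["thermophilic"] - core["cold"]
print(len(core["cold"]), len(core["thermophilic"]), cold_only, thermo_only)
```

Identifiers found in only some strains of a group drop out at the intersection step, which is why a group's core count can be much smaller than the KO count of any single strain.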

Thermal Adaptation Strategies
The thermophilic strains possess 84 specific core gene functional units (Table S3), which involve various metabolic and disease pathways, such as cancer and bacterial resistance. The core metabolic functions of cold-adapted strains contain 32 unique gene functional units (Table S4), with the most abundant being the core metabolic pathway (Figure 6). Each KO functional unit contains one or more key genes related to thermal adaptation, and their unique key genes can be regulated through the following strategies: (1) Heat shock protein expression. In response to high-temperature environments, the primary adaptive response of all organisms is a heat shock response [20]. An increased expression of SEC63 in response to high temperatures can help correct the protein misfolding caused by high temperatures [21]. (2) Molecular repair mechanisms. CDC48 utilizes ATPase activity to facilitate the assembly and disassembly of protein complexes, thereby clearing misfolded proteins in various organelles such as the nucleus, cytoplasm, endoplasmic reticulum, and mitochondria [22]. An MPG protein initiates base excision repair by cleaving the glycosyl bonds of numerous damaged bases, thus repairing DNA damage caused by high temperatures [23]. (3) Accumulation of solutes for protection. Under high-temperature conditions, bacteria accumulate solutes to balance the water loss caused by high temperatures and help cells maintain osmotic balance, thus increasing their survival rate [24]. TreY, TreZ, and MTTase possess thermotolerance properties and can maintain stable structures and catalytic activities at high temperatures, helping bacteria adapt their glucose metabolism and energy utilization to high-temperature environments. The cysA-encoded sulfate transport system provides sulfur sources for the synthesis of hydrogen sulfide and helps thermophilic bacteria accumulate compatible sulfate solutes, improving the stability of cells in high-temperature environments [25]. The specific OpuA, OpuB, OpuC, and OpuD transport systems in thermophilic bacteria have been extensively studied for their stress protection functions. The OpuA transporter mediates the high-affinity uptake of glycine betaine and proline betaine [26]. The substrate-binding protein of OpuC [27] can recognize a wide range of compatible solutes such as choline, carnitine, and glycine betaine.

Cold Adaptation Strategies
Typically, organisms adapt to low-temperature environments by regulating the expression of cold shock proteins. However, no common cold-shock-protein-encoding genes, such as cspA, cspB, cspC, or cspG, were detected in the genomes of the cold-adapted Synechococcus strains studied. This suggests that Synechococcus does not rely on cold shock proteins for survival under low temperatures and instead employs alternative pathways and molecular mechanisms to achieve low-temperature adaptation. These include fatty acid synthesis and unsaturation modulation related to membrane fluidity, solute accumulation, and the stabilization of protein structures.
The increase in unsaturated fatty acid concentration enhances membrane fluidity at low temperatures. LcyB can convert lycopene to carotene, enhancing the tolerance of the strain to salt, drought, and oxidative stress [28]. LcyB also catalyzes the cyclization of lycopene to produce desaturated carotenoids such as aromelin and naranjine, which can be converted into precursors of unsaturated fatty acids in cell membranes [29].
The rational regulation of glycogen synthesis and breakdown is one of the crucial mechanisms for Synechococcus to adapt to low-temperature conditions. GlgX and GlgP proteins degrade glycogen; the absence of glgX leads to excessive glycogen accumulation [30]. Under low-temperature conditions, glycogen acts as a storage carbon source and energy source. The upregulation of glycogen synthesis pathways allows excess glucose to be stored temporarily to prevent excessive solutes from inhibiting metabolic enzyme activity. Increasing glycogen accumulation also helps balance water entry due to low temperature, while the inhibition of glycogen-decomposition-related enzyme activities reduces glycogen consumption [31].
Stabilizing protein structure is among the strategies employed by cold-adapted Synechococcus strains to respond to low-temperature stresses. Pex5 plays a critical role in targeting proteins to the peroxisome for degradation. Even under low-temperature conditions, Pex5 retains its transport and import functions to help correctly fold and localize enzymes within the peroxisome [32]. RhlE1, a member of the DEAD-box RNA helicase family, can still participate in protein folding under suboptimal growth temperatures [33]. DNAJC3 is a molecular chaperone localized to the endoplasmic reticulum that transiently binds to newly synthesized proteins, especially the regions with unstable structures, helping maintain proper protein synthesis and prevent the incorrect folding of susceptible domains [34].

Salt Tolerance and Comparative Genomics in Synechococcus

Genome Features of Synechococcus with Different Salinity Growth Conditions
Among the 56 strains, salt tolerance ranges, optimal salt levels, and salt tolerance categories were examined for 27 Synechococcus strain genomes, as listed in Table 4. The genome size of freshwater strains was between 2.67 and 3.72 Mb (average of 3.06 Mb); for euryhaline strains, it was between 2.44 and 3.86 Mb (average of 3.28 Mb); and that of strictly marine strains was between 2.22 and 3.35 Mb (average of 2.58 Mb). The GC content of freshwater strains was between 40.6% and 55.5%, while that of salt-tolerant strains was between 62.6% and 67.1%. These values indicate that euryhaline strains tend to have larger genomes, which may be beneficial for encoding more salt-resistance-related proteins. Strictly saline strains have relatively higher GC contents, which may help maintain the stability of their genomes.

Salinity Adaptation Mechanisms
Compared with the functional units of the core genes of salt-loving bacteria, 65 KO functional units were specific to freshwater strains, and 154 KO functional units were specific to strictly marine strains (Figure 7). More KO units were identified in the core and specific units of strictly marine strains, which may be related to the fact that these strains require more genes to support their adaptation to high-salinity environments. According to the list of pathways for the 154 unique KO units of strictly marine strains (Table S5), the largest number belongs to the ABC transporter pathway within the metabolism category. A relatively high number of pathways is also related to the synthesis and accumulation of solutes such as glycogen and lipids. The freshwater strains have 65 unique KEGG core identifiers (Table S6), which represent different pathways, most of them core and sugar metabolic pathways. This demonstrates strong environmental adaptability through the regulation of energy metabolism, nutrient absorption mechanisms, and redox homeostasis (Figure 8), by which freshwater strains adapt to low-salt or salt-free environments.
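The ranking of pathways by the number of group-specific KO units, as described above, amounts to a simple tally over a KO-to-pathway mapping. The sketch below is a hedged illustration: the function name, the placeholder identifiers, and the mapping itself are hypothetical, and a real analysis would take the mapping from the KEGG annotation output.

```python
from collections import Counter


def tally_pathways(unique_kos, ko_to_pathways):
    """Count how many group-specific KO units fall into each pathway."""
    counts = Counter()
    for ko in unique_kos:
        # A KO unit may belong to several pathways; unmapped units are skipped.
        counts.update(ko_to_pathways.get(ko, []))
    return counts.most_common()


# Hypothetical mapping with placeholder identifiers (not real annotations).
ko_to_pathways = {
    "KO_1": ["ABC transporters"],
    "KO_2": ["ABC transporters", "Two-component system"],
    "KO_3": ["Glycerolipid metabolism"],
}
marine_specific = ["KO_1", "KO_2", "KO_3", "KO_9"]

ranking = tally_pathways(marine_specific, ko_to_pathways)
print(ranking)
```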

High-Salt Adaptation Strategies
Analysis of the key genes involved in the specific functional units of strictly marine strains suggests that these strains can adapt to high salinity via the following strategies:

Synthesis of compatible solutes
The accumulation of compatible solutes is a common defense measure used by bacteria to resist harmful effects caused by high salinity and tolerate high-salinity environments. On the one hand, this method can alleviate high-salinity pressure; on the other hand, these compatible solutes can be quickly synthesized and degraded [35]. The CodA gene product can convert choline into glycine betaine, providing tolerance to salt stress [36]. Amt is an important enzyme in the betaine metabolism pathway. It catalyzes the conversion of betaine from methionine to glycine betaine, which is a compatible solute that can resist salt stress [37]. Glycerol acts as a compatible solute and enhances tolerance to various abiotic stresses. Glycerol metabolism is mediated by glp genes. Under conditions of glycerophospholipid metabolism, the glpA gene is highly upregulated, leading to an increased accumulation of glycerol in cells [38].

Antioxidant defense
Salinity induces changes in osmotic potential in cells by reducing the water potential. Subsequently, the accumulation of ions in cells disturbs ion homeostasis and leads to changes in membrane permeability, affecting essential ion absorption. These disturbances caused by salinity lead to a series of metabolic disorders followed by reactive oxygen species (ROS) generation [39]. Glutathione peroxidase (GPX) is an antioxidant enzyme that exhibits high expression under different types of environmental stressors such as bacterial infection, contact with heavy metals, or high concentrations of salt [40]. Under salt-stress conditions, genetically modified strains show better salt tolerance than non-genetically modified individuals, indicating that overexpression of the GPX gene can effectively protect the strain from harm caused by salinity and promptly eliminate excess hydrogen peroxide and lipid peroxides. Overexpression of the GPX gene does not affect the normal growth of the strain but enhances its salt tolerance [41].

Ion transport and distribution
Ion transport is a crucial step in controlling cellular uptake and subsequent storage, reduction, or export, and is essential for the growth of strictly marine strains. Phosphate is a basic nutrient required by all living organisms. All genes responsible for using phosphorus are located in the phosphorus transport system (phnC) [42]. Under high-salinity conditions, increasing Na+ intake leads to an elevated osmotic potential, and additional phosphorus helps balance the osmotic energy required for cell growth [43]. Under salt-stress conditions, PstA can act as a sensor for monitoring changes in intracellular phosphorus concentration through changes in its expression and activity [44]. The transport of nitrate and nitrite mainly involves major facilitator superfamily (MFS) transporters [45]. To avoid the toxicity caused by nitrate, it undergoes assimilatory reduction and active transport processing [46].

Freshwater Adaptation Strategies
In terms of energy metabolism, Synechococcus utilizes methane and other organic substances in freshwater to generate energy.For example, it participates in methane metabolism through the beta subunit of the hydrogenase enzyme coded by frhB [47], and the two subunits coded by cofG and cofH form a methane monooxygenase that participates in the pathway [48].Furthermore, the key enzymes in the glycolysis pathway, such as 3-phosphoglycerate dehydrogenase (GAPDH) and fructose-1,6-bisphosphate aldolase (FBPase), contribute to energy production [49].
In terms of nutrient absorption, Synechococcus takes up scarce nutrients through a phosphorus transport system regulated by pstS [50], a sulfate transport system composed of CysP, CysU, and CysW [51], a branched-chain amino acid transport system encoded by the livF, livG, livH, and livM genes [52], and a bicarbonate transport system comprising the cmpA, cmpB, and cmpC genes [53]. These systems help cells absorb essential amino acids, sulfates, and bicarbonates to meet their needs for survival in freshwater.
Regarding redox regulation, Synechococcus protects cells from high salt or oxidative stress through antioxidant enzymes encoded by the katE and CAT genes [54] and through the superoxide dismutase SOD2 [55]. These mechanisms help maintain a stable redox state within cells and resist external environmental pressures.

Discussion
This study compared and analyzed the genome sequences of Synechococcus, elucidating their genomic features and relationship with environmental adaptability from multiple perspectives. The Synechococcus genome is highly diverse, with different strains showing specific genomic compositions, key genes, and metabolic pathways according to the temperature and salinity of their environments. A systematic phylogenetic and pan-genome analysis was conducted on 56 strains at the whole-genome level. The results showed that these strains have an open-genome pattern, with 275 core genes and variable genomes of differing sizes. The phylogenetic relationships of the strains are related to the environments in which they live.
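The core/variable split described above can be illustrated with a minimal sketch. It assumes that gene families have already been clustered and that each strain is represented by the set of family identifiers it carries; the function name `classify_pangenome`, the strain labels, and the family labels are hypothetical and are not taken from the pipeline used in this study.

```python
from collections import Counter


def classify_pangenome(presence):
    """Split gene families into core, accessory, and unique sets.

    presence: dict mapping a strain name to the set of gene-family IDs
    found in that strain (hypothetical input format).
    """
    n_strains = len(presence)
    if n_strains < 2:
        raise ValueError("at least two strains are needed for a comparison")

    # Count in how many strains each family occurs.
    counts = Counter(fam for fams in presence.values() for fam in fams)

    core = {fam for fam, c in counts.items() if c == n_strains}  # in every strain
    unique = {fam for fam, c in counts.items() if c == 1}        # in one strain only
    accessory = set(counts) - core - unique                      # everything in between
    return core, accessory, unique


# Toy input with made-up family labels, for illustration only.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam5"},
}
core, accessory, unique = classify_pangenome(toy)
```

In this toy input only `fam1` is shared by all three strains, so it is the sole core family. Real pipelines usually relax the "present in every strain" rule with an identity or coverage threshold, which this sketch does not attempt.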
Because only a single variable (either temperature or salinity) was isolated at a time, a complete picture of the fitness of Synechococcus populations with different genotypes under natural variation in both conditions may not have been obtained. Furthermore, environmental parameters other than temperature and salinity (e.g., nutrient availability and pH) may influence the overall fitness of these organisms and thus affect their distribution. The gene families and pathways found in this study may reveal further evidence of adaptation to temperature and salinity and could give insight into other environmental factors affecting the adaptation of Synechococcus.
The growth of Synechococcus strains depends on temperature, which is a major environmental factor controlling photosynthetic rate and biogeography [56]. The high genetic diversity of Synechococcus underlies its ability to tolerate a wide range of temperatures [57]. Temperature plays a crucial role in many industrial processes and chemical reactions. Thermophilic Synechococcus can reduce dependence on low-temperature environments in industrial fermentation, decreasing fermentation costs and increasing production efficiency. The genomic sequences of different types of strains were therefore analyzed from the perspective of temperature adaptability. The results showed that, compared with warm-adapted strains, the genomes of cold-adapted and thermophilic strains have undergone reduction. KEGG annotation and orthology (KO) composition comparisons showed that there are 32 and 84 unique KO functional units within the shared core gene functional units of cold-adapted and thermophilic strains, respectively, which rely on metabolic pathways such as the regulation of heat shock protein expression and membrane fatty acid composition for temperature adaptation. Extremely thermophilic or cold-adapted bacteria are of great interest for industrial processing. The ability to grow at high or low temperatures in bioreactors reduces cooling costs and prevents contamination by mesophilic spoilage bacteria [58]. Synechococcus can serve as a platform for introducing the heat- and cold-adaptive genes identified in this study to enhance its heat and cold tolerance. Additionally, using Synechococcus as a chassis facilitates the construction of cell factories once these genes are introduced, offering innovative solutions for bioenergy, biomedicine, environmental protection, and other fields.
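The idea of "unique KO functional units within the shared core" can be sketched as plain set arithmetic. The sketch assumes that a group's core is the set of KO identifiers present in every strain of that group, and that a KO is unique to a group when it is absent from the cores of all other groups; this definition, the function names, and the `KO_*` labels are illustrative assumptions rather than the exact procedure of this study.

```python
from functools import reduce


def group_core(strain_kos):
    """Return the KO identifiers present in every strain of one group."""
    return reduce(set.intersection, (set(kos) for kos in strain_kos))


def unique_core_kos(groups):
    """For each group, return core KOs that are in no other group's core.

    groups: dict mapping group name -> dict of strain name -> set of KO IDs
    (hypothetical input format).
    """
    cores = {name: group_core(strains.values()) for name, strains in groups.items()}
    unique = {}
    for name, core in cores.items():
        # Union of the cores of all other groups.
        others = set().union(*(c for other, c in cores.items() if other != name))
        unique[name] = core - others
    return unique


# Toy input with placeholder KO labels, for illustration only.
groups = {
    "thermophilic": {
        "T1": {"KO_a", "KO_b", "KO_h"},
        "T2": {"KO_a", "KO_b", "KO_h", "KO_x"},
    },
    "warm": {
        "W1": {"KO_a", "KO_b", "KO_w"},
        "W2": {"KO_a", "KO_b", "KO_w"},
    },
    "cold": {
        "C1": {"KO_a", "KO_c"},
        "C2": {"KO_a", "KO_b", "KO_c"},
    },
}
unique = unique_core_kos(groups)
```

Here `KO_x` is dropped because it is not shared by both thermophilic strains, and `KO_b` is not unique to any group because it sits in two group cores. The same function applies unchanged to a two-group comparison such as strictly marine versus freshwater strains.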
An attractive strategy for controlling biological contamination is to increase the salinity of the growth medium. Salt-tolerant Synechococcus can adapt to seawater fermentation and can be used to reduce production costs and simplify the process. Populations enriched under different salinities are clearly different [59]. Most Synechococcus strains have adapted to relatively stable long-term salinity levels and cannot tolerate drastic changes in salinity [60]. Most Synechococcus strains colonize marine environments and, compared with freshwater host strains, have greater resource utilization advantages. For example, PCC7002 and PCC11901 are not only fast-growing, salt-tolerant, and heat-resistant [61,62], but also good chassis strains for industrial applications. From the perspective of salinity adaptability, the freshwater-adapted strains have 65 unique KO functional units in their shared core genome, which rely on energy metabolism, nutrient absorption, and osmotic pressure regulation for freshwater adaptation. The salt-tolerant strains have 154 unique KO functional units in their shared core genome, which support adaptation to high-salinity environments through antioxidation, nutrient absorption, and transport regulation.
This study provides genome-scale insights into the genetic basis of the adaptation of Synechococcus to different temperatures and salinities, offering a new perspective and abundant gene targets for understanding the environmental adaptability of cyanobacteria. Future research can build on this work by: (1) expanding the sample size through the collection of more complete Synechococcus genome sequences, thereby enhancing our understanding of its diversity and evolutionary relationships; (2) analyzing in depth the predicted key genes and their metabolic regulation in temperature and salt stress responses, experimentally verifying their contribution to environmental adaptability, and supporting application development; (3) investigating the effects of different environments on Synechococcus phenotypes and genotypes, expanding our knowledge of microbial adaptability; and (4) modifying relevant genes to optimize the environmental adaptability of Synechococcus. Genome size information helps clarify the metabolic characteristics of chassis cells and guides synthetic biology designs based on their features.
The research prospects for Synechococcus are vast. Such research not only deepens our understanding of the rules governing microbial environmental adaptability but also promotes the application of Synechococcus in environmental remediation, high-value product biosynthesis, and other areas, allowing its potential value to be fully explored and utilized.

Conclusions
In this research, the complete genome sequences of 56 Synechococcus strains were selected for a comprehensive genomic investigation. Bioinformatics tools were employed to analyze and compare the genomic features of the strains from a comparative genomics perspective, while considering temperature- and salinity-related environmental data. This clarified the connection between genome structural variation and environmental adaptability. Additionally, this study constructed phylogenetic trees using multiple marker genes and analyzed the ecological adaptability of Synechococcus to temperature and salinity from multiple dimensions. Moreover, it proposed a molecular mechanism explaining how Synechococcus responds to environmental changes at the genomic level. It also identified several key genes and metabolic pathways that may be related to salt and temperature adaptation, providing theoretical guidance for the targeted optimization and large-scale application of Synechococcus.

Supplementary Materials:
The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/bioengineering10111329/s1, Table S1: Dominant ecological conditions for Synechococcus clades; Table S2: Status of pan-genome comparative analysis; Table S3: Gene families association with thermal adaptation; Table S4: Gene families association with cold adaptation; Table S5: Gene families association with halotolerance; Table S6: Gene families association with freshwater adaptation.

Figure 1. Linear regression diagram of CDS and genome size. The grey dots represent individual Synechococcus genomes, while the line represents the overall trend in the data.

Figure 2. Systematic evolutionary diversity. Phylogenomic analysis with previously available marine Synechococcus reference genomes placed these within several clades, including 5.1, 5.2, 5.3 [17], Marine cluster C [18], thermophile, and freshwater. (a) Circular phylogenetic tree based on the complete genome sequence. The phylogenetic tree is from GTDB, with branch lengths reflecting phylogenetic information as inferred from the concatenation of 120 marker genes; (b) circular phylogenetic tree based on 16S rRNA; (c) unrooted circular phylogenetic tree based on peroxiredoxin, where the blue color represents the phylogenetic group to which the strain belongs; (d) circular phylogenetic tree analysis of cytochrome c oxidase subunit I sequences.

Figure 3. Pan and core plot of the 56 Synechococcus strains with complete genomes. The purple and orange curves represent the core genome number plot and the pan-genome number plot, respectively. The equations used to fit the curves are shown above the plot.
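The fitted equations themselves are not reproduced in this text. As a hedged illustration of how openness is commonly judged, the sketch below fits a Heaps' law power function, P(N) = κ·N^γ, to pan-genome size as a function of the number of genomes, using least squares in log-log space. The choice of this model, the function name `fit_power_law`, and the synthetic data are assumptions and may differ from the equations shown in the figure.

```python
import math


def fit_power_law(n_genomes, pan_sizes):
    """Fit P(N) = kappa * N**gamma by ordinary least squares on log-log data."""
    xs = [math.log(n) for n in n_genomes]
    ys = [math.log(p) for p in pan_sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gamma = sxy / sxx                            # slope in log-log space
    kappa = math.exp(mean_y - gamma * mean_x)    # intercept, back-transformed
    return kappa, gamma


# Synthetic data generated from an exact power law, for illustration only;
# these are not the values underlying Figure 3.
n_genomes = list(range(1, 11))
pan_sizes = [1000.0 * n ** 0.5 for n in n_genomes]
kappa, gamma = fit_power_law(n_genomes, pan_sizes)
```

Under this model, an exponent γ between 0 and 1 means that the pan-genome keeps growing as genomes are added, which is the usual reading of an open pan-genome.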

Figure 4. COG distribution of core, accessory, and unique genes.

Figure 5. Unique KO identifiers within core identifiers. Venn diagram showing the KO identifier numbers of (a) thermophilic strains; (b) warm-adapted strains; (c) cold-adapted strains; and (d) three different temperature types.

Figure 6. Distribution of the KEGG pathway. The red and blue bar graphs represent the quantity of KEGG pathways. (a) Thermophilic Synechococcus specific core metabolic pathway distributions; and (b) cold-adapted Synechococcus specific core metabolic pathway distributions.

Figure 7. Venn diagram of pangenomes of strictly marine and freshwater strains using KEGG core KO identifiers.

Figure 8. Distribution of the KEGG pathway. The deep blue and light blue bar graphs represent the quantity of KEGG pathways. (a) Strictly marine Synechococcus specific core metabolic pathway distributions; and (b) freshwater Synechococcus specific core metabolic pathway distributions.

Table 2. List of geographical isolation information of Synechococcus.

Table 3. Growth temperature characteristics of different strains of Synechococcus.

Table 4. Salinity growth conditions of the studied Synechococcus strains.