Article

Codon Usage Bias of the Polyphenol Oxidase Genes in Camellia sinensis: A Comprehensive Analysis

by
Yeşim Aktürk Dizman
Department of Biology, Faculty of Arts and Sciences, Recep Tayyip Erdoğan University, 53100 Rize, Türkiye
Plants 2025, 14(19), 3074; https://doi.org/10.3390/plants14193074
Submission received: 23 August 2025 / Revised: 23 September 2025 / Accepted: 2 October 2025 / Published: 4 October 2025
(This article belongs to the Special Issue Plant Genetic Diversity and Molecular Evolution)

Abstract

Tea, a widely consumed beverage globally, is a vital agricultural product for many countries. Polyphenol oxidases (PPOs), copper-containing enzymes found in plants, fungi, and animals, are essential for physiological metabolism and enzymatic browning in tea plants (Camellia sinensis). Codon usage bias (CUB), a key evolutionary characteristic, offers valuable insights into species evolution and gene function. However, the codon usage patterns of Camellia sinensis polyphenol oxidase (CsPPO) genes remain undocumented. In this study, we conducted, for the first time, a comprehensive analysis of CUB in 24 CsPPO genes, comparing their CUB profiles with those of other Camellia species (Camellia lanceoleosa, Camellia nitidissima, Camellia ptilophylla) and non-Camellia species (Actinidia chinensis, Cornus florida, Rhododendron vialii) to elucidate potential evolutionary relationships and functional constraints influencing CUB. Nucleotide composition analysis revealed an AT-rich bias, with a preference for G/C-ending codons at the third position. Codon usage indices indicated low expression levels and weak CUB. RSCU and RFSC analyses revealed that the preferred and high-frequency codons were mostly G/C-ending. Codon usage frequency analysis suggested Zea mays as a suitable host for CsPPO gene expression. ENC-GC3s, PR2, and neutrality plots showed natural selection had a stronger impact than mutation on CUB. Additionally, measure independent of length and composition (MILC) values confirmed low PPO gene expression levels, and correlation analyses demonstrated that both nucleotide composition and gene expression affect CUB. Overall, codon usage in CsPPO genes is mainly shaped by natural selection, with weak bias and low expression potential, providing useful insights for future genetic engineering and heterologous expression.

1. Introduction

Camellia sinensis, widely recognized as the tea plant, is a perennial evergreen woody species cultivated across a broad geographical range extending from tropical to temperate climates. C. sinensis, a member of the Theaceae family, is a crop of significant commercial and medicinal value, primarily due to its rich secondary metabolites and its widespread popularity as an aromatic beverage with a distinctive flavor. The leaves of C. sinensis are rich in various secondary metabolites, including polyphenols, vitamins, alkaloids, volatile oils, and polysaccharides [1,2,3]. Tea polyphenols are primarily composed of catechins, phenolic acids, and flavanones [4]. Polyphenols make up about 36% of the dry weight of young tea leaves, playing crucial roles not only in plant physiology but also in promoting human health [5]. Polyphenol oxidase (PPO), a key metalloenzyme in C. sinensis, is encoded and expressed by nuclear genes [6]. PPO, a vital enzyme in tea production, catalyzes the transformation of key phenolic metabolites into various derivatives, thereby influencing the extent of tea oxidation and significantly shaping its flavor, color, and taste [7]. Apart from its involvement in tea processing, PPO is also vital for stress responses, plant defense, and the regulation of secondary metabolism [8,9]. Due to the functional significance of PPO genes in tea plants, elucidating their genetic and evolutionary features is crucial. Codon usage bias, a key determinant of gene expression and evolutionary dynamics, offers valuable insights into the regulatory mechanisms and adaptive evolution of PPO genes. Accordingly, clarifying codon usage in CsPPO genes can provide insights into how translational constraints have shaped oxidative metabolism and stress response capacity in tea plants, two key biological attributes that underpin both tea quality and environmental adaptation in C. sinensis.
Codons play a fundamental role in the transmission of genetic information, serving as the critical link between nucleotide sequences and their corresponding amino acids during protein synthesis in living organisms. Amino acids may be encoded by between one and six different codons, a characteristic known as codon degeneracy [10]. Among these, codons that correspond to the same amino acid are referred to as synonymous codons, and they are not used with equal frequency during translation [11]. The preferential utilization of specific synonymous codons over others is termed codon usage bias (CUB). Previous studies have revealed that multiple factors influence CUB patterns, such as natural selection, gene length, mutation pressure, genome organization, levels of gene expression, and the structure of the tRNA anticodon-binding domain [12,13]. So far, studies have shown that the frequency of synonymous codon usage varies unevenly among different species, between nuclear and organelle genes, and even within individual genes [14,15]. Investigating CUB in specific genes or genomes is essential for understanding the molecular mechanisms of gene expression and for designing efficient expression vectors to enhance the production of target genes [12]. Optimal codon usage reflects molecular strategies organisms employ to adapt to their environments, thereby revealing the evolutionary forces that shape long-term genomic evolution [16]. Consequently, analyzing CUB provides valuable insights into both the evolutionary trajectories and expression patterns of genes. Consistent with this view, in C. sinensis, examining the balance between selection and mutation in the codon usage of CsPPO genes provides a valuable framework for understanding how the tea plant has evolutionarily optimized PPO activity to regulate phenolic metabolism while simultaneously maintaining resilience under both biotic and abiotic stresses.
CUB has been widely studied in several plant species, including Helianthus annuus [13], Gynostemma species [17], Populus species [15], and Coffea species [18], contributing to our understanding of molecular biology, genetics, and the evolutionary dynamics of plant genomes [19]. In tea plants, PPO genes are of particular interest because they play crucial roles in physiological metabolism and enzymatic browning. Although numerous studies have investigated the characteristics and phylogenetic relationships of nuclear genes in C. sinensis [20,21], the codon usage patterns of PPO genes in this species remain unexplored. The objective of this study was to systematically analyze codon usage bias in 24 CsPPO genes, identify the factors influencing their codon usage, and compare these patterns with PPO genes from other Camellia species (C. lanceoleosa, C. nitidissima, C. ptilophylla) and non-Camellia species (A. chinensis, C. florida, R. vialii). Using nucleotide composition analysis, codon usage indices, and correlation approaches, this work represents the first comprehensive investigation of CsPPO codon usage patterns. The significance of this study lies in its potential applications for optimizing PPO gene expression in synthetic biology, genetic engineering, and crop improvement programs. Identifying optimal codons in PPO genes can guide the development of stable transgenic systems, facilitate efficient heterologous expression, and enable targeted modifications in metabolic pathways to improve tea quality and stress resilience. Ultimately, this research provides a broader perspective on plant gene evolution and the molecular mechanisms regulating secondary metabolism in tea plants.

2. Results

2.1. Patterns of Codon Composition

A key consideration is that genomic nucleotide composition can significantly shape CUB [22]. This study examined the nucleotide composition of the 24 CsPPO genes to assess its potential influence on CUB (Table 1 and Table S1). The nucleotide frequencies of CsPPO genes were as follows: A (27.87%), T (23.27%), C (24.15%), and G (24.71%). To further evaluate whether this nucleotide composition pattern is species-specific, we performed a comparative analysis with PPO genes from a diverse set of plant species. Specifically, we included PPO genes from other Camellia species as well as non-Camellia species representing broader taxonomic diversity. This comparative analysis revealed that the average nucleotide composition across these PPO genes was A (27.09%), T (23.80%), C (24.40%), and G (24.71%) (Table 1 and Table S1). These values demonstrate a modestly higher A content in CsPPO genes, possibly indicating species-specific compositional features. Nucleotide composition at the third codon position was also analyzed. CsPPO genes exhibited A3s (28.38%), T3s (31.25%), G3s (35.04%), and C3s (33.38%), while the average values across other PPO genes were A3s (26.26%), T3s (34.16%), G3s (33.88%), and C3s (32.56%). These differences suggest that CsPPO genes exhibit a stronger preference for G and C at the third codon position, and a reduced usage of A/T, relative to other species. The mean AT and GC content in CsPPO genes was 51.13% and 48.87%, respectively, which is comparable to the mean AT (50.89%) and GC (49.11%) contents observed in other PPO genes. AT3 values in CsPPO genes ranged between 45.00% and 55.26%, with a mean of 46.25%, which is slightly lower than the AT3 average of 47.27% in other PPO sequences. The GC content at various codon positions serves as a critical indicator of nucleotide composition bias. 
Meanwhile, the GC3s content in CsPPO genes was 52.50%, slightly higher than the 51.31% average of other PPO genes, suggesting a moderate preference for G/C-ending codons in CsPPO genes. Furthermore, GC content at the first and second codon positions in CsPPO genes was 52.70% and 40.15%, respectively, compared to 52.76% (GC1) and 41.84% (GC2) in PPO genes from other species. These values indicate that while GC1 content is relatively conserved, GC2 content is somewhat reduced in CsPPO genes. These nucleotide composition patterns, particularly at the third codon position, may contribute to CUB and could reflect species-specific evolutionary pressures or functional adaptations in C. sinensis.
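The positional GC statistics discussed above can be illustrated with a minimal sketch. It assumes an in-frame coding sequence given as a plain string; the function name `gc_by_position` and the toy sequence are hypothetical and are not taken from the study, whose values come from its own software and the 24 CsPPO sequences.

```python
def gc_by_position(cds):
    """Return GC fractions overall, per codon position, and at synonymous third positions."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    def gc_fraction(bases):
        bases = list(bases)
        return sum(b in "GC" for b in bases) / len(bases) if bases else 0.0

    # Met (ATG) and Trp (TGG) have a single codon, and stop codons are
    # conventionally left out, so none of them contribute to GC3s.
    excluded = {"ATG", "TGG", "TAA", "TAG", "TGA"}
    return {
        "GC": gc_fraction("".join(codons)),
        "GC1": gc_fraction(c[0] for c in codons),
        "GC2": gc_fraction(c[1] for c in codons),
        "GC3": gc_fraction(c[2] for c in codons),
        "GC3s": gc_fraction(c[2] for c in codons if c not in excluded),
    }
```

Calling `gc_by_position` on each coding sequence and averaging the results per species would give summary values comparable in kind to those reported in Table 1, although exact agreement depends on how the original software handles ambiguous bases and partial codons.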

2.2. Codon Usage Indices Analysis

In this study, ENC values of CsPPO genes varied between 49.01 and 57.73, with a mean of 56.34 (Table 2 and Table S1), indicating relatively low codon usage bias. To determine whether the codon usage indices of CsPPO genes reflect species-specific trends, we compared them with PPO genes from other Camellia species and non-Camellia species (Table 2 and Table S1). The average ENC value across these species was 56.21, which is slightly lower than that of the CsPPO genes, suggesting that codon usage in CsPPO genes may be slightly less constrained. The CAI values of CsPPO genes exhibited a range of 0.162–0.204, with a mean value of 0.195. Given the generally positive correlation between codon adaptation index and gene expression levels, these CAI values suggest that CsPPO genes are likely to be expressed at relatively low levels. The mean CAI value of PPO genes from other species was similar (0.196), indicating comparable levels of translational adaptation. The CBI values of CsPPO genes spanned –0.239 to –0.015, with an average of –0.058, suggesting a largely random pattern in the use of non-preferred codons among these genes. Other plant PPO genes exhibited a similar average CBI of –0.065, further supporting the absence of strong codon bias across these genes. GRAVY and AROMA metrics were also examined to characterize the physicochemical properties of the PPO proteins. CsPPO genes had a mean GRAVY value of –0.476, supporting their hydrophilic nature, while other PPO genes showed a comparable mean of –0.464. The AROMA values in CsPPO genes ranged from 0.046 to 0.093, with a mean of 0.090, closely matching the 0.091 average observed in other plant PPO genes. These results indicate that the aromatic amino acid composition is generally conserved across PPO genes regardless of species origin.
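To make the CAI values above easier to interpret, the following sketch computes CAI as the geometric mean of per-codon relative adaptiveness weights. The weight table is an assumption: this section does not state which reference gene set was used, so `weights` and the function name `codon_adaptation_index` are purely illustrative.

```python
import math


def codon_adaptation_index(cds, weights):
    """Geometric mean of relative adaptiveness values (w) over the codons of one CDS.

    `weights` maps codon -> w in (0, 1], where w is the codon's usage in a
    reference set divided by the usage of the most used synonymous codon.
    The reference set is a hypothetical input here.
    """
    cds = cds.upper()
    log_sum, n = 0.0, 0
    for i in range(0, len(cds) - 2, 3):
        w = weights.get(cds[i:i + 3])
        if w:  # skip codons with no (or zero) weight, e.g. ATG, TGG, stops
            log_sum += math.log(w)
            n += 1
    return math.exp(log_sum / n) if n else float("nan")
```

A gene built only from the most used codon of each family scores 1.0, so a mean of about 0.195 indicates that CsPPO codons are, on average, far from the reference set's preferred codons.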

2.3. RSCU and RFSC Analyses and Determination of High-Frequency Codons

CUB was assessed through RSCU and RFSC analyses. The results revealed that 26 codons (RSCU > 1) were preferred in the CsPPO genes (Table S2). Of these, 16 codons had G/C endings (9 ending in C and 7 in G), while the remaining ten had A/T endings. These results provided evidence that the codons of CsPPO genes predominantly ended in G/C, indicating a bias in synonymous codon usage influenced by compositional constraints. Additionally, four codons (CTC, GTG, ACC, CGA) were classified as over-represented (RSCU > 1.6), and eight (CTA, CTG, ATC, GTT, GTA, ACG, GCA, CGC) as under-represented (RSCU < 0.6), indicating differential selection pressures among synonymous codons in the CsPPO genes. To further assess whether codon usage trends in CsPPO genes are species-specific, we conducted comparative RSCU and RFSC analyses using PPO genes from other Camellia species and non-Camellia species (Table S2). In C. lanceoleosa, 29 preferred codons were detected, with 14 ending in G/C and 10 ending in A/T. Two codons (GTG, ACC) were over-represented, while eight were under-represented (GTT, TCG, ACG, CAC, GAC, CGT, CGC, GGC). In C. nitidissima, 33 preferred codons were observed, including 17 ending in G/C and 10 in A/T. Four codons (TTG, GTG, ACC, AGA) were over-represented, whereas eight (CTG, GTT, GTA, ACG, GAC, CGT, CGC, GGC) were under-represented. In C. ptilophylla, 27 preferred codons were identified, with 15 ending in G/C and 10 in A/T. Three codons (CTC, GTG, CGA) were over-represented, while nine (CTG, ATC, GTT, GTA, TCA, ACG, GCA, TGT, CGC) were under-represented. Among the non-Camellia species, A. chinensis exhibited 30 preferred codons, of which 19 ended in G/C and 11 in A/T. Two codons (GTG, AGG) were over-represented, and seven (GTT, GTA, AGT, ACA, GAA, CGC, CGA) were under-represented. In C. florida, 30 preferred codons were detected, with 12 ending in G/C and 18 in A/T. 
Four codons (TTG, GTG, GCT, AGG) were over-represented, whereas eleven (TTA, GTC, GTA, ACG, GCA, CAC, CAG, CGC, CGA, CGG, GGC) were under-represented. In R. vialii, 28 preferred codons were identified, including 18 ending in G/C and 10 in A/T. Seven codons (ATC, GTG, TCC, ACC, GCC, AGA, AGC) were over-represented, while eleven (CTG, ATA, GTA, TCA, CCT, ACG, CAG, CGT, CGC, CGA, GGC) were under-represented. Overall, the data demonstrate that PPO genes in Camellia species generally exhibit a preference for G/C-ending codons, although the extent of over- and under-representation of specific codons varies among species, reflecting both compositional constraints and species-specific selection pressures.
By calculating the RFSC values of both CsPPO genes and PPO genes from other plants, high-frequency codons were identified. In line with the approach of Zhou et al. [23], codons with a relative frequency greater than 60% (the proportion of a given codon among the total synonymous codons for a particular amino acid) were classified as high-frequency codons. The high-frequency codons in the CsPPO genes are listed in Table 3 and Table S2. A total of 18 high-frequency codons were identified, including TTC, CTC, ATT, GTG, TCC, CCG, ACC, GCC, TAC, CAT, CAA, AAC, AAG, GAT, GAG, TGC, CGA, and GGG. It was noted that the majority of these high-frequency codons preferentially ended in G/C. However, the ACG codon in the CsPPO genes exhibited a low RSCU value, which could help mitigate potential mutations associated with DNA methylation [24]. The GTA codon in CsPPO genes exhibited relatively low RSCU values, and the decrease in TA could enhance protein synthesis by preventing mRNA degradation [25]. In the PPO genes of other plant species, 20 high-frequency codons were identified in C. lanceoleosa, 21 in C. nitidissima, 19 in C. ptilophylla, 17 in A. chinensis, 20 in C. florida, and 20 in R. vialii (Table 3 and Table S2). Comparative analysis revealed seven high-frequency codons (GTG, TCC, ACC, CAA, AAG, GAT, GAG) shared between the CsPPO genes and the PPO genes of these species. Overall, high-frequency codon profiles demonstrated a general bias toward G/C-ending codons across all species, although the specific codon composition varied, indicating both conserved and species-specific codon usage preferences.
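The RSCU and RFSC definitions used in this section can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: codon counts are pooled over all input sequences, the standard genetic code is assumed, and the names `rscu_and_rfsc` and `high_frequency_codons` are hypothetical. The 60% cutoff is the one quoted in the text.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def rscu_and_rfsc(cds_list):
    """Return (RSCU, RFSC) dicts computed from pooled codon counts."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper()
        for i in range(0, len(cds) - 2, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    rscu, rfsc = {}, {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:  # unused family, or Met/Trp
            continue
        for c in codons:
            rfsc[c] = counts[c] / total      # share within the family
            rscu[c] = rfsc[c] * len(codons)  # observed / expected-if-uniform
    return rscu, rfsc


def high_frequency_codons(rfsc, cutoff=0.60):
    """Codons whose share within their synonymous family exceeds the cutoff."""
    return sorted(c for c, f in rfsc.items() if f > cutoff)
```

With these definitions, RSCU > 1 marks a preferred codon, and the thresholds of 1.6 and 0.6 used above mark over- and under-represented codons. RSCU equals RFSC multiplied by the family size, so the two measures rank codons identically within a family but are not comparable across families of different degeneracy.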

2.4. Determination of Optimal Codons

Gene datasets representing high and low expression levels were established based on the ENC values of each PPO coding sequence. Subsequently, the RSCU and ΔRSCU values were computed using CodonW. Optimal codons were identified based on ΔRSCU values exceeding 0.08, with RSCU values greater than 1 in high-bias genes (highly expressed genes) and less than 1 in low-bias genes (lowly expressed genes) (Table S3) [13,26]. The analysis revealed that the CsPPO genes included eight optimal codons: TTT, ATA, CCT, GCT, AAT, CGG, AGG, and GGT. Likewise, optimal codon analysis of PPO genes from other Camellia species and non-Camellia plant species identified a total of 13 optimal codons: TTT, CTT, ATA, GTT, TCA, CCT, ACA, GCT, TAT, AAT, TGT, AGG, and GGT. The findings revealed that seven optimal codons in the PPO genes (TTT, ATA, CCT, GCT, AAT, AGG, and GGT) were shared between C. sinensis, other Camellia species, and non-Camellia plant species.
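The selection rule described above can be written as a short filter. The sketch assumes that RSCU values have already been computed separately for the high-bias and low-bias gene sets; the function name `optimal_codons` and the toy values in the usage note are hypothetical, while the three thresholds are those stated in the text.

```python
def optimal_codons(rscu_high, rscu_low, delta_cutoff=0.08):
    """Codons with delta-RSCU > cutoff, RSCU > 1 in high-bias genes and < 1 in low-bias genes."""
    selected = []
    for codon, high in rscu_high.items():
        low = rscu_low.get(codon)
        if low is None:  # codon absent from the low-bias set: cannot compare
            continue
        if high - low > delta_cutoff and high > 1.0 and low < 1.0:
            selected.append(codon)
    return sorted(selected)
```

For example, a codon with RSCU 1.30 in the high-bias set and 0.80 in the low-bias set passes all three conditions, whereas one with values 1.05 and 0.99 fails because its ΔRSCU of 0.06 is below the cutoff.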

2.5. Codon Usage Frequency Analysis

Transgenic research frequently relies on heterologous gene expression, which can be influenced by numerous factors. Among these, the selection of optimal codons is one of the most critical determinants for successful expression of exogenous genes in the host. Differences in codon usage bias between the CsPPO genes and the host organisms can substantially affect gene expression levels; therefore, CUB must be carefully considered when investigating CsPPO gene expression in an exogenous system. In this study, the codon usage frequencies of the CsPPO genes and PPO genes from other plant species were analyzed and compared with those of six other species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Nicotiana tabacum, Triticum aestivum, and Zea mays (Table S4). The results revealed differences in codon usage frequencies between the CsPPO gene and the six model species, listed here from the fewest to the most differing codons: Zea mays (2–10 codons), Triticum aestivum (4–11), Saccharomyces cerevisiae (6–10), Arabidopsis thaliana (8–9), Escherichia coli (8–10), and Nicotiana tabacum (10–11). The codon frequency of the CsPPO gene showed only minor differences compared to that of Zea mays. In contrast, more pronounced differences were observed between the CsPPO gene and Triticum aestivum, Saccharomyces cerevisiae, Arabidopsis thaliana, Escherichia coli, and Nicotiana tabacum. These results indicate that Zea mays may be an optimal host for the heterologous expression of the CsPPO gene. Zea mays may likewise be an optimal host for the PPO genes of C. lanceoleosa (2–10 codons), C. nitidissima (3–9 codons), C. ptilophylla (2–12 codons), and R. vialii (2–8 codons). In contrast, Triticum aestivum and Arabidopsis thaliana may be optimal hosts for the heterologous expression of the PPO genes of A. chinensis (4–9 codons) and C. florida (3–7 codons), respectively. 
Saccharomyces cerevisiae and Escherichia coli, representing eukaryotic and prokaryotic expression systems, are commonly utilized in gene expression studies. In the present study, the S. cerevisiae expression system appears to be more favorable than the Escherichia coli system for the expression of the CsPPO gene.
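One common way to count "differing codons" between a gene and a candidate host is to compare per-codon usage frequencies and flag codons whose ratio is very high or very low. The cutoffs below (ratio of at least 2 or at most 0.5) are an assumption, since this section does not state the criterion it used; the function name `divergent_codons` and the frequency values in the usage note are likewise illustrative.

```python
def divergent_codons(gene_freq, host_freq, upper=2.0, lower=0.5):
    """Codons whose gene/host frequency ratio is >= upper or <= lower.

    Both inputs map codon -> frequency in the same unit (e.g. per thousand).
    The cutoffs are assumed defaults, not values confirmed by the source.
    """
    flagged = []
    for codon, g in gene_freq.items():
        h = host_freq.get(codon)
        if h is None:  # host table lacks this codon: skip
            continue
        if h == 0:
            if g > 0:  # used by the gene, never by the host
                flagged.append(codon)
            continue
        ratio = g / h
        if ratio >= upper or ratio <= lower:
            flagged.append(codon)
    return sorted(flagged)
```

For example, `divergent_codons({"TTT": 20.0, "TTC": 10.0}, {"TTT": 9.0, "TTC": 12.0})` flags only TTT. Under this rule, a host with fewer flagged codons is a closer match, which is the logic behind preferring Zea mays for CsPPO expression.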

2.6. ENC Plot Analysis of PPO Genes

The codon usage variation among the 24 CsPPO genes and PPO genes from other plant species was analyzed using an ENC plot (Figure 1). As depicted in Figure 1, a few genes clustered above or adjacent to the expected curve, implying mutational pressure impacted their codon usage. In contrast, most genes fell below the predicted curve, indicating that natural selection played the dominant role in shaping CUB.
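The expected curve in an ENC plot is usually drawn from Wright's formula, which gives the ENC a gene would have if GC3s composition alone determined codon usage. The sketch below implements that formula; the helper `enc_deviation` and both function names are illustrative additions, not part of the original analysis.

```python
def expected_enc(gc3s):
    """Expected ENC under purely compositional constraint; gc3s is a fraction in [0, 1]."""
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


def enc_deviation(observed_enc, gc3s):
    """Relative distance below the expected curve: (expected - observed) / expected."""
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected
```

As a rough illustration using the mean values reported earlier (GC3s of 52.50% and ENC of 56.34), the expected ENC is about 60.4, so the average CsPPO gene sits roughly 7% below the curve. Averages hide per-gene variation, so the figure itself remains the authoritative view.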

2.7. PR2 Plot Analysis

To distinguish the relative impacts of mutational pressure and natural selection on CUB, PR2 plot analysis was conducted (Figure 2). The analysis revealed that the 24 CsPPO genes, along with PPO genes from both Camellia and non-Camellia plant species, were unevenly distributed across the four regions, with most genes positioned far from the central value of 0.5. Only a few genes were located in proximity to the center. This finding implies that natural selection may exert a considerable influence on the usage patterns of the third codon base in these genes. Furthermore, the A3/(A3 + T3) ratio for the majority of codon bases was less than 0.5, whereas the G3/(G3 + C3) ratio surpassed 0.5 in certain gene codons. These results demonstrate that codon base usage, especially at the third position, tends to favor T over A and G over C.
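The two PR2 coordinates can be computed per gene as sketched below. PR2 plots are often restricted to four-fold degenerate codon families, and this sketch adopts that restriction as an assumption, since the section does not say which codons were included. The function name `pr2_coordinates` is hypothetical.

```python
# First two bases of the four-fold degenerate families in the standard code:
# Ala, Gly, Pro, Thr, Val, and the four-codon blocks of Leu, Arg and Ser.
FOURFOLD_PREFIXES = {"GC", "GG", "CC", "AC", "GT", "CT", "CG", "TC"}


def pr2_coordinates(cds):
    """Return (G3/(G3+C3), A3/(A3+T3)) over four-fold degenerate codons.

    A coordinate is None when its denominator is zero.
    """
    cds = cds.upper()
    counts = {"A": 0, "T": 0, "G": 0, "C": 0}
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in counts:
            counts[codon[2]] += 1
    gc = counts["G"] + counts["C"]
    at = counts["A"] + counts["T"]
    x = counts["G"] / gc if gc else None
    y = counts["A"] / at if at else None
    return x, y
```

A gene at (0.5, 0.5) uses A and T, and G and C, equally at the third position. A point with y below 0.5 and x above 0.5 corresponds to the T-over-A and G-over-C preference described above.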

2.8. Neutrality Plot Analysis

The neutrality plot was employed to analyze the relationship between GC12s (the average GC content at the first and second codon positions) and GC3s, elucidating the respective roles of mutational pressure and natural selection in shaping codon usage patterns (Figure 3). Regression analysis based on 24 CsPPO genes and additional PPO genes from both Camellia and non-Camellia plant species yielded a slope of 0.073. The correlation coefficient (r = 0.291) indicated a weak association between GC12s and GC3s. This result suggests that mutational pressure contributed only 7.3% to the codon usage patterns of PPO genes, while other factors, primarily natural selection, accounted for the remaining 92.7%. In this context, natural selection plays a predominant role in shaping the codon usage patterns of both CsPPO genes and PPO genes from other plant species.
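The slope and correlation coefficient of a neutrality plot come from an ordinary least-squares regression of GC12 on GC3. A minimal sketch using only the standard library is given below; the function name `neutrality_regression` and the input lists are illustrative, and the per-gene GC values of the study are not reproduced here.

```python
def neutrality_regression(gc3, gc12):
    """Least-squares fit of GC12 on GC3; returns (slope, intercept, pearson_r)."""
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equally long lists with at least two genes")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    syy = sum((y - mean_y) ** 2 for y in gc12)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    if sxx == 0 or syy == 0:
        raise ValueError("GC3 and GC12 must both vary across genes")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r
```

Under the conventional reading applied in this section, the slope is taken as the relative contribution of mutational pressure, so a slope of 0.073 corresponds to 7.3%, with the remaining 92.7% attributed to selection and other factors.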

2.9. Correspondence Analysis of PPO Genes

Correspondence analysis (COA) was conducted on codon usage patterns in CsPPO genes using the RSCU values calculated from these genes. The results indicated that Axis 1 accounted for 68.01% of the total variation, whereas Axis 2 explained 20.16% of the overall variation (Figure 4A). AT-ending codons were predominantly localized near the central region of the plot, primarily within the positive quadrant, whereas GC-ending codons were mostly aggregated around the axis origin, with a few distributed toward the negative quadrant. To further examine codon usage divergence across species, an additional COA was performed using PPO genes from Camellia species and non-Camellia species. In this comparative analysis, Axis 1 and Axis 2 accounted for 35.52% and 13.94% of the total variation, respectively (Figure 4B). Codons were clearly separated along these axes based on their third nucleotide base: A/T-ending codons tended to cluster in specific regions distinct from those of G/C-ending codons. COA indicates that CUB in PPO genes results from a combination of mutational pressure, natural selection, and other potential contributing factors.

2.10. Amino Acid Usage Frequency

Amino acid composition acts as a vital indicator, revealing insights into an organism’s physiological functions, biological stability, and evolutionary trajectories [27]. Variations in genomic GC content have a direct effect on amino acid composition, subsequently affecting codon usage patterns [28]. In CsPPO genes, leucine, proline, aspartic acid, and lysine were the most abundant amino acids, whereas tryptophan, cysteine, and methionine were present at lower frequencies (Figure 5). Overall, hydrophobic and hydrophilic amino acids accounted for a substantial proportion of the complete amino acid profile in CsPPO genes. In addition to CsPPO genes, amino acid usage frequencies were also analyzed for PPO genes from other Camellia species and non-Camellia species (Table S5). The results showed that the most frequently used amino acids across all species were alanine, aspartic acid, leucine, proline, and lysine. In contrast, methionine, cysteine, and tryptophan were the least frequently used amino acids. These findings align with the patterns observed in CsPPO genes, in which hydrophobic and hydrophilic amino acids collectively constitute a major proportion of the overall amino acid composition.

2.11. Impact of Codon Usage Bias on Gene Expression

The MILC value serves as a measure of gene expression levels. Higher MILC values correspond to increased levels of gene expression [29]. We determined the MILC values for CsPPO genes and PPO genes of other plant species. The results indicated that these values ranged from 0.48 to 0.50 (Table 4), suggesting that the expression levels of CsPPO genes and of PPO genes from the other plant species were low. Furthermore, we performed a correlation analysis to examine the association between gene expression levels and CUB. A significant positive correlation was found between synonymous codon usage order (SCUO) and MILC values (Table S6), suggesting that CUB may influence gene expression levels.

2.12. Correlation Analysis

The association between nucleotide content and CUB indices was examined in the CsPPO genes. In CsPPO genes, ENC showed significant positive correlations with GC, GC2, GC3s, C3s, and A3s, while significant negative correlations were observed with GC1, T3s, G3s, and AT3 (Table 5). This indicates that codon usage bias could be associated with the nucleotide composition at different codon positions. Compared to GC1 and GC2, GC3 exerts a stronger influence on codon preference, particularly at the third position of synonymous codons.
While the composition of the first two bases is largely shaped by neutral mutations, the third base composition is predominantly governed by natural selection. Additionally, correlation analysis between CAI and various CUB indices showed that, in CsPPO genes, CAI is significantly positively correlated with GC1, T3s, and AT3, while exhibiting a significant negative correlation with GC, GC2, GC3s, A3s, and C3s (Table 5). Gene expression levels, as assessed by CAI values, were found to influence CUB in the CsPPO genes. To assess natural selection’s influence on CUB in CsPPO genes, we examined correlations between GRAVY, AROMA, and codon usage parameters (A3s, T3s, G3s, C3s, GC3s, ENC) (Table 6). The results showed significant associations between GRAVY and AROMA values and these parameters, suggesting that natural selection contributes to shaping CUB.
Furthermore, the association between nucleotide content and CUB indices was also analyzed in the PPO genes of other plant species (Table S7). In contrast to CsPPO genes, no significant correlation was detected between ENC and CAI values and any of the nucleotide composition parameters (GC, GC1, GC2, GC3s, A3s, T3s, C3s, G3s, AT3). However, a significant correlation was observed only between the ENC value and the GRAVY value in these species, which suggests that natural selection plays a substantial role in influencing codon usage patterns in these PPO genes, even in the absence of strong associations with nucleotide content.

3. Discussion

CUB is a complex and important evolutionary phenomenon observed across a wide range of organisms, providing valuable insights into genomic architecture and the evolutionary history of genomes [30]. CUB has been investigated across diverse species, encompassing prokaryotes as well as unicellular and multicellular eukaryotes [31,32]. Variation in synonymous codon usage across different gene-coding regions reflects CUB and is shaped by several evolutionary forces, including nucleotide composition constraints, natural selection, and mutational pressure [33]. Studies on CUB have been conducted in various genes of C. sinensis, including CsSAD, CsSPDS, and CsGPAT [34,35,36]. However, this study presents the first comprehensive investigation of codon usage in CsPPO genes, comparing 24 CsPPO genes with PPO genes from other Camellia and non-Camellia species to assess codon usage patterns and evaluate the evolutionary forces shaping codon bias. This approach highlights the functional and evolutionary significance of PPO genes, which are directly linked to plant defense and stress responses, thereby extending codon usage studies to a biologically and agriculturally relevant gene family.
The nucleotide content and the third codon position significantly influence the CUB of a gene [37]. In GC-rich organisms like Triticum aestivum [38], Zea mays [39], and Sorghum bicolor [40], there is a preference for G or C at the third codon position. In contrast, AT-rich organisms, such as Helianthus annuus [13], Delphinium grandiflorum [41], and Porphyra umbilicalis [42], tend to favor A or T at the third codon position. In the present study on PPO genes of both C. sinensis and the other analyzed plant species, the average GC (G + C) content was found to be lower than the AT (A + T) content. However, at the third codon position, the mean GC3 content was higher than the AT3 content. Since GC3 composition is widely regarded as an indicator of base composition bias [43], our results suggest a bias toward the preferential use of G- or C-ending codons. These findings may reflect compositional constraints influencing the CUB of CsPPO genes and the PPO genes from other analyzed plant species. This observation was supported by the results of the RSCU analysis. Moreover, the RFSC and high-frequency codon analyses further confirmed a preference for GC-ending codons, aligning with trends reported in previous studies [44,45]. Earlier studies have suggested that the frequency of synonymous codon usage exhibits variation not only between genomes but also among functionally related genes and even within individual genes [46,47]. It is therefore widely accepted that variations in codon usage among genes within the same genome are largely driven by selective pressures, as evidenced by the stronger codon bias observed in highly expressed genes, which preferentially utilize codons corresponding to the most prevalent cognate tRNAs [48,49].
Selective pressures aimed at maximizing translational efficiency and accuracy play a central role in shaping codon usage patterns, as evidenced by the frequent use of optimal codons [50]. Identifying optimal codons offers valuable insights for rational and efficient codon optimization strategies [51,52,53]. In this study, eight optimal codons were identified in the CsPPO genes and thirteen in the PPO genes of the other analyzed plant species. These findings not only support efforts in codon optimization but also offer valuable insights into the association between gene expression and codon preference. Based on these results, our study emphasizes that codon modification and alignment with host genomes can significantly enhance transcriptional and translational efficiencies. This is particularly relevant for plant biotechnology, where optimized heterologous expression of PPO enzymes can contribute to improved stress tolerance, crop quality, and biotechnological production systems. In this context, our results highlight distinct host–gene compatibility patterns for PPO heterologous expression. Zea mays appears particularly well-suited for expressing the CsPPO gene and the PPO genes from C. lanceoleosa, C. nitidissima, C. ptilophylla, and R. vialii, suggesting a broad compatibility with phylogenetically diverse species. In contrast, Triticum aestivum and Arabidopsis thaliana emerged as optimal hosts for the PPO genes of A. chinensis and C. florida, respectively, indicating that host selection may be influenced by gene-specific or lineage-specific factors [54]. Taken together, these host–gene associations not only expand our understanding of codon bias but also provide a practical roadmap for designing effective plant-based expression systems in agriculture and synthetic biology.
CUB is mainly influenced by natural selection and mutation pressure [12], though the predominant forces driving this bias can differ between species. Previous studies have shown that the codon usage patterns of nuclear genes are largely influenced by natural selection throughout evolution [55,56,57]. In contrast, another study reported that mutation pressure primarily influences CUB in soybean nuclear genes [58]. These findings indicate that codon usage patterns of nuclear genes differ across plant species. A systematic analysis of the factors influencing CUB in the CsPPO genes and the PPO genes of other plant species was conducted using an integrated approach combining ENC plot, PR2 plot, and neutrality plot analyses. The ENC plot analysis provided a qualitative assessment of the principal determinants influencing codon usage patterns in PPO genes. The findings demonstrated that mutational bias exerted a relatively minor influence on codon usage patterns when compared to the effects of natural selection and other factors. Comparable findings have been reported in studies on Rosales species, Hemerocallis citrina, and Citrus species [59,60,61]. Neutrality plot analysis indicated the absence of a significant correlation between GC12 and GC3 in the PPO genes. In the PPO genes, mutation pressure contributed only 7.3% to the codon usage pattern, whereas natural selection accounted for 92.7%, suggesting that natural selection was the predominant force shaping codon preference. This result is consistent with previous studies on Gnetales species, Diplandrorchis sinica, and Dryas octopetala [62,63,64]. The PR2 plot analysis revealed an uneven distribution of PPO genes across the four quadrants, indicating a bias in the third codon position, with a higher usage frequency of T over A and G over C. These findings suggest that natural selection is the primary driver of codon usage bias in PPO genes. 
Comparable patterns have also been observed in other plants, including Cymbidium species and Medicago truncatula [65,66]. This strong predominance of selection underscores the adaptive nature of CUB and highlights its potential role in fine-tuning PPO expression under varying ecological and evolutionary contexts. To further investigate the factors shaping codon usage, COA was performed on the RSCU values. The first axis accounted for only a limited proportion of the codon usage variation, indicating that codon bias in the CsPPO genes and PPO genes from other plant species is shaped not solely by natural selection but also by additional factors. This observation is reinforced by the distinct clustering pattern in the COA plot, which showed a clear separation between GC-ending and AT-ending codons, underscoring the role of nucleotide composition in shaping CUB [67]. Such a pattern is consistent with the influence of mutational pressure, where GC-rich genomes tend to favor G- or C-ending codons, whereas AT-rich genomes preferentially utilize A- or T-ending codons [68]. Furthermore, the potential contribution of translational selection (via differences in tRNA abundance or efficiency for GC- versus AT-ending codons) suggests that codon usage patterns are the outcome of an interplay between mutation bias and natural selection [50]. These results not only corroborate earlier findings that codon usage is a non-random phenomenon shaped by complex evolutionary forces [19,69], but also extend this understanding by demonstrating that such forces are at work in PPO genes across phylogenetically diverse plant species. This broader perspective highlights the evolutionary and functional significance of codon usage patterns in plant genomes and provides a framework for future studies on the molecular evolution of PPO genes.
The amino acid composition of proteins plays a crucial role in determining their structural configuration, functional performance, and evolutionary flexibility [28]. In the present study, both the CsPPO genes and the PPO genes of other plant species exhibited a distinctive amino acid composition, characterized by elevated abundances of leucine, proline, aspartic acid, and lysine, and reduced frequencies of tryptophan, cysteine, and methionine. The presence of both hydrophobic and hydrophilic amino acids in these PPO genes may indicate a carefully optimized balance between structural stability and enzymatic activity [70,71]. Hydrophobic amino acids like leucine and proline might aid in the proper folding and core stabilization of the PPO enzyme, which is critical for sustaining its activity under environmental stress [71,72,73]. Conversely, hydrophilic residues such as aspartic acid and lysine improve solubility and promote interactions with substrates and other biomolecules [74,75], thereby enhancing the enzyme’s function in oxidation reactions linked to defense responses and browning processes [76,77]. This structural–functional optimization suggests that selective forces act not only at the codon level but also at the amino acid level, reinforcing the adaptive nature of PPO evolution in response to biotic and abiotic pressures.
In our study, the SCUO values for the PPO genes in both C. sinensis and the other plant species were consistently below 0.50, indicating a low level of CUB and suggesting a relatively balanced use of synonymous codons. Such a pattern may reflect weaker selective constraints on codon choice, with mutational pressure or genetic drift playing a more prominent role in shaping codon usage. This observation is consistent with previous reports for chloroplast genes in Oryza and Theaceae species [78,79], thereby extending these findings to PPO genes and highlighting a potentially conserved codon usage strategy across phylogenetically distinct plant lineages. MILC, together with ENC and CAI values, was used as an indicator of gene expression, and the MILC values were low, pointing to low expression levels of the PPO genes. The MILC values also suggest that gene expression levels may vary to some extent among the different species. In addition, a correlation was observed between SCUO and MILC values, suggesting that CUB may influence gene expression. This trend aligns with previous findings in Fagopyrum species and Theaceae family members [29,79].
Previous studies have demonstrated that additional natural selection-driven factors, such as GRAVY and AROMA values, also contribute to CUB [80,81]. The strong positive correlations between GRAVY and AROMA values support the notion that these protein traits, shaped by the coding sequences, may be associated with CUB in CsPPO genes as well as in PPO genes from other plant species. Earlier studies have also identified notable positive and negative correlations between CUB and GRAVY/AROMA values in a range of organisms, such as Sesamum indicum, Taenia saginata, and Epichloë festucae [82,83,84]. Collectively, these observations point to hydrophobicity and aromaticity as selective factors that may shape codon usage bias across plant species.
Despite these novel insights, certain limitations of our study should be acknowledged. Although chromosome-scale assemblies of C. sinensis exist, a fully comprehensive genomic resource covering the genetic diversity of tea is still lacking. This limits our ability to assess codon usage in a complete genomic context, where competition among mRNAs for tRNAs can be fully understood. Our results should therefore be seen as a first step, paving the way for broader genome-wide analyses. Future research that couples codon usage analyses with functional genomics and experimental validation (e.g., heterologous expression of PPO genes in different hosts) will be critical to clarify the adaptive significance of codon bias under natural and agricultural conditions. Such approaches will not only address current limitations but also extend the implications of our study to plant breeding, stress resilience, and synthetic biology applications.

4. Materials and Methods

4.1. Sequence Retrieval and Nucleotide Composition Analysis

In this study, PPO gene sequences from 24 C. sinensis isolates, other Camellia species (Camellia lanceoleosa, Camellia nitidissima, Camellia ptilophylla), and non-Camellia species (Actinidia chinensis, Cornus florida, Rhododendron vialii) were retrieved from the NCBI GenBank database (http://www.ncbi.nlm.nih.gov/, accessed on 8 July 2025) (Table 1). CodonW (version 1.4.2) (https://codonw.sourceforge.net/, accessed on 8 July 2025) was employed to calculate ENC, the overall GC content of each gene (GC), the GC content at the 3rd position of synonymous codons (GC3s), as well as the proportions of C, A, T, and G at the 3rd position of synonymous codon (C3s, A3s, T3s, G3s). The CAIcal server [85] was used to calculate the overall nucleotide composition (A, G, T, C), the total AT and AT3 contents, as well as the GC content at the first and second codon positions, denoted as GC1 and GC2, respectively.

4.2. Codon Adaptation Index (CAI) and Codon Bias Index (CBI)

The CAI value indicates how closely the usage frequency of synonymous codons in a coding region aligns with that of optimal codons. It ranges from 0 to 1, with higher values reflecting better adaptation and potentially higher levels of gene expression [86,87]. The CBI is another numerical measure used to quantify CUB. It measures the deviation between the observed frequency of a gene’s preferred codons and the expected frequency derived from the genome’s overall codon usage pattern [88]. Higher CBI values reflect a stronger bias toward the use of preferred codons. A CBI of 1 means that only optimal codons are used and a value of 0 corresponds to random codon choice, while a value below 0 indicates that optimal codons are used less often than expected by chance [89].
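To make the CAI definition concrete, the following is a minimal sketch of the Sharp and Li formulation: each codon receives a relative adaptiveness w (its count divided by the count of the most used synonymous codon in a reference set), and the CAI of a gene is the geometric mean of w over its codons. The reference counts in the usage note are hypothetical, the pseudo-count of 0.5 for unseen codons is an assumption of this sketch, and the function names are illustrative; the values reported in this study were obtained with CodonW, not with this code.

```python
import math
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def relative_adaptiveness(ref_counts, pseudo=0.5):
    """Return w for each codon: count / max count within its synonymous family.

    Met, Trp and stop codons are excluded. Codons absent from the reference
    set receive a pseudo-count (an assumption of this sketch).
    """
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families.setdefault(aa, []).append(codon)
    weights = {}
    for codons in families.values():
        counts = {c: ref_counts.get(c, 0) or pseudo for c in codons}
        top = max(counts.values())
        for c in codons:
            weights[c] = counts[c] / top
    return weights


def cai(cds, weights):
    """Geometric mean of w over the scored codons of a coding sequence."""
    cds = cds.upper().replace("U", "T")
    logs = [math.log(weights[cds[i:i + 3]])
            for i in range(0, len(cds) - len(cds) % 3, 3)
            if cds[i:i + 3] in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")
```

For example, with a hypothetical reference set `{"GCC": 10, "GCT": 5}`, the codon GCC has w = 1 and GCT has w = 0.5, so `cai("GCCGCT", relative_adaptiveness({"GCC": 10, "GCT": 5}))` is the square root of 0.5.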

4.3. Grand Average of Hydropathy (GRAVY) and Aromaticity (AROMA) Analysis

GRAVY and AROMA indices serve as quantitative measures to evaluate natural selection’s influence on codon usage, representing the relative abundances of hydrophobic and aromatic amino acids, respectively. Higher AROMA or GRAVY values indicate a greater proportion of aromatic or hydrophobic amino acids in the protein product [90]. These values were computed using CodonW.
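As an illustration of what these two indices measure, the sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy of a protein sequence and AROMA as the fraction of aromatic residues (Phe, Tyr, Trp). It is a simplified stand-in for the CodonW output used in this study; the function names are illustrative.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _residues(protein):
    """Keep only the 20 standard residues (drops '*', 'X', gaps, whitespace)."""
    return [r for r in protein.upper() if r in KYTE_DOOLITTLE]


def gravy(protein):
    """Grand average of hydropathy: mean hydropathy over all residues."""
    residues = _residues(protein)
    return sum(KYTE_DOOLITTLE[r] for r in residues) / len(residues)


def aromaticity(protein):
    """Fraction of residues that are aromatic (F, Y or W)."""
    residues = _residues(protein)
    return sum(r in "FYW" for r in residues) / len(residues)
```

Positive GRAVY values indicate an overall hydrophobic protein and negative values a hydrophilic one; both functions assume a non-empty sequence.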

4.4. Analyses of Relative Synonymous Codon Usage (RSCU) and Relative Synonymous Codon Usage Frequency (RFSC)

RSCU is a quantitative metric that represents the relative frequency of usage for each synonymous codon. An RSCU value exceeding 1 indicates that a codon is used more frequently than expected among synonymous codons for the same amino acid, reflecting a preferential or biased codon usage [91]. An RSCU value below 1 signifies that a codon is used less frequently than expected among synonymous codons for a given amino acid, indicating a reduced or negatively biased codon usage. An RSCU of 1 means that the codon is used as often as expected if all synonymous codons for that amino acid were used equally, that is, with no bias [91]. Synonymous codons with RSCU values above 1.6 are regarded as overrepresented, while those with values below 0.6 are considered underrepresented [92]. The RSCU is computed using the formula developed by Sharp and Li [93], as shown below (1):
$$\mathrm{RSCU}_{ij} = \frac{X_{ij}}{\dfrac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}} \tag{1}$$
In this formula, Xij indicates the number of times the j-th codon is used to encode the i-th amino acid, while ni denotes the number of synonymous codons that correspond to the i-th amino acid, ranging from 1 to 6 due to the redundancy of the genetic code.
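The RSCU calculation can be sketched in a few lines. The code below is a minimal illustration, not the CodonW implementation used in this study: it counts codons in a single in-frame coding sequence and divides each count by the mean count of its synonymous family. Function names are illustrative, and the single-codon amino acids (Met, Trp) and stop codons are skipped.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codons(cds):
    """Count in-frame codons of a coding sequence (DNA or RNA alphabet)."""
    cds = cds.upper().replace("U", "T")
    codons = (cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    return Counter(c for c in codons if c in CODON_TABLE)


def rscu(cds):
    """RSCU_ij = X_ij / ((1 / n_i) * sum_j X_ij) for each synonymous family."""
    counts = count_codons(cds)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue  # amino acid absent, or no synonymous alternatives
        mean = total / len(codons)
        for c in codons:
            values[c] = counts[c] / mean
    return values
```

Within each amino acid that occurs in the sequence, the RSCU values sum to n_i, so a sixfold-degenerate amino acid encoded by a single codon throughout gives that codon an RSCU of 6.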
RFSC represents the proportion of a specific codon’s usage relative to the total usage of all synonymous codons for a given amino acid. It is determined using the equation established by Sharp and Li [93], as outlined below (2):
$$\mathrm{RFSC}_{ij} = \frac{X_{ij}}{\sum_{j=1}^{n_i} X_{ij}} \tag{2}$$
In this formula, Xij denotes the count of the j-th codon used to encode the i-th amino acid, and ni is the number of synonymous codons for that amino acid. Using this formula, the RFSC value is determined for each codon. A codon is classified as a high-frequency codon if it satisfies either of the following criteria: (1) the RFSC value of the codon exceeds 60%; or (2) the RFSC value of the codon exceeds the average frequency of the synonymous codons for that amino acid by more than 50%, that is, it is greater than 1.5 times the average [23,29].
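The RFSC calculation and the high-frequency rule can be illustrated as follows. This sketch assumes the two criteria are read as "RFSC above 0.60" or "RFSC above 1.5 times the average frequency 1/n_i of the synonymous family"; the counts in the usage note are hypothetical and the function names are illustrative.

```python
def rfsc(family_counts):
    """RFSC_ij = X_ij / sum_j X_ij within one synonymous codon family."""
    total = sum(family_counts.values())
    return {c: (n / total if total else 0.0) for c, n in family_counts.items()}


def high_frequency_codons(family_counts, abs_cut=0.60, rel_cut=1.5):
    """Codons whose RFSC exceeds abs_cut or rel_cut times the family average.

    The family average is 1 / n_i because the RFSC values of a family sum to 1.
    Both thresholds follow the reading of the criteria stated in the lead-in.
    """
    freqs = rfsc(family_counts)
    average = 1.0 / len(freqs)
    return sorted(c for c, f in freqs.items()
                  if f > abs_cut or f > rel_cut * average)
```

For hypothetical alanine counts `{"GCT": 10, "GCC": 30, "GCA": 5, "GCG": 5}`, the RFSC values are 0.2, 0.6, 0.1 and 0.1; the family average is 0.25, so only GCC passes the second criterion and is reported as a high-frequency codon.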

4.5. Optimal Codons Analysis

The effective number of codons (ENC) is a key measure used to evaluate how strongly codon usage is biased [26]. ENC values range from 20, which signifies maximum codon usage bias—where only a single synonymous codon is employed for each amino acid—to 61, indicating completely unbiased usage, with all synonymous codons utilized uniformly [94]. Lower ENC values correspond to a higher degree of codon usage bias [95]. The ENC values of the PPO genes were computed using CodonW. Based on these values, the genes were sorted in ascending order. The top 10% with the highest ENC values and the bottom 10% with the lowest were selected to represent low and high expression datasets, respectively. CodonW was then used to calculate the RSCU values for each of these datasets. Optimal codons were determined using the ΔRSCU method, where ΔRSCU represents the difference between the RSCU value in the high-expression dataset (RSCUhigh) and that in the low-expression dataset (RSCUlow). A codon is regarded as optimal when its ΔRSCU is ≥0.08, with an RSCU value exceeding 1 in the high-expression dataset and falling below 1 in the low-expression dataset [26].
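The selection procedure described above can be sketched as two small steps: splitting the genes into low-ENC and high-ENC subsets, and then applying the ΔRSCU thresholds. The rounding rule for the 10% subsets and all names below are assumptions of this sketch; the RSCU dictionaries are expected to come from pooled codon counts of each subset.

```python
def split_by_enc(enc_by_gene, fraction=0.10):
    """Return (high_expression_proxy, low_expression_proxy) gene lists.

    Genes are sorted by ENC in ascending order; the lowest-ENC fraction stands
    in for the high-expression set and the highest-ENC fraction for the
    low-expression set. At least one gene is kept in each subset.
    """
    ordered = sorted(enc_by_gene, key=enc_by_gene.get)
    k = max(1, round(len(ordered) * fraction))
    return ordered[:k], ordered[-k:]


def optimal_codons(rscu_high, rscu_low, delta_cut=0.08):
    """Codons with dRSCU >= delta_cut, RSCU_high > 1 and RSCU_low < 1."""
    optimal = []
    for codon, high in rscu_high.items():
        if codon not in rscu_low:
            continue
        low = rscu_low[codon]
        if high - low >= delta_cut and high > 1 and low < 1:
            optimal.append(codon)
    return sorted(optimal)
```

With 24 genes and a fraction of 0.10, this rounding rule places two genes in each subset; a different rounding convention would change the subset size, which is why the rule is flagged as an assumption.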

4.6. Comparative Analysis of Codon Utilization Frequency

Codon usage bias was evaluated by analyzing the frequency at which individual codons are utilized. The codon usage frequency of CsPPO genes and PPO genes from other plant species was analyzed using the CUSP tool available on the EMBOSS Explorer online platform (https://www.bioinformatics.nl/cgi-bin/emboss/cusp, accessed on 10 July 2025). Codon usage frequency data for Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, Triticum aestivum, Nicotiana tabacum, and Zea mays were retrieved from the Codon Usage Database (https://dnahive.fda.gov/dna.cgi?cmd=codon_usage&id=537&mode=cocoputs, accessed on 10 July 2025). For each codon, the ratio of its usage frequency in the PPO genes to its usage frequency in the reference organism was calculated. A ratio of ≤0.5 or ≥2 signifies a marked divergence in the usage of that codon between the two datasets. Conversely, a ratio between 0.5 and 2 indicates a similar codon usage preference, and an organism in which most codons fall within this range may be suitable as a host for heterologous gene expression [96].
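The host comparison amounts to a per-codon frequency ratio followed by a count of codons outside the 0.5 to 2 window. The sketch below assumes both inputs are codon frequencies on the same scale (for example, per thousand codons); the numbers in the usage note are hypothetical, and treating a codon with zero host frequency as divergent is a choice made for this sketch.

```python
def divergent_codons(gene_freq, host_freq, low=0.5, high=2.0):
    """Codons whose gene/host frequency ratio is <= low or >= high."""
    divergent = []
    for codon, freq in gene_freq.items():
        host = host_freq.get(codon, 0.0)
        # A codon unused by the host cannot be compared; flag it as divergent.
        ratio = freq / host if host > 0 else float("inf")
        if ratio <= low or ratio >= high:
            divergent.append(codon)
    return sorted(divergent)


def rank_hosts(gene_freq, hosts):
    """Order candidate hosts by the number of divergent codons (fewest first)."""
    return sorted(hosts, key=lambda name: len(divergent_codons(gene_freq, hosts[name])))
```

For instance, with hypothetical gene frequencies `{"GCT": 20.0, "GCC": 30.0, "GCA": 5.0}` and host frequencies `{"GCT": 18.0, "GCC": 12.0, "GCA": 15.0}`, the ratios are about 1.11, 2.5 and 0.33, so GCC and GCA are flagged as divergent.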

4.7. ENC Plot Analysis

ENC plot analysis examines the correlation between ENC and GC3s within genes [44]. In this plot, GC3s values are represented on the x-axis, while ENC values are plotted on the y-axis for each gene. The expected ENC value was computed using the following Equation (3) [38]. When genes are located on or close to the expected ENC curve, it suggests that their CUB is primarily shaped by mutational pressure. Conversely, genes positioned significantly below the expected curve exhibit codon preferences shaped primarily by natural selection and additional evolutionary forces [44].
$$\mathrm{ENC}_{\mathrm{exp}} = 2 + \mathrm{GC3s} + \frac{29}{\mathrm{GC3s}^{2} + (1 - \mathrm{GC3s})^{2}} \tag{3}$$
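As a quick numerical check of the expected curve, the sketch below evaluates the standard expression for the expected ENC under GC3s-driven mutational bias and reports how far an observed ENC lies below it. The function names and the example values in the usage note are illustrative, not results from this study.

```python
def expected_enc(gc3s):
    """Expected ENC for a given GC3s (as a fraction between 0 and 1)."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)


def enc_deficit(observed_enc, gc3s):
    """Relative distance below the expected curve: (exp - obs) / exp."""
    expected = expected_enc(gc3s)
    return (expected - observed_enc) / expected
```

The curve peaks at GC3s = 0.5, where the expected ENC is 60.5, and falls toward 31 and 32 at GC3s = 0 and 1. A gene with a hypothetical observed ENC of 50 at GC3s = 0.5 would therefore sit well below the curve, with a positive deficit.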

4.8. Parity Rule 2 (PR2) Plot Analysis

PR2 plot analysis was employed to evaluate the relative contributions of natural selection and mutational pressure on nucleotide composition at the 3rd codon position. For each gene, the frequencies of the four nucleotides (T, A, G, and C) at the 3rd codon position were determined to calculate GC bias [G3/(G3 + C3)] and AT bias [A3/(A3 + T3)]. A PR2 bias plot was generated by plotting AT bias versus GC bias to illustrate the balance between purine and pyrimidine composition within genes. The center of the plot, with coordinates at (0.5, 0.5), denotes the equilibrium point where the frequencies of complementary bases are equal (A = T and G = C), indicating no bias. The position and orientation of each gene relative to the central point of the PR2 bias plot reflect the extent of deviation from parity rule 2 [97]. The aggregation of genes near the plot’s center indicates that codon usage bias is largely governed by mutational pressure, whereas substantial deviations from this central equilibrium suggest a stronger role of natural selection in shaping codon usage patterns [67,98].
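The two PR2 coordinates can be computed directly from third-position base counts. The sketch below is a minimal illustration that uses every codon's third position in one in-frame sequence; analyses restricted to particular codon families would filter the codons first. The function name is illustrative.

```python
from collections import Counter


def pr2_bias(cds):
    """Return (GC bias, AT bias) = (G3 / (G3 + C3), A3 / (A3 + T3)).

    A coordinate is None when its denominator is zero.
    """
    cds = cds.upper().replace("U", "T")
    third = Counter(cds[i + 2] for i in range(0, len(cds) - len(cds) % 3, 3))
    gc = third["G"] + third["C"]
    at = third["A"] + third["T"]
    gc_bias = third["G"] / gc if gc else None
    at_bias = third["A"] / at if at else None
    return gc_bias, at_bias
```

A gene at (0.5, 0.5) uses G and C, and A and T, equally at the third position. A GC bias above 0.5 means G is used more often than C, and an AT bias below 0.5 means T is used more often than A.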

4.9. Neutrality Analysis

Neutrality plot analysis involves constructing a regression line by plotting GC3s against GC12s. This method is employed to evaluate the respective influences of mutational pressure and natural selection in determining CUB within genes [99]. A regression line with a slope nearing 1 suggests that CUB is largely influenced by mutational pressure, with a relatively minor role played by natural selection. In contrast, a slope approaching 0 reflects a dominant influence of selection pressure in determining codon usage patterns [100].
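The slope of the neutrality plot is an ordinary least-squares fit. The sketch below assumes the common convention of GC12 on the y-axis and GC3 on the x-axis, and reads the slope as the share attributed to mutational pressure and one minus the slope as the share attributed to selection. The data in the usage note are synthetic and the function names are illustrative.

```python
def ols_fit(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def neutrality_shares(gc3, gc12):
    """Return (mutation_share, selection_share) from the regression slope."""
    slope, _ = ols_fit(gc3, gc12)
    return slope, 1 - slope
```

For synthetic points lying on GC12 = 0.42 + 0.073 × GC3, the fitted slope is 0.073, which under this reading corresponds to 7.3% mutational pressure and 92.7% selection.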

4.10. Correspondence Analysis (COA)

COA was utilized to explore the underlying factors that may contribute to CUB. To characterize the CUB of PPO genes, the analysis was performed using the RSCU values of 59 codons, excluding ATG, TGG, and the three stop codons. Scatter plots were constructed with Axis 1 and Axis 2 representing the horizontal and vertical coordinates, respectively. The codon usage patterns were inferred based on the spatial distribution of data points within the plot.

4.11. Gene Expression Level Analysis

To elucidate the association between CUB and gene expression, synonymous codon usage order (SCUO) analysis was conducted. SCUO values for each gene were calculated using an online tool (https://www.genscript.com/tools/rare-codon-analysis, accessed on 10 July 2025) [101]. The calculation formula of SCUO is as follows (4) [102]:
$$\mathrm{SCUO} = \sum_{i=1}^{18} \left( \frac{\sum_{j=1}^{n_i} x_{ij}}{\sum_{i=1}^{18} \sum_{j=1}^{n_i} x_{ij}} \right) \mathrm{SCUO}_i \tag{4}$$
where xij is the number of occurrences of the j-th synonymous codon for the i-th amino acid, ni is the number of synonymous codons for that amino acid, and SCUOi is the synonymous codon usage order of the i-th amino acid; the sum runs over the 18 amino acids encoded by more than one codon. The SCUO metric quantifies the bias in synonymous codon usage across the entire sequence, with values ranging from 0 to 1. Higher SCUO scores indicate stronger codon selection pressure and are typically correlated with elevated gene expression levels, reflecting adaptation for increased translational efficiency [103]. The measure independent of length and composition (MILC) is an index of gene expression level that quantifies the deviation in codon usage between a given gene and an expected codon distribution [104]. MILC values for each gene were computed using an online tool (https://www.genscript.com/tools/rare-codon-analysis, accessed on 10 July 2025) [101]. The formula to calculate MILC is (5) [104]:
$$\mathrm{MILC} = \frac{\sum_{a} M_a}{L} - C \tag{5}$$
Here, L denotes the gene length in codons, ensuring that the expected increase with a larger number of codons is considered. The correction factor C is applied to prevent overestimation of overall bias in relatively short sequences. Ma represents the contribution of a given amino acid a to codon usage bias. The MILC scale ranges from 0, indicating low expression, to 1, indicating high expression [29].
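Returning to the SCUO index of Equation (4), the sketch below shows one way to compute it from a single coding sequence, assuming the usual entropy-based definition of the per-amino-acid term: SCUOi is the normalized difference between the maximum entropy log2(ni) and the observed entropy of codon usage for amino acid i. This is an illustrative implementation, not the online tool used in this study, and the function name is hypothetical.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def scuo(cds):
    """Composition-weighted mean of the normalized entropy deficits SCUO_i."""
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))
    families = defaultdict(dict)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa][codon] = counts.get(codon, 0)
    # Keep amino acids with synonymous alternatives that occur in the gene.
    usable = [fam for fam in families.values()
              if len(fam) > 1 and sum(fam.values()) > 0]
    grand_total = sum(sum(fam.values()) for fam in usable)
    if grand_total == 0:
        return float("nan")
    score = 0.0
    for fam in usable:
        total = sum(fam.values())
        probs = [n / total for n in fam.values() if n > 0]
        entropy = -sum(p * math.log2(p) for p in probs)
        h_max = math.log2(len(fam))
        scuo_i = (h_max - entropy) / h_max
        score += (total / grand_total) * scuo_i
    return score
```

A sequence that uses a single codon for every amino acid gives a SCUO of 1, whereas perfectly uniform use of all synonymous codons gives 0.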

4.12. Statistical Analysis

CodonW (version 1.4.2) (https://codonw.sourceforge.net/, accessed on 8 July 2025) was used to analyze various CUB metrics, such as the CAI and CBI. Additionally, CodonW was utilized to compute amino acid frequencies and conduct correspondence analysis. OriginPro 9.0 software was employed to carry out the statistical analyses. Spearman’s rank correlation test was utilized to evaluate the associations between variables, considering p-values below 0.05 as indicative of statistical significance.
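For readers who wish to reproduce the correlation step outside OriginPro, Spearman's coefficient is the Pearson correlation of the ranked data. The sketch below computes the coefficient only, using average ranks for ties; it does not compute a p-value, for which a statistics package (for example, `scipy.stats.spearmanr`) would be needed. The function names are illustrative.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)
```

Because only ranks enter the calculation, any monotonically increasing relationship between two indices gives a coefficient of 1, regardless of its shape.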

5. Conclusions

This study provides the first comprehensive analysis of CUB in CsPPO genes. Analysis of base composition and RSCU revealed that CsPPO genes exhibit a preference for codons ending in G or C. A total of eight optimal codons were identified in the CsPPO genes, offering valuable insights for the optimization of gene expression. Comparative analyses with other Camellia and non-Camellia species confirmed both conserved and species-specific codon usage patterns. Evolutionary force analyses, including ENC, PR2, and neutrality plots, demonstrated that natural selection rather than mutational pressure is the predominant factor shaping CUB in CsPPO genes. Codon usage comparisons indicated that Zea mays is the most suitable host for their heterologous expression. These findings will not only expand the available CsPPO gene resources but also deepen our understanding of CUB in CsPPO genes, providing a solid theoretical foundation for future studies on their genetic and evolutionary dynamics.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/plants14193074/s1, Table S1. Nucleotide compositional analysis of PPO genes. Table S2. RSCU and RFSC values of the codons in PPO genes. Table S3. The RSCU values and ΔRSCU value in the high and low expression library of PPO genes. Table S4. Comparison of codon usage frequency between PPO genes and six organisms. Table S5. Analysis of amino acid usage frequency in PPO genes of C. sinensis and other plant species. Table S6. Correlation between SCUO and MILC in PPO genes of C. sinensis and other plant species. Table S7. Correlation analysis between ENC values, CAI values, AROMA values, GRAVY values and codon composition of PPO genes of other Camellia species and non-Camellia species.

Funding

This study has been supported by the Recep Tayyip Erdoğan University Development Foundation (Grant number: 02025006010538).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The author cordially thanks the Recep Tayyip Erdoğan University Development Foundation for the financial support of this study.

Conflicts of Interest

The author declares no conflicts of interest.

References

  1. Zeng, C.; Lin, H.; Liu, Z.; Liu, Z. Metabolomics Analysis of Camellia sinensis with Respect to Harvesting Time. Food Res. Int. 2020, 128, 108814.
  2. Chacko, S.M.; Thambi, P.T.; Kuttan, R.; Nishigaki, I. Beneficial Effects of Green Tea: A Literature Review. Chin. Med. 2010, 5, 13.
  3. Li, C.F.; Zhu, Y.; Yu, Y.; Zhao, Q.Y.; Wang, S.J.; Wang, X.C.; Yao, M.Z.; Luo, D.; Li, X.; Chen, L.; et al. Global Transcriptome and Gene Regulation Network for Secondary Metabolite Biosynthesis of Tea Plant (Camellia sinensis). BMC Genom. 2015, 16, 560.
  4. Truong, V.L.; Jeong, W.S. Cellular Defensive Mechanisms of Tea Polyphenols: Structure-Activity Relationship. Int. J. Mol. Sci. 2021, 22, 9109.
  5. Singh, K.; Rani, A.; Paul, A.; Dutt, S.; Joshi, R.; Gulati, A.; Ahuja, P.S.; Kumar, S. Differential Display Mediated Cloning of Anthocyanidin Reductase Gene from Tea (Camellia sinensis) and Its Relationship with the Concentration of Epicatechins. Tree Physiol. 2009, 29, 837–846.
  6. Zou, C.; Zhang, X.; Xu, Y.; Yin, J. Recent Advances Regarding Polyphenol Oxidase in Camellia sinensis: Extraction, Purification, Characterization, and Application. Foods 2024, 13, 545.
  7. Tang, M.G.; Zhang, S.; Xiong, L.G.; Zhou, J.H.; Huang, J.A.; Zhao, A.Q.; Liu, Z.H.; Liu, A.L. A Comprehensive Review of Polyphenol Oxidase in Tea (Camellia sinensis): Physiological Characteristics, Oxidation Manufacturing, and Biosynthesis of Functional Constituents. Compr. Rev. Food Sci. Food Saf. 2023, 22, 2267–2291.
  8. Zhang, J.; Zhang, X.; Ye, M.; Li, X.W.; Lin, S.B.; Sun, X.L. The Jasmonic Acid Pathway Positively Regulates the Polyphenol Oxidase-Based Defense against Tea Geometrid Caterpillars in the Tea Plant (Camellia sinensis). J. Chem. Ecol. 2020, 46, 308–316.
  9. Huang, C.; Zhang, J.; Zhang, X.; Yu, Y.; Bian, W.; Zeng, Z.; Sun, X.; Li, X. Two New Polyphenol Oxidase Genes of Tea Plant (Camellia sinensis) Respond Differentially to the Regurgitant of Tea Geometrid, Ectropis obliqua. Int. J. Mol. Sci. 2018, 19, 2414.
  10. Liu, Y. A Code within the Genetic Code: Codon Usage Regulates Co-Translational Protein Folding. Cell Commun. Signal. 2020, 18, 145.
  11. Hu, Q.; Wu, J.; Fan, C.; Luo, Y.; Liu, J.; Deng, Z.; Li, Q. Comparative Analysis of Codon Usage Bias in the Chloroplast Genomes of Eighteen Ampelopsideae Species (Vitaceae). BMC Genom. Data 2024, 25, 80.
  12. Yang, Q.; Xin, C.; Xiao, Q.S.; Lin, Y.T.; Li, L.; Zhao, J.L. Codon Usage Bias in Chloroplast Genes Implicate Adaptive Evolution of Four Ginger Species. Front. Plant Sci. 2023, 14, 1304264.
  13. Gao, Y.; Lu, Y.; Song, Y.; Jing, L. Analysis of Codon Usage Bias of WRKY Transcription Factors in Helianthus annuus. BMC Genom. Data 2022, 23, 46.
  14. Li, C.; Zhou, L.; Nie, J.; Wu, S.; Li, W.; Liu, Y.; Liu, Y. Codon Usage Bias and Genetic Diversity in Chloroplast Genomes of Elaeagnus Species (Myrtiflorae: Elaeagnaceae). Physiol. Mol. Biol. Plants 2023, 29, 239–251.
  15. Bu, Y.; Wu, X.; Sun, N.; Man, Y.; Jing, Y. Codon Usage Bias Predicts the Functional MYB10 Gene in Populus. J. Plant Physiol. 2021, 265, 153491.
  16. Jia, X.; Liu, S.; Zheng, H.; Li, B.; Qi, Q.; Wei, L.; Zhao, T.; He, J.; Sun, J. Non-Uniqueness of Factors Constraint on the Codon Usage in Bombyx mori. BMC Genom. 2015, 16, 356.
  17. Zhang, P.; Xu, W.; Lu, X.; Wang, L. Analysis of Codon Usage Bias of Chloroplast Genomes in Gynostemma Species. Physiol. Mol. Biol. Plants 2021, 27, 2727–2737.
  18. Li, Y.; Hu, X.; Xiao, M.; Huang, J.; Lou, Y.; Hu, F.; Fu, X.; Li, Y.; He, H.; Cheng, J. An Analysis of Codon Utilization Patterns in the Chloroplast Genomes of Three Species of Coffea. BMC Genom. Data 2023, 24, 42.
  19. Mazumder, T.H.; Alqahtani, A.M.; Alqahtani, T.; Emran, T.B.; Aldahish, A.A.; Uddin, A. Analysis of Codon Usage of Speech Gene Foxp2 among Animals. Biology 2021, 10, 1078.
  20. Xu, F.; Liu, W.; Wang, H.; Alam, P.; Zheng, W.; Faizan, M. Genome Identification of the Tea Plant (Camellia sinensis) ASMT Gene Family and Its Expression Analysis under Abiotic Stress. Genes 2023, 14, 409.
  21. Chen, S.; Kong, Y.; Zhang, X.; Liao, Z.; He, Y.; Li, L.; Liang, Z.; Sheng, Q.; Hong, G. Structural and Functional Organization of the MYC Transcriptional Factors in Camellia sinensis. Planta 2021, 253, 93.
  22. Deb, B.; Uddin, A.; Chakraborty, S. Composition, Codon Usage Pattern, Protein Properties, and Influencing Factors in the Genomes of Members of the Family Anelloviridae. Arch. Virol. 2021, 166, 461–474.
  23. Zhou, M.; Tong, C.; Shi, J. Analysis of Codon Usage Between Different Poplar Species. J. Genet. Genom. 2007, 34, 555–561.
  24. Tian, G.; Xiao, G.; Wu, T.; Zhou, J.; Xu, W.; Wang, Y.; Xia, G.; Wang, M. Alteration of Synonymous Codon Usage Bias Accompanies Polyploidization in Wheat. Front. Genet. 2022, 13, 979902.
  25. Al-Saif, M.; Khabar, K.S.A. UU/UA Dinucleotide Frequency Reduction in Coding Regions Results in Increased mRNA Stability and Protein Expression. Mol. Ther. 2012, 20, 954–959.
  26. Li, X.; Liu, L.; Ren, Q.; Zhang, T.; Hu, N.; Sun, J.; Zhou, W. Analysis of Synonymous Codon Usage Bias in the Chloroplast Genome of Five Caragana. BMC Plant Biol. 2025, 25, 322.
  27. Lamolle, G.; Simón, D.; Iriarte, A.; Musto, H. Main Factors Shaping Amino Acid Usage Across Evolution. J. Mol. Evol. 2023, 91, 382–390.
  28. Du, M.Z.; Liu, S.; Zeng, Z.; Alemayehu, L.A.; Wei, W.; Guo, F.B. Amino Acid Compositions Contribute to the Proteins’ Evolution under the Influence of Their Abundances and Genomic GC Content. Sci. Rep. 2018, 8, 7382.
  29. Liu, Q.; Li, S.; He, D.; Liu, J.; He, X.; Lin, C.; Li, J.; Huang, Z.; Huang, L.; Nie, G.; et al. Comparative Analysis of Codon Usage Patterns in the Chloroplast Genomes of Fagopyrum Species. Agronomy 2025, 15, 1190.
  30. Iriarte, A.; Lamolle, G.; Musto, H. Codon Usage Bias: An Endless Tale. J. Mol. Evol. 2021, 89, 589–593.
  31. Gupta, S.; Paul, K.; Roy, A. Codon Usage Signatures in the Genus Cryptococcus: A Complex Interplay of Gene Expression, Translational Selection and Compositional Bias. Genomics 2021, 113, 821–830.
  32. Yang, Y.; Wang, X.; Shi, Z. Comparative Study on Codon Usage Patterns across Chloroplast Genomes of Eighteen Taraxacum Species. Horticulturae 2024, 10, 492.
  33. Bhattacharyya, D.; Uddin, A.; Das, S.; Chakraborty, S. Mutation Pressure and Natural Selection on Codon Usage in Chloroplast Genes of Two Species in Pisum L. (Fabaceae: Faboideae). Mitochondrial DNA Part A DNA Mapp. Seq. Anal. 2019, 30, 664–673.
  34. Li, C.; Pan, L.L.; Wang, Y.; Wang, J.; Ding, Z.T. Codon Bias of the Gene for Chloroplast Glycerol-3-Phosphate Acyltransferase in Camellia sinensis (L.) O. Kuntze. Biochem. Syst. Ecol. 2014, 55, 212–218.
  35. You, E.; Wang, Y.; Ding, Z.T.; Zhang, X.F.; Pan, L.L.; Zheng, C. Codon Usage Bias Analysis for the Spermidine Synthase Gene from Camellia sinensis (L.) O. Kuntze. Genet. Mol. Res. 2015, 14, 7368–7376.
  36. Pan, L.L.; Wang, Y.; Hu, J.H.; Ding, Z.T.; Li, C. Analysis of Codon Use Features of Stearoyl-Acyl Carrier Protein Desaturase Gene in Camellia sinensis. J. Theor. Biol. 2013, 334, 80–86.
  47. Plotkin, J.B.; Kudla, G. Synonymous but Not the Same: The Causes and Consequences of Codon Bias. Nat. Rev. Genet. 2011, 12, 7152–7157. [Google Scholar] [CrossRef] [PubMed]
  48. Novoa, E.M.; Ribas de Pouplana, L. Speeding with Control: Codon Usage, TRNAs, and Ribosomes. Trends Genet. 2012, 28, 574–581. [Google Scholar] [CrossRef] [PubMed]
  49. Sharp, P.M.; Li, W.H. The Codon Adaptation Index-a Measure of Directional Synonymous Codon Usage Bias, and Its Potential Applications. Nucleic Acids Res. 1987, 15, 1281–1295. [Google Scholar] [CrossRef]
  50. Wu, X.; Xu, M.; Yang, J.R.; Lu, J. Genome-Wide Impact of Codon Usage Bias on Translation Optimization in Drosophila melanogaster. Nat. Commun. 2024, 15, 8329. [Google Scholar] [CrossRef]
  51. Wang, B.; Shao, Z.Q.; Xu, Y.; Liu, J.; Liu, Y.; Hang, Y.Y.; Chen, J.Q. Optimal Codon Identities in Bacteria: Implications from the Conflicting Results of Two Different Methods. PLoS ONE 2011, 6, e22714. [Google Scholar] [CrossRef]
  52. Demissie, E.A.; Park, S.Y.; Moon, J.H.; Lee, D.Y. Comparative Analysis of Codon Optimization Tools: Advancing toward a Multi-Criteria Framework for Synthetic Gene Design. J. Microbiol. Biotechnol. 2025, 35, e2411066. [Google Scholar] [CrossRef] [PubMed]
  53. Lu, C.; Guo, L.; Fang, B. DNA Sequence Changes Resulting from Codon Optimization Affect Gene Expression in Pichia pastoris by Altering Chromatin Accessibility. J. Fungi 2025, 11, 282. [Google Scholar] [CrossRef]
  54. Zhang, Q.; Chen, X.; You, H.; Chen, B.; Jia, L.; Li, S.; Zhang, X.; Ma, J.; Wu, X.; Wang, K.; et al. Specific Selection on XEG1 and XLP1 Genes Correlates with Host Range and Adaptability in Phytophthora. Nat. Commun. 2025, 16, 3638. [Google Scholar] [CrossRef]
  55. Wang, Z.K.; Liu, Y.; Zheng, H.Y.; Tang, M.Q.; Xie, S.Q. Comparative Analysis of Codon Usage Patterns in Nuclear and Chloroplast Genome of Dalbergia (Fabaceae). Genes 2023, 14, 1110. [Google Scholar] [CrossRef]
  56. Wang, L.; Xing, H.; Yuan, Y.; Wang, X.; Saeed, M.; Tao, J.; Feng, W.; Zhang, G.; Song, X.; Sun, X. Genome-Wide Analysis of Codon Usage Bias in Four Sequenced Cotton Species. PLoS ONE 2018, 13, e0194372. [Google Scholar] [CrossRef]
  57. Song, Y.; Shen, M.; Cao, F.; Yang, X. Compare Analysis of Codon Usage Bias of Nuclear Genome in Eight Sapindaceae Species. Int. J. Mol. Sci. 2024, 26, 39. [Google Scholar] [CrossRef]
  58. Sinha, K.; Jana, S.; Pramanik, P.; Bera, B. Selection on Synonymous Codon Usage in Soybean (Glycine max) WRKY Genes. Sci. Rep. 2024, 14, 26530. [Google Scholar] [CrossRef] [PubMed]
  59. Zhang, K.; Wang, Y.; Zhang, Y.; Shan, X. Codon Usage Characterization and Phylogenetic Analysis of the Mitochondrial Genome in Hemerocallis citrina. BMC Genom. Data 2024, 25, 6. [Google Scholar] [CrossRef] [PubMed]
  60. Zhang, Y.; Shen, Z.; Meng, X.; Zhang, L.; Liu, Z.; Liu, M.; Zhang, F.; Zhao, J. Codon Usage Patterns across Seven Rosales Species. BMC Plant Biol. 2022, 22, 65. [Google Scholar] [CrossRef]
  61. Shen, Z.; Gan, Z.; Zhang, F.; Yi, X.; Zhang, J.; Wan, X. Analysis of Codon Usage Patterns in Citrus Based on Coding Sequence Data. BMC Genom. 2020, 21, 234. [Google Scholar] [CrossRef]
  62. Ling, L.; Zhang, S.; Yang, T. Analysis of Codon Usage Bias in Chloroplast Genomes of Dryas octopetala var. asiatica (Rosaceae). Genes 2024, 15, 899. [Google Scholar] [CrossRef]
  63. Yang, X.; Wang, Y.; Gong, W.; Li, Y. Comparative Analysis of the Codon Usage Pattern in the Chloroplast Genomes of Gnetales Species. Int. J. Mol. Sci. 2024, 25, 10622. [Google Scholar] [CrossRef] [PubMed]
  64. Chen, X.; Zhao, Y.; Xu, S.; Zhou, Y.; Zhang, L.; Qu, B.; Xu, Y. Analysis of Codon Usage Bias in the Plastid Genome of Diplandrorchis sinica (Orchidaceae). Curr. Issues Mol. Biol. 2024, 46, 9807–9820. [Google Scholar] [CrossRef] [PubMed]
  65. Song, H.; Liu, J.; Chen, T.; Nan, Z.B. Synonymous Codon Usage Pattern in Model Legume Medicago truncatula. J. Integr. Agric. 2018, 17, 2074–2081. [Google Scholar] [CrossRef]
  66. Rao, A.; Chen, Z.; Wu, D.; Wang, Y.; Hou, N. Codon Usage Bias in the Chloroplast Genomes of Cymbidium Species in Guizhou, China. S. Afr. J. Bot. 2024, 164, 429–437. [Google Scholar] [CrossRef]
  67. Aktürk Dizman, Y. Exploring Codon Usage Patterns and Influencing Factors in Ranavirus DNA Polymerase Genes. J. Basic Microbiol. 2024, 64, e2400289. [Google Scholar] [CrossRef]
  68. Hershberg, R.; Petrov, D.A. General Rules for Optimal Codon Choice. PLoS Genet. 2009, 5, e1000556. [Google Scholar] [CrossRef] [PubMed]
  69. Aktürk Dizman, Y. Comprehensive Analysis of the Codon Usage Patterns in the Polyprotein Coding Sequences of the Honeybee Viruses. Front. Vet. Sci. 2025, 12, 1567209. [Google Scholar] [CrossRef]
  70. Buric, F.; Viknander, S.; Fu, X.; Lemke, O.; Carmona, O.G.; Zrimec, J.; Szyrwiel, L.; Mülleder, M.; Ralser, M.; Zelezniak, A. Amino Acid Sequence Encodes Protein Abundance Shaped by Protein Stability at Reduced Synthesis Cost. Protein Sci. 2025, 34, e5239. [Google Scholar] [CrossRef]
  71. Boro, N.; Alexandrino Fernandes, P.; Mukherjee, A.K. Computational Analysis to Comprehend the Structure-Function Properties of Fibrinolytic Enzymes from Bacillus spp. for Their Efficient Integration into Industrial Applications. Heliyon 2024, 10, e33895. [Google Scholar] [CrossRef]
  72. Hayat, S.; Hayat, Q.; Alyemeni, M.N.; Wani, A.S.; Pichtel, J.; Ahmad, A. Role of Proline under Changing Environments: A Review. Plant Signal. Behav. 2012, 7, 1456–1466. [Google Scholar] [CrossRef] [PubMed]
  73. Yang, X.; Lu, M.; Wang, Y.; Wang, Y.; Liu, Z.; Chen, S. Response Mechanism of Plants to Drought Stress. Horticulturae 2021, 7, 50. [Google Scholar] [CrossRef]
  74. Yang, Q.; Zhao, D.; Liu, Q. Connections Between Amino Acid Metabolisms in Plants: Lysine as an Example. Front. Plant Sci. 2020, 11, 928. [Google Scholar] [CrossRef]
  75. Nozaki, Y.; Tanford, C. The Solubility of Amino Acids and Related Compounds in Aqueous Thylene Glycol Solutions. J. Biol. Chem. 1965, 240, 3568–3575. [Google Scholar] [CrossRef] [PubMed]
  76. Zhang, S. Recent Advances of Polyphenol Oxidases in Plants. Molecules 2023, 28, 2158. [Google Scholar] [CrossRef]
  77. Sui, X.; Meng, Z.; Dong, T.; Fan, X.; Wang, Q. Enzymatic Browning and Polyphenol Oxidase Control Strategies. Curr. Opin. Biotechnol. 2023, 81, 102921. [Google Scholar] [CrossRef] [PubMed]
  78. Chakraborty, S.; Yengkhom, S.; Uddin, A. Analysis of Codon Usage Bias of Chloroplast Genes in Oryza Species: Codon Usage of Chloroplast Genes in Oryza Species. Planta 2020, 252, 67. [Google Scholar] [CrossRef]
  79. Wang, Z.; Cai, Q.; Wang, Y.; Li, M.; Wang, C.; Wang, Z.; Jiao, C.; Xu, C.; Wang, H.; Zhang, Z. Comparative Analysis of Codon Bias in the Chloroplast Genomes of Theaceae Species. Front. Genet. 2022, 13, 824610. [Google Scholar] [CrossRef]
  80. Bera, B.C.; Virmani, N.; Kumar, N.; Anand, T.; Pavulraj, S.; Rash, A.; Elton, D.; Rash, N.; Bhatia, S.; Sood, R.; et al. Genetic and Codon Usage Bias Analyses of Polymerase Genes of Equine Influenza Virus and Its Relation to Evolution. BMC Genom. 2017, 18, 652. [Google Scholar] [CrossRef]
  81. Wang, H.; Liu, S.; Lv, Y.; Wei, W. Codon Usage Bias of Venezuelan Equine Encephalitis Virus and Its Host Adaption. Virus Res. 2023, 328, 199081. [Google Scholar] [CrossRef]
  82. Li, X.; Song, H.; Kuang, Y.; Chen, S.; Tian, P.; Li, C.; Nan, Z. Genome-Wide Analysis of Codon Usage Bias in Epichloë festucae. Int. J. Mol. Sci. 2016, 17, 1138. [Google Scholar] [CrossRef] [PubMed]
  83. Yang, X.; Luo, X.; Cai, X. Analysis of Codon Usage Pattern in Taenia saginata Based on a Transcriptome Dataset. Parasites Vectors 2014, 7, 527. [Google Scholar] [CrossRef] [PubMed]
  84. Andargie, M.; Congyi, Z. Genome-Wide Analysis of Codon Usage in Sesame (Sesamum indicum L.). Heliyon 2022, 8, e08687. [Google Scholar] [CrossRef] [PubMed]
  85. Puigbò, P.; Bravo, I.G.; Garcia-Vallve, S. CAIcal: A Combined Set of Tools to Assess Codon Usage Adaptation. Biol. Direct 2008, 3, 38. [Google Scholar] [CrossRef]
  86. Paola, N.D.; De Melo Freire, C.C.; De Andrade Zanotto, P.M. Does Adaptation to Vertebrate Codon Usage Relate to Flavivirus Emergence Potential? PLoS ONE 2018, 13, e0191652. [Google Scholar] [CrossRef]
  87. Carbone, A.; Zinovyev, A.; Képès, F. Codon Adaptation Index as a Measure of Dominating Codon Bias. Bioinformatics 2003, 19, 2005–2015. [Google Scholar] [CrossRef]
  88. Li, Q.; Luo, Y.; Sha, A.; Xiao, W.; Xiong, Z.; Chen, X.; He, J.; Peng, L.; Zou, L. Analysis of Synonymous Codon Usage Patterns in Mitochondrial Genomes of Nine Amanita Species. Front. Microbiol. 2023, 14, 1134228. [Google Scholar] [CrossRef]
  89. De Mandal, S.; Mazumder, T.H.; Panda, A.K.; Kumar, N.S.; Jin, F. Analysis of Synonymous Codon Usage Patterns of HPRT1 Gene across Twelve Mammalian Species. Genomics 2020, 112, 304–311. [Google Scholar] [CrossRef]
  90. Zang, M.; He, W.; Du, F.; Wu, G.; Wu, B.; Zhou, Z. Analysis of the Codon Usage of the ORF2 Gene of Feline Calicivirus. Infect. Genet. Evol. 2017, 54, 54–59. [Google Scholar] [CrossRef]
  91. Gao, W.; Chen, X.; He, J.; Sha, A.; Luo, Y.; Xiao, W.; Xiong, Z.; Li, Q. Intraspecific and Interspecific Variations in the Synonymous Codon Usage in Mitochondrial Genomes of 8 Pleurotus Strains. BMC Genom. 2024, 25, 456. [Google Scholar] [CrossRef]
  92. Das, J.K.; Roy, S. Comparative Analysis of Human Coronaviruses Focusing on Nucleotide Variability and Synonymous Codon Usage Patterns. Genomics 2021, 113, 2177–2188. [Google Scholar] [CrossRef]
  93. Sharp, P.M.; Li, W.H. Codon Usage in Regulatory Genes in Escherichia coli Does Not Reflect Selection for “rare” Codons. Nucleic Acids Res. 1986, 14, 7737–7749. [Google Scholar] [CrossRef]
  94. Wright, F. The “effective Number of Codons” Used in a Gene. Gene 1990, 87, 23–29. [Google Scholar] [CrossRef]
  95. Niu, Y.; Luo, Y.; Wang, C.; Liao, W. Deciphering Codon Usage Patterns in Genome of Cucumis sativus in Comparison with Nine Species of Cucurbitaceae. Agronomy 2021, 11, 2289. [Google Scholar] [CrossRef]
  96. Sheng, J.; She, X.; Liu, X.; Wang, J.; Hu, Z. Comparative Analysis of Codon Usage Patterns in Chloroplast Genomes of Five Miscanthus Species and Related Species. PeerJ 2021, 9, e12173. [Google Scholar] [CrossRef] [PubMed]
  97. Sueoka, N. Two Aspects of DNA Base Composition: G+C Content and Translation- Coupled Deviation from Intra-Strand Rule of A = T and G = C. J. Mol. Evol. 1999, 49, 53–58. [Google Scholar] [CrossRef] [PubMed]
  98. Rahman, S.U.; Hu, Y.; Rehman, H.U.; Alrashed, M.M.; Attia, K.A.; Ullah, U.; Liang, H. Analysis of Synonymous Codon Usage Bias of Lassa Virus. Virus Res. 2025, 353, 199528. [Google Scholar] [CrossRef] [PubMed]
  99. Sueoka, N. Directional Mutation Pressure and Neutral Molecular Evolution. Proc. Natl. Acad. Sci. USA 1988, 85, 2653–2657. [Google Scholar] [CrossRef]
  100. Peng, Q.; Zhang, X.; Li, J.; He, W.; Fan, B.; Ni, Y.; Liu, M.; Li, B. Comprehensive Analysis of Codon Usage Patterns of Porcine Deltacoronavirus and Its Host Adaptability. Transbound. Emerg. Dis. 2022, 69, e2443–e2455. [Google Scholar] [CrossRef]
  101. Fan, K.; Li, Y.; Chen, Z.; Fan, L. GenRCA: A User-Friendly Rare Codon Analysis Tool for Comprehensive Evaluation of Codon Usage Preferences Based on Coding Sequences in Genomes. BMC Bioinform. 2024, 25, 309. [Google Scholar] [CrossRef] [PubMed]
  102. Angellotti, M.C.; Bhuiyan, S.B.; Chen, G.; Wan, X.F. CodonO: Codon Usage Bias Analysis within and across Genomes. Nucleic Acids Res. 2007, 35, 132–136. [Google Scholar] [CrossRef] [PubMed]
  103. Nambou, K.; Anakpa, M.; Tong, Y.S. Human Genes with Codon Usage Bias Similar to That of the Nonstructural Protein 1 Gene of Influenza A Viruses Are Conjointly Involved in the Infectious Pathogenesis of Influenza A Viruses. Genetica 2022, 150, 97–115. [Google Scholar] [CrossRef]
  104. Supek, F.; Vlahoviček, K. Comparison of Codon Usage Measures and Their Applicability in Prediction of Microbial Gene Expressivity. BMC Bioinform. 2005, 6, 18. [Google Scholar] [CrossRef] [PubMed]
Figure 1. ENC-plot analysis of PPO genes. Each shape represents an individual gene, while the standard curve depicts the expected ENC values under the assumption of random codon usage.
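The standard curve in an ENC plot can be sketched as follows. This is a minimal illustration that assumes the commonly used form of Wright's expectation, ENC = 2 + s + 29/(s² + (1 − s)²) with s the GC3s fraction; the function name `expected_enc` is hypothetical, and the input values are those reported for accession DQ513313.1 in Tables 1 and 2.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC for a GC3s fraction in [0, 1] when codon choice is
    constrained only by base composition (assumed Wright-style curve)."""
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must be a fraction between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Values reported for accession DQ513313.1 (GC3s in Table 1, ENC in Table 2).
gc3s_percent = 53.78
observed_enc = 57.07

expected = expected_enc(gc3s_percent / 100.0)
position = "below" if observed_enc < expected else "on or above"
print(f"expected ENC = {expected:.2f}, observed ENC = {observed_enc:.2f}")
print(f"The gene lies {position} the standard curve.")
```

A gene that falls below the curve has a lower ENC than composition alone would predict, which is usually read as a sign that factors other than mutation pressure contribute to its codon usage.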
Figure 2. PR2 plot analysis between AT bias [A3/(A3 + T3)] and GC bias [G3/(G3 + C3)] of PPO genes.
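The two PR2 coordinates in the caption are simple ratios, sketched below. The helper name `pr2_bias` is hypothetical. As a simplifying assumption, the A3s, T3s, G3s and C3s percentages reported for DQ513313.1 in Table 1 stand in for the third-position counts; PR2 analyses are often restricted to fourfold degenerate codon families, which this sketch does not attempt.

```python
def pr2_bias(a3: float, t3: float, g3: float, c3: float) -> tuple[float, float]:
    """Return (AT bias, GC bias) = (A3/(A3+T3), G3/(G3+C3)).

    A value of 0.5 on both axes means A = T and G = C at the third position.
    """
    if a3 + t3 == 0 or g3 + c3 == 0:
        raise ValueError("third-position totals must be non-zero")
    return a3 / (a3 + t3), g3 / (g3 + c3)


# Third-position values reported for DQ513313.1 in Table 1 (used as stand-ins).
at_bias, gc_bias = pr2_bias(a3=28.15, t3=29.86, g3=35.22, c3=34.76)
print(f"AT bias = {at_bias:.3f}, GC bias = {gc_bias:.3f}")
```

Points displaced from the centre (0.5, 0.5) indicate unequal use of A versus T or G versus C at the third codon position.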
Figure 3. Neutrality plot analysis of PPO genes between GC12s and GC3s. The slope value reflects the proportion of mutational pressure contributing to the total variation.
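The slope referred to in the neutrality plot caption comes from an ordinary least-squares fit of GC12 on GC3. The sketch below uses made-up toy values rather than the article's data, and the helper name `ols_slope_intercept` is hypothetical.

```python
def ols_slope_intercept(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy data (not from the article): GC3 on the x-axis, GC12 on the y-axis.
gc3 = [40.0, 45.0, 50.0, 55.0]
gc12 = [44.0, 44.5, 45.0, 45.5]

slope, intercept = ols_slope_intercept(gc3, gc12)
mutation_share = slope          # read as the share attributed to mutation pressure
other_share = 1.0 - slope       # remainder, attributed to selection and other factors
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Under the reading given in the caption, a slope near 1 would point to mutation pressure as the dominant force, while a slope near 0 leaves most of the variation to selection and other constraints.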
Figure 4. Correspondence analysis based on the RSCU values of PPO genes of C. sinensis (A) and other plant species (Camellia lanceoleosa, Camellia nitidissima, Camellia ptilophylla, Actinidia chinensis, Cornus florida, Rhododendron vialii) (B).
Figure 5. Amino acid usage frequency analysis of CsPPO genes.
Table 1. Nucleotide compositional analysis of PPO genes.

| Species | Accession No. | T3s (%) | C3s (%) | A3s (%) | G3s (%) | GC (%) | AT (%) | GC3s (%) | AT3 (%) | GC1 (%) | GC2 (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| C. sinensis | DQ513313.1 | 29.86 | 34.76 | 28.15 | 35.22 | 49.17 | 50.83 | 53.78 | 45.00 | 52.50 | 40.00 |
| C. sinensis | MK977642.1 | 29.94 | 36.09 | 29.67 | 32.21 | 49.25 | 50.75 | 52.59 | 45.86 | 51.90 | 41.72 |
| C. sinensis | MK977643.1 | 30.67 | 34.15 | 28.60 | 34.57 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | MK977644.1 | 35.37 | 31.91 | 27.00 | 32.25 | 48.83 | 51.17 | 49.74 | 48.66 | 53.19 | 41.95 |
| C. sinensis | MK977645.1 | 45.22 | 12.17 | 26.05 | 48.15 | 45.39 | 54.61 | 44.30 | 55.26 | 59.21 | 32.24 |
| C. sinensis | MZ442717.1 | 30.67 | 34.15 | 28.38 | 34.73 | 49.06 | 50.94 | 52.92 | 45.83 | 52.50 | 40.50 |
| C. sinensis | MZ442718.1 | 30.67 | 34.15 | 28.67 | 34.57 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | FJ656220.1 | 30.41 | 34.49 | 28.44 | 34.57 | 49.06 | 50.94 | 53.09 | 45.67 | 52.33 | 40.50 |
| C. sinensis | EU787433.1 | 30.67 | 34.15 | 28.67 | 34.57 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | JX465712.1 | 30.67 | 34.15 | 28.67 | 34.57 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | GQ214317.1 | 30.67 | 34.15 | 28.44 | 34.81 | 48.94 | 51.06 | 52.92 | 45.83 | 52.33 | 40.33 |
| C. sinensis | AY659975.1 | 30.4 | 34.8 | 29.74 | 31.95 | 48.90 | 51.10 | 51.79 | 46.88 | 52.08 | 41.49 |
| C. sinensis | FJ210643.1 | 30.12 | 34.84 | 28.51 | 34.65 | 49.06 | 50.94 | 53.36 | 45.33 | 52.00 | 40.50 |
| C. sinensis | EF650017.1 | 30.88 | 34.15 | 28.51 | 34.65 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | EF650016.1 | 30.53 | 34.02 | 28.15 | 35.14 | 49.25 | 50.75 | 53.18 | 45.58 | 52.75 | 40.57 |
| C. sinensis | EF635860.1 | 30.67 | 34.15 | 28.67 | 34.57 | 48.94 | 51.06 | 52.75 | 46.00 | 52.50 | 40.33 |
| C. sinensis | EF623826.1 | 30.67 | 34.15 | 28.6 | 34.57 | 48.89 | 51.11 | 52.75 | 46.00 | 52.33 | 40.33 |
| C. sinensis | MH250121.1 | 30.67 | 34.15 | 28.67 | 34.57 | 48.83 | 51.17 | 52.75 | 46.00 | 52.33 | 40.17 |
| C. sinensis | MH250120.1 | 30.66 | 34.36 | 28.08 | 34.98 | 49.11 | 50.89 | 53.18 | 45.50 | 52.83 | 40.00 |
| C. sinensis | MH250119.1 | 29.8 | 34.49 | 28.08 | 35.47 | 49.33 | 50.67 | 53.78 | 45.00 | 52.50 | 40.50 |
| C. sinensis | MH250118.1 | 30.25 | 34.57 | 28.31 | 35.14 | 49.22 | 50.78 | 53.44 | 45.33 | 52.67 | 40.33 |
| C. sinensis | GQ129142.1 | 29.86 | 34.76 | 28.15 | 35.22 | 49.11 | 50.89 | 53.78 | 45.00 | 52.33 | 40.00 |
| C. sinensis | MZ442720.1 | 30.82 | 33.88 | 28.83 | 34.48 | 48.83 | 51.17 | 52.49 | 46.33 | 52.33 | 40.50 |
| C. sinensis | MZ442719.1 | 29.86 | 34.56 | 28.15 | 35.38 | 49.17 | 50.83 | 53.78 | 45.00 | 52.50 | 40.00 |
| C. lanceoleosa | KAI7999995.1 | 37.45 | 29.76 | 27.27 | 31.68 | 48.38 | 51.62 | 47.66 | 50.67 | 53.68 | 42.14 |
| C. nitidissima | ACM43505.1 | 31.30 | 26.93 | 32.00 | 48.60 | 48.60 | 51.40 | 49.04 | 49.33 | 53.69 | 41.44 |
| C. ptilophylla | ABF19601.1 | 35.05 | 28.21 | 34.81 | 49.50 | 49.50 | 50.50 | 53.81 | 44.97 | 52.85 | 40.60 |
| A. chinensis | PSR98570.1 | 32.26 | 22.14 | 41.00 | 50.75 | 50.75 | 49.25 | 55.77 | 42.93 | 53.41 | 41.76 |
| C. florida | XM_059815174.1 | 28.63 | 27.19 | 31.03 | 47.60 | 47.60 | 52.40 | 46.00 | 52.74 | 52.57 | 42.95 |
| R. vialii | XM_058340762.1 | 38.34 | 25.81 | 32.76 | 49.83 | 49.83 | 50.17 | 55.61 | 43.00 | 50.33 | 42.17 |
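The position-specific columns of Table 1 can be illustrated with a short sketch that computes GC content at each codon position of a coding sequence. The function name `gc_by_codon_position` and the six-base input are hypothetical. Note that this computes plain GC3 over all codons, whereas GC3s as usually defined is restricted to synonymous codons (excluding Met, Trp and stop codons), so the two can differ slightly.

```python
def gc_by_codon_position(cds: str) -> dict[str, float]:
    """Percent GC overall and at codon positions 1, 2 and 3 of an in-frame CDS."""
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("sequence length must be a non-zero multiple of 3")
    result = {}
    for pos in range(3):
        bases = seq[pos::3]  # every third base, starting at this codon position
        gc_count = sum(base in "GC" for base in bases)
        result[f"GC{pos + 1}"] = 100.0 * gc_count / len(bases)
    result["GC"] = 100.0 * sum(base in "GC" for base in seq) / len(seq)
    result["AT"] = 100.0 * sum(base in "AT" for base in seq) / len(seq)
    # GC12 is the mean of GC1 and GC2, as used on the y-axis of a neutrality plot.
    result["GC12"] = (result["GC1"] + result["GC2"]) / 2.0
    return result


# Hypothetical two-codon sequence (ATG GCC), used only to show the mechanics.
composition = gc_by_codon_position("ATGGCC")
print(composition)
```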
Table 2. Indices of codon usage bias for PPO genes.

| Species | Accession No. | ENC | CBI | CAI | GRAVY | AROMA |
|---|---|---|---|---|---|---|
| C. sinensis | DQ513313.1 | 57.07 | −0.048 | 0.196 | −0.479466 | 0.09182 |
| C. sinensis | MK977642.1 | 55.87 | −0.064 | 0.196 | −0.467012 | 0.091537 |
| C. sinensis | MK977643.1 | 56.81 | −0.056 | 0.197 | −0.476628 | 0.09182 |
| C. sinensis | MK977644.1 | 57.73 | −0.092 | 0.193 | −0.411092 | 0.092437 |
| C. sinensis | MK977645.1 | 49.01 | −0.239 | 0.161 | −0.318543 | 0.046358 |
| C. sinensis | MZ442717.1 | 56.83 | −0.054 | 0.197 | −0.489149 | 0.09015 |
| C. sinensis | MZ442718.1 | 56.89 | −0.056 | 0.196 | −0.489983 | 0.09182 |
| C. sinensis | FJ656220.1 | 56.63 | −0.05 | 0.197 | −0.492321 | 0.09182 |
| C. sinensis | EU787433.1 | 56.89 | −0.056 | 0.196 | −0.489983 | 0.09182 |
| C. sinensis | JX465712.1 | 56.89 | −0.056 | 0.196 | −0.489983 | 0.09182 |
| C. sinensis | GQ214317.1 | 56.86 | −0.054 | 0.197 | −0.489983 | 0.09182 |
| C. sinensis | AY659975.1 | 55.80 | −0.015 | 0.2 | −0.451826 | 0.093913 |
| C. sinensis | FJ210643.1 | 56.66 | −0.047 | 0.198 | −0.492321 | 0.093489 |
| C. sinensis | EF650017.1 | 56.64 | −0.059 | 0.195 | −0.479633 | 0.09182 |
| C. sinensis | EF650016.1 | 56.37 | −0.044 | 0.199 | −0.511539 | 0.091973 |
| C. sinensis | EF635860.1 | 56.97 | −0.056 | 0.196 | −0.493155 | 0.09015 |
| C. sinensis | EF623826.1 | 56.81 | −0.056 | 0.197 | −0.476628 | 0.09182 |
| C. sinensis | MH250121.1 | 56.79 | −0.055 | 0.196 | −0.486644 | 0.09182 |
| C. sinensis | MH250120.1 | 56.19 | −0.041 | 0.201 | −0.482304 | 0.09182 |
| C. sinensis | MH250119.1 | 55.32 | −0.039 | 0.198 | −0.478798 | 0.093489 |
| C. sinensis | MH250118.1 | 55.52 | −0.033 | 0.204 | −0.53005 | 0.09182 |
| C. sinensis | GQ129142.1 | 56.93 | −0.046 | 0.198 | −0.47813 | 0.09182 |
| C. sinensis | MZ442720.1 | 57.47 | −0.05 | 0.198 | −0.489316 | 0.09182 |
| C. sinensis | MZ442719.1 | 57.11 | −0.045 | 0.197 | −0.485309 | 0.09182 |
| C. lanceoleosa | KAI7999995.1 | 57.76 | −0.105 | 0.192 | −0.404523 | 0.092127 |
| C. nitidissima | ACM43505.1 | 57.41 | −0.099 | 0.189 | −0.395798 | 0.090756 |
| C. ptilophylla | ABF19601.1 | 55.74 | −0.035 | 0.197 | −0.493109 | 0.092437 |
| A. chinensis | PSR98570.1 | 56.23 | −0.101 | 0.185 | −0.491167 | 0.093333 |
| C. florida | XM_059815174.1 | 55.09 | −0.025 | 0.214 | −0.528904 | 0.094684 |
| R. vialii | XM_058340762.1 | 55.00 | −0.026 | 0.196 | −0.46995 | 0.085142 |
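The two protein-level indices in Table 2 can be sketched as follows, assuming the usual definitions: GRAVY as the mean Kyte-Doolittle hydropathy value per residue, and AROMA as the fraction of aromatic residues (Phe, Tyr, Trp). The function names and the example peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (one value per standard amino acid).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = frozenset("FYW")


def gravy(protein: str) -> float:
    """Mean hydropathy per residue; negative values indicate a hydrophilic protein."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids found")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aroma(protein: str) -> float:
    """Fraction of aromatic residues (F, Y, W) among standard amino acids."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids found")
    return sum(aa in AROMATIC for aa in residues) / len(residues)


# Hypothetical peptide, used only to show the mechanics.
peptide = "MKFWYA"
print(f"GRAVY = {gravy(peptide):.4f}, AROMA = {aroma(peptide):.4f}")
```

On this reading, the uniformly negative GRAVY values in Table 2 correspond to hydrophilic proteins.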
Table 3. High-frequency (HF) codons identified in PPO genes of C. sinensis and other plant species.

| Species | Number of HF codons | HF codons identified |
|---|---|---|
| C. sinensis | 18 | TTC, CTC, ATT, GTG, TCC, CCG, ACC, GCC, TAC, CAT, CAA, AAC, AAG, GAT, GAG, TGC, CGA, GGG |
| C. lanceoleosa | 20 | TTC, CTT, ATT, GTG, TCC, CCT, ACC, GCC, TAT, TAC, CAT, CAA, AAG, GAT, GAG, TGT, CGA, AGA, GGG |
| C. nitidissima | 21 | TTC, TTG, ATT, GTG, TCC, CCT, CCC, ACC, GCT, TAT, TAC, CAT, CAA, AAT, AAG, GAT, GAG, TGT, TGC, CGG, GGG |
| C. ptilophylla | 19 | TTC, CTC, ATT, GTG, TCC, CCG, ACC, GCC, TAC, CAT, CAA, AAC, AAG, GAT, GAG, TGC, CGA, GGG |
| A. chinensis | 17 | TTT, TTG, ATT, GTG, TCC, CCG, ACC, GCC, TAT, CAC, CAA, AAT, AAG, GAT, GAG, AGG, GGG |
| C. florida | 20 | TTT, TTG, ATA, GTG, TCC, CCA, ACC, GCT, TAC, CAT, CAA, AAT, AAG, GAT, GAG, TGT, TGC, AGG, GGT |
| R. vialii | 20 | TTC, CTT, ATC, GTG, TCC, CCC, ACC, GCC, TAC, CAT, CAA, AAC, AAA, AAG, GAT, GAG, TGC, AGA, AGG, GGG |
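Codon preference tables such as Table 3 are typically derived from per-codon usage values. As a minimal sketch, the code below computes relative synonymous codon usage (RSCU) under the standard genetic code, where RSCU is the observed count of a codon multiplied by the number of synonymous codons for its amino acid and divided by the total count for that amino acid. The function name and the toy sequence are hypothetical, and the criterion the article used to call a codon high-frequency is not reproduced here.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds: str) -> dict[str, float]:
    """RSCU per codon for amino acids that occur in the sequence.

    Stop codons and single-codon amino acids (Met, Trp) are skipped.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue
        total = sum(counts[codon] for codon in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence with three Phe codons: TTC, TTC, TTT.
toy = rscu("TTCTTCTTT")
print({codon: round(value, 3) for codon, value in toy.items()})
```

An RSCU above 1 means a codon is used more often than expected under equal usage of its synonyms, and a value below 1 means it is used less often.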
Table 4. SCUO and MILC analyses of PPO genes in C. sinensis and other plant species.

| Species | SCUO | MILC |
|---|---|---|
| C. sinensis | 0.06 | 0.50 |
| C. lanceoleosa | 0.06 | 0.50 |
| C. nitidissima | 0.06 | 0.50 |
| C. ptilophylla | 0.06 | 0.49 |
| A. chinensis | 0.07 | 0.49 |
| C. florida | 0.07 | 0.48 |
| R. vialii | 0.06 | 0.49 |
Table 5. Correlation analysis between ENC values, CAI values and codon composition of CsPPO genes.

| Index | Statistic | GC | GC1 | GC2 | GC3s | A3s | T3s | C3s | G3s | AT3 |
|---|---|---|---|---|---|---|---|---|---|---|
| ENC | r | 0.87924 * | −0.90933 * | 0.8891 * | 0.79626 * | 0.55742 * | −0.82919 * | 0.90027 * | −0.88771 * | −0.82775 * |
| ENC | p | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| CAI | r | −0.4184 * | 0.47938 * | −0.16986 | −0.68726 * | −0.5215 * | 0.64212 * | −0.47633 * | 0.18885 | 0.64205 * |
| CAI | p | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | 0.37683 | <0.05 |
* p-value < 0.05.
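The r values in the correlation tables are correlation coefficients between pairs of per-gene indices. The sketch below assumes Pearson's r (the article's choice of coefficient is not stated in this section) and uses a hypothetical helper `pearson_r`. The example pairs GC3s (Table 1) with ENC (Table 2) for five C. sinensis accessions only, so its result is not expected to match the 24-gene value in Table 5.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("each sample must contain at least two distinct values")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


# Five-gene subset: MK977642.1, MK977644.1, MK977645.1, AY659975.1, MH250118.1.
gc3s = [52.59, 49.74, 44.30, 51.79, 53.44]   # GC3s (%) from Table 1
enc = [55.87, 57.73, 49.01, 55.80, 55.52]    # ENC from Table 2

r_subset = pearson_r(gc3s, enc)
print(f"r (five-gene subset) = {r_subset:.3f}")
```

A significance test (the p rows of the tables) would additionally require the sample size and a reference distribution, which this sketch leaves out.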
Table 6. Correlation analysis among ENC, A3s, T3s, G3s, C3s, GC3s, GRAVY and AROMA values.

| Index | Statistic | ENC | A3s | T3s | C3s | G3s | GC3s |
|---|---|---|---|---|---|---|---|
| GRAVY | r | −0.74322 * | −0.64531 * | 0.91663 * | −0.86921 * | 0.68189 * | −0.93131 * |
| GRAVY | p | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
| AROMA | r | 0.92501 * | 0.68872 * | −0.93759 * | 0.98578 * | 0.68189 * | 0.89588 * |
| AROMA | p | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 | <0.05 |
* p-value < 0.05.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Aktürk Dizman, Y. Codon Usage Bias of the Polyphenol Oxidase Genes in Camellia sinensis: A Comprehensive Analysis. Plants 2025, 14, 3074. https://doi.org/10.3390/plants14193074