Analysis of Functional Single-Nucleotide Polymorphisms (SNPs) and Leaf Quality in Tea Collection under Nitrogen-Deficient Conditions

This study examines genetic mutations that are significantly associated with economically important traits and that may benefit tea breeders. Its purpose was to analyze leaf quality and SNPs in quality-related genes in a tea plant collection of 20 mutant genotypes grown without nitrogen fertilizers. Leaf N content and catechin, L-theanine, and caffeine contents were analyzed in dry leaves via HPLC. Additionally, the photochemical yield, electron transport efficiency, and non-photochemical quenching were analyzed using PAM fluorimetry. A next-generation pooled amplicon sequencing approach was used for SNP calling in 30 key genes related to N metabolism and leaf quality. The leaf N content varied significantly among genotypes (p ≤ 0.05), from 2.3 to 3.7% of dry mass. The caffeine content varied from 0.7 to 11.7 mg g−1, and the L-theanine content varied from 0.2 to 5.8 mg g−1 dry leaf mass. Significant positive correlations were detected between the nitrogen content and biochemical parameters such as theanine, caffeine, and most of the catechins. However, significant negative correlations were observed between the photosynthetic parameters (Y, ETR, Fv/Fm) and several biochemical compounds, including rutin, quercetin-3-O-glucoside, kaempferol-3-O-rutinoside, kaempferol-3-O-glucoside, theaflavin-3′-gallate, and gallic acid. In the SNP analysis, three SNPs in WRKY57 were detected in all genotypes with a low N content. Moreover, 29 SNPs with a high or moderate effect were specific to #316 (high N content, high quality) or #507 (low N content, low quality). A linear regression model revealed 16 significant associations; theaflavin, L-theanine, and ECG were associated with several SNPs in the following genes: ANSa, DFRa, GDH2, 4CL, AlaAT1, MYB4, LHT1, F3′5′Hb, and UFGTa. Among them, seven SNPs of moderate effect led to amino acid changes in the proteins encoded by the following genes: ANSa, GDH2, 4CL, F3′5′Hb, and UFGTa.
These results will be useful for further evaluations of the important SNPs and will help to provide a better understanding of the mechanisms of nitrogen uptake efficiency in tree crops.
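
The abstract reports associations from a linear regression model but does not spell out its form. As a minimal sketch, assuming each SNP is coded as an alternate-allele dosage (0, 1, or 2) and regressed against a single trait, the fit could look like the following. The function name, the dosage coding, and the example values are illustrative assumptions, not data or methods taken from the study.

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x: allele dosages for one SNP (assumed coding: 0, 1, or 2).
    y: trait values for the same genotypes (e.g. L-theanine, mg/g).
    Returns (slope, intercept, r_squared).
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three paired observations")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("SNP shows no variation in this sample")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Share of trait variance explained by the SNP.
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical dosages and trait values, chosen only for illustration.
dosage = [0, 0, 1, 1, 2, 2]
theanine = [1.0, 1.2, 2.9, 3.1, 5.0, 5.2]
slope, intercept, r_squared = simple_linear_regression(dosage, theanine)
print(f"slope={slope:.2f} intercept={intercept:.2f} r2={r_squared:.3f}")
```

With only 20 genotypes, any such fit would also need a significance test and a correction for testing many SNPs; both are omitted here to keep the sketch short.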

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed in leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) a mean leaf area of 149 cm² (the tetraploid genotype #619), (b) a mean leaf area of 83–112 cm² (#1102, #316, #2697, Karatum, #1385, #212), and (c) a mean leaf area of 65–75 cm² (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in groups (b) and (c) were characterized by an increased genome size associated with tri- and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with values varying from 2.3 to 3.7% and standard deviations of 0.05–0.20% (Figure 1A). The highest leaf nitrogen content (3.4–3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N content (2.3%) was detected in the #507 mutant form. The caffeine content varied significantly among the genotypes, ranging from 0.7 to 11.7 mg g−1 dry leaf mass. The highest caffeine content (8.6–11.7 mg g−1) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv. Karatum. The content of L-theanine also varied significantly among the genotypes, ranging from 0.2 (in #507) to 5.8 (in #316) mg g−1 dry leaf mass.
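
The three extra-large groups described above can be expressed as a small lookup. This is a sketch that uses only the leaf-area ranges stated in the text; the function and constant names are hypothetical, and areas that fall outside the stated ranges are deliberately left unassigned because the text does not describe them.

```python
from typing import Optional

# Mean leaf area ranges (cm^2) for the three extra-large groups, as stated
# in the text. Group (a) is a single genotype (#619) with a mean of 149 cm^2.
GROUP_RANGES = {
    "a": (149.0, 149.0),
    "b": (83.0, 112.0),
    "c": (65.0, 75.0),
}


def leaf_area_group(mean_area_cm2: float) -> Optional[str]:
    """Return the group label for a mean leaf area, or None if the value
    lies outside every range given in the text."""
    for label, (low, high) in GROUP_RANGES.items():
        if low <= mean_area_cm2 <= high:
            return label
    return None


# Example calls; 100.0 is an arbitrary value inside range (b).
for area in (149.0, 100.0, 70.0, 58.0):
    print(area, leaf_area_group(area))
```

Returning `None` rather than guessing a group keeps the sketch from implying thresholds that the text does not give.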

Table 1 (partial). Leaf area (cm²) and leaf size class of the genotypes.

Genotype   Description    Leaf area (cm²)   Size class
…          …              46.1 ± 4.6 c      Large
#855       mutant form    55.6 ± 3.9 bc     Large
#527       mutant form    57.4 ± 5.0 bc     Large
#3986      mutant form    58.2 ± 5.1 bc     Large
#551       mutant form    65.6 ± 5.9 b      Extra-large
#157       mutant form    66.5 ± 5.8 b      Extra-large
#3180      mutant form    68.0 ± 6.0 b      Extra-large
#536       mutant form    68.7 ± 6.2 b      Extra-large
#1405      mutant form    69.8 ± 6.0 b      Extra-large

… amino acid transporters (LHT1) have been identified [5,6,33,57–59]. Additionally, amino acid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies.
The goal of this study was to analyze the diversity of, and the SNPs in, the key N metabolism genes and their correspondence with specific phenotypes in the collection of 20 tea genotypes in Northwest Caucasia.

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed according to their leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) those with a mean leaf area of 149 cm 2 (tetra-ploidy genotype #619), (b) those with a mean leaf area of 83-112 cm 2 (#1102, #316, #2697, Karatum, #1385, #212), and (c) those with a mean leaf area of 65-75 cm 2 (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in the (b) and (c) groups were characterized by an increased genome size and related to are tri-and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with leaf N content values varying from 2.3 to 3.7%, and the mean standard deviation was 0.05-0.20% ( Figure 1A). The highest leaf nitrogen content (3.4-3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N-content (2.3%) was detected in the #507 mutant form. The caffeine content (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.7 to 11.7 mg g −1 . The highest caffeine content (8.6-11.7 mg g −1 ) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv Karatum. The content of L-theanine (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.2 (in #507) to 5 amino-acid transporters (LHT1) have been identified [5,6,33,[57][58][59]. Additionally, aminoacid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in the N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies. 
The goal of this study was to analyze the diversity of and the SNPs in the key N metabolism genes and their correspondence with specific phenotypes in the collection of 20 tea genotypes in Northwest Caucasia.

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed according to their leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) those with a mean leaf area of 149 cm 2 (tetra-ploidy genotype #619), (b) those with a mean leaf area of 83-112 cm 2 (#1102, #316, #2697, Karatum, #1385, #212), and (c) those with a mean leaf area of 65-75 cm 2 (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in the (b) and (c) groups were characterized by an increased genome size and related to are tri-and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with leaf N content values varying from 2.3 to 3.7%, and the mean standard deviation was 0.05-0.20% ( Figure 1A). The highest leaf nitrogen content (3.4-3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N-content (2.3%) was detected in the #507 mutant form. The caffeine content (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.7 to 11.7 mg g −1 . The highest caffeine content (8.6-11.7 mg g −1 ) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv Karatum. The content of L-theanine (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.2 (in #507) to 5 amino-acid transporters (LHT1) have been identified [5,6,33,[57][58][59]. Additionally, aminoacid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in the N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies. 
The goal of this study was to analyze the diversity of and the SNPs in the key N metabolism genes and their correspondence with specific phenotypes in the collection of 20 tea genotypes in Northwest Caucasia.

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed according to their leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) those with a mean leaf area of 149 cm 2 (tetra-ploidy genotype #619), (b) those with a mean leaf area of 83-112 cm 2 (#1102, #316, #2697, Karatum, #1385, #212), and (c) those with a mean leaf area of 65-75 cm 2 (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in the (b) and (c) groups were characterized by an increased genome size and related to are tri-and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with leaf N content values varying from 2.3 to 3.7%, and the mean standard deviation was 0.05-0.20% ( Figure 1A). The highest leaf nitrogen content (3.4-3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N-content (2.3%) was detected in the #507 mutant form. The caffeine content (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.7 to 11.7 mg g −1 . The highest caffeine content (8.6-11.7 mg g −1 ) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv Karatum. The content of L-theanine (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.2 (in #507) to 5 amino-acid transporters (LHT1) have been identified [5,6,33,[57][58][59]. Additionally, aminoacid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in the N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies. 
The goal of this study was to analyze the diversity of and the SNPs in the key N metabolism genes and their correspondence with specific phenotypes in the collection of 20 tea genotypes in Northwest Caucasia.

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed according to their leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) those with a mean leaf area of 149 cm 2 (tetra-ploidy genotype #619), (b) those with a mean leaf area of 83-112 cm 2 (#1102, #316, #2697, Karatum, #1385, #212), and (c) those with a mean leaf area of 65-75 cm 2 (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in the (b) and (c) groups were characterized by an increased genome size and related to are tri-and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with leaf N content values varying from 2.3 to 3.7%, and the mean standard deviation was 0.05-0.20% ( Figure 1A). The highest leaf nitrogen content (3.4-3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N-content (2.3%) was detected in the #507 mutant form. The caffeine content (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.7 to 11.7 mg g −1 . The highest caffeine content (8.6-11.7 mg g −1 ) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv Karatum. The content of L-theanine (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.2 (in #507) to 5 amino-acid transporters (LHT1) have been identified [5,6,33,[57][58][59]. Additionally, aminoacid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in the N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies. 
The goal of this study was to analyze the diversity of and the SNPs in the key N metabolism genes and their correspondence with specific phenotypes in the collection of 20 tea genotypes in Northwest Caucasia.

Phenotypic Evaluation of the Collection
Among the 20 genotypes, significant differences were observed according to their leaf size, leaf N content, photosynthetic efficiency, and leaf quality. Fifteen genotypes were classified as extra large according to their leaf area (Table 1). These genotypes were separated into three groups: (a) those with a mean leaf area of 149 cm 2 (tetra-ploidy genotype #619), (b) those with a mean leaf area of 83-112 cm 2 (#1102, #316, #2697, Karatum, #1385, #212), and (c) those with a mean leaf area of 65-75 cm 2 (#551, #157, #3180, #536, #1405, #1877, #582, #1467). Most of the genotypes in the (b) and (c) groups were characterized by an increased genome size and related to are tri-and aneuploidy (Supplementary Table S1). Significant differences (p ≤ 0.05) in leaf N content were observed among the studied genotypes, with leaf N content values varying from 2.3 to 3.7%, and the mean standard deviation was 0.05-0.20% ( Figure 1A). The highest leaf nitrogen content (3.4-3.7%) was detected in several mutant forms, namely #316, #582, and #1405. The lowest N-content (2.3%) was detected in the #507 mutant form. The caffeine content (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.7 to 11.7 mg g −1 . The highest caffeine content (8.6-11.7 mg g −1 ) was observed in several genotypes with a high leaf nitrogen content, namely #316, #582, #1405, #212, #551, and cv Karatum. The content of L-theanine (mg g −1 dry leaf mass) varied significantly among the genotypes, with values ranging from 0.2 (in #507) to 5 amino-acid transporters (LHT1) have been identified [5,6,33,[57][58][59]. Additionally, aminoacid transporters such as AAPs, LHTs, CATs, ProTs, and UMAMITs play important roles in the N assimilation [60]. However, the polymorphisms in these genes have not been evaluated in genotypes with different N efficiencies. 
other genotypes. On the other hand, the lowest NPQ value was detected in #855, #582, and #1102, evidencing the higher level of photochemical energy utilization in these three genotypes compared to the other genotypes.

Detection of SNPs in the Selected Genes and their Relationships with Phenotypes
A high level of variability in SNP number was observed among the target genes (Table 2). The highest SNP density in the exon regions (about 2) was observed in three genes, namely F3′5′Hb, 4CL, and AMT1.2, while the lowest SNP density (about 0.0-0.1) was detected in the exons of bG, WD40, GDH2, LAR, AlaAT1, bHLH35, MYB7, and bHLH36. Additionally, the highest rate of SNPs per exon (more than 30%) was detected in F3′5′Hb, AMT1.2, DFRa, PIP, and bG.

Interestingly, no polymorphisms were detected in FLS, CHS, and ANRa. The SnpEff variant annotation tool was used to classify the SNPs according to their effect impact, and the SNPs were classified as either high-effect, moderate-effect, low-effect, or modifier SNPs.
We generated heatmaps to compare the SNP frequencies in the exon regions of the studied genes (Figure 5). The heatmaps indicated that the highest frequencies occurred in AMT1.2, UFGTa, UFGTb, and 4CL in most of the genotypes, while the lowest frequencies were observed in GDHa, GDH2, WD40, bHLH35, AlaAT1, LAR, and bG. The application of the neighbor-joining method to the studied mutant forms indicated two distinct branches, one of which was divided into two sub-branches. The first branch consisted of six genotypes with the highest SNP frequencies, namely #507, #855, #1385, #3986, #1467, and #619. Most of them were characterized by a low N content and low caffeine and L-theanine contents. The second branch consisted of 14 genotypes and was divided into two sub-branches. Most of the genotypes with high leaf quality shared the same sub-branch.
The sets of SNPs classified as high- or moderate-effect were then identified (Table S3). Among them, two SNPs in WRKY57 were most frequent among the genotypes with a low nitrogen content and low tea quality. Moreover, 29 SNPs with a high or moderate effect were associated with #316 or #507 (two genotypes that contrast in their leaf N content and leaf quality). Specifically, for #507, these SNPs were observed in the following genes: GDHa (1 SNP), GDH2 (1), WD40 (2), 4CL (7), F3′5′Hb (3), WRKY57 (2), and UFGTa (2).
A linear regression model was applied to reveal the possible associations between the phenotypic data and the SNPs. This analysis yielded 16 significant associations at a significance level of p < 0.05 (Table 3). Theaflavin, L-theanine, and ECG were associated with several SNPs of the following genes: ANSa, DFRa, GDH2, 4CL, AlaAT1, MYB4, LHT1, F3′5′Hb, and UFGTa. Among them, seven SNPs of moderate effect may be responsible for amino acid changes in the final proteins of the following genes: ANSa, GDH2, 4CL, F3′5′Hb, and UFGTa.

Discussion
New regulatory approaches to plant mineral nutrition have been outlined to improve yield quality and quantity. They are based on a desire to create cultivars that can effectively adapt to a specific level of soil fertility and are characterized by high nitrogen uptake and utilization efficiency [1,61]. Different plant genotypes take up and utilize soil nitrogen with different levels of efficiency [6,7]. Thus, the identification of these genotypes and the discovery of the mechanisms underlying high-level N deficiency are important to develop molecular markers and facilitate their further application in breeding programs [3-5].
In this study, twenty tea genotypes displayed significant differences in leaf size; ploidy level; and N, caffeine, L-theanine, and catechin contents. Genotype #507 was characterized by the smallest leaf size; the lowest N, L-theanine, and caffeine contents; the lowest photochemical yield (Fv/Fm), operating efficiency of PSII (Y), and electron transport rate (ETR); and the highest non-photochemical quenching compared to the other genotypes. These results indicate a low level of photochemical utilization of light energy and, consequently, a low photosynthetic capacity in this genotype [62-64]. The mutant form #316 was characterized by an extra-large leaf size and showed the highest N, caffeine, and theanine contents, with average values of Y, ETR, and Fv/Fm compared to the other genotypes. The SNP distributions and frequencies of these two genotypes were highly dissimilar, and the two were placed in different branches, indicating a large genetic difference between them.
Significant positive correlations were detected between the nitrogen content and biochemical parameters such as theanine, caffeine, and catechin contents. L-theanine and caffeine have been shown to be positively correlated with soil N content, which is consistent with our results [65]. However, according to our results, no negative correlation was observed between theanine and catechins. This is not consistent with the other studies in the literature, as some have reported that catechin and L-theanine contents are negatively correlated [66]. Surprisingly, in this study, we observed no correlations between nitrogen content and the parameters of photosynthetic efficiency (Y, ETR, NPQ, Fv/Fm).
The allele frequency data are useful for identifying the loci underlying phenotypic responses to selection or natural variation in phenotypes. The highest SNP frequencies were detected in UFGTa, UFGTb, 4CL, and AMT1.2, indicating their high variability among tea accessions. The first three genes are related to tea quality: UFGT encodes UDP-glucose:flavonoid 3-O-glucosyltransferase, an enzyme of the anthocyanin biosynthesis pathway, while 4CL encodes 4-coumarate:CoA ligase, which participates in the biosynthesis of flavonoids [67]. AMT1.2 is one of the key genes encoding an ammonium transporter that regulates NH4+ uptake [68]. Tea plants have been reported to utilize ammonium more efficiently than nitrate, resulting in better growth [69,70]. However, nitrate-fertilized young shoots have been shown to exhibit a greater total catechin content and higher expression of genes encoding the flavonoid biosynthetic enzymes dihydroflavonol 4-reductase (DFR), chalcone synthase (CHS), and phenylalanine ammonia-lyase (PAL) compared to ammonium-fertilized shoots [71].
SnpEff assigns a putative impact to each SNP [72]. A high-impact variant is assumed to be disruptive, for example by introducing a premature stop codon or causing loss of function. A moderate-impact variant is non-disruptive but may change protein function or structure, as in the case of a missense substitution. A low-impact variant is mostly harmless, for example a silent (synonymous) mutation that does not alter the amino acid sequence. A modifier variant usually lies in a non-coding region, such as an intron or a region upstream or downstream of a gene, where the effect is difficult to predict. Through analyzing SNPs with a high or moderate effect, we identified 18 SNPs that are unique to the low-quality genotype #507 in the following genes: GDHa (1 SNP), GDH2 (1), WD40 (2), 4CL (7), F3′5′Hb (3), WRKY57 (2), and UFGTa (2). Among them, only two SNPs, in GDH2 and 4CL, are significantly associated with theaflavin and ECG and promote changes in protein structure. As mentioned above, 4CL has been shown to participate in the flavonoid biosynthesis pathway, and this finding is consistent with our results. GDH encodes glutamate dehydrogenases, central enzymes in nitrogen metabolism that either assimilate ammonium into glutamate or deaminate glutamate into 2-oxoglutarate. The tea plant has two types of GDH genes: CsGDH1 encodes the β-GDH subunit, and CsGDH2/3 encode the α-GDH subunit; their proteins all feature an NADH-specific motif [73].
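The impact classes described above can be used to select candidate SNPs from annotated variants. The following is a minimal sketch, assuming the annotations are available as SnpEff-style `ANN` strings (pipe-separated subfields, with the effect, impact, and gene name in the second, third, and fourth positions). The example annotation string, transcript labels, and HGVS values are invented for illustration and are not taken from this study.

```python
# Sketch: keep only HIGH/MODERATE annotations from a SnpEff-style ANN value.
# The example ANN string below is hypothetical, not data from the study.

EFFECT_IDX, IMPACT_IDX, GENE_IDX = 1, 2, 3  # positions within one ANN entry


def parse_ann(ann_value):
    """Split an ANN value into (effect, impact, gene) tuples."""
    records = []
    for entry in ann_value.split(","):       # one entry per annotated transcript
        fields = entry.split("|")
        if len(fields) <= GENE_IDX:          # skip malformed entries
            continue
        records.append((fields[EFFECT_IDX], fields[IMPACT_IDX], fields[GENE_IDX]))
    return records


def high_or_moderate(ann_value):
    """Return only the annotations with HIGH or MODERATE impact."""
    return [r for r in parse_ann(ann_value) if r[1] in {"HIGH", "MODERATE"}]


example_ann = (
    "A|missense_variant|MODERATE|GDH2|gene1|transcript|tx1|protein_coding|3/9|c.100G>A|p.Val34Ile,"
    "A|synonymous_variant|LOW|4CL|gene2|transcript|tx2|protein_coding|2/5|c.30C>A|p.Gly10Gly"
)
selected = high_or_moderate(example_ann)
print(selected)
```

In this sketch only the missense entry is retained, mirroring the restriction of the analysis to SNPs of high or moderate effect.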
To summarize, in this study, we revealed significant positive correlations between nitrogen content and biochemical parameters such as theanine, caffeine, and catechin contents. However, significant negative correlations were observed between photosynthetic parameters (Y, ETR, Fv/Fm) and several biochemical compounds, such as rutin, quercetin-3-O-glucoside, kaempferol-3-O-rutinoside, kaempferol-3-O-glucoside, theaflavin-3′-gallate, and gallic acid. The application of a linear regression model revealed 16 significant associations; theaflavin, L-theanine, and ECG were associated with several SNPs of the following genes: ANSa, DFRa, GDH2, 4CL, AlaAT1, MYB4, LHT1, F3′5′Hb, and UFGTa. Among them, seven SNPs of moderate effect led to amino acid changes in the final proteins of the following genes: ANSa, GDH2, 4CL, F3′5′Hb, and UFGTa. Among the 18 SNPs that were found to be unique to the low-quality genotype #507, only two SNPs (in GDH2 and in 4CL) were observed to have significant associations with theaflavin and ECG and promote changes in protein structure. Our results will be useful for further analyses of the associations of these SNPs across broader tea germplasm collections and for the development of molecular markers for trait-oriented tea breeding.

Plant Materials and Phenotypic Evaluation
The plant materials were obtained from the field gene bank of the Federal Research Centre the Subtropical Scientific Centre of the Russian Academy of Sciences (FRC SSC RAS). Mutant forms derived in the USSR between 1970 and 1980 via the γ-irradiation of seeds (mostly of cv. "Kolkhida" and cv. "Qimen") were included in this study (Table S1). All plants were about 31-33 years old. All plants were clonally propagated with 30-60 replicates per genotype and grown on brown forest acid soil (pH 5.5) with a nitrogen content of 30 mg kg−1 (compared to the optimal 80 mg kg−1 N for tea plantations). No fertilizers have been applied in the experimental plot for the last 27 years.
The leaf-related traits were characterized using the ten most fully expanded mature leaves collected from each cultivar and each replicate. The leaf area size was classified for all 106 genotypes of the entire tea collection according to Wang and Tang [74]: (1) small leaf (leaf area ≤ 20 cm²); (2) middle leaf (leaf area 20-40 cm²); (3) large leaf (leaf area 40-60 cm²); and (4) extra-large leaf (leaf area ≥ 60 cm²).
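The four leaf-area classes above can be expressed as a simple threshold function. This is a minimal sketch, not the authors' implementation; because the published class limits share their boundary values, it assumes that exactly 20 cm² is "small" and exactly 60 cm² is "extra-large" (as the inequalities above state), and, as an additional assumption of this sketch, that exactly 40 cm² is "large".

```python
def classify_leaf_area(area_cm2):
    """Assign a leaf-size class from the mean leaf area in cm^2.

    Boundary handling at 40 cm^2 is an assumption of this sketch.
    """
    if area_cm2 < 0:
        raise ValueError("leaf area must be non-negative")
    if area_cm2 <= 20:
        return "small"
    if area_cm2 < 40:
        return "middle"
    if area_cm2 < 60:
        return "large"
    return "extra-large"


# Mean leaf areas reported for the extra-large groups fall in the last class.
for area in (149, 83, 65):
    print(area, classify_leaf_area(area))
```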
Photosynthetic efficiency was analyzed in the dark-acclimated leaves using the Junior-PAM chlorophyll-fluorometer with default settings. Ten mature leaves from each plant were included in the analysis. After applying actinic light, the following parameters were analyzed: Fv/Fm-maximum photochemical quantum yield of PS II; Y(II)-Effective photochemical quantum yield of PS II; NPQ-Stern-Volmer type non-photochemical fluorescence quenching; and ETR-electron transport rate [62][63][64].
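For orientation, the four fluorescence parameters listed above are conventionally derived from the measured fluorescence levels as sketched below. The variable names (F0, Fm, F, Fm′, PAR) and the ETR constants (leaf absorptance 0.84 and a 0.5 share of light allocated to PS II) are commonly used defaults and are assumptions here; the instrument settings actually used in the study are not specified beyond "default settings".

```python
def fv_fm(f0, fm):
    """Maximum photochemical quantum yield of PS II (dark-acclimated leaf)."""
    return (fm - f0) / fm


def yield_psii(f, fm_prime):
    """Effective photochemical quantum yield of PS II, Y(II), in actinic light."""
    return (fm_prime - f) / fm_prime


def npq(fm, fm_prime):
    """Stern-Volmer type non-photochemical quenching."""
    return (fm - fm_prime) / fm_prime


def etr(y_ii, par, absorptance=0.84, psii_fraction=0.5):
    """Electron transport rate; both constants are assumed default values."""
    return y_ii * par * absorptance * psii_fraction


# Hypothetical fluorescence readings (arbitrary units), not measured data.
print(fv_fm(200, 1000), yield_psii(300, 600), npq(1000, 500), etr(0.5, 100))
```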
The leaf nitrogen content in the mature leaves was analyzed spectrophotometrically using the Kjeldahl method, which includes digestion (samples were heated in the presence of sulfuric acid), distillation of the solution, and the conversion of ammonium salt to ammonia via the addition of sodium hydroxide, followed by trapping the distilled vapors in an HCl-water solution. Finally, the amount of ammonia, and hence of nitrogen, present in the sample was determined via back titration, i.e., the neutralization of HCl using NaOH solution [75].
The contents of caffeine, L-theanine, and catechins (gallocatechin (GC), epigallocatechin (EGC), epicatechin (EC), epicatechin gallate (ECG), gallocatechin gallate (GCG), and epigallocatechin gallate (EGCG)) (mg g−1 dry leaf mass) were evaluated via HPLC using the following extraction protocol: adult tea leaves (3-4 leaves from the top of the branch) were fixed via steam treatment at 100 °C for 20 min in a water bath and subsequently dried. Approximately 200 mg of dried tea leaves was placed into a hermetically sealed container containing 4.0 mL of 80% methanol-water solution, which was subsequently incubated for one week at +4 °C in the dark. After that, the vessels with the methanol leaf extracts were placed in a UV bath for 30 min and then centrifuged at 13,000× g for 10 min. A total of 1 mL of supernatant was injected into the HPLC column. An Agilent Technologies 1100 HPLC chromatograph, equipped with a flow-through vacuum degasser G1379A, four-channel low-pressure gradient pump G13111A, automatic injector G1313A, column thermostat G13116A, and diode array detector G1316A, was used. A 2.1 × 150 mm "ZORBAX-XDB C18" column filled with octadecylsilyl sorbent (grain size of 3.5 µm) was applied. An acetonitrile gradient was used; the initial composition of the mobile phase, consisting of 90% (v/v) of solvent A (0.1% H3PO4) and 10% of solvent B (90.0% acetonitrile), was maintained for 8 min. Solvent A was then decreased linearly to 40% at 25 min and 0% at 90 min before being increased to 100% at 29.1 min to 34 min. Programming was then continued in the isocratic mode as follows: 40% A at 70.1 to 75.0 min and 7% A at 75.1 to 90.1 min (flow rate of 0.30 mL/min, column temperature of 40 °C). The detection wavelengths were 195 nm for L-theanine and 273 nm for caffeine and catechins. The substances were identified based on the retention times of the standards of the respective compounds.

Gene Selection and Primer Design and Long-Range Polymerase Chain Reaction (LR-PCR)
Thirty target genes were selected from the literature data (Table S2). The flanking primers were designed based on the reference tea genome Camellia sinensis var. sinensis cv. Shuchazao [76,77].
The LR-PCR mixture of 20 µL consisted of 10 µL of 2× LR-PCR buffer containing a mix of HS-Taq and Pfu DNA polymerases (Biolabmix, Novosibirsk, Russia, https://biolabmix.ru/catalog/pcr/long-range/ (accessed on 21 September 2023)), 0.3 µL (10 µM) of each primer (forward and reverse), sterile PCR water, and 1 µL of the DNA sample (50 ng µL−1). Amplification was performed in a MiniAmp thermal cycler (Thermo Fisher Scientific, USA) according to the following protocol: one cycle of preheating at 94 °C for 4 min, 35 cycles of amplification (denaturation at 94 °C for 20 s, annealing at 58-62 °C for 20 s, elongation at 68 °C for 2.5-5.5 min), and final elongation at 68 °C for 10 min. The PCR products were separated in 2% agarose gel for 2.5 h at 90 V. After that, the fragments were cut out from the agarose gel, filtered through absorbent cotton [78], and then spun at 10,000× g for 15 min; 1/5 volume of 3 M sodium acetate and 80% volume of isopropanol were added, mixed, incubated vertically at −80 °C for 15 min, and centrifuged at 13,000× g for 20 min at +4 °C. Finally, the pellets were washed twice with 500 µL of 80% ethanol and dissolved in 10 µL of PCR water.

Pooled Amplicon Sequencing, Filtering and Variation Calling
For sequencing, one sample was obtained from each variety. To prepare fragment DNA libraries, we used PCR products that were obtained via the amplification of the target genes of the tea collection. Fragment DNA libraries were prepared from the equimolarly mixed PCR products using the NEBNext Ultra II DNA Library Prep Kit for Illumina according to the manufacturer's protocol. Briefly, 526 ng of amplified PCR product was fragmented to 200-300 bp using a Covaris S220 with a microTUBE-50 AFA Fiber Screw-Cap (Covaris, Woburn, MA, USA) in 50 µL of sterile water. Fragmented DNA was used for the subsequent 3′ adenylation and ligation of the NEBNext Adapter for Illumina, followed by 3 cycles of amplification. A qualitative evaluation of the resulting libraries was carried out on an Agilent TapeStation 4150 using High Sensitivity D5000 ScreenTape and High Sensitivity D5000 Reagents kits (Agilent, Santa Clara, CA, USA). A quantitative evaluation of the products was performed via real-time PCR using the KAPA Library Quantification Kit (KAPA Biosystems, Wilmington, MA, USA).
The obtained DNA libraries were mixed equimolarly into a pool and sequenced on the Illumina MiSeq with paired-end reads (76 + 76 bp). Sequencing data were demultiplexed via index sequences using the bcl2fastq v2.20.0.422 program with default parameters. In total, 184,000-392,000 pairs of reads were obtained for each DNA library. The initial quality assessment of the deep sequencing data was performed using the FastQC v0.11.2 software [79]. The AdapterRemoval v2 program [80] (with parameters --trimqualities, --minquality 20, --minlength 50) was used to remove adapters and low-quality sequences. A total of 94.34% of pairs of reads were preserved after filtering (Table S2).
Filtered data were mapped against the reference genome of the tea plant (GCF_004153795.1). Mapping was performed using the bwa mem function from the BWA package [81]. The MarkDuplicates function of the picard-tools v2.22.2 software package (Picard toolkit [82]) was applied to remove duplicates. The quality of the alignments was evaluated using the Samtools v1.9 software package [83]. The depth of coverage of the target genome regions was assessed using the CollectWgsMetrics function of the picard-tools software package (https://broadinstitute.github.io/picard/ accessed on 21 September 2023) (with the COVERAGE_CAP = 10,000 parameter). On average, 96.44% of reads were mapped to the tea genome. For each sample, on average, we obtained 229-fold coverage of the target genome regions of tea.
SNP density was calculated as the mean number of SNPs per gene divided by the fragment length of the gene in kb. To obtain an overview of the SNP distribution and the possible enrichment of SNPs in the genes, we normalized the SNP frequency in each gene. The SNP frequency in each gene was calculated using the following formula:
SNP_freq = (SNP_count_per_gene / gene_length) × 10³
where SNP_count_per_gene is the number of SNPs detected in a certain gene, and gene_length is the length of the gene. The factor 10³ was applied to scale the SNP_freq values in order to facilitate a fair and easy comparison.
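The normalization above amounts to a single division and rescaling. The following minimal sketch assumes that gene length is given in base pairs, so that the factor 10³ yields SNPs per kb; the gene labels and counts in the example are hypothetical and are not values from Table 2.

```python
def snp_frequency(snp_count, gene_length_bp, scale=1000):
    """Normalized SNP frequency: SNPs per `scale` bp of gene length."""
    if gene_length_bp <= 0:
        raise ValueError("gene length must be positive")
    return snp_count * scale / gene_length_bp


# Hypothetical (gene label -> (SNP count, gene length in bp)) examples.
example_genes = {"geneA": (6, 3000), "geneB": (1, 10000)}
frequencies = {
    gene: snp_frequency(count, length)
    for gene, (count, length) in example_genes.items()
}
print(frequencies)
```

Normalizing by length in this way lets a short gene with few SNPs be compared directly with a long gene carrying many.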

Statistical Analysis
Statistical analyses of the data were carried out using XLSTAT software (free trial version) (https://www.xlstat.com/ accessed on 21 September 2023). A one-way ANOVA, Student's t-test, and Tukey's test were applied to determine the significant differences between the variants. Additionally, hierarchical clustering was performed, with dissimilarities calculated using the Dice coefficient and agglomeration using Ward's method. A principal component analysis was conducted based on Pearson (n) correlations. To find the associations between SNPs and the phenotypes, a linear regression model was applied in conjunction with a statistical test adjusted for multiple comparisons (Bonferroni), with significant associations being noted at p values < 0.05.
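The association step described above combines a per-SNP linear regression with a Bonferroni adjustment. The sketch below illustrates both pieces in plain Python under stated assumptions: genotypes are coded as allele dosages (0, 1, 2), which is an assumption of this example rather than a documented detail of the XLSTAT analysis, and the phenotype values and p-values are invented. Computing the regression p-value itself requires a t-distribution and is left to a statistics package.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def bonferroni(p_values):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical allele dosages and phenotype values for one SNP.
dosage = [0, 1, 2, 0, 1, 2]
phenotype = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope, intercept = ols_fit(dosage, phenotype)

# Hypothetical raw p-values from three SNP-trait tests.
adjusted = bonferroni([0.001, 0.02, 0.5])
significant = [p < 0.05 for p in adjusted]
print(slope, intercept, adjusted, significant)
```

With three tests, a raw p-value of 0.02 no longer passes the 0.05 threshold after adjustment, which illustrates how the correction limits false-positive associations.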