Identification and Characterization of LaSCL6 Alleles in Larix kaempferi (Lamb.) Carr. Based on Analysis of Simple Sequence Repeats and Allelic Expression

Abstract: Simple sequence repeats (SSRs) are widely used as markers for the assessment of genetic diversity and marker-assisted breeding. In a previous study, two SSRs (GCA and CCA) were found in the genomic sequence of Larix (La) SCL6, which plays important roles in the growth and development of Larix kaempferi (Lamb.) Carr. In this study, we analyzed the polymorphisms of these two SSRs in the L. kaempferi population. We found that each SSR had five different polymorphisms, among which (GCA)7 and (CCA)7 were predominant. In addition, 12 haplotypes were detected, with (GCA)7(CCA)7 having the highest frequency. Furthermore, we detected the haplotypes of LaSCL6 in mature trees and their seeds and analyzed the relationships between parents and offspring. The expression patterns of five LaSCL6 alleles were analyzed and they showed balanced expression during vegetative development. Taken together, these findings not only provide more genetic information on LaSCL6, but also provide a candidate marker for genetic studies and breeding.


Introduction
Simple sequence repeats (SSRs) are a class of DNA sequences consisting of short, tandemly repeated motifs (1–6 bp in length) [1]. SSR sequences are highly polymorphic, and their presence results in allelic variants with different frequencies at the population level [2–4]. In addition, some allelic variants show differential expression during plant growth and development [5–7]. Due to their advantages of being co-dominant, multi-allelic, reliable, PCR-based, and abundant in plant genomes [8], SSRs have been used as markers for the assessment of genetic diversity [9,10], genetic structure [11], parentage analyses [12], pedigree and mating system analyses [13], and marker-assisted breeding [14].
Larix kaempferi (Lamb.) Carr., a forest tree of considerable ecological and economic value that is widely grown in the northern hemisphere, is monoecious and mainly wind-pollinated [15]. In recent years, SSRs have been used to investigate the pollen contamination rate and paternal contributions in L. kaempferi, which affect seed productivity and severely limit the improvement of seed orchard yield and the construction of clonal seed orchards [16,17].
SCARECROW-LIKE6 (SCL6), a member of the GRAS (GAI-RGA-SCR) family that is regulated by microRNA171 at the post-transcriptional level, is involved in many aspects of growth and development, such as shoot branching [18], meristem maintenance [19], and somatic embryogenesis [20,21]. In a previous study, we identified the homologue of SCL6 in L. kaempferi, and we found two SSRs (GCA and CCA) in the genomic sequence of LaSCL6 [22]. Understanding the genetic variation of a specific gene in a population is important for the study of its functions and for future breeding improvements.
In this study, we analyzed the polymorphism of the SSR sequences of LaSCL6 in the L. kaempferi population, and we studied the allele-specific expression of five alleles. Our results not only provide more genetic information on LaSCL6, but also provide a candidate marker for genetic studies and breeding.

Plant Materials
All the materials were collected from the Dagujia seed orchard (42°22′ N, 124°51′ E), Liaoning Province, in Northeast China. Endosperms separated from 200 mature seeds were used to assess the frequency of SSRs; after the seeds had been immersed in water for one day, the seed coats were removed and the endosperms were isolated for DNA extraction. The needles and seeds of three mature trees were used to study the transmission of genetic information from parents to offspring; endosperms and embryos were sampled separately, and the needles, endosperms, and embryos were used for DNA extraction. Six-month-old seedlings were used to study the expression of alleles. After the alleles of 40 seedlings had been analyzed, the needles, stem, and root of a single seedling were sampled for RNA extraction. All the samples used for DNA and RNA extraction were frozen in liquid nitrogen and stored at −80 °C.

DNA Extraction and Polymerase Chain Reaction (PCR) Amplification
The genomic DNA of L. kaempferi was isolated with the CTAB plant genome DNA rapid extraction kit (Aidlab Biotech, Beijing, China) according to the manufacturer's protocols. The primers 5′-AGCGAGGTCAAGAAAGAAGAGC-3′ and 5′-TTGGGAACGAATGGCGTAGGG-3′ were used to amplify the sequence fragment containing the two SSR loci from the DNA template with Platinum® Taq DNA polymerase (Invitrogen, Carlsbad, CA, USA). The PCR products were purified with a gel extraction kit (Tiangen, Beijing, China), ligated into the pEASY®-T1 simple cloning vector (TransGen Biotech, Beijing, China), and sequenced. Multiple sequence alignments were performed with ClustalX [23]. The tertiary structures were predicted by SWISS-MODEL (https://swissmodel.expasy.org/).
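As an illustration of how a haplotype label such as (GCA)7(CCA)7 can be derived from a sequenced clone, the following minimal Python sketch counts the longest uninterrupted run of each motif. It is not the procedure used in this study, where repeat numbers were read from ClustalX alignments; the function names and the example read (including its flanking bases) are hypothetical. Because it only counts perfect repeats, a SNP inside the repeat region would split a run and lower the count.

```python
import re

def longest_repeat(seq: str, motif: str) -> int:
    """Return the largest number of consecutive, perfect copies of `motif` in `seq`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)

def call_haplotype(seq: str) -> str:
    """Label a clone by the repeat numbers of its GCA and CCA tracts."""
    gca = longest_repeat(seq, "GCA")
    cca = longest_repeat(seq, "CCA")
    return f"(GCA){gca}(CCA){cca}"

# Hypothetical clone read: the flanking bases are invented for illustration only.
read = "TTAG" + "GCA" * 7 + "GTTCA" + "CCA" * 7 + "TGAC"
print(call_haplotype(read))
```

Counting the longest run rather than all motif occurrences avoids inflating the repeat number with isolated copies of the motif elsewhere in the amplicon.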

RNA Extraction and Quantitative Reverse Transcription Polymerase Chain Reaction (qRT-PCR)
Total RNA was isolated from the needle, stem, and root with the EasyPure RNA kit (TransGen Biotech, Beijing, China) according to a modified version of the manufacturer's protocol and then reverse-transcribed into cDNA with the TransScript® II one-step gDNA removal and cDNA Synthesis SuperMix kit (TransGen Biotech, Beijing, China). TB Green® Premix Ex Taq™ (Tli RNase H Plus) (Takara, Shiga, Japan) was used to assess the expression levels of alleles with the allele-specific primers. LaEF1A1 was used as the internal control [21]. The qRT-PCR was performed with three biological replicates and the data are shown as mean ± SD. Statistical analysis was performed with SPSS 19.0 using analysis of variance.
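The text does not state how relative expression was calculated from the qRT-PCR data. The sketch below assumes the common 2^(−ΔCt) normalization against the internal control (here LaEF1A1) and summarizes three biological replicates as mean ± sample SD. The Ct values are invented, and the function names are hypothetical.

```python
from statistics import mean, stdev

def relative_expression(ct_target, ct_reference):
    """2^(-dCt) per replicate, where dCt = Ct(target allele) - Ct(reference gene).

    Assumption: this normalization is illustrative; the source does not name its method.
    """
    return [2.0 ** -(t - r) for t, r in zip(ct_target, ct_reference)]

def summarize(values):
    """Return (mean, sample standard deviation) of the replicate values."""
    return mean(values), stdev(values)

# Invented Ct values for one allele and the internal control in one tissue.
ct_allele = [24.0, 25.0, 26.0]
ct_ef1a = [20.0, 20.0, 20.0]

rel = relative_expression(ct_allele, ct_ef1a)
m, sd = summarize(rel)
print(f"{m:.4f} ± {sd:.4f}")
```

With real data, the per-tissue values for each allele would then be compared by analysis of variance, as described above.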
Notably, in the GCA repeat region, there were two single-nucleotide polymorphisms (SNPs) (G-A and C-T) (Figure 1, Tables 1 and 2). When G is changed to A, the codon CAG changes to CAA; this SNP is synonymous and does not change the amino acid. When C is changed to T, the codon CAG changes to the termination codon TAG, which terminates LaSCL6 translation prematurely and shortens the protein; this points to regulation of LaSCL6 expression at the genomic level [22].
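The effect of the two SNPs on the CAG codon can be checked with a small Python sketch. The codon table is deliberately restricted to the three codons discussed here (CAG and CAA encode glutamine; TAG is a stop codon), and the function name is hypothetical.

```python
# Only the codons discussed in the text; a full table is not needed for this check.
CODON_TABLE = {"CAG": "Gln", "CAA": "Gln", "TAG": "Stop"}

def classify_snp(codon: str, position: int, new_base: str):
    """Apply a single-base change at `position` (0-based) and classify its effect."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    if after == "Stop":
        kind = "nonsense"
    elif before == after:
        kind = "synonymous"
    else:
        kind = "missense"
    return mutated, kind

print(classify_snp("CAG", 2, "A"))  # G-A at the third base: ('CAA', 'synonymous')
print(classify_snp("CAG", 0, "T"))  # C-T at the first base: ('TAG', 'nonsense')
```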

SSRs Affect LaSCL6 Structure and Function
A protein adopts a defined three-dimensional structure once its primary structure folds. The three-dimensional structures of the 12 haplotypes of LaSCL6 were predicted, and the results showed that they differed in at least five places (Figure 2). The GCA and CCA repeat regions in LaSCL6 result in repeats of the CAG and CCA codons, which encode polyglutamine and polyproline tracts, respectively, and this might affect the structure and function of LaSCL6.

SSRs in exons can affect protein structure and thereby change protein function [24]. In human genes, tandem repeats of polyglutamine cause incurable neurodegenerative diseases [25,26]. In Arabidopsis thaliana, polyglutamine length in EARLY FLOWERING 3 has dramatic effects on flowering time and circadian clock-related phenotypes [24]. Polyproline affects the binding of profilin and may have consequences for the regulation of actin cytoskeletal dynamics in plant cells [27]. The SSRs in the exon of LaSCL6 change the primary structure of its protein, and this affects its three-dimensional structure, which might result in the functional diversification of LaSCL6; further work is needed to test this.

Parent-Offspring Relationships Can Be Analyzed by SSRs in LaSCL6
L. kaempferi is monoecious and mainly wind-pollinated [15]. Accurate information about parent-offspring relationships is important for Larix breeding programs [10,16]. Three mature trees were used to study the transmission of LaSCL6 from parents to offspring based on the haplotypes of the two SSR loci in LaSCL6.
Tree 3 was homozygous and had one haplotype, (GCA)7(CCA)7 (Table 3). The haplotypes of endosperms and embryos from 9 seeds were determined. All the haplotypes of the 9 endosperms and 66.7% (12/18) of the haplotypes of the 9 embryos were found in tree 3, while 33.3% (6/18) were not found in tree 3 (Table 3).

Table 3. Haplotypes of two simple sequence repeats in LaSCL6 of mother trees and their seeds.
Columns: Number, Endosperm, Embryo.
The haplotypes of embryos different from that of their mother trees are shown in bold. The numbers in parentheses refer to the occurrence frequencies of the haplotypes in the sequenced clones.
In the seeds of L. kaempferi, the embryo (2n) is diploid and its two haplotypes come from the female and male parents; the endosperm (n) is haploid, and its haplotype comes from the mother tree and is the same as one haplotype of the embryo. Here, we took LaSCL6 as a case study and found that the haplotype of an endosperm was the same as one haplotype of the mother tree and one of the embryo. For most seeds, the two haplotypes of an embryo were the same as those of the female parent (Table 3), indicating that the female and male parents shared the same haplotype. For some seeds, one haplotype of an embryo differed from those of the female parent (Table 3), indicating that it came from a male parent with a genetic makeup different from that of the female parent. Notably, the same haplotype occurred in female and male parents, suggesting that these two SSR loci in LaSCL6 make no contribution to self-incompatibility.
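The inference described above can be sketched in Python: because the haploid endosperm carries the maternal haplotype, the paternal haplotype is whatever remains of the embryo's pair once the endosperm haplotype is removed. The function names and the example embryo are hypothetical; only the homozygous (GCA)7(CCA)7 mother tree is taken from the text.

```python
def paternal_haplotype(embryo, endosperm):
    """Return the pollen-derived haplotype of a seed.

    embryo:    the two haplotypes detected in the diploid embryo
    endosperm: the single (maternal) haplotype of the haploid endosperm
    """
    remaining = list(embryo)
    if endosperm not in remaining:
        raise ValueError("endosperm haplotype is not present in the embryo")
    remaining.remove(endosperm)  # drop exactly one maternal copy
    return remaining[0]

def differs_from_mother(embryo, endosperm, mother):
    """True if the paternal haplotype is absent from the mother tree's haplotypes."""
    return paternal_haplotype(embryo, endosperm) not in mother

# Mother tree homozygous for (GCA)7(CCA)7; the embryo genotype is invented.
mother = {"(GCA)7(CCA)7"}
embryo = ("(GCA)7(CCA)7", "(GCA)4(CCA)8")
print(paternal_haplotype(embryo, "(GCA)7(CCA)7"))
print(differs_from_mother(embryo, "(GCA)7(CCA)7", mother))
```

A paternal haplotype that matches the mother tree is uninformative on its own, since it cannot distinguish self-pollination from pollen of another tree carrying the same haplotype.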

LaSCL6 Alleles Show the Same Expression Patterns
Allelic expression imbalance has been studied in several plant species, and sometimes it can result in a phenotypic change [5–7,28]. To determine whether it occurs in LaSCL6, the expression patterns of five LaSCL6 alleles in three heterozygous L. kaempferi seedlings were analyzed by qRT-PCR with allele-specific primers (Table 4).

Table 4. Primers for allele-specific quantitative reverse transcription-polymerase chain reaction.
The haplotypes of seedling 2 were (GCA)4(CCA)8 and (GCA)6(CCA)8. Almost the same expression patterns for the two LaSCL6 alleles were detected with the (GCA)4 and (GCA)6 primers (Figure 3b, Table 4). The haplotypes of seedling 3 were (GCA)4(CCA)7 and (GCA)7(CCA)7, and almost the same expression patterns for the two LaSCL6 alleles were detected with the (GCA)4 and (GCA)7 primers (Figure 3c, Table 4). Taken together, all five alleles were strongly expressed in stems and weakly expressed in roots, and they showed almost the same expression patterns and balanced expression during vegetative development.

Conclusions
In summary, the SSRs in LaSCL6 show high levels of polymorphism and might affect the protein structure and function. Based on an analysis of the haplotypes of LaSCL6 in mother trees and their offspring, the parent-offspring relationships were determined. These results provide a candidate marker for genetic studies and breeding of L. kaempferi.
Altogether, we have obtained new information on LaSCL6, especially its regulation by microRNA171 and its genomic structure. For future studies, constructing the LaSCL6 network by mining its interacting proteins and target genes will help to reveal the molecular mechanisms underlying the growth and development of L. kaempferi.