Overall, the identification of genetic variation in the carbohydrate biosynthesis pathway of soybean presents a good opportunity for optimizing meal composition for soybean consumers. Several genes in the carbohydrate biosynthetic pathway have previously been used to reduce raffinose and stachyose levels, and there are related genes whose functions are unexplored. The soybean genome contains four genes encoding raffinose synthase enzymes.
RAFFINOSE SYNTHASE2 (
RS2),
RAFFINOSE SYNTHASE3 (
RS3), and
RAFFINOSE SYNTHASE4 (
RS4) share 60% amino acid identity with the
Arabidopsis raffinose synthase enzyme, and soybean
RAFFINOSE SYNTHASE1 (
RS1) is less similar [
6]. A naturally-occurring polymorphism in the
RAFFINOSE SYNTHASE2 gene was discovered in the soybean accession PI 200508: a deletion of three nucleotides relative to the reference sequence results in the loss of tryptophan 331 (W331-) in the protein sequence and increases the ratio of sucrose to RFOs in the seed [
6]. Reverse genetics approaches also previously identified a missense mutation (T107I) in the
RS2 gene that reduces levels of raffinose and stachyose and increases sucrose levels [
7]. Both of these
rs2 mutants have reduced levels of raffinose (0.1–0.2% raffinose as a percentage of total seed carbohydrates) relative to lines wild type for
RS2 (0.8–1% raffinose) in replicated field tests [
8]. Stachyose levels are also reduced in the
rs2 single mutant (0.5–2% of total seed carbohydrates) compared with 4% of total seed carbohydrates in wild type lines, and this reduction is accompanied by moderate increases in total sucrose levels [
8,
9]. A biotechnological approach silencing the
RS2 gene in soybean seeds also achieved significant reductions in seed RFOs and an increase in sucrose [
4]. Mutation in
RAFFINOSE SYNTHASE3 has been shown to reduce RFOs further when combined with a non-functional allele of
rs2, resulting in ultra-low levels of raffinose and stachyose of less than 2% as a fraction of total seed carbohydrates [
8,
9,
10,
11]. While the decrease in RFO content is desirable for meal digestibility, the accompanying shift in carbohydrate partitioning toward sucrose, which contributes to the metabolizable energy in soybean meal, is an added advantage [
12,
13]. RFOs have been demonstrated to have a role in desiccation and cold tolerance in plants and may contribute to seed vigor [
14,
15,
16,
17,
18,
19]. However, studies have shown that soybean
rs2 and
rs3 mutants germinate and emerge with normal efficiency [
11,
20]. Overall, variation for RFO content is limited in available soybean germplasm [
21], and it remains an open question in the field of soybean improvement what impact a null mutation in
RAFFINOSE SYNTHASE could have on seed carbohydrate levels, and which genetic combination most effectively reduces RFOs across growing environments to improve meal traits.
The
Cel I endonuclease-based TILLING (Targeting Induced Local Lesions in Genomes) approach has previously been successfully implemented in soybean and resulted in the identification of mutants in seed composition, disease signaling, and other important agronomic traits [
7,
22,
23,
24,
25]. As a reverse-genetic method, TILLING makes it possible to identify mutants without prior knowledge of the phenotype or the degree of phenotypic severity. A further benefit of TILLING is the capability to identify mutations that may be lethal or result in reduced viability in the homozygous state. Another advantage over phenotypic screens, particularly for the soybean genome, where most genes have two highly similar homeologs that are both functional, is the ability to identify mutations in genes whose deleterious effects can be masked by the presence of another gene that compensates for the loss of function. Additionally, the resulting mutations are non-transgenic variation and can be used in both conventional and transgenic breeding programs. However, there are several significant challenges. First, each gene differs in its coding sequence and GC content and, in turn, in the probability that a single base pair change creates an amino acid change or nonsense mutation. In many cases, DNA point mutations are silent, creating synonymous changes that are unlikely to be deleterious to the protein. Another challenge is cost: while sequencing costs per base have declined precipitously in recent years, the effort of maintaining large populations and extracting DNA from them, library construction, barcoding, and high-fidelity amplification, as well as the technical effort to perform these techniques, remain a cost impediment. Finally, soybean has undergone a relatively recent genome duplication, and homologous genes are, in some cases, identical at the coding level, presenting a challenge in detecting mutations in one specific member of the homologous pair [
26]. In the case of genes involved in carbohydrate biosynthesis, a phenotypic screen of a large population is tedious and costly and requires HPLC (high performance liquid chromatography) analysis of individual samples. In the case of the
RAFFINOSE SYNTHASE gene family, it is known that
RS2 and
RS3 both contribute to carbohydrate partitioning. We sought to obtain additional and more severe mutant alleles to further reduce raffinose and stachyose levels in soybean seeds. We pursued a reverse genetic strategy utilizing high throughput sequencing (TILLING-by-Sequencing, TbyS [
27]) to identify additional loss-of-function mutations in the
RS2 and
RS3 genes. To reduce costs, we constructed only one-dimensional pools, which limited the number of libraries. After sequence analysis, we selected mutations that we expected to be deleterious. As a secondary screen, we designed PCR-based SNP (Single Nucleotide Polymorphism) markers to confirm the polymorphisms and determine which pool and subpool contained the mutant individual. This provided an alternative method of confirming the polymorphism, as well as a usable marker for downstream breeding approaches.
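The point made above, that the chance of a single base pair change producing an amino acid change or a nonsense mutation depends on the coding sequence itself, can be illustrated with a short enumeration. The following Python sketch is not the analysis pipeline used in this work; it simply tallies, for a given in-frame coding sequence, how many possible single-base substitutions would be silent, missense, or nonsense under the standard genetic code. The function name `classify_point_mutations`, the option `gc_to_at_only`, and the example sequence are hypothetical and introduced only for illustration. Restricting the enumeration to G-to-A and C-to-T changes (the class typically associated with alkylating mutagens such as EMS) is an assumption; the text does not specify the mutagen here.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption for illustration: G>A and C>T as seen on the coding strand.
GC_TO_AT = {"G": "A", "C": "T"}


def classify_point_mutations(cds, gc_to_at_only=False):
    """Count silent, missense, and nonsense outcomes over all
    single-base substitutions in an in-frame coding sequence."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")

    counts = Counter()
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        original = CODON_TABLE[codon]
        if original == "*":
            continue  # skip stop codons; only sense codons are classified
        for pos, ref in enumerate(codon):
            if gc_to_at_only:
                alts = [GC_TO_AT[ref]] if ref in GC_TO_AT else []
            else:
                alts = [b for b in BASES if b != ref]
            for alt in alts:
                mutant = codon[:pos] + alt + codon[pos + 1:]
                new = CODON_TABLE[mutant]
                if new == original:
                    counts["silent"] += 1
                elif new == "*":
                    counts["nonsense"] += 1
                else:
                    counts["missense"] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical three-codon fragment (Met-Trp-Gly), not a real RS2/RS3 sequence.
    print(classify_point_mutations("ATGTGGGGC", gc_to_at_only=True))
```

Running the same function over the full coding sequences of two candidate genes would give a rough, sequence-based comparison of how many mutable sites could yield a nonsense allele in each, which is one way to anticipate how large a population must be screened.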