Phenotypic Examination of Camelina sativa (L.) Crantz Accessions from the USDA-ARS National Genetics Resource Program

Camelina sativa (L.) Crantz is a hardy, self-pollinated oilseed plant in the Brassicaceae family. It was widely grown throughout the northern hemisphere until the 1940s for production of vegetable oil but was later displaced by higher-yielding rapeseed and sunflower crops. However, interest in camelina as an alternative oil source has been renewed because of its high oil content, which is rich in polyunsaturated fatty acids and antioxidants, and its ability to grow on marginal lands with minimal requirements. For this reason, our group screened the existing (2011) National Genetic Resources Program (NGRP) center collection of camelina for its genetic diversity and provides a phenotypic evaluation of the cultivars available. Properties evaluated include seed and oil traits, developmental and mature morphologies, and chromosome content. Selectable marker genes were also evaluated for potential use in biotechnological manipulation. Data are provided in a raw, uncompiled format to allow other researchers to analyze the unbiased information for their own studies. Our evaluation determined that the NGRP collection has a wide range of genetic potential for both breeding and biotechnological manipulation. Accessions were identified within the NGRP collection that appear to have desirable seed harvest weight (5.06 g/plant) and oil content (44.1%). Other cultivars were identified as having fatty acid characteristics that may be suitable for meal and/or food use, such as low (<2%) erucic acid content, which is often considered desirable for healthy consumption; erucic acid content ranged from a high of 4.79% to a low of 1.83%. Descriptive statistics are provided for a breadth of traits from 41 accessions, along with raw data, and key seed traits are further explored. The data presented are available for public use.


Introduction
Renewable energy sources, including esterified vegetable oil (i.e., biodiesel), have been proposed as a possible option to reduce greenhouse gas (GHG) emissions in the transportation sector. Oilseeds currently in wide use for biofuel production include rapeseed, sunflower (Europe), soybean (USA), and palm (tropical regions). In comparison to these other oilseed plants, Camelina sativa (L.) Crantz (camelina) PI 650142, PI 650157, PI 650158) were discovered to have mixed seed from two distinct cultivars. These lines were therefore given designations of 'A' and 'B' types of the originating name. The NGRP collection cultivar PI 304268 appeared to require vernalization conditions that were not met in this study; as a result, the plant remained in the rosette stage, failing to flower and provide seed. This line was therefore not included in the study, giving a total of 41 accessions reviewed.

Plant Description
Camelina is a self-pollinated plant with small, pale yellow, 4-lobed flowers. The tear-shaped fruits, or siliques, appear similar to flax bolls and contain approximately 15-20 small seeds with a high oil content, making the plant desirable for potential commercialization. Throughout the study, phenotypic descriptions and growth stages of Camelina sativa are subtitled in the text according to the two-digit BBCH scale [67].

Early Development
Principal growth stage 0: seed germination and early development. 01: Initiation of seed imbibition. All viable seed of every cultivar formed a mucosal/gelatinous coat within the first 5 min of imbibition. 03: Radicle emergence from seed. All cultivars showed radicle emergence in 1 to 2 days, with a median of 1 day (Table 1). 04: Emergence of hypocotyl with cotyledons from the seed. Shoot emergence was seen from 1.8 to 3 days after sowing on damp soil, with a median of 2 days (Table 1). At day 4, cultivars were measured for root length, hypocotyl length, and percent cotyledon unfolding. Root length ranged from 5.4 mm to 40.4 mm with a median value of 25.9 mm (Table 1). Hypocotyl length ranged from 2.2 mm to 10.8 mm with a median value of 7.8 mm (Table 1). 10: Cotyledons (node 0) unfolded. The rate of cotyledon unfolding was measured at day 4 and ranged from 0 to 100% with a median value of 72%. Vernalization was required for 9 of the 41 cultivars, and these are indicated by asterisks at the end of their accession numbers (Table 1).
Table footnotes: * Indicates accessions that required vernalization in order to flower, also described as 'winter' accessions. ** A sample size (n) of 3-12 was used per cultivar. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes. a Defined as the requirement to keep seed for 8 weeks at 4 °C before being able to bolt or flower. b Defined as the time it takes for the root tip to emerge after sowing seeds on wet filter paper. c Defined as the time it takes for the shoot to emerge from the seed coat. d Root length measured 4 days after sowing seeds on wet filter paper. e Hypocotyl length measured 4 days after sowing seeds on wet filter paper. f Defined as the percentage of plants that have cotyledons unfurled by more than 50% within 4 days after sowing.

Physical Attributes
Principal growth stage 1: leaf development. Leaf morphology was observed in three basic shapes, with lanceolate (75.6%) being the most predominant, followed by subulate (17.1%) and linear (7.3%) (Figure 1, Table 2). Leaf margin morphology (shape of the edge) also showed three basic forms, with spiny (41.5%) most often observed. Serrate (29.3%) was the second most common, followed by smooth (26.8%). Examples of shapes are seen in Figure 1, and each accession's combination of leaf shape, edge shape, and spine number (points) is listed in Table 2. Like most of the Brassicaceae, camelina develops lateral branches. The development of lateral branches is variable and depends on the genotype. Plant density and environmental conditions may also affect the number of branches [67]. Here we present findings of lateral branch development under low plant density and favorable growth (greenhouse) conditions.
Table footnotes: See Supplemental Table S2 for specific values. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes. a Spiny margin, defined as having a series of sharp, stiff points; serrated margin, defined as having a series of wave-like, forward-pointed teeth around the entire leaf edge; smooth margin, defined as having no projections around the outside of the leaf. b Points are defined as short, needle-like projections or spines from the edge of the leaf. '4 + 4' is defined as points on the 'right' side of the leaf + points on the 'left' side of the leaf. c W, Top heavy: branching often seen with secondary branching; X, Branched: branching along the length of the main stem with few secondary branches; Y, Tiller: all branches originate from the base and contain few secondary branches; Z, Chaos: branches are seen originating everywhere and a dominant stem was not observed.
Lateral branch development appears to have four unique patterns. Top-heavy branching (W, 75.6%), often seen with secondary branching, is by far the most common, followed by branching along the length of the main stem (X, 14.6%) with few secondary branches, Tiller (Y, 4.9%), where all branches originate from the base with few secondary branches, and finally Chaos (Z, 4.9%), where branches originate everywhere without a dominant stem (Figure 2, Table 2). Interestingly, four cultivars (PI 311736, PI 650143, PI 650152, and PI 650167) appeared to keep the rosette throughout their lifecycle under greenhouse conditions. All appear to be winter cultivars. The cultivars in this study showed a range of bolting times from 18 to 35 days, with an average of 23 days (Table 3). The number of leaves present on the primary bolting stem at the time of flowering ranged from 18 to 41 (Table 3). The mean height and width (Table 3) at the time of initial flowering were 45 cm and 22 cm, respectively. The primary inflorescence contained a minimum of 9 and a maximum of 18 florets, with a median value of 12, while the secondary inflorescence contained an average of 4 florets (Table 3).
Table footnotes: * Indicates accessions that required vernalization in order to flower, also described as 'winter' accessions. ** A sample size (n) of 3-12 was used per cultivar. See Supplemental Table S2 for specific values. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes.
Maturity of the plant was defined as the time when the plant is in full flower, has reached its maximum height, and seed set has begun. The earliest accessions to reach maturity were PI 650164 and PI 650165 at 42 days, and the latest was PI 650143 at 63 days. A median maturity value of 50 days was observed (Table 4). 39: Maximum stem length. Accessions reached maximum height at maturity, ranging from 64.9 cm (PI 650142A) to 88.3 cm (PI 650151) with a median value of 78.3 cm (Table 4). Plant width at maturity ranged from 11.4 cm (PI 650153) to 61.0 cm (PI 650168) with a median value of 36.6 cm (Table 4). Principal growth stage 9: senescence. 97: Plant dead and dry. Plants were dry and seeds ready to harvest at a minimum of 52 days (PI 304271, PI 650164, PI 650165) and a maximum of 72 days (PI 650145, PI 650146), with a mean value of 63 days (Table 4).

Seed Analysis
Seed characteristics were assessed, and the mean seed weight ranged from 0.19 to 1.05 mg per seed (Table 5). Mean total seed weight per plant ranged from a minimum of 1.45 g (PI 304270) to a maximum of 5.06 g (PI 311735), with a median value of 2.6 g (Table 5). The estimated total seed number per plant ranged from 1604 (PI 650153) to 9225 (PI 650143), with a median value of 3328 (Table 5). While the number of seeds and the total weight of seed per plant did not appear to correlate, the 'winter' cultivars in general produced greater than average amounts of seed. Mean seed width ranged from a minimum of 0.65 mm to a maximum of 1.06 mm (Table 5). Mean seed length ranged from 1.20 to 2.12 mm with an average of 1.80 mm (Table 5). The length-to-width ratio was also compiled and ranged from 1.74 to 2.62 with a median value of 2.04 (Table 5).
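The seed-number estimate reported in Table 5 is described in the table notes as mean total seed weight divided by mean seed weight. A minimal sketch of that arithmetic is given below; the function names are hypothetical, and the 2.6 g and 0.8 mg inputs in the usage lines are illustrative values only, not measurements paired from any single accession.

```python
def seed_weight_mg_from_tsw(tsw_g: float) -> float:
    """Mean weight of one seed (mg) from a thousand seed weight (g)."""
    # grams -> milligrams, then divide by the 1000 seeds that were weighed
    return tsw_g * 1000.0 / 1000.0


def estimate_seed_count(total_seed_weight_g: float, mean_seed_weight_mg: float) -> int:
    """Estimated seeds per plant: total seed weight / mean single-seed weight."""
    if mean_seed_weight_mg <= 0:
        raise ValueError("mean seed weight must be positive")
    # convert the per-plant total to mg so both quantities share a unit
    return round(total_seed_weight_g * 1000.0 / mean_seed_weight_mg)


if __name__ == "__main__":
    # Illustrative inputs only (hypothetical pairing of values)
    print(estimate_seed_count(2.6, 0.8))
    # The largest reported TSW (1.0473 g) expressed per seed, in mg
    print(round(seed_weight_mg_from_tsw(1.0473), 2))
```

Keeping the unit conversion explicit makes it harder to mix grams per plant with milligrams per seed, which is the most likely source of error in this calculation.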

Seed Oil Biochemical Data
Biochemical analysis of the oil collected shows varying degrees of difference among the 41 accessions of Camelina sativa (L.) Crantz (Table 6). Oil content (OC) ranged from 23.6% in PI 650152 to 44.1% in PI 650155 (Table 6). Fatty acid components included saturated, monounsaturated, and polyunsaturated fatty acids. The most prevalent saturated fatty acid was palmitic acid, ranging from 5.5% to 9.5% (Table 6). The most abundant monounsaturated fatty acids were oleic acid (C18:1), ranging from 9.1% to 17.1%, and gondoic acid (C20:1), ranging from 10.5% to 16.4% (Table 6). The most abundant polyunsaturated fatty acids were linoleic acid (C18:2), ranging from 16.1% to 28.6%, and linolenic acid (C18:3), ranging from 23.5% to 36.2% (Table 6). Thousand seed weight (TSW) varied from 0.2006 g in PI 650167 to 1.0473 g in PI 650153 (Table 6). See Supplemental Table S1 for a complete fatty acid profile of the raw data obtained. To further explore the oilseed yield traits of TSW and total seed weight, one-way ANOVAs were performed to compare accessions. Significant differences between accessions were observed for both TSW (ANOVA, p < 2.2 × 10⁻¹⁶) and total seed weight (ANOVA, p < 2.2 × 10⁻¹⁶). TukeyHSD post hoc analyses were performed for all pairwise comparisons, with results included in Supplemental Table S3 (TSW statistics) and Supplemental Table S4 (total seed weight statistics).
Table footnotes: See Supplemental Table S2 for specific values. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes. a Defined as the average value of 1000 seed total weight divided by 1000 from 10 plants. b Defined as the average total weight of seeds from 10 plants. c Defined as mean total weight divided by the mean seed weight.
Table footnotes: See Supplemental Tables S1-S4 for specific values. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes. a OC, oil content. b TSW, thousand seed weight.

Genetic Analysis
The camelina accessions were examined for chromosome number. It was determined that 75.6% of the lines contained the expected 2n = 40, while unexpectedly 24.3% contained 2n = 38. One accession, PI 650152, has 2n = 26. Individual lines were also examined for their COT values to provide a unique identifying fingerprint for each cultivar (Table 7). COT analysis is a technique that measures DNA reassociation kinetics and gives a measure of repetitive DNA content per genome [68]. This analysis was used, independently of chromosome counts, to verify the predicted A/B split in the 4 lines that showed observable differences in phenotype. Table 8 summarizes the previous tables for the A/B accessions specifically. Results show that each A/B accession has a unique COT value, chromosome number, vernalization requirement, branch pattern, leaf shape, leaf margin, flowering time, height, and number of days to maturity.
Table footnotes: * Indicates accessions that required vernalization in order to flower, also described as 'winter' accessions. ** A sample size (n) of 3-12 was used per cultivar. See Supplemental Table S2 for specific values. A and B designations were given to accessions obtained from the NGRP center that contained more than two unique phenotypes. a COT value is defined by DNA reassociation kinetics and gives a measure of repetitive DNA content per genome. b Sampling size ranged from 3 to 33 to determine chromosome counts. See Supplemental Table S2 for details. c Defined as the requirement to keep seed for 8 weeks at 4 °C before being able to bolt or flower.

Advances in Biotechnology
Biotechnological enhancements have been achieved to complement the traditional cultivar improvement efforts of breeding and mutagenesis. To add to and improve upon these biotechnology efforts, our lab investigated the use of positive as well as negative selection marker genes to facilitate techniques such as RMCE and GAANTRY [56,57] for metabolic engineering.
We chose to investigate the positive marker genes bar [41], hptII [69], nptII [70] and sulI [71] for use in camelina selection, as they have all been shown to work in Arabidopsis. The optimal range of selective agents to inhibit camelina growth was determined for each (Supplemental Figures S1-S5). The negative selectable marker gene, codA, was also investigated for monitoring DNA excision events [72,73]. The codA gene required that a kill curve be determined for both 5-fluorocytosine (5FC) and 5-fluorouracil (5FU) (Supplemental Figure S6). In the presence of a functional codA gene, the nontoxic 5FC is converted to its toxic form, 5FU. The toxin 5FU is a DNA chain-terminating compound that will stunt or kill germinating seed at very low concentrations.
Employing the Lu and Kang [41] method, a series of binary vectors was transformed into the camelina cultivar Suneson (Supplemental Figure S1). These vectors were designed with a positive/negative marker cassette flanked by recombinase recognition sites for use as RMCE founder lines [56] capable of gene stacking. During the hygromycin (hptII) selection trial, 13 transformed lines were recovered. All lines contained a high T-DNA copy number (4 or more), as determined by Southern blot analysis (SBA) (Figure 3A). Plants under selection appeared sickly and were difficult to distinguish from background (wild-type) growth. However, plants did recover once transferred to soil. Glufosinate (bar) selection identified 14 plants, which included both low- and multiple-copy lines by SBA (Figure 3A). Identified plants appeared healthy under selection. Sulfadiazine (sulI) selection yielded one line from initial trials; it was determined to be a 2-copy T-DNA insertion event by SBA, and the plant appeared healthy (Figure 3A). The final positive selection marker tested was nptII. While a kill curve range was determined for the antibiotics kanamycin and G418 (Geneticin) (Supplemental Figure S2), use of the nptII selection gene, driven by the double enhanced 35S promoter (pCTAG-GCN), produced only a single resistant transgenic camelina plant from ~20,000 seeds screened.
To determine whether this was a failure of the marker or a poor rate of transformation, a second binary vector previously used for transformation, pCTAGV-KCN3 [72], was used. Seeds were split between kanamycin selection and DsRed (visual) selection of germinating seedlings. The fluorescent DsRed selection marker under a constitutive promoter has previously been used to identify transformed seed based on fluorescence [41]. Five plants were obtained using visual DsRed selection, while zero plants were obtained from kanamycin selection of ~1000 plated seeds.
For codA negative selection testing, T2 seeds from the aforementioned hygromycin, sulfadiazine and glufosinate positive selection studies were germinated in the presence of 500 mg L⁻¹ 5FC. Results can be seen in Figure 3B,C and indicate that, in the presence of a functional codA gene and the selective agent 5FC, camelina plants grow yellow and stunted compared with a null segregating sibling from the same transformation event, which appears green and robust.
To determine whether any of the 41 lines investigated were viable for biotechnological manipulation, we chose the line PI 311735 (due to its large seed and high oil content) for transformation using the pCTAG-GBC binary vector and glufosinate (bar) selection. Rates of transformation were similar to those of the control accession Suneson for production of transgenic plants through the floral dip method described [41]. While the method was not highly efficient, we were able to obtain transgenic camelina plants with an efficiency of 0.6% using bar selection. PCR was used to provide molecular verification of transgenic camelina obtained from the PI 311735 and Suneson lines (Supplemental Figure S7).
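The selection results above can be compared on a common scale by expressing each as recovered transgenic plants per 100 seeds screened. The sketch below assumes that this is how the reported efficiency is defined (the text does not state the formula explicitly); the function names are hypothetical.

```python
import math


def transformation_efficiency_pct(n_transgenic: int, n_seeds_screened: int) -> float:
    """Recovered transgenic plants as a percentage of seeds screened."""
    if n_seeds_screened <= 0:
        raise ValueError("number of seeds screened must be positive")
    return 100.0 * n_transgenic / n_seeds_screened


def seeds_needed_for_one_event(efficiency_pct: float) -> int:
    """Seeds to screen to expect one transformant at a given efficiency."""
    return math.ceil(100.0 / efficiency_pct)


if __name__ == "__main__":
    # nptII with kanamycin/G418: one resistant plant from ~20,000 seeds
    print(transformation_efficiency_pct(1, 20000))
    # Kanamycin selection with pCTAGV-KCN3: zero plants from ~1000 seeds
    print(transformation_efficiency_pct(0, 1000))
    # At the 0.6% efficiency reported for bar selection
    print(seeds_needed_for_one_event(0.6))
```

Because the seed counts in the text are approximate (~20,000 and ~1000), the resulting percentages are order-of-magnitude figures and not precise rates.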

Discussion
The genus Camelina is composed of 11 species [74], but as of 2011, when seeds for this study were obtained, only five species (C. sativa, C. microcarpa, C. rumelica, C. alyssum, and C. hispida) were present in the germplasm of the IPK (Institute of Plant Genetics and Crop Plant Research, Gatersleben, Germany) and the USDA-NGRP repositories. Among them, only C. sativa and C. microcarpa are cultivated. Within C. sativa, three different subspecies, ssp. pilosa, ssp. sativa and ssp. foetida, have been described [75].
Chromosome analysis of the 41 accessions (Table 7) displayed an interesting set of results. First, a single accession, PI 650152, was found to have n = 13 and may have been misclassified, although it displays phenotypic characteristics similar to Camelina sativa (L.) Crantz. Next, a split in chromosome number was observed: 75.6% (31/41) of the lines have the predicted n = 20 chromosome number, while 24.3% (10/41) have n = 19. Previous results indicate that Camelina sativa (L.) Crantz is an allohexaploid plant, and in 2006 the Snowdon lab [50] demonstrated, using a linkage map of 157 AFLP markers and 3 Brassica SSR markers, that the chromosome number of camelina was n = 20. These results were confirmed when the genome was sequenced and a genome size of 750 Mbp was reported [49,76] for two different cultivars (the accessions used were unpublished). It appears that triplication of the camelina genome occurred through whole genome duplication by either autopolyploidization or allopolyploidization. An autopolyploidy event triplicating a single diploid genome would result in an autohexaploid with a haploid genome of n = 18, 21, or 24 chromosomes, depending on a starting genome chromosome count of n = 6, 7, or 8, respectively. However, camelina has reported chromosome counts of n = 6, 7, 10, 13, 18, 19, and 20 [76,77].
Based on previous reports, n = 20 appears to be the most common value and agrees with the recent AFLP marker linkage map and sequencing data. However, an n = 20 chromosome count would be difficult (although not impossible) to achieve through a single event triplicating a diploid genome. Triplication of the camelina genome through two allopolyploidy events, first producing a tetraploid and then a second polyploid mating producing a hexaploid, similar to the origin of cultivated 6-row wheat, is more likely. Previous research indicates that hybridization via outcrossing to related species is possible within the Brassicaceae family [78,79]. Taking into consideration the reported chromosome counts of various camelina-related species, an initial allopolyploid hybrid cross between two diploid parental species, one with n = 6 and the other with n = 7, could produce a tetraploid genome with 13 chromosomes and would explain cultivar PI 650152. This is possible considering that the related species C. laxa and Camelina spp. have n = 6, C. hispida has n = 7, and C. rumelica has n = 13 chromosome counts [80]. Following this logic, a second allopolyploid hybridization event producing hexaploid progeny could be achieved by mating the tetraploid progeny (n = 13) with either of the two starting parental lines, where a 13 + 6 cross would result in n = 19 and a 13 + 7 cross would result in n = 20. These hypothetical crosses would explain the varied chromosome counts documented in numerous camelina publications. The hypothetical crossing scenario is further supported by a recent publication [81] in which the genetic diversity of the camelina genus was assessed across 54 accessions representing five species through RADseq, ITS sequencing, and flow cytometry. Results of that investigation imply that an (n = 6 + 7 + 7) hybridization is possible.
The allopolyploid hypothesis is also supported by the observation that C. sativa demonstrates diploid inheritance [41,48], as would be expected for an allopolyploid [82]. A hexaploid C. sativa could also be derived from the combination of an autotetraploid and a diploid species if, in the autopolyploid genome, homologous chromosomes differentiated so that the subsequent chromosome-specific pairing mimicked an allopolyploid genome in its diploid inheritance patterns [82]. Regardless of its evolutionary path, the C. sativa genome appears to be organized in three redundant and differentiated copies and can be formally considered an allohexaploid. Our results support previous research and add that two hexaploid combinations exist within the known NGRP accessions: n = 6 + 7 + 7 (20) and n = 6 + 7 + 6 (19).
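The chromosome arithmetic behind the two scenarios discussed above can be written out explicitly. The short sketch below simply adds or triples the haploid counts given in the text; the function names are hypothetical, and the code makes no claim about which evolutionary path actually occurred.

```python
def autopolyploid_triplication(base_n: int) -> int:
    """Haploid count of an autohexaploid formed by triplicating one genome."""
    return 3 * base_n


def allopolyploid_cross(*parent_n: int) -> int:
    """Haploid count of an allopolyploid combining the given parental genomes."""
    return sum(parent_n)


if __name__ == "__main__":
    # Single-genome triplication from n = 6, 7, or 8
    print([autopolyploid_triplication(n) for n in (6, 7, 8)])

    # First hypothetical cross: n = 6 parent with n = 7 parent
    tetraploid = allopolyploid_cross(6, 7)
    print(tetraploid)

    # Second hypothetical cross: tetraploid with either starting parent
    print(allopolyploid_cross(tetraploid, 6), allopolyploid_cross(tetraploid, 7))
```

Triplication of a single genome yields only multiples of three (18, 21, 24), so n = 19 and n = 20 arise naturally only in the two-step crossing scenario.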
From an agronomic point of view, C. sativa ssp. sativa and C. sativa ssp. pilosa, sometimes termed 'winter' camelina, seem to be the most promising subspecies for abundant seed production: 8 of the 9 cultivars examined had an above-average seed count (4043 to 9225) when compared to the accessions as a whole (3328), see Table 7. These winter camelina subspecies are usually sown in autumn, since they require vernalization in order to attain stem elongation and subsequent flowering, while C. sativa ssp. foetida (also known as 'spring' camelina) does not require vernalization and can be sown in both autumn and spring. All cultivars examined had an early emergence phenotype (2 days on average) that appears characteristic of the species in general, see Table 1. Bolting (time to flowering) and time to dry are agronomic traits of importance, with the NGRP accessions ranging from 18 to 35 days and 52 to 72 days, respectively, see Tables 3 and 4. From the data observed, rapid maturing and drying accessions appear to be PI 650145 and PI 650146. These lines may offer an opportunity to cultivate rapid-cycling genotypes for double cropping.
Total seed weight for the various accessions ranged from a minimum of 1.45 g (PI 304270) to a maximum of 5.06 g (PI 311735). The oil content of dry seeds ranged between 23.6% (PI 650152) and 44.1% (PI 650155), and the oil consists of approximately 54% polyunsaturated, 34% monounsaturated, and 12% saturated fatty acids. The most abundant polyunsaturated fatty acids are linoleic acid (C18:2), ranging from 16.1% to 28.6%, and linolenic acid (C18:3), ranging from 23.5% to 36.2%. The monounsaturated erucic acid (C22:1) is of importance for feed, with a maximum value of <2% allowed. Only one line, PI 650140, falls within this limit at 1.83% (Supplemental Table S1) and potentially provides useful breeding stock for improving camelina as an animal feed. Taken as a whole, accession PI 311735 appears to provide an excellent set of agronomic traits for an oilseed crop. The cultivar has one of the earliest bolting times, at 18 days, and a better than average drying time of 60.8 days. However, its most outstanding characteristics are its large total seed weight of 5.06 g per plant and its above-average oil content, measured at 38.2%. Accession PI 311735 was further tested for potential genetic manipulation through the Agrobacterium 'floral dip' transformation technique in an attempt to produce a transgenic plant. Our lab successfully produced six transgenic plants, as verified by seedling selection on glufosinate and confirmation by PCR. These results show that this accession can undergo genetic modification.
From our examination, the selectable marker bar (glufosinate) still appears to be the best selection system available for camelina genome modification. Our research also indicates that sulI (sulfadiazine) may be a viable option, although a more thorough study will be needed to validate this claim. The negative selectable marker gene codA was successful in inhibiting seedling growth in the presence of 5FC. Seedlings could even be rescued from the 5FC selection plate and grown to produce viable plants (data not shown). This will provide a useful selection tool for monitoring DNA excision events in techniques such as CRISPR and RMCE.

Materials and Methods
Throughout the phenotypic descriptions, growth stages of Camelina sativa are subtitled in the text according to the two-digit BBCH scale [67]. In the present investigation, plants were grown and evaluated under non-crowded greenhouse conditions in the California Bay Area, and the results therefore may not perfectly reflect more stressful field conditions in other parts of the country. Table 9 lists the accessions used according to the NGRP center designation, along with countries of origin and other names associated with these cultivars.

Early Emergence Studies
After seeds were initially harvested, they were dried at 30 °C for one week and then weighed. To test the rate of germination, seeds were imbibed on sterile 3 mm Whatman paper saturated with purified MQ water. It was observed that all viable seed of every cultivar formed a mucosal/gelatinous coat within the first 5 min of being imbibed. Seeds were placed on growth racks at 24 °C and observed for radicle emergence. Vernalization was required for 9 of the 41 cultivars, and these are indicated by asterisks at the end of their accession numbers (Table 1). These 'winter' types remained at the rosette stage indefinitely under greenhouse conditions if not first cold-treated for the required vernalization period to induce bolting. Therefore, all accessions were sown on moist soil and kept in the dark at 4 °C for 14 days.

Greenhouse Phenotype Studies
Seeds were imbibed in water for 1 h prior to sowing in soil. For each accession, 5 seeds were sown per pot on the surface of damp soil treated with Gnatrol. Three to twelve pots were sown per accession, depending on seed viability. Once sown, all accessions were kept in the dark at 4 °C for 14 days prior to placement in the greenhouse. Greenhouse growth conditions consisted of cycles of 18 h light at 26 °C and 6 h dark at 24 °C. Whole plant measurements were taken every 3-4 days. Seeds were harvested when dry.

Seed Quality Traits
The weight of one thousand seeds (TSW, thousand seed weight) was determined by measuring three replicates of one thousand seeds each. Seeds were counted manually and weighed to the nearest 0.1 mg. Seed dimensions were measured manually to the nearest 0.01 mm.

Seed Quality Trait Statistics
ANOVA and TukeyHSD statistical analyses of oilseed yield traits were performed in R (version 3.6.3) using aov() and TukeyHSD() functions. Prior to analyses, accessions with mixed traits (designated with A/B) and/or missing values were removed for a balanced design with n = 3 measurements per accession. Residual plots and Shapiro-Wilk normality tests were done to inspect ANOVA assumptions of normality and homogeneity of variance, and a square root transformation was implemented for total seed weight.
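The analyses in this study were run in R with aov() and TukeyHSD(). As an illustration only, the sketch below computes the one-way ANOVA F statistic with the Python standard library, applying a square root transformation as described for total seed weight. It is not the authors' script: the accession labels and weights are hypothetical values, the balanced n = 3 layout mirrors the design described above, and no p-value or post hoc test is computed.

```python
import math
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of measurements per accession.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-group sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Hypothetical total seed weights (g), three measurements per accession
    total_seed_weight_g = {
        "accession_1": [1.4, 1.5, 1.6],
        "accession_2": [2.5, 2.6, 2.8],
        "accession_3": [4.9, 5.1, 5.2],
    }
    # Square root transformation applied before the ANOVA
    transformed = [[math.sqrt(x) for x in values]
                   for values in total_seed_weight_g.values()]
    f_stat, df_b, df_w = one_way_anova_f(transformed)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A p-value would require the F distribution, which a statistics package such as R provides directly; the sketch stops at the F statistic to stay dependency-free.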

Oil Extraction and Weight to Volume Determination
Oil content for Camelina seeds (sample size 0.4-0.6 g, weighed to four decimal places) was determined non-destructively using a Bruker seed analyzer (Fremont, CA, USA) calibrated for Camelina, with each determination done in triplicate. From each Camelina accession, three samples (0.5 g) of dry seeds were ground with hexane (0.1% BHA) in a glass homogenizer (1.5 mL g⁻¹ tissue) and poured into a 16 × 100 mm tube with a Teflon-lined screw cap. The solution was agitated for 30 min. The extract was then centrifuged in a desktop Dynac centrifuge (Becton, Dickinson and Company) for 10 min at 1000 rpm (140 × g) and the supernatant collected. The extraction procedure was repeated on the sediment. The hexane was evaporated under nitrogen and the residue dissolved in 1 mL of hexane (0.1% BHA) for further analysis.
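Two quantities in the extraction protocol follow from simple arithmetic: the hexane volume per sample, and the relationship between rotor speed and relative centrifugal force (RCF). The sketch below uses the standard conversion RCF = 1.118 × 10⁻⁵ × r × rpm², with r in cm. The rotor radius is not given in the text, so the value printed here is back-calculated from the stated 1000 rpm and 140 × g and should be read as an assumption; the function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def radius_cm_from_rcf(rcf: float, rpm: float) -> float:
    """Rotor radius (cm) implied by a stated RCF and speed."""
    return rcf / (RCF_CONSTANT * rpm ** 2)


def hexane_volume_ml(tissue_g: float, ml_per_g: float = 1.5) -> float:
    """Hexane volume for grinding at the stated 1.5 mL per g of tissue."""
    return tissue_g * ml_per_g


if __name__ == "__main__":
    # Radius implied by 1000 rpm = 140 x g (assumption, not stated in the text)
    radius = radius_cm_from_rcf(140.0, 1000.0)
    print(f"implied rotor radius: {radius:.1f} cm")
    # Hexane needed for one 0.5 g seed sample
    print(f"hexane per sample: {hexane_volume_ml(0.5)} mL")
```

Reporting RCF alongside rpm, as the protocol does, lets the step be reproduced on a centrifuge with a different rotor radius.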

Camelina Chromosome Squashes
For chromosome counts, 15 seeds of each accession were germinated on moist filter paper in a 27 °C growth chamber. Root tips were collected 5 days after germination and were pretreated with 0.05% colchicine in 2% (v/v) DMSO for 4 h [78]. Root tips were then fixed overnight in 3:1 ethanol/glacial acetic acid. Slide preparations were made by digesting root tips for 30 to 60 min with 0.05 g L⁻¹ Onozuka R-10 cellulase and 0.01 g L⁻¹ pectolyase Y-23 (Phytotechnology Labs) in 0.01 M citrate buffer, pH 4.8, prior to squashing. Digestion time varied according to the thickness and degree of lignification of the roots. Squashes were prepared according to Kirov et al. [83] and were mounted in VECTASHIELD (Vector Laboratories) antifade mounting medium with DAPI (4′,6-diamidino-2-phenylindole). Slide preparations were visualized under an Olympus BX51 fluorescence microscope.

COT Value Determination
The COT analysis is a technique that measures DNA reassociation kinetics and gives a measure of repetitive DNA content per genome [54]. The procedure used to analyze nuclear DNA content in plant cells was modified from [84]. Briefly, the procedure consists of preparing suspensions of intact nuclei by chopping 50 mg of plant tissue in MgSO₄ buffer, mixing with DNA standards, and staining with propidium iodide (PI) in a solution containing DNase-free RNase. Fluorescence intensities of the stained nuclei are measured by a flow cytometer. Values for nuclear DNA content are estimated by comparing fluorescence intensities of the nuclei of the test population with those of an appropriate internal DNA standard that is included with the tissue being tested. Nuclei from Arabidopsis thaliana (0.36 pg/2C) were used as the internal standard. The pellet is suspended by vortexing vigorously in 0.5 mL of a solution containing 10 mM MgSO₄·7H₂O, 50 mM KCl, 5 mM HEPES, pH 8.0, 3 mM dithiothreitol, 0.1 mg/mL propidium iodide, 1.5 mg/mL DNase-free RNase (Roche, Indianapolis, IN, USA), and 0.25% Triton X-100. The suspended nuclei are withdrawn using a pipettor, filtered through 30-µm nylon mesh, and incubated at 37 °C for 30 min before flow cytometric analysis. Suspensions of sample nuclei are spiked with a suspension of standard nuclei (prepared in the above solution) and analyzed with a FACSCalibur flow cytometer (Becton-Dickinson, San Jose, CA, USA). For each measurement, the propidium iodide fluorescence area signals (FL2-A) from 1000 nuclei are collected and analyzed by CellQuest software (Becton-Dickinson, San Jose, CA, USA) on a Macintosh computer. The mean positions of the G0/G1 nuclei peaks of the sample and the internal standard are determined by CellQuest software. The mean nuclear DNA content of each plant sample, measured in picograms, is based on 1000 scanned nuclei.
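
The comparison with the internal standard described above amounts to a simple ratio, which the following sketch makes explicit. It assumes fluorescence scales linearly with DNA content, which is the usual basis for this kind of estimate; the function name and the peak positions in the example are hypothetical, and only the 0.36 pg/2C value for Arabidopsis thaliana comes from the text.

```python
ARABIDOPSIS_PG_PER_2C = 0.36  # internal standard value stated in the text


def nuclear_dna_content_pg(sample_peak_mean, standard_peak_mean,
                           standard_pg=ARABIDOPSIS_PG_PER_2C):
    """Estimate sample 2C DNA content (pg) from G0/G1 peak means.

    Assumes propidium iodide fluorescence is proportional to DNA content,
    so the sample value is the standard value scaled by the peak ratio.
    """
    if sample_peak_mean <= 0 or standard_peak_mean <= 0:
        raise ValueError("peak means must be positive")
    return (sample_peak_mean / standard_peak_mean) * standard_pg


# Hypothetical FL2-A peak means, for illustration only.
example_pg = nuclear_dna_content_pg(sample_peak_mean=400.0, standard_peak_mean=100.0)
print(f"Estimated nuclear DNA content: {example_pg:.2f} pg/2C")
```

Because the standard is run in the same tube as the sample, instrument drift affects both peaks equally and cancels in the ratio.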

Plant Transformation
The floral dip protocol was modified from Lu and Kang [41]. From a single colony of Agrobacterium, grow a 3 mL starter culture overnight at 28 °C. Inoculate a 300 mL large-scale culture with the starter culture and grow overnight at 28 °C with agitation. Collect the cells by centrifugation, then suspend them in transformation medium (0.5X MS salts, 1X Gamborg vitamins, 50 g/L sucrose, 0.01 mg/L BAP, 20 mg/L acetosyringone, 0.5 mL/L Silwet L-77). Submerge the initial Camelina inflorescences in the Agrobacterium suspension, swirl the flowers gently in the solution, and vacuum infiltrate for 3 min. Wrap the flowers in plastic wrap and store overnight in a darkened room. The next day, unwrap the flowers and return the plants to the greenhouse to mature. Collect seed, then select transformants as described.
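
As a convenience for preparing the transformation medium, the per-litre amounts listed above can be scaled to a working volume as sketched below. The resuspension volume is not stated in the text, so the 300 mL used in the example is an assumption chosen only for illustration. The MS salts and Gamborg vitamins are given as strengths (0.5X, 1X) and depend on the stock used, so they are left out of the calculation. The names `PER_LITER` and `scale_recipe` are hypothetical.

```python
# Per-litre amounts taken from the medium recipe in the text.
PER_LITER = {
    "sucrose (g)": 50.0,
    "BAP (mg)": 0.01,
    "acetosyringone (mg)": 20.0,
    "Silwet (mL)": 0.5,
}


def scale_recipe(volume_ml, per_liter=PER_LITER):
    """Return the amount of each component needed for `volume_ml` of medium."""
    if volume_ml <= 0:
        raise ValueError("volume must be positive")
    factor = volume_ml / 1000.0
    return {name: amount * factor for name, amount in per_liter.items()}


# Assumed working volume of 300 mL (not specified in the protocol).
for name, amount in scale_recipe(300).items():
    print(f"{name}: {amount:g}")
```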

Plant Selection
Antibiotics, including kanamycin and G418 (nptII), hygromycin (hptII), and sulfadiazine (sulI), and the herbicide glufosinate (bar), were tested to determine an effective concentration for routine positive seed selection following a floral dip protocol. In addition, 5-fluorocytosine (5FC) and 5-fluorouracil (5FU) were investigated for negative selection. Concentrations of 200 mg L⁻¹ for kanamycin, 30 mg L⁻¹ for G418, 30 mg L⁻¹ for hygromycin, 150 mg L⁻¹ for sulfadiazine, and 15 mg L⁻¹ for glufosinate ammonium were effective at inhibiting growth of seedlings (Supplemental Figures S1-S5). Camelina was tolerant of 5FC up to 1000 mg L⁻¹ but sensitive to 5FU at concentrations as low as 20 mg L⁻¹ (Supplemental Figure S6).

Conclusions
From our phenotypic evaluation of the 41 camelina accessions obtained from the National Genetic Resources Program (NGRP) center, we identified a number of lines with potentially useful traits. For example, accession PI 311735, while providing only an average number of seed per plant (4953), produced the greatest overall seed weight at 5.06 g per plant. This accession also had one of the highest mean oil contents per TSW at 38.2% and was faster than average in days to maturity and drying. As results were so encouraging with this line, its use for biotech application was also explored. This accession was capable of transformation via traditional floral-dip technology, and glufosinate was an effective agent for seed selection. Thus, our observations indicate that accession PI 311735 is a potential line for both breeding and biotech use. Another line of interest was identified through biochemical analysis of the oil, where it was discovered that accession PI 650141 had an erucic acid concentration of 1.83%, which is below the 2% threshold required for food consumption, making this another potentially useful line for breeding efforts. Of interest was the split seen in the camelina population for chromosome number between n = 19 (24.3%) and n = 20 (75.6%). Our group hypothesized that camelina in its current hexaploid form may have originated from two divergent but related pathways. In short, both events could have begun with an n = 6 (C. laxa) and n = 7 (C. hispida) hybridization producing an n = 13 (C. rumelica)-like species. The pathways then diverged at a second hybridization of the n = 13 hybrid with one or the other original parent, such that n = 6 + 7 + 6 (19) or 6 + 7 + 7 (20).
From the biotechnological studies, it was discovered that the codA gene worked very efficiently at stunting camelina growth in the presence of 5FC. This should provide a valuable resource for techniques such as RMCE or CRISPR, both of which require the removal of DNA. Finally, even though the sulI selection marker gene produced only a single plant, that plant has been shown to be resistant to sulfa-based herbicides (data not shown) and may provide farmers with a way to control weeds while cultivating camelina using conventional treatments. Data are presented in an uncompressed format, both in tables for manuscript discussion and as raw data in XLS spreadsheet format (Supplemental Tables S1-S4). With the variability seen within this collection, it is our hope that this information will help direct breeding or biotechnological programs for camelina's future use as a sustainable biofuel and/or meal crop.