Haploids in Conifer Species: Characterization and Chromosomal Integrity of a Maritime Pine Cell Line

Haploids are a valuable tool for genomic studies in higher plants, especially those with very large genomes and long juvenile periods, such as conifers. In these species, megagametophyte cultures have been widely used to obtain haploid callus and somatic embryogenic lines. One of the main problems associated with tissue culture is the potential genetic instability of the regenerants. The chromosomal stability of the callus and/or somatic embryos should therefore be assessed. Chromosome counting, flow cytometry and microsatellite genotyping have been reported for this purpose. Here, we present an overview of the work done in conifers, with special emphasis on the production of a haploid cell line in maritime pine (Pinus pinaster L.) and the use of a set of molecular markers, comprising Single Nucleotide Polymorphisms (SNPs) and microsatellites or Simple Sequence Repeats (SSRs), to validate chromosomal integrity by confirming the presence of all chromosome arms.


Introduction
Haploid tissues are valuable for genomic studies and breeding purposes in higher plants, especially those with very large genomes and long juvenile periods, such as conifers. To this end, megagametophytes and, to a lesser extent, microsporophylls (i.e., pollen grains) have been used as a source of material.
In the last few years, the development of next-generation sequencing technologies has made genomes and transcriptomes available for non-model species such as conifers [1]. To date, conifer genome sequencing projects have used diploid tissue and also haploid megagametophytes, since the latter offer a major reduction in sequence complexity [2-4]. However, the small amount of DNA that can be obtained from a single megagametophyte may be insufficient to sequence these large genomes [5]. To follow this approach, it is essential to generate a genetically stable haploid cell line derived from in vitro grown tissues for each targeted species.
The generation of haploid lines relies on the proliferative capability of the initial haploid tissue. The latest review on this subject [6] reported limited success when male haploid tissues were used as explants, whereas haploid cell lines exhibiting different degrees of morphogenesis have been reported from megagametophyte cultures of several conifers. An updated overview of what has been reported from megagametophyte cultures in conifers is given in Table 1. Briefly, proliferating calli were obtained from Picea sitchensis [7], and organogenic and/or embryogenic cultures from Picea abies [8,9], Larix decidua, Larix leptolepis, and their reciprocal crosses [10-14]. In addition, putative L. decidua haploid plants were obtained [15]. Chromosome counting of this material showed that haploid cultures sometimes produced a diversity of chromosome complements ([12,16] and references therein) and that the plant regenerated from L. decidua haploid somatic embryos was mixoploid, with predominantly diploid cells [15,16]. More recently, mass production of pollen protoplasts has been reported for Pinus bungeana and Picea wilsonii [17], but their haploid status was not analyzed.
In the last decade, advances in genome sequencing have prompted the development of protocols to generate true haploid cell lines to be used as a DNA source for massive genome analyses of conifer species. As a result of these efforts, haploid cell lines from megagametophytes of Pinus pinaster [19] and Larix sibirica [18] have been obtained and their haploid status studied using DNA markers. The developmental stage of the megagametophyte is an important factor in these studies, although the optimal stage for the induction of haploid cultures may differ among species and genotypes [6,7,18,19]. Modified Murashige and Skoog [20] or Litvay (as described in [21]) media, supplemented with different concentrations of 2,4-D with or without cytokinins, have been widely used to initiate callus formation and for further proliferation in conifers [6,7,18,19].
However, it is well documented that long-term cultures may induce instability of the regenerants [22]. This instability can be genomic (affecting ploidy level), chromosomal (inversions, deletions, etc.), genic (mutation), or epigenetic [16,18,23-27]. Since in vitro variation in chromosome sets has been recorded [12,13,16,18,19], it is crucial to validate the suitability of the starting material when approaching de novo whole genome sequencing based on a single haploid tissue. Different strategies can be used to confirm the presence of all chromosomes in a haploid state and the absence of large deletions involving the loss of chromosome arms [18], including chromosome counting, flow cytometry, and genotyping using molecular markers [26]. Karyological analysis detects all modifications in chromosome number, but it is sometimes difficult to obtain enough cells containing metaphases with chromosomes sufficiently contracted and separated to provide a reliable chromosome count [19]. Flow cytometry allows an indirect evaluation, but it can hardly detect aneuploidy involving only one extra or missing chromosome, or major chromosome rearrangements.
Pinus pinaster (maritime pine) is an important conifer in Mediterranean regions, with a high ecological and socio-economic value [28], and is also one of the most advanced models for conifer research. As a result, several collaborative initiatives to sequence the genome of this species have been launched in the last few years [29]. In previous work, Arrillaga et al. generated several maritime pine haploid cell lines [19], one of which (L5) was considered a good candidate to be used as a template for maritime pine genome sequencing [30]. The haploid status of the L5 callus line was initially checked using flow cytometry and seven microsatellite markers [19]. Because neither approach can rule out the loss of individual chromosome arms, a more in-depth analysis of chromosome integrity was needed to ensure the suitability of the line for massive sequencing. Here, we describe the selection and use of a set of Single Nucleotide Polymorphisms (SNPs) and microsatellites or Simple Sequence Repeats (SSRs) to assess the presence of all chromosome arms in the DNA extracted from all L5 haploid callus line samples used for Pinus pinaster genome sequencing.

Plant Material
As starting plant material, we used the putative haploid maritime pine L5 line previously generated by Arrillaga et al. [19]. Briefly, this line was obtained from a megagametophyte isolated from a cone of the Oria 6 genotype (Oria, Almeria, Spain) collected before seed dehydration. After embryo removal, the isolated megagametophyte was cultured on modified Litvay's medium (mLV) [21] for callus induction (Figure 1). The resulting L5 line was maintained by subculturing onto the same medium every 3 weeks.

Chromosome Integrity Analysis
Total DNA from L5 cell line samples, derived from a single megagametophyte preselected as a template for genome sequencing, as well as from needles of the mother tree (Oria 6), was extracted using the Qiagen DNeasy® Plant Mini Kit (Qiagen, Hilden, Germany) following the manufacturer's protocol. SNP and SSR molecular markers covering all chromosome arms were identified based on their positions on the Oria 6 genetic map [28] (Figure 2). Allelic composition of the SSR loci was determined by electrophoresis in acrylamide gels according to de Miguel et al. [28]. SNPs were genotyped by sequencing a PCR-amplified region of 200-400 bp flanking each targeted site, using specific primers. Primers (available as Supplementary Material, Table S1) were designed based on a BLAST search of the sequences used to build the genotyping array, which was used to construct the Oria 6 genetic map [29]. PCRs were performed in 25 µL containing 10 ng of DNA, 1× Pfu buffer (Fermentas, Burlington, ON, Canada), 0.2 mM of each dNTP, 1.5 mM MgSO4, 1.25 U Pfu DNA polymerase (Fermentas, Burlington, ON, Canada), and 0.2 µM of each primer. A Perkin-Elmer GeneAmp 9700 thermal cycler (Perkin Elmer Inc., Waltham, MA, USA) was used to carry out the PCRs. The thermocycler parameters were: 95 °C for 1 min; 20 touchdown cycles of 95 °C for 30 s, (Tm + 3) °C for 45 s (−0.5 °C/cycle) and 72 °C for 1 min; 20 cycles of 95 °C for 30 s, (Tm − 5) °C for 30 s and 72 °C for 1 min; and a final extension step of 72 °C for 7 min. After PCR, an aliquot of each amplified product was checked by electrophoresis on 1% agarose gels, and the visualized band was purified using a QIAquick™ gel extraction kit (Qiagen, Hilden, Germany) and sequenced. The haploid/diploid state of the targeted SNP was evaluated by aligning the sequences obtained for the callus and the mother tree with that of the reference transcriptome using the software SeqMan 7.1 from DNAStar (Lasergene, GATC Biotech, Konstanz, Germany).
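To make the thermocycler program easier to follow, the Python sketch below expands it into an explicit list of steps for a given primer melting temperature. It is only an illustration under stated assumptions: the names `Step` and `touchdown_program` are hypothetical, and we assume that the first touchdown cycle anneals at (Tm + 3) °C and that each subsequent touchdown cycle anneals 0.5 °C lower.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One hold of the thermocycler program (hypothetical helper type)."""
    label: str
    temp_c: float
    seconds: int


def touchdown_program(tm_c: float) -> List[Step]:
    """Expand the PCR program described in the text for a primer pair with melting temperature tm_c."""
    steps = [Step("initial denaturation", 95.0, 60)]

    # 20 touchdown cycles. Assumption: cycle 1 anneals at Tm + 3 and the
    # annealing temperature drops by 0.5 degrees C in each following cycle.
    for cycle in range(20):
        anneal_c = tm_c + 3.0 - 0.5 * cycle
        steps.append(Step(f"touchdown {cycle + 1} denature", 95.0, 30))
        steps.append(Step(f"touchdown {cycle + 1} anneal", anneal_c, 45))
        steps.append(Step(f"touchdown {cycle + 1} extend", 72.0, 60))

    # 20 cycles at a fixed annealing temperature of Tm - 5.
    for cycle in range(20):
        steps.append(Step(f"fixed {cycle + 1} denature", 95.0, 30))
        steps.append(Step(f"fixed {cycle + 1} anneal", tm_c - 5.0, 30))
        steps.append(Step(f"fixed {cycle + 1} extend", 72.0, 60))

    steps.append(Step("final extension", 72.0, 7 * 60))
    return steps


def hold_time_minutes(steps: List[Step]) -> float:
    """Total programmed hold time in minutes (ramp times are not included)."""
    return sum(step.seconds for step in steps) / 60.0
```

For a hypothetical primer pair with Tm = 60 °C, `touchdown_program(60.0)` yields annealing temperatures that fall from 63 °C to 53.5 °C during the touchdown phase and then stay at 55 °C, and `hold_time_minutes` gives the programmed hold time excluding ramping.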

Results
In previous work, the ploidy level of the L5 line was tested by karyological and flow cytometry analyses and with seven SSR markers [19]. To ensure that the most appropriate material is used for maritime pine DNA sequencing, chromosome integrity must also be demonstrated.
Genotyping of the single SSR locus studied (A6F03) confirmed the presence of the upper arm of chromosome 10 in a haploid state: one allele was detected in the DNA from the L5 callus line and two alleles in the needle samples of the mother tree (Table 2). In addition, a set of 64 SNPs located on the 23 remaining chromosome arms of the Oria 6 genetic map [28] was selected to further analyze chromosome integrity, and new primers were designed flanking these 64 SNPs. The analysis of the sequences obtained with 23 of the primer pairs confirmed the presence and the haploid status of their respective chromosome arms by detecting two alleles in the diploid DNA and one allele in the haploid samples at the targeted SNP position (Table 2). Of the remaining 41 primer pairs, 24 led to poor or no amplification, probably because of mismatched nucleotides in the primer sequences; six amplified more than two DNA fragments on agarose gels or sequences with high levels of variation, even in the L5 sample, pointing towards the amplification of unspecific or duplicated sequences; four amplified DNA fragments with no homology to the reference sequence used to design the primers; and seven amplified DNA fragments that were monomorphic in the diploid sample.
Table 2. Molecular markers used to validate the presence of all chromosome arms. The position of each marker on the Oria 6 genetic map [29], as well as the allelic state of the mother tree (Oria 6) and the L5 line (callus), are shown.
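The scoring rule applied to each marker (two alleles in the diploid mother tree, a single one of those alleles in the callus) and the tally of primer outcomes can be written out as a short Python sketch. This is not the procedure used in the study, which relied on visual inspection of gels and sequence alignments; the function names, category labels and example genotypes below are hypothetical, and only the counts in `PRIMER_OUTCOMES` are taken from the text.

```python
from collections import Counter
from typing import Dict, FrozenSet, Tuple

Genotype = FrozenSet[str]


def classify_marker(mother: Genotype, callus: Genotype) -> str:
    """Classify one marker from the alleles seen in the mother tree and in the callus."""
    if not mother or not callus:
        return "no call"          # amplification or sequencing failed
    if len(mother) != 2:
        return "uninformative"    # e.g. monomorphic in the diploid sample
    if not callus <= mother:
        return "inconsistent"     # callus allele not found in the mother tree
    # A single maternal allele supports a haploid state of that arm.
    return "haploid" if len(callus) == 1 else "diploid-like"


def summarize(genotypes: Dict[str, Tuple[Genotype, Genotype]]) -> Counter:
    """Count how many markers fall into each category."""
    return Counter(classify_marker(m, c) for m, c in genotypes.values())


# Hypothetical example genotypes (not data from the study).
EXAMPLE_GENOTYPES = {
    "marker_1": (frozenset({"A", "G"}), frozenset({"A"})),
    "marker_2": (frozenset({"C", "T"}), frozenset({"C", "T"})),
    "marker_3": (frozenset({"G"}), frozenset({"G"})),
}

# Outcome counts for the 64 SNP primer pairs, as reported in the text.
PRIMER_OUTCOMES = {
    "informative, haploid in L5": 23,
    "poor or no amplification": 24,
    "multiple fragments or highly variable sequences": 6,
    "no homology with reference sequence": 4,
    "monomorphic in the diploid sample": 7,
}

# One SSR (A6F03) plus 23 informative SNPs, one marker per chromosome arm.
ARMS_COVERED = 1 + PRIMER_OUTCOMES["informative, haploid in L5"]
```

With the reported counts, the five outcome classes add up to the 64 primer pairs designed, and the 24 informative markers correspond to one marker for each of the 24 chromosome arms.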

Discussion
Different molecular marker technologies have been used to study the genetic stability of cell lines or regenerated plants. Random amplified polymorphic DNA (RAPD) markers have been used in peach [31], cedar [32], lemon [33] and chestnut [34], whereas amplified fragment length polymorphism (AFLP) has been used in kiwifruit [35], grape [36] and apple [37]. In the last few years, microsatellites have been widely used to analyze many plant species, including Arabian coffee [38], cork oak [39] and trembling aspen [40]. In conifers, the genetic stability of in vitro-derived diploid cultures has been widely studied. Embryogenic cell lines and cotyledonary embryos of Pinus sylvestris showed high mutation rates in SSR sequences [25]. In contrast, the stability of RAPD markers and nuclear microsatellite sequences has been reported at successive stages of somatic embryogenesis in Picea abies, including plants regenerated from somatic embryos [41-43]. In Pinus pinaster, the genetic stability of embryogenic cultures and derived plantlets has been studied by flow cytometry [44], microsatellites [45] and DNA methylation [27]. Flow cytometry analysis was unable to detect genetic changes, whereas both microsatellite and DNA methylation studies identified genetic variation in embryogenic lines [27,44,45]. These results point to DNA markers as a more appropriate method than flow cytometry for detecting genetic/epigenetic instability.
The use of molecular markers to detect genomic instability attains special relevance when approaching de novo whole genome sequencing based on a single haploid tissue. To the best of our knowledge, only two studies have previously used molecular markers to assess the genetic stability of conifer haploid cultures. In Larix sibirica, mutations in one or more of the 11 microsatellite loci were detected in all the megagametophyte-derived calli tested [18]. In Pinus pinaster, the seven microsatellites used to genotype 16 megagametophyte-derived haploid lines showed the expected allele composition, each amplifying a single allele of the mother tree [19]. Microsatellites are robust, highly variable and codominant markers. Despite these benefits, the main drawbacks of using microsatellites in pine species are the large proportion of amplification failures, inconsistencies in the amplification, multibanding patterns, and lack of polymorphism [46], probably due to the complexity of their genomes. For these reasons, a limited number of SSR markers have been located on the existing P. pinaster genetic maps [47-49], and only one of them (locus A6F03), which showed a clear amplification profile and was informative with respect to its position on one of the chromosome arms, was used in this study. However, specific regions of the conifer genome can also be analyzed using markers based on single nucleotide polymorphisms (SNPs). Although they are less polymorphic than SSR markers, SNPs are abundant, ubiquitous and amenable to high-throughput detection. In this context, the identification of specific heterozygous SNP loci located on the remaining chromosome arms of the Oria 6 genome allowed validation of their presence in the L5 cell line. Although several of the SNPs initially targeted were discarded because of inconsistencies in the amplification, scoring or sequencing process, 23 SNPs were finally selected based on their usefulness to validate the presence of their corresponding chromosome arms (Table 2). The molecular marker analysis described here detected a single allele at each locus studied (one SSR and 23 SNPs), confirming the presence of all chromosomes in a haploid state in the P. pinaster cell line L5 and the absence of deletions involving the loss of chromosome arms. Thus, the present work not only represents a step forward in validating the suitability of the L5 cell line samples for massive sequencing of the P. pinaster genome but also provides a set of markers that could be used for similar purposes in this species.

Conclusions
The availability of haploid cell lines or plants is important to facilitate genome sequencing of plant species with large genomes, such as conifers. In most haploid in vitro cultures, genetic instability has been described by karyological and, later, by flow cytometry analyses. Therefore, when starting a de novo whole genome sequencing project based on a single haploid tissue, validation of the suitability of the starting material is highly desirable. In this work, we validated the L5 cell line, which is being used as a template for genome sequencing of Pinus pinaster, by genotyping molecular markers located on each chromosome arm, a strategy that may be extended to other species.

Figure 1. (a) Developmental stage of the megagametophyte used to initiate maritime pine haploid cultures. The zygotic embryo was removed before culture initiation; (b) callus from Line L5.

Figure 2. Location of the molecular markers on the Oria 6 genetic map. The position of the markers used to confirm the presence of all chromosome arms, as well as the size of each linkage group, is indicated in centimorgans according to [28].

Table 1. Haploid tissue obtained from megagametophyte cultures of conifer species.