Pacbio Sequencing Reveals Identical Organelle Genomes between American Cranberry (Vaccinium macrocarpon Ait.) and a Wild Relative

Breeding efforts in the American cranberry (Vaccinium macrocarpon Ait.), a North American perennial fruit crop of great importance, have been hampered by the limited genetic and phenotypic variability observed among cultivars and experimental materials. Most of the cultivars commercially used by cranberry growers today were derived from a few wild accessions bred in the 1950s. In different crops, wild germplasm has been used as an important genetic resource to incorporate novel traits and increase the phenotypic diversity of breeding materials. Vaccinium microcarpum (Turcz. ex Rupr.) Schmalh. and V. oxycoccos L., two closely related species, may be cross-compatible with the American cranberry, and could be useful to improve fruit quality such as phytochemical content. Furthermore, given their northern distribution, they could also help develop cold hardy cultivars. Although these species have previously been analyzed in diversity studies, genomic characterization and comparative studies are still lacking. In this study, we sequenced and assembled the organelle genomes of the cultivated American cranberry and its wild relative, V. microcarpum. PacBio sequencing technology allowed us to assemble both mitochondrial and plastid genomes at very high coverage and in a single circular scaffold. A comparative analysis revealed that the mitochondrial genome sequences were identical between both species and that the plastids presented only two synonymous single nucleotide polymorphisms (SNPs). Moreover, the Illumina resequencing of additional accessions of V. microcarpum and V. oxycoccos revealed high genetic variation in both species. Based on these results, we provided a hypothesis involving the extension and dynamics of the last glaciation period in North America, and how this could have shaped the distribution and dispersal of V. microcarpum. Finally, we provided important data regarding the polyploid origin of V. oxycoccos.


Introduction
Worldwide, the majority of food crops derive from a few wild plant species. In the United States (USA), farmers grow a great variety of plants such as cereals, sugar crops, vegetables, and fruits (FAO, 2010). However, few crop species are native to North America, and even fewer can still be found in their wild forms and habitats. Wild germplasm, especially from crop wild relatives, is of tremendous importance for breeding and agricultural production since it represents a pool of useful characteristics such as tolerance to biotic and abiotic stress, resistance genes, and traits influencing yield [1][2][3][4].
Vaccinium macrocarpon Ait. (American cranberry) is endemic to North America, and one of the few fruit crops that can still be found in the wild. The niche of this species is similar to its wild relative Vaccinium oxycoccos L. (small cranberry) [5][6][7]. However, V. oxycoccos is restricted to peatlands, where it grows embedded in moss lawns (usually sphagnum). Vaccinium macrocarpon grows in a wider variety of wetlands, including peatlands, swamps, and wet shores, and it will grow in mineral soils as well as moss lawns. Vaccinium oxycoccos produces over-wintering berries that, although smaller than the cultivated cranberry, share a similar flavor and possess a superior antioxidant content [8,9]. The high phytochemical content of the wild relative species and the relatively low genetic diversity in cultivated cranberry make the wild species extremely useful for breeding purposes [10,11]. In addition to its unique antioxidant profile, its northern and circumboreal distribution [12,13] make the small cranberry an interesting source of traits such as cold hardiness [14,15].
While the American cranberry is a diploid species, the small cranberry V. oxycoccos can be found as diploid, tetraploid, and, rarely, hexaploid forms. Although the most recent Flora of North America treatment [6] treats all three cytotypes as a single variable species (i.e., V. oxycoccos), morphological, molecular, phenological, and ecological data support the recognition of the diploid form as a distinct species: Vaccinium microcarpum (Turcz. ex Rupr.) Schmalh. [11,16,17]. According with [18], V. macrocarpon is considered the ancestral form, from which V. microcarpum was derived. According to this theory, the tetraploid V. oxycoccos formed following secondary contact of the diploid species through a combination of polyploidy and hybridization. He noted that this could have followed two paths: allopolyploid hybridization between V. macrocarpon and V. microcarpum, or the hybridization of autotetraploid plants derived from each diploid species. Regardless of the exact mechanism, he considered the morphology of V. oxycoccos, which is in some ways intermediate between the two diploids, strong evidence supporting hybrid ancestry. This hypothesis is supported by allozyme data [19], which shows that V. oxycoccos possesses alleles from both of the diploid species. However, the same data revealed no fixed heterozygous loci, which is consistent with tetrasomic inheritance as expected in autotetraploids [19,20]. Our own amplified fragment length polymorphism (AFLP) data [11] also reveal the presence of alleles from both diploids in V. oxycoccos, although simulation analysis suggests that it has diverged from the diploids since its formation. During the last decade, several studies regarding cranberry genetics and genomics have been published. In 2013, the first simple sequence repeat (SSR)-based genetic map was published, in which several quantitative trait loci (QTL) for yield-related traits and important phytochemicals were reported [21]. Subsequently, high-density genetic maps as well as more multi-trait QTL studies became available [22][23][24][25][26][27]. In addition, nuclear [28], plastid [29], and mitochondrial [30] genomes have been published recently. Regarding diversity between V. macrocarpon and wild relatives such as V. microcarpum and V. oxycoccos, most of the studies have been based on simple sequence repeat (SSR) markers generated from both nuclear and organelle genomes [31][32][33][34][35].
Given the recent advances on next-generation sequencing technologies, we generated PacBio complete plastid and mitochondrial genomes of an Alaskan accession of V. microcarpum as well as updated the American cranberry organelle genomes. A comprehensive comparison between the organelles of these two species revealed near identical sequences, which indicates common ancestry with implications about the origin of cultivated V. macrocarpon in North America. Moreover, Illumina resequencing in V. microcarpum samples collected in eastern Canada suggested a process of genetic diversification. Finally, we provided more evidence regarding the polyploid origin of V. oxycoccos.

Plant Material
For V. macrocarpon, we used the commercial cultivar "Stevens", which was derived from a cross between two wild selections, "McFarlin" (from MA, USA) and "Potter's Favorite" (from WI, USA).
For V. microcarpum, we used an accession collected by N. Vorsa (Rutgers University, New Brunswick, Newark and Camden, NJ, USA) in southern Alaska and described in Mahy et al. (2000). Both plants were maintained clonally under greenhouse conditions at the University of Wisconsin-Madison (Madison, WI, USA). These accessions have been extensively used in different marker studies, providing a unique genetic fingerprint based on nuclear and organellar information as well as known genetic relationships based on relatively diverse samples of V. macrocarpon, V. microcarpum, and V. oxycoccos [19,[31][32][33]. In Figure 1, we show the distinctive plant characteristics and attributes of each of our V. microcarpum and V. macrocarpon accessions.

Plant Material
For V. macrocarpon, we used the commercial cultivar "Stevens", which was derived from a cross between two wild selections, "McFarlin" (from MA, USA) and "Potter's Favorite" (from WI, USA). For V. microcarpum, we used an accession collected by N. Vorsa (Rutgers University, New Brunswick, Newark and Camden, NJ, USA) in southern Alaska and described in Mahy et al. (2000). Both plants were maintained clonally under greenhouse conditions at the University of Wisconsin-Madison (Madison, WI, USA). These accessions have been extensively used in different marker studies, providing a unique genetic fingerprint based on nuclear and organellar information as well as known genetic relationships based on relatively diverse samples of V. macrocarpon, V. microcarpum, and V. oxycoccos [19,[31][32][33]. In Figure 1, we show the distinctive plant characteristics and attributes of each of our V. microcarpum and V. macrocarpon accessions.

PacBio Sequencing, Genome Assembly, and Annotation
Young leaves of V. macrocarpon and V. microcarpum were processed at Amplicon Express (Pullman, WA, USA) to obtain high molecular weight DNA. Two single-molecule real-time (SMRT) cells of PacBio Sequel (Pacific Biosciences of California, Inc., CA, USA) were sequenced for each species at The DNA Technologies and Expression Analysis Cores of the University of California-Davis (Davis, CA, USA). For each species, a complete de novo assembly (including all the raw reads) was performed on Canu v1.7 [36] using the automatic pipeline (that includes read correction, trimming, and assembly) with default parameters. From these assemblies, mapping (using BLASTn) the plastid assembly published by Fajardo et al. (2013) allowed us to extract contigs of lengths 209,726 bp (for V. macrocarpon) and 223,859 bp (for V. microcarpum), which we believed contain the complete plastid genomes. On the other hand, mapping the reference mitochondrial genome published by Fajardo et al. (2014) yielded contigs with lengths smaller than the expected genome size (459,678 bp); therefore, an alternative strategy was carried out as follows. Raw PacBio reads were mapped with BlastR (with minMatch = 15, and fastSDP and advanceHalf activated) using as reference the mitochondrial

PacBio Sequencing, Genome Assembly, and Annotation
Young leaves of V. macrocarpon and V. microcarpum were processed at Amplicon Express (Pullman, WA, USA) to obtain high molecular weight DNA. Two single-molecule real-time (SMRT) cells of PacBio Sequel (Pacific Biosciences of California, Inc., CA, USA) were sequenced for each species at The DNA Technologies and Expression Analysis Cores of the University of California-Davis (Davis, CA, USA). For each species, a complete de novo assembly (including all the raw reads) was performed on Canu v1.7 [36] using the automatic pipeline (that includes read correction, trimming, and assembly) with default parameters. From these assemblies, mapping (using BLASTn) the plastid assembly published by Fajardo et al. (2013) allowed us to extract contigs of lengths 209,726 bp (for V. macrocarpon) and 223,859 bp (for V. microcarpum), which we believed contain the complete plastid genomes. On the other hand, mapping the reference mitochondrial genome published by Fajardo et al. (2014) yielded contigs with lengths smaller than the expected genome size (459,678 bp); therefore, an alternative strategy was carried out as follows. Raw PacBio reads were mapped with BlastR (with minMatch = 15, and fastSDP and advanceHalf activated) using as reference the mitochondrial genome published by Fajardo et al. (2014); then, filtered reads were de novo assembled using the automatic pipeline of Canu as in the assembling process used for the plastid. For V. macrocarpon and V. microcarpum, the largest contig on each assembly was 468,773 bp and 468,164 bp, respectively. Since Canu was ran using the default parameters, all of the assemblies were resolved at~40× coverage. As described in the user's manual, Canu processes the longer reads up to get 40×, which are further corrected using the remaining shorter reads. Thus, a conventional estimation of coverage (which is computed using all the reads) would be considerably larger than this. A round of polishing was performed in the plastid and mitochondrial assemblies of both species using Arrow (with default parameters). Subsequently, the circularity of each assembly was checked with the tool Circlator v1.5.5 [37] using b2r_length_cutoff = 100,000 and merge_reassemble_end = 9000. Following the recommendations of Circlator's authors, a final round of polishing with Arrow was performed in all four assemblies, (i.e., two organelles each for V. macrocarpon and V. microcarpum).
Gene annotation was performed with GeSeq [38] using all the Ericales and Asterids as reference; the percentage of identity for the BLAT protein search, and for the BLAT rRNA, tRNA, and DNA search, were set to 70 and 80, respectively. All the annotations were manually reviewed and corrected, if necessary, in Geneious R11 [39]. GC content was analyzed in R [40] using the package Biostrings. SSR search was performed with Phobos version 3.3.12 [41].

Illumina Sequencing and Mapping
We extended our study by sequencing additional accessions of V. macrocarpon and V. microcarpum, as well the tetraploid species V. oxycoccos. For V. macrocarpon, we included samples WABL11, WC16-13 and WC16-16. For V. microcarpum, we include samples QCCW5, QCJB20, and QCWA9 (previously described in Smith et al., 2015). For V. oxycoccos, we include three wild samples: MWB3, PMS5 and PB17. Additionally, the V. macrocarpon and V. microcarpum samples used for PacBio sequencing were also sequenced with Illumina (San Diego, CA, USA). In Table 1, a summary of the samples used for both PacBio and Illumina sequencing as well as important information regarding their collection sites is provided. Genomic DNA was extracted using NucleoSpin ® Plant II (Macherey-Nagel, Bethlehem, PA, USA), and submitted to The Biotech Center at the University of Wisconsin-Madison (Madison, WI, USA) for sequencing. Indexed samples were evenly loaded into three HiSeq 2500 Illumina (San Diego, CA, USA) lanes, 1 × 100 bp. Illumina reads were cleaned using Trimmomatic [42] (minimum length 85 bp and minimum quality 28), and then mapped against our PacBio organelle genomes with bwa. Later, filtered reads were de novo assembled using Abyss with the parameter k = 64, and the contigs were ordered based on the organelle reference genomes in Geneious R11 (Newark, NJ, USA). Variants were searched first, by aligning all nine samples (global SNP across all nine species), and then by aligning the samples of the same species (specific alignments, three sequences per species). Alignments were performed using the progressive method implemented in Mauve [43]; SNP were called after masking gaps in the alignment.

Phylogenetic Analysis
Due to the recent increase in the availability of plastid genomes, especially from the Ericales, we carried out a phylogenetic analysis using 68 genes (five atp genes, ccsa, cema, matk, 11 ndh, six pet, five psa, 14 psb, rbcl, seven rpl, three rpo, 11 rps, and two ycf), which were consistently present in 69 plastid genomes. In this analysis, 36 genomes corresponded to the Ericales. All the genome sequences were downloaded from NCBI. Coding sequences were extracted in Geneious R11, and imported in R using the package Biostrings for filtering the genes present in the majority of the species. Then, gene sequences were concatenated for each species, exported in fasta format, and imported back in Geneious R11 for sequence alignment and tree construction. Alignments were performed using the progressive method implemented in MAFFT v7 [44] with the default parameters. Optimal trees were inferred using Maximum Likelihood as implemented in FastTree [45]. Since mitochondrial genomes of Ericales are still scarce, we did not perform any phylogenetic analysis based on mitochondrial sequences.

Cultivated and Wild Cranberries Share Identical Organelle Genomes
The first mitochondrial and plastid genomes in V. macrocarpon were published in 2014 [30] and 2013 [29], respectively. While these assemblies were highly reliable, the advantages of today's single molecule real-time sequencing allowed us to improve the current assemblies and generate novel organelle genomes of V. microcarpum.
The complete mitochondrial genome of V. macrocarpon obtained here was 468,115 bp long, which is almost 10 kb larger than the original version [30], while the plastid was 176,093 bp long-just 48 bp longer [29]. Both assemblies were resolved in a single circular scaffold with a mean coverage (based on PacBio corrected reads only) of 39.8× for the mitochondria, and 38.9× for the chloroplast. For the mitochondria, the GC content was very similar with the previous version (45.3% versus 45.4%), whereas for the plastid, the GC content was identical for both versions (36.8%). Moreover, the GC content in both organelles fell within the expected range of the Asterids clade (based on the mitochondrial genomes for this group available in NCBI). To identify major structural changes between previous and current assemblies of V. macrocarpon organelles, we performed whole-genome alignments using MuMmer [46] (Figure 2). For the mitochondrial genome, 376 rearrangements were observed; from those, 50% corresponded to sequences smaller than 30 nucleotides, and 81% corresponded to sequences smaller than 100 nucleotides. Although MuMmer identified rearrangements involving large sequences, they (and most of the smaller ones) shared perfect synteny between previous and current genome assemblies. For the plastid genome, 161 rearrangements were observed, from which 17% involved sequences smaller than 30 nucleotides and 43% involved sequences smaller than 100 nucleotides. Similar to the mitochondrial genome, most of the rearrangements did not affect the overall synteny between both assemblies.
For V. microcarpum, the genomes were resolved in a single circular scaffold with a mean coverage of 37.1% and 39.5% for mitochondrial and plastid genomes, respectively. Remarkably, both the mitochondrial and plastid genomes of V. microcarpum were completely identical to the ones in V. macrocarpon, with an exception of two synonymous single nucleotide polymorphisms (SNPs) in the plastid, which are described below. Illumina resequencing of both V. macrocarpon and V. microcarpum confirmed our PacBio assemblies. Both organelle genome assemblies for V. macrocarpon and V. microcarpum were deposited in the GenBank (MK715444, MK715445, MK715446, and MK715447). Genes 2019, 10 6 Figure 2. Organelle genomes of V. macrocarpon and V. microcarpum. A comparison between the PacBio assemblies presented in this study and previously published cranberry organelle versions [29,30] are shown; collinear regions were calculated using MuMmer with default parameters (see Materials and Methods). Since V. macrocarpon and V. microcarpum presented near-identical organelles, for the plastid genome, only single nucleotide polymorphism (SNP) positions are provided, and the mitochondria shown are identical. Gene annotations are shown as tiles in internal tracks; colors were randomly assigned to highlight different categories of protein-coding genes, ribosomal RNA and tRNA. Sequencing coverage is provided in area plots oriented inwards, where blue, yellow, and red represent genome areas with >30×, >10×, and ≤10× coverage, respectively (based on PacBio corrected reads). GC content is also provided as a green-colored area plot. In the plastid, large single copy (LSC), small single copy (SSC), and inverted repeats (IR1 and IR2) are highlighted with red, blue, and purple concentric bars. Since V. macrocarpon and V. microcarpum presented near-identical organelles, for the plastid genome, only single nucleotide polymorphism (SNP) positions are provided, and the mitochondria shown are identical. Gene annotations are shown as tiles in internal tracks; colors were randomly assigned to highlight different categories of protein-coding genes, ribosomal RNA and tRNA. Sequencing coverage is provided in area plots oriented inwards, where blue, yellow, and red represent genome areas with >30×, >10×, and ≤10× coverage, respectively (based on PacBio corrected reads). GC content is also provided as a green-colored area plot. In the plastid, large single copy (LSC), small single copy (SSC), and inverted repeats (IR1 and IR2) are highlighted with red, blue, and purple concentric bars.
Although uncommon, several studies in other plant species have reported the similarity of organelle genomes among cultivated and wild species. For example, the mitochondrial genomes of wild (Hordeum vulgare ssp. spontaneum) and cultivated (H. vulgare ssp. vulgare) barley, which has a history of 10,000 years since its domestication from the wild progenitor, are identical with the exception of three SNPs [47]. Similarly, a study regarding plastid inheritance in Nicotiana tabacum found very few SNPs when comparing the latter with N. sylvestris, its maternal wild progenitor [48].

Genome Architecture and Gene Content
Since V. macrocarpon and V. microcarpum contained near-identical organelle genomes, the following results regarding gene content, repeated sequences, and other statistics apply for both species.
For the mitochondria, gene content was similar to the previous genome version [30]. We found nine genes of the respiratory chain complex I (nad-NADH dehydrogenase), two of the complex II (sdh-succinate oxido-reductase, one more than in Fajardo et al. (2014)), one of the complex III (apocytochrome b), three of the complex IV (cox-cytochrome oxidase), and five of the complex V (atp-ATP synthase, one more than in [30]). In addition, we found four genes of the cytochrome c biogenesis (ccm), two for the cytochrome b (cytB and petL), one transcription gene (mat-maturase), one transporter protein (mtt-B), three ribosomal RNA (rrm), 13 ribosomal proteins (rpl and rps genes), and a malonate uptake protein (tatC). In total, we found 17 tRNA, from which three were duplicated (trnL, trnM, and trnS). In Fajardo et al. (2014), the authors reported the presence of two copies of tRNA-Sec, which is an unusual transfer RNA in land plants; here, we confirm its presence, but in a single copy (exon 1 of copy 1 (located at~45 kb) has a single different nucleotide). Out of the 42 protein-coding genes found in the mitochondrial genome, three contained non-ATG start codons, and eight had undetermined stop codons (different than TAA, TGA, or TAG). Among these, the gene petL had both start and stop codons undetermined; it was also 73 nucleotides long. Similarly, gene mttB was expected to have a non-ATG start codon based on previous studies [49][50][51][52]. Protein-coding genes varied largely in size and intervals, ranging from a few hundred nucleotides and a single interval (one exon), to thousands of nucleotides and up to five intervals (i.e. nad7 and nad2). Regarding SSR, we observed 1449 motifs with total lengths ranging from six to 26 nucleotides. As expected, dinucleotides and trinucleotides were the most abundant (80.0% and 15.6%, respectively); tetra, penta, and hexanucleotides account for less than 5%. The repeat AG was the most abundant motif among the dinucleotides. A complete list of repeated sequences in the mitochondria is provided in Supplementary File 1.
Most of the genes found in the plastid were also consistent with the previous version published by Fajardo et al. (2013). The annotated sequences were classified as follows: out of the 68 unique protein-coding genes, six corresponded to ATP synthase subunits (atp genes), 11 corresponded to NADH-dehydrogenase subunits (ndh genes), six corresponded to cytochrome b/f complex subunits (pet genes), five corresponded to photosystem I subunits (psa genes), 16 corresponded to photosystem II subunits (psb genes), eight corresponded to large ribosomal protein units (rpl genes), 12 corresponded to small ribosomal protein units (rps genes), and four corresponded to RNA polymerase subunits (rpo genes). Other important genes such accD, ccsA, infA, matK, and rbcL were also found. More than 95% of these genes showed regular start and stop codons. Five hypothetical genes (ycf ) were found, although only one showed standard start and stop codons. In addition, four ribosomal RNA and 26 unique transfer RNA were found. The length of the large single copy (LSC), small single copy (SSC), and inverted repeats (IR) fragments were 104,592 bp, 3025 bp, and 34,238 bp, respectively. All the ribosomal RNA were located in the IR; therefore, two copies of each gene were found. On the other hand, ndhF was the only gene in the SSC. Only 492 SSRs were found in the plastid genome with the following proportions: dinucleotide 77.2%, trinucleotide 13.4%, tetranucleotide 5.9%, and both penta and hexanucleotide with less than 2.0% each. The dinucleotide motif AT was the most abundant (Supplementary File 1). SSR frequency and the distribution of SSR motif type were different compared with previous studies in cranberry [33]. In the previous study, the mitochondria had fewer SSR (one every 3.8 kb) than the plastid (one every 2.0 kb) genome (in average), and the tetranucleotide motifs were the predominant for both organelle genomes.
As mentioned before, both mitochondrial and plastid genomes were virtually identical between V. macrocarpon and V. microcarpum. Only two SNPs were found in the plastid; the first one (G in V. microcarpum, A in V. macrocarpon, at position 139,582 bp) was located in the ndhF (at the SSC fragment), and was a non-synonymous mutation producing isoleucine in V. macrocarpon and valine in V. microcarpum, which were two similar non-polar aminoacids. The second SNP (A in V. microcarpum, G in V. macrocarpon, at position 57,591 bp) was located in the psaA gene, and it resulted in a synonymous change (which was translated to the amino acid glycine in both cases).

Phylogenetic Analysis of V. macrocarpon and Related Species
To study the phylogenetic position of V. macrocarpon within the Ericales, we performed an analysis based on 68 genes consistently present among 36 Ericales genomes, and 39 from other families ( Figure 3). As expected, the Ericales formed a monophyletic group. The topology of our tree mirrors relationships recovered from a more comprehensive analysis of a supermatrix that includes nearly 5000 species and 25 loci [53]. In both analyses, the balsaminoids (represented by Impatiens piufanensis and Hydrocera triflora in our data) are placed sister to the rest of the Ericales; the Ericaceae are sister to the sarracenioids (represented by Actinidia in our data) in a strongly supported clade. Our data included two additional Ericaceae: Arbutus unedo and Chamaedaphne calyculata. Of these, V. macrocarpon was most closely related to C. calyculata. Chamaedaphne calyculata, which is a circumboreal species with a recently published plastid genome [54], shares almost an identical genome architecture with V. macrocarpon, with the most profound changes in the inverted repeat regions. Moreover, C. calyculata shares many phenotypic traits with wild V. macrocarpon plants [55].

Phylogenetic Analysis of V. macrocarpon and Related Species
To study the phylogenetic position of V. macrocarpon within the Ericales, we performed an analysis based on 68 genes consistently present among 36 Ericales genomes, and 39 from other families ( Figure 3). As expected, the Ericales formed a monophyletic group. The topology of our tree mirrors relationships recovered from a more comprehensive analysis of a supermatrix that includes nearly 5000 species and 25 loci [53]. In both analyses, the balsaminoids (represented by Impatiens piufanensis and Hydrocera triflora in our data) are placed sister to the rest of the Ericales; the Ericaceae are sister to the sarracenioids (represented by Actinidia in our data) in a strongly supported clade. Our data included two additional Ericaceae: Arbutus unedo and Chamaedaphne calyculata. Of these, V. macrocarpon was most closely related to C. calyculata. Chamaedaphne calyculata, which is a circumboreal species with a recently published plastid genome [54], shares almost an identical genome architecture with V. macrocarpon, with the most profound changes in the inverted repeat regions. Moreover, C. calyculata shares many phenotypic traits with wild V. macrocarpon plants [55].

Divergent V. microcarpum Can Be Found in Eastern Canada
Due to the remarkable similarity found between V. macrocarpon and V. microcarpum organelle genomes, we expanded our study by resequencing additional accessions from a broader geographic area. In particular, we included samples of V. microcarpum collected in Quebec, Canada, V. oxycoccos (4×) collected in Wisconsin and Minnesota, USA, as well as V. macrocarpon from Wisconsin and the east coast of the USA (Table 1). For these samples, Illumina resequencing allowed us to partially assemble both mitochondrial and plastid genomes at a very high coverage (>1000×). After removing contigs smaller than 5 kb, mitochondrial genome assemblies for all nine samples had a length between 394-428 kb distributed among 27-30 scaffolds (additional statistics regarding assemblies are provided in Supplementary File 1). For the plastid, assembly lengths (taking in account only one of the inverted repeats) ranged from 76-95 kb among 7-10 contigs. The incompleteness of these de novo assemblies could have been produced by two main causes. First, the use of very short reads (1 × 100 bp) could have hampered the assembly process. Second, since these de novo assemblies were conducted using a subset of raw reads that mapped to our V. macrocarpon PacBio assembly, potential novel sequences absent in either the reference or the additional accession could have been missed. After orienting and ordering the plastid and mitochondrial contigs for all nine samples (based on V. macrocarpon), we aligned them and identified SNP variants. By considering aligned and ungapped regions involving all nine samples (using our V. macrocarpon PacBio assemblies as reference), we discovered 831 SNPs in the mitochondria ( Figure 4A and Table 2) and 121 in the plastid ( Figure 4B and Table 2). Of the SNPs found in the mitochondrial genome, four were located in the nad1 gene, 15 were located in the nad2 gene, and one was located in the rps14 gene. In the plastid, 39 SNPs were found in 13 different genes, in which ycf2 had the most (11). A large portion of SNPs was found in the regions delimiting the inverted repeat regions, which was expected given their volatility [56]. When aligning only the wild V. macrocarpon samples, 10 SNPs were found in the mitochondria and 11 were found in the plastid ( Figure 4C, top panel). Similarly, alignments of only the tetraploid V. oxycoccos resulted in 11 SNPs each for both mitochondria and plastid ( Figure 4C, bottom panel). In contrast, the alignment of the Canadian V. microcarpum samples resulted in 232 SNPs in the mitochondria and 332 SNPs in the plastid; from these, more than 95% of the SNPs were called, because sample QCWA9 showed alternative alleles ( Figure 4C, middle panel; for space reasons, only the first alignment position is shown). Additionally, we carried out an alternative approach in which, instead of comparing the de novo assemblies, the Illumina raw reads (previously trimmed and cleaned) were directly mapped using our V. macrocarpon as reference, and low coverage regions were recorded. This strategy revealed that the mitochondrial genomes of the wild V. macrocarpon samples WC16-13 and WC16-16 were very similar to our reference; however, we could not find Illumina reads of WABL11 to map a~1 kb region at position~198 kb of our V. macrocarpon assembly. Also, when mapping mitochondrial raw reads of our Canadian V. microcarpum samples, we observed a region of~8 kb at position~55 kb, and a two-interval region of~7 kb at position~384 kb in which we could not find Illumina reads to map. Interestingly, all the V. oxycoccos (4×) samples showed a continuous and ungapped coverage along both plastid and mitochondrial genomes. Figure 4D shows the unique coverage plots observed across all nine samples in the mitochondria; in Figures S1 and S2, all the coverage maps for both mitochondria and plastid genomes are provided.

Relationships among Diploids, and the Link between Alaskan V. microcarpum and V. macrocarpon
Vaccinium macrocarpon and V. microcarpum displayed dramatically different genetic diversity. Our samples of V. macrocarpon span 1700 km from Wisconsin to the east coast. Despite the geographic distance, four samples yielded only 21 intraspecific SNPs combined for the chloroplast and mitochondrial genomes. In contrast, we found 564 intraspecifc SNPs in three V. microcarpum samples On the heatmaps (right arcs), the inner three rings correspond to the V. microcarpum samples QCCW5, QCJB20, and QCWA9; the next three rings (4)(5)(6) correspond to the V. oxycoccos samples MWPB3, PB17, and PMS5; rings 7-9 correspond to the wild V. macrocarpon samples WABL11, WC16-13, and WC16-16; and rings 10-11 correspond to cultivated V. macrocarpon sequenced with Illumina and PacBio, respectively (note these last two sequences were identical to the Alaskan V. microcarpum sequences). Each SNP is connected by a colored link to its physical position within either the mitochondria or plastid genome (left to right arcs); link colors are meaningless. For reference purposes, gene locations were added on the left arcs and colored by category. (C) SNP variants were called when aligning sequences by species; for the V. macrocarpon alignment, Stevens* correspond to the assembly carried out with Illumina data (used to validate the PacBio assembly), whereas Stevens** correspond to the one with PacBio; for V. microcarpum, only the first positions of the alignment are shown. (D) Coverage plots using Illumina filtered raw reads and V. macrocarpon/V. microcarpum mitochondrial genomes as reference (additional coverage plots are provided in Figures S1 and S2). (E) Ice extent between 9-10 k years ago showing the ice-free corridor (IFC).

Relationships among Diploids, and the Link between Alaskan V. microcarpum and V. macrocarpon
Vaccinium macrocarpon and V. microcarpum displayed dramatically different genetic diversity. Our samples of V. macrocarpon span 1700 km from Wisconsin to the east coast. Despite the geographic distance, four samples yielded only 21 intraspecific SNPs combined for the chloroplast and mitochondrial genomes. In contrast, we found 564 intraspecifc SNPs in three V. microcarpum samples collected within 260 km along the James Bay Coast. This is a very modest representation of a species with a circumboreal distribution; more comprehensive sampling is likely to reveal further diversity.
The organelle genomes shared by the Alaskan V. microcarpum sample and all the V. macrocarpon samples was surprising. Given the unexpected diversity in our V. microcarpum plants, this could be explained by the persistence of multiple ancestral cytoplasmic lineages across the range of V. microcarpum, including the one now present in V. macrocarpon. In other words, the shared genomes are the consequence of incomplete lineage sorting among V. macrocarpon and V. microcarpum.
Alternatively, the shared genomes could be the result of past gene flow between the Alaskan V. microcarpum populations and V. macrocarpon in eastern North America. Our V. macrocarpon accession was collected in southern Alaska, in the region of the ice-free corridor (IFC) that connected eastern Alaska with the north-central region of the USA during the deglaciation period between 8000-11,000 years ( Figure 4E) [57]. The IFC enabled the movement of humans and wildlife across the continent thousands of years before the Laurentide ice sheet retreated from eastern Canada, which could explain the closer genetic relationship between V. macrocarpon and the Alaskan V. microcarpum than that of the geographically closer populations in Quebec. Further sampling across the range of V. microcarpum is needed to clarify this situation.

The Origin of the Tetraploid V. oxycoccos
Different studies have discussed the tetraploid origin of V. oxycoccos, but no definitive answer has been provided. Two main theories have been proposed. The first one describes that V. oxycoccos arose as a hybrid of the two diploid taxa, V. macrocarpon and V. microcarpum [18], whereas the second suggests that V. oxycoccos is most likely the autopolyploid descendant of the diploid V. microcarpum [19]. Since V. oxycoccos was included in our resequencing analysis, the SNPs found across all three species may provide evidence regarding the origin of V. oxycoccos.
For 827 of the 831 mitochondrial SNPs (99.3%), V. macrocarpon and V. oxycoccos shared the same allele (Table 2). Vaccinium oxycoccos and V. microcarpum shared the same allele at a single mitochondrial SNP (<1%). This strong link to V. macrocarpon refutes the hypothesis that V. oxycoccos is an autopolyploid descendant of V. microcarpum. The simplest explanation is that V. oxycoccos is an allotetraploid, and that V. macrocarpon was the ancestral cytoplasmic donor.
However, the chloroplast data challenges this simple explanation, as the V. oxycoccos genome appears to contain a mixture alleles from both diploid species. Vaccinium macrocarpon and V. microcarpum are distinguished by 100 chloroplast SNPs. Of these, V. oxycoccos had the "microcarpum" allele for 67 SNPs, and the "macrocarpon" allele for 33 SNPs. We are not aware of any mechanism that would enable recombination between chloroplast genomes from diploid ancestors to yield a mixture of genomes in the derived polyploid. This may indicate that our sampling was not adequate to capture the full scope of variation in the chloroplast genomes among the three species. That is, the alleles that are unique to V. macrocarpon and V. microcarpum in our data may actually be present in both species. Alternatively, a third diploid species may have been involved in the formation of tetraploid V. oxycoccos.
The possible existence of undocumented genetic diversity is further supported by the contrasting results from the PacBio sequencing. These data show that the Alaskan V. microcarpum sample possesses mitochondrial and chloroplast genomes that are identical to V. macrocarpon, and markedly different from the genomes of conspecific samples from Quebec. This could be explained by the preservation of multiple ancestral organelle lineages in the circumboreal V. microcarpum, including the one present in the more geographically restricted V. macrocarpon.
Alternatively, it could be an indication of cryptic variation in the V. oxycoccos/V. microcarpum group. In this context, it has been recognized the existence of a western cranberry species (Oxycoccos ovalifolius (Michx.) Porsild) [58], however, later authors have subsumed it within V. oxycoccos (along with V. microcarpum, [5]). Moreover, genetic studies of V. microcarpum have been geographically limited: Mahy et al. (2000) included only Alaskan V. microcarpum samples, and Smith et al. (2015) was restricted to western Quebec. The results of the current study suggest that a broader assessment of genetic diversity in diploid cranberries across their northern range may reveal taxonomically and agronomically valuable variation.

Conclusions and Further Directions
Our study of the organelle genome architecture of V. macrocarpon, V. microcarpum, and V. oxycoccos, and their diversity across North America, revealed a remarkable similarity between the cytoplasms of Alaskan V. microcarpum and cultivated V. macrocarpon. Moreover, we showed that a highly differentiated V. microcarpum can be found in eastern Canada, and suggested that the spatial and temporal dynamics of the last glaciation period could had shaped the genetic diversity among cranberry species. Further studies should investigate the genomic rearrangements observed in Canada's V. microcarpum samples as well as elaborate a more extensive sampling and genomic profiling of germplasm across Alaska and the "ice-free corridor". In summary, this study provided important insights about the movement and genetic differentiation of cranberry and its wild relatives in North America, which is of vital importance for the inclusion of wild germplasm into cranberry breeding programs.