Genomic Resource Development for Hydrangea ( Hydrangea macrophylla (Thunb.) Ser.)—A Transcriptome Assembly and a High-Density Genetic Linkage Map

: Hydrangea ( Hydrangea macrophylla ) is an important ornamental crop that has been cultivated for more than 300 years. Despite the economic importance, genetic studies for hydrangea have been limited by the lack of genetic resources. Genetic linkage maps and subsequent trait mapping are essential tools to identify and make markers available for marker-assisted breeding. A transcriptomic study was performed on two important cultivars, Veitchii and Endless Summer, to discover simple sequence repeat (SSR) markers and an F 1 population based on the cross ‘Veitchii’ × ‘Endless Summer’ was established for genetic linkage map construction. Genotyping by sequencing (GBS) was performed on the mapping population along with SSR genotyping. From an analysis of 42,682 putative transcripts, 8780 SSRs were identiﬁed and 1535 were validated in the mapping parents. A total of 267 polymorphic SSRs were selected for linkage map construction. The GBS yielded 3923 high quality single nucleotide polymorphisms (SNPs) in the mapping population, resulting in a total of 4190 markers that were used to generate maps for each parent and a consensus map. The consensus linkage map contained 1767 positioned markers (146 SSRs and 1621 SNPs), spanned 1383.4 centiMorgans (cM), and was comprised of 18 linkage groups, with an average mapping interval of 0.8 cM. The transcriptome information and large-scale marker development in this study greatly expanded the genetic resources that are available for hydrangea. The high-density genetic linkage maps presented here will serve as an important foundation for quantitative trait loci mapping, map-based gene cloning, and marker-assisted selection of H. macrophylla .


Introduction
Hydrangea is one of the most widely cultivated plants in the horticulture industry and is grown for floriculture, nursery, landscape, and cut flower markets. Introduced as a cultivated ornamental into Europe from Asia around 1788 by Sir Joseph Banks [1], Hydrangea macrophylla is now represented by more than 500 extant cultivars in the United States and Europe. Among the twenty-three Hydrangea species that are recognized, less than half are common in cultivation and the most economically important among these by far is H. macrophylla [2]. In recent years, hydrangeas have become a dominant product in the flowering shrub market in the United States (US), worth 91.3 million US dollars in 2014 [3]. The economic value has stimulated an increasing demand for specialized hydrangeas, such as the remontant type (blooming more than once per year), and thus has energized hydrangea breeding in the US.
As a typical woody plant, hydrangea breeding requires a long generation time in a conventional breeding process. Traits that are most important to H. macrophylla breeding are unique inflorescence architecture and flower color variation for all markets, floral induction (reblooming for landscape use and year-round floral production for florist's hydrangeas), drought and sun tolerance in landscape breeding, and powdery mildew resistance for greenhouse and landscape grown plants. Selections in hydrangea breeding historically have focused mainly on novel flower types that can be chosen in the first generation for incremental improvements [4]. An additional, current goal of hydrangea breeding is to introduce resistance to existing and emerging diseases and pests for nursery producers [5]. As a highly heterozygous horticultural crop, hydrangea breeding is mainly based on phenotypic selection, which is a long and difficult process, particularly when coupled with long generation times [4].
Conventional breeding can be improved and generation interval times can be reduced by incorporating marker-assisted selection (MAS) or marker-assisted breeding. One prerequisite of MAS is the development of genomic resources, such as genetic markers, that can be used to identify markers linked to the traits of interest to segregate populations through genetic mapping analyses [6]. Codominant markers such as microsatellites (simple sequence repeats, SSRs) and single nucleotide polymorphisms (SNPs) are markers of choice for such an application as they occur randomly throughout the genome in relatively abundant levels that can allow for estimation of additive and dominance allelic effects. However, little genetic information is currently known for H. macrophylla. Development of genetic information has been difficult given its highly heterozygous genomic composition, differing ploidy levels with base chromosome number (n = 18), and large diploid genome size (~2.2 Gbp) [7].
A previous transcriptomic study focusing on aluminum stress of hydrangea discovered thousands of SSRs; however, none were validated [8,9]. Only 39 SSR markers were available for hydrangea through 2010 [10][11][12]. Even with 226 newly developed SSR markers in 2018 [13], the total number of SSRs is still under 300, which is too low for most genetic studies in hydrangea. The insufficient number of molecular markers and unavailability of any genotyping platform has largely hampered the ability to perform linkage and association mapping for molecular-trait dissection studies in hydrangea. To date, only one linkage map was reported in hydrangea, but with low SSR marker density [13], which is much lower than the number of linkage maps reported in roses (eight) [14], peony (three) [15], and chrysanthemum (three) [16]. In addition to providing tools for MAS, high-density linkage maps could also be used to assist genome assembly in scaffolding and quality assessments [17]. Further marker development and high-density genetic linkage maps are needed for potential genetic improvement as well as genomic studies in hydrangea.
Transcriptome sequencing is an efficient and economical way to capture sequences of the genic regions in plants with complex, large genomes. Specialty crops such as H. macrophylla benefit from this approach, which creates massive expression datasets that include candidate coding sequences that can be used to produce an abundance of markers. Recent advances in the next-generation sequencing (NGS) methods have also made detection of SNPs, the most abundant marker type in the largest numbers, possible through high-throughput sequence-based genotyping. Sequence-based genotyping is a powerful tool for plant breeding and is especially helpful for those plant species that lack genome information [18]. Genotyping by sequencing (GBS) allows for the detection and genotyping of tens of thousands of SNPs in many individuals simultaneously, resulting in an unparalleled cost per data point when screening for codominant polymorphisms in large panels and for constructing highly saturated genetic maps [19]. GBS has been employed for linkage mapping in woody ornamentals such as roses [20] and apples [21]. In hydrangea, it has been used for genetic diversity [22] and association mapping studies [23,24], but no linkage map was constructed with this method.
Genomic resource development is urgently needed for future genetic mapping and quantitative genetics studies in H. macrophylla. Genomics approaches to hydrangea breeding offer opportunities to make immediate advances in trait improvement, especially for complex traits such as environmental tolerance and disease resistance. The objectives of this study were to generate genomic resources through transcriptome sequencing and genotyping by sequencing for H. macrophylla, and to demonstrate the utility of these resources with a high-density linkage map.

Plant Materials and Sample Preparation
Two important hydrangea cultivars, Veitchii and Endless Summer, were chosen as representatives for genomic resource development. 'Veitchii' is a lacecap cultivar used for the cut flower market and is tolerant to powdery mildew and Cercospora leaf spot but only flowers once a year (once-flowering), while 'Endless Summer' (formerly also identified as 'Bailmer') is a mophead cultivar which can flower twice or more per year (remontant) and is popular in the floriculture market but susceptible to powdery mildew [25,26].  panels and for constructing highly saturated genetic maps [19]. GBS has been employed for linkage mapping in woody ornamentals such as roses [20] and apples [21]. In hydrangea, it has been used for genetic diversity [22] and association mapping studies [23,24], but no linkage map was constructed with this method. Genomic resource development is urgently needed for future genetic mapping and quantitative genetics studies in H. macrophylla. Genomics approaches to hydrangea breeding offer opportunities to make immediate advances in trait improvement, especially for complex traits such as environmental tolerance and disease resistance. The objectives of this study were to generate genomic resources through transcriptome sequencing and genotyping by sequencing for H. macrophylla, and to demonstrate the utility of these resources with a high-density linkage map.

Plant Materials and Sample Preparation
Two important hydrangea cultivars, Veitchii and Endless Summer, were chosen as representatives for genomic resource development. 'Veitchii' is a lacecap cultivar used for the cut flower market and is tolerant to powdery mildew and Cercospora leaf spot but only flowers once a year (once-flowering), while 'Endless Summer' (formerly also identified as 'Bailmer') is a mophead cultivar which can flower twice or more per year (remontant) and is popular in the floriculture market but susceptible to powdery mildew [25,26]. In order to get broader representation in gene expression and better coverage of the genome, four tissue types (leaves, stems, buds, and open flowers) were collected for transcriptome sequencing from three year old 'Veitchii' and 'Endless Summer' plants grown in three gallon plastic containers (24.1 × 27.9 × 27.9 cm) containing composted pine bark medium under drip irrigation and 50% shade at the United States Department of Agriculture Agri  For genotyping by sequencing, an F 1 mapping population comprising 90 individuals, as well as the two parents, 'Veitchii' and 'Endless Summer', were used to construct a genetic linkage map. Trifoliate leaves of both parents and the F 1 population were collected into 2 mL Eppendorf tubes with lysate buffer and ground in a homogenizer (FastPrep-24 5G; MP Biomedicals, Santa Ana, CA, USA). Genomic DNA was isolated from the lysed tissue using the DNeasy Plant Mini Kit, followed by RNase treatment (Qiagen, Valencia, CA, USA). Nucleic acid was evaluated in 1% agarose gel and quantified using a spectrometer (NanoDrop 2000; Thermo Scientific, Wilmington, DE, USA). Enzyme activity was evaluated using 6 U of EcoRI on 300 ng DNA of eight randomly selected samples for 2 h at 37 • C and separation using gel electrophoresis on a 1% agarose gel. A lyophilized aliquot of 1.5 mg DNA was prepared for each sample for subsequent SSR genotyping and GBS.

Transcriptome Sequencing, Assembly, and Analyses
Total RNA from each cultivar was pooled in approximately equimolar amounts prior to construction of a customized cDNA library enriched for polyA and 5 cap containing transcripts, with the double-stranded cDNA library intermediate partially normalized by DSN treatment (Evrogen, Moscow, Russia) to reduce representation of the high abundance transcripts. Prior to adaptor ligation, the library was sheared by nebulization to optimize template length distribution for the sequencing system and assessed on a Bioanalyzer DNA7500 chip (Agilent; Santa Clara, CA, USA). According to the manufacturer (Roche/454 Sequencing, Branford, CT, USA) emulsion PCR was performed and the library sequenced to below saturation on a GS FLX Titanium PicoTitreTM plate to 800 cycles.

Transcriptome Assembly
Sequence reads were cleaned of adaptor sequences [27] and assembled into isotigs using NEWBLER v2.3 GS De Novo Assembler [28] (Roche 454 Sequencing, Branford, CT, USA) with default parameters for cDNA (40 bp overlap; 90% identity). TransDecoder [29] was used to predict open reading frames (ORFs) from the isotig assembly. Predictions were improved by including the Pfam matching information. The matches were generated by comparing the isotigs to the Pfam database version 3.1b1 [30] with HMMer hmmscan version 3.1b2 ("HMMER" 2016) [31]. The nucleotide sequences of the isotigs were compared to the Uniprot Swissprot database and the plant proteins from Uniprot Trembl [32] with BLAST version 2.2.26. Hits were filtered at an e-value of 0.00001. InterProScan version 5.4-47.0 [33] was run for all protein sequences. Assemblies were assessed for completeness with BUSCO version 1.2 and the early access plant dataset [34]. Raw sequence data were uploaded to NCBI under BioProject PRJNA661039.
Primers were designed and used to amplify SSR loci in genomic DNA from 'Veitchii' and 'Endless Summer' using a modified three-primer protocol originally described in Waldbieser et al. [36]. In summary, DNA was extracted from 1 cm 2 pieces of fresh leaf tissue using Qiagen Plant Mini Kit (Qiagen). Genomic DNA was quantified using a NanoDrop Spectrophotometer (Nanodrop Technologies, Wilmington, DE, USA). Forward primers were 5 tailed with the sequence 5 -CAGTTTTCCCAGTCACGAC-3 to permit product labeling, and reverse primers were tailed at the 5 end with the sequence 5 -GTTT-3 to promote non-template adenylation [37]. A primer, 5 -CAGTTTTCCCAGTCACGAC-3 , was labeled with 6-carboxy-fluorescein (FAM) (Integrated DNA Technologies, Coralville, IA, USA) and added to the amplification reaction, which included Advantage2 Taq DNA Polymerase (Clontech, Mountain View, CA, USA) according to previously published proto-cols [6]. Fluorescence-labeled PCR fragments were visualized by automated capillary gel electrophoresis on an ABI3100-Avant (Applied Biosystems, Foster City, CA, USA) using a ROX-500 size standard. GeneMapper version 4.0 (Applied Biosystems) was used to recognize and size peaks. Loci that produced high quality data, that were polymorphic between 'Veitchii' and 'Endless Summer', and that produced allele combinations compatible with the pseudo test cross strategy were selected for linkage mapping with the mapping population.

Genotyping by Sequencing and SNP Identification
GBS was performed at the University of Wisconsin-Madison Biotechnology Center (UWBC; Madison, WI, USA) following the protocols described in Elshire et al. [19]. The 4-base cutter enzyme, ApeKI (G * CWGC), was found to be efficient in producing fragments that were < 500 bp in sample DNA digestion and was selected for library preparation. The digested DNA fragments were then ligated with optimal barcodes and common adapters followed by standard PCR. Single-end sequencing (100 bp) of the library was performed on the Illumina HiSeq 2000 (Illumina Inc. San Diego, CA, USA). Raw sequence data for the total library were uploaded to NCBI under BioProject PRJNA656177.
The reference-free Java program TASSEL-UNEAK [38] was utilized for GBS data analysis given there was no reference genome available for H. macrophylla. Briefly, the barcodes were first removed from the raw sequence reads. The remaining sequences were analyzed by the TASSEL-UNEAK pipeline to further produce 64 bp length sequence tags and aligned with each other based on unique sequence tags. SNP discovery was performed directly within pairs of matched sequence tags and filtered through network analysis. The minimum tag count required for output was set to 5. To reduce the sequencing error, the error tolerance rate was set to 0.03. Raw SNPs were filtered to ensure high-quality genotype calling using Trait Analysis by Association, Evolution and Linkage (TASSEL) version 5.0 [39]. SNPs that had more than 10% missing data and minimum minor allele frequency (MAF) of 0.05 were excluded from the dataset, individuals that possessed more than 20% missing data were filtered out. SNPs that were polymorphic between the parents of the mapping population were retained for linkage mapping.

Linkage Map Construction
The allele data for all selected markers were tested for segregation distortion and genotypic data similarities. The linkage map was created using the cross-pollinated (CP) model in JoinMap v5.0 [40]. In the case of SSRs, five segregation types (lm × ll, nn × np, hk × hk, ef × eg, and ab × cd) were used based on the software instructions, while three segregation types (lm × ll, nn × np, and hk × hk) were used to code SNPs. Distortion of each marker from their expected frequencies was observed by a goodness-of-fit (Chi-square) test. Identical progenies, identical markers, and severely distorted loci (p < 0.001) were observed and excluded from mapping construction. Linkage groups (LGs) were constructed using the regression mapping method with default parameters. The independence LOD (logarithm of the odds) score was set from 2 to 15 for grouping purposes and groups were selected at the LOD score that produced group numbers equivalent to the number of chromosomes. Map distances within individual linkage groups were converted to centiMorgans (cM) using the Kosambi mapping function.
The two-step strategy was used to construct the linkage map. First, alleles segregating only in the female (lm × ll) or male (nn × np) parents were separated as individual datasets using the embed grouping function. Individual maps for each of the parents were constructed using the parameters mentioned above. For consensus map construction, the maternal and paternal maps were integrated by combining linkage groups sharing the same SNP alleles classified as (hk × hk). The map construction, marker order, and distance for each LG in the consensus map were calculated using the same parameters as those for the parental maps. The consensus map was graphically represented using MapChart V2.2 [41].

Transcriptome Analysis and EST-SSR Identification
The hydrangea libraries were constructed from normalized cDNA and generated a total of 1.17 million reads (444 Mb) that were de novo assembled into 42,682 isotigs, putative transcripts, comprising 24,201 isogroups, and putative unigene sets of transcripts. Open reading frames (ORFs) were found in 42,620 isotigs (>99%). Functional annotation of these data was performed using BLAST to publicly available databases. This resulted in matches for 67% of the isotigs against the SwissProt database [42] and for 85% of the isotigs against plant proteins from the Trembl database [32]. InterProScan [33] was used to gain further functional information in protein sequences derived from the ORFs, estimating 199,709 additional putative functions for 34,389 of the isotigs. InterProScan also yielded 171,059 gene ontology (GO) term [43] assignments with assignments of 1679 unique terms to 21,818 isotigs. To assess the completeness of the transcriptome, the software BUSCO was utilized. Based on the analysis, the H. macrophylla assembly includes coverage of 92% of the 956 BUSCO conserved genes (429 as complete single copy transcripts, 358 as complete but multiple transcripts, and 91 as fragmented genes).
EST-SSR development identified 30,169 new SSR loci for H. macrophylla. Primers could be designed for 5986 loci and 2794 loci in the contigs and singletons (non-assembled reads), respectively. The total number of new SSR loci identified was 8780 (Supplementary Table S1). The majority of these loci contained tri-nucleotide repeats (52.82%) as would be expected for coding regions. The next largest class of repeat motifs was di-nucleotide (37.15%) ( Table 1). The remaining (<10%) motifs were larger than tri-nucleotides.

EST-SSR Genotyping
A selection of the identified SSR loci (1535) were evaluated with genomic DNA from 'Veitchii' and 'Endless Summer'. Of these, 1226 produced high-quality data, but 35% of those loci were monomorphic and therefore not suitable for genetic mapping purposes. A total of 267 SSRs were selected for genetic mapping of the F 1 population. Among them, 115 (43.1%) showed maternal segregation (lm × ll) and 98 (36.7%) showed paternal segregation (nn × np), while 53 (19.9%) were heterozygous in both parents (44 with four distinct alleles, ab × cd type, and 9 with 3 alleles, ef × eg type), and one (0.3%) was partially informative as both parents had the same heterozygous genotype (hk × hk). Information of the 267 markers are highlighted in Supplementary Table S1.

GBS-SNP Discovery
The GBS library generated an average of 2.7 million reads per F 1 progeny and the TASSEL-UNEAK pipeline analysis resulted in 376,153 raw SNPs. By removing SNPs that showed MAF < 0.05 and more than 10% missing data, the total SNP number was reduced to 12,214. Extra criteria were also used to assure the SNP quality as follows: SNPs that were missing in either parent or homozygous in both parents were removed from further analyses. Furthermore, SNPs showing segregation patterns in the progeny that were not in agreement with the parental genotypes were discarded from the mapping process. Finally, a group of 3923 high quality informative SNPs were obtained, consisting of 1160 SNPs (29.6%) segregating in the female parent (lm × ll), 881 SNPs (22.5%) segregating in the male parent (nn × np), and 1882 SNPs (47.9%) that were heterozygous and segregating in both parents (hk × hk). The SNP sequence information is presented in Supplementary Dataset S1.

Construction of Genetic Linkage Maps
A total of 4190 high quality markers were used for linkage mapping, including 267 SSR markers developed from transcriptome sequencing and 3923 SNP markers developed from GBS (Supplementary Table S2). Of them, 472 markers were identical to other markers and were excluded from further analyses. No identical samples were found in the mapping population.
The maternal and paternal maps were determined at LOD of 8, with each map containing 18 LGs. The maternal map consisted of 1686 markers with a total genetic length of 1396.9 cM, while the paternal map consisted of 1396 markers with a total genetic length of 1400.8 cM (Supplementary Table S3). The maternal and paternal map visualizations can be found in Supplementary Figure S1. Generally, the paternal map tended to have smaller sizes for each of the linkage groups, which resulted in an overall smaller size than the maternal and consensus maps. When comparing the markers of individual LGs between consensus map and parent maps, the maternal map contributed most of the markers on LGs 5,8,9,10,14,15, and 17 of the consensus map, while the paternal map contributed largely to LGs 2 and 7 of the consensus map. The LGs 1, 3, 4, 6, 11, 12, 13, 16, and 18 of the consensus map had a similar number of markers derived from the corresponding LGs of each parent map.
A consensus map was generated that contained a total of 1767 markers, consisting of 1621 SNP and 146 SSR markers. The total length of the consensus map was 1383.4 cM with LG 14 (108.5 cM) being the largest group and LG 9 (39.8 cM) being the smallest group ( Table 2). The number of markers per linkage group varied from 152 (LG 5) to 49 (LG 6), with an average of 98 markers per linkage group. Most LGs were made of both SNP and SSR markers, with LG 17 being an exception and only containing SNPs. Four large gaps, that were more than 10 cM, were observed on LG 8 (10.0 cM), LG 12 (11.5 cM), LG 13 (13.5 cM), and LG 17 (11.5 cM). Overall, the average distance between markers was 0.8 cM.

The distribution of SNPs and SSRs on consensus map
LGs is shown in Figure 2.

Discussion
Recently, hydrangea has been gaining popularity as an attractive plant for public and private gardens and has become an important ornamental crop for nursery growers. MAS could directly accelerate the breeding of new hydrangea cultivars to meet the consumer

Discussion
Recently, hydrangea has been gaining popularity as an attractive plant for public and private gardens and has become an important ornamental crop for nursery growers. MAS could directly accelerate the breeding of new hydrangea cultivars to meet the consumer needs for new and novel plants [44]. Previous studies on hydrangeas mainly focused on phenotypic evaluations on a physiologic level, which have not let to an acceleration in the hydrangea breeding progress. The lack of genomic resources is a common restraint to molecular breeding in nursery crops, and H. macrophylla is no exception to this trend. The rapid development of sequencing technology provides opportunities of marker development not only in a fast and convenient way, but also at low cost. In order to expand the genomic resources of H. macrophylla, the present study implemented a transcriptomics study for SSR development, as well as a sequencing-based marker discovery method, GBS, to identify SNP markers in the population developed by crossing these two cultivars for genetic linkage mapping.
A total of 8780 SSRs were identified through the transcriptomics of important hydrangea cultivars, providing the most extensive set of genic sequences so far for this species. Among the identified SSRs, 1535 of them were tested and 267 were found to be polymorphic. SSR markers are robust tools for genetic mapping and molecular breeding applications. The polymorphic SSRs validated in this study significantly increased the availability of useful markers for genetic studies in H. macrophylla. In addition, high transferability of SSRs were reported between hydrangea species [10,11,45], so the discovered SSRs here should immediately enable multiple molecular applications in the numerous related species.
GBS offers an efficient way to decipherer hydrangea genetics as it can discover SNPs and facilitate genotyping at the same time [19]. GBS has been widely used in popular ornamental crops, such as roses [20], for linkage mapping and subsequent quantitative trait loci (QTL) analyses. In hydrangea, valuable traits such as inflorescence shape, double flower, and recurrent blooming are thought to be quantitatively inherited and suitable for QTL studies to identify markers for MAS [13,23]. The genotyping of two hydrangea cultivars and their F 1 progeny identified 3923 SNP markers through a GBS approach, providing the possibility to facilitate construction of a high-density linkage map for future QTL studies in hydrangea.
Genetic linkage maps are powerful tools to locate genes and QTL regions. As a perennial woody plant, linkage map construction in hydrangea is more complicated than that of homozygous inbred annual plants due to its high heterozygosity and self-incompatibility. F 1 full-sibling populations have become a choice for linkage maps in woody plant species. In this study, an F 1 population was derived from the cross of two important cultivars, Veitchii and Endless Summer, and was used to build the linkage map of hydrangea with 1767 molecular markers. The genetic map contained 18 linkage groups, corresponding to the haploid number of chromosomes identified in the karyotype of diploid H. macrophylla (2n = 36) [7]. To date, only one linkage map was reported in hydrangea and it included only 147 SSRs [13]. The linkage map developed in this study consists of 146 SSRs and 1621 SNPs (Table 2), with an average of 98 markers per linkage group, which was far more than the markers used in the previous study [13]. The total length (1383.4 cM) and small marker interval (0.8 cM in average) indicated the high quality of the developed map when compared to the total genetic distance of 980 cM and average marker interval of 6.8 cM in the previously reported map [13]. With more total mapped markers, a larger genetic distance, and the small marker interval, the reported linkage map represents the most saturated genetic map, and will serve as a valuable tool for future genetic and breeding study in hydrangea.
All of the linkage groups identified in the current map consist of both SSR and SNP markers. However, the selected 267 SSRs alone were not sufficient to fully construct a linkage map of hydrangea with the number of linkage groups equal to the number of chromosomes. Studies in woody ornamentals, such as roses [20] and blueberry [46], have used SSRs as the bridge for crosstalk between genetic maps, thus, more SSRs need to be validated in order to build a high-density consensus map using multiple hydrangea populations. Several regions in one linkage group showed higher density than other regions, which may indicate the uneven marker polymorphism and recombination frequency on a certain chromosome between the mapping parents. This phenomenon has also been observed in grapes [47] and Mei [48]. Though the average genetic distance was 0.8 cM, four gaps located on LG 8, LG 12, LG 13, and LG 17 were found to be larger than 10 cM (Table 2, Figure 2). Large gaps in genetic maps are usually caused by low marker detection and polymorphism in similar chromosome segments of mapping parents or centromeric regions of the chromosome [49], which could be filled by a high-density consensus map construction using different marker types [50].
Despite the substantial value of the industry, technologies available to augment the development of new and improved ornamental plants lag far behind most other agricultural crops. Paucity of genomic resources such as reference genome sequences and comprehensive transcriptome datasets are especially problematic for breeding programs focused on woody ornamentals. Challenges posed by long juvenile periods, obligate or nearly obligate out-crossing breeding systems, inbreeding depression, high heterozygosity, and poor understanding of the genetic control of traits variation are major impediments for effective application of traditional breeding approaches to improve woody ornamental cultivars. Genome-enabled research and breeding programs would greatly accelerate the development of new durable and high-value cultivars. MAS provides the opportunity to assess young plants at the genotype level for multiple traits of interest, which can greatly reduce costs associated with growing the plants to maturity. The mapping parents used in this study, 'Veitchii' and 'Endless Summer', are two important hydrangea cultivars that have contrasting phenotypes in inflorescence type, flowering habits, powdery mildew tolerance, and drought and cold stress, and the markers developed in this study will be useful for interrogation of these traits in the future. Integration of these traits in cultivars through conventional breeding methods has proved difficult but it is likely more feasible through MAS.

Conclusions
This study performed large-scale SSR and SNP marker discovery in a F 1 population through transcriptome and genotyping by sequencing for genetic map construct in hydrangea. The consensus linkage map contained 1767 positioned markers (146 SSRs and 1621 SNPs), spanned 1383.4 centiMorgans (cM), and was comprised of 18 linkage groups, with an average mapping interval of 0.8 cM. This study provided not only the most comprehensive marker resource currently available in hydrangea, but also an important foundation for QTL mapping, map-based gene cloning, and MAS of H. macrophylla. For traits where variation is readily available in cultivated germplasm, this work defined the strategies for improving hydrangea through conventional breeding. It is these traits where advanced genetic methods such as molecular markers could increase efficiency in long term, high risk projects such as germplasm improvement and new cultivar development.   18 LGs in maternal, paternal, and consensus maps. Supplementary Dataset S1-Information of SNP sequence discovered by genotyping by sequencing.