Exploring Genetic Diversity in an Ilex crenata Breeding Germplasm

: Knowledge of genetic identity, genetic relationships, ploidy level, and chromosome numbers can enhance the efﬁciency of ornamental plant breeding programs. In the present study, genome sizes, chromosome numbers, and genetic ﬁngerprints were determined for a collection of 94 Ilex accessions, including 69 I. crenata . The genome size of the entire collection ranged from 1.50 ± 0.03 to 8.01 ± 0.18. Within the species of I. crenata , genome sizes varied (mean ± sd) between 1.98 ± 0.08 and 2.30 ± 0.04, with three outliers: 3.06 ± 0.04, 4.04 ± 0.09, and 4.19 ± 0.08. The chromosome counting results showed 2n = 40 for I. crenata accessions and conﬁrmed the outliers as one triploid and two tetraploids. A high intra-speciﬁc genetic diversity in Ilex crenata was found, after genetic ﬁngerprinting using genotyping-by-sequencing (GBS). The species I. crenata was separated into three clades, which coincided with intraspeciﬁc differences in genome sizes (mean ± sd) of 2.09 ± 0.006, 2.07 ± 0.05, and 2.19 ± 0.06 pg/2C per clade as mean values for the diploids. Applying a principal coordinate analysis (PCoA) to the genetic ﬁngerprinting data of all species in the collection revealed a wide genetic variation, which has not yet been commercially exploited. These ﬁndings could form the basis for selectively breeding parents, in order to create more genetic diversity via intra-and interspeciﬁc crosses.


Introduction
The genus Ilex belongs to the monogeneric family Aquifoleaceae and is a large dioecious genus with 571 accepted species [1].The genus can be found in regions of mesic growing conditions around the world.While most Ilex species originate from East Asia and South America, a multitude of species can also be found in Southeast Asia and North America.Only four species occur in Europe, one occurs in Africa, and another species occurs in Northern Australia.A few species can be found in Hawaii, the Caribbean, the Canary Islands, the Azores, Madeira, New Caledonia, and Fiji [2].Previous phylogenetic studies have used fossil records [2][3][4] and nuclear and plastid markers to unravel the evolution of the Ilex genus, revealing a complex evolutionary history.Moreover, the taxonomy of the genus is not yet fully resolved.There are multiple heterotypic and homotypic synonyms within the genus, which occasionally cause confusion.For example, I. maximowicziana var.kanehirae (Yamamoto) Yamazaki and I. crenata var.mutchagara (Makino) Hara are two names given to the same plant [5].The same applies to I. leucoclada (Maxim.)Makino and I. integra var.leucoclada Maxim.[1].Recently, the increasing interest in the genus and decreasing cost of DNA sequencing have led to reference genomes for I. latifolia [6], I. polyneura [4], and I. asprella [7].In some parts of the world, Ilex species such as I. opaca, I. vomitoria, and I. paraguariensis are used for brewing teas, while I. asprella is used as a medicinal plant [8].I. aquifolia (common holly, English holly, European holly, or Christmas holly) is well known as an ornamental garden plant, with typical wintergreen branches and red berries used as a Christmas decoration.Other hollies used for their ornamental properties include I. polyneura and I. crenata (box-leaved holly or Japanese holly).I. crenata resembles boxwood (Buxus L.) and is also used for hedges and topiary [9].
I. crenata originates from Japan, Korea, China, the Sakhalin Islands, the Kuril Islands, Taiwan, the Philippines, and the Himalayas.In Japan, the species naturally grows from sea level up to an altitude of 1000 m, with local populations exhibiting wide morphological variation [9].The species was introduced to the garden of the Russian Czar Alexander II in 1864 by Russian botanist Carl Maximowicz.Starting from that introduction, distribution spread throughout Europe.In 1898, I. crenata was introduced to the United States.This evergreen, small-leaved shrub gained popularity in the 1930s as an ornamental plant.Since then, I. crenata has been introduced multiple times to other parts of the world, and the plant has been extensively used for breeding [5], leading to the release of numerous new cultivars.These cultivars were mainly lucky findings, selected based on phenotypic variation and subsequently vegetatively propagated [9,10].With more targeted breeding, value could be added, by overcoming abiotic and biotic stress challenges, such as high soil pH and infection with black root rot caused by the soil-dwelling fungus Berkeleyomyces basicola.Knowledge about the genetic relations and variation in existing species and cultivars could guide breeders selecting breeding parents, thus paving the way for novel or improved, and thus commercially marketable, varieties.Moreover, knowledge of the ploidy level and chromosome number of genotypes can help to predict cross-compatibility.The genome sizes of I. cornuta, I. mucronata, and I. verticillata have previously been reported in the range of 1.31 pg/2C [11] to 4.13 pg/2C [12].The Chromosome Count Database describes 41 Ilex species ranging from 34 = 2n for I. crenata and I. leucoclada [13,14] to 120 = 2n for I. pedunculosa [15,16].For both I. crenata and I. leucoclada, Galle reports the chromosome number to be 40 = 2n [9].Darlington and Wylie mention a basic number of x = 9 or 10 [17].
Some efforts have been made to clarify the phylogeny of the different species within the genus, but knowledge about the underlying genetic variation within the species I. crenata is still limited.
The present study included an investigation into the genetic diversity within the species I. crenata and a selection of other hardy Ilex species.The genetic identity of the plants in the collection was also examined in light of their taxonomy.Naming mistakes can occur for many reasons, including ex situ conservation over the years, selling plants under an incorrect name for economic reasons, erroneous determinations, etc.We assembled a collection of 94 accessions, comprising 69 accessions labeled as I. crenata and 28 accessions from 17 other species.The genomic constitution was characterized at three levels: (1) genome sizes were determined using flow cytometry, (2) the chromosome number and ploidy level were investigated through fluorescence microscopy, and (3) genotyping-bysequencing (GBS) [18] was performed as a reduced-representation genome sequencing technique, which allowed for high-resolution genetic fingerprinting using thousands of genetic markers spread across the genome.

Collection of Plant Material for the Ilex Collection
A germplasm collection was compiled of 94 Ilex accessions, belonging to 18 species and hybrids including I. x attenuata, I. aquifolium, I. x aquipernyi, I. crenata, I. glabra, I. integra, I. leucoclada, I. macrocarpa, I. x makinoi, I. maximowicziana, I. x meserveae, I. mutchagara, I. opaca, I. rugosa, I. serrata, I. sugerokii, I. vomitoria, and I. yunnanensis.The collection also contains a total of 69 accessions labeled as I. crenata (Table 1).Plant material from I. crenata Table 1.Overview of the Ilex collection.The table shows all the accessions and for every genotype: the accession number used at ILVO; the species and cultivar name; the collection of origin, along with the accession number from the arboretum; the mean genome size; and the chromosome number.Genetically identical accessions are indicated with the same superscript letter (a-d), accessions for which a naming problem was discovered are indicated with an asterisk.

Genome Size of the Collected Ilex Accessions
The genome size of 91 accessions in the collection was measured using flow cytometry (Table 1), according to the methods described by Van Oost et al. 2021 [19].Four accessions did not develop rooted cuttings and could not be used.The genome size is expressed as pg/2C.Terminology used is according to Greilhuber et al. [20].All samples were analyzed using a Quantum P flow cytometer and CyPAD software (Quantum Analysis, Münster, Germany).For all genotypes, except IL104, the internal standard used was Zea mays, with a known genome size of 5.43 pg/2C [21].Pisum sativum (9.09 pg/2C [22]) was used as an internal standard for IL104, because of its deviant genome size.Young leaf material of the sample and standard was mixed, co-chopped, and used for sample preparation with a CyStain PI kit (Sysmex, Münster, Germany), according to the manufacturer's protocol with minor modifications: using 0.4 mL extraction buffer, and 1.2 mL staining buffer, to which 1% (w/w) PVP10 was added.The stained samples were incubated in the dark at 4 • C for at least 20 min.At least three replicates were analyzed for every genotype, preferably from three plants of this genotype and on three different days.The sample genome size was derived by calculating the ratio of the peak positions of the sample and internal standard in the histograms of both the FL2 and FL3 detectors.Mean values and standard deviations of the sample genome size were calculated as the mean of a minimum of six values, specifically the peak position ratio of both detectors for at least three replicates.

Chromosome Number of the Collected Ilex Accessions
Based on the flow cytometry results, a subset of 17 I. crenata genotypes (Table 1), covering the full range of genome sizes, was selected to count the chromosomes.The chromosomes of the following species were also counted: I. x aquipernyi (IL001), I. glabra (IL012), I. maximowicziana var.kanehirae (IL013), I. x meserveae (IL046), I. x makinoi (IL051), I. rugosa (IL098), and I. leucoclada (IL099) (Table 1).Young root tips of newly rooted cuttings were harvested and incubated in 0.1% colchicine for 3 h at room temperature, to arrest mitosis at the metaphase stage.Subsequently, the root tips were fixed in 3:1 ethanol:acetic acid for 1 h at room temperature.Cell suspensions were made by digesting the root tips in 0.6% enzyme solution (0.6% cellulase, 0.6% pectolyase, and 0.6% cytohelicase in 0.1 M citrate buffer (0.1 M sodium citrate tribasic dehydrate, and 0.1 M citric acid)) at 37 • C for 45 min.Chromosome slides were prepared using the SteamDrop protocol [23], using 1:1 ethanol:acetic acid solution for both fixation steps.Subsequently, the chromosome slides were stained with 1% DAPI (100 g/mL) diluted in Vectashield mounting medium and analyzed with a fluorescence microscope (AxioImager M2, Carl Zeiss MicroImaging, Belgium) at 1000× magnification.Images were captured using an Axiocam MRm camera and ZEN software (Carl Zeiss MicroImaging, Belgium).Chromosomes were counted on at least five images of well-spread metaphase chromosomes per genotype using DRAWID v0.26 [24].

GBS Fingerprinting of the ILEX Collection 2.4.1. Library Preparation and Sequencing
DNA was extracted from all genotypes of the collection using a modified CTAB protocol [25].Preparation of genotyping-by-sequencing (GBS) libraries and sequencing was performed by LGC Genomics (Berlin, Germany), using the following steps: (1) quality control (agarose gel check); (2) library preparation, including indexing and quality control using a double digest with PstI and MseI; and (3) PE-150 sequencing using an Illumina NextSeq 500/550 instrument.

GBS Data Analysis
All data analysis steps described below are also available, together with accompanying scripts, on https://gitlab.com/ilvo/GBS_Ilex(accessed on 25 May 2021) and on Zenodo with doi 10.5281/zenodo.7669149under an MIT license.
Data preprocessing.The obtained GBS reads were processed using GBprocesS v3.0.3 [26].This software makes use of Cutadapt [27] and PEAR [28], and comprises the following steps: (1) trimming of adapters, barcodes, restriction site remnants from 5 -and 3 -ends, and discarding reads that are shorter than 30 bp using Cutadapt; (2) merging of forward and reverse reads with a minimum overlap of 10 bp and a minimal final size of 40 bp with PEAR [28]; (3) filtering of reads that have a mean Phred quality score lower than 25 or two consecutive bases with a Phred quality score below 20 and discarding reads containing Ns; and (4) filtering of reads with intact internal restriction enzyme recognition sites (due to partial restriction digest).The raw PE-150 read data, as well as the merged preprocessed read data, were submitted to NCBI's Sequence Read Archive (SRA) under the BioProject number PRJNA895194.
Loci identification.Data were analyzed reference-free, using GIbPSs software [29].Unless mentioned otherwise, standard settings were used and the recommended workflow of GIbPSs was followed.More detailed information can be found in the accompanying software repository, as mentioned above.Briefly, loci were identified in all merged reads using indloc, poploc, and indpoploc, and errors were corrected using indloc.Loci of lengths between 32 and 300 bp were selected using data selector.Loci containing indels were detected using indelchecker and discarded.Loci that were extremely deeply sequenced were identified using depth analyzer and discarded.The minimum value for the median depth percentage and the maximum value for the median scaled depth were set to 1 and 0.15, respectively.Additionally, split loci were removed from the dataset.Finally, a GIbPSs genotype call table was exported, containing the locus genetic information for all samples.
Determination of common loci and allele similarity.All samples were compared to each other using a custom python script available in the script repository [30], which uses the GIbPSs genotype call table as input.This script calculates (1) the number of loci per sample; (2) the number of common loci between every pair of samples and; (3) the allele similarity of the common loci of every pair of samples.The allele similarity is defined as the percentage of common loci that share at least one allele.An allele similarity of 100% means that all common loci have at least one shared allele (=a locus sequence with a combination of SNPs) (see Supplementary Materials).The results are represented in a false-colored similarity matrix, displaying the total number of loci for every genotype on the diagonal, the number of loci in common between two genotypes in the upper half of the matrix, and the allele similarity in the lower half of the matrix.
Concatenated alignment and phylogeny.Using a custom perl script available in the script repository, all SNP-containing loci from the GIbPSs genotype call table were concatenated and saved as an alignment file.In addition, a text file that kept track of the original locus (and positions in that locus) in the concatenated alignment was saved.Next, again using a custom perl script, positions where more than 20% of the samples contained missing data were filtered out.Invariant positions were removed using the 'ascbias.py'script (available at https://github.com/btmartin721/raxml_ascbias)(accessed on 25 May 2021).The resulting filtered concatenated alignment was used for (1) principal coordinate analysis (PCoA); and (2) phylogenetic tree construction.The PCoA was performed by reading the multiple alignments in R v4.2.2 in R Studio Server v2022.07.2 build576, using the Biostrings package.Next, the pairwise distance between all individuals was calculated using the Hamming distance (Decipher package), while ignoring gap-letter matches and performing the PCoA using the 'cmdscale' function (stats package).The R scripts are available in the script repository.This analysis was performed for the entire collection of 94 individuals and two technical replicates, as well as for the 69 individuals in the I. crenata group only and one technical replicate.A phylogenetic tree was constructed containing the collection of all 94 individuals using the maximum likelihood method with RAXML v8.2.12, including a GTRCAT model without rate heterogeneity (-V), with ascertainment bias correction (-asc-corr=lewis), and including 100 bootstraps.I. vomitoria 'Hoskins Shadow' (IL095) was used as an outgroup.The tree was then visualized using iTOL v6 [31].

Results and Discussion
This study explored the genetic diversity within the genus of Ilex, more specifically within the species I. crenata.We assembled a large collection of 94 Ilex accessions from arboreta and commercial growers, 69 of which were labeled I. crenata, as well as three others with other names that were genetically identical to I. crenata.For the accessions in this collection, we delineated the genetic identity, investigated the genetic relationships, and studied the genetic diversity between the accessions of the collection.

Genetic Identity
GBS was performed on all 94 accessions of the Ilex collection.The number of preprocessed merged reads per sample (see Materials and Methods above) ranged from 1.8 M to 8.4 M, with a median of 3.6 M. Reference-free locus delineation was performed using GIbPSs and resulted in a total of 218,024 loci for all samples combined.Rarefaction analysis of a few selected samples revealed that the number of identified loci using GIbPSs reached a plateau at ca. 3-4 M reads, suggesting that most of the samples had been sequenced deep enough to reach locus saturation (data not shown).The number of identified loci per sample ranged from 9513 to 53,403, with a mean of 18,079.These loci had a mean length of 184 bp (range 40 to 270 bp), a mean read depth of 57, and a median read depth of 20.In total, 70% of the loci had a mean read depth per sample of at least 15.The GBS loci were compared between samples in a pairwise manner to estimate the number of common loci and the fraction of loci with shared alleles (for definitions of 'common loci' and 'allele similarity', see Materials and Methods and [30]).The heat map (Figure S1) shows clear clusters, indicating the genetic substructure in the Ilex collection.The lowest number of common loci between species was 2880 for I. crenata 'Luxus' (IL122) and I. macrocarpa (IL103); the highest number was 11,249 for I. crenata 'Pride's Tiny' (IL035) and I. sugerokii var.longipedunculata (IL097).Within the species I. crenata, the lowest number of common loci between two accessions was 6340 for I. crenata 'Luxus' (IL122) and I. crenata 'Nummularia' (IL107); the highest was 15,219 for I. crenata 'Braddock Heights' (IL016) and I. crenata 'Pride's Tiny' (IL035).The lowest allele similarity between species was 8.85% between I. macrocaropa (IL103) and I. vomitoria 'Will Fleming' (IL108); the highest allele similarity between species was 60.25% for I. opaca (IL100) and I. integra (IL102).
Next, we analyzed the allele similarity of replicates, to set a threshold for identifying genetically identical accessions across the Ilex collection.For two accessions (IL014 and IL051), we used two technical replicates to analyze the reproducibility of GBS fingerprinting.The total number of loci of I. crenata 'Dark Green' (IL014) replicates was similar (11,312 and 11,945), with 10,335 loci in common.Likewise, technical replicates of I. x makinoi 'Hara' (IL051) had 16,716 and 19,756 loci, with 15,195 loci in common.The allele similarity between these technical replicates was 99.88% and 99.77% for IL014 and IL051, respectively.Furthermore, we collected I. crenata 'Fastigiata', a common cultivar, from four sources as 'biological' replicates (IL005, IL021, IL022 and, IL062).As expected, this group of Fastigiata accessions displayed high allele similarities, ranging from 98.13% to 99.68%.Taken together, these results showed that the allele similarity method for pairwise comparison based on GBS genetic fingerprints was especially accurate.The number of common loci was less reliable for identifying similar genotypes, which was probably due to datasets that were not completely saturated.Consequently, we used a 98% threshold of minimal allele similarity to consider GBS genetic fingerprints as identical (=the same cultivar) and screened the Ilex collection for genetically identical accessions.For instance, I. crenata 'Sky Pencil' (IL082), morphologically identical to the Fastigiata genotypes with an upright phenotype, can also be considered genetically identical to the Fastigiata genotypes (allele similarity ranging from 98.19% with IL022 to 99.50% with IL005).Furthermore, the allele similarity of I. crenata 'Golden Helleri' (IL024) and I. crenata 'Lancaster Yellow' (IL029) was 99.89%, thus surpassing this threshold.Dudley and Eisenbeiss (1992) previously reported that a mutation of 'Helleri' was commercialized under the name 'Lancaster Yellow' [5], thus confirming that these accession names are genetically identical.Comparison of the allele similarities of other genotypes revealed that I. yunnanensis (IL048) is highly related to I. mutchagara Makino (IL047), I. crenata 'Blondie' (IL015), and I. maximowicziana var.kanehirae (IL013), with allele similarity values ranging from 98.83% to 99.76% (Figure S1).There are two reasons for these high allele similarities.The first is that I. yunnanensis (IL048) is actually an I. crenata genotype.Visual assessment of the original plants in the arboretum revealed that the mother plant of I. yunnanensis (IL048) looked like an I. crenata and not like the other I. yunnanensis in the collection, leading to the conclusion that the mother plant, and subsequently the I. yunnanensis (IL048) accession in our collection, was initially incorrectly labeled as I. yunnanensis.The other reason for the observed high allele similarities is that I. maximowicziana var.kanehirae (Yamam.)T. Yamaz.and I. mutchagara Makino are both synonyms of one another and are also synonyms of I. crenata var.mutchagara (Makino) Ohwi, as previously reported by Dudley and Eisenbeiss (1992) [5].More recently, many more heterotypic synonyms have been added to the list of synonyms, including I. crenata var.kanehirae Yamam., I. crenata var.scoriatum Yamam., and I. maximowicziana Loes.[1].Our data show that both I. maximowicziana var.kanehirae (IL013) and I. mutchagara Makino (IL047) are not more genetically different from I. crenata than the different I. crenata genotypes are from one another (more than 98% allele similarity) (Figure S1).Moreover, they are genetically almost identical to the I. crenata cultivar 'Blondie', with allele similarities only slightly lower than that of the technical replicates.In conclusion, our data confirm that I. maximowicziana var.kanehirae (Yamam.)T. Yamaz.and I. mutchagara Makino are synonyms and further show that they are also synonyms of I. crenata Thunb.

Genetic Relationships among Ilex Species
Pairwise comparison of allele similarities across the collection showed blocks of samples that are more closely related to each other.These blocks were consistent with the clades of the phylogenetic tree (Figure 1) and the clusters of the PCoA (Figure 2).Four clusters were distinguished using PCoA analysis of the GBS data of the Ilex collection.Cluster 1, containing I. crenata accessions, was separated on the PC1 axis.Cluster 2, containing I. vomitoria accessions, was separated on the PC2 axis.Cluster 3, containing I. opaca accessions, was separated from Cluster 4, containing multiple other species.The phylogenetic tree generally agreed with this genetic structure.The species I. vomitoria was used as an outgroup, as it became clear during preliminary analyses that this species was phylogenetically less related to the other species than the other species were related to one another.Besides I. vomitoria, I. opaca was also less related to the other species than the remaining species were related to each other.The phylogenetic analysis generally confirmed the species boundaries, as the different species were clearly grouped in separate clusters in the phylogenetic tree, with high bootstrap support (100% for species).
In previously published phylogenetic trees based on a limited number of nuclear markers (Cuénoud et al. [2] and Manen et al. [3]), I. sugerokii and I. yunnanensis were closely related.However, in phylogenetic trees based on plastid markers, I. sugerokii is more closely related to I. crenata [2,3].In our study, I. sugerokii and I. yunnanensis are more closely related.In the studies of Cuénoud et al. [2] and Manen et al. [3], I. opaca and I. x attenuata are placed in the same group in the phylogenetic trees resulting from both nuclear and plastid markers.In addition, I. rugosa, I. aquifolium, I. leucoclada, I. x makinoi, and I. integra were placed in the same group.Those results are consistent with our findings.Furthermore, our phylogenetic tree (Figure 1) was consistent with the traditional classifications of Galle [9], but also revealed some deviations.First, similar to Galle's classification, I. integra, I. leucoclada, and I. aquifolium were also closely related in our data.According to Galle, they belong together in subgenus Aquifolium, section Aquifolium, series Aquifoliodes.Second, I. x makinoi, I. x meserveae, and I. rugosa were closely related in our study.Galle [9] places them under subgenus Aquifolium, section Aquifolium, series Hookerianae.Third, in our study, I. vomitoria was not closely related to any other species in our collection (the highest allele similarity is 17.45% with I. x meserveae).This species is also classified separately in Galle's classification, under subgenus Aquifolium, series Vomitoriae, along with one other species, I. fuertisiana (not in our collection).Our GBS data also revealed some discrepancies with previous classifications.Galle [9] places I. x attenuata, I. glabra, I. opaca, I. sugerokii, and I. yunnanensis together in the subgenus Prinos, section Paltoria, series Cassinoides.However, in the phylogenetic tree shown above, I. x attenuata and I. opaca were placed separately from the others.Furthermore, I. serrata and I. macrocarpa, both belonging to different sections, were placed close to I. glabra, I. yunnanensis, and I. sugerokii.
The genome size of I. aquifolium (included in our collection) is previously reported to be 2.30 pg/2C [32] as measured by Feulgen densitometry, an older and less accurate technique.The results for I. aquifolium in our study were lower, with 1.85 ± 0.04 pg/2C for the cultivar 'Angustifolia' and 1.90 ± 0.07 for the cultivar 'Crispa'.The genome sizes of other accessions in our collection have not been previously reported.We determined the chromosome numbers of several species in our collection: I. aquifolium, I. x attenuata, I. crenata, I. glabra, I. integra, I. opaca, I. vomitoria, I. rugosa, I. serrata, and I. leucoclada [9,16].Fluorescence microscopy images of stained nuclei are shown in Figure 3.The genome sizes and chromosome numbers for all analyzed accessions in the collection are presented in Table 1.For most species, our data confirmed the results of previous studies.All diploid species studied in this research had 2n = 40 chromosomes.In addition, we observed 2n = 40 for I. crenata and I. leucoclada (confirmed by Galle, 1997 for I. crenata [9]), which does not agree with Sugiura [13] and Fedorov [14], who reported a chromosome number of 2n = 34 for both species (found in the Chromosome Count Database (CCDB)) [16].The largest reported chromosome number in the Chromosome Count Database is that of I. pedunculosa Miq.(2n = 120).In our collection, a larger chromosome number of 2n = 160 was observed for I. yunnanensis var.gentilis (IL104).This species also had the largest genome size in our collection, at 8.01 ± 0.18 pg/2C.crenata and I. leucoclada (confirmed by Galle, 1997 for I. crenata [9]), which does not agree with Sugiura [13] and Fedorov [14], who reported a chromosome number of 2n = 34 for both species (found in the Chromosome Count Database (CCDB)) [16].The largest reported chromosome number in the Chromosome Count Database is that of I. pedunculosa Miq.(2n = 120).In our collection, a larger chromosome number of 2n = 160 was observed for I. yunnanensis var.gentilis (IL104).This species also had the largest genome size in our collection, at 8.01 ± 0.18 pg/2C.

Genetic Relationships of Ilex Crenata Cultivars
When studying the phylogenetic tree (Figure 1), combined with the pairwise allele similarities of the different I. crenata accessions (Supplementary Materials), a segregation into three groups emerged within the species of I. crenata, which was also observed based on the number of common loci and based on the allele similarity.Clade A consists of I. crenata 'Golden Gem' (IL023), I. crenata 'Luteovariegata' (IL030), and I. crenata 'Picas' (IL033) (bootstrap value of 100%).Clade B contains 34 I. crenata genotypes (bootstrap value of 95%), and Clade C contains 36 I. crenata genotypes.Clade B has three highly supported subgroups (bootstrap values of 83%, 93% and 99%), while Clade C has some highly supported subgroups but also some genotypes (such as I. crenata 'Monmouth (IL073) and I. crenata 'Tee Dee' (IL040)) with less clear phylogenetic positions.The PCoA of I. crenata (Figure 2B) roughly confirmed this structure, with clear separation of clusters A and B from C. Contrary to clusters B and C, there was only very little variation within Clade A. In Figure 1, the color gradient shows the distribution of the genome size of I. crenata.For all analyzed samples, the mean genome size and standard deviation are shown in Table 1.The genome size of the I. crenata genotypes varied from 1.98 pg/2C to 2.30 pg/2C (all diploids 2n = 2x, data shown in Table 1), with three outliers: I. crenata 'Black Beauty' (3.06 ± 0.04 pg/2C, triploid: 2n = 3x = 60), I. crenata 'Chesapeake' (4.04 ± 0.09 pg/2C, tetraploid: 2n = 4x = 80), and I. crenata 'Cherokee' (4.19 ± 0.08 pg/2C, tetraploid: 2n = 4x = 80).The observed genome size variation in I. crenata (1.16-fold difference) was larger than generally expected within a species.Interestingly, the genome sizes were more concise when calculated in accordance with the observed genetic clade structure.The phylogenetic clades had a significant difference in genome size, as depicted in the right panel of Figure 1.Clade A had a mean genome size of 2.09 pg/2C ± 0.006, Clade B had a mean genome size of 2.07 pg/2C ± 0.05, and Clade C contains the I. crenata samples with the largest genomes (mean of 2.19 pg/2C ± 0.06).All three clades showed little variation within them and have low standard deviations, which is the normal variation expected within a species.There was, however, a significant difference in the mean genome size of Clade B and Clade C (p-value = 7.44 × 10 −13 ).As Clade A only comprises three genotypes, statistical differences were not calculated for this clade.
Intraspecific genome size variation should be interpreted with caution, as illustrated by Noirot et al. (2000) [33] and Greilhuber (2005) [34].Noirot et al. showed that the use of an internal standard can reduce, but not eliminate, measurement errors [35].As Ilex is a dioecious plant and sex chromosomes can differ in size, a possible link between genome size and sex was investigated but not found, indicating that the observed variations in genome size and grouping in clades were not due to the sex chromosomes.A more plausible explanation is that the genome size differences are linked to the evolutionary or breeding history of the plants.However, information about the geographic origin, breeding parents, and year of discovery of the different cultivars in this study is scarce and did not lead to unraveling the origin of the observed clades.In the species I. polyneura, the emergence of two groups and some in between genotypes was described by Yao et al. using RADseq population genetic analyses [4].A link between genome size and evolution has been proposed in maize [36,37].Bilinsk et al. observed small differences in genome size in maize and described a negative correlation between genome size and cell production rate, as well as a negative correlation between cell production rate and flowering time.Their research suggested that a mechanistic relationship between genome size, cell production, and developmental rate may lead to differences in optimal flowering times across altitudes, thus affecting genome size [36].

Genetic Diversity via Crossing and Hybridization
In addition to identifying very closely related genotypes, GBS-derived calculations of pairwise common loci and allele similarities also make it possible to identify hybrids, while only considering a phylogenetic tree often does not [30].In this study, we were able to identify IL051 I. x makinoi as the result of a spontaneous hybridization event between I. leucoclada (IL099) and I. rugosa (IL098), and I. x meserveae 'Blue Girl' (IL046) as the result of a manual cross between I. rugosa (IL098) and I. aquifolium (IL050) (Figure 4).Furthermore, the table with common loci and allele similarities in the supplemental materials (Figure S1) shows the known hybrid origin of I. x aquipernyi (IL045 and IL001) as being the result of a manual cross of I. aquifolium and I. pernyi (not in dataset) as parents for I. x aquipernyi (IL045).This accession has many loci in common with I. aquifolium (IL049 and IL050).For I. x aquipernyi (IL001), however, the number of common loci with I. aquifolium (IL049 and IL050) is not as high as for I. x aquipernyi (IL045).It is possible that I. x aquipernyi (IL001) has a higher similarity to its other parent, I. pernyi (not included in the data).
The variety of commercial I. crenata cultivars is rather limited, comprised mostly of a small selection of cultivars that are multiplied.The PCoA analyses revealed the breadth of currently unexploited genetic diversity.There were no genotypes in our collection that are positioned in the spaces between clusters, indicating a great potential for crosses between clusters.Interspecific hybridization in Ilex is usually more complicated than intraspecific crossing, and chromosome numbers can be used as a basis for estimating the rate of success [9].Nevertheless, hybridization has led to many commercially important cultivars of different Ilex species in the past [9], such as I. x meserveae cultivars.Our data even confirm the possibility of inter-species hybridization.To obtain cultivars that are more robust to challenges such as high soil pH and black root rot, the knowledge from this paper needs to be combined with knowledge about phenotypic traits such as tolerance and resistance.In the future, both intra-and interspecific hybridization represent promising approaches to creating novel genetic diversity.

Conclusions
Despite the popularity of I. crenata as an ornamental shrub, a handful of cultivars are commercially dominant.The aim of this study was to investigate the genetic diversity within the species I. crenata, and between I. crenata and a few other hardy Ilex species.Genome size measurements, chromosome numbers, and genetic fingerprinting were used to unravel the genetic identity, genetic relationships, and genetic diversity of 72 I. crenata accessions and 22 accessions of other species.Our data showed that I. crenata has a high intra-specific genetic diversity.This was reflected in the differences in genome size of diploid I. crenata genotypes related to three phylogenetic clades identified through GBS genetic fingerprinting.Furthermore, the genetic fingerprinting data showed that large parts of this genetic variation might be interesting for commercial use.Interspecific hybridization or intraspecific crossings are a possible strategy to further enlarge the genetic diversity in the commercial assortment of I. crenata.This research is valuable for breeding programs, as it allows deliberately selecting for genetically diverse parental material, creating novel genetic diversity, and screening for novel morphological traits.

Figure 1 .
Figure 1.Phylogenetic tree of the Ilex collection along with the genome size (pg/2C-value).Th clades containing non-Ilex crenata spp.are depicted in grey and the clades containing I. crenata a cessions in black.The three clades within I. crenata are indicated on the right (Clade A, Clade B, an Clade C).For every genotype a dot represents the mean genome size (n = 6) and the horizontal ba the standard deviation, as measured using flow cytometry.Vertical lines labeled mean A, mean and mean C indicate the mean genome sizes per clade (polyploids not included) of 2.09 ± 0.006; 2.0 ± 0.05, and 2.19 ± 0.06 pg/2C ± sd for clade A, B, and C, respectively.The genome size of genotype with a genome size exceeding the scale are depicted on the right.

Figure 1 .
Figure 1.Phylogenetic tree of the Ilex collection along with the genome size (pg/2C-value).The clades containing non-Ilex crenata spp.are depicted in grey and the clades containing I. crenata accessions in black.The three clades within I. crenata are indicated on the right (Clade A, Clade B, and Clade C).For every genotype a dot represents the mean genome size (n = 6) and the horizontal bar the standard deviation, as measured using flow cytometry.Vertical lines labeled mean A, mean B, and mean C indicate the mean genome sizes per clade (polyploids not included) of 2.09 ± 0.006; 2.07 ± 0.05, and 2.19 ± 0.06 pg/2C ± sd for clade A, B, and C, respectively.The genome size of genotypes with a genome size exceeding the scale are depicted on the right.

Figure 2 .
Figure 2. PCoA of the 94 accessions of the Ilex collection (A) and the subset of 72 I. crenata accessions (B).(A) Samples are colored by species and grouped in four clusters.(B) Samples are false-colored according to their genome size (range 1.98-2.30pg/2C), while the polyploids are pictured in gray.Accessions of Clade A are grouped using a circle.The dotted line separates accessions of Clade B and Clade C. In previously published phylogenetic trees based on a limited number of nuclear markers (Cuénoud et al. [2] and Manen et al. [3]), I. sugerokii and I. yunnanensis were closely related.However, in phylogenetic trees based on plastid markers, I. sugerokii is more closely related to I. crenata [2,3].In our study, I. sugerokii and I. yunnanensis are more

Figure 2 .
Figure 2. PCoA of the 94 accessions of the Ilex collection (A) and the subset of 72 I. crenata accessions (B).(A) Samples are colored by species and grouped in four clusters.(B) Samples are false-colored according to their genome size (range 1.98-2.30pg/2C), while the polyploids are pictured in gray.Accessions of Clade A are grouped using a circle.The dotted line separates accessions of Clade B and Clade C.