1. Introduction
Faba bean (Vicia faba L.) is a major grain legume crop, grown for its high seed protein content (approximately 25%–30%) and superior biomass, and ranks as the fourth most important cool-season legume internationally after peas, chickpeas and lentils [1]. Although its exact origin is unknown, it is generally accepted that faba bean was one of the earliest food legumes to be domesticated, and it has probably been cultivated since the Neolithic period [2]. Both botanical and molecular genetic data suggest that the wild progenitor of contemporary faba bean is yet to be discovered, or has become extinct [3]. As with other legume crops, faba bean plays a critical role in crop rotations and fixes nitrogen effectively, contributing to soil improvement [4]. Faba bean is a diploid (2n = 2x = 12), partially allogamous plant species [5], with a very large nuclear genome (c. 13 Gbp diploid [6,7]). Large genomes pose substantial challenges to effective development and implementation of molecular genetic markers for genomics-assisted breeding.
More than 43,500 faba bean accessions are conserved within 37 global collections [1,8]. The largest collection (>9,000 accessions) is held at the International Center for Agricultural Research in the Dry Areas (ICARDA) in Syria, followed by that of the Chinese Academy of Agricultural Sciences (CAAS) in China (>5,200 accessions). Faba bean has long attracted the interest of numerous taxonomists and evolutionists because many aspects of its evolution remain unknown [9]. The species is divided into groups based on seed size, ranging from small-seeded minor beans (0.2–0.8 g per seed) through medium-seeded equina beans to large-seeded major beans (greater than approximately 1.0 g and up to 2.6 g per seed), which have become known as distinct botanical groups [10].
The key factor for successful crop improvement is a continued supply of genetic diversity in breeding programs, including new or improved variability for target traits. Plant genetic resources have always played a major role in providing sources of resistance to biotic and abiotic stresses, and it is therefore critical to manage, conserve and evaluate such materials. Although management and utilisation of large-scale diversity in collections pose major challenges to germplasm curators and crop breeders, it is equally important to characterise and understand genetic diversity among plant resources for effective use in breeding programs [11].
Molecular genetic markers represent a powerful tool for characterisation of germplasm collections. Different types of marker systems have been used to characterise genetic diversity in various crop species, including faba bean. Genetic diversity among 28 inbred lines (European and Mediterranean lines) was assessed using random amplified polymorphic DNA (RAPD) markers [9], identifying three major clusters comprising European large-seeded, European small-seeded and Mediterranean germplasm. Amplified fragment length polymorphism (AFLP) analysis was used to determine genetic diversity among inbred lines derived from elite cultivars in earlier studies [10,12], while another study used target region amplification polymorphism (TRAP) markers to assess genetic diversity and relationships between faba bean germplasm entries [13]. Mediterranean landraces that were highly diverse for morpho-agronomic traits were analysed using inter-simple sequence repeat (ISSR) markers, revealing substantial diversity. ISSR markers were also used to assess genetic diversity and relationships within globally distributed faba bean germplasm [14]. That study concluded that accessions from North China showed the highest genetic diversity, while accessions from central China displayed a low level of diversity, and accessions from Europe were genetically closer to those from North Africa. Recently, large numbers of simple sequence repeat (SSR) markers have been identified and characterised from faba bean [15,16,17], and those derived from expressed sequence tags (ESTs) were used to understand the genetic relationships between 32 genotypes, permitting definition of four distinct clusters based on geographical origins [17].
Until recently, SSRs were considered the marker system of choice for the majority of applications. However, recent advances in sequencing and genotyping technologies now permit generation of large sets of single nucleotide polymorphism (SNP) markers from relatively understudied crop species such as faba bean at acceptable cost. As a consequence, SNPs have become more widely used, owing to their high abundance and their capacity to be multiplexed for high-throughput genotyping. In addition, SNP discovery from transcribed regions of the genome provides the basis to establish a direct link between sequence polymorphism and putative functional variation.
In the present study, 45 faba bean lines from North Africa, China, Ecuador, Europe and Australia were selected, representing Australian cultivars as well as the major parents of the Australian faba bean breeding program. Multiple genotypes from each of the faba bean lines were genotyped with EST-derived SNP markers to assess genetic diversity, which was then related to geographical location of origin and pedigree structure, providing support for the design of future breeding populations.