Assessment of Genetic Diversity of Rice in Registered Cultivars and Farmers’ Fields in Burkina Faso

The genetic diversity of cultivated rice in farmers’ fields remains understudied in West Africa despite the importance of rice for food security in this region. In this study, we genotyped rice samples from Burkina Faso using the C6AIR SNP (Single Nucleotide Polymorphism) array (IRRI), including 27 registered cultivars and 50 rice samples collected in rice fields from three geographical zones in western Burkina Faso. Most of the registered cultivars clustered with the indica genetic group, except seven assigned to japonica and one admix. All but one of the rice samples from farmers’ fields belonged to the indica group. The remaining field sample, which unexpectedly clustered with the Aus genetic group, originated from a rainfed lowland site known to differ in agronomic practices, and which proved to be highly differentiated from the five other sites. Apart from this peculiar site, the rice grown in irrigated areas did not differ from rice sampled in rainfed lowlands. Finally, the genetic data confirmed the high frequency of one cultivar, in agreement with farmers’ interviews. We argue for the importance of documenting and preserving the high agro-biodiversity observed in rice from Burkina Faso as a prerequisite for facing the current challenges of growing rice demand and global change.


Introduction
Crop genetic diversity is a component of agro-biodiversity, with high value for nutrition and adaptation to biotic and abiotic stresses [1], particularly in the context of global changes [2]. It contributes to making farming systems more stable, robust and sustainable. On the other hand, the development, dissemination and adoption of improved cultivars is a pathway to increasing crop productivity and aligning it with market demands. Detailed genomic knowledge of registered cultivars, together with knowledge of the crop genetic diversity actually grown locally, is important information for plant diversity management and crop improvement.
Rice is rapidly becoming a staple food in the African diet. In West Africa, the average annual production is 10
Crops 2021, 1 132
For each cultivar, we indicate the potential synonym (with the usual name in bold), the country or organization of origin, the date of introduction in Burkina Faso, the time to maturity (in days) and the genetic group based on a priori knowledge. Most of the cultivar names begin with 'FKR', which stands for 'Farako-Bâ Riz'. Further information (pedigrees and/or agro-morphological information) can be found in [21,22].
In addition, we drew on sampling previously performed in six sites located in western Burkina Faso [23]. These six sites are located in three geographical zones (Bama, Banzon and Karfiguela, see Figure 1a), each zone comprising one irrigated area and a neighboring rainfed lowland. The present study focused on 50 fields visited in 2018 (7-11 fields per site, 8.33 on average, see Table 2). Each sample corresponded to one rice leaf per field and was collected between September and December 2018. In 40 of the 50 fields (80%), we interviewed the farmer and asked which rice cultivar was grown. We also interviewed farmers at the same sites in the two previous years (2016-2017) and the subsequent year (2019) [23]. A synthesis of the responses obtained (annual frequencies of cultivars grown over the four-year period) is reported in Figure 1b, and detailed information is available at https://dataverse.ird.fr/dataset.xhtml?persistentId=doi:10.23708/8FDWIE (accessed on 22 April 2021). In every case, we obtained permission from the farmers to work in their fields, and the management of the entire project followed the guidelines of the Nagoya Protocol regarding access and benefit sharing.
Wet lab work (both DNA extraction and SNP genotyping) was performed at the Genotyping Services Lab at the International Rice Research Institute (IRRI). The DNA fingerprinting approach used the Illumina Infinium rice 6K chip (C6AIR) [24], a set of SNPs designed to characterize the diversity within the species O. sativa. This chip has already been used in rice diversity studies, for example, to characterize rice samples from Bangladesh [25]. The SNP genotyping data table was provided by the genotyping platform and used for subsequent analysis.
To place the rice diversity from Burkina Faso in the global context of Asian rice diversity, we downloaded the 29 million SNP dataset from the 3K genome data available at https://snp-seek.irri.org/ (accessed on 31 March 2021). The PLINK binary files obtained were converted to VCF using PLINK v1.9 [26] with the keep-allele-order option enabled. SNP positions corresponding to the C6AIR array were extracted from the VCF file using bcftools v1.9 [27], then imported into R v4.1.0 [28], together with the genotyping table for the accessions from Burkina Faso. The datasets from the chip genotyping and from the 3K genome data were merged prior to applying genomic filters. To keep only the best-quality SNPs, we applied the following SNP-wise filters: less than 15% missing data among the accessions from Burkina Faso, less than 10% missing data across the whole dataset, and a heterozygosity filter that removed positions with more than 45% heterozygosity. The final dataset included 5247 SNPs.
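As an illustration only, the three SNP-wise filters described above could be sketched as follows. The analysis in this study was done in R; the Python function below, its name `filter_snps`, the 0/1/2 genotype coding with `np.nan` for missing calls, and the choice to compute heterozygosity among called genotypes only are all assumptions made for this example.

```python
import numpy as np

def filter_snps(geno, is_local, max_miss_local=0.15, max_miss_all=0.10, max_het=0.45):
    """Return a boolean mask of SNPs (columns) passing all three filters.

    geno     : samples x SNPs array coded 0/1/2 (alternate-allele count),
               with np.nan for missing calls (assumed coding).
    is_local : boolean array flagging the accessions from Burkina Faso.
    """
    geno = np.asarray(geno, dtype=float)
    is_local = np.asarray(is_local, dtype=bool)
    missing = np.isnan(geno)

    # Fraction of missing calls per SNP, in the local subset and overall.
    miss_local = missing[is_local].mean(axis=0)
    miss_all = missing.mean(axis=0)

    # Fraction of heterozygous calls per SNP among called genotypes (assumption).
    called = (~missing).sum(axis=0)
    het = (geno == 1).sum(axis=0)
    het_rate = np.divide(het, called, out=np.zeros_like(miss_all), where=called > 0)

    return (miss_local < max_miss_local) & (miss_all < max_miss_all) & (het_rate <= max_het)

# Toy data: 4 samples x 4 SNPs; the first two samples play the role of local accessions.
geno = np.array([
    [0, 1, np.nan, 2],
    [0, 1, 0,      2],
    [2, 1, np.nan, 0],
    [2, 0, 2,      0],
])
is_local = np.array([True, True, False, False])
mask = filter_snps(geno, is_local)
filtered = geno[:, mask]  # keeps the first and last SNP only
```

In this toy matrix the second SNP is dropped for excess heterozygosity (3 of 4 calls) and the third for missing data (2 of 4 calls overall).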
We first conducted a Principal Component Analysis (PCA) using the LEA R package v3.1 [29]. The PCA was displayed graphically (Figure 2) using the ggplot2 R package v3.3.5 [30]. Datasets were converted to the genind format of the adegenet R package v2.1.3 [31] and analyzed with both adegenet and the hierfstat R package v0.5-7 [32]. A Discriminant Analysis of Principal Components (DAPC) was performed, using the 3K diversity groups as reference, to assign accessions from Burkina Faso to these groups (Figure S1). We then focused our analysis on the accessions from this study (samples from Burkina Faso) and first computed a genetic tree for these samples (Figure 3). Genetic distances between the accessions were computed using the dist.gene function, and the resulting Neighbor-Joining tree was computed using the ape R package v5.5 [33]. The tree was drawn using the 'fan' option of the ggtree R package v3.1.2 [34]. A PCA was then computed considering only the field genotypes (with or without a peculiar accession, Figures 4 and S2) with the dudi.pca function of the ade4 R package v1.7-17 [35]. Finally, basic descriptive population genetics statistics (gene diversity and pairwise FST between populations), considering different levels of hierarchy, were computed using hierfstat (Tables 2, 3 and S1).
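To make the two descriptive statistics concrete, here is a minimal Python sketch of gene diversity (expected heterozygosity) and a pairwise FST for biallelic SNPs. It is not the hierfstat implementation used in this study: it uses a simple Nei-style estimator, FST = (HT - HS) / HT, without sample-size corrections, so its values may differ from those in Tables 2, 3 and S1. The function names and the 0/1/2 genotype coding are assumptions for this example.

```python
import numpy as np

def allele_freq(geno):
    """Alternate-allele frequency per SNP from 0/1/2 genotypes (np.nan = missing)."""
    return np.nanmean(np.asarray(geno, dtype=float), axis=0) / 2.0

def gene_diversity(geno):
    """Mean expected heterozygosity 2p(1-p) across SNPs (no sample-size correction)."""
    p = allele_freq(geno)
    return float(np.mean(2.0 * p * (1.0 - p)))

def pairwise_fst(geno_a, geno_b):
    """Nei-style FST = (HT - HS) / HT between two groups, as a ratio of sums over SNPs."""
    pa, pb = allele_freq(geno_a), allele_freq(geno_b)
    hs = (2.0 * pa * (1.0 - pa) + 2.0 * pb * (1.0 - pb)) / 2.0  # mean within-group diversity
    p_bar = (pa + pb) / 2.0
    ht = 2.0 * p_bar * (1.0 - p_bar)                            # diversity of the pooled groups
    total = ht.sum()
    return float((ht - hs).sum() / total) if total > 0 else 0.0

# Toy data: rows are samples, columns are SNPs.
site_a = np.array([[0, 0], [0, 0], [0, 0]])   # fixed for the reference allele
site_b = np.array([[2, 2], [2, 2], [2, 2]])   # fixed for the alternate allele
mixed = np.array([[0, 0], [2, 0]])            # first SNP at p = 0.5, second monomorphic

fst_fixed = pairwise_fst(site_a, site_b)      # complete differentiation
fst_same = pairwise_fst(mixed, mixed)         # a group compared with itself
div_mixed = gene_diversity(mixed)
```

Two groups fixed for different alleles give the maximum value of 1, while comparing a group with itself gives 0, which brackets the range in which the observed values fall.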
The other genetic groups are indicated with colored shapes: blue for japonica, pink for Aus and grey for admix (same color scheme as in Figure 2).

Results
The PCA including both this study's samples and the global reference of rice genomes (Figure 2), as well as the DAPC (Figure S1), showed that the vast majority of the samples correspond to the O. sativa indica group. This result matches expectations, as none of the samples came from upland growing systems, where japonica rice is generally found [6]. Moreover, Diop et al. [10] found that indica was the most widely cultivated type of lowland rice in West Africa, with very few of their samples identified as japonica.
On the other hand, seven reference cultivars (FKR21, FKR33, FKR45N, FKR55, FKR59, FKR61, NERICA4) were attributed to the japonica group (Figures 2 and 3). These cultivars were known to be japonica or NERICA upland (Table 1, Figure 3), so this result is congruent with expectations. The japonica group does not contain any of the analyzed samples from farmers' fields (Figures 2 and 3). We noticed that the NERICA cultivars, resulting from O. sativa-O. glaberrima crosses, do not cluster together, which is likely a consequence of the genotyping method. Indeed, the C6AIR SNP array [24] was designed to maximize within-species diversity for O. sativa and may not target the small parts of the NERICA genomes originating from O. glaberrima.
FKR04, a cultivar introduced from Casamance (Senegal) in 1960 (Table 1), belonged to the admix group (Figures 2 and 3). Finally, one field sample, from the field labelled 'TG02', belonged to the Aus group (Figures 2 and 3). The Aus genetic group does not seem to be common in West Africa in general, as it does not appear in the two studies cited previously [6,10], although Sié et al. [12] reported two Aus rice samples in an earlier sampling performed in Burkina Faso.
Global gene diversity estimated in the 27 registered cultivars from Burkina Faso was 0.282. These cultivars represented various genetic groups (indica, japonica and admix, Figure 2). Within-group diversity was also apparent, as the registered cultivars from Burkina Faso are not very close to one another within the indica and japonica diversity groups (Figure 2). Our results, however, show that the diversity used for the registered cultivars in Burkina Faso could still be broadened by mobilizing more of the genetic diversity available in worldwide rice germplasm.
Global gene diversity estimated in the 50 analyzed field samples from Burkina Faso was 0.137, and within-site genetic diversity was highest at the Tengrela site (0.132; Table 2). On the other hand, the Banzon irrigated site presented the lowest genetic diversity (0.108; Table 2). Tengrela was also involved in all of the highest pairwise genetic differentiation values (Table 3 and Table S1). The highest between-site genetic differentiation was between the Tengrela rainfed lowland and the Banzon irrigated perimeter (FST = 0.328 (0.311-0.347)). The distinctiveness of the Tengrela site, compared to the five other study sites, is likely related to social factors. Indeed, it is congruent with the data obtained from farmers' interviews, which show that, at this site specifically, rice was mostly grown by women for self-consumption only, with infrequent chemical fertilization but frequent use of manure from household waste [23].
Tengrela was also the only site where a sample was attributed to a group other than indica, namely the Aus group (Figures 2 and S2). According to farmers' interviews [23], this was the only site among the six where only landraces were grown (no use of registered cultivars, Figure 1b). The farmer named the cultivar from the field TG02 (attributed to the Aus group) 'Samperema', while other cultivar names from this site included ETP, Bedankaki, Bandakadi/Debale and Tchombiais. The six other samples from Tengrela, although assigned to the indica group, were nevertheless differentiated from the samples found in the five other sites (Figure 3, and see the point shapes in Figures 4 and S2). They likely derive from hybridization between the locally grown landrace belonging to the Aus diversity group and introduced indica varieties, as suggested by their affinity with TG02 in the PCA (Figure S2).
In terms of rice growing systems, we note that the samples from irrigated areas and rainfed lowlands (in blue and orange, respectively, in Figure 3) do not specifically differ from each other, with the exception of Tengrela village, as previously mentioned (sample names beginning with 'TG' in Figure 3). Rainfed lowlands considered as a whole (three sites) had a higher gene diversity (0.149) than irrigated areas (0.125). The genetic differentiation between all rainfed lowland fields and all the fields located in irrigated areas was estimated to be FST = 0.030 (0.027-0.034). This is likely due to the peculiarity of the Tengrela site, as the FST obtained between rainfed lowlands and irrigated areas in the geographic zones of Banzon and Bama did not differ from zero (Table 3 and Figure S2).
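The text does not state how the intervals given in parentheses after each FST value were obtained. One common approach, assumed here purely for illustration, is a percentile bootstrap that resamples SNPs with replacement. The Python sketch below pairs that idea with a simple uncorrected Nei-style FST, so it is not the estimator used in this study; the function names, the 0/1/2 genotype coding and the simulated data are hypothetical.

```python
import numpy as np

def fst_components(geno_a, geno_b):
    """Per-SNP numerator (HT - HS) and denominator (HT) of a Nei-style FST."""
    pa = np.nanmean(np.asarray(geno_a, dtype=float), axis=0) / 2.0
    pb = np.nanmean(np.asarray(geno_b, dtype=float), axis=0) / 2.0
    hs = pa * (1.0 - pa) + pb * (1.0 - pb)   # mean of the two within-group 2p(1-p) values
    p_bar = (pa + pb) / 2.0
    ht = 2.0 * p_bar * (1.0 - p_bar)
    return ht - hs, ht

def bootstrap_fst(geno_a, geno_b, n_boot=1000, seed=0):
    """Point estimate and 95% percentile interval, resampling SNPs with replacement."""
    num, den = fst_components(geno_a, geno_b)
    rng = np.random.default_rng(seed)
    n = num.size
    point = num.sum() / den.sum()
    boots = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)     # one bootstrap sample of SNP indices
        d = den[idx].sum()
        boots[i] = num[idx].sum() / d if d > 0 else 0.0
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return float(point), float(lo), float(hi)

# Simulated example: two groups of 8 samples drawn from the same allele frequencies.
rng = np.random.default_rng(1)
freqs = rng.uniform(0.1, 0.9, size=200)
group_a = rng.binomial(2, freqs, size=(8, 200))
group_b = rng.binomial(2, freqs, size=(8, 200))
point, lo, hi = bootstrap_fst(group_a, group_b)

# Two groups fixed for different alleles: every bootstrap replicate equals 1.
fixed = bootstrap_fst(np.zeros((3, 5)), np.full((3, 5), 2.0), n_boot=50)
```

Because this uncorrected estimator can never be negative, its interval cannot span zero even for two groups drawn from the same population. Estimators that correct for sample size can return slightly negative values, which is what allows a statement such as "did not differ from zero".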
In the phylogenetic tree (Figure 3), as well as in the PCAs (Figures 4 and S2), we also note that many field samples (from five sites: Senzon, Banzon, Karfiguela, Badala, Bama) are identical to one another and to the reference cultivar FKR64 (commonly named TS2 in Burkina Faso, see Table 1). This cultivar, originating from Taiwan (Table 1), was frequently mentioned by the farmers from all sites, except the peculiar site of Tengrela (Figure 1b). Consequently, genetic data and farmers' responses agreed that this rice cultivar was the most frequently grown in the vast majority of the studied sites in western Burkina Faso. However, Figures 4 and S2, in which label colors show the cultivar names given in farmers' interviews, reveal that at the field level the genetic assignment of samples did not always correspond to the farmer's information. This likely reflects the dynamics of rice genetic diversity in farmers' fields and illustrates that the gene pool is not fixed but is still evolving.

Conclusions
Understanding the present genetic diversity and distribution of crops is crucial for in situ agro-biodiversity conservation programs as well as for crop improvement, including the selection of parents with diverse genetic backgrounds.
Our results first offer a picture of rice genetic diversity in six sites representing irrigated areas and rainfed lowlands in western Burkina Faso, sites that have also been characterized in terms of agricultural practices and the levels of major diseases [23]. We confirmed that indica rice is by far the most frequently grown, but we also identified a sample from the Aus genetic group. We found no major differences between rice cultivated in lowlands and in irrigated areas, except for the lowland site of Tengrela (TG), where rice is grown traditionally (low input) by women using traditional landraces. We confirmed the predominance of one registered cultivar (registered as FKR64, and named TS2 by the farmers) in the five other sites. These results encourage further research to capture the rice agro-biodiversity actually grown in this country and in West Africa in general. To that end, we propose including more registered varieties and extending the geographic coverage to all important rice production areas in Burkina Faso, including, for example, the Boucle du Mouhoun region, which was shown to present the highest rice genetic diversity in a previous study [13]. In addition, it could be interesting to include rainfed upland rice, although this rice production system is minor in Burkina Faso (only 10% of the rice land area and 5% of national rice production [18]). Rice cultivated in upland fields likely includes japonica rice, which was not found in the farmers' fields visited in this study but was represented in the set of registered cultivars, offering farmers a diversity of rice cultivars adapted to the various rice growing systems of the country. Finally, deciphering potential within-field rice genetic diversity is also an interesting research question for future work; it was not addressed in our study, where only one plant sample from each field was analyzed.
We also documented the genetic diversity of 27 registered cultivars, including indica, japonica and NERICA. This may open the prospect, though not a straightforward one, of designing easy-to-use genetic markers (see [36] for markers that discriminate rice species) useful for quality control and seed certification. The apparent discrepancies observed here in Burkina Faso between genetic assignments and the names given to the cultivars by the farmers illustrate the importance of strengthening these aspects. Combining the study of rice genetic diversity with human and social science in West Africa would be a way to better understand the rationale behind rice farmers' seed choice (see for example [37] in Benin). Such an integrative approach including breeders, geneticists and social scientists would deliver useful information for designing suitable strategies for crop genetic diversity management.
Indeed, while genetic improvement is very important to increase yield and fight poverty and food insecurity [38], it is also critical to preserve agro-biodiversity and to include landraces, especially those preferred by farmers and consumers, in breeding programs and dissemination projects. Farmer rice cultivars in West Africa were shown to be tolerant of suboptimal conditions [39], illustrating the crucial role of crop genetic diversity for a robust food security system able to adapt to the dynamic nature of biotic and abiotic stresses, particularly with the current global changes.