Article

Genotyping-by-Sequencing to Unlock Genetic Diversity and Population Structure in White Yam (Dioscorea rotundata Poir.)

1
International Institute of Tropical Agriculture, PMB 5320, Ibadan 200001, Nigeria
2
Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
3
Agriculture and Agri-Food Canada, Fredericton, NB E3B 4Z7, Canada
4
International Trade Center, 29 Independence Avenue, Accra 00233, Ghana
5
International Institute of Tropical Agriculture, PMB 82, Kubuwa, Abuja 901101, Nigeria
*
Author to whom correspondence should be addressed.
Agronomy 2020, 10(9), 1437; https://doi.org/10.3390/agronomy10091437
Submission received: 7 August 2020 / Revised: 12 September 2020 / Accepted: 14 September 2020 / Published: 22 September 2020
(This article belongs to the Section Crop Breeding and Genetics)

Abstract

White yam (Dioscorea rotundata Poir.) is one of the most important tuber crops in West Africa, where it is indigenous and represents the largest repository of biodiversity through several years of domestication, production, consumption, and trade. In this study, the genotyping-by-sequencing (GBS) approach was used to sequence 814 genotypes consisting of genebank landraces, breeding lines, and market varieties to understand the level of genetic diversity and pattern of the population structure among them. The genetic diversity among different genotypes was assessed using three complementary clustering methods, the model-based admixture, discriminant analysis of principal components (DAPC), and phylogenetic tree. ADMIXTURE analysis revealed an optimum number of four groups that matched with the number of clusters obtained through phylogenetic tree. Clustering results obtained from ADMIXTURE analysis were further validated using DAPC-based clustering. Analysis of molecular variance (AMOVA) revealed high genetic diversity (96%) within each genetic group. A network analysis was further carried out to depict the genetic relationships among the three genetic groups (breeding lines, genebank landraces, and market varieties) used in the study. This study showed that the use of advanced sequencing techniques such as GBS coupled with statistical analysis is a robust method for assessing genetic diversity and population structure in a complex crop such as white yam.

1. Introduction

White yam or white guinea yam (Dioscorea rotundata Poir.) is one of the most important staple tuber crops of West Africa [1]. It is strongly associated with the food security, income, and social culture of >300 million people in this region and has an international net value of ≈$15 billion [2]. It belongs to the section Enantiophyllum and the Dioscoreaceae family consisting of approximately 600 species distributed in the tropical and subtropical regions of the world [3]. D. rotundata is an allogamous, polyploid, dioecious species with a basic chromosome number of n = x = 20. The ‘yam belt’ consisting of six countries in West and Central Africa (Nigeria, Ghana, Bénin, Côte d’Ivoire, Togo, and Cameroon) accounts for 97% of the total yam production of the world, while Nigeria alone accounts for 68% of global production [2]. Several Dioscorea spp., including white yam, occur in this region. In West and Central Africa, where white guinea yams were domesticated about 7000 years ago [4], farmers selected genotypes that suited their needs, such as food and farming, and thus generated a large number of traditional cultivars. Hence, white yam domestication took advantage of a huge reservoir of diversity in this region as a result of years of large production, consumption, and trade [5,6,7,8].
Several studies have been carried out to determine the phylogenetic relationships of guinea yam (D. rotundata–D. cayenensis) using morphological features and molecular markers. The morphological characterization of guinea yams from Benin and Cameroon classified the accessions as D. rotundata, D. cayenensis, and D. rotundata–D. cayenensis groups [9]. However, the use of morphological features has remained a challenge to biosystematics for many years because of the limitations associated with the continuous variation and plasticity of most of the features, making them less informative for species identification [5]. Furthermore, first-generation molecular markers such as isozymes [10,11,12], AFLPs [13], RAPDs [7,14,15], RFLPs [16] and SSRs [1,17,18] have also been used to assess taxonomic and phylogenetic diversity in cultivated and wild guinea yams. One such assessment suggested that the cultivars/accessions of D. cayenensis form a separate taxon from D. rotundata [18]. Meanwhile, a recent study on guinea yam collections from Ethiopia using microsatellites suggested no clear distinction between cultivated and wild Dioscorea species [19]. In fact, very few studies have used morphological or molecular markers to assess genetic diversity in white yam per se [1,20,21].
In recent years, several next-generation sequencing (NGS)-based protocols have been developed for the discovery and generation of large sets of genome-wide SNP (single nucleotide polymorphism) markers for genetic diversity studies, linkage mapping, genomic selection, and QTL (quantitative trait loci) analysis [22]. NGS-based SNPs, due to their genome-wide abundance, are currently the most widely used molecular markers for germplasm characterization as well as the quantification of ancestry of cultivars [23]. Genotyping-by-sequencing (GBS) has emerged as one of the most inexpensive NGS-based genotyping platforms that allows for a high level of multiplexing and high marker density to reveal the extent of genetic relatedness and genetic variation within and between cultivated and wild species [24,25,26,27]. The GBS approach reduces the genome complexity using restriction enzymes for high-density SNP marker discovery [28]. In addition, the SNP calling and bioinformatics pipelines are well established and publicly available [29,30]. Next-generation sequencing (GBS-based/DArTseq-based) has been successfully applied in non-model species such as guinea yam [31], water yam [32,33,34], and trifoliate yam [35], which demonstrated the suitability of this method for high-throughput genotyping in yams. In the present study, GBS was used to characterize white yam/white guinea yam (D. rotundata) landraces, breeding lines, and market varieties to understand the genetic relationships and population structure for further improvement of this very important species in West and Central Africa.

2. Materials and Methods

2.1. Plant Materials

A total of 814 genotypes were used in this study, comprising 473 genebank landraces selected from the revised yam core collection [36], 314 breeding lines, and 27 popular market varieties or landraces collected from different markets across Nigeria and Ghana (Table S1). The genebank landraces and breeding lines were obtained from the Genetic Resources Center (GRC) and the Yam Breeding Unit (YBU) of the International Institute of Tropical Agriculture (IITA), Ibadan, Nigeria, respectively. Of the 462 genebank landraces retained for analysis (473 less the 11 accessions removed for missing data; see Section 3.2), 198 were from Nigeria and 166 were from Togo, while the remainder came from Benin (32), Ghana (29), Côte d’Ivoire (27), and Guinea (7), with one accession each from Burkina Faso, Equatorial Guinea, and Sierra Leone (Table S1).

2.2. DNA Extraction and GBS

Genomic DNA was extracted from 100 mg of fresh young leaves using the Qiagen DNeasy Plant Mini kit (Qiagen, Germantown, MD, USA) following the manufacturer’s protocol. A Nanodrop 8000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA) was used to measure the quality of the DNA by comparing the 260 and 280 nm absorptions. DNA samples were further quantified using the Quant-iT™ PicoGreen® dsDNA assay kit (Invitrogen, Carlsbad, CA, USA) and diluted to 50 ng/µL with 1 × TE buffer. About 30 µL of DNA for each genotype was sent in 96-well plates to the Institute for Genomic Diversity (IGD) at Cornell University, Ithaca, New York, where GBS was done using a 96-plex PstI GBS protocol [29]. In brief, for each library, purified genomic DNA was first digested with the restriction enzyme PstI (New England Biolabs, Whitby, ON, Canada), and the ligation of customized adapters (barcodes) with T4 ligase was subsequently carried out. This was followed by PCR with flow-cell attachment site tagged primers. Single-end sequencing was performed using an Illumina HiSeq2000 (Illumina Inc., San Diego, CA, USA).

2.3. Processing of Illumina Raw Sequence Read Data, SNP Calling, and Filtering

The raw sequencing reads (of read length 1 × 100 bp) containing the barcode were sorted, de-multiplexed, and trimmed to the first 64 bases starting from the enzyme cut site. All the reads containing ‘N’ within the first 64 bases and tags with less than 64 bases were removed. We used the first draft of the Dioscorea rotundata reference genome [37] and Bowtie2 [38] to align the sequencing reads. We implemented the best practices of the GATK pipeline to call SNPs [39]. GATK version 2.4 and GATK-UnifiedGenotyper were used in this study for the SNP calling. We used multi-sample variant calling by GATK-UnifiedGenotyper considering the large number of samples involved in the study. The SAM files generated from the alignment were converted to BAM format and sorted by name using SAMtools. The final variant calling was generated through GATK (2.4) (using HaplotypeCaller in the gVCF mode) and joint genotyping (using GenotypeGVCFs). The resulting VCF file was filtered using the criteria of minor allele frequency (MAF) >0.05 and call rate >80% (i.e., missing data <20%) at both the genotype and SNP marker levels. Only bi-allelic SNP markers with genotype quality >20 and read depth >5 were retained after using VCFtools v.0.1.12b [40] and PLINK v1.07 [41] for filtering. The resulting SNPs were subjected to linkage disequilibrium (LD) pruning, in which SNP markers in LD were removed, and a total of 3432 SNP markers were retained for all subsequent analysis.
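The marker-level filtering described above can be illustrated with a minimal Python sketch. This is not the VCFtools/PLINK workflow used in the study; the genotype coding (0, 1, 2 copies of the alternate allele, None for a missing call), the function names, and the toy data are all assumptions made for illustration. The missing-data threshold is taken as a maximum of 20% missing calls per SNP, which is our reading of the filtering criterion.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0, 1, 2 = copies of the alternate allele; None = missing


def minor_allele_frequency(calls: List[Genotype]) -> float:
    """MAF of one biallelic SNP, computed over non-missing diploid calls."""
    observed = [g for g in calls if g is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1.0 - alt_freq)


def missing_rate(calls: List[Genotype]) -> float:
    """Fraction of samples without a genotype call at this SNP."""
    return sum(g is None for g in calls) / len(calls)


def filter_snps(matrix: List[List[Genotype]],
                min_maf: float = 0.05,
                max_missing: float = 0.20) -> List[int]:
    """Return the row indices of SNPs that pass both thresholds."""
    return [
        i for i, calls in enumerate(matrix)
        if minor_allele_frequency(calls) > min_maf
        and missing_rate(calls) <= max_missing
    ]


# Toy data (hypothetical): 4 SNPs (rows) x 10 samples (columns).
toy = [
    [0, 1, 2, 0, 1, 0, 0, 1, 0, 0],           # common variant, no missing calls
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],           # monomorphic, MAF = 0
    [0, 1, None, None, None, 1, 0, 0, 0, 0],  # 30% missing calls
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],           # MAF = 1/20, not strictly above 0.05
]
kept = filter_snps(toy)
print(kept)
```

Only the first toy SNP passes; the other three fail on MAF or missingness. Genotype quality, read depth, and LD pruning are not modelled here because they require per-call annotations and pairwise marker statistics.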

2.4. Population Structure, Genetic Diversity, and Relationships

The filtered SNP data were used to assess the population structure and genetic diversity in white yam. A set of parametric and non-parametric methods including the model-based maximum likelihood estimation of ancestral subpopulations using ADMIXTURE [42], assumption-free discriminant analysis of principal components (DAPC) [43,44] and fixation index (Fst)-based population differentiation were used for analysis. Genetic diversity analysis was carried out using parameters such as minor allele frequency (MAF), polymorphic information content (PIC), expected heterozygosity (He), and observed heterozygosity (Ho) using R [45]. Additionally, genetic diversity was assessed by calculating Shannon–Weaver (H’), Simpson, inverse Simpson, and Pielou’s evenness indices [46] for the three genetic categories (genebank landraces, breeding lines, and market varieties) using Vegan package in R [47].
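As a rough illustration of the diversity parameters listed above, the following Python sketch computes MAF, Ho, He, and PIC for a single biallelic SNP, and the Shannon–Weaver, Simpson, inverse Simpson, and Pielou indices from a vector of category counts. The study used R and the vegan package; this sketch only mirrors the textbook formulas. The genotype coding (0/1/2, None for missing), the function names, and the example inputs are assumptions, and the source does not state which counts were supplied to the index calculations.

```python
import math
from typing import Dict, List, Optional


def snp_stats(calls: List[Optional[int]]) -> Dict[str, float]:
    """MAF, Ho, He and PIC for one biallelic SNP.

    calls: alternate-allele counts per diploid sample (0, 1, 2; None = missing).
    Assumes at least one non-missing call.
    """
    obs = [g for g in calls if g is not None]
    n = len(obs)
    p = sum(obs) / (2 * n)  # alternate-allele frequency
    q = 1.0 - p
    return {
        "maf": min(p, q),
        "ho": sum(g == 1 for g in obs) / n,              # observed heterozygote fraction
        "he": 2 * p * q,                                 # expected under Hardy-Weinberg
        "pic": 1 - (p * p + q * q) - 2 * p * p * q * q,  # biallelic PIC formula
    }


def diversity_indices(counts: List[int]) -> Dict[str, float]:
    """Shannon-Weaver, Simpson, inverse Simpson and Pielou's evenness
    from category counts (zero counts are ignored)."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]
    shannon = -sum(p * math.log(p) for p in props)
    dominance = sum(p * p for p in props)
    richness = len(props)
    return {
        "shannon": shannon,
        "simpson": 1 - dominance,
        "inv_simpson": 1 / dominance,
        "pielou": shannon / math.log(richness) if richness > 1 else 0.0,
    }


example_snp = snp_stats([0, 1, 1, 2, None])
example_div = diversity_indices([10, 10, 10, 10])
print(example_snp)
print(example_div)
```

For a biallelic marker, PIC cannot exceed 0.375 and He cannot exceed 0.5, which gives a scale for interpreting per-marker values.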
A pairwise identity by state (IBS) genetic distance matrix was calculated from 3432 SNP markers using PLINK, and a critical distance threshold [48] was used to declare whether two genotypes are identical based on the pairwise distances between them. Then, the dissimilarity matrix was used to construct the network relationships among the three genetic groups using the qgraph package [49] implemented in R. The aim was to detect important key nodes representing genetic relationships between two genetic groups (breeding lines–market varieties, genebank landraces–breeding lines, and genebank landraces–market varieties) in the network. The results from ADMIXTURE were complemented with DAPC analysis using the R package ‘adegenet’ [50] in a two-step process. First, the optimal number of clusters was inferred using k-means analysis [51,52] of PCA (principal component analysis)-transformed genome-wide SNP data by varying the possible number of clusters from one to 100. The Bayesian Information Criterion (BIC) was used to assess the best supported model as well as the number and nature of clusters. DAPC scatter plots were then developed on the clusters identified through k-means using the first 50 principal components. The information based on ADMIXTURE analysis was used to determine the most appropriate K, and accessions with membership proportions (Q-value) ≥70% were assigned to groups, while those with membership proportions less than 70% were declared as admixtures. Then, the results from DAPC analysis and ADMIXTURE were compared. The coefficient of genetic differentiation among the populations was calculated based on pairwise Fst (fixation index) to estimate the genetic distance and the relationships among the three genetic groups of D. rotundata used in the study. Analysis of molecular variance (AMOVA) was also conducted to assess the population differentiation among the three genetic groups used in the study.
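The membership rule described above (Q-value ≥70% assigns an accession to a group, otherwise it is declared admixed) can be written as a short Python sketch. The function name, the cluster labels, and the example Q vectors are hypothetical; only the 70% threshold comes from the text.

```python
from typing import Sequence


def assign_by_ancestry(q_row: Sequence[float], threshold: float = 0.70) -> str:
    """Assign one accession from its ancestry proportions (one value per cluster).

    Returns 'Cluster k' (1-based) if the largest proportion reaches the
    threshold, otherwise 'Admixed'.
    """
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    if q_row[best] >= threshold:
        return f"Cluster {best + 1}"
    return "Admixed"


# Hypothetical Q vectors at K = 4 (each row sums to 1).
q_matrix = [
    [0.85, 0.10, 0.05, 0.00],
    [0.40, 0.35, 0.25, 0.00],
    [0.05, 0.05, 0.70, 0.20],
]
labels = [assign_by_ancestry(row) for row in q_matrix]
print(labels)
```

Because the comparison is inclusive, an accession with a proportion of exactly 0.70 is assigned to a cluster rather than treated as admixed.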
Then, a phylogenetic tree was built by following the procedure of IBS with 1000 bootstrap replicates in PowerMarker v3.25 [53]. The resulting tree was visualized in Molecular Evolutionary Genetics Analysis (MEGA) software version X [54].

3. Results

3.1. SNP Summary

The FastQ files of the generated sequences were aligned to the D. rotundata reference genome [37], of which 43.4% tags aligned uniquely to the reference, 9.8% aligned to multiple positions, and 46.8% did not successfully align. Uniquely aligned tags were used for calculating the distribution of tag density at each position in D. rotundata genome and for SNP distribution. A total of 137,800 unfiltered SNPs was detected as raw SNP markers. A total of 3432 filtered SNPs were obtained and distributed across the twenty-one pseudo chromosomes of D. rotundata (Table 1; Figure 1). The genome-wide SNP density plot (Figure 1) revealed that the highest number of SNPs was in chromosome 5 (11.1%, 380 SNPs), while the lowest number of SNPs was mapped in chromosome 11 (2.6%, 90 SNPs). Then, the transition and transversion SNPs were calculated. Transition SNPs (66.9%, 2296 SNPs) were more frequent than transversions (33.1%, 1136 SNPs). The C/T transitions (34.3%) accounted for the highest frequency, while C/G transversions (5.1%) occurred at the lowest frequency among all the 3432 SNPs (Figure 2). The average PIC value across all the markers was 0.135, while the observed heterozygosity ranged from 0.138 to 0.190 with an average of 0.165 (Table 1). The expected heterozygosity ranged between 0.133 and 0.190, and the mean was 0.161. Similarly, the minor allele frequency (MAF) ranged between 0.090 and 0.133 with an average of 0.111.
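The transition and transversion figures reported above can be checked with a few lines of Python. The classifier below follows the standard definition (purine to purine or pyrimidine to pyrimidine is a transition; anything else is a transversion); the function name is an assumption, and the counts are the ones given in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref: str, alt: str) -> str:
    """Classify a single-nucleotide substitution as transition or transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid SNP: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Counts reported in the text for the 3432 filtered SNPs.
transitions, transversions = 2296, 1136
total = transitions + transversions
ts_percent = round(100 * transitions / total, 1)
tv_percent = round(100 * transversions / total, 1)
ts_tv_ratio = transitions / transversions
print(total, ts_percent, tv_percent, round(ts_tv_ratio, 2))
```

The reported counts sum to 3432 and reproduce the stated percentages (66.9% and 33.1%), which corresponds to a transition/transversion ratio of about 2.02.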

3.2. Population Structure and Genetic Diversity

Because of excessive missing data (>20%), 11 genebank landraces originating from Nigeria were removed from further analysis. Among the 803 accessions analyzed, 314 were breeding lines that were generated through open pollination and bi-parental crossing over several years by the Yam Breeding Unit at IITA-Ibadan, Nigeria. The details are available in Table S1. The majority of the bi-parental crosses were made using a limited number of parental lines (see Table S1), and this could be attributed to the flowering behavior (dioecious, shy female flowering, limited seed set, among others) in white yam [56]. Being clonally propagated, the progenies of bi-parental crosses in white yam represented a segregating population (F2) and were genetically different from each other. This is evident from the grouping of progenies derived from the same bi-parental crosses in different clusters (Table S1).
The genetic distance among 803 genotypes varied between 0 and 0.27 (Table S2). Two accessions were considered identical or representative of the same clone if their pairwise genetic distance was lower than 0.02. Based on this criterion, a total of 767 unique genotypes were recorded. To understand the pattern of population structure, a Bayesian Information Criterion (BIC) and complementary coordination analysis by DAPC were performed. The BIC results suggested the best clustering at K = 2 (with a probability of cluster membership assignment of 100) based on delta K values (Figure 3A, Table S3). Clusters 1 and 2 consisted of 739 (309 breeding lines, 403 genebank landraces, and 27 market varieties) and 64 genotypes (five breeding lines and 59 genebank landraces) (Figure 3B, Table S3), respectively. DAPC analysis was further carried out to assess the subclusters at K = 3 (Figure 3C), K = 4 (Figure 3D), K = 5 (Figure 3E), and K = 12 (Figure 3F). The summary of DAPC cluster grouping and probability of cluster membership assignment of genotypes at K = 2, 3, 4, 5, and 12 is presented in Table S3. Based on the probability of cluster membership assignment, DAPC clusters both at K = 2 and K = 3 represented a good fit. At K = 3, Cluster 1 consisted of 67 genotypes including five breeding lines, 59 genebank landraces mostly from Nigeria (28 landraces) and Togo (21 landraces), and three market varieties (Tables S1 and S3). Cluster 3 was the largest, consisting of 654 genotypes (genebank landraces: 372; breeding lines: 282), while Cluster 2 consisted of 82 genotypes representing 31 genebank landraces, 27 breeding lines, and 24 market varieties (Table S3).
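The duplicate-detection criterion above (pairwise distance below 0.02) can be sketched in Python as follows. The distance used here is one common definition of IBS dissimilarity, the mean absolute difference in allele counts divided by two over loci called in both samples. Grouping near-identical genotypes by single linkage with a union-find structure is an assumption of this sketch, not necessarily the procedure used in the study, and the toy genotypes are hypothetical.

```python
from typing import List, Optional, Sequence

Calls = Sequence[Optional[int]]  # 0/1/2 alternate-allele counts, None = missing


def ibs_distance(a: Calls, b: Calls) -> float:
    """1 - IBS similarity over loci where both samples have a call."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        raise ValueError("no shared non-missing loci")
    return sum(abs(x - y) for x, y in pairs) / (2 * len(pairs))


def count_unique_clones(genotypes: List[Calls], threshold: float = 0.02) -> int:
    """Count groups after merging every pair closer than the threshold."""
    n = len(genotypes)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if ibs_distance(genotypes[i], genotypes[j]) < threshold:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


# Toy data: 100 loci per sample.
base = [0, 1, 2, 0, 1] * 20
near_duplicate = [1] + base[1:]    # differs by one allele at one locus
distinct = [2 - g for g in base]   # allele counts flipped at every locus
samples = [base, near_duplicate, distinct]
n_unique = count_unique_clones(samples)
print(ibs_distance(base, near_duplicate), ibs_distance(base, distinct), n_unique)
```

Here the near-duplicate sits at distance 0.005 from the base genotype and is merged with it, while the distinct genotype at distance 0.6 remains separate, giving two unique clones.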
A comparative analysis of genetic diversity among the three genetic groups (genebank landraces, breeding lines, and market varieties) revealed that the PIC (0.141), Ho (0.173), and He (0.168) values were relatively higher for genebank landraces while they were the lowest for the market varieties (Table 2). Among the different genetic groups of D. rotundata, the Shannon–Weaver index and Simpson’s index were the highest for genebank landraces, while the highest Pielou’s evenness value was for the market varieties (Table 2).
A significant level of population divergence based on pairwise Fst (p < 0.0001) was also observed between different genetic groups (breeding lines, genebank landraces, and market varieties), while strong genetic relationships were observed within each group (Table 3). The Fst-based population differentiation was highest between breeding lines and genebank landraces (0.038), and it was lowest between genebank landraces and market varieties (0.024). The AMOVA revealed that the variability was divided into 96% within genetic groups and 4% between the three genetic groups (Table 4).
To elucidate the clustering of 803 genotypes using the ADMIXTURE program, a varying number of subpopulations from K = 2 to 50 was plotted (Figure S1). This resulted in the most appropriate number of subpopulations at K = 2, 3, 4, and 10. Figure 4 shows the genetic relationships among the 803 genotypes as estimated ancestries (Q) from ADMIXTURE analysis, displayed as bar plots at K = 2, 3, 4, and 10. The summary of ADMIXTURE cluster composition at K = 2, 3, 4, and 10 is presented in Table S4. At K = 2, two major clusters were obtained consisting of 735 genotypes in Cluster 1 (Red) and 59 genotypes in Cluster 2 (Green) (Figure 4). Among the candidate values K = 2, 3, 4, and 10, the most appropriate were K = 4 and K = 10, which produced the lowest cross-validation error compared to other K values (Figure 4 and Figure S1). At K = 4, 119 genotypes were found to be admixed, consisting of 29 breeding lines, 82 landraces, and 8 market varieties. Similarly, at K = 10, the majority of the market varieties (17) were found to be admixed (Table S4).
The results of ADMIXTURE-based clustering at K = 4 were strongly supported by the topology of the distance-based phylogenetic tree (Figure S2). A major difference between the results of DAPC and ADMIXTURE clustering was the tendency of DAPC analysis to assign each genotype to a single cluster, whereas ADMIXTURE assigned admixed genotypes to multiple clusters based on K values (Tables S3 and S4).
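To give a sense of scale for the pairwise Fst values above, the following Python sketch computes a simple multi-locus fixation index between two populations from their allele frequencies, using Nei's G_ST form (H_T − H_S)/H_T. This is a simplified, unweighted estimator chosen for illustration; the source does not specify which Fst estimator was used, and the function name and the allele frequencies below are hypothetical.

```python
from typing import Sequence


def nei_gst(p1: Sequence[float], p2: Sequence[float]) -> float:
    """Multi-locus G_ST (an Fst analogue) between two populations.

    p1, p2: alternate-allele frequencies per biallelic locus.
    Uses the ratio of sums across loci and weights both populations equally.
    """
    if len(p1) != len(p2):
        raise ValueError("frequency vectors must have the same length")
    hs_sum = 0.0
    ht_sum = 0.0
    for a, b in zip(p1, p2):
        # Mean expected heterozygosity within the two populations.
        hs_sum += (2 * a * (1 - a) + 2 * b * (1 - b)) / 2
        # Expected heterozygosity of the pooled population.
        mean_p = (a + b) / 2
        ht_sum += 2 * mean_p * (1 - mean_p)
    if ht_sum == 0:
        return 0.0  # no polymorphism at any locus
    return (ht_sum - hs_sum) / ht_sum


# Hypothetical allele frequencies at two loci in two groups.
fst_example = nei_gst([0.1, 0.5], [0.3, 0.5])
print(round(fst_example, 3))
```

In this toy case a frequency difference of 0.2 at one of two loci yields a value of about 0.024, the same order of magnitude as the between-group estimates reported in the text; identical frequencies give 0 and fixed differences give 1.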

3.3. Hierarchical Clustering and Network Analysis

A phylogenetic tree was further generated that grouped 803 genotypes into four major clusters (Figure S2) with several subclusters within Cluster 1 (Green) and Cluster 2 (Red). Cluster 1 was the largest cluster consisting of a mixture of genebank landraces (400), market varieties (26), and breeding lines (311) corresponding to Cluster 3 of DAPC clustering at K = 3 and Clusters 3 and 4 of ADMIXTURE clustering at K = 4. Clusters 3 (Black) and 4 (Blue) were comparatively smaller groups consisting mainly of a few breeding lines and market varieties, respectively. In addition, a network analysis was carried out to assess the genetic relationship between the three genetic groups (breeding lines, genebank landraces, and market varieties). Although the phylogenetic tree grouped 803 genotypes into four distinct clusters, the grouping was unable to highlight any particular pattern for the three genetic groups under study. In contrast, the network analysis among these genetic groups (Figure 5) revealed a centralized structure and the genetic contribution of each genotype within a genetic group (genebank landraces, breeding lines, and market varieties). The network analysis between breeding lines and market varieties showed no direct genetic relationship among them, although some of the breeding lines may have been used directly as market varieties (Figure 5A), since farmers/traders tend to re-name the genotypes/breeding lines using their own naming system, which is associated with a popular market variety name. The naming system is also complex and depends upon the region within the country. For example, one of the popular market varieties in Nigeria is Hembakwase, which corresponds to several breeding lines (TDr 09/00023) and is associated with other market varieties such as Makakuasa and Omi_efun.
Similarly, market variety Alumaco_1 was found to be similar to genebank landrace TDr 3584 and other market varieties such as Alumaco_2, TDr_Adaka, and TDr_Idu_Ekpeye. In addition, market variety TDr_Ehuru was identical to the breeding line TDr 08/00628. The network analysis between breeding lines and genebank landraces (Figure 5B) showed strong genetic relationships, indicating the use of genebank landraces in the yam breeding program to generate the selected breeding lines. This can be further elucidated from Table S1, wherein the pedigree information of the breeding lines has been provided. The central core of the qgraph network (Figure 5B) represented a set of genebank landraces that were genetically similar and probably not used in the yam breeding program to generate breeding lines. Similarly, the network analysis between genebank landraces and market varieties indicated that these are genetically closer (Figure 5C). These findings highlighted the genetic relationships among different genetic groups of D. rotundata and the limited use of genebank landraces and market varieties in the breeding program.

4. Discussion

The present study dissected the genetic relationships between different genetic groups (breeding lines, genebank landraces, and market varieties) of white yam/white guinea yam (D. rotundata) so that diverse genetic materials can be utilized in the yam breeding program to introgress gene(s) of interest for its improvement. Despite several studies assessing the genetic diversity of white yam, little is known about its population structure or genetic diversity in contrast to other crops. A previous study [31] used a total of 94 landrace/gene bank accessions across seven guinea yam species to understand genetic diversity and their evolution. The allelic diversity, admixed patterns, and differential genome-wide population structure assayed by GBS-SNPs in three diverse genetic groups of D. rotundata further implied their efficacy in genomics-assisted breeding applications for white yam, which is one of the food security crops in the yam belt of West and Central Africa. The 3432 SNP markers identified in the current study were distributed across 21 pseudo chromosomes as per Tamiru et al. [37]. Sansaloni et al. [57] concluded in their study that the clustering of SNPs within certain regions across the genome is an issue arising from the reduced representation method used in developing the probes in GBS sequencing, which also results in low genome coverage.
This study elucidated that the majority of genetic variance exists within countries instead of between countries in the D. rotundata core collection. This was evident in DAPC clustering wherein the landraces from different countries were grouped together (Table S1). Several other studies, including one on cowpea, did not observe a significant correlation between molecular clustering and geographic origins of genebank landraces [58]. The DAPC method revealed more clusters than ADMIXTURE (Tables S3 and S4), while the latter method provided information on genotypes with ancestries. The DAPC approach relies on discriminant functions and maximizes the diversity between clusters and minimizes within-cluster diversity [44]. However, DAPC-based clustering was found to be less efficient in clonally propagated crops such as white yam because of their continuous and complex population structure. This has also been reported in other clonally propagated crops such as cassava [59]. In general, DAPC cluster membership assignment was in agreement with ADMIXTURE clusters. Admixed ancestry was observed among breeding lines, which probably reflected their complex breeding history involving open pollination and bi-parental crossing coupled with strong adaptive selection pressure [60]. The admixed ancestry observed in the genebank landraces could be due to complex domestication patterns of D. rotundata landraces during evolutionary divergence. It has been explicitly demonstrated that West Africa at the forest/savannah ecotone is the cradle of yam domestication [8,11]. Furthermore, the inclusion of diverse landraces as common parents in the yam improvement program to develop/breed for valuable agronomic traits and higher yield might have influenced their population group assignment, resulting in numerous admixtures among these breeding lines (Table S1).
In the present study, we have successfully unraveled the underlying genetic relationships and population structures among different genetic groups through network analysis. In the absence of complete pedigree records (across several years of breeding) or where breeding lines were selected from open-pollinated seeds, the dissection of genetic relationships among different genetic groups through network analysis was a reliable process. This has been successfully elucidated in cassava [59], wherein GBS-SNPs and complementary cluster analysis were used to assess the population structure and variety identification.
In clonal crops such as yams, improved varieties or breeding lines are often generated through intergenerational crosses (mainly bi-parental), which depend in turn on the flowering behavior of the parents used in those crosses across years. This has necessitated the use of the same parents with known flowering behavior in crossing programs at the Yam Breeding Unit, thus narrowing the genetic base used. The genetic diversity observed among genebank landraces was high based on the Shannon–Weaver index and the observed and expected heterozygosity, and this could be attributed to the extent of diversity captured within the core collection [37] and the percent of unique accessions (94%) (Table 2) identified in the present study. The genebank landraces were distributed across all the clusters generated through DAPC, ADMIXTURE, and phylogenetic tree, representing a significant amount of genetic diversity in the D. rotundata core collection. Hence, the diverse genebank landraces can be used as parents in white yam breeding programs, after the preliminary evaluation of flowering behavior and trait profiling, to broaden the genetic base. Furthermore, this study was able to identify the complex naming system followed by farmers/traders for market varieties that were independently collected from different markets in Nigeria and Ghana, resulting in discrepancies from synonymy and homonymy in the tracking of released breeding lines when relying on names alone. Therefore, there is a need for a broader study including market varieties from a wide range of markets from different regions within Nigeria and Ghana to establish the inconsistencies with varietal names and their effect on the formal seed system in yams.

5. Conclusions

In conclusion, our study on white yam/white guinea yam represented a larger set of genotypes representing different genetic groups such as genebank landraces/breeding lines/market varieties. The genetic relationships dissected among different genetic groups in this study could be further explored in the white yam improvement by identifying diverse parents to generate mapping populations for target traits. Further studies are clearly needed to introgress gene(s) of target traits for white yam improvement. This study confirmed the reliability and accuracy of high-density SNP markers generated from next-generation sequencing-based genotyping coupled with complementary statistical analyses for genetic diversity and population structure.

Supplementary Materials

The following are available online at https://www.mdpi.com/2073-4395/10/9/1437/s1, Table S1. List of genotypes analyzed with information on their name, respective genetic group, geographic origin/pedigree, and results of DAPC clustering. Table S2. Identity by state (IBS)-based dissimilarity matrix. Table S3. Summary of DAPC cluster groups at different K values. Figure S1. Determination of optimal number of ADMIXTURE clusters using cross-validation error rates for K = 2 to K = 50. Table S4. Summary of ADMIXTURE cluster groups at different K values. Figure S2. Phylogenetic tree consisting of 803 genotypes using 3432 SNPs.

Author Contributions

Conceptualization, R.B., A.L.-M., P.L.K., M.A. and R.A.; Data curation, P.A. (Paterne Agre), G.B. and D.D.K.; Formal analysis, P.A. (Paterne Agre) and G.B.; Funding acquisition, R.B.; Methodology, R.B., P.A. (Paterne Agre), A.L.-M. and P.L.K.; Supervision, R.B.; Validation, R.B.; Writing—original draft, R.B.; Writing—review and editing, R.B., P.A. (Paterne Agre), G.B., D.D.K., A.L.-M., P.L.K., M.A., P.A. (Patrick Adebola), A.A. and R.A. All authors have read and agreed to the published version of the manuscript.

Funding

The research study was partly funded through complementary funding from the CGIAR Coordinated Research Project on Roots, Tubers and Bananas (CRP-RTBs) and partly through a research grant (OPP1052998) to IITA.

Acknowledgments

The authors would like to thank Ibukun Ogunleye and other Bioscience Center staff at IITA for the collection of samples and DNA extraction. We acknowledge Andreas Gisel and Yusuf Muideen for their critical advice on statistical analyses. The authors would also like to thank the Boyce Thompson Institute (BTI) Ithaca, Cornell University for providing server space for raw data deposit.

Conflicts of Interest

The authors declare no conflict of interest.

Data Availability

The datasets generated during and/or analysed during the current study are available on yambase (https://yambase.org).

Abbreviations

BWA: Burrows–Wheeler aligner
DAPC: Discriminant analysis of principal components
DNA: Deoxyribonucleic acid
GATK: Genome analysis tool kit
GBS: Genotyping-by-sequencing
Ho: Observed heterozygosity
He: Expected heterozygosity
H′: Shannon–Weaver index
IBS: Identity-by-state
IGD: Institute for Genomic Diversity
IITA: International Institute of Tropical Agriculture
LD: Linkage disequilibrium
MAF: Minor allele frequency
NGS: Next-generation sequencing
PCA: Principal component analysis
PCR: Polymerase chain reaction
PIC: Polymorphic information content
SNP: Single nucleotide polymorphism
VCF: Variant call format

References

  1. Mignouna, H.D.; Abang, M.M.; Fagbemi, S.A. A comparative assessment of molecular marker assays (AFLP, RAPD and SSR) for white yam (Dioscorea rotundata) germplasm characterization. Ann. Appl. Biol. 2003, 142, 269–276.
  2. FAO. Food and Agriculture Organization of the United Nations Statistics Database, FAOSTAT. 2018. Available online: http://www.fao.org/faostat/en/#data/QC (accessed on 20 July 2020).
  3. Wilkin, P.; Schols, P.; Chase, M.W.; Chayamarit, K.; Furness, C.A.; Huysmans, S.; Rakotonasolo, F.; Smets, E.; Thapyai, C. A plastid gene phylogeny of the yam genus, Dioscorea: Roots, fruits and Madagascar. Syst. Bot. 2005, 30, 736–749.
  4. Burkill, I.H. The organography and the evolution of Dioscoreaceae, the family of the yams. Bot. J. Linn. Soc. 1960, 56, 319–412.
  5. Mignouna, H.D.; Dansi, A. Yam (Dioscorea spp.) domestication by the Nago and Fon ethnic groups in Benin. Genet. Resour. Crop Evol. 2003, 50, 519–528.
  6. Dumont, R.; Dansi, A.; Vernier, P.; Zoundjihékpon, J. Biodiversité et Domestication des Ignames en Afrique de l’Ouest. Pratiques Traditionnelles Conduisant à Dioscorea Rotundata; CIRAD, Ed.; Collection Repère: Montpellier, France, 2005.
  7. Zannou, A.; Agbicodo, E.; Zoundjihékpon, J.; Struik, P.C.; Ahanchédé, A.; Kossou, D.K.; Sanni, A. Genetic variability in yam cultivars from the Guinea-Sudan zone of Benin assessed by random amplified polymorphic DNA. Afr. J. Biotechnol. 2009, 8, 26–36.
  8. Scarcelli, N.; Cubry, P.; Akakpo, R.; Thuillet, A.-C.; Obidiegwu, J.; Baco, M.N.; Otoo, E.; Sonke, B.; Dansi, A.; Djedatin, G.; et al. Yam genomics supports west Africa as a major cradle of crop domestication. Sci. Adv. 2019, 5, eaaw1947.
  9. Dansi, A.; Mignouna, H.D.; Zoundjihékpon, J.; Sangare, A.; Asiedu, R.; Quin, F.M. Morphological diversity, cultivar groups and possible descent in the cultivated yams (Dioscorea cayenensis/Dioscorea rotundata complex) of Benin Republic. Genet. Resour. Crop Evol. 1999, 46, 371–388.
  10. Mignouna, H.D.; Dansi, A.; Zok, S. Morphological and isozymic diversity of the cultivated yams (Dioscorea cayenensis/Dioscorea rotundata complex) of Cameroon. Genet. Resour. Crop Evol. 2002, 49, 21–29.
  11. Bressan, E.A.; Briner Neto, T.; Zucchi, M.I.; Rabello, R.J.; Veasey, E.A. Genetic structure and diversity in the Dioscorea cayenensis/D. rotundata complex revealed by morphological and isozyme markers. Genet. Mol. Res. 2014, 13, 425–437.
  12. Dansi, A.; Mignouna, H.D.; Zoundjihékpon, J.; Sangaré, A.; Asiedu, R.; Ahoussou, N. Using isozyme polymorphism to assess genetic variation within cultivated yams (Dioscorea cayenensis/Dioscorea rotundata complex) of the Republic of Benin. Genet. Resour. Crop Evol. 2000, 47, 371–383.
  13. Scarcelli, N.; Tostain, S.; Mariac, C.; Agbangla, C.; Da, O.; Berthaud, J.; Pham, J.-L. Genetic nature of yams (Dioscorea spp.) domesticated by farmers in Benin (West Africa). Genet. Resour. Crop Evol. 2006, 53, 121–130.
  14. Mignouna, H.D.; Abang, M.M.; Wanyera, N.W.; Chikaleke, V.A.; Asiedu, R.; Thottapally, G. PCR marker-based analysis of wild and cultivated yams (Dioscorea spp.) in Nigeria: Genetic relationships and implications for ex situ conservation. Genet. Resour. Crop Evol. 2005, 52, 755–763.
  15. Dansi, A.; Mignouna, H.D.; Zoundjihékpon, J.; Sangaré, A.; Ahoussou, N.; Asiedu, R. Identification of some Benin Republic’s Guinea yam (Dioscorea cayenensis/Dioscorea rotundata complex) cultivars using randomly amplified polymorphic DNA. Genet. Resour. Crop Evol. 2000, 47, 619–625.
  16. Terauchi, R.; Chikalele, V.A.; Thottapally, G.; Hahn, S.K. Origin and phylogeny of guinea yams as revealed by RFLP analysis of chloroplast DNA and nuclear ribosomal DNA. Theor. Appl. Genet. 1992, 83, 743–751.
  17. Obidiegwu, J.E.; Kolesnikova-Allen, M.; Ene-Obong, E.; Muoneke, C.; Asiedu, R. SSR markers reveal diversity in Guinea yam (Dioscorea cayenensis/D. rotundata) core set. Afr. J. Biotechnol. 2009, 8, 2730–2739.
  18. Loko, L.Y.; Bhattacharjee, R.; Agre, A.P.; Dossou-Aminon, I.; Orobiyi, A.; Djedatin, G.; Dansi, A. Genetic diversity and relationship of Guinea yam (Dioscorea cayenensis Lam.–D. rotundata Poir. complex) germplasm in Benin (West Africa) using microsatellite markers. Genet. Resour. Crop Evol. 2017, 64, 1205–1219.
  19. Wendawek, A.M.; Demissew, S.; Fay, M.F.; Smith, R.J.; Nordal, I.; Wilkin, P. Genetic diversity and population structure of guinea yams and their wild relatives in south and south west Ethiopia as revealed by microsatellite markers. Genet. Resour. Crop Evol. 2013, 60, 529–541.
  20. Tostain, S.; Agbangla, C.; Scarcelli, N.; Mariac, C.; Berthaud, J.; Pham, J.-L. Genetic diversity analysis of yam cultivars (Dioscorea rotundata Poir.) in Benin using simple sequence repeat (SSR) markers. Plant Genet. Resour. 2007, 5, 71–81.
  21. Harikumar, P.; Sheela, M.N. Genetic diversity in white yam (Dioscorea rotundata Poir.) using random amplified polymorphic DNA (RAPD) markers. Indian J. Pure Appl. Biosci. 2019, 7, 30–35.
  22. Spindel, J.; Wright, M.; Chen, C.; Cobb, J.; Gage, J.; Harrington, S.; Lorieux, M.; Ahmadi, N.; McCouch, S. Bridging the genotyping gap: Using genotyping by sequencing (GBS) to add high-density SNP markers and new value to traditional bi-parental mapping and breeding populations. Theor. Appl. Genet. 2013, 126, 2699–2716.
  23. Rowe, H.C.; Renaut, S.; Guggisberg, A. RAD in the realm of next-generation sequencing technologies. Mol. Ecol. 2011, 20, 3499–3502.
  24. Deschamps, S.; Llaca, V.; May, G.D. Genotyping-by-sequencing in plants. Biology 2012, 1, 460–483.
  25. Poland, J.A.; Rife, T.W. Genotyping-by-Sequencing for Plant Breeding and Genetics. Plant Genome 2012, 5, 92–102. Available online: https://dl.sciencesocieties.org/publications/tpg/abstracts/5/3/92 (accessed on 2 September 2020).
  26. He, J.; Zhao, X.; Laroche, A.; Lu, Z.-X.; Liu, H.; Li, Z. Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding. Front. Plant Sci. 2014, 5, 484.
  27. Hamblin, M.T.; Rabbi, I.Y. The effects of restriction-enzyme choice on properties of genotyping-by-sequencing libraries: A study in cassava (Manihot esculenta). Crop Sci. 2014, 54, 2603–2608.
  28. Elshire, R.J.; Glaubitz, J.C.; Sun, Q.; Poland, J.A.; Kawamoto, K.; Buckler, E.S.; Mitchell, S.E. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS ONE 2011, 6, e19379.
  29. Lu, F.; Lipka, A.E.; Glaubitz, J.; Elshire, R.J.; Cherney, J.H.; Casler, M.D.; Buckler, E.S.; Costich, D.E. Switchgrass genomic diversity, ploidy, and evolution: Novel insights from a network-based SNP discovery protocol. PLoS Genet. 2013, 9, e1003215.
  30. Eaton, D.A.; Ree, R.H. Inferring phylogeny and introgression using RADseq data: An example from flowering plants (Pedicularis: Orobanchaceae). Syst. Biol. 2013, 62, 689–706.
  31. Girma, G.; Hyma, K.E.; Asiedu, R.; Mitchell, S.E.; Gedil, M.; Spillane, C. Next-generation sequencing based genotyping, cytometry and phenotyping for understanding diversity and evolution of guinea yams. Theor. Appl. Genet. 2014, 127, 1783–1794.
  32. Saski, C.A.; Bhattacharjee, R.; Scheffler, B.E.; Asiedu, R. Genomic resources for water yam (Dioscorea alata L.): Analyses of EST-sequences, de novo sequencing and GBS libraries. PLoS ONE 2015, 10, e0134031.
  33. Cormier, F.; Mournet, P.; Causse, S.; Arnau, G.; Maledon, E.; Gomez, R.-M.; Pavis, C.; Chair, H. Development of a cost-effective single nucleotide polymorphism genotyping array for management of greater yam germplasm collections. Ecol. Evol. 2019, 9, 5617–5636.
  34. Agre, P.; Asibe, F.; Darkwa, K.; Edemondu, A.; Bauchet, G.; Asiedu, R.; Adebola, P.; Asfaw, A. Phenotypic and molecular assessment of genetic structure and diversity in a panel of winged yam (Dioscorea alata) clones and cultivars. Sci. Rep. 2019, 9, 18221.
  35. Siadjeu, C.; Mayland-Quellhorst, E.; Albach, D.C. Genetic diversity and population structure of trifoliate yam (Dioscorea dumetorum Kunth) in Cameroon revealed by genotyping-by-sequencing (GBS). BMC Plant Biol. 2018, 18, 359.
  36. Girma, G.; Bhattacharjee, R.; Lopez-Montes, A.; Gueye, B.; Ofodile, S.; Franco, J.; Abberton, M. Redefining the yam (Dioscorea spp.) core collection using morphological traits. Plant Genet. Resour. 2018, 16, 193–200.
  37. Tamiru, M.; Natsume, S.; Takagi, H.; White, B.; Yaegashi, H.; Shimizu, M.; Yoshida, K.; Uemura, A.; Oikawa, K.; Abe, A.; et al. Genome Sequencing of the Staple Food Crop White Guinea Yam Enables the Development of a Molecular Marker for Sex Determination. BMC Biol. 2017, 15, 86. Available online: https://www.ncbi.nlm.nih.gov/assembly/GCA_002240015.2 (accessed on 2 September 2020).
  38. Li, H.; Durbin, R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics 2010, 26, 589–595.
  39. Available online: https://software.broadinstitute.org/gatk/best-practices/bp_3step.php?case=GermShortWGS (accessed on 2 September 2020).
  40. Danecek, P.; Auton, A.; Abecasis, G.; Albers, C.A.; Banks, E.; DePristo, M.A.; Handsaker, R.E.; Lunter, G.; Marth, G.T.; Sherry, S.T.; et al. The variant call format and VCFtools. Bioinformatics 2011, 27, 2156–2158.
  41. Purcell, S.; Neale, B.; Todd-Brown, K.; Thomas, L.; Ferreira, M.A.R.; Bender, D.; Maller, J.; Sklar, P.; de Bakker, P.I.W.; Daly, M.J.; et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 2007, 81, 559–575.
  42. Alexander, D.H.; Lange, K. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinform. 2011, 12, 246.
  43. Alexander, D.H.; Novembre, J.; Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009, 19, 1655–1664.
  44. Jombart, T.; Devillard, S.; Balloux, F. Discriminant analysis of principal components: A new method for the analysis of genetically structured populations. BMC Genet. 2010, 11, 94.
  45. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018; Available online: https://www.R-project.org/ (accessed on 2 September 2020).
  46. Hill, M.O. Diversity and evenness: A unifying notation and its consequences. Ecology 1973, 54, 427–432.
  47. Oksanen, J.; Kindt, R.; Legendre, P.; O′Hara, B.; Simpson, G.L. The vegan package. Community Ecol. Package 2007, 10, 631–637.
  48. Noli, E.; Teriaca, M.S.; Conti, S. Criteria for the definition of similarity thresholds for identifying essentially derived varieties. Plant Breed. 2013, 132, 525–531.
  49. Epskamp, S.; Cramer, O.J.C.; Waldorp, L.J.; Schmittmann, V.D.; Borsboom, D. qgraph: Network Visualizations of Relationships in Psychometric Data. J. Stat. Software 2012, 48, 1–18. Available online: http://www.jstatsoft.org/v48/i04/ (accessed on 2 September 2020).
  50. Jombart, T. adegenet: A R package for the multivariate analysis of genetic markers. Bioinformatics 2008, 24, 1403–1405.
  51. Parida, S.K.; Mukerji, M.; Singh, A.K.; Singh, N.K.; Mohapatra, T. SNPs in stress-responsive rice genes: Validation, genotyping, functional relevance and population structure. BMC Genom. 2012, 13, 426.
  52. Nimmakayala, P.; Levi, A.; Abburi, L.; Abburi, V.L.; Tomason, Y.R.; Saminathan, T.; Vajja, V.G.; Ma, S.A.; Reddy, R.; Wehner, T.C.; et al. diversity, linkage disequilibrium, and selective sweeps in cultivated watermelon. BMC Genom. 2014, 15, 767.
  53. Liu, K.; Muse, S.V. PowerMarker: Integrated analysis environment for genetic marker data. Bioinformatics 2005, 21, 2128–2129.
  54. Kumar, S.; Stecher, G.; Li, M.; Knyaz, C.; Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 2018, 35, 1547–1549.
  55. Available online: https://sniplay.southgreen.fr/cgi-bin/snp_statistics.cgi?session=3036277183304&result=result (accessed on 2 September 2020).
  56. Sartie, A.; Asiedu, R.; Franco, J. Genetic and phenotypic diversity in a germplasm working collection of cultivated tropical yams (Dioscorea spp.). Genet. Resour. Crop Evol. 2012, 59, 1753–1765.
  57. Sansaloni, C.P.; Petroli, C.D.; Carling, J.; Hudson, C.J.; Steane, D.A.; Myburg, A.A.; Grattapaglia, D.; Vaillancourt, R.E.; Kilian, A. A high-density Diversity Arrays Technology (DArT) microarray for genome-wide genotyping in Eucalyptus. Plant Methods 2010, 6, 16.
  58. Xiong, H.; Shi, A.; Mou, B.; Qin, J.; Motes, D.; Lu, W.; Ma, J.; Weng, Y.; Yang, W.; Wu, D. Genetic diversity and population structure of cowpea (Vigna unguiculata L. Walp). PLoS ONE 2016, 11, e0160941.
  59. Rabbi, I.Y.; Kulakow, P.A.; Manu-Aduening, J.A.; Dankyi, A.A.; Asibou, J.Y.; Parkes, E.Y.; Abdoulaye, T.; Girma, G.; Gedil, M.A.; Ramu, P.; et al. Tracking crop varieties using genotyping-by-sequencing markers: A case study using cassava (Manihot esculenta Crantz). BMC Genet. 2015, 16, 115.
  60. Darkwa, K.; Olasanmi, B.; Asiedu, R.; Asfaw, A. Review of empirical and emerging breeding methods and tools for yam (Dioscorea spp.) improvement: Status and prospects. Plant Breed. 2020, 139, 474–497.
Figure 1. Distribution and density of filtered single nucleotide polymorphisms (SNPs) across 21 pseudo chromosomes, as suggested by Tamiru et al. [37]. The horizontal axis displays the chromosome length. The number of SNPs in a given region is indicated at the bottom right.
Figure 2. Transition and transversion based on bi-allelic SNP markers. Tv: Transversions; Ts: Transitions; A: Adenine; T: Thymine; G: Guanine; C: Cytosine. Chart developed using SNIPLAY software [55].
Figure 3. (A) Number of clusters vs. Bayesian Information Criterion (BIC). The x-axis shows the candidate numbers of clusters for the population, and the y-axis shows the BIC value associated with each number of clusters. (B–F) Discriminant analysis of principal components (DAPC) with K = 2 (B), K = 3 (C), K = 4 (D), K = 5 (E), and K = 12 (F). The axes represent the first two linear discriminants. Each color represents a cluster and each dot represents an individual.
Figure 4. Population structure with K = 2, 3, 4, and 10 for 803 Dioscorea genotypes using 3432 SNPs. The genotypes represented by vertical bars along the horizontal axis were classified into K color segments based on their estimated membership fraction in each K cluster. Accessions on the x-axis were sorted in the same order for each K.
Figure 5. The genetic networks obtained using QGRAPH. The figure represents networks for all genetic groups with the node size depicting genetic relationships among different genotypes based on the observed heterozygosity and allelic richness. (A) Network analysis between breeding lines and market varieties; (B) Network analysis between breeding lines and genebank landraces; (C) Network analysis between genebank landraces and market varieties.
Table 1. SNP marker summary statistics across twenty-one chromosomes.
Chromosome      No. of SNPs   PIC     MAF     Ho      He
1               81            0.143   0.119   0.179   0.172
2               144           0.131   0.105   0.154   0.156
3               147           0.147   0.121   0.173   0.176
4               271           0.146   0.123   0.184   0.175
5               380           0.135   0.112   0.168   0.161
6               179           0.122   0.098   0.138   0.144
7               117           0.133   0.104   0.156   0.157
8               273           0.151   0.126   0.184   0.180
9               110           0.132   0.113   0.165   0.159
10              123           0.152   0.126   0.190   0.182
11              90            0.130   0.090   0.139   0.133
12              118           0.123   0.106   0.162   0.154
13              106           0.121   0.093   0.134   0.142
14              185           0.127   0.102   0.154   0.150
15              165           0.146   0.122   0.181   0.175
16              193           0.120   0.098   0.139   0.142
17              186           0.151   0.131   0.182   0.182
18              153           0.127   0.103   0.161   0.150
19              188           0.159   0.133   0.190   0.191
20              120           0.133   0.106   0.172   0.158
21              103           0.124   0.097   0.142   0.145
Total/Average   3432          0.135   0.111   0.165   0.161
PIC: polymorphic information content; MAF: minor allele frequency; Ho: observed heterozygosity; He: expected heterozygosity.
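The per-marker quantities summarized in Table 1 can be illustrated with a minimal Python sketch for a single bi-allelic SNP. It uses the textbook formulas (He = 2pq and the usual PIC expression reduced to two alleles); the software and exact settings behind Table 1 are not described in this section, so the function names, the 0/1/2 genotype coding, and the toy calls below are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def minor_allele_frequency(calls: Sequence[Optional[int]]) -> float:
    """MAF from genotype calls coded as alternate-allele counts (0, 1, 2).

    None marks a missing call and is ignored (assumed coding, not the paper's).
    """
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no non-missing genotype calls")
    alt_freq = sum(scored) / (2 * len(scored))
    return min(alt_freq, 1.0 - alt_freq)


def observed_heterozygosity(calls: Sequence[Optional[int]]) -> float:
    """Ho: share of heterozygous calls (code 1) among non-missing calls."""
    scored = [c for c in calls if c is not None]
    if not scored:
        raise ValueError("no non-missing genotype calls")
    return sum(1 for c in scored if c == 1) / len(scored)


def expected_heterozygosity(maf: float) -> float:
    """He for a bi-allelic locus: 1 - (p^2 + q^2) = 2pq."""
    p, q = maf, 1.0 - maf
    return 2.0 * p * q


def pic_biallelic(maf: float) -> float:
    """PIC for two alleles: 1 - (p^2 + q^2) - 2 * p^2 * q^2."""
    p, q = maf, 1.0 - maf
    return 1.0 - (p * p + q * q) - 2.0 * p * p * q * q


if __name__ == "__main__":
    # Hypothetical calls for one SNP across eight genotypes (one missing).
    toy_calls = [0, 0, 1, 0, 2, None, 0, 1]
    maf = minor_allele_frequency(toy_calls)
    print("MAF:", round(maf, 3))
    print("Ho: ", round(observed_heterozygosity(toy_calls), 3))
    print("He: ", round(expected_heterozygosity(maf), 3))
    print("PIC:", round(pic_biallelic(maf), 3))
```

Because He and PIC are nonlinear in allele frequency, plugging an average MAF into these formulas will generally not reproduce an average of per-SNP He or PIC values such as those reported per chromosome.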
Table 2. Comparison of genetic diversity parameters among the three genetic groups of D. rotundata used in the study.
Summary Statistics
Genotypes            PIC     MAF     Ho      He
All                  0.135   0.111   0.165   0.161
Breeding lines       0.126   0.106   0.156   0.151
Genebank landraces   0.141   0.115   0.173   0.168
Market varieties     0.117   0.100   0.157   0.142
Genetic Diversity Parameters
Genotypes            H’      Simpson   Inverse Simpson   Pielou’s Evenness
All                  6.666   0.998     767               0.149
Breeding lines       5.736   0.996     308               0.173
Landraces            5.968   0.997     434               0.163
Market varieties     3.242   0.960     25                0.294
PIC: polymorphic information content; MAF: minor allele frequency; Ho: observed heterozygosity; He: expected heterozygosity; H’: Shannon–Weaver index.
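For readers unfamiliar with the indices named in Table 2, the following Python sketch gives their textbook definitions for a vector of category counts. It is not an attempt to reproduce the values in Table 2: the input data and the exact conventions used by the authors (for example, how evenness was scaled) are not given in this section, and the function name and toy counts are hypothetical.

```python
import math
from typing import Dict, Sequence


def diversity_indices(counts: Sequence[float]) -> Dict[str, float]:
    """Textbook diversity indices from category counts.

    shannon          H' = -sum(p_i * ln p_i)
    simpson          1 - sum(p_i^2)
    inverse_simpson  1 / sum(p_i^2)
    pielou_evenness  H' / ln(S), with S the number of non-empty categories
    """
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    props = [c / total for c in positive]
    shannon = -sum(p * math.log(p) for p in props)
    sum_sq = sum(p * p for p in props)
    richness = len(props)
    # Evenness is undefined for a single category (ln 1 = 0).
    evenness = shannon / math.log(richness) if richness > 1 else float("nan")
    return {
        "shannon": shannon,
        "simpson": 1.0 - sum_sq,
        "inverse_simpson": 1.0 / sum_sq,
        "pielou_evenness": evenness,
    }


if __name__ == "__main__":
    # Hypothetical counts for five categories; not data from the study.
    toy_counts = [40, 25, 20, 10, 5]
    for name, value in diversity_indices(toy_counts).items():
        print(f"{name}: {value:.3f}")
```

Under these definitions the Simpson index and its inverse carry the same information (Simpson = 1 − 1/inverse Simpson), which is why the two columns rise and fall together.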
Table 3. Pairwise fixation index (Fst) values among the genetic groups.
Fst-Based Genetic Groups
                     Breeding Lines   Market Varieties   Genebank Landraces
Breeding Lines       0.000
Market Varieties     0.031            0.000
Genebank Landraces   0.038            0.024              0.000
Table 4. Analysis of molecular variance (AMOVA) within/among genetic groups.
Source of Variation     df    SS           MS        Est. Var.   %
Among genetic groups    2     7051.37      3525.68   15.13       4
Within genetic groups   800   318,465.22   398.08    398.08      96
Total                   802   325,516.59   -         413.21      100
df: degrees of freedom; SS: sum of squares; MS: mean square; Est. Var.: estimated variance; %: percentage of variation.
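The arithmetic linking the columns of Table 4 can be checked with a short Python sketch: each mean square is the sum of squares divided by its degrees of freedom, and each percentage is that row's estimated variance divided by the total estimated variance. The numbers are copied from Table 4; the dictionary layout and function names are illustrative assumptions, and the sketch does not re-derive the variance components themselves, which would require the group sizes.

```python
from typing import Dict

# Values copied from Table 4 (df, SS, and estimated variance per source).
AMOVA_ROWS: Dict[str, Dict[str, float]] = {
    "among": {"df": 2, "ss": 7051.37, "est_var": 15.13},
    "within": {"df": 800, "ss": 318465.22, "est_var": 398.08},
}


def mean_square(ss: float, df: float) -> float:
    """MS = SS / df."""
    return ss / df


def percent_of_variation(rows: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Share of the total estimated variance attributed to each source."""
    total = sum(row["est_var"] for row in rows.values())
    return {name: 100.0 * row["est_var"] / total for name, row in rows.items()}


if __name__ == "__main__":
    for name, row in AMOVA_ROWS.items():
        print(name, "MS =", round(mean_square(row["ss"], row["df"]), 2))
    for name, pct in percent_of_variation(AMOVA_ROWS).items():
        print(name, "% of variation =", round(pct, 1))
```

Rounded to whole numbers, the two shares match the 4% among-group and 96% within-group split reported in the table and the abstract.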

Share and Cite

MDPI and ACS Style

Bhattacharjee, R.; Agre, P.; Bauchet, G.; De Koeyer, D.; Lopez-Montes, A.; Kumar, P.L.; Abberton, M.; Adebola, P.; Asfaw, A.; Asiedu, R. Genotyping-by-Sequencing to Unlock Genetic Diversity and Population Structure in White Yam (Dioscorea rotundata Poir.). Agronomy 2020, 10, 1437. https://doi.org/10.3390/agronomy10091437
