G3, 2025, 15(7), jkaf113
https://doi.org/10.1093/g3journal/jkaf113
Advance Access Publication Date: 23 May 2025
Plant Genetics and Genomics

Whole-genome resequencing of a global collection of Napier grass (Cenchrus purpureus) to explore global population structure and QTL governing yield and feed quality traits

Abel Teshome,1 Hailu Lire,2 Janet Higgins,3 Temesgen Magule Olango,4 Ermias Haile Habte,1 Alemayehu Teressa Negawo,1 Meki Shehabu Muktar,1 Yilikal Assefa,1 Jorge Fernando Pereira,5 Ana Luisa Sousa Azevedo,5 Juarez Campolina Machado,5 Desterio Nyamongo,6 Jiyu Zhang,7 Yan Qi,7 William Anderson,8 Jose De Vega,3,* Christopher Stephen Jones,9,*

1Feed and Forage Development, International Livestock Research Institute, P.O. Box 5689, Addis Ababa, Ethiopia
2Ethiopian Institute of Agricultural Research, Wondogenet Agricultural Research Centre, P.O. Box 2003, Wondogenet, Ethiopia
3Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK
4Almaviva, Via di Casal Boccone, 188-190, Rome 00137, Italy
5Embrapa Dairy Cattle, Juiz de Fora 36038-330, Brazil
6Kenya Agricultural and Livestock Research Organization, P.O. Box 57811, Nairobi 00200, Kenya
7State Key Laboratory of Grassland Agro-Ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
8Crop Genetics and Breeding Research Unit, USDA-ARS, 115 Coastal Ways, Tifton, GA 31793, USA, retired
9Feed and Forage Development, International Livestock Research Institute, Box 30709, Nairobi 00100, Kenya

*Corresponding author: Feed and Forage Development, International Livestock Research Institute, Box 30709, Nairobi 00100, Kenya. Email: c.s.jones@cgiar.org
*Corresponding author: Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK. Email: jose.de-vega@earlham.ac.uk

Napier grass (Cenchrus purpureus) is a C4 perennial grass species native to Sub-Saharan Africa and widely used as livestock feed in the region.
In this study, we sequenced the genomes of 450 Napier grass individuals from 18 countries, identifying over 170 million DNA variants (SNPs and indels). Approximately 1% of these SNPs were informative and used to assess genetic diversity within the collection. Our resequencing study provided valuable insights into the global genetic diversity of Napier grass. Additionally, a genome-wide association study on 2 independent populations identified multiple quantitative trait loci significantly associated with key agronomic traits, including biomass yield, nitrogen, and cellulose content. These findings serve as a crucial resource for preserving and understanding Napier grass genetic diversity in the context of climate change. Moreover, they will support genomics-based breeding programs aimed at developing high-yielding and drought-tolerant varieties for forage and biofuel production.

Keywords: agronomic traits; drought; GWAS; elephant grass; Pennisetum purpureum; nutritional traits; QTL; SNP markers; underutilized crop; plant genetics and genomics

Received on 09 December 2024; accepted on 29 April 2025
© The Author(s) 2025. Published by Oxford University Press on behalf of The Genetics Society of America. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Introduction

Globally, grasslands cover 26% of the land area and 70% of agricultural land, and play an important role as livestock feed, particularly in Sub-Saharan Africa (SSA) (FAO 2010). In SSA, popular grasses include Cenchrus, Urochloa, and Megathyrsus species; these grasses are critically important for smallholders and are frequently used by women to maintain livestock production systems (Njuki and Sanginga 2013; Simeão et al. 2021).
Unfortunately, annual milk and meat production in SSA remains low compared to the global average (Balehegn et al. 2021). One of the significant reasons behind the below-par productivity of the livestock industry is inadequate access to quality feeds and forages, worsened recently by the risks associated with climate change (Balehegn et al. 2020; Paul et al. 2020). Most small-scale livestock farmers in SSA rely heavily on natural common grazing lands as their primary source of forage and feed supply, which is mainly available during the rainy seasons (Hanan and Kahiu 2016). Unfortunately, such grazing lands are dwindling because of the inevitable population increase, climate change, and more land being allocated for food crops (Tolera 2007; Smith et al. 2013; Enahoro et al. 2019). Consequently, livestock farmers are now more in need of productive, high-quality, and resilient forage varieties to support their livestock.

Napier grass or Elephant grass [Cenchrus purpureus (Schumach.) Morrone syn. Pennisetum purpureum Schumach.] is a crucial traditional forage species in SSA, growing mainly up to 2,000 m above sea level in the tropics (Habte et al. 2020; Mkutche 2020). It is primarily used to feed cattle in cut-and-carry feeding systems in Ethiopia, Kenya, Uganda, Tanzania, and Nigeria (Mwendia et al. 2006), because of its low cost of production, year-round availability under limited irrigation, and some degree of resilience against drought (Habte et al. 2020; Muktar et al. 2022).
Due to its high biomass yield and desirable nutritional traits, Napier grass has recently garnered interest as a candidate for bio-based products and biofuels in tropical and semitropical regions of the world, such as the USA and Brazil (Anderson et al. 2008; Rocha et al. 2019; Sawasdee and Pisutpaisal 2021). Once established in the main production field, Napier grass can grow and be maintained for decades under good management practices, yielding up to 50 tons of dry matter (DM) ha−1 per year (Habte et al. 2020; Dokbua et al. 2021). Because of its adaptability, persistence, and versatility, it has been naturalized to Central and South America, the tropical parts of Asia, Australia, the Middle East, and the Pacific Islands (Fukagawa and Isshi 2018). If research is focused on this species, Napier grass has the potential to be included in the mainstream feed chain, particularly in the tropics; it can also contribute to energy requirements for current and future generations.

Unfortunately, Napier grass remains an underutilized crop with limited genetic and genomic tools developed to date and few cultivars available for farmers. The first reference genome was reported in 2020 (Yan et al. 2021) and a second, improved one was reported in 2022 (Zhang et al. 2022). The availability of these reference genomes facilitates the generation of molecular markers by elucidating their genomic positions. Here, we report on a species-level whole-genome sequencing (WGS) study of a global collection of 450 genotypes. We analyzed the collection's diversity to explore how breeding, selection, and environmental pressures have shaped the Napier grass genome in the international collections. We also analyzed the genomic regions associated with important agronomic traits, such as fresh biomass yield and plant height, and nutritional feed-quality traits, such as crude protein content.
Thus, the genomic tools developed in this study will enable forage breeders to apply advanced plant breeding procedures such as genomic selection and marker-assisted breeding, which have been lacking to date for Napier grass. Furthermore, new perspectives from the study should benefit conservation efforts worldwide.

Materials and methods

Napier grass field evaluation

Phenotype data were assessed from 2 collections of Napier grass genotypes, which were independently evaluated in this study. The first collection consisted of 84 genotypes conserved at the International Livestock Research Institute (ILRI) genebank, which were evaluated in Bishoftu, Ethiopia, for 2 consecutive years in a P-rep design, replicated twice. Details of the field evaluation have previously been reported (Habte et al. 2020; Muktar et al. 2022). Briefly, 84 genotypes (72 unique and 12 check genotypes) were arranged into 4 blocks with 2 replicates. The 12 check genotypes were duplicated in each block, while the remaining 72 genotypes appeared only once. The first 2 blocks (first replicate) were subjected to a volumetric soil water content (VWC) of approximately 20%, referred to as moderate water stress (MWS). In contrast, the other 2 blocks received less water, resulting in a VWC of about 10%, referred to as severe water stress (SWS).

The second collection of 91 Napier grass genotypes was evaluated at the Embrapa Dairy Cattle experimental field, located in Brazil, and 5 cuttings were conducted between 2014 and 2016 in both wet and dry seasons. Rocha et al. (2019) described in detail these evaluations, which were done in natural conditions. Nine genotypes (BA17, BA30, BA34, BA53, BA81, BA86, BA93, BA97, and Pioneiro [released cultivar]) were shared between these 2 trials. In both trials, the planting materials used were clonally propagated from original mother plants conserved in situ.
Throughout the experimental period, the plants were harvested every 6–8 weeks, and no flowering was observed.

Phenotyping of agronomic and feed quality traits

The following agronomic traits were measured for the trial carried out in Ethiopia: leaf length (LL, mm), leaf width (LW, mm), leaf-to-stem ratio (LSR), stem thickness (ST, mm), tiller number (TN), and biomass yield (total fresh weight [TFW, g] and total dry weight [TDW, g]). Data were collected as described previously in Habte et al. (2022). Water use efficiency (WUE) was also calculated by dividing the TDW per plant by the total volume of irrigated water applied to each plant during the dry season. Likewise in Brazil, plant height (PH, m), production of TFW (Mg ha−1), production of TDW (Mg ha−1), and DM concentration (%) were scored. Furthermore, 8 feed-quality nutritional traits, namely acid detergent fiber (ADF), neutral detergent fiber (NDF), lignin (LIG), cellulose (CEL), hemicellulose (HCEL), in vitro dry matter digestibility (IVDMD), ash (ASH), and nitrogen content (NIT), were also scored in Brazil. The DM concentration recorded for agronomic traits was used as a common denominator for estimating biomass digestibility. Further details can be found in Rocha et al. (2019).

Phenotypic data analysis

Collected phenotypic values for each trait were checked for normal distribution and transformed, when needed, ahead of variance comparison using the bestNormalize R package (Peterson 2018). Phenotypic variability between genotypes was calculated with R statistical software (R Core Team 2022) using the model E·R + E·R·B + G + G·E, where E, R, B, and G denote environment, replicate, incomplete block, and genotype, respectively. Environment effects and replicate effects nested within the environment, both represented by (E·R), were taken as fixed.
In contrast, the block effect nested within the replicate and environment (E·R·B) and the genotype-by-environment interaction (G·E) were taken as random. The main genotype effect (G) was also taken as random. Analysis of variance (ANOVA) and multiple comparison tests (LSD) were conducted at a probability level of 5%. Furthermore, phenotypic data were used to carry out hierarchical clustering and principal component analysis (PCA) using the Factoextra R package (Kassambara 2017). Correlation analysis was also carried out between all the variables measured in the field evaluation in Brazil.

Sequenced worldwide Napier grass collection

A total of 450 Napier grass genotypes were sequenced and deposited as bioproject PRJEB73794: 61 from the ILRI genebank, 131 from Embrapa, 23 from the USDA, 6 from China (Lanzhou University), 118 from the Kenya Agricultural and Livestock Research Organization, and 2 released cultivars, namely Super Napier (G1) and Pioneiro (PION). In addition, 109 progeny plants, generated from seeds collected from 14 ILRI genotypes (mother plants), were sequenced. The progenies were from open-pollinated plants in the field, and the pollen donor genotypes were unknown. All mother plants were represented by 6–10 progenies except for 1 mother plant (IL18438), which was represented by a single progeny. More information about these genotypes can be found in Supplementary Table 1: metadata.

DNA extraction and sequencing

Young leaf tissue was collected from the respective genotypes, and genomic DNA was isolated following the procedure described in the Qiagen DNeasy Plant Mini kit (250) (Qiagen Inc., Valencia, CA, USA).
Before library preparation, DNA quality was checked on 1% agarose gels, DNA purity was checked using a Nanophotometer spectrophotometer (IMPLEN, CA, USA), and DNA concentration was measured using the Qubit DNA Assay Kit in a Qubit 2.0 Fluorometer (Life Technologies, CA, USA). High-quality DNA with a minimum concentration of 50 ng/µL was used for Illumina WGS. The genotypes were sequenced using Illumina technology with paired-end 2 × 150 bp short reads. A total of 4.92 Tb of data were generated, with an average sequencing depth of 15–20× per sample. Library preparation and sequencing were conducted by Novogene (https://en.novogene.com).

Read mapping, SNP calling, and filtering

The quality of raw reads was checked using the FastQC (Andrews 2010) and MultiQC tools (Ewels et al. 2016). Afterwards, raw reads were trimmed and filtered with the Trimmomatic tool (Bolger et al. 2014) to remove Illumina TruSeq adapter remnant sequences, as well as low-quality reads (with a quality score lower than 30). Curated reads were mapped against the Napier grass reference genome with the Burrows-Wheeler Aligner (BWA) (Li and Durbin 2009). The SAM files generated from the BWA step were converted into sorted BAM files using SAMtools (Li et al. 2009). The HaplotypeCaller tool from the Genome Analysis Toolkit (GATK v4.4) was used for the variant calling step with default parameters (McKenna et al. 2010). The VCF file generated in the variant calling step was filtered and pruned with BCFtools (v.1.9) (Li et al. 2009). The SNP filtering process retained biallelic and polymorphic loci with read depths between 10 and 300, genotype quality (GQ) above 20, a minor allele frequency above 0.05, and missing data in less than 1% of the samples. After filtering, 1,068,685 SNPs were retained for downstream analysis.
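The filtering criteria listed above can be illustrated with a minimal Python sketch. It is not the BCFtools command used in the study: the `Site` record, its field names, and the choice to treat depth and GQ as per-site summary values are assumptions made for illustration only; the thresholds themselves are the ones stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical summary of one variant site (illustrative, not a VCF parser)."""
    n_alt_alleles: int               # 1 for a biallelic site
    depth: int                       # assumed per-site read depth
    gq: int                          # assumed per-site genotype quality
    genotypes: List[Optional[int]]   # alt-allele count per sample (0, 1, 2) or None if missing


def minor_allele_frequency(genotypes: List[Optional[int]]) -> float:
    """MAF computed over called diploid genotypes only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def missing_fraction(genotypes: List[Optional[int]]) -> float:
    """Share of samples with no genotype call."""
    return sum(g is None for g in genotypes) / len(genotypes)


def passes_filters(site: Site, min_depth: int = 10, max_depth: int = 300,
                   min_gq: int = 20, min_maf: float = 0.05,
                   max_missing: float = 0.01) -> bool:
    """Apply the thresholds stated in the text; MAF > 0 also implies a polymorphic site."""
    return (
        site.n_alt_alleles == 1
        and min_depth <= site.depth <= max_depth
        and site.gq > min_gq
        and minor_allele_frequency(site.genotypes) > min_maf
        and missing_fraction(site.genotypes) < max_missing
    )
```

In a real pipeline these conditions would be expressed as filter expressions in the variant-processing tool; the sketch only makes the logic of each criterion explicit.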
Genetic diversity and population structure

The population structure analysis tool ADMIXTURE (v1.3.0) (Alexander and Lange 2009) was used with the filtered SNPs to infer the optimal number of clusters/subpopulations (K) and the proportion of ancestry among the 450 global Napier grass genotypes. Ten independent runs were carried out for maximum likelihood estimates of the ancestry subgroups (K) from 2 to 10. For each K, ADMIXTURE was run 20 times with varying random seeds. Afterwards, CLUMPP software (Jakobsson and Rosenberg 2007) was used to align up to 10 Q-matrices in the same cluster. The number of ancestral populations was taken as the K with the minimum cross-validation (CV) error; a good value of K exhibits a low CV error compared to other K values. Outputs from ADMIXTURE were collated using the R pophelper program (v.2.3.1) (Francis 2017), which compares the ancestral make-up of each predicted population.

Using genotypic data, PCA was performed to examine inter-population distribution using the SNPRelate (v. 4.0.2) (Zheng et al. 2012) and Plotly R packages (Sievert 2020). A phylogenetic tree was also constructed with the filtered, high-quality SNPs using identity by descent with the SNPRelate R package (v.4.0.2) (Zheng et al. 2012) and visualized with the interactive Tree Of Life (Letunic and Bork 2021).

Genome-wide association study

A total of 174 genotypes in 2 independent populations, 90 from Embrapa and 84 from ILRI, were considered for the genome-wide association study (GWAS). These were the populations phenotyped in the field evaluations carried out in Brazil and Ethiopia. The trials were carried out at 2 different times, and different traits were measured in each experiment.
For the field evaluation carried out in Brazil, the marker-trait association analysis was carried out separately for dry and wet seasons, for each of the 12 quantitative agronomic and feed-quality traits (PH, TFW, DM concentration, TDW, CEL, LIG, ADF, NDF, HCEL, IVDMD, NIT, and ASH). In the experiment carried out in Ethiopia, the traits measured were PH, leaf length (LL), LW, LSR, ST, TDW, TFW, TN, and WUE, and the groups were split into MWS and SWS dry season treatments.

Furthermore, GWAS was employed to investigate marker-trait associations (MTAs) for 3 agronomic traits assessed in both field evaluations conducted in Brazil and Ethiopia: PH, production of green biomass (TFW), and production of dry biomass (TDW). The mean value for each trait was first indexed as low, medium, or high based on quartile values for each trait per season, i.e. mean values below the first quartile were labeled as low, mean values between the first and third quartiles as medium, and mean values above the third quartile as high. Once recoded, the data from each country were merged per trait and for each season. The transformed data from the 9 genotypes shared between the 2 evaluations showed some degree of inconsistency (probably due to G×E interactions), particularly for the dry season; in such cases, data from Ethiopia were selected since frequent measurements, i.e. every 8 weeks, were taken during the trial in Ethiopia (Supplementary Table 1).

Prior to GWAS, the average values for all traits were normalized using the bestNormalize R package (Peterson 2018). GWAS was then performed on the average values, normalized values, and BLUE and BLUP predicted values. Different GWAS models were used to ensure detection of significant associations while accounting for population structure and relatedness.
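The low/medium/high recoding described above, which places trait means measured in different units onto a common ordinal scale before merging the two trials, can be sketched as follows. This is an illustrative Python sketch rather than the authors' R code: the function name and the genotype labels are hypothetical, the quartile interpolation method is an assumption, and values falling exactly on a cut-point are assigned to "medium" because the text does not specify boundary handling.

```python
import statistics
from typing import Dict


def recode_by_quartile(trait_means: Dict[str, float]) -> Dict[str, str]:
    """Label each genotype mean as low, medium, or high for one trait in one season."""
    # statistics.quantiles with n=4 returns the three quartile cut-points;
    # the "inclusive" interpolation method is an assumption, not stated in the text.
    q1, _median, q3 = statistics.quantiles(trait_means.values(), n=4, method="inclusive")
    labels = {}
    for genotype, value in trait_means.items():
        if value < q1:
            labels[genotype] = "low"
        elif value > q3:
            labels[genotype] = "high"
        else:
            labels[genotype] = "medium"  # boundary values land here (assumption)
    return labels


# Hypothetical dry-season means for five made-up genotypes
example = {"g1": 10.0, "g2": 20.0, "g3": 30.0, "g4": 40.0, "g5": 50.0}
print(recode_by_quartile(example))
```

Because the labels depend only on rank within a trial and season, the recoded values from Brazil (Mg ha−1) and Ethiopia (g per plant) can be combined despite their different units.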
The analysis was conducted using the Genomic Association and Prediction Integrated Tool (GAPIT) version 3 software package within the R environment (Wang and Zhang 2021). We implemented 3 models. The Fixed and random model Circulating Probability Unification (FarmCPU) (Liu et al. 2016) improves power by iteratively testing markers while controlling for confounding effects. The Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK) model (Huang et al. 2019) enhances efficiency by replacing the kinship matrix with Bayesian information and linkage disequilibrium (LD)-based marker selection. The multiple-locus mixed linear model (MLMM) (Segura et al. 2021) iteratively incorporates multiple associated SNPs as cofactors, improving polygenic trait detection while accounting for population structure and relatedness. The distribution of observed vs. expected −log10(P) values was assessed using Quantile–Quantile (Q–Q) plots, which visualize deviations from the null hypothesis. Significant SNP-trait associations were identified based on the internal model selection criteria and multiple testing correction methods implemented in GAPIT, including adjustments for population structure and control of the genome-wide type I error rate.

Regions of 0.04 Mbp surrounding highly significant SNPs, identified by multiple models and associated with multiple traits and/or treatment conditions, were queried with BLAST against protein databases, including Phytozome (Goodstein et al. 2012), to identify homologous genes or proteins with similar sequences. A threshold of 80% identity was used to report putative homologous proteins.
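The post-GWAS steps described above (scoring SNPs on the −log10(P) scale, taking a 0.04 Mbp region around a significant SNP, and keeping database hits at 80% identity) can be sketched in Python. Several details are assumptions made for illustration: the window is centred on the SNP (0.02 Mbp on each side), hits at exactly 80% identity are kept, and all function names are hypothetical. The default threshold of 5 is the lower of the two −log10(P) cut-offs reported in the Results.

```python
import math
from typing import List, Tuple

WINDOW_BP = 40_000  # 0.04 Mbp region, as stated in the text


def neg_log10(p_value: float) -> float:
    """Convert a P value to the -log10(P) scale used in Manhattan and Q-Q plots."""
    return -math.log10(p_value)


def is_significant(p_value: float, threshold: float = 5.0) -> bool:
    """True when -log10(P) exceeds the chosen threshold."""
    return neg_log10(p_value) > threshold


def window_around(position: int, chrom_length: int,
                  window: int = WINDOW_BP) -> Tuple[int, int]:
    """1-based inclusive interval centred on the SNP (centring is an assumption),
    clipped to the chromosome ends."""
    half = window // 2
    return max(1, position - half), min(chrom_length, position + half)


def keep_hits(hits: List[Tuple[str, float]],
              min_identity: float = 80.0) -> List[Tuple[str, float]]:
    """Retain (subject_id, percent_identity) pairs at or above the identity cut-off."""
    return [hit for hit in hits if hit[1] >= min_identity]
```

For example, `window_around(500_000, 1_000_000)` gives the interval from 480,000 to 520,000, and a SNP close to a chromosome end yields a shorter, clipped interval.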
Results

Phenotypic variability among Napier grass genotypes

Field evaluations in wet and dry seasons at Embrapa Dairy Cattle in Brazil indicated significant differences between seasons for some agronomic and feed-quality traits (Table 1). PH and TFW were significantly higher during the wet season, whereas CEL and ash concentrations exhibited no seasonal variation. Furthermore, the interaction between genotypes and harvest cycle was not significant for most traits, except PH, TFW, DM, and TDW. Also, the interaction of genotypes with the season was significant for most traits except TDW, HCEL content, and NIT, indicating that the performance of genotypes was differentially affected by season. The mean performance per accession for all traits is presented in Supplementary Table 1.

Among the 91 genotypes evaluated, the highest TFW was recorded for genotypes BAGCE2, BAGCE64, and BAGCE60. The highest biomass-yielding genotypes in terms of TDW were similar, indicating a high correlation between TFW and TDW. Regarding NIT content, a key trait in feed quality, the genotypes BAGCE58, BAGCE82, and BAGCE1 were the top performers (Supplementary Table 2). Interestingly, the genotype BAGCE82, with the highest NIT content, also showed a high mean TFW (64.8 Mg ha−1). The output from the field evaluation trial in Bishoftu, Ethiopia, has previously been reported (Habte et al. 2020). Among the shared genotypes, BAGCE53, 86, and 97 performed well regarding TFW in the trials in both Brazil and Ethiopia.

PCA and clustering analyses were conducted among the subset of genotypes from Embrapa (Fig. 1). In this analysis, divergence was observed based on growth, forage yield, and nutritional quality traits.
The first 3 principal components together explained 77% of the variation (Supplementary Table 3). The first principal component (PC1) accounted for 40.1% of the total explained variance; PH (0.43), CEL (0.84), ADF (0.95), and NDF (0.78) were the main contributing traits for this component. Likewise, the second principal component (PC2) accounted for 21.1% of the total explained variance, and PH (0.54), TFW (0.89), TDW (0.93), HCEL (0.48), and DM (0.36) were the main contributors for this component (Supplementary Table 3). A PCA biplot shows the degree of correlation among measured traits, with those in the same dimension and a tight angle between vectors indicating a high and positive correlation (Fig. 1a). The strongest positive correlation was found between TDW and TFW. Furthermore, CEL and ADF were also highly and positively correlated. TFW, TDW, NDF, and PH appeared in the same dimension with a positive correlation with each other, and CEL and ADF were negatively correlated with NIT and ash (Supplementary Fig. 1).

Based on mean values of 12 quantitative traits from 2 seasons and 5 harvest cycles, the clustering analysis revealed 4 major clusters (Fig. 1b). Cluster I, composed of 2 major sub-clusters, consisted of 40 genotypes; cluster II contained 3 major sub-clusters and consisted of 19 genotypes; cluster III contained 3 major sub-clusters and was composed of 31 genotypes; and cluster IV exhibited a distinct profile, forming an isolated group containing a single accession. Top-ranking genotypes in terms of TFW/TDW, such as BAGCE2, BAGCE60, and BAGCE64, were in cluster III. Interestingly, BAGCE58, which clustered by itself in group IV, scored the lowest mean TFW and TDW (18.8 and 5.9 Mg, respectively) and the highest NIT content (0.66%).

Genome-wide SNP discovery and their distribution across assembled chromosomes

Illumina 150-bp paired-end reads were generated from 450 Napier grass genotypes.
The average sequencing depth was 15–20× per accession. Approximately 170 million variants (SNPs and indels) were identified, and from these, ca. 1 million hard-filtered SNPs were mapped across the 14 assembled chromosomes of Napier grass (Supplementary Fig. 2). These markers were used for genetic diversity and marker-trait association analyses. The number of SNPs per chromosome was variable, with more SNPs mapped on the longest A01 and B01 chromosomes (Supplementary Table 4). The SNP density was similar for all chromosomes, with 1 SNP detected for every 1,830 bases. We have generated a comprehensive Napier grass genome variation dataset, identifying numerous SNPs from diverse landraces, varieties, and progenies.

Table 1. Mean square from combined analysis of variance (ANOVA) for 12 growth and forage biomass yield traits of Napier grass accessions evaluated for 2 years in Brazil.

Source of variation  df   PH          TFW         DM          TDW         CEL         LIG         ADF         NDF         HCEL        IVDMD       NIT         ASH
Rep                  1    4.1***      1830**      0.02***     67ns        79.4***     0.01ns      17.6***     4.2ns       61***       148.3***    0.55***     1.8***
Acc                  90   0.83***     2036***     0.008***    289***      7.8***      3.2***      14.2***     12.1***     4.7***      30.7***     0.037***    3.57***
Harvest              3    41.97***    23,713***   0.2***      4360***     1636***     187.62***   2735.3***   160.9***    1785.6***   2575.6**    0.37***     47.65
Season               1    238.19***   145,149***  0.02***     4360***     1.1ns       189.4***    96.5***     323.3***    66.5***     1581.6***   0.3***      0.03ns
Acc:Harvest          267  0.13***     349***      0.0018***   48***       3.2ns       0.85ns      5.3ns       4.3ns       2.3ns       7.9ns       0.007ns     0.98ns
Acc:Season           89   0.23***     224*        0.0035***   28ns        6.2***      1.67***     11.5***     8.1***      2.5ns       14.4***     0.01ns      1.82**
Error                442  0.14        172         0.001       23          3.6         0.76        5.8         4.1         2.1         9.1         0.01        1.08

*p < 0.05. **p < 0.01. ***p < 0.001. ns, not significant.
df, degrees of freedom; PH, plant height; TFW, total fresh weight; DM, dry matter concentration; TDW, total dry weight; CEL, cellulose; LIG, lignin; ADF, acid detergent fiber; NDF, neutral detergent fiber; HCEL, hemicellulose; IVDMD, in vitro dry matter digestibility; NIT, nitrogen; ASH, ash; Acc, accessions.

Genetic variation and relationship

The PCA revealed a clear pattern of genetic structure, with 3 major clusters and a noticeable degree of aggregation based on the region of origin (Fig. 2). PC1 accounted for the largest proportion of genetic variation in the dataset (69.9%) while PC2 explained an additional 29.9%, together capturing nearly all of the total variation among the genotypes.
Separation along PC1 primarily distinguished genotypes by admixture groups (Supplementary Table 5), with Q3 and Q10 forming distinct clusters apart from Q1 and Q7. In contrast, PC2 differentiated Q2 and Q5 from Q4 and Q8, which appeared as well-separated sub-clusters in the PCA space. The Q5 cluster consisted mainly of genotypes from Kenya, alongside 2 admixed genotypes sourced from ILRI, while Q8 was predominantly composed of Brazilian accessions, with 2 samples representing the USDA collection. Interestingly, the 8 reported interspecific hybrid genotypes (Supplementary Table 6) did not form a distinct cluster, instead scattering across the PCA space. Similarly, genotypes sourced from ILRI were distributed among various clusters, reflecting their diverse global origins. Progeny genotypes were primarily grouped within Q10, although several appeared in other sub-clusters, mainly in Q3, suggesting varying degrees of relatedness or admixture. Notably, the admixed group (the largest, with 152 genotypes) encompassed accessions from germplasm banks in Ethiopia, China, and Brazil, as well as breeding lines and progeny. This group also included 5 of the 7 purple-colored genotypes (CN96273, CN96211, CN94131, CN93182, and CN93081), highlighting its genetic and phenotypic diversity.

Population structure among global Napier grass genotypes

Population structure analysis divided the 450 genotypes into 10 subgroups according to CV errors (Fig. 3a). Clustering at K = 6 (Fig. 3b) separated some genotypes from Embrapa (Brazil), Kenya, and ILRI from the rest. However, a high level of admixture was noticed within the overall collection. A similar trend was observed in the phylogenetic tree, where genotypes were distributed regardless of their region of origin (Fig. 4). For example, 2 genotypes of Chinese origin (cpReyan4 and PgJujun) clustered together with genotypes sourced from genebanks in Embrapa, Kenya, and ILRI.
Interestingly, the reference genome CpPurple (a purple variety) and all other purple varieties clustered close to each other, except BA97. Furthermore, CpPurple showed a high similarity with BAGCE105 (a purple accession from Brazil), indicating that these 2 could be related genotypes although currently grown on different continents. Progeny genotypes from ILRI showed a mixed trend in the phylogenetic tree; even those originating from the same mother plant did not cluster together. A high level of genetic similarity among Kenyan genotypes was observed, indicating possible duplications in the Kenyan collection.

Marker-trait associations

For the field evaluation carried out in Brazil, the marker-trait association analysis was carried out independently for dry and wet seasons for each of the 12 quantitative traits (Fig. 5). Significantly associated SNP loci [−log10(P) > 5] were identified for 10 traits in more than 1 association model for either of the seasons (Supplementary Fig. 3). Interestingly, significant QTL [−log10(P) > 5] were recorded for some traits, such as TFW and ADF, mainly in the dry season (Supplementary Table 7).

A GWAS was also carried out for the trial conducted in Ethiopia, and the results revealed interesting associations for the 9 traits scored. Significantly associated QTL were identified for all traits under dry and wet seasons and under both soil moisture conditions. A QTL significantly associated with ST was identified during the wet season and under both soil conditions in the dry season (Supplementary Table 8). Likewise, SNPs were identified to be significantly associated with TFW under both dry and wet seasons, in both soil conditions. For the binary trait, leaf color (green vs. purple), a total of 494 SNP loci were determined to be significantly associated [−log10(P) > 7.3] using 3 different GAPIT models (Supplementary Table 9).
In general, a total of 207 SNP loci were significantly associated with other traits, excluding leaf color, using 3 GAPIT models, for the trial carried out in Ethiopia (Supplementary Fig. 4). Forty-seven of those marker-trait associations (MTAs) were significantly associated with more than 1 trait and were also significant in more than 1 GAPIT model.

Three traits, PH, TFW, and TDW, were measured in both field trials in Ethiopia and Brazil, and the combined data were used to carry out a GWAS analysis that led to the identification of additional QTL (Supplementary Fig. 5). Interestingly, significant QTL were identified for all 3 traits at a higher threshold [−log10(P) > 7.3] in the dry season. In contrast, a significant QTL was only identified for TDW in the wet season.

Fig. 1. Principal component a) and cluster analysis b) of 91 Napier grass accessions evaluated in the Brazil trial, based on agro-morphological and nutritional traits. Accessions are grouped into four clusters shown in arbitrary colours for distinction.
From the significantly associated QTL, 13 were associated with multiple phenotypic and feed quality traits in more than 1 GAPIT model. A search of the regions around those significant MTAs against the Phytozome database revealed sequence similarity with proteins of various functions. For instance, a region around a QTL associated with TDW and TFW showed sequence similarity with a Gag-Pol-related retrotransposon. Similarly, another QTL significantly associated with traits such as ADF, IVDMD, and LIG showed similarity with a WDSAM1 protein (Supplementary Table 10). Additionally, a QTL highly correlated with leaf color exhibited similarity with the zinc-finger domain of a monoamine-oxidase A repressor R1 and carotenoid synthesis regulator regions.

Fig. 2. PCA of 450 Napier grass accessions based on approximately 1 million SNPs. The scatter plot shows the relationship between PC1 (explaining 69.9% of the variance) vs PC2 (explaining 29.9% of the variance). Data points are color-coded by country of origin, and shapes represent distinct groups defined through ADMIXTURE analysis. Genotypes labeled “UNK” indicate unknown country of origin.

Discussion

Tropical forages, compared to temperate counterparts like perennial ryegrass and alfalfa, remain under-researched. Consequently, most small-scale farmers, especially in Africa, rely on landraces, which may lack adequate adaptation to current and future climate conditions (Simeão et al. 2021). Among tropical grasses, Napier grass (also known as elephant grass) is widely grown in SSA due to its high biomass, resilience to heat and water scarcity, and ability to regrow for up to 6 harvests annually (Kamau 2007; Habte et al. 2022).

Field evaluation of Napier grass genotypes

The evaluation of Napier grass genotypes from ILRI (Ethiopia) and Embrapa (Brazil) highlighted important phenotypic performance and diversity.
These evaluations were essential for identifying promising candidates for breeding, with a focus on improving biomass, drought resilience, and feed quality traits such as high crude protein and low LIG. Consistent with previous findings (FAO 2010; Njuki and Sanginga 2013; Lamb et al. 2018), traits like PH, TFW, and TDW were higher during the wet season than during the dry season. In contrast, feed quality traits like CEL and ash content remained stable across seasons. Notably, genotypes such as BAGCE2, BAGCE64, and BAGCE60 showed the highest TFW and TDW and performed well in both dry and wet seasons, suggesting strong genetic potential. BAGCE30, which was shared between the trials in Brazil and Ethiopia, demonstrated resilience to drought, producing consistently high biomass in both countries, further supporting its potential for performance across diverse environments (Habte et al. 2022). Feed quality traits such as crude protein (calculated as % nitrogen × 6.25) and LIG content are key forage traits. In the trial conducted in Brazil, the highest NIT was recorded for accession BAGCE58, although it ranked the lowest for TFW and TDW (Supplementary Table 2). Positive correlations were observed between trait pairs such as TFW and TDW, and CEL and ADF (Supplementary Fig. 1), consistent with findings by Habte et al. (2022). Conversely, CEL and ADF were negatively correlated with NIT and ASH. Napier grass is a promising bioenergy crop due to its high CEL, HCEL, and LIG contents. In this study, CEL content across genotypes ranged from 36.9 to 43.1%, with BAGCE83, BAGCE6, and BAGCE59 having the highest levels, making them suitable for ethanol production. Pioneiro, a Brazilian livestock forage cultivar, had the lowest CEL content, along with BAGCE104, BAGCE106, and BAGCE62, suggesting potential for improving digestibility through breeding. Genotypes clustered into 4 groups based on agro-morphological and feed quality traits.
Cluster IV, containing BAGCE58, had the lowest TFW and TDW but the highest NIT content, which could make it a valuable source of high nitrogen content in breeding programs. High-yielding genotypes like BAGCE2, BAGCE60, and BAGCE64 were in Cluster III, showing strong potential for future breeding.

Fig. 3. a) Cross validation error plot from ADMIXTURE analysis across K values ranging from 2 to 10, used to determine the optimal number of genetic clusters. b) Admixture bar plot showing population structure of 450 Napier grass accessions at K = 6 and K = 10.

Genomic tools for Napier grass

Previous genotyping by sequencing (GBS) studies on Napier grass identified around 100 K SNP markers, with lower SNP density compared to the current study (Paudel et al. 2018; Muktar et al. 2019, 2022, 2023). WGS of 450 global Napier grass genotypes, mostly landraces, generated over 100 million variants, revealing significant genetic diversity. In the present study, an average of 1 SNP was detected every 1803 bases across all chromosomes (Supplementary Table 4). After a complex filtering process, nearly a million SNPs were retained, evenly distributed along the 14 Napier grass chromosomes. These genome-wide SNPs can be used as a DNA fingerprinting tool in germplasm bank collections and to verify the trueness-to-type of cultivars. As an allotetraploid (2n = 4× = 28, A′A′BB sub-genomes), Napier grass shares high homology with the pearl millet A genome (Cenchrus americanus, 2n = 2× = 14, AA), suggesting that genomic tools developed here could be useful for both Napier grass and pearl millet improvement or hybrid development (Gupta and Mhere 1997).
The WGS approach applied in this study has the potential to generate more SSR markers than the previous GBS-based approach (Paudel et al. 2018).

Fig. 4. Phylogenetic relationships among 450 Napier grass accessions based on approximately 1 million filtered SNPs. Accessions with purple background exhibit purple leaf coloration, including the reference accession CpPurple.

Inter-population structure and phylogeny among global Napier grass genotypes

PCA of filtered SNPs revealed 3 major clusters (Fig. 2), but these did not align with region of origin. This may be due to clonal propagation of Napier grass and limited genetic selection in the samples, which were sourced from 18 different countries. A similar finding was reported by Muktar et al. (2023) using GBS genotyping and by Wanjala et al. (2013) with AFLP markers, where genotypes did not cluster by region of origin. However, some groups, such as Q4 and Q6, which represent only Kenyan samples from different districts (Kiambu and Murang’a), showed regional aggregation. This likely reflects the historical and ongoing exchange of root splits through Kenya’s informal seed system, as noted by Muktar et al. (2023). An interesting finding from the PCA was that progenies from 14 mother plants grouped separately and did not show a distinct profile reflecting their sexual origin. A similar result was reported by Muktar et al. (2023) using GBS genotyping, where progenies did not cluster with their respective mother plants. Several ILRI genotypes aggregated closer to the Embrapa elite breeding lines, which contributed to cultivars like BRS Capiaçu and BRS Kurumi (Pereira et al. 2017), suggesting their genetic potential for cultivar improvement.
This study also included 8 hybrids (Cenchrus purpureus × Cenchrus americanus), but they did not cluster independently, indicating a possible error in their acquisition or management. However, further taxonomic and/or cytological characterization is needed to confirm their hybrid status. Population structure analysis was conducted to better understand the relationships among genotypes, landraces, breeding lines, and progeny plants. The analysis divided the 450 genotypes into 10 subgroups based on CV error values from an ADMIXTURE analysis (Fig. 3b), which separated some of the Embrapa genotypes from ILRI genebank materials (Supplementary Table 5). A similar trend was reported by Muktar et al. (2023), who used GBS to distinguish Embrapa and ILRI collections. Additionally, Negawo et al. (2018) observed 2 sub-populations in nearly 200 Napier grass genotypes from the same collections using SSR markers. This finding suggests that the ILRI and Embrapa collections represent 2 independent gene pools with slight admixture, indicating that heterotic breeding for desirable traits would be effective. Interestingly, most progenies were distributed across different sub-clusters, consistent with the PCA. This pattern may be due to gene recombination during hybridization; a similar unorthodox clustering of progenies was also noted by Muktar et al. (2023). A phylogenetic tree of the 450 genotypes confirmed previous findings, showing no clustering based on region of origin. It revealed 2 main clusters: 1 cluster with 5 genotypes and the other containing all remaining samples. The first cluster included CpPurple (the reference genome), 2 Embrapa genotypes, and 2 accessions, from Kenya and from ILRI. Notably, CpPurple (a purple variety) showed high genetic similarity to BAGCE105, another purple genotype, despite CpPurple being sourced from China. Similarly, Pioneiro and BAGCE116 clustered closely, even though BAGCE116 was selected from a different elephant grass population.
BAGCE116 has distinctive yellow-green striped leaves, suggesting it may be a mutant variant of the Pioneiro cultivar, though further validation is needed. As seen in the PCA, hybrid-labeled genotypes (IL16835, IL16837, IL16834, IL16838, IL15357, IL16840, and IL14982) did not cluster together in the phylogenetic tree, failing to reflect a distinct hybrid profile. This aligns with findings by Muktar et al. (2019). Additionally, a high level of genetic similarity or possible duplication was observed among Kenyan-sourced genotypes. Muktar et al. (2023) similarly reported low genetic diversity among Kenyan genotypes, despite their collection from different districts.

Fig. 5. Significantly associated SNPs for the 12 quantitative traits evaluated during the dry a) and wet b) seasons in Brazil. Manhattan plots were constructed with the BLINK model. The GWAS analyses were performed for the following traits: ADF, ASH, CEL, DM concentration, HCEL, plant height (PH), IVDDM, LIG, NDF, NIT, TDB, and TGB. The horizontal lines represent the lower and upper thresholds, at P-values of 0.05 [−log10(P) > 4] and 0.01 [−log10(P) > 6], respectively. CEL, cellulose; ASH, ash; LIG, lignin; ADF, acid detergent fiber; DM, dry matter; NDF, neutral detergent fiber; HCEL, hemicellulose; IVDDM, in vitro dry matter digestibility; NIT, nitrogen; TDB, total dry weight; TGB, total fresh weight.

Napier grass exhibits self-incompatibility (Martel et al. 1997; Hanna et al. 2004) and is obligate outcrossing in nature (Yan et al. 2021). The phylogenetic tree generated from this study can aid in selecting distantly related parents to develop hybrid varieties. Hybrid breeding has been effective in temperate forages like ryegrass (Foster 1971; Pembleton et al.
2013) and could similarly benefit Napier grass, as its vegetative propagation allows target traits to be fixed at the F1 stage. Overall, the phylogeny aligns with the PCA and ADMIXTURE analyses, confirming the genetic diversity in this collection without clear clustering by region of origin.

Genome-wide association studies identified key QTL in two different collections

Despite the species' resilience to various biotic and abiotic stresses, Napier grass production faces challenges from head smut and stunt diseases, recurrent droughts, and feed quality issues. Developing high-yielding, nutritious, and stress-resilient varieties is essential for improving animal performance, particularly in the SSA region. However, field characterization is time-consuming and labor-intensive, mainly due to the species' perennial nature (Habte et al. 2020), large size (unsuitable for greenhouse studies; Rengsirikul et al. 2013), and obligate outcrossing reproduction (Martel et al. 1997). Molecular markers are crucial for accelerating breeding and reducing resource use. Studies on wheat have shown that genomic-assisted selection significantly improves yield compared to traditional phenotypic selection (Tessema et al. 2020). While only a few GWAS have been conducted on Napier grass (Muktar et al. 2019; Rocha et al. 2019; Muktar et al. 2022), the high-density genome-wide SNP markers reported here will improve identification of markers linked to key traits, such as total fresh and dry weight (TFW, TDW) and WUE. In this study, a GWAS was performed on the 2 independent collections, revealing important marker-trait associations (MTAs). For the field evaluation in Brazil, SNPs [−log10(P) > 5] were significantly associated with all measured traits in at least 1 GAPIT model across both seasons (Fig. 5). Overall, 318 SNPs were linked to 12 traits under both dry and wet conditions (Supplementary Table 7). These MTAs could aid future Napier grass breeding efforts in Brazil and beyond.
Additionally, the associated SNPs may help identify genes underpinning important traits, providing targets for gene editing, a key tool for improving animal productivity in the tropics (Camargo and Pereira 2022). However, we acknowledge the limitations of our sample size, and future work incorporating larger and more diverse panels will be valuable for further validating these results. In the Ethiopian trial, a higher number of QTL were identified for key agronomic traits such as TFW, TDW, and TN, all critical for Napier grass improvement. Significant associations were observed across dry and wet conditions and under moderate (MWS) and severe (SWS) soil moisture stress (Supplementary Table 8). Muktar et al. (2022) previously reported QTL linked to yield, WUE, and feed quality traits, with significant MTAs for TDW under dry and SWS conditions on Chr5, Chr9, and Chr13. This study identified additional MTAs across most of the Napier grass chromosomes (Supplementary Table 8). ST showed significant QTL [−log10(P) > 5] during the wet season harvests and under both MWS and SWS in the dry season (Supplementary Fig. 4). Notably, GWAS identified 494 highly associated SNPs [−log10(P) > 7.3] for the binary leaf color trait (green vs. purple), distributed across all chromosomes (Supplementary Table 9). Muktar et al. (2022) also reported GBS markers linked to purple leaf color, but this study identified a greater number of SNPs across both sub-genomes. Since purple pigmentation in Napier grass results from high anthocyanin content, which has potential health benefits for both humans and animals (Kruger et al. 2014; Yao et al. 2016), these MTAs could be valuable for feed quality improvement. Three traits, PH, TFW, and TDW, were measured in both the Ethiopian and Brazilian field trials. The combined analysis revealed QTL that were consistent across environments.
Interestingly, during dry-season treatments, significant QTL were identified for all 3 traits at a higher threshold [−log10(P) > 7.3], while in the wet season, significant QTL were detected only for TDW (Supplementary Fig. 5). One of the SNPs significantly associated with leaf color (A01_63825491) was located in a region syntenic with homologs encoding protein-serine/threonine kinases (Supplementary Table 10). These proteins act as central regulators, processing environmental and external cues to influence gene expression, metabolism, growth, development, fertilization, and immunity (Hardie 1999; Jose et al. 2020; Liu et al. 2024). Another SNP (A01_63717178) linked to leaf color was found in a region associated with genes involved in the carotenoid biosynthesis pathway, which plays a key role in producing photosynthetic pigments, stress hormones, and protective compounds in grass species like rice (Shumskaya and Wurtzel 2013; Stanley and Yuan 2019). Carotenoids also have antioxidant properties, helping to reduce disease incidence in both plants and animals (Abdelali and Zakir 2016). The SNP markers identified in this study could aid in developing nutritionally enhanced Napier grass cultivars.

Conclusions

Limited access to high-quality forages significantly impacts livestock performance in SSA. Indigenous species like Napier grass, which are familiar to smallholder farmers, require low inputs, and adapt well to various agro-ecologies and production systems, are recommended. Our results reveal genomic differences and marker-trait associations in global Napier grass genotypes, likely due to adaptation to diverse environments and breeding. We believe that the genomic tools developed, including the diversity profile and identified QTL, alongside recently available reference genomes, will promote the use of molecular markers in Napier grass improvement.
These resources are also vital for managing genetic diversity and advancing conservation programs both in situ on farms and ex situ in genebanks.

Data availability

All data generated and analyzed in this study are publicly available. The raw sequencing reads have been deposited in the European Nucleotide Archive (ENA) under the accession number PRJEB73794: https://www.ebi.ac.uk/ena/browser/view/PRJEB73794. The complete SNP dataset is accessible via the European Variation Archive (EVA) under the accession number PRJEB88573: https://www.ebi.ac.uk/eva/?eva-study=PRJEB88573. Supplemental material available at G3 online.

Acknowledgments

The research was conducted in the Feed and Forage Development (FFD) program at the ILRI forage genebank in Addis Ababa, Ethiopia, and the Earlham Institute, UK. The authors would like to thank The Royal Society, The Future Leaders – African Independent Research (FLAIR) Fellowships, FAPEMIG (Fundação de Amparo à Pesquisa do Estado de Minas Gerais, project APQ-03630-23), and the CGIAR initiative Sustainable Animal Productivity for Livelihoods, Nutrition and Gender inclusion (SAPLING) for financial support. Embrapa, KALRO, Lanzhou University, and USDA are acknowledged for making their germplasm and breeding lines available for this study. We acknowledge
Collins Mutai for the DNA extraction of samples from Kenya. The authors would also like to thank Jane Poole for reading, commenting on, and correcting the manuscript.

Funding

The research was supported by The Future Leaders – African Independent Research (FLAIR) Fellowships (FLR\R1\201095), FAPEMIG (Fundação de Amparo à Pesquisa do Estado de Minas Gerais, project APQ-03630-23), and the CGIAR initiative Sustainable Animal Productivity for Livelihoods, Nutrition and Gender inclusion (SAPLING). CGIAR research is supported by contributions to the CGIAR Trust Fund (https://www.cgiar.org/funders/).

Conflicts of interest

The author(s) declare no conflict of interest.

Author contributions

TA, JDG, and CJ designed and supervised the project and the manuscript writing. TA, JDG, LH, and HJ analyzed phenotypic and genotypic datasets. AY and LH collected leaf samples and extracted DNA. TA, MSM, HE, and NAT were involved in the supervision of the phenotyping project in Ethiopia.
TMO, ZJ, AB, NDO, and QY were involved in the supervision of the genotyping project. JFP, ALSA, and JCM oversaw the field trial in Brazil. All authors have read, reviewed, and approved this manuscript.

Literature cited

Abdelali H, Zakir H. 2016. Carotenoid biosynthesis and regulation in plants. In: Agriculture and Agri-Food Canada. 1391 Sandford Street, London, Ontario, Canada.
Alexander DH, Lange K. 2009. Enhancements to the ADMIXTURE algorithm for individual ancestry estimation. BMC Bioinformatics. 12(1):246. doi:10.1186/1471-2105-12-246.
Anderson WF, Dien BS, Brandon SK, Peterson JD. 2008. Assessment of bermudagrass and bunch grasses as feedstock for conversion to ethanol. Appl Biochem Biotechnol. 145(1-3):13–21. doi:10.1007/s12010-007-8041-y.
Andrews S. 2010. FastQC: A quality control tool for high throughput sequence data. Available from: http://www.bioinformatics.babraham.ac.uk/projects
Balehegn M, Duncan A, Tolera A, Ayantunde AA, Issa S, Karimou M, Zampaligré N, André K, Gnanda I, Varijakshapanicker P, et al. 2020. Improving adoption of technologies and interventions for increasing supply of quality livestock feed in low- and middle-income countries. Glob Food Sec. 26:100372. doi:10.1016/j.gfs.2020.100372.
Balehegn M, Kebreab E, Tolera A, Hunt S, Erickson P, Crane TA, Adesogan AT. 2021. Livestock sustainability research in Africa with a focus on the environment. Anim Front. 11(4):47–56. doi:10.1093/af/vfab034.
Bolger AM, Lohse M, Usadel B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 30(15):2114–2120. doi:10.1093/bioinformatics/btu170.
Camargo LSA, Pereira JF. 2022. Genome-editing opportunities to enhance cattle productivity in the tropics. CABI Agric Biosci. 3(1):8. doi:10.1186/s43170-022-00075-w.
Dokbua B, Waramit N, Chaugool J, Thongjoo C. 2021.
Biomass productivity, developmental morphology, and nutrient removal rate of hybrid Napier grass (Pennisetum purpureum x Pennisetum americanum) in response to potassium and nitrogen fertilization in a multiple-harvest system. Bioenergy Res. 14(4):1106–1117. doi:10.1007/s12155-020-10212-w.
Enahoro D, Mason-D’Croz D, Mul M, Rich KM, Robinson TP, Thornton P, Staal SS. 2019. Supporting the sustainable expansion of livestock production in South Asia and sub-Saharan Africa: scenario analysis of investment options. Glob Food Sec. 20:114–121. doi:10.1016/j.gfs.2019.01.001.
Ewels P, Magnusson M, Lundin S, Käller M. 2016. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 32(19):3047–3048. doi:10.1093/bioinformatics/btw354.
FAO. 2010. Challenges and Opportunities for Carbon Sequestration in Grassland Systems: A Technical Report on Grassland Management and Climate Mitigation. Rome: Food and Agriculture Organization of the United Nations.
Foster CA. 1971. Interpopulation and intervarietal hybridization in Lolium perenne breeding: heterosis under non-competitive conditions. J Agric Sci. 76(1):107–130. doi:10.1017/S0021859600015665.
Francis RM. 2017. Pophelper: an R package and web app to analyze and visualize population structure. Mol Ecol Res. 17(1):27–32. doi:10.1111/1755-0998.12509.
Fukagawa S, Isshi Y. 2018. Grassland establishment of dwarf Napier grass (Pennisetum purpureum Schumach.) by planting of cuttings in the winter season. Agronomy. 8(2):1–10. doi:10.3390/agronomy8020012.
Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, et al. 2012. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 40(D1):D1178–D1186. doi:10.1093/nar/gkr944.
Gupta SC, Mhere O. 1997. Identification of superior pearl millet by Napier hybrids and Napier’s in Zimbabwe. Afr Crop Sci J. 5(3):229–237. doi:10.4314/acsj.v5i3.27840.
Habte E, Muktar MS, Abdena A, Hanson J, Sartie AM, Negawo AT, Machado JC, Ledo FJDS, Jones CS. 2020. Forage performance and detection of marker-trait associations with potential for Napier grass (Cenchrus purpureus) improvement. Agronomy. 10(4):542. doi:10.3390/agronomy10040542.
Habte E, Teshome A, Muktar MS, Assefa Y, Negawo AT, Machado JC, Ledo FJDS, Jones CS. 2022. Productivity and feed quality performance of Napier grass (Cenchrus purpureus) genotypes growing under different soil moisture levels. Plants. 11(19):2549. doi:10.3390/plants11192549.
Hanan NP, Kahiu MN. 2016. Mapping forage resources using earth observation data: a case study to assess the relationship between herbaceous and woody cover components as determinants of large herbivore distribution in sub-Saharan Africa. In: AGU Fall Meeting Abstracts. GC13D-1225.
Hanna WW, Chaparro CJ, Mathews BW, Burns JC, Sollenberger LE, Carpenter JR. 2004. Perennial pennisetums. Warm-season (C4) grasses. 45:503–535.
Hardie DG. 1999. Plant protein serine/threonine kinases: classification and functions. Annu Rev Plant Biol. 50(1):97–131. doi:10.1146/annurev.arplant.50.1.97.
Huang M, Liu X, Zhou Y, Summers RM, Zhang Z. 2019. BLINK: a package for the next level of genome-wide association studies with both individuals and markers in the millions. Gigascience. 8(2):giy154. doi:10.1093/gigascience/giy154.
Jakobsson M, Rosenberg NA. 2007. CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure. Bioinformatics. 23(14):1801–1806. doi:10.1093/bioinformatics/btm233.
Jose J, Ghantasala S, Roy CS. 2020. Arabidopsis transmembrane receptor-like kinases (RLKs): a bridge between extracellular signal and intracellular regulatory machinery. Int J Mol Sci. 21(11):4000. doi:10.3390/ijms21114000.
Kamau M. 2007. Farm Household Allocative Efficiency: a Multi-Dimensional Perspective on Labour use in Western Kenya.
Wageningen: Wageningen University and Research.
Kassambara A. 2017. Practical guide to principal component methods in R: PCA, M (CA), FAMD, MFA, HCPC, factoextra. Sthda. ISBN: 9781975721138.
Kruger M, Davies N, Myburgh K, Lecour S. 2014. Proanthocyanidins, anthocyanins and cardiovascular diseases. Food Res Int. 59(46):41–52. doi:10.1016/j.foodres.2014.01.046.
Lamb MC, Anderson WF, Strickland TC, Coffin AW, Sorensen RB, Knoll JE, Pisani O. 2018. Economic competitiveness of Napier grass in irrigated and non-irrigated Georgia coastal plain cropping systems. Bioenergy Res. 11(3):574–582. doi:10.1007/s12155-018-9916-1.
Letunic I, Bork P. 2021. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 49(W1):W293–W296. doi:10.1093/nar/gkab301.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 25(14):1754–1760. doi:10.1093/bioinformatics/btp324.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The sequence alignment/map format and SAMtools. Bioinformatics. 25(16):2078–2079. doi:10.1093/bioinformatics/btp352.
Liu X, Huang M, Fan B, Buckler ES, Zhang Z. 2016. Iterative usage of fixed and random effect models for powerful and efficient genome-wide association studies. PLoS Genet. 12(2):e1005767. doi:10.1371/journal.pgen.1005767.
Liu J, Li W, Ali K. 2024. An update on evolutionary, structural, and functional studies of receptor-like kinases in plants. Front Plant Sci. 15:1305599. doi:10.3389/fpls.2024.1305599.
Martel E, De Nay D, Siljak-Yakovlev S, Brown S, Sarr A. 1997. Genome size variation and basic chromosome number in pearl millet and fourteen related Pennisetum species. J Hered. 88(2):139–143. doi:10.1093/oxfordjournals.jhered.a023072.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, et al.
2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20(9):1297–1303. doi:10.1101/gr.107524.110.
Mkutche CD. 2020. Evaluation of feed resources for local goat production under traditional management systems in Golomoti EPA Dedza and on-station at Bunda Campus, LUANAR, Malawi [Diss]. [Rome]: International Institute of Tropical Agriculture. Available from: https://hdl.handle.net/10568/108504.
Muktar MS, Bizuneh T, Anderson W, Assefa Y, Negawo AT, Teshome A, Habte E, Muchugi A, Feyissa T, Jones CS. 2023. Analysis of global Napier grass (Cenchrus purpureus) collections reveals high genetic diversity among genotypes with some redundancy between collections. Sci Rep. 13(1):14509. doi:10.1038/s41598-023-41583-7.
Muktar MS, Habte E, Teshome A, Assefa Y, Negawo AT, Lee KW, Zhang J, Jones CS. 2022. Insights into the genetic architecture of complex traits in Napier grass (Cenchrus purpureus) and QTL regions governing forage biomass yield, water use efficiency and feed quality traits. Front Plant Sci. 12:678862. doi:10.3389/fpls.2021.678862.
Muktar MS, Teshome A, Hanson J, Negawo AT, Habte E, Domelevo Entfellner JB, Lee KW, Jones CS. 2019. Genotyping by sequencing provides new insights into the diversity of Napier grass (Cenchrus purpureus) and reveals variation in genome-wide LD patterns between collections. Sci Rep. 9(1):1–15. doi:10.1038/s41598-019-43406-0.
Mwendia SW, Wanyoike M, Nguguna JGM, Wahome RG, Mwangi DM. 2006. Evaluation of Napier Grass Cultivars for Resistance for Napier Head Smut. Nairobi, Kenya. Kenya Agricultural Research Institute, University of Nairobi.
Negawo AT, Jorge A, Hanson J, Teshome A, Muktar MS, Azevedo ALS, Lédo F, Machado JC, Jones CS. 2018. Molecular markers as a tool for germplasm acquisition to enhance the genetic diversity of a Napier grass (Pennisetum purpureum) collection. Trop Grasslands. 6(2):58–69. doi:10.17138/tgft(6)58-69.
Njuki J, Sanginga PC. 2013.
Gender and livestock: key issues, challenges and opportunities. In: Njuki J, Sanginga PC, editors. Women, Livestock Ownership and Markets: Bridging the Gender Gap in Eastern and Southern Africa. New York (NY): Routledge.
Paudel D, Kannan B, Yang X, Harris-Shultz K, Thudi M, Varshney RK, Altpeter F, Wang J. 2018. Surveying the genome and constructing a high-density genetic map of napiergrass (Cenchrus purpureus Schumach). Sci Rep. 8(1):1–11. doi:10.1038/s41598-018-32674-x.
Paul BK, Koge J, Maass BL, Notenbaert A, Peters M, Groot JC, Tittonell P. 2020. Tropical forage technologies can deliver multiple benefits in sub-Saharan Africa. A meta-analysis. Agron Sustain Dev. 40(4):1–17. doi:10.1007/s13593-020-00626-3.
Pembleton LW, Wang J, Cogan NOI, Pryce JE, Ye G, Bandaranayake CK, Hand ML, Baillie RC, Drayton MC, Lawless K, et al. 2013. Candidate gene-based association genetics analysis of herbage quality traits in perennial ryegrass (Lolium perenne L.). Crop Pasture Sci. 64(3):244–253. doi:10.1071/CP12392.
Pereira AV, Lédo FJS, Machado JC. 2017. BRS Kurumi and BRS Capiaçu - new elephant grass cultivars for grazing and cut-and-carry system. Crop Breed Appl Biotechnol. 17(1):59–62. doi:10.1590/1984-70332017v17n1c9.
Peterson RA. 2018. bestNormalize: Normalizing Transformation Functions. R package version 1.573.
R Core Team. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. Available from: https://www.R-project.org.
Rengsirikul K, Ishii Y, Kangvansaichol K, Sripichitt P, Punsuvon V, Vaithanomsat P, Nakamanee G, Tudsri S. 2013. Biomass yield, chemical composition, and potential ethanol yields of 8 cultivars of napiergrass (Pennisetum purpureum Schumach.) harvested 3-monthly in central Thailand. J. Sustainable Bioenergy Syst. 3(2):107–112. doi:10.4236/jsbs.2013.32015.
Rocha JR, Das C, Marçal TS, Salvador FV, da Silva AC, Carneiro PCS, de Resende MDV, Carneiro JDC, Azevedo ALS, Pereira JF, et al.
2019. Unraveling candidate genes underlying biomass digestibility in elephant grass (Cenchrus purpureus). BMC Plant Biol. 19(1):548. doi:10.1186/s12870-019-2180-5.
Sawasdee V, Pisutpaisal N. 2021. Potential of Napier grass Pak Chong 1 as feedstock for biofuel production. Energy Rep. 7:519–526. doi:10.1016/j.egyr.2021.07.101.
Segura V, Vilhjálmsson BJ, Platt A, Korte A, Seren Ü, Long Q, Nordborg M. 2012. An efficient multi-locus mixed-model approach for genome-wide association studies in structured populations. Nat Genet. 44(7):825–830.
Shumskaya M, Wurtzel ET. 2013. The carotenoid biosynthetic pathway: thinking in all dimensions. Plant Sci. 208:58–63. doi:10.1016/j.plantsci.2013.03.012.
Sievert C. 2020.
Interactive Web-Based Data Visualization with R, Plotly, and Shiny. Boca Raton (FL): CRC Press. ISBN: 9781138331457. Available from: https://plotly-r.com.
Simeão RM, Resende MDV, Alves RS, Pessoa-Filho M, Azevedo ALS, Jones CS, Pereira JF, Machado JC. 2021. Genomic selection in tropical forage grasses: current status and future applications. Front Plant Sci. 12:665195. doi:10.3389/fpls.2021.665195.
Smith J, Sones K, Grace D, MacMillan S, Tarawali S, Herrero M. 2013. Beyond milk, meat and eggs: role of livestock in food and nutrition security. Anim Front. 3(1):6–13. doi:10.2527/af.2013-0002.
Stanley L, Yuan YW. 2019. Transcriptional regulation of carotenoid biosynthesis in plants: so many regulators, so little consensus. Front Plant Sci. 10:1017. doi:10.3389/fpls.2019.01017.
Tessema BB, Liu H, Sørensen AC, Andersen JR, Jensen J. 2020. Strategies using genomic selection to increase genetic gain in breeding programs for wheat. Front Genet. 11:578123. doi:10.3389/fgene.2020.578123.
Tolera A. 2007. The role of forage supplements in smallholder mixed farming systems. In: Hare MD, Wongpichet K, editors. Forages: A Pathway to Prosperity for Smallholder Farmers. Proceedings of an International Forage Symposium. Faculty of Agriculture, Ubon Ratchathani University, Thailand, p. 165–186.
Wang J, Zhang Z. 2021. GAPIT version 3: boosting power and accuracy for genomic association and prediction. Genomics Proteomics Bioinformatics. 19(4):629–640. doi:10.1016/j.gpb.2021.08.005.
Wanjala BW, Obonyo M, Wachira FN, Muchugi A, Mulaa M, Harvey J, Skilton RA, Proud J, Hanson J. 2013. Genetic diversity in Napier grass (Pennisetum purpureum) cultivars: implications for breeding and conservation. AoB Plants. 5:plt022. doi:10.1093/aobpla/plt022.
Yan Q, Wu F, Xu P, Sun Z, Li J, Gao L, Lu L, Chen D, Muktar M, Jones C, et al. 2021. The elephant grass (Cenchrus purpureus) genome provides insights into anthocyanidin accumulation and fast growth. Mol Ecol Resour. 21(2):526–542.
doi:10.1111/1755-0998.13271.
Yao N, Xian-Feng Y, Lai ZQ, Liang YL, Deng SY, Lai DW. 2016. Effects of Pennisetum purpureum Schum cv. Purple on growth performance and serum biochemical parameters of meat geese. J Southern Agric. 47(12):2163–2168.
Zhang S, Xia Z, Li C, Wang X, Lu X, Zhang W, Ma H, Zhou X, Zhang W, Zhu T, et al. 2022. Chromosome-scale genome assembly provides insights into speciation of allotetraploid and massive biomass accumulation of elephant grass (Pennisetum purpureum Schum.). Mol Ecol Resour. 22(6):2363–2378. doi:10.1111/1755-0998.13612.
Zheng X, Levine D, Shen J, Gogarten SM, Laurie C, Weir BS. 2012. A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics. 28(24):3326–3328. doi:10.1093/bioinformatics/bts606.
Editor: A. Lipka