The Crop Journal 12 (2024) 558–568

Genome-wide association mapping and genomic prediction of stalk rot in two mid-altitude tropical maize populations

Junqiao Song a,b,c, Angela Pacheco b, Amos Alakonya b, Andrea S. Cruz-Morales b, Carlos Muñoz-Zavala b, Jingtao Qu d, Chunping Wang a,*, Xuecai Zhang b, Felix San Vicente b, Thanda Dhliwayo b,*

a College of Agronomy, Henan University of Science and Technology, Luoyang 471000, Henan, China
b International Maize and Wheat Improvement Center (CIMMYT), El Batan, Mexico
c Anyang Academy of Agricultural Sciences, Anyang 455000, Henan, China
d CIMMYT-China Specialty Maize Research Center, Crop Breeding, and Cultivation Research Institute, Shanghai Academy of Agricultural Sciences, Shanghai, China

* Corresponding authors. E-mail addresses: chunpingw@haust.edu.cn (C. Wang), d.thanda@cgiar.org (T. Dhliwayo).

https://doi.org/10.1016/j.cj.2024.02.004
2214-5141/© 2024 Crop Science Society of China and Institute of Crop Science, CAAS. Production and hosting by Elsevier B.V. on behalf of KeAi Communications Co., Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Article info

Article history: Received 3 September 2023; Revised 13 February 2024; Accepted 18 February 2024; Available online 11 March 2024

Keywords: Maize stalk rot; Genome-wide association mapping; Haplotype analysis; Genomic prediction; G × E interaction

Abstract

Maize stalk rot reduces grain yield and quality. Information about the genetics of resistance to maize stalk rot could help breeders design effective breeding strategies for the trait. Genomic prediction may be a more effective breeding strategy for stalk-rot resistance than marker-assisted selection.
We performed a genome-wide association study (GWAS) and genomic prediction of resistance in testcross hybrids of 677 inbred lines from the Tuxpeño and non-Tuxpeño heterotic pools grown in three environments and genotyped with 200,681 single-nucleotide polymorphisms (SNPs). Eighteen SNPs associated with stalk rot shared genomic regions with gene families previously associated with plant biotic and abiotic responses. More favorable SNP haplotypes traced to tropical than to temperate progenitors of the inbred lines. Incorporating genotype-by-environment (G × E) interaction increased genomic prediction accuracy.

1. Introduction

Stalk rot is one of the most destructive diseases of maize, causing yield losses ranging from 5% to 100% [1]. It is an endemic disease in Mexico, particularly in the central states, including Guanajuato, which ranks among the five top maize-producing states in Mexico [2]. The pathogens causing stalk rot are complex and varied, making their identification difficult. Many mycotoxin-producing fungi, including Fusarium verticillioides (Fv), Fusarium graminearum, syn. Gibberella (Fg), Colletotrichum graminicola (Anthracnose), Stenocarpella maydis (Diplodia), Acremonium strictum (Cephalosporium), Macrophomina phaseolina, and Pythium aphanidermatum, alone or in combination may cause severe stalk rot [1]. These pathogens block vascular bundles, resulting in premature plant death [3]. In Mexico, Fv and Fg are the most common and have been reported in Guanajuato [4]. Although some management practices can reduce the incidence of stalk rot, there is no effective chemical control for the disease [1].
Breeding resistant cultivars is an objective of maize breeding programs where the disease is endemic. However, stalk rot phenotyping is laborious, and the trait tends to have high spatial variability and genotype-by-environment (G × E) interaction. Identifying genomic regions associated with stalk rot is essential to develop DNA markers for marker-assisted selection or to inform genomic-assisted breeding strategies for the trait.

Classical quantitative genetics and quantitative trait locus (QTL) mapping studies of stalk rot [5–9] have shown that the trait is quantitative and controlled by multiple loci with minor effects. QTL with major effects on stalk rot have also been reported, including qRfg1, qRfg2, and qRfg3 associated with resistance to Gibberella stalk rot [10–13]. These QTL have been fine-mapped: qRfg1 was mapped to a 500 kb contig on chromosome 10, qRfg2 to a 300 kb contig on chromosome 1, and qRfg3 to a 350 kb contig on chromosome 3 [10–13]. A major QTL for Anthracnose stalk rot, Rcg1, on chromosome 4 has been cloned [14]. Two major QTL, RpiX178-1 and RpiX178-2, associated with resistance to Pythium stalk rot have been mapped to chromosomes 1 and 10 [15,16].

Genetic mapping studies of stalk rot resistance are needed to elucidate the genetic architecture of the trait in tropical germplasm and environments. Most maize stalk rot mapping studies have been conducted in recombinant inbred line (RIL), backcross, and F2-derived F3 populations using simple sequence repeat and chip-based SNP markers [11,13,15]. Testcross populations have been used to map other maize traits, including grain yield, but not stalk rot, and we are not aware of any studies that used SNP markers derived from genotyping-by-sequencing to map stalk rot. Most stalk rot mapping studies have been conducted in temperate germplasm, with few studies conducted in the tropics. Genomic regions detected in temperate germplasm may not coincide with those in tropical germplasm owing to G × E interaction, genetic background (epistasis), pathogen diversity, and a possible lack of intraspecific genetic collinearity, in which the order of genes on a chromosome is not maintained or some genes are missing in some individuals within a species [17].

An alternative to QTL mapping and marker-assisted selection is genomic selection (GS). GS assumes that each marker is associated with minor effects and uses all markers to calculate a breeding value for each selection candidate [18]. In GS, a training set, which is a set of genotyped and phenotyped individuals, is used to estimate the effects of the markers. The estimated marker effects are then used to predict the genetic merit, or genomic estimated breeding values (GEBVs), of individuals in the prediction set, which consists of individuals that have been genotyped but not phenotyped. The prediction accuracy, which is the correlation between the GEBVs and true breeding values, is affected by many factors, including training population size, G × E interaction, heritability of the trait, and the genetic relationship between the training and the prediction sets [19].
Although the genetic architecture of stalk rot [5–9] suggests that genomic selection could be used to improve resistance in maize, its effectiveness and the factors affecting prediction accuracy have not been investigated.

In this study, we used testcross data of 677 inbred lines evaluated for stalk rot under natural infection in three tropical environments to conduct a genome-wide association study (GWAS) to identify genomic regions, and putative candidate genes within them, associated with stalk rot. We used the same dataset to evaluate the effectiveness of genomic selection using models with and without G × E and varying the training population size relative to the total population. The specific objectives of this study were to (i) identify genomic regions and putative candidate genes associated with resistance to stalk rot in tropical germplasm and environments, (ii) for each region, identify favorable haplotypes and their sources, and (iii) assess the effectiveness of genomic prediction for stalk rot in two tropical maize populations under selection in a hybrid breeding program.

2. Materials and methods

2.1. Plant materials

Of 677 inbred lines in the first year of testing in the CIMMYT mid-altitude tropical breeding program, 381 were from the Tuxpeño and 296 from the non-Tuxpeño heterotic groups (Tables S1, S2). CIMMYT's heterotic groups and their alignment with known maize germplasm groups have been described by Guo et al. [20]. The Tuxpeño subset was derived from 14 tropical and 15 temperate inbreds with expired U.S. plant variety protection (exPVP). The non-Tuxpeño subset comprised lines derived from 25 tropical and 18 temperate exPVP lines. More details about the background of the lines are presented in Tables S1, S2, and S3.

Each population was crossed to one heterotic tester for trait evaluation.
The Tuxpeño lines were crossed to CSL1663, a line derived from CML444, and the non-Tuxpeño lines were crossed to CML312, an important line for the mid-altitude tropics in sub-Saharan Africa and Mexico. The testcross hybrids were made at CIMMYT's Tlaltizapán research station in Morelos state, Mexico, during the 2020 summer (May–November) season.

2.2. Field trials and experimental design

The 677 testcross hybrids and checks were subdivided into seven trial sets of 84 to 108 entries to facilitate field layout and control spatial field variation. The experimental design for each trial was an alpha lattice (0, 1) with two replications. The field row and column coordinates of each plot were recorded. The seven trials were each evaluated at three locations during summer 2021 in Guanajuato state: Cortazar (20°26′00″N, 100°56′30″W; 1736 masl), Valle de Santiago (20°23′34″N, 101°11′29″W; 1717 masl), and Juventino Rosas (20°41′00″N, 100°59′00″W; 1847 masl). Each plot consisted of two rows, each 4 m long, with 0.75 m between rows and 0.16 m between plants within a row for a density of about 93,000 plants ha⁻¹.

During the 2021 summer season, stalk rot symptoms were observed, and laboratory assays showed that the predominant pathogens in these fields were Fv and Fg. For each plot, the number of plants with stalk rot infection was recorded at about 8 weeks after anthesis, corresponding to physiological maturity. A plant was classified as having stalk rot infection when it was dead and failed either of two tests: the pinch test, failed when the stalk is crushed when pinched between the two lowest internodes; and the push test, failed when the plant does not snap back to a vertical position after being pushed to an angle of 30° from vertical [21,22]. The number of infected plants was expressed as a percentage of the total number of plants per plot.
Grain yield (t ha⁻¹), percent grain moisture, plant height (cm), number of days to anthesis, and test weight (kg 100 L⁻¹) were also recorded.

2.3. Phenotypic data analysis

The Tuxpeño and non-Tuxpeño populations were analyzed separately. For each population, two models were compared for analysis of variance (ANOVA) of the phenotypic data: one based on the alpha-lattice design and the other using the field row and column coordinates to adjust for spatial variation. The field row-column model had a lower Bayesian information criterion (BIC) value [23] and therefore fitted the data better than the alpha-lattice model. The row-column model was fitted to the phenotypic data as follows:

$$Y_{ijklmn} = \mu + E_i + G_j + T_k + R_{l(ik)} + p_{m(ikl)} + u_{n(ikl)} + GE_{ij} + \varepsilon_{ijklmn} \tag{1}$$

where $Y_{ijklmn}$ is the phenotype of the $j$th ($j = 1, \ldots, J$) genotype (inbred line) tested in the $i$th ($i = 1, \ldots, I$) environment, $k$th ($k = 1, \ldots, K$) trial, $l$th ($l = 1, \ldots, L$) replication nested in the $i$th environment and $k$th trial, and $m$th ($m = 1, \ldots, M$) row and $n$th ($n = 1, \ldots, N$) column, both nested in the $l$th replication of the $k$th trial and $i$th environment; $\mu$ is the overall mean, $E_i$ is the effect of the $i$th environment, $G_j$ is the effect of the $j$th genotype, $T_k$ is the effect of the $k$th trial, $R_{l(ik)}$ is the effect of the $l$th replication nested in the $i$th environment and $k$th trial, $GE_{ij}$ is the interaction between the $i$th environment and the $j$th genotype, and $\varepsilon_{ijklmn}$ is the residual. The effects $p_{m(ikl)}$ and $u_{n(ikl)}$ are for the $m$th row and $n$th column, respectively, both nested in the $i$th environment, $k$th trial, and $l$th replication.

The model fitted to the data for each environment can be obtained by dropping the environment effect from the model described above. The best linear unbiased estimate (BLUE) for each trait was calculated assuming genotypes as fixed and the other factors as random.
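The comparison of the alpha-lattice and row-column models rests on the Bayesian information criterion, BIC = k ln(n) − 2 ln(L), where lower values indicate a better fit after penalizing the number of parameters. The Python sketch below illustrates that comparison only; the log-likelihoods, parameter counts, and number of observations are hypothetical placeholders, not values from this study, and the function name is illustrative.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits (illustrative numbers only, not from the study).
N_OBS = 1500
fits = {
    "alpha_lattice": {"log_likelihood": -5230.0, "n_params": 6},
    "row_column": {"log_likelihood": -5205.0, "n_params": 7},
}

scores = {
    name: bic(fit["log_likelihood"], fit["n_params"], N_OBS)
    for name, fit in fits.items()
}
# The model with the lowest BIC is preferred.
best_model = min(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {score:.1f}")
print("preferred:", best_model)
```

In this toy case the extra parameter of the row-column model is outweighed by its higher likelihood, which is the pattern the text reports for the field data.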
Variance components from the mixed models were estimated by restricted maximum likelihood (REML) with the lmer function of the lme4 package in R [24]. Heritability ($h^2$) was then calculated as:

$$h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \dfrac{\sigma_{ge}^2}{e} + \dfrac{\sigma_\varepsilon^2}{re}} \tag{2}$$

where $\sigma_g^2$ is the genetic variance, $\sigma_{ge}^2$ is the G × E interaction variance, $\sigma_\varepsilon^2$ is the residual variance, $e$ is the number of environments, and $r$ is the number of replications within each environment. Correlation coefficients among the six traits were calculated based on the least-squares means across environments.

2.4. Genotyping

The 677 inbred lines and their parents were grown in a greenhouse at CIMMYT in Texcoco, state of Mexico, Mexico. Leaf tissue was collected from 10 plants of each line, and genomic DNA was extracted from bulked young leaves using the CTAB method [25]. All 737 lines were genotyped at the Biotechnology Center–DNA Sequencing Facility, University of Wisconsin-Madison, Wisconsin, USA, using genotyping by sequencing (GBS). The DNA was digested with the ApeKI restriction enzyme and sequenced with an Illumina NovaSeq 6000 instrument [26].

The GBS reads were anchored to the B73 reference genome using the GBS 2.7 TOPM (tags on physical map) file retrieved from Panzea (www.panzea.org), and the SNPs were called using the TASSEL 5.0 [27] SNP calling pipeline. In total, 955,690 SNPs were called, including 955,120 SNPs mapped to the 10 maize chromosomes and 570 SNPs that could not be mapped to a chromosome. The SNP genotypes of the lines and progenitors were filtered and imputed before separating the 677 lines into Tuxpeño and non-Tuxpeño populations. Markers with > 50% missing data, minor allele frequency < 0.05, or heterozygosity > 5% were filtered out before imputing using the LD KNNi method [28] with the default parameters in TASSEL. After filtering and imputing, 200,681 high-quality SNPs remained and were used for GWAS.

2.5. GWAS and candidate genes

The genetic structure of the Tuxpeño and non-Tuxpeño populations and of all 737 lines was estimated by principal component analysis (PCA) in TASSEL. For each population, the first two principal components were plotted using the ggplot2 package in R [29]. The first three principal components were retained to account for population structure. The n × n pairwise matrix of kinship coefficients (K) and linkage disequilibrium (LD) decay were also computed for each population using TASSEL. GWAS was performed using TASSEL based on a mixed linear model:

$$y = X\beta + Zu + \varepsilon \tag{3}$$

where $y$ is the vector of phenotypic observations (BLUEs); $\beta$ is an unknown vector of fixed effects, including the m × 1 vector of SNP markers tested and the p × 1 vector of the first p principal components explaining population structure; $X$ and $Z$ are design matrices containing variables intended to explain the observed phenotypic data; $u$ is an unknown vector of additive genetic effects; and $\varepsilon$ is a vector of random residuals. The variance of $u$ was estimated as $\mathrm{Var}(u) = K\sigma_a^2$, where $K$ is the n × n pairwise matrix of kinship among the inbreds and $\sigma_a^2$ is the additive genetic variance, with $\varepsilon \sim N(0, I\sigma^2)$.

GWAS was performed in the Tuxpeño and non-Tuxpeño populations separately using the 200,681 imputed SNPs and the BLUEs for each environment and across environments. The negative logarithm of the probability that an SNP and stalk rot are associated by random chance, −log10(P), was plotted against the chromosome position of each SNP to produce a Manhattan plot; the observed −log10(P) was plotted against the expected −log10(P) to produce a quantile–quantile (Q–Q) plot. Both the Manhattan and Q–Q plots were produced using the CMplot package in R [30]. The genome-wide statistical significance threshold was determined for each population using the algorithm proposed by Li et al. [31] and implemented in the Genetic type 1 error calculator (version 1.0) tool.
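Both the Manhattan and Q–Q plots rest on the −log10(P) transform. A minimal Python sketch of that transform, and of the expected quantiles for a Q–Q plot, is given below. The (rank − 0.5)/n plotting position is one common convention and is an assumption here, since the text does not state which convention CMplot applies. The function names are illustrative, the toy P values are invented, and the single real P value is the smallest one reported in Table 2.

```python
import math


def neg_log10(p_value):
    """Transform a P value to the -log10 scale used on both plot types."""
    return -math.log10(p_value)


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, most significant first.

    Under the null hypothesis P values are uniform on (0, 1), so the
    expected value of the k-th smallest of n P values is taken as
    (k - 0.5) / n (an assumed plotting-position convention).
    """
    n = len(p_values)
    ordered = sorted(p_values)
    return [
        (neg_log10((rank - 0.5) / n), neg_log10(p))
        for rank, p in enumerate(ordered, start=1)
    ]


# Smallest P value reported in Table 2 (SNP S1_187642387).
print(f"-log10(1.09e-6) = {neg_log10(1.09e-6):.2f}")

# Toy P values, for illustration only.
for expected, observed in qq_points([0.5, 0.01, 0.2, 0.9]):
    print(f"expected {expected:.2f}  observed {observed:.2f}")
```

Points that rise above the diagonal of expected versus observed values at the upper end of the plot are the candidate associations; a systematic departure along the whole line would instead suggest uncorrected population structure.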
Thresholds of $P < 1.23 \times 10^{-5}$ in the Tuxpeño and $P < 1.03 \times 10^{-5}$ in the non-Tuxpeño population were adopted to maintain a genome-wide $\alpha = 0.05$ and used to declare significant SNP–trait associations. The distance in kb flanking a significant SNP over which LD decays to r² < 0.2 was defined as a genomic region associated with stalk rot resistance, and genes within this region were considered candidate genes. Candidate genes were identified and annotated on the MaizeGDB website (https://www.maizegdb.org) using the B73 version 2 reference genome (https://ensembl.gramene.org/Zea_mays).

2.6. Haplotype analysis in genomic regions associated with stalk rot

LD blocks in genomic regions containing SNPs associated with stalk rot were identified using LDBlockShow software [32] via standardized disequilibrium coefficients (D′) [33]. The favorable haplotype of each LD block was the sequence associated with the lowest value of stalk rot in the individual and combined environments. The relative frequency of each favorable haplotype was calculated separately for the Tuxpeño and non-Tuxpeño populations and their tropical and temperate progenitors. The effect of each favorable haplotype in each population was calculated as the average phenotypic deviation of the individuals carrying the favorable haplotype relative to the population mean.

2.7. Genomic prediction

Genomic prediction for stalk rot was conducted using models described by Jarquin et al. [34] and Mageto et al. [35]. Phenotypic data analysis to calculate the BLUEs was based on the baseline model:

$$Y_{ij} = \mu + E_i + G_j + GE_{ij} + \varepsilon_{ij} \tag{4}$$

where $Y_{ij}$ is the response of the $j$th ($j = 1, \ldots, J$) genotype tested in the $i$th ($i = 1, \ldots, I$) environment, $\mu$ is the overall mean, $E_i$ is the random environmental main effect [$E_i \overset{iid}{\sim} N(0, \sigma_E^2)$], $G_j$ is the random genotype effect [$G_j \overset{iid}{\sim} N(0, \sigma_G^2)$], $GE_{ij}$ is the random interaction between the $j$th genotype and the $i$th environment [$GE_{ij} \overset{iid}{\sim} N(0, \sigma_{GE}^2)$], and $\varepsilon_{ij}$ is the random residual [$\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma_\varepsilon^2)$]. In this model, $E_i$, $G_j$, $GE_{ij}$, and $\varepsilon_{ij}$ are assumed to be independent and identically distributed (iid) normal variables, N(., .); $\sigma_E^2$, $\sigma_G^2$, $\sigma_{GE}^2$, and $\sigma_\varepsilon^2$ are the variances for environment, genotype, G × E interaction, and residual error, respectively. This baseline model does not exploit the covariance among genotypes because the genotypes are treated as independent outcomes. Models used for genomic prediction were derived from the baseline model above by excluding terms, modifying assumptions, or incorporating marker information. Below is a brief description of the genomic models.

Model 1 (Environment + Line) is obtained by retaining the first three components from the baseline model ($\mu$, $E_i$, and $G_j$) while their underlying assumptions remain unchanged. Model 1 uses only the observed phenotypic values to predict the phenotypic values of genetically related lines.

Model 2 (Environment + Line + Marker) is derived from an alternative representation of the random main effect of line ($G_j$) in the baseline model as a linear combination of markers and their corresponding effects:

$$Y_{ij} = \mu + E_i + G_j + q_j + \varepsilon_{ij} \tag{5}$$

where $q_j = \sum_{m=1}^{p} x_{jm} b_m$, $b_m \overset{iid}{\sim} N(0, \sigma_b^2)$ represents the random effect of the $m$th ($m = 1, \ldots, p$) marker, $x_{jm}$ is the genotype of the $j$th line at the $m$th marker, and $\sigma_b^2$ is its corresponding variance. Thus, $q = (q_1, \ldots, q_J)'$ represents the vector of marker genetic effects and is normally distributed with mean zero and covariance matrix $\mathrm{Cov}(q) = \mathbf{G}\sigma_q^2$, where $\mathbf{G} = \mathbf{X}\mathbf{X}'/p$ is the genomic relationship matrix, with $\mathbf{X}$ representing the centered and standardized marker matrix, such that $\sigma_q^2 = p\,\sigma_b^2$ [35]. The line effect was retained in the model to account for imperfect information and model misspecification because of potential imperfect linkage disequilibrium between markers [36].

Model 3 (Environment + Line + Marker + Marker × Environment) extends the Genomic Best Linear Unbiased Predictor (GBLUP) random effect model by modeling the main effects of lines (genotypes), markers, environments, and their interactions using covariance structures that are functions of marker genotypes and environments [36]. The model can be expressed as

$$Y_{ij} = \mu + E_i + G_j + q_j + qE_{ij} + \varepsilon_{ij} \tag{6}$$

where $qE_{ij}$ is the random interaction between the genetic value of the $j$th marker genotype and the $i$th environment.

The three models were fitted using the Bayesian generalized linear regression (BGLR) R package [37,38]. Because the BGLR software cannot handle heterogeneous error variances, all models were fitted assuming homogeneous error variances across environments. First, each model was fitted to the entire dataset for each population to estimate variance components and assess model fit based on the deviance information criterion (DIC) [39]. Next, genomic prediction was performed using the Gaussian model (Bayesian ridge regression), assuming Gaussian priors for the marker effects with default parameters in BGLR.

Two random cross-validation schemes were used. The first scheme (CV1) used an independent but related training dataset to evaluate the prediction accuracies of models when the testing set has not been evaluated in any environment. Thus, CV1 tests the ability of the model to predict the breeding values of new lines that were not used to train the model.
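The genomic relationship matrix G = XX′/p used in Model 2 can be illustrated with a few lines of plain Python. This is only a sketch under stated assumptions: markers are coded 0/2 for inbred lines, each marker column is standardized with the population (divide-by-n) variance, and the toy genotype matrix and function names are invented for illustration. It is not the workflow used in the study, which relied on TASSEL and BGLR.

```python
import math


def standardize_markers(genotypes):
    """Center each marker column to mean 0 and scale it to variance 1."""
    n_lines = len(genotypes)
    n_markers = len(genotypes[0])
    columns = []
    for m in range(n_markers):
        column = [row[m] for row in genotypes]
        mean = sum(column) / n_lines
        variance = sum((value - mean) ** 2 for value in column) / n_lines
        if variance == 0.0:
            raise ValueError(f"marker {m} is monomorphic and cannot be scaled")
        sd = math.sqrt(variance)
        columns.append([(value - mean) / sd for value in column])
    # Transpose back to a lines-by-markers layout.
    return [[columns[m][j] for m in range(n_markers)] for j in range(n_lines)]


def genomic_relationship(genotypes):
    """Return G = X X' / p for the standardized marker matrix X."""
    x = standardize_markers(genotypes)
    n_lines, n_markers = len(x), len(x[0])
    return [
        [
            sum(x[i][m] * x[j][m] for m in range(n_markers)) / n_markers
            for j in range(n_lines)
        ]
        for i in range(n_lines)
    ]


# Toy data: 4 inbred lines x 3 markers, coded 0/2 (hypothetical values).
toy_genotypes = [
    [0, 2, 0],
    [2, 2, 0],
    [0, 0, 2],
    [2, 0, 2],
]

G = genomic_relationship(toy_genotypes)
for row in G:
    print([round(value, 2) for value in row])
```

With this scaling the diagonal of G averages 1 and each row sums to zero, so the off-diagonal entries can be read as similarities relative to the population average.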
The second scheme (CV2) mimics unbalanced field trials (e.g., sparse testing designs) and aims to predict the breeding values of genotypes that were not tested in one or more environments. Thus, the goal of CV2 is to test the ability of the model to predict the breeding values of individuals in environments in which they have not yet been tested. In CV2, prediction accuracies can be improved by exploiting the covariance among lines within an environment, lines across environments, and correlated environments.

For both CV1 and CV2, a fivefold cross-validation was performed to assess the prediction accuracy for stalk rot within each population. The inbred lines and their corresponding SNP and phenotypic data were randomly divided into five subsets, with four subsets (80%) used for training and one (20%) for validation. The permutations from the random subdivisions led to five training and validation sets. The procedure was repeated 20 times for each population, and the mean correlation between the observed stalk rot breeding values and the genomic estimated breeding values (GEBVs) was defined as the prediction accuracy (rMP).

The effect of the size of the training set relative to the validation set on the prediction accuracy was assessed using a series of random selections of 70%, 60%, 50%, 40%, 30%, and 20% of the lines as the training set to predict the breeding values of the rest of the lines using Model 3 and CV2. The effect of the genetic relationship between the training and the prediction sets on prediction accuracy was assessed using either population as the training set to predict the values of the other with Model 3 and CV2. In all cases, predictions were based on 30,000 samples from the posterior distribution and a burn-in of 15,000 samples.

3. Results

3.1. Phenotypic characteristics of stalk rot

The non-Tuxpeño population generally had a higher mean for stalk rot in the individual and combined environments than the Tuxpeño population.
The mean stalk rot across environments for non-Tuxpeño was roughly 10 percentage points higher than the Tuxpeño mean (25% vs. 16%; Table 1). The means of both populations were highest at Cortazar and lowest at Juventino Rosas (Table 1). Heritability estimates for stalk rot were high (> 0.50) for both populations at Cortazar and Valle de Santiago, and low (< 0.25) at Juventino Rosas (Table 1). For non-Tuxpeño, the heritability at Juventino Rosas was near zero (0.005; Table 1), indicating a lack of genetic variance for stalk rot at this location. Consequently, Juventino Rosas was excluded from further analyses for the non-Tuxpeño population, leaving a combined heritability estimate of 0.76 (Table 1).

3.2. Correlation between stalk rot and other agronomic traits

The Tuxpeño and non-Tuxpeño populations differed for grain yield, grain moisture, number of days to anthesis, and test weight but not for plant height (Table S4). Compared with the non-Tuxpeño population, the Tuxpeño population had higher grain yield, lower grain moisture, and more days to anthesis (Table S4). The pairwise correlation coefficients among the six traits showed a moderate to strong negative correlation between stalk rot and grain yield, grain moisture, and plant height in both populations (Table S5). The correlation with stalk rot in the Tuxpeño population ranged from r = −0.24 (P < 0.01) for plant height to r = −0.64 (P < 0.01) for grain moisture. Grain moisture and grain yield were strongly and negatively correlated (r < −0.51; P < 0.01) with stalk rot.

3.3. SNP summary statistics and population structure

The full dataset of 200,681 SNPs for the 737 lines had a missing rate of 0.23%, an average heterozygosity of 1.44%, and an average minor allele frequency (MAF) of 0.23 (Table S6). These summary statistics were similar for the Tuxpeño and non-Tuxpeño populations, excluding the progenitors.
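As a consistency check on Eq. (2), the combined-environment heritabilities reported in Table 1 can be recomputed from the variance components listed there. The Python sketch below assumes two replications per environment (as stated in Section 2.2), three environments for Tuxpeño, and two environments for non-Tuxpeño after Juventino Rosas was excluded; the last of these is an inference from the text rather than a stated value. The function name is illustrative.

```python
def heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Eq. (2): var_g / (var_g + var_ge / e + var_e / (r * e))."""
    return var_g / (var_g + var_ge / n_env + var_e / (n_rep * n_env))


# Variance components as listed in Table 1 (combined analyses).
h2_tuxpeno = heritability(
    var_g=63.14, var_ge=73.93, var_e=216.12, n_env=3, n_rep=2
)
# Assumes two environments remained for non-Tuxpeno (Juventino Rosas excluded).
h2_non_tuxpeno = heritability(
    var_g=938.48, var_ge=313.42, var_e=558.61, n_env=2, n_rep=2
)

print(f"Tuxpeno:     h2 = {h2_tuxpeno:.2f}")
print(f"non-Tuxpeno: h2 = {h2_non_tuxpeno:.2f}")
```

Both values round to those reported in Table 1 (0.51 and 0.76). The same function reproduces the single-environment estimates when the G × E term is set to zero and the number of environments to one, for example 0.53 for Tuxpeño at Cortazar.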
The first two principal components (PCs) explained respectively 23.8%, 17.2%, and 14.7% of the total SNP marker variance for Tux- peño, non-Tuxpeño, and the complete set of 737 lines. The first two PCs subdivided the populations into respectively five, three, and four clusters associated with germplasm subgroups within Tuxpeño, non-Tuxpeño, and the entire dataset (Fig. 1A–C). The first two PCs also clearly separated the Tuxpeño and non-Tuxpeño het- erotic groups among the 737 genotyped lines (Fig. 1C). The average LD decay distance at r2 = 0.2 across the ten chromo- somes was 1.27 kb for Tuxpeño, 0.94 kb for non-Tuxpeño, and 0.92 kb for the combined population (Fig. 1D–F). 3.4. Genomic regions associated with stalk rot resistance Eighteen SNPs were associated with stalk rot in the two popu- lations at P values ranging from 1.15 � 10�5 to 1.09 � 10�6 (Fig. 2A, B; Table 2). Eight of the 18 SNPs were detected in the Tux- peño population on chromosomes 1, 3, 4, and 10, and 10 SNPs in non-Tuxpeño on chromosomes 5, 6, and 7. The phenotypic vari- ance explained by each SNP ranged from 6.20% to 9.09%, suggesting that the trait was controlled by many loci with minor effects. The Table 1 Descriptive statistics (range and mean), genetic variances (VG), genotype-environment interaction variances (VGE), error variances (Ve), and heritability estimates (h2) for percent stalk rot in the Tuxpeño and non-Tuxpeño populations across three environments. Environment Range Mean VG Ve VGE h2 Tuxpeño Cortazar 7.66–91.99 31 ± 1.2 252.35 447.56 0.53 Valle de Santiago 5.21–87.36 14 ± 0.9 171.51 201.46 0.63 Juventino Rosas 1.27–19.02 3 ± 0.2 1.88 13.33 0.22 Combined 5.27–55.51 16 ± 0.6 63.14 216.12 73.93 0.51 non-Tuxpeño Cortazar 3.04–105.1 44 ± 1.7 587.92 378.94 0.76 Valle de Santiago 2.42–97.23 28 ± 1.5 231.62 173.42 0.73 Juventino Rosas 3.08–16.37 3 ± 0.2 0.04 15.92 0.005 Combined 4.36–64.11 25 ± 0.9 938.48 558.61 313.42 0.76 Fig. 1. 
Principal component (PC) plots for Tuxpeño (A), non-Tuxpeño (B), and the 737 inbred lines, including progenitors (C), the linkage disequilibrium (r2) decay in Tuxpeño (D), non-Tuxpeño (E), and the 677 inbred lines (excluding the progenitors) (F). J. Song, A. Pacheco, A. Alakonya et al. The Crop Journal 12 (2024) 558–568 18 SNPs were consistently detected across environments (Table S7). Based on the LD decay distance, the eight SNPs could be grouped into six genomic regions in Tuxpeño on chromosome 1 at positions 187 Mb and 191 Mb, on chromosome 3 at 215 Mb, chromosome 4 at 168 Mb and 190 Mb, and chromosome 10 at 147 Mb. In contrast, four genomic regions were identified in non- Tuxpeño on chromosome 5 at 22 Mb and 33 Mb, chromosome 6 at 136 Mb, and chromosome 7 at 8 Mb (Table 2). These genomic regions also contained annotated genes with known predicted functions (Table S8). Genomic regions detected in Tuxpeño did not overlap with those in non-Tuxpeño. Eight favorable haplotypes were detected in the 10 genomic regions: five from the Tuxpeño and three from the non-Tuxpeño population (Fig. S1), where 13 SNPs associated with FSR resistance were contained (Table S9). The frequencies of the favorable haplo- types ranged from 3.37% to 39.27% in Tuxpeño, and 0.36% to 73.30% in non-Tuxpeño (Table 3). A search of the haplotypes among the tropical and temperate progenitors found all eight 562 haplotypes in tropical progenitors (Table 3), and only five in temperate progenitors. The top 13 lines with the lowest percent stalk rot in both Tux- peño and non-Tuxpeño populations contained one to three favor- able haplotypes (Table S10). The presence of more favorable haplotypes in the most resistant lines suggests that the results of the haplotype analysis were consistent with the performance of the hybrids for stalk rot. 3.5. 
Prediction accuracy in different models and cross-validation schemes

Model 3 showed the lowest DIC values for both Tuxpeño and non-Tuxpeño populations, indicating that it provided the best fit (Table S11). Among the three models, Model 1 showed the lowest prediction accuracy for CV1 (rMP ≈ 0) and a moderate to high prediction accuracy for CV2 in both the Tuxpeño (rMP = 0.31) and non-Tuxpeño (rMP = 0.47) populations (Fig. 3A–D). These results show that the stalk rot phenotypic values of a set of lines are a poor predictor of the stalk rot values of a genetically related set of lines. In contrast, Model 1 CV2 shows that prediction accuracies can be improved by taking advantage of the covariance structure among environments.

Fig. 2. Manhattan and quantile–quantile (Q–Q) plots for GWAS of Tuxpeño and non-Tuxpeño populations: Manhattan plots showing significant SNPs and associated genes for the Tuxpeño population (A) and the non-Tuxpeño population (B); Q–Q plots for the Tuxpeño population (C) and the non-Tuxpeño population (D).

Table 2
SNPs associated with stalk rot in the Tuxpeño and non-Tuxpeño populations based on GWAS.

SNP            Allele  P-value   PVE (%)a  Binb   Genomic region
Tuxpeño
S1_187642387   G/A     1.09E-06  7.53      1.06   chr1:187640937–187643837
S1_191581422   G/A     9.56E-06  6.31      1.06   chr1:191579972–191582872
S3_215475106   A/T     7.34E-06  6.45      3.08   chr3:215474116–215476096
S4_168991386   C/A     5.32E-06  6.64      4.06   chr4:168989856–168992916
S4_190444220   G/C     1.15E-05  6.2       4.08   chr4:190442690–190445750
S10_147125575  A/C     8.63E-06  6.36      10.07  chr10:147124415–147126748
S10_147125579  G/T     8.63E-06  6.36      10.07
S10_147125588  G/C     8.63E-06  6.36      10.07
non-Tuxpeño
S5_22556254    G/T     3.81E-06  7.51      5.03   chr5:22555334–22557174
S5_33821881    A/T     8.10E-06  8.21      5.03   chr5:33820961–33822804
S5_33821884    C/T     8.10E-06  8.21      5.03
S6_136411596   C/G     6.75E-06  8.34      6.05   chr6:136410666–136412541
S6_136411602   C/G     6.75E-06  8.34      6.05
S6_136411611   C/T     6.05E-06  8.42      6.05
S7_8047244     G/C     4.72E-06  8.63      7.01   chr7:8046354–8048176
S7_8047279     G/A     4.72E-06  8.63      7.01
S7_8047282     G/A     4.72E-06  8.63      7.01
S7_8047286     A/G     2.54E-06  9.09      7.01

a Percentage of the phenotypic variance explained by each QTL.
b Chromosome bin.

With Model 2, the prediction accuracy for CV1 relative to Model 1 increased from 0.0 in both Tuxpeño and non-Tuxpeño to 0.31 in Tuxpeño and 0.60 in non-Tuxpeño (Fig. 3A, C). The prediction accuracy for CV2 also increased, from 0.31 to 0.38 in Tuxpeño and from 0.47 to 0.62 in non-Tuxpeño (Fig. 3B, D), indicating the advantage of including marker effects for both cross-validation schemes.

The G × E model (Model 3), which includes the interaction between markers and the environment, gave higher prediction accuracy than the genotype (line) main effects models (Model 1 and Model 2). The mean prediction accuracy for Model 3 increased for CV1 relative to Model 2 from 0.31 to 0.35 in Tuxpeño and from 0.60 to 0.70 in non-Tuxpeño (Fig. 3A, C). A similar trend was observed for CV2, especially in non-Tuxpeño, where prediction accuracy increased from 0.62 in Model 2 to 0.77 in Model 3 (Fig. 3B, D).

Table 3
The genetic location, frequency, effect, and source populations of eight favorable haplotypes associated with stalk rot resistance in the Tuxpeño and non-Tuxpeño populations.

LD blocks          Favorable haplotypes  Effect (%)            Frequency (%)
Physical position  ID    Sequence        Tuxpeño  non-Tuxpeño  Tuxpeño  non-Tuxpeño  Temperate parents  Tropical parents
Tuxpeño
Chr.1, ~187.6 Mb   H1-1  GTT             −2.63    −13.15       13.55    18.28        27.60              25.80
Chr.1, ~191.5 Mb   H1-2  GT              −5.78    −4.54        10.3     3.58         0                  9.70
Chr.3, ~215.4 Mb   H3    TTGCGTCAT       −2.03    −3.76        34.9     51.05        69.00              19.40
Chr.4, ~168.8 Mb   H4-1  CGGGATGGC       −1.09    −6.11        3.37     12.21        0                  3.20
Chr.4, ~190.4 Mb   H4-2  GT              −5.8     −2.58        39.27    73.3         50.00              50.00
non-Tuxpeño
Chr.5, ~22.5 Mb    H5    GTCCCCT         −1.04    −20.13       3.86     8.33         10.30              3.20
Chr.6, ~136.4 Mb   H6    TTGGCCC         −10.9    −15.86       9.26     0.36         0                  3.20
Chr.7, ~8.0 Mb     H7    GGGGTATC        −5.52    −11.93       14.88    31.56        17.24              6.45

Fig. 3. Genomic prediction accuracies for stalk rot in the Tuxpeño and non-Tuxpeño populations when training the model on 80% of the population to predict the remaining 20%. (A–D) Results for the baseline model based on phenotypic data alone (M1: Environment + Line), the model incorporating genomic effects without G × E (M2: Environment + Line + Genomic), and the model incorporating genomic effects and G × E (M3: Environment + Line + Genomic + Genomic × Environment). The genomic prediction was conducted using two cross-validation schemes: CV1, equivalent to using an existing independent but related data set to predict breeding values of newly developed lines, and CV2, equivalent to predicting breeding values in unbalanced multi-environment trials. (E, F) Genomic prediction accuracies for Model 3 and CV2 in non-Tuxpeño using the stalk rot data from positively correlated environments only (E) and the phenotypic data from all three environments, including a negatively correlated environment (F). (G) The prediction accuracies for Model 3 and CV2 of each individual environment, including the uncorrelated environment (Juventino Rosas), in the non-Tuxpeño population.

The prediction accuracy of all models was affected by the correlation among environments. In the non-Tuxpeño population, the correlation between Cortazar and Valle de Santiago stalk rot means was significant and positive, whereas Juventino Rosas was not correlated with either Cortazar or Valle de Santiago (Fig. S2). Excluding Juventino Rosas from Model 3 increased prediction accuracy from 0.50 to 0.77. A similar trend was observed for Model 1 and Model 2 (Fig. 3E, F).
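The two cross-validation schemes defined in the caption of Fig. 3 differ only in which line-by-environment records are hidden from the model. The Python sketch below illustrates that difference and the accuracy statistic (the correlation between observed and predicted values). It is a minimal illustration, not the authors' implementation: the function names, the 20% test fraction applied to individual cells in CV2, and the rule that keeps every line observed in at least one environment are assumptions made for the example.

```python
import numpy as np

def cv1_mask(n_lines, n_envs, test_frac=0.2, seed=0):
    """CV1: hide whole lines, so a test line has no record in any environment."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * n_lines))
    test_lines = rng.choice(n_lines, size=n_test, replace=False)
    mask = np.zeros((n_lines, n_envs), dtype=bool)  # True = hidden from training
    mask[test_lines, :] = True
    return mask

def cv2_mask(n_lines, n_envs, test_frac=0.2, seed=0):
    """CV2: hide single line-by-environment cells; each line stays observed somewhere."""
    rng = np.random.default_rng(seed)
    n_test = int(round(test_frac * n_lines * n_envs))
    mask = np.zeros((n_lines, n_envs), dtype=bool)
    hidden = 0
    # Visit cells in random order; skip a cell if hiding it would leave
    # its line without any observed environment.
    for cell in rng.permutation(n_lines * n_envs):
        if hidden >= n_test:
            break
        i, j = divmod(int(cell), n_envs)
        if mask[i].sum() < n_envs - 1:
            mask[i, j] = True
            hidden += 1
    return mask

def prediction_accuracy(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(observed, predicted)[0, 1])
```

Under CV1 the model can borrow information only through relationships among lines, whereas under CV2 it can also use a line's own records from the other environments, which is why correlated environments matter more for CV2.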
Moreover, prediction accuracies within each environment were approximately zero or negative for all three models for Juventino Rosas, indicating a lack of genetic variance for stalk rot at this location (Fig. 3G).

3.6. Prediction with varying training population sizes and between populations

Prediction accuracies tended to increase with the size of the training set. The prediction accuracy in Tuxpeño increased from 0.29 when 20% of the population was used to predict the remaining 80% to 0.38 when 80% of the population was used to predict the remaining 20% (Fig. 4A). A similar trend was observed in the non-Tuxpeño population, where the prediction accuracy increased from 0.62 when 20% was used to predict 80% to 0.77 when 80% of the population was used to predict the remaining 20% (Fig. 4B).

Prediction accuracy was low when either population was used to train the model to predict stalk rot in the other. When Tuxpeño was used as the training population, prediction accuracies in non-Tuxpeño were 0.00, 0.09, and 0.17 for Model 1, Model 2, and Model 3, respectively (Fig. 4C). The prediction accuracies were equally low when non-Tuxpeño was used to train the model to predict stalk rot in Tuxpeño (Fig. 4C).

Fig. 4. Genomic prediction accuracies for stalk rot resistance in the Tuxpeño and non-Tuxpeño populations when the relative size of the training set was varied from 20% to 80% to predict the rest of the population using Model 3 and CV2 in the Tuxpeño (A) and non-Tuxpeño (B) populations. Prediction accuracy when training non-Tuxpeño population to predict the Tuxpeño population and vice versa (C).

4. Discussion

4.1. Linkage disequilibrium decay and genome-wide association mapping

The LD estimates from this study were higher than those reported in other studies that used inbred lines selected to capture most of the genetic diversity in a germplasm pool. The Tuxpeño and non-Tuxpeño populations were under selection for combining ability for grain yield and were relatively closed, resulting in high LD. The high LD decay distance implies that the resolution of genomic regions detected in this study was lower than that of regions detected in other studies using populations with shorter LD decay distances.

The lack of overlap in the genomic regions detected in the two populations suggests that the two heterotic pools were genetically heterogeneous for stalk rot. Although the tester effect could explain some of the heterogeneity, it is also possible that the populations indeed had unique genomic regions associated with stalk rot. This genetic heterogeneity could result from the genetic divergence expected from hybrid breeding [40] or other population genetics forces such as drift. Whatever the cause, this genetic divergence could facilitate hybrid breeding for stalk rot resistance by enabling independent selection for a few regions in each heterotic group and maximizing resistance in hybrids through complementary dominance and additive effects of the favorable alleles from the parents.

The regions detected in this study did not overlap with regions detected for other stalk rot pathogens in previous studies. None of the ten regions overlapped with known QTL for Gibberella stalk rot, including qRfg1, qRfg2, and qRfg3 [11–13]. Likewise, Ma et al. [14] reported a major QTL for Anthracnose stalk rot in bin 4.07, but the regions detected on chromosome 4 in this study were at position 168.99 Mb in bin 4.06 and 190.44 Mb in bin 4.08 in Tuxpeño. In another study, two genes associated with Pythium stalk rot, one on chromosome 1 (bin 1.03) and the other on chromosome 10 (bin 10.02), were reported by Song et al. [15]; however, the regions detected in this study were in bins 1.06 and 10.07. In yet another study of Pythium stalk rot, Duan et al.
[16] reported two genes, one on chromosome 1 (bin 1.09) and one on chromosome 4 (bin 4.08). The genomic position of the gene on chromosome 4, RpiX178-2, was about 5 Mb from the region detected in bin 4.08 in the Tuxpeño population in this study (Table 2). In a study conducted using a tropical maize population in India, Rashid et al. [7] reported five SNP markers significantly associated with stalk rot caused by Fv on chromosomes 1, 2, and 6. None of the five SNPs were in regions detected in this study, even though the soil and plant residue analyses indicated that Fv was one of the two main pathogens present in the fields used for this study. Differences in inoculation methods could explain some of the lack of congruence between the regions detected in this study and those of Rashid et al. [7]. Natural disease inoculation was used in this study, whereas Rashid et al. [7] relied on artificial inoculation.

The lines that made up the Tuxpeño and non-Tuxpeño populations were derived from tropical and temperate germplasm (Fig. 1C). This genetic background allowed us to track the favorable haplotypes for stalk rot in the tropical and temperate progenitors. The haplotype tracking results suggested more favorable alleles for stalk rot in tropical than in temperate germplasm (Table 3). While more genetic diversity for disease resistance is expected in tropical maize than in temperate maize [41], the difference in the number of favorable haplotypes could be due to sampling, drift, or other genetic forces. The finding that no one line among the tropical progenitors carried all eight favorable haplotypes indicates the potential to improve stalk rot resistance by pyramiding several favorable haplotypes through breeding. Although the effects of the QTL detected in this study were small, marker-assisted selection may be an effective breeding strategy in populations segregating for a few major QTL.
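Haplotype tracking of this kind amounts to comparing the allele string each line carries at an LD block with the favorable sequence for that block. The Python sketch below shows one way to count favorable haplotypes per line and to compute a haplotype frequency. The three favorable sequences are taken from Table 3; the function names and the per-line dictionary layout are hypothetical and are not the procedure used by the authors.

```python
# Favorable sequences for three of the LD blocks listed in Table 3.
FAVORABLE = {"H1-2": "GT", "H4-2": "GT", "H6": "TTGGCCC"}

def count_favorable(line_haplotypes, favorable=FAVORABLE):
    """Number of blocks at which one inbred line carries the favorable haplotype.

    line_haplotypes maps a block ID to the allele string observed in the line;
    blocks missing from the dictionary are treated as not favorable.
    """
    return sum(
        1 for block, seq in favorable.items()
        if line_haplotypes.get(block) == seq
    )

def haplotype_frequency(lines, block, seq):
    """Percentage of lines carrying `seq` at `block`, ignoring missing calls."""
    called = [line[block] for line in lines if line.get(block) is not None]
    if not called:
        return float("nan")
    return 100.0 * sum(h == seq for h in called) / len(called)
```

A line with a count equal to the number of blocks would carry every favorable haplotype; ranking lines by this count is one simple way to choose parents for pyramiding.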
Such a strategy works well with well-validated QTL or genes with large effects in the target population [15,16].

Most genes identified in the 10 regions are involved in plant growth and development (Table S8). However, some genes, including GRMZM2G122025 (a stress response NST1-like protein) [42], GRMZM2G046021 (a histone acetyltransferase GNAT/MYST) [43], and GRMZM5G860810 (a leucine-rich repeat (LRR) protein kinase family protein), have been associated with biotic and abiotic stress responses in plants [44,45].

4.2. Factors affecting genomic prediction accuracy

The moderate to high prediction accuracies observed in this study indicate that genomic selection is an effective strategy for identifying superior genotypes for stalk rot. The prediction accuracies were slightly higher than those of previous studies showing moderate to high prediction accuracies for ear rot, including Fusarium ear rot and Fumonisin ear rot [46–48]. Higher prediction accuracies in this study could be attributed to the higher heritability for stalk rot (Table 1) than for ear rot [47,48], consistent with the general expectation that prediction accuracies increase with heritability [19,35].

In addition to heritability, the prediction accuracy could also be affected by the prediction model used, the size of the training set, G × E interaction (which affects the heritability), and the relationship between the training and the testing populations. The G × E model had the highest prediction accuracy of the three models used in both the Tuxpeño and non-Tuxpeño populations (Fig. 3). Further, a relatively small training set is unlikely to sufficiently sample the genetic and phenotypic diversity in the entire population, resulting in low prediction accuracies [35,49]. The prediction accuracy for stalk rot in Tuxpeño increased with an increase in the size of the training set (Fig. 4).
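The dependence of accuracy on training set size can be explored with a small resampling experiment. The Python sketch below uses ridge regression on marker scores as a simple stand-in for the Bayesian models fitted in the study; it is not the authors' model, and the marker matrix, effect sizes, penalty value, and function names are all simulated or hypothetical.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_test, lam):
    """Ridge regression on markers, solved in its n x n (dual) form."""
    mu = y_train.mean()
    K = X_train @ X_train.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    return mu + X_test @ X_train.T @ alpha

def accuracy_by_training_fraction(X, y, fractions, lam=1.0, n_rep=20, seed=0):
    """Mean correlation between observed and predicted values per training fraction."""
    rng = np.random.default_rng(seed)
    n = len(y)
    out = {}
    for f in fractions:
        accs = []
        for _ in range(n_rep):
            perm = rng.permutation(n)
            n_train = int(round(f * n))
            train, test = perm[:n_train], perm[n_train:]
            pred = ridge_predict(X[train], y[train], X[test], lam)
            accs.append(np.corrcoef(y[test], pred)[0, 1])
        out[f] = float(np.mean(accs))
    return out

# Simulated toy data: 200 inbred lines, 500 markers coded -1/1, about 10% causal.
rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(200, 500))
beta = rng.normal(0.0, 1.0, 500) * (rng.random(500) < 0.1)
g = X @ beta
y = g + rng.normal(0.0, g.std(), 200)  # noise scaled for a heritability near 0.5
result = accuracy_by_training_fraction(X, y, [0.2, 0.5, 0.8], lam=500.0)
```

Each entry of `result` is an average over random splits, so single splits can deviate from the overall trend, as they can in real data.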
However, depending on the genetic architecture of the trait and population structure, a small training set could still result in high prediction accuracies. Using 20% of the population to predict stalk rot in the rest of the population resulted in a prediction accuracy > 0.6 in the non-Tuxpeño population.

Using the Tuxpeño population as a training set to predict non-Tuxpeño and vice versa resulted in poor prediction accuracies. Poor prediction accuracy when training and predicting across populations was expected because, in addition to the tester effect, hybrid breeding with two heterotic pools drives allele frequencies of the germplasm pools in opposite directions [40]. Similar observations have been reported for the effects of G × E, the relationship between the training and testing sets, and training population size for various traits [35,49].

In general, prediction accuracies were lower when predicting stalk rot for new untested lines (CV1) than when predicting stalk rot in unbalanced multi-environment trials (CV2). Higher prediction accuracy for CV2 is achieved by using phenotypic values of lines already tested and the genetic covariance among lines, and by exploiting correlation among environments. For this reason, excluding the uncorrelated environment (Juventino Rosas) increased the prediction accuracy for the G × E model in non-Tuxpeño (Fig. 3). Therefore, when designing genomic prediction breeding schemes for complex traits such as stalk rot and grain yield, selecting sufficiently correlated environments is essential [50]. In CV1, the prediction set is not tested in any of the environments, resulting in a less effective exploitation of the covariance structure among environments than in CV2, where all the lines are tested in at least one environment.

Both CV1 and CV2 are useful for breeding programs, but they increase genetic gain in two distinct ways.
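The two routes can be read against the standard breeder's equation for genetic gain per year. The equation is not written out in the text; it is added here only as a reminder, and the symbols are the usual textbook ones rather than notation from this study.

```latex
\[
\Delta G_{\text{year}} = \frac{i \, r \, \sigma_A}{L}
\]
```

Here i is the selection intensity, r the selection accuracy, \sigma_A the additive genetic standard deviation, and L the length of the breeding cycle in years. Shortening the cycle lowers L, whereas more accurate selection raises r.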
With an appropriate training dataset, the CV1 scheme permits the prediction of trait values for new lines without phenotypic data, allowing breeders to skip testing stages and reduce breeding cycle time. The CV2 scheme, in contrast, increases genetic gain primarily by increasing selection accuracy in unbalanced experiments. However, in most cases, breeders prefer to sacrifice some accuracy for speed and have a product on the market as quickly as possible. Whatever the case, reducing cycle time has the largest effect on genetic gain [51], and the associated benefits may be great enough to compensate for the lower prediction accuracy of CV1. Given the CV1 prediction accuracies of 0.35 in Tuxpeño and 0.70 in non-Tuxpeño, and the marginal increases in prediction accuracy observed with CV2, the advantages of predicting untested lines for stalk rot can still be substantial.

Data availability

The datasets generated and analyzed for this study are available from the CIMMYT data and software repository network: https://hdl.handle.net/11529/10548947.

CRediT authorship contribution statement

Junqiao Song: Data curation, Formal analysis, Visualization, Writing – original draft, Writing – review & editing. Angela Pacheco: Investigation, Formal analysis. Amos Alakonya: Investigation, Methodology. Andrea S. Cruz-Morales: Investigation, Methodology. Carlos Muñoz-Zavala: Investigation, Methodology. Jingtao Qu: Investigation, Formal analysis. Chunping Wang: Conceptualization, Methodology, Writing – review & editing. Xuecai Zhang: Conceptualization, Methodology, Data curation, Formal analysis, Writing – review & editing. Felix San Vicente: Project administration, Conceptualization. Thanda Dhliwayo: Conceptualization, Methodology, Data curation, Project administration, Supervision.
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was funded by the CGIAR Research Program (CRP) on MAIZE, the USAID through the Accelerating Genetic Gains Supplemental Project (Amend. No. 9 MTO 069033), and the One CGIAR Initiative on Accelerated Breeding. The MAIZE CRP received funding from the governments of Australia, Belgium, Canada, China, France, India, Japan, the Republic of Korea, Mexico, the Netherlands, New Zealand, Norway, Sweden, Switzerland, the United Kingdom, the United States, and the World Bank. JS was supported by the China Scholarship Council. The authors thank Oscar Garcia-Romero and Jorge Martinez-Ruiz for their technical support of the work.

Appendix A. Supplementary data

Supplementary data for this article can be found online at https://doi.org/10.1016/j.cj.2024.02.004.

References

[1] D.G. White, Compendium of corn diseases, third edition, American Phytopathological Society Press, St Paul, MN, USA, 1999.
[2] S. Zahniser, N.F. López-López, M. Motamed, Z.Y. Silva-Vargas, T. Capehart, The growing corn economies of Mexico and the United States (a report from the Economic Research Service, OCS-19F-02), Economic Research Service, USDA, USA, 2019.
[3] W.J. Li, P. He, J.Y. Jin, Effect of potassium on ultrastructure of maize stalk pith and young root and their relation to stalk rot resistance, Agric. Sci. China 9 (2010) 1467–1474.
[4] M.G. Figueroa-Rivera, R. Rodríguez-Guerra, B.Z. Guerrero-Aguilar, M.M. González-Chavira, J.L. Pons-Hernández, Characterization of fusarium species associated with rotting of corn root in Guanajuato, Mexico, Revista Mexicana de Fitopatología 28 (2010) 124–134.
[5] Z.R. Mir, P.K. Singh, P.H. Zaidi, M.T. Vinayan, S.S. Sharma, M.K. Krishna, A.K. Vemula, A. Rathore, S.K.
Nair, Genetic analysis of resistance to post flowering stalk rot in tropical germplasm of maize (Zea mays L.), Crop Protect. 106 (2018) 42–49.
[6] P.J. Donahue, E.L. Stromberg, C.W. Roane, A diallel study of stalk rot resistance in elite maize and its interaction with yield, Virginia J. Sci. 40 (1989) 157–170.
[7] Z. Rashid, V. Babu, S.S. Sharma, P.K. Singh, S.K. Nair, Identification and validation of a key genomic region on chromosome 6 for resistance to fusarium stalk rot in tropical maize, Theor. Appl. Genet. 135 (2022) 4549–4563.
[8] S. Liu, J. Fu, Z. Shang, X. Song, M. Zhao, Combination of genome-wide association study and QTL mapping reveals the genetic architecture of fusarium stalk rot in maize, Front. Agron. 2 (2021) 590374.
[9] Y. Kou, S. Wang, Broad-spectrum and durability: understanding of quantitative disease resistance, Curr. Opin. Plant Biol. 13 (2010) 181–185.
[10] C. Wang, Q. Yang, W. Wang, Y. Li, Y. Guo, D. Zhang, X. Ma, W. Song, J. Zhao, M. Xu, A transposon-directed epigenetic change in ZmCCT underlies quantitative resistance to gibberella stalk rot in maize, New Phytol. 215 (2017) 1503–1515.
[11] Q. Yang, G. Yin, Y. Guo, D. Zhang, S. Chen, M. Xu, A major QTL for resistance to gibberella stalk rot in maize, Theor. Appl. Genet. 121 (2010) 673–687.
[12] D. Zhang, Y. Liu, Y. Guo, Q. Yang, J. Ye, S. Chen, M. Xu, Fine-mapping of qRfg2, a QTL for resistance to gibberella stalk rot in maize, Theor. Appl. Genet. 124 (2012) 585–596.
[13] C. Ma, X. Ma, L. Yao, Y. Liu, F. Du, X. Yang, M. Xu, qRfg3, a novel quantitative resistance locus against gibberella stalk rot in maize, Theor. Appl. Genet. 130 (2017) 1723–1734.
[14] W. Ma, X. Gao, T. Han, M.T. Mohammed, J. Yang, J. Ding, W. Zhao, Y.L. Peng, V. Bhadauria, Molecular genetics of anthracnose resistance in maize, J. Fungi (Basel) 8 (2022) 540.
[15] F.J. Song, M.G. Xiao, C.X. Duan, H.J. Li, Z.D. Zhu, B.T. Liu, S.L. Sun, X.F. Wu, X.M. Wang, Two genes conferring resistance to pythium stalk rot in maize inbred line Qi319, Mol. Genet. Genomics 290 (2015) 1543–1549.
[16] C. Duan, F. Song, S. Sun, C. Guo, Z. Zhu, X. Wang, Characterization and molecular mapping of two novel genes resistant to pythium stalk rot in maize, Phytopathology 109 (2019) 804–809.
[17] H. Fu, H.K. Dooner, Intraspecific violation of genetic colinearity and its implications in maize, Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 9573–9578.
[18] T.H. Meuwissen, B.J. Hayes, M.E. Goddard, Prediction of total genetic value using genome-wide dense marker maps, Genetics 157 (2001) 1819–1829.
[19] H. Zhang, L. Yin, M. Wang, X. Yuan, X. Liu, Factors affecting the accuracy of genomic selection for agricultural economic traits in maize, cattle, and pig populations, Front Genet. 10 (2019) 189.
[20] R. Guo, J. Chen, C.D. Petroli, A. Pacheco, X. Zhang, F. San Vicente, S.J. Hearne, T. Dhliwayo, The genetic structure of CIMMYT and U.S. inbreds and its implications for tropical maize breeding, Crop Sci. 61 (2021) 1666–1681.
[21] A.M. Stucker, E. Morris, C.J. Stubbs, D.J. Robertson, The crop clamp - a non-destructive electromechanical pinch test to evaluate stalk lodging resistance, HardwareX 10 (2021) e00226.
[22] T.A. Jackson-Ziems, J.M. Rees, R.M. Harveson, Common stalk rot diseases of corn, Papers in Plant Pathology 532 (2014), http://digitalcommons.unl.edu/plantpathpapers/532.
[23] A. Chakrabarti, J.K. Ghosh, AIC, BIC and recent advances in model selection, in: P.S. Bandyopadhyay, M.R. Forster (Eds.), Philosophy of Statistics, North-Holland, Amsterdam, the Netherlands, 2011, pp. 583–605.
[24] D. Bates, M. Mächler, B. Bolker, S. Walker, Fitting linear mixed-effects models using lme4, J. Statistical Soft. 67 (2015) 1–48.
[25] J. Doyle, J. Doyle, A rapid procedure for DNA purification from small quantities of fresh leaf tissue, Phytochem. Bull. 19 (1987) 11–15.
[26] R.J. Elshire, J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, E.S. Buckler, S.E. Mitchell, A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species, PLoS ONE 6 (2011) e19379.
[27] P. Bradbury, Z. Zhang, D. Kroon, T. Casstevens, Y. Ramdoss, E. Buckler, TASSEL: software for association mapping of complex traits in diverse samples, Bioinformatics 23 (2007) 2633–2635.
[28] D. Money, K. Gardner, Z. Migicovsky, H. Schwaninger, G.Y. Zhong, S. Myles, LinkImpute: fast and accurate genotype imputation for nonmodel organisms, G3-Genes Genomes Genet. 5 (2015) 2383–2390.
[29] H. Wickham, ggplot2: elegant graphics for data analysis, 2nd, Springer, New York, NY, USA, 2009.
[30] T. Van den Ende, F.A. Abe Nijenhuis, H.G. van den Boorn, E. Ter Veer, M.C.C.M. Hulshof, S.S. Gisbertz, M.G.H. van Oijen, H.W.M. van Laarhoven, COMplot, a graphical presentation of complication profiles and adverse effects for the curative treatment of gastric cancer: a systematic review and meta-analysis, Front Oncol. 9 (2019) 684.
[31] M.X. Li, J.M. Yeung, S.S. Cherny, P.C. Sham, Evaluating the effective numbers of independent tests and significant p-value thresholds in commercial genotyping arrays and public imputation reference datasets, Hum. Genet. 131 (2012) 747–756.
[32] S.S. Dong, W.M. He, J.J. Ji, C. Zhang, Y. Guo, T.L. Yang, LDBlockShow: a fast and convenient tool for visualizing linkage disequilibrium and haplotype blocks based on variant call format files, Brief. Bioinformatics 22 (2020) bbaa227.
[33] S.A. Flint-Garcia, J.M. Thornsberry, E.S. Buckler, Structure of linkage disequilibrium in plants, Annu. Rev. Plant Biol. 54 (2003) 357–374.
[34] D. Jarquin, J. Crossa, X. Lacaze, P. Du Cheyron, J. Daucourt, J. Lorgeou, F. Piraux, L. Guerreiro, P. Perez, M. Calus, J. Burgueno, G. de los Campos, A reaction norm model for genomic selection using high-dimensional genomic and environmental data, Theor. Appl. Genet. 127 (2014) 595–607.
[35] E.K. Mageto, J. Crossa, P. Pérez-Rodríguez, T. Dhliwayo, N. Palacios-Rojas, M. Lee, R. Guo, F. San Vicente, X. Zhang, V. Hindu, Genomic prediction with genotype by environment interaction analysis for kernel zinc concentration in tropical maize germplasm, G3-Genes Genomes Genet. 10 (2020) 2629–2639.
[36] M. Lopez-Cruz, J. Crossa, D. Bonnett, S. Dreisigacker, J. Poland, J.L. Jannink, R.P. Singh, E. Autrique, G. de los Campos, Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model, G3-Genes Genomes Genet. 5 (2015) 569–582.
[37] P. Pérez, G. de los Campos, Genome-wide regression and prediction with the BGLR statistical package, Genetics 198 (2014) 483–495.
[38] P. Pérez, G. de Los Campos, J. Crossa, D. Gianola, Genomic-enabled prediction based on molecular markers and pedigree using the Bayesian linear regression package in R, Plant Genome 3 (2010) 106–116.
[39] D.J. Spiegelhalter, N.G. Best, B.P. Carlin, A. van der Linde, Bayesian measures of model complexity and fit, J. R. Stat. Soc. Ser. B-Stat. Methodol. 64 (2002) 583–639.
[40] E.A. Lee, M. Tollenaar, Physiological basis of successful breeding strategies for maize grain yield, Crop Sci. 47 (2007) S202–S215.
[41] M.M. Goodman, Genetic and germplasm stocks worth conserving, J. Hered. 81 (1990) 11–16.
[42] Q. Zhang, F. Luo, Y. Zhong, J. He, L. Li, Modulation of NAC transcription factor NST1 activity by XYLEM NAC DOMAIN1 regulates secondary cell wall formation in arabidopsis, J. Exp. Bot. 71 (2020) 1449–1458.
[43] X. Liu, M. Luo, W. Zhang, J. Zhao, J. Zhang, K. Wu, L. Tian, J. Duan, Histone acetyltransferases in rice (Oryza sativa L.): phylogenetic analysis, subcellular localization and expression, BMC Plant Biol. 12 (2012) 145.
[44] Y. Zan, Y. Ji, Y. Zhang, S. Yang, Y. Song, J. Wang, Genome-wide identification, characterization and expression analysis of Populus leucine-rich repeat receptor-like protein kinase genes, BMC Genomics 14 (2013) 318.
[45] K.U. Torii, Leucine-rich repeat receptor kinases in plants: structure, function, and signal transduction pathways, in: International Review of Cytology, Academic Press, New York, NY, USA, 2004, pp. 1–46.
[46] Y.B. Liu, G.H. Hu, A. Zhang, A. Loladze, Y.X. Hu, H. Wang, J.T. Qu, X.C. Zhang, M. Olsen, F. San Vicente, J. Crossa, F. Lin, B.M. Prasanna, Genome-wide association study and genomic prediction of fusarium ear rot resistance in tropical maize germplasm, Crop J. 9 (2021) 325–341.
[47] J.B. Holland, T.P. Marino, H.C. Manching, R.J. Wisser, Genomic prediction for resistance to fusarium ear rot and fumonisin contamination in maize, Crop Sci. 60 (2020) 1863–1875.
[48] M.C. Kuki, R.J.B. Pinto, F.A.B. Bertagna, D.J. Tessmann, A. Teixeira do Amaral Júnior, C.A. Scapim, J.B. Holland, Association mapping and genomic prediction for ear rot disease caused by fusarium verticillioides in a tropical maize germplasm, Crop Sci. 60 (2020) 2867–2881.
[49] J. Crossa, P. Pérez-Rodríguez, J. Cuevas, O. Montesinos-López, D. Jarquín, G. de Los Campos, J. Burgueño, J.M. González-Camacho, S. Pérez-Elizalde, Y. Beyene, S. Dreisigacker, R. Singh, X. Zhang, M. Gowda, M. Roorkiwal, J. Rutkoski, R.K. Varshney, Genomic selection in plant breeding: methods, models, and perspectives, Trends Plant Sci. 22 (2017) 961–975.
[50] J.E. Spindel, S.R. McCouch, When more is better: how data sharing would accelerate genomic selection of crop plants, New Phytol. 212 (2016) 814–826.
[51] D.V. Butruille, F.H. Birru, M.L. Boerboom, E.J. Cargill, D.A. Davis, P. Dhungana, G.M. Dill, F. Dong, A.E. Fonseca, B.W. Gardunia, G.J. Holland, N. Hong, P. Linnen, T.E. Nickson, N. Polavarapu, J.K. Pataky, J. Popi, S.B. Stark, Maize breeding in the United States: views from within monsanto, in: J. Janick (Ed.), Plant Breeding Reviews, Volume 39, John Wiley & Sons Inc, Hoboken, NJ, USA, 2015, pp. 199–282.