
Theoretical and Applied Genetics          (2026) 139:52  
https://doi.org/10.1007/s00122-026-05159-z

ORIGINAL ARTICLE

Enhancing genomic prediction ability of blast resistance using 
genome‑wide association study‑derived marker weights in two rice 
(Oryza sativa L.) populations

Félicien Akohoue1   · Cristian Camilo Herrera1 · Silvio James Carabali Balanta1 · Juanita Torres1 · 
Constanza Quintero1 · Gloria Mosquera1 · Maria Fernanda Alvarez1

Received: 3 June 2025 / Accepted: 11 January 2026 
© The Author(s) 2026

Abstract
Key message  Leaf and panicle blast resistances were moderately correlated and controlled by several genes, includ-
ing Pi2/Pi9 and Pi33. GWAS-based marker weighting increased GBLUP predictive ability by up to 37% across two rice 
populations.
Abstract  Breeding for blast resistance remains a high priority in rice (Oryza sativa L.) improvement, yet the genetic com-
plexity of leaf blast (BL) and panicle blast (PB) continues to challenge prediction accuracy in genomic selection (GS). 
Traditional GS approaches, such as genomic best linear unbiased prediction (GBLUP), assume equal contribution from all 
markers, potentially limiting the capture of key resistance loci. Recent advances integrating genome-wide association studies 
(GWAS) into GS offer new opportunities to weight markers based on their biological relevance. In this study, we dissected 
the genetic architecture of BL and PB resistance in two diverse rice populations and evaluated the performance of three 
weighted GBLUP models that incorporate marker information from GWAS. Marker weighting strategies included FST-based 
weighting (FST-w), squared additive effects (AE-w), and − log10(p)-based weighting (− log10(p)-w). We identified signifi-
cant marker-trait associations (MTAs), including key loci near the Pi2/Pi9 cluster and Pi33 gene regions on chromosomes 
6 and 8. A moderate genetic correlation (0.43–0.44) between BL and PB severity suggests partially shared genetic control. 
Across traits and populations, AE-w and − log10(p)-w models improved predictive ability by 4–37% (0.03–0.23) and reduced 
normalized root mean square error by 3.8–35.3% relative to the unweighted GBLUP. These results demonstrate the value 
of integrating GWAS into GS (GS + GWAS) and highlight marker weighting as a practical strategy to enhance prediction 
accuracy for complex traits like blast resistance, ultimately accelerating genetic gains in rice breeding programs.

Introduction

Rice (Oryza sativa L.) represents a major staple food crop 
for more than half of the world population and serves as a 
primary source of calories and livelihoods in many coun-
tries, particularly across Asia, the Americas, and Africa 
(FAO 2023). Despite its vital contribution to global food 
security, rice production faces several biotic and abiotic 
stresses which have a significant impact on yield and grain 
quality across production regions (Shah et al. 2019; Liu et al. 
2021; Radha et al. 2023). Major stresses include blast dis-
eases caused by the plant-pathogenic fungus Magnaporthe 
oryzae B.C. Couch (syn. Pyricularia oryzae) which affects 
yield and grain quality negatively worldwide (Perez-Nadales 
et al. 2014; Liu et al. 2021). The pathogen infects above-
ground tissues at different growth stages, causing lesions 
on leaves, panicles and other organs (Ghatak et al. 2013). 
Under field conditions, blast infections can result in an aver-
age yield loss of 10 to 30%, with a total crop failure in severe 
conditions (Fisher et al. 2012; Perez-Nadales et al. 2014; 
Asibi et al. 2019; Devanna et al. 2022). The fungus can sur-
vive on infected residues for several rice cultivation cycles, 
which represent a primary source of inoculum (Raveloson 
et al. 2018). Integrated approaches comprising appropriate 
agronomic practices and high host resistance represent the 
most effective management strategy for blast disease (Asibi 
et al. 2019). Cultivating blast-resistant varieties is a preferable 
control measure to reduce the use of fungicides and 
their harmful environmental effects.

Communicated by Joshua N. Cobb.

* Félicien Akohoue
  F.Akohoue@cgiar.org

1 Rice Program, International Centre for Tropical Agriculture 
(CIAT), Alliance Bioversity and CIAT, Americas Hub, Km 
17 Recta Cali–Palmira, CP 763537 Palmira, Colombia

http://orcid.org/0000-0002-2160-0182

Resistance to blast disease is generally classified into 
two categories, namely complete and quantitative resist-
ance. Complete resistance, also called qualitative resistance, 
is controlled by a single race-specific major resistance (R) 
gene, which is effective against specific M. oryzae strains 
possessing the corresponding avirulence genes (Koizumi 
2010). About 100 Pyricularia genes (Pi genes) have been 
reported, of which 25 genes have been cloned and widely 
used by diverse breeding programs (Li et al. 2019). These 
include well known Pi genes like Pi9, which have been used 
in many breeding programs (Rathour et al. 2016; Xiao et al. 
2019; Zhou et al. 2020; Misman et al. 2021; Fengshun et al. 
2024). However, complete resistance is not durable, as 
it can easily break down when a new pathogen race emerges. 
On the other hand, quantitative resistance, also known as 
field resistance, is a non-race-specific and polygenic resist-
ance which involves multiple small effect genes that allow 
pathogen infection but restrict lesion expansion, preventing 
the disease progression. Over 700–800 quantitative trait loci 
(QTLs) harboring blast resistance genes have been reported 
from diverse genetic backgrounds and environmental conditions 
(Tian et al. 2022; Devanna et al. 2024). Unlike complete 
resistance, partial resistance is more durable and does 
not break down within a few years (Srivastava et al. 2017), 
although its use by breeding programs through marker-assisted 
selection (MAS) is limited by its genomic 
complexity. The combination of the two types of resistance 
is ideal to develop rice varieties with long-lasting and 
broad-spectrum resistance for more effective control of blast 
disease in farmers’ fields. Many cultivars grown worldwide 
with high broad-spectrum resistance have been reported, 
including Moroberekan, IR64 and Jao Hom Nin (Sallaud 
et al. 2003; Chaipanya et al. 2017).

To effectively leverage these genomic loci for higher 
blast resistance, the integration of high-throughput genom-
ics-assisted breeding methods into breeding programs is of 
paramount importance. Standard marker-assisted selection 
(MAS) has been successfully implemented to develop blast-
resistant cultivars based on major-effect R genes (Jiang et al. 
2019; Yang et al. 2019; Liu et al. 2024). However, the effec-
tiveness of MAS in leveraging small-effect genes for a dura-
ble resistance is limited due to the extensive marker develop-
ment efforts and costs involved. With the ever-decreasing 
genotyping costs and the availability of dense genome-wide 
single nucleotide polymorphism (SNP) arrays, advanced 
molecular methods such as genome-wide association stud-
ies (GWAS) and genomic selection (GS) have emerged as 
promising solutions to improve breeding progress (Ahmadi 
2022; Bartholomé et al. 2022). GS offers the potential to use 
all available genomic loci through robust prediction models, 
thereby enhancing breeding efficiency and genetic gains. In 
rice, GS potential has been investigated for many traits, such 
as yield and yield components (Grenier et al. 2015; Spindel 
et al. 2016; Wang et al. 2017; Zhang et al. 2023) and other 
agronomic traits (Grenier et al. 2015; Onogi et al. 2015; 
Zhang et al. 2023). Bartholomé et al. (2024) suggested that 
genomic selection could be useful for predicting tolerance to 
aluminum toxicity in upland rice, with a prediction accuracy 
of 0.67 for days to flowering, 0.60 for plant height, 0.53 for 
yield, and 0.65 for zinc. Unfortunately, the application of GS 
to predict blast resistance remains limited. Recently, Huang 
et al. (2019) evaluated 323 African and USDA accessions 
under artificial infections with 10 blast strains and reported 
genomic prediction accuracies ranging from 0.15 to 0.72. 
With this variability of prediction accuracy across strains, 
further investigations are necessary to fully harness the 
potential of GS for durable blast resistance. Studies should 
focus on both types of blast disease and expand to broader 
genetic backgrounds and environments.

Several genomic prediction methods have been proposed, 
with genomic best linear unbiased prediction (GBLUP) 
being among the most applied models (Jeon et al. 2023; 
Montesinos-López et al. 2024a). GBLUP directly estimates 
genomic estimated breeding values of genotypes without 
explicitly estimating individual marker effects. Unlike tradi-
tional BLUP models that use pedigree information, GBLUP 
replaces the pedigree with a genomic relationship matrix 
(G-matrix), built based on genome-wide markers like sin-
gle nucleotide polymorphisms (SNPs) (VanRaden 2008; Su 
et al. 2012). The G-matrix captures the genetic similarity 
between genotypes and allows the GBLUP model to pre-
dict their breeding values (Yang et al. 2014). In doing this, 
the GBLUP model naively assumes that all available markers 
contribute equally to the genetic variation of the trait (Tiezzi 
and Maltecca 2015). This assumption ignores differences 
between markers and dilutes contributions from major loci. 
However, not all genome-wide markers have the same influence 
on traits like disease resistance. In practice, some markers 
may be located near or within QTLs with large effects, 
while others may have little to no effect on the trait. Failing 
to prioritize these large-effect markers could lead to reduced 
GBLUP prediction accuracy (Nishio and Satoh 2015; Zhang 
et al. 2016).

To address this limitation, weighted GBLUP (wGBLUP) 
models that account for unequal marker contributions have 
been proposed and demonstrated to outperform the tradi-
tional GBLUP model (Li et al. 2015; Nishio and Satoh 2015; 
Karaman et al. 2018; Gualdrón Duarte et al. 2020; Ren et al. 
2021). wGBLUP incorporates marker weights, which are 
defined based on various weighting methods ranging from 
fixation index (FST)-based weights (Chang et al. 2019) to 
more biologically relevant approaches such as the incorporation 
of GWAS results (Dong et al. 2016; Ren et al. 2021). 
Although wGBLUP models generally outperform the standard GBLUP, 
their genomic prediction accuracies are trait-dependent, 
as reported by Ren et al. (2021). To date, the 
impact of wGBLUP on genomic prediction accuracy of the 
different types of blast resistance remains unknown. There 
is a critical need to evaluate different weighting strategies 
for predicting blast resistance across diverse genetic back-
grounds to provide practical insights to optimize genomic 
selection in rice breeding for blast resistance.

This study aims to address these gaps by (i) investigat-
ing the genetic architecture of leaf blast (BL) and panicle 
blast (PB) resistance using single-trait genome-wide associa-
tion study (GWAS) within two distinct populations, namely 
SSD Tropics and 3K rice populations, and (ii) evaluating the 
genomic prediction ability of single-trait (ST) and multi-trait 
(MT) GBLUP models for BL and PB severity, incorporating 
three G-matrix weighting methods, namely the FST-based method 
(FST-w), squared additive effects (AE-w) and the negative base 
10 logarithm of the P value (− log10(p)-w), across both populations. 
The cross-population comparative approach used in 
this study was pivotal for validating the robustness and 
consistency of the G-matrix weighting methods across various 
genetic backgrounds, ensuring their applicability in diverse 
rice breeding programs targeting blast resistance.

Materials and methods

Plant materials

The study included two rice populations: single seed descent 
(SSD) Tropics and 3K populations. SSD Tropics popula-
tion comprised 24 interconnected families developed from 
crosses between ten restorer lines (Table 1) selected based 
on their moderate to high blast resistance. F1 families were 
advanced to F6 using the rapid generation advance (RGA), 
generating 1,484 lines with 20 to 88 lines per family 
(Table 1). On the other hand, the 3K population was com-
posed of 204 accessions randomly selected from the global 
3K germplasm (Supplementary file 1). Selected accessions 
belonged to “ind1A” (seven), “ind1B” (three), “ind2” (four), 
“ind3” (91), “indx” (95), “trop1” (two), “aromatic” (one) 
and “admix” (one) variety types. Most accessions (199) were 
distributed across diverse Asian subregions (Supplementary 
file 1).

Field evaluation

Both populations were evaluated under field conditions 
with a natural source of blast inoculum at the Fedearroz 
Santa Rosa experimental station at Villavicencio, Colom-
bia. SSD Tropics lines were evaluated in 2023 and 2024 
with four checks, including Olimar, FEDEARROZ-2000, 
FED-ITAGUA and ORYZICA-1. Lines were evaluated using 
an augmented design with row-column adjustment in 2023, 
and an augmented design without row-column adjustment in 
2024. In each experiment, each check was included with 61 
replicates. The 3K population was evaluated with the same 
checks from 2020 to 2023 using an alpha lattice design with 
two replicates.

The following data were collected: days to flowering (DF, 
days), leaf blast (BL) severity, panicle blast (PB) severity 
and plant vigor (PV). DF was recorded plot-wise when at 
least 50% of plants had flowered. BL was scored plot-wise at 
30 to 40 days after planting using the Standard Evalua-
tion System of IRRI (2014). BL typically initiates as small 
lesions which begin near the leaf tips or margins and extend 
downward, turning from pale green to yellow. Lesions can 
sometimes cover the entire leaf in susceptible varieties, 
while severe infections cause wilting and plant death. Like 
BL, PB was recorded on the panicle at 21–25 days after 
flowering using the same scale. PB is characterized by dark, 
necrotic lesions that partially or completely cover the panicle 
base, upper internode, or lower panicle axis, resulting in 
grayish panicles with partially filled or unfilled grains. To 
ensure a homogenous pressure of blast pathogen across the 
entire experiment, the highly susceptible cultivar Fanny was 
mixed with other indica susceptible genotypes to be used 
as spreader rows planted across the field. Genotypes with 
a disease score lower than or equal to 3 were considered as 
resistant. PV was collected plot-wise in the SSD Tropics 
population using a 1–9 scale, with 1 being “excellent vigor” 
and 9 being “very poor vigor.”

Table 1   The 10 restorer lines used as parents for developing the SSD 
Tropics population, and their leaf blast (BL) and panicle blast (PB) 
resistance status

Genotype          BL status              PB status
a. Female parents
FEM1              Highly resistant       Moderately resistant
FEM2              Moderately resistant   Moderately resistant
FEM3              Moderately resistant   Moderately resistant
FEM4              Highly resistant       Highly resistant
FEM5              Highly resistant       Highly resistant
FEM6              Highly resistant       Highly resistant
b. Male parents
MAL1              Moderately resistant   Highly resistant
MAL2              Highly resistant       Highly resistant
MAL3              Highly resistant       Highly resistant
MAL4              Moderately resistant   Moderately resistant
c. Checks
Olimar            Highly susceptible     Highly susceptible
FEDEARROZ-2000    Highly susceptible     Moderately resistant
FED-ITAGUA        Highly resistant       Highly resistant
ORYZICA-1         Moderately resistant   Moderately resistant

Each female parent was crossed with the four male parents, generating 
24 interconnected families

Genotyping and marker filtering

All “SSD Tropics” lines and their parents were genotyped 
using the 1k-RiCA v4.2 single nucleotide polymorphism 
(SNP) array (Arbelaez et al. 2019). In total, 1,094 SNP 
markers were obtained, including 261 trait markers, 28 
purity markers, and 805 genome-wide markers. The marker 
data were filtered by removing SNPs with a minor allele 
frequency (MAF) lower than 5% and missing values greater 
than 20%. The remaining missing data, representing 0.56% 
of the SNPs, were imputed using Wright’s equilibrium 
method (Wright 1922). After filtering and imputation, 671 
high-quality SNPs were retained for downstream analyses.
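
The marker filtering rule described above (drop SNPs with MAF below 5% or with more than 20% missing calls) can be sketched in a few lines. This is only an illustration, not the authors' pipeline: the function name `filter_snps`, the 0/1/2 allele-dosage coding with `None` for missing calls, and the toy genotypes are assumptions made for the example, and the imputation step is not covered.

```python
from typing import List, Optional

# Hypothetical coding: 0, 1 or 2 copies of the alternative allele; None = missing call
Genotype = Optional[int]


def filter_snps(
    genotypes: List[List[Genotype]],
    min_maf: float = 0.05,
    max_missing: float = 0.20,
) -> List[int]:
    """Return the column indices of SNPs that pass both filters.

    `genotypes` is a lines x SNPs table (assumed layout for this sketch).
    """
    n_lines = len(genotypes)
    n_snps = len(genotypes[0])
    kept = []
    for j in range(n_snps):
        calls = [row[j] for row in genotypes if row[j] is not None]
        missing_rate = 1.0 - len(calls) / n_lines
        if not calls or missing_rate > max_missing:
            continue  # too many missing values
        alt_freq = sum(calls) / (2.0 * len(calls))
        maf = min(alt_freq, 1.0 - alt_freq)
        if maf < min_maf:
            continue  # minor allele too rare (or SNP monomorphic)
        kept.append(j)
    return kept


# Toy data: 5 lines x 5 SNPs (invented values)
toy = [
    [0, 2, None, 0, 1],
    [1, 2, None, 0, 0],
    [2, 2, 0, 0, 1],
    [0, 2, 1, 0, 2],
    [1, 2, None, 0, 0],
]
kept_snps = filter_snps(toy)
print(kept_snps)
```

In the toy table, the second and fourth SNPs are monomorphic and the third has 60% missing calls, so only the first and last columns are retained.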

For the 204 lines from the 3K population, one million 
SNP markers were obtained from the public Rice SNP-Seek 
database (https://3kricegenome.s3.amazonaws.com/3kRG_download.html) 
(Mansueto et al. 2017). The same filtering 
criteria were applied, reducing the dataset to 431,377 SNP 
markers. The marker data were further refined and narrowed 
by implementing a selective linkage disequilibrium prun-
ing (SLDP) adapted from the procedure described by Zhu 
et al. (2023). LD was calculated using the squared allele 
frequency correlation adjusted for kinship relationships ($r_v^2$) 
(Mangin et al. 2012):

$$r_v^2(i, j) = \frac{\left(\mathrm{Cov}(X_i^v, X_j^v)\right)^2}{\mathrm{Var}(X_i^v) \cdot \mathrm{Var}(X_j^v)} \tag{1}$$

where $r_v^2(i, j)$ is the kinship-adjusted LD estimate between 
markers i and j; $X^v$ is the kinship-adjusted genotype matrix. 
$X^v$ was calculated as follows:

$$X^v = K^{-1/2} X \tag{2}$$

where X is the genotype matrix and K is the kinship matrix 
calculated using the VanRaden method (VanRaden 2008).

Briefly, SLDP was implemented in four steps, which 
involved: (1) identifying significant SNPs through a GWAS 
analysis for each trait and their highly linked ($r_v^2$ ≥ 0.95) 
neighbors within a 50 kb window; (2) pruning significant 
SNPs and linked neighbors based on $r_v^2$ ≥ 0.95 and GWAS P 
values; (3) conducting genome-wide pruning of remaining 
SNPs using $r_v^2$ ≥ 0.80; and (4) combining pruned SNPs to 
produce a final dataset containing all GWAS-detected SNPs. 
The SLDP process reduced the marker dataset to 9,126 high-
quality SNP markers for the 3K population.
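
As a rough illustration of threshold-based LD pruning of the kind used in step (3), the sketch below greedily keeps markers in order of GWAS P value and discards any marker whose squared correlation with an already kept marker reaches 0.80. It is a simplification under stated assumptions: it uses the plain squared Pearson correlation rather than the kinship-adjusted $r_v^2$ of Eqs. (1) and (2), it ignores physical windows, and the function names, the P-value priority rule and the toy data are hypothetical.

```python
from typing import List, Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared correlation between two marker columns (same form as Eq. 1,
    but computed on unadjusted genotypes in this sketch)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as 0
    return cov * cov / (vx * vy)


def greedy_ld_prune(
    markers: List[Sequence[float]],
    p_values: Sequence[float],
    threshold: float = 0.80,
) -> List[int]:
    """Visit markers from smallest to largest P value (assumed priority rule)
    and keep a marker only if its r2 with every kept marker is below threshold."""
    order = sorted(range(len(markers)), key=lambda j: p_values[j])
    kept: List[int] = []
    for j in order:
        if all(r_squared(markers[j], markers[k]) < threshold for k in kept):
            kept.append(j)
    return sorted(kept)


# Toy data: 4 markers scored on 6 lines (invented values)
markers = [
    [0, 1, 2, 0, 1, 2],  # marker 0
    [0, 1, 2, 0, 1, 2],  # marker 1: identical to marker 0, so r2 = 1
    [2, 0, 1, 1, 0, 2],  # marker 2
    [1, 0, 0, 2, 1, 1],  # marker 3
]
p_values = [0.01, 0.001, 0.5, 0.2]
kept_markers = greedy_ld_prune(markers, p_values)
print(kept_markers)
```

Markers 0 and 1 are in complete LD, so only the one with the smaller P value (marker 1) survives, while markers 2 and 3 are retained because their correlation with the kept markers is low.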

Phenotypic data analysis

Phenotypic data were analyzed using a two-stage approach 
to account for differences in experimental design appropri-
ately. From the three repeated BL scorings, only the high-
est score was used for each genotype to assess performance 
under highest blast pressure while minimizing the poten-
tial effect of weather variability which could influence dis-
ease pressure during field evaluations. In the first stage, the 
analysis was performed separately per environment (loca-
tion × year combinations), and adjusted means were esti-
mated for each genotype and their respective Smith weight 
was estimated. Given the ordinal nature of the traits, except 
for DF, a cumulative logit mixed model was fitted for each 
trait within each population. For SSD Tropics, the mixed 
model was fitted as follows:

$$\text{SSD}: \operatorname{logit}\left[P(y_{iknt} \le c)\right] = \alpha_c + g_i + b_k + w_{nk} + l_{tk} \tag{3}$$

where $y_{iknt}$ is the response of genotype i in row n and column 
t within block k; $P(y_{iknt} \le c)$ is the probability of $y_{iknt}$ being 
in category c or below; $\alpha_c$ is the category-specific threshold 
(intercept); $g_i$ is the genotype effect; $b_k$ is the block effect; 
$w_{nk}$ is the effect of row n within block k; $l_{tk}$ is the effect of 
column t within block k. In the 3K population, the model 
was fitted as follows:

$$\text{3K}: \operatorname{logit}\left[P(y_{ijk} \le c)\right] = \alpha_c + g_i + r_j + b_{jk} \tag{4}$$

where $y_{ijk}$ is the response of genotype i in block k within 
replicate j; $g_i$ is the genotype effect; $r_j$ is the replicate effect; 
$b_{jk}$ is the effect of block k within replicate j.

Models (3) and (4) were fitted following the Bayesian 
approach with four chains using the brms R package 
(Bürkner 2017). For each chain, total Markov chain Monte 
Carlo (MCMC) iterations, warmup and thinning were set to 
20,000, 6000, and 2, respectively. To control for the potential 
effect of flowering date on panicle blast severity, DF was 
included as a covariate in the first-stage PB model. Unlike 
ordinal scale data (BL, PB and PV), DF was analyzed using 
the Gaussian family following Eqs. (5) and (6):

$$\text{SSD}: y_{iknt} = g_i + b_k + w_{nk} + l_{tk} + \varepsilon_{iknt} \tag{5}$$

$$\text{3K}: y_{ijk} = g_i + r_j + b_{jk} + \varepsilon_{ijk} \tag{6}$$

where $\varepsilon_{iknt}$ and $\varepsilon_{ijk}$ are residual errors.

Model evaluation was done based on Gelman-Rubin 
diagnostic statistics such as effective sample size (ESS) and 
R-hat ($\hat{R}$) (Gelman et al. 2004, 2014). ESS measures the 
number of independent samples after accounting for autocorrelation 
between draws. An ESS ≥ 400 indicates more 
reliable estimates and better convergence. For the combined 
chains, ESS value was calculated for each model parameter 
(Vehtari et al. 2021) as follows:

$$\mathrm{ESS} = \frac{NM}{1 + 2\sum_{t=1}^{2k+1} \rho_t} \tag{7}$$

where N is the number of post-warmup draws, M is the 
number of chains, $\rho_t$ is the autocorrelation at lag t, and k 
is the truncation point used to limit the sum of autocorrelations 
to reduce noise. The best k was chosen following 
the initial positive sequence estimator method, where the 
sum was truncated at the largest k for which all autocorrelations 
remained positive. N was estimated for each chain 
as follows:

$$N = \frac{\text{MCMC iterations} - \text{warmup}}{\text{thinning}} \tag{8}$$

Moreover, $\hat{R}$ assesses the convergence of the different 
Markov chains. It compares between- and within-chain 
variances to determine if the chains have mixed well and 
converged to the posterior distribution. $\hat{R}$ was calculated 
following the improved procedure by Vehtari et al. (2021) 
as follows:

$$\hat{R} = \sqrt{\frac{\widehat{\mathrm{var}}^{+}(\theta \mid y)}{W}} \tag{9}$$

At $\hat{R}$ = 1, chains are perfectly converged and well-mixed, 
and the posterior distribution has been sufficiently explored. 
$\widehat{\mathrm{var}}^{+}(\theta \mid y)$ is the marginal posterior variance which combines 
both between- (B) and within-chain (W) variances as 
follows:

$$\widehat{\mathrm{var}}^{+}(\theta \mid y) = \frac{N - 1}{N} W + \frac{1}{N} B \tag{10}$$

ESS and $\hat{R}$ were rank-normalized to improve convergence 
diagnostics as recommended by Vehtari et al. (2021).

In all models, genotype was fitted as a fixed effect and 
the corresponding adjusted means were estimated using the 
posterior_linpred() function. For each genotype and adjusted 
mean, a weight was estimated as the diagonal elements of 
the inverse of the variance–covariance ($V_j$) matrix following 
the Smith weighting method as described by Möhring and 
Piepho (2009):

$$SW_j = D(V_j^{-1}) \tag{11}$$

where $SW_j$ is the Smith weight of each genotype in environment 
j and $V_j$ is the variance–covariance matrix of genotypes 
in environment j.
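
The convergence quantities in Eqs. (8) to (10) can be checked numerically with a short sketch. With the reported settings (20,000 iterations, 6000 warmup, thinning of 2), Eq. (8) gives 7000 retained draws per chain. The R-hat function below is the basic form of Eqs. (9) and (10) only; it does not perform the rank normalization used in the study, and it assumes the standard definitions of W (mean of the within-chain variances) and B (N times the variance of the chain means), which the text does not spell out. The function names and toy chains are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import List, Sequence


def post_warmup_draws(iterations: int, warmup: int, thinning: int) -> int:
    """Eq. (8): number of draws retained per chain."""
    return (iterations - warmup) // thinning


def basic_r_hat(chains: List[Sequence[float]]) -> float:
    """Basic R-hat from Eqs. (9)-(10); no rank normalization (simplification)."""
    n = len(chains[0])  # draws per chain (N)
    chain_means = [mean(c) for c in chains]
    w = mean([variance(c) for c in chains])  # within-chain variance W (assumed definition)
    b = n * variance(chain_means)            # between-chain variance B (assumed definition)
    var_plus = (n - 1) / n * w + b / n       # Eq. (10)
    return sqrt(var_plus / w)                # Eq. (9)


draws_per_chain = post_warmup_draws(20000, 6000, 2)
total_draws = 4 * draws_per_chain  # four chains, as in the study
print(draws_per_chain, total_draws)

# Toy chains (invented values): identical chains versus chains stuck in different regions
mixed = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
unmixed = [[1.0, 2.0, 3.0, 4.0], [11.0, 12.0, 13.0, 14.0]]
print(basic_r_hat(mixed), basic_r_hat(unmixed))
```

When chains explore different regions, B dominates and the statistic rises well above 1, which is the signal of non-convergence that the diagnostic is meant to catch.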

In the second stage, the following mixed linear models 
were fitted for each trait using the ASReml-R package 
(Butler et al. 2023):

$$\text{SSD}: y_{ijk} = g_i + f_j + e_k + ge_{ik} + fe_{jk} + \varepsilon_{ijk} \tag{12}$$

$$\text{3K}: y_{ik} = g_i + e_k + ge_{ik} + \varepsilon_{ik} \tag{13}$$

where $y_{ijk}$ and $y_{ik}$ are adjusted means from the first stage for each 
genotype; $g_i$ is the genotype effect; $f_j$ is the family effect; 
$e_k$ is the environment effect; $ge_{ik}$ is the effect of genotype-
by-environment interaction; $fe_{jk}$ is the effect of family-by-
environment interaction; and $\varepsilon_{ijk}$ and $\varepsilon_{ik}$ are residual errors. 
Smith weight estimated from the first stage was incorporated 
into the second stage model to separate the genotype-by-
environment interaction and residual variances. Genotype, 
family and environment were fitted as random effects to estimate 
variance components for each trait. The likelihood ratio 
test was performed to evaluate the statistical significance of 
variance components.

In addition, genotype was fitted as a fixed effect to 
estimate best linear unbiased estimates (BLUE) for each 
genotype across environments. Broad sense heritability 
($H^2$) was estimated as follows (Piepho and Möhring 
2007):

$$H^2 = \frac{\sigma_G^2}{\sigma_G^2 + \frac{v_\Delta}{2}} \tag{14}$$

where $\sigma_G^2$ is the genotypic variance and $v_\Delta$ is the variance of a 
difference between two BLUEs.

To estimate genotypic correlation between traits in each 
population, the second stage models were extended to a 
bivariate model described as follows:

$$\text{SSD}: \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = g_i + f_j + e_k + ge_{ik} + fe_{jk} + \varepsilon_{ijk} \tag{15}$$

$$\text{3K}: \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = g_i + e_k + ge_{ik} + \varepsilon_{ik} \tag{16}$$

where $y_1$ and $y_2$ are adjusted means of genotype for the first 
and second trait, respectively. Bivariate models were fitted 
with a heterogeneous variance–covariance structure using the 
corgh option for genotype, family, environment and residual.

Based on BLUEs from the second stage, the phenotypic 
diversity within each population was further 
described by performing a hierarchical cluster analysis 
using the FactoMineR package (Lê et al. 2008). All 
analyses were performed in the R software 4.4.2 (R Core 
Team 2024).
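
The heritability formula in Eq. (14) reduces to simple arithmetic once the genotypic variance and the variance of a difference between two BLUEs are available. The sketch below only illustrates that arithmetic; the function name and the input values are invented for the example and are not estimates from this study.

```python
def broad_sense_heritability(sigma2_g: float, v_delta: float) -> float:
    """Eq. (14): H2 = sigma2_G / (sigma2_G + v_delta / 2)."""
    if sigma2_g < 0 or v_delta < 0:
        raise ValueError("variances must be non-negative")
    denominator = sigma2_g + v_delta / 2.0
    if denominator == 0:
        raise ValueError("heritability is undefined when both variances are zero")
    return sigma2_g / denominator


# Hypothetical inputs: genotypic variance 1.2, variance of a BLUE difference 0.8
h2 = broad_sense_heritability(sigma2_g=1.2, v_delta=0.8)
print(round(h2, 3))
```

A larger variance of BLUE differences (noisier genotype means) pushes the ratio toward 0, while precise BLUEs push it toward 1.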



Population structure and marker‑trait association 
analysis

Based on the high-quality markers (671 SNPs for SSD Trop-
ics and 9,126 SNPs for the 3K), population structure was 
investigated using principal components analysis. Individual 
admixture coefficients were estimated for all lines, assuming 
1 to 13 ancestral populations (K), to determine the probabil-
ity of each genotype to be included in a distinct subpopula-
tion. For each K, 40 iterations were performed to estimate 
cross-entropy values that were useful to determine the opti-
mal number of subpopulations. A genotype was included in 
a specific subpopulation when its inclusion probability was 
greater than 60%. Genotypes that did not meet this inclusion 
criterion were considered admixed. Admixture analysis was 
done using the LEA package (Frichot and François 2015). 
To evaluate the influence of family structure on the genetic 
differentiation within SSD Tropics population, an analysis 
of molecular variance (AMOVA) was performed using the 
poppr.amova() function from the poppr R package (Kamvar 
et al. 2014). The analysis applied the pegas method (Paradis 
2010) to partition genetic variation across three hierarchical 
levels: between populations (defined by family), between 
genotypes within family, and residual (i.e., heterozygosity).
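
The 60% inclusion rule used above to assign genotypes to subpopulations can be written as a small helper. This is a sketch under assumptions: the function name, the `pop<k>` labels and the ancestry coefficients are hypothetical, and in practice the coefficients would come from the admixture analysis.

```python
from typing import Dict, Sequence


def assign_subpopulation(q: Sequence[float], threshold: float = 0.60) -> str:
    """Return 'pop<k>' when the largest ancestry coefficient exceeds the
    threshold, otherwise 'admixed'."""
    best = max(range(len(q)), key=lambda k: q[k])
    return f"pop{best + 1}" if q[best] > threshold else "admixed"


# Hypothetical ancestry coefficients for three lines and K = 3
coefficients: Dict[str, Sequence[float]] = {
    "lineA": [0.85, 0.10, 0.05],
    "lineB": [0.45, 0.40, 0.15],
    "lineC": [0.05, 0.30, 0.65],
}
labels = {name: assign_subpopulation(q) for name, q in coefficients.items()}
print(labels)
```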

Within each population, the genetic architecture of BL 
and PB was investigated by conducting a genome-wide asso-
ciation analysis. A mixed linear model (MLM) that incor-
porated kinship and population structure (Wang and Zhang 
2021) was fitted as follows:

$$y = X\beta + W\alpha + Zu + \varepsilon \tag{17}$$

where y is the vector of BLUE for each genotype; β is the 
fixed effect including grand mean and population structure; 
α is the marker effect; $u \sim N(0, 2K\sigma_a^2)$ is a vector of size 
n (number of individuals) for random polygenic effects; 
$\varepsilon \sim N(0, I\sigma_\varepsilon^2)$ is a vector of random residual effects. X, W 
and Z are design matrices for β, α and u, respectively. K is 
the kinship matrix calculated using the VanRaden method 
(VanRaden 2008). MLM was fitted using the Genomic 
Association and Prediction Integrated Tool (GAPIT) package 
v.3.1.0 (Wang and Zhang 2021). Significant marker-trait 
associations (MTAs) were identified based on a 
Bonferroni-corrected threshold of $10^{-5}$ and $10^{-6}$ in SSD Tropics and 
3K populations, respectively. The threshold was determined 
in each population by dividing the α value of 0.05 by the 
number of markers.

LD block analysis was performed per chromosome 
based on all markers to identify quantitative trait loci 
(QTL) regions which were associated with the traits. 
LD blocks were identified using a modified version of the 
block partition method referred to as Big-LD by Kim 
et al. (2018), in which the standard $r^2$ estimate was replaced by the 
kinship-adjusted squared correlation ($r_v^2$) to compute LD 
as previously described in Eq. (1), followed by a graph-
based clique detection using a threshold of $r_v^2$ ≥ 0.80. The 
Rice Genome Annotation Project database (RGAP; 
https://rice.uga.edu) (Hamilton et al. 2025) was queried using the 
physical positions of GWAS-detected markers on the 
Nipponbare reference genome Os-Nipponbare-Reference-IRGSP1.0 
to retrieve gene ontology and description 
for the most significant marker-trait associations for 
both traits.
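
The Bonferroni thresholds used for declaring marker-trait associations follow directly from the marker counts reported for each population (671 SNPs for SSD Tropics, 9,126 SNPs for the 3K). The sketch below only reproduces that arithmetic; the function name is hypothetical.

```python
import math


def bonferroni_threshold(n_markers: int, alpha: float = 0.05) -> float:
    """Significance threshold obtained by dividing alpha by the number of markers."""
    return alpha / n_markers


thresholds = {
    "SSD Tropics": bonferroni_threshold(671),
    "3K": bonferroni_threshold(9126),
}
for population, t in thresholds.items():
    print(f"{population}: P < {t:.2e} (-log10 = {-math.log10(t):.2f})")
```

The two values fall in the $10^{-5}$ and $10^{-6}$ ranges, respectively, consistent with the orders of magnitude quoted in the text.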

Unweighted genomic relationship matrix 
construction

To implement single-trait and multi-trait genomic predic-
tion for BL and PB, additive genomic relationship matrices 
(G-matrix) were constructed using the AGHmatrix v2.1.4 
package (Amadeu et al. 2023) following the method proposed 
by VanRaden (2008). The unweighted G-matrix ($G_{unw}$) was 
calculated as:

$$G_{unw} = \frac{ZZ'}{2\sum p_i q_i} \tag{18}$$

where Z is an n × m matrix (n = number of genotypes, 
m = number of SNPs) which contains SNP genotype coefficients 
at each SNP. The coefficients of SNP i with alleles A1 
and A2 are $0 - 2p_i$ for homozygous allele A1 (A1A1), $1 - 2p_i$ 
for the heterozygous state (A1A2), and $2 - 2p_i$ for homozygous 
allele A2 (A2A2), where $q_i$ and $p_i$ are the frequencies 
of A1 and A2, respectively.

Weighted genomic relationship matrices 
construction

Weighted additive G-matrices ($G_w$) were calculated for each 
prediction model as follows:

$$G_w = \frac{ZWZ'}{2\sum p_i q_i} \tag{19}$$

where W is a diagonal matrix with the ith diagonal element 
being the SNP weight at locus i.

SNP weights were determined using three methods: the 
FST-based weighting method (FST-w), squared additive marker 
effect-based method (AE-w) and negative base 10 logarithm 
of GWAS P value-based method (− log10(p)-w). In FST-w, the 
genotypic matrix was weighted by weights derived from FST 
values of individual SNPs. FST values were calculated based 
on population structure results, using the SNPRelate package 
following Weir and Cockerham (1984):

$$F_{ST} = \frac{a}{a + b + c} \tag{20}$$

where a is the variance of allele frequencies among subpopulations; 
b is the covariance of allele frequencies within 
subpopulations and c is the average expected heterozygosity 
within subpopulations. For SNP j, the relative weight was 
calculated as described by Chang et al. (2019):

$$w_j = \frac{F_{STj}}{\frac{\sum_{j=1}^{n} F_{STj}}{n}} \tag{21}$$

where $w_j$ is the weight for SNP j, $F_{STj}$ is the FST value 
of SNP j and n is the number of SNPs. Moreover, AE-w 
and − log10(p)-w are GWAS statistics-derived weighting 
methods. Weights were calculated and scaled for both methods 
following the procedure described by Su et al. (2014), as 
in Eqs. (22) and (23):

Marker effect-derived weight:

$$w_j = \frac{2p_j(1 - p_j)AE_j^2}{\frac{\sum_{j=1}^{n} w_j}{n}} \tag{22}$$

GWAS P value-derived weight:

$$w_j = \frac{-2p_j(1 - p_j)\log_{10}(P\text{-value})}{\frac{\sum_{j=1}^{n} w_j}{n}} \tag{23}$$

where $AE_j$ is the additive effect of SNP j derived from GWAS 
analysis and $p_j$ is the frequency of the alternative allele of SNP 
j. To fit multi-trait prediction models, SNP weights were 
calculated as a weighted linear combination of trait-specific 
marker weights ($w_j$). The combined weight, denoted as $cw_j$, 
was calculated using the following formula:

$$cw_j = \alpha_1 w_{j,trait1} + \alpha_2 w_{j,trait2} \tag{24}$$

where $\alpha_1$ and $\alpha_2$ are trait-specific scaling coefficients that 
account for genetic correlation between trait1 and trait2, 
$w_{j,trait1}$ is the weight of SNP j for trait1 and $w_{j,trait2}$ is the weight of 
SNP j for trait2. $\alpha_1$ and $\alpha_2$ are defined based on the genetic 
correlation between the two traits, such that $\alpha_1 + \alpha_2 = 1$. We 
calculated $\alpha_1$ and $\alpha_2$ using Eqs. (25) and (26) as follows:

$$\alpha_1 = \frac{1}{1 + r_g} \tag{25}$$

$$\alpha_2 = \frac{r_g}{1 + r_g} \tag{26}$$

where $r_g$ is the genetic correlation between the two traits. 
Here, as the genetic correlation increases, the contributions 
of the individual traits to the combined weight become 
more balanced, maximizing the weights of shared loci. 
This enables the second trait (typically the one with lower 
heritability) to leverage the effects of shared loci more 
effectively when the correlation is high.
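
The weighting scheme above can be traced end to end on a toy example: marker-effect weights (Eq. 22), their combination across two traits (Eqs. 24 to 26) and the weighted relationship matrix (Eq. 19). The sketch below is illustrative only. It assumes that the denominator of Eq. (22) is the mean of the unscaled weights, so that scaled weights average 1, and all function names, allele frequencies, effects and dosages are invented; only the genetic correlation of 0.43 echoes the value reported in the abstract.

```python
from typing import List, Sequence


def scale_to_mean_one(raw: Sequence[float]) -> List[float]:
    """Divide raw weights by their mean (assumed reading of the scaling step)."""
    avg = sum(raw) / len(raw)
    return [r / avg for r in raw]


def ae_weights(p: Sequence[float], effects: Sequence[float]) -> List[float]:
    """Eq. (22): 2p(1-p)AE^2 per SNP, then scaled by the mean."""
    return scale_to_mean_one([2 * pj * (1 - pj) * ae ** 2 for pj, ae in zip(p, effects)])


def combined_weights(w1: Sequence[float], w2: Sequence[float], r_g: float) -> List[float]:
    """Eqs. (24)-(26): combine trait-specific weights using alpha1 and alpha2."""
    alpha1 = 1.0 / (1.0 + r_g)
    alpha2 = r_g / (1.0 + r_g)
    return [alpha1 * a + alpha2 * b for a, b in zip(w1, w2)]


def weighted_g(dosages: List[Sequence[float]], p: Sequence[float],
               weights: Sequence[float]) -> List[List[float]]:
    """Eq. (19): G_w = Z W Z' / (2 * sum(p*q)), with Z = dosage - 2p."""
    z = [[d - 2 * pj for d, pj in zip(row, p)] for row in dosages]
    denom = 2 * sum(pj * (1 - pj) for pj in p)
    n = len(z)
    return [[sum(w * z[a][j] * z[b][j] for j, w in enumerate(weights)) / denom
             for b in range(n)] for a in range(n)]


# Toy inputs (all invented): 3 lines x 3 SNPs
p = [0.5, 0.2, 0.4]                       # frequency of the counted allele
dosages = [[2, 0, 1], [0, 1, 1], [1, 0, 0]]
w_bl = ae_weights(p, [0.1, 0.8, 0.2])     # hypothetical BL marker effects
w_pb = ae_weights(p, [0.3, 0.6, 0.1])     # hypothetical PB marker effects
cw = combined_weights(w_bl, w_pb, r_g=0.43)

g_unw = weighted_g(dosages, p, [1.0, 1.0, 1.0])  # unit weights reproduce Eq. (18)
g_w = weighted_g(dosages, p, cw)
print(cw)
print(g_unw[0][0], g_w[0][0])
```

Because $\alpha_1 + \alpha_2 = 1$ and each set of trait-specific weights averages 1, the combined weights also average 1, so the weighted and unweighted matrices stay on a comparable scale.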

Genomic prediction analysis and models evaluation

Single-trait (ST) and multi-trait (MT) GBLUP models were 
fitted in the ASReml-R package (Butler et al. 2023) for BL 
and PB severity based on the BLUE from the phenotypic 
analysis and each of the genomic relationship matrices.

The ST model was fitted for each trait as follows:

$$Y = \mathbf{1}\mu + Z_A g_A + \varepsilon \qquad (27)$$

where Y is the vector of BLUE values corresponding to the genotypes; 1 is a vector of ones; µ is the grand mean effect; ZA is the design matrix that associates the genomic breeding values with the response variable; gA is the vector of genomic breeding values; and ε is the residual term, with $g_A \sim N(0, G\sigma_g^2)$, $\varepsilon \sim N(0, I\sigma_\varepsilon^2)$ and $\mathrm{cov}(\varepsilon, g_A) = 0$. G is the additive genomic relationship matrix. The MT model was fitted by extending the ST model to a bivariate model as follows:

$$\begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} I_1 & 0 \\ 0 & I_2 \end{bmatrix} \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} Z_{A1} & 0 \\ 0 & Z_{A2} \end{bmatrix} \begin{bmatrix} g_{A1} \\ g_{A2} \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \end{bmatrix} \qquad (28)$$

where $[Y_1, Y_2]^\top$ is the vector of BLUEs for trait1 and trait2; I1 and I2 are the identity matrices; $[\mu_1, \mu_2]^\top$ is the vector of grand mean effects for trait1 and trait2; $[g_{A1}, g_{A2}]^\top$ is the vector of the genomic breeding values of the two traits; ZA1 and ZA2 are design matrices that associate genomic breeding values with the response variables; and $[\varepsilon_1, \varepsilon_2]^\top$ is the vector of residual effects of the two traits. The genomic breeding values were assumed to follow $[g_{A1}, g_{A2}]^\top \sim N(0, G \otimes H)$, where

$$H = \begin{bmatrix} \sigma^2_{g1} & \sigma_{g12} \\ \sigma_{g12} & \sigma^2_{g2} \end{bmatrix}$$

is the variance–covariance matrix of the genomic breeding values of the two traits. The residuals were assumed to follow $[\varepsilon_1, \varepsilon_2]^\top \sim N(0, I \otimes R)$, where

$$R = \begin{bmatrix} \sigma^2_{\varepsilon 1} & \sigma_{\varepsilon 12} \\ \sigma_{\varepsilon 12} & \sigma^2_{\varepsilon 2} \end{bmatrix}$$

is the residual variance–covariance matrix. The MT model was fitted with a heterogeneous variance–covariance structure using the corgh option for genotype and residual.

The evaluation of single-trait (ST) and multi-trait (MT) models was performed using a fivefold cross-validation approach. This consisted of dividing the initial population into five folds, with each fold containing 296–302 genotypes for the SSD Tropics population and 37–45 genotypes for the 3K population. When composing the folds, stratified sampling was applied to ensure that genotypes were selected from all
families in the SSD Tropics population and all subpopulations in the 3K population. The stratified sampling accounts for population structure and optimizes the representativeness of the training set. Additionally, each fold was sampled 100 times, and a leave-one-fold-out approach was used to compose the training and validation sets. At each iteration, one fold was excluded and used as the validation set, while the remaining four folds were combined to form the training set. This process was repeated until each of the five folds had served as the validation set, which ensured that every genotype was included in the validation set exactly once while the remaining data were used for model training. This allowed a comprehensive evaluation of model performance across the entire population, minimizing overfitting and ensuring that the results did not depend on a specific data partition.
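The fold construction described above can be illustrated with a minimal sketch. It assumes that each genotype carries a single stratum label (family or subpopulation); the function names and the toy population of six families are hypothetical and do not reproduce the authors' implementation.

```python
import numpy as np

def stratified_folds(strata, n_folds=5, seed=0):
    """Assign every genotype to one fold, spreading each stratum across all folds."""
    rng = np.random.default_rng(seed)
    strata = np.asarray(strata)
    folds = np.empty(len(strata), dtype=int)
    offset = 0
    for s in np.unique(strata):
        # Shuffle the members of this stratum, then deal them out round-robin.
        idx = rng.permutation(np.flatnonzero(strata == s))
        # Continuing from the previous stratum keeps overall fold sizes balanced.
        folds[idx] = (np.arange(len(idx)) + offset) % n_folds
        offset += len(idx)
    return folds

def leave_one_fold_out(folds, n_folds=5):
    """Yield (training indices, validation indices), holding out one fold at a time."""
    for k in range(n_folds):
        yield np.flatnonzero(folds != k), np.flatnonzero(folds == k)

# Toy population: 6 hypothetical families with 10 lines each
strata = np.repeat(np.arange(6), 10)
folds = stratified_folds(strata, n_folds=5, seed=1)
splits = list(leave_one_fold_out(folds, n_folds=5))
```

Repeating the assignment with different seeds would give repeated partitions of the kind used to average predictive ability over many resamplings.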

At each iteration and for each validation set, the model predictive ability was determined by calculating the Pearson correlation between genomic estimated breeding values (GEBV) and BLUE of each genotype in the validation set. To further evaluate the performance of genomic prediction models, the root mean square error (RMSE) was calculated for each validation set and iteration. RMSE was normalized by the standard deviation of each trait, yielding the normalized RMSE (nRMSE), to enable meaningful comparison across weighting methods and traits. nRMSE was defined as the square root of the mean squared difference between the genomic estimated breeding value (GEBVi) and the observed phenotype (yi) of each genotype i, divided by the standard deviation (SD) of the trait:

$$\mathrm{nRMSE} = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\mathrm{GEBV}_i - y_i\right)^2}}{SD} \qquad (29)$$

where N is the number of genotypes in each validation set.
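The two evaluation metrics can be written compactly as follows. This is a minimal sketch with hypothetical function names; the trait SD is passed in explicitly because the text does not state whether it is computed on the validation set or on the whole population.

```python
import numpy as np

def predictive_ability(gebv, blue):
    """Pearson correlation between GEBV and BLUE in a validation set."""
    return float(np.corrcoef(gebv, blue)[0, 1])

def nrmse(gebv, y, sd):
    """Eq. (29): root mean squared difference between GEBV and y, divided by SD."""
    gebv = np.asarray(gebv, dtype=float)
    y = np.asarray(y, dtype=float)
    rmse = np.sqrt(np.mean((gebv - y) ** 2))
    return float(rmse / sd)

# Toy validation set (made-up values, for illustration only)
gebv = np.array([2.1, 3.0, 4.2, 2.8])
blue = np.array([2.0, 3.5, 4.0, 3.0])
r = predictive_ability(gebv, blue)
err = nrmse(gebv, blue, sd=np.std(blue, ddof=1))
```

Dividing by SD makes the error unit-free, so values can be compared across traits recorded on different scales.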

Prior to constructing G-matrices, GWAS analyses were 
conducted for each training set at every iteration to calcu-
late marker weights for the AE-w and − log10(p)-w weight-
ing methods. This approach prevents information leakage 
between the training and validation sets and minimizes 
biases in model evaluation, especially for the weighted 
models.
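The leakage-free step can be illustrated with a short sketch in which marker weights are derived from training genotypes only. The per-marker least-squares slope used here is a deliberately simple stand-in for the GWAS model, not the authors' method, and all names and data are hypothetical.

```python
import numpy as np

def single_marker_effects(X, y):
    """Least-squares slope of y on each marker's allele dosage (0/1/2).
    A simple stand-in for a GWAS model, used for illustration only."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    ss = (Xc ** 2).sum(axis=0)
    ss[ss == 0] = np.inf              # monomorphic markers get an effect of 0
    return Xc.T @ yc / ss

def fold_ae_weights(X, y, train_idx):
    """AE-w style weights (Eq. 22) computed from the training genotypes only."""
    X_tr, y_tr = X[train_idx], y[train_idx]
    p = X_tr.mean(axis=0) / 2.0       # allele frequency within the training set
    ae = single_marker_effects(X_tr, y_tr)
    raw = 2.0 * p * (1.0 - p) * ae ** 2
    return raw / raw.mean()

rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(40, 10)).astype(float)   # toy dosage matrix
y = rng.normal(size=40)                               # toy phenotypes
train_idx = np.arange(32)                             # four folds for training
val_idx = np.arange(32, 40)                           # held-out fold
w = fold_ae_weights(X, y, train_idx)
```

Because only rows in `train_idx` enter the calculation, changing the phenotypes of the validation genotypes leaves the weights unchanged, which is the property that prevents leakage.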

Results

Genetic variation within each population for all 
traits

Model diagnostic statistics showed convergence (maximum R̂ = 1.00) with good mixing of chains (Table 2). Furthermore, all traits had a high minimum effective sample size (ESSmin > 400), which suggests high precision in parameter estimates and reduces the uncertainty associated with marginal means.

In both populations, the genetic variance was substantial for all traits (Table 2). Genotype-by-environment interaction variance also contributed to the trait variability, although it was smaller than the genetic variance. Heritability estimates were high for all traits, ranging from 0.63 to 0.82, suggesting that a significant portion of the phenotypic variance was attributable to genetic factors. In SSD Tropics, within-family variance ($\sigma^2_G$) was 1.2- to 2-fold higher than between-family variance for all traits. The density plot within each population showed that BLUE values for all traits were continuously and approximately normally distributed (Fig. 1).

Table 2   Stage 1 model diagnostic statistics, and variance components and heritability estimates from stage 2

Columns R̂max, ESSmin and ESSmax report the Gelman-Rubin statistic and effective sample sizes; the remaining columns report variance components and heritability.

| Trait | R̂max | ESSmin | ESSmax | σ²G | σ²F | σ²GE | σ²FE | σ²ε | H² |
|---|---|---|---|---|---|---|---|---|---|
| a. SSD Tropics | | | | | | | | | |
| BL severity | 1.00 | 727 | 20,084 | 0.32 | 0.26 | 0.12 | 0.09 | 0.01 | 0.81 |
| DF | 1.00 | 822 | 6235 | 21.35 | 10.81 | 0.01 | 0.01 | 2.90 | 0.78 |
| PB severity | 1.00 | 955 | 2294 | 0.29 | 0.25 | 0.24 | 0.03 | 0.02 | 0.66 |
| VG | 1.00 | 785 | 6480 | 0.71 | 0.52 | 0.25 | 0.11 | 0.10 | 0.63 |
| b. 3K population | | | | | | | | | |
| BL severity | 1.00 | 10,232 | 12,387 | 2.41 | | 0.06 | | 0.80 | 0.83 |
| DF | 1.00 | 6540 | 20,349 | 152.79 | | 4.42 | | 0.48 | 0.82 |
| PB severity | 1.00 | 8578 | 16,044 | 0.40 | | 0.01 | | 0.55 | 0.69 |

DF = days to 50% flowering (days), BL severity = leaf blast severity, PB severity = panicle blast severity, VG = plant vigor, σ²G = genotypic variance, σ²F = family variance, σ²GE = genotype-by-environment interaction variance, σ²FE = family-by-environment interaction variance, σ²ε = residual variance, R̂max = R-hat convergence diagnostic, ESSmin = minimum effective sample size and ESSmax = maximum effective sample size



Genetic correlation and clustering in SSD Tropics 
and 3K populations

In both SSD Tropics and 3K populations, positive and mod-
erate genetic correlations (0.43–0.44) were detected between 
BL and PB severity (Figs. 2a and 3a). Correlations between 
disease severity (BL and PB) and agronomic traits such as 
DF and VG were generally low. The multivariate analysis 
revealed that the first two components explained 66.0% and 
80.5% of the phenotypic diversity in SSD Tropics and 3K 
populations, respectively. BL and PB severity were corre-
lated with Dimension 1, while DF and VG were correlated 
with Dimension 2 (Figs. 2b and 3b).

The hierarchical clustering identified three distinct clus-
ters (Figs. 2c and 3c) in the two populations. In the SSD 
Tropics population, Cluster 1 showed the lowest average BL 
severity (2.52), while Cluster 2 had the lowest average PB 
severity (2.19) (Fig. 2d). On the other hand, Cluster 3 exhib-
ited the highest average severity for BL (4.50) and PB (3.56). 
Similar results were obtained in the 3K population (Fig. 3d): Cluster 1 showed the lowest BL and PB severity, while the highest disease severity was observed in Cluster 3.

Population structure revealed by principal 
component analysis

The results of the population structure analysis for the SSD 
Tropics population are illustrated in Fig. 4. Average cross-
entropy values decreased steadily with K and plateaued 
around K = 10 (Fig. 4a). Based on an inclusion probability of 60%, the number of distinct subpopulations was 10, with several admixed genotypes (Fig. 4b). The PCA biplot shows that the first two principal components explain 10.1% and 7.7% of the total genetic variance, respectively (Fig. 4c). The clustering pattern was mainly explained by family, with most families overlapping. The AMOVA results indicated that a significant proportion of the total genetic variation (42.5%) was attributed to differences between families (Table 3). Interestingly, a higher proportion of variation was observed among lines within families, representing 50.8% of the total genetic variation. In contrast, the residual variation within lines, primarily reflecting individual heterozygosity, was minimal (Table 3).

Fig. 1   Density plots showing the distribution of best linear unbiased estimates (BLUEs) for all traits within SSD Tropics and 3K populations. a = leaf blast (BL) severity, b = panicle blast (PB) severity, c = days to 50% flowering (DF), and d = plant vigor (VG). VG was recorded in the SSD Tropics population only. The blue dashed line indicates the population mean for each trait

In the 3K population, the cross-entropy curve showed 
an optimum of four subpopulations (K = 4) as shown in 
Fig. 5a. This was confirmed by the heatmap of admixture 
coefficients (Fig. 5b) based on the inclusion probability of 
60%, which showed four distinct subpopulations and many 
admixed genotypes. Additionally, the PCA revealed that the 
first two principal components clearly discriminated the four 
identified subpopulations, and explained 6.7% and 3.6% of 
genetic variance, respectively (Fig. 5c).

Marker‑trait associations for leaf blast and panicle 
blast severity

The genetic architecture of BL and PB severity was depicted 
by implementing a single-trait genome-wide association 
study within each population. Ten and four subpopulations 
were included in the GWAS model as covariates to account 
for population structure in SSD Tropics and 3K populations, 

respectively. Significant marker-trait associations were iden-
tified on chromosomes 6 and 8 in the SSD Tropics popula-
tion and chromosomes 1, 6 and 12 in the 3K population 
(Fig. 6, Supplementary file 2).

In the SSD Tropics population, several significant MTAs were identified on chromosomes 6 and 8 for BL severity and chromosome 6 for PB severity. LD block analysis revealed several linkage blocks with $r^2_v$ ≥ 0.8 delimiting QTL regions, including qtl6.6 and qtl8.3 on chromosomes 6 and 8, respectively. Within qtl6.6, common MTAs were detected for BL and PB severity, of which the most significant MTA was MSU7_6_10388389_TT-AA on chromosome 6 (Table 4). This common MTA explained about 2% of the phenotypic variation. MTAs detected on chromosome 8 within qtl8.3 explained about 19% of phenotypic variance each. In the 3K population, MTAs were detected on chromosomes 6 and 12 for BL severity and chromosome 1 for PB severity (Table 4). QTL regions, namely qtl1.6, qtl6.18 and qtl12.17, were defined by significant markers on chromosomes 1, 6 and 12, respectively. The most significant MTA for BL severity in this population was 191472769 within qtl6.18, detected on chromosome 6 with 6% of explained phenotypic variance.

Fig. 2   Description of phenotypic diversity within SSD Tropics population based on days to 50% flowering (DF), leaf blast (BL) severity, panicle blast (PB) severity and plant vigor (VG). a = genetic correlation between traits, b = correlation between traits and the first two dimensions, c = clustering of the genotypes on the first two dimensions, d = statistical difference between clusters for BL and PB severity. *** significant at p < 0.001

Gene ontology search in the Rice Annotation Project Database (RAP-DB) identified several candidate genes within QTL regions (Supplementary file 3). On chromosome 6, common MTAs (MSU7_6_10388389_TT-A, MSU7_6_10389352_A-T, Pi2-01, Pi2-02) for BL and PB severity from the SSD Tropics population and the most significant MTA (191472769) for BL severity from the 3K population were all located within locus Os06g0286700 (Table 5). This locus represents the Pi2/Pi9 gene cluster, which encodes a nucleotide-binding site leucine-rich repeat (NBS-LRR) protein. In addition, MTA Pi33_3, specifically detected for BL in the SSD Tropics population, was linked to the Pi33 gene, which encodes avirulence conferring enzyme 1 (ACE1)-specific protein.

Based on the most significant SNP markers linked to the Pi2/Pi9 gene cluster and Pi33_3, haplotype groups with low average disease severity (< 3) were identified within each population (Fig. 7). With Pi2/Pi9, average BL and PB severity in the best haplotypes was reduced by 1.1−2.6 and 0.4−0.8, respectively, across the two populations compared to the groups with the highest disease severity (Fig. 7a, b). Similarly, the average BL severity of the best haplotype group for Pi33 (BL severity ≈ 2.68) was 0.97 lower than that of the group with the highest severity (BL severity ≈ 3.65) (Fig. 7c).

Predictive ability of weighted and unweighted 
genomic prediction models

In the SSD Tropics population, average predictive ability 
ranged from 0.78 to 0.81 for BL severity (Fig. 8a) and 0.62 
to 0.67 for PB severity (Fig. 8b). The highest predictive abil-
ity was observed for both traits with − log10(p)-w and AE-w, 
while FST-w and the unweighted models exhibited the lowest 
values. Average predictive abilities of BL severity with both 
AE-w and − log10(p)-w were 0.03 higher compared to the 
unweighted models, and 0.03−0.05 higher for PB severity 
(Fig. 8a, b). No difference in the average predictive ability 
was observed between FST-w and the unweighted models. 
Similarly, −log10(p)-w and AE-w showed no significant difference within this population for either trait. Average predictive abilities of ST and MT models were statistically similar for all methods.

Fig. 3   Description of phenotypic diversity within 3K population based on days to 50% flowering (DF), leaf blast (BL) severity and panicle blast (PB) severity. a = genetic correlation between traits, b = correlation between traits and the first two dimensions, c = clustering of the genotypes on the first two dimensions, d = statistical difference between clusters for BL and PB severity. *** significant at p < 0.001

The average predictive ability in the 3K population 
ranged from 0.74 to 0.90 for BL severity (Fig. 8c) and 0.62 
to 0.85 for PB severity (Fig. 8d). As in the SSD Tropics population, the highest average predictive abilities in the 3K population were observed with −log10(p)-w and AE-w for both traits (Fig. 8c, d). The unweighted models consistently exhibited the lowest predictive abilities (Fig. 8c). Depending on whether the ST or MT model was used, the average predictive ability
for BL severity with AE-w was 0.14–0.16 higher com-
pared to the unweighted models, and 0.14–0.23 higher for 
PB severity. Similarly, − log10(p)-w showed an increase of 
0.12–0.14 in the average predictive ability for BL severity 
and 0.12–0.20 for PB severity (Fig. 8c, d). In contrast, the 
increase in predictive ability from FST-w was smaller, with 
0.03 for both traits. A significant difference was observed 
between ST and MT models for AE-w and − log10(p)-w for 
both traits in the 3K population. For BL severity, the aver-
age predictive ability of ST model was 0.02 higher than that 
of MT model with AE-w and − log10(p)-w. For PB severity, 
ST model outperformed MT model by 0.09 and 0.08 with 
AE-w and − log10(p)-w, respectively. In contrast, the predic-
tive abilities of ST and MT models were similar for FST-w 
and unweighted models.

Based on the unweighted models, the predictive ability of 
BL severity in the SSD Tropics population was 0.04 higher 
compared to the 3K population, while similar values were 
observed for PB severity between the two populations.
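As a rough check on the size of these gains, the largest relative improvement for PB severity in the 3K population can be worked out from the reported range. This assumes that the lower bound (0.62) corresponds to the unweighted model and the upper bound (0.85) to the ST model with AE-w, which is consistent with the reported maximum increase of 0.23 but is not stated explicitly in the text:

$$\frac{0.85 - 0.62}{0.62} = \frac{0.23}{0.62} \approx 0.37$$

Under that assumption, the relative gain is about 37%.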

0.40

0.45

0.50

0.55

0.60

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Number of populations (K)

C
ro

ss
−e

nt
ro

py
a

−8

−6

−4

−2

0

2

4

6

8

−8 −6 −4 −2 0 2 4 6 8 10
PC1 (10.1%)

PC
2 

(7
.7

%
)

CT26676
CT26677
CT26678
CT26679
CT26680
CT26681
CT26682
CT26683
CT26684
CT26685
CT26686
CT26687
CT26688

CT26689
CT26690
CT26691
CT26692
CT26693
CT26694
CT26695
CT26696
CT26697
CT26698
CT26699
Parent
Tester

c

0.00

0.25

0.50

0.75

1.00

G
en

12
05

G
en

12
19

G
en

12
37

G
en

12
61

G
en

12
95

G
en

12
66

G
en

11
97

G
en

12
21

G
en

12
88

G
en

12
89

G
en

12
86

G
en

12
56

G
en

12
96

G
en

12
45

G
en

28
0

G
en

12
72

G
en

12
17

G
en

12
35

G
en

12
24

G
en

12
04

G
en

12
83

G
en

12
48

G
en

13
17

G
en

12
97

G
en

11
31

G
en

12
02

G
en

12
85

G
en

12
49

G
en

12
74

G
en

12
93

G
en

12
03

G
en

12
54

G
en

13
18

G
en

12
08

G
en

12
42

G
en

13
06

G
en

12
31

G
en

12
40

G
en

12
34

G
en

12
25

G
en

13
14

G
en

12
16

G
en

12
60

G
en

12
65

G
en

12
58

G
en

11
69

G
en

11
95

G
en

12
68

G
en

12
52

G
en

12
53

G
en

12
11

G
en

11
80

G
en

12
64

G
en

11
93

G
en

12
70

G
en

11
88

G
en

12
80

G
en

12
55

Fe
m
5

G
en

13
38

G
en

11
74

G
en

99
6

G
en

11
34

G
en

11
07

G
en

96
2

G
en

11
15

G
en

10
65

G
en

10
39

G
en

11
17

G
en

14
52

G
en

12
79

G
en

98
2

G
en

13
91

G
en

11
19

G
en

11
11

G
en

11
42

G
en

12
99

G
en

11
25

G
en

11
46

G
en

13
73

G
en

13
98

G
en

94
4

G
en

13
72

G
en

13
54

G
en

11
24

G
en

13
10

G
en

12
67

G
en

93
0

G
en

11
02

G
en

12
81

G
en

10
98

G
en

95
0

G
en

12
51

G
en

13
09

G
en

14
27

G
en

13
68

G
en

11
70

G
en

10
99

G
en

12
98

G
en

11
13

G
en

10
12

G
en

12
63

G
en

13
21

G
en

13
90

G
en

13
92

G
en

95
9

G
en

21
50

G
en

12
75

G
en

10
71

G
en

11
72

G
en

11
26

G
en

97
7

G
en

12
76

G
en

11
48

G
en

13
56

G
en

10
67

G
en

11
75

G
en

13
77

G
en

94
5

G
en

97
6

G
en

11
33

G
en

13
61

G
en

93
6

G
en

13
04

G
en

13
07

G
en

94
6

G
en

10
41

G
en

11
79

G
en

10
94

G
en

12
43

G
en

14
21

G
en

12
92

G
en

12
27

G
en

11
41

G
en

10
31

G
en

11
28

G
en

97
9

G
en

13
30

G
en

14
28

G
en

11
01

G
en

13
35

G
en

12
50

G
en

14
20

G
en

13
89

G
en

13
50

G
en

11
45

G
en

11
16

G
en

12
10

G
en

93
1

G
en

11
43

G
en

11
27

G
en

13
31

G
en

11
37

G
en

11
32

G
en

13
12

G
en

94
2

G
en

13
36

G
en

12
78

G
en

93
7

G
en

13
45

G
en

14
24

G
en

98
5

G
en

11
35

G
en

13
55

G
en

11
10

G
en

10
38

G
en

11
68

G
en

10
68

G
en

21
81

G
en

13
76

G
en

12
15

G
en

97
8

G
en

97
0

G
en

11
83

G
en

99
2

G
en

11
61

G
en

14
29

G
en

93
5

G
en

64
8

M
al
1

G
en

98
1

G
en

19
42

G
en

19
07

G
en

19
00

G
en

15
26

G
en

18
74

G
en

01
6

G
en

36
5

G
en

10
75

G
en

10
64

G
en

31
1

G
en

18
80

G
en

00
9

G
en

93
9

G
en

66
7

G
en

29
9

G
en

18
68

G
en

19
52

G
en

93
4

G
en

01
1

G
en

68
2

G
en

15
10

G
en

97
4

G
en

09
0

G
en

17
95

G
en

37
9

G
en

30
1

G
en

26
7

G
en

36
6

G
en

37
7

G
en

35
5

G
en

96
8

G
en

31
5

G
en

66
4

G
en

28
5

G
en

01
9

G
en

33
8

G
en

19
11

G
en

18
71

G
en

38
5

G
en

19
03

G
en

04
7

G
en

15
65

G
en

15
01

G
en

29
3

G
en

66
0

G
en

28
4

G
en

10
93

G
en

92
9

G
en

31
0

G
en

03
7

G
en

31
3

G
en

18
03

G
en

15
45

G
en

30
6

G
en

09
7

G
en

65
1

G
en

01
5

G
en

16
59

G
en

09
1

G
en

18
10

G
en

19
25

G
en

18
93

G
en

96
6

G
en

36
7

G
en

93
2

G
en

34
6

G
en

34
4

G
en

15
63

G
en

04
0

G
en

93
3

G
en

18
70

G
en

18
84

G
en

15
73

G
en

34
3

G
en

18
94

G
en

15
85

G
en

93
8

G
en

34
7

G
en

18
83

G
en

19
45

G
en

16
43

G
en

33
3

G
en

18
15

G
en

01
4

G
en

37
1

G
en

67
9

G
en

19
51

G
en

19
26

G
en

15
66

G
en

01
3

G
en

18
73

G
en

19
01

G
en

05
3

G
en

10
73

G
en

18
79

G
en

16
44

G
en

97
3

G
en

36
4

G
en

69
4

G
en

30
4

G
en

70
9

G
en

15
00

G
en

97
5

G
en

30
3

G
en

14
74

G
en

18
53

G
en

38
4

G
en

99
3

G
en

15
90

G
en

33
0

G
en

03
6

G
en

64
0

G
en

19
46

G
en

17
89

G
en

98
0

G
en

18
54

G
en

04
4

G
en

05
4

G
en

19
16

G
en

10
48

G
en

09
2

G
en

18
85

G
en

15
61

G
en

15
46

G
en

95
4

G
en

99
4

G
en

18
78

G
en

67
1

G
en

19
08

G
en

96
4

G
en

10
66

G
en

29
7

G
en

65
2

G
en

14
64

G
en

10
30

G
en

14
98

G
en

33
1

G
en

15
68

G
en

18
75

G
en

94
3

G
en

10
37

G
en

10
69

G
en

18
57

G
en

35
3

G
en

07
9

G
en

00
7

G
en

28
2

G
en

15
49

G
en

37
8

G
en

97
1

G
en

37
3

G
en

18
89

G
en

15
27

G
en

19
05

G
en

37
5

G
en

36
3

G
en

98
3

G
en

97
2

G
en

15
02

G
en

92
8

G
en

29
5

G
en

34
5

G
en

31
6

G
en

96
5

G
en

96
3

G
en

18
20

G
en

37
4

G
en

32
9

G
en

94
0

G
en

14
61

G
en

15
04

G
en

26
2

G
en

25
7

G
en

76
6

Fe
m
6

G
en

84
8

G
en

77
1

G
en

72
1

G
en

73
4

G
en

92
7

G
en

77
9

G
en

77
3

G
en

80
9

G
en

78
7

G
en

90
2

G
en

87
4

G
en

89
0

G
en

76
1

G
en

74
2

G
en

81
8

G
en

68
7

G
en

78
4

G
en

85
2

G
en

64
4

G
en

78
3

G
en

81
9

G
en

63
8

G
en

73
3

G
en

69
5

G
en

68
3

G
en

72
4

G
en

75
2

G
en

88
1

G
en

81
5

G
en

74
3

G
en

77
4

G
en

67
3

G
en

70
6

G
en

86
2

G
en

88
0

G
en

89
2

G
en

76
0

G
en

70
5

G
en

86
7

G
en

64
6

G
en

74
6

G
en

74
8

G
en

80
0

G
en

78
0

G
en

83
3

G
en

69
6

G
en

67
0

G
en

71
1

G
en

71
2

G
en

71
3

G
en

79
7

G
en

70
2

G
en

77
8

G
en

65
3

G
en

85
1

G
en

84
0

G
en

75
1

G
en

88
3

G
en

67
7

G
en

68
4

G
en

64
7

G
en

83
0

G
en

69
1

G
en

76
8

G
en

74
9

G
en

65
9

G
en

91
6

G
en

81
7

G
en

65
6

G
en

79
4

G
en

66
3

G
en

91
4

G
en

80
3

G
en

82
5

G
en

73
8

G
en

79
0

G
en

81
6

G
en

70
4

G
en

71
8

G
en

87
3

G
en

83
8

G
en

75
3

G
en

83
7

G
en

64
2

G
en

69
3

G
en

70
8

G
en

73
0

G
en

75
7

G
en

75
4

G
en

73
1

G
en

64
3

G
en

79
5

G
en

83
6

G
en

77
7

G
en

88
8

G
en

83
2

G
en

79
9

G
en

78
6

G
en

82
7

G
en

84
5

G
en

81
4

G
en

65
8

G
en

90
5

G
en

90
7

G
en

82
4

G
en

13
13

G
en

91
9

G
en

70
1

G
en

69
7

G
en

81
2

G
en

72
2

G
en

80
8

G
en

89
7

G
en

68
9

G
en

92
4

G
en

67
2

G
en

77
5

G
en

77
6

G
en

82
6

G
en

91
7

G
en

89
5

G
en

67
5

G
en

65
5

G
en

76
2

G
en

87
0

G
en

84
1

G
en

66
1

G
en

71
0

G
en

75
0

G
en

84
6

G
en

92
6

G
en

76
9

G
en

87
9

G
en

64
9

G
en

89
6

G
en

70
3

G
en

70
7

G
en

85
6

G
en

80
7

G
en

72
8

G
en

90
3

G
en

65
4

G
en

69
8

G
en

86
0

G
en

75
9

G
en

73
5

G
en

66
9

G
en

75
8

G
en

84
9

G
en

91
1

G
en

89
8

G
en

88
5

G
en

64
1

G
en

86
1

G
en

84
7

G
en

87
8

G
en

85
0

G
en

64
5

G
en

72
3

G
en

89
1

G
en

80
5

G
en

92
5

G
en

77
2

G
en

74
1

G
en

92
3

G
en

63
9

G
en

83
5

G
en

91
0

G
en

72
6

G
en

90
0

G
en

91
8

G
en

88
6

G
en

13
80

G
en

76
4

G
en

67
4

G
en

70
0

G
en

87
5

G
en

17
51

IR
80

55
9A

.1
G
en

16
97

Fe
m
4

G
en

17
60

G
en

16
24

G
en

16
17

G
en

15
30

G
en

17
75

G
en

14
84

G
en

15
51

G
en

15
57

G
en

16
45

G
en

16
87

G
en

18
56

G
en

16
92

G
en

15
52

G
en

59
9

G
en

15
62

G
en

16
50

G
en

16
86

G
en

19
48

G
en

16
80

G
en

17
85

G
en

15
06

G
en

17
50

G
en

17
52

G
en

17
19

G
en

16
89

G
en

14
66

G
en

14
68

G
en

16
98

G
en

84
4

G
en

17
91

G
en

16
10

G
en

16
51

G
en

15
67

G
en

18
60

G
en

17
92

G
en

18
05

G
en

17
00

G
en

18
41

G
en

18
46

G
en

18
62

G
en

15
05

G
en

15
48

G
en

16
79

G
en

16
74

G
en

15
70

G
en

17
69

G
en

14
67

G
en

18
31

G
en

18
61

G
en

16
42

G
en

19
50

G
en

14
70

G
en

17
29

G
en

14
65

G
en

17
84

G
en

14
71

G
en

15
87

G
en

14
72

G
en

16
46

G
en

15
99

G
en

14
58

G
en

17
74

G
en

17
27

G
en

15
12

G
en

15
11

G
en

17
49

G
en

16
33

G
en

14
86

G
en

17
99

G
en

17
58

G
en

16
55

G
en

18
12

G
en

16
73

G
en

15
64

G
en

16
88

G
en

16
02

G
en

17
70

G
en

18
00

G
en

15
38

G
en

18
32

G
en

16
19

G
en

18
17

G
en

15
14

G
en

18
43

G
en

17
65

G
en

14
73

G
en

17
97

G
en

17
77

G
en

16
08

G
en

16
05

G
en

15
28

G
en

17
28

G
en

14
83

G
en

15
50

G
en

16
70

G
en

16
36

G
en

16
26

G
en

16
14

G
en

17
83

G
en

15
22

G
en

16
90

G
en

18
49

G
en

16
76

G
en

18
11

G
en

16
96

G
en

17
90

G
en

15
59

G
en

17
05

G
en

16
28

G
en

15
17

G
en

81
1

G
en

16
54

G
en

17
59

G
en

16
52

G
en

16
64

G
en

18
55

G
en

15
07

G
en

18
36

G
en

15
18

G
en

16
75

G
en

14
62

G
en

15
13

G
en

14
69

G
en

15
47

G
en

17
09

G
en

16
81

G
en

16
69

G
en

16
62

G
en

16
53

G
en

17
53

G
en

17
22

G
en

16
61

G
en

16
47

G
en

18
37

G
en

17
11

G
en

15
21

G
en

16
95

G
en

16
09

G
en

17
62

G
en

14
63

G
en

15
94

G
en

16
06

G
en

18
59

G
en

18
18

G
en

18
50

G
en

18
35

G
en

17
66

G
en

16
67

G
en

16
11

G
en

15
09

G
en

16
99

G
en

14
85

G
en

18
19

G
en

15
16

G
en

18
25

G
en

15
15

G
en

17
64

G
en

15
71

G
en

16
21

G
en

18
01

G
en

24
8

G
en

42
1

G
en

40
1

G
en

19
69

G
en

44
5

G
en

19
85

G
en

19
70

G
en

39
5

G
en

20
26

G
en

16
38

G
en

16
18

G
en

46
3

G
en

03
2

G
en

16
32

G
en

11
0

G
en

11
76

G
en

15
92

G
en

40
2

G
en

19
27

G
en

16
49

G
en

11
08

G
en

40
7

G
en

16
57

G
en

46
2

G
en

20
36

G
en

11
23

G
en

48
0

G
en

19
61

G
en

16
15

G
en

20
55

G
en

16
16

G
en

11
9

G
en

41
8

G
en

43
5

G
en

16
07

G
en

20
03

G
en

03
5

G
en

19
57

G
en

41
9

G
en

78
1

G
en

16
20

G
en

16
60

G
en

15
98

G
en

67
8

G
en

47
5

G
en

16
01

G
en

44
7

G
en

10
2

G
en

11
18

G
en

19
95

G
en

43
3

G
en

20
64

G
en

19
99

G
en

75
5

G
en

47
4

G
en

41
7

G
en

08
3

G
en

16
41

G
en

43
1

G
en

39
4

G
en

19
65

G
en

16
65

G
en

47
6

G
en

10
5

G
en

11
00

G
en

15
88

G
en

73
7

G
en

15
93

G
en

39
6

G
en

19
80

G
en

11
50

G
en

47
1

G
en

20
62

G
en

19
55

G
en

42
6

G
en

20
22

G
en

67
6

G
en

11
47

G
en

65
7

G
en

39
9

G
en

41
1

G
en

16
56

G
en

55
0

G
en

40
5

G
en

77
0

G
en

03
0

G
en

19
93

G
en

18
82

G
en

16
63

G
en

72
0

G
en

16
13

G
en

48
2

G
en

16
29

G
en

11
36

G
en

11
86

G
en

16
66

G
en

19
71

G
en

11
12

G
en

16
40

G
en

11
03

G
en

20
37

G
en

20
17

G
en

19
79

G
en

11
55

G
en

76
3

G
en

11
09

G
en

73
9

G
en

46
0

G
en

73
6

G
en

11
44

G
en

74
7

G
en

16
27

G
en

11
81

G
en

10
96

G
en

20
32

G
en

47
8

G
en

11
4

G
en

11
29

G
en

48
3

G
en

16
31

G
en

11
82

G
en

16
39

G
en

16
68

G
en

71
9

G
en

16
23

G
en

08
6

G
en

72
7

G
en

10
95

G
en

14
57

G
en

39
3

G
en

44
3

G
en

11
3

G
en

45
7

G
en

41
6

G
en

40
6

G
en

11
30

G
en

40
9

G
en

11
39

G
en

11
49

G
en

41
0

G
en

19
66

G
en

43
0

G
en

20
01

M
al
4

G
en

11
87

G
en

10
91

G
en

19
49

G
en

22
2

G
en

23
7

G
en

23
0

G
en

27
9

G
en

27
7

G
en

24
6

G
en

03
4

G
en

20
6

G
en

19
1

G
en

05
1

G
en

18
5

G
en

14
3

G
en

16
2

G
en

19
0

G
en

22
1

G
en

04
9

G
en

14
6

G
en

21
1

G
en

25
9

G
en

12
1

G
en

09
4

G
en

15
1

G
en

23
9

G
en

24
3

G
en

26
6

G
en

18
1

G
en

19
2

G
en

14
0

G
en

25
5

G
en

11
2

G
en

24
1

G
en

19
8

G
en

15
0

G
en

25
2

G
en

23
8

G
en

20
2

G
en

20
5

G
en

20
0

G
en

13
2

G
en

22
4

G
en

23
4

G
en

02
6

G
en

11
1

G
en

02
4

G
en

04
6

G
en

11
7

G
en

27
1

G
en

25
3

G
en

03
3

G
en

03
8

G
en

19
9

G
en

00
8

G
en

10
4

G
en

10
3

G
en

00
6

G
en

00
2

G
en

08
4

G
en

08
5

G
en

27
8

G
en

12
0

G
en

02
2

G
en

25
8

G
en

18
0

G
en

21
9

G
en

10
8

G
en

18
2

G
en

26
5

G
en

04
2

G
en

12
4

G
en

02
3

G
en

22
0

G
en

13
6

G
en

02
0

G
en

02
5

G
en

11
5

G
en

01
2

G
en

00
5

G
en

16
6

G
en

05
2

G
en

18
7

G
en

17
7

G
en

04
3

G
en

13
9

G
en

17
4

G
en

13
8

G
en

10
1

G
en

02
9

G
en

00
3

G
en

09
5

G
en

01
7

G
en

04
1

G
en

13
7

G
en

19
4

G
en

17
5

G
en

17
9

G
en

12
3

G
en

01
0

G
en

19
5

G
en

02
1

G
en

09
8

G
en

24
5

G
en

02
7

G
en

17
6

G
en

09
3

G
en

09
6

G
en

17
2

Fe
m
1

G
en

03
9

G
en

25
1

G
en

20
1

G
en

11
6

G
en

24
9

G
en

00
1

G
en

04
5

G
en

03
1

G
en

04
8

G
en

01
8

G
en

23
2

G
en

18
8

G
en

25
0

G
en

21
55

G
en

21
90

G
en

21
59

G
en

21
54

G
en

21
70

G
en

21
64

G
en

21
87

G
en

21
49

G
en

21
88

G
en

21
80

G
en

21
78

G
en

21
71

G
en

21
72

G
en

21
77

G
en

21
51

G
en

21
63

G
en

21
57

G
en

21
37

G
en

19
44

G
en

[Fig. 4, panel b: heatmap of ancestry coefficient (y-axis) by genotype (x-axis)]

Fig. 4   Principal component analysis of SSD Tropics lines, including the 
1,484 F6 lines and their 10 parental lines, based on 671 high-quality SNP 
markers. a = Scree plot showing the optimum number of ancestral populations; 
b = Heatmap of individual admixture coefficients for the optimum 
number of ancestral populations; c = Scatter plot illustrating genotype 
groupings projected on the first two components

Table 3   Results of analysis of molecular variance based on 671 SNP 
markers and the 1484 genotypes from the 24 families in the SSD 
Tropics population

Source of variation         df     Variance component   % of total variance
Between families            23     104.03               42.58
Among lines within family   1460   124.08               50.80
Within lines (residual)     1484   16.18                6.62
Total                       2967   244.29               100.00

df degree of freedom



Root mean square error of weighted 
and unweighted genomic prediction models

In both populations, normalized root mean square error 
(nRMSE) was markedly lower with all ST and MT 
weighted models than with their corresponding unweighted ones 
(Fig. 9). For both traits, the lowest nRMSE values were 
observed with − log10(p)-w and AE-w, while the unweighted 
models showed the highest values. In the SSD Tropics 
population, − log10(p)-w and AE-w reduced nRMSE by 0.06–0.07 
(11.3%) for BL severity, and by 0.03–0.05 (3.8–5.1%) for PB 
severity across both ST and MT models (Fig. 9a, b), while 
FST-w showed little to no nRMSE reduction.

In addition, nRMSE reduction was higher in the 3K 
population than in the SSD Tropics population for all traits. In the 
3K population, − log10(p)-w exhibited an nRMSE reduction 
of 0.17–0.20 (25–29.4%) for BL severity and 0.07–0.21 
(7.5–26.9%) for PB severity across ST and MT models 
(Fig. 9c, d). Similarly, AE-w reduced nRMSE by 0.21–0.24 
(30.4–35.3%) for BL severity and 0.10–0.25 (9.7–32.1%) 
for PB severity. FST-w showed an nRMSE reduction of 
0.03–0.17 (4.4–24.6%) for BL severity and 0.03–0.13 
(3.8–14.0%) for PB severity.

In both populations, the nRMSE reduction achieved by ST models was 
higher than that of MT models across traits and weighting 
methods, with the exception of FST-w in the 3K population.
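
The two evaluation metrics compared in this section can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and the normalization of RMSE by the standard deviation of the observed values is an assumption, since the normalization used for nRMSE is not stated in this section.

```python
import numpy as np

def predictive_ability(observed, predicted):
    """Pearson correlation between observed phenotypes and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])

def nrmse(observed, predicted):
    """RMSE divided by the sample standard deviation of the observations.

    The choice of normalizer is an assumption made for this sketch.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return float(rmse / np.std(observed, ddof=1))

def relative_reduction(unweighted, weighted):
    """Percentage reduction of an error metric relative to the unweighted model."""
    return 100.0 * (unweighted - weighted) / unweighted
```

As a usage example with illustrative inputs, `relative_reduction(0.62, 0.55)` gives about 11.3%, which is the kind of absolute (0.07) and relative reduction reported above for BL severity.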

Discussion

With its assumption of equal contributions of all 
genetic markers to the trait of interest, the traditional 
GBLUP model has significant limitations that can lower 
genomic prediction accuracy (Nishio and Satoh 2015). 
By comparing three marker weighting methods 
(FST-w, − log10(p)-w and AE-w) in the SSD Tropics and 3K 
populations, this study aimed to refine the genomic 
relationship matrix and enhance predictive abilities for leaf 
blast and panicle blast resistance in rice. Compared to the 
standard unweighted approach, − log10(p)-w and AE-w 
exhibited the highest predictive ability with the lowest root 
mean square error for ST and MT models (Figs. 8 and 9). 

[Fig. 5, panels a–c: a = cross-entropy versus number of populations (K); b = ancestry coefficient by genotype; c = PC1 (6.7%) versus PC2 (3.6%), with groups Admixed, Pop1, Pop2, Pop3 and Pop4]

Fig. 5   Principal component analysis of the 204 3K lines based on 
9,126 high-quality SNP markers. a = Scree plot showing the optimum 
number of ancestral populations; b = Heatmap of individual admixture 
coefficients for the optimum number of ancestral populations; 
c = Scatter plot illustrating genotype groupings projected on the first 
two components



Fig. 6   Manhattan plots showing 
marker-trait associations for 
leaf blast (BL) and panicle blast 
(PB) severity. a = SSD Tropics 
population and b = 3K popula-
tion. The red solid line repre-
sents the corrected Bonferroni 
threshold, serving as the cutoff 
for significant markers

Table 4   Most significant marker-trait associations within quantitative trait loci (QTL) region for leaf blast (BL) and panicle blast (PB) severity in 
SSD Tropics and 3K populations

QTL region  Chr  Pos (Mb)       Lead SNP               Trait  FA/UA  FAF   −LOG10(p)  PVE   FDR

a. SSD Tropics
qtl6.6      6    10.006–11.131  Pi2-01                 BL     T/A    0.09  11.30      0.07  1.0E-09
                                Pi2-01                 PB     A/T    0.09  8.25       0.02  2.3E-06
                                Pi2-02                 BL     C/G    0.09  11.50      0.06  1.0E-09
                                Pi2-02                 PB     C/G    0.09  7.98       0.02  2.3E-06
                                MSU7_6_10389352_A-T    BL     A/T    0.09  11.10      0.06  1.3E-09
                                MSU7_6_10389352_A-T    PB     A/T    0.09  8.17       0.02  8.0E-05
                                MSU7_6_10388389_TT-AA  PB     T/A    0.09  8.17       0.02  2.3E-06
                                MSU7_6_10388389_TT-AA  BL     T/A    0.09  12.10      0.06  5.2E-10
qtl8.3      8    6.269–7.470    chr08_6269190          BL     A/T    0.37  6.74       0.19  2.4E-05
                                Pi33_3                 BL     T/G    0.36  6.17       0.19  7.6E-06

b. 3K population
qtl1.6      1    6.227–6.320    6249810                PB     A/G    0.83  6.56       0.18  7.6E-04
qtl6.18     6    10.321–10.389  191472769              BL     G/C    0.13  7.35       0.06  4.6E-05
                                191432383              BL     A/T    0.08  6.41       0.18  6.1E-05
qtl12.17    12   10.933–12.410  357138628              BL     G/A    0.35  6.30       0.18  6.1E-05

SNP single nucleotide polymorphism, Chr chromosome, Pos physical position on the Nipponbare reference genome Os-Nipponbare-Reference-IRGSP1.0, 
FA/UA favorable and unfavorable alleles, FAF favorable allele frequency, MAF minor allele frequency, − LOG10(p) = negative base 10 
logarithm of P-value, PVE phenotypic variance explained, FDR false discovery rate



Table 5   Gene ontology and encoding proteins linked to the most significant marker-trait associations for leaf blast (BL) and panicle blast (PB) 
severity within SSD Tropics and 3K populations

QTL region  Chr  Pos (Mb)       Lead GWAS-detected SNP  Locus         Encoding protein

a. SSD Tropics
qtl6.6      6    10.006–11.131  Pi2-01                  Os06g0286700  NBS-LRR
                                Pi2-02                  Os06g0286700  NBS-LRR
                                MSU7_6_10389352_A-T     Os06g0286700  NBS-LRR
                                MSU7_6_10388389_TT-AA   Os06g0286700  NBS-LRR
qtl8.3      8    6.269–7.470    chr08_6269190           Os08g0207500  Zinc transporter 4
                                Pi33_3                  Pi33          ACE1-specific protein

b. 3K population
qtl1.6      1    6.227–6.320    6249810                 Os01g0214300  Bromodomain protein
qtl6.18     6    10.321–10.389  191472769               Os06g0286700  NBS-LRR
                                191432383               Os06g0286351  Armadillo-type fold domain
qtl12.17    12   10.933–12.410  357138628               Os12g0294100  WD40 repeats protein

ACE1 avirulence conferring enzyme 1, Chr chromosome, Pos physical position on the Nipponbare reference genome Os-Nipponbare-Reference-IRGSP1.0, 
NBS-LRR nucleotide-binding site leucine-rich repeats

[Fig. 7, panels a–c: disease severity (scale 0.0–9.0) by haplotype group; a, b = Os06g0286700 (SSD Tropics: AA, AT, TT; 3K: CC, CG, GG); c = Pi33 (SSD Tropics: GG, GT, TT)]

Fig. 7   Haplotype groups defined based on markers linked to 
Os06g0286700 and Pi33 for each trait in SSD Tropics and 3K 
populations. a = leaf blast (BL) severity with Os06g0286700, b = panicle 
blast (PB) severity with Os06g0286700 and c = leaf blast (BL) 
severity with Pi33. Haplotypes for Os06g0286700 were identified 
by allele combinations of markers MSU7_6_10388389_TT-AA and 
191472769 in SSD Tropics and 3K populations, respectively. 
Haplotypes for Pi33 were defined based on marker Pi33_3 in the SSD 
Tropics population. The red point within each box represents the group 
average disease severity



The performance of these methods was consistent across the two 
rice populations, demonstrating that GWAS-based weighting 
methods can be reliably applied in different breeding 
programs, particularly when targeting quantitative traits 
such as blast resistance that are controlled by a combination 
of major- and small-effect genes. Here, we discuss the practical 
implementation of a GS + GWAS approach that maximizes 
the use of both major-effect and small-effect loci to 
accelerate genetic gains for durable blast resistance in rice.
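
To make the idea of refining the genomic relationship matrix concrete, the sketch below builds a marker-weighted G matrix from a genotype matrix coded 0/1/2. It is a hedged illustration under stated assumptions: it uses a VanRaden-style scaling with a diagonal weight matrix, which is one common convention and not necessarily the exact formulation used in this study, and the function name `weighted_g_matrix` is hypothetical.

```python
import numpy as np

def weighted_g_matrix(genotypes, weights=None):
    """Build a (weighted) genomic relationship matrix.

    genotypes: (n_lines, n_markers) array coded as 0/1/2 allele counts.
    weights:   optional non-negative weight per marker; None means equal
               weights, which gives the usual unweighted G matrix.
    Assumed form: G = Z D Z' / sum(2 p (1 - p) d), with D = diag(weights).
    """
    M = np.asarray(genotypes, dtype=float)
    n_markers = M.shape[1]
    p = M.mean(axis=0) / 2.0          # allele frequency per marker
    Z = M - 2.0 * p                   # centre each marker column
    w = np.ones(n_markers) if weights is None else np.asarray(weights, dtype=float)
    if w.shape != (n_markers,) or np.any(w < 0):
        raise ValueError("weights must be non-negative, one per marker")
    denom = np.sum(2.0 * p * (1.0 - p) * w)
    # Multiplying the columns of Z by w is the same as Z @ diag(w)
    return (Z * w) @ Z.T / denom
```

Because the weights appear in both the numerator and the denominator, multiplying all weights by a constant leaves the matrix unchanged, so only the relative weighting of markers matters in this form.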

Genetic diversity and population structure

Substantial genetic diversity was observed in both 
populations (Figs. 2, 3, 4 and 5). In the SSD Tropics population, 
which consists of interconnected families sharing several 
common parents, the phenotypic analysis consistently 
revealed higher within-family variance than between-family 
variance across all traits (Table 2). This pattern 
suggests that most of the genetic variation in this 
population resides within families rather than between 
families, likely due to the segregation of alleles inherited from 
common parents. This observation was further supported 
by the molecular analysis, where AMOVA results showed 
that the within-family variance component (~ 124.1, 50.8% of 
total variance) exceeded the between-family component 
(~ 104.0, 42.6%) by approximately 8 percentage points 
(Table 3). This interconnected pedigree structure also 
contributed to the overlapping genetic backgrounds observed 
in the population structure analysis, where subpopulations 
were primarily defined by family relationships rather than by 
clear genetic differentiation. In contrast, the 3K population 
exhibited higher genetic diversity and differentiation, 
likely due to several factors, including the various variety 

[Fig. 8, panels a–d: predictive ability (y-axis, 0.0–1.0) of MT and ST models for Unw, FST-w, −log10(p)-w and AE-w; a, b = SSD Tropics; c, d = 3K]

Fig. 8   Predictive abilities of single-trait (ST) and multi-trait (MT) 
weighted and unweighted (Unw) genomic prediction for each trait. 
a = leaf blast (BL) severity in SSD Tropics population, b = panicle 
blast (PB) severity in SSD Tropics population, c = leaf blast severity 
(BL) in 3K population and d = panicle blast (PB) severity in 3K 
population. Values above each bar represent average predictive ability. 
FST-w = fixation index-based weighting method, − log10(p)-w = negative 
base 10 logarithm of P value-derived weighting method, 
AE-w = squared additive effects-derived weighting method. *** 
significant at p < 0.001



types of its accessions, such as ind1A, ind1B, indx, ind2, 
ind3, aro (aromatic), trop1 and admix types (Supplemen-
tary file 1). The 3K population encompasses landraces, 
traditional cultivars, and improved lines of diverse variety 
types collected from different agro-ecological zones that 
might contribute to the observed higher genetic variability. 
The high broad-sense heritability (0.69–0.83) and the 
significant genotype-by-environment interaction highlight 
the contribution of both genetic and environmental factors 
to shaping leaf and panicle blast resistance (Table 2).
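
The comparison of within-family and between-family variation rests on simple arithmetic on the variance components in Table 3. The short check below recomputes the shares of total variance from those components; the variable names are illustrative, and small differences from the printed percentages are expected because the published components are rounded.

```python
# Variance components as printed in Table 3 (AMOVA, SSD Tropics population)
components = {
    "Between families": 104.03,
    "Among lines within family": 124.08,
    "Within lines (residual)": 16.18,
}

total = sum(components.values())

# Share of total variance for each source, in percent
shares = {source: 100.0 * value / total for source, value in components.items()}

# Two ways of comparing within-family with between-family variation
ratio = components["Among lines within family"] / components["Between families"]
point_gap = shares["Among lines within family"] - shares["Between families"]

for source, share in shares.items():
    print(f"{source}: {share:.2f}% of total variance")
print(f"within/between ratio: {ratio:.3f}")
print(f"difference in share: {point_gap:.2f} percentage points")
```

The two comparisons answer different questions: the difference in shares is expressed in percentage points of total variance, whereas the ratio expresses how much larger one component is than the other in relative terms.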

Genetic architecture of leaf blast and panicle blast 
resistance

GWAS analysis was implemented to link the observed 
genetic diversity to blast disease resistance. It detected 
several marker-trait associations for leaf blast and panicle 
blast severity across the two populations, reinforcing the 
complex and polygenic nature of blast resistance (Table 4, Fig. 6). 
These findings are consistent with previous studies that 
reported several QTLs and candidate genes contributing 
to blast resistance (Li et al. 2019; Tian et al. 2022; Korinsak 
et al. 2023; Devanna et al. 2024). In a recent meta-analysis, 
Devanna et al. (2024) reported 737 QTLs for blast disease 
from 53 independent populations, which they clustered into 
71 meta-QTLs. Similarly, in a prior meta-QTL and RNA-seq 
analysis, Tian et al. (2022) clustered 839 QTLs for blast 
resistance into 67 meta-QTLs, from which they reported 
more than 118 differentially expressed genes. Interestingly, 
BL- and PB-specific loci were identified in both the SSD 
Tropics and 3K populations, 
which suggests that the tested genotypes hold different genes 
that respond differently to leaf and panicle infection, as 
reported previously (Kalia and Rathour 2019). Blast can 
infect the same rice plant at vegetative and reproductive 

[Fig. 9, panels a–d: normalized RMSE (y-axis) of MT and ST models for Unw, FST-w, −log10(p)-w and AE-w; a, b = SSD Tropics; c, d = 3K]

Fig. 9   Normalized root mean square error (nRMSE) of single-trait 
(ST) and multi-trait (MT) weighted and unweighted (Unw) genomic 
prediction models. a = leaf blast (BL) severity in SSD Tropics 
population, b = panicle blast (PB) severity in SSD Tropics population, 
c = leaf blast (BL) severity in 3K population and d = panicle blast 
(PB) severity in 3K population. FST-w = fixation index-based weighting 
method, − log10(p)-w = negative base 10 logarithm of P value-derived 
weighting method, AE-w = squared additive effects-derived 
weighting method. *, ** and *** significant at p < 0.05, 0.01 and 
0.001, respectively



stages if disease-conducive conditions prevail during the 
entire crop cycle, as is the case in tropical countries like 
Colombia. In other environments where climate conditions 
are less stable, rice fields face high disease pressure at 
either the vegetative or the reproductive stage. Our findings 
highlight the detection of potential donors for leaf and panicle 
blast resistance under natural infection conditions. This 
differentiation is very useful for breeding programs since 
both leaf and panicle blast resistance are necessary to 
decrease the negative economic and environmental impact 
of blast disease on rice production.

A major MTA, namely Pi33_3, linked to the Pi33 gene 
harbored by QTL region qtl8.3 (6.269–7.470 Mb), was 
detected in the SSD Tropics population for leaf blast resistance, 
explaining about 19% of the phenotypic variance of the trait 
(Tables 4 and 5). Pi33 was previously reported as a major 
blast resistance gene, which encodes a protein that 
recognizes the pathogen’s avirulence conferring enzyme 1 (ACE1) 
(Berruyer et al. 2003; Ballini et al. 2007). ACE1 is a M. 
grisea avirulence gene that encodes a polyketide synthase 
fused to a non-ribosomal peptide synthetase involved 
in the biosynthesis of a secondary metabolite that is 
specifically recognized by Pi33 during infection (Collemare 
et al. 2008). Based on gene expression analysis in the IR64 
rice cultivar, Vergne et al. (2007) demonstrated that the 
ACE1-to-Pi33 interaction triggers the activation and upregulation 
of defense genes and the down-regulation of several 
chlorophyll a/b binding genes, leading to enhanced resistance 
to blast disease. Haplotype analysis revealed 517 lines that 
possessed the favorable allele of marker Pi33_3, showing an 
average leaf blast severity of 2.65. These genotypes could 
be incorporated in future studies focusing on marker 
validation and the development of breeder-friendly markers for use in 
marker-assisted breeding strategies.

In addition, our results detected three highly linked 
($r^2_v \geq 0.8$) common significant markers mapped to 
position 10.38 Mb on chromosome 6 for BL and PB severity 
within the SSD Tropics population. This suggests the presence 
of shared genetic factors governing resistance to both types 
of blast disease, as reported by Babasaheb Aglawe et al. 
(2017), Noenplab et al. (2006) and Korinsak et al. (2023). 
These common markers were linked to the locus Os06g0286700 
(10.38–10.39 Mb), which is annotated as the Pi2/Pi9 gene 
cluster in the Rice Annotation Project Database 
(RAP-DB) (Sakai et al. 2013). On a scale of 0–9, our 
results revealed that this gene reduced the blast severity of the best 
haplotypes by up to 2.4 points (Fig. 7). Kalia and Rathour 
(2019) found that most major blast resistance genes, 
including the Pi2/Pi9 gene cluster, encode an NBS-LRR protein. This 
protein is known to play a pivotal role in plant defense, 
particularly through the recognition of pathogen effectors and the 
initiation of effector-triggered immunity, a key plant immune 
response (Dubey and Singh 2018). Similarly, Devanna et al. 

(2024) identified NBS-LRR genes from 53 refined meta-
QTLs for blast resistance, further emphasizing the impor-
tance of these genes in breeding for blast resistance in rice. 
Moreover, our study also identified another significant 
marker (191472769, 10.38 Mb) linked to Pi2/Pi9 for BL 
severity within the 3K population. This demonstrates the 
consistency of this gene across diverse genetic backgrounds and 
highlights its potential as a promising target for 
genomics-assisted breeding for improved leaf blast resistance in rice. 
The relatively high variability observed within haplotype 
groups for both BL and PB severity in the two populations 
(Fig. 7) could be explained by the existence of several 
small-effect genes that contribute to blast resistance in addition to 
complete and race-specific genes such as Pi2/Pi9 and Pi33 
detected in this study.
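
As a rough illustration of how linkage between two markers can be quantified, the sketch below computes the squared Pearson correlation between two genotype vectors coded 0/1/2. This is an uncorrected r² and only an assumption about the general idea: the subscripted statistic cited in the text may include corrections (for example for relatedness) that this sketch does not implement, and the function name `ld_r2` is hypothetical.

```python
import numpy as np

def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two markers coded 0/1/2.

    Returns NaN when either marker is monomorphic, because the
    correlation is undefined without variation.
    """
    a = np.asarray(snp_a, dtype=float)
    b = np.asarray(snp_b, dtype=float)
    if a.std() == 0.0 or b.std() == 0.0:
        return float("nan")
    r = np.corrcoef(a, b)[0, 1]
    return float(r ** 2)

def highly_linked(snp_a, snp_b, threshold=0.8):
    """True when the uncorrected r2 meets the chosen threshold."""
    return ld_r2(snp_a, snp_b) >= threshold
```

Two markers with opposite allele coding still give an r² of 1, since the statistic ignores the sign of the correlation.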

Furthermore, a moderate genotypic correlation 
(0.43–0.44) was also observed between the two traits, 
supporting the view that some genetic factors influence both leaf 
and panicle blast resistance (Figs. 2 and 3). In this case, 
selection for resistance to leaf blast would positively 
influence resistance to panicle blast. Similarly, 
Korinsak et al. (2023) evaluated three blast isolates (THL191, 
THL949, and THL 557) and found moderate positive 
correlations (0.45–0.47) between leaf and panicle blast 
resistance. Noenplab et al. (2006) also reported low to 
moderate correlations (0.28–0.56) between leaf and panicle blast 
using various blast isolates. However, the low to 
moderate strength of the correlation also supports the presence 
of distinct genetic factors governing leaf and panicle blast 
resistance. Differences among correlations reported using similar 
data analyses reflect the blast population used to generate 
the phenotypic data. The results reported here correspond to 
natural infection, where the virulence spectrum of the entire 
pathogen population is difficult to assess, in contrast to 
evaluations carried out using purified strains. Further analysis should be 
done to establish whether the resistance identified in this study is 
able to control the blast pathogen in other locations where rice 
blast is a major threat.

G‑matrix weighting methods and genomic 
prediction ability

The high average predictive ability of 0.74–0.78 observed 
for leaf blast severity and the moderate predictive ability 
of 0.62–0.63 for panicle blast severity with the standard 
unweighted single-trait GBLUP underscore the potential of 
GBLUP-based genomic selection to accelerate blast resist-
ance breeding in rice (Fig. 8). Similar results were reported 
by Huang et al. (2019), who evaluated several blast isolates 
in two rice populations and found an average predictive abil-
ity of 0.55 for GBLUP. In our study, the predictive ability 
was consistently high for leaf blast severity and moderate for 
panicle blast severity across the two populations. Average 



predictive ability of unweighted GBLUP for leaf blast 
severity was 0.16 and 0.12 higher than for panicle blast severity 
in the SSD Tropics and 3K populations, respectively. 
This may be attributed to differences in the genetic 
architecture of these traits and their underlying molecular 
mechanisms, which are likely shaped by distinct pathosystems. 
Although GS can predict both traits, its efficiency 
might be higher for leaf blast resistance. Genetic resources 
for panicle blast resistance are usually limited, partly due 
to difficulties in phenotyping panicle blast disease using 
available evaluation methods (Hayashi et al. 2019). Visual 
plot-wise scoring under natural infection conditions may be 
more reliable for leaf blast than for panicle blast, since foliar 
tissue is more exposed to the rater's eye than the panicle structure, 
in which capturing a moderately resistant reaction can be 
challenging. An additional strategy, such as a plant-wise phenotyping 
method, could be considered to confirm the reduced 
association between genetic diversity and panicle resistance within 
the studied population and to apply the models to capture all 
loci that may contribute to this trait. Additionally, very late 
genotypes may escape the highest panicle blast infection 
pressure under natural conditions, leading to missing records 
for PB severity, as observed mostly in the 3K population in our 
study (Fig. 1). Moreover, the results also confirm the ability 
of GBLUP to generalize effectively across diverse genetic 
backgrounds and environments.

In addition, the application of different weighting meth-
ods resulted in varying levels of gains in the predictive 
ability compared to the unweighted GBLUP models across 
populations and traits. Improvements ranged from high to 
no increases in the average predictive ability and were con-
sistent across both ST and MT models. This demonstrates 
that the integration of marker-specific weights has the poten-
tial to increase genomic prediction accuracy. Incorporation 
of marker weights accounted for differences in the genetic 
control of the traits as well as the biological relevance of 
specific genomic loci. Similar findings were reported
by Montesinos-López et al. (2024b) and Ren et al. (2021),
who observed significant improvements in genomic prediction
accuracy using marker-specific weights. Moreover, the
weighting methods achieved moderate to substantial reductions
in the normalized root mean square error (nRMSE)
compared to unweighted models (Fig. 9). First, these results
emphasize the effectiveness of weighted models in
accurately ranking genotypes based on their genomic estimated
breeding values (GEBVs), thereby improving selection
decisions. Second, they highlight the potential
of weighted models to predict GEBVs that numerically
approximate the true and/or observed performance.
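
The distinction between ranking and numerical closeness can be illustrated with a minimal sketch. It assumes that predictive ability is the Pearson correlation between observed values and GEBVs and that nRMSE is the RMSE divided by the observed range; the normalization and the severity scores below are hypothetical choices for illustration, not values or definitions taken from this study.

```python
import numpy as np

def predictive_ability(observed, predicted):
    """Pearson correlation between observed values and predictions."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.corrcoef(observed, predicted)[0, 1])

def nrmse(observed, predicted):
    """RMSE divided by the observed range (an assumed normalization)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return float(rmse / (observed.max() - observed.min()))

def relative_gain(weighted, unweighted):
    """Percent change of a weighted model relative to the unweighted one."""
    return 100.0 * (weighted - unweighted) / unweighted

# Hypothetical severity scores for five genotypes
observed = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
shifted = observed + 2.0  # same ranking, but every value is biased upward

r_shifted = predictive_ability(observed, shifted)
err_shifted = nrmse(observed, shifted)
```

In this toy case the shifted predictions rank the genotypes perfectly (correlation of 1) yet carry an nRMSE of 0.5, which is why the two metrics are reported side by side. The helper `relative_gain` shows how a percent improvement over an unweighted model would be computed from two predictive abilities.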

Among the methods tested with the single-trait model, the
improvement from FST-w was population-specific,
with no improvement observed in the average predictive
ability within the SSD Tropics population for either leaf or
panicle blast severity. This finding shows that population
differentiation alone may not sufficiently capture the genetic 
architecture of blast resistance to improve genomic prediction
accuracy. However, FST-w caused a modest increase in
the predictive ability in the 3K population, ranging from 4.1
to 4.8% compared to the unweighted models. Chang et al.
(2019) reported a comparable improvement, a 5% gain
in prediction accuracy, when applying FST-based
weighted GBLUP in animal breeding. Nevertheless, while
this method may enhance prediction accuracy for differ-
ent traits, our results demonstrate that its effectiveness 
was inconsistent across diverse genetic backgrounds. This 
limitation is particularly apparent in cases of overlapping 
ancestral populations, where allele frequencies are similar 
across subpopulations, and the genetic variation between 
subpopulations is relatively low, as observed in the SSD 
Tropics population. For all traits, subpopulations in the SSD 
Tropics were driven by family structure, and within-family 
genetic variance was higher than between-family variance.
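
Why similar allele frequencies leave FST-based weights uninformative can be sketched as follows. The example uses the textbook Wright estimator, FST = (HT - HS) / HT, computed per marker; this estimator, the function name `per_marker_fst`, and the allele frequencies are assumptions for illustration and may differ from the estimator used in this study.

```python
import numpy as np

def per_marker_fst(freqs, sizes):
    """Wright's FST per marker from subpopulation allele frequencies.

    freqs: (n_subpops, n_markers) allele frequencies
    sizes: (n_subpops,) sample sizes, used as subpopulation weights
    """
    freqs = np.asarray(freqs, dtype=float)
    w = np.asarray(sizes, dtype=float)
    w = w / w.sum()
    p_bar = w @ freqs                         # pooled allele frequency
    h_t = 2.0 * p_bar * (1.0 - p_bar)         # total expected heterozygosity
    h_s = w @ (2.0 * freqs * (1.0 - freqs))   # mean within-subpop heterozygosity
    with np.errstate(divide="ignore", invalid="ignore"):
        # Monomorphic markers (h_t == 0) get an FST of zero
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)

# Hypothetical frequencies: marker 1 is nearly identical in both
# subpopulations, marker 2 is strongly differentiated
freqs = np.array([[0.50, 0.10],
                  [0.52, 0.90]])
fst = per_marker_fst(freqs, sizes=[100, 100])
```

Here the first marker receives an FST close to zero and the second an FST of 0.64. When most markers resemble the first one, as would be expected with overlapping ancestral populations, the resulting weights are nearly flat and carry little trait-related information.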

In both populations, the application of AE-w
and − log10(p)-w resulted in consistently higher predictive
abilities and lower nRMSE for all traits compared to
the unweighted models and the FST-w method. Across traits
and populations, the highest improvements relative to 
the unweighted models were 37.1% and 32.3% for AE-w 
and − log10(p)-w, respectively. This demonstrates the supe-
riority of AE-w and − log10(p)-w as weighting methods for 
optimizing prediction accuracy for rice blast resistance. 
Marker effect-based weighting was also used by Strandén 
and Jenko (2024) and Ren et al. (2021), who found improve-
ments in the prediction accuracy for complex traits. Moreo-
ver, the results highlight the high potential of genome-wide 
association statistics for increasing genomic prediction 
accuracy through the integration of marker significance 
levels and additive effects into the prediction model. Su
et al. (2014) also reported the superiority of P value-derived
weights over other weighting methods. Similarly, Spindel
et al. (2016) demonstrated that the integration of GWAS
into genomic prediction using ridge regression BLUP
(RR-BLUP), an alternative to the GBLUP model, improved the
prediction accuracy for several agronomic traits in rice. Zhang
et al. (2023) incorporated a P value-derived weight matrix
into the RR-BLUP model and reported a significant improvement
in the predictive ability for agronomic traits in rice. Unlike
population-specific or trait-specific methods, the GWAS-
based methods show broader applicability and could capture 
genomic signals more effectively across diverse backgrounds 
and trait architectures. Moreover, AE-w and − log10(p)-w 
effectively capture genome-wide signals associated with 
both large- and small-effect (polygenic background) loci. 
In contrast, FST-w primarily reflects population differentiation
and may not target loci that directly influence the
trait of interest, which likely explains its consistently lower
performance.
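
How GWAS-derived weights enter a relationship matrix can be sketched under stated assumptions. The example below uses a VanRaden-type weighted matrix, G_w = Z D Z' / sum(2 p q d), with squared additive effects or -log10(p) values on the diagonal of D. The exact scaling used in this study may differ, and the genotypes, effects, and P values are simulated placeholders rather than real data.

```python
import numpy as np

def weighted_grm(genotypes, weights):
    """Weighted genomic relationship matrix (assumed VanRaden-type form).

    genotypes: (n_lines, n_markers) allele dosages coded 0/1/2
    weights:   (n_markers,) non-negative marker weights, not all zero
    """
    M = np.asarray(genotypes, dtype=float)
    d = np.asarray(weights, dtype=float)
    p = M.mean(axis=0) / 2.0                  # allele frequency per marker
    Z = M - 2.0 * p                           # centered dosages
    scale = np.sum(2.0 * p * (1.0 - p) * d)   # weighted sum of 2pq
    # Multiplying d by a constant cancels between numerator and scale
    return (Z * d) @ Z.T / scale

def ae_weights(effects):
    """Squared additive effects as weights (AE-w style)."""
    return np.asarray(effects, dtype=float) ** 2

def logp_weights(p_values):
    """-log10(p) as weights (-log10(p)-w style)."""
    return -np.log10(np.asarray(p_values, dtype=float))

# Simulated placeholders: 6 lines, 20 markers
rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(6, 20))
effects = rng.normal(size=20)             # hypothetical GWAS effect estimates
p_values = rng.uniform(1e-6, 1.0, 20)     # hypothetical GWAS P values

G_equal = weighted_grm(M, np.ones(20))    # reduces to the unweighted matrix
G_ae = weighted_grm(M, ae_weights(effects))
G_logp = weighted_grm(M, logp_weights(p_values))
```

With all weights equal, the function returns the ordinary unweighted relationship matrix, so the weighted variants differ only in how much each marker contributes to the relationships among lines. Because both weight types are computed for every marker, large-effect loci are emphasized while small-effect loci still contribute.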

Despite the similarity of the unweighted models’ perfor-
mance between populations, particularly for panicle blast 
severity, our results showed that the predictive abilities 
with AE-w and − log10(p)-w were up to 26.9% higher in 
the 3K population relative to the SSD Tropics population. 
Across traits, the maximum within-population improve-
ments achieved by AE-w and − log10(p)-w relative to the 
unweighted models