Procedures for the evaluation of sweetpotato trials  
MANUAL 
 
© International Potato Center, 2019 
 
ISBN: 978‐92‐9060‐522‐5  
DOI: 10.4160/9789290605225 
 
Sweetpotato Research Guides (SPRGs): This document describes technologies that have been developed and used by
CIP in cooperation with national agriculture research programs to promote research and the exchange of information 
among sweetpotato scientists. Procedures and forms for data collection described here are now being used in 
integrated electronic tools such as Highly Interactive Data Analysis Platform for Breeding (HIDAP) and Sweetpotato 
base for the design of trials, and collection, archiving and analysis of data. 
 
CIP publications contribute important development information to the public arena. Readers are encouraged to quote or 
reproduce material from them in their own publications. As copyright holder CIP requests acknowledgement and a copy 
of the publication where the citation or material appears. Please send this to the Communications Department at the 
address below. 
 
Correct citation: 
Grüneberg, W.J.; Eyzaguirre, R.; Diaz, F.; Boeck, B. de; Espinoza, J.; Mwanga, R.O.M.; Swanckaert, J.; Dapaah, H.; 
Andrade, M.; Makunde, G.; Tumwegamire, S.; Agili, S.; Ndingo‐Chipungu, F.P.; Attaluri, S.; Kapinga, R.; Nguyen, T.; 
Kaiyung, X.; Tjintokohadi, K.; Ssali, R.T.; Carey, T.; Low, J. 2019. Procedures for the evaluation of sweetpotato trials.
Manual. Lima (Peru). International Potato Center (CIP). ISBN: 978‐92‐9060‐522‐5. 77p.
 
International Potato Center 
P.O. Box 1558, Lima 12, Peru 
cip@cgiar.org • www.cipotato.org 
   
Produced by the CIP Communications Department 
 
May 2019 
 
CIP also thanks all donors and organizations which globally support its work through their contributions to the CGIAR Trust Fund. 
https://www.cgiar.org/funders/ 
© February 2019. International Potato Center. All rights reserved. 
This work by the International Potato Center is licensed under a Creative Commons Attribution 4.0 International (CC BY 4.0)  
To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/. Permissions beyond the scope of this license may be available at: 
http://www.cipotato.org/contact/ 
 
 
Acknowledgements 
This work was made possible by funding from HarvestPlus, the Bill & Melinda Gates Foundation, the 
United States Agency for International Development, and the CGIAR Research Program on Roots, Tubers 
and Bananas. 
 
 
Table of Contents
Acronym list
1. Introduction
1.1. Check varieties
1.2. Accelerated breeding scheme (ABS)
2. Procedures
2.1. Multiply clones for trial and maintain their identities
Verify and maintain the identity of clones during the process of multiplication and evaluation
2.2. Trial types and selection schemes
Early breeding stages
Later breeding stages
Farmer Participatory Variety Selection (FPVS) – Later Breeding Stages
On‐farm trials (OFTs)
Identification of local partner(s) and areas for on‐farm trial
Identification of farmers or farmers’ groups
Planning for the trials with farmers
Planting the trial
Monitoring the trial
SPVD assessment and 1st Weeding
Leaf taste‐test evaluation
Final evaluation
2.3. Data analysis, selection of clones and reporting of results
3. Description of data collection forms and instructions for their use
3.1. Data collection
Form 1A and 1B. Sweetpotato OT (general information) – Appendix 1
Form 2. Sweetpotato OT (data sheet)
Form 3A and 3B. Sweetpotato PT and AT (general information) – Appendix 2
Form 2B. Sweetpotato genotypes in trial
Forms 2C, D and E – including classification variables
Form 2C. Pre‐harvest data sheet
Form 2D. Sweetpotato harvest
Form 2E. Sweetpotato quality
Form 5A. Sweetpotato farmer participatory field evaluation
3.2. Derived variables
4. Suggestions for data analysis and clonal selection
4.1. Statistical program packages
Randomization of field trials and random number generators
PLABSTAT (Plant Breeding Statistics)
SAS (Statistical Analysis Software)
R (no information why R is called R)
HIDAP (Highly Interactive Data Analysis Platform)
4.2. Data analysis example
Data set
Model
4.3. Computations for our example using PLABSTAT
PLABSTAT input
PLABSTAT output
4.4. Computations for our example using SAS
SAS input
SAS output
4.5. Multiple comparison procedures in plant breeding
4.6. Computations for our example using R
R input
R output
4.7. Suggestions for selection in ATs
ATs
ETs (these would correspond to second stage ATs)
5. References
6. Appendices
Appendix 1. Sweetpotato observational trial
Appendix 2. Sweetpotato preliminary PT and advanced trial AT
Appendix 3. Soil groups in Africa as classified by the FAO
Appendix 4. (In Process)
Appendix 5. Sweetpotato trials farmer
  
 
 
Acronym list 
ABS  Accelerated breeding scheme 
ANOVA  Analysis of variance 
AT  Advanced yield trial 
CGIAR  Consultative Group for International Agricultural Research 
CIP  International Potato Center 
CRD  Completely randomized design 
CSIR‐CRI  Council for Scientific and Industrial Research – Crops Research Institute 
ET  Elite trials 
FPVS  Farmer participatory variety selection 
IIAM  Mozambique Institute of Agricultural Research 
LSD  Least significant difference 
NaCRRI  National Crops Resources Research Institute in Uganda 
NIRS  Near infra‐red spectrometry 
NT  National trial 
OFT  On‐farm trial 
OT  Observational yield trial 
PLABSTAT  Plant Breeding Statistics  
PT  Preliminary yield trial 
RCBD  Randomised complete block design 
SAS  Statistical Analysis Software 
SPVD  Sweetpotato virus disease 
UT  Uniform trial 
 
 
1. Introduction 
Breeding programs involve large investments of time and money, but can pay very large returns on 
investment in the form of improved varieties that benefit farmers, societies and the environment. 
International breeding efforts involving multiple partners and targeting regionally important constraints 
have great potential for efficiently and rapidly achieving impact. Standardised information on the 
performance of progenies and selected clones across environments assists breeders to efficiently make 
decisions about selection and variety releases. Standardised methods also facilitate the reporting of 
breeding program results to the agencies that support us. This manual of procedures for the evaluation 
and analysis of sweetpotato trials provides standard methods for partners in CIP’s global breeding efforts. 
This manual is the result of an iterative process involving discussions among breeders at a series of 
meetings held over the years 2008–2010, starting with support from the HarvestPlus program of the 
CGIAR and continuing under the Sweetpotato for Profit and Health Initiative. This manual is a work in 
progress and is continuously refined in response to 1) the needs of sweetpotato breeders, producers and
consumers and 2) advances in breeding knowledge, tools and techniques. We are excited about the
application of new methods to sweetpotato breeding, including an accelerated breeding approach that
will shorten cycles of recurrent selection and can lead to the release of new varieties in 3–4 years.
Sweetpotato breeding will certainly be further modernised in the future (for example, CIP‐HQ will start to
use so‐called p‐rep designs and row‐column designs in later breeding stages in 2019), but we think the
basis of the standardised methodology and reporting described here will be maintained.
This manual is divided into six sections. First, this introduction provides a brief discussion of some of the 
key principles for our sweetpotato breeding effort, including the need for check varieties and an overview 
of the Accelerated Breeding Scheme. In section 2 we describe and discuss the standard trialling stages 
used in the breeding programs. In section 3 we present the standard data forms used in the trials, 
providing examples of completed forms. In section 4, we provide a brief introduction to the analysis of 
data from selected trials using the PLABSTAT, SAS and R statistical packages. Section 5 presents references for
further reading, and section 6, the Appendices, provides sets of blank forms which partners may copy
for use in their trials at different breeding stages. Procedures described here are reviewed and updated
periodically by members of the sweetpotato breeding community. Procedures and forms for data
collection described here are now being used in integrated electronic tools such as the Highly Interactive Data
Analysis Platform for Breeding (HIDAP) and Sweetpotato base for the design of trials and the collection,
archiving and analysis of data. Up‐to‐date information may be found at sweetpotatoknowledge.org.
1.1. Check varieties 
Breeding is a process for adapting a crop to human needs. An important component of breeding is the 
selection of new varieties, and this requires a good understanding of the needs of farmers and societies, 
and good biological and statistical knowledge. 
 
 
A variety is always characterised by several traits. A better variety must have good performance over all 
traits and at least in one important trait it must be clearly superior to all other varieties so far available in 
a region. It is not possible to compare a set of new genotypes with all existing varieties across all target 
environments. Therefore we evaluate and compare new genotypes with important standard varieties 
(check varieties) in important environments (check environments). The check environments should be 
representative of the region we are aiming at. 
Selection of new varieties for a region requires comparing new genotypes with check varieties. This is a 
very complex task, and may involve many partners working in different places throughout the region. 
Only what is comparable can be compared! If we evaluate different traits or use different procedures to 
evaluate these traits in different trials, we cannot compare the performance of genotypes across trials. 
The same is true if we use different sets of new genotypes and different check varieties. Only commonly 
measured genotypes and check varieties can be compared. We distinguish two designs which allow us to 
compare results of trials: (i) the complete design in which all genotypes are commonly tested across all 
environments and (ii) the incomplete design in which only a fraction of all genotypes are tested across all 
environments. If, in the incomplete design, there are no or only very few genotypes (1–6 genotypes) 
commonly tested across all environments, it is not possible to make meaningful comparisons among 
trials, but it is at least possible to compare genotypes relative to commonly used checks (2–4 checks).
Hence there needs to be agreement among breeders about: (i) the most important traits to be evaluated, 
(ii) standardised procedures to record these traits and (iii) commonly used check varieties (2–4 checks). 
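The distinction between complete and incomplete designs can be illustrated with a short Python sketch. It finds the entries (new genotypes and checks) that were tested in every trial and reports whether the set of trials forms a complete design. The site names, entry labels and function names are hypothetical and serve only as an illustration; they are not part of the CIP procedures.

```python
from typing import Dict, Set


def common_entries(trials: Dict[str, Set[str]]) -> Set[str]:
    """Return the entries (genotypes or checks) tested in every trial."""
    entry_sets = list(trials.values())
    if not entry_sets:
        return set()
    return set.intersection(*entry_sets)


def classify_design(trials: Dict[str, Set[str]]) -> str:
    """'complete' if every entry is tested in every trial, else 'incomplete'."""
    all_entries = set.union(*trials.values()) if trials else set()
    return "complete" if common_entries(trials) == all_entries else "incomplete"


# Hypothetical example: three sites sharing only the two check varieties.
trials = {
    "Site_A": {"G1", "G2", "G3", "Check1", "Check2"},
    "Site_B": {"G1", "G4", "G5", "Check1", "Check2"},
    "Site_C": {"G6", "G7", "Check1", "Check2"},
}

print(sorted(common_entries(trials)))  # ['Check1', 'Check2']
print(classify_design(trials))         # incomplete
```

In this invented example only the two checks are common to all sites, so new genotypes can be compared across trials only through their performance relative to those checks.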
1.2. Accelerated breeding scheme (ABS) 
This method, which can be used for both pre‐breeding (population improvement with selection of new 
parents) and varietal selection, is illustrated in Figure 1. It can allow for completion of a selection and 
recombination cycle (in the case of population improvement) in two years, or for the selection and 
release of a new variety within 4–5 years. The main features of the approach are to use multiple selection 
sites from the initial stage of selection, and to minimise replication (a maximum of two replications per 
trial is used) at each site so as to conserve resources while obtaining information on the stability of
genotypes being tested. The trial stages for which we present forms in this manual are the observational 
yield trial (OT), the preliminary yield trial (PT) and the advanced yield trial (AT). An important feature of 
our breeding and cultivar selection method is the use of farmer participation, which provides essential 
input to the breeding and selection process. Here, we detail the use of farmer input at the AT stage and in 
on‐farm trials (OFTs). 
 
 
 
[Figure 1 flow chart: Crossings (Year 1) → True seed parents and multiplication (Year 1) → A‐clones, at least 2 locations (Year 2, no plot replication) → B‐clones, at least 3 locations (Year 3, plot replications) → C‐clones, at least 5 locations (Year 4, plot replications) → Propagation]
 
 
Figure 1. Accelerated Breeding Scheme (ABS) implemented by CIP and partners for faster clone selection and short
recurrent selection cycles. A‐clones are tested in observational yield trials (OTs) directly at two or more locations in
1‐m row plots without plot replications, and promising B‐clones are tested in preliminary yield trials (PTs) at three or
more locations with plot replications. This is followed by testing advanced clones (C‐clones) in advanced yield trials
(ATs) at five or more locations with plot replications, which is linked to the first stage of variety release testing
and propagation.
 
 
 
2. Procedures 
2.1. Multiply clones for trial and maintain their identities 
Sweetpotato clones for trialling may be newly derived from seedling populations or may be important 
varieties or promising selections from other breeding programs, which have been introduced as 
pathogen‐tested in vitro plantlets. Within regions, sweetpotato clones may be moved as cuttings, 
following approved quarantine procedures. Quarantine procedures may slow the exchange of breeding 
material, but are important. Who wants to become famous by introducing new pests or pathogens into 
environments where these have so far not been present? 
Clones have to be multiplied to produce planting materials for initial trials. Locally important varieties and
standard check varieties should be included in multiplication plots to provide uniform planting materials
for trials. Planting material of all the genotypes for any trial should come from a single source, and the 
health status of genotypes in the trial should be similar. Often a common health status among clones is 
difficult to achieve. For example, local clones might not be virus free, whereas introduced clones are 
obtained pathogen free. In such a case the effect of the genotype is confounded with its health status. 
Similar problems might occur even if you are not working with introduced clones. Standard check varieties 
might have been used for a long time, without being renewed from a source of pathogen‐free planting 
material. In contrast, newly developed clones are young. It should be noted that the health status does 
not only affect yield related traits. The same clone can differ morphologically if it is pathogen‐free or 
infected. Therefore the multiplication of check varieties should, to the extent possible, routinely make use 
of pathogen‐free mother plants. These mother plants are maintained in greenhouses and are routinely 
checked to be pathogen‐free. A mother plant in which a pathogen has been detected must be 
immediately removed from the greenhouse. Mother clones with pathogens have to be cleaned up or 
replaced from a pathogen‐free in vitro source. 
Verify and maintain the identity of clones during the process of multiplication and evaluation 
The question most frequently asked by professors in plant breeding to PhD students is “Is genotype 
number 1 still number 1 or is it perhaps already number 2?” Many plant breeders have encountered 
surprising results, only to realise that a mistake in the labelling of the genotypes must have happened, 
leading to a mix‐up in identities. Such mistakes can be drastically reduced by giving clones both numbers 
and names. Mistakes in numbers occur more frequently than mistakes in names. If a genotype has no 
name, give the clone a ‘code name’. A code name can easily be formed by the family the clone is traced 
from [six digits: three digits for the father and three digits for the mother (in cases where the clone is 
derived from polycrosses put 000 for the father)] and then – separated by a decimal point – the clone 
number within the family. It is very unlikely that a mistake in the clone number and in the code name will 
happen at the same time and in the same direction, so you can usually quickly identify mistakes. It should 
be noted that mistakes cannot be completely avoided; however, the problem starts when you cannot 
identify mistakes or when you cannot identify them rapidly. 
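The code‐name rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: parents are assumed to be identified by integer numbers from 1 to 999, the father digits are written before the mother digits (the order in which the text lists them), and the clone number is not zero‐padded. The function names are hypothetical.

```python
from typing import Optional, Tuple


def clone_code_name(mother: int, clone: int, father: Optional[int] = None) -> str:
    """Return a family-based code name such as '012045.7'.

    father=None marks a polycross (father unknown) and is written as 000.
    """
    if father is not None and not 1 <= father <= 999:
        raise ValueError("father number must be between 1 and 999")
    if not 1 <= mother <= 999:
        raise ValueError("mother number must be between 1 and 999")
    if clone < 1:
        raise ValueError("clone number must be positive")
    father_part = 0 if father is None else father
    # Six family digits (father, then mother), a decimal point, the clone number.
    return f"{father_part:03d}{mother:03d}.{clone}"


def parse_code_name(code: str) -> Tuple[Optional[int], int, int]:
    """Split a code name back into (father or None, mother, clone number)."""
    family, _, clone = code.partition(".")
    if len(family) != 6 or not family.isdigit() or not clone.isdigit():
        raise ValueError(f"not a valid code name: {code!r}")
    father, mother = int(family[:3]), int(family[3:])
    return (None if father == 0 else father), mother, int(clone)


print(clone_code_name(mother=45, clone=7, father=12))  # 012045.7
print(clone_code_name(mother=45, clone=7))             # 000045.7
```

Parsing the code name of a label and comparing the result with the recorded clone number is one simple way to spot the kind of mismatch described above.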
 
 
Finally we want to mention that the identity of clones can be confirmed by morphological characteristics, 
descriptor lists, and molecular markers. Standard trait descriptors are maintained by the Crop Ontology 
Curation Tool (cropontology.org). However, these are usually only available for clones maintained in gene
banks. The finally selected clones must be morphologically described and distinguishable from other 
clones to allow a registration as a new variety. If published and observed descriptors (including 
pigmentation of foliage and roots, and especially leaf shape) do not coincide (in other words, if you 
observe two different types for the same clone) both clones can still be entered in trials – in this case, the 
clones must be renamed to distinguish them. You should use the original name with an extension 
corresponding to where the morphologically different type has been observed for the first time (e.g. 
Jonathan‐L for the original clone described and maintained at the sweetpotato gene bank in Lima, and 
Jonathan‐M for a clearly distinguishable Jonathan first observed in Maputo). Since sweetpotato has a 
tendency to mutate, it may well be that the new type is not the result of a mix‐up, but could be a new 
Jonathan even superior to the original Jonathan. A striking example of this is a mutation in Resisto,
released as “New Resisto (BARI SP‐12)” in Bangladesh; this mutant is less infested by sweetpotato
weevil.
2.2. Trial types and selection schemes 
Examples of sweetpotato selection schemes are provided by Hahn (1982), Martin (1983), Jones et al. 
(1986), Wilson et al. (1989), Kukimura et al. (1990), Saladaga et al. (1991), Tan et al. (2007) and Grüneberg 
et al. (2009a). Sweetpotato breeding has been reviewed by Martin and Jones (1986), Laurie and van den 
Berg (2002), Tan et al. (2007), Grüneberg et al. (2009b), Lebot (2010) and Grüneberg et al. (2015). Each 
sweetpotato variety is a highly heterozygous hybrid and we think that the use of hybrid breeding methods 
in sweetpotato breeding has merit (for a discussion of this topic see Grüneberg et al. 2009b). This might 
lead to changes for sweetpotato population improvement in the future, but not to changes in variety 
development and selection. 
In formal plant breeding we distinguish between OTs, PTs and ATs. ATs are occasionally designated as
uniform trials (UTs), national trials (NTs) or varietal trials (VTs), but the procedures used in all of these are usually
the same. Therefore, we refer to all of them as ATs. Formal plant breeding (on‐station breeding) has
been criticised for being slow to develop better varieties for resource‐poor farmers. For this reason CIP 
supports formal plant breeding programs which involve farmers by (i) farmer participatory variety 
selection (FPVS, in later as well as early breeding stages) and (ii) on‐farm trials (later breeding stages). 
Early breeding stages 
In the early breeding stages plants are raised from true seeds. Selection of single true seedling plants may 
not be advisable because measurements on single plants have extremely high variation and plants grown 
from seeds are very different from those grown from cuttings with respect to storage root formation. For 
this reason, evaluations of true seed plants are limited to a few highly heritable traits such as 
susceptibility to pathogens or storage root flesh colour. Genotypes selected among true seed plants enter
OTs. In some cases, cuttings are taken from seedlings and initial selection on the basis of storage root 
formation or disease reaction is done during multiplication for OT. 
The OT belongs to the early breeding stages. OTs serve to select material for later breeding stages
(varietal selection) and, on the basis of offspring‐parent analysis, to select the best parents (“family
makers”) for the next breeding population (population improvement). The breeder has to evaluate many
genotypes (several thousands) grown from cuttings of true seed plants. Most of the genotypes grown in 
OTs clearly do not meet the lowest acceptable value in at least one character. OTs are also recommended 
for clones introduced from other regions of the world (i) to get a better understanding of how to handle 
foreign materials and (ii) to discard clones which clearly do not meet minimum acceptable values in the 
new region.  
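Discarding genotypes that fail the lowest acceptable value in at least one character can be sketched as follows. The trait names, threshold values and scores are invented for illustration (this manual does not prescribe traits for OTs), and the sketch assumes that every trait is scored so that higher values are better.

```python
from typing import Dict, List, Tuple


def passes_minimums(record: Dict[str, float], minimums: Dict[str, float]) -> bool:
    """True if the genotype reaches the lowest acceptable value for every trait."""
    return all(record[trait] >= floor for trait, floor in minimums.items())


def cull(entries: Dict[str, Dict[str, float]],
         minimums: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split entries into (kept, discarded) according to the minimum values."""
    kept, discarded = [], []
    for name, record in entries.items():
        (kept if passes_minimums(record, minimums) else discarded).append(name)
    return kept, discarded


# Hypothetical minimum acceptable values (higher is better for each trait).
minimums = {"virus_resistance_score": 5, "dry_matter_pct": 25}

# Hypothetical OT records.
entries = {
    "G1": {"virus_resistance_score": 7, "dry_matter_pct": 30},
    "G2": {"virus_resistance_score": 3, "dry_matter_pct": 32},  # fails on virus score
    "G3": {"virus_resistance_score": 6, "dry_matter_pct": 22},  # fails on dry matter
}

kept, discarded = cull(entries, minimums)
print(kept)       # ['G1']
print(discarded)  # ['G2', 'G3']
```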
In the sweetpotato breeding program at CIP we plant OTs without replication, in 1‐m single‐row plots 
comprising three plants. Each clone is planted at two or more locations, and each location is treated as a
replication of a randomised complete block design (RCBD). Each OT is bordered by guard rows on all four 
sides of the trial to provide competition to all entries. The name OT might seem to imply that only visual 
observations are made at this stage of selection. However, an OT grown across several locations merits 
the recording and analysis of highly heritable traits. Formerly, CIP planted OTs in single rows comprising
10 plants at only one location. Other breeders conducted the OT so that each genotype was evaluated on
a single‐plant basis – for example for sweetpotato virus disease (SPVD) resistance or storage root flesh
colour. In the new OT design at CIP it is possible to observe the stability of genotypes across locations and 
to separate the genotype by environment interaction effect from the genotype effect. Soil heterogeneity
is adjusted for by two check clones planted in a grid across the field, as suggested by Westcott (1981).
Heritability (h²) estimates in the new OT show that the harvest index and several storage root quality
characteristics (i.e. dry matter, protein, starch, sugars and pro‐vitamin A concentrations) can be evaluated
in such an OT design with sufficient precision for selection (h² > 0.6). To our surprise we observed
significant low to medium heritabilities (h² about 0.4–0.5) for storage root yield. Visual selection for
storage root size, shape and form – recorded using a single rating scale of 1 to 5 – showed a significant 
correlation with yield measurements in kg per plot in the new OT design. This can be explained by the 
extremely large genetic variation for storage root yield in early breeding stages due to very high 
segregation in hexaploid sweetpotato. However, this allows us only to discard genotypes with poor yield 
performance since discrimination among clones with medium to high yield performance is only possible in 
larger plots (see PTs and ATs). 
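This section does not state how the heritability estimates were computed. One common approach, shown here only as a sketch under that assumption, treats each location as one block of an RCBD, estimates the genotypic variance from the ANOVA mean squares, and expresses heritability on an entry‐mean basis as h² = σ²G / (σ²G + σ²ε / L), where L is the number of locations and σ²ε is the residual variance (genotype by location interaction plus error, which cannot be separated without plot replication). The data values and the function name are invented for illustration.

```python
from typing import Dict, List


def ot_heritability(table: List[List[float]]) -> Dict[str, float]:
    """Entry-mean heritability from a balanced genotype x location table.

    table[i][j] is the value of genotype i at location j (no missing values).
    """
    g, l = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (g * l)
    geno_means = [sum(row) / l for row in table]
    loc_means = [sum(table[i][j] for i in range(g)) / g for j in range(l)]

    # Sums of squares for genotypes and for the residual (G x L plus error).
    ss_geno = l * sum((m - grand) ** 2 for m in geno_means)
    ss_res = sum(
        (table[i][j] - geno_means[i] - loc_means[j] + grand) ** 2
        for i in range(g) for j in range(l)
    )
    ms_geno = ss_geno / (g - 1)
    ms_res = ss_res / ((g - 1) * (l - 1))

    # Variance components; a negative genotypic estimate is set to zero.
    var_g = max((ms_geno - ms_res) / l, 0.0)
    h2 = var_g / (var_g + ms_res / l) if var_g > 0 else 0.0
    return {"ms_geno": ms_geno, "ms_res": ms_res, "var_g": var_g, "h2": h2}


# Hypothetical dry matter values (%) for four genotypes at two locations.
dry_matter = [
    [30.0, 32.0],
    [25.0, 26.0],
    [35.0, 36.0],
    [28.0, 31.0],
]
result = ot_heritability(dry_matter)
print(round(result["h2"], 2))
```

With positive genotypic variance this estimate equals 1 − MS(residual) / MS(genotypes), so it can be read directly from an ANOVA table.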
The OTs in the sweetpotato breeding program at CIP Headquarters are directly used to select parents for
crosses in hybridisation blocks. Hence the breeding program operates with very short recurrent selection
cycles to improve breeding populations. Each year about 200 to 400 genotypes are selected from OTs (i)
to use for crosses for population development and (ii) to enter into PTs for variety development and
selection. All crosses are made as controlled crosses. In contrast to polycross
nurseries, this results in more balanced seed production per parent, since both parents can be controlled. 
Selection theory tells us that controlled crosses are more efficient than polycrosses. However, 
considerably fewer seeds can be produced using controlled crosses – and current research aims to
compare progress using both approaches. Our crosses are carried out in a factorial design (the best set of 
parents with the rest), in which about 6 of the best clones are used as females and crossed with the remaining
genotypes (200) as males. About 1/3 of all cross combinations result in no or low seed set, and so each 
year about 800 families with about 20–40 seeds per family are developed for the OTs of the next 
recurrent selection cycle. 
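The arithmetic behind these figures can be checked with a short sketch. It is only an illustration: the function name and the use of Python are our assumptions, and the inputs are the approximate values quoted above (6 females, 200 males, about one third of combinations failing, 20–40 seeds per family).

```python
from fractions import Fraction

def expected_families(n_females, n_males, failure_rate):
    """Families expected from a factorial crossing scheme.

    Every female is crossed with every male; a fraction of the
    combinations (failure_rate) gives no or low seed set.
    """
    n_combinations = n_females * n_males
    successful = n_combinations * (1 - Fraction(failure_rate))
    return n_combinations, int(successful)

# Approximate values quoted in the text (illustrative only).
combinations, families = expected_families(6, 200, Fraction(1, 3))
seed_range = (families * 20, families * 40)

print(combinations)  # 1200 cross combinations
print(families)      # 800 families
print(seed_range)    # (16000, 32000) seeds for the next OTs
```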
To summarise, OTs are characterised by a very large number of genotypes evaluated in very small plots 
without replications. The OT can be carried out at one location or at several locations and environments. 
The design of the OTs depends on the priorities of specific regions. For example, in regions of high virus 
pressure the breeder has first to eliminate all genotypes which show insufficient virus resistance, whereas 
in a drought‐prone region the breeder first has to eliminate all genotypes with poor vine survival. 
Here we do not prescribe traits to record in OTs, because these depend on the region, the country and 
the major breeding objectives. However, we aim to obtain general OT trial information, as well as 
information for the recorded traits in selected clones, which will allow us to group breeding programs into 
clusters to support appropriate true seed exchange among them. Moreover, we request recording the 
parents and clone numbers of selected clones. The reason for this is that it is common knowledge among 
breeders that good clones often trace back to a few cross combinations. The record of parents and clone 
numbers of selected clones will allow us to determine the frequency of selected clones among parental 
combinations and to determine which promising crosses should be repeated on a larger scale. For details 
on record keeping for OTs, see section 3. Data collection forms for OTs are provided in Appendix 1: Forms
1A, 1B and Form 2. 
Later breeding stages 
Clones selected in OTs enter into later breeding stages or into variety selection and development. These 
later breeding stages comprise PTs and ATs. In PTs and ATs the same characters are recorded, but on a 
different plot size basis. Note that the ATs should be recorded on a plot size basis identical with that 
required for official variety release – this is country specific, and often requires two seasons of data. UTs, 
NTs, VTs and elite trials (ETs) are names also used for these advanced ATs. Usually these have the same 
plot size and the same traits are recorded as for ATs. It should be noted that ETs are used for those finally 
selected clones to be tested against a group of check clones for variety release. In this case we distinguish 
between ATs and ETs, but both have the same plot size and the same traits are recorded. They differ in 
the number of clones to be tested. Usually in ETs a smaller number of clones are tested, so that the test 
precision/power is larger compared to ATs (see 4.1. Statistical program packages and 4.5. Multiple 
comparison procedures in plant breeding). 
The PT is normally carried out in two‐row plots, 30 plants per plot (i.e. 15 plants per row) and two plot 
replications. The PTs are planted in a RCBD – i.e. replications of clones are planted in blocks and in each 
block all genotypes are randomised. Note that the RCBD may in future be replaced by so‐called row‐column
designs, which adjust for soil heterogeneity in two dimensions (in a RCBD the blocks can only be arranged on the
field in either the row or the column direction). Single‐row plots should not be used due to inter‐plot competition
(border effects due to neighbour plots within a block). Border effects are assumed to be large in 
sweetpotato, due to the large genetic variation for the upper biomass production among sweetpotato 
clones. The coefficient of variation for the storage root yield error‐term in a single PT (two rows, 15 plants 
per row) is typically very large. In our breeding programs this is in the range of 28–52% and would 
probably be larger if we used single rows. 
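For reference, the coefficient of variation of the error term is conventionally computed as the square root of the error mean square from the ANOVA, divided by the trial mean and expressed as a percentage. The sketch below applies this standard definition; the function name and the numbers are hypothetical and are not taken from CIP trials.

```python
import math

def error_cv_percent(error_mean_square, trial_mean):
    """CV (%) of the error term: 100 * sqrt(error mean square) / trial mean."""
    if trial_mean <= 0:
        raise ValueError("trial mean must be positive")
    return 100.0 * math.sqrt(error_mean_square) / trial_mean

# Hypothetical PT: error mean square 36 (t/ha)^2, mean yield 15 t/ha.
cv = error_cv_percent(36.0, 15.0)
print(round(cv, 1))  # 40.0, inside the 28-52% range quoted above
```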
The PTs must be carried out in at least three locations and with plot replications, in order to be
considered as a preliminary yield trial (replication can be partial, as in so‐called p‐rep designs, but
note that these require special randomization plans). The clear advantage of conducting PTs at
three or more locations is that this saves time (years/seasons), because in plant trials, temporal variation 
of test environments (years/seasons) can be replaced by spatial variation of test environments (locations). 
Conducting PTs across locations, with two plot replications, allows us to separate the effects due to 
genotypes, genotypes by environment interaction and the plot error for each trait. Furthermore, with 
three locations it is possible to determine stability parameters for each genotype, which must be 
considered as an additional character associated with yield. 
To summarise, since 2008 CIP has aimed to conduct PTs as follows: (i) at least two‐row plots with (ii) at 
least 30 plants per plot, (iii) two plot replications per genotype and (iv) in at least three locations or 
environments (you can generate environments in the same location by using treatments, e.g. irrigation or
fertilisation). The set of attributes and traits to be recorded in PTs is fixed. For details of data to record for 
PTs, see section 3. Data forms for the PTs are provided in Appendix 2: Forms 2A, 2B, 2C, 2D, 2E, 3A and 
3B. However, the recording of additional traits is optional if the breeder thinks one or more traits must be 
recorded for an appropriate selection in the target environment. 
The main question in selection is how many genotypes should be selected. If almost none of the clones in
a trial meet the lowest acceptable value in each trait there is not much choice. However, after good OTs 
most genotypes should meet the lowest acceptable value across all characters. Variety selection is a 
multi‐stage process, and for fixed entries (all genotypes of clonally propagated crops are fixed entries) this 
multi‐stage selection problem has been well solved by selection theory. The results of selection theory 
show for very different selection scenarios [different ratios of variance components for genotypes, 
genotypes by environments and plot error, and different numbers of test capacities (total number of 
possible field plots to be allocated to genotypes, locations and replications)] that at each selection stage 
5–15% of the total number of genotypes should be selected. Moreover, they show that more than 3–4 
selection stages do not result in a significant increase in genetic gain. Hence a three‐step selection (one in 
OTs, one in PTs and one in ATs) is sufficient to identify the most appropriate clones for variety registration 
in a breeding population. The sweetpotato breeders at CIP clearly advocate for a two‐stage selection in 
later breeding stages (one in PTs and one in ATs). 
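As a minimal sketch of what the 5–15% guideline means in practice, the function below returns the smallest and largest number of genotypes to retain at one selection stage. The function name and the example trial size of 300 clones are hypothetical; integer arithmetic is used so that the bounds are exact.

```python
def selection_range(n_genotypes, low_pct=5, high_pct=15):
    """Bounds on the number of genotypes to keep at one selection stage."""
    lowest = -(-n_genotypes * low_pct // 100)   # ceiling division
    highest = n_genotypes * high_pct // 100     # floor division
    return lowest, highest

# Hypothetical PT with 300 clones.
pt_range = selection_range(300)
print(pt_range)  # (15, 45) clones would move on to ATs
```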
However, if the breeding population is still at an unsatisfactory level, there might be no or only very few 
clones which can be recommended for variety registration. In such a case the breeder must allocate more 
resources to population improvement (to increase the variety‐generating ability of the breeding 
population) by (i) conducting more crosses (controlled crosses), (ii) using more parents and (iii) adopting shorter
recurrent selection cycles (one year to recombine parents and one year to select parents that are closer
to the breeding targets). 
The AT is the next selection stage of variety selection. It is usually planted as a RCBD (row‐column designs
have only recently been introduced to sweetpotato breeding), but using larger plots than those in the PT. Our ATs in the
breeding program of CIP have five‐row plots (15 plants per row) with 75 plants per plot, and two 
replications per location. The ATs are carried out at four or more locations. The coefficient of variation for 
the storage root yield error term in a single AT at CIP is in the range of 25–32%, which is still large 
compared to ATs for grain crops. This shows the potential for improvement in trial designs for 
sweetpotato, and better experimental designs may be developed for sweetpotato in the future (for
example, planting at higher density and then thinning to the desired number of plants per plot, or using
row‐column designs). The result of the selection process in ATs should be 5–8 clones with good
performance over all traits and a clear advantage in at least one trait compared to all sweetpotato 
varieties available in the region. As mentioned before, formal plant breeding has been criticised for not 
being successful at developing better varieties for resource‐poor farmers. For this reason, at least at one 
location, the AT has to be carried out with farmer participation. This eliminates the possibility of 
proposing genotypes for official variety release that are not accepted by farmers. The selected clones (5–8)
are re‐evaluated in the next growing season in a similar design at the same locations and additionally in
more than 10 on‐farm trials (OFTs), which should be linked with the process of official variety release.
To summarise: since 2008 CIP has proposed that ATs be carried out: (i) in plots of three or more rows with
(ii) at least 75 plants per plot, (iii) two plot replications per genotype and (iv) at least four locations. FPVS 
is required in at least one location, and at least 10 OFTs in the final selection stage should be linked to 
official variety release. Instructions for data recording for ATs are given in section 3. Data forms for the 
ATs are provided in Appendix 2: Forms 2A, 2B, 2C, 2D, 2E, 3A, and 3B. Note that the same forms are used 
for ATs as for PTs, but provide space for details of plot layout. Space is also provided for the collection of 
additional attributes. 
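The minimum requirements summarised for PTs and ATs can be written down as a small check, sketched below. The class, the dictionary and the function names are hypothetical, and treating "two plot replications" as a lower bound is our assumption; the example plans are illustrative only.

```python
from dataclasses import dataclass

@dataclass
class TrialPlan:
    rows_per_plot: int
    plants_per_plot: int
    replications: int
    locations: int

# Minimum requirements as summarised in the text for PTs and ATs.
MINIMUMS = {
    "PT": TrialPlan(rows_per_plot=2, plants_per_plot=30, replications=2, locations=3),
    "AT": TrialPlan(rows_per_plot=3, plants_per_plot=75, replications=2, locations=4),
}

def shortfalls(stage, plan):
    """Return the design attributes that fall below the minimum for a stage."""
    minimum = MINIMUMS[stage]
    names = ("rows_per_plot", "plants_per_plot", "replications", "locations")
    return [n for n in names if getattr(plan, n) < getattr(minimum, n)]

def plants_per_genotype(plan):
    """Cuttings needed per genotype across all replications and locations."""
    return plan.plants_per_plot * plan.replications * plan.locations

# The AT layout described for CIP, and a hypothetical undersized PT.
cip_at = TrialPlan(rows_per_plot=5, plants_per_plot=75, replications=2, locations=4)
small_pt = TrialPlan(rows_per_plot=1, plants_per_plot=30, replications=2, locations=2)

print(shortfalls("AT", cip_at))     # []
print(shortfalls("PT", small_pt))   # ['rows_per_plot', 'locations']
print(plants_per_genotype(cip_at))  # 600
```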
Extended PTs and ATs to evaluate vine survival and piecemeal harvest are of special importance in
sweetpotato. Therefore, we provide a brief discussion of these traits and a recommended method to
evaluate them (this procedure might change in the future).
Sweetpotato has the highest food production per unit area per unit time. However, in drought‐prone 
environments, a critical character of sweetpotato is vine survival from harvest to the next planting season. 
Additionally, an important trait of sweetpotato is the ability for use in piecemeal harvest, especially for 
home gardens. Neither trait has been addressed in PTs and ATs to date, because typically the complete 
plot was harvested and no plants remained to determine vine survival and piecemeal harvest quality. We 
propose a design that allows observing these traits, by using extended plots in PTs and ATs. These trials 
are carried out as described above, but with longer rows (about five planting positions per row). The first 
and larger part of the plot is used to record characters as usual. The second and smaller part of the plot is 
used later (2–3 months after the first harvest) to determine vine survival and piecemeal harvest traits. We 
recommend that each partner performs extended PTs and ATs at two locations with two plot replications. 
It should be noted that in drought‐prone regions vine survival and sprouting potential of small roots
determine the acceptance of a variety, since varieties that fail in these attributes will disappear because
no planting material will be available when the rain comes. Moreover, the ability to use a variety for 
piecemeal harvest is one of the most important characteristics at the household level in many places in 
sub‐Saharan Africa; varieties that develop undesirable fibre or taste at later growing stages (five months 
and more) are nearly always rejected by farmers. CIP hopes to come to an agreement with partners that 
these characters be determined in extended PTs and ATs in the future, and we are working to develop 
standard methods for data collection from these trials. 
Farmer Participatory Variety Selection (FPVS) – Later Breeding Stages 
This is an important part of the evaluation of an AT clone. It should be carried out at a minimum of one AT 
location per country. Farmers are invited to give their evaluations and comments on clones in ATs. The 
evaluation is carried out on the basis of frequencies for the assessment of six traits and one overall 
assessment for each variety. The assessment is recorded using colour cards to score each entry in the trial 
(Red = Not acceptable; Yellow = More or less acceptable; and Green = Clearly acceptable) and the overall 
performance of each variety. To assess the genotypes, each farmer obtains 21 colour cards for each 
genotype to be evaluated (one card of each colour for each attribute to be evaluated). As each attribute is 
discussed, the farmer evaluates each genotype by placing one colour card into a bag (based on the degree 
of acceptance for that attribute). There should be separate bags for women’s and men’s votes, or cards 
may be marked with F or M to indicate gender, respectively. Results are tallied using the data collection 
forms provided in Appendix 5: Form 5A.
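The card count and the tallying step can be illustrated with a short sketch. It assumes that the 21 cards correspond to three colours for each of seven assessments (six traits plus the overall assessment) and that cards are marked F or M; the function name and the example votes are hypothetical.

```python
from collections import Counter

# Red = not acceptable, yellow = more or less acceptable, green = clearly acceptable.
COLOURS = ("red", "yellow", "green")
N_ASSESSMENTS = 7  # six traits plus one overall assessment

cards_per_genotype = len(COLOURS) * N_ASSESSMENTS
print(cards_per_genotype)  # 21

def tally(votes):
    """Count cards by gender and colour for one genotype and one attribute."""
    counts = Counter(votes)
    return {
        gender: {colour: counts[(gender, colour)] for colour in COLOURS}
        for gender in ("F", "M")
    }

# Hypothetical bag contents for one genotype and one attribute.
bag = [("F", "green"), ("F", "green"), ("F", "yellow"),
       ("M", "green"), ("M", "red")]
result = tally(bag)
print(result["F"])  # {'red': 0, 'yellow': 1, 'green': 2}
print(result["M"])  # {'red': 1, 'yellow': 0, 'green': 1}
```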
Note: Farmers can also contribute substantially to evaluation in OTs. One location is evaluated together
with a smaller group of farmers who prioritise the crop in their activities. This can be regarded
as farmer participatory breeding, since selection of clones does not only result in material for later
breeding stages; evaluation of clones also provides the important offspring evaluation that determines which
parents shall be used for recombination for the next breeding population (in fact the farmer is breeding
by contributing to the choice of parents). Breeders often stay for many years with the same crop in
the same country. However, breeders also get new assignments, in which the country is new and perhaps also
the crop. In this case it is wise to carry out evaluations in OTs together with a smaller group of farmers.
What matters is not so much what a breeder has thought should be selected, but what he/she has
not thought about. For example, a few years ago only a few breeders were aware of
how important vine strength and vine survival are in making a variety successful, especially in
drought‐prone areas.
On‐farm trials (OFTs) 
ATs at the second stage are carried out at several locations and are used as mother trials for farmer
participatory variety testing in on‐farm trials. OFTs aim to:
• introduce the varieties to users (farmers), as an initial step for variety/technology transfer,
• test performance of promising varieties under farmer growing conditions and researcher‐farmer
management,
• test farmers’ acceptance and ranked preference of the varieties for yield and quality attributes
(including taste tests),
• provide feedback to breeders (in terms of what farmers like in a variety) and
• build farmers’ capacity in variety assessment (experimentation).
Identification of local partner(s) and areas for on‐farm trial 
Selection of areas for on‐farm trials should prioritize capturing the range of different agro‐ecological (rain, 
soil, temperature) and socio‐economic conditions (better‐off and poorer farmers) of the target areas. It is 
important to clarify the objectives, work plans and roles for the on‐farm trials with the local partner(s). 
Identification of farmers or farmers’ groups 
This can be done by the researcher and the local partner or the local partner alone depending on the level
of collaboration and mutual trust. We should aim to have at least ½ of the on‐farm trials with women.
Working with farmer groups that are well organized can accelerate varietal dissemination. Otherwise, it is
better to select individual farmers to conduct the trials. Strive to have at least 10 sites for a given
agro‐ecology.
In selecting farmers, pay attention to the following criteria: 
1. Willingness to host the trial and have visitors come to her/his farm on the evaluation day
2. Sufficient labor and land to undertake the trial under the agreed management approach
3. Location in an accessible area (not too far from a main road)
4. Experience as a sweetpotato grower, and good health
5. Homogeneous soil in the plot used for the trial
6. Whether the farmer has had problems in the past with animal destruction or theft
In some countries, it may be useful to have the farmer sign a contract.
Planning for the trials with farmers 
This is an important step and a meeting should be scheduled with the entire group of farmers or group 
leaders.  
It is important to ensure that the meeting is participatory and helps to generate readiness for the trials
among the farmers. Land for the trial should be identified and modalities for its preparation put in place
and agreed on. Each farmer obtains at least four varieties from the AT and has to assess these varieties 
relative to his/her currently used variety.  
Planting the trial 
The researcher should once again explain the trial objectives and design.  
1) Use a plot of about 30 m² per candidate variety, arranged in 5 rows, each 6 meters long (Fig. 2). Ridges
should be at least 40 cm high. On each row/ridge, vines should be planted approximately 30 cm apart.
Thus 100 cuttings are required per plot. Additional cuttings (depending on the supply of material) may
be planted at the end of the row to use for gap filling.
2) Explain to the farmers: 
a.  The middle 3 rows cannot be harvested during the growing period, as they need to be assessed with
the researcher present to get good measurements of the yields. The farmer will keep all of the roots
except for the 10 roots that the researcher needs for lab assessments and the roots that will be cooked
for the organoleptic assessment.
b. The 1st row on the outside can be used by the farmer for piecemeal harvesting. This row will also be 
used to obtain leaves for evaluating quality when cooked (for countries in which human leaf 
consumption is significant). 
c.  The last row must not be piecemeal harvested, because it will be used to assess in‐ground storability
over a 2‐month period.
Similar plot sizes can be used in areas where farmers plant sweetpotato on mounds. On the
mounds, three vines are planted in a triangular fashion approximately 30 cm apart. The researcher should 
guide but let the farmers plant the vines their own way and replicate with more farmers (4 – 10 farmers) 
depending on the number of groups. You will need 33 mounds, planted with 99 cuttings in total. From the 
middle of one mound to the middle of the next mound, there should be a distance of 1 meter. 
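The planting material requirements quoted above follow directly from the plot dimensions, as the sketch below shows. The function names are hypothetical, and the 1 m distance between rows is taken from the caption of the individual plot layout in Figure 2.

```python
def cuttings_on_ridges(n_rows, row_length_m, spacing_m):
    """Cuttings for a ridged plot: planting positions per row times rows."""
    # Work in whole centimetres to avoid floating-point rounding.
    positions_per_row = round(row_length_m * 100) // round(spacing_m * 100)
    return n_rows * positions_per_row

def cuttings_on_mounds(n_mounds, vines_per_mound):
    """Cuttings for a plot planted on mounds."""
    return n_mounds * vines_per_mound

plot_area_m2 = 5 * 1 * 6  # 5 rows, 1 m apart, each 6 m long
ridge_cuttings = cuttings_on_ridges(5, 6, 0.30)
mound_cuttings = cuttings_on_mounds(33, 3)

print(plot_area_m2)    # 30
print(ridge_cuttings)  # 100
print(mound_cuttings)  # 99
```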
Further explanation must be made of what is expected of the farmers and a schedule of when you will 
come back. 
Figure 2. An illustration of the trial layout (A) and the individual plot layout (B) for on‐farm participatory
variety testing. (A) Example of trial layout: one plot of 5 ridges (6 m long, 5 m wide) for each of Variety 1
to Variety 5 and the Check. (B) Example of individual plot layout (5 rows, 6 m long and 1 m apart): the first
row is for piecemeal harvest by the farmer, the three middle rows are for yield assessment with researchers,
and the last row is for in‐ground storage.
Monitoring the trial 
Monitoring is done by all the stakeholders (researchers, local partners and farmers). The purpose is to: a) 
check on the establishment and ensure timely gap filling; b) ensure timely weeding of the trials by the 
farmers and c) ensure general good progress of the trials. Note that most often monitoring visits are 
combined with evaluation (or data collection) visits. 
SPVD assessment and 1st Weeding 
The first weeding should be done 3 weeks after planting and farmers should be instructed to do so. If 
funds are abundant, a visit can be made at 3 weeks. If not, combine a visit to assess virus incidence and 
weeding at 6 weeks. This assessment will be done by the researcher. However, the farmers and the local
partner should be present so that they can be shown virus symptoms if these occur in the field. Form 5B should
be used for the evaluation of establishment and virus rating.
Leaf taste‐test evaluation 
Three months after planting, leaves or leaves and petioles (depending on local practice) are harvested 
from each candidate variety and prepared for consumption using the local preparation method. While the 
leaves are still on the plant, ask the farmers to evaluate: Will this be good for cooking? (Yes/No). Then ask 
them why. 
Harvest from the border rows so as not to influence the root yield. You should note what local practice is 
in terms of which leaves are selected (size/location) and whether the petiole is also consumed. Leaves 
should be cooked in a simple local fashion to generate relevant results. The prepared leaves are evaluated 
for 1) taste 2) appearance and 3) texture using the color card system described for roots below (use Form 
5C in Appendix 5). Then conduct a pair‐wise comparison of the cooked leaves in order to stimulate 
discussion about the difference between the varieties and to rank them in order of preference. 
Final evaluation 
This is a three‐stage evaluation done at harvesting time.
Stage 1. Quantitative assessment: Two weeks prior to harvest, remove the foliage from the central row 
of each plot in order to evaluate/demonstrate the effectiveness of this practice for pre‐harvest curing. 
Between 4.5 and 5.0 months after planting (depending on normal practice in a given country), the three
middle rows/ridges of each plot are harvested and quantitative data are recorded for the standard harvest
using standard recording forms (Form 5C).
Researchers will keep 5 roots from the middle row (cured) and 5 roots from the 2nd or 4th row to take back 
to the station to evaluate shelf‐life. The shelf‐life evaluation assesses 1) weight 2) sprouting and 3) rotting 
on a weekly basis.  
Stage 2. Participatory field variety evaluation: This is done with farmers using cards to indicate their 
observations on different attributes of each of the test varieties. Farmer assessment of foliage and SPVD 
susceptibility both need to be done before storage root harvest. 
As with FPVS, the evaluation is carried out on the basis of frequencies for six traits and one overall 
assessment for each variety. The farmer ranks the new genotypes relative to the performance of his/her
currently preferred variety for each trait as well as for the overall performance across traits. We
recommend conducting this on the basis of colour cards as for FPVS: Green = Improved or better than the
local check; Yellow = Equal or nearly equal to the local check; and Red = Inferior compared to the local
check. To address gender issues, provide two batches of the colored cards and label one batch with the letter
‘M’ to differentiate it. The ‘M’ cards are used by men, and the unlabeled cards by women.
Pre‐labeled bags bearing the variety name and the attribute being assessed should be placed on each
plot/variety (e.g. Plot 1, Root Yield or Plot 2, SPVD resistance). The evaluation is then done by considering
one variety at a time. The performance of each variety is assessed by each farmer individually by
placing only one card in the bag.
The number of farmers should be at least 15 per sex for good results. 
Farmers are given 7 cards per color per variety for the agronomic assessment. Each farmer puts into the
bag one card that shows the level of performance of the variety for the attribute being assessed. When the
exercise has been completed for each variety, the bags should be collected and bundled by attribute.
Assessment at field level could be done on all or some of the following attributes, depending on what
farmers consider important. The question posed to the farmers could be: “Give your opinion by using the
provided cards on the following attributes”:
• The ability to produce enough planting material (foliage production);
• The ability to tolerate diseases, especially SPVD;
• The ability to tolerate pest damage (mainly weevils);
• The yielding ability (i.e. number and size of mature roots);
• The attractiveness of the root skin color. Probe further to find out which color(s) are most preferred and
why.
• The attractiveness of the root flesh color. Probe further to find out which color(s) are most preferred and
why.
• What is your overall opinion on the acceptability of this variety?
The cards in each bag should be separated and counted by colors and sex. The information is recorded in 
the data sheet (Form 5A). 
If more than ten varieties are being assessed, then at the end of the individual assessment farmers should
be asked, as a group, to tour the plots, select the best five and the worst three varieties,
and give reasons for their choices. Then, for the top 5 varieties, use pairwise comparison, in which every
variety is compared with each of the others. In pairwise comparison, the varieties
preferred most frequently over the others are considered acceptable.
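The pairwise comparison of the top five varieties can be sketched as follows. Five varieties give ten pairs, and each variety is ranked by how often it is preferred. The function name, the variety labels and the made-up scores that stand in for the group's choices are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def pairwise_ranking(varieties, prefer):
    """Rank varieties by how often each is preferred in pairwise comparison.

    prefer(a, b) must return whichever of the two varieties the group prefers.
    """
    wins = Counter({variety: 0 for variety in varieties})
    pairs = list(combinations(varieties, 2))
    for a, b in pairs:
        wins[prefer(a, b)] += 1
    return len(pairs), wins.most_common()

# Stand-in for the group's choices: the variety with the higher
# made-up score is always preferred.
scores = {"V1": 3, "V2": 5, "V3": 1, "V4": 4, "V5": 2}

def group_choice(a, b):
    return a if scores[a] > scores[b] else b

n_pairs, ranking = pairwise_ranking(list(scores), group_choice)
print(n_pairs)  # 10 comparisons for five varieties
print(ranking)  # [('V2', 4), ('V4', 3), ('V1', 2), ('V5', 1), ('V3', 0)]
```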
Stage 3. Consumer acceptability assessment: Roots from each variety should be labeled and boiled, and small
pieces are then served on plates for ‘blind’ assessment using A, B, C etc. or 1, 2, 3 etc. to code each
variety. Take care not to overcook the roots, especially those with lower dry matter content. Cards are used
in the consumer acceptability exercise in much the same way as for the field evaluation.
Farmers are given 8 cards per color per variety for the root taste tests. The bags for receiving the cards
are labeled with the name of the variety and the attribute being assessed. The group should be divided into
women and men. Before starting the exercise, review what the attributes are, emphasizing that what matters is
how each person feels individually about the variety. The question posed to the evaluators could be: “Give your
opinion by using the provided cards on the following root attributes”:
• Attractiveness of the color of the boiled root (root flesh appearance).
• Taste when chewed (taste of the root); some will prefer sweetness, some not.
• Flavor/aroma in the mouth (“smell”/flavor).
• Flouriness/starchiness (dryness).
• Consistency of the root texture (fibrousness).
• What is your overall opinion on the acceptability of this variety?
For convenience, all the attributes of one variety should be assessed before moving on to the other. In the 
exercise, several bags labeled with different attributes are passed round one after another for the farmers 
to put in their cards. When all the varieties have been assessed, the bags are then separated based on the 
attributes. The information is recorded as shown in the sample sheet (Form 5B). At the end of the
individual assessment, farmers are asked, as a group, to select their best five varieties, giving reasons.
Then for those 5 varieties, a pairwise comparison should be done by farmers so that again every variety
has an equal chance of being compared with the others. Reasons for varieties being ranked best
should be provided by the evaluators.
Key visits 
1. Visit to meet with local partners (identify areas and meet local partners)   
2. Visit to identify farmers 
3. Visit to plan trials with farmers 
4. Visit to plant the trial 
5. After 6 weeks, virus assessment & weeding check (farmers will need to be invited) 
6. Trip at 3 months, for leaf cooking and evaluation 
7. Visit 2 weeks before root harvest to cut vines for in‐ground curing on the central row but not on the 
other 2 rows being assessed & set up invitations for farmer participation 
8. Harvest 
9. Assessment of in‐ground storability over 2 months 
In‐ground storability: On the harvest day, cut the vines on the last row (border row). Hill up the soil,
covering any exposed roots and the places where the vines were cut, and pack the soil down with the feet.
After 2 months, return for the final visit and assess for each variety: 1) number of roots, 2) number of roots
infested with weevils or rotted, 3) weight (kg) and 4) raw taste.
2.3. Data analysis, selection of clones and reporting of results 
To facilitate analysis and decision‐making, the raw data collected in trials using standardised methods 
described in section 3 should be transformed into reference units. For instance, the number of harvested
plants divided by the number of cuttings planted gives survival, and yield measured in kg/plot must
be converted to t/ha. After this processing step, sort the data to be analysed by location, genotype and
replication – some statistical programs require that the data are sorted (e.g. PLABSTAT), and this also 
helps to get a better overview of data records. The analysis of variance (ANOVA) and mean comparisons – 
e.g. least significant difference (LSD) or Tukey test – become useful tools for clonal selection. In section 4 
we give recommendations for breeders to analyse data from their trials, select clones and report results. 
We all must follow the description of data collection in section 3 for results of our sweetpotato variety 
selection efforts to be most amenable to cross‐program analysis. This standardization will also become 
more important as we move into the era of genomic selection in sweetpotato.  CIP regional breeders will 
work with national partners on a continuing basis to achieve uniformity, quality, consistency and 
relevance of data from breeding trials through our breeding community of practice. 
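The transformation into reference units and the sorting step described in this section can be sketched as below. This is a minimal illustration, not the procedure implemented in any particular statistical package: the function names, the record layout and the example values are hypothetical, and the plot area is assumed to be the harvested area in m².

```python
def survival(harvested_plants, cuttings_planted):
    """Proportion of planted cuttings that were harvested as plants."""
    return harvested_plants / cuttings_planted

def yield_t_per_ha(yield_kg_per_plot, plot_area_m2):
    """Convert kg/plot to t/ha (1 ha = 10,000 m2 and 1 t = 1,000 kg)."""
    return yield_kg_per_plot / plot_area_m2 * 10

# Hypothetical raw plot records.
records = [
    {"location": "Loc2", "genotype": "G1", "rep": 1, "planted": 30, "harvested": 27, "kg": 45.0, "area": 30.0},
    {"location": "Loc1", "genotype": "G2", "rep": 2, "planted": 30, "harvested": 30, "kg": 60.0, "area": 30.0},
    {"location": "Loc1", "genotype": "G2", "rep": 1, "planted": 30, "harvested": 24, "kg": 30.0, "area": 30.0},
]

for r in records:
    r["survival"] = survival(r["harvested"], r["planted"])
    r["t_ha"] = yield_t_per_ha(r["kg"], r["area"])

# Sort by location, genotype and replication before analysis.
records.sort(key=lambda r: (r["location"], r["genotype"], r["rep"]))

for r in records:
    print(r["location"], r["genotype"], r["rep"], round(r["survival"], 2), r["t_ha"])
```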
3. Description of data collection forms and instructions for their use
[Note: while under some circumstances, there are different meanings of the words “variable, trait and 
attribute”, for our purposes we use these words synonymously.] In experiments, we distinguish between 
the response variables, which must be analysed, and the classification variables. Classification variables 
help us to identify plots and how experimental factors and factor levels are applied to plots (e.g. year, 
location, genotype, genotype name and replication). Classification variables can be compared to a
home address, by which we can identify who is living where (country, town, street and name).
Classification variables allow us to identify a plot, its location, which genotype was planted in it and how 
the plot was treated. Moreover, classification variables are needed to provide statistical program 
packages with information about how the data were organised and are used to inform statistical 
procedures about how to analyse the data. 
Variables that must be analysed can be distinguished as follows: (i) variables with an approximately 
normal distribution (e.g. storage root yield, upper biomass yield and storage root dry matter); (ii) variables 
which show strong deviations from a normal distribution (e.g. disease damage); and (iii) rank variables 
(e.g. scores with a scale of 1–9 for vine vigour or scores of 1–3 corresponding to the colour cards used 
during farmer assessment of varieties). It should be noted that the analysis of variance (ANOVA, GLM, 
REML) is relatively robust to deviations from the normal distribution, so that even symmetrically 
distributed rank scores of 1–9 can be analysed by an analysis of variance. However, the analysis of 
variance is very sensitive to deviations due to heterogeneity of the error variance – this is the case when a 
genotype obtains a common score value across plot replications (e.g. 1 for vine strength = no vine 
survival), while other genotypes obtain different scores (e.g. 6–9 for vine strength = vine survival). No 
variation among replications results in extreme heterogeneity of error variance, and the requirements of 
the analysis of variance are not fulfilled! Rank variables with scores of 1–3 should never be analysed by an 
analysis of variance; however, the frequency means provide useful information. Rank variables with 
scores of 1–3 must be analysed by non‐parametric rank statistics and for these, significance tests and 
multiple comparison procedures (procedures to compare each clone with a check or among all other 
clones in the trials) are available. Note: complex models considering several factors (genotypes, 
environments, block replication) cannot be analysed by non‐parametric rank statistics, so the 
statistical analysis of complex experiments must be simplified. This is why experienced 
breeders do what they can to evaluate on rank scores of 1–9.   
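The heterogeneity problem described above can be screened for before an analysis of variance is run. The following Python sketch flags genotypes whose scores do not vary across replications. The function name and the example scores are hypothetical and are not part of the forms or of any CIP tool.

```python
from statistics import pvariance

def zero_variance_genotypes(scores):
    """Return genotypes whose scores are identical in every replication.

    `scores` maps a genotype name to its list of plot scores
    (one score per replication).
    """
    return sorted(
        genotype
        for genotype, values in scores.items()
        if len(values) > 1 and pvariance(values) == 0
    )

# Hypothetical vine-survival scores on the 1-9 scale, three replications each.
example = {
    "clone_A": [1, 1, 1],  # no survival in any plot, so no variation
    "clone_B": [6, 8, 7],
    "clone_C": [9, 7, 8],
}
print(zero_variance_genotypes(example))
```

Genotypes reported by such a check could be excluded from the analysis of variance and reported separately, or the trait could be analysed with rank statistics instead.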
Note on rating scales: in general, our approach with rating scales is to use a 1–9 scale, setting 1 as good 
and 9 as bad (in the case of hedonic scales), or setting 1 as absence of a problem (in the case of diseases 
and pests). For a few traits including vine vigour and cooked root storage quality traits, this logic does not 
hold perfectly. Thus, for vine vigour, we have set 1 to lowest and 9 to highest. In some places, established 
breeding programs may have already developed different rating scales (e.g. 1–3, 1–5 or 1–9). It is our 
hope that all our partners in this collaborative sweetpotato breeding effort will be willing to adopt and
use the scales given below, so that all may benefit from the power of the information provided from 
comparative analysis of our combined results. Standard trait definitions for sweetpotato and other 
agricultural crops are maintained at the Crop Ontology Curation Tool website: http://cropontology.org. 
Traits used are not set in stone and can be revised as needed, and additional traits added, but there is a 
consultative procedure to achieve this.  
Below we provide detailed forms and instructions for use. We have also now developed computer 
programs to assist with all major aspects of sweetpotato breeding program management including trial 
design, data capture, analysis and archiving. CloneSelector was developed based on the procedures and 
forms described here, but has been superseded by HIDAP, which is linked to the Sweetpotatobase database. 
Up‐to‐date information and links to this increasingly powerful integrated suite of tools can be found at 
http://sweetpotatoknowledge.org. 
3.1. Data collection 
Form 1A and 1B. Sweetpotato OT (general information) – Appendix 1 
Forms 1A and 1B are available in file ”A1 ‐ APPENDIX1_SWEETPOTATO_OTs (ALL_FORMS_EXCEL)”. Forms 
1A and 1B request essential information for the OT such as location, plot size and trial management 
practices. They also provide space for the results of soil analyses and meteorological data, which add value 
to the information on performance of genotypes in the trials by helping to identify patterns among 
experimental sites and agro‐ecological zones, respectively. 
The general OT information to be recorded: 
1. Country (see codes) 
2. Name of contact scientist 
3. Institution 
4. Address 
5. Phone numbers 
6. Location of trial 
7. Latitude, longitude and altitude 
8. Type of trial (single plant or row observations and season) 
9. Names of the check varieties 
10. Planting and harvest dates (including crop duration) 
11. Plot description 
12. Plot size 
13. Crop rotation
14. Soil description (see A3 ‐ Appendix 3 for a description of categories; file ”A3 – Appendix 3 Soil group 
description”)
15. Meteorological data during the trial 
16. Traits evaluated in the OT 
17. Comments on the OT 
Form 2. Sweetpotato OT (data sheet) 
Form 2 is available in file ”A1 ‐ APPENDIX1_SWEETPOTATO_OTs (ALL_FORMS_EXCEL)”. Observations 
in the OT must be recorded for all clones. This will allow us to identify the most successful 
crosses and parents and will then allow us to repeat ”elite” crosses on a larger scale (i.e. 500–1000 
seeds per elite cross combination). The data to be recorded are: 
1. Clone number formed by a number for the father, a number for the mother and the number of the 
clone within the family. In the case of clones from polycrosses, leave the columns for the father 
empty or set them to zero. 
2. Pedigree name if the father and mother of the clone already have names, e.g. Jonathan × SPK004. 
3. Indicate the traits recorded in the OT and in the case of scores identify the meaning of the scores. 
Examples given in the data sheet are root and vine yield per plot, which would be used to calculate 
harvest index, and flesh colour. 
4. Record observations of traits for each selected clone. 
Form 3A and 3B. Sweetpotato PT and AT (general information) Appendix 2 
Forms 3A and 3B are available in file ”A2 ‐ APPENDIX2_SWEETPOTATO_PT & AT Trails 
(ALL_FORMS_EXCEL).xls”. Forms 3A and 3B request essential information for PTs and ATs such as 
location, plot size and trial management practices. They also provide space for the results of soil analyses 
and meteorological data, which add value to the information recorded. This additional 
information will help to identify patterns among experimental sites and agro‐ecological zones, 
respectively. 
The general PT and AT information to be recorded: 
1. Country (use the codes) 
2. Name of contact scientist 
3. Institution 
4. Address 
5. Phone numbers 
6. Location of the trial including district, site name and agro‐ecological zone
7. Latitude, longitude and altitude 
8. Type of trial 
a. 1‐ PT, 2‐ AT, 3‐ OFT 
b. 1‐ RCB design, 2‐ Other designs (specify) 
c. 1‐ Standard trial, 2‐ Quality specific trial 
d. Season: 1‐ wet, long rains; 2‐ wet, short rains; 3‐ dry 
9. Dates 
a. Planting 
b. Verification of establishment (3–4 weeks after planting) 
c. First virus symptom evaluation (6–8 weeks after planting) 
d. Second virus symptom evaluation (1 month before harvest) 
e. Harvest 
f. Crop duration in days from planting to harvest 
10. Plot description 
a. Plot type: 1‐ Rows/ridges, 2‐ Mounds, 3‐ Rows/flat 
b. Number of rows/mounds per plot (includes the border rows) 
c. Number of border rows or rows of mounds per plot 
d. Number of plants intended for final harvest (excludes border rows and end plants) 
e. Cuttings per plot used to achieve target plant density per plot 
f. Target plant spacing WITHIN rows (m) 
g. Space BETWEEN rows (m) 
11. Determine NET plot size (m2 excluding border rows and plants) 
12. Crop rotation 
a. Crop(s) from previous season 
b. Crop(s) from two seasons ago 
13. Soil description (see A3 ‐ Appendix 3 for description and use codes on form 3B, file “A3 – Appendix 3 
Soil group description”) 
a. Soil type 
b. Soil texture 
c. Soil pH
d. Percent organic matter 
14. Meteorological data during the trial 
a. Specify month 
b. Code for month 
c. Rainfall (mm) for each month 
d. Temperature (°C) mean for each month 
e. Temperature (°C) mean minimum for each month 
f. Temperature (°C) mean maximum for each month 
15. Specify and describe the number of check varieties used. Please use MORE than one check (four 
recommended) 
a. Number of check varieties 
b. Check 1 
c. Check 2 
d. Check 3 
e. Check 4 
16. Other comments on events that occurred during the trial. 
Form 2B. Sweetpotato genotypes in trial 
Form 2B is available in file ”A2 ‐ APPENDIX2_SWEETPOTATO_PT & AT Trails (ALL_FORMS_EXCEL).xls”. This 
is the form for maintaining a detailed record of the names of clones entered in the trial. It seems 
superfluous, although it may be useful for assigning a simple code number to each genotype. 
Forms 2C, D and E – including classification variables 
Forms 2C, 2D and 2E are available in file ”A2 ‐ APPENDIX2_SWEETPOTATO_PT & AT Trails 
(ALL_FORMS_EXCEL).xls”. These forms are for pre‐harvest, harvest and post‐harvest data from PT and ATs 
and must be filled in completely. Variables 1–8 are classification variables and are repeated in each form. 
If response data are missing for a trait, the cells with missing data receive the data record ‘*’ for missing 
value. Note: The variable codes provided here do not appear on the paper forms in the appendices, but 
are the variable names used when coding data for analysis. 
1. L = Location or site 
2. T = Trial type 
3. Y = Year 
4. S = Season
5. PN = Plot number 
6. R = Replication 
7. G = Genotype number (three digits for the year of the cross the clone traces back to, + three 
digits for the father number, + three digits for the mother number, and four digits for the clone 
number in the family). 
8. SC = Simple genotype code from 1 to N (N is the total number of clones in the trial; ensure that the 
same genotype has the same number across locations). 
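The composition of the genotype number G can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "three digits for the year" is read here as the last three digits of the year of the cross, and a father number of zero is used for polycross clones.

```python
def genotype_number(cross_year, father, mother, clone):
    """Concatenate the four parts of G as zero-padded digit groups (3 + 3 + 3 + 4)."""
    year3 = cross_year % 1000  # assumption: last three digits of the year
    for value, limit in ((father, 999), (mother, 999), (clone, 9999)):
        if not 0 <= value <= limit:
            raise ValueError("part does not fit its digit group")
    return f"{year3:03d}{father:03d}{mother:03d}{clone:04d}"

def split_genotype_number(g):
    """Split a 13-digit G back into (year3, father, mother, clone)."""
    if len(g) != 13 or not g.isdigit():
        raise ValueError("expected 13 digits")
    return int(g[0:3]), int(g[3:6]), int(g[6:9]), int(g[9:13])

# Hypothetical clone 7 of the family father 12 x mother 45, crossed in 2006.
g = genotype_number(2006, 12, 45, 7)
print(g)                         # 0060120450007
print(split_genotype_number(g))  # (6, 12, 45, 7)
```

Storing G as text rather than as a number keeps the leading zeros.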
Form 2C. Pre‐harvest data sheet 
9. NOPS = Number of plants (cuttings) planted per plot. 
10. NOPE = Number of plants (cuttings) established per plot (to be determined 3 weeks after 
planting). 
11. VIR1 = Virus symptoms, first evaluation (at 4–6 weeks after planting); recorded in scores of 1–9: 1 
= No virus symptoms; 2 = Unclear virus symptoms; 3 = Clear virus symptoms for < 5% of plants 
per plot; 4 = Clear virus symptoms for 6–15% of plants per plot; 5 = Clear virus symptoms for 16–
33% of plants per plot; 6 = Clear virus symptoms for 34–66% of plants per plot (i.e. > 1/3 and < 
2/3); 7 = Clear virus symptoms for 67–99% of plants per plot (2/3 to almost all); 8 = Clear virus 
symptoms for all plants per plot (not stunted); and 9 = Severe virus symptoms for all plants per 
plot (stunted). 
12. VIR2 = Virus symptoms, second evaluation (at one month before harvest; recorded in scores of 
1–9 as described for VIR1). 
13. ALT1 = Alternaria symptoms, first evaluation (at 4–6 weeks after planting); recorded in scores of 
1–9: 1 = No symptoms; 2 = Unclear symptoms; 3 = Clear symptoms for < 5% of plants per plot; 4 = Clear 
symptoms for 6–15% of plants per plot; 5 = Clear symptoms for 16–33% of plants per plot; 6 = 
Clear symptoms for 34–66% of plants per plot (i.e. > 1/3 and < 2/3); 7 = Clear symptoms for 67–
99% of plants per plot (2/3 to almost all); 8 = Clear symptoms for all plants (not fully defoliated); 
and 9 = Severe symptoms for all plants per plot (fully defoliated). 
14. ALT2 = Alternaria symptoms, second evaluation (at one month before harvest; recorded in scores 
of 1–9 as described for ALT1). 
15. VV = Vine vigour, first evaluation (at one month before harvest; recorded in scores of 1–9: 1 = 
Nearly no vines; 2 = Weak vines, thin stems and very long internode distances; 3 = Weak to 
medium strong vines, medium thick stems and long internode distances; 4 = Medium strong 
vines, medium thick stems and medium internode distances; 5 = Medium strong vines, thick 
stems and long internode distances; 6 = Medium strong vines, thick stems and medium internode 
distances; 7 = Strong vines, thick stems, short internode distances and medium–long vines; 8 = 
Strong vines, thick stems, short internode distances and long vines; and 9 = very strong vine 
strength, thick stems, short internode distances and very long vines).
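The percentage classes given above for the virus and Alternaria scores (VIR1, VIR2, ALT1 and ALT2) can be sketched as a small Python function. This is an illustration only. The function name and its arguments are hypothetical, and two assumptions are made because the printed classes leave gaps: a value of exactly 5% is scored 3, and values that fall between two printed classes (e.g. 15.5%) go to the higher class.

```python
def symptom_score(percent_affected, unclear=False, severe=False):
    """Map the percentage of plants with clear symptoms to a 1-9 score.

    unclear: symptoms are visible but cannot be assigned clearly (score 2).
    severe:  all plants affected and stunted or fully defoliated (score 9).
    """
    if not 0 <= percent_affected <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_affected == 0:
        return 2 if unclear else 1
    if percent_affected >= 100:
        return 9 if severe else 8
    # Upper class limits taken from the score descriptions (assumption: 5% -> 3).
    for upper, score in ((5, 3), (15, 4), (33, 5), (66, 6)):
        if percent_affected <= upper:
            return score
    return 7  # 67-99% of plants per plot

# Hypothetical plot: 8 of 30 plants show clear symptoms.
print(symptom_score(8 / 30 * 100))
```

The score itself is still assigned by the observer in the field; a function like this is only useful for checking recorded scores against plant counts.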
Form 2D. Sweetpotato harvest 
16. VW = Weight of vines per NET plot in kg. 
17. NOPH = Number of plants harvested. 
18. NOPR = Number of plants with storage roots. 
19. NOCR = Number of commercial storage roots per NET plot. 
20. NONC = Number of non‐commercial storage roots per NET plot. 
21. CRW = Weight of commercial storage roots per NET plot in kg. 
22. NCRW = Weight of non‐commercial storage roots per NET plot in kg. 
23. SCOL = The most representative skin colour of the root: 1‐ White; 2‐ Cream; 3‐ Yellow; 4‐ Orange; 
5‐ Brownish Orange; 6‐ Pink; 7‐ Red; 8‐ Purple Red; 9‐ Dark Purple. 
24. FCOL = Storage root flesh colour to be determined on four storage roots per plot using CIP colour 
chart, noting the page number from the colour chart on the data sheet. If you don’t have a colour 
chart, use a 1–9 scale: 1 = White; 2 = Cream; 3 = Dark cream; 4 = Pale yellow; 5 = Dark yellow; 6 = 
Pale orange; 7 = Intermediate orange; 8 = Dark orange; and 9 = Strongly pigmented with 
anthocyanins (purple) [Note: some may find it more convenient to determine skin and flesh 
colour in the laboratory using samples taken for dry matter determination]. 
25. RS = Overall assessment of storage root size based on inspection of the harvested roots. Use a 1–
9 scale: 1 = Excellent; 3 = Good; 5 = Fair, 7 = Poor; and 9 = Terrible, with numbers in between 
representing intermediate ratings. 
26. RF = Overall assessment of storage root form based on inspection of the harvested roots. Use a 
1–9 scale: 1 = Excellent; 3 = Good; 5 = Fair, 7 = Poor; and 9 = Terrible, with numbers in between 
representing intermediate ratings. 
27. DAMR = Note storage root defects if prominent, including cracks, veins, constrictions and 
grooves, or a predominance of pencil roots. Use a 1–9 scale: 1 = None; 3 = Light (few roots 
affected); 5 = Moderate (10–30% damaged); 7 = Severe (30–60% roots affected); and 9 = Very 
severe (> 60% roots affected). 
28. WED1 = Overall assessment of weevil damage based on inspection of the harvested roots. Use a 
1–9 scale: 1 = no damage; 3 = minor; 5 = moderate; 7 = heavy; and 9 = severe damage, with 
numbers in between representing intermediate ratings. 
Form 2E. Sweetpotato quality 
Note: in PTs, quality traits need to be determined only for the top fraction of clones (15–25% of all 
PT clones), whereas in ATs they must be determined for all clones. 
29. DMF = Fresh weight of storage root samples (roughly 200 g is the recommended sample size).
30. DMD = Dry weight of storage root samples. 
31. DMM = Dry matter assessment method (1‐ Sun‐dried, 2‐ Laboratory oven dried, 3‐ freeze dried, 
and 4‐specific gravity). 
32. COOF = Fibres in cooked storage root samples assessed by inspection and tasting. Use a 1–9 
scale: 1 = non‐fibrous; 3 = slightly fibrous; 5 = moderately fibrous; 7 = fibrous; and 9 = very 
fibrous, with numbers in between representing intermediate ratings. 
33. COOSU = Storage root sweetness in cooked samples, determined by taste test. Use a 1–9 scale: 1 
= non‐sweet; 3 = slightly sweet; 5 = moderately sweet; 7 = sweet; and 9 = very sweet, with 
numbers in between representing intermediate ratings. 
34. COOST = Storage root texture in cooked samples, determined by taste test. Use a 1–9 scale: 1 = 
very moist; 3 = moist; 5 = moderately dry; 7 = dry; and 9 = very dry, with numbers in between 
representing intermediate ratings. 
35. COOT = Overall taste of cooked samples assessed using a 1–9 scale: 1 = excellent; 3 = good; 5 = 
fair; 7 = poor; and 9 = horrible, with numbers in between representing intermediate ratings. 
36. COOAP = Overall appearance of cooked samples assessed using a 1–9 scale: 1 = excellent; 3 = 
good; 5 = fair; 7 = poor; and 9 = horrible, with numbers in between representing intermediate 
ratings. 
 
Note: A form 4 for determination of post harvest attributes such as perishability is under development, 
but there is still no common agreement among breeders which traits to record and how data needs to 
be recorded – this might become an appendix 4 in this manual in the future.  
Form 5A. Sweetpotato farmer participatory field evaluation 
Form 5A is available in file ”A5 APPENDIX5_SWEETPOTATO_TRIALS_Farmer T (ALL_FORMS_EXCEL).xls” 
Farmer participatory field evaluation has to be carried out at one AT location. Farmers are asked to give 
their opinion about each genotype in the AT in one plot replication by ”providing their cards (Green = 
Good; Yellow = Medium; Red = Unacceptable – men are given cards marked with M, women are given 
cards marked with F) on the following seven attributes and traits”: 
1. Genotype code 
2. Gender 
3. Ability to produce enough planting material (foliage production) 
4. Ability to tolerate diseases, especially SPVD 
5. Ability to tolerate pest damage (mainly weevils) 
6. Yielding ability (i.e. number and size of mature roots)
7. Attractiveness of the root skin colour. Probe more to find which colour(s) are most preferred and why 
8. Attractiveness of the root flesh colour. Probe more to find which colour(s) are most preferred and why 
9. Overall opinion on the acceptability of the variety. 
Data are recorded on a plot basis. Card frequencies registered on Form 5A give the number of green, 
yellow and red cards provided by the farmers, grouped by gender, for each trait. 
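The card frequencies described above can be tallied as in the following Python sketch. The function name, the record layout (genotype code, trait, gender, card colour) and the example cards are hypothetical and only illustrate the counting step.

```python
from collections import Counter, defaultdict

COLOURS = ("green", "yellow", "red")  # Good, Medium, Unacceptable

def card_frequencies(cards):
    """Count cards per (genotype, trait, gender), one count per colour."""
    counts = defaultdict(Counter)
    for genotype, trait, gender, colour in cards:
        if colour not in COLOURS:
            raise ValueError(f"unknown card colour: {colour}")
        counts[(genotype, trait, gender)][colour] += 1
    # Report every colour, including colours that received no card.
    return {
        key: {colour: counter[colour] for colour in COLOURS}
        for key, counter in counts.items()
    }

# Hypothetical cards collected on one plot for the trait "yielding ability".
cards = [
    ("G01", "yield", "F", "green"),
    ("G01", "yield", "F", "green"),
    ("G01", "yield", "F", "yellow"),
    ("G01", "yield", "M", "red"),
]
for key, freq in card_frequencies(cards).items():
    print(key, freq)
```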
3.2. Derived variables 
Several variables can be derived from the raw data of agronomic trials and can be effective in evaluating 
the performance of clones. Here we consider only the total storage root yield per hectare and storage 
root dry matter content to be essential: 
Yield of total roots in tonnes per hectare: 
RYTHA = (CRW + NCRW) / NET plot area in m2 × 10. 
Storage root dry matter content: 
DM = (DMD / DMF) × 100. 
Number of commercial storage roots per plant (a trait with an extremely high positive correlation with 
RYTHA; a simple count already gives a very good estimate of root yield): 
NOCRPL = NOCR / NOPH. 
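As a worked illustration, the three formulas above can be written as short Python functions. The function names and the example values are hypothetical. The factor 10 converts kg per m2 into tonnes per hectare (1 kg/m2 = 10 t/ha).

```python
def rytha(crw, ncrw, net_plot_area_m2):
    """Total storage root yield in t/ha from plot weights in kg."""
    return (crw + ncrw) / net_plot_area_m2 * 10

def dm(dmd, dmf):
    """Storage root dry matter content in percent."""
    return dmd / dmf * 100

def nocrpl(nocr, noph):
    """Number of commercial storage roots per harvested plant."""
    return nocr / noph

# Hypothetical plot: 50 kg commercial and 16 kg non-commercial roots on 13.5 m2,
# a 200 g fresh sample drying to 60 g, and 45 commercial roots from 18 plants.
print(f"{rytha(50.0, 16.0, 13.5):.2f}")  # 48.89
print(f"{dm(60.0, 200.0):.2f}")          # 30.00
print(f"{nocrpl(45, 18):.2f}")           # 2.50
```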
 
Further variables derived from raw data should be calculated within the analysis using the statistical 
program packages. Suggestions for further variables are: 
Average commercial root weight: 
ACRW = CRW / NOCR. 
Biomass yield: 
BIOM = CRW + NCRW + VW. 
Percentage of marketable roots: 
CI = NOCR / (NOCR + NONC) × 100. 
Commercial root yield in tonnes per hectare: 
CYTHA = CRW / NET plot area in m2 × 10. 
Foliage total yield in tonnes per hectare: 
FYTHA = VW / NET plot area in m2 × 10. 
Harvest Index:
HI = (CRW + NCRW) / (VW + CRW + NCRW) × 100. 
Number of roots per plant: 
NRPP = (NOCR + NONC) / NOPH. 
Survival or establishment: 
SHI = (NOPH / Number of cuttings per NET plot area) × 100. 
Total root weight: 
TRW = CRW + NCRW. 
Yield per plant: 
YPP = (CRW + NCRW) / NOPH. 
Root foliage ratio: 
(CRW + NCRW) × (DMD / DMF) / (VW × DMVD / DMVF) × 100.
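The further derived variables suggested in Section 3.2 can be computed together, as in the following Python sketch. It is a minimal illustration, not the code used in HIDAP. The function name, the argument names and the example plot are hypothetical, and no guard against division by zero (e.g. plots with no harvested plants) is included.

```python
def derived_variables(crw, ncrw, vw, nocr, nonc, noph, cuttings, area_m2):
    """Return the suggested derived variables for one plot.

    Weights are in kg per NET plot, area_m2 is the NET plot area and
    cuttings is the number of cuttings planted per NET plot area.
    """
    trw = crw + ncrw
    return {
        "ACRW": crw / nocr,                 # average commercial root weight
        "BIOM": trw + vw,                   # biomass yield
        "CI": nocr / (nocr + nonc) * 100,   # percentage of marketable roots
        "CYTHA": crw / area_m2 * 10,        # commercial root yield, t/ha
        "FYTHA": vw / area_m2 * 10,         # foliage yield, t/ha
        "HI": trw / (vw + trw) * 100,       # harvest index
        "NRPP": (nocr + nonc) / noph,       # number of roots per plant
        "SHI": noph / cuttings * 100,       # survival or establishment
        "TRW": trw,                         # total root weight
        "YPP": trw / noph,                  # yield per plant
    }

# Hypothetical plot values.
plot = derived_variables(crw=20.0, ncrw=5.0, vw=25.0, nocr=30, nonc=10,
                         noph=20, cuttings=25, area_m2=10.0)
for name, value in plot.items():
    print(f"{name}: {value:.2f}")
```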
4. Suggestions for data analysis and clonal selection 
4.1. Statistical program packages 
CIP provides statistical support for data recorded with the Forms 2C and 2D. The support is restricted to 
the program packages PLABSTAT, SAS and R. Some R packages for experimental design and data analysis 
have been used to develop the Highly Interactive Data Analysis Platform (HIDAP). 
Randomization of field trials and random number generators 
Randomization is one of the basic principles of experimental design, as is replication. Without plot 
replications (or with plot replications only for checks and/or parents) the location becomes the 
replication, and we all agree that no breeder would think of launching an interesting new candidate on 
the basis of information from one location. We should have the same attitude towards randomization: 
no breeder should consider running a trial without it. Randomization is vital to avoid systematic effects 
caused by neighbouring plots and by the position in the field, screenhouse, etc. Randomization can be 
done with tables of random numbers or with random number generators in computer software. HiDAP 
randomizes field plans on the basis of random numbers generated with R, several random number 
generators are available in R, and SAS has a procedure, ”PROC PLAN”, to randomize factors and their 
factor levels. There is no excuse not to randomize. A field trial that is not randomized is lost. 
Non‐randomized field trials can cost you your reputation, and we sweetpotato breeders do not want to 
become known in the breeding community for not randomizing our field trials. Today, randomization is 
best done by software using tested algorithms. These algorithms generate so‐called pseudo‐random 
numbers and need a seed number as a starting point. The seed number can be a number of your choice 
or can be taken from the time (year, day of the year, hour, minute, second and even millisecond). Note 
that the same seed number generates the same sequence of ”random” numbers, which is why they are 
called pseudo‐random numbers. The seed number must therefore be changed from trial to trial and 
from block to block. This is not necessary when time is used as the seed number; however, the sequence 
of random numbers generated from a time‐based seed can never be generated again. Please keep in 
mind that with open access, data become freely available: the public that has funded the generation of 
the data has the right to see the data made public. With open access, there are many statisticians who 
are willing to help and to conduct further analyses with open access data in the frame of big data sets, 
but they are also able to determine the probability of the plot order within your trials. 
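The role of the seed number can be illustrated with a short Python sketch that randomizes genotypes within the blocks of a randomized complete block design. This is not the algorithm used by HiDAP or PROC PLAN. The function name, the genotype labels and the seed value are hypothetical. One generator is used for the whole trial, so each block receives a new, independent order, and recording the seed makes the field plan reproducible.

```python
import random

def randomize_rcb(genotypes, n_blocks, seed):
    """Return a field plan as a list of (plot, block, genotype) tuples."""
    rng = random.Random(seed)  # same seed -> same sequence of pseudo-random numbers
    plan = []
    plot = 1
    for block in range(1, n_blocks + 1):
        order = list(genotypes)
        rng.shuffle(order)     # independent random order within each block
        for genotype in order:
            plan.append((plot, block, genotype))
            plot += 1
    return plan

# Hypothetical trial: four genotypes, three blocks, seed recorded with the trial.
for plot, block, genotype in randomize_rcb(["A", "B", "C", "D"], 3, seed=2019):
    print(plot, block, genotype)
```

Using a different recorded seed for each trial keeps plans reproducible while avoiding identical plot orders across trials.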
PLABSTAT (Plant Breeding Statistics) 
PLABSTAT is a statistical program for plant breeders written by a plant breeder. Its output provides 
important parameters such as the variance components, LSD, heritability, stability parameters 
(ecovalence, slope of the regression line and deviations from the regression line), as well as covariances 
and genotypic correlations. PLABSTAT is free and can be downloaded from 
https://plant-breeding.uni-hohenheim.de/software.html. PLABSTAT might become available in the future 
with more mixed model procedures by reprogramming PLABSTAT with R. 
SAS (Statistical Analysis System) 
SAS is a widely used statistical program package, but it requires deeper statistical knowledge to be 
used. Analyses of variance can be conducted with the procedures ANOVA, VARCOMP, GLM and MIXED. 
You can use many different multiple comparison procedures (LSD, Tukey, Scheffe and Dunnett). 
Heritability, stability parameters, and analyses of covariances and genotypic correlations can only be 
calculated by user‐written programs in SAS‐IML. However, SAS‐IML allows you to write your own 
programs for AMMI analysis, index selection procedures for several characters, etc., and such programs 
are readily available for sharing among breeders. 
R (no information why R is called R) 
R is a free programming language and software environment for statistical computing. It requires a high 
level of statistical knowledge and programming ability. It is a programming language similar to SAS‐IML. 
Linear models can be fitted with the base distribution of R. To fit linear and nonlinear mixed effects 
models you need to download and install the packages ‘nlme’ or ‘lme4’. Many statistical 
procedures and packages are freely available for R. The base distribution, contributed packages and 
documentation about R can be found at http://www.r-project.org. 
HIDAP (Highly Interactive Data Analysis Platform) 
HiDAP supports clonal crop breeders at the International Potato Center. It is part of on‐going in‐house 
efforts to unify best practices which include data collection, data quality and data analysis in clonal crop 
breeding. HiDAP builds on the former in‐house tools DataCollector (DC) and CloneSelector (CS) and adds 
new features to support open access and connect with corporate and local databases such as CIP‐BioMart 
and Sweetpotatobase, the latter via the Breeding API (BrAPI). HiDAP builds on the statistical platform R, 
and includes the R shiny tools, the knitr package, and more than 100 other R packages. The R shiny 
package enables interactive web pages that are usable online and offline and the knitr package enables 
the creation of reproducible reports. 
4.2. Data analysis example 
This data analysis example is simple and is intended only as an entry point into data analysis. 
Genotypes are treated as a fixed factor to calculate means (balanced complete designs) or lsmeans 
(unbalanced complete designs). In advanced analyses of plant breeding data, genotypes are treated as a 
random factor to determine variance components or BLUPs (best linear unbiased predictors, 
occasionally called BLUP means). BLUPs are predicted with the factor genotype treated as random, 
that is, with genotypes regarded as drawn from one population, whereas treating genotype as fixed 
regards the genotypes as obtained from different populations. Plant breeding often has to deal with 
incomplete designs, such as not all genotypes being tested at all locations and years, or cross 
combinations and families with equal or unequal numbers of genotypes. The latter results in designs in 
which some of the factors are cross‐classified, whereas others are nested (genotypes within families). 
These incomplete designs need to be analysed by mixed model statistics; see for example PROC MIXED 
in SAS. Note also that the formulas for heritability estimates change when moving from complete to 
incomplete designs. The breeders around PLABSTAT are currently working to implement mixed model 
statistics with R in a new PLABSTAT, and the sweetpotato breeders have started to implement mixed 
model statistics in HIDAP with R. These efforts target a compact output of the statistical analysis, 
restricted to those parameters needed by the breeder (without the mathematical and statistical detail 
sometimes called the statistics on the statistics), to obtain an overview of the plant material and to 
make informed choices (lsmeans or BLUPs, variance component estimates, heritabilities, genetic 
covariances, genetic correlations, multi‐trait selection procedures, and allocation of breeding resources).  
Note that the following example treats the factor genotype as fixed. Other options and more 
complicated designs would require large extensions of this manual or even a separate training manual.              
Data set 
The example data set was taken from mega‐clone trials (clones of worldwide or regional importance 
designated as mega‐clones). It was reduced to five clones (SantoAmaro, Jonathan, Resisto, Xushu18 and 
Tanzania) and three locations (Chiclayo, La Molina and San Ramon). Our mega‐clone trials generally have 
only two replications. The data set comprises: 
1. Classification variables L, Y, GENO (here the CIP number), G and R (see Forms 4B, 4C and 4D) with the 
additional variable Name (because clones are already varieties with a name), 
2. Observation variables VW (vine weight) and FYTHA (VW in tonnes per hectare) from the Vine 
observation data (Form 2D), 
3. Observation variables TRW (total root weight), NOPH (number of plants harvested), SHI (percent 
survival), RYTHA (TRW in tonnes per hectare) from the Root observation data (Form 2D), 
4. Observation variables DMF (fresh weight of storage root samples) and DM (storage root dry matter 
content), from the Quality observation data (Form 4D), and BC (beta carotene content). Note: negative 
values for BC are possible if these were estimated by near infra‐red spectrometry (NIRS), which is the 
case for this example. NIRS analytical capabilities are available at CIP’s breeding support platforms in 
Ghana, Uganda and Mozambique. 
5. No observation variables were taken from Vine survival and Piecemeal harvest quality.
Y L GENO NAME G R VW TRW NOPH SHI FYTHA RYTHA DM BC
2006 Chiclayo 400011 SantoAmaro 1 1 13.7 66.0 11 36.67 10.15 48.89 33.33 -39.30
2006 Chiclayo 400011 SantoAmaro 1 2 17.1 34.0 18 60.00 12.67 25.19 37.25 -60.43
2006 Chiclayo 420014 Jonathan 2 1 12.5 46.3 13 43.33 9.26 34.30 28.88 146.07
2006 Chiclayo 420014 Jonathan 2 2 12.5 27.8 12 40.00 9.26 20.59 32.60 *
2006 Chiclayo 440001 Resisto 3 1 9.3 18.6 11 35.00 6.89 13.78 26.59 442.26
2006 Chiclayo 440001 Resisto 3 2 8.9 29.5 16 51.67 6.59 21.85 26.98 204.52
2006 Chiclayo 440025 Xushu18 4 1 14.7 27.0 16 53.33 10.89 20.00 30.28 -60.14
2006 Chiclayo 440025 Xushu18 4 2 10.7 52.0 18 60.00 7.93 38.52 29.08 -92.69
2006 Chiclayo 440166 Tanzania 5 1 16.2 7.5 12 38.33 12.00 5.56 33.40 -93.45
2006 Chiclayo 440166 Tanzania 5 2 20.1 39.0 16 53.33 14.89 28.89 37.80 -97.61
2006 La Molina 400011 SantoAmaro 1 1 37.0 12.0 19 63.33 54.81 17.78 31.69 -26.88
2006 La Molina 400011 SantoAmaro 1 2 53.5 6.0 24 80.00 79.26 8.89 31.04 -18.87
2006 La Molina 420014 Jonathan 2 1 29.0 8.5 16 53.33 42.96 12.59 25.79 572.40
2006 La Molina 420014 Jonathan 2 2 30.0 3.5 20 66.67 44.44 5.19 26.77 174.80
2006 La Molina 440001 Resisto 3 1 22.0 19.0 15 50.00 32.59 28.15 23.61 629.40
2006 La Molina 440001 Resisto 3 2 41.0 22.5 24 80.00 60.74 33.33 23.61 653.90
2006 La Molina 440025 Xushu18 4 1 32.0 25.5 25 83.33 47.41 37.78 30.63 -14.03
2006 La Molina 440025 Xushu18 4 2 37.0 30.2 27 90.00 54.81 44.74 29.70 -13.87
2006 La Molina 440166 Tanzania 5 1 56.0 7.0 24 80.00 82.96 10.37 32.47 -12.51
2006 La Molina 440166 Tanzania 5 2 90.0 14.0 26 86.67 133.33 20.74 32.74 -7.59
2006 San Ramon 400011 SantoAmaro 1 1 18.2 18.1 20 66.67 26.96 26.81 30.65 -21.46
2006 San Ramon 400011 SantoAmaro 1 2 4.2 5.8 22 73.33 6.22 8.59 * *
2006 San Ramon 420014 Jonathan 2 1 3.9 8.5 19 63.33 5.78 12.59 35.25 119.70
2006 San Ramon 420014 Jonathan 2 2 6.2 7.9 27 90.00 9.19 11.70 28.83 147.60
2006 San Ramon 440001 Resisto 3 1 6.4 10.1 25 83.33 9.48 14.96 33.00 498.10
2006 San Ramon 440001 Resisto 3 2 13.4 21.4 28 93.33 19.85 31.70 28.94 483.30
2006 San Ramon 440025 Xushu18 4 1 5.1 16.0 27 90.00 7.56 23.70 33.04 -8.11
2006 San Ramon 440025 Xushu18 4 2 1.3 2.0 15 50.00 1.93 2.96 31.21 -12.53
2006 San Ramon 440166 Tanzania 5 1 11.4 6.1 17 56.67 16.89 9.04 33.93 -12.87
2006 San Ramon 440166 Tanzania 5 2 11.0 4.1 18 60.00 16.30 6.07 34.73 -14.84  
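Rows of the example data set, including the ‘*’ code for missing values, can be read as in the following Python sketch. This is an illustration only: the function names are hypothetical, and the parser assumes the column order of the table above and that only the location name may contain a space.

```python
COLUMNS = ["GENO", "NAME", "G", "R", "VW", "TRW", "NOPH",
           "SHI", "FYTHA", "RYTHA", "DM", "BC"]
NUMERIC = COLUMNS[4:]  # response variables, which may be missing

def parse_row(line):
    """Turn one data line into a dict; '*' becomes None."""
    tokens = line.split()
    n = len(COLUMNS)
    if len(tokens) < n + 2:
        raise ValueError("too few fields")
    # Year first, the last 12 fields are fixed, the rest is the location name.
    record = {"Y": int(tokens[0]), "L": " ".join(tokens[1:-n])}
    record.update(zip(COLUMNS, tokens[-n:]))
    record["G"] = int(record["G"])
    record["R"] = int(record["R"])
    for name in NUMERIC:
        record[name] = None if record[name] == "*" else float(record[name])
    return record

def mean_ignoring_missing(records, variable):
    """Mean of a variable over the records with a recorded value."""
    values = [r[variable] for r in records if r[variable] is not None]
    return sum(values) / len(values) if values else None

lines = [
    "2006 Chiclayo 420014 Jonathan 2 2 12.5 27.8 12 40.00 9.26 20.59 32.60 *",
    "2006 San Ramon 400011 SantoAmaro 1 2 4.2 5.8 22 73.33 6.22 8.59 * *",
    "2006 San Ramon 400011 SantoAmaro 1 1 18.2 18.1 20 66.67 26.96 26.81 30.65 -21.46",
]
records = [parse_row(line) for line in lines]
print(records[1]["L"], records[1]["DM"])
print(mean_ignoring_missing(records, "DM"))
```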
Model 
The statistical model is 
$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \gamma_{k(j)} + \varepsilon_{ijk}$$
where: 
‐   $Y_{ijk}$ is the response variable with genotype $i$, at location $j$, replication $k$. 
‐   $\mu$ is the overall mean. 
‐   $\alpha_i$ is the fixed effect of genotype $i$. 
‐   $\beta_j$ is the random effect of location $j$. We assume that $\beta_j$ has a normal distribution with mean 0 and 
variance $\sigma^2_{\beta}$. 
‐   $(\alpha\beta)_{ij}$ is the random interaction effect between genotype $i$ and location $j$. 
‐   $\gamma_{k(j)}$ is the random effect of replication and block, respectively, $k$ within location $j$. 
‐   $\varepsilon_{ijk}$ is the random error term, assumed to be normally distributed with mean 0 and variance $\sigma^2_{e}$. 
Statistical models are not always easy to read, but they present in a compact form what was done, and 
they are useful for the materials and methods section of publications. It is important to know the 
difference between fixed and random effects. If we compare means and lsmeans of genotypes in 
advanced multi‐location trials, the effect of genotypes is fixed and all other effects are random. If we want 
information about BLUPs, variance components and heritabilities, all effects are random. 
Note: The model here is for a block design testing genotypes across locations. A block design controls 
systematic changes in the field, such as soil fertility, in one dimension. A further option is row‐column 
designs, in which systematic changes in the field are controlled in two dimensions.    
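For a balanced data set without missing values, the terms of the model correspond to a partition of the total sum of squares. The following Python sketch computes this partition for RYTHA from the example data set. It is a minimal illustration and not the algorithm used by PLABSTAT, SAS or R. The function name is hypothetical, the labels L, G, LG, R:L and RLG follow the PLABSTAT model statement used in Section 4.3, and no F tests or variance components are computed.

```python
from itertools import product
from statistics import fmean

def ss_partition(y, n_loc, n_gen, n_rep):
    """Sums of squares for genotypes in replications nested within locations.

    y maps (location j, genotype i, replication k) to the observed value.
    """
    locs, gens, reps = range(n_loc), range(n_gen), range(n_rep)
    grand = fmean(list(y.values()))
    m_l = {j: fmean([y[j, i, k] for i in gens for k in reps]) for j in locs}
    m_g = {i: fmean([y[j, i, k] for j in locs for k in reps]) for i in gens}
    m_lg = {(j, i): fmean([y[j, i, k] for k in reps]) for j in locs for i in gens}
    m_rl = {(j, k): fmean([y[j, i, k] for i in gens]) for j in locs for k in reps}
    return {
        "L": n_gen * n_rep * sum((m_l[j] - grand) ** 2 for j in locs),
        "G": n_loc * n_rep * sum((m_g[i] - grand) ** 2 for i in gens),
        "LG": n_rep * sum((m_lg[j, i] - m_l[j] - m_g[i] + grand) ** 2
                          for j in locs for i in gens),
        "R:L": n_gen * sum((m_rl[j, k] - m_l[j]) ** 2 for j in locs for k in reps),
        "RLG": sum((y[j, i, k] - m_lg[j, i] - m_rl[j, k] + m_l[j]) ** 2
                   for j, i, k in product(locs, gens, reps)),
        "Total": sum((v - grand) ** 2 for v in y.values()),
    }

# RYTHA from the example data set, ordered by location, genotype, replication.
RYTHA = [48.89, 25.19, 34.30, 20.59, 13.78, 21.85, 20.00, 38.52, 5.56, 28.89,
         17.78, 8.89, 12.59, 5.19, 28.15, 33.33, 37.78, 44.74, 10.37, 20.74,
         26.81, 8.59, 12.59, 11.70, 14.96, 31.70, 23.70, 2.96, 9.04, 6.07]
values = iter(RYTHA)
y = {key: next(values) for key in product(range(3), range(5), range(2))}

ss = ss_partition(y, n_loc=3, n_gen=5, n_rep=2)
df = {"L": 2, "G": 4, "LG": 8, "R:L": 3, "RLG": 12, "Total": 29}
for term in df:
    print(f"{term:6s} df={df[term]:2d} SS={ss[term]:10.2f}")
```

Dividing each sum of squares by its degrees of freedom gives the mean squares on which the F tests and variance component estimates of the analysis of variance are based.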
4.3. Computations for our example using PLABSTAT 
PLABSTAT input 
Important: in the way we use PLABSTAT here, the data must be sorted according to the order of the 
factors in the 'FACTORS' statement, in our example location, genotype and replication. PLABSTAT reads the 
data lines (rows) in this order, and if the data lines are not sorted in this order the 
results will be meaningless because the factor levels are mixed up. 
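The required sort order can be produced before the data lines are written to the input file, for example with the following Python sketch. The function name and the record layout are hypothetical, and the locations are assumed to be coded as numbers in the same order as they are named in the input file.

```python
def sort_for_plabstat(rows, factor_order=("L", "G", "R")):
    """Return rows sorted by the factor columns, first factor varying slowest."""
    return sorted(rows, key=lambda row: tuple(row[name] for name in factor_order))

# Hypothetical unsorted records (L: 1 = Chiclayo, 2 = La Molina).
rows = [
    {"L": 2, "G": 1, "R": 2, "RYTHA": 8.89},
    {"L": 1, "G": 2, "R": 1, "RYTHA": 34.30},
    {"L": 1, "G": 1, "R": 2, "RYTHA": 25.19},
    {"L": 1, "G": 1, "R": 1, "RYTHA": 48.89},
]
for row in sort_for_plabstat(rows):
    print(row["L"], row["G"], row["R"], row["RYTHA"])
```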
'REFERENCE' 5 mega-clones at 3 Locations 
'FACTORS' L=3 G=5 R=2 
'MODEL' L + G + LG + R:L + RLG 
'ANOVA/1111' 6 8 8 
'VARIABLE_NAMES' VW TRW NOPH SHI FYTHA RYTHA DM BC 
'NAMES_OF_TR/L' Chiclayo La_Molina San_Ramon 
'NAMES_OF_TR/G' SantoAmaro Jonathan Resisto Xushu18 Tanzania 
'RANDOM' L R 
'HERITAB' G 
'SUBINT'LG 
'MEAN' GL 
'TBT_TAB' GL 
'RUN' 
2006 Chicl 400011 SantoAmaro 1 1 13.7 66.0 11 36.67  10.15 48.89 33.33 -39.30 
2006 Chicl 400011 SantoAmaro 1 2 17.1 34.0 18 60.00  12.67 25.19 37.25 -60.43 
2006 Chicl 420014 Jonathan   2 1 12.5 46.3 13 43.33   9.26 34.30 28.88 146.07 
2006 Chicl 420014 Jonathan   2 2 12.5 27.8 12 40.00   9.26 20.59 32.60      * 
2006 Chicl 440001 Resisto    3 1  9.3 18.6 11 35.00   6.89 13.78 26.59 442.26 
2006 Chicl 440001 Resisto    3 2  8.9 29.5 16 51.67   6.59 21.85 26.98 204.52 
2006 Chicl 440025 Xushu18    4 1 14.7 27.0 16 53.33  10.89 20.00 30.28 -60.14 
2006 Chicl 440025 Xushu18    4 2 10.7 52.0 18 60.00   7.93 38.52 29.08 -92.69 
2006 Chicl 440166 Tanzania   5 1 16.2  7.5 12 38.33  12.00  5.56 33.40 -93.45 
2006 Chicl 440166 Tanzania   5 2 20.1 39.0 16 53.33  14.89 28.89 37.80 -97.61 
2006 La_Mo 400011 SantoAmaro 1 1 37.0 12.0 19 63.33  54.81 17.78 31.69 -26.88 
2006 La_Mo 400011 SantoAmaro 1 2 53.5  6.0 24 80.00  79.26  8.89 31.04 -18.87 
2006 La_Mo 420014 Jonathan   2 1 29.0  8.5 16 53.33  42.96 12.59 25.79 572.40 
2006 La_Mo 420014 Jonathan   2 2 30.0  3.5 20 66.67  44.44  5.19 26.77 174.80 
2006 La_Mo 440001 Resisto    3 1 22.0 19.0 15 50.00  32.59 28.15 23.61 629.40 
2006 La_Mo 440001 Resisto    3 2 41.0 22.5 24 80.00  60.74 33.33 23.61 653.90 
2006 La_Mo 440025 Xushu18    4 1 32.0 25.5 25 83.33  47.41 37.78 30.63 -14.03 
2006 La_Mo 440025 Xushu18    4 2 37.0 30.2 27 90.00  54.81 44.74 29.70 -13.87 
2006 La_Mo 440166 Tanzania   5 1 56.0  7.0 24 80.00  82.96 10.37 32.47 -12.51 
2006 La_Mo 440166 Tanzania   5 2 90.0 14.0 26 86.67 133.33 20.74 32.74  -7.59 
2006 San_R 400011 SantoAmaro 1 1 18.2 18.1 20 66.67  26.96 26.81 30.65 -21.46 
2006 San_R 400011 SantoAmaro 1 2  4.2  5.8 22 73.33   6.22  8.59     *      * 
2006 San_R 420014 Jonathan   2 1  3.9  8.5 19 63.33   5.78 12.59 35.25 119.70 
2006 San_R 420014 Jonathan   2 2  6.2  7.9 27 90.00   9.19 11.70 28.83 147.60 
2006 San_R 440001 Resisto    3 1  6.4 10.1 25 83.33   9.48 14.96 33.00 498.10 
2006 San_R 440001 Resisto    3 2 13.4 21.4 28 93.33  19.85 31.70 28.94 483.30 
2006 San_R 440025 Xushu18    4 1  5.1 16.0 27 90.00   7.56 23.70 33.04  -8.11 
2006 San_R 440025 Xushu18    4 2  1.3  2.0 15 50.00   1.93 12.96 31.21 -12.53
2006 San_R 440166 Tanzania   5 1 11.4  6.1 17 56.67  16.89  9.04 33.93 -12.87 
2006 San_R 440166 Tanzania   5 2 11.0  4.1 18 60.00  16.30  6.07 34.73 -14.84 
'EOD' 
'STOP' 
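
The sorting requirement stated above can be checked before the data lines are pasted into the PLABSTAT input. The following is only a minimal Python sketch under stated assumptions: it is not part of PLABSTAT, the function names and the tuple layout (location, genotype, replication, RYTHA) are hypothetical, and the four rows are an unsorted extract of the example data.

```python
# Hypothetical helper, not part of PLABSTAT: check and restore the row order
# required by 'FACTORS' L=3 G=5 R=2 (location, then genotype, then replication).

LOCATION_ORDER = {"Chicl": 1, "La_Mo": 2, "San_R": 3}


def key_of(row):
    """Sort key (L, G, R) for a row laid out as (location, genotype, replication, value)."""
    return (LOCATION_ORDER[row[0]], row[1], row[2])


def is_sorted_for_plabstat(rows):
    """Return True if rows are ordered by location, genotype and replication."""
    keys = [key_of(r) for r in rows]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def sort_for_plabstat(rows):
    """Return a new list of rows in the order PLABSTAT expects."""
    return sorted(rows, key=key_of)


# Unsorted extract of the example (RYTHA values taken from the data lines above).
rows = [
    ("La_Mo", 1, 2, 8.89),
    ("Chicl", 2, 1, 34.30),
    ("Chicl", 1, 2, 25.19),
    ("Chicl", 1, 1, 48.89),
]
sorted_rows = sort_for_plabstat(rows)
print(is_sorted_for_plabstat(rows), is_sorted_for_plabstat(sorted_rows))
```

Running such a check first is cheaper than discovering mixed-up factor levels in the ANOVA output.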
 
The interpretation of the PLABSTAT command lines is as follows: 
‐   The 'REFERENCE' statement line gives you the option to enter a reference, comment or name for the data
set you are going to analyse.
‐   The 'FACTORS' statement line specifies the factors in your experiment. Here there are three factors
(location, genotype and replication) named L, G and R, respectively. The first factor (location) has
three factor levels (L=3); the second factor (genotype) has five factor levels (G=5); and the third factor
(replication) has two factor levels (R=2). Again, note that in the way PLABSTAT is used here the data must
be sorted by location, genotype and replication prior to the analysis.
‐   The 'MODEL' statement line specifies the experimental design, in this case a randomized complete block
design (RCBD) experiment carried out at a series of locations (see also ‘randomised complete block
design with one factor in a series over places’ in the PLABSTAT manual). APPENDIX C of the
PLABSTAT manual contains a very useful collection of 'MODEL' statement lines for experimental
designs.
‐   The 'ANOVA' statement instructs PLABSTAT how to read the data and to conduct an ANOVA for balanced
data (missing values up to 15% will be estimated to obtain a balanced data set). The qualifier after the
forward slash “/” controls input and output. It consists of four digits, namely MISS, EXTR, PRIN and
NEWF (here 1111): MISS = 0, zeros are not interpreted as missing values; MISS = 1, zeros are
interpreted as missing values (default). EXTR = 0, no test on extreme values or outliers; EXTR = 1,
test of residuals on extreme values (default); EXTR = 2, test of residuals and effects on extreme values.
For PRIN and NEWF please see the PLABSTAT manual. The qualifier is followed by three numbers. The
first is the number of variables (columns) to be ignored in the ANOVA; these are the columns used as
classification variables for our data set. In our example (see above) the first six columns (see first
row: 2006 Chicl 400011 SantoAmaro 1 1) are used as classification variables. The second number is the
number of variables (columns) to be read for the ANOVA, and the third number is the number of
variables to be analysed in the ANOVA.
‐   With the statement 'VARIABLE_NAMES' you can assign names to the eight variables to be read and 
the eight variables to be analysed by the ANOVA in our example. 
‐   With the statement 'NAMES_OF_TR/L' you can assign names to the three levels of the factor L. 
‐   With the statement 'NAMES_OF_TR/G' you can assign names to the five levels of the factor G. 
‐   With the 'RANDOM' statement you define the random factors in your design. All factors not listed are
assumed to be fixed. This statement changes the error term used for testing the different
effects in the ANOVA. For example, if G and L are fixed, the main effects of G and L, as well as the
interaction term GL, are tested against the error term in the F‐test. However, if G is fixed and L is
random, the main effect of G must be tested against the interaction term GL, whereas the main effect
of L and the interaction term GL must be tested against the error term in the F‐test.
Note 1: in all cases the factor replication (R) is a random factor. 
Note 2: in 99.9% of all cases in plant breeding the factor L is a random factor! 
Note 3: the factor G is a fixed factor when you want to compare mean differences among genotypes for 
example by the LSD test. However, G is a random factor when you want to estimate BLUPs, variance 
components, and heritabilities. 
‐   The statement 'HERITAB' requests the calculation of heritabilities based on the variance component 
estimations in the ANOVA – this is usually done for the factor G. 
‐   The statement 'SUBINT' requests the calculation of a stability analysis and stability parameters for the 
interaction term [in our case GL]. The stability parameters (slope of regression lines, deviations from 
regression lines, ecovalence, etc.) are calculated for both factors (in our case G and L). 
Note 4: in plant breeding the stability of environments is often of interest, because breeders want to
select in an environment in which they can distinguish well among genotypes (environments with a
slope of the regression line b > 0.7).
‐   The statement 'MEAN' requests a table with means across the factor levels and by factor levels. 
‐   The statement 'TBT_TAB' allows you to write the calculations of the 'MEAN' statement in a file 
separate from the rest of the output file for further analysis (e.g. AMMI analysis or index selection 
procedure). 
‐   With the statement 'RUN' PLABSTAT starts to read your data set. 
‐   With the statement 'EOD' PLABSTAT stops reading your data set. 
‐   With the statement 'STOP' the PLABSTAT program stops (exit and no further analysis). 
Note 5: you can run several analyses (several blocks from statement 'REFERENCE' until statement 'EOD'). 
In this way, you can analyse your data for all factors random and for one factor (G) fixed with all other 
factors random. 
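
To make the three numbers after 'ANOVA/1111' concrete, the following Python sketch splits one data line of the example into 6 classification columns and 8 observation columns and treats ‘*’ as a missing value. It only illustrates the column layout described above; the function name parse_data_line is hypothetical and this is not how PLABSTAT itself is implemented.

```python
def parse_data_line(line, n_skip=6, n_read=8, missing="*"):
    """Split a whitespace-separated data line into classification columns
    (the first n_skip tokens) and observation values (the next n_read tokens).
    The missing-value symbol is returned as None."""
    tokens = line.split()
    classifiers = tokens[:n_skip]
    values = [None if t == missing else float(t)
              for t in tokens[n_skip:n_skip + n_read]]
    return classifiers, values


# Data line of the example with a missing BC value (last column).
line = "2006 Chicl 420014 Jonathan   2 2 12.5 27.8 12 40.00   9.26 20.59 32.60      *"
classifiers, values = parse_data_line(line)
print(classifiers)  # year, location, clone number, name, genotype index, replication
print(values)       # VW TRW NOPH SHI FYTHA RYTHA DM BC
```

Note that a genotype name written with a blank (e.g. two words) would shift all columns by one, which is why the names in the data lines are written as one word.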
PLABSTAT output 
From the PLABSTAT output of the example and the eight variables to analyse, we chose three variables: 
root yield in tonnes per hectare (RYTHA), storage root dry matter content (DM) and beta carotene (BC). 
        RYTHA     DM      BC
 MIN     5.19  23.61  -97.61
 MAX    48.89  37.80  653.90
 
 
This output allows you to identify, in a first step, outliers in our data set. Values that are clearly out of the 
biological range must be set to ‘*’ in the input data set, which is the symbol for a missing value in 
PLABSTAT – such values were not observed in our example. 
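
The same first check can be sketched outside PLABSTAT. The Python lines below recompute the minimum and maximum of DM from the example data and list values outside a plausibility range. The bounds of 15 and 45% are hypothetical placeholders chosen only for illustration; they are not recommendations from this manual.

```python
# DM values of the example in input order; None marks the missing value ('*').
dm = [33.33, 37.25, 28.88, 32.60, 26.59, 26.98, 30.28, 29.08, 33.40, 37.80,
      31.69, 31.04, 25.79, 26.77, 23.61, 23.61, 30.63, 29.70, 32.47, 32.74,
      30.65, None, 35.25, 28.83, 33.00, 28.94, 33.04, 31.21, 33.93, 34.73]


def column_range(values):
    """Minimum and maximum of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    return min(observed), max(observed)


def out_of_range(values, low, high):
    """List (row number, value) pairs that fall outside [low, high]."""
    return [(i, v) for i, v in enumerate(values, start=1)
            if v is not None and not low <= v <= high]


dm_min, dm_max = column_range(dm)
# Hypothetical plausibility bounds for DM in %, for illustration only.
flagged = out_of_range(dm, 15.0, 45.0)
print(dm_min, dm_max, flagged)
```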
 
 
  
----------  Character  1  RYTHA  ---------- 
 Please check for outliers (test after ANSCOMBE and TUKEY) 
 Source  DF      SS        MS       Var.cp  s(V.cp)     F   DF-NM  DF-DN  s.e.   LSD5 
 L        2   503.6645  251.8323   23.6707  17.8329  16.65*  2.00   3.00  1.23   5.53 
 G        4   997.3840  249.3460   13.9130  26.9912   1.50   4.00   8.00  5.26  17.15 
 GL       8  1326.9457  165.8682   27.4119  42.6146   1.49   8.00  12.00  7.45  22.96 
 R:L      3    45.3755   15.1252  -19.1839   8.6094   0.14   3.00  12.00  4.71  14.52 
 RGL     12  1332.5335  111.0445  111.0445  41.9709 
 Total   29  4205.9032 
 HERITAB 33.48 (-497.81  86.82) 
 
 
In PLABSTAT the results for the ANOVA (above) include the variance components (Var.cp), which are very
important parameters for plant breeding. The asterisk after the F‐value for L indicates that this effect is
significant at the 5% level. The LSDs at the 5% level (LSD5) are given in the last column; the value for L
(5.53) is the only one of interest, since the G and GL effects are not significant.
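
The LSD5 column of the RYTHA table can be retraced by hand. The sketch below assumes the usual formula LSD = t * sqrt(2 * MS / n), where MS is the mean square of the term used as denominator in the F‐test, n is the number of plots per mean and t is the two‐sided 5% critical value of the Student t distribution taken from standard tables. That PLABSTAT uses exactly this formula is an assumption; it does reproduce the printed values.

```python
import math

# Two-sided 5% critical t-values from standard tables, keyed by degrees of freedom.
T_CRIT_5PCT = {3: 3.182, 8: 2.306, 12: 2.179}


def lsd(ms_denominator, plots_per_mean, df_denominator):
    """Least significant difference between two means at the 5% level."""
    t = T_CRIT_5PCT[df_denominator]
    return t * math.sqrt(2.0 * ms_denominator / plots_per_mean)


# Mean squares and DF-DN taken from the RYTHA ANOVA table above.
lsd_l = lsd(15.1252, 10, 3)     # L tested against R:L; 5 genotypes x 2 reps per location mean
lsd_g = lsd(165.8682, 6, 8)     # G tested against GL; 3 locations x 2 reps per genotype mean
lsd_gl = lsd(111.0445, 2, 12)   # GL tested against RGL; 2 reps per cell mean

print(round(lsd_l, 2), round(lsd_g, 2), round(lsd_gl, 2))
```

The large LSD5 for G follows directly from the large GL mean square in the denominator and the small number of plots per genotype mean.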
  
--------- Subdivision of two-way table G * L  --------- 
COMPOUND ANOVA 
 Source of variation       DF         SS            MS           Varcomp     Fvalue 
 G *L                       8      1326.9457      165.8682       27.4119      1.49   
   Non-additiv (TUKEY)      1         1.5572        1.5572       -6.2595      0.01   
   Het.Regr.G               3       264.4549       88.1516      -28.4489      0.34   
   Het.Regr.L               1       284.3989      284.3989        2.5554      1.10   
   Deviat. from regr.       3       776.5347      258.8449       73.9002      2.33   
 Regr.coeff. of interaction effects on product of both main effects  C = -0.00964 
 
Estimates for the factor G          
 Level       Mean   Corr.   Regr.       MSdev     MSentry  MSinteract.   MSdevXHY 
 -------------------------------------------------------------------------------- 
   1 Santo   22.7  0.6729   1.692      174.19      159.17       99.15      218.09 
   2 Jonat   16.2  0.6794   1.341      105.69       98.16       55.78      161.67 
   3 Resis   24.0 -0.2993  -0.387       76.59       42.06       86.73       92.67 
   4 Xushu   29.6  0.5906   1.350      171.30      131.54       88.74       15.54 
   5 Tanza   13.4  0.9745   1.004        2.69       26.72        1.34       24.94 
 -------------------------------------------------------------------------------- 
 
 Estimates for the factor L          
 Level       Mean   Corr.   Regr.       MSdev     MSentry  MSinteract.   MSdevXHY 
 -------------------------------------------------------------------------------- 
   1 Chicl   25.8  0.3814   0.494       79.27       69.57       70.11       46.70 
   2 La_Mo   22.0  0.8274   1.741       77.37      183.95       80.83      130.76 
   3 San_R   15.8  0.8114   0.766       16.86       37.01       14.93       41.95 
 -------------------------------------------------------------------------------- 
 
 
These are the results for the stability ANOVA. Note: neither the heterogeneity due to the regression on
genotypes (Het.Regr.G) nor the heterogeneity due to the regression on locations (Het.Regr.L) is
significant. Important stability parameters for genotypes and locations are: the slope of the regression
line (Regr.), which should be close to 1; the deviations from the regression line (MSdev), which should be
close to zero; and the ecovalence (MSinteract.), which should be low.
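
The slope in the column Regr. for the factor G can be retraced with a short Python sketch. It assumes that the slope is the ordinary least squares regression of the genotype means at each location on the location means. The cell means are the averages of the two replications of RYTHA in the example data. This is an illustration only and not the PLABSTAT implementation; MSdev and the ecovalence are not computed here.

```python
# RYTHA cell means (average of two replications) at Chiclayo, La_Molina, San_Ramon.
cell_means = {
    "SantoAmaro": [37.040, 13.335, 17.700],
    "Jonathan":   [27.445,  8.890, 12.145],
    "Resisto":    [17.815, 30.740, 23.330],
    "Xushu18":    [29.260, 41.260, 18.330],
    "Tanzania":   [17.225, 15.555,  7.555],
}


def regression_slopes(cells):
    """Slope of each genotype's cell means regressed on the location means."""
    n_loc = len(next(iter(cells.values())))
    loc_means = [sum(v[j] for v in cells.values()) / len(cells) for j in range(n_loc)]
    grand_mean = sum(loc_means) / n_loc
    x_dev = [m - grand_mean for m in loc_means]
    sxx = sum(d * d for d in x_dev)
    slopes = {}
    for name, y in cells.items():
        y_mean = sum(y) / n_loc
        sxy = sum(dx * (yj - y_mean) for dx, yj in zip(x_dev, y))
        slopes[name] = sxy / sxx
    return slopes


slopes = regression_slopes(cell_means)
for name, b in slopes.items():
    print(f"{name:10s} {b:6.3f}")
```

By construction the slopes average to 1 across genotypes, which is why a slope close to 1 indicates an average response to the environments.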
The yield differences between genotypes are remarkable, e.g. between Tanzania (13.4 t/ha) and Xushu18
(29.6 t/ha), but they are not significant due to the very large LSD5 of 17.15. Please note that ‘no
significant differences’ does not mean there are no differences; the large differences in the example
simply could not be verified at the 5% significance level because of the relatively large error. The
ratio of variance components in this experiment for genotype : genotype by location interaction : error
was 13.91 : 27.41 : 111.04 (i.e. 1 : 1.97 : 7.98). This is a very extreme ratio. Usually the ratio of variance
components is not so extreme in sweetpotato (see Grüneberg et al., 2005). This is also reflected by the
low heritability for storage root yield of 33.48. This is too low for ATs. Here, careful checking of data for
suspicious values is recommended. Indeed, Xushu18 had a yield of 23.70 t/ha in replication 1 in San_R, but
the data set lists 12.96 t/ha for replication 2 in San_R, where the recorded TRW value of 2.0 would
correspond to only about 2.96 t/ha. This is very suspect given that clone Xushu18 has a yield of
29.6 t/ha across locations. In such cases it would be worthwhile checking the
original data or setting this value to a missing value ‘*’ and re‐analysing the data.
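
The ratio of variance components and the heritability quoted above can be retraced as follows. The sketch assumes the usual heritability on an entry‐mean basis, i.e. the genotypic variance divided by the variance of a genotype mean over l locations and r replications. The source does not show the formula PLABSTAT uses, so this is an assumption; it does reproduce the printed value of 33.48.

```python
def entry_mean_heritability(var_g, var_gl, var_error, n_loc, n_rep):
    """Heritability of genotype means across n_loc locations and n_rep replications."""
    phenotypic = var_g + var_gl / n_loc + var_error / (n_loc * n_rep)
    return var_g / phenotypic


# Variance components (Var.cp) for RYTHA from the PLABSTAT table: G, GL and RGL (error).
var_g, var_gl, var_error = 13.9130, 27.4119, 111.0445

ratio = (1.0, var_gl / var_g, var_error / var_g)
h2 = entry_mean_heritability(var_g, var_gl, var_error, n_loc=3, n_rep=2)

print("ratio G : GL : error = 1 : %.2f : %.2f" % (ratio[1], ratio[2]))
print("heritability = %.2f%%" % (100 * h2))
```

The formula also shows why the heritability is low here: the error variance, even divided by six plots, is larger than the genotypic variance.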
The results of the ANOVA of DM are next. There are striking differences among the DM contents of
genotypes (F‐value of 3.68), but the F‐test is not significant at the 5% level; the ‘+’ after the F‐value
marks significance at the 10% level only. Differences between individual genotypes should therefore be
interpreted with caution, even where they exceed the LSD5 of 4.46. The genotype by environment
interaction is significant (F‐value of 4.42), but we are not interested in comparing the means of
genotypes by location, since we considered the factor location to be random. The heritability for
storage root DM content is high (72.83), which is typical for this quality trait.
 
----------  Character  2  DM  ---------- 
 Missing data    1 
     4  Iterations 
        38    27.772 
  
 Please check for outliers (test after ANSCOMBE and TUKEY) 
  
 Source   DF     SS       MS     Var.cp s(V.cp)    F   DF-NM  DF-DN  s.e.  LSD5 
 L         2   55.0616  27.5308  1.6423  2.0696  2.48   2.00   3.00  1.05  4.74 
 G         4  165.0367  41.2592  5.0079  4.0572  3.68+  4.00   8.00  1.37  4.46 
 GL        8   89.6934  11.2117  4.3376  2.5559  4.42*  8.00  11.00  1.13  3.51 
 R:L       3   33.3222  11.1074  1.7142  1.4190  4.38*  3.00  11.00  0.71  2.22 
 RGL      11   27.9010   2.5365  2.5365  0.9949 
 Total    28  371.0149 
 HERITAB  72.83 (-144.20  94.62) 
 Note:  Tests approximative, since treatment variances are overestimated in case of 
missing data 
 ***  NO CORRECTION OF DEGREES OF FREEDOM FOR MISSING VALUES IN SUBINT 
 
 
We do not examine the results of the stability analysis of DM, since stability analysis of plant quality 
parameters like dry matter, starch, sugars, carotenoids and minerals is usually not very useful. 
Results of the analysis of BC come next. There are significant differences among the BC contents of 
genotypes (F‐value of 29.04**), so there will be differences among genotypes that exceed the LSD5 
(139.53). The genotype by environment interaction is not significant (F‐value of 1.29). The heritability for 
BC content is remarkably high (96.56) with a 95% confidence lower limit of 69.05 and upper limit of 99.32. 
Note: There is a suspect value of 572.4 for Jonathan in replication 1 at La Molina. The analysis of BC can
probably be improved by setting the highly unlikely value of 572.4 to the missing value ‘*’.
 
 ----------  Character  3  BC  ---------- 
 Missing data    2 
     4  Iterations 
         5    72.175       38   -19.783 
  
 Please check for outliers (test after ANSCOMBE and TUKEY) 
 Suspect%  117    in RGL     1   2   2    obs. =  572.4 
 Suspect% -117    in RGL     2   2   2    obs. =  174.8 
  
 Source  DF       SS         MS        Var.cp    s(V.cp)    F    DF-NM DF-DN  s.e.   LSD5 
 L        2  114840.118  57420.0589  4854.7080  4098.8090  6.47+  2.00  3.00 29.79 134.06
 G        4 1275770.544 318942.6361 51326.5752 30701.1858 29.04** 4.00  8.00 42.78 139.53
 GL       8   87865.478  10983.1847  1239.9849  3007.3596  1.29   8.00 10.00 65.20 205.46 
 R:L      3   26618.936   8872.9785    73.9527  1319.7375  1.04   3.00 10.00 41.24 129.95 
 RGL     10   85032.149   8503.2149  8503.2149  3471.4229 
 Total   27 1590127.224 
 HERITAB 96.56 (69.05  99.32) 
 Note:  Tests approximative, since treatment variances are overestimated in case of 
missing data 
 ***  NO CORRECTION OF DEGREES OF FREEDOM FOR MISSING VALUES IN SUBINT 
 
 
4.4. Computations for our example using SAS
SAS input 
Now we illustrate how to fit the model with SAS. The stability analysis is not possible here but CIP will 
make SAS‐IML programs available for region analysis and AMMI. 
In the first section, the data is loaded. 
‐   The first statement line gives a name for the data, in this case all. 
‐   The second statement line gives the names of the variables in the data. Non‐numeric variables must be 
followed by ‘$’. 
‐   The cards statement indicates that the data lines follow immediately. Missing values are indicated 
with dots in SAS. 
‐  The run statement after the data lines reads the data. 
In the second section, proc means is used to calculate means, minima and maxima. 
‐   In the first line, the data=all statement tells proc means that the data set with name all must be used. 
‐   In the second line, var tells proc means which observation variables (here RYTHA, DM and BC) to use
when calculating the mean, standard deviation, minimum and maximum values.
‐   The statement run ends each proc section – here the section proc means. 
In the third section, proc glm is used to fit a general linear model (not to be confused with the generalised 
linear model) in order to get the ANOVA results. 
‐   In the first line, the data=all statement tells proc glm that the data with name all must be used. 
‐   The class statement indicates the classification variables (factors) that are going to be included in the 
analysis. 
‐   The model statement indicates the response variables (on the left of ‘=’) and the complete model
specification, that is, the fixed and random factors as well as their interactions (on the right of ‘=’).
Since proc glm assumes fixed effects, this statement alone gives an ANOVA in which all the effects
are considered as fixed. Note that the resulting F‐test for factor G would not be valid, since its mean
square must be compared with the interaction mean square and not with the error mean square.
‐   The random statement indicates the factors and interactions which are random. The /test option
asks for the F‐tests for these effects. Here SAS takes into account which effects are fixed and which are
random to calculate appropriate F‐ratios.
‐   Alternatively we can ask for specific tests with the statement test in a new line. For instance, the 
statement test H=G E=L*G will consider G as the main effect to evaluate and L*G as the error term for 
the F ratio. 
‐   The lsmeans statement computes least squares means (LS‐means). In this case we are asking LS‐means 
for the levels of factor G. After the ‘/’ some options are defined. cl requests confidence limits for the 
individual LS‐means or for differences between pairs. pdiff requests p‐values for differences of the LS‐
means, and for these differences several adjustments are available. Here, adjust=T signifies no 
adjustment for multiple comparisons, so a Student t‐distribution based confidence interval is 
computed. E=L*G specifies the effect of the model to use as the error term. 
‐   Finally the run statement tells SAS to run the proc glm computations. 
 
data all; 
input Y L $ GENO NAME $ G R RYTHA DM BC; 
cards; 
2006  Chiclayo   400011  SantoAmaro  1  1  48.89  33.33  -39.30 
2006  Chiclayo   400011  SantoAmaro  1  2  25.19  37.25  -60.43 
2006  Chiclayo   420014  Jonathan    2  1  34.30  28.88  146.07 
2006  Chiclayo   420014  Jonathan    2  2  20.59  32.60  . 
2006  Chiclayo   440001  Resisto     3  1  13.78  26.59  442.26 
2006  Chiclayo   440001  Resisto     3  2  21.85  26.98  204.52 
2006  Chiclayo   440025  Xushu18     4  1  20.00  30.28  -60.14 
2006  Chiclayo   440025  Xushu18     4  2  38.52  29.08  -92.69 
2006  Chiclayo   440166  Tanzania    5  1   5.56  33.40  -93.45 
2006  Chiclayo   440166  Tanzania    5  2  28.89  37.80  -97.61 
2006  La_Molina  400011  SantoAmaro  1  1  17.78  31.69  -26.88 
2006  La_Molina  400011  SantoAmaro  1  2   8.89  31.04  -18.87 
2006  La_Molina  420014  Jonathan    2  1  12.59  25.79  572.40 
2006  La_Molina  420014  Jonathan    2  2   5.19  26.77  174.80 
2006  La_Molina  440001  Resisto     3  1  28.15  23.61  629.40 
2006  La_Molina  440001  Resisto     3  2  33.33  23.61  653.90 
2006  La_Molina  440025  Xushu18     4  1  37.78  30.63  -14.03 
2006  La_Molina  440025  Xushu18     4  2  44.74  29.70  -13.87 
2006  La_Molina  440166  Tanzania    5  1  10.37  32.47  -12.51 
2006  La_Molina  440166  Tanzania    5  2  20.74  32.74   -7.59 
2006  San_Ramon  400011  SantoAmaro  1  1  26.81  30.65  -21.46 
2006  San_Ramon  400011  SantoAmaro  1  2   8.59  .     . 
2006  San_Ramon  420014  Jonathan    2  1  12.59  35.25  119.70 
2006  San_Ramon  420014  Jonathan    2  2  11.70  28.83  147.60 
2006  San_Ramon  440001  Resisto     3  1  14.96  33.00  498.10 
2006  San_Ramon  440001  Resisto     3  2  31.70  28.94  483.30 
2006  San_Ramon  440025  Xushu18     4  1  23.70  33.04   -8.11 
2006  San_Ramon  440025  Xushu18     4  2  12.96  31.21  -12.53 
2006  San_Ramon  440166  Tanzania    5  1   9.04  33.93  -12.87 
2006  San_Ramon  440166  Tanzania    5  2   6.07  34.73  -14.84 
run; 
 
proc means data=all; 
var RYTHA DM BC; 
/* proc means computes the mean, the standard deviation, the minimum and
maximum for each variable in the data. Note: this text is a comment. A SAS comment
starts with slash-asterisk and ends with asterisk-slash. */
run; 
 
proc glm data=all; 
class L G R;
model RYTHA DM BC = L G L*G R(L);
random L L*G R(L) /test; 
/* test H=G E=L*G */ 
lsmeans G / cl pdiff adjust=T E=L*G; 
 
/* Two further important multiple comparison procedures 
1) the Tukey test, which compares all possible differences among the factor levels of G – 
in our example 5*(5-1)/2 = 10 differences. Note with more and more differences the power 
of a test goes down */ 
/*lsmeans G / pdiff=all cl adjust=tukey E=L*G;*/ 
/* 
2) the Dunnett test, which allows us to compare against a control. Here we test against 
the factor level 2 of the factor G - this is the variety Jonathan – and we test if the 
noncontrol levels are greater than the control */ 
/*lsmeans G / pdiff=controlu('2') cl adjust=dunnett E=L*G; */ 
run; 
 
 
SAS output 
Here we have the expected mean squares for each source of variation and the ANOVA results computed
with proc glm for the RYTHA variable:
 
The GLM Procedure 
 
Source                  Type III Expected Mean Square 
 
L                        Var(Error) + 5 Var(R(L)) + 2 Var(L*G) + 10 Var(L) 
G                        Var(Error) + 2 Var(L*G) + Q(G) 
L*G                      Var(Error) + 2 Var(L*G) 
R(L)                     Var(Error) + 5 Var(R(L)) 
 
 
Please note that mean square estimates are not variance component estimates! However, variance 
component estimates can be calculated from mean square estimates with the above equations!                                         
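
As a worked illustration of this remark, the following Python sketch solves the expected mean square equations above for the variance components of RYTHA, using the mean squares reported for this variable. It is a sketch of the arithmetic only; the variable names are hypothetical, and no component is computed for G because G enters the equations as a fixed effect, Q(G).

```python
# Mean squares for RYTHA (balanced data, no missing values).
ms = {"L": 251.8323, "L*G": 165.8682, "R(L)": 15.1252, "Error": 111.0445}

# Solve the equations from the bottom up:
#   R(L): Var(Error) + 5 Var(R(L))
#   L*G : Var(Error) + 2 Var(L*G)
#   L   : Var(Error) + 5 Var(R(L)) + 2 Var(L*G) + 10 Var(L)
var_error = ms["Error"]
var_rl = (ms["R(L)"] - var_error) / 5
var_lg = (ms["L*G"] - var_error) / 2
var_l = (ms["L"] - var_error - 5 * var_rl - 2 * var_lg) / 10

print("Var(Error) = %.4f" % var_error)
print("Var(R(L))  = %.4f" % var_rl)   # negative estimate, usually interpreted as zero
print("Var(L*G)   = %.4f" % var_lg)
print("Var(L)     = %.4f" % var_l)
```

The estimates for L*G and R(L) agree with the Var.cp column of the PLABSTAT table. For L the PLABSTAT table lists 23.67, which equals (MS(L) - MS(R:L)) / 10, so the two programs appear to use different expectations for this term.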
 
The GLM Procedure 
Tests of Hypotheses for Mixed Model Analysis of Variance 
 
Dependent Variable: RYTHA 
 
   Source                        DF     Type III SS     Mean Square    F Value    Pr > F 
 
   L                              2      503.664540      251.832270       3.60    0.3335 
   Error                      1.077       75.337912       69.948928 
   Error: MS(L*G) + MS(R(L)) - MS(Error) 
 
   Source                        DF     Type III SS     Mean Square    F Value    Pr > F 
 
   G                              4      997.383900      249.345975       1.50    0.2885 
   Error: MS(L*G)                 8     1326.945760      165.868220 
 
   Source                        DF     Type III SS     Mean Square    F Value    Pr > F 
 
   L*G                            8     1326.945760      165.868220       1.49    0.2560 
   R(L)                           3       45.375490       15.125163       0.14    0.9365 
   Error: MS(Error)              12     1332.533460      111.044455 
 
 
In contrast to PLABSTAT, effects significant at the 5% or 1% level are not marked by ‘*’ or ‘**’.
Instead you get the probability of the F‐value (Pr > F) directly. In our case, 0.2885 is the probability of
obtaining a result at least as extreme as the observed one if the statement “no effect due to
genotypes” is true.
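
The error line for L in the RYTHA table above is a synthesized error term, MS(L*G) + MS(R(L)) - MS(Error), with fractional degrees of freedom. The sketch below retraces it with the Satterthwaite approximation, which proc glm applies to such linear combinations of mean squares. The function name is hypothetical and only the arithmetic is shown; the p‐value is not computed.

```python
def satterthwaite(terms):
    """Mean square and approximate degrees of freedom of a linear combination
    of mean squares. terms is a list of (coefficient, mean_square, df)."""
    ms = sum(c * m for c, m, _ in terms)
    df = ms ** 2 / sum((c * m) ** 2 / d for c, m, d in terms)
    return ms, df


# Error term for L in the RYTHA analysis: MS(L*G) + MS(R(L)) - MS(Error).
terms = [
    (1.0, 165.868220, 8),    # MS(L*G)
    (1.0, 15.125163, 3),     # MS(R(L))
    (-1.0, 111.044455, 12),  # MS(Error)
]
ms_error_l, df_error_l = satterthwaite(terms)
f_l = 251.832270 / ms_error_l

print("MS = %.6f, DF = %.3f, F = %.2f" % (ms_error_l, df_error_l, f_l))
```

With only about one denominator degree of freedom, the test for L has very little power in this example, which is consistent with the large Pr > F of 0.3335.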
For each response variable in the model statement of the procedure glm an output table is printed. In our 
example the expected mean squares for each source of variation and the ANOVA results computed for 
the DM variable are given below: 
                                   
The GLM Procedure 
Source                  Type III Expected Mean Square 
 L                        Var(Error) + 4.6429 Var(R(L)) + 1.8571 Var(L*G) + 9.2857 Var(L) 
 G                        Var(Error) + 1.875 Var(L*G) + Q(G) 
 L*G                      Var(Error) + 1.9 Var(L*G) 
 R(L)                     Var(Error) + 4.6667 Var(R(L)) 
 
                                        The GLM Procedure 
                      Tests of Hypotheses for Mixed Model Analysis of Variance 
 
Dependent Variable: DM 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F 
 
  L                                2       52.851783       26.425892       1.70    0.2608 
  Error                       5.9894       93.312982       15.579698 
  Error: 0.9774*MS(L*G) + 0.9949*MS(R(L)) - 0.9723*MS(Error) 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F 
 
  G                                4      162.173806       40.543451       4.78    0.0286 
  Error                       8.0632       68.392971        8.482073 
  Error: 0.9868*MS(L*G) + 0.0132*MS(Error) 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F 
 
  L*G                              8       68.490787        8.561348       3.38    0.0328 
  R(L)                             3       29.182193        9.727398       3.84    0.0421 
  Error: MS(Error)                11       27.901058        2.536460 
 
 
There are significant differences among genotypes for DM at the 5% level (see Pr > F of 0.0286 for the
source of variation G, which is smaller than 0.05). The expected mean squares for
each source of variation and the ANOVA results for the BC variable are given below:
 
The GLM Procedure 
 
 Source                  Type III Expected Mean Square 
 
 L                        Var(Error) + 4.3077 Var(R(L)) + 1.7231 Var(L*G) + 8.6154 Var(L) 
 G                        Var(Error) + 1.7529 Var(L*G) + Q(G) 
 L*G                      Var(Error) + 1.8008 Var(L*G) 
 R(L)                     Var(Error) + 4.3333 Var(R(L)) 
 
 
                                        
 
The GLM Procedure 
Tests of Hypotheses for Mixed Model Analysis of Variance 
 
Dependent Variable: BC 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F

  L                                2          102657           51329       5.02    0.1303
  Error                       2.5552           26134           10228 
  Error: 0.9569*MS(L*G) + 0.9941*MS(R(L)) - 0.9509*MS(Error) 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F 
 
  G                                4         1222768          305692      28.29    <.0001 
  Error                       8.3421           90133           10805 
  Error: 0.9734*MS(L*G) + 0.0266*MS(Error) 
 
  Source                          DF     Type III SS     Mean Square    F Value    Pr > F 
 
  L*G                              8           86939           10867       1.28    0.3512 
  R(L)                             3           23887     7962.430024       0.94    0.4589 
  Error: MS(Error)                10           85032     8503.214093 
 
 
There are significant differences among genotypes for BC at the 1% level – see Pr > F of <.0001 for the 
source of variation G, which is smaller than 0.01. 
Considering the output of multiple comparison procedures below, first we have the LS‐means values for
the RYTHA variable. LS‐means are better estimates than the mean values for the observed variables in
cases where there are missing values. The results of the multiple comparison procedure based on a
Student t‐test are not given for RYTHA because the G effect was not significant in the F‐test of the
ANOVA. However, not significant does not mean that there are no differences. ‘Not significant’ in a
statistical test means that the observed differences between factor levels are not significant compared to
the error (in our case L*G). In cases where a significant L*G effect and a non‐significant G effect are
observed, care has to be taken. Compare variances and variance components due to G and L*G, and if the
variance component due to interaction is larger than the variance component due to the main effect
genotype, it becomes interesting to look at the performance of genotypes across locations. Usually main
effects are larger than interaction effects in biology. In our case neither the main effect G (Pr > F of
0.2885) nor the interaction effect L*G (Pr > F of 0.2560) is significant for RYTHA, but the variance
component due to L*G (27.41) is about twice that due to G (13.91), so the genotypes may react
differently across locations. There might be patterns in the response of genotypes to locations
(genotypes with different adaptation to locations).
Least Squares Means 
Standard Errors and Probabilities Calculated Using the Type III MS for L*G as an Error 
Term 
                                          RYTHA      LSMEAN 
                              G          LSMEAN      Number 
                              1      22.6916667           1 
                              2      16.1600000           2 
                              3      23.9616667           3 
                              4      29.6166667           4 
                              5      13.4450000           5 
 
 
Second, we present here the LS‐means for the variable DM together with the p‐values for differences 
between pairs of LS‐means based on a Student t‐test: 
 Least Squares Means 
Standard Errors and Probabilities Calculated Using the Type III MS for L*G as an Error 
Term 
                                                     LSMEAN 
                              G       DM LSMEAN      Number 
 
                              1      31.9554167           1
                              2      29.6866667           2
                              3      27.1216667           3 
                              4      30.6566667           4 
                              5      34.1783333           5 
 
                           Least Squares Means for effect G 
                           Pr > |t| for H0: LSMean(i)=LSMean(j) 
 
                                  Dependent Variable: DM 
 
        i/j              1             2             3             4             5 
 
           1                      0.2566        0.0315        0.5041        0.2655 
           2        0.2566                      0.1674        0.5816        0.0289 
           3        0.0315        0.1674                      0.0697        0.0031 
           4        0.5041        0.5816        0.0697                      0.0706 
           5        0.2655        0.0289        0.0031        0.0706 
 
 
In contrast to PLABSTAT the LSD is not given. SAS uses another method here to present the results of the
Student t‐test: the p‐values of each comparison. For our example, the Student t‐test indicates significant
differences between DM LS‐means 1 and 3, 2 and 5, and 3 and 5 (see p‐values < 0.05).
Another way to present the results of the Student t‐test follows below: the confidence limits of the
LS‐mean values. In our example the DM LS‐mean of genotype 5 is estimated as 34.178333% and there is
95% confidence that the computed interval from 31.423752 to 36.932914% contains the ‘true mean
value’ of genotype 5. Genotypes with non‐overlapping confidence limits are significantly different for the
observed variable.
                      
       G       DM LSMEAN      95% Confidence Limits 
 
                      1       31.955417       28.676810    35.234024 
                      2       29.686667       26.932086    32.441248 
                      3       27.121667       24.367086    29.876248 
                      4       30.656667       27.902086    33.411248 
                      5       34.178333       31.423752    36.932914 
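
The confidence limits in this table can be retraced for the genotypes without missing values. The sketch assumes limits of the form LS‐mean ± t * sqrt(MS(L*G) / n), with the Type III mean square for L*G of DM (8.561348) as error term, n = 6 plots per genotype and the two‐sided 5% critical t‐value for 8 degrees of freedom from standard tables. Genotype 1 has one missing DM value and therefore a wider interval, which this simple sketch does not cover.

```python
import math

T_975_DF8 = 2.306004  # two-sided 5% critical t-value for 8 degrees of freedom


def confidence_limits(lsmean, ms_error, n_plots, t_crit):
    """Lower and upper 95% confidence limits of an LS-mean."""
    half_width = t_crit * math.sqrt(ms_error / n_plots)
    return lsmean - half_width, lsmean + half_width


# DM LS-means of genotypes 2 to 5 (complete data) from the table above.
lsmeans = {2: 29.686667, 3: 27.121667, 4: 30.656667, 5: 34.178333}
limits = {g: confidence_limits(m, 8.561348, 6, T_975_DF8) for g, m in lsmeans.items()}

for g, (lower, upper) in limits.items():
    print("G %d  %.6f  %.6f" % (g, lower, upper))
```

All four intervals have the same width because the same error term and the same number of plots are used for each of these genotypes.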
 
 
Finally, for the BC variable we have the following LS‐means, p‐values for pairwise differences and confidence limit estimates:
                                 Least Squares Means 
Standard Errors and Probabilities Calculated Using the Type III MS for L*G as an Error Term
 
                                                   LSMEAN 
                            G       BC LSMEAN      Number 
 
                            1      -31.120417           1 
                            2      205.457500           2 
                            3      485.246667           3 
                            4      -33.561667           4 
                            5      -39.811667           5 
 
 
                           Least Squares Means for effect G 
                          Pr > |t| for H0: LSMean(i)=LSMean(j) 
 
                                Dependent Variable: BC 
 
      i/j              1             2               3             4             5 
   
         1                      0.0108          <.0001        0.9715        0.8987 
         2        0.0108                        0.0029        0.0069        0.0060
         3        <.0001        0.0029                        <.0001        <.0001
         4        0.9715        0.0069        <.0001                        0.9198 
         5        0.8987        0.0060        <.0001        0.9198 
 
 
                    G       BC LSMEAN      95% Confidence Limits 
 
                    1      -31.120417     -147.930741    85.689908 
                    2      205.457500       88.647175   322.267825 
                    3      485.246667      387.106364   583.386969 
                    4      -33.561667     -131.701969    64.578636 
                    5      -39.811667     -137.951969    58.328636 
 
 
Here we get significant differences between BC LS‐mean 1 and 2, LS‐mean 1 and 3, LS‐mean 2 and 3, LS‐
mean 2 and 4, LS‐mean 2 and 5, LS‐mean 3 and 4, and LS‐mean 3 and 5 (see p‐values < 0.05). 
4.5. Multiple comparison procedures in plant breeding 
In our output examples we presented the Student t‐test. In our SAS input example we also gave the Tukey
test and the Dunnett test in a statement set as a comment (comments in SAS have the /* comment */
syntax). Which test should be used? The Student t‐test is informative, but it controls the 5% error level
only for comparisons of all differences among up to three lsmeans (a factor with three levels), and only if
the F‐test has already shown significant differences. What happens when you have more lsmeans? There
can be situations where the F‐test is not significant and yet Student t‐tests find significant differences,
which is a problem.
The results of the F‐test and of multiple comparison procedures should be consistent. For this reason the
Tukey test was developed, which allows comparisons among all lsmeans and controls the 5% error level so
that F‐test and Tukey test results are largely consistent (the Tukey test will very rarely give you a
significant difference if the F‐test is not significant). However, as the number of comparisons increases,
the precision/power of the tests declines. There are cases in which the F‐test is significant but the Tukey
test finds no significant difference, especially when there are many comparisons. With fewer comparisons
you have higher power; for example, with five genotypes you have 5 × (5 – 1)/2 = 10 pairwise
comparisons, but only 4 comparisons against a single check. In plant breeding, sufficient information is
usually obtained by comparing with a check, and often only what is significantly larger (or smaller) than
the check genotype is of interest. For this type of comparison the Dunnett test was developed. The
Dunnett test controls the 5% level for tests against one check (in such cases it has higher power than the
Tukey test). There are three test possibilities: (i) smaller or larger than the check, (ii) smaller than the
check and (iii) larger than the check. The latter two (one‐sided) tests have higher precision/power than
the first (two‐sided) test strategy, and all three control the 5% level over all comparisons against the
check (so significant differences from the check are unlikely when the F‐test shows no significant
differences).
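
The arithmetic behind these statements can be sketched in R. The example counts all pairwise comparisons (the Tukey situation) and the comparisons against one check (the Dunnett situation). It also shows how quickly the chance of at least one false positive grows if every comparison is made as an unadjusted test at the 5% level. Treating the comparisons as independent is a simplifying assumption of ours, so the error rates are rough guides only.

```r
# number of genotypes (lsmeans) in the trial
n_genotypes <- c(3, 5, 20, 60)

# all pairwise comparisons (Tukey situation): n * (n - 1) / 2
all_pairs <- choose(n_genotypes, 2)

# comparisons against a single check (Dunnett situation): n - 1
vs_check <- n_genotypes - 1

# chance of at least one false positive if every comparison were an
# independent, unadjusted test at the 5% level (simplifying assumption)
alpha <- 0.05
fwer_pairs <- 1 - (1 - alpha)^all_pairs
fwer_check <- 1 - (1 - alpha)^vs_check

print(data.frame(n_genotypes, all_pairs, vs_check,
                 fwer_pairs = round(fwer_pairs, 3),
                 fwer_check = round(fwer_check, 3)))
```

With five genotypes the ten unadjusted pairwise tests already give a chance of about 40% of at least one false positive under this assumption, which is why adjusted procedures are needed once more than a few lsmeans are compared.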
How important are these multiple comparison procedures in plant breeding? They are important when
you want to release varieties at the final stages of a breeding program. You want to know, with ≤ 5%
error (not more), which of your best candidates is better than the most widely grown variety (i.e. the
check variety) in at least one variable (trait). You want to present to national authorities, with ≤ 5% error,
the information that your new genotype is better than existing varieties in at least one relevant variable
(trait); this can be considered a quality label for a new genotype. The test situation is that there are very
few top genotypes and these have to be compared with a standard or check. In this situation, exact
multiple comparison procedures are useful in plant breeding and they provide a quality label for the
released material. However, the most important issue is that none of the top clones is close to or below
the lowest acceptable value (according to the needs of farmers) in any trait.
In early breeding stages (OTs or PTs), forget exact multiple comparison procedures and the 5% level. The
F‐test and the LSD in PLABSTAT are more than enough. Realise that you must work with thousands of new
genotypes to increase the chance of having ‘good’ genotypes among your material. If you compare them
with exact multiple comparison procedures (e.g. with 5000 genotypes you will have 12,497,500
comparisons with the Tukey test and 4999 with the Dunnett test), the power of these comparisons is
extremely low: you find many striking and interesting differences but nearly none that are significant at
the ≤ 5% level. Note: multiple comparison procedures in statistics are designed to control the alpha error
at 5%. This means that, if in truth no genotype is better than the others, the statement that one is better
than another will be wrong in, on average, only 5 out of 100 cases.
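
The loss of power can be made visible by comparing the critical differences of the two tests. The R sketch below expresses the difference that two means must exceed to be declared significant in units of the standard error of a difference, for the unadjusted t‐test (LSD) and for the Tukey test. The error degrees of freedom (df_err = 60) and the numbers of means are hypothetical values chosen for illustration.

```r
alpha  <- 0.05
df_err <- 60                   # hypothetical error degrees of freedom
k      <- c(2, 3, 5, 20, 60)   # number of means compared

# critical difference in units of the standard error of a difference
# unadjusted t-test (LSD): does not depend on the number of means
lsd <- qt(1 - alpha / 2, df_err)

# Tukey test: studentized range quantile divided by sqrt(2)
hsd <- sapply(k, function(n) qtukey(1 - alpha, nmeans = n, df = df_err)) / sqrt(2)

print(data.frame(k, lsd = round(lsd, 2), hsd = round(hsd, 2)))
```

For two means both tests require the same difference. As the number of means grows, the Tukey critical difference keeps increasing while the LSD stays constant, so ever larger differences are needed before a pair is declared significant.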
When you work with many genotypes the beta‐error becomes increasingly important. This is the error
you make when you state “not different” although a difference exists. In early breeding stages, provided
you have made good crosses, it is nearly certain that among the thousands of new genotypes there are
some that are better than the best widely grown genotypes, even though your statistical analysis
indicates that they are of equal or lower value. Multiple comparison procedures like LSD, Tukey and
Dunnett do not control the beta‐error. To control the beta‐error, selection procedures must be used in
which candidates are discarded step by step to enrich the frequency of genotypes with good performance
over all variables (traits) in the selected fraction. However, beta‐error controlled multiple comparison
procedures are still a research field in mathematical statistics. Where a breeder has made good crosses,
there are more than a few good genotypes in the population, and it is no problem to discard some good
genotypes as long as the frequency of good genotypes in the selected fraction is clearly increased. A
parameter that measures the gain in ‘good’ genotypes in the selected fraction is the response to
selection, which can be estimated by statistical procedures of quantitative genetics and selection theory.
At the end of this chapter we give suggestions for selecting (or discarding) genotypes in advanced
breeding material.
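
As a minimal illustration of the response to selection, the R sketch below uses the standard breeder's equation of quantitative genetics, R = i × h² × σp. This is a textbook formula, not a procedure given in this manual. The selection intensity i is computed under the assumption of normally distributed phenotypes, and the heritability (h2) and phenotypic standard deviation (sigma_p) are hypothetical values.

```r
# selected fraction at one breeding stage
p_selected <- c(0.20, 0.10, 0.05)

# standardized selection intensity under a normal distribution:
# mean of the selected upper tail, in phenotypic standard deviations
intensity <- dnorm(qnorm(1 - p_selected)) / p_selected

# hypothetical values, for illustration only
h2      <- 0.4   # heritability of the trait
sigma_p <- 5     # phenotypic standard deviation (e.g. t/ha)

# expected response to selection: R = i * h2 * sigma_p
response <- intensity * h2 * sigma_p

print(data.frame(p_selected,
                 intensity = round(intensity, 3),
                 response  = round(response, 2)))
```

A smaller selected fraction gives a higher selection intensity and therefore a larger expected response, at the price of discarding more genotypes, including some good ones.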
4.6. Computations for our example using R 
R input 
R is a very flexible package, so there are usually several ways to do the same thing. Data can be read
from different formats (e.g. text, csv and xls files), and for each format there are different options. For
the example here, we read the data from a text file named ‘example.dat’. This text file has the same
structure as the data table we read in the SAS example. Likewise, there are different ways to fit a linear
model in R. For a model with only fixed effects, the commands lm or aov can be used. For linear mixed
effects models the most widely used packages are nlme and lme4 (R can be extended with additional
packages; for a list of all packages available on CRAN visit
https://cran.r-project.org/web/packages/available_packages_by_name.html). Below we show code for this
analysis, but before going into it, note two things about R: (i) everything in R is an object, so you must
give names to all the objects you use (data tables, analysis results, etc.), and (ii) the symbol ‘<-’ assigns
a value to a name.
‐   In the first line, we load the data stored in the file example.dat into an R object named all. The
argument header = TRUE indicates that the first row contains the column headers of the data table,
not values.
‐   In the second and third lines, we declare G and R as qualitative factors. Otherwise R would treat
them as quantitative variables, since they are coded with numbers in the data. (If L is also coded with
numbers in your data, convert it in the same way.)
‐   In the fourth line, we fit the linear model for the RYTHA trait. L*G means the effect of L plus the effect
of G plus the L by G interaction. R%in%L means that blocks (R) are nested in locations (L). The
results are stored in the object model.RYTHA. In this model, all factors are considered as fixed.
‐   In the fifth line, the command anova(model.RYTHA) extracts and prints the ANOVA results stored in
the model.RYTHA object.
‐   The same steps are repeated for the DM and BC variables. These variables have some missing
values. Missing values must be coded as NA in the data (NA stands for Not Available). Because
PLABSTAT, SAS and R treat missing values differently, the sums of squares differ slightly among the
three packages.
‐   Finally, in the last four lines a mixed model for RYTHA is fitted, with G as fixed and all other factors
as random. This model is fitted with the command lmer from the package lme4. The line
library(lme4) loads the package lme4 into the work environment. Then the commands anova and
summary extract from the object model.RYTHA.R the main results for the fixed and random parts of
the model, respectively.
 
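all <- read.table("example.dat", header = TRUE)   # load the data table; first row holds the headers
all$G <- as.factor(all$G)                         # genotype as qualitative factor
all$R <- as.factor(all$R)                         # block as qualitative factor
model.RYTHA <- lm(RYTHA ~ L*G + R%in%L, data = all)   # fixed effects: L, G, L:G, R within L
anova(model.RYTHA)                                # ANOVA table for RYTHA
model.DM <- lm(DM ~ L*factor(G) + factor(R)%in%L, data = all)   # same model for DM
anova(model.DM)                                   # ANOVA table for DM
model.BC <- lm(BC ~ L*factor(G) + factor(R)%in%L, data = all)   # same model for BC
anova(model.BC)                                   # ANOVA table for BC
library(lme4)                                     # package for linear mixed models
model.RYTHA.R <- lmer(RYTHA ~ G + (1|L/R) + (1|L:G), data = all)   # G fixed; L, R within L and L:G random
anova(model.RYTHA.R)                              # F value for the fixed part (G)
summary(model.RYTHA.R)                            # variance components of the random part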
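
The coding of missing values can be checked before any model is fitted. The following R sketch reads a few rows in the layout of the data table and treats both NA and a lone dot (the missing‐value code of SAS) as missing. The rows are invented for illustration and are not taken from example.dat.

```r
# a few invented rows in the layout of the data table (not the trial data)
raw <- "L G R RYTHA DM
1 1 1 25.3 31.2
1 1 2 27.1 NA
1 2 1 19.8 .
1 2 2 22.4 29.5"

# treat both NA and a lone dot as missing values
dat <- read.table(text = raw, header = TRUE, na.strings = c("NA", "."))

# DM must still be numeric; count the missing values per column
str(dat$DM)
n_missing <- colSums(is.na(dat))
print(n_missing)

# lm() drops rows with missing values by default (na.action = na.omit)
fit <- lm(DM ~ RYTHA, data = dat)
n_used <- nobs(fit)
print(n_used)
```

If a column that should be numeric is read as text, a missing‐value code other than those listed in na.strings is usually the cause.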
 
 
 
 
R output 
Below is the output for the fixed effects models fitted with lm. For the ANOVA tables R prints out the p‐
values [Pr(>F)]. The first table belongs to the RYTHA model; in this output its response is labelled TYLDha.
 
Analysis of Variance Table 
Response: TYLDha 
            Df  Sum Sq Mean Sq F value Pr(>F) 
L            2  503.66  251.83  2.2679 0.1461 
factor(G)    4  997.38  249.35  2.2455 0.1249 
L:factor(G)  8 1326.95  165.87  1.4937 0.2560 
L:factor(R)  3   45.38   15.13  0.1362 0.9365 
Residuals   12 1332.53  111.04 
 
 
Analysis of Variance Table 
 
Response: DM 
            Df  Sum Sq Mean Sq F value    Pr(>F) 
L            2  63.524  31.762 12.5222  0.001462 ** 
factor(G)    4 182.837  45.709 18.0208 8.544e-05 *** 
L:factor(G)  8  58.585   7.323  2.8872  0.053227 . 
L:factor(R)  3  29.182   9.727  3.8350  0.042089 * 
Residuals   11  27.901   2.536 
--- 
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
Analysis of Variance Table 
 
Response: BC 
            Df  Sum Sq Mean Sq F value   Pr(>F) 
L            2  114310   57155  6.7216  0.01412 * 
factor(G)    4 1259027  314757 37.0162 5.76e-06 *** 
L:factor(G)  8   85880   10735  1.2625  0.35787 
L:factor(R)  3   23887    7962  0.9364  0.45887 
Residuals   10   85032    8503 
--- 
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
 
For the mixed model with only G as fixed, the output for the RYTHA trait is shown below. At the
beginning is the ANOVA table. Note that this table has only one entry, for the G factor, because this is the
only factor that is considered as fixed. Then, for the random factors, the estimates of the variance
components are given. Note that the variance estimated for blocks within locations (R:L) is zero.
 
Analysis of Variance Table 
  Df Sum Sq Mean Sq F value 
G  4 552.37  138.09  1.5033 
 
Random effects: 
 Groups   Name        Variance Std.Dev. 
 L:G      (Intercept) 37.004   6.083 
 R:L      (Intercept)  0.000   0.000 
 L        (Intercept)  8.596   2.932 
 Residual             91.861   9.584 
Number of obs: 30, groups:  L:G, 15; R:L, 6; L, 3 
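
The F value of 1.5033 for G can be cross‐checked against the fixed effects ANOVA table of the same trait. The R sketch below recomputes the F‐test for G twice from the mean squares printed above: once against the residual (all factors fixed) and once against the L:G interaction, the error term that the SAS output uses for the LS‐means. Attaching a p‐value with 4 and 8 degrees of freedom is the classical ANOVA approach and is our assumption here; lmer itself prints no p‐value.

```r
# mean squares of the first response (RYTHA model), copied from the lm ANOVA table
ms <- c(L = 251.83, G = 249.35, LG = 165.87, RL = 15.13, Residual = 111.04)

# G tested against the residual (all factors fixed): 4 and 12 degrees of freedom
f_fixed <- ms[["G"]] / ms[["Residual"]]
p_fixed <- pf(f_fixed, df1 = 4, df2 = 12, lower.tail = FALSE)

# G tested against the L:G interaction (L random): 4 and 8 degrees of freedom
f_mixed <- ms[["G"]] / ms[["LG"]]
p_mixed <- pf(f_mixed, df1 = 4, df2 = 8, lower.tail = FALSE)

print(round(c(f_fixed = f_fixed, p_fixed = p_fixed,
              f_mixed = f_mixed, p_mixed = p_mixed), 4))
```

The second ratio reproduces the F value printed for the mixed model. Testing G against the interaction gives a smaller F value on fewer denominator degrees of freedom, and hence a more conservative test than the one in the fixed effects table.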
 
 
 
 
4.7. Suggestions for selection in ATs 
In advanced breeding trials a relatively low number of genotypes has to be compared (20–60
genotypes), depending on the size of the breeding program. Usually, of about 100–300 genotypes
entering PTs and ATs, only 2–10 (not more) are tested for variety release. A common rule is to
select 5–20% at every breeding stage.
ATs 
The comparison of 20–60 genotypes still results in many multiple comparisons. We recommend first
determining the lowest acceptable value (according to the needs of farmers) for each variable (trait)
except yield. Discard all genotypes that do not meet the lowest acceptable value for every trait before
making further comparisons. Depending on the quality of your selection in previous breeding stages, you
should not have too many genotypes that meet or exceed the lowest acceptable value for all traits. Then
use the LSD for yield, and compare the best of the remaining genotypes with all other remaining
genotypes. Discard all genotypes whose difference from the best genotype exceeds the LSD. However,
this is a comparison only within your breeding material and does not provide information about the
performance of new genotypes relative to other breeding material and programs. Such information can
only be obtained from check clones comprising successful varieties from other sub‐regions and regions,
e.g. mega‐clones. A set of recommended mega‐clones to be used as check clones is described by
Eyzaguirre et al. (2009). These are Blesbok, CEMSA 74‐228 (CIP440034), Dagga (CIP199062.1), Xushu 18,
Brondal, Jonathan and Tanzania, which are available from CIP for distribution across regions. The CIP
sweetpotato breeders agreed to use Dagga and CEMSA 74‐228 as common check clones so that data can
be compared across breeding platforms relative to these two checks; both clones have been
released in several countries across Latin America and the Caribbean as well as Sub‐Saharan Africa
(exhibiting high yield with a low contribution to genotype by environment interaction). Note: Experienced
breeders often work with a larger set of check clones to obtain information and to characterise advanced
breeding clones (e.g. at CIP in Peru, CEMSA 74‐228, Dagga, Arne, Benjamin, Abigail, Isabelle, Sumy, Xushu
18 and Jonathan are used as check clones in PTs and ATs). Often they simply express the performance of
new advanced breeding clones relative to these checks. For example: 110% in yield relative to CEMSA
74‐228, 95% dry matter relative to Tanzania, 120% β‐carotene relative to Jonathan, and 100% SPVD
symptoms relative to Tanzania; such a genotype would surely be a very interesting clone for ETs within
our breeding program as well as for other breeding programs. It should be noted that, for effective
comparisons, adapted check mega‐clones should be identified for each region.
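
The selection steps described for ATs can be sketched in R as follows. All numbers are invented for illustration: the genotype means, the lowest acceptable dry matter value (min_dm), the LSD for yield (lsd_yield) and the yield of the check clone (check_yield) are hypothetical and must be replaced by the values of your own trial.

```r
# hypothetical AT results (means over locations); all numbers are invented
trial <- data.frame(
  genotype = c("G1", "G2", "G3", "G4", "G5"),
  yield    = c(32, 28, 35, 20, 30),   # storage root yield, t/ha
  dm       = c(30, 24, 31, 33, 29),   # dry matter, %
  stringsAsFactors = FALSE
)
min_dm      <- 26   # hypothetical lowest acceptable dry matter (%)
lsd_yield   <- 6    # hypothetical LSD (5%) for yield, t/ha
check_yield <- 27   # hypothetical yield of the check clone, t/ha

# step 1: discard genotypes below the lowest acceptable value of a trait other than yield
acceptable <- trial[trial$dm >= min_dm, ]

# step 2: keep genotypes whose yield is within one LSD of the best remaining genotype
best     <- max(acceptable$yield)
selected <- acceptable[best - acceptable$yield <= lsd_yield, ]

# step 3: express yield relative to the check clone (%)
selected$yield_rel_check <- round(100 * selected$yield / check_yield)

print(selected)
```

With more traits, step 1 simply combines one condition per trait. Step 3 is repeated for each check clone and trait of interest.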
ETs (these would correspond to second stage ATs) 
In these trials only very few genotypes are tested. These trials should be designed so that the results can
be used by national authorities as information for variety release (this depends on the country). In
addition, elite material from other breeding programs can and should be used, provided that quarantine
regulations allow the clones to be imported. The check clones should include the abovementioned
check clones, even though national authorities request only local checks (usually the most widely grown
varieties in the country). Here multiple comparison procedures that control the 5% error clearly make
sense. When there are few new elite clones (2–4) and a larger number of check clones (6–8), test each
elite clone against the check clones (Dunnett) to identify the traits in which your new genotype is
significantly different from the checks.
 
 
5. References 
Eyzaguirre R., S. Agili, M. Andrade, F. Diaz, K. Tjintoko Hadi, S. Tumwegamire and W.J. Grüneberg, 2009: 
Genotype by environment interactions of sweetpotato across regions. In: Proceedings of the 
15th Symposium of the ISTRC (2–6 November 2009, Lima, Peru). 
Grüneberg W.J., K. Manrique, D. Zhang and M. Hermann, 2005: Genotype × environment interactions for 
a diverse set of sweetpotato clones evaluated across varying ecogeographic conditions in Peru. 
Crop Sci. 45: 2160–2171. 
Grüneberg W.J., F. Diaz, R. Eyzaguirre, J. Espinoza, G. Burgos, T. zum Felde, M. Andrade and R. Mwanga, 
2009a: Heritability estimates for an accelerated breeding scheme (ABS) in clonally propagated 
crops ‐ using sweetpotato as a model. In: Proceedings of the 15th Symposium of the ISTRC (2–6 
November 2009, Lima, Peru). 
Grüneberg W.J., R. Mwanga, M. Andrade and J. Espinoza, 2009b: Breeding clonally propagated
crops. In: Plant breeding and farmer participation (S. Ceccarelli, E.P. Guimarães and
E. Weltzien, eds), FAO. pp. 275–322.
Grüneberg, W.J., D. Ma, R.O.M. Mwanga, E.E. Carey, K. Huamani, F. Diaz, R. Eyzaguirre, E. Guaf, M. Jusuf, 
A. Karuniawan, K. Tjintokohadi, Y.‐S. Song, S.R. Anil, M. Hossain, E.H.M. Shofiur Rahaman, S. 
Attaluri, K. Some, S. Afuape, K. Adofo, E. Lukonge, L. Karanja, J. Ndirigwe, G. Ssemakula, S. Agili, 
J.‐M. Randrianaivoarivony, M. Chiona, F. Chipungu, S. Laurie, J. Ricardo, M. Andrade, F. Rausch 
Fernandes, A.S. Mello, A. Khan, D.R. Labonte and G.C. Yencho, 2015: Advances in sweetpotato 
breeding from 1993 to 2012. In: Potato and Sweetpotato in Africa, Transforming the Value 
Chains for Food and Nutrition Security (J. Low, M. Nyongesa, S. Quinn and M. Parker, eds), CABI. 
pp.1‐77. 
Hahn S.K., 1982: Research priorities, techniques and accomplishments in sweet‐potato breeding at IITA.
In: Root Crops in Eastern Africa: Proceedings of a workshop held in Kigali, Rwanda, 23–27
November 1980. pp. 23–26.
Jones A., P.D. Dukes and J.K. Schalk, 1986: Sweetpotato breeding. In: Breeding Vegetable Crops. (M.J.
Bassett, ed), AVI, Westport, Connecticut. pp. 1–35.
Kukimura H., K. Komaki and M. Yoshinaga, 1990: Current progress of sweet potato breeding in Japan. 
JARQ 24: 169–174. 
Laurie S.M. and A.A. van den Berg, 2002: A review of recent progress in breeding sweet potato in South
Africa for resource poor farmers. In: Proceedings of the 12th Symposium of the ISTRC (Tsukuba,
Japan). Potential of root crops for food and industrial resources. Nakatani M. and Komaki K. (eds),
pp. 216–219.
Lebot V., 2010. Sweet potato. In: Root and Tuber Crops, Handbook of Plant Breeding 7 (J.E. Bradshaw, 
ed.), Springer Science & Business Media, pp. 97–125. 
 
 
Martin F., (ed.), 1983: Breeding new sweet potatoes for the tropics. Proc. Am. Soc. Hort. Sci., Tropical 
Regional, Vol. 27. 
Martin F.M. and A. Jones, 1986: Breeding sweet potatoes. In: Plant Breeding Reviews, 4: 313–345. (J. 
Janick, ed), John Wiley & Sons. 
Saladaga F.A., H. Takagi, S.J. Cherng and R.T. Opena, 1991: Handling and selecting improved clones and
true seed populations of sweetpotato. AVRDC International Cooperators Guide 91–348. AVRDC,
Tainan, Taiwan.
Tan S.L., M. Nakatani and K. Komaki, 2007: Breeding of Sweetpotato. In: Breeding Major Food Staples
(M.S. Kang and P.M. Priyadarshan, eds), Blackwell Publishing.
Westcott B., 1981: Two methods for early generation yield testing in winter wheat. In: Proc. of the 4th
meeting of the Biometrics in Plant Breeding Section of Eucarpia. INRA Poitiers, France, pp. 91–95.
See also Kempton R.A., 1984: The design and analysis of unreplicated field trials. Vortr.
Pflanzenzüchtg. 7, pp. 219–242.
Wilson J.E., F.S. Pole, N.E.J.M. Smit and P. Taufatofua, 1989: Sweetpotato breeding. Agro Facts, IRETA
Publications, Western Samoa.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
6. Appendices 
Appendix 1. Sweetpotato observational trial 
 
 
 
 
 
 
 
 
 
 
 
 
  
 
Appendix 2. Sweetpotato preliminary trial (PT) and advanced trial (AT)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Appendix 3. Soil groups in Africa as classified by the FAO 
Source: Africa: soil group distribution. Retrieved 7 June 2008, from Encyclopædia Britannica Online: 
http://www.britannica.com/eb/art-19257
 
 
 
1  Acrisols  Acrisols form on old landscapes that have an undulating topography and a humid tropical 
climate. Their natural vegetation is woodland, which in some areas has given way to tree 
savannah maintained by seasonal burning. The age, mineralogy and extensive leaching of these 
soils have led to low levels of plant nutrients, excess aluminium and high erodibility – all of 
which make agriculture problematic. Nevertheless, traditional shifting cultivation of acid‐
tolerant crops has adapted well to the conditions found in Acrisols. They occupy just under 8% 
of the continental land surface on Earth, covering areas throughout central and northern Latin 
America, Southeast Asia and West Africa. 
Acrisols are defined by the presence of a subsurface layer of accumulated kaolinitic clays 
where less than half of the ions available to plants are calcium, magnesium, sodium or 
potassium and also by the lack of an extensively leached layer below the surface horizon 
(uppermost layer). They are related taxonomically to the Oxisol soil order of the U.S. Soil 
Taxonomy. Related FAO soil groups originating in tropical climates and also containing layers 
with clay accumulations are Lixisols and Nitisols. 
2  Arenosols  Arenosols are sandy‐textured soils that lack any significant soil profile development. They 
exhibit only a partially formed surface horizon (uppermost layer) that is low in humus, and 
they are bereft of subsurface clay accumulation. Given their excessive permeability and low 
nutrient content, agricultural use of these soils requires careful management. They occupy 
about 7% of the continental surface area of the Earth, and are found in arid regions such as the 
Sahel of western Africa and the deserts of Western Australia, as well as in the tropical regions 
of Brazil. Arenosols are related to the sandy‐textured members of the Entisol order of the U.S. 
Soil Taxonomy. 
3  Calcisols  Calcisols are characterised by a layer of translocated (migrated) calcium carbonate – whether 
soft and powdery or hard and cemented – at some depth in the soil profile. They are usually 
well‐drained with fine to medium texture, and relatively fertile because of their high calcium 
content. Their chief use is for animal grazing. Occupying about 6.4% of the continental land 
surface of the Earth, these soils are typically encountered in arid or Mediterranean climatic 
zones (southwestern U.S., central and southern Argentina, central China, northern Africa and 
the Arabian Peninsula). 
Soils in the Aridisol, Inceptisol and Mollisol orders of the U.S. Soil Taxonomy show strong 
calcium carbonate accumulation and are therefore closely related to the Calcisols. Related FAO 
soil groups originating in arid regions and conditioned by limited leaching are Solonchak, 
Solonetz, Durisol and Gypsisol. 
4  Cambisols  Cambisols are characterised by the absence of a layer of accumulated clay, humus, soluble 
salts or iron and aluminium oxides. They differ from unweathered parent material in their 
aggregate structure, colour, clay content, carbonate content or other properties that give some 
evidence of soil‐forming processes. Because of their favourable aggregate structure and high 
content of weatherable minerals, they can usually be exploited for agriculture subject to the 
limitations of terrain and climate. Cambisols are the second most extensive soil group on Earth, 
occupying 12% of the total continental land area – mainly in boreal polar regions, in landscapes 
with high rates of erosion and in regions of parent material resistant to clay movement. They 
are not common in humid tropical climates. 
For a soil to qualify as a Cambisol, the texture of the subsurface horizons must be sandy loam
or finer, with at least 8% clay by mass and a thickness of 15 cm or more. These soils naturally
form on medium‐ to fine‐textured parent materials under any climatic, topographic and 
vegetative‐cover conditions. They differ from Leptosols and Regosols by their greater depth 
and finer texture and are often found in conjunction with Luvisols. 
5  Durisols  Durisols are soils in semiarid environments that have a substantial layer of silica within 1 m of 
the land surface. The silica occurs either as weakly cemented nodules or as hardpan and 
accumulates as a result of downward translocation (migration) when solubilised during 
weathering of the soil. Durisols are found in the southwestern U.S., Chile, South Africa and 
especially Australia, where rainfall is low. They usually occur in association with Arenosols, 
Calcisols, Cambisols, Gypsisols or Vertisols. Soils in the Aridisol and Vertisol orders of the U.S. 
Soil Taxonomy that exhibit hardened layers of silica accumulation are closely related to the 
Durisols. 
6  Ferralsols  Ferralsols are red and yellow weathered soils whose colours result from an accumulation of 
metal oxides, particularly iron and aluminium (from which the name of the soil group is 
derived). They are formed on geologically old parent materials in humid tropical climates, with 
rainforest vegetation growing in the natural state. Because of the residual metal oxides and the 
leaching of mineral nutrients, they have low fertility and require additions of lime and fertiliser 
if they are to be used for agriculture. Tree crops such as oil palm, rubber or coffee are suitable, 
but pasture is often their main agricultural use after the original forest is cleared. Occupying 
just below 6% of the continental land surface on Earth, Ferralsols are found mainly in Brazil, the 
Congo River basin, Guinea and Madagascar. 
Ferralsols are technically defined by a fine‐textured subsurface layer of low silt‐to‐clay ratio, 
high contents of kaolinitic clay and iron and aluminium oxides, and low amounts of available 
calcium or magnesium ions. Ferralsols are related to the Oxisol order of the U.S. Soil 
Taxonomy. Related FAO soil groups originating in tropical climates and composed of weathered 
soils with high iron or aluminium content are Plinthosols and Alisols. 
7  Fluvisols  Fluvisols are found typically on level topography that is flooded periodically by surface waters 
or rising groundwater, as in river floodplains and deltas and in coastal lowlands. They are 
cultivated for dryland crops or rice and are used for grazing in the dry season. They occupy 
about 2.8% of the continental land area on Earth, mainly in the great river basins and deltas of 
the world (e.g. the Amazon basin and the Nile delta). 
Fluvisols are technically defined by a weak or non‐existent surface horizon (uppermost layer) 
and by parent material derived from river, lake or marine sediments that have been deposited 
at regular intervals or in the recent past. These soils exhibit a stratified profile that reflects 
their depositional history or an irregular layering of humus and mineral sediments in which the 
content of organic carbon decreases with depth. Wide variations in texture and mineral 
composition are observed. Fluvisols are related to the Inceptisol and Entisol orders of the U.S. 
Soil Taxonomy, wherever the latter occur on floodplains and deltas. Fluvisols are sometimes 
found in conjunction with Gleysols, a related FAO soil group formed under the influence of 
water. 
8  Gleysols  Gleysols are formed under waterlogged conditions produced by rising groundwater. In the 
tropics and subtropics they are cultivated for rice or, after drainage, for field crops and trees. 
Gleysols found in the polar regions (Alaska and Arctic Asia – about half of all Gleysols) are 
frozen at shallow depth and are used only by wildlife. These soils occupy about 5.7% of the
continental land area on Earth, including the Mississippi valley, north‐central Argentina, central
Africa, the Yangtze River valley and Bangladesh. 
Gleysols are technically characterised by both chemical and visual evidence of iron reduction. 
Subsequent downward translocation (migration) of the reduced iron in the soil profile is 
associated with grey or blue colours in subsurface horizons (layers). Wherever oxidation of 
translocated iron has occurred (in fissures and cracks that may dry out), red, yellow or brown 
mottles may be seen. Gleysols are related to the Entisol and Inceptisol orders of the U.S. Soil 
Taxonomy, wherever the latter occur under waterlogged conditions sufficient to produce visual 
evidence of iron reduction. In warm climatic zones these soils occur in association with the FAO 
soil groups Fluvisol and Cambisol. 
9  Gypsisols  Gypsisols are characterised by a subsurface layer of gypsum (a hydrated calcium sulfate) 
accumulated by the precipitation of calcium and sulfate from downward percolating waters in 
the soil profile. With intensive management, irrigated crops can be grown on these soils. 
Occupying about 0.7% of the continental land area on Earth, Gypsisols occur in the very arid 
regions of the world (North Africa and the Middle East), sometimes in association with 
Calcisols, as in Australia and the U.S. 
To qualify as a Gypsisol, a soil may also have layers of accumulated clay or of calcium carbonate 
but not of soluble salts, and it may not show waterlogging or swelling‐clay effects. Little soil 
horizon (layer) differentiation is present other than the gypsic layer (which may be hardened 
and compact), with gypsum crystallites forming pebbles, stones or rosettes (the so‐called 
desert rose, in which gypsum crystals cluster together as do the petals of a rose). 
10  Kastanozems  Kastanozems are humus‐rich soils that were originally covered with early‐maturing native 
grassland vegetation, which produces a characteristic brown surface layer. They are found in 
relatively dry climatic zones (200–400 mm of rainfall per year), usually bordering arid regions 
such as southern and central Asia, northern Argentina, the western U.S. and Mexico. 
Kastanozems are principally used for irrigated agriculture and grazing. They occupy about 3.7% 
of the continental land area on Earth. 
Kastanozems have relatively high levels of available calcium ions bound to soil particles. These 
and other nutrient ions move downward with percolating water to form layers of accumulated 
calcium carbonate or gypsum. Kastanozems are related to the soils in the Mollisol order of the 
U.S. Soil Taxonomy that form in semiarid regions under relatively sparse grasses and shrubs. 
Related FAO soil groups originating in a steppe environment are Chernozems and Phaeozems. 
11  Leptosols  Leptosols are soils with a very shallow profile depth (indicating little influence of soil‐forming 
processes), and they often contain large amounts of gravel. They typically remain under 
natural vegetation, being especially susceptible to erosion, desiccation or waterlogging, 
depending on climate and topography. Leptosols are approximately equally distributed among 
high mountain areas, deserts and boreal or polar regions, where soil formation is limited by 
severe climatic conditions. They are the most extensive soil group worldwide, occupying about 
13% of the total continental land area on Earth, principally in South America, Canada, the 
Sahara, the Middle East, central China, Europe and Asia. 
Because of continual wind or water erosion or shallow depth to hard bedrock, Leptosols show 
little or none of the horizonation, or layering, characteristic of other soils. Leptosols are related 
to the soils in the Entisol order of the U.S. Soil Taxonomy that are found in high mountains,
deserts or boreal and polar regions of the world. Regosols are a related FAO soil group
originating from erosion processes. 
12  Lixisols  Lixisols develop on old landscapes in a tropical climate with a pronounced dry season. Their age 
and mineralogy have led to low levels of plant nutrients and high erodibility, making
agriculture possible only with frequent fertiliser applications, minimum tillage and careful 
erosion control. Perennial crops are thus more suitable for these soils than root or tuber crops. 
They occupy just under 3.5% of the continental land area on Earth, mainly in east‐central Brazil, 
India and West Africa. 
 
Lixisols are defined by the presence of a subsurface layer of accumulated kaolinitic clays, 
where at least half of the readily displaceable ions are calcium, magnesium, sodium or 
potassium, but they are also identified by the absence of an extensively leached layer below 
the surface horizon (uppermost layer). They are related to the Oxisol order of the U.S. Soil 
Taxonomy. Related FAO soil groups originating in tropical climates and also containing layers 
with clay accumulations are Acrisols and Nitisols. 
13  Luvisols  The mixed mineralogy, high nutrient content and good drainage of these soils make them 
suitable for a wide range of agriculture, from grains to orchards to vineyards. Luvisols form on 
flat or gently sloping landscapes under climatic regimes that range from cool temperate to 
warm Mediterranean. Occupying just over 5% of the total continental land area on Earth, they 
are found typically in west‐central Russia, the U.S., central Europe, the Mediterranean basin 
and southern Australia. 
Luvisols are technically characterised by a surface accumulation of humus overlying an 
extensively leached layer that is nearly devoid of clay and iron‐bearing minerals. Below the 
latter lies a layer of mixed clay accumulation that has high levels of available nutrient ions 
comprising calcium, magnesium, sodium or potassium. Luvisols are often associated with 
Cambisols. Albeluvisols are a related FAO soil group also exhibiting clay migration. 
14  Nitisols  Occupying 1.6% of the total land surface on Earth, Nitisols are found mainly in eastern Africa at 
higher altitudes, coastal India, Central America and tropical islands (Cuba, Java and the 
Philippines). They are perhaps the most inherently fertile of the tropical soils because of their 
high nutrient content and deep, permeable structure. They are exploited widely for plantation 
agriculture. 
Nitisols are technically defined by a significant accumulation of clay (30% or more by mass and 
extending as much as 150 cm below the surface) and by a blocky aggregate structure. Iron 
oxides and high water content are believed to play important roles in creating the soil 
structure. Nitisols are also strongly influenced by biological activity, resulting in a 
homogenisation of the upper portion of the soil profile. These soils are related to the Alfisol 
and Inceptisol orders of the U.S. Soil Taxonomy. Related FAO soil groups originating in tropical 
climates and also containing layers with clay accumulations are Acrisols and Lixisols. 
15  Planosols  Planosols are characterised by a subsurface layer of clay accumulation. They occur typically in 
wet low‐lying areas that can support either grass or open forest vegetation. They are poor in 
plant nutrients, however, and their clay content leads to both seasonal waterlogging and 
drought stress. Under careful management they can be cultivated for rice, wheat or sugar 
beets, but their principal use is for grazing. Occupying about 1% of the total continental land
area on Earth, they are found mainly in Brazil, northern Argentina, South Africa, eastern
Australia and Tasmania. 
The characteristic clay‐rich layer of Planosols can form from a downward translocation 
(migration) of clay particles under the action of percolating water, from burial of a clay‐rich 
layer by over‐washed coarse material, or from seasonal destruction and translocation of clay (a 
process known as ferrolysis). The clay layer thus may lie under an extensively leached (and 
hence nutrient‐poor) layer. Planosols are related to the Alfisols and Ultisols of the U.S. Soil 
Taxonomy. Related FAO soil groups also exhibiting clay migration are Luvisols and Albeluvisols. 
16  Plinthosols  Plinthosols form under a variety of climatic and topographic conditions. They are defined by a 
subsurface layer containing an iron‐rich mixture of clay minerals (chiefly kaolinite) and silica 
that hardens on exposure into ironstone concretions known as plinthite. The impenetrability of 
the hardened plinthite layer, as well as the fluctuating water table that produces it, restricts the
use of these soils to grazing or forestry, although the hardened plinthite has value as subgrade 
material for roads or even as iron ore (the iron oxide content can be as high as 80% by mass). 
Plinthosols occupy about 0.5% of the total continental land area on Earth, mainly in Brazil and 
West Africa. A related FAO soil group also originating in the tropics is Nitisol. 
17  Podzols  Podzols form under forested landscapes on coarse parent material that is high in quartz. They
have a characteristic subsurface layer known as the spodic horizon made up of accumulated 
humus and metal oxides, usually iron and aluminium. Above the spodic horizon there is often a 
bleached‐out layer from which clay and iron oxides have been leached, leaving a layer of 
coarse‐textured material containing primary minerals and little organic matter. Podzols usually 
defy cultivation because of their acidity and climatic environment. Occupying almost 4% of the 
total continental land area on Earth, they range from Scandinavia to Russia and Canada in the 
Northern Hemisphere, to The Guianas near the Equator, to Australia and Indonesia in the 
Southern Hemisphere. Podzols closely resemble the Spodosol order of the U.S. Soil
Taxonomy. Albeluvisols are a related FAO soil group also exhibiting a bleached‐out layer. 
18  Regosols  Regosols are characterised by shallow, medium‐ to fine‐textured, unconsolidated parent 
material that may be of alluvial origin and by the lack of a significant soil horizon (layer) 
formation because of dry or cold climatic conditions. Regosols occur mainly in polar and desert 
regions, occupying about 2% of the continental land area on Earth, principally in northern 
China, Greenland, Antarctica, north‐central Africa, the Middle East and northwest Australia. 
They are usually found under their original natural vegetation or under limited dryland 
cropping. 
 
Regosols often show accumulations of calcium carbonate or gypsum in hot, dry climatic zones. 
In very cold climatic zones they contain permafrost within 2 m of the land surface. Regosols are 
similar to the soils in the Entisol order of the U.S. Soil Taxonomy that occur in either very cold 
or very dry and hot climatic zones. They differ from the FAO soil groups Andosols, Arenosols 
and Vertisols in parent materials, from Gleysols in having lower water content and from 
Leptosols in having greater soil profile depth. 
19  Vertisols  Vertisols are characterised by a clay‐sized‐particle content of 30% or more by mass in all 
horizons (layers) of the upper 50 cm of the soil profile, by cracks at least 1 cm wide extending 
downward from the land surface, and by evidence of strong vertical mixing of the soil particles
over many periods of wetting and drying. They are found typically on level or mildly sloping
topography in climatic zones that have distinct wet and dry seasons. Vertisols contain high 
levels of plant nutrients, but, owing to their high clay content, are not well suited to cultivation 
without painstaking management. They are estimated to occupy about 2.7% of the continental 
land area on Earth, mainly in the Deccan Plateau of India, the Al‐Jazirah region of The Sudan, 
eastern Australia, Texas in the U.S. and the Paraná basin of South America. 
 
Vertisols are dark‐coloured soils (though they have only moderate humus content) that may 
also be characterised by salinity and well‐defined layers of calcium carbonate or gypsum. They 
are similar in all respects to the Vertisol order of the U.S. Soil Taxonomy.
 
Appendix 4. (In Process) 
 
 
Appendix 5. Sweetpotato trials farmer 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 