MINING THE GAPS: USING MACHINE LEARNING TO MAP A MILLION DATA POINTS FROM AGRICULTURAL RESEARCH FROM THE GLOBAL SOUTH

Authors
Jaron Porciello, Leslie Lipper, Thomas Bourne, Maryia Ivanina, Sammi Lin, Sarah Langleben

Author Affiliations
Porciello (Havos.Ai and Cornell University); Lipper, Lin, and Langleben (Cornell University); Bourne (Havos.Ai); and Ivanina (Epam Systems, Inc.)

Editorial Services
Stacey Shackford

Graphic Design, Artwork and Layout
Walmazan Studios Inc.

Data partner
CABI

ACKNOWLEDGEMENTS
We wish to thank Mary O’Connor and Cristina Ashby from CABI for their partnership and compilation of the dataset. Thank you to the Commission on Sustainable Agriculture Intensification (CoSAI), including David Shearer for managing this study and especially Julia Compton for envisioning the concept and shepherding this work. A special thank you to the CoSAI Global Commissioners: Ruben Echeverria, Akissa Bahri, Aysegul Ozkavukcu, David Simon, Grethel Aguilar, Haris Gazdar, Irene Annor-Frempong, Jennifer Baarn, Jianguo Liu, Julio Berdegue, Madiodio Niasse, Mauricio Lopez, P.V. Vara Prasad, Pablo Tittonell, Rasheed Sulaiman V, Rodomiro Ortiz, Sarah Mbago-Bhunu, Uduak Edem Igbeka, Uma Lele, Varad Pande, and Ximena Rueda.

TABLE OF CONTENTS
INTRODUCTION
CONTEXT AND KEY OBJECTIVES
OUR APPROACH
OUR FINDINGS
CONCLUSION

FIGURES
FIGURE 1. CONCEPTUAL MODEL
FIGURE 2. GENERAL AI PROCESS
FIGURE 3. PILLARS OF INNOVATION
FIGURE 4. EVIDENCE PILLARS & GAPS IN THE RESEARCH
FIGURE 5. STUDY TYPES BY CABI CODES
FIGURE 7. GLOBAL RESEARCH OUTPUT ACROSS COUNTRIES
FIGURE 8. USER GROUPS

DEFINITIONS

The Global South indicates countries that fall into the World Bank’s Lending Classification Categories for Low-Income, Middle-Income & Upper-Middle Income Countries.

Artificial intelligence is the simulation of human intelligence processes by machines, especially computers.

Machine learning is the study of computer algorithms that can improve automatically through experience and by the use of data.

Natural language processing (NLP) refers to the branch of computer science (and more specifically, the branch of artificial intelligence or AI) concerned with giving computers the ability to understand text and spoken words in much the same way human beings can. It combines linguistics with statistical, machine learning, and deep learning models.

Interventions: While there is no precise definition of an intervention for most sectors outside of medicine and health, it is generally recognized that an intervention is an activity that is introduced in a population to produce a certain outcome. Often, the aim of identifying interventions in one context is to evaluate whether they could be reintroduced in another context with similar results. Interventions is a proxy term to identify research programs, strategies, experiments, projects and other work that has been explored outside of a controlled experiment environment (e.g., a laboratory) and preferably with a target user group.
Outcomes: The OECD-DAC defines outcomes as “likely or achieved short-term and medium-term change and effects of intervention outputs.” We use a machine-learning model trained to identify and extract outcomes from scientific literature, based primarily on how researchers have expressed them in their text.

Research outputs and results: The artefact of conducting research and codifying it in a format (usually written) that can be disseminated. The analysis conducted in this report relies on research outputs, often referred to herein as publications or collectively as our ‘evidence base.’

INTRODUCTION

We’re entering a new era in agriculture, one that moves beyond a purely production-oriented vision and recognizes its role in contributing to a food system that prioritizes people’s livelihoods and nutrition, as well as environmental and climate outcomes.

This shift in thinking will require major shifts in policy, research, and investment. But where should these investments go? What foundations should be strengthened? Which gaps need filling? What’s working? What’s not?

In order to answer these questions in an informed way, we need to examine the evidence that exists and identify areas where more research is needed.

But this is easier said than done.

The evidence base for agriculture is growing exponentially, and while the wider food systems literature may contain many of the solutions we are seeking, it needs to be holistically integrated in order to find those needles in the proverbial haystacks.

Evidence synthesis reports, such as systematic and scoping reviews, provide much needed transparent, rigorous evidence for specific questions. But often, broader questions about the whole of the evidence base need to be answered first: the basic who-what-when-where-and-how that comes before trying to apply a more thorough lens.

State of the evidence reports like this one provide more coverage than what we can achieve in a more focused systematic review: a bird's-eye view of the evidence base so that we know where to invest in the future. But until recently we lacked the technology to conduct a landscape scan of the millions of articles that are out there.

Earlier work in this area has suggested that the evidence base we have is not fit for the questions we need and want to ask (Lipper et al., 2020). We need additional efforts to help us understand what the current evidence base has found.

And we also need resources to be designed in such a way that we can seamlessly add new data as it emerges, from many partners and independent of the sources from which the data originated.

With the aid of artificial intelligence and machine learning technology, we took a deep dive into more than 1.2 million publications to assess the current landscape of research for the Global South.

The result is a clearer picture of what research has been conducted on small-scale farming and post-production systems from 2000 to the present, and where evidence gaps exist.

This, in turn, highlights potential areas for investment in research and innovation for small-scale farms in the Global South, and provides scope for future research questions.

CONTEXT AND KEY OBJECTIVES

What is Innovation?

Innovation in the agricultural and food sectors is essential to achieve the types of transformative changes needed for the improved livelihoods, nutrition, and environmental performance of food systems. In agriculture, “innovation” has been used in a narrow sense to mean new technologies that increase productivity. CoSAI uses a much broader definition, such as the one from FAO 2018: “a process whereby individuals or organizations bring new or existing products, processes or ways of organization into use for the first time in a specific context in order to increase effectiveness, competitiveness, resilience to shocks or environmental sustainability and thereby contribute to food security and nutrition, economic development or sustainable natural resource management.”

Clearly, in the context of innovation to support multiple objectives of food system transformation, the broader definition of innovation is more applicable. Yet we know that many of the research outputs in the evidence base may reflect a narrower definition. This gives rise to gaps in our knowledge which require attention.

MIND THE GAPS

Until recently, agricultural research and innovation has been largely focused on improving productivity, mainly for a small number of crops (Serraj & Pingali, 2018). While we’ve seen very high returns from this approach, we have also seen the unintended and negative consequences it can have on nutrition and diets, social inclusion, and the environment (Davidson, 2016; Webb & Kennedy, 2014).

We are now witnessing a major shift in thinking about agriculture, one which puts agriculture in the larger context of a system with complex interactions between food production, processing, consumption, and climate change (Barrett et al., 2020).

This same shift implies a need for rethinking the role of agricultural research and development efforts, and a push for innovations that go beyond productivity. There is a corresponding urgency to identify priority investments (Laborde et al., 2020; Reardon, Lu, et al., 2019). In order to do so, however, we must have an adequate and accessible evidence base on agricultural innovations and their potential in the context of a transformation (Herrero et al., 2020; Reardon, Echeverria, et al., 2019). And it has become increasingly clear that there are several gaps in evidence.

This study looks at the summaries of more than 1.2 million past publications and uses these to assess the current landscape of research for the Global South. The questions that we ask in this report were prioritized by CoSAI:

■ What is the research output focused on the Global South? Which countries have had more research focused on them, and which have less? What about crop research?
■ Can we identify major domains of research? Can they be organized in a way that will help us better interpret gaps across different research domains?
■ What outcomes are being studied across research domain areas?
■ Who are the user groups included within studies? How much of the research is targeted on solutions for small-scale farmers and other agricultural actors?
■ Do we have information about the impact of agricultural innovation on communities such as indigenous and tribal communities, youth or elderly, extremely poor? How much research focuses on women?
■ What does the evidence say about gaps in institutions, policy and finance?
■ What does our current evidence base reveal about how we are studying our changing climate? What are the priority aims of research surrounding sustainable development and biodiversity?
■ Who are the primary funders and research organizations that are consistently showing up in Global South focused research?
■ Can we determine what scale of research is being undertaken: farm and household level, macro and/or landscape, enterprise and/or food system?

We specifically seek to inform these questions based on what research has been conducted on small-scale farming and post-production food systems in the Global South, from 2000 to the present. The conceptual model (Figure 1) lays out these questions as a mega-research map.

Similar questions have been raised in the lead-up to the United Nations Food Systems Summit. The value of this report is a baseline mapping that will, we hope, aid the prioritization and coordination of international funding and research efforts.

FIGURE 1. The primary data points collected per article are outlined in the conceptual model.
[Figure 1 categories and data points:
GEOGRAPHY & CLIMATE: country; regions; World Bank classification; climate zones
PLANTS, ANIMALS, PESTS AND DISEASES: plants
INTERVENTIONS, OUTCOMES, STUDY TYPE: interventions; outcomes; research design type
LEADERSHIP & FUNDING SOURCE: research study leadership; funding source
SECTORS & DISCIPLINES: sectors; disciplines
STUDY POPULATION (SOCIO-DEMOGRAPHICS & SOCIAL FACTORS OF STUDY): age; farm size; sex; agricultural workers; equity]

OUR APPROACH

Every seven seconds, a new research paper is added to the treasure trove of scientific literature (Science, 2012). The volume of research has doubled in the past 10 years (Bornmann & Mutz, 2015). As the amount of published information continues to grow exponentially, it is increasingly difficult to get an accurate picture of what is out there, especially on a global scale.

We take a bottom-up approach to inform the questions in this report. Rather than being intimidated by the volumes of research out there, we dive head-first into it, using new technologies that are designed to handle classification tasks with speed and accuracy.

Advancements in artificial intelligence (AI) and machine learning (ML) can help us use the data we already have, and this can be a highly effective way of surfacing relevant insights from a large and representative dataset (Gil et al., 2014). The academic and development sectors are far behind the business sector in using big data and ML approaches to improve decision making. But these approaches are essential.
We believe that the next big thing will come not only from major scientific breakthroughs, but also from sequencing millions of data points over time to observe how they interact with each other and where major gaps exist.

Mining the Gaps used Havos.Ai machine learning models and data summarizing more than 1.2 million reports and papers from development and research organizations, UN agencies, and peer-reviewed journals to create a mega-map of agricultural research, as shown in Figure 1.

Data partner CABI provided access to 1.2 million citations from applied life sciences around the world. CABI was a natural partner because of its long-standing commitment to open up the world's literature and catalogue resources from small publishers and repositories. Research from top global publishers Science and Nature, as well as hundreds of disciplinary giants like Food Policy, Field Crops Research, Global Food Security, African Journal of Crop Science, and Phytopathology, is included in the dataset. The output of global institutions, including work spanning OneCGIAR, is also included. So are materials from smaller scholarly publications, many of them non-English, such as Revista Mexicana de Ciencias Pecuarias and Atti dell’Accademia dei Georgofili.

OUR METHODS

Working with CABI for this project enabled us to draw from a comprehensive, representative dataset to which we could seamlessly apply the Havos.Ai model and create a macro view of research from the past 20 years. The utility of this study is to provide a broad look and explore new ways of approaching the evidence base, and to add in new content from other sources over time.

Machine Learning (ML) seeks to detect patterns in data in a context of stochasticity (“noise”). At its most basic level, machine learning algorithms use historical data to learn patterns and uncover relationships. However, ML will almost always find a “pattern”; whether or not the identified pattern is insightful is revealed by the interpretation of humans.

Training is an important feature in machine learning. The more high-quality data that the system is presented with, the more refined the model. And, since ML generates data-driven predictions, knowledge about the sources of the data is essential for the results of ML to be useful. Scientific data gives us a higher probability of producing reliable, accurate results.

We create an ML pipeline for the more than 1.2 million articles in the dataset. Specific information is extracted from each article, all based on a series of questions and answers (Porciello et al., 2020). This lets us approach the literature through a series of modular questions, where we harmonize and clean the data before presenting the analysis to human experts. For instance, the kinds of questions we’re interested in answering in this work include: Does the research describe any interventions and outcomes, and if so, what are they? What kind of study methods did the authors use? Who is the study population? Which crops or livestock are mentioned? Where is the research taking place? Additional technical details about the classification and the model ensemble are available in the annex.

FIGURE 2. An overview of the AI-assisted process.
[Figure 2 steps: (1) the dataset; (2) clean the dataset; (3) model the data; (4) create a backend dashboard for analysis; (5) the summary report]

STUDY LIMITATIONS

There are important limitations of this study. First, the aim of this study is to surface relevant insights across studies using only summary title, abstract and other available metadata (such as keywords). (The exception is the analysis on funders and institutions¹.) We are necessarily limited in our observations and analysis based on what we can reasonably expect to learn from summary data.
Additional analysis of the corresponding full-text is possible using similar methods, and would provide additional insight into the quality of the research, including determining whether the underlying data supports the claims made in the summary data.

Second, additional analysis is needed to evaluate how the identified interventions and outcomes are supported within the body of the study. Ceres2030 global researchers tackled this problem across eight priority research areas, finding that only about 2% of the available evidence base had enough high-quality data to support their small-scale-producer-focused research questions.

Third, this is a representative dataset of 1.2 million summary titles, abstracts, and metadata from CABI databases and from other sources. CABI was targeted as a resource because of CABI’s mission to identify and aggregate research from the Global South; it is among the best in the world for our purposes. But there are known gaps, such as large landscape reports from multinational agencies and NGOs. Thus, the research mapping will change over time if new information is incorporated in the dataset.

¹ The exception is the analysis on funders and institutions. Use of summary metadata is specific for this report and a departure from other use cases, where the model has conducted full-text extraction and analysis, such as IFAD’s Big Data Challenge and a scoping study, Agriculture in the Digital Age, supported by the Bill & Melinda Gates Foundation and USAID.

OUR FINDINGS

PILLARS OF INNOVATION

Food systems can help us demonstrate the interconnectedness of the world of food and sustainable agriculture. For instance, technical innovation in crop breeding, seeds and storage facilities has increased productivity and yields so that fewer people will go hungry. Better management of limited natural resources through ecosystem services, like water and soil, protects biodiversity and fosters planetary health.
We need thriving markets and roads to connect them to distribute and sell healthy, safe food that encourages diet diversity and food security. And stable governments and other enabling systems are needed to continue to advance opportunities to increase education and eradicate poverty.

We identify how research publications cluster together, represented in Figure 3, across three pillars of agricultural innovation: technical, socioeconomic, and ecosystem services. Within each pillar are the top nine intervention areas, based on quantity of research. There are nearly twice as many research publications focused on technology innovations as on either ecosystem services or socioeconomic innovations. Research about crop and soil sciences, and the use of fertilizer, is well-represented, whereas the volume of outputs about emerging domain areas like digital agriculture is relatively small. There is less research being published about government, market, food and social interventions, as well as ecosystem services literature focused on conservation, water and forest.

Some of the names of the intervention areas are the same, such as ‘water’ appearing as a domain area in both the ecosystem services and technology pillars. In general, the underlying evidence base comprises a different suite of publications for each intervention area. Technological innovations for water (for instance) are more concerned with measuring soil respiration dynamics based on different precipitation patterns, whereas ecosystem services focuses more on management and use of water as a natural resource, such as nutrient recycling.

We can generate more specificity per domain area than what is pictured in Figure 3, down to the level of specific interventions. An expanded list of the domains is available in the annex, as well as more specifics about the methods; also provided in the annex are the specific outcomes that were captured in each category.

MORE ON METHODS

The pillars and domain areas were created using an AI-assisted clustering technique, where all of the supporting content (in this case, summary data) is examined in a vector space. We apply different algorithms in order to test different patterns that show us emergent relationships. Once these relationships emerge, we conduct an information extraction process to explore the underlying interventions that comprise each domain area.

FIGURE 3. The figure is organized first by pillar, and then by intervention areas in each pillar. Each domain area is labeled with a single word that best represents the underlying research publications. The size of the intervention bubble corresponds to the number of articles supporting each intervention, and a scale indicating the general size of the document corpus is presented in each pillar.
[Figure 3 intervention areas by pillar:
SOCIOECONOMIC: government; health; market; food; social; education; incentives; livelihood; community; economic; finance; subsidies
ECOSYSTEM SERVICES: ecosystem; conservation; environmental; forest; water; climate; land; energy; agroforestry; recycling; emission
TECHNOLOGY: crop; soil; fertilizer; water; crop breeding; safety; irrigation; tillage; livestock; energy; genetics; seed; digital; weather
Article-count scales shown across the pillars: 19,000 - 90,000 articles; 15,000 - 80,000 articles; 10,000 - 260,000 articles]

A selection of research outcomes (economic growth, healthy people, healthy planet, and gender & inclusion) was also captured across all articles that reported an outcome. Figure 4 shows the relationships between pillars, domains and outcomes, where each domain was assessed against each of the four outcome areas.

More research has focused on economic outcomes, such as productivity and yield, than any other area, and researchers are also actively trying to incorporate outcomes that measure water use and soil health (captured under Healthy Planet). But there are consistent gaps in the evidence for outcomes focused on nutrition, social inclusion, and gender empowerment across nearly every domain.

Such findings are not surprising. They reinforce the message that is continuously stated, that over-emphasis in any one domain area (or one pillar) cannot achieve gains across the entire system. We can see that focus on any single domain will be ineffective. Instead, an integrated approach across domains helps us achieve the greatest gains (Barrett et al., 2020; Laborde et al., 2020).

FIGURE 4. A mapping of evidence pillars and intervention areas according to four outcome areas. Read the map from left to right, starting with domain area, and then from the top. This shows how each intervention area is faring in terms of studying specific outcomes.
[Figure 4: Evidence Pillars & Gaps in Research Outputs.
Outcome columns: Economic Growth; Healthy People; Healthy Planet; Gender & Inclusivity
TECHNOLOGY rows: crop; soil; fertilizer; water; plant breeding; safety; tillage; irrigation; livestock; energy; genetics; seeds; digital; weather
SOCIOECONOMIC rows: government; health; market; food; social; education; incentives; livelihood; community; economic; finance; subsidies
ECOSYSTEM SERVICES rows: ecosystem; conservation; forest; water; climate; land; energy; agroforestry; emissions
Each cell is shaded by the percentage of articles, in the bins <10%, 11-20%, 21-30%, 31-40%, 41-50%, >50%]

VIEW FROM WITHIN

The vision for the future of sustainable agriculture requires us to move fast, to identify innovation as it is emerging. A challenge we collectively face is how to identify when emerging research is in the pipeline, especially when the innovations come from specialized scientific and private sector research programs.

Most of the sciences have a ‘death valley’: a desert between scientific data and applied processes that can help us make use of those findings for innovation. Our universities and research centers are primarily equipped to support and conduct novel, basic research investigation.

What are the ways we see this emerging in the research? First, publication trends favor field, laboratory, simulation, and narrative studies (Figure 5). There are fewer experimental studies (clinical studies in nutrition, and impact evaluations) and observational studies. We show these trends using a selection of standardized CABI codes.

Other studies have highlighted that within the papers themselves, there tends to be a stronger focus on household and farm-level analyses and relatively little attention to the landscape or macro level analyses that are so important in the context of scaled-up interventions (Barrett et al., 2020; Liverpool-Tasie et al., 2020; Ricciardi et al., 2020).

In addition, despite an emphasis on household and farm-level outcomes, there are some serious gaps regarding what we know about the study populations themselves, where even basic demographic information about the study population, like age, sex, and education, is missing (Acevedo et al., 2020).

FIGURE 5. Research study types (definitions in the annex) are shown according to a selection of research disciplines. The names of the research disciplines come from standard CABI codes, which are applied to each article in CABI’s databases.
[Figure 5: bar chart of article counts (axis 0 to 200,000) per research discipline, broken down by study type.
Disciplines: Plant Production; Field Crops; Agricultural Economics; Meteorology and Climate; Horticultural Crops; Techniques and Methodology; Forests & Forest Trees; Crop Produce; Water Resources; Non-Food Feed Plant Products; Fertilizers and Other Amendments; Food Composition & Quality; Plant Breeding & Genetics; Pollution & Degradation; Milk & Dairy Produce; Plant Physiology & Biochemistry; Soil Chemistry & Mineralogy; Pesticides & Drugs: Control
Study types: field study; experimental study; simulation/modeling study; narrative/review studies; laboratory study; observational studies]

Addressing the dearth of research that focuses on impact and causal pathways would require more opportunities to collect and harmonize data about innovations over time, and across different scales of research: farm-level, landscape or macro, and food systems. This is a long-standing discussion from organizations like the Standing Panel on Impact Assessment (SPIA) that aim to better link evaluation, interventions and outcomes across a common set of agreed-upon guidelines that, although they originate at the project or program level, could also be useful for exploring impact that is policy relevant.

Funders and governments are increasingly interested in seeing science's causal and applied impact, especially in countries where issues like poverty and food security loom largest. Among the most frequently acknowledged funders are the World Bank, European Commission, USAID, the Bill & Melinda Gates Foundation, and the Asian Development Bank. This is based on a sampling of the “acknowledgements section” across ~37,000 research papers, because funding data is not systematically requested by all journals, though this is changing. But right now, it is difficult to generate an accurate mapping of research funding trends based solely on the scientific papers themselves.

There are some encouraging trends. The publications emerging about the Global South are primarily Global South led. The affiliations of the first author publishing the paper are overwhelmingly institutions based in the Global South. This counters some of the narratives about Global North researchers dominating research production. Instead, there is a healthy representation of regional organizations (primarily academic organizations) that emerge as the top research producers in their own regions. Hopefully this will aid research prioritization so existing capacity can flourish.

WHERE IS THE EVIDENCE?

China, Brazil and India lead the way in publishing research outputs, with more modest results appearing across all of the other included countries. These trends hold even when we look at the cross-section views of crop, livestock and value chain research across the Global South (Figure 6). Different countries and regions come into focus depending on the target crops, as highlighted in the maps below (Figures 7-12).

Of note is a particular lack of research about fruits and vegetables in Sub-Saharan Africa (Pingali, 2015). This is of concern because expanding fruits and vegetables in the food supply and reducing the concentration on cereals like rice, maize, and wheat is essential for improving health and reducing the incidence of non-communicable diseases (Fanzo et al., 2017). And given that Sub-Saharan Africa is where rural populations are predicted to increase significantly in the next 20 years, we would expect a considerable expansion in demand for fruits and vegetables.

The expansion of fruit and vegetable production could provide increased opportunities for small-scale farmers, conditional on market access and technical capacity. Expansion of decent employment opportunities in food value chains will be key to improving livelihoods in areas of high rural poverty such as Sub-Saharan Africa and South Asia, since the small farm sector is incapable of absorbing the expected increases in the labor force to provide opportunities for decent employment (Rural Development Report 2021 – Transforming Food Systems for Rural Prosperity, n.d.).

The limited extent to which research outputs focus on value chains and post-production processes, including storage, distribution, or marketing channels, is a well-documented gap in the evidence base, and is reflected in the domain mapping (Liverpool-Tasie et al., 2020). A recent and exhaustive evidence synthesis examining post-harvest loss reduction concluded that there is a lack of studies on training, finance, infrastructure, policy and market interventions (Stathers et al., 2020). Value chains will play a key role in directing incentives and signals to the producers of food and agricultural products, as well as the consumers (Reardon, Echeverria, et al., 2019).
Their fundamental role in the transformation of food systems to improve the livelihood, nutrition and environmental performance of the world population was a major theme in the recent UN Food Systems Summit.

FIGURE 6. Crop Research by Country shows research outputs by geography. Countries in black are excluded from this analysis. The calculation to identify countries within the research output is based on whether the geography is the area of focus where the research took place. Note that multiple countries can be, and are, identified within one study. Specific crop research output is shown by country using the same calculations and coloring, and shown here by 1) maize; 2) rice; 3) wheat; 4) fruits and vegetables; 5) roots, tubers and bananas; and 6) livestock. The classifications are based on machine learning that uses a custom harmonized thesaurus based on plants and animals from AGROVOC and the National Agricultural Library.

[Figure 6: CROP RESEARCH BY COUNTRY. Map panels: Maize; Rice; Wheat; Fruits and Vegetables; Roots, Tubers and Bananas; Livestock.]

OUR CHANGING CLIMATE

The Earth's climate zones will continue to shift at an accelerated pace, and many climate scientists suggest that monitoring shifts in climate zones is a reasonable measure of 'reality' for living systems, including agriculture (Mahlstein et al., 2013).

We highlight the degree to which agriculture research outputs are focused on different climate zones, such as tropical climate and arid climate (shown in Figure 7), per region. This is an early exploration to improve our understanding of the work researchers are doing to figure out agriculture's adaptation to climate change. Here, we see that nearly twice as much work has been focused on issues facing tropical climates as on arid climates, with some on coastal areas.

The acceleration of climate change means that the biodiversity inhabiting each climate zone will have less time to adapt to the climatic change. Ecosystem services will play an increasingly critical role in protecting biodiversity and shared natural resources. They are also essential for the food security of many indigenous communities that rely on food gathered from natural ecosystems, such as oceans. As shown in the pillars and evidence gap overview, ecosystem services emerged with the least amount of research inputs, and it is unclear how integrated these innovations really are within the other two pillars, socioeconomic and technology.

FIGURE 7. The frequency of articles per region according to different climate zones. The names of the climate zones are those recognized by FAO's AGROVOC.

[Figure 7: bar chart. Vertical axis: count of articles (0 to 25,000). Horizontal axis: Tropical Climate, Arid Climate, Coastal Climate, Semiarid Climate, Subtropical Climate, Humid Climate, Semihumid Climate. Series: Middle East & North Africa; South Asia; Sub-Saharan Africa; East Asia and the Pacific; Latin America and the Caribbean.]

WHO ARE WE INCLUDING IN THE RESEARCH?

Understanding complex social factors about user groups is a cornerstone of both research and development.

We sought to identify who is included in research outputs by looking for information about the study populations, relying on basic demographic details as a proxy. For instance, we explored various types of employment: farmers or agricultural workers, including (but not limited to) small-scale producers, agribusiness dealers, value chain actors, extension service agents, and others. We investigated whether research outputs contained generalized descriptions, such as the ages of a study population (adults, elderly, youth) or the sex/age range of a study population (women, men, girls, boys), and many other sociodemographic descriptions, such as mothers, indigenous, tribal or nomadic populations, and more.²

The descriptions of farmers and agricultural workers are extremely ambiguous, and rarely include contextual clues about farm size and type that are useful to discern who is really included. What we observe is that 10% of the literature mentions the general term 'farmer' without other contextual details, and 3% specifically identifies small-scale producers. But even this term is somewhat complicated by the fact that a publication describing 'small-scale producers' from Brazil is featured alongside 'small-scale producers' from Malawi, even though farm sizes are quite different.

² Livestock animals were excluded as a specific population of research.

As highlighted in Figure 8, women are underrepresented, and both elderly and youth populations were rarely mentioned. Studies focusing on these groups are usually in association with health or nutrition outcomes. Information about communities such as tribal and indigenous populations, or research focused on other areas of equity, such as wealth, access to finance, education and literacy, is sparse, often in the low thousands.

More work needs to be done to capture information about all of the beneficiary communities. Overlapping social factors such as education, socioeconomic status, race, class and gender can create interdependent systems of discrimination and disadvantage which reinforce the exclusion of some groups (particularly, but not only, women) from the benefits of agricultural innovations. Additional and sustained work in this area will reduce the likelihood of making generalized, homogeneous assumptions for heterogeneous groups.

Given that the vast majority of published science is focused on basic, upstream experiments in which study populations are not part of the work, many of the gaps we identified make some sense. But, given that small-scale producers are the focal point of Sustainable Development Goal 2 (SDG2), these gaps are still startling from the perspective of research prioritization.

It is tempting to conclude that we must be missing information because we are only looking at title and abstract data, and certainly this is a possibility we must keep in mind. But comparable and recent research by global research teams took a painstaking look at the underlying research papers and found similarly startling numbers, including the comprehensive Ceres2030 report that highlighted a massive under-investment in research for small-scale farms in the Global South ("Ending Hunger," 2020).

FIGURE 8. User Groups. We set out to identify the study populations across all studies in the dataset. Here's what we found: significant gender and equity gaps.

[Figure 8: infographic showing the share of the literature focused on each study population: farmers as the study population; small-scale producers; women and girls; elderly and youth; indigenous people; agricultural workers such as extension agents, traders, and agribusiness dealers.]

SPOTLIGHT: INSTITUTIONS, POLICY AND FINANCE

Research and innovation in policy, institutions, and finance will play a large part in developing the transformative changes needed to address complex challenges in agricultural systems. Institutions and policy instruments have the power to promote or block broader transformation in this sector. In general, weak institutions are a key obstacle for small and poor farmers, and some of the research on policy instruments emphasizes single-solution instruments that may provide guidance and site-specific recommendations, but offers less comparative analysis of multiple instruments and their applicability in other contexts (Piñeiro et al., 2020; Zilberman et al., 2018).

Institutions, policies, and financing mechanisms are key to achieving change, especially to ensuring that farmers have the resources that they need to succeed. Within the research domains, however, the areas of focus are not immediately evident.
One exception would be farmer organizations (FOs), such as associations, cooperatives, self-help and women's groups, and the extent to which they are empowered to work with all farmers (Bizikova et al., 2020). Understanding the interlocking roles that institutions, policies and finance play is a critical need. They face different sets of constraints and opportunities, and thus they merit specific attention.

CONCLUSION

Agriculture (and, more broadly, food systems) is an incredible node that touches many issues and disciplines. However, such diffusion can make it incredibly challenging to work from the same evidence playbook. Agriculture cannot be either/or; it must be "and."

By taking a bird's-eye view of research across disciplines spanning the three pillars of agricultural innovation (technical, socioeconomic, and ecosystem services), our findings reinforced the message that integrated approaches across interventions are more effective in achieving gains across the entire food system.

Our efforts to map and analyze the evidence pointed to some key gaps.

Not all underfunded areas can be treated equally. There are many areas of research that are underfunded, but some of those areas may result in more significant trade-offs than others. Research into fruits and vegetables (both in production and post-harvest) is one example where we risk greater challenges for healthy diets and diversity if this does not emerge as a key research priority. So is biodiversity.

And whether we are studying traditional local systems that link to broader markets through intermediaries or larger industrialized and global systems, many food systems correlate with their location, so it is key to understand the geographic distribution of the evidence base. Likewise, it matters where in the world we set up our research programs and the partnerships that are created.

The essentialness of equity. It is clear that too little is being captured and reported about study populations, including basic sociodemographic details, such as employment, age and sex. Equally important, however, is the capture of social factors that could underscore how barriers are systematic for some communities and not for others. As we look towards the future of research prioritization, equity outcomes need to become more pronounced.

Connecting research and innovation pathways. The research pipeline for agriculture is extremely long, and a decade or more can pass before some technologies (like nutritionally fortified crops) see results in farmers' fields. Despite this, it is challenging to connect and trace upstream and downstream research in any observable way, making it difficult to find pathways that can scale research to innovation in the market.

Beyond farm and household level outcomes. We also need to go beyond capturing research that reports impact at the household and farm level to produce more evidence about impacts at the macro and food systems levels.

New technologies to share and unleash scientific potential. We know that the next big thing will come not from one idea or one platform, but by sequencing millions of small details on similar problems from researchers across the world. In the race to develop the food system of the future so that innovation can flourish, we need analytic tools and databases that help us make short work of the hay and present a stack of needles.

About CoSAI: The Commission on Sustainable Agriculture Intensification (CoSAI) was set up to promote more and better investment in innovation for Sustainable Agriculture Intensification (SAI) for the Global South, in support of the Sustainable Development Goals (SDGs). For CoSAI, innovation includes not only science and technology but also innovation in policies, finance, and social institutions. CoSAI has a timeline running up to December 2021. CoSAI has six Commissioner Working Groups addressing Big Questions around innovation for SAI. Working Group 2 focuses on priorities for innovation.
Some of the work already commissioned under this working group includes two studies on global funding for innovation in SAI (Investment Baseline and Investment Gap studies) and a study on instruments and approaches for innovation in SAI.

CoSAI is building up an evidence base to support the case for increased and better-targeted investment in agricultural innovation for the Global South. This includes studies on the investment baseline and projected investment gap, approaches and instruments, learning from case studies on pathways to innovation, and a Taskforce on Principles and Metrics.

About Havos Inc.
Havos.Ai builds software solutions and platforms for global organizations that want to use advanced computation for complex, open-ended problems that are beyond the scope of individual decision-making. Our approach taps into collective intelligence and wisdom of global experts, supported by artificial intelligence and the best scientific data. Founded in 2021 by leaders in science, policy and industry, Havos.Ai improves decision-making for governments, multilateral agencies, funders, and research organizations. The company emerged as a start-up out of Cornell University.

List of Works Cited

Acevedo, M., Pixley, K., Zinyengere, N., Meng, S., Tufan, H., Cichy, K., Bizikova, L., Isaacs, K., Ghezzi-Kopel, K., & Porciello, J. (2020). A scoping review of adoption of climate-resilient crops by small-scale producers in low- and middle-income countries. Nature Plants, 6(10), 1231–1241. https://doi.org/10.1038/s41477-020-00783-z

Barrett, C. B., Benton, T. G., Cooper, K. A., Fanzo, J., Gandhi, R., Herrero, M., James, S., Kahn, M., Mason-D'Croz, D., Mathys, A., Nelson, R. J., Shen, J., Thornton, P., Bageant, E., Fan, S., Mude, A. G., Sibanda, L. M., & Wood, S. (2020). Bundling innovations to transform agri-food systems. Nature Sustainability, 3(12), 974–976. https://doi.org/10.1038/s41893-020-00661-8

Bizikova, L., Nkonya, E., Minah, M., Hanisch, M., Turaga, R. M. R., Speranza, C. I., Karthikeyan, M., Tang, L., Ghezzi-Kopel, K., Kelly, J., Celestin, A. C., & Timmers, B. (2020). A scoping review of the contributions of farmers' organizations to smallholder agriculture. Nature Food, 1(10), 620–630. https://doi.org/10.1038/s43016-020-00164-x

Bornmann, L., & Mutz, R. (2015). Growth rates of modern science: A bibliometric analysis based on the number of publications and cited references. Journal of the Association for Information Science and Technology, 66(11), 2215–2222. https://doi.org/10.1002/asi.23329

Davidson, D. (2016). Gaps in agricultural climate adaptation research. Nature Climate Change, 6(5), 433–435. https://doi.org/10.1038/nclimate3007

Ending hunger: Science must stop neglecting smallholder farmers. (2020). Nature, 586(7829), 336–336. https://doi.org/10.1038/d41586-020-02849-6

Fanzo, J., Arabi, M., Burlingame, B., Haddad, L., Kimenju, S., Miller, G., Nie, F., Recine, E., Serra-Majem, L., & Sinha, D. (2017). Nutrition and food systems. A Report by the High Level Panel of Experts on Food Security and Nutrition of the Committee on World Food Security.

Gil, Y., Greaves, M., Hendler, J., & Hirsh, H. (2014). Amplify scientific discovery with artificial intelligence. Science, 346(6206), 171–172. https://doi.org/10.1126/science.1259439

Herrero, M., Thornton, P. K., Mason-D'Croz, D., Palmer, J., Benton, T. G., Bodirsky, B. L., Bogard, J. R., Hall, A., Lee, B., Nyborg, K., Pradhan, P., Bonnett, G. D., Bryan, B. A., Campbell, B. M., Christensen, S., Clark, M., Cook, M. T., de Boer, I. J. M., Downs, C., … West, P. C. (2020). Innovation can accelerate the transition towards a sustainable food system. Nature Food, 1(5), 266–272. https://doi.org/10.1038/s43016-020-0074-1

Laborde, D., Porciello, J., Smaller, C., Murphy, S., & Parent, M. (2020). Ceres2030: Sustainable Solutions to End Hunger Summary Report [Report]. Ceres2030. https://ecommons.cornell.edu/handle/1813/72799

Liverpool-Tasie, L. S. O., Wineman, A., Young, S., Tambo, J., Vargas, C., Reardon, T., Adjognon, G. S., Porciello, J., Gathoni, N., Bizikova, L., Galiè, A., & Celestin, A. (2020). A scoping review of market links between value chain actors and small-scale producers in developing regions. Nature Sustainability, 3(10), 799–808. https://doi.org/10.1038/s41893-020-00621-2

Mahlstein, I., Daniel, J. S., & Solomon, S. (2013). Pace of shifts in climate regions increases with global temperature. Nature Climate Change, 3(8), 739–743. https://doi.org/10.1038/nclimate1876

Piñeiro, V., Arias, J., Dürr, J., Elverdin, P., Ibáñez, A. M., Kinengyere, A., Opazo, C. M., Owoo, N., Page, J. R., Prager, S. D., & Torero, M. (2020). A scoping review on incentives for adoption of sustainable agricultural practices and their outcomes. Nature Sustainability, 3(10), 809–820. https://doi.org/10.1038/s41893-020-00617-y

Pingali, P. (2015). Agricultural policy and nutrition outcomes – getting beyond the preoccupation with staple grains. Food Security, 7(3), 583–591. https://doi.org/10.1007/s12571-015-0461-x

Porciello, J., Ivanina, M., Islam, M., Einarson, S., & Hirsh, H. (2020). Accelerating evidence-informed decision-making for the Sustainable Development Goals using machine learning. Nature Machine Intelligence, 2(10), 559–565. https://doi.org/10.1038/s42256-020-00235-5

Reardon, T., Echeverria, R., Berdegué, J., Minten, B., Liverpool-Tasie, S., Tschirley, D., & Zilberman, D. (2019). Rapid transformation of food systems in developing regions: Highlighting the role of agricultural research & innovations. Agricultural Systems, 172, 47–59. https://doi.org/10.1016/j.agsy.2018.01.022

Reardon, T., Lu, L., & Zilberman, D. (2019). Links among innovation, food system transformation, and technology adoption, with implications for food policy: Overview of a special issue. Food Policy, 83, 285–288. https://doi.org/10.1016/j.foodpol.2017.10.003

Ricciardi, V., Wane, A., Sidhu, B. S., Godde, C., Solomon, D., McCullough, E., Diekmann, F., Porciello, J., Jain, M., Randall, N., & Mehrabi, Z. (2020). A scoping review of research funding for small-scale farmers in water scarce regions. Nature Sustainability, 3(10), 836–844. https://doi.org/10.1038/s41893-020-00623-0

Rural Development Report 2021 – Transforming food systems for rural prosperity. (n.d.). IFAD. Retrieved October 18, 2021, from https://www.ifad.org/en/web/latest/-/rdr-2021-launch

Serraj, R., & Pingali, P. (2018). Agriculture & Food Systems to 2050: Global Trends, Challenges and Opportunities (Vol. 2). World Scientific.

Stathers, T., Holcroft, D., Kitinoja, L., Mvumi, B. M., English, A., Omotilewa, O., Kocher, M., Ault, J., & Torero, M. (2020). A scoping review of interventions for crop postharvest loss reduction in sub-Saharan Africa and South Asia. Nature Sustainability, 3(10), 821–835. https://doi.org/10.1038/s41893-020-00622-1

Webb, P., & Kennedy, E. (2014). Impacts of Agriculture on Nutrition: Nature of the Evidence and Research Gaps. Food and Nutrition Bulletin, 35(1), 126–132. https://doi.org/10.1177/156482651403500113

Zilberman, D., Lipper, L., McCarthy, N., & Gordon, B. (2018). Innovation in response to climate change. In Climate smart agriculture (pp. 49–74). Springer, Cham.

Porciello, Jaron; Bourne, Thomas; Lipper, Leslie; Lin, Sammi; and Langleben, Sarah. 2021. Mining the Gaps: Using Machine-Learning to Map a Million Data Points on Agricultural Research from the Global South. Colombo, Sri Lanka: Commission on Sustainable Agriculture Intensification.

Copyright © 2021, Commission on Sustainable Agriculture Intensification

ATTRIBUTION: The work must be referenced according to international citation standards, while attribution should in no way suggest endorsement by WLE, IWMI or the author(s).

NON-COMMERCIAL: This work may not be used for commercial purposes.
Note: This report uses proprietary datasets created through secondary research. Hence, unless mentioned otherwise, data and graphs in this report are derived from this database. havos.org @AiHavosjp @havos.org