Generative AI for Agriculture (GAIA)-Phase II First Year Report: April to November 2025

November 30, 2025

Submitted by: Gerrit Hoogenboom, VJ Joshi, and Willingthon Pavan
University of Florida

Background

In the GAIA-Phase II project, the University of Florida (UF) team's main focus is on using dynamic crop simulation models to enhance advisory capabilities for smallholder farmers. By integrating dynamic crop simulation models with site-specific weather and soil conditions and with crop management and genetics information, we are developing strategies to provide timely, context-specific recommendations that supplement static advisories (Figure 1).

Figure 1: Workflow demonstrating a scalable framework for integrating Generative Artificial Intelligence (GenAI) into parameterizing, setting up files for, and executing the DSSAT-Pythia crop simulation model.

In the current phase, we are integrating Generative AI (GenAI) with the process-based crop models available within the Decision Support System for Agrotechnology Transfer (DSSAT; www.DSSAT.net) to create an adaptive decision support system that can interpret farmers' and other users' queries, auto-configure simulations, and translate simulation outputs into actionable insights. Conceptually, we have planned two major approaches: a tactical workplan that enables near-real-time crop model simulation runs to support in-season crop management decisions, and a strategic workplan that involves running site-specific historical simulations with multiple management scenarios and organizing the outputs as a queryable database for management planning. The first year of work reported here has focused on the tactical workplan, where the emphasis is on automating the full simulation pipeline from a user's query to a model-based advisory.

Tactical workplan

The main objective of the tactical workplan is to enable near-real-time crop model simulation runs by automating the parameterization and execution of the crop model, thereby lowering the technical threshold required to configure and execute crop model simulations.

Figure 2: A prototype of a multi-agent workflow demonstrating a network of specialized AI agents and their specific tasks.

We have framed this problem as mapping a user's question into a set of points in model parameter space, conditioned on both space (location specificity, crop and management context) and time (season specificity and the moment of decision-making), and then generating model outputs that can aid decisions. Using GenAI, we have built a first prototype system for space-time-aware model parameterization and execution. The workflow is designed to infer the location (later converted into geocoordinates), crop of interest, growing season, and management decision context directly from the user's query and to translate these into model inputs (a minimal illustration of this mapping follows below). Here we have three specific objectives. First, we aim to provide model inputs that are highly specific to the location, growing season, crop, problem, and time of decision-making. Second, we aim to configure these inputs and set up experimental files in fully DSSAT-compatible formats, including weather, soil, and crop management treatments. Third, we aim to execute the simulation runs for the desired experiment, return the model outputs, and analyze and synthesize them in a plain-language form that directly addresses the user's question.
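As a minimal illustration of this query-to-parameters mapping, a question about maize planting dates at a given site might be reduced to a structured specification along the following lines. All class and field names here are hypothetical placeholders, not the prototype's actual schema, and the values are invented for illustration.

# Hypothetical sketch: a user query is reduced to a structured
# specification that downstream agents can turn into DSSAT input files.
# All names and values below are illustrative, not the system's schema.
from dataclasses import dataclass, field

@dataclass
class SimulationSpec:
    latitude: float                # resolved from the place name in the query
    longitude: float
    crop: str                      # e.g., "maize"
    season_start: str              # growing season of interest (ISO date)
    decision_date: str             # the "moment of decision-making"
    management_factor: str         # e.g., "planting_date", "n_fertilizer_rate"
    factor_levels: list = field(default_factory=list)   # treatment levels to simulate
    target_outputs: list = field(default_factory=list)  # e.g., ["grain_yield"]

# "When should I plant maize at my site this season?" might become:
spec = SimulationSpec(
    latitude=-0.55, longitude=37.32, crop="maize",
    season_start="2025-10-15", decision_date="2025-09-30",
    management_factor="planting_date",
    factor_levels=["2025-10-15", "2025-10-30", "2025-11-15"],
    target_outputs=["grain_yield"],
)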
To achieve this, we have been working on a multi-agent GenAI workflow in which specialized agents perform distinct but coordinated tasks (Figure 2). When a user asks a question, Agent 1 (Orchestrator) first determines whether any simulation is required. Not all questions require running simulations: if the question can be answered from the existing knowledge base, such as prior simulations or the literature, the Orchestrator responds directly. If the question concerns scenarios of planting dates, fertilizer timing or rates, or irrigation amounts or timing, for which simulations are needed, the Orchestrator forwards the question to Agent 2.

Agent 2 (Experiment Designer) translates the user's natural-language question into a formal experimental design in DSSAT-compatible format. It identifies the model parameters and management factors of interest in the question, such as planting date, fertilizer, and irrigation, and also interprets the outputs requested by the user, such as crop yield and water use. Agent 2 then identifies the specific simulation site and the current date from the user's inputs. This triggers the automatic generation of DSSAT-compatible weather (.WTH) and soil (.SOL) files. Agent 2 then specifies the design and levels of treatments, constructing the experimental file.

The experimental configuration produced by Agent 2 is then evaluated by Agent 3 (Quality Check). This agent examines whether the experiment actually tests the parameters of interest to the user, whether the treatment levels represent a meaningful and realistic range, and whether the run will produce the outputs that answer the user's question. If any shortcomings are detected, Agent 3 sends critiques and revision instructions back to Agent 2, creating an iterative loop that refines the experimental file.

Once the experimental file passes quality control, Agent 4 (AWS Agent) takes over the execution phase. This agent uploads all generated input files to AWS, including the site-specific weather files, soil files, and experimental files, and then executes the DSSAT-CSM runs on AWS. Once the simulations are complete, Agent 5 (Analytics Agent) processes the simulation output files. It filters the outputs to retain only those variables that address the user's query and then performs specific targeted analyses, such as comparing treatment levels or identifying optimum input levels across different scenarios. Agent 5 then converts these quantitative outputs into a concise, plain-English summary suitable for non-technical users.

Specific tasks

Following the prototype and the tactical workplan, we have worked on several specific tasks. So far, our work has focused on building the foundational data pipelines that make automated DSSAT execution possible. We have completed two production-oriented standalone modules, one for weather (Task 1) and another for soil (Task 2) (Figure 3). The soil module ingests geocoordinates and outputs DSSAT-ready, site-specific .SOL files. It first examines a geospatial raster layer to identify the relevant grid cell, resolves a site identifier through a linked database index, and programmatically extracts and formats the correct soil profile from the larger repository (a sketch of this lookup appears later in this section).

Figure 3: Different steps and tasks within the tactical workflow for near-real-time model execution.

The weather module retrieves daily weather variables from the Open-Meteo API; a compressed sketch of its core steps follows.
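The sketch below assumes the Open-Meteo archive endpoint and a simplified .WTH header; the production module additionally handles the reverse-geocoded station code, elevation, TAV/AMP estimates, and the quality-assurance checks described below.

# Minimal sketch (not the production module): fetch daily weather from
# the Open-Meteo archive API and write a simplified DSSAT .WTH file.
# The station code and file name are placeholders; -99 marks values
# (elevation, TAV, AMP) that the real module estimates separately.
import requests
from datetime import date

def write_wth(lat, lon, start, end, insi="UFGA", path="UFGA2501.WTH"):
    r = requests.get(
        "https://archive-api.open-meteo.com/v1/archive",
        params={
            "latitude": lat, "longitude": lon,
            "start_date": start, "end_date": end,
            "daily": "shortwave_radiation_sum,temperature_2m_max,"
                     "temperature_2m_min,precipitation_sum",
            "timezone": "UTC",
        },
        timeout=30,
    )
    r.raise_for_status()
    d = r.json()["daily"]
    with open(path, "w") as f:
        f.write(f"*WEATHER DATA : Open-Meteo {lat:.3f} {lon:.3f}\n\n")
        f.write("@ INSI      LAT     LONG  ELEV   TAV   AMP REFHT WNDHT\n")
        f.write(f"  {insi} {lat:8.3f} {lon:8.3f}   -99   -99   -99  2.00  3.00\n")
        f.write("@DATE  SRAD  TMAX  TMIN  RAIN\n")
        for t, srad, tmax, tmin, rain in zip(
            d["time"], d["shortwave_radiation_sum"],
            d["temperature_2m_max"], d["temperature_2m_min"],
            d["precipitation_sum"],
        ):
            doy = date.fromisoformat(t).timetuple().tm_yday  # DSSAT YYDDD date
            f.write(f"{t[2:4]}{doy:03d} {srad:5.1f} {tmax:5.1f} {tmin:5.1f} {rain:5.1f}\n")

Open-Meteo's daily shortwave_radiation_sum is reported in MJ/m2, which matches the units DSSAT expects for SRAD, so no conversion is needed for that variable.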
The module also performs reverse geocoding to generate a location-specific station code and writes a standard DSSAT-format .WTH file. To ensure the quality of the retrieved weather data, we have developed and embedded quality-assurance checks. To make these backend pipelines accessible to users, we have created an interactive Streamlit dashboard that supports map-based location selection for weather generation, as well as upload-and-analyze workflows for existing .WTH files (Figure 4).

Figure 4: Interface of the dashboard that supports map-based location selection for weather generation or upload-and-analyze workflows for existing weather files.

The dashboard also provides interactive analytics for temperature distributions and rainfall patterns (Figure 5). These components have been integrated into the multi-agent execution pipeline by extending Agent 2 (Figure 2) to parse the site identifier from the newly generated soil file and the station code from the weather file. The agent uses this information to populate the DSSAT field section (ID_SOIL and WSTA) during experimental file assembly, enabling a single command-line run to orchestrate file generation, metadata extraction, and experimental file construction for any specified location. We are now extending this pipeline to accept GeoJSON inputs with many geocoordinates, iterate over each point, generate per-point .SOL, .WTH, and experimental files, and store the outputs in per-location directories, enabling regional-scale simulations.

Figure 5: Quality assurance checks of weather data and visualization of monthly rainfall distribution.

In parallel, we have worked on advancing weather-data integration from different sources, improving the geospatial infrastructure, and developing large language model (LLM)-based summarization and evaluation workflows. One major line of work focused on reviewing and strengthening the DSSAT-Pythia codebase and its surrounding data pipelines. We conducted a comprehensive architectural analysis of Pythia, identifying redundant logic, eliminating unused code paths, and documenting structural gaps and optimization opportunities. We completed the integration of NASA POWER weather data directly from the AWS S3 Zarr repository, reducing dependence on API calls and mitigating rate-limit challenges. CHIRPS v3 precipitation data were also fully evaluated, compared against NASA POWER estimates, and successfully merged into a unified ICASA-compatible dataset ready for DSSAT workflows. Elevation and soil-profile data handling were also substantially improved, including the creation of a raster-based, two-band soil mapping system that eliminates the old SQLite dependency and provides a more scalable and DSSAT-aligned configuration.

Work is ongoing to implement a more efficient, multi-threaded weather-data pipeline that can process large regions with hundreds of coordinate points. The new approach aims to parallelize S3 extraction, chunk time ranges for faster batch downloads, consolidate outputs into regional NetCDF or Zarr datasets, and then generate ICASA files with much higher throughput (see the second sketch below). At the same time, approved enhancements, such as the raster-based soil workflows, are being refactored into the main Pythia codebase with attention to backward compatibility. We are also refining the soil-profile integration to allow users to pair custom SOIL.SOL files with raster encodings, aligning directly with native DSSAT conventions (the first sketch below illustrates this lookup).
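An illustrative sketch of the raster-based soil lookup follows. The raster file name, band semantics, index mapping, and profile IDs are assumptions made for illustration, not the project's actual configuration.

# Illustrative sketch of a raster-based soil lookup: sample the soil
# raster at a point, resolve the soil profile ID, and copy the matching
# profile block out of a master .SOL repository.
import rasterio

def extract_sol(lat, lon, raster="soil_map.tif", master="MASTER.SOL",
                index={1: "IB00000001"}, out="SITE.SOL"):  # hypothetical mapping
    with rasterio.open(raster) as ds:
        cell = next(ds.sample([(lon, lat)]))[0]   # band-1 value at the point
    soil_id = index[int(cell)]                    # raster cell -> soil profile ID
    block, keep = [], False
    with open(master) as f:
        for line in f:
            if line.startswith("*"):              # each profile starts with "*<ID>"
                parts = line[1:].split()
                keep = bool(parts) and parts[0] == soil_id
            if keep:
                block.append(line)
    with open(out, "w") as f:
        f.write("*SOILS: Site-specific extract\n\n")
        f.writelines(block)
    return soil_id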
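For the multi-threaded weather pipeline, the per-point parallelization might be organized along the following lines, assuming an xarray-readable Zarr store; the store path and thread-pool arrangement are placeholders, as the implementation is still in progress.

# Sketch of the planned per-point parallel extraction: open the Zarr
# store once, then pull each point's daily time series concurrently.
# The store path is a placeholder, dimension names (lat/lon) are
# assumptions, and reading s3:// URLs requires s3fs to be installed.
from concurrent.futures import ThreadPoolExecutor
import xarray as xr

def extract_points(points, store="s3://example-bucket/power_daily.zarr"):
    ds = xr.open_zarr(store, consolidated=True)
    def one_point(latlon):
        lat, lon = latlon
        # nearest-grid-cell selection of the full daily series at this point
        return ds.sel(lat=lat, lon=lon, method="nearest").load()
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(one_point, points))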
In addition to developing and advancing the data pipelines that enable automated modeling, we have worked on LLM-based summarization and evaluation (Figure 2, Agent 5). One stream focused on handling large files by breaking them into smaller chunks and applying similarity-search techniques (FAISS with K-Means) to retrieve the most relevant segments before summarization through recursive LLM methods. Smaller DSSAT simulation output (.OUT) files were also processed directly in context, and prompting strategies are being refined to achieve more targeted and actionable summaries. A new LLM evaluator was created using Evidently to assess summary quality against stakeholder questions, implementing the LLM-as-a-Judge paradigm to score and validate outputs. The overall workflow is now running in Python and Google Colab, using models such as GPT-4-mini and Claude 4.5. The next steps will focus on tuning prompts and formalizing the criteria for what constitutes a high-quality summary.

Collectively, these efforts reflect strong progress in both the data-engineering and AI components of the project. Core pipelines for weather, soil, and geospatial processing are becoming faster, more scalable, and more flexible, while the summarization sub-project is building a robust evaluation framework to ensure the accuracy and reliability of LLM-generated summaries. Upcoming work will continue to optimize weather-data reuse, consolidate CHIRPS into a centralized, daily-updated store, refine prompts for improved summarization responses, and finalize the integration of the new Pythia workflows through pull requests and architectural updates.

We are completing the initial design and implementation of the DSSAT experimental file (FileX) generation workflow associated with Agent 2 (Figure 2). Based on the crop and location provided by a farmer or user, we have established a GenAI/LLM-enabled approach that generates a DSSAT experimental file with proper estimation of parameter values and in the correct format. We have implemented a multi-agent architecture using LangGraph, supported by a Python/Java stack and integrating GPT-4o (model-agnostic by design), DSSATTools, the OpenAI interface, and XB2-build for formatting correction. The system is structured around specialized AI agents that are responsible for distinct sections of FileX, such as field setup, planting, and fertilizer, and that communicate through a shared state carrying context and intermediate outputs across steps. We have finalized the foundational structure for all section agents and written detailed instruction prompts such that each agent produces a complete section with no missing values. As a result, the end-to-end pipeline is now capable of generating a fully populated FileX that is correctly structured and formatted to run a DSSAT simulation.

Collaboration with SCiO

Last September, the SCiO team hosted a meeting at their office in Athens, Greece, where our team from the University of Florida exchanged detailed updates with the SCiO team on our respective workplans and identified areas for joint work. The central outcome of the meeting was an ongoing collaboration to align interfaces between our GenAI-driven experiment-file creation workflow and SCiO's work on data infrastructure and APIs. One collaborative aim in our current work is agreement on a structured JSON input that provides the set of variables required for DSSAT parameterization; a hypothetical example follows.
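The field names and structure below are placeholders pending the schema to be agreed with SCiO; the example only conveys the kind of information such a specification would carry.

# Hypothetical example of the structured JSON input under discussion.
# Field names are placeholders, not the agreed specification.
import json

filex_request = {
    "site": {"latitude": -0.55, "longitude": 37.32, "country": "KEN"},
    "crop": "maize",
    "cultivar": None,                 # to be resolved from SCiO databases
    "season": {"start": "2025-10-01", "end": "2026-03-31"},
    "management": {
        "planting_dates": ["2025-10-15", "2025-10-30"],
        "n_rates_kg_ha": [0, 40, 80],
        "irrigation": "rainfed",
    },
    "outputs": ["grain_yield", "water_use"],
}
print(json.dumps(filex_request, indent=2))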
Under this plan, UF will provide SCiO with the JSON specification, and SCiO will use their databases to retrieve the site-specific crop management information needed to configure FileX. In parallel, the SCiO team is developing API endpoints to additional databases to further extend this configuration process. The intent is that whenever a new location is introduced or a new set of model inputs is required, our workflow can make programmatic API calls to retrieve location- and crop-specific data products on demand (a sketch of this pattern follows). This API-based integration improves scalability and enables end-to-end automation of the pipeline without manual data assembly and preparation.
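The intended retrieval pattern might look like the sketch below. The endpoint URL, route, and response fields are placeholders, since SCiO's API is still under development.

# Sketch of the intended on-demand retrieval: post the agreed JSON
# specification and receive site- and crop-specific inputs for FileX
# configuration. The URL and response contents are placeholders.
import requests

def fetch_dssat_inputs(spec: dict, base_url="https://api.example.org/gaia"):
    r = requests.post(f"{base_url}/dssat-inputs", json=spec, timeout=60)
    r.raise_for_status()
    return r.json()   # e.g., cultivar information, local management defaults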