Chroma Context-1
Training a Self-Editing Search Agent
Two key trends motivated us to train a specialized model for agentic search:
- Agentic search is a subtask of most agentic workflows (e.g. a coding agent first needs to find relevant code snippets before generating any new code).
- Offloading defined tasks to subagents is becoming increasingly common as a way to keep the main orchestrator agent’s context focused.
We post-trained gpt-oss-20b with a staged training curriculum across 4 domains, and achieved performance comparable to frontier models at a fraction of the cost and latency. It’s worth noting that training a specialized search agent is not a novel contribution in itself, and we don’t claim to generalize across all use cases given the tight scope of our tasks. Our main contribution, as I see it, is open-sourcing the resources for training such an agent (data generation pipeline, model weights, training process) and setting a foundation for future work to build on.
The research direction I’m particularly interested in is expanding these evals to be more representative of real use cases. Our generated questions are scoped to NIAH-style search, where the task is to find a specific piece of information based on various criteria, and our tools are similarly limited. In reality, users ask many different kinds of questions. One example is breadth-style questions, where the answer requires finding all documents satisfying a given criterion (e.g. “find me every file in the repo that imports module X”) rather than the single most-relevant result, which is a fundamentally different retrieval shape. Real tasks also call for a broader set of tools, such as metadata filtering and hierarchical search.
One idea explored in this report is a self-editing context, a direction I’m interested in pursuing further. Current approaches to modifying context tend to revolve around some form of compaction, where parts of the existing context are removed and whatever is dropped is permanently lost. I wonder if there could be a different approach, where none of the context is discarded and a dedicated smaller model instead rebuilds the relevant context for each turn via agentic search over the full history.
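To make the contrast with compaction concrete, here is a minimal sketch of that idea. The class name, interface, and the keyword-overlap scoring are all illustrative assumptions, not the report's implementation: the scoring function is a stand-in for where the dedicated smaller model would run agentic search. The key property is that `history` is append-only, so nothing is ever lost; only the per-turn working context changes.

```python
from dataclasses import dataclass, field

@dataclass
class SelfEditingContext:
    """Sketch: keep the full history append-only; rebuild a small working
    context each turn instead of compacting, so nothing is discarded."""
    history: list[str] = field(default_factory=list)

    def append(self, turn: str) -> None:
        # Unlike compaction, turns are only ever added, never removed.
        self.history.append(turn)

    def rebuild(self, query: str, k: int = 3) -> list[str]:
        # Stand-in for the dedicated smaller model: score each past turn
        # by keyword overlap with the current query and keep the top-k.
        # A real system would run agentic search over the history here.
        q = set(query.lower().split())
        scored = sorted(
            self.history,
            key=lambda turn: len(q & set(turn.lower().split())),
            reverse=True,
        )
        return scored[:k]

ctx = SelfEditingContext()
ctx.append("user asked which files import module X")
ctx.append("agent listed files importing module X")
ctx.append("unrelated chat about deployment schedules")

working_context = ctx.rebuild("files that import module X", k=2)
```

The irrelevant turn stays out of the rebuilt working context but remains in `history`, so a later turn with a different query can still surface it.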