
2 posts tagged with "responses-api"


Your Agent, Your Rules: Building Powerful Agents with the Responses API in Llama Stack

· 5 min read

The Responses API is rapidly emerging as one of the most influential interfaces for building AI agents. It handles multi-step reasoning, tool orchestration, and conversational state in a single interaction, a marked improvement over the manual orchestration loops developers previously had to build on top of chat completion APIs. Llama Stack's implementation of the Responses API brings these capabilities to the open source world, where you can choose your own models and run on your own infrastructure.

This post covers why the Responses API matters, what Llama Stack's implementation enables, and how it connects to the broader move toward open agent standards like Open Responses.
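To make the contrast concrete, here is a minimal sketch of the manual orchestration loop that chat completion APIs required. The `call_model` and `run_tool` functions are hypothetical stand-ins (not Llama Stack APIs) for a chat completions call and local tool execution:

```python
# A sketch of the hand-rolled agent loop the Responses API replaces.
# call_model and run_tool are illustrative stubs, not real API calls.

def call_model(messages):
    # Stand-in model: request one tool call, then answer using its result.
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "search_docs", "args": {"q": "payments"}}}
    return {"content": "Answer based on tool results."}

def run_tool(call):
    # Stand-in tool execution.
    return f"results for {call['args']['q']}"

def agent_loop(question):
    messages = [{"role": "user", "content": question}]
    while True:
        reply = call_model(messages)
        if "tool_call" in reply:
            # Developer's job under chat completions: detect the tool call,
            # run it, append the result, and call the model again.
            messages.append({"role": "tool",
                             "content": run_tool(reply["tool_call"])})
            continue
        return reply["content"]

print(agent_loop("What changed in the payments service?"))
```

With the Responses API, this entire loop collapses into a single request; the server orchestrates tool calls and conversation state for you.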

Building a Self-Improving Agent with Llama Stack

· 7 min read
Raghotham Murthy
Llama Stack Core Team

What if your AI agent could improve itself? Most agent tutorials show a single loop — user asks a question, the agent calls some tools, returns an answer. But what happens when you need to systematically improve your agent's behavior over time?

In this post, we build a ResearchAgent that answers questions from an internal engineering knowledge base — and gets better at it automatically. The agent uses the Responses API agentic loop with file_search and client-side tools to research questions, and it owns its own system prompt. Every N calls, it benchmarks itself by using a different model to judge the results, and rewrites its own prompt via the Prompts API.

This is literally self-referential: a Llama Stack agent evaluating and improving itself using the Responses API, Prompts API, and Vector Stores as its toolkit.
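The improvement cycle described above can be sketched in a few lines. Everything here is an illustrative stub: `answer` stands in for a Responses API call with `file_search`, `judge_score` for the second judge model, and the prompt rewrite for the Prompts API update; the threshold, interval, and prompt text are assumptions, not values from the post:

```python
# Hypothetical sketch of a self-improving agent loop.

N = 3  # benchmark every N calls (assumed interval)

class ResearchAgent:
    def __init__(self):
        self.system_prompt = "You answer engineering questions concisely."
        self.calls = 0
        self.history = []

    def answer(self, question):
        # Stand-in for a Responses API call using file_search over
        # the internal knowledge base.
        result = f"[{self.system_prompt}] answer to: {question}"
        self.history.append((question, result))
        self.calls += 1
        if self.calls % N == 0:
            self.benchmark()
        return result

    def judge_score(self):
        # Stand-in for a different model grading recent answers (0.0-1.0).
        return 0.5

    def benchmark(self):
        # If the judge is unhappy, rewrite the system prompt
        # (in the real agent, via the Prompts API).
        if self.judge_score() < 0.8:
            self.system_prompt += " Cite sources from the knowledge base."

agent = ResearchAgent()
for q in ["q1", "q2", "q3"]:
    agent.answer(q)
print(agent.system_prompt)
```

The key design point is that the prompt is owned by the agent, not hard-coded by the developer, so the benchmark-and-rewrite step can change behavior without a code deploy.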