Tools

ToolGroups API Removed

The ToolGroups API (/v1/toolgroups, /v1/tools, /v1/tool-runtime/*) has been removed. Built-in tools (web search, RAG, WolframAlpha) are now auto-registered based on your provider configuration. For MCP servers, use the Connectors API.

Tools are functions that an agent can invoke to perform tasks. They are organized into tool groups and registered with specific providers. Each tool group is a collection of related tools from a single provider. Grouping tools lets their state be externalized: the tools in a group typically operate on the same state.

For example, a "db_access" tool group could contain tools for interacting with a database, such as "list_tables", "query_table", and "insert_row".
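To make the shared-state idea concrete, here is a minimal sketch of such a group in plain Python. The class name DbAccessTools and the use of an in-memory SQLite database are assumptions made for illustration; this is not how Llama Stack implements or registers tool groups.

```python
import sqlite3


class DbAccessTools:
    """Hypothetical "db_access" group: three tools sharing one connection."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn  # the shared state every tool in the group operates on

    def _check_table(self, table: str) -> str:
        # Table names cannot be bound as SQL parameters, so validate them
        # against the tables that actually exist.
        if table not in self.list_tables():
            raise ValueError(f"unknown table: {table}")
        return table

    def list_tables(self) -> list:
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]

    def query_table(self, table: str, limit: int = 10) -> list:
        table = self._check_table(table)
        return self.conn.execute(
            f'SELECT * FROM "{table}" LIMIT ?', (limit,)
        ).fetchall()

    def insert_row(self, table: str, values: list) -> int:
        table = self._check_table(table)
        placeholders = ", ".join("?" for _ in values)
        cur = self.conn.execute(
            f'INSERT INTO "{table}" VALUES ({placeholders})', values
        )
        self.conn.commit()
        return cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
tools = DbAccessTools(conn)
tools.insert_row("users", [1, "ada"])
rows = tools.query_table("users")
```

All three tools read and write through the same connection, which is the sense in which a group "operates on the same state".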

When instantiating an agent, you can provide a list of the tool groups it has access to. The agent retrieves the tool definitions for those tool groups and passes them along to the model.

Refer to the Building AI Applications notebook for more examples on how to use tools.

Server-side vs. Client-side Tool Execution

Llama Stack supports both server-side and client-side tools. With server-side tools, agent.create_turn transparently executes the tool calls emitted by the model and returns the final answer to the user. If client-side tools are provided, the tool call is sent back to the user for execution, and the turn can optionally be continued with the agent.resume_turn method.
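The difference in control flow can be illustrated with a toy model. Everything below (ToolCall, Turn, toy_create_turn, toy_resume_turn) is hypothetical and only mimics the behavior described above; it is not the Llama Stack client API or its response types.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class Turn:
    """Toy turn result; not the real Llama Stack response type."""
    answer: Optional[str] = None
    pending: list = field(default_factory=list)  # calls the client must run


def toy_create_turn(calls: list, server_tools: dict) -> Turn:
    """Run server-side calls inline; hand anything else back to the caller."""
    results, pending = [], []
    for call in calls:
        if call.name in server_tools:
            # Server-side: executed within the turn, invisible to the caller.
            results.append(server_tools[call.name](**call.args))
        else:
            # Client-side: returned so the caller can execute it.
            pending.append(call)
    if pending:
        return Turn(pending=pending)
    return Turn(answer=f"results: {results}")


def toy_resume_turn(turn: Turn, client_tools: dict) -> Turn:
    """The client executes the pending calls, then the turn continues."""
    results = [client_tools[c.name](**c.args) for c in turn.pending]
    return Turn(answer=f"results: {results}")


def add(a: int, b: int) -> int:
    return a + b


calls = [ToolCall("add", {"a": 1, "b": 2})]
server_side = toy_create_turn(calls, server_tools={"add": add})
handed_back = toy_create_turn(calls, server_tools={})
client_side = toy_resume_turn(handed_back, client_tools={"add": add})
```

In the server-side case the caller sees only the final answer; in the client-side case the first call returns pending tool calls, and the answer arrives after the resume step.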

Server-side Tools

Llama Stack provides built-in providers for some common tools. These include web search, math, and RAG capabilities.

Built-in tool groups are automatically registered based on your configured tool_runtime providers. For example, if you have a remote::tavily-search provider configured, the builtin::websearch tool group is auto-registered.

Web Search

Three providers can execute the web search tool calls generated by a model: Brave Search, Bing Search, and Tavily Search. Configure any of these as a tool_runtime provider and the builtin::websearch tool group will be auto-registered.

The tool requires an API key, which can be provided either in the configuration or through the X-LlamaStack-Provider-Data request header. The header value is a JSON object of the form:

{"<provider_name>_api_key": "<your api key>"}
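As a rough sketch, the header can be assembled with the standard library. The helper name provider_data_headers and the provider name "tavily_search" are assumptions used for illustration; check your provider's documentation for the exact key it expects.

```python
import json


def provider_data_headers(provider_name: str, api_key: str) -> dict:
    """Build the X-LlamaStack-Provider-Data header for one provider's API key."""
    payload = {f"{provider_name}_api_key": api_key}
    # The header value is the JSON-encoded payload.
    return {"X-LlamaStack-Provider-Data": json.dumps(payload)}


# "tavily_search" and "example-key" are placeholder values.
headers = provider_data_headers("tavily_search", "example-key")
```

The resulting dictionary can be merged into the headers of any HTTP request sent to the Llama Stack server.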

Math

The WolframAlpha tool provides access to computational knowledge through the WolframAlpha API. Configure the remote::wolfram-alpha provider in your tool_runtime providers and the builtin::wolfram_alpha tool group will be auto-registered.

RAG

The RAG tool enables retrieval of context from various types of memory banks (vector, key-value, keyword, and graph). Configure the inline::rag-runtime provider in your tool_runtime providers and the builtin::rag tool group will be auto-registered.

Features:

  • Support for multiple memory bank types
  • Configurable query generation
  • Context retrieval with token limits
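To give a feel for what retrieval under a token limit involves, here is a minimal sketch that packs the highest-scoring chunks into a fixed budget. The function pack_context, the (score, text) chunk format, and the whitespace-based token estimate are all assumptions; the RAG runtime's actual selection logic and tokenizer may differ.

```python
def pack_context(chunks: list, max_tokens: int) -> str:
    """Keep the highest-scoring chunks that fit within max_tokens.

    chunks is a list of (score, text) pairs.
    """
    selected, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = len(text.split())  # crude estimate: whitespace-separated words
        if used + cost > max_tokens:
            continue  # skip this chunk; a smaller one may still fit
        selected.append(text)
        used += cost
    return "\n".join(selected)


context = pack_context(
    [(0.9, "a b c"), (0.5, "d e f g"), (0.7, "h i")],
    max_tokens=5,
)
```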

Model Context Protocol (MCP)

MCP is an increasingly popular standard for tool discovery and execution. The protocol allows tools to be dynamically discovered from an MCP endpoint and used to extend the agent's capabilities.

Using Remote MCP Servers

You can find some popular remote MCP servers here. Register them as connectors in your stack config:

connectors:
- connector_id: "mcp::deepwiki"
  connector_type: mcp
  url: "https://mcp.deepwiki.com/sse"

Then reference the connector by connector_id when creating the Agent:

agent = Agent(
    client,
    model="meta-llama/Llama-3.2-3B-Instruct",
    instructions="You are a helpful assistant.",
    tools=[
        {
            "type": "mcp",
            "connector_id": "mcp::deepwiki",
            "server_label": "deepwiki",
        }
    ],
)
agent.create_turn(...)

Note that most of the more useful MCP servers require authentication, and many of them use OAuth 2.0. You can provide the authorization token in the tool definition:

tools=[
    {
        "type": "mcp",
        "connector_id": "mcp::deepwiki",
        "server_label": "deepwiki",
        "authorization": "<your_access_token>",  # OAuth token (without "Bearer " prefix)
    }
]

Alternatively, you can pass the server_url directly without registering a connector:

tools=[
    {
        "type": "mcp",
        "server_url": "https://mcp.deepwiki.com/sse",
        "server_label": "deepwiki",
    }
]
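The three variants above differ only in a few keys, so a small helper can build them consistently. The function mcp_tool is hypothetical, and the rule that exactly one of connector_id or server_url must be given is an assumption of this sketch rather than a documented constraint.

```python
from typing import Optional


def mcp_tool(
    server_label: str,
    connector_id: Optional[str] = None,
    server_url: Optional[str] = None,
    authorization: Optional[str] = None,
) -> dict:
    """Build an MCP tool definition from a connector_id or a server_url."""
    # Assumption of this sketch: exactly one way of locating the server is given.
    if (connector_id is None) == (server_url is None):
        raise ValueError("provide exactly one of connector_id or server_url")
    tool = {"type": "mcp", "server_label": server_label}
    if connector_id is not None:
        tool["connector_id"] = connector_id
    else:
        tool["server_url"] = server_url
    if authorization:
        # The token is passed without the "Bearer " prefix.
        tool["authorization"] = authorization.removeprefix("Bearer ")
    return tool


by_connector = mcp_tool("deepwiki", connector_id="mcp::deepwiki")
by_url = mcp_tool("deepwiki", server_url="https://mcp.deepwiki.com/sse")
```

Stripping the prefix in one place guards against the easy mistake of pasting a full "Bearer ..." header value into the authorization field. Note that str.removeprefix requires Python 3.9 or later.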

Running Your Own MCP Server

Here's an example of running a simple MCP server that exposes a file system as a set of tools to the Llama Stack agent.

# Start your MCP server
mkdir /tmp/content
touch /tmp/content/foo
touch /tmp/content/bar
npx -y supergateway --port 8000 --stdio 'npx -y @modelcontextprotocol/server-filesystem /tmp/content'
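Once the server is running, it could be referenced from an agent in the same way as a remote server. The snippet below is a sketch: it assumes supergateway serves its SSE endpoint at /sse on the port given above, and the label "filesystem" is an arbitrary name chosen for illustration.

```python
# Assumption: the SSE endpoint is exposed at http://localhost:8000/sse.
tools = [
    {
        "type": "mcp",
        "server_url": "http://localhost:8000/sse",
        "server_label": "filesystem",  # hypothetical label
    }
]
```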

Adding Custom (Client-side) Tools

To use tools other than the built-in ones, implement a Python function with a docstring. The docstring content is used to describe the tool and its parameters, and is passed along to the generative model.

# Example tool definition
def my_tool(input: int) -> int:
    """
    Runs my awesome tool.

    :param input: some int parameter
    """
    return input * 2

Documentation Best Practices

Python docstrings are used to describe the tool and its parameters. Document both carefully so that the model can use the tool correctly. It is worth experimenting with different docstrings to see how they affect the model's behavior.
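To see why the docstring matters, it helps to look at what can be extracted from it. The sketch below uses only the standard library to turn a function with :param-style documentation into a description the model could consume. The helper describe_tool and its output shape are assumptions for illustration; the client library's own parser may behave differently.

```python
import inspect
import re


def describe_tool(fn) -> dict:
    """Extract a description and per-parameter help from a :param docstring."""
    doc = inspect.getdoc(fn) or ""
    # Collect ":param name: text" lines into a {name: text} mapping.
    params = dict(re.findall(r"^:param (\w+):\s*(.+)$", doc, flags=re.MULTILINE))
    # Everything before the first ":param" is the tool description.
    description = doc.split(":param", 1)[0].strip()
    signature = inspect.signature(fn)
    return {
        "name": fn.__name__,
        "description": description,
        "parameters": {
            name: {
                "type": getattr(p.annotation, "__name__", str(p.annotation)),
                "description": params.get(name, ""),
            }
            for name, p in signature.parameters.items()
        },
    }


def my_tool(input: int) -> int:
    """
    Runs my awesome tool.

    :param input: some int parameter
    """
    return input * 2


spec = describe_tool(my_tool)
```

A parameter that is missing from the docstring ends up with an empty description here, which is exactly the kind of gap that leaves the model guessing.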

Once the tool is defined, pass it to the agent config. The agent takes care of the rest: calling the model with the tool definition, executing the tool, and returning the result to the model for the next iteration.

# Example agent config with client provided tools
agent = Agent(client, ..., tools=[my_tool])

Refer to llama-stack-apps for an example of how to use client provided tools.

Complete Examples

Web Search Agent

  1. Start by registering a Tavily API key at Tavily.
  2. [Optional] Set the API key in your environment before starting the Llama Stack server:
    export TAVILY_SEARCH_API_KEY="your key"

WolframAlpha Math Agent

  1. Start by registering for a WolframAlpha API key at the WolframAlpha Developer Portal.
  2. Provide the API key either by setting it in your environment before starting the Llama Stack server:
    export WOLFRAM_ALPHA_API_KEY="your key"
    or from the client side:
    client = LlamaStackClient(
        base_url="http://localhost:8321",
        provider_data={"wolfram_alpha_api_key": wolfram_api_key},
    )

Best Practices

Tool Selection

  • Use server-side tools for production applications requiring reliability and security
  • Use client-side tools for development, prototyping, or specialized integrations
  • Combine multiple tool types for comprehensive functionality

Documentation

  • Write clear, detailed docstrings for custom tools
  • Include parameter descriptions and expected return types
  • Test tool descriptions with the model to ensure proper usage

Security

  • Store API keys securely using environment variables or secure configuration
  • Use the X-LlamaStack-Provider-Data header for dynamic authentication
  • Validate tool inputs and outputs for security

Error Handling

  • Implement proper error handling in custom tools
  • Use structured error responses with meaningful messages
  • Monitor tool performance and reliability
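As an illustration of the validation and error-handling points above, here is a minimal sketch of a custom tool that checks its inputs and reports failures in a structured form. The tool divide and the {"ok", "error", "message"} response shape are hypothetical choices, not a format required by Llama Stack.

```python
import json


def divide(numerator: float, denominator: float) -> str:
    """
    Divides two numbers and returns the result as JSON.

    :param numerator: the number to divide
    :param denominator: the number to divide by; must be non-zero
    """
    try:
        # Validate inputs before doing any work.
        if not all(isinstance(v, (int, float)) for v in (numerator, denominator)):
            raise TypeError("numerator and denominator must be numbers")
        return json.dumps({"ok": True, "result": numerator / denominator})
    except (TypeError, ZeroDivisionError) as exc:
        # Return a structured error so the model can read and react to it.
        return json.dumps(
            {"ok": False, "error": type(exc).__name__, "message": str(exc)}
        )


good = json.loads(divide(6, 3))
bad = json.loads(divide(1, 0))
```

Returning the error as data, rather than letting the exception propagate, gives the model a message it can act on in the next iteration.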