Google Interactions API Compatibility

Llama Stack provides a compatibility layer for the Google Interactions API (v1alpha), so teams using the Google GenAI SDK can point at a Llama Stack server with minimal code changes.

```python
from google import genai

client = genai.Client(
    http_options={"api_version": "v1alpha"},
    vertexai=False,
    api_key="fake",  # placeholder; requests go to Llama Stack, not Google
)
# Override the base URL (a private attribute of the SDK's HTTP client)
# so requests are sent to the Llama Stack server
client._api_client._url = "http://localhost:8321"

response = client.models.generate_interaction(
    model="llama-3.3-70b",
    input="Hello",
)
print(response.outputs[0].text)
```

Implemented endpoints

Endpoint                           Method  Status
/v1alpha/interactions              POST    Implemented
/v1alpha/interactions/{id}         GET     Not yet
/v1alpha/interactions/{id}         DELETE  Not yet
/v1alpha/interactions/{id}/cancel  POST    Not yet

For property-level coverage details, see the conformance report.
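The implemented endpoint can also be exercised without the SDK. The sketch below builds a raw POST request against /v1alpha/interactions; the body fields mirror the SDK example above, and the exact wire format is an assumption, not a verified schema. The send step is omitted so the sketch runs without a server.

```python
import json
import urllib.request

# Build a raw request against the one implemented endpoint, assuming a
# Llama Stack server on localhost:8321 as in the SDK example above.
payload = {"model": "llama-3.3-70b", "input": "Hello"}
req = urllib.request.Request(
    "http://localhost:8321/v1alpha/interactions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would actually send it; omitted here so
# the sketch stays runnable offline.
print(req.get_method(), req.full_url)
```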

How it works

The adapter translates Google Interactions requests into OpenAI Chat Completion calls through Llama Stack's inference API. This means any inference provider that Llama Stack supports (vLLM, Ollama, OpenAI, Bedrock, etc.) can serve the Google Interactions API.
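To make the translation concrete, here is an illustrative sketch of the request mapping, not the adapter's actual code: field names on the Interactions side (system_instruction, generation_config, the shape of multi-turn input) are simplified assumptions, and the real adapter handles many more cases.

```python
def to_chat_completion(request: dict) -> dict:
    """Sketch: map an Interactions-style request to chat-completion params."""
    messages = []
    if request.get("system_instruction"):
        # System instructions map to the system role
        messages.append({"role": "system", "content": request["system_instruction"]})
    inp = request["input"]
    if isinstance(inp, str):
        messages.append({"role": "user", "content": inp})
    else:
        # Multi-turn input, assumed here as a list of {"role", "text"} turns
        for turn in inp:
            messages.append({"role": turn["role"], "content": turn["text"]})
    params = {"model": request["model"], "messages": messages}
    # Generation config maps almost one-to-one; top_k has no standard
    # chat-completion counterpart and is omitted from this sketch.
    cfg = request.get("generation_config", {})
    for src, dst in [("temperature", "temperature"),
                     ("top_p", "top_p"),
                     ("max_output_tokens", "max_tokens")]:
        if src in cfg:
            params[dst] = cfg[src]
    return params

print(to_chat_completion({"model": "llama-3.3-70b", "input": "Hello"}))
```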

Supported features:

  • Text generation with string or multi-turn conversation input
  • Streaming via Server-Sent Events matching Google's event format
  • System instructions mapped to the system role
  • Generation config parameters (temperature, top_p, top_k, max_output_tokens)
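Since streaming is delivered over Server-Sent Events, a client consuming the stream manually needs only a small line parser. The sketch below shows the SSE mechanics; the sample payloads (a "delta" field) are invented for illustration and do not reproduce Google's actual event format.

```python
import json

def parse_sse(stream_lines):
    """Yield decoded JSON payloads from the 'data: ...' lines of an SSE stream."""
    for line in stream_lines:
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

# Hypothetical event payloads; real events follow Google's event format.
sample = [
    'data: {"delta": "Hel"}',
    "",
    'data: {"delta": "lo"}',
]
print("".join(chunk["delta"] for chunk in parse_sse(sample)))
```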

Known limitations

  • Only text content is supported; multimodal inputs (images, audio, video) are not yet implemented
  • Tool declarations (Function, GoogleSearch, CodeExecution, MCP) are not yet supported
  • Background execution and interaction storage (store, background) are not available
  • The GET, DELETE, and Cancel endpoints are not yet implemented
  • Response modalities are accepted for compatibility but ignored