Inference

Overview

This section contains documentation for all available providers for the inference API.

Providers

  • inline::meta-reference
  • inline::sentence-transformers
  • remote::anthropic
  • remote::bedrock
  • remote::cerebras
  • remote::databricks
  • remote::fireworks
  • remote::gemini
  • remote::groq
  • remote::hf::endpoint
  • remote::hf::serverless
  • remote::llama-openai-compat
  • remote::nvidia
  • remote::ollama
  • remote::openai
  • remote::passthrough
  • remote::runpod
  • remote::sambanova
  • remote::tgi
  • remote::together
  • remote::vllm
  • remote::watsonx
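
A provider from the list above is selected in a distribution's run configuration. As a minimal, hedged sketch (field names follow the common Llama Stack `run.yaml` layout; the provider ID, URL, and port are illustrative placeholders, not a definitive configuration):

```yaml
# Hypothetical run.yaml fragment: register remote::ollama as the inference provider.
providers:
  inference:
  - provider_id: ollama            # arbitrary name for this provider instance
    provider_type: remote::ollama  # one of the provider types listed above
    config:
      url: http://localhost:11434  # assumed local Ollama endpoint
```

Swapping to a different backend (for example `remote::vllm` or `inline::meta-reference`) is a matter of changing `provider_type` and supplying that provider's own `config` fields, which are documented on the per-provider pages linked above.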

© Copyright 2025, Meta.
