Inference Providers

This section contains documentation for all available providers for the inference API. Providers prefixed with inline:: run in the same process as the Llama Stack server, while remote:: providers forward requests to an external service.

  • inline::meta-reference

  • inline::sentence-transformers

  • inline::vllm

  • remote::anthropic

  • remote::bedrock

  • remote::cerebras

  • remote::cerebras-openai-compat

  • remote::databricks

  • remote::fireworks

  • remote::fireworks-openai-compat

  • remote::gemini

  • remote::groq

  • remote::groq-openai-compat

  • remote::hf::endpoint

  • remote::hf::serverless

  • remote::llama-openai-compat

  • remote::nvidia

  • remote::ollama

  • remote::openai

  • remote::passthrough

  • remote::runpod

  • remote::sambanova

  • remote::sambanova-openai-compat

  • remote::tgi

  • remote::together

  • remote::together-openai-compat

  • remote::vllm

  • remote::watsonx
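One of these providers is selected in the stack's run configuration. As an illustrative sketch (the exact keys and the Ollama URL below are assumptions based on typical Llama Stack run.yaml files, not taken from this page), a remote::ollama inference provider might be configured like this:

```yaml
# Fragment of a run.yaml: register an inference provider.
# provider_type must match one of the identifiers listed above.
providers:
  inference:
    - provider_id: ollama          # arbitrary local name for this provider instance
      provider_type: remote::ollama
      config:
        url: http://localhost:11434   # assumed default Ollama endpoint
```

Swapping in a different provider is then a matter of changing provider_type and supplying that provider's own config block (for example, an API key for a hosted service).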

© Copyright 2025, Meta.