
Llama Stack Playground

Experimental Feature

The Llama Stack Playground is currently experimental and subject to change. We welcome feedback and contributions to help improve it.

The Llama Stack Playground is a simple interface that aims to:

  • Showcase capabilities and concepts of Llama Stack in an interactive environment
  • Demo end-to-end application code to help users get started building their own applications
  • Provide a UI to help users inspect and understand Llama Stack API providers and resources

Key Features

Interactive Playground Pages

The playground provides interactive pages for users to explore Llama Stack API capabilities:

Chatbot Interface

Simple Chat Interface

  • Chat directly with Llama models through an intuitive interface
  • Uses the /inference/chat-completion streaming API under the hood (a minimal client sketch follows this list)
  • Real-time message streaming for responsive interactions
  • Perfect for testing model capabilities and prompt engineering
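
For reference, the chat page is a thin wrapper around a streaming chat completion call. Below is a minimal sketch using the llama-stack-client Python SDK; the base URL, port, and model ID are assumptions for a local deployment, and the exact shape of streaming chunks can vary between client versions.

# Minimal sketch: stream a chat completion from a local Llama Stack server.
# Port 8321 and the model ID are assumptions -- adjust to your deployment.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

stream = client.inference.chat_completion(
    model_id="meta-llama/Llama-3.1-8B-Instruct",  # assumed model ID
    messages=[{"role": "user", "content": "Write a haiku about inference."}],
    stream=True,
)

# Each chunk carries an event with an incremental text delta
# (field names may differ slightly across client versions).
for chunk in stream:
    print(chunk.event.delta.text, end="", flush=True)
print()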

Evaluation Interface

Custom Dataset Evaluation

  • Upload your own evaluation datasets
  • Run evaluations using available scoring functions
  • Uses Llama Stack's /scoring API for flexible evaluation workflows (see the sketch after this list)
  • Great for testing application performance on custom metrics
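
The evaluation page drives the same scoring calls you can make programmatically. Here is a rough sketch with the Python client, assuming the default port and that the basic::subset_of scoring function is registered in your distribution (check with client.scoring_functions.list()).

# Rough sketch: score a hand-made row with the /scoring API.
# The scoring function ID and result field names are assumptions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

rows = [
    {
        "input_query": "What is the capital of France?",
        "generated_answer": "The capital of France is Paris.",
        "expected_answer": "Paris",
    }
]

response = client.scoring.score(
    input_rows=rows,
    scoring_functions={"basic::subset_of": None},  # assumed scoring function ID
)

for fn_id, result in response.results.items():
    print(fn_id, result.score_rows)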

Inspection Interface

Provider Management

  • Inspect available Llama Stack API providers
  • View provider configurations and capabilities
  • Uses the /providers API for real-time provider information, as sketched below
  • Essential for understanding your deployment's capabilities
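
The inspection page surfaces the same information you can fetch yourself. A small sketch, assuming the default local port; attribute names on the returned provider objects may differ slightly between client versions.

# Sketch: list the providers behind a running Llama Stack server.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

for provider in client.providers.list():
    # Typical fields: the API the provider serves, its ID, and its type.
    print(provider.api, provider.provider_id, provider.provider_type)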

Getting Started

Quick Start Guide

1. Start the Llama Stack API Server

# Build and run a distribution (example: together)
llama stack build --distro together --image-type venv
llama stack run together
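
Optionally, confirm the server is reachable before starting the UI. A quick sanity check with the Python client, assuming the server listens on the default port 8321:

# Optional sanity check: confirm the stack is up and has models registered.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")
for model in client.models.list():
    print(model.identifier)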

2. Start the Streamlit UI

# Launch the playground interface
uv run --with ".[ui]" streamlit run llama_stack/core/ui/app.py

Available Distributions

The playground works with any Llama Stack distribution. The together distribution is a popular starting point:

llama stack build --distro together --image-type venv
llama stack run together

Features:

  • Cloud-hosted models
  • Fast inference
  • Multiple model options

Use Cases & Examples

Educational Use Cases

  • Learning Llama Stack: Hands-on exploration of API capabilities
  • Prompt Engineering: Interactive testing of different prompting strategies
  • RAG Experimentation: Understanding how document retrieval affects responses
  • Evaluation Understanding: See how different metrics evaluate model performance

Development Use Cases

  • Prototype Testing: Quick validation of application concepts
  • API Exploration: Understanding available endpoints and parameters
  • Integration Planning: Seeing how different components work together
  • Demo Creation: Showcasing Llama Stack capabilities to stakeholders

Research Use Cases

  • Model Comparison: Side-by-side testing of different models
  • Evaluation Design: Understanding how scoring functions work
  • Safety Testing: Exploring shield effectiveness with different inputs
  • Performance Analysis: Measuring model behavior across different scenarios

Best Practices

🚀 Getting Started

  • Begin with simple chat interactions to understand basic functionality
  • Gradually explore more advanced features like RAG and evaluations
  • Use the inspection tools to understand your deployment's capabilities

🔧 Development Workflow

  • Use the playground to prototype before writing application code
  • Test different parameter settings interactively
  • Validate evaluation approaches before implementing them programmatically

📊 Evaluation & Testing

  • Start with simple scoring functions before trying complex evaluations
  • Use the playground to understand evaluation results before automation
  • Test safety features with various input types

🎯 Production Preparation

  • Use playground insights to inform your production API usage
  • Test edge cases and error conditions interactively
  • Validate resource configurations before deployment