llama (server-side) CLI Reference
The llama CLI tool helps you set up and use the Llama Stack. The CLI is available on your PATH after installing the llama-stack package.
Installation
You have two ways to install Llama Stack:
- Install as a package: You can install the package directly from PyPI by running the following command:
pip install llama-stack
- Install from source: If you prefer to install from the source code, follow these steps:
mkdir -p ~/local
cd ~/local
git clone git@github.com:meta-llama/llama-stack.git
uv venv myenv --python 3.12
source myenv/bin/activate # On Windows: myenv\Scripts\activate
cd llama-stack
uv pip install -e .  # the venv created by uv does not include pip by default
llama subcommands
stack: Allows you to build a stack using the llama stack distribution and run a Llama Stack server. You can read more about how to build a Llama Stack distribution in the Build your own Distribution documentation.
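For example, a typical workflow is to build a distribution and then start the server. The template name and flags below are illustrative and vary between releases, so check llama stack build --help for the options available in your installed version:
# Build a distribution (template name and flags are illustrative)
llama stack build --template starter --image-type venv
# Start the Llama Stack server for the distribution you just built
llama stack run starter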
For downloading models, we recommend using the Hugging Face CLI. See Downloading models for more information.
Sample Usage
llama --help
usage: llama [-h] {stack} ...
Welcome to the Llama CLI
options:
-h, --help show this help message and exit
subcommands:
{stack}
stack Operations for the Llama Stack / Distributions
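Each subcommand accepts --help as well. For example, to see the operations available under stack:
llama stack --help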
Downloading models
You first need to have models downloaded locally. We recommend using the Hugging Face CLI to download models.
First, install the Hugging Face CLI:
pip install "huggingface_hub[cli]"
Then authenticate and download models:
# Authenticate with Hugging Face
huggingface-cli login
# Download a model
huggingface-cli download meta-llama/Llama-3.2-3B-Instruct --local-dir ~/.llama/Llama-3.2-3B-Instruct
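Note that --local-dir places the files at the path you specify. If you omit it, the model is stored in your Hugging Face cache (by default ~/.cache/huggingface/hub), which is what the listing command in the next section inspects:
# Download into the Hugging Face cache instead of a custom directory
huggingface-cli download meta-llama/Llama-3.2-3B-Instruct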
List the downloaded models
To list the models downloaded to your Hugging Face cache, you can use the Hugging Face CLI:
# List all downloaded models in your local cache
huggingface-cli scan-cache