Telemetry

The preferred way to instrument Llama Stack is with OpenTelemetry. Llama Stack enriches the data collected by OpenTelemetry to capture helpful information about the performance and behavior of your application. Here is an example of how to forward your telemetry to an OTLP collector from Llama Stack:

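# Send telemetry over OTLP/HTTP to a collector listening on the local machine
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_SERVICE_NAME="llama-stack-server"

# Install the OpenTelemetry distro and OTLP exporter, then the instrumentation
# packages that match the libraries already installed in the environment
uv pip install opentelemetry-distro opentelemetry-exporter-otlp
uv run opentelemetry-bootstrap -a requirements | uv pip install --requirement -

# Start the Llama Stack server under OpenTelemetry auto-instrumentation
uv run opentelemetry-instrument llama stack run starter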
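
With the http/protobuf protocol, the exporter does not post to the base endpoint itself: the OpenTelemetry specification appends a per-signal path (v1/traces for traces) to OTEL_EXPORTER_OTLP_ENDPOINT. The sketch below derives that URL so it can be checked against the collector configuration. The helper name otlp_traces_url is hypothetical and is not part of Llama Stack or OpenTelemetry.

```shell
# Hypothetical helper (not part of Llama Stack or OpenTelemetry).
# Prints the URL an OTLP/HTTP exporter would post traces to, given a base
# endpoint. Falls back to OTEL_EXPORTER_OTLP_ENDPOINT, then to the
# specification default of http://localhost:4318.
otlp_traces_url() {
  base="${1:-${OTEL_EXPORTER_OTLP_ENDPOINT:-http://localhost:4318}}"
  # Drop one trailing slash, if present, before appending the signal path.
  printf '%s/v1/traces\n' "${base%/}"
}

otlp_traces_url "http://127.0.0.1:4318"
# prints http://127.0.0.1:4318/v1/traces
```

The resulting URL can be handed to a tool such as curl to confirm that something is listening on the collector port before the server is started.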

Known issues

When OpenTelemetry auto-instrumentation is enabled, both the low-level database driver instrumentor (e.g. asyncpg, sqlite3) and the SQLAlchemy ORM instrumentor activate simultaneously. This causes every database operation to be traced twice -- once at the ORM level and once at the raw protocol level. The driver-level spans expose internal pool mechanics (such as connection health-check queries) that inflate traces with noise. To prevent this, disable the driver-level instrumentors and rely on the SQLAlchemy instrumentation alone:

export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="sqlite3,asyncpg"
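
The export above overwrites any value already present in OTEL_PYTHON_DISABLED_INSTRUMENTATIONS. If other instrumentors are already disabled in your environment, a merge may be preferable. The following is a minimal sketch of one way to do that in POSIX shell. The function name disable_instrumentations is hypothetical, and the sketch assumes the existing value is a comma-separated list without spaces.

```shell
# Hypothetical helper: add instrumentor names to
# OTEL_PYTHON_DISABLED_INSTRUMENTATIONS while keeping existing entries
# and skipping names that are already listed.
disable_instrumentations() {
  current="${OTEL_PYTHON_DISABLED_INSTRUMENTATIONS:-}"
  for name in "$@"; do
    case ",${current}," in
      *",${name},"*) ;;  # already listed, nothing to do
      *) current="${current:+${current},}${name}" ;;
    esac
  done
  export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="${current}"
}

disable_instrumentations sqlite3 asyncpg
echo "${OTEL_PYTHON_DISABLED_INSTRUMENTATIONS}"
```

Run this before opentelemetry-instrument so that the merged list is in the environment when auto-instrumentation starts.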

Note: The container image sets this variable automatically when any OTEL_* environment variable is present.
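
Outside the container image, a launcher script could approximate the same behavior. The sketch below is an assumption about how such a check might look, not the image's actual entrypoint. The function name apply_otel_defaults is hypothetical, and leaving an existing value untouched is a design choice made here, not something the source confirms.

```shell
# Hypothetical launcher fragment, not the container image's real script.
# If any OTEL_* variable is set and the disabled-instrumentations list is
# empty, disable the driver-level instrumentors named in this section.
apply_otel_defaults() {
  if env | grep -q '^OTEL_' && [ -z "${OTEL_PYTHON_DISABLED_INSTRUMENTATIONS:-}" ]; then
    export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="sqlite3,asyncpg"
  fi
}

apply_otel_defaults
```

Because the function only fills in an empty value, an explicit setting made by the operator takes precedence over the default.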