## Telemetry
The preferred way to instrument Llama Stack is with OpenTelemetry. Llama Stack enriches the data collected by OpenTelemetry to capture helpful information about the performance and behavior of your application. Here is an example of how to forward your telemetry to an OTLP collector from Llama Stack:
```shell
# Point the OTLP exporter at your collector (HTTP/protobuf, default port 4318).
export OTEL_EXPORTER_OTLP_ENDPOINT="http://127.0.0.1:4318"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf
export OTEL_SERVICE_NAME="llama-stack-server"

# Install the OpenTelemetry distro and the instrumentation packages
# that match your installed libraries.
uv pip install opentelemetry-distro opentelemetry-exporter-otlp
uv run opentelemetry-bootstrap -a requirements | uv pip install --requirement -

# Launch the server under auto-instrumentation.
uv run opentelemetry-instrument llama stack run starter
```
## Known issues
When OpenTelemetry auto-instrumentation is enabled, both the low-level database driver instrumentor
(e.g. asyncpg, sqlite3) and the SQLAlchemy ORM instrumentor activate simultaneously. This causes
every database operation to be traced twice: once at the ORM level and once at the raw protocol level.
The driver-level spans also expose internal pool mechanics (such as connection health-check queries) that
inflate traces with noise. To prevent this, disable the driver-level instrumentors and rely on the
SQLAlchemy instrumentation alone:
```shell
export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS="sqlite3,asyncpg"
```
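The variable is a comma-separated list of instrumentor names to skip. A simplified sketch of the filtering behavior (this is an illustration, not the actual `opentelemetry-instrument` source; the function name is made up for this example):

```python
import os

def enabled_instrumentors(available, env=None):
    """Sketch: drop any instrumentor whose name appears in the
    comma-separated OTEL_PYTHON_DISABLED_INSTRUMENTATIONS list."""
    env = os.environ if env is None else env
    disabled = {
        name.strip()
        for name in env.get("OTEL_PYTHON_DISABLED_INSTRUMENTATIONS", "").split(",")
        if name.strip()
    }
    return [name for name in available if name not in disabled]

# With the export above, the driver-level instrumentors drop out
# while the SQLAlchemy one stays active:
print(enabled_instrumentors(
    ["sqlalchemy", "sqlite3", "asyncpg", "httpx"],
    env={"OTEL_PYTHON_DISABLED_INSTRUMENTATIONS": "sqlite3,asyncpg"},
))
# → ['sqlalchemy', 'httpx']
```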
The container image sets this automatically when any OTEL_* environment variable is present.
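That automatic behavior can be sketched as entrypoint logic along these lines (a hypothetical snippet for illustration; the actual image script may differ):

```shell
# Simulate a fresh container where the user has set one OTEL_* variable.
unset OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
export OTEL_SERVICE_NAME="llama-stack-server"

# Hypothetical entrypoint logic: if any OTEL_* variable is present,
# default the disabled-instrumentations list (without overriding an
# explicit user-provided value).
if env | grep -q '^OTEL_'; then
  : "${OTEL_PYTHON_DISABLED_INSTRUMENTATIONS:=sqlite3,asyncpg}"
  export OTEL_PYTHON_DISABLED_INSTRUMENTATIONS
fi

echo "$OTEL_PYTHON_DISABLED_INSTRUMENTATIONS"
```

Because the default is applied with `:=`, setting the variable yourself before starting the container still takes precedence.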
## Related Resources
- OpenTelemetry Documentation - Comprehensive observability framework
- Jaeger Documentation - Distributed tracing visualization