remote::pgvector

Description

PGVector is a remote vector database provider for Llama Stack. It lets you store and query vectors in a PostgreSQL database using the pgvector extension, with indexed similarity search for efficient retrieval.

Features

  • Easy to use
  • Fully integrated with Llama Stack

There are three implementations of search available for PGVectorIndex:

  1. Vector Search
  • How it works:

    • Uses PostgreSQL's vector extension (pgvector) to perform similarity search
    • Compares query embeddings against stored embeddings using cosine distance or another configured distance metric
    • E.g. SQL query (<=> is pgvector's cosine distance operator): SELECT document, embedding <=> %s::vector AS distance FROM table ORDER BY distance
  • Characteristics:

    • Semantic understanding - finds documents similar in meaning even if they don't share keywords
    • Works with high-dimensional vector embeddings (typically 768, 1024, or higher dimensions)
    • Best for: Finding conceptually related content, handling synonyms, cross-language search
    • By default, Llama Stack creates an HNSW (Hierarchical Navigable Small World) index on the "embedding" column of the vector store table, which enables performant, scalable vector search on large datasets out of the box.
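
To make the distance-based ordering concrete, here is a minimal pure-Python sketch of what the SQL query above computes. It assumes cosine distance is defined as 1 minus cosine similarity (which is how pgvector's <=> operator behaves) and does a brute-force scan rather than using an HNSW index. The function names and sample rows are hypothetical and not part of Llama Stack.

```python
import math

def cosine_distance(a, b):
    """Cosine distance: 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def rank_by_distance(query, rows, k=3):
    """Brute-force analogue of ORDER BY distance LIMIT k over (document, embedding) rows."""
    scored = [(doc, cosine_distance(query, emb)) for doc, emb in rows]
    return sorted(scored, key=lambda pair: pair[1])[:k]

# Hypothetical two-dimensional embeddings, for illustration only.
rows = [
    ("cats", [1.0, 0.0]),
    ("dogs", [0.0, 1.0]),
    ("kittens", [0.9, 0.1]),
]
results = rank_by_distance([1.0, 0.0], rows, k=2)
```

Smaller distances mean closer matches, so the results are sorted in ascending order. An ANN index such as HNSW trades a small amount of recall for avoiding this full scan.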
  2. Keyword Search
  • How it works:

    • Uses PostgreSQL's full-text search capabilities with tsvector and ts_rank
    • Converts text to searchable tokens using to_tsvector('english', text). Default language is English.
    • E.g. SQL query: SELECT document, ts_rank(tokenized_content, plainto_tsquery('english', %s)) AS score
  • Characteristics:

    • Lexical matching - finds exact keyword matches and variations
    • Uses GIN (Generalized Inverted Index) for fast text search performance
    • Scoring: Uses PostgreSQL's ts_rank function for relevance scoring
    • Best for: Exact term matching, proper names, technical terms, Boolean-style queries
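
The idea behind an inverted index such as GIN can be illustrated with a toy sketch. This is not PostgreSQL's implementation: it assumes simple lowercase alphanumeric tokenization with no stemming or stop-word removal (both of which to_tsvector('english', ...) applies), and it only models the "all terms must match" behavior of plainto_tsquery, not ts_rank scoring. All names and sample documents are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase and split into alphanumeric tokens (no stemming, unlike to_tsvector)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search_all_terms(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

# Hypothetical documents, for illustration only.
docs = {
    "a": "PostgreSQL full-text search",
    "b": "Vector search with pgvector",
    "c": "Full text indexing in PostgreSQL",
}
index = build_inverted_index(docs)
matches = search_all_terms(index, "postgresql full")
```

Because the index maps tokens to documents, a lookup touches only the posting sets for the query terms instead of scanning every row.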
  3. Hybrid Search
  • How it works:

    • Combines both vector and keyword search results
    • Runs both searches independently, then merges results using configurable reranking
  • Two reranking strategies available:

    • Reciprocal Rank Fusion (RRF) - (default impact factor: 60.0)
    • Weighted Average - (default weight alpha: 0.5)
  • Characteristics:

    • Best of both worlds: semantic understanding + exact matching
    • Documents appearing in both searches get boosted scores
    • Configurable balance between semantic and lexical matching
    • Best for: General-purpose search where you want both precision and recall
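
The two merging strategies can be sketched as follows. This is an illustration under stated assumptions, not Llama Stack's actual reranker code: the RRF sketch uses the standard formula 1 / (impact_factor + rank) summed over both result lists, and the weighted sketch assumes both score sets are already normalized to [0, 1] and that alpha weights the vector score. Which side alpha applies to in Llama Stack is not stated above. Function names and sample data are hypothetical.

```python
def rrf_merge(vector_ranked, keyword_ranked, impact_factor=60.0):
    """Reciprocal Rank Fusion: sum 1 / (impact_factor + rank) across both rankings."""
    scores = {}
    for ranked in (vector_ranked, keyword_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (impact_factor + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def weighted_merge(vector_scores, keyword_scores, alpha=0.5):
    """Weighted average of normalized scores (assumption: alpha weights the vector side)."""
    doc_ids = set(vector_scores) | set(keyword_scores)
    combined = {
        doc_id: alpha * vector_scores.get(doc_id, 0.0)
        + (1.0 - alpha) * keyword_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)

# Hypothetical result lists: "b" appears in both searches.
rrf_results = rrf_merge(["a", "b", "c"], ["b", "d"])
weighted_results = weighted_merge({"a": 0.9, "b": 0.8}, {"b": 1.0, "d": 0.4})
```

In both cases document "b" ranks first because it appears in both result sets, which is the boosting effect described in the characteristics above.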
  4. Database Schema

The PGVector implementation stores data optimized for all three search types:

CREATE TABLE vector_store_xxx (
    id TEXT PRIMARY KEY,
    document JSONB,               -- Original document
    embedding vector(dimension),  -- For vector search
    content_text TEXT,            -- Raw text content
    tokenized_content TSVECTOR    -- For keyword search
);

Usage

To use PGVector in your Llama Stack project, follow these steps:

  1. Install the necessary dependencies.
  2. Configure your Llama Stack project to use the pgvector provider (remote::pgvector).
  3. Start storing and querying vectors.

The following example shows how to set up an environment for PGVector (you can use either Podman or Docker):

  1. Export PGVector environment variables:
export PGVECTOR_DB=testvectordb
export PGVECTOR_HOST=localhost
export PGVECTOR_PORT=5432
export PGVECTOR_USER=user
export PGVECTOR_PASSWORD=password
  2. Pull the pgvector image with the tag you want:

Via Podman:

podman pull pgvector/pgvector:0.8.1-pg18-trixie

Via Docker:

docker pull pgvector/pgvector:0.8.1-pg18-trixie
  3. Run a container with PGVector:

Via Podman:

podman run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_USER=user \
  -e POSTGRES_DB=testvectordb \
  -p 5432:5432 \
  -v pgvector_data:/var/lib/postgresql \
  pgvector/pgvector:0.8.1-pg18-trixie

Via Docker:

docker run -d \
  --name pgvector \
  -e POSTGRES_PASSWORD=password \
  -e POSTGRES_USER=user \
  -e POSTGRES_DB=testvectordb \
  -p 5432:5432 \
  -v pgvector_data:/var/lib/postgresql \
  pgvector/pgvector:0.8.1-pg18-trixie
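
Once the container is running, the exported PGVECTOR_* variables contain everything a client needs to reach it. As a rough sketch, the helper below assembles a standard PostgreSQL connection URI from those variables. The function name pgvector_dsn is hypothetical, and the fallbacks for host and port simply mirror the values exported above. It only builds the string and does not open a connection.

```python
import os
from urllib.parse import quote

def pgvector_dsn(env=os.environ):
    """Build a postgresql:// URI from the PGVECTOR_* environment variables."""
    host = env.get("PGVECTOR_HOST", "localhost")
    port = env.get("PGVECTOR_PORT", "5432")
    # These three have no fallback here; a missing one raises KeyError.
    db = env["PGVECTOR_DB"]
    user = env["PGVECTOR_USER"]
    password = env["PGVECTOR_PASSWORD"]
    # Percent-encode credentials so characters such as "@" or ":" stay unambiguous.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(db, safe='')}"
    )

example_env = {
    "PGVECTOR_DB": "testvectordb",
    "PGVECTOR_USER": "user",
    "PGVECTOR_PASSWORD": "password",
}
dsn = pgvector_dsn(example_env)
```

A URI in this form can be passed to a PostgreSQL client library of your choice to check that the database is reachable before starting Llama Stack.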

Documentation

See PGVector's documentation for more details about PGVector in general.

Configuration

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| host | str \| None | No | localhost | |
| port | int \| None | No | 5432 | |
| db | str \| None | No | postgres | |
| user | str \| None | No | postgres | |
| password | str \| None | No | mysecretpassword | |
| distance_metric | Literal[COSINE, L2, L1, INNER_PRODUCT] \| None | No | COSINE | PGVector distance metric used for vector search in PGVectorIndex |
| vector_index | PGVectorHNSWVectorIndex \| PGVectorIVFFlatVectorIndex \| None | No | type=<PGVectorIndexType.HNSW: 'HNSW'> m=16 ef_construction=64 ef_search=40 | PGVector vector index used for Approximate Nearest Neighbor (ANN) search |
| persistence | KVStoreReference \| None | No | | Config for KV store backend (SQLite only for now) |
| persistence.namespace | str | No | | Key prefix for KVStore backends |
| persistence.backend | str | No | | Name of backend from storage.backends |

Sample Configuration

host: ${env.PGVECTOR_HOST:=localhost}
port: ${env.PGVECTOR_PORT:=5432}
db: ${env.PGVECTOR_DB}
user: ${env.PGVECTOR_USER}
password: ${env.PGVECTOR_PASSWORD}
distance_metric: COSINE
vector_index:
  type: HNSW
  m: 16
  ef_construction: 64
  ef_search: 40
persistence:
  namespace: vector_io::pgvector
  backend: kv_default
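
The sample configuration uses two placeholder forms: ${env.NAME:=default}, which falls back to a default, and ${env.NAME}, which has none. The sketch below shows one way such placeholders could be resolved. It is a hypothetical illustration of the syntax, not Llama Stack's own resolver, and raising an error for an unset variable without a default is an assumption.

```python
import re

# Matches ${env.NAME} and ${env.NAME:=default}.
_ENV_PATTERN = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}")

def resolve_env(value, env):
    """Replace ${env.NAME} / ${env.NAME:=default} placeholders using the env mapping."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumption: an unset variable with no default is treated as an error.
        raise KeyError(f"environment variable {name} is not set and has no default")
    return _ENV_PATTERN.sub(replace, value)

host = resolve_env("${env.PGVECTOR_HOST:=localhost}", {})
db = resolve_env("${env.PGVECTOR_DB}", {"PGVECTOR_DB": "testvectordb"})
```

With this reading, host and port can be omitted from the environment, while db, user, and password must be exported before the configuration is loaded.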