Adding a New Vector Database
This guide will walk you through the process of adding a new vector database to Llama Stack.
NOTE: Here’s an example Pull Request of the Milvus Vector Database Provider.
Vector Database providers are used to store and retrieve vector embeddings. Vector databases are not limited to vector search; they can also support keyword and hybrid search. Additionally, vector databases can support operations like filtering, sorting, and aggregating vectors.
Steps to Add a New Vector Database Provider
Choose the Database Type: Determine if your vector database is a remote service, inline, or both.
Remote databases make requests to external services, while inline databases execute locally. Some providers support both.
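To make the distinction concrete, here is a minimal sketch of how the two kinds of provider might differ in configuration. The class names and fields (InlineExampleConfig, RemoteExampleConfig, db_path, url, token) are hypothetical and are not part of Llama Stack; they only illustrate that an inline provider points at local state while a remote provider points at an external service.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class InlineExampleConfig:
    """Hypothetical config for an inline provider: data lives on local disk."""

    db_path: str


@dataclass
class RemoteExampleConfig:
    """Hypothetical config for a remote provider: data lives behind a service."""

    url: str
    token: Optional[str] = None


def describe(config: Union[InlineExampleConfig, RemoteExampleConfig]) -> str:
    """Summarize where a provider with this config would run its queries."""
    if isinstance(config, InlineExampleConfig):
        return f"inline: local file {config.db_path}"
    return f"remote: service at {config.url}"
```

A database that supports both modes would typically ship one config class per mode and register each one separately.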
Implement the Provider: Create a new provider class that inherits from VectorDatabaseProvider and implements the required methods.
Implement methods for vector storage, retrieval, search, and any additional features your database supports.
You will need to implement the following methods for YourVectorIndex:
YourVectorIndex.create()
YourVectorIndex.initialize()
YourVectorIndex.add_chunks()
YourVectorIndex.delete_chunk()
YourVectorIndex.query_vector()
YourVectorIndex.query_keyword()
YourVectorIndex.query_hybrid()
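The index methods listed above can be sketched with a toy in-memory store. This is only an illustration of the shape of the class, not the Llama Stack interface: the Chunk type, every method signature, the cosine scoring, and the weighted-sum hybrid ranking are assumptions made for this example. A real provider would inherit from the project's base classes and delegate to its database client.

```python
import asyncio
import math
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """Hypothetical stand-in for a stored chunk of text."""

    chunk_id: str
    content: str


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class YourVectorIndex:
    """Toy in-memory index (assumed signatures, not the real interface)."""

    dimension: int
    rows: dict[str, tuple[Chunk, list[float]]] = field(default_factory=dict)

    @classmethod
    async def create(cls, dimension: int) -> "YourVectorIndex":
        index = cls(dimension=dimension)
        await index.initialize()
        return index

    async def initialize(self) -> None:
        # A real index would open a connection or create a collection here.
        self.rows.clear()

    async def add_chunks(self, chunks: list[Chunk], embeddings: list[list[float]]) -> None:
        if len(chunks) != len(embeddings):
            raise ValueError("chunks and embeddings must have the same length")
        for chunk, embedding in zip(chunks, embeddings):
            if len(embedding) != self.dimension:
                raise ValueError(f"expected dimension {self.dimension}")
            self.rows[chunk.chunk_id] = (chunk, embedding)

    async def delete_chunk(self, chunk_id: str) -> None:
        self.rows.pop(chunk_id, None)

    async def query_vector(self, embedding: list[float], k: int) -> list[tuple[Chunk, float]]:
        scored = [(chunk, cosine(embedding, emb)) for chunk, emb in self.rows.values()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

    async def query_keyword(self, query: str, k: int) -> list[tuple[Chunk, float]]:
        # Score = number of query-term occurrences; chunks with no match are dropped.
        terms = query.lower().split()
        scored = []
        for chunk, _ in self.rows.values():
            words = chunk.content.lower().split()
            score = sum(words.count(term) for term in terms)
            if score > 0:
                scored.append((chunk, float(score)))
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

    async def query_hybrid(
        self, embedding: list[float], query: str, k: int, alpha: float = 0.5
    ) -> list[tuple[Chunk, float]]:
        # Weighted sum of the vector score and the max-normalized keyword score.
        n = len(self.rows)
        vec = {c.chunk_id: s for c, s in await self.query_vector(embedding, n)}
        kw = {c.chunk_id: s for c, s in await self.query_keyword(query, n)}
        top_kw = max(kw.values(), default=1.0)
        scored = [
            (chunk, alpha * vec[cid] + (1 - alpha) * kw.get(cid, 0.0) / top_kw)
            for cid, (chunk, _) in self.rows.items()
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

In this sketch, alpha close to 1 favors vector similarity and alpha close to 0 favors keyword matches. Databases with native hybrid search would usually expose their own ranking instead.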
You will need to implement the following methods for YourVectorIOAdapter:
YourVectorIOAdapter.initialize()
YourVectorIOAdapter.shutdown()
YourVectorIOAdapter.list_vector_dbs()
YourVectorIOAdapter.register_vector_db()
YourVectorIOAdapter.unregister_vector_db()
YourVectorIOAdapter.insert_chunks()
YourVectorIOAdapter.query_chunks()
YourVectorIOAdapter.delete_chunks()
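The adapter lifecycle implied by the method list above might look roughly like the following. This is a self-contained sketch under stated assumptions: the argument names, the dict-based storage, and the dot-product ranking are invented for illustration, and the real adapter signatures in Llama Stack will differ (for example, they operate on the project's own chunk and vector DB types).

```python
import asyncio


class YourVectorIOAdapter:
    """Toy adapter: each vector DB is a dict of chunk_id -> (text, embedding)."""

    def __init__(self) -> None:
        self.dbs: dict[str, dict[str, tuple[str, list[float]]]] = {}
        self.ready = False

    async def initialize(self) -> None:
        # A real adapter would connect to the database and reload registered DBs.
        self.ready = True

    async def shutdown(self) -> None:
        # A real adapter would close client connections here.
        self.dbs.clear()
        self.ready = False

    async def list_vector_dbs(self) -> list[str]:
        return sorted(self.dbs)

    async def register_vector_db(self, vector_db_id: str) -> None:
        self.dbs.setdefault(vector_db_id, {})

    async def unregister_vector_db(self, vector_db_id: str) -> None:
        self.dbs.pop(vector_db_id, None)

    def _rows(self, vector_db_id: str) -> dict[str, tuple[str, list[float]]]:
        if vector_db_id not in self.dbs:
            raise ValueError(f"vector DB {vector_db_id!r} is not registered")
        return self.dbs[vector_db_id]

    async def insert_chunks(
        self, vector_db_id: str, chunks: dict[str, tuple[str, list[float]]]
    ) -> None:
        self._rows(vector_db_id).update(chunks)

    async def query_chunks(
        self, vector_db_id: str, embedding: list[float], k: int = 3
    ) -> list[str]:
        # Rank chunk ids by dot product with the query embedding (assumption).
        rows = self._rows(vector_db_id)

        def score(item: tuple[str, tuple[str, list[float]]]) -> float:
            _, (_, emb) = item
            return sum(x * y for x, y in zip(embedding, emb))

        ranked = sorted(rows.items(), key=score, reverse=True)
        return [chunk_id for chunk_id, _ in ranked[:k]]

    async def delete_chunks(self, vector_db_id: str, chunk_ids: list[str]) -> None:
        rows = self._rows(vector_db_id)
        for chunk_id in chunk_ids:
            rows.pop(chunk_id, None)
```

The intended call order in this sketch is initialize, register_vector_db, insert_chunks, query_chunks, and finally shutdown. Querying an unregistered vector DB raises an error rather than returning an empty result.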
Add to Registry: Register your provider in the appropriate registry file.
Update llama_stack/providers/registry/vector_io.py to include your new provider.
from llama_stack.providers.registry.specs import InlineProviderSpec
from llama_stack.providers.registry.api import Api

InlineProviderSpec(
    api=Api.vector_io,
    provider_type="inline::milvus",
    pip_packages=["pymilvus>=2.4.10"],
    module="llama_stack.providers.inline.vector_io.milvus",
    config_class="llama_stack.providers.inline.vector_io.milvus.MilvusVectorIOConfig",
    api_dependencies=[Api.inference],
    optional_api_dependencies=[Api.files],
    description="",
),
Add Tests: Create unit tests and integration tests for your provider in the tests/ directory.
Unit Tests
By following the structure of the class methods, you will be able to easily run unit and integration tests for your database.
You have to configure the tests for your provider in /tests/unit/providers/vector_io/conftest.py.
Update the vector_provider fixture to include your provider if it is an inline provider.
Create a your_vectorprovider_index fixture that initializes your vector index.
Create a your_vectorprovider_adapter fixture that initializes your vector adapter.
Add your provider to the vector_io_providers fixture dictionary.
Please follow the naming convention of your_vectorprovider_index and your_vectorprovider_adapter, as the tests require this to execute properly.
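As a quick sanity check on the naming convention, a small helper like the one below could confirm that both fixture names exist for a given prefix. The helper names and the idea of checking against a set of defined fixture names are assumptions for illustration; the actual lookup in the test suite may work differently.

```python
def expected_fixture_names(prefix: str) -> tuple[str, str]:
    """Return the (index, adapter) fixture names the convention implies."""
    return f"{prefix}_index", f"{prefix}_adapter"


def missing_fixtures(prefix: str, defined: set[str]) -> list[str]:
    """List the convention-mandated fixture names that are not yet defined."""
    return [name for name in expected_fixture_names(prefix) if name not in defined]
```

For example, calling missing_fixtures with the prefix your_vectorprovider and a set containing only your_vectorprovider_index reports that your_vectorprovider_adapter still has to be written.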
Integration Tests
Integration tests are located in tests/integration. These tests use the Python client SDK APIs (from the llama_stack_client package) to test functionality.
The two sets of integration tests are:
tests/integration/vector_io/test_vector_io.py: This file tests registration, insertion, and retrieval.
tests/integration/vector_io/test_openai_vector_stores.py: These tests cover OpenAI-compatible vector stores and check OpenAI API compatibility.
You will need to update skip_if_provider_doesnt_support_openai_vector_stores to include your provider, as well as skip_if_provider_doesnt_support_openai_vector_stores_search to test the appropriate search functionality.
Running the tests in the GitHub CI
You will need to update the .github/workflows/integration-vector-io-tests.yml file to include your provider.
If your provider is a remote provider, you will also have to add a container to spin up and run it in the action.
Updating the pyproject.toml
If you are adding tests for the inline provider, you will have to update the unit group.
uv add new_pip_package --group unit
If you are adding tests for the remote provider, you will have to update the test group, which is used in the GitHub CI for integration tests.
uv add new_pip_package --group test
Update Documentation: Please update the documentation for end users.
Generate the provider documentation by running ./scripts/provider_codegen.py.
Update the autogenerated content in the registry/vector_io.py file with information about your provider. Please see other providers for examples.