Vector Stores

Overview

The athomic.database.vector module provides a high-level, provider-agnostic abstraction for interacting with Vector Databases. It is the foundation for implementing Semantic Search, Similarity Matching, and Retrieval-Augmented Generation (RAG) features within the Nala framework.

By decoupling the application logic from specific vendors (like Qdrant or Weaviate), this module ensures that your AI and search capabilities remain portable, testable, and consistent with the rest of the application ecosystem.

Key Features

Vendor Agnostic: Switch between backends via configuration without changing a single line of business logic.
Unified Data Model: Uses standardized DTOs (VectorRecord, SearchResult) to normalize data exchange across different providers.
Context-Aware: Automatically handles multi-tenancy by injecting the current tenant_id into storage operations via the ContextAwareVectorStore wrapper.
Lifecycle Management: Fully integrated with the application's lifecycle for connection pooling and graceful shutdown.
Observability: Automatic instrumentation for tracing and metrics on all vector operations.
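To make the Context-Aware feature concrete, here is a minimal sketch of how a tenant-injecting wrapper could work. It is not the actual ContextAwareVectorStore: the `current_tenant` context variable, the `TenantScopedStore` and `RecordingStore` classes, and the use of plain dicts for records are all assumptions made for illustration.

```python
import asyncio
import contextvars
from typing import Any, Optional

# Hypothetical context variable; the framework's real tenant context is not shown here.
current_tenant: contextvars.ContextVar[Optional[str]] = contextvars.ContextVar(
    "current_tenant", default=None
)


class RecordingStore:
    """Stand-in backend that only records the arguments it receives."""

    def __init__(self) -> None:
        self.upserts: list[tuple[str, list[dict[str, Any]]]] = []
        self.search_filters: list[dict[str, Any]] = []

    async def upsert(self, collection_name: str, records: list[dict[str, Any]]) -> None:
        self.upserts.append((collection_name, records))

    async def search(self, collection_name, query_vector, limit=10, filter_criteria=None):
        self.search_filters.append(filter_criteria)
        return []


class TenantScopedStore:
    """Wrapper that adds tenant_id to payloads on write and to filters on read."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner

    def _tenant(self) -> str:
        tenant = current_tenant.get()
        if tenant is None:
            raise RuntimeError("no tenant is set in the current context")
        return tenant

    async def upsert(self, collection_name: str, records: list[dict[str, Any]]) -> None:
        tenant = self._tenant()
        # Copy each record so the caller's objects are not mutated.
        scoped = [
            {**record, "payload": {**record.get("payload", {}), "tenant_id": tenant}}
            for record in records
        ]
        await self._inner.upsert(collection_name, scoped)

    async def search(self, collection_name, query_vector, limit=10, filter_criteria=None):
        scoped_filter = {**(filter_criteria or {}), "tenant_id": self._tenant()}
        return await self._inner.search(
            collection_name, query_vector, limit=limit, filter_criteria=scoped_filter
        )


async def demo() -> RecordingStore:
    backend = RecordingStore()
    store = TenantScopedStore(backend)
    token = current_tenant.set("acme")
    try:
        await store.upsert("documents", [{"id": "1", "vector": [0.1], "payload": {"type": "kb"}}])
        await store.search("documents", [0.1], filter_criteria={"type": "kb"})
    finally:
        current_tenant.reset(token)
    return backend


if __name__ == "__main__":
    print(asyncio.run(demo()).search_filters)
```

Because the tenant is read at call time rather than at construction time, a single wrapped client can be shared across requests that belong to different tenants.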

Core Concepts

`VectorStoreProtocol`

This is the abstract contract that all vector store providers must implement. It defines the standard asynchronous operations required for vector manipulation:

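create_collection(collection_name, dimension, ...): Creates a collection (or index) if it does not already exist.
upsert(collection_name, records, ...): Inserts or updates vector records in batch.
search(collection_name, query_vector, ...): Performs approximate nearest neighbor (ANN) search.
delete(collection_name, ids, ...): Removes records by their IDs.
delete_collection(collection_name): Deletes a collection and all associated data.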
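As a rough illustration of what satisfying such a contract involves, the sketch below defines a cut-down protocol and an in-memory implementation. The names `VectorStoreSketch` and `InMemoryStore` are hypothetical, records are plain dicts rather than the module's DTOs, and the search is an exact brute-force cosine scan rather than true ANN.

```python
import asyncio
import math
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class VectorStoreSketch(Protocol):
    """Simplified stand-in for the real protocol (assumed shape)."""

    async def create_collection(self, collection_name: str, dimension: int, **kwargs: Any) -> None: ...
    async def upsert(self, collection_name: str, records: list[dict], **kwargs: Any) -> None: ...
    async def search(self, collection_name: str, query_vector: list[float], limit: int = 10, **kwargs: Any) -> list[dict]: ...
    async def delete(self, collection_name: str, ids: list[str], **kwargs: Any) -> None: ...


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


class InMemoryStore:
    def __init__(self) -> None:
        self._collections: dict[str, dict[str, dict]] = {}
        self._dimensions: dict[str, int] = {}

    async def create_collection(self, collection_name, dimension, **kwargs):
        # Idempotent: an existing collection is left untouched.
        self._collections.setdefault(collection_name, {})
        self._dimensions.setdefault(collection_name, dimension)

    async def upsert(self, collection_name, records, **kwargs):
        expected = self._dimensions[collection_name]
        for record in records:
            if len(record["vector"]) != expected:
                raise ValueError(f"expected dimension {expected}")
            self._collections[collection_name][record["id"]] = record

    async def search(self, collection_name, query_vector, limit=10, **kwargs):
        hits = [
            {"id": r["id"], "score": cosine(query_vector, r["vector"]), "payload": r["payload"]}
            for r in self._collections[collection_name].values()
        ]
        hits.sort(key=lambda hit: hit["score"], reverse=True)
        return hits[:limit]

    async def delete(self, collection_name, ids, **kwargs):
        for record_id in ids:
            self._collections[collection_name].pop(record_id, None)


async def demo():
    store = InMemoryStore()
    await store.create_collection("docs", dimension=2)
    await store.upsert("docs", [
        {"id": "a", "vector": [1.0, 0.0], "payload": {}},
        {"id": "b", "vector": [0.0, 1.0], "payload": {}},
    ])
    best = await store.search("docs", [1.0, 0.1], limit=1)
    await store.delete("docs", ["a"])
    remaining = await store.search("docs", [1.0, 0.1])
    return best, remaining


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

An implementation like this can also serve as a test double, since application code written against the contract does not need to know which backend is behind it.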

`VectorManager`

Similar to the ConnectionManager, the VectorManager orchestrates the lifecycle of vector database connections. It reads the configuration, initializes the appropriate factory, and holds the references to the active clients.
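The lifecycle role described here can be sketched roughly as follows. This is not the real VectorManager: `VectorManagerSketch`, `FakeClient`, the `connect`/`close` hooks, and the `start`/`get`/`stop` method names are assumptions chosen to show the start, lookup, and shutdown phases.

```python
import asyncio
from typing import Any, Callable, Optional


class FakeClient:
    """Stand-in provider client with hypothetical connect/close hooks."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def close(self) -> None:
        self.connected = False


class VectorManagerSketch:
    def __init__(
        self,
        connections: dict[str, dict[str, Any]],
        default_name: str,
        factory: Callable[[str, dict[str, Any]], Any],
    ) -> None:
        self._connections = connections
        self._default_name = default_name
        self._factory = factory
        self._clients: dict[str, Any] = {}

    async def start(self) -> None:
        # Build and connect one client per enabled connection.
        for name, config in self._connections.items():
            if not config.get("enabled", True):
                continue
            client = self._factory(name, config)
            await client.connect()
            self._clients[name] = client

    def get(self, name: Optional[str] = None) -> Any:
        # Fall back to the configured default connection.
        return self._clients[name or self._default_name]

    async def stop(self) -> None:
        # Close in reverse order of creation (dict.popitem is LIFO).
        while self._clients:
            _, client = self._clients.popitem()
            await client.close()


async def demo():
    manager = VectorManagerSketch(
        connections={"default_qdrant": {"enabled": True}, "spare": {"enabled": False}},
        default_name="default_qdrant",
        factory=lambda name, config: FakeClient(name),
    )
    await manager.start()
    client = manager.get()
    connected_while_running = client.connected
    await manager.stop()
    return client, connected_while_running


if __name__ == "__main__":
    client, was_connected = asyncio.run(demo())
    print(client.name, was_connected, client.connected)
```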

`VectorStoreFactory`

This factory instantiates the concrete provider (e.g., QdrantVectorStore) and dynamically applies any configured wrappers (middleware) to it.
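One way to picture the "instantiate, then wrap" flow is a pair of registries keyed by name. The sketch below is an assumption about the general technique, not the real VectorStoreFactory: `build_store`, `PROVIDERS`, `WRAPPERS`, the stand-in classes, and the `"logging"` wrapper name are all hypothetical.

```python
from typing import Any, Callable


class FakeQdrantStore:
    """Stand-in for a concrete provider; it only keeps its options."""

    def __init__(self, **provider_options: Any) -> None:
        self.options = provider_options


class ContextAwareWrapper:
    def __init__(self, inner: Any) -> None:
        self.inner = inner


class LoggingWrapper:
    def __init__(self, inner: Any) -> None:
        self.inner = inner


# Hypothetical registries mapping configuration names to classes.
PROVIDERS: dict[str, Callable[..., Any]] = {"qdrant": FakeQdrantStore}
WRAPPERS: dict[str, Callable[[Any], Any]] = {
    "context_aware": ContextAwareWrapper,
    "logging": LoggingWrapper,
}


def build_store(settings: dict[str, Any]) -> Any:
    """Create the configured provider, then apply enabled wrappers in order."""
    backend = settings["backend"]
    if backend not in PROVIDERS:
        raise ValueError(f"unknown vector backend: {backend!r}")
    store = PROVIDERS[backend](**settings.get("provider", {}))
    for wrapper_config in settings.get("wrappers", []):
        if not wrapper_config.get("enabled", True):
            continue  # disabled wrappers are skipped entirely
        store = WRAPPERS[wrapper_config["name"]](store)
    return store


if __name__ == "__main__":
    store = build_store({
        "backend": "qdrant",
        "provider": {"location": "http://localhost:6333"},
        "wrappers": [{"name": "context_aware", "enabled": True}],
    })
    print(type(store).__name__, type(store.inner).__name__)
```

Since each wrapper receives the previous result, the last enabled entry in the list ends up outermost.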

Data Models

To ensure consistency, the module does not expose vendor-specific objects (like Qdrant's PointStruct). Instead, it uses internal Pydantic models:

VectorRecord: Represents a data point to be stored.
- id: Unique identifier (UUID or string).
- vector: The dense vector embedding (list of floats).
- payload: A dictionary of metadata associated with the vector.
SearchResult: Represents a generic search hit.
- id: The record ID.
- score: The similarity score.
- payload: The retrieved metadata.
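The fields listed above can be mirrored in a few lines. The sketch below uses standard-library dataclasses as a stand-in for the module's Pydantic models so that it runs without dependencies. The `*Sketch` class names and the `to_search_result` helper are hypothetical, as is the shape of the raw vendor hit.

```python
from dataclasses import dataclass, field
from typing import Any, Union
from uuid import UUID


@dataclass
class VectorRecordSketch:
    id: Union[str, UUID]
    vector: list[float]
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass
class SearchResultSketch:
    id: str
    score: float
    payload: dict[str, Any]


def to_search_result(raw_hit: dict[str, Any]) -> SearchResultSketch:
    """Normalize a vendor-shaped hit (assumed to be a dict) into the generic model."""
    return SearchResultSketch(
        id=str(raw_hit["id"]),
        score=float(raw_hit["score"]),
        payload=dict(raw_hit.get("payload") or {}),
    )


if __name__ == "__main__":
    record = VectorRecordSketch(id="doc-1", vector=[0.1, 0.2], payload={"type": "knowledge_base"})
    print(record)
    print(to_search_result({"id": 7, "score": 0.5, "payload": None}))
```

Normalizing at the provider boundary means callers only ever see one result shape, whichever backend produced the hit.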

Available Providers

`QdrantVectorStore`

The primary provider implementation for Qdrant. It supports:
- High-performance gRPC or HTTP connections.
- Automatic payload indexing for tenant isolation.
- Efficient batching for upserts.
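Batching upserts generally means splitting a large record list into fixed-size chunks and sending one request per chunk. The following sketch shows that technique in isolation. The `batched` and `upsert_in_batches` helpers, the `CountingStore` stand-in, and the batch size of 100 are assumptions and do not describe how QdrantVectorStore batches internally.

```python
import asyncio
from collections.abc import Iterable, Iterator
from itertools import islice
from typing import Any


def batched(items: Iterable[Any], size: int) -> Iterator[list[Any]]:
    """Yield consecutive chunks containing at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


async def upsert_in_batches(
    store: Any, collection_name: str, records: Iterable[Any], batch_size: int = 100
) -> int:
    """Upsert records chunk by chunk and return how many were sent."""
    total = 0
    for chunk in batched(records, batch_size):
        await store.upsert(collection_name=collection_name, records=chunk)
        total += len(chunk)
    return total


class CountingStore:
    """Stand-in store that records the size of each batch it receives."""

    def __init__(self) -> None:
        self.batch_sizes: list[int] = []

    async def upsert(self, collection_name: str, records: list[Any]) -> None:
        self.batch_sizes.append(len(records))


if __name__ == "__main__":
    store = CountingStore()
    sent = asyncio.run(upsert_in_batches(store, "documents", range(250)))
    print(sent, store.batch_sizes)
```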

`WeaviateVectorStore`

The provider implementation for Weaviate. It uses Weaviate's schema-based approach and GraphQL-like interface for retrieval.

Usage Example

The following example demonstrates how to store and retrieve embeddings using the memory_service pattern, which leverages the vector store internally.

from nala.athomic.database.factory import connection_manager_factory
from nala.athomic.database.vector.types import VectorRecord

async def index_document_embeddings(doc_id: str, vector: list[float], content: str):
    # 1. Get the connection manager
    manager = connection_manager_factory.create()

    # 2. Get the default vector store client
    # This client is already connected and wrapped (e.g., with context awareness)
    store = manager.get_vector_store()

    # 3. Create a standardized record
    record = VectorRecord(
        id=doc_id,
        vector=vector,
        payload={
            "content": content,
            "type": "knowledge_base",
            # Note: tenant_id is NOT manually added here; 
            # the ContextAware wrapper handles it automatically.
        }
    )

    # 4. Upsert
    await store.upsert(collection_name="documents", records=[record])

async def find_similar(query_vector: list[float]):
    manager = connection_manager_factory.create()
    store = manager.get_vector_store()

    # 5. Search
    results = await store.search(
        collection_name="documents",
        query_vector=query_vector,
        limit=5,
        # Optional: Filters are passed as generic dicts and translated by the provider
        filter_criteria={"type": "knowledge_base"}
    )

    return results
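The comment on filter_criteria says that generic dicts are translated by the provider. As an illustration of what such a translation might look like for Qdrant, the sketch below maps a flat dict onto Qdrant's JSON filter format (`must` conditions with `match.value` or `match.any`). The `to_qdrant_filter` helper is hypothetical, and whether QdrantVectorStore produces exactly this structure is an assumption.

```python
from typing import Any, Optional


def to_qdrant_filter(filter_criteria: Optional[dict[str, Any]]) -> Optional[dict[str, Any]]:
    """Translate {"field": value} pairs into a Qdrant-style filter dict.

    Scalar values become exact matches; list values become "match any".
    All conditions are combined with AND semantics ("must").
    """
    if not filter_criteria:
        return None
    conditions = []
    for key, value in filter_criteria.items():
        if isinstance(value, (list, tuple, set)):
            match = {"any": list(value)}
        else:
            match = {"value": value}
        conditions.append({"key": key, "match": match})
    return {"must": conditions}


if __name__ == "__main__":
    print(to_qdrant_filter({"type": "knowledge_base"}))
```

Keeping the translation inside the provider is what lets calling code such as find_similar stay unaware of any vendor's filter syntax.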

Configuration

Vector stores are configured under the [database.vector] section of settings.toml, which follows the standard Connection Group pattern.

Qdrant Example

[default.database.vector]
enabled = true
# The name of the default connection to use
default_connection_name = "default_qdrant"

  # --- Connection Definition ---
  [default.database.vector.connections.default_qdrant]
  enabled = true
  backend = "qdrant"
  connection_name = "default_qdrant"

    # --- Provider Specifics ---
    [default.database.vector.connections.default_qdrant.provider]
    location = "http://localhost:6333"
    # Secret reference for production security
    api_key = { path = "database/qdrant", key = "api_key" } # pragma: allowlist secret
    prefer_grpc = true
    timeout = 10.0

    # --- Wrappers ---
    # Automatically inject tenant_id into payloads/filters
    [[default.database.vector.connections.default_qdrant.wrappers]]
    name = "context_aware"
    enabled = true

API Reference

`nala.athomic.database.vector.protocol.VectorStoreProtocol`

Bases: Protocol

Protocol definition for Vector Database interactions. Enforces a standard API for Collection Management, Upsert, and Search.

`create_collection(collection_name, dimension, **kwargs)` `async`

Creates a new collection or index if it does not exist.

`delete(collection_name, ids, **kwargs)` `async`

Deletes specific records by their IDs.

`delete_collection(collection_name)` `async`

Deletes a collection and all associated data.

`search(collection_name, query_vector, limit=10, score_threshold=0.0, filter_criteria=None, **kwargs)` `async`

Performs semantic search using a query vector.

`upsert(collection_name, records, **kwargs)` `async`

Inserts or updates vector records in batch.

`nala.athomic.database.vector.manager.VectorManager`

Bases: BaseManager[VectorStoreProtocol, VectorSettings]

A specialized lifecycle manager for Vector Store connections.

It orchestrates the initialization, connection, and graceful shutdown of vector database providers defined in the configuration.

`__init__(settings)`

Initializes the VectorManager.

Parameters:

Name	Type	Description	Default
`settings`	`DatabaseSettings`	The root database settings object containing the 'vector' connection group configuration.	required

`nala.athomic.database.vector.factory.VectorStoreFactory`

Bases: FactoryProtocol

Factory responsible for creating instances of the configured Vector Store provider. It applies wrappers dynamically based on the configuration defined in 'settings.wrappers'.

`create(settings=None)` `classmethod`

Creates and returns an instance of the Vector Store provider based on settings.

Parameters:

Name	Type	Description	Default
`settings`	`Optional[VectorSettings]`	The specific connection settings. If None, loads from global config.	`None`

Returns:

Type	Description
`VectorStoreProtocol`	An initialized VectorStoreProtocol, potentially wrapped.

`nala.athomic.database.vector.types.VectorRecord`

Bases: BaseModel

Represents a single record to be stored in the vector database. Contains the vector embedding, original content, and metadata.

`nala.athomic.database.vector.providers.qdrant.QdrantVectorStore`

Bases: BaseVectorStore

Qdrant implementation of the VectorStoreProtocol.

Features:
- Native Multi-Tenancy via Payload Filtering.
- Automatic Payload Indexing for tenant_id.
- Bulkhead concurrency control.
- Robust API Adaptability.
