Vector Stores
Overview
The athomic.database.vector module provides a high-level, provider-agnostic abstraction for interacting with Vector Databases. It is the foundation for implementing Semantic Search, Similarity Matching, and Retrieval-Augmented Generation (RAG) features within the Nala framework.
By decoupling the application logic from specific vendors (like Qdrant or Weaviate), this module ensures that your AI and search capabilities remain portable, testable, and consistent with the rest of the application ecosystem.
Key Features
- Vendor Agnostic: Switch between backends via configuration without changing a single line of business logic.
- Unified Data Model: Uses standardized DTOs (VectorRecord, SearchResult) to normalize data exchange across different providers.
- Context-Aware: Automatically handles multi-tenancy by injecting the current tenant_id into storage operations via the ContextAwareVectorStore wrapper.
- Lifecycle Management: Fully integrated with the application's lifecycle for connection pooling and graceful shutdown.
- Observability: Automatic instrumentation for tracing and metrics on all vector operations.
Core Concepts
VectorStoreProtocol
This is the abstract contract that all vector store providers must implement. It defines the standard asynchronous operations required for vector manipulation:
- create_collection(collection_name, dimension, ...): Manages schema/collection creation.
- upsert(collection_name, records, ...): Inserts or updates vector records.
- search(collection_name, query_vector, ...): Performs approximate nearest neighbor (ANN) search.
- delete(collection_name, ids, ...): Removes records.
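To make the contract concrete, the operations listed above can be sketched as a typing.Protocol with a small in-memory implementation. This is an illustration under stated assumptions, not the module's actual code: the dataclasses stand in for the real Pydantic models, the return types are guesses, and InMemoryVectorStore is a hypothetical test double that does exact brute-force cosine search rather than ANN.

```python
import asyncio
import math
from dataclasses import dataclass, field
from typing import Any, Optional, Protocol, runtime_checkable


@dataclass
class VectorRecord:
    """Stand-in for the module's Pydantic model (assumption)."""
    id: str
    vector: list[float]
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass
class SearchResult:
    """Stand-in for the module's Pydantic model (assumption)."""
    id: str
    score: float
    payload: dict[str, Any] = field(default_factory=dict)


@runtime_checkable
class VectorStoreProtocol(Protocol):
    # Return types are assumptions; the source only documents the parameters.
    async def create_collection(self, collection_name: str, dimension: int, **kwargs: Any) -> None: ...
    async def upsert(self, collection_name: str, records: list[VectorRecord], **kwargs: Any) -> None: ...
    async def search(self, collection_name: str, query_vector: list[float], limit: int = 10,
                     score_threshold: float = 0.0,
                     filter_criteria: Optional[dict[str, Any]] = None,
                     **kwargs: Any) -> list[SearchResult]: ...
    async def delete(self, collection_name: str, ids: list[str], **kwargs: Any) -> None: ...


def _cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Hypothetical test double: exact cosine search over a dict, no ANN index."""

    def __init__(self) -> None:
        self._collections: dict[str, dict[str, VectorRecord]] = {}

    async def create_collection(self, collection_name, dimension, **kwargs):
        self._collections.setdefault(collection_name, {})

    async def upsert(self, collection_name, records, **kwargs):
        bucket = self._collections.setdefault(collection_name, {})
        for record in records:
            bucket[record.id] = record  # same id overwrites: insert-or-update

    async def search(self, collection_name, query_vector, limit=10,
                     score_threshold=0.0, filter_criteria=None, **kwargs):
        hits = []
        for record in self._collections.get(collection_name, {}).values():
            # Treat the filter as exact-match key/value pairs on the payload.
            if filter_criteria and any(record.payload.get(k) != v for k, v in filter_criteria.items()):
                continue
            score = _cosine(query_vector, record.vector)
            if score >= score_threshold:
                hits.append(SearchResult(record.id, score, record.payload))
        hits.sort(key=lambda hit: hit.score, reverse=True)
        return hits[:limit]

    async def delete(self, collection_name, ids, **kwargs):
        bucket = self._collections.get(collection_name, {})
        for record_id in ids:
            bucket.pop(record_id, None)


async def demo() -> list[SearchResult]:
    store = InMemoryVectorStore()
    await store.create_collection("documents", dimension=2)
    await store.upsert("documents", [
        VectorRecord("a", [1.0, 0.0], {"type": "knowledge_base"}),
        VectorRecord("b", [0.0, 1.0], {"type": "faq"}),
    ])
    return await store.search("documents", [1.0, 0.0], limit=5,
                              filter_criteria={"type": "knowledge_base"})
```

Because the protocol is structural, InMemoryVectorStore satisfies it without inheriting from it, which is one way a provider-agnostic contract keeps business logic testable without a running database.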
VectorManager
Similar to the ConnectionManager, the VectorManager orchestrates the lifecycle of vector database connections. It reads the configuration, initializes the appropriate factory, and holds the references to the active clients.
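The lifecycle role described above can be pictured with a rough sketch. Everything here is hypothetical: the class name VectorManagerSketch, the start/stop method names, the FakeClient type, and the dict-shaped settings are illustrative assumptions, not the real VectorManager API.

```python
import asyncio
from typing import Any, Optional


class FakeClient:
    """Hypothetical stand-in for a connected vector store client."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.connected = False

    async def connect(self) -> None:
        self.connected = True

    async def close(self) -> None:
        self.connected = False


class VectorManagerSketch:
    """Illustrative only: reads settings, opens enabled connections, closes them on stop."""

    def __init__(self, settings: dict[str, Any]) -> None:
        self._settings = settings
        self._clients: dict[str, FakeClient] = {}

    async def start(self) -> None:
        for name, cfg in self._settings["connections"].items():
            if not cfg.get("enabled", True):
                continue  # disabled connections are never opened
            client = FakeClient(name)
            await client.connect()
            self._clients[name] = client

    def get_vector_store(self, name: Optional[str] = None) -> FakeClient:
        # Fall back to the configured default connection when no name is given.
        return self._clients[name or self._settings["default_connection_name"]]

    async def stop(self) -> None:
        for client in self._clients.values():
            await client.close()
        self._clients.clear()


async def demo() -> tuple:
    manager = VectorManagerSketch({
        "default_connection_name": "default_qdrant",
        "connections": {
            "default_qdrant": {"enabled": True},
            "spare": {"enabled": False},
        },
    })
    await manager.start()
    client = manager.get_vector_store()
    connected_while_running = client.connected
    await manager.stop()
    return client.name, connected_while_running, client.connected
```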
VectorStoreFactory
The factory responsible for instantiating the concrete provider (e.g., QdrantVectorStore) and dynamically applying any configured wrappers (middleware) to it.
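One plausible way to apply configured wrappers dynamically is a name-to-class registry that the factory walks in order. The sketch below assumes that design; the registries, the create_store function, the "fake" backend, and the "timing" wrapper are all hypothetical names introduced for illustration, and the settings dict only loosely mirrors the backend/wrappers fields of the configuration.

```python
from typing import Any


class FakeVectorStore:
    """Hypothetical concrete provider standing in for e.g. QdrantVectorStore."""

    def describe(self) -> str:
        return "fake"


class ContextAwareWrapper:
    """Hypothetical wrapper; a real one would inject tenant_id."""

    def __init__(self, inner: Any) -> None:
        self.inner = inner

    def describe(self) -> str:
        return f"context_aware({self.inner.describe()})"


class TimingWrapper:
    """Hypothetical second wrapper, used only to show ordering."""

    def __init__(self, inner: Any) -> None:
        self.inner = inner

    def describe(self) -> str:
        return f"timing({self.inner.describe()})"


# Registries keyed by the names used in configuration (assumed design).
PROVIDERS = {"fake": FakeVectorStore}
WRAPPERS = {"context_aware": ContextAwareWrapper, "timing": TimingWrapper}


def create_store(settings: dict[str, Any]) -> Any:
    """Instantiate the provider, then wrap it with each enabled wrapper in order."""
    store = PROVIDERS[settings["backend"]]()
    for wrapper in settings.get("wrappers", []):
        if wrapper.get("enabled", True):
            store = WRAPPERS[wrapper["name"]](store)
    return store


store = create_store({
    "backend": "fake",
    "wrappers": [
        {"name": "context_aware", "enabled": True},
        {"name": "timing", "enabled": False},
    ],
})
print(store.describe())  # context_aware(fake)
```

With this layout the first wrapper in the list sits closest to the provider and later wrappers wrap around it, so the order in configuration determines the order in which calls pass through.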
Data Models
To ensure consistency, the module does not expose vendor-specific objects (like Qdrant's PointStruct). Instead, it uses internal Pydantic models:
- VectorRecord: Represents a data point to be stored.
  - id: Unique identifier (UUID or string).
  - vector: The dense vector embedding (list of floats).
  - payload: A dictionary of metadata associated with the vector.
- SearchResult: Represents a generic search hit.
  - id: The record ID.
  - score: The similarity score.
  - payload: The retrieved metadata.
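The boundary between the internal models and a vendor's objects might look like the sketch below. It is an assumption-laden illustration: dataclasses stand in for the Pydantic models, plain dicts stand in for vendor objects, and to_vendor_point and from_vendor_hit are hypothetical helper names, not functions from the module.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class VectorRecord:
    """Stand-in for the module's Pydantic model (assumption)."""
    id: str
    vector: list[float]
    payload: dict[str, Any] = field(default_factory=dict)


@dataclass
class SearchResult:
    """Stand-in for the module's Pydantic model (assumption)."""
    id: str
    score: float
    payload: dict[str, Any] = field(default_factory=dict)


def to_vendor_point(record: VectorRecord) -> dict[str, Any]:
    """Translate an internal record into a vendor-shaped dict (illustrative shape)."""
    return {"id": record.id, "vector": record.vector, "payload": dict(record.payload)}


def from_vendor_hit(hit: dict[str, Any]) -> SearchResult:
    """Keep only the normalized fields; vendor extras never reach the caller."""
    return SearchResult(
        id=str(hit["id"]),
        score=float(hit["score"]),
        payload=dict(hit.get("payload") or {}),
    )


record = VectorRecord(id="doc-1", vector=[0.1, 0.2, 0.3], payload={"type": "knowledge_base"})
point = to_vendor_point(record)

# A made-up vendor response carrying an extra field the application should not see.
result = from_vendor_hit(
    {"id": "doc-1", "score": 0.92, "payload": {"type": "knowledge_base"}, "version": 3}
)
```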
Available Providers
QdrantVectorStore
The primary provider implementation for Qdrant. It supports:
- High-performance gRPC or HTTP connections.
- Automatic payload indexing for tenant isolation.
- Efficient batching for upserts.
WeaviateVectorStore
Implementation for Weaviate, utilizing its schema-based approach and GraphQL-like interface for retrieval.
Usage Example
The following example demonstrates how to store and retrieve embeddings using the memory_service pattern, which leverages the vector store internally.
from nala.athomic.database.factory import connection_manager_factory
from nala.athomic.database.vector.types import VectorRecord


async def index_document_embeddings(doc_id: str, vector: list[float], content: str):
    # 1. Get the connection manager
    manager = connection_manager_factory.create()

    # 2. Get the default vector store client
    # This client is already connected and wrapped (e.g., with context awareness)
    store = manager.get_vector_store()

    # 3. Create a standardized record
    record = VectorRecord(
        id=doc_id,
        vector=vector,
        payload={
            "content": content,
            "type": "knowledge_base",
            # Note: tenant_id is NOT manually added here;
            # the ContextAware wrapper handles it automatically.
        },
    )

    # 4. Upsert
    await store.upsert(collection_name="documents", records=[record])


async def find_similar(query_vector: list[float]):
    manager = connection_manager_factory.create()
    store = manager.get_vector_store()

    # 5. Search
    results = await store.search(
        collection_name="documents",
        query_vector=query_vector,
        limit=5,
        # Optional: Filters are passed as generic dicts and translated by the provider
        filter_criteria={"type": "knowledge_base"},
    )
    return results
Configuration
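Vector stores are configured under the [database.vector] section in settings.toml; the example below nests it under a top-level default table, as [default.database.vector]. The configuration follows the standard Connection Group pattern.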
Qdrant Example
[default.database.vector]
enabled = true
# The name of the default connection to use
default_connection_name = "default_qdrant"
# --- Connection Definition ---
[default.database.vector.connections.default_qdrant]
enabled = true
backend = "qdrant"
connection_name = "default_qdrant"
# --- Provider Specifics ---
[default.database.vector.connections.default_qdrant.provider]
location = "http://localhost:6333"
# Secret reference for production security
api_key = { path = "database/qdrant", key = "api_key" } # pragma: allowlist secret
prefer_grpc = true
timeout = 10.0
# --- Wrappers ---
# Automatically inject tenant_id into payloads/filters
[[default.database.vector.connections.default_qdrant.wrappers]]
name = "context_aware"
enabled = true
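The effect of the context_aware wrapper enabled above can be approximated with a context variable that holds the current tenant. This is a minimal sketch under assumptions: the current_tenant_id variable, the RecordingStore double, and the use of plain dicts for records are hypothetical, and the real ContextAwareVectorStore may resolve the tenant differently.

```python
import asyncio
from contextvars import ContextVar
from typing import Any, Optional

# Hypothetical request-scoped tenant holder; the framework's real context API may differ.
current_tenant_id: ContextVar[Optional[str]] = ContextVar("current_tenant_id", default=None)


class RecordingStore:
    """Hypothetical inner store that only records what it receives."""

    def __init__(self) -> None:
        self.upserted: list[dict[str, Any]] = []
        self.last_filter: Optional[dict[str, Any]] = None

    async def upsert(self, collection_name, records, **kwargs):
        self.upserted.extend(records)

    async def search(self, collection_name, query_vector, limit=10, filter_criteria=None, **kwargs):
        self.last_filter = filter_criteria
        return []


class ContextAwareVectorStore:
    """Sketch: adds tenant_id to payloads on write and to filters on read."""

    def __init__(self, inner: Any) -> None:
        self._inner = inner

    async def upsert(self, collection_name, records, **kwargs):
        tenant = current_tenant_id.get()
        if tenant is not None:
            # Copy each record so the caller's objects are not mutated.
            records = [
                {**r, "payload": {**r.get("payload", {}), "tenant_id": tenant}}
                for r in records
            ]
        await self._inner.upsert(collection_name, records, **kwargs)

    async def search(self, collection_name, query_vector, limit=10, filter_criteria=None, **kwargs):
        merged = dict(filter_criteria or {})
        tenant = current_tenant_id.get()
        if tenant is not None:
            merged["tenant_id"] = tenant  # restrict results to the current tenant
        return await self._inner.search(
            collection_name, query_vector, limit=limit,
            filter_criteria=merged or None, **kwargs,
        )


async def demo() -> RecordingStore:
    inner = RecordingStore()
    store = ContextAwareVectorStore(inner)
    current_tenant_id.set("tenant-42")
    await store.upsert("documents", [
        {"id": "doc-1", "vector": [0.1, 0.2], "payload": {"type": "knowledge_base"}},
    ])
    await store.search("documents", [0.1, 0.2], filter_criteria={"type": "knowledge_base"})
    return inner
```

The caller never mentions tenant_id, yet both the stored payload and the search filter carry it, which matches the note in the usage example about not adding it manually.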
API Reference
nala.athomic.database.vector.protocol.VectorStoreProtocol
Bases: Protocol
Protocol definition for Vector Database interactions. Enforces a standard API for Collection Management, Upsert, and Search.
async create_collection(collection_name, dimension, **kwargs)
Creates a new collection or index if it does not exist.
async delete(collection_name, ids, **kwargs)
Deletes specific records by their IDs.
async delete_collection(collection_name)
Deletes a collection and all associated data.
async search(collection_name, query_vector, limit=10, score_threshold=0.0, filter_criteria=None, **kwargs)
Performs semantic search using a query vector.
async upsert(collection_name, records, **kwargs)
Inserts or updates vector records in batch.
nala.athomic.database.vector.manager.VectorManager
Bases: BaseManager[VectorStoreProtocol, VectorSettings]
A specialized lifecycle manager for Vector Store connections.
It orchestrates the initialization, connection, and graceful shutdown of vector database providers defined in the configuration.
__init__(settings)
Initializes the VectorManager.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| settings | DatabaseSettings | The root database settings object containing the 'vector' connection group configuration. | required |
nala.athomic.database.vector.factory.VectorStoreFactory
Bases: FactoryProtocol
Factory responsible for creating instances of the configured Vector Store provider. It applies wrappers dynamically based on the configuration defined in 'settings.wrappers'.
classmethod create(settings=None)
Creates and returns an instance of the Vector Store provider based on settings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| settings | Optional[VectorSettings] | The specific connection settings. If None, loads from global config. | None |
Returns:
| Type | Description |
|---|---|
| VectorStoreProtocol | An initialized VectorStoreProtocol, potentially wrapped. |
nala.athomic.database.vector.types.VectorRecord
Bases: BaseModel
Represents a single record to be stored in the vector database. Contains the vector embedding, original content, and metadata.
nala.athomic.database.vector.providers.qdrant.QdrantVectorStore
Bases: BaseVectorStore
Qdrant implementation of the VectorStoreProtocol.
Features:
- Native Multi-Tenancy via Payload Filtering.
- Automatic Payload Indexing for tenant_id.
- Bulkhead concurrency control.
- Robust API Adaptability.