This recipe shows how to build a Retrieval-Augmented Generation (RAG) pattern using the Vector Store for embeddings and the KV Store for document text.

Pattern

  1. Index: Store document embeddings in a vector collection, full text in KV
  2. Search: Find similar vectors, then retrieve the associated text
  3. Generate: Pass the retrieved context to your LLM

Implementation

Step 1: Index Documents

#!/bin/bash
set -euo pipefail

DB="--db ./data"

# Create a collection for document embeddings
strata $DB vector create knowledge 384 --metric cosine

# Batch upsert all embeddings at once (atomic)
strata $DB vector batch-upsert knowledge '[
  {"key":"doc-1","vector":[0.1,0.2,...],"metadata":{"text_key":"doc:doc-1"}},
  {"key":"doc-2","vector":[0.3,0.4,...],"metadata":{"text_key":"doc:doc-2"}},
  {"key":"doc-3","vector":[0.5,0.6,...],"metadata":{"text_key":"doc:doc-3"}}
]'

# Store the full text in KV
strata $DB kv put doc:doc-1 "StrataDB is an embedded database for AI agents..."
strata $DB kv put doc:doc-2 "Branches provide data isolation like git branches..."
strata $DB kv put doc:doc-3 "The Vector Store supports cosine similarity search..."

Step 2: Search and Retrieve

#!/bin/bash
set -euo pipefail

DB="--db ./data"
QUERY_EMBEDDING="[0.1,0.2,...]"  # Generated by your embedding model

# Find similar documents
strata $DB vector search knowledge "$QUERY_EMBEDDING" 5

# Retrieve full text for top matches
for key in $(strata $DB --raw vector search knowledge "$QUERY_EMBEDDING" 5 | awk '{print $1}'); do
    strata $DB kv get "doc:$key"
done

Step 3: Interactive RAG Session

$ strata --db ./data
strata:default/default> vector search knowledge [0.1,0.2,...] 3
key=doc-1 score=0.9823
key=doc-3 score=0.8712
key=doc-2 score=0.7544
strata:default/default> kv get doc:doc-1
"StrataDB is an embedded database for AI agents..."
strata:default/default> kv get doc:doc-3
"The Vector Store supports cosine similarity search..."

Branch-Scoped Knowledge Bases

Use branches to maintain different knowledge bases:

#!/bin/bash
set -euo pipefail

DB="--db ./data"

# General knowledge base in default branch
strata $DB vector create kb 384 --metric cosine
# ... index general documents ...

# Session-specific knowledge
strata $DB branch create session-001
strata $DB --branch session-001 vector create kb 384 --metric cosine
# ... index session-specific documents ...
# Search only sees session-specific documents
strata $DB --branch session-001 vector search kb "[0.1,0.2,...]" 5

Incremental Updates

Add new documents at any time:

strata --db ./data vector upsert knowledge new-doc "[0.1,0.2,...]" --metadata '{"text_key":"doc:new-doc"}'
strata --db ./data kv put doc:new-doc "New document content..."

Monitoring Collection Health

Use collection statistics to monitor your knowledge base:

strata --db ./data vector stats knowledge

Output:

name: knowledge
count: 1000
dimension: 384
metric: cosine
index_type: flat
memory_bytes: 1536000

See Also