Systematic edge case analysis at every layer.

1. Key Validation Boundaries

KV / State / JSON Keys — via validate_key() (bridge.rs:110-129)

| Input | Behavior | Correct? |
| --- | --- | --- |
| "" (empty) | Rejected: “Key must not be empty” | Yes |
| "a" (1 char) | Accepted | Yes |
| 1024-byte string | Accepted (at limit) | Yes |
| 1025-byte string | Rejected: “Key exceeds maximum length” | Yes |
| "hello\0world" (NUL) | Rejected: “Key must not contain NUL bytes” | Yes |
| "_strata/internal" | Rejected: reserved prefix | Yes |
| "_strata" (no slash) | Accepted (prefix is _strata/) | Yes |
| Unicode ("日本語") | Accepted | Yes |
| Whitespace-only (" ") | Accepted | Debatable; not harmful |

Verdict: KV/State/JSON key validation is thorough.

Vector Keys — via validate_vector_key() (collection.rs:66-82)

| Input | Behavior | Different from KV? |
| --- | --- | --- |
| "" (empty) | Accepted | Yes; KV rejects |
| 1024-byte string | Accepted | Same |
| 1025-byte string | Rejected | Same |
| NUL bytes | Rejected | Same |
| "_strata/internal" | Accepted | Yes; KV rejects |

Finding: Vector key validation is more permissive than KV key validation. Empty keys and reserved-prefix keys are accepted. The comment at line 67 says “Empty string keys are allowed (consistent with other key-value stores)” — but this is inconsistent with the same codebase’s own KV validation.
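One way to remove the divergence is to route both key types through a single validator. The sketch below is a reconstruction from the KV table above, not the code in bridge.rs or collection.rs: the constants `MAX_KEY_BYTES` and `RESERVED_PREFIX`, the `KeyError` enum, and both function bodies are assumptions made for illustration.

```rust
/// Assumed limits, taken from the KV table above; the real constants may differ.
const MAX_KEY_BYTES: usize = 1024;
const RESERVED_PREFIX: &str = "_strata/";

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum KeyError {
    Empty,
    TooLong { len: usize },
    ContainsNul,
    ReservedPrefix,
}

/// Applies the KV rules: non-empty, at most 1024 bytes, no NUL, no reserved prefix.
pub fn validate_key(key: &str) -> Result<(), KeyError> {
    if key.is_empty() {
        return Err(KeyError::Empty);
    }
    // `str::len` counts bytes, which matches the byte-based limit in the table.
    if key.len() > MAX_KEY_BYTES {
        return Err(KeyError::TooLong { len: key.len() });
    }
    if key.contains('\0') {
        return Err(KeyError::ContainsNul);
    }
    if key.starts_with(RESERVED_PREFIX) {
        return Err(KeyError::ReservedPrefix);
    }
    Ok(())
}

/// Vector keys delegate to the same rules instead of keeping a looser copy.
pub fn validate_vector_key(key: &str) -> Result<(), KeyError> {
    validate_key(key)
}
```

Delegating keeps the two paths from drifting apart again. If empty vector keys are in fact required, that exception would be better expressed as an explicit parameter than as a second validator.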

Collection Names — via validate_collection_name() (collection.rs:19-57)

| Input | Behavior | Correct? |
| --- | --- | --- |
| "" (empty) | Rejected | Yes |
| "/" (slash) | Rejected | Yes |
| NUL bytes | Rejected | Yes |
| "_internal" (underscore prefix) | Rejected (reserved) | Yes |
| > 256 chars | Rejected | Yes |

Verdict: Collection name validation is thorough.

Event Types — via validate_event_type() (event.rs:177-185)

| Input | Behavior | Correct? |
| --- | --- | --- |
| "" (empty) | Rejected: EmptyEventType | Yes |
| " " (whitespace only) | Accepted | Debatable |
| Unicode | Accepted | Yes |
| Very long strings | Accepted (no length limit) | Missing limit |

Verdict: Event type validation is minimal — only empty string rejected. No length limit.

Branch Names

| Input | Behavior | Correct? |
| --- | --- | --- |
| "" (empty) | Accepted | No; inconsistent |
| "default" | Maps to nil UUID | Yes |
| Valid UUID string | Parsed directly | Yes |
| Any other string | UUID v5 generated | Yes |

Finding: create_branch("") succeeds. The engine’s BranchIndex::create_branch() (index.rs:238) has no name validation. The executor’s to_core_branch_id() (bridge.rs:82-93) generates a deterministic UUID v5 for any non-“default” string, including empty strings. Every other primitive rejects empty identifiers (keys, collection names, event types).
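A small pre-check before the name is mapped to a branch ID would close this gap. The following is a minimal sketch, assuming a hypothetical `parse_branch_name` helper with its own `BranchRef` and `BranchNameError` types; none of these names come from the codebase, and the UUID parsing and v5 derivation described above are deliberately left out.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum BranchNameError {
    Empty,
}

/// Hypothetical classification of a branch name before ID derivation.
#[derive(Debug, PartialEq)]
pub enum BranchRef<'a> {
    /// The reserved name "default" (mapped to the nil UUID per the table above).
    Default,
    /// Any other non-empty name; UUID parsing / v5 derivation would follow.
    Named(&'a str),
}

/// Rejects the empty name, then classifies the rest.
pub fn parse_branch_name(name: &str) -> Result<BranchRef<'_>, BranchNameError> {
    match name {
        "" => Err(BranchNameError::Empty),
        "default" => Ok(BranchRef::Default),
        other => Ok(BranchRef::Named(other)),
    }
}
```

Placing the check in front of ID derivation means both the executor path and the engine path would see only validated names, provided both call it.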

2. Numeric Overflow Boundaries

Version::increment() — Overflow at u64::MAX

Location: crates/core/src/contract/version.rs:139-145

pub const fn increment(&self) -> Self {
    match self {
        Version::Txn(v) => Version::Txn(*v + 1),       // ← unchecked
        Version::Sequence(v) => Version::Sequence(*v + 1),
        Version::Counter(v) => Version::Counter(*v + 1),
    }
}

Behavior:

  • Debug mode: Panics on overflow (attempt to add with overflow)
  • Release mode: Wraps to 0 silently

Impact: Used in StateCell::cas() (state.rs:210) and StateCell::set() (state.rs:242). A state cell updated u64::MAX times would panic in debug or wrap to Counter(0) in release, breaking CAS semantics.

Safe alternative exists: saturating_increment() at version.rs:148-154 uses saturating_add(1), but it is never called in production code.
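A checked variant is another option. Saturating at u64::MAX would avoid the panic and the wrap, but a CAS that succeeds without changing the version also weakens CAS semantics, so surfacing the overflow to the caller may be preferable. The sketch below redefines a stand-in `Version` enum with the three variants shown above; `checked_increment` is a hypothetical method, not one that exists in version.rs.

```rust
/// Stand-in for the real `Version` type, with the variants quoted above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Version {
    Txn(u64),
    Sequence(u64),
    Counter(u64),
}

impl Version {
    /// Returns the next version, or `None` if the value is already `u64::MAX`.
    /// Behaves the same in debug and release builds.
    pub fn checked_increment(self) -> Option<Self> {
        Some(match self {
            Version::Txn(v) => Version::Txn(v.checked_add(1)?),
            Version::Sequence(v) => Version::Sequence(v.checked_add(1)?),
            Version::Counter(v) => Version::Counter(v.checked_add(1)?),
        })
    }
}
```

Callers such as a CAS path would then map `None` to an explicit error rather than storing a wrapped version.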

Global Version Counter — Overflow at u64::MAX

Location: crates/concurrency/src/manager.rs:136-138

pub fn allocate_version(&self) -> u64 {
    self.version.fetch_add(1, Ordering::SeqCst) + 1
}

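Behavior: fetch_add always wraps on overflow, in debug and release builds alike. Once the counter holds u64::MAX:

  • fetch_add(1) returns u64::MAX and stores 0; the trailing + 1 then overflows, so the function returns 0 in release builds and panics in debug builds
  • In release builds the next call returns 1, then 2, etc.
  • These duplicate earlier version numbers, corrupting MVCC ordering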

Impact: All MVCC snapshot reads, version chain ordering, and conflict detection rely on version monotonicity. Wrapping destroys this invariant globally.

Practical risk: At 1 billion transactions per second, overflow takes ~585 years. Low practical risk, but the invariant is architecturally fundamental.
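If the invariant is worth enforcing despite the low practical risk, the allocation can fail explicitly instead of wrapping. The sketch below uses the standard library's `AtomicU64::fetch_update` with `checked_add`; the `VersionAllocator` struct and `VersionExhausted` error are hypothetical stand-ins for the manager in manager.rs, and returning a `Result` would change the real function's signature.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical error returned once the counter can no longer advance.
#[derive(Debug, PartialEq)]
pub struct VersionExhausted;

/// Stand-in for the transaction manager's version counter.
pub struct VersionAllocator {
    version: AtomicU64,
}

impl VersionAllocator {
    pub fn new(start: u64) -> Self {
        Self { version: AtomicU64::new(start) }
    }

    /// Returns the newly allocated version. The counter is left untouched
    /// and an error is returned when it already holds `u64::MAX`.
    pub fn allocate_version(&self) -> Result<u64, VersionExhausted> {
        self.version
            // The closure returns `None` on overflow, which aborts the update.
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| cur.checked_add(1))
            // `prev + 1` cannot overflow here because `checked_add` succeeded.
            .map(|prev| prev + 1)
            .map_err(|_| VersionExhausted)
    }
}
```

`fetch_update` is a compare-and-swap loop, so it costs slightly more under contention than a bare `fetch_add`; whether that trade is acceptable on the commit path is a judgment call.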

Transaction ID Counter — Overflow at u64::MAX

Location: crates/concurrency/src/manager.rs:118-120

pub fn next_txn_id(&self) -> u64 {
    self.next_txn_id.fetch_add(1, Ordering::SeqCst)
}

Same wrapping behavior as version counter. Transaction IDs would collide in WAL records, making recovery ambiguous.

Event Sequence Counter — Overflow at u64::MAX

Location: crates/engine/src/primitives/event.rs:374

meta.next_sequence = sequence + 1;

Behavior: Unchecked addition. At u64::MAX it panics in debug builds and wraps to 0 in release builds. After a wrap, the next event is stored at sequence 0, overwriting the very first event in the log.

Impact: Event key format encodes sequence as part of the storage key. Wrapping creates key collisions with existing events, silently overwriting historical data.

3. Vector Embedding Boundaries

NaN and Infinity in Embeddings — Not Validated

Location: crates/engine/src/primitives/vector/heap.rs:180-213 (upsert)

The upsert() method validates dimension match but does not check for NaN or Infinity values in the embedding vector. The embedding is stored directly:

if embedding.len() != self.config.dimension {
    return Err(VectorError::DimensionMismatch { ... });
}
// No NaN/Infinity check — goes straight to storage

Impact:

  • NaN in embeddings produces NaN distances during search, corrupting result ordering
  • Infinity in embeddings produces Infinity distances, pushing results to extremes
  • Cosine similarity with NaN returns NaN, which is not comparable (all comparisons with NaN are false)
  • Once stored, the corrupted vector affects every subsequent search across the collection

Contrast: Event payload validation (event.rs:186-209) explicitly rejects NaN and Infinity in JSON payloads. The vector subsystem has no equivalent guard.
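An equivalent guard for embeddings is small. The sketch below assumes embeddings are `f32` slices and uses a hypothetical `validate_embedding` function and `EmbeddingError` enum; it is not the code in heap.rs, and the real `VectorError` type may be shaped differently.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum EmbeddingError {
    DimensionMismatch { expected: usize, got: usize },
    /// Index of the first NaN or infinite component.
    NonFinite { index: usize },
}

/// Checks the dimension, then rejects NaN and +/- infinity before storage.
pub fn validate_embedding(embedding: &[f32], dimension: usize) -> Result<(), EmbeddingError> {
    if embedding.len() != dimension {
        return Err(EmbeddingError::DimensionMismatch {
            expected: dimension,
            got: embedding.len(),
        });
    }
    // `is_finite()` is false for NaN, +inf, and -inf.
    if let Some(index) = embedding.iter().position(|x| !x.is_finite()) {
        return Err(EmbeddingError::NonFinite { index });
    }
    Ok(())
}
```

The check is a single linear pass over the vector, which is cheap next to the distance computations it protects.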

Dimension Bounds

| Input | Behavior | Correct? |
| --- | --- | --- |
| dimension = 0 | Rejected: InvalidDimension (store.rs:168) | Yes |
| dimension = 1 | Accepted | Yes |
| dimension = 1,000,000 | Accepted | Missing upper bound |

Finding: create_collection() validates dimension > 0 but has no upper bound. A dimension of 1 million means each vector requires 4MB (1M * 4 bytes). With 1000 vectors, that’s 4GB in the heap alone. There is no guard against memory exhaustion.
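A bounded check and the memory arithmetic can be sketched as follows. The cap `MAX_DIMENSION = 65_536` is an arbitrary placeholder chosen for illustration, not a value from the codebase, and the `DimensionError` enum and both functions are hypothetical. The 4-byte element size follows the `f32` assumption implied by the figures above.

```rust
/// Placeholder cap (assumption); the right value depends on the models in use.
pub const MAX_DIMENSION: usize = 65_536;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum DimensionError {
    Zero,
    TooLarge { dimension: usize, max: usize },
}

/// Accepts dimensions in `1..=MAX_DIMENSION`.
pub fn validate_dimension(dimension: usize) -> Result<usize, DimensionError> {
    match dimension {
        0 => Err(DimensionError::Zero),
        d if d > MAX_DIMENSION => Err(DimensionError::TooLarge {
            dimension: d,
            max: MAX_DIMENSION,
        }),
        d => Ok(d),
    }
}

/// Raw bytes needed to hold `count` f32 vectors of the given dimension,
/// or `None` if the product overflows `usize`.
pub fn heap_bytes(dimension: usize, count: usize) -> Option<usize> {
    dimension
        .checked_mul(std::mem::size_of::<f32>())?
        .checked_mul(count)
}
```

`heap_bytes(1_000_000, 1000)` reproduces the 4GB figure quoted above, excluding any index or allocator overhead.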

Embedding Dimension Mismatch

| Input | Behavior | Correct? |
| --- | --- | --- |
| Embedding matches config dimension | Accepted | Yes |
| Embedding shorter than config | Rejected: DimensionMismatch | Yes |
| Embedding longer than config | Rejected: DimensionMismatch | Yes |
| Empty embedding (dim=0 config) | Rejected at collection creation | Yes |

Verdict: Dimension mismatch is properly validated.

4. JSON Document Boundaries

Nesting Depth

Location: crates/core/src/primitives/json.rs:40

MAX_NESTING_DEPTH = 100

| Input | Behavior | Correct? |
| --- | --- | --- |
| 99-level nested object | Accepted | Yes |
| 100-level nested object | Accepted (at limit) | Yes |
| 101-level nested object | Rejected | Yes |
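The inclusive limit in the table can be illustrated with a depth check over a toy value type. Everything here is an assumption for illustration: the `Json` enum is a minimal stand-in rather than the type in json.rs, the depth convention (scalars are 0, each container adds 1) is a guess, and the real code may enforce the limit during parsing instead of on a built value.

```rust
pub const MAX_NESTING_DEPTH: usize = 100;

/// Minimal stand-in for a JSON value (not the codebase's type).
pub enum Json {
    Null,
    Number(f64),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Scalars have depth 0; each enclosing array or object adds one level.
pub fn nesting_depth(value: &Json) -> usize {
    match value {
        Json::Null | Json::Number(_) => 0,
        Json::Array(items) => 1 + items.iter().map(nesting_depth).max().unwrap_or(0),
        Json::Object(fields) => {
            1 + fields.iter().map(|(_, v)| nesting_depth(v)).max().unwrap_or(0)
        }
    }
}

/// Accepts depth <= MAX_NESTING_DEPTH; returns the offending depth otherwise.
pub fn check_depth(value: &Json) -> Result<(), usize> {
    let depth = nesting_depth(value);
    if depth > MAX_NESTING_DEPTH {
        Err(depth)
    } else {
        Ok(())
    }
}

/// Test helper: wraps a null leaf in `levels` nested objects.
pub fn nested(levels: usize) -> Json {
    (0..levels).fold(Json::Null, |inner, _| {
        Json::Object(vec![("k".to_string(), inner)])
    })
}
```

With this convention, `nested(100)` passes and `nested(101)` is rejected, matching the boundary rows above.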

Document Size

Location: crates/core/src/primitives/json.rs:34

MAX_DOCUMENT_SIZE = 16 * 1024 * 1024  (16 MB)

| Input | Behavior | Correct? |
| --- | --- | --- |
| 16MB - 1 byte | Accepted | Yes |
| 16MB | Accepted (at limit) | Yes |
| 16MB + 1 byte | Rejected | Yes |

JSON Path Edge Cases

| Input | Behavior | Correct? |
| --- | --- | --- |
| "$" (root) | Selects root document | Yes |
| "" (empty) | Equivalent to "$" | Yes (json.rs:686-689) |
| "$.nonexistent" | Returns None/null | Yes |
| "$[0]" on non-array | Returns None/null | Yes |

Verdict: JSON boundaries are well-guarded.

5. Event Log Boundaries

Sequence Numbering

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| First event in log | sequence = 0 (0-indexed) | Yes |
| Read sequence 0 | Returns first event | Yes |
| Read negative offset | Not possible (u64) | Yes |
| Read beyond last sequence | Returns None | Yes |

Event Payload Validation

| Input | Behavior | Correct? |
| --- | --- | --- |
| {} (empty object) | Accepted | Yes |
| [] (array) | Rejected: “Payload must be a JSON object” | Yes |
| "string" | Rejected: “Payload must be a JSON object” | Yes |
| 42 (number) | Rejected: “Payload must be a JSON object” | Yes |
| null | Rejected: “Payload must be a JSON object” | Yes |
| Object with NaN | Rejected: “NaN values not permitted” | Yes |
| Object with Infinity | Rejected: “Infinity values not permitted” | Yes |

Verdict: Event payload validation is thorough. Notably, it is stricter than vector embedding validation.

Hash Chain at Boundary

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| First event (no previous) | Hash includes [0u8; 32] as prev_hash | Yes |
| Event after gap | Not possible; sequences are contiguous | Yes |

6. Transaction Boundaries

Empty Transaction

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| Begin → Commit (no operations) | Succeeds, allocates version | Wasteful but harmless |
| Begin → Rollback (no operations) | Succeeds, no version allocated | Yes |

Finding: An empty transaction commit allocates a version number that serves no purpose. This creates version gaps (documented as intentional in manager.rs:126-135) but wastes version space. Not a bug per se, but a minor inefficiency.

Transaction Size

No explicit limit on:

  • Number of keys in write-set
  • Total data size in write-set
  • Number of operations per transaction

A transaction writing millions of keys would accumulate all data in memory (TransactionContext’s write_set HashMap) before commit, potentially causing OOM.
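A write-set that tracks its own size could reject oversized transactions before they exhaust memory. The sketch below is a hypothetical `BoundedWriteSet` with caller-supplied limits; the type, its fields, the error enum, and the byte-keyed map are all assumptions for illustration and do not reflect how TransactionContext is actually laid out.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum WriteSetError {
    TooManyKeys { limit: usize },
    TooManyBytes { limit: usize },
}

/// Write-set that enforces a key-count limit and a total-byte limit.
pub struct BoundedWriteSet {
    entries: HashMap<Vec<u8>, Vec<u8>>,
    bytes: usize,
    max_keys: usize,
    max_bytes: usize,
}

impl BoundedWriteSet {
    pub fn new(max_keys: usize, max_bytes: usize) -> Self {
        Self { entries: HashMap::new(), bytes: 0, max_keys, max_bytes }
    }

    /// Buffers a write, or rejects it without modifying the set.
    pub fn put(&mut self, key: Vec<u8>, value: Vec<u8>) -> Result<(), WriteSetError> {
        // Size of the entry being overwritten, if the key is already present.
        let replaced = self.entries.get(&key).map(|old| key.len() + old.len());
        if replaced.is_none() && self.entries.len() >= self.max_keys {
            return Err(WriteSetError::TooManyKeys { limit: self.max_keys });
        }
        let new_bytes = self.bytes - replaced.unwrap_or(0) + key.len() + value.len();
        if new_bytes > self.max_bytes {
            return Err(WriteSetError::TooManyBytes { limit: self.max_bytes });
        }
        self.entries.insert(key, value);
        self.bytes = new_bytes;
        Ok(())
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn bytes(&self) -> usize {
        self.bytes
    }
}
```

Counting key plus value bytes is only an approximation of real memory use, since it ignores map overhead, but it is enough to bound growth.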

Transaction Timeout

Location: crates/concurrency/src/transaction.rs

A timeout field exists on TransactionContext but is not enforced in production code paths. There is no background task checking for expired transactions. A transaction that begins but never commits holds its allocated resources (TransactionContext) until Session::drop() returns it to the pool, however long that takes.
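Enforcement does not necessarily need a background task; a deadline check at the start of each operation and at commit would already bound how long a stale transaction can keep making progress. The sketch below assumes a hypothetical `TxnDeadline` helper and `TxnExpired` error, neither of which exists in transaction.rs. The current time is passed in explicitly so the check is easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical error returned when a transaction has outlived its timeout.
#[derive(Debug, PartialEq)]
pub struct TxnExpired;

/// Start time plus timeout for one transaction.
pub struct TxnDeadline {
    started_at: Instant,
    timeout: Duration,
}

impl TxnDeadline {
    pub fn new(timeout: Duration) -> Self {
        Self { started_at: Instant::now(), timeout }
    }

    pub fn started_at(&self) -> Instant {
        self.started_at
    }

    /// Fails once `now` is at or past `started_at + timeout`.
    pub fn check(&self, now: Instant) -> Result<(), TxnExpired> {
        if now.duration_since(self.started_at) >= self.timeout {
            Err(TxnExpired)
        } else {
            Ok(())
        }
    }
}
```

A call such as `deadline.check(Instant::now())?` at the top of each read, write, and commit would be the usage pattern. This does not reclaim an idle transaction's memory, which would still need a sweeper or the existing drop path.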

7. Branch Operation Boundaries

Delete Default Branch

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| Delete “default” branch | Rejected | Yes |

Operations on Non-Existent Branch

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| KV write to non-existent branch | Succeeds (creates shard lazily) | By design |
| TxnBegin on non-existent branch | Succeeds | By design (#853) |
| BranchGet on non-existent branch | Returns None | Yes |
| BranchDelete on non-existent branch | Returns error | Yes |

Operations on Deleted Branch

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| KV write to deleted branch | Succeeds | No; creates orphaned data |
| State write to deleted branch | Succeeds | No; creates orphaned data |
| Event append to deleted branch | Succeeds | No; creates orphaned data |

Finding: Branch deletion (BranchIndex::delete_branch() at index.rs:312-373) removes metadata and scans all data, but does not prevent future writes. Since branch IDs are deterministic UUIDs derived from names, a write to a deleted branch creates new data in a shard with no corresponding metadata. This data is invisible to list_branches() but exists in storage.
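One possible guard is a tombstone set consulted on the write path. Because writes to branches that were never created succeed by design, the sketch rejects only branches that were explicitly deleted. All names here (`BranchGuard`, `WriteError`, the `u128` stand-in for a UUID) are hypothetical, and a real implementation would also need to persist tombstones across restarts.

```rust
use std::collections::HashSet;

/// Stand-in for a branch UUID.
pub type BranchId = u128;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum WriteError {
    BranchDeleted(BranchId),
}

/// Tracks deleted branch IDs so later writes can be refused.
#[derive(Default)]
pub struct BranchGuard {
    deleted: HashSet<BranchId>,
}

impl BranchGuard {
    /// Called when a branch is deleted.
    pub fn mark_deleted(&mut self, id: BranchId) {
        self.deleted.insert(id);
    }

    /// Called when a branch is (re)created; IDs are deterministic, so a
    /// re-created name maps to the same ID and must clear the tombstone.
    pub fn mark_created(&mut self, id: BranchId) {
        self.deleted.remove(&id);
    }

    /// Never-created branches stay writable; deleted ones do not.
    pub fn check_writable(&self, id: BranchId) -> Result<(), WriteError> {
        if self.deleted.contains(&id) {
            Err(WriteError::BranchDeleted(id))
        } else {
            Ok(())
        }
    }
}
```

Each KV, state, and event write would call `check_writable` before touching the shard.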

8. CAS (Compare-and-Swap) Boundaries

State CAS Edge Cases

| Scenario | Behavior | Correct? |
| --- | --- | --- |
| CAS with expected=None, cell doesn’t exist | Initializes cell | Yes |
| CAS with expected=None, cell exists | Returns conflict | Yes (after #836 fix) |
| CAS with expected=Counter(0) | Only works if cell has Counter(0) | Yes |
| CAS with expected=Counter(u64::MAX) | CAS succeeds, new version wraps | No; overflow |
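The four rows above can be modelled with a small in-memory cell in which the last case fails cleanly instead of wrapping. This is a sketch under stated assumptions: the `StateCell` below is a toy stand-in, not the type in state.rs, the initial version of 1 for a newly created cell is a guess, and `CasError` and `at_version` are hypothetical.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum CasError {
    /// The cell's actual version (or `None` if absent) did not match.
    Conflict { actual: Option<u64> },
    /// The version is already `u64::MAX` and cannot advance.
    VersionOverflow,
}

/// Toy cell holding an optional (version, value) pair.
#[derive(Default)]
pub struct StateCell {
    current: Option<(u64, String)>,
}

impl StateCell {
    /// Test helper: builds a cell that already holds `version`.
    pub fn at_version(version: u64, value: &str) -> Self {
        Self { current: Some((version, value.to_string())) }
    }

    /// Writes `value` only if the cell's version equals `expected`.
    /// On any error the cell is left unchanged.
    pub fn cas(&mut self, expected: Option<u64>, value: &str) -> Result<u64, CasError> {
        let actual = self.current.as_ref().map(|(version, _)| *version);
        if actual != expected {
            return Err(CasError::Conflict { actual });
        }
        let next = match actual {
            None => 1, // assumed initial version for a new cell
            Some(version) => version.checked_add(1).ok_or(CasError::VersionOverflow)?,
        };
        self.current = Some((next, value.to_string()));
        Ok(next)
    }
}
```

Returning `VersionOverflow` keeps the stored version strictly increasing, which is the property CAS callers rely on.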

Version 0

| Context | Meaning of version 0 | Consistent? |
| --- | --- | --- |
| State CAS expected=None | “Create if not exists” | Yes |
| MVCC version 0 | Never allocated (versions start at 1) | Yes |
| Event sequence 0 | First event in log | Yes, but different from MVCC |
| Version::is_zero() | Returns true for any variant with value 0 | Yes |

9. Summary

| # | Finding | Severity | Type |
| --- | --- | --- | --- |
| 1 | Version::increment() panics (debug) or wraps (release) at u64::MAX | Medium | Overflow bug |
| 2 | Global version counter wraps at u64::MAX; corrupts MVCC | Low | Overflow (theoretical) |
| 3 | Transaction ID counter wraps at u64::MAX | Low | Overflow (theoretical) |
| 4 | Event sequence counter wraps at u64::MAX; overwrites events | Low | Overflow (theoretical) |
| 5 | NaN/Infinity in vector embeddings not validated | Medium | Missing validation |
| 6 | No upper bound on vector dimension; allows memory exhaustion | Medium | Missing validation |
| 7 | Empty branch name accepted; inconsistent with other primitives | Low | Inconsistent validation |
| 8 | Operations on deleted branch create orphaned data | Medium | Missing guard |
| 9 | Transaction timeout not enforced | Low | Incomplete feature |
| 10 | Vector key validation inconsistent with KV key validation | Low | Inconsistent validation |

Overall: Input validation is strong at the KV/State/JSON/Event layers but has gaps in the Vector and Branch layers. The most practically impactful issues are NaN in embeddings (#5), missing dimension upper bound (#6), and orphaned data after branch delete (#8). The overflow issues (#1-4) are theoretical given realistic workloads but represent architectural invariant violations.