1. Error Type Inventory

| Crate | Type | Variants | Role |
|---|---|---|---|
| core | StrataError | 21 | Canonical engine error. All engine operations return StrataResult<T>. |
| executor | Error | 30 | Client-facing error. All handler/session operations return Result<T>. |
| engine | VectorError | 16 | Vector-specific errors. Converted to StrataError via From impl. |
| concurrency | CommitError | 4 | Transaction commit failures. Converted to StrataError via From impl. |
| concurrency | JsonConflictError | 3 | JSON-level conflict details. Internal to ValidationResult, not exposed. |
| concurrency | PayloadError | 1 | MessagePack deserialization. Internal, not exposed. |
| durability | BranchBundleError | 14 | Bundle import/export errors. Converted to StrataError::Storage at engine boundary. |
| durability | RecoveryError | 7+ | Database recovery errors. Not converted; fatal at startup. |
| durability | SnapshotError | 6+ | Snapshot read errors. Internal to recovery. |
| durability | CodecError | 3 | Codec encode/decode errors. Converted to StrataError::Serialization. |

Total: ~105 error variants across 10 error types in 5 crates.

2. Conversion Chain

                          ORIGIN ERRORS
                          ─────────────
  io::Error    serde_json::Error    bincode::Error    CodecError
      │              │                    │                │
      └──────────────┴────────┬───────────┴────────────────┘
                              │
                              ▼
                     ┌─────────────────┐
                     │   StrataError   │   21 variants
                     │   (core crate)  │   Canonical engine error
                     └────────┬────────┘
                              │
            ┌─────────────────┼─────────────────┐
            │                 │                 │
            ▼                 ▼                 ▼
    ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
    │ VectorError  │  │ CommitError  │  │  BundleError │
    │ (16 variants)│  │ (4 variants) │  │(14 variants) │
    └──────┬───────┘  └──────┬───────┘  └──────┬───────┘
           │                 │                 │
           │   From impl     │  From impl      │  .map_err()
           └─────────────────┼─────────────────┘
                             │
                             ▼
                     ┌─────────────────┐
                     │   StrataError   │   (merged back)
                     └────────┬────────┘
                              │
                              │  convert_result() in executor
                              │  From<StrataError> for Error
                              ▼
                     ┌─────────────────┐
                     │ executor::Error │   30 variants
                     │ (client-facing) │
                     └────────┬────────┘
                              │
                              ▼
                          CLIENT

Two conversion paths exist. This is a problem; see the session-transaction scenario in Section 5:

PATH A: Executor (non-transactional)
  engine returns StrataResult<T>
    → convert_result() in handler
    → From<StrataError> for Error (in convert.rs)
    → executor::Result<T>

PATH B: Session (transactional)
  ctx.get()/ctx.put() returns StrataResult<T>
    → .map_err(Error::from) in dispatch_in_txn
    → From<StrataError> for Error (same From impl, BUT different
      intermediate error types from TransactionContext)
    → executor::Result<T>
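
The two paths can be sketched as follows. This is a minimal illustration, not the project's code: the enums are reduced to one or two variants, the Storage-to-Io mapping is assumed from the tables in this document, `path_a_get` and `path_b_get` are hypothetical names, and `str::parse` stands in for the `serde_json::from_str` step that the session path performs.

```rust
// Hypothetical, reduced stand-ins; the real enums have 21 and 30 variants.
#[derive(Debug)]
enum StrataError {
    Storage { message: String },
}

#[derive(Debug, PartialEq)]
enum Error {
    Io { reason: String },
    Internal { reason: String },
}

type StrataResult<T> = Result<T, StrataError>;

impl From<StrataError> for Error {
    fn from(e: StrataError) -> Self {
        match e {
            // Assumed mapping: storage failures surface to clients as Io.
            StrataError::Storage { message } => Error::Io { reason: message },
        }
    }
}

// Path A: handlers funnel every engine result through one helper.
fn convert_result<T>(r: StrataResult<T>) -> Result<T, Error> {
    r.map_err(Error::from)
}

fn path_a_get(engine_result: StrataResult<String>) -> Result<String, Error> {
    convert_result(engine_result)
}

// Path B: session dispatch converts inline and then performs extra steps
// whose failures are mapped by hand, outside the shared From impl.
fn path_b_get(ctx_result: StrataResult<String>) -> Result<i64, Error> {
    let raw = ctx_result.map_err(Error::from)?;
    // Stand-in for deserialization; a parse failure becomes Internal.
    raw.parse::<i64>()
        .map_err(|e| Error::Internal { reason: e.to_string() })
}
```

Both paths agree on engine errors because they share the From impl. They diverge only in the steps Path B adds after the conversion, which is where the extra Internal mapping enters.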

3. Context Loss at Each Boundary

Boundary: StrataError → executor::Error

Location: crates/executor/src/convert.rs

| StrataError field | What happens | Impact |
|---|---|---|
| entity_ref: EntityRef in NotFound | String-parsed by prefix ("kv:", "branch:", etc.) to select Error variant | New entity types silently fall through to KeyNotFound |
| entity_ref in VersionConflict | Discarded | Client doesn’t know which entity had the conflict |
| entity_ref in WriteConflict | Formatted into reason string | Unstructured: client must parse string |
| entity_ref in InvalidOperation | Formatted into reason string | Unstructured |
| source: Box<Error> in Storage | Discarded | Underlying OS error (disk full, permission denied) lost |
| Version enum in VersionConflict | Converted to raw u64 via version_to_u64() | Counter(5), Txn(5), and Sequence(5) all become 5 |
| duration_ms in TransactionTimeout | Formatted into reason string | Numeric data lost |
| resource/limit/requested in CapacityExceeded | Formatted into reason string | Structured data lost |
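
The Version row is the easiest loss to see in code. The sketch below assumes a three-variant `Version` enum with `u64` payloads and a plausible body for `version_to_u64()`; only the names come from this document, and the real definitions may differ.

```rust
// Assumed shape of the engine's version type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Version {
    Counter(u64),
    Txn(u64),
    Sequence(u64),
}

// Assumed body: the variant tag is dropped and only the number survives.
fn version_to_u64(v: Version) -> u64 {
    match v {
        Version::Counter(n) | Version::Txn(n) | Version::Sequence(n) => n,
    }
}
```

After the conversion, `Version::Counter(5)` and `Version::Txn(5)` are indistinguishable, even though they compare unequal as `Version` values.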

Boundary: VectorError → StrataError

Location: crates/engine/src/primitives/vector/error.rs

| VectorError field | What happens | Impact |
|---|---|---|
| name in CollectionNotFound | Mapped to EntityRef::new("collection", name) | Preserved |
| name in CollectionAlreadyExists | Mapped to EntityRef::new("collection", name) | Preserved |
| key in VectorNotFound | Mapped to EntityRef::new("vector", key) | Preserved |
| tuple strings in Storage/Transaction/Internal/Io/Database | Mapped to message strings | Preserved but untyped |

Boundary: CommitError → StrataError

Location: crates/concurrency/src/transaction.rs

| CommitError variant | Maps to | Impact |
|---|---|---|
| ValidationFailed(result) | TransactionAborted { reason: result.to_string() } | Conflict details (keys, paths) lost |
| InvalidState(msg) | TransactionNotActive { state: msg } | Preserved |
| WALError(msg) | Storage { message: msg } | Error type lost: WAL failure becomes generic storage |
| StorageError(msg) | Storage { message: msg } | Preserved |
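
The mapping in this table corresponds to a From impl along the following lines. This is a sketch under assumptions: the variant and field names come from the table, but `ValidationResult` is reduced to a hypothetical `conflicting_keys` field with an invented Display format, and `StrataError` is cut down to the three variants involved.

```rust
use std::fmt;

// Hypothetical stand-in for the real validation result.
#[derive(Debug)]
struct ValidationResult {
    conflicting_keys: Vec<String>,
}

impl fmt::Display for ValidationResult {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "validation failed: {}", self.conflicting_keys.join(", "))
    }
}

#[derive(Debug)]
enum CommitError {
    ValidationFailed(ValidationResult),
    InvalidState(String),
    WALError(String),
    StorageError(String),
}

#[derive(Debug, PartialEq)]
enum StrataError {
    TransactionAborted { reason: String },
    TransactionNotActive { state: String },
    Storage { message: String },
}

impl From<CommitError> for StrataError {
    fn from(e: CommitError) -> Self {
        match e {
            // Structured conflict data is flattened into text here.
            CommitError::ValidationFailed(result) => {
                StrataError::TransactionAborted { reason: result.to_string() }
            }
            CommitError::InvalidState(msg) => StrataError::TransactionNotActive { state: msg },
            // Two distinct failure kinds land on the same variant.
            CommitError::WALError(msg) => StrataError::Storage { message: msg },
            CommitError::StorageError(msg) => StrataError::Storage { message: msg },
        }
    }
}
```

With this shape, a WAL failure and a storage failure carrying the same message convert to equal `StrataError` values, so nothing downstream can tell them apart.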

4. Error Swallowing Inventory

CRITICAL: Errors silently converted to success values

| Location | Code | What’s swallowed | Severity |
|---|---|---|---|
| handlers/state.rs line ~79 | Err(_) => Ok(Output::MaybeVersion(None)) | ALL errors from state.init() | High: storage failures, serialization errors, everything becomes None |
| handlers/state.rs line ~90 | Err(_) => Ok(Output::MaybeVersion(None)) | ALL errors from state.cas() | High: version conflicts, not-found, storage failures all become None |
| handlers/vector.rs line ~86 | let _ = p.vector.create_collection(...) | ALL errors from create_collection | Medium: storage failures silently ignored, next insert fails confusingly |
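
The `Err(_) => Ok(...)` arm in the first two rows can be illustrated with a reduced model. Everything here is a simplified assumption: the enums carry only the variants needed for the example, and `cas_swallowing` and `cas_selective` are hypothetical names. The second function shows one possible alternative, not the project's chosen fix.

```rust
#[derive(Debug, PartialEq)]
enum StrataError {
    VersionConflict { expected: u64, actual: u64 },
    Storage { message: String },
}

#[derive(Debug, PartialEq)]
enum Output {
    MaybeVersion(Option<u64>),
}

#[derive(Debug, PartialEq)]
enum Error {
    Io { reason: String },
}

// The reported pattern: every failure is turned into a successful None.
fn cas_swallowing(r: Result<u64, StrataError>) -> Result<Output, Error> {
    match r {
        Ok(v) => Ok(Output::MaybeVersion(Some(v))),
        Err(_) => Ok(Output::MaybeVersion(None)),
    }
}

// One alternative: only the expected CAS outcome becomes None,
// and every other failure is propagated as an error.
fn cas_selective(r: Result<u64, StrataError>) -> Result<Output, Error> {
    match r {
        Ok(v) => Ok(Output::MaybeVersion(Some(v))),
        Err(StrataError::VersionConflict { .. }) => Ok(Output::MaybeVersion(None)),
        Err(StrataError::Storage { message }) => Err(Error::Io { reason: message }),
    }
}
```

Because the selective match names each variant instead of using `Err(_)`, adding a new `StrataError` variant forces a compile error at this site rather than a silent `None`.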

LOW: Cleanup/non-critical error ignoring

| Location | Code | What’s swallowed | Severity |
|---|---|---|---|
| durability/src/snapshot.rs | let _ = std::fs::remove_file(...) | File deletion errors during cleanup | Low: best-effort cleanup |
| durability/src/branch_bundle/writer.rs | let _ = std::fs::remove_file(...) | Temp file removal | Low: cleanup |
| storage/src/sharded.rs | Thread join result ignored | Thread join errors | Low: shutdown path |

MODERATE: .ok() converting Result to Option

| Location | What’s lost | Severity |
|---|---|---|
| storage/src/sharded.rs SnapshotView::get() | Storage read errors become None | Medium: read failure looks like key-not-found |
| durability/src/format/snapshot.rs | Binary parse errors | Low: internal parsing |
| durability/src/format/wal_record.rs | Header parse errors | Low: internal parsing |
| durability/src/retention/mod.rs | UTF-8/hex parse errors | Low: file enumeration |
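
The first row is the case where `.ok()` changes observable behavior. The sketch below uses a hypothetical `ShardView` type with a `healthy` flag to simulate a failing read; it is not the real `SnapshotView`, only an illustration of how `.ok()` merges "read failed" and "key absent" into the same `None`.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
struct ReadError(String);

// Hypothetical stand-in for a storage view; `healthy` simulates I/O failure.
struct ShardView {
    data: HashMap<String, String>,
    healthy: bool,
}

impl ShardView {
    // Lossless form: distinguishes failure, absence, and presence.
    fn read(&self, key: &str) -> Result<Option<String>, ReadError> {
        if !self.healthy {
            return Err(ReadError("shard unavailable".to_string()));
        }
        Ok(self.data.get(key).cloned())
    }

    // Lossy form: `.ok()` discards the error, so a failed read
    // and a missing key both come back as None.
    fn get_lossy(&self, key: &str) -> Option<String> {
        self.read(key).ok().flatten()
    }
}
```

A caller of `get_lossy` cannot tell an empty shard from a broken one, while `read` keeps the three outcomes separate.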

5. Inconsistency Table

How each primitive handles the same error scenario

Scenario: Operation fails due to storage/IO error

| Primitive | Handler behavior | Client receives |
|---|---|---|
| KV | convert_result() propagates | Error::Io { reason } |
| Event | convert_result() propagates | Error::Io { reason } |
| State (read/init/set) | convert_result() propagates | Error::Io { reason } |
| State (cas) | Swallowed | Output::MaybeVersion(None), which looks like CAS failure |
| JSON | convert_result() propagates | Error::Io { reason } |
| Vector | convert_vector_result() propagates | Error::Io { reason } |
| Vector (upsert auto-create) | Swallowed | Subsequent insert fails with collection-not-found |
| Search | Mapped to Internal | Error::Internal { reason } |

Scenario: Operation inside session transaction

| Primitive | Error conversion path | Differs from non-txn? |
|---|---|---|
| KV get | ctx.get().map_err(Error::from) | Different .map_err chain than convert_result |
| KV put | txn.kv_put().map_err(Error::from) | Different chain |
| State read | ctx.get().map_err(Error::from) then serde_json::from_str().map_err(Internal) | Yes: JSON parse errors become Internal |
| JSON get (root) | ctx.get().map_err(Error::from) then serde_json::from_str().map_err(Internal) | Yes: same issue |
| JSON get (path) | txn.json_get_path().map_err(Error::from) | Different chain |

Scenario: Entity not found

| Error source | convert_result routing | Correct? |
|---|---|---|
| KV key not found | EntityRef starts with "kv:" → Error::KeyNotFound | Yes |
| Branch not found | EntityRef starts with "branch:" → Error::BranchNotFound | Yes |
| Collection not found | EntityRef starts with "collection:" → Error::CollectionNotFound | Yes |
| Event stream not found | EntityRef starts with "event:" → Error::StreamNotFound | Yes |
| State cell not found | EntityRef starts with "state:" → Error::CellNotFound | Yes |
| New future entity type | No prefix match → Error::KeyNotFound (wrong variant) | No: silent misrouting |
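
The misrouting in the last row follows directly from prefix matching with a catch-all branch. A minimal sketch, assuming the entity reference arrives as a `prefix:name` string: `route_not_found` is a hypothetical name, and the field names on `CollectionNotFound`, `StreamNotFound`, and `CellNotFound` are guesses.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    KeyNotFound { key: String },
    BranchNotFound { branch: String },
    CollectionNotFound { collection: String },
    StreamNotFound { stream: String },
    CellNotFound { cell: String },
}

// Selects the client-facing variant from the string prefix alone.
fn route_not_found(entity_ref: &str) -> Error {
    if let Some(rest) = entity_ref.strip_prefix("branch:") {
        Error::BranchNotFound { branch: rest.to_string() }
    } else if let Some(rest) = entity_ref.strip_prefix("collection:") {
        Error::CollectionNotFound { collection: rest.to_string() }
    } else if let Some(rest) = entity_ref.strip_prefix("event:") {
        Error::StreamNotFound { stream: rest.to_string() }
    } else if let Some(rest) = entity_ref.strip_prefix("state:") {
        Error::CellNotFound { cell: rest.to_string() }
    } else {
        // Catch-all: "kv:" and every unrecognized prefix end up here.
        let key = entity_ref.strip_prefix("kv:").unwrap_or(entity_ref);
        Error::KeyNotFound { key: key.to_string() }
    }
}
```

A reference such as `"index:by_age"` (an invented future entity type) is reported as `KeyNotFound` with no compiler warning, because the routing is driven by strings rather than an exhaustive match on a typed enum.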

6. Panic Risk Assessment

Production code .expect() calls

| File | Count | Pattern | Risk |
|---|---|---|---|
| executor.rs | 27 | branch.expect("resolved by resolve_default_branch") | Medium: invariant violation panics the thread |
| session.rs | 1 | .expect("txn_branch_id set when txn_ctx is Some") | Medium: state corruption panics |
| session.rs | 1 | .unwrap() on txn_ctx.take() | Medium: state corruption panics |

Total: 29 panic points in production code.

All 27 executor .expect() calls assume cmd.resolve_default_branch() was called before dispatch. If a code path skips resolution, the executor panics rather than returning an error.

The 2 session panic points assume internal state consistency between txn_ctx and txn_branch_id. If one is set without the other (e.g., after a partial failure in handle_begin), the session panics.
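
The difference between panicking on a broken invariant and reporting it can be sketched as follows. `Command` is reduced to a single optional `branch` field, `Error` to one variant, and both function names are hypothetical. Mapping the failure to `Internal` is an assumption, not a recommendation taken from the codebase.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    Internal { reason: String },
}

// Hypothetical, reduced command: only the field the invariant concerns.
struct Command {
    branch: Option<String>,
}

// The audited pattern: panics if resolution was skipped.
fn branch_or_panic(cmd: &Command) -> &str {
    cmd.branch
        .as_deref()
        .expect("resolved by resolve_default_branch")
}

// Alternative: the same invariant violation becomes an ordinary error.
fn branch_or_error(cmd: &Command) -> Result<&str, Error> {
    cmd.branch.as_deref().ok_or_else(|| Error::Internal {
        reason: "branch not resolved before dispatch".to_string(),
    })
}
```

With the second form, a code path that skips resolution produces an error the caller can handle instead of taking down the executing thread.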

7. Diagnostics Assessment

What users CAN diagnose from errors

| Scenario | Error received | Diagnosable? |
|---|---|---|
| Key not found | KeyNotFound { key: "user:123" } | Yes: key name included |
| Branch not found | BranchNotFound { branch: "my-branch" } | Yes |
| Wrong value type | WrongType { expected: "Int", actual: "String" } | Yes |
| Invalid key format | InvalidKey { reason: "key too long (1025 > 1024)" } | Yes |
| Dimension mismatch | DimensionMismatch { expected: 128, actual: 256 } | Yes |
| Invalid JSON path | InvalidPath { reason: "..." } | Yes |
| History trimmed | HistoryTrimmed { requested: 1, earliest: 5 } | Yes |

What users CANNOT diagnose

| Scenario | Error received | What’s missing |
|---|---|---|
| State CAS version mismatch | MaybeVersion(None) | Everything: was it conflict? not-found? IO error? |
| State CAS on non-existent cell | MaybeVersion(None) | Same None as version mismatch |
| Storage error during state CAS | MaybeVersion(None) | Same None as everything else |
| Disk full during write | Io { reason: "..." } | Cannot distinguish from permission denied or corruption |
| WAL failure during commit | TransactionConflict { reason: "..." } | Cannot distinguish from read-set conflict |
| Search invalid query | Internal { reason: "..." } | Cannot distinguish from budget exceeded or backend failure |
| Data corruption in txn read | Internal { reason: "..." } | Cannot distinguish from code bug |
| Version conflict (which type?) | VersionConflict { expected: 5, actual: 7 } | Is 5 a counter, txn ID, or sequence number? |

8. Summary of Problems

| # | Problem | Severity | Type |
|---|---|---|---|
| 1 | state_cas() swallows ALL errors | High | Swallowed errors |
| 2 | Session vs executor use different error paths | Medium | Inconsistency |
| 3 | Storage error source chain discarded at executor boundary | Medium | Context loss |
| 4 | Search errors all collapse to Internal | Medium | Context loss |
| 5 | VersionConflict loses version type information | Low | Context loss |
| 6 | Commit failures all become TransactionConflict | Medium | Context loss |
| 7 | vector_upsert ignores create_collection errors (not just AlreadyExists) | Medium | Swallowed errors |
| 8 | Session deserialization errors map to Internal | Low | Wrong error type |
| 9 | 27 .expect() calls in executor can panic | Medium | Panic risk |
| 10 | NotFound routing depends on string prefix parsing | Low | Fragile design |