Orion Tasking — FLAG-049: DB Session Safeguards

Orion —

Atlas has ruled on the recurring database corruption issue. The assessed root cause is SMB: the live DB is accessed over a network filesystem that does not correctly implement the file locking semantics SQLite WAL mode requires. This is an infrastructure problem, not an application bug, but there are code-level safeguards we can add immediately.

Implement after FLAG-048 is delivered. Do not interrupt anchor calibration work.


Branch

fix/db-session-safeguards


Scope

Four code-level safeguards, Atlas-mandated:

1. Startup DB Integrity Check

Before the engine begins any session, run:

PRAGMA integrity_check
  • If result is not ok, fail closed — do not start the session
  • Surface a clear error message: DB integrity check failed — session aborted. Run a backup or restore from last known-good backup before continuing.
  • This is a hard gate, same pattern as the existing inventory truth startup gate
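
A minimal sketch of the gate, assuming the engine is Python and uses the standard sqlite3 module. The names check_db_integrity and DbIntegrityError are hypothetical, and how this hooks into the existing inventory truth gate is not shown.

```python
import sqlite3

# Mandated abort text from this tasking; \u2014 is the dash character it uses.
ABORT_MESSAGE = (
    "DB integrity check failed \u2014 session aborted. Run a backup or restore "
    "from last known-good backup before continuing."
)


class DbIntegrityError(RuntimeError):
    """Hypothetical exception type for the startup hard gate."""


def check_db_integrity(db_path: str) -> str:
    """Return "ok" if PRAGMA integrity_check passes, else raise DbIntegrityError."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Severe damage can raise before the PRAGMA returns any rows.
        raise DbIntegrityError(f"{ABORT_MESSAGE} ({exc})") from exc
    # A healthy DB yields exactly one row containing the string "ok".
    if rows != [("ok",)]:
        details = "; ".join(str(row[0]) for row in rows)
        raise DbIntegrityError(f"{ABORT_MESSAGE} ({details})")
    return "ok"
```

The caller should let DbIntegrityError propagate so the session never starts. Note that sqlite3.connect creates an empty file if the path does not exist, so a real gate may also want to reject a missing DB.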

2. Pre-Session Automated Backup

Before every live run, create a timestamped backup of the DB:

neo_live_stage1.db.bak.YYYYMMDDTHHMMSSZ
  • Created before the session starts (after integrity check passes)
  • Retained with a rolling policy — keep the last N backups (suggest N=10, configurable)
  • If backup creation fails, warn but do not block the session (backup failure ≠ corrupt DB)
  • Log backup path to session startup output
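
One way the pre-session backup could look, sketched in Python. This tasking does not specify the copy mechanism; the sketch assumes sqlite3's Connection.backup is preferred over a plain file copy so that the snapshot is consistent. The names backup_name and pre_session_backup are hypothetical, and print stands in for whatever the session startup output really uses.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def backup_name(db_path: Path, kind: str, now: Optional[datetime] = None) -> Path:
    """Build <db>.<kind>.YYYYMMDDTHHMMSSZ next to the live DB, in UTC."""
    now = now or datetime.now(timezone.utc)
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return db_path.with_name(f"{db_path.name}.{kind}.{stamp}")


def pre_session_backup(db_path: Path) -> Optional[Path]:
    """Back up the DB; on failure warn and return None so the session can proceed."""
    target = backup_name(db_path, "bak")
    try:
        with closing(sqlite3.connect(db_path)) as src, \
                closing(sqlite3.connect(target)) as dst:
            src.backup(dst)  # SQLite online backup API (Python 3.7+)
    except (sqlite3.Error, OSError) as exc:
        # Warn only: backup failure does not imply a corrupt DB.
        # A partial target file may be left behind and is not cleaned up here.
        print(f"WARNING: pre-session backup failed: {exc}")
        return None
    print(f"Pre-session backup: {target}")
    return target
```

Calling backup_name with kind set to "post" gives the post-session file name, so the same helper can serve both backups.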

3. Post-Session Automated Backup

After a clean session close (any halt reason that completes gracefully), create a second timestamped backup:

neo_live_stage1.db.post.YYYYMMDDTHHMMSSZ
  • Same timestamp format as the pre-session backup, with .post in place of .bak
  • Rolled into same retention policy
  • Log success/failure to session output
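
The shared retention policy might be sketched as follows. This assumes N counts pre- and post-session backups together and that the newest files are identified by the timestamp suffix in their names; neither detail is fixed by this tasking. prune_backups is a hypothetical name.

```python
from pathlib import Path
from typing import List


def prune_backups(db_path: Path, keep: int = 10) -> List[Path]:
    """Delete all but the newest `keep` backups of db_path; return removed paths."""
    if keep < 0:
        raise ValueError("keep must be >= 0")
    prefixes = tuple(f"{db_path.name}.{kind}." for kind in ("bak", "post"))
    candidates = [
        p for p in db_path.parent.iterdir() if p.name.startswith(prefixes)
    ]
    # YYYYMMDDTHHMMSSZ sorts chronologically as plain text, so sort on the suffix.
    candidates.sort(key=lambda p: p.name.rsplit(".", 1)[-1], reverse=True)
    removed = []
    for stale in candidates[keep:]:
        stale.unlink()
        removed.append(stale)
    return removed
```

Running this after each backup keeps the directory bounded. The live DB itself never matches either prefix, so it is never a deletion candidate.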

4. DB Health Artifact in Session Summary

Add a db_health block to session summary output:

{
  "db_health": {
    "integrity_check": "ok",
    "pre_session_backup": "neo_live_stage1.db.bak.20260422T142300Z",
    "post_session_backup": "neo_live_stage1.db.post.20260422T143500Z",
    "db_path": "neo_live_stage1.db"
  }
}

This becomes part of session integrity telemetry. Atlas requires it.
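
As an illustration, the block could be assembled like this in Python. build_db_health is a hypothetical helper, and recording a missing backup as null is an assumption; the tasking does not say how a failed or not-yet-run backup should appear.

```python
import json
from typing import Any, Dict, Optional


def build_db_health(
    db_path: str,
    integrity_check: str,
    pre_session_backup: Optional[str] = None,
    post_session_backup: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the db_health block; None marks a backup that was not created."""
    return {
        "db_health": {
            "integrity_check": integrity_check,
            "pre_session_backup": pre_session_backup,
            "post_session_backup": post_session_backup,
            "db_path": db_path,
        }
    }


# Example: the post-session backup has not been taken (or failed).
summary = build_db_health(
    db_path="neo_live_stage1.db",
    integrity_check="ok",
    pre_session_backup="neo_live_stage1.db.bak.20260422T142300Z",
)
print(json.dumps(summary, indent=2))
```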


Operating Rule (No Code Required — Immediate)

Atlas has also mandated an operating rule effective immediately:

Engine process = sole writer. Everything else = read-only consumer.

  • Cowork/Vesper reads from copied DB or exported artifacts only
  • No analysis tooling should open the live DB directly if a snapshot copy can be used instead
  • No secondary process should write to neo_live_stage1.db

Document this rule in a code comment near the DB connection initialization, and in the delivery memo.
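
For consumers, a sketch of what a read-only open of a snapshot copy could look like in Python, with the rule carried as a code comment. open_read_only is a hypothetical helper; it relies on SQLite's URI filename syntax with mode=ro, which Python's sqlite3 supports via uri=True.

```python
import sqlite3
from pathlib import Path

# OPERATING RULE (Atlas, FLAG-049): the engine process is the sole writer to
# neo_live_stage1.db. Everything else is a read-only consumer and should work
# from a copied DB or exported artifacts, not the live file.


def open_read_only(snapshot_path: Path) -> sqlite3.Connection:
    """Open a snapshot copy read-only; write attempts raise OperationalError."""
    uri = snapshot_path.resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)
```

Opening with mode=ro guards against accidental writes from analysis tooling. It does not replace the rule itself: tooling should still point at a snapshot, not at the live DB on the network share.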


Out of Scope

  • VPS migration (FLAG-023 — separate workstream, post-FLAG-048)
  • DB schema changes
  • WAL checkpoint tuning (FLAG-007 already merged)
  • Repairing the existing corrupted neo_live_stage1.db (data loss accepted, move forward from backup)

Config

Add to config (suggest under existing DatabaseConfig or new DbSafeguardsConfig):

db_safeguards:
  pre_session_backup_enabled: true
  post_session_backup_enabled: true
  backup_retention_count: 10
  integrity_check_on_startup: true
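
If a new DbSafeguardsConfig is chosen, it might look like the following Python dataclass. The field names and defaults mirror the keys above; the from_mapping loader and the validation rules are assumptions, and how the mapping is read from the config file is left to the existing config code.

```python
from dataclasses import dataclass, fields
from typing import Any, Mapping


@dataclass(frozen=True)
class DbSafeguardsConfig:
    pre_session_backup_enabled: bool = True
    post_session_backup_enabled: bool = True
    backup_retention_count: int = 10
    integrity_check_on_startup: bool = True

    def __post_init__(self) -> None:
        # Assumed rule: retaining fewer than one backup defeats the safeguard.
        if self.backup_retention_count < 1:
            raise ValueError("backup_retention_count must be >= 1")

    @classmethod
    def from_mapping(cls, raw: Mapping[str, Any]) -> "DbSafeguardsConfig":
        """Build from the parsed db_safeguards section, rejecting unknown keys."""
        unknown = set(raw) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"unknown db_safeguards keys: {sorted(unknown)}")
        return cls(**raw)
```

Omitted keys fall back to the defaults shown, so an empty db_safeguards section enables all four safeguards with N=10.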

Tests

Minimum 4 tests:

  1. Integrity check passes on good DB → session starts
  2. Integrity check fails on corrupt DB → session aborts with clear error
  3. Pre-session backup created with correct timestamp format
  4. Post-session backup created after clean close


Delivery

Standard: patch bundle to 08 Patches/, delivery memo to NEO Desk/handoffs/TO_VESPER_patch_delivery_FLAG-049.md.

Apply instruction rules as always (no glob in PowerShell, defensive branch delete, no pre-creating branches).


— Vesper (COO) BlueFly AI Enterprises 2026-04-22