Atlas Ruling — DB Reliability, SMB Risk, and VPS Migration Sequencing¶
1. Database Corruption — Atlas's Read¶
Treat the current local+SMB SQLite setup as operationally untrustworthy.
Not "a bit fragile." Not "monitor it more closely." Untrustworthy.
Repeated corruption, unrecoverable session loss, and a persistence model that depends on SQLite WAL behavior in an environment the SQLite documentation explicitly warns against: a network filesystem with unreliable locking. That is enough.
This is now an infrastructure problem, not an application bug.
FLAG-007 hardening was still worth doing, but it does not change the underlying storage reality.
Root cause assessment: The SMB hypothesis is the leading explanation and should be treated as the working root cause unless disproven.
- Repeated pattern, not isolated event
- WAL corruption, not just app-level inconsistency
- Unrecoverable database state
- Known incompatibility class between SQLite WAL semantics and network filesystems / file locking edge cases
Stop expecting further software-side "hygiene" to solve this completely.
2. Direct Answers — DB Reliability Questions¶
Q: Add pre-session DB integrity check?
Yes. Approved immediately.
Add a startup integrity gate:
- PRAGMA integrity_check
- Fail closed unless the check returns a clean "ok" result
- Do not start the session on a suspect DB
This does not prevent corruption. It does prevent starting from bad state, wasting a session on a broken file, or discovering corruption too late.
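A minimal sketch of such a gate, using Python's stdlib sqlite3. The function name, DB path, and exit behavior are illustrative, not the engine's actual entry point; the one hard rule it encodes is: anything other than a single "ok" row from PRAGMA integrity_check means do not start.

```python
import sqlite3
import sys

def integrity_gate(db_path: str) -> None:
    """Fail closed unless PRAGMA integrity_check returns exactly 'ok'."""
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute("PRAGMA integrity_check").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # File is not readable as a SQLite database at all: refuse to start.
        sys.exit(f"DB unreadable, refusing to start session: {db_path}: {exc}")
    if rows != [("ok",)]:
        # Any other output is a list of corruption findings: refuse to start.
        sys.exit(f"DB integrity check failed for {db_path}: {rows}")

# Example: gate before launching a session
# integrity_gate("engine.db")
```

Note that a passing check only proves the file was consistent at startup; it says nothing about what SMB does to it mid-session, which is why the backup rules below still apply.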
Q: Add automated pre-session backups?
Yes. Approved immediately. Now mandatory, not optional.
Minimum:
- Timestamped pre-session backup
- Created before every live run
- Retained with rolling policy
Reason: still on unstable storage. Until migration, point-in-time rollback is required.
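A sketch of what that backup step could look like, assuming Python's stdlib sqlite3. The naming scheme, directory layout, and retention count are placeholders; the substantive point is that SQLite's online backup API produces a consistent snapshot even from a WAL-mode database, which a raw file copy of a live DB does not guarantee.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

def pre_session_backup(db_path: str, backup_dir: str, keep: int = 20) -> Path:
    """Take a timestamped pre-session snapshot and prune old ones."""
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    dest = dest_dir / f"pre_session_{stamp}.db"
    # The online backup API copies a consistent snapshot of the DB,
    # including WAL-mode databases, unlike a plain shutil.copy of the file.
    src = sqlite3.connect(db_path)
    dst = sqlite3.connect(dest)
    try:
        src.backup(dst)
    finally:
        src.close()
        dst.close()
    # Rolling retention: keep only the newest `keep` snapshots.
    snaps = sorted(dest_dir.glob("pre_session_*.db"))
    for old in snaps[:-keep]:
        old.unlink()
    return dest

# Example: snapshot the live DB before starting a session
# pre_session_backup("engine.db", "backups/")
```

The same function covers the post-session backup requirement with a different filename prefix.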
Q: Should Cowork treat DB as read-only?
Yes.
Engine process = sole writer. Everything else = read-only consumer.
Do not let Cowork or any secondary process write to the live DB. Do not let analysis tooling touch WAL behavior beyond reading snapshots or copies.
Until migration:
- Engine writes to live DB
- Analysis reads from copied DB or exported artifacts
- No shared live-write access pattern
This is an immediate operating rule.
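The read-only rule can be enforced at the driver level rather than by convention. A minimal sketch, assuming the consumers connect through Python's stdlib sqlite3 (the helper name is illustrative): mode=ro makes SQLite reject any write on that connection.

```python
import sqlite3

def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open a DB so that writes are rejected at the SQLite level.

    mode=ro refuses all writes on this connection. For copied snapshots
    that nothing else touches, immutable=1 would additionally skip
    locking and WAL handling entirely, but it is unsafe on a live DB.
    """
    return sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)

# Example: any INSERT/UPDATE through this connection raises
# sqlite3.OperationalError instead of touching the file.
# ro = open_read_only("snapshot.db")
```

This turns "Cowork must not write" from a policy into a guarantee on every connection Cowork opens.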
3. Additional Short-Term Controls¶
Before migration, add these operating safeguards:
A. Per-session backup before run — Mandatory.
B. Post-session backup after clean close — Also mandatory.
C. Read analysis from copies, not live DB — No tooling should inspect the live DB directly if a copied snapshot can be used instead.
D. DB health artifact per session — Each session should record:
- Integrity check result at start
- Backup timestamp used
- DB path used
- Whether post-session backup succeeded
This becomes part of session integrity.
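The four items above fit naturally into a small JSON artifact written once per session. A sketch, assuming Python's stdlib sqlite3 and json; the field names and the idea of passing the backup reference in as a string are illustrative, not an existing schema.

```python
import json
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

def write_health_artifact(db_path: str, backup_used: str,
                          artifact_path: str,
                          post_backup_ok: bool) -> dict:
    """Record the per-session DB health facts as a JSON artifact."""
    conn = sqlite3.connect(db_path)
    try:
        result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        conn.close()
    artifact = {
        "integrity_check": result,            # "ok" on a clean DB
        "backup_used": backup_used,           # timestamped backup reference
        "db_path": db_path,
        "post_session_backup_ok": post_backup_ok,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    Path(artifact_path).write_text(json.dumps(artifact, indent=2))
    return artifact
```

Because the artifact is plain JSON on disk, analysis tooling can consume it without ever opening the live DB.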
4. VPS Migration — Atlas's Read¶
Katja is correct. VPS migration is no longer a "future nice-to-have." It is moving toward near-term necessity.
Not because the engine is fully ready for production — but because the storage substrate is now undermining trust in results. You cannot keep proving system readiness on top of an unreliable persistence layer.
That said: do NOT migrate before anchor calibration is resolved enough to make sessions meaningful. Migration should not become avoidance.
Sequencing:
Fix signal validity enough to make sessions worth running
→ then move to stable infrastructure
→ then continue clean-session proof
5. Direct Answers — VPS Questions¶
Q: Preferred VPS provider / OS baseline?
Ubuntu LTS on a simple, boring VPS.
Provider preference order:
1. Hetzner
2. DigitalOcean
3. Linode
Reason: simple, cost-effective, predictable, plenty good enough for this workload.
Baseline:
- Ubuntu LTS
- Local SSD-backed filesystem
- Single-node deployment
- Engine + SQLite local to the box
- No SMB
- No network-mounted DB
Keep it boring.
Q: Architectural changes before migration?
Do not overbuild this.
Not wanted:
- DB split from engine into a separate service
- API layer first
- Premature distributed architecture
Wanted before/at migration:
- Local filesystem only
- Engine as sole DB writer
- Automated backup scripts
- Health checks
- Clear runtime/service management
- Log rotation
- Session artifact export path
That is enough.
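On Ubuntu, "clear runtime/service management" is most simply a systemd unit. A hedged sketch only; the unit name, user, and paths are placeholders, not the project's actual layout:

```ini
# /etc/systemd/system/engine.service — illustrative sketch
[Unit]
Description=Engine (sole SQLite writer)
After=network.target

[Service]
User=engine
WorkingDirectory=/opt/engine
ExecStart=/opt/engine/venv/bin/python -m engine
Restart=on-failure
# Keep the DB on the local SSD filesystem only; never a network mount.
Environment=ENGINE_DB_PATH=/opt/engine/data/engine.db

[Install]
WantedBy=multi-user.target
```

Log rotation then comes for free via journald, or a standard logrotate stanza if the engine writes its own log files.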
Q: Migration timing — standalone branch after FLAG-048, or wait until after Phase 7.4?
Plan migration as the first major post-FLAG-048 infrastructure task, but do not execute until anchor recalibration path is sufficiently validated.
Recommended sequencing:
1. Resolve FLAG-048 / anchor calibration enough that the engine is no longer idling on a broken signal
2. Run at least one meaningful validating session under corrected anchor logic
3. Execute VPS migration
4. Pursue the 2 clean-session Phase 7.4 requirement on the VPS
6. Reclassification Ruling¶
FLAG-023 should be reclassified from "future / low urgency" to "near-term infrastructure priority."
Not current blocker. But no longer back-burner.
7. Operational Posture From This Point¶
Current local setup = development / interim validation only
VPS = first serious operational environment
Stop treating the current box as something to trust long-term. Use it only to get through the current calibration layer, then move.
8. Final Directives¶
Immediate:
- Add startup integrity check
- Add automated pre-session backup
- Add automated post-session backup
- Enforce engine-only write access
- Read analysis from copies, not live DB
Near-term:
- Reclassify FLAG-023 upward
- Prepare VPS migration plan
- Do not migrate until anchor recalibration is sufficiently validated
Preferred platform:
- Ubuntu LTS
- Hetzner first choice, DigitalOcean second, Linode third
- Local SSD, single-node, SQLite local
The database issue is real, recurring, and infrastructure-rooted. Treat SMB + SQLite WAL as untrustworthy. Mitigate immediately. Migrate soon after anchor calibration is validated. Katja's instinct is right.
— Atlas (CSO) 2026-04-22