[C] Orion Pre-Live Replay Report — FLAG-048

To: Vesper
From: Orion
Branch: feat/anchor-dual-signal-calibration
Date: 2026-04-21
Status: Replay complete with caveats. Requesting sign-off to lift the session hold, or guidance on what additional replay you require.

Executive summary

The dual-signal calculator was replayed against real per-tick data from sessions 50/51/52 (CLAUDE.md labels: S47/S48/S49) read from the latest clean DB backup. All replay ran through a real StateManager(":memory:") instance — no mocks — exercising the cross-session persistence path end-to-end.

Two independent findings surfaced that need your decision before we call this a clean pre-live:

Data gap — session 53 (S50) is not recoverable. The live DB at mnt/neo-2026/neo_live_stage1.db is malformed on disk. Integrity check fails on both the mounted copy and Katja's direct upload (byte-identical files). This is the exact SMB/WAL pathology Atlas's 2026-04-22 DB-Reliability ruling called out. Latest clean backup is T165223Z which predates session 53; no other snapshot contains it. Replay proceeded on sessions 50/51/52 only.
Data caveat — pre-FLAG-048 schemas do not persist uncapped structural. The per-tick signal available before this branch is system_metrics.anchor_error_bps, which is already clamped at ±10 bps by the existing anchor cap. For replay purposes this value was fed to AnchorDualSignalCalculator.observe() as a stand-in for structural_basis_bps. What this replay can validate: EMA convergence, warm-up gating, cross-session persistence, rail-lock behavior, and exit reachability under saturated input. What it cannot validate: residual behavior when uncapped structural moves outside ±10 bps, because no such data exists in the historical record.

If you want replay against truly uncapped structural, the only path is to merge C1's schema migration, run one short live tick loop to capture real structural_basis_bps, then replay. We're in a chicken-and-egg situation until then.
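
To illustrate why the capped stand-in cannot witness behavior outside ±10 bps, here is a minimal sketch. The helper `clamp_bps` and the sample readings are hypothetical and are not taken from the replay data; only the ±10 bps cap comes from this report.

```python
CAP_BPS = 10.0  # existing anchor cap described in this report


def clamp_bps(value: float, cap: float = CAP_BPS) -> float:
    """Clamp a signal to [-cap, +cap], as historical anchor_error_bps was."""
    return max(-cap, min(cap, value))


# Hypothetical uncapped readings (illustrative only, not replay data).
uncapped = [4.58, 10.0, 12.5, 25.0]
capped = [clamp_bps(v) for v in uncapped]

# After clamping, the last three readings are indistinguishable, so a replay
# fed with capped values cannot tell a 12.5 bps excursion from a 25 bps one.
print(capped)
```

Any residual computed from `capped` sees the same input for all three saturated readings, which is the gap described above.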

Replay configuration

| Setting | Value |
| --- | --- |
| Data source | `neo_live_stage1.db.bak.20260421T165223Z` (integrity_check = ok) |
| Read mode | `mode=ro&immutable=1` via URI (no writes, no WAL materialization) |
| Sessions replayed | 50, 51, 52 (S47/S48/S49 per CLAUDE.md) |
| Sessions skipped | 53 (S50): live DB malformed, not in any readable snapshot |
| Config | default `AnchorDualSignalConfig`: ema_window=150, warmup=50, hyst=20, stale_h=24.0 |
| StateManager | real `StateManager(":memory:")` with `initialize_database()`, no mocks |
| Calculator | `AnchorDualSignalCalculator` observed against each session in order; baseline dumped at session end via `dump_state()`, persisted to the `engine_state` table, and restored at next session start via `seed_baseline()` |
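
The read mode in the table can be sketched with Python's standard `sqlite3` module. The helper name `open_snapshot_readonly` is hypothetical and is not claimed to match the replay script; only the URI parameters and the integrity check come from this report.

```python
import sqlite3
from pathlib import Path


def open_snapshot_readonly(path: str) -> sqlite3.Connection:
    """Open a SQLite snapshot read-only and immutable, then verify integrity.

    Hypothetical helper for illustration; the actual replay script may differ.
    """
    # as_uri() turns the absolute path into a percent-encoded file: URI.
    uri = Path(path).resolve().as_uri() + "?mode=ro&immutable=1"
    conn = sqlite3.connect(uri, uri=True)
    # integrity_check returns a single row containing "ok" on a healthy file.
    result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    if result != "ok":
        conn.close()
        raise RuntimeError(f"integrity_check failed: {result}")
    return conn
```

On a badly damaged file the pragma may raise `sqlite3.DatabaseError` instead of returning a row, so a caller would likely want to catch that as well.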

Results — per session

Session 50 (S47) — 32 ticks, `degraded_episode_limit_halt`

Started cold. Samples accumulated 0 → 32. Warmup threshold is 50, so calculator correctly returned baseline=None, residual=None for every tick.
At session close, dump_state() → (baseline=-8.3164 bps, samples=32). Written to engine_state under anchor_ds.basis_baseline_bps, anchor_ds.basis_baseline_samples, anchor_ds.basis_baseline_closed_at.

Session 51 (S48) — 171 ticks, `inventory_truth_halt`

Restored (baseline=-8.3164, samples=32) from engine_state. seed_baseline() accepted.
Needed 18 additional ticks to cross warmup (32 → 50). At tick 18 is_warm flipped True and residual/baseline began emitting.
Structural ranged from -8.07 to +10.00 across the session (18.07 bps span).
Residual trajectory over the 154 warm ticks: mean_abs=6.19 bps, max_abs=13.27 bps. Real signal, not stuck at rail.
Exit-reachability witness (tick 171): structural dropped from +10.00 to +4.58; residual on the same tick = +0.08 bps. That is well below the 5-bps residual threshold that would drive the anchor guard to exit. Confirms the new control signal reacts within a single tick when structural returns to baseline — which is precisely what was not reachable under capped anchor_error.
dump_state() → (baseline=+4.5030 bps, samples=203).
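
The session 51 tick counts can be cross-checked with simple arithmetic. This is a sketch under two assumptions: the calculator becomes warm on the tick where the sample count reaches the warm-up threshold, and the residual is structural minus the baseline dumped at session close. The function name `ticks_until_warm` is hypothetical.

```python
WARMUP = 50  # default warmup from the replay configuration


def ticks_until_warm(seeded_samples: int, warmup: int = WARMUP) -> int:
    """New ticks needed before the sample count reaches the warm-up threshold."""
    return max(0, warmup - seeded_samples)


seeded, session_ticks = 32, 171
first_warm_tick = ticks_until_warm(seeded)        # 18
warm_ticks = session_ticks - first_warm_tick + 1  # 154, counting tick 18 itself
samples_at_close = seeded + session_ticks         # 203

# Exit-reachability witness on tick 171 (assumed sign convention).
residual = 4.58 - 4.5030
print(first_warm_tick, warm_ticks, samples_at_close, round(residual, 2))
```

The three counts agree with the figures reported for this session, and the residual rounds to the +0.08 bps quoted in the witness.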

Session 52 (S49) — 38 ticks, `degraded_episode_limit_halt`, structural locked at +10 for all 38 ticks

Restored (baseline=+4.5030, samples=203). Already warm from tick 1.
Structural constant at +10.00 every tick.
Rail-lock witness: residual started at +5.42 bps (cap − baseline on tick 1 = 10 − 4.58; the restored +4.5030 baseline had already moved to about +4.58 after the first EMA update) and walked down to +3.31 bps by tick 38 as the EMA pulled the baseline from +4.58 → +6.69.
Residual mean_abs=4.28 bps, max_abs=5.42 bps across the 38 ticks. Mathematically, if the cap persisted another ~150 ticks the residual would converge toward 0 — exactly the rail-lock property Atlas required (structural saturation does not produce a permanent residual signal).
dump_state() → (baseline=+6.6881 bps, samples=241).
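
The session 52 figures can be reproduced with a toy EMA. This is a sketch, not the real `AnchorDualSignalCalculator`: the class `EmaBaselineSketch`, the smoothing factor `alpha = 2 / (ema_window + 1)`, and the update order (baseline first, then residual) are assumptions chosen because they are consistent with the numbers reported above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmaBaselineSketch:
    """Toy EMA baseline with warm-up gating (not the real calculator)."""
    ema_window: int = 150
    warmup: int = 50
    baseline: Optional[float] = None
    samples: int = 0

    def seed(self, baseline: float, samples: int) -> None:
        """Restore state persisted at the previous session close."""
        self.baseline, self.samples = baseline, samples

    def observe(self, structural_bps: float) -> Optional[float]:
        alpha = 2.0 / (self.ema_window + 1)  # assumed smoothing factor
        if self.baseline is None:
            self.baseline = structural_bps
        else:
            self.baseline += alpha * (structural_bps - self.baseline)
        self.samples += 1
        if self.samples < self.warmup:
            return None  # still warming up: no residual emitted
        return structural_bps - self.baseline


calc = EmaBaselineSketch()
calc.seed(baseline=4.5030, samples=203)  # state restored at session 52 start
residuals = [calc.observe(10.0) for _ in range(38)]
print(round(residuals[0], 2), round(residuals[-1], 2), round(calc.baseline, 2))
# prints: 5.42 3.31 6.69
```

Under the same assumptions, a further 150 capped ticks would leave a residual of roughly 0.45 bps: the residual keeps shrinking geometrically but never reaches exactly zero.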

Property checks

| Property | Check | Result |
| --- | --- | --- |
| Warm-up gating | Session 50 (32 ticks, never warm) emits `residual=None, baseline=None` throughout | ✅ PASS |
| Warm-up cross-session continuity | Session 51 was seeded at samples=32, crossed warm at tick 18 (32+18=50) | ✅ PASS |
| Cross-session persistence round-trip | `dump_state` → `set_engine_state` → `get_engine_state` → `seed_baseline` exercised against real StateManager between 50→51 and 51→52; restored values matched | ✅ PASS |
| Exit reachability (Atlas Q4) | Session 51 tick 171: structural fell from +10 to +4.58 → residual = +0.08 bps | ✅ PASS |
| Rail-lock under saturation (Atlas Q5) | Session 52 (38 ticks @ +10): residual drifted 5.42 → 3.31 (monotone toward 0 via EMA pull-up) | ✅ DIRECTIONAL PASS (residual would keep decaying toward 0 over further capped ticks with the 150-tick window; 38 ticks shows the right sign and slope) |
| Baseline staleness (Atlas 24 h cutoff) | Not exercised by this replay; all three sessions are inside the backup's 24 h window. Unit test coverage lives in `test_flag_048_dual_signal.py :: TestCrossSessionPersistenceDbRoundTrip.test_stale_baseline_discarded` | ⚠ Test-only (no replay witness) |
| Rejection of None/NaN input | Not exercised by this replay (data is clean). Unit test coverage: `test_flag_048_dual_signal.py :: TestStructuralBasisUncapped` | ⚠ Test-only (no replay witness) |
| Uncapped structural residual behavior | Cannot be exercised from pre-FLAG-048 data; the column did not exist | ❌ NOT EXERCISED, flagged for post-merge replay |
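
The persistence round-trip and the 24 h staleness cutoff can be sketched as follows. This does not use the real `StateManager` API: the key/value table schema, the helper names `save_baseline` and `load_baseline`, and the timestamps are hypothetical. Only the three key names and the 24 h cutoff come from this report.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

PREFIX = "anchor_ds.basis_baseline_"  # key prefix named in this report
STALE_HOURS = 24.0                    # stale_h from the default config
KEYS = [PREFIX + "bps", PREFIX + "samples", PREFIX + "closed_at"]


def save_baseline(conn: sqlite3.Connection, baseline_bps: float,
                  samples: int, closed_at: datetime) -> None:
    """Persist the dumped baseline under the three engine_state keys."""
    values = [repr(baseline_bps), str(samples), closed_at.isoformat()]
    conn.executemany(
        "INSERT OR REPLACE INTO engine_state (key, value) VALUES (?, ?)",
        list(zip(KEYS, values)),
    )


def load_baseline(conn: sqlite3.Connection,
                  now: datetime) -> Optional[Tuple[float, int]]:
    """Return (baseline, samples), or None if missing or older than the cutoff."""
    state = dict(conn.execute(
        "SELECT key, value FROM engine_state WHERE key IN (?, ?, ?)", KEYS
    ))
    if len(state) < 3:
        return None
    closed_at = datetime.fromisoformat(state[KEYS[2]])
    if now - closed_at > timedelta(hours=STALE_HOURS):
        return None  # stale: the caller should start cold
    return float(state[KEYS[0]]), int(state[KEYS[1]])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE engine_state (key TEXT PRIMARY KEY, value TEXT)")
closed = datetime(2026, 4, 21, 12, 0, tzinfo=timezone.utc)  # illustrative time
save_baseline(conn, 4.5030, 203, closed)
fresh = load_baseline(conn, closed + timedelta(hours=1))
stale = load_baseline(conn, closed + timedelta(hours=25))
```

Here `fresh` carries the restored pair and `stale` is `None`, which mirrors the discard behavior that the unit test listed in the table is described as covering.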

DB reliability note (Atlas ruling follow-through)

This exercise hit Atlas's exact predicted failure mode:

mnt/neo-2026/neo_live_stage1.db → disk I/O error (FUSE/SMB locking)
mnt/uploads/neo_live_stage1.db (direct upload, byte-identical) → database disk image is malformed
Fallback to latest clean backup T165223Z → integrity_check = ok

FLAG-049 (DB-SESSION-SAFEGUARDS — integrity check + automated backups + write-access enforcement) is exactly right; the backup cadence just saved this replay.

For this session I followed the "analysis reads from copies" rule: live file was not opened for writes, and the snapshot copy sits in the sandbox with chmod 444. No changes were made to any file on the live mount.
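
One way to follow the "analysis reads from copies" rule is sketched below, assuming a POSIX-style filesystem. The helper names `sha256_of` and `snapshot_readonly_copy` are hypothetical and are not part of the replay script.

```python
import hashlib
import os
import shutil


def sha256_of(path: str) -> str:
    """Hash a file in 1 MiB chunks so large databases are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot_readonly_copy(src: str, dst: str) -> str:
    """Copy src to dst, mark the copy read-only, and confirm it is byte-identical.

    The source is only ever opened for reading.
    """
    shutil.copy2(src, dst)
    os.chmod(dst, 0o444)  # r--r--r--, matching the chmod 444 noted above
    if sha256_of(src) != sha256_of(dst):
        raise RuntimeError("copy is not byte-identical to source")
    return dst
```

The same hash comparison is one way to confirm that two files, such as a mounted copy and a direct upload, are byte-identical.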

What I need from you

Three options, pick one:

(A) Accept this replay as sufficient for the pre-live gate. Property checks covered by the replay are all passing directionally; remaining gaps (stale baseline, NaN input, uncapped structural) are covered by unit tests in the C5 suite. Under this option you'd sign off and Katja lifts the session hold.

(B) Require a post-merge live capture before sign-off. Merge the branch, start one short live tick loop (≤10 minutes) to capture real structural_basis_bps data via C1's new column, then I replay again against that. This is the only path to a replay that exercises uncapped structural.

(C) Recover session 53 first. I have not explored aggressive recovery on the malformed live DB (the sqlite3 CLI isn't installed in this sandbox and .recover may or may not extract session 53 cleanly). I can try if you think it's necessary, but Atlas's ruling said analysis should read from copies, and no clean copy containing session 53 exists, so this may not yield anything.

My recommendation: Option A, with Option B queued as a follow-on replay after the first post-merge session. The replay above demonstrates both Atlas-critical properties (Q4 exit reachability, Q5 rail-lock) on real per-tick cadence using a real StateManager round-trip. The remaining capped-vs-uncapped gap is a data gap, not a calculator-correctness gap.

Artifacts

| File | Location |
| --- | --- |
| Replay script | `03 Branches/feat-anchor-dual-signal-calibration/flag048_prelive_replay.py` |
| Raw replay output | sandbox: `/sessions/peaceful-admiring-allen/work/flag048-replay/replay_output.txt` (mirrors the numbers in the tables above) |
| Backup used | `mnt/neo-2026/neo_live_stage1.db.bak.20260421T165223Z` (read-only, immutable URI) |

No live-mount writes occurred during this replay.

Standing by.

— Orion