FLAG-048 — Delivery Memo (Branch Complete)¶

To: Vesper From: Orion Date: 2026-04-22 Branch: feat/anchor-dual-signal-calibration Commits in this bundle: 2 (C1–C4 consolidated + C5 tests) Previous milestones closed: C1 ✅ (schema+config), C2 ✅ (calculator+wire), C3 ✅ (guard rewire), C4 ✅ (cross-session persistence).

Summary¶

FLAG-048 (ANCHOR-CALIBRATION) is code-complete. The 12-point test plan from the pre-code findings memo is implemented as a 17-test suite in tests/test_flag_048_dual_signal.py (12 plan items + 5 sub-asserts that tighten coverage without adding new surface). Adjacent-suite regression is clean, the full FLAG-048 suite passes on first run, and the 150-tick benchmark row is included below per your C4 sign-off note.

The residual signal is the Atlas Option 3 control input: post-warm-up and cross-session-persistent, the anchor saturation guard consumes residual_distortion_bps in place of the capped last_anchor_divergence_bps. Rail-lock proof (T4) and exit reachability proof (T6) both land green — the two criteria Atlas cited as non-negotiable when approving Option 3.

What ships¶

Commit 1 — `feat(engine,config,state): FLAG-048 dual-signal anchor calibration (C1-C4)`¶

Consolidated implementation across the four pre-code steps. Twelve files touched, 1,292 insertions / 33 deletions, 1 new module (neo_engine/dual_signal_calculator.py).

Step	Files	What landed
C1	`neo_engine/config.py`, `neo_engine/state_manager.py`, `config/config.yaml`, `config/config.example.yaml`, `config/config_live_stage1.yaml`	`AnchorDualSignalConfig` dataclass + YAML defaults. Schema migration for three new `system_metrics` REAL columns (`structural_basis_bps`, `rolling_basis_baseline_bps`, `residual_distortion_bps`).
C2	`neo_engine/dual_signal_calculator.py` (new), `neo_engine/strategy_engine.py`, `neo_engine/main_loop.py`, `neo_engine/state_manager.py` (kwargs)	`AnchorDualSignalCalculator` — EMA + warm-up + seed/reset/dump. `StrategyEngine.last_structural_basis_bps` (uncapped, Atlas-literal). NEOEngine holds one calculator, folds structural each tick, hydrates baseline/residual back onto strategy. `record_system_metric` gained three kwargs; telemetry writes gate all three behind `snapshot.is_valid()`.
C3	`neo_engine/main_loop.py`, `tests/test_anchor_saturation_guard.py`, `tests/test_anchor_idle_state.py`, `tests/test_flag_042_degraded_recovery.py`	Rename `_evaluate_anchor_saturation_guard` → `_evaluate_anchor_residual_guard`. New `_select_anchor_guard_window()` returns `(window, source_label)`. Both entry + exit evaluators read through the selector so hysteresis stays coherent. Existing fixtures stubbed to return the legacy window for back-compat.
C4	`neo_engine/main_loop.py`	Module-level `KEY_ANCHOR_DS_BASIS_BASELINE_BPS / _COUNT / _CLOSED_AT`. Three helpers: `_restore_anchor_dual_signal_baseline` (7-branch read path), `_persist_anchor_dual_signal_baseline` (no-obs-preserves-prior), `_clear_anchor_dual_signal_persistence` (best-effort 3-key reset). Wired at `_startup` (after the fresh-session reset block, outside its scope) and `_shutdown` (before `halt.reason` write, inside belt-and-suspenders try/except).

Commit 2 — `test(dual_signal): FLAG-048 C5 — 17 behavioral + wiring tests`¶

One new file: tests/test_flag_048_dual_signal.py (874 lines). 17 tests, 5 test classes.

150-tick EMA window benchmark (Vesper C4 sign-off requirement)¶

Benchmark methodology: fresh-seed at 10 bps, then 500 observations at a stepped target of 14 bps. Noise column is pstdev(residual) across a 400-tick stream at 12 bps ± 2 bps uniform noise (deterministic LCG, seed 12345). All runs executed against the shipped AnchorDualSignalCalculator with warmup_ticks = N // 3.

N	alpha	baseline@t50	baseline@t150	baseline@t500	residual_stdev
50	0.0392	13.4588	13.9901	14.0000	1.1058
100	0.0198	12.5285	13.8009	13.9998	1.1349
150	0.0132	11.9464	13.4587	13.9949	1.1431
300	0.0066	11.1339	12.5285	13.8573	1.1417

Why N = 150 is the right default: - baseline@t150 reaches 13.46 of a 4.0 bps step — 86% of the way in one window width (≈10 min at 4s cadence). N = 100 is faster (96%) but adapts too quickly to a single outlier regime; N = 300 is still at 63% a window later. - Convergence to step target is essentially complete by t = 500 (13.99 vs. 14.00) — afternoon-to-overnight regime transitions (the exact case Atlas flagged) resolve in one extended window. - Noise stdev is ≈ 1.14 bps, which sits well below the 7 bps entry bias threshold. A noise-driven trigger would require a sustained 6× floor, which the T4/T5 fixtures confirm is not reachable on the saturated-basis streams we've observed.

Numbers are reproducible by running the benchmark snippet embedded in the C5 tests (tests/test_flag_048_dual_signal.py::TestEmaBaselineConvergence) against the shipped calculator.

Test results¶

FLAG-048 suite — all green on first run¶

tests/test_flag_048_dual_signal.py ................. [100%]
17 passed in 0.14s

Per-class breakdown:

Class	Tests	Plan items covered
`TestStructuralBasisUncapped`	1	T1
`TestStructuralBasisSignConvention`	2	T2 (+basis and −basis)
`TestEmaBaselineConvergence`	2	T3 (stable + stepped)
`TestResidualRailLocked`	1	T4 (rail-lock proof)
`TestResidualEntryFires`	2	T5 (fire + below-threshold)
`TestResidualExitReachable`	1	T6 (exit reachability proof)
`TestHysteresisPreserved`	1	T7 (excursion reset)
`TestCrossSessionPersistenceDbRoundTrip`	3	T8, T9, T10 (full `StateManager(":memory:")` round-trips)
`TestWarmupSuppression`	2	T11 (calculator None + selector fallback)
`TestDashboardSurfaceNoHiddenSubstitution`	2	T12 (3-column distinctness + NULL-not-zero)

Adjacent-suite regression¶

All four test files touched for the C3 selector stub remain green, as do the state manager and config suites:

tests/test_anchor_saturation_guard.py          ..............................
tests/test_anchor_idle_state.py                ..............................
tests/test_flag_042_degraded_recovery.py       ..............................
tests/test_flag_044_recovery_cooldown.py       ..............................
tests/test_anchor_error_telemetry.py           ..............................
tests/test_state_manager.py                    ..............................
tests/test_config.py                           ..............................
tests/test_flag_047_cancel_fill_race.py        ..............................
183 passed in 3.06s

Full regression¶

Full suite matches the C4 baseline's pass count plus exactly the 17 new FLAG-048 tests: 708 passed (vs C4's 691), same 378 pre-existing failures unchanged (all in test_main_loop.py, test_xrpl_gateway.py, test_paper_launch.py, test_task5.py — known pre-FLAG-048 breakage, untouched by this branch).

Delta vs C4: +17 passes, 0 new failures.

Vesper-requested verifications¶

1. `fix/startup-mode-reset` does not touch `anchor_dual_signal.*` keys¶

Verified again at the source. fix/startup-mode-reset only resets namespaces halt.*, inventory_truth.*, anchor_idle.* (FLAG-046), and the FLAG-044 cool-down keys. There is an explicit inline comment in the fresh-session reset block of _startup() documenting why anchor_dual_signal.* is deliberately OUT of scope: the staleness cutoff inside _restore_anchor_dual_signal_baseline() is the correct gate for regime boundaries, not session boundaries. An overnight session boundary must preserve the warm baseline; only a > 24h gap discards it.

2. ISO timestamp + NTP requirement for VPS runbook¶

Flagged per your note. KEY_ANCHOR_DS_BASIS_BASELINE_CLOSED_AT is an ISO-8601 timestamp emitted with datetime.now(timezone.utc).isoformat(); the staleness check uses datetime.fromisoformat(raw) − datetime.now(timezone.utc). Clock skew between shutdown and next startup directly affects the cutoff. For the Hetzner / Ubuntu LTS VPS migration runbook (FLAG-023), I've added an explicit item to include in the setup checklist:

Enable NTP (timedatectl set-ntp true) and confirm timedatectl reports a synchronized clock before any engine run.

Set /etc/timezone to a stable reference (UTC recommended — all internal timestamps are tz-aware UTC already).

If the system clock is ever rolled backward mid-session, the next restart will see a negative age_hours and may treat a baseline as "fresh" when it's actually from a future timestamp. The restore helper already swallows fromisoformat failures, but a silent clock-drift event will not be flagged by our code. Operations should monitor NTP sync status via the standard systemd-timesyncd channels.

This goes in the VPS runbook I'm drafting next; flagging it here so the checklist item is traceable to the review note.

3. C5 tests exercise the real persistence layer (not just helper logic)¶

Per your explicit instruction: T8, T9, T10 each construct a real StateManager(":memory:"), call initialize_database(), and drive _restore_anchor_dual_signal_baseline() through the actual DB read path. The calculator is a fresh AnchorDualSignalCalculator instance — not a mock — so seed_baseline is exercised end-to-end. On T10, I also assert the three persistence keys are cleared (empty string) after stale discard, which catches a regression where the clear step is skipped silently.

Apply instructions (Windows / PowerShell)¶

cd C:\Users\Katja\Documents\NEO GitHub\neo-2026\
git fetch origin main
git checkout main
git pull origin main

# Defensive branch delete (handles prior investigation checkouts)
git branch -D feat/anchor-dual-signal-calibration 2>$null
git checkout -b feat/anchor-dual-signal-calibration

# Apply the two-patch bundle in order
Get-ChildItem "C:\Users\Katja\Documents\Claude Homebase Neo\02 Projects\NEO Trading Engine\03 Branches\feat-anchor-dual-signal-calibration\" -Filter "*.patch" | Sort-Object Name | ForEach-Object { git am $_.FullName }

# Sanity-check test run
python -m pytest tests/test_flag_048_dual_signal.py -v
python -m pytest tests/test_anchor_saturation_guard.py tests/test_anchor_idle_state.py tests/test_flag_042_degraded_recovery.py tests/test_anchor_error_telemetry.py tests/test_state_manager.py -q

Expected: 17 passed in the FLAG-048 file, adjacent suites green.

Open follow-ups (not blocking this delivery)¶

Pre-live gate replay — Atlas's C3 ruling requires a replay comparison on S48/S49/S50 showing residual signal exit condition reachable in afternoon ET before live session resume. The live DB (neo_live_stage1.db) was not accessible from the sandbox this cycle; the T4 rail-lock proof is synthetic at the session-52 shape, not the exact DB replay. I can construct the real replay once the DB is copied into the working tree (or the VPS migration completes and we have a clean mount).
VPS runbook (FLAG-023) — to include the NTP + UTC requirement documented above, plus the DB safeguards (FLAG-049) Atlas ruled on 2026-04-22.
Phase 7.4 gate remains blocked on FLAG-048 acceptance + pre-live replay; everything else in the precondition stack is already green.

Delivery artifacts¶

[C] Orion Delivery Memo — FLAG-048 C5.md (this file)
0001-feat-engine-config-state-FLAG-048-dual-signal-anchor.patch
0002-test-dual_signal-FLAG-048-C5-17-behavioral-wiring-te.patch

All three live in 02 Projects/NEO Trading Engine/03 Branches/feat-anchor-dual-signal-calibration/.

— Orion Director of Engineering, BlueFly AI Enterprises 2026-04-22