FLAG-048 — Delivery Memo (Branch Complete)
To: Vesper
From: Orion
Date: 2026-04-22
Branch: feat/anchor-dual-signal-calibration
Commits in this bundle: 2 (C1–C4 consolidated + C5 tests)
Previous milestones closed: C1 ✅ (schema+config), C2 ✅ (calculator+wire), C3 ✅ (guard rewire), C4 ✅ (cross-session persistence).
Summary
FLAG-048 (ANCHOR-CALIBRATION) is code-complete. The 12-point test plan from the pre-code findings memo is implemented as a 17-test suite in tests/test_flag_048_dual_signal.py (12 plan items + 5 sub-asserts that tighten coverage without adding new surface). Adjacent-suite regression is clean, the full FLAG-048 suite passes on first run, and the 150-tick benchmark row is included below per your C4 sign-off note.
The residual signal is the control input Atlas approved under Option 3. Once warm-up completes, and with the baseline persisted across sessions, the anchor saturation guard consumes residual_distortion_bps in place of the capped last_anchor_divergence_bps. The rail-lock proof (T4) and the exit reachability proof (T6) both land green; these are the two criteria Atlas cited as non-negotiable when approving Option 3.
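A minimal sketch of that selection rule, for readers who have not opened the branch: the guard reads the residual window once one is available and falls back to the legacy capped window otherwise. The function name, the label strings, and the list-of-floats window shape are assumptions for illustration only; the shipped _select_anchor_guard_window() may differ in all three.

```python
from typing import Optional, Sequence, Tuple


def select_guard_window(
    residual_window: Optional[Sequence[float]],
    legacy_window: Sequence[float],
) -> Tuple[Sequence[float], str]:
    """Return (window, source_label) for the guard to evaluate.

    Hypothetical stand-in: residual_window is None while the calculator
    is still warming up and no persisted baseline has been restored.
    """
    if residual_window is not None:
        return residual_window, "residual"
    return legacy_window, "legacy_capped"


# During warm-up the guard keeps reading the legacy capped signal.
window, label = select_guard_window(None, [5.0, 5.0, 5.0])
print(label)
```

Routing both the entry and exit evaluators through one such function is what keeps the two sides of the hysteresis on the same signal.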
What ships
Commit 1 — feat(engine,config,state): FLAG-048 dual-signal anchor calibration (C1-C4)
Consolidated implementation across the four pre-code steps. Twelve files touched, 1,292 insertions / 33 deletions, 1 new module (neo_engine/dual_signal_calculator.py).
| Step | Files | What landed |
|---|---|---|
| C1 | neo_engine/config.py, neo_engine/state_manager.py, config/config.yaml, config/config.example.yaml, config/config_live_stage1.yaml | AnchorDualSignalConfig dataclass + YAML defaults. Schema migration for three new system_metrics REAL columns (structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps). |
| C2 | neo_engine/dual_signal_calculator.py (new), neo_engine/strategy_engine.py, neo_engine/main_loop.py, neo_engine/state_manager.py (kwargs) | AnchorDualSignalCalculator: EMA + warm-up + seed/reset/dump. StrategyEngine.last_structural_basis_bps (uncapped, Atlas-literal). NEOEngine holds one calculator, folds structural each tick, hydrates baseline/residual back onto strategy. record_system_metric gained three kwargs; telemetry writes gate all three behind snapshot.is_valid(). |
| C3 | neo_engine/main_loop.py, tests/test_anchor_saturation_guard.py, tests/test_anchor_idle_state.py, tests/test_flag_042_degraded_recovery.py | Rename _evaluate_anchor_saturation_guard → _evaluate_anchor_residual_guard. New _select_anchor_guard_window() returns (window, source_label). Both entry + exit evaluators read through the selector so hysteresis stays coherent. Existing fixtures stubbed to return the legacy window for back-compat. |
| C4 | neo_engine/main_loop.py | Module-level KEY_ANCHOR_DS_BASIS_BASELINE_BPS / _COUNT / _CLOSED_AT. Three helpers: _restore_anchor_dual_signal_baseline (7-branch read path), _persist_anchor_dual_signal_baseline (no-obs-preserves-prior), _clear_anchor_dual_signal_persistence (best-effort 3-key reset). Wired at _startup (after the fresh-session reset block, outside its scope) and _shutdown (before halt.reason write, inside belt-and-suspenders try/except). |
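To make the C2 row concrete, here is a rough sketch of an EMA baseline with warm-up suppression and a seed hook. It is not the shipped AnchorDualSignalCalculator: the class name, field names, the alpha = 2 / (N + 1) smoothing factor, and the definition residual = structural minus baseline are all assumptions for illustration. Only warmup_ticks = N // 3 and the seed_baseline name come from this memo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DualSignalSketch:
    """Hypothetical stand-in for the EMA + warm-up behaviour described above."""

    window: int = 150                 # EMA span N
    baseline: Optional[float] = None  # rolling baseline in bps
    count: int = 0                    # observations folded so far

    @property
    def alpha(self) -> float:
        # Assumed smoothing factor, consistent with the alpha column
        # of the 150-tick benchmark table in this memo.
        return 2.0 / (self.window + 1)

    @property
    def warmup_ticks(self) -> int:
        return self.window // 3

    def seed_baseline(self, baseline_bps: float, count: int) -> None:
        """Restore a persisted baseline so a new session can skip warm-up."""
        self.baseline = baseline_bps
        self.count = count

    def update(self, structural_bps: float) -> Optional[float]:
        """Fold one observation; return the residual, or None during warm-up."""
        if self.baseline is None:
            self.baseline = structural_bps
        else:
            self.baseline += self.alpha * (structural_bps - self.baseline)
        self.count += 1
        if self.count < self.warmup_ticks:
            return None
        # Assumption: residual = structural basis minus rolling baseline.
        return structural_bps - self.baseline


calc = DualSignalSketch(window=150)
residuals = [calc.update(12.0) for _ in range(calc.warmup_ticks)]
print(residuals[0], residuals[-1])  # None during warm-up, then 0.0 on a flat stream
```

Returning None rather than 0.0 during warm-up is the design point worth preserving: a zero residual would read as "no distortion" to the guard, whereas None forces the fallback path.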
Commit 2 — test(dual_signal): FLAG-048 C5 — 17 behavioral + wiring tests
One new file: tests/test_flag_048_dual_signal.py (874 lines). 17 tests across the 10 test classes listed in the per-class breakdown.
150-tick EMA window benchmark (Vesper C4 sign-off requirement)
Benchmark methodology: fresh-seed at 10 bps, then 500 observations at a stepped target of 14 bps. Noise column is pstdev(residual) across a 400-tick stream at 12 bps ± 2 bps uniform noise (deterministic LCG, seed 12345). All runs executed against the shipped AnchorDualSignalCalculator with warmup_ticks = N // 3.
| N | alpha | baseline@t50 | baseline@t150 | baseline@t500 | residual_stdev |
|---|---|---|---|---|---|
| 50 | 0.0392 | 13.4588 | 13.9901 | 14.0000 | 1.1058 |
| 100 | 0.0198 | 12.5285 | 13.8009 | 13.9998 | 1.1349 |
| 150 | 0.0132 | 11.9464 | 13.4587 | 13.9949 | 1.1431 |
| 300 | 0.0066 | 11.1339 | 12.5285 | 13.8573 | 1.1417 |
Why N = 150 is the right default:
- baseline@t150 reaches 13.46, covering 3.46 of the 4.0 bps step: 86% of the way in one window width (≈10 min at a 4 s cadence). N = 100 is faster (95% at the same tick) but adapts too quickly to a single outlier regime; N = 300 has covered only 63% at the same tick.
- Convergence to step target is essentially complete by t = 500 (13.99 vs. 14.00) — afternoon-to-overnight regime transitions (the exact case Atlas flagged) resolve in one extended window.
- Noise stdev is ≈ 1.14 bps, which sits well below the 7 bps entry bias threshold. A noise-driven trigger would require a sustained excursion of roughly 6× the noise stdev (7 / 1.14 ≈ 6.1), which the T4/T5 fixtures confirm is not reachable on the saturated-basis streams we've observed.
Numbers are reproducible by running the benchmark snippet embedded in the C5 tests (tests/test_flag_048_dual_signal.py::TestEmaBaselineConvergence) against the shipped calculator.
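Independently of the embedded snippet, the step-response columns can be cross-checked with a plain EMA recurrence. The sketch below assumes alpha = 2 / (N + 1), which matches the alpha column, and the standard update baseline += alpha * (target - baseline); the function name is hypothetical and the code does not touch the shipped calculator. The residual_stdev column is not reproduced here because it depends on the LCG noise stream.

```python
def ema_step_response(window: int, seed: float, target: float, ticks: int) -> float:
    """Baseline after `ticks` observations at `target`, starting from `seed`."""
    alpha = 2.0 / (window + 1)  # assumed smoothing factor
    baseline = seed
    for _ in range(ticks):
        baseline += alpha * (target - baseline)
    return baseline


# Rebuild the alpha and baseline@t50 / @t150 / @t500 columns.
for n in (50, 100, 150, 300):
    row = [ema_step_response(n, 10.0, 14.0, t) for t in (50, 150, 500)]
    print(n, round(2.0 / (n + 1), 4), [round(v, 4) for v in row])

# Fraction of the 4.0 bps step covered after 150 ticks.
for n in (100, 150, 300):
    covered = (ema_step_response(n, 10.0, 14.0, 150) - 10.0) / 4.0
    print(n, f"{covered:.0%}")
```

Under these assumptions the recurrence agrees with the table to four decimals, and the coverage at tick 150 comes out at roughly 95%, 86%, and 63% for N = 100, 150, and 300.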
Test results
FLAG-048 suite — all green on first run
Per-class breakdown:
| Class | Tests | Plan items covered |
|---|---|---|
| TestStructuralBasisUncapped | 1 | T1 |
| TestStructuralBasisSignConvention | 2 | T2 (+basis and −basis) |
| TestEmaBaselineConvergence | 2 | T3 (stable + stepped) |
| TestResidualRailLocked | 1 | T4 (rail-lock proof) |
| TestResidualEntryFires | 2 | T5 (fire + below-threshold) |
| TestResidualExitReachable | 1 | T6 (exit reachability proof) |
| TestHysteresisPreserved | 1 | T7 (excursion reset) |
| TestCrossSessionPersistenceDbRoundTrip | 3 | T8, T9, T10 (full StateManager(":memory:") round-trips) |
| TestWarmupSuppression | 2 | T11 (calculator None + selector fallback) |
| TestDashboardSurfaceNoHiddenSubstitution | 2 | T12 (3-column distinctness + NULL-not-zero) |
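The NULL-not-zero property in T12 is easy to picture with a bare sqlite3 table. This is only a sketch: the three column names come from the C1 schema migration, but the table layout, the record helper, and the idea that baseline and residual are None during warm-up are assumptions, and the real writes go through StateManager.record_system_metric rather than raw SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE system_metrics ("
    " id INTEGER PRIMARY KEY,"
    " structural_basis_bps REAL,"
    " rolling_basis_baseline_bps REAL,"
    " residual_distortion_bps REAL)"
)


def record(structural, baseline, residual):
    """Hypothetical writer: pass None through so SQLite stores NULL, never 0.0."""
    conn.execute(
        "INSERT INTO system_metrics"
        " (structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps)"
        " VALUES (?, ?, ?)",
        (structural, baseline, residual),
    )


record(12.5, None, None)  # warm-up: baseline and residual not yet available
record(12.5, 12.0, 0.5)   # post warm-up: three distinct values

rows = conn.execute(
    "SELECT structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps"
    " FROM system_metrics ORDER BY id"
).fetchall()
print(rows)
```

A dashboard reading the first row sees missing data rather than a residual of exactly zero, which is the hidden substitution the test class is named for.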
Adjacent-suite regression
All four test files touched for the C3 selector stub remain green, as do the state manager and config suites:
tests/test_anchor_saturation_guard.py ..............................
tests/test_anchor_idle_state.py ..............................
tests/test_flag_042_degraded_recovery.py ..............................
tests/test_flag_044_recovery_cooldown.py ..............................
tests/test_anchor_error_telemetry.py ..............................
tests/test_state_manager.py ..............................
tests/test_config.py ..............................
tests/test_flag_047_cancel_fill_race.py ..............................
183 passed in 3.06s
Full regression
Full suite matches the C4 baseline's pass count plus exactly the 17 new FLAG-048 tests: 708 passed (vs C4's 691), same 378 pre-existing failures unchanged (all in test_main_loop.py, test_xrpl_gateway.py, test_paper_launch.py, test_task5.py — known pre-FLAG-048 breakage, untouched by this branch).
Delta vs C4: +17 passes, 0 new failures.
Vesper-requested verifications
1. fix/startup-mode-reset does not touch anchor_dual_signal.* keys
Verified again at the source. fix/startup-mode-reset only resets namespaces halt.*, inventory_truth.*, anchor_idle.* (FLAG-046), and the FLAG-044 cool-down keys. There is an explicit inline comment in the fresh-session reset block of _startup() documenting why anchor_dual_signal.* is deliberately OUT of scope: the staleness cutoff inside _restore_anchor_dual_signal_baseline() is the correct gate for regime boundaries, not session boundaries. An overnight session boundary must preserve the warm baseline; only a > 24h gap discards it.
2. ISO timestamp + NTP requirement for VPS runbook
Flagged per your note. KEY_ANCHOR_DS_BASIS_BASELINE_CLOSED_AT is an ISO-8601 timestamp emitted with datetime.now(timezone.utc).isoformat(); the staleness check computes the age as datetime.now(timezone.utc) − datetime.fromisoformat(raw). Clock skew between shutdown and next startup directly affects the cutoff. For the Hetzner / Ubuntu LTS VPS migration runbook (FLAG-023), I've added the following explicit items to the setup checklist:
- Enable NTP (timedatectl set-ntp true) and confirm timedatectl reports a synchronized clock before any engine run.
- Set /etc/timezone to a stable reference (UTC recommended: all internal timestamps are tz-aware UTC already).
- If the system clock is ever rolled backward mid-session, the next restart will see a negative age_hours and may treat a baseline as "fresh" when it's actually from a future timestamp. The restore helper already swallows fromisoformat failures, but a silent clock-drift event will not be flagged by our code. Operations should monitor NTP sync status via the standard systemd-timesyncd channels.
This goes in the VPS runbook I'm drafting next; flagging it here so the checklist item is traceable to the review note.
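For the runbook reader, the age computation and the clock-rollback hazard can be sketched as follows. The function names and the constant are hypothetical; the 24 h cutoff is the one stated in this memo. Note that should_restore here rejects a negative age, which is one possible hardening and not a description of what the shipped restore helper does today.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

STALE_AFTER_HOURS = 24.0  # cutoff stated in this memo


def baseline_age_hours(raw: str, now: Optional[datetime] = None) -> Optional[float]:
    """Age of a persisted baseline in hours, or None if the timestamp is unusable."""
    try:
        closed_at = datetime.fromisoformat(raw)
    except (TypeError, ValueError):
        return None
    if closed_at.tzinfo is None:
        return None  # refuse naive timestamps rather than guess a zone
    now = now or datetime.now(timezone.utc)
    return (now - closed_at).total_seconds() / 3600.0


def should_restore(raw: str, now: Optional[datetime] = None) -> bool:
    """Hypothetical gate: restore only for a non-negative age within the cutoff."""
    age = baseline_age_hours(raw, now)
    return age is not None and 0.0 <= age <= STALE_AFTER_HOURS


now = datetime(2026, 4, 22, 12, 0, tzinfo=timezone.utc)
overnight = (now - timedelta(hours=10)).isoformat()
print(should_restore(overnight, now))  # True: a 10 h gap is within the cutoff
```

A timestamp one hour in the future yields an age of -1.0 here, which is exactly the case a plain "age > 24" comparison would wave through.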
3. C5 tests exercise the real persistence layer (not just helper logic)
Per your explicit instruction: T8, T9, T10 each construct a real StateManager(":memory:"), call initialize_database(), and drive _restore_anchor_dual_signal_baseline() through the actual DB read path. The calculator is a fresh AnchorDualSignalCalculator instance — not a mock — so seed_baseline is exercised end-to-end. On T10, I also assert the three persistence keys are cleared (empty string) after stale discard, which catches a regression where the clear step is skipped silently.
Apply instructions (Windows / PowerShell)
cd "C:\Users\Katja\Documents\NEO GitHub\neo-2026\"
git fetch origin main
git checkout main
git pull origin main
# Defensive branch delete (handles prior investigation checkouts)
git branch -D feat/anchor-dual-signal-calibration 2>$null
git checkout -b feat/anchor-dual-signal-calibration
# Apply the two-patch bundle in order
Get-ChildItem "C:\Users\Katja\Documents\Claude Homebase Neo\02 Projects\NEO Trading Engine\03 Branches\feat-anchor-dual-signal-calibration\" -Filter "*.patch" | Sort-Object Name | ForEach-Object { git am $_.FullName }
# Sanity-check test run
python -m pytest tests/test_flag_048_dual_signal.py -v
python -m pytest tests/test_anchor_saturation_guard.py tests/test_anchor_idle_state.py tests/test_flag_042_degraded_recovery.py tests/test_anchor_error_telemetry.py tests/test_state_manager.py -q
Expected: 17 passed in the FLAG-048 file, adjacent suites green.
Open follow-ups (not blocking this delivery)
- Pre-live gate replay: Atlas's C3 ruling requires a replay comparison on S48/S49/S50 showing residual signal exit condition reachable in afternoon ET before live session resume. The live DB (neo_live_stage1.db) was not accessible from the sandbox this cycle; the T4 rail-lock proof is synthetic at the session-52 shape, not the exact DB replay. I can construct the real replay once the DB is copied into the working tree (or the VPS migration completes and we have a clean mount).
- VPS runbook (FLAG-023): to include the NTP + UTC requirement documented above, plus the DB safeguards (FLAG-049) Atlas ruled on 2026-04-22.
- Phase 7.4 gate remains blocked on FLAG-048 acceptance + pre-live replay; everything else in the precondition stack is already green.
Delivery artifacts
- [C] Orion Delivery Memo — FLAG-048 C5.md (this file)
- 0001-feat-engine-config-state-FLAG-048-dual-signal-anchor.patch
- 0002-test-dual_signal-FLAG-048-C5-17-behavioral-wiring-te.patch
All three live in 02 Projects/NEO Trading Engine/03 Branches/feat-anchor-dual-signal-calibration/.
— Orion Director of Engineering, BlueFly AI Enterprises 2026-04-22