Skip to content

FLAG-048 — Delivery Memo (Branch Complete)

To: Vesper From: Orion Date: 2026-04-22 Branch: feat/anchor-dual-signal-calibration Commits in this bundle: 2 (C1–C4 consolidated + C5 tests) Previous milestones closed: C1 ✅ (schema+config), C2 ✅ (calculator+wire), C3 ✅ (guard rewire), C4 ✅ (cross-session persistence).


Summary

FLAG-048 (ANCHOR-CALIBRATION) is code-complete. The 12-point test plan from the pre-code findings memo is implemented as a 17-test suite in tests/test_flag_048_dual_signal.py (12 plan items + 5 sub-asserts that tighten coverage without adding new surface). Adjacent-suite regression is clean, the full FLAG-048 suite passes on first run, and the 150-tick benchmark row is included below per your C4 sign-off note.

The residual signal is the Atlas Option 3 control input: post-warm-up and cross-session-persistent, the anchor saturation guard consumes residual_distortion_bps in place of the capped last_anchor_divergence_bps. Rail-lock proof (T4) and exit reachability proof (T6) both land green — the two criteria Atlas cited as non-negotiable when approving Option 3.


What ships

Commit 1 — feat(engine,config,state): FLAG-048 dual-signal anchor calibration (C1-C4)

Consolidated implementation across the four pre-code steps. Twelve files touched, 1,292 insertions / 33 deletions, 1 new module (neo_engine/dual_signal_calculator.py).

Step Files What landed
C1 neo_engine/config.py, neo_engine/state_manager.py, config/config.yaml, config/config.example.yaml, config/config_live_stage1.yaml AnchorDualSignalConfig dataclass + YAML defaults. Schema migration for three new system_metrics REAL columns (structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps).
C2 neo_engine/dual_signal_calculator.py (new), neo_engine/strategy_engine.py, neo_engine/main_loop.py, neo_engine/state_manager.py (kwargs) AnchorDualSignalCalculator — EMA + warm-up + seed/reset/dump. StrategyEngine.last_structural_basis_bps (uncapped, Atlas-literal). NEOEngine holds one calculator, folds structural each tick, hydrates baseline/residual back onto strategy. record_system_metric gained three kwargs; telemetry writes gate all three behind snapshot.is_valid().
C3 neo_engine/main_loop.py, tests/test_anchor_saturation_guard.py, tests/test_anchor_idle_state.py, tests/test_flag_042_degraded_recovery.py Rename _evaluate_anchor_saturation_guard_evaluate_anchor_residual_guard. New _select_anchor_guard_window() returns (window, source_label). Both entry + exit evaluators read through the selector so hysteresis stays coherent. Existing fixtures stubbed to return the legacy window for back-compat.
C4 neo_engine/main_loop.py Module-level KEY_ANCHOR_DS_BASIS_BASELINE_BPS / _COUNT / _CLOSED_AT. Three helpers: _restore_anchor_dual_signal_baseline (7-branch read path), _persist_anchor_dual_signal_baseline (no-obs-preserves-prior), _clear_anchor_dual_signal_persistence (best-effort 3-key reset). Wired at _startup (after the fresh-session reset block, outside its scope) and _shutdown (before halt.reason write, inside belt-and-suspenders try/except).

Commit 2 — test(dual_signal): FLAG-048 C5 — 17 behavioral + wiring tests

One new file: tests/test_flag_048_dual_signal.py (874 lines). 17 tests, 5 test classes.


150-tick EMA window benchmark (Vesper C4 sign-off requirement)

Benchmark methodology: fresh-seed at 10 bps, then 500 observations at a stepped target of 14 bps. Noise column is pstdev(residual) across a 400-tick stream at 12 bps ± 2 bps uniform noise (deterministic LCG, seed 12345). All runs executed against the shipped AnchorDualSignalCalculator with warmup_ticks = N // 3.

N alpha baseline@t50 baseline@t150 baseline@t500 residual_stdev
50 0.0392 13.4588 13.9901 14.0000 1.1058
100 0.0198 12.5285 13.8009 13.9998 1.1349
150 0.0132 11.9464 13.4587 13.9949 1.1431
300 0.0066 11.1339 12.5285 13.8573 1.1417

Why N = 150 is the right default: - baseline@t150 reaches 13.46 of a 4.0 bps step — 86% of the way in one window width (≈10 min at 4s cadence). N = 100 is faster (96%) but adapts too quickly to a single outlier regime; N = 300 is still at 63% a window later. - Convergence to step target is essentially complete by t = 500 (13.99 vs. 14.00) — afternoon-to-overnight regime transitions (the exact case Atlas flagged) resolve in one extended window. - Noise stdev is ≈ 1.14 bps, which sits well below the 7 bps entry bias threshold. A noise-driven trigger would require a sustained 6× floor, which the T4/T5 fixtures confirm is not reachable on the saturated-basis streams we've observed.

Numbers are reproducible by running the benchmark snippet embedded in the C5 tests (tests/test_flag_048_dual_signal.py::TestEmaBaselineConvergence) against the shipped calculator.


Test results

FLAG-048 suite — all green on first run

tests/test_flag_048_dual_signal.py ................. [100%]
17 passed in 0.14s

Per-class breakdown:

Class Tests Plan items covered
TestStructuralBasisUncapped 1 T1
TestStructuralBasisSignConvention 2 T2 (+basis and −basis)
TestEmaBaselineConvergence 2 T3 (stable + stepped)
TestResidualRailLocked 1 T4 (rail-lock proof)
TestResidualEntryFires 2 T5 (fire + below-threshold)
TestResidualExitReachable 1 T6 (exit reachability proof)
TestHysteresisPreserved 1 T7 (excursion reset)
TestCrossSessionPersistenceDbRoundTrip 3 T8, T9, T10 (full StateManager(":memory:") round-trips)
TestWarmupSuppression 2 T11 (calculator None + selector fallback)
TestDashboardSurfaceNoHiddenSubstitution 2 T12 (3-column distinctness + NULL-not-zero)

Adjacent-suite regression

All four test files touched for the C3 selector stub remain green, as do the state manager and config suites:

tests/test_anchor_saturation_guard.py          ..............................
tests/test_anchor_idle_state.py                ..............................
tests/test_flag_042_degraded_recovery.py       ..............................
tests/test_flag_044_recovery_cooldown.py       ..............................
tests/test_anchor_error_telemetry.py           ..............................
tests/test_state_manager.py                    ..............................
tests/test_config.py                           ..............................
tests/test_flag_047_cancel_fill_race.py        ..............................
183 passed in 3.06s

Full regression

Full suite matches the C4 baseline's pass count plus exactly the 17 new FLAG-048 tests: 708 passed (vs C4's 691), same 378 pre-existing failures unchanged (all in test_main_loop.py, test_xrpl_gateway.py, test_paper_launch.py, test_task5.py — known pre-FLAG-048 breakage, untouched by this branch).

Delta vs C4: +17 passes, 0 new failures.


Vesper-requested verifications

1. fix/startup-mode-reset does not touch anchor_dual_signal.* keys

Verified again at the source. fix/startup-mode-reset only resets namespaces halt.*, inventory_truth.*, anchor_idle.* (FLAG-046), and the FLAG-044 cool-down keys. There is an explicit inline comment in the fresh-session reset block of _startup() documenting why anchor_dual_signal.* is deliberately OUT of scope: the staleness cutoff inside _restore_anchor_dual_signal_baseline() is the correct gate for regime boundaries, not session boundaries. An overnight session boundary must preserve the warm baseline; only a > 24h gap discards it.

2. ISO timestamp + NTP requirement for VPS runbook

Flagged per your note. KEY_ANCHOR_DS_BASIS_BASELINE_CLOSED_AT is an ISO-8601 timestamp emitted with datetime.now(timezone.utc).isoformat(); the staleness check uses datetime.fromisoformat(raw) − datetime.now(timezone.utc). Clock skew between shutdown and next startup directly affects the cutoff. For the Hetzner / Ubuntu LTS VPS migration runbook (FLAG-023), I've added an explicit item to include in the setup checklist:

  • Enable NTP (timedatectl set-ntp true) and confirm timedatectl reports a synchronized clock before any engine run.
  • Set /etc/timezone to a stable reference (UTC recommended — all internal timestamps are tz-aware UTC already).
  • If the system clock is ever rolled backward mid-session, the next restart will see a negative age_hours and may treat a baseline as "fresh" when it's actually from a future timestamp. The restore helper already swallows fromisoformat failures, but a silent clock-drift event will not be flagged by our code. Operations should monitor NTP sync status via the standard systemd-timesyncd channels.

This goes in the VPS runbook I'm drafting next; flagging it here so the checklist item is traceable to the review note.

3. C5 tests exercise the real persistence layer (not just helper logic)

Per your explicit instruction: T8, T9, T10 each construct a real StateManager(":memory:"), call initialize_database(), and drive _restore_anchor_dual_signal_baseline() through the actual DB read path. The calculator is a fresh AnchorDualSignalCalculator instance — not a mock — so seed_baseline is exercised end-to-end. On T10, I also assert the three persistence keys are cleared (empty string) after stale discard, which catches a regression where the clear step is skipped silently.


Apply instructions (Windows / PowerShell)

cd C:\Users\Katja\Documents\NEO GitHub\neo-2026\
git fetch origin main
git checkout main
git pull origin main

# Defensive branch delete (handles prior investigation checkouts)
git branch -D feat/anchor-dual-signal-calibration 2>$null
git checkout -b feat/anchor-dual-signal-calibration

# Apply the two-patch bundle in order
Get-ChildItem "C:\Users\Katja\Documents\Claude Homebase Neo\02 Projects\NEO Trading Engine\03 Branches\feat-anchor-dual-signal-calibration\" -Filter "*.patch" | Sort-Object Name | ForEach-Object { git am $_.FullName }

# Sanity-check test run
python -m pytest tests/test_flag_048_dual_signal.py -v
python -m pytest tests/test_anchor_saturation_guard.py tests/test_anchor_idle_state.py tests/test_flag_042_degraded_recovery.py tests/test_anchor_error_telemetry.py tests/test_state_manager.py -q

Expected: 17 passed in the FLAG-048 file, adjacent suites green.


Open follow-ups (not blocking this delivery)

  1. Pre-live gate replay — Atlas's C3 ruling requires a replay comparison on S48/S49/S50 showing residual signal exit condition reachable in afternoon ET before live session resume. The live DB (neo_live_stage1.db) was not accessible from the sandbox this cycle; the T4 rail-lock proof is synthetic at the session-52 shape, not the exact DB replay. I can construct the real replay once the DB is copied into the working tree (or the VPS migration completes and we have a clean mount).
  2. VPS runbook (FLAG-023) — to include the NTP + UTC requirement documented above, plus the DB safeguards (FLAG-049) Atlas ruled on 2026-04-22.
  3. Phase 7.4 gate remains blocked on FLAG-048 acceptance + pre-live replay; everything else in the precondition stack is already green.

Delivery artifacts

  • [C] Orion Delivery Memo — FLAG-048 C5.md (this file)
  • 0001-feat-engine-config-state-FLAG-048-dual-signal-anchor.patch
  • 0002-test-dual_signal-FLAG-048-C5-17-behavioral-wiring-te.patch

All three live in 02 Projects/NEO Trading Engine/03 Branches/feat-anchor-dual-signal-calibration/.

— Orion Director of Engineering, BlueFly AI Enterprises 2026-04-22