[C] Orion Investigation Report — FLAG 053 EMA During ANCHOR IDLE
Brief received: 07 Agent Coordination/[C] Orion Investigation Brief — EMA Update During ANCHOR_IDLE.md.
Investigation scope: three questions, file paths and line numbers, no code changes. Repo: fix/anchor-idle-guard-semantics tip (post-S54 main state is structurally identical for these files — the guard-semantics fixes do not touch calculate_quote, observe, or _evaluate_anchor_idle_exit).
Executive Summary
The EMA is NOT state-gated by ANCHOR_IDLE. The deadlock hypothesis in the brief — that ANCHOR_IDLE prevents quoting → no EMA updates → stuck residual — is refuted by the code. observe() runs every tick where snapshot.is_valid() is True, which is the only precondition for reaching Step 8 in _tick. ANCHOR_IDLE does not block Step 8.
The real deadlock is in the exit evaluator's window source, not the EMA. The exit evaluator routes through _select_anchor_guard_window, which uses the legacy capped _anchor_error_window until the residual window is fully populated (requires warmup_ticks + residual_hysteresis_lookback_ticks = 50 + 20 = 70 valid observations). Against the capped signal, mean magnitude sits at ±10 bps in a saturated regime — far above the 4-bps exit bias threshold — so the exit test cannot pass, stability ticks never accumulate, and ANCHOR_IDLE cannot exit regardless of how long it runs. The EMA can converge perfectly and the engine still cannot exit on that source.
The dashboard/log sign mismatch is a three-way sign-convention zoo, not a bug. Three different signals carry overlapping but non-identical definitions (clob_vs_amm_divergence_bps in market_data logs, structural_basis_bps on the EMA side, last_anchor_divergence_bps on the dashboard side). The dashboard's +9.56 bps and the tick log's −11 to −13 bps describe the SAME regime (amm ~11–13 bps above clob_mid) with opposite numerator ordering, and the dashboard value is additionally clamped at the +10 bps cap.
Atlas must decide whether to (a) tighten the exit evaluator's behavior during the legacy-window phase, (b) accelerate the residual-window handover, (c) reconcile the sign conventions across the three metric paths, or (d) pursue a combination of these.
Q1 — EMA update gating
Finding: UNCONDITIONAL. Not state-gated by ANCHOR_IDLE.
The observe() method itself — no engine knowledge
neo_engine/dual_signal_calculator.py:176-241:
    def observe(
        self, structural_basis_bps: Optional[float]
    ) -> DualSignalObservation:
        if structural_basis_bps is None:
            # skip tick, no state advance
            return DualSignalObservation(...)
        if self._baseline is None:
            self._baseline = float(structural_basis_bps)
        else:
            self._baseline = (
                self._alpha * float(structural_basis_bps)
                + (1.0 - self._alpha) * self._baseline
            )
        self._observations_since_seed += 1
        ...
observe() has NO awareness of engine mode. It takes a float (or None), advances internal EMA state, increments its observation counter, and returns a DualSignalObservation. The only gate inside it is structural_basis_bps is None, which is a local skip-tick signal from the caller, not a mode check.
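To make the "no mode check" point concrete, here is a minimal, self-contained sketch of the same update rule. It is not the engine's class: the name EmaSketch and the alpha value of 0.1 are assumptions for illustration (the configured alpha is not stated in this report), and the counter placement follows my reading of the excerpt above.

```python
from typing import Optional


class EmaSketch:
    """Hypothetical stand-in for the calculator's EMA state (not the real class)."""

    def __init__(self, alpha: float) -> None:
        self._alpha = alpha
        self._baseline: Optional[float] = None
        self.observations = 0

    def observe(self, structural_basis_bps: Optional[float]) -> Optional[float]:
        # A None input is the only skip condition; there is no mode argument at all.
        if structural_basis_bps is None:
            return self._baseline
        if self._baseline is None:
            self._baseline = float(structural_basis_bps)  # seed on the first value
        else:
            self._baseline = (
                self._alpha * float(structural_basis_bps)
                + (1.0 - self._alpha) * self._baseline
            )
        self.observations += 1
        return self._baseline


ema = EmaSketch(alpha=0.1)  # alpha is an assumed value
ema.observe(-8.0)           # seeds the baseline at -8.0
ema.observe(None)           # skipped: counter and baseline unchanged
for _ in range(50):
    ema.observe(-12.0)      # baseline decays toward -12.0
baseline = ema.observe(None)  # read back the current baseline without advancing
```

Under these assumptions the counter ends at 51 and the baseline sits within a few hundredths of a bp of −12.0. Nothing in the sketch could distinguish ANCHOR_IDLE from ACTIVE, which is the property Q1 asks about.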
The sole observe() call site
neo_engine/main_loop.py:4312 (inside Step 8):
    # ... main_loop.py:4280
    if snapshot.is_valid():
        intents = self._strategy.calculate_quote(snapshot, inventory)
        intents_generated = len(intents)
        intent_sides = [intent.side.value for intent in intents]
        # ... anchor divergence observation (capped signal) ...
        # ... comment block about FLAG-048 ...
        # main_loop.py:4312
        dual_obs = self._dual_signal_calculator.observe(
            self._strategy.last_structural_basis_bps
        )
        # ... write-back of baseline/residual to strategy attrs ...
        # ... residual window append (main_loop.py:4339-4345) ...
There is exactly ONE observe() call site in the entire repo (verified by grep). It is nested inside if snapshot.is_valid(): — nothing else gates it. No if not MODE_ANCHOR_IDLE, no ACTIVE check, no DEGRADED guard.
Upstream: where last_structural_basis_bps comes from
neo_engine/strategy_engine.py:236-239:
    if mid_price > 0 and amm_price is not None and amm_price > 0:
        self.last_structural_basis_bps = (
            ((mid_price - amm_price) / mid_price) * 10000.0
        )
    else:
        self.last_structural_basis_bps = None
This is inside calculate_quote() (entered at strategy_engine.py:143; early-returns at line 163 only on not snapshot.is_valid()). calculate_quote() is called unconditionally from main_loop.py:4281 whenever snapshot.is_valid(). No mode gating.
_tick() early-return paths — does ANCHOR_IDLE skip Step 8?
neo_engine/main_loop.py:3976 (top of _tick). Skim of the early-return paths:
- DEGRADED + pending-truth timeout → HALT (not applicable to ANCHOR_IDLE).
- MODE_HALT current → return False.
- Risk HALT from risk manager → return False.
- ReplayExhausted in paper-replay path → return False.
- account_offers failure → return False.
- Reconciler HALTED result → return False.
None of these paths trigger on MODE_ANCHOR_IDLE alone. ANCHOR_IDLE does gate order SUBMISSION in Step 9 (via the truth gate), but Step 8 (which contains both calculate_quote and observe) runs to completion on every valid-snapshot tick regardless of mode.
Verdict
Vesper's deadlock hypothesis for Q1 is incorrect. The EMA updates every valid-snapshot tick. In S55, if the engine saw 50+ ticks while in ANCHOR_IDLE and snapshot.is_valid() held throughout (which is consistent with CLAUDE.md's notes that intents_generated: 2 every tick and CLOB was healthy), then self._dual_signal_calculator._observations_since_seed advanced 50+ times, _baseline converged toward the session's structural basis, is_warm flipped True at tick ~50, and a real residual started materialising at tick ~51.
That means FLAG-053's actual failure mechanism must live downstream of the EMA, in the exit evaluator and/or its window selection.
Q2 — ANCHOR_IDLE exit condition
Finding: seven predicates + 30-tick hysteresis. Window source is the critical variable; against the legacy capped signal, exit is mathematically impossible in a cap-saturated regime.
The exit evaluator
neo_engine/main_loop.py:2887-2997, called at Step 8.4 from main_loop.py:4478 (inside if snapshot.is_valid():).
Predicates (in evaluation order):
| # | Predicate | Line | Source |
|---|---|---|---|
| 1 | cfg.enabled | 2942 | anchor_saturation_guard.enabled (config_live_stage1.yaml) |
| 2 | cfg.recovery_enabled | 2942 | anchor_saturation_guard.recovery_enabled (kill switch) |
| 3 | self._current_truth_mode() == MODE_ANCHOR_IDLE | 2944 | only fires when in ANCHOR_IDLE |
| 4 | window.maxlen is not None and len(window) >= window.maxlen | 2961 | window must be full |
| 5 | abs(mean(window)) < cfg.recovery_exit_bias_threshold_bps | 2971 | config = 4.0 bps |
| 6 | prev_pct < cfg.recovery_exit_prevalence_pct | 2972 | config = 30% (pct of window with \|x\| > prevalence_threshold_bps = 5 bps) |
| 7 | Stability ticks counter self._anchor_idle_stability_ticks >= cfg.recovery_stability_ticks | 2976-2978 | config = 30 consecutive ticks |
Any failure on (5) or (6) resets the stability counter to 0 (lines 2993-2997 — hysteresis). Any excursion wipes the progress.
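The interaction between predicates 5 to 7 and the hysteresis reset can be sketched as a small pure function. This is an illustration under assumptions, not the code at main_loop.py:2887-2997: the names ExitConfig and step_exit are hypothetical, the thresholds are the config values quoted in the table, and the increment-then-compare ordering of the stability counter is assumed.

```python
from dataclasses import dataclass
from statistics import fmean
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ExitConfig:
    # Values quoted in the predicate table; field names are shortened for the sketch.
    bias_threshold_bps: float = 4.0
    prevalence_threshold_bps: float = 5.0
    exit_prevalence_pct: float = 30.0
    stability_ticks: int = 30


def step_exit(
    window: Sequence[float], stability: int, cfg: ExitConfig
) -> Tuple[int, bool]:
    """Apply predicates 5-7 to a full window; return (new_stability, should_exit)."""
    bias_ok = abs(fmean(window)) < cfg.bias_threshold_bps
    over = sum(abs(x) > cfg.prevalence_threshold_bps for x in window)
    prev_pct = 100.0 * over / len(window)
    prevalence_ok = prev_pct < cfg.exit_prevalence_pct
    if not (bias_ok and prevalence_ok):
        return 0, False  # hysteresis: any excursion wipes accumulated progress
    stability += 1  # assumed ordering: increment first, then compare
    return stability, stability >= cfg.stability_ticks


cfg = ExitConfig()
calm = [1.0] * 20   # benign window: small mean, nothing above 5 bps
hot = [10.0] * 20   # cap-saturated window

stability, done = 0, False
for _ in range(29):
    stability, done = step_exit(calm, stability, cfg)
after_29 = (stability, done)            # 29 good ticks, not yet enough to exit
reset = step_exit(hot, stability, cfg)  # a single hot tick resets the counter
```

In this sketch, 29 benign ticks followed by one saturated tick puts the counter back at zero, so the next exit opportunity is a full 30 ticks away.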
The window source selector — the critical detail
neo_engine/main_loop.py:2729-2766:
    def _select_anchor_guard_window(self) -> tuple[deque, str]:
        ads_cfg = self._config.anchor_dual_signal
        res_window = self._anchor_residual_window
        if (
            ads_cfg.enabled
            and res_window.maxlen is not None
            and len(res_window) >= res_window.maxlen
        ):
            return res_window, "residual_distortion_bps"
        return self._anchor_error_window, "last_anchor_divergence_bps"
Two sources:
- Residual window (_anchor_residual_window, maxlen = residual_hysteresis_lookback_ticks = 20, config_live_stage1.yaml). Only chosen once fully populated. Residual values are fed only when dual_obs.is_warm AND dual_obs.residual_distortion_bps is not None (main_loop.py:4339-4345). Since is_warm flips True at _observations_since_seed >= warmup_ticks = 50, the residual window starts filling at the 51st valid observation and reaches maxlen at the 70th. Only from tick ~70 onward does the exit evaluator use this source.
- Legacy capped window (_anchor_error_window, maxlen = anchor_saturation_guard.lookback_ticks = 25). Fed from last_anchor_divergence_bps (main_loop.py:4294), which is the CAPPED quote-anchor-vs-mid deflection (strategy_engine.py:221), bounded at ±anchor_max_divergence_bps = 10 bps.
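The handover timing for the residual source can be checked with a short simulation. This is a sketch under stated assumptions: every tick carries a valid observation, the first residual lands on the 51st observation (matching the count given above), and the function name first_residual_source_tick is hypothetical.

```python
from collections import deque
from typing import Optional

WARMUP_TICKS = 50       # warmup_ticks, per the report
RESIDUAL_LOOKBACK = 20  # residual_hysteresis_lookback_ticks, per the report


def first_residual_source_tick(total_ticks: int) -> Optional[int]:
    """Return the first tick at which the residual window is full, else None."""
    residual_window: deque = deque(maxlen=RESIDUAL_LOOKBACK)
    for tick in range(1, total_ticks + 1):
        # Assumption: residuals are appended only after warm-up completes,
        # so the first append happens on observation 51.
        if tick > WARMUP_TICKS:
            residual_window.append(0.0)  # placeholder residual value
        if len(residual_window) >= RESIDUAL_LOOKBACK:
            return tick
    return None


handover_tick = first_residual_source_tick(84)  # S55-length session
short_session = first_residual_source_tick(60)  # ends before the window fills
```

With these inputs the selector would switch to the residual source at tick 70, and a 60-tick session would never leave the legacy window. Any invalid snapshot pushes the handover later by one tick.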
Why S55 could not exit on the legacy source
In a regime where amm is persistently outside the ±10 bps cap relative to clob_mid, last_cap_applied is True on nearly every tick (dashboard recorded 82% in S55). When capped, last_anchor_divergence_bps = ±10.0 bps exactly. The legacy window therefore holds values near the cap ceiling almost uniformly. In that state:
- |mean(window)| ≈ 10 bps, versus the recovery_exit_bias_threshold_bps = 4.0 test.
- prev_pct (share of |x| > 5 bps) ≈ 100%, versus the recovery_exit_prevalence_pct = 30% test.
Both bias and prevalence fail on every tick. Stability counter never increments. Exit is unreachable on this source, no matter how long ANCHOR_IDLE runs.
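A quick numeric check shows how far from passing the legacy window is. The window below is hypothetical and deliberately favourable to exit: 21 of 25 ticks are pinned at the +10 bps cap (84%, close to the 82% observed) and the 4 uncapped ticks are set to exactly 0.0, which the real session is unlikely to have produced.

```python
from statistics import fmean

BIAS_THRESHOLD_BPS = 4.0        # recovery_exit_bias_threshold_bps
PREVALENCE_THRESHOLD_BPS = 5.0  # prevalence_threshold_bps
EXIT_PREVALENCE_PCT = 30.0      # recovery_exit_prevalence_pct

# Hypothetical 25-tick legacy window; the uncapped values are invented.
window = [10.0] * 21 + [0.0] * 4

mean_bps = fmean(window)
over = sum(abs(x) > PREVALENCE_THRESHOLD_BPS for x in window)
prev_pct = 100.0 * over / len(window)

bias_ok = abs(mean_bps) < BIAS_THRESHOLD_BPS
prevalence_ok = prev_pct < EXIT_PREVALENCE_PCT
print(f"mean={mean_bps:.1f} bps bias_ok={bias_ok}")
print(f"prevalence={prev_pct:.0f}% prevalence_ok={prevalence_ok}")
```

Even in this best case the mean is 8.4 bps against a 4.0 bps threshold and prevalence is 84% against 30%, so both tests fail by a wide margin.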
S55 tick-count reachability
CLAUDE.md: S55 elapsed ~7 minutes, ANCHOR_IDLE entered at tick ~3, never exited. At ~5 s/tick on the live stage1 config, 7 min ≈ 84 ticks. That is just barely past the 70-tick threshold where the residual window becomes ready as the exit source. It is plausible that:
(a) The residual window never filled in S55, or filled too late to matter, because the session ended too early: SIGINT at tick ~84 leaves at most ~14 ticks of residual-source routing, and an exit still needs 30 consecutive stability ticks on top of that.
(b) OR: residual window DID fill around tick ~70, but the regime was still hostile enough on that source (e.g. EMA hadn't converged tightly enough for residual to sit inside the 4-bps bias band) — stability counter started accumulating but never reached 30 before SIGINT.
(c) OR: there's a secondary issue, e.g. the cross-session persistence of baseline left the EMA primed from a prior regime, so the residual was not near zero on entry to the residual-source phase. This is the FLAG-051-style regime-drift risk. FLAG-051 forces a cold start when abs(structural − persisted_baseline) > 10. If S55's structural was around −12 bps and the persisted baseline was around −11 bps (within 10), there would be no cold start and the residual would start near zero immediately; if the persisted baseline was stale from a different regime, the behavior would differ.
Confirming which of (a), (b), (c) applies needs DB inspection of S55 tick data: structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps, and anchor_error_bps per tick. I did NOT run that inspection per the "investigation only" scope — flagging it so Atlas can direct.
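Independently of which branch applies, the tick budget can be worked through directly. The sketch below uses only figures quoted in this report and assumes that every tick was a valid observation and that stability ticks start counting at the residual handover (off by one at most, depending on whether the handover tick itself counts).

```python
WARMUP_TICKS = 50             # warmup_ticks
RESIDUAL_LOOKBACK_TICKS = 20  # residual_hysteresis_lookback_ticks
STABILITY_TICKS = 30          # recovery_stability_ticks
SECONDS_PER_TICK = 5.0        # approximate live stage1 cadence
SESSION_SECONDS = 7 * 60      # S55 ran for roughly 7 minutes

session_ticks = int(SESSION_SECONDS / SECONDS_PER_TICK)
residual_ready_tick = WARMUP_TICKS + RESIDUAL_LOOKBACK_TICKS

# Stability ticks cannot accumulate on the saturated legacy window, so the
# best case starts counting once the residual source takes over.
earliest_exit_tick = residual_ready_tick + STABILITY_TICKS
shortfall_ticks = earliest_exit_tick - session_ticks

print(f"session={session_ticks} ready={residual_ready_tick} "
      f"earliest_exit={earliest_exit_tick} shortfall={shortfall_ticks}")
```

On these numbers the earliest possible exit is around tick 100, roughly 16 ticks (about 80 seconds) after S55 was interrupted. The DB inspection would therefore tell us what the residual looked like, but not change the fact that S55 was too short to exit.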
Verdict
The exit evaluator's design is correct in isolation, but has a real edge case: a saturated cap regime plus an ANCHOR_IDLE entry that fires BEFORE the residual window is warm creates an un-exitable state for the first ~70 valid ticks. The "stickiness" scales with cap-saturation percentage and is worst at 100% cap-locked (S49/S50-style hostile regimes). This is the deadlock mechanism for FLAG-053.
Q3 — Dashboard signal mismatch
Finding: three different sign conventions across three signals. Not a bug — a metric-label ambiguity. The +9.56 bps dashboard value and the −11 to −13 bps tick log are consistent descriptions of the same regime.
The dashboard metric — where it's sourced
- Dashboard widget: dashboard.py:2283 reads anchor.mean_bps from engine_state via _sf(es, "anchor.mean_bps").
- Writer: neo_engine/main_loop.py:5763, self._state.set_engine_state("anchor.mean_bps", str(mean_bps)) inside _log_anchor_divergence_summary() (main_loop.py:5683-5773).
- Underlying data: self._anchor_divergence_obs (main_loop.py:5685), populated at main_loop.py:4286-4288.
- Ultimate source: strategy_engine.py:221
This is the CAPPED quote-anchor-vs-mid deflection, bounded at ±anchor_max_divergence_bps = 10 bps.
Sign convention: positive when quote_anchor_price > mid_price — i.e. when the capped amm reference lies above clob_mid. In the capped_amm branch, cap_upper is chosen when amm_price > upper_bound = mid * (1 + cap_frac), i.e. amm is above mid. So dashboard-positive = amm-above-mid = cap_upper applied.
The 82% cap-lock: _anchor_cap_applied_count / valid_count × 100 (main_loop.py:5738), derived from last_cap_applied (strategy_engine.py:197, 201). True whenever amm is beyond either cap bound. Correct metric, correct source.
The tick-log "CLOB-AMM divergence" — a different signal
neo_engine/market_data.py:188-190:
    if amm_price is not None and amm_price > 0:
        clob_vs_amm_divergence_bps = (
            ((clob_mid_price - amm_price) / amm_price) * 10000.0
        )
Sign convention: positive when clob_mid > amm_price — i.e. amm is BELOW clob_mid.
The FLAG-048 EMA input — a third signal
neo_engine/strategy_engine.py:236-239:
    if mid_price > 0 and amm_price is not None and amm_price > 0:
        self.last_structural_basis_bps = (
            ((mid_price - amm_price) / mid_price) * 10000.0
        )
Sign convention: positive when mid_price > amm_price, the same sign as the market_data log field but with a different denominator. The two values differ only by the factor mid/amm, which is about 0.1% (roughly 0.01 to 0.02 bps) at the 11 to 13 bps basis seen here, so the divisor difference is negligible for bps-scale comparisons.
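And the fourth convention — the scratch variable
strategy_engine.py:190 (used only inside capped_amm cap decision, not persisted anywhere):
Formula, as listed in the reconciliation table: (amm − mid) / mid × 10000. Opposite sign to the two log-side signals above (clob_vs_amm_divergence_bps and structural_basis_bps), and the same orientation as the dashboard's last_anchor_divergence_bps, but uncapped. Kept in a local variable, dropped after the cap decision. It is logged (strategy_engine.py:461) for tick-level anchor diagnostics under the key raw_divergence_bps, distinct from clob_vs_amm_divergence_bps.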
Reconciling S55's numbers
The brief states the tick log showed CLOB-AMM divergence consistently −11 to −13 bps. Under the market_data.py definition ((clob_mid − amm)/amm), negative means amm is above clob_mid by ~11–13 bps. From that single observation:
| Signal | Formula | Expected value in S55 |
|---|---|---|
| clob_vs_amm_divergence_bps (log) | (clob_mid − amm) / amm × 10000 | −11 to −13 bps ✓ (given) |
| raw_divergence_bps (strategy scratch) | (amm − mid) / mid × 10000 | +11 to +13 bps (amm above) |
| last_anchor_divergence_bps (dashboard) | (quote_anchor − mid) / mid × 10000, capped | +10 bps (upper cap hit; avg +9.56 observed → consistent) |
| structural_basis_bps (EMA input) | (mid − amm) / mid × 10000 | −11 to −13 bps |
| 82% cap-lock | share of ticks with last_cap_applied True | ≥ 82% (cap upper hit on most ticks) ✓ (given) |
Every observed dashboard number is consistent with the code. The "opposite signs" impression from the brief comes from comparing the log's (mid − amm)/amm signed value against the dashboard's (quote_anchor − mid)/mid signed value — different numerators and different denominators. They describe the same geometry from opposite perspectives.
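The reconciliation can be reproduced for a single hypothetical tick. The sketch below evaluates the four formulas for one pair of prices. The prices are invented (AMM 12 bps above CLOB mid), the function name signals is hypothetical, and the symmetric clamp is an assumption: only the upper-bound branch of the cap is described in this report.

```python
from typing import Dict

CAP_BPS = 10.0  # anchor_max_divergence_bps, per the report


def signals(clob_mid: float, amm: float) -> Dict[str, float]:
    """Evaluate the four divergence formulas for one (mid, amm) pair."""
    cap_frac = CAP_BPS / 10000.0
    # Assumed symmetric clamp of the AMM reference into the band around mid.
    lower, upper = clob_mid * (1 - cap_frac), clob_mid * (1 + cap_frac)
    quote_anchor = min(max(amm, lower), upper)
    return {
        "clob_vs_amm_divergence_bps": (clob_mid - amm) / amm * 10000.0,
        "structural_basis_bps": (clob_mid - amm) / clob_mid * 10000.0,
        "raw_divergence_bps": (amm - clob_mid) / clob_mid * 10000.0,
        "last_anchor_divergence_bps": (quote_anchor - clob_mid) / clob_mid * 10000.0,
    }


# Hypothetical prices: AMM sits 12 bps above CLOB mid.
s = signals(clob_mid=1.0, amm=1.0012)
for name, value in s.items():
    print(f"{name:>28}: {value:+.2f}")
```

For these inputs the two log-side signals come out near −12 bps, the scratch variable near +12 bps, and the capped dashboard signal at +10 bps: one price geometry, two sign orientations, and one clamp.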
Checklist against the brief's three sub-questions
- "Is the dashboard metric reading stale EMA data from prior sessions (cross-session persistence, FLAG-048)?" No. anchor.mean_bps is written at shutdown from _anchor_divergence_obs, which is a per-session list fed in Step 8 from last_anchor_divergence_bps (legacy capped signal). The FLAG-048 dual-signal triad writes to SEPARATE engine_state keys (anchor_dual_signal.*) and distinct DB columns (structural_basis_bps, rolling_basis_baseline_bps, residual_distortion_bps). The dashboard's anchor diagnostics panel does not read the EMA; it reads the legacy summary.
- "Is there a sign convention difference between structural_basis_bps and anchor_error_bps?" Yes, confirmed. structural_basis_bps = (mid − amm)/mid and anchor_error_bps (dashboard source) = (quote_anchor − mid)/mid have opposite numerator orders. These are fundamentally different signals by design (structural is the uncapped basis diagnostic; anchor_error_bps is the capped legacy saturation metric). Not a bug, but a real label-collision risk on any dashboard that mixes them.
- "Is the cap-lock percentage (82%) being calculated from the wrong signal?" No. _anchor_cap_applied_count tracks last_cap_applied, which is the truth-of-record for the capped_amm decision inside calculate_quote. It is counting real cap events on the legacy path.
Verdict
There is no dashboard-sourcing bug. There is a documentation/legibility gap: three adjacent signals, plus the raw_divergence_bps scratch value, with overlapping names (clob_vs_amm_divergence_bps, structural_basis_bps, last_anchor_divergence_bps, raw_divergence_bps) and differing sign conventions. For an operator eyeballing the dashboard and the tick log side-by-side in S55, the only consistent read is to translate between the conventions explicitly.
Deliverable summary — for Atlas
- Q1 (EMA gating): refuted. observe() runs every valid-snapshot tick. neo_engine/dual_signal_calculator.py:176-241 + neo_engine/main_loop.py:4312. The EMA should have converged as expected in S55 (inferred from the code path; per-tick DB data not inspected).
- Q2 (exit condition): seven predicates + 30-tick hysteresis at neo_engine/main_loop.py:2887-2997, with window source selection at neo_engine/main_loop.py:2729-2766. Exit is unreachable on the legacy capped window in a cap-saturated regime; residual source does not become available until warmup_ticks + residual_hysteresis_lookback_ticks = 70 valid ticks. S55 ended (SIGINT) at ~84 ticks, at the ragged edge of residual-source readiness. This is the real deadlock mechanism.
- Q3 (dashboard mismatch): not a bug. Three adjacent signals with three different sign conventions. Dashboard value consistent with tick log given the conventions. Sub-question answers: stale EMA ruled out (dashboard does not read EMA); sign difference confirmed; cap-lock percentage correct.
Recommendation for Atlas ruling scope: the non-bug here is important context. Atlas should decide whether to (i) reduce the pre-residual lockout (e.g. an earlier hand-over trigger, or a small-ticks legacy-source exit path when the regime is benign on the uncapped signal), (ii) standardize sign conventions across the three signals, (iii) both, or (iv) accept the current behavior and require longer sessions to clear ANCHOR_IDLE on hostile entry.
No code changes from me pending Atlas ruling.
— Orion