Vesper Audit Package — Post-S40 Guard Architecture Review

To: Atlas
From: Vesper
Date: 2026-04-22
Context: S51 halted at 26 ticks (degraded_episode_limit_halt). Katja's observation: engine was trading successfully before (S40: 84 fills, ran to duration) and is no longer functional. Root complaint is not the market — it's the accumulated guard machinery. Prepared per Atlas delegation ruling.

Section 1 — S40 Baseline (Last Successful Session)

S40 (Apr 19):
- 316 ticks | 84 fills | buy=53, sell=31
- halt=duration_elapsed ✅ (ran full session)
- VW spread: +12.47 bps | toxic=0
- Anchor: mean near −10 bps, 100% of ticks at saturation (Vesper S41 post-session note)
- Guards merged Apr 19: anchor saturation, drift, corridor (all active)
- Episode counting: NOT present (FLAG-042 merged Apr 21)
- ANCHOR_IDLE: NOT present (FLAG-046 merged Apr 22)
- Episode limit halt: NOT present (FLAG-044 merged Apr 21)

S41 (Apr 19, next session after S40):
- 106 ticks | 3 fills | buy=3, sell=0
- halt=duration_elapsed ✅ (ran full session; barely traded but didn't halt early)
- Anchor: mean +3.93 bps, range [−3.6, +7.8], 42.5% >5 bps (cycling regime)
- Note: 3 fills in first 24s (burst), then no fills for the rest of the session. Drift guard present but no episode counting.

Takeaway: With guards but without episode counting + ANCHOR_IDLE escalation, the engine ran both S40 and S41 to duration_elapsed in hostile/cycling regimes. The current engine cannot survive 3 minutes in the same conditions.

Section 2 — Post-S40 Change Ledger

Ordered by merge date. Every change that could affect guard behavior included.

2.1 — `feat/anchor-error-per-tick-telemetry` — merged Apr 21

Problem it solved: No per-tick anchor error visibility in DB for post-hoc analysis.

What it changed: Added anchor_err_bps column to system_metrics. No guard logic changes. Pure telemetry.

Guard behavior impact: None.

2.2 — `fix/halt-reason-taxonomy-leak` (FLAG-041) — merged Apr 21

Problem it solved: main_loop.py and run_paper_session.py were clobbering the authentic halt reason with engine_requested_halt, so sessions showed the wrong halt reason in the summary.

What it changed: Fixed two clobber sites so the authentic halt token surfaces correctly.

Guard behavior impact: None (display/logging only). It did, however, enable accurate root cause attribution from S43 onward.

2.3 — `feat/flag-042-degraded-recovery` — merged Apr 21

Problem it solved: DEGRADED had no exit path for market-regime guards. Anchor, drift, and corridor entered DEGRADED but could only escape via the 300s wallet-truth timeout → HALT. S44 showed anchor recovering mid-session while the engine sat stuck in DEGRADED.

What it changed:
- Added recovery evaluators for anchor, drift, corridor
- Per-episode cap: one recovery attempt per episode per source; a second DEGRADED re-entry → immediate HALT
- Anchor recovery exit threshold: mean_abs < 4 bps AND prevalence < 30%, sustained N ticks
- Reset rolling windows on exit

Guard behavior impact: SIGNIFICANT. Introduced episode counting as a concept. Added a recovery path but also the first version of an episode-level halt gate.

2.4 — `feat/flag-044-recovery-cooldown` — merged Apr 21

Problem it solved: FLAG-042's recovery_exhausted_halt halted too early, with no distinction between true oscillation and a persistent hostile regime. S45 halted at 163s with no way to wait out the regime.

What it changed:
- Replaced recovery_exhausted_halt with a cooldown model: after a failed recovery, suppress the recovery evaluator for recovery_cooldown_ticks (default 120 ticks, ~8 min at 4s cadence)
- Added per-source episode cap: max_degraded_episodes_per_source_per_session: 3; after 3 episodes per source → degraded_episode_limit_halt
- This replaces the per-attempt cap with a per-session backstop

Guard behavior impact: SIGNIFICANT. The degraded_episode_limit_halt that killed S51 originates here. 3 episodes per source per session is the hard limit. Atlas locked principle: "A bad regime is not, by itself, a reason to terminate the session."
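The cooldown and per-source cap described in 2.4 can be sketched as a small tracker. This is a minimal illustration, not the engine's implementation: the class and method names are hypothetical, and it assumes the halt token is returned on the entry that brings a source's count to 3.

```python
from collections import defaultdict
from dataclasses import dataclass, field

MAX_EPISODES_PER_SOURCE = 3    # max_degraded_episodes_per_source_per_session
RECOVERY_COOLDOWN_TICKS = 120  # recovery_cooldown_ticks (default)


@dataclass
class EpisodeBudget:
    """Hypothetical tracker for per-source DEGRADED episodes and recovery cooldown."""
    episodes: dict = field(default_factory=lambda: defaultdict(int))
    cooldown_until: dict = field(default_factory=dict)

    def enter_degraded(self, source: str) -> str:
        """Count one episode for `source`; return the halt token once the cap is reached."""
        self.episodes[source] += 1
        if self.episodes[source] >= MAX_EPISODES_PER_SOURCE:
            return "degraded_episode_limit_halt"
        return "DEGRADED"

    def recovery_failed(self, source: str, tick: int) -> None:
        """Suppress the recovery evaluator for this source for the cooldown window."""
        self.cooldown_until[source] = tick + RECOVERY_COOLDOWN_TICKS

    def recovery_allowed(self, source: str, tick: int) -> bool:
        """True once the cooldown window (if any) has elapsed."""
        return tick >= self.cooldown_until.get(source, 0)


budget = EpisodeBudget()
outcomes = [budget.enter_degraded("drift") for _ in range(3)]
# The third drift episode produces the halt token; other sources keep their own budget.
```

Because the count is kept per source, three drift episodes end the session even if corridor and truth never fire.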

2.5 — `fix/startup-mode-reset` — merged Apr 21

Problem it solved: Fresh session startup was inheriting inventory_truth.mode, degraded_since, degraded_reason from the prior session, so the engine started in the wrong state.

What it changed: Startup now resets those fields. 3 new tests.

Guard behavior impact: Startup-only. Prevents stale DEGRADED state carrying forward.

2.6 — `feat/anchor-idle-state` (FLAG-046) — merged Apr 22

Problem it solved: Anchor saturation was routing through DEGRADED and consuming the 3-episode budget. Atlas ruling: anchor saturation is a market condition (pause), not a system failure (safety halt). It should not cost episode budget.

What it changed:
- ANCHOR_IDLE is now a first-class state (ACTIVE → ANCHOR_IDLE, no episode count)
- Anchor saturation → ANCHOR_IDLE (no HALT path from here directly)
- SOURCE_ANCHOR retired from RECOVERY_CAPPED_SOURCES
- New interaction: if drift/corridor/truth fires while in ANCHOR_IDLE → escalate to DEGRADED (episode counted)

Guard behavior impact: SIGNIFICANT, and it introduces the specific fault path that halted S51. ANCHOR_IDLE prevents anchor episodes, but drift/corridor episodes can still accumulate while idle. The drift guard keeps running while the engine is paused.

2.7 — `fix/cancel-fill-race` (FLAG-047) — merged Apr 22

Problem it solved: The tecNO_TARGET cancel response was masking real fills. The CANCELLED_BY_ENGINE guard suppressed legitimate fill detection after a cancel-fill race.

What it changed: On tecNO_TARGET, demotes to CANCEL_RACE_UNKNOWN, does an on-chain account_tx lookup, then a three-way resolution (FILL/CANCEL/INCONCLUSIVE). No guard logic changes.

Guard behavior impact: Fixes incorrect fill accounting. Indirect effect: prevents inventory_truth_halt from false triggers.

2.8 — `feat/anchor-dual-signal-calibration` (FLAG-048) — merged Apr 22

Problem it solved: Anchor error was capped at ±10 bps. When CLOB-AMM divergence exceeded the cap, the control signal was flat and the exit condition was mathematically unreachable.

What it changed:
- New signal: residual_distortion_bps = structural_basis_bps - rolling_basis_baseline_bps
- residual_distortion_bps becomes the ANCHOR_IDLE control signal (not capped anchor_error)
- 50-tick warm-up required before new signal is active
- Cross-session persistence for baseline EMA

Guard behavior impact: Changes what signal ANCHOR_IDLE keys off of. However, it requires a 50-tick warm-up, and the guard runs on the legacy capped signal during that window. S51 halted at tick 26, so FLAG-048 never activated.
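A minimal sketch of the FLAG-048 signal, assuming the rolling baseline is a plain EMA updated every tick and that the signal becomes active on the 50th update. The class name, the EMA_ALPHA value, and returning None during warm-up are illustrative assumptions rather than the engine's code; cross-session persistence of the baseline is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

WARMUP_TICKS = 50  # warm-up length stated in the FLAG-048 description
EMA_ALPHA = 0.05   # assumed smoothing factor, not taken from the source


@dataclass
class ResidualSignal:
    """Hypothetical holder for the rolling basis baseline."""
    baseline_bps: Optional[float] = None
    ticks_seen: int = 0

    def update(self, structural_basis_bps: float) -> Optional[float]:
        """Return residual_distortion_bps, or None while still warming up."""
        if self.baseline_bps is None:
            self.baseline_bps = structural_basis_bps
        else:
            self.baseline_bps += EMA_ALPHA * (structural_basis_bps - self.baseline_bps)
        self.ticks_seen += 1
        if self.ticks_seen < WARMUP_TICKS:
            return None  # caller falls back to the legacy capped signal
        return structural_basis_bps - self.baseline_bps


signal = ResidualSignal()
s51_like = [signal.update(10.0) for _ in range(26)]
# Every entry is None: a 26-tick session ends before the warm-up completes.
```

Fed only 26 ticks, as in S51, the sketch never leaves warm-up, which matches the observation that FLAG-048 never activated in that session.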

Section 3 — Session Timeline (S40 → S51)

| Session | Ticks | Halt | Anchor | Fills | Key guards active | Notes |
|---|---|---|---|---|---|---|
| S40 | 316 | `duration_elapsed` ✅ | ~−10 bps, 100% | 84 (buy=53, sell=31) | anchor, drift, corridor (no episode counting) | Last fully working session |
| S41 | 106 | `duration_elapsed` ✅ | +3.93 bps mean, 42.5% >5 bps | 3 (buy=3, sell=0) | same as S40 | Barely traded but survived |
| S42 | — | `engine_requested_halt` | +9.28 bps, 100% | — | + FLAG-042 episode counting | Atlas: "correct behavior in hostile regime" |
| S43 | — | `inventory_truth_halt` | +9.28 bps, 100% | 2 | same | Hostile, truth halt |
| S44 | — | `inventory_truth_halt` | +4.43 bps, cycling | 2 | same | FLAG-042 bug: engine stuck in DEGRADED during recovery |
| S45 | — | `recovery_exhausted_halt` | +9.47 bps, 100% | 2 | + FLAG-042 per-attempt cap | FLAG-042 per-attempt cap too aggressive |
| S46 | 117 | `inventory_truth_halt` | −5.22 bps, cycling | 2 (phantom) | + FLAG-044 cooldown | FLAG-037 phantom fill cycle |
| S47 | 33 | `degraded_episode_limit_halt` | −5.62 bps, 50% | 1 | + FLAG-044 3-episode limit | FLAG-037 fixed; episode limit halted |
| S48 | — | `inventory_truth_halt` | +2.94 bps, improving | 0 | same | FLAG-047 cancel-fill race |
| S49 | ~184 | manual SIGINT | +10.00 bps, 100% | 1 | same | Anchor cap-lock confirmed → FLAG-048 |
| S50 | 184 | manual SIGINT | +10.00 bps, 100% | 0 | same | FLAG-048 opened |
| S51 | 26 | `degraded_episode_limit_halt` | +9.35 bps, 100% | 1 | + FLAG-046 ANCHOR_IDLE + FLAG-048 (warmup) | Confirmed fault path |
| S52 | ~30 | manual SIGINT | −7.68 bps → +17 bps shift | 1 (sell) | + containment fix merged | FLAG-050 confirmed in production: counter carried across ANCHOR_IDLE, C fired on exit |

Section 4 — Guard Interaction Map

State Machine (current)

ACTIVE
  │
  ├─ anchor saturation (mean ≥6 bps AND prevalence ≥40%)
  │     └──► ANCHOR_IDLE (no episode count)
  │               │
  │               ├─ anchor normalizes (mean <4 bps AND prevalence <30%, sustained)
  │               │     └──► ACTIVE
  │               │
  │               └─ drift A/B/C fires  ──► DEGRADED (episode counted ← THIS IS THE FAULT PATH)
  │                  OR corridor fires   ──► DEGRADED (episode counted)
  │                  OR truth fires      ──► DEGRADED (episode counted)
  │
  ├─ drift A/B/C fires  ──► DEGRADED (episode counted)
  ├─ corridor fires      ──► DEGRADED (episode counted)
  └─ truth fires         ──► DEGRADED (episode counted)

DEGRADED
  │
  ├─ recovery evaluator runs
  │     ├─ recovery succeeds → ACTIVE (windows reset)
  │     └─ recovery fails → cooldown (120 ticks) → try again
  │
  └─ episode count reaches 3 per source → degraded_episode_limit_halt → HALT
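The diagram above can be expressed as a single transition function. This is a sketch under stated assumptions: the names are hypothetical, a firing guard takes priority over anchor transitions, the episode limit is checked at the moment of escalation, and the "sustained N ticks" requirement and the recovery cooldown are left out for brevity.

```python
from enum import Enum


class Mode(Enum):
    ACTIVE = "ACTIVE"
    ANCHOR_IDLE = "ANCHOR_IDLE"
    DEGRADED = "DEGRADED"
    HALT = "HALT"


EPISODE_LIMIT = 3  # per source, per session


def is_saturated(mean_abs_bps: float, prevalence: float) -> bool:
    """ANCHOR_IDLE entry test: mean >= 6 bps AND prevalence >= 40%."""
    return mean_abs_bps >= 6.0 and prevalence >= 0.40


def is_normalized(mean_abs_bps: float, prevalence: float) -> bool:
    """ANCHOR_IDLE exit test: mean < 4 bps AND prevalence < 30%."""
    return mean_abs_bps < 4.0 and prevalence < 0.30


def next_mode(mode, *, anchor_saturated=False, anchor_normalized=False,
              guard_fired=None, recovered=False, episodes=None):
    """Return the next mode; `episodes` maps guard source to count and is mutated."""
    episodes = episodes if episodes is not None else {}
    escalating = guard_fired in ("drift", "corridor", "truth")
    if mode in (Mode.ACTIVE, Mode.ANCHOR_IDLE) and escalating:
        # Escalation from ANCHOR_IDLE is counted exactly like one from ACTIVE.
        episodes[guard_fired] = episodes.get(guard_fired, 0) + 1
        if episodes[guard_fired] >= EPISODE_LIMIT:
            return Mode.HALT
        return Mode.DEGRADED
    if mode is Mode.ACTIVE and anchor_saturated:
        return Mode.ANCHOR_IDLE  # no episode counted
    if mode is Mode.ANCHOR_IDLE and anchor_normalized:
        return Mode.ACTIVE
    if mode is Mode.DEGRADED and recovered:
        return Mode.ACTIVE
    return mode


counts = {}
mode = next_mode(Mode.ACTIVE, anchor_saturated=is_saturated(9.35, 1.0), episodes=counts)
mode = next_mode(mode, guard_fired="drift", episodes=counts)
# mode is now Mode.DEGRADED and counts == {"drift": 1}: the idle escalation cost an episode.
```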

Drift Guard Conditions

| Condition | Trigger | Evaluates while ANCHOR_IDLE? | Note |
|---|---|---|---|
| A | ≥3 same-side fills within ~30s | Yes (presumed) | Requires active quoting to generate same-side fills |
| B | Net notional imbalance exceeds threshold | Yes (presumed) | Cumulative measure, persists across state |
| C | No opposing fill within 15 ticks | Yes (THIS IS THE PROBLEM) | Fires trivially when engine is idle |

Episode Count Budget (FLAG-044)

Max DEGRADED episodes per source per session: 3
Cooldown after failed recovery: 120 ticks (~8 min)
After 3 episodes: degraded_episode_limit_halt (session ends)

The S51 Math

With drift C firing 15 ticks after any buy fill, and assuming minimal cooldown overhead:
- Tick 1–11: quoting
- Tick ~11: buy fill
- Tick ~12: ANCHOR_IDLE entered (anchor saturated)
- Tick ~26: drift C fires (15 ticks since the buy, no opposing sell because the engine is idle)
- DEGRADED episode 1 → cooldown (N ticks)
- Drift C fires again → episode 2 → cooldown
- Drift C fires again → episode 3 → degraded_episode_limit_halt

The engine mathematically cannot survive past tick 26–30 after any buy fill in a hostile anchor regime under the current architecture.
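The tick arithmetic can be checked with a small simulation. It assumes the pre-fix behavior of condition C: a counter of ticks since the last opposing fill that increments every tick regardless of engine mode and fires at 15. The function and parameter names are hypothetical.

```python
from typing import Optional

DRIFT_C_TICKS = 15  # "no opposing fill within 15 ticks"


def first_drift_c_tick(fill_tick: int, threshold: int = DRIFT_C_TICKS,
                       max_tick: int = 200) -> Optional[int]:
    """Return the tick at which condition C first fires after a single fill.

    Engine mode is deliberately not an input: entering ANCHOR_IDLE at tick ~12
    does not change the result in this pre-fix model.
    """
    ticks_since_opposing = 0
    fills_seen = 0
    for tick in range(1, max_tick + 1):
        if tick == fill_tick:
            fills_seen += 1
            ticks_since_opposing = 0
        elif fills_seen:
            ticks_since_opposing += 1
        if fills_seen and ticks_since_opposing >= threshold:
            return tick
    return None


fire_tick = first_drift_c_tick(fill_tick=11)
# fire_tick == 26, matching the "Tick ~26" step in the list above
```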

Section 5 — Candidate Issue List

For each item: classification is Vesper's first-pass read. Orion's engineering questions are targeted at items marked "unclear."

Issue 1 — Drift guard condition C evaluates while ANCHOR_IDLE

Description: Condition C ("no opposing fill in 15 ticks") fires while the engine is in ANCHOR_IDLE and cannot quote. The absence of opposing fills is not a new risk signal; it is the expected consequence of being idle. Condition C was designed to detect directional risk when the engine is actively quoting.

Classification: Likely miswired. The condition is evaluating a scenario it was not designed for.

Immediate containment fix: Dispatched to Orion (suppress condition C evaluation while ANCHOR_IDLE).
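The containment fix amounts to a mode check ahead of the counter test. The sketch below is a hypothetical standalone function written for illustration; it is not the engine's drift guard evaluator, and the string mode values and the 15-tick default are assumptions drawn from this memo.

```python
def condition_c_fires(mode: str, fills_seen: int, ticks_since_opposing: int,
                      threshold: int = 15) -> bool:
    """Condition C with idle suppression: never fire while the engine cannot quote."""
    if mode == "ANCHOR_IDLE":
        # Idle engines cannot produce opposing fills, so the counter carries no signal.
        return False
    return fills_seen > 0 and ticks_since_opposing >= threshold


idle_result = condition_c_fires("ANCHOR_IDLE", fills_seen=1, ticks_since_opposing=15)
active_result = condition_c_fires("ACTIVE", fills_seen=1, ticks_since_opposing=15)
# idle_result is False, active_result is True
```

Note that this only gates the evaluation. It does not touch the counter itself, which is the subject of Issue 3.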

Issue 2 — Episodes earned via idle escalation count against the same session budget as active-quoting episodes

Description: When drift escalates from ANCHOR_IDLE → DEGRADED, the episode counts toward the 3-episode limit alongside episodes earned while ACTIVE. A session that spends 90% of its time correctly idle could exhaust its DEGRADED budget on idle-escalation events, leaving no budget for genuine safety events.

Classification: Likely overconstraining. Idle-escalation episodes are structurally different from active-quoting DEGRADED episodes. Whether they should share the same budget is an architectural question.

Orion confirmation needed: Can the episode count be split by escalation source (from-idle vs from-active)? What is the implementation complexity?

Issue 3 — Drift guard fill history carries across ANCHOR_IDLE boundary

Description: The drift guard's rolling fill history window (which conditions B and C read) likely persists through ANCHOR_IDLE entry and exit. A pre-idle buy fill imbalance that was already "resolved" by entering ANCHOR_IDLE may re-trigger the guard on exit, before the engine has had a chance to generate opposing fills.

Classification: Likely miswired. The fill imbalance that caused ANCHOR_IDLE entry is already being handled by the idle state. Carrying it forward into the post-idle window double-penalizes the same event.

Orion confirmation needed: Does the drift guard history reset on ANCHOR_IDLE entry or exit? Or does it persist?

Issue 4 — 3-episode limit produces deterministic session termination at tick ~26 in hostile regimes

Description: With drift C firing every 15 ticks after any buy and no opposing sells possible while idle, the 3-episode limit terminates the session at approximately tick 26–30. This is not a configurable behavior; it is a hard ceiling. Sessions S47 (33 ticks) and S51 (26 ticks) both hit this ceiling within roughly 30 ticks. With FLAG-048's 50-tick warm-up window, the dual-signal control signal can never become active before the episode limit fires.

Classification: Unclear / needs architectural decision. The 3-episode limit was designed as a backstop against pathological churn in active trading. Whether it is appropriate as a limit on idle-escalation events is an architectural question for Atlas.

Candidate options (Vesper framing):
- (a) Separate episode budgets: active-sourced vs idle-escalation-sourced
- (b) Increase episode limit for idle-sourced events
- (c) Suppress idle-sourced drift escalation entirely (only truth/corridor escalate from idle)
- (d) Add a minimum-ticks-before-episode-limit gate (engine must have been ACTIVE for N ticks before episode limit can apply)

Issue 5 — FLAG-048 warm-up window and episode limit ceiling are incompatible in hostile regimes

Description: FLAG-048 requires 50 ticks of warm-up before the residual signal becomes the control input. The episode limit ceiling under hostile conditions is approximately tick 26. The engine will always halt before FLAG-048 activates in a +10 bps regime after any buy fill. The two features cannot coexist in their current form.

Classification: Likely overconstraining interaction. FLAG-048 and the episode limit were designed independently. The combination creates a scenario where the intended fix cannot demonstrate its effect.

Blocking implication: Until Issue 1 (drift C suppression) is fixed, every FLAG-048 validation attempt will terminate before the 50-tick warm-up completes.

Issue 6 — Cooldown tick count vs session effectiveness

Description: recovery_cooldown_ticks = 120 (default, ~8 min at 4s cadence). In a 2-hour session with 3 episodes, the total time consumed by cooldown windows is up to 360 ticks (~24 min). In hostile anchor regimes, the engine is in ANCHOR_IDLE during cooldown, meaning it also cannot trade. The effective trading window per session may be much smaller than the session duration.

Classification: Likely necessary but worth quantifying. This may be by design (preserve capital in hostile regime). Not proposing a change; flagging for Atlas's awareness when reviewing session design.
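The figures in Issue 6 follow from simple arithmetic, reproduced here as a quick check. The constant names are illustrative; the 2-hour session length is the example used in the description, and the result is an upper bound that assumes a full cooldown window after each of the 3 episodes.

```python
TICK_SECONDS = 4               # tick cadence
RECOVERY_COOLDOWN_TICKS = 120  # recovery_cooldown_ticks (default)
MAX_EPISODES = 3               # per-source episode cap
SESSION_SECONDS = 2 * 60 * 60  # 2-hour session from the example

cooldown_min = RECOVERY_COOLDOWN_TICKS * TICK_SECONDS / 60   # minutes per cooldown
worst_case_ticks = RECOVERY_COOLDOWN_TICKS * MAX_EPISODES    # ticks across 3 cooldowns
worst_case_min = worst_case_ticks * TICK_SECONDS / 60        # minutes across 3 cooldowns
session_ticks = SESSION_SECONDS // TICK_SECONDS              # ticks in the session
cooldown_share = worst_case_ticks / session_ticks            # fraction of session in cooldown

print(f"{cooldown_min:.0f} min per cooldown, {worst_case_min:.0f} min worst case, "
      f"{cooldown_share:.0%} of a {session_ticks}-tick session")
```

Under these assumptions, cooldown alone can consume up to a fifth of a 2-hour session, before counting time spent in ANCHOR_IDLE.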

Issue 7 — The "A bad regime is not a reason to terminate" principle may now be violated by the drift guard

Description: FLAG-044's locked principle: "A bad regime is not, by itself, a reason to terminate the session." The intent was to let the engine wait out hostile anchor conditions. However, the drift guard condition C is now terminating sessions via episode limit in exactly the bad-regime scenario the principle was designed to protect. The principle is being honored architecturally (no anchor-sourced episode counting) but violated behaviorally (drift C fires in anchor's absence).

Classification: Likely miswired. The principle's intent covers the current failure mode, but the implementation doesn't extend to the drift guard's idle-sourced escalation path.

Section 6 — Engineering Questions — Answered by Orion (2026-04-22)

Q1–Q4 answered in Orion's delivery memo for fix/drift-c-anchor-idle-suppression. Q5 not yet answered — deferred to Atlas ruling phase.

Q1 — Condition C evaluation: does the evaluator see engine state?

Orion's answer: Pre-fix, the evaluator did NOT consult engine state before evaluating C. _current_truth_mode() was never called in _evaluate_directional_drift_guard. The C branch fired on counter values alone with no mode awareness. This is the confirmed structural root cause of the S51 cascade.

Vesper note for Atlas: this means the drift guard has been entirely blind to engine state since it was written. The S51 failure was not a regression — it was a latent structural gap that was only exercised once ANCHOR_IDLE existed.

Q2 — Fill history across ANCHOR_IDLE boundary

Orion's answer: Fill history is NOT reset on ANCHOR_IDLE entry or exit. Both counters (_drift_ticks_since_opposing_fill, _drift_fills_seen_this_session) persist across the idle boundary and only reset on a successful drift recovery. This explains why S51 fired at tick 26 rather than the theoretical ~65: the counter was partially consumed during active quoting before ANCHOR_IDLE was entered, compressing the firing timeline.

Vesper note for Atlas: this is a live follow-on issue (FLAG-050). With the containment patch merged, condition C is suppressed during ANCHOR_IDLE, so the counter accumulates silently during idle. On ANCHOR_IDLE exit, condition C may fire almost immediately if the counter crossed the threshold during idle. S52 (2026-04-22) confirmed this in production. The engine placed orders, the sell filled, CLOB-AMM divergence shifted to +17 bps, drift C fired repeatedly before ANCHOR_IDLE was triggered, and after ANCHOR_IDLE the counter remained elevated. The engine could not quote for the remainder of the session. Containment fix prevented session termination but did not prevent the trading blockage. Whether the fill history should reset on ANCHOR_IDLE entry or exit is a separate architectural decision — FLAG-050 is open and awaiting this ruling.
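The carry-over effect can be illustrated with a short sketch. It assumes the counter increments once per tick during ANCHOR_IDLE while condition C is suppressed, and it compares carrying the counter across the idle boundary with resetting it on exit. The reset is shown only as one of the options awaiting a ruling, and all names here are hypothetical.

```python
def active_ticks_before_c(counter_at_idle_entry: int, idle_ticks: int,
                          threshold: int = 15, reset_on_exit: bool = False) -> int:
    """Return how many ACTIVE ticks pass after idle exit before condition C can fire.

    0 means C is eligible to fire on the first evaluation after exit.
    """
    counter = counter_at_idle_entry + idle_ticks  # accumulates silently while idle
    if reset_on_exit:
        counter = 0
    return max(0, threshold - counter)


carried = active_ticks_before_c(counter_at_idle_entry=1, idle_ticks=40)
reset = active_ticks_before_c(counter_at_idle_entry=1, idle_ticks=40, reset_on_exit=True)
# carried == 0: C can fire immediately on exit.
# reset == 15: the engine gets a full window to generate an opposing fill.
```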

Q3 — Tick breakdown for S51

Orion's answer: S51 live DB is malformed — exact per-episode tick numbers not recoverable. Analog evidence from sessions 50 and 52 (same degraded_episode_limit_halt pattern in backup): all 3 episodes had fills_seen=1, ticks_since_opposing ∈ {15, 16}. Theoretical S51 breakdown with the carry-forward counter: episode 1 at ~tick 15 (partial counter), cooldown, episode 2 at ~+40, episode 3 at ~+65 → but pre-idle counter accumulation compresses this to the observed ~26 ticks.

Q4 — Have conditions A or B ever escalated from ANCHOR_IDLE?

Orion's answer: No. Across all 52 sessions in the clean backup, every drift-guard DEGRADED escalation is condition C. Zero A, zero B. All 6 escalations carry fills_seen=1, ticks_since_opposing ∈ {15, 16}.

Vesper note for Atlas: This is the most significant finding from the engineering audit. Conditions A (burst) and B (net notional imbalance) have NEVER fired in production. The entire observed safety value of the drift guard comes from condition C alone. This has two implications: (1) conditions A and B may be unreachable under current quote sizes and fill patterns; (2) condition C's semantics — and whether it belongs in the same evaluator as A and B — deserves architectural review.

Q5 — Episode count splitting complexity (not yet answered — deferred to Atlas ruling phase)

Section 7 — Summary for Atlas

What Vesper concludes:

The engine's failure to trade in S51 is not primarily a market problem and not primarily a FLAG-048 problem. It is the result of three independent guard changes (FLAG-042, FLAG-044, FLAG-046) that were each individually correct for their stated problem but that compose into a system with a hard ~26-tick termination ceiling in any hostile anchor regime after a buy fill.

The specific fault path is drift guard condition C escalating from ANCHOR_IDLE to DEGRADED and consuming the 3-episode limit. This was not the intended scope of any of the three changes. It is an emergent interaction.

S40 ran 316 ticks with 84 fills in comparably saturated anchor conditions (mean near −10 bps, 100% of ticks at saturation) because it predated episode counting entirely. The guards were present but had no termination mechanism beyond the 300s truth-halt timeout.

The immediate containment fix (drift C suppression while idle) has been dispatched to Orion. It unblocks the next session. The broader architectural questions — whether idle-sourced and active-sourced episodes should share a budget, whether the 3-episode limit applies to idle escalations, and whether the FLAG-048 warm-up window and the episode ceiling can coexist — are Atlas-level decisions.

Orion's answers to Q1–Q4 in Section 6 confirm or correct Vesper's mechanical reasoning ahead of the Atlas ruling. Q5 (episode count splitting) is still open and deferred to the ruling phase.

— Vesper
2026-04-22