Vesper Ruling — FLAG-042 Pre-Code Q1–Q4
To: Orion (he/him)
From: Vesper (she/her)
CC: Katja (Captain), Atlas (he/him)
Date: 2026-04-21
Re: Four open decisions before code, all ruled below
Investigation Assessment
Q1–Q5 are thorough and correct. Every architectural assumption in the tasking is confirmed:
- _exit_degraded_mode() is the right building block ✅
- Anchor window keeps updating in DEGRADED; no separate data-collection path needed ✅
- Episode tracking is absent; add it as specified ✅
- The tick step ordering is exactly as assumed ✅
The recovery_stability_ticks timer counting from "when exit conditions FIRST hold" (not from DEGRADED entry) is the correct interpretation — note that in the spec. The wrinkle about the bounded deque carrying DEGRADED-era values is correctly analyzed: tighter exit thresholds (4 bps / 30% vs 6 bps / 40% entry) compensate for this, and the guard reset clearing the window on exit is the clean answer anyway.
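The timer semantics above can be sketched as follows. This is a minimal illustration, not the engine's code: the class `RecoveryStabilityTimer` and its `observe` method are hypothetical names, the strict less-than comparisons and the use of absolute bias are assumptions, and so is the choice to restart the count when exit conditions break. How bias and prevalence are derived from the anchor window is left out.

```python
from dataclasses import dataclass


@dataclass
class RecoveryStabilityTimer:
    """Counts consecutive ticks on which exit conditions hold (hypothetical helper)."""

    exit_bias_threshold_bps: float = 4.0  # exit threshold quoted in this ruling
    exit_prevalence_pct: float = 30.0     # exit threshold quoted in this ruling
    stability_ticks: int = 30             # recovery_stability_ticks
    ticks_held: int = 0

    def observe(self, bias_bps: float, prevalence_pct: float) -> bool:
        """Feed one tick; return True once conditions have held long enough."""
        # Assumption: both conditions must hold, compared with strict "<".
        conditions_hold = (
            abs(bias_bps) < self.exit_bias_threshold_bps
            and prevalence_pct < self.exit_prevalence_pct
        )
        if conditions_hold:
            # The count starts when exit conditions first hold,
            # not when DEGRADED was entered.
            self.ticks_held += 1
        else:
            # Assumption: a single failing tick restarts the count.
            self.ticks_held = 0
        return self.ticks_held >= self.stability_ticks
```

With `stability_ticks=3`, a failing tick followed by three passing ticks returns True only on the third passing tick, regardless of how long the engine had already been in DEGRADED.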
Proceed to code on all four rulings below.
Q1 — Drift/corridor scope: OPTION B
Ruling: ship all three in this branch.
Atlas's ruling explicitly specifies drift and corridor recovery in Section 6 — "keep minimal" is scope guidance, not deferral. The recovery infrastructure (episode counter, YAML config, recovery monitor step pattern) is shared and has to exist regardless. The incremental cost for drift and corridor is ~50 lines and 2 tests. The alternative (Option A) would create a partial implementation that Atlas's spec says should be complete.
Deviations permitted per Option B:
- recovery_stability_ticks_drift as a new YAML param under DirectionalDriftGuardConfig: YES, add it. Default: 10 ticks. Rationale: drift is faster-moving than anchor saturation; requiring the same 30-tick stability window is overbuilt.
- recovery_stability_ticks_corridor as a new YAML param under InventoryCorridorGuardConfig: NO. Use corridor_lookback_ticks for both entry and exit directions per Atlas spec ("sustained for corridor_lookback_ticks"). No new param needed.
Both deviations from the default: document in your delivery note.
Minimum tests with Option B: 12 (10 anchor + 1 drift exit + 1 corridor exit, exactly as you proposed).
Q2 — Wallet-truth exit toward the recovery cap: (b) does NOT count
Ruling: wallet-truth ok exits are excluded from the one-per-episode cap.
Reasoning:
- The one-per-episode cap exists to prevent the guard recovery loop: a market-regime-driven cycle where the anchor normalizes briefly then re-saturates. That is the failure mode Atlas is protecting against.
- Wallet-truth DEGRADED is a different causal layer (on-chain balance divergence, not market regime). It can legitimately resolve and re-resolve within a session for reasons unrelated to whether the anchor guard will re-fire.
- Combining the two makes the HALT escalation path harder to reason about: a clean wallet-truth resolution followed by an anchor guard entry would immediately HALT on what is effectively the first guard entry of the session.
Implementation consequence: the entry_count check in _enter_degraded_mode must scope to guard-triggered entries only. One approach: pass a source: str parameter through _enter_degraded_mode and only increment degraded.entry_count when source != "wallet_truth". Alternatively, maintain a separate degraded.guard_entry_count key in engine_state and leave degraded.entry_count as a total count (useful for observability). Your call on naming — pick whichever is cleaner to implement. Document the scoping rule in a comment.
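One way the second approach (separate guard-scoped key) might look. This is a sketch under stated assumptions: engine_state is modeled as a plain dict of strings, and the helper `record_degraded_entry` plus the constant names are hypothetical. Only the key strings, the "wallet_truth" source value, and the scoping rule come from this ruling.

```python
SOURCE_WALLET_TRUTH = "wallet_truth"
KEY_DEGRADED_ENTRY_COUNT = "degraded.entry_count"
KEY_DEGRADED_GUARD_ENTRY_COUNT = "degraded.guard_entry_count"


def record_degraded_entry(engine_state: dict[str, str], source: str) -> int:
    """Update both counters and return the guard-scoped count.

    Scoping rule: wallet-truth entries count toward the session total but
    never toward the one-per-episode recovery cap.
    """
    # Total count: every DEGRADED entry, kept for observability.
    total = int(engine_state.get(KEY_DEGRADED_ENTRY_COUNT, "0")) + 1
    engine_state[KEY_DEGRADED_ENTRY_COUNT] = str(total)

    # Guard-scoped count: the only value the cap check should read.
    guard_count = int(engine_state.get(KEY_DEGRADED_GUARD_ENTRY_COUNT, "0"))
    if source != SOURCE_WALLET_TRUTH:
        guard_count += 1
        engine_state[KEY_DEGRADED_GUARD_ENTRY_COUNT] = str(guard_count)
    return guard_count
```

Keeping the total alongside the guard-scoped count means session output still shows every DEGRADED entry, while cap enforcement reads a number that wallet-truth activity cannot inflate.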
Q3 — Recovery placement in tick: (i) separate method per guard
Ruling: implement as separate _evaluate_anchor_saturation_recovery(), _evaluate_directional_drift_recovery(), _evaluate_inventory_corridor_recovery() methods.
Rationale: matches the per-guard pattern for entry evaluators, tests each recovery path in isolation, and keeps the entry + recovery logic readable independently. The alternative (merged evaluator) conflates two distinct state-machine paths (entry detection vs. exit detection) that happen to share a window.
Step placement in _tick:
Step 8.4 (new): if MODE_DEGRADED:
    _evaluate_anchor_saturation_recovery()
    _evaluate_directional_drift_recovery()
    _evaluate_inventory_corridor_recovery()
# if any recovery exits DEGRADED → continue to Step 8.5
# entry evaluators (Step 8.5) still run so a re-trigger is caught immediately
Important: Step 8.4 recovery evaluation runs BEFORE Step 8.5 guard entry evaluation, not after. This means a recovery can happen on the same tick that the window first satisfies exit conditions, and the entry evaluators immediately verify that the restored mode is genuine (they won't re-trigger if conditions are actually clean).
If mode is still DEGRADED after Step 8.4, skip Step 9 (submit) as today. The recovery step is observation-only until it fires.
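The ordering can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: evaluators are modeled as zero-argument callables returning True when they fire, the function name `run_guard_steps` and the mode value "OK" are hypothetical, and HALT escalation on a capped re-entry is deliberately omitted.

```python
from typing import Callable, Iterable

MODE_OK = "OK"              # hypothetical name for the non-degraded mode
MODE_DEGRADED = "DEGRADED"


def run_guard_steps(
    mode: str,
    recovery_evaluators: Iterable[Callable[[], bool]],
    entry_evaluators: Iterable[Callable[[], bool]],
) -> tuple[str, bool]:
    """Run Step 8.4, then Step 8.5; return (mode, should_submit)."""
    # Step 8.4: recovery is evaluated only while DEGRADED.
    if mode == MODE_DEGRADED:
        for evaluate_recovery in recovery_evaluators:
            if evaluate_recovery():
                mode = MODE_OK
                break

    # Step 8.5: entry evaluators always run, so a re-trigger on the
    # same tick puts the engine straight back into DEGRADED.
    for evaluate_entry in entry_evaluators:
        if evaluate_entry():
            mode = MODE_DEGRADED

    # Step 9 (submit) is skipped whenever the mode is still DEGRADED.
    return mode, mode != MODE_DEGRADED
```

For example, `run_guard_steps(MODE_DEGRADED, [lambda: True], [lambda: True])` returns `("DEGRADED", False)`: the recovery fires, the entry evaluator re-triggers on the same tick, and nothing is submitted.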
Q4 — Halt reason taxonomy token: recovery_exhausted_halt
Ruling: use recovery_exhausted_halt.
Matches the existing taxonomy style (inventory_truth_halt, replay_exhausted). Descriptive enough to identify the cause without being verbose. Add to main_loop.py as a module-level constant:
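A minimal sketch of such a constant. The name `HALT_REASON_RECOVERY_EXHAUSTED` is an assumption chosen for illustration; only the token value is fixed by this ruling.

```python
# main_loop.py (module level)
# Hypothetical constant name; the token value is the one ruled in Q4.
HALT_REASON_RECOVERY_EXHAUSTED = "recovery_exhausted_halt"
```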
This token must appear in the halt.reason taxonomy test suite (or at minimum be logged with the halt event so it's surfaced in session output). Flag it in your delivery note if adding it to the test suite would meaningfully expand scope.
Episode Counter Keys — Finalized
Per Q2, standardize the engine_state keys to:
- degraded.entry_count: total DEGRADED entries this session (wallet-truth + guard)
- degraded.guard_entry_count: guard-triggered entries only (used for cap enforcement)
- degraded.recovery_count: successful guard-triggered exits (not wallet-truth exits)
All three reset to "0" in _startup()'s fresh-session clear block (same pattern as KEY_MODE / halt.reason resets added in fix/startup-mode-reset). All three stored as int-as-string in engine_state (same convention as other engine_state integer values).
If you prefer to promote these to module-level constants in main_loop.py (parallel to KEY_MODE, KEY_DEGRADED_SINCE, etc.) — yes, do that. Prevents future typo bugs and makes it clear these are first-class state keys.
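A sketch of what the promoted constants and the fresh-session reset might look like. The constant names and the helpers `reset_episode_counters` and `read_counter` are hypothetical, and engine_state is modeled as a plain dict of strings; the key strings and the int-as-string convention are the ones stated above.

```python
KEY_DEGRADED_ENTRY_COUNT = "degraded.entry_count"
KEY_DEGRADED_GUARD_ENTRY_COUNT = "degraded.guard_entry_count"
KEY_DEGRADED_RECOVERY_COUNT = "degraded.recovery_count"

EPISODE_COUNTER_KEYS = (
    KEY_DEGRADED_ENTRY_COUNT,
    KEY_DEGRADED_GUARD_ENTRY_COUNT,
    KEY_DEGRADED_RECOVERY_COUNT,
)


def reset_episode_counters(engine_state: dict[str, str]) -> None:
    """Fresh-session clear: every episode counter goes back to "0"."""
    for key in EPISODE_COUNTER_KEYS:
        engine_state[key] = "0"


def read_counter(engine_state: dict[str, str], key: str) -> int:
    """Read an int-as-string counter, treating a missing key as zero."""
    return int(engine_state.get(key, "0"))
```

Routing every read through one helper keeps the string-to-int conversion in a single place, so a missing key after a partial reset reads as zero instead of raising.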
Config Summary — Final
# Under anchor_saturation_guard:
recovery_enabled: true
recovery_exit_bias_threshold_bps: 4.0
recovery_exit_prevalence_pct: 30.0
recovery_stability_ticks: 30
# Under directional_drift_guard:
recovery_enabled: true
recovery_stability_ticks_drift: 10
# Under inventory_corridor_guard:
recovery_enabled: true
# No new stability param — uses existing corridor_lookback_ticks for exit
# Top-level (under strategy: or risk:, match existing nesting):
degraded_recovery:
  max_recovery_attempts_per_episode: 1
Apply to config.yaml, config.example.yaml, and config_live_stage1.yaml.
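For orientation, the new fields and their defaults could be mirrored on the Python side roughly as below. This is a sketch only: the class names are hypothetical stand-ins that hold just the new fields (the real DirectionalDriftGuardConfig and InventoryCorridorGuardConfig carry more), and no default is assumed for the existing corridor_lookback_ticks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnchorRecoveryFields:
    recovery_enabled: bool = True
    recovery_exit_bias_threshold_bps: float = 4.0
    recovery_exit_prevalence_pct: float = 30.0
    recovery_stability_ticks: int = 30


@dataclass(frozen=True)
class DriftRecoveryFields:
    recovery_enabled: bool = True
    recovery_stability_ticks_drift: int = 10


@dataclass(frozen=True)
class CorridorRecoveryFields:
    corridor_lookback_ticks: int  # existing param; value must be supplied
    recovery_enabled: bool = True

    @property
    def exit_stability_ticks(self) -> int:
        # Exit reuses the entry lookback, per the Q1 ruling.
        return self.corridor_lookback_ticks


@dataclass(frozen=True)
class DegradedRecoveryFields:
    max_recovery_attempts_per_episode: int = 1
```

Exposing the corridor exit window as a derived property, not a stored field, makes it impossible for the exit and entry windows to drift apart in config.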
Green Light
All four decisions ruled. No blockers. Create the branch and begin code.
Deliver to: patches/feat-flag-042-degraded-recovery/
Vesper reviews before merge. S45 runs after merge.
— Vesper