Atlas Ruling: Replace recovery_exhausted_halt with Cool-Down Session Model

To: Vesper (she/her), Orion (he/him)
From: Atlas (he/him)
CC: Katja (Captain)
Date: 2026-04-21
Re: Spec question - idle session model (filed from NEO Desk escalations)

Final Ruling

Option A approved. Replace recovery_exhausted_halt with a cool-down model.

Diagnosis

The current FLAG-042 design solved oscillation, but solved it by converting a recoverable market-state problem into an operator restart problem. That is too rigid.

The one-recovery-attempt cap collapses two distinct cases into one outcome:
1. Borderline oscillation that should not flap endlessly
2. Persistently hostile regime that may normalize later in the same session

Both currently become: recover once → fail once → halt. That is too blunt.

Locked principle: A bad regime is not, by itself, a reason to terminate the session. A bad regime is a reason to stop trading and keep observing.

Termination should happen because:
- Session duration elapsed
- Truth/integrity failed
- A hard safety boundary was crossed
- Repeated recovery attempts indicate a deeper control failure

Not merely because the market stayed bad for a few minutes.

Approved Model

Replace:
- Per-episode attempts counter
- recovery_exhausted_halt

With:
- Per-source cool-down timer
- Repeated recovery opportunities after cool-down expires
- Session duration as the outer safety wall

This preserves anti-oscillation protection, the ability to idle through hostile conditions, and the ability to re-enter when the regime clears.

Configuration Ruling

recovery_cooldown_ticks: 120

Approved as the default. At current cadence (~4s), that is roughly 8 minutes — long enough to avoid tick-to-tick thrashing, short enough to permit real intra-session recovery in a 2–4 hour run. Make it config-driven and tunable.
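The tick-to-wall-clock arithmetic above can be sketched as a small config object. This is only an illustration: `RecoveryConfig` and `tick_interval_seconds` are hypothetical names, and the 4-second cadence is the approximate figure quoted above, not a measured value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryConfig:
    # Default approved in this ruling; meant to be overridden from config.
    recovery_cooldown_ticks: int = 120
    # Hypothetical parameter: the ruling only states a cadence of roughly 4 s.
    tick_interval_seconds: float = 4.0

    def cooldown_seconds(self) -> float:
        """Approximate wall-clock length of one cool-down window."""
        return self.recovery_cooldown_ticks * self.tick_interval_seconds

    def cooldown_minutes(self) -> float:
        """Same window expressed in minutes."""
        return self.cooldown_seconds() / 60.0


default = RecoveryConfig()
tuned = RecoveryConfig(recovery_cooldown_ticks=60)  # example of a tuned value
```

With the defaults, `default.cooldown_minutes()` gives 8.0, matching the estimate above; halving the tick count halves the window.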

Required Safety Constraints

1. Cool-down is per guard source

Track independently for anchor saturation, directional drift, and inventory corridor. A failed anchor recovery should not block a truth-layer recovery path, and vice versa.
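One possible shape for independent per-source tracking is sketched below. The `GuardSource` enum, `SourceState`, and the helper functions are hypothetical names chosen for illustration; only the three sources and the `cooldown_until_tick` / `degraded_episode_count` fields come from this ruling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class GuardSource(Enum):
    ANCHOR_SATURATION = "anchor_saturation"
    DIRECTIONAL_DRIFT = "directional_drift"
    INVENTORY_CORRIDOR = "inventory_corridor"


@dataclass
class SourceState:
    cooldown_until_tick: int = 0      # recovery evaluation allowed from this tick on
    degraded_episode_count: int = 0   # DEGRADED entries for this source this session


def new_session_state() -> Dict[GuardSource, SourceState]:
    """Fresh, independent state for every guard source at session start."""
    return {source: SourceState() for source in GuardSource}


def recovery_suppressed(state: Dict[GuardSource, SourceState],
                        source: GuardSource, tick: int) -> bool:
    """True while this source's own cool-down window is still running."""
    return tick < state[source].cooldown_until_tick


state = new_session_state()
state[GuardSource.ANCHOR_SATURATION].cooldown_until_tick = 120
```

Because each source owns its own `SourceState`, a cool-down on anchor saturation leaves the drift and inventory timers untouched.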

2. DEGRADED is the only state during cool-down

During cool-down: no quoting, no order placement, continue observation and truth checks, log that recovery is suppressed. This is an idle state, not a soft-active state.

3. Cool-down does not reset on every hostile tick

The timer is set when a recovery fails and the guard re-fires. It is NOT extended continuously because the hostile condition persists. Reason: otherwise you create an unbounded sliding lockout. Discrete waiting windows, not a moving prison.
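A minimal sketch of the "discrete window" rule, assuming a hypothetical `CooldownTimer` class with event hooks named `on_recovery_failed` and `on_hostile_tick` (neither name comes from the codebase). The point it illustrates is that only a failed recovery arms the timer; hostile ticks never push the deadline out.

```python
class CooldownTimer:
    def __init__(self, cooldown_ticks: int = 120) -> None:
        self.cooldown_ticks = cooldown_ticks
        self.cooldown_until_tick = 0

    def on_recovery_failed(self, tick: int) -> None:
        """Arm one fixed window when a recovery fails and the guard re-fires."""
        self.cooldown_until_tick = tick + self.cooldown_ticks

    def on_hostile_tick(self, tick: int) -> None:
        """Deliberately a no-op: persistent hostility must not extend the window."""

    def is_active(self, tick: int) -> bool:
        return tick < self.cooldown_until_tick


timer = CooldownTimer(cooldown_ticks=120)
timer.on_recovery_failed(tick=1000)
for t in range(1001, 1120):
    timer.on_hostile_tick(t)  # hostile the whole time, deadline stays put
```

After the loop the deadline is still tick 1120, so recovery evaluation becomes possible again at a known point even if conditions never improved in between.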

4. Recovery still requires hysteresis + stability

Cool-down changes when recovery evaluation is allowed, not how recovery is judged. Exit from DEGRADED still requires: exit threshold below entry threshold, sustained stability window, no single-tick exits.
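The separation between "when evaluation is allowed" and "how recovery is judged" might look like the sketch below. `RecoveryGate`, its field names, and the use of a single scalar `metric` are illustrative assumptions; the existing hysteresis and stability logic is not shown in this ruling and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class RecoveryGate:
    entry_threshold: float      # metric at or above this puts the guard in DEGRADED
    exit_threshold: float       # must sit below entry_threshold (hysteresis)
    stability_window: int       # consecutive calm ticks required before exit
    cooldown_until_tick: int = 0
    stable_ticks: int = 0

    def __post_init__(self) -> None:
        if not self.exit_threshold < self.entry_threshold:
            raise ValueError("exit threshold must be below entry threshold")
        if self.stability_window < 2:
            raise ValueError("single-tick exits are not allowed")

    def evaluate(self, tick: int, metric: float) -> bool:
        """Return True only when exit from DEGRADED is permitted on this tick."""
        if tick < self.cooldown_until_tick:
            self.stable_ticks = 0   # cool-down: no evaluation, no credit accrued
            return False
        if metric < self.exit_threshold:
            self.stable_ticks += 1
        else:
            self.stable_ticks = 0   # includes values inside the hysteresis band
        return self.stable_ticks >= self.stability_window


gate = RecoveryGate(entry_threshold=1.0, exit_threshold=0.8,
                    stability_window=3, cooldown_until_tick=10)
results = [gate.evaluate(t, m) for t, m in
           [(5, 0.1), (10, 0.1), (11, 0.9), (12, 0.5), (13, 0.5), (14, 0.5)]]
```

In this run the calm reading at tick 5 is ignored because the cool-down is active, the reading of 0.9 at tick 11 resets the streak because it sits between the two thresholds, and exit is only granted at tick 14 after three consecutive readings below the exit threshold.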

5. No infinite flapping: episode cap as session-level backstop

If the same guard source enters DEGRADED more than N times in one session → HALT, where N is the configured cap:

max_degraded_episodes_per_source_per_session: 3

Model becomes: try → wait → try again → but not forever.
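The episode cap can be sketched as a per-source counter. `EpisodeTracker` and `on_enter_degraded` are hypothetical names, and the sketch reads "more than N times" literally, so the HALT fires on the entry that exceeds the cap; if the intent is to halt on reaching the cap, the comparison would change to `>=`.

```python
from collections import Counter


class EpisodeTracker:
    def __init__(self, max_degraded_episodes_per_source_per_session: int = 3) -> None:
        self.max_episodes = max_degraded_episodes_per_source_per_session
        self.degraded_episode_count: Counter = Counter()

    def on_enter_degraded(self, source: str) -> str:
        """Count the episode and return the resulting state for this source."""
        self.degraded_episode_count[source] += 1
        # Assumption: strictly "more than N" entries triggers the HALT.
        if self.degraded_episode_count[source] > self.max_episodes:
            return "HALT"
        return "DEGRADED"


tracker = EpisodeTracker()
outcomes = [tracker.on_enter_degraded("anchor_saturation") for _ in range(4)]
other = tracker.on_enter_degraded("directional_drift")
```

Here the first three anchor episodes stay in DEGRADED, the fourth halts, and the drift source still has its full allowance because counts are kept per source.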

6. Taxonomy changes

Remove: recovery_exhausted_halt
Add (logged state, not a halt): recovery_cooldown_active
Add to state/logs: degraded_episode_count, cooldown_until_tick, recovery_suppressed_by_cooldown
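As an illustration of how the new fields could appear together in a log entry, the sketch below builds one record per suppressed tick. The function name, the dict layout, and the `state`, `source`, and `tick` keys are assumptions; only `recovery_cooldown_active`, `degraded_episode_count`, `cooldown_until_tick`, and `recovery_suppressed_by_cooldown` come from the taxonomy above.

```python
from typing import Optional


def cooldown_log_record(source: str, tick: int, cooldown_until_tick: int,
                        degraded_episode_count: int) -> Optional[dict]:
    """Build a log record while cool-down is active; return None otherwise."""
    if tick >= cooldown_until_tick:
        return None  # cool-down expired, nothing to report under this state
    return {
        "state": "recovery_cooldown_active",   # logged state, not a halt
        "source": source,
        "tick": tick,
        "cooldown_until_tick": cooldown_until_tick,
        "degraded_episode_count": degraded_episode_count,
        "recovery_suppressed_by_cooldown": True,
    }


record = cooldown_log_record("inventory_corridor", tick=300,
                             cooldown_until_tick=420, degraded_episode_count=1)
```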

What Must NOT Change

- DEGRADED semantics (cancel all, stop quoting, continue observation)
- Truth-guard recovery behavior
- Thresholds (hysteresis unchanged)
- Session duration limit
- No per-tick recovery evaluation during cool-down (recovery is not evaluated until the timer expires)

The system remains: trade → detect → degrade → wait safely → re-evaluate → resume if safe.

Session Duration Ruling

Use 2-hour sessions (--duration-seconds 7200) as the default test length for this model. 4-hour sessions acceptable later if 2-hour runs behave cleanly. Do not jump straight to 4 hours.

Engineering Scope

Focused follow-on to FLAG-042. Not a redesign.

Required changes:
- Drop attempt-counter logic
- Add per-source cooldown tracking (cooldown_until_tick)
- Add per-source episode counting (degraded_episode_count)
- Add config params: recovery_cooldown_ticks, max_degraded_episodes_per_source_per_session
- Preserve hysteresis / stability logic unchanged

Required tests:
- Cool-down activation
- Cool-down suppression during active cool-down
- Post-cool-down re-evaluation
- Successful recovery after cool-down
- Repeated DEGRADED episodes → HALT at episode cap
- Cool-down and episode counter reset at session start

Bottom Line

Approved: replace recovery_exhausted_halt with cool-down. Default cooldown: 120 ticks (~8 min). Add per-source episode cap (max 3) as session-level backstop. Keep DEGRADED as idle observation state. Use 2-hour sessions first.

S43–S45 showed the current design is safe but too brittle. The cool-down model preserves safety and adds patience. That is the correct evolution.

— Atlas