
Orion Tasking — fix/startup-mode-reset

To: Orion (he/him)
From: Vesper (she/her)
CC: Katja (Captain), Atlas (he/him)
Date: 2026-04-21
Priority: BLOCKING. S43 cannot run until this is fixed.


Context

S43 (session_id=44) halts immediately on tick 1 with unexpected_halt. The unexpected_halt token is the FLAG-041 fallback — _tick() returned False but no specific halt.reason was written. Engine ran for 0.94s total.

The FLAG-041 fix surfaced this previously hidden bug. Before the fix it would have appeared as engine_requested_halt and been ignored.


Root Cause — CONFIRMED by Vesper code audit

What writes the bad state

S42 ended via DEGRADED→HALT escalation. _escalate_degraded_to_halt() (main_loop.py:1453–1478) writes:

self._state.set_engine_state("inventory_truth.mode", "halt")
self._state.set_engine_state("halt.reason", HALT_REASON_INVENTORY_TRUTH)
self._state.set_engine_state("halt.detail", reason_detail)
This persists in the DB.

What S43 startup does

In _startup() at lines 578–586, on fresh session start (parent_session_id is None):

self._state.set_engine_state("halt.reason", "")
self._state.set_engine_state("halt.detail", "")
halt.reason and halt.detail are cleared. But inventory_truth.mode is NOT cleared.

What the truth check does

The startup truth check runs and returns status=ok (deltas are near-zero after realignment). Then _apply_truth_check_result(result) is called. Its ok branch (lines 2193–2196):

if result.status == "ok":
    if self._current_truth_mode() == MODE_DEGRADED:
        self._exit_degraded_mode()
    return
It checks if mode is MODE_DEGRADED — but mode is MODE_HALT. The condition is False. Mode stays halt.

What tick 1 does

At the top of _tick() (lines 2290–2293):

self._maybe_run_periodic_truth_check()
# If the periodic check just escalated into HARD HALT, stop the
# loop immediately — don't run a tick against known-bad truth.
if self._current_truth_mode() == MODE_HALT:
    return False
_maybe_run_periodic_truth_check() does nothing here: the startup check already set _last_truth_check_ts, so tick 1 falls inside the 60s interval. Mode is still halt, so _tick() returns False without writing a halt.reason. _shutdown is then called with no kwarg and finds no pre-written reason → HALT_REASON_UNEXPECTED. That is the unexpected_halt seen in S43.
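The chain above can be reproduced in isolation with a minimal sketch. Everything here is a hypothetical stand-in: FakeState replaces the persisted engine_state store, and the four functions mirror only the control flow described in this audit, not the real methods in main_loop.py. The "inventory_truth" reason string and the detail text are placeholder values.

```python
# Hypothetical stand-ins: only the key names and the control flow come from
# the audit above; the classes and functions themselves are illustrative.
MODE_OK, MODE_DEGRADED, MODE_HALT = "ok", "degraded", "halt"
KEY_MODE = "inventory_truth.mode"


class FakeState:
    """In-memory substitute for the persisted engine_state store."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def set_engine_state(self, key, value):
        self.rows[key] = value

    def get_engine_state(self, key, default=""):
        return self.rows.get(key, default)


def startup_cleanup(state, parent_session_id):
    # Pre-fix behaviour: only the halt.* keys are cleared on a fresh start.
    if parent_session_id is None:
        state.set_engine_state("halt.reason", "")
        state.set_engine_state("halt.detail", "")


def apply_truth_check_ok(state):
    # The ok branch only leaves DEGRADED; a stale HALT is left untouched.
    if state.get_engine_state(KEY_MODE) == MODE_DEGRADED:
        state.set_engine_state(KEY_MODE, MODE_OK)


def tick(state):
    # The periodic check is skipped inside the interval, so only the guard runs.
    if state.get_engine_state(KEY_MODE) == MODE_HALT:
        return False
    return True


def resolve_halt_reason(state):
    # Fallback used when no reason was pre-written (the FLAG-041 token).
    return state.get_engine_state("halt.reason") or "unexpected_halt"


# State left behind by the prior session's DEGRADED -> HALT escalation.
state = FakeState({
    KEY_MODE: MODE_HALT,
    "halt.reason": "inventory_truth",  # placeholder value
    "halt.detail": "placeholder detail",
})
startup_cleanup(state, parent_session_id=None)
apply_truth_check_ok(state)
tick_result = tick(state)
reason = resolve_halt_reason(state)
print(tick_result, reason)  # False unexpected_halt
```

The sketch shows that no single step is wrong on its own terms; the stale mode simply survives every step that could have cleared it.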


Fix Specification

Change 1 — neo_engine/main_loop.py, startup cleanup block (~lines 578–586)

In the if parent_session_id is None: block that clears stale halt keys, add clearing of inventory_truth.mode, inventory_truth.degraded_since, and inventory_truth.degraded_reason.

Current code:

if parent_session_id is None:
    try:
        self._state.set_engine_state("halt.reason", "")
        self._state.set_engine_state("halt.detail", "")
    except Exception:
        log.debug(
            "Failed to clear halt.reason on fresh session startup",
            exc_info=True,
        )

Target code:

if parent_session_id is None:
    try:
        self._state.set_engine_state("halt.reason", "")
        self._state.set_engine_state("halt.detail", "")
        # FLAG-041 follow-up: also reset stale inventory_truth mode from
        # a prior session. A prior session ending in HALT (e.g. DEGRADED→
        # HALT escalation) persists MODE_HALT in engine_state. Without
        # this reset, tick 1 of the new session sees MODE_HALT and
        # immediately returns False — producing unexpected_halt even when
        # the startup truth check returned ok.
        # Recovery restarts (parent_session_id != None) deliberately
        # preserve mode so the escalation context carries through.
        self._state.set_engine_state(KEY_MODE, MODE_OK)
        self._state.set_engine_state(KEY_DEGRADED_SINCE, "")
        self._state.set_engine_state(KEY_DEGRADED_REASON, "")
    except Exception:
        log.debug(
            "Failed to clear halt.reason / truth-mode on fresh session startup",
            exc_info=True,
        )

Note: KEY_MODE, KEY_DEGRADED_SINCE, KEY_DEGRADED_REASON, and MODE_OK are already imported from inventory_truth_checker at the top of main_loop.py (around lines 71–74), so no new imports are needed.


Tests Required

Add to tests/test_halt_reason_lifecycle.py (existing file) — new class TestStartupModeReset:

Test 1: Fresh session after HALT clears MODE_HALT
- Simulate prior session leaving inventory_truth.mode = halt in engine_state DB
- Call _startup() on a new session (no parent_session_id)
- Assert inventory_truth.mode == ok after startup

Test 2: Fresh session after DEGRADED clears degraded keys
- Simulate prior session leaving inventory_truth.mode = degraded + inventory_truth.degraded_since set
- Call _startup() on a new fresh session
- Assert inventory_truth.mode == ok and inventory_truth.degraded_since == ""

Test 3: Recovery restart preserves MODE_HALT
- Simulate prior session leaving inventory_truth.mode = halt
- Call _startup() with parent_session_id set (recovery restart)
- Assert inventory_truth.mode is still halt (preserved, because a recovery restart is deliberate)
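As a rough guide to the shape of these three tests, here is a sketch that runs against a stand-in rather than the real engine. FakeState and startup_cleanup are hypothetical: startup_cleanup mirrors the target cleanup block from Change 1, and the key strings for degraded_since and degraded_reason are taken from the key names in this memo. The real tests should call _startup() on the engine and use whatever fixtures tests/test_halt_reason_lifecycle.py already provides.

```python
# Sketch only: FakeState and startup_cleanup are hypothetical stand-ins for
# the engine's state store and the _startup() cleanup block after the fix.
MODE_OK, MODE_DEGRADED, MODE_HALT = "ok", "degraded", "halt"
KEY_MODE = "inventory_truth.mode"
KEY_DEGRADED_SINCE = "inventory_truth.degraded_since"
KEY_DEGRADED_REASON = "inventory_truth.degraded_reason"


class FakeState:
    """In-memory substitute for the persisted engine_state store."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def set_engine_state(self, key, value):
        self.rows[key] = value

    def get_engine_state(self, key, default=""):
        return self.rows.get(key, default)


def startup_cleanup(state, parent_session_id):
    """Mirror of the target cleanup block: reset only on a fresh session."""
    if parent_session_id is None:
        state.set_engine_state("halt.reason", "")
        state.set_engine_state("halt.detail", "")
        state.set_engine_state(KEY_MODE, MODE_OK)
        state.set_engine_state(KEY_DEGRADED_SINCE, "")
        state.set_engine_state(KEY_DEGRADED_REASON, "")


class TestStartupModeReset:
    def test_fresh_session_after_halt_clears_mode_halt(self):
        state = FakeState({KEY_MODE: MODE_HALT})
        startup_cleanup(state, parent_session_id=None)
        assert state.get_engine_state(KEY_MODE) == MODE_OK

    def test_fresh_session_after_degraded_clears_degraded_keys(self):
        state = FakeState({
            KEY_MODE: MODE_DEGRADED,
            KEY_DEGRADED_SINCE: "2026-04-20T00:00:00Z",  # arbitrary sample value
        })
        startup_cleanup(state, parent_session_id=None)
        assert state.get_engine_state(KEY_MODE) == MODE_OK
        assert state.get_engine_state(KEY_DEGRADED_SINCE) == ""

    def test_recovery_restart_preserves_mode_halt(self):
        state = FakeState({KEY_MODE: MODE_HALT})
        startup_cleanup(state, parent_session_id=43)  # any non-None id
        assert state.get_engine_state(KEY_MODE) == MODE_HALT
```

Test 3 is the one that guards the scope boundary: it fails if the reset is ever moved outside the parent_session_id is None branch.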

Minimum passing bar: the 3 new tests pass and the existing 251 regression tests stay green.


Branch Name

fix/startup-mode-reset


Delivery Requirements (standing rules)

  1. No pre-creating branch during investigation. Branch created only when committing.
  2. Apply instructions must use Get-ChildItem ... -Filter "*.patch" | Sort-Object Name | ForEach-Object { git am $_.FullName } form — no *.patch glob.
  3. Always include git branch -D fix/startup-mode-reset 2>$null before git checkout -b fix/startup-mode-reset.
  4. Patches go in: C:\Users\Katja\Documents\Claude Homebase Neo\02 Projects\NEO Trading Engine\patches\fix-startup-mode-reset\
  5. Deliver to: 03 Branches/fix-startup-mode-reset/[C] Orion Delivery — fix-startup-mode-reset.md

Scope Boundary

This is a narrow one-file fix: neo_engine/main_loop.py startup cleanup block only. Do not touch _apply_truth_check_result or _exit_degraded_mode — those are correct for their in-session purpose. The fix is in the startup lifecycle, not the state machine.

No config changes. No schema changes.

— Vesper