
[C] Orion Audit — Session Closure Root Cause + Fix

To: Vesper (she/her)
CC: Katja (Captain)
From: Orion (he/him)
Date: 2026-04-18


TL;DR

Issue 1 (CRITICAL) — found, fixed, tested. Root cause is a misspelled attribute name in _shutdown(): it read _end_inv.total_value_rlusd (no _in_) from an InventorySnapshot dataclass whose actual field is total_value_in_rlusd. The AttributeError fired on every shutdown and was silently swallowed by a log.debug call, so close_session() never executed. As a consequence, every sessions row from S1 through S37 landed with ended_at = NULL. The fix is a one-line code change, a log-level promotion, and 4 regression tests. Patch attached.

Issue 2 (HIGH) — confirmed mechanism, hardening recommended, backup restore safe. S38 DB corruption is consistent with CTRL_CLOSE_EVENT on Windows (terminal X-button close). Python's signal module does not deliver this event — our FLAG-027 signal handlers catch SIGINT / SIGTERM / SIGBREAK but nothing else. The neo_live_stage1.db.bak.20260418T192119Z pre-S38 backup was taken via SQLite's atomic backup() API and is a consistent snapshot — safe to restore. Recommended hardening: add a periodic synchronous PRAGMA wal_checkpoint(TRUNCATE) on a 60-second timer, to bound the blast radius of any future hard-kill.

Recommendation: Clear to run S39 after applying the Issue 1 patch and restoring the backup. Issue 2 hardening is additive defense-in-depth and can ship on a later branch.


Issue 1 — Root Cause

The bug (neo_engine/main_loop.py:668, pre-fix)

self._state.close_session(
    ending_xrp=_end_inv.xrp_balance if _end_inv else 0.0,
    ending_rlusd=_end_inv.rlusd_balance if _end_inv else 0.0,
    ending_value=_end_inv.total_value_rlusd if _end_inv else 0.0,  # ← WRONG
    halt_reason=_final_reason,
)

InventorySnapshot (neo_engine/models.py:176) has fields:

xrp_balance, rlusd_balance, xrp_value_in_rlusd, total_value_in_rlusd,
xrp_pct, drift_pct, skew_tier, drift_velocity_pct_per_min

There is no total_value_rlusd attribute. _end_inv.total_value_rlusd raises AttributeError: 'InventorySnapshot' object has no attribute 'total_value_rlusd' every single time.

Verified in isolation:

$ python -c "from neo_engine.models import InventorySnapshot; \
             s = InventorySnapshot(); s.total_value_rlusd"
AttributeError: 'InventorySnapshot' object has no attribute 'total_value_rlusd'

The confusion is real: the DB column in valuation_snapshots is named total_value_rlusd (no _in_), while the in-memory dataclass field has _in_. The two are inconsistent. _shutdown() picked the DB spelling on a dataclass access.

Why it has been silent since S1

The attribute access is inside a try/except block (main_loop.py:655–672) whose handler was:

except Exception as exc:
    log.debug("Session close failed — continuing shutdown", extra={"error": str(exc)})

log.debug is filtered out at default log levels. The failure is invisible in every file log, every console log, every telemetry export.
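The failure mode can be reproduced in a few lines. The sketch below is illustrative only: Snapshot, close_session, and shutdown are stand-ins, not the engine's real classes, and the WARNING logger level is an assumption (any level above DEBUG behaves the same). It shows two things: the bad attribute access raises while the keyword arguments are being evaluated, so the callee never runs, and the log.debug record is dropped before it reaches any handler.

```python
import logging
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Stand-in for InventorySnapshot; only the relevant field is modelled."""
    total_value_in_rlusd: float = 0.0


calls = []  # every real invocation of the stand-in close_session lands here


def close_session(**kwargs):
    calls.append(kwargs)


class ListHandler(logging.Handler):
    """Collects emitted records so we can see what actually reached the log."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


log = logging.getLogger("shutdown_demo")
log.setLevel(logging.WARNING)  # assumed level, not taken from the engine config
log.propagate = False
handler = ListHandler()
log.addHandler(handler)


def shutdown(snapshot):
    try:
        # Arguments are evaluated before the call is made, so the misspelled
        # attribute raises AttributeError before close_session is entered.
        close_session(ending_value=snapshot.total_value_rlusd if snapshot else 0.0)
    except Exception as exc:
        log.debug("Session close failed", extra={"error": str(exc)})


shutdown(Snapshot(total_value_in_rlusd=42.0))
print(len(calls), len(handler.records))  # prints: 0 0
```

Neither the call nor the log record exists afterwards, which matches the symptom: no ended_at write and no trace of why.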

Answers to the four audit questions

| # | Question | Answer |
|---|----------|--------|
| 1 | Is close_session raising an exception on every call? | No — close_session is never reached. The AttributeError fires on the argument-evaluation line that computes ending_value, before the call. |
| 2 | Is _current_session_id None at shutdown time? | No. create_session() sets it; nothing else clears it before close_session runs. |
| 3 | Is the DB connection already closed? | No. _state.close() is in a finally block at line 678, strictly after the close_session try/except. Ordering is correct. |
| 4 | Is _transaction() failing in WAL mode? | No. _transaction() is never entered because close_session() is never called. |

The root cause is simpler than any of the hypotheses — it's a plain attribute typo in a silent-failure code path.

Git blame

$ git log -S "total_value_rlusd" -- neo_engine/main_loop.py
7eaed37  2026-04-12  Data: add session tracking for funded runs in stage1 DB

Introduced in the commit that first wired up close_session on shutdown. The bug has been present from the first session the table existed.


The Fix

Patch: orion-patches-2026-04-18-session-closure/0001-fix-session-closure-write-ended_at-total_value_in_rl.patch

Branch: fix/session-closure-ended-at — one commit, one production file + one new test file.

neo_engine/main_loop.py (line 668)

-            ending_value=_end_inv.total_value_rlusd if _end_inv else 0.0,
+            # Note: the InventorySnapshot field is `total_value_in_rlusd`. The
+            # prior spelling `total_value_rlusd` (no `_in_`) matches the
+            # valuation_snapshots DB column, not the in-memory dataclass, and
+            # raised AttributeError on every shutdown — silently swallowed by
+            # the debug log below. That caused every session row from S1–S37
+            # to land with ended_at = NULL. Audit: 2026-04-18.
+            ending_value=_end_inv.total_value_in_rlusd if _end_inv else 0.0,
         except Exception as exc:
-            log.debug("Session close failed — continuing shutdown", extra={"error": str(exc)})
+            # Promoted from log.debug: session close is a source-of-truth write —
+            # failures must not be silent.
+            log.error(
+                "Session close failed — continuing shutdown",
+                extra={"error": str(exc)},
+                exc_info=True,
+            )

tests/test_shutdown_ended_at.py (new, 4 tests)

| # | Test | What it pins |
|---|------|--------------|
| 1 | test_shutdown_populates_ended_at_on_sessions_row | End-to-end: real StateManager(":memory:"), open session, run _shutdown, assert ended_at IS NOT NULL |
| 2 | test_shutdown_passes_total_value_in_rlusd_as_ending_value | Spies on close_session kwargs to pin the attribute name |
| 3 | test_shutdown_handles_missing_session_gracefully | _current_session_id is None early-return path |
| 4 | test_close_session_failure_logs_at_error_level | Forces close_session to raise; asserts ERROR record emitted |

All 4 pass with fix. 3 of 4 fail without fix (negative control confirmed).
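For readers without the patch at hand, the spy and log-level checks (tests 2 and 4) could look roughly like the sketch below. It is not the contents of tests/test_shutdown_ended_at.py: shutdown here is a toy reduction of the patched close path, Snapshot is a trimmed stand-in for InventorySnapshot, and the two check_* helper names are hypothetical.

```python
import logging
from dataclasses import dataclass
from unittest.mock import MagicMock


@dataclass
class Snapshot:
    """Stand-in for InventorySnapshot, trimmed to the three fields used here."""
    xrp_balance: float
    rlusd_balance: float
    total_value_in_rlusd: float


log = logging.getLogger("shutdown_test_sketch")


def shutdown(state, end_inv, reason):
    """Toy version of the patched close path; not the engine's real _shutdown."""
    try:
        state.close_session(
            ending_xrp=end_inv.xrp_balance if end_inv else 0.0,
            ending_rlusd=end_inv.rlusd_balance if end_inv else 0.0,
            ending_value=end_inv.total_value_in_rlusd if end_inv else 0.0,
            halt_reason=reason,
        )
    except Exception as exc:
        log.error("Session close failed", extra={"error": str(exc)}, exc_info=True)


def check_ending_value_kwarg():
    """Spy on close_session and return the ending_value it received."""
    state = MagicMock()
    shutdown(state, Snapshot(10.0, 5.0, 25.0), "manual")
    state.close_session.assert_called_once()
    return state.close_session.call_args.kwargs["ending_value"]


def check_failure_logs_error():
    """Force close_session to raise and return the level names emitted."""
    levels = []

    class Collect(logging.Handler):
        def emit(self, record):
            levels.append(record.levelname)

    handler = Collect()
    log.addHandler(handler)
    try:
        state = MagicMock()
        state.close_session.side_effect = RuntimeError("db locked")
        shutdown(state, None, "manual")
    finally:
        log.removeHandler(handler)
    return levels
```

Reverting the attribute name in shutdown to total_value_rlusd makes the first check fail at assert_called_once, which is the negative-control behaviour described above.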


Git Commands for Katja (PowerShell, copy-paste)

Block 1 — branch off main

git checkout main
git pull origin main
git checkout -b fix/session-closure-ended-at

Block 2 — apply patch

$patch = "C:\path\to\Claude Homebase Neo\02 Projects\NEO Trading Engine\orion-patches-2026-04-18-session-closure\0001-fix-session-closure-write-ended_at-total_value_in_rl.patch"
git am --3way "$patch"

Block 3 — verify

git log --oneline main..HEAD
python -m pytest tests/test_shutdown_ended_at.py -v
Expected: 1 commit, 4 passed.

Block 4 — push and merge

git push origin fix/session-closure-ended-at
Merge via GitHub UI.


Issue 2 — DB Corruption Mechanism + Hardening

Confirmed mechanism

CTRL_CLOSE_EVENT (terminal X-button) is NOT caught by Python's signal module. Our FLAG-027 signal handlers cover SIGINT / SIGTERM / SIGBREAK only — the docstring explicitly documents this gap. S38 failure sequence: terminal X → OS gives 5s grace → WAL auto-checkpoint mid-flight → process killed → kernel drops file buffers → malformed DB.
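To make the coverage gap concrete, here is a minimal sketch of signal registration for the three signals named above. It is an assumption about the general shape of such handlers, not the FLAG-027 code itself; the function names are hypothetical. SIGBREAK is defined only on Windows, so the sketch filters by what the platform provides.

```python
import signal

# Signals the handlers are described as covering. SIGBREAK exists only on Windows.
SHUTDOWN_SIGNAL_NAMES = ("SIGINT", "SIGTERM", "SIGBREAK")


def available_shutdown_signals():
    """Return the subset of the names above that this platform defines."""
    return [name for name in SHUTDOWN_SIGNAL_NAMES if hasattr(signal, name)]


def install_shutdown_handlers(on_signal):
    """Register on_signal(signum, frame) for each available signal.

    signal.signal must be called from the main thread. Returns the list of
    signal names that were registered.
    """
    installed = []
    for name in available_shutdown_signals():
        signal.signal(getattr(signal, name), on_signal)
        installed.append(name)
    return installed


# The signal module defines no constant for the console close event, so there
# is nothing to pass to signal.signal for it; this is expected to print False.
print(hasattr(signal, "CTRL_CLOSE_EVENT"))
```

Because there is no handle to register against, a terminal X-button close bypasses any shutdown logic wired through this route, regardless of how complete that logic is.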

Backup is safe

neo_live_stage1.db.bak.20260418T192119Z was created via SQLite's backup() API — a consistent snapshot at backup time, not a file copy. S38 writes are not in the backup and not recoverable from the malformed DB. Restore from backup is the correct path.
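For reference, a backup of this kind and the PRAGMA quick_check called for in the summary can both be done from Python's sqlite3 module. The sketch below is illustrative: the function names, the throwaway file names, and the toy sessions table are assumptions, not the engine's backup tooling or schema.

```python
import os
import sqlite3
import tempfile


def backup_database(src_path, dst_path):
    """Copy src to dst using sqlite3's online backup API."""
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        src.backup(dst)  # Connection.backup, available since Python 3.7
    finally:
        dst.close()
        src.close()


def quick_check_ok(db_path):
    """Return True when PRAGMA quick_check reports a single 'ok' row."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("PRAGMA quick_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]


def demo():
    """Build a throwaway DB, back it up, and verify the copy."""
    workdir = tempfile.mkdtemp()
    live = os.path.join(workdir, "live.db")
    bak = os.path.join(workdir, "live.db.bak")

    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE sessions (id INTEGER PRIMARY KEY, ended_at TEXT)")
    conn.execute("INSERT INTO sessions (ended_at) VALUES (NULL)")
    conn.commit()
    conn.close()

    backup_database(live, bak)

    copy = sqlite3.connect(bak)
    try:
        count = copy.execute("SELECT COUNT(*) FROM sessions").fetchone()[0]
    finally:
        copy.close()
    return quick_check_ok(bak), count
```

Running quick_check_ok against the restored file before starting S39 gives a yes/no gate that is easy to script.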

Hardening recommendation — FLAG-035

Add a periodic PRAGMA wal_checkpoint(TRUNCATE) on a 60-second timer. After each successful checkpoint the WAL file is truncated to zero bytes, which bounds the blast radius of any future hard-kill to ≤60 seconds of writes. Implementation sketch provided (threading.Event-based loop, daemon thread, stop on shutdown). Ship as FLAG-035 on a separate branch after S39 confirms clean behavior.
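As an illustration of the shape described above (not Orion's actual sketch and not the FLAG-035 implementation), a checkpoint loop could look like this. The class and function names are hypothetical, and the checkpointer opens its own short-lived connection on the assumption that the engine's main connection should not be shared across threads.

```python
import os
import sqlite3
import tempfile
import threading


def checkpoint_once(db_path):
    """Run one TRUNCATE checkpoint on a short-lived connection.

    Returns SQLite's (busy, log_frames, checkpointed_frames) row.
    """
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    finally:
        conn.close()


class PeriodicCheckpointer:
    """Daemon thread that checkpoints every interval_s seconds until stopped."""

    def __init__(self, db_path, interval_s=60.0):
        self._db_path = db_path
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns True as soon as stop() is called, which ends the
        # loop without waiting out the remainder of the interval.
        while not self._stop.wait(self._interval_s):
            try:
                checkpoint_once(self._db_path)
            except sqlite3.Error:
                pass  # a real implementation would log this at error level

    def start(self):
        self._thread.start()

    def stop(self, timeout=5.0):
        """Signal the loop to exit; returns True if the thread has finished."""
        self._stop.set()
        self._thread.join(timeout)
        return not self._thread.is_alive()


def demo():
    """Write to a WAL-mode DB, checkpoint once, and report the WAL size."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    writer = sqlite3.connect(path)
    writer.execute("PRAGMA journal_mode=WAL").fetchone()
    writer.execute("CREATE TABLE t (x INTEGER)")
    writer.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(100)])
    writer.commit()
    busy = checkpoint_once(path)[0]
    wal_size = os.path.getsize(path + "-wal")
    writer.close()
    return busy, wal_size
```

The sketch ignores the checkpoint result inside the loop. A production version would probably want to inspect the busy flag, since a checkpoint blocked by a long-running reader leaves the WAL untruncated and the 60-second bound no longer holds for that cycle.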


Summary

| Priority | Item | Status |
|----------|------|--------|
| 1 | Fix ended_at write failure | ✅ Patch ready — branch fix/session-closure-ended-at, 4 tests |
| 2 | log.debug → log.error on shutdown failure | ✅ In same commit |
| 3 | WAL checkpoint cadence hardening | Designed, not built — FLAG-035, separate branch |
| 4 | Backup restore confirmed safe | ✅ Confirmed — restore and run PRAGMA quick_check |

Clear to run S39 once patch lands on main and PRAGMA quick_check on restored DB returns ok.

What stays unchanged: anchor_max_divergence_bps: 10.0, base_size_rlusd: 15.0, risk_engine.py, fill paths, schema, Phase 7.2 CLOB switch.

— Orion