Orion Acknowledgment — Branch #7 Rulings Accepted + Paste Request

To: Katja
CC: Vesper, Atlas
From: Orion
Date: 2026-04-19


Rulings accepted

Q1–Q5 + Atlas additions locked in. Committing to the following design, no deviations.

| Item | Ruling | Implementation |
|---|---|---|
| Q1 Cadence | No minimum; ≤0 → disabled | Loader coerces negatives to 0. Positive values honored verbatim. |
| Q2 Thread start | Explicit opt-in | StateManager.start_wal_checkpoint_loop(interval_s) called from NEOEngine._startup(). __init__ touches nothing. |
| Q3 TRUNCATE failure | Log-and-continue | close() wraps TRUNCATE in try/except with log.error(exc_info=True). Connection still closes. |
| Q4 Observability | (b) required | Per-checkpoint log.info with mode/busy/log_frames/checkpointed_frames/elapsed_ms. End-of-session aggregate n/p50/p95/max emitted once inside close(), after thread join. |
| Q5 Window | deque(maxlen=512) | self._checkpoint_latencies: deque[float] = deque(maxlen=512). |
| Atlas #1 | 200ms warning | Extra log.warning("wal_checkpoint slow", ...) when elapsed_ms > 200. Emitted in addition to the info line. |
| Atlas #2 | No overlap audit | Commit body documents the single-thread loop as the overlap-prevention mechanism. No code change. |
| Atlas #3 | Shutdown ordering | close() docstring documents: _checkpoint_stop.set() → thread.join(timeout=5.0) → _log_checkpoint_aggregate() → TRUNCATE → _conn.close(). |
| Atlas #4 | Failure visibility | Every except in the checkpoint path: log.error(..., exc_info=True). No silent pass. |
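
For orientation, here is a minimal sketch of how the rulings above could fit together. It is not the Branch #7 code: CheckpointingStore is a hypothetical stand-in for StateManager, the single shared autocommit connection guarded by a lock is an assumption, and the aggregate line is reduced to a sample count as a placeholder.

```python
import logging
import sqlite3
import threading
import time
from collections import deque
from typing import Optional

log = logging.getLogger("wal_checkpoint_sketch")


class CheckpointingStore:
    """Hypothetical stand-in for StateManager; only the checkpoint path is shown."""

    def __init__(self, path: str) -> None:
        # Assumption: one shared autocommit connection guarded by a lock.
        # No thread is started here (Q2).
        self._conn = sqlite3.connect(path, check_same_thread=False, isolation_level=None)
        self._conn.execute("PRAGMA journal_mode=WAL")
        self._lock = threading.Lock()
        self._checkpoint_stop = threading.Event()
        self._checkpoint_thread: Optional[threading.Thread] = None
        self._checkpoint_latencies: deque[float] = deque(maxlen=512)  # Q5 window

    def start_wal_checkpoint_loop(self, interval_s: float) -> bool:
        """Explicit opt-in (Q2). Returns False when disabled (Q1: <= 0)."""
        if interval_s <= 0:
            return False
        self._checkpoint_thread = threading.Thread(
            target=self._checkpoint_loop, args=(interval_s,), daemon=True
        )
        self._checkpoint_thread.start()
        return True

    def _checkpoint_loop(self, interval_s: float) -> None:
        # Event.wait returns True once stop is set, which ends the loop.
        # A single thread runs checkpoints one after another, so they cannot overlap.
        while not self._checkpoint_stop.wait(interval_s):
            try:
                self._run_checkpoint("PASSIVE")
            except sqlite3.Error:
                log.error("wal_checkpoint failed", exc_info=True)  # Atlas #4

    def _run_checkpoint(self, mode: str) -> float:
        started = time.perf_counter()
        with self._lock:
            # The pragma returns one row: (busy, log frames, checkpointed frames).
            row = self._conn.execute(f"PRAGMA wal_checkpoint({mode})").fetchone()
        elapsed_ms = (time.perf_counter() - started) * 1000.0
        self._checkpoint_latencies.append(elapsed_ms)
        fields = {
            "mode": mode, "busy": row[0], "log_frames": row[1],
            "checkpointed_frames": row[2], "elapsed_ms": elapsed_ms,
        }
        log.info("wal_checkpoint", extra=fields)  # Q4 per-checkpoint line
        if elapsed_ms > 200:
            log.warning("wal_checkpoint slow", extra=fields)  # Atlas #1
        return elapsed_ms

    def close(self) -> None:
        """Order: stop.set() -> join(5.0) -> aggregate -> TRUNCATE -> conn.close()."""
        self._checkpoint_stop.set()
        if self._checkpoint_thread is not None:
            self._checkpoint_thread.join(timeout=5.0)
        # Placeholder for the n/p50/p95/max aggregate (Q4).
        log.info("wal_checkpoint aggregate", extra={"n": len(self._checkpoint_latencies)})
        try:
            self._run_checkpoint("TRUNCATE")
        except sqlite3.Error:
            log.error("wal_checkpoint TRUNCATE failed", exc_info=True)  # Q3
        finally:
            self._conn.close()  # connection closes even if TRUNCATE failed
```

Using threading.Event.wait as the sleep lets close() interrupt a long interval immediately instead of waiting for the next tick.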

Percentile helper: using statistics.quantiles(method="inclusive") — covers p50/p95 cleanly at 512 samples and avoids pulling Branch #6's private _percentile across module boundaries. Will verify values against the Branch #6 unit-test fixtures (1..20 → p50=10.5, p95=19.05) as a sanity check.
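
A small sketch of that helper, assuming the aggregate is built from the latency window as a plain dict. The function name latency_summary and the choice to report None for p50/p95 when fewer than two samples exist are illustrative assumptions, not settled design.

```python
import statistics
from typing import Iterable, Optional


def latency_summary(samples: Iterable[float]) -> dict:
    """Return n/p50/p95/max for a latency window (hypothetical helper)."""
    data = list(samples)
    p50: Optional[float] = None
    p95: Optional[float] = None
    if len(data) >= 2:
        # n=100 yields 99 cut points; index i-1 holds the i-th percentile.
        # Older Python versions raise StatisticsError for fewer than two points,
        # hence the guard above.
        cuts = statistics.quantiles(data, n=100, method="inclusive")
        p50, p95 = cuts[49], cuts[94]
    return {
        "n": len(data),
        "p50": p50,
        "p95": p95,
        "max": max(data) if data else None,
    }


# Sanity check against the 1..20 fixture values quoted above.
summary = latency_summary(range(1, 21))
print(summary["p50"], summary["p95"])
```

With the inclusive method the input minimum and maximum are treated as the 0th and 100th percentiles, which is why the 1..20 fixture lands on 10.5 and 19.05.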

Commit spec (single commit)

Subject: feat(state): periodic WAL checkpoint + TRUNCATE at shutdown (FLAG-035)

Files:
- neo_engine/state_manager.py: new start_wal_checkpoint_loop, _checkpoint_loop, _run_checkpoint, _log_checkpoint_aggregate; close() extended.
- neo_engine/config.py: EngineConfig.wal_checkpoint_interval_seconds: int = 60; loader with negative-coercion to 0.
- neo_engine/main_loop.py: one-line call in _startup() after StateManager(...) construction.
- config/config.yaml: add wal_checkpoint_interval_seconds: 60 under engine:.
- config/config.example.yaml: same.
- tests/test_wal_checkpoint_hardening.py: new, 4 tests.
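
The config change could look roughly like the sketch below. The real EngineConfig and loader have not been pasted yet, so everything except the field name, its default of 60, and the negative-to-0 coercion is an assumption; load_wal_checkpoint_interval is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class EngineConfig:
    # Only the new field is shown; the real dataclass carries other fields.
    wal_checkpoint_interval_seconds: int = 60


def load_wal_checkpoint_interval(engine_raw: Mapping[str, Any]) -> int:
    """Read the interval from the raw engine: mapping (hypothetical helper).

    Negative values are coerced to 0, which means "disabled".
    Positive values pass through unchanged; no minimum is enforced.
    """
    raw = engine_raw.get("wal_checkpoint_interval_seconds", 60)
    return max(0, int(raw))


cfg = EngineConfig(
    wal_checkpoint_interval_seconds=load_wal_checkpoint_interval(
        {"wal_checkpoint_interval_seconds": -5}
    )
)
print(cfg.wal_checkpoint_interval_seconds)  # 0, i.e. checkpoint loop disabled
```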

Tests (all on-disk via tempfile.TemporaryDirectory()):

  1. test_periodic_checkpoint_logs_elapsed_and_counters — ≥2 PASSIVE iterations, aggregate emitted at close with non-None p50/p95.
  2. test_concurrent_writes_and_checkpoint_preserve_integrity — 200 writes against an active checkpoint loop + PRAGMA quick_check == "ok" + row count intact.
  3. test_shutdown_truncate_leaves_empty_wal — TRUNCATE at shutdown zeroes or removes the -wal sidecar.
  4. test_slow_checkpoint_emits_warning — mock checkpoint path to exceed 200ms; assert log.warning with elapsed_ms in extra.
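
As a reference point for tests 2 and 3, the sketch below exercises the underlying SQLite behavior directly with the standard sqlite3 module, without StateManager. The function name, table name, and row count are illustrative assumptions.

```python
import os
import sqlite3
import tempfile


def truncate_wal_roundtrip(rows: int = 200) -> tuple:
    """Write rows in WAL mode, run a TRUNCATE checkpoint, report the effect.

    Returns (wal_bytes_before, wal_bytes_after, quick_check_result, row_count).
    """
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "state.db")
        wal_path = db_path + "-wal"
        conn = sqlite3.connect(db_path)
        try:
            conn.execute("PRAGMA journal_mode=WAL")
            conn.execute("CREATE TABLE t (v INTEGER)")
            conn.executemany("INSERT INTO t (v) VALUES (?)", [(i,) for i in range(rows)])
            conn.commit()  # the checkpoint must not run inside an open write
            before = os.path.getsize(wal_path)
            conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
            # "Zeroes or removes": treat a missing sidecar as size 0.
            after = os.path.getsize(wal_path) if os.path.exists(wal_path) else 0
            quick = conn.execute("PRAGMA quick_check").fetchone()[0]
            count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        finally:
            conn.close()  # close before the temp directory is removed
    return before, after, quick, count


before, after, quick, count = truncate_wal_roundtrip()
print(before > 0, after, quick, count)
```

Closing the connection inside the with block matters on Windows, where the temporary directory cannot be deleted while the database file is still open.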

Paste request — test-drift rule

Sandbox is cut from 99483c6 (Branch #6 tip). Before I touch any pre-existing file I need your head-of-main slices from C:\Users\Katja\Documents\NEO GitHub\neo-2026\ so I can patch against what you actually have on disk, not what the sandbox thinks exists.

Please paste the following slices verbatim (including surrounding blank lines so I can anchor the patches):

1. neo_engine/state_manager.py
   - The StateManager.__init__ method (from def __init__ through its last line).
   - The StateManager.close method (from def close through its last line).
   - The top-of-file imports block (first ~30 lines). I need to see whether threading, time, collections, and statistics are already imported.

2. neo_engine/config.py
   - The EngineConfig dataclass (from @dataclass through its closing line).
   - The loader block that populates EngineConfig from engine_raw (the ~6–10 lines reading engine_raw.get("tick_interval_seconds", ...) etc.).

3. neo_engine/main_loop.py
   - The first ~15 lines of _startup(), specifically the block where self._state = StateManager(...) is constructed. I need to see what comes immediately after it so I can place the start_wal_checkpoint_loop call cleanly.

4. config/config.yaml
   - The full engine: block (from engine: through the last indented key before the next top-level key).

5. config/config.example.yaml
   - Same: the full engine: block.

Once I have those, I will cut the commit, run the 4 tests plus the full suite to check for regressions, produce the patch, and deliver the memo. Target: same day.

Standing by for pastes.

— Orion