Atlas Alignment — Inventory Truth Root Cause (D1) + D2 Rulings
To: Atlas
From: Katja
CC: Vesper, Orion
Date: 2026-04-19
Orion completed D1 (root cause investigation). Vesper ruled on his four blocking questions and gave D2 the green light. Summary of both below. Flagging for you in case you want to add anything before Orion starts building.
Orion D1 — Root Cause Confirmed
One sentence: the engine reads the on-chain balance exactly once — at first cold start on a fresh DB — and never again. Every session after that runs on a stale fills-based cache with no validation against the real wallet.
The structural gap
get_snapshot() reads self._xrp_balance from an in-memory cache (inventory_manager.py:334–370). That cache is built at startup by rebuild() (fills ledger tip + capital events overlay) and is then mutated only by apply_fill() as fills come in. The on-chain query (get_wallet_balances()) is gated behind a needs_seed check that fires only when the DB has zero inventory entries. After the first cold start, that gate is permanently closed. The starting_xrp value in the sessions table is sourced from the same stale cache, not from a fresh on-chain read.
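To make the gap concrete, here is a minimal toy model of the seed-once pattern. It is a sketch, not the code in inventory_manager.py: the class name, the shared list standing in for the DB, and the fetch callable are all assumptions for illustration. Only the roles of needs_seed, rebuild(), apply_fill(), get_snapshot() and get_wallet_balances() come from the D1 findings. The figures reuse the D1 numbers (40.47 seed, +0.55 engine delta, −6.16 on-chain delta).

```python
from decimal import Decimal


class InventoryCacheSketch:
    """Toy model of the seed-once gate; every name here is hypothetical."""

    def __init__(self, db_entries, fetch_onchain_xrp):
        self.db_entries = db_entries                # shared list standing in for the DB
        self.fetch_onchain_xrp = fetch_onchain_xrp  # stand-in for get_wallet_balances()
        self.onchain_reads = 0
        self._xrp_balance = Decimal("0")

    def startup(self):
        needs_seed = len(self.db_entries) == 0
        if needs_seed:
            # The only code path that consults the real wallet.
            self.db_entries.append(self.fetch_onchain_xrp())
            self.onchain_reads += 1
        # rebuild(): the cache is derived purely from persisted entries.
        self._xrp_balance = sum(self.db_entries, Decimal("0"))

    def apply_fill(self, delta_xrp):
        self.db_entries.append(delta_xrp)
        self._xrp_balance += delta_xrp

    def get_snapshot(self):
        return self._xrp_balance


wallet = {"xrp": Decimal("40.47")}
db = []

s1 = InventoryCacheSketch(db, lambda: wallet["xrp"])
s1.startup()                        # fresh DB: seeds from the wallet
s1.apply_fill(Decimal("0.55"))      # engine books move +0.55
wallet["xrp"] += Decimal("-6.16")   # real wallet moves -6.16

s2 = InventoryCacheSketch(db, lambda: wallet["xrp"])
s2.startup()                        # DB not empty: gate stays closed
drift = s2.get_snapshot() - wallet["xrp"]
```

In this model the second session never queries the wallet, so the snapshot and the wallet disagree and nothing in the startup path notices.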
The mechanical driver — phantom fills
The primary source of drift is _handle_disappeared_active_order() in ledger_reconciler.py:675–687. When an active order vanishes from account_offers without a recorded cancel_tx_hash, the reconciler unconditionally treats it as a full fill.
This violates the module's own docstring (lines 17–18): "the reconciler never INVENTS a fill or cancellation." Any off-book cancellation, partial-fill-then-cancel, or transient node-snapshot inconsistency creates a phantom credit or debit that permanently misaligns internal books from reality. Over S1–S32, this heuristic credited ~6.71 XRP of phantom fills that never happened on-chain.
Secondary contributor (~0.69 XRP): apply_fill() raises on zero-quantity fills (inventory_manager.py:245–247), the exception is swallowed at execution_engine.py:956–967, and no inventory_ledger entry is written — creating a fills/ledger count mismatch.
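A minimal sketch of how a swallowed exception can produce a count mismatch. This is not the code at the cited locations: the two lists, record_fill(), and the assumption that the fill row is written before apply_fill() runs are all hypothetical.

```python
fills = []             # stand-in for the fills table
inventory_ledger = []  # stand-in for inventory_ledger


def apply_fill(qty):
    """Raise on zero quantity, otherwise write a ledger entry."""
    if qty == 0:
        raise ValueError("zero-quantity fill")
    inventory_ledger.append(qty)


def record_fill(qty):
    fills.append(qty)  # assumption: the fill row is written first
    try:
        apply_fill(qty)
    except ValueError:
        pass  # swallowed: no ledger row is written and nothing is surfaced


for qty in (5, 0, 3):
    record_fill(qty)

mismatch = len(fills) - len(inventory_ledger)
```

Here three fills are recorded but only two ledger entries exist, so the two tables disagree by one row.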
The 7.40 XRP gap at S33 reproduced exactly
| Component | Value |
|---|---|
| Cold-start seed (Apr 13, first on-chain read) | 40.47 XRP |
| Real on-chain Δ Apr 13 → Apr 18 pre-injection | −6.16 XRP (real trading losses) |
| Engine ledger Δ over same period | +0.55 XRP (phantom fills masked the losses) |
| → Drift not reflected in engine books | 6.71 XRP |
| Zero-qty fill residual | 0.69 XRP |
| Total discrepancy at S33 start | 7.40 XRP ✅ |
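The decomposition in the table can be checked with a few lines of arithmetic. The variable names are illustrative; the values are the ones in the table.

```python
from decimal import Decimal

onchain_delta = Decimal("-6.16")     # real on-chain change, Apr 13 to Apr 18
engine_delta = Decimal("0.55")       # engine ledger change over the same period
zero_qty_residual = Decimal("0.69")  # zero-qty fill residual

# Drift is how far the engine's view moved away from the wallet's view.
phantom_drift = engine_delta - onchain_delta
total_gap = phantom_drift + zero_qty_residual

print(phantom_drift)  # 6.71
print(total_gap)      # 7.40
```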
Once the divergence started, nothing caught it. The gap grew from 7.40 XRP (S33 start) to 43.87 XRP (S39 close) as the phantom-fill mechanism kept running under anchor saturation.
Fix scope
Two separate concerns, intentionally in two separate branches:
- feat/wallet-truth-reconciliation (D2, building now) adds the detector: startup/runtime/shutdown truth checks, halt-on-divergence gate, persisted status + deltas. Does NOT fix the phantom-fill source.
- fix/reconciler-disappeared-order-conservative (FLAG-037, after D2) fixes the phantom-fill source: changes the disappeared-order fallback from unconditional full-fill to DEGRADED status + operator acknowledgment. Restores the docstring invariant.
Detector first. Fix what the detector will catch second.
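For orientation, the behavioral change planned under FLAG-037 might look roughly like the sketch below. This is an assumption about shape only, not the reconciler's code: the Outcome enum, the ActiveOrder dataclass, and the conservative flag are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    CANCELLED = "cancelled"
    ASSUMED_FILLED = "assumed_filled"  # current fallback: can invent a fill
    DEGRADED = "degraded"              # proposed fallback: wait for the operator


@dataclass
class ActiveOrder:
    order_id: str
    cancel_tx_hash: Optional[str] = None


def classify_disappeared(order: ActiveOrder, conservative: bool) -> Outcome:
    """Classify an order that vanished from account_offers."""
    if order.cancel_tx_hash is not None:
        return Outcome.CANCELLED
    if conservative:
        # No evidence either way: do not touch inventory, flag for acknowledgment.
        return Outcome.DEGRADED
    return Outcome.ASSUMED_FILLED


vanished = ActiveOrder(order_id="A1")
before = classify_disappeared(vanished, conservative=False)
after = classify_disappeared(vanished, conservative=True)
```

The point of the conservative branch is that an order with no recorded cancel and no observed fill changes status only, so inventory is never credited or debited on a guess.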
Vesper D2 Rulings — Four Questions Answered
Ruling 1 — Thresholds: spec defaults confirmed
- WARN: |delta_xrp| > 1.0 OR |delta_total_rlusd| > 2.0
- HALT: |delta_xrp| > 5.0 OR |delta_total_rlusd| > 10.0
These are for detecting new drift after the baseline is corrected — not for catching the existing 43.87 XRP gap (handled separately, see Ruling 3). At 1.0 XRP warn, a new phantom-fill accumulation episode would be flagged within roughly 100 fills. All values are config-tunable.
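A minimal sketch of how the two threshold tiers could be evaluated. The dataclass, the function name, and the "ok"/"warn"/"halt" return strings are assumptions; the numeric defaults and the strict greater-than comparisons follow the ruling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TruthThresholds:
    """Config-tunable defaults from Ruling 1 (field names are hypothetical)."""
    warn_xrp: float = 1.0
    warn_rlusd: float = 2.0
    halt_xrp: float = 5.0
    halt_rlusd: float = 10.0


def classify_delta(delta_xrp, delta_total_rlusd, t=TruthThresholds()):
    """Return 'halt', 'warn' or 'ok'; HALT is checked first because it is stricter."""
    if abs(delta_xrp) > t.halt_xrp or abs(delta_total_rlusd) > t.halt_rlusd:
        return "halt"
    if abs(delta_xrp) > t.warn_xrp or abs(delta_total_rlusd) > t.warn_rlusd:
        return "warn"
    return "ok"
```

With these defaults, classify_delta(-1.2, 0.0) returns "warn", and the existing 43.87 XRP gap would classify as "halt" until the baseline is corrected.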
Ruling 2 — API failure policy: unverified_halt_count = 3 confirmed
Single failure → status = unverified, log ERROR, no halt. Three consecutive failures → halt. One addition: the consecutive-failure counter resets on any successful verification. A transient blip followed by recovery must not accumulate toward halt.
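The counter semantics can be sketched as follows. The class and method names are hypothetical, and returning "verified" on success is a simplification (the real status would also depend on the measured delta); only unverified_halt_count = 3 and the reset-on-success rule come from the ruling.

```python
class VerificationTracker:
    """Tracks consecutive truth-check API failures (names are hypothetical)."""

    def __init__(self, unverified_halt_count=3):
        self.unverified_halt_count = unverified_halt_count
        self.consecutive_failures = 0

    def record(self, succeeded):
        if succeeded:
            # Any successful verification clears the streak.
            self.consecutive_failures = 0
            return "verified"
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.unverified_halt_count:
            return "halt"
        return "unverified"


tracker = VerificationTracker()
outcomes = [tracker.record(ok) for ok in (False, False, True, False, False, False)]
```

In this run the two early failures do not count toward the later streak: the halt only arrives on the third consecutive failure after the recovery.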
Ruling 3 — Backfill strategy: realignment tool ships with D2
When D2 lands, the startup truth check will immediately see a 43.87 XRP gap and halt — correct behavior, but needs a clear remediation path. Orion will ship tools/realign_inventory_to_onchain.py as part of the branch. The tool:
- Queries on-chain truth and internal state, prints a dry-run delta
- Writes a capital_events row (event_type='realignment') on explicit --confirm
- Produces a full audit log
After the tool runs, rebuild() incorporates the realignment entry and the startup check passes. No manual SQL.
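The dry-run and confirm behavior described above could be sketched like this. It is not the contents of tools/realign_inventory_to_onchain.py: the function name, the two-column capital_events schema, and the absolute balances (100.00 and 143.87) are illustrative assumptions; only event_type='realignment', the confirm gate, the live-session refusal, and the 43.87 XRP gap come from this memo.

```python
import sqlite3
from decimal import Decimal


def run_realignment(conn, onchain_xrp, internal_xrp, confirm=False, live_session_open=False):
    """Print the delta; write a realignment row only when confirm is set."""
    if live_session_open:
        raise RuntimeError("refusing to realign while a live session is open")
    delta = onchain_xrp - internal_xrp
    print(f"on-chain={onchain_xrp} internal={internal_xrp} delta={delta}")
    if not confirm:
        print("dry run: nothing written")
        return delta
    conn.execute(
        "INSERT INTO capital_events (event_type, delta_xrp) VALUES (?, ?)",
        ("realignment", str(delta)),
    )
    conn.commit()
    return delta


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema; the real table will have more columns.
conn.execute("CREATE TABLE capital_events (event_type TEXT, delta_xrp TEXT)")

run_realignment(conn, Decimal("100.00"), Decimal("143.87"))                # dry run
run_realignment(conn, Decimal("100.00"), Decimal("143.87"), confirm=True)  # writes one row
rows = conn.execute("SELECT event_type, delta_xrp FROM capital_events").fetchall()
```

Storing the delta as a single capital event keeps the correction visible in the audit trail rather than overwriting the cached balance in place.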
Ruling 4 — WAC correction: out of scope for D2
WAC is built from fills, which includes phantom fills. Correcting WAC before the phantom-fill source is patched just produces a corrected-but-still-drifting number. WAC correction is filed as FLAG-040, blocked on FLAG-037 landing first. Not a D2 gate.
New Flags Filed
Four flags opened from D1 findings:
| Flag | Summary | Priority |
|---|---|---|
| FLAG-037 | Reconciler phantom-fill heuristic — primary root cause of inventory drift. Branch: fix/reconciler-disappeared-order-conservative. | High — implement after D2 |
| FLAG-038 | apply_fill() zero-qty silent drop — fills/ledger mismatch, 0.69 XRP residual | Medium — dedicated branch |
| FLAG-039 | Mid-session capital injection not reflected in running cache until restart | Low — current ops pattern is safe |
| FLAG-040 | WAC correction pass post-FLAG-037 | Low — display only |
D2 Status
Orion is cleared to build feat/wallet-truth-reconciliation. Scope: startup truth check (unconditional, replaces the needs_seed gate), 60s periodic runtime check, shutdown truth check, inventory_truth_snapshots table, halt/warn behavior, realignment tool, test coverage with mocked API responses.
Do you want to add anything before Orion starts?
— Katja