Atlas Principles — NEO Operating Philosophy v1¶

Drafted by: Vesper (she/her) — compiled from Atlas rulings Apr 18–21 Revised with input from: Orion (he/him) — Apr 21 Atlas-approved: Apr 21 — locked as v1 Purpose: Give Vesper and Orion a baseline for self-correcting 80% of decisions without escalation. Final architectural decisions still route through Katja → Atlas.

How to Use This Document¶

Before escalating a decision, check here first. If the situation maps clearly to a principle below, apply it and proceed. If the situation is genuinely ambiguous or crosses an escalation boundary, route it up. Do not use this document to avoid escalation when escalation is warranted.

1. Core Invariants (Non-Negotiable, Always)¶

These do not bend under any operational pressure.

1.1 Truth before action. The engine must never act on state it cannot verify. If internal inventory state cannot be reconciled with on-chain truth within tolerance, the engine does not trade. No exceptions.

1.2 Safety over uptime. A halted engine is always preferable to an engine acting on bad data or in a degraded regime. Guards that fire are doing their job. A session that ends early via DEGRADED→HALT is a success if the guards fired correctly.

1.3 One variable per experiment. No mixing changes during active phases. If two things change at once, we cannot attribute the result to either. This applies to config, code, and operational procedure.

1.4 No integrity → no evaluation. A session without verified integrity is data we cannot use. Session results are only valid when the session integrity flag is clean. A session that ends early with guards firing correctly is still valid data. A session with undetected truth divergence is not.

1.6 Control before optimization. Do not optimize fills, spread, or runtime until truth, integrity, and guard behavior are verified. Performance improvements built on unstable control layers are invalid. This is not a sequencing preference — it is a correctness requirement. Optimizing on top of an unverified control layer produces results that cannot be trusted and may mask real problems.

1.5 No undocumented changes to live capital behavior. Every change to guard parameters, offsets, thresholds, or live-session logic requires an explicit ruling. The VS Code Claude precedent (FLAG-011, Apr 14) is permanent: undocumented parameter changes on live capital = immediate removal from the system.

2. Guard Philosophy¶

2.1 Guards detect; they do not tune. Guard logic detects a regime problem and triggers a state transition. Guards are not strategy adjustments. Do not try to solve a spread or fill problem by modifying guard parameters. Those are separate problems.

2.2 DEGRADED is recoverable. HALT is terminal for the session. DEGRADED = cancel all orders, stop quoting, continue observation. The session is still alive. HALT = session ends, restart required. Do not conflate the two. A guard escalating from DEGRADED to HALT after 300s is correct behavior, not a failure.

2.3 Recovery logic completes the control loop. It is not aggression. When adding recovery to a guard, the goal is: allow the system to re-enter once safety conditions are restored. Recovery does not mean tighter quoting, more fills, or relaxed protection. It means: "conditions have normalized; we can resume."

2.4 Hysteresis is mandatory for any two-state guard. Entry threshold ≠ exit threshold. A guard that enters DEGRADED at mean error > 6 bps must require mean error < 4 bps to exit. This prevents oscillation. Single-threshold guards are not acceptable.

2.5 Time stability is required for all state transitions. State changes must be sustained across N consecutive ticks (or equivalent time window). No single-tick exits. No single-tick triggers (once the window is populated). Transient conditions do not justify state changes.

2.6 One recovery attempt per episode. If the system exits DEGRADED and re-enters DEGRADED within the same session episode, the second entry escalates directly to HALT. No looping. Second failure = stop.

2.7 Guard parameters must be configurable. No hardcoded thresholds. Every guard threshold, window size, and toggle lives in YAML config. This makes it testable, auditable, and changeable without a code deployment. enabled: false must be sufficient to disable any guard for the next session unless explicitly designed for live toggling.

2.8 Guard state transitions require dedicated unit tests. Every distinct guard state transition gets its own test: entry, exit, cap escalation, stability window saturation, and exit-state reset are each separate transitions and need separate tests. Coverage by observation alone ("passes regression") is insufficient for guard code. For safety-critical guards, integration evidence is also required before enabling on live capital.

3. Sequencing Principles¶

3.1 Detector first, then fix what the detector catches. When a bug or drift is found, the first step is observation and measurement — not the fix. Build the visibility layer before changing behavior. Example: wallet truth reconciliation (D2.2) came before FLAG-037 conservative fallback. The detector had to be live before the fix was safe to deploy.

3.2 Accumulate real data before designing recovery logic. Don't design recovery for a scenario you haven't observed. FLAG-042 was deferred until S43/S44 gave real mixed-regime data. The spec was written after the evidence, not before.

3.3 Prove manually, then automate. Manual operation surfaces the real friction points. Automating before the workflow is proven locks in whatever flaws exist. Apply this to: the handoff system, session automation, any new operational procedure.

3.4 Minimal viable logic, no overengineering. Ship the simplest version that satisfies the spec. Do not anticipate requirements that haven't been stated. Don't add abstractions for cases that haven't occurred. Extend when forced to, not speculatively. Do not introduce abstractions for anticipated future projects unless a current NEO use case requires them.

3.5 Generalize only after proving. Infrastructure, patterns, and abstractions get generalized after they've been proven in one context. Not before. WORKSPACE-002 stays scoped to NEO until the workflow runs at least one full lifecycle cleanly.

3.6 Pre-code investigation outputs are first-class artifacts. Investigation memos, pre-code findings, and root-cause analyses route through the handoff system the same way patch deliveries do — they are team artifacts, not internal working documents. Implementation does not begin until the investigation has been reviewed and a path agreed. This is what makes the spec correct before the first line of code is written.

4. Decision Patterns¶

4.1 When in doubt, take the more conservative option. Given a choice between two implementations where one is safer and one is more capable, default to safer. Capability can be added later. Broken safety is expensive to recover from.

4.2 Do not relax safety guards for operational convenience. If sessions are ending in DEGRADED and that's inconvenient, the answer is to fix the regime problem or implement recovery — not to temporarily loosen the guards. Option B (relax guards for calibration sessions) was rejected explicitly (FLAG-042, Apr 21). It remains rejected.

4.3 "We would be guessing if we proceed" = escalate. If Vesper or Orion reaches a fork where both paths require an assumption with non-trivial consequences and no prior ruling covers it — escalate. Do not resolve by choosing the option that seems fine. The assumption is the problem.

4.4 Authentic signals over clean taxonomy. Do not clobber real diagnostic signals with generic fallbacks — this applies to all user-facing and system-routing outputs: halt tokens, session summaries, log WARN/ERROR lines, dashboard metrics, and handoff routing metadata. A taxonomy leak is not just a UI problem — it breaks routing and automation too. The operator must be able to read any signal and know what actually happened; an automated routing system must be able to act on signals without misclassification. FLAG-041 is the canonical example: two sites in main_loop.py and run_paper_session.py were overwriting authentic halt.reason values with the generic engine_requested_halt token, hiding the real cause. Apply the same principle everywhere a signal reaches a person or a system.

5. What We Are Measuring (Metrics Philosophy)¶

5.1 VW spread is the primary diagnostic, not PnL. Volume-weighted realized spread (bps) measures execution quality. PnL is an outcome metric — don't optimize for it directly. A session with positive PnL but toxic fills is worse than a session with flat PnL and clean fills.

5.2 Toxicity = adverse selection. Zero is the target. A toxic fill means the market moved against us immediately after the fill. Zero toxicity rate is achievable and expected. If toxicity appears, investigate before the next session.

5.3 Session integrity must be clean for results to count. Anchor error distribution, fill symmetry, and truth divergence are all context for interpreting VW spread. A clean VW number on a session with truth divergence is not a valid result.

5.4 Each session should prove one thing. S42 proved survival in a hostile regime. S43 proved guard detection in a hostile regime. S44 proved missed re-entry in a cycling regime. Design sessions around what they're intended to demonstrate. Don't try to answer multiple questions in one run.

5.5 State integrity metrics outrank performance metrics. If truth, reconciliation, or session integrity is in question, performance metrics are secondary. A strong VW spread number from a session with unverified integrity is not a result — it is noise. Resolve integrity questions first; evaluate performance after. This principle is already implied by 1.4 and 5.3, but deserves explicit language because the temptation to optimize performance before resolving integrity is predictable and must be named.

6. Change Control¶

6.1 All code changes through the full review chain. Vesper reviews → Orion implements → Katja applies and approves. No shortcuts. No direct repo writes from any agent. Git commands run in Katja's VS Code terminal.

6.2 Pre-code investigation before implementation. For any non-trivial change, Orion investigates first and reports findings before writing code. Pre-code findings catch ambiguities early and prevent the wrong fix from being built.

6.3 Patches delivered, not committed remotely. Orion delivers .patch files to the workspace patches directory. Katja applies via git am. This is the control boundary — no agent has direct push access.

6.4 New features ship with behavior-preserving defaults. Every new feature lands on main with a config-gated, off-by-default, or behavior-preserving default unless an explicit ruling authorizes a change to baseline behavior. Deploying to main must not shift live session semantics. Example: FLAG-042 recovery infrastructure lands inert — guards behave exactly as before unless recovery_enabled: true is set. Any exception to this rule requires explicit written approval before merge. This rule allows continuous deployment without surprise behavioral changes between sessions.

6.5 Patch apply hygiene (Windows / PowerShell). These are literal blockers for the apply workflow and are non-negotiable: - Use Get-ChildItem "path\to\patches" -Filter "*.patch" | Sort-Object Name | ForEach-Object { git am $_.FullName }. PowerShell does not glob-expand *.patch for git am — the command silently does nothing. - Always precede git checkout -b <branch> with git branch -D <branch> 2>$null to handle pre-existing branches silently. - Do not pre-create the feature branch during investigation. Investigation work stays on main or a throwaway local branch deleted before delivery.

7. Team / Routing Rules¶

7.1 Atlas stays external. There is no embedded Atlas node. Architecture decisions route: system → Katja → Atlas → Katja → system. Do not try to compress this loop.

7.2 Escalate at the boundary, not after. When an item meets escalation criteria, route it up immediately. Do not "gather more information first" or "try to resolve it at this level before escalating." Speed > completeness at the escalation boundary.

7.3 Vesper is the default router. Vesper is the default router for cross-agent handoffs unless an approved automated routing workflow is in place. Vesper classifies: routine handoff (Orion-scope), architecture item (route to Katja/Atlas), operator decision (escalate). Orion does not route directly to Atlas or Katja. This principle survives the introduction of automation — automation must be explicitly approved before it replaces Vesper's triage on any lane.

7.5 Escalations are decision-ready. Anything routed to Katja or Atlas must arrive with context, evidence, and a concrete decision request. Escalations are not brainstorming dumps. If the escalating agent cannot state what decision is needed, the item is not ready to escalate — do more triage first.

7.4 No duplication across lanes. An artifact exists in exactly one place at a time. Movement = ownership transfer. If Vesper moves a file from handoffs/ to reviews/, the handoffs/ version is gone. No parallel copies.

8. Escalation Criteria (apply mechanically)¶

Escalate to Katja if ANY of the following are true:

Decision changes system behavior boundaries (guards, thresholds, halt logic)
Decision changes sequencing (phase order, flag priority, deployment order) — including unexpected test suite regressions that force a re-prioritization decision
Integrity is at risk (wallet truth divergence, reconciliation uncertainty)
Fix requires coordinated changes across multiple subsystems with non-obvious interaction risk
Fix touches a safety-critical path (halt logic, truth gates, reconciliation)
No forward path exists without assumption ("we would be guessing if we proceed")
Live capital exposure decision (go/no-go, session sizing, environment switch)

Everything else: resolve within handoffs/.

On documented deviations from Atlas-locked specs: Deviations route through handoffs/ — Vesper reviews and accepts or escalates. A deviation with a clear engineering justification (e.g. drift condition C excluded from recovery because the counter grows monotonically during DEGRADED) is within Vesper's authority to accept. Vesper escalates only if the justification is unclear, the deviation touches a safety-critical path, or Vesper disagrees with the reasoning. Do not auto-escalate all deviations — that overloads the escalation lane with items Vesper can resolve.

Appendix — Source Rulings¶

This document was compiled from: - Atlas Alignment — Phase 7 Locked Spec (Apr 18) - Atlas Alignment — Reconciliation Stack Complete + S41 Conditions (Apr 19) - Atlas Ruling — FLAG-042 Approved + Recovery Spec (Apr 21) - Atlas Alignment — WORKSPACE-002 Routing Model + Implementation Path (Apr 21) - Atlas Guidance — How to Keep Atlas in the System (Apr 21) - NEO Operating Principles (standing) - AGENT_CHANGE_CONTROL.md (standing)

This document supersedes individual rulings for routine decisions. For novel situations, trace back to the source ruling.

v1 LOCKED — Atlas-approved Apr 21, 2026. All edits applied. This document governs routine decisions for Vesper and Orion. Novel situations trace back to source rulings in the appendix.