NEO Trading Engine — Experiment Log

Persistent record of all phases, configurations, results, and decisions. Updated as experiments complete. Source of truth across sessions.

Current Best Baseline

Environment: Stage 1 Live (real capital, XRPL mainnet, RLUSD/XRP)
Locked config: sell offset 16 bps | buy offset unchanged | anchor dynamic capped_amm (~84% active, mean 9.81 bps)

sell_offset_bps: 16
anchor_mode: dynamic_capped_amm
anchor_cap_pct: 0.84
participation_filter_suppressed_pct: 0.0
base_order_size_rlusd: 3.0   # NOTE: hardcoded — config says 10.0, not yet wired (FLAG-007)

Key metrics at this baseline (18→16 tuning session, Apr 13):
- total_fills: 48 | buy: 24 | sell: 24 (perfect symmetry)
- VW spread: 3.3016 bps
- avg_realized_spread_bps: 5.6537
- ending XRP share: 44.10% | drift: -5.9%
- ending total value: 140.28 RLUSD | trading value: +2.75 cumulative
- session_min_dist_to_ask: 0.0 bps
- toxic fills: 0
- suppressed ticks: 0%

Current best config (Phase 5C.4, pending confirmation):

bid_offset_bps: 10.0
ask_offset_bps: 14.0
drift_pct: 0.65
max_skew_bps: 10.0
base_size_rlusd: 10.0
max_xrp_exposure: 100.0

Next experiment: Phase 5C.5 — confirmation run at current config before any further tuning

Phase History

Phase 3C — Distance / Aggressiveness Tuning

Goal: Reduce buy-side toxicity while preserving participation via bid distance.
Control variable: bid_spread_multiplier

Run	bid_spread_multiplier	buy_fills	toxic_fill_rate_pct	avg_markout_buy_bps	ending XRP share
3C.1	0.90	—	50.85%	-4.57	—
3C.2	1.00	—	46.55%	-5.07	—
3C.3	1.10	33	33.33%	-2.17	54.29%
3C.4	1.15	29	39.29%	-3.62	51.51%

Conclusion: Best distance is bid_spread_multiplier: 1.10. Improvement flattened and reversed at 1.15. Phase 3C closed.

Decision: Keep 1.10 as single-order baseline. Move to directional protection.

Phase 3D — Time-Based Directional Protection

Goal: Suppress buys after short-horizon downward price movement.
Signal: Short-term mid-price delta (lookback: 8s)
Control variable: bid_protection_threshold_bps

Run	Threshold	Result
3D.1	-5.0 bps	No observable effect (identical to baseline)
3D.2	-2.5 bps	No observable effect (identical to baseline)

Bug found mid-phase: Protection suppressed new BUY intent but did NOT cancel resting paper BUY orders. Stale exposure remained live. Fix implemented: resting BUY now cancelled when protection triggers.

Corrected rerun (3D.3): Still no measurable effect after fix.

Conclusion: Short-horizon mid-price delta is not predictive of adverse selection at 4s requote cadence. Adverse selection is sub-cycle or structural, not directional. Phase 3D closed.

Phase 3E — Anchor-Relative Protection

Goal: Suppress buys when anchor is structurally elevated relative to CLOB mid.
Signal: anchor_divergence_bps vs threshold
Control variable: anchor_relative_bid_protection_threshold_bps

Run	Threshold	toxic_fill_rate_pct	Result
3E.1	10 bps	0%	Sell-only, inventory collapsed to RLUSD
3E.2	12 bps	0%	Identical to 3E.1
3E.3	14 bps	0%	Identical to 3E.1

Anchor divergence audit:
- divergence = constant 15 bps every tick
- pct_cap_applied = 100%

Conclusion: Anchor is fully pinned at cap. Divergence is constant, not variable. Threshold tuning was testing against a non-selective signal. All thresholds (10–14 bps) were below the constant divergence value, so protection was always on. Phase 3E closed.
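The non-selectivity described above can be checked in a few lines. This is a minimal sketch, assuming the protection fires whenever the divergence is at or above the threshold; the function name `protection_active` and the comparison direction are illustrative assumptions, not the engine's code.

```python
def protection_active(anchor_divergence_bps: float, threshold_bps: float) -> bool:
    """Hypothetical gate: suppress BUY quoting when divergence >= threshold."""
    return anchor_divergence_bps >= threshold_bps


# Audit finding from this phase: divergence is pinned at 15 bps on every tick.
PINNED_DIVERGENCE_BPS = 15.0
tested_thresholds = [10.0, 12.0, 14.0]  # runs 3E.1 to 3E.3

# With a constant signal, every tested threshold gives the same answer.
always_on = {t: protection_active(PINNED_DIVERGENCE_BPS, t) for t in tested_thresholds}
print(always_on)
```

Under this assumption a threshold only becomes selective once it sits above the pinned value, or once the divergence itself starts to vary.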

Phase 3F — Anchor Cap Adjustment

Goal: Restore anchor dynamics by reducing cap. Assess whether a lower cap changes anchor behavior.
Control variable: anchor_max_divergence_bps (15 → 10)
Protection: Disabled for clean read.

Result (cap = 10, bid_spread_multiplier = 1.00):
- total_fills: 58 | buy: 30 | sell: 28
- toxic_fill_rate_pct: 25.86%
- avg_markout_buy_bps: -0.0765 (near neutral)
- avg_markout_sell_bps: +29.95
- avg_realized_spread_bps: +14.79
- ending XRP share: 50.94%
- pct_cap_applied: 100% (anchor still fully pinned)

Conclusion: Anchor remains synthetic (anchor = clob_mid + cap). No dynamic AMM influence. However, execution quality improved materially — participation balanced, buy markout near neutral, inventory healthy. This configuration is the first plausibly profitable paper-trading baseline. Anchor architecture redesign deferred.

Decision: Accept cap=10 as stable operating regime. Move to selective participation.

Phase 4A — Post-Fill Participation Filter

Goal: Reduce repeated adverse selection by pausing BUY quoting briefly after a toxic fill.
Signal: Post-fill markout on last BUY fill (reactive, not predictive)
Behavior: If markout of last buy fill ≤ threshold → suppress BUY for pause_seconds. Resting paper BUY cancelled during pause. SELL unchanged.

Config:

participation_filter_enabled: true
post_fill_markout_trigger_bps: -3.0
post_fill_pause_seconds: 8
post_fill_reference: last_buy_fill

Result — Phase 4A (threshold = -3.0):
- total_fills: 52 | buy: 30 | sell: 22
- toxic_fill_rate_pct: 21.15%
- avg_markout_buy_bps: +0.7038
- avg_realized_spread_bps: +14.74
- ending XRP share: 54.67%
- activation_count: 8 | suppressed_pct: 7.5%

Result — Phase 4A.1 (threshold = -2.5, all else equal):
- Metrics identical to -3.0 run
- activation_count: 8 | suppressed_pct: 7.5%

Conclusion: Filter is validated. Same 8 activations at both thresholds — the triggering events are all worse than -2.5 bps, so tightening didn't add new triggers. Threshold tuning is saturated. Filter operates in a stable region. This is the strongest observed configuration to date: positive buy markout, reduced toxicity, healthy participation, maintained spread capture.
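The filter behavior and the threshold saturation can be sketched as follows. This is an illustrative model only: the class `PostFillFilter`, its method names, and the sample markouts are hypothetical, and only the two defaults (-3.0 bps trigger, 8 s pause) come from the config above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class PostFillFilter:
    trigger_bps: float = -3.0    # post_fill_markout_trigger_bps
    pause_seconds: float = 8.0   # post_fill_pause_seconds
    paused_until: Optional[float] = None
    activation_count: int = 0

    def on_buy_fill_markout(self, markout_bps: float, now: float) -> bool:
        """Start a BUY pause when the last buy fill's markout is <= trigger."""
        if markout_bps <= self.trigger_bps:
            self.paused_until = now + self.pause_seconds
            self.activation_count += 1
            return True
        return False

    def buy_allowed(self, now: float) -> bool:
        """BUY quoting is allowed when no pause is active or it has expired."""
        return self.paused_until is None or now >= self.paused_until


def count_activations(markouts_bps: Sequence[float], trigger_bps: float) -> int:
    """Replay a list of buy-fill markouts (one per minute) through the filter."""
    flt = PostFillFilter(trigger_bps=trigger_bps)
    for i, markout in enumerate(markouts_bps):
        flt.on_buy_fill_markout(markout, now=60.0 * i)
    return flt.activation_count


# Hypothetical markouts: no value falls between -3.0 and -2.5 bps,
# so both triggers fire on exactly the same fills.
sample_markouts = [-4.1, 0.8, -3.6, 1.2, -5.0]
print(count_activations(sample_markouts, -3.0), count_activations(sample_markouts, -2.5))
```

When no fill's markout lands in the gap between two thresholds, the activation count cannot change, which is one way to read the identical 8 activations in both runs.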

Decision: Commit Phase 4A baseline. Next step: CLOB spread variance audit before implementing a spread-based participation gate (Phase 4B).

Phase 4B — Volatility / Spread Audit (Static Environment)

Goal: Determine whether volatility or spread-regime signals are viable in the current environment.

Volatility audit result: distinct_mid_values = 1, pct_static = 100%, min/max/mean = 0.0. Signal non-informative.

Mid-source audit result: clob_mid_price confirmed static — not a bug, not object reuse. The paper environment itself is static. No tick-to-tick market movement exists.

Conclusion: Volatility and spread-regime signals cannot be evaluated in the static paper environment. Environmental limitation, not a signal failure. Both signals deferred to replay environment.

Decision: Build live capture pipeline → accumulate real data → build ReplayMarketDataAdapter → re-test under real movement.

Phase 4C — Replay Infrastructure & Validation

Goal: Transition from static paper environment to replay-based validation using real XRPL top-of-book data.

Infrastructure milestones:
- Live capture pipeline operational (XRPL WebSocket → SQLite)
- ReplayMarketDataAdapter implemented and integrated
- Architecture-safe: zero changes to strategy or execution paths

Dataset #1 result (~150 min, ~5000 ticks):
- total_fills: 82 | buy: 41 | sell: 41
- toxic_fill_rate_pct: 0.0%
- avg_markout_buy_bps: +12.22
- avg_markout_sell_bps: +17.80
- avg_realized_spread_bps: +14.83
- ending XRP share: 54.83%
- total_pnl: +1.41 RLUSD
- participation filter activations: 0 (no toxic events in this regime)

Interpretation: Strategy shows positive edge under real market movement. Zero adverse selection observed. Prior ~21% toxicity in static environment likely a simulation artifact — but cannot confirm until Dataset #2 tested. The participation filter is unvalidated under replay (zero activations in a zero-toxicity dataset is a non-event, not a confirmation).

Status: Phase 4C.1 complete. Phase 4C.2 in progress (Dataset #2 capture underway).

Live deployment gate: Minimum 2–3 distinct replay datasets showing consistent performance required before any live test consideration.

Phase 5 — Stage 1 Live Deployment (Real Capital, XRPL Mainnet)

Phase 5A — Sell Offset Tuning: 18 → 16 bps

Date: Apr 13, 2026
Goal: Test whether reducing sell ask distance from 18 bps to 16 bps increases sell-side fill conversion without collapsing spread quality. Prior sessions consistently showed 75–100% buy-heavy fills, near-zero sell conversion.
Control variable: sell_offset_bps (18 → 16)
Hypothesis (pre-defined, FLAG-004): If the bottleneck is pricing distance, tighter ask = more sells + spread quality holds. If queue position is the bottleneck, conversion stays low regardless.

Stage 1 Baseline (pre-experiment, reference session):
- VW spread: 1.34 bps
- Buy/sell ratio: ~80% buy / 20% sell
- Sell near-touch ticks: 14 | Sell fills: 0 | Sell conversion: 0%
- session_min_dist_to_ask: 0.0 bps
- Anchor: dynamic capped_amm (~40% active)

Tuning session results (18→16, full session):

Metric	Value
Ticks	634
Orders	221
Total fills	48
Buy fills	24
Sell fills	24
Buy/sell ratio	50% / 50%
VW spread	3.3016 bps
Avg realized spread	5.6537 bps
Ending XRP share	44.10%
Inventory drift	-5.9% (RLUSD-heavy)
Ending total value	140.28 RLUSD
Trading value cumulative	+2.75 RLUSD
session_min_dist_to_ask	0.0 bps
Toxic fills	0
Suppressed ticks	0%
Anchor activity	~84% active, mean 9.81 bps

Fill mechanism observed: Sweep-based. 24 sell fills with 0 sell near-touch ticks — fills happen when price sweeps quickly through ask level, not via sustained queue accumulation. Near-touch metric not applicable to sweep-based sessions (see FLAG-004 evolution).

Mid-session snapshot (45 min, 9 fills): Buy/sell 22%/78%, VW spread 1.72 bps, inventory rebalancing from +8.3% XRP-heavy to -2.8% in real time. Sell-side engagement confirmed early.

Conclusion: Experiment is a clear pass.
- Sell suppression eliminated — from ~0% sell fills to 50/50 symmetry
- VW spread improved (3.30 bps vs 1.34 bps baseline) — tighter ask did NOT collapse quality
- Inventory managed within ±6% throughout — skew mechanism effective at higher activity
- Zero toxic fills, zero suppressed ticks — clean execution conditions
- The improvement came from pricing distance, not queue position

Decision: Lock 16 bps as the new sell offset baseline. Phase 5A closed.

Identified next bottleneck: Quote size. At 3 RLUSD base (~2.2 XRP), DOM participation ~1%. Sweeps are clearly ≥10–15 RLUSD. Size is now the binding constraint on spread capture. See FLAG-007.

Phase 5B — Size Experiment: 10 RLUSD base (in progress)

Date: Apr 13, 2026
Goal: Test whether increasing base order size from 3 → 10 RLUSD improves spread capture and DOM participation. Single-variable change: size only, sell offset held at 16 bps.
Config plumbing: Committed 49d75e8 (Apr 13). Hardcoded 3.0 removed, engine now reads config.order_size.base_size_rlusd.
Guardrail: min(config, 0.10 × portfolio, 0.15 × portfolio). Effective size = 10.0 RLUSD at current portfolio.
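The guardrail arithmetic can be written out as a small check. This is a sketch of the formula as stated, not the engine's implementation; the function name `effective_base_size` is hypothetical, and the 140.28 RLUSD portfolio value is borrowed from the Phase 5A ending total as a stand-in for "current portfolio".

```python
def effective_base_size(config_size_rlusd: float, portfolio_value_rlusd: float) -> float:
    """Apply the guardrail min(config, 0.10 x portfolio, 0.15 x portfolio)."""
    return min(
        config_size_rlusd,
        0.10 * portfolio_value_rlusd,
        0.15 * portfolio_value_rlusd,
    )


# Assumed portfolio value (Phase 5A ending total): caps are 14.03 and 21.04 RLUSD,
# so the configured 10.0 RLUSD is the binding term.
print(effective_base_size(10.0, 140.28))
```

As written, the 0.10 term is always at or below the 0.15 term for a positive portfolio value, so under this reading the configured size stops binding only once the portfolio drops below 100 RLUSD.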

Attempt 1 (Apr 13, ~20:00 run) — halted early, inconclusive:
- Ticks: 304 | Session elapsed: 1681s (of 3600s) | Halted: XRP exposure limit (85.72 > 85.0 RLUSD)
- Fills: 13 (dashboard, session-scoped) | Buy: 8 | Sell: 5
- VW spread: 2.18 bps | Session PnL: -0.0740
- Ending inventory: 61% XRP / 39% RLUSD | Drift: +11%
- Spread DOM: 1%
- MISPLACED alert visible on dashboard. Diagnosed as a display artifact: dashboard.py had frozen a stale offset from the pre-16bps config, and mid drifted ~12 bps after placement. Engine was correct throughout. Fix shipped same session.
- Session summary reported 63 fills. Confirmed as a cross-session scoping bug (FLAG-010). Dashboard count (13) is authoritative.
- Conclusion: Session cut off before skew could rebalance. Cannot distinguish "skew insufficient" from "not enough time." Not a valid size experiment result.

Attempt 2 (Apr 14) — concluded, superseded by architecture work: Multiple sessions ran but were contaminated by: undocumented ask_offset change (16→12), stale resting orders at startup, and the two-transaction cancel+create gap creating order dark periods. Size experiment itself was not cleanly evaluable. Key outcome: 10 RLUSD size confirmed as working size, max_xrp_exposure raised to 100 RLUSD. Architecture issues addressed in Phase 5C before further offset tuning.

Decision: Phase 5B closed. Size = 10 RLUSD locked. Move to Phase 5C (architecture validation + offset calibration).

Phase 5C — Execution Architecture Validation + Offset Calibration (Apr 14)

Goal: Validate atomic replacement architecture, then systematically find the optimal offset configuration for participation vs spread quality.

Architecture fixes shipped before Phase 5C:
- Atomic OfferSequence replacement (cancel+create → single transaction, gap eliminated)
- Startup order cancellation (_startup_cancel_legacy_xrpl_orders) — clears previous session resting orders
- Drift calibration config wiring (max_skew_bps now config-driven)
- drift_pct added as tunable parameter

Phase 5C.1 — Atomic Replacement Baseline Confirmation

Config: ask=16, bid=10, drift=0.2, skew_cap=12, size=10
Result:
- 7 fills | 4 buy / 3 sell | VW spread +1.31 bps | Avg +1.75 bps
- Ending inventory: 49.2% XRP (near-perfect neutral)
- Negative sell dist: 0.2% ticks | Toxic: 0
- FLAG-013 verification: PASSED — placement fidelity confirmed, atomic replacement dominant, no churn

Conclusion: Execution architecture validated. Remaining constraint is offset competitiveness, not execution timing. Fills still sweep-based / dislocation capture only.

Phase 5C.2 — Drift Threshold Calibration

Control variable: drift_pct 0.2 → 0.65 (Earn-zone → Deep-zone threshold per Titan intel)
Config: ask=16, bid=10, drift=0.65, skew_cap=12, size=10
Result:
- 7 fills | VW spread +2.13 bps | Avg +1.77 bps
- Spread improved vs baseline, no churn, atomic replacement stable

Conclusion: PASS. Appropriate patience for deep-zone placement. drift_pct: 0.65 locked.

Phase 5C.3 — Ask Offset Tightening: 16 → 14 bps

Control variable: ask_offset_bps 16 → 14
Config: ask=14, bid=10, drift=0.65, skew_cap=12, size=10
Result:
- 23 fills | 6 buy / 17 sell | VW spread +0.37 bps | Avg +6.58 bps
- Participation ~3x increase | Toxic: 0 | Inventory: 53.3% XRP (slightly skewed)
- Sell distance to touch: ~11.6 bps avg — still outside queue, dislocation capture
- Skew-cap risk (2 bps floor at ask=14/cap=12): 0.2% adverse ticks, non-clustered

Conclusion: PASS on participation. VW spread compressed (0.37 bps) due to skew-boosted large sell fills at tight adjusted offset. Identified tradeoff: more fills, lower quality per fill. Skew cap adjustment needed to protect spread floor.
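The gap between VW spread and average spread can be illustrated with a toy calculation. This sketch assumes VW spread is a size-weighted mean of per-fill realized spread and Avg is the unweighted mean; both function names and all fill values are hypothetical, chosen only to show the direction of the effect.

```python
from typing import List, Tuple

Fill = Tuple[float, float]  # (realized_spread_bps, fill_size)


def avg_spread_bps(fills: List[Fill]) -> float:
    """Unweighted mean of per-fill spread."""
    return sum(spread for spread, _ in fills) / len(fills)


def vw_spread_bps(fills: List[Fill]) -> float:
    """Size-weighted mean of per-fill spread."""
    total_size = sum(size for _, size in fills)
    return sum(spread * size for spread, size in fills) / total_size


# Hypothetical session: two small wide-spread fills plus two large fills
# taken at a tight, skew-adjusted offset.
fills = [(12.0, 1.0), (10.0, 1.0), (0.5, 10.0), (0.3, 10.0)]
print(round(avg_spread_bps(fills), 2), round(vw_spread_bps(fills), 2))
```

When the large fills carry the smallest spreads, the weighted figure falls well below the simple average, which matches the direction of the 0.37 vs 6.58 bps split reported for this run.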

Phase 5C.4 — Skew Cap Reduction: 12 → 10 bps

Control variable: max_skew_bps 12 → 10
Config: ask=14, bid=10, drift=0.65, skew_cap=10, size=10
Rationale: Restores minimum adjusted sell floor from 2 bps (ask=14 − skew=12) to 4 bps (ask=14 − skew=10) — same floor as the validated Phase 5A baseline.
Result:

Metric	Skew cap 12	Skew cap 10
Fills	23	11
Buy / Sell	6 / 17	6 / 5
VW spread	+0.37 bps	+1.22 bps
Avg spread	+6.58 bps	+1.19 bps
Inventory	53.3% XRP	49.1% XRP (neutral)

- Avg ≈ VW → no weighting distortion, fills consistent quality
- Inventory returned to near-perfect neutral
- Fill count decreased but remains above baseline (7)
- No churn, no anomalies, no toxicity

Conclusion: PASS. First configuration simultaneously achieving positive VW spread, reasonable fill count, neutral inventory, and stable behavior. First candidate working baseline.

Nuance: Sell often sitting at ~15 bps (slight negative skew throughout session) — true 14 bps sell regime not fully exercised under positive skew conditions. Confirmation run required.

Decision: Do not tune further. Run confirmation session (Phase 5C.5) at identical parameters before any next variable change.
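The sell-floor arithmetic used in this phase can be sketched as below. It assumes the inventory skew is clamped to ±max_skew_bps and subtracted from the ask offset, so positive skew tightens the ask and negative skew widens it; the function name and the sign convention are assumptions inferred from the figures in this log, not taken from the engine source.

```python
def adjusted_sell_offset_bps(ask_offset_bps: float, skew_bps: float, max_skew_bps: float) -> float:
    """Ask offset after applying inventory skew, clamped to +/- max_skew_bps."""
    clamped_skew = max(-max_skew_bps, min(max_skew_bps, skew_bps))
    return ask_offset_bps - clamped_skew


# Floor at full positive skew for the two cap settings compared in this phase.
print(adjusted_sell_offset_bps(14.0, 12.0, 12.0))  # cap 12
print(adjusted_sell_offset_bps(14.0, 12.0, 10.0))  # cap 10
# Slight negative skew pushes the sell further out instead.
print(adjusted_sell_offset_bps(14.0, -1.0, 10.0))
```

Under this model the floor is only reached when skew sits at the cap, so a session with mild negative skew never exercises it.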

Phase 5C.5 — Confirmation Run 1

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, size=10
Result:
- 11 fills | 6 buy / 5 sell | VW spread +1.22 bps | Avg +1.19 bps
- Ending inventory: 49.1% XRP (neutral) | Toxic: 0
- Slight negative skew throughout (~-1 bps) → sell sat at ~15 bps effective, not true 14 bps
- 4 bps floor (positive skew stress case) was NOT exercised this session

Conclusion: Provisional PASS. Reproducibility confirmed for neutral-skew regime. Positive-skew stress case unobserved. Second confirmation run needed.

Phase 5C.6 — Confirmation Run 2

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, size=10
Result:
- 22 fills | 10 buy / 12 sell | VW spread -0.83 bps | Avg -0.56 bps
- Ending inventory: ~44% XRP (moderate RLUSD bias) | Toxic: 0
- SELL effective offset reached 23+ bps → market trending down, stale floor orders only fill on reversion
- Buy fills into falling market at above-current-mid prices → negative realized spread on buys
- No mechanical issues — execution layer clean

Key finding: Same config, opposite market path → regime-split confirmed.

Regime	VW spread	Characteristic
Neutral / low skew	+1.22 bps	Stable, balanced fills
Drift / dislocation	-0.83 bps	Buy-into-trend + stale sell at floor

Conclusion: Config is not universally stable. Strategy is now demonstrably regime-sensitive. This is not a tuning issue — it is behavioral/regime interaction. We are no longer in static offset optimization.

Decision: Run third confirmation session unchanged to characterize the regime split further. Specifically observe: whether negative fills cluster during positive skew + downtrend periods, whether skew reaches floor (≤5 bps sell) at any point, and fill burst patterns vs continuous participation.

Next architectural candidate (conditional): Regime-aware directional filter (FLAG-014, now elevated to Phase 2). Only if third session confirms the split. Do not move to ask=13 until regime question is resolved.

Phase 5C.7 — Confirmation Run 3 (closed — engine fix identified)

Config: ask=14, bid=10, drift=0.65, skew_cap=10 (displayed), size=10
Result:
- 2 fills (1 buy / 1 sell) | VW spread -1.22 bps | Avg -0.52 bps
- XRP session: start 1.3571, end 1.3596, +0.182% (flat)
- Drift range: -1.65% to +7.62% | Ending drift: +2.67%
- Sell near-touch (SESSION): 45 ticks | 1 fill | 2.2% conversion
- Buy near-touch (SESSION): 1 tick | 1 fill | 100% conversion
- Anchor: mean +3.82 bps (positive, abnormal — prior sessions -6 to -8 bps)

Critical finding during this session: Code audit revealed max_skew_bps config field was never read by the engine. Engine ran at hardcoded 12 bps throughout all Phase 5C sessions. Fix shipped same night. Both 5C.3 and 5C.4 ran at skew_cap=12. VW improvement (+0.37 → +1.22) was regime variation, not parameter effect.

On the 45/1 sell conversion: Queue position hypothesis raised — market approached sell 45 times, converted once. However, anchor was +3.82 bps (pushing effective ask ~4 bps further from CLOB mid than configured), creating a confounder. Cannot isolate queue position from anchor distortion in this session.

On skew cap validity: Drift peaked at +7.62% — never crossed 10% threshold. Skew cap (10 vs 12) was never binding this session. Even with engine fix, a session that doesn't cross 10% drift produces no observable cap difference.

Phase 5C.4 rerun — True skew cap test (pending, engine fix now active)

Config: ask=14, bid=10, drift=0.65, skew_cap=10 (now engine-enforced), size=10
Pre-run: Run test suite (test_atomic_replace.py, test_main_loop.py). Confirm init log shows max_skew_bps: 10.0.
Evaluation criteria:

Condition	Interpretation
Max drift never ≥ 10%	Engine fix validated, but skew cap effect NOT observable — inconclusive for cap comparison
Max drift ≥ 10% at any point	Skew cap is binding — observe sell floor (should be 4 bps at cap=10), VW impact under stress

What to log at close: Max drift reached, time spent above 10% drift, VW spread during high-drift periods, whether sell floor (≤5 bps adjusted sell) was observed.

Pass criteria: VW spread positive, fill count 8-15, inventory controlled. If drift never exceeds 10%: treat as stability confirmation only, not skew cap validation. Skew cap validation requires a separate session with sufficient drift stress.
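The evaluation criteria above reduce to one question about the drift series. A minimal sketch, assuming drift is logged per tick in percent and that cap engagement is judged on the absolute value; the function name, the returned keys, and the second sample series are hypothetical.

```python
from typing import Dict, Sequence


def classify_cap_test(drift_pct: Sequence[float], cap_threshold_pct: float = 10.0) -> Dict[str, object]:
    """Summarize whether drift ever reached the level where the skew cap binds."""
    max_abs = max(abs(d) for d in drift_pct)
    ticks_at_cap = sum(1 for d in drift_pct if abs(d) >= cap_threshold_pct)
    engaged = ticks_at_cap > 0
    reading = (
        "cap binding: inspect sell floor and VW under stress"
        if engaged
        else "stability confirmation only: cap effect not observable"
    )
    return {
        "max_abs_drift_pct": max_abs,
        "ticks_at_or_above_cap": ticks_at_cap,
        "cap_engaged": engaged,
        "reading": reading,
    }


# Phase 5C.7 extremes from this log: drift peaked at +7.62%.
calm = classify_cap_test([-1.65, 2.67, 7.62])
# Hypothetical stressed series for contrast.
stressed = classify_cap_test([3.0, 9.5, 10.4, 11.0])
print(calm["cap_engaged"], stressed["cap_engaged"])
```

Counting ticks at or above the threshold also gives a rough proxy for "time spent above 10% drift" when the tick cadence is known.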

Actual result (Phase 5C.4 rerun, Apr 15 ~12:22 AM):
- 8 fills | 5 buy / 3 sell | VW spread +3.32 bps | Avg +3.71 bps | 0 toxicity
- Drift range: -6.98% to +7.25% — cap never engaged
- BUY conversion: 5 fills / 36 near-touch ticks = 13.9% | SELL: 3 fills / 17 near-touch ticks = 17.6%
- Spread DOM: 2% (first time) | Spread PnL: +0.0207 RLUSD
- Verdict: PASS (execution + stability). Cap not engaged — skew cap experiment still incomplete.

Phase 5C.8 — Overnight stability run (Apr 15)

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, size=10
Result:
- 5 fills | 2 buy / 3 sell | VW spread +0.69 bps | 0 toxicity
- Drift peak: +2.9% — cap never engaged (third consecutive session without cap engagement)
- Anchor: pending terminal summary (behavioral evidence: neutral/mildly positive — no conversion collapse)
- BUY conversion: 2 fills / 11 near-touch ticks = 18% | SELL: 3 fills / 56 near-touch ticks = 5.4%
- Session min dist to ask: 1.5 bps — queue position confirmed as binding (market within 1.5 bps, other sellers in front)
- Lower VW vs prior session: microstructure variation (fewer fills, missed sell opportunities), not config failure
- Verdict: PASS (calm regime stability). Sell conversion band now characterized: 5–18%, queue-sensitive.
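The conversion figures in these sessions are fills divided by near-touch ticks. A minimal sketch of that ratio, with the function name `conversion_pct` and the handling of the zero-denominator case being assumptions:

```python
from typing import Optional


def conversion_pct(fills: int, near_touch_ticks: int) -> Optional[float]:
    """Fills as a percentage of near-touch ticks; None when there were no near-touch ticks."""
    if near_touch_ticks == 0:
        return None
    return 100.0 * fills / near_touch_ticks


# Figures from the Phase 5C.4 rerun.
print(round(conversion_pct(5, 36), 1))   # BUY
print(round(conversion_pct(3, 17), 1))   # SELL
# Sweep-based session (Phase 5A: 24 sell fills, 0 near-touch ticks): ratio undefined.
print(conversion_pct(24, 0))
```

Returning None rather than 0% keeps sweep-based sessions from being read as conversion failures.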

Skew cap status: Three sessions post-fix, zero cap engagement. Waiting for drift ≥10%.

Phase 5C.9 — Anchor-dominated regime session (Apr 15, ongoing)

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, size=10
Anchor regime: POSITIVE-DOMINATED — mean +9.58 bps, median +10.00 bps (88% capped)
Effective positioning vs CLOB mid:

effective_buy  ≈ bid_adjusted (13.3) − anchor (9.58) = 3.7 bps  ← near-queue
effective_sell ≈ ask_adjusted (10.7) + anchor (9.58) = 20.3 bps ← defensive

Mid-session metrics (24 min):
- 9 fills | 5B/4S | VW +3.71 bps | 0 toxicity
- Buy conversion: 38.5% (5/13) — near-queue participation explains high rate
- Sell conversion: 26.7% (4/15)
- Opportunity imbalance: 46% BUY / 54% SELL — balanced
- Placement: on target (BUY 0.0 bps delta, SELL -1.0 bps)
- Drift: +3.3%, cap not engaged

Classification (Atlas framework): Anchor-dominated regime. Anchor was the primary pricing driver, not skew. Strong results are real (VW is accurate) but edge source is anchor-driven buy proximity, not config optimality. Case 2 — favorable anchor bias. Do not over-credit config.
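The effective-positioning arithmetic for this session can be reproduced directly. This sketch follows the two expressions given above (anchor subtracted on the buy side, added on the sell side); the function name is hypothetical and the -6.0 bps case is an illustrative value from the range the log cites for earlier sessions.

```python
from typing import Tuple


def effective_offsets_bps(bid_adjusted_bps: float, ask_adjusted_bps: float, anchor_bps: float) -> Tuple[float, float]:
    """Distance of each quote from CLOB mid once the anchor shift is included."""
    effective_buy = bid_adjusted_bps - anchor_bps
    effective_sell = ask_adjusted_bps + anchor_bps
    return effective_buy, effective_sell


# This session: positive-dominated anchor, mean +9.58 bps.
buy, sell = effective_offsets_bps(13.3, 10.7, 9.58)
print(round(buy, 2), round(sell, 2))

# Same adjusted offsets under a negative anchor: the sell side becomes the near one.
buy_neg, sell_neg = effective_offsets_bps(13.3, 10.7, -6.0)
print(round(buy_neg, 2), round(sell_neg, 2))
```

Flipping the anchor sign swaps which side sits close to the queue, which is the rotation the Atlas classification describes.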

Open questions this session will NOT answer: Neutral anchor test, drift ≥10% cap engagement.

Phase 6A — Atlas Session 29 Decision (Apr 16)

Decision: Run Session 29 unchanged. No parameter adjustments.
Rationale (Atlas): Changing the threshold to force more suppression activity would not make a stress regime more likely: the market generates regimes, the filter does not. A threshold change now would break comparability across Sessions 26/27/28 and convert the test from "does Phase 6A fix the known failure mode?" to "does a different system behave differently?" That second question is Phase 6B, not now.

Key principle stated: Momentum filter ≠ regime generator. We are waiting for inventory drift + market path → cross ±10%. That comes from volatility, directional moves, and time — not from parameter tweaks.

Low-frequency validation phase: Patience > optimization. If drift never crosses ±10% across multiple sessions, that becomes a separate and interesting observation ("system may naturally avoid extreme drift"). But we are not there yet.

Session 29 plan: Same config, 2 hours, same template, same evaluation criteria. Single focus: did abs(drift) ≥ 10%?

Phase 6A.2 — Session 28 (neutral anchor, near-stress, Apr 16)

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, momentum_filter_enabled=true, lookback=3, threshold=4.0
Result:
- 53 fills | 30B/23S | VW +0.95 bps | 0 toxicity | 1259 ticks | 2 hours
- Drift range: -7.23% to +8.94% — near-cap, did not cross 10%
- Anchor: mean -1.93 bps — near-neutral (closest session to anchor ≈ 0)
- XRP session: +0.99% (rising)

Momentum filter:
- Buy suppressions: 111 | Sell suppressions: 140
- ~20% of ticks involved suppression
- SELL > BUY in rising XRP session → directional logic correct ✓
- Classification: upper-bound moderate — worth watching, not adjusting

Interaction: Near-touch 74B/102S | Both sides active | No fill collapse

Key finding: First session where:
- Anchor is NOT driving the outcome (near-neutral, -1.93 bps mean)
- Filter is active (251 total suppressions)
- VW remains positive (+0.95 bps)
- System scales cleanly (53 fills)

vs Session 26 (pre-6A stress): Session 26 VW ≈ 0 at similar drift levels. Session 28 VW +0.95 bps with filter active — strong directional evidence Phase 6A is improving behavior under stress-like conditions.

Verdict: CONDITIONAL PASS → PRACTICAL PASS

Atlas classification: edge is intrinsic, not anchor-assisted. Config is profitable on its own mechanics.

Remaining gap: drift never crossed ±10% — formal stress confirmation still outstanding. One cap-engagement session closes Phase 6A completely.

Atlas structural conclusion (Apr 15): This session was the first with sufficient sample size (15 fills, 60/61 balanced near-touch, 8/7 balanced fills) to support a structural call:

EDGE = f(anchor regime, placement)
  anchor = regime selector (not noise, not confound)
  placement = spread capture mechanism
  skew = inventory management
  cap = stress protection (untested)

Anchor does not change the edge — it rotates which side expresses it:
- Negative anchor → sell competitive → sell-side edge expression
- Positive anchor → buy competitive → buy-side edge expression
- Neutral anchor → unknown → open test

First confirmed: top-of-book sell participation (dist_to_ask -0.0 bps) was NOT toxic (0 adverse fills, falling market). Kills the "tight positioning → adverse selection" concern.

System status: Real, repeatable edge confirmed across multiple anchor regimes. Execution correct. Metrics correct.

What remains unproven: Neutral-anchor robustness | Stress (≥10% drift) cap behavior

Phase 6A.3 — Session 29 (stress confirmed, FULL PASS, Apr 16)

Config (unchanged): ask=14, bid=10, drift=0.65, skew_cap=10, momentum_filter_enabled=true, lookback=3, threshold=4.0
Mid-run (71 min, session still running at time of write):
- 75 fills | 41B/34S | 55%/45% | VW +0.62 bps | 0 toxicity
- Drift range: -13.8% to ~+3% — ±10% threshold crossed ✓
- Anchor: mean -1.93 bps — near-neutral (consistent with S28, clean comparison)
- Sell fill conversion: 87.2% (34/39) — near-maximal; momentum filter suppressed over-buying into weakness
- Buy fill conversion: 57.7% (41/71)
- XRP session: +0.553%

Scoring vs Phase 6A evaluation template:
- Startup gate: PASS
- Side-correctness: PASS (BUY suppressed more in falling market window)
- Overfiring: No
- Baseline preserved: PASS — VW positive through stress
- Stress condition: MET — drift crossed -13.8%, cap engaged
- Verdict: FULL PASS

Atlas interpretation (Apr 16): 87.2% sell conversion + 57.7% buy conversion in a falling market = exactly correct system behavior. Momentum filter prevented the exact failure mode from Session 26 (over-buying into weakness). Edge confirmed under: neutral anchor + cap-engaged stress + directional move (falling).

Phase 6A structural conclusion (Atlas, Apr 16):

system profitable under:
  - neutral anchor (not anchor-assisted)
  - cap-engaged stress (drift ≥ 10%)
  - directional move (falling market)
momentum filter: proven to prevent Session 26 failure mode

Phase 6A CLOSED.

Final terminal summary:
- Ticks: 1128 | Elapsed: 7214s (~2 hours)
- Fills: 121 total | 65B / 56S | 0/121 toxic (0.0%) — highest fill count to date
- VW spread: +1.1819 bps | Avg spread: +1.8493 bps
- Drift: ending=+11.82% | range=-10.92% to +13.47% — crossed ±10% in BOTH directions ✓
- Anchor: mean=-0.94 bps | median=-2.17 bps — near-neutral (closest to zero across all sessions)
- XRP session: start=1.4081, end=1.4266, +1.315%
- Momentum suppressions: 229 BUY / 231 SELL — near-perfectly balanced across a session where XRP moved both directions
- Ending inventory: 62.35 XRP / 54.94 RLUSD | XRP share 61.8% | drift +11.82% (at skew cap)
- session_min_dist_to_ask: 0.0 bps — top-of-book sell participation confirmed again

Three-axis report:
1. Drift axis: -10.92% to +13.47% — most extreme session to date. Both directions crossed ±10%.
2. Anchor axis: mean -0.94 bps — closest to true neutral across all Phase 6A sessions. No anchor assistance.
3. Outcome axis: VW +1.1819 bps, 121 fills, 0 toxicity, 65B/56S balanced. VW improved vs Session 28 under more stress.
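The ending-inventory line can be cross-checked from the balances and the session-end price. This sketch assumes drift is the XRP value share minus a 50% target, which is consistent with the share/drift pairs reported throughout this log but is not confirmed from the engine code; the function name is hypothetical.

```python
from typing import Tuple


def inventory_drift(xrp_units: float, rlusd_units: float, xrp_price_rlusd: float,
                    target_share_pct: float = 50.0) -> Tuple[float, float]:
    """Return (XRP value share in %, drift in percentage points from target)."""
    xrp_value = xrp_units * xrp_price_rlusd
    total_value = xrp_value + rlusd_units
    share_pct = 100.0 * xrp_value / total_value
    return share_pct, share_pct - target_share_pct


# Session 29 ending balances, valued at the session-end XRP price.
share, drift = inventory_drift(62.35, 54.94, 1.4266)
print(round(share, 1), round(drift, 2))
```

The same function applied to other sessions gives a quick consistency check between reported balances, share, and drift.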

vs Session 26 (the known failure mode): Session 26 VW ≈ 0 at ±10% drift. Session 29 VW +1.18 bps at ±10% drift in both directions, 2× the fills, 0 toxicity. This is the definitive comparison. Phase 6A fixed exactly what it was designed to fix.

Suppression balance (229B/231S): Near-perfect balance reflects a session where XRP fell and rose significantly. Filter suppressed buys into weakness and sells into strength equally — correct directional behavior across both regimes.

Phase 6B — Spread Capture Optimization

Goal: Maximize edge per trade. Phase 6A proved the system survives stress. Phase 6B extracts more from it.

Atlas framing (Apr 16):

Previously: "does it work?"
Now:        "how much edge per trade?"

Transition principle: Quote competitiveness (offsets) is the primary lever — not size, not duration, not complexity. Size scales edge; it doesn't create it. Fix edge first, then scale.

Phase 6B sequencing (Atlas decision):
1. ask_offset 14 → 13 bps — single variable, directly affects sell-side conversion and fill probability. Test whether VW holds at tighter placement.
2. Size 10 → 15 RLUSD — only after edge at ask=13 is confirmed. Scales edge without changing it.

Expected behavior at ask=13:
- ↑ sell fills and sell conversion
- Slight ↓ avg spread per fill
- ↑ total spread PnL (more fills compensates)
- If VW collapses → too aggressive. If VW holds → free money.

Gate for size increase: ask=13 VW positive and stable across ≥2 sessions.

Phase 6B.1 — Session 30 (first ask=13 run, Apr 16)

Config change: ask_offset_bps: 14.0 → 13.0 (single variable)
Session ID: 30 | Duration: 16:25:55 → 18:25:30 UTC (~2h)
Reconstruction: Terminal summary lost (Cowork outage). Rebuilt from DB.

Fill metrics:
- Fills: 93 total (50B / 43S) | Toxic: 0
- Avg spread: 1.664 bps | VW spread: +0.7882 bps
- Session spread PnL: 0.0543 RLUSD

Price/drift context:
- XRP: 1.4336 → 1.4318 (−0.126%)
- Drift range: −8.01% → +12.91% | Ending: −2.08%
- Starting drift: +12.91% (max long XRP) — session opened with max inventory imbalance

Momentum filter:
- Buy suppressions: 193 | Sell suppressions: 201 (near-balanced ✓)

Anchor (mid price):
- Mean: 1.4254 | Median: 1.4249 | Range: 1.4164–1.4361

Ending inventory:
- XRP: 48.27 | RLUSD: 75.10 | Total value: 144.21 (was 144.35, −0.14 RLUSD session)

Segmented evaluation — context critical:

Segment A (rebalancing): Session opened at +12.91% long XRP — full skew, max sell pressure. Engine spent significant early ticks rebalancing toward neutral. VW in this phase is compressed by skew-driven sell fills at suboptimal placement.

Segment B (post-normalization): Once drift approached neutral, engine operated in steady-state. This is the true ask=13 baseline. No granular per-tick data available for clean segmentation (DB only).

Segment B analysis (FLAG-026 tooling, Apr 16):

Metric	Full Session	Segment A	Segment B
VW spread bps	+0.79	+0.20	+1.29
Fill count	93	42	51
Buy conversion	33.1%	27.9%	37.3%
Sell conversion	53.1%	48.9%	58.8%
Tick count	1173	394	779

Segment B VW +1.29 bps exceeds Phase 6A baseline (~1.18 bps). Rebalancing was suppressing the full-session read by ~0.5 bps. The signal was always there.
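The Segment A / Segment B split can be sketched as follows. This is not the FLAG-026 tooling: the boundary rule (first tick where absolute drift falls inside a neutral band), the 2% band, the fill tuple layout, and all sample values are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

Fill = Tuple[int, float, float]  # (tick_index, realized_spread_bps, fill_size)


def first_neutral_tick(drift_pct_by_tick: Sequence[float], neutral_band_pct: float = 2.0) -> int:
    """Index of the first tick with |drift| inside the band (len(series) if never)."""
    for i, drift in enumerate(drift_pct_by_tick):
        if abs(drift) <= neutral_band_pct:
            return i
    return len(drift_pct_by_tick)


def vw_spread_bps(fills: List[Fill]) -> Optional[float]:
    """Size-weighted mean spread, or None for an empty segment."""
    total_size = sum(size for _, _, size in fills)
    if total_size == 0:
        return None
    return sum(spread * size for _, spread, size in fills) / total_size


def segment_vw(fills: List[Fill], boundary_tick: int) -> Tuple[Optional[float], Optional[float]]:
    """VW spread for Segment A (before boundary) and Segment B (boundary onward)."""
    seg_a = [f for f in fills if f[0] < boundary_tick]
    seg_b = [f for f in fills if f[0] >= boundary_tick]
    return vw_spread_bps(seg_a), vw_spread_bps(seg_b)


# Hypothetical session that opens heavily long XRP and then normalizes.
drift_series = [12.9, 9.0, 5.0, 1.5, 0.5, -1.0]
sample_fills = [(0, 0.2, 10.0), (2, 0.4, 10.0), (3, 1.2, 10.0), (5, 1.4, 10.0)]
boundary = first_neutral_tick(drift_series)
vw_a, vw_b = segment_vw(sample_fills, boundary)
print(boundary, vw_a, vw_b)
```

Reporting both segments alongside the full-session figure keeps the rebalancing cost visible instead of folding it into the steady-state read.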

Revised verdict: Segment B PASS. ask=13 is not degrading VW. Steady-state edge at ask=13 is stronger than the ask=14 baseline. This is one session only; Session 31 Segment B is needed to confirm stability. Atlas gate: positive and stable across ≥2 sessions before size increase.

Next: Session 31, same config (ask=13), run immediately per Atlas Q1 decision.

Phase 6B.2 — Session 31 (DB corruption + FLAG-027, Apr 16–17)

Session 31 attempt 1 (Apr 16): Gateway (QuikNode) stalled mid-session with tick latency climbing from expected ~4s to 8–17s. Dashboard surfaced RPC=failed (≥120s since last tick). Stop-Process used to terminate → killed engine mid-WAL-checkpoint. DB tail-truncated by 3 pages. fills.session_id index + valuation_snapshots corrupt. 67 fills unrecoverable. Rolled back to flag025_test.db (pre-S31 snapshot, 640 fills, S30 intact).

Session 31 attempt 2 (Apr 16 → Apr 17): Re-ran against the restored DB. Engine ran and appeared to complete. Terminal closed after apparent completion → CTRL_CLOSE_EVENT killed the process mid-WAL-checkpoint before final flush. DB tail-truncated by 148 pages (much worse). Session 31's 93 fills entirely in the un-flushed pages — 0 recoverable from file state. Terminal summary (seen briefly by Katja pre-close) is the only record of Session 31 outcome.

Conclusion: Two independent corruption events within 24h, same root-cause class (process killed during WAL checkpoint). Session 31 as a data point is lost. No Segment B number for Session 31.

FLAG-027 shipped Apr 17 (commit 9e124b4):
- Pre-run atomic DB backup (SQLite backup API, retention=10, .bak.YYYYMMDDTHHMMSSZ)
- Blocking startup gateway preflight (no more stall → forced kill)
- Signal handlers (SIGINT/SIGTERM/SIGBREAK) route to clean engine._shutdown()
- Windows CTRL_CLOSE_EVENT is uncatchable from Python, so the pre-run backup is the backstop for that class
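The pre-run backup item could look roughly like the sketch below. It is not the code from commit 9e124b4: the function name `backup_database`, the temporary file name, and the `now` parameter are hypothetical, and only the SQLite backup API, the retention of 10, and the `.bak.YYYYMMDDTHHMMSSZ` suffix come from the entry above.

```python
import os
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def backup_database(db_path, retention=10, now=None):
    """Snapshot db_path to <name>.bak.<UTC stamp>; keep the newest `retention` copies."""
    if retention < 1:
        raise ValueError("retention must be at least 1")
    src_path = Path(db_path)
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    final_path = src_path.with_name(f"{src_path.name}.bak.{stamp}")
    # Hypothetical scratch name; it deliberately does not match the .bak.* pattern.
    tmp_path = src_path.with_name(f"{src_path.name}.backup-in-progress")

    src = sqlite3.connect(str(src_path))
    dst = sqlite3.connect(str(tmp_path))
    try:
        # sqlite3.Connection.backup wraps SQLite's online backup API and
        # copies a consistent snapshot, including when the source uses WAL.
        src.backup(dst)
    finally:
        dst.close()
        src.close()

    # Rename only after the copy completes, so a partial file never
    # carries a .bak.* name.
    os.replace(tmp_path, final_path)

    # UTC stamps sort lexicographically, so the oldest backups come first.
    backups = sorted(src_path.parent.glob(f"{src_path.name}.bak.*"))
    for stale in backups[:-retention]:
        stale.unlink()
    return final_path
```

Calling `backup_database("flag025_test.db")` before the engine opens the file would leave a restorable snapshot even if the process is later killed mid-checkpoint.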

Gate: No more sessions until FLAG-027 is verified live (backup path visible in logs, preflight reports OK, signal-driven shutdown confirmed clean on a short run). Session 31 re-run scheduled after verification, same config (ask=13).
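For the signal-driven shutdown check, a minimal sketch of the handler wiring is shown below. The helper name `install_shutdown_handlers` is hypothetical, and the real `engine._shutdown()` may be wired differently. SIGBREAK is looked up dynamically because Python's `signal` module only defines it on Windows.

```python
import signal


def install_shutdown_handlers(shutdown):
    """Route termination signals to `shutdown`; return the signals that were wired."""

    def _handler(signum, frame):
        # Delegate to the caller-supplied clean shutdown routine.
        shutdown()

    wired = []
    for name in ("SIGINT", "SIGTERM", "SIGBREAK"):
        sig = getattr(signal, name, None)  # SIGBREAK is missing off Windows
        if sig is None:
            continue
        # signal.signal must be called from the main thread.
        signal.signal(sig, _handler)
        wired.append(sig)
    return wired
```

On a short verification run, something like `install_shutdown_handlers(engine._shutdown)` followed by Ctrl+C should reach the shutdown path rather than killing the process mid-checkpoint.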

Phase 6B.2 — Session 31 (DB session 32) — ask=13 confirmation attempt (Apr 17)¶

Config: ask=13, bid=10, drift=0.65, skew_cap=10, momentum_filter_enabled=true (unchanged from 6B.1)

Duration: ~2 hours | Session ID (DB): 32

Segment B results:

Metric	Session 30 (6B.1)	Session 31 (6B.2)	Threshold
Seg B VW spread	+1.29 bps	+0.63 bps	≳ 1.1 bps
Seg B fill count	51	62	—
Coverage ratio	64%	67.3%	—
Fills per minute	0.66	0.77	—
Toxicity	0%	0%	—

Verdict: FAIL — Segment B VW +0.63 bps is below the 1.1 bps threshold.

Classification (Atlas): Case 3 — over-aggressive quoting. VW↓ + fills↑ = tighter ask converted more fills but at lower spread quality per fill. ask=13 is at or past the queue-competitive boundary. No stability across two sessions at this offset.
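The Case 3 pattern (VW down while fills go up) can be written as a simple predicate. This is only an illustration of the rule as stated above: the function name and dictionary keys are hypothetical, and the other cases in Atlas's classification are not encoded.

```python
def is_over_aggressive(prev, curr):
    """Case 3 pattern: spread quality falls while fill count rises."""
    return curr["vw_bps"] < prev["vw_bps"] and curr["fills"] > prev["fills"]


# Segment B figures from the table above.
s30 = {"vw_bps": 1.29, "fills": 51}
s31 = {"vw_bps": 0.63, "fills": 62}

print(is_over_aggressive(s30, s31))  # True for S30 -> S31
```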

Decision (Atlas): REVERT ask_offset_bps 13 → 14. ask=13 is not a reliable operating point. Two-session record at ask=13: 1 PASS, 1 FAIL. The tightening direction is exhausted.

Next step: 1 confirmation session at ask=14 — validate Phase 6A baseline still holds before declaring Phase 6B resolved.

Note on DB corruption: Session 31 suffered two DB corruption events (kill-during-WAL-checkpoint, same root class). FLAG-027 shipped to prevent recurrence. Session results were recovered from terminal summary (Katja's pre-close read). No DB-sourced Segment B tooling output available — results above are from terminal summary + visual inspection.

Phase 6B.3 — Session 32 (DB session 33) — ask=14 confirmation (Apr 17)¶

Config: ask=14, bid=10, drift=0.65, skew_cap=10, momentum_filter_enabled=true (reverted from 6B.2)

Session summary:
- Duration: 7,215s (2h)
- Ticks: 1,249
- Fills: 44 total | buy=26 sell=18 | toxic=0/44 (0.0%)
- VW realized spread (full session): +1.3752 bps
- Avg realized spread: +1.7505 bps
- XRP: start=1.4826 end=1.4882 chg=+0.378%
- Drift: ending=+6.42% | range=−12.09% to +10.30%
- Momentum suppressions: buy=119 sell=133 (balanced)
- Anchor: mean=−3.33 bps | median=−7.44 bps | bias=negative

Segment B results (|drift| ≤ 5%):

Metric	Full	Seg A	Seg B
Fill count	44	17	27
Buy / Sell	26/18	8/9	18/9
VW spread (bps)	+1.38	+0.65	+1.89
Spread PnL (RLUSD)	+0.0439	+0.0086	+0.0354
Tick count	1,249	315	934
Coverage	100%	26.1%	73.9%

Segment B VW: +1.89 bps — PASS (threshold ≳ 1.0 bps)
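The segmentation rule (|drift| ≤ 5% for Segment B) can be sketched as follows. This is not the FLAG-026 tooling: the field names `drift_pct`, `spread_bps`, and `size` are hypothetical, the sample fills are illustrative rather than session data, and splitting per fill rather than per tick is an assumption.

```python
def vw_spread_bps(fills):
    """Size-weighted mean of realized spread; None when there is nothing to weight."""
    total_size = sum(f["size"] for f in fills)
    if total_size == 0:
        return None
    return sum(f["spread_bps"] * f["size"] for f in fills) / total_size


def split_by_drift(fills, max_abs_drift_pct=5.0):
    """Segment B holds fills with |drift| <= threshold; Segment A holds the rest."""
    seg_a = [f for f in fills if abs(f["drift_pct"]) > max_abs_drift_pct]
    seg_b = [f for f in fills if abs(f["drift_pct"]) <= max_abs_drift_pct]
    return seg_a, seg_b


# Illustrative fills only, not taken from any session.
fills = [
    {"drift_pct": 10.3, "spread_bps": 0.5, "size": 10.0},
    {"drift_pct": -7.2, "spread_bps": 0.9, "size": 10.0},
    {"drift_pct": 2.1, "spread_bps": 2.0, "size": 10.0},
    {"drift_pct": -0.4, "spread_bps": 1.6, "size": 5.0},
]

seg_a, seg_b = split_by_drift(fills)
print("Seg A VW:", vw_spread_bps(seg_a))
print("Seg B VW:", vw_spread_bps(seg_b))
```

Keeping the weighting and the split in separate functions makes it easy to change the drift threshold without touching the spread calculation.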

Verdict: PASS. Phase 6B CLOSED. ask=14 confirmed as baseline.

Phase 6B summary across all sessions:

Session	Config	Seg B VW	Result
S30 (6B.1)	ask=13	+1.29 bps	PASS
S31 (6B.2)	ask=13	+0.63 bps	FAIL
S32 (6B.3)	ask=14	+1.89 bps	PASS

ask=13 rejected — unstable, over-tightening (Case 3). ask=14 is the confirmed operating point.
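The Atlas gate (Segment B VW positive and stable across ≥2 sessions) can be expressed as a small check. The function name is hypothetical, and treating the approximate "≳" threshold as a hard cutoff is a simplifying assumption.

```python
def passes_gate(seg_b_vw_bps, threshold_bps, min_sessions=2):
    """True only if enough sessions exist and every one clears the threshold."""
    if len(seg_b_vw_bps) < min_sessions:
        return False
    return all(vw >= threshold_bps for vw in seg_b_vw_bps)


# Segment B VW for the two ask=13 sessions (S30, S31) from the table above.
ask13_sessions = [1.29, 0.63]

print(passes_gate(ask13_sessions, threshold_bps=1.1))  # False: S31 is below 1.1
```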

Gate opened: Capital injection sequence (FLAG-008 → FLAG-030 → FLAG-031 → FLAG-032 → dry-run → $50 injection → size 15 RLUSD).

Phase 7 — Executable Price-Aware Quoting¶

Goal: Replace anchor-based quoting with CLOB-referenced quoting. The capped_amm anchor has a structural limitation: when CLOB-AMM divergence is large and persistent, the anchor systematically lags the real market. Phase 7 tests whether using CLOB mid directly improves spread capture under these conditions.

Architecture direction (Atlas-locked Apr 18):
- CLOB mid = primary price reference
- capped_amm = secondary signal only (not controlling midpoint)
- Regime model: ALIGNED ≤3 bps / DIVERGENT 3–8 bps / STRESS >8 bps (anchor_error_bps)
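The regime model can be sketched as a three-way classifier. Two details are assumptions rather than part of the locked spec: the bands are applied to the absolute value of `anchor_error_bps`, and an error of exactly 8 bps is assigned to DIVERGENT (since STRESS is stated as >8 bps).

```python
def classify_regime(anchor_error_bps):
    """Map anchor error (bps) to ALIGNED / DIVERGENT / STRESS."""
    magnitude = abs(anchor_error_bps)  # assumption: sign does not matter
    if magnitude <= 3.0:
        return "ALIGNED"
    if magnitude <= 8.0:
        return "DIVERGENT"
    return "STRESS"


for error_bps in (2.0, 5.45, -9.88):
    print(error_bps, classify_regime(error_bps))
```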

Phase 7 Context — Post-Injection Diagnostic Sessions¶

S35: First post-injection session (Apr 17, ~2 hrs)

Config: bid=10 | ask=14 | size=15 RLUSD | anchor cap=10 bps
- Fills: 0 total
- Cause: CLOB-AMM divergence 20–23 bps. With cap=10, anchor lagged CLOB by ~10–13 bps → quotes displaced ~10 bps from real market.
- Verdict: Case A (zero fills). Atlas ruling: hold anchor at 10, run one diagnostic session before deciding.
- Note: size=15 makes anchor displacement non-trivial vs prior size=10 regime.
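The S35 lag arithmetic follows from clamping the anchor at the cap. The sketch below assumes the anchor tracks the CLOB-AMM divergence and is clamped symmetrically at ±cap, which is consistent with the [-10, +10] anchor range logged elsewhere. The function names are hypothetical.

```python
def capped_anchor_bps(divergence_bps, cap_bps):
    """Clamp the anchor adjustment to [-cap, +cap]."""
    return max(-cap_bps, min(cap_bps, divergence_bps))


def anchor_lag_bps(divergence_bps, cap_bps):
    """Portion of the divergence the capped anchor cannot follow."""
    return divergence_bps - capped_anchor_bps(divergence_bps, cap_bps)


# S35 divergence range with the 10 bps cap.
for divergence in (20.0, 23.0):
    print(divergence, anchor_lag_bps(divergence, cap_bps=10.0))
```

With a 10 bps cap, a 20–23 bps divergence leaves 10–13 bps uncovered, matching the lag described for S35.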

S36: Diagnostic session (Apr 18 overnight, ~2 hrs)

Config: bid=10 | ask=14 | size=15 RLUSD | anchor cap=10 bps
- Fills: 19 total (buy=10, sell=9) | toxic=0
- VW spread: +0.65 bps (below ~1.1 bps Phase 6A baseline)
- Anchor: mean=+2.70 bps | median=+5.45 bps | range=[-10, +10] | bias=positive
- Ending inventory: 36.6% XRP / 63.4% RLUSD (RLUSD-heavy, consistent with positive anchor bias)
- Verdict: Case B confirmed. Fills returned but spread quality degraded. Structural pricing limitation.
- Atlas decision: skip cap widening (Path A rejected), move to Phase 7 (Path B). Cap is not the problem; the anchor is not the execution venue.

Phase 7.1 — Instrumentation Complete ✅¶

Branch merged to main Apr 18: fix/cleanup-and-anchor-instrumentation

Also resolved: FLAG-034 ✅ | max_inventory_usd retired ✅ | FLAG-033 ✅ | FLAG-028 ✅

Phase 7.1 — S37 Baseline Session (Apr 18, ~2 hrs)¶

Metric	Value
Fills	29 total (buy=19, sell=10)
Toxic fills	0 / 29 (0.0%)
VW realized spread	+3.40 bps
Avg realized spread	+3.22 bps
XRP price change	-0.627%
Anchor mean	+5.51 bps
Anchor median	+9.88 bps
Anchor range	[-10.0, +10.0]
|err|>5bps	69.4%
Buy suppressions	45
Sell suppressions	29
Ending XRP share	50.38% (near-balanced)
Ending drift	+8.06% (range: -4.17% to +13.45%)

Quote quality at session end:
- Skew: +8.1 bps applied → BUY at 18.1 bps | SELL at 5.9 bps (base: BUY 10 / SELL 14)
- SELL dist_to_ask: -0.0 bps (at touch) | session_min_dist=1.2 bps
- BUY dist_to_bid: 18.0 bps (very passive due to inventory skew + anchor bias)
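The skew figures above are consistent with adding the skew to the buy offset and subtracting it from the sell offset. The sketch below encodes that inferred convention; the function name is hypothetical, and the clamp uses the `skew_cap=10` value from the session config.

```python
def skewed_offsets(bid_offset_bps, ask_offset_bps, skew_bps, max_skew_bps=10.0):
    """Apply inventory skew: positive skew widens the buy and tightens the sell."""
    skew = max(-max_skew_bps, min(max_skew_bps, skew_bps))
    return bid_offset_bps + skew, ask_offset_bps - skew


# Base offsets and skew from the S37 session-end quote.
buy_bps, sell_bps = skewed_offsets(10.0, 14.0, skew_bps=8.1)
print(f"BUY at {buy_bps:.1f} bps | SELL at {sell_bps:.1f} bps")
```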

Key findings:
1. VW +3.40 bps: strongest since Phase 6B. Fills returned and spread quality recovered vs S36 (+0.65 bps).
2. |err|>5bps = 69.4%: the anchor was in unreliable territory for nearly 70% of all ticks. This is the baseline for Phase 7.2 calibration.
3. 19/10 buy/sell split: XRP price fell 0.627%, sweeping buy orders despite the wide offset (18 bps). Positive anchor bias prices SELL closer to market, so sell fills should have been easier, yet there were only 10 sells vs 19 buys. Skew+drift suppression limited sell conversion.
4. Regime: STRESS-dominant (anchor median 9.88 bps, |err|>5bps 69.4%). At the Phase 7.2 threshold of 3 bps, near-100% of ticks would switch to CLOB reference.

Verdict: S37 PASS — strong baseline established. Phase 7.2 implementation approved. Binary CLOB/anchor switch ready to build.

Phase 7.2 — Binary CLOB/Anchor Switch (UPCOMING — Orion)¶

Spec (Atlas-locked Apr 18):

clob_mid = (best_bid + best_ask) / 2

if abs(anchor_error_bps) > 3:
    reference_mid = clob_mid
else:
    reference_mid = anchor_mid

Constraints:
- No blending: binary switch only
- No weighting: clean experiment
- No other changes in same commit

Threshold: 3 bps = live control threshold | 5 bps = evaluation reliability floor

Expected effect: Near-100% CLOB reference at current regime (69.4% of ticks already exceed 5 bps, so at least that share exceeds 3 bps). Should eliminate anchor lag and improve quote competitiveness on both sides.
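A runnable version of the spec might look like the sketch below. The spec does not define how `anchor_error_bps` is computed, so the formula here (anchor deviation from CLOB mid, in basis points of CLOB mid) is an assumption, as are the function name and the sample prices.

```python
def select_reference_mid(best_bid, best_ask, anchor_mid, threshold_bps=3.0):
    """Binary switch: use CLOB mid when the anchor error exceeds the threshold."""
    clob_mid = (best_bid + best_ask) / 2
    # Assumed definition: anchor deviation from CLOB mid, in basis points.
    anchor_error_bps = (anchor_mid - clob_mid) / clob_mid * 10_000
    if abs(anchor_error_bps) > threshold_bps:
        return clob_mid
    return anchor_mid


# Illustrative prices only. The first anchor is about 8.7 bps off the CLOB mid,
# the second about 0.7 bps.
print(select_reference_mid(1.4880, 1.4884, anchor_mid=1.4895))
print(select_reference_mid(1.4880, 1.4884, anchor_mid=1.4883))
```

The first call falls back to the CLOB mid and the second keeps the anchor, with no blending between the two, as the constraints require.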

Open Questions / Next Steps¶

Complete Dataset #2 capture and replay run — critical for distinguishing "calm regime" from "simulation artifact" as the explanation for near-zero toxicity in Dataset #1. If toxicity remains near zero across 2+ distinct datasets, real adverse selection is materially lower than static environment suggested. If Dataset #2 shows adverse conditions, participation filter becomes relevant.
Validate participation filter under adverse conditions — Phase 4A showed zero activations in Dataset #1 (calm regime). Filter is unvalidated under replay until a dataset with genuine adverse selection events is run. Cannot confirm or deny filter value until this happens.
Multi-dataset gate before live — minimum 2–3 distinct replay periods required. Performance must be consistent across datasets. If degradation appears, investigate regime sensitivity before any live test consideration.
Re-test Phase 4B signals under replay — volatility and spread-regime signals deferred from static environment. Evaluate once multi-dataset baseline is confirmed.
Anchor architecture — deferred. System is operating as CLOB-pegged with constant capped premium. Revisit if participation filter dimensions are exhausted post-replay.

Closed / Deprioritized¶

Dimension	Status	Reason
Bid distance tuning	Closed	Optimal at 1.00–1.10
Directional momentum filter	Closed	Signal not predictive at 4s cadence
Anchor-relative threshold filter	Closed	Signal constant (pinned anchor)
Anchor cap reduction for dynamics	Closed	Anchor remains pinned regardless of cap value
Ladder / queue logic	Deferred	Participation not the bottleneck
Anchor architecture redesign	Deferred	Functional baseline exists
Volatility-based filter	Deferred	Paper env confirmed static (not a bug). Re-test under replay once adapter is built.
Spread-regime filter	Deferred	Same reason — requires replay environment with real spread variation.