livepilot 1.17.1 → 1.17.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,129 @@
1
1
  # Changelog
2
2
 
3
+ ## 1.17.2 — iterate_toward_goal + preview-studio truth-gap (April 23 2026)
4
+
5
+ ### Added
6
+
7
+ - **`iterate_toward_goal` MCP tool** (`mcp_server/tools/agent_os.py`,
8
+ `mcp_server/tools/_agent_os_engine/iteration.py`): closes the outer
9
+ evaluation loop. Given a compiled `GoalVector` and a list of candidate
10
+ move sets, runs up to N experiments sequentially. Each iteration
11
+ creates an experiment, runs all branches (with per-branch
12
+ apply-snapshot-undo already handled by the existing experiment engine),
13
+ scores the top branch against the goal, and either commits (score ≥
14
+ threshold) or discards and tries the next candidate set. On timeout,
15
+ commits the best-so-far (`on_timeout="commit_best"`, default) or
16
+ commits nothing (`on_timeout="discard_on_timeout"`). Per-branch undo
17
+ stays inside `run_experiment` — this loop never issues a raw undo.
18
+ Tool count: 426 → 427.
19
+
20
+ Engine ships as both a pure-sync `iterate_toward_goal_engine` (for
21
+ tests with in-memory fakes) and `iterate_toward_goal_engine_async`
22
+ (for the live MCP wrapper with coroutine callbacks); the sync entry
23
+ auto-detects coroutine callbacks and dispatches accordingly. Covered
24
+ by 11 tests in `tests/test_iterate_toward_goal.py` spanning happy
25
+ path, exhaustion + commit-best, exhaustion + discard, no candidates,
26
+ no-winner iterations, max_iterations capping, async coroutine
27
+ callbacks, and MCP registration.
28
+
29
+ This is the P0 item from the v1.17.1 review gap-analysis between
30
+ "tool orchestration" and "agentic optimization" — the create /
31
+ run / compare / commit primitives existed but nothing drove them
32
+ toward a scalar goal. `iterate_toward_goal` is that driver.
33
+
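  The control flow above can be sketched in a few lines. This is an
  illustrative model, not the real engine API: `run_experiment`,
  `score_branch`, `commit`, and `discard` are hypothetical stand-ins for
  the existing experiment primitives, and only the threshold / timeout
  semantics come from this entry.

  ```python
  import time

  def iterate_toward_goal(candidates, threshold, max_iterations, budget_s,
                          run_experiment, score_branch, commit, discard,
                          on_timeout="commit_best"):
      """Sketch: try candidate move sets until one scores past the
      threshold, otherwise fall back per on_timeout."""
      deadline = time.monotonic() + budget_s
      best = None  # (score, experiment) seen so far
      for moves in candidates[:max_iterations]:
          if time.monotonic() >= deadline:
              break
          exp = run_experiment(moves)   # per-branch apply/snapshot/undo
                                        # happens inside run_experiment
          score = score_branch(exp.top_branch)
          if score >= threshold:
              commit(exp)               # winner found: commit and stop
              return {"committed": True, "score": score}
          if best is None or score > best[0]:
              best = (score, exp)
          discard(exp)                  # below threshold: try the next set
      # Exhausted or timed out: commit the best-so-far, or nothing.
      if best is not None and on_timeout == "commit_best":
          commit(best[1])               # real engine re-applies this branch
          return {"committed": True, "score": best[0],
                  "fallback": "best_so_far"}
      return {"committed": False}
  ```

  Note that the loop never issues a raw undo itself; cleanup is entirely
  the experiment engine's job, matching the invariant stated above.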
34
+ ### Fixed
35
+
36
+ - **Preview Studio truth-gap** (`mcp_server/preview_studio/engine.py`,
37
+ `mcp_server/preview_studio/tools.py`): two compounding bugs made the
38
+ system lie about committed state.
39
+ 1. `compare_variants()` scored every variant without filtering for
40
+ `status="blocked"` or missing `compiled_plan`. A blocked /
41
+ analytical-only variant could win the recommendation even with a
42
+ higher taste_fit than the only executable option. Fix: partition
43
+ variants into executable vs analytical, score only the executable
44
+ list, surface the analytical bucket on a new `analytical_candidates`
45
+ field for introspection. `recommended` stays a bare string (or
46
+ `None` when no executable variant exists) so no API shape breaks.
47
+ 2. `commit_preview_variant()` called `engine.commit_variant()` — which
48
+ flips `preview_set.status = "committed"` and discards every sibling
49
+ variant — BEFORE checking whether the chosen variant had a compiled
50
+ plan. Analytical-only picks therefore got recorded as committed
51
+ with `committed=False` in the response and the preview set's
52
+ in-memory state said the opposite. Wonder lifecycle also advanced
53
+ to `resolved`. Fix: short-circuit analytical/blocked picks at the
54
+ top of the handler, return `{committed: False, reason:
55
+ "analytical_only" | "blocked", ...}`, leave `preview_set.status`
56
+ untouched, and gate Wonder lifecycle hooks behind the executable
57
+ branch. New regression tests in `tests/test_preview_studio_truth_gap.py`
58
+ lock all four scenarios (A1-A4 from the remediation plan).
59
+ - **Runtime capability probes stop lying about `web` and `flucoma`**
60
+ (`mcp_server/runtime/tools.py`, `mcp_server/runtime/capability_state.py`):
61
+ `get_capability_state` previously hardcoded `web_ok=False` and never
62
+ emitted a `flucoma` domain at all, causing `route_request` to pick
63
+ degraded research/perception paths on machines where those
64
+ capabilities were actually available. `_probe_web()` now runs a
65
+ 500 ms HEAD request to `https://api.github.com` using stdlib
66
+ `urllib.request` (no new dependency); `_probe_flucoma()` uses
67
+ `importlib.util.find_spec("flucoma")` with safe exception swallowing.
68
+ The `flucoma` domain is now emitted unconditionally so consumers can
69
+ distinguish "probed and missing" from "not probed yet".
70
+ - **`build_song_brain` flags degraded responses**
71
+ (`mcp_server/song_brain/tools.py`): When `get_session_info` fails,
72
+ the tool injected `{tempo: 120.0, track_count: 0}` and returned a
73
+ polished SongBrain with no indication the inputs were synthesized.
74
+ The fallback is preserved for backward compatibility but the
75
+ response now carries a top-level `degradation` payload
76
+ (`{is_degraded, reasons, substituted_fields}`) so callers can branch
77
+ on synthesized vs real data.
78
+ - **`create_preview_set` flags the empty-kernel fallback**
79
+ (`mcp_server/preview_studio/engine.py`,
80
+ `mcp_server/preview_studio/models.py`): When the caller omits a real
81
+ session kernel, `create_preview_set` synthesizes an empty-but-valid
82
+ shape so compilers degrade to no-op steps. `PreviewSet` now carries a
83
+ `degradation` field that is marked
84
+ `is_degraded=True, reasons=["empty_kernel_fallback"]` whenever that
85
+ substitution fires, so downstream consumers can tell a synthesized
86
+ compile from a kernel-backed one.
87
+
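  The two probes described above can be approximated as follows. This is
  an illustrative sketch, not the code in `capability_state.py`; only the
  500 ms HEAD budget, the `api.github.com` target, and the
  `find_spec`-based check come from this entry.

  ```python
  import importlib.util
  import urllib.request

  def _probe_web(url: str = "https://api.github.com",
                 timeout: float = 0.5) -> bool:
      """HEAD request with a 500 ms budget; any failure means web is down."""
      try:
          req = urllib.request.Request(url, method="HEAD")
          with urllib.request.urlopen(req, timeout=timeout):
              return True
      except Exception:
          return False

  def _probe_flucoma() -> bool:
      """True when the flucoma package is importable; never raises."""
      try:
          return importlib.util.find_spec("flucoma") is not None
      except Exception:  # e.g. broken installs where find_spec itself raises
          return False
  ```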
88
+ ### Added
89
+
90
+ - **`DegradationInfo` dataclass** (`mcp_server/runtime/degradation.py`):
91
+ New shared payload that engines attach to their responses whenever
92
+ they substitute fallback data. Three fields:
93
+ `is_degraded: bool`, `reasons: list[str]`, `substituted_fields: list[str]`.
94
+ Intentionally minimal and import-safe so any engine can adopt it
95
+ without circular-import risk. Wired into `song_brain` and
96
+ `preview_studio`; other engines will adopt it as audits surface more
97
+ silent-fallback paths.
98
+ - **`flucoma` capability domain** now emitted by
99
+ `build_capability_state` alongside `session_access`, `analyzer`,
100
+ `memory`, `web`, and `research`. Matches the existing
101
+ `CapabilityDomain` schema.
102
+
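  A minimal sketch of the payload, matching the three fields listed above
  (the real dataclass lives in `mcp_server/runtime/degradation.py`; the
  `to_dict` serializer is inferred from how `PreviewSet.to_dict()` emits
  it):

  ```python
  from dataclasses import asdict, dataclass, field

  @dataclass
  class DegradationInfo:
      """Attached to responses whenever an engine substitutes fallback data."""
      is_degraded: bool = False
      reasons: list[str] = field(default_factory=list)
      substituted_fields: list[str] = field(default_factory=list)

      def to_dict(self) -> dict:
          return asdict(self)
  ```

  A clean response therefore carries
  `{"is_degraded": False, "reasons": [], "substituted_fields": []}`, so
  callers can always branch on `degradation["is_degraded"]` without a
  key-existence check.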
103
+ ### Changed
104
+
105
+ - **`capability-modes.md` reference doc rewritten to match the actual
106
+ response shape** (`livepilot/skills/livepilot-evaluation/references/capability-modes.md`).
107
+ The old example JSON described a flat
108
+ `{mode, analyzer_connected, bridge_version, spectral_cache_age_ms, flucoma_available, session_connected}`
109
+ shape that hasn't matched `get_capability_state` output for several releases. The
110
+ new section documents the nested `capability_state.domains.<name>`
111
+ structure, explicit per-domain and per-field definitions, and
112
+ explicitly scopes the `web` domain as *"server-side outbound HTTP
113
+ capability; does NOT imply curated research corpora are installed"*.
114
+
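  For illustration, an assumed instance of the nested shape. Only the six
  domain names come from this changelog; the per-domain field names
  (`ok`, `detail`) are hypothetical placeholders, not the documented
  schema:

  ```python
  # Hypothetical example of the nested capability_state.domains.<name>
  # structure; "ok"/"detail" are illustrative field names only.
  capability_state = {
      "domains": {
          "session_access": {"ok": True,  "detail": "bridge connected"},
          "analyzer":       {"ok": False, "detail": "no UDP packets yet"},
          "memory":         {"ok": True,  "detail": ""},
          "web":            {"ok": True,  "detail": "outbound HTTP reachable"},
          "research":       {"ok": False, "detail": "corpora not installed"},
          "flucoma":        {"ok": False, "detail": "package not importable"},
      },
  }
  ```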
115
+ ### Tests
116
+
117
+ - `tests/test_preview_studio_truth_gap.py` — 5 tests locking the four
118
+ A1-A4 scenarios from the remediation plan.
119
+ - `tests/test_runtime_capability_probes.py` — 6 tests covering the
120
+ web probe (true/false/exception-swallow) and the flucoma probe
121
+ (emitted-when-importable, emitted-when-missing, find_spec-backed).
122
+ - `tests/test_degradation_signalling.py` — 8 tests covering the
123
+ `DegradationInfo` dataclass defaults, `song_brain` degradation on
124
+ session failure, and `preview_studio` degradation on empty-kernel
125
+ fallback.
126
+
3
127
  ## 1.17.1 — Splice auto-reconnect + Codex installer fix (April 23 2026)
4
128
 
5
129
  Two bug fixes discovered in a parallel worktree hours after v1.17.0
package/README.md CHANGED
@@ -17,7 +17,7 @@
17
17
 
18
18
  <p align="center">
19
19
  An agentic production system for Ableton Live 12.<br>
20
- 426 tools. 52 domains. Device atlas. Plan-aware Splice integration. Auto-composition. Spectral perception. Technique memory. Drum-rack pad builder. Live dead-device detection.
20
+ 427 tools. 52 domains. Device atlas. Plan-aware Splice integration. Auto-composition. Spectral perception. Technique memory. Drum-rack pad builder. Live dead-device detection.
21
21
  </p>
22
22
 
23
23
  <br>
@@ -80,7 +80,7 @@ Most MCP servers are tool collections — they execute commands. LivePilot is an
80
80
  │ └─────────────────┼──────────────────┘ │
81
81
  │ ▼ │
82
82
  │ ┌─────────────────┐ │
83
- │ │ 426 MCP Tools │ │
83
+ │ │ 427 MCP Tools │ │
84
84
  │ │ 52 domains │ │
85
85
  │ └────────┬────────┘ │
86
86
  │ │ │
@@ -121,7 +121,7 @@ Most MCP servers are tool collections — they execute commands. LivePilot is an
121
121
 
122
122
  ## The Intelligence Layer
123
123
 
124
- 12 engines sit on top of the 426 tools. They give the AI musical judgment, not just musical execution.
124
+ 12 engines sit on top of the 427 tools. They give the AI musical judgment, not just musical execution.
125
125
 
126
126
  ### SongBrain — What the Song Is
127
127
 
@@ -173,7 +173,7 @@ Every engine follows: **measure before → act → measure after → compare**.
173
173
 
174
174
  ## Tools
175
175
 
176
- 426 tools across 52 domains. Highlights below — [full catalog here](docs/manual/tool-catalog.md).
176
+ 427 tools across 52 domains. Highlights below — [full catalog here](docs/manual/tool-catalog.md).
177
177
 
178
178
  <br>
179
179
 
@@ -208,7 +208,8 @@ The M4L Analyzer sits on the master track. UDP 9880 carries spectral data to the
208
208
  > Most tools work without the analyzer — it adds 32 spectral/analyzer tools (frequency, loudness, perception) and closes the feedback loop.
209
209
 
210
210
  ```
211
- SPECTRAL ─────── 8-band frequency decomposition (sub → air)
211
+ SPECTRAL ─────── 9-band frequency decomposition (sub_low → air)
212
+ sub_low (20-60 Hz) split off so kick fundamentals don't hide inside sub
212
213
  true RMS / peak metering
213
214
  Krumhansl-Schmuckler key detection
214
215
 
@@ -361,7 +362,7 @@ The V2 intelligence layer. These tools analyze, diagnose, plan, evaluate, and le
361
362
  | Creative Constraints | 5 | constraint activation, reference-inspired variants |
362
363
  | Preview Studio | 5 | variant creation, preview rendering, comparison, commit |
363
364
 
364
- > **[View all 426 tools →](docs/manual/tool-catalog.md)**
365
+ > **[View all 427 tools →](docs/manual/tool-catalog.md)**
365
366
 
366
367
  <br>
367
368
 
@@ -588,7 +589,7 @@ See [CONTRIBUTING.md](CONTRIBUTING.md) for architecture details, code guidelines
588
589
 
589
590
  | Document | What's inside |
590
591
  |----------|---------------|
591
- | [Manual](docs/manual/index.md) | Complete reference: architecture, all 426 tools, workflows |
592
+ | [Manual](docs/manual/index.md) | Complete reference: architecture, all 427 tools, workflows |
592
593
  | [Intelligence Layer](docs/manual/intelligence.md) | How the 12 engines connect — conductor, moves, preview, evaluation |
593
594
  | [Device Atlas](docs/manual/device-atlas.md) | 1305 devices indexed — search, suggest, chain building |
594
595
  | [Samples & Slicing](docs/manual/samples.md) | 3-source search, fitness critics, slice workflows |
@@ -32,27 +32,31 @@ We tap the audio for analysis without affecting the pass-through.
32
32
  4. Add object: `[*~ 0.5]` (scale to prevent clipping)
33
33
  5. Connect: `[+~]` outlet → `[*~ 0.5]` inlet
34
34
 
35
- ## Step 4: 8-Band Spectrum Analysis
36
-
37
- 1. Add object: `[fffb~ 8]` (fast 8-band filter bank)
38
- 2. Connect: `[*~ 0.5]` outlet `[fffb~ 8]` inlet
39
- 3. Set `fffb~` frequencies in Inspector or via message:
40
- - Band 1: 40 Hz (sub)
41
- - Band 2: 130 Hz (low)
42
- - Band 3: 350 Hz (low-mid)
43
- - Band 4: 1000 Hz (mid)
44
- - Band 5: 3000 Hz (high-mid)
45
- - Band 6: 6000 Hz (high)
46
- - Band 7: 10000 Hz (presence)
47
- - Band 8: 16000 Hz (air)
48
-
49
- To set: add `[loadmess 40 130 350 1000 3000 6000 10000 16000]` → `[fffb~ 8]` right inlet
50
-
51
- 4. For each of the 8 outlets of `[fffb~ 8]`:
35
+ ## Step 4: 9-Band Spectrum Analysis
36
+
37
+ (v1.16+ layout. Pre-v1.16 devices used `[fffb~ 8]`; the server still accepts
38
+ 8-band payloads for backward compatibility, but new builds should use 9.)
39
+
40
+ 1. Add object: `[fffb~ 9]` (fast 9-band filter bank)
41
+ 2. Connect: `[*~ 0.5]` outlet → `[fffb~ 9]` inlet
42
+ 3. Set `fffb~` center frequencies in Inspector or via message:
43
+ - Band 1: 35 Hz (sub_low) — kick fundamentals, Villalobos subs
44
+ - Band 2: 85 Hz (sub) — 808s, sub-bass body
45
+ - Band 3: 175 Hz (low) — bass body, warmth
46
+ - Band 4: 350 Hz (low_mid) — mud zone
47
+ - Band 5: 700 Hz (mid) — vocal presence, snare body
48
+ - Band 6: 1400 Hz (high_mid) — consonants, pick attack
49
+ - Band 7: 2800 Hz (high) — presence, intelligibility
50
+ - Band 8: 5600 Hz (presence) — cymbal definition
51
+ - Band 9: 12000 Hz (air) — shimmer, sparkle
52
+
53
+ To set: add `[loadmess 35. 85. 175. 350. 700. 1400. 2800. 5600. 12000.]` → `[fffb~ 9]` right inlet
54
+
55
+ 4. For each of the 9 outlets of `[fffb~ 9]`:
52
56
  - Add `[abs~]` (rectify to positive)
53
57
  - Add `[snapshot~ 200]` (sample at 5 Hz)
54
58
 
55
- 5. Add `[pack f f f f f f f f]` and connect all 8 `[snapshot~]` outlets to it
59
+ 5. Add `[pack f f f f f f f f f]` and connect all 9 `[snapshot~]` outlets to it
56
60
  6. Add `[prepend /spectrum]` → connect from `[pack]`
57
61
  7. Add `[udpsend 127.0.0.1 9880]` → connect from `[prepend]`
58
62
 
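  On the receiving side, the payload built by steps 4-7 can be labelled
  like this. An illustrative sketch, not the server's actual parser; it
  only shows how an 8- or 9-float `/spectrum` payload maps onto the band
  names above.

  ```python
  # v1.16+ nine-band layout (sub_low split off from sub) and the legacy
  # eight-band layout the server still accepts for backward compatibility.
  BANDS_9 = ["sub_low", "sub", "low", "low_mid", "mid",
             "high_mid", "high", "presence", "air"]
  BANDS_8 = ["sub", "low", "low_mid", "mid",
             "high_mid", "high", "presence", "air"]

  def label_spectrum(values: list[float]) -> dict[str, float]:
      """Map a /spectrum payload to named bands, accepting both layouts."""
      if len(values) == 9:
          names = BANDS_9
      elif len(values) == 8:
          names = BANDS_8  # legacy pre-v1.16 .amxd
      else:
          raise ValueError(f"expected 8 or 9 bands, got {len(values)}")
      return dict(zip(names, values))
  ```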
@@ -131,7 +135,7 @@ We tap the audio for analysis without affecting the pass-through.
131
135
 
132
136
  1. Drop `LivePilot Analyzer` on the **master track**
133
137
  2. Play some audio
134
- 3. In Claude Code, run: `get_master_spectrum` — should return 8 band values
138
+ 3. In Claude Code, run: `get_master_spectrum` — should return 9 band values (v1.16+) or 8 values (pre-v1.16 .amxd)
135
139
  4. Run: `get_master_rms` — should return RMS and peak
136
140
  5. After 8+ bars: `get_detected_key` — should return key and scale
137
141
 
@@ -143,7 +147,7 @@ We tap the audio for analysis without affecting the pass-through.
143
147
  │ │
144
148
  plugin~ ──┤──L+R──► plugout~ (pass-through) │
145
149
  │ │
146
- │──L+R──► +~ ──► *~ 0.5 ──┬──► fffb~ 8 ──► UDP │
150
+ │──L+R──► +~ ──► *~ 0.5 ──┬──► fffb~ 9 ──► UDP │
147
151
  │ ├──► peakamp~ ──► UDP │
148
152
  │ ├──► average~ ──► UDP │
149
153
  │ └──► sigmund~ ──► JS │
Binary file
@@ -95,7 +95,7 @@ function anything() {
95
95
  function dispatch(cmd, args) {
96
96
  switch(cmd) {
97
97
  case "ping":
98
- send_response({"ok": true, "version": "1.17.1"});
98
+ send_response({"ok": true, "version": "1.17.2"});
99
99
  break;
100
100
  case "get_params":
101
101
  cmd_get_params(args);
@@ -1,2 +1,2 @@
1
1
  """LivePilot MCP Server — bridges MCP protocol to Ableton Live."""
2
- __version__ = "1.17.1"
2
+ __version__ = "1.17.2"
@@ -471,7 +471,8 @@ class SpectralReceiver(asyncio.DatagramProtocol):
471
471
  """Receives OSC-formatted UDP packets from the M4L device.
472
472
 
473
473
  OSC messages:
474
- /spectrum f f f f f f f f — 8-band spectrum
474
+ /spectrum f f f f f f f f [f] — 8 or 9 band spectrum
475
+ (9 = v1.16+ with sub_low; 8 = legacy)
475
476
  /peak f — peak level
476
477
  /rms f — RMS level
477
478
  /pitch f f — MIDI note, amplitude
@@ -11,6 +11,7 @@ import json
11
11
  import time
12
12
  from typing import Optional
13
13
 
14
+ from ..runtime.degradation import DegradationInfo
14
15
  from .models import PreviewSet, PreviewVariant
15
16
 
16
17
 
@@ -52,7 +53,10 @@ def create_preview_set(
52
53
  kernel: the live session kernel (track topology + device chains). Compilers
53
54
  resolve targets from it — without it, variants degrade into no-ops or
54
55
  generic reads. Callers that have a `ctx` should fetch a real kernel
55
- via runtime.tools.get_session_kernel(ctx).
56
+ via runtime.tools.get_session_kernel(ctx). When omitted the engine
57
+ synthesizes an empty-but-valid kernel (see ``_build_triptych``) and
58
+ flags the resulting PreviewSet with ``degradation.is_degraded=True``
59
+ so callers can tell a synthesized compile from a real one.
56
60
  """
57
61
  set_id = _compute_set_id(request_text, kernel_id)
58
62
  now = int(time.time() * 1000)
@@ -61,6 +65,18 @@ def create_preview_set(
61
65
  song_brain = song_brain or {}
62
66
  taste_graph = taste_graph or {}
63
67
 
68
+ # Degradation bookkeeping — if the caller didn't supply a kernel the
69
+ # compiler receives a synthesized one (see engine.py line 128 area)
70
+ # and every variant is scored against that synthetic topology.
71
+ if kernel:
72
+ degradation = DegradationInfo()
73
+ else:
74
+ degradation = DegradationInfo(
75
+ is_degraded=True,
76
+ reasons=["empty_kernel_fallback"],
77
+ substituted_fields=["compile_kernel"],
78
+ )
79
+
64
80
  if strategy == "creative_triptych":
65
81
  variants = _build_triptych(
66
82
  request_text, moves, song_brain, taste_graph, set_id, now, kernel,
@@ -79,6 +95,7 @@ def create_preview_set(
79
95
  source_kernel_id=kernel_id,
80
96
  variants=variants,
81
97
  created_at_ms=now,
98
+ degradation=degradation,
82
99
  )
83
100
  store_preview_set(ps)
84
101
  return ps
@@ -258,31 +275,66 @@ def _build_binary(
258
275
  # ── Comparison ────────────────────────────────────────────────────
259
276
 
260
277
 
278
+ _NON_EXECUTABLE_STATUSES = {"blocked", "failed"}
279
+
280
+
281
+ def _is_executable(variant: PreviewVariant) -> bool:
282
+ """A variant is executable when it has a compiled plan AND its status
283
+ hasn't been flagged as blocked/failed upstream.
284
+
285
+ The compiled plan may be a non-empty list of steps OR a dict with a
286
+ non-empty ``steps`` key — both shapes exist in the wild.
287
+ """
288
+ if variant.status in _NON_EXECUTABLE_STATUSES:
289
+ return False
290
+ plan = variant.compiled_plan
291
+ if plan is None:
292
+ return False
293
+ if isinstance(plan, list):
294
+ return len(plan) > 0
295
+ if isinstance(plan, dict):
296
+ return len(plan.get("steps") or []) > 0
297
+ # Any other truthy shape is treated as executable; falsy as not.
298
+ return bool(plan)
299
+
300
+
261
301
  def compare_variants(
262
302
  preview_set: PreviewSet,
263
303
  criteria: Optional[dict] = None,
264
304
  ) -> dict:
265
- """Compare variants within a preview set and rank them."""
305
+ """Compare variants within a preview set and rank them.
306
+
307
+ Truth-gap fix (PR-A): variants that are blocked/failed OR lack a
308
+ compiled_plan are partitioned out of the scored ranking. They appear
309
+ in ``analytical_candidates`` (just their variant_ids) and ALSO stay
310
+ in ``rankings`` at the bottom for introspection, but they can never
311
+ populate ``recommended``. When no executable variant exists,
312
+ ``recommended`` is ``None`` so callers can surface a clear message
313
+ instead of silently committing a no-op.
314
+ """
266
315
  criteria = criteria or {}
267
316
  weight_taste = criteria.get("taste_weight", 0.3)
268
317
  weight_novelty = criteria.get("novelty_weight", 0.2)
269
318
  weight_identity = criteria.get("identity_weight", 0.5)
270
319
 
271
- rankings = []
320
+ executable: list[PreviewVariant] = []
321
+ analytical: list[PreviewVariant] = []
272
322
  for v in preview_set.variants:
273
- # Score components
323
+ (executable if _is_executable(v) else analytical).append(v)
324
+
325
+ def _score(v: PreviewVariant) -> float:
274
326
  taste_score = v.taste_fit
275
327
  novelty_score = 1.0 - abs(v.novelty_level - 0.5) * 2 # bell curve around 0.5
276
328
  identity_score = _identity_effect_score(v.identity_effect)
277
-
278
329
  composite = (
279
330
  taste_score * weight_taste
280
331
  + novelty_score * weight_novelty
281
332
  + identity_score * weight_identity
282
333
  )
283
- v.score = round(composite, 3)
334
+ return round(composite, 3)
284
335
 
285
- rankings.append({
336
+ def _row(v: PreviewVariant) -> dict:
337
+ return {
286
338
  "variant_id": v.variant_id,
287
339
  "label": v.label,
288
340
  "score": v.score,
@@ -292,13 +344,35 @@ def compare_variants(
292
344
  "summary": v.intent,
293
345
  "what_preserved": v.what_preserved,
294
346
  "why_it_matters": v.why_it_matters,
295
- })
296
-
297
- rankings.sort(key=lambda r: r["score"], reverse=True)
347
+ "status": v.status,
348
+ }
349
+
350
+ executable_rows: list[dict] = []
351
+ for v in executable:
352
+ v.score = _score(v)
353
+ executable_rows.append(_row(v))
354
+ executable_rows.sort(key=lambda r: r["score"], reverse=True)
355
+
356
+ # Analytical variants still get a score computed so introspection
357
+ # shows the same shape, but they're appended AFTER the sorted
358
+ # executables so they can never land at position 0.
359
+ analytical_rows: list[dict] = []
360
+ for v in analytical:
361
+ v.score = _score(v)
362
+ analytical_rows.append(_row(v))
363
+
364
+ rankings = executable_rows + analytical_rows
365
+
366
+ recommended: Optional[str]
367
+ if executable_rows:
368
+ recommended = executable_rows[0]["variant_id"]
369
+ else:
370
+ recommended = None
298
371
 
299
372
  comparison = {
300
373
  "rankings": rankings,
301
- "recommended": rankings[0]["variant_id"] if rankings else "",
374
+ "recommended": recommended,
375
+ "analytical_candidates": [v.variant_id for v in analytical],
302
376
  "criteria_used": {
303
377
  "taste_weight": weight_taste,
304
378
  "novelty_weight": weight_novelty,
@@ -6,6 +6,8 @@ import time
6
6
  from dataclasses import asdict, dataclass, field
7
7
  from typing import Optional
8
8
 
9
+ from ..runtime.degradation import DegradationInfo
10
+
9
11
 
10
12
  @dataclass
11
13
  class PreviewVariant:
@@ -59,6 +61,11 @@ class PreviewSet:
59
61
  committed_variant_id: str = ""
60
62
  status: str = "pending" # pending, compared, committed, discarded
61
63
  created_at_ms: int = field(default_factory=lambda: int(time.time() * 1000))
64
+ # Degradation signalling — set when the engine substituted a fallback
65
+ # (e.g. an empty-but-valid kernel) during variant compilation. Callers
66
+ # can inspect .degradation.is_degraded to tell synthesized preview
67
+ # topology apart from a real kernel-backed compile.
68
+ degradation: DegradationInfo = field(default_factory=DegradationInfo)
62
69
 
63
70
  def to_dict(self) -> dict:
64
71
  return {
@@ -71,4 +78,5 @@ class PreviewSet:
71
78
  "committed_variant_id": self.committed_variant_id,
72
79
  "status": self.status,
73
80
  "variant_count": len(self.variants),
81
+ "degradation": self.degradation.to_dict(),
74
82
  }
@@ -270,7 +270,68 @@ async def commit_preview_variant(
270
270
  if not ps:
271
271
  return {"error": f"Preview set {set_id} not found"}
272
272
 
273
+ # Resolve the chosen variant WITHOUT mutating state yet. We have to
274
+ # short-circuit analytical-only / blocked picks BEFORE engine.commit_variant
275
+ # runs, otherwise `preview_set.status` gets flipped to "committed" and
276
+ # sibling variants get discarded even though nothing executed.
277
+ chosen = None
278
+ for v in ps.variants:
279
+ if v.variant_id == variant_id:
280
+ chosen = v
281
+ break
282
+ if not chosen:
283
+ available = [v.variant_id for v in ps.variants]
284
+ return {
285
+ "error": f"Variant {variant_id} not found in set {set_id}",
286
+ "available_variants": available,
287
+ }
288
+
289
+ # ── Truth-gap guard: refuse to "commit" a variant that can't execute ──
290
+ # If the variant was flagged blocked/failed upstream or lacks a
291
+ # compiled plan, the old code still marked preview_set.status='committed'
292
+ # and returned committed=False as a silent contradiction. Close that
293
+ # gap: return an honest no-op and leave state untouched so the caller
294
+ # can pick a different variant.
295
+ plan = chosen.compiled_plan
296
+ plan_is_empty = (
297
+ plan is None
298
+ or (isinstance(plan, list) and len(plan) == 0)
299
+ or (isinstance(plan, dict) and len(plan.get("steps") or []) == 0)
300
+ )
301
+ blocked = chosen.status in {"blocked", "failed"}
302
+ if plan_is_empty or blocked:
303
+ reason = "blocked" if blocked and plan_is_empty is False else "analytical_only"
304
+ return {
305
+ "committed": False,
306
+ "status": reason,
307
+ "reason": reason,
308
+ "preview_set_id": set_id,
309
+ "variant_id": chosen.variant_id,
310
+ "label": chosen.label,
311
+ "intent": chosen.intent,
312
+ "move_id": chosen.move_id,
313
+ "identity_effect": chosen.identity_effect,
314
+ "what_preserved": chosen.what_preserved,
315
+ "message": (
316
+ "chose analytical variant; no session changes applied"
317
+ if reason == "analytical_only"
318
+ else "variant is blocked; no session changes applied"
319
+ ),
320
+ "note": (
321
+ "Variant has no compiled plan (analytical-only). Preview set "
322
+ "was left in its pre-commit state so you can pick a different "
323
+ "variant."
324
+ if reason == "analytical_only"
325
+ else "Variant is blocked/failed. Preview set was left in its "
326
+ "pre-commit state so you can pick a different variant."
327
+ ),
328
+ }
329
+
330
+ # Only now do we flip state — the chosen variant has an executable plan.
273
331
  chosen = engine.commit_variant(ps, variant_id)
332
+ # engine.commit_variant cannot return None here (we already verified
333
+ # the variant_id exists), but keep the defensive check for the type
334
+ # checker.
274
335
  if not chosen:
275
336
  available = [v.variant_id for v in ps.variants]
276
337
  return {
@@ -289,55 +350,44 @@ async def commit_preview_variant(
289
350
  }
290
351
 
291
352
  # ── v1.10.3: actually execute the compiled plan ──
292
- # If there's no compiled plan, the variant is analytical-only — record
293
- # the choice and return honestly instead of pretending it was applied.
294
- if not chosen.compiled_plan:
295
- result["committed"] = False
296
- result["status"] = "analytical_only"
297
- result["note"] = (
298
- "Variant has no compiled plan (analytical-only). Preview set "
299
- "marked the choice but no session changes were made. Use an "
300
- "executable variant if you want the commit to apply changes."
301
- )
353
+ from ..runtime.execution_router import execute_plan_steps_async
354
+ plan = chosen.compiled_plan
355
+ steps = plan if isinstance(plan, list) else plan.get("steps", []) or []
356
+ ableton = _get_ableton(ctx)
357
+ bridge = ctx.lifespan_context.get("m4l")
358
+ mcp_registry = ctx.lifespan_context.get("mcp_dispatch", {})
359
+
360
+ exec_results = await execute_plan_steps_async(
361
+ steps,
362
+ ableton=ableton,
363
+ bridge=bridge,
364
+ mcp_registry=mcp_registry,
365
+ ctx=ctx,
366
+ stop_on_failure=False,
367
+ )
368
+ log = [
369
+ {
370
+ "tool": r.tool,
371
+ "backend": r.backend,
372
+ "ok": r.ok,
373
+ **({"result": r.result} if r.ok else {"error": r.error}),
374
+ }
375
+ for r in exec_results
376
+ ]
377
+ steps_ok = sum(1 for r in exec_results if r.ok)
378
+ steps_failed = len(exec_results) - steps_ok
379
+
380
+ result["execution_log"] = log
381
+ result["steps_ok"] = steps_ok
382
+ result["steps_failed"] = steps_failed
383
+
384
+ if steps_failed == 0 and steps_ok > 0:
385
+ result["status"] = "committed"
386
+ elif steps_ok > 0:
387
+ result["status"] = "committed_with_errors"
302
388
  else:
303
- from ..runtime.execution_router import execute_plan_steps_async
304
- plan = chosen.compiled_plan
305
- steps = plan if isinstance(plan, list) else plan.get("steps", []) or []
306
- ableton = _get_ableton(ctx)
307
- bridge = ctx.lifespan_context.get("m4l")
308
- mcp_registry = ctx.lifespan_context.get("mcp_dispatch", {})
309
-
310
- exec_results = await execute_plan_steps_async(
311
- steps,
312
- ableton=ableton,
313
- bridge=bridge,
314
- mcp_registry=mcp_registry,
315
- ctx=ctx,
316
- stop_on_failure=False,
317
- )
318
- log = [
319
- {
320
- "tool": r.tool,
321
- "backend": r.backend,
322
- "ok": r.ok,
323
- **({"result": r.result} if r.ok else {"error": r.error}),
324
- }
325
- for r in exec_results
326
- ]
327
- steps_ok = sum(1 for r in exec_results if r.ok)
328
- steps_failed = len(exec_results) - steps_ok
329
-
330
- result["execution_log"] = log
331
- result["steps_ok"] = steps_ok
332
- result["steps_failed"] = steps_failed
333
-
334
- if steps_failed == 0 and steps_ok > 0:
335
- result["status"] = "committed"
336
- elif steps_ok > 0:
337
- result["status"] = "committed_with_errors"
338
- else:
339
- result["status"] = "failed"
340
- result["committed"] = False
389
+ result["status"] = "failed"
390
+ result["committed"] = False
341
391
 
342
392
  # Wonder lifecycle hooks
343
393
  ws = _find_wonder_session_by_preview(set_id)