livepilot 1.17.3 → 1.17.5

package/CHANGELOG.md CHANGED
@@ -1,5 +1,95 @@
  # Changelog
 
+ ## 1.17.5 — Classify error-only commit payloads as failures (April 23 2026)
+
+ ### Fixed
+
+ - **`_classify_commit_result` now catches error-only commit payloads**
+   (`mcp_server/tools/_agent_os_engine/iteration.py`): Codex review on
+   PR #27 caught a gap I shipped in v1.17.3. My docstring listed
+   `{"error": ...}` as a known failure signal, but the implementation
+   never checked for a top-level `error` key. `commit_branch_async` in
+   `mcp_server/experiment/engine.py` returns error-only dicts in 5+
+   paths (`Branch {id} not found`, `Branch has no compiled plan`,
+   `Experiment {id} not found`). These fell through to
+   `"committed"` because they had no explicit `committed: false` /
+   `ok: false` / `status: "failed"` / `steps_ok: 0` signal. Classic
+   truth-gap: the iteration loop could claim success while the commit
+   applied zero steps.
+
+   Fix: if `result.get("error")` is truthy AND `result.get("committed")`
+   is not explicitly `True`, return `"commit_failed"`. The explicit-
+   committed caveat handles the edge case where a payload reports
+   success with a warning in the `error` field.
+
+ ### Tests
+
+ 4 new TDD tests in `tests/test_iterate_toward_goal.py`:
+ - `{"error": "Experiment not found"}` → `commit_failed`
+ - `{"error": "Branch not found"}` (real commit_branch_async shape) →
+   `commit_failed`, with the payload surfaced on `commit_result`
+ - Same discipline on the `on_timeout="commit_best"` path
+ - Edge case: `{"committed": True, "error": "warning...",
+   steps_ok: 3}` still returns `committed` (explicit success overrides)
+
+ 2726 → 2730 passing.
+
+ ### Process note
+
+ The fix that shipped in v1.17.3 was itself caught by a subsequent
+ review. Writing a docstring listing a failure signal and forgetting
+ to implement the check is the classic TDD violation the discipline
+ exists to prevent. Codex's automated review acted as the missing
+ failing-test-first pass.
+
+ ## 1.17.4 — Shape cleanup + memory probe (April 23 2026)
+
+ ### Fixed
+
+ - **`get_session_kernel` now probes the memory store** instead of
+   hardcoding `memory_ok=True` (`mcp_server/runtime/tools.py`). If the
+   underlying technique store raises on `list_techniques` (disk full,
+   corrupted index, permissions error), the kernel previously still
+   reported memory as available to orchestration planners. Same
+   truth-gap class as the v1.17.3 web/flucoma fix — should have been
+   caught by the same review pass. Now probed the same way
+   `get_capability_state` does, wrapped in try/except.
+ - **`capability_state` flat shape** in session kernel
+   (`mcp_server/runtime/tools.py`): `state.to_dict()` wraps its output as
+   `{"capability_state": {...}}` — that's the right shape for the
+   standalone `get_capability_state` tool, but when stored on the kernel
+   it produced the ugly double-nested
+   `kernel["capability_state"]["capability_state"]["domains"]`. v1.17.3
+   probe tests worked around it with defensive
+   `outer.get("capability_state", outer)`. Fix: unwrap the outer key
+   once before passing to `build_session_kernel`. Consumer path is
+   now `kernel["capability_state"]["domains"]` directly. Standalone
+   `get_capability_state` return shape unchanged.
+
+ ### Tests
+
+ - 4 new TDD tests in `tests/test_runtime_capability_probes.py`:
+   - memory probe raises → kernel reports memory unavailable
+   - memory probe succeeds → kernel reports available
+   - kernel's capability_state has no nested `capability_state` key
+   - end-to-end flat access without defensive fallbacks
+ - Consumer updates:
+   - `test_session_kernel.py:203` — removed extra level
+   - `test_runtime_capability_probes.py` (4 places) — removed
+     defensive `outer.get('capability_state', outer)` pattern now that
+     the shape is known-flat
+
+ 2722 → 2726 passing.
+
+ ### Known follow-up
+
+ Audit while writing this release flagged a third bug in
+ `mcp_server/runtime/safety_kernel.py:244`: the safety kernel reads
+ `capability_state.get("mode", "normal")` but the actual shape uses
+ `overall_mode`, not `mode`. The `.get(..., "normal")` default silently
+ falls back, so `read_only` mode gating never kicks in. Separate fix,
+ out of scope for this release.
+
  ## 1.17.3 — Truth-gap remediation, for real (April 23 2026)
 
  ### Fixed
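
The "Known follow-up" entry in the changelog hunk above describes a key mismatch that a `.get` default can hide. A minimal sketch of that failure mode, assuming a flat capability-state dict that carries `overall_mode` as the changelog states. The helper names `read_mode_buggy` and `read_mode_fixed` are hypothetical and are not taken from `safety_kernel.py`:

```python
from typing import Any, Dict


def read_mode_buggy(capability_state: Dict[str, Any]) -> str:
    # Reads a key the payload never sets, so the default always wins.
    return capability_state.get("mode", "normal")


def read_mode_fixed(capability_state: Dict[str, Any]) -> str:
    # Reads the key the changelog says the real shape uses.
    return capability_state.get("overall_mode", "normal")


# Illustrative payload only; the real capability state has more fields.
state = {"overall_mode": "read_only", "domains": {}}

buggy = read_mode_buggy(state)
fixed = read_mode_fixed(state)
```

With this payload the first helper falls back to the default and the second sees the restricted mode, which is why gating keyed on the first read would never trigger.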
Binary file
@@ -95,7 +95,7 @@ function anything() {
  function dispatch(cmd, args) {
      switch(cmd) {
          case "ping":
-             send_response({"ok": true, "version": "1.17.3"});
+             send_response({"ok": true, "version": "1.17.5"});
              break;
          case "get_params":
              cmd_get_params(args);
@@ -1,2 +1,2 @@
  """LivePilot MCP Server — bridges MCP protocol to Ableton Live."""
- __version__ = "1.17.3"
+ __version__ = "1.17.5"
@@ -185,11 +185,21 @@ def get_session_kernel(
      web_ok = _probe_web()
      flucoma_ok = _probe_flucoma()
 
+     # v1.17.4: probe memory the same way too. Previously memory_ok=True was
+     # hardcoded — if the store raised, the kernel still reported memory
+     # available. Same truth-gap class as the v1.17.3 web/flucoma fix.
+     memory_ok = False
+     try:
+         _memory_store.list_techniques(limit=1)
+         memory_ok = True
+     except Exception as exc:
+         logger.debug("get_session_kernel memory probe failed: %s", exc)
+
      state = build_capability_state(
          session_ok=session_ok,
          analyzer_ok=analyzer_ok,
          analyzer_fresh=analyzer_fresh,
-         memory_ok=True,
+         memory_ok=memory_ok,
          web_ok=web_ok,
          flucoma_ok=flucoma_ok,
      )
@@ -248,9 +258,18 @@ def get_session_kernel(
      except Exception as e:
          kernel_warnings.append(f"session_memory_unavailable: {e}")
 
+     # v1.17.4: state.to_dict() wraps its output as {"capability_state": {...}}
+     # because that shape is what the standalone get_capability_state tool
+     # returns. When building the session kernel, that wrapper becomes the
+     # ugly double-nested kernel["capability_state"]["capability_state"]["domains"]
+     # path. Unwrap once here so kernel consumers get
+     # kernel["capability_state"]["domains"] directly.
+     _cap_dict = state.to_dict()
+     _cap_flat = _cap_dict.get("capability_state", _cap_dict)
+
      kernel = build_session_kernel(
          session_info=session_info,
-         capability_state=state.to_dict(),
+         capability_state=_cap_flat,
          request_text=request_text,
          mode=mode,
          aggression=aggression,
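
The two `get_session_kernel` hunks above can be exercised in isolation. This is a rough sketch under stated assumptions, not the package's code: `probe_memory`, `flatten_capability_state`, and the two stub store classes are hypothetical names introduced here, and only the `list_techniques(limit=1)` call and the `{"capability_state": {...}}` wrapper shape come from the diff.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger("livepilot_sketch")  # hypothetical logger name


class WorkingStore:
    """Stub store whose probe call succeeds."""

    def list_techniques(self, limit: int = 1) -> list:
        return []


class FailingStore:
    """Stub store that raises, standing in for a disk or index failure."""

    def list_techniques(self, limit: int = 1) -> list:
        raise OSError("disk full")


def probe_memory(store: Any) -> bool:
    # Start pessimistic and flip to True only after the call returns.
    try:
        store.list_techniques(limit=1)
        return True
    except Exception as exc:
        logger.debug("memory probe failed: %s", exc)
        return False


def flatten_capability_state(cap_dict: Dict[str, Any]) -> Dict[str, Any]:
    # Unwrap the outer key once; an already-flat dict passes through as-is.
    return cap_dict.get("capability_state", cap_dict)


wrapped = {"capability_state": {"domains": {"memory": "ok"}}}
flat = flatten_capability_state(wrapped)
```

Defaulting to `False` means a store that raises can never be reported as available, and the `.get(key, cap_dict)` fallback keeps the unwrap safe if `to_dict()` ever returns the flat shape directly.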
@@ -102,6 +102,13 @@ def _classify_commit_result(result: Any) -> str:
          return "commit_failed"
      if result.get("status") == "failed":
          return "commit_failed"
+     # v1.17.5 (Codex PR#27 review): a top-level "error" key with no
+     # explicit committed=True is a failure signal. commit_branch_async
+     # returns {"error": "Branch not found"} / {"error": "Branch has no
+     # compiled plan"} / {"error": "Experiment not found"} in several
+     # paths — without this check they'd fall through to "committed".
+     if result.get("error") and result.get("committed") is not True:
+         return "commit_failed"
      steps_ok = result.get("steps_ok")
      steps_failed = result.get("steps_failed")
      if steps_ok == 0 and (steps_failed is None or steps_failed > 0):
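
To see how the new check interacts with the others, here is a standalone approximation of the classification order. It is a sketch only: the `status`, `error`, and `steps_ok` checks mirror the hunk above, while the non-dict guard, the `committed: false` / `ok: false` checks (reconstructed from the changelog's list of signals), and the final `"committed"` fallthrough are assumptions about code the diff does not show. The name `classify_commit_result_sketch` is hypothetical.

```python
from typing import Any


def classify_commit_result_sketch(result: Any) -> str:
    # Assumption: anything that is not a dict counts as a failure.
    if not isinstance(result, dict):
        return "commit_failed"
    # Assumption: explicit negative flags, as listed in the changelog.
    if result.get("committed") is False or result.get("ok") is False:
        return "commit_failed"
    if result.get("status") == "failed":
        return "commit_failed"
    # v1.17.5 check: an error with no explicit committed=True is a failure.
    if result.get("error") and result.get("committed") is not True:
        return "commit_failed"
    steps_ok = result.get("steps_ok")
    steps_failed = result.get("steps_failed")
    if steps_ok == 0 and (steps_failed is None or steps_failed > 0):
        return "commit_failed"
    # Assumption: everything else is treated as a successful commit.
    return "committed"


error_only = classify_commit_result_sketch({"error": "Branch not found"})
warned = classify_commit_result_sketch(
    {"committed": True, "error": "warning...", "steps_ok": 3}
)
```

The `is not True` comparison is what lets a payload with both `committed: True` and a warning in `error` stay classified as committed, matching the edge case in the changelog's test list.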
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "livepilot",
-   "version": "1.17.3",
+   "version": "1.17.5",
    "mcpName": "io.github.dreamrec/livepilot",
    "description": "Agentic production system for Ableton Live 12 — 427 tools, 52 domains. Device atlas (1305 devices), sample engine (Splice + browser + filesystem), auto-composition, spectral perception, technique memory, creative intelligence (12 engines)",
    "author": "Pilot Studio",
@@ -5,7 +5,7 @@ Entry point for the ControlSurface. Ableton calls create_instance(c_instance)
  when this script is selected in Preferences > Link, Tempo & MIDI.
  """
 
- __version__ = "1.17.3"
+ __version__ = "1.17.5"
 
  from _Framework.ControlSurface import ControlSurface
  from . import router
package/server.json CHANGED
@@ -6,12 +6,12 @@
      "url": "https://github.com/dreamrec/LivePilot",
      "source": "github"
    },
-   "version": "1.17.3",
+   "version": "1.17.5",
    "packages": [
      {
        "registryType": "npm",
        "identifier": "livepilot",
-       "version": "1.17.3",
+       "version": "1.17.5",
        "transport": {
          "type": "stdio"
        }