livepilot 1.18.1 → 1.18.2

package/CHANGELOG.md CHANGED
@@ -1,5 +1,80 @@
1
1
  # Changelog
2
2
 
3
+ ## 1.18.2 — Wonder cold-start + tie-break + genre catalog closure (April 24 2026)
4
+
5
+ Second patch in the v1.18.x series. Three items from the v1.18.0/v1.18.1
6
+ Known Issues list resolved. Test suite grew to 2785 passing tests; the last
7
+ xfail marker was removed (formerly 1, now 0).
8
+
9
+ ### Fixes
10
+
11
+ - **#10 Wonder Mode zero-variant degradation on empty session context.**
12
+ `enter_wonder_mode` on an empty/sparse session was returning 3
13
+ IDENTICAL `analytical_only` variants all with intent "Analytical
14
+ suggestion for: <request>". Live-verified during v1.18.0 Test 4
15
+ ("I'm stuck" on a 4-track empty session). Fix: introduced
16
+ `_COLD_START_SEEDS` in `mcp_server/wonder_mode/engine.py` — three
17
+ distinct starting-point suggestions covering different families
18
+ (`device_creation × rhythmic` + `sound_design × harmonic` +
19
+ `mix × architecture-first`). When `executable_count == 0`, the
20
+ padding loop uses `build_cold_start_variant()` which pulls from
21
+ the seed set by index, producing genuinely distinct variants with
22
+ specific actionable `what_changed` / `why_it_matters` text.
23
+ Partial-match case (1-2 executable) still uses the generic
24
+ fallback to avoid mixing real moves with architecture-first seeds.
25
+
26
+ - **#11 Experiment ranking tie-break coarseness.**
27
+ `ExperimentSet.ranked_branches()` was a single-key sort by score,
28
+ producing unstable rankings at score ties. Live-verified in v1.18.0
29
+ Test 8 — 3-branch experiment with `add_space` + `add_warmth` +
30
+ `widen_stereo` all scored 0.6 with no clear winner. Fix: composite
31
+ sort key via new `_branch_rank_key()` helper, in priority order:
32
+ (1) `-score` (primary, higher wins), (2) `-novelty_rank` (higher
33
+ novelty wins score ties — creative asks reward variation),
34
+ (3) `risk_rank` (lower risk wins secondary ties — safety default),
35
+ (4) `step_count` (simpler plans win tertiary ties),
36
+ (5) `branch_id` (deterministic final tiebreak for reproducibility).
37
+
38
+ - **Concept packet catalog closure.** 13 new genre YAMLs
39
+ (drone, downtempo, lo_fi, boom_bap, footwork, techno,
40
+ detroit_techno, synthwave, deep_house, disco, soul, dub, hyperpop)
41
+ + 15 too-generic/narrow refs removed from 12 artist packets
42
+ (electronic ×5, electronica, bass_music, cinematic, acid_techno,
43
+ french_house, nu_disco, soulful_house, vaporwave, juke, jungle).
44
+ The xfailing `test_all_artist_genre_refs_resolve_strictly` test
45
+ is now a required green pass. The concept surface has full graph
46
+ closure — every artist→genre cross-reference resolves to an actual
47
+ genre YAML's `id` field.
48
+
49
+ ### Tests added / changed
50
+
51
+ - `test_wonder_cold_start_has_distinct_variants` (new — guards
52
+ against regression to the 3-identical-generics degradation)
53
+ - `test_experiment_tie_break_prefers_higher_novelty` (new —
54
+ unexpected > strong > safe at equal scores)
55
+ - `test_experiment_tie_break_is_deterministic` (new — ranking stable
56
+ across input order)
57
+ - `test_all_artist_genre_refs_resolve_strictly` (was xfailing, now
58
+ passing — xfail marker removed)
59
+ - `test_concept_packets_count` (floor updated 14 → 27 genres)
60
+
61
+ ### Still open for v1.18.3 / v1.19
62
+
63
+ 5 items remain from the original v1.18.0 Known Issues list:
64
+
65
+ - **#7 Packet `avoid` list runtime enforcement** (still advisory —
66
+ pre-flight check against tool args needed)
67
+ - **#8 `locked_dimensions` runtime enforcement** (same pattern as #7)
68
+ - **Experiment state continuity between branches** (before-snapshot
69
+ drift)
70
+ - **Hybrid-packet compilation algorithm** (union/intersection logic
71
+ for "Basic Channel meets Dilla")
72
+ - **Full architectural fix for #3** (route director Phase 6 through
73
+ semantic_move commits — big redesign, v1.19 scope)
74
+
75
+ These all need new infrastructure or architectural decisions
76
+ unsuitable for a patch release.
77
+
3
78
  ## 1.18.1 — Director HIGH-severity patches (April 23 2026)
4
79
 
5
80
  Patch release addressing 4 of the 12 known issues documented in v1.18.0.
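The catalog-closure guarantee described in the 1.18.2 entry (every artist→genre reference resolves to a genre YAML's `id`) amounts to a set-membership check. The sketch below is illustrative only: `unresolved_genre_refs`, the `genres` field name, and the in-memory dict layout are assumptions made for the example, not the package's actual test or packet schema.

```python
def unresolved_genre_refs(artist_packets, genre_packets):
    """Return sorted (artist_id, genre_ref) pairs whose ref matches no genre id."""
    genre_ids = {genre["id"] for genre in genre_packets}
    missing = []
    for artist in artist_packets:
        # "genres" is an assumed field name for the cross-reference list.
        for ref in artist.get("genres", []):
            if ref not in genre_ids:
                missing.append((artist["id"], ref))
    return sorted(missing)


# Hypothetical, hand-written packets standing in for parsed YAML files.
genres = [{"id": "dub"}, {"id": "techno"}]
artists = [
    {"id": "artist_a", "genres": ["dub", "techno"]},
    {"id": "artist_b", "genres": ["electronic"]},  # too generic: no such genre id
]

print(unresolved_genre_refs(artists, genres))
```

Under these assumptions, a strict closure test would simply assert that the returned list is empty, which can be satisfied either by adding the missing genre packet or by removing the dangling reference.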
@@ -1,2 +1,2 @@
1
1
  """LivePilot MCP Server — bridges MCP protocol to Ableton Live."""
2
- __version__ = "1.18.1"
2
+ __version__ = "1.18.2"
@@ -194,6 +194,52 @@ class ExperimentBranch:
194
194
  return d
195
195
 
196
196
 
197
+ # v1.18.2 #11: composite tie-break ranking for experiment branches.
198
+ # Maps novelty_label / risk_label strings to integer ranks.
199
+ _NOVELTY_RANK: dict[str, int] = {
200
+ "safe": 0,
201
+ "medium": 1, # rarely used, but accept it for robustness
202
+ "strong": 1,
203
+ "unexpected": 2,
204
+ "bold": 2, # alias in some producer outputs
205
+ }
206
+ _RISK_RANK: dict[str, int] = {
207
+ "low": 0,
208
+ "medium": 1,
209
+ "high": 2,
210
+ }
211
+
212
+
213
+ def _branch_rank_key(branch: "ExperimentBranch") -> tuple:
214
+ """Composite sort key for ExperimentSet.ranked_branches().
215
+
216
+ Returns a tuple (-score, -novelty, risk, step_count, branch_id) such
217
+ that Python's default ascending sort produces the desired ranking:
218
+ higher scores first, then higher novelty at score ties, then lower
219
+ risk under equal novelty, then simpler plans, then branch_id as a
220
+ deterministic final tiebreak.
221
+ """
222
+ score = float(getattr(branch, "score", 0.0) or 0.0)
223
+ seed = getattr(branch, "seed", None)
224
+
225
+ if seed is not None:
226
+ novelty_label = (seed.novelty_label or "").lower()
227
+ risk_label = (seed.risk_label or "").lower()
228
+ else:
229
+ novelty_label = ""
230
+ risk_label = ""
231
+
232
+ novelty_rank = _NOVELTY_RANK.get(novelty_label, 1) # middle if unknown
233
+ risk_rank = _RISK_RANK.get(risk_label, 1)
234
+
235
+ plan = getattr(branch, "compiled_plan", None) or {}
236
+ step_count = int(plan.get("step_count", 0) or 0)
237
+
238
+ branch_id = getattr(branch, "branch_id", "") or ""
239
+
240
+ return (-score, -novelty_rank, risk_rank, step_count, branch_id)
241
+
242
+
197
243
  @dataclass
198
244
  class ExperimentSet:
199
245
  """A collection of branches being compared for one request."""
@@ -215,9 +261,33 @@ class ExperimentSet:
215
261
  return None
216
262
 
217
263
  def ranked_branches(self) -> list[ExperimentBranch]:
218
- """Return branches sorted by score descending."""
264
+ """Return evaluated branches sorted by composite rank.
265
+
266
+ v1.18.2 #11 fix: pre-fix this was a single-key sort by score,
267
+ which produced unstable rankings at score ties (live-verified in
268
+ v1.18.0 Test 8 — three branches at 0.6 with no winner).
269
+
270
+ Sort keys, in priority order:
271
+ 1. -score — higher score wins
272
+ 2. -novelty_rank — higher novelty wins at score ties
273
+ (creative asks reward variation)
274
+ 3. risk_rank — lower risk wins secondary ties
275
+ (safety default under equal novelty)
276
+ 4. step_count — simpler plans win tertiary ties
277
+ 5. branch_id — deterministic final tiebreak
278
+ (stable ranking across equal branches)
279
+
280
+ Novelty labels rank: "safe"=0, "medium"=1, "strong"=1, "unexpected"=2, "bold"=2.
281
+ Risk labels rank: "low"=0, "medium"=1, "high"=2.
282
+ Unknown labels default to the middle (1).
283
+ """
219
284
  evaluated = [b for b in self.branches if b.status == "evaluated"]
220
- return sorted(evaluated, key=lambda b: -b.score)
285
+ return sorted(evaluated, key=_branch_rank_key)
286
+
287
+ # expose the key function for testing + custom rankers
288
+ def rank_key_for(self, branch: "ExperimentBranch") -> tuple:
289
+ """Return the composite rank key for a branch (for tie-break debugging)."""
290
+ return _branch_rank_key(branch)
221
291
 
222
292
  def to_dict(self) -> dict:
223
293
  return {
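The composite sort key in the hunk above can be tried in isolation. The following is a minimal sketch, not the package's code: `Branch` is a hypothetical stand-in for `ExperimentBranch`, the novelty and risk labels attached to each branch are invented for illustration, and `extra_branch` is a made-up fourth entry. Only the tuple ordering mirrors `_branch_rank_key`.

```python
from dataclasses import dataclass

NOVELTY_RANK = {"safe": 0, "strong": 1, "unexpected": 2}
RISK_RANK = {"low": 0, "medium": 1, "high": 2}


@dataclass
class Branch:  # hypothetical stand-in for ExperimentBranch
    branch_id: str
    score: float
    novelty_label: str
    risk_label: str
    step_count: int


def rank_key(branch):
    """Ascending sort on this tuple puts the preferred branch first."""
    return (
        -branch.score,                              # higher score first
        -NOVELTY_RANK.get(branch.novelty_label, 1),  # then higher novelty
        RISK_RANK.get(branch.risk_label, 1),         # then lower risk
        branch.step_count,                           # then fewer steps
        branch.branch_id,                            # deterministic last resort
    )


# Labels below are invented; the changelog only states the three scores were 0.6.
branches = [
    Branch("add_space", 0.6, "safe", "low", 2),
    Branch("add_warmth", 0.6, "strong", "low", 3),
    Branch("widen_stereo", 0.6, "unexpected", "medium", 4),
    Branch("extra_branch", 0.7, "safe", "low", 1),
]

ranked = [b.branch_id for b in sorted(branches, key=rank_key)]
print(ranked)
```

Because the key ends in `branch_id`, no two distinct branches share a key, so the result does not depend on the order in which branches were appended. Sorting the reversed list yields the same ranking.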
@@ -321,6 +321,82 @@ def build_analytical_variant(label: str, request_text: str, novelty_level: float
321
321
  }
322
322
 
323
323
 
324
+ # v1.18.2 #10 fix: distinct cold-start variant seeds for empty/sparse
325
+ # sessions. Used when no semantic moves match the request. Each seed has
326
+ # a specific `what_changed` + `why_it_matters` covering a different
327
+ # starting-point family (device_creation × rhythm + sound_design ×
328
+ # harmony + mix-architecture-first). Replaces the 3-identical-generics
329
+ # degradation that v1.18.0 Test 4 surfaced.
330
+ _COLD_START_SEEDS: list[dict] = [
331
+ {
332
+ "label": "safe",
333
+ "family": "device_creation",
334
+ "intent": "Begin with a rhythmic foundation",
335
+ "what_changed": "Load a drum kit (Drum Rack or Core Kit) on a fresh MIDI track, program a 4-bar kick-and-hat pattern",
336
+ "what_preserved": "blank slate — first move sets the tempo and grid foundation",
337
+ "why_it_matters": "Every track needs a rhythmic anchor before timbral or structural work. Safe starting point — drums-first is the most common composition entry.",
338
+ "novelty_level": 0.3,
339
+ "identity_effect": "establishes",
340
+ },
341
+ {
342
+ "label": "strong",
343
+ "family": "sound_design",
344
+ "intent": "Begin with a harmonic source",
345
+ "what_changed": "Load Drift or Meld on a MIDI track with a chord-stab patch (short attack, moderate release, slight detune), sketch a 2-bar chord pattern",
346
+ "what_preserved": "tempo and key are still open to discovery — lets the harmony suggest the rhythm",
347
+ "why_it_matters": "A harmonic source opens a different emotional palette than drums-first. Chord-first composition (Isolée / Luomo style) is less common but produces distinctive results.",
348
+ "novelty_level": 0.55,
349
+ "identity_effect": "establishes",
350
+ },
351
+ {
352
+ "label": "unexpected",
353
+ "family": "mix",
354
+ "intent": "Begin with the space, not the source",
355
+ "what_changed": "Configure return tracks BEFORE any instrument work — set up Return A with Convolution Reverb (cathedral IR) and Return B with Echo in ping-pong mode",
356
+ "what_preserved": "the blank slate IS the canvas; the sends are the frame you'll paint into",
357
+ "why_it_matters": "Dub techno and ambient producers (Basic Channel, Gas, Henke) build sound AROUND pre-configured sends. Unusual but genre-appropriate starting point.",
358
+ "novelty_level": 0.85,
359
+ "identity_effect": "establishes",
360
+ },
361
+ ]
362
+
363
+
364
+ def build_cold_start_variant(seed: dict, request_text: str, variant_id: str = "") -> dict:
365
+ """Build a cold-start variant seed for an empty/sparse session.
366
+
367
+ Used when no semantic moves match the request. Returns a variant with
368
+ distinct, actionable `what_changed` / `why_it_matters` text — NOT the
369
+ generic 'No matching moves found' fallback. Each seed covers a
370
+ different starting-point family; together they give the user three
371
+ genuinely distinct first-moves to choose from.
372
+
373
+ See `_COLD_START_SEEDS` for the seed set. The variant is
374
+ `analytical_only=True` (no compiled_plan) — turning these into
375
+ one-click executable plans is a v1.19 enhancement.
376
+ """
377
+ return {
378
+ "variant_id": variant_id,
379
+ "label": seed["label"],
380
+ "move_id": "",
381
+ "family": seed["family"],
382
+ "intent": seed["intent"],
383
+ "what_changed": seed["what_changed"],
384
+ "what_preserved": seed["what_preserved"],
385
+ "why_it_matters": seed["why_it_matters"],
386
+ "identity_effect": seed["identity_effect"],
387
+ "novelty_level": seed["novelty_level"],
388
+ "taste_fit": 0.5,
389
+ "targets_snapshot": {},
390
+ "compiled_plan": None,
391
+ "score": 0.0,
392
+ "rank": 0,
393
+ "score_breakdown": {},
394
+ "analytical_only": True,
395
+ "distinctness_reason": f"Cold-start seed ({seed['family']}) — empty session, no moves matched",
396
+ "cold_start": True,
397
+ }
398
+
399
+
324
400
  # ── Taste fit scoring ────────────────────────────────────────────
325
401
 
326
402
 
@@ -577,16 +653,37 @@ def generate_wonder_variants(
577
653
 
578
654
  executable_count = len(variants)
579
655
 
580
- # Pad with analytical variants
581
- while len(variants) < 3:
582
- idx = len(variants)
583
- v = build_analytical_variant(
584
- label=labels[idx],
585
- request_text=request_text,
586
- novelty_level=_NOVELTY_LEVELS.get(labels[idx], 0.5),
587
- variant_id=f"{set_prefix}_{labels[idx]}",
588
- )
589
- variants.append(v)
656
+ # v1.18.2 #10 fix: when NO executable moves matched, seed from the
657
+ # cold-start distinct-starting-points set instead of padding with
658
+ # identical generic analytical variants. Pre-fix, cold-start on an
659
+ # empty session returned 3 variants all with the same generic
660
+ # "No matching moves found" text — unhelpful to the user.
661
+ #
662
+ # The partial-match case (1 or 2 executable variants) still pads with
663
+ # the generic analytical fallback because we don't want to mix real
664
+ # move-based variants with architecture-first seeds — that would
665
+ # confuse the presentation.
666
+ if executable_count == 0:
667
+ while len(variants) < 3:
668
+ idx = len(variants)
669
+ seed = _COLD_START_SEEDS[idx]
670
+ v = build_cold_start_variant(
671
+ seed=seed,
672
+ request_text=request_text,
673
+ variant_id=f"{set_prefix}_{seed['label']}",
674
+ )
675
+ variants.append(v)
676
+ else:
677
+ # Partial-match: pad to 3 with generic analytical variants
678
+ while len(variants) < 3:
679
+ idx = len(variants)
680
+ v = build_analytical_variant(
681
+ label=labels[idx],
682
+ request_text=request_text,
683
+ novelty_level=_NOVELTY_LEVELS.get(labels[idx], 0.5),
684
+ variant_id=f"{set_prefix}_{labels[idx]}",
685
+ )
686
+ variants.append(v)
590
687
 
591
688
  novelty_band = 0.5
592
689
  taste_evidence = 0
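The two padding paths in the hunk above behave differently, and a small standalone sketch can make that visible. Everything here is an assumption for illustration: `pad_to_three` is a hypothetical helper, the seeds are trimmed to three fields, and the generic filler only imitates the "Analytical suggestion for: <request>" intent quoted in the changelog.

```python
# Trimmed-down seeds: only the fields needed for the illustration.
SEEDS = [
    {"label": "safe", "family": "device_creation",
     "intent": "Begin with a rhythmic foundation"},
    {"label": "strong", "family": "sound_design",
     "intent": "Begin with a harmonic source"},
    {"label": "unexpected", "family": "mix",
     "intent": "Begin with the space, not the source"},
]
LABELS = ["safe", "strong", "unexpected"]


def pad_to_three(variants, request_text):
    """Pad a variant list to three entries, choosing the path by executable count."""
    executable_count = len(variants)
    padded = list(variants)
    while len(padded) < 3:
        idx = len(padded)
        if executable_count == 0:
            # Cold start: each slot takes its own seed, so the intents differ.
            padded.append({**SEEDS[idx], "cold_start": True})
        else:
            # Partial match: generic filler with the same intent text per slot.
            padded.append({
                "label": LABELS[idx],
                "intent": f"Analytical suggestion for: {request_text}",
                "cold_start": False,
            })
    return padded


cold = pad_to_three([], "I'm stuck")
partial = pad_to_three(
    [{"label": "safe", "family": "mix", "intent": "real move"}], "I'm stuck"
)

print(len({v["intent"] for v in cold}))        # count of distinct cold-start intents
print([v["cold_start"] for v in partial[1:]])  # padded slots in the partial case
```

Indexing the seed list by `idx` works here because the list has exactly three entries and the loop pads to three; a larger target would need more seeds or a bounds check.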
@@ -603,7 +700,13 @@ def generate_wonder_variants(
603
700
 
604
701
  degraded_reason = ""
605
702
  if executable_count == 0:
606
- degraded_reason = "No matching executable moves found"
703
+ # v1.18.2 #10: cold-start path uses distinct starting-point seeds
704
+ # rather than identical-generic padding.
705
+ degraded_reason = (
706
+ "No matching executable moves — cold-start variants seeded "
707
+ "from distinct starting-point families (device_creation × 2 "
708
+ "+ mix-architecture-first)"
709
+ )
607
710
  elif executable_count == 1:
608
711
  degraded_reason = "Only 1 distinct executable move found"
609
712
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "livepilot",
3
- "version": "1.18.1",
3
+ "version": "1.18.2",
4
4
  "mcpName": "io.github.dreamrec/livepilot",
5
5
  "description": "Agentic production system for Ableton Live 12 — 427 tools, 52 domains. Device atlas (1305 devices), sample engine (Splice + browser + filesystem), auto-composition, spectral perception, technique memory, creative intelligence (12 engines)",
6
6
  "author": "Pilot Studio",
@@ -5,7 +5,7 @@ Entry point for the ControlSurface. Ableton calls create_instance(c_instance)
5
5
  when this script is selected in Preferences > Link, Tempo & MIDI.
6
6
  """
7
7
 
8
- __version__ = "1.18.1"
8
+ __version__ = "1.18.2"
9
9
 
10
10
  from _Framework.ControlSurface import ControlSurface
11
11
  from . import router
package/server.json CHANGED
@@ -6,12 +6,12 @@
6
6
  "url": "https://github.com/dreamrec/LivePilot",
7
7
  "source": "github"
8
8
  },
9
- "version": "1.18.1",
9
+ "version": "1.18.2",
10
10
  "packages": [
11
11
  {
12
12
  "registryType": "npm",
13
13
  "identifier": "livepilot",
14
- "version": "1.18.1",
14
+ "version": "1.18.2",
15
15
  "transport": {
16
16
  "type": "stdio"
17
17
  }