@event4u/agent-config 5.4.0 → 5.5.0

Files changed (49)
  1. package/.agent-src/commands/knowledge/cross-repo.md +71 -0
  2. package/.agent-src/commands/knowledge.md +2 -0
  3. package/.agent-src/commands/skill/preview.md +67 -0
  4. package/.agent-src/commands/skill.md +48 -0
  5. package/.agent-src/commands/skills/discover.md +76 -0
  6. package/.agent-src/commands/skills.md +56 -0
  7. package/.agent-src/commands/video/from-song.md +317 -0
  8. package/.agent-src/commands/video.md +19 -9
  9. package/.agent-src/rules/linked-projects-onboarding-gate.md +1 -1
  10. package/.agent-src/skills/song-to-script/SKILL.md +193 -0
  11. package/.claude-plugin/marketplace.json +9 -2
  12. package/CHANGELOG.md +49 -0
  13. package/CONTRIBUTING.md +6 -0
  14. package/README.md +3 -3
  15. package/dist/cli/registry.js +1 -0
  16. package/dist/cli/registry.js.map +1 -1
  17. package/dist/discovery/deprecation-report.md +1 -1
  18. package/dist/discovery/discovery-manifest.json +171 -17
  19. package/dist/discovery/discovery-manifest.json.sha256 +1 -1
  20. package/dist/discovery/discovery-manifest.summary.md +4 -4
  21. package/dist/discovery/orphan-report.md +1 -1
  22. package/dist/discovery/packs.json +17 -10
  23. package/dist/discovery/trust-report.md +3 -3
  24. package/dist/discovery/workspaces.json +13 -6
  25. package/dist/mcp/registry-manifest.json +2 -2
  26. package/docs/architecture.md +2 -2
  27. package/docs/contracts/command-clusters.md +4 -1
  28. package/docs/contracts/cross-repo-retrieval.md +64 -0
  29. package/docs/contracts/skill-discovery.md +80 -0
  30. package/docs/contracts/skill-dry-run.md +47 -0
  31. package/docs/decisions/ADR-032-linked-projects-scope.md +7 -3
  32. package/docs/getting-started.md +1 -1
  33. package/docs/guides/cross-repo-linked-projects.md +7 -0
  34. package/docs/guides/cross-repo-retrieval.md +61 -0
  35. package/docs/guides/skill-discovery.md +71 -0
  36. package/docs/guides/skill-preview.md +71 -0
  37. package/package.json +1 -1
  38. package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
  39. package/scripts/_dispatch.bash +10 -0
  40. package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
  41. package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
  42. package/scripts/ai-video/lib/probe-audio.sh +181 -0
  43. package/scripts/cross_repo_retrieve.py +172 -0
  44. package/scripts/inventory_meta_layers.py +288 -0
  45. package/scripts/linked_projects_list.py +91 -0
  46. package/scripts/memory_lookup.py +53 -2
  47. package/scripts/skill_discovery.py +254 -0
  48. package/scripts/skill_linter.py +8 -4
  49. package/scripts/skill_preview.py +179 -0
@@ -0,0 +1,80 @@
1
+ ---
2
+ stability: experimental
3
+ ---
4
+
5
+ # Skill Discovery Contract
6
+
7
+ > **Status** · v0 / design · 2026-05-30. Phase 3 of `road-to-leaner-core-and-discovery`.
8
+ > **Local-only.** Mirrors [`local-analytics.md`](local-analytics.md): no network egress, no POST,
9
+ > no remote Worker. The recommender reads local files only and honours the analytics opt-out.
10
+
11
+ ## Problem
12
+
13
+ The package ships 220 skills. Both council members named "218-skill paralysis" as the dominant
14
+ discoverability risk. This contract defines a **recommendation surface** that turns existing signals
15
+ into a short, *explained* shortlist — and explicitly **reuses** signals already on disk. It adds **no**
16
+ new always-loaded layer (that would fail the Phase-1 leaner-core premise).
17
+
18
+ ## Input signals (all local, all already on disk)
19
+
20
+ | Signal | Source | Used for |
21
+ |---|---|---|
22
+ | Skill catalog | `.agent-src/skills/*/SKILL.md` frontmatter (`name`, `description`, `domain`) | candidate universe + `domain` category |
23
+ | Role shortlist | `agents/roles/<role>/skills.yml` (priority-ordered `id` + `why`) | `most-useful-for-role` |
24
+ | Local analytics | `~/.event4u/agent-config/workspace/analytics/events.jsonl` (`event`, `data.role`, `data.task`, optional `data.skill`) | `recently-adopted`, `popular-in-role` |
25
+
26
+ The role `skills.yml` is the strongest signal and is always present; analytics is optional and
27
+ degrades gracefully (below).
28
+
29
+ ## Four recommendation classes
30
+
31
+ | Class | Ranking basis | `why` shape |
32
+ |---|---|---|
33
+ | `most-useful-for-role` | role `skills.yml` priority order | the shortlist's own `why:` line |
34
+ | `related-to-current-task` | skills sharing the `domain` of the role's shortlist skills, not already shortlisted | `same domain (<domain>) as your role's core skills` |
35
+ | `recently-adopted` | analytics events in the last 14 days carrying a skill id (`data.skill`), most-recent first | `used <N>d ago in this workspace` |
36
+ | `popular-in-role` | analytics skill-events filtered by `data.role`, by frequency | `launched <N>× by the <role> role locally` |
37
+
38
+ ## Explanation requirement — non-negotiable
39
+
40
+ ```
41
+ EVERY RECOMMENDATION CARRIES A NON-EMPTY `why`. NEVER AN UNEXPLAINED SCORE.
42
+ ```
43
+
44
+ Both council members flagged "opaque / self-referential recommendations without real usage signal" as
45
+ the main risk. A result with no `why` is a contract violation. The `why` names the *signal* (role match,
46
+ domain adjacency, recent-adoption, role-popularity) — never a bare number.
47
+
48
+ ## Graceful degradation — analytics absent or opted out
49
+
50
+ Analytics is optional. When the JSONL file is missing, empty, or the opt-out is set
51
+ (`AGENT_CONFIG_NO_LOCAL_ANALYTICS` env or `analytics.local: off` config — same checks as
52
+ `local-analytics.md`), the two analytics-backed classes do not fabricate signal:
53
+
54
+ - `recently-adopted` and `popular-in-role` fall back to **role-shortlist order** with an honest `why`
55
+ (`from your role shortlist — no local usage signal yet`).
56
+ - `most-useful-for-role` and `related-to-current-task` are unaffected (catalog + role only).
57
+
58
+ The recommender therefore always returns a useful, explained list — even on a fresh machine with no
59
+ analytics history. Today's analytics schema logs `data.task` (not skill ids); the skill-level classes
60
+ read the forward-compatible `data.skill` field and degrade to the role-shortlist fallback until it is
61
+ populated. No class ever returns an empty `why`.
62
+
63
+ ## Local-only / no-network floor
64
+
65
+ The recommender opens local files only. It performs no network I/O, writes nothing, and never emits a
66
+ prompt or response body. It is read-only over the catalog, the role file, and (if present) the analytics
67
+ log. This mirrors `local-analytics.md` and does not lift the 3.1.0 Hard-Floor.
68
+
69
+ ## Surfaces
70
+
71
+ - CLI / agent: `/skills:discover [role]` → Markdown table (`skill · class · why · first command`).
72
+ Defaults to the active role experience when one is set; otherwise prompts for a role.
73
+ - GUI: a right-rail "Suggested skills" strip on the Workspace tab, reusing the `/api/v1/workspace/*`
74
+ bridge (no new infra). Deferrable behind the CLI surface if the employee-roadmap right-rail blocker
75
+ is still open.
76
+
77
+ ## Implementation
78
+
79
+ `scripts/skill_discovery.py` (≤ 300 LOC). Pure-local, no POST. Honours the analytics opt-out env + config.
80
+ Coverage: `tests/test_skill_discovery.py` against a fixture catalog + fixture analytics JSONL.
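
The graceful-degradation rule in this contract can be sketched roughly as follows. This is a minimal illustration and is not taken from `scripts/skill_discovery.py`: the `Recommendation` record, the `recently_adopted` helper, the `ts` timestamp field on each event, and the exact wording of `FALLBACK_WHY` are all assumptions made for the example. It shows one analytics-backed class falling back to role-shortlist order with an explicit `why` whenever no usable `data.skill` signal exists or the opt-out is set.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical fallback wording; the contract only requires that it be honest and non-empty.
FALLBACK_WHY = "no local usage signal yet (role shortlist order)"


@dataclass(frozen=True)
class Recommendation:
    skill: str
    klass: str
    why: str

    def __post_init__(self):
        # Contract: every recommendation carries a non-empty `why`.
        if not self.why.strip():
            raise ValueError(f"recommendation for {self.skill!r} has an empty why")


def recently_adopted(events, shortlist, now, opted_out=False, window_days=14, limit=5):
    """Rank skills by most recent use; fall back to the shortlist when there is no signal.

    `events` are dicts shaped like analytics lines. The `ts` key is an assumed
    ISO-8601 timestamp field; `data.skill` is the forward-compatible skill id.
    """
    cutoff = now - timedelta(days=window_days)
    latest = {}
    if not opted_out:
        for ev in events:
            skill = ev.get("data", {}).get("skill")
            if not skill:
                continue  # today's schema often logs only data.task
            ts = datetime.fromisoformat(ev["ts"])
            if ts >= cutoff and (skill not in latest or ts > latest[skill]):
                latest[skill] = ts
    if not latest:
        return [Recommendation(s, "recently-adopted", FALLBACK_WHY) for s in shortlist[:limit]]
    ranked = sorted(latest.items(), key=lambda kv: kv[1], reverse=True)[:limit]
    return [
        Recommendation(s, "recently-adopted", f"used {(now - ts).days}d ago in this workspace")
        for s, ts in ranked
    ]
```

The empty-`why` check lives in the record itself, so a fallback path cannot silently emit an unexplained row.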
@@ -0,0 +1,47 @@
1
+ ---
2
+ stability: experimental
3
+ ---
4
+
5
+ # Skill Dry-Run / Preview Contract
6
+
7
+ > **Status** · v0 / design · 2026-05-30. Phase 5 of `road-to-leaner-core-and-discovery`.
8
+ > The council's missing-item catch: with 220 skills, non-dev personas need a non-destructive way
9
+ > to see what a skill/command will do **before** running it.
10
+
11
+ ## What "preview" means
12
+
13
+ A preview reads a skill's **declared intent** — its frontmatter and `## Steps` body — and renders a
14
+ plain-language "this skill will…" summary. It surfaces:
15
+
16
+ - the skill's **declared steps** (the `## Steps` section headings);
17
+ - its **execution type** (`manual` / `assisted` / `automated`, default `manual`) and **handler**
18
+ (`none` / `shell` / `php` / `node` / `internal`);
19
+ - its declared **`allowed_tools`**;
20
+ - any **file or command targets** named in the body (backtick paths, `python3 scripts/…` invocations).
21
+
22
+ ## Explicit non-goals
23
+
24
+ ```
25
+ PREVIEW IS NOT A SANDBOX. IT DOES NOT EXECUTE A FENCED COPY OF THE SKILL.
26
+ IT IS NOT A GUARANTEE OF SIDE-EFFECT-FREENESS FOR SKILLS WITH AN `execution` BLOCK.
27
+ ```
28
+
29
+ Preview reads declared intent — it does not run the skill, does not dry-run its commands, and cannot
30
+ prove a skill is harmless. It tells you what the skill *says* it will touch, so you can decide whether
31
+ to run it. For `execution: manual` skills (the default), it states plainly: **instructional only — no
32
+ automatic execution** (per [`runtime-safety`](../../.agent-src/rules/runtime-safety.md): `manual` is
33
+ instructional, `assisted` must propose before executing).
34
+
35
+ ## Surface
36
+
37
+ - CLI / agent: `/skill:preview <name>` — plain-language summary by default; `--technical` shows the raw
38
+ frontmatter + step list.
39
+ - Script: `scripts/skill_preview.py <name> [--technical] [--format text|json]`.
40
+
41
+ Plain-language mode reuses the plain-explain tone (employee-roadmap Phase 6). A malformed or missing
42
+ SKILL.md degrades to a **structured error**, never a crash.
43
+
44
+ ## Implementation
45
+
46
+ `scripts/skill_preview.py` (≤ 250 LOC). Read-only over `.agent-src/skills/<name>/SKILL.md`. No network,
47
+ no execution. Coverage: `tests/test_skill_preview.py`.
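
As a rough sketch of the "read declared intent, never crash" behaviour described in this contract, the snippet below extracts step headings from a SKILL.md string and returns a structured error for malformed input. It is not the code in `scripts/skill_preview.py`: the `preview_steps` name, the result dictionary shape, and the assumption that steps are `###` headings under `## Steps` are illustrative only.

```python
import re


def preview_steps(skill_md: str) -> dict:
    """Return {"ok": True, "steps": [...]} or {"ok": False, "error": "..."}.

    Reads text only; nothing in the skill is executed.
    """
    if not skill_md.strip():
        return {"ok": False, "error": "empty SKILL.md"}
    lines = skill_md.splitlines()
    # Frontmatter must open on the first line and be closed by a later `---`.
    if lines[0].strip() != "---" or "---" not in [l.strip() for l in lines[1:]]:
        return {"ok": False, "error": "missing or unterminated frontmatter"}
    steps, in_steps = [], False
    for line in lines:
        if re.match(r"^## +", line):
            # Any level-2 heading opens or closes the Steps section.
            in_steps = line.strip().lower() == "## steps"
            continue
        m = re.match(r"^### +(.*)", line)
        if in_steps and m:
            steps.append(m.group(1).strip())
    return {"ok": True, "steps": steps}
```

Returning a dictionary instead of raising keeps the caller's rendering path identical for valid and invalid skills.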
@@ -99,9 +99,13 @@ telemetry.
99
99
 
100
100
  ## Open follow-ups
101
101
 
102
- - **Consumer detector reachability:** the detector lives in `scripts/_lib/`;
103
- exposing it as an `agent-config` CLI subcommand for consumer installs is a
104
- follow-up. Import-reachable in this repo / co-located maintainer setups today.
102
+ - **Consumer detector reachability:** **Closed (2026-05-30, `road-to-leaner-core-and-discovery`
103
+ Phase 4).** The detector is now exposed as `agent-config linked-projects:list`
104
+ (`scripts/linked_projects_list.py`, registered in `src/cli/registry.ts` + `scripts/_dispatch.bash`),
105
+ wrapping `scripts/_lib/linked_projects.detect_linked_projects` + the `.agent-settings.local.yml`
106
+ opt-in cascade. Cross-repo *retrieval* over the opted-in siblings ships alongside it
107
+ (`/knowledge:cross-repo`, `scripts/cross_repo_retrieve.py`) per
108
+ [`cross-repo-retrieval`](../contracts/cross-repo-retrieval.md).
105
109
  - **Multi-agent verification:** only Claude Code was empirically validated.
106
110
  Cursor / Augment / Copilot are unverified — the guide's manual snippet covers
107
111
  them until an interactive per-IDE test is run.
@@ -169,7 +169,7 @@ Your agent now understands slash commands:
169
169
  | `/quality-fix` | Run and fix all quality checks |
170
170
  | `/chat-history` | Inspect the persistent chat-history log (read-only `show`) |
171
171
 
172
- → [Browse all 135 active commands](../.agent-src/commands/)
172
+ → [Browse all 141 active commands](../.agent-src/commands/)
173
173
 
174
174
  ---
175
175
 
@@ -77,6 +77,13 @@ If it reports the name, cross-repo access works. An out-of-root edit will prompt
77
77
  for confirmation, then succeed — that is expected (the agent's permission gate
78
78
  still applies).
79
79
 
80
+ ## Next: pull context from a sibling
81
+
82
+ Detection makes the agent *aware* of a sibling. To have it **read** targeted
83
+ context from one — a shared type, an API contract, a config — without copying
84
+ the sibling's files in, see [Cross-repo retrieval](cross-repo-retrieval.md)
85
+ (`agent-config linked-projects:list` + `/knowledge:cross-repo`).
86
+
80
87
  ## Tell us what works
81
88
 
82
89
  Auto-detection is verified for Claude Code only. If you use Cursor, Augment, or
@@ -0,0 +1,61 @@
1
+ # Cross-repo retrieval — pull sibling context without copying files
2
+
3
+ Once the agent knows about a sibling repo ([detection guide](cross-repo-linked-projects.md)),
4
+ cross-repo retrieval lets it **read targeted context** from that sibling — a shared type, an
5
+ API contract the frontend consumes, a config the sibling owns — without bulk-including the
6
+ sibling's files. It is the read layer on top of detection.
7
+
8
+ It stays inside [ADR-032](../decisions/ADR-032-linked-projects-scope.md) Option A: read-only,
9
+ opt-in per sibling, targeted query only. No full-tree sweep, no implicit inclusion, no writes.
10
+
11
+ ## 1. See which siblings are reachable
12
+
13
+ ```
14
+ agent-config linked-projects:list
15
+ ```
16
+
17
+ Prints the opted-in siblings as `path · detected via · large`. Add `--all` to see detected
18
+ siblings you have not decided on yet. A sibling only becomes reachable once you set
19
+ `include: true` for it in `agents/settings/.agent-settings.local.yml` (see the detection guide).
20
+
21
+ ## 2. Retrieve targeted context
22
+
23
+ ```
24
+ /knowledge:cross-repo "OrderApiContract"
25
+ ```
26
+
27
+ Under the hood:
28
+
29
+ ```bash
30
+ python3 scripts/cross_repo_retrieve.py "OrderApiContract" [--path-scope 'src/*.ts'] [--max-chunks 8]
31
+ ```
32
+
33
+ You get a bounded table — `source_repo · path · freshness · why` — drawn only from opted-in
34
+ siblings. Each chunk is redacted (secrets and PII are scrubbed before anything is shown), so
35
+ no credential ever crosses a repo boundary.
36
+
37
+ ## 3. Scope large siblings
38
+
39
+ A sibling flagged `large` by the detector **requires** a `--path-scope` glob:
40
+
41
+ ```
42
+ /knowledge:cross-repo "config" --path-scope 'packages/shared/**'
43
+ ```
44
+
45
+ Without a scope, a large sibling is skipped with a note — this keeps retrieval cheap and
46
+ targeted instead of walking a huge tree.
47
+
48
+ ## How it ranks in memory
49
+
50
+ When a skill retrieves memory with the `cross-repo` type, matches are tagged `source: cross-repo`
51
+ and scored **below** the project's own curated knowledge — so cross-repo context informs the
52
+ answer but never outranks your own repo's truth.
53
+
54
+ ## Notes
55
+
56
+ - **Read-only.** The surface never writes to a sibling. Out-of-root writes still pass the host
57
+ permission gate; cross-repo retrieval writes nothing.
58
+ - **Opt-in only.** A sibling that is not `include: true` is never read.
59
+ - **Targeted only.** Path-glob + content grep, never a blind full walk.
60
+ - Contract: [`cross-repo-retrieval`](../contracts/cross-repo-retrieval.md). Detection story:
61
+ [`cross-repo-linked-projects`](cross-repo-linked-projects.md).
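
The opt-in and large-sibling rules in this guide can be pictured with a small gating function. This is a sketch under stated assumptions, not the logic of `scripts/cross_repo_retrieve.py`: the `plan_retrieval` name and the dictionary keys (`path`, `include`, `large`) are hypothetical, chosen to mirror the `include: true` setting and the `large` flag the guide mentions.

```python
def plan_retrieval(siblings, path_scope=None):
    """Split siblings into those to search and those skipped with a note.

    `siblings` is a list of dicts with hypothetical keys: path, include, large.
    """
    searched, skipped = [], []
    for sib in siblings:
        if not sib.get("include", False):
            continue  # not opted in: never read
        if sib.get("large", False) and not path_scope:
            # Large siblings require a scope glob; skip with a note instead of walking the tree.
            skipped.append((sib["path"], "large sibling requires --path-scope"))
            continue
        searched.append(sib["path"])
    return searched, skipped
```

For example, calling it without a scope on one small opted-in sibling and one large opted-in sibling searches the first and reports the second as skipped; passing a scope searches both.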
@@ -0,0 +1,71 @@
1
+ # Skill discovery — a 3-minute walkthrough
2
+
3
+ The package ships 220 skills. You do not need to know them. `/skills:discover`
4
+ turns the catalog into a short, **explained** shortlist for your role — every
5
+ row tells you *why* it is suggested, so you never adopt a skill on faith.
6
+
7
+ It is local-only: it reads the skill catalog, your role's shortlist, and (if
8
+ present) your local-analytics log. No network, no writes.
9
+
10
+ ## 1. Run it for your role
11
+
12
+ ```
13
+ /skills:discover sales
14
+ ```
15
+
16
+ Or, with a role experience already active, just `/skills:discover` — it picks
17
+ up `roles.active_role` from `.agent-settings.yml`.
18
+
19
+ Under the hood the command runs:
20
+
21
+ ```bash
22
+ python3 scripts/skill_discovery.py --role sales
23
+ ```
24
+
25
+ ## 2. Read the `why` column
26
+
27
+ The output is a table. The third column is the point — it names the **signal**
28
+ behind each suggestion, never a bare score:
29
+
30
+ ```
31
+ | skill | class | why | first command |
32
+ |------------------------|-------------------------|-----------------------------------------------------------|---------------------------|
33
+ | refine-prompt | most-useful-for-role | Tightens fuzzy buyer briefs before drafting | Skill › refine-prompt |
34
+ | voice-and-tone-design | most-useful-for-role | Locks the deal voice across customer + procurement | Skill › voice-and-tone-design |
35
+ | competitive-positioning| most-useful-for-role | Surfaces the ours-vs-theirs delta when a competitor named | Skill › competitive-positioning |
36
+ | activation-design | related-to-current-task | same domain (product) as your sales core skills | Skill › activation-design |
37
+ | customer-research | recently-adopted | from your role shortlist — no local usage signal yet | Skill › customer-research |
38
+ | funnel-analysis | popular-in-role | from your role shortlist — no local usage signal yet | Skill › funnel-analysis |
39
+ ```
40
+
41
+ The four classes:
42
+
43
+ - **most-useful-for-role** — your role's curated priority shortlist.
44
+ - **related-to-current-task** — same-domain peers you have not shortlisted yet.
45
+ - **recently-adopted** — what you actually used recently (from local analytics);
46
+ on a fresh machine with no usage history it honestly says *"no local usage
47
+ signal yet"* and falls back to your shortlist instead of inventing a number.
48
+ - **popular-in-role** — what your role launches most locally (same fallback).
49
+
50
+ ## 3. Adopt one
51
+
52
+ Pick a row, read its `why`, and start with the `first command`. Unsure what a
53
+ skill will actually do before you commit? Preview it first:
54
+
55
+ ```
56
+ /skill:preview competitive-positioning
57
+ ```
58
+
59
+ That is the safe adoption loop: **discover → preview → run**.
60
+
61
+ ## Notes
62
+
63
+ - **Local-only.** `/skills:discover` never touches the network and writes
64
+ nothing. It reads the catalog, your role file, and the optional analytics log.
65
+ - **Analytics is optional.** Opt out with `AGENT_CONFIG_NO_LOCAL_ANALYTICS=1`
66
+ or `analytics.local: off` in `.agent-settings.yml`. The two usage-driven
67
+ classes then fall back to your role shortlist — the list is still useful, just
68
+ without the personalised signal.
69
+ - **`--format json`** emits the same data machine-readably; **`--limit N`** sets
70
+ how many results per class (default 5).
71
+ - Contract: [`skill-discovery`](../contracts/skill-discovery.md).
@@ -0,0 +1,71 @@
1
+ # Skill preview — see what a skill does before you run it
2
+
3
+ With 220 skills and some that run commands, you should not have to run a skill to
4
+ find out what it touches. `/skill:preview` reads a skill's **declared intent** —
5
+ its steps, execution type, tools, and any file/command targets — and renders a
6
+ plain-language summary. Read-only: it never runs the skill.
7
+
8
+ This is the middle of the safe adoption loop: **discover → preview → run**.
9
+
10
+ ## 1. Discover, then preview
11
+
12
+ Find a candidate with [`/skills:discover`](skill-discovery.md), then look before you leap:
13
+
14
+ ```
15
+ /skill:preview competitive-positioning
16
+ ```
17
+
18
+ Under the hood:
19
+
20
+ ```bash
21
+ python3 scripts/skill_preview.py competitive-positioning
22
+ ```
23
+
24
+ ## 2. Read the summary
25
+
26
+ A **manual** skill (the default) is pure guidance — preview says so plainly:
27
+
28
+ ```
29
+ # Preview — `accessibility-auditor`
30
+
31
+ **Execution: instructional only.** This skill does not run anything automatically —
32
+ it guides the agent step by step.
33
+
34
+ _No tools, commands, or file targets declared — pure guidance._
35
+ ```
36
+
37
+ An **assisted** skill proposes actions you approve — preview surfaces the command
38
+ and tools it declares:
39
+
40
+ ```
41
+ # Preview — `adr-create`
42
+
43
+ **Execution: assisted** (handler `shell`). It will *propose* actions for you to
44
+ approve — it never executes silently.
45
+
46
+ This skill will walk these steps:
47
+ - Pick the next ADR number
48
+ - Write the standard template
49
+ - Regenerate the index
50
+
51
+ Declared command: `python3 scripts/adr/regenerate_index.py`
52
+ ```
53
+
54
+ Add `--technical` for the raw frontmatter + numbered step list.
55
+
56
+ ## 3. Decide, then run
57
+
58
+ Preview hands the decision back to you. If the declared steps and targets look
59
+ right, invoke the skill. If not, skip it — you have spent zero side effects
60
+ finding out.
61
+
62
+ ## What preview is not
63
+
64
+ - **Not a sandbox.** It does not run the skill or a fenced copy of it.
65
+ - **Not a safety guarantee.** It shows what the skill *declares* it will touch —
66
+ it cannot prove a skill with an `execution` block is side-effect-free.
67
+
68
+ A malformed or missing skill yields a structured error, never a crash.
69
+
70
+ Contract: [`skill-dry-run`](../contracts/skill-dry-run.md). Pairs with
71
+ [`skill-discovery`](skill-discovery.md) as the discover → preview → run loop.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@event4u/agent-config",
3
- "version": "5.4.0",
3
+ "version": "5.5.0",
4
4
  "description": "Universal AI Agent OS \u2014 audited skills, governance rules, commands, and templates for AI coding tools (Claude Code, Cursor, Windsurf, Copilot).",
5
5
  "license": "MIT",
6
6
  "private": false,
@@ -165,6 +165,8 @@ Tier 2 — maintenance / internal (hooks, MCP, memory, telemetry):
165
165
  --payload <path|event-name> [--native-event <native>]
166
166
  [--manifest <path>] [--json] [--dry-run]
167
167
  memory:lookup Retrieve memory entries (text or JSON envelope)
168
+ linked-projects:list List opted-in IDE-attached sibling repos (path · detected_via · large)
169
+ Flags: --all (show undecided too), --format json
168
170
  memory:signal Append a provisional intake signal (memory proposal)
169
171
  memory:hash Hash a memory entry (YAML or JSON stdin)
170
172
  memory:check Validate memory YAML schema + staleness
@@ -419,6 +421,13 @@ cmd_memory_lookup() {
419
421
  exec python3 "$script" "$@"
420
422
  }
421
423
 
424
+ cmd_linked_projects_list() {
425
+ require_python3
426
+ local script
427
+ script="$(resolve_script "scripts/linked_projects_list.py")" || return 1
428
+ exec python3 "$script" "$@"
429
+ }
430
+
422
431
  cmd_memory_signal() {
423
432
  require_python3
424
433
  local script
@@ -928,6 +937,7 @@ main() {
928
937
  implement-ticket) cmd_implement_ticket "$@" ;;
929
938
  work) cmd_work "$@" ;;
930
939
  memory:lookup) cmd_memory_lookup "$@" ;;
940
+ linked-projects:list) cmd_linked_projects_list "$@" ;;
931
941
  memory:signal) cmd_memory_signal "$@" ;;
932
942
  memory:hash) cmd_memory_hash "$@" ;;
933
943
  memory:check) cmd_memory_check "$@" ;;
@@ -0,0 +1,181 @@
1
+ #!/usr/bin/env bash
2
+ # probe-audio.sh — turn a song file into a deterministic, network-free
3
+ # JSON summary the `song-to-script` skill maps to scenes:
4
+ #
5
+ # {"duration": <seconds>,
6
+ # "method": "silence" | "rms" | "interval",
7
+ # "warning": "<present only for the interval fallback>",
8
+ # "sections": [{"start":0.0,"end":12.5,"energy":0.41,"label":"intro"}, ...]}
9
+ #
10
+ # HONEST FRAMING (AI-council design review, 2026-05-30): this is energy /
11
+ # silence segmentation, NOT beat detection or musical analysis. Modern
12
+ # masters are brick-walled (near-constant RMS), so a real cut structure
13
+ # is often absent. The probe therefore degrades through three methods and
14
+ # always reports which one produced the anchors:
15
+ #
16
+ # 1. silence — ffmpeg silencedetect found real quiet gaps → true cuts.
17
+ # 2. rms — no usable silence; greedy-merge per-window RMS energy.
18
+ # 3. interval — track is structurally flat (brick-walled / sustained):
19
+ # fall back to fixed-interval cuts and SET `warning` so the
20
+ # caller (and the operator) knows timing is not musical.
21
+ #
22
+ # Sections are cut anchors, never a transcription. For beat-accurate cuts
23
+ # the operator passes `--scene-durations` to /video:from-song instead.
24
+ #
25
+ # Usage:
26
+ # probe-audio.sh <song-file> [--window <seconds>] [--interval <seconds>]
27
+ # [--silence-db <dB>] [--silence-min <seconds>]
28
+ #
29
+ # --window RMS analysis window (default 3)
30
+ # --interval fixed-interval fallback section length (default 15)
31
+ # --silence-db silencedetect noise floor (default -30)
32
+ # --silence-min silencedetect minimum gap to count as a boundary (default 0.5)
33
+ #
34
+ # Exit codes:
35
+ # 0 JSON written to stdout
36
+ # 2 usage / file missing
37
+ # 3 required tool missing (ffprobe / ffmpeg)
38
+ # 4 no audio stream in the file
39
+
40
+ set -euo pipefail
41
+
42
+ die() { printf 'probe-audio: %s\n' "$2" >&2; exit "$1"; }
43
+
44
+ [ "$#" -ge 1 ] || die 2 "usage: $0 <song-file> [--window <s>] [--interval <s>] [--silence-db <dB>] [--silence-min <s>]"
45
+
46
+ song="$1"; shift || true
47
+ window=3
48
+ interval=15
49
+ silence_db=-30
50
+ silence_min=0.5
51
+ while [ "$#" -gt 0 ]; do
52
+ case "$1" in
53
+ --window) window="${2:-3}"; shift 2 ;;
54
+ --interval) interval="${2:-15}"; shift 2 ;;
55
+ --silence-db) silence_db="${2:--30}"; shift 2 ;;
56
+ --silence-min) silence_min="${2:-0.5}"; shift 2 ;;
57
+ *) die 2 "unknown arg: $1" ;;
58
+ esac
59
+ done
60
+
61
+ [ -f "${song}" ] || die 2 "file not found: ${song}"
62
+ command -v ffprobe >/dev/null 2>&1 || die 3 "ffprobe not found"
63
+ command -v ffmpeg >/dev/null 2>&1 || die 3 "ffmpeg not found"
64
+
65
+ # --- 1. duration + audio-stream check ----------------------------------
66
+ duration="$(ffprobe -v error -select_streams a:0 \
67
+ -show_entries format=duration -of default=nk=1:nw=1 "${song}" 2>/dev/null || true)"
68
+ [ -n "${duration}" ] || die 4 "no audio stream in: ${song}"
69
+
70
+ # --- 2. per-window RMS energy via astats --------------------------------
71
+ # Slice the track into <window>-second chunks; read mean RMS level (dB),
72
+ # normalise to 0..1 where -60dB→0 and 0dB→1. These energies feed BOTH the
73
+ # rms-merge method and the per-section labelling of every method.
74
+ n_windows="$(awk -v d="${duration}" -v w="${window}" 'BEGIN{
75
+ n = int(d / w); if (n * w < d) n++; if (n < 1) n = 1; print n }')"
76
+
77
+ win_starts=""; win_energy=""
78
+ i=0
79
+ while [ "${i}" -lt "${n_windows}" ]; do
80
+ start="$(awk -v i="${i}" -v w="${window}" 'BEGIN{printf "%.3f", i*w}')"
81
+ rms_db="$(ffmpeg -hide_banner -nostats -ss "${start}" -t "${window}" -i "${song}" \
82
+ -af astats=metadata=1:reset=1 -f null - 2>&1 \
83
+ | awk -F': ' '/RMS level dB/ {v=$2} END{print v}')"
84
+ case "${rms_db}" in ""|*inf*|*nan*) rms_db=-60 ;; esac
85
+ norm="$(awk -v x="${rms_db}" 'BEGIN{
86
+ v=(x+60)/60; if(v<0)v=0; if(v>1)v=1; printf "%.3f", v }')"
87
+ win_starts="${win_starts}${start}\n"
88
+ win_energy="${win_energy}${norm}\n"
89
+ i=$((i + 1))
90
+ done
91
+
92
+ # --- 3. silencedetect boundaries ----------------------------------------
93
+ # Real quiet gaps split the track at musically-meaningful points far more
94
+ # reliably than RMS deltas on a compressed master. Collect the midpoints
95
+ # of detected silences as candidate section boundaries.
96
+ sil_bounds="$(ffmpeg -hide_banner -nostats -i "${song}" \
97
+ -af "silencedetect=noise=${silence_db}dB:d=${silence_min}" -f null - 2>&1 \
98
+ | awk '
99
+ /silence_start/ { for(i=1;i<=NF;i++) if($i=="silence_start:") s=$(i+1) }
100
+ /silence_end/ { for(i=1;i<=NF;i++) if($i=="silence_end:") { e=$(i+1); printf "%.3f\n", (s+e)/2 } }
101
+ ' 2>/dev/null || true)"
102
+ n_sil="$(printf '%s' "${sil_bounds}" | sed '/^$/d' | wc -l | tr -d ' ')"
103
+
104
+ # --- 4. choose method + build section boundaries ------------------------
105
+ # A method needs >= 3 sections (>= 2 internal boundaries) to count as
106
+ # "structure found"; otherwise degrade to the next method.
107
+ method=""
108
+ boundaries="" # internal cut points (excluding 0 and duration)
109
+
110
+ if [ "${n_sil}" -ge 2 ]; then
111
+ method="silence"
112
+ boundaries="$(printf '%s\n' "${sil_bounds}" | sed '/^$/d' \
113
+ | awk -v d="${duration}" '$1>0.5 && $1<d-0.5' | sort -n | uniq)"
114
+ fi
115
+
116
+ if [ -z "${method}" ]; then
117
+ # greedy-merge adjacent RMS windows; keep a boundary on energy delta > 0.12
118
+ rms_bounds="$(paste <(printf '%b' "${win_starts}") <(printf '%b' "${win_energy}") \
119
+ | awk -v d="${duration}" '
120
+ { st[NR]=$1; en[NR]=$2; cnt=NR }
121
+ END {
122
+ prev=en[1]
123
+ for(k=2;k<=cnt;k++){
124
+ if ((en[k]-prev>0.12)||(prev-en[k]>0.12)) { if(st[k]>0.5 && st[k]<d-0.5) print st[k] }
125
+ prev=en[k]
126
+ }
127
+ }')"
128
+ n_rms="$(printf '%s' "${rms_bounds}" | sed '/^$/d' | wc -l | tr -d ' ')"
129
+ if [ "${n_rms}" -ge 2 ]; then
130
+ method="rms"
131
+ boundaries="$(printf '%s\n' "${rms_bounds}" | sed '/^$/d' | sort -n | uniq)"
132
+ fi
133
+ fi
134
+
135
+ warning=""
136
+ if [ -z "${method}" ]; then
137
+ method="interval"
138
+ warning="track is structurally flat (no usable silence or energy structure); sections are fixed ${interval}s intervals, not musical cuts"
139
+ boundaries="$(awk -v d="${duration}" -v iv="${interval}" 'BEGIN{
140
+ for(t=iv; t<d-0.5; t+=iv) printf "%.3f\n", t }')"
141
+ fi
142
+
143
+ # --- 5. assemble sections, label, emit JSON -----------------------------
144
+ # Boundaries → [0, b1, b2, ..., duration] section edges. Energy per section
145
+ # = mean of the RMS windows whose start falls inside it.
146
+ printf '%s' "${boundaries}" \
147
+ | sed '/^$/d' \
148
+ | awk -v d="${duration}" -v method="${method}" -v warning="${warning}" \
149
+ -v wins="$(printf '%b' "${win_starts}")" -v ens="$(printf '%b' "${win_energy}")" '
150
+ BEGIN {
151
+ nw=split(wins, ws, "\n"); split(ens, es, "\n")
152
+ # build edges
153
+ ne=0; edges[ne++]=0
154
+ }
155
+ { edges[ne++]=$1+0 }
156
+ END {
157
+ edges[ne++]=d+0
158
+ # mean energy across all windows for relative labelling
159
+ sum=0; c=0
160
+ for(k=1;k<=nw;k++){ if(ws[k]!=""){ sum+=es[k]; c++ } }
161
+ mean=(c?sum/c:0)
162
+ printf "{\"duration\": %.3f, \"method\": \"%s\"", d, method
163
+ if (warning != "") { gsub(/"/,"\\\"",warning); printf ", \"warning\": \"%s\"", warning }
164
+ printf ", \"sections\": ["
165
+ segs=ne-1
166
+ for(j=0;j<segs;j++){
167
+ s=edges[j]; e=edges[j+1]
168
+ # mean energy of windows starting within [s,e)
169
+ es_sum=0; es_c=0
170
+ for(k=1;k<=nw;k++){ if(ws[k]!=""){ if(ws[k]+0>=s && ws[k]+0<e){ es_sum+=es[k]; es_c++ } } }
171
+ energy=(es_c?es_sum/es_c:mean)
172
+ if (j==0) label="intro"
173
+ else if (j==segs-1) label=(energy<mean?"outro":"drop")
174
+ else if (energy>=mean+0.10) label="drop"
175
+ else if (energy<=mean-0.10) label="breakdown"
176
+ else label="build"
177
+ sep=(j<segs-1)?",":""
178
+ printf "{\"start\": %.3f, \"end\": %.3f, \"energy\": %.3f, \"label\": \"%s\"}%s", s, e, energy, label, sep
179
+ }
180
+ print "]}"
181
+ }'
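
For readers who find the awk hard to follow, the energy normalisation (step 2) and the section labelling (step 5) of the script above can be restated in Python. This is an illustrative restatement only; the function names `normalise_rms` and `label_sections` are hypothetical and the package itself ships the Bash/awk version.

```python
def normalise_rms(rms_db: float) -> float:
    """Map an RMS level in dB to 0..1, where -60 dB maps to 0 and 0 dB maps to 1 (clamped)."""
    return min(1.0, max(0.0, (rms_db + 60.0) / 60.0))


def label_sections(energies, mean):
    """Label each section from its energy relative to the track-wide mean.

    Mirrors the awk rules: first section is "intro"; the last is "outro" if it
    is quieter than the mean, else "drop"; middle sections are "drop",
    "breakdown", or "build" using a 0.10 margin around the mean.
    """
    labels = []
    last = len(energies) - 1
    for j, energy in enumerate(energies):
        if j == 0:
            labels.append("intro")
        elif j == last:
            labels.append("outro" if energy < mean else "drop")
        elif energy >= mean + 0.10:
            labels.append("drop")
        elif energy <= mean - 0.10:
            labels.append("breakdown")
        else:
            labels.append("build")
    return labels
```

As in the script, the labels describe relative energy only; they are cut anchors, not a musical analysis.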