@event4u/agent-config 5.4.1 → 5.6.0
- package/.agent-src/commands/image/analyse.md +51 -0
- package/.agent-src/commands/image/create.md +53 -0
- package/.agent-src/commands/image/verify.md +48 -0
- package/.agent-src/commands/image.md +69 -0
- package/.agent-src/commands/knowledge/cross-repo.md +71 -0
- package/.agent-src/commands/knowledge.md +2 -0
- package/.agent-src/commands/skill/preview.md +67 -0
- package/.agent-src/commands/skill.md +48 -0
- package/.agent-src/commands/skills/discover.md +76 -0
- package/.agent-src/commands/skills.md +56 -0
- package/.agent-src/commands/video/from-song.md +351 -0
- package/.agent-src/commands/video.md +19 -9
- package/.agent-src/contexts/authority/commit-mechanics.md +8 -0
- package/.agent-src/rules/commit-policy.md +3 -8
- package/.agent-src/rules/linked-projects-onboarding-gate.md +1 -1
- package/.agent-src/rules/media-sync-ground-truth.md +58 -0
- package/.agent-src/skills/image-analyser/SKILL.md +121 -0
- package/.agent-src/skills/image-analyser/canon-spec.md +109 -0
- package/.agent-src/skills/image-analyser/evals/triggers.json +16 -0
- package/.agent-src/skills/image-creator/SKILL.md +117 -0
- package/.agent-src/skills/image-creator/evals/triggers.json +16 -0
- package/.agent-src/skills/song-to-script/SKILL.md +216 -0
- package/.claude-plugin/marketplace.json +15 -2
- package/CHANGELOG.md +84 -0
- package/CONTRIBUTING.md +6 -0
- package/README.md +3 -3
- package/config/agent-settings.template.yml +18 -0
- package/dist/cli/registry.js +1 -0
- package/dist/cli/registry.js.map +1 -1
- package/dist/discovery/deprecation-report.md +1 -1
- package/dist/discovery/discovery-manifest.json +327 -20
- package/dist/discovery/discovery-manifest.json.sha256 +1 -1
- package/dist/discovery/discovery-manifest.summary.md +4 -4
- package/dist/discovery/orphan-report.md +1 -1
- package/dist/discovery/packs.json +24 -10
- package/dist/discovery/trust-report.md +3 -3
- package/dist/discovery/workspaces.json +20 -6
- package/dist/mcp/registry-manifest.json +3 -3
- package/dist/router.json +1 -1
- package/dist/server/schemas/settings.js +4 -0
- package/dist/server/schemas/settings.js.map +1 -1
- package/docs/architecture.md +3 -3
- package/docs/catalog.md +20 -6
- package/docs/contracts/benchmark-report-schema.md +12 -10
- package/docs/contracts/command-clusters.md +5 -1
- package/docs/contracts/cross-repo-retrieval.md +64 -0
- package/docs/contracts/rule-router.md +39 -0
- package/docs/contracts/skill-discovery.md +80 -0
- package/docs/contracts/skill-dry-run.md +47 -0
- package/docs/contracts/value-dashboard-spec.md +7 -3
- package/docs/contracts/value-report-schema.md +6 -1
- package/docs/decisions/ADR-032-linked-projects-scope.md +7 -3
- package/docs/getting-started.md +2 -2
- package/docs/guides/cross-repo-linked-projects.md +7 -0
- package/docs/guides/cross-repo-retrieval.md +61 -0
- package/docs/guides/skill-discovery.md +71 -0
- package/docs/guides/skill-preview.md +71 -0
- package/docs/value.md +17 -17
- package/package.json +1 -1
- package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
- package/scripts/_dispatch.bash +10 -0
- package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
- package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
- package/scripts/_lib/bench_report.py +13 -14
- package/scripts/_lib/bench_telegraph_report.py +1 -2
- package/scripts/_lib/token_count.py +95 -0
- package/scripts/_lib/value_report.py +3 -3
- package/scripts/ai-video/adapters/higgsfield.sh +163 -6
- package/scripts/ai-video/adapters/openai-images.sh +92 -6
- package/scripts/ai-video/lib/probe-audio.sh +181 -0
- package/scripts/audit_auto_rules.py +22 -6
- package/scripts/audit_command_surface.py +6 -1
- package/scripts/audit_initial_context.py +210 -0
- package/scripts/bench_ab_diff.py +4 -11
- package/scripts/bench_run.py +2 -3
- package/scripts/bench_runner.py +2 -2
- package/scripts/condense.py +44 -3
- package/scripts/cross_repo_retrieve.py +172 -0
- package/scripts/inventory_meta_layers.py +288 -0
- package/scripts/iron_law_sha.py +14 -5
- package/scripts/linked_projects_list.py +91 -0
- package/scripts/measure_rule_budget.py +15 -0
- package/scripts/memory_lookup.py +53 -2
- package/scripts/project_thin_rules.py +168 -0
- package/scripts/render_value_md.py +14 -23
- package/scripts/schemas/command.schema.json +1 -1
- package/scripts/schemas/rule.schema.json +1 -1
- package/scripts/schemas/skill.schema.json +2 -2
- package/scripts/skill_discovery.py +254 -0
- package/scripts/skill_linter.py +8 -4
- package/scripts/skill_preview.py +179 -0
- package/scripts/trigger_coverage.py +129 -0
@@ -82,25 +82,27 @@ verdict:
 Headers in order:
 
 1. `# Benchmark Report — <corpus_id> · <generated_at>`
-2. `## Headline` — three-line summary (selection ·
+2. `## Headline` — three-line summary (selection · tokens · quality).
 3. `## Selection accuracy` — table per prompt with hit/miss + expected/got.
-4. `##
-session jsonl was found.
+4. `## Token usage` — per-tier message counts + token totals; "unavailable"
+block if no session jsonl was found. The monetary (USD) comparison is
+**intentionally not rendered** — per-call API pricing misleads
+subscription users; tokens are the currency-neutral metric that matters.
 5. `## Quality probe` — per-prompt assertion pass/fail; `not_collected`
 block when no agent-output path was passed.
-6. `## Notes` —
-versioned filename for citation.
+6. `## Notes` — `corpus path` and the versioned filename for citation.
 
 ## Invariants
 
-- **No silent drops.** Missing
-
+- **No silent drops.** Missing token source → emit `source: unavailable`
+with a marker; never omit the section.
 - **Quality stub honesty.** When agent outputs are not provided, set
 `quality.source: not_collected` and `verdict.overall: partial`. Score
 stays `0.0`; never inflate by assuming pass.
-- **
-
-
+- **Tokens, not money.** The rendered report shows token counts only. The
+JSON still carries the `cost` block (`total_cost_usd`, per-tier `cost_usd`)
+for back-compat with downstream consumers, but no USD figure is rendered in
+the Markdown headline, sections, or footer.
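The "No silent drops" and "Tokens, not money" invariants in the added lines above could be sketched roughly as follows. This is a minimal illustration, not the package's renderer: the function name `render_token_usage` and the `tokens`, `tiers`, `name`, `messages` keys are hypothetical, and only the `cost` / `total_cost_usd` names and the `source: unavailable` marker come from the diff.

```python
from typing import Any, Dict


def render_token_usage(report: Dict[str, Any]) -> str:
    """Render a '## Token usage' section; the `cost` block is never read."""
    lines = ["## Token usage", ""]
    tokens = report.get("tokens")  # hypothetical key for the token block
    if not tokens or tokens.get("source") == "unavailable":
        # No silent drops: the section stays, marked as unavailable.
        lines.append("> source: unavailable")
        return "\n".join(lines)
    lines.append("| tier | messages | tokens |")
    lines.append("|---|---|---|")
    for tier in tokens.get("tiers", []):  # hypothetical per-tier layout
        lines.append(f"| {tier['name']} | {tier['messages']} | {tier['tokens']} |")
    # Tokens, not money: report["cost"] may exist in the JSON for
    # back-compat, but nothing from it reaches the Markdown.
    return "\n".join(lines)
```

Because the renderer never touches `report["cost"]`, the JSON can keep its USD fields for downstream consumers without any of them leaking into the Markdown.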
 
 ## Cross-references
 
@@ -49,7 +49,11 @@ column 1 of this table.
 | `sync-gitignore` | — | `fix` | new sub-command 2026-05-11 — cluster head retains the existing append-only sync as its default flow; `:fix` adds `--cleanup-legacy` semantics, scrubbing pre-`/agents/` runtime patterns wherever they appear in the consumer's `.gitignore` (inside or outside the managed block) and re-syncing the canonical entries. |
 | `ghostwriter` | — | `fetch` · `write` · `list` · `show` · `delete` | new cluster 2026-05-15 — third voice primitive for AI-assisted writing in the public voice of a public figure. Hybrid storage: real-person profiles live consumer-side under `agents/reference/ghostwriter/<slug>.md` (gitignored by default); package source ships only `fictional: true` fixtures. Zero network code in the package — `:fetch` delegates web-fetch / web-search to the host agent. Mandatory disclosure footer on every `:write` output (no opt-out). Schema: [`ghostwriter-schema`](ghostwriter-schema.md). |
 | `post-as` | — | `me` · `ghostwriter` | new cluster 2026-05-15 — consumer-facing write entry points. `:me` reads `.agent-user.md.voice_sample` and drafts in the maintainer's own voice (no disclosure footer — the user is the author); `:ghostwriter` is a thin alias for `/ghostwriter:write` with the mandatory disclosure footer. Both share the procedural [`write-engine`](write-engine.md) contract — style source and footer are the only axes of variation. |
-| `video` | — | `from-script` · `scene` · `storyboard` · `stitch` | new cluster 2026-05-17 — AI video generation pipeline. Cluster head orchestrates the full flow; `:scene` runs a single scene end-to-end (script → blueprint → still → motion → clip); `:storyboard` expands a script into per-scene blueprints + reference stills with character-lock JSON; `:from-script` walks a multi-scene script through storyboard + per-scene generation; `:stitch` concatenates scene clips with `ffmpeg` against a scene manifest. Provider-agnostic via the adapter contract under `scripts/ai-video/lib/adapter-contract.md`; cost-gated with mandatory `AIV_DRYRUN=true` default and explicit confirmation before live provider calls. |
+| `video` | — | `from-script` · `from-song` · `scene` · `storyboard` · `stitch` | new cluster 2026-05-17 — AI video generation pipeline. Cluster head orchestrates the full flow; `:scene` runs a single scene end-to-end (script → blueprint → still → motion → clip); `:storyboard` expands a script into per-scene blueprints + reference stills with character-lock JSON; `:from-script` walks a multi-scene script through storyboard + per-scene generation; `:from-song` builds a music-video from a song + reference images — derived/briefed timed script (via the `song-to-script` skill + `probe-audio.sh` hybrid segmentation), optional character-lock, then stitch with the song muxed as the master track and a mandatory AI-generation disclosure; `:stitch` concatenates scene clips with `ffmpeg` against a scene manifest. Provider-agnostic via the adapter contract under `scripts/ai-video/lib/adapter-contract.md`; cost-gated with mandatory `AIV_DRYRUN=true` default and explicit confirmation before live provider calls. |
+| `knowledge` | — | `ingest` · `list` · `forget` · `cross-repo` | local-knowledge namespace (employee-product Phase 7); `:cross-repo` added 2026-05-30 (`road-to-leaner-core-and-discovery` Phase 4) — read-only targeted retrieval over opted-in `linked_projects` siblings (ADR-032 Option A), per [`cross-repo-retrieval`](cross-repo-retrieval.md). The pre-existing `ingest`/`list`/`forget` sub-commands are recorded in the registry here for the first time. |
+| `skills` | — | `discover` | new cluster 2026-05-30 (`road-to-leaner-core-and-discovery` Phase 3) — local, explained skill-recommendation surface over the catalog + role shortlists + optional local analytics, per [`skill-discovery`](skill-discovery.md). Every result carries a non-empty `why`; no network, honours the analytics opt-out. |
+| `skill` | — | `preview` | new cluster 2026-05-30 (`road-to-leaner-core-and-discovery` Phase 5) — non-destructive skill/command preview: surfaces the declared steps + files/commands a skill would touch before it runs, per [`skill-dry-run`](skill-dry-run.md). Singular `skill` (one target) vs plural `skills` (the catalog) by design. |
+| `image` | — | `analyse` · `create` · `verify` | new cluster 2026-05-31 (`road-to-character-image-fidelity` Phase 4) — character-image fidelity surface mirroring `/video:*`. `:analyse` extracts a per-feature spec from an image and diffs it against a Canon Spec down to the smallest mole (OCR for lettered tattoos, per-section severity scores, canon-breaking hard gate); `:create` assembles a max-fidelity anchors-first generation prompt from the Canon Spec, governance- + provider-gated, `AIV_DRYRUN=true` default; `:verify` runs the analyser in loop mode against a candidate and reports the gate verdict + remaining diff with plateau/oscillation/budget stop conditions. Skills: [`image-analyser`](../../skills/image-analyser/SKILL.md) + [`image-creator`](../../skills/image-creator/SKILL.md); schema/rubric/loop in [`canon-spec.md`](../../skills/image-analyser/canon-spec.md). |
 
 **Net change:** Phase 1 collapsed 15 atomics → 3 clusters; Phase 2
 collapses 26 atomics → 11 sub-command clusters. Sub-commands use
@@ -0,0 +1,64 @@
+---
+stability: experimental
+---
+
+# Cross-Repo Retrieval Contract
+
+> **Status** · v0 / design · 2026-05-30. Phase 4 of `road-to-leaner-core-and-discovery`.
+> Extends [ADR-032](../decisions/ADR-032-linked-projects-scope.md) **Option A** (passive,
+> read-only, opt-in-per-sibling, no bulk inclusion). Does **not** advance to Option B (auto-scan)
+> or Option C (implicit inclusion).
+
+## Problem
+
+The ADR-032 detector finds IDE-attached sibling repos but today only produces a passive *awareness
+note* — the agent knows a sibling exists but cannot pull context from it. This contract makes opted-in
+siblings a **read-only, targeted retrieval source**: the agent can fetch a shared type, an API contract
+the frontend consumes, or a config the sibling owns, **without bulk-including** its files.
+
+## Scope guards (Option A, non-negotiable)
+
+```
+READ-ONLY. OPT-IN-PER-SIBLING. TARGETED QUERY, NEVER A FULL-TREE SWEEP.
+```
+
+- **Read-only.** No writes to any sibling. Out-of-root writes still pass the host permission gate;
+this surface never writes.
+- **Opt-in only.** Only siblings with `include: true` in `agents/settings/.agent-settings.local.yml`
+→ `linked_projects[]` are read. A sibling not opted in is never touched.
+- **Targeted query only.** Every retrieval is a bounded path-glob + content grep — never a blind walk.
+A `large`-flagged sibling (per the detector) **requires a path scope** and rejects an unscoped query.
+- **Bounded.** ≤ `max_chunks` results per query (default 8). One concept per query.
+
+## Retrieval envelope
+
+Each match is returned as:
+
+```json
+{ "source_repo": "<sibling dir name>", "path": "<rel path in sibling>",
+"chunk": "<≤ 2 KB redacted excerpt>", "freshness": "<git last-commit date or mtime>",
+"match_reason": "<why this matched: path-glob or content term>" }
+```
+
+`chunk` passes the same redaction floor as `knowledge_ingest.py` — secrets and PII are scrubbed before
+any cross-repo text is surfaced. Cross-repo text never leaks a secret.
+
+## Memory integration — tagged + discounted
+
+Cross-repo matches projected into `memory_retrieve` carry `source: cross-repo` and are scored **below**
+local curated knowledge — the same 0.85× discount the `knowledge:` namespace already applies — so
+cross-repo context never outranks the project's own truth.
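One way to picture the tagged-and-discounted scoring is the sketch below. It assumes each hit is a dict with a raw `score` and a `source` tag; the `rank` helper and the hit shape are hypothetical, and only the 0.85 factor and the `cross-repo` tag come from the contract text.

```python
CROSS_REPO_DISCOUNT = 0.85  # factor named in the contract


def rank(hits):
    """Sort hits by score, discounting those tagged source == 'cross-repo'."""
    def effective(hit):
        factor = CROSS_REPO_DISCOUNT if hit.get("source") == "cross-repo" else 1.0
        return hit["score"] * factor

    return sorted(hits, key=effective, reverse=True)
```

Under this sketch a cross-repo hit with the same raw score as a local hit always sorts below it. How the real `memory_retrieve` combines the discount with its other scoring terms is not shown in the diff.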
+
+## Surfaces
+
+- CLI: `agent-config linked-projects:list` — prints opted-in siblings (`path · detected_via · large`).
+Closes the ADR-032 follow-up "expose the detector as a CLI subcommand for consumer reach."
+- CLI / agent: `/knowledge:cross-repo <query>` — renders matches as `source_repo · path · freshness · why`.
+Honours opt-out (a sibling not `include: true` is never read); inert with a clear message when no
+siblings are opted in.
+
+## Implementation
+
+`scripts/cross_repo_retrieve.py` (≤ 300 LOC). Pure-local, read-only, no network. Reuses the chunking +
+redaction floor from `knowledge_ingest.py`. Coverage: `tests/test_cross_repo_retrieve.py` against
+fixture sibling repos under `tests/fixtures/cross-repo/`.
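The scope guards and the retrieval envelope from this contract might fit together along the lines below. This is a sketch under stated assumptions, not the contents of `scripts/cross_repo_retrieve.py`: the sibling dict shape (`path`, `include`, `large`), the `retrieve` signature, the excerpt window, and above all the toy `redact` regex are hypothetical stand-ins. The real redaction floor lives in `knowledge_ingest.py` and is not shown in the diff. Freshness uses file mtime here, one of the two options the envelope allows.

```python
from __future__ import annotations

import re
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path

MAX_CHUNK_BYTES = 2048  # "<= 2 KB" excerpt from the envelope


@dataclass
class Match:
    source_repo: str
    path: str
    chunk: str
    freshness: str
    match_reason: str


def redact(text: str) -> str:
    """Toy stand-in for the real redaction floor (assumption, not the real rules)."""
    return re.sub(r"(?i)(api[_-]?key|token|password)\s*[:=]\s*\S+", r"\1=<redacted>", text)


def retrieve(sibling: dict, term: str, path_scope: str | None = None,
             max_chunks: int = 8) -> list[dict]:
    if sibling.get("include") is not True:
        return []  # opt-in only: a sibling that is not opted in is never read
    if sibling.get("large") and not path_scope:
        raise ValueError("large sibling requires a path scope")
    root = Path(sibling["path"])
    matches: list[dict] = []
    for file in sorted(root.glob(path_scope or "**/*")):
        if len(matches) >= max_chunks:
            break  # bounded: at most max_chunks results per query
        if not file.is_file():
            continue
        try:
            text = file.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        idx = text.find(term)
        if idx < 0:
            continue
        excerpt = redact(text[max(0, idx - 200): idx + 800])
        excerpt = excerpt.encode("utf-8")[:MAX_CHUNK_BYTES].decode("utf-8", "ignore")
        mtime = datetime.fromtimestamp(file.stat().st_mtime, tz=timezone.utc)
        matches.append(asdict(Match(
            source_repo=root.name,
            path=file.relative_to(root).as_posix(),
            chunk=excerpt,
            freshness=mtime.date().isoformat(),
            match_reason=f"content term: {term}",
        )))
    return matches
```

Nothing in the sketch opens a file for writing, which is the simplest way to keep the read-only guard honest.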
@@ -123,6 +123,45 @@ The host agent reads `dist/router.json` once per session. Per turn:
 No runtime profile resolution — the profile is fixed at session
 start, the router lookup is keyword/phrase/path/intent matching only.
 
+## Kill-switch — thin-projection rollback (lean-initial-context Phase 2.3)
+
+Phase 3 of the lean-initial-context migration makes the per-tool projector
+emit the kernel full-bodied and every non-kernel rule as a one-line
+router-resolved pointer. That is the suite's biggest behavioural change, so
+it ships behind a **single documented flip** that restores today's
+full-eager projection:
+
+```yaml
+# .agent-settings.yml
+lean_projection:
+  # thin = kernel full-bodied + non-kernel rules as router pointers (Phase 3)
+  # eager-all = every rule body inlined into every projection (today's behaviour)
+  mode: eager-all # DEFAULT until Phase 3.1 ships + its benchmark gate is green
+```
+
+Revert procedure (one flip, no code change): set `lean_projection.mode:
+eager-all`, run `task generate-tools` (regenerates `.claude/`, `.cursor/`,
+`.clinerules/`, `.windsurfrules`) + `task sync` (`.agent-src/`, `.augment/`).
+The thin projector (Phase 3.1) MUST honour this key; with it absent or
+`eager-all` the projector behaves exactly as today. Default stays
+`eager-all` so the migration is opt-in and reversible by one line.
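The kill-switch rule ("absent or `eager-all` behaves exactly as today") could be resolved along these lines. The sketch assumes the settings file has already been parsed into a dict. The helper name `resolve_projection_mode` is hypothetical, and rejecting unknown values is an assumption, since the contract only specifies the `thin` and `eager-all` cases.

```python
VALID_MODES = {"thin", "eager-all"}
DEFAULT_MODE = "eager-all"  # the contract's default until Phase 3.1 ships


def resolve_projection_mode(settings: dict) -> str:
    """Return the projection mode, treating a missing key as eager-all."""
    block = settings.get("lean_projection") or {}
    mode = block.get("mode", DEFAULT_MODE)
    if mode not in VALID_MODES:
        # Assumption: fail loudly rather than guess at an unknown mode.
        raise ValueError(f"unknown lean_projection.mode: {mode!r}")
    return mode
```

Keeping the default in one constant means a rollback stays a one-line settings change, with no code path that silently prefers `thin`.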
+
+### Staleness guard — `src → dist`
+
+A projection or router that drifts from source silently re-introduces the
+eager bytes (or a missing pointer target). Three CI gates enforce
+`src == dist`, all already wired into `task ci`:
+
+- `task check-router` (`compile_router.py --check`) — `dist/router.json`
+must equal a fresh compile from frontmatter `triggers:`/`routes_to:`.
+- `task check-artefact-checksums` — every artefact's committed checksum
+must match its current source bytes.
+- `task lint-projection-fidelity` — the per-tool projections must match
+what the projector would emit from source.
+
+The thin projector inherits all three: a thin projection whose recorded
+source hash ≠ current source fails CI before it can ship a stale pointer.
+
 ## Linter contract (Phase 3.3)
 
 `scripts/skill_linter.py` extension enforces:
@@ -0,0 +1,80 @@
+---
+stability: experimental
+---
+
+# Skill Discovery Contract
+
+> **Status** · v0 / design · 2026-05-30. Phase 3 of `road-to-leaner-core-and-discovery`.
+> **Local-only.** Mirrors [`local-analytics.md`](local-analytics.md): no network egress, no POST,
+> no remote Worker. The recommender reads local files only and honours the analytics opt-out.
+
+## Problem
+
+The package ships 220 skills. Both council members named "218-skill paralysis" as the dominant
+discoverability risk. This contract defines a **recommendation surface** that turns existing signals
+into a short, *explained* shortlist — and explicitly **reuses** signals already on disk. It adds **no**
+new always-loaded layer (that would fail the Phase-1 leaner-core premise).
+
+## Input signals (all local, all already on disk)
+
+| Signal | Source | Used for |
+|---|---|---|
+| Skill catalog | `.agent-src/skills/*/SKILL.md` frontmatter (`name`, `description`, `domain`) | candidate universe + `domain` category |
+| Role shortlist | `agents/roles/<role>/skills.yml` (priority-ordered `id` + `why`) | `most-useful-for-role` |
+| Local analytics | `~/.event4u/agent-config/workspace/analytics/events.jsonl` (`event`, `data.role`, `data.task`, optional `data.skill`) | `recently-adopted`, `popular-in-role` |
+
+The role `skills.yml` is the strongest signal and is always present; analytics is optional and
+degrades gracefully (below).
+
+## Four recommendation classes
+
+| Class | Ranking basis | `why` shape |
+|---|---|---|
+| `most-useful-for-role` | role `skills.yml` priority order | the shortlist's own `why:` line |
+| `related-to-current-task` | skills sharing the `domain` of the role's shortlist skills, not already shortlisted | `same domain (<domain>) as your role's core skills` |
+| `recently-adopted` | analytics events in the last 14 days carrying a skill id (`data.skill`), most-recent first | `used <N>d ago in this workspace` |
+| `popular-in-role` | analytics skill-events filtered by `data.role`, by frequency | `launched <N>× by the <role> role locally` |
+
+## Explanation requirement — non-negotiable
+
+```
+EVERY RECOMMENDATION CARRIES A NON-EMPTY `why`. NEVER AN UNEXPLAINED SCORE.
+```
+
+Both council members flagged "opaque / self-referential recommendations without real usage signal" as
+the main risk. A result with no `why` is a contract violation. The `why` names the *signal* (role match,
+domain adjacency, recent-adoption, role-popularity) — never a bare number.
+
+## Graceful degradation — analytics absent or opted out
+
+Analytics is optional. When the JSONL file is missing, empty, or the opt-out is set
+(`AGENT_CONFIG_NO_LOCAL_ANALYTICS` env or `analytics.local: off` config — same checks as
+`local-analytics.md`), the two analytics-backed classes do not fabricate signal:
+
+- `recently-adopted` and `popular-in-role` fall back to **role-shortlist order** with an honest `why`
+(`from your role shortlist — no local usage signal yet`).
+- `most-useful-for-role` and `related-to-current-task` are unaffected (catalog + role only).
+
+The recommender therefore always returns a useful, explained list — even on a fresh machine with no
+analytics history. Today's analytics schema logs `data.task` (not skill ids); the skill-level classes
+read the forward-compatible `data.skill` field and degrade to the role-shortlist fallback until it is
+populated. No class ever returns an empty `why`.
+
+## Local-only / no-network floor
+
+The recommender opens local files only. It performs no network I/O, writes nothing, and never emits a
+prompt or response body. It is read-only over the catalog, the role file, and (if present) the analytics
+log. This mirrors `local-analytics.md` and does not lift the 3.1.0 Hard-Floor.
+
+## Surfaces
+
+- CLI / agent: `/skills:discover [role]` → Markdown table (`skill · class · why · first command`).
+Defaults to the active role experience when one is set; otherwise prompts for a role.
+- GUI: a right-rail "Suggested skills" strip on the Workspace tab, reusing the `/api/v1/workspace/*`
+bridge (no new infra). Deferrable behind the CLI surface if the employee-roadmap right-rail blocker
+is still open.
+
+## Implementation
+
+`scripts/skill_discovery.py` (≤ 300 LOC). Pure-local, no POST. Honours the analytics opt-out env + config.
+Coverage: `tests/test_skill_discovery.py` against a fixture catalog + fixture analytics JSONL.
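The graceful-degradation rule for the `recently-adopted` class could look roughly like this. It is a sketch, not `scripts/skill_discovery.py`: the function names, the event timestamp field `ts` (assumed ISO 8601), and the rule that any non-empty `AGENT_CONFIG_NO_LOCAL_ANALYTICS` value counts as an opt-out are all assumptions. The two `why` strings and the 14-day window are taken from the contract. A YAML 1.1 parser such as PyYAML reads an unquoted `off` as boolean `False`, so the config check accepts both.

```python
from datetime import datetime, timedelta

FALLBACK_WHY = "from your role shortlist — no local usage signal yet"
WINDOW = timedelta(days=14)


def analytics_enabled(env: dict, config: dict) -> bool:
    """Honour the opt-out env var and the `analytics.local: off` config key."""
    if env.get("AGENT_CONFIG_NO_LOCAL_ANALYTICS"):
        return False
    local = (config.get("analytics") or {}).get("local")
    return local not in ("off", False)  # YAML 1.1 parses bare `off` as False


def recently_adopted(shortlist, events, now, enabled=True):
    """Return recently used skills, or the role shortlist with an honest why."""
    latest = {}
    if enabled:
        for event in events:
            skill = (event.get("data") or {}).get("skill")
            if not skill or "ts" not in event:
                continue  # today's schema logs data.task only
            age = now - datetime.fromisoformat(event["ts"])
            if timedelta(0) <= age <= WINDOW:
                latest[skill] = min(age, latest.get(skill, age))
    if latest:
        ordered = sorted(latest.items(), key=lambda item: item[1])
        return [{"skill": skill, "class": "recently-adopted",
                 "why": f"used {age.days}d ago in this workspace"}
                for skill, age in ordered]
    # No usable signal: fall back to shortlist order, never an empty why.
    return [{"skill": entry["id"], "class": "recently-adopted", "why": FALLBACK_WHY}
            for entry in shortlist]
```

Both branches build every result with a `why`, so the "no unexplained score" requirement holds by construction rather than by a later check.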
@@ -0,0 +1,47 @@
+---
+stability: experimental
+---
+
+# Skill Dry-Run / Preview Contract
+
+> **Status** · v0 / design · 2026-05-30. Phase 5 of `road-to-leaner-core-and-discovery`.
+> The council's missing-item catch: with 220 skills, non-dev personas need a non-destructive way
+> to see what a skill/command will do **before** running it.
+
+## What "preview" means
+
+A preview reads a skill's **declared intent** — its frontmatter and `## Steps` body — and renders a
+plain-language "this skill will…" summary. It surfaces:
+
+- the skill's **declared steps** (the `## Steps` section headings);
+- its **execution type** (`manual` / `assisted` / `automated`, default `manual`) and **handler**
+(`none` / `shell` / `php` / `node` / `internal`);
+- its declared **`allowed_tools`**;
+- any **file or command targets** named in the body (backtick paths, `python3 scripts/…` invocations).
+
+## Explicit non-goals
+
+```
+PREVIEW IS NOT A SANDBOX. IT DOES NOT EXECUTE A FENCED COPY OF THE SKILL.
+IT IS NOT A GUARANTEE OF SIDE-EFFECT-FREENESS FOR SKILLS WITH AN `execution` BLOCK.
+```
+
+Preview reads declared intent — it does not run the skill, does not dry-run its commands, and cannot
+prove a skill is harmless. It tells you what the skill *says* it will touch, so you can decide whether
+to run it. For `execution: manual` skills (the default), it states plainly: **instructional only — no
+automatic execution** (per [`runtime-safety`](../../.agent-src/rules/runtime-safety.md): `manual` is
+instructional, `assisted` must propose before executing).
+
+## Surface
+
+- CLI / agent: `/skill:preview <name>` — plain-language summary by default; `--technical` shows the raw
+frontmatter + step list.
+- Script: `scripts/skill_preview.py <name> [--technical] [--format text|json]`.
+
+Plain-language mode reuses the plain-explain tone (employee-roadmap Phase 6). A malformed or missing
+SKILL.md degrades to a **structured error**, never a crash.
+
+## Implementation
+
+`scripts/skill_preview.py` (≤ 250 LOC). Read-only over `.agent-src/skills/<name>/SKILL.md`. No network,
+no execution. Coverage: `tests/test_skill_preview.py`.
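Reading declared intent without executing anything might be sketched as below. The sketch covers only two of the surfaced items, step headings and backtick targets, and makes several assumptions: that steps are `###` headings inside the `## Steps` section, that a "target" is any backtick span containing `/` or `.`, and that the structured error is a dict with `ok` and `error` keys. None of these shapes are confirmed by the diff, and `preview_steps` is a hypothetical name.

```python
import re


def preview_steps(skill_md: str) -> dict:
    """Extract declared steps and targets from SKILL.md text; never executes."""
    if not skill_md.strip():
        return {"ok": False, "error": "empty SKILL.md"}
    start = re.search(r"^## Steps[ \t]*$", skill_md, flags=re.MULTILINE)
    if not start:
        return {"ok": False, "error": "no '## Steps' section"}
    body = skill_md[start.end():]
    # The section ends at the next level-2 heading, if there is one.
    following = re.search(r"^## ", body, flags=re.MULTILINE)
    if following:
        body = body[:following.start()]
    steps = [h.strip() for h in re.findall(r"^###\s+(.+)$", body, flags=re.MULTILINE)]
    targets = sorted(set(re.findall(r"`([^`\n]*[/.][^`\n]*)`", body)))
    return {"ok": True, "steps": steps, "targets": targets}
```

A malformed file yields an error dict instead of an exception, which matches the "structured error, never a crash" requirement in spirit.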
@@ -267,9 +267,13 @@ copies it verbatim into the dashboard.
 Saves output tokens — when the corpus rewards it.
 - **Ohne Paket / Mit Paket** — "without the package" /
 "with the package" — the two arms of the A/B comparison.
--
-
-
+- **Δ Tokens** — input-token difference per request vs. the baseline.
+The rendered dashboard reports cost in **tokens only** — no € figure.
+A €/USD comparison would assume per-call API pricing, which the many
+users on subscriptions do not pay; tokens are the currency-neutral
+metric. The `eur_delta` fields remain in the JSON for back-compat but
+are not rendered. (Historical € figures elsewhere in this spec are
+dated examples, kept as record.)
 
 ## Honest baseline appendix
 
@@ -80,10 +80,15 @@ totals:
 cumulative_pct: <signed float> # net % of baseline
 net_verdict: net-saving | net-cost | break-even # by sign of cumulative_pct
 notes:
-- "
+- "Cost is reported in tokens only — no € figure (API pricing misleads subscription users)."
 - "<other invariants surfaced as plain prose>"
 ```
 
+> **Rendering note.** The `eur_delta` / `cumulative_eur_delta` /
+> `pricing_sourced_on` fields stay in the JSON for back-compat, but the
+> rendered dashboard (`docs/value.md`) shows **tokens only** — no € column,
+> no €-per-1k figure, no NETTO € line. See `scripts/render_value_md.py`.
+
 ## Invariants
 
 - **No silent drops.** Missing input → emit the rung with
@@ -99,9 +99,13 @@ telemetry.
 
 ## Open follow-ups
 
-- **Consumer detector reachability:**
-
-
+- **Consumer detector reachability:** ✅ **Closed (2026-05-30, `road-to-leaner-core-and-discovery`
+Phase 4).** The detector is now exposed as `agent-config linked-projects:list`
+(`scripts/linked_projects_list.py`, registered in `src/cli/registry.ts` + `scripts/_dispatch.bash`),
+wrapping `scripts/_lib/linked_projects.detect_linked_projects` + the `.agent-settings.local.yml`
+opt-in cascade. Cross-repo *retrieval* over the opted-in siblings ships alongside it
+(`/knowledge:cross-repo`, `scripts/cross_repo_retrieve.py`) per
+[`cross-repo-retrieval`](../contracts/cross-repo-retrieval.md).
 - **Multi-agent verification:** only Claude Code was empirically validated.
 Cursor / Augment / Copilot are unverified — the guide's manual snippet covers
 them until an interactive per-IDE test is run.
package/docs/getting-started.md
CHANGED
@@ -129,7 +129,7 @@ Your agent is now:
 - **Respecting your codebase** — no conflicting patterns
 - **Following standards** — consistent code quality
 
-This is enforced automatically by
+This is enforced automatically by 79 rules. No configuration needed.
 
 ---
 
@@ -169,7 +169,7 @@ Your agent now understands slash commands:
 | `/quality-fix` | Run and fix all quality checks |
 | `/chat-history` | Inspect the persistent chat-history log (read-only `show`) |
 
-→ [Browse all
+→ [Browse all 145 active commands](../.agent-src/commands/)
 
 ---
 
@@ -77,6 +77,13 @@ If it reports the name, cross-repo access works. An out-of-root edit will prompt
 for confirmation, then succeed — that is expected (the agent's permission gate
 still applies).
 
+## Next: pull context from a sibling
+
+Detection makes the agent *aware* of a sibling. To have it **read** targeted
+context from one — a shared type, an API contract, a config — without copying
+the sibling's files in, see [Cross-repo retrieval](cross-repo-retrieval.md)
+(`agent-config linked-projects:list` + `/knowledge:cross-repo`).
+
 ## Tell us what works
 
 Auto-detection is verified for Claude Code only. If you use Cursor, Augment, or
|
|
@@ -0,0 +1,61 @@
|
|
|
1
|
+
# Cross-repo retrieval — pull sibling context without copying files
|
|
2
|
+
|
|
3
|
+
Once the agent knows about a sibling repo ([detection guide](cross-repo-linked-projects.md)),
|
|
4
|
+
cross-repo retrieval lets it **read targeted context** from that sibling — a shared type, an
|
|
5
|
+
API contract the frontend consumes, a config the sibling owns — without bulk-including the
|
|
6
|
+
sibling's files. It is the read layer on top of detection.
|
|
7
|
+
|
|
8
|
+
It stays inside [ADR-032](../decisions/ADR-032-linked-projects-scope.md) Option A: read-only,
|
|
9
|
+
opt-in per sibling, targeted query only. No full-tree sweep, no implicit inclusion, no writes.
|
|
10
|
+
|
|
11
|
+
## 1. See which siblings are reachable
|
|
12
|
+
|
|
13
|
+
```
|
|
14
|
+
agent-config linked-projects:list
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
Prints the opted-in siblings as `path · detected via · large`. Add `--all` to see detected
|
|
18
|
+
siblings you have not decided on yet. A sibling only becomes reachable once you set
|
|
19
|
+
`include: true` for it in `agents/settings/.agent-settings.local.yml` (see the detection guide).
|
|
20
|
+
|
|
21
|
+
## 2. Retrieve targeted context
|
|
22
|
+
|
|
23
|
+
```
|
|
24
|
+
/knowledge:cross-repo "OrderApiContract"
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
Under the hood:
|
|
28
|
+
|
|
29
|
+
```bash
|
|
30
|
+
python3 scripts/cross_repo_retrieve.py "OrderApiContract" [--path-scope 'src/*.ts'] [--max-chunks 8]
|
|
31
|
+
```
|
|
32
|
+
|
|
33
|
+
You get a bounded table — `source_repo · path · freshness · why` — drawn only from opted-in
|
|
34
|
+
siblings. Each chunk is redacted (secrets and PII are scrubbed before anything is shown), so
|
|
35
|
+
no credential ever crosses a repo boundary.
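To make the redaction step concrete, here is a minimal sketch of scrubbing a chunk before it is shown. The patterns are illustrative assumptions only; the guide does not document which rules the real redactor applies, and `redact` is a hypothetical helper name.

```python
import re

# Illustrative patterns only (assumption): key/value style secrets and email addresses.
PATTERNS = [
    (
        re.compile(r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[:=]\s*)\S+"),
        r"\1\2[REDACTED]",
    ),
    (
        re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
        "[REDACTED_EMAIL]",
    ),
]


def redact(chunk: str) -> str:
    """Apply every pattern in order and return the scrubbed chunk."""
    for pattern, replacement in PATTERNS:
        chunk = pattern.sub(replacement, chunk)
    return chunk


scrubbed_key = redact("API_KEY = abc123")
scrubbed_mail = redact("contact bob@example.com now")
```

Running redaction before display, rather than after, is what lets the guide claim that nothing sensitive leaves the sibling.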
|
|
36
|
+
|
|
37
|
+
## 3. Scope large siblings
|
|
38
|
+
|
|
39
|
+
A sibling flagged `large` by the detector **requires** a `--path-scope` glob:
|
|
40
|
+
|
|
41
|
+
```
|
|
42
|
+
/knowledge:cross-repo "config" --path-scope 'packages/shared/**'
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
Without a scope, a large sibling is skipped with a note — this keeps retrieval cheap and
|
|
46
|
+
targeted instead of walking a huge tree.
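The skip-unless-scoped behaviour can be sketched as follows. This is an assumption-laden illustration: `select_paths` is a hypothetical helper, the note text is invented, and Python's `fnmatch` is used as a stand-in for whatever glob matcher the script really uses (in `fnmatch`, `*` also matches `/`, which may differ from the real semantics).

```python
from fnmatch import fnmatch
from typing import Optional


def select_paths(
    paths: list[str], large: bool, path_scope: Optional[str] = None
) -> tuple[list[str], Optional[str]]:
    """Return (selected paths, note). A large sibling without a scope is skipped."""
    if large and path_scope is None:
        # Hypothetical note text; the real message is not shown in the guide.
        return [], "skipped: large sibling requires --path-scope"
    if path_scope is None:
        return list(paths), None
    return [p for p in paths if fnmatch(p, path_scope)], None


files = ["packages/shared/config.ts", "apps/web/index.ts"]

skipped = select_paths(files, large=True)
scoped = select_paths(files, large=True, path_scope="packages/shared/**")
```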
|
|
47
|
+
|
|
48
|
+
## How it ranks in memory
|
|
49
|
+
|
|
50
|
+
When a skill retrieves memory with the `cross-repo` type, matches are tagged `source: cross-repo`
|
|
51
|
+
and scored **below** the project's own curated knowledge — so cross-repo context informs the
|
|
52
|
+
answer but never outranks your own repo's truth.
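One way to picture "scored below" is a two-tier sort in which the source decides the tier and the score only orders matches within it. This is a sketch under that assumption; the actual scoring formula is not given in this guide, and `Match` and `rank` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Match:
    text: str
    score: float
    source: str  # "project" or "cross-repo" (assumed labels)


def rank(matches: list[Match]) -> list[Match]:
    """Project matches first, then cross-repo; higher score first within each tier."""
    return sorted(matches, key=lambda m: (m.source == "cross-repo", -m.score))


ranked = rank(
    [
        Match("sibling contract", 0.9, "cross-repo"),
        Match("local note", 0.4, "project"),
        Match("local ADR", 0.7, "project"),
    ]
)
```

With this ordering even a high-scoring cross-repo match lands after every project match, which is the guarantee the paragraph describes.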
|
|
53
|
+
|
|
54
|
+
## Notes
|
|
55
|
+
|
|
56
|
+
- **Read-only.** The surface never writes to a sibling. Out-of-root writes still go through the host
|
|
57
|
+
permission gate; cross-repo retrieval writes nothing.
|
|
58
|
+
- **Opt-in only.** A sibling that is not `include: true` is never read.
|
|
59
|
+
- **Targeted only.** Path-glob + content grep, never a blind full walk.
|
|
60
|
+
- Contract: [`cross-repo-retrieval`](../contracts/cross-repo-retrieval.md). Detection story:
|
|
61
|
+
[`cross-repo-linked-projects`](cross-repo-linked-projects.md).
|
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
# Skill discovery — a 3-minute walkthrough
|
|
2
|
+
|
|
3
|
+
The package ships 220 skills. You do not need to know them. `/skills:discover`
|
|
4
|
+
turns the catalog into a short, **explained** shortlist for your role — every
|
|
5
|
+
row tells you *why* it is suggested, so you never adopt a skill on faith.
|
|
6
|
+
|
|
7
|
+
It is local-only: it reads the skill catalog, your role's shortlist, and (if
|
|
8
|
+
present) your local-analytics log. No network, no writes.
|
|
9
|
+
|
|
10
|
+
## 1. Run it for your role
|
|
11
|
+
|
|
12
|
+
```
|
|
13
|
+
/skills:discover sales
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
Or, with a role experience already active, just `/skills:discover` — it picks
|
|
17
|
+
up `roles.active_role` from `.agent-settings.yml`.
|
|
18
|
+
|
|
19
|
+
Under the hood the command runs:
|
|
20
|
+
|
|
21
|
+
```bash
|
|
22
|
+
python3 scripts/skill_discovery.py --role sales
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
## 2. Read the `why` column
|
|
26
|
+
|
|
27
|
+
The output is a table. The third column is the point — it names the **signal**
|
|
28
|
+
behind each suggestion, never a bare score:
|
|
29
|
+
|
|
30
|
+
```
|
|
31
|
+
| skill | class | why | first command |
|
|
32
|
+
|------------------------|-------------------------|-----------------------------------------------------------|---------------------------|
|
|
33
|
+
| refine-prompt | most-useful-for-role | Tightens fuzzy buyer briefs before drafting | Skill › refine-prompt |
|
|
34
|
+
| voice-and-tone-design | most-useful-for-role | Locks the deal voice across customer + procurement | Skill › voice-and-tone-design |
|
|
35
|
+
| competitive-positioning| most-useful-for-role | Surfaces the ours-vs-theirs delta when a competitor named | Skill › competitive-positioning |
|
|
36
|
+
| activation-design | related-to-current-task | same domain (product) as your sales core skills | Skill › activation-design |
|
|
37
|
+
| customer-research | recently-adopted | from your role shortlist — no local usage signal yet | Skill › customer-research |
|
|
38
|
+
| funnel-analysis | popular-in-role | from your role shortlist — no local usage signal yet | Skill › funnel-analysis |
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
The four classes:
|
|
42
|
+
|
|
43
|
+
- **most-useful-for-role** — your role's curated priority shortlist.
|
|
44
|
+
- **related-to-current-task** — same-domain peers you have not shortlisted yet.
|
|
45
|
+
- **recently-adopted** — what you actually used recently (from local analytics);
|
|
46
|
+
on a fresh machine with no usage history it honestly says *"no local usage
|
|
47
|
+
signal yet"* and falls back to your shortlist instead of inventing a number.
|
|
48
|
+
- **popular-in-role** — what your role launches most locally (same fallback).
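The fallback for the two usage-driven classes can be sketched like this. It is an illustration only: `popular_in_role` is a hypothetical function, the usage log is modelled as a plain list of skill names, and the "launched N times locally" wording is invented for the example; only the *"no local usage signal yet"* phrase comes from this guide.

```python
from collections import Counter


def popular_in_role(
    usage_log: list[str], shortlist: list[str], limit: int = 5
) -> list[tuple[str, str]]:
    """Return (skill, why) rows; fall back to the shortlist when there is no usage."""
    counts = Counter(usage_log)
    if not counts:
        # No analytics data: reuse the shortlist instead of inventing a number.
        return [(skill, "no local usage signal yet") for skill in shortlist[:limit]]
    return [
        (skill, f"launched {n} times locally")  # hypothetical wording
        for skill, n in counts.most_common(limit)
    ]


fresh_machine = popular_in_role([], ["customer-research", "funnel-analysis"])
with_history = popular_in_role(
    ["funnel-analysis", "funnel-analysis", "refine-prompt"], ["customer-research"], limit=1
)
```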
|
|
49
|
+
|
|
50
|
+
## 3. Adopt one
|
|
51
|
+
|
|
52
|
+
Pick a row, read its `why`, and start with the `first command`. Unsure what a
|
|
53
|
+
skill will actually do before you commit? Preview it first:
|
|
54
|
+
|
|
55
|
+
```
|
|
56
|
+
/skill:preview competitive-positioning
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
That is the safe adoption loop: **discover → preview → run**.
|
|
60
|
+
|
|
61
|
+
## Notes
|
|
62
|
+
|
|
63
|
+
- **Local-only.** `/skills:discover` never touches the network and writes
|
|
64
|
+
nothing. It reads the catalog, your role file, and the optional analytics log.
|
|
65
|
+
- **Analytics is optional.** Opt out with `AGENT_CONFIG_NO_LOCAL_ANALYTICS=1`
|
|
66
|
+
or `analytics.local: off` in `.agent-settings.yml`. The two usage-driven
|
|
67
|
+
classes then fall back to your role shortlist — the list is still useful, just
|
|
68
|
+
without the personalised signal.
|
|
69
|
+
- **`--format json`** emits the same data machine-readably; **`--limit N`** sets
|
|
70
|
+
how many results per class (default 5).
|
|
71
|
+
- Contract: [`skill-discovery`](../contracts/skill-discovery.md).
|
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
# Skill preview — see what a skill does before you run it
|
|
2
|
+
|
|
3
|
+
With 220 skills, some of which run commands, you should not have to run a skill to
|
|
4
|
+
find out what it touches. `/skill:preview` reads a skill's **declared intent** —
|
|
5
|
+
its steps, execution type, tools, and any file/command targets — and renders a
|
|
6
|
+
plain-language summary. Read-only: it never runs the skill.
|
|
7
|
+
|
|
8
|
+
This is the middle of the safe adoption loop: **discover → preview → run**.
|
|
9
|
+
|
|
10
|
+
## 1. Discover, then preview
|
|
11
|
+
|
|
12
|
+
Find a candidate with [`/skills:discover`](skill-discovery.md), then look before you leap:
|
|
13
|
+
|
|
14
|
+
```
|
|
15
|
+
/skill:preview competitive-positioning
|
|
16
|
+
```
|
|
17
|
+
|
|
18
|
+
Under the hood:
|
|
19
|
+
|
|
20
|
+
```bash
|
|
21
|
+
python3 scripts/skill_preview.py competitive-positioning
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
## 2. Read the summary
|
|
25
|
+
|
|
26
|
+
A **manual** skill (the default) is pure guidance — preview says so plainly:
|
|
27
|
+
|
|
28
|
+
```
|
|
29
|
+
# Preview — `accessibility-auditor`
|
|
30
|
+
|
|
31
|
+
**Execution: instructional only.** This skill does not run anything automatically —
|
|
32
|
+
it guides the agent step by step.
|
|
33
|
+
|
|
34
|
+
_No tools, commands, or file targets declared — pure guidance._
|
|
35
|
+
```
|
|
36
|
+
|
|
37
|
+
An **assisted** skill proposes actions you approve — preview surfaces the command
|
|
38
|
+
and tools it declares:
|
|
39
|
+
|
|
40
|
+
```
|
|
41
|
+
# Preview — `adr-create`
|
|
42
|
+
|
|
43
|
+
**Execution: assisted** (handler `shell`). It will *propose* actions for you to
|
|
44
|
+
approve — it never executes silently.
|
|
45
|
+
|
|
46
|
+
This skill will walk these steps:
|
|
47
|
+
- Pick the next ADR number
|
|
48
|
+
- Write the standard template
|
|
49
|
+
- Regenerate the index
|
|
50
|
+
|
|
51
|
+
Declared command: `python3 scripts/adr/regenerate_index.py`
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
Add `--technical` for the raw frontmatter + numbered step list.
|
|
55
|
+
|
|
56
|
+
## 3. Decide, then run
|
|
57
|
+
|
|
58
|
+
Preview hands the decision back to you. If the declared steps and targets look
|
|
59
|
+
right, invoke the skill. If not, skip it; you have incurred no side effects
|
|
60
|
+
finding out.
|
|
61
|
+
|
|
62
|
+
## What preview is not
|
|
63
|
+
|
|
64
|
+
- **Not a sandbox.** It does not run the skill or a fenced copy of it.
|
|
65
|
+
- **Not a safety guarantee.** It shows what the skill *declares* it will touch —
|
|
66
|
+
it cannot prove a skill with an `execution` block is side-effect-free.
|
|
67
|
+
|
|
68
|
+
A malformed or missing skill yields a structured error, never a crash.
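The "structured error, never a crash" behaviour might look like the sketch below. Everything here is assumed for illustration: the catalog is modelled as an in-memory dict, and the error codes and result keys are hypothetical rather than taken from the `skill-dry-run` contract.

```python
from typing import Any


def preview(catalog: dict[str, Any], name: str) -> dict[str, Any]:
    """Summarise a skill's declared intent, or return an error record."""
    skill = catalog.get(name)
    if skill is None:
        return {"ok": False, "error": "skill_not_found", "skill": name}
    if not isinstance(skill, dict) or not isinstance(skill.get("steps"), list):
        return {"ok": False, "error": "malformed_skill", "skill": name}
    return {
        "ok": True,
        "skill": name,
        # Manual is described in this guide as the default execution type.
        "execution": skill.get("execution", "manual"),
        "steps": skill["steps"],
    }


catalog = {
    "adr-create": {"execution": "assisted", "steps": ["Pick the next ADR number"]},
    "broken": "not a mapping",
}

ok = preview(catalog, "adr-create")
missing = preview(catalog, "nope")
malformed = preview(catalog, "broken")
```

Returning a record instead of raising lets a caller render the failure in the same table or JSON shape as a successful preview.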
|
|
69
|
+
|
|
70
|
+
Contract: [`skill-dry-run`](../contracts/skill-dry-run.md). Pairs with
|
|
71
|
+
[`skill-discovery`](skill-discovery.md) as the discover → preview → run loop.
|