@event4u/agent-config 4.8.0 → 5.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (66)
  1. package/.agent-src/commands/implement-ticket.md +5 -4
  2. package/.agent-src/rules/language-and-tone.md +4 -10
  3. package/.agent-src/skills/command-routing/SKILL.md +5 -4
  4. package/.claude-plugin/marketplace.json +1 -1
  5. package/CHANGELOG.md +86 -0
  6. package/CONTRIBUTING.md +19 -0
  7. package/README.md +11 -0
  8. package/dist/cli/registry.js +0 -2
  9. package/dist/cli/registry.js.map +1 -1
  10. package/dist/discovery/deprecation-report.md +1 -1
  11. package/dist/discovery/discovery-manifest.json +5 -5
  12. package/dist/discovery/discovery-manifest.json.sha256 +1 -1
  13. package/dist/discovery/discovery-manifest.summary.md +1 -1
  14. package/dist/discovery/orphan-report.md +1 -1
  15. package/dist/discovery/packs.json +2 -2
  16. package/dist/discovery/trust-report.md +1 -1
  17. package/dist/discovery/workspaces.json +2 -2
  18. package/dist/mcp/registry-manifest.json +2 -2
  19. package/dist/router.json +1 -1671
  20. package/docs/benchmark.md +20 -8
  21. package/docs/benchmarks.md +11 -0
  22. package/docs/contracts/benchmark-corpus-spec.md +31 -3
  23. package/docs/contracts/command-surface-tiers.md +1 -1
  24. package/docs/contracts/hook-architecture-v1.md +33 -0
  25. package/docs/contracts/migrate-command.md +197 -0
  26. package/docs/contracts/settings-api.md +2 -1
  27. package/docs/contracts/value-dashboard-spec.md +374 -0
  28. package/docs/contracts/value-report-schema.md +150 -0
  29. package/docs/decisions/ADR-031-validation-severity-tiers-and-projection-roundtrip.md +97 -0
  30. package/docs/decisions/INDEX.md +1 -0
  31. package/docs/guidelines/agent-infra/installed-tools-manifest.md +6 -3
  32. package/docs/guidelines/agent-infra/language-and-tone-examples.md +35 -0
  33. package/docs/migration/v1-to-v2.md +40 -27
  34. package/docs/value.md +84 -0
  35. package/package.json +8 -8
  36. package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
  37. package/scripts/_cli/cmd_migrate.py +264 -102
  38. package/scripts/_cli/cmd_settings_migrate.py +2 -1
  39. package/scripts/_dispatch.bash +147 -49
  40. package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
  41. package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
  42. package/scripts/_lib/install_regenerator.py +129 -0
  43. package/scripts/_lib/value_ladder.py +599 -0
  44. package/scripts/_lib/value_report.py +441 -0
  45. package/scripts/bench_rtk_savings.py +320 -0
  46. package/scripts/compile_router.py +19 -5
  47. package/scripts/expected_perms.json +1 -1
  48. package/scripts/first_run_gate_hook.py +178 -0
  49. package/scripts/hook_manifest.yaml +16 -7
  50. package/scripts/hooks/dispatch_hook.py +27 -0
  51. package/scripts/hooks/dispatch_issues.py +136 -0
  52. package/scripts/hooks_doctor.py +40 -1
  53. package/scripts/install.py +25 -21
  54. package/scripts/inventory_abstraction_budget.py +616 -0
  55. package/scripts/lint_agents_layout.py +5 -4
  56. package/scripts/lint_bench_corpus.py +86 -4
  57. package/scripts/lint_global_paths.py +4 -3
  58. package/scripts/lint_marketplace_install_completeness.py +188 -0
  59. package/scripts/lint_value_dashboard.py +218 -0
  60. package/scripts/render_benchmark_md.py +6 -2
  61. package/scripts/render_value_md.py +355 -0
  62. package/scripts/repro/repro_marketplace_install_gap.sh +161 -0
  63. package/scripts/roadmap_progress_hook.py +23 -0
  64. package/scripts/router_telemetry.py +470 -0
  65. package/scripts/validate_frontmatter.py +23 -9
  66. package/scripts/_cli/cmd_migrate_to_global.py +0 -415
package/docs/guidelines/agent-infra/language-and-tone-examples.md CHANGED
@@ -77,3 +77,38 @@ in an English `.md` file. Body text, example sentences, prompt templates,
  agent-rendered strings, and failure modes must be English. Reference
  established phrases abstractly later (e.g. "a standing autonomy directive")
  and link back to the anchor block.
+
+ ## Pre-send gate — filler-phrase blocklist
+
+ The pre-send confirm step (Step 4 of the rule's gate) checks for
+ language-of-target-mismatched opening phrases. The blocklist:
+
+ - **English filler that must NOT open a German reply:** `Let me`,
+ `Now`, `Found`, `Confirmed`, `OK`, `Alright`, `Here's`, `So`.
+ - **German filler that must NOT open an English reply:** `Lass mich`,
+ `Jetzt`, `Gefunden`, `Bestätigt`.
+
+ If the first sentence starts with one of these in the wrong language,
+ rewrite the whole reply.
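
The opening-phrase check described in this hunk could be sketched roughly as follows. This is an illustration under stated assumptions, not the package's implementation: the function name `has_mismatched_opener`, the `"de"` / `"en"` language codes, and the whole-phrase matching rule are hypothetical; only the two phrase lists come from the hunk.

```python
import re

# Phrase lists copied from the blocklist in the hunk; everything else is hypothetical.
ENGLISH_FILLER = ("Let me", "Now", "Found", "Confirmed", "OK", "Alright", "Here's", "So")
GERMAN_FILLER = ("Lass mich", "Jetzt", "Gefunden", "Bestätigt")


def has_mismatched_opener(reply: str, target_lang: str) -> bool:
    """Return True if the reply opens with filler from the other language.

    target_lang is "de" or "en" (an assumed convention, not from the source).
    """
    blocked = ENGLISH_FILLER if target_lang == "de" else GERMAN_FILLER
    opening = reply.lstrip()
    # Match whole phrases only, so "Nowhere" does not trip on "Now".
    return any(re.match(re.escape(phrase) + r"(?!\w)", opening) for phrase in blocked)
```

Per the rule in the hunk, a `True` result would mean rewriting the whole reply rather than patching only the opener.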
+
+ ## CLI / icon spacing rules
+
+ - Two spaces after `❌`, `✅`, `⚠️` in CLI output.
+ - One space for other icons.
+ - One blank line max; no double / triple blanks.
+ - File ends with exactly one newline.
+
+ ## `.md` files — pre-save detection heuristic
+
+ Before saving any `.md` under `.augment/`, `.agent-src/`,
+ `.agent-src.uncondensed/`, or `agents/`, scan the body for:
+
+ - Umlauts (`ä`, `ö`, `ü`, `Ä`, `Ö`, `Ü`, `ß`) **outside** fenced code,
+ paths, or `DE: … · EN: …` anchor blocks.
+ - German function words: `für`, `nicht`, `dass`, `wenn`, `sollte`,
+ `werden`, `arbeite`, `selbstständig`, `jetzt`, `einfach`, `weiter`,
+ `lösche`, `frag`, `schreib`, `mach`.
+ - Non-English quoted phrases in body text.
+
+ Any hit → translate to English, OR move to a `DE: … · EN: …` anchor
+ block (the only allowed German location).
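
The pre-save scan from this hunk might look something like the sketch below. It is deliberately simplified and makes several assumptions that are not in the source: the function name `find_german_hits` is hypothetical, only triple-backtick fences and lines starting with `DE:` are skipped, and neither path exclusion nor the quoted-phrase check is implemented. The umlaut set and the function-word list are copied from the hunk.

```python
import re

UMLAUTS = set("äöüÄÖÜß")
# Function words copied from the heuristic in the hunk.
FUNCTION_WORDS = {
    "für", "nicht", "dass", "wenn", "sollte", "werden", "arbeite",
    "selbstständig", "jetzt", "einfach", "weiter", "lösche", "frag",
    "schreib", "mach",
}
FENCE = "`" * 3  # triple-backtick code fence marker


def find_german_hits(markdown: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) pairs for suspected German body text.

    Simplifications (assumptions): only fenced code and lines starting with
    "DE:" are skipped; paths and quoted phrases are not handled.
    """
    hits = []
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith(FENCE):
            in_fence = not in_fence  # entering or leaving a fenced block
            continue
        if in_fence or stripped.startswith("DE:"):
            continue
        if UMLAUTS & set(line):
            hits.append((number, "umlaut"))
        words = set(re.findall(r"\w+", line.lower()))
        for word in sorted(words & FUNCTION_WORDS):
            hits.append((number, f"function word: {word}"))
    return hits
```

Any returned hit would then be handled as the hunk prescribes: translate the line, or move it into a `DE: … · EN: …` anchor block.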
package/docs/migration/v1-to-v2.md CHANGED
@@ -1,10 +1,11 @@
  # Migration — v1 → v2 (npx-only runtime)
 
- > **Status:** skeleton. The one-shot `npx @event4u/agent-config migrate`
- > is implemented in P3.5 of
- > [`road-to-portable-runtime-and-update-check`](../../agents/roadmaps/road-to-portable-runtime-and-update-check.md);
- > this document tracks the user-facing cutover contract so consumers can
- > rehearse the change before the command lands.
+ > **Status:** active. The one-shot `npx @event4u/agent-config migrate`
+ > is implemented in `scripts/_cli/cmd_migrate.py`; its action matrix +
+ > exit-code contract live in
+ > [`docs/contracts/migrate-command.md`](../contracts/migrate-command.md).
+ > This document is the user-facing narrative; the contract is the
+ > normative reference.
 
  ## Why this change
 
@@ -56,32 +57,44 @@ The per-tool glue (`.claude/`, `.cursor/`, `.clinerules/`,
  the source that writes them changed (from `vendor/` /
  `node_modules/` scripts to the npx-resolved runtime).
 
- ## The `migrate` command — contract sketch
+ ## The `migrate` command
 
  ```bash
- npx @event4u/agent-config migrate # interactive, default
- npx @event4u/agent-config migrate --dry-run # plan only, no writes
- npx @event4u/agent-config migrate --yes # non-interactive
+ npx @event4u/agent-config migrate # apply, real changes
+ npx @event4u/agent-config migrate --dry-run # plan only, zero writes
  ```
 
- Order of operations (locked once P3.5 lands):
-
- 1. Detect pre-v2 markers: `composer.json` require entry,
- `package.json` devDependency, `vendor/event4u/agent-config/`,
- `node_modules/@event4u/agent-config/`, legacy `.gitignore` lines,
- `~/.claude/{rules,skills}/event4u/` and siblings.
- 2. Print the planned change set (file removals, file writes, pin
- value). Stop here under `--dry-run`.
- 3. Remove dependency entries via the appropriate package manager
- (`composer remove`, `npm uninstall` / `pnpm remove` / `yarn remove`).
- 4. Wipe the retired user-scope `event4u/` namespace dirs.
- 5. Write / update `.agent-settings.yml` with the new shape +
- `agent_config_version` pin.
- 6. Re-run `sync-gitignore` to refresh the project `.gitignore` block.
- 7. Print a one-screen post-migration verification list.
-
- Idempotency: each step is a no-op when the v1 marker is absent. Re-runs
- print *"already on v2 nothing to do"* and exit 0.
+ One opinionated command, one flag. The full action matrix +
+ exit-code contract is documented in
+ [`docs/contracts/migrate-command.md`](../contracts/migrate-command.md);
+ the operations summary below is the narrative form.
+
+ Order of operations (fixed; foundation-first):
+
+ 1. Detect every legacy signal in one pass: `composer.json` require
+ entry, `package.json` devDependency, managed symlinks pointing
+ into `vendor/` / `node_modules/`, v0
+ `.implement-ticket-state.json`, project-local `.agent-settings.yml`
+ / `.agent-user.yml` (flat or under `settings/`), empty
+ `agent-config/` shell.
+ 2. Under `--dry-run`, print the planned change set and stop with
+ exit 0.
+ 3. Strip the package entries from `package.json` / `composer.json`
+ in-place; preserve sibling keys + formatting.
+ 4. Purge legacy symlinks; preserve user-managed symlinks elsewhere
+ with a warning.
+ 5. Migrate `.implement-ticket-state.json` → `.work-state.json`
+ (renames the v0 source to `.bak`).
+ 6. **Hard-delete** legacy project-local config files. The wizard
+ (`agent-config setup`) recreates fresh global config on the next
+ run — deletion is the design, not a regression.
+ 7. Remove the empty `agent-config/` shell directory if present.
+ 8. Refresh the `.gitignore` agent-config managed block.
+ 9. Print a summary listing every action taken.
+
+ Idempotency: re-running on a fully-migrated repo prints
+ *"already migrated — nothing to do"* and exits 0 without touching
+ the filesystem.
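
The detect, dry-run, apply, and idempotent re-run flow from the added lines can be illustrated with a toy model. This is not the code in `scripts/_cli/cmd_migrate.py`, which is not shown in this diff view. The sketch handles only one of the listed signals (the `package.json` devDependency), its function names and return shape are hypothetical, it does not preserve file formatting, and its exit code 0 on a successful apply is an assumption, since the normative exit-code contract lives in `docs/contracts/migrate-command.md`.

```python
import json
from pathlib import Path

PACKAGE = "@event4u/agent-config"


def detect_legacy_signals(root: Path) -> list[str]:
    """Detect a single legacy signal: a devDependency entry in package.json."""
    manifest = root / "package.json"
    if not manifest.is_file():
        return []
    data = json.loads(manifest.read_text(encoding="utf-8"))
    if PACKAGE in data.get("devDependencies", {}):
        return ["package.json devDependency"]
    return []


def migrate(root: Path, dry_run: bool = False) -> tuple[int, list[str]]:
    """Return (exit_code, actions); never write under dry_run or when clean."""
    signals = detect_legacy_signals(root)
    if not signals:
        # Idempotent re-run: nothing detected, nothing touched, exit 0.
        return 0, []
    if dry_run:
        # Plan only: report what would change and stop with exit 0.
        return 0, [f"would strip {signal}" for signal in signals]
    manifest = root / "package.json"
    data = json.loads(manifest.read_text(encoding="utf-8"))
    del data["devDependencies"][PACKAGE]
    # Simplification: re-serialising does not preserve the original formatting.
    manifest.write_text(json.dumps(data, indent=2) + "\n", encoding="utf-8")
    # Exit code 0 on success is an assumption; the contract doc is normative.
    return 0, [f"stripped {signal}" for signal in signals]
```

Running `migrate(root, dry_run=True)`, then `migrate(root)`, then `migrate(root)` again should yield a plan, one real change, and an empty action list, in that order, with the file untouched by the first and third calls.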
 
  ## Verification after migration
 
package/docs/value.md ADDED
@@ -0,0 +1,84 @@
+ # Value Dashboard: what does the package cost, and what does it deliver?
+
+ > This page answers **one** question in real numbers: *How many more tokens does the package cost me, and how many does it save afterwards?* Generated by `scripts/render_value_md.py` from the latest `value-v1` report; source: `internal/bench/reports/value/latest.json`.
+
+ ## How to read this page
+
+ **Panel A (cost ladder):** read from top to bottom. Each rung states *what it does*, *how many input tokens it adds or saves per request*, *what that costs in € per 1,000 requests*, and *where the running total stands*. The bold **NET** line at the end is the answer.
+
+ **Panel B (behavior):** four real comparisons, *with* vs. *without* the package. This is where the non-token value lies: selecting the right skill, stopping before risky actions, fewer clarifying questions, more completed tasks.
+
+ **Confidence markers** on each rung: `✅ measured` = real value from a report in the repo · `⏳ pending` = not yet measured, the rung contributes 0 to the total · `⚠️ vendor-claim` = a vendor's claim, not measured by us.
+
+ ## Reference scale
+
+ - **1,000** requests, averaging **8,000** input tokens and **600** output tokens per request
+ - Model tier: `sonnet` · prices as of `2026-05-14` (source: `internal/bench/pricing.yaml`)
+ - If you run a different workload, redo the calculation yourself: the methodology is disclosed, and nothing is hidden in hard-coded values.
+
+ ## Panel A: cost ladder (cumulative, min → max)
+
+ Read from top to bottom. Positive Δ values = the package *costs* tokens (the rule load is the honest up-front tax); negative Δ values = the package *saves* tokens.
+
+ | Rung | What it does | Δ Tokens | Δ € (1k req) | Cumulative | Source |
+ |---|---|---:|---:|---:|---|
+ | **Ohne Paket / Without package** | Baseline: the bare request without package rules. | +0 | +0.00 € | +0.00% | `n/a` · ✅ measured |
+ | Mit Paket (Regeln laden) / With package (rule load) | The always-active rules land in the context of every request. ⚠️ more expensive at first | +8 895 | +24.55 € | +111.19% | `dist/router.json` · ✅ measured |
+ | | _Footnote:_ Kernel = 10 rules (31570 chars) + charter (4010 chars); tokens ≈ chars / 4. | | | | |
+ | + condense (Regeln eindampfen) / + condense (rule shrink) | Build step shrinks the rule files before shipping. | -186 | -0.51 € | +108.86% | `internal/bench/reports/telegraph-v2.json` · ✅ measured |
+ | | _Footnote:_ Aggregate across non-Thin-Root categories; Thin-Root files (AGENTS.md variants) net negative (~−4%) and are excluded from the rung — surfaced separately. | | | | |
+ | + rtk (CLI-Output filtern) / + rtk (filter CLI output) | rtk trims verbose CLI output before it reaches the model input. | -593 | -1.64 € | +101.45% | `internal/bench/reports/rtk/latest.json` · ✅ measured |
+ | + terse (Antworten knapper) / + terse (shorter replies) | Telegraph style aims for shorter model replies. | +56 | +0.77 € | +102.15% | `internal/bench/reports/telegraph-v1.json` · ✅ measured |
+ | | _Footnote:_ Honest: measured median = -9.27% against 'sei knapp' ("be concise"); Telegraph produces more tokens here, not fewer. We measure, we do not hide. | | | | |
+
+ **NET: extra cost** ⚠️ · **+8 172 tokens / request**, **+22.55 €** per 1,000 requests, cumulative **+102.15%** vs. baseline.
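
The token figures in Panel A can be re-derived with a few lines of arithmetic. The sketch below assumes that the cumulative column is the running token delta divided by the 8,000-token baseline from the reference scale; that reading is inferred from the published numbers, not stated by the source. The names `kernel_tokens` and `cumulative_percentages` are hypothetical. The € column is not reproduced because the rates in `internal/bench/pricing.yaml` are not part of this diff view.

```python
BASELINE_INPUT_TOKENS = 8_000  # reference scale: average input tokens per request

# (rung label, delta tokens per request), copied from Panel A.
RUNGS = [
    ("rule load", 8_895),
    ("condense", -186),
    ("rtk", -593),
    ("terse", 56),
]


def kernel_tokens(rule_chars: int, charter_chars: int) -> int:
    """Footnote estimate for the rule load: tokens ≈ chars / 4."""
    return (rule_chars + charter_chars) // 4


def cumulative_percentages(rungs, baseline):
    """Running token delta after each rung, as a percentage of the baseline."""
    total = 0
    result = []
    for label, delta in rungs:
        total += delta
        result.append((label, total, round(100 * total / baseline, 2)))
    return result


if __name__ == "__main__":
    print(kernel_tokens(31_570, 4_010))  # 8895, the rule-load delta
    for label, total, percent in cumulative_percentages(RUNGS, BASELINE_INPUT_TOKENS):
        print(f"{label:<10} {total:+6d} tokens  {percent:+.2f}%")
```

Under that assumption the four rungs come out at 111.19%, 108.86%, 101.45%, and 102.15%, matching the table, and the final running total is the +8 172 tokens per request quoted in the net line.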
+
+ ## Panel B: behavior (with vs. without)
+
+ Four real comparisons from actual bench runs. This is the value that tokens alone do not measure: whether the agent picks the right skill, stops before risky actions, asks fewer clarifying questions, and completes more tasks.
+
+ | Metric | What it means | With package | Without package | Δ | Mode |
+ |---|---|---:|---:|---:|---|
+ | Right-skill selection / Richtige Skill-Wahl | How often the matching skill is activated (top-K hit). | 50.0% | 0.0% | 50.0% | ✅ live |
+ | Destructive-op stops / Stopps bei riskanten Aktionen | How often the agent halts or asks before destructive ops (out of 5). | — | — | — | ⚠️ dry-run |
+ | Ask-vs-act ratio / Fragen vs. Handeln | Ratio of clarifying questions to actions; lower = more decisive. | 0.000 | 0.000 | 0.000 | ✅ live |
+ | Task completion rate / Aufgaben fertig | Share of tasks the agent completes in full. | 84.6% | 7.7% | 76.9% | ✅ live |
+
+ ## Glossary
+
+ Plain-language definitions for the non-developer reader.
+
+ - **Token**: the unit a language model bills in. Rule of thumb: one token ≈ 4 characters of German or English prose. 1,000 tokens ≈ 750 words.
+ - **Input tokens**: everything the model reads per turn (system prompt, always-active rules, your message, earlier conversation). The package adds rules here, so installing it costs input tokens.
+ - **Output tokens**: what the model writes back. Usually fewer than the input tokens, but more expensive per token.
+ - **condense**: a build step that shrinks the rule files before shipping (`.agent-src.uncondensed` → `.agent-src`). Saves input tokens on every request.
+ - **rtk**: the *Rust Token Killer*, a CLI wrapper that filters verbose output (`git status`, lint output, test runners) before the model reads it. Saves input tokens on tool calls.
+ - **terse / telegraph**: a style (short phrases, dropped articles) the agent uses for shorter replies. Saves output tokens, provided the corpus rewards it.
+ - **Ohne Paket / Mit Paket**: *without the package* / *with the package*, the two arms of the A/B comparison.
+ - **€-per-1k-requests**: token cost at the reference scale (1,000 requests of average size, priced at the current Sonnet rates from `internal/bench/pricing.yaml`).
+
+ ## Methodology & sources
+
+ This page is a **derived** view, not a measurement of its own. It combines three existing bench surfaces (see the source column in Panel A). The machine-readable raw reports remain the source of truth:
+
+ - `internal/bench/reports/telegraph-v1.json` / `telegraph-v2.json`: Telegraph/condense measurements.
+
+ - `agents/runtime/frugality/baseline.jsonl`: the package load (Metric A footprint).
+
+ - `internal/bench/reports/rtk/latest.json`: the rtk measurement (new, Phase 2).
+
+ - `internal/bench/reports/ab/*-ab-trackb-{with,without}.json`: A/B Track B (behavior).
+
+ - `internal/bench/reports/*-dev.json`: dev-corpus selection accuracy.
+
+
+ **A/B technical appendix:** [`docs/benchmark.md`](benchmark.md) holds the cache-key, integrity, and methodology details of the A/B bench; read on there for the variant-axis proof.
+
+
+ **Notes from the report:**
+
+ - Token→€ conversion priced at sonnet rates from internal/bench/pricing.yaml (sourced_on=2026-05-14).
+ - Pending rungs contribute 0 to the cumulative until measured.
+ - Reference scale: 1000 requests × 8000 input / 600 output tokens per request.
+
+ _Last rendered: `2026-05-29T04:36:04+00:00`_
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@event4u/agent-config",
- "version": "4.8.0",
+ "version": "5.0.0",
  "description": "Universal AI Agent OS \u2014 audited skills, governance rules, commands, and templates for AI coding tools (Claude Code, Cursor, Windsurf, Copilot).",
  "license": "MIT",
  "private": false,
@@ -93,15 +93,15 @@
  "vitest": "^2.1.9"
  },
  "dependencies": {
- "@fastify/static": "^9.1.3",
+ "@fastify/static": "^9.1.0",
  "@preact/signals": "^2.9.0",
  "commander": "^12.1.0",
- "execa": "^9.6.1",
- "fastify": "^5.8.5",
- "js-yaml": "^4.1.1",
+ "execa": "^9.5.0",
+ "fastify": "^5.8.0",
+ "js-yaml": "^4.1.0",
  "open": "^10.2.0",
- "preact": "^10.29.2",
- "zod": "^3.25.76",
- "zod-to-json-schema": "^3.25.2"
+ "preact": "^10.29.0",
+ "zod": "^3.25.0",
+ "zod-to-json-schema": "^3.25.0"
  }
  }