@event4u/agent-config 4.9.0 → 5.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (82)

package/.agent-src/commands/implement-ticket.md +5 -4
package/.agent-src/contexts/execution/roadmap-process-loop.md +30 -4
package/.agent-src/rules/language-and-tone.md +4 -10
package/.agent-src/rules/linked-projects-onboarding-gate.md +82 -0
package/.agent-src/rules/roadmap-progress-sync.md +39 -5
package/.agent-src/scripts/update_roadmap_progress.py +63 -7
package/.agent-src/skills/command-routing/SKILL.md +5 -4
package/.agent-src/skills/roadmap-management/SKILL.md +121 -21
package/.agent-src/skills/roadmap-writing/SKILL.md +63 -0
package/.agent-src/templates/agent-settings.md +16 -0
package/.agent-src/templates/roadmaps.md +22 -1
package/.agent-src/templates/scripts/work_engine/_lib/agent_settings.py +20 -3
package/.claude-plugin/marketplace.json +1 -1
package/CHANGELOG.md +106 -0
package/CONTRIBUTING.md +19 -0
package/README.md +12 -1
package/dist/cli/registry.js +0 -2
package/dist/cli/registry.js.map +1 -1
package/dist/discovery/deprecation-report.md +1 -1
package/dist/discovery/discovery-manifest.json +36 -14
package/dist/discovery/discovery-manifest.json.sha256 +1 -1
package/dist/discovery/discovery-manifest.summary.md +3 -3
package/dist/discovery/orphan-report.md +1 -1
package/dist/discovery/packs.json +6 -5
package/dist/discovery/trust-report.md +3 -3
package/dist/discovery/workspaces.json +5 -4
package/dist/mcp/registry-manifest.json +3 -3
package/dist/router.json +1 -1671
package/docs/architecture.md +1 -1
package/docs/benchmark.md +20 -8
package/docs/benchmarks.md +11 -0
package/docs/catalog.md +3 -2
package/docs/contracts/benchmark-corpus-spec.md +31 -3
package/docs/contracts/command-surface-tiers.md +1 -1
package/docs/contracts/hook-architecture-v1.md +33 -0
package/docs/contracts/migrate-command.md +197 -0
package/docs/contracts/settings-api.md +2 -1
package/docs/contracts/value-dashboard-spec.md +374 -0
package/docs/contracts/value-report-schema.md +150 -0
package/docs/decisions/ADR-031-validation-severity-tiers-and-projection-roundtrip.md +97 -0
package/docs/decisions/ADR-032-linked-projects-scope.md +118 -0
package/docs/decisions/INDEX.md +2 -0
package/docs/getting-started.md +1 -1
package/docs/guidelines/agent-infra/installed-tools-manifest.md +6 -3
package/docs/guidelines/agent-infra/language-and-tone-examples.md +35 -0
package/docs/guides/cross-repo-linked-projects.md +86 -0
package/docs/migration/v1-to-v2.md +40 -27
package/docs/value.md +84 -0
package/package.json +8 -8
package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
package/scripts/_cli/cmd_migrate.py +264 -102
package/scripts/_cli/cmd_settings_migrate.py +2 -1
package/scripts/_dispatch.bash +147 -49
package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
package/scripts/_lib/agent_settings.py +20 -3
package/scripts/_lib/install_regenerator.py +129 -0
package/scripts/_lib/linked_projects.py +238 -0
package/scripts/_lib/value_ladder.py +599 -0
package/scripts/_lib/value_report.py +441 -0
package/scripts/bench_rtk_savings.py +320 -0
package/scripts/check_no_local_settings_committed.py +51 -0
package/scripts/compile_router.py +19 -5
package/scripts/expected_perms.json +1 -1
package/scripts/first_run_gate_hook.py +178 -0
package/scripts/hook_manifest.yaml +16 -7
package/scripts/hooks/dispatch_hook.py +27 -0
package/scripts/hooks/dispatch_issues.py +136 -0
package/scripts/hooks_doctor.py +40 -1
package/scripts/install.py +25 -21
package/scripts/lint_agents_layout.py +5 -4
package/scripts/lint_bench_corpus.py +86 -4
package/scripts/lint_global_paths.py +4 -3
package/scripts/lint_marketplace_install_completeness.py +188 -0
package/scripts/lint_value_dashboard.py +218 -0
package/scripts/render_benchmark_md.py +6 -2
package/scripts/render_value_md.py +355 -0
package/scripts/repro/repro_marketplace_install_gap.sh +161 -0
package/scripts/roadmap_progress_hook.py +23 -0
package/scripts/router_telemetry.py +470 -0
package/scripts/validate_frontmatter.py +23 -9
package/scripts/_cli/cmd_migrate_to_global.py +0 -415

package/docs/decisions/ADR-032-linked-projects-scope.md ADDED

@@ -0,0 +1,118 @@
+---
+adr: 032
+status: accepted
+date: 2026-05-29
+decision: linked-projects-scope-go-option-a
+supersedes: —
+superseded_by: —
+phase: v3.x · multi-project-scope evaluation
+type: structural
+review_date: 2027-05-29
+---
+# ADR-032 — Linked-projects scope: GO on opt-in auto-detection (Option A, passive awareness)
+## Status
+**Accepted** · 2026-05-29. Approves an opt-in auto-detection feature for
+IDE-attached sibling repositories, scoped to **passive awareness** (Option A).
+A same-day earlier draft recorded NO-GO; that verdict was reversed after the
+proactivity-gap argument (below). Time-boxed: review on **2027-05-29** or
+earlier if a kill-switch trigger fires.
+Not to be confused with [`ADR-029`](ADR-029-multi-workspace-deferred.md): that
+defers a restructure of the **package's own root layout**. This ADR is about
+the **agent's working scope over a sibling project repository**.
+## Context
+Developers routinely check out sibling repos that change together (e.g.
+`galawork-api` + `galawork-web`) and attach them in the IDE. Detection is
+deterministic from on-disk config (`.idea/modules.xml` + `vcs.xml`,
+`*.code-workspace`).
+A Phase-0 spike found Claude Code can already read/write a sibling outside its
+working directory **unconditionally** — no rule needed. An initial reading
+concluded the feature was therefore only an "awareness signal" a doc could
+deliver, and drafted NO-GO.
+## The reversal — proactivity gap
+That NO-GO mis-framed the value. The point is **not** capability (the agent can
+write everywhere); it is **proactivity**: the agent does **not** consider a
+sibling unless explicitly told, so cross-repo dependencies — an API change that
+breaks the frontend, a shared type that drifts — are missed by default. A
+manual doc/snippet presupposes the very awareness the target user lacks: the
+developer who needs this most is exactly the one who won't think to write the
+note. **Auto-detection is zero-knowledge** — it reads the relationship the
+developer already encoded by attaching the repo in their IDE.
+AI Council (anthropic/claude-sonnet-4-5 + openai/gpt-4o, 3 rounds + Karpathy
+peer-review, 2026-05-29) flipped to **GO** on this reasoning.
+## Decision — GO, scoped to Option A (passive awareness)
+Build an **opt-in** auto-detection feature:
+1. **Detect** IDE-attached siblings from on-disk config (config-driven only;
+   never arbitrary adjacent directories).
+2. **Opt-in once** per sibling; persist the choice **local-only** in
+   `.agent-settings.local.yml` (in agents/settings/; gitignored and per-machine, because sibling paths differ
+   per developer and must never be committed).
+3. **Behavioral directive** for in-scope siblings: proactively check cross-repo
+   impact on relevant changes (API contract, shared types) and **warn**.
+   **Do NOT bulk-include** the sibling's files (interpretation C — token
+   blowup — stays **out of scope**). Out-of-root writes still pass the host
+   agent's own permission gate.
+### A/B/C scoping
+- **A — passive awareness (CHOSEN):** know + warn, no bulk inclusion. Cheap, low risk.
+- **B — proactive dependency scanning:** auto-scan on every change. Deferred (needs heuristics).
+- **C — implicit inclusion of all sibling files:** **rejected** — token blowup, context pollution.
+### Fork resolutions
+- **Fork A** — `.agent-settings.local.yml` (in agents/settings/), deepest cascade layer reusing `_deep_merge` (not a bespoke override).
+- **Fork B** — key `linked_projects` (avoids ADR-007 "scope"/"workspace", ADR-029 "multi-workspace").
+- **Fork C** — cross-cwd writes documented, never auto-configured; host permission gate applies.
+## Consequences
+- New: detector (`scripts/_lib/linked_projects.py`), the
+  `.agent-settings.local.yml` (in agents/settings/) cascade layer, a committed-local lint, and the
+  `linked-projects-onboarding-gate` rule (tier-2b, **experimental**, **removable**).
+- The intra-repo module system (`enumerate_modules()`) is untouched.
+- Size never excludes a sibling — a real frontend (galawork-web ≈ 38k files)
+  must surface; it is flagged `large` (awareness only). The council's literal
+  "skip >20k files" guardrail was corrected: it conflated Option C's
+  file-inclusion cost with Option A, under which repo size is cost-irrelevant.
+- Per install decision **D2**, the installer does not touch the consumer
+  `.gitignore`; consumers gitignore `.agent-settings.local.yml` (in agents/settings/) themselves
+  (documented in the guide).
+## Kill-switch
+Experimental + removable by construction. If opt-in is consistently declined or
+siblings are never cited in practice, remove the rule. Signal stays local — no
+telemetry.
+## Open follow-ups
+- **Consumer detector reachability:** the detector lives in `scripts/_lib/`;
+  exposing it as an `agent-config` CLI subcommand for consumer installs is a
+  follow-up. Import-reachable in this repo / co-located maintainer setups today.
+- **Multi-agent verification:** only Claude Code was empirically validated.
+  Cursor / Augment / Copilot are unverified — the guide's manual snippet covers
+  them until an interactive per-IDE test is run.
+## Alternatives considered
+- **NO-GO + docs only** — rejected: a manual note fails the target user who lacks the awareness to write it.
+- **Build Option C** — rejected: token blowup.
+## References
+- [`docs/guides/cross-repo-linked-projects.md`](../guides/cross-repo-linked-projects.md)
+- [`ADR-007`](ADR-007-agent-discovery-scopes.md) — owns "scope"/"workspace".
+- [`ADR-029`](ADR-029-multi-workspace-deferred.md) — unrelated package-root multi-workspace defer.
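Fork A above places `.agent-settings.local.yml` as the deepest cascade layer and reuses `_deep_merge`. That helper is not shown in this excerpt, so the following is only a minimal sketch of what a recursive settings merge could look like. The function name `deep_merge`, the example keys, and the choice to replace lists wholesale rather than concatenate them are assumptions for illustration, not the package's confirmed behavior.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` wins, merging nested dicts recursively.

    Assumption: non-dict values (including lists) are replaced, not combined.
    """
    merged = dict(base)  # shallow copy so `base` is not mutated at the top level
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical layers: a shared project layer and the local-only layer.
shared_layer = {"some_setting": {"a": 1, "b": 2}, "linked_projects": []}
local_layer = {
    "some_setting": {"b": 3},
    "linked_projects": [{"path": "../web", "include": True}],
}
effective = deep_merge(shared_layer, local_layer)
```

Under these assumptions the local layer only overrides the keys it names, so a developer's sibling paths never need to appear in the committed layer.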

package/docs/decisions/INDEX.md CHANGED

@@ -34,6 +34,8 @@ _Auto-generated by `scripts/adr/regenerate_index.py`. Do not edit._
 | [ADR-028](ADR-028-root-layout.md) | Root Layout | accepted | 2026-05-25 | — |
 | [ADR-029](ADR-029-multi-workspace-deferred.md) | Multi Workspace Deferred | accepted | 2026-05-25 | — |
 | [ADR-030](ADR-030-claude-code-command-projection.md) | Claude Code Command Projection | accepted | 2026-05-28 | — |
+| [ADR-031](ADR-031-validation-severity-tiers-and-projection-roundtrip.md) | Validation Severity Tiers And Projection Roundtrip | accepted | 2026-05-29 | — |
+| [ADR-032](ADR-032-linked-projects-scope.md) | Linked Projects Scope Go Option A | accepted | 2026-05-29 | — |
 ## Unnumbered (legacy)

package/docs/getting-started.md CHANGED

@@ -106,7 +106,7 @@ Your agent is now:
 - **Respecting your codebase** — no conflicting patterns
 - **Following standards** — consistent code quality
-This is enforced automatically by 77 rules. No configuration needed.
+This is enforced automatically by 78 rules. No configuration needed.
 ---

package/docs/guidelines/agent-infra/installed-tools-manifest.md CHANGED

@@ -89,9 +89,12 @@ intentionally pin an older version of the manifest.
 Under [ADR-020](../../decisions/ADR-020-global-only-consumer-scope.md)
 global is the only consumer scope. Consumers carrying a pre-2.5
 project-scope payload move to global with the one-shot
-`npx @event4u/agent-config migrate-to-global` subcommand — it copies
-each tool's project payload into the matching user-scope path, drops
-the bridge marker, and removes the legacy project artefacts.
+`npx @event4u/agent-config migrate` subcommand — it removes the
+legacy project artefacts in one opinionated pass (deletion-over-
+migration policy); the wizard recreates fresh global config on the
+next `agent-config setup`. See
+[docs/contracts/migrate-command.md](../../contracts/migrate-command.md)
+for the full action matrix.
 For maintainers running `AGENT_CONFIG_DEV_MODE=1`, project-scope
 re-installs remain available; the installer still detects scope

package/docs/guidelines/agent-infra/language-and-tone-examples.md CHANGED

@@ -77,3 +77,38 @@ in an English `.md` file. Body text, example sentences, prompt templates,
 agent-rendered strings, and failure modes must be English. Reference
 established phrases abstractly later (e.g. "a standing autonomy directive")
 and link back to the anchor block.
+## Pre-send gate — filler-phrase blocklist
+The pre-send confirm step (Step 4 of the rule's gate) checks for
+language-of-target-mismatched opening phrases. The blocklist:
+- **English filler that must NOT open a German reply:** `Let me`,
+  `Now`, `Found`, `Confirmed`, `OK`, `Alright`, `Here's`, `So`.
+- **German filler that must NOT open an English reply:** `Lass mich`,
+  `Jetzt`, `Gefunden`, `Bestätigt`.
+If the first sentence starts with one of these in the wrong language,
+rewrite the whole reply.
+## CLI / icon spacing rules
+- Two spaces after `❌`, `✅`, `⚠️` in CLI output.
+- One space for other icons.
+- One blank line max; no double / triple blanks.
+- File ends with exactly one newline.
+## `.md` files — pre-save detection heuristic
+Before saving any `.md` under `.augment/`, `.agent-src/`,
+`.agent-src.uncondensed/`, or `agents/`, scan the body for:
+- Umlauts (`ä`, `ö`, `ü`, `Ä`, `Ö`, `Ü`, `ß`) **outside** fenced code,
+  paths, or `DE: … · EN: …` anchor blocks.
+- German function words: `für`, `nicht`, `dass`, `wenn`, `sollte`,
+  `werden`, `arbeite`, `selbstständig`, `jetzt`, `einfach`, `weiter`,
+  `lösche`, `frag`, `schreib`, `mach`.
+- Non-English quoted phrases in body text.
+Any hit → translate to English, OR move to a `DE: … · EN: …` anchor
+block (the only allowed German location).
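The pre-save heuristic above can be sketched as a small scanner. This is an illustrative approximation only: the function name `find_german_hits` is hypothetical, it skips fenced code and any line containing a `DE:` anchor, and it does not attempt the path or inline-code exclusions the rule also mentions.

```python
import re

UMLAUTS = set("äöüÄÖÜß")
GERMAN_WORDS = {
    "für", "nicht", "dass", "wenn", "sollte", "werden", "arbeite",
    "selbstständig", "jetzt", "einfach", "weiter", "lösche", "frag",
    "schreib", "mach",
}
FENCE = re.compile(r"^\s*(```|~~~)")
WORD = re.compile(r"[^\W\d_]+")  # runs of Unicode letters


def find_german_hits(markdown: str) -> list[tuple[int, str]]:
    """Return (line number, word) pairs that look German outside fenced code."""
    hits = []
    in_fence = False
    for lineno, line in enumerate(markdown.splitlines(), start=1):
        if FENCE.match(line):
            in_fence = not in_fence  # toggle on opening and closing fences
            continue
        if in_fence or "DE:" in line:  # simplification: skip whole anchor lines
            continue
        for word in WORD.findall(line):
            if word.lower() in GERMAN_WORDS or UMLAUTS & set(word):
                hits.append((lineno, word))
    return hits


sample = (
    "Plain English line.\n"
    "```\n"
    "für code\n"
    "```\n"
    "Bitte nicht löschen.\n"
    "DE: jetzt · EN: now\n"
)
sample_hits = find_german_hits(sample)
```

A non-empty result would mean "translate or move to an anchor block" under the rule; an empty list means the body passed this simplified check.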

package/docs/guides/cross-repo-linked-projects.md ADDED

@@ -0,0 +1,86 @@
+# Working across linked sibling projects
+When two repositories change together — an API and its frontend, a service and
+a shared library — a change in one can silently break the other. The agent can
+already read and write a sibling repo, but it won't **proactively consider** one
+unless it knows the sibling is relevant. This feature closes that gap: it
+detects the sibling your IDE already attached and, after a one-time opt-in,
+makes the agent flag cross-repo impact by default.
+> **Scope — passive awareness (Option A).** The agent gains *awareness*: it
+> warns about cross-repo impact on relevant changes and can read/edit the
+> sibling on demand. It does **not** bulk-load the sibling's files into context
+> (that would blow up token cost). See
+> [ADR-032](../decisions/ADR-032-linked-projects-scope.md). Unrelated to
+> [ADR-029](../decisions/ADR-029-multi-workspace-deferred.md) (package root
+> layout).
+## Auto-detection (Claude Code — verified)
+If you attach a sibling repo in your IDE, the agent detects it from on-disk
+config and prompts **once** to opt it in:
+- **PhpStorm / IntelliJ** — a sibling attached via `.idea/modules.xml` /
+  `.idea/vcs.xml` (e.g. `../galawork-web`).
+- **VS Code** — folders in a `*.code-workspace`.
+On the first turn (and on a new attachment) the agent asks per detected sibling:
+include / decline / always / never-ask. Your choice is stored **local-only** in
+`.agent-settings.local.yml` (in agents/settings/; gitignored and per-machine, see below). A declined
+sibling is never prompted again.
+Once a sibling is in scope, the agent proactively checks it for impact when a
+change here may affect it (API contract, shared types) and warns you — without
+loading its files wholesale. Large siblings (a real frontend easily exceeds
+20 000 files) are flagged `large` and surfaced as awareness only, never skipped.
+## Manual setup (other agents / any editor)
+Auto-detection is verified for Claude Code only. For Cursor, Augment, Copilot,
+or any editor without IDE attachment, add the sibling by hand to
+`.agent-settings.local.yml` (in agents/settings/):
+~~~yaml
+linked_projects:
+  - path: /abs/path/to/web   # or a path relative to the project
+    include: true
+~~~
+Or, if your agent reads a rules file, drop a short note there:
+~~~markdown
+## Linked sibling project: ../web
+`../web` changes alongside this repo. When an API/contract or shared-type
+change here may affect it, check `../web` for impact and warn. Don't load its
+files wholesale; access specific files on demand.
+~~~
+## Keep it local, never committed
+`.agent-settings.local.yml` (in agents/settings/) is **gitignored on purpose** — sibling paths are
+per-developer and per-machine. The installer does **not** touch your
+`.gitignore` (decision D2 — you own your ignore file), so if your project does
+not already ignore it, add:
+~~~gitignore
+.agent-settings.local.yml
+~~~
+## Validate it works
+Ask the agent:
+> Read `package.json` (or `composer.json`) from the linked sibling and tell me the project name.
+If it reports the name, cross-repo access works. An out-of-root edit will prompt
+for confirmation, then succeed — that is expected (the agent's permission gate
+still applies).
+## Tell us what works
+Auto-detection is verified for Claude Code only. If you use Cursor, Augment, or
+Copilot, please report whether the rule note alone worked, whether you needed to add the
+folder to the IDE workspace, or whether neither helped. That evidence is the trigger to extend
+verified auto-detection to your agent. See
+[ADR-032](../decisions/ADR-032-linked-projects-scope.md).
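For the VS Code case described in this guide (folders in a `*.code-workspace`), detection could look roughly like the sketch below. The real detector lives in `scripts/_lib/linked_projects.py` and is not shown here; the function name `detect_workspace_siblings` is hypothetical, and the sketch assumes the workspace file is plain JSON with no comments.

```python
import json
from pathlib import Path


def detect_workspace_siblings(workspace_file: Path, project_root: Path) -> list[Path]:
    """List folders named in a *.code-workspace file, excluding project_root itself."""
    data = json.loads(workspace_file.read_text(encoding="utf-8"))
    base = workspace_file.parent  # relative folder paths resolve against this dir
    root = project_root.resolve()
    siblings: list[Path] = []
    for entry in data.get("folders", []):
        raw = entry.get("path")
        if not raw:
            continue
        candidate = (base / raw).resolve()
        if candidate != root and candidate not in siblings:
            siblings.append(candidate)
    return siblings
```

Because only folders listed in the workspace file are returned, this stays config-driven and never picks up arbitrary adjacent directories.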

package/docs/migration/v1-to-v2.md CHANGED

@@ -1,10 +1,11 @@
 # Migration — v1 → v2 (npx-only runtime)
-> **Status:** skeleton. The one-shot `npx @event4u/agent-config migrate`
-> is implemented in P3.5 of
-> [`road-to-portable-runtime-and-update-check`](../../agents/roadmaps/road-to-portable-runtime-and-update-check.md);
-> this document tracks the user-facing cutover contract so consumers can
-> rehearse the change before the command lands.
+> **Status:** active. The one-shot `npx @event4u/agent-config migrate`
+> is implemented in `scripts/_cli/cmd_migrate.py`; its action matrix +
+> exit-code contract live in
+> [`docs/contracts/migrate-command.md`](../contracts/migrate-command.md).
+> This document is the user-facing narrative; the contract is the
+> normative reference.
 ## Why this change
@@ -56,32 +57,44 @@ The per-tool glue (`.claude/`, `.cursor/`, `.clinerules/`,
 the source that writes them changed (from `vendor/` /
 `node_modules/` scripts to the npx-resolved runtime).
-## The `migrate` command — contract sketch
+## The `migrate` command
 ```bash
-npx @event4u/agent-config migrate              # interactive, default
-npx @event4u/agent-config migrate --dry-run    # plan only, no writes
-npx @event4u/agent-config migrate --yes        # non-interactive
+npx @event4u/agent-config migrate              # apply, real changes
+npx @event4u/agent-config migrate --dry-run    # plan only, zero writes
 ```
-Order of operations (locked once P3.5 lands):
-1. Detect pre-v2 markers: `composer.json` require entry,
-   `package.json` devDependency, `vendor/event4u/agent-config/`,
-   `node_modules/@event4u/agent-config/`, legacy `.gitignore` lines,
-   `~/.claude/{rules,skills}/event4u/` and siblings.
-2. Print the planned change set (file removals, file writes, pin
-   value). Stop here under `--dry-run`.
-3. Remove dependency entries via the appropriate package manager
-   (`composer remove`, `npm uninstall` / `pnpm remove` / `yarn remove`).
-4. Wipe the retired user-scope `event4u/` namespace dirs.
-5. Write / update `.agent-settings.yml` with the new shape +
-   `agent_config_version` pin.
-6. Re-run `sync-gitignore` to refresh the project `.gitignore` block.
-7. Print a one-screen post-migration verification list.
-Idempotency: each step is a no-op when the v1 marker is absent. Re-runs
-print *"already on v2 — nothing to do"* and exit 0.
+One opinionated command, one flag. The full action matrix +
+exit-code contract is documented in
+[`docs/contracts/migrate-command.md`](../contracts/migrate-command.md);
+the operations summary below is the narrative form.
+Order of operations (fixed; foundation-first):
+1. Detect every legacy signal in one pass: `composer.json` require
+   entry, `package.json` devDependency, managed symlinks pointing
+   into `vendor/` / `node_modules/`, v0
+   `.implement-ticket-state.json`, project-local `.agent-settings.yml`
+   / `.agent-user.yml` (flat or under `settings/`), empty
+   `agent-config/` shell.
+2. Under `--dry-run`, print the planned change set and stop with
+   exit 0.
+3. Strip the package entries from `package.json` / `composer.json`
+   in-place; preserve sibling keys + formatting.
+4. Purge legacy symlinks; preserve user-managed symlinks elsewhere
+   with a warning.
+5. Migrate `.implement-ticket-state.json` → `.work-state.json`
+   (renames the v0 source to `.bak`).
+6. **Hard-delete** legacy project-local config files. The wizard
+   (`agent-config setup`) recreates fresh global config on the next
+   run — deletion is the design, not a regression.
+7. Remove the empty `agent-config/` shell directory if present.
+8. Refresh the `.gitignore` agent-config managed block.
+9. Print a summary listing every action taken.
+Idempotency: re-running on a fully-migrated repo prints
+*"already migrated — nothing to do"* and exits 0 without touching
+the filesystem.
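Step 3 and the idempotency note above can be illustrated with a small sketch. It is not the code in `scripts/_cli/cmd_migrate.py`: the function name `strip_package_entry` is hypothetical, and re-serializing with a two-space indent is an assumption that does not reproduce the format-preserving behavior the document describes.

```python
import json

PACKAGE = "@event4u/agent-config"


def strip_package_entry(package_json_text: str) -> tuple[str, bool]:
    """Remove PACKAGE from dependency sections; return (new_text, changed)."""
    data = json.loads(package_json_text)
    changed = False
    for section in ("dependencies", "devDependencies"):
        deps = data.get(section)
        if isinstance(deps, dict) and PACKAGE in deps:
            del deps[PACKAGE]  # sibling keys in the section are left untouched
            changed = True
    if not changed:
        # Idempotent path: hand back the input text byte-for-byte.
        return package_json_text, False
    return json.dumps(data, indent=2) + "\n", True


before = json.dumps(
    {"name": "app", "devDependencies": {PACKAGE: "^4.9.0", "vitest": "^2.1.9"}},
    indent=2,
)
after, first_changed = strip_package_entry(before)
again, second_changed = strip_package_entry(after)
```

The second call finds nothing to remove and returns its input unchanged, which mirrors the "nothing to do" re-run behavior described above.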
 ## Verification after migration

package/docs/value.md ADDED

@@ -0,0 +1,84 @@
+# Value Dashboard: what does the package cost, and what does it deliver?
+> This page answers **one** question in real numbers: *How many more tokens does the package cost me, and how many does it save afterwards?* Generated by `scripts/render_value_md.py` from the latest `value-v1` report; source: `internal/bench/reports/value/latest.json`.
+## How to read this page
+**Panel A (cost ladder)**: read from top to bottom. Each rung states *what it does*, *how many input tokens it adds or saves per request*, *what that costs in € per 1,000 requests*, and *where the cumulative total stands*. The bold **NET** line at the end is the answer.
+**Panel B (behavior)**: four real comparisons, *with* vs. *without* the package. This is where the non-token value lies: matching skill selection, stops before risky actions, fewer clarifying questions, more completed tasks.
+**Confidence markers** on each rung: `✅ measured` = real value from a report in the repo · `⏳ pending` = not yet measured, rung contributes 0 to the total · `⚠️ vendor-claim` = a vendor's claim, not measured by us.
+## Reference scale
+- **1,000** requests, averaging **8,000** input tokens and **600** output tokens per request
+- Model tier: `sonnet` · pricing as of `2026-05-14` (source: `internal/bench/pricing.yaml`)
+- If you run a different workload, recompute for yourself: the methodology is disclosed and nothing is hidden in hard-coded values.
+## Panel A: cost ladder (cumulative, min → max)
+Read from top to bottom. Positive Δ values = the package *costs* tokens (the rule load is the honest up-front tax); negative Δ values = the package *saves* tokens.
+| Rung | What it does | Δ tokens | Δ € (1k req) | Cumulative | Source |
+|---|---|---:|---:|---:|---|
+| **Ohne Paket / Without package** | Baseline: the bare request without package rules. | +0 | +0.00 € | +0.00% | `n/a` · ✅ measured |
+| Mit Paket (Regeln laden) / With package (rule load) | The always-on rules land in the context of every request. ⚠️ more expensive at first | +8 895 | +24.55 € | +111.19% | `dist/router.json` · ✅ measured |
+| | _Footnote:_ Kernel = 10 rules (31570 chars) + charter (4010 chars); tokens ≈ chars / 4. | | | | |
+| + condense (Regeln eindampfen) / + condense (rule shrink) | Build step shrinks the rule files before shipping. | -186 | -0.51 € | +108.86% | `internal/bench/reports/telegraph-v2.json` · ✅ measured |
+| | _Footnote:_ Aggregate across non-Thin-Root categories; Thin-Root files (AGENTS.md variants) are net negative (~−4%) and are excluded from the rung; they are surfaced separately. | | | | |
+| + rtk (CLI-Output filtern) / + rtk (filter CLI output) | rtk trims verbose CLI output before it reaches the model input. | -593 | -1.64 € | +101.45% | `internal/bench/reports/rtk/latest.json` · ✅ measured |
+| + terse (Antworten knapper) / + terse (shorter replies) | Telegraph style aims for shorter model replies. | +56 | +0.77 € | +102.15% | `internal/bench/reports/telegraph-v1.json` · ✅ measured |
+| | _Footnote:_ Honest note: measured median = -9.27% against 'be concise'; Telegraph produces more tokens here, not fewer. We measure, we don't hide. | | | | |
+**NET: extra cost** ⚠️: **+8 172 tokens / request**, **+22.55 €** per 1,000 requests, cumulative **+102.15%** vs. baseline.
+## Panel B: behavior (with vs. without)
+Four real comparisons from actual bench runs. This is the value that tokens alone do not measure: whether the agent picks the right skill, stops before risky actions, asks fewer clarifying questions, and completes more tasks.
+| Metric | What it means | With package | Without package | Δ | Mode |
+|---|---|---:|---:|---:|---|
+| Right-skill selection / Richtige Skill-Wahl | How often the matching skill is activated (top-K hit). | 50.0% | 0.0% | 50.0% | ✅ live |
+| Destructive-op stops / Stopps bei riskanten Aktionen | How often the agent halts or asks before destructive ops (out of 5). | n/a | n/a | n/a | ⚠️ dry-run |
+| Ask-vs-act ratio / Fragen vs. Handeln | Ratio of clarifying questions to actions; lower = more decisive. | 0.000 | 0.000 | 0.000 | ✅ live |
+| Task completion rate / Aufgaben fertig | Share of tasks the agent completes fully. | 84.6% | 7.7% | 76.9% | ✅ live |
+## Glossary
+Plain-language definitions for the non-developer reader.
+- **Token**: the unit a language model bills in. Rule of thumb: one token ≈ 4 characters of German or English prose. 1,000 tokens ≈ 750 words.
+- **Input tokens**: everything the model reads per turn (system prompt, always-on rules, your message, earlier conversation). The package adds rules here, so installing it costs input tokens.
+- **Output tokens**: what the model writes back. Usually fewer than the input tokens, but more expensive per token.
+- **condense**: a build step that shrinks the rule files before shipping (`.agent-src.uncondensed` → `.agent-src`). Saves input tokens on every request.
+- **rtk**: the *Rust Token Killer*, a CLI wrapper that filters verbose output (`git status`, lint output, test runners) before the model reads it. Saves input tokens on tool calls.
+- **terse / telegraph**: a style (short phrases, dropped articles) the agent uses for shorter replies. Saves output tokens when the corpus rewards it.
+- **Ohne Paket / Mit Paket**: German for *without the package* / *with the package*, the two arms of the A/B comparison.
+- **€-per-1k-requests**: token cost at the reference scale (1,000 requests of average size, priced at the current Sonnet rates from `internal/bench/pricing.yaml`).
+## Methodology & sources
+This page is a **derived** view, not a measurement of its own. It combines three existing bench surfaces (see the 'Source' column in Panel A). The machine-readable raw reports remain the source of truth:
+- `internal/bench/reports/telegraph-v1.json` / `telegraph-v2.json`: Telegraph/condense measurements.
+- `agents/runtime/frugality/baseline.jsonl`: the package load (Metric A footprint).
+- `internal/bench/reports/rtk/latest.json`: the rtk measurement (new, Phase 2).
+- `internal/bench/reports/ab/*-ab-trackb-{with,without}.json`: A/B Track B (behavior).
+- `internal/bench/reports/*-dev.json`: dev-corpus selection accuracy.
+**A/B technical appendix:** [`docs/benchmark.md`](benchmark.md) carries the cache-key, integrity, and methodology details of the A/B bench; read on there for the variant-axis proof.
+**Notes from the report:**
+- Token→€ conversion priced at sonnet rates from internal/bench/pricing.yaml (sourced_on=2026-05-14).
+- Pending rungs contribute 0 to the cumulative until measured.
+- Reference scale: 1000 requests × 8000 input / 600 output tokens per request.
+_Last rendered: `2026-05-29T04:36:04+00:00`_
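The token figures in Panel A can be re-derived from the numbers printed on the page. The sketch below is not `scripts/render_value_md.py`; the function name `cumulative_ladder` is hypothetical. Dividing the running token delta by the 8,000-token input baseline and rounding half-up to two decimals are inferences that happen to reproduce the table, not documented behavior. Euro amounts are left out because the per-token rates are not shown.

```python
from decimal import Decimal, ROUND_HALF_UP

BASELINE_INPUT_TOKENS = 8000  # reference scale: input tokens per request

# Rule-load footnote: 10 rules (31570 chars) + charter (4010 chars), tokens ≈ chars / 4.
kernel_tokens = (31570 + 4010) // 4

# Per-rung token deltas as printed in Panel A.
RUNGS = [
    ("rule load", kernel_tokens),
    ("condense", -186),
    ("rtk", -593),
    ("terse", 56),
]


def cumulative_ladder(rungs, baseline_tokens):
    """Return (name, running token delta, cumulative % vs. baseline) per rung."""
    running = 0
    rows = []
    for name, delta in rungs:
        running += delta
        pct = (Decimal(100 * running) / Decimal(baseline_tokens)).quantize(
            Decimal("0.01"), rounding=ROUND_HALF_UP
        )
        rows.append((name, running, pct))
    return rows


ladder = cumulative_ladder(RUNGS, BASELINE_INPUT_TOKENS)
```

The last row of `ladder` carries the net token delta per request and the cumulative percentage, which can be compared against the net line in Panel A.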

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
     "name": "@event4u/agent-config",
-    "version": "4.9.0",
+    "version": "5.1.0",
     "description": "Universal AI Agent OS \u2014 audited skills, governance rules, commands, and templates for AI coding tools (Claude Code, Cursor, Windsurf, Copilot).",
     "license": "MIT",
     "private": false,
@@ -93,15 +93,15 @@
         "vitest": "^2.1.9"
     },
     "dependencies": {
-        "@fastify/static": "^9.1.3",
+        "@fastify/static": "^9.1.0",
         "@preact/signals": "^2.9.0",
         "commander": "^12.1.0",
-        "execa": "^9.6.1",
-        "fastify": "^5.8.5",
-        "js-yaml": "^4.1.1",
+        "execa": "^9.5.0",
+        "fastify": "^5.8.0",
+        "js-yaml": "^4.1.0",
         "open": "^10.2.0",
-        "preact": "^10.29.2",
-        "zod": "^3.25.76",
-        "zod-to-json-schema": "^3.25.2"
+        "preact": "^10.29.0",
+        "zod": "^3.25.0",
+        "zod-to-json-schema": "^3.25.0"
     }
 }

package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc CHANGED

Binary file