@event4u/agent-config 3.1.1 → 3.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (96)
  1. package/.agent-src/commands/agent-status.md +1 -1
  2. package/.agent-src/commands/analytics/prune.md +78 -0
  3. package/.agent-src/commands/analytics/show.md +107 -0
  4. package/.agent-src/commands/analytics.md +64 -0
  5. package/.agent-src/commands/knowledge/forget.md +104 -0
  6. package/.agent-src/commands/knowledge/ingest.md +122 -0
  7. package/.agent-src/commands/knowledge/list.md +102 -0
  8. package/.agent-src/commands/knowledge.md +75 -0
  9. package/.agent-src/scripts/update_roadmap_progress.py +1 -1
  10. package/.agent-src/skills/compress-memory/SKILL.md +1 -1
  11. package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
  12. package/.claude-plugin/marketplace.json +8 -1
  13. package/AGENTS.md +5 -4
  14. package/CHANGELOG.md +54 -222
  15. package/README.md +12 -2
  16. package/dist/discovery/deprecation-report.md +1 -1
  17. package/dist/discovery/discovery-manifest.json +164 -10
  18. package/dist/discovery/discovery-manifest.json.sha256 +1 -1
  19. package/dist/discovery/discovery-manifest.summary.md +3 -3
  20. package/dist/discovery/orphan-report.md +1 -1
  21. package/dist/discovery/packs.json +12 -5
  22. package/dist/discovery/trust-report.md +2 -2
  23. package/dist/discovery/workspaces.json +11 -4
  24. package/dist/mcp/mcp-cloudflare-catalogue.json +2 -0
  25. package/dist/mcp/registry-manifest.json +5 -3
  26. package/docs/architecture.md +1 -1
  27. package/docs/archive/CHANGELOG-pre-3.2.0.md +268 -0
  28. package/docs/benchmarks.md +4 -4
  29. package/docs/catalog.md +9 -2
  30. package/docs/contracts/CHANGELOG-conventions.md +20 -1
  31. package/docs/contracts/adr-mcp-runtime.md +1 -1
  32. package/docs/contracts/at-rest-encryption.md +146 -0
  33. package/docs/contracts/benchmark-corpus-spec.md +3 -3
  34. package/docs/contracts/benchmark-report-schema.md +5 -5
  35. package/docs/contracts/caveman-telemetry.md +4 -4
  36. package/docs/contracts/compression-default-kill-criterion.md +5 -5
  37. package/docs/contracts/cost-enforcement.md +1 -1
  38. package/docs/contracts/daily-workspace.md +137 -0
  39. package/docs/contracts/explain-modes.md +146 -0
  40. package/docs/contracts/host-agent-protocol.md +88 -0
  41. package/docs/contracts/local-analytics.md +148 -0
  42. package/docs/contracts/local-knowledge-ingestion.md +96 -0
  43. package/docs/contracts/mcp-beta-criteria.md +1 -1
  44. package/docs/contracts/mcp-cloud-scope.md +4 -4
  45. package/docs/contracts/mcp-registry-manifest.schema.json +1 -1
  46. package/docs/contracts/mcp-tool-inventory.md +1 -1
  47. package/docs/contracts/mcp-tool-stub-envelope.md +1 -1
  48. package/docs/contracts/measurement-baseline.md +6 -6
  49. package/docs/contracts/role-experience.md +121 -0
  50. package/docs/contracts/workspace-documents.md +140 -0
  51. package/docs/decisions/ADR-022-daily-workspace-decomposition.md +140 -0
  52. package/docs/decisions/ADR-023-host-agent-protocol.md +129 -0
  53. package/docs/decisions/ADR-024-workspace-v0-feature-floor.md +126 -0
  54. package/docs/decisions/ADR-025-workspace-chrome.md +119 -0
  55. package/docs/decisions/ADR-026-explain-mode-translation.md +117 -0
  56. package/docs/decisions/ADR-027-changelog-machine-vs-manual.md +129 -0
  57. package/docs/decisions/ADR-028-root-layout.md +147 -0
  58. package/docs/decisions/ADR-029-multi-workspace-deferred.md +122 -0
  59. package/docs/decisions/INDEX.md +8 -0
  60. package/docs/deploy/small-team-recipe.md +148 -0
  61. package/docs/deploy/team-deployment-posture.md +91 -0
  62. package/docs/getting-started-by-role.md +27 -0
  63. package/docs/getting-started.md +1 -1
  64. package/docs/guides/local-analytics.md +125 -0
  65. package/docs/guides/local-knowledge.md +127 -0
  66. package/docs/mcp-server.md +1 -1
  67. package/docs/parity/bench-ruflo.json +3 -3
  68. package/docs/parity/ruflo.md +1 -1
  69. package/docs/setup/mcp-client-config.md +1 -1
  70. package/docs/setup/mcp-cloud-endpoints.md +1 -1
  71. package/docs/setup/mcp-cloud-setup.md +2 -2
  72. package/docs/setup/mcp-r2-bootstrap.md +1 -1
  73. package/package.json +4 -2
  74. package/scripts/__pycache__/validate_frontmatter.cpython-312.pyc +0 -0
  75. package/scripts/_lib/__pycache__/__init__.cpython-312.pyc +0 -0
  76. package/scripts/_lib/__pycache__/agent_src.cpython-312.pyc +0 -0
  77. package/scripts/_lib/bench_caveman.py +2 -2
  78. package/scripts/_lib/bench_caveman_report.py +1 -1
  79. package/scripts/_lib/bench_cost.py +2 -2
  80. package/scripts/_lib/bench_report.py +2 -2
  81. package/scripts/_lib/changelog_eras.py +330 -0
  82. package/scripts/audit_mcp_tools.py +1 -1
  83. package/scripts/bench_baseline_ready.py +3 -3
  84. package/scripts/bench_compress_memory.py +4 -4
  85. package/scripts/bench_drift_check.py +2 -2
  86. package/scripts/bench_per_tool.py +2 -2
  87. package/scripts/bench_run.py +4 -4
  88. package/scripts/build_mcp_registry_manifest.py +2 -2
  89. package/scripts/mcp_server/__init__.py +1 -1
  90. package/scripts/mcp_server/catalog.py +1 -1
  91. package/scripts/mcp_server/consumer_tool_catalog.json +1 -1
  92. package/scripts/mcp_server/tools.py +1 -1
  93. package/scripts/memory_lookup.py +78 -1
  94. package/scripts/pack_mcp_content.py +6 -6
  95. package/scripts/release.py +93 -3
  96. package/scripts/skill_trigger_eval.py +2 -2
@@ -13,7 +13,7 @@ keep-beta-until: 2026-08-15
 
  | Key | Value | Provenance |
  |---|---|---|
- | `caveman_multiplier_version` | `v1` | Tied to `bench/reports/caveman-v1.{json,md}` |
+ | `caveman_multiplier_version` | `v1` | Tied to `internal/bench/reports/caveman-v1.{json,md}` |
  | `caveman_multiplier_value` | `0.9155` | `median(terse_control_tokens / compressed_tokens)` over the 10-prompt v1 corpus |
  | `caveman_multiplier_p10` | `0.4506` | 10th percentile (worst-case carve-out-tax prompts) |
  | `caveman_multiplier_p90` | `2.3664` | 90th percentile (pure-prose prompts where caveman wins) |
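As a rough illustration of how the `caveman_multiplier_value` row is defined, the sketch below takes the median of per-prompt `terse_control_tokens / compressed_tokens` ratios. The function name and input shape are hypothetical, and the token counts are placeholders rather than values from the v1 corpus.

```python
from statistics import median


def caveman_multiplier(pairs):
    """Median of terse_control_tokens / compressed_tokens over a corpus.

    `pairs` holds one (terse_control_tokens, compressed_tokens) tuple per
    prompt. Name and input shape are assumptions made for illustration.
    """
    ratios = [terse / compressed for terse, compressed in pairs if compressed > 0]
    if not ratios:
        raise ValueError("no usable prompt measurements")
    return median(ratios)


# Placeholder counts, not taken from the v1 bench report.
sample = [(90, 100), (120, 100), (45, 100)]
print(caveman_multiplier(sample))
```

Under this definition, a value below 1 means the terse-control arm used fewer tokens than the compressed arm on the median prompt.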
@@ -40,7 +40,7 @@ where `M = caveman_multiplier_value`.
 
  ## Why suspended after v1
 
- The `caveman-v1` bench (`bench/reports/caveman-v1.md`, 30 calls,
+ The `caveman-v1` bench (`internal/bench/reports/caveman-v1.md`, 30 calls,
  2026-05-16) found:
 
  - Median savings vs raw uncompressed: **+23.51 %** (inflated by the
@@ -78,6 +78,6 @@ Until a v2 bench (broader corpus or a re-tuned dialect) lifts the
  ## See also
 
  - [`compression-default-kill-criterion.md`](compression-default-kill-criterion.md) — the rule-default-flip gate; this multiplier is gated on the same `vs_terse` arm.
- - [`bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md) — provenance for the `v1` value.
- - [`bench/reports/caveman-v2.md`](../../bench/reports/caveman-v2.md) — input-side (orthogonal); does NOT feed this multiplier (this multiplier is output-side).
+ - [`internal/bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md) — provenance for the `v1` value.
+ - [`internal/bench/reports/caveman-v2.md`](../../bench/reports/caveman-v2.md) — input-side (orthogonal); does NOT feed this multiplier (this multiplier is output-side).
  - [`caveman-speak`](../../.agent-src.uncompressed/rules/caveman-speak.md) — runtime rule the multiplier measures.
@@ -6,7 +6,7 @@ keep-beta-until: 2026-08-14
  # Compression default — kill-criterion
 
  > **Status:** v1-measured · criterion not met · default stays `off` · **Owner:** `step-16-caveman-substance.md`
- > Phase 1 closeout · **Sources:** [`bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md) ·
+ > Phase 1 closeout · **Sources:** [`internal/bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md) ·
  > [`council-synthesis.md` § 7](../../agents/evidence/audits/2026-05-14-north-star/council-synthesis.md) <!-- council-ref-allowed: ADR decision trace for v1 kill-criterion verdict --> ·
  > [`caveman-v1-kc-verdict.json`](../../agents/runtime/council/responses/caveman-v1-kc-verdict.json) <!-- council-ref-allowed: ADR decision trace for v1 kill-criterion verdict -->
 
@@ -23,14 +23,14 @@ DECISION OWNED BY THE NEXT BENCH CLOSEOUT, NOT BY THIS DOC.
  [`caveman-speak`](../../.agent-src.uncompressed/rules/caveman-speak.md)
  but the feature is non-promoted: no skill recommends turning it on,
  no preset enables it, no profile depends on it.
- 2. **Baselines.** Every published `bench/reports/caveman-v<N>.{json,md}`
+ 2. **Baselines.** Every published `internal/bench/reports/caveman-v<N>.{json,md}`
  measures three arms (`compressed` · `terse-control` ·
  `uncompressed`) and reports two savings columns:
  - `vs_raw` — median savings against the uncompressed arm.
  - `vs_terse` — **load-bearing** median savings against the
  `Answer concisely.` terse-control arm. `vs_raw` is inflated by the
  carve-out-tax-free pure-prose case and is **not** the gate metric.
- 3. **Decision table.** Read the latest `bench/reports/caveman-v<N>.md`
+ 3. **Decision table.** Read the latest `internal/bench/reports/caveman-v<N>.md`
  and apply exactly one of:
 
  | Measured `vs_terse` median | Quality regression on corpus | Verdict |
@@ -50,7 +50,7 @@ DECISION OWNED BY THE NEXT BENCH CLOSEOUT, NOT BY THIS DOC.
 
  ## v1 verdict (2026-05-16)
 
- [`bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md)
+ [`internal/bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md)
  landed 30 calls · $0.0805 · 0 errors · `claude-sonnet-4-5`:
 
  | Metric | Median | p10 | p90 |
@@ -100,7 +100,7 @@ re-litigating compression on every PR.
 
  ## Cross-references
 
- - [`bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md)
+ - [`internal/bench/reports/caveman-v1.md`](../../bench/reports/caveman-v1.md)
  — v1 measurement; canonical baseline this doc cites.
  - [`docs/benchmarks.md`](../benchmarks.md)
  — cadence + when the next bench run is mandatory.
@@ -131,4 +131,4 @@ suite is wired to `task test-cost-budget` per `step-11` Phase 2 Step 5.
  - `step-11-ruflo-parity` — Measurement & Governance Parity roadmap.
  - `docs/contracts/cost-dashboard.md` — companion dashboard contract.
  - `scripts/cost/budget.mjs` — evaluator implementation.
- - `bench/pricing.yaml` — per-model USD pricing table.
+ - `internal/bench/pricing.yaml` — per-model USD pricing table.
@@ -0,0 +1,137 @@
+ # Daily Workspace Surface Contract
+
+ > **Status** · v0 / design · 2026-05-24. Surface contract for the daily
+ > workspace introduced as Phase 4 of the employee-product workstream.
+ > Governed by ADRs [`022`](../decisions/ADR-022-daily-workspace-decomposition.md) ·
+ > [`023`](../decisions/ADR-023-host-agent-protocol.md) ·
+ > [`024`](../decisions/ADR-024-workspace-v0-feature-floor.md) ·
+ > [`025`](../decisions/ADR-025-workspace-chrome.md).
+
+ ## Shape (v0)
+
+ Browser tab at `http://127.0.0.1:<gui-port>/workspace`, served by the
+ existing installer GUI (`packages/core/installer/src/gui/server.ts`).
+ Same CSRF token, same loopback bind, same kill-switch as
+ [`gui-wizard`](gui-wizard.md). Launched via
+ `npx @event4u/agent-config workspace` (alias for
+ `init --gui --route=/workspace` once wired).
+
+ ```
+ ┌─ /workspace ─────────────────────────────────────────────────┐
+ │ [identity strip — shared with installer GUI shell]           │
+ ├────────────────────┬─────────────────────────────────────────┤
+ │ Role + Task        │ Active session log                      │
+ │ launcher           │ (latest JSONL entries, append-only)     │
+ │                    │                                         │
+ │ - galabau          │ ▸ 12:04 launch · role=galabau           │
+ │ - content-creator  │ ▸ 12:05 host · claude / tier-1          │
+ │ - consultant       │ ▸ 12:08 host · turn.completed           │
+ │                    │                                         │
+ │ (Phase 3 roles)    │ Knowledge pane                          │
+ │                    │ - source: handbuch.pdf                  │
+ │                    │ - source: angebot-template.md           │
+ │                    │ (Phase 2 namespace; "no sources yet"    │
+ │                    │ when empty)                             │
+ └────────────────────┴─────────────────────────────────────────┘
+ ```
+
+ No left / centre / right three-rail layout in v0 (deferred per
+ ADR-024). One launcher, one log, one stub pane.
+
+ ## Endpoints (additions to the GUI server)
+
+ All endpoints CSRF-gated, loopback-bound. Existing wizard endpoints
+ in [`gui-wizard`](gui-wizard.md) are untouched.
+
+ | Method · Path | Purpose |
+ |---|---|
+ | `GET /workspace` | HTML shell + initial state (role list, recent sessions). |
+ | `GET /api/v1/workspace/roles` | List available roles from `agents/roles/<role>/`. |
+ | `GET /api/v1/workspace/roles/:role/tasks` | Per-role task list from `skills.yml` + `prompts/`. |
+ | `POST /api/v1/workspace/launch` | Body: `{ role, task, host? }`. Resolves host via ADR-023 tier; runs the launch; appends to JSONL log. |
+ | `GET /api/v1/workspace/sessions` | List of recent sessions (≤ 20, ordered by mtime). |
+ | `GET /api/v1/workspace/sessions/:id` | Streams the JSONL log for one session. |
+ | `GET /api/v1/workspace/knowledge` | Snapshot of the current `knowledge:` memory namespace (read-only). |
+
+ ## Session JSONL schema
+
+ Path: `~/.event4u/agent-config/workspace/sessions/<yyyy-mm-dd>/<session-id>.jsonl`
+ (one file per session; append-only; UTF-8). Session id = `YYYYMMDDTHHMMSSZ-<8-hex>`.
+
+ Each line is one JSON record with the shared envelope:
+
+ ```json
+ { "ts": "<iso-8601-utc>", "kind": "<event-kind>", "data": { … } }
+ ```
+
+ Event kinds:
+
+ - `launcher.input` — `{ role, task, rendered_prompt, host_tier, host_id }`
+ - `host.turn` — `{ host_id, turn_id, model, input_tokens, output_tokens, latency_ms }`
+ - `host.output` — `{ host_id, turn_id, role: "assistant", text }` *(verbatim host envelope text — Tier 1 only)*
+ - `host.tool` — `{ host_id, turn_id, tool_name, input, output_excerpt }` *(when the host envelope surfaces it)*
+ - `host.error` — `{ host_id, message, exit_code }`
+ - `inbox.handoff` — `{ inbox_path, copied_to_clipboard: bool }` *(Tier 3 only)*
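The session id format, the dated path, and the shared envelope can be sketched together as follows. This is a minimal illustration, not the package's implementation: the helper names and the `root` argument are hypothetical, and the 8 hex characters are assumed to be random.

```python
import json
import secrets
from datetime import datetime, timezone
from pathlib import Path


def new_session_id(now: datetime) -> str:
    """Build `YYYYMMDDTHHMMSSZ-<8-hex>` from a UTC timestamp."""
    return f"{now.strftime('%Y%m%dT%H%M%SZ')}-{secrets.token_hex(4)}"


def session_path(root: Path, session_id: str) -> Path:
    """Place the file under `<root>/sessions/<yyyy-mm-dd>/<session-id>.jsonl`."""
    day = f"{session_id[0:4]}-{session_id[4:6]}-{session_id[6:8]}"
    return root / "sessions" / day / f"{session_id}.jsonl"


def append_record(path: Path, kind: str, data: dict, now: datetime) -> None:
    """Append one envelope line; existing lines are never rewritten."""
    record = {
        "ts": now.isoformat().replace("+00:00", "Z"),
        "kind": kind,
        "data": data,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")
```

In real use, `root` would be `Path.home() / ".event4u/agent-config/workspace"`; the tests pass a temporary directory instead.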
+
+ No PII in filenames. No remote sync. Encryption-at-rest deferred to a
+ future ADR.
+
+ ## Inbox handoff (Tier 3)
+
+ Path: `~/.event4u/agent-config/workspace/inbox/<yyyy-mm-dd>/<id>.md`.
+
+ ```markdown
+ ---
+ created_at: 2026-05-24T12:08:00Z
+ role: galabau
+ task: angebot-erstellen
+ host_tier: 3
+ host_id: cursor
+ ---
+
+ [rendered prompt body — skill context inlined per ADR-023]
+ ```
+
+ The UI surfaces a one-line banner: "Workspace wrote
+ `~/.event4u/.../<id>.md`. Open it in Cursor and paste." Clicking
+ the banner copies the path to clipboard.
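A small sketch of how the inbox file text could be assembled from the front-matter fields shown above. The function name is hypothetical, and the values are assumed to be plain slugs that need no YAML escaping.

```python
from datetime import datetime, timezone


def render_inbox_file(role, task, host_id, body, created_at):
    """Return the Markdown text for a Tier 3 inbox handoff file."""
    stamp = created_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    front_matter = [
        "---",
        f"created_at: {stamp}",
        f"role: {role}",
        f"task: {task}",
        "host_tier: 3",  # inbox handoff is the Tier 3 fallback
        f"host_id: {host_id}",
        "---",
    ]
    # Front matter, one blank line, then the rendered prompt body.
    return "\n".join(front_matter) + "\n\n" + body.rstrip("\n") + "\n"
```

Writing the returned text to disk, and surfacing a banner when that write fails, is left to the caller.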
+
+ ## Skill resolution
+
+ Tier 1 with skill surface (Claude Code only) — workspace passes the
+ slash command as part of the prompt body (`/work "<task>"` style)
+ and lets the host resolve it from `.claude/commands/`.
+
+ Tier 1 without skill surface (Codex, Gemini) and Tier 3 — workspace
+ **inlines** the skill body into the rendered prompt. The host gets
+ the prompt with skill context as a self-contained block.
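The two resolution paths can be sketched as one branch. The `host_id` value `"claude"` and the layout of the inlined prompt are assumptions for illustration; the contract only fixes which hosts receive a slash command and which receive the inlined skill body.

```python
def render_prompt(task, skill_body, host_id, host_tier):
    """Pick slash-command or inlined-skill rendering for a launch."""
    has_skill_surface = host_tier == 1 and host_id == "claude"
    if has_skill_surface:
        # The host resolves the command from `.claude/commands/` itself.
        return f'/work "{task}"'
    # Codex, Gemini, and Tier 3 hosts receive the skill body inline.
    return f"{skill_body.strip()}\n\n{task}"
```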
+
+ ## State scope
+
+ - Per-user. Local-only. One workspace per OS user.
+ - No multi-tenant view in v0. Multi-user deployment (the topology
+ from [`ADR-021`](../decisions/ADR-021-deployment-shape.md)) is
+ out of scope for v0.
+ - Closing the browser tab does not kill running host subprocesses.
+ Reopening shows the live JSONL log.
+
+ ## Failure modes & telemetry
+
+ - Host CLI not installed → workspace renders "Host `<id>` not
+ available" banner with install link. No silent fallback.
+ - JSON envelope shape change → demote host to Tier 3 per ADR-023.
+ - Inbox write failure (disk full, permissions) → red banner; no
+ silent loss.
+
+ Telemetry stays off by default (project inertia). When the user
+ opts in via `.agent-settings.yml`, the workspace emits
+ `workspace.launch`, `workspace.host_turn`, `workspace.inbox_handoff`
+ counters only. No prompt bodies, no response bodies.
+
+ ## Cross-references
+
+ - ADRs: [`022`](../decisions/ADR-022-daily-workspace-decomposition.md) · [`023`](../decisions/ADR-023-host-agent-protocol.md) · [`024`](../decisions/ADR-024-workspace-v0-feature-floor.md) · [`025`](../decisions/ADR-025-workspace-chrome.md).
+ - Host-agent protocol: [`host-agent-protocol`](host-agent-protocol.md).
+ - GUI substrate: [`gui-wizard`](gui-wizard.md).
+ - Knowledge ingestion: [`local-knowledge-ingestion`](local-knowledge-ingestion.md).
+ - Role experience: [`role-experience`](role-experience.md).
@@ -0,0 +1,146 @@
+ # Explain Modes Contract
+
+ > **Status** · v0 / design · 2026-05-24. Phase 6 of the
+ > employee-product workstream.
+ > Governed by [`ADR-026`](../decisions/ADR-026-explain-mode-translation.md).
+ > Translates the existing engineer-shaped `explain-v1` envelope into a
+ > role-aware plain surface, without changing the underlying data.
+
+ ## Two modes over one envelope
+
+ The agent-memory MCP already returns an `explain-v1` envelope per
+ `memory_explain`. It speaks engineer: `trust_score`, `score_breakdown`,
+ `promotion_history`, `contradictions`, `decay`. Phase 6 keeps that
+ envelope as the single source of truth and renders **two views**:
+
+ | Mode | Default for | Vocabulary |
+ |---|---|---|
+ | `technical` | engineering-lead, platform-engineer, default for `--debug` flag | trust_score, decay rate, promotion path, contradictions count |
+ | `plain` | every other role (galabau, content-creator, consultant, …) | "where this came from", "how confident", "when last reviewed", "what's contested" |
+
+ No new MCP call. No new data fetch. The plain renderer is a **pure
+ function** over the existing envelope.
+
+ ## Field mapping
+
+ | envelope field | technical label | plain label (default) |
+ |---|---|---|
+ | `trust_score` (0.0–1.0) | "Trust score" | "Confidence" with 4-band label (Very High ≥ 0.85 · High ≥ 0.65 · Medium ≥ 0.40 · Low < 0.40) |
+ | `score_breakdown.validation` | "Validation contribution" | "How well it's been checked" |
+ | `score_breakdown.usage` | "Usage contribution" | "How often it's been used" |
+ | `score_breakdown.recency` | "Recency contribution" | "How recently it was confirmed" |
+ | `score_breakdown.contradictions` | "Contradiction penalty" | "Disagreements found" |
+ | `promotion_history[]` | "Promotion timeline" | "When this was confirmed" (most recent first, ≤ 3 entries) |
+ | `contradictions[]` | "Unresolved contradictions" | "What disagrees with this" |
+ | `decay.applied_factor` | "Decay factor" | "Freshness" with 3-band label (Fresh ≥ 0.80 · Aging ≥ 0.50 · Stale < 0.50) |
+ | `evidence.sources[]` | "Sources" | "Where this came from" |
+ | `last_reviewed_at` | "Last reviewed" | "When last reviewed" + human-relative ("3 days ago") |
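The two banded rows of the mapping table reduce to threshold checks. The sketch below hard-codes the default thresholds from the table; the function names are hypothetical.

```python
def confidence_band(trust_score: float) -> str:
    """Map `trust_score` (0.0 to 1.0) to the default 4-band plain label."""
    if trust_score >= 0.85:
        return "Very High"
    if trust_score >= 0.65:
        return "High"
    if trust_score >= 0.40:
        return "Medium"
    return "Low"


def freshness_band(applied_factor: float) -> str:
    """Map `decay.applied_factor` to the default 3-band plain label."""
    if applied_factor >= 0.80:
        return "Fresh"
    if applied_factor >= 0.50:
        return "Aging"
    return "Stale"
```

With these defaults, a `trust_score` of 0.74 falls in the "High" band and a decay factor of 0.20 falls in "Stale".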
+
+ The technical view renders one section per envelope field, terse,
+ tabular. The plain view renders four labelled paragraphs:
+
+ ```
+ Where this came from
+ 3 sources — handbuch.pdf · offer-template.md · 1 council vote.
+
+ How confident
+ High (0.74). Last confirmed 3 days ago.
+
+ When last reviewed
+ 2026-05-21 — by the maintenance pass.
+
+ What's contested
+ No open disagreements.
+ ```
+
+ ## Per-role glossary override
+
+ Each role may ship an `agents/roles/<role>/explain-glossary.yml`
+ that overrides default plain-mode labels and the 4-band threshold
+ points. The file is optional; missing → defaults are used.
+
+ ```yaml
+ # agents/roles/galabau/explain-glossary.yml
+ schema: explain-glossary/v0
+ labels:
+   confidence: "Sicherheit"
+   sources: "Woher das stammt"
+   last_reviewed: "Zuletzt geprüft"
+   contradictions: "Was widerspricht"
+ bands:
+   confidence:
+     very_high: 0.85
+     high: 0.65
+     medium: 0.40
+   freshness:
+     fresh: 0.80
+     aging: 0.50
+ ```
+
+ Labels stay in `.md` source English (per `language-and-tone`);
+ **glossary YAMLs are the exception** — they hold the localized
+ runtime strings for the rendered surface and may be in the role's
+ native language. Loader validates `schema:` matches `explain-glossary/v0`.
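A minimal sketch of the loader behaviour described above: an optional glossary is merged over the defaults, and an unknown `schema:` falls back to defaults with a warning. It takes an already-parsed mapping so the example stays within the standard library; the default label strings are taken from the field-mapping table, and `resolve_glossary` is a hypothetical name.

```python
import copy
import logging

DEFAULT_GLOSSARY = {
    "labels": {
        "confidence": "Confidence",
        "sources": "Where this came from",
        "last_reviewed": "When last reviewed",
        "contradictions": "What disagrees with this",
    },
    "bands": {
        "confidence": {"very_high": 0.85, "high": 0.65, "medium": 0.40},
        "freshness": {"fresh": 0.80, "aging": 0.50},
    },
}


def resolve_glossary(parsed):
    """Merge an optional parsed glossary over the defaults.

    `parsed` is the mapping loaded from explain-glossary.yml, or None
    when the role ships no glossary file.
    """
    merged = copy.deepcopy(DEFAULT_GLOSSARY)
    if parsed is None:
        return merged
    if parsed.get("schema") != "explain-glossary/v0":
        logging.warning("unknown glossary schema %r; using defaults", parsed.get("schema"))
        return merged
    merged["labels"].update(parsed.get("labels", {}))
    for group, thresholds in parsed.get("bands", {}).items():
        merged["bands"].setdefault(group, {}).update(thresholds)
    return merged
```

Copying the defaults before merging keeps one role's override from leaking into the next role's render.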
+
+ ## `/why` quick command
+
+ Any role may invoke `/why` on the last agent reply. Resolution:
+
+ 1. Look up the last `host.turn` in the active session JSONL.
+ 2. Extract memory entry IDs referenced in the reply (regex on
+ `mem://<id>` markers the host envelope already emits).
+ 3. Call `memory_explain` for each id; merge envelopes.
+ 4. Render in the active mode (plain by default, technical if the
+ role's `explain_default` is `technical`).
+ 5. Append the rendered output to the session JSONL as
+ `{ kind: "explain.rendered", data: { mode, ids: [...] } }`.
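Step 2 of the resolution above can be sketched as a regex scan. The character class used for `<id>` is an assumption; the contract does not specify which characters an id may contain.

```python
import re

# Assumed id alphabet: letters, digits, underscore, hyphen.
MEM_MARKER = re.compile(r"mem://([A-Za-z0-9_-]+)")


def extract_memory_ids(reply_text: str) -> list:
    """Return referenced memory ids in first-seen order, without duplicates."""
    seen = []
    for match in MEM_MARKER.finditer(reply_text):
        entry_id = match.group(1)
        if entry_id not in seen:
            seen.append(entry_id)
    return seen
```

An empty result corresponds to the "no `mem://` markers" case, which the contract treats as a normal outcome rather than an error.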
+
+ `/why` never makes a network call beyond the existing MCP transport.
+
+ ## Renderer surface (pure function)
+
+ ```ts
+ function renderExplain(
+   envelope: ExplainV1,
+   options: {
+     mode: "technical" | "plain",
+     glossary?: ExplainGlossaryV0,
+     locale?: string, // affects relative-date rendering only
+   }
+ ): { markdown: string, mode: string, ids: string[] }
+ ```
+
+ Implementation lives in `packages/core/src/workspace/explain/`. No I/O,
+ no clock dependency beyond the `now` injected for relative-date
+ formatting; testable with fixtures.
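The injected-`now` idea can be illustrated with a relative-date helper that never reads the system clock. This is a Python sketch of the principle only, not the TypeScript renderer; the function name is hypothetical, and the output is English-only, so the `locale` option is ignored here.

```python
from datetime import datetime, timezone


def relative_age(reviewed_at: datetime, now: datetime) -> str:
    """Describe how long ago `reviewed_at` was, relative to the injected `now`."""
    seconds = max(0, int((now - reviewed_at).total_seconds()))
    days, hours = seconds // 86400, seconds // 3600
    if days >= 1:
        return f"{days} day{'s' if days != 1 else ''} ago"
    if hours >= 1:
        return f"{hours} hour{'s' if hours != 1 else ''} ago"
    return "just now"
```

Because both timestamps are arguments, a fixture can pin `now` and compare the rendered string exactly.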
+
+ ## Coverage (Phase 6 Step 5)
+
+ Fixture-driven golden tests against `tests/golden/explain/` for ≥ 5
+ envelope shapes:
+
+ 1. High-trust validated entry — fresh, no contradictions.
+ 2. Low-trust quarantined entry — never promoted.
+ 3. Contradicted entry — 2 open contradictions, one resolved.
+ 4. Recently promoted entry — last `promotion_history[0]` < 24h old.
+ 5. Deprecated entry — superseded-by chain, decay factor 0.20.
+
+ Each fixture exercised in both `technical` and `plain` modes plus
+ one with a glossary override. ≥ 90 % branch on the renderer module.
+
+ ## Failure modes
+
+ - Missing envelope field → render placeholder "(unavailable)" in
+ plain mode; renderer never throws. Technical mode shows the raw
+ null with a warning marker.
+ - Unknown `schema:` in glossary → loader logs a warning and falls
+ back to defaults; never blocks rendering.
+ - `/why` finds no `mem://` markers → renders "This reply did not
+ cite any stored memory entries." No error.
+
+ ## Cross-references
+
+ - ADR: [`ADR-026`](../decisions/ADR-026-explain-mode-translation.md).
+ - Envelope contract: [`agent-memory-contract`](agent-memory-contract.md) (`explain-v1`).
+ - Workspace integration: [`daily-workspace`](daily-workspace.md) (right rail).
+ - Roles: [`role-experience`](role-experience.md) (`explain_default` field).
@@ -0,0 +1,88 @@
+ # Host-Agent Protocol Contract
+
+ > **Status** · v0 / inventory · 2026-05-24. The daily workspace shells out to
+ > a host agent for every model interaction; it never re-implements one. This
+ > contract names which surfaces each host agent exposes today, where the
+ > workspace can rely on them, and what the fallback is when a surface is
+ > missing. Governs ADR-023 / ADR-024 / ADR-025 — see
+ > [`ADR-022`](../decisions/ADR-022-daily-workspace-decomposition.md).
+
+ ## Required capabilities
+
+ The workspace v0 requires exactly two surfaces from a host agent:
+
+ 1. **`launch(prompt, skill?, cwd)`** — start a new conversation in the host
+ agent with `prompt` pre-filled and (optionally) `skill` pre-selected, in
+ the named working directory. Must be invocable from a non-interactive
+ shell. Return shape: success / failure; the conversation runs inside the
+ host's own UI from there.
+ 2. **`emit_trace(session) → ndjson`** — append-only, structured event stream
+ for the running conversation: model id, tool calls, citations,
+ explain-trace envelope (per
+ [`memory-explain-v1`](memory-explain-v1.md) when memory is involved).
+ Must be readable by tail-style consumers without polling the host's UI.
+
+ Both surfaces must be **stable** — documented by the vendor, covered by
+ their semver, not derived from unstable stdout parsing.
+
+ ## Today's inventory (2026-05-24)
+
+ | Host agent | `launch` surface | `emit_trace` surface | Effective tier |
+ |---|---|---|---|
+ | **Claude Code (CLI)** | `claude -p "<prompt>" --output-format json` (subprocess; documented). Slash commands resolved against `.claude/commands/`. | JSON envelope on stdout per turn; session id preserved; no live append stream. | **Tier 1** — only host with both surfaces today. |
+ | **OpenAI Codex CLI** | `codex exec --json` consumes stdin; documented. No slash-command surface (skills not first-class). | NDJSON event stream on stdout — `turn.completed`, `item.completed`, tool envelopes. | **Tier 1**, no skill surface — workspace must pre-render the prompt with skill context inlined. |
+ | **Gemini CLI** | `gemini --output-format json` consumes stdin; documented. | JSON envelope on stdout per turn. OAuth grant required once. | **Tier 1**, no skill surface (same as Codex). |
+ | **Augment (IDE)** | None documented. Hook trampolines exist (`scripts/hooks/augment-dispatcher.sh`) — post-event only, cannot initiate a conversation. | None — hook payloads cover events, not model output. | **Tier 3** — observe-only. |
+ | **Cursor (IDE)** | `cursor://` deep links open files / chats but cannot pre-fill a prompt with skill context from a non-Cursor process. Hooks (`.cursor/hooks.json`) are post-event. | None at the protocol layer. | **Tier 3** — observe-only. |
+ | **Cline (VS Code ext)** | None. Hooks (`~/Documents/Cline/Hooks/`) are post-event. | None at the protocol layer. | **Tier 3** — observe-only. |
+ | **Windsurf (Cascade)** | None. Hooks (`.windsurf/hooks.json`) are post-event. | None at the protocol layer. | **Tier 3** — observe-only. |
+
+ ## Tier definitions
+
+ - **Tier 1 — first-class.** Both `launch` and `emit_trace` are stable.
+ Workspace can build full features against the host.
+ - **Tier 2 — degraded.** One of the two surfaces exists; workspace can
+ partially drive but degrades a named feature (e.g. no inline citations).
+ *(No host agent occupies this tier today.)*
+ - **Tier 3 — observe-only.** Neither surface exists at the agent boundary.
+ The workspace falls back to (a) user-paste of a generated prompt, or (b)
+ inbox-file handoff (writes `~/.event4u/agent-config/workspace/inbox/<id>.md`,
+ user opens the host themselves). Hook trampolines remain available for
+ passive event recording but do not initiate conversations.
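The tier definitions reduce to counting which of the two required surfaces a host exposes in stable form. A minimal sketch, with a hypothetical function name:

```python
def effective_tier(has_launch: bool, has_emit_trace: bool) -> int:
    """Classify a host agent from its two stable surfaces.

    Both surfaces -> Tier 1, exactly one -> Tier 2, neither -> Tier 3.
    """
    surfaces = int(has_launch) + int(has_emit_trace)
    return {2: 1, 1: 2, 0: 3}[surfaces]
```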
+
+ ## v0 scope
+
+ - The workspace v0 ships against **Claude Code** as the single Tier-1 host.
+ Codex and Gemini are wired but secondary (no skill surface — see ADR-024).
+ - Tier-3 hosts get the **inbox handoff** fallback only: workspace writes the
+ rendered prompt + skill body into the inbox file and surfaces a one-line
+ copy-to-clipboard banner. No tighter integration is attempted in v0.
+ - The CLI shell-out is the **only** mechanism. No HTTP RPC, no MCP-driven
+ agent control, no shared SQLite — those are deferred to v1+ when at least
+ one Tier-3 host moves up.
+
+ ## Stability & change policy
+
+ - The vendor-published JSON envelope shapes are the contract. Workspace
+ parses by named keys, never by positional fields.
+ - A new host-agent CLI release that breaks the envelope **fails closed** —
+ the workspace surfaces a banner and degrades to Tier 3 (inbox handoff)
+ until this contract is updated.
+ - This file is the source of truth for host-agent tier. Adding a host or
+ promoting a tier requires (a) a vendor-link in the inventory row,
+ (b) at least one integration test under
+ `tests/integration/host-agent-protocol/`.
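Parsing by named keys and failing closed could look roughly like this. The function name and the key names used in the usage example are placeholders, not a vendor envelope definition; the real required keys would come from each vendor's documented shape.

```python
import json


def parse_turn(stdout_line: str, required_keys: tuple):
    """Return (tier, payload); tier 3 means the envelope was not usable.

    Fields are looked up by name only. Unparseable output or a missing
    required key demotes the host instead of guessing at the shape.
    """
    try:
        payload = json.loads(stdout_line)
    except json.JSONDecodeError:
        return 3, None
    if not isinstance(payload, dict) or any(k not in payload for k in required_keys):
        return 3, None
    return 1, {k: payload[k] for k in required_keys}


# Placeholder key names for illustration only.
tier, turn = parse_turn('{"session_id": "s1", "result": "ok"}', ("session_id", "result"))
```

Unknown extra keys are ignored, so a vendor adding fields does not trigger the demotion; only a removed or renamed required key does.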
+
+ ## Cross-references
+
+ - ADR: [`ADR-022`](../decisions/ADR-022-daily-workspace-decomposition.md) ·
+ [`ADR-023`](../decisions/ADR-023-host-agent-protocol.md) ·
+ [`ADR-024`](../decisions/ADR-024-workspace-v0-feature-floor.md) ·
+ [`ADR-025`](../decisions/ADR-025-workspace-chrome.md).
+ - Skill: [`ai-council`](../../.agent-src/skills/ai-council/SKILL.md) — uses
+ the same CLI subprocess shape (claude / codex / gemini) for council
+ members; the workspace inherits the proven invocation paths.
+ - Hooks: [`hook-architecture-v1`](hook-architecture-v1.md) — covers the
+ post-event surface for all hosts including Tier-3.
+ - Daily workspace surface: [`daily-workspace`](daily-workspace.md) — UI
+ contract that consumes this protocol.
@@ -0,0 +1,148 @@
+ # Local Analytics Contract
+
+ > **Status** · v0 / design · 2026-05-24. Phase 7 of the
+ > employee-product workstream.
+ > **Local-only.** Does NOT lift the Hard-Floor item from 3.1.0 — no
+ > network egress, no remote Worker, no POST. Inertia of the prior
+ > telemetry roadmap is preserved.
+
+ ## Position vs the 3.1.0 telemetry SDK
+
+ 3.1.0 shipped the telemetry SDK + Cloudflare Worker as **source-only**;
+ the kill-switch defaults off and nothing is deployed. Phase 7 builds
+ a **separate local-only** analytics path:
+
+ | Surface | Lives | Egress | Default |
+ |---|---|---|---|
+ | 3.1.0 remote telemetry | Worker (undeployed) | ✗ inert | off, Hard-Floor |
+ | **Phase 7 local analytics** | `~/.event4u/agent-config/workspace/analytics/` | ✗ never | **on** for local-only |
+
+ The two surfaces share **event vocabulary** where it overlaps; they
+ never share a transport. Local analytics writes to disk; remote
+ telemetry remains undeployed.
+
+ ## Event vocabulary
+
+ Re-uses the `install_stage` schema (3.1.0) where applicable, and
+ adds the `workspace_event` schema for launcher / document / explain
+ interactions:
+
+ | schema | source | example fields |
+ |---|---|---|
+ | `install_stage/v1` | installer (3.1.0) | `stage`, `outcome`, `duration_ms`, `package_version` |
+ | `workspace_event/v0` | Phase 4–6 workspace | `event`, `role`, `task`, `host_tier`, `duration_ms` |
+
+ `workspace_event/v0` event names (closed set):
+
+ - `launcher.opened` · `launcher.task_picked` · `launcher.task_launched`
+ - `session.started` · `session.host_turn` · `session.completed`
+ - `document.created` · `document.edited` · `document.exported`
+ - `explain.opened` · `explain.mode_toggled` · `why.invoked`
+ - `knowledge.queried` · `knowledge.source_clicked`
+
+ No prompt bodies. No response bodies. No PII. Only counters, role
+ labels, task slugs (already public), and durations.
+
+ ## Storage
+
+ ```
+ ~/.event4u/agent-config/workspace/analytics/
+ ├── events.jsonl ← append-only event log
+ └── retention.lock ← prune-pass mutex
+ ```
+
+ One JSON record per line:
+
+ ```json
+ {
+   "ts": "2026-05-24T12:08:00Z",
+   "schema": "workspace_event/v0",
+   "event": "launcher.task_launched",
+   "data": { "role": "galabau", "task": "angebot-erstellen",
+             "host_tier": 1, "duration_ms": 420 }
+ }
+ ```
+
+ Rolling retention: **90 days local**. A prune pass on workspace
+ launch trims records older than 90 days; the lockfile prevents
+ concurrent prune (cheap fs lock, not a real mutex).
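The retention pass can be sketched as a filter over the event log. This is an illustration under stated assumptions, not the package's `prune`: the function signature is hypothetical, `retention.lock` handling is omitted, and lines that cannot be parsed are kept rather than silently discarded.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path

RETENTION = timedelta(days=90)


def prune(events_file: Path, now: datetime) -> int:
    """Drop records older than 90 days; return how many were dropped."""
    if not events_file.exists():
        return 0
    cutoff = now - RETENTION
    kept, dropped = [], 0
    for line in events_file.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        try:
            ts = datetime.fromisoformat(json.loads(line)["ts"].replace("Z", "+00:00"))
            is_old = ts < cutoff
        except (ValueError, KeyError, TypeError, AttributeError):
            kept.append(line)  # unreadable line: keep it, do not guess its age
            continue
        if is_old:
            dropped += 1
        else:
            kept.append(line)
    # Write the survivors to a sibling file, then swap it into place.
    tmp = events_file.with_suffix(".tmp")
    tmp.write_text("".join(item + "\n" for item in kept), encoding="utf-8")
    tmp.replace(events_file)
    return dropped
```

A record stamped exactly 90 days before `now` is kept; only strictly older records are dropped.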
+
+ ## Opt-out
+
+ Single env var, single config flag, both checked:
+
+ | Surface | Default | Override |
+ |---|---|---|
+ | Env | `AGENT_CONFIG_NO_LOCAL_ANALYTICS` unset | set to any non-empty value → no writes |
+ | Config | `.agent-settings.yml` → `analytics.local: on` | set to `off` → no writes |
+
+ Either set to off → emitter short-circuits before opening the file.
+ No retention pruning either; the existing log stays until the user
+ removes it.
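The two opt-out checks can be sketched as one predicate. The function name and the shape of the parsed settings mapping are assumptions. One detail worth hedging: a YAML 1.1 parser such as PyYAML reads a bare `off` as boolean `False`, so the sketch accepts both the string and the boolean form.

```python
def local_analytics_enabled(environ: dict, settings: dict) -> bool:
    """Return False when either the env var or the config flag opts out."""
    if environ.get("AGENT_CONFIG_NO_LOCAL_ANALYTICS", ""):
        return False  # any non-empty value disables writes
    local = settings.get("analytics", {}).get("local", "on")
    # Accept the literal string as well as the boolean a YAML 1.1 parser yields.
    return local not in ("off", False)
```

A caller would pass `os.environ` and the parsed `.agent-settings.yml`, and skip both emitting and pruning when the result is `False`.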
+
+ ## Emitter API
+
+ ```python
+ # packages/core/src/workspace/analytics/emitter.py
+ class LocalAnalytics:
+     def emit(self, event: str, data: dict) -> None: ...
+     def query(self, since: datetime, event: str | None = None) -> list[Event]: ...
+     def prune(self) -> int: ... # returns number of records dropped
+ ```
+
+ The emitter is a synchronous append-line write. Never blocks the UI
+ thread above 5 ms (90th percentile); no async / queue / batch
+ machinery in v0.
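A synchronous append-line write might be sketched as below. This is a free function rather than the `LocalAnalytics.emit` method, its signature is hypothetical, and the `0o600` file mode is an assumption. It opens the file with `O_APPEND` and issues one write per record, and it warns on stderr instead of raising when the write fails.

```python
import json
import os
import sys
from datetime import datetime, timezone
from pathlib import Path


def emit(events_file: Path, event: str, data: dict, now: datetime) -> None:
    """Append one analytics record; on any OS error, warn and drop it."""
    record = {
        "ts": now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "schema": "workspace_event/v0",
        "event": event,
        "data": data,
    }
    line = (json.dumps(record, ensure_ascii=False) + "\n").encode("utf-8")
    try:
        events_file.parent.mkdir(parents=True, exist_ok=True)
        fd = os.open(events_file, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        try:
            os.write(fd, line)  # one write call per record
        finally:
            os.close(fd)
    except OSError as exc:
        print(f"local analytics: dropped {event!r}: {exc}", file=sys.stderr)
```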
+
+ ## `/analytics:show` command
+
+ Local-only query. Renders to ASCII / Markdown table; never POSTs.
+
+ ```
+ $ npx @event4u/agent-config analytics:show --window 30d
+
+ Top prompts (last 30 days)
+   galabau · angebot-erstellen           47
+   content-creator · video-from-script   31
+   consultant · meeting-memo             24
+
+ Launcher → completion rate per role
+   galabau           87% (47 launched · 41 completed)
+   content-creator   71% (31 launched · 22 completed)
+   consultant        92% (24 launched · 22 completed)
+
+ Average session length: 4m 12s
+ Knowledge sources clicked: 18 (handbuch.pdf · offer-template.md · …)
+ ```
+
+ Flags: `--window <30d|7d|24h>` · `--event <name>` · `--role <slug>` ·
+ `--format <markdown|csv|json>`. No `--upload`, no `--share`; the
+ command can only read and render.
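The "launcher → completion rate per role" figures in the sample output are consistent with a rounded ratio of completed to launched sessions. The sketch below assumes that `session.completed` records carry the same `role` field as `launcher.task_launched`, which the contract does not state explicitly; the function name is hypothetical.

```python
from collections import Counter


def completion_rates(events: list) -> dict:
    """Return {role: percent completed} from parsed workspace_event records."""
    launched, completed = Counter(), Counter()
    for record in events:
        role = record.get("data", {}).get("role")
        if record.get("event") == "launcher.task_launched":
            launched[role] += 1
        elif record.get("event") == "session.completed":
            completed[role] += 1
    # Roles with no launches are left out to avoid dividing by zero.
    return {role: round(100 * completed[role] / n) for role, n in launched.items() if n}
```

For example, 41 completed out of 47 launched rounds to 87, matching the `galabau` row above.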
+
+ ## Coverage (Phase 7 Step 4)
+
+ - pytest against fixture JSONL stores (`tests/fixtures/local-analytics/`):
+ emitter writes, query filters by window + event + role, prune
+ drops correctly at the 90-day boundary.
+ - Env-flag short-circuit: emitter is a no-op when
+ `AGENT_CONFIG_NO_LOCAL_ANALYTICS=1`; no file is created.
+ - Concurrency: two emitters writing the same file produce
+ well-formed lines (POSIX `O_APPEND` semantics — test on Linux,
+ document Windows caveat).
+
+ ## Failure modes
+
+ - Disk full → emitter logs warning to stderr, drops the event, never
+ raises. UI thread is unaffected.
+ - Malformed line in `events.jsonl` → query skips the line, increments
+ a `malformed_lines` counter exposed via `/analytics:show --health`.
+ - Schema bump (`workspace_event/v0` → `v1`) → emitter writes the new
+ schema; query reads both. Migration is forward-compatible.
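The read side of these failure modes can be sketched as a tolerant line reader: malformed lines are skipped and counted, and records are accepted on the presence of `ts` and `schema` rather than on one specific schema version. The function name and the acceptance rule are assumptions made for illustration.

```python
import json


def read_events(lines):
    """Return (events, malformed_lines) without raising on bad input."""
    events, malformed = [], 0
    for line in lines:
        if not line.strip():
            continue  # blank lines are neither events nor errors
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            malformed += 1
            continue
        # Accept any schema version as long as the envelope keys are present.
        if isinstance(record, dict) and "ts" in record and "schema" in record:
            events.append(record)
        else:
            malformed += 1
    return events, malformed
```

The second return value is the kind of counter a `--health` view could report.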
+
+ ## Cross-references
+
+ - Phase 4 shell that produces the events: [`daily-workspace`](daily-workspace.md).
+ - Phase 5 document events: [`workspace-documents`](workspace-documents.md).
+ - Phase 6 explain events: [`explain-modes`](explain-modes.md).
+ - 3.1.0 telemetry inertia: archived `road-to-product-adoption.md` Phase 4.
+ - Walkthrough doc (Phase 7 Step 5): `docs/guides/local-analytics.md` (deferred).