@event4u/agent-config 2.18.0 → 2.20.0

Files changed (108)
  1. package/.agent-src/commands/agent-status.md +29 -0
  2. package/.agent-src/commands/onboard.md +221 -81
  3. package/.agent-src/commands/refine-ticket.md +3 -0
  4. package/.agent-src/packs/README.md +49 -0
  5. package/.agent-src/packs/agency-delivery.yml +63 -0
  6. package/.agent-src/packs/content-engine.yml +53 -0
  7. package/.agent-src/packs/founder-mvp.yml +51 -0
  8. package/.agent-src/personas/README.md +8 -0
  9. package/.agent-src/presets/README.md +26 -0
  10. package/.agent-src/presets/balanced.yml +34 -0
  11. package/.agent-src/presets/fast.yml +31 -0
  12. package/.agent-src/presets/strict.yml +38 -0
  13. package/.agent-src/profiles/README.md +29 -0
  14. package/.agent-src/profiles/agency.yml +27 -0
  15. package/.agent-src/profiles/content_creator.yml +25 -0
  16. package/.agent-src/profiles/developer.yml +26 -0
  17. package/.agent-src/profiles/finance.yml +24 -0
  18. package/.agent-src/profiles/founder.yml +25 -0
  19. package/.agent-src/profiles/ops.yml +25 -0
  20. package/.agent-src/rules/no-cheap-questions.md +25 -17
  21. package/.agent-src/skills/adr-create/SKILL.md +78 -68
  22. package/.agent-src/skills/refine-ticket/SKILL.md +3 -0
  23. package/.agent-src/skills/subagent-orchestration/SKILL.md +33 -0
  24. package/.agent-src/templates/agents/agent-project-settings.example.yml +1 -1
  25. package/.agent-src/templates/skill-archive-note.md +101 -0
  26. package/.agent-src/user-types/README.md +124 -0
  27. package/.agent-src/user-types/_template/user-type.md +95 -0
  28. package/.agent-src/user-types/galabau-field-crew.md +100 -0
  29. package/.agent-src/user-types/metalworking-shop.md +105 -0
  30. package/.agent-src/user-types/truck-driver.md +113 -0
  31. package/.claude-plugin/marketplace.json +1 -1
  32. package/CHANGELOG.md +91 -30
  33. package/README.md +68 -72
  34. package/config/agent-settings.template.yml +22 -0
  35. package/docs/adrs/caveman/0001-default-off-until-bench.md +93 -0
  36. package/docs/adrs/caveman/README.md +9 -0
  37. package/docs/adrs/cost/0001-hard-stop-hook.md +114 -0
  38. package/docs/adrs/cost/README.md +9 -0
  39. package/docs/adrs/memory/0001-consumer-side-snapshot.md +111 -0
  40. package/docs/adrs/memory/README.md +9 -0
  41. package/docs/adrs/router/0001-three-tier-routing.md +119 -0
  42. package/docs/adrs/router/README.md +9 -0
  43. package/docs/adrs/schema/0001-json-schema-frontmatter.md +102 -0
  44. package/docs/adrs/schema/README.md +9 -0
  45. package/docs/adrs/smoke/0001-per-tier-smoke-scripts.md +99 -0
  46. package/docs/adrs/smoke/README.md +9 -0
  47. package/docs/architecture/current-onboard-baseline.md +126 -0
  48. package/docs/architecture/current-safety-behavior.md +137 -0
  49. package/docs/archive/CHANGELOG-pre-2.16.0.md +48 -0
  50. package/docs/contracts/adr-layout.md +108 -0
  51. package/docs/contracts/adr-mcp-runtime.md +128 -0
  52. package/docs/contracts/adr-user-types-axis.md +127 -0
  53. package/docs/contracts/benchmark-corpus-spec.md +97 -0
  54. package/docs/contracts/benchmark-report-schema.md +111 -0
  55. package/docs/contracts/command-clusters.md +1 -0
  56. package/docs/contracts/command-taxonomy.md +137 -0
  57. package/docs/contracts/compression-default-kill-criterion.md +69 -0
  58. package/docs/contracts/config-presets.md +144 -0
  59. package/docs/contracts/cost-dashboard.md +143 -0
  60. package/docs/contracts/cost-enforcement.md +134 -0
  61. package/docs/contracts/file-ownership-matrix.json +0 -7
  62. package/docs/contracts/mcp-tool-inventory.md +53 -0
  63. package/docs/contracts/measurement-baseline.md +102 -0
  64. package/docs/contracts/namespace.md +125 -0
  65. package/docs/contracts/profile-system.md +142 -0
  66. package/docs/contracts/safety-model.md +129 -0
  67. package/docs/contracts/smoke-contracts.md +144 -0
  68. package/docs/contracts/user-type-schema.md +146 -0
  69. package/docs/contracts/workflow-packs.md +121 -0
  70. package/docs/decisions/ADR-010-profile-pack-preset-boundary.md +132 -0
  71. package/docs/decisions/INDEX.md +1 -0
  72. package/docs/featured-commands.md +27 -0
  73. package/docs/parity/bench-ruflo.json +58 -0
  74. package/docs/parity/bench.json +41 -0
  75. package/docs/parity/ruflo.md +46 -0
  76. package/docs/profiles.md +91 -0
  77. package/docs/recruits/_template.md +81 -0
  78. package/package.json +1 -1
  79. package/scripts/_cli/cmd_explain.py +250 -0
  80. package/scripts/_lib/bench_cost.py +138 -0
  81. package/scripts/_lib/bench_quality.py +118 -0
  82. package/scripts/_lib/bench_report.py +150 -0
  83. package/scripts/agent-config +13 -0
  84. package/scripts/audit_adr_coverage.py +175 -0
  85. package/scripts/audit_mcp_tools.py +146 -0
  86. package/scripts/bench_baseline_ready.py +108 -0
  87. package/scripts/bench_drift_check.py +151 -0
  88. package/scripts/bench_per_tool.py +216 -0
  89. package/scripts/bench_run.py +155 -0
  90. package/scripts/compress.py +48 -2
  91. package/scripts/config/__init__.py +9 -0
  92. package/scripts/config/presets.py +206 -0
  93. package/scripts/config/profiles.py +173 -0
  94. package/scripts/cost/budget.mjs +73 -12
  95. package/scripts/cost/preflight.mjs +89 -0
  96. package/scripts/lint_archived_skills.py +143 -0
  97. package/scripts/lint_bench_corpus.py +161 -0
  98. package/scripts/lint_namespace.py +135 -0
  99. package/scripts/schemas/user-type.schema.json +35 -0
  100. package/scripts/skill_linter.py +139 -4
  101. package/scripts/skill_overlap.py +204 -0
  102. package/scripts/skill_tools/audit_user_type_coverage.py +148 -0
  103. package/scripts/skill_usage_collect.py +191 -0
  104. package/scripts/skill_usage_report.py +162 -0
  105. package/scripts/smoke/kernel.sh +101 -0
  106. package/scripts/smoke/router.sh +129 -0
  107. package/scripts/smoke/schema.sh +71 -0
  108. package/scripts/smoke/skills.sh +101 -0
@@ -0,0 +1,125 @@
1
+ ---
2
+ stability: stable
3
+ ---
4
+
5
+ # Namespace contract — skills, rules, commands, personas
6
+
7
+ > Every artefact name is a **stable identifier**: routed to from
8
+ > `router.json`, cited from skills, surfaced in `/help`, embedded in
9
+ > command paths, and back-referenced in test fixtures. Drift breaks
10
+ > all five surfaces silently.
11
+ >
12
+ > **Source:** Step-11 Phase 5 Step 1
13
+ > (`step-11-ruflo-parity.md`).
14
+ > **Enforcer:** [`scripts/lint_namespace.py`](../../scripts/lint_namespace.py),
15
+ > wired into `task lint-skills`.
16
+
17
+ ## 1. Shape
18
+
19
+ ```
20
+ <stem>-<intent> kebab-case, ASCII, lowercase
21
+ ```
22
+
23
+ | Component | Rule |
24
+ |---|---|
25
+ | Charset | `[a-z0-9-]+` only |
26
+ | Separator | single `-` between tokens; never `_`, `.`, or camelCase |
27
+ | Length | skills: 3 ≤ name ≤ 64 · rules / commands / personas: 2 ≤ name ≤ 64 (two-letter slot reserved for intentional acronyms — `pr`, `ci`, `qa`, `me`) |
28
+ | First char | `[a-z]` (digits and `-` forbidden at start) |
29
+ | Last char | `[a-z0-9]` (trailing `-` forbidden) |
30
+ | Run | no consecutive `--` |
31
+
32
+ The `<stem>` carries the **subject** (`commit`, `eloquent`,
33
+ `livewire`); the `<intent>` (optional) carries the **verb / lens**
34
+ (`-writing`, `-architect`, `-routing`). Single-token names are
35
+ permitted when the stem already encodes both (`commit`, `eloquent`,
36
+ `docker`).
37
+
38
+ ## 2. Reserved names — forbidden as artefact names
39
+
40
+ | Name | Reason |
41
+ |---|---|
42
+ | `pattern` | Reserved for trigger-pattern fixtures (see `tests/fixtures/triggers/`). |
43
+ | `claude-memories` | Reserved for the `~/.claude/CLAUDE.md` shape — host-agent state, not a package artefact. |
44
+ | `default` | Ambiguous with profile / mode defaults; collides with `.agent-settings.yml` keys. |
45
+ | `index` | Reserved for auto-generated INDEX.md files. |
46
+ | `router` | Reserved for `router.json` and the router contract. |
47
+
48
+ Reserved names apply at the **top level** of each artefact type. A
49
+ sub-verb under a namespaced group (e.g. `council/default.md` →
50
+ `/council:default`) is **not** a top-level identifier — the group
51
+ prefix disambiguates it, and reserved-name enforcement is skipped
52
+ for sub-verbs by the linter. A future artefact `pattern-foo` at the
53
+ top level is fine; bare `pattern` is not.
54
+
55
+ `README.md` and `INDEX.md` are documentation, not artefacts, and are
56
+ skipped by the linter.
57
+
58
+ ## 3. Per-type conventions
59
+
60
+ | Type | Source path | Naming nuance |
61
+ |---|---|---|
62
+ | Skill | `.agent-src.uncompressed/skills/<name>/SKILL.md` | Directory name == frontmatter `name`. |
63
+ | Rule | `.agent-src.uncompressed/rules/<name>.md` | Filename stem == frontmatter `id` (when present). |
64
+ | Command | `.agent-src.uncompressed/commands/<name>.md` or `<group>/<verb>.md` | Slash-command invocation `<name>` or `<group>:<verb>`. |
65
+ | Persona | `.agent-src.uncompressed/personas/<name>.md` | Cited from skill frontmatter `personas:` list. |
66
+
67
+ Sub-namespacing (`commit/in-chunks.md` → `/commit:in-chunks`) uses
68
+ the same charset rules per segment; the joining colon is implicit.
69
+
70
+ ## 4. Linter — `scripts/lint_namespace.py`
71
+
72
+ Walks the four source roots above, asserts each artefact name:
73
+
74
+ 1. Matches the regex `^[a-z][a-z0-9]*(-[a-z0-9]+)*$`.
75
+ 2. Length 3 ≤ name ≤ 64.
76
+ 3. Not in the reserved-names list.
77
+ 4. Skill: directory name matches frontmatter `name`.
78
+
79
+ Exit codes:
80
+
81
+ | Exit | Meaning |
82
+ |---|---|
83
+ | `0` | All names valid. |
84
+ | `1` | At least one name fails a rule. |
85
+ | `2` | Linter crashed (filesystem error, malformed frontmatter). |
86
+
87
+ Diagnostic format: one issue per line — `<path>: <rule> — <detail>`.
88
+
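The four checks in § 4 can be sketched in a few lines of Python. This is a minimal illustration, not the shipped `scripts/lint_namespace.py`: the function name `check_name`, the rule labels (`charset`, `length`, `reserved`), and the `top_level` flag are assumptions made for the example. The regex, the per-type length bounds, and the reserved list are copied from this contract.

```python
import re

# Regex and reserved names as stated in the contract above.
NAME_RE = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)*$")
RESERVED = {"pattern", "claude-memories", "default", "index", "router"}
MAX_LEN = 64
# Skills need at least 3 characters; the other types allow two-letter acronyms.
MIN_LEN = {"skill": 3, "rule": 2, "command": 2, "persona": 2}


def check_name(name: str, kind: str = "skill", top_level: bool = True) -> list[str]:
    """Return the labels of violated rules; an empty list means the name is valid.

    The labels are hypothetical, not the linter's real diagnostics.
    """
    issues = []
    if not NAME_RE.fullmatch(name):
        issues.append("charset")
    if not MIN_LEN[kind] <= len(name) <= MAX_LEN:
        issues.append("length")
    # Reserved names only apply at the top level, not to sub-verbs in a group.
    if top_level and name in RESERVED:
        issues.append("reserved")
    return issues
```

For example, `check_name("pr", "command")` passes while `check_name("pr", "skill")` reports a length violation, which mirrors the two-letter acronym slot described in § 1.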
89
+ ## 5. Adding a new artefact
90
+
91
+ Pick the name; verify locally:
92
+
93
+ ```bash
94
+ python3 scripts/lint_namespace.py --name <candidate>
95
+ # or full run:
96
+ python3 scripts/lint_namespace.py
97
+ ```
98
+
99
+ If the candidate fails, the linter prints the rule it violated.
100
+ **Renames after release are expensive** — touch router.json, every
101
+ skill citing the old name, the bench corpus, and consumer settings.
102
+ Pay the naming cost once, upfront.
103
+
104
+ ## 6. Relationship to the frontmatter contract
105
+
106
+ The **shape** lives here. The **frontmatter keys** that carry the
107
+ name (`name:` in skills, `id:` in rules) live in
108
+ [`frontmatter-contract.md`](../../agents/docs/frontmatter-contract.md).
109
+ Both contracts share the regex; this file is the source of truth for
110
+ the regex string.
111
+
112
+ ## 7. Why this exists
113
+
114
+ `router.json` resolves `<kind>:<id>` strings at session start. Any
115
+ artefact rename breaks every routing entry pointing at the old name
116
+ without compile-time error. The linter catches the rename at the PR
117
+ boundary, not at runtime in a consumer.
118
+
119
+ ## 8. Out of scope
120
+
121
+ - File-system case sensitivity (we rely on lowercase-only names).
122
+ - Cross-tool aliases (Augment / Claude / Cursor all consume the same
123
+ name — projection is by content, not by alias).
124
+ - Versioning suffixes (`-v2`, `-legacy`). Use `status: superseded`
125
+ in frontmatter instead; never rename in place.
@@ -0,0 +1,142 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-14
4
+ ---
5
+
6
+ # Profile System — Contract
7
+
8
+ > **Status:** beta · **Owner:** package maintainer · **Last reviewed:** 2026-05-16
9
+ >
10
+ > Schema and semantics for the **Profile** axis introduced in step-15
11
+ > Phase 1 item 1. Profile answers *who is the user?* — audience
12
+ > taxonomy that selects the default skill/command surface, README
13
+ > entry-paragraph, and persona pre-selection. Boundary against
14
+ > `preset.id`, `pack.id`, and `cost_profile`:
15
+ > [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md).
16
+
17
+ ## Decision
18
+
19
+ A **profile** declares the user's audience identity. Six seed profiles
20
+ ship; users can declare their own under
21
+ `.agent-src.uncompressed/profiles/<id>.yml`.
22
+
23
+ | `profile.id` | Audience | README entry-paragraph | Default `preset.id` |
24
+ |---|---|---|---|
25
+ | `founder` | Solo / early-stage founder; wears every hat | "Ship the company, not the codebase" | `fast` |
26
+ | `developer` | IC engineer; primary day-to-day user today | "Pair with a senior reviewer that never sleeps" | `balanced` |
27
+ | `content_creator` | Writers, ghostwriters, marketers | "Your voice, my hands" | `balanced` |
28
+ | `agency` | Multi-client delivery shop | "Same playbook across every client repo" | `strict` |
29
+ | `finance` | CFO / fractional finance / FP&A | "Forecasts and memos with the receipts attached" | `strict` |
30
+ | `ops` | RevOps, support, SRE-adjacent | "Procedures that get followed, not skipped" | `strict` |
31
+
32
+ The seed set is **fixed for v2.x**. Adding a seventh profile requires
33
+ an ADR — the contract surface that ships in the wizard
34
+ (`/onboard` role-selection) treats this set as exhaustive.
35
+
36
+ ## Profile shape
37
+
38
+ ```yaml
39
+ profile:
40
+ id: developer
41
+ audience:
42
+ label: "IC engineer"
43
+ readme_anchor: "developer" # selects README first-screen block
44
+ defaults:
45
+ preset_id: balanced # may be overridden by .agent-settings.yml
46
+ personas: [reviewer, security] # pre-selected persona ids
47
+ skills_hint: [developer-like-execution, verify-before-complete, minimal-safe-diff]
48
+ surface:
49
+ commands_hint: [work, implement-ticket, review-changes, fix]
50
+ docs_first_pointer: "docs/getting-started-by-role.md#developer"
51
+ ```
52
+
53
+ Per [ADR-010](../decisions/ADR-010-profile-pack-preset-boundary.md), a
54
+ profile **MAY** set `defaults.preset_id` but **MAY NOT** set any
55
+ preset-owned knob directly. The lint task (`task lint-config-schema`)
56
+ enforces this.
57
+
58
+ ## Loader contract
59
+
60
+ The Phase 1 loader lives at `scripts/config/profiles.py`. Resolution
61
+ chain (last writer wins):
62
+
63
+ 1. `pack.profile_id` (if pack active) → `profile.id`.
64
+ 2. `.agent-settings.yml` top-level `profile:` block → `profile.id`
65
+ and any user overrides for `audience` / `defaults` / `surface`.
66
+ 3. Environment variable `AGENT_CONFIG_PROFILE_ID` → `profile.id`.
67
+ 4. Runtime CLI flag `--profile=<id>` → `profile.id`, single session.
68
+
69
+ If no profile resolves, the loader **does not pick a default
70
+ silently** — it falls back to `developer` only when
71
+ `.agent-settings.yml` is missing entirely (fresh install before
72
+ `/onboard`). With a settings file present but no `profile:` block,
73
+ the loader raises a structured warning pointing to `/onboard`.
74
+
75
+ ```
76
+ RATIONALE: a silent default would hide the "I never picked an audience"
77
+ state from the wizard, breaking the council v3 observation that audience
78
+ choice must be a deliberate act of the user, not an agent inference.
79
+ ```
80
+
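The resolution chain and the fallback rule above can be sketched as follows. This is an illustration under stated assumptions, not the code in `scripts/config/profiles.py`: the function name `resolve_profile_id`, its parameters, and the `"unresolved"` marker are hypothetical. A `settings` value of `None` stands for a missing `.agent-settings.yml`, and a returned id of `None` is the case where the caller would raise the structured warning pointing to `/onboard`.

```python
from typing import Optional


def resolve_profile_id(
    pack_profile_id: Optional[str],
    settings: Optional[dict],
    env: dict,
    cli_profile: Optional[str],
) -> tuple[Optional[str], str]:
    """Return (profile_id, source); later writers override earlier ones."""
    if settings is None:
        # Fresh install: no settings file yet, so `developer` is the fallback.
        resolved = ("developer", "default")
    else:
        # Settings file exists; never pick a default silently.
        resolved = (None, "unresolved")

    if pack_profile_id:
        resolved = (pack_profile_id, "pack")
    settings_id = ((settings or {}).get("profile") or {}).get("id")
    if settings_id:
        resolved = (settings_id, "user-settings")
    if env.get("AGENT_CONFIG_PROFILE_ID"):
        resolved = (env["AGENT_CONFIG_PROFILE_ID"], "env")
    if cli_profile:
        resolved = (cli_profile, "runtime")
    return resolved
```

Applying the writers in order and overwriting keeps "last writer wins" explicit, and the returned source string lines up with the `source` values listed under "Resolution outcome".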
81
+ ## Resolution outcome
82
+
83
+ After the loader runs, the session has:
84
+
85
+ ```python
86
+ {
87
+ "id": "developer",
88
+ "audience": {"label": "IC engineer", "readme_anchor": "developer"},
89
+ "preset_id": "balanced",
90
+ "personas": ["reviewer", "security"],
91
+ "skills_hint": ["developer-like-execution", ...],
92
+ "commands_hint": ["work", "implement-ticket", ...],
93
+ "source": "user-settings | env | runtime | pack | default",
94
+ }
95
+ ```
96
+
97
+ The `source` field is mandatory and feeds the
98
+ `/agent-config explain`
99
+ command (Phase 1 item 3).
100
+
101
+ ## User-defined profiles
102
+
103
+ A consumer project MAY ship a custom profile under
104
+ `.agent-src.uncompressed/profiles/<id>.yml`. Constraints:
105
+
106
+ - `id` MUST be unique across seed + user-defined profiles.
107
+ - Shape MUST match the seed contract above (audience / defaults / surface).
108
+ - `defaults.preset_id` MUST reference an existing preset
109
+ ([`config-presets.md`](config-presets.md)).
110
+ - The lint task hard-fails on schema violations.
111
+
112
+ User-defined profiles do **not** require an ADR — they are project-local.
113
+ Only changes to the **seed set** require an ADR.
114
+
115
+ ## Drift detection
116
+
117
+ `task lint-config-schema` (added in Phase 1) hard-fails when:
118
+
119
+ - A profile YAML names a preset-owned knob (cost cap, autonomy,
120
+ confidence, risk).
121
+ - A profile YAML references a non-existent `preset_id`.
122
+ - The seed-profile count diverges from this contract's table.
123
+ - `defaults.personas` references a persona id that does not exist
124
+ under `.agent-src.uncompressed/personas/`.
125
+
126
+ ## Non-goals
127
+
128
+ - This contract does **not** define preset knobs. See
129
+ [`config-presets.md`](config-presets.md).
130
+ - It does **not** define packs. See `workflow-packs.md` (Phase 2 item 7).
131
+ - It does **not** override `cost_profile`. The rule-tier loader keeps
132
+ its independent axis per
133
+ [`cost-profile-defaults.md`](cost-profile-defaults.md).
134
+ - It does **not** ship a UI. Profile selection happens in `/onboard`
135
+ (step-15 Phase 1 item 2).
136
+
137
+ ## See also
138
+
139
+ - [`ADR-010`](../decisions/ADR-010-profile-pack-preset-boundary.md) — axis boundary.
140
+ - [`config-presets.md`](config-presets.md) — preset knobs.
141
+ - [`cost-profile-defaults.md`](cost-profile-defaults.md) — rule-tier axis (orthogonal).
142
+ - `step-15-product-refinement` — Phase 1 item 1.
@@ -0,0 +1,129 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-12
4
+ ---
5
+
6
+ # Universal safety model
7
+
8
+ > **Status:** beta — first draft 2026-05-16 (Phase 2 Item 9 of
9
+ > `step-15-product-refinement`).
10
+ >
11
+ > **Baseline:** [`docs/architecture/current-safety-behavior.md`](../architecture/current-safety-behavior.md)
12
+ > documents the pre-step-15 surface this contract replaces.
13
+
14
+ A **per-profile, per-domain safety policy** declared as a single
15
+ machine-readable table. Replaces the legacy "one autonomy switch for
16
+ everything" model documented in the baseline. Does **not** weaken the
17
+ four non-overridable floors — those keep their universal scope and
18
+ are referenced by id, not redeclared here.
19
+
20
+ ## The Iron Floor
21
+
22
+ ```
23
+ NO POLICY ENTRY MAY WIDEN AN EXISTING FLOOR.
24
+ ANY ENTRY THAT WOULD ALLOW A FLOOR-BLOCKED ACTION IS REJECTED AT LINT.
25
+ ```
26
+
27
+ The four floors are listed in
28
+ [`current-safety-behavior § The four non-overridable floors`](../architecture/current-safety-behavior.md#the-four-non-overridable-floors):
29
+ `non-destructive-by-default`, `scope-control § git-ops`,
30
+ `commit-policy`, `security-sensitive-stop`. Floor membership is
31
+ maintained in [`kernel-membership`](kernel-membership.md); a domain
32
+ listed there cannot be set to `allow` here.
33
+
34
+ ## Schema
35
+
36
+ ```yaml
37
+ # .agent-src.uncompressed/profiles/<id>.yml — new top-level key
38
+ profile:
39
+ id: <profile.id>
40
+ # ... existing fields ...
41
+ safety:
42
+ domains:
43
+ <domain-id>:
44
+ policy: <deny | ask | allow>
45
+ rationale: "<= 280 chars — why this policy for this profile>"
46
+ ```
47
+
48
+ ### Domain registry
49
+
50
+ Domains are declared in this contract, **not** invented per profile.
51
+ A profile may only reference an id from the table below.
52
+
53
+ | Domain id | What it gates | Floor reference |
54
+ |---|---|---|
55
+ | `prod_data` | Reads / writes against production data stores. | `non-destructive-by-default` |
56
+ | `prod_infra` | Terraform / k8s / cloud config touching prod. | `non-destructive-by-default` |
57
+ | `secrets` | Secret values in env, config, or output. | `security-sensitive-stop` |
58
+ | `auth_changes` | Auth, session, tenant-boundary, IAM edits. | `security-sensitive-stop` |
59
+ | `billing` | Pricing, invoicing, refund, payout logic. | `security-sensitive-stop` |
60
+ | `bulk_delete` | `rm -rf`, `DROP`, `TRUNCATE`, ≥ 5-file deletion. | `non-destructive-by-default` |
61
+ | `git_push` | `git push` to any remote. | `scope-control § git-ops` |
62
+ | `git_branch` | branch create / switch / delete. | `scope-control § git-ops` |
63
+ | `commit` | Any git commit. | `commit-policy` |
64
+ | `mcp_call_costly` | MCP / web / model call ≥ preset's `per_call_max_usd`. | — (advisory) |
65
+ | `pii_redact` | PII redaction in support / finance / recruiting / marketing outputs. | `domain-safety-pii-*` |
66
+ | `pii_log` | Logging of raw PII. | `domain-safety-logging-pii-floor` |
67
+ | `legal_advice` | Output shaped as legal advice. | `domain-safety-disclaimer-legal` |
68
+ | `medical_advice` | Output shaped as medical advice. | `domain-safety-disclaimer-medical` |
69
+ | `financial_advice` | Investment / tax / valuation positions. | `domain-safety-disclaimer-financial` |
70
+ | `pr_create` | Pull-request open / close / retarget. | `scope-control § git-ops` |
71
+ | `deploy` | Deploy / release / tag / pipeline trigger. | `non-destructive-by-default` |
72
+
73
+ ### Policy semantics
74
+
75
+ | Policy | Behaviour | Floor interaction |
76
+ |---|---|---|
77
+ | `deny` | The agent refuses. Numbered-option block surfaces the refusal and the rationale field; no override path. | `deny` is the default for every floor domain — it cannot be relaxed. |
78
+ | `ask` | The agent stops and asks a single numbered question per [`user-interaction`](../../.agent-src/rules/user-interaction.md). One question per turn. | `ask` is the default for every floor domain in a profile that has not opted out — the floor remains operative even when `policy=allow` is set elsewhere. |
79
+ | `allow` | The agent proceeds without asking. Trivial-question suppression applies. | `allow` is **forbidden** on any domain whose `Floor reference` column is non-empty. Linter rejects it. |
80
+
81
+ The legacy single switch (`personal.autonomy`) is preserved as a
82
+ **fallback** for any domain a profile does not declare — keeping
83
+ existing installs functional while profiles migrate.
84
+
85
+ ## Resolution
86
+
87
+ Order (last writer wins, subject to the Iron Floor):
88
+
89
+ 1. Domain default = `ask` for floor domains, `allow` otherwise.
90
+ 2. Profile `safety.domains.<id>.policy`.
91
+ 3. Active pack's profile (if `--pack <id>` is active).
92
+ 4. `.agent-settings.yml` user override under `profile.safety.domains`.
93
+
94
+ The explain command at [`explain config`](../../.agent-src/scripts/agent-config)
95
+ (Phase 1 Item 3 deliverable) surfaces the resolved policy per domain,
96
+ with the writer source per row.
97
+
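The resolution order can be sketched as a fold over ordered layers. The loader (`scripts/config/safety.py`) is not shipped yet, so everything here is hypothetical: the function name, the `(source, overrides)` layer shape, and the choice to skip a floor-violating `allow` at resolution time (the contract only says such entries are rejected at lint). The floor set is read off the domain registry table, treating `mcp_call_costly` as the only domain without a floor reference. The legacy `personal.autonomy` fallback is left out.

```python
# Domains whose "Floor reference" column names a floor (assumption: every
# registry row except the advisory `mcp_call_costly`).
FLOOR_DOMAINS = {
    "prod_data", "prod_infra", "secrets", "auth_changes", "billing",
    "bulk_delete", "git_push", "git_branch", "commit", "pii_redact",
    "pii_log", "legal_advice", "medical_advice", "financial_advice",
    "pr_create", "deploy",
}


def resolve_policy(domain: str, layers: list[tuple[str, dict]]) -> tuple[str, str]:
    """Return (policy, source) for one domain.

    `layers` is ordered profile -> pack -> user settings; last writer wins.
    """
    policy = "ask" if domain in FLOOR_DOMAINS else "allow"
    source = "default"
    for name, overrides in layers:
        candidate = overrides.get(domain)
        if candidate is None:
            continue
        # Iron Floor: an `allow` on a floor domain never takes effect.
        if candidate == "allow" and domain in FLOOR_DOMAINS:
            continue
        policy, source = candidate, name
    return policy, source
```

Returning the source next to the policy is what would let an explain command show the writer per row.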
98
+ ## Validation
99
+
100
+ `scripts/lint_safety_model.py` (Phase 2 deliverable — not yet
101
+ shipped) fails CI on:
102
+
103
+ - Unknown domain id.
104
+ - `allow` on a floor-referenced domain.
105
+ - Missing `rationale` (≤ 280 chars, plain prose).
106
+ - Profile declaring `safety` without at least one entry.
107
+
108
+ Until the linter lands, profiles are reviewed by hand at PR time.
109
+
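Until `scripts/lint_safety_model.py` exists, the four checks listed above could be approximated by hand with a sketch like the one below. The function name, the error strings, and the input shape (the parsed `safety:` mapping of one profile) are assumptions; the domain ids and floor references are copied from the domain registry table, with `None` marking the advisory row.

```python
# Domain id -> floor reference, from the registry table (None = advisory only).
DOMAIN_FLOORS = {
    "prod_data": "non-destructive-by-default",
    "prod_infra": "non-destructive-by-default",
    "secrets": "security-sensitive-stop",
    "auth_changes": "security-sensitive-stop",
    "billing": "security-sensitive-stop",
    "bulk_delete": "non-destructive-by-default",
    "git_push": "scope-control § git-ops",
    "git_branch": "scope-control § git-ops",
    "commit": "commit-policy",
    "mcp_call_costly": None,
    "pii_redact": "domain-safety-pii-*",
    "pii_log": "domain-safety-logging-pii-floor",
    "legal_advice": "domain-safety-disclaimer-legal",
    "medical_advice": "domain-safety-disclaimer-medical",
    "financial_advice": "domain-safety-disclaimer-financial",
    "pr_create": "scope-control § git-ops",
    "deploy": "non-destructive-by-default",
}


def lint_safety(safety: dict) -> list[str]:
    """Return human-readable violations for one profile's `safety` block."""
    errors = []
    domains = safety.get("domains") or {}
    if not domains:
        errors.append("safety declared without any domain entry")
    for domain_id, entry in domains.items():
        if domain_id not in DOMAIN_FLOORS:
            errors.append(f"{domain_id}: unknown domain id")
            continue
        if entry.get("policy") == "allow" and DOMAIN_FLOORS[domain_id]:
            errors.append(f"{domain_id}: allow on a floor-referenced domain")
        rationale = (entry.get("rationale") or "").strip()
        if not rationale or len(rationale) > 280:
            errors.append(f"{domain_id}: rationale missing or longer than 280 chars")
    return errors
```

An empty return value means the block passes these four checks; it says nothing about checks the real linter may add later.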
110
+ ## What this contract does **not** do
111
+
112
+ - **Does not** introduce new safety rules. Every domain row maps to
113
+ an existing rule or to advisory cost guidance.
114
+ - **Does not** ship the loader. `scripts/config/safety.py` is a
115
+ Phase 2 deliverable deferred to its own step.
116
+ - **Does not** override domain-safety output floors. PII redaction
117
+ and disclaimer rules apply regardless of `safety.domains.*` —
118
+ `policy=allow` on `pii_redact` means "do not ask before redacting",
119
+ not "skip redaction".
120
+ - **Does not** authorize per-tool MCP overrides. Cost caps live in
121
+ [`config-presets`](config-presets.md).
122
+
123
+ ## See also
124
+
125
+ - [`current-safety-behavior`](../architecture/current-safety-behavior.md) — pre-step-15 baseline (what this replaces)
126
+ - [`config-presets`](config-presets.md) — cost caps and enforcement
127
+ - [`profile-system`](profile-system.md) — profile axis
128
+ - [`workflow-packs`](workflow-packs.md) — pack-level overrides
129
+ - `step-15-product-refinement` § Phase 2 Item 9
@@ -0,0 +1,144 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-14
4
+ ---
5
+
6
+ # Smoke Contracts — Phase 3 of step-11-ruflo-parity
7
+
8
+ > **Status:** active · **Owner:** step-11 Phase 3 · **Sibling:**
9
+ > [`measurement-baseline.md`](measurement-baseline.md) (snapshot semantics)
10
+ > · [`cost-enforcement.md`](cost-enforcement.md) (cost ladder)
11
+
12
+ Per-tier smoke scripts validate the system's structural baselines on
13
+ every PR that touches the tier. Each script is **fast** (≤ 30 s wall),
14
+ **deterministic** (same input → same exit), and **measured** (baseline
15
+ numbers come from `task smoke:*` on `main` at lock-in, not from claims).
16
+
17
+ ## § 1 — Runtime budget
18
+
19
+ Every `scripts/smoke/<tier>.sh` honours:
20
+
21
+ | Limit | Value | Rationale |
22
+ |---|---:|---|
23
+ | Wall time | ≤ 30 s | CI matrix slot; local dev iteration |
24
+ | External I/O | none beyond filesystem | no network, no MCP |
25
+ | Output | last line is the **baseline declaration** | parseable by CI summary |
26
+
27
+ A smoke that approaches 30 s should be split into sub-smokes, not
28
+ optimised in place.
29
+
30
+ ## § 2 — Path-trigger globs
31
+
32
+ CI's `.github/workflows/smoke.yml` dispatches the right scripts based on
33
+ the paths touched in the PR:
34
+
35
+ | Tier | Globs that trigger | Script |
36
+ |---|---|---|
37
+ | kernel | `.agent-src.uncompressed/rules/**`, `.agent-src/rules/**`, `router.json`, `scripts/measure_rule_budget.py` | `scripts/smoke/kernel.sh` |
38
+ | router | `router.json`, `.agent-src.uncompressed/rules/**`, `.agent-src.uncompressed/skills/**`, `docs/contracts/**`, `docs/guidelines/**` | `scripts/smoke/router.sh` |
39
+ | schema | `.agent-src.uncompressed/skills/**`, `.agent-src.uncompressed/rules/**`, `scripts/schemas/**`, `scripts/skill_linter.py`, `scripts/validate_frontmatter.py` | `scripts/smoke/schema.sh` |
40
+ | skills | `.agent-src.uncompressed/skills/**` | `scripts/smoke/skills.sh` |
41
+
42
+ `task smoke` runs all four locally regardless of paths.
43
+
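To reason about which smokes a change would trigger, the table can be approximated locally with Python's `fnmatch`. This is only a sketch: the helper name `tiers_for` is hypothetical, and `fnmatch` lets `*` cross `/` boundaries, so its semantics are close to but not identical with the glob matching GitHub Actions applies in `smoke.yml`. The globs are copied from the table above.

```python
from fnmatch import fnmatchcase

# Tier -> trigger globs, copied from the path-trigger table.
TIER_GLOBS = {
    "kernel": [
        ".agent-src.uncompressed/rules/**", ".agent-src/rules/**",
        "router.json", "scripts/measure_rule_budget.py",
    ],
    "router": [
        "router.json", ".agent-src.uncompressed/rules/**",
        ".agent-src.uncompressed/skills/**", "docs/contracts/**",
        "docs/guidelines/**",
    ],
    "schema": [
        ".agent-src.uncompressed/skills/**", ".agent-src.uncompressed/rules/**",
        "scripts/schemas/**", "scripts/skill_linter.py",
        "scripts/validate_frontmatter.py",
    ],
    "skills": [".agent-src.uncompressed/skills/**"],
}


def tiers_for(changed_paths: list[str]) -> list[str]:
    """Return the sorted tier names whose globs match any changed path."""
    hit = set()
    for tier, globs in TIER_GLOBS.items():
        if any(fnmatchcase(path, glob) for path in changed_paths for glob in globs):
            hit.add(tier)
    return sorted(hit)
```

A PR that only edits `router.json` would map to the kernel and router tiers under this approximation; a README-only change maps to none.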
44
+ ## § 3 — Baseline declarations (locked 2026-05-16)
45
+
46
+ Smoke baselines are **measured today**, not aspirational. They lock
47
+ **regression**: a smoke goes red only if the count drifts the wrong way.
48
+ Drift toward the ideal (fewer breaches, more fences) updates the
49
+ constant in the script body and the row below.
50
+
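The regression lock described above reduces to a one-sided comparison. The sketch below is illustrative only: the smoke scripts are shell, and the function name and status strings are invented for the example. It assumes each baseline knows whether lower counts (breaches, warns) or higher counts (fences) are the ideal direction.

```python
def smoke_status(measured: int, expected: int, lower_is_better: bool = True) -> str:
    """Classify a measured count against its locked baseline.

    Status strings are hypothetical labels, not real script output.
    """
    if measured == expected:
        return "green"
    improved = measured < expected if lower_is_better else measured > expected
    # Improvement keeps the smoke green but the baseline constant should be updated.
    return "green-update-baseline" if improved else "red"
```

With the kernel baseline of 2 budget breaches, `smoke_status(3, 2)` would be a regression, while `smoke_status(1, 2)` signals that the constant and the documented row should be tightened.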
51
+ ### § 3.1 — Kernel (`scripts/smoke/kernel.sh`)
52
+
53
+ ```
54
+ 9 kernel rules · 8 carry Iron-Law fences · 1 dispatch index · ≤ 2 budget breaches
55
+ ```
56
+
57
+ - **9 kernel rules** — fixed by [`kernel-membership.md`](kernel-membership.md).
58
+ - **8 carry Iron-Law fences** — measured 2026-05-16. `agent-authority`
59
+ is the **dispatch index** (priority table pointing at the other four
60
+ authority rules); it is structurally exempt from the Iron-Law-fence
61
+ requirement and listed in the script's `EXEMPT_FROM_FENCE` set.
62
+ - **≤ 2 budget breaches** — `python3 scripts/measure_rule_budget.py
63
+ --kernel-budget-check` currently reports 2 breaches
64
+ (`kernel-bucket > 26000`, `no-cheap-questions > 4000`). The smoke
65
+ asserts the count does not grow; reductions update `EXPECTED_BREACHES`
66
+ in `scripts/smoke/kernel.sh`. See
67
+ `road-to-kernel-and-router.md`
68
+ for the path back to zero.
69
+
70
+ ### § 3.2 — Router (`scripts/smoke/router.sh`)
71
+
72
+ ```
73
+ 75 router ids · 0 broken rule pointers · 35 routes_to refs · 2 missing contracts
74
+ ```
75
+
76
+ - **75 ids** — 9 kernel + 24 tier_1 + 42 tier_2; every id resolves to
77
+ `.agent-src/rules/<id>.md`.
78
+ - **0 broken rule pointers** — hard assertion; smoke fails on any miss.
79
+ - **35 routes_to refs** across tier_1 + tier_2; resolver honours the
80
+ four prefixes (`skill:`, `command:`, `guideline:`, `contract:`).
81
+ - **2 missing contracts** — measured 2026-05-16:
82
+ `contract:artifact-engagement-flow`,
83
+ `contract:command-suggestion-flow`. Tracked separately under
84
+ `step-11` Phase 4 (ADR layout);
85
+ smoke asserts the count is `≤ EXPECTED_MISSING_CONTRACTS=2`.
86
+
87
+ ### § 3.3 — Schema (`scripts/smoke/schema.sh`)
88
+
89
+ ```
90
+ 438 lintable artefacts · 0 schema FAILs · ≤ 92 warns
91
+ ```
92
+
93
+ - **0 FAILs** — hard assertion. `scripts/skill_linter.py --all` returns
94
+ exit 0/1 (warns) but never 2 (fail).
95
+ - **≤ 92 warns** — measured 2026-05-16; locks regression. Warns
96
+ trending down updates the constant.
97
+ - **v2 schema (step-5) deferred** — when
98
+ `step-5-schema-rigor.md`
99
+ Phase 1 closes, this smoke gains a `model_tier` presence assertion;
100
+ Phase 3 adds `schema_version: "2"`. Until then, v1 schema in
101
+ `scripts/schemas/skill.schema.json` is the contract.
102
+
103
+ ### § 3.4 — Skills (`scripts/smoke/skills.sh`)
104
+
105
+ ```
106
+ 5/5 random skills resolve · frontmatter parses · name matches directory
107
+ ```
108
+
109
+ - **5 random skills** picked deterministically (seed = epoch day) from
110
+ `.agent-src.uncompressed/skills/*/SKILL.md` and re-validated via
111
+ `scripts/validate_frontmatter.py`. `agent-config explain skill` is
112
+ **not** invoked — `explain` only supports `{config,rule,route}` today
113
+ ([`scripts/_cli/cmd_explain.py`](../../scripts/_cli/cmd_explain.py));
114
+ filesystem-resolution is the contract.
115
+
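"Picked deterministically (seed = epoch day)" can be illustrated in Python as below. The real smoke is a shell script, so this is a sketch of the idea rather than its implementation; `pick_skills` and its parameters are hypothetical. Sorting before sampling matters, because directory listing order is not guaranteed to be stable across machines.

```python
import random
import time
from typing import Optional


def pick_skills(skill_dirs: list[str], sample_size: int = 5,
                now: Optional[float] = None) -> list[str]:
    """Pick the same sample for every run that happens on the same UTC day."""
    seconds = time.time() if now is None else now
    epoch_day = int(seconds // 86400)  # whole days since 1970-01-01 UTC
    rng = random.Random(epoch_day)     # private generator, no global state
    ordered = sorted(skill_dirs)       # stable input order for a stable sample
    return rng.sample(ordered, min(sample_size, len(ordered)))
```

Two CI runs on the same day then validate the same skills, while the sample rotates from day to day and gradually covers the whole directory.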
116
+ ## § 4 — Local invocation
117
+
118
+ ```bash
119
+ task smoke # all four
120
+ task smoke:kernel # individual tiers
121
+ task smoke:router
122
+ task smoke:schema
123
+ task smoke:skills
124
+ ```
125
+
126
+ Every script honours `SMOKE_QUIET=1` (suppresses table output, keeps
127
+ the final baseline line) for CI summary parsing.
128
+
129
+ ## § 5 — Failure modes
130
+
131
+ | Symptom | Likely cause | Fix |
132
+ |---|---|---|
133
+ | `kernel.sh` reports fewer than 8 Iron-Law fences | Kernel rule lost its Iron Law block during edit | Restore the fence; update `EXEMPT_FROM_FENCE` only for new dispatch indexes |
134
+ | `router.sh` reports > 0 broken pointers | `router.json` references an id without a rule file | Add the rule or remove the route — never edit the smoke baseline up |
135
+ | `schema.sh` reports FAILs | A skill / rule lost a required field | Restore via [`scripts/schemas/skill.schema.json`](../../scripts/schemas/skill.schema.json) |
136
+ | `skills.sh` 5/5 random sample fails | Hand-edit broke frontmatter or renamed directory without updating `name:` | Restore filename ↔ slug coupling |
137
+
138
+ ## § 6 — See also
139
+
140
+ - [`measurement-baseline.md`](measurement-baseline.md) — measurement substrate.
141
+ - [`cost-enforcement.md`](cost-enforcement.md) — cost ladder, sibling smoke surface.
142
+ - [`kernel-membership.md`](kernel-membership.md) — the 9-rule kernel set.
143
+ - [`rule-router.md`](rule-router.md) — router contract.
144
+ - `road-to-kernel-and-router.md` — kernel budget reduction path.
@@ -0,0 +1,146 @@
1
+ ---
2
+ stability: beta
3
+ keep-beta-until: 2026-08-14
4
+ ---
5
+
6
+ # User-type Schema — runtime review-lens axis
7
+
8
+ > **Status:** active · **Stability:** beta · **Owner:** step-6-user-types-axis
9
+ > · **Linter:** `scripts/skill_linter.py § lint_usertype`
10
+ > · **Source-of-truth dir:** `.agent-src.uncompressed/user-types/`
11
+ > · **Sibling axis (distinct):** install-time `user-types/` (package root) — see [`adr-install-user-type-axis`](adr-install-user-type-axis.md)
12
+ > · **ADR:** [`adr-user-types-axis`](adr-user-types-axis.md)
13
+
14
+ Locks the canonical user-type shape. A user-type is a **runtime review
15
+ lens** simulating a real end-user of the software under review (a
16
+ galabau field crew, a metalworking shop, a truck driver). It is the
17
+ twin of `personas/` along a different axis: persona = *how* we review
18
+ (methodology — qa, senior-engineer); user-type = *who* we simulate
19
+ (end-user — domain workflow + operational reality).
20
+
21
+ ## § 1 — Frontmatter
22
+
23
+ | Key | Type | Required | Notes |
24
+ |---|---|---|---|
25
+ | `id` | string | yes | lowercase-hyphenated, must match filename stem |
26
+ | `kind` | const `user-type` | yes | discriminator — locks this file as a review-lens user-type, separates it from the install-time user-type-axis YAMLs |
27
+ | `description` | string | yes | one sentence, ≤ 160 chars (linter cap matches persona) |
28
+ | `version` | string | yes | semver; bump on breaking changes |
29
+ | `source` | enum | yes | `package` \| `project` — project-specific is the typical case (consumer-domain end-users) |
30
+
31
+ `user-types:` is NOT a skill-frontmatter key in v1. The axis is
32
+ CLI-only (`/refine-ticket --user-type=<id>`). Skill-level defaults are
33
+ deferred to v2 — see [`adr-user-types-axis`](adr-user-types-axis.md).
34
+
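The frontmatter table above can be sketched as a small Python check. This is a minimal illustration under stated assumptions, not the code in `scripts/skill_linter.py`: the function name `check_frontmatter`, the id and semver regexes, and the error strings are hypothetical, and the frontmatter is assumed to be already parsed into a dict.

```python
import re
from pathlib import Path

# Assumed patterns; the real linter may accept a different id or semver grammar.
ID_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")
SEMVER_RE = re.compile(r"^\d+\.\d+\.\d+$")  # core MAJOR.MINOR.PATCH only
REQUIRED_KEYS = ("id", "kind", "description", "version", "source")
ALLOWED_SOURCES = {"package", "project"}
DESCRIPTION_CAP = 160


def check_frontmatter(path: str, fm: dict) -> list[str]:
    """Return violations of the frontmatter table; an empty list means valid."""
    errors = [f"missing required key: {key}" for key in REQUIRED_KEYS if key not in fm]
    if errors:
        return errors  # the checks below need every key present

    stem = Path(path).stem
    if not ID_RE.match(str(fm["id"])):
        errors.append("id must be lowercase-hyphenated")
    if fm["id"] != stem:
        errors.append(f"id {fm['id']!r} does not match filename stem {stem!r}")
    if fm["kind"] != "user-type":
        errors.append("kind must be the const 'user-type'")
    if len(str(fm["description"])) > DESCRIPTION_CAP:
        errors.append(f"description exceeds {DESCRIPTION_CAP} chars")
    if not SEMVER_RE.match(str(fm["version"])):
        errors.append("version must be semver")
    if fm["source"] not in ALLOWED_SOURCES:
        errors.append("source must be 'package' or 'project'")
    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every violation for a file in one pass.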
35
+ ## § 2 — Required section spine (locked)
36
+
37
+ User-types share the spine across the axis — no Core/Specialist split,
38
+ no tier enum. Every user-type carries all seven sections:
39
+
40
+ 1. **Focus** — one paragraph. Who this lens is, the operational
41
+ context they work in, and what no other lens catches. End with one
42
+ sentence pinning the boundary: review-lens only, never operational
43
+ instruction source.
44
+ 2. **Daily Workflow** — concrete day-shape, not generic prose. What
45
+ they do at 06:00, 10:00, 15:00; what they look at, what they touch,
46
+ what they wait for.
47
+ 3. **Vocabulary** — domain terms the software must use (or must NOT
48
+ substitute). Bilingual where the trade is bilingual. Plain-language
49
+ over engineer-language where the user is non-technical.
50
+ 4. **Operational Constraints** — mobile / offline / gloves / noise /
51
+ PPE / time pressure / connectivity / lighting / dead-zones /
52
+ hours-of-service / break-windows / shop-floor vs office split.
53
+ Each constraint is a UI / flow signal, not generic empathy.
54
+ 5. **Unique Questions** — ≥ 3 questions no persona asks verbatim.
55
+ Each must be falsifiable against the ticket under review. (The
56
+ linter warns below 3, matching the persona heuristic.)
57
+ 6. **Ticket Red Flags** — what this lens would flag as missing or
58
+ unrealistic when reviewing a ticket. Bullet list, each item names a
59
+ concrete signal a generic reviewer would miss.
60
+ 7. **Anti-Patterns** — what this lens must refuse to do. Guardrails
61
+ are non-negotiable here: **review-only, never operational
62
+ instruction**. No trade execution (welding procedure, electrical
63
+ work, structural advice). No dangerous how-to. No medical / legal
64
+ / engineering advice. Generic prose ("consider usability") is
65
+ itself an anti-pattern.
66
+
67
+ `Composes well with` is permitted as an optional eighth section
68
+ (advisory pairings with personas), not budget-counted.
69
+
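A presence check for the seven-section spine might look like the sketch below. It assumes each section is a level-2 Markdown heading (the document only shows `## Anti-Patterns` in that form) and uses a hypothetical helper name, `missing_sections`; it does not verify section order and is not taken from the actual linter.

```python
import re

# The seven required sections, in spine order.
REQUIRED_SECTIONS = [
    "Focus",
    "Daily Workflow",
    "Vocabulary",
    "Operational Constraints",
    "Unique Questions",
    "Ticket Red Flags",
    "Anti-Patterns",
]

# Matches level-2 headings only: "## Title" (not "# Title" or "### Title").
H2_RE = re.compile(r"^##[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(markdown: str) -> list[str]:
    """Return required section names absent from the document, in spine order."""
    present = {match.group(1) for match in H2_RE.finditer(markdown)}
    return [name for name in REQUIRED_SECTIONS if name not in present]
```

Because only the required names are looked up, an optional `Composes well with` heading passes through without affecting the result.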
70
+ ## § 3 — Size budget
71
+
72
+ | Section count | Line cap | Rationale |
73
+ |---|---|---|
74
+ | 7 | ≤ 120 | Matches the persona core budget. Spine is wider than a core persona (7 vs 5 sections) but narrower than a wing-3/4 specialist (no Critical Rules + Workflows blocks). 120 is the larger of the two candidate caps and the persona core uses it for a 5-section spine; the extra two sections need the headroom. |
79
+
80
+ Enforced by `lint-skills` against the full file including frontmatter
81
+ and trailing blank line.
82
+
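The cap applies to the whole file, so the count has to include the frontmatter and any trailing blank line. A minimal sketch of one way to count, assuming that "trailing blank line" means an empty final line after the last newline-terminated line; `line_count` and `within_budget` are hypothetical names, and the real `lint-skills` counting rule may differ:

```python
LINE_CAP = 120  # cap from the size-budget table


def line_count(text: str) -> int:
    """Count lines in the full file text, frontmatter included.

    str.splitlines() does not add an entry for the final newline terminator,
    but an extra blank line at the end of the file does count as a line.
    """
    return len(text.splitlines())


def within_budget(text: str) -> bool:
    """True when the file fits the 120-line cap."""
    return line_count(text) <= LINE_CAP
```

Under this reading, a file with 120 newline-terminated lines passes, and the same file with one extra blank line at the end does not.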
83
+ ## § 4 — Anti-Generic Quality Bar (merge gate)
84
+
85
+ Every user-type must encode **≥ 5 concrete, domain-specific review
86
+ points** across `Daily Workflow`, `Vocabulary`, `Operational
87
+ Constraints`, and `Ticket Red Flags`. Generic prose is REJECTED at
88
+ lint or review time:
89
+
90
+ - ❌ "consider mobile usability" → ✅ "capacitive touch fails with
91
+ wet leather gloves at 4 °C; tap targets ≥ 60 px or voice command"
92
+ - ❌ "think about offline" → ✅ "no signal in cellar yards; queue
93
+ changes locally, conflict-resolve on the morning brief"
94
+ - ❌ "users want reports" → ✅ "end-of-day proof = timestamped photo
95
+ + customer signature + GPS fix; anything less is a billing dispute"
96
+
97
+ The Reviewer test: a generic reviewer persona could not have produced
98
+ the `Unique Questions` or `Ticket Red Flags` of this file. If they
99
+ could, the file is generic.
100
+
101
+ ## § 5 — Guardrails (encoded in every Anti-Patterns block)
102
+
103
+ User-types are review lenses, not operational manuals. Every file's
104
+ `## Anti-Patterns` section MUST explicitly forbid:
105
+
106
+ - Trade-execution instructions (welding procedure, electrical work,
107
+ structural advice, anything that could harm if followed)
108
+ - Dangerous how-to (chemical handling, equipment operation, work-at-
109
+ height procedures)
110
+ - Medical / legal / engineering advice that requires a licensed
111
+ practitioner
112
+
113
+ Allowed and encouraged: workflow realism, ticket gap analysis,
114
+ terminology correction, mobile / offline / safety / approval signals
115
+ as ticket-requirement signals.
116
+
117
+ ## § 6 — Schema enforcement
118
+
119
+ The linter (`scripts/skill_linter.py § lint_usertype`) enforces:
120
+
121
+ - frontmatter shape (table in § 1)
122
+ - `kind` const value
123
+ - required sections per § 2
124
+ - size budget per § 3
125
+ - ≥ 3 bullets in `Unique Questions`
126
+ - `id` matches filename stem
127
+ - description ≤ 160 chars
128
+
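As an illustration of the bullet-count rule in the list above, the sketch below counts top-level bullets under a `## Unique Questions` heading. The function name and the choice to treat only unindented `-` or `*` items as bullets are assumptions; the actual `lint_usertype` logic is not shown in this document.

```python
import re

BULLET_RE = re.compile(r"^[-*] ")  # top-level bullet: no leading indentation


def count_unique_questions(markdown: str) -> int:
    """Count top-level bullets inside the '## Unique Questions' section."""
    in_section = False
    count = 0
    for line in markdown.splitlines():
        if line.startswith("## "):
            # Any level-2 heading opens a new section and closes the previous one.
            in_section = line[3:].strip() == "Unique Questions"
            continue
        if in_section and BULLET_RE.match(line):
            count += 1
    return count
```

A caller would compare the result against the threshold of 3. Wrapped continuation lines and nested bullets are indented, so they are not counted as separate questions.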
129
+ Authors must use the template at
130
+ `.agent-src.uncompressed/user-types/_template/user-type.md`.
131
+
132
+ ## § 7 — Versioning
133
+
134
+ Section rename / add / remove → ADR + linter update + user-type
135
+ migrations in the same PR. Size-cap tightening is breaking when it
136
+ forces existing user-types to lose content; size-cap loosening is
137
+ non-breaking. The `kind` const is locked — renaming requires a major
138
+ version bump and a separate ADR.
139
+
140
+ ## See also
141
+
142
+ - [`persona-schema`](persona-schema.md) — sister axis (methodology vs end-user)
143
+ - [`adr-user-types-axis`](adr-user-types-axis.md) — why the axis split exists
144
+ - [`adr-install-user-type-axis`](adr-install-user-type-axis.md) — the install-time `user_type` axis (distinct layer, same vocabulary)
145
+ - `.agent-src.uncompressed/user-types/README.md` — authoring entry point
146
+ - `.agent-src.uncompressed/user-types/_template/user-type.md` — template starter