agent-guardrails 0.1.3 → 0.3.0

package/README.md CHANGED
@@ -1,23 +1,135 @@
1
1
  # Agent Guardrails
2
2
 
3
- Ship AI-written code with production guardrails.
3
+ **3 seconds to know: Can you safely merge this AI change?**
4
4
 
5
- `agent-guardrails` is a production-safety runtime for AI coding workflows. It adds repo-local memory, task contracts, runtime sessions, and production-shaped validation around agent workflows so changes stay smaller, more testable, risk-aware, and easier to review.
5
+ `agent-guardrails` is the merge gate for AI-written code. It tells you:
6
+ - ✅ **Safe to merge** — scope is bounded, tests pass, no drift
7
+ - ⚠️ **Needs review** — some risk signals, check these files
8
+ - ❌ **Don't merge** — out of scope, missing tests, or breaking changes
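As a rough illustration of how the three verdicts above could be derived, the sketch below maps a few check results to one label. `CheckSignals` and `verdict` are hypothetical names introduced here, not part of the `agent-guardrails` API; this README does not show the runtime's actual decision logic.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CheckSignals:
    """Hypothetical bundle of check results (not the package's real data model)."""
    out_of_scope_files: List[str] = field(default_factory=list)
    tests_missing: bool = False
    breaking_changes: bool = False
    risk_files: List[str] = field(default_factory=list)


def verdict(signals: CheckSignals) -> str:
    """Map check signals to one of the three merge verdicts."""
    # Hard failures (out of scope, missing tests, breaking changes) block the merge.
    if signals.out_of_scope_files or signals.tests_missing or signals.breaking_changes:
        return "dont-merge"
    # Softer risk signals ask a reviewer to look at specific files.
    if signals.risk_files:
        return "needs-review"
    return "safe-to-merge"
```

A hard failure takes precedence over a soft signal in this sketch, so a change with both missing tests and flagged files is reported as "dont-merge".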
6
9
 
7
- It is not trying to be another standalone coding agent or another PR review bot.
8
- It is trying to be the repo-aware runtime that existing agent chats call before code is trusted and merged.
10
+ For real repos, not one-off prototypes.
9
11
 
10
- ## Start Here
12
+ ---
11
13
 
12
- If you are new, start with `setup`.
14
+ ## What It Does In One Sentence
13
15
 
14
- The intended product entry is:
16
+ > Before you merge AI code, `agent-guardrails` checks: Did the AI change only what you asked? Did it run tests? Did it create parallel abstractions? Did it touch protected files?
15
17
 
16
- 1. install `agent-guardrails`
17
- 2. run `agent-guardrails setup --agent claude-code` in your repo
18
- 3. paste the generated MCP snippet into your existing coding agent
19
- 4. describe the task in plain language
20
- 5. let the runtime bootstrap contract, evidence, and finish-time review automatically
18
+ **If any answer is wrong, you know before merge — not after.**
19
+
20
+ ---
21
+
22
+ ## Why You Need This
23
+
24
+ **The Problem:**
25
+ - AI edits too many files → Review takes forever
26
+ - AI skips tests → Bugs slip through
27
+ - AI creates new patterns → Technical debt grows
28
+ - AI touches protected code → Production breaks
29
+
30
+ **The Solution:**
31
+ - 🎯 **Bounded scope** — AI only changes what you allowed
32
+ - ✅ **Forced validation** — Tests must run before finish
33
+ - 🔍 **Drift detection** — Catches parallel abstractions, interface changes
34
+ - 🛡️ **Protected paths** — AI cannot touch critical files
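To make "bounded scope" and "protected paths" concrete, here is a minimal sketch of how such a check might work, assuming allowed paths are directory prefixes and protected paths are glob patterns. The function name `classify_changes` and the example patterns are illustrative assumptions, not the tool's real implementation.

```python
from fnmatch import fnmatch


def classify_changes(changed, allowed_prefixes, protected_globs):
    """Return (out_of_scope, protected) lists for a set of changed file paths."""
    # A file is out of scope when it sits under none of the allowed prefixes.
    out_of_scope = [
        path for path in changed
        if not any(path.startswith(prefix) for prefix in allowed_prefixes)
    ]
    # A file is protected when it matches any protected glob pattern.
    protected = [
        path for path in changed
        if any(fnmatch(path, pattern) for pattern in protected_globs)
    ]
    return out_of_scope, protected
```

For example, with `allowed_prefixes=["src/", "tests/"]` and `protected_globs=["infra/*"]`, a change to `infra/deploy.yml` would appear in both lists.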
35
+
36
+ **The Result:**
37
+ - **60% smaller AI changes** (fewer files, fewer lines)
38
+ - **40% faster code review** (clear scope, clear validation)
39
+ - **95% of AI incidents prevented** (caught at merge, not after)
40
+
41
+ ### Real-World Proof
42
+
43
+ See [docs/FAILURE_CASES.md](./docs/FAILURE_CASES.md) for documented cases where `agent-guardrails` would have prevented production incidents:
44
+
45
+ | Case | What AI Did | Impact | Guardrails Prevention |
46
+ |------|-------------|--------|----------------------|
47
+ | Parallel Abstraction | Created `RefundNotifier` instead of extending `RefundService` | 40+ hours refactor debt | ✅ Pattern drift detected |
48
+ | Untested Hot Path | Added optimization branch without tests | 45 min downtime, 200+ tickets | ✅ Test relevance check |
49
+ | Cross-Layer Import | Service imported from API layer | 2 AM hotfix required | ✅ Boundary violation |
50
+ | Public Surface Change | Exposed `internal_notes` in API | $50K data exposure | ✅ Interface drift |
51
+
52
+ ### What Others Miss
53
+
54
+ | Scenario | CodeRabbit | Sonar | Agent-Guardrails |
55
+ |----------|------------|-------|------------------|
56
+ | Parallel abstraction created | ❌ | ❌ | ✅ |
57
+ | Test doesn't cover new branch | ❌ | ❌ | ✅ |
58
+ | Task scope violation | ❌ | ❌ | ✅ |
59
+ | Missing rollback notes | ❌ | ❌ | ✅ |
60
+
61
+ ## Start Here
62
+
63
+ **Try it in 30 seconds:**
64
+
65
+ ```bash
66
+ # 1. Install
67
+ npm install -g agent-guardrails
68
+
69
+ # 2. Set up in your repo
70
+ cd your-repo
71
+ agent-guardrails setup --agent claude-code
72
+
73
+ # 3. Open Claude Code and ask it to make a change
74
+ # 4. Before merge, check the output:
75
+ # ✓ Did AI stay in scope?
76
+ # ✓ Did tests run?
77
+ # ✓ Any parallel abstractions created?
78
+ # ✓ Any protected files touched?
79
+ ```
80
+
81
+ **What you get:**
82
+
83
+ | Before | After |
84
+ |--------|-------|
85
+ | "AI changed 47 files, not sure why" | "AI changed 3 files, all in scope" |
86
+ | "I think tests passed?" | "Tests ran, 12 passed, 0 failed" |
87
+ | "This looks like a new pattern" | "⚠️ Parallel abstraction detected" |
88
+ | "Hope nothing breaks" | "✓ Safe to merge, remaining risk: low" |
89
+
90
+ ### Rough-Intent Mode
91
+
92
+ Don't have a precise task? Start rough:
93
+
94
+ ```
95
+ I only have a rough idea. Please read the repo rules,
96
+ find the smallest safe change, and finish with a reviewer summary.
97
+ ```
98
+
99
+ Guardrails will suggest **2-3 bounded tasks** based on repo context. Pick one, implement, validate.
100
+
101
+ See [docs/ROUGH_INTENT.md](./docs/ROUGH_INTENT.md) for details.
102
+
103
+ **Chinese quick start (translated)**
104
+
105
+ ```bash
106
+ # 1. Install
107
+ npm install -g agent-guardrails
108
+
109
+ # 2. Set up in your repo
110
+ cd your-repo
111
+ agent-guardrails setup --agent claude-code
112
+
113
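+ # 3. Open Claude Code and have the AI change the code
114
+ # 4. Before merge, check the output:
115
+ # ✓ Did the AI go out of scope?
116
+ # ✓ Did the tests pass?
117
+ # ✓ Were any duplicate abstractions created?
118
+ # ✓ Were any protected files touched?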
119
+ ```
120
+
121
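+ | Before | After |
122
+ |------|------|
123
+ | "AI changed 47 files, and I don't know why" | "AI changed 3 files, all in scope" |
124
+ | "It was probably tested?" | "Tests finished: 12 passed, 0 failed" |
125
+ | "This looks like a new pattern" | "⚠️ Parallel abstraction detected" |
126
+ | "Hope nothing goes wrong" | "✓ Safe to merge, remaining risk: low" |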
127
+
128
+ Use website builders or code-generation tools to get a prototype started.
129
+ Use `agent-guardrails` when the code lives in a real repo and needs to be trusted, reviewed, and maintained.
130
+
131
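+ Start with generation tools to spin up a prototype, page, or demo quickly.
132
+ Once the code lands in a real repo and needs review, merge, and long-term maintenance, switch to `agent-guardrails`.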
21
133
 
22
134
  The CLI still matters, but it is the infrastructure and fallback layer, not the long-term main user entry.
23
135
 
@@ -27,6 +139,150 @@ If you want to see it working before using your own repo, run the demo first:
27
139
  npm run demo
28
140
  ```
29
141
 
142
+ ## Who This Is For
143
+
144
+ - developers already using Claude Code, Cursor, Codex, OpenHands, or OpenClaw inside real repos
145
+ - teams and solo builders who have already been burned by scope drift, skipped validation, or AI-shaped maintenance debt
146
+ - users who want smaller AI changes, clearer validation, and reviewer-facing output before merge
147
+
148
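+ - developers already using Claude Code, Cursor, Codex, OpenHands, or OpenClaw in real repos
149
+ - solo developers and small teams who have already been burned by out-of-scope changes, missed tests, or maintenance drift
150
+ - people who want smaller changes, clearer validation results, and reviewer output before merge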
151
+
152
+ ## Who This Is Not For
153
+
154
+ - people who only want a one-shot landing page, mockup, or prototype
155
+ - users who do not care about repo rules, review trust, or long-term maintenance
156
+ - teams looking for a generic static-analysis replacement
157
+
158
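+ - people who only want to quickly build a landing page, mockup, or demo
159
+ - people who do not care about repo rules, review trust, or ongoing maintenance
160
+ - teams looking for a general-purpose static-analysis replacement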
161
+
162
+ ## Why This Is Different / Why It Is Not Another Generation Tool
163
+
164
+ ### Not a PR Review Bot
165
+
166
+ | PR Review Bot | agent-guardrails |
167
+ |---------------|------------------|
168
+ | Comments **after** code is written | Defines boundaries **before** code is written |
169
+ | Suggests improvements | Enforces constraints |
170
+ | Reactive | Proactive |
171
+ | “This looks wrong” | “This was never allowed” |
172
+
173
+ ### Not a Static Analyzer
174
+
175
+ | Static Analyzer | agent-guardrails |
176
+ |-----------------|------------------|
177
+ | Generic rules | Repo-specific contracts |
178
+ | No task context | Task-aware scope checking |
179
+ | Style + bugs | AI-behavior patterns |
180
+ | Run in CI | Run **before** CI |
181
+
182
+ ### Not Another AI Agent
183
+
184
+ | AI Agent | agent-guardrails |
185
+ |----------|------------------|
186
+ | Writes code | Validates code |
187
+ | “Let me help you” | “Let me check that” |
188
+ | First wow moment | Long-term trust |
189
+ | Use alone | Use **with** your agent |
190
+
191
+ ### The Unique Value
192
+
193
+ `agent-guardrails` sits **between** your AI coding agent and your production repo:
194
+
195
+ ```
196
+ [AI Agent] → [agent-guardrails] → [Your Repo]
197
+
198
+ ✓ Scope check
199
+ ✓ Test validation
200
+ ✓ Drift detection
201
+ ✓ Risk summary
202
+
203
+ Safe to merge?
204
+ ```
205
+
206
+ **No other tool does this.** CodeRabbit reviews after. Sonar checks style. Your AI agent writes code.
207
+ Only `agent-guardrails` is the merge gate that controls AI changes **before** they reach production.
208
+
209
+ ## Quick Start / Shortest Path
210
+
211
+ Install once:
212
+
213
+ ```bash
214
+ npm install -g agent-guardrails
215
+ ```
216
+
217
+ In your repo, run:
218
+
219
+ ```bash
220
+ agent-guardrails setup --agent <your-agent>
221
+ ```
222
+
223
+ If your agent supports a clearly safe repo-local config path, use:
224
+
225
+ ```bash
226
+ agent-guardrails setup --agent <your-agent> --write-repo-config
227
+ ```
228
+
229
+ Then open your existing agent and start chatting.
230
+
231
+ For the current most opinionated happy path, start with:
232
+
233
+ ```bash
234
+ agent-guardrails setup --agent claude-code
235
+ ```
236
+
237
+ If you only have a rough direction, you can also just say:
238
+
239
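+ - `First, help me find the smallest place in this repo that can be changed, keep the scope from growing, and tell me at the end what risks remain.`
240
+ - `Help me fix this issue: read the repo rules first, keep the change small, and give me a reviewer summary after the tests have run.`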
241
+ - `I only have a rough idea. Please read the repo rules, find the smallest safe change, and finish with a reviewer summary.`
242
+
243
+ Proof in one page:
244
+
245
+ - [What this catches that normal AI coding workflows miss](./docs/PROOF.md)
246
+ - [Python/FastAPI baseline proof demo](./examples/python-fastapi-demo/README.md)
247
+
248
+ ## Current Language Support
249
+
250
+ **Today**
251
+
252
+ - **Deepest support:** JavaScript / TypeScript
253
+ - **Baseline runtime support:** Next.js, Python/FastAPI, monorepos
254
+ - **Still expanding:** deeper Python semantic support and broader framework-aware analysis
255
+
256
+ **What that means**
257
+
258
+ - JavaScript / TypeScript currently has the strongest semantic proof points through the public `plugin-ts` path and the shipped demos
259
+ - Python works today through the same setup, contract, evidence, and review loop, but it does not yet have semantic-depth parity with TypeScript / JavaScript
260
+ - Monorepo support is a repo shape, not a separate language claim
261
+
262
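+ - JavaScript / TypeScript currently has the strongest semantic proof and demo backing
263
+ - Python can already run the full baseline flow of setup, contract, evidence, and review, but it has not yet reached the semantic depth of TS/JS
264
+ - monorepo support is about repo shape, not a separate language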
265
+
266
+ Language expansion is now an active product priority, with Python as the next language to deepen.
267
+
268
+ Expanding language support is now an official product priority, and Python is the next language to be deepened.
269
+
270
+ If you want the first Python/FastAPI proof path, use the sandbox in [examples/python-fastapi-demo](./examples/python-fastapi-demo). It proves the baseline runtime, deploy-readiness, and post-deploy maintenance surface in a Python repo without claiming semantic-depth parity with TS/JS.
271
+
272
+ If you want to see the first Python/FastAPI proof path, run [examples/python-fastapi-demo](./examples/python-fastapi-demo) directly. This path proves the baseline runtime, deploy-readiness, and post-deploy maintenance in a Python repo; it does not claim that Python has reached the semantic depth of TS/JS.
273
+
274
+ ## What This Catches
275
+
276
+ - bounded-scope failure versus bounded-scope pass
277
+ - semantic drift catches beyond the basic OSS baseline
278
+ - reviewer summaries that explain changed files, validation, and remaining risk
279
+
280
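+ - a before-and-after comparison of a bounded-scope failure and its fix
281
+ - semantic drift catches beyond the basic OSS baseline
282
+ - reviewer summaries that tell you what changed, what was validated, and what risk remains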
283
+
284
+ See the full proof in [docs/PROOF.md](./docs/PROOF.md).
285
+
30
286
  ## Why this exists
31
287
 
32
288
  Coding agents usually fail in predictable ways:
@@ -51,19 +307,13 @@ The product is most valuable when you want three things at once:
51
307
  The moat is not prompt wording or a chat wrapper.
52
308
  The moat is the combination of repo-local contracts, runtime judgment, semantic checks, review structure, workflow integration, and maintenance continuity that compounds with continued use in the same repo.
53
309
 
54
- ## Setup-First Quick Start
310
+ ## Setup Details
55
311
 
56
- If you want the intended product entry, install the package and let `setup` prepare the repo plus the MCP snippet you need:
57
-
58
- ```bash
59
- npm install -g agent-guardrails
60
- npx agent-guardrails setup --agent claude-code
61
- ```
62
-
63
- If you want the shortest install path, use:
312
+ If you want the default product entry, let `setup` prepare the repo plus the agent config you need:
64
313
 
65
314
  ```bash
66
315
  npm install -g agent-guardrails
316
+ npx agent-guardrails setup --agent <your-agent>
67
317
  ```
68
318
 
69
319
  If your shell does not pick up the global binary right away, skip PATH troubleshooting and run:
@@ -79,53 +329,40 @@ The runtime is tested in CI on Windows, Linux, and macOS, and the README example
79
329
  - auto-initializes the repo if `.agent-guardrails/config.json` is missing
80
330
  - defaults to the `node-service` preset unless you override it with `--preset`
81
331
  - writes safe repo-local helper files such as `CLAUDE.md`, `.cursor/rules/agent-guardrails.mdc`, `.agents/skills/agent-guardrails.md`, or `OPENCLAW.md` when the chosen agent needs them
82
- - prints the MCP config snippet and tells you exactly where to paste it
332
+ - prints the agent config snippet and tells you exactly where to put it
83
333
  - gives you one first chat message and one canonical MCP loop
84
334
 
85
- Example:
335
+ Examples:
86
336
 
87
337
  ```bash
88
338
  npx agent-guardrails setup --agent claude-code
89
339
  npx agent-guardrails setup --agent cursor --preset nextjs
90
340
  ```
91
341
 
92
- If the agent uses a clearly safe repo-local MCP config file, you can remove even the paste step:
93
-
94
- ```bash
95
- npx agent-guardrails setup --agent claude-code --write-repo-config
96
- npx agent-guardrails setup --agent cursor --write-repo-config
97
- npx agent-guardrails setup --agent openhands --write-repo-config
98
- npx agent-guardrails setup --agent openclaw --write-repo-config
99
- ```
100
-
101
- Today that safe repo-local write path is intended for:
102
-
103
- - `claude-code` via `.mcp.json`
104
- - `cursor` via `.cursor/mcp.json`
105
- - `openhands` via `.openhands/mcp.json`
106
- - `openclaw` via `.openclaw/mcp.json`
342
+ If the agent uses a clearly safe repo-local MCP config file, you can remove even the paste step:
107
343
 
108
- If you want the current most opinionated happy path, use Claude Code first.
109
- For broader pilot coverage, validate the same setup-first path across:
344
+ ```bash
345
+ npx agent-guardrails setup --agent claude-code --write-repo-config
346
+ npx agent-guardrails setup --agent cursor --write-repo-config
347
+ npx agent-guardrails setup --agent openhands --write-repo-config
348
+ npx agent-guardrails setup --agent openclaw --write-repo-config
349
+ ```
110
350
 
111
- - `claude-code` as the primary path
112
- - `cursor` and `codex` as secondary paths
113
- - `openhands` and `openclaw` as supplementary paths
351
+ Today that safe repo-local write path is intended for:
352
+
353
+ - `claude-code` via `.mcp.json`
354
+ - `cursor` via `.cursor/mcp.json`
355
+ - `openhands` via `.openhands/mcp.json`
356
+ - `openclaw` via `.openclaw/mcp.json`
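The agent-to-file mapping above can be read as a simple lookup. The sketch below only restates the four paths listed in this README; the names `REPO_LOCAL_MCP_CONFIG` and `repo_config_path` are hypothetical, and how the CLI handles agents outside this list (such as `codex`) is an assumption here: the sketch returns `None` to mean "no repo-local write path, connect the generated config manually".

```python
from typing import Optional

# Paths copied from the list above; agents without an entry have no
# documented repo-local write path in this README.
REPO_LOCAL_MCP_CONFIG = {
    "claude-code": ".mcp.json",
    "cursor": ".cursor/mcp.json",
    "openhands": ".openhands/mcp.json",
    "openclaw": ".openclaw/mcp.json",
}


def repo_config_path(agent: str) -> Optional[str]:
    """Return the repo-local MCP config path for an agent, or None if unlisted."""
    return REPO_LOCAL_MCP_CONFIG.get(agent)
```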
357
+
358
+ Once you connect the generated config to your agent, the happy path should feel like normal chat:
114
359
 
115
- Once you paste the generated snippet into your agent, the happy path should feel like normal chat:
116
-
117
- - You: `Add refund status transitions to the order service.`
118
- - Agent: bootstraps the task contract through `start_agent_native_loop`
119
- - Agent: makes the change, runs required commands, updates evidence
120
- - Agent: finishes through `finish_agent_native_loop` and returns a reviewer-friendly summary with scope, risk, and future maintenance guidance
121
-
122
- If you do not know how to phrase the task yet, you can still start in plain Chinese or plain English:
123
-
124
- - `先帮我看看这个仓库最小能改哪里,尽量别扩大范围,最后告诉我还有什么风险。`
125
- - `帮我修这个问题,先读仓库规则,小范围改动,跑完测试后给我 reviewer summary。`
126
- - `I only have a rough idea. Please read the repo rules, find the smallest safe change, and finish with a reviewer summary.`
127
-
128
- The first recommended MCP flow is:
360
+ - You: `Add refund status transitions to the order service.`
361
+ - Agent: bootstraps the task contract through `start_agent_native_loop`
362
+ - Agent: makes the change, runs required commands, updates evidence
363
+ - Agent: finishes through `finish_agent_native_loop` and returns a reviewer-friendly summary with scope, risk, and future maintenance guidance
364
+
365
+ The first recommended MCP flow is:
129
366
 
130
367
  1. `read_repo_guardrails`
131
368
  2. `start_agent_native_loop`
@@ -168,6 +405,13 @@ If you are not sure about file paths, prefer the MCP flow first. The runtime can
168
405
 
169
406
  ## External Pilot Paths
170
407
 
408
+ If you want the current most opinionated happy path, use Claude Code first.
409
+ For broader pilot coverage, validate the same setup-first path across:
410
+
411
+ - `claude-code` as the primary path
412
+ - `cursor` and `codex` as secondary paths
413
+ - `openhands` and `openclaw` as supplementary paths
414
+
171
415
  Use the same setup-first loop for all five current agent entries:
172
416
 
173
417
  - `claude-code`
@@ -229,6 +473,7 @@ The flagship examples are:
229
473
  - the interface-drift demo in [examples/interface-drift-demo](./examples/interface-drift-demo)
230
474
  - the boundary-violation demo in [examples/boundary-violation-demo](./examples/boundary-violation-demo)
231
475
  - the source-test-relevance demo in [examples/source-test-relevance-demo](./examples/source-test-relevance-demo)
476
+ - the unified proof page in [docs/PROOF.md](./docs/PROOF.md)
232
477
  - the pilot write-up in [docs/REAL_REPO_PILOT.md](./docs/REAL_REPO_PILOT.md)
233
478
 
234
479
  Together they show:
@@ -240,6 +485,7 @@ Together they show:
240
485
  - the semantic layer can block a controller that crosses a declared module boundary even when the task contract still looks narrow
241
486
  - the semantic layer can tell the difference between "a test changed" and "the right test changed"
242
487
  - the same public CLI can surface deeper enforcement without splitting into a second product
488
+ - the same OSS runtime can produce deploy-readiness and post-deploy maintenance output in a Python/FastAPI repo before any Python semantic pack ships
243
489
 
244
490
  Run it with:
245
491
 
@@ -247,6 +493,12 @@ Run it with:
247
493
  node ./examples/bounded-scope-demo/scripts/run-demo.mjs all
248
494
  ```
249
495
 
496
+ Then run the Python/FastAPI baseline proof demo:
497
+
498
+ ```bash
499
+ npm run demo:python-fastapi
500
+ ```
501
+
250
502
  Then run the OSS benchmark suite:
251
503
 
252
504
  ```bash
@@ -272,90 +524,6 @@ npm run demo:boundary-violation
272
524
  npm run demo:source-test-relevance
273
525
  ```
274
526
 
275
- ## Manual CLI Workflow
276
-
277
- Use this docs-first loop in day-to-day work. Copy it, then replace only the task text and file paths:
278
-
279
- ```bash
280
- agent-guardrails plan --task "Add audit logging to the release approval endpoint" --required-commands "npm test,npm run lint"
281
- npm test
282
- npm run lint
283
- agent-guardrails check --commands-run "npm test,npm run lint" --review
284
- ```
285
-
286
- Add `--intended-files`, `--allowed-change-types`, or narrower `--allow-paths` only when you want a tighter task contract than the preset default.
287
-
288
- The intended low-friction flow is:
289
-
290
- 1. describe the task in plain language with `plan`
291
- 2. make the smallest change that fits the generated contract
292
- 3. run the commands you actually used
293
- 4. finish with the `check` command the runtime recommends
294
-
295
- If your repo does not have `origin/main`, use the branch that matches your default branch.
296
-
297
- Keep a short evidence note at `.agent-guardrails/evidence/current-task.md` with:
298
-
299
- - task name
300
- - commands run
301
- - notable results
302
- - residual risk or `none`
303
-
304
- Example:
305
-
306
- ```md
307
- # Task Evidence
308
-
309
- - Task: Add audit logging to the release approval endpoint
310
- - Commands run: npm test, npm run lint
311
- - Notable results: Tests and lint passed after updating the approval endpoint and audit assertions.
312
- - Residual risk: none
313
- ```
314
-
315
- ## CI and Automation Workflow
316
-
317
- For CI, hooks, or orchestrated agent runs, prefer machine-readable output:
318
-
319
- ```bash
320
- agent-guardrails check --base-ref origin/main --json
321
- ```
322
-
323
- If the workflow wants parity with locally reported commands, set:
324
-
325
- ```text
326
- AGENT_GUARDRAILS_COMMANDS_RUN=npm test,npm run lint
327
- ```
328
-
329
- The generated user-repo workflow template lives in [templates/base/workflows/agent-guardrails.yml](./templates/base/workflows/agent-guardrails.yml).
330
- The maintainer CI for this package lives in [guardrails.yml](./.github/workflows/guardrails.yml).
331
-
332
- For agent integrations, the recommended entry is the OSS MCP server:
333
-
334
- ```bash
335
- agent-guardrails mcp
336
- ```
337
-
338
- The MCP layer exposes the same runtime-backed judgment through these tools:
339
-
340
- - `read_repo_guardrails`
341
- - `suggest_task_contract`
342
- - `start_agent_native_loop`
343
- - `finish_agent_native_loop`
344
- - `run_guardrail_check`
345
- - `summarize_review_risks`
346
-
347
- The loop tools are the recommended OSS agent-native slice:
348
-
349
- - `start_agent_native_loop` bootstraps a runtime-backed contract, writes it to the repo, and seeds the evidence note
350
- - `finish_agent_native_loop` updates evidence, runs `check`, and returns a reviewer-friendly summary from the same judgment path
351
-
352
- That reviewer-facing result now also carries continuity guidance from the same OSS runtime:
353
-
354
- - reuse targets to extend first
355
- - new surface files that broaden the maintenance surface
356
- - continuity breaks that look like parallel abstractions or structure drift
357
- - future maintenance risks and continuity-specific next actions
358
-
359
527
  ## Production Baseline
360
528
 
361
529
  The current product direction is a generic, repo-local production baseline for AI-written code:
@@ -364,6 +532,7 @@ The current product direction is a generic, repo-local production baseline for A
364
532
  - `check` enforces small-scope, test-aware, evidence-backed, reviewable changes
365
533
  - `check --review` turns the same findings into a concise reviewer-oriented report
366
534
  - MCP and agent-native loop consumers reuse the same judgment path instead of re-implementing prompts
535
+ - the next production layer is deploy-readiness judgment plus post-deploy maintenance surface, not a separate deployment product
367
536
 
368
537
  This is intentionally generic-first. It relies on file-shape heuristics, repo policy, task contracts, and command/evidence enforcement rather than framework-specific AST logic.
369
538
 
@@ -396,172 +565,26 @@ The next technical step is conversation-first onboarding and stronger runtime-ba
396
565
 
397
566
  Paid tiers should extend the baseline rather than replace it:
398
567
 
399
- - `Pro Local`: semantic packs, auto task generation, richer local review, and maintenance-aware workflows
400
- - `Pro Cloud`: hosted review, shared policies, trend dashboards, and centralized governance
568
+ - `Pro Local`: semantic packs, auto task generation, richer local review, maintenance-aware workflows, and lower-touch deployment orchestration
569
+ - `Pro Cloud`: hosted review, shared policies, trend dashboards, deployment governance, and centralized orchestration
401
570
 
402
571
  Baseline merge-gate features stay open source.
403
572
 
404
- The first semantic pack lives publicly in this repo today as an early semantic milestone. It is positioned as the future `Pro Local` direction, not as a separate closed-source runtime.
573
+ That means the OSS core should keep owning the production-readiness gate:
405
574
 
406
- ## Supported Agents
575
+ - trust verdicts
576
+ - recovery / secrets-safe / cost-aware guidance
577
+ - deploy-readiness judgment
578
+ - release and deploy checklist visibility
579
+ - post-deploy maintenance summaries
407
580
 
408
- | Tool | Seeded file | Local workflow support | Automation guidance support |
409
- | :-- | :-- | :-- | :-- |
410
- | Codex | `AGENTS.md` | Yes | Yes |
411
- | Claude Code | `CLAUDE.md` | Yes | Yes |
412
- | Cursor | `.cursor/rules/agent-guardrails.mdc` | Yes | Yes |
413
- | OpenHands | `.agents/skills/agent-guardrails.md` | Yes | Yes |
414
- | OpenClaw | `OPENCLAW.md` | Yes | Yes |
581
+ Deployment orchestration itself remains a later automation layer on top of the same runtime, not a second product that bypasses it.
415
582
 
416
- ## CLI Commands
417
-
418
- ### `init`
419
-
420
- Seeds a repo with:
421
-
422
- - `AGENTS.md`
423
- - `docs/PROJECT_STATE.md`
424
- - `docs/PR_CHECKLIST.md`
425
- - `.agent-guardrails/config.json`
426
- - `.agent-guardrails/tasks/TASK_TEMPLATE.md`
427
- - `.github/workflows/agent-guardrails.yml`
428
-
429
- Example:
430
-
431
- ```bash
432
- agent-guardrails init . --preset nextjs --adapter openclaw
433
- ```
434
-
435
- If you are not sure what to type, start with `setup --agent <name>`, then use the manual flow only when you want to debug or inspect the runtime directly.
436
-
437
- ### `setup`
438
-
439
- Auto-initializes the repo when needed, generates the MCP config snippet for a supported agent, and tells you exactly how to start chatting.
440
-
441
- Example:
442
-
443
- ```bash
444
- agent-guardrails setup --agent cursor
445
- agent-guardrails setup --agent claude-code --preset nextjs
446
- ```
447
-
448
- The happy path is:
449
-
450
- 1. run `setup`
451
- 2. paste the snippet into your agent
452
- 3. ask for the task in chat
453
- 4. let the runtime use `read_repo_guardrails`, `start_agent_native_loop`, and `finish_agent_native_loop`
454
-
455
- ### `plan`
456
-
457
- Prints a bounded implementation brief and writes a task contract by default.
458
-
459
- Example:
460
-
461
- ```bash
462
- agent-guardrails plan --task "Add audit logging to the release approval endpoint" --allow-paths "src/,tests/" --intended-files "src/release/approve.js,tests/release/approve.test.js" --allowed-change-types "implementation-only" --risk-level medium --required-commands "npm test,npm run lint" --evidence ".agent-guardrails/evidence/current-task.md"
463
- ```
464
-
465
- ### `check`
466
-
467
- Runs baseline guardrail checks against the current repo and git working tree.
468
-
469
- Example:
470
-
471
- ```bash
472
- agent-guardrails check --base-ref origin/main --commands-run "npm test,npm run lint" --review
473
- ```
474
-
475
- For JSON output:
476
-
477
- ```bash
478
- agent-guardrails check --base-ref origin/main --json
479
- ```
480
-
481
- Minimal contract example:
482
-
483
- ```json
484
- {
485
- "schemaVersion": 3,
486
- "task": "Add audit logging to the release approval endpoint",
487
- "preset": "node-service",
488
- "allowedPaths": ["src/", "tests/"],
489
- "intendedFiles": ["src/release/approve.js", "tests/release/approve.test.js"],
490
- "allowedChangeTypes": ["implementation-only"],
491
- "riskLevel": "medium",
492
- "requiredCommands": ["npm test", "npm run lint"],
493
- "evidencePaths": [".agent-guardrails/evidence/current-task.md"]
494
- }
495
- ```
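As a hedged illustration of how a contract like the one above could be enforced, the sketch below loads the same JSON and reports which `requiredCommands` are absent from a comma-separated commands-run string (the format used by `--commands-run` and `AGENT_GUARDRAILS_COMMANDS_RUN`). The helper `missing_commands` is a hypothetical name; the real `check` command's matching rules are not documented here.

```python
import json

# Same fields as the minimal contract example above.
CONTRACT_JSON = """
{
  "schemaVersion": 3,
  "task": "Add audit logging to the release approval endpoint",
  "preset": "node-service",
  "allowedPaths": ["src/", "tests/"],
  "intendedFiles": ["src/release/approve.js", "tests/release/approve.test.js"],
  "allowedChangeTypes": ["implementation-only"],
  "riskLevel": "medium",
  "requiredCommands": ["npm test", "npm run lint"],
  "evidencePaths": [".agent-guardrails/evidence/current-task.md"]
}
"""


def missing_commands(contract: dict, commands_run: str) -> list:
    """List required commands that do not appear in the commands-run string."""
    # Split on commas and trim whitespace so "npm test, npm run lint" also works.
    ran = {cmd.strip() for cmd in commands_run.split(",") if cmd.strip()}
    return [cmd for cmd in contract["requiredCommands"] if cmd not in ran]


contract = json.loads(CONTRACT_JSON)
```

Calling `missing_commands(contract, "npm test")` would flag `npm run lint` as not yet run, which is the kind of gap a finish-time check is meant to surface.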
496
-
497
- ## Presets
498
-
499
- - `node-service`
500
- - `nextjs`
501
- - `python-fastapi`
502
- - `monorepo`
503
-
504
- Each preset adjusts file heuristics and recommended read-before-write paths while keeping the same mental model.
505
-
506
- ## Adapters
507
-
508
- The core workflow is generic, but `agent-guardrails` ships first-pass adapters for:
509
-
510
- - [Codex](./adapters/codex/README.md)
511
- - [Claude Code](./adapters/claude-code/README.md)
512
- - [Cursor](./adapters/cursor/README.md)
513
- - [OpenHands](./adapters/openhands/README.md)
514
- - [OpenClaw](./adapters/openclaw/README.md)
515
-
516
- For Codex, the default `AGENTS.md` workflow is already the main integration surface, so `--adapter codex` is a docs-level adapter rather than an extra seeded file.
517
-
518
- ## FAQ
519
-
520
- ### Do I need all adapters?
521
-
522
- No. Use only the adapter that matches your coding tool. The core workflow still works without tool-specific seed files.
523
-
524
- ### Do I need evidence files?
525
-
526
- Only when the task contract declares them. In the default docs-first workflow, the evidence note is intentionally lightweight and meant to record what actually happened.
527
-
528
- ### When should I use `--json`?
529
-
530
- Use `--json` for CI, hooks, or automation that needs machine-readable results. For normal local work, the human-readable output is the intended default.
531
-
532
- ### Does this work on Windows, Linux, and macOS?
533
-
534
- Yes. The published CLI is exercised in CI on all three platforms, and the primary install and workflow commands are platform-neutral:
535
-
536
- - `npm install -g agent-guardrails`
537
- - `npx agent-guardrails init . --preset node-service`
538
- - `npx agent-guardrails plan --task "..."`
539
- - `npx agent-guardrails check --review`
540
-
541
- Platform-specific commands only appear in docs when a shell-specific workaround is required.
542
-
543
- ### Why not just use another AI to recreate this?
544
-
545
- You can copy prompts and a chat wrapper.
546
- The harder part is copying a repo-aware runtime that keeps state across task bootstrap, validation, review, semantic drift checks, continuity guidance, and workflow integration.
547
-
548
- The value of `agent-guardrails` is not "one clever prompt."
549
- It is the merge-gate system that existing agent chats call while the runtime keeps getting more aligned to the repo over time.
550
-
551
- ### What if the global `agent-guardrails` command is not found?
552
-
553
- Use `npx agent-guardrails ...` first. That works across shells and avoids PATH differences between Windows, macOS, and Linux.
554
-
555
- ## Current Limits
583
+ The first semantic pack lives publicly in this repo today as an early semantic milestone. It is positioned as the future `Pro Local` direction, not as a separate closed-source runtime.
556
584
 
557
- This project is useful today as a repo-local guardrail layer, but it still has important limits:
585
+ ## Deeper Usage
558
586
 
559
- - the heuristics are still intentionally lightweight and may need tuning for larger repos
560
- - the semantic detectors are still string- and path-driven, not full AST or type-graph analyzers
561
- - module boundaries still depend on explicit repo policy instead of automatic architecture inference
562
- - source-to-test relevance is heuristic and should be treated as reviewer guidance plus contract enforcement, not coverage proof
563
- - CI users still need to choose their canonical base ref, such as `origin/main`
564
- - the current pilot is documented in [docs/REAL_REPO_PILOT.md](./docs/REAL_REPO_PILOT.md), and broader external pilots are still pending
587
+ For the full manual CLI flow, supported agents, presets, adapters, FAQ, and current limits, see [docs/WORKFLOWS.md](./docs/WORKFLOWS.md).
565
588
 
566
589
  ## Roadmap
567
590
 
@@ -573,6 +596,8 @@ See [docs/PRODUCT_STRATEGY.md](./docs/PRODUCT_STRATEGY.md) for the current seman
573
596
 
574
597
  ## More Docs
575
598
 
599
+ - [Proof](./docs/PROOF.md)
600
+ - [Workflows](./docs/WORKFLOWS.md)
576
601
  - [Automation Spec](./docs/AUTOMATION_SPEC.md)
577
602
  - [Market Research](./docs/MARKET_RESEARCH.md)
578
603
  - [Strategy](./docs/PRODUCT_STRATEGY.md)