@whitehatd/crag 0.0.1 → 0.2.0

package/README.md CHANGED
@@ -1,32 +1,881 @@
1
1
  # crag
2
2
 
3
+ [![npm version](https://img.shields.io/npm/v/%40whitehatd%2Fcrag?color=%23e8bb3a&label=npm&logo=npm)](https://www.npmjs.com/package/@whitehatd/crag)
4
+ [![Test](https://github.com/WhitehatD/crag/actions/workflows/test.yml/badge.svg)](https://github.com/WhitehatD/crag/actions/workflows/test.yml)
5
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](./LICENSE)
6
+ [![Node](https://img.shields.io/node/v/%40whitehatd%2Fcrag)](https://nodejs.org)
7
+ [![Zero dependencies](https://img.shields.io/badge/dependencies-0-brightgreen)](./package.json)
8
+ [![159 tests](https://img.shields.io/badge/tests-159%20passing-brightgreen)](./test)
9
+
3
10
  **The bedrock layer for AI coding agents. One `governance.md`. Any project. Never stale.**
4
11
 
5
- > This is a name-reservation placeholder. The full tool is in active development and will ship shortly.
12
+ Write your AI agent rules once. Enforce them in **Claude Code, Cursor, Copilot, Codex, Gemini, Aider, Cline, Continue, Windsurf, Zed, and Sourcegraph Cody** — plus your CI pipeline and git hooks. From a single 20-line file.
13
+
14
+ ```bash
15
+ npx @whitehatd/crag init # Interview → generate governance
16
+ npx @whitehatd/crag analyze # Or skip the interview: infer from existing project
17
+ npx @whitehatd/crag compile --target all # Output for 12 downstream tools
18
+ ```
19
+
20
+ > **The one-sentence pitch:** Every other AI coding tool ships static config files that hardcode your project's current shape. They rot. crag ships a runtime discovery engine plus a single governance file — the engine reads the filesystem every session so it never goes stale, and the governance is your rules, not your paths.
21
+
22
+ ---
23
+
24
+ ## The 12-target pitch, visually
25
+
26
+ ```
27
+ ┌──────────────────┐
28
+ │ governance.md │ ← you maintain this (20-30 lines)
29
+ │ one file │
30
+ └────────┬─────────┘
31
+
32
+ crag compile
33
+
34
+ ┌─────────────────────┼─────────────────────┐
35
+ │ │ │
36
+ ┌─────┴──────┐ ┌─────┴──────┐ ┌─────┴──────┐
37
+ │ CI / hooks │ │ AI native │ │ AI extras │
38
+ ├────────────┤ ├────────────┤ ├────────────┤
39
+ │ GitHub CI │ │ AGENTS.md │ │ Copilot │
40
+ │ husky │ │ Cursor │ │ Cline │
41
+ │ pre-commit │ │ Gemini │ │ Continue │
42
+ └────────────┘ └────────────┘ │ Windsurf │
43
+ │ Zed │
44
+ │ Cody │
45
+ └────────────┘
46
+ ```
47
+
48
+ Change one line in `governance.md`, re-run `crag compile --target all`, and 12 downstream configs regenerate. Your rules, your CI, your git hooks, and 9 different AI coding agents all stay in lock-step from a single source.
49
+
50
+ ---
51
+
52
+ ## Why "crag"?
53
+
54
+ A crag is a rocky outcrop — an unmoving landmark that stands while seasons, paths, and generations change around it. That's exactly what this tool is. Your skills discover. Your gates run. Your CI regenerates. But `governance.md` — the crag — doesn't move until you say so. Your AI agents anchor to it.
55
+
56
+ ---
57
+
58
+ ## Proven in Production
59
+
60
+ Not on demos. On real systems, in production, shipping to real infrastructure.
61
+
62
+ | Project | Stack | Services | Deployment | Result |
63
+ |---|---|---|---|---|
64
+ | **Leyoda** | Spring Boot + Next.js 16 + Python | Monolith + signal engine | Docker blue-green, NGINX | Discovered entire stack, 215-line governance, zero skill modification |
65
+ | **MetricHost** | Spring Boot + Next.js 16 | 11 microservices | Kubernetes (k3s), Kafka, Redis | 3-level governance hierarchy (root + backend + frontend), dual-repo |
66
+ | **StructuAI** | Node + Rust + Python + Java + React | 9 Docker Compose services | Docker Compose | 5 languages detected, all gates generated from interview |
67
+ | **crag** | Node.js CLI | Single module | npm | Scaffolds itself — full dogfooding, 159 tests, zero deps |
68
+
69
+ The same universal skills — written once, never modified per project — discovered a full-stack monolith with OAuth and blue-green deploys, an 11-microservice K8s platform with Stripe billing and Kafka event buses, a 5-language polyglot with Rust decoders and Puppeteer rendering, and a Node.js CLI. Zero project-specific instructions in the skills. They discovered everything.
70
+
71
+ ---
72
+
73
+ ## The Architecture
74
+
75
+ ```mermaid
76
+ flowchart TB
77
+ subgraph SHIPPED["Ships with crag (universal)"]
78
+ direction LR
79
+ PRE["pre-start skill\ndiscovers ANY project"]
80
+ POST["post-start skill\nvalidates using YOUR gates"]
81
+ end
82
+
83
+ subgraph GENERATED["Generated from interview or analyze (project-specific)"]
84
+ GOV["governance.md\n20-30 lines of YOUR rules"]
85
+ end
86
+
87
+ subgraph ALSO["Also generated"]
88
+ HOOKS["hooks/"]
89
+ AGENTS["agents/"]
90
+ SETTINGS["settings"]
91
+ end
92
+
93
+ PRE -->|"reads at runtime"| GOV
94
+ POST -->|"reads at runtime"| GOV
95
+
96
+ style SHIPPED fill:#1a3a1a,stroke:#00ff88,color:#eee
97
+ style GENERATED fill:#3a2a1a,stroke:#ffbb33,color:#eee
98
+ style GOV fill:#3a2a1a,stroke:#ffbb33,color:#eee
99
+ style ALSO fill:#0f3460,stroke:#00d2ff,color:#eee
100
+ ```
101
+
102
+ The skills ship once and work forever. They don't know your stack — they discover it. They don't know your gates — they read them from governance.md. Add a service, change your CI, switch frameworks — the skills adapt. Nothing to update.
103
+
104
+ ### The Core Insight: Discovery vs Governance
105
+
106
+ Every other tool in this space mixes "how to find things" with "what to enforce." crag separates them cleanly:
107
+
108
+ - **Discovery** (universal skills) — reads the filesystem, detects runtimes, maps architecture, finds configs. Works on any project without modification.
109
+ - **Governance** (your `governance.md`) — defines YOUR rules: quality gates, security requirements, branch strategy, deployment pipeline. Changes only when YOU change it.
110
+
111
+ The skills handle discovery. `governance.md` handles governance. The skills never go stale because they re-discover every session. The governance never goes stale because it's your standards, not your file paths.
112
+
113
+ ---
114
+
115
+ ## Quick Start
116
+
117
+ ```bash
118
+ # Install once globally (the package is scoped; the binary name is `crag`)
119
+ npm install -g @whitehatd/crag
120
+
121
+ # Or use via npx (no install)
122
+ npx @whitehatd/crag init
123
+
124
+ # After install, all commands use the plain `crag` binary
125
+ crag init # Interview → generate governance + hooks + agents
126
+ crag analyze # Zero-interview: infer governance from existing project
127
+ crag check # Verify infrastructure
128
+ crag diff # Compare governance against codebase reality
129
+ crag upgrade # Update universal skills (with hash-based conflict detection)
130
+ crag workspace # Inspect detected workspace
131
+ crag compile --target all # Compile governance → CI, hooks, and 9 AI agent configs
132
+ crag install # Install interview agent globally for /crag-project
133
+ ```
134
+
135
+ After setup, in any Claude Code session:
136
+ ```bash
137
+ /pre-start-context # Discovers project, loads governance, ready to work
138
+ # ... do your task ...
139
+ /post-start-validation # Validates, captures knowledge, commits, deploys
140
+ ```
141
+
142
+ ---
143
+
144
+ ## User Guide
145
+
146
+ ### Installation
147
+
148
+ crag is a zero-dependency Node.js CLI. You don't need to install it — run it via `npx`:
149
+
150
+ ```bash
151
+ npx @whitehatd/crag <command>
152
+ ```
153
+
154
+ Or install globally:
155
+ ```bash
156
+ npm install -g @whitehatd/crag
157
+ crag <command>
158
+ ```
159
+
160
+ The package is published under a scope (`@whitehatd/crag`) but the binary name remains `crag`, so after installation all commands work as `crag init`, `crag analyze`, etc.
161
+
162
+ **Requirements:**
163
+ - Node.js 18+ (uses built-in `https`, `crypto`, `fs`, `child_process`)
164
+ - Git (for branch strategy inference and discovery cache)
165
+ - Claude Code CLI (`claude --version`) — only needed for `crag init`
166
+
167
+ ### Choosing Your Entry Point
168
+
169
+ crag has two commands for generating governance, `crag init` and `crag analyze`. Pick an entry point based on your situation:
170
+
171
+ | Situation | Command | What happens |
172
+ |-----------|---------|--------------|
173
+ | New project, unsure of standards | `crag init` | Interactive interview — agent asks about your stack, quality bar, security, deployment |
174
+ | Existing project with CI/linters already configured | `crag analyze` | Zero-interview mode — reads your CI workflows, package.json scripts, linter configs, git history |
175
+ | Want to see what would be generated | `crag analyze --dry-run` | Prints inferred governance without writing |
176
+ | Already have governance, want to add inferred gates | `crag analyze --merge` | Preserves existing governance, appends inferred additions |
177
+ | Monorepo with sub-projects | `crag analyze --workspace` | Analyzes root + every workspace member |
178
+
179
+ ### Command Reference
180
+
181
+ #### `crag init` — Interactive Setup
182
+
183
+ Runs an interview agent that asks about your project, then generates all infrastructure:
184
+
185
+ ```bash
186
+ cd your-project
187
+ npx @whitehatd/crag init
188
+ ```
189
+
190
+ **What gets generated:**
191
+ - `.claude/skills/pre-start-context/SKILL.md` — universal discovery skill
192
+ - `.claude/skills/post-start-validation/SKILL.md` — universal validation skill
193
+ - `.claude/governance.md` — your rules (from interview answers)
194
+ - `.claude/hooks/` — sandbox-guard, drift-detector, circuit-breaker, auto-post-start
195
+ - `.claude/agents/` — test-runner, security-reviewer, skill-auditor
196
+ - `.claude/settings.local.json` — permissions + hook wiring
197
+ - `.claude/ci-playbook.md` — empty template for known CI failures
198
+
199
+ After init, the skills are ready to use in any Claude Code session via `/pre-start-context`.
200
+
201
+ #### `crag analyze` — Zero-Interview Governance
202
+
203
+ Generates `governance.md` from your existing project without asking questions:
204
+
205
+ ```bash
206
+ crag analyze # Generate .claude/governance.md
207
+ crag analyze --dry-run # Preview without writing
208
+ crag analyze --workspace # Analyze all workspace members
209
+ crag analyze --merge # Merge with existing governance
210
+ ```
211
+
212
+ **What it detects:**
213
+ - **Stack:** Node, Rust, Python, Java, Go, Docker (from manifests)
214
+ - **Gates from CI:** parses `.github/workflows/*.yml` (recursively) for `run:` steps including multiline `run: |` blocks
215
+ - **Gates from scripts:** `package.json` `test`, `lint`, `build`, `format`, `typecheck`
216
+ - **Linters:** ESLint, Biome, Prettier, Ruff, Clippy, Rustfmt, Mypy, TypeScript
217
+ - **Branch strategy:** feature branches vs trunk-based (from git history)
218
+ - **Commit convention:** conventional vs free-form (from git log)
219
+ - **Deployment:** Docker, Kubernetes, Vercel, Fly.io, Netlify, Render, Terraform
220
+
221
+ Review every output section marked `# Inferred` before relying on it; those values were inferred from project signals rather than stated by you.
222
+
223
+ #### `crag check` — Verify Infrastructure
224
+
225
+ Lists all core and optional files, shows which are present:
226
+
227
+ ```bash
228
+ crag check
229
+ ```
230
+
231
+ Run this after `crag init` to verify everything was generated, or any time you're unsure if the setup is complete.
232
+
233
+ #### `crag compile` — Export Governance (12 targets)
234
+
235
+ Compiles your `governance.md` to multiple formats:
236
+
237
+ ```bash
238
+ # CI / git hooks
239
+ crag compile --target github # .github/workflows/gates.yml
240
+ crag compile --target husky # .husky/pre-commit
241
+ crag compile --target pre-commit # .pre-commit-config.yaml
242
+
243
+ # AI coding agents — native formats
244
+ crag compile --target agents-md # AGENTS.md (Codex, Aider, Factory)
245
+ crag compile --target cursor # .cursor/rules/governance.mdc
246
+ crag compile --target gemini # GEMINI.md
247
+
248
+ # AI coding agents — additional formats
249
+ crag compile --target copilot # .github/copilot-instructions.md
250
+ crag compile --target cline # .clinerules
251
+ crag compile --target continue # .continuerules
252
+ crag compile --target windsurf # .windsurfrules
253
+ crag compile --target zed # .zed/rules.md
254
+ crag compile --target cody # .sourcegraph/cody-instructions.md
255
+
256
+ crag compile --target all # All 12 targets at once
257
+ crag compile # List available targets
258
+ ```
259
+
260
+ **Why this matters:** one `governance.md` becomes your CI workflow, your git hooks, and configuration for **9 different AI coding agents**. Change a gate once, recompile, and every downstream tool sees the update. The generator detects Node/Python/Java/Go versions from your project files (`package.json engines.node`, `pyproject.toml requires-python`, `build.gradle.kts` toolchain, `go.mod` directive) instead of hardcoding defaults.
261
+
262
+ Gate classifications control behavior per target:
263
+ - `# [MANDATORY]` (default) — stop on failure
264
+ - `# [OPTIONAL]` — warn via `continue-on-error: true` (GitHub) or wrapper (husky/pre-commit)
265
+ - `# [ADVISORY]` — log result, never block
266
+
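As a rough illustration of the three classifications, the sketch below parses a gate line and decides what a runner would do on failure. The helper names `parseGate` and `onFailure` are hypothetical and are not part of crag's documented API; only the `# [MANDATORY]`, `# [OPTIONAL]`, and `# [ADVISORY]` markers and their described behavior come from the text above.

```javascript
// Hypothetical helper: splits "- <command> # [CLASS] note" into its parts.
// A gate with no marker is treated as MANDATORY, the documented default.
function parseGate(line) {
  const body = line.replace(/^\s*-\s+/, "");
  const match = body.match(/^(.*?)\s+#\s*\[(MANDATORY|OPTIONAL|ADVISORY)\]/);
  if (!match) return { command: body.trim(), classification: "MANDATORY" };
  return { command: match[1].trim(), classification: match[2] };
}

// Hypothetical policy table: what a runner does when a gate fails.
function onFailure(classification) {
  switch (classification) {
    case "OPTIONAL":
      return "warn"; // report the failure but keep going
    case "ADVISORY":
      return "log"; // record the result, never block
    default:
      return "stop"; // MANDATORY: halt on failure
  }
}

const gate = parseGate("- npx tsc --noEmit # [OPTIONAL] warn but don't fail");
console.log(gate.command, "->", onFailure(gate.classification));
// npx tsc --noEmit -> warn
```

How each compile target expresses "warn" or "log" (for example `continue-on-error: true` on GitHub Actions) is target-specific and not modeled here.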
267
+ #### `crag diff` — Governance Drift Detection
268
+
269
+ Compares `governance.md` against codebase reality:
270
+
271
+ ```bash
272
+ crag diff
273
+ ```
274
+
275
+ ```
276
+ MATCH node --check bin/crag.js
277
+ DRIFT ESLint referenced but biome.json found
278
+ MISSING CI gate: cargo test (in governance, not in CI)
279
+ EXTRA docker build (in CI, not in governance)
280
+
281
+ 3 match, 1 drift, 1 missing, 1 extra
282
+ ```
283
+
284
+ Command alias normalization means `npm test` and `npm run test` are treated as equivalent, as are `./gradlew` and `gradlew`.
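A minimal sketch of what such normalization could look like, assuming only the two alias pairs named above. The function names `normalizeCommand` and `sameGate` are illustrative, and crag's real rule set may cover more cases.

```javascript
// Hypothetical sketch, not crag's actual implementation.
function normalizeCommand(command) {
  return command
    .trim()
    .replace(/\s+/g, " ") // collapse repeated whitespace
    .replace(/^npm run test(?=\s|$)/, "npm test") // `npm run test` -> `npm test`
    .replace(/^\.\/gradlew(?=\s|$)/, "gradlew"); // `./gradlew` -> `gradlew`
}

// Two gates are considered the same when their normalized forms match.
function sameGate(a, b) {
  return normalizeCommand(a) === normalizeCommand(b);
}

console.log(sameGate("npm run test", "npm test")); // true
console.log(sameGate("./gradlew build", "gradlew build")); // true
console.log(sameGate("npm run lint", "npm test")); // false
```

The lookahead `(?=\s|$)` keeps scripts such as `npm run test:unit` from being rewritten by accident.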
285
+
286
+ #### `crag upgrade` — Update Skills
287
+
288
+ Updates universal skills in the current project to the latest version:
289
+
290
+ ```bash
291
+ crag upgrade # Update skills in current project
292
+ crag upgrade --check # Dry run — show what would change
293
+ crag upgrade --workspace # Update all workspace members
294
+ crag upgrade --force # Overwrite locally modified skills (creates backup)
295
+ ```
296
+
297
+ **How it works:**
298
+ - Skills track their version in YAML frontmatter (`version: 0.2.1`)
299
+ - A `source_hash` (SHA-256, CRLF-normalized) detects local modifications
300
+ - If you modified a skill locally, upgrade won't overwrite it without `--force`
301
+ - When force-overwriting, a timestamped backup is created (`SKILL.md.bak.1712252400`)
302
+ - Global 24-hour cache at `~/.claude/crag/update-check.json`
303
+ - Opt-out: `CRAG_NO_UPDATE_CHECK=1`
304
+
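The cache and backup rules above can be sketched as two small functions. This is an illustration under stated assumptions: the `checkedAt` field is a hypothetical name for whatever timestamp `update-check.json` stores, and the backup suffix is assumed to be Unix seconds because the example `SKILL.md.bak.1712252400` looks like one.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Decide whether to query the registry again.
// `cache` is the parsed cache file (or null), `env` is process.env-like,
// `now` is a millisecond timestamp. `checkedAt` is a hypothetical field name.
function shouldCheckForUpdate(cache, env, now) {
  if (env.CRAG_NO_UPDATE_CHECK === "1") return false; // documented opt-out
  if (!cache || typeof cache.checkedAt !== "number") return true; // no usable cache
  return now - cache.checkedAt >= DAY_MS; // stale after 24 hours
}

// Build a timestamped backup name, assuming a Unix-seconds suffix.
function backupName(fileName, now) {
  return `${fileName}.bak.${Math.floor(now / 1000)}`;
}

console.log(backupName("SKILL.md", 1712252400000)); // SKILL.md.bak.1712252400
```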
305
+ #### `crag workspace` — Inspect Workspace
306
+
307
+ Shows the detected workspace, all members, their tech stacks, and governance hierarchy:
308
+
309
+ ```bash
310
+ crag workspace # Human-readable
311
+ crag workspace --json # Machine-readable JSON (for CI/scripting)
312
+ ```
313
+
314
+ Example output:
315
+ ```
316
+ Workspace: npm
317
+ Root: /path/to/monorepo
318
+ Config: package.json
319
+ Members: 3
320
+ Root governance: 2 gate section(s), runtimes: node
321
+
322
+ Members:
323
+ ✓ backend [node]
324
+ packages/backend
325
+ ✓ frontend [node] (inherits)
326
+ packages/frontend
327
+ ○ shared [node]
328
+ packages/shared
329
+ ```
330
+
331
+ Use this to debug workspace detection or understand governance inheritance in monorepos.
332
+
333
+ #### `crag install` — Install Global Agent
334
+
335
+ Installs the `crag-project` interview agent to `~/.claude/agents/` so you can invoke it with `/crag-project` from any Claude Code session:
336
+
337
+ ```bash
338
+ crag install
339
+ ```
340
+
341
+ #### `crag version` / `crag help`
342
+
343
+ ```bash
344
+ crag version # Print version
345
+ crag help # Print usage
346
+ ```
347
+
348
+ ### The Session Loop
349
+
350
+ Once crag is set up, your workflow in any Claude Code session becomes:
351
+
352
+ ```
353
+ 1. /pre-start-context → Discovers project, loads governance, checks skill currency
354
+ 2. ... your task ... → Write code, fix bugs, add features
355
+ 3. /post-start-validation → Runs gates, security review, captures knowledge, commits
356
+ ```
357
+
358
+ **Pre-start does:**
359
+ - Detects workspace type (pnpm, npm/yarn, Cargo, Go, Gradle, Maven, Nx, Turbo, Bazel, submodules, nested repos)
360
+ - Enumerates members and checks for multi-level governance
361
+ - Detects runtime versions (Node, Java, Python, Go, Rust, Docker)
362
+ - Reads `governance.md` and applies rules for the session
363
+ - Loads cross-session memory (if MemStack enabled)
364
+ - Checks skill currency — notifies if `crag upgrade` available
365
+
366
+ **Post-start does:**
367
+ - Runs governance gates in order (stops on MANDATORY failure; logs OPTIONAL/ADVISORY)
368
+ - Auto-fixes mechanical errors (lint, format) with bounded retry
369
+ - Runs security review (grep for secrets, check new endpoints)
370
+ - Captures knowledge (insights, sessions) if MemStack enabled
371
+ - Commits with conventional commit format
372
+ - Writes `.session-state.json` for next session's warm start
373
+
374
+ ### Common Workflows
375
+
376
+ **Workflow 1: Add crag to an existing project**
377
+ ```bash
378
+ cd my-existing-project
379
+ npx @whitehatd/crag analyze --dry-run # Preview what it would generate
380
+ npx @whitehatd/crag analyze # Write .claude/governance.md
381
+ # Review the generated file, adjust as needed
382
+ npx @whitehatd/crag check # Verify infrastructure
383
+ # Use /pre-start-context in Claude Code
384
+ ```
385
+
386
+ **Workflow 2: Start a brand new project**
387
+ ```bash
388
+ mkdir my-new-project && cd my-new-project
389
+ git init
390
+ npx @whitehatd/crag init # Interactive interview
391
+ # Follow the prompts — agent asks ~20 questions
392
+ # Skills + hooks + agents are all generated
393
+ npx @whitehatd/crag check
394
+ ```
395
+
396
+ **Workflow 3: Monorepo with per-service governance**
397
+ ```bash
398
+ cd my-monorepo
399
+ npx @whitehatd/crag workspace # See detected type + members
400
+ npx @whitehatd/crag init # Root-level governance
401
+ cd packages/backend
402
+ npx @whitehatd/crag analyze --merge # Add backend-specific gates
403
+ cd ../../packages/frontend
404
+ npx @whitehatd/crag analyze --merge # Add frontend-specific gates
405
+ # Now each package has its own governance.md, and root has cross-cutting rules
406
+ ```
407
+
408
+ **Workflow 4: Keep everything current**
409
+ ```bash
410
+ npx @whitehatd/crag upgrade --check # See what would update
411
+ npx @whitehatd/crag upgrade # Apply updates (preserves local changes)
412
+ npx @whitehatd/crag diff # Check governance hasn't drifted
413
+ npx @whitehatd/crag compile --target all # Regenerate CI workflows, hooks, cross-agent files
414
+ ```
415
+
416
+ **Workflow 5: Switch AI tools (Claude → Cursor → Gemini)**
417
+ ```bash
418
+ npx @whitehatd/crag compile --target agents-md # Generate AGENTS.md
419
+ npx @whitehatd/crag compile --target cursor # Generate .cursor/rules/
420
+ npx @whitehatd/crag compile --target gemini # Generate GEMINI.md
421
+ # Same governance rules now work in Codex, Cursor, Gemini CLI, Aider, Factory
422
+ ```
6
423
 
7
- ## What crag will be
424
+ ### Troubleshooting
8
425
 
9
- crag inverts how AI coding setups work. Every AI agent today reads static config files — CLAUDE.md, AGENTS.md, Cursor rules, Gemini docs. They hardcode facts about your project. Facts change. Instructions rot.
426
+ **Q: `crag init` says "Claude Code CLI not found"**
427
+ A: Install Claude Code from https://claude.com/claude-code. Only `init` needs it; other commands don't.
10
428
 
11
- crag ships **universal skills** that discover any project at runtime — any language, any framework, any deployment target — and reads your rules from a single `governance.md` file. The skills are the engine. The governance is the crag the ancient landmark that doesn't drift while everything else changes.
429
+ **Q: `crag upgrade` shows "locally modified" and won't update**
430
+ A: You edited a skill file. Either (1) accept that your edits are preserved and stay on the old version, or (2) run `crag upgrade --force` to overwrite (backup is created).
12
431
 
13
- ## What's shipping
432
+ **Q: `crag analyze` generates nothing useful**
433
+ A: It needs signals — CI configs, `package.json` scripts, linter configs. For greenfield projects, use `crag init` for the interview flow instead.
14
434
 
15
- - `crag init` interview-driven setup
16
- - `crag analyze` zero-interview governance generation from existing CI/scripts/git
17
- - `crag diff` — drift detection
18
- - `crag upgrade` — version-tracked skill updates
19
- - `crag workspace` — 11+ workspace type detection (pnpm, Cargo, Go, Gradle, Nx, Turbo, Bazel, submodules, nested repos)
20
- - `crag compile` — compile one governance.md to 6 targets: GitHub Actions, husky, pre-commit, AGENTS.md, Cursor, Gemini
435
+ **Q: `crag diff` reports drift but my CI is working**
436
+ A: Drift means `governance.md` says one thing and the codebase uses another. Either update `governance.md` to match reality, or update the codebase to match governance. Both are valid.
21
437
 
22
- Proven in production on 4 projects (StructuAI, Leyoda, MetricHost, crag itself). 117 tests. Zero dependencies.
438
+ **Q: Skills don't auto-update when I run `/pre-start-context`**
439
+ A: Updates are applied by the CLI, not by the skill itself. The pre-start skill only reports the installed skill version and notifies you when an upgrade is available; run `crag upgrade` from your terminal to apply it.
23
440
 
24
- ## Status
441
+ **Q: Multi-level governance not merging correctly**
442
+ A: Check that member governance files use `## Gates (inherit: root)` to opt in to inheritance. Without this marker, member governance replaces root.
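To make the opt-in rule concrete, here is a small sketch of how a merge could behave with and without the marker. The function names and the root-first ordering are assumptions for illustration; only the `## Gates (inherit: root)` marker and the replace-versus-merge behavior come from the answer above.

```javascript
// True when the member governance opts in with "## Gates (inherit: root)".
function inheritsRoot(memberMarkdown) {
  return /^##\s+Gates\s+\(inherit:\s*root\)/m.test(memberMarkdown);
}

// With the marker, root gates are kept and member gates are added after them
// (ordering is an assumption). Without it, the member's gates replace root's.
function effectiveGates(rootGates, memberGates, memberMarkdown) {
  return inheritsRoot(memberMarkdown)
    ? [...rootGates, ...memberGates]
    : [...memberGates];
}

const root = ["npm test"];
const member = ["npx biome check ."];
console.log(effectiveGates(root, member, "## Gates (inherit: root)\n- npx biome check ."));
// [ 'npm test', 'npx biome check .' ]
console.log(effectiveGates(root, member, "## Gates\n- npx biome check ."));
// [ 'npx biome check .' ]
```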
25
443
 
26
- Name reserved on npm. Full release coming soon.
444
+ ---
27
445
 
28
- **Follow:** https://github.com/WhitehatD/crag
446
+ ## governance.md
447
+
448
+ The only file you maintain. 20-30 lines. Everything else is universal.
449
+
450
+ ```markdown
451
+ # Governance — StructuAI
452
+
453
+ ## Identity
454
+ - Project: StructuAI
455
+ - Description: AI-powered Minecraft schematic describer
456
+
457
+ ## Gates (run in order, stop on failure)
458
+ ### Frontend
459
+ - npx eslint frontend/ --max-warnings 0
460
+ - cd frontend && npx vite build
461
+
462
+ ### Backend
463
+ - node --check scripts/api-server.js scripts/worker.js scripts/queue.js
464
+ - cargo clippy --manifest-path source/decode/Cargo.toml
465
+ - cargo test --manifest-path source/decode/Cargo.toml
466
+
467
+ ### Infrastructure
468
+ - docker compose config --quiet
469
+
470
+ ## Branch Strategy
471
+ - Trunk-based, conventional commits
472
+ - Auto-commit after all gates pass
473
+
474
+ ## Security
475
+ - Schematic file uploads only (validate file type server-side)
476
+ - No hardcoded secrets or API keys in source
477
+ ```
478
+
479
+ Change a gate or add a security rule, and it takes effect the next time a skill runs. The skills read this file every time they run, so they never work from cached, stale instructions.
480
+
481
+ ### Governance v2 annotations (optional)
482
+
483
+ Gate sections support optional annotations for workspace-aware execution:
484
+
485
+ ```markdown
486
+ ## Gates (run in order, stop on failure)
487
+ ### Frontend (path: frontend/) # cd to frontend/ before running
488
+ - npx biome check . # [MANDATORY] (default)
489
+ - npx tsc --noEmit # [OPTIONAL] — warn but don't fail
490
+
491
+ ### TypeScript (if: tsconfig.json) # skip section if file doesn't exist
492
+ - npx tsc --noEmit
493
+
494
+ ### Audit
495
+ - npm audit # [ADVISORY] — informational only
496
+
497
+ ## Gates (inherit: root) # merge with root governance
498
+ ```
499
+
500
+ All annotations are optional. Existing governance files work unchanged. Classifications are honored by all compile targets (GitHub Actions `continue-on-error`, husky/pre-commit wrapper scripts).
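The section-level annotations (`path:` and `if:`) could be read with a parser along these lines. This is a sketch only: `parseSectionHeading` and `shouldRunSection` are hypothetical names, and the real parser may accept more forms than the two shown in the example above.

```javascript
// Parse "### Name (path: dir/)" or "### Name (if: file)" headings.
// A trailing "# comment" separated by whitespace is ignored.
function parseSectionHeading(line) {
  const match = line.match(
    /^###\s+(.+?)(?:\s+\((path|if):\s*([^)]+)\))?(?:\s+#.*)?\s*$/
  );
  if (!match) return null;
  const [, name, key, value] = match;
  return { name, annotation: key ? { key, value: value.trim() } : null };
}

// An `if:` section is skipped when its file is missing; `fileExists` is injected
// so the decision can be exercised without touching the filesystem.
function shouldRunSection(section, fileExists) {
  if (section.annotation && section.annotation.key === "if") {
    return fileExists(section.annotation.value);
  }
  return true;
}

const section = parseSectionHeading("### TypeScript (if: tsconfig.json) # skip if missing");
console.log(section.name, section.annotation.key, section.annotation.value);
// TypeScript if tsconfig.json
console.log(shouldRunSection(section, () => false)); // false
```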
501
+
502
+ ### Multi-level governance (monorepos)
503
+
504
+ For projects with multiple sub-repos or services, governance can be hierarchical:
505
+
506
+ ```
507
+ project-root/
508
+ ├── .claude/governance.md # Cross-stack: branch strategy, deployment, security
509
+ ├── backend/.claude/governance.md # Backend-specific: Gradle gates, service tests
510
+ └── frontend/.claude/governance.md # Frontend-specific: Biome, Vitest, responsive audit
511
+ ```
512
+
513
+ Each level gets the same universal skills. Each reads its own `governance.md`. Open Claude Code at the root — get the cross-stack view. Open it in `backend/` — get backend-specific gates. The skills adapt to wherever you are.
514
+
515
+ ---
516
+
517
+ ## Workspace Detection
518
+
519
+ crag auto-detects 11+ workspace types:
520
+
521
+ | Marker | Workspace Type |
522
+ |--------|----------------|
523
+ | `pnpm-workspace.yaml` | pnpm |
524
+ | `package.json` with `"workspaces"` | npm/yarn |
525
+ | `Cargo.toml` with `[workspace]` | Cargo |
526
+ | `go.work` | Go |
527
+ | `settings.gradle.kts` with `include(` | Gradle |
528
+ | `pom.xml` with `<modules>` | Maven |
529
+ | `nx.json` | Nx |
530
+ | `turbo.json` | Turborepo |
531
+ | `WORKSPACE` / `MODULE.bazel` | Bazel |
532
+ | `.gitmodules` | Git submodules |
533
+ | Multiple child `.git` dirs | Independent repos |
534
+
535
+ Workspace members are enumerated, checked for their own `.claude/governance.md`, and scanned for their tech stacks. When a member opts in to inheritance, multi-level governance merges root gates (mandatory) with member gates (additive).
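Marker-based detection of this kind can be sketched in a few lines of Node.js. The example covers only a subset of the table, and the precedence order, function names, and regular expressions are assumptions rather than crag's actual logic.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read a file as text, or return "" when it does not exist.
function readIfPresent(dir, name) {
  try {
    return fs.readFileSync(path.join(dir, name), "utf8");
  } catch {
    return "";
  }
}

const exists = (dir, name) => fs.existsSync(path.join(dir, name));

// Subset of the marker table, checked top to bottom (the order is an assumption).
const MARKERS = [
  ["pnpm", (dir) => exists(dir, "pnpm-workspace.yaml")],
  ["npm/yarn", (dir) => /"workspaces"\s*:/.test(readIfPresent(dir, "package.json"))],
  ["Cargo", (dir) => /^\[workspace\]/m.test(readIfPresent(dir, "Cargo.toml"))],
  ["Go", (dir) => exists(dir, "go.work")],
  ["Nx", (dir) => exists(dir, "nx.json")],
  ["Turborepo", (dir) => exists(dir, "turbo.json")],
  ["Git submodules", (dir) => exists(dir, ".gitmodules")],
];

// Return the first matching workspace type, or null for a plain project.
function detectWorkspace(dir) {
  const hit = MARKERS.find(([, test]) => test(dir));
  return hit ? hit[0] : null;
}

console.log(detectWorkspace(process.cwd()));
```

A real detector would also need the content checks for Gradle and Maven and the scan for multiple child `.git` directories, which are left out here for brevity.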
536
+
537
+ ---
538
+
539
+ ## Governance Compiler — 12 Targets
540
+
541
+ `governance.md` is agent-readable. But the gates in it are just shell commands — they can also drive your CI pipeline, git hooks, and configuration for **9 different AI coding agents**. One source of truth, twelve outputs:
542
+
543
+ ### Full target list
544
+
545
+ | Group | Target | Output path | Consumed by |
546
+ |---|---|---|---|
547
+ | **CI** | `github` | `.github/workflows/gates.yml` | GitHub Actions |
548
+ | **CI** | `husky` | `.husky/pre-commit` | husky pre-commit framework |
549
+ | **CI** | `pre-commit` | `.pre-commit-config.yaml` | pre-commit.com framework |
550
+ | **AI native** | `agents-md` | `AGENTS.md` | Codex, Aider, Factory, and any tool reading `AGENTS.md` |
551
+ | **AI native** | `cursor` | `.cursor/rules/governance.mdc` | Cursor |
552
+ | **AI native** | `gemini` | `GEMINI.md` | Google Gemini CLI |
553
+ | **AI extras** | `copilot` | `.github/copilot-instructions.md` | GitHub Copilot (VS Code, JetBrains, Visual Studio, Copilot Workspace) |
554
+ | **AI extras** | `cline` | `.clinerules` | Cline (VS Code extension) |
555
+ | **AI extras** | `continue` | `.continuerules` | Continue.dev |
556
+ | **AI extras** | `windsurf` | `.windsurfrules` | Windsurf IDE (Codeium) |
557
+ | **AI extras** | `zed` | `.zed/rules.md` | Zed Editor AI assistant |
558
+ | **AI extras** | `cody` | `.sourcegraph/cody-instructions.md` | Sourcegraph Cody |
559
+
560
+ ```bash
561
+ crag compile --target all # Generate all 12 at once
562
+ crag compile --target github # Or pick one
563
+ crag compile # Or list available targets
564
+ ```
565
+
566
+ The compiler parses your gates, auto-detects runtimes from the commands (Node, Rust, Python, Java, Go, Docker), and generates the right setup steps with proper version inference from your project files (not hardcoded defaults). Human-readable `Verify X contains Y` gates are compiled to `grep` commands automatically (with shell-injection-safe escaping). All 12 targets write atomically (temp file + rename) so partial failures leave the old state intact.
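As an illustration of the `Verify X contains Y` translation, the sketch below turns such a gate into a quoted `grep` command. It assumes X is a file path without spaces and Y is a literal string; the function names and the exact `grep` flags are choices made for this example, not a description of crag's generated output.

```javascript
// Wrap a value in single quotes for POSIX shells.
// An embedded single quote becomes '\'' so it cannot end the quoting early.
function shellQuote(value) {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical gate syntax: "Verify <file> contains <text>".
// Returns null for anything that is not a Verify gate.
function compileVerifyGate(gate) {
  const match = gate.match(/^Verify\s+(\S+)\s+contains\s+(.+)$/);
  if (!match) return null;
  const [, file, text] = match;
  // -q: quiet, -F: fixed string (no regex), --: end of options
  return `grep -qF -- ${shellQuote(text)} ${shellQuote(file)}`;
}

console.log(compileVerifyGate("Verify README.md contains MIT License"));
// grep -qF -- 'MIT License' 'README.md'
```

Quoting both arguments means text such as `'; rm -rf /` stays inside the search string instead of being interpreted by the shell.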
567
+
568
+ ```mermaid
569
+ flowchart LR
570
+ GOV["governance.md\n(one file)"]
571
+
572
+ subgraph CI["CI / git hooks"]
573
+ GH["gates.yml"]
574
+ HK["husky"]
575
+ PC["pre-commit"]
576
+ end
577
+
578
+ subgraph NATIVE["AI native"]
579
+ AM["AGENTS.md"]
580
+ CR["Cursor MDC"]
581
+ GM["GEMINI.md"]
582
+ end
583
+
584
+ subgraph EXTRAS["AI extras"]
585
+ CO["Copilot"]
586
+ CL["Cline"]
587
+ CN["Continue"]
588
+ WS["Windsurf"]
589
+ ZE["Zed"]
590
+ CY["Cody"]
591
+ end
592
+
593
+ GOV --> CI
594
+ GOV --> NATIVE
595
+ GOV --> EXTRAS
596
+ GOV -->|"read at runtime"| SKILL["Universal skills"]
597
+
598
+ style GOV fill:#3a2a1a,stroke:#ffbb33,color:#eee
599
+ style CI fill:#0f3460,stroke:#00d2ff,color:#eee
600
+ style NATIVE fill:#1a3a1a,stroke:#00ff88,color:#eee
601
+ style EXTRAS fill:#2a1a3a,stroke:#bb86fc,color:#eee
602
+ style SKILL fill:#1a3a1a,stroke:#00ff88,color:#eee
603
+ ```
604
+
605
+ The result is governance-as-config: a single 20-line file that compiles to agent behavior, CI/CD pipelines, and **9 different AI coding tool configs**.
606
+
607
+ ---
608
+
609
+ ## Zero-Interview Mode
610
+
611
+ Don't want an interview? `crag analyze` generates governance from your existing project:
612
+
613
+ ```bash
614
+ crag analyze # Infer governance from codebase + CI
615
+ crag analyze --dry-run # Preview without writing
616
+ crag analyze --workspace # Analyze all workspace members
617
+ crag analyze --merge # Merge with existing governance
618
+ ```
619
+
620
+ It reads your CI workflows (recursively, handling `run: |` multiline blocks), `package.json` scripts, linter configs, git history, and deployment configs, then writes `governance.md` with `# Inferred` markers so you know what to verify.
621
+
622
+ ---
623
+
624
+ ## Governance Drift Detection
625
+
626
+ `crag diff` compares your `governance.md` against codebase reality:
627
+
628
+ ```bash
629
+ crag diff
630
+ ```
631
+
632
+ ```
633
+ MATCH node --check bin/crag.js (tool exists)
634
+ DRIFT ESLint referenced but biome.json found
635
+ MISSING CI gate: cargo test (in governance, not in CI)
636
+ EXTRA CI step: docker build (in CI, not in governance)
637
+
638
+ 3 match, 1 drift, 1 missing, 1 extra
639
+ ```
640
+
641
+ ---
642
+
643
+ ## Auto-Update
644
+
645
+ Skills track their version in YAML frontmatter. When you run any crag command, it checks for updates:
646
+
647
+ ```bash
648
+ crag upgrade # Update skills in current project
649
+ crag upgrade --workspace # Update all workspace members
650
+ crag upgrade --check # Dry run — show what would change
651
+ crag upgrade --force # Overwrite locally modified skills (with backup)
652
+ ```
653
+
654
+ The update checker queries the npm registry (cached for 24 hours, 3s timeout, graceful failure offline). Skills are only overwritten if the user hasn't modified them — local modifications are detected via SHA-256 content hash (CRLF-normalized for cross-platform consistency) and preserved unless `--force` is used.
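The modification check can be sketched with Node's built-in `crypto` module. Which part of the skill file is hashed, and where the recorded hash is stored, are assumptions here; the sketch only shows why normalizing CRLF before hashing keeps Windows and Unix checkouts in agreement.

```javascript
const { createHash } = require("node:crypto");

// Hash content after normalizing CRLF to LF so line-ending differences
// between platforms do not look like local edits.
function sourceHash(content) {
  const normalized = content.replace(/\r\n/g, "\n");
  return createHash("sha256").update(normalized, "utf8").digest("hex");
}

// A skill counts as locally modified when its current hash differs
// from the hash recorded when it was shipped.
function isLocallyModified(currentContent, recordedHash) {
  return sourceHash(currentContent) !== recordedHash;
}

const shipped = "# Skill\nstep one\n";
const recorded = sourceHash(shipped);
console.log(isLocallyModified("# Skill\r\nstep one\r\n", recorded)); // false
console.log(isLocallyModified("# Skill\nstep two\n", recorded)); // true
```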
655
+
656
+ ---
657
+
658
+ ## What Ships vs What's Generated
659
+
660
+ | Component | Source | Maintains itself? |
661
+ |-----------|--------|-------------------|
662
+ | Pre-start skill | **Ships universal** | Yes — discovers at runtime, caches results, auto-updates |
663
+ | Post-start skill | **Ships universal** | Yes — reads governance for gates, auto-fixes, auto-updates |
664
+ | `governance.md` | **Generated from interview or analyze** | No — you maintain it (20-30 lines) |
665
+ | Hooks | **Generated for your tools** | Yes — sandbox guard + drift detector + gate enforcement |
666
+ | Agents | **Generated for your stack** | Yes — read governance for commands |
667
+ | Settings | **Generated** | Yes — RTK wildcards cover new tools |
668
+ | CI playbook | **Generated template** | You add entries as failures are found |
669
+ | Compile targets | **Generated on demand** | `crag compile` regenerates from governance (12 targets) |
670
+ | Workspace detection | **Ships universal** | Yes — detects 11+ workspace types at runtime |
671
+ | Governance diff | **Ships universal** | Yes — compares governance vs codebase reality |
672
+
673
+ ---
674
+
675
+ ## Why Everything Else Is Static
676
+
677
+ ```mermaid
678
+ flowchart LR
679
+ subgraph STATIC["Current ecosystem"]
680
+ T["CLAUDE.md / AGENTS.md\nStatic config files\nHardcode project facts\nRot when project changes"]
681
+ C["Skill collections\n1,234+ skills\nPick per project\nManual assembly"]
682
+ I["Templates\nOne stack per template\nCopy and modify\nNo runtime discovery"]
683
+ end
684
+
685
+ subgraph RUNTIME["crag"]
686
+ U["Universal skills\nDiscover at runtime\nWork for any stack\nNever go stale"]
687
+ G["governance.md\n20-30 lines\nYour rules only\nHuman-controlled"]
688
+ end
689
+
690
+ T -->|"manual updates required"| STALE["Stale instructions"]
691
+ C -->|"wrong skill for new stack"| MISMATCH["Stack mismatch"]
692
+ I -->|"facts change"| ROT["Template rot"]
693
+ U -->|"reads filesystem every session"| FRESH["Always current"]
694
+ G -->|"changes only when you decide"| STABLE["Your standards"]
695
+
696
+ style STATIC fill:#3a1a1a,stroke:#ff6b6b,color:#eee
697
+ style RUNTIME fill:#1a3a1a,stroke:#00ff88,color:#eee
698
+ style STALE fill:#3a1a1a,stroke:#ff6b6b,color:#eee
699
+ style MISMATCH fill:#3a1a1a,stroke:#ff6b6b,color:#eee
700
+ style ROT fill:#3a1a1a,stroke:#ff6b6b,color:#eee
701
+ style FRESH fill:#1a3a1a,stroke:#00ff88,color:#eee
702
+ style STABLE fill:#3a2a1a,stroke:#ffbb33,color:#eee
703
+ ```
704
+
705
+ ---
706
+
707
+ ## The Session Loop
708
+
709
+ ```mermaid
710
+ flowchart LR
711
+ subgraph PRE["Pre-start (universal)"]
712
+ direction TB
713
+ W["Warm start?\n.session-state.json"]
714
+ IC["Classify intent\nscope discovery"]
715
+ CC["Cache valid?\n.discovery-cache.json"]
716
+ W --> IC --> CC
717
+ CC -->|"hit"| FAST["Fast path\nskip 80% of discovery"]
718
+ CC -->|"miss"| FULL["Full discovery\ndetect stack · load memory"]
719
+ FAST --> GOV["Read governance"]
720
+ FULL --> GOV
721
+ end
722
+
723
+ TASK["Your task"]
724
+
725
+ subgraph POST["Post-start (universal)"]
726
+ direction TB
727
+ V1["Detect changes"]
728
+ V2["Run governance gates"]
729
+ V2 -->|"fail"| FIX["Auto-fix\n(bounded retry)"]
730
+ FIX --> V2
731
+ V2 -->|"pass"| V3["Security review"]
732
+ V3 --> V4["Capture knowledge"]
733
+ V4 --> V5["Write session state\nCommit · Deploy"]
734
+ end
735
+
736
+ PRE --> TASK --> POST
737
+ POST -.->|"cache + state + knowledge"| PRE
738
+
739
+ style PRE fill:#1a2a3a,stroke:#03dac6,color:#eee
740
+ style TASK fill:#2a1a3a,stroke:#bb86fc,color:#eee
741
+ style POST fill:#1a3a2a,stroke:#00ff88,color:#eee
742
+ ```
743
+
744
+ ### What makes this loop tight
745
+
746
+ | Feature | What it does | Savings |
747
+ |---|---|---|
748
+ | **Discovery cache** | Hashes build files, skips unchanged domains | ~80% of pre-start tool calls on unchanged projects |
749
+ | **Intent-scoped discovery** | Classifies task, skips irrelevant domains | Skip frontend discovery for backend bugs, and vice versa |
750
+ | **Session continuity** | Reads `.session-state.json` for warm starts | Near-zero-latency startup when continuing work |
751
+ | **Gate auto-fix** | Fixes lint/format errors, retries gate (max 2x) | Eliminates human round-trip for mechanical failures |
752
+ | **Auto-post-start** | Hook warns before commit if gates haven't run | Removes "forgot to validate" failure mode |
753
+ | **Sandbox guard** | Hard-blocks destructive commands at hook level | Security at system level, not instruction level |
754
+ | **Workspace detection** | Detects 11+ workspace types, enumerates members | Automatic monorepo/polyrepo awareness |
755
+ | **Auto-update** | Version-tracked skills with hash-based conflict detection | Skills stay current across all projects |
756
+ | **Governance diff** | Compares `governance.md` against actual codebase | Catches drift before it causes failures |
757
+
758
+ No agent framework covered in the Prior Art review does all of these. Most re-discover cold every session, require manual validation, and rely on instructions alone for safety.
759
+
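The gate auto-fix row in the table above describes a bounded retry: fix, re-run the gate, and give up after two retries. A minimal sketch of that control flow, assuming hypothetical `runGate` and `autoFix` callbacks (crag's skills are instruction files, so the real loop is carried out by the agent rather than by code like this):

```javascript
// runGate and autoFix are hypothetical callbacks: runGate returns { ok: boolean },
// autoFix applies a mechanical fix such as a lint or format pass.
function runGateWithAutoFix(runGate, autoFix, maxRetries = 2) {
  let attempts = 0;
  let result = runGate();
  // Retry only while the gate fails and the retry budget is not exhausted.
  while (!result.ok && attempts < maxRetries) {
    autoFix(result);
    attempts += 1;
    result = runGate();
  }
  return { ok: result.ok, fixAttempts: attempts };
}
```

With `maxRetries = 2`, a gate that never passes is run three times in total (the initial run plus two retries) before the failure is handed back to a human.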
760
+ ---
761
+
762
+ ## Generated Infrastructure
763
+
764
+ ```
765
+ .claude/
766
+ ├── governance.md # YOUR rules (only custom file)
767
+ ├── skills/
768
+ │ ├── pre-start-context/SKILL.md # Universal discoverer
769
+ │ └── post-start-validation/SKILL.md # Universal validator
770
+ ├── hooks/
771
+ │ ├── sandbox-guard.sh # Hard-blocks destructive commands
772
+ │ ├── auto-post-start.sh # Gate enforcement before commits
773
+ │ ├── drift-detector.sh # Checks key files exist
774
+ │ ├── circuit-breaker.sh # Failure loop detection
775
+ │ ├── pre-compact-snapshot.sh # Memory before compaction
776
+ │ └── post-compact-recovery.sh # Memory after compaction
777
+ ├── agents/
778
+ │ ├── test-runner.md # Parallel tests (Sonnet)
779
+ │ ├── security-reviewer.md # Security audit (Opus)
780
+ │ ├── dependency-scanner.md # Vulnerability scan
781
+ │ └── skill-auditor.md # Infrastructure audit
782
+ ├── rules/ # Cross-session memory
783
+ ├── ci-playbook.md # Known CI failures
784
+ ├── .session-name # Notification routing
785
+ ├── .discovery-cache.json # Cached discovery (auto-generated)
786
+ ├── .session-state.json # Session continuity (auto-generated)
787
+ ├── .gates-passed # Gate sentinel (auto-generated)
788
+ └── settings.local.json # Hooks + permissions
789
+ ```
790
+
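The tree above lists `sandbox-guard.sh` as the hook that hard-blocks destructive commands. The shipped hook is a shell script whose exact rules are not shown here; the sketch below only illustrates the idea in JavaScript, using the four examples this README names (`rm -rf /`, `DROP TABLE`, `curl|bash`, force-push to main). All function names and patterns are assumptions.

```javascript
// Hypothetical check for "git push" combined with a force flag and the main branch.
function isForcePushToMain(cmd) {
  const tokens = cmd.trim().split(/\s+/);
  const isPush = tokens[0] === 'git' && tokens[1] === 'push';
  const forced = tokens.some((t) => t === '-f' || t.startsWith('--force'));
  const toMain = tokens.some((t) => t === 'main' || t.endsWith(':main'));
  return isPush && forced && toMain;
}

// Illustrative rules only; a real guard would need a much broader pattern set.
const BLOCKED = [
  { name: 'rm -rf /', test: (c) => /\brm\s+-(?:rf|fr)\s+\/(?:\s|$)/.test(c) },
  { name: 'DROP TABLE', test: (c) => /\bdrop\s+table\b/i.test(c) },
  { name: 'curl|bash', test: (c) => /\bcurl\b[^|]*\|\s*(?:ba)?sh\b/.test(c) },
  { name: 'force-push to main', test: isForcePushToMain },
];

// Returns which rule (if any) blocks the command.
function checkCommand(cmd) {
  const hit = BLOCKED.find((rule) => rule.test(cmd));
  return hit ? { allowed: false, rule: hit.name } : { allowed: true, rule: null };
}
```

Because a check like this runs on the command string before execution, it does not depend on the agent having read or followed any instruction file.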
791
+ ---
792
+
793
+ ## Principles
794
+
795
+ 1. **Discover, don't hardcode.** Every fact about the codebase is read at runtime. The skills never say "22 controllers" — they say "read the controller directory."
796
+
797
+ 2. **Govern, don't hope.** Your quality bar lives in `governance.md`. The skills enforce it but never modify it. It changes only when you change it.
798
+
799
+ 3. **Ship the engine, generate the config.** Universal skills ship once. `governance.md` is generated per-project. The engine stays the same across projects; the config is 20-30 lines.
800
+
801
+ 4. **Enforce, don't instruct.** Hooks are 100% reliable at zero token cost. CLAUDE.md rules see only ~80% compliance. Critical behavior goes in hooks.
802
+
803
+ 5. **Compound, don't restart.** Cross-session memory means each session knows what the last one learned. Knowledge self-verifies against source files.
804
+
805
+ 6. **Guard, don't trust.** Security hooks hard-block destructive commands at the system level — `rm -rf /`, `DROP TABLE`, `curl|bash`, force-push to main. Even if instructions are misread, the sandbox catches it. Defense in depth: hooks enforce what skills instruct.
806
+
807
+ 7. **Cache, don't re-discover.** Every discovery result is cached with content hashes. If nothing changed, the next session starts in seconds, not minutes. The cache is advisory — if it's wrong, full discovery runs as normal.
808
+
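Principle 7 above (cache keyed by content hashes, falling back to full discovery on a miss) can be sketched as follows. The cache shape, the function names, and the choice of SHA-256 are assumptions made for illustration; the actual layout of `.discovery-cache.json` is not documented here.

```javascript
const crypto = require('node:crypto');
const fs = require('node:fs');

// Hash the contents of a domain's build files. Missing files are recorded
// with a distinct marker so that creating or deleting a file changes the hash.
function hashBuildFiles(paths) {
  const h = crypto.createHash('sha256');
  for (const p of [...paths].sort()) {
    h.update(p + '\0');
    if (fs.existsSync(p)) {
      h.update('F\0');
      h.update(fs.readFileSync(p));
    } else {
      h.update('M\0');
    }
  }
  return h.digest('hex');
}

// cache is a plain object of hypothetical shape { [domain]: { hash, result } }.
// On a hash match the cached result is returned; otherwise fullDiscovery runs
// and its result replaces the stale entry.
function discoverDomain(cache, domain, buildFiles, fullDiscovery) {
  const hash = hashBuildFiles(buildFiles);
  const entry = cache[domain];
  if (entry && entry.hash === hash) {
    return { result: entry.result, cached: true };
  }
  const result = fullDiscovery();
  cache[domain] = { hash, result };
  return { result, cached: false };
}
```

The cache stays advisory in this sketch: a missing, stale, or mismatched entry simply triggers the full discovery path.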
809
+ ---
810
+
811
+ ## Prior Art
812
+
813
+ An independent review assessed every major AI coding tool, open-source project, academic paper, and patent filing as of April 2026. The closest candidates and why they differ:
814
+
815
+ | Candidate | What it does | Why it's not this |
816
+ |---|---|---|
817
+ | **AGENTS.md** (60K+ repos) | Static config file AI agents read | Human-maintained, multiple files by scope, no runtime discovery |
818
+ | **Claude Code** `/init` + CLAUDE.md | Scans repo, generates static instructions | Generates static output that rots. Multiple files. No governance separation |
819
+ | **Cursor** `.cursor/rules/` | Per-directory rule files | Static context, multiple artifacts, no universal engine |
820
+ | **Gemini CLI** GEMINI.md hierarchy | JIT instruction file scanning | Discovers *instruction files*, not the project itself |
821
+ | **Kiro** steering docs | Generates product/tech/structure docs | Multiple steering files, not single governance, not universal |
822
+ | **Codex** AGENTS.md + hooks + skills | Layered static instructions + extensibility | Instruction chain by directory. Could host this engine but doesn't ship one |
823
+ | **claude-code-kit** | Framework detection + generated .claude/ | Kit/framework-specific (Next.js, React, Express). Not universal polyglot |
824
+ | **OpenDev** (arxiv paper) | CLI agent with lazy tool discovery | Research prototype. No governance file. Not productized |
825
+ | **Repo2Run** (arxiv paper) | Repo → runnable Dockerfile synthesis | Build/CI domain only. No agent governance architecture |
826
+
827
+ **Adjacent patents identified:**
828
+ - **US20250291583A1** (Microsoft) — YAML-configured agent rules/actions. Covers "config file drives AI agents" broadly but not universal repo discovery.
829
+ - **US9898393B2** (Solano Labs) — Repo pattern analysis → inferred CI config. Strong historic prior art for build-system discovery, but not AI agent governance.
830
+
831
+ In the review's assessment, neither patent blocks this architecture: both are adjacent, not overlapping.
832
+
833
+ **Three novelty hypotheses validated by the review:**
834
+ 1. **Compositional:** Many systems have pieces (hooks, skills, context files). None composes them into a universal discovery engine + a single governance file + continuously regenerated artifacts.
835
+ 2. **Scope:** Closest implementations (claude-code-kit) are framework-specific, not polyglot-universal.
836
+ 3. **Governance-as-contract:** Existing tools treat instruction files as context (often non-enforced). This treats governance as an executable contract that deterministically shapes gates and commit behavior.
837
+
838
+ ---
839
+
840
+ ## Roadmap
841
+
842
+ - [x] Universal pre-start and post-start skills
843
+ - [x] Interview-driven governance generation
844
+ - [x] CLI (`crag init`, `crag check`, `crag install`)
845
+ - [x] Proven on 5-language multi-service project (StructuAI)
846
+ - [x] Proven on full-stack monolith with blue-green deploys (Leyoda)
847
+ - [x] Proven on 11-microservice K8s platform with dual-repo governance (MetricHost)
848
+ - [x] Multi-level governance hierarchy (root + backend + frontend)
849
+ - [x] `crag compile` — governance.md → GitHub Actions, husky, pre-commit, AGENTS.md, Cursor, Gemini
850
+ - [x] Incremental discovery cache — content-addressed, skips 80% of pre-start on unchanged projects
851
+ - [x] Intent-scoped discovery — classifies task, skips irrelevant domains
852
+ - [x] Session continuity — warm starts via `.session-state.json`
853
+ - [x] Gate auto-fix loop — fixes lint/format errors automatically, bounded retry (max 2x)
854
+ - [x] Auto-post-start hook — gate enforcement before commits
855
+ - [x] Sandbox guard — hard-blocks destructive commands (rm -rf /, DROP TABLE, curl|bash, force-push main)
856
+ - [x] `crag analyze` — generate governance from existing project without interview
857
+ - [x] `crag diff` — compare governance against codebase reality
858
+ - [x] `crag upgrade` — update universal skills when new version ships
859
+ - [x] `crag workspace` — inspect detected workspace type and members
860
+ - [x] Workspace detection — 11+ types (pnpm, npm, Cargo, Go, Gradle, Maven, Nx, Turbo, Bazel, submodules, nested repos)
861
+ - [x] Governance v2 format — path-scoped gates, conditional sections, mandatory/optional/advisory classification
862
+ - [x] Auto-update — version tracking, npm registry check, content-hash conflict detection
863
+ - [x] Cross-agent compilation — **12 targets** (GitHub Actions, husky, pre-commit, AGENTS.md, Cursor, Gemini, Copilot, Cline, Continue, Windsurf, Zed, Sourcegraph Cody)
864
+ - [x] Modular architecture — 24 modules across 6 directories (zero dependencies)
865
+ - [x] Test suite — 159 tests covering parse, integrity, detect, enumerate, merge, compile, version, shell, CLI, 6 new compile targets, analyze internals, diff internals
866
+ - [x] Published on npm as `@whitehatd/crag`
867
+ - [x] GitHub Actions CI/CD — multi-OS (Ubuntu/macOS/Windows) × multi-Node (18/20/22) test matrix, automated npm publish with SLSA provenance, stale issue cleanup
868
+ - [ ] Cross-repo benchmark — 20-30 repos, measure coverage %, false positives, failure modes
869
+ - [ ] Drift resilience test — add services, change linters, rename directories. Does the engine re-discover?
870
+ - [ ] Baseline comparison — same governance in AGENTS.md, CLAUDE.md, .cursor/rules, GEMINI.md
871
+ - [ ] crag Cloud (paid tier) — hosted governance registry, cross-repo dashboard, team library, compliance templates, drift alerts
872
+
873
+ ---
29
874
 
30
875
  ## License
31
876
 
32
877
  MIT
878
+
879
+ ---
880
+
881
+ *Built by [Alexandru Cioc (WhitehatD)](https://github.com/WhitehatD)*