@plazmodium/odin 0.3.3-beta → 0.3.4-beta

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (73) hide show
  1. package/README.md +16 -10
  2. package/builtin/ODIN.md +1045 -0
  3. package/builtin/agent-definitions/README.md +170 -0
  4. package/builtin/agent-definitions/_shared-context.md +377 -0
  5. package/builtin/agent-definitions/architect.md +627 -0
  6. package/builtin/agent-definitions/builder.md +716 -0
  7. package/builtin/agent-definitions/discovery.md +293 -0
  8. package/builtin/agent-definitions/documenter.md +238 -0
  9. package/builtin/agent-definitions/guardian.md +1049 -0
  10. package/builtin/agent-definitions/integrator.md +363 -0
  11. package/builtin/agent-definitions/planning.md +236 -0
  12. package/builtin/agent-definitions/product.md +405 -0
  13. package/builtin/agent-definitions/release.md +430 -0
  14. package/builtin/agent-definitions/reviewer.md +447 -0
  15. package/builtin/agent-definitions/watcher.md +402 -0
  16. package/builtin/skills/api/graphql/SKILL.md +548 -0
  17. package/builtin/skills/api/grpc/SKILL.md +554 -0
  18. package/builtin/skills/api/rest-api/SKILL.md +469 -0
  19. package/builtin/skills/api/trpc/SKILL.md +503 -0
  20. package/builtin/skills/architecture/clean-architecture/SKILL.md +141 -0
  21. package/builtin/skills/architecture/domain-driven-design/SKILL.md +129 -0
  22. package/builtin/skills/architecture/event-driven/SKILL.md +145 -0
  23. package/builtin/skills/architecture/microservices/SKILL.md +143 -0
  24. package/builtin/skills/architecture/tla-precheck/SKILL.md +171 -0
  25. package/builtin/skills/backend/golang-gin/SKILL.md +141 -0
  26. package/builtin/skills/backend/nodejs-express/SKILL.md +277 -0
  27. package/builtin/skills/backend/nodejs-fastify/SKILL.md +152 -0
  28. package/builtin/skills/backend/python-django/SKILL.md +128 -0
  29. package/builtin/skills/backend/python-fastapi/SKILL.md +140 -0
  30. package/builtin/skills/database/mongodb/SKILL.md +132 -0
  31. package/builtin/skills/database/postgresql/SKILL.md +120 -0
  32. package/builtin/skills/database/prisma-orm/SKILL.md +366 -0
  33. package/builtin/skills/database/redis/SKILL.md +140 -0
  34. package/builtin/skills/database/supabase/SKILL.md +416 -0
  35. package/builtin/skills/devops/aws/SKILL.md +382 -0
  36. package/builtin/skills/devops/docker/SKILL.md +359 -0
  37. package/builtin/skills/devops/github-actions/SKILL.md +435 -0
  38. package/builtin/skills/devops/kubernetes/SKILL.md +459 -0
  39. package/builtin/skills/devops/terraform/SKILL.md +453 -0
  40. package/builtin/skills/frontend/alpine-dev/SKILL.md +27 -0
  41. package/builtin/skills/frontend/angular-dev/SKILL.md +28 -0
  42. package/builtin/skills/frontend/astro-dev/SKILL.md +28 -0
  43. package/builtin/skills/frontend/htmx-dev/SKILL.md +28 -0
  44. package/builtin/skills/frontend/nextjs-dev/SKILL.md +470 -0
  45. package/builtin/skills/frontend/react-patterns/SKILL.md +166 -0
  46. package/builtin/skills/frontend/svelte-dev/SKILL.md +28 -0
  47. package/builtin/skills/frontend/tailwindcss/SKILL.md +131 -0
  48. package/builtin/skills/frontend/vuejs-dev/SKILL.md +28 -0
  49. package/builtin/skills/generic-dev/SKILL.md +307 -0
  50. package/builtin/skills/testing/cypress/SKILL.md +372 -0
  51. package/builtin/skills/testing/jest/SKILL.md +176 -0
  52. package/builtin/skills/testing/playwright/SKILL.md +341 -0
  53. package/builtin/skills/testing/unit-tests-eval-sdd/SKILL.md +73 -0
  54. package/builtin/skills/testing/unit-tests-sdd/SKILL.md +83 -0
  55. package/builtin/skills/testing/vitest/SKILL.md +249 -0
  56. package/dist/adapters/skills/filesystem.d.ts.map +1 -1
  57. package/dist/adapters/skills/filesystem.js +2 -18
  58. package/dist/adapters/skills/filesystem.js.map +1 -1
  59. package/dist/builtin-assets.d.ts +8 -0
  60. package/dist/builtin-assets.d.ts.map +1 -0
  61. package/dist/builtin-assets.js +90 -0
  62. package/dist/builtin-assets.js.map +1 -0
  63. package/dist/init.js +69 -11
  64. package/dist/init.js.map +1 -1
  65. package/dist/schemas.d.ts +1 -1
  66. package/dist/server.js +1 -1
  67. package/dist/server.js.map +1 -1
  68. package/dist/tools/prepare-phase-context.d.ts.map +1 -1
  69. package/dist/tools/prepare-phase-context.js +5 -0
  70. package/dist/tools/prepare-phase-context.js.map +1 -1
  71. package/dist/types.d.ts +3 -0
  72. package/dist/types.d.ts.map +1 -1
  73. package/package.json +5 -3
@@ -0,0 +1,170 @@
1
+ # Odin Agent Definitions
2
+
3
+ This directory contains the prompt definitions for all Odin agents. Each agent is a specialized prompt that handles a specific phase of the 11-phase SDD workflow.
4
+
5
+ ## Agent Overview (v2)
6
+
7
+ | Agent | Phase | File | Description |
8
+ |-------|-------|------|-------------|
9
+ | Planning | 0 | `planning.md` | Epic decomposition into features (L3 only) |
10
+ | **Product** | 1 | `product.md` | PRD generation with complexity-gated templates (NEW in v2) |
11
+ | Discovery | 2 | `discovery.md` | Requirements gathering via stakeholder interviews |
12
+ | Architect | 3 | `architect.md` | Technical specification drafting + opt-in formal design verification |
13
+ | Guardian | 4 | `guardian.md` | Multi-perspective review of PRD + spec + proof results |
14
+ | Builder | 5 | `builder.md` | Code implementation (emits claims, watched) |
15
+ | **Reviewer** | 6 | `reviewer.md` | SAST/security scanning via Semgrep (NEW in v2) |
16
+ | Integrator | 7 | `integrator.md` | Build verification and integration (emits claims, watched) |
17
+ | Documenter | 8 | `documenter.md` | Documentation updates |
18
+ | Release | 9 | `release.md` | PR creation and archival (emits claims, watched) |
19
+
20
+ ### Support Agents
21
+
22
+ | Agent | File | Description |
23
+ |-------|------|-------------|
24
+ | **Watcher** | `watcher.md` | LLM escalation for claim verification (NEW in v2) |
25
+ | Shared Context | `_shared-context.md` | Common context injected into all agents |
26
+
27
+ ### Development Agents (Not Part of Workflow)
28
+
29
+ | Agent | File | Description |
30
+ |-------|------|-------------|
31
+ | Consultant | `spec-driven-dev-consultant.md` | For analyzing/improving Odin itself |
32
+ | MCP Test | `mcp-test.md` | For testing MCP server functionality |
33
+
34
+ ## Workflow Diagram (v2)
35
+
36
+ ```
37
+ ┌─────────────────────────────────────────────────────┐
38
+ │ 11-PHASE WORKFLOW │
39
+ └─────────────────────────────────────────────────────┘
40
+
41
+ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
42
+ │ PLANNING │───▶│ PRODUCT │───▶│ DISCOVERY│───▶│ ARCHITECT│───▶│ GUARDIAN │
43
+ │ (0) │ │ (1) NEW │ │ (2) │ │ (3) │ │ (4) │
44
+ │ L3 only │ │ PRD │ │ Reqs │ │ Spec │ │ Review │
45
+ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └────┬─────┘
46
+
47
+ ┌───────────────────────────────────────────────────────────────┘
48
+
49
+
50
+ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐
51
+ │ BUILDER │───▶│ REVIEWER │───▶│INTEGRATOR│───▶│DOCUMENTER│───▶│ RELEASE │
52
+ │ (5) │ │ (6) NEW │ │ (7) │ │ (8) │ │ (9) │
53
+ │ WATCHED │ │ SAST │ │ WATCHED │ │ Docs │ │ WATCHED │
54
+ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └────┬─────┘
55
+
56
+
57
+ ┌──────────┐
58
+ │ COMPLETE │
59
+ │ (10) │
60
+ └──────────┘
61
+ ```
62
+
63
+ ## New Agents in v2
64
+
65
+ ### Product Agent (Phase 1)
66
+
67
+ The Product agent generates a Product Requirements Document (PRD) **before** the technical spec. This ensures business requirements are captured before diving into technical details.
68
+
69
+ **Key Features:**
70
+ - **Complexity-gated templates:**
71
+ - L1 (Bug Fix): PRD_EXEMPTION — 8-line template
72
+ - L2 (Feature): PRD_LITE — 1-page template
73
+ - L3 (Epic): PRD_FULL — Complete PRD with user journeys, NFRs, rollout plan
74
+ - **Max 1 clarification round** — if unresolved, creates blocker
75
+ - **No implementation details** — PRD focuses on "what", not "how"
76
+
77
+ ### Reviewer Agent (Phase 6)
78
+
79
+ The Reviewer agent runs SAST (Static Application Security Testing) using Semgrep after the Builder completes implementation.
80
+
81
+ **Key Features:**
82
+ - **Default scan:** `semgrep scan --config=auto`
83
+ - **Severity-based gating:** HIGH/CRITICAL must be resolved or deferred with justification
84
+ - **Findings recorded** to `security_findings` table
85
+ - **Output format:** Summary table with blocking/deferred sections
86
+
87
+ ### Watcher Agent (Support)
88
+
89
+ The Watcher agent is called via LLM escalation when the Policy Engine cannot make a deterministic decision.
90
+
91
+ **Key Features:**
92
+ - **Only invoked for:**
93
+ - HIGH risk claims
94
+ - Claims with missing evidence
95
+ - Policy Engine inconclusive results
96
+ - **Returns:** PASS/FAIL/NEEDS_REVIEW with reasoning and confidence
97
+ - **Not a phase agent** — runs as a sub-agent when needed
98
+
99
+ ## Watched Agents (v2)
100
+
101
+ Three agents emit structured **claims** that are verified by the Hybrid Watcher Architecture:
102
+
103
+ | Agent | Claims Emitted |
104
+ |-------|---------------|
105
+ | Builder | CODE_ADDED, CODE_MODIFIED, TEST_PASSED, BUILD_SUCCEEDED |
106
+ | Integrator | INTEGRATION_VERIFIED |
107
+ | Release | PR_CREATED, ARCHIVE_CREATED |
108
+
109
+ ### Verification Flow
110
+
111
+ ```
112
+ Agent emits claim
113
+
114
+
115
+ ┌─────────────────┐
116
+ │ agent_claims │ ─────────────────────────────────┐
117
+ │ (Supabase) │ │
118
+ └────────┬────────┘ │
119
+ │ │
120
+ ▼ ▼
121
+ ┌─────────────────┐ ┌─────────────────┐
122
+ │ Policy Engine │ NEEDS_REVIEW ──────▶ │ LLM Watcher │
123
+ │ (SQL Functions) │ │ (Sub-Agent) │
124
+ └────────┬────────┘ └────────┬────────┘
125
+ │ │
126
+ │ PASS/FAIL │ PASS/FAIL
127
+ │ │
128
+ ▼ ▼
129
+ ┌─────────────────┐ ┌─────────────────┐
130
+ │ policy_verdicts │ │ watcher_reviews │
131
+ │ (Supabase) │ │ (Supabase) │
132
+ └─────────────────┘ └─────────────────┘
133
+ ```
134
+
135
+ ## Shared Context
136
+
137
+ All agents receive the `_shared-context.md` file which includes:
138
+ - 11-phase workflow table
139
+ - Critical workflow rules (spec-first, never skip phases, agents never merge)
140
+ - Watcher verification protocol
141
+ - Skills injection block
142
+ - Recovery mechanisms
143
+
144
+ ## Usage
145
+
146
+ Agents are invoked via the Task tool in the main orchestrator session:
147
+
148
+ ```typescript
149
+ // Example: Invoke the Product agent
150
+ Task({
151
+ subagent_type: "product",
152
+ prompt: "Generate PRD for FEAT-001: User authentication flow",
153
+ description: "Generate PRD"
154
+ });
155
+ ```
156
+
157
+ The orchestrator is responsible for:
158
+ 1. Reading agent definitions before each phase
159
+ 2. Injecting relevant skills
160
+ 3. Executing MCP calls (agents can't access MCP directly)
161
+ 4. Submitting claims and running policy checks
162
+ 5. Recording phase transitions
163
+
164
+ ## Version History
165
+
166
+ - **v2.1** (2026-03-18): Added TLA+ formal design verification (Architect Step A3a + Guardian proof review); fixed all agent phase headers to v2 numbering
167
+ - **v2.0** (2026-03-06): Added Product, Reviewer, Watcher agents; 11-phase workflow; claim verification
168
+ - **v1.4** (2026-02-17): Added step-level enforcement checklists
169
+ - **v1.3** (2026-02-09): Added git branch tracking
170
+ - **v1.0** (2026-02-04): Initial 8-agent system
@@ -0,0 +1,377 @@
1
+ # Shared Agent Context
2
+
3
+ This file contains common patterns and references shared across all SDD agents. Each agent definition references this file rather than duplicating the content.
4
+
5
+ ---
6
+
7
+ ## The 11-Phase Workflow (Odin v2)
8
+
9
+ | Phase | Name | Agent | Watched? | Description |
10
+ |-------|------|-------|----------|-------------|
11
+ | 0 | Planning | Planner | No | Epic decomposition (L3 only) |
12
+ | 1 | Product | Product | No | PRD generation (complexity-gated) |
13
+ | 2 | Discovery | Discovery | No | Requirements gathering |
14
+ | 3 | Architect | Architect | No | Specification drafting |
15
+ | 4 | Guardian | Guardian | No | PRD + Spec review |
16
+ | 5 | Builder | Builder | **YES** | Implementation |
17
+ | 6 | Reviewer | Reviewer | No | SAST/security scan |
18
+ | 7 | Integrator | Integrator | **YES** | Build verification |
19
+ | 8 | Documenter | Documenter | No | Documentation |
20
+ | 9 | Release | Release | **YES** | PR creation + archival |
21
+ | 10 | Complete | - | No | Feature done |
22
+
23
+ **Watched agents** (Builder, Integrator, Release) emit structured claims that are verified by the Policy Engine and optionally the LLM Watcher.
24
+
25
+ ---
26
+
27
+ ## Hybrid Orchestration Model
28
+
29
+ Odin uses a **hybrid orchestration** model:
30
+ - **You (Agent)**: Create artifacts (specs, reviews, code, docs) + document state changes needed
31
+ - **Main Session (Orchestrator)**: Manages workflow state via MCP
32
+
33
+ **Why**: Sub-agents spawned via task/agent tools cannot access MCP servers. Only the main orchestrator session can call MCP tools.
34
+
35
+ **Your responsibility**: At the end of your artifact, include a `## State Changes Required` section listing all state changes the orchestrator should make.
36
+
37
+ ---
38
+
39
+ ## State Changes Required — Template
40
+
41
+ Every agent artifact must end with this section:
42
+
43
+ ```markdown
44
+ ---
45
+ ## State Changes Required
46
+
47
+ ### 1. Submit Claims (if watched agent: Builder, Integrator, Release)
48
+ [Include structured claims with type, description, risk level, evidence refs]
49
+
50
+ ### 2. Track Duration
51
+ - **Phase**: [0-10]
52
+ - **Agent**: [Your agent name]
53
+ - **Operation**: [Brief description]
54
+
55
+ ### 3. Record Development Eval Artifact (if applicable)
56
+ - **Feature ID**: [ID]
57
+ - **Phase**: [N]
58
+ - **Output Type**: `eval_plan` | `eval_run`
59
+ - **Created By**: [Agent name]
60
+
61
+ ### 4. Record Quality Gate (if applicable)
62
+ - **Feature ID**: [ID]
63
+ - **Gate Name**: `eval_readiness` | [other gate]
64
+ - **Status**: `APPROVED` | `REJECTED`
65
+ - **Approver**: [Agent name]
66
+ - **Notes**: [Why]
67
+
68
+ ### 5. Transition Phase (if applicable)
69
+ - **Feature ID**: [ID]
70
+ - **From Phase**: [N]
71
+ - **To Phase**: [N+1]
72
+ - **Transitioned By**: [Agent name]
73
+
74
+ ### 6. Create Blocker (if applicable)
75
+ - **Blocker Type**: [SPEC_AMBIGUITY | MISSING_CONTEXT | TECHNICAL_IMPOSSIBILITY | EXTERNAL_DEPENDENCY | INTEGRATION_CONFLICT | DURATION_EXCEEDED | ITERATION_LIMIT_EXCEEDED | QUALITY_GATE_REJECTED | OTHER]
76
+ - **Phase**: [N]
77
+ - **Severity**: [LOW | MEDIUM | HIGH | CRITICAL]
78
+ - **Title**: [Short description]
79
+ - **Description**: [Details + what's needed to resolve]
80
+ - **Created By**: [Agent name]
81
+ ```
82
+
83
+ ---
84
+
85
+ ## Duration Tracking
86
+
87
+ Agent work duration is tracked automatically by the orchestrator using `start_agent_invocation` and `end_agent_invocation`. You do not need to self-report duration or token usage.
88
+
89
+ **If an operation is taking excessively long** (e.g., unbounded iteration, runaway complexity):
90
+ - Stop and document a `DURATION_EXCEEDED` blocker
91
+ - Include what was completed and what remains
92
+
93
+ ---
94
+
95
+ ## Watcher Verification (Builder, Integrator, Release Only)
96
+
97
+ **Builder, Integrator, and Release are watched agents.** They must emit structured claims for verification.
98
+
99
+ ### How Verification Works
100
+
101
+ 1. **Agent emits claim** (in State Changes Required section)
102
+ 2. **Policy Engine** (SQL) performs deterministic checks:
103
+ - Evidence refs present?
104
+ - Phase order correct?
105
+ - Required gates approved?
106
+ 3. **If PASS**: Claim verified, workflow continues
107
+ 4. **If NEEDS_REVIEW**: Escalated to LLM Watcher for semantic verification
108
+
109
+ ### Escalation Triggers (Any One Triggers LLM Watcher)
110
+
111
+ - Claim marked `HIGH` risk
112
+ - Evidence refs missing or empty
113
+ - Policy check inconclusive
114
+
115
+ ### Claim Format
116
+
117
+ ```markdown
118
+ ### Claim: [CLAIM_TYPE]
119
+
120
+ - **Claim Type**: CODE_ADDED | CODE_MODIFIED | CODE_DELETED | TEST_ADDED | TEST_PASSED | BUILD_SUCCEEDED | INTEGRATION_VERIFIED | PR_CREATED | ARCHIVE_CREATED
121
+ - **Description**: What was done
122
+ - **Risk Level**: LOW | MEDIUM | HIGH
123
+ - **Evidence Refs**:
124
+ ```json
125
+ {
126
+ "commit_sha": "abc123",
127
+ "file_paths": ["src/file.ts"],
128
+ "spec_sections": ["4.2"],
129
+ "test_output_hash": "sha256:...",
130
+ ...
131
+ }
132
+ ```
133
+ ```
134
+
135
+ ### Risk Level Guidelines
136
+
137
+ | Risk Level | When to Use |
138
+ |------------|-------------|
139
+ | **LOW** | Tests, docs, styling, non-critical code |
140
+ | **MEDIUM** | Business logic, API endpoints, data transformations |
141
+ | **HIGH** | Authentication, authorization, payments, PII, security, deletions |
142
+
143
+ **HIGH risk claims ALWAYS escalate to LLM Watcher.**
144
+
145
+ ---
146
+
147
+ ## Memory Candidates
148
+
149
+ Document project knowledge discovered during your work. The orchestrator will prompt the user to save these as permanent memories.
150
+
151
+ **When to document**: Architecture insights, integration patterns, tech stack decisions, performance baselines, security requirements, gotchas found.
152
+
153
+ ```markdown
154
+ ### Memory Candidates
155
+
156
+ **ARCHITECTURE**: [Insight about system architecture]
157
+ **Tags**: [relevant, tags]
158
+
159
+ **PATTERN**: [Reusable pattern discovered]
160
+ **Tags**: [pattern, category]
161
+ ```
162
+
163
+ ---
164
+
165
+ ## Skills — Mandatory
166
+
167
+ Skills are **mandatory** for all agents. The orchestrator injects domain-specific skills into your context under `## Active Skills`. If no specific tech stack skills match, the `generic-dev` fallback skill is injected.
168
+
169
+ Some phases also require workflow skills:
170
+
171
+ - **Builder** must receive `testing/unit-tests-sdd`
172
+ - **Reviewer** must receive `testing/unit-tests-eval-sdd`
173
+
174
+ Always follow patterns, conventions, and best practices from your injected skills.
175
+
176
+ ---
177
+
178
+ ## Development Evals — Additive Verification
179
+
180
+ Development Evals are a workflow track for defining and checking behavior before and after implementation. They are **not** the same as Odin's operational **EVALS** health scoring.
181
+
182
+ ### Core Objects
183
+
184
+ - **`eval_plan`** — Architect-owned pre-build artifact describing capability/regression cases and grading strategy
185
+ - **`eval_readiness`** — Guardian gate before Builder begins
186
+ - **`eval_run`** — Reviewer-owned post-build proof artifact, optionally extended by Integrator for runtime validation
187
+
188
+ ### Phase Responsibilities
189
+
190
+ - **Product**: define success, non-goals, and failure shape
191
+ - **Discovery**: collect happy-path, edge, failure, and should-not-trigger scenarios
192
+ - **Architect**: record `eval_plan` when required
193
+ - **Guardian**: decide `eval_readiness`
194
+ - **Builder**: keep regression/acceptance coverage in sync with the work
195
+ - **Reviewer**: run development evals and record `eval_run`
196
+ - **Integrator**: resolve any `partial` eval state with runtime/end-state verification
197
+
198
+ ### Non-Interference Rule
199
+
200
+ Development Evals are additive. They MUST NOT replace, bypass, or weaken:
201
+
202
+ - formal verification (`odin.verify_design`) when applicable
203
+ - Builder/Integrator test and build verification
204
+ - Reviewer security review via `odin.run_review_checks`
205
+ - runtime spot-checks and integration validation
206
+ - watched-claim Policy Engine / Watcher verification
207
+
208
+ If Development Evals pass but one of the existing review steps fails, the feature is still blocked.
209
+
210
+ ### Runtime Recording Convention
211
+
212
+ Only the main orchestrator session can call Odin MCP tools. Agents should document these state changes when relevant:
213
+
214
+ - `odin.record_phase_artifact({ output_type: "eval_plan", ... })`
215
+ - `odin.record_quality_gate({ gate_name: "eval_readiness", ... })`
216
+ - `odin.record_phase_artifact({ output_type: "eval_run", ... })`
217
+
218
+ ---
219
+
220
+ ## Build Verification — Dual-Check Convention
221
+
222
+ Both the **Builder** and **Integrator** agents must verify the build passes. This provides defense-in-depth:
223
+
224
+ 1. **Builder (Step 5a)**: Runs `npm run build` after completing all code changes. Catches TypeScript errors, import issues, and configuration problems before handoff. If the build fails, the Builder fixes the issue — no phase transition until the build passes.
225
+
226
+ 2. **Integrator (Step 6)**: Runs the build again as a second verification. Also performs runtime verification (Step 6b) to catch issues that pass the build but fail at runtime (e.g., stale data, caching, missing env vars).
227
+
228
+ **Why both?** A build failure caught by the Builder saves a full phase transition round-trip. The Integrator's second check catches regressions or environment-specific issues.
229
+
230
+ ---
231
+
232
+ ## Git Branch Management
233
+
234
+ Odin tracks git branches per feature. When a feature is created, a branch name is generated:
235
+ - **With dev initials**: `{initials}/feature/{FEATURE-ID}` (e.g., `jd/feature/AUTH-001`)
236
+ - **Without initials**: `feature/{FEATURE-ID}` (e.g., `feature/AUTH-001`)
237
+
238
+ The orchestrator creates the git branch **BEFORE calling `create_feature()`**. The branch must exist before the DB record is created:
239
+ ```
240
+ # 1. Create branch FIRST
241
+ git checkout -b {dev_initials}/feature/{FEATURE-ID}
242
+
243
+ # 2. Only after branch exists, create the DB record
244
+ SELECT * FROM create_feature(...);
245
+ ```
246
+
247
+ > **CRITICAL**: If branch creation fails, do NOT call `create_feature()`. A dead DB record with no branch is worse than no record at all.
248
+
249
+ Agents that interact with git should:
250
+ 1. **Orchestrator**: Create the feature branch FIRST, then call `create_feature()` — do NOT defer branch creation to Builder
251
+ 2. **Builder**: Commit after each task, include commit tracking in State Changes
252
+ 3. **Integrator**: Verify build and runtime on the feature branch
253
+ 4. **Release**: Create PR via `gh pr create`, request human review — **NEVER merge PRs**
254
+
255
+ ### Developer Identity
256
+
257
+ The `dev_initials` and `author` parameters must identify the real human developer. **Never guess or use placeholders.** To obtain them:
258
+ 1. Check `git config user.name` and derive initials
259
+ 2. Check recent features: `SELECT dev_initials, author FROM features WHERE dev_initials IS NOT NULL ORDER BY created_at DESC LIMIT 1`
260
+ 3. Ask the developer if neither source is available
261
+
262
+ ### Commit Tracking
263
+
264
+ After each commit, document it in State Changes Required:
265
+
266
+ ```markdown
267
+ ### Record Commit
268
+ - **Feature ID**: AUTH-001
269
+ - **Commit Hash**: abc123
270
+ - **Phase**: 4 (Builder)
271
+ - **Message**: feat(AUTH-001): implement login endpoint
272
+ - **Files Changed**: 5
273
+ - **Insertions**: 120
274
+ - **Deletions**: 30
275
+ ```
276
+
277
+ The orchestrator records commits via `record_commit()` in Supabase.
278
+
279
+ ---
280
+
281
+ ## CRITICAL: NEVER Auto-Merge Pull Requests
282
+
283
+ **Agents can CREATE pull requests but NEVER merge them.**
284
+
285
+ PR merging is ALWAYS a human decision. This applies to ALL agents with git/gh access. No exceptions. No "auto-merge if tests pass." No "merge if approved." NEVER.
286
+
287
+ - **Release agent**: Creates PR via `gh pr create`, records PR URL via `record_pr()`, then STOPS
288
+ - **Human**: Reviews, approves, and merges the PR
289
+ - **After merge**: Human (or agent on instruction) calls `record_merge()` to update tracking
290
+
291
+ ---
292
+
293
+ ## Learning Creation — Mandatory for Bug Fixes
294
+
295
+ **Every bug fix MUST create a learning.** Before closing any fix, ask yourself:
296
+
297
+ > *Is this a learning? If the fix involved a non-obvious cause, a gotcha, or a pattern others would hit — create a learning and declare propagation targets.*
298
+
299
+ Include in your artifact's Memory Candidates section:
300
+ - **Category**: GOTCHA, PATTERN, CONVENTION, etc.
301
+ - **Title**: Concise description of the insight
302
+ - **Content**: What happened, why, and how to avoid it
303
+ - **Propagation targets**: Which skills, agent definitions, or AGENTS.md should receive this
304
+
305
+ **When in doubt, create the learning.** It's better to have a low-importance learning than to lose a hard-won insight.
306
+
307
+ ---
308
+
309
+ ## Post-Release Verification
310
+
311
+ After the Release phase completes, the orchestrator should verify the deployed/running application shows correct data. This includes:
312
+
313
+ 1. **Spot-check a known DB value** against the rendered output (e.g., a feature's status in Supabase vs. what the dashboard shows)
314
+ 2. **Verify key pages render** without errors
315
+ 3. **Check data freshness** — ensure the UI reflects recent changes, not cached stale data
316
+
317
+ If bugs are found post-release:
318
+ 1. Create a **new L1 feature** to track the fix (don't fix ad-hoc without tracking)
319
+ 2. The fix feature MUST create a learning before completion
320
+ 3. The learning MUST declare propagation targets
321
+
322
+ ---
323
+
324
+ ## CRITICAL: NEVER Skip Phases OR Steps
325
+
326
+ **All 11 phases must be executed for every feature.** This is enforced at the database level — `transition_phase()` will reject any attempt to skip a phase.
327
+
328
+ **All steps within each phase must also be executed.** Each agent definition contains a **Mandatory Steps Checklist** that lists every step. No step may be silently skipped.
329
+
330
+ - Forward transitions must be sequential: 0→1→2→3→4→5→6→7→8→9
331
+ - Complexity level (L1/L2/L3) affects **depth** within each phase and step, not which phases or steps run
332
+ - An L1 phase can be a single sentence, but it must still be recorded
333
+ - An L1 step can produce minimal output, but it must still execute
334
+ - When documenting State Changes, always use `To Phase: [current + 1]`
335
+ - Phase 10 (Complete) is set by `complete_feature()`, not by `transition_phase()`
336
+
337
+ **If you think a phase or step is unnecessary**: You're wrong. Execute it briefly. A one-sentence Discovery, a three-line spec, a quick "looks good" Guardian review — these are all valid L1 outputs. If a step truly does not apply (e.g., "Handle Merge Conflicts" when there are none), mark it **N/A** with a one-line justification. Never silently skip it.
338
+
339
+ ---
340
+
341
+ ## Step Execution Protocol
342
+
343
+ Before each phase, the orchestrator MUST:
344
+
345
+ 1. **Read the agent definition** for the upcoming phase
346
+ 2. **Identify the Mandatory Steps Checklist** at the top of the agent's process section
347
+ 3. **Execute every step** in order, or mark it N/A with justification
348
+ 4. **Never silently skip a step** — if a step seems unnecessary, state why and mark N/A
349
+
350
+ **Enforcement levels**:
351
+ | Level | What | Enforced By | Mechanism |
352
+ |-------|------|-------------|-----------|
353
+ | Phase | All 11 phases run (0-10) | Database | `transition_phase()` rejects skips |
354
+ | Step | All steps within a phase run | Agent checklist | Orchestrator verifies each step |
355
+
356
+ **Complexity affects depth, not coverage**:
357
+ | Complexity | Phase Depth | Step Depth |
358
+ |------------|-------------|------------|
359
+ | L1 | 1-3 sentences | Minimal output per step |
360
+ | L2 | Full paragraphs | Standard output per step |
361
+ | L3 | Comprehensive sections | Detailed output per step |
362
+
363
+ ---
364
+
365
+ ## What ALL Agents Must NOT Do
366
+
367
+ - Try to call MCP tools directly (you don't have access)
368
+ - Skip documenting State Changes Required
369
+ - Proceed without skills loaded
370
+ - **Skip phases** — all 11 phases must execute, even for L1 tasks (see above)
371
+ - **Skip steps** — all steps within your phase must execute, even for L1 tasks (see above)
372
+ - Continue past reasonable duration for your phase (document DURATION_EXCEEDED blocker and stop)
373
+ - Make up file paths or code patterns — use only what you can verify from context
374
+ - Override decisions made in earlier phases
375
+ - **Merge pull requests** — PR merging is ALWAYS a human decision (see above)
376
+ - **Skip learning creation after bug fixes** — every fix is a potential learning (see above)
377
+ - **Continue work after PR creation** — When a feature is complete and a PR has been created, ALL work MUST stop. Do NOT switch branches, start the next task, stash changes, or begin planning. Report the PR URL and EVAL score, then STOP and wait for the developer to review and merge. The ONLY exception is if the developer explicitly says "continue."