buildanything 1.2.1 → 1.5.0

@@ -0,0 +1,99 @@
+ # Brainstorm Protocol
+
+ You are the orchestrator running a structured brainstorming session to turn a raw idea into a validated design document.
+
+ ## How This Works
+
+ You ask questions one at a time, propose approaches with trade-offs, and converge on decisions. The output is a Design Document saved to `docs/plans/`.
+
+ This is a CONVERSATION, not a monologue. Each step involves the user.
+
+ ## Step 1: Understand the Idea
+
+ Read the build request and any existing context (brainstorm docs, decision briefs, conversation history, existing code).
+
+ Ask the user 3-5 targeted questions to fill gaps. Ask ONE question at a time, wait for the answer, then ask the next. Do not dump all questions at once.
+
+ Focus on:
+ - **Who is the user?** Who will use this, and what's their primary pain point?
+ - **What's the core flow?** What does the user DO in the product? Walk through the main interaction.
+ - **What's the scope?** What's in the MVP vs. what's deferred?
+ - **What are the constraints?** Tech stack preferences, budget, timeline, existing systems to integrate with.
+ - **What does success look like?** How will you know this works?
+
+ Skip questions the user already answered in their build request or prior context.
+
+ ## Step 2: Propose Approaches
+
+ For each major design decision, propose 2-3 approaches with trade-offs:
+
+ ```
+ DECISION: [e.g., "Data storage approach"]
+
+ Option A: [approach] — [1-line trade-off summary]
+ Option B: [approach] — [1-line trade-off summary]
+ Option C: [approach] — [1-line trade-off summary]
+
+ My recommendation: [which and why, in 1 sentence]
+ ```
+
+ Major decisions typically include:
+ - Tech stack (framework, language, database, hosting)
+ - Data model (what entities, how they relate)
+ - Primary user flow (step by step)
+ - Authentication approach
+ - External service integrations
+ - MVP scope boundary (in vs. out)
+
+ Let the user pick or modify. Do not force your recommendation.
+
+ ## Step 3: Write the Design Document
+
+ After decisions converge, produce a Design Document and save to `docs/plans/YYYY-MM-DD-[topic]-design.md`:
+
+ ```markdown
+ # [Project Name] — Design Document
+
+ ## Vision
+ [1-2 sentences: what this is and who it's for]
+
+ ## Primary User
+ [Who they are, what they need, why current alternatives fail them]
+
+ ## Core User Flow
+ [Step-by-step: what the user does, numbered list]
+
+ ## Tech Stack
+ [Each choice with 1-line rationale]
+
+ ## Data Model
+ [Key entities and relationships — tables, fields, types]
+
+ ## External Integrations
+ [APIs, services, and what they're used for]
+
+ ## MVP Scope
+ **In:** [bulleted list of what's included]
+ **Deferred:** [bulleted list of what's explicitly NOT in v1]
+
+ ## Key Decisions
+ [Numbered list of decisions made during brainstorming, with brief rationale]
+
+ ## Open Questions
+ [Anything unresolved that architecture or research needs to answer]
+ ```
+
+ Present the document to the user for approval before proceeding.
+
+ ---
+
+ ## Autonomous Mode (no user present)
+
+ If running in autonomous mode, you cannot ask questions interactively. Instead:
+
+ 1. Read all available context (build request, existing docs, code).
+ 2. For each major decision, pick the most pragmatic option and document your rationale.
+ 3. Bias toward: proven tech, simpler architecture, smaller MVP scope.
+ 4. Write the Design Document as above.
+ 5. Log all decisions and rationale to `docs/plans/build-log.md`.
+ 6. Proceed without user approval.
@@ -0,0 +1,52 @@
+ # Build-Fix Protocol (One Error at a Time)
+
+ You are the orchestrator. A build, type-check, or lint check has failed. Do NOT dump all errors on a fix agent. Most build errors cascade — fixing the root cause clears 5-10 downstream errors.
+
+ ## When to Use
+
+ When the Verification Protocol reports FAIL on Build, Type-Check, or Lint checks. Also usable during Phase 3 scaffolding or Phase 4 implementation when builds break.
+
+ ## Step 1: Extract First Error
+
+ Parse the failure output from the verification agent. Extract the FIRST error only:
+ - File path
+ - Line number (if available)
+ - Error message
+
+ Ignore all other errors. They are likely cascading from this one.
+
+ ## Step 2: Fix
+
+ Call the Agent tool — description: "Fix [error]" — mode: "bypassPermissions" — prompt:
+
+ "[COMPLEXITY: S] Fix this single build error. FILE: [path]. LINE: [number]. ERROR: [message]. Fix this specific error. Do not fix other errors. Do not refactor. Commit: 'fix: [error description]'."
+
+ > Pass ONLY the single error. Do not show the fix agent the full error log.
+
+ ## Step 3: Rebuild
+
+ Re-run ONLY the failing check (not all 6 verification checks). Count errors in the new output.
+
+ ## Step 4: Evaluate
+
+ - **0 errors:** DONE. Return FIXED to the calling protocol.
+ - **Error count decreased:** Log "CASCADE: fixed 1 error, resolved [N] total." Return to Step 1 with the new first error.
+ - **Error count same or increased:** The fix was bad. Revert: `git revert HEAD --no-edit`. Try the SECOND error from the original output instead. If already tried 2 different errors, return FAILED.
+ - **Iteration count >= 5:** Return PARTIAL with remaining error count.
+
+ ## Step 5: Report
+
+ Return to the orchestrator one of:
+ - **FIXED** — all errors resolved
+ - **PARTIAL** — [N] errors remain after 5 iterations
+ - **FAILED** — could not make progress
+
+ ---
+
+ ## Rules
+
+ - ONE error per fix agent. Never show a fix agent multiple errors.
+ - Revert bad fixes immediately. Do not accumulate broken fixes.
+ - Max 5 fix iterations per build-fix invocation.
+ - The fix agent is a SEPARATE agent from the verification agent. Fresh context.
+ - Track iteration count and error count delta in `docs/plans/.build-state.md`.
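
The Step 4 rules of the Build-Fix Protocol above can be read as a small decision function. The following is a hypothetical Python sketch, not code shipped in the package. The function name, the outcome labels `CONTINUE` and `TRY_SECOND_ERROR`, and the precedence between the iteration cap and the other branches are assumptions, since the protocol does not spell out which rule wins when several apply.

```python
MAX_ITERATIONS = 5        # "Max 5 fix iterations per build-fix invocation"
MAX_DISTINCT_ERRORS = 2   # "If already tried 2 different errors, return FAILED"


def evaluate(prev_errors: int, new_errors: int,
             iteration: int, errors_tried: int) -> tuple[bool, str]:
    """Return (revert_last_fix, outcome) for one rebuild result.

    iteration    -- fix attempts completed so far, including this one
    errors_tried -- distinct errors attempted from the current output
    """
    if new_errors == 0:
        return (False, "FIXED")
    # A fix that did not lower the error count is treated as bad.
    bad_fix = new_errors >= prev_errors
    if iteration >= MAX_ITERATIONS:
        # Assumption: a bad final fix is still reverted before reporting.
        return (bad_fix, "PARTIAL")
    if not bad_fix:
        return (False, "CONTINUE")          # back to Step 1, new first error
    if errors_tried >= MAX_DISTINCT_ERRORS:
        return (True, "FAILED")
    return (True, "TRY_SECOND_ERROR")
```

A caller would run `git revert HEAD --no-edit` whenever the first element is `True`, then act on the outcome label.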
@@ -0,0 +1,56 @@
+ # Cleanup Protocol (De-Sloppify)
+
+ You are the orchestrator. An implementation agent just finished a task. Before running the metric loop, you run a focused cleanup pass on the changed files.
+
+ The insight: two focused agents outperform one constrained agent. The implementer optimizes for "make it work." The cleaner optimizes for "make it right."
+
+ ## When to Skip
+
+ If the implementation was trivial — single config file change, < 20 lines changed total — skip this protocol. The overhead isn't worth it.
+
+ ## Step 1: Collect the Changeset
+
+ Get the authoritative list of files changed by running `git diff --name-only HEAD~1` (or checking the implementation agent's commit). Do not rely solely on the agent's self-reported file list — use git as the source of truth. This is the cleanup scope. Nothing outside this list gets touched.
+
+ ## Step 2: Invoke the Cleanup Agent
+
+ Call the Agent tool — description: "Cleanup [task name]" — mode: "bypassPermissions" — prompt:
+
+ "You are a code quality cleanup agent. Your job is to improve code quality in the files listed below WITHOUT changing behavior.
+
+ FILES IN SCOPE:
+ [list of files changed by the implementer]
+
+ ACCEPTANCE CRITERIA (do not break these):
+ [paste the task's acceptance criteria]
+
+ FIX these issues if you find them:
+ - Naming inconsistencies (variables, functions, files)
+ - Dead code and unused imports
+ - Redundant or duplicate imports
+ - Unclear variable or function names
+ - Missing error handling
+ - Code style violations
+ - Obvious DRY violations within the changed files
+
+ DO NOT:
+ - Add features or change behavior
+ - Modify the architecture or file structure
+ - Touch files outside the list above
+ - Refactor code that wasn't part of this task
+ - Modify tests unless fixing a broken assertion caused by the implementer
+
+ When finished, commit: 'refactor: cleanup [task name]'."
+
+ ## Step 3: Verify
+
+ After the cleanup agent finishes, spot-check that acceptance criteria still hold. If the cleanup agent broke something, revert its commit and log the issue to `docs/plans/build-log.md`. Then proceed to the metric loop without cleanup.
+
+ ---
+
+ ## Rules
+
+ - The cleanup agent is a SEPARATE Agent tool call from the implementer. No cleaning your own mess.
+ - Scope is sacred. Only files from the implementation changeset. Zero exceptions.
+ - This runs AFTER implementation, BEFORE the metric loop.
+ - If cleanup breaks acceptance criteria, revert and skip. Never block the metric loop on a cleanup failure.
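
The "scope is sacred" rule in the Cleanup Protocol above lends itself to a mechanical check. The sketch below is a hypothetical Python illustration and is not part of the package: it parses `git diff --name-only` output into a scope set and reports any file a cleanup commit touched outside that scope. The helper names and the example paths are assumptions.

```python
import subprocess


def parse_name_only(output: str) -> set[str]:
    """Turn `git diff --name-only` output into a set of paths."""
    return {line.strip() for line in output.splitlines() if line.strip()}


def changed_files(rev: str = "HEAD~1") -> set[str]:
    """Ask git for the changed files, as Step 1 describes (needs a git repo)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", rev],
        capture_output=True, text=True, check=True,
    )
    return parse_name_only(result.stdout)


def out_of_scope(scope: set[str], touched: set[str]) -> list[str]:
    """Files the cleanup agent touched that were not in the changeset."""
    return sorted(touched - scope)
```

A non-empty result from `out_of_scope` would be one possible trigger for the revert-and-skip path in Step 3.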
@@ -0,0 +1,62 @@
+ # Eval Harness Protocol
+
+ You are the orchestrator. Phase 5.1 audits are complete. Before running the metric loop, define formal eval cases that are concrete, executable, and reproducible. This replaces subjective narrative audits with deterministic pass/fail tests.
+
+ ## How This Differs from the Metric Loop
+
+ The metric loop answers "how good is this?" (qualitative score 0-100, iterative improvement).
+ The eval harness answers "does this specific behavior work reliably?" (binary pass/fail, deterministic).
+
+ They are complementary: eval harness failures become specific issues for the metric loop to fix.
+
+ ## Step 0: Define Eval Cases
+
+ YOU (the orchestrator) define eval cases based on:
+ - Audit findings from Phase 5.1 (highest-severity items first)
+ - Architecture doc (API contracts, auth model, data validation rules)
+ - Design doc (core user flows, edge cases)
+
+ Write eval cases to `docs/plans/.build-state.md` under `## Eval Harness`:
+
+ | # | Name | Action | Expected Result | pass@k | Severity |
+ |---|------|--------|-----------------|--------|----------|
+
+ **Severity thresholds (non-negotiable):**
+ - CRITICAL: pass@5 (must pass 5/5 — 100% reliability)
+ - HIGH: pass@4 (must pass 4/5 — 80% reliability)
+ - MEDIUM: pass@3 (must pass 3/5 — 60% reliability)
+
+ Aim for 8-15 eval cases. Cover: auth boundaries, input validation, error handling, core happy path, primary edge cases.
+
+ **Eval cases must be concrete and executable** — actual commands (curl, function calls, UI interactions), not descriptions. Bad: "Auth should work." Good: "curl -X GET /api/recipes without Authorization header → expect 401."
+
+ ## Step 1: Run Eval
+
+ Call the Agent tool — description: "Run eval harness" — mode: "bypassPermissions" — prompt:
+
+ "[COMPLEXITY: M] Run these eval cases. For each case, execute the action the specified number of times (k). Report per case: PASS (N/k passed, meets threshold) or FAIL (N/k passed, below threshold). Include the actual result on failures. [paste eval case table]"
+
+ <HARD-GATE>
+ The eval agent RUNS cases. It does NOT define them. Case definition is the orchestrator's job.
+ </HARD-GATE>
+
+ ## Step 2: Score
+
+ Count PASS cases / total cases. This is the eval baseline. Record to `docs/plans/.build-state.md`.
+
+ ## Step 3: Feed into Metric Loop
+
+ Any FAIL case with severity CRITICAL or HIGH becomes a candidate issue for the Phase 5.2 metric loop. Pass the failure details (case name, action, expected vs actual) as context when defining the metric loop's metric.
+
+ ## Step 4: Re-evaluate After Metric Loop
+
+ After the Phase 5.2 metric loop exits, re-run the eval harness. All CRITICAL cases must now pass. If any CRITICAL case still fails, flag it for the Reality Checker in Step 5.3.
+
+ ---
+
+ ## Rules
+
+ - Eval cases are defined by the ORCHESTRATOR, not by the eval agent.
+ - pass@k thresholds are non-negotiable per severity level.
+ - Re-run eval after metric loop to verify fixes — this is the exit gate.
+ - Eval failures feed into the metric loop as specific, concrete issues — not vague audit findings.
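
The severity thresholds and the Step 2 and Step 3 bookkeeping in the Eval Harness Protocol above amount to a few lines of arithmetic. Here is a minimal Python sketch, offered as an illustration only. It assumes every case is run five times, as the "N/5" thresholds imply, and the function names and sample case names are hypothetical.

```python
RUNS_PER_CASE = 5
# Minimum passing runs out of 5, taken from the severity thresholds.
REQUIRED_PASSES = {"CRITICAL": 5, "HIGH": 4, "MEDIUM": 3}


def verdict(severity: str, passes: int) -> str:
    """PASS if the case met its severity threshold, else FAIL."""
    return "PASS" if passes >= REQUIRED_PASSES[severity] else "FAIL"


def summarize(results: list[tuple[str, str, int]]) -> tuple[int, int, list[str]]:
    """results holds (case name, severity, passing runs out of 5).

    Returns (passed cases, total cases, failing CRITICAL/HIGH case names);
    the last element is the candidate list for the metric loop.
    """
    verdicts = [(name, sev, verdict(sev, passes)) for name, sev, passes in results]
    passed = sum(1 for _, _, v in verdicts if v == "PASS")
    candidates = [name for name, sev, v in verdicts
                  if v == "FAIL" and sev in ("CRITICAL", "HIGH")]
    return passed, len(results), candidates
```

Under this reading a MEDIUM failure lowers the baseline but is not forwarded to the metric loop, which matches Step 3 as written.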
@@ -0,0 +1,94 @@
+ # Metric Loop Protocol
+
+ You are the orchestrator. You are about to run a metric-driven iteration loop on an artifact (code, architecture, docs, etc.) to drive it toward a quality target.
+
+ ## Step 0: Define Your Metric
+
+ Before iterating, YOU define the metric for this specific context. Consider:
+ - What is the artifact? (a task implementation, a security audit, an architecture doc, etc.)
+ - What does "good" look like? (all tests pass, zero critical vulns, all acceptance criteria met, etc.)
+ - Is the metric quantitative (test pass rate, vuln count, coverage %) or qualitative (architecture completeness, doc clarity)?
+
+ Write a **Metric Definition** block to `docs/plans/.build-state.md`:
+
+ ```
+ ## Active Metric Loop
+ Phase: [current phase]
+ Artifact: [what you're iterating on]
+ Metric: [what you're measuring, in one sentence]
+ How to measure: [what the measurement agent should do — run tests, audit code, check criteria, etc.]
+ Target: [score 0-100 at which you stop]
+ Max iterations: [hard cap, default 5]
+ ```
+
+ Then create a score log table:
+
+ ```
+ | Iter | Score | Delta | Top Issue | Files |
+ |------|-------|-------|-----------|-------|
+ ```
+
+ When starting a new metric loop, REPLACE the previous Active Metric Loop section (if any). There is only ever ONE active metric loop. Previous loop results should already be recorded in their phase's section above. When the loop completes (Step 2 exit), rename the section header from `## Active Metric Loop` to `## Completed Metric Loop — [Phase N]` and leave it for historical reference.
+
+ If you are in Phase 4, also record the current sub-step for the overall task cycle (not all of these are within the metric loop itself):
+ ```
+ Sub-step: [4.1 Implement | 4.1b Cleanup | 4.2 Metric Loop | 4.3 Loop Exit | 4.4 Verify]
+ ```
+ This tells the orchestrator exactly where to resume after context compaction.
+
+ ## Step 1: MEASURE
+
+ Call the Agent tool — description: "Measure [metric]" — prompt:
+
+ "[How to measure, from your metric definition]. Score the current state 0-100. Return your response with a clear SCORE: [number] line, a list of FINDINGS, and the single TOP ISSUE most likely to improve the score if fixed."
+
+ Read the agent's response. You need: the SCORE, the TOP ISSUE, and the file paths for diagnosis in Step 3. Record the score to `docs/plans/.build-state.md`. The full findings list is useful for diagnosis but does NOT need to persist in your context across iterations — once you've picked the top issue, the details of lower-priority findings can go. Append a row to the score log in `docs/plans/.build-state.md`:
+
+ | Iter | Score | Delta | Top Issue | Files |
+ |------|-------|-------|-----------|-------|
+
+ ## Step 2: CHECK EXIT
+
+ Stop the loop if ANY of these:
+
+ - **Score >= target** → done. Log "Target met at iteration [N]."
+ - **Iteration >= max** → done. Log "Max iterations reached. Final score: [N]."
+ - **Stall: last 2 scores show no improvement** (delta <= 0 twice in a row) → done. Log "Stalled at score [N]."
+
+ On stall or max iterations:
+ - **Interactive mode:** present score history + top remaining issue to user. Ask for direction.
+ - **Autonomous mode:** if score >= 60% of target, accept with warning. Otherwise skip. Log to `docs/plans/build-log.md`.
+
+ If not exiting, continue to Step 3.
+
+ ## Step 3: DIAGNOSE
+
+ Look at the findings from Step 1. Pick the ONE highest-impact issue — the single fix most likely to move the score. Do not try to fix everything at once. This is the autoresearch insight: one targeted change per iteration, measured impact.
+
+ ## Step 4: IMPROVE
+
+ Call the Agent tool — description: "Fix [top issue]" — mode: "bypassPermissions" — prompt:
+
+ "TARGETED FIX: [specific issue to fix, from diagnosis]. CONTEXT: [relevant architecture/criteria]. Make this specific change. Do not refactor unrelated code. Commit: 'fix: [description]'."
+
+ > **Do NOT pass the measurement agent's full findings to this agent. Only pass the single diagnosed issue and relevant file paths.**
+
+ ## Step 5: LOOP
+
+ Return to Step 1. Re-measure the artifact after the fix.
+
+ ---
+
+ ## Rules
+
+ <HARD-GATE>
+ AUTHOR-BIAS ELIMINATION: The measurement agent and the fix agent must NEVER share context.
+ - They MUST be separate Agent tool calls (separate subprocesses, separate context windows).
+ - The fix agent receives ONLY: (a) the single top issue diagnosed in Step 3, (b) the relevant file paths, (c) the acceptance criteria. It does NOT receive the measurement agent's full findings, score breakdown, or other issues.
+ - The measurement agent in the next iteration does NOT know what the fix agent did — it measures the artifact fresh.
+ - Rationale: When a reviewer shares context with an implementer, the implementer unconsciously optimizes for the reviewer's framing rather than actual quality.
+ </HARD-GATE>
+ - One fix per iteration. Measure its impact before fixing the next thing.
+ - Track ALL scores in `docs/plans/.build-state.md` so the history survives context compaction.
+ - If context was compacted mid-loop: read `docs/plans/.build-state.md`, find the Active Metric Loop section, resume from the last recorded iteration.
+ - CONTEXT HYGIENE: Measurement agents are analysis agents — read their full output for diagnosis. But once you've picked the top issue (Step 3) and dispatched the fix (Step 4), the detailed findings from THAT iteration are spent. Don't accumulate findings across iterations — each measurement is fresh.
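
The three exit conditions in Step 2 of the Metric Loop Protocol above, plus the autonomous-mode acceptance rule, can be sketched as follows. This is a hypothetical Python illustration, not code from the package. It assumes the score history is a plain list with one entry per iteration, and that a stall needs three recorded scores, because "delta <= 0 twice in a row" requires two deltas.

```python
from typing import Optional


def check_exit(scores: list[float], target: float,
               max_iterations: int = 5) -> Optional[str]:
    """Return the exit reason, or None if the loop should continue."""
    if not scores:
        return None
    if scores[-1] >= target:
        return "target met"
    if len(scores) >= max_iterations:
        return "max iterations"
    if len(scores) >= 3:
        last_delta = scores[-1] - scores[-2]
        prev_delta = scores[-2] - scores[-3]
        if last_delta <= 0 and prev_delta <= 0:
            return "stalled"
    return None


def autonomous_accepts(score: float, target: float) -> bool:
    """Autonomous mode accepts with a warning at 60% of target or better."""
    return score >= 0.6 * target
```

With a target of 80, the history `[40, 50, 50, 48]` stalls at iteration 4, and the final score of 48 sits exactly on the 60% line that autonomous mode uses.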
@@ -0,0 +1,56 @@
+ # Planning Protocol
+
+ You are the orchestrator converting a validated Design Document and Architecture Document into an ordered, developer-ready task list.
+
+ ## Input
+
+ You need two documents before running this protocol:
+ - **Design Document** (`docs/plans/YYYY-MM-DD-[topic]-design.md`) — scope, user flows, data model, tech stack
+ - **Architecture Document** (`docs/plans/architecture.md`) — services, API contracts, database schema, component tree
+
+ ## Step 1: Break Down
+
+ Decompose the architecture into ordered, atomic tasks. Each task must be:
+
+ - **Implementable independently** — a developer agent can build it without needing unfinished work from other tasks
+ - **Testable** — there are concrete acceptance criteria that can be verified
+ - **Scoped to MVP** — if the design doc says a feature is deferred, do not create tasks for it
+
+ For each task:
+
+ ```
+ ### Task [N]: [name]
+ **Type:** frontend / backend / integration / infrastructure
+ **Description:** [what to build, 2-3 sentences]
+ **Acceptance Criteria:**
+ - [ ] [specific, verifiable criterion]
+ - [ ] [specific, verifiable criterion]
+ **Dependencies:** [task numbers that must complete first, or "none"]
+ **Size:** S (< 1 hour) / M (1-3 hours) / L (3+ hours)
+ ```
+
+ ## Step 2: Order
+
+ Order tasks by dependency chain, then by priority within each dependency level:
+
+ 1. Infrastructure/scaffolding first (project setup, database schema, base config)
+ 2. Core data model and API endpoints
+ 3. Primary user flow (the main thing the user does)
+ 4. Supporting features
+ 5. Polish, error handling, edge cases
+
+ Flag any circular dependencies — these indicate an architecture problem that needs resolution before building.
+
+ ## Step 3: Validate
+
+ Check the task list against the design doc:
+
+ - Every feature in MVP scope has at least one task
+ - No task exceeds the MVP boundary
+ - No task is too large (L tasks should be split if possible)
+ - Dependency chains are no deeper than 3 levels
+ - Acceptance criteria are specific enough that a developer agent can verify them without ambiguity
+
+ ## Step 4: Save
+
+ Save to `docs/plans/sprint-tasks.md`.
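
The ordering and validation steps in the Planning Protocol above (dependency order, circular-dependency detection, and the three-level depth limit) map onto a topological sort. The sketch below is a hypothetical Python illustration using the standard-library `graphlib` module (Python 3.9+), not code from the package. It assumes tasks are identified by number and that a task with no dependencies sits at level 1; the protocol does not define how levels are counted.

```python
from graphlib import CycleError, TopologicalSorter

MAX_LEVELS = 3  # "Dependency chains are no deeper than 3 levels"


def order_tasks(deps: dict[int, list[int]]) -> list[int]:
    """Order tasks so every dependency comes before its dependents.

    deps maps a task number to the task numbers it depends on.
    """
    try:
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the detected cycle as a list of nodes.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


def chain_levels(deps: dict[int, list[int]]) -> dict[int, int]:
    """Level of each task: 1 with no dependencies, else 1 + deepest dependency."""
    levels: dict[int, int] = {}
    for task in order_tasks(deps):
        levels[task] = 1 + max((levels[d] for d in deps.get(task, [])), default=0)
    return levels


def too_deep(deps: dict[int, list[int]]) -> list[int]:
    """Tasks whose dependency chain exceeds the allowed depth."""
    return sorted(t for t, lvl in chain_levels(deps).items() if lvl > MAX_LEVELS)
```

This covers only the dependency part of Step 2. Ordering by priority within a level would still need the task types from the task template.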
@@ -0,0 +1,63 @@
+ # Verification Protocol
+
+ You are the orchestrator. You are about to run a deterministic verification gate — a fast, sequential pass/fail check that catches regressions before expensive audit agents run.
+
+ ## When to Run
+
+ Run this protocol at every phase boundary: after scaffolding, after each task, before final review. It is cheap. Run it often.
+
+ ## Step 1: Detect Stack
+
+ Before running checks, detect the project's stack from manifest files:
+
+ | Manifest | Stack | Build | Types | Lint | Test | Security |
+ |----------|-------|-------|-------|------|------|----------|
+ | `package.json` | Node | `npm run build` | `npx tsc --noEmit` | `npm run lint` | `npm test` | `npm audit` |
+ | `requirements.txt` / `pyproject.toml` | Python | — | `mypy .` | `ruff check .` | `pytest` | `pip audit` |
+ | `go.mod` | Go | `go build ./...` | (included in build) | `golangci-lint run` | `go test ./...` | `govulncheck ./...` |
+ | `Cargo.toml` | Rust | `cargo build` | (included in build) | `cargo clippy` | `cargo test` | `cargo audit` |
+
+ Skip any check that does not apply (e.g., skip Build for a pure Python script, skip Type-Check for JavaScript without TypeScript). A skipped check counts as PASS.
+
+ ## Step 2: Run Checks Sequentially
+
+ Call the Agent tool — description: "Verify [phase name]" — mode: "bypassPermissions" — prompt:
+
+ "Run the Verification Protocol. Execute all 6 checks sequentially, stop on first failure. Report: VERIFY: PASS (6/6) or VERIFY: FAIL at step [N] — [check name]: [reason]."
+
+ The agent runs these checks in order, stopping on the first FAIL:
+
+ | # | Check | What it does |
+ |---|-------|-------------|
+ | 1 | Build | Project compiles/bundles without errors |
+ | 2 | Type-Check | No type errors (tsc, mypy, etc.) |
+ | 3 | Lint | No lint violations |
+ | 4 | Test | All tests pass |
+ | 5 | Security | No known vulnerabilities in deps |
+ | 6 | Diff Review | `git diff` of uncommitted changes — no debug code, no secrets, no obvious regressions |
+
+ <HARD-GATE>
+ ONE AGENT, ONE PASS: The orchestrator spawns exactly ONE agent for the entire verification. This is a single Agent tool call, not 6 separate agents. The agent runs each check as a sequential shell command and evaluates the result before proceeding.
+ </HARD-GATE>
+
+ ## Step 3: Handle Result
+
+ **On PASS:** Log `VERIFY: PASS (6/6)` to `docs/plans/.build-state.md`. Proceed to next phase.
+
+ **On FAIL:** Read the failure reason and spawn a targeted fix agent:
+
+ | Failed Check | Fix Strategy |
+ |-------------|-------------|
+ | Build / Type-Check / Lint | Run the Build-Fix Protocol (`commands/protocols/build-fix.md`). It isolates the first error, fixes it, rebuilds, detects cascade resolution, and reverts bad fixes automatically. |
+ | Test | Spawn fix agent: "Fix the failing test: [test name]. Read the test, read the implementation, fix the implementation — not the test — unless the test is wrong." |
+ | Security | Spawn fix agent: "Resolve vulnerability: [advisory]. Update the dependency or apply the recommended remediation." |
+ | Diff Review | Spawn fix agent: "Remove debug code / hardcoded secrets / regressions found in diff review: [details]." |
+
+ After the fix agent completes, re-run verification from Step 2.
+
+ <HARD-GATE>
+ MAX 3 FIX ATTEMPTS: If verification fails 3 times on the same phase:
+ - **Interactive mode:** present the failure history to the user. Ask for direction.
+ - **Autonomous mode:** log the failure to `docs/plans/build-log.md` and proceed with a warning.
+ Do not loop forever.
+ </HARD-GATE>
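
Steps 1 and 2 of the Verification Protocol above (manifest-based stack detection, then sequential checks that stop on the first failure) can be sketched as a lookup table plus a short loop. This is a hypothetical Python illustration, not code from the package. The command strings are copied verbatim from the Step 1 table. The precedence when several manifests are present is an assumption, and check 6 (Diff Review) is left out because it is not a single shell command. Note that the PyPA auditing tool is normally invoked as `pip-audit`, so the `pip audit` entry may need adjusting in practice.

```python
from typing import Callable, Optional

CHECK_ORDER = ["Build", "Types", "Lint", "Test", "Security"]

# None marks a check with no separate command for that stack.
NODE = ["npm run build", "npx tsc --noEmit", "npm run lint", "npm test", "npm audit"]
PYTHON = [None, "mypy .", "ruff check .", "pytest", "pip audit"]
GO = ["go build ./...", None, "golangci-lint run", "go test ./...", "govulncheck ./..."]
RUST = ["cargo build", None, "cargo clippy", "cargo test", "cargo audit"]

# Assumption: the first matching manifest wins if several are present.
MANIFESTS = [
    ("package.json", "Node", NODE),
    ("requirements.txt", "Python", PYTHON),
    ("pyproject.toml", "Python", PYTHON),
    ("go.mod", "Go", GO),
    ("Cargo.toml", "Rust", RUST),
]


def detect_stack(filenames: set[str]) -> Optional[tuple[str, dict]]:
    """Return (stack name, {check name: command or None}), or None if unknown."""
    for manifest, stack, commands in MANIFESTS:
        if manifest in filenames:
            return stack, dict(zip(CHECK_ORDER, commands))
    return None


def run_checks(commands: dict,
               run: Callable[[str], bool]) -> tuple[bool, Optional[str]]:
    """Run checks in order; stop at the first failure.

    run(cmd) executes one command and returns True on success.
    A check without a command is skipped, which counts as PASS.
    """
    for name in CHECK_ORDER:
        cmd = commands.get(name)
        if cmd is None:
            continue
        if not run(cmd):
            return (False, name)
    return (True, None)
```

Passing `run` in as a callable keeps the sketch testable without a real toolchain; a real caller might wrap `subprocess.run` and compare the return code to zero.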
package/hooks/hooks.json CHANGED
@@ -14,11 +14,11 @@
  ],
  "PreCompact": [
  {
- "matcher": "",
+ "matcher": ".*",
  "hooks": [
  {
  "type": "prompt",
- "prompt": "ORCHESTRATOR STATE SAVE — Context is about to be compacted. If you are running the /buildanything:build pipeline, you MUST do these things NOW before your context is lost:\n\n1. Use TodoWrite to update all task statuses (complete, in-progress, pending)\n2. Write `docs/plans/.build-state.md` with: current phase, step, task progress, retry counter, agents used, pending quality gate results\n3. The next thing that will happen after compaction is the SessionStart hook will fire and re-inject your orchestrator identity. But you MUST save state NOW or your progress tracking is lost."
+ "prompt": "ORCHESTRATOR STATE SAVE — Context is about to be compacted. If you are running the /buildanything:build pipeline, you MUST do these things NOW before your context is lost:\n\n1. Save all task statuses to docs/plans/.build-state.md (TodoWrite does NOT survive compaction — .build-state.md is your only persistent store)\n2. Write `docs/plans/.build-state.md` with ALL of the following:\n - Current phase and step\n - Task progress (which tasks done, which in progress)\n - If you are in a metric loop: the Active Metric Loop section with metric definition, current iteration, full score history table, and what action to take next\n - Agents used so far\n - Whether running in autonomous mode\n - dispatches_since_save and last_save values\n3. The next thing after compaction is the SessionStart hook re-injecting your state. Save EVERYTHING or you lose your metric loop progress."
  }
  ]
  }
@@ -1,7 +1,6 @@
  #!/usr/bin/env bash
  # buildanything: SessionStart hook
  # Re-injects orchestrator identity after context compaction, resume, or clear.
- # Modeled after superpowers' session-start pattern.
  
  # Check if a build pipeline is active by looking for .build-state.md
  BUILD_STATE=""
@@ -9,20 +8,58 @@ if [ -f "docs/plans/.build-state.md" ]; then
  BUILD_STATE=$(cat "docs/plans/.build-state.md")
  fi
  
+ # Skip if the build is already complete
+ if echo "$BUILD_STATE" | grep -q "Phase: 6 COMPLETE"; then
+ BUILD_STATE=""
+ fi
+
  # If no active build, just provide a minimal reminder
  if [ -z "$BUILD_STATE" ]; then
  CONTEXT="buildanything plugin is installed. Use /buildanything:build to start a full product pipeline, or /buildanything:idea-sweep for parallel research."
  else
- # Active build detected re-inject full orchestrator context
+ # Check if there's an active metric loop
+ METRIC_LOOP=""
+ if echo "$BUILD_STATE" | grep -q "Active Metric Loop"; then
+ METRIC_LOOP="
+ ACTIVE METRIC LOOP DETECTED — You were mid-iteration when context compacted.
+ 1. Read commands/protocols/metric-loop.md to reload the loop protocol
+ 2. Find the 'Active Metric Loop' section in .build-state.md for your metric definition and score history
+ 3. Resume from the iteration indicated in the score log table
+ 4. Do NOT restart the loop from scratch — continue where you left off"
+ fi
+
+ # Check for resume point
+ RESUME_POINT=""
+ if echo "$BUILD_STATE" | grep -q "Resume Point"; then
+ RESUME_POINT="
+ RESUME POINT DETECTED — This build can be continued with /buildanything:build --resume.
+ The state file contains a structured Resume Point with phase, step, and task progress.
+ Reset dispatches_since_save to 0 (fresh context window)."
+ fi
+
+ # Active build detected — inject orchestrator identity and rules directly
+ # These are inlined so they survive context compaction (no file re-read required)
  read -r -d '' CONTEXT << 'ORCHESTRATOR'
  BUILDANYTHING ORCHESTRATOR — ACTIVE BUILD DETECTED
  
- You are the Agents Orchestrator running the buildanything pipeline. You are NOT a solo developer. You coordinate specialist agents.
+ <HARD-GATE>
+ YOU ARE AN ORCHESTRATOR. YOU COORDINATE AGENTS. YOU DO NOT WRITE CODE.
+ Every step below tells you to call the Agent tool. DO IT. Do not role-play as the agent. Do not write implementation code yourself. Do not skip the Agent tool call "because it's faster."
+ "Launch an agent" = call the Agent tool (the actual tool in your toolbar, the one that spawns a subprocess).
+ For implementation agents, set mode: "bypassPermissions".
+ For parallel work, put multiple Agent tool calls in ONE message.
+ </HARD-GATE>
+
+ ORCHESTRATOR DISCIPLINE:
+ YOU ARE A DISPATCHER, NOT A DOER. Your context is precious — protect it.
+ - TWO agent types: Research/analysis agents (keep full output — it's your decision-making input). Implementation agents (keep summary only — the code is in the repo).
+ - NEVER read source code, write code, or debug yourself — spawn agents for all implementation work.
+ - Save research outputs to docs/plans/ so you can reference files later instead of holding everything in context.
  
  CRITICAL RULES:
- 1. You do NOT write implementation code yourself — you dispatch to specialist agents
+ 1. You do NOT write implementation code yourself — you call the Agent tool to dispatch to specialist agents
  2. You follow phase gates — no advancing without quality gate approval
- 3. Every task goes through Dev→Test→Review loops
+ 3. Every phase uses metric-driven iteration loops (commands/protocols/metric-loop.md)
  4. You must re-read commands/build.md if you are unsure of the process
  
  YOUR CURRENT STATE (from docs/plans/.build-state.md):
@@ -30,12 +67,19 @@ ORCHESTRATOR
  
  CONTEXT="${CONTEXT}
  ${BUILD_STATE}
+ ${METRIC_LOOP}
+ ${RESUME_POINT}
  
  NEXT ACTIONS:
  1. Re-read commands/build.md to reload the full orchestrator process
- 2. Resume from the phase and step indicated in your state above
- 3. Use TodoWrite to track task progress
- 4. Dispatch work to specialist agents — do not implement directly"
+ 2. Re-read commands/protocols/metric-loop.md if you are mid-loop
+ 3. Re-read docs/plans/sprint-tasks.md for task list and acceptance criteria
+ 4. Re-read docs/plans/architecture.md for architecture context
+ 5. Re-read CLAUDE.md for build decisions
+ 6. Re-read docs/plans/learnings.md if it exists (patterns and pitfalls from previous builds)
+ 7. Rebuild TodoWrite from docs/plans/.build-state.md (TodoWrite does NOT survive compaction)
+ 8. Resume from the phase and step indicated in your state above
+ 9. Dispatch work to specialist agents — do not implement directly"
  fi
  
  # Output as additional_context for Claude Code
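
The state detection that the SessionStart hook changes above add (three `grep -q` substring checks against `.build-state.md`) can be mirrored in a few lines for reasoning about edge cases. This is a hypothetical Python rendering for illustration only; the hook itself is bash, and the function name and returned keys are assumptions.

```python
def classify_build_state(build_state: str) -> dict[str, bool]:
    """Mirror the hook's substring checks on the state file contents.

    A state containing "Phase: 6 COMPLETE" is treated like no state at all,
    so the metric-loop and resume-point checks are skipped for it.
    """
    complete = "Phase: 6 COMPLETE" in build_state
    active = bool(build_state) and not complete
    return {
        "active_build": active,
        "metric_loop": active and "Active Metric Loop" in build_state,
        "resume_point": active and "Resume Point" in build_state,
    }
```

One consequence worth noting: because the check is a plain substring match, renaming the section header to `## Completed Metric Loop`, as the Metric Loop Protocol prescribes on exit, is what stops the hook from reporting an active loop.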
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "buildanything",
- "version": "1.2.1",
+ "version": "1.5.0",
  "description": "One command to build an entire product. 73 specialist agents orchestrated into a full engineering pipeline for Claude Code.",
  "bin": {
  "buildanything": "./bin/setup.js"