bigpowers 1.0.0 → 1.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (90)
  1. package/CHANGELOG.md +21 -0
  2. package/CLAUDE.md +10 -8
  3. package/CONVENTIONS.md +8 -3
  4. package/GEMINI.md +9 -8
  5. package/README.md +57 -21
  6. package/RELEASE.md +10 -0
  7. package/SKILL-INDEX.md +98 -88
  8. package/assess-impact/SKILL.md +1 -0
  9. package/audit-code/SKILL.md +18 -0
  10. package/change-request/SKILL.md +1 -0
  11. package/commit-message/SKILL.md +1 -0
  12. package/compose-workflow/REFERENCE.md +13 -0
  13. package/compose-workflow/SKILL.md +23 -0
  14. package/countable-story-format.md +1 -1
  15. package/craft-skill/REFERENCE.md +1 -1
  16. package/craft-skill/SKILL.md +6 -1
  17. package/deepen-architecture/SKILL.md +15 -2
  18. package/define-language/SKILL.md +1 -0
  19. package/define-success/SKILL.md +1 -0
  20. package/delegate-task/SKILL.md +7 -2
  21. package/design-interface/SKILL.md +1 -0
  22. package/develop-tdd/SKILL.md +10 -9
  23. package/diagnose-root/SKILL.md +22 -0
  24. package/dispatch-agents/SKILL.md +13 -3
  25. package/edit-document/SKILL.md +1 -0
  26. package/elaborate-spec/SKILL.md +1 -0
  27. package/enforce-first/SKILL.md +1 -0
  28. package/evolve-skill/REFERENCE.md +12 -0
  29. package/evolve-skill/SKILL.md +24 -0
  30. package/execute-plan/SKILL.md +12 -4
  31. package/grill-me/SKILL.md +1 -0
  32. package/grill-with-docs/REFERENCE.md +5 -0
  33. package/grill-with-docs/SKILL.md +28 -0
  34. package/guard-git/REFERENCE.md +36 -6
  35. package/guard-git/SKILL.md +5 -2
  36. package/guard-git/scripts/lib/git-guardrails-core.sh +0 -1
  37. package/hook-commits/SKILL.md +1 -0
  38. package/hooks/pre-tool-use.sh +43 -46
  39. package/inspect-quality/SKILL.md +9 -6
  40. package/investigate-bug/SKILL.md +18 -5
  41. package/kickoff-branch/SKILL.md +13 -5
  42. package/map-codebase/SKILL.md +1 -0
  43. package/migrate-spec/SKILL.md +1 -0
  44. package/model-domain/SKILL.md +10 -0
  45. package/opencode.json +1 -1
  46. package/orchestrate-project/REFERENCE.md +13 -7
  47. package/orchestrate-project/SKILL.md +7 -5
  48. package/organize-workspace/SKILL.md +1 -0
  49. package/package.json +3 -2
  50. package/plan-refactor/SKILL.md +1 -0
  51. package/plan-release/SKILL.md +1 -0
  52. package/plan-work/SKILL.md +8 -3
  53. package/profiles/node-service.md +28 -0
  54. package/profiles/solo-git.md +39 -0
  55. package/profiles/swift.md +27 -0
  56. package/profiles/typescript-vue.md +28 -0
  57. package/release-branch/SKILL.md +51 -11
  58. package/request-review/SKILL.md +2 -1
  59. package/research-first/REFERENCE.md +29 -0
  60. package/research-first/SKILL.md +31 -0
  61. package/reset-baseline/SKILL.md +21 -0
  62. package/respond-review/SKILL.md +1 -0
  63. package/run-evals/REFERENCE.md +27 -0
  64. package/run-evals/SKILL.md +27 -0
  65. package/scope-work/SKILL.md +22 -0
  66. package/scripts/add-model-frontmatter.sh +82 -0
  67. package/scripts/build-skill-index.sh +28 -0
  68. package/scripts/install.sh +5 -1
  69. package/scripts/land-branch.sh +166 -0
  70. package/scripts/sync-skills.sh +38 -3
  71. package/search-skills/SKILL.md +20 -0
  72. package/seed-conventions/SKILL.md +3 -0
  73. package/session-state/SKILL.md +25 -3
  74. package/setup-environment/SKILL.md +22 -0
  75. package/simulate-agents/SKILL.md +24 -0
  76. package/slice-tasks/SKILL.md +22 -0
  77. package/spike-prototype/SKILL.md +1 -0
  78. package/stocktake-skills/REFERENCE.md +8 -0
  79. package/stocktake-skills/SKILL.md +28 -0
  80. package/survey-context/SKILL.md +12 -11
  81. package/terse-mode/SKILL.md +1 -0
  82. package/trace-requirement/SKILL.md +1 -0
  83. package/using-bigpowers/SKILL.md +30 -4
  84. package/validate-fix/SKILL.md +9 -5
  85. package/verify-work/REFERENCE.md +23 -0
  86. package/verify-work/SKILL.md +39 -0
  87. package/visual-dashboard/SKILL.md +51 -1
  88. package/wire-observability/SKILL.md +1 -0
  89. package/write-document/REFERENCE.md +166 -0
  90. package/write-document/SKILL.md +12 -1
@@ -0,0 +1,23 @@
1
+ ---
2
+ name: compose-workflow
3
+ description: Chain multiple bigpowers skills into a custom workflow recipe saved in specs/. Use when a project repeats a non-standard skill sequence, or user wants a documented playbook beyond orchestrate-project modes.
4
+ model: sonnet
5
+ ---
6
+
7
+ # Compose Workflow
8
+
9
+ ## Process
10
+
11
+ 1. Interview: goal, phases, which skills, gates between steps.
12
+ 2. Write `specs/WORKFLOW-<name>.md`:
13
+ - Trigger ("Use when...")
14
+ - Ordered steps: `skill → artefact → verify`
15
+ - HARD GATEs between phases
16
+ 3. Register in STATE.md Active Decisions.
17
+ 4. Optional: reference from `orchestrate-project` Ad-Hoc mode.
18
+
19
+ ## Verify
20
+
21
+ → verify: `cat specs/WORKFLOW-*.md 2>/dev/null | grep -c "verify:" | awk '{if($1>0) print "OK"}'`
22
+
23
+ See [REFERENCE.md](REFERENCE.md) for template.
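A minimal sketch of step 2 and the verify check, run in a throwaway directory. The recipe name `demo`, the section headings, and the step contents are illustrative assumptions; the real template lives in REFERENCE.md, which is not shown in this diff. `count_verify_lines` is a hypothetical helper.

```shell
# Hypothetical example: write a workflow recipe and check that it has verify steps.
workdir=$(mktemp -d)
mkdir -p "$workdir/specs"

cat > "$workdir/specs/WORKFLOW-demo.md" <<'EOF'
# Workflow: demo

## Trigger
Use when a release needs a docs refresh before tagging.

## Steps
1. survey-context → specs/CONTEXT.md → verify: `test -s specs/CONTEXT.md`
> HARD GATE: do not continue until the verify command passes.
2. write-document → README.md → verify: `grep -q "## Install" README.md`
EOF

# Count lines containing "verify:" across all recipes. cat merges the files so
# grep -c prints a single number even when several recipes exist.
count_verify_lines() {
  cat "$1"/specs/WORKFLOW-*.md 2>/dev/null | grep -c "verify:"
}

verify_count=$(count_verify_lines "$workdir")
if [ "$verify_count" -gt 0 ]; then echo "OK"; else echo "MISSING"; fi
```

Merging the files before counting keeps the check meaningful once a project holds more than one recipe.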
@@ -152,7 +152,7 @@ Links to design docs, RFCs, ADRs, prior stories, datasets, prototypes.
152
152
 
153
153
  ---
154
154
 
155
- ## Bug-fix specs (DIAGNOSIS.md)
155
+ ## Bug-fix specs (bugs/BUG-*.md)
156
156
 
157
157
  Bug fixes use the same header block and the same 20 sections. The minimum required for "Countable" on a bug fix:
158
158
 
@@ -59,7 +59,7 @@ The description is **the only thing your agent sees** when deciding which skill
59
59
 
60
60
  **Good example**:
61
61
  ```
62
- Investigate a bug by exploring the codebase to find root cause, then write a TDD-based fix plan to specs/DIAGNOSIS.md. Use when user reports a bug, wants to investigate a problem, or mentions "triage".
62
+ Investigate a bug by exploring the codebase to find root cause, then write a TDD-based fix plan to specs/bugs/BUG-*.md. Use when user reports a bug, wants to investigate a problem, or mentions "triage".
63
63
  ```
64
64
 
65
65
  ## When to Add Scripts
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: craft-skill
3
+ model: sonnet
3
4
  description: Create new bigpowers skills with proper structure, progressive disclosure, and bundled resources. Use when user wants to create, write, or build a new skill for the bigpowers lifecycle.
4
5
  ---
5
6
 
@@ -27,9 +28,13 @@ description: Create new bigpowers skills with proper structure, progressive disc
27
28
  - Additional reference files if content exceeds 100 lines
28
29
  - Utility scripts if deterministic operations needed
29
30
 
31
+ **Auto-skill from library README:** When user provides a library README or API docs URL, extract: triggers, HARD GATEs, verify commands, specs/ output — draft SKILL.md without inventing APIs not in the source.
32
+
33
+ 4. Add `model:` frontmatter (`haiku` | `sonnet` | `opus`) per [model-profiles.md](../docs/references/model-profiles.md).
34
+
30
35
  > **STREAM CONTINUITY** — When writing file content, output in continuous chunks of ~200 lines. Do not pause. Continue immediately until complete. If you need time, emit a placeholder comment rather than going silent.
31
36
 
32
- 4. **Review with user** — present draft and ask:
37
+ 5. **Review with user** — present draft and ask:
33
38
  - Does this cover your use cases?
34
39
  - Anything missing or unclear?
35
40
  - Should any section be more/less detailed?
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: deepen-architecture
3
+ model: sonnet
3
4
  description: Find deepening opportunities in a codebase, informed by the domain language in specs/CONTEXT.md and the decisions in specs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable.
4
5
  ---
5
6
 
@@ -49,7 +50,19 @@ Then use the Agent tool with `subagent_type=Explore` to walk the codebase. Don't
49
50
 
50
51
  Apply the **deletion test** to anything you suspect is shallow.
51
52
 
52
- ### 2. Present candidates
53
+ ### 2. Module Depth score
54
+
55
+ For each candidate module, assign a **Module Depth score** (1–5, Ousterhout):
56
+
57
+ | Score | Meaning |
58
+ |-------|---------|
59
+ | 1 | Shallow — interface complexity ≈ implementation |
60
+ | 3 | Balanced |
61
+ | 5 | Deep — small interface, substantial hidden behavior |
62
+
63
+ Include the score in each candidate row. Prioritize score ≤ 2 for deepening.
64
+
65
+ ### 3. Present candidates
53
66
 
54
67
  Present a numbered list of deepening opportunities. For each candidate:
55
68
 
@@ -64,7 +77,7 @@ Present a numbered list of deepening opportunities. For each candidate:
64
77
 
65
78
  Do NOT propose interfaces yet. Ask the user: "Which of these would you like to explore?"
66
79
 
67
- ### 3. Grilling loop
80
+ ### 4. Grilling loop
68
81
 
69
82
  Once the user picks a candidate, drop into a grilling conversation. Walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive.
70
83
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: define-language
3
+ model: sonnet
3
4
  description: Extract a DDD-style ubiquitous language glossary from the current conversation, flagging ambiguities and proposing canonical terms. Saves to specs/UBIQUITOUS_LANGUAGE.md. Use when user wants to define domain terms, build a glossary, harden terminology, create a ubiquitous language, or mentions "domain model" or "DDD".
4
5
  ---
5
6
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: define-success
3
+ model: sonnet
3
4
  description: Convert an imperative task statement into explicit "step → verify: <cmd>" pairs before implementation begins. Use before plan-work when success criteria are unclear, when a task lacks verifiable checkpoints, or when user says "how will we know this is done?".
4
5
  ---
5
6
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: delegate-task
3
+ model: sonnet
3
4
  description: Delegate one complex task to a single subagent, review its work in two stages before merging back. Sequential — one agent at a time, with oversight. Use when a task is complex and requires careful review before the result is accepted. Distinct from dispatch-agents (no parallelism here; reviewer sees full diff before proceeding).
4
5
  ---
5
6
 
@@ -26,9 +27,13 @@ Prior decisions: [relevant entries from specs/STATE.md — omit section if none
26
27
 
27
28
  Do not include full file contents, full conversation history, or decisions unrelated to this task.
28
29
 
29
- ### 2. Spawn the subagent
30
+ ### 2. Spawn the subagent (iterative retrieval, max 3 cycles)
30
31
 
31
- Use the Agent tool to spawn the subagent with the complete brief. Include:
32
+ Use the Agent tool with a **fresh context** per spawn. Pass prior decisions only via `specs/STATE.md`.
33
+
34
+ **Cycle:** dispatch → evaluate output vs goal → refine brief → re-spawn if needed (max 3 cycles).
35
+
36
+ Include in each brief:
32
37
  - All context the agent needs (it starts cold — no shared state)
33
38
  - Reference to CONVENTIONS.md constraints
34
39
  - The verify command it must run before reporting done
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: design-interface
3
+ model: opus
3
4
  description: Generate multiple radically different interface designs for a module using parallel sub-agents, then compare trade-offs. Based on "Design It Twice" from A Philosophy of Software Design. Use when user wants to design an API, explore interface options, compare module shapes, or mentions "design it twice".
4
5
  ---
5
6
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: develop-tdd
3
+ model: sonnet
3
4
  description: Test-driven development with red-green-refactor loop using vertical slices. Use when user wants to build features or fix bugs using TDD, mentions "red-green-refactor", wants integration tests, asks for test-first development, or wants to implement a task from specs/PLAN.md.
4
5
  ---
5
6
 
@@ -7,7 +8,7 @@ description: Test-driven development with red-green-refactor loop using vertical
7
8
 
8
9
  > **HARD GATE** — Do NOT proceed if on `main` or `master`. Run `kickoff-branch` first to create a feature branch or worktree.
9
10
  >
10
- > **HARD GATE** — Do NOT write code before you have a plan. If you are starting a new task, run `plan-work` to create `specs/PLAN.md`. If you are fixing a bug, run `investigate-bug` to create `specs/DIAGNOSIS.md`.
11
+ > **HARD GATE** — Do NOT write code before you have a plan. If you are starting a new task, run `plan-work` to create `specs/RELEASE-PLAN.md`. If you are fixing a bug, run `investigate-bug` to create `specs/bugs/BUG-*.md`.
11
12
  >
12
13
  > **RECURSIVE DISCIPLINE** — This lifecycle applies to EVERY task, including updating these skills. Never skip planning because a task is "meta" or "just documentation."
13
14
 
@@ -64,7 +65,7 @@ If you find yourself thinking these things, you are likely deviating from produc
64
65
 
65
66
  Before writing any code:
66
67
 
67
- - [ ] Read `specs/PLAN.md` or `specs/DIAGNOSIS.md` if they exist — understand the task and verify steps
68
+ - [ ] Read `specs/RELEASE-PLAN.md` or `specs/bugs/BUG-*.md` if they exist — understand the task and verify steps
68
69
  - [ ] Confirm with user what interface changes are needed
69
70
  - [ ] Confirm with user which behaviors to test (prioritize)
70
71
  - [ ] Identify opportunities for [deep modules](deep-modules.md) (small interface, deep implementation)
@@ -83,9 +84,9 @@ Apply the **enforce-first** F.I.R.S.T rubric when writing tests: Fast, Independe
83
84
  Write ONE test that confirms ONE thing about the system:
84
85
 
85
86
  ```
86
- RED: Write test for first behavior → test fails
87
- GREEN: Write minimal code to pass → test passes
88
- COMMIT: git commit -m "feat/fix(<scope>): first tracer bullet..."
87
+ RED: Write test for first behavior → test fails → commit via commit-message: test(<scope>): ...
88
+ GREEN: Write minimal code to pass → test passes → commit: feat(<scope>): ... or fix(<scope>): ...
89
+ REFACTOR (optional): clean up → commit: refactor(<scope>): ...
89
90
  ```
90
91
 
91
92
  This is your tracer bullet — proves the path works end-to-end.
@@ -97,9 +98,9 @@ This is your tracer bullet — proves the path works end-to-end.
97
98
  For each remaining behavior:
98
99
 
99
100
  ```
100
- RED: Write next test → fails
101
- GREEN: Minimal code to pass → passes
102
- COMMIT: git commit -m "<type>(<scope>): <behavior description>"
101
+ RED: Write next test → fails → commit: test(<scope>): ...
102
+ GREEN: Minimal code to pass → passes → commit: feat|fix(<scope>): ...
103
+ REFACTOR (optional): commit: refactor(<scope>): ... (use commit-message skill for title/body)
103
104
  ```
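The commit cadence above relies on Conventional Commits subjects. A rough sketch of a subject check, assuming only the four types named in the cycle (`test`, `feat`, `fix`, `refactor`) and a mandatory scope. The function name is hypothetical, and the package's real hook may accept more types or an optional scope.

```shell
# Hypothetical helper: accept "type(scope): description" for the TDD cycle types.
is_tdd_commit_subject() {
  printf '%s\n' "$1" | grep -Eq '^(test|feat|fix|refactor)\([a-z0-9_-]+\): .+'
}

for subject in \
  "test(cart): reject negative quantities" \
  "feat(cart): clamp quantity at zero" \
  "updated stuff"
do
  if is_tdd_commit_subject "$subject"; then
    echo "ok:   $subject"
  else
    echo "deny: $subject"
  fi
done
```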
104
105
 
105
106
  Rules:
@@ -135,7 +136,7 @@ After all tests pass, look for [refactor candidates](refactoring.md):
135
136
 
136
137
  ### 5. Verify step
137
138
 
138
- After every behavior cycle, run the verify command from `specs/PLAN.md` if one exists for this step. Show evidence before declaring the step done.
139
+ After every behavior cycle, run the verify command from `specs/RELEASE-PLAN.md` if one exists for this step. Show evidence before declaring the step done.
139
140
 
140
141
  ### 6. Manual Verification Handover
141
142
 
@@ -0,0 +1,22 @@
1
+ ---
2
+ name: diagnose-root
3
+ description: Run 4-phase root cause analysis — reproduce, isolate, hypothesize, verify. Use when a bug is confirmed but root cause is unclear, after investigate-bug, or when user mentions root cause analysis.
4
+ model: sonnet
5
+ ---
6
+
7
+ # Diagnose Root
8
+
9
+ Four phases — do not skip. Update the active `specs/bugs/BUG-*.md` file at each phase.
10
+
11
+ ## Phases
12
+
13
+ 1. **Reproduce** — minimal steps; record environment; capture logs.
14
+ 2. **Isolate** — narrow to module/function; binary-search commits or config.
15
+ 3. **Hypothesize** — list ranked hypotheses with falsification test each.
16
+ 4. **Verify** — run falsification; confirm single root cause; link to fix plan.
17
+
18
+ > **HARD GATE** — Do not propose a fix until phase 4 confirms one root cause with evidence.
19
+
20
+ ## Verify
21
+
22
+ → verify: `BUG_FILE=$(ls -t specs/bugs/BUG-*.md 2>/dev/null | head -1); test -n "$BUG_FILE" && grep -cE "Reproduce|Isolate|Hypothesize|Verify" "$BUG_FILE" | awk '{if($1>=4) print "OK"; else print "INCOMPLETE"}' || echo "MISSING"`
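A stricter variant of this check could confirm that each phase name appears at least once, rather than counting matching lines in total. This is a sketch under the assumption that the bug file mentions every phase by its exact name; `phases_present` is a hypothetical helper, not part of the package.

```shell
# Hypothetical helper: print OK only if all four phase names occur in the file.
phases_present() {
  bug_file=$1
  [ -f "$bug_file" ] || { echo "MISSING"; return 1; }
  for phase in Reproduce Isolate Hypothesize Verify; do
    if ! grep -q "$phase" "$bug_file"; then
      echo "INCOMPLETE: $phase"
      return 1
    fi
  done
  echo "OK"
}

# Usage against a throwaway file that stops after phase 2.
tmp=$(mktemp)
printf '## Reproduce\nsteps\n## Isolate\nmodule foo\n' > "$tmp"
phases_present "$tmp" || true
```

Reporting the first missing phase by name makes it clearer where the analysis stopped.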
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: dispatch-agents
3
+ model: sonnet
3
4
  description: Dispatch multiple subagents in parallel on independent tasks. No waiting between them — all run concurrently. Use when tasks are truly decoupled and speed matters. Distinct from delegate-task (concurrent here, no inter-task review gate).
4
5
  ---
5
6
 
@@ -48,7 +49,16 @@ Prior decisions: [relevant entries from specs/STATE.md — omit section if none
48
49
 
49
50
  Do not include the full conversation, full file contents, or decisions unrelated to this agent's task.
50
51
 
51
- ### 3. Dispatch in parallel
52
+ ### 3. Iterative retrieval (max 3 cycles)
53
+
54
+ After each wave completes:
55
+ 1. **Dispatch** — run parallel agents with briefs.
56
+ 2. **Evaluate** — read outputs; list gaps vs goal.
57
+ 3. **Refine** — tighten briefs or spawn follow-up agents (max **3 cycles** total).
58
+
59
+ Stop when gaps empty or cycle 3 reached — escalate to user.
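The three-cycle cap can be sketched as a bounded loop. `run_wave` and `count_gaps` are placeholders standing in for the agent dispatch and the evaluation step; they are stubbed here (each wave is assumed to close one gap) so the control flow can be run on its own.

```shell
MAX_CYCLES=3

# Stubs for illustration only: each wave closes one of the remaining gaps.
remaining_gaps=2
run_wave() { remaining_gaps=$((remaining_gaps - 1)); }
count_gaps() { echo "$remaining_gaps"; }

iterate_waves() {
  cycle=1
  while [ "$cycle" -le "$MAX_CYCLES" ]; do
    run_wave "$cycle"          # 1. dispatch
    gaps=$(count_gaps)         # 2. evaluate
    if [ "$gaps" -le 0 ]; then
      echo "done after cycle $cycle"
      return 0
    fi
    cycle=$((cycle + 1))       # 3. refine briefs, then loop
  done
  echo "escalate: $gaps gap(s) left after $MAX_CYCLES cycles"
  return 1
}

iterate_waves
```

A non-zero return after the third cycle gives the caller an explicit signal to hand the remaining gaps to the user.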
60
+
61
+ ### 4. Dispatch in parallel
52
62
 
53
63
  Spawn all agents in a single message using multiple Agent tool calls. Each agent gets its own complete brief.
54
64
 
@@ -58,14 +68,14 @@ Agent 2: brief for task B
58
68
  Agent 3: brief for task C
59
69
  ```
60
70
 
61
- ### 4. Collect and review results
71
+ ### 5. Collect and review results
62
72
 
63
73
  When all agents return:
64
74
  - Review each result independently
65
75
  - Run all verify commands
66
76
  - Check diffs for scope violations or CONVENTIONS.md breaches
67
77
 
68
- ### 5. Integrate
78
+ ### 6. Integrate
69
79
 
70
80
  Merge accepted results. If any agent's result conflicts with another, resolve manually and note the conflict.
71
81
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: edit-document
3
+ model: sonnet
3
4
  description: Edit and improve documents by restructuring sections, improving clarity, and tightening prose. Use when user wants to edit, revise, restructure, or improve any document — including specs/ files, articles, READMEs, or technical writing.
4
5
  ---
5
6
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: elaborate-spec
3
+ model: opus
3
4
  description: Refine a rough idea into a clear, detailed specification through dialogue. Does not produce code. Use when user has a vague idea, wants to think through a feature before planning, or needs to turn "I want X" into a concrete spec.
4
5
  ---
5
6
 
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: enforce-first
3
+ model: haiku
3
4
  description: Apply the F.I.R.S.T test quality rubric (Fast, Independent, Repeatable, Self-Validating, Timely) to a test suite or individual tests. Use when develop-tdd is writing tests, when test quality needs to be checked, or when user mentions F.I.R.S.T or "test quality".
4
5
  ---
5
6
 
@@ -0,0 +1,12 @@
1
+ # Evolve Skill — ADR snippet
2
+
3
+ ```markdown
4
+ ## ADR-XXXX: Evolve <skill-name>
5
+
6
+ **Status:** Accepted
7
+ **Benchmark:** before X% / after Y%
8
+ **Change:** one-sentence summary
9
+ **Evidence:** path/to/benchmark-report.md
10
+ ```
11
+
12
+ Benchmark repo: `/Users/danielvm/Developer/bigpowers-benchmark/`
@@ -0,0 +1,24 @@
1
+ ---
2
+ name: evolve-skill
3
+ description: Benchmark-gated skill evolution — consume bigpowers-benchmark report, propose plan-work change, edit skill via craft-skill, re-run benchmark, record ADR. Use when a skill underperforms on benchmark or stocktake finds systemic gap.
4
+ model: opus
5
+ ---
6
+
7
+ # Evolve Skill
8
+
9
+ > **HARD GATE** — No skill change ships without benchmark score ≥ pre-change baseline. Learning is measured and versioned — never implicit.
10
+
11
+ ## Loop
12
+
13
+ 1. Run `bigpowers-benchmark` (external repo); save report path in STATE.md.
14
+ 2. Identify target skill + measurable gap from report.
15
+ 3. `plan-work` — minimal change proposal with verify commands.
16
+ 4. Edit via `craft-skill` / direct SKILL.md edit; run `sync-skills.sh`.
17
+ 5. Re-run benchmark; compare scores.
18
+ 6. Record decision in `specs/adr/` + `session-state`; revert if regression.
19
+
20
+ ## Verify
21
+
22
+ → verify: benchmark report shows post-change score ≥ baseline (document paths in STATE.md)
23
+
24
+ See [REFERENCE.md](REFERENCE.md) for ADR template.
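The gate reduces to a numeric comparison between two scores. A small sketch, assuming the scores have already been extracted from the reports as plain numbers (the report format is not shown in this diff, so no parsing is attempted); `gate_passes` is a hypothetical name.

```shell
# Hypothetical helper: succeed when the post-change score is >= the baseline.
# awk handles decimals, which the shell's integer-only [ -ge ] test does not.
gate_passes() {
  awk -v base="$1" -v post="$2" 'BEGIN { exit !(post + 0 >= base + 0) }'
}

if gate_passes 72.5 74; then
  echo "ship"
else
  echo "revert"
fi
```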
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: execute-plan
3
+ model: haiku
3
4
  description: Batch-execute tasks from specs/RELEASE-PLAN.md sequentially, with a human checkpoint after each step. Use when user has an approved plan and wants to execute it step-by-step with oversight, or mentions "execute the plan" or "run the plan".
4
5
  ---
5
6
 
@@ -15,7 +16,11 @@ Execute the tasks in `specs/RELEASE-PLAN.md` one at a time, showing evidence aft
15
16
 
16
17
  ### 1. Read the plan
17
18
 
18
- Read `specs/RELEASE-PLAN.md` in full. Confirm with the user:
19
+ Read `specs/RELEASE-PLAN.md` in full. Parse `depends-on:` fields from `specs/TASKS.md` or story steps to build **execution waves** (steps with no unresolved deps run in parallel when user approves).
20
+
21
+ > **CONTEXT ISOLATION** — Spawn each skill invocation (via `delegate-task` / subagent) with a **fresh context window**. Pass decisions only through `specs/STATE.md` — never rely on chat history from prior spawns.
22
+
23
+ Confirm with the user:
19
24
  - How many steps are there?
20
25
  - Any steps to skip or reorder?
21
26
  - Should you stop after a specific step?
@@ -32,9 +37,12 @@ verify: [verify command]
32
37
  ```
33
38
 
34
39
  **b. Execute the work**
35
- Implement the step using the appropriate approach:
40
+
41
+ For **wave execution**: group steps that share no `depends-on:` edges; run wave members in parallel via `dispatch-agents`; wait for all verify commands green before next wave. Use atomic `STATE.md` updates (read-modify-write one block) to avoid race conditions.
42
+
43
+ Implement each step using:
36
44
  - Write/edit code directly for small focused changes
37
- - Spawn a subagent via `delegate-task` for complex isolated work
45
+ - Spawn a subagent via `delegate-task` for complex isolated work (fresh context; read STATE.md first)
38
46
 
39
47
  > **STREAM CONTINUITY** — When writing file content, output in continuous chunks of ~200 lines. Do not pause. Continue immediately until complete. If you need time, emit a placeholder comment rather than going silent.
40
48
 
@@ -80,5 +88,5 @@ After all steps complete:
80
88
  ```
81
89
  ✓ Plan complete: N/N steps executed
82
90
  All verify commands passed.
83
- Suggested next: audit-code → commit-message → release-branch
91
+ Suggested next: verify-work → run-evals → audit-code → simulate-agents → commit-message → release-branch
84
92
  ```
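The wave grouping described for this skill can be sketched as a repeated "pick every step whose dependencies are finished" pass. The input format below (one line per step, the step name followed by the names it depends on) is an assumption made for illustration; the actual `depends-on:` syntax in `specs/TASKS.md` is not shown in this diff, and `compute_waves` is a hypothetical helper.

```shell
# Assumed input on stdin: "<step> <dep> <dep> ..." per line.
# Prints one line per wave; fails on a cycle or an unknown dependency.
compute_waves() {
  awk '
    {
      if (!($1 in ndeps)) order[++total] = $1
      ndeps[$1] = NF - 1
      for (i = 2; i <= NF; i++) dep[$1, i - 1] = $i
    }
    END {
      wave = 0
      while (placed < total) {
        wave++; n = 0
        for (k = 1; k <= total; k++) {
          s = order[k]
          if (s in finished) continue
          ready = 1
          for (i = 1; i <= ndeps[s]; i++)
            if (!(dep[s, i] in finished)) ready = 0
          if (ready) batch[++n] = s
        }
        if (n == 0) { print "error: cycle or unknown dependency"; exit 1 }
        line = "wave " wave ":"
        # Mark steps finished only after the pass, so a wave never
        # contains a step together with one of its dependencies.
        for (j = 1; j <= n; j++) { finished[batch[j]] = 1; line = line " " batch[j] }
        print line
        placed += n
      }
    }'
}

printf '%s\n' "s1" "s2" "s3 s1 s2" "s4 s3" | compute_waves
```

Steps listed on the same output line share no dependency edges and are the candidates for parallel dispatch.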
package/grill-me/SKILL.md CHANGED
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: grill-me
3
+ model: sonnet
3
4
  description: Stress-test a plan or design through relentless questioning until every decision is resolved. Two modes: Design (default Q&A on decisions) and Docs (grounds every challenge in real library or API documentation). Use when user wants to challenge a plan, validate API assumptions, or mentions "grill me" or "grill me with docs".
4
5
  ---
5
6
 
@@ -0,0 +1,5 @@
1
+ # Grill With Docs — Question templates
2
+
3
+ - "Docs at [URL] show signature `foo(bar?: Baz)`. Your plan calls `foo(bar, baz)` — which is correct?"
4
+ - "The changelog at [URL] deprecates X in v3. Your plan still uses X — migrate or pin version?"
5
+ - "Error handling in [URL] throws `NetworkError`. Your plan catches `Error` only — is that sufficient?"
@@ -0,0 +1,28 @@
1
+ ---
2
+ name: grill-with-docs
3
+ description: Stress-test plan assumptions grounded in real library or API documentation URLs. Use when the plan depends on a specific library or external API, or as a docs-grounded variant of grill-me.
4
+ model: opus
5
+ ---
6
+
7
+ # Grill With Docs
8
+
9
+ > **HARD GATE** — Every challenge must cite a real documentation URL. No hallucinated APIs.
10
+
11
+ ## Process
12
+
13
+ 1. Read the plan or design under test (`specs/RELEASE-PLAN.md`, INTERFACE-OPTIONS.md, etc.).
14
+ 2. List assumptions that depend on external libraries or APIs.
15
+ 3. For each assumption: fetch or quote official docs; challenge with "docs say X, plan says Y."
16
+ 4. Resolve or update the plan inline; unresolved items block `plan-work`.
17
+
18
+ ## Docs mode rules
19
+
20
+ - Cite URL + quoted snippet (method name, parameter, version).
21
+ - If docs contradict the plan, plan loses until updated.
22
+ - Prefer official docs over blog posts.
23
+
24
+ ## Verify
25
+
26
+ → verify: dialogue log contains at least one `https://` doc URL per challenged assumption
27
+
28
+ See [REFERENCE.md](REFERENCE.md) for question templates.
@@ -1,5 +1,17 @@
1
1
  # Git guardrails — reference
2
2
 
3
+ ## Secret patterns (audit + pre-commit)
4
+
5
+ Agents must not commit files containing:
6
+
7
+ - `sk-` (OpenAI API keys)
8
+ - `ghp_` / `gho_` (GitHub tokens)
9
+ - `AKIA` (AWS access key id)
10
+ - `xoxb-` (Slack bot tokens)
11
+ - `-----BEGIN` private keys
12
+
13
+ Use `audit-code` supply-chain checklist before commit. Consider `git-secrets` or custom pre-commit hook in target projects.
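As a rough illustration of such a custom check, the listed prefixes can be matched with a single `grep`. This is a sketch, not the package's hook: the function name is hypothetical, and plain prefix matching produces false positives (the word "task-" contains `sk-`), so a hit is best treated as a prompt for review.

```shell
# Hypothetical helper: succeed if the input (stdin or files) contains a listed pattern.
contains_secret_pattern() {
  grep -Eq -e 'sk-|ghp_|gho_|AKIA|xoxb-|-----BEGIN' "$@"
}

# Usage inside a git repository (commented out so the sketch runs anywhere):
# git diff --cached | contains_secret_pattern && echo "possible secret staged"

printf 'token = "ghp_exampleexampleexample"\n' | contains_secret_pattern \
  && echo "possible secret"
```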
14
+
3
15
  ## Copy layout
4
16
 
5
17
  The main script is `pre-tool-use.sh`.
@@ -98,8 +110,9 @@ Use `BeforeTool` with matcher `run_shell_command`. Set **`GIT_GUARDRAILS_MODE=ge
98
110
 
99
111
  Add **Deny list** entries in **Antigravity → Settings → Terminal**:
100
112
 
101
- - `git push`
102
113
  - `git push --force`
114
+ - `git push origin main`
115
+ - `git push origin master`
103
116
  - `git reset --hard`
104
117
  - `git clean`
105
118
  - `git branch -D`
@@ -110,26 +123,43 @@ Add **Deny list** entries in **Antigravity → Settings → Terminal**:
110
123
 
111
124
  ## Verify (local tests)
112
125
 
113
- **1. Dangerous Pattern (Claude mode):**
126
+ **1. Block push to main without land mode (Claude mode):**
114
127
  ```bash
115
128
  echo '{"tool_input":{"command":"git push origin main"}}' | ./pre-tool-use.sh
116
- # Expected: exit 2, stderr message
129
+ # Expected: exit 2, protected branch message
130
+ ```
131
+
132
+ **2. Allow push to main with GIT_BIGPOWERS_LAND=1:**
133
+ ```bash
134
+ echo '{"tool_input":{"command":"git push origin main"}}' | GIT_BIGPOWERS_LAND=1 ./pre-tool-use.sh
135
+ # Expected: exit 0 (when on main)
117
136
  ```
118
137
 
119
- **2. Conventional Commits (Gemini mode):**
138
+ **3. Allow push to feature branch:**
139
+ ```bash
140
+ echo '{"tool_input":{"command":"git push -u origin feat/my-task"}}' | ./pre-tool-use.sh
141
+ # Expected: exit 0
142
+ ```
143
+
144
+ **4. Conventional Commits (Gemini mode):**
120
145
  ```bash
121
146
  echo '{"tool_input":{"command":"git commit -m \"bad message\""}}' | GIT_GUARDRAILS_MODE=gemini ./pre-tool-use.sh
122
147
  # Expected: exit 0, {"decision":"deny", "reason":"..."}
123
148
  ```
124
149
 
125
- **3. Protected Branch (Cursor mode):**
150
+ **5. Protected Branch commit (Cursor mode):**
126
151
  ```bash
127
152
  # Run on 'main' branch
128
153
  echo '{"command":"git commit -m \"feat: valid message\""}' | GIT_GUARDRAILS_MODE=cursor ./pre-tool-use.sh
129
154
  # Expected: exit 2, "Direct commits to protected branch 'main' are forbidden"
130
155
  ```
131
156
 
132
- **4. Allow (Gemini mode):**
157
+ **6. Land script exists:**
158
+ ```bash
159
+ test -x scripts/land-branch.sh && echo OK
160
+ ```
161
+
162
+ **7. Allow (Gemini mode):**
133
163
  ```bash
134
164
  echo '{"tool_input":{"command":"git status"}}' | GIT_GUARDRAILS_MODE=gemini ./pre-tool-use.sh
135
165
  # Expected: exit 0, {"decision":"allow"}
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: guard-git
3
+ model: haiku
3
4
  description: Block dangerous git commands (push, force push, reset --hard, clean, branch -D, checkout/restore .) and enforce Conventional Commits & Branch Protection before an AI agent runs them. Installs hook scripts for Claude Code, Cursor, Cursor CLI, and Gemini CLI; documents Google Antigravity Terminal deny lists. Use when the user wants git safety hooks, to block git push or destructive git in agents, or to mirror the same policy across AI coding tools.
4
5
  ---
5
6
 
@@ -9,9 +10,11 @@ Installs a shared hook that blocks destructive git operations and enforces workf
9
10
 
10
11
  ## What gets blocked/enforced
11
12
 
12
- - **Safety**: `git push` (including `--force`), `git reset --hard`, `git clean -f`, `git branch -D`, `git checkout .`, `git restore .`.
13
- - **Discipline**: Blocks direct commits or pushes to protected branches (`main`, `master`).
13
+ - **Safety**: `git push --force`, `git reset --hard`, `git clean -f`, `git branch -D`, `git checkout .`, `git restore .`.
14
+ - **Discipline**: Blocks direct commits or pushes to protected branches (`main`, `master`) unless `GIT_BIGPOWERS_LAND=1` (set only by `scripts/land-branch.sh`).
15
+ - **Allows**: `git push origin <feature-branch>` for backup/CI; solo land push to `main` only inside `land-branch.sh`.
14
16
  - **Standardization**: Enforces [Conventional Commits](https://www.conventionalcommits.org/) for all `git commit` commands.
17
+ - **Secrets**: Blocks commits containing common secret patterns (`sk-`, `ghp_`, `AKIA`, `xoxb-`, `-----BEGIN` private keys) — see [REFERENCE.md](REFERENCE.md).
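The push rules in the bullets above can be summarised as a small decision function. This is a hypothetical sketch for reading the policy, not the hook's actual code: it takes the target branch as an argument instead of parsing a git command, and it ignores the current-branch check.

```shell
# $1 = target branch of the push, $2 = "force" when --force is used (optional).
push_decision() {
  branch=$1
  mode=${2:-}
  if [ "$mode" = "force" ]; then echo "deny: force push"; return; fi
  case "$branch" in
    main|master)
      if [ "${GIT_BIGPOWERS_LAND:-0}" = "1" ]; then
        echo "allow: land mode"
      else
        echo "deny: protected branch"
      fi ;;
    *) echo "allow: feature branch" ;;
  esac
}

push_decision feat/my-task
push_decision main
# The subshell keeps the land flag from leaking into later calls.
( GIT_BIGPOWERS_LAND=1; push_decision main )
```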
15
18
 
16
19
  ## Quick start
17
20
 
@@ -2,7 +2,6 @@
2
2
  # Source from block-dangerous-git.sh only.
3
3
 
4
4
  GIT_GUARDRAILS_PATTERNS=(
5
- "git push"
6
5
  "git reset --hard"
7
6
  "git clean -fd"
8
7
  "git clean -f"
@@ -1,5 +1,6 @@
1
1
  ---
2
2
  name: hook-commits
3
+ model: haiku
3
4
  description: Set up pre-commit hooks with lint-staged (Prettier), type checking, and tests in the current repo. Use when user wants to add pre-commit hooks, set up Husky, configure lint-staged, or add commit-time formatting/typechecking/testing.
4
5
  ---
5
6