bigpowers 1.0.0 → 1.1.1
- package/CHANGELOG.md +21 -0
- package/CLAUDE.md +10 -8
- package/CONVENTIONS.md +8 -3
- package/GEMINI.md +9 -8
- package/README.md +57 -21
- package/RELEASE.md +10 -0
- package/SKILL-INDEX.md +98 -88
- package/assess-impact/SKILL.md +1 -0
- package/audit-code/SKILL.md +18 -0
- package/change-request/SKILL.md +1 -0
- package/commit-message/SKILL.md +1 -0
- package/compose-workflow/REFERENCE.md +13 -0
- package/compose-workflow/SKILL.md +23 -0
- package/countable-story-format.md +1 -1
- package/craft-skill/REFERENCE.md +1 -1
- package/craft-skill/SKILL.md +6 -1
- package/deepen-architecture/SKILL.md +15 -2
- package/define-language/SKILL.md +1 -0
- package/define-success/SKILL.md +1 -0
- package/delegate-task/SKILL.md +7 -2
- package/design-interface/SKILL.md +1 -0
- package/develop-tdd/SKILL.md +10 -9
- package/diagnose-root/SKILL.md +22 -0
- package/dispatch-agents/SKILL.md +13 -3
- package/edit-document/SKILL.md +1 -0
- package/elaborate-spec/SKILL.md +1 -0
- package/enforce-first/SKILL.md +1 -0
- package/evolve-skill/REFERENCE.md +12 -0
- package/evolve-skill/SKILL.md +24 -0
- package/execute-plan/SKILL.md +12 -4
- package/grill-me/SKILL.md +1 -0
- package/grill-with-docs/REFERENCE.md +5 -0
- package/grill-with-docs/SKILL.md +28 -0
- package/guard-git/REFERENCE.md +36 -6
- package/guard-git/SKILL.md +5 -2
- package/guard-git/scripts/lib/git-guardrails-core.sh +0 -1
- package/hook-commits/SKILL.md +1 -0
- package/hooks/pre-tool-use.sh +43 -46
- package/inspect-quality/SKILL.md +9 -6
- package/investigate-bug/SKILL.md +18 -5
- package/kickoff-branch/SKILL.md +13 -5
- package/map-codebase/SKILL.md +1 -0
- package/migrate-spec/SKILL.md +1 -0
- package/model-domain/SKILL.md +10 -0
- package/opencode.json +1 -1
- package/orchestrate-project/REFERENCE.md +13 -7
- package/orchestrate-project/SKILL.md +7 -5
- package/organize-workspace/SKILL.md +1 -0
- package/package.json +3 -2
- package/plan-refactor/SKILL.md +1 -0
- package/plan-release/SKILL.md +1 -0
- package/plan-work/SKILL.md +8 -3
- package/profiles/node-service.md +28 -0
- package/profiles/solo-git.md +39 -0
- package/profiles/swift.md +27 -0
- package/profiles/typescript-vue.md +28 -0
- package/release-branch/SKILL.md +51 -11
- package/request-review/SKILL.md +2 -1
- package/research-first/REFERENCE.md +29 -0
- package/research-first/SKILL.md +31 -0
- package/reset-baseline/SKILL.md +21 -0
- package/respond-review/SKILL.md +1 -0
- package/run-evals/REFERENCE.md +27 -0
- package/run-evals/SKILL.md +27 -0
- package/scope-work/SKILL.md +22 -0
- package/scripts/add-model-frontmatter.sh +82 -0
- package/scripts/build-skill-index.sh +28 -0
- package/scripts/install.sh +5 -1
- package/scripts/land-branch.sh +166 -0
- package/scripts/sync-skills.sh +38 -3
- package/search-skills/SKILL.md +20 -0
- package/seed-conventions/SKILL.md +3 -0
- package/session-state/SKILL.md +25 -3
- package/setup-environment/SKILL.md +22 -0
- package/simulate-agents/SKILL.md +24 -0
- package/slice-tasks/SKILL.md +22 -0
- package/spike-prototype/SKILL.md +1 -0
- package/stocktake-skills/REFERENCE.md +8 -0
- package/stocktake-skills/SKILL.md +28 -0
- package/survey-context/SKILL.md +12 -11
- package/terse-mode/SKILL.md +1 -0
- package/trace-requirement/SKILL.md +1 -0
- package/using-bigpowers/SKILL.md +30 -4
- package/validate-fix/SKILL.md +9 -5
- package/verify-work/REFERENCE.md +23 -0
- package/verify-work/SKILL.md +39 -0
- package/visual-dashboard/SKILL.md +51 -1
- package/wire-observability/SKILL.md +1 -0
- package/write-document/REFERENCE.md +166 -0
- package/write-document/SKILL.md +12 -1
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: compose-workflow
|
|
3
|
+
description: Chain multiple bigpowers skills into a custom workflow recipe saved in specs/. Use when a project repeats a non-standard skill sequence, or user wants a documented playbook beyond orchestrate-project modes.
|
|
4
|
+
model: sonnet
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Compose Workflow
|
|
8
|
+
|
|
9
|
+
## Process
|
|
10
|
+
|
|
11
|
+
1. Interview: goal, phases, which skills, gates between steps.
|
|
12
|
+
2. Write `specs/WORKFLOW-<name>.md`:
|
|
13
|
+
- Trigger ("Use when...")
|
|
14
|
+
- Ordered steps: `skill → artefact → verify`
|
|
15
|
+
- HARD GATEs between phases
|
|
16
|
+
3. Register in STATE.md Active Decisions.
|
|
17
|
+
4. Optional: reference from `orchestrate-project` Ad-Hoc mode.
|
|
18
|
+
|
|
19
|
+
## Verify
|
|
20
|
+
|
|
21
|
+
→ verify: `test -f specs/WORKFLOW-*.md && grep -c "verify:" specs/WORKFLOW-*.md | awk '{if($1>0) print "OK"}'`
|
|
22
|
+
|
|
23
|
+
See [REFERENCE.md](REFERENCE.md) for template.
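The verify one-liner above reads most naturally with a single recipe in `specs/`. With several `WORKFLOW-*.md` files, `test -f` receives more than one argument and `grep -c` prefixes each count with a file name, so the `awk` comparison no longer sees a bare number. A minimal sketch of a per-file check follows; `check_workflows` is a hypothetical helper, not something shipped in the package:

```shell
# check_workflows [DIR]: hypothetical helper, not part of bigpowers.
# Prints OK only if DIR holds at least one WORKFLOW-*.md and every one
# of them contains a "verify:" entry.
check_workflows() {
  dir="${1:-specs}"
  found=0
  for f in "$dir"/WORKFLOW-*.md; do
    # An unmatched glob stays literal, so skip anything that is not a file.
    [ -f "$f" ] || continue
    found=1
    if ! grep -q "verify:" "$f"; then
      echo "MISSING verify: $f"
      return 1
    fi
  done
  if [ "$found" -eq 1 ]; then
    echo "OK"
  else
    echo "MISSING"
    return 1
  fi
}
```

Run it from the project root as `check_workflows specs`.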
|
package/countable-story-format.md
CHANGED
|
@@ -152,7 +152,7 @@ Links to design docs, RFCs, ADRs, prior stories, datasets, prototypes.
|
|
|
152
152
|
|
|
153
153
|
---
|
|
154
154
|
|
|
155
|
-
## Bug-fix specs (
|
|
155
|
+
## Bug-fix specs (bugs/BUG-*.md)
|
|
156
156
|
|
|
157
157
|
Bug fixes use the same header block and the same 20 sections. The minimum required for "Countable" on a bug fix:
|
|
158
158
|
|
package/craft-skill/REFERENCE.md
CHANGED
|
@@ -59,7 +59,7 @@ The description is **the only thing your agent sees** when deciding which skill
|
|
|
59
59
|
|
|
60
60
|
**Good example**:
|
|
61
61
|
```
|
|
62
|
-
Investigate a bug by exploring the codebase to find root cause, then write a TDD-based fix plan to specs/
|
|
62
|
+
Investigate a bug by exploring the codebase to find root cause, then write a TDD-based fix plan to specs/bugs/BUG-*.md. Use when user reports a bug, wants to investigate a problem, or mentions "triage".
|
|
63
63
|
```
|
|
64
64
|
|
|
65
65
|
## When to Add Scripts
|
package/craft-skill/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: craft-skill
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Create new bigpowers skills with proper structure, progressive disclosure, and bundled resources. Use when user wants to create, write, or build a new skill for the bigpowers lifecycle.
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -27,9 +28,13 @@ description: Create new bigpowers skills with proper structure, progressive disc
|
|
|
27
28
|
- Additional reference files if content exceeds 100 lines
|
|
28
29
|
- Utility scripts if deterministic operations needed
|
|
29
30
|
|
|
31
|
+
**Auto-skill from library README:** When user provides a library README or API docs URL, extract: triggers, HARD GATEs, verify commands, specs/ output — draft SKILL.md without inventing APIs not in the source.
|
|
32
|
+
|
|
33
|
+
4. Add `model:` frontmatter (`haiku` | `sonnet` | `opus`) per [model-profiles.md](../docs/references/model-profiles.md).
|
|
34
|
+
|
|
30
35
|
> **STREAM CONTINUITY** — When writing file content, output in continuous chunks of ~200 lines. Do not pause. Continue immediately until complete. If you need time, emit a placeholder comment rather than going silent.
|
|
31
36
|
|
|
32
|
-
|
|
37
|
+
5. **Review with user** — present draft and ask:
|
|
33
38
|
- Does this cover your use cases?
|
|
34
39
|
- Anything missing or unclear?
|
|
35
40
|
- Should any section be more/less detailed?
|
package/deepen-architecture/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: deepen-architecture
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Find deepening opportunities in a codebase, informed by the domain language in specs/CONTEXT.md and the decisions in specs/adr/. Use when the user wants to improve architecture, find refactoring opportunities, consolidate tightly-coupled modules, or make a codebase more testable and AI-navigable.
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -49,7 +50,19 @@ Then use the Agent tool with `subagent_type=Explore` to walk the codebase. Don't
|
|
|
49
50
|
|
|
50
51
|
Apply the **deletion test** to anything you suspect is shallow.
|
|
51
52
|
|
|
52
|
-
### 2.
|
|
53
|
+
### 2. Module Depth score
|
|
54
|
+
|
|
55
|
+
For each candidate module, assign a **Module Depth score** (1–5, Ousterhout):
|
|
56
|
+
|
|
57
|
+
| Score | Meaning |
|
|
58
|
+
|-------|---------|
|
|
59
|
+
| 1 | Shallow — interface complexity ≈ implementation |
|
|
60
|
+
| 3 | Balanced |
|
|
61
|
+
| 5 | Deep — small interface, substantial hidden behavior |
|
|
62
|
+
|
|
63
|
+
Include the score in each candidate row. Prioritize score ≤ 2 for deepening.
|
|
64
|
+
|
|
65
|
+
### 3. Present candidates
|
|
53
66
|
|
|
54
67
|
Present a numbered list of deepening opportunities. For each candidate:
|
|
55
68
|
|
|
@@ -64,7 +77,7 @@ Present a numbered list of deepening opportunities. For each candidate:
|
|
|
64
77
|
|
|
65
78
|
Do NOT propose interfaces yet. Ask the user: "Which of these would you like to explore?"
|
|
66
79
|
|
|
67
|
-
###
|
|
80
|
+
### 4. Grilling loop
|
|
68
81
|
|
|
69
82
|
Once the user picks a candidate, drop into a grilling conversation. Walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive.
|
|
70
83
|
|
package/define-language/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: define-language
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Extract a DDD-style ubiquitous language glossary from the current conversation, flagging ambiguities and proposing canonical terms. Saves to specs/UBIQUITOUS_LANGUAGE.md. Use when user wants to define domain terms, build a glossary, harden terminology, create a ubiquitous language, or mentions "domain model" or "DDD".
|
|
4
5
|
---
|
|
5
6
|
|
package/define-success/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: define-success
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Convert an imperative task statement into explicit "step → verify: <cmd>" pairs before implementation begins. Use before plan-work when success criteria are unclear, when a task lacks verifiable checkpoints, or when user says "how will we know this is done?".
|
|
4
5
|
---
|
|
5
6
|
|
package/delegate-task/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: delegate-task
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Delegate one complex task to a single subagent, review its work in two stages before merging back. Sequential — one agent at a time, with oversight. Use when a task is complex and requires careful review before the result is accepted. Distinct from dispatch-agents (no parallelism here; reviewer sees full diff before proceeding).
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -26,9 +27,13 @@ Prior decisions: [relevant entries from specs/STATE.md — omit section if none
|
|
|
26
27
|
|
|
27
28
|
Do not include full file contents, full conversation history, or decisions unrelated to this task.
|
|
28
29
|
|
|
29
|
-
### 2. Spawn the subagent
|
|
30
|
+
### 2. Spawn the subagent (iterative retrieval, max 3 cycles)
|
|
30
31
|
|
|
31
|
-
Use the Agent tool
|
|
32
|
+
Use the Agent tool with a **fresh context** per spawn. Pass prior decisions only via `specs/STATE.md`.
|
|
33
|
+
|
|
34
|
+
**Cycle:** dispatch → evaluate output vs goal → refine brief → re-spawn if needed (max 3 cycles).
|
|
35
|
+
|
|
36
|
+
Include in each brief:
|
|
32
37
|
- All context the agent needs (it starts cold — no shared state)
|
|
33
38
|
- Reference to CONVENTIONS.md constraints
|
|
34
39
|
- The verify command it must run before reporting done
|
package/design-interface/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: design-interface
|
|
3
|
+
model: opus
|
|
3
4
|
description: Generate multiple radically different interface designs for a module using parallel sub-agents, then compare trade-offs. Based on "Design It Twice" from A Philosophy of Software Design. Use when user wants to design an API, explore interface options, compare module shapes, or mentions "design it twice".
|
|
4
5
|
---
|
|
5
6
|
|
package/develop-tdd/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: develop-tdd
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Test-driven development with red-green-refactor loop using vertical slices. Use when user wants to build features or fix bugs using TDD, mentions "red-green-refactor", wants integration tests, asks for test-first development, or wants to implement a task from specs/PLAN.md.
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -7,7 +8,7 @@ description: Test-driven development with red-green-refactor loop using vertical
|
|
|
7
8
|
|
|
8
9
|
> **HARD GATE** — Do NOT proceed if on `main` or `master`. Run `kickoff-branch` first to create a feature branch or worktree.
|
|
9
10
|
>
|
|
10
|
-
> **HARD GATE** — Do NOT write code before you have a plan. If you are starting a new task, run `plan-work` to create `specs/PLAN.md`. If you are fixing a bug, run `investigate-bug` to create `specs/
|
|
11
|
+
> **HARD GATE** — Do NOT write code before you have a plan. If you are starting a new task, run `plan-work` to create `specs/RELEASE-PLAN.md`. If you are fixing a bug, run `investigate-bug` to create `specs/bugs/BUG-*.md`.
|
|
11
12
|
>
|
|
12
13
|
> **RECURSIVE DISCIPLINE** — This lifecycle applies to EVERY task, including updating these skills. Never skip planning because a task is "meta" or "just documentation."
|
|
13
14
|
|
|
@@ -64,7 +65,7 @@ If you find yourself thinking these things, you are likely deviating from produc
|
|
|
64
65
|
|
|
65
66
|
Before writing any code:
|
|
66
67
|
|
|
67
|
-
- [ ] Read `specs/PLAN.md` or `specs/
|
|
68
|
+
- [ ] Read `specs/RELEASE-PLAN.md` or `specs/bugs/BUG-*.md` if they exist — understand the task and verify steps
|
|
68
69
|
- [ ] Confirm with user what interface changes are needed
|
|
69
70
|
- [ ] Confirm with user which behaviors to test (prioritize)
|
|
70
71
|
- [ ] Identify opportunities for [deep modules](deep-modules.md) (small interface, deep implementation)
|
|
@@ -83,9 +84,9 @@ Apply the **enforce-first** F.I.R.S.T rubric when writing tests: Fast, Independe
|
|
|
83
84
|
Write ONE test that confirms ONE thing about the system:
|
|
84
85
|
|
|
85
86
|
```
|
|
86
|
-
RED: Write test for first behavior → test fails
|
|
87
|
-
GREEN: Write minimal code to pass → test passes
|
|
88
|
-
|
|
87
|
+
RED: Write test for first behavior → test fails → commit via commit-message: test(<scope>): ...
|
|
88
|
+
GREEN: Write minimal code to pass → test passes → commit: feat(<scope>): ... or fix(<scope>): ...
|
|
89
|
+
REFACTOR (optional): clean up → commit: refactor(<scope>): ...
|
|
89
90
|
```
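The phase-to-commit mapping in the block above can be checked mechanically. A minimal sketch, assuming the `type(scope): description` shape shown there; `commit_matches_phase` is a hypothetical helper and is not part of the `commit-message` skill:

```shell
# commit_matches_phase PHASE SUBJECT: hypothetical helper.
# Succeeds when SUBJECT starts with a commit type allowed for PHASE.
commit_matches_phase() {
  phase="$1"
  subject="$2"
  case "$phase" in
    RED)      types='test' ;;
    GREEN)    types='feat|fix' ;;
    REFACTOR) types='refactor' ;;
    *)        echo "unknown phase: $phase" >&2; return 2 ;;
  esac
  # Require type(scope): followed by a non-empty description.
  printf '%s\n' "$subject" | grep -Eq "^($types)\([^()]+\): .+"
}
```

For example, `commit_matches_phase RED "test(cart): reject negative quantity"` succeeds, while the same subject checked against `GREEN` fails.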
|
|
90
91
|
|
|
91
92
|
This is your tracer bullet — proves the path works end-to-end.
|
|
@@ -97,9 +98,9 @@ This is your tracer bullet — proves the path works end-to-end.
|
|
|
97
98
|
For each remaining behavior:
|
|
98
99
|
|
|
99
100
|
```
|
|
100
|
-
RED: Write next test → fails
|
|
101
|
-
GREEN: Minimal code to pass → passes
|
|
102
|
-
|
|
101
|
+
RED: Write next test → fails → commit: test(<scope>): ...
|
|
102
|
+
GREEN: Minimal code to pass → passes → commit: feat|fix(<scope>): ...
|
|
103
|
+
REFACTOR (optional): → commit: refactor(<scope>): ... (use commit-message skill for title/body)
|
|
103
104
|
```
|
|
104
105
|
|
|
105
106
|
Rules:
|
|
@@ -135,7 +136,7 @@ After all tests pass, look for [refactor candidates](refactoring.md):
|
|
|
135
136
|
|
|
136
137
|
### 5. Verify step
|
|
137
138
|
|
|
138
|
-
After every behavior cycle, run the verify command from `specs/PLAN.md` if one exists for this step. Show evidence before declaring the step done.
|
|
139
|
+
After every behavior cycle, run the verify command from `specs/RELEASE-PLAN.md` if one exists for this step. Show evidence before declaring the step done.
|
|
139
140
|
|
|
140
141
|
### 6. Manual Verification Handover
|
|
141
142
|
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: diagnose-root
|
|
3
|
+
description: Run 4-phase root cause analysis — reproduce, isolate, hypothesize, verify. Use when a bug is confirmed but root cause is unclear, after investigate-bug, or when user mentions root cause analysis.
|
|
4
|
+
model: sonnet
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Diagnose Root
|
|
8
|
+
|
|
9
|
+
Four phases — do not skip. Update the active `specs/bugs/BUG-*.md` file at each phase.
|
|
10
|
+
|
|
11
|
+
## Phases
|
|
12
|
+
|
|
13
|
+
1. **Reproduce** — minimal steps; record environment; capture logs.
|
|
14
|
+
2. **Isolate** — narrow to module/function; binary-search commits or config.
|
|
15
|
+
3. **Hypothesize** — list ranked hypotheses with falsification test each.
|
|
16
|
+
4. **Verify** — run falsification; confirm single root cause; link to fix plan.
|
|
17
|
+
|
|
18
|
+
> **HARD GATE** — Do not propose a fix until phase 4 confirms one root cause with evidence.
|
|
19
|
+
|
|
20
|
+
## Verify
|
|
21
|
+
|
|
22
|
+
→ verify: `BUG_FILE=$(ls -t specs/bugs/BUG-*.md 2>/dev/null | head -1); test -n "$BUG_FILE" && grep -cE "Reproduce|Isolate|Hypothesize|Verify" "$BUG_FILE" | awk '{if($1>=4) print "OK"; else print "INCOMPLETE"}' || echo "MISSING"`
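The one-liner above counts matching lines, so four lines that mention only one phase would also reach the threshold. A minimal sketch that checks each phase name separately follows; `check_phases` is a hypothetical helper, and matching on the bare phase names is an assumption about how the bug file is written:

```shell
# check_phases FILE: hypothetical helper, not part of bigpowers.
# Prints OK only when all four phase names appear somewhere in FILE.
check_phases() {
  bug_file="$1"
  [ -f "$bug_file" ] || { echo "MISSING"; return 1; }
  for phase in Reproduce Isolate Hypothesize Verify; do
    if ! grep -q "$phase" "$bug_file"; then
      echo "INCOMPLETE: $phase"
      return 1
    fi
  done
  echo "OK"
}
```

It can be combined with the file lookup from the original command, for example `check_phases "$(ls -t specs/bugs/BUG-*.md 2>/dev/null | head -1)"`.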
|
package/dispatch-agents/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: dispatch-agents
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Dispatch multiple subagents in parallel on independent tasks. No waiting between them — all run concurrently. Use when tasks are truly decoupled and speed matters. Distinct from delegate-task (concurrent here, no inter-task review gate).
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -48,7 +49,16 @@ Prior decisions: [relevant entries from specs/STATE.md — omit section if none
|
|
|
48
49
|
|
|
49
50
|
Do not include the full conversation, full file contents, or decisions unrelated to this agent's task.
|
|
50
51
|
|
|
51
|
-
### 3.
|
|
52
|
+
### 3. Iterative retrieval (max 3 cycles)
|
|
53
|
+
|
|
54
|
+
After each wave completes:
|
|
55
|
+
1. **Dispatch** — run parallel agents with briefs.
|
|
56
|
+
2. **Evaluate** — read outputs; list gaps vs goal.
|
|
57
|
+
3. **Refine** — tighten briefs or spawn follow-up agents (max **3 cycles** total).
|
|
58
|
+
|
|
59
|
+
Stop when gaps empty or cycle 3 reached — escalate to user.
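The bounded loop described above can be sketched as a small driver. This is an illustration only: `run_cycles` and the evaluator contract (a command that takes the cycle number and prints the count of remaining gaps) are assumptions, not part of the skill:

```shell
# run_cycles EVALUATE: hypothetical driver for dispatch -> evaluate -> refine.
# EVALUATE is called with the cycle number and must print the number of
# remaining gaps. The loop stops early at zero gaps and never exceeds 3 cycles.
run_cycles() {
  evaluate="$1"
  max_cycles=3
  cycle=1
  while [ "$cycle" -le "$max_cycles" ]; do
    gaps=$("$evaluate" "$cycle")
    if [ "$gaps" -eq 0 ]; then
      echo "done after cycle $cycle"
      return 0
    fi
    cycle=$((cycle + 1))
  done
  echo "escalate to user: $gaps gaps after $max_cycles cycles"
  return 1
}
```

Keeping the cap in one variable makes the escalation path explicit: the loop either ends with no gaps or hands the leftover count to the user.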
|
|
60
|
+
|
|
61
|
+
### 4. Dispatch in parallel
|
|
52
62
|
|
|
53
63
|
Spawn all agents in a single message using multiple Agent tool calls. Each agent gets its own complete brief.
|
|
54
64
|
|
|
@@ -58,14 +68,14 @@ Agent 2: brief for task B
|
|
|
58
68
|
Agent 3: brief for task C
|
|
59
69
|
```
|
|
60
70
|
|
|
61
|
-
###
|
|
71
|
+
### 5. Collect and review results
|
|
62
72
|
|
|
63
73
|
When all agents return:
|
|
64
74
|
- Review each result independently
|
|
65
75
|
- Run all verify commands
|
|
66
76
|
- Check diffs for scope violations or CONVENTIONS.md breaches
|
|
67
77
|
|
|
68
|
-
###
|
|
78
|
+
### 6. Integrate
|
|
69
79
|
|
|
70
80
|
Merge accepted results. If any agent's result conflicts with another, resolve manually and note the conflict.
|
|
71
81
|
|
package/edit-document/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: edit-document
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Edit and improve documents by restructuring sections, improving clarity, and tightening prose. Use when user wants to edit, revise, restructure, or improve any document — including specs/ files, articles, READMEs, or technical writing.
|
|
4
5
|
---
|
|
5
6
|
|
package/elaborate-spec/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: elaborate-spec
|
|
3
|
+
model: opus
|
|
3
4
|
description: Refine a rough idea into a clear, detailed specification through dialogue. Does not produce code. Use when user has a vague idea, wants to think through a feature before planning, or needs to turn "I want X" into a concrete spec.
|
|
4
5
|
---
|
|
5
6
|
|
package/enforce-first/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: enforce-first
|
|
3
|
+
model: haiku
|
|
3
4
|
description: Apply the F.I.R.S.T test quality rubric (Fast, Independent, Repeatable, Self-Validating, Timely) to a test suite or individual tests. Use when develop-tdd is writing tests, when test quality needs to be checked, or when user mentions F.I.R.S.T or "test quality".
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -0,0 +1,12 @@
|
|
|
1
|
+
# Evolve Skill — ADR snippet
|
|
2
|
+
|
|
3
|
+
```markdown
|
|
4
|
+
## ADR-XXXX: Evolve <skill-name>
|
|
5
|
+
|
|
6
|
+
**Status:** Accepted
|
|
7
|
+
**Benchmark:** before X% / after Y%
|
|
8
|
+
**Change:** one-sentence summary
|
|
9
|
+
**Evidence:** path/to/benchmark-report.md
|
|
10
|
+
```
|
|
11
|
+
|
|
12
|
+
Benchmark repo: `/Users/danielvm/Developer/bigpowers-benchmark/`
|
|
@@ -0,0 +1,24 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: evolve-skill
|
|
3
|
+
description: Benchmark-gated skill evolution — consume bigpowers-benchmark report, propose plan-work change, edit skill via craft-skill, re-run benchmark, record ADR. Use when a skill underperforms on benchmark or stocktake finds systemic gap.
|
|
4
|
+
model: opus
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Evolve Skill
|
|
8
|
+
|
|
9
|
+
> **HARD GATE** — No skill change ships without benchmark score ≥ pre-change baseline. Learning is measured and versioned — never implicit.
|
|
10
|
+
|
|
11
|
+
## Loop
|
|
12
|
+
|
|
13
|
+
1. Run `bigpowers-benchmark` (external repo); save report path in STATE.md.
|
|
14
|
+
2. Identify target skill + measurable gap from report.
|
|
15
|
+
3. `plan-work` — minimal change proposal with verify commands.
|
|
16
|
+
4. Edit via `craft-skill` / direct SKILL.md edit; run `sync-skills.sh`.
|
|
17
|
+
5. Re-run benchmark; compare scores.
|
|
18
|
+
6. Record decision in `specs/adr/` + `session-state`; revert if regression.
|
|
19
|
+
|
|
20
|
+
## Verify
|
|
21
|
+
|
|
22
|
+
→ verify: benchmark report shows post-change score ≥ baseline (document paths in STATE.md)
|
|
23
|
+
|
|
24
|
+
See [REFERENCE.md](REFERENCE.md) for ADR template.
|
package/execute-plan/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: execute-plan
|
|
3
|
+
model: haiku
|
|
3
4
|
description: Batch-execute tasks from specs/RELEASE-PLAN.md sequentially, with a human checkpoint after each step. Use when user has an approved plan and wants to execute it step-by-step with oversight, or mentions "execute the plan" or "run the plan".
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -15,7 +16,11 @@ Execute the tasks in `specs/RELEASE-PLAN.md` one at a time, showing evidence aft
|
|
|
15
16
|
|
|
16
17
|
### 1. Read the plan
|
|
17
18
|
|
|
18
|
-
Read `specs/RELEASE-PLAN.md` in full.
|
|
19
|
+
Read `specs/RELEASE-PLAN.md` in full. Parse `depends-on:` fields from `specs/TASKS.md` or story steps to build **execution waves** (steps with no unresolved deps run in parallel when user approves).
|
|
20
|
+
|
|
21
|
+
> **CONTEXT ISOLATION** — Spawn each skill invocation (via `delegate-task` / subagent) with a **fresh context window**. Pass decisions only through `specs/STATE.md` — never rely on chat history from prior spawns.
|
|
22
|
+
|
|
23
|
+
Confirm with the user:
|
|
19
24
|
- How many steps are there?
|
|
20
25
|
- Any steps to skip or reorder?
|
|
21
26
|
- Should you stop after a specific step?
|
|
@@ -32,9 +37,12 @@ verify: [verify command]
|
|
|
32
37
|
```
|
|
33
38
|
|
|
34
39
|
**b. Execute the work**
|
|
35
|
-
|
|
40
|
+
|
|
41
|
+
For **wave execution**: group steps that share no `depends-on:` edges; run wave members in parallel via `dispatch-agents`; wait for all verify commands green before next wave. Use atomic `STATE.md` updates (read-modify-write one block) to avoid race conditions.
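Grouping steps into waves amounts to layering a dependency graph: a step joins the current wave once all of its dependencies finished in an earlier wave. A minimal sketch follows, assuming a simplified `step: dep dep ...` input format (hypothetical, not the actual `depends-on:` syntax of `specs/TASKS.md`); `build_waves` is likewise a made-up name:

```shell
# build_waves: hypothetical helper. Reads "step: dep dep ..." lines on stdin
# and prints one wave per line, members separated by spaces.
build_waves() {
  awk -F': *' '
    { deps[$1] = $2; order[++n] = $1 }
    END {
      remaining = n
      while (remaining > 0) {
        wave = ""
        # Pick every pending step whose dependencies are all already done.
        for (i = 1; i <= n; i++) {
          s = order[i]
          if (done[s]) continue
          ready = 1
          m = split(deps[s], d, " ")
          for (j = 1; j <= m; j++) if (!done[d[j]]) ready = 0
          if (ready) wave = wave (wave == "" ? "" : " ") s
        }
        if (wave == "") { print "cycle or unknown dependency"; exit 1 }
        # Mark the wave as done only after it is complete, so members of
        # the same wave never unlock each other.
        k = split(wave, w, " ")
        for (j = 1; j <= k; j++) done[w[j]] = 1
        remaining -= k
        print wave
      }
    }'
}
```

Each printed line is a set of steps that could be handed to `dispatch-agents` together; the next line should start only after the verify commands of the previous one pass.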
|
|
42
|
+
|
|
43
|
+
Implement each step using:
|
|
36
44
|
- Write/edit code directly for small focused changes
|
|
37
|
-
- Spawn a subagent via `delegate-task` for complex isolated work
|
|
45
|
+
- Spawn a subagent via `delegate-task` for complex isolated work (fresh context; read STATE.md first)
|
|
38
46
|
|
|
39
47
|
> **STREAM CONTINUITY** — When writing file content, output in continuous chunks of ~200 lines. Do not pause. Continue immediately until complete. If you need time, emit a placeholder comment rather than going silent.
|
|
40
48
|
|
|
@@ -80,5 +88,5 @@ After all steps complete:
|
|
|
80
88
|
```
|
|
81
89
|
✓ Plan complete: N/N steps executed
|
|
82
90
|
All verify commands passed.
|
|
83
|
-
Suggested next: audit-code → commit-message → release-branch
|
|
91
|
+
Suggested next: verify-work → run-evals → audit-code → simulate-agents → commit-message → release-branch
|
|
84
92
|
```
|
package/grill-me/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: grill-me
|
|
3
|
+
model: sonnet
|
|
3
4
|
description: Stress-test a plan or design through relentless questioning until every decision is resolved. Two modes: Design (default Q&A on decisions) and Docs (grounds every challenge in real library or API documentation). Use when user wants to challenge a plan, validate API assumptions, or mentions "grill me" or "grill me with docs".
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -0,0 +1,5 @@
|
|
|
1
|
+
# Grill With Docs — Question templates
|
|
2
|
+
|
|
3
|
+
- "Docs at [URL] show signature `foo(bar?: Baz)`. Your plan calls `foo(bar, baz)` — which is correct?"
|
|
4
|
+
- "The changelog at [URL] deprecates X in v3. Your plan still uses X — migrate or pin version?"
|
|
5
|
+
- "Error handling in [URL] throws `NetworkError`. Your plan catches `Error` only — is that sufficient?"
|
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: grill-with-docs
|
|
3
|
+
description: Stress-test plan assumptions grounded in real library or API documentation URLs. Use when the plan depends on a specific library or external API, or as a docs-grounded variant of grill-me.
|
|
4
|
+
model: opus
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Grill With Docs
|
|
8
|
+
|
|
9
|
+
> **HARD GATE** — Every challenge must cite a real documentation URL. No hallucinated APIs.
|
|
10
|
+
|
|
11
|
+
## Process
|
|
12
|
+
|
|
13
|
+
1. Read the plan or design under test (`specs/RELEASE-PLAN.md`, INTERFACE-OPTIONS.md, etc.).
|
|
14
|
+
2. List assumptions that depend on external libraries or APIs.
|
|
15
|
+
3. For each assumption: fetch or quote official docs; challenge with "docs say X, plan says Y."
|
|
16
|
+
4. Resolve or update the plan inline; unresolved items block `plan-work`.
|
|
17
|
+
|
|
18
|
+
## Docs mode rules
|
|
19
|
+
|
|
20
|
+
- Cite URL + quoted snippet (method name, parameter, version).
|
|
21
|
+
- If docs contradict the plan, plan loses until updated.
|
|
22
|
+
- Prefer official docs over blog posts.
|
|
23
|
+
|
|
24
|
+
## Verify
|
|
25
|
+
|
|
26
|
+
→ verify: dialogue log contains at least one `https://` doc URL per challenged assumption
|
|
27
|
+
|
|
28
|
+
See [REFERENCE.md](REFERENCE.md) for question templates.
|
package/guard-git/REFERENCE.md
CHANGED
|
@@ -1,5 +1,17 @@
|
|
|
1
1
|
# Git guardrails — reference
|
|
2
2
|
|
|
3
|
+
## Secret patterns (audit + pre-commit)
|
|
4
|
+
|
|
5
|
+
Agents must not commit files containing:
|
|
6
|
+
|
|
7
|
+
- `sk-` (OpenAI API keys)
|
|
8
|
+
- `ghp_` / `gho_` (GitHub tokens)
|
|
9
|
+
- `AKIA` (AWS access key id)
|
|
10
|
+
- `xoxb-` (Slack bot tokens)
|
|
11
|
+
- `-----BEGIN` private keys
|
|
12
|
+
|
|
13
|
+
Use `audit-code` supply-chain checklist before commit. Consider `git-secrets` or custom pre-commit hook in target projects.
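A custom pre-commit check for the prefixes listed above could look roughly like this. It is a sketch, not the hook shipped with the package: `scan_for_secrets` is a hypothetical name, and the boundary guard in front of the prefixes is an added assumption so that words such as `task-list` do not trigger the `sk-` pattern:

```shell
# scan_for_secrets: hypothetical helper. Reads text on stdin, prints matching
# lines with line numbers, and fails when a listed prefix is found.
scan_for_secrets() {
  # Prefixes must start a token; private keys are matched by their header.
  pattern='(^|[^[:alnum:]])(sk-|ghp_|gho_|AKIA|xoxb-)[[:alnum:]]|-----BEGIN [A-Z ]*PRIVATE KEY'
  if grep -nE -e "$pattern"; then
    echo "possible secret found" >&2
    return 1
  fi
}
```

A typical call is `git diff --cached | scan_for_secrets`. Prefix matching is coarse, so false positives and misses are both possible; a dedicated scanner is the safer choice in real projects.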
|
|
14
|
+
|
|
3
15
|
## Copy layout
|
|
4
16
|
|
|
5
17
|
The main script is `pre-tool-use.sh`.
|
|
@@ -98,8 +110,9 @@ Use `BeforeTool` with matcher `run_shell_command`. Set **`GIT_GUARDRAILS_MODE=ge
|
|
|
98
110
|
|
|
99
111
|
Add **Deny list** entries in **Antigravity → Settings → Terminal**:
|
|
100
112
|
|
|
101
|
-
- `git push`
|
|
102
113
|
- `git push --force`
|
|
114
|
+
- `git push origin main`
|
|
115
|
+
- `git push origin master`
|
|
103
116
|
- `git reset --hard`
|
|
104
117
|
- `git clean`
|
|
105
118
|
- `git branch -D`
|
|
@@ -110,26 +123,43 @@ Add **Deny list** entries in **Antigravity → Settings → Terminal**:
|
|
|
110
123
|
|
|
111
124
|
## Verify (local tests)
|
|
112
125
|
|
|
113
|
-
**1.
|
|
126
|
+
**1. Block push to main without land mode (Claude mode):**
|
|
114
127
|
```bash
|
|
115
128
|
echo '{"tool_input":{"command":"git push origin main"}}' | ./pre-tool-use.sh
|
|
116
|
-
# Expected: exit 2,
|
|
129
|
+
# Expected: exit 2, protected branch message
|
|
130
|
+
```
|
|
131
|
+
|
|
132
|
+
**2. Allow push to main with GIT_BIGPOWERS_LAND=1:**
|
|
133
|
+
```bash
|
|
134
|
+
echo '{"tool_input":{"command":"git push origin main"}}' | GIT_BIGPOWERS_LAND=1 ./pre-tool-use.sh
|
|
135
|
+
# Expected: exit 0 (when on main)
|
|
117
136
|
```
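An environment assignment written in front of a command applies to that command only, so in a pipeline it has to precede the process that reads the variable. A minimal sketch, using a stand-in function (`show_land`, hypothetical) in place of the real hook:

```shell
# show_land stands in for the hook: it drains stdin and reports what it sees.
show_land() {
  cat >/dev/null
  echo "LAND=${GIT_BIGPOWERS_LAND:-unset}"
}

unset GIT_BIGPOWERS_LAND

# Assignment before the left-hand command: the reader does not see it.
left=$(GIT_BIGPOWERS_LAND=1 echo '{}' | show_land)

# Assignment before the reader itself: the reader sees it.
right=$(echo '{}' | GIT_BIGPOWERS_LAND=1 show_land)

echo "$left"
echo "$right"
```

Only the second form hands the variable to the process on the right-hand side of the pipe.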
|
|
118
137
|
|
|
119
|
-
**
|
|
138
|
+
**3. Allow push to feature branch:**
|
|
139
|
+
```bash
|
|
140
|
+
echo '{"tool_input":{"command":"git push -u origin feat/my-task"}}' | ./pre-tool-use.sh
|
|
141
|
+
# Expected: exit 0
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
**4. Conventional Commits (Gemini mode):**
|
|
120
145
|
```bash
|
|
121
146
|
echo '{"tool_input":{"command":"git commit -m \"bad message\""}}' | GIT_GUARDRAILS_MODE=gemini ./pre-tool-use.sh
|
|
122
147
|
# Expected: exit 0, {"decision":"deny", "reason":"..."}
|
|
123
148
|
```
|
|
124
149
|
|
|
125
|
-
**
|
|
150
|
+
**5. Protected Branch commit (Cursor mode):**
|
|
126
151
|
```bash
|
|
127
152
|
# Run on 'main' branch
|
|
128
153
|
echo '{"command":"git commit -m \"feat: valid message\""}' | GIT_GUARDRAILS_MODE=cursor ./pre-tool-use.sh
|
|
129
154
|
# Expected: exit 2, "Direct commits to protected branch 'main' are forbidden"
|
|
130
155
|
```
|
|
131
156
|
|
|
132
|
-
**
|
|
157
|
+
**6. Land script exists:**
|
|
158
|
+
```bash
|
|
159
|
+
test -x scripts/land-branch.sh && echo OK
|
|
160
|
+
```
|
|
161
|
+
|
|
162
|
+
**7. Allow (Gemini mode):**
|
|
133
163
|
```bash
|
|
134
164
|
echo '{"tool_input":{"command":"git status"}}' | GIT_GUARDRAILS_MODE=gemini ./pre-tool-use.sh
|
|
135
165
|
# Expected: exit 0, {"decision":"allow"}
|
package/guard-git/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: guard-git
|
|
3
|
+
model: haiku
|
|
3
4
|
description: Block dangerous git commands (push, force push, reset --hard, clean, branch -D, checkout/restore .) and enforce Conventional Commits & Branch Protection before an AI agent runs them. Installs hook scripts for Claude Code, Cursor, Cursor CLI, and Gemini CLI; documents Google Antigravity Terminal deny lists. Use when the user wants git safety hooks, to block git push or destructive git in agents, or to mirror the same policy across AI coding tools.
|
|
4
5
|
---
|
|
5
6
|
|
|
@@ -9,9 +10,11 @@ Installs a shared hook that blocks destructive git operations and enforces workf
|
|
|
9
10
|
|
|
10
11
|
## What gets blocked/enforced
|
|
11
12
|
|
|
12
|
-
- **Safety**: `git push
|
|
13
|
-
- **Discipline**: Blocks direct commits or pushes to protected branches (`main`, `master`).
|
|
13
|
+
- **Safety**: `git push --force`, `git reset --hard`, `git clean -f`, `git branch -D`, `git checkout .`, `git restore .`.
|
|
14
|
+
- **Discipline**: Blocks direct commits or pushes to protected branches (`main`, `master`) unless `GIT_BIGPOWERS_LAND=1` (set only by `scripts/land-branch.sh`).
|
|
15
|
+
- **Allows**: `git push origin <feature-branch>` for backup/CI; solo land push to `main` only inside `land-branch.sh`.
|
|
14
16
|
- **Standardization**: Enforces [Conventional Commits](https://www.conventionalcommits.org/) for all `git commit` commands.
|
|
17
|
+
- **Secrets**: Blocks commits containing common secret patterns (`sk-`, `ghp_`, `AKIA`, `xoxb-`, `-----BEGIN` private keys) — see [REFERENCE.md](REFERENCE.md).
|
|
15
18
|
|
|
16
19
|
## Quick start
|
|
17
20
|
|
package/hook-commits/SKILL.md
CHANGED
|
@@ -1,5 +1,6 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: hook-commits
|
|
3
|
+
model: haiku
|
|
3
4
|
description: Set up pre-commit hooks with lint-staged (Prettier), type checking, and tests in the current repo. Use when user wants to add pre-commit hooks, set up Husky, configure lint-staged, or add commit-time formatting/typechecking/testing.
|
|
4
5
|
---
|
|
5
6
|
|