@heart-of-gold/toolkit 0.1.20 → 0.1.22
- package/package.json +1 -1
- package/plugins/deep-thought/skills/brainstorm/SKILL.md +30 -0
- package/plugins/deep-thought/skills/plan/SKILL.md +44 -8
- package/plugins/deep-thought/skills/review/SKILL.md +9 -2
- package/plugins/guide/skills/claude-code/SKILL.md +90 -190
- package/plugins/marvin/skills/compound/SKILL.md +5 -1
- package/plugins/marvin/skills/work/SKILL.md +11 -0
- package/src/index.ts +1 -1
package/package.json
CHANGED
package/plugins/deep-thought/skills/brainstorm/SKILL.md
CHANGED
@@ -112,6 +112,13 @@ Launch research agents **in parallel**:
 
 **If no relevant findings:** Say so. Don't invent relevance.
 
+If the task is design-heavy, copy-heavy, or boundary-sensitive, also surface:
+- relevant references already present in the repo
+- anti-references or known bad patterns from past work
+- whether a preview artifact will likely be required before autonomous implementation
+
+The goal is not only "what exists?" It is also "what should the future plan pull toward and stay away from?"
+
 **Exit:** Findings presented. User has seen what exists before exploring approaches.
 
 ---
@@ -129,6 +136,16 @@ Ask one question at a time. Start broad (purpose, users), narrow to specifics (c
 
 **If any open questions emerge:** You MUST ask the user about each one. Do not assume answers or defer them silently.
 
+If the chosen approach depends on taste, hierarchy, copy quality, workshop framing, or boundary judgment, you MUST also capture before leaving this phase:
+- target outcome
+- anti-goals
+- references
+- anti-references
+- tone or taste rules
+- representative proof slice
+- explicit rejection criteria
+- whether preview artifacts will be required
+
 **Exit when:**
 - The approach is clear and the user signals a decision
 - You've explored enough to choose (2-3 approaches with tradeoffs)
@@ -223,6 +240,19 @@ related:
 ## Why This Approach
 {Decision rationale — what it optimizes for, why alternatives were rejected}
 
+## Subjective Contract (when needed)
+- Target outcome: {What the result should feel or read like}
+- Anti-goals: {What it must not become}
+- References: {Positive models or repo examples}
+- Anti-references: {Patterns or tones to avoid}
+- Tone or taste rules: {Editorial, design, or teaching constraints}
+- Rejection criteria: {Concrete reasons to say the result is wrong}
+
+## Preview And Proof Slice (when needed)
+- Proof slice: {One representative slice to prove first}
+- Required preview artifacts: {HTML mockup, ASCII preview, screenshot comp, etc.}
+- Rollout rule: {When this can propagate broadly}
+
 ## Key Design Decisions
 
 ### Q1: {Decision topic} — RESOLVED
package/plugins/deep-thought/skills/plan/SKILL.md
CHANGED
@@ -105,6 +105,13 @@ Search past plans for similar features. Surface proven patterns and past risks:
 
 See `../knowledge/active-memory-integration.md` for retrieval patterns.
 
+If the task is design-heavy, copy-heavy, or boundary-sensitive, also search for:
+- existing preview artifacts or mockups
+- repo docs that define tone, design, or boundary rules
+- prior plans that succeeded or drifted on subjective work
+
+Surface both positive models and anti-models. Strong autonomous planning needs references and anti-references, not just similar code.
+
 **Exit:** Codebase patterns known, past solutions surfaced, constraints identified.
 
 ### Autonomy Gate (Medium Challenge)
@@ -175,18 +182,32 @@ confidence: high | medium | low
 **All plans include:**
 1. **Title and one-line summary**
 2. **Problem Statement** — What's wrong or missing? Why does this matter?
-3. **
-4. **
-5. **
+3. **Target End State** — What should be true when this lands?
+4. **Scope and Non-Goals** — What is explicitly outside this change?
+5. **Proposed Solution** — High-level approach
+6. **Implementation Tasks** — Checkboxes with dependency ordering. These become the tracker for `/work`.
+7. **Acceptance Criteria** — How do we know it's done? Measurable, testable.
 
 **Standard and detailed plans also include:**
-
-
-
+8. **Decision Rationale** — Why this approach? Alternatives considered? Tradeoffs?
+9. **Constraints and Boundaries** — Architectural, editorial, release, privacy, or operating rules that stay fixed
+10. **Assumptions** — What must be true for this plan to work? (see Assumption Audit below)
+11. **Risk Analysis** — What could go wrong? How do we mitigate it?
 
 **Detailed plans also include:**
-
-
+12. **Phased Implementation** — Phases with exit criteria per phase
+13. **References** — Links to brainstorm, relevant code, past solutions
+
+**For subjective or boundary-sensitive plans, also include:**
+- **Target outcome** — the felt or perceived result, not just the structural change
+- **Anti-goals** — what the result must not become
+- **References** — positive models or repo examples
+- **Anti-references** — patterns or tones to avoid
+- **Tone or taste rules** — explicit editorial, design, or teaching constraints
+- **Representative proof slice** — one slice to prove before propagation
+- **Rollout rule** — when it is safe to spread the pattern
+- **Rejection criteria** — what makes the result wrong even if it compiles
+- **Required preview artifacts** — the concrete previews needed before autonomous `work` starts
 
 **Confidence calibration (stated in frontmatter and body):**
 - **High:** Clear requirements + existing codebase patterns + bounded scope
@@ -234,6 +255,21 @@ Before finalizing, identify the assumptions the plan depends on and run the Recu
 
 See `../knowledge/discovery-patterns.md` → "Recursive Why" for the loop technique.
 
+### Subjective Contract And Preview Gate
+
+If the task materially changes UI, copy, information architecture, facilitation flow, or a trust boundary judged partly by human review, the plan must additionally include:
+- target outcome
+- anti-goals
+- references
+- anti-references
+- tone or taste rules
+- representative proof slice
+- rollout rule
+- rejection criteria
+- required preview artifacts
+
+For design-heavy tasks, autonomous `work` should not start until at least one preview artifact exists. Acceptable preview forms include an HTML/static mockup, a terminal-friendly structural preview, or another concrete representation that makes drift obvious before implementation. The plan should also say who reviews the preview and what failure sends the work back to planning.
+
 **Exit:** Plan document written.
 
 ---
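The preview gate added in the hunk above ("autonomous `work` should not start until at least one preview artifact exists") can be checked mechanically before a run. The following is a minimal sketch, assuming previews are kept as files in a single directory; the function name and the `docs/previews` path are hypothetical and not part of the toolkit.

```bash
# Sketch only: the directory layout is an assumption, not part of the toolkit.
has_preview_artifact() {
  # Succeeds when the given directory exists and holds at least one regular file.
  [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}

# Hypothetical usage: block autonomous work when no preview exists yet.
if has_preview_artifact "docs/previews"; then
  echo "preview gate: ok"
else
  echo "preview gate: blocked, return to planning"
fi
```

A file's presence says nothing about whether the preview was reviewed; the plan still has to name the reviewer and the failure that sends work back to planning.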
package/plugins/deep-thought/skills/review/SKILL.md
CHANGED
@@ -76,6 +76,7 @@ Then gather project context:
 2. Check what the change touches — auth, scoring, data, migrations, money → extra scrutiny
 3. Read the related plan or brainstorm if referenced in commits or PR description
 4. Note what's tested and what's not in the diff
+5. If the work is design-heavy, copy-heavy, or boundary-sensitive, check whether the plan included non-goals, references, anti-references, proof-slice discipline, and preview artifacts
 
 **Exit:** Conventions loaded, risk areas identified, related context read.
 
@@ -118,13 +119,16 @@ See `../knowledge/socratic-patterns.md` for CoVe technique details.
 
 ### For Document Reviews
 
-Read the full document — don't skim. Evaluate against
+Read the full document — don't skim. Evaluate against eight criteria:
 
 1. **Does it explain WHY?** Decision rationale, not just "we'll use X."
 2. **Are risks identified?** What could go wrong? Mitigations?
 3. **Is the scope clear?** Explicit in/out of scope?
 4. **Are acceptance criteria measurable?** Testable? "Users can do X" not "the system is good."
-5. **
+5. **Are constraints and non-goals explicit?** Could autonomous work stay inside the lane?
+6. **Is propagation discipline clear?** Proof slice first or broad rollout?
+7. **If the work is subjective or boundary-sensitive, is the contract explicit?** References, anti-references, tone rules, rejection criteria.
+8. **If the work is design-heavy, is there a preview gate?** Concrete preview artifacts and a review step before implementation.
 
 ### For Architecture Reviews
 
@@ -220,6 +224,9 @@ Use **AskUserQuestion** with:
 - [ ] Checked scope (clear in/out)
 - [ ] Checked criteria (measurable, testable)
 - [ ] Checked actionability (can `/work` start from this?)
+- [ ] Checked non-goals and constraints (especially for autonomous work)
+- [ ] Checked subjective contract completeness when applicable
+- [ ] Checked preview artifacts and preview gate when applicable
 
 ## When NOT to Use /review
 
package/plugins/guide/skills/claude-code/SKILL.md
CHANGED
@@ -5,259 +5,159 @@ description: Use when the user asks to run Claude Code CLI (`claude`, `claude re
 
 # Claude Code CLI Skill Guide
 
-Use this skill to
+Use this skill to run Claude Code directly, capture Claude's actual output, and summarize it back to the user.
 
-
-
-The canonical execution path is direct `claude` CLI usage, mirroring the `codex` skill's direct `codex exec` pattern. Keep the workflow simple:
-
-- Ask which model to use.
-- Run one direct `claude` command.
-- If it works, summarize Claude's output.
-- If it fails or hangs, report that clearly and stop.
-
-Do not turn a failed Claude run into an improvised shell pipeline, background polling loop, or manual artifact-concatenation workaround unless the user explicitly asks for that style of invocation.
+Mirror the `codex` skill's style: short, direct, and command-first. Do not improvise. Do not use the wrapper as the normal path.
 
 ## Available Models
 
 | Model | Best for |
 | --- | --- |
-| `default` | Anthropic's recommended default for the current account and environment |
 | `sonnet` | Default recommendation for most coding, review, and analysis tasks |
 | `opus` | Stronger reasoning for ambiguous, high-stakes, or architecture-heavy work |
 | `haiku` | Fast, lightweight follow-ups and narrow questions |
-| `
-
-Prefer aliases over hardcoded snapshot names in this skill, because Anthropic documents aliases as the stable Claude Code interface and moves them forward as newer snapshots ship.
+| `default` | Anthropic's account/environment default |
+| `opusplan` | Planning-heavy tasks that intentionally mix planning and execution |
 
-Default recommendation: `sonnet` for most tasks, `opus` when depth matters more than speed
+Default recommendation: `sonnet` for most tasks, `opus` when depth matters more than speed.
 
 ## Permission Modes
 
-Choose the permission mode based on
+Choose the permission mode yourself based on the task. Do not ask the user unless they explicitly want to control it.
 
-| Mode | Use when |
-| --- | --- |
-| `
-| `
-| `
-| `bypassPermissions` |
+| Mode | Use when |
+| --- | --- |
+| `default` | Normal review, analysis, debugging, and repo inspection |
+| `acceptEdits` | Claude should edit files or implement changes |
+| `plan` | The user explicitly wants strict read-only planning / analysis |
+| `bypassPermissions` | Only with explicit user approval in a trusted environment |
 
-
+Hard rule: do not use `plan` for ordinary reviews just because it sounds safe. For normal reviews, use `default`.
 
 ## Running a Task
-
-
-
---
---
-
-
---
-
---
---
+
+1. Ask the user which model to use (default: `sonnet`) in one short question, unless they already specified it.
+2. Infer the permission mode from the task:
+   - review / analysis / debugging: `default`
+   - implementation / refactor / fix: `acceptEdits`
+   - strict read-only planning requested by the user: `plan`
+3. Pick a sane `--max-turns`:
+   - narrow question or follow-up: `3`
+   - normal file or diff review: `6`
+   - multi-directory repo review or investigation: `8`
+   - implementation / refactor: `10` to `12`
+4. Do not add `--effort high` by default. Use the CLI default unless the user explicitly wants deeper reasoning or the task is unusually ambiguous.
+5. Assemble the direct `claude` command with the appropriate options:
+   - `-p, --print`
+   - `--output-format text` for short runs that should complete quickly
+   - `--output-format stream-json --verbose` for long repo reviews, investigations, or any run where you need progress visibility
    - `--model <MODEL>`
    - `--permission-mode <MODE>`
-   - `--
---
-   - `--allowedTools
-   - `--
-
-
-5. Prefer `--output-format text` for human-readable summaries and `--output-format json` for automation or machine parsing.
-6. In headless automation, prefer explicit permissions:
-   - `default` for most review and analysis runs
-   - `acceptEdits` for edit-capable runs
-   - `plan` only when you specifically want read-only analysis with no command execution
-7. Run the command, capture stdout/stderr, and summarize the outcome for the user.
-8. If Claude Code does not actually return output, stop and report that failure. Do not substitute your own review or analysis and present it as if it came from Claude Code.
-9. Do not add fallback layers automatically. One direct Claude attempt is the normal path. A retry is acceptable only when the user asks for it or when you have a concrete reason to change one thing, such as model, permission mode, or execution environment.
+   - `--max-turns <N>`
+   - `"your prompt here"` immediately after `-p` when using variadic flags like `--allowedTools`
+   - `--allowedTools <TOOL> [<TOOL> ...]` only to pre-approve tools without prompting
+   - `--disallowedTools <TOOL> [<TOOL> ...]` when you actually need to prevent specific tools
+6. Run the command, capture the output, and summarize Claude's actual result.
+7. If Claude does not actually return output, stop and report failure plainly.
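Steps 2 and 3 of the new "Running a Task" list amount to a small lookup from task kind to flag value. The following is a minimal sketch of that lookup; the task-kind labels and function names are hypothetical. Two mappings are this sketch's own choices: `investigation` uses `default`, following the multi-directory investigation command pattern, and implementation work gets `12` from the guide's `10` to `12` range.

```bash
# Sketch only: task-kind labels are hypothetical, not values the CLI understands.

# Map a task kind to the permission mode recommended in step 2.
pick_permission_mode() {
  case "$1" in
    review|analysis|debugging|investigation) echo "default" ;;
    implementation|refactor|fix)             echo "acceptEdits" ;;
    planning)                                echo "plan" ;;  # only when the user asks for strict read-only planning
    *) echo "unknown task kind: $1" >&2; return 1 ;;
  esac
}

# Map a task kind to the --max-turns budget listed in step 3.
pick_max_turns() {
  case "$1" in
    follow-up)               echo 3 ;;
    review)                  echo 6 ;;
    investigation)           echo 8 ;;
    implementation|refactor) echo 12 ;;  # guide says 10 to 12; 12 is this sketch's choice
    *) echo "unknown task kind: $1" >&2; return 1 ;;
  esac
}

kind="review"
echo "--permission-mode $(pick_permission_mode "$kind") --max-turns $(pick_max_turns "$kind")"
# prints: --permission-mode default --max-turns 6
```

Unknown task kinds fail loudly instead of falling back to a default, which matches the guide's preference for reporting problems plainly over improvising.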
 
 ## Codex Execution
 
-When this skill is used from Codex, the
-
-- Prefer the direct `claude` command patterns in this skill. Do not silently substitute the bundled wrapper for the normal path just because it exists.
-- In this environment, actual `claude -p` runs have been verified to hang inside the Codex sandbox but succeed outside it. Treat reviewer-approved elevated execution as the default practical path for real Claude invocations.
-- Use the sandboxed path only when the user explicitly wants to test sandbox behavior or when you have a concrete reason to believe the sandbox limitation no longer applies.
-- If Claude Code needs network access, auth refresh, access outside the writable sandbox, or the in-sandbox run fails or hangs for likely sandbox reasons, rerun via `exec_command` with `sandbox_permissions: "require_escalated"`.
-- If you verify that direct `claude -p` works outside the sandbox but hangs inside it in the current environment, stop treating the sandboxed path as normal. For the rest of that Codex session, prefer reviewer-approved elevated execution as the default path for actual Claude invocations.
-- Use a concise justification that tells the reviewer why Claude Code needs elevation, for example:
-  - "Do you want me to run Claude Code with network access so it can authenticate and complete this review?"
-  - "Do you want me to rerun Claude Code outside the sandbox because the in-sandbox headless run appears blocked by permissions?"
-- Prefer a narrow approval request. Ask for elevation only for the Claude invocation that needs it, not unrelated commands.
-- If the task is high-risk, restate the chosen Claude permission mode when asking for elevation.
+When this skill is used from Codex, use the execution path that actually works here.
 
-
+- For real Claude runs in this environment, prefer `exec_command` with `sandbox_permissions: "require_escalated"`.
+- Do not treat the sandboxed path as normal. Direct `claude -p` has been verified to work outside the sandbox and to hang or misbehave inside it.
+- For long repo reviews from Codex, prefer `--output-format stream-json --verbose` so progress is visible. Plain `text` output can remain silent for a long time even when Claude is actively working.
+- Use a narrow approval request tied to the Claude invocation, for example:
+  - "Do you want me to run Claude Code outside the sandbox so it can complete this review?"
+  - "Do you want me to run Claude Code with network/auth access so it can complete this task?"
+- Once elevated Claude execution is known to work for the current task, keep using that path. Do not retry the sandboxed path again during the same task.
 
-
-- Use `--output-format json` when another agent or script needs structured output.
-- Use `--allowedTools` to make headless runs more reliable and safer.
-- Prefer letting Claude inspect the repo directly for normal review and implementation tasks.
-- Pass diffs or logs via stdin only when the user explicitly wants artifact-only review or when direct repo access is intentionally unavailable.
-- Use `acceptEdits` for real implementation work; do not force read-only `plan` mode onto edit tasks.
-- Use `bypassPermissions` only in a trusted sandbox and only with explicit user approval.
-- Use `-r latest -p` for follow-up instead of re-explaining the entire task.
-- Avoid giant `cat file1 file2 ... | claude ...` constructions as a default strategy. They are brittle, hard to inspect, and a poor substitute for direct Claude access to the repo.
+## Recommended Command Patterns
 
-
+### 1. Normal repo review
 
-
-
-Use this as the normal review path:
+Use this as the default review path:
 
 ```bash
 claude -p \
-
+  "Review this repository and return findings ordered by severity with file references where possible." \
+  --output-format stream-json \
+  --verbose \
   --model sonnet \
   --permission-mode default \
-  --max-turns
-  "
+  --max-turns 8 \
+  --allowedTools "Read" "Grep" "Glob" "LS" "Bash(git status:*)" "Bash(git diff:*)" \
+  --disallowedTools "Edit" "Write" "NotebookEdit"
 ```
 
-### 2. Review
-
-Use stdin only when you intentionally want a bounded artifact review:
+### 2. Review current changes only
 
 ```bash
 git diff --staged | claude -p \
+  "Review this diff and return findings ordered by severity with concise explanations." \
   --output-format text \
   --model sonnet \
   --permission-mode default \
-  --max-turns
-  "Review this diff. Return findings ordered by severity with file paths and concise explanations."
+  --max-turns 6
 ```
 
-### 3.
+### 3. Multi-directory repo investigation
 
-Use this when
+Use this when the prompt names several directories or asks for deeper inspection:
 
 ```bash
 claude -p \
-
-  --
+  "Inspect the relevant parts of this repo, evaluate the design, and return evidence-based findings ordered by severity." \
+  --output-format stream-json \
+  --verbose \
+  --model opus \
   --permission-mode default \
-  --max-turns
-  --allowedTools "Read
-  "
+  --max-turns 8 \
+  --allowedTools "Read" "Grep" "Glob" "LS" "Bash(git status:*)" "Bash(git diff:*)" \
+  --disallowedTools "Edit" "Write" "NotebookEdit"
 ```
 
-### 4.
-
-Use `acceptEdits` when Claude should change files:
+### 4. Implementation or refactor
 
 ```bash
 claude -p \
+  "Implement the requested change in this repo, keep the diff minimal, and summarize what changed." \
   --output-format text \
   --model sonnet \
   --permission-mode acceptEdits \
-  --
-  --max-turns 8 \
-  "Implement the requested change in the current repo, keep the diff minimal, and summarize what changed."
-```
-
-### 5. Resume the latest Claude Code session
-
-```bash
-claude -r latest -p \
-  --output-format text \
-  "Focus only on the migration risk you mentioned earlier. What is the safest rollout plan?"
-```
-
-### 6. Request structured output for automation
-
-```bash
-claude -p \
-  --output-format json \
-  --model sonnet \
-  --permission-mode default \
-  --max-turns 1 \
-  "Summarize the provided diff as JSON with keys: verdict, findings, risks."
+  --max-turns 12
 ```
 
-###
-
-Use `plan` only when you want Claude prevented from executing commands or editing files:
+### 5. Resume the latest session
 
 ```bash
-
-  --output-format text
-  --model sonnet \
-  --permission-mode plan \
-  --max-turns 1 \
-  "Analyze this diff and explain the key risks. Keep the response concise."
+claude -r latest -p "Focus only on the open issue you identified earlier and propose the safest next step." \
+  --output-format text
 ```
 
-## Optional Wrapper
-
-This skill also ships a convenience wrapper at `scripts/run-claude-code.sh`.
-
-Use it as a thin helper around the documented CLI patterns above. It supports:
-
-- `--check`
-- `--prompt`
-- `--model`
-- `--permission-mode`
-- `--effort`
-- `--max-turns`
-- `--output-format`
-- `--allowed-tools`
-- `--disallowed-tools`
-- `--add-dir`
-- `--resume`
-- `--continue`
-- `--name`
-- `--no-session-persistence`
-- `--verbose`
-- `--timeout-seconds`
-
-Do not treat the wrapper as the default execution path for this skill. Use it only when:
-
-- the user explicitly asks to use the wrapper
-- you are debugging Claude Code hangs or invocation edge cases
-- you need its optional timeout helper for diagnosis
-
-Otherwise, use the direct `claude` commands in this document.
-
 ## Following Up
 
-- After every Claude
--- 
-- Keep
-
-## Critical Evaluation of Claude Code Output
-
-Claude Code is a colleague, not an authority.
-
-### Guidelines
-- Trust your own knowledge when confident.
-- Push back if Claude Code makes a claim that conflicts with the code or docs in front of you.
-- Research disagreements before accepting them for high-impact decisions.
-- Do not defer blindly on models, permissions, or fast-moving best practices.
-
-### When Claude Code Seems Wrong
-1. State the disagreement clearly to the user.
-2. Provide evidence from the codebase, documentation, or your own verification.
-3. Optionally resume the Claude Code session with a corrective prompt:
-
-```bash
-claude -r latest -p \
-  --output-format text \
-  "I disagree with your earlier conclusion because [evidence]. Re-evaluate only that point."
-```
-
-4. Treat the exchange as a discussion, not a correction ritual.
+- After every Claude run, tell the user which model and permission mode were used.
+- Tell the user they can ask to resume the Claude session or rerun with a different model.
+- Keep follow-up prompts narrow.
 
 ## Error Handling
 
 - Stop and report failures whenever `claude --version` or a `claude -p` command exits non-zero.
-- If Claude
-- If
-- If
---
---
---
-- Do not
---
-- Do not
+- If Claude reaches `max turns`, report that plainly. Do not paraphrase it as a hang.
+- If `--output-format text` stays silent during a long repo review, do not assume it is hung. Re-run or start with `--output-format stream-json --verbose` to confirm whether Claude is actively working.
+- If Claude truly hangs or returns no output even in streaming mode, report that plainly and stop.
+- Do not substitute your own review and imply it came from Claude.
+- Do not automatically switch to `plan` after a failed `default` run.
+- Do not pass the prompt after `--allowedTools`, because this CLI can consume it as another allowed tool and then fail with "Input must be provided..."
+- Do not describe `--allowedTools` as a strict read-only sandbox. It pre-approves tools; it does not, by itself, ban other tools.
+- Do not automatically fall back to giant `cat file1 file2 ... | claude ...` pipelines.
+- Do not use the bundled wrapper unless the user explicitly asks for it or you are debugging the invocation itself.
+
+## Critical Evaluation
+
+Claude Code is a colleague, not an authority.
+
+- Trust your own technical judgment when the repo contradicts Claude.
+- Verify high-impact claims against the code or docs.
+- If needed, resume the Claude session with a corrective prompt instead of silently accepting a bad conclusion.
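The ordering rule in the Error Handling list above (prompt immediately after `-p`, variadic flags last) is easy to get wrong when a command is assembled in a script. The following is a minimal sketch that builds the argument list for the "normal repo review" pattern and prints it one argument per line instead of invoking `claude`. The function name is hypothetical, and the tool list is shortened for brevity.

```bash
# Sketch only: prints the argument list so the ordering can be inspected.
# Nothing here calls the real CLI.
build_review_args() {
  # $1 is the prompt. It is placed directly after -p, before any variadic
  # flag, so --allowedTools cannot consume it as another tool name.
  set -- -p "$1" \
    --output-format stream-json --verbose \
    --model sonnet \
    --permission-mode default \
    --max-turns 8 \
    --allowedTools "Read" "Grep" "Glob" \
    --disallowedTools "Edit" "Write" "NotebookEdit"
  printf '%s\n' "$@"
}

build_review_args "Review this repository and return findings ordered by severity."
```

To run the command for real, the final `printf` line would become `claude "$@"`. As the guide notes, `--allowedTools` only pre-approves tools; the `--disallowedTools` entries are what keep edits out of a review run.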
package/plugins/marvin/skills/compound/SKILL.md
CHANGED
@@ -59,6 +59,8 @@ Auto-detect capture type:
 3. label: "Learning", description: "Pattern, preference, or principle for CLAUDE.md or memory"
 - multiSelect: false
 
+Repeated workflow friction counts as captureable knowledge. If the same clarification keeps happening during `brainstorm`, `plan`, `work`, or `review`, capture whether it belongs in repo doctrine or in reusable toolkit workflow guidance.
+
 **Exit:** Capture type determined.
 
 ---
@@ -119,7 +121,8 @@ Determine the right location:
 ### Learning Capture
 
 - **Project-specific:** Add to project CLAUDE.md or memory files
-- **Toolkit-wide:** Add to the relevant plugin's `../knowledge/` directory
+- **Toolkit-wide:** Add to the relevant plugin's `../knowledge/` directory or skill docs
+- **Hybrid:** Split repo-local truth from toolkit-wide truth explicitly instead of mixing them
 - Keep it concise — one pattern per entry
 
 **Exit:** Knowledge extracted and structured.
@@ -220,6 +223,7 @@ Before delivering, verify:
 - [ ] **Verified:** Root cause confirmed, fix tested (not just proposed)
 - [ ] **No duplicates:** Searched existing docs — this is genuinely new
 - [ ] **Correctly typed:** Solution vs context doc vs learning — right structure and location
+- [ ] **Layered correctly:** Repo-local rule vs toolkit-wide workflow truth is explicit
 - [ ] **No application code modified** — knowledge documents only
 
 ## What Makes This Heart of Gold
package/plugins/marvin/skills/work/SKILL.md
CHANGED
@@ -56,6 +56,15 @@ Prefer the harness's structured choice UI when available. Otherwise present conc
 **If anything in the plan is unclear:**
 Ask clarifying questions now — better to ask than build wrong.
 
+If the plan authorizes design-heavy, copy-heavy, or boundary-sensitive work, verify before leaving Phase 0 that the plan already includes:
+- target outcome and anti-goals
+- references and anti-references
+- proof-slice or rollout rule
+- explicit rejection criteria
+- preview artifacts when the task depends on human judgment
+
+If those items are missing, stop and return to planning. Do not invent the missing subjective contract during implementation.
+
 ### Autonomy Activation
 
 When a plan file is provided: **default to autonomous mode.** Plans are pre-approved decisions — execute without re-litigating them.
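The Phase 0 check added in the hunk above lends itself to a mechanical pre-flight. The following is a minimal sketch, assuming the plan is a Markdown file that records each contract item as a `- Label: value` bullet, as in the brainstorm template's Subjective Contract section. The function name and that layout are assumptions, not part of the toolkit. Preview artifacts are left out because whether they are required depends on human judgment.

```bash
# Sketch only: assumes contract items appear as "- Label: ..." bullets.
# Prints each missing item on its own line; prints nothing when all are present.
missing_contract_items() {
  plan_file="$1"
  for item in "Target outcome" "Anti-goals" "References" "Anti-references" "Rejection criteria"; do
    # Anchored at line start so "Anti-references" does not satisfy "References".
    grep -qi -- "^- $item:" "$plan_file" || printf '%s\n' "$item"
  done
  # Either a proof slice or a rollout rule satisfies the third bullet.
  grep -qiE -- "^- (Proof slice|Rollout rule):" "$plan_file" \
    || printf '%s\n' "Proof slice or rollout rule"
}

# Usage (hypothetical path):
#   missing="$(missing_contract_items docs/plans/my-plan.md)"
#   [ -z "$missing" ] || { echo "Return to planning; missing: $missing"; exit 1; }
```

The check confirms only that the labels exist. Whether the contract is any good remains a review question, which is why the skill sends missing items back to planning instead of filling them in during implementation.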
@@ -137,6 +146,8 @@ while (unchecked tasks remain):
 
 **Stage specific files — never `git add .`**
 
+If the plan requires a proof slice, do not propagate beyond that slice until its verification passes. If preview review or plan adherence fails, update the plan instead of improvising the missing contract in code.
+
 **Exit:** All tasks checked off, all tests passing.
 
 ---
package/src/index.ts
CHANGED
@@ -7,7 +7,7 @@ import { targetsCommand } from "./commands/targets";
 const main = defineCommand({
   meta: {
     name: "heart-of-gold",
-    version: "0.1.
+    version: "0.1.22",
     description:
       "Cross-platform installer for Heart of Gold skills — Codex, OpenCode, Pi, Claude Code, and more",
   },