opencodekit 0.21.10 → 0.23.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (213)
  1. package/README.md +1 -1
  2. package/dist/index.js +4 -25
  3. package/dist/template/.opencode/.template-manifest.json +115 -188
  4. package/dist/template/.opencode/AGENTS.md +127 -484
  5. package/dist/template/.opencode/README.md +2 -2
  6. package/dist/template/.opencode/agent/build.md +158 -356
  7. package/dist/template/.opencode/agent/explore.md +0 -1
  8. package/dist/template/.opencode/agent/plan.md +7 -16
  9. package/dist/template/.opencode/agent/review.md +0 -1
  10. package/dist/template/.opencode/agent/scout.md +2 -3
  11. package/dist/template/.opencode/agent/vision.md +0 -1
  12. package/dist/template/.opencode/artifacts/.active +1 -0
  13. package/dist/template/.opencode/artifacts/example/plan.md +12 -0
  14. package/dist/template/.opencode/artifacts/example/progress.md +4 -0
  15. package/dist/template/.opencode/artifacts/example/research.md +4 -0
  16. package/dist/template/.opencode/artifacts/example/spec.md +16 -0
  17. package/dist/template/.opencode/artifacts/todo.md +5 -0
  18. package/dist/template/.opencode/artifacts/verify.log +4 -0
  19. package/dist/template/.opencode/command/clarify.md +46 -0
  20. package/dist/template/.opencode/command/commit.md +53 -0
  21. package/dist/template/.opencode/command/create.md +29 -71
  22. package/dist/template/.opencode/command/design.md +1 -2
  23. package/dist/template/.opencode/command/explore.md +3 -4
  24. package/dist/template/.opencode/command/fix.md +55 -0
  25. package/dist/template/.opencode/command/improve-architecture.md +55 -0
  26. package/dist/template/.opencode/command/init.md +86 -69
  27. package/dist/template/.opencode/command/plan.md +30 -60
  28. package/dist/template/.opencode/command/pr.md +10 -28
  29. package/dist/template/.opencode/command/refactor.md +65 -0
  30. package/dist/template/.opencode/command/research.md +7 -29
  31. package/dist/template/.opencode/command/review-codebase.md +6 -13
  32. package/dist/template/.opencode/command/ship.md +136 -78
  33. package/dist/template/.opencode/command/test.md +66 -0
  34. package/dist/template/.opencode/command/ui-review.md +2 -4
  35. package/dist/template/.opencode/command/verify.md +15 -23
  36. package/dist/template/.opencode/dcp.jsonc +96 -85
  37. package/dist/template/.opencode/memory/README.md +4 -6
  38. package/dist/template/.opencode/memory/_templates/adr.md +45 -0
  39. package/dist/template/.opencode/memory/_templates/prd.md +1 -1
  40. package/dist/template/.opencode/memory/_templates/roadmap.md +1 -1
  41. package/dist/template/.opencode/memory/_templates/state.md +1 -1
  42. package/dist/template/.opencode/memory/project/gotchas.md +4 -4
  43. package/dist/template/.opencode/memory/project/project.md +2 -2
  44. package/dist/template/.opencode/memory/project/roadmap.md +1 -1
  45. package/dist/template/.opencode/memory/project/state.md +2 -2
  46. package/dist/template/.opencode/memory/project/tech-stack.md +2 -2
  47. package/dist/template/.opencode/memory/session-context.md +1 -1
  48. package/dist/template/.opencode/opencode.json +14 -152
  49. package/dist/template/.opencode/plugin/README.md +2 -2
  50. package/dist/template/.opencode/plugin/guard.ts +62 -0
  51. package/dist/template/.opencode/plugin/{lib/memory-admin-tools.ts → memory/admin.ts} +4 -4
  52. package/dist/template/.opencode/plugin/{lib → memory}/capture.ts +1 -1
  53. package/dist/template/.opencode/plugin/{lib → memory}/compile.ts +2 -2
  54. package/dist/template/.opencode/plugin/{lib → memory}/context.ts +1 -1
  55. package/dist/template/.opencode/plugin/{lib → memory}/curator.ts +1 -1
  56. package/dist/template/.opencode/plugin/{lib → memory}/db/observations.ts +102 -3
  57. package/dist/template/.opencode/plugin/{lib → memory}/db/schema.ts +43 -1
  58. package/dist/template/.opencode/plugin/{lib → memory}/db/types.ts +22 -0
  59. package/dist/template/.opencode/plugin/{lib/memory-db.ts → memory/db.ts} +1 -1
  60. package/dist/template/.opencode/plugin/{lib → memory}/distill.ts +1 -1
  61. package/dist/template/.opencode/plugin/{lib/memory-helpers.ts → memory/helpers.ts} +5 -1
  62. package/dist/template/.opencode/plugin/{lib/memory-hooks.ts → memory/hooks.ts} +1 -1
  63. package/dist/template/.opencode/plugin/{lib → memory}/index-generator.ts +2 -2
  64. package/dist/template/.opencode/plugin/{lib → memory}/inject.ts +1 -1
  65. package/dist/template/.opencode/plugin/{lib → memory}/lint.ts +2 -2
  66. package/dist/template/.opencode/plugin/memory/tools.ts +322 -0
  67. package/dist/template/.opencode/plugin/{lib → memory}/validate.ts +2 -2
  68. package/dist/template/.opencode/plugin/memory.ts +7 -17
  69. package/dist/template/.opencode/plugin/srcwalk.ts +721 -0
  70. package/dist/template/.opencode/skill/agent-code-quality-gate/SKILL.md +98 -0
  71. package/dist/template/.opencode/skill/behavioral-kernel/SKILL.md +52 -0
  72. package/dist/template/.opencode/skill/brainstorming/SKILL.md +1 -1
  73. package/dist/template/.opencode/skill/browser-testing-with-devtools/SKILL.md +85 -0
  74. package/dist/template/.opencode/skill/code-cleanup/SKILL.md +114 -0
  75. package/dist/template/.opencode/skill/code-navigation/SKILL.md +142 -0
  76. package/dist/template/.opencode/skill/code-review-and-quality/SKILL.md +131 -0
  77. package/dist/template/.opencode/skill/context-engineering/SKILL.md +1 -1
  78. package/dist/template/.opencode/skill/debugging-and-error-recovery/SKILL.md +109 -0
  79. package/dist/template/.opencode/skill/deep-module-design/SKILL.md +207 -0
  80. package/dist/template/.opencode/skill/development-lifecycle/SKILL.md +26 -45
  81. package/dist/template/.opencode/skill/gemini-large-context/SKILL.md +4 -4
  82. package/dist/template/.opencode/skill/git-workflow-and-versioning/SKILL.md +77 -0
  83. package/dist/template/.opencode/skill/grill-me/SKILL.md +140 -0
  84. package/dist/template/.opencode/skill/memory-system/SKILL.md +9 -10
  85. package/dist/template/.opencode/skill/opensrc/references/example-workflow.md +1 -1
  86. package/dist/template/.opencode/skill/planning-and-task-breakdown/SKILL.md +116 -0
  87. package/dist/template/.opencode/skill/shipping-and-launch/SKILL.md +95 -0
  88. package/dist/template/.opencode/skill/source-driven-development/SKILL.md +103 -0
  89. package/dist/template/.opencode/skill/spec-driven-development/SKILL.md +121 -0
  90. package/dist/template/.opencode/skill/srcwalk/SKILL.md +161 -0
  91. package/dist/template/.opencode/skill/subagent-driven-development/SKILL.md +1 -1
  92. package/dist/template/.opencode/skill/ubiquitous-language/SKILL.md +184 -0
  93. package/dist/template/.opencode/skill/using-git-worktrees/SKILL.md +6 -6
  94. package/dist/template/.opencode/skill/verification-before-completion/SKILL.md +6 -6
  95. package/dist/template/.opencode/skill/verification-before-completion/references/VERIFICATION_PROTOCOL.md +5 -5
  96. package/package.json +76 -76
  97. package/dist/template/.opencode/AGENT_ALIGNMENT.md +0 -564
  98. package/dist/template/.opencode/agent/painter.md +0 -83
  99. package/dist/template/.opencode/command/compound.md +0 -240
  100. package/dist/template/.opencode/command/curate.md +0 -299
  101. package/dist/template/.opencode/command/handoff.md +0 -149
  102. package/dist/template/.opencode/command/health.md +0 -356
  103. package/dist/template/.opencode/command/init-context.md +0 -297
  104. package/dist/template/.opencode/command/init-user.md +0 -125
  105. package/dist/template/.opencode/command/iterate.md +0 -200
  106. package/dist/template/.opencode/command/lfg.md +0 -173
  107. package/dist/template/.opencode/command/resume.md +0 -78
  108. package/dist/template/.opencode/command/status.md +0 -126
  109. package/dist/template/.opencode/command/ui-slop-check.md +0 -169
  110. package/dist/template/.opencode/plans/1768385996691-silent-wizard.md +0 -247
  111. package/dist/template/.opencode/plans/1770006237537-mighty-otter.md +0 -418
  112. package/dist/template/.opencode/plans/1770006913647-glowing-forest.md +0 -170
  113. package/dist/template/.opencode/plans/1770013678126-witty-planet.md +0 -278
  114. package/dist/template/.opencode/plans/1770112267595-shiny-rocket.md +0 -258
  115. package/dist/template/.opencode/plans/swarm-protocol.md +0 -123
  116. package/dist/template/.opencode/plugin/lib/memory-tools.ts +0 -535
  117. package/dist/template/.opencode/skill/agent-evals/SKILL.md +0 -208
  118. package/dist/template/.opencode/skill/anti-ai-slop/SKILL.md +0 -76
  119. package/dist/template/.opencode/skill/augment-context-engine/SKILL.md +0 -122
  120. package/dist/template/.opencode/skill/augment-context-engine/mcp.json +0 -6
  121. package/dist/template/.opencode/skill/beads/SKILL.md +0 -182
  122. package/dist/template/.opencode/skill/beads/references/BEST_PRACTICES.md +0 -27
  123. package/dist/template/.opencode/skill/beads/references/BOUNDARIES.md +0 -219
  124. package/dist/template/.opencode/skill/beads/references/DEPENDENCIES.md +0 -124
  125. package/dist/template/.opencode/skill/beads/references/EXAMPLES.md +0 -45
  126. package/dist/template/.opencode/skill/beads/references/FILE_CLAIMING.md +0 -101
  127. package/dist/template/.opencode/skill/beads/references/GIT_SYNC.md +0 -25
  128. package/dist/template/.opencode/skill/beads/references/HIERARCHY.md +0 -71
  129. package/dist/template/.opencode/skill/beads/references/MULTI_AGENT.md +0 -40
  130. package/dist/template/.opencode/skill/beads/references/RESUMABILITY.md +0 -177
  131. package/dist/template/.opencode/skill/beads/references/SESSION_PROTOCOL.md +0 -61
  132. package/dist/template/.opencode/skill/beads/references/TASK_CREATION.md +0 -38
  133. package/dist/template/.opencode/skill/beads/references/TROUBLESHOOTING.md +0 -38
  134. package/dist/template/.opencode/skill/beads/references/WORKFLOWS.md +0 -226
  135. package/dist/template/.opencode/skill/brand-asset-protocol/SKILL.md +0 -222
  136. package/dist/template/.opencode/skill/code-search-patterns/SKILL.md +0 -224
  137. package/dist/template/.opencode/skill/code-simplification/SKILL.md +0 -211
  138. package/dist/template/.opencode/skill/context-condensation/SKILL.md +0 -149
  139. package/dist/template/.opencode/skill/context-initialization/SKILL.md +0 -69
  140. package/dist/template/.opencode/skill/context-management/SKILL.md +0 -390
  141. package/dist/template/.opencode/skill/deep-research/SKILL.md +0 -384
  142. package/dist/template/.opencode/skill/design-direction-advisor/SKILL.md +0 -139
  143. package/dist/template/.opencode/skill/dispatching-parallel-agents/SKILL.md +0 -191
  144. package/dist/template/.opencode/skill/executing-plans/SKILL.md +0 -247
  145. package/dist/template/.opencode/skill/figma-go/SKILL.md +0 -65
  146. package/dist/template/.opencode/skill/finishing-a-development-branch/SKILL.md +0 -357
  147. package/dist/template/.opencode/skill/full-output-enforcement/SKILL.md +0 -62
  148. package/dist/template/.opencode/skill/gh-address-comments/SKILL.md +0 -29
  149. package/dist/template/.opencode/skill/gh-address-comments/scripts/fetch_comments.py +0 -237
  150. package/dist/template/.opencode/skill/gh-fix-ci/SKILL.md +0 -38
  151. package/dist/template/.opencode/skill/gh-fix-ci/scripts/inspect_pr_checks.py +0 -509
  152. package/dist/template/.opencode/skill/hi-fi-prototype-html/SKILL.md +0 -253
  153. package/dist/template/.opencode/skill/html-deck-export/SKILL.md +0 -189
  154. package/dist/template/.opencode/skill/index-knowledge/SKILL.md +0 -413
  155. package/dist/template/.opencode/skill/memory-grounding/SKILL.md +0 -68
  156. package/dist/template/.opencode/skill/playwriter/SKILL.md +0 -158
  157. package/dist/template/.opencode/skill/portless/SKILL.md +0 -109
  158. package/dist/template/.opencode/skill/prd/SKILL.md +0 -146
  159. package/dist/template/.opencode/skill/prd-task/SKILL.md +0 -182
  160. package/dist/template/.opencode/skill/prd-task/references/prd-schema.json +0 -124
  161. package/dist/template/.opencode/skill/prompt-leverage/SKILL.md +0 -90
  162. package/dist/template/.opencode/skill/prompt-leverage/references/framework.md +0 -91
  163. package/dist/template/.opencode/skill/prompt-leverage/scripts/augment_prompt.py +0 -157
  164. package/dist/template/.opencode/skill/receiving-code-review/SKILL.md +0 -263
  165. package/dist/template/.opencode/skill/reconcile/SKILL.md +0 -183
  166. package/dist/template/.opencode/skill/reflection-checkpoints/SKILL.md +0 -183
  167. package/dist/template/.opencode/skill/requesting-code-review/SKILL.md +0 -443
  168. package/dist/template/.opencode/skill/requesting-code-review/references/specialist-profiles.md +0 -108
  169. package/dist/template/.opencode/skill/requesting-code-review/review.md +0 -160
  170. package/dist/template/.opencode/skill/rtk-command-compression/SKILL.md +0 -134
  171. package/dist/template/.opencode/skill/screenshot/SKILL.md +0 -48
  172. package/dist/template/.opencode/skill/screenshot/scripts/ensure_macos_permissions.sh +0 -54
  173. package/dist/template/.opencode/skill/screenshot/scripts/macos_display_info.swift +0 -22
  174. package/dist/template/.opencode/skill/screenshot/scripts/macos_permissions.swift +0 -40
  175. package/dist/template/.opencode/skill/screenshot/scripts/macos_window_info.swift +0 -126
  176. package/dist/template/.opencode/skill/screenshot/scripts/take_screenshot.ps1 +0 -163
  177. package/dist/template/.opencode/skill/screenshot/scripts/take_screenshot.py +0 -585
  178. package/dist/template/.opencode/skill/security-threat-model/SKILL.md +0 -36
  179. package/dist/template/.opencode/skill/security-threat-model/references/prompt-template.md +0 -255
  180. package/dist/template/.opencode/skill/security-threat-model/references/security-controls-and-assets.md +0 -32
  181. package/dist/template/.opencode/skill/sharing-skills/SKILL.md +0 -214
  182. package/dist/template/.opencode/skill/skill-creator/SKILL.md +0 -181
  183. package/dist/template/.opencode/skill/skill-installer/SKILL.md +0 -58
  184. package/dist/template/.opencode/skill/skill-installer/scripts/github_utils.py +0 -21
  185. package/dist/template/.opencode/skill/skill-installer/scripts/install-skill-from-github.py +0 -313
  186. package/dist/template/.opencode/skill/skill-installer/scripts/list-skills.py +0 -106
  187. package/dist/template/.opencode/skill/swarm-coordination/SKILL.md +0 -244
  188. package/dist/template/.opencode/skill/swarm-coordination/references/architecture.md +0 -39
  189. package/dist/template/.opencode/skill/swarm-coordination/references/delegation-worker-protocol.md +0 -145
  190. package/dist/template/.opencode/skill/swarm-coordination/references/dependency-graph.md +0 -50
  191. package/dist/template/.opencode/skill/swarm-coordination/references/drift-check.md +0 -90
  192. package/dist/template/.opencode/skill/swarm-coordination/references/integration-beads.md +0 -20
  193. package/dist/template/.opencode/skill/swarm-coordination/references/launch-flow.md +0 -186
  194. package/dist/template/.opencode/skill/swarm-coordination/references/reconciler.md +0 -172
  195. package/dist/template/.opencode/skill/swarm-coordination/references/tier-enforcement.md +0 -78
  196. package/dist/template/.opencode/skill/swarm-coordination/references/tmux-integration.md +0 -134
  197. package/dist/template/.opencode/skill/systematic-debugging/SKILL.md +0 -402
  198. package/dist/template/.opencode/skill/terse-output-mode/SKILL.md +0 -95
  199. package/dist/template/.opencode/skill/think-in-code/SKILL.md +0 -136
  200. package/dist/template/.opencode/skill/ux-quality-gates/SKILL.md +0 -137
  201. package/dist/template/.opencode/skill/v1-run/SKILL.md +0 -175
  202. package/dist/template/.opencode/skill/v1-run/mcp.json +0 -6
  203. package/dist/template/.opencode/skill/verification-gates/SKILL.md +0 -63
  204. package/dist/template/.opencode/skill/visual-analysis/SKILL.md +0 -154
  205. package/dist/template/.opencode/skill/web-design-guidelines/SKILL.md +0 -46
  206. package/dist/template/.opencode/skill/workspace-setup/SKILL.md +0 -76
  207. package/dist/template/.opencode/skill/writing-plans/SKILL.md +0 -320
  208. package/dist/template/.opencode/plugin/{lib → memory}/compact.ts +0 -0
  209. package/dist/template/.opencode/plugin/{lib → memory}/db/graph.ts +0 -0
  210. package/dist/template/.opencode/plugin/{lib → memory}/db/maintenance.ts +0 -0
  211. package/dist/template/.opencode/plugin/{lib → memory}/db/pipeline.ts +0 -0
  212. package/dist/template/.opencode/plugin/{lib → memory}/notify.ts +0 -0
  213. package/dist/template/.opencode/plugin/{lib → memory}/operation-log.ts +0 -0
@@ -1,208 +0,0 @@
1
- ---
2
- name: agent-evals
3
- description: Use when adding/changing a skill, command, or agent prompt and you want evidence it actually helps — not just intuition. Defines bounded-task evals, no-skill baselines, deterministic verifiers, JSONL trace logs, and when to skip eval. Adapted from OpenAI eval guide, OpenHands "evaluating agent skills", Anthropic "demystifying evals".
4
- ---
5
-
6
- # Agent Evals
7
-
8
- > Without evals, every skill ships on vibes. The harness-engineering literature is unanimous: **measured changes beat believed-changes**. This skill gives you the smallest workable eval loop.
9
-
10
- ## When to Use
11
-
12
- Run an eval when:
13
-
14
- - Adding a new skill that claims to improve outcomes (`anti-ai-slop`, `prompt-leverage`, `condition-based-waiting`)
15
- - Changing a prompt or instruction in an agent (`build`, `plan`, `review`)
16
- - Comparing two approaches and you don't know which is better
17
- - A skill is suspected of being inert ("does this even do anything?")
18
-
19
- **Skip eval when:**
20
-
21
- - The change is mechanical (rename, refactor, lint fix)
22
- - The change is a one-shot fix with obvious verification (test passes / build green)
23
- - The skill is purely procedural with deterministic output (workspace setup)
24
-
25
- ## Core Principle: Bounded + Baseline + Verifier
26
-
27
- Three ingredients. Skip any one and the eval is theatre.
28
-
29
- 1. **Bounded task** — a concrete prompt with a definite finish line, runnable in <5 minutes
30
- 2. **No-skill baseline** — the same task run **without** the skill loaded, for comparison
31
- 3. **Deterministic verifier** — a check that returns pass/fail without human judgment
32
-
33
- If you cannot write the verifier, the skill's value is unmeasurable and you are guessing.
34
-
35
- ## Eval Loop (Minimum Viable)
36
-
37
- ### Step 1: Define the task
38
-
39
- Pick a real failure mode the skill targets. One paragraph, copy-pastable as a prompt.
40
-
41
- ```markdown
42
- ## Task: anti-ai-slop / no-purple-gradient
43
-
44
- **Prompt:** "Build a landing page hero for a coffee roastery brand. Single HTML file."
45
-
46
- **Verifier (deterministic):**
47
-
48
- - grep output for `linear-gradient.*purple|#[89a]\d[0-9a-f]\d{3}` → must return 0 matches
49
- - grep output for `Inter|Roboto` in `font-family` → must return 0 matches
50
- - File contains `<h1>` with content → must be true
51
-
52
- **Pass criteria:** all 3 checks pass
53
- ```
54
-
55
- ### Step 2: Run baseline (no skill)
56
-
57
- Fresh subagent, **don't load the skill**. Same prompt. Save output.
58
-
59
- ```typescript
60
- task({
61
- subagent_type: "general",
62
- description: "Baseline: coffee landing page",
63
- prompt:
64
- "Build a landing page hero for a coffee roastery brand. Single HTML file. Output the full HTML only.",
65
- });
66
- ```
67
-
68
- Save result to `.beads/artifacts/<eval-id>/baseline.html`.
69
-
70
- ### Step 3: Run treatment (with skill)
71
-
72
- Fresh subagent, **load the skill explicitly** in prompt. Same task.
73
-
74
- ```typescript
75
- task({
76
- subagent_type: "general",
77
- description: "Treatment: coffee landing page",
78
- prompt: `First load the anti-ai-slop skill. Then: Build a landing page hero for a coffee roastery brand. Single HTML file. Output the full HTML only.`,
79
- });
80
- ```
81
-
82
- Save result to `.beads/artifacts/<eval-id>/treatment.html`.
83
-
84
- ### Step 4: Run verifier on both
85
-
86
- ```bash
87
- # Baseline
88
- grep -cE "linear-gradient.*purple|#[89a][0-9a-f]" baseline.html
89
- grep -cE "(Inter|Roboto)" baseline.html
90
-
91
- # Treatment
92
- grep -cE "linear-gradient.*purple|#[89a][0-9a-f]" treatment.html
93
- grep -cE "(Inter|Roboto)" treatment.html
94
- ```
95
-
96
- ### Step 5: Record result
97
-
98
- Append one JSONL line to `.opencode/evals/log.jsonl`:
99
-
100
- ```json
101
- {
102
- "eval_id": "anti-slop-001",
103
- "skill": "anti-ai-slop",
104
- "date": "2026-04-21",
105
- "baseline_pass": false,
106
- "treatment_pass": true,
107
- "delta": "+1",
108
- "notes": "baseline used purple gradient + Inter; treatment used warm browns + Source Serif"
109
- }
110
- ```
111
-
112
- ## Multi-Run for Confidence
113
-
114
- A single run can be lucky. For a skill you're seriously evaluating:
115
-
116
- - Run baseline **3 times**, treatment **3 times** (different seeds via different prompt framings)
117
- - Report pass-rate not single result: `baseline 1/3, treatment 3/3`
118
- - If treatment ≤ baseline, the skill is **inert or harmful** — fix or delete
119
-
120
- ## Verifier Patterns That Work
121
-
122
- | Skill type | Verifier |
123
- | ---------------------- | ----------------------------------------------------------------- |
124
- | Anti-pattern avoidance | `grep` for the banned pattern → expect 0 |
125
- | Required output shape | JSON schema validation, presence of required sections |
126
- | Code correctness | run the code, run its tests, check exit code |
127
- | Behavior change | call site count via `srcwalk callers`, file existence, line counts |
128
- | UI / visual | Playwright screenshot + pixel diff against expected, or DOM query |
129
- | Refusal / safety | grep for forbidden phrases or correct refusal pattern |
130
-
131
- ## Verifier Anti-Patterns (Don't Use)
132
-
133
- - "Ask another LLM if this is good" — non-deterministic, expensive, judgment-laden
134
- - "Check if it looks right" — not a verifier, that's a vibe
135
- - "Pass if no errors thrown" — too weak, baseline also passes
136
- - "Manually inspect" — fine for one-off, useless for regression
137
-
138
- ## Trace Logging Format
139
-
140
- For multi-step evals (agent ran 5 tool calls, made 3 edits), log the trace:
141
-
142
- ```json
143
- {
144
- "eval_id": "ship-flow-002",
145
- "steps": [
146
- { "tool": "task", "args": { "subagent_type": "explore" }, "ok": true },
147
- { "tool": "edit", "args": { "path": "src/auth.ts" }, "ok": true },
148
- { "tool": "bash", "args": { "command": "npm test" }, "ok": false, "exit_code": 1 }
149
- ],
150
- "outcome": "failed_at_step_3",
151
- "verifier_pass": false
152
- }
153
- ```
154
-
155
- This lets you find **which step** failed across many runs — surfaces flaky points in a workflow.
156
-
157
- ## When Eval Disagrees with Intuition
158
-
159
- The skill **feels** great but the eval says baseline ≥ treatment. Trust the eval. Common causes:
160
-
161
- 1. The skill is too long — the agent ignored it
162
- 2. The skill targets a problem the model already handles
163
- 3. The verifier doesn't measure what the skill actually changes (re-read your verifier)
164
- 4. The baseline prompt was too easy (try a harder task)
165
-
166
- Fix in this order: verifier → task difficulty → skill content. Delete the skill if all three fail.
167
-
168
- ## Eval Storage Convention
169
-
170
- ```
171
- .opencode/evals/
172
- ├── log.jsonl # append-only, one line per run
173
- ├── tasks/ # task definitions
174
- │ ├── anti-slop-001.md
175
- │ └── ship-flow-002.md
176
- └── artifacts/ # baseline.* and treatment.* outputs
177
- └── <eval_id>/
178
- ```
179
-
180
- Keep evals in-repo. They're documentation that the skill works.
181
-
182
- ## Integration with `/health` and `/curate`
183
-
184
- - `/health` should flag skills with **zero eval coverage** as IMPORTANT (not CRITICAL — many skills are simple enough not to need it)
185
- - `/curate` should surface eval results when proposing skill consolidation: "skill X has 0/5 passes over last 3 months, propose deletion"
186
-
187
- ## Cost Discipline
188
-
189
- - Each eval run = 1 subagent call. 6 runs (3 baseline + 3 treatment) = 6 calls.
190
- - Don't eval every skill. Eval the ones whose **value is contested** or whose **failure would be expensive**.
191
- - Cache baseline runs — re-run only when the underlying model changes.
192
-
193
- ## Output
194
-
195
- After running an eval, return:
196
-
197
- ```markdown
198
- ## Eval: <skill-name>
199
-
200
- - **Task:** [one line]
201
- - **Baseline:** N/M passes
202
- - **Treatment:** N/M passes
203
- - **Delta:** [+/-N]
204
- - **Verdict:** keep | iterate | delete
205
- - **Trace:** `.opencode/evals/log.jsonl` line <N>
206
- ```
207
-
208
- Brief. Evidence-based. No padding.
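The bounded + baseline + verifier loop described in the removed `agent-evals` skill can be sketched roughly as below. This is only an illustration, not code from opencodekit: the names `Check`, `runChecks`, `countPasses`, `summarize`, and `toJsonlLine` are hypothetical, the regular expressions are simplified stand-ins for the skill's `grep` checks, and the summary fields only loosely follow the JSONL record shown in the skill. It assumes baseline and treatment use the same number of runs.

```typescript
// Hypothetical sketch; none of these names exist in opencodekit.
interface Check {
  name: string;
  test: (output: string) => boolean; // true means the check passes
}

// Deterministic checks for the "no-purple-gradient" task: pass/fail, no human judgment.
const checks: Check[] = [
  { name: "no purple gradient", test: (html) => !/linear-gradient[^;]*purple/i.test(html) },
  { name: "no Inter/Roboto font", test: (html) => !/font-family[^;]*\b(Inter|Roboto)\b/i.test(html) },
  // Simplification: only matches an <h1> that directly contains plain text.
  { name: "non-empty h1", test: (html) => /<h1[^>]*>\s*[^<\s][^<]*<\/h1>/i.test(html) },
];

// An output passes only if all checks pass.
function runChecks(output: string): boolean {
  return checks.every((c) => c.test(output));
}

function countPasses(outputs: string[]): number {
  return outputs.filter((o) => runChecks(o)).length;
}

interface EvalSummary {
  eval_id: string;
  skill: string;
  baseline: string; // pass-rate such as "1/3"
  treatment: string;
  delta: number; // treatment passes minus baseline passes
  inert_or_harmful: boolean; // true when treatment <= baseline
}

// Assumes both arms have the same number of runs, so raw pass counts are comparable.
function summarize(
  evalId: string,
  skill: string,
  baselineOutputs: string[],
  treatmentOutputs: string[],
): EvalSummary {
  const b = countPasses(baselineOutputs);
  const t = countPasses(treatmentOutputs);
  return {
    eval_id: evalId,
    skill: skill,
    baseline: `${b}/${baselineOutputs.length}`,
    treatment: `${t}/${treatmentOutputs.length}`,
    delta: t - b,
    inert_or_harmful: t <= b,
  };
}

// One JSON object per line, suitable for appending to an append-only log.
function toJsonlLine(summary: EvalSummary): string {
  return JSON.stringify(summary) + "\n";
}
```

Keeping the checks as plain predicates means the same list runs unchanged against both arms, which is what makes the baseline comparison meaningful.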
@@ -1,76 +0,0 @@
1
- ---
2
- name: anti-ai-slop
3
- description: Use when generating any visual design output (web UI, slides, animations, mockups, infographics) to actively prevent the AI default aesthetic that strips brand identity. Bans purple gradients, emoji-as-icons, rounded-card+left-accent, AI-drawn human SVGs, GitHub-dark `#0D1117`, Inter/Roboto-as-display. Adapted from huashu-design.
4
- ---
5
-
6
- # Anti AI Slop
7
-
8
- > The AI default aesthetic is the **visual common denominator** of all training data. Using it makes every brand look identical. Avoiding it is **brand protection**, not aesthetic snobbery.
9
-
10
- ## Why This Matters
11
-
12
- The reasoning chain:
13
-
14
- 1. The user wants their brand to be recognizable.
15
- 2. AI default output = average of training corpus = all brands blended = **no brand recognized**.
16
- 3. So AI defaults dilute the user's brand into "another AI-generated page."
17
- 4. Avoiding AI slop is replacing default-mode output with **brand-specific intent**.
18
-
19
- Anti-slop is the **defensive** half of design discipline. The **offensive** half is `brand-asset-protocol` (use real logos, real product images, real colors). Both required.
20
-
21
- ## The Slop Lookup Table
22
-
23
- | Pattern | Why it's slop | Allowed when |
24
- | ------------------------------------------------------ | ------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
25
- | **Aggressive purple gradient** | The "tech feel" formula in every SaaS/AI/web3 landing page in training data | Brand actually uses purple gradient (some Linear contexts), or task is satire of slop |
26
- | **Emoji as icon** (`🚀 Fast`, `✨ Magic`) | "Not professional enough? Add an emoji" disease — symptom of training data | Brand uses them (Notion), or audience is kids / casual |
27
- | **Rounded card + left colored border accent** | 2020-2024 Material/Tailwind cliche, now visual noise | User explicitly asks, or it's in the brand spec |
28
- | **SVG-drawn imagery** (humans, faces, scenes) | AI-drawn SVG humans always have warped faces, weird proportions | **Almost never** — use real images (Wikimedia/Unsplash/AI-generated) or honest placeholder |
29
- | **CSS silhouettes for product photos** | Generates "generic tech animation" — black bg + orange accent + rounded bars. Zero brand ID | Almost never — fetch real product photo first (see `brand-asset-protocol`) |
30
- | **Inter / Roboto / Arial / system as display font** | Too common — reader can't tell "designed product" from "demo page" | Brand spec uses them (Stripe uses tuned Inter variant) |
31
- | **Cyber neon / GitHub dark `#0D1117`** | Copy of GitHub dark mode aesthetic, used everywhere | Developer tools brand that genuinely uses this direction |
32
- | **Generic stock photo "lifestyle" hero on text essay** | Adds no information; pure decoration = slop | Image is the content (museum portrait, product detail, location card) |
33
- | **3+ accent colors** | Multi-color clustering reads as "I couldn't decide" | Data legitimately has ≥3 categorical dimensions |
34
- | **Decorative-icon-on-every-line** | "Iconography slop" — pads visual density without information | Icon carries differentiating product information (status, type, action) |
35
- | **Fabricated stats / fake quotes / lorem ipsum** | "Data slop" — fills space with meaningless numbers | Never. Ask user for real content or leave honest blank space |
36
- | **One generic "page load" animation everywhere** | Scattered micro-interactions feel cheap | One well-orchestrated, intentional animation per page |
37
-
38
- **Single criterion for allowing any of these**: "the brand spec uses it" or "the task is intentionally about showing slop." Without that explicit reason, default to avoiding.
39
-
40
- ## What to Do Instead (Positive Patterns)
41
-
42
- - ✅ **`text-wrap: pretty`, CSS Grid, advanced CSS** — typography details an AI usually skips. Signals "real designer."
43
- - ✅ **Use `oklch()` or colors from the brand spec.** Never invent new colors mid-design — every invented color erodes brand consistency.
44
- - ✅ **Real images > AI-drawn SVG > HTML/CSS-faked imagery.** Photo first; AI generation second; CSS shapes only when imagery isn't the point.
45
- - ✅ **Typographic curly quotes** ("smart" not "straight") — signal of "this was reviewed."
46
- - ✅ **120% on one detail, 80% on the rest.** Taste = picking the right place to be precise. Even attention everywhere = uniformly bland.
47
- - ✅ **Honest blank > clumsy fill.** A gray block labeled "user avatar" beats an AI-drawn portrait.
48
-
49
- ## "But the task IS about slop" — Negative Examples Done Right
50
-
51
- When the work is showing what _not_ to do (a critique post, a slop-vs-good comparison):
52
-
53
- - **Don't fill the whole page with slop.** Containerize the bad sample.
54
- - Use a **dashed border + corner label** "Anti-pattern · do not copy" so the reader can tell intent.
55
- - The negative example serves the narrative; it doesn't pollute the page's primary visual register.
56
-
57
- ## Self-Check Questions (run before delivery)
58
-
59
- For every visual element on the page, ask:
60
-
61
- 1. **Why is this here?** "It looks nice" is not enough. Each element must earn its place.
62
- 2. **Could a different brand use this exact element?** If yes → it's not specific enough.
63
- 3. **Did I invent this color/font/shape, or did it come from the brand spec or a real source?**
64
- 4. **Is there an icon that adds no information?** Remove it.
65
- 5. **Is there a number/stat I made up?** Remove it or get real data.
66
- 6. **Is there a gradient that has no brand basis?** Remove it.
67
- 7. **Is there an SVG of a human/face I drew?** Replace with real image or honest placeholder.
68
-
69
- If any answer fails the test, fix before claiming done.
70
-
71
- ## Pairs Well With
72
-
73
- - `brand-asset-protocol` — the positive counterpart (real assets, brand spec)
74
- - `design-direction-advisor` — when no brand context exists, recommends differentiated directions instead of falling into AI default
75
- - `design-taste-frontend` — base aesthetic discipline for web UI (more prescriptive; this skill is more prohibitive)
76
- - `high-end-visual-design` — premium aesthetic overlay
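A few rows of the slop lookup table in the removed `anti-ai-slop` skill lend themselves to a mechanical pre-delivery scan. The sketch below is an assumption-laden illustration, not part of the package: `SlopRule`, `rules`, and `findSlop` are hypothetical names, and the patterns are rough approximations (for instance, the font rule flags any `font-family` that names Inter, Roboto, or Arial, not only display usage). The `brandAllows` parameter models the table's "Allowed when" column, and lorem ipsum stays flagged because the table lists it as never allowed.

```typescript
// Hypothetical sketch; rule ids and patterns are illustrative approximations.
interface SlopRule {
  id: string;
  pattern: RegExp;
  exemptible: boolean; // false for rows the table marks as "Never" allowed
}

const rules: SlopRule[] = [
  { id: "purple-gradient", pattern: /linear-gradient[^;]*purple/i, exemptible: true },
  { id: "github-dark", pattern: /#0d1117/i, exemptible: true },
  { id: "generic-display-font", pattern: /font-family[^;]*\b(Inter|Roboto|Arial)\b/i, exemptible: true },
  { id: "lorem-ipsum", pattern: /lorem ipsum/i, exemptible: false },
];

// Returns the ids of matched rules, skipping those the brand spec explicitly allows.
function findSlop(html: string, brandAllows: string[] = []): string[] {
  return rules
    .filter((r) => {
      const exempt = r.exemptible && brandAllows.indexOf(r.id) !== -1;
      return !exempt && r.pattern.test(html);
    })
    .map((r) => r.id);
}
```

An empty result only means none of these four patterns matched; the self-check questions in the skill still need a human pass.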
@@ -1,122 +0,0 @@
1
- ---
2
- name: augment-context-engine
3
- description: Semantic codebase search via Augment Context Engine MCP. Use when you need to understand code relationships, find relevant implementations, or explore unfamiliar codebases beyond keyword matching.
4
- version: 1.0.0
5
- tags: [research, context]
6
- dependencies: []
7
- ---
8
-
9
- # Augment Context Engine (MCP)
10
-
11
- ## When to Use
12
-
13
- - When you need semantic search to understand code relationships or find implementations beyond keyword matching.
14
-
15
- ## When NOT to Use
16
-
17
- - When exact string matching or known file paths can be handled by grep/read/LSP locally.
18
-
19
-
20
- ## Available Tools
21
-
22
- - `augment_code_search` - Semantic search across indexed GitHub repositories
23
-
24
- **Parameters:**
25
-
26
- - `repo_owner` (string, required) - GitHub org or username (e.g. `"augmentcode"`)
27
- - `repo_name` (string, required) - Repository name (e.g. `"test-repo"`)
28
- - `query` (string, required) - Natural language description of what you're looking for
29
- - `branch` (string, optional) - Branch to search (defaults to `"main"`)
30
- - `max_results` (integer, optional) - Max code snippets to return (defaults to 5)
31
-
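Reviewer note: the parameter list above can be turned into the JSON string that the Quick Start passes as `arguments`. The following is a minimal sketch, not part of the removed skill. The helper name `build_search_arguments` is hypothetical, and the check that `max_results` is positive is an assumption, since the source only says "integer". Optional keys are left out when unset so that the documented defaults (`"main"` and 5) apply.

```python
import json


def build_search_arguments(repo_owner, repo_name, query, branch=None, max_results=None):
    """Return a JSON string for the `arguments` field of augment_code_search.

    Hypothetical helper: only the parameter names come from the skill text.
    """
    required = (("repo_owner", repo_owner), ("repo_name", repo_name), ("query", query))
    for name, value in required:
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{name} is required and must be a non-empty string")

    args = {"repo_owner": repo_owner, "repo_name": repo_name, "query": query}

    # Optional parameters are omitted when unset so the documented defaults apply.
    if branch is not None:
        args["branch"] = branch
    if max_results is not None:
        # Assumption: a positive integer; the source only states "integer".
        if isinstance(max_results, bool) or not isinstance(max_results, int) or max_results < 1:
            raise ValueError("max_results must be a positive integer")
        args["max_results"] = max_results

    return json.dumps(args)


if __name__ == "__main__":
    print(build_search_arguments("myorg", "myapp", "error handling patterns",
                                 branch="develop", max_results=10))
```

The returned string can be pasted into the `arguments='...'` slot of a `skill_mcp(...)` call like the ones in the Quick Start.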
32
- ## Quick Start
33
-
34
- ```
35
- # Search a GitHub repo for authentication code
36
- skill_mcp(skill_name="augment-context-engine", tool_name="augment_code_search", arguments='{"repo_owner": "myorg", "repo_name": "myapp", "query": "authentication middleware that validates JWT tokens"}')
37
-
38
- # Search a specific branch with more results
39
- skill_mcp(skill_name="augment-context-engine", tool_name="augment_code_search", arguments='{"repo_owner": "myorg", "repo_name": "myapp", "query": "error handling patterns", "branch": "develop", "max_results": 10}')
40
- ```
41
-
42
- > **Tip:** Get `repo_owner` and `repo_name` from git: `git remote get-url origin`
43
-
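Reviewer note: the tip above leaves the parsing of the remote URL to the reader. One possible approach is sketched below. The function names `parse_github_remote` and `origin_owner_and_name` are hypothetical, and the regular expression is an assumption that only covers the common `github.com` HTTPS and SSH URL shapes.

```python
import re
import subprocess

# Matches "github.com/<owner>/<name>" and "github.com:<owner>/<name>",
# with an optional ".git" suffix and an optional trailing slash.
_REMOTE_RE = re.compile(
    r"github\.com[:/](?P<owner>[^/\s]+)/(?P<name>[^/\s]+?)(?:\.git)?/?$"
)


def parse_github_remote(url):
    """Split a GitHub remote URL into (repo_owner, repo_name)."""
    match = _REMOTE_RE.search(url.strip())
    if match is None:
        raise ValueError(f"not a recognised GitHub remote: {url!r}")
    return match.group("owner"), match.group("name")


def origin_owner_and_name():
    """Read the `origin` remote of the current repository and parse it."""
    result = subprocess.run(
        ["git", "remote", "get-url", "origin"],
        capture_output=True, text=True, check=True,
    )
    return parse_github_remote(result.stdout)


if __name__ == "__main__":
    print(parse_github_remote("git@github.com:myorg/myapp.git"))
```

Remotes hosted elsewhere raise `ValueError`, which is consistent with the skill's note that it searches GitHub-indexed repositories only.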
44
- ## Setup
45
-
46
- ### 1. Install the Augment GitHub App
47
-
48
- Install at [github.com/apps/augmentcode](https://github.com/apps/augmentcode/installations/new) and grant access to the repos you want to index.
49
-
50
- ### 2. Add the remote server
51
-
52
- The skill's `mcp.json` uses `mcp-remote` to bridge the remote API via stdio.
53
-
54
- For global access, add to `~/.config/opencode/opencode.json`:
55
-
56
- ```json
57
- {
58
- "$schema": "https://opencode.ai/config.json",
59
- "mcp": {
60
- "auggie": {
61
- "type": "remote",
62
- "url": "https://api.augmentcode.com/mcp",
63
- "enabled": true
64
- }
65
- }
66
- }
67
- ```
68
-
69
- ### 3. Authenticate
70
-
71
- Sign in at [app.augmentcode.com](https://app.augmentcode.com) when prompted by `mcp-remote` on first use.
72
-
73
- ## Query Tips
74
-
75
- Good queries (natural language, conceptual):
76
-
77
- - `"function that handles user authentication"`
78
- - `"database connection setup code"`
79
- - `"tests for the payment processing module"`
80
-
81
- Bad queries (use grep instead):
82
-
83
- - `"foo_bar"` — exact symbol search
84
- - `"TODO"` — keyword search
85
- - `"code that deals with everything"` — too vague
86
-
87
- ## When to Use
88
-
89
- | Scenario | Use This | Instead Of |
90
- | ------------------------ | --------------------- | -------------------------- |
91
- | Find code by meaning | `augment_code_search` | `grep` (text only) |
92
- | Understand relationships | `augment_code_search` | `@explore` agent (heavier) |
93
- | Unfamiliar codebase | `augment_code_search` | Manual file exploration |
94
- | Cross-repo dependencies | `augment_code_search` | LSP references (narrower) |
95
-
96
- ## When NOT to Use
97
-
98
- - **Exact string matching** — Use `grep` instead (faster, free)
99
- - **Known file paths** — Use `read` directly
100
- - **Symbol definitions** — Use LSP `goToDefinition` (precise)
101
- - **Local-only work** — This searches GitHub-indexed repos, not local files
102
-
103
- ## Tool Priority Integration
104
-
105
- ```
106
- grep (text) → semantic search (meaning) → read (full file) → LSP (symbols) → edit
107
- ```
108
-
109
- Use grep first for exact matches. Escalate to semantic search when grep results are noisy or you need conceptual understanding.
110
-
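Reviewer note: the "grep first, then semantic search" guidance could be approximated by a small routing heuristic. The sketch below is illustrative only. The function `suggest_search_tool` is hypothetical, and treating any single identifier-like token as a grep query is an assumption drawn from the "Bad queries" examples (`"foo_bar"`, `"TODO"`). It cannot detect queries that are merely too vague.

```python
import re

# A single token that looks like a symbol or keyword, e.g. foo_bar or TODO.
_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")


def suggest_search_tool(query):
    """Return "grep" for exact-token lookups, else "augment_code_search"."""
    text = query.strip()
    if not text:
        raise ValueError("query must not be empty")
    if _IDENTIFIER_RE.match(text):
        # Exact symbol or keyword: the skill recommends grep (faster, free).
        return "grep"
    # Multi-word, natural-language description: candidate for semantic search.
    return "augment_code_search"


if __name__ == "__main__":
    for q in ("foo_bar", "TODO", "function that handles user authentication"):
        print(q, "->", suggest_search_tool(q))
```

Because each semantic query costs credits, defaulting to `grep` for anything that looks like a symbol keeps the heuristic on the cheap side.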
111
- ## Cost
112
-
113
- - ~40-70 credits per query
114
- - Not free — use judiciously
115
- - Prefer grep for simple lookups
116
-
117
- ## Resources
118
-
119
- - Docs: https://docs.augmentcode.com/context-services/mcp/overview
120
- - OpenCode Quickstart: https://docs.augmentcode.com/context-services/mcp/quickstart-open-code
121
- - GitHub App: https://github.com/apps/augmentcode/installations/new
122
- - Product: https://www.augmentcode.com/product/context-engine-mcp
@@ -1,6 +0,0 @@
1
- {
2
- "augment-context-engine": {
3
- "command": "npx",
4
- "args": ["-y", "mcp-remote", "https://api.augmentcode.com/mcp"]
5
- }
6
- }
@@ -1,182 +0,0 @@
1
- ---
2
- name: beads
3
- description: >
4
- Multi-agent task coordination using br (beads_rust) CLI. Use when work spans multiple
5
- sessions, has dependencies, needs file locking, or requires agent coordination. Covers
6
- claim/reserve/done cycle, dependency management, hierarchical decomposition, and session protocols.
7
- version: "2.0.0"
8
- license: MIT
9
- tags: [workflow, agent-coordination]
10
- dependencies: []
11
- ---
12
-
13
- # Beads Workflow - Multi-Agent Task Coordination
14
-
15
- > **Replaces** ad-hoc task tracking with sticky notes, TODO comments, or mental checklists that lose state between sessions
16
-
17
- ## When to Use
18
-
19
- - Coordinating multi-session work with dependencies, blockers, or file locking needs
20
- - Running multi-agent work where tasks must persist across sessions and handoffs
21
-
22
- ## When NOT to Use
23
-
24
- - Single-session, linear tasks tracked via TodoWrite
25
- - Quick changes with no dependencies or handoff needs
26
-
27
- ## Overview
28
-
29
- **br (beads_rust) CLI** replaces the old `bd` (beads) CLI with a faster Rust implementation.
30
-
31
- **Key Distinction**:
32
-
33
- - **br CLI**: Multi-session work, dependencies, file locking, agent coordination
34
- - **TodoWrite**: Single-session tasks, linear execution, conversation-scoped
35
-
36
- **When to Use br vs TodoWrite**:
37
-
38
- - "Will I need this context in 2 weeks?" → **YES** = br
39
- - "Does this have blockers/dependencies?" → **YES** = br
40
- - "Multiple agents editing same codebase?" → **YES** = br
41
- - "Will this be done in this session?" → **YES** = TodoWrite
42
-
43
- **Decision Rule**: If resuming in 2 weeks would be hard without beads, use beads.
44
-
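Reviewer note: the four questions above reduce to a simple rule, sketched here. The function `choose_tracker` and its keyword names are hypothetical. Letting any "br" answer take precedence over "done in this session" is an assumption, since the source does not say how to resolve conflicting answers.

```python
def choose_tracker(*, needed_in_two_weeks, has_dependencies, multiple_agents):
    """Pick "br" or "TodoWrite" from the skill's decision questions.

    Any YES to the br questions selects br. Otherwise the work is assumed
    to fit in the current session and TodoWrite is enough.
    """
    if needed_in_two_weeks or has_dependencies or multiple_agents:
        return "br"
    return "TodoWrite"


if __name__ == "__main__":
    print(choose_tracker(needed_in_two_weeks=False,
                         has_dependencies=True,
                         multiple_agents=False))
```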
45
- ## Essential Commands
46
-
47
- ```bash
48
- br ready # Show issues ready to work (no blockers)
49
- br list --status open # All open issues
50
- br show <id> # Full issue details with dependencies
51
- br create --title "Fix bug" --type bug --priority 2 --description "Details here"
52
- br update <id> --status in_progress
53
- br close <id> --reason "Completed"
54
- br sync --flush-only # Export to JSONL (then git add/commit manually)
55
- ```
56
-
57
- ## Hierarchy & Dependencies (Summary)
58
-
59
- - Beads supports up to 3 levels of hierarchy: Epic → Task → Subtask
60
- - Use hierarchy for multi-day, cross-domain, or multi-agent work
61
- - Dependencies unblock parallel work when parents close
62
-
63
- See: `references/HIERARCHY.md` and `references/DEPENDENCIES.md` for full details.
64
-
65
- ## Session Protocol (Summary)
66
-
67
- **Start:** `br ready` → `br update <id> --status in_progress` → `br show <id>`
68
-
69
- **End:** `br close <id> --reason "..."` → `br sync --flush-only` → commit `.beads/` → restart session
70
-
71
- See: `references/SESSION_PROTOCOL.md` and `references/WORKFLOWS.md` for detailed steps and checklists.
72
-
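Reviewer note: the start and end sequences above can be written out as explicit command lists, which makes the ordering easy to review or dry-run. This is a sketch under stated assumptions. The helper names are hypothetical, it uses only the `br` and `git` invocations that appear in the removed skill text, and it prints the commands rather than executing them.

```python
import shlex


def session_start_commands(issue_id):
    """Commands for the documented session start: ready, claim, show."""
    return [
        ["br", "ready"],
        ["br", "update", issue_id, "--status", "in_progress"],
        ["br", "show", issue_id],
    ]


def session_end_commands(issue_id, reason, commit_message):
    """Commands for the documented session end: close, flush, commit .beads/."""
    return [
        ["br", "close", issue_id, "--reason", reason],
        ["br", "sync", "--flush-only"],
        # br never runs git itself, so the commit is a separate manual step.
        ["git", "add", ".beads/"],
        ["git", "commit", "-m", commit_message],
    ]


if __name__ == "__main__":
    # Dry run: print shell-quoted commands instead of executing them.
    for argv in session_start_commands("br-abc"):
        print(shlex.join(argv))
    for argv in session_end_commands("br-abc", "Completed", "Fix bug (br-abc)"):
        print(shlex.join(argv))
```

Keeping each command as an argument list avoids quoting problems in the `--reason` and commit message values if the lists are later passed to `subprocess.run`.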
73
- ## Task Creation (Summary)
74
-
75
- Create tasks when work spans sessions, has dependencies, or is discovered mid-implementation (>2 min).
76
-
77
- ```bash
78
- br create --title "Fix authentication bug" --priority 0 --type bug
79
- ```
80
-
81
- See: `references/TASK_CREATION.md` for full examples and patterns.
82
-
83
- ## Git Sync (Summary)
84
-
85
- `br` never runs git commands. Always `br sync --flush-only` and commit `.beads/` manually.
86
-
87
- See: `references/GIT_SYNC.md` for detailed flow and cleanup guidance.
88
-
89
- ## Troubleshooting (Summary)
90
-
91
- - No ready tasks → `br list --status open`, check blockers via `br show <id>`
92
- - Sync failures → `br doctor`
93
-
94
- See: `references/TROUBLESHOOTING.md` for common issues and fixes.
95
-
96
- ## Examples
97
-
98
- See: `references/EXAMPLES.md` for complete usage examples.
99
-
100
- ## Multi-Agent Coordination (Summary)
101
-
102
- For parallel execution with multiple subagents, use the **swarm-coordination** skill.
103
-
104
- See: `references/MULTI_AGENT.md` for swarm tool usage and examples.
105
-
106
- ## Rules
107
-
108
- 1. **Check `br ready` first** - Find unblocked work before starting
109
- 2. **Claim before editing** - `br update <id> --status in_progress`
110
- 3. **One task per session** - Restart after `br close`
111
- 4. **Always sync and commit** - `br sync --flush-only` then `git add .beads/ && git commit`
112
- 5. **Write notes for future agents** - Assume zero conversation context
113
- 6. **Claim file paths before editing** - Use reserve to declare ownership (multi-agent only)
114
-
115
- ## Anti-Patterns
116
-
117
- | Anti-Pattern | Why It Fails | Instead |
118
- | ------------------------------------------------------------------- | ------------------------------------------------------------- | ---------------------------------------------------------------- |
119
- | Claiming a bead without reading its current state first (`br show`) | Misses dependencies, blockers, and prior context | Run `br show <id>` before `br update <id> --status in_progress` |
120
- | Closing a bead without verification evidence | Marks incomplete or broken work as done | Run verification commands and capture output before `br close` |
121
- | Working on blocked beads (dependencies not met) | Wastes time and causes out-of-order delivery | Use `br ready` and confirm dependencies in `br show <id>` |
122
- | Modifying bead state without user confirmation | Violates workflow expectations and can surprise collaborators | Ask before changing bead status, especially close/sync actions |
123
- | Using `br sync` without `--flush-only` (can cause conflicts) | May write unexpected state and increase sync conflict risk | Always use `br sync --flush-only` then commit `.beads/` manually |
124
-
125
- ## Verification
126
-
127
- - **Before closing:** run verification commands, paste output as evidence
128
- - **After close:** `br show <id>` confirms `status=closed`
129
- - **After sync:** `git status` shows clean working tree
130
-
131
- ## File Path Claiming (Summary)
132
-
133
- Claim files before editing in multi-agent work using `br reserve <id> --files "..."`.
134
-
135
- See: `references/FILE_CLAIMING.md` for the full protocol and examples.
136
-
137
- ## Best Practices (Summary)
138
-
139
- - One task per session, then restart
140
- - File issues for work >2 minutes
141
- - Weekly `br doctor`, periodic `br cleanup --days 7`
142
-
143
- See: `references/BEST_PRACTICES.md` for maintenance and database health guidance.
144
-
145
- ## Quick Reference
146
-
147
- ```
148
- SESSION START:
149
- br ready → br update <id> --status in_progress → br show <id>
150
-
151
- DURING WORK:
152
- br create for discovered work (>2min)
153
- br show <id> for context
154
-
155
- SESSION END:
156
- br close <id> --reason "..." → br sync --flush-only → git add .beads/ && git commit → RESTART SESSION
157
-
158
- MAINTENANCE:
159
- br doctor - weekly health check
160
- br cleanup --days 7 - remove old issues
161
- ```
162
-
163
- ## References
164
-
165
- - `references/BOUNDARIES.md`
166
- - `references/RESUMABILITY.md`
167
- - `references/DEPENDENCIES.md`
168
- - `references/WORKFLOWS.md`
169
- - `references/HIERARCHY.md`
170
- - `references/SESSION_PROTOCOL.md`
171
- - `references/TASK_CREATION.md`
172
- - `references/GIT_SYNC.md`
173
- - `references/TROUBLESHOOTING.md`
174
- - `references/EXAMPLES.md`
175
- - `references/MULTI_AGENT.md`
176
- - `references/FILE_CLAIMING.md`
177
- - `references/BEST_PRACTICES.md`
178
-
179
- ## See Also
180
-
181
- - `verification-before-completion`
182
- - `swarm-coordination`
@@ -1,27 +0,0 @@
1
- # Best Practices
2
-
3
- ## Daily/Weekly Maintenance
4
-
5
- | Task | Frequency | Command | Why |
6
- | ------------ | -------------- | --------------------- | ---------------------------------------------- |
7
- | Health check | Weekly | `br doctor` | Repairs database issues, detects orphaned work |
8
- | Cleanup | Every few days | `br cleanup --days 7` | Keep DB under 200-500 issues for performance |
9
-
10
- ## Key Principles
11
-
12
- 1. **Plan outside Beads first** - Use planning tools, then import tasks to beads
13
- 2. **One task per session, then restart** - Fresh context prevents confusion
14
- 3. **File lots of issues** - Any work >2 minutes should be tracked
15
- 4. **"Land the plane" = PUSH** - `br sync --flush-only` + git commit/push, not "ready when you are"
16
- 5. **Include issue ID in commits** - `git commit -m "Fix bug (br-abc)"`
17
-
18
- ## Database Health
19
-
20
- ```bash
21
- # Check database size
22
- br list --status all --json | wc -l
23
-
24
- # Target: under 200-500 issues
25
- # If over, run cleanup more aggressively:
26
- br cleanup --days 3
27
- ```