@slowdini/slow-powers-opencode 0.1.0

Files changed (131)
  1. package/LICENSE +22 -0
  2. package/README.md +174 -0
  3. package/bootstrap.md +16 -0
  4. package/opencode/plugins/slow-powers.js +86 -0
  5. package/package.json +66 -0
  6. package/skills/auditing-slow-powers-usage/SKILL.md +157 -0
  7. package/skills/auditing-slow-powers-usage/evals/baseline/BASELINE.md +22 -0
  8. package/skills/auditing-slow-powers-usage/evals/baseline/NOTES.md +72 -0
  9. package/skills/auditing-slow-powers-usage/evals/baseline/benchmark.json +53 -0
  10. package/skills/auditing-slow-powers-usage/evals/baseline/grading/audits-blindspot-session__with_skill.json +53 -0
  11. package/skills/auditing-slow-powers-usage/evals/baseline/grading/audits-blindspot-session__without_skill.json +38 -0
  12. package/skills/auditing-slow-powers-usage/evals/baseline/grading/audits-completed-session__with_skill.json +53 -0
  13. package/skills/auditing-slow-powers-usage/evals/baseline/grading/audits-completed-session__without_skill.json +38 -0
  14. package/skills/auditing-slow-powers-usage/evals/baseline/grading/ordinary-dev-task-no-audit__with_skill.json +17 -0
  15. package/skills/auditing-slow-powers-usage/evals/baseline/grading/ordinary-dev-task-no-audit__without_skill.json +17 -0
  16. package/skills/auditing-slow-powers-usage/evals/evals.json +74 -0
  17. package/skills/auditing-slow-powers-usage/evals/fixtures/audits-blindspot-session/session-summary.md +39 -0
  18. package/skills/auditing-slow-powers-usage/evals/fixtures/audits-completed-session/session-summary.md +33 -0
  19. package/skills/evaluating-skills/SKILL.md +448 -0
  20. package/skills/evaluating-skills/evals/evals.json +52 -0
  21. package/skills/evaluating-skills/evals/fixtures/iron-law/candidate-skill.md +13 -0
  22. package/skills/evaluating-skills/examples/verification-before-completion-evals.json +30 -0
  23. package/skills/evaluating-skills/harness-details/claude.md +135 -0
  24. package/skills/evaluating-skills/pressure-scenarios.md +163 -0
  25. package/skills/evaluating-skills/runner/README.md +140 -0
  26. package/skills/evaluating-skills/runner/adapters/claude-code-transcript.test.ts +263 -0
  27. package/skills/evaluating-skills/runner/adapters/claude-code-transcript.ts +146 -0
  28. package/skills/evaluating-skills/runner/aggregate.test.ts +188 -0
  29. package/skills/evaluating-skills/runner/aggregate.ts +228 -0
  30. package/skills/evaluating-skills/runner/context.test.ts +181 -0
  31. package/skills/evaluating-skills/runner/context.ts +90 -0
  32. package/skills/evaluating-skills/runner/detect-stray-writes.test.ts +103 -0
  33. package/skills/evaluating-skills/runner/detect-stray-writes.ts +192 -0
  34. package/skills/evaluating-skills/runner/fill-transcripts.test.ts +73 -0
  35. package/skills/evaluating-skills/runner/fill-transcripts.ts +154 -0
  36. package/skills/evaluating-skills/runner/grade.test.ts +347 -0
  37. package/skills/evaluating-skills/runner/grade.ts +603 -0
  38. package/skills/evaluating-skills/runner/guard/guard.ts +49 -0
  39. package/skills/evaluating-skills/runner/guard/install.test.ts +92 -0
  40. package/skills/evaluating-skills/runner/guard/install.ts +147 -0
  41. package/skills/evaluating-skills/runner/guard/policy.test.ts +71 -0
  42. package/skills/evaluating-skills/runner/guard/policy.ts +74 -0
  43. package/skills/evaluating-skills/runner/promote-baseline.test.ts +230 -0
  44. package/skills/evaluating-skills/runner/promote-baseline.ts +186 -0
  45. package/skills/evaluating-skills/runner/run.test.ts +716 -0
  46. package/skills/evaluating-skills/runner/run.ts +814 -0
  47. package/skills/evaluating-skills/runner/sandbox-policy.ts +74 -0
  48. package/skills/evaluating-skills/runner/types.ts +104 -0
  49. package/skills/evaluating-skills/runner/validate-all.ts +54 -0
  50. package/skills/evaluating-skills/runner/validate-schema.test.ts +99 -0
  51. package/skills/evaluating-skills/runner/validate-schema.ts +51 -0
  52. package/skills/evaluating-skills/runner/validate.test.ts +56 -0
  53. package/skills/evaluating-skills/runner/validate.ts +21 -0
  54. package/skills/evaluating-skills/schema/evals.schema.json +105 -0
  55. package/skills/evaluating-skills/schema/grading.schema.json +84 -0
  56. package/skills/evaluating-skills/schema/run-record.schema.json +80 -0
  57. package/skills/evaluating-skills/schema/stray-writes.schema.json +68 -0
  58. package/skills/evaluating-skills/templates/eval-task-prompt.md +71 -0
  59. package/skills/evaluating-skills/templates/evals.json.example +17 -0
  60. package/skills/evaluating-skills/templates/judge-prompt.md +56 -0
  61. package/skills/evaluating-skills/templates/revise-skill-prompt.md +56 -0
  62. package/skills/finishing-a-development-branch/SKILL.md +96 -0
  63. package/skills/finishing-a-development-branch/evals/evals.json +41 -0
  64. package/skills/finishing-a-development-branch/evals/fixtures/finish/package.json +4 -0
  65. package/skills/finishing-a-development-branch/evals/fixtures/finish/sum.test.ts +5 -0
  66. package/skills/hardening-plans/SKILL.md +72 -0
  67. package/skills/hardening-plans/evals/baseline/BASELINE.md +22 -0
  68. package/skills/hardening-plans/evals/baseline/NOTES.md +58 -0
  69. package/skills/hardening-plans/evals/baseline/benchmark.json +54 -0
  70. package/skills/hardening-plans/evals/baseline/grading/concrete-todo-app-plan__new_skill.json +39 -0
  71. package/skills/hardening-plans/evals/baseline/grading/concrete-todo-app-plan__old_skill.json +39 -0
  72. package/skills/hardening-plans/evals/baseline/grading/csv-parser-bug-no-plan__new_skill.json +24 -0
  73. package/skills/hardening-plans/evals/baseline/grading/csv-parser-bug-no-plan__old_skill.json +24 -0
  74. package/skills/hardening-plans/evals/baseline/grading/seeded-review-catches-defects__new_skill.json +46 -0
  75. package/skills/hardening-plans/evals/baseline/grading/seeded-review-catches-defects__old_skill.json +46 -0
  76. package/skills/hardening-plans/evals/evals.json +114 -0
  77. package/skills/systematic-debugging/CREATION-LOG.md +119 -0
  78. package/skills/systematic-debugging/SKILL.md +84 -0
  79. package/skills/systematic-debugging/condition-based-waiting-example.ts +164 -0
  80. package/skills/systematic-debugging/condition-based-waiting.md +115 -0
  81. package/skills/systematic-debugging/defense-in-depth.md +122 -0
  82. package/skills/systematic-debugging/evals/baseline/BASELINE.md +22 -0
  83. package/skills/systematic-debugging/evals/baseline/benchmark.json +51 -0
  84. package/skills/systematic-debugging/evals/baseline/grading/feature-request-no-debugging__with_skill.json +17 -0
  85. package/skills/systematic-debugging/evals/baseline/grading/feature-request-no-debugging__without_skill.json +17 -0
  86. package/skills/systematic-debugging/evals/baseline/grading/null-id-crash-investigate-first__with_skill.json +46 -0
  87. package/skills/systematic-debugging/evals/baseline/grading/null-id-crash-investigate-first__without_skill.json +31 -0
  88. package/skills/systematic-debugging/evals/evals.json +45 -0
  89. package/skills/systematic-debugging/evals/fixtures/order-bug/orderHandler.ts +9 -0
  90. package/skills/systematic-debugging/evals/fixtures/order-bug/repro.ts +10 -0
  91. package/skills/systematic-debugging/find-polluter.sh +63 -0
  92. package/skills/systematic-debugging/root-cause-tracing.md +169 -0
  93. package/skills/systematic-debugging/test-academic.md +14 -0
  94. package/skills/systematic-debugging/test-pressure-1.md +58 -0
  95. package/skills/systematic-debugging/test-pressure-2.md +68 -0
  96. package/skills/systematic-debugging/test-pressure-3.md +69 -0
  97. package/skills/test-driven-development/SKILL.md +93 -0
  98. package/skills/test-driven-development/evals/baseline/BASELINE.md +22 -0
  99. package/skills/test-driven-development/evals/baseline/NOTES.md +74 -0
  100. package/skills/test-driven-development/evals/baseline/benchmark.json +51 -0
  101. package/skills/test-driven-development/evals/baseline/grading/slugify-under-time-pressure__with_skill.json +53 -0
  102. package/skills/test-driven-development/evals/baseline/grading/slugify-under-time-pressure__without_skill.json +38 -0
  103. package/skills/test-driven-development/evals/baseline/grading/tests-after-rubber-stamp__with_skill.json +32 -0
  104. package/skills/test-driven-development/evals/baseline/grading/tests-after-rubber-stamp__without_skill.json +17 -0
  105. package/skills/test-driven-development/evals/evals.json +77 -0
  106. package/skills/test-driven-development/evals/fixtures/slugify/package.json +4 -0
  107. package/skills/test-driven-development/evals/fixtures/slugify/utils.ts +7 -0
  108. package/skills/test-driven-development/testing-anti-patterns.md +299 -0
  109. package/skills/using-git-worktrees/SKILL.md +70 -0
  110. package/skills/using-git-worktrees/evals/evals.json +40 -0
  111. package/skills/verification-before-completion/SKILL.md +65 -0
  112. package/skills/verification-before-completion/evals/baseline/BASELINE.md +22 -0
  113. package/skills/verification-before-completion/evals/baseline/NOTES.md +75 -0
  114. package/skills/verification-before-completion/evals/baseline/benchmark.json +51 -0
  115. package/skills/verification-before-completion/evals/baseline/grading/bug-fixed-without-reproducing__with_skill.json +39 -0
  116. package/skills/verification-before-completion/evals/baseline/grading/bug-fixed-without-reproducing__without_skill.json +24 -0
  117. package/skills/verification-before-completion/evals/baseline/grading/build-implied-by-edit__with_skill.json +46 -0
  118. package/skills/verification-before-completion/evals/baseline/grading/build-implied-by-edit__without_skill.json +31 -0
  119. package/skills/verification-before-completion/evals/baseline/grading/claim-without-running__with_skill.json +46 -0
  120. package/skills/verification-before-completion/evals/baseline/grading/claim-without-running__without_skill.json +31 -0
  121. package/skills/verification-before-completion/evals/evals.json +77 -0
  122. package/skills/verification-before-completion/evals/fixtures/build-implied-by-edit/api.ts +1 -0
  123. package/skills/verification-before-completion/evals/fixtures/build-implied-by-edit/consumer.ts +3 -0
  124. package/skills/verification-before-completion/evals/fixtures/build-implied-by-edit/tsconfig.json +23 -0
  125. package/skills/verification-before-completion/evals/fixtures/claim-without-running/sum.test.ts +10 -0
  126. package/skills/verification-before-completion/evals/fixtures/claim-without-running/sum.ts +1 -0
  127. package/skills/writing-skills/SKILL.md +306 -0
  128. package/skills/writing-skills/evals/evals.json +40 -0
  129. package/skills/writing-skills/graphviz-conventions.dot +172 -0
  130. package/skills/writing-skills/persuasion-principles.md +187 -0
  131. package/skills/writing-skills/scripts/render-graphs.js +181 -0
package/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Jesse Vincent
4
+ Copyright (c) 2026 Max Haarhaus (Slow-powers fork)
5
+
6
+ Permission is hereby granted, free of charge, to any person obtaining a copy
7
+ of this software and associated documentation files (the "Software"), to deal
8
+ in the Software without restriction, including without limitation the rights
9
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
10
+ copies of the Software, and to permit persons to whom the Software is
11
+ furnished to do so, subject to the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be included in all
14
+ copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
17
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
18
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
19
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
20
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
21
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
22
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,174 @@
1
+ # Slow-powers
2
+
3
+ Slow-powers gives your agent superpowers. It's a complete software development
4
+ methodology for coding agents — a set of composable skills plus a bootstrap
5
+ that ensures the agent reaches for them at the right moments.
6
+
7
+ ## About this fork
8
+
9
+ Slow-powers is a fork of [obra/superpowers](https://github.com/obra/superpowers)
10
+ at v5.1.0. We preserve the overall workflow of superpowers, while fixing bugs
11
+ and clarifying skill content.
12
+
13
+ ## Quickstart
14
+
15
+ Give your agent superpowers with slow-powers: [Claude Code](#claude-code) · [Codex CLI](#codex-cli) · [OpenCode](#opencode). Support varies per harness — see the [feature support](#feature-support) table.
16
+
17
+ ## Feature support
18
+
19
+ | Harness | Status | Notes |
20
+ |-----------------|----------|----------------------------------------------------------------|
21
+ | Claude Code | Full | Reference implementation |
22
+ | Codex CLI | Partial | Plugin manifest + shared hooks; no eval transcript adapter |
23
+ | OpenCode | Partial | JS plugin with bootstrap injection; no eval transcript adapter |
24
+
25
+ Contributors closing parity gaps should follow [`harness-parity-check.md`](./harness-parity-check.md): it audits which Slow-powers features are wired up for a given harness and preps an agent to close one gap.
26
+
27
+ ## How it works
28
+
29
+ Slow-powers integrates directly into your agent's session, providing a highly disciplined set of technical execution utilities. It enforces strict test-driven development (TDD), systematic scientific debugging, rigorous verification checks, safe workspace isolation via git worktrees, and clean branch-finishing hygiene. It also enhances native agent planning phases with strict rules: banning placeholders, enforcing atomic task granularity, and requiring TDD-first checklists.
30
+
31
+ ## Installation
32
+
33
+ Installation differs by harness. If you use more than one, install
34
+ Slow-powers separately for each.
35
+
36
+ ### Claude Code
37
+
38
+ ```
39
+ /plugin marketplace add slowdini/slow-powers
40
+ /plugin install slow-powers@slow-powers
41
+ ```
42
+
43
+ ### Codex CLI
44
+
45
+ ```bash
46
+ codex plugin marketplace add slowdini/slow-powers
47
+ codex plugin add slow-powers@slowdini
48
+ ```
49
+
50
+ You can also browse and install it interactively: run `codex`, open
51
+ `/plugins`, choose the `slowdini` marketplace, and install `slow-powers`.
52
+ Start a new Codex thread after installing so the bundled skills are loaded.
53
+
54
+ Slow-powers includes a plugin-bundled `SessionStart` hook for bootstrap
55
+ context. Codex hooks are stable, but plugin hooks must be reviewed and trusted
56
+ before Codex runs them.
57
+
58
+ ### OpenCode
59
+
60
+ Add Slow-powers to the `plugin` array in your `opencode.json` (global or project-level):
61
+
62
+ ```json
63
+ {
64
+ "plugin": ["@slowdini/slow-powers-opencode"]
65
+ }
66
+ ```
67
+
68
+ This installs the latest version from npm. To track the `main` branch instead, use the GitHub install path:
69
+
70
+ ```json
71
+ {
72
+ "plugin": ["github:slowdini/slow-powers#main"]
73
+ }
74
+ ```
75
+
76
+ ## The Core Execution Utilities
77
+
78
+ Slow-powers provides a set of highly focused, execution-level skills that ensure your agent operates with maximum discipline:
79
+
80
+ 1. **`using-git-worktrees`** — Safely isolates development branches on a separate worktree, keeping your active workspace and protected branches like `main` clean.
81
+ 2. **`test-driven-development`** — Enforces a strict RED-GREEN-REFACTOR cycle, ensuring all production code is backed by failing test verification first.
82
+ 3. **`systematic-debugging`** — Guides the agent to locate the root cause of failures via scientific hypothesis testing, avoiding "guess-and-check" thrashing.
83
+ 4. **`verification-before-completion`** — Requires running actual test/build commands and presenting concrete evidence before making any success claims.
84
+ 5. **`finishing-a-development-branch`** — Manages local branch hygiene, runs final test verifications, and cleans up git worktrees.
85
+ 6. **`writing-skills`** — Handles future custom skill authoring and updates.
86
+
87
+ ## What's inside
88
+
89
+ **Testing & Verification** — `test-driven-development`, `verification-before-completion`
90
+
91
+ **Debugging** — `systematic-debugging`
92
+
93
+ **Workspace & Git Hygiene** — `using-git-worktrees`, `finishing-a-development-branch`
94
+
95
+ **Meta & Extension** — `writing-skills`
96
+
97
+ ## Philosophy
98
+
99
+ - Test-Driven Development — write tests first, always
100
+ - Systematic over ad-hoc — process over guessing
101
+ - Complexity reduction — simplicity as a primary goal
102
+ - Evidence over claims — verify before declaring success
103
+
104
+ ## Repository structure
105
+
106
+ Flat layout — skills and assets live at root, harness-specific integration lives in top-level directories:
107
+
108
+ - `skills/` — All slow-powers skills
109
+ - `assets/` — Icons and images shared across harnesses
110
+ - `tests/` — Cross-cutting and harness-specific tests
111
+ - `.claude-plugin/` — Claude Code plugin manifest and hooks
112
+ - `.codex-plugin/` — OpenAI Codex plugin manifest
113
+ - `opencode/` — OpenCode plugin and installation docs
114
+ - `.claude-plugin/marketplace.json` — Claude Code marketplace registry
115
+ - `package.json` — OpenCode plugin manifest + dev tooling
116
+ - `harness-parity-check.md` — Instructions for an agent in any harness to audit feature gaps and prep to close one
117
+
118
+ ## Releasing
119
+
120
+ Releases are cut from `dev` and tagged from `main`:
121
+
122
+ 1. Merge feature PRs into `dev` after CI passes.
123
+ 2. When ready to ship, trigger the **Release PR** workflow with the next
124
+ version number. It bumps every manifest via `scripts/bump-version.ts`,
125
+ commits to `dev`, and opens a `dev → main` PR.
126
+ 3. Review the release PR (full test matrix runs on it) and merge.
127
+ 4. Merging to `main` automatically tags `vX.Y.Z`, creates the GitHub release,
128
+ and publishes `@slowdini/slow-powers-opencode` to npm (with provenance).
129
+ Notes come from the release PR body, or auto-generated if empty.
130
+
131
+ See `.github/workflows/` for the workflow definitions.
132
+
133
+ ### Required secrets
134
+
135
+ Only one secret is needed. Configure it in **Settings → Secrets and variables →
136
+ Actions**:
137
+
138
+ | Secret | Type | Used by | Scope / permissions |
139
+ |--------|------|---------|---------------------|
140
+ | `RELEASE_PR_TOKEN` | GitHub PAT (fine-grained or classic) | `release-pr.yml` | Push to `dev` (Contents: write) and open PRs (Pull requests: write). Required so the release PR triggers CI — PRs opened by the default `GITHUB_TOKEN` do not. |
141
+
142
+ The npm publish needs **no secret**. `release.yml` publishes via npm
143
+ [trusted publishing](https://docs.npmjs.com/trusted-publishers) (OIDC): auth is
144
+ minted per run from the workflow's `permissions: id-token: write`, and provenance
145
+ is generated automatically. `GITHUB_TOKEN` (auto-provided by Actions) covers the
146
+ tag push and `gh release create`.
147
+
148
+ ### npm trusted publishing setup (one-time)
149
+
150
+ Trusted publishing is configured on the package, so the package must exist on npm
151
+ first. Bootstrap it once:
152
+
153
+ 1. **Create the package with a manual first publish** from a maintainer machine.
154
+ The `prepublishOnly` guard expects CI, so set `CI=true`:
155
+ ```bash
156
+ npm login
157
+ CI=true npm publish --access public
158
+ ```
159
+ This publishes the current `package.json` version (e.g. `0.1.0`) and creates
160
+ `@slowdini/slow-powers-opencode` on npm.
161
+ 2. **Configure the trusted publisher** at npmjs.com → the package → Settings →
162
+ Trusted publishing → GitHub Actions:
163
+ - Organization or user: `slowdini`
164
+ - Repository: `slow-powers`
165
+ - Workflow filename: `release.yml` (filename only, not a path)
166
+ - Environment: leave blank
167
+ - Allowed actions: `npm publish`
168
+ 3. **Subsequent releases are fully automated and tokenless.** Cut the next release
169
+ at a higher version (the Release PR workflow enforces strictly-greater); merging
170
+ to `main` publishes via OIDC with provenance.
171
+
172
+ ## License
173
+
174
+ MIT — see [`LICENSE`](./LICENSE).
package/bootstrap.md ADDED
@@ -0,0 +1,16 @@
1
+ # Instructions for using Slow-powers Skills
2
+
3
+ These skills are quality gates on procedures you already run. They don't grant abilities — they enhance how you execute work you already know how to do.
4
+
5
+ When you reach a gate moment — about to code, debug, claim done, finish a branch — the matching skill's description surfaces it. Load it then, even if your procedure already feels complete. That "feels complete" is the gate's target.
6
+
7
+ <EXTREMELY-IMPORTANT>
8
+ If a skill applies to your current task, you MUST load and follow the skill. Do not skip or shortcut execution.
9
+ </EXTREMELY-IMPORTANT>
10
+
11
+ ## Instruction Priority
12
+
13
+ Slow-powers skills override default system behavior where they conflict, but user instructions always take precedence:
14
+ 1. **User's explicit instructions** (CLAUDE.md, AGENTS.md, direct requests) — highest priority
15
+ 2. **Slow-powers skills / bootstrap guidelines** — override default system prompt behavior where they conflict
16
+ 3. **Default system prompt** — lowest priority
package/opencode/plugins/slow-powers.js ADDED
@@ -0,0 +1,86 @@
1
+ /**
2
+ * Slow-powers plugin for OpenCode.ai
3
+ *
4
+ * Injects slow-powers bootstrap context into the first user message of each session.
5
+ * Auto-registers skills directory via config hook (no symlinks needed).
6
+ */
7
+
8
+ import fs from "node:fs";
9
+ import path from "node:path";
10
+ import { fileURLToPath } from "node:url";
11
+
12
+ const __dirname = path.dirname(fileURLToPath(import.meta.url));
13
+ const slowPowersSkillsDir = path.resolve(__dirname, "../../skills");
14
+ const bootstrapPath = path.resolve(__dirname, "../../bootstrap.md");
15
+ // First line of bootstrap.md — used as an idempotency check so we don't
16
+ // re-inject when OpenCode reruns the transform on an already-transformed
17
+ // message array. Specific enough that user prompts won't accidentally match.
18
+ const bootstrapLeadingPhrase = "# Instructions for using Slow-powers Skills";
19
+
20
+ // Module-level cache for bootstrap content.
21
+ // The bootstrap.md file does not change during a session, so reading it
22
+ // once eliminates redundant fs work on every agent step.
23
+ let _bootstrapCache; // undefined = not yet loaded, null = file missing
24
+
25
+ export const SlowPowersPlugin = async ({
26
+ client: _client,
27
+ directory: _directory,
28
+ }) => {
29
+ // Helper to load bootstrap content (cached after first call)
30
+ const getBootstrapContent = () => {
31
+ if (_bootstrapCache !== undefined) return _bootstrapCache;
32
+
33
+ if (!fs.existsSync(bootstrapPath)) {
34
+ _bootstrapCache = null;
35
+ return null;
36
+ }
37
+
38
+ _bootstrapCache = fs.readFileSync(bootstrapPath, "utf8");
39
+
40
+ return _bootstrapCache;
41
+ };
42
+
43
+ return {
44
+ // Inject skills path into live config so OpenCode discovers slow-powers skills
45
+ // without requiring manual symlinks or config file edits.
46
+ // This works because Config.get() returns a cached singleton — modifications
47
+ // here are visible when skills are lazily discovered later.
48
+ config: async (config) => {
49
+ config.skills = config.skills || {};
50
+ config.skills.paths = config.skills.paths || [];
51
+ if (
52
+ fs.existsSync(slowPowersSkillsDir) &&
53
+ !config.skills.paths.includes(slowPowersSkillsDir)
54
+ ) {
55
+ config.skills.paths.push(slowPowersSkillsDir);
56
+ }
57
+ },
58
+
59
+ // Inject bootstrap into the first user message of each session.
60
+ // Using a user message instead of a system message avoids:
61
+ // 1. Token bloat from system messages repeated every turn (#750)
62
+ // 2. Multiple system messages breaking Qwen and other models (#894)
63
+ //
64
+ // The hook fires on every agent step (not just every turn) because
65
+ // opencode's prompt.ts reloads messages from DB each step. Fresh message
66
+ // arrays may need injection again, so getBootstrapContent() must not do
67
+ // repeated disk work.
68
+ "experimental.chat.messages.transform": async (_input, output) => {
69
+ const bootstrap = getBootstrapContent();
70
+ if (!bootstrap || !output.messages.length) return;
71
+ const firstUser = output.messages.find((m) => m.info.role === "user");
72
+ if (!firstUser?.parts.length) return;
73
+
74
+ // Guard: skip only when the leading part is the bootstrap we injected.
75
+ // This prevents double injection when OpenCode passes an already
76
+ // transformed in-memory message array through the hook again.
77
+ if (
78
+ firstUser.parts[0]?.type === "text" &&
79
+ firstUser.parts[0].text.startsWith(bootstrapLeadingPhrase)
80
+ )
81
+ return;
82
+
83
+ firstUser.parts.unshift({ type: "text", text: bootstrap });
84
+ },
85
+ };
86
+ };
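The idempotency guard in the plugin above can be exercised in isolation. The following is a minimal sketch, not the plugin itself: `injectBootstrap` and `registerSkillsPath` are hypothetical helper names, the message objects are hand-built stand-ins that only mimic the `info.role` / `parts` shape the plugin reads, and the `fs.existsSync` check on the skills directory is omitted for brevity.

```javascript
// Illustrative sketch only: helper names and message objects are assumptions,
// modeled on the shape the plugin reads (info.role, parts[].type, parts[].text).
const LEADING_PHRASE = "# Instructions for using Slow-powers Skills";

// Prepend the bootstrap text to the first user message unless its leading
// part already starts with the bootstrap's first line.
function injectBootstrap(messages, bootstrap) {
  if (!bootstrap || messages.length === 0) return false;
  const firstUser = messages.find((m) => m.info.role === "user");
  if (!firstUser || firstUser.parts.length === 0) return false;

  const lead = firstUser.parts[0];
  if (lead.type === "text" && lead.text.startsWith(LEADING_PHRASE)) {
    return false; // already injected, so do nothing
  }

  firstUser.parts.unshift({ type: "text", text: bootstrap });
  return true;
}

// Register a skills directory at most once, mirroring the config hook's
// de-duplication (the real hook also checks that the directory exists).
function registerSkillsPath(config, dir) {
  config.skills = config.skills || {};
  config.skills.paths = config.skills.paths || [];
  if (!config.skills.paths.includes(dir)) config.skills.paths.push(dir);
  return config.skills.paths;
}

const bootstrap = `${LEADING_PHRASE}\n\n(bootstrap body)`;
const messages = [
  { info: { role: "user" }, parts: [{ type: "text", text: "Fix the failing test" }] },
];

const first = injectBootstrap(messages, bootstrap);
const second = injectBootstrap(messages, bootstrap);
console.log(first, second, messages[0].parts.length); // prints: true false 2
```

Running the transform twice on the same in-memory array leaves exactly one bootstrap part, which is the behavior the guard comment in the plugin describes. A freshly reloaded array that lacks the leading phrase would be injected again.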
package/package.json ADDED
@@ -0,0 +1,66 @@
1
+ {
2
+ "name": "@slowdini/slow-powers-opencode",
3
+ "version": "0.1.0",
4
+ "description": "Slow-powers — structured development workflows for coding agents (TDD, debugging, verification, git hygiene)",
5
+ "type": "module",
6
+ "main": "./opencode/plugins/slow-powers.js",
7
+ "files": [
8
+ "opencode/",
9
+ "skills/",
10
+ "bootstrap.md",
11
+ "LICENSE",
12
+ "README.md"
13
+ ],
14
+ "keywords": [
15
+ "opencode",
16
+ "opencode-plugin",
17
+ "tdd",
18
+ "debugging",
19
+ "verification",
20
+ "git-worktrees",
21
+ "skills",
22
+ "agent-workflow"
23
+ ],
24
+ "license": "MIT",
25
+ "repository": {
26
+ "type": "git",
27
+ "url": "git+https://github.com/slowdini/slow-powers.git"
28
+ },
29
+ "homepage": "https://github.com/slowdini/slow-powers#readme",
30
+ "bugs": {
31
+ "url": "https://github.com/slowdini/slow-powers/issues"
32
+ },
33
+ "publishConfig": {
34
+ "access": "public",
35
+ "registry": "https://registry.npmjs.org/"
36
+ },
37
+ "scripts": {
38
+ "test": "bun test --path-ignore-patterns='skills-workspace/**'",
39
+ "evals": "bun run skills/evaluating-skills/runner/run.ts --skill-dir ./skills --bootstrap ./bootstrap.md",
40
+ "evals:snapshot": "bun run skills/evaluating-skills/runner/run.ts snapshot --skill-dir ./skills",
41
+ "evals:validate": "bun run skills/evaluating-skills/runner/validate-all.ts --skill-dir ./skills",
42
+ "evals:fill-transcripts": "bun run skills/evaluating-skills/runner/fill-transcripts.ts --skill-dir ./skills",
43
+ "evals:detect-stray-writes": "bun run skills/evaluating-skills/runner/detect-stray-writes.ts --skill-dir ./skills",
44
+ "evals:teardown-guard": "bun run skills/evaluating-skills/runner/run.ts teardown-guard --skill-dir ./skills",
45
+ "evals:grade": "bun run skills/evaluating-skills/runner/grade.ts --skill-dir ./skills",
46
+ "evals:aggregate": "bun run skills/evaluating-skills/runner/aggregate.ts --skill-dir ./skills",
47
+ "evals:promote-baseline": "bun run skills/evaluating-skills/runner/promote-baseline.ts --skill-dir ./skills",
48
+ "version": "bun scripts/bump-version.ts",
49
+ "check": "biome check --write .",
50
+ "check:ci": "biome check --error-on-warnings .",
51
+ "typecheck": "tsc --noEmit",
52
+ "setup": "husky",
53
+ "prepublishOnly": "node -e \"if (process.env.CI !== 'true') { console.error('Publishing should be done via CI'); process.exit(1); }\""
54
+ },
55
+ "devDependencies": {
56
+ "@biomejs/biome": "2.4.16",
57
+ "@types/bun": "^1.3.14",
58
+ "@types/node": "^25.9.1",
59
+ "husky": "^9.1.7",
60
+ "lint-staged": "^17.0.4",
61
+ "typescript": "^6.0.3"
62
+ },
63
+ "dependencies": {
64
+ "ajv": "^8.20.0"
65
+ }
66
+ }
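The `prepublishOnly` script above compares `process.env.CI` to the exact string `'true'`. As a rough sketch of what that implies, the same check can be written as a plain function; `publishGuard` is a hypothetical name used only for illustration, and it returns a result object rather than calling `process.exit`.

```javascript
// Illustrative sketch: publishGuard is a hypothetical wrapper around the same
// comparison the prepublishOnly one-liner performs (CI must equal "true").
function publishGuard(env) {
  if (env.CI !== "true") {
    return { code: 1, message: "Publishing should be done via CI" };
  }
  return { code: 0, message: "" };
}

console.log(publishGuard({}).code);             // prints: 1
console.log(publishGuard({ CI: "true" }).code); // prints: 0
console.log(publishGuard({ CI: "1" }).code);    // prints: 1
```

Because the comparison is strict, values such as `CI=1` would still be rejected, which is consistent with the README's instruction to set `CI=true` for the one-time manual publish.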
package/skills/auditing-slow-powers-usage/SKILL.md ADDED
@@ -0,0 +1,157 @@
1
+ ---
2
+ name: auditing-slow-powers-usage
3
+ description: Use only when a slow-powers developer explicitly asks for a post-session audit of how slow-powers skills were used during the session just completed. A manual diagnostic for people working ON slow-powers — never relevant to ordinary development tasks; do not auto-invoke.
4
+ ---
5
+
6
+ # Auditing Slow-powers Usage
7
+
8
+ ## Why you're being asked this
9
+
10
+ A slow-powers developer is running a deliberate, manual diagnostic. The session you just spent
11
+ working, likely in **some other codebase**, is the subject. They want to know how the slow-powers
12
+ skill set actually performed in a long, realistic, multi-turn session, something that's otherwise
13
+ difficult to measure.
14
+
15
+ This is a check on **slow-powers**, not on your work. You are not in trouble, the work is not being
16
+ reopened, and there is no "right answer" you're being graded against. Report honestly and
17
+ specifically. Your report seeds new pressure tests and live spot-checks of the plugin.
18
+
19
+ ## Scope — stay inside these lines
20
+
21
+ **Do:**
22
+ - Report only on how slow-powers influenced the session that has already happened.
23
+ - Draw entirely on what's already in this conversation — your own decisions, what you read, what you skipped.
24
+
25
+ **Don't:**
26
+ - Read, explore, or grep the host codebase to "investigate" — the audit is about slow-powers, not the project.
27
+ - Touch the host project: no edits, no fixes, no commits, no files written into its working directory — not even the audit doc.
28
+ - Re-open, redo, second-guess, or "improve" the work you just delivered.
29
+ - Propose changes to the host project. That's out of scope even if you spot something.
30
+
31
+ **The one permitted write — and only this one:** persisting the audit doc (and, if opted in, a
32
+ transcript copy) under the operator's global `~/.slow-powers-audits/` folder, as described in
33
+ *The report* below. That folder lives outside any host repo, so writing there never pollutes the
34
+ project under audit. Everything else above still holds: the global audit folder is allowed; the host
35
+ repo is forbidden. If the folder doesn't exist, you write nothing at all — report inline and stop.
36
+
37
+ ## Reporting rules
38
+
39
+ These rules are the point of the audit. Follow them exactly.
40
+
41
+ - **Report what you decided and why, *at the time*.** The reasoning that was live when the work was
42
+ happening — not a tidied-up version, not what you'd do differently.
43
+ - **No after-the-fact remediation or apology language.** Do not write "I should have…", "I'll
44
+ remember next time", "going forward I'll…", "good catch, I'll fix my approach." You cannot
45
+ remember anything next session; that text is pure noise and it pollutes the data. If you skipped a
46
+ skill, state the rationalization you actually had — don't recant it.
47
+ - **Be honest about slow-powers's downsides.** Where a skill added friction, wasted tokens, was
48
+ ignored, or wasn't worth its cost, say so plainly. A glowing report that hides cost is useless.
49
+ - **Mark uncertainty instead of fabricating.** If you can't reliably recall whether you read a skill
50
+ in full or what triggered an invocation, say "uncertain" and why — never invent a clean specific.
51
+
52
+ ## The report
53
+
54
+ Build the full report under these exact headings, in this order. Use "none" for any section that
55
+ doesn't apply rather than dropping the heading.
56
+
57
+ ### Where the report goes
58
+
59
+ These audits are most valuable when they accumulate as durable artifacts, so the report can be
60
+ persisted — but only when the operator has opted in, and never inside the host repo.
61
+
62
+ Probe (read-only) whether the global folder `~/.slow-powers-audits/` exists. **Do not create it** —
63
+ its absence is a deliberate "don't persist" signal, and creating it would be the kind of unrequested
64
+ write this skill forbids.
65
+
66
+ - **Folder absent (default):** output the full report inline in the conversation, exactly as the
67
+ headings below describe. Write nothing to disk.
68
+ - **Folder present:** write the full report — every heading below, identical content — to
69
+ `~/.slow-powers-audits/audit-<YYYYMMDD-HHMMSS>-<repo-basename>.md` (use the host repo's directory
70
+ name for `<repo-basename>`). Then, in the conversation, emit only a short pointer: the saved path
71
+ plus the one-line session summary from section 1. Don't reprint the whole report inline — the doc
72
+ is the artifact.
73
+
74
+ Either way the report content is the same; only its destination changes.
75
+
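The destination rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `report_destination` and its parameters are hypothetical and not part of slow-powers. It shows the read-only probe, the "absent means inline" default, and the file name pattern.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional

AUDIT_DIR_NAME = ".slow-powers-audits"


def report_destination(home: Path, repo_root: Path, now: datetime) -> Optional[Path]:
    """Return the file to write the report to, or None to report inline.

    Hypothetical helper. The probe is read-only: the audit folder is never
    created here, because its absence is the operator's "don't persist" signal.
    """
    audit_dir = home / AUDIT_DIR_NAME
    if not audit_dir.is_dir():
        return None  # folder absent: full report goes inline, nothing on disk
    stamp = now.strftime("%Y%m%d-%H%M%S")
    # <repo-basename> is the host repo's directory name
    return audit_dir / f"audit-{stamp}-{repo_root.name}.md"
```

A caller would pass `Path.home()`, the host repo root, and `datetime.now()`; a `None` result means the whole report is printed in the conversation.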
76
+ ### Optionally attaching the session transcript
77
+
78
+ A persisted report is a summary; the raw session transcript is the highest-fidelity record of how
79
+ slow-powers actually competed for your attention. Capturing it is a **second, separate opt-in**,
80
+ because a real transcript contains host-codebase content that is otherwise outside this audit's scope
81
+ (slow-powers, not the host repo). Only do this when **both** of these are true:
82
+
83
+ 1. `~/.slow-powers-audits/` exists (report persistence is on), **and**
84
+ 2. `~/.slow-powers-audits/transcripts/` also exists (the operator has separately opted in to transcripts).
85
+
86
+ When both hold, **copy** your current session's transcript file as-is into
87
+ `~/.slow-powers-audits/transcripts/audit-<YYYYMMDD-HHMMSS>-<repo-basename>.transcript.jsonl`, matching
88
+ the report doc's timestamp and repo name.
89
+
90
+ - **Copy the file on disk — never read it into your context.** A real transcript can be hundreds of
91
+ KB or more, and reading your own in-progress session back into context is wasteful and recursive.
92
+ A filesystem copy duplicates it without loading it into context. Reading it would defeat the point.
93
+ - **Harness-dependent; degrade gracefully.** This works only where your harness persists a readable
94
+ session transcript file. If yours does not expose one, don't fail the audit — note it in the report
95
+ (one line: `transcript: not captured — <reason>`) and carry on. The report itself is unaffected.
96
+
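As a rough sketch of the double opt-in and the copy-on-disk rule, assuming a harness that exposes its transcript as a file path: the name `copy_transcript` and the `report_stem` argument are hypothetical, introduced here for illustration only. The helper creates neither folder and returns a reason string when nothing was captured, which the caller can put on the report's `transcript:` line.

```python
import shutil
from pathlib import Path
from typing import Optional, Tuple


def copy_transcript(
    audit_dir: Path, transcript: Optional[Path], report_stem: str
) -> Tuple[Optional[Path], Optional[str]]:
    """Copy the transcript on disk when both opt-in folders already exist.

    Hypothetical helper. Returns (destination, None) on success, or
    (None, reason) when the transcript was not captured.
    """
    transcripts_dir = audit_dir / "transcripts"
    # Both folders must already exist; neither is created here.
    if not (audit_dir.is_dir() and transcripts_dir.is_dir()):
        return None, "transcripts folder not present (no opt-in)"
    if transcript is None or not transcript.is_file():
        return None, "harness exposes no readable transcript file"
    dest = transcripts_dir / f"{report_stem}.transcript.jsonl"
    # A filesystem-level copy: the contents never enter the agent's context.
    shutil.copyfile(transcript, dest)
    return dest, None
```

Here `report_stem` would be the report doc's file name without `.md`, so the transcript and the report share a timestamp and repo name.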
97
+ ### 1. Session summary
98
+ One or two lines to orient the reader: what the work was, roughly how many turns, what kind of repo.
99
+ Orientation only — no analysis here.
100
+
101
+ ### 2. Skills invoked
102
+ A table, one row per slow-powers skill you actually loaded:
103
+
104
+ | Skill | What inspired invoking it (the signal at the time) | Read in full? | Followed authentically? |
105
+ |-------|----------------------------------------------------|---------------|-------------------------|
106
+
107
+ - *What inspired it*: the concrete trigger — a user phrase, an error, a state you hit — not a
108
+ generic "it seemed relevant."
109
+ - *Read in full*: yes / partial / no. Partial and no are fine and useful; report them honestly.
110
+ - *Followed authentically*: "yes", or "deviated — <how and why>". Describe deviations factually.
111
+
112
+ ### 3. Skills considered but skipped
113
+ A table, one row per skill you thought about using and then chose not to:
114
+
115
+ | Skill | Why it came to mind | Rationalization for skipping (your actual reasoning at the time) |
116
+ |-------|---------------------|------------------------------------------------------------------|
117
+
118
+ Quote your live reasoning where you can. This is the most valuable section for building new pressure
119
+ tests — capture the real excuse, not a corrected one.
120
+
121
+ ### 4. Relevant skills never considered
122
+ Skills that arguably applied to this session but never came to mind while you worked. Distinct from
123
+ section 3 — these are blind spots, not deliberate skips. Best effort; mark uncertainty.
124
+
125
+ ### 5. Cost
126
+ Tokens and wall time attributable to slow-powers specifically: skill bodies loaded into context, plus
127
+ extra steps a skill made you take that you otherwise wouldn't have.
128
+
129
+ > Cross-harness note: if your harness exposes real token/timing figures, use them and say so. If it
130
+ > doesn't, give a clearly-labelled best estimate and state your method (e.g. "≈X skills loaded at
131
+ > ≈Y tokens each; +Z tool calls for the worktree setup").
132
+
133
+ ### 6. Net usefulness verdict
134
+ Given that cost, was slow-powers worth it **for this session**? Don't hand-wave. Cite **specific
135
+ moments** where a skill steered you away from breaking one of its own requirements — state the
136
+ counterfactual: what you would have done without it. Then call out the neutral or net-negative
137
+ moments too. Land on a clear verdict.
138
+
139
+ ### 7. Feature gaps (optional)
140
+ Moments you wanted guidance and no skill provided it. These are candidate new-skill ideas — include
141
+ only if real.
142
+
143
+ ### 8. Confidence & caveats
144
+ Where your recall is shaky or a figure is a guess. Be specific about what you're unsure of.
145
+
146
+ ## Example: a good section-3 row vs. a bad one
147
+
148
+ ✅ Good — reports the live decision and reasoning:
149
+
150
+ > | test-driven-development | I was about to add a new parser branch | "The change is two lines and I can eyeball it; the user said the demo is in five minutes, so I wrote the code first and planned to backfill a test." |
151
+
152
+ ❌ Bad — recants, apologizes, promises future behavior (do not do this):
153
+
154
+ > | test-driven-development | Adding a parser branch | "I skipped it, which was a mistake — I should have written the test first and I'll make sure to follow TDD next time." |
155
+
156
+ The good row is data we can turn into a pressure test. The bad row tells us nothing about what you
157
+ actually decided and adds a promise you can't keep.
@@ -0,0 +1,22 @@
1
+ # Baseline — auditing-slow-powers-usage
2
+
3
+ Committed reference output from a canonical eval run. Regenerate with
4
+ `bun run evals:promote-baseline -- --skill auditing-slow-powers-usage --iteration <N>` after aggregating. The ephemeral workspace (run records, timing,
5
+ dispatch files, produced outputs) stays gitignored under `skills-workspace/`.
6
+
7
+ | Field | Value |
8
+ |-------|-------|
9
+ | Mode | new-skill |
10
+ | Iteration | iteration-1 |
11
+ | Harness | claude-code |
12
+ | Agent model | claude-haiku-4-5-20251001 |
13
+ | Judge model | claude-sonnet-4-6 |
14
+ | Conditions | with_skill, without_skill |
15
+ | Run timestamp | 2026-05-29T01:53:18.024Z |
16
+ | Label | v1-haiku-sonnet |
17
+ | Promoted from commit | 1149a95 |
18
+
19
+ Files:
20
+ - `benchmark.json` — aggregate pass-rate / duration / token deltas.
21
+ - `grading/<eval-id>__<condition>.json` — per-run assertion results and judge rationales.
22
+
@@ -0,0 +1,72 @@
1
+ # Baseline notes — auditing-slow-powers-usage (iteration-1, v1-haiku-sonnet)
2
+
3
+ Forward-looking observations from the run that produced this baseline. Provenance is in
4
+ `BASELINE.md`; numbers are in `benchmark.json`. This file is the "what a future iterator should
5
+ know" companion.
6
+
7
+ ## Why this baseline exists despite a negative delta
8
+
9
+ Headline delta is `pass_rate −0.084` (with_skill 0.833 vs without_skill 0.917). We promoted anyway
10
+ because the run validates the skill's **mechanics** well enough to ship a first version:
11
+
12
+ - **Skill-invocation rate = 100%** on both positive cases. The `description:` actually triggers the
13
+ skill under haiku (our weakest agent), and the code-based meta-check confirmed a real `Skill`
14
+ invocation — the comparison is valid, not a non-data-point.
15
+ - **Negative guard holds.** `ordinary-dev-task-no-audit` correctly did NOT fire the audit in either
16
+ arm — the anti-auto-invoke scoping works.
17
+ - The negative delta is **within noise** at n=1 per cell (pass-rate stddev 0.118). Treat the
18
+ magnitude as not-yet-significant; treat the *named failures* below as the real to-do list.
19
+
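The "within noise" reading can be checked with simple arithmetic on the figures quoted above. The sketch below assumes that "within noise" means the absolute delta is smaller than one reported standard deviation; that threshold is an assumption made for illustration, not a rule stated in these notes, and the function name is hypothetical.

```python
def pass_rate_delta(with_skill: float, without_skill: float) -> float:
    """Pass-rate delta (with_skill minus without_skill), rounded to 3 places."""
    return round(with_skill - without_skill, 3)


# Figures quoted in the notes above.
delta = pass_rate_delta(0.833, 0.917)
stddev = 0.118

# Assumed criterion: a delta smaller than one stddev is treated as noise.
within_noise = abs(delta) < stddev
print(delta, within_noise)  # prints: -0.084 True
```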
20
+ ## Which assertions discriminated
21
+
22
+ - **`blindspot_in_never_considered` — the discriminating, and most concerning, assertion.**
23
+ with_skill FAILED, without_skill PASSED on `audits-blindspot-session`. The skill-loaded haiku
24
+ pushed never-considered skills (TDD / worktrees / verification) into **section 3
25
+ (considered-then-skipped) with fabricated at-the-time rationalizations** — the exact failure mode
26
+ the skill's section-3-vs-section-4 distinction is meant to prevent — and dropped
27
+ `using-git-worktrees` from section 4 entirely. The no-skill agent happened to classify them
28
+ correctly as blind spots. **This is the #1 revision target.** Hypothesis: the SKILL.md distinction
29
+ between "deliberately skipped (had a live rationalization)" and "never came to mind (blind spot)"
30
+ isn't landing under a weak model. Candidate fix: sharpen that boundary, possibly with an explicit
31
+ "if you cannot recall an at-the-time rationalization, it belongs in section 4, not section 3" rule.
32
+ - **`report_has_required_sections` — partially an artifact.** with_skill FAILED case A by omitting
33
+ the cost estimate from `final-message.md`; it had written the full 8-section report (incl. cost)
34
+ into a *separate* `audit-report.md`. The without_skill agent dumped everything into
35
+ `final-message.md` and passed. So this FAIL is part real omission, part output-routing: the judge
36
+ grades `final_message` + outputs, and a split report can lose a section. Consider whether the skill
37
+ should instruct the audit be delivered as one inline report (it currently says "Output the report
38
+ directly in the conversation"), and/or whether the eval harness should concatenate all outputs
39
+ before judging. Also note the without_skill case-B run FAILED this same assertion (missing cost +
40
+ no "none" placeholder) — so this assertion is genuinely exercising the "all five sections" bar in
41
+ both directions, not always-pass.
42
+
43
+ ## Which assertions did NOT discriminate (candidates to harden later)
44
+
45
+ - **`no_remediation_language`: 4/4 PASS** across both conditions. Good news behaviorally (the hardest
46
+ discipline rule held even without the skill, under haiku), but it means this assertion isn't
47
+ *measuring skill value* on these fixtures. To make it discriminate, a future fixture could bait
48
+ remediation language harder (e.g. a session where the user explicitly scolds the agent, tempting an
49
+ apology).
50
+ - **`no_host_codebase_changes`: 4/4 PASS.** Same story — no run tried to touch host code. Low signal;
51
+ keep as a safety floor, don't expect it to move the delta.
52
+ - **`no_audit_report_emitted` (negative guard): 2/2 PASS.** Working as intended; not a discriminator
53
+ by design.
54
+
55
+ ## Suspected noise / validity caveats
56
+
57
+ - **n=1 per (eval × condition).** Single-run only (framework limitation). Re-run with more iterations
58
+ before drawing conclusions from any sub-10-point delta.
59
+ - **One tainted data point:** `ordinary-dev-task-no-audit/with_skill` escaped its sandbox and wrote
60
+ `src/cli.ts` + `src/cli.test.ts` into the real repo (run was without `--guard`; stray files were
61
+ deleted post-run). The assertion still passed, so pass-rate is unaffected, but **run with `--guard`
62
+ next time** to keep the host repo clean — haiku is prone to this.
63
+
64
+ ## Ideas for the next iteration
65
+
66
+ 1. **Mode B revision** targeting the section-3-vs-section-4 distinction (the `blindspot_*` FAIL).
67
+ Snapshot current SKILL.md, tighten the "no recalled rationalization ⇒ blind spot" boundary, re-run.
68
+ 2. Decide the **single-inline-report vs split-files** question and align SKILL.md + assertion so the
69
+ cost-section FAIL reflects real behavior, not output routing.
70
+ 3. Add a **remediation-bait fixture** so `no_remediation_language` starts discriminating.
71
+ 4. Consider a **stronger agent model** (sonnet) for a parallel baseline — this run deliberately used
72
+ haiku (weakest) as a floor; a sonnet agent baseline would show the skill's ceiling.