@glrs-dev/cli 0.1.1 → 0.3.1
- package/CHANGELOG.md +8 -0
- package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-builder.md +12 -2
- package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-planner.md +16 -1
- package/dist/vendor/harness-opencode/dist/agents/prompts/research-auto.md +37 -0
- package/dist/vendor/harness-opencode/dist/agents/prompts/research-local.md +33 -0
- package/dist/vendor/harness-opencode/dist/agents/prompts/research-web.md +32 -0
- package/dist/vendor/harness-opencode/dist/agents/prompts/research.md +15 -20
- package/dist/vendor/harness-opencode/dist/{chunk-XCZ3NOXR.js → chunk-CZMAJISX.js} +28 -0
- package/dist/vendor/harness-opencode/dist/{chunk-VVMP6QWS.js → chunk-WBBN7OVN.js} +162 -2
- package/dist/vendor/harness-opencode/dist/cli.js +83 -4
- package/dist/vendor/harness-opencode/dist/index.js +2 -2
- package/dist/vendor/harness-opencode/dist/install-X5KEANRB.js +13 -0
- package/dist/vendor/harness-opencode/dist/skills/pilot-planning/SKILL.md +5 -1
- package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/qa-expectations.md +120 -0
- package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/setup-authoring.md +68 -0
- package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/verify-design.md +4 -0
- package/dist/vendor/harness-opencode/package.json +1 -1
- package/package.json +1 -1
- package/dist/vendor/harness-opencode/dist/install-4EYR56OR.js +0 -9
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,13 @@
 # @glrs-dev/cli

+## 0.3.1
+
+### Patch Changes
+
+- [#19](https://github.com/iceglober/glrs/pull/19) [`6e942c5`](https://github.com/iceglober/glrs/commit/6e942c5099a535a7d1cda161a1bbc1692f937008) Thanks [@iceglober](https://github.com/iceglober)! - Link `@glrs-dev/cli` and `@glrs-dev/harness-plugin-opencode` versions in Changesets config so they always release together. The CLI vendors the harness plugin's `dist/` at build time (via `packages/cli/scripts/vendor-harness.ts`), so plugin fixes don't reach users running `glrs oc install` until a CLI release is cut. Linking the two ensures every harness-plugin bump produces a matching CLI bump, closing the gap where a plugin fix sat on npm without a CLI tarball that bundled it.
+
+This bump also forces a CLI republish that vendors `@glrs-dev/harness-plugin-opencode@0.3.0` so users get the recent `glrs oc install` reconfigure fix via `glrs oc install`, not just `glrs-oc install` directly.
+
 ## 0.1.1

 ### Patch Changes
@@ -68,12 +68,22 @@ Write the minimal code that makes verify pass:
 - Modify existing? Read the surrounding 30 lines first; mirror the existing patterns in indentation, error handling, log format.
 - Add a test? Look at one existing test in the same dir; copy its scaffolding (imports, setup, teardown). Don't invent a new test pattern when the codebase has a strong convention.

-## 4.
+## 4. Dependency rules — task-level vs environment bootstrap

-
+### 4a. Task-level dependencies still require task approval
+
+If `task.prompt` says "add lodash to handle deep merging", install it. If the task is silent on deps, don't add them — find an existing util, write a tiny helper inline, or STOP if the task is genuinely impossible without a dep.

 `package.json` / `bun.lock` / `Cargo.lock` etc. are typically NOT in your `touches:` scope. Adding a dep when the scope forbids editing the lock file is a touches violation; the worker will catch it.

+### 4b. Environment bootstrap self-heals during the fix-loop
+
+If a verify failure clearly points to an environmental issue — `Cannot find module 'X'` where `X` is a workspace/monorepo dep, `node_modules` absent despite a lockfile committed to the repo, a stale build artifact a typecheck depends on — you ARE expected to run the obvious install command BEFORE giving up with STOP.
+
+Recognise these canonical bootstrap commands: `pnpm install`, `bun install`, `npm install`, `npm ci`, `cargo fetch`, `cargo build`. If the plan declared a `setup:` block, treat that block as the canonical list — run those commands verbatim.
+
+The plugin deny list does not block any of these; they are not task-level dependency additions and they do not require lockfile edits.
+
 ## 5. When you think you're done, just stop

 Don't write a "Summary" message. Don't list the files you changed. Don't propose follow-ups. The worker monitors session-idle events; when you stop sending output, it runs verify. If verify passes, the work commits with the message `<task.id>: <task.title>`. If verify fails, you'll get a fix prompt with the failure output verbatim.
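Section 4b in the hunk above reads like a small decision procedure. The sketch below is one possible reading of it, not the harness's implementation: `looksEnvironmental`, `chooseBootstrap`, and the lockfile-to-command table are hypothetical, and the regular expression only covers the `Cannot find module` case quoted in the prompt.

```javascript
// Hypothetical pairing of a committed lockfile with one of the bootstrap
// commands listed in Section 4b. The pairing itself is an assumption.
const BOOTSTRAP_BY_LOCKFILE = [
  ["pnpm-lock.yaml", "pnpm install"],
  ["bun.lock", "bun install"],
  ["package-lock.json", "npm ci"],
  ["Cargo.lock", "cargo fetch"],
];

// Assumption: a "Cannot find module" line marks an environment problem.
function looksEnvironmental(verifyOutput) {
  return /Cannot find module/.test(verifyOutput);
}

// Returns the commands to try before giving up with STOP.
// A plan-level `setup:` list takes priority over lockfile detection.
function chooseBootstrap({ verifyOutput, rootFiles, planSetup = [] }) {
  if (!looksEnvironmental(verifyOutput)) return [];
  if (planSetup.length > 0) return [...planSetup];
  const hit = BOOTSTRAP_BY_LOCKFILE.find(([lockfile]) => rootFiles.includes(lockfile));
  return hit ? [hit[1]] : [];
}

console.log(
  chooseBootstrap({
    verifyOutput: "Error: Cannot find module '@acme/shared'",
    rootFiles: ["package.json", "pnpm-lock.yaml"],
  })
); // logs an array holding "pnpm install"
```

An ordinary assertion failure returns an empty list, which matches the prompt's intent that only environmental failures trigger an install.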
@@ -45,12 +45,13 @@ Use Serena and grep to map out:
 - Existing tests that already cover related code (the verify commands will likely be variations of those).
 - Existing patterns the change should match.
 - Any module boundaries that suggest natural task splits.
+- **Tooling footprint** — lockfiles, docker-compose services, migration tooling, UI/API/DB test frameworks. You'll use these in Section 3 to propose a `setup:` block and per-surface verify patterns.

 Be thorough here. A planner who shipped a sloppy plan because they only skimmed the codebase wastes hours of pilot-builder time chasing bad scope.

 ## 3. Apply the planning methodology

-The `pilot-planning` skill carries the eight rules. Apply them:
+The `pilot-planning` skill carries the ten rules. Apply them:

 1. First-principles task framing.
 2. Decomposition into right-sized tasks.
@@ -60,6 +61,16 @@ The `pilot-planning` skill carries the eight rules. Apply them:
 6. Optional milestone grouping.
 7. Self-review.
 8. Per-task `context:` population (rationale, code pointers, acceptance shorthand).
+9. **Setup-block authoring** — detect lockfiles (pnpm, bun, npm, yarn, Cargo), docker-compose services, and migration tooling (prisma, drizzle-kit, knex, flyway), then propose specific setup commands to the user for confirmation.
+10. **QA-expectations establishment** — detect per-surface test frameworks and propose concrete verify patterns:
+- **UI**: Playwright, Cypress, or Vitest browser mode for visual/interaction assertions
+- **API**: curl against local endpoints or OpenAPI-based contract tests
+- **DB**: Postgres readiness checks and migration verification (prisma migrate, drizzle-kit push)
+- **Integration**: `test/integration` or `e2e` directory patterns
+- **Browser-based component**: Storybook or Chromatic visual tests
+- **CLI**: bin/ smoke tests or `--help` verification
+
+Rules 9 and 10 typically involve ONE bundled `question` tool call to the user — combine setup proposals and per-surface verify proposals into a single round (respecting "talk to the user — once" guidance).

 ## 4. Write the YAML

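The detection half of rule 9 could look roughly like the sketch below. It is an illustration only: `detectFootprint` and the lookup tables are hypothetical, and the file and package names are common defaults rather than values taken from the `pilot-planning` skill. Flyway is left out because it is not usually an npm dependency.

```javascript
// Hypothetical lookup tables for the tooling named in rule 9.
const LOCKFILE_TOOLS = new Map([
  ["pnpm-lock.yaml", "pnpm"],
  ["bun.lock", "bun"],
  ["package-lock.json", "npm"],
  ["yarn.lock", "yarn"],
  ["Cargo.lock", "cargo"],
]);
const COMPOSE_FILES = new Set([
  "docker-compose.yml",
  "docker-compose.yaml",
  "compose.yml",
  "compose.yaml",
]);
const MIGRATION_PACKAGES = new Set(["prisma", "drizzle-kit", "knex"]);

// Collects evidence only. Turning evidence into concrete setup commands is
// left to the planner and to the user's confirmation.
function detectFootprint(rootFiles, dependencyNames) {
  const findings = [];
  for (const file of rootFiles) {
    if (LOCKFILE_TOOLS.has(file)) {
      findings.push({ kind: "lockfile", tool: LOCKFILE_TOOLS.get(file), evidence: file });
    } else if (COMPOSE_FILES.has(file)) {
      findings.push({ kind: "compose", tool: "docker compose", evidence: file });
    }
  }
  for (const name of dependencyNames) {
    if (MIGRATION_PACKAGES.has(name)) {
      findings.push({ kind: "migration", tool: name, evidence: `dependency: ${name}` });
    }
  }
  return findings;
}

console.log(detectFootprint(["pnpm-lock.yaml", "docker-compose.yml"], ["prisma", "react"]));
```

Returning evidence instead of ready-made commands keeps the user's confirmation step meaningful, since the planner still has to propose the exact command for each finding.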
@@ -69,6 +80,10 @@ Required schema (see `src/pilot/plan/schema.ts` for the canonical Zod definition

 ```yaml
 name: <human-readable plan name>
+setup: # optional — run once per worktree before any task
+  - pnpm install --frozen-lockfile
+  - docker compose up -d postgres
+  - pnpm prisma migrate dev
 defaults: # optional, override per-task as needed
   agent: pilot-builder # default
   model: anthropic/claude-sonnet-4-6
@@ -0,0 +1,37 @@
+---
+name: research-auto
+description: Research orchestrator subagent — Autonomous experimentation skill. Agent interviews the user, sets up a lab, then explores freely (think, test, reflect) until stopped or a target is hit. Works for any domain where you can measure or evaluate a result. Use when user says 'optimize this', 'experiment with', 'find the best approach', 'iterate on', 'research mode'. Do NOT use for binary validation tests (use /spec-lab instead). Based on ResearcherSkill v1.4.4 by krzysztofdudek.
+mode: all
+model: anthropic/claude-opus-4-7
+temperature: 0.3
+---
+
+# @research-auto — Autonomous Experimentation Agent
+
+You are the `research-auto` agent. Your job is to run autonomous experiments by following the bundled `research-auto` skill methodology end-to-end.
+
+**Research Query:** $ARGUMENTS
+
+## Task
+
+1. Read the bundled `research-auto` skill via the Skill tool
+2. Follow every instruction in the skill exactly
+3. Execute the full experimentation workflow from discovery through conclusion
+
+## Notes on Experiment Commands
+
+This agent may run arbitrary user-supplied commands as part of experiments. The `.lab/` directory is used for scratch writes and experiment tracking. These are expected behaviors per the skill methodology.
+
+## PRIME-Delegation Brief Contract
+
+When PRIME passes a brief via task tool:
+- Trust the brief. The task-tool arguments ARE the research query — proceed directly.
+- Do not re-interview on points already resolved in the brief.
+- If the brief lacks critical context (e.g., no query provided), ask once then proceed.
+
+## STOP — Do Not
+
+- Do NOT experiment directly without following the skill methodology
+- Do NOT skip the discovery phase — it is mandatory
+- Do NOT skip the commit-before-run guardrail — it is mandatory
+- Do NOT exceed 3 rounds without presenting — MAX 3 ROUNDS, THEN PRESENT
@@ -0,0 +1,33 @@
+---
+name: research-local
+description: Research orchestrator subagent — Deep codebase research using parallel Explore subagents. Decomposes a question about the local codebase into research tasks, launches parallel explorations, reviews for gaps, iterates, and synthesizes findings with specific file paths and line numbers. Use when user says 'how does X work in this codebase', 'where is Y implemented', 'trace the data flow for Z', 'what patterns does this repo use', 'explain the architecture of'. Provide the research topic as arguments.
+mode: all
+model: anthropic/claude-opus-4-7
+temperature: 0.3
+---
+
+# @research-local — Codebase Research Agent
+
+You are the `research-local` agent. Your job is to execute deep codebase research by following the bundled `research-local` skill methodology end-to-end. Scope is local codebase ONLY — no web research.
+
+**Research Query:** $ARGUMENTS
+
+## Task
+
+1. Read the bundled `research-local` skill via the Skill tool
+2. Follow every instruction in the skill exactly
+3. Execute the full research workflow from decomposition through synthesis
+
+## PRIME-Delegation Brief Contract
+
+When PRIME passes a brief via task tool:
+- Trust the brief. The task-tool arguments ARE the research query — proceed directly.
+- Do not re-interview on points already resolved in the brief.
+- If the brief lacks critical context (e.g., no query provided), ask once then proceed.
+
+## STOP — Do Not
+
+- Do NOT research directly — always follow the research-local skill methodology
+- Do NOT use exploration tools yourself — every phase is a subagent
+- Do NOT skip the decomposition phase — it is mandatory
+- Do NOT synthesize findings yourself — synthesis is a subagent
@@ -0,0 +1,32 @@
+---
+name: research-web
+description: Research orchestrator subagent — Multi-agent web research orchestrator. Decomposes a research question into parallel agent workstreams, launches them, monitors progress, and synthesizes results. Use when user says 'research this topic', 'I need to understand', 'deep dive into', 'investigate the market for', 'what do we know about'. Provide the research topic and context.
+mode: all
+model: anthropic/claude-opus-4-7
+temperature: 0.3
+---
+
+# @research-web — Web Research Agent
+
+You are the `research-web` agent. Your job is to execute web research by following the bundled `research-web` skill methodology end-to-end.
+
+**Research Query:** $ARGUMENTS
+
+## Task
+
+1. Read the bundled `research-web` skill via the Skill tool
+2. Follow every instruction in the skill exactly
+3. Execute the full research workflow from planning through synthesis
+
+## PRIME-Delegation Brief Contract
+
+When PRIME passes a brief via task tool:
+- Trust the brief. The task-tool arguments ARE the research query — proceed directly.
+- Do not re-interview on points already resolved in the brief.
+- If the brief lacks critical context (e.g., no query provided), ask once then proceed.
+
+## STOP — Do Not
+
+- Do NOT research directly — always follow the research-web skill methodology
+- Do NOT skip the planning phase — it is mandatory
+- Do NOT launch agents sequentially — dispatch all independent workstreams in ONE message
@@ -22,30 +22,25 @@ You are an **orchestrator only**. You do NOT:

 Every cognitive task is a subagent. You launch subagents and pass their outputs to other subagents.

-## How to Invoke
+## How to Invoke Research Agents

-The four research
+The four research agents are available:

-1.
-2.
-3.
-4.
+1. **`@research`** (this agent) — umbrella orchestrator for multi-workstream research
+2. **`@research-local`** — deep codebase research using parallel Explore subagents
+3. **`@research-web`** — multi-agent web research with skeleton-file pattern
+4. **`@research-auto`** — autonomous experimentation with `.lab/` directory

-**To
+**To dispatch a research subagent:** Use the task tool with the agent name and pass the sub-question as the prompt:

 ```
-
-"
-
-## Research Query
-{the full query or sub-question}
-
-## Task
-1. Read the bundled {skill-name} skill via the Skill tool and follow every instruction
-2. Focus specifically on: {sub-question}
-3. Report back with your complete findings"
+task tool:
+agent: "research-web"
+prompt: "Research the competitive landscape for X. Focus on: {specific angle}."
 ```

+The research agents are thin shims that load their matching bundled skill and follow it end-to-end. Trust the brief — the task-tool arguments ARE the research query.
+
 ## 7-Phase Flow

 ### Phase 1: Plan — Subagent
@@ -77,9 +72,9 @@ Output 3-6 workstreams. Mark dependencies explicitly."

 Dispatch **one Agent per workstream**. Launch ALL independent workstreams in a SINGLE message.

-For LOCAL workstreams:
-For WEB workstreams:
-For AUTO workstreams:
+For LOCAL workstreams: dispatch `@research-local` via task tool.
+For WEB workstreams: dispatch `@research-web` via task tool.
+For AUTO workstreams: dispatch `@research-auto` via task tool.

 ### Phase 3: Review Round 1 — Subagent

@@ -59,6 +59,9 @@ var agentsMdWriterPrompt = readPrompt("agents-md-writer.md");
 var pilotBuilderPrompt = readPrompt("pilot-builder.md");
 var pilotPlannerPrompt = readPrompt("pilot-planner.md");
 var researchPrompt = readPrompt("research.md");
+var researchWebPrompt = readPrompt("research-web.md");
+var researchLocalPrompt = readPrompt("research-local.md");
+var researchAutoPrompt = readPrompt("research-auto.md");
 function stripFrontmatter(md) {
   if (!md.startsWith("---")) return md;
   const end = md.indexOf("\n---", 3);
@@ -557,6 +560,9 @@ var AGENT_TIERS = {
   "gap-analyzer": "deep",
   "pilot-planner": "deep",
   research: "deep",
+  "research-web": "deep",
+  "research-local": "deep",
+  "research-auto": "deep",
   build: "mid",
   "qa-reviewer": "mid",
   "docs-maintainer": "mid",
@@ -641,6 +647,28 @@ function createAgents() {
       model: "anthropic/claude-opus-4-7",
       temperature: 0.3,
       permission: RESEARCH_PERMISSIONS
+    }),
+    // Research subagents — thin shims that load the bundled skills
+    "research-web": agentFromPrompt(researchWebPrompt, {
+      description: "Research orchestrator subagent \u2014 Multi-agent web research orchestrator. Decomposes a research question into parallel agent workstreams, launches them, monitors progress, and synthesizes results. Use when user says 'research this topic', 'I need to understand', 'deep dive into', 'investigate the market for', 'what do we know about'. Provide the research topic and context.",
+      mode: "all",
+      model: "anthropic/claude-opus-4-7",
+      temperature: 0.3,
+      permission: RESEARCH_PERMISSIONS
+    }),
+    "research-local": agentFromPrompt(researchLocalPrompt, {
+      description: "Research orchestrator subagent \u2014 Deep codebase research using parallel Explore subagents. Decomposes a question about the local codebase into research tasks, launches parallel explorations, reviews for gaps, iterates, and synthesizes findings with specific file paths and line numbers. Use when user says 'how does X work in this codebase', 'where is Y implemented', 'trace the data flow for Z', 'what patterns does this repo use', 'explain the architecture of'. Provide the research topic as arguments.",
+      mode: "all",
+      model: "anthropic/claude-opus-4-7",
+      temperature: 0.3,
+      permission: RESEARCH_PERMISSIONS
+    }),
+    "research-auto": agentFromPrompt(researchAutoPrompt, {
+      description: "Research orchestrator subagent \u2014 Autonomous experimentation skill. Agent interviews the user, sets up a lab, then explores freely (think, test, reflect) until stopped or a target is hit. Works for any domain where you can measure or evaluate a result. Use when user says 'optimize this', 'experiment with', 'find the best approach', 'iterate on', 'research mode'. Do NOT use for binary validation tests (use /spec-lab instead). Based on ResearcherSkill v1.4.4 by krzysztofdudek.",
+      mode: "all",
+      model: "anthropic/claude-opus-4-7",
+      temperature: 0.3,
+      permission: RESEARCH_PERMISSIONS
     })
   };
 }
@@ -257,7 +257,7 @@ async function requirePlugin() {
     );
     process.exit(1);
   }
-  const { install: install2 } = await import("./install-
+  const { install: install2 } = await import("./install-X5KEANRB.js");
   await install2({ nonInteractive: true });
 }

@@ -505,6 +505,116 @@ function migrateHarnessKeyToPluginOptions(configPath) {
   } catch {
   }
 }
+function deepEqual(a, b) {
+  if (a === b) return true;
+  if (typeof a !== typeof b) return false;
+  if (a === null || b === null) return a === b;
+  if (typeof a !== "object") return false;
+  const aObj = a;
+  const bObj = b;
+  const aKeys = Object.keys(aObj);
+  const bKeys = Object.keys(bObj);
+  if (aKeys.length !== bKeys.length) return false;
+  for (const key of aKeys) {
+    if (!bKeys.includes(key)) return false;
+    if (!deepEqual(aObj[key], bObj[key])) return false;
+  }
+  return true;
+}
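Two properties of `deepEqual` are easy to miss when reading the diff: key order is ignored, and arrays are compared by their enumerable keys like any other object. The snippet below repeats the function (minus the `aObj`/`bObj` aliases) so that it runs on its own. The sample values are illustrative and do not come from the package.

```javascript
// Adapted from the added deepEqual so the example is self-contained.
function deepEqual(a, b) {
  if (a === b) return true;
  if (typeof a !== typeof b) return false;
  if (a === null || b === null) return a === b;
  if (typeof a !== "object") return false;
  const aKeys = Object.keys(a);
  const bKeys = Object.keys(b);
  if (aKeys.length !== bKeys.length) return false;
  for (const key of aKeys) {
    if (!bKeys.includes(key)) return false;
    if (!deepEqual(a[key], b[key])) return false;
  }
  return true;
}

// Key order does not matter.
console.log(deepEqual({ deep: ["m1"], mid: ["m2"] }, { mid: ["m2"], deep: ["m1"] })); // true
// Differing primitives fall through to the non-object branch.
console.log(deepEqual({ enabled: true }, { enabled: false })); // false
// An array and a plain object with the same keys compare equal.
console.log(deepEqual(["x"], { 0: "x" })); // true
// NaN never equals NaN here, because `===` is the only primitive check.
console.log(deepEqual(NaN, NaN)); // false
```

For JSON-derived config values, which is what the installer appears to compare, the array/object overlap is unlikely to matter in practice.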
+function writePluginOption(configPath, subKey, value, opts) {
+  try {
+    if (!fs3.existsSync(configPath)) {
+      return { changed: false };
+    }
+    const raw = fs3.readFileSync(configPath, "utf8");
+    const config = JSON.parse(raw);
+    if (!Array.isArray(config.plugin)) {
+      return { changed: false };
+    }
+    const pluginIdx = config.plugin.findIndex((entry) => {
+      const name = typeof entry === "string" ? entry : Array.isArray(entry) ? entry[0] : null;
+      return name === PLUGIN_NAME2 || String(name ?? "").startsWith(`${PLUGIN_NAME2}@`);
+    });
+    if (pluginIdx < 0) {
+      return { changed: false };
+    }
+    const current = config.plugin[pluginIdx];
+    const existingName = typeof current === "string" ? current : Array.isArray(current) ? current[0] : PLUGIN_NAME2;
+    const existingOpts = Array.isArray(current) && current.length >= 2 ? current[1] : {};
+    if (deepEqual(existingOpts[subKey], value)) {
+      return { changed: false };
+    }
+    const newOpts = { ...existingOpts, [subKey]: value };
+    if (opts.dryRun) {
+      info(`[dry-run] Would reconfigure ${subKey} in plugin options`);
+      return { changed: true };
+    }
+    const bakPath = `${configPath}.bak.${Date.now()}-${process.pid}`;
+    fs3.copyFileSync(configPath, bakPath);
+    config.plugin[pluginIdx] = [existingName, newOpts];
+    fs3.writeFileSync(configPath, JSON.stringify(config, null, 2) + "\n");
+    ok(`Reconfigured ${subKey}`);
+    info(`Backup: ${bakPath}`);
+    return { changed: true, bakPath };
+  } catch {
+    return { changed: false };
+  }
+}
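Stripped of file I/O, backups, and logging, the update step in `writePluginOption` amounts to "find the plugin entry, normalise it to a `[name, options]` pair, and set one key". The sketch below shows that step as a pure function. `upsertPluginOption` is a hypothetical name, the plugin name is passed in because the diff does not show the value of `PLUGIN_NAME2`, and the JSON-string comparison is a simplification of the `deepEqual` check (it is key-order sensitive).

```javascript
// Pure, in-memory sketch of the update step. Not the package's function.
function upsertPluginOption(plugins, pluginName, subKey, value) {
  const idx = plugins.findIndex((entry) => {
    const name = typeof entry === "string" ? entry : Array.isArray(entry) ? entry[0] : null;
    // Matches both "name" and a pinned "name@version" entry.
    return name === pluginName || String(name ?? "").startsWith(`${pluginName}@`);
  });
  if (idx < 0) return { changed: false, plugins };

  const current = plugins[idx];
  const existingName = typeof current === "string" ? current : current[0];
  const existingOpts = Array.isArray(current) && current.length >= 2 ? current[1] : {};
  // Simplified no-op check (assumption): compare serialised values.
  if (JSON.stringify(existingOpts[subKey]) === JSON.stringify(value)) {
    return { changed: false, plugins };
  }
  const next = [...plugins];
  next[idx] = [existingName, { ...existingOpts, [subKey]: value }];
  return { changed: true, plugins: next };
}

const before = ["some-other-plugin", "@glrs-dev/harness-plugin-opencode@0.3.0"];
const models = { deep: ["provider/model-a"], mid: ["provider/model-b"], fast: ["provider/model-c"] };
const result = upsertPluginOption(before, "@glrs-dev/harness-plugin-opencode", "models", models);
console.log(JSON.stringify(result.plugins[1]));
```

The pinned entry keeps its `@0.3.0` suffix, and other option keys on the same plugin entry survive the update because the new options are spread over the existing ones.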
+function writeMcpToggles(configPath, enabledSet, opts) {
+  try {
+    if (!fs3.existsSync(configPath)) {
+      return { changed: false };
+    }
+    const raw = fs3.readFileSync(configPath, "utf8");
+    const config = JSON.parse(raw);
+    const toggleNames = new Set(MCP_TOGGLES.map((t) => t.name));
+    const existingMcp = config.mcp && typeof config.mcp === "object" ? { ...config.mcp } : {};
+    const newMcp = {};
+    let hasChanges = false;
+    for (const [key, val] of Object.entries(existingMcp)) {
+      if (!toggleNames.has(key)) {
+        newMcp[key] = val;
+      }
+    }
+    for (const toggleName of toggleNames) {
+      if (enabledSet.has(toggleName)) {
+        newMcp[toggleName] = { enabled: true };
+        if (!deepEqual(existingMcp[toggleName], { enabled: true })) {
+          hasChanges = true;
+        }
+      } else {
+        if (existingMcp[toggleName] !== void 0) {
+          hasChanges = true;
+        }
+      }
+    }
+    if (!hasChanges && Object.keys(newMcp).length === Object.keys(existingMcp).length) {
+      const allKeysMatch = Object.keys(newMcp).every(
+        (k) => deepEqual(newMcp[k], existingMcp[k])
+      );
+      if (allKeysMatch) {
+        return { changed: false };
+      }
+    }
+    if (opts.dryRun) {
+      info(`[dry-run] Would reconfigure MCP toggles`);
+      return { changed: true };
+    }
+    const bakPath = `${configPath}.bak.${Date.now()}-${process.pid}`;
+    fs3.copyFileSync(configPath, bakPath);
+    if (Object.keys(newMcp).length > 0) {
+      config.mcp = newMcp;
+    } else {
+      delete config.mcp;
+    }
+    fs3.writeFileSync(configPath, JSON.stringify(config, null, 2) + "\n");
+    ok("Reconfigured MCPs");
+    info(`Backup: ${bakPath}`);
+    return { changed: true, bakPath };
+  } catch {
+    return { changed: false };
+  }
+}
 async function install(opts = {}) {
   const { dryRun = false, pin = false, nonInteractive = false } = opts;
   const configPath = getOpencodeConfigPath2();
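The merge rules inside `writeMcpToggles` can be summarised as: entries the installer does not manage are carried over untouched, and managed entries are either rewritten to exactly `{ enabled: true }` or dropped. A minimal sketch of just that merge follows, assuming hypothetical toggle names `alpha` and `beta` (the real `MCP_TOGGLES` list is not shown in this diff) and a hypothetical helper name `applyMcpToggles`.

```javascript
// Hypothetical pure helper mirroring the merge rules in writeMcpToggles.
// `toggleNames` stands in for MCP_TOGGLES.map((t) => t.name).
function applyMcpToggles(existingMcp, toggleNames, enabledSet) {
  const next = {};
  // Unmanaged entries are preserved as they are.
  for (const [key, val] of Object.entries(existingMcp)) {
    if (!toggleNames.includes(key)) next[key] = val;
  }
  // Managed entries are reset to { enabled: true } or left out entirely.
  for (const name of toggleNames) {
    if (enabledSet.has(name)) next[name] = { enabled: true };
  }
  return next;
}

const merged = applyMcpToggles(
  {
    custom: { type: "local", command: ["my-server"] },
    alpha: { enabled: true },
    beta: { enabled: true },
  },
  ["alpha", "beta"],
  new Set(["beta"])
);
console.log(Object.keys(merged)); // [ 'custom', 'beta' ]
```

One consequence worth noting: any extra fields a user added to a managed entry are lost on reconfigure, since the entry is replaced rather than merged. The timestamped `.bak` copy written before the change is the recovery path for that case.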
@@ -533,6 +643,10 @@ ${c.bold}${c.blue}@glrs-dev/harness-plugin-opencode${c.reset} setup
   if (existingMcps.size > 0) {
     ok(`MCPs: ${[...existingMcps].join(", ")} enabled`);
   }
+  let reconfigureModels = false;
+  let reconfigureMcps = false;
+  let newModelsValue = null;
+  let newMcpEnabledSet = /* @__PURE__ */ new Set();
   if (hasPlugin && (existingProvider || hasModels)) {
     const unconfiguredMcps = MCP_TOGGLES.filter(
       (t) => !existingMcps.has(t.name) && !existing?.mcp?.[t.name]
@@ -544,8 +658,20 @@ ${c.bold}${c.blue}@glrs-dev/harness-plugin-opencode${c.reset} setup
       0
     );
     if (reconfigure === 1) {
+      reconfigureModels = true;
       hasModels = false;
-    }
+    }
+    if (existingMcps.size > 0) {
+      const reconfigureMcpChoice = await promptChoice(
+        " Reconfigure MCPs?",
+        ["No, keep current config", "Yes, reconfigure MCPs"],
+        0
+      );
+      if (reconfigureMcpChoice === 1) {
+        reconfigureMcps = true;
+      }
+    }
+    if (!reconfigureModels && !reconfigureMcps && unconfiguredMcps.length === 0) {
       console.log(`
 ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
 `);
@@ -632,6 +758,11 @@ ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
         mid: [preset.mid],
         fast: [preset.fast]
       };
+      newModelsValue = {
+        deep: [preset.deep],
+        mid: [preset.mid],
+        fast: [preset.fast]
+      };
       ok(`Models configured`);
     } else if (!pluginOpts._skipModels) {
       info("Enter model IDs in <provider>/<model-id> format (e.g. amazon-bedrock/global.anthropic.claude-opus-4-7)");
@@ -645,6 +776,11 @@ ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
         mid: [midModel || deepModel],
         fast: [fastModel || midModel || deepModel]
       };
+      newModelsValue = {
+        deep: [deepModel],
+        mid: [midModel || deepModel],
+        fast: [fastModel || midModel || deepModel]
+      };
       ok("Models: custom");
     } else {
       ok("Models: OpenCode defaults");
@@ -653,6 +789,22 @@ ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
     delete pluginOpts._skipModels;
     console.log();
   }
+  if (interactive && reconfigureMcps) {
+    console.log(`${c.dim}Reconfigure MCP servers${c.reset}`);
+    const currentEnabled = new Set(existingMcps);
+    const selected = await promptMulti(
+      " Select MCPs to enable:",
+      MCP_TOGGLES.map((t) => ({ label: t.label, defaultOn: currentEnabled.has(t.name) }))
+    );
+    newMcpEnabledSet = new Set([...selected].map((i) => MCP_TOGGLES[i].name));
+    const names = [...newMcpEnabledSet].join(", ");
+    if (newMcpEnabledSet.size > 0) {
+      ok(`MCPs to enable: ${names}`);
+    } else {
+      ok("MCPs: all disabled");
+    }
+    console.log();
+  }
   const pluginValue = Object.keys(pluginOpts).length > 0 ? [pluginEntry, pluginOpts] : pluginEntry;
   const config = {
     $schema: "https://opencode.ai/config.json",
@@ -683,6 +835,12 @@ ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
       console.log();
     }
   }
+  if (reconfigureModels && newModelsValue) {
+    writePluginOption(configPath, "models", newModelsValue, { dryRun });
+  }
+  if (reconfigureMcps) {
+    writeMcpToggles(configPath, newMcpEnabledSet, { dryRun });
+  }
   if (!fs3.existsSync(configPath)) {
     if (dryRun) {
       info(`[dry-run] Would create ${configPath}`);
@@ -727,5 +885,7 @@ ${c.bold}Ready.${c.reset} Run ${c.green}opencode${c.reset} to start.
 export {
   requirePlugin,
   MODEL_PRESETS,
+  writePluginOption,
+  writeMcpToggles,
   install
 };
@@ -2,11 +2,11 @@
 import {
   createAgents,
   validateModelOverride
-} from "./chunk-
+} from "./chunk-CZMAJISX.js";
 import {
   install,
   requirePlugin
-} from "./chunk-
+} from "./chunk-WBBN7OVN.js";
 import "./chunk-VJUETC6A.js";

 // src/cli.ts
@@ -514,6 +514,7 @@ var PlanSchema = z.object({
   branch_prefix: z.string().min(1).optional(),
   defaults: DefaultsSchema,
   milestones: z.array(MilestoneSchema).default([]),
+  setup: z.array(VerifyCommandSchema).default([]),
   tasks: z.array(TaskSchema).min(1, "plan must declare at least one task")
 }).strict();
 function parsePlan(input) {
@@ -2224,7 +2225,8 @@ var WorktreePool = class {
       path: "",
       // filled by prepare
       prepared: false,
-      preserved: false
+      preserved: false,
+      setupCompleted: false
     };
     this.slots.set(n, stub);
     return stub;
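The new `setupCompleted` flag on each slot stub, read together with the planner prompt's "run once per worktree before any task", suggests that setup is gated per slot. The sketch below illustrates that gating under two stated assumptions: the flag is flipped only after every command succeeds (the lines that set it are not visible in this part of the diff), and commands run synchronously through an injected `runCommand` stand-in, whereas the worker itself awaits `runVerify`. `ensureSetup` is a hypothetical name.

```javascript
// Hypothetical gate: run the plan's setup commands at most once per slot.
// `runCommand` is an injected stand-in that returns an exit code.
function ensureSetup(slot, setupCommands, runCommand) {
  if (setupCommands.length === 0 || slot.setupCompleted) {
    return { ok: true, ran: false };
  }
  for (const command of setupCommands) {
    const exitCode = runCommand(command);
    if (exitCode !== 0) {
      // The flag stays false, so the failure remains visible on the slot.
      return { ok: false, ran: true, failure: { command, exitCode } };
    }
  }
  slot.setupCompleted = true; // assumption: set only after full success
  return { ok: true, ran: true };
}

const slot = { index: 0, setupCompleted: false };
const fakeRun = () => 0;
console.log(ensureSetup(slot, ["pnpm install --frozen-lockfile"], fakeRun).ran); // true
console.log(ensureSetup(slot, ["pnpm install --frozen-lockfile"], fakeRun).ran); // false
```

The second call is a no-op because the slot already carries the flag, which is the behaviour a reused worktree would want when it picks up its next task.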
@@ -2862,6 +2864,8 @@ async function runWorker(deps) {
   const attempted = [];
   const maxAttempts = deps.maxAttempts ?? 3;
   const stallMs = deps.stallMs ?? 60 * 60 * 1e3;
+  let setupAborted = false;
+  const depsWithAbort = deps;
   while (true) {
     if (deps.abortSignal?.aborted) {
       return { aborted: true, attempted };
@@ -2871,7 +2875,10 @@ async function runWorker(deps) {
       return { aborted: false, attempted };
     }
     attempted.push(pick.task.id);
-    await runOneTask(
+    await runOneTask(depsWithAbort, pick.task, { maxAttempts, stallMs });
+    if (depsWithAbort.setupAborted) {
+      return { aborted: false, attempted };
+    }
     const row = getTask(deps.db, deps.runId, pick.task.id);
     if (row && (row.status === "failed" || row.status === "aborted")) {
       const blocked = deps.scheduler.cascadeFail(
|
|
@@ -2957,6 +2964,78 @@ async function runOneTask(deps, task, opts) {
|
|
|
2957
2964
|
});
|
|
2958
2965
|
return;
|
|
2959
2966
|
}
|
|
2967
|
+
const setupCommands = deps.plan.setup ?? [];
|
|
2968
|
+
if (setupCommands.length > 0 && !slot.setupCompleted) {
|
|
2969
|
+
const setupStart = Date.now();
|
|
2970
|
+
appendEvent(deps.db, {
|
|
2971
|
+
runId: deps.runId,
|
|
2972
|
+
taskId: task.id,
|
|
2973
|
+
kind: "slot.setup.started",
|
|
2974
|
+
payload: {
|
|
2975
|
+
slotIndex: slot.index,
|
|
2976
|
+
commands: deps.plan.setup,
|
|
2977
|
+
taskId: task.id
|
|
2978
|
+
}
|
|
2979
|
+
});
|
|
2980
|
+
const setupResult = await runVerify(setupCommands, {
|
|
2981
|
+
cwd: prepared.path,
|
|
2982
|
+
abortSignal: deps.abortSignal,
|
|
2983
|
+
onLine: deps.onVerifyLine
|
|
2984
|
+
});
|
|
2985
|
+
if (!setupResult.ok) {
|
|
2986
|
+
const durationMs = Date.now() - setupStart;
|
|
2987
|
+
const failure = setupResult.failure;
|
|
2988
|
+
const reason2 = `setup failed: ${failure.command} \u2192 exit ${failure.exitCode}`;
|
|
2989
|
+
appendEvent(deps.db, {
|
|
2990
|
+
runId: deps.runId,
|
|
2991
|
+
taskId: task.id,
|
|
2992
|
+
kind: "slot.setup.failed",
|
|
2993
|
+
payload: {
|
|
2994
|
+
slotIndex: slot.index,
|
|
2995
|
+
command: failure.command,
|
|
2996
|
+
exitCode: failure.exitCode,
|
|
2997
|
+
output: failure.output.slice(0, 4096),
|
|
2998
|
+
// truncate
|
|
2999
|
+
durationMs
|
|
3000
|
+
}
|
|
3001
|
+
});
|
|
3002
|
+
deps.pool.preserveOnFailure(slot);
|
|
3003
|
+
markFailedSafe(deps.db, deps.runId, task.id, reason2);
|
|
3004
|
+
const blocked = new Set(
|
|
3005
|
+
deps.scheduler.cascadeFail(task.id, reason2)
|
|
3006
|
+
);
|
|
3007
|
+
for (const row of listTasks(deps.db, deps.runId)) {
|
|
3008
|
+
if (row.task_id === task.id) continue;
|
|
3009
|
+
if (blocked.has(row.task_id)) continue;
|
|
3010
|
+
if (row.status !== "pending" && row.status !== "ready") continue;
|
|
3011
|
+
try {
|
|
3012
|
+
markBlocked(deps.db, deps.runId, row.task_id, reason2);
|
|
3013
|
+
blocked.add(row.task_id);
|
|
3014
|
+
} catch {
|
|
3015
|
+
}
|
|
3016
|
+
}
|
|
3017
|
+
for (const blockedId of blocked) {
|
|
3018
|
+
appendEvent(deps.db, {
|
|
3019
|
+
runId: deps.runId,
|
|
3020
|
+
taskId: blockedId,
|
|
3021
|
+
kind: "task.blocked",
|
|
3022
|
+
payload: { reason: reason2, failedDep: task.id }
|
|
3023
|
+
});
|
|
3024
|
+
}
|
|
3025
|
+
deps.setupAborted = true;
|
|
3026
|
+
return;
|
|
3027
|
+
}
|
|
3028
|
+
slot.setupCompleted = true;
|
|
3029
|
+
appendEvent(deps.db, {
|
|
3030
|
+
runId: deps.runId,
|
|
3031
|
+
taskId: task.id,
|
|
3032
|
+
kind: "slot.setup.completed",
|
|
3033
|
+
payload: {
|
|
3034
|
+
slotIndex: slot.index,
|
|
3035
|
+
durationMs: Date.now() - setupStart
|
|
3036
|
+
}
|
|
3037
|
+
});
|
|
3038
|
+
}
|
|
2960
3039
|
let sessionId;
|
|
2961
3040
|
try {
|
|
2962
3041
|
const created = await deps.client.session.create({
|
|
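The setup hunk above gates on `slot.setupCompleted`, so the plan's `setup` commands run at most once per worktree slot, and a failure blocks every task that is still `pending` or `ready`. A minimal sketch of that control flow follows, assuming simplified synchronous shapes. `Slot`, `TaskRow`, `Runner`, `ensureSetup`, and `tasksToBlock` are hypothetical names for illustration, not the plugin's API; the real worker awaits `runVerify` and records events in the database.

```typescript
// Hypothetical, simplified shapes; the real worker uses richer types.
interface Slot {
  index: number;
  setupCompleted: boolean;
}

interface TaskRow {
  task_id: string;
  status: string;
}

// Stand-in for runVerify: returns true when every command exits 0.
type Runner = (commands: string[]) => boolean;

// Run setup at most once per slot. Returns false when setup failed.
function ensureSetup(slot: Slot, commands: string[], run: Runner): boolean {
  if (commands.length === 0 || slot.setupCompleted) return true;
  if (!run(commands)) return false; // flag stays false; caller fails the task
  slot.setupCompleted = true;
  return true;
}

// After a setup failure, collect every task that should be marked blocked:
// the cascaded dependents plus any other task still pending or ready.
function tasksToBlock(
  rows: TaskRow[],
  failedTaskId: string,
  cascaded: string[],
): Set<string> {
  const blocked = new Set(cascaded);
  for (const row of rows) {
    if (row.task_id === failedTaskId) continue;
    if (row.status !== "pending" && row.status !== "ready") continue;
    blocked.add(row.task_id);
  }
  return blocked;
}
```

Keeping the flag on the slot rather than on the task means a slot reused for a later task skips setup, which matches the "once per worktree" wording in the setup-authoring rule.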
@@ -3,7 +3,7 @@ import {
|
|
|
3
3
|
createAgents,
|
|
4
4
|
formatModelOverrideWarning,
|
|
5
5
|
validateModelOverride
|
|
6
|
-
} from "./chunk-
|
|
6
|
+
} from "./chunk-CZMAJISX.js";
|
|
7
7
|
import {
|
|
8
8
|
PACKAGE_NAME,
|
|
9
9
|
readOurPackageVersion,
|
|
@@ -1850,7 +1850,7 @@ import { join as join8 } from "path";
|
|
|
1850
1850
|
var APP_KEY = "A-US-3617699429";
|
|
1851
1851
|
var ENDPOINT = "https://us.aptabase.com/api/v0/event";
|
|
1852
1852
|
var PKG_NAME = "@glrs-dev/harness-plugin-opencode";
|
|
1853
|
-
var PKG_VERSION = true ? "0.
|
|
1853
|
+
var PKG_VERSION = true ? "0.3.1" : "dev";
|
|
1854
1854
|
var DISABLED = process.env.HARNESS_OPENCODE_TELEMETRY === "0" || process.env.HARNESS_OPENCODE_TELEMETRY === "false" || process.env.DO_NOT_TRACK === "1" || process.env.CI === "true";
|
|
1855
1855
|
var SESSION_ID = randomUUID();
|
|
1856
1856
|
function getInstallId() {
|
|
@@ -11,7 +11,7 @@ A good plan trades a planning-session's worth of patient thought for hours of un
|
|
|
11
11
|
|
|
12
12
|
## Workflow
|
|
13
13
|
|
|
14
|
-
Apply these
|
|
14
|
+
Apply these ten rules in order. Each rule has its own file in `rules/` for the full text:
|
|
15
15
|
|
|
16
16
|
1. [`first-principles.md`](rules/first-principles.md) — Frame the task FROM the user's intent, not from a templated checklist. Ask "what does the user actually want done?" before "what files might change?"
|
|
17
17
|
|
|
@@ -29,6 +29,10 @@ Apply these eight rules in order. Each rule has its own file in `rules/` for the
|
|
|
29
29
|
|
|
30
30
|
8. [`task-context.md`](rules/task-context.md) — Every non-trivial task carries a `context:` block. Thin plans fail because the builder works each task from scratch with no carry-over; rich context pre-loads what the builder needs to work confidently. Cover outcome, rationale, code pointers, acceptance.
|
|
31
31
|
|
|
32
|
+
9. [`setup-authoring.md`](rules/setup-authoring.md) — Detect → propose → confirm the top-level `setup:` block. Covers package manager install, docker-compose services, and migration tooling detection.
|
|
33
|
+
|
|
34
|
+
10. [`qa-expectations.md`](rules/qa-expectations.md) — Detect → propose → confirm per-surface verify patterns for UI, API, DB, integration, browser-based component, and CLI surfaces.
|
|
35
|
+
|
|
32
36
|
## After applying the rules
|
|
33
37
|
|
|
34
38
|
1. Save the YAML to the path returned by `bunx @glrs-dev/harness-plugin-opencode pilot plan-dir`.
|
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
# Rule 10 — QA-expectations establishment
|
|
2
|
+
|
|
3
|
+
**Detect → propose → confirm per-surface verify patterns.**
|
|
4
|
+
|
|
5
|
+
A plan's verify commands are its contract with the builder. Generic verifies ("run tests") waste builder time; specific verifies ("run the API tests that exercise the files this task touches") catch real failures. This rule establishes concrete, per-surface QA expectations with the user before emitting the plan.
|
|
6
|
+
|
|
7
|
+
## The six surfaces
|
|
8
|
+
|
|
9
|
+
For each surface below, detect signals in the codebase, propose a canonical verify pattern, and confirm with the user.
|
|
10
|
+
|
|
11
|
+
### UI — Browser-based user interface
|
|
12
|
+
|
|
13
|
+
**Detection signals:**
|
|
14
|
+
- `@playwright/test`, `cypress`, or `@vitest/browser` in `package.json` dependencies
|
|
15
|
+
- `playwright.config.{ts,js}` or `cypress.config.*` present
|
|
16
|
+
|
|
17
|
+
**Proposed verify pattern:**
|
|
18
|
+
Playwright test run for visual/interaction assertions:
|
|
19
|
+
```yaml
|
|
20
|
+
verify:
|
|
21
|
+
- playwright test --project=chromium --grep "@task-specific-tag"
|
|
22
|
+
```
|
|
23
|
+
|
|
24
|
+
### API — HTTP endpoints
|
|
25
|
+
|
|
26
|
+
**Detection signals:**
|
|
27
|
+
- `openapi.yaml` / `openapi.json` present
|
|
28
|
+
- `curl` or `httpie` usage in existing scripts
|
|
29
|
+
- Postman collection files
|
|
30
|
+
|
|
31
|
+
**Proposed verify pattern:**
|
|
32
|
+
Direct HTTP assertion against a local port:
|
|
33
|
+
```yaml
|
|
34
|
+
verify:
|
|
35
|
+
- curl -fsS http://localhost:3000/health | jq -e '.status == "ok"'
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
### DB — Database schema and queries
|
|
39
|
+
|
|
40
|
+
**Detection signals:**
|
|
41
|
+
- `docker-compose` postgres service defined
|
|
42
|
+
- `prisma`, `drizzle-kit`, `knex`, or `flyway` in dependencies
|
|
43
|
+
- `test/db` or similar helper directory
|
|
44
|
+
|
|
45
|
+
**Proposed verify pattern:**
|
|
46
|
+
Postgres readiness + migration + assertion:
|
|
47
|
+
```yaml
|
|
48
|
+
verify:
|
|
49
|
+
- pg_isready -h localhost -p 5432
|
|
50
|
+
- pnpm prisma migrate deploy
|
|
51
|
+
- pnpm tsx scripts/verify-db.ts
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
### Integration — Cross-module workflows
|
|
55
|
+
|
|
56
|
+
**Detection signals:**
|
|
57
|
+
- `test/integration/**` directory exists
|
|
58
|
+
- `e2e/**` directory exists
|
|
59
|
+
- `*.integration.test.ts` files
|
|
60
|
+
|
|
61
|
+
**Proposed verify pattern:**
|
|
62
|
+
Integration test runner scoped to relevant paths:
|
|
63
|
+
```yaml
|
|
64
|
+
verify:
|
|
65
|
+
- pnpm test test/integration
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
### Browser-based component — Storybook stories
|
|
69
|
+
|
|
70
|
+
**Detection signals:**
|
|
71
|
+
- `storybook` or `@storybook/*` in dependencies
|
|
72
|
+
- `*.stories.{ts,tsx}` files present
|
|
73
|
+
|
|
74
|
+
**Proposed verify pattern:**
|
|
75
|
+
Storybook test or Chromatic visual verification:
|
|
76
|
+
```yaml
|
|
77
|
+
verify:
|
|
78
|
+
- pnpm storybook test --stories "ComponentName"
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
### CLI — Command-line interface
|
|
82
|
+
|
|
83
|
+
**Detection signals:**
|
|
84
|
+
- `bin/*` directory with executables
|
|
85
|
+
- `package.json` `bin:` entry defined
|
|
86
|
+
|
|
87
|
+
**Proposed verify pattern:**
|
|
88
|
+
Smoke test via help flag or scripted invocation:
|
|
89
|
+
```yaml
|
|
90
|
+
verify:
|
|
91
|
+
- pnpm my-cli --help
|
|
92
|
+
- pnpm tsx scripts/smoke-test-cli.ts
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
## Question-bundling rule
|
|
96
|
+
|
|
97
|
+
**Two or more surfaces detected:** Bundle into a single structured `question` tool call with one checkbox group per surface.
|
|
98
|
+
|
|
99
|
+
**One surface detected:** Still ask (confirmation, not interrogation), but use a single-field call.
|
|
100
|
+
|
|
101
|
+
**Zero surfaces detected:** Skip the QA-expectation question entirely. Fall back to generic verifies:
|
|
102
|
+
```yaml
|
|
103
|
+
defaults:
|
|
104
|
+
verify_after_each:
|
|
105
|
+
- pnpm run typecheck
|
|
106
|
+
- pnpm test
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
## Emission
|
|
110
|
+
|
|
111
|
+
Confirmed patterns become:
|
|
112
|
+
|
|
113
|
+
1. **Per-task verify templates** — tasks targeting specific files use scoped verifies (e.g., `pnpm test test/api/users.test.ts` for a task touching `src/api/users.ts`)
|
|
114
|
+
2. **defaults.verify_after_each** — global breakage catchers (typecheck, full test suite)
|
|
115
|
+
|
|
116
|
+
The rule: per-task verify targets the specific files touched; defaults catches global breakage.
|
|
117
|
+
|
|
118
|
+
## Cross-reference to verify-design.md
|
|
119
|
+
|
|
120
|
+
This rule (10) is the per-surface tactical layer — it names the tools to detect and the patterns to propose. Rule 3 (verify-design.md) owns the principles: deterministic, assertive, would-have-failed-before. Every proposed command must satisfy both layers.
|
|
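The detection and question-bundling steps in `qa-expectations.md` can be sketched as two small functions. This is an illustrative sketch only: it covers the dependency-based signals and the `bin` entry, so the API and integration surfaces (which the rule detects from files and directories) are left out, and `PackageJson`, `detectSurfaces`, and `questionMode` are hypothetical names rather than anything the plugin exports.

```typescript
type Surface = "ui" | "db" | "component" | "cli";

// Hypothetical subset of package.json fields used for detection.
interface PackageJson {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  bin?: string | Record<string, string>;
}

// Dependency names taken from the rule's "Detection signals" lists.
const UI_DEPS = new Set(["@playwright/test", "cypress", "@vitest/browser"]);
const DB_DEPS = new Set(["prisma", "drizzle-kit", "knex", "flyway"]);

function detectSurfaces(pkg: PackageJson): Surface[] {
  const deps = Object.keys({ ...pkg.dependencies, ...pkg.devDependencies });
  const surfaces: Surface[] = [];
  if (deps.some((d) => UI_DEPS.has(d))) surfaces.push("ui");
  if (deps.some((d) => DB_DEPS.has(d))) surfaces.push("db");
  if (deps.some((d) => d === "storybook" || d.startsWith("@storybook/"))) {
    surfaces.push("component");
  }
  if (pkg.bin !== undefined) surfaces.push("cli");
  return surfaces;
}

type QuestionMode = "skip" | "single" | "bundled";

// Question-bundling rule: zero surfaces skips the question, one surface
// asks a single-field confirmation, two or more are bundled into one call.
function questionMode(surfaces: Surface[]): QuestionMode {
  if (surfaces.length === 0) return "skip";
  return surfaces.length === 1 ? "single" : "bundled";
}
```

In the `"skip"` case the rule falls back to the generic `defaults.verify_after_each` commands instead of asking the user anything.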
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# Rule 9 — Setup-block authoring
|
|
2
|
+
|
|
3
|
+
**Detect → propose → confirm the top-level `setup:` block.**
|
|
4
|
+
|
|
5
|
+
The `setup:` block runs once per worktree before any task executes. It is the environment bootstrap: package manager install, docker-compose services, migration runs. A good setup block means the builder starts with a working environment; a missing one means tasks fail confusingly on missing dependencies.
|
|
6
|
+
|
|
7
|
+
## Detection signals
|
|
8
|
+
|
|
9
|
+
During codebase research (Section 2), look for these signals:
|
|
10
|
+
|
|
11
|
+
**Lockfiles → package manager install:**
|
|
12
|
+
- `pnpm-lock.yaml` → `pnpm install --frozen-lockfile`
|
|
13
|
+
- `bun.lock` → `bun install --frozen-lockfile`
|
|
14
|
+
- `package-lock.json` → `npm ci`
|
|
15
|
+
- `yarn.lock` → `yarn install --frozen-lockfile` (Yarn 1; Yarn 2+ uses `yarn install --immutable`)
|
|
16
|
+
- `Cargo.lock` → `cargo fetch`
|
|
17
|
+
|
|
18
|
+
**Docker Compose → service startup:**
|
|
19
|
+
- `docker-compose.yml` or `compose.yaml` with defined services → `docker compose up -d <svc>` for each service the tasks will need (typically postgres, redis, etc.)
|
|
20
|
+
|
|
21
|
+
**Migration tooling → schema setup:**
|
|
22
|
+
- `package.json` deps containing `knex`, `prisma`, `drizzle-kit`, or `flyway` → corresponding migrate/push command (e.g., `prisma migrate dev`, `drizzle-kit push`)
|
|
23
|
+
|
|
24
|
+
## Proposal shape
|
|
25
|
+
|
|
26
|
+
When you detect one or more setup commands, bundle them into a single `question` tool call:
|
|
27
|
+
|
|
28
|
+
- Present each detected command as a pre-selected checkbox
|
|
29
|
+
- Group by category (Package install, Services, Migrations)
|
|
30
|
+
- Allow the user to uncheck commands that aren't needed or edit the command text
|
|
31
|
+
- Include an "Add another command" free-text field for anything you missed
|
|
32
|
+
|
|
33
|
+
Example question structure:
|
|
34
|
+
|
|
35
|
+
```
|
|
36
|
+
Setup commands detected (check all that should run before the first task):
|
|
37
|
+
|
|
38
|
+
[✓] Package install: pnpm install --frozen-lockfile
|
|
39
|
+
[✓] Services: docker compose up -d postgres
|
|
40
|
+
[✓] Migrations: pnpm prisma migrate dev
|
|
41
|
+
|
|
42
|
+
[Add another command: __________]
|
|
43
|
+
```
|
|
44
|
+
|
|
45
|
+
## No-op behavior
|
|
46
|
+
|
|
47
|
+
If NOTHING is detected (no lockfile, no compose, no migration tooling), emit `setup: []` or omit the key entirely. Do NOT ask the user open-ended "do you need setup?" questions. The schema defaults to `[]`; omitting is safe.
|
|
48
|
+
|
|
49
|
+
## Emission
|
|
50
|
+
|
|
51
|
+
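Whatever the user confirms becomes the top-level `setup:` block in the written YAML, positioned above `defaults:` so the bootstrap steps read first (key order does not affect parsing; `PlanSchema` itself declares `setup` after `milestones`):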
|
|
52
|
+
|
|
53
|
+
```yaml
|
|
54
|
+
name: my-plan
|
|
55
|
+
setup:
|
|
56
|
+
- pnpm install --frozen-lockfile
|
|
57
|
+
- docker compose up -d postgres
|
|
58
|
+
- pnpm prisma migrate dev
|
|
59
|
+
defaults:
|
|
60
|
+
verify_after_each:
|
|
61
|
+
- pnpm run typecheck
|
|
62
|
+
tasks:
|
|
63
|
+
...
|
|
64
|
+
```
|
|
65
|
+
|
|
66
|
+
## Back-compat note
|
|
67
|
+
|
|
68
|
+
The `setup:` key already defaults to `[]` in the schema (line 241 of `src/pilot/plan/schema.ts`). Plans that omit it or set it to `[]` behave identically to before this rule existed.
|
|
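The lockfile table in `setup-authoring.md` maps directly onto a lookup. The sketch below assumes the planner already has a list of file names at the repository root and proposes every matching install command, leaving the user to uncheck extras; `INSTALL_BY_LOCKFILE` and `proposeInstallCommands` are hypothetical names, and the compose and migration signals are not covered.

```typescript
// Lockfile to install command, in the order the rule lists them.
const INSTALL_BY_LOCKFILE: ReadonlyArray<readonly [string, string]> = [
  ["pnpm-lock.yaml", "pnpm install --frozen-lockfile"],
  ["bun.lock", "bun install --frozen-lockfile"],
  ["package-lock.json", "npm ci"],
  ["yarn.lock", "yarn install --frozen-lockfile"],
  ["Cargo.lock", "cargo fetch"],
];

// Returns the proposed install commands for the files found at the repo root.
// An empty result corresponds to the rule's no-op case (`setup: []`).
function proposeInstallCommands(rootFiles: string[]): string[] {
  const present = new Set(rootFiles);
  return INSTALL_BY_LOCKFILE
    .filter(([lockfile]) => present.has(lockfile))
    .map(([, command]) => command);
}
```

Returning every match rather than the first one is a design choice made for this sketch: a polyglot repository with both `pnpm-lock.yaml` and `Cargo.lock` would get two pre-selected checkboxes.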
@@ -51,3 +51,7 @@ If a verify command flakes, three retries will exhaust attempts and the task fai
|
|
|
51
51
|
## Always include a "before" check
|
|
52
52
|
|
|
53
53
|
For non-trivial tasks, write a verify that would HAVE FAILED before the task ran. This makes the task's value observable. If the verify passed before AND passes after, the task didn't actually move the system.
|
|
54
|
+
|
|
55
|
+
## Cross-reference: per-surface tooling menu
|
|
56
|
+
|
|
57
|
+
For the per-surface tooling menu (Playwright for UI, curl for API, Postgres for DB), see rule 10 (`qa-expectations.md`). That rule applies these principles to specific tools; this rule defines the principles themselves.
|
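The "before" check in `verify-design.md` amounts to comparing a verify command's result before and after the task runs. A small sketch of that comparison, with hypothetical outcome labels (`VerifyOutcome` and `classifyVerify` are not part of the plugin):

```typescript
type VerifyOutcome = "effective" | "no-op" | "regression" | "still-failing";

// Compare a verify command's pass/fail result before and after a task.
function classifyVerify(passedBefore: boolean, passedAfter: boolean): VerifyOutcome {
  if (!passedBefore && passedAfter) return "effective"; // the task made the check pass
  if (passedBefore && passedAfter) return "no-op"; // the check never observed the change
  if (passedBefore && !passedAfter) return "regression"; // the task broke a passing check
  return "still-failing"; // the task has not yet satisfied the check
}
```

Only the `"effective"` outcome shows that the task moved the system; a `"no-op"` result suggests the verify is too generic for the task it guards.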
package/package.json
CHANGED