@glrs-dev/cli 1.2.0 → 2.0.1

Files changed (39)
  1. package/CHANGELOG.md +4 -0
  2. package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-assessor.md +77 -0
  3. package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-builder.md +24 -116
  4. package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-planner.md +38 -160
  5. package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-scoper.md +58 -0
  6. package/dist/vendor/harness-opencode/dist/{chunk-BWERBERN.js → chunk-6CZPRUMJ.js} +12 -62
  7. package/dist/vendor/harness-opencode/dist/chunk-DZG4D3OH.js +54 -0
  8. package/dist/vendor/harness-opencode/dist/chunk-OYRKOEXK.js +88 -0
  9. package/dist/vendor/harness-opencode/dist/cli.js +1631 -4224
  10. package/dist/vendor/harness-opencode/dist/index.js +831 -166
  11. package/dist/vendor/harness-opencode/dist/{install-5JKWK6Z4.js → install-6775ZBDG.js} +1 -1
  12. package/dist/vendor/harness-opencode/dist/paths-WZ23ZQOV.js +18 -0
  13. package/dist/vendor/harness-opencode/package.json +1 -1
  14. package/package.json +1 -1
  15. package/dist/vendor/harness-opencode/dist/agents/prompts/pilot-builder.open.md +0 -129
  16. package/dist/vendor/harness-opencode/dist/chunk-57EOY72Y.js +0 -174
  17. package/dist/vendor/harness-opencode/dist/chunk-5TAMY7P6.js +0 -67
  18. package/dist/vendor/harness-opencode/dist/chunk-BKTFWXLG.js +0 -204
  19. package/dist/vendor/harness-opencode/dist/chunk-EK7K4NTV.js +0 -747
  20. package/dist/vendor/harness-opencode/dist/chunk-KB7M7JXU.js +0 -145
  21. package/dist/vendor/harness-opencode/dist/chunk-RNRCXQ65.js +0 -56
  22. package/dist/vendor/harness-opencode/dist/paths-LT3QQKCF.js +0 -18
  23. package/dist/vendor/harness-opencode/dist/pilot/mcp/status-server.d.ts +0 -1
  24. package/dist/vendor/harness-opencode/dist/pilot/mcp/status-server.js +0 -228
  25. package/dist/vendor/harness-opencode/dist/pilot-config-7LJZ23YK.js +0 -55
  26. package/dist/vendor/harness-opencode/dist/runs-QWPL3TKV.js +0 -18
  27. package/dist/vendor/harness-opencode/dist/safety-gate-WM3EWOCY.js +0 -10
  28. package/dist/vendor/harness-opencode/dist/setup-hook-FHTXMAQL.js +0 -88
  29. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/SKILL.md +0 -80
  30. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/dag-shape.md +0 -47
  31. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/decomposition.md +0 -63
  32. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/first-principles.md +0 -29
  33. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/milestones.md +0 -57
  34. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/qa-expectations.md +0 -120
  35. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/self-review.md +0 -46
  36. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/task-context.md +0 -47
  37. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/touches-scope.md +0 -81
  38. package/dist/vendor/harness-opencode/dist/skills/pilot-planning/rules/verify-design.md +0 -163
  39. package/dist/vendor/harness-opencode/dist/tasks-KJ3WN2KY.js +0 -32
@@ -1,163 +0,0 @@
- # Rule 3 — Verify-command design
-
- **Each task's `verify:` commands must succeed iff the task is correctly done.**
-
- The verify list is the contract between the planner and the builder. It is the ONLY signal pilot uses to decide "did this task work?". A weak verify means you're shipping work the run thinks is fine but really isn't. An over-broad verify means the task fails for reasons unrelated to the work — pre-existing test failures, missing infrastructure, flaky integration tests — and the agent wastes its retry budget on something it can't fix.
-
- ## The cardinal rule: verify ONLY what the task changed
-
- A verify command must exercise **exactly the code the task produced** — no more, no less. If the task adds `src/entities/audit-log/schema.ts` and its test file, the verify is:
-
- ```yaml
- verify:
-   - pnpm --filter @kn/core test -- --run src/entities/audit-log/__tests__/schema.test.ts
- ```
-
- NOT:
-
- ```yaml
- verify:
-   - pnpm --filter @kn/core test -- --run src/entities/audit-log
- ```
-
- The second form runs EVERY test under that directory — including integration tests that need a running database, tests for pre-existing code the task didn't touch, and tests that may already be failing on the base branch. The agent cannot fix those failures. It will exhaust its retry budget and STOP.
-
- **The verify command's scope must be as tight as the `touches:` scope.** If you wouldn't put a file in `touches:`, don't let the verify command exercise it.
-
- ## What a good verify looks like
-
- - `pnpm test -- --run path/to/specific.test.ts` — runs ONE test file
- - `bun test test/api/specific.test.ts` — same, bun flavor
- - `bun run typecheck` — semantic check, catches real type failures (good as `verify_after_each`)
- - `node scripts/check-schema.ts` — your own probe script (write it as part of the task)
- - `grep -q 'export function newThing' src/file.ts && bun test test/file.test.ts` — existence + behavior
-
- ## What's not OK
-
- - `echo done` — proves nothing
- - `test -f src/foo.ts` — file existence is necessary but rarely sufficient
- - `bun run build` ALONE — build success without tests means "TypeScript was happy"; insufficient for behavior tasks
- - `pnpm test` (whole package) — pulls in every test in the package; pre-existing failures block the task
- - `pnpm --filter @pkg test -- --run src/module` (directory-level) — same problem; runs integration tests the task didn't write
- - `grep -q 'newFunction' src/file.ts` — proves text presence, not behavior
- - `git diff --name-only | grep src/api` — proves edits happened, not that they're correct
-
- ## The pre-existing-failure trap
-
- Pilot runs a **baseline check** before the agent starts: every verify command is executed on the clean tree. If ANY command fails in baseline, the task aborts immediately with a clear message:
-
- > baseline verify failed: `pnpm --filter @kn/core test` → exit 1.
- > This command fails on the clean tree BEFORE the agent starts —
- > fix your environment or narrow the verify scope.
-
- This prevents the agent from wasting its 5-attempt retry budget on failures it didn't cause and can't fix. The baseline is the planner's contract: "these commands WILL pass if the environment is set up correctly."
-
- **If your verify command fails in baseline, the fix is one of:**
- 1. Start the missing infrastructure (the setup hook should handle this).
- 2. Narrow the verify to only the specific test file the task creates.
- 3. Fix the pre-existing test failure on the base branch first.
-
- The agent gets 5 attempts (with escalating "try a different approach" nudges) for failures it introduces AFTER the baseline passes. Pre-existing failures never reach the agent.
-
- ## Milestone and defaults verify run in the baseline too
-
- The baseline check doesn't only run task-specific verify commands — it runs **everything except** the task's own `verify:` list. That means:
-
- - `defaults.verify_after_each` commands
- - The task's milestone `verify` commands
- - `pilot.json` `baseline` and `after_each` commands
-
- These commands run on the clean tree **before every task in their scope**. If a milestone verify is `pnpm --filter @pkg test` and the first task in that milestone scaffolds the package with a test runner config but zero test files, the *second* task's baseline fails — vitest/jest exit 1 on "no test files found", and the entire downstream DAG cascades to failure.
-
- **The rule: every milestone and defaults verify command must pass at every point in the DAG where it applies — including immediately after scaffold tasks that create zero test files.**
-
- ### The empty-test-suite trap
-
- Test runners treat "no test files found" as a failure by default:
-
- | Runner | Behavior on zero tests | Fix |
- |---|---|---|
- | vitest | exit 1 | `--passWithNoTests` |
- | jest | exit 1 | `--passWithNoTests` |
- | bun test | exit 0 (safe by default) | — |
-
- When a plan scaffolds a new package or module, the scaffold task creates the test runner config but typically no test files — the first real task creates those. Any milestone or defaults verify that runs the package's test suite will hit the empty-suite exit code.
-
- **Fix: always use `--passWithNoTests` (or equivalent) on milestone and defaults verify commands that run a test suite.** This is not a weakening of the verify — it's acknowledging that "zero tests, zero failures" is a valid baseline state for a package under construction.
-
- ```yaml
- # WRONG — fails baseline after scaffold task
- milestones:
-   - name: M1-ENGINE
-     verify:
-       - pnpm --filter @pkg test
-
- # RIGHT — tolerates the empty state between scaffold and first real task
- milestones:
-   - name: M1-ENGINE
-     verify:
-       - pnpm --filter @pkg test -- --passWithNoTests
- ```
-
- Task-specific verify does NOT need `--passWithNoTests` — it targets the exact test file the task creates, and the baseline excludes task-specific verify commands (they'd fail before the task runs by design — that's TDD).
-
- ## Two-tier verify
-
- Use BOTH a per-task verify and `defaults.verify_after_each`:
-
- ```yaml
- defaults:
-   verify_after_each:
-     - bun run typecheck # always must pass — catches cross-file breakage
- tasks:
-   - id: T1
-     verify:
-       - bun test test/api/create-rule.test.ts # task-specific behavior proof
- ```
-
- `verify_after_each` catches global breakage (a syntax error in a file the task didn't even touch); per-task verify catches task-specific behavior. Together they form a tight net without over-reaching.
-
- ## Touches and verify must agree
-
- If the task `touches: [src/api/rules.ts, test/api/rules.test.ts]` but the verify command runs `bun test test/web/`, you have a wrong scope. The verify must exercise files in the touched scope — and ONLY those files.
-
- Conversely: if the verify runs `test/api/rules.test.ts` but `touches:` doesn't include `test/api/rules.test.ts`, the agent can't create or edit that test file. Both must agree.
-
- ## Verify must be deterministic and self-contained
-
- - No `sleep` to wait for a service that may not start.
- - No external network calls that could flake — mock or skip.
- - No dependency on infrastructure the setup hook didn't start. If the verify needs postgres, the setup hook must start it. If the verify needs an API server, the setup hook must start it.
- - No dependency on other tasks' output being committed (use `depends_on` to sequence).
-
- If a verify command flakes, three retries will exhaust attempts and the task fails for environmental reasons. Pilot has no way to distinguish "real failure" from "flake".
-
- ## Always include a "before" check
-
- For non-trivial tasks, write a verify that would HAVE FAILED before the task ran. This makes the task's value observable. If the verify passed before AND passes after, the task didn't actually move the system.
-
- Good pattern: the test file the agent creates IS the "before" check — it didn't exist before, so `bun test path/to/new.test.ts` would have failed (file not found). After the task, it exists and passes.
-
- ## Port and environment awareness
-
- If the setup hook starts services on non-default ports (to avoid collisions with the user's dev stack), verify commands must use those ports. Two patterns:
-
- **A. Source the env file the hook wrote:**
- ```yaml
- verify:
-   - bash -c 'source .env.pilot && pnpm --filter @pkg test -- --run path/to/test.ts'
- ```
-
- **B. Use `defaults.verify_after_each` for the env-sourcing wrapper:**
- ```yaml
- defaults:
-   verify_after_each:
-     - bash -c 'source .env.pilot && bun run typecheck'
- ```
-
- **C. Tests read from `process.env` at runtime** (best — no wrapper needed):
- If the test framework reads `DATABASE_URL` from the environment, and the setup hook exports it, the verify command just works. This is the cleanest pattern.
-
- ## Cross-reference: per-surface tooling menu
-
- For the per-surface tooling menu (Playwright for UI, curl for API, Postgres for DB), see rule 9 (`qa-expectations.md`). That rule applies these principles to specific tools; this rule defines the principles themselves.
@@ -1,32 +0,0 @@
- import {
-   countByStatus,
-   getTask,
-   listTasks,
-   markAborted,
-   markBlocked,
-   markFailed,
-   markPending,
-   markReady,
-   markRunning,
-   markSucceeded,
-   readyTasks,
-   resetTasksForResume,
-   setCostUsd,
-   upsertFromPlan
- } from "./chunk-57EOY72Y.js";
- export {
-   countByStatus,
-   getTask,
-   listTasks,
-   markAborted,
-   markBlocked,
-   markFailed,
-   markPending,
-   markReady,
-   markRunning,
-   markSucceeded,
-   readyTasks,
-   resetTasksForResume,
-   setCostUsd,
-   upsertFromPlan
- };