planr 1.1.17 → 1.1.18
- package/docs/CLI_REFERENCE.md +3 -1
- package/npm/native/darwin-arm64/planr +0 -0
- package/npm/native/darwin-x86_64/planr +0 -0
- package/npm/native/linux-arm64/planr +0 -0
- package/npm/native/linux-x86_64/planr +0 -0
- package/package.json +1 -1
- package/plugins/planr/.claude-plugin/plugin.json +1 -1
- package/plugins/planr/.codex-plugin/plugin.json +1 -1
- package/plugins/planr/skills/planr-goal/SKILL.md +2 -0
- package/plugins/planr/skills/planr-loop/SKILL.md +1 -1
- package/plugins/planr/skills/planr-work/SKILL.md +1 -1
package/docs/CLI_REFERENCE.md
CHANGED
@@ -77,7 +77,7 @@ With `--json`, responses follow one convention so agents never guess where data
 - Other single objects use their semantic key: `plan`, `log`, `review`, `artifact`, `context`.
 - Optional guidance appears under `hint` or `next` when a follow-up command is the expected move.

-`plan check` validates path, YAML frontmatter, and that required sections have content: build plans need `## Scope Decision`, `## Verification`, and `## Acceptance Criteria` filled; product plans need `## Problem`, `## Requirements`, and `## Success Criteria` filled in `PRODUCT_SPEC.md`. Each warning is structured — `{"file", "section", "message", "fix"}` — and names the exact file to edit plus the re-run command, so a failed check is a repair instruction, not a riddle.
+`plan check` validates path, YAML frontmatter, and that required sections have content: build plans need `## Scope Decision`, `## Verification`, and `## Acceptance Criteria` filled; product plans need `## Problem`, `## Requirements`, and `## Success Criteria` filled in `PRODUCT_SPEC.md`. It also flags a task list that still contains only the scaffold placeholder (or no work specs at all) — `map build` would turn that into a single coarse item, so the fix names the granularity contract: one `### TASK-00n:` heading (or `- [ ]` line) per verifiable slice, typically 4-8, in execution order. Each warning is structured — `{"file", "section", "message", "fix"}` — and names the exact file to edit plus the re-run command, so a failed check is a repair instruction, not a riddle.

 `plan audit <plan-id>` is the one-call contract verdict for a plan's map scope. It evaluates four clauses with evidence: `items_settled` (open items listed), `reviews_complete` (open review items listed), `approvals_clear` (requested/denied approvals listed), and `verification_logged` (logs with `--kind verification` on scope items). The stored goal contract (`planr context --tag goal-contract` mentioning the plan id) is included; the verification clause is binding only when such a contract exists. `holds: true` means the contract is satisfied — loop agents use this as their stop condition instead of stitching the verdict together from `map status`, `log list`, and `approval list`. Also available as MCP `planr_plan_audit`.

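The task-list check added in the hunk above can be pictured with a short Python sketch. It is an illustration under stated assumptions, not planr's implementation: the regular expressions, the `## Tasks` section name, and the `placeholder` argument are hypothetical. Only the four warning keys and the granularity wording are taken from the changed line.

```python
import re

# Hypothetical patterns: the changed line only says "one `### TASK-00n:` heading
# (or `- [ ]` line) per verifiable slice".
TASK_HEADING = re.compile(r"^### TASK-\d+:\s*(.+)$")
CHECKBOX = re.compile(r"^- \[ \]\s*(.+)$")


def task_slices(plan_markdown):
    """Return the titles of all task headings and unchecked checkbox lines."""
    titles = []
    for line in plan_markdown.splitlines():
        match = TASK_HEADING.match(line) or CHECKBOX.match(line)
        if match:
            titles.append(match.group(1).strip())
    return titles


def check_task_list(path, plan_markdown, placeholder):
    """Return warnings shaped like {"file", "section", "message", "fix"}.

    `placeholder` is the scaffold's task title; its real text is not shown in
    the diff, so the caller has to supply it (an assumption of this sketch).
    """
    titles = task_slices(plan_markdown)
    if titles and titles != [placeholder]:
        return []  # the list has been expanded beyond the scaffold
    return [{
        "file": path,
        "section": "## Tasks",  # hypothetical section name
        "message": "task list contains only the scaffold placeholder or no work specs",
        "fix": ("add one `### TASK-00n:` heading (or `- [ ]` line) per verifiable "
                "slice, typically 4-8, in execution order, then re-run the check"),
    }]
```

A plan that still holds only the placeholder yields one warning, while a plan with several slices yields none, which mirrors the "repair instruction, not a riddle" intent of the structured output.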
@@ -115,6 +115,8 @@ With `--json`, responses follow one convention so agents never guess where data

 `pick --work-type <type>` restricts the lease to one work type, so checker agents pick only `review` items and makers only work items. `pick --plan <plan-id>` restricts the lease to one plan's items, so plan-scoped goal runs never pick work outside their contract even when other plans share the board; an unknown plan id is an error, never a silent unscoped pick. Both filters are available on MCP `planr_pick_item` and HTTP `POST /v1/pick` (`work_type`, `plan`). A null pick is never blind: `{"item": null}` carries a `reason` (`empty_map`, `all_settled`, `nothing_ready`, `ready_items_excluded_by_filter`) and the `remaining` snapshot. When ready work exists but the active filters rejected all of it, `excluded` lists each ready item with the cause (`work_type` mismatch, outside the `--plan` scope, or just requested by this worker) and `repair` carries the exact pick commands that would lease that work — across CLI, MCP, and HTTP. On a review item, `close_effect` previews the full `--close-target` cascade: it lists the work that closing the review (and with it the reviewed item) would unlock.

+`artifact add` infers the mime type from the file extension when `--path` is given without `--mime` (PNG screenshots land as `image/png`, not `text/plain`); inline `--content` defaults to `text/plain`. The same inference applies on MCP `planr_artifact_add` and HTTP `POST /v1/artifacts`.
+
 `review evidence` reports Git worktree status scoped to files named by item logs or artifacts. Dirty files without item provenance are listed as unrelated and are not treated as agent-owned evidence. `--pr-url` records an item-scoped PR reference before returning the evidence package.

 `recover sweep` previews by default. With `--apply`, timed-out picked work that has a retry budget (`max_retries > 0`) is marked `failed` with an `item_timed_out` event; stale work and timeouts without a retry budget are released back to `ready`. Failed work re-enters `ready` once its retry delay has elapsed (`retry_delay_ms`, doubled per retry under `exponential` backoff) until the budget is exhausted. Every transition records a recovery event. Item pre/post conditions are visible in pick context, trace output, and close previews; post conditions are reported as manual verification gates instead of being guessed automatically.
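The precedence described in the added `artifact add` line (explicit `--mime`, then the extension of `--path`, then `text/plain` for inline content) can be sketched in Python with the standard `mimetypes` module. This is a sketch under assumptions, not planr's code: the function name `infer_mime` is hypothetical, and the fallback for an unrecognised extension is a guess, since the diff does not say what planr does in that case.

```python
import mimetypes


def infer_mime(path=None, content=None, mime=None):
    """Pick a mime type: an explicit value wins, then the file extension."""
    if mime:
        return mime
    if path is not None:
        guessed, _encoding = mimetypes.guess_type(path)
        # Assumption: the diff does not state the fallback for unknown
        # extensions, so this sketch uses a generic binary type.
        return guessed or "application/octet-stream"
    # Inline content defaults to text/plain, as the changed line states.
    return "text/plain"
```

With this shape, `infer_mime(path="shot.png")` yields `image/png` rather than `text/plain`, which is the behaviour the changed line calls out for PNG screenshots.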
Binary file
Binary file
Binary file
Binary file
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "name": "planr",
 "description": "Skill-driven planning and execution loop for coding agents: one planr entry point, an autonomous planr-loop, and evidence-backed task graph skills powered by the planr CLI.",
-"version": "1.1.
+"version": "1.1.18",
 "author": {
 "name": "instructa"
 },
@@ -1,6 +1,6 @@
 {
 "name": "planr",
-"version": "1.1.
+"version": "1.1.18",
 "description": "Skill-driven planning and execution loop for coding agents: one $planr entry point, an autonomous $planr-loop, and evidence-backed task graph skills powered by the planr CLI.",
 "author": {
 "name": "instructa",
@@ -34,6 +34,8 @@ planr map build --from <plan-id> # idempotent: safe to re-run

 `plan refine` appends notes; the plan body is yours to edit. When `plan check` fails, each warning names the exact file and section — edit that file directly, fill the section with real content, and re-run the check. Scaffold sections (`## Scope Decision`, `## Verification`, `## Acceptance Criteria`) are filled by editing the plan markdown, not by more `refine` notes.

+Before `map build`, expand the plan's task list: the scaffold ships a single placeholder task, and mapping it produces one coarse item that forces the worker to guess the breakdown later. Replace the placeholder with one `### TASK-00n: <slice>` heading (or `- [ ]` line) per verifiable slice — typically 4-8, in execution order, each one closeable with its own evidence. Derive the slices from the acceptance criteria; `plan check` flags the unexpanded placeholder.
+
 `map build` creates one item per plan step and chains them in plan order with `blocks` links; the output lists the created items and links. Review that chain and adjust it only where execution order differs from document order:

 ```bash
@@ -52,7 +52,7 @@ The short path per item is three commands: `planr pick --json` (one flat work pa

 `map build` chains created items in plan order with `blocks` links automatically and prints the created items and links. In step 2, verify that chain against real execution-order dependencies and adjust with `planr link add` only where document order and execution order differ. `item breakdown` works the same way: pass one `--into` per child title (or one value with newline-separated titles), and the output lists the chained children plus the next command.

-Request reviews where they carry signal: implementation slices and anything user-facing finish with `done --review`. Trivial inspection, baseline, or setup items close with plain `done` (evidence still required) — a review that can only confirm "the repo was empty" adds ceremony, not safety. The goal contract's "all reviews closed" clause audits review items that exist; plain-`done` items satisfy it without a review gate, so skipping low-signal reviews never blocks `plan audit`.
+Request reviews where they carry signal: implementation slices and anything user-facing finish with `done --review`. Trivial inspection, baseline, or setup items close with plain `done` (evidence still required) — a review that can only confirm "the repo was empty" adds ceremony, not safety. The goal contract's "all reviews closed" clause audits review items that exist; plain-`done` items satisfy it without a review gate, so skipping low-signal reviews never blocks `plan audit`. In a single-agent host this bar rises: a review you close yourself mostly re-runs your own commands, so reserve gates for the riskiest slices — the core implementation and the final live verification — and close the rest with plain `done`.

 The loop never closes its own reviews when the host supports a second agent. Maker and checker stay separate. One agent instance keeps one `PLANR_WORKER_ID` for the whole session — never export a second identity inside the same instance to make reviews look `independent`; an honest `single_agent` stamp beats a fake `independent` one.

@@ -22,7 +22,7 @@ The pick output is one flat work packet — item, links, logs, runtime, recovery
 planr done <item-id> --summary "what changed" --files path-a --files path-b --cmd "exact verification command" --tests "exact test command" --review
 ```

-Put build/serve commands in `--cmd` and test runs in `--tests` — both are recorded as evidence. Single-quote `--files` values that contain `$` (route files like `watch.$videoId.tsx`), or the shell expands them before planr sees them. `done --review` writes the completion log, requests the review, and moves the item to `in_review` (you keep ownership; it is waiting on the gate, not abandoned) — the response names the target's new status and the plan-scoped reviewer pick command; add `--next` to pick the following item in the same call. Without `--review` it closes the item directly (only for items that need no review gate). Running `done` on a ready item you never picked adopts it: the lease is written retroactively under your worker id so the review always has a maker. The response reports what your settlement `unlocked`, echoes the item's post condition, and hints when downstream work depends on an item closed without command/test evidence.
+Put build/serve commands in `--cmd` and test runs in `--tests` — both are recorded as evidence. Include the decisive output line in `--summary` (e.g. "12 tests passed", "GET /videos returned 3 entries"): reviewers see your recorded command strings, not your terminal, so the summary must carry what you observed, not just what you ran. Single-quote `--files` values that contain `$` (route files like `watch.$videoId.tsx`), or the shell expands them before planr sees them. `done --review` writes the completion log, requests the review, and moves the item to `in_review` (you keep ownership; it is waiting on the gate, not abandoned) — the response names the target's new status and the plan-scoped reviewer pick command; add `--next` to pick the following item in the same call. Without `--review` it closes the item directly (only for items that need no review gate). Running `done` on a ready item you never picked adopts it: the lease is written retroactively under your worker id so the review always has a maker. The response reports what your settlement `unlocked`, echoes the item's post condition, and hints when downstream work depends on an item closed without command/test evidence.

 Live verification (browser flow, executed binary, real requests) gets its own log kind so `plan audit` can find it:

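Separately from the live-verification note, the quoting advice in the changed `done` paragraph (single-quote `--files` values that contain `$`, and put the decisive output line in `--summary`) can be illustrated with a small Python sketch that builds the command line using the standard `shlex` module. The helper name `done_command`, the item id `ITEM-1`, and the `npm` commands are hypothetical; the flags are the ones shown in the `planr done` line of this hunk.

```python
import shlex


def done_command(item_id, summary, files, cmd, tests, review=True):
    """Build a `planr done` command line with every value shell-quoted."""
    args = ["planr", "done", item_id, "--summary", summary]
    for path in files:
        args += ["--files", path]
    args += ["--cmd", cmd, "--tests", tests]
    if review:
        args.append("--review")
    # shlex.join single-quotes any value with shell-special characters,
    # so `$videoId` reaches planr literally instead of being expanded.
    return shlex.join(args)


line = done_command(
    "ITEM-1",                                # hypothetical item id
    "watch route renders; 12 tests passed",  # summary carries the observed result
    ["src/routes/watch.$videoId.tsx"],       # hypothetical route file
    "npm run build",                         # hypothetical build command
    "npm test",                              # hypothetical test command
)
print(line)
```

Generating the line this way keeps the quoting decision out of the author's hands: values without special characters stay bare, and anything containing `$`, spaces, or `;` is wrapped in single quotes.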