planr 0.0.1 → 1.1.16

Files changed (63)
  1. package/LICENSE.md +21 -0
  2. package/README.md +150 -0
  3. package/docs/ARCHITECTURE.md +75 -0
  4. package/docs/CI.md +54 -0
  5. package/docs/CLAUDE_CODE.md +33 -0
  6. package/docs/CLI_REFERENCE.md +126 -0
  7. package/docs/CODEX.md +48 -0
  8. package/docs/CURSOR.md +30 -0
  9. package/docs/GOALS.md +155 -0
  10. package/docs/HANDOFFS_AND_STORIES.md +121 -0
  11. package/docs/IMPORT.md +21 -0
  12. package/docs/INSTALL.md +113 -0
  13. package/docs/MCP_CONTRACT.md +70 -0
  14. package/docs/MCP_GUIDE.md +40 -0
  15. package/docs/NPM.md +40 -0
  16. package/docs/OPERATING_MODEL.md +250 -0
  17. package/docs/RELEASE.md +140 -0
  18. package/docs/SECURITY.md +8 -0
  19. package/docs/SKILLS.md +278 -0
  20. package/docs/TASK_GRAPH_MODEL.md +222 -0
  21. package/docs/TESTING.md +87 -0
  22. package/docs/TROUBLESHOOTING.md +26 -0
  23. package/docs/fixtures/mcp-contract.json +92 -0
  24. package/docs/planr-spec/ADRS.md +160 -0
  25. package/docs/planr-spec/AI_SPEC.md +138 -0
  26. package/docs/planr-spec/ANALYTICS_OBSERVABILITY_SPEC.md +124 -0
  27. package/docs/planr-spec/API_AND_DATA_MODEL.md +517 -0
  28. package/docs/planr-spec/BACKEND_IMPLEMENTATION_SPEC.md +178 -0
  29. package/docs/planr-spec/CLIENT_IMPLEMENTATION_SPEC.md +119 -0
  30. package/docs/planr-spec/DESIGN_SYSTEM_SPEC.md +102 -0
  31. package/docs/planr-spec/PRODUCT_SPEC.md +193 -0
  32. package/docs/planr-spec/QA_ACCEPTANCE_TESTS.md +146 -0
  33. package/docs/planr-spec/README.md +67 -0
  34. package/docs/planr-spec/REFERENCES.md +29 -0
  35. package/docs/planr-spec/RELEASE_READINESS.md +95 -0
  36. package/docs/planr-spec/SAFETY_PRIVACY_SECURITY.md +169 -0
  37. package/docs/planr-spec/TASKS.md +932 -0
  38. package/docs/planr-spec/TECH_ARCHITECTURE.md +143 -0
  39. package/docs/planr-spec/UX_FLOWS.md +235 -0
  40. package/docs/planr-spec/V1_1_DIFFERENTIATION_CONTRACT.md +177 -0
  41. package/docs/planr-spec.zip +0 -0
  42. package/npm/bin/planr.js +54 -0
  43. package/npm/native/darwin-arm64/planr +0 -0
  44. package/npm/native/darwin-x86_64/planr +0 -0
  45. package/npm/native/linux-arm64/planr +0 -0
  46. package/npm/native/linux-x86_64/planr +0 -0
  47. package/package.json +27 -8
  48. package/plugins/planr/.claude-plugin/plugin.json +11 -0
  49. package/plugins/planr/.codex-plugin/plugin.json +25 -0
  50. package/plugins/planr/agents/planr-reviewer.md +12 -0
  51. package/plugins/planr/agents/planr-worker.md +10 -0
  52. package/plugins/planr/skills/planr/SKILL.md +52 -0
  53. package/plugins/planr/skills/planr-goal/SKILL.md +69 -0
  54. package/plugins/planr/skills/planr-loop/SKILL.md +114 -0
  55. package/plugins/planr/skills/planr-loop/agents/planr-reviewer.toml +17 -0
  56. package/plugins/planr/skills/planr-loop/agents/planr-worker.toml +14 -0
  57. package/plugins/planr/skills/planr-plan/SKILL.md +58 -0
  58. package/plugins/planr/skills/planr-review/SKILL.md +51 -0
  59. package/plugins/planr/skills/planr-status/SKILL.md +50 -0
  60. package/plugins/planr/skills/planr-summary/SKILL.md +28 -0
  61. package/plugins/planr/skills/planr-task-graph/SKILL.md +228 -0
  62. package/plugins/planr/skills/planr-verify-web/SKILL.md +76 -0
  63. package/plugins/planr/skills/planr-work/SKILL.md +68 -0
@@ -0,0 +1,119 @@
1
+ # Client Implementation Specification
2
+
3
+ ## Client Surfaces
4
+
5
+ - CLI: primary V1 interface.
6
+ - MCP client: primary agent integration.
7
+ - Optional TUI/dashboard: local visual inspection.
8
+ - Install helpers: client-specific setup for Codex, Claude Code, Cursor.
9
+
10
+ ## CLI Requirements
11
+
12
+ - REQ-CLI-001: Every command that mutates state must support `--json`.
13
+ - REQ-CLI-002: Human output must include next actions without hiding failure causes.
14
+ - REQ-CLI-003: Commands must be composable in shell scripts.
15
+ - REQ-CLI-004: `--db` or equivalent must allow alternate database paths.
16
+ - REQ-CLI-005: Planr must derive a stable worker id automatically from client/session context where available.
17
+ - REQ-CLI-006: `planr prompt` must print client-ready CLI, MCP, or HTTP instructions and report that global config was not edited.
18
+ - REQ-CLI-007: Recovery, import, cancellation, and replan commands must support preview-first workflows before destructive mutations.
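One way to read REQ-CLI-007 is as a split between a pure preview step and a confirmed apply step. The sketch below illustrates that split for a stale-pick sweep. `SweepPlan`, `preview_sweep`, `apply_sweep`, and the heartbeat-timestamp model are hypothetical names chosen for illustration, not Planr's actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SweepPlan:
    """Preview of a sweep: what would be released, with nothing changed yet."""
    release: List[str] = field(default_factory=list)


def preview_sweep(picks: Dict[str, float], now: float, timeout: float) -> SweepPlan:
    # Pure function: reads pick heartbeats and mutates nothing.
    stale = sorted(item for item, beat in picks.items() if now - beat > timeout)
    return SweepPlan(release=stale)


def apply_sweep(picks: Dict[str, float], plan: SweepPlan, confirm: bool = False) -> List[str]:
    # The destructive step runs only when the caller confirms explicitly.
    if not confirm:
        return []
    released = []
    for item in plan.release:
        if picks.pop(item, None) is not None:
            released.append(item)
    return released
```

Because the preview returns a plain value, a caller can print it, diff it, or pass it to the apply step unchanged, which keeps the mutation tied to exactly what was previewed.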
19
+
20
+ ## Command Groups
21
+
22
+ ```text
23
+ planr project ...
24
+ planr plan ...
25
+ planr map ...
26
+ planr item ...
27
+ planr pick
28
+ planr log ...
29
+ planr review ...
30
+ planr recover ...
31
+ planr prompt cli|mcp|http
32
+ planr close
33
+ planr context ...
34
+ planr search
35
+ planr doctor
36
+ planr mcp
37
+ planr serve
38
+ ```
39
+
40
+ ## MCP Client Requirements
41
+
42
+ - REQ-CLIENT-010: MCP tools must have stable names and JSON schemas.
43
+ - REQ-CLIENT-011: MCP prompts must expose plan/work/review/map/summary workflows.
44
+ - REQ-CLIENT-012: MCP resources must be read-only.
45
+ - REQ-CLIENT-013: Mutation tools must return compact log summaries and next-action hints.
46
+
47
+ ## Codex Integration
48
+
49
+ V1 must provide:
50
+
51
+ - `planr install codex` or `planr doctor --client codex`.
52
+ - MCP registration instructions compatible with Codex CLI.
53
+ - Optional AGENTS.md snippet.
54
+ - Optional `planr codex run` post-V1 wrapper for `codex exec`.
55
+
56
+ Acceptance:
57
+
58
+ - REQ-CLIENT-020: A Codex user can see the Planr MCP registration command and verify it with `codex mcp list`.
59
+ - REQ-CLIENT-021: Codex prompts must not assume Codex is the only agent in the project.
60
+
61
+ ## Claude Code Integration
62
+
63
+ V1 must provide:
64
+
65
+ - `.mcp.json` project-scoped config example.
66
+ - `claude mcp add` command example where available.
67
+ - Prompt/skill instructions that preserve Planr graph SSOT.
68
+
69
+ Acceptance:
70
+
71
+ - REQ-CLIENT-030: Claude Code install output must explain project vs user scope.
72
+ - REQ-CLIENT-031: Claude Code prompt package must include plan/work/review/map/summary workflows.
73
+
74
+ ## Cursor Integration
75
+
76
+ V1 must provide:
77
+
78
+ - `.cursor/mcp.json` project config example.
79
+ - Global `~/.cursor/mcp.json` example.
80
+ - Cursor Agent usage notes.
81
+
82
+ Acceptance:
83
+
84
+ - REQ-CLIENT-040: Cursor install output must distinguish stdio, SSE, and streamable HTTP options when relevant.
85
+ - REQ-CLIENT-041: Cursor prompts must avoid relying on Codex-only skill behavior.
86
+
87
+ ## Optional TUI/Dashboard
88
+
89
+ If implemented:
90
+
91
+ - read-only by default;
92
+ - optional mutation actions behind confirmation;
93
+ - graph and list views;
94
+ - live event updates;
95
+ - local-only server.
96
+
97
+ The implemented local browser review workspace is served at `/review` with data from `/v1/review-workspace`.
98
+
99
+ ## Offline Behavior
100
+
101
+ All V1 client flows must work without internet once the binary is installed.
102
+
103
+ ## Error Handling
104
+
105
+ Errors must include:
106
+
107
+ - machine-readable code;
108
+ - plain-language message;
109
+ - affected object id/path;
110
+ - suggested next command when safe.
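The four required error fields map naturally onto a small structured payload. The following is a minimal sketch, assuming JSON output. The class name `PlanrError` and the field names `code`, `message`, `target`, and `next_command` are illustrative assumptions, not a confirmed Planr schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PlanrError:
    code: str                           # machine-readable code
    message: str                        # plain-language message
    target: str                         # affected object id or path
    next_command: Optional[str] = None  # suggested only when it is safe to run

    def to_json(self) -> str:
        # Omit the suggestion entirely when there is no safe next command.
        payload = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(payload, sort_keys=True)


err = PlanrError(
    code="invalid_transition",
    message="Item i-api cannot move from pending to closed.",
    target="i-api",
    next_command="planr map show --json",
)
print(err.to_json())
```

Dropping `next_command` when it is unset, rather than emitting `null`, lets scripts treat the key's presence as the signal that a follow-up is safe.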
111
+
112
+ ## Client Tests
113
+
114
+ - CLI golden output tests.
115
+ - JSON schema output tests.
116
+ - MCP tool discovery tests.
117
+ - Config-generation fixture tests for Codex, Claude Code, and Cursor.
118
+ - Prompt output tests for CLI, MCP, HTTP, and per-client wording.
119
+ - Browser workspace smoke tests against localhost.
@@ -0,0 +1,102 @@
1
+ # Design System Specification
2
+
3
+ ## Design Principles
4
+
5
+ - Operational, not decorative.
6
+ - Dense enough for repeated developer use.
7
+ - Log-first: status, blockers, files, and verification are always scannable.
8
+ - Calm hierarchy: avoid dashboards that obscure the next action.
9
+ - Text remains copyable and useful in terminals.
10
+
11
+ ## Brand Tone
12
+
13
+ Planr should sound precise, direct, and practical:
14
+
15
+ - "picked i-api by codex-1"
16
+ - "blocked: t-schema is not done"
17
+ - "review found 2 issues"
18
+
19
+ Avoid hype, gamification, and vague success language.
20
+
21
+ ## Visual Direction
22
+
23
+ - CLI/TUI: high-contrast, compact, status-color restrained.
24
+ - Web dashboard: work-focused, table/graph hybrid, no oversized hero sections.
25
+ - Cards only for repeated items, logs, and item summaries.
26
+ - Radius: 6px maximum unless platform default requires otherwise.
27
+
28
+ ## Color System
29
+
30
+ - Neutral background.
31
+ - Status colors:
32
+ - ready: blue or cyan.
33
+ - running/picked: amber.
34
+ - done: green.
35
+ - blocked/failed: red.
36
+ - review: violet only as accent, not dominant palette.
37
+ - Provide no-color output mode.
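The status palette and the no-color mode can be sketched as a badge helper that always emits the status text and treats color as optional decoration. The ANSI code choices (yellow for amber, magenta for violet), the bracketed label format, and honoring the `NO_COLOR` environment convention are assumptions for illustration. The spec does not state how Planr implements its no-color mode.

```python
from typing import Mapping

# Assumed mapping from the statuses listed above to basic ANSI color codes.
ANSI = {
    "ready": "34",    # blue
    "picked": "33",   # amber, approximated as yellow
    "running": "33",
    "done": "32",     # green
    "blocked": "31",  # red
    "failed": "31",
    "review": "35",   # violet accent, approximated as magenta
}


def color_enabled(env: Mapping[str, str]) -> bool:
    # NO_COLOR convention: any non-empty value disables color.
    return not env.get("NO_COLOR")


def badge(status: str, color: bool = True) -> str:
    # The text label is always present, so color is never the only indicator.
    label = f"[{status}]"
    code = ANSI.get(status)
    if not color or code is None:
        return label
    return f"\x1b[{code}m{label}\x1b[0m"
```

A caller would typically pass `color=color_enabled(os.environ)` so piped and no-color output stays plain text.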
38
+
39
+ ## Typography
40
+
41
+ - CLI: terminal default.
42
+ - Dashboard: system UI font for text, monospace for ids/commands.
43
+ - No viewport-scaled text.
44
+ - Long ids and paths must wrap or truncate with copy affordance.
45
+
46
+ ## Spacing And Layout
47
+
48
+ - Item rows must keep stable height in tables.
49
+ - Graph nodes must have predictable min/max widths.
50
+ - Side panels should show details without covering the primary queue.
51
+ - Mobile dashboard, if implemented, uses stacked list/detail, not dense graph canvas.
52
+
53
+ ## Component System
54
+
55
+ - Status badge.
56
+ - Item row.
57
+ - Dependency edge.
58
+ - Plan link.
59
+ - Log card.
60
+ - Command log block.
61
+ - Review finding row.
62
+ - Agent run row.
63
+ - Search result row.
64
+ - Doctor diagnostic item.
65
+
66
+ ## Motion And Haptics
67
+
68
+ - No required motion.
69
+ - Dashboard transitions must respect reduced motion.
70
+ - Live updates may pulse once but must not animate continuously.
71
+
72
+ ## Iconography And Illustration
73
+
74
+ - Use simple icons only for status, actions, and diagnostics.
75
+ - Do not use decorative illustrations in the product UI.
76
+
77
+ ## Data Visualization Rules
78
+
79
+ - Graph view must distinguish containment from dependency edges.
80
+ - Critical path must be visually separable from general edges.
81
+ - Hidden nodes must be indicated with counts.
82
+ - Graph view must have an equivalent text/table representation.
83
+
84
+ ## Accessibility Requirements
85
+
86
+ - REQ-DES-001: All dashboard actions must be keyboard reachable.
87
+ - REQ-DES-002: Color cannot be the only status indicator.
88
+ - REQ-DES-003: Command blocks must be selectable and copyable.
89
+ - REQ-DES-004: Graph state must be available as text for screen readers.
90
+
91
+ ## Platform-Specific UI Conventions
92
+
93
+ - CLI follows Unix command conventions.
94
+ - MCP prompt names are stable and descriptive.
95
+ - Cursor/Claude/Codex instructions should use each client's standard config paths and avoid hidden global edits unless requested.
96
+
97
+ ## Do Not Do
98
+
99
+ - Do not build a marketing landing page as the product surface.
100
+ - Do not use a single purple gradient theme.
101
+ - Do not hide blockers behind cheerful progress summaries.
102
+ - Do not put UI cards inside cards.
@@ -0,0 +1,193 @@
1
+ # Product Specification
2
+
3
+ ## Vision
4
+
5
+ Planr is the planning and coordination layer coding agents are missing: a local-first system that turns an app idea into a production plan, narrows it into a build plan, and runs the work on a dependency-aware map with log-backed review.
6
+
7
+ ## Product Promise
8
+
9
+ Planr turns broad product ideas and coding work into a coherent flow: product plan -> build plan -> map -> pick -> log -> review/evidence -> recovery/package -> close. It prevents duplicate work through atomic picks and makes closure auditable through logs, reviews, verification, and explicit recovery.
10
+
11
+ ## Target Users
12
+
13
+ - Individual developers running one or more coding agents locally.
14
+ - Power users coordinating Codex, Claude Code, Cursor, Gemini CLI, or custom MCP agents in the same repo.
15
+ - Teams that want repo-local planning artifacts before adopting a hosted workflow.
16
+ - Agent builders who need a small coordination primitive for local or CI-based agent workers.
17
+
18
+ ## Problems Solved
19
+
20
+ - Agents lose context across sessions, compaction, and handoffs.
21
+ - Flat todo lists cannot model dependency order or safe parallelism.
22
+ - Markdown plans have rich context but weak live status and no atomic picking unless connected to a map.
23
+ - Issue trackers are too human-centric for fast, mid-flight agent replanning.
24
+ - Agent completions often lack proof: no files, commands, review, or blocked-state record.
25
+
26
+ ## Product Principles
27
+
28
+ - Local first: the repository plus a local database should be enough.
29
+ - Product plans capture intent, build plans capture implementation context, and the map coordinates live work.
30
+ - Log over optimism: completion is proven, not declared.
31
+ - Cross-agent by default: Codex, Claude Code, Cursor, and MCP clients are peers.
32
+ - Hard-cut bias: avoid duplicate sources of truth and transitional shells.
33
+ - Human-readable artifacts: all important plans and decisions must be inspectable without a proprietary UI.
34
+
35
+ ## V1 Scope
36
+
37
+ - A `planr` CLI.
38
+ - A local SQLite map graph with items, links, picks, contexts, artifacts, logs, reviews, runs, and events.
39
+ - A `.planr/` repo pack for plans, project context, review artifacts, and skill/prompt templates.
40
+ - MCP server exposing tools, resources, and prompts for Claude Code, Cursor, Codex, and compatible clients.
41
+ - Optional HTTP/SSE local server for dashboard and automation clients.
42
+ - Codex, Claude Code, and Cursor install/config helpers.
43
+ - Import of existing `.planr` data.
44
+ - Export/import of map graph and Markdown plan packs.
45
+ - Explicit recovery sweeps for stale, timed-out, and retryable work.
46
+ - Scoped Git/PR review evidence and a local browser review workspace.
47
+ - Reusable template packages with preview-first import.
48
+ - Prompt output for CLI, MCP, and HTTP agent setup without hidden global config edits.
49
+
50
+ ## Explicit Non-Goals
51
+
52
+ - REQ-PROD-001: V1 must not require a cloud account, hosted database, or network service.
53
+ - REQ-PROD-002: V1 must not depend on unowned coordination-layer packages or hosted services.
54
+ - REQ-PROD-003: V1 must not be a general project management SaaS.
55
+ - REQ-PROD-004: V1 must not store full agent transcripts by default.
56
+ - REQ-PROD-005: V1 must not privilege one vendor as the only supported workflow.
57
+
58
+ ## User Personas
59
+
60
+ - Solo operator: runs Codex and Claude Code in one repo and wants clean handoffs.
61
+ - Multi-agent power user: launches several workers and wants atomic picking plus conflict visibility.
62
+ - Reviewer: audits whether an item is actually closed against a plan, diff, log, and tests.
63
+ - Toolsmith: wants MCP/HTTP primitives to embed Planr in another agent system.
64
+
65
+ ## Product Flow
66
+
67
+ The canonical Planr flow is:
68
+
69
+ ```text
70
+ idea -> product plan -> build plan -> map -> pick -> log -> review/evidence -> recovery/package -> close
71
+ ```
72
+
73
+ - Idea: raw user request, startup concept, feature request, bug, refactor, or product slice.
74
+ - Product plan: broad product/spec package for app ideas and major initiatives.
75
+ - Build plan: focused implementation contract for a buildable slice.
76
+ - Map: live dependency graph of executable items.
77
+ - Pick: atomic assignment of one ready item to one agent.
78
+ - Log: proof bundle for implementation, verification, review, or handoff.
79
+ - Review: approval or audit condition that blocks closure until satisfied.
80
+ - Evidence: scoped Git, PR URL, file, command, test, and artifact proof attached to the item.
81
+ - Recovery: explicit preview/apply operation for stale, timed-out, retryable, or condition-gated work.
82
+ - Package: reusable local export/import bundle for graph state, plans, logs, contexts, and review artifacts.
83
+ - Close: log-backed completion of an item or parent slice.
84
+
85
+ ## Core Objects And Vocabulary
86
+
87
+ - Project: one repository or multi-root project tracked by Planr.
88
+ - Plan: Markdown artifact under `.planr/plans/` with an internal stage such as `product`, `build`, or `review`.
89
+ - Map: live graph for work items, links, picks, reviews, logs, and status.
90
+ - Item: graph node with status, work type, owner, acceptance summary, and optional plan links.
91
+ - Link: directed relationship between items; blocking or non-blocking.
92
+ - Context: project-wide discovery, decision, constraint, or pattern.
93
+ - Log: proof bundle produced when an agent implements, verifies, reviews, or hands off work.
94
+ - Run: one agent execution attempt against an item.
95
+ - Review: approval node or policy requiring a log before closure.
96
+
97
+ ## Core User Journeys
98
+
99
+ - Initialize Planr in a repo and configure Codex, Claude Code, and Cursor.
100
+ - Create a plan from a broad app idea or PRD request.
101
+ - Convert product plan slices into build plans.
102
+ - Seed map items from a plan.
103
+ - Pick and execute the next ready item.
104
+ - Run multiple agents concurrently without duplicate picks.
105
+ - Add a log with files, commands, tests, and result summary.
106
+ - Review an item against its plan and create fix/review follow-up items when needed.
107
+ - Inspect item-scoped Git evidence and optional PR URL context before approving work.
108
+ - Recover safely after interruption with explicit stale-pick and retry sweeps.
109
+ - Export a reusable package/template and preview import before mutating a project.
110
+ - Resume after interruption and see the current map, active plans, and blockers.
111
+
112
+ ## Feature Requirements
113
+
114
+ ### Initialization
115
+
116
+ - REQ-PROD-010: `planr project init` must create `.planr/`, `.planr/project/`, `.planr/plans/`, `.planr/reviews/`, and a local database without overwriting user content unless `--force` is provided.
117
+ - REQ-PROD-011: `planr project init --client codex|claude|cursor|all` must print or apply integration instructions for the selected client.
118
+ - REQ-PROD-012: Initialization must detect existing `.planr` data and offer import commands.
119
+
120
+ ### Product Plans
121
+
122
+ - REQ-PROD-020: Product plans must support PRD/product spec, UX flows, design system plan, technical architecture, ADRs, AI spec where relevant, safety/privacy/security, API/data model, implementation specs, QA, release readiness, executable task checklist, and references.
123
+ - REQ-PROD-021: Product plan generation must ask only blocking questions and mark assumptions explicitly.
124
+ - REQ-PROD-022: Product plan requirements must use stable IDs and testable language.
125
+ - REQ-PROD-023: Product plan work lists must be convertible into build plans or map candidate items, but must not automatically become live map commitments without user/agent selection.
126
+
127
+ ### Build Plans
128
+
129
+ - REQ-PROD-030: Build plans must support frontmatter, source, scope decision, ownership target, existing leverage, phases, verification, acceptance criteria, out-of-scope, and notes.
130
+ - REQ-PROD-031: A build plan may be linked to one or more map items.
131
+ - REQ-PROD-032: The project context pack must preserve product, ownership, flows, state SSOT, constraints, and quality checks.
132
+ - REQ-PROD-033: Plan closure claims must be reconciled against map item state and logs.
133
+
134
+ ### Map Planning
135
+
136
+ - REQ-PROD-040: Map items must support statuses: pending, ready, picked, running, in_review, blocked, closed, closed_partial, failed, cancelled.
137
+ - REQ-PROD-041: Links must support hard blocking order and soft contextual relationships.
138
+ - REQ-PROD-042: Item readiness must be computed from map graph state, not from Markdown checkboxes.
139
+ - REQ-PROD-043: Picking a ready item must be atomic across concurrent agents.
140
+ - REQ-PROD-044: Parent items must not close until required child code, fix, and review items are closed.
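REQ-PROD-041 and REQ-PROD-042 together imply that readiness is a function of graph state: an item is ready when every blocking upstream item is closed, and soft links never hold it back. The sketch below shows that computation under stated assumptions. The tuple layout for links, the choice that only `closed` satisfies a blocker, and the `ready_items` name are all hypothetical.

```python
from typing import Dict, List, Tuple

# Assumption: only "closed" satisfies a blocking link in this sketch.
SATISFIED = {"closed"}

# A link is (upstream_id, downstream_id, blocking).
Link = Tuple[str, str, bool]


def ready_items(status: Dict[str, str], links: List[Link]) -> List[str]:
    # An item is held back if any blocking upstream item is not yet satisfied.
    held = {
        down
        for up, down, blocking in links
        if blocking and status[up] not in SATISFIED
    }
    # Soft (non-blocking) links are ignored for readiness.
    return sorted(i for i, s in status.items() if s == "pending" and i not in held)


status = {"t-schema": "closed", "i-api": "pending", "i-ui": "pending", "i-docs": "pending"}
links = [
    ("t-schema", "i-api", True),   # satisfied blocker
    ("i-api", "i-ui", True),       # unsatisfied blocker
    ("i-api", "i-docs", False),    # soft link
]
print(ready_items(status, links))
```

Here `i-api` and `i-docs` come out ready, while `i-ui` waits on `i-api`. No Markdown checkbox state is consulted at any point.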
141
+
142
+ ### Agent Execution
143
+
144
+ - REQ-PROD-050: Planr must provide agent-specific prompts or MCP prompts for Codex, Claude Code, and Cursor.
145
+ - REQ-PROD-051: Runs must record worker id, client, model/profile when available, item id, command surface, start/end time, and result status.
146
+ - REQ-PROD-052: Item closure must require or allow a log entry with files changed, tests run, commands run, result summary, and blocked/unverified items.
147
+ - REQ-PROD-053: Review findings must create fix items rather than failing ordinary code items.
148
+ - REQ-PROD-054: `planr prompt` must expose CLI, MCP, and HTTP operating instructions without editing global configuration.
149
+
150
+ ### Search And Recall
151
+
152
+ - REQ-PROD-060: Planr must search items, plans, contexts, logs, and review artifacts.
153
+ - REQ-PROD-061: Picking an item must surface relevant upstream context and linked plan sections.
154
+ - REQ-PROD-062: Recovery sweep must preview stale, timed-out, retryable, and condition-gated work before applying mutations.
155
+ - REQ-PROD-063: Review evidence must distinguish item-scoped changed files from unrelated dirty worktree files.
156
+ - REQ-PROD-064: Package import must be preview-first and confirmed explicitly.
157
+
158
+ ## Plans, Tiers, Monetization
159
+
160
+ - V1 is an open-source local tool.
161
+ - Commercial hosted sync, team dashboards, or enterprise policy packs are explicitly post-V1.
162
+
163
+ ## Integrations
164
+
165
+ - Codex CLI and Codex MCP configuration.
166
+ - Claude Code MCP project/user configuration.
167
+ - Cursor MCP project/global configuration.
168
+ - Generic MCP clients via stdio and optional streamable HTTP.
169
+ - Git worktrees and Git diff logs.
170
+ - Optional CI invocation for verification tasks.
171
+
172
+ ## Content, Moderation, And Safety Boundaries
173
+
174
+ Planr handles developer artifacts and may reference private code. It must minimize stored content and avoid collecting prompts, responses, source file contents, secrets, or private transcripts unless the user explicitly enables retention.
175
+
176
+ ## Success Metrics
177
+
178
+ - A broad request can become a product plan, build plan, and map items without manual file editing.
179
+ - Two or more agents can pick independent items without collision.
180
+ - A reviewer can determine what changed, why, what was verified, and what remains.
181
+ - A repo can be resumed after interruption using only Planr state and Git state.
182
+ - A fresh consumer project can prove CLI, MCP, HTTP, review workspace, recovery, package, and installer behavior without relying on maintainer-only state.
183
+
184
+ ## Analytics Constraints
185
+
186
+ - V1 analytics are local diagnostics only.
187
+ - No source code, prompt text, response text, file contents, secrets, or private plan body text may be sent to analytics.
188
+
189
+ ## Open Decisions
190
+
191
+ - OD-PROD-001: Final implementation language: Rust is assumed, but Go remains viable.
192
+ - OD-PROD-002: Whether the local HTTP server ships in V1 or behind a feature flag.
193
+ - OD-PROD-003: Whether Codex-specific worker orchestration is V1 or V1.1.
@@ -0,0 +1,146 @@
1
+ # QA Acceptance Tests
2
+
3
+ ## Test Strategy
4
+
5
+ Planr needs fast deterministic tests around graph correctness, plan parsing, MCP contracts, package import/export, and schema upgrades. The smallest relevant test should run before broader suites.
6
+
7
+ ## Unit Tests
8
+
9
+ ### State Machine
10
+
11
+ - REQ-QA-001: Valid transitions succeed.
12
+ - REQ-QA-002: Invalid transitions return `invalid_transition`.
13
+ - REQ-QA-003: Parent item cannot close while required child review is open.
14
+ - REQ-QA-004: Review annotation and feedback ingestion persist item-linked evidence without auto-closing or auto-approving work.
15
+ - REQ-QA-005: Closing a review writes a `.planr/reviews/*.review.md` artifact and registers it as a review artifact.
16
+
17
+ ### Dependency Readiness
18
+
19
+ - REQ-QA-010: Item with no blockers becomes ready.
20
+ - REQ-QA-011: Item with unfinished upstream remains pending.
21
+ - REQ-QA-012: Completing upstream promotes downstream in the same transaction.
22
+ - REQ-QA-013: Soft relationship does not block readiness.
23
+
24
+ ### Claiming
25
+
26
+ - REQ-QA-020: Two concurrent picks for one item produce exactly one winner.
27
+ - REQ-QA-021: Picked item cannot be picked by another agent.
28
+ - REQ-QA-022: Timeout/release policy is explicit and tested.
29
+ - REQ-QA-023: Heartbeat updates move picked work to running and record `last_heartbeat_at`.
30
+ - REQ-QA-024: Progress, pause, and resume preserve the worker claim while updating runtime state.
31
+ - REQ-QA-025: Stale picked work can be detected and intentionally released for re-pick.
32
+ - REQ-QA-026: Requested or denied approvals block close; approved approvals allow close when other gates pass.
33
+ - REQ-QA-027: Recovery sweep previews stale, timed-out, and retryable work before applying changes.
34
+ - REQ-QA-028: Recovery apply records recovery events and preserves manual pre/post condition visibility.
35
+
36
+ ### Plan Parser
37
+
38
+ - REQ-QA-030: Valid `.plan.md` parses frontmatter and sections.
39
+ - REQ-QA-031: Invalid frontmatter records parse error without rewriting file.
40
+ - REQ-QA-032: Unknown frontmatter fields are preserved.
41
+
42
+ ## Integration Tests
43
+
44
+ ### CLI
45
+
46
+ - `planr project init` creates database and `.planr` pack.
47
+ - `planr plan new` creates a plan package.
48
+ - `planr plan split` creates a deterministic build plan file.
49
+ - `planr item create` creates a map item.
50
+ - `planr pick` picks next ready item.
51
+ - `planr log add` creates log.
52
+ - `planr close` closes item and promotes downstream.
53
+ - `planr map show --json` returns valid JSON.
54
+ - `planr recover sweep` previews and applies explicit recovery actions.
55
+ - `planr review evidence` records item-scoped Git evidence without source content.
56
+ - `planr prompt cli|mcp|http` prints setup text and does not edit global config.
57
+ - `planr export` and `planr import --preview/--confirm` preserve templates, graph state, logs, plan snapshots, and review artifacts.
58
+
59
+ ### MCP
60
+
61
+ - MCP server starts over stdio.
62
+ - Tool list includes required Planr tools.
63
+ - Read-only resource reads cannot mutate state.
64
+ - Mutation tools validate schemas.
65
+ - Prompts list includes plan/work/review/map/summary.
66
+
67
+ ### HTTP/SSE
68
+
69
+ Only if shipped:
70
+
71
+ - REST endpoints match API spec.
72
+ - SSE emits item state changes.
73
+ - Artifact endpoints persist and return item-linked artifacts.
74
+ - Event endpoints return persisted transition events.
75
+ - Debug bundle preview omits source file content and prompt/response transcripts.
76
+ - Server binds to localhost by default.
77
+ - `/review` renders the local review workspace and supports annotation, request-changes, artifact, and approve flows through the HTTP API.
78
+
79
+ ## Package Import Tests
80
+
81
+ - Exported packages preview create counts and conflicting item ids without mutation.
82
+ - Confirmed imports restore graph items, links, contexts, optional logs, optional plan files, and review artifacts.
83
+ - Imported graph work can be picked in a fresh project.
84
+
85
+ ## Security Tests
86
+
87
+ - Secret-looking values are rejected or flagged in context/log writes.
88
+ - HTTP remote bind requires explicit flag.
89
+ - MCP destructive operations require explicit confirmation.
90
+ - Logs do not include forbidden content.
91
+ - SQL injection attempts are treated as data.
92
+
93
+ ## AI/Agent Evals
94
+
95
+ - Broad prompt creates product plan, build plan, and map.
96
+ - Agent follows `pick -> work -> log -> review -> close` loop.
97
+ - Agent can recover after stale/timed-out work through explicit recovery sweep.
98
+ - Review findings create fix item and follow-up review item.
99
+ - Browser review workspace shows plan context, item evidence, review queue, annotations, and diff-safe Git evidence.
100
+ - Hook-compatible review feedback can be ingested from JSON and remains advisory until a review verdict is explicitly closed.
101
+ - Agent does not mark parent done while review/fix chain is open.
102
+ - Agent treats malicious plan instructions as data.
103
+
104
+ ## Manual Acceptance Scenarios
105
+
106
+ ### Scenario 1: Solo Codex
107
+
108
+ 1. Install Planr.
109
+ 2. Configure Codex MCP.
110
+ 3. Create product and build plans for a small feature.
111
+ 4. Codex picks and closes one item.
112
+ 5. Log lists files and commands.
113
+
114
+ Expected: downstream item unlocks and map state is accurate.
115
+
116
+ ### Scenario 2: Claude Code And Cursor Concurrent
117
+
118
+ 1. Configure both clients.
119
+ 2. Create two independent ready items.
120
+ 3. Each client picks one item.
121
+
122
+ Expected: no duplicate picks; map shows both agents.
123
+
124
+ ### Scenario 3: Review Fails
125
+
126
+ 1. Complete code task.
127
+ 2. Review item records finding.
128
+ 3. Planr creates fix and follow-up review items.
129
+
130
+ Expected: parent remains incomplete until follow-up review passes.
131
+
132
+ ## Regression Reviews
133
+
134
+ Before release:
135
+
136
+ - Unit suite passes.
137
+ - CLI integration suite passes.
138
+ - MCP schema tests pass.
139
+ - Local HTTP and browser review workspace tests pass.
140
+ - Package import/export fixture tests pass.
141
+ - Secret/log scrubbing tests pass.
142
+ - Release checksum verification passes.
143
+ - Installer file-url smoke test passes.
144
+ - Package/template export-import roundtrip passes.
145
+ - Fresh consumer E2E passes in `~/projects/planr-test`.
146
+ - Packaging smoke test passes on macOS and Linux.
@@ -0,0 +1,67 @@
1
+ # Planr Specification Package
2
+
3
+ Generated: 2026-06-09
4
+
5
+ ## Purpose
6
+
7
+ This package defines Planr as a production-grade, local-first planning and execution coordination tool for coding agents. Planr combines product plans, build plans, and a live dependency-aware map with reviewable logs and integration surfaces for Codex, Claude Code, Cursor, and other MCP-capable agents.
8
+
9
+ ## Package Contents
10
+
11
+ - PRODUCT_SPEC.md: product scope, flow, users, features, non-goals, and requirements.
12
+ - UX_FLOWS.md: CLI, MCP, and optional dashboard flows.
13
+ - DESIGN_SYSTEM_SPEC.md: UI and TUI design direction.
14
+ - TECH_ARCHITECTURE.md: system architecture and core ownership boundaries.
15
+ - ADRS.md: major product and architecture decisions.
16
+ - AI_SPEC.md: agent behavior, prompts, context, and evals.
17
+ - SAFETY_PRIVACY_SECURITY.md: data, privacy, security, and tool execution boundaries.
18
+ - API_AND_DATA_MODEL.md: project, plan, map, item, log, review, and API contracts.
19
+ - CLIENT_IMPLEMENTATION_SPEC.md: CLI, TUI, editor, and agent-client surfaces.
20
+ - BACKEND_IMPLEMENTATION_SPEC.md: local service, MCP server, HTTP/SSE, and storage implementation.
21
+ - ANALYTICS_OBSERVABILITY_SPEC.md: no-content telemetry, local logs, and diagnostics.
22
+ - QA_ACCEPTANCE_TESTS.md: acceptance suites and regression strategy.
23
+ - RELEASE_READINESS.md: packaging, install, upgrade, and release checks.
24
+ - V1_1_DIFFERENTIATION_CONTRACT.md: V1.1 product differentiation requirements, acceptance criteria, and final verification contract.
25
+ - TASKS.md: executable implementation tasks for coding agents.
26
+ - REFERENCES.md: sources and source-derived assumptions.
27
+
28
+ ## Global Assumptions
29
+
30
+ - V1 is a local-first developer tool, not a hosted SaaS.
31
+ - The first implementation may be a Rust CLI and local daemon using SQLite.
32
+ - Codex, Claude Code, and Cursor should all work through standard CLI and MCP integration paths.
33
+ - Existing `.planr` data may be imported, but the V1 CLI, data model, and docs define the final product API.
34
+ - Planr should ship with its own name, code, docs, and architecture.
35
+
36
+ ## Global Non-Goals
37
+
38
+ - Do not build cloud accounts, billing, team sync, or hosted dashboards in V1.
39
+ - Do not depend on unowned coordination-layer code.
40
+ - Do not make the map graph the only planning artifact; product and build plans are first-class.
41
+ - Do not store full prompts, private code, or agent transcripts by default.
42
+ - Do not make Codex the only supported client.
43
+
44
+ ## Critical Invariants
45
+
46
+ - The map database is the source of truth for item state, links, picks, approvals, reviews, and closure.
47
+ - Product plans are the source of PRD, architecture, UX, security, QA, and release context.
48
+ - Build plans are the source of implementation-level scope, ownership, acceptance criteria, and narrative decisions.
49
+ - Agent closures require a log covering changed files, commands run, review outcome, and remaining blockers.
50
+ - Multiple agents must not pick the same ready item.
51
+ - Every external command/tool execution path must be explicit, inspectable, and attributable.
52
+ - Parent items are completion gates. Executable code, test, and review work should live in child items, and downstream work should depend on the parent gate when review cleanliness matters.
53
+ - Story logs and handoff documents explain narrative history only. They never override map state, logs, reviews, or plan acceptance criteria.
54
+
55
+ ## How Coding Agents Should Use This Package
56
+
57
+ 1. Read PRODUCT_SPEC.md, TECH_ARCHITECTURE.md, API_AND_DATA_MODEL.md, and TASKS.md first.
58
+ 2. Use UX_FLOWS.md and CLIENT_IMPLEMENTATION_SPEC.md when implementing the CLI, TUI, MCP prompts, or dashboard.
59
+ 3. Use AI_SPEC.md for agent prompt, context, and review-loop behavior.
60
+ 4. Treat TASKS.md acceptance criteria as the build checklist.
61
+ 5. Keep implementation aligned with ADRS.md unless a new ADR supersedes a decision.
62
+
63
+ ## Operating References
64
+
65
+ - `../OPERATING_MODEL.md`: daily operator flow, parent gates, completion rules, and recovery.
66
+ - `../TASK_GRAPH_MODEL.md`: map objects, readiness, links, picks, reviews, and evidence.
67
+ - `../HANDOFFS_AND_STORIES.md`: log, context, note, story, and handoff policy.
@@ -0,0 +1,29 @@
1
+ # References
2
+
3
+ ## Official External Sources
4
+
5
+ - OpenAI Codex CLI help on this machine, checked 2026-06-09 with `codex --help`, `codex exec --help`, `codex review --help`, and `codex mcp --help`.
6
+ - OpenAI Docs MCP: https://platform.openai.com/docs/docs-mcp
7
+ - Used for the current Codex MCP setup concept and the note on shared Codex/IDE MCP configuration.
8
+ - OpenAI Codex CLI Help Center: https://help.openai.com/en/articles/11096431
9
+ - Used for positioning Codex CLI as a local coding agent with an approval workflow.
10
+ - Claude Code MCP docs: https://code.claude.com/docs/en/mcp
11
+ - Used for Claude Code MCP integration and project/user configuration assumptions.
12
+ - Cursor MCP docs: https://docs.cursor.com/context/model-context-protocol
13
+ - Used for Cursor MCP transports, `.cursor/mcp.json`, global config, and security guidance.
14
+ - Model Context Protocol docs: https://modelcontextprotocol.io/specification/draft/server/prompts
15
+ - Used for MCP prompt behavior and prompt security requirements.
16
+ - Model Context Protocol docs: https://modelcontextprotocol.io/specification/2025-03-26/server/tools
17
+ - Used for MCP tool behavior and model-controlled tool concepts.
18
+ - Model Context Protocol docs: https://modelcontextprotocol.io/specification/draft/client/elicitation
19
+ - Used for sensitive information constraints and elicitation safety assumptions.
20
+
21
+ ## Source Freshness Notes
22
+
23
+ - MCP and coding-agent client behavior changes frequently. Re-check Codex, Claude Code, Cursor, and MCP docs before implementing install helpers or promising exact config commands.
24
+ - The spec intentionally prefers stable integration concepts over vendor-specific hidden config internals.
25
+
26
+ ## Product Independence Notes
27
+
28
+ - Planr docs, command names, assets, and implementation should remain original.
29
+ - Any retained code or asset must have its license recorded before release.