npm - waypoint-codex - Versions diffs - 0.19.1 → 0.20.0 - Mend

waypoint-codex 0.19.1 → 0.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (6) hide show

package/package.json +1 -1
package/templates/.agents/skills/planning/SKILL.md +10 -0
package/templates/.agents/skills/work-tracker/SKILL.md +4 -2
package/templates/.waypoint/agent-operating-manual.md +66 -128
package/templates/.waypoint/docs/code-guide.md +1 -0
package/templates/managed-agents-block.md +10 -99

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "waypoint-codex",
-  "version": "0.19.1",
+  "version": "0.20.0",
   "description": "Make Codex better by default with stronger planning, code quality, reviews, tracking, and repo guidance.",
   "license": "MIT",
   "type": "module",

package/templates/.agents/skills/planning/SKILL.md CHANGED Viewed

@@ -102,12 +102,16 @@ Do not plan from abstractions alone. Ground major decisions in actual files.
 Plans document your understanding. Include what matters for this task:
 - **Current State**: What exists today — relevant files, data flows, constraints, existing patterns
+- **Legacy seam inventory**: Every read path, write path, sync or worker path, route contract, frontend consumer, event payload, fixture, and test surface that still depends on the legacy shape
 - **Changes**: Every file to create/modify/delete, how changes connect
+- **Removals**: What obsolete code, compatibility logic, unused files, debug logs, dead props, or stale branches will be deleted as part of the change
 - **Decisions**: Why this approach, tradeoffs, assumptions
 - **Phase breakdown**: Distinct execution phases in the order they should happen
 - **Scope checklist**: Concrete implementation items that can be marked done or not done
 - **Acceptance criteria**: What must be true when each phase is "done"
 - **Phase checkpoints**: What verification, reviewer passes, tests, typechecks, builds, or manual QA must pass before moving to the next phase
+- **Grep gates**: Exact searches that must return clean before a phase is review-ready or complete
+- **Cleanup expectations**: What legacy or replaced paths must be removed before the work can be called complete
 - **Test cases**: For behavioral changes, include input -> expected output examples
 - **Non-Goals**: Explicitly out of scope to prevent implementation drift
@@ -132,6 +136,9 @@ Before presenting the plan, verify against real code:
 - No pretending you verified something you didn't
 - Approved scope must be explicit enough to act as an execution contract after user approval
 - The plan must be explicit enough to support phase-by-phase execution and checkpoints without rediscovering the intended order in chat
+- Migration and refactor plans should include a legacy seam inventory before implementation starts
+- Migration and refactor phases should include exact grep gates for the legacy symbols being removed
+- Refactor and replacement plans should explicitly call out what legacy or obsolete code will be removed instead of preserving it by default
 - If the user approves the plan, do not silently defer or drop checklist items later; discuss any proposed scope change first
 If the change touches durable project behavior, include docs/workspace updates in the plan.
@@ -175,6 +182,9 @@ If the plan would make the implementer ask "where does this hook in?" or "what e
 - Do not spend interview turns on implementation facts that are already in the code or routed docs.
 - Do not stop exploring just because you have a plausible plan. The usual failure mode is shallow repo understanding.
 - Do not leave unresolved architecture or product decisions hidden behind "we can figure that out during implementation."
+- Do not plan a refactor as a rewrite-plus-compatibility-preservation bundle unless compatibility is explicitly required. Call out what should be deleted.
+- Do not skip the legacy seam inventory for migrations or refactors and then rediscover dependencies one tiny edit at a time during implementation.
+- Do not leave grep gates implicit. Name the exact legacy symbols or shapes that must be gone before the phase can move forward.
 - Do not dump a transcript into the plan doc. Distill the decisions and requirements into a clean implementation handoff.
 - Do not treat a reviewed plan as a stopping point. Once the user approves it, the workflow expects execution to continue.

package/templates/.agents/skills/work-tracker/SKILL.md CHANGED Viewed

@@ -85,7 +85,9 @@ A good tracker usually includes:
 - `Current State`
 - `Next`
 - `Workstreams`
+- `Legacy Seam Inventory` when the work is a migration or refactor
 - `Phase Checkpoints`
+- `Grep Gates` when legacy symbols must be driven to zero by phase
 - `Verification`
 - `Decisions`
 - `Notes`
@@ -105,8 +107,8 @@ The tracker should answer "what exactly is happening across the whole workstream
 - Update `last_updated` whenever you materially change the tracker.
 - Keep task lists or status entries current instead of deleting history. Mark completed checkbox items as `- [x]`, and update status-style entries when the phase or state changes.
-- Add blockers, new tasks, and verification status as the work evolves.
-- Update the tracker during the work, not only at the end. If a milestone, blocker, review round, or verification result changed reality, the tracker should already reflect it.
+- Add blockers, new tasks, verification status, and grep-gate status as the work evolves.
+- Update the tracker during the work, not only at the end. If a milestone, blocker, inventory discovery, review round, grep gate, or verification result changed reality, the tracker should already reflect it.
 - When the workstream finishes, set `status: done` or `status: archived`.
 Do not let the tracker become fiction. It must match reality.

package/templates/.waypoint/agent-operating-manual.md CHANGED Viewed

@@ -2,19 +2,14 @@
 This repository uses Waypoint as its operating system for Codex.
+These instructions are mandatory in this repo and override weaker generic guidance unless the user says otherwise.
 If the repo needs custom AGENTS guidance, write it outside the managed `waypoint:start/end` block in `AGENTS.md`. Treat the managed block as Waypoint-owned and replaceable.
-Treat repo guidance as the authoritative repo-local contract, but not as something that can outrank higher-priority system or developer instructions injected by the environment.
-If a higher-priority instruction conflicts with Waypoint workflow, do not silently ignore the repo rule or pretend it happened anyway. State the conflict plainly, explain the practical consequence, and ask the user for the missing authorization or use a compliant fallback path.
+If a higher-priority instruction conflicts with Waypoint workflow, do not pretend the repo rule happened anyway. State the conflict plainly, explain the consequence, and ask the user for direction when needed.
 ## Session start
-Run the Waypoint bootstrap only:
-- at the start of every new session
-- immediately after a compaction
-- when the user explicitly asks for a rerun
-Bootstrap sequence:
+Run the Waypoint bootstrap only at session start, after compaction, or when the user explicitly asks for it:
 1. Run `node .waypoint/scripts/prepare-context.mjs`
 2. Read `.waypoint/SOUL.md`
@@ -24,148 +19,91 @@ Bootstrap sequence:
 6. Read `.waypoint/context/MANIFEST.md`
 7. Read every file listed in that manifest
-Do not skip this sequence.
-- Do not skip it at new-session start or after compaction.
-- Do not rerun it mid-conversation just because a task becomes more substantial.
-- Earlier chat context or partial memory from the current session does not count as a substitute when a new session starts or a compaction happens.
-- If you are unsure whether a new session started or a compaction happened, rerun it instead of guessing.
 ## Repository memory model
 The repository should contain the context the next agent needs.
-- user-scoped `AGENTS.md` is the durable cross-project guidance layer: collaboration preferences, personal workflow rules, and stable defaults that should apply across repos
-- the repo root `AGENTS.md` is the project-scoped guidance layer: repo-specific context, constraints, and durable rules for this project
-- `.waypoint/WORKSPACE.md` is the live operational record: in progress, current state, next steps
-- `.waypoint/ACTIVE_PLANS.md` is the live execution-contract record: which approved plan is active, which phase is active, and what checkpoint must pass before moving on
-- `.waypoint/track/` is the optional execution-tracking layer for active long-running work that genuinely needs durable progress state
-- `.waypoint/docs/` is the durable project memory: architecture, decisions, integration notes, and debugging knowledge
-- `.waypoint/plans/` is the durable planning layer: implementation plans, rollout plans, migration plans, and other task-shaped plans worth keeping
-- `.waypoint/context/` is the generated session context bundle: current git/PR/doc/track index state
-If something important lives only in your head or in the chat transcript, the repo is under-documented.
-## Working rules
-- Read code before editing it.
-- Follow the repo's documented patterns when they are healthy.
-- If the user approves a plan or explicitly tells you to proceed, treat that as authorization to finish the approved work end to end.
-- Once a plan is approved, treat its scope and acceptance criteria as the execution contract. Do not silently narrow, defer, or drop approved work because the system feels good enough, the remaining work looks less valuable, or you would prefer a smaller PR. If the scope should change, discuss that with the user first unless a real blocker, hidden-risk decision, or explicit user redirect forces the change.
-- When the user shows a bug, screenshot, or broken behavior, investigate first. Lead with what is happening, why it is likely happening, the important options or tradeoffs if they matter, what you checked, and what you are doing next.
-- After investigation, explain the diagnosis before jumping into implementation whenever the cause, tradeoffs, or solution shape are not already obvious.
-- Fix underlying causes instead of papering over symptoms. If the real fix requires changing a shaky abstraction, deleting stale compatibility logic, or cleaning up debt that is directly causing the bug, do that work instead of shipping a hot patch around it.
-- Do not stop at the first local patch that makes the symptom disappear if the root cause is still obviously in place.
-- Do not lead with readiness disclaimers such as "I can't call this done yet" unless the user explicitly asked whether the work is ready, shippable, or complete.
-- Keep communication concise by default. Lead with the answer, diagnosis, decision, or next step, and include only the most important supporting detail unless the user asks for more.
-- Honesty means accurate diagnosis, explicit uncertainty, and clear verification limits. It does not mean substituting process language for investigation.
-- Before making meaningful frontend or backend decisions, inspect the available user-scoped and project-scoped `AGENTS.md` guidance. If the task depends on frontend or backend context that is missing from the project-scoped guidance and routed docs, use the corresponding `*-context-interview` skill to fill that gap instead of guessing.
-- Persist corrections and newly learned context in the right durable layer instead of defaulting to `AGENTS.md`.
-- Update the user-scoped `AGENTS.md` only for true cross-project standing preferences or global operating rules.
-- Update the project-scoped repo `AGENTS.md` only for durable repo truth, project constraints, or stable project-wide rules that should always apply in this repo.
-- If the correction is workflow-specific or method-specific, update the relevant skill instead. If no existing skill owns it well, propose creating one instead of stuffing that guidance into `AGENTS.md`.
-- Treat `.waypoint/WORKSPACE.md` as mandatory live execution state, not end-of-task paperwork.
-- Treat `.waypoint/ACTIVE_PLANS.md` as mandatory live plan state for approved work, not a forgotten side file.
-- Update `.waypoint/WORKSPACE.md` during the work whenever the active goal, phase, next step, blocker, verification state, or review state materially changes. In multi-topic sections, prefix new or materially revised bullets with a local timestamp like `[2026-03-06 20:10 PST]`.
-- Update `.waypoint/ACTIVE_PLANS.md` whenever the active approved plan, current phase, phase checklist, checkpoint, or approved scope changes.
-- Do not wait until final handoff to update workspace or active plan state. If the work took enough effort that the next agent would benefit from a current snapshot, those files should already say so.
-- For any non-trivial multi-step work, any work likely to span sessions, any work with a meaningful checklist, or any workstream that has already accumulated review/QA follow-ups, create or update a tracker in `.waypoint/track/`, keep detailed execution state there during the work, and point at it from `## Active Trackers` in `.waypoint/WORKSPACE.md`.
-- If a tracker exists for the workstream, maintain it as the work evolves instead of updating it only at the end.
-- Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when durable plans change, and refresh each changed routable doc's `last_updated` field.
-- Rebuild `.waypoint/DOCS_INDEX.md` whenever routable docs change.
-- Rebuild `.waypoint/TRACKS_INDEX.md` whenever tracker files change.
-- Keep most work in the main agent. Use skills, trackers, and reviewer agents when they clearly add leverage, not as default ceremony.
-- Let skills carry their own invocation guidance. The always-on contract should only keep the high-level rule: use repo-local skills deliberately when they help the current task.
-- Use the repo-local skills and reviewer agents deliberately, but do not underuse them on work that is expensive to get wrong.
-- When spawning reviewer agents or other subagents, explicitly set `fork_context: false` and choose the model/reasoning pair that matches the risk and importance of the second pass.
-- For non-trivial work, strongly prefer reviewer-agent passes at phase checkpoints, before opening or materially updating a PR, after fixing substantial findings, and before final closeout when the environment allows those agents to run.
-- Do not run heavyweight reviewer or full-suite loops after every tiny edit. Batch related work into the current approved phase, then run the checkpoint.
-- If you created a PR earlier in the current session and need to push more work, first confirm that PR is still open. If it is closed, create a fresh branch from `origin/main` and open a fresh PR instead of pushing more commits to the old PR branch.
-- Treat reviewer agents as one-shot workers: once a reviewer returns findings, read the result and close it. If another review pass is needed later, spawn a fresh reviewer instead of reusing the same thread.
-- If `code-reviewer` or `code-health-reviewer` surface anything more serious than optional polish, fix the findings, rerun the relevant verification, and launch fresh passes until the remaining feedback is only nitpicks or none.
-- Do not kill long-running subagents or reviewer agents just because they are slow.
-- When waiting on reviewers, subagents, CI, automated review, or external jobs that you deliberately chose to start, wait as long as required. There is no fixed timeout where waiting itself becomes the problem.
-- Never interrupt in-flight work just to force a partial result, salvage something quickly, or avoid making the user wait longer.
-- Only stop waiting when the work has actually finished, clearly failed, or the user explicitly redirects the work.
-- When browser work, app inspection, or other interactive UI work is part of reproduction, inspection, or verification, send screenshots of the relevant UI states to the user so they can visually confirm what you observed.
-- Capture the states that matter, such as the broken state, the fixed state, or an important intermediate state that explains the issue.
-- If the current environment cannot provide screenshots, state that explicitly instead of silently omitting visual evidence.
-- When an explanation would be clearer visually, prefer Mermaid diagrams directly in chat for flows, architecture, state, and plans instead of over-explaining in prose.
-## Execution autonomy
-Once the user has approved a plan or otherwise told you to continue, own the work until the slice is genuinely complete.
-Execute approved work phase by phase. Complete the current phase, run the relevant checkpoint, fix findings, rerun verification, and only then move to the next phase.
-That means:
-- continue through implementation, verification, and required repo-memory updates without asking for incremental permission
-- use commentary for short progress updates, not as a handoff back to the user
-- do not stop just to announce the next obvious step and ask whether to do it
+- user-scoped `AGENTS.md` holds cross-project preferences and standing rules
+- the repo root `AGENTS.md` holds durable repo context and always-on project rules
+- `.waypoint/WORKSPACE.md` holds live execution state
+- `.waypoint/ACTIVE_PLANS.md` holds the currently approved plan and active phase checkpoints
+- `.waypoint/track/` holds detailed execution state for long-running work
+- `.waypoint/docs/` holds durable project knowledge
+- `.waypoint/plans/` holds durable plan docs
+- `.waypoint/context/` holds the generated bootstrap bundle
-Pause only when:
+General always-on workflow rules belong in `AGENTS.md` and this manual, not only in optional helper skills. Workflow-specific method detail belongs in the relevant skill. After bootstrap, use progressive disclosure: read only the routed docs and files the current task actually needs.
-- a real blocker prevents forward progress
-- a hidden-risk or non-obvious decision would materially change scope, behavior, cost, or data safety
-- you want to change approved scope or defer approved work
-- the user explicitly redirects, pauses, or narrows the work
-If none of those are true, keep going.
+## Working defaults
-## Documentation expectations
+- Fix root causes instead of papering over symptoms.
+- The default for refactors and migrations is contract replacement, not compatibility preservation.
+- Delete obsolete code aggressively. Remove stale branches, dead props, debug logs, compatibility scaffolding, unused files, and legacy paths as part of the change.
+- Large destructive edits are allowed when they are the clearest route to the approved target state.
+- Temporary breakage inside the active phase is acceptable. Do not stop there; finish the phase back to green.
+- Do the level of cleanup the real codebase requires. Do not use generic “stay narrow” heuristics to avoid the root-cause fix.
+- Keep communication concise and direct.
-Document the things the next agent cannot safely infer from raw code alone:
+## Execution contract
-- architecture and boundaries
-- decisions and tradeoffs
-- integration behavior
-- invariants and constraints
-- debugging and operational knowledge
-- durable plans when they are useful beyond the current chat
+Once the user approves a plan or tells you to proceed, that approved scope is the execution contract.
-Do not document every trivial implementation detail. Document the non-obvious, durable, or operationally important parts.
+- Do not silently narrow, defer, or drop approved work.
+- Execute approved work phase by phase.
+- `WORKSPACE.md` and `ACTIVE_PLANS.md` are live files, not paperwork. Update them during the work, not after.
+- Once scope is approved, execute it without asking for permission again for obvious implementation steps, cleanup, validation, or documentation that the approved work requires.
+- Before reporting completion, reread `ACTIVE_PLANS.md`, the active tracker if one exists, `WORKSPACE.md`, and the relevant routed docs, then compare reality against the approved scope, current phase checkpoint, and acceptance criteria.
+- If that reread shows the work is not complete, keep going.
-## When to use the agent pack
+Pause only when:
+- a real blocker prevents forward progress
+- a hidden-risk or non-obvious decision would materially change scope, behavior, cost, or data safety
+- you want to change approved scope or defer approved work
+- the user explicitly redirects, pauses, or narrows the work
-Waypoint scaffolds these focused specialists by default:
+## Refactors and migrations
-- `code-reviewer` for correctness and regression review
-- `code-health-reviewer` for maintainability drift
-- `plan-reviewer` to challenge non-trivial implementation plans when an independent second pass would materially improve the result
+For migrations and refactors:
-## Plan Review
+- do a legacy seam inventory before implementation
+- list every remaining read path, write path, sync or worker path, route contract, frontend consumer, event payload, fixture, and test surface that still depends on the legacy shape
+- give each phase exact grep gates for the legacy symbols being removed
+- do not treat the phase as review-ready or complete while those grep gates still fail
+- when a contract cut changes the expected shape broadly, rewrite affected tests and fixtures in bulk for the new contract instead of dragging legacy test shapes forward through micro-edits
-Plan review is available when an independent challenge would materially improve a non-trivial plan.
+## Review and verification
-- Skip it for tiny obvious plans or when no plan will be presented.
-- Use a fresh `plan-reviewer` agent for each pass. After you read its findings, close it instead of reusing the old reviewer thread.
-- Strengthen the plan when the reviewer surfaces real issues, but do not turn plan review into mandatory ceremony for every non-trivial task.
+Reviewer passes are for real phase boundaries, PR-ready handoff, and final closeout.
-## Review Loop
+- Do not run heavyweight review loops after every tiny edit.
+- Do not spend reviewer passes on a phase that still has known legacy seams or grep-gate failures.
+- If reviewers find meaningful issues, fix them, rerun the relevant verification, and use fresh passes when needed.
-Deliberate closeout review is available when you want a second pass for ship-readiness, a final review request, or a risky change. It is a tool, not the default voice of the system.
+Verify your own work before reporting completion.
-- If you use it, follow the skill's loop fully: define the reviewable slice, run the needed reviewers, wait for the outputs, fix meaningful findings, and rerun fresh passes when warranted.
-- Treat reviewer agents as one-shot workers. Once a reviewer returns its findings, read the result and close it.
-- Do not widen the scope casually; keep the second pass anchored to the slice you are actually trying to validate.
-- For non-trivial work, the healthy default is to use reviewer passes at phase checkpoints instead of saving all second-pass scrutiny for the very end.
+- Match verification to the real risk surface.
+- For UI work, inspect the real UI and send screenshots when the environment allows it.
+- For backend or scripts, run code and inspect real output.
+- If the result is still incomplete or unproven, keep going.
-## Quality bar
+## State and documentation
-- No silent assumptions
-- No fake verification
-- No hiding behind process language when a useful diagnosis is possible
-- No silent scope reduction after plan approval
-- No skipping docs or workspace updates when they matter
-- No broad scope creep under the banner of "while I'm here"
+- Update `WORKSPACE.md` whenever the active goal, phase, next step, blocker, checkpoint state, or verification state materially changes.
+- Update `ACTIVE_PLANS.md` whenever the active approved plan, current phase, checkpoint, or approved scope changes.
+- Use `.waypoint/track/` only when `WORKSPACE.md` is no longer enough to keep the work legible.
+- Persist corrections in the right durable layer instead of defaulting to `AGENTS.md`.
+- Update user-scoped `AGENTS.md` only for true cross-project standing rules.
+- Update repo `AGENTS.md` only for durable repo-wide context or always-on project rules.
+- Put workflow-specific corrections in the relevant skill, not in `AGENTS.md`.
+- Update `.waypoint/docs/` for durable project knowledge and `.waypoint/plans/` for durable plan changes.
-## Final test
+## Final quality bar
 Before wrapping up, ask:
-Did I solve the user's actual problem or clearly explain what remains and why?
-Can the next agent understand what is going on by reading the repo?
+- Did I solve the user's actual problem?
+- Did I finish the approved phase cleanly?
+- Are the live state files current?
+- Would the next agent understand what is going on from the repo?
-If either answer is no, keep going.
+If not, keep going.

package/templates/.waypoint/docs/code-guide.md CHANGED Viewed

@@ -21,6 +21,7 @@ Write code that keeps behavior explicit, failure visible, and the next change ea
 Do not preserve old behavior unless a user-facing requirement explicitly asks for it.
 - Remove replaced paths instead of leaving shims, aliases, or silent compatibility branches.
+- When a refactor or replacement makes code obsolete, delete the obsolete files, dead props, debug logs, and stale branches as part of the change instead of keeping them around.
 - Do not keep dead fields, dual formats, or migration-only logic "just in case."
 - If compatibility must stay, document the exact contract being preserved and the removal condition.

package/templates/managed-agents-block.md CHANGED Viewed

@@ -3,18 +3,13 @@
 This repository uses Waypoint as its Codex operating system.
+These instructions are mandatory in this repo and override weaker generic guidance unless the user says otherwise.
 Waypoint owns only the text inside these `waypoint:start/end` markers.
 If you need repo-specific AGENTS instructions, write them outside this managed block.
 Do not put durable repo guidance inside the managed block, because `waypoint init` may replace it during upgrades.
-Stop here if the bootstrap has not been run yet.
-Run the Waypoint bootstrap only in these cases:
-- at the start of a new session
-- immediately after a compaction
-- if the user explicitly tells you to rerun it
-Bootstrap sequence:
+Run the Waypoint bootstrap only at session start, after compaction, or when the user explicitly asks for it:
 1. Run `node .waypoint/scripts/prepare-context.mjs`
 2. Read `.waypoint/SOUL.md`
 3. Read `.waypoint/agent-operating-manual.md`
@@ -23,101 +18,17 @@ Bootstrap sequence:
 6. Read `.waypoint/context/MANIFEST.md`
 7. Read every file listed in the manifest
-This is mandatory, not optional.
-- Do not skip it at session start or after compaction.
-- Do not rerun it mid-conversation just because a task is substantial.
-- Earlier chat context or earlier work in the session does not replace the bootstrap when a new session starts or a compaction happens.
-- If you are not sure whether a new session started or a compaction happened, rerun it.
-- Do not skip the context refresh or skip files in the manifest.
-Before making meaningful implementation, review, architectural, or tradeoff decisions, inspect the project root guidance files for persisted project context.
-Project guidance rules:
-- Distinguish user-scoped guidance from project-scoped guidance.
-- User-scoped `AGENTS.md` applies across projects and holds durable personal preferences, workflow rules, and collaboration defaults for this user.
-- Prefer `AGENTS.md` in the project root if present.
-- The project root `AGENTS.md` is project-scoped and should hold repo-specific context, constraints, standards, and durable project truth.
-- Look for context sections relevant to the task, including `## Project Context`, `## Frontend Context`, and `## Backend Context`.
-- Treat relevant context sections as active inputs to decision-making, not passive documentation.
-- Apply that context to scope, architecture, implementation depth, review standards, risk tolerance, testing strategy, compatibility expectations, rollout caution, and UX/product quality bar.
-Examples of durable context that can materially change the correct approach:
-- internal tool vs public internet-facing product
-- expected scale, criticality, and usage patterns
-- regulatory, privacy, or compliance requirements
-- browser and device support expectations
-- accessibility expectations
-- SEO requirements
-- tenant model and authorization model
-- backward compatibility requirements
-- reliability and observability expectations
-- security posture assumptions
+Before major implementation or architecture changes, check the repo guidance and routed docs for durable context. Ask only the missing high-leverage questions.
-If relevant context is missing, empty, stale, or insufficient and that gap would materially change the correct approach:
-- do not guess silently
-- if the task touches frontend and the needed frontend project context is not present in `AGENTS.md` or routed docs, use `frontend-context-interview`
-- if the task touches backend and the needed backend project context is not present in `AGENTS.md` or routed docs, use `backend-context-interview`
-- ask only the missing high-leverage questions
-- ask about the project, deployment reality, and operating constraints rather than the concrete feature
-- persist only durable context back into the project guidance file
-- do not write transient task-specific details into context sections
+Once the user approves a plan or tells you to proceed, that approved scope is the execution contract. Do not silently narrow, defer, or drop approved work unless a real blocker or decision requires discussion.
-If some uncertainty still remains after checking persisted context and interviewing:
-- proceed with explicit assumptions
-- state those assumptions clearly in the work output or review
-- do not present guesses as established project context
+`WORKSPACE.md` is the live execution log. `ACTIVE_PLANS.md` is the live execution contract. Keep them current at phase boundaries, checkpoint changes, blockers, and before handoff.
-Prefer existing persisted context over re-interviewing the user.
+Refactor and migration default: use direct replacement, not compatibility scaffolding, unless the user or project docs explicitly require coexistence. Delete obsolete code aggressively and finish the phase back to green. Large destructive edits are allowed when they are the clearest path to the approved target state.
-If the user approves a plan or explicitly tells you to proceed, treat that as authorization to execute the work end to end. An approved plan is the active execution contract: do not silently narrow, defer, or drop planned work because the system feels good enough, the remaining work feels less important, or you would prefer a smaller PR. If you believe the approved scope should change, pause and discuss that change with the user before proceeding. Only change approved scope without that discussion when a real blocker, hidden-risk decision, or explicit user redirect requires it.
-When work is in flight elsewhere — reviewer agents, subagents, CI, automated review, external jobs, or other waiting periods — wait as long as required. There is no fixed waiting limit, and slowness alone is not a reason to interrupt or abandon the work.
-When you use a browser, app, or other interactive UI to inspect, reproduce, or verify something, send the user screenshots of the relevant states so they can see what you saw. If screenshots are not possible in the current environment, say so explicitly.
-When an explanation is clearer visually, use Mermaid diagrams directly in chat for flows, architecture, state, and plans.
+Use reviewer passes at real phase boundaries, before PR-ready handoff, and before final closeout when helpful. Do not spend heavyweight review on a phase that still has known legacy seams or grep-gate failures.
-Delivery expectations:
-- Keep communication concise by default. Lead with the answer, diagnosis, decision, or next step, and include only the most important supporting detail unless the user asks for more.
-- For planned work, treat `.waypoint/ACTIVE_PLANS.md` as the live execution contract and define done from the approved scope, current phase checkpoint, and acceptance criteria, not from your own sense that the system is already good enough.
-- Execute approved plans phase by phase. Finish the current phase, run the relevant checkpoint, resolve findings, and only then move to the next phase.
-- When you report back to the user, explain the result in plain, direct language. Say what you changed, what happened, and anything the user actually needs to know, but do not lean on jargon, low-level implementation detail, or code-heavy narration unless the user asks for it.
-- Write for a smart person who is not looking at the code. The goal is clarity, not technical performance.
-- This communication rule applies to how you explain the work, not to how you do it. Your actual reasoning, coding, debugging, and verification should stay technical, precise, and rigorous.
-- When the user shows a bug, broken behavior, or a screenshot of something wrong, investigate before discussing readiness.
-- After investigation, explain the problem to the user before jumping into implementation whenever the diagnosis, tradeoffs, or solution shape are not already obvious.
-- Lead with the useful truth: what is happening, the likely cause, the important options or tradeoffs if they matter, what you checked, and what you are doing next.
-- Fix the underlying problem, not only the visible symptom. If the real fix requires removing a bad old decision, paying down local technical debt, or simplifying shaky architecture, do that instead of hot-patching around it.
-- Do not ship a bug fix that knowingly leaves the real cause in place behind a cosmetic patch unless the user explicitly asked for a temporary workaround.
-- Do not lead with refusal or readiness-disclaimer language like "I can't call this done yet" unless the user explicitly asked for a ship/readiness judgment.
-- Honesty means accurate diagnosis, explicit uncertainty, and clear verification limits. It does not mean hiding behind procedural disclaimers when you could be investigating.
-- Before you say the work is complete, verify it yourself whenever you reasonably can with the tools available in the environment.
-- Match the verification to the task. Run code and inspect real output for scripts and backend changes. Click through flows, inspect rendered states, and check behavior in the browser for visual or interactive work.
-- Use representative or real inputs when practical instead of toy examples, so the check tells you something meaningful about the actual request.
-- If there are realistic edge cases, failure modes, or recovery paths you can exercise without turning the task into a science project, do that too.
-- If something looks wrong, incomplete, or unproven, keep going. Fix it, rerun the check, and only report completion once the result matches the request.
-- Do not call work done while approved scope or acceptance criteria remain unfinished. If any approved item was skipped or deferred, report that plainly as partial work or a scope-change proposal, not as completion.
-- The point of this is to keep iteration off the user's shoulders. Return finished work when possible, not a first pass that still depends on the user to spot-check it for you.
-- Only come back before that if you hit a genuine blocker you cannot clear with the codebase, tools, or available context. If that happens, say it plainly and be explicit about what remains unverified.
+Keep communication concise. Lead with the answer, diagnosis, decision, or next step. Explain the diagnosis before implementation when the cause, tradeoffs, or solution shape are not already obvious.
-Working rules:
-- Treat `.waypoint/WORKSPACE.md` as a mandatory live execution log, not a closeout chore.
-- Treat `.waypoint/ACTIVE_PLANS.md` as the mandatory live execution-contract file for approved plans.
-- Update `.waypoint/WORKSPACE.md` during the work whenever the active goal, current phase, next step, blocker, verification state, or handoff context materially changes.
-- Update `.waypoint/ACTIVE_PLANS.md` whenever the active approved plan, current phase, phase checklist, checkpoint, or approved scope changes.
-- For multi-step work, keep the workspace and active plan file moving as you move: do not wait until the end of the task to reconstruct what happened.
-- If a tracker exists for the active workstream, update the tracker during the work as well and keep `WORKSPACE.md` pointing at the current tracker state.
-- Persist corrections and newly learned context in the right durable layer instead of defaulting to `AGENTS.md`.
-- Update user-scoped `AGENTS.md` only for true cross-project standing preferences or global operating rules.
-- Update the project-scoped repo `AGENTS.md` only for durable repo context or project-wide rules that should always apply in this repo.
-- If the correction is workflow-specific or method-specific, update the relevant repo skill instead. If no existing skill owns it well, propose creating one instead of stuffing that guidance into `AGENTS.md`.
-- Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when a durable plan changes, update `.waypoint/ACTIVE_PLANS.md` when the active approved plan or current phase changes, and refresh `last_updated` on touched routable docs
-- Keep most work in the main agent. Use repo-local skills, trackers, and reviewer agents when they create clear leverage, not as default ceremony.
-- Let repo-local skills describe their own triggers. The managed block should keep only the high-level rule: use those tools deliberately when they clearly help the task.
-- Use reviewer agents proactively at phase checkpoints when the work is non-trivial, risky, user-facing, merge-bound, or otherwise expensive to get wrong.
-- Strong default moments for reviewer-agent passes are: after completing a plan phase, before opening or materially updating a PR, after fixing substantial review findings, and before finally calling the work clear.
-- Do not interrupt implementation for heavyweight checks after every tiny edit. Batch related work into the current plan phase, then run the checkpoint.
-- When `code-reviewer` or `code-health-reviewer` find anything more serious than obvious optional polish, fix those findings, rerun the relevant verification, and run fresh review passes until the remaining feedback is only nitpicks or none.
-- Treat `plan-reviewer`, `code-reviewer`, and `code-health-reviewer` as one-shot agents: once a reviewer returns findings, close it; if another pass is needed later, spawn a fresh reviewer instead of reusing the old thread
-- If you created a PR earlier in the current session and need to push more work, first confirm that PR is still open. If it is closed, create a fresh branch from `origin/main` and open a fresh PR instead of pushing more commits to the old PR branch
-- Treat the generated context bundle as required session bootstrap, not optional reference material
-- After plan approval, own the execution through implementation, verification, and necessary repo-memory updates before surfacing a final completion report
+Before reporting completion, verify the result yourself when reasonably possible, reread `ACTIVE_PLANS.md`, `WORKSPACE.md`, and any active tracker, and compare reality against the approved scope and current checkpoint. If the work is not actually complete, keep going.
 <!-- waypoint:end -->