npm - waypoint-codex - Versions diffs - 0.9.8 → 0.10.0 - Mend

waypoint-codex 0.9.8 → 0.10.0

Files changed (11)

package/README.md +6 -1
package/dist/src/core.js +1 -0
package/package.json +1 -1
package/templates/.agents/skills/conversation-retrospective/SKILL.md +98 -0
package/templates/.agents/skills/conversation-retrospective/agents/openai.yaml +4 -0
package/templates/.agents/skills/planning/SKILL.md +1 -0
package/templates/.agents/skills/pr-review/SKILL.md +3 -0
package/templates/.gitignore.snippet +1 -0
package/templates/.waypoint/SOUL.md +2 -0
package/templates/.waypoint/agent-operating-manual.md +28 -1
package/templates/managed-agents-block.md +4 -0

package/README.md CHANGED

@@ -137,6 +137,7 @@ Waypoint ships a strong default skill pack for real coding work:
 - `docs-sync`
 - `code-guide-audit`
 - `break-it-qa`
+- `conversation-retrospective`
 - `frontend-ship-audit`
 - `backend-ship-audit`
 - `workspace-compress`
@@ -144,7 +145,7 @@ Waypoint ships a strong default skill pack for real coding work:
 - `pr-review`
 These are repo-local, so the workflow travels with the project.
-`break-it-qa`, `frontend-ship-audit`, and `backend-ship-audit` are user-invoked audit skills, not default autonomous agent steps.
+`conversation-retrospective`, `break-it-qa`, `frontend-ship-audit`, and `backend-ship-audit` are on-demand skills, not default autonomous agent steps.
 ## Reviewer agents
@@ -156,6 +157,10 @@ Waypoint scaffolds these reviewer agents by default:
 The intended workflow is closeout-based: run `code-reviewer` before considering any non-trivial implementation slice complete, and run `code-health-reviewer` before considering medium or large changes complete, especially when they add structure, duplicate logic, or introduce new abstractions. If both apply, run them in parallel. A recent self-authored commit is the preferred scope anchor when one cleanly represents the slice, but it is not the only valid trigger.
+For planning work, run `plan-reviewer` before presenting a non-trivial implementation plan to the user and iterate until it has no meaningful review findings left.
+When the user approves a reviewed plan or explicitly says to proceed, the intended Waypoint behavior is autonomous execution: keep going through implementation, verification, review, and repo-memory updates unless a real blocker or materially risky unresolved decision requires a pause.
 ## What makes it different
 Waypoint is not trying to hide everything behind hooks and background machinery.

package/dist/src/core.js CHANGED

@@ -383,6 +383,7 @@ export function doctorRepository(projectRoot) {
         "docs-sync",
         "code-guide-audit",
         "break-it-qa",
+        "conversation-retrospective",
         "workspace-compress",
         "pre-pr-hygiene",
         "pr-review",

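The hunk above only shows that `conversation-retrospective` joins a list of skill names inside `doctorRepository`; the rest of the function body is not part of this diff. As a minimal sketch, assuming the list is compared against the skill directories that actually exist in the repo, the check could look like the following. `findMissingSkills` and `expectedSkills` are hypothetical names, not identifiers from the package.

```javascript
// Hypothetical helper: the real doctorRepository body is not shown in this diff.
// Given the expected skill names and the directory names actually present,
// return the expected skills that are missing, preserving list order.
function findMissingSkills(expectedSkills, presentDirs) {
  const present = new Set(presentDirs);
  return expectedSkills.filter((name) => !present.has(name));
}

// The portion of the expected list that is visible in the hunk above.
const expectedSkills = [
  "docs-sync",
  "code-guide-audit",
  "break-it-qa",
  "conversation-retrospective",
  "workspace-compress",
  "pre-pr-hygiene",
  "pr-review",
];
```

Under that assumption, a repo scaffolded before this release would report `conversation-retrospective` as missing until its skill pack is brought up to date.
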
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "waypoint-codex",
-  "version": "0.9.8",
+  "version": "0.10.0",
   "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
   "license": "MIT",
   "type": "module",

package/templates/.agents/skills/conversation-retrospective/SKILL.md ADDED

@@ -0,0 +1,98 @@
+---
+name: conversation-retrospective
+description: Analyze the active conversation for durable repo knowledge, skill improvements, and repeated workflow patterns. Use when the user asks to save what was learned from the current conversation, update memory/docs without more prompting, improve skills that were used or exposed gaps, or propose new skills based on repetitive work in the live thread.
+---
+# Conversation Retrospective
+Use this skill to harvest the active conversation into the repo's existing memory system.
+This skill works from the live conversation already in context. Do not go hunting through archived session files unless the user explicitly asks for that.
+This is a closeout and distillation workflow, not a generic planning pass or a broad docs audit.
+## Read First
+Before persisting anything:
+1. Read the repo's main agent guidance and project-context files
+2. Read the repo's current durable memory surfaces, such as docs, workspace/handoff files, trackers, decision logs, or knowledge files
+3. Read the exact docs, notes, and skill files that the conversation touched
+Do not assume the repo uses Waypoint. Adapt to the memory structure that already exists.
+## Step 1: Distill Durable Knowledge
+Review the current conversation and separate:
+- durable project knowledge
+- live execution state
+- transient chatter
+Persist without asking follow-up questions when the correct destination is clear.
+Write durable knowledge to the smallest truthful home the repo already uses:
+- the main docs or knowledge layer for architecture, behavior, decisions, debugging knowledge, durable plans, and reusable operating guidance
+- the repo's standing guidance file for durable project context or long-lived working rules
+- the repo's live handoff or workspace file for current state, blockers, and immediate next steps
+- the repo's tracker or execution-log layer when the conversation created or materially changed a long-running workstream
+If the repo uses doc metadata such as `last_updated`, refresh it when needed.
+If the repo has no obvious durable home but the need is clear, create the smallest coherent doc or note that fits the surrounding patterns instead of leaving the learning only in chat.
+Do not leave important truths only in chat.
+## Step 2: Improve Existing Skills
+Identify which skills were actually used in this conversation, or which existing skills clearly should have covered the workflow but left avoidable gaps.
+For each affected skill:
+- read the existing skill before editing it
+- update only reusable guidance, not one-off transcript details
+- add missing guardrails, path hints, failure modes, decision rules, or references that would have made the conversation easier to complete
+- keep `SKILL.md` concise; prefer targeted structural improvements over turning the skill into a diary
+If the environment has both a source-of-truth skill and one or more mirrored or installed copies, update the source-of-truth version and any copies the user expects to stay in sync.
+Do not assume there is only one skill location, and do not assume there are many.
+## Step 3: Propose New Skills
+When the conversation revealed repetitive work that existing skills do not cover well:
+- do not silently scaffold a new skill unless the user asked for implementation
+- record the proposal in the repo's existing docs or idea-capture layer
+If there is no obvious place for durable skill proposals, create a small doc such as `skill-ideas.md` in the repo's normal docs area.
+Each proposal should include:
+- the repeated workflow or problem
+- likely trigger phrases
+- expected outputs or side effects
+- why existing skills were insufficient
+Skip this doc when there is no real new-skill candidate.
+## Step 4: Refresh Repo Memory
+After changing docs, handoff state, trackers, or skills:
+- run whatever repo-local refresh or index step the project uses, if one exists
+- otherwise make sure the edited memory surfaces are internally consistent and discoverable
+Do not invent a refresh command when the repo does not have one.
+## Step 5: Report
+Summarize:
+- what durable knowledge you saved and where
+- which skills you improved
+- which new skill ideas you recorded, if any
+- what you intentionally left unpersisted because it was transient
+If no substantive persistence changes were needed, say that explicitly instead of inventing updates.
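
Step 3 of the skill above lists four things each new-skill proposal should include. One way to keep proposals complete is to render them from a small structure that rejects missing fields. This is a sketch only: the field names, the extra `title` field, and the markdown layout are assumptions for illustration, and the skill itself prescribes no format.

```javascript
// Hypothetical shape for a skill proposal. The field names are illustrative
// and mirror the bullets listed under Step 3, plus a title for the heading.
const REQUIRED_FIELDS = [
  "title",
  "workflow",
  "triggerPhrases", // array of strings
  "expectedOutputs",
  "whyExistingInsufficient",
];

// Render one proposal as a markdown entry; throw if any field is empty.
function renderSkillProposal(proposal) {
  for (const field of REQUIRED_FIELDS) {
    const value = proposal[field];
    const isEmpty = Array.isArray(value) ? value.length === 0 : !value;
    if (isEmpty) {
      throw new Error(`skill proposal is missing "${field}"`);
    }
  }
  const triggers = proposal.triggerPhrases.map((p) => `"${p}"`).join(", ");
  return [
    `## ${proposal.title}`,
    `- Repeated workflow: ${proposal.workflow}`,
    `- Likely trigger phrases: ${triggers}`,
    `- Expected outputs or side effects: ${proposal.expectedOutputs}`,
    `- Why existing skills were insufficient: ${proposal.whyExistingInsufficient}`,
  ].join("\n");
}
```

The returned string could be appended to a file such as the `skill-ideas.md` that the skill mentions, or to whatever idea-capture doc the repo already uses.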

package/templates/.agents/skills/conversation-retrospective/agents/openai.yaml ADDED

@@ -0,0 +1,4 @@
+interface:
+  display_name: "Conversation Retrospective"
+  short_description: "Harvest the live conversation into repo memory"
+  default_prompt: "Use this skill to analyze the active conversation, save durable knowledge into the repo's existing docs, memory, guidance, handoff, or tracker surfaces, improve any skills that were used or exposed gaps, and record new skill ideas without asking follow-up questions when the correct destination is clear."

package/templates/.agents/skills/planning/SKILL.md CHANGED

@@ -145,6 +145,7 @@ When the plan doc is written:
 - give a short chat summary
 - include the doc path
 - call out any unresolved decisions that still need the user's input
+- if there are no unresolved decisions and the user approves the plan, treat that approval as authorization to execute the plan end to end rather than asking again at each obvious next step
 ## Quality Bar

package/templates/.agents/skills/pr-review/SKILL.md CHANGED

@@ -10,8 +10,10 @@ Use this skill to drive the PR through review instead of treating review as a on
 ## Step 1: Wait For Review To Settle
 - Check the PR's current review and CI status.
+- If CI is red or pending, inspect the failed check logs before triaging review comments so you do not chase comment fixes while a separate blocker is breaking the branch.
 - If automated review is still running, wait for it to finish instead of racing it.
 - If comments are still arriving, do not prematurely declare the loop complete.
+- For stacked or non-`main` PRs, explicitly compare the PR head against its base branch and make sure later fixes on the base branch have actually been merged or rebased forward. Do not assume a sibling/base PR fix is already present in the dependent PR.
 ## Step 2: Read Every Review Comment
@@ -33,6 +35,7 @@ Do not leave comments unanswered.
 - Make the needed fixes.
 - rerun the relevant verification
+- if the PR is stacked, repeat the base-vs-head sanity check after pushes so you catch missing forward-merges before the next CI cycle
 - push follow-up commit(s)
 - return to the PR and continue the loop
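
The stacked-PR guidance added above asks for an explicit comparison of the PR head against its base branch, but does not name a command. One possible approach, offered as an assumption and not as the skill's prescribed method, is to run `git rev-list --left-right --count <base>...<head>` and interpret its two tab-separated counts. The first count is the number of commits reachable only from the base and the second is the number reachable only from the head. `parseBaseHeadCounts` is a hypothetical helper name.

```javascript
// Interpret the output of: git rev-list --left-right --count <base>...<head>
// Git prints "<base-only>\t<head-only>". A non-zero base-only count means the
// base branch has commits that the PR head does not yet contain.
function parseBaseHeadCounts(output) {
  const match = /^(\d+)\s+(\d+)$/.exec(output.trim());
  if (!match) {
    throw new Error(`unexpected rev-list output: ${JSON.stringify(output)}`);
  }
  const behind = Number(match[1]); // commits only on the base branch
  const ahead = Number(match[2]); // commits only on the PR head
  return { behind, ahead, needsForwardMerge: behind > 0 };
}
```

A non-zero `behind` value is only a signal that something on the base has not reached the PR head. If the base branch has been rewritten by a rebase, the count can be non-zero for other reasons, so the missing commits still need to be inspected by hand.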

package/templates/.gitignore.snippet CHANGED

@@ -12,6 +12,7 @@
 .agents/skills/backend-context-interview/
 .agents/skills/frontend-ship-audit/
 .agents/skills/backend-ship-audit/
+.agents/skills/conversation-retrospective/
 .agents/skills/workspace-compress/
 .agents/skills/pre-pr-hygiene/
 .agents/skills/pr-review/

package/templates/.waypoint/SOUL.md CHANGED

@@ -20,6 +20,8 @@ You're direct, opinionated, and evidence-driven. You read before you write. You
 **Correctness beats theater.** No fake verification. No fake confidence. No pretending a shallow answer is good enough.
+**Approval means ownership.** Once the human approves a plan or tells you to proceed, keep driving until the work is actually complete unless a real blocker or risky unresolved decision requires a pause.
 ## How You Work
 **Read before you write.** Never modify code you haven't read.

package/templates/.waypoint/agent-operating-manual.md CHANGED

@@ -43,6 +43,7 @@ If something important lives only in your head or in the chat transcript, the re
 - Read code before editing it.
 - Follow the repo's documented patterns when they are healthy.
+- If the user approves a plan or explicitly tells you to proceed, treat that as authorization to finish the approved work end to end.
 - Update `.waypoint/WORKSPACE.md` as live execution state when progress meaningfully changes. In multi-topic sections, prefix new or materially revised bullets with a local timestamp like `[2026-03-06 20:10 PST]`.
 - For large multi-step work, create or update a tracker in `.waypoint/track/`, keep detailed execution state there, and point at it from `## Active Trackers` in `.waypoint/WORKSPACE.md`.
 - Update `.waypoint/docs/` when durable knowledge changes, and refresh each changed routable doc's `last_updated` field.
@@ -51,6 +52,24 @@ If something important lives only in your head or in the chat transcript, the re
 - Use the repo-local skills and reviewer agents instead of improvising from scratch.
 - Do not kill long-running subagents or reviewer agents just because they are slow. Wait unless they are clearly stuck, failed, or the user redirects the work.
+## Execution autonomy
+Once the user has approved a plan or otherwise told you to continue, own the work until the slice is genuinely complete.
+That means:
+- continue through implementation, verification, reviewer passes, and required docs/workspace updates without asking for incremental permission
+- use commentary for short progress updates, not as a handoff back to the user
+- do not stop just to announce the next obvious step and ask whether to do it
+Pause only when:
+- a real blocker prevents forward progress
+- a hidden-risk or non-obvious decision would materially change scope, behavior, cost, or data safety
+- the user explicitly redirects, pauses, or narrows the work
+If none of those are true, keep going.
 ## Documentation expectations
 Document the things the next agent cannot safely infer from raw code alone:
@@ -82,7 +101,15 @@ Waypoint scaffolds these focused second-pass specialists by default:
 - `code-reviewer` for correctness and regression review
 - `code-health-reviewer` for maintainability drift
-- `plan-reviewer` to challenge weak implementation plans before execution
+- `plan-reviewer` to challenge non-trivial implementation plans before they are shown to the user
+## Plan Review
+Run `plan-reviewer` before presenting a non-trivial implementation plan to the user.
+- Use it when the plan includes meaningful design choices, multiple work phases, migrations, or non-obvious tradeoffs.
+- Skip it for tiny obvious plans or when no plan will be presented.
+- Read the reviewer result, strengthen the plan, and rerun `plan-reviewer` until there are no meaningful issues left before showing the plan to the user.
 ## Review Loop
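
The "Execution autonomy" section added to `agent-operating-manual.md` in this diff reduces to a short decision rule: pause only for a real blocker, a materially risky unresolved decision, or an explicit user redirect, and otherwise keep going. The manual states this in prose only. The function below is an illustrative sketch, and `pauseReason` and its field names are hypothetical.

```javascript
// Hypothetical decision helper mirroring the three "Pause only when" bullets.
// Returns a human-readable reason to pause, or null to keep going.
function pauseReason(state) {
  if (state.userRedirected) {
    return "user redirected, paused, or narrowed the work";
  }
  if (state.blocker) {
    return `blocker: ${state.blocker}`;
  }
  if (state.riskyDecision) {
    return `risky unresolved decision: ${state.riskyDecision}`;
  }
  return null; // none of the pause conditions hold, so continue the slice
}
```

Announcing the next obvious step is deliberately not one of the inputs, which matches the manual's instruction not to stop and ask for incremental permission.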

package/templates/managed-agents-block.md CHANGED

@@ -66,6 +66,8 @@ If some uncertainty still remains after checking persisted context and interview
 Prefer existing persisted context over re-interviewing the user.
+If the user approves a plan or explicitly tells you to proceed, treat that as authorization to execute the work end to end. Do not stop mid-implementation for incremental permission unless a real blocker, hidden-risk decision, or explicit user redirect requires a pause.
 Working rules:
 - Keep `.waypoint/WORKSPACE.md` current as the live execution state, with timestamped new or materially revised entries in multi-topic sections
 - For large multi-step work, create or update `.waypoint/track/<slug>.md`, keep detailed execution state there, and point to it from `## Active Trackers` in `.waypoint/WORKSPACE.md`
@@ -75,9 +77,11 @@ Working rules:
 - Use `docs-sync` when the docs may be stale or a change altered shipped behavior, contracts, routes, or commands
 - Use `code-guide-audit` for a targeted coding-guide compliance pass on a specific feature, file set, or change slice
 - Do not invoke `break-it-qa`, `frontend-ship-audit`, or `backend-ship-audit` yourself from the managed AGENTS block workflow; they are user-facing skills for explicit human-requested QA or ship-readiness audits, not default agent steps
+- Before presenting a non-trivial implementation plan to the user, run `plan-reviewer` and iterate on the plan until it has no meaningful review findings left
 - Before considering a non-trivial implementation slice complete, run `code-reviewer`; use a recent self-authored commit as the default scope anchor when one cleanly represents that slice
 - Before considering medium or large changes complete, run `code-health-reviewer`, especially when they add structure, duplicate logic, or introduce new abstractions
 - Before pushing or opening/updating a PR for substantial work, use `pre-pr-hygiene`
 - Use `pr-review` once a PR has active review comments or automated review in progress
 - Treat the generated context bundle as required session bootstrap, not optional reference material
+- After plan approval, own the execution through implementation, verification, review, and repo-memory updates before surfacing a final completion report
 <!-- waypoint:end -->