waypoint-codex 1.1.1 → 1.1.3

package/dist/src/core.js CHANGED
@@ -357,6 +357,8 @@ export function initRepository(projectRoot, options) {
  ".agents/skills/waypoint-explore",
  ".agents/skills/waypoint-research",
  ".agents/skills/visual-explanations",
+ ".agents/skills/execution-reset",
+ ".agents/skills/plan-start",
  ".agents/skills/work-tracker",
  ".agents/skills/docs-sync",
  ".agents/skills/break-it-qa",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "waypoint-codex",
-   "version": "1.1.1",
+   "version": "1.1.3",
    "description": "Make Codex better by default with stronger planning, code quality, reviews, tracking, and repo guidance.",
    "license": "MIT",
    "type": "module",
@@ -20,6 +20,8 @@ Run this loop until exit criteria are satisfied.
  1. Load PR state:
  - collect all current review threads and comments, including existing comments present before the skill started
  - collect CI/CD status for required checks
+ - for each reviewer (`Codex`, `CodeRabbit`), identify only that reviewer's latest top-level PR comment as the source of truth for the current reviewer state
+ - do not bind Codex parsing to one exact phrase; use author identity plus the latest comment/findings content (including explicit "no meaningful issues" style outcomes)
 
  2. Triage and act:
  - classify each reviewer finding as either major (`P1+`) or minor/nitpick
@@ -27,21 +29,31 @@ Run this loop until exit criteria are satisfied.
  - fix all non-false-positive major (`P1+`) findings in code/docs/tests
  - minor/nitpick findings may be accepted without code changes, but must still be replied to inline and resolved
  - if CI/CD has failures, fix those failures as part of the same loop
+ - for CodeRabbit state handling:
+   - if latest CodeRabbit comment is `Actions performed -> Review triggered`, treat CodeRabbit as `pending` and wait for its next latest comment before triaging CodeRabbit findings
+   - do not use CodeRabbit CI/CD or check-run status as reviewer truth; comments are authoritative
 
  3. Thread discipline for every addressed or skipped finding:
  - post an inline reply on that thread explaining the fix or why it is a false positive
  - resolve the thread after replying
 
- 4. Push and re-request automated review:
+ 4. Push and request reviewer-specific re-review only when needed:
  - push commits
- - post comment: `@coderabbitai review`
- - post comment: `@codex review`
+ - determine which reviewer findings were addressed this round:
+   - if Codex findings were addressed, post comment: `@codex review`
+   - if CodeRabbit findings were addressed, post comment: `@coderabbitai review`
+   - if both were addressed, post both comments
+   - if neither reviewer's findings were addressed, do not trigger either reviewer
 
  5. Wait for review/check updates:
  - wait up to 30 minutes total
  - check every 5 minutes using a sleep interval (`sleep 300`)
  - on each check, re-read both review and CI/CD status
  - if major (`P1+`) findings or CI/CD failures appear, continue the loop immediately
+ - when both reviewers' latest completed comments (not pending triggers) contain no major (`P1+`) findings, enter terminal cleanup mode:
+   - fix and resolve that round's remaining findings
+   - push commits if needed
+   - do not post any further `@codex review` or `@coderabbitai review` comments
 
  ## Exit Criteria
 
@@ -51,12 +63,16 @@ You may end the loop only when all are true:
  - no unresolved major (`P1+`) Codex findings remain
  - every addressed or skipped finding has an inline reply and is resolved
  - CI/CD is green (or explicitly non-blocking per repo policy)
- - the latest reviewer rounds contain only nitpicks/minor issues (no major `P1+` issues)
+ - the latest completed reviewer comments from both bots contain only nitpicks/minor issues (no major `P1+` issues)
+ - terminal cleanup mode has been completed (remaining minor/nitpick findings handled, with no retrigger afterward)
 
  ## Required Behavior
 
  - Do not ignore existing comments that were already open when the skill was invoked.
  - Do not stop after one pass if reviewer bots are still producing new findings.
+ - Distinguish reviewer ownership of findings and retrigger only the reviewer whose findings were addressed.
+ - Use each reviewer's latest comment as reviewer truth; do not infer reviewer completion from CodeRabbit CI/CD status.
+ - Treat CodeRabbit `Actions performed -> Review triggered` as pending review, not as final findings.
  - Do not mark false positives without a concrete reason in the inline reply.
  - Do not leave handled threads unresolved.
  - Do not declare completion while CI/CD is failing for actionable reasons.
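The reviewer-state and retrigger rules added in this hunk can be read as a small decision procedure. The following is only a minimal sketch under stated assumptions: the comment shape (`author`, `body`, `createdAt`), the helper names, and the regular expression used to recognize the CodeRabbit trigger acknowledgement are all hypothetical and do not come from the package.

```javascript
// Hypothetical comment shape: { author: string, body: string, createdAt: ISO 8601 string }.
// The pattern below is an assumed way to recognize the trigger acknowledgement text.
const CODERABBIT_TRIGGER = /Actions performed[\s\S]*Review triggered/i;

// Return the newest top-level comment written by `author`, or null if there is none.
function latestCommentBy(comments, author) {
  const own = comments.filter((c) => c.author === author);
  if (own.length === 0) return null;
  return own.reduce((a, b) =>
    Date.parse(a.createdAt) >= Date.parse(b.createdAt) ? a : b
  );
}

// CodeRabbit counts as "pending" while its latest comment is only the trigger acknowledgement.
function codeRabbitState(comments, author) {
  const latest = latestCommentBy(comments, author);
  if (latest === null) return "none";
  return CODERABBIT_TRIGGER.test(latest.body) ? "pending" : "completed";
}

// Step 4: retrigger only the reviewers whose findings were addressed this round.
function reviewCommentsToPost(addressed) {
  const out = [];
  if (addressed.codex) out.push("@codex review");
  if (addressed.codeRabbit) out.push("@coderabbitai review");
  return out;
}
```

Reading only the latest comment per author keeps an older findings comment from masking a newer pending trigger, which is the case the hunk singles out for CodeRabbit.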
@@ -1,4 +1,4 @@
  interface:
    display_name: "PR Review"
    short_description: "Close the review loop with CodeRabbit, Codex, and CI/CD"
-   default_prompt: "Use $pr-review: address all existing PR review findings, fix actionable CI/CD failures, reply inline and resolve each handled thread, push fixes, comment '@coderabbitai review' and '@codex review', then poll every 5 minutes for up to 30 minutes per round until no major (P1+) issues remain and latest comments are only minor or nitpicks."
+   default_prompt: "Use $pr-review: address all existing PR review findings, fix actionable CI/CD failures, reply inline and resolve each handled thread, and request re-review per reviewer ownership (`@codex review` only if Codex findings were addressed, `@coderabbitai review` only if CodeRabbit findings were addressed). Treat each reviewer's latest comment as source of truth (CodeRabbit `Actions performed -> Review triggered` means pending); for Codex, do not key detection to one exact phrase, use latest authored comment/findings content. When both reviewers' latest completed comments have no major (P1+) findings, do one terminal cleanup pass and do not retrigger either reviewer."
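The terminal-cleanup rule in the updated prompt amounts to a three-way decision on each check. The sketch below is illustrative only: the snapshot shape, the `nextAction` name, and the returned labels are assumptions, and treating a reviewer that has not commented yet as "still waiting" is a choice made here, not something the prompt states.

```javascript
// Hypothetical snapshot: { ciFailing: boolean, codex: R, codeRabbit: R }
// where R = { state: "pending" | "completed" | "none", majorFindings: number }.
function nextAction(snapshot) {
  const reviewers = [snapshot.codex, snapshot.codeRabbit];

  // Major (P1+) findings or failing CI send the loop back to triage immediately.
  const hasMajor = reviewers.some(
    (r) => r.state === "completed" && r.majorFindings > 0
  );
  if (snapshot.ciFailing || hasMajor) return "continue-loop";

  // A pending trigger is not a result, so keep polling (assumption: "none" also waits).
  if (reviewers.some((r) => r.state !== "completed")) return "wait";

  // Both latest completed comments are free of major findings:
  // one cleanup pass, and no further re-review comments.
  return "terminal-cleanup";
}
```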
@@ -1,92 +0,0 @@
- ---
- name: execution-reset
- description: Recover stalled mid-execution work on a referenced implementation plan when progress has degraded into local edits, repeated patching, or phase drift. Use only after a phase has already started and the current session is no longer advancing a concrete milestone. Reconstruct actual progress from the codebase, test for a stall, and select the next substantial work package that advances or completes the active phase.
- ---
-
- ## Core Instruction
- Reset execution from evidence, not memory, when an in-progress phase has stalled and the next meaningful unit of work is unclear.
-
- ## Required Inputs
- You must have all of the following before you act:
- 1. A referenced plan or roadmap.
- 2. The active phase or checkpoint within that plan.
- 3. The relevant workspace or codebase context needed to inspect current state.
-
- ## Blocked Behavior
- If any required input is missing, do not infer it, do not continue, and do not propose a reset package.
- State exactly which input is missing and request it.
-
- ## Trigger Boundary
- Use this skill only when:
- 1. Work on the current phase has already started.
- 2. The current session is not advancing the phase toward completion.
- 3. The problem is execution drift, not plan creation, backlog grooming, or initial scoping.
-
- Do not use this skill at plan start or for ordinary planning questions.
-
- ## Stall Test
- Treat execution as stalled when inspection shows at least one of the following:
- - Three consecutive implementation attempts have changed files or run checks without moving any acceptance condition from missing or partial to complete.
- - The same local area has been patched twice or more in the current phase without producing a measurable milestone change.
- - The current state is still partial, broken, or stale after a review of the codebase, and no next step can be named that clearly advances the phase end-to-end.
- - Progress is limited to micro-edits, reformatting, or cleanup that does not change the phase outcome.
-
- The stall test is objective: if none of the conditions above hold, do not declare a reset.
-
- ## Workflow
- 1. Identify the active phase and its intended outcome from the plan.
- 2. Inspect the codebase and classify the phase state as complete, partial, missing, or broken/stale.
- 3. Apply the stall test.
- 4. If stalled, restate the phase as concrete system behavior and identify one work package that will complete a sub-milestone, unblock the main dependency, or complete the end-to-end path.
- 5. Execute only that work package.
- 6. Re-check whether the change moved at least one acceptance condition from missing or partial to complete.
-
- ## Rules
- - Do not continue with micro-edits unless they directly unblock the selected work package.
- - Do not polish incomplete phases.
- - Do not trust prior session context over the codebase.
- - Do not select more than one work package per reset cycle.
- - Do not expand scope to adjacent cleanup unless it is required for the chosen package to finish.
-
- ## Exception Rule
- You may relax the stall test only when the workspace evidence shows a phase-critical blocker outside the current code path, such as a missing prerequisite, unavailable dependency, or an upstream change that makes the current plan invalid. This exception is bounded:
- - The blocker must be named.
- - The blocker must be evidenced in the inspected state.
- - The response must switch to blocker resolution, not continued execution.
-
- ## Stop Condition
- Stop the reset cycle immediately when one of these is true:
- - The selected package cannot be executed because a required input is still missing.
- - The selected package fails to move at least one acceptance condition from missing or partial to complete after a concrete attempt.
- - The inspected codebase shows the phase is no longer the right unit of work.
-
- ## Next Action If Reset Package Fails
- If the chosen package fails, do not chain into a second package. Report the blocker, the evidence, and the smallest next question or dependency needed to continue.
-
- ## Output Contract
- Use this shape:
-
- ### Execution Reset
- ### Active Phase
- [phase]
-
- ### Objective
- [what this phase must accomplish]
-
- ### Actual Status
- - Complete: [...]
- - Partial: [...]
- - Missing: [...]
- - Broken/stale: [...]
-
- ### Stall Diagnosis
- [why the stall test passed, or why it did not]
-
- ### Next Work Package
- [one substantial chunk, or the blocker if execution is stopped]
-
- ### Definition of Done
- [concrete completion conditions for the package]
-
- ### Plan Correction
- [only if the plan is locally stale]
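The four stall-test conditions in the removed skill combine as a plain disjunction, which a short predicate makes explicit. This is a sketch only: the evidence record and every field name in it are hypothetical, standing in for whatever an inspection of the codebase would actually yield.

```javascript
// Hypothetical evidence record; all field names are illustrative assumptions:
// {
//   attemptsWithoutProgress: number,  // consecutive attempts that completed no acceptance condition
//   patchesToSameArea: number,        // patches to one local area with no milestone change
//   phaseState: "complete" | "partial" | "missing" | "broken" | "stale",
//   nextStepNameable: boolean,        // can a step that advances the phase end-to-end be named?
//   onlyCosmeticProgress: boolean     // progress limited to micro-edits, reformatting, cleanup
// }
function isStalled(evidence) {
  const degraded = ["partial", "broken", "stale"].includes(evidence.phaseState);
  return (
    evidence.attemptsWithoutProgress >= 3 ||
    evidence.patchesToSameArea >= 2 ||
    (degraded && !evidence.nextStepNameable) ||
    evidence.onlyCosmeticProgress
  );
}
```

If the predicate is false, the skill text says not to declare a reset, so a caller would stop here rather than pick a work package.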
@@ -1,4 +0,0 @@
- interface:
-   display_name: "Execution Reset"
-   short_description: "Recover stalled mid-execution work on an active plan phase"
-   default_prompt: "Use $execution-reset only after a plan phase has already started and execution has stalled; provide the referenced plan, the active phase, and the relevant workspace context so the skill can inspect current state, test for a stall, and pick one substantial next work package."
@@ -1,79 +0,0 @@
- ---
- name: plan-start
- description: Bootstrap a fresh session onto an existing implementation plan. Use when a referenced plan already exists, execution has not meaningfully started in the current session, and Codex needs to reconstruct the active phase from the plan plus current repository state before beginning the first substantial work package.
- ---
-
- # Plan Start
-
- ## Core Instruction
- Convert a referenced plan into the first executable work package for a fresh session.
-
- ## Trigger Boundary
- Use this skill when:
- - a durable plan, roadmap, or phase list already exists
- - the current session has not yet become a stalled execution loop
- - the task is to re-enter the plan, recover the active phase, and begin substantive work
-
- Do not use this skill when:
- - the session is already mid-execution and progress has degraded into micro-edits, repeated patching, or phase drift
- - the current problem is recovering momentum on a stuck phase
- - the request is to revise the plan itself before execution can begin
-
- In those cases, route to:
- - `$execution-reset` for stalled or compacted plan execution
- - `$planning` when the plan is missing, non-durable, or too underspecified to execute
-
- ## Required Inputs
- Do not proceed until you have:
- - a plan path, plan identifier, or equivalent durable plan reference
- - the current repository/worktree state relevant to that plan
- - enough current context to tell whether the referenced plan is still actionable
-
- If any required input is missing, stop and route to `$planning` or ask for the missing reference instead of guessing.
-
- ## Workflow
- 1. Read the referenced plan end to end.
- 2. Inspect the current repository state relevant to the plan.
- 3. Determine:
- - which phases are complete
- - which phase is active
- - what remains inside the active phase
- 4. Restate the active phase as concrete system behavior.
- 5. Select the first substantial work package that most directly advances that phase.
- 6. Begin execution on that package.
-
- ## Rules
- - Do not re-plan the whole project unless the referenced plan is locally stale.
- - Do not use this skill to recover from a stalled session; that is `$execution-reset`.
- - Do not spend the session on summaries, narration, or cosmetic cleanup when a substantive work package is available.
- - Do not choose a micro-edit as the first move unless it is the smallest change that unblocks the first substantial package.
- - Select the work package that most directly advances the active phase.
-
- ## Bounded Exception
- If the plan is locally stale, allow one bounded re-anchoring pass:
- - reconcile the active phase against the current codebase
- - update the phase boundary only as needed to make execution possible
- - do not rewrite the whole roadmap
-
- If the plan still cannot be executed after that pass, stop and route to `$planning`.
-
- ## Output Contract
- ### Normal
- Return:
- - `Active Phase`
- - `Objective`
- - `Current State`
- - `First Work Package`
- - `Definition of Done`
-
- ### Blocked
- Return `Blocked` when required inputs are missing or the plan cannot be safely actioned yet. Include:
- - what is missing
- - why execution cannot start
- - the exact reroute target: `$planning` or `$execution-reset`
-
- ### Reroute
- Return `Reroute` when the request belongs to another skill. Include:
- - `Reroute Target`
- - `Reason`
- - the minimal handoff needed to continue
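The trigger boundary in the removed skill routes between three skills based on the plan and session state. A minimal sketch of that routing follows, assuming a hypothetical `routeSkill` helper and three boolean inputs that are not part of the package; the skill names are the ones quoted in the text above.

```javascript
// Hypothetical inputs (all names are illustrative assumptions):
//   hasDurablePlan   - a plan path, identifier, or equivalent durable reference exists
//   planExecutable   - the plan is specific enough to act on
//   executionStalled - the session is already mid-execution and has degraded into drift
function routeSkill({ hasDurablePlan, planExecutable, executionStalled }) {
  // Missing, non-durable, or underspecified plans go back to planning.
  if (!hasDurablePlan || !planExecutable) return "$planning";
  // Stalled or compacted execution belongs to the recovery skill.
  if (executionStalled) return "$execution-reset";
  // Otherwise this is a fresh-session bootstrap.
  return "$plan-start";
}
```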
@@ -1,4 +0,0 @@
- interface:
-   display_name: "Plan Start"
-   short_description: "Bootstrap a fresh session onto an existing implementation plan"
-   default_prompt: "Use $plan-start when a durable implementation plan already exists and this is a fresh-session bootstrap, not a stalled execution recovery. Read the plan, inspect the current repo state, identify the active phase, pick the first substantial work package, and begin execution. If the plan is missing or underspecified, route to $planning; if the session is already stalled, route to $execution-reset."