waypoint-codex 1.1.1 → 1.1.3
- package/dist/src/core.js +2 -0
- package/package.json +1 -1
- package/templates/.agents/skills/pr-review/SKILL.md +20 -4
- package/templates/.agents/skills/pr-review/agents/openai.yaml +1 -1
- package/templates/.agents/skills/execution-reset/SKILL.md +0 -92
- package/templates/.agents/skills/execution-reset/agents/openai.yaml +0 -4
- package/templates/.agents/skills/plan-start/SKILL.md +0 -79
- package/templates/.agents/skills/plan-start/agents/openai.yaml +0 -4
package/dist/src/core.js
CHANGED
@@ -357,6 +357,8 @@ export function initRepository(projectRoot, options) {
 ".agents/skills/waypoint-explore",
 ".agents/skills/waypoint-research",
 ".agents/skills/visual-explanations",
+".agents/skills/execution-reset",
+".agents/skills/plan-start",
 ".agents/skills/work-tracker",
 ".agents/skills/docs-sync",
 ".agents/skills/break-it-qa",
package/package.json
CHANGED
package/templates/.agents/skills/pr-review/SKILL.md
CHANGED
@@ -20,6 +20,8 @@ Run this loop until exit criteria are satisfied.
 1. Load PR state:
 - collect all current review threads and comments, including existing comments present before the skill started
 - collect CI/CD status for required checks
+- for each reviewer (`Codex`, `CodeRabbit`), identify only that reviewer's latest top-level PR comment as the source of truth for the current reviewer state
+- do not bind Codex parsing to one exact phrase; use author identity plus the latest comment/findings content (including explicit "no meaningful issues" style outcomes)
 
 2. Triage and act:
 - classify each reviewer finding as either major (`P1+`) or minor/nitpick
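The two bullets added in the hunk above amount to a selection rule: per reviewer, keep only the newest top-level comment, matched on who wrote it rather than on what it says. The following is a minimal sketch of that rule, not the package's implementation. The comment shape `{ author, createdAt, body }`, the bot logins in `REVIEWER_LOGINS`, and the sample comments are all hypothetical.

```javascript
// Hypothetical reviewer logins; the real bot account names are an assumption.
const REVIEWER_LOGINS = {
  Codex: "codex-bot",
  CodeRabbit: "coderabbit-bot",
};

// Return, per reviewer, only the newest top-level PR comment by that author.
// `comments` uses an assumed shape: [{ author, createdAt, body }].
function latestCommentByReviewer(comments, logins = REVIEWER_LOGINS) {
  const latest = {};
  for (const [reviewer, login] of Object.entries(logins)) {
    latest[reviewer] = null;
    for (const comment of comments) {
      // Match on author identity, never on one exact phrase in the body.
      if (comment.author !== login) continue;
      const current = latest[reviewer];
      if (
        current === null ||
        Date.parse(comment.createdAt) > Date.parse(current.createdAt)
      ) {
        latest[reviewer] = comment;
      }
    }
  }
  return latest;
}

// Made-up sample data for illustration only.
const sampleComments = [
  { author: "codex-bot", createdAt: "2024-01-01T10:00:00Z", body: "P1: missing null check" },
  { author: "coderabbit-bot", createdAt: "2024-01-01T10:05:00Z", body: "nitpick: rename variable" },
  { author: "codex-bot", createdAt: "2024-01-01T11:00:00Z", body: "no meaningful issues found" },
];
const latestComments = latestCommentByReviewer(sampleComments);
console.log(latestComments.Codex.body);
console.log(latestComments.CodeRabbit.body);
```

Because only the latest comment per author is kept, an earlier `P1` finding from the same reviewer no longer counts once that reviewer has posted a newer result.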
@@ -27,21 +29,31 @@ Run this loop until exit criteria are satisfied.
 - fix all non-false-positive major (`P1+`) findings in code/docs/tests
 - minor/nitpick findings may be accepted without code changes, but must still be replied to inline and resolved
 - if CI/CD has failures, fix those failures as part of the same loop
+- for CodeRabbit state handling:
+- if latest CodeRabbit comment is `Actions performed -> Review triggered`, treat CodeRabbit as `pending` and wait for its next latest comment before triaging CodeRabbit findings
+- do not use CodeRabbit CI/CD or check-run status as reviewer truth; comments are authoritative
 
 3. Thread discipline for every addressed or skipped finding:
 - post an inline reply on that thread explaining the fix or why it is a false positive
 - resolve the thread after replying
 
-4. Push and re-
+4. Push and request reviewer-specific re-review only when needed:
 - push commits
--
-- post comment: `@codex review`
+- determine which reviewer findings were addressed this round:
+- if Codex findings were addressed, post comment: `@codex review`
+- if CodeRabbit findings were addressed, post comment: `@coderabbitai review`
+- if both were addressed, post both comments
+- if neither reviewer's findings were addressed, do not trigger either reviewer
 
 5. Wait for review/check updates:
 - wait up to 30 minutes total
 - check every 5 minutes using a sleep interval (`sleep 300`)
 - on each check, re-read both review and CI/CD status
 - if major (`P1+`) findings or CI/CD failures appear, continue the loop immediately
+- when both reviewers' latest completed comments (not pending triggers) contain no major (`P1+`) findings, enter terminal cleanup mode:
+- fix and resolve that round's remaining findings
+- push commits if needed
+- do not post any further `@codex review` or `@coderabbitai review` comments
 
 ## Exit Criteria
 
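Steps 2, 4, and 5 in the hunk above describe three small decisions: whether CodeRabbit is still pending, which re-review comments to post, and what to do on each poll. The sketch below is one possible reading under stated assumptions, not the skill's actual logic. The function names, the `addressed` and `state` object shapes, and the substring match used to detect the pending marker are hypothetical. The diff also does not say what happens once the 30-minute budget is spent, so the sketch only reports `"timeout"`.

```javascript
// Wait budget quoted from step 5: up to 30 minutes, checked every 5 minutes (`sleep 300`).
const MAX_WAIT_MS = 30 * 60 * 1000;
const POLL_INTERVAL_MS = 300 * 1000;

// Classify CodeRabbit's latest comment. Matching on these two substrings is an
// assumption of this sketch; the skill only names the "Review triggered" state.
function codeRabbitState(latestComment) {
  if (!latestComment) return "none";
  const body = latestComment.body;
  const isTrigger =
    body.includes("Actions performed") && body.includes("Review triggered");
  return isTrigger ? "pending" : "completed";
}

// Step 4: post a re-review request only for the reviewer whose findings were addressed.
// `addressed` is a hypothetical shape: { codex: boolean, coderabbit: boolean }.
function reReviewComments(addressed) {
  const comments = [];
  if (addressed.codex) comments.push("@codex review");
  if (addressed.coderabbit) comments.push("@coderabbitai review");
  return comments; // an empty array means: trigger neither reviewer
}

// Step 5: decide what to do at one poll. `state` is a hypothetical summary of the
// re-read review and CI/CD status; `waitedMs` is the time already spent waiting.
function nextWaitAction(state, waitedMs) {
  if (state.majorFindings > 0 || state.ciFailing) return "continue-loop";
  if (state.codexCompleted && state.codeRabbit === "completed") {
    return "terminal-cleanup"; // no further re-review comments after this point
  }
  if (waitedMs >= MAX_WAIT_MS) return "timeout"; // not specified in the diff
  return "sleep"; // wait POLL_INTERVAL_MS, then check again
}

console.log(codeRabbitState({ body: "Actions performed\nReview triggered." }));
console.log(reReviewComments({ codex: true, coderabbit: false }));
console.log(
  nextWaitAction(
    { majorFindings: 0, ciFailing: false, codexCompleted: true, codeRabbit: "pending" },
    POLL_INTERVAL_MS
  )
);
```

Keeping the per-poll decision as a pure function separates the policy from the sleeping and API calls, which would live in whatever drives the loop.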
@@ -51,12 +63,16 @@ You may end the loop only when all are true:
 - no unresolved major (`P1+`) Codex findings remain
 - every addressed or skipped finding has an inline reply and is resolved
 - CI/CD is green (or explicitly non-blocking per repo policy)
-- the latest reviewer
+- the latest completed reviewer comments from both bots contain only nitpicks/minor issues (no major `P1+` issues)
+- terminal cleanup mode has been completed (remaining minor/nitpick findings handled, with no retrigger afterward)
 
 ## Required Behavior
 
 - Do not ignore existing comments that were already open when the skill was invoked.
 - Do not stop after one pass if reviewer bots are still producing new findings.
+- Distinguish reviewer ownership of findings and retrigger only the reviewer whose findings were addressed.
+- Use each reviewer's latest comment as reviewer truth; do not infer reviewer completion from CodeRabbit CI/CD status.
+- Treat CodeRabbit `Actions performed -> Review triggered` as pending review, not as final findings.
 - Do not mark false positives without a concrete reason in the inline reply.
 - Do not leave handled threads unresolved.
 - Do not declare completion while CI/CD is failing for actionable reasons.
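The exit criteria in the hunk above are a conjunction: the loop may end only when every listed condition holds. A minimal sketch of that check follows. The `summary` object and all of its field names are hypothetical; the package does not define such a structure in this diff.

```javascript
// Evaluate the exit criteria over a hypothetical summary of the PR state.
// Every field name below is an illustration, not part of the package.
function mayExitLoop(summary) {
  const ciAcceptable = summary.ciGreen || summary.ciNonBlockingPerRepoPolicy;
  return (
    summary.unresolvedMajorCodexFindings === 0 &&
    summary.handledThreadsMissingReplyOrResolution === 0 &&
    ciAcceptable &&
    summary.latestCompletedCommentsHaveNoMajor &&
    summary.terminalCleanupDone
  );
}

const readyToExit = {
  unresolvedMajorCodexFindings: 0,
  handledThreadsMissingReplyOrResolution: 0,
  ciGreen: true,
  ciNonBlockingPerRepoPolicy: false,
  latestCompletedCommentsHaveNoMajor: true,
  terminalCleanupDone: true,
};
console.log(mayExitLoop(readyToExit));
console.log(mayExitLoop({ ...readyToExit, terminalCleanupDone: false }));
```

The last two criteria are the ones this release adds: clean latest comments from both bots are not enough until the terminal cleanup pass has also been completed.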
package/templates/.agents/skills/pr-review/agents/openai.yaml
CHANGED
@@ -1,4 +1,4 @@
 interface:
 display_name: "PR Review"
 short_description: "Close the review loop with CodeRabbit, Codex, and CI/CD"
-default_prompt: "Use $pr-review: address all existing PR review findings, fix actionable CI/CD failures, reply inline and resolve each handled thread,
+default_prompt: "Use $pr-review: address all existing PR review findings, fix actionable CI/CD failures, reply inline and resolve each handled thread, and request re-review per reviewer ownership (`@codex review` only if Codex findings were addressed, `@coderabbitai review` only if CodeRabbit findings were addressed). Treat each reviewer's latest comment as source of truth (CodeRabbit `Actions performed -> Review triggered` means pending); for Codex, do not key detection to one exact phrase, use latest authored comment/findings content. When both reviewers' latest completed comments have no major (P1+) findings, do one terminal cleanup pass and do not retrigger either reviewer."
package/templates/.agents/skills/execution-reset/SKILL.md
DELETED
@@ -1,92 +0,0 @@
----
-name: execution-reset
-description: Recover stalled mid-execution work on a referenced implementation plan when progress has degraded into local edits, repeated patching, or phase drift. Use only after a phase has already started and the current session is no longer advancing a concrete milestone. Reconstruct actual progress from the codebase, test for a stall, and select the next substantial work package that advances or completes the active phase.
----
-
-## Core Instruction
-Reset execution from evidence, not memory, when an in-progress phase has stalled and the next meaningful unit of work is unclear.
-
-## Required Inputs
-You must have all of the following before you act:
-1. A referenced plan or roadmap.
-2. The active phase or checkpoint within that plan.
-3. The relevant workspace or codebase context needed to inspect current state.
-
-## Blocked Behavior
-If any required input is missing, do not infer it, do not continue, and do not propose a reset package.
-State exactly which input is missing and request it.
-
-## Trigger Boundary
-Use this skill only when:
-1. Work on the current phase has already started.
-2. The current session is not advancing the phase toward completion.
-3. The problem is execution drift, not plan creation, backlog grooming, or initial scoping.
-
-Do not use this skill at plan start or for ordinary planning questions.
-
-## Stall Test
-Treat execution as stalled when inspection shows at least one of the following:
-- Three consecutive implementation attempts have changed files or run checks without moving any acceptance condition from missing or partial to complete.
-- The same local area has been patched twice or more in the current phase without producing a measurable milestone change.
-- The current state is still partial, broken, or stale after a review of the codebase, and no next step can be named that clearly advances the phase end-to-end.
-- Progress is limited to micro-edits, reformatting, or cleanup that does not change the phase outcome.
-
-The stall test is objective: if none of the conditions above hold, do not declare a reset.
-
-## Workflow
-1. Identify the active phase and its intended outcome from the plan.
-2. Inspect the codebase and classify the phase state as complete, partial, missing, or broken/stale.
-3. Apply the stall test.
-4. If stalled, restate the phase as concrete system behavior and identify one work package that will complete a sub-milestone, unblock the main dependency, or complete the end-to-end path.
-5. Execute only that work package.
-6. Re-check whether the change moved at least one acceptance condition from missing or partial to complete.
-
-## Rules
-- Do not continue with micro-edits unless they directly unblock the selected work package.
-- Do not polish incomplete phases.
-- Do not trust prior session context over the codebase.
-- Do not select more than one work package per reset cycle.
-- Do not expand scope to adjacent cleanup unless it is required for the chosen package to finish.
-
-## Exception Rule
-You may relax the stall test only when the workspace evidence shows a phase-critical blocker outside the current code path, such as a missing prerequisite, unavailable dependency, or an upstream change that makes the current plan invalid. This exception is bounded:
-- The blocker must be named.
-- The blocker must be evidenced in the inspected state.
-- The response must switch to blocker resolution, not continued execution.
-
-## Stop Condition
-Stop the reset cycle immediately when one of these is true:
-- The selected package cannot be executed because a required input is still missing.
-- The selected package fails to move at least one acceptance condition from missing or partial to complete after a concrete attempt.
-- The inspected codebase shows the phase is no longer the right unit of work.
-
-## Next Action If Reset Package Fails
-If the chosen package fails, do not chain into a second package. Report the blocker, the evidence, and the smallest next question or dependency needed to continue.
-
-## Output Contract
-Use this shape:
-
-### Execution Reset
-### Active Phase
-[phase]
-
-### Objective
-[what this phase must accomplish]
-
-### Actual Status
-- Complete: [...]
-- Partial: [...]
-- Missing: [...]
-- Broken/stale: [...]
-
-### Stall Diagnosis
-[why the stall test passed, or why it did not]
-
-### Next Work Package
-[one substantial chunk, or the blocker if execution is stopped]
-
-### Definition of Done
-[concrete completion conditions for the package]
-
-### Plan Correction
-[only if the plan is locally stale]
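The "Stall Test" section of the removed skill is an any-of check over four observations. For readers comparing versions, the sketch below shows one way that test could be expressed. It is an illustration of the removed text only; the `evidence` object and its field names are hypothetical and appear nowhere in the package.

```javascript
// Collect every stall condition that holds for the inspected workspace state.
// `evidence` is a hypothetical record produced by inspecting the codebase.
function stallReasons(evidence) {
  const reasons = [];
  if (evidence.consecutiveAttemptsWithoutProgress >= 3) {
    reasons.push("three consecutive attempts moved no acceptance condition");
  }
  if (evidence.patchesToSameAreaWithoutMilestone >= 2) {
    reasons.push("same local area patched twice or more without a milestone change");
  }
  if (evidence.stateStillIncomplete && !evidence.nextStepNameable) {
    reasons.push("no next step clearly advances the phase end-to-end");
  }
  if (evidence.onlyCosmeticProgress) {
    reasons.push("progress limited to micro-edits, reformatting, or cleanup");
  }
  return reasons;
}

// Per the removed text: if none of the conditions hold, do not declare a reset.
function isStalled(evidence) {
  return stallReasons(evidence).length > 0;
}

const healthySession = {
  consecutiveAttemptsWithoutProgress: 1,
  patchesToSameAreaWithoutMilestone: 0,
  stateStillIncomplete: true,
  nextStepNameable: true,
  onlyCosmeticProgress: false,
};
console.log(isStalled(healthySession));
console.log(stallReasons({ ...healthySession, consecutiveAttemptsWithoutProgress: 3 }));
```

Returning the list of reasons rather than a bare boolean matches the skill's `Stall Diagnosis` output field, which asks why the test passed or did not.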
package/templates/.agents/skills/execution-reset/agents/openai.yaml
DELETED
@@ -1,4 +0,0 @@
-interface:
-display_name: "Execution Reset"
-short_description: "Recover stalled mid-execution work on an active plan phase"
-default_prompt: "Use $execution-reset only after a plan phase has already started and execution has stalled; provide the referenced plan, the active phase, and the relevant workspace context so the skill can inspect current state, test for a stall, and pick one substantial next work package."
package/templates/.agents/skills/plan-start/SKILL.md
DELETED
@@ -1,79 +0,0 @@
----
-name: plan-start
-description: Bootstrap a fresh session onto an existing implementation plan. Use when a referenced plan already exists, execution has not meaningfully started in the current session, and Codex needs to reconstruct the active phase from the plan plus current repository state before beginning the first substantial work package.
----
-
-# Plan Start
-
-## Core Instruction
-Convert a referenced plan into the first executable work package for a fresh session.
-
-## Trigger Boundary
-Use this skill when:
-- a durable plan, roadmap, or phase list already exists
-- the current session has not yet become a stalled execution loop
-- the task is to re-enter the plan, recover the active phase, and begin substantive work
-
-Do not use this skill when:
-- the session is already mid-execution and progress has degraded into micro-edits, repeated patching, or phase drift
-- the current problem is recovering momentum on a stuck phase
-- the request is to revise the plan itself before execution can begin
-
-In those cases, route to:
-- `$execution-reset` for stalled or compacted plan execution
-- `$planning` when the plan is missing, non-durable, or too underspecified to execute
-
-## Required Inputs
-Do not proceed until you have:
-- a plan path, plan identifier, or equivalent durable plan reference
-- the current repository/worktree state relevant to that plan
-- enough current context to tell whether the referenced plan is still actionable
-
-If any required input is missing, stop and route to `$planning` or ask for the missing reference instead of guessing.
-
-## Workflow
-1. Read the referenced plan end to end.
-2. Inspect the current repository state relevant to the plan.
-3. Determine:
-- which phases are complete
-- which phase is active
-- what remains inside the active phase
-4. Restate the active phase as concrete system behavior.
-5. Select the first substantial work package that most directly advances that phase.
-6. Begin execution on that package.
-
-## Rules
-- Do not re-plan the whole project unless the referenced plan is locally stale.
-- Do not use this skill to recover from a stalled session; that is `$execution-reset`.
-- Do not spend the session on summaries, narration, or cosmetic cleanup when a substantive work package is available.
-- Do not choose a micro-edit as the first move unless it is the smallest change that unblocks the first substantial package.
-- Select the work package that most directly advances the active phase.
-
-## Bounded Exception
-If the plan is locally stale, allow one bounded re-anchoring pass:
-- reconcile the active phase against the current codebase
-- update the phase boundary only as needed to make execution possible
-- do not rewrite the whole roadmap
-
-If the plan still cannot be executed after that pass, stop and route to `$planning`.
-
-## Output Contract
-### Normal
-Return:
-- `Active Phase`
-- `Objective`
-- `Current State`
-- `First Work Package`
-- `Definition of Done`
-
-### Blocked
-Return `Blocked` when required inputs are missing or the plan cannot be safely actioned yet. Include:
-- what is missing
-- why execution cannot start
-- the exact reroute target: `$planning` or `$execution-reset`
-
-### Reroute
-Return `Reroute` when the request belongs to another skill. Include:
-- `Reroute Target`
-- `Reason`
-- the minimal handoff needed to continue
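The "Trigger Boundary" section of the removed skill is a three-way routing decision between `$planning`, `$execution-reset`, and `$plan-start` itself. The sketch below illustrates that routing as read from the removed text. The `context` object, its field names, and the order of the two checks are assumptions made for illustration.

```javascript
// Pick the skill that should handle the session, following the removed text:
// missing, non-durable, or underspecified plan -> $planning;
// stalled or compacted execution -> $execution-reset; otherwise plan-start applies.
// `context` is a hypothetical shape and the check order is an assumption.
function routeSession(context) {
  if (!context.hasDurablePlan || context.planUnderspecified) return "$planning";
  if (context.sessionStalled) return "$execution-reset";
  return "$plan-start";
}

console.log(routeSession({ hasDurablePlan: true, planUnderspecified: false, sessionStalled: false }));
console.log(routeSession({ hasDurablePlan: true, planUnderspecified: false, sessionStalled: true }));
console.log(routeSession({ hasDurablePlan: false, planUnderspecified: false, sessionStalled: false }));
```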
package/templates/.agents/skills/plan-start/agents/openai.yaml
DELETED
@@ -1,4 +0,0 @@
-interface:
-display_name: "Plan Start"
-short_description: "Bootstrap a fresh session onto an existing implementation plan"
-default_prompt: "Use $plan-start when a durable implementation plan already exists and this is a fresh-session bootstrap, not a stalled execution recovery. Read the plan, inspect the current repo state, identify the active phase, pick the first substantial work package, and begin execution. If the plan is missing or underspecified, route to $planning; if the session is already stalled, route to $execution-reset."