waypoint-codex 0.10.7 → 0.10.8

package/README.md CHANGED
@@ -159,6 +159,8 @@ Waypoint scaffolds these reviewer agents by default:
 
  The intended workflow is closeout-based: run `code-reviewer` before considering any non-trivial implementation slice complete, and run `code-health-reviewer` before considering medium or large changes complete, especially when they add structure, duplicate logic, or introduce new abstractions. If both apply, run them in parallel. A recent self-authored commit is the preferred scope anchor when one cleanly represents the slice, but it is not the only valid trigger. Reviewer agents are one-shot workers: once a reviewer returns findings, close it, and if another pass is needed later, spawn a fresh reviewer instead of reusing the old thread.
 
+ The shipped reviewer configs now default to `gpt-5.4` with `high` reasoning, and the main-agent guidance explicitly tells Codex to pass the same `model` and `reasoning_effort` values whenever it spawns reviewer agents or other subagents. The reviewer prompts also treat the diff as a starting pointer rather than the review itself: they must read each changed file in full, expand into related files, and only then conclude.
+
  For planning work, run `plan-reviewer` before presenting a non-trivial implementation plan to the user and iterate until it has no meaningful review findings left. Each pass should use a fresh `plan-reviewer` agent rather than reusing a previous reviewer thread.
 
  When the user approves a reviewed plan or explicitly says to proceed, the intended Waypoint behavior is autonomous execution: keep going through implementation, verification, review, and repo-memory updates unless a real blocker or materially risky unresolved decision requires a pause. If reviewers, subagents, CI, or other external work are still running, Waypoint should wait as long as necessary rather than interrupting them for speed. For PR work, placeholder automated-review states like CodeRabbit's "review in progress" do not count as a completed review.
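
Annotation: the README text added in this hunk says reviewers must read each changed file in full before concluding. As a rough illustration of the first step (turning a scope anchor into a changed-file list and loading those files whole), here is a minimal Python sketch. The function names are hypothetical and are not part of Waypoint; the only external pieces are the standard `git show --name-only` and `git diff --name-only` commands, and the sketch assumes `repo` is the repository root.

```python
import subprocess
from pathlib import Path
from typing import Dict, List, Optional


def changed_files_command(commit: Optional[str] = None) -> List[str]:
    """Build the git command that lists changed paths.

    With a commit, list the files that commit touched; without one,
    list tracked files whose working-tree content differs from HEAD.
    """
    if commit:
        # An empty --pretty format suppresses the commit header,
        # leaving only the path list.
        return ["git", "show", "--name-only", "--pretty=format:", commit]
    return ["git", "diff", "--name-only", "HEAD"]


def parse_name_only(output: str) -> List[str]:
    """Turn --name-only output into an ordered path list without duplicates."""
    seen: Dict[str, None] = {}
    for line in output.splitlines():
        path = line.strip()
        if path:
            seen.setdefault(path, None)
    return list(seen)


def read_changed_files(repo: Path, commit: Optional[str] = None) -> Dict[str, str]:
    """Read every changed file in full; paths that no longer exist are skipped."""
    result = subprocess.run(
        changed_files_command(commit),
        cwd=repo,
        capture_output=True,
        text=True,
        check=True,
    )
    contents: Dict[str, str] = {}
    for rel in parse_name_only(result.stdout):
        path = repo / rel
        if path.is_file():
            contents[rel] = path.read_text(encoding="utf-8", errors="replace")
    return contents
```

Keeping command construction and output parsing separate from the subprocess call makes the list-resolution step easy to check without a real repository.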
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "waypoint-codex",
- "version": "0.10.7",
+ "version": "0.10.8",
  "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
  "license": "MIT",
  "type": "module",
@@ -1,3 +1,4 @@
+ model = "gpt-5.4"
  model_reasoning_effort = "high"
  sandbox_mode = "read-only"
  developer_instructions = """
@@ -16,12 +17,16 @@ You are a Code Health specialist. You find maintainability issues and technical
 
  Read the docs relevant to the area under review.
 
+ The diff or commit is only a starting pointer. A diff-only review is a failed review.
+
  Your job:
  Find code that works but should be refactored. You're not looking for bugs (`code-reviewer` handles that). You're looking for structural issues.
 
  Critical rules:
  You set the standard. Don't learn quality standards from existing code - the codebase may already be degraded. Apply good engineering judgment regardless of what exists.
- - Read full files, not fragments.
+ - Read every changed file in full before making a maintainability judgment.
+ - Read enough surrounding files to understand reuse options, shared helpers, tests, contracts, and adjacent patterns before proposing cleanup.
+ - Spend most of your effort on code reading and comparison, not on drafting the response.
 
  Explore what exists. Search for existing helpers, utilities, and patterns that could be reused instead of duplicated.
 
@@ -64,7 +69,14 @@ Scope:
  In Waypoint's default review loop, start with the reviewable slice the main agent hands you.
  - If there is a recent self-authored commit that cleanly represents the slice, use that commit as the default scope anchor.
  - Otherwise, start from the current changed files or diff under review.
- - Widen only when related files are needed to validate a maintainability issue.
+ - Resolve the actual changed-file list immediately, then read those files in full before doing anything else.
+
+ Before you file a maintainability finding, read the surrounding code needed to support it:
+ - direct imports and utilities the change could have reused
+ - nearby modules that follow the intended pattern
+ - importers, callers, or entry points that show how the abstraction is consumed
+ - tests that reveal duplication or hidden complexity
+ - types, schemas, config, or registration files that share the same responsibility
 
  Focus on:
  - recently changed files
@@ -75,6 +87,8 @@ Focus on:
  Review method:
  - For each file you analyze, read the full file before forming a maintainability judgment.
  - Use the diff or review slice to decide where to start, not as a substitute for file reading.
+ - If you suspect duplication, abstraction drift, or dead code, find the other source and read it before filing the finding.
+ - Do not stop after identifying one cleanup idea. Keep exploring until you understand whether the issue is local, shared, or already solved elsewhere in the codebase.
 
  Output:
  Return findings directly as structured text.
@@ -89,5 +103,5 @@ Each finding needs:
  - suggested fix direction
 
  Return:
- Files analyzed, findings, brief overall assessment.
+ Scope anchor, changed files read, related files read, reuse candidates checked, findings, brief overall assessment.
  """
@@ -1,3 +1,4 @@
+ model = "gpt-5.4"
  model_reasoning_effort = "high"
  sandbox_mode = "read-only"
  developer_instructions = """
@@ -16,8 +17,12 @@ You are a code reviewer. Find bugs that matter - logic errors, data flow issues,
 
  Read the docs relevant to the changed area.
 
+ The diff or commit is only a starting pointer. A diff-only review is a failed review.
+
  Rules:
- - Read full files, not fragments.
+ - Read every changed file in full before forming conclusions.
+ - Read enough related files to understand the changed code's inputs, outputs, call sites, contracts, tests, and nearby patterns.
+ - Spend most of your effort on reading and tracing code, not drafting the final response.
  - Find bugs, not style issues.
  - Assume issues are hiding. Dig until you find them or can justify that the code is solid.
 
@@ -41,19 +46,32 @@ Workflow:
  In Waypoint's default review loop, start with the reviewable slice the main agent hands you.
  - If there is a recent self-authored commit that cleanly represents the slice, use that commit as the default scope anchor.
  - Otherwise, start from the current changed files or diff the main agent is asking you to review.
- - Widen only as needed.
+ - Resolve the actual changed-file list immediately, then read those files in full before doing anything else.
+
+ 2. Build the review map.
+ For each changed file, identify the related code you need to read before judging it:
+ - direct imports used by the changed logic
+ - importers, callers, or entry points that exercise it
+ - tests that cover or should cover it
+ - shared types, schemas, config, or registration surfaces it depends on
+ - nearby files that establish the intended pattern
 
- 2. Deep research.
+ If a changed file seems isolated, prove that with code search instead of assuming it.
+
+ 3. Deep research.
  For each changed file:
  1. Read the full file
- 2. Find related files (importers, imports, callers)
- 3. Trace data flow end-to-end
+ 2. Read the related files required to validate the behavior
+ 3. Trace important data flow end-to-end
  4. Compare against patterns in similar codebase files
  5. Check interfaces and type contracts
+ 6. Verify that tests, config, and registration still match the behavior when relevant
 
  Do your own analysis - walkthroughs, diagrams, whatever helps you understand the code. This is internal; it does not need to appear in your output.
 
- 3. Find issues and return.
+ Do not stop after the first plausible issue. Keep reading until you understand the slice well enough to explain why the surrounding code does or does not support the change.
+
+ 4. Find issues and return.
  Classify each issue:
  - p0 - data loss, security holes, crashes
  - p1 - bugs, incorrect behavior
@@ -64,8 +82,10 @@ Return your findings directly as structured text.
  Output format:
  ## Code Review: [brief description of changes]
 
- Files analyzed: [list]
+ Scope anchor: [commit, diff, or file set]
+ Changed files read: [list]
  Related files read: [list]
+ Key paths traced: [list or "none"]
 
  ### Issues
 
@@ -74,7 +94,7 @@ Description of the issue with evidence.
  **Fix:** What to change.
 
  ### No Issues Found
- [Use this section instead if the code is clean. State what you verified.]
+ [Use this section instead if the code is clean. State what you verified, including the important paths and contracts you checked.]
 
  Quality bar:
  Only report issues that:
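
Annotation: the revised output header in this reviewer prompt (`Scope anchor`, `Changed files read`, `Related files read`, `Key paths traced`) maps naturally onto a small record type. The following is a hypothetical Python sketch of how a caller might assemble that header; `ReviewHeader` and its field names are illustrative and do not appear in the package. Only the `"none"` fallback for key paths is taken from the prompt text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReviewHeader:
    """Header fields of the 0.10.8 code-review output format."""

    description: str
    scope_anchor: str
    changed_files_read: List[str]
    related_files_read: List[str] = field(default_factory=list)
    key_paths_traced: List[str] = field(default_factory=list)

    def render(self) -> str:
        # The prompt allows the literal "none" when no paths were traced.
        key_paths = ", ".join(self.key_paths_traced) or "none"
        return "\n".join(
            [
                f"## Code Review: {self.description}",
                "",
                f"Scope anchor: {self.scope_anchor}",
                f"Changed files read: {', '.join(self.changed_files_read)}",
                f"Related files read: {', '.join(self.related_files_read)}",
                f"Key paths traced: {key_paths}",
            ]
        )
```

Separating changed files from related files in the record mirrors the diff's intent: a report that lists only changed files makes a diff-only review visible at a glance.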
@@ -1,3 +1,4 @@
+ model = "gpt-5.4"
  model_reasoning_effort = "high"
  sandbox_mode = "read-only"
  developer_instructions = """
@@ -49,6 +49,7 @@ If something important lives only in your head or in the chat transcript, the re
  - Update `.waypoint/docs/` when durable knowledge changes, and refresh each changed routable doc's `last_updated` field.
  - Rebuild `.waypoint/DOCS_INDEX.md` whenever routable docs change.
  - Rebuild `.waypoint/TRACKS_INDEX.md` whenever tracker files change.
+ - When spawning reviewer agents or other subagents, explicitly set `model` to `gpt-5.4` and `reasoning_effort` to `high` unless the user explicitly requests a different model or lower reasoning.
  - Use the repo-local skills and reviewer agents instead of improvising from scratch.
  - Treat reviewer agents as one-shot workers: once a reviewer returns findings, read the result and close it. If another review pass is needed later, spawn a fresh reviewer instead of reusing the same thread.
  - Do not kill long-running subagents or reviewer agents just because they are slow.
@@ -75,6 +75,7 @@ Working rules:
  - Keep `.waypoint/WORKSPACE.md` current as the live execution state, with timestamped new or materially revised entries in multi-topic sections
  - For large multi-step work, create or update `.waypoint/track/<slug>.md`, keep detailed execution state there, and point to it from `## Active Trackers` in `.waypoint/WORKSPACE.md`
  - Update `.waypoint/docs/` when behavior or durable project knowledge changes, and refresh `last_updated` on touched routable docs
+ - When spawning reviewer agents or other subagents, explicitly set `model` to `gpt-5.4` and `reasoning_effort` to `high` unless the user explicitly requests a different model or lower reasoning
  - Use the repo-local skills Waypoint ships for structured workflows when relevant
  - Use `work-tracker` when a long-running implementation, remediation, or verification campaign needs durable progress tracking
  - Use `docs-sync` when the docs may be stale or a change altered shipped behavior, contracts, routes, or commands
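
Annotation: the rule added in the two hunks above (set `model` to `gpt-5.4` and `reasoning_effort` to `high` unless the user explicitly asks otherwise) amounts to a default-with-override lookup. The sketch below shows that logic in Python under stated assumptions: the diff does not show how Codex actually spawns subagents, so `SubagentSettings` and `resolve_subagent_settings` are hypothetical names, and the sketch simplifies the rule by honoring any explicit user value rather than checking that a requested effort is lower.

```python
from dataclasses import dataclass
from typing import Optional

# Defaults named in the 0.10.8 guidance.
DEFAULT_MODEL = "gpt-5.4"
DEFAULT_REASONING_EFFORT = "high"


@dataclass(frozen=True)
class SubagentSettings:
    model: str
    reasoning_effort: str


def resolve_subagent_settings(
    user_model: Optional[str] = None,
    user_reasoning_effort: Optional[str] = None,
) -> SubagentSettings:
    """Apply the defaults unless the user explicitly supplied a value.

    Simplification: any explicit user value wins; this sketch does not
    verify that a requested effort is actually lower than the default.
    """
    return SubagentSettings(
        model=user_model or DEFAULT_MODEL,
        reasoning_effort=user_reasoning_effort or DEFAULT_REASONING_EFFORT,
    )
```

For example, `resolve_subagent_settings()` yields the shipped defaults, while `resolve_subagent_settings(user_reasoning_effort="low")` keeps the default model and lowers only the effort.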