waypoint-codex 0.16.0 → 0.18.0

package/README.md CHANGED
@@ -31,10 +31,18 @@ Or try it without a global install:
  npx waypoint-codex@latest --help
  ```

+ Inside the repo you want to prepare for Codex:
+
+ ```bash
+ waypoint init
+ waypoint doctor
+ ```
+
  Keep an existing repo up to date:

  ```bash
  waypoint upgrade
+ waypoint doctor
  ```

  ## What gets better
@@ -261,6 +269,7 @@ Waypoint ships a strong default skill pack for real coding work:
  - `workspace-compress`
  - `pre-pr-hygiene`
  - `pr-review`
+ - `agi-help`

  These are repo-local, so the workflow travels with the project.
 
package/dist/src/core.js CHANGED
@@ -347,6 +347,7 @@ export function initRepository(projectRoot, options) {
  "docs/README.md",
  "docs/code-guide.md",
  "docs/legacy-import",
+ ".codex/agents/coding-agent.toml",
  ".agents/skills/error-audit",
  ".agents/skills/observability-audit",
  ".agents/skills/ux-states-audit",
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "waypoint-codex",
- "version": "0.16.0",
- "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
+ "version": "0.18.0",
+ "description": "Make Codex better by default with stronger planning, code quality, reviews, tracking, and repo guidance.",
  "license": "MIT",
  "type": "module",
  "bin": {
@@ -0,0 +1,259 @@
+ ---
+ name: agi-help
+ description: Prepare a complete external handoff package for GPT-5.4-Pro in ChatGPT when a task is unusually high-stakes, ambiguous, leverage-heavy, or quality-sensitive and one excellent answer is worth a slower manual loop. Use for greenfield project starts, major refactors, architecture rethinks, migration strategy, big-feature planning, hard product or strategy decisions, and other work where the external model needs full relevant context because it has no access to the repo, files, history, or local tools.
+ ---
+
+ # AGI-Help
+
+ Use this skill to prepare a high-quality manual handoff for GPT-5.4-Pro.
+
+ GPT-5.4-Pro is an external thinking partner, not a connected coding agent. It cannot see the repo, local files, prior discussion, current state, or failed attempts unless you package that context for Mark to send manually in ChatGPT.
+
+ The job of this skill is to create a complete handoff bundle that gives GPT-5.4-Pro the best possible chance of producing one exceptional answer.
+
+ ## What This Skill Owns
+
+ This skill owns the preparation step:
+
+ - decide whether GPT-5.4-Pro is justified for this task
+ - collect all relevant context in full
+ - copy the relevant files into a temporary handoff folder
+ - write a strong prompt for an external model with zero local context
+ - tell Mark exactly what to send
+ - stop and wait for the external response
+
+ This skill does **not** send anything itself.
+
+ ## When To Use This Skill
+
+ Use AGI-Help when:
+
+ - the task is high-stakes and a weak answer would be costly
+ - one strong answer is more valuable than a fast back-and-forth loop
+ - deep synthesis, architecture judgment, strategy, or reframing quality matters more than local tool execution speed
+ - the task is large enough or important enough that a manual GPT-5.4-Pro pass is worth 20-50 minutes
+
+ Typical examples:
+
+ - starting a project from scratch
+ - major refactors or system redesigns
+ - architecture or migration strategy
+ - planning a large feature or multi-phase initiative
+ - resolving hard tradeoffs across product, UX, engineering, and operations
+ - reshaping positioning, messaging, or strategy where synthesis quality matters a lot
+ - any other difficult task where Mark explicitly wants the strongest available single response
+
+ ## When Not To Use This Skill
+
+ Do not use it for:
+
+ - small or routine edits
+ - local debugging where filesystem access matters more than abstract reasoning
+ - simple implementation tasks that are already clear
+ - requests where a normal answer or normal planning pass is sufficient
+
+ ## Output
+
+ Create a handoff bundle at one of these locations:
+
+ - prefer `tmp/agi-help/<timestamp>/` inside the current workspace when that is practical
+ - otherwise use `~/.codex/tmp/agi-help/<timestamp>/`
+
+ The bundle should contain:
+
+ ```text
+ tmp/agi-help/<timestamp>/
+ ├── prompt.md
+ ├── manifest.md
+ ├── request-summary.md
+ └── files/
+     └── ...copied source files...
+ ```
+
+ ### prompt.md
+
+ The exact prompt Mark should paste into GPT-5.4-Pro.
+
+ ### manifest.md
+
+ A file-by-file list of what is included and why each file matters.
+
+ ### request-summary.md
+
+ A short operator note for Mark that explains:
+
+ - why AGI-Help was used here
+ - what to paste
+ - which files to attach
+ - what kind of answer to ask for
+
+ ### files/
+
+ Copies of every relevant file that should be attached.
+
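Reviewer note: the Output section of the new skill fixes a bundle layout and a two-step location preference. A minimal sketch of how those paths could be derived, assuming a hypothetical `bundleLayout` helper and an arbitrary timestamp string (the skill does not specify a timestamp format, and nothing below is part of waypoint-codex):

```javascript
// Hypothetical helper, not part of waypoint-codex: derives the bundle paths
// described in the skill's Output section.
function bundleLayout(timestamp, { workspaceRoot = null, homeDir = "~" } = {}) {
  // Prefer a workspace-local bundle; otherwise fall back to ~/.codex/tmp.
  const root =
    workspaceRoot !== null
      ? `${workspaceRoot}/tmp/agi-help/${timestamp}`
      : `${homeDir}/.codex/tmp/agi-help/${timestamp}`;
  return {
    root,
    prompt: `${root}/prompt.md`,
    manifest: `${root}/manifest.md`,
    requestSummary: `${root}/request-summary.md`,
    filesDir: `${root}/files`,
  };
}

// Example with an assumed timestamp format and workspace path.
const layout = bundleLayout("20250101-120000", { workspaceRoot: "/repo" });
console.log(layout.prompt);
```

Keeping the layout in one pure function makes it easy to report the exact bundle path and the file to paste in the final handoff.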
+ ## Core Rule: Include All Relevant Context
+
+ Do not optimize for brevity by dropping relevant material.
+
+ If a file is relevant, include it in full.
+ If multiple files are relevant, include all of them.
+ If prior plans, failed attempts, docs, architecture notes, or state files materially change the answer, include them too.
+
+ The bottleneck here is not token thrift inside Codex. The bottleneck is giving GPT-5.4-Pro enough real context to reason well.
+
+ Curate relevance aggressively. Compress relevance only by excluding things that truly do not matter.
+ Do **not** compress relevant context just because it is long.
+
+ ## Workflow
+
+ ### 1. Justify AGI-Help
+
+ Before building the bundle, write 3-6 bullets in `request-summary.md` explaining why GPT-5.4-Pro is warranted here.
+
+ Focus on why:
+
+ - the task is unusually important, difficult, or leverage-heavy
+ - the answer needs deep synthesis, design judgment, or strategy
+ - normal local iteration is likely to be weaker than one high-quality external pass
+
+ ### 2. Reconstruct The Full Situation
+
+ Assume GPT-5.4-Pro knows nothing.
+
+ Collect the context it would need to reason well, such as:
+
+ - what the project, company, system, or situation is
+ - who the users are
+ - what the current state is
+ - what we want to achieve
+ - why this matters now
+ - what constraints exist
+ - what tradeoffs matter
+ - what has already been tried
+ - what is blocked, unclear, risky, or contentious
+ - what a successful answer would help us decide or do next
+
+ This is not a fixed checklist. Include whatever materially changes the quality of the answer.
+
+ ### 3. Copy The Relevant Files
+
+ Create `files/` and copy in the relevant source material.
+
+ Examples of relevant files:
+
+ - core implementation files
+ - architecture docs
+ - plans
+ - tracker files
+ - config files
+ - failing or partial implementations
+ - screenshots or exported artifacts when available through the current tool surface
+ - strategy docs, briefs, drafts, notes, or prior outputs that define the problem
+
+ Preserve relative structure inside `files/` when it helps orientation.
+
+ ### 4. Write manifest.md
+
+ For each included file, list:
+
+ - copied path inside the bundle
+ - original path
+ - why the file matters
+ - any brief note about how GPT-5.4-Pro should interpret it
+
+ Keep this concise but useful.
+
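Reviewer note: step 4 lists four fields per manifest entry but leaves the file format open. One possible rendering, as a sketch only: `renderManifest`, the field names, and the heading layout are all assumptions for illustration, not something the skill prescribes.

```javascript
// Hypothetical renderer for manifest.md; the Markdown layout is an assumption.
function renderManifest(entries) {
  const lines = ["# Manifest", ""];
  for (const entry of entries) {
    lines.push(`## ${entry.bundlePath}`); // copied path inside the bundle
    lines.push(`- Original path: ${entry.originalPath}`);
    lines.push(`- Why it matters: ${entry.why}`);
    // The interpretation note is optional, matching "any brief note".
    if (entry.note) lines.push(`- How to read it: ${entry.note}`);
    lines.push("");
  }
  return lines.join("\n");
}

// Example entry with made-up paths.
const manifest = renderManifest([
  {
    bundlePath: "files/src/core.js",
    originalPath: "src/core.js",
    why: "Contains the function under discussion.",
    note: "Treat as current production behavior.",
  },
]);
console.log(manifest);
```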
+ ### 5. Write prompt.md
+
+ Write the prompt as if briefing a world-class expert who has zero implicit context.
+
+ The prompt should usually include:
+
+ 1. **Role / framing**
+    - who GPT-5.4-Pro should act as for this task
+ 2. **Project or situation context**
+    - what this is, who it serves, and how to think about it
+ 3. **Current state**
+    - what exists today and what is happening now
+ 4. **Objective**
+    - what we need help with
+ 5. **Constraints and tradeoffs**
+    - technical, product, operational, organizational, or personal constraints
+ 6. **What has already been tried or considered**
+    - prior attempts, rejected options, partial work, or known problems
+ 7. **Attached materials**
+    - tell it that files are attached and should be read before answering
+ 8. **Specific request**
+    - the concrete question or task
+ 9. **Desired output shape**
+    - exactly how the answer should be structured
+
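Reviewer note: the nine prompt sections in step 5 come in a fixed order, but the skill only says the prompt should "usually" include them. A sketch of assembling `prompt.md` in that order while reporting gaps instead of failing; `PROMPT_SECTIONS`, `buildPrompt`, and the key names are hypothetical.

```javascript
// Section order copied from step 5 of the skill; the keys are made up here.
const PROMPT_SECTIONS = [
  ["role", "Role / framing"],
  ["context", "Project or situation context"],
  ["currentState", "Current state"],
  ["objective", "Objective"],
  ["constraints", "Constraints and tradeoffs"],
  ["tried", "What has already been tried or considered"],
  ["attachments", "Attached materials"],
  ["request", "Specific request"],
  ["outputShape", "Desired output shape"],
];

// Hypothetical assembler: emits provided sections in order and lists the
// titles of any that are missing so the author can decide whether that is fine.
function buildPrompt(parts) {
  const missing = PROMPT_SECTIONS
    .filter(([key]) => !parts[key])
    .map(([, title]) => title);
  const body = PROMPT_SECTIONS
    .filter(([key]) => parts[key])
    .map(([key, title]) => `## ${title}\n\n${parts[key].trim()}`)
    .join("\n\n");
  return { body, missing };
}

const draft = buildPrompt({
  role: "Act as a senior systems architect.",
  request: "Recommend a migration strategy.",
});
console.log(draft.missing.length);
```

Returning `missing` rather than throwing keeps the "usually" in the skill's wording: a gap is flagged for review, not treated as an error.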
+ ## Prompt Writing Rules
+
+ ### Be Exhaustive About Relevant Context
+
+ Write enough that GPT-5.4-Pro can reason without guessing the basics.
+
+ ### Ask For A Concrete Deliverable
+
+ Do not ask vague questions like "thoughts?"
+
+ Ask for something concrete, such as:
+
+ - a recommendation with reasoning
+ - a detailed architecture proposal
+ - a refactor or migration plan
+ - a critique of the current direction
+ - a better strategy or positioning approach
+ - a decision memo with tradeoffs and risks
+
+ ### Specify The Output Format
+
+ Tell GPT-5.4-Pro how to respond.
+
+ Good example shapes:
+
+ - recommendation first, then reasoning, then alternatives, then risks, then implementation plan
+ - diagnosis, root causes, proposed direction, concrete changes, failure modes, validation plan
+ - executive summary, strategic recommendation, tradeoffs, suggested next steps, open questions
+
+ ### Tell It To Read The Attachments First
+
+ Explicitly instruct it to review the attached files before answering.
+
+ ## Final Handoff To Mark
+
+ When the bundle is ready, report:
+
+ - the bundle path
+ - why AGI-Help was used here
+ - the exact file to paste: `prompt.md`
+ - which files to attach from `files/`
+ - any note about what kind of response will be most useful when Mark pastes it back
+
+ Do not continue into implementation as if GPT-5.4-Pro already answered.
+ Stop and wait for Mark.
+
+ ## After Mark Returns With The Response
+
+ Once Mark pastes the GPT-5.4-Pro response back into the conversation:
+
+ - treat it as a strong external input, not automatic truth
+ - compare it against the actual repo and current state
+ - identify where it fits reality, where it conflicts, and what needs adaptation
+ - turn the useful parts into a concrete plan, decision, or implementation path
+
+ ## Gotchas
+
+ - Do not use this skill just because a task is non-trivial. Use it when answer quality is worth the slower manual loop.
+ - Do not assume GPT-5.4-Pro knows the repo, current state, history, or constraints.
+ - Do not omit relevant files just because they are large.
+ - Do not give GPT-5.4-Pro a vague prompt when a concrete deliverable is needed.
+ - Do not bury the actual question under context; the prompt needs both deep context and a crisp ask.
+ - Do not continue as though the external answer has already arrived.
+
+ ## Keep This Skill Sharp
+
+ - Tighten the trigger description if it fires on normal planning or routine coding tasks.
+ - Add new gotchas when a GPT-5.4-Pro handoff fails because context, constraints, or the requested output shape were incomplete.
+ - If the same bundle structure or prompt sections keep recurring, strengthen this skill around those patterns instead of rediscovering them each time.
@@ -0,0 +1,4 @@
+ interface:
+   display_name: "AGI-Help"
+   short_description: "Prepare a full GPT-5.4-Pro handoff package for high-stakes work"
+   default_prompt: "Use $agi-help when this task is unusually high-stakes, ambiguous, or leverage-heavy and the best next move is to prepare a complete GPT-5.4-Pro handoff package with full relevant context, copied source files, and a strong external prompt for Mark to send manually in ChatGPT."
@@ -88,7 +88,7 @@ A good tracker usually includes:
  - `Decisions`
  - `Notes`

- Use checklists when there are many concrete items. Use timestamped bullets for materially revised state.
+ Use `- [ ]` checkboxes when there are many concrete tasks to track. Use status-style entries when the work is better expressed as phase/state updates than as a task list. Use timestamped bullets for materially revised state.

  ## Step 4: Link It From The Workspace
 
@@ -100,7 +100,7 @@ The tracker should answer "what exactly is happening across the whole workstream
  ## Step 5: Maintain It During Execution

  - Update `last_updated` whenever you materially change the tracker.
- - Mark completed items done instead of deleting the record.
+ - Keep task lists or status entries current instead of deleting history. Mark completed checkbox items as `- [x]`, and update status-style entries when the phase or state changes.
  - Add blockers, new tasks, and verification status as the work evolves.
  - Update the tracker during the work, not only at the end. If a milestone, blocker, review round, or verification result changed reality, the tracker should already reflect it.
  - When the workstream finishes, set `status: done` or `status: archived`.
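Reviewer note: the revised tracker guidance asks for completed checkbox items to be marked `- [x]` rather than deleted. A minimal sketch of that update on tracker text, assuming a hypothetical `markDone` helper that matches a task by its exact label:

```javascript
// Hypothetical helper: flips "- [ ] <task>" to "- [x] <task>" for one task,
// leaving every other line (including already completed items) untouched.
function markDone(markdown, taskText) {
  return markdown
    .split("\n")
    .map((line) => {
      const match = line.match(/^(\s*)- \[ \] (.*)$/);
      return match && match[2].trim() === taskText
        ? `${match[1]}- [x] ${match[2]}`
        : line;
    })
    .join("\n");
}

const tracker = "- [ ] write tests\n- [ ] ship";
console.log(markDone(tracker, "ship"));
```

Because the line is rewritten in place, the record of the task stays in the tracker, which is the point of the changed bullet.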
@@ -9,10 +9,6 @@ max_threads = 24
  description = "Read-only background reviewer for post-commit maintainability drift, dead code, duplication, and refactoring opportunities worth fixing."
  config_file = "agents/code-health-reviewer.toml"

- [agents."coding-agent"]
- description = "Optional workspace-writing implementation helper for bounded coding tasks that should follow the code guide, verify changes, and hand the slice back cleanly."
- config_file = "agents/coding-agent.toml"
-
  [agents."code-reviewer"]
  description = "Read-only background reviewer for post-commit bugs, regressions, and integration mistakes."
  config_file = "agents/code-reviewer.toml"
@@ -1,6 +1,5 @@
  # Waypoint state
  .codex/config.toml
- .codex/agents/coding-agent.toml
  .codex/agents/code-reviewer.toml
  .codex/agents/code-health-reviewer.toml
  .codex/agents/plan-reviewer.toml
@@ -65,11 +65,10 @@ If something important lives only in your head or in the chat transcript, the re
  - Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when durable plans change, and refresh each changed routable doc's `last_updated` field.
  - Rebuild `.waypoint/DOCS_INDEX.md` whenever routable docs change.
  - Rebuild `.waypoint/TRACKS_INDEX.md` whenever tracker files change.
- - Keep most work in the main agent. Use skills, trackers, `coding-agent`, or reviewer agents when they clearly add leverage, not as default ceremony.
+ - Keep most work in the main agent. Use skills, trackers, and reviewer agents when they clearly add leverage, not as default ceremony.
  - Let skills carry their own invocation guidance. The always-on contract should only keep the high-level rule: use repo-local skills deliberately when they help the current task.
- - When spawning `coding-agent`, default to `fork_context: false` and choose the model/reasoning pair that fits the slice. Use stronger models when the delegated slice is user-visible, architecturally important, or hard to unwind.
- - When spawning reviewer agents or other non-`coding-agent` subagents, explicitly set `fork_context: false` and choose the model/reasoning pair that matches the risk and importance of the second pass.
  - Use the repo-local skills and reviewer agents deliberately, but do not underuse them on work that is expensive to get wrong.
+ - When spawning reviewer agents or other subagents, explicitly set `fork_context: false` and choose the model/reasoning pair that matches the risk and importance of the second pass.
  - For non-trivial work, strongly prefer reviewer-agent passes between major implementation milestones, before opening or updating a PR, after fixing substantial findings, and before final closeout when the environment allows those agents to run.
  - If you created a PR earlier in the current session and need to push more work, first confirm that PR is still open. If it is closed, create a fresh branch from `origin/main` and open a fresh PR instead of pushing more commits to the old PR branch.
  - Treat reviewer agents as one-shot workers: once a reviewer returns findings, read the result and close it. If another review pass is needed later, spawn a fresh reviewer instead of reusing the same thread.
@@ -118,7 +117,6 @@ Do not document every trivial implementation detail. Document the non-obvious, d

  Waypoint scaffolds these focused specialists by default:

- - `coding-agent` for bounded implementation slices the main agent deliberately wants to hand off because parallelism or context preservation will clearly help
  - `code-reviewer` for correctness and regression review
  - `code-health-reviewer` for maintainability drift
  - `plan-reviewer` to challenge non-trivial implementation plans when an independent second pass would materially improve the result
@@ -101,7 +101,7 @@ Working rules:
  - Update user-scoped `AGENTS.md` when you learn a durable preference, standing rule, or default that should apply across projects and your environment allows you to edit that file
  - Update the project-scoped repo `AGENTS.md` when you learn durable repo truth, project constraints, or stable project-specific collaboration rules
  - Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when a durable plan changes, and refresh `last_updated` on touched routable docs
- - Keep most work in the main agent. Use repo-local skills, trackers, reviewer agents, or `coding-agent` when they create clear leverage, not as default ceremony.
+ - Keep most work in the main agent. Use repo-local skills, trackers, and reviewer agents when they create clear leverage, not as default ceremony.
  - Let repo-local skills describe their own triggers. The managed block should keep only the high-level rule: use those tools deliberately when they clearly help the task.
  - Use reviewer agents proactively at meaningful milestones when the work is non-trivial, risky, user-facing, merge-bound, or otherwise expensive to get wrong.
  - Strong default moments for reviewer-agent passes are: after a meaningful implementation milestone, before opening or updating a PR, after fixing substantial review findings, and before finally calling the work clear.
@@ -1,49 +0,0 @@
- model = "gpt-5.4-mini"
- model_reasoning_effort = "high"
- sandbox_mode = "workspace-write"
- developer_instructions = """
- Read these files in order before doing anything else:
- 1. .waypoint/SOUL.md
- 2. .waypoint/agent-operating-manual.md
- 3. .waypoint/WORKSPACE.md
- 4. .waypoint/context/MANIFEST.md
- 5. every file listed in that manifest
- 6. .waypoint/docs/code-guide.md
-
- After reading them, follow these operating instructions:
-
- You are the coding agent. You implement a bounded slice that the main agent deliberately handed to you because delegation is useful here.
-
- You are a single-slice execution worker:
- - Finish the specific implementation task you were handed, then stop.
- - Do not silently broaden scope.
- - If the handoff is missing something essential, inspect the repo first and make the narrowest reasonable assumption instead of bouncing the task back immediately.
-
- Working rules:
- - Read the files you plan to change before editing them.
- - Read the docs relevant to the area you touch.
- - Follow `.waypoint/docs/code-guide.md` as an active contract, not background reading.
- - Prefer direct, explicit code over speculative abstraction.
- - Keep changes reviewable and easy to explain.
- - When the handed-off slice involves a bug or broken behavior, investigate first and explain the likely cause plainly in your handoff.
- - Do not revert user work you did not create.
- - Do not make unrelated cleanup changes unless they are required to land the slice safely.
-
- Implementation expectations:
- - Resolve the exact files and entry points involved in the handed-off slice before editing.
- - Validate inputs, state changes, and integration points at the boundaries you touch.
- - Update tests when behavior changes.
- - Update docs or workspace state when the slice changes durable behavior or repo memory.
- - Run the most relevant verification you can for the owned slice before handing back.
- - If verification fails, keep iterating when you can fix it yourself.
-
- Output:
- Return a concise implementation handoff that includes:
- - what you changed
- - files changed
- - verification run and outcome
- - any assumptions, follow-ups, or risks the main agent should know about
-
- Do not pretend the slice is verified if you did not run verification.
- Do not use readiness-disclaimer language as the main point of the handoff. If important known issues remain, say what they are, why they matter, and what needs to happen next.
- """