npm - waypoint-codex - Versions diffs - 0.16.0 → 0.18.0 - Mend

waypoint-codex 0.16.0 → 0.18.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (11) hide show

package/README.md +9 -0
package/dist/src/core.js +1 -0
package/package.json +2 -2
package/templates/.agents/skills/agi-help/SKILL.md +259 -0
package/templates/.agents/skills/agi-help/agents/openai.yaml +4 -0
package/templates/.agents/skills/work-tracker/SKILL.md +2 -2
package/templates/.codex/config.toml +0 -4
package/templates/.gitignore.snippet +0 -1
package/templates/.waypoint/agent-operating-manual.md +2 -4
package/templates/managed-agents-block.md +1 -1
package/templates/.codex/agents/coding-agent.toml +0 -49

package/README.md CHANGED Viewed

@@ -31,10 +31,18 @@ Or try it without a global install:
 npx waypoint-codex@latest --help
 ```
+Inside the repo you want to prepare for Codex:
+```bash
+waypoint init
+waypoint doctor
+```
 Keep an existing repo up to date:
 ```bash
 waypoint upgrade
+waypoint doctor
 ```
 ## What gets better
@@ -261,6 +269,7 @@ Waypoint ships a strong default skill pack for real coding work:
 - `workspace-compress`
 - `pre-pr-hygiene`
 - `pr-review`
+- `agi-help`
 These are repo-local, so the workflow travels with the project.

package/dist/src/core.js CHANGED Viewed

@@ -347,6 +347,7 @@ export function initRepository(projectRoot, options) {
         "docs/README.md",
         "docs/code-guide.md",
         "docs/legacy-import",
+        ".codex/agents/coding-agent.toml",
         ".agents/skills/error-audit",
         ".agents/skills/observability-audit",
         ".agents/skills/ux-states-audit",

package/package.json CHANGED Viewed

@@ -1,7 +1,7 @@
 {
   "name": "waypoint-codex",
-  "version": "0.16.0",
-  "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
+  "version": "0.18.0",
+  "description": "Make Codex better by default with stronger planning, code quality, reviews, tracking, and repo guidance.",
   "license": "MIT",
   "type": "module",
   "bin": {

package/templates/.agents/skills/agi-help/SKILL.md ADDED Viewed

@@ -0,0 +1,259 @@
+---
+name: agi-help
+description: Prepare a complete external handoff package for GPT-5.4-Pro in ChatGPT when a task is unusually high-stakes, ambiguous, leverage-heavy, or quality-sensitive and one excellent answer is worth a slower manual loop. Use for greenfield project starts, major refactors, architecture rethinks, migration strategy, big-feature planning, hard product or strategy decisions, and other work where the external model needs full relevant context because it has no access to the repo, files, history, or local tools.
+---
+# AGI-Help
+Use this skill to prepare a high-quality manual handoff for GPT-5.4-Pro.
+GPT-5.4-Pro is an external thinking partner, not a connected coding agent. It cannot see the repo, local files, prior discussion, current state, or failed attempts unless you package that context for Mark to send manually in ChatGPT.
+The job of this skill is to create a complete handoff bundle that gives GPT-5.4-Pro the best possible chance of producing one exceptional answer.
+## What This Skill Owns
+This skill owns the preparation step:
+- decide whether GPT-5.4-Pro is justified for this task
+- collect all relevant context in full
+- copy the relevant files into a temporary handoff folder
+- write a strong prompt for an external model with zero local context
+- tell Mark exactly what to send
+- stop and wait for the external response
+This skill does **not** send anything itself.
+## When To Use This Skill
+Use AGI-Help when:
+- the task is high-stakes and a weak answer would be costly
+- one strong answer is more valuable than a fast back-and-forth loop
+- deep synthesis, architecture judgment, strategy, or reframing quality matters more than local tool execution speed
+- the task is large enough or important enough that a manual GPT-5.4-Pro pass is worth 20-50 minutes
+Typical examples:
+- starting a project from scratch
+- major refactors or system redesigns
+- architecture or migration strategy
+- planning a large feature or multi-phase initiative
+- resolving hard tradeoffs across product, UX, engineering, and operations
+- reshaping positioning, messaging, or strategy where synthesis quality matters a lot
+- any other difficult task where Mark explicitly wants the strongest available single response
+## When Not To Use This Skill
+Do not use it for:
+- small or routine edits
+- local debugging where filesystem access matters more than abstract reasoning
+- simple implementation tasks that are already clear
+- requests where a normal answer or normal planning pass is sufficient
+## Output
+Create a handoff bundle at one of these locations:
+- prefer `tmp/agi-help/<timestamp>/` inside the current workspace when that is practical
+- otherwise use `~/.codex/tmp/agi-help/<timestamp>/`
+The bundle should contain:
+```text
+tmp/agi-help/<timestamp>/
+├── prompt.md
+├── manifest.md
+├── request-summary.md
+└── files/
+    └── ...copied source files...
+```
+### prompt.md
+The exact prompt Mark should paste into GPT-5.4-Pro.
+### manifest.md
+A file-by-file list of what is included and why each file matters.
+### request-summary.md
+A short operator note for Mark that explains:
+- why AGI-Help was used here
+- what to paste
+- which files to attach
+- what kind of answer to ask for
+### files/
+Copies of every relevant file that should be attached.
+## Core Rule: Include All Relevant Context
+Do not optimize for brevity by dropping relevant material.
+If a file is relevant, include it in full.
+If multiple files are relevant, include all of them.
+If prior plans, failed attempts, docs, architecture notes, or state files materially change the answer, include them too.
+The bottleneck here is not token thrift inside Codex. The bottleneck is giving GPT-5.4-Pro enough real context to reason well.
+Curate relevance aggressively. Compress relevance only by excluding things that truly do not matter.
+Do **not** compress relevant context just because it is long.
+## Workflow
+### 1. Justify AGI-Help
+Before building the bundle, write 3-6 bullets in `request-summary.md` explaining why GPT-5.4-Pro is warranted here.
+Focus on why:
+- the task is unusually important, difficult, or leverage-heavy
+- the answer needs deep synthesis, design judgment, or strategy
+- normal local iteration is likely to be weaker than one high-quality external pass
+### 2. Reconstruct The Full Situation
+Assume GPT-5.4-Pro knows nothing.
+Collect the context it would need to reason well, such as:
+- what the project, company, system, or situation is
+- who the users are
+- what the current state is
+- what we want to achieve
+- why this matters now
+- what constraints exist
+- what tradeoffs matter
+- what has already been tried
+- what is blocked, unclear, risky, or contentious
+- what a successful answer would help us decide or do next
+This is not a fixed checklist. Include whatever materially changes the quality of the answer.
+### 3. Copy The Relevant Files
+Create `files/` and copy in the relevant source material.
+Examples of relevant files:
+- core implementation files
+- architecture docs
+- plans
+- tracker files
+- config files
+- failing or partial implementations
+- screenshots or exported artifacts when available through the current tool surface
+- strategy docs, briefs, drafts, notes, or prior outputs that define the problem
+Preserve relative structure inside `files/` when it helps orientation.
+### 4. Write manifest.md
+For each included file, list:
+- copied path inside the bundle
+- original path
+- why the file matters
+- any brief note about how GPT-5.4-Pro should interpret it
+Keep this concise but useful.
+### 5. Write prompt.md
+Write the prompt as if briefing a world-class expert who has zero implicit context.
+The prompt should usually include:
+1. **Role / framing**
+   - who GPT-5.4-Pro should act as for this task
+2. **Project or situation context**
+   - what this is, who it serves, and how to think about it
+3. **Current state**
+   - what exists today and what is happening now
+4. **Objective**
+   - what we need help with
+5. **Constraints and tradeoffs**
+   - technical, product, operational, organizational, or personal constraints
+6. **What has already been tried or considered**
+   - prior attempts, rejected options, partial work, or known problems
+7. **Attached materials**
+   - tell it that files are attached and should be read before answering
+8. **Specific request**
+   - the concrete question or task
+9. **Desired output shape**
+   - exactly how the answer should be structured
+## Prompt Writing Rules
+### Be Exhaustive About Relevant Context
+Write enough that GPT-5.4-Pro can reason without guessing the basics.
+### Ask For A Concrete Deliverable
+Do not ask vague questions like "thoughts?"
+Ask for something concrete, such as:
+- a recommendation with reasoning
+- a detailed architecture proposal
+- a refactor or migration plan
+- a critique of the current direction
+- a better strategy or positioning approach
+- a decision memo with tradeoffs and risks
+### Specify The Output Format
+Tell GPT-5.4-Pro how to respond.
+Good example shapes:
+- recommendation first, then reasoning, then alternatives, then risks, then implementation plan
+- diagnosis, root causes, proposed direction, concrete changes, failure modes, validation plan
+- executive summary, strategic recommendation, tradeoffs, suggested next steps, open questions
+### Tell It To Read The Attachments First
+Explicitly instruct it to review the attached files before answering.
+## Final Handoff To Mark
+When the bundle is ready, report:
+- the bundle path
+- why AGI-Help was used here
+- the exact file to paste: `prompt.md`
+- which files to attach from `files/`
+- any note about what kind of response will be most useful when Mark pastes it back
+Do not continue into implementation as if GPT-5.4-Pro already answered.
+Stop and wait for Mark.
+## After Mark Returns With The Response
+Once Mark pastes the GPT-5.4-Pro response back into the conversation:
+- treat it as a strong external input, not automatic truth
+- compare it against the actual repo and current state
+- identify where it fits reality, where it conflicts, and what needs adaptation
+- turn the useful parts into a concrete plan, decision, or implementation path
+## Gotchas
+- Do not use this skill just because a task is non-trivial. Use it when answer quality is worth the slower manual loop.
+- Do not assume GPT-5.4-Pro knows the repo, current state, history, or constraints.
+- Do not omit relevant files just because they are large.
+- Do not give GPT-5.4-Pro a vague prompt when a concrete deliverable is needed.
+- Do not bury the actual question under context; the prompt needs both deep context and a crisp ask.
+- Do not continue as though the external answer has already arrived.
+## Keep This Skill Sharp
+- Tighten the trigger description if it fires on normal planning or routine coding tasks.
+- Add new gotchas when a GPT-5.4-Pro handoff fails because context, constraints, or the requested output shape were incomplete.
+- If the same bundle structure or prompt sections keep recurring, strengthen this skill around those patterns instead of rediscovering them each time.

package/templates/.agents/skills/agi-help/agents/openai.yaml ADDED Viewed

@@ -0,0 +1,4 @@
+interface:
+  display_name: "AGI-Help"
+  short_description: "Prepare a full GPT-5.4-Pro handoff package for high-stakes work"
+  default_prompt: "Use $agi-help when this task is unusually high-stakes, ambiguous, or leverage-heavy and the best next move is to prepare a complete GPT-5.4-Pro handoff package with full relevant context, copied source files, and a strong external prompt for Mark to send manually in ChatGPT."

package/templates/.agents/skills/work-tracker/SKILL.md CHANGED Viewed

@@ -88,7 +88,7 @@ A good tracker usually includes:
 - `Decisions`
 - `Notes`
-Use checklists when there are many concrete items. Use timestamped bullets for materially revised state.
+Use `- [ ]` checkboxes when there are many concrete tasks to track. Use status-style entries when the work is better expressed as phase/state updates than as a task list. Use timestamped bullets for materially revised state.
 ## Step 4: Link It From The Workspace
@@ -100,7 +100,7 @@ The tracker should answer "what exactly is happening across the whole workstream
 ## Step 5: Maintain It During Execution
 - Update `last_updated` whenever you materially change the tracker.
-- Mark completed items done instead of deleting the record.
+- Keep task lists or status entries current instead of deleting history. Mark completed checkbox items as `- [x]`, and update status-style entries when the phase or state changes.
 - Add blockers, new tasks, and verification status as the work evolves.
 - Update the tracker during the work, not only at the end. If a milestone, blocker, review round, or verification result changed reality, the tracker should already reflect it.
 - When the workstream finishes, set `status: done` or `status: archived`.

package/templates/.codex/config.toml CHANGED Viewed

@@ -9,10 +9,6 @@ max_threads = 24
 description = "Read-only background reviewer for post-commit maintainability drift, dead code, duplication, and refactoring opportunities worth fixing."
 config_file = "agents/code-health-reviewer.toml"
-[agents."coding-agent"]
-description = "Optional workspace-writing implementation helper for bounded coding tasks that should follow the code guide, verify changes, and hand the slice back cleanly."
-config_file = "agents/coding-agent.toml"
 [agents."code-reviewer"]
 description = "Read-only background reviewer for post-commit bugs, regressions, and integration mistakes."
 config_file = "agents/code-reviewer.toml"

package/templates/.gitignore.snippet CHANGED Viewed

@@ -1,6 +1,5 @@
 # Waypoint state
 .codex/config.toml
-.codex/agents/coding-agent.toml
 .codex/agents/code-reviewer.toml
 .codex/agents/code-health-reviewer.toml
 .codex/agents/plan-reviewer.toml

package/templates/.waypoint/agent-operating-manual.md CHANGED Viewed

@@ -65,11 +65,10 @@ If something important lives only in your head or in the chat transcript, the re
 - Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when durable plans change, and refresh each changed routable doc's `last_updated` field.
 - Rebuild `.waypoint/DOCS_INDEX.md` whenever routable docs change.
 - Rebuild `.waypoint/TRACKS_INDEX.md` whenever tracker files change.
-- Keep most work in the main agent. Use skills, trackers, `coding-agent`, or reviewer agents when they clearly add leverage, not as default ceremony.
+- Keep most work in the main agent. Use skills, trackers, and reviewer agents when they clearly add leverage, not as default ceremony.
 - Let skills carry their own invocation guidance. The always-on contract should only keep the high-level rule: use repo-local skills deliberately when they help the current task.
-- When spawning `coding-agent`, default to `fork_context: false` and choose the model/reasoning pair that fits the slice. Use stronger models when the delegated slice is user-visible, architecturally important, or hard to unwind.
-- When spawning reviewer agents or other non-`coding-agent` subagents, explicitly set `fork_context: false` and choose the model/reasoning pair that matches the risk and importance of the second pass.
 - Use the repo-local skills and reviewer agents deliberately, but do not underuse them on work that is expensive to get wrong.
+- When spawning reviewer agents or other subagents, explicitly set `fork_context: false` and choose the model/reasoning pair that matches the risk and importance of the second pass.
 - For non-trivial work, strongly prefer reviewer-agent passes between major implementation milestones, before opening or updating a PR, after fixing substantial findings, and before final closeout when the environment allows those agents to run.
 - If you created a PR earlier in the current session and need to push more work, first confirm that PR is still open. If it is closed, create a fresh branch from `origin/main` and open a fresh PR instead of pushing more commits to the old PR branch.
 - Treat reviewer agents as one-shot workers: once a reviewer returns findings, read the result and close it. If another review pass is needed later, spawn a fresh reviewer instead of reusing the same thread.
@@ -118,7 +117,6 @@ Do not document every trivial implementation detail. Document the non-obvious, d
 Waypoint scaffolds these focused specialists by default:
-- `coding-agent` for bounded implementation slices the main agent deliberately wants to hand off because parallelism or context preservation will clearly help
 - `code-reviewer` for correctness and regression review
 - `code-health-reviewer` for maintainability drift
 - `plan-reviewer` to challenge non-trivial implementation plans when an independent second pass would materially improve the result

package/templates/managed-agents-block.md CHANGED Viewed

@@ -101,7 +101,7 @@ Working rules:
 - Update user-scoped `AGENTS.md` when you learn a durable preference, standing rule, or default that should apply across projects and your environment allows you to edit that file
 - Update the project-scoped repo `AGENTS.md` when you learn durable repo truth, project constraints, or stable project-specific collaboration rules
 - Update `.waypoint/docs/` when durable project knowledge changes, update `.waypoint/plans/` when a durable plan changes, and refresh `last_updated` on touched routable docs
-- Keep most work in the main agent. Use repo-local skills, trackers, reviewer agents, or `coding-agent` when they create clear leverage, not as default ceremony.
+- Keep most work in the main agent. Use repo-local skills, trackers, and reviewer agents when they create clear leverage, not as default ceremony.
 - Let repo-local skills describe their own triggers. The managed block should keep only the high-level rule: use those tools deliberately when they clearly help the task.
 - Use reviewer agents proactively at meaningful milestones when the work is non-trivial, risky, user-facing, merge-bound, or otherwise expensive to get wrong.
 - Strong default moments for reviewer-agent passes are: after a meaningful implementation milestone, before opening or updating a PR, after fixing substantial review findings, and before finally calling the work clear.

package/templates/.codex/agents/coding-agent.toml DELETED Viewed

@@ -1,49 +0,0 @@
-model = "gpt-5.4-mini"
-model_reasoning_effort = "high"
-sandbox_mode = "workspace-write"
-developer_instructions = """
-Read these files in order before doing anything else:
-1. .waypoint/SOUL.md
-2. .waypoint/agent-operating-manual.md
-3. .waypoint/WORKSPACE.md
-4. .waypoint/context/MANIFEST.md
-5. every file listed in that manifest
-6. .waypoint/docs/code-guide.md
-After reading them, follow these operating instructions:
-You are the coding agent. You implement a bounded slice that the main agent deliberately handed to you because delegation is useful here.
-You are a single-slice execution worker:
-- Finish the specific implementation task you were handed, then stop.
-- Do not silently broaden scope.
-- If the handoff is missing something essential, inspect the repo first and make the narrowest reasonable assumption instead of bouncing the task back immediately.
-Working rules:
-- Read the files you plan to change before editing them.
-- Read the docs relevant to the area you touch.
-- Follow `.waypoint/docs/code-guide.md` as an active contract, not background reading.
-- Prefer direct, explicit code over speculative abstraction.
-- Keep changes reviewable and easy to explain.
-- When the handed-off slice involves a bug or broken behavior, investigate first and explain the likely cause plainly in your handoff.
-- Do not revert user work you did not create.
-- Do not make unrelated cleanup changes unless they are required to land the slice safely.
-Implementation expectations:
-- Resolve the exact files and entry points involved in the handed-off slice before editing.
-- Validate inputs, state changes, and integration points at the boundaries you touch.
-- Update tests when behavior changes.
-- Update docs or workspace state when the slice changes durable behavior or repo memory.
-- Run the most relevant verification you can for the owned slice before handing back.
-- If verification fails, keep iterating when you can fix it yourself.
-Output:
-Return a concise implementation handoff that includes:
-- what you changed
-- files changed
-- verification run and outcome
-- any assumptions, follow-ups, or risks the main agent should know about
-Do not pretend the slice is verified if you did not run verification.
-Do not use readiness-disclaimer language as the main point of the handoff. If important known issues remain, say what they are, why they matter, and what needs to happen next.
-"""