waypoint-codex 0.6.0 → 0.7.0

package/README.md CHANGED
@@ -2,58 +2,128 @@
2
2
 
3
3
  Waypoint is a docs-first repository operating system for Codex.
4
4
 
5
- It helps the next agent pick up your repo with full context by keeping the important things in markdown files inside the repo:
5
+ It helps the next agent understand your repo by making the important context live in the repo itself instead of disappearing into chat history.
6
6
 
7
- - `AGENTS.md` for startup instructions
8
- - `.waypoint/WORKSPACE.md` for live state, with timestamped multi-topic entries
9
- - `.waypoint/docs/` for durable project memory, with `summary`, `last_updated`, and `read_when` frontmatter on routable docs
7
+ ## Why people use it
8
+
9
+ Most agent workflows break down the same way:
10
+
11
+ - the next session starts half-blind
12
+ - important project docs exist, but the agent does not know which ones matter
13
+ - workspace notes turn into noisy append-only logs
14
+ - repo conventions live in people's heads instead of files
15
+ - review and cleanup happen inconsistently
16
+
17
+ Waypoint gives you a lightweight repo contract that fixes those problems with explicit files, generated context, and a strong default skill set.
18
+
19
+ ## What Waypoint sets up
20
+
21
+ Waypoint scaffolds a Codex-friendly repo structure built around a few core pieces:
22
+
23
+ - `AGENTS.md` for the startup contract
24
+ - `.waypoint/WORKSPACE.md` for live operational state
25
+ - `.waypoint/docs/` for durable project memory
10
26
  - `.waypoint/DOCS_INDEX.md` for docs routing
11
- - repo-local skills for planning, audits, verification, workspace compression, and review closure
27
+ - `.waypoint/context/` for generated startup context
28
+ - `.agents/skills/` for repo-local workflows like planning, audits, and QA
29
+
30
+ The philosophy is simple:
31
+
32
+ - less hidden runtime magic
33
+ - more repo-local state
34
+ - more markdown
35
+ - better continuity for the next agent
36
+
37
+ ## Best fit
38
+
39
+ Waypoint is most useful when you want:
40
+
41
+ - multi-session continuity in a real repo
42
+ - a durable docs and workspace structure for agents
43
+ - stronger planning, review, QA, and closeout defaults
44
+ - repo-local scaffolding instead of a bunch of global mystery behavior
45
+
46
+ If you only use Codex for tiny one-off edits, Waypoint is probably unnecessary.
12
47
 
13
48
  ## Install
14
49
 
50
+ Waypoint requires Node 20+.
51
+
15
52
  ```bash
16
53
  npm install -g waypoint-codex
17
54
  ```
18
55
 
19
- Or use it without installing globally:
56
+ Or run it without a global install:
20
57
 
21
58
  ```bash
22
59
  npx waypoint-codex@latest --help
23
60
  ```
24
61
 
25
- ## Start using it
62
+ ## Quick start
26
63
 
27
- Inside your Codex project:
64
+ Inside the repo you want to prepare for Codex:
28
65
 
29
66
  ```bash
30
- waypoint init --with-automations --with-roles
67
+ waypoint init --with-roles --with-automations
31
68
  waypoint doctor
32
69
  ```
33
70
 
34
- That scaffolds:
71
+ That gives you a repo that looks roughly like this:
35
72
 
36
73
  ```text
37
74
  repo/
38
75
  ├── AGENTS.md
39
- ├── .agents/skills/
76
+ ├── .agents/
77
+ │   └── skills/
40
78
  └── .waypoint/
41
79
      ├── WORKSPACE.md
42
80
      ├── DOCS_INDEX.md
43
81
      ├── docs/
44
82
      ├── context/
83
+     ├── scripts/
45
84
      └── ...
46
85
  ```
47
86
 
87
+ From there, start your Codex session in the repo and follow the generated bootstrap in `AGENTS.md`.
88
+
89
+ ## Common init modes
90
+
91
+ ### Minimal setup
92
+
93
+ ```bash
94
+ waypoint init
95
+ ```
96
+
97
+ ### Full local workflow setup
98
+
99
+ ```bash
100
+ waypoint init --with-roles --with-rules --with-automations
101
+ ```
102
+
103
+ ### App-friendly profile
104
+
105
+ ```bash
106
+ waypoint init --app-friendly --with-roles --with-automations
107
+ ```
108
+
109
+ Flags you can combine:
110
+
111
+ - `--app-friendly`
112
+ - `--with-roles`
113
+ - `--with-rules`
114
+ - `--with-automations`
115
+
48
116
  ## Main commands
49
117
 
50
118
  - `waypoint init` — scaffold or refresh the repo
51
- - `waypoint doctor` — check for drift and missing pieces
52
- - `waypoint sync` — rebuild `.waypoint/DOCS_INDEX.md` and sync optional automations/rules
53
- - `waypoint upgrade` — update the global Waypoint CLI and refresh the current repo with its existing config
54
- - `waypoint import-legacy` — import from an older repo layout
119
+ - `waypoint doctor` — validate health and report drift
120
+ - `waypoint sync` — rebuild the docs index and sync optional user-home artifacts
121
+ - `waypoint upgrade` — update the CLI and refresh the current repo using its saved config
122
+ - `waypoint import-legacy` — analyze an older repo layout and produce an adoption report
55
123
 
56
- ## Shipped skills
124
+ ## Built-in skills
125
+
126
+ Waypoint ships a strong default skill pack for real coding work:
57
127
 
58
128
  - `planning`
59
129
  - `error-audit`
@@ -65,7 +135,8 @@ repo/
65
135
  - `workspace-compress`
66
136
  - `pre-pr-hygiene`
67
137
  - `pr-review`
68
- - `e2e-verify`
138
+
139
+ These are repo-local, so the workflow travels with the project.
69
140
 
70
141
  ## Optional reviewer roles
71
142
 
@@ -75,23 +146,54 @@ If you initialize with `--with-roles`, Waypoint scaffolds:
75
146
  - `code-reviewer`
76
147
  - `plan-reviewer`
77
148
 
78
- The intended workflow is post-commit: after your own commit lands, run `code-reviewer` and `code-health-reviewer` in parallel in the background, then fix real findings before you call the work finished.
149
+ The intended workflow is chunk-based: once there is a meaningful reviewable slice, run the reviewers in parallel, fix real findings, then close out. A recent self-authored commit is the preferred scope anchor when one cleanly represents the slice, but it is not the only valid trigger.
150
+
151
+ ## What makes it different
152
+
153
+ Waypoint is not trying to hide everything behind hooks and background machinery.
154
+
155
+ It is opinionated, but explicit:
79
156
 
80
- ## Update
157
+ - state lives in files you can inspect
158
+ - docs routing is generated, not guessed from memory
159
+ - repo conventions are encoded in markdown
160
+ - startup context is rebuilt on purpose
161
+ - the repo remains the source of truth
162
+
163
+ ## Upgrading
164
+
165
+ Recommended path:
81
166
 
82
167
  ```bash
83
168
  waypoint upgrade
84
169
  ```
85
170
 
86
- If you only want to update the CLI without refreshing the repo:
171
+ That updates the global CLI and refreshes the current repo using its existing Waypoint config.
172
+
173
+ If you only want to update the CLI:
87
174
 
88
175
  ```bash
89
176
  waypoint upgrade --skip-repo-refresh
90
177
  ```
91
178
 
179
+ ## Importing an existing repo
180
+
181
+ If you already have an older assistant setup or repo-memory system:
182
+
183
+ ```bash
184
+ waypoint import-legacy /path/to/source-repo /path/to/new-repo --init-target
185
+ ```
186
+
187
+ This generates an adoption report and helps separate durable docs from old runtime-specific scaffolding.
188
+
92
189
  ## Learn more
93
190
 
94
191
  - [Overview](docs/overview.md)
95
192
  - [Architecture](docs/architecture.md)
96
193
  - [Upgrading](docs/upgrading.md)
97
194
  - [Importing Existing Repositories](docs/importing-existing-repos.md)
195
+ - [Releasing](docs/releasing.md)
196
+
197
+ ## License
198
+
199
+ MIT. See [LICENSE](LICENSE).
package/dist/src/core.js CHANGED
@@ -459,7 +459,6 @@ export function doctorRepository(projectRoot) {
459
459
  "workspace-compress",
460
460
  "pre-pr-hygiene",
461
461
  "pr-review",
462
- "e2e-verify",
463
462
  ]) {
464
463
  const skillPath = path.join(projectRoot, ".agents/skills", skillName, "SKILL.md");
465
464
  if (!existsSync(skillPath)) {
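The hunk above shows only a fragment of `doctorRepository`. The check it appears to perform can be sketched as below, assuming each skill is a directory under `.agents/skills/` that contains a `SKILL.md`. `findMissingSkills` and `expectedSkills` are hypothetical names used for illustration, not the package's actual API, and the sketch is written as CommonJS so it runs as a standalone script even though the package itself is an ES module.

```javascript
const { existsSync } = require("node:fs");
const path = require("node:path");

// Hypothetical helper: returns the skill names whose SKILL.md is absent.
// Uses the path shape visible in the hunk:
//   <projectRoot>/.agents/skills/<skillName>/SKILL.md
function findMissingSkills(projectRoot, skillNames) {
  const missing = [];
  for (const skillName of skillNames) {
    const skillPath = path.join(projectRoot, ".agents/skills", skillName, "SKILL.md");
    if (!existsSync(skillPath)) {
      missing.push(skillName);
    }
  }
  return missing;
}

// Example list, limited to names that appear in the hunk context.
// `e2e-verify` is deliberately absent because 0.7.0 removes it from the check.
const expectedSkills = ["workspace-compress", "pre-pr-hygiene", "pr-review"];
console.log(findMissingSkills(process.cwd(), expectedSkills));
```

Run from a repo root, this prints the subset of `expectedSkills` that has no `SKILL.md` on disk. How the real `doctorRepository` reports those gaps is not visible in this diff.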
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "waypoint-codex",
3
- "version": "0.6.0",
3
+ "version": "0.7.0",
4
4
  "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
5
5
  "license": "MIT",
6
6
  "type": "module",
@@ -7,12 +7,21 @@ description: Verify a user-facing feature by trying to break it on purpose inste
7
7
 
8
8
  Use this skill to attack the feature like an impatient, confused, or careless user.
9
9
 
10
- This is not the same as `e2e-verify`.
10
+ This skill is for adversarial manual QA. It tries to make the feature fail through invalid, interrupted, stale, repeated, or out-of-order interactions instead of only proving the happy path works.
11
11
 
12
- - `e2e-verify` proves the intended flow works end to end.
13
- - `break-it-qa` tries to make the feature fail through invalid, interrupted, stale, repeated, or out-of-order interactions.
12
+ ## Step 1: Ask The Three Setup Questions
14
13
 
15
- ## Read First
14
+ Before testing, ask the user these questions if the answer is not already clear from context:
15
+
16
+ - what exact feature or scope should this cover?
17
+ - how many attack items should the break log reach before stopping?
18
+ - should the skill stop at findings or also fix clear issues after they are found?
19
+
20
+ Keep this intake short. These are the main user-controlled knobs for the skill.
21
+
22
+ If the user does not specify a count, use a reasonable default such as `40`.
23
+
24
+ ## Step 2: Read First
16
25
 
17
26
  Before verification:
18
27
 
@@ -23,20 +32,94 @@ Before verification:
23
32
  5. Read every file listed in that manifest
24
33
  6. Read the routed docs or nearby code that define the feature being tested
25
34
 
26
- ## Step 1: Identify Break Surfaces
35
+ ## Step 3: Identify Break Surfaces
27
36
 
28
37
  - Identify the happy path first so you know what "broken" means.
29
38
  - Find the fragile surfaces: forms, wizards, pending states, destructive actions, async transitions, navigation changes, and persisted state.
39
+ - For each major step or transition, ask explicit "What if...?" questions before testing. Examples:
40
+   - What if the user refreshes here?
41
+   - What if they go back now?
42
+   - What if they click twice?
43
+   - What if this input is empty, malformed, too long, or contradictory?
44
+   - What if this action succeeds in the UI but fails in persistence?
30
45
 
31
46
  Do not test blindly.
32
47
 
33
- ## Step 2: Use The Real UI
48
+ ## Step 4: Create A Break Log
49
+
50
+ Write or update a durable markdown log under `.waypoint/docs/`.
51
+
52
+ - Prefer a focused path such as `.waypoint/docs/verification/<feature>-break-it-qa.md`.
53
+ - If a routed verification doc already exists for this feature, update it instead of creating a competing file.
54
+ - The log is part of the skill, not an optional extra.
55
+ - Pre-generate the attack plan in this log before executing it. Do not improvise everything live.
56
+
57
+ Use one item per attempted action. A good entry shape is:
58
+
59
+ ```markdown
60
+ - [ ] What if the user refreshes on the confirmation step before the request finishes?
61
+ Step: confirmation
62
+ Category: navigation
63
+ Status: pending
64
+ Observed: not tried yet
65
+ ```
66
+
67
+ Then update each item as you go:
68
+
69
+ - `survived`
70
+ - `broke`
71
+ - `fixed`
72
+ - `retested-survived`
73
+ - `blocked`
74
+ - `not-applicable`
75
+
76
+ Every executed item must include:
77
+
78
+ - `Step`
79
+ - `Category`
80
+ - `Status`
81
+ - `Observed`
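The entry shape, status values, and required fields above can be checked mechanically. The following is a rough sketch, assuming each entry is a checkbox line followed by `Key: value` lines. `parseBreakLog` and `validateEntry` are hypothetical helpers and are not shipped by Waypoint.

```javascript
// Status values listed in the skill text. "pending" is the pre-execution
// state shown in the example entry, so it is accepted separately below.
const EXECUTED_STATUSES = [
  "survived", "broke", "fixed", "retested-survived", "blocked", "not-applicable",
];
const REQUIRED_FIELDS = ["Step", "Category", "Status", "Observed"];

// Hypothetical parser: splits a break log into { question, fields } entries.
function parseBreakLog(markdown) {
  const entries = [];
  let current = null;
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    const item = line.match(/^- \[[ xX]\] (.+)$/);
    if (item) {
      current = { question: item[1], fields: {} };
      entries.push(current);
      continue;
    }
    const field = line.match(/^([A-Za-z]+):\s*(.*)$/);
    if (field && current) {
      current.fields[field[1]] = field[2];
    }
  }
  return entries;
}

// Returns a list of problems for one entry; an empty list means it passes.
function validateEntry(entry) {
  const problems = [];
  for (const name of REQUIRED_FIELDS) {
    if (!entry.fields[name]) problems.push(`missing ${name}`);
  }
  const status = entry.fields.Status;
  if (status && status !== "pending" && !EXECUTED_STATUSES.includes(status)) {
    problems.push(`unknown status: ${status}`);
  }
  return problems;
}

const sample = [
  "- [ ] What if the user refreshes on the confirmation step before the request finishes?",
  "  Step: confirmation",
  "  Category: navigation",
  "  Status: pending",
  "  Observed: not tried yet",
].join("\n");
console.log(parseBreakLog(sample).map(validateEntry));
```

A check like this only confirms that the fields exist. Whether an `Observed` note is specific enough is still a judgment call for the reviewer.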
82
+
83
+ If the user sets a target such as "make this file 150 items long before you stop," treat that as a hard stopping condition unless you hit a real blocker and explain why.
84
+
85
+ Use consistent categories such as:
86
+
87
+ - `navigation`
88
+ - `input-validation`
89
+ - `repeat-action`
90
+ - `stale-state`
91
+ - `error-recovery`
92
+ - `destructive-action`
93
+ - `permissions`
94
+ - `async-state`
95
+ - `persistence`
96
+
97
+ ## Step 5: Enforce Coverage Before Execution
98
+
99
+ Before you start executing attacks:
100
+
101
+ - pre-generate a meaningful attack list
102
+ - spread it across the major flow steps
103
+ - spread it across relevant categories
104
+ - make sure the count is not satisfied by one repetitive corner of the feature
105
+
106
+ Do not treat total item count alone as sufficient coverage.
107
+
108
+ If the user asks for a large target such as `150`, ensure the log covers multiple steps and multiple categories instead of padding one surface.
109
+
110
+ Anti-cheating rules:
111
+
112
+ - no filler items
113
+ - each attack must be meaningfully distinct
114
+ - reworded duplicates do not count toward the target
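The coverage and anti-cheating rules above can be approximated with a small summary pass over the attack list. This is a sketch under stated assumptions: entries are already parsed into `{ question, step, category }` objects, `summarizeCoverage` is a hypothetical name, and "spread" is taken to mean more than one step and more than one category, a threshold the skill text does not define.

```javascript
// Hypothetical coverage summary for a pre-generated attack list.
function summarizeCoverage(entries, target) {
  const byStep = {};
  const byCategory = {};
  const seen = new Set();
  let distinct = 0;
  for (const entry of entries) {
    // Normalize case, punctuation, and whitespace so trivially identical
    // questions collapse. Genuine rewording still needs human judgment.
    const key = entry.question.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
    if (seen.has(key)) continue;
    seen.add(key);
    distinct += 1;
    byStep[entry.step] = (byStep[entry.step] || 0) + 1;
    byCategory[entry.category] = (byCategory[entry.category] || 0) + 1;
  }
  return {
    distinct,
    byStep,
    byCategory,
    // Only distinct items count toward the target.
    meetsTarget: distinct >= target,
    // Assumed heuristic for "not one repetitive corner of the feature".
    spread: Object.keys(byStep).length > 1 && Object.keys(byCategory).length > 1,
  };
}

const plan = [
  { question: "What if the user clicks twice?", step: "submit", category: "repeat-action" },
  { question: "what if the user clicks twice", step: "submit", category: "repeat-action" },
  { question: "What if the user refreshes here?", step: "confirmation", category: "navigation" },
];
console.log(summarizeCoverage(plan, 3));
```

In this example the second item collapses into the first, so the plan has two distinct attacks and falls short of a target of three even though it lists three lines.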
115
+
116
+ ## Step 6: Use The Real UI
34
117
 
35
118
  - Use `playwright-interactive`.
36
119
  - Exercise the actual UI instead of mocking the flow in code.
37
120
  - Keep the scope focused on the feature the user asked you to verify.
38
121
 
39
- ## Step 3: Try To Break It On Purpose
122
+ ## Step 7: Try To Break It On Purpose
40
123
 
41
124
  Do more than a happy-path walkthrough.
42
125
 
@@ -58,30 +141,41 @@ Actively try:
58
141
 
59
142
  If the feature is stateful, also check whether the UI, network result, and persisted state stay coherent after those interactions.
60
143
 
61
- ## Step 4: Record And Fix Real Bugs
144
+ As you test, keep expanding the break log with new "What if...?" cases that emerge from the flow. Do not rely on memory or chat-only notes.
145
+
146
+ ## Step 8: Record And Fix Real Bugs
62
147
 
63
148
  - Document each meaningful issue you find.
64
- - Fix the issue when the remediation is clear.
149
+ - Fix the issue when the remediation is clear and the chosen mode includes fixes.
65
150
  - If the behavior is ambiguous, call out the product decision instead of bluffing a fix.
66
151
  - Update docs when the verification exposes stale assumptions about how the feature works.
152
+ - Update the break log entry for each attempted action with what happened and whether the feature survived.
153
+ - Require a short observed-result note for every executed item. "Worked" is too weak; capture what actually happened.
67
154
 
68
155
  Do not stop at the first bug.
69
156
 
70
- ## Step 5: Repeat Until The Feature Resists Abuse
157
+ ## Step 9: Repeat Until The Feature Resists Abuse
71
158
 
72
159
  After fixes:
73
160
 
74
161
  - rerun the relevant happy path
75
162
  - rerun the break attempts that previously failed
163
+ - rerun directly related attacks
164
+ - rerun neighboring attacks that touch the same step, state transition, or failure surface
76
165
  - verify the fix did not create a new inconsistent state
166
+ - keep adding and executing new "What if...?" items until the requested target coverage is reached
77
167
 
78
168
  The skill is not done when the feature only works once. It is done when the feature behaves predictably under sloppy real-world use.
79
169
 
80
- ## Step 6: Report Truthfully
170
+ ## Step 10: Report Truthfully
81
171
 
82
172
  Summarize:
83
173
 
174
+ - the path to the break log markdown file
175
+ - how many attack items were recorded and exercised
176
+ - how coverage was distributed across steps and categories
84
177
  - what break attempts you tried
85
178
  - which issues you found
86
179
  - what you fixed
180
+ - a short systemic-risks summary describing recurring weakness patterns, not just individual bugs
87
181
  - what still looks risky or was not exercised
@@ -11,7 +11,9 @@ This skill owns one job: inspect the specific code the user points at, map it ag
11
11
 
12
12
  ## Step 1: Load The Right Scope
13
13
 
14
- - Read `.waypoint/docs/code-guide.md`.
14
+ - Read the repo's routed code guide.
15
+ - In standard Waypoint repos, use `.waypoint/docs/code-guide.md`.
16
+ - If the repo routes the code guide somewhere else, follow the repo's own docs and routing instead of assuming another fixed path.
15
17
  - Read only the files, routes, tests, contracts, and nearby docs needed to understand the specific feature or slice under review.
16
18
  - If the scope is ambiguous, resolve it to a concrete file set, feature path, or commit-sized change surface before auditing.
17
19
 
@@ -16,7 +16,9 @@ Before the hygiene pass:
16
16
  3. Read `.waypoint/WORKSPACE.md`
17
17
  4. Read `.waypoint/context/MANIFEST.md`
18
18
  5. Read every file listed in that manifest
19
- 6. Read `.waypoint/docs/code-guide.md` and the routed docs relevant to the area being shipped
19
+ 6. Read the repo's routed code guide and the routed docs relevant to the area being shipped
20
+
21
+ In standard Waypoint repos, the code guide lives at `.waypoint/docs/code-guide.md`. If the repo routes it somewhere else, follow the repo's own docs and routing instead of assuming another fixed path.
20
22
 
21
23
  ## Step 1: Audit The Whole Change Surface
22
24
 
@@ -73,7 +73,6 @@ Do not document every trivial implementation detail. Document the non-obvious, d
73
73
  - `workspace-compress` after meaningful chunks, before stopping, and before review when the live handoff needs compression
74
74
  - `pre-pr-hygiene` before pushing or opening/updating a PR for substantial work
75
75
  - `pr-review` once a PR has active review comments or automated review in progress
76
- - `e2e-verify` for major user-facing or cross-system changes that need manual end-to-end verification
77
76
 
78
77
  ## When to use the optional reviewer agents
79
78
 
@@ -83,14 +82,15 @@ If the repo was initialized with Waypoint roles enabled, use them as focused sec
83
82
  - `code-health-reviewer` for maintainability drift
84
83
  - `plan-reviewer` to challenge weak implementation plans before execution
85
84
 
86
- ## Post-Commit Review Loop
85
+ ## Review Loop
87
86
 
88
- If Waypoint's optional roles are enabled and you authored a commit, immediately after that commit:
87
+ If Waypoint's optional roles are enabled, run the reviewer pair after a meaningful reviewable implementation chunk, not just as a reflex after every tiny commit.
89
88
 
90
- 1. Launch `code-reviewer` and `code-health-reviewer` in parallel as background, read-only reviewers.
91
- 2. Scope them to the commit you just made, then widen only when surrounding files are needed to validate a finding.
92
- 3. Do not call the work finished before you read both reviewer results.
93
- 4. Fix real findings, rerun the relevant verification, update workspace/docs if needed, and make a follow-up commit when fixes change the repo.
89
+ 1. Launch `code-reviewer` and `code-health-reviewer` in parallel as background, read-only reviewers once there is a coherent slice of work worth reviewing.
90
+ 2. If you have a recent self-authored commit that cleanly represents that slice, use it as the default review scope anchor. Otherwise scope the reviewers to the current changed slice.
91
+ 3. Widen only when surrounding files are needed to validate a finding.
92
+ 4. Do not call the work finished before you read both reviewer results.
93
+ 5. Fix real findings, rerun the relevant verification, update workspace/docs if needed, and make a follow-up commit when fixes change the repo.
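The scope rule in step 2 of the review loop can be written as a small decision function. This is only an illustration: `chooseReviewScope` and the input fields (`selfAuthored`, `representsSlice`, `sha`, `files`) are hypothetical, and whether a commit "cleanly represents" the slice remains a judgment the main agent makes before calling it.

```javascript
// Hypothetical inputs:
//   commit       - a recent commit description, or null if there is none
//   changedFiles - files in the current changed slice
function chooseReviewScope({ commit, changedFiles }) {
  // Prefer the commit only when it is self-authored and was judged to
  // cleanly represent the reviewable slice.
  if (commit && commit.selfAuthored && commit.representsSlice) {
    return { anchor: "commit", ref: commit.sha, files: commit.files };
  }
  // Otherwise scope the reviewers to the current changed slice.
  return { anchor: "changed-slice", ref: null, files: changedFiles };
}

console.log(
  chooseReviewScope({
    commit: { sha: "abc123", selfAuthored: true, representsSlice: false, files: ["src/a.js"] },
    changedFiles: ["src/a.js", "src/b.js"],
  })
);
```

Widening beyond the returned file list is left out of the sketch on purpose, since the text allows it only when a specific finding needs surrounding files to validate.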
94
94
 
95
95
  ## Quality bar
96
96
 
@@ -59,7 +59,13 @@ Do not create findings for:
59
59
 
60
60
  ## Scope
61
61
 
62
- In Waypoint's default post-commit review loop, start with the latest self-authored commit, then widen only when related files are needed to validate a maintainability issue. Focus on:
62
+ In Waypoint's default review loop, start with the reviewable slice the main agent hands you.
63
+
64
+ - If there is a recent self-authored commit that cleanly represents the slice, use that commit as the default scope anchor.
65
+ - Otherwise, start from the current changed files or diff under review.
66
+ - Widen only when related files are needed to validate a maintainability issue.
67
+
68
+ Focus on:
63
69
 
64
70
  - recently changed files
65
71
  - their importers
@@ -41,7 +41,11 @@ Not:
41
41
 
42
42
  ### 1. Get the Changes
43
43
 
44
- In Waypoint's default post-commit review loop, start with the latest self-authored commit. Review the actual diff or recent changed files first, then widen only as needed.
44
+ In Waypoint's default review loop, start with the reviewable slice the main agent hands you.
45
+
46
+ - If there is a recent self-authored commit that cleanly represents the slice, use that commit as the default scope anchor.
47
+ - Otherwise, start from the current changed files or diff the main agent is asking you to review.
48
+ - Widen only as needed.
45
49
 
46
50
  ### 2. Deep Research
47
51
 
@@ -37,9 +37,8 @@ Working rules:
37
37
  - Use `docs-sync` when the docs may be stale or a change altered shipped behavior, contracts, routes, or commands
38
38
  - Use `code-guide-audit` for a targeted coding-guide compliance pass on a specific feature, file set, or change slice
39
39
  - Use `break-it-qa` for browser-facing features that should be tested against invalid inputs, refreshes, repeated clicks, wrong navigation, and other adversarial user behavior
40
- - If optional reviewer roles are present and you make a commit, run `code-reviewer` and `code-health-reviewer` in parallel before calling the work done
40
+ - If optional reviewer roles are present and there is a meaningful reviewable implementation chunk, run `code-reviewer` and `code-health-reviewer` in parallel before calling the work done; use a recent self-authored commit as the default scope anchor when one cleanly represents that slice
41
41
  - Before pushing or opening/updating a PR for substantial work, use `pre-pr-hygiene`
42
42
  - Use `pr-review` once a PR has active review comments or automated review in progress
43
- - Use `e2e-verify` for major user-facing or cross-system changes that need manual end-to-end verification
44
43
  - Treat the generated context bundle as required session bootstrap, not optional reference material
45
44
  <!-- waypoint:end -->
@@ -1,63 +0,0 @@
1
- ---
2
- name: e2e-verify
3
- description: Perform manual end-to-end verification for a shipped feature or major change. Use when frontend and backend behavior must be verified together, when a feature needs a realistic walkthrough, or when the agent should manually exercise the flow, inspect logs and persisted state, document issues, fix them, and repeat until no meaningful end-to-end issues remain.
4
- ---
5
-
6
- # E2E Verify
7
-
8
- Use this skill when "it should work" is not enough and the flow needs to be proven end to end.
9
-
10
- ## Read First
11
-
12
- Before verification:
13
-
14
- 1. Read `.waypoint/SOUL.md`
15
- 2. Read `.waypoint/agent-operating-manual.md`
16
- 3. Read `.waypoint/WORKSPACE.md`
17
- 4. Read `.waypoint/context/MANIFEST.md`
18
- 5. Read every file listed in that manifest
19
- 6. Read the routed docs that define the feature, flow, or contract being verified
20
-
21
- ## Step 1: Exercise The Real Flow
22
-
23
- - For browser-facing paths, manually exercise the feature through the real UI.
24
- - For backend-only or service flows, drive the real API or runtime path directly.
25
- - Follow the feature from entry point to persistence to user-visible outcome.
26
-
27
- ## Step 2: Inspect End-To-End State
28
-
29
- Check the surfaces that prove the system actually behaved correctly:
30
-
31
- - UI state
32
- - server responses
33
- - logs
34
- - background-job state if relevant
35
- - database or persisted records when relevant
36
-
37
- Do not stop at "the page looked okay."
38
-
39
- ## Step 3: Record And Fix Issues
40
-
41
- - Document each meaningful issue you find.
42
- - Fix the issue when the remediation is clear.
43
- - Update docs or contracts if verification exposes stale assumptions.
44
-
45
- ## Step 4: Repeat Until Clean
46
-
47
- Re-run the end-to-end flow after fixes.
48
-
49
- The skill is complete only when:
50
-
51
- - the intended flow works
52
- - the persisted state is correct
53
- - the logs tell a truthful story
54
- - no meaningful issues remain
55
-
56
- ## Step 5: Report Verification Truthfully
57
-
58
- Summarize:
59
-
60
- - the flows exercised
61
- - the state surfaces inspected
62
- - the issues found and fixed
63
- - any residual risks or unverified edges
@@ -1,4 +0,0 @@
1
- interface:
2
- display_name: "E2E Verify"
3
- short_description: "Manually verify a feature end to end"
4
- default_prompt: "Use this skill for manual end-to-end verification of a feature or major change. Exercise the real flow, inspect UI plus logs and persisted state, document issues, fix them, and repeat until no meaningful end-to-end issues remain."