waypoint-codex 0.5.0 → 0.6.1

package/README.md CHANGED
@@ -2,58 +2,128 @@
 
  Waypoint is a docs-first repository operating system for Codex.
 
- It helps the next agent pick up your repo with full context by keeping the important things in markdown files inside the repo:
+ It helps the next agent understand your repo by making the important context live in the repo itself instead of disappearing into chat history.
 
- - `AGENTS.md` for startup instructions
- - `.waypoint/WORKSPACE.md` for live state, with timestamped multi-topic entries
- - `.waypoint/docs/` for durable project memory, with `summary`, `last_updated`, and `read_when` frontmatter on routable docs
+ ## Why people use it
+
+ Most agent workflows break down the same way:
+
+ - the next session starts half-blind
+ - important project docs exist, but the agent does not know which ones matter
+ - workspace notes turn into noisy append-only logs
+ - repo conventions live in people's heads instead of files
+ - review and cleanup happen inconsistently
+
+ Waypoint gives you a lightweight repo contract that fixes those problems with explicit files, generated context, and a strong default skill set.
+
+ ## What Waypoint sets up
+
+ Waypoint scaffolds a Codex-friendly repo structure built around a few core pieces:
+
+ - `AGENTS.md` for the startup contract
+ - `.waypoint/WORKSPACE.md` for live operational state
+ - `.waypoint/docs/` for durable project memory
  - `.waypoint/DOCS_INDEX.md` for docs routing
- - repo-local skills for planning, audits, verification, workspace compression, and review closure
+ - `.waypoint/context/` for generated startup context
+ - `.agents/skills/` for repo-local workflows like planning, audits, and QA
+
+ The philosophy is simple:
+
+ - less hidden runtime magic
+ - more repo-local state
+ - more markdown
+ - better continuity for the next agent
+
+ ## Best fit
+
+ Waypoint is most useful when you want:
+
+ - multi-session continuity in a real repo
+ - a durable docs and workspace structure for agents
+ - stronger planning, review, QA, and closeout defaults
+ - repo-local scaffolding instead of a bunch of global mystery behavior
+
+ If you only use Codex for tiny one-off edits, Waypoint is probably unnecessary.
 
  ## Install
 
+ Waypoint requires Node 20+.
+
  ```bash
  npm install -g waypoint-codex
  ```
 
- Or use it without installing globally:
+ Or run it without a global install:
 
  ```bash
  npx waypoint-codex@latest --help
  ```
 
- ## Start using it
+ ## Quick start
 
- Inside your Codex project:
+ Inside the repo you want to prepare for Codex:
 
  ```bash
- waypoint init --with-automations --with-roles
+ waypoint init --with-roles --with-automations
  waypoint doctor
  ```
 
- That scaffolds:
+ That gives you a repo that looks roughly like this:
 
  ```text
  repo/
  ├── AGENTS.md
- ├── .agents/skills/
+ ├── .agents/
+ │   └── skills/
  └── .waypoint/
      ├── WORKSPACE.md
      ├── DOCS_INDEX.md
      ├── docs/
      ├── context/
+     ├── scripts/
      └── ...
  ```
 
+ From there, start your Codex session in the repo and follow the generated bootstrap in `AGENTS.md`.
+
+ ## Common init modes
+
+ ### Minimal setup
+
+ ```bash
+ waypoint init
+ ```
+
+ ### Full local workflow setup
+
+ ```bash
+ waypoint init --with-roles --with-rules --with-automations
+ ```
+
+ ### App-friendly profile
+
+ ```bash
+ waypoint init --app-friendly --with-roles --with-automations
+ ```
+
+ Flags you can combine:
+
+ - `--app-friendly`
+ - `--with-roles`
+ - `--with-rules`
+ - `--with-automations`
+
  ## Main commands
 
  - `waypoint init` — scaffold or refresh the repo
- - `waypoint doctor` — check for drift and missing pieces
- - `waypoint sync` — rebuild `.waypoint/DOCS_INDEX.md` and sync optional automations/rules
- - `waypoint upgrade` — update the global Waypoint CLI and refresh the current repo with its existing config
- - `waypoint import-legacy` — import from an older repo layout
+ - `waypoint doctor` — validate health and report drift
+ - `waypoint sync` — rebuild the docs index and sync optional user-home artifacts
+ - `waypoint upgrade` — update the CLI and refresh the current repo using its saved config
+ - `waypoint import-legacy` — analyze an older repo layout and produce an adoption report
 
- ## Shipped skills
+ ## Built-in skills
+
+ Waypoint ships a strong default skill pack for real coding work:
 
  - `planning`
  - `error-audit`
@@ -61,11 +131,14 @@ repo/
  - `ux-states-audit`
  - `docs-sync`
  - `code-guide-audit`
+ - `break-it-qa`
  - `workspace-compress`
  - `pre-pr-hygiene`
  - `pr-review`
  - `e2e-verify`
 
+ These are repo-local, so the workflow travels with the project.
+
  ## Optional reviewer roles
 
  If you initialize with `--with-roles`, Waypoint scaffolds:
@@ -74,23 +147,54 @@ If you initialize with `--with-roles`, Waypoint scaffolds:
  - `code-reviewer`
  - `plan-reviewer`
 
- The intended workflow is post-commit: after your own commit lands, run `code-reviewer` and `code-health-reviewer` in parallel in the background, then fix real findings before you call the work finished.
+ The intended workflow is post-commit: make your change, commit it, run the reviewers in parallel, fix real findings, then close out.
+
+ ## What makes it different
+
+ Waypoint is not trying to hide everything behind hooks and background machinery.
+
+ It is opinionated, but explicit:
 
- ## Update
+ - state lives in files you can inspect
+ - docs routing is generated, not guessed from memory
+ - repo conventions are encoded in markdown
+ - startup context is rebuilt on purpose
+ - the repo remains the source of truth
+
+ ## Upgrading
+
+ Recommended path:
 
  ```bash
  waypoint upgrade
  ```
 
- If you only want to update the CLI without refreshing the repo:
+ That updates the global CLI and refreshes the current repo using its existing Waypoint config.
+
+ If you only want to update the CLI:
 
  ```bash
  waypoint upgrade --skip-repo-refresh
  ```
 
+ ## Importing an existing repo
+
+ If you already have an older assistant setup or repo-memory system:
+
+ ```bash
+ waypoint import-legacy /path/to/source-repo /path/to/new-repo --init-target
+ ```
+
+ This generates an adoption report and helps separate durable docs from old runtime-specific scaffolding.
+
  ## Learn more
 
  - [Overview](docs/overview.md)
  - [Architecture](docs/architecture.md)
  - [Upgrading](docs/upgrading.md)
  - [Importing Existing Repositories](docs/importing-existing-repos.md)
+ - [Releasing](docs/releasing.md)
+
+ ## License
+
+ MIT. See [LICENSE](LICENSE).
package/dist/src/core.js CHANGED
@@ -455,6 +455,7 @@ export function doctorRepository(projectRoot) {
  "ux-states-audit",
  "docs-sync",
  "code-guide-audit",
+ "break-it-qa",
  "workspace-compress",
  "pre-pr-hygiene",
  "pr-review",
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "waypoint-codex",
-   "version": "0.5.0",
+   "version": "0.6.1",
    "description": "Codex-native repository operating system: scaffolding, docs routing, repo-local skills, doctor, and sync.",
    "license": "MIT",
    "type": "module",
@@ -0,0 +1,184 @@
+ ---
+ name: break-it-qa
+ description: Verify a user-facing feature by trying to break it on purpose instead of only following the happy path. Use after building forms, multistep flows, settings pages, onboarding, stateful UI, destructive actions, or any browser-facing feature where invalid inputs, refreshes, back navigation, repeated clicks, wrong action order, or recovery paths might expose real bugs.
+ ---
+
+ # Break-It QA
+
+ Use this skill to attack the feature like an impatient, confused, or careless user.
+
+ This is not the same as `e2e-verify`.
+
+ - `e2e-verify` proves the intended flow works end to end.
+ - `break-it-qa` tries to make the feature fail through invalid, interrupted, stale, repeated, or out-of-order interactions.
+
+ ## Step 1: Ask The Three Setup Questions
+
+ Before testing, ask the user these questions if the answer is not already clear from context:
+
+ - what exact feature or scope should this cover?
+ - how many attack items should the break log reach before stopping?
+ - should the skill stop at findings or also fix clear issues after they are found?
+
+ Keep this intake short. These are the main user-controlled knobs for the skill.
+
+ If the user does not specify a count, use a reasonable default such as `40`.
+
+ ## Step 2: Read First
+
+ Before verification:
+
+ 1. Read `.waypoint/SOUL.md`
+ 2. Read `.waypoint/agent-operating-manual.md`
+ 3. Read `.waypoint/WORKSPACE.md`
+ 4. Read `.waypoint/context/MANIFEST.md`
+ 5. Read every file listed in that manifest
+ 6. Read the routed docs or nearby code that define the feature being tested
+
+ ## Step 3: Identify Break Surfaces
+
+ - Identify the happy path first so you know what "broken" means.
+ - Find the fragile surfaces: forms, wizards, pending states, destructive actions, async transitions, navigation changes, and persisted state.
+ - For each major step or transition, ask explicit "What if...?" questions before testing. Examples:
+   - What if the user refreshes here?
+   - What if they go back now?
+   - What if they click twice?
+   - What if this input is empty, malformed, too long, or contradictory?
+   - What if this action succeeds in the UI but fails in persistence?
+
+ Do not test blindly.
+
+ ## Step 4: Create A Break Log
+
+ Write or update a durable markdown log under `.waypoint/docs/`.
+
+ - Prefer a focused path such as `.waypoint/docs/verification/<feature>-break-it-qa.md`.
+ - If a routed verification doc already exists for this feature, update it instead of creating a competing file.
+ - The log is part of the skill, not an optional extra.
+ - Pre-generate the attack plan in this log before executing it. Do not improvise everything live.
+
+ Use one item per attempted action. A good entry shape is:
+
+ ```markdown
+ - [ ] What if the user refreshes on the confirmation step before the request finishes?
+   Step: confirmation
+   Category: navigation
+   Status: pending
+   Observed: not tried yet
+ ```
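Review note: the "pre-generate the attack plan" rule and the entry shape above can be sketched together. The helper names (`renderPendingEntry`, `pregeneratePlan`), the step names, and the idea of crossing steps with question templates are assumptions made for illustration; nothing in this diff shows Waypoint shipping such a generator.

```javascript
// Hypothetical helper: renders one pending entry in the shape the skill shows.
function renderPendingEntry({ question, step, category }) {
  return [
    `- [ ] ${question}`,
    `  Step: ${step}`,
    `  Category: ${category}`,
    "  Status: pending",
    "  Observed: not tried yet",
  ].join("\n");
}

// Hypothetical helper: crosses each flow step with each "What if...?" template
// to produce a first-draft attack plan that is written before any testing.
function pregeneratePlan(steps, templates) {
  const entries = [];
  for (const step of steps) {
    for (const { category, ask } of templates) {
      entries.push(renderPendingEntry({ question: ask(step), step, category }));
    }
  }
  return entries.join("\n");
}

const plan = pregeneratePlan(
  ["details", "confirmation"], // hypothetical step names
  [
    { category: "navigation", ask: (s) => `What if the user refreshes on the ${s} step?` },
    { category: "repeat-action", ask: (s) => `What if the user clicks twice on the ${s} step?` },
  ],
);
console.log(plan);
```

A mechanical cross product like this is only a starting draft. The skill's own rules against filler items still apply, so generated entries that make no sense for a given step should be pruned before execution.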
+
+ Then update each item as you go:
+
+ - `survived`
+ - `broke`
+ - `fixed`
+ - `retested-survived`
+ - `blocked`
+ - `not-applicable`
+
+ Every executed item must include:
+
+ - `Step`
+ - `Category`
+ - `Status`
+ - `Observed`
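Review note: the four required fields and the status vocabulary are concrete enough to lint. The sketch below is one possible check, not something the package is shown to include. The function names, the `[x]` marker for completed items, and the decision to treat `pending` as a valid status are assumptions.

```javascript
// Field and status names are taken from the skill text; everything else is hypothetical.
const REQUIRED_FIELDS = ["Step", "Category", "Status", "Observed"];
const STATUSES = [
  "pending", "survived", "broke", "fixed",
  "retested-survived", "blocked", "not-applicable",
];

// Parse "- [ ] question" items followed by "Field: value" lines.
function parseBreakLog(markdown) {
  const items = [];
  let current = null;
  for (const line of markdown.split("\n")) {
    const head = line.match(/^- \[[ xX]\] (.+)$/);
    if (head) {
      current = { question: head[1].trim(), fields: {} };
      items.push(current);
      continue;
    }
    const field = line.match(/^\s*([A-Za-z]+):\s*(.*)$/);
    if (field && current) current.fields[field[1]] = field[2].trim();
  }
  return items;
}

// Return a list of problems for one item; an empty list means the item is well formed.
function validateItem(item) {
  const problems = [];
  for (const name of REQUIRED_FIELDS) {
    if (!item.fields[name]) problems.push(`missing ${name}`);
  }
  const status = item.fields.Status;
  if (status && !STATUSES.includes(status)) problems.push(`unknown status: ${status}`);
  return problems;
}

const sampleLog = [
  "- [ ] What if the user refreshes on the confirmation step before the request finishes?",
  "  Step: confirmation",
  "  Category: navigation",
  "  Status: pending",
  "  Observed: not tried yet",
  "- [x] What if they click twice?",
  "  Step: confirmation",
  "  Status: broke",
].join("\n");

for (const item of parseBreakLog(sampleLog)) {
  console.log(item.question, validateItem(item));
}
```

In the sample, the second item is flagged because it lacks `Category` and `Observed`, which is the kind of gap the "every executed item must include" rule is meant to prevent.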
+
+ If the user sets a target such as "make this file 150 items long before you stop," treat that as a hard stopping condition unless you hit a real blocker and explain why.
+
+ Use consistent categories such as:
+
+ - `navigation`
+ - `input-validation`
+ - `repeat-action`
+ - `stale-state`
+ - `error-recovery`
+ - `destructive-action`
+ - `permissions`
+ - `async-state`
+ - `persistence`
+
+ ## Step 5: Enforce Coverage Before Execution
+
+ Before you start executing attacks:
+
+ - pre-generate a meaningful attack list
+ - spread it across the major flow steps
+ - spread it across relevant categories
+ - make sure the count is not satisfied by one repetitive corner of the feature
+
+ Do not treat total item count alone as sufficient coverage.
+
+ If the user asks for a large target such as `150`, ensure the log covers multiple steps and multiple categories instead of padding one surface.
+
+ Anti-cheating rules:
+
+ - no filler items
+ - each attack must be meaningfully distinct
+ - reworded duplicates do not count toward the target
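Review note: the coverage and anti-cheating rules can be approximated with a simple pre-execution check. This is a rough sketch under stated assumptions: `checkCoverage`, the `maxShare` threshold of 0.5, and the item field names are all hypothetical, and the text normalization only catches trivial duplicates (case and punctuation changes), not genuinely reworded ones, which still need human judgment.

```javascript
// Collapse case, punctuation, and spacing so trivially restated questions compare equal.
function normalize(question) {
  return question.toLowerCase().replace(/[^a-z0-9 ]+/g, " ").replace(/\s+/g, " ").trim();
}

// Count distinct items and flag any single step or category that dominates the log.
// target defaults to 40, the default count the skill suggests; maxShare is an assumption.
function checkCoverage(items, { target = 40, maxShare = 0.5 } = {}) {
  const seen = new Set();
  const distinct = [];
  for (const item of items) {
    const key = `${item.step}|${normalize(item.question)}`;
    if (seen.has(key)) continue; // duplicates do not count toward the target
    seen.add(key);
    distinct.push(item);
  }

  const byCategory = {};
  const byStep = {};
  for (const item of distinct) {
    byCategory[item.category] = (byCategory[item.category] || 0) + 1;
    byStep[item.step] = (byStep[item.step] || 0) + 1;
  }

  const problems = [];
  if (distinct.length < target) {
    problems.push(`only ${distinct.length} distinct items, target is ${target}`);
  }
  for (const [label, counts] of [["category", byCategory], ["step", byStep]]) {
    for (const [name, count] of Object.entries(counts)) {
      if (count / distinct.length > maxShare) {
        problems.push(`${label} "${name}" holds ${count} of ${distinct.length} items`);
      }
    }
  }
  return { distinctCount: distinct.length, byCategory, byStep, problems };
}

const coverage = checkCoverage(
  [
    { question: "What if they click twice?", step: "confirmation", category: "repeat-action" },
    { question: "what if they click TWICE", step: "confirmation", category: "repeat-action" },
    { question: "What if the user refreshes here?", step: "confirmation", category: "navigation" },
    { question: "What if this input is empty?", step: "details", category: "input-validation" },
  ],
  { target: 3 },
);
console.log(coverage.distinctCount, coverage.problems);
```

Here the second item is dropped as a duplicate of the first, the target of 3 is met, and the only reported problem is that the `confirmation` step holds 2 of the 3 distinct items.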
+
+ ## Step 6: Use The Real UI
+
+ - Use `playwright-interactive`.
+ - Exercise the actual UI instead of mocking the flow in code.
+ - Keep the scope focused on the feature the user asked you to verify.
+
+ ## Step 7: Try To Break It On Purpose
+
+ Do more than a happy-path walkthrough.
+
+ Actively try:
+
+ - invalid inputs
+ - empty required fields
+ - boundary-length or malformed inputs
+ - repeated or double clicks
+ - submitting twice
+ - wrong action order
+ - back and forward navigation
+ - page refresh during the flow
+ - closing and reopening modals or screens
+ - canceling mid-flow and re-entering
+ - stale UI state after edits
+ - conflicting selections or toggles
+ - error recovery after a failed action
+
+ If the feature is stateful, also check whether the UI, network result, and persisted state stay coherent after those interactions.
+
+ As you test, keep expanding the break log with new "What if...?" cases that emerge from the flow. Do not rely on memory or chat-only notes.
+
+ ## Step 8: Record And Fix Real Bugs
+
+ - Document each meaningful issue you find.
+ - Fix the issue when the remediation is clear and the chosen mode includes fixes.
+ - If the behavior is ambiguous, call out the product decision instead of bluffing a fix.
+ - Update docs when the verification exposes stale assumptions about how the feature works.
+ - Update the break log entry for each attempted action with what happened and whether the feature survived.
+ - Require a short observed-result note for every executed item. "Worked" is too weak; capture what actually happened.
+
+ Do not stop at the first bug.
+
+ ## Step 9: Repeat Until The Feature Resists Abuse
+
+ After fixes:
+
+ - rerun the relevant happy path
+ - rerun the break attempts that previously failed
+ - rerun directly related attacks
+ - rerun neighboring attacks that touch the same step, state transition, or failure surface
+ - verify the fix did not create a new inconsistent state
+ - keep adding and executing new "What if...?" items until the requested target coverage is reached
+
+ The skill is not done when the feature only works once. It is done when the feature behaves predictably under sloppy real-world use.
+
+ ## Step 10: Report Truthfully
+
+ Summarize:
+
+ - the path to the break log markdown file
+ - how many attack items were recorded and exercised
+ - how coverage was distributed across steps and categories
+ - what break attempts you tried
+ - which issues you found
+ - what you fixed
+ - a short systemic-risks summary describing recurring weakness patterns, not just individual bugs
+ - what still looks risky or was not exercised
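Review note: the countable parts of the Step 10 report (log path, recorded versus exercised items, what was not exercised) could be derived from the log rather than recalled from memory. The sketch below assumes that `survived`, `broke`, `fixed`, and `retested-survived` mean an attack was actually run, while `pending`, `blocked`, and `not-applicable` mean it was not. That split, the function name, and the `checkout` feature name in the path are assumptions, not something the skill text defines.

```javascript
// Assumption: these statuses indicate the attack was actually executed.
const EXERCISED_STATUSES = new Set(["survived", "broke", "fixed", "retested-survived"]);

// Build the numeric part of a report from parsed break-log items.
function summarizeBreakLog(logPath, items) {
  const byStatus = {};
  for (const item of items) {
    byStatus[item.status] = (byStatus[item.status] || 0) + 1;
  }
  const notExercised = items.filter((item) => !EXERCISED_STATUSES.has(item.status));
  return {
    logPath,
    recorded: items.length,
    exercised: items.length - notExercised.length,
    byStatus,
    notExercised: notExercised.map((item) => `${item.question} (${item.status})`),
  };
}

const summary = summarizeBreakLog(
  ".waypoint/docs/verification/checkout-break-it-qa.md", // hypothetical feature name
  [
    { question: "What if the user refreshes here?", status: "survived" },
    { question: "What if they click twice?", status: "retested-survived" },
    { question: "What if they go back now?", status: "blocked" },
    { question: "What if this input is empty?", status: "pending" },
  ],
);
console.log(JSON.stringify(summary, null, 2));
```

The systemic-risks summary and the judgment about what still looks risky remain narrative work; a tally like this only keeps the counts honest.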
@@ -0,0 +1,4 @@
+ interface:
+   display_name: "Break-It QA"
+   short_description: "Try to break a feature through the UI"
+   default_prompt: "Use this skill to verify a user-facing feature by trying to break it through the browser with invalid inputs, wrong action order, refreshes, back navigation, repeated clicks, and other adversarial interactions, then fix clear issues and repeat."
@@ -69,6 +69,7 @@ Do not document every trivial implementation detail. Document the non-obvious, d
  - `ux-states-audit` when async/data-driven UI likely lacks loading, empty, or error states
  - `docs-sync` when routed docs may be stale, missing, or inconsistent with the codebase
  - `code-guide-audit` when a specific feature or file set needs a targeted coding-guide compliance check
+ - `break-it-qa` when a browser-facing feature should be attacked with invalid inputs, refreshes, repeated clicks, wrong action order, or other adversarial manual QA
  - `workspace-compress` after meaningful chunks, before stopping, and before review when the live handoff needs compression
  - `pre-pr-hygiene` before pushing or opening/updating a PR for substantial work
  - `pr-review` once a PR has active review comments or automated review in progress
@@ -36,6 +36,7 @@ Working rules:
  - Use the repo-local skills Waypoint ships for structured workflows when relevant
  - Use `docs-sync` when the docs may be stale or a change altered shipped behavior, contracts, routes, or commands
  - Use `code-guide-audit` for a targeted coding-guide compliance pass on a specific feature, file set, or change slice
+ - Use `break-it-qa` for browser-facing features that should be tested against invalid inputs, refreshes, repeated clicks, wrong navigation, and other adversarial user behavior
  - If optional reviewer roles are present and you make a commit, run `code-reviewer` and `code-health-reviewer` in parallel before calling the work done
  - Before pushing or opening/updating a PR for substantial work, use `pre-pr-hygiene`
  - Use `pr-review` once a PR has active review comments or automated review in progress