sequant 2.1.1 → 2.2.0

Files changed (45)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/dist/bin/cli.js +1 -0
  4. package/dist/src/commands/init.d.ts +1 -0
  5. package/dist/src/commands/init.js +122 -3
  6. package/dist/src/commands/run-compat.d.ts +14 -0
  7. package/dist/src/commands/run-compat.js +12 -0
  8. package/dist/src/commands/run-display.d.ts +17 -0
  9. package/dist/src/commands/run-display.js +116 -0
  10. package/dist/src/commands/run.d.ts +4 -26
  11. package/dist/src/commands/run.js +47 -772
  12. package/dist/src/commands/status.js +24 -1
  13. package/dist/src/index.d.ts +11 -0
  14. package/dist/src/index.js +9 -0
  15. package/dist/src/lib/errors.d.ts +93 -0
  16. package/dist/src/lib/errors.js +97 -0
  17. package/dist/src/lib/settings.d.ts +236 -0
  18. package/dist/src/lib/settings.js +482 -37
  19. package/dist/src/lib/skill-version.d.ts +19 -0
  20. package/dist/src/lib/skill-version.js +68 -0
  21. package/dist/src/lib/templates.d.ts +1 -0
  22. package/dist/src/lib/templates.js +1 -1
  23. package/dist/src/lib/workflow/batch-executor.js +13 -5
  24. package/dist/src/lib/workflow/config-resolver.d.ts +50 -0
  25. package/dist/src/lib/workflow/config-resolver.js +167 -0
  26. package/dist/src/lib/workflow/error-classifier.d.ts +17 -7
  27. package/dist/src/lib/workflow/error-classifier.js +113 -15
  28. package/dist/src/lib/workflow/phase-executor.d.ts +31 -0
  29. package/dist/src/lib/workflow/phase-executor.js +143 -48
  30. package/dist/src/lib/workflow/run-log-schema.d.ts +12 -0
  31. package/dist/src/lib/workflow/run-log-schema.js +7 -1
  32. package/dist/src/lib/workflow/run-orchestrator.d.ts +161 -0
  33. package/dist/src/lib/workflow/run-orchestrator.js +510 -0
  34. package/dist/src/lib/workflow/worktree-manager.d.ts +4 -3
  35. package/dist/src/lib/workflow/worktree-manager.js +61 -11
  36. package/package.json +1 -1
  37. package/templates/skills/assess/SKILL.md +239 -77
  38. package/templates/skills/exec/SKILL.md +7 -68
  39. package/templates/skills/fullsolve/SKILL.md +303 -137
  40. package/templates/skills/qa/SKILL.md +42 -46
  41. package/templates/skills/qa/scripts/quality-checks.sh +47 -1
  42. package/templates/skills/spec/SKILL.md +183 -982
  43. package/templates/skills/spec/references/quality-checklist.md +75 -0
  44. package/templates/skills/test/SKILL.md +0 -27
  45. package/templates/skills/testgen/SKILL.md +0 -27
@@ -110,18 +110,21 @@ Surface red flags. Only track signals that change the recommendation.
110
110
 
111
111
  **Phase selection from labels:**
112
112
 
113
- | Labels | Workflow |
114
- |--------|----------|
115
- | bug, fix, hotfix, patch | `exec → qa` |
116
- | docs, documentation, readme | `exec → qa` |
117
- | ui, frontend, admin, web, browser | `spec → exec → test → qa` |
118
- | security, auth, authentication, permissions | `spec → security-review → exec → qa` |
119
- | complex, refactor, breaking, major | `spec → exec → qa` + `-q` |
120
- | enhancement, feature (default) | `spec → exec → qa` |
113
+ | Labels | Category | Workflow |
114
+ |--------|----------|----------|
115
+ | security, auth, authentication, permissions | Domain | `spec → security-review → exec → qa` |
116
+ | ui, frontend, admin, web, browser | Domain | `spec → exec → test → qa` |
117
+ | complex, refactor, breaking, major | Modifier | `spec → exec → qa` + `-q` |
118
+ | (ui/frontend) + (enhancement/feature), or testable-AC signals | Modifier | inserts `testgen` before `exec` (see Testgen detection below) |
119
+ | enhancement, feature (default) | Generic | `spec → exec → qa` |
120
+ | bug, fix, hotfix, patch | Generic | `exec → qa` |
121
+ | docs, documentation, readme | Generic | `exec → qa` |
122
+
123
+ **Label priority:** Domain labels take precedence over generic labels. When an issue has both a domain label and a generic label (e.g., `bug` + `auth`), use the domain-specific workflow. Example: an issue labeled `bug` + `auth` gets `spec → security-review → exec → qa`, not `exec → qa`. Similarly, `bug` + `ui` gets `spec → exec → test → qa`.
121
124
 
122
125
  **Valid phases (from `PhaseSchema` in `src/lib/workflow/types.ts`):** `spec`, `security-review`, `exec`, `testgen`, `test`, `verify`, `qa`, `loop`, `merger`
123
126
 
124
- **Skip spec when:** bug/docs label, OR spec comment already exists on issue.
127
+ **Skip spec when:** (bug/docs label AND no domain labels like security/auth/ui/frontend), OR spec comment already exists on issue.
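The label table and the priority rule in this hunk can be sketched as a small TypeScript function. This is an illustration only: `selectWorkflow`, `hasAnyLabel`, and the label-group constants are hypothetical names, not code from the package. Two behaviours are assumptions rather than stated rules: security labels are checked before UI labels (following the table's row order), and a Modifier label such as `refactor` keeps the `spec` phase even when a `bug` label is also present. The "spec comment already exists" skip and the broader `-q` recommendation are not modelled.

```typescript
// Illustrative sketch; names and tie-breaks are assumptions, not sequant's code.
type WorkflowChoice = { phases: string[]; qualityLoop: boolean };

const SECURITY_LABELS: string[] = ["security", "auth", "authentication", "permissions"];
const UI_LABELS: string[] = ["ui", "frontend", "admin", "web", "browser"];
const COMPLEX_LABELS: string[] = ["complex", "refactor", "breaking", "major"];
const LIGHT_LABELS: string[] = ["bug", "fix", "hotfix", "patch", "docs", "documentation", "readme"];

function hasAnyLabel(labels: string[], group: string[]): boolean {
  return labels.some((label) => group.indexOf(label.toLowerCase()) !== -1);
}

function selectWorkflow(labels: string[]): WorkflowChoice {
  // Modifier row only: complex/refactor/breaking/major adds -q.
  const qualityLoop = hasAnyLabel(labels, COMPLEX_LABELS);

  // Domain labels take precedence over generic labels.
  if (hasAnyLabel(labels, SECURITY_LABELS)) {
    return { phases: ["spec", "security-review", "exec", "qa"], qualityLoop };
  }
  if (hasAnyLabel(labels, UI_LABELS)) {
    return { phases: ["spec", "exec", "test", "qa"], qualityLoop };
  }

  // Generic bug/docs labels skip spec when no domain or modifier label applies.
  if (!qualityLoop && hasAnyLabel(labels, LIGHT_LABELS)) {
    return { phases: ["exec", "qa"], qualityLoop };
  }

  // Default: enhancement/feature, or no recognised label.
  return { phases: ["spec", "exec", "qa"], qualityLoop };
}
```

With these assumptions, `selectWorkflow(["bug", "auth"])` yields the security workflow rather than `exec → qa`, which is the case the priority rule calls out.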
125
128
 
126
129
  **Resume detection:** Branch exists with commits ahead of main → mark as resume (`◂`).
127
130
 
@@ -129,10 +132,24 @@ Surface red flags. Only track signals that change the recommendation.
129
132
 
130
133
  **Quality loop (`-q`):** Recommend for everything except simple bug fixes and docs-only.
131
134
 
132
- **Other flags:**
133
- - `--chain` — Chain issues: each branches from previous (implies --sequential)
134
- - `--qa-gate` — Pause chain on QA failure, preventing downstream issues from building on broken code (requires --chain)
135
- - `--base <branch>` — Issue references a feature branch
135
+ **Testgen detection:** Add `testgen` to the workflow when any apply:
136
+ - Labels include (`ui` or `frontend`) AND (`enhancement` or `feature`)
137
+ - ACs reference "unit test", "integration test", or list "Automated Test" as a verification method
138
+
139
+ Skip when: only `bug`/`fix` labels present, only `docs` label present, or a prior `testgen` phase marker exists in issue comments.
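One way to read the testgen triggers and skip conditions is as a single predicate. The sketch below is hypothetical: `needsTestgen` and its parameters are illustrative names, and it assumes the skip conditions override the triggers when both apply, which the text does not state explicitly.

```typescript
// Illustrative sketch; assumes skip conditions win over triggers.
function needsTestgen(labels: string[], acText: string, hasPriorTestgenMarker: boolean): boolean {
  const lower = labels.map((label) => label.toLowerCase());
  const has = (name: string): boolean => lower.indexOf(name) !== -1;

  // Skip conditions: prior marker, only bug/fix labels, or only a docs label.
  if (hasPriorTestgenMarker) return false;
  const onlyBugFix = lower.length > 0 && lower.every((l) => l === "bug" || l === "fix");
  const onlyDocs = lower.length > 0 && lower.every((l) => l === "docs");
  if (onlyBugFix || onlyDocs) return false;

  // Trigger 1: (ui or frontend) AND (enhancement or feature).
  const uiFeature = (has("ui") || has("frontend")) && (has("enhancement") || has("feature"));

  // Trigger 2: ACs mention unit/integration tests or "Automated Test".
  const text = acText.toLowerCase();
  const testableAc =
    text.indexOf("unit test") !== -1 ||
    text.indexOf("integration test") !== -1 ||
    text.indexOf("automated test") !== -1;

  return uiFeature || testableAc;
}
```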
140
+
141
+ **Chain detection (suggest-only, never auto-apply):** When 2+ assessed issues have a detected dependency, emit a `Chain:` line alongside (not replacing) the default per-issue commands. False dependency inference produces silently-wrong branch topology, so the user decides.
142
+
143
+ Triggers (any one):
144
+ - Issue body or comments mention `"depends on #N"`, `"blocked by #N"`, or `"after #N"`
145
+ - One issue's described output is another issue's input (e.g., A changes a function signature that B consumes)
146
+
147
+ Format: `Chain: npx sequant run <N1> <N2> --chain --qa-gate -q <phases> # alternative — <one-line reason>`
148
+
149
+ Flag references:
150
+ - `--chain` chains issues (each branches from previous; implies `--sequential`)
151
+ - `--qa-gate` pauses chain on QA failure (requires `--chain`)
152
+ - `--base <branch>` — issue references a feature branch
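The first chain trigger (explicit "depends on #N" phrasing) is simple to detect with a regular expression. The following is a minimal sketch under that assumption; `findDeclaredDependencies` is a hypothetical helper, and it covers only the textual trigger, not the output-feeds-input inference. Because phrases like "after #N" can match unrelated sentences, the result is best treated as a hint, in line with the suggest-only rule.

```typescript
// Illustrative sketch; returns issue numbers named in dependency phrases.
function findDeclaredDependencies(text: string): number[] {
  const pattern = /\b(?:depends on|blocked by|after)\s+#(\d+)/gi;
  const found: number[] = [];
  let match: RegExpExecArray | null = pattern.exec(text);
  while (match !== null) {
    const digits = match[1];
    if (digits !== undefined) {
      const issue = parseInt(digits, 10);
      // Keep first-seen order and drop duplicates.
      if (found.indexOf(issue) === -1) found.push(issue);
    }
    match = pattern.exec(text);
  }
  return found;
}
```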
136
153
 
137
154
  ### Step 5: Conflict Detection
138
155
 
@@ -150,23 +167,28 @@ For each active worktree, check `git diff --name-only main...HEAD` for file over
150
167
 
151
168
  **Design principle:** Dashboard first. Copy-pasteable commands. Silence means healthy.
152
169
 
170
+ **Table column rules:** The "Reason" column must not be truncated mid-word. If a row's reason text would exceed the column width, prefer abbreviating the reason to a shorter synonym rather than cutting a word in half. Column widths should adapt to content — do not force a fixed table width.
171
+
153
172
  ```
154
- # Action Reason Run
155
- <N> <ACTION> <short reason> <workflow or symbol>
156
- <N> <ACTION> <short reason> <workflow or symbol>
173
+ # Action [ACs] Reason Run
174
+ <N> <ACTION> [N] <short reason> <workflow or symbol>
175
+ <N> <ACTION> [N] <short reason> <workflow or symbol>
157
176
  ...
158
177
  ────────────────────────────────────────────────────────────────
159
-
160
- ╭──────────────────────────────────────────────────────────────╮
161
- │ npx sequant run <N1> <N2> <flags> │
162
- │ npx sequant run <N3> <flags> # resume │
163
- ╰──────────────────────────────────────────────────────────────╯
164
-
178
+ Commands:
179
+ npx sequant run <N1> <N2> <flags>
180
+ npx sequant run <N3> <flags> # resume
165
181
  ────────────────────────────────────────────────────────────────
166
- Order: <N> → <N> (<shared file>) · <N> → <N> (<dependency>)
182
+ Order: <N> → <N> (<dependency reason>)
167
183
 
168
184
  ⚠ #<N> <warning>
169
185
  ⚠ #<N> <warning>
186
+
187
+ Chain: npx sequant run <N1> <N2> --chain --qa-gate -q <phases> # alternative — <reason>
188
+
189
+ Flags:
190
+ <flag> <one-line reason>
191
+ <flag> <one-line reason>
170
192
  ────────────────────────────────────────────────────────────────
171
193
  Cleanup:
172
194
  <executable command> # reason
@@ -179,6 +201,8 @@ Cleanup:
179
201
  <!-- assess:quality-loop=<bool> -->
180
202
  ```
181
203
 
204
+ **`ACs` column (conditional):** Include the `ACs` column only when every assessed issue has at least one explicit `- [ ]` checkbox AC in its body. Otherwise omit the column entirely; do not show partial values. Omitting it avoids eroding trust in the table when some issues use implicit/narrative ACs.
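The all-or-nothing rule for the `ACs` column could be checked as below. This is a sketch, not the package's logic: `hasCheckboxAc` and `showAcsColumn` are hypothetical names, and it assumes only unchecked `- [ ]` items count, since that is the form the rule names.

```typescript
// Illustrative sketch; matches an unchecked "- [ ] " item at the start of any line.
function hasCheckboxAc(issueBody: string): boolean {
  return /^\s*- \[ \] /m.test(issueBody);
}

// Show the column only when every assessed issue has at least one checkbox AC.
function showAcsColumn(issueBodies: string[]): boolean {
  return issueBodies.length > 0 && issueBodies.every((body) => hasCheckboxAc(body));
}
```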
205
+
182
206
  #### Run Column Symbols
183
207
 
184
208
  | Symbol | Meaning | Example |
@@ -193,24 +217,50 @@ Cleanup:
193
217
  | `‖` | Blocked/deferred | Dependency or manual |
194
218
  | `—` | No action needed | Already closed/merged |
195
219
 
196
- #### Command Block Rules
220
+ #### Commands Block Rules
221
+
222
+ The commands block is headed by `Commands:` — no box-drawing, no character counting. The header label is the visual anchor.
197
223
 
198
224
  1. Only PROCEED and REWRITE issues get commands
199
225
  2. Group by identical phases + flags → same line
200
226
  3. Resume issues get `# resume` comment
201
227
  4. Rewrite issues get `# restart` comment
202
- 5. Chain mode issues use `--chain` flag
228
+ 5. Chain mode issues use `--chain` flag (see `Chain:` annotation rules below)
203
229
  6. If ALL issues share the same workflow, emit a single command
230
+ 7. **Line splitting:** When a single command would contain more than 6 issue numbers, split into multiple commands of at most 6 issues each, grouped by compatible workflow. Example: 11 issues → two commands (6 + 5)
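Rule 7 amounts to chunking one workflow group's issue numbers into lines of at most six. A minimal sketch, assuming the issues passed in already share the same phases and flags; `splitIssueCommands` is a hypothetical helper name:

```typescript
// Illustrative sketch; one call handles one group of issues with identical flags.
function splitIssueCommands(issues: number[], flags: string, maxPerLine: number = 6): string[] {
  const commands: string[] = [];
  const suffix = flags.length > 0 ? " " + flags : "";
  for (let start = 0; start < issues.length; start += maxPerLine) {
    const chunk = issues.slice(start, start + maxPerLine);
    commands.push("npx sequant run " + chunk.join(" ") + suffix);
  }
  return commands;
}
```

Eleven issues produce two commands (6 + 5), matching the example in the rule.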
204
231
 
205
232
  #### Annotation Rules
206
233
 
207
- - **`Order:`** — Only when sequencing matters (shared files or dependencies). Format: `A → B (reason)` joined by ` · `
208
- - **`⚠` warnings** — Only non-obvious signals (complexity, staleness, dual concerns). One line each. Prefix with issue number.
234
+ Emit annotations in this order between the separators that follow `Commands:`:
235
+ `Order:` → `⚠` warnings → `Chain:` → `Flags:`. `Cleanup:` goes in its own block after. Omit any section (and its surrounding blank line) when it has no content.
236
+
237
+ - **`Order:`** — Only when sequencing matters. Include the **reason** for the ordering, not just `(<filename>)`. Prefer dependency reasoning over filename.
238
+ - Good: `Order: 185 → 186 (185 changes fetchApi error format that 186 consumes)`
239
+ - Good: `Order: 460 → 461 (460 adds batch-executor tests that 461's label matching depends on)`
240
+ - Avoid bare filenames when a reason is clearer.
241
+
242
+ - **`⚠` warnings** — Only non-obvious signals (complexity, staleness, dual concerns, partial-AC satisfaction). One line each, prefixed with issue number. Warnings can note when part of an AC is already satisfied in the codebase:
243
+ - `⚠ #185 Domain errors already exist in repository layer — scope may be smaller than expected`
244
+ - `⚠ #412 bug + auth labels — domain label (auth) takes priority over bug`
245
+
246
+ - **`Chain:`** — Only when 2+ PROCEED issues have a detected dependency (see "Chain detection" in Step 4). Suggests an alternative execution topology. Does not replace the default per-issue commands. Format:
247
+ `Chain: npx sequant run <N1> <N2> --chain --qa-gate -q <phases> # alternative — <one-line reason>`
248
+
249
+ - **`Flags:`** — Only when non-default flags appear in the commands and the reason isn't obvious. One line per **distinct** flag used across all commands. Omit entire section when `-q` is the only non-default flag AND its reason is obvious (e.g., all issues are enhancements). Format:
250
+ ```
251
+ Flags:
252
+ -q 9+ ACs or multi-file scope
253
+ --testgen testable ACs detected (UI hooks + API integration)
254
+ --phases ...,test ui label → browser verification
255
+ ```
256
+
209
257
  - **`Cleanup:`** — Only when actionable (stale branches, merged-but-open issues, label changes). Show as executable commands with `# reason` comments.
210
- - **Omit entire section** (including its separator) when no annotations of that type exist.
258
+
211
259
  - **"All clear" is silence** — no annotation means no issues.
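The ordering and omission rules for annotations can be sketched as a small renderer. `renderAnnotations` is a hypothetical function, and it assumes each section arrives as pre-formatted lines; it only fixes the order (`Order:`, warnings, `Chain:`, `Flags:`) and drops empty sections together with their blank line.

```typescript
// Illustrative sketch; each argument holds already-formatted lines for one section.
function renderAnnotations(order: string[], warnings: string[], chain: string[], flags: string[]): string {
  // Flags get a header line plus indented entries; the other sections are emitted as-is.
  const flagSection: string[] = flags.length > 0 ? ["Flags:"].concat(flags.map((line) => "  " + line)) : [];
  const sections: string[][] = [order, warnings, chain, flagSection];
  return sections
    .filter((lines) => lines.length > 0) // silence means healthy: skip empty sections
    .map((lines) => lines.join("\n"))
    .join("\n\n"); // one blank line between the sections that remain
}
```

When every input is empty the function returns an empty string, which corresponds to the "all clear" case.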
212
260
 
213
- #### Batch Example (mixed states)
261
+ #### Batch Example (mixed states, with label priority)
262
+
263
+ Not all issues have explicit `- [ ]` checkboxes, so the `ACs` column is omitted.
214
264
 
215
265
  ```
216
266
  # Action Reason Run
@@ -220,22 +270,26 @@ Cleanup:
220
270
  458 PROCEED Parallel UX + race condition spec → exec → qa
221
271
  447 CLOSE PR #457 merged —
222
272
  443 PROCEED Consolidate gh calls spec → exec → qa
223
- 412 PROCEED Auth token refresh ◂ exec → qa
273
+ 412 PROCEED Auth bug (domain: auth overrides bug) spec → security-review → exec → qa
274
+ 411 PROCEED Config path normalization ◂ exec → qa
224
275
  405 REWRITE PR #380 200+ commits behind ⟳ spec → exec → qa
225
276
  ────────────────────────────────────────────────────────────────
226
-
227
- ╭──────────────────────────────────────────────────────────────╮
228
- │ npx sequant run 461 460 -q --phases exec,qa │
229
- │ npx sequant run 458 443 -q │
230
- │ npx sequant run 412 -q --phases exec,qa # resume │
231
- │ npx sequant run 405 -q # restart │
232
- ╰──────────────────────────────────────────────────────────────╯
233
-
277
+ Commands:
278
+ npx sequant run 461 460 -q --phases exec,qa
279
+ npx sequant run 458 443 -q
280
+ npx sequant run 412 -q --phases spec,security-review,exec,qa
281
+ npx sequant run 411 -q --phases exec,qa # resume
282
+ npx sequant run 405 -q # restart
234
283
  ────────────────────────────────────────────────────────────────
235
- Order: 460 → 461 (batch-executor.ts)
284
+ Order: 460 → 461 (460 adds batch-executor tests that 461's label matching depends on)
236
285
 
237
286
  ⚠ #458 Dual concern (UX + race) across 4 files
238
287
  ⚠ #405 Stale 30+ days, ACs still valid
288
+ ⚠ #412 bug + auth labels — domain label (auth) takes priority over bug
289
+
290
+ Flags:
291
+ -q multi-file scope across most PROCEED issues
292
+ --phases spec,... spec phase added for 458/443/412/405 (standard features)
239
293
  ────────────────────────────────────────────────────────────────
240
294
  Cleanup:
241
295
  git worktree remove .../447-... # merged, stale worktree
@@ -249,13 +303,44 @@ Cleanup:
249
303
  <!-- #458 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
250
304
  <!-- #447 assess:action=CLOSE -->
251
305
  <!-- #443 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
252
- <!-- #412 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
306
+ <!-- #412 assess:action=PROCEED assess:phases=spec,security-review,exec,qa assess:quality-loop=true -->
307
+ <!-- #411 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
253
308
  <!-- #405 assess:action=REWRITE assess:phases=spec,exec,qa assess:quality-loop=true -->
254
309
  ```
255
310
 
311
+ #### Batch Example (dependent issues with testgen, chain suggestion)
312
+
313
+ All issues have explicit checkbox ACs, so the `ACs` column is shown. A dependency is detected (185 → 186), so a `Chain:` suggestion appears alongside the default commands.
314
+
315
+ ```
316
+ # Action ACs Reason Run
317
+ 185 PROCEED 6 Domain error standardization spec → exec → qa
318
+ 186 PROCEED 9 React Query hooks migration spec → testgen → exec → test → qa
319
+ ────────────────────────────────────────────────────────────────
320
+ Commands:
321
+ npx sequant run 185 -q
322
+ npx sequant run 186 -q --phases spec,testgen,exec,test,qa
323
+ ────────────────────────────────────────────────────────────────
324
+ Order: 185 → 186 (185 changes fetchApi error format that 186 consumes)
325
+
326
+ ⚠ #185 Domain errors already exist in repository layer — scope may be smaller than expected
327
+ ⚠ #186 @tanstack/react-query not installed; large scope (9 hooks + optimistic updates)
328
+
329
+ Chain: npx sequant run 185 186 --chain --qa-gate -q --phases spec,testgen,exec,test,qa
330
+ # alternative — use if 186 should branch from 185's work
331
+
332
+ Flags:
333
+ --testgen #186 has testable ACs (UI hooks + API integration)
334
+ --phases ...,test #186 ui label → browser verification
335
+ ────────────────────────────────────────────────────────────────
336
+
337
+ <!-- #185 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
338
+ <!-- #186 assess:action=PROCEED assess:phases=spec,testgen,exec,test,qa assess:quality-loop=true -->
339
+ ```
340
+
256
341
  #### Batch Example (all clean)
257
342
 
258
- When every issue is PROCEED with no warnings, the output is minimal:
343
+ When every issue is PROCEED with no warnings, no dependencies, and no non-default flags beyond an obvious `-q`, the output is minimal. The `Flags:` section is omitted because `-q` is obvious here (all PROCEED enhancements).
259
344
 
260
345
  ```
261
346
  # Action Reason Run
@@ -263,12 +348,9 @@ When every issue is PROCEED with no warnings, the output is minimal:
263
348
  460 PROCEED batch-executor tests exec → qa
264
349
  443 PROCEED Consolidate gh calls spec → exec → qa
265
350
  ────────────────────────────────────────────────────────────────
266
-
267
- ╭──────────────────────────────────────────────────────────────╮
268
- │ npx sequant run 461 460 -q --phases exec,qa │
269
- │ npx sequant run 443 -q │
270
- ╰──────────────────────────────────────────────────────────────╯
271
-
351
+ Commands:
352
+ npx sequant run 461 460 -q --phases exec,qa
353
+ npx sequant run 443 -q
272
354
  ────────────────────────────────────────────────────────────────
273
355
 
274
356
  <!-- #461 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
@@ -276,6 +358,63 @@ When every issue is PROCEED with no warnings, the output is minimal:
276
358
  <!-- #443 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
277
359
  ```
278
360
 
361
+ Silence means clean — no `Order:`, no `⚠`, no `Chain:`, no `Flags:`, no `Cleanup:`.
362
+
363
+ #### Batch Example (large batch, 13 issues with Rule 7 split)
364
+
365
+ When assessing 9+ issues, commands are split per Rule 7 (max 6 issue numbers per line), and the table adapts to content width. Mixed AC styles across issues → `ACs` column omitted.
366
+
367
+ ```
368
+ # Action Reason Run
369
+ 503 PROCEED Fix typo in error output exec → qa
370
+ 502 PROCEED Update deprecated API call exec → qa
371
+ 501 PROCEED Add retry logic to API client exec → qa
372
+ 500 PROCEED Fix token refresh race condition spec → security-review → exec → qa
373
+ 499 PROCEED Dashboard chart rendering bug spec → exec → test → qa
374
+ 498 PROCEED Update error messages exec → qa
375
+ 497 PROCEED Refactor batch executor spec → exec → qa
376
+ 496 PARK Blocked on #490 schema migration ‖
377
+ 495 PROCEED CLI help text improvements exec → qa
378
+ 494 PROCEED Assess batch formatting fix exec → qa
379
+ 493 CLOSE Duplicate of #491 —
380
+ 492 PROCEED Add export command spec → exec → qa
381
+ 491 PROCEED Normalize config paths exec → qa
382
+ ────────────────────────────────────────────────────────────────
383
+ Commands:
384
+ npx sequant run 503 502 501 498 495 494 -q --phases exec,qa
385
+ npx sequant run 491 -q --phases exec,qa
386
+ npx sequant run 499 -q --phases spec,exec,test,qa
387
+ npx sequant run 500 -q --phases spec,security-review,exec,qa
388
+ npx sequant run 497 492 -q
389
+ ────────────────────────────────────────────────────────────────
390
+ Order: 497 → 492 (497 refactors batch-executor internals that 492's export command uses)
391
+
392
+ ⚠ #500 bug + auth labels — domain label takes priority
393
+ ⚠ #499 bug + ui labels — domain label triggers test phase
394
+
395
+ Flags:
396
+ --phases ...,security-review #500 auth label → security review required
397
+ --phases ...,test #499 ui label → browser verification
398
+ ────────────────────────────────────────────────────────────────
399
+ Cleanup:
400
+ gh issue close 493 # duplicate of #491
401
+ ────────────────────────────────────────────────────────────────
402
+
403
+ <!-- #503 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
404
+ <!-- #502 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
405
+ <!-- #501 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
406
+ <!-- #500 assess:action=PROCEED assess:phases=spec,security-review,exec,qa assess:quality-loop=true -->
407
+ <!-- #499 assess:action=PROCEED assess:phases=spec,exec,test,qa assess:quality-loop=true -->
408
+ <!-- #498 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
409
+ <!-- #497 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
410
+ <!-- #496 assess:action=PARK -->
411
+ <!-- #495 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
412
+ <!-- #494 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
413
+ <!-- #493 assess:action=CLOSE -->
414
+ <!-- #492 assess:action=PROCEED assess:phases=spec,exec,qa assess:quality-loop=true -->
415
+ <!-- #491 assess:action=PROCEED assess:phases=exec,qa assess:quality-loop=true -->
416
+ ```
417
+
279
418
  ---
280
419
 
281
420
  ### Single Mode (1 issue)
@@ -291,11 +430,13 @@ More context since you're focused on one issue. Separators between every section
291
430
 
292
431
  → PROCEED — <one-line reason>
293
432
 
294
- ╭──────────────────────────────────────────────────────────────╮
295
- │ npx sequant run <N> <flags> │
296
- ╰──────────────────────────────────────────────────────────────╯
433
+ Commands:
434
+ npx sequant run <N> <flags>
297
435
 
298
- <phases> · <N> ACs · <flag reasoning>
436
+ <phases> · <N> ACs
437
+
438
+ Flags:
439
+ <flag> <one-line reason>
299
440
  ────────────────────────────────────────────────────────────────
300
441
  ⚠ <warning if any>
301
442
  ⚠ Conflict: #<N> also modifies <path>
@@ -306,7 +447,9 @@ More context since you're focused on one issue. Separators between every section
306
447
  <!-- assess:quality-loop=<bool> -->
307
448
  ```
308
449
 
309
- If no warnings exist, omit the warning section and its separator:
450
+ **`Flags:` (single mode):** Indented list of each enabled non-default flag with a one-line reason. Omit the entire `Flags:` section when `-q` is the only non-default flag AND the reason is obvious (e.g., a straightforward enhancement). Do not repeat obvious flags.
451
+
452
+ Example with `Flags:` (non-obvious `-q` + `--testgen`):
310
453
 
311
454
  ```
312
455
  #458 — Parallel run UX freeze + reconcileState race condition
@@ -315,11 +458,33 @@ Open · bug, enhancement, cli
315
458
 
316
459
  → PROCEED — Both root causes confirmed in codebase
317
460
 
318
- ╭──────────────────────────────────────────────────────────────╮
319
- │ npx sequant run 458 -q │
320
- ╰──────────────────────────────────────────────────────────────╯
461
+ Commands:
462
+ npx sequant run 458 -q
463
+
464
+ spec → exec → qa · 8 ACs
321
465
 
322
- spec → exec → qa · 8 ACs · -q (dual concern)
466
+ Flags:
467
+ -q dual concern across 4 files
468
+ ────────────────────────────────────────────────────────────────
469
+
470
+ <!-- assess:action=PROCEED -->
471
+ <!-- assess:phases=spec,exec,qa -->
472
+ <!-- assess:quality-loop=true -->
473
+ ```
474
+
475
+ Example omitting `Flags:` (obvious `-q` for a standard enhancement):
476
+
477
+ ```
478
+ #443 — Consolidate gh CLI calls
479
+ Open · enhancement
480
+ ────────────────────────────────────────────────────────────────
481
+
482
+ → PROCEED — Codebase matches spec, 5 ACs
483
+
484
+ Commands:
485
+ npx sequant run 443 -q
486
+
487
+ spec → exec → qa · 5 ACs
323
488
  ────────────────────────────────────────────────────────────────
324
489
 
325
490
  <!-- assess:action=PROCEED -->
@@ -397,9 +562,8 @@ Need: <specific information required>
397
562
 
398
563
  → REWRITE — <reason>
399
564
 
400
- ╭──────────────────────────────────────────────────────────────╮
401
- │ npx sequant run <N> <flags> # fresh start │
402
- ╰──────────────────────────────────────────────────────────────╯
565
+ Commands:
566
+ npx sequant run <N> <flags> # fresh start
403
567
 
404
568
  <phases> · <N> ACs
405
569
  ────────────────────────────────────────────────────────────────
@@ -417,27 +581,19 @@ Need: <specific information required>
417
581
 
418
582
  | Section | Show when |
419
583
  |---------|-----------|
420
- | Command block | At least one PROCEED or REWRITE issue |
584
+ | `ACs` column (batch) | Every assessed issue has ≥1 explicit `- [ ]` checkbox AC |
585
+ | `Commands:` block | At least one PROCEED or REWRITE issue |
421
586
  | `Order:` | File conflicts or dependencies require sequencing |
422
- | `⚠` warnings | Non-obvious signals exist |
587
+ | `⚠` warnings | Non-obvious signals exist (complexity, staleness, dual concerns, partial-AC satisfaction) |
588
+ | `Chain:` | 2+ PROCEED issues with detected dependency (suggest-only) |
589
+ | `Flags:` | Non-default flags appear AND `-q` is not the sole flag with an obvious reason |
423
590
  | `Cleanup:` | Stale branches, merged-but-open issues, or label changes |
424
591
  | Separators | Between sections that are both shown; omit if adjacent section is omitted |
425
592
 
426
- Every separator and section is conditional. If there are no warnings and no cleanup, the output is just: table → separator → command block → separator → markers.
593
+ Every separator and section is conditional. If there are no warnings, no chain, no flags, and no cleanup, the output is just: table → separator → `Commands:` block → separator → markers.
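The `Flags:` row of the table above reduces to a two-input decision. A minimal sketch, with `showFlagsSection` as a hypothetical name; whether a `-q` reason counts as "obvious" is left to the caller, since the text gives only examples of that judgment.

```typescript
// Illustrative sketch; nonDefaultFlags lists the distinct non-default flags used in the commands.
function showFlagsSection(nonDefaultFlags: string[], qReasonIsObvious: boolean): boolean {
  if (nonDefaultFlags.length === 0) return false;
  // Omit the section when -q is the sole non-default flag and its reason is obvious.
  const onlyQ = nonDefaultFlags.every((flag) => flag === "-q");
  return !(onlyQ && qReasonIsObvious);
}
```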
427
594
 
428
595
  ---
429
596
 
430
- ## State Tracking
431
-
432
- Initialize state for each assessed issue:
433
-
434
- ```bash
435
- TITLE=$(gh issue view <N> --json title -q '.title')
436
- npx tsx scripts/state/update.ts init <N> "$TITLE"
437
- ```
438
-
439
- Note: `/assess` only initializes issues — actual phase tracking happens during workflow execution.
440
-
441
597
  ## Persist Analysis
442
598
 
443
599
  After displaying output, prompt the user to save using `AskUserQuestion` with options "Yes (Recommended)" and "No".
@@ -467,10 +623,16 @@ If confirmed, post a structured comment to each issue via `gh issue comment`. Ea
467
623
 
468
624
  - [ ] Every issue has exactly one action in the table
469
625
  - [ ] Run column uses correct symbol for the action/state
470
- - [ ] Command block only contains PROCEED and REWRITE issues
471
- - [ ] Commands are grouped by compatible workflow
472
- - [ ] Separators appear between every shown section
473
- - [ ] Annotations omitted when not applicable (silence = healthy)
626
+ - [ ] `ACs` column included only when every issue has explicit `- [ ]` checkboxes
627
+ - [ ] Commands appear under a `Commands:` header (no bare indented block, no box-drawing)
628
+ - [ ] Commands block only contains PROCEED and REWRITE issues, grouped by compatible workflow
629
+ - [ ] `testgen` included when ui/frontend + enhancement/feature labels OR testable-AC signals
630
+ - [ ] `Chain:` suggested (not auto-applied) when 2+ PROCEED issues have a detected dependency
631
+ - [ ] `Flags:` section present when non-default flags appear (unless only obvious `-q`)
632
+ - [ ] `Order:` annotations carry dependency **reasoning**, not bare filenames
633
+ - [ ] `⚠` warnings include partial-AC satisfaction where applicable
634
+ - [ ] Separators appear between every shown section; omitted when adjacent section is omitted
635
+ - [ ] Annotations/sections omitted when not applicable (silence = healthy)
474
636
  - [ ] HTML markers present for every assessed issue
475
637
  - [ ] Batch mode: table is the primary output, no per-issue detail sections
476
638
  - [ ] Single mode: focused summary with separators between sections
@@ -806,16 +806,6 @@ After implementation is complete and all checks pass, create and verify the PR:
806
806
  - If PR exists: Record the URL from `gh pr view` output
807
807
  - If PR creation failed: Record the error and include manual creation instructions
808
808
 
809
- 6. **Record PR info in workflow state:**
810
- ```bash
811
- # Extract PR number and URL from gh pr view output, then update state
812
- PR_INFO=$(gh pr view --json number,url)
813
- PR_NUMBER=$(echo "$PR_INFO" | jq -r '.number')
814
- PR_URL=$(echo "$PR_INFO" | jq -r '.url')
815
- npx tsx scripts/state/update.ts pr <issue-number> "$PR_NUMBER" "$PR_URL"
816
- ```
817
- This enables `--cleanup` to detect merged PRs and auto-remove state entries.
818
-
819
809
  **PR Verification Failure Handling:**
820
810
 
821
811
  If `gh pr view` fails after retry:
@@ -1837,40 +1827,20 @@ When in doubt, choose:
1837
1827
 
1838
1828
  The goal is to satisfy AC with the smallest, safest change possible.
1839
1829
 
1840
- ### 5. Adversarial Self-Evaluation (REQUIRED)
1830
+ ### 5. Pre-PR Confidence Check (REQUIRED)
1841
1831
 
1842
- **Before outputting your final summary**, you MUST complete this adversarial self-evaluation to catch issues that automated checks miss.
1843
-
1844
- **Why this matters:** Sessions show that honest self-questioning consistently catches real issues:
1845
- - Tests that pass but don't cover the actual changes
1846
- - Features that build but don't work as expected
1847
- - AC items marked "done" but with weak implementation
1848
-
1849
- **Answer these questions honestly:**
1850
- 1. "Did anything not work as expected during implementation?"
1851
- 2. "If this feature broke tomorrow, would the current tests catch it?"
1852
- 3. "What's the weakest part of this implementation?"
1853
- 4. "Am I reporting success metrics without honest self-evaluation?"
1854
- 5. "For each changed source file, does a corresponding test file exist? If not, why is that acceptable?"
1855
- 6. "Did I run `npm run lint` and fix all errors, or am I hoping CI will pass?"
1832
+ **Before creating a PR**, state your confidence in 2-3 sentences.
1856
1833
 
1857
1834
  **Include this section in your output:**
1858
1835
 
1859
1836
  ```markdown
1860
- ### Self-Evaluation
1837
+ ### Pre-PR Confidence Check
1861
1838
 
1862
- - **Worked as expected:** [Yes/No - if No, explain what didn't work]
1863
- - **Test coverage confidence:** [High/Medium/Low - explain why]
1864
- - **Weakest part:** [Identify the weakest aspect of the implementation]
1865
- - **Honest assessment:** [Any concerns or caveats?]
1839
+ - **Weakest part:** [What's the most fragile aspect of this implementation?]
1840
+ - **Coverage gaps:** [Which changed files lack corresponding tests, and why is that acceptable?]
1866
1841
  ```
1867
1842
 
1868
- **If any answer reveals concerns:**
1869
- - Address the issues before proceeding
1870
- - Re-run relevant checks (`npm test`, `npm run build`)
1871
- - Update the self-evaluation after fixes
1872
-
1873
- **Do NOT skip this self-evaluation.** Honest reflection catches issues that automated checks miss.
1843
+ **If either field reveals concerns**, address them before creating the PR. Re-run `npm test` and `npm run build` after fixes.
1874
1844
 
1875
1845
  ---
1876
1846
 
@@ -1944,42 +1914,11 @@ You may be invoked multiple times for the same issue. Each time, re-establish co
1944
1914
 
1945
1915
  ---
1946
1916
 
1947
- ## State Tracking
1948
-
1949
- **IMPORTANT:** Update workflow state when running standalone (not orchestrated).
1950
-
1951
- ### Check Orchestration Mode
1952
-
1953
- The orchestration check happens automatically when you run the state update script - it exits silently if `SEQUANT_ORCHESTRATOR` is set.
1954
-
1955
- ### State Updates (Standalone Only)
1956
-
1957
- When NOT orchestrated (`SEQUANT_ORCHESTRATOR` is not set):
1958
-
1959
- **At skill start:**
1960
- ```bash
1961
- npx tsx scripts/state/update.ts start <issue-number> exec
1962
- ```
1963
-
1964
- **On successful completion:**
1965
- ```bash
1966
- npx tsx scripts/state/update.ts complete <issue-number> exec
1967
- ```
1968
-
1969
- **On failure:**
1970
- ```bash
1971
- npx tsx scripts/state/update.ts fail <issue-number> exec "Error description"
1972
- ```
1973
-
1974
- **Why this matters:** State tracking enables dashboard visibility, resume capability, and workflow orchestration. Skills update state when standalone; orchestrators handle state when running workflows.
1975
-
1976
- ---
1977
-
1978
1917
  ## Output Verification
1979
1918
 
1980
1919
  **Before responding, verify your output includes ALL of these:**
1981
1920
 
1982
- - [ ] **Self-Evaluation Completed** - Adversarial self-evaluation section included in output
1921
+ - [ ] **Pre-PR Confidence Check** - Weakest part and coverage gaps stated
1983
1922
  - [ ] **AC Progress Summary** - Which AC items are satisfied, partially met, or blocked
1984
1923
  - [ ] **Files Changed** - List of key files modified
1985
1924
  - [ ] **Test/Build/Lint Results** - Output from `npm run build`, `npm run lint`, and `npm test`