@calmo/task-runner 3.8.3 → 4.0.4

Files changed (94)
  1. package/.agent/workflows/openspec-apply.md +20 -0
  2. package/.agent/workflows/openspec-archive.md +24 -0
  3. package/.agent/workflows/openspec-proposal.md +25 -0
  4. package/.github/workflows/release-please.yml +46 -0
  5. package/.husky/commit-msg +0 -0
  6. package/.husky/pre-commit +0 -0
  7. package/.release-please-manifest.json +3 -0
  8. package/AGENTS.md +2 -4
  9. package/CHANGELOG.md +109 -0
  10. package/README.md +1 -1
  11. package/dist/TaskGraphValidator.js +2 -4
  12. package/dist/TaskGraphValidator.js.map +1 -1
  13. package/openspec/changes/adopt-release-pr/design.md +40 -0
  14. package/openspec/changes/adopt-release-pr/proposal.md +47 -0
  15. package/openspec/changes/adopt-release-pr/specs/release-pr/spec.md +34 -0
  16. package/openspec/changes/adopt-release-pr/tasks.md +14 -0
  17. package/openspec/changes/archive/2026-01-18-add-concurrency-control/specs/task-runner/spec.md +26 -0
  18. package/openspec/changes/archive/2026-01-18-add-external-task-cancellation/specs/task-runner/spec.md +63 -0
  19. package/openspec/changes/archive/2026-01-18-add-integration-tests/specs/task-runner/spec.md +22 -0
  20. package/openspec/changes/archive/2026-01-18-add-task-retry-policy/specs/task-runner/spec.md +40 -0
  21. package/openspec/changes/archive/2026-01-18-add-workflow-preview/specs/task-runner/spec.md +25 -0
  22. package/openspec/changes/archive/2026-01-18-refactor-core-architecture/specs/task-runner/spec.md +31 -0
  23. package/openspec/changes/feat-continue-on-error/proposal.md +20 -0
  24. package/openspec/changes/feat-continue-on-error/tasks.md +17 -0
  25. package/openspec/changes/feat-per-task-timeout/specs/task-runner/spec.md +34 -0
  26. package/openspec/changes/feat-task-metrics/specs/001-generic-task-runner/spec.md +13 -0
  27. package/openspec/specs/task-runner/spec.md +162 -0
  28. package/package.json +11 -20
  29. package/release-please-config.json +9 -0
  30. package/src/TaskGraphValidator.ts +2 -4
  31. package/.gemini/commands/speckit.analyze.toml +0 -188
  32. package/.gemini/commands/speckit.checklist.toml +0 -298
  33. package/.gemini/commands/speckit.clarify.toml +0 -185
  34. package/.gemini/commands/speckit.constitution.toml +0 -86
  35. package/.gemini/commands/speckit.implement.toml +0 -139
  36. package/.gemini/commands/speckit.plan.toml +0 -93
  37. package/.gemini/commands/speckit.specify.toml +0 -262
  38. package/.gemini/commands/speckit.tasks.toml +0 -141
  39. package/.gemini/commands/speckit.taskstoissues.toml +0 -34
  40. package/.github/workflows/release.yml +0 -46
  41. package/.releaserc.json +0 -27
  42. package/coverage/base.css +0 -224
  43. package/coverage/block-navigation.js +0 -87
  44. package/coverage/coverage-final.json +0 -15
  45. package/coverage/favicon.png +0 -0
  46. package/coverage/index.html +0 -146
  47. package/coverage/lcov-report/base.css +0 -224
  48. package/coverage/lcov-report/block-navigation.js +0 -87
  49. package/coverage/lcov-report/favicon.png +0 -0
  50. package/coverage/lcov-report/index.html +0 -146
  51. package/coverage/lcov-report/prettify.css +0 -1
  52. package/coverage/lcov-report/prettify.js +0 -2
  53. package/coverage/lcov-report/sort-arrow-sprite.png +0 -0
  54. package/coverage/lcov-report/sorter.js +0 -210
  55. package/coverage/lcov-report/src/EventBus.ts.html +0 -379
  56. package/coverage/lcov-report/src/ExecutionConstants.ts.html +0 -121
  57. package/coverage/lcov-report/src/TaskGraphValidationError.ts.html +0 -130
  58. package/coverage/lcov-report/src/TaskGraphValidator.ts.html +0 -649
  59. package/coverage/lcov-report/src/TaskRunner.ts.html +0 -706
  60. package/coverage/lcov-report/src/TaskRunnerBuilder.ts.html +0 -337
  61. package/coverage/lcov-report/src/TaskRunnerExecutionConfig.ts.html +0 -154
  62. package/coverage/lcov-report/src/TaskStateManager.ts.html +0 -529
  63. package/coverage/lcov-report/src/WorkflowExecutor.ts.html +0 -712
  64. package/coverage/lcov-report/src/contracts/ErrorTypes.ts.html +0 -103
  65. package/coverage/lcov-report/src/contracts/RunnerEvents.ts.html +0 -217
  66. package/coverage/lcov-report/src/contracts/index.html +0 -131
  67. package/coverage/lcov-report/src/index.html +0 -236
  68. package/coverage/lcov-report/src/strategies/DryRunExecutionStrategy.ts.html +0 -178
  69. package/coverage/lcov-report/src/strategies/RetryingExecutionStrategy.ts.html +0 -373
  70. package/coverage/lcov-report/src/strategies/StandardExecutionStrategy.ts.html +0 -190
  71. package/coverage/lcov-report/src/strategies/index.html +0 -146
  72. package/coverage/lcov.info +0 -671
  73. package/coverage/prettify.css +0 -1
  74. package/coverage/prettify.js +0 -2
  75. package/coverage/sort-arrow-sprite.png +0 -0
  76. package/coverage/sorter.js +0 -210
  77. package/coverage/src/EventBus.ts.html +0 -379
  78. package/coverage/src/ExecutionConstants.ts.html +0 -121
  79. package/coverage/src/TaskGraphValidationError.ts.html +0 -130
  80. package/coverage/src/TaskGraphValidator.ts.html +0 -649
  81. package/coverage/src/TaskRunner.ts.html +0 -706
  82. package/coverage/src/TaskRunnerBuilder.ts.html +0 -337
  83. package/coverage/src/TaskRunnerExecutionConfig.ts.html +0 -154
  84. package/coverage/src/TaskStateManager.ts.html +0 -529
  85. package/coverage/src/WorkflowExecutor.ts.html +0 -712
  86. package/coverage/src/contracts/ErrorTypes.ts.html +0 -103
  87. package/coverage/src/contracts/RunnerEvents.ts.html +0 -217
  88. package/coverage/src/contracts/index.html +0 -131
  89. package/coverage/src/index.html +0 -236
  90. package/coverage/src/strategies/DryRunExecutionStrategy.ts.html +0 -178
  91. package/coverage/src/strategies/RetryingExecutionStrategy.ts.html +0 -373
  92. package/coverage/src/strategies/StandardExecutionStrategy.ts.html +0 -190
  93. package/coverage/src/strategies/index.html +0 -146
  94. package/test-report.xml +0 -299
package/.gemini/commands/speckit.checklist.toml +0 -298
@@ -1,298 +0,0 @@
1
- description = "Generate a custom checklist for the current feature based on user requirements."
2
-
3
- prompt = """
4
- ---
5
- description: Generate a custom checklist for the current feature based on user requirements.
6
- ---
7
-
8
- ## Checklist Purpose: "Unit Tests for English"
9
-
10
- **CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
11
-
12
- **NOT for verification/testing**:
13
-
14
- - ❌ NOT "Verify the button clicks correctly"
15
- - ❌ NOT "Test error handling works"
16
- - ❌ NOT "Confirm the API returns 200"
17
- - ❌ NOT checking if code/implementation matches the spec
18
-
19
- **FOR requirements quality validation**:
20
-
21
- - ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
22
- - ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
23
- - ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
24
- - ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
25
- - ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
26
-
27
- **Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
28
-
29
- ## User Input
30
-
31
- ```text
32
- $ARGUMENTS
33
- ```
34
-
35
- You **MUST** consider the user input before proceeding (if not empty).
36
-
37
- ## Execution Steps
38
-
39
- 1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
40
- - All file paths must be absolute.
41
- - For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\\''m Groot' (or double-quote if possible: "I'm Groot").
42
-
43
- 2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
44
- - Be generated from the user's phrasing + extracted signals from spec/plan/tasks
45
- - Only ask about information that materially changes checklist content
46
- - Be skipped individually if already unambiguous in `$ARGUMENTS`
47
- - Prefer precision over breadth
48
-
49
- Generation algorithm:
50
- 1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
51
- 2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
52
- 3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
53
- 4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
54
- 5. Formulate questions chosen from these archetypes:
55
- - Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
56
- - Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
57
- - Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
58
- - Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
59
- - Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
60
- - Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
61
-
62
- Question formatting rules:
63
- - If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
64
- - Limit to A–E options maximum; omit table if a free-form answer is clearer
65
- - Never ask the user to restate what they already said
66
- - Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
67
-
68
- Defaults when interaction impossible:
69
- - Depth: Standard
70
- - Audience: Reviewer (PR) if code-related; Author otherwise
71
- - Focus: Top 2 relevance clusters
72
-
73
- Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow‑ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
74
-
75
- 3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
76
- - Derive checklist theme (e.g., security, review, deploy, ux)
77
- - Consolidate explicit must-have items mentioned by user
78
- - Map focus selections to category scaffolding
79
- - Infer any missing context from spec/plan/tasks (do NOT hallucinate)
80
-
81
- 4. **Load feature context**: Read from FEATURE_DIR:
82
- - spec.md: Feature requirements and scope
83
- - plan.md (if exists): Technical details, dependencies
84
- - tasks.md (if exists): Implementation tasks
85
-
86
- **Context Loading Strategy**:
87
- - Load only necessary portions relevant to active focus areas (avoid full-file dumping)
88
- - Prefer summarizing long sections into concise scenario/requirement bullets
89
- - Use progressive disclosure: add follow-on retrieval only if gaps detected
90
- - If source docs are large, generate interim summary items instead of embedding raw text
91
-
92
- 5. **Generate checklist** - Create "Unit Tests for Requirements":
93
- - Create `FEATURE_DIR/checklists/` directory if it doesn't exist
94
- - Generate unique checklist filename:
95
- - Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
96
- - Format: `[domain].md`
97
- - If file exists, append to existing file
98
- - Number items sequentially starting from CHK001
99
- - Each `/speckit.checklist` run creates a NEW file (never overwrites existing checklists)
100
-
101
- **CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
102
- Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
103
- - **Completeness**: Are all necessary requirements present?
104
- - **Clarity**: Are requirements unambiguous and specific?
105
- - **Consistency**: Do requirements align with each other?
106
- - **Measurability**: Can requirements be objectively verified?
107
- - **Coverage**: Are all scenarios/edge cases addressed?
108
-
109
- **Category Structure** - Group items by requirement quality dimensions:
110
- - **Requirement Completeness** (Are all necessary requirements documented?)
111
- - **Requirement Clarity** (Are requirements specific and unambiguous?)
112
- - **Requirement Consistency** (Do requirements align without conflicts?)
113
- - **Acceptance Criteria Quality** (Are success criteria measurable?)
114
- - **Scenario Coverage** (Are all flows/cases addressed?)
115
- - **Edge Case Coverage** (Are boundary conditions defined?)
116
- - **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
117
- - **Dependencies & Assumptions** (Are they documented and validated?)
118
- - **Ambiguities & Conflicts** (What needs clarification?)
119
-
120
- **HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
121
-
122
- ❌ **WRONG** (Testing implementation):
123
- - "Verify landing page displays 3 episode cards"
124
- - "Test hover states work on desktop"
125
- - "Confirm logo click navigates home"
126
-
127
- ✅ **CORRECT** (Testing requirements quality):
128
- - "Are the exact number and layout of featured episodes specified?" [Completeness]
129
- - "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
130
- - "Are hover state requirements consistent across all interactive elements?" [Consistency]
131
- - "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
132
- - "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
133
- - "Are loading states defined for asynchronous episode data?" [Completeness]
134
- - "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
135
-
136
- **ITEM STRUCTURE**:
137
- Each item should follow this pattern:
138
- - Question format asking about requirement quality
139
- - Focus on what's WRITTEN (or not written) in the spec/plan
140
- - Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
141
- - Reference spec section `[Spec §X.Y]` when checking existing requirements
142
- - Use `[Gap]` marker when checking for missing requirements
143
-
144
- **EXAMPLES BY QUALITY DIMENSION**:
145
-
146
- Completeness:
147
- - "Are error handling requirements defined for all API failure modes? [Gap]"
148
- - "Are accessibility requirements specified for all interactive elements? [Completeness]"
149
- - "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
150
-
151
- Clarity:
152
- - "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
153
- - "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
154
- - "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
155
-
156
- Consistency:
157
- - "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
158
- - "Are card component requirements consistent between landing and detail pages? [Consistency]"
159
-
160
- Coverage:
161
- - "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
162
- - "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
163
- - "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
164
-
165
- Measurability:
166
- - "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
167
- - "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
168
-
169
- **Scenario Classification & Coverage** (Requirements Quality Focus):
170
- - Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
171
- - For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
172
- - If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
173
- - Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
174
-
175
- **Traceability Requirements**:
176
- - MINIMUM: ≥80% of items MUST include at least one traceability reference
177
- - Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
178
- - If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
179
-
180
- **Surface & Resolve Issues** (Requirements Quality Problems):
181
- Ask questions about the requirements themselves:
182
- - Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
183
- - Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
184
- - Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
185
- - Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
186
- - Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
187
-
188
- **Content Consolidation**:
189
- - Soft cap: If raw candidate items > 40, prioritize by risk/impact
190
- - Merge near-duplicates checking the same requirement aspect
191
- - If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
192
-
193
- **🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
194
- - ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
195
- - ❌ References to code execution, user actions, system behavior
196
- - ❌ "Displays correctly", "works properly", "functions as expected"
197
- - ❌ "Click", "navigate", "render", "load", "execute"
198
- - ❌ Test cases, test plans, QA procedures
199
- - ❌ Implementation details (frameworks, APIs, algorithms)
200
-
201
- **✅ REQUIRED PATTERNS** - These test requirements quality:
202
- - ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
203
- - ✅ "Is [vague term] quantified/clarified with specific criteria?"
204
- - ✅ "Are requirements consistent between [section A] and [section B]?"
205
- - ✅ "Can [requirement] be objectively measured/verified?"
206
- - ✅ "Are [edge cases/scenarios] addressed in requirements?"
207
- - ✅ "Does the spec define [missing aspect]?"
208
-
209
- 6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
210
-
211
- 7. **Report**: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
212
- - Focus areas selected
213
- - Depth level
214
- - Actor/timing
215
- - Any explicit user-specified must-have items incorporated
216
-
217
- **Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:
218
-
219
- - Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
220
- - Simple, memorable filenames that indicate checklist purpose
221
- - Easy identification and navigation in the `checklists/` folder
222
-
223
- To avoid clutter, use descriptive types and clean up obsolete checklists when done.
224
-
225
- ## Example Checklist Types & Sample Items
226
-
227
- **UX Requirements Quality:** `ux.md`
228
-
229
- Sample items (testing the requirements, NOT the implementation):
230
-
231
- - "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
232
- - "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
233
- - "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
234
- - "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
235
- - "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
236
- - "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
237
-
238
- **API Requirements Quality:** `api.md`
239
-
240
- Sample items:
241
-
242
- - "Are error response formats specified for all failure scenarios? [Completeness]"
243
- - "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
244
- - "Are authentication requirements consistent across all endpoints? [Consistency]"
245
- - "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
246
- - "Is versioning strategy documented in requirements? [Gap]"
247
-
248
- **Performance Requirements Quality:** `performance.md`
249
-
250
- Sample items:
251
-
252
- - "Are performance requirements quantified with specific metrics? [Clarity]"
253
- - "Are performance targets defined for all critical user journeys? [Coverage]"
254
- - "Are performance requirements under different load conditions specified? [Completeness]"
255
- - "Can performance requirements be objectively measured? [Measurability]"
256
- - "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
257
-
258
- **Security Requirements Quality:** `security.md`
259
-
260
- Sample items:
261
-
262
- - "Are authentication requirements specified for all protected resources? [Coverage]"
263
- - "Are data protection requirements defined for sensitive information? [Completeness]"
264
- - "Is the threat model documented and requirements aligned to it? [Traceability]"
265
- - "Are security requirements consistent with compliance obligations? [Consistency]"
266
- - "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
267
-
268
- ## Anti-Examples: What NOT To Do
269
-
270
- **❌ WRONG - These test implementation, not requirements:**
271
-
272
- ```markdown
273
- - [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
274
- - [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
275
- - [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
276
- - [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
277
- ```
278
-
279
- **✅ CORRECT - These test requirements quality:**
280
-
281
- ```markdown
282
- - [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
283
- - [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
284
- - [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
285
- - [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
286
- - [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
287
- - [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
288
- ```
289
-
290
- **Key Differences:**
291
-
292
- - Wrong: Tests if the system works correctly
293
- - Correct: Tests if the requirements are written correctly
294
- - Wrong: Verification of behavior
295
- - Correct: Validation of requirement quality
296
- - Wrong: "Does it do X?"
297
- - Correct: "Is X clearly specified?"
298
- """
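Reviewer note on the removed file above: step 6 of the deleted prompt describes a fallback layout (H1 title, purpose/created meta lines, `##` category sections, and `- [ ] CHK### <requirement item>` lines with IDs incrementing globally from CHK001). A minimal TypeScript sketch of that layout might look like the following. The names `ChecklistCategory`, `formatId`, and `renderChecklist`, the exact wording of the meta lines, and the sample data are all assumptions made for illustration; none of them come from the package or from the removed command file.

```typescript
// Hypothetical shape for one category section; not part of the package.
interface ChecklistCategory {
  heading: string;
  items: string[];
}

// Formats an ID such as CHK001, zero-padding to at least three digits.
function formatId(n: number): string {
  const digits = String(n);
  return "CHK" + (digits.length >= 3 ? digits : ("00" + digits).slice(-3));
}

// Renders the fallback layout: H1 title, two meta lines (wording assumed),
// then one `##` section per category with globally incrementing IDs.
function renderChecklist(
  title: string,
  purpose: string,
  created: string,
  categories: ChecklistCategory[],
): string {
  const lines: string[] = [
    `# ${title}`,
    "",
    `**Purpose**: ${purpose}`,
    `**Created**: ${created}`,
  ];
  let next = 1; // IDs are shared across categories and start at CHK001
  for (const category of categories) {
    lines.push("", `## ${category.heading}`, "");
    for (const item of category.items) {
      lines.push(`- [ ] ${formatId(next)} ${item}`);
      next += 1;
    }
  }
  return lines.join("\n");
}

// Sample data reusing question-style items quoted in the removed prompt.
const sample = renderChecklist(
  "UX Requirements Quality",
  "Validate the quality of the written requirements",
  "2026-01-18",
  [
    {
      heading: "Requirement Clarity",
      items: ["Is 'prominent display' quantified with specific sizing/positioning? [Clarity]"],
    },
    {
      heading: "Edge Case Coverage",
      items: ["Is fallback behavior defined when images fail to load? [Edge Case, Gap]"],
    },
  ],
);
console.log(sample);
```

The single counter shared across categories is the point of the sketch: it keeps IDs unique within one checklist file, which is what "globally incrementing" in the removed text appears to require.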
package/.gemini/commands/speckit.clarify.toml +0 -185
@@ -1,185 +0,0 @@
1
- description = "Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec."
2
-
3
- prompt = """
4
- ---
5
- description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
6
- handoffs:
7
- - label: Build Technical Plan
8
- agent: speckit.plan
9
- prompt: Create a plan for the spec. I am building with...
10
- ---
11
-
12
- ## User Input
13
-
14
- ```text
15
- $ARGUMENTS
16
- ```
17
-
18
- You **MUST** consider the user input before proceeding (if not empty).
19
-
20
- ## Outline
21
-
22
- Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
23
-
24
- Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
25
-
26
- Execution steps:
27
-
28
- 1. Run `.specify/scripts/bash/check-prerequisites.sh --json --paths-only` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
29
- - `FEATURE_DIR`
30
- - `FEATURE_SPEC`
31
- - (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
32
- - If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
33
- - For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\\''m Groot' (or double-quote if possible: "I'm Groot").
34
-
35
- 2. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
36
-
37
- Functional Scope & Behavior:
38
- - Core user goals & success criteria
39
- - Explicit out-of-scope declarations
40
- - User roles / personas differentiation
41
-
42
- Domain & Data Model:
43
- - Entities, attributes, relationships
44
- - Identity & uniqueness rules
45
- - Lifecycle/state transitions
46
- - Data volume / scale assumptions
47
-
48
- Interaction & UX Flow:
49
- - Critical user journeys / sequences
50
- - Error/empty/loading states
51
- - Accessibility or localization notes
52
-
53
- Non-Functional Quality Attributes:
54
- - Performance (latency, throughput targets)
55
- - Scalability (horizontal/vertical, limits)
56
- - Reliability & availability (uptime, recovery expectations)
57
- - Observability (logging, metrics, tracing signals)
58
- - Security & privacy (authN/Z, data protection, threat assumptions)
59
- - Compliance / regulatory constraints (if any)
60
-
61
- Integration & External Dependencies:
62
- - External services/APIs and failure modes
63
- - Data import/export formats
64
- - Protocol/versioning assumptions
65
-
66
- Edge Cases & Failure Handling:
67
- - Negative scenarios
68
- - Rate limiting / throttling
69
- - Conflict resolution (e.g., concurrent edits)
70
-
71
- Constraints & Tradeoffs:
72
- - Technical constraints (language, storage, hosting)
73
- - Explicit tradeoffs or rejected alternatives
74
-
75
- Terminology & Consistency:
76
- - Canonical glossary terms
77
- - Avoided synonyms / deprecated terms
78
-
79
- Completion Signals:
80
- - Acceptance criteria testability
81
- - Measurable Definition of Done style indicators
82
-
83
- Misc / Placeholders:
84
- - TODO markers / unresolved decisions
85
- - Ambiguous adjectives ("robust", "intuitive") lacking quantification
86
-
87
- For each category with Partial or Missing status, add a candidate question opportunity unless:
88
- - Clarification would not materially change implementation or validation strategy
89
- - Information is better deferred to planning phase (note internally)
90
-
91
- 3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
92
- - Maximum of 10 total questions across the whole session.
93
- - Each question must be answerable with EITHER:
94
- - A short multiple‑choice selection (2–5 distinct, mutually exclusive options), OR
95
- - A one-word / short‑phrase answer (explicitly constrain: "Answer in <=5 words").
96
- - Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
97
- - Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
98
- - Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
99
- - Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
100
- - If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
101
-
102
- 4. Sequential questioning loop (interactive):
103
- - Present EXACTLY ONE question at a time.
104
- - For multiple‑choice questions:
105
- - **Analyze all options** and determine the **most suitable option** based on:
106
- - Best practices for the project type
107
- - Common patterns in similar implementations
108
- - Risk reduction (security, performance, maintainability)
109
- - Alignment with any explicit project goals or constraints visible in the spec
110
- - Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
111
- - Format as: `**Recommended:** Option [X] - <reasoning>`
112
- - Then render all options as a Markdown table:
113
-
114
- | Option | Description |
115
- |--------|-------------|
116
- | A | <Option A description> |
117
- | B | <Option B description> |
118
- | C | <Option C description> (add D/E as needed up to 5) |
119
- | Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
120
-
121
- - After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
122
- - For short‑answer style (no meaningful discrete options):
123
- - Provide your **suggested answer** based on best practices and context.
124
- - Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
125
- - Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
126
- - After the user answers:
127
- - If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
128
- - Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
129
- - If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
130
- - Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
131
- - Stop asking further questions when:
132
- - All critical ambiguities resolved early (remaining queued items become unnecessary), OR
133
- - User signals completion ("done", "good", "no more"), OR
134
- - You reach 5 asked questions.
135
- - Never reveal future queued questions in advance.
136
- - If no valid questions exist at start, immediately report no critical ambiguities.
-
- 5. Integration after EACH accepted answer (incremental update approach):
- - Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
- - For the first integrated answer in this session:
- - Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
- - Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
- - Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
- - Then immediately apply the clarification to the most appropriate section(s):
- - Functional ambiguity → Update or add a bullet in Functional Requirements.
- - User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
- - Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
- - Non-functional constraint → Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
- - Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
- - Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
- - If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
- - Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
- - Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
- - Keep each inserted clarification minimal and testable (avoid narrative drift).
-
- 6. Validation (performed after EACH write plus final pass):
- - Clarifications session contains exactly one bullet per accepted answer (no duplicates).
- - Total asked (accepted) questions ≤ 5.
- - Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
- - No contradictory earlier statement remains (scan for now-invalid alternative choices removed).
- - Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
- - Terminology consistency: same canonical term used across all updated sections.
-
- 7. Write the updated spec back to `FEATURE_SPEC`.
-
- 8. Report completion (after questioning loop ends or early termination):
- - Number of questions asked & answered.
- - Path to updated spec.
- - Sections touched (list names).
- - Coverage summary table listing each taxonomy category with Status: Resolved (was Partial/Missing and addressed), Deferred (exceeds question quota or better suited for planning), Clear (already sufficient), Outstanding (still Partial/Missing but low impact).
- - If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
- - Suggested next command.
-
- Behavior rules:
-
- - If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
- - If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
- - Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
- - Avoid speculative tech stack questions unless the absence blocks functional clarity.
- - Respect user early termination signals ("stop", "done", "proceed").
- - If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
- - If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
-
- Context for prioritization: {{args}}
- """
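The answer-handling rules in the removed prompt above (accept "yes", "recommended", or "suggested"; otherwise match an option letter or a short answer of at most five words; ask again when ambiguous) can be sketched roughly as follows. This is only an illustrative Python sketch: `resolve_answer`, its parameters, and the `allow_short` flag are hypothetical names and are not part of the package or of the prompt file.

```python
# Hypothetical helper; names and signature are assumptions for illustration only.
ACCEPT_WORDS = {"yes", "recommended", "suggested"}
MAX_SHORT_WORDS = 5


def resolve_answer(reply, options, recommendation, allow_short=False):
    """Return the final answer for one question, or None if disambiguation is needed.

    reply:          raw user reply
    options:        dict of option letter -> description (empty for short-answer questions)
    recommendation: the recommended option letter or suggested answer stated earlier
    allow_short:    whether a free-form answer of at most five words is acceptable
    """
    text = reply.strip().strip("\"'")
    if text.lower() in ACCEPT_WORDS:
        # The user accepted the previously stated recommendation/suggestion.
        return recommendation
    if text.upper() in options:
        # The reply maps to exactly one option letter.
        return text.upper()
    if allow_short and 0 < len(text.split()) <= MAX_SHORT_WORDS:
        # Free-form short answer within the word limit.
        return text
    # Ambiguous: the caller should re-ask without advancing the question count.
    return None
```

A caller would keep re-asking the same question while the result is `None`, so retries do not count toward the five-question limit.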
@@ -1,86 +0,0 @@
- description = "Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync."
-
- prompt = """
- ---
- description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
- handoffs:
- - label: Build Specification
- agent: speckit.specify
- prompt: Implement the feature specification based on the updated constitution. I want to build...
- ---
-
- ## User Input
-
- ```text
- $ARGUMENTS
- ```
-
- You **MUST** consider the user input before proceeding (if not empty).
-
- ## Outline
-
- You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
-
- Follow this execution flow:
-
- 1. Load the existing constitution template at `.specify/memory/constitution.md`.
- - Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
- **IMPORTANT**: The user might require fewer or more principles than the ones used in the template. If a number is specified, respect it while following the general template. You will update the doc accordingly.
-
- 2. Collect/derive values for placeholders:
- - If user input (conversation) supplies a value, use it.
- - Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
- - For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
- - `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
- - MAJOR: Backward incompatible governance/principle removals or redefinitions.
- - MINOR: New principle/section added or materially expanded guidance.
- - PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
- - If the version bump type is ambiguous, propose reasoning before finalizing.
-
- 3. Draft the updated constitution content:
- - Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
- - Preserve heading hierarchy; comments can be removed once replaced unless they still add clarifying guidance.
- - Ensure each Principle section: succinct name line, paragraph (or bullet list) capturing non‑negotiable rules, explicit rationale if not obvious.
- - Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
-
- 4. Consistency propagation checklist (convert prior checklist into active validations):
- - Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
- - Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
- - Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
- - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references (agent-specific names like CLAUDE only) remain when generic guidance is required.
- - Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.
-
- 5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
- - Version change: old → new
- - List of modified principles (old title → new title if renamed)
- - Added sections
- - Removed sections
- - Templates requiring updates (✅ updated / ⚠ pending) with file paths
- - Follow-up TODOs if any placeholders intentionally deferred.
-
- 6. Validation before final output:
- - No remaining unexplained bracket tokens.
- - Version line matches report.
- - Dates ISO format YYYY-MM-DD.
- - Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
-
- 7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).
-
- 8. Output a final summary to the user with:
- - New version and bump rationale.
- - Any files flagged for manual follow-up.
- - Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
-
- Formatting & Style Requirements:
-
- - Use Markdown headings exactly as in the template (do not demote/promote levels).
- - Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
- - Keep a single blank line between sections.
- - Avoid trailing whitespace.
-
- If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
-
- If critical info is missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include it in the Sync Impact Report under deferred items.
-
- Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.
- """
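
Several checks described in the removed constitution prompt above are mechanical: finding leftover `[ALL_CAPS_IDENTIFIER]` tokens, bumping `CONSTITUTION_VERSION` by MAJOR/MINOR/PATCH, and confirming dates use `YYYY-MM-DD`. The following Python sketch shows one way they could be expressed. The function names and the exact placeholder regex are assumptions made for illustration; the prompt itself does not define any code, and nothing here is part of the package.

```python
import re
from datetime import datetime

# Assumed token shape: square brackets around an upper-case identifier,
# e.g. [PROJECT_NAME] or [PRINCIPLE_1_NAME]. This pattern is an illustration only.
PLACEHOLDER = re.compile(r"\[[A-Z][A-Z0-9_]*\]")
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def leftover_placeholders(text):
    """Return the distinct placeholder tokens still present, sorted."""
    return sorted(set(PLACEHOLDER.findall(text)))


def bump_version(version, kind):
    """Apply a semantic-version bump ("major", "minor", or "patch")."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump kind: {kind!r}")


def is_iso_date(value):
    """True only for a real calendar date written as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

Under these assumptions, a non-empty result from `leftover_placeholders` would correspond to the "No remaining unexplained bracket tokens" check failing, unless each remaining token is explicitly justified.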