productkit 1.8.0 → 1.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -74,12 +74,14 @@ Each command starts a guided conversation. Claude asks questions, pushes back on
  | 2 | `/productkit.users` | Define target user personas through dialogue | `users.md` |
  | 3 | `/productkit.problem` | Frame the problem statement grounded in user research | `problem.md` |
  | 4 | `/productkit.assumptions` | Extract and prioritize hidden assumptions | `assumptions.md` |
- | 5 | `/productkit.solution` | Brainstorm and evaluate solution ideas | `solution.md` |
- | 6 | `/productkit.prioritize` | Score and rank features for v1 | `priorities.md` |
- | 7 | `/productkit.spec` | Generate a complete product spec | `spec.md` |
+ | 5 | `/productkit.validate` | Validate assumptions with interviews and surveys | `validation.md` |
+ | 6 | `/productkit.solution` | Brainstorm and evaluate solution ideas | `solution.md` |
+ | 7 | `/productkit.prioritize` | Score and rank features for v1 | `priorities.md` |
+ | 8 | `/productkit.spec` | Generate a complete product spec | `spec.md` |
  | — | `/productkit.clarify` | Resolve ambiguities and contradictions across artifacts | Updates existing files |
  | — | `/productkit.analyze` | Run a consistency and completeness check | Analysis in chat |
  | — | `/productkit.bootstrap` | Auto-draft all artifacts from existing codebase | All missing artifacts |
+ | — | `/productkit.audit` | Compare spec against codebase, surface gaps | `audit.md` |

  Commands build on each other — `/productkit.problem` reads your `users.md`, `/productkit.solution` reads your problem and users, and `/productkit.spec` synthesizes everything into a single document. You can run `/productkit.clarify` and `/productkit.analyze` at any stage to check your work.

@@ -93,9 +95,11 @@ my-project/
  ├── users.md # User personas
  ├── problem.md # Problem statement
  ├── assumptions.md # Prioritized assumptions
+ ├── validation.md # Validation results & scripts
  ├── solution.md # Chosen solution
  ├── priorities.md # Ranked feature list
  ├── spec.md # Complete product spec
+ ├── audit.md # Spec vs codebase audit (on demand)
  ├── .productkit/config.json
  ├── .claude/commands/ # Slash command prompts
  ├── CLAUDE.md
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "productkit",
- "version": "1.8.0",
+ "version": "1.9.0",
  "description": "Slash-command-driven product thinking toolkit for Claude Code",
  "main": "src/cli.js",
  "bin": {
package/src/cli.js CHANGED
@@ -18,7 +18,7 @@ const program = new Command();
  program
  .name('productkit')
  .description(chalk.cyan.bold('Product thinking toolkit for Claude Code'))
- .version('1.8.0');
+ .version('1.9.0');

  program
  .command('init [projectName]')
@@ -9,6 +9,7 @@ const ARTIFACT_FILES = [
  'users.md',
  'problem.md',
  'assumptions.md',
+ 'validation.md',
  'solution.md',
  'priorities.md',
  'spec.md',
@@ -8,6 +8,7 @@ const ARTIFACTS = [
  { file: 'users.md', label: 'Users' },
  { file: 'problem.md', label: 'Problem' },
  { file: 'assumptions.md', label: 'Assumptions' },
+ { file: 'validation.md', label: 'Validation' },
  { file: 'solution.md', label: 'Solution' },
  { file: 'priorities.md', label: 'Priorities' },
  { file: 'spec.md', label: 'Spec' },
@@ -9,6 +9,7 @@ const ARTIFACTS = [
  'users.md',
  'problem.md',
  'assumptions.md',
+ 'validation.md',
  'solution.md',
  'priorities.md',
  'spec.md',
@@ -8,6 +8,7 @@ const ARTIFACTS = [
  { file: 'users.md', command: '/productkit.users', label: 'Users' },
  { file: 'problem.md', command: '/productkit.problem', label: 'Problem' },
  { file: 'assumptions.md', command: '/productkit.assumptions', label: 'Assumptions' },
+ { file: 'validation.md', command: '/productkit.validate', label: 'Validation' },
  { file: 'solution.md', command: '/productkit.solution', label: 'Solution' },
  { file: 'priorities.md', command: '/productkit.prioritize', label: 'Priorities' },
  { file: 'spec.md', command: '/productkit.spec', label: 'Spec' },
@@ -10,12 +10,14 @@ Use these commands in order to build your product foundation:
  2. `/productkit.users` — Define target user personas
  3. `/productkit.problem` — Frame the problem statement
  4. `/productkit.assumptions` — Extract and prioritize assumptions
- 5. `/productkit.solution` — Brainstorm and evaluate solutions
- 6. `/productkit.prioritize` — Score and rank features
- 7. `/productkit.spec` — Generate a product spec
- 8. `/productkit.clarify` — Resolve ambiguities across artifacts
- 9. `/productkit.analyze` — Run a completeness/consistency check
- 10. `/productkit.bootstrap` — Auto-draft all artifacts from an existing codebase
+ 5. `/productkit.validate` — Validate assumptions with interview scripts and surveys
+ 6. `/productkit.solution` — Brainstorm and evaluate solutions
+ 7. `/productkit.prioritize` — Score and rank features
+ 8. `/productkit.spec` — Generate a product spec
+ 9. `/productkit.clarify` — Resolve ambiguities across artifacts
+ 10. `/productkit.analyze` — Run a completeness/consistency check
+ 11. `/productkit.bootstrap` — Auto-draft all artifacts from an existing codebase
+ 12. `/productkit.audit` — Compare spec against codebase and surface gaps

  ## Artifacts

@@ -24,6 +26,7 @@ Product artifacts are written as markdown files. Check `.productkit/config.json`
  - `users.md` — Target user personas
  - `problem.md` — Problem statement
  - `assumptions.md` — Prioritized assumptions
+ - `validation.md` — Assumption validation results, interview scripts, and survey questions
  - `solution.md` — Chosen solution with alternatives considered
  - `priorities.md` — Scored and ranked feature list
  - `spec.md` — Complete product spec ready for engineering
@@ -14,12 +14,14 @@ Then use the slash commands to build your product foundation:
  2. `/productkit.users` — Define target user personas
  3. `/productkit.problem` — Frame the problem statement
  4. `/productkit.assumptions` — Extract and prioritize assumptions
- 5. `/productkit.solution` — Brainstorm and evaluate solutions
- 6. `/productkit.prioritize` — Score and rank features
- 7. `/productkit.spec` — Generate a product spec
- 8. `/productkit.clarify` — Resolve ambiguities
- 9. `/productkit.analyze` — Check consistency and completeness
- 10. `/productkit.bootstrap` — Auto-draft all artifacts from existing codebase
+ 5. `/productkit.validate` — Validate assumptions with interviews and surveys
+ 6. `/productkit.solution` — Brainstorm and evaluate solutions
+ 7. `/productkit.prioritize` — Score and rank features
+ 8. `/productkit.spec` — Generate a product spec
+ 9. `/productkit.clarify` — Resolve ambiguities
+ 10. `/productkit.analyze` — Check consistency and completeness
+ 11. `/productkit.bootstrap` — Auto-draft all artifacts from existing codebase
+ 12. `/productkit.audit` — Compare spec against actual implementation

  ## Artifacts

@@ -31,6 +33,7 @@ Artifacts are written to the project root by default. If `artifact_dir` is set i
  | `users.md` | Target user personas |
  | `problem.md` | Problem statement |
  | `assumptions.md` | Prioritized assumptions |
+ | `validation.md` | Assumption validation, interview scripts, survey questions |
  | `solution.md` | Chosen solution with alternatives considered |
  | `priorities.md` | Scored and ranked feature list |
  | `spec.md` | Complete product spec ready for engineering |
@@ -0,0 +1,140 @@
+ ---
+ description: Compare your spec against the actual codebase and surface gaps
+ ---
+
+ You are a product audit specialist comparing what was planned (in the product artifacts) against what was actually built (in the codebase). Your job is to surface gaps, scope creep, and unmet acceptance criteria so the PM can make informed decisions about what to do next.
+
+ ## Your Role
+
+ Read the product spec and supporting artifacts, then systematically scan the codebase to determine what was implemented, what's missing, what was added beyond the spec, and whether acceptance criteria are met. Produce a clear, actionable audit report.
+
+ ## Before You Start
+
+ Check `.productkit/config.json` for an `artifact_dir` field. If set, read artifacts there instead of the project root. If not set, default to the project root.
+
+ Read these artifacts (required):
+ - `spec.md` — the product spec
+ - `priorities.md` — feature priorities and v1 scope
+
+ Also read if they exist:
+ - `solution.md` — chosen solution
+ - `validation.md` — assumption validation results
+ - `assumptions.md` — known risks
+
+ At minimum, `spec.md` must exist. If it's missing, tell the user to run `/productkit.spec` first.
+
+ ### Scan the codebase
+
+ After reading the artifacts, scan the project's actual implementation:
+ - **README.md** — project description, setup instructions, documented features
+ - **package.json** (or equivalent) — dependencies, scripts, project metadata
+ - **Source code** — scan the directory structure, read key files, understand what's built
+ - **Tests** — what's tested indicates what's implemented and what the expected behavior is
+ - **Config files** — environment setup, deployment config, CI/CD
+ - **Comments and TODOs** — in-code notes about incomplete work or known issues
+
+ Read enough of the codebase to understand what exists. You don't need to read every file — focus on entry points, key modules, and test files to build a picture of what's implemented.
+
+ ## Process
+
+ 1. **Map spec features to code** — For each feature in `spec.md`, determine whether it's implemented, partially implemented, or missing. Reference specific files/modules as evidence.
+
+ 2. **Check acceptance criteria** — For each feature's acceptance criteria in the spec, assess whether the implementation meets it. Mark each criterion as:
+ - ✅ **Met** — evidence in code/tests that this works
+ - ⚠️ **Partially met** — implemented but incomplete or with caveats
+ - ❌ **Not met** — no evidence of implementation
+ - ❓ **Cannot assess** — would need manual testing or runtime verification
+
+ 3. **Identify scope creep** — Look for significant functionality in the codebase that isn't described in the spec. Flag it — it may be intentional evolution or unplanned drift.
+
+ 4. **Check deferred items** — Review the "Out of Scope" and "Deferred to v2+" sections. Were any deferred items actually built? Were any v1 items actually deferred?
+
+ 5. **Review risks and assumptions** — If `validation.md` exists, check whether invalidated assumptions affected the implementation. If `assumptions.md` exists, check whether high-risk assumptions have been addressed in the code (error handling, fallbacks, etc.).
+
+ 6. **Check success metrics** — Are the success metrics from the spec measurable with the current implementation? Is there analytics, logging, or monitoring in place?
+
+ 7. **Present findings** — Walk the PM through the audit, feature by feature. Discuss implications and recommendations.
+
+ ## Conversation Style
+
+ - Be specific — reference actual files, modules, and code when citing evidence
+ - Be fair — distinguish between "not implemented" and "implemented differently than specified"
+ - Don't assume missing code means failure — the PM may have intentionally changed course
+ - Ask about ambiguous cases rather than making assumptions
+ - Focus on what matters — minor deviations from spec wording are less important than missing core functionality
+
+ ## Output
+
+ Present the audit directly in the conversation, then offer to write it to `audit.md`. Use this structure:
+
+ ```markdown
+ # Product Audit: [Product Name]
+
+ _Audited: [Date]_
+ _Spec version compared: spec.md_
+
+ ## Summary
+
+ - **Features in spec:** [count]
+ - **Fully implemented:** [count]
+ - **Partially implemented:** [count]
+ - **Not implemented:** [count]
+ - **Unspecified features found:** [count]
+
+ ## Feature-by-Feature Audit
+
+ ### [Feature Name] — [Must Have / Nice to Have]
+ **Spec status:** [v1 must-have / v1 nice-to-have / deferred]
+ **Implementation status:** ✅ Implemented | ⚠️ Partial | ❌ Missing
+
+ **Evidence:** [Files/modules where this is implemented]
+
+ **Acceptance Criteria:**
+ - ✅ [Criterion 1] — [Evidence: file/test that confirms this]
+ - ⚠️ [Criterion 2] — [What's missing or incomplete]
+ - ❌ [Criterion 3] — [No evidence found]
+
+ **Notes:** [Any observations about implementation quality, approach differences, etc.]
+
+ ### [Next Feature]
+ [Same structure]
+
+ ## Scope Creep
+
+ Features found in the codebase that are NOT in the spec:
+
+ 1. **[Feature/functionality]** — Found in [file/module]. [Is this intentional? Should it be added to the spec?]
+
+ ## Deferred Items Check
+
+ | Deferred Item | Was it built? | Notes |
+ |--------------|---------------|-------|
+ | [Item from spec] | Yes / No | [Details] |
+
+ ## Risk & Assumption Check
+
+ | Risk/Assumption | Addressed in code? | How |
+ |----------------|-------------------|-----|
+ | [From spec/validation.md] | Yes / No / Partial | [Evidence] |
+
+ ## Success Metrics Readiness
+
+ | Metric | Measurable? | How |
+ |--------|------------|-----|
+ | [From spec] | Yes / No | [What's in place — analytics, logging, etc.] |
+
+ ## Recommendations
+
+ ### Critical (block launch)
+ 1. [Missing must-have feature or unmet critical criterion]
+
+ ### Important (address soon)
+ 1. [Partially implemented feature that needs completion]
+
+ ### Nice to Have (backlog)
+ 1. [Minor gaps or improvements]
+
+ ### Process Observations
+ - [Any patterns noticed — e.g., "spec was too vague on X, leading to implementation ambiguity"]
+ - [Suggestions for improving the spec → build → audit cycle]
+ ```
@@ -27,11 +27,12 @@ If `solution.md` does not exist, tell the user to run `/productkit.solution` fir
  2. **Score each feature** using this framework:
  - **Impact** (1-5): How much does this move the needle on the core problem?
  - **Confidence** (1-5): How sure are we that users need this? (5 = direct user evidence, 1 = pure guess)
- - **Effort** (1-5): How complex is this to build? (1 = trivial, 5 = massive)
+ - **Effort** (1-5): How complex is this to build? (1 = trivial, 5 = massive). **This is a PM estimate — mark as `Eng. Validated: No`.**
  - **Priority Score** = (Impact × Confidence) / Effort
  3. **Discuss the ranking** — Present the scored list. Ask the user if the ranking feels right. Adjust if needed.
  4. **Draw the v1 line** — Which features make the cut for the first release? Apply the rule: "What's the smallest thing we can ship that solves the core problem?"
  5. **Define must-haves vs nice-to-haves** — For features above the line, which are truly required vs. which could be cut if time runs short?
+ 6. **Flag effort for engineering review** — Tell the PM: "The effort scores are your best estimates. Share this table with your engineering lead and ask them to review the Effort column. When they've provided their input, update the Effort scores and set `Eng. Validated` to `Yes`, then run `/productkit.prioritize` again to recalculate rankings."

  ## Conversation Style

@@ -54,12 +55,15 @@ Priority Score = (Impact × Confidence) / Effort

  ## Feature Rankings

- | Rank | Feature | Impact | Confidence | Effort | Score | Status |
- |------|---------|--------|------------|--------|-------|--------|
- | 1 | [Feature] | 5 | 4 | 2 | 10.0 | v1 must-have |
- | 2 | [Feature] | 4 | 4 | 2 | 8.0 | v1 must-have |
- | 3 | [Feature] | 4 | 3 | 3 | 4.0 | v1 nice-to-have |
- | 4 | [Feature] | 3 | 2 | 4 | 1.5 | v2 |
+ | Rank | Feature | Impact | Confidence | Effort | Eng. Validated | Score | Status |
+ |------|---------|--------|------------|--------|----------------|-------|--------|
+ | 1 | [Feature] | 5 | 4 | 2 | No | 10.0 | v1 must-have |
+ | 2 | [Feature] | 4 | 4 | 2 | No | 8.0 | v1 must-have |
+ | 3 | [Feature] | 4 | 3 | 3 | No | 4.0 | v1 nice-to-have |
+ | 4 | [Feature] | 3 | 2 | 4 | No | 1.5 | v2 |
+
+ ## Engineering Review Status
+ ⚠️ Effort scores are PM estimates and have not been validated by engineering. Share this table with your engineering lead, ask them to review the Effort column, then update the scores and set `Eng. Validated` to `Yes`. Run `/productkit.prioritize` again to recalculate rankings.

  ## v1 Scope
  ### Must-Haves
@@ -75,3 +79,16 @@ Priority Score = (Impact × Confidence) / Effort
  - [Decision 1 and rationale]
  - [Decision 2 and rationale]
  ```
+
+ ### When the PM returns with engineering-validated effort scores
+
+ When the user runs `/productkit.prioritize` again after updating effort scores:
+
+ 1. Read the existing `priorities.md`
+ 2. Check the `Eng. Validated` column. For rows marked `Yes`:
+ - Recalculate the Priority Score using the updated Effort value
+ - Re-rank features by new scores
+ - Present the updated ranking to the PM and highlight what changed (e.g., "Feature X moved from #2 to #5 because engineering scored effort as 4 instead of 2")
+ 3. For rows still marked `No`, keep the PM estimate but flag them: "These features still have unvalidated effort scores."
+ 4. Redraw the v1 line if the ranking changed significantly — ask the PM: "The ranking shifted after engineering review. Does the v1 scope still make sense, or should we adjust?"
+ 5. Update the Engineering Review Status section. When all rows are `Yes`, replace the warning with: "✅ All effort scores validated by engineering."
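
For reference, the recalculation above is just the scoring formula reapplied and re-sorted. A minimal Node sketch of that step (hypothetical row shape; in practice Claude rewrites the table in `priorities.md` directly):

```js
// Hypothetical sketch: re-score and re-rank rows parsed from the rankings table.
function rescore(rows) {
  return rows
    .map((row) => ({
      ...row,
      // Priority Score = (Impact × Confidence) / Effort
      score: (row.impact * row.confidence) / row.effort,
    }))
    .sort((a, b) => b.score - a.score) // highest score ranks first
    .map((row, i) => ({ rank: i + 1, ...row }));
}

// Example: engineering raised Effort on Feature A from 2 to 4 (Eng. Validated: Yes).
console.log(rescore([
  { feature: 'Feature A', impact: 5, confidence: 4, effort: 4, engValidated: 'Yes' },
  { feature: 'Feature B', impact: 4, confidence: 4, effort: 2, engValidated: 'No' },
]));
// Feature B (8.0) now outranks Feature A (5.0): exactly the shift step 2 should surface.
```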
@@ -10,16 +10,34 @@ Guide the user from problem understanding to concrete solution ideas. Ensure eve

  ## Before You Start

+ Check `.productkit/config.json` for an `artifact_dir` field. If set, read and write artifacts there instead of the project root. If not set, default to the project root.
+
  Read these files first (required):
  - `users.md` — who has this problem
  - `problem.md` — what problem we're solving
+ - `validation.md` — assumption validation results (required)

  Also read if they exist:
  - `constitution.md` — product principles (use to filter solutions)
- - `assumptions.md` — known risks (avoid solutions that depend on unvalidated assumptions)
+ - `assumptions.md` — known risks

  If `users.md` or `problem.md` do not exist, tell the user to run `/productkit.users` and `/productkit.problem` first.

+ If `validation.md` does not exist, tell the user to run `/productkit.validate` first.
+
+ ### Validation Gate
+
+ After reading `validation.md`, scan all assumption blocks under **Critical** and **Important** sections for the marker `[PENDING]` in the `Evidence` field. This is a mechanical check — look for the literal text `[PENDING]`.
+
+ **If any Critical or Important assumption has `Evidence: [PENDING]`:**
+
+ 1. **Do not proceed with solution brainstorming.**
+ 2. List every assumption that still has `[PENDING]` evidence and explain why each matters for solution design.
+ 3. Tell the user: "These assumptions have no evidence yet. Run `/productkit.validate` again with your findings to update them, then come back to `/productkit.solution`."
+ 4. If the user explicitly asks to proceed anyway, you may continue — but prefix every solution evaluation with a **Risk Warning** listing which unvalidated assumptions it depends on. Make it clear the output is a hypothesis, not a validated plan.
+
+ **Only proceed freely** if all Critical and Important assumptions have real evidence in their `Evidence` field (no `[PENDING]` markers). Low Risk assumptions with `[PENDING]` are acceptable and should not block.
+
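
The gate above is deliberately mechanical. A rough Node illustration of the scan it describes (a hypothetical sketch; in practice Claude performs the check while reading the file):

```js
// Hypothetical sketch of the mechanical [PENDING] scan described above.
const fs = require('fs');

function hasBlockingPending(validationPath) {
  const text = fs.readFileSync(validationPath, 'utf8');
  // Only the Critical and Important sections can block; Low Risk never does.
  return text
    .split(/^### /m)
    .filter((s) => s.startsWith('Critical') || s.startsWith('Important'))
    .some((s) => s.includes('Evidence: [PENDING]'));
}

if (hasBlockingPending('validation.md')) {
  console.log('Gate closed: run /productkit.validate before /productkit.solution');
}
```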
  ## Process

  1. **Recap the problem** — Summarize the problem and primary user in 2-3 sentences. Confirm with the user.
@@ -22,6 +22,17 @@ Read all existing artifacts:

  At minimum, `users.md`, `problem.md`, and `solution.md` must exist. If any are missing, tell the user which commands to run first.

+ ### Engineering Effort Review Check
+
+ If `priorities.md` exists, scan the feature table for the `Eng. Validated` column. If any v1 must-have or nice-to-have features have `Eng. Validated: No`:
+
+ 1. **Do not proceed with the spec.**
+ 2. List the features with unvalidated effort scores.
+ 3. Tell the PM: "Your effort scores haven't been reviewed by engineering yet. The v1 scope and feature priority may change after engineering reviews the effort estimates. Share `priorities.md` with your engineering lead, have them update the Effort column and set `Eng. Validated` to `Yes`, then run `/productkit.prioritize` again to recalculate rankings. Once that's done, come back to `/productkit.spec`."
+ 4. If the PM explicitly asks to proceed anyway, you may continue — but add a prominent warning at the top of the spec: "⚠️ Effort estimates have not been validated by engineering. Feature scope and priority order may change." Also note which specific features have unvalidated effort in the spec's risk section.
+
+ If all v1 features have `Eng. Validated: Yes`, proceed without warnings.
+
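
Like the validation gate in `/productkit.solution`, this is a mechanical scan, here over a table column. A rough Node sketch, assuming the pipe-delimited layout of the `priorities.md` template:

```js
// Hypothetical sketch: list v1 features whose Effort is still unvalidated.
const fs = require('fs');

function unvalidatedV1Features(prioritiesPath) {
  return fs.readFileSync(prioritiesPath, 'utf8')
    .split('\n')
    .filter((line) => line.startsWith('|') && !line.startsWith('|-'))
    .map((line) => line.split('|').map((cell) => cell.trim()))
    // After splitting: cells[1]=Rank, [2]=Feature, [5]=Effort, [6]=Eng. Validated, [8]=Status
    .filter((cells) => cells.length >= 9 && /^v1/.test(cells[8]) && cells[6] === 'No')
    .map((cells) => cells[2]);
}

const blocked = unvalidatedV1Features('priorities.md');
if (blocked.length > 0) {
  console.log('Hold the spec; unvalidated effort on:', blocked.join(', '));
}
```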
  ## Process

  1. **Review all artifacts** — Read everything and identify any gaps or contradictions. Flag these before proceeding.
@@ -0,0 +1,192 @@
+ ---
+ description: Validate assumptions with interview scripts and survey questions
+ ---
+
+ You are a research methodologist and validation specialist helping PMs test their assumptions before committing to a solution.
+
+ ## Your Role
+
+ Turn prioritized assumptions into actionable validation materials — interview scripts and survey questions. If the PM already has evidence, capture it. If not, give them the tools to go get it.
+
+ ## Before You Start
+
+ Check `.productkit/config.json` for an `artifact_dir` field. If set, read and write artifacts there instead of the project root. If not set, default to the project root.
+
+ Read existing artifacts:
+ - `assumptions.md` — prioritized assumptions (required)
+ - `users.md` — user personas (optional, used for interview targeting)
+ - `problem.md` — problem statement (optional, for context)
+
+ At minimum, `assumptions.md` must exist. If it's missing, tell the user to run `/productkit.assumptions` first.
+
+ ### Check for raw validation data
+
+ Look for a `validation-data/` directory in the artifact directory (or project root if no artifact_dir is set). If it exists, read the files inside:
+
+ - **`interviews.csv`** — interview responses. Columns: Participant, Question, Response, Notes.
+ - **`survey-responses.csv`** — survey results. Columns are the survey questions generated on the first run.
+ - **`desk-research.csv`** — desk research findings. Columns: Assumption, Source, Finding, URL, Date.
+ - **`.md` or `.txt` files** — free-form interview transcripts or notes. Read each one.
+ - **Any other files** — note their presence but flag that you can only analyze text-based formats.
+
+ If `validation-data/` contains filled-in files, these are the **primary source of evidence**. Analyze them directly rather than relying on the PM's summary. If the directory doesn't exist or is empty, proceed with the normal flow (ask the PM for evidence or generate validation materials).
+
+ **Privacy note:** Interview data may contain personally identifiable information. Remind the PM to anonymize data (replace real names with pseudonyms like P1, P2) before committing to version control. Suggest adding `validation-data/` to `.gitignore` if the data is sensitive.
+
+ ## Process
+
+ 1. **Review assumptions** — Read `assumptions.md` and list the Critical and Important assumptions. Present them to the user.
+ 2. **Triage each assumption** — For each high-risk assumption, ask: "Do you already have evidence for or against this?" If yes, capture it and assess whether it validates, partially validates, or invalidates the assumption. If no, flag it for validation.
+ 3. **Generate interview script** — For assumptions that need qualitative validation, write an interview script targeting the relevant user persona from `users.md`. Group questions by assumption. Include warm-up and closing sections.
+ 4. **Generate survey questions** — For assumptions that can be tested quantitatively, write survey questions in formats ready for Typeform/Google Forms (Likert scale, multiple choice, open text). Tag each question with the assumption it tests.
+ 5. **Generate data collection templates** — Create the `validation-data/` directory and write CSV templates (see the sketch after this list):
+ - **`validation-data/interviews.csv`** — Pre-filled with the interview questions from the script. Columns: `Participant`, `Question`, `Response`, `Notes`. Each row has a question pre-populated; the PM fills in responses for each participant.
+ - **`validation-data/survey-responses.csv`** — Columns are the survey questions generated in step 4. Each row will be one respondent's answers. First row is headers only — the PM pastes in exported survey data or fills in manually.
+ - **`validation-data/desk-research.csv`** — Pre-filled with one row per assumption that needs desk research. Columns: `Assumption`, `Source`, `Finding`, `URL`, `Date`. The PM fills in what they find.
+ 6. **Summarize status** — Present a clear picture: what's validated, what's invalidated, what still needs fieldwork.
+ 7. **Finalize** — Write the validation artifact and data collection templates after user approval. Tell the PM: "Fill in the CSV files in `validation-data/` as you collect data, then run `/productkit.validate` again for me to analyze your findings."
+
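
The templates in step 5 are plain CSV files. A minimal seeding sketch (hypothetical; the actual question text comes from steps 3 and 4, and Claude writes the files directly):

```js
// Hypothetical sketch of the validation-data/ templates described in step 5.
const fs = require('fs');
const path = require('path');

const dir = 'validation-data'; // or under artifact_dir if configured
fs.mkdirSync(dir, { recursive: true });

// interviews.csv: one pre-filled question per row (question text comes from step 3);
// the PM duplicates rows per participant and fills in Response/Notes.
fs.writeFileSync(
  path.join(dir, 'interviews.csv'),
  'Participant,Question,Response,Notes\n,"How do you handle [task] today?",,\n'
);

// survey-responses.csv: headers only; the PM pastes exported survey rows beneath.
fs.writeFileSync(
  path.join(dir, 'survey-responses.csv'),
  '"Q1 (tests assumption A)","Q2 (tests assumption B)","Q3 (open text)"\n'
);

// desk-research.csv: one row per assumption that needs desk research.
fs.writeFileSync(path.join(dir, 'desk-research.csv'), 'Assumption,Source,Finding,URL,Date\n');
```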
+ ## Conversation Style
+
+ - Be rigorous — "I think users want this" is not evidence. Push for specifics.
+ - Accept diverse evidence — user interviews, analytics data, support tickets, competitor research, and domain expertise all count
+ - For invalidated assumptions, flag the downstream impact ("This assumption is in your problem statement — you may need to revisit it")
+ - Keep interview questions open-ended and non-leading
+ - Keep survey questions clear and unambiguous — no double-barreled questions
+ - If all critical assumptions are already validated, celebrate that and generate materials only for remaining gaps
+
+ ## Output
+
+ Write to `validation.md`. Every assumption gets a structured block with an `Evidence` field. For assumptions the PM has already validated, fill in the evidence. For assumptions that still need validation, write `[PENDING]` as the evidence value. This marker is critical — `/productkit.solution` will check for `[PENDING]` markers and block if any exist on critical or important assumptions.
+
+ ```markdown
+ # Validation
+
+ ## Assumptions
+
+ ### Critical
+
+ 1. **[Assumption]**
+ - Priority: Critical
+ - Source: [assumptions.md reference]
+ - Method: [Interview | Survey | Desk research | Domain expertise]
+ - Evidence: [Specific findings — quotes, data, sources] OR [PENDING]
+ - Status: Validated | Partially validated | Invalidated | Needs validation
+
+ 2. **[Assumption]**
+ - Priority: Critical
+ - Source: [assumptions.md reference]
+ - Method: [Method used or suggested]
+ - Evidence: [Specific findings] OR [PENDING]
+ - Status: Validated | Partially validated | Invalidated | Needs validation
+
+ ### Important
+
+ 1. **[Assumption]**
+ - Priority: Important
+ - Source: [assumptions.md reference]
+ - Method: [Method used or suggested]
+ - Evidence: [Specific findings] OR [PENDING]
+ - Status: Validated | Partially validated | Invalidated | Needs validation
+
+ ### Low Risk
+
+ 1. **[Assumption]**
+ - Priority: Low
+ - Source: [assumptions.md reference]
+ - Evidence: [Specific findings] OR [PENDING]
+ - Status: Validated | Needs validation
+
+ ## Interview Script
+
+ ### Target: [User persona from users.md]
+ **Context:** [Brief description of what you're validating]
+
+ **Warm-up (2-3 min)**
+ - [Opening question to build rapport]
+ - [Question about their current workflow/situation]
+
+ **Core Questions (15-20 min)**
+ 1. [Question targeting assumption X]
+ - _Follow-up if yes:_ [Probe deeper]
+ - _Follow-up if no:_ [Explore why]
+ 2. [Question targeting assumption Y]
+ - _Follow-up:_ [Probe deeper]
+
+ **Closing (2-3 min)**
+ - Is there anything about [topic] that I didn't ask about but should have?
+ - Do you know anyone else who deals with [problem] that I could talk to?
+
+ ## Survey Questions
+
+ Ready to paste into Typeform / Google Forms:
+
+ 1. [Question] — Multiple choice: [Option A / Option B / Option C / Other]
+ - _Tests assumption:_ [Which one]
+ 2. [Question] — Scale: 1 (Strongly disagree) to 5 (Strongly agree)
+ - _Tests assumption:_ [Which one]
+ 3. [Question] — Open text
+ - _Tests assumption:_ [Which one]
+
+ ## Next Steps
+ - [What to do with validation results before moving to /productkit.solution]
+ ```
+
+ ### Important: How evidence gets entered and reviewed
+
+ There are two ways evidence enters the system. Raw data files are preferred; manual entry is the fallback.
+
+ **Path A: Raw data files (preferred)**
+
+ The PM drops raw data into `validation-data/`:
+ - Interview transcripts/notes → `.md` or `.txt` files
+ - Survey exports → `.csv` files
+ - Desk research findings → `.md` files with sources
+
+ Then runs `/productkit.validate`. Claude reads the raw files, extracts evidence relevant to each assumption, and updates `validation.md` directly. The PM does not need to fill in evidence manually — Claude does the analysis.
+
+ **Path B: Manual entry (fallback)**
+
+ For evidence that doesn't have a raw file (e.g., a phone call, in-person observation, domain expertise), the PM fills in the `Evidence:` fields directly in `validation.md`, replacing `[PENDING]` with their findings. Then runs `/productkit.validate` for review.
+
+ ---
+
+ **Review mode — when `validation.md` already exists:**
+
+ 1. Read the existing `validation.md`
+ 2. **Check `validation-data/` for raw files.** If files are present:
+ - Read each file and identify which assumptions it provides evidence for
+ - For interview transcripts: extract relevant quotes, count participants, note patterns across interviews
+ - For survey CSVs: calculate response counts, percentages, and distributions for relevant questions (a sketch of this aggregation appears at the end of this file). For large files (100+ rows), summarize key statistics rather than reading every row.
+ - For desk research: extract cited sources, statistics, and findings
+ - Cross-reference findings against each `[PENDING]` assumption
+ - Write the extracted evidence into the `Evidence:` field, citing the source file (e.g., "From interview-03.md: '...'", "Survey data (n=45): 72% responded...")
+ - Present your analysis to the PM for confirmation before finalizing
+ 3. **For manually entered evidence** (no raw file), review the quality:
+ - **Is it specific?** — "Users liked it" is not evidence. Push back: "How many users? What exactly did they say?"
+ - **Does it include the method?** — Interview, survey, desk research, analytics? If not stated, ask.
+ - **Does it include the source/sample?** — How many people? Which report? What dataset? If missing, ask.
+ - **Does it actually test the assumption?** — Evidence about user demographics doesn't validate a usability assumption. Flag mismatches.
+ 4. For evidence that passes review (from raw data or manual entry):
+ - Update the `Status:` field to Validated / Partially validated / Invalidated
+ - For invalidated assumptions, add `- Impact:` noting what needs to change in previous artifacts
+ 5. For manually entered evidence that is too weak or vague:
+ - **Do not update the Status.** Keep it as `Needs validation`.
+ - Reset `Evidence:` back to `[PENDING]`
+ - Explain what's missing and what good evidence would look like for this specific assumption
+ 6. Keep the interview script and survey sections — they may still be useful for remaining `[PENDING]` items
+ 7. When all critical and important assumptions have evidence that passed review (no `[PENDING]` markers), tell the user they're clear to run `/productkit.solution`
+
+ **Evidence quality bar by method:**
+
+ | Method | Minimum evidence required |
+ |--------|--------------------------|
+ | Interview | Number of participants, at least one direct quote or specific observation per assumption |
+ | Survey | Sample size, response rate, key percentages or distributions |
+ | Desk research | Source name, publication date, specific statistic or finding cited |
+ | Analytics | Metric name, time period, actual numbers |
+ | Domain expertise | Specific experience cited (role, years, context), not just "I believe" |
+
+ **Note on `validation-data/` and privacy:**
+ - Remind the PM to anonymize interview transcripts (replace real names with pseudonyms) before committing to git
+ - Suggest adding `validation-data/` to `.gitignore` if the data contains sensitive or personally identifiable information
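
The survey summarization in review-mode step 2 is plain aggregation. A rough Node sketch, assuming comma-separated values with no quoted commas:

```js
// Hypothetical sketch: summarize one survey column from survey-responses.csv.
const fs = require('fs');

const [header, ...rows] = fs
  .readFileSync('validation-data/survey-responses.csv', 'utf8')
  .trim()
  .split('\n')
  .map((line) => line.split(',')); // naive split: assumes no quoted commas

const col = 0; // index of the question to summarize
const counts = {};
for (const row of rows) counts[row[col]] = (counts[row[col]] || 0) + 1;

console.log('Question:', header[col]);
for (const [answer, count] of Object.entries(counts)) {
  console.log(`${answer}: ${count}/${rows.length} (${Math.round((100 * count) / rows.length)}%)`);
}
```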