@anthropologies/claudestory 0.1.41 → 0.1.43

@@ -0,0 +1,65 @@
1
+ # Autonomous & Tiered Modes
2
+
3
+ This file is referenced from SKILL.md for `/story auto`, `/story review`, `/story plan`, and `/story guided` commands.
4
+
5
+ ## Autonomous Mode
6
+
7
+ `/story auto` starts an autonomous coding session. The guide picks tickets, plans, reviews, implements, and commits -- looping until all tickets are done or the session limit is reached.
8
+
9
+ **How it works:**
10
+
11
+ 1. Call `claudestory_autonomous_guide` with `{ "sessionId": null, "action": "start" }`
12
+ 2. The guide returns an instruction with ticket candidates and exact JSON for the next call
13
+ 3. Follow every instruction exactly. Call the guide back after each step.
14
+ 4. The guide advances through: PICK_TICKET -> PLAN -> PLAN_REVIEW -> IMPLEMENT -> CODE_REVIEW -> FINALIZE -> COMPLETE -> loop
15
+ 5. Continue until the guide returns SESSION_END
16
+
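The loop in steps 1-5 can be sketched as below. This is a minimal illustration, not the package's implementation: `call_guide` stands in for however the `claudestory_autonomous_guide` tool is invoked, `execute` stands in for carrying out an instruction, and the response fields `state` and `sessionId` plus the `report` payload key are assumptions made for the example.

```python
from typing import Any, Callable

# Hypothetical signature: takes the JSON payload, returns the guide's response.
Guide = Callable[[dict[str, Any]], dict[str, Any]]


def run_session(call_guide: Guide, execute: Callable[[dict[str, Any]], Any]) -> list[str]:
    """Drive the guide until it returns SESSION_END; return the states seen."""
    response = call_guide({"sessionId": None, "action": "start"})
    visited = [response["state"]]  # "state" is an assumed response field
    while response["state"] != "SESSION_END":
        result = execute(response)  # follow the instruction exactly as given
        response = call_guide({
            "sessionId": response["sessionId"],  # assumed response field
            "action": "report",
            "report": result,  # hypothetical key for the step results
        })
        visited.append(response["state"])
    return visited
```

The loop never pauses between tickets: every completed step goes straight back to the guide, which matches the rule that the guide, not the caller, decides what happens next.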
17
+ **Critical rules for autonomous mode:**
18
+ - Do NOT use Claude Code's plan mode -- write plans as markdown files
19
+ - Do NOT ask the user for confirmation or approval
20
+ - Do NOT stop or summarize between tickets -- call the guide IMMEDIATELY
21
+ - Follow the guide's instructions exactly -- it specifies which tools to call and what parameters to use
22
+ - After each step completes, call `claudestory_autonomous_guide` with `action: "report"` and the results
23
+
24
+ **If the guide says to compact:** Call `claudestory_autonomous_guide` with `action: "pre_compact"`, then run `/compact`, then call with `action: "resume"`.
25
+
26
+ **If something goes wrong:** Call `claudestory_autonomous_guide` with `action: "cancel"` to cleanly end the session.
27
+
28
+ ## Tiered Access -- Review, Plan, Guided Modes
29
+
30
+ The autonomous guide supports four execution tiers. Same guide, same handlers, different entry/exit points.
31
+
32
+ ### `/story review T-XXX`
33
+
34
+ "I wrote code for T-XXX, review it." Enters at CODE_REVIEW, loops review rounds, exits on approval.
35
+
36
+ 1. Call `claudestory_autonomous_guide` with `{ "sessionId": null, "action": "start", "mode": "review", "ticketId": "T-XXX" }`
37
+ 2. The guide enters CODE_REVIEW -- follow its diff capture and review instructions
38
+ 3. On approve: session ends automatically. On revise/reject: fix code, re-review
39
+ 4. After approval, you can proceed to commit -- the guide does NOT auto-commit in review mode
40
+
41
+ **Note:** Review mode relaxes git constraints -- a dirty working tree is allowed because the user already has code ready for review.
42
+
43
+ ### `/story plan T-XXX`
44
+
45
+ "Help me plan T-XXX." Enters at PLAN, runs PLAN_REVIEW rounds, exits on approval.
46
+
47
+ 1. Call `claudestory_autonomous_guide` with `{ "sessionId": null, "action": "start", "mode": "plan", "ticketId": "T-XXX" }`
48
+ 2. The guide enters PLAN -- write the implementation plan as a markdown file
49
+ 3. On plan review approve: session ends automatically. On revise/reject: revise plan, re-review
50
+ 4. The approved plan is saved in `.story/sessions/<id>/plan.md`
51
+
52
+ ### `/story guided T-XXX`
53
+
54
+ "Do T-XXX end to end with review." Full pipeline for a single ticket: PLAN -> PLAN_REVIEW -> IMPLEMENT -> CODE_REVIEW -> FINALIZE -> COMPLETE -> HANDOVER -> SESSION_END.
55
+
56
+ 1. Call `claudestory_autonomous_guide` with `{ "sessionId": null, "action": "start", "mode": "guided", "ticketId": "T-XXX" }`
57
+ 2. Follow every instruction exactly, calling the guide back after each step
58
+ 3. Session ends automatically after the single ticket is complete
59
+
60
+ **Guided vs Auto:** Guided mode forces `maxTicketsPerSession: 1` and exits after the ticket. Auto mode loops until all tickets are done or the session limit is reached.
61
+
62
+ ### All tiered modes:
63
+ - Require a `ticketId` -- no ad-hoc review without a ticket in V1
64
+ - Use the same review process as auto mode (same backends, same adaptive depth)
65
+ - Can be cancelled with `action: "cancel"` at any point
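
The start payloads for the four tiers differ only in `mode` and `ticketId`. A small sketch of building them follows; the function name and the `"auto"` argument value are hypothetical conveniences for this example, while the JSON keys are the ones shown in the steps above.

```python
import json
from typing import Optional

TIERED_MODES = {"review", "plan", "guided"}


def build_start_payload(mode: str = "auto", ticket_id: Optional[str] = None) -> dict:
    """Return the start payload for a tier; tiered modes must name a ticket."""
    payload = {"sessionId": None, "action": "start"}
    if mode == "auto":  # "auto" is this sketch's label; the payload has no mode key
        return payload
    if mode not in TIERED_MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not ticket_id:
        raise ValueError(f"{mode} mode requires a ticketId")
    payload["mode"] = mode
    payload["ticketId"] = ticket_id
    return payload


print(json.dumps(build_start_payload("review", "T-012")))
# {"sessionId": null, "action": "start", "mode": "review", "ticketId": "T-012"}
```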
@@ -0,0 +1,581 @@
1
+ # Setup Flow -- AI-Assisted Project Initialization
2
+
3
+ This file is referenced from SKILL.md when no `.story/` directory exists but project indicators are present. SKILL.md has already determined that setup is needed before routing here.
4
+
5
+ **If arriving from Step 2b (scaffold detection):** The project already has an empty `.story/` scaffold but no tickets. Skip 1a and start at **1b. Existing Project -- Analyze**.
6
+
7
+ ## Design Rules
8
+
9
+ These rules govern the entire setup flow. Follow them at every gate.
10
+
11
+ 1. **Every "Not sure" recommendation must reference at least one prior answer.** Example: "Monolith -- you described a billing tool for solo contractors, keep it simple." This builds trust and teaches why the choice matters.
12
+ 2. **Three-strike acceleration:** If the user picks "Not sure" at 3 out of any 4 gates, shift to: "Based on what you've told me, I'll recommend the rest. You can review everything in the proposal." Generate recommendations for the remaining gates and surface them in the proposal as **editable assumptions** (clearly marked, not presented as certainties). **Exception:** auth model, sensitive domain, and primary AI pattern are never silently collapsed -- always ask these even in acceleration mode.
13
+ 3. **Never three bounded-choice gates in a row without a break.** Breaks are: free-text questions, visible mini-summaries, or output. Insert a summary after each cluster of gates.
14
+ 4. **Infer domain concerns from descriptions and entities, don't add gates.** If someone describes a collaborative whiteboard, detect realtime and generate websocket/presence tickets. If they describe a SaaS with pricing tiers, generate billing/subscription tickets. The interview captures *what*; the ticket generator knows *what that implies*.
15
+ 5. **Unrecognized types (Other):** ask architecture + deployment, skip domain-specific gates unless free-text mentions data or users.
16
+ 6. **Deployment recommendations are contextual, not stack-hardcoded.** Long-running processes, websockets, compliance, and self-hosting preferences override the stack-based default.
17
+ 7. **Claude as recommended LLM is a product-opinionated choice.** claudestory is built on Claude. State this explicitly: "Claude is the product default for the claudestory ecosystem. Choose based on your needs."
18
+ 8. **Phrase gates as outcome decisions, not system jargon.** "How do users log in?" not "Auth model?" "What data does this system store?" not "Data layer?" "How should this go live?" not "Deployment target?"
19
+
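Rule 2 can be read as a sliding-window check. The sketch below assumes "3 out of any 4 gates" means any run of four consecutive gate answers, which is one interpretation rather than something the rule states; the gate labels are also illustrative.

```python
from collections import deque

# Gates that rule 2 says are always asked, even in acceleration mode.
NEVER_COLLAPSED = {"auth model", "sensitive domain", "primary AI pattern"}


def should_accelerate(answers: list[str], window: int = 4, strikes: int = 3) -> bool:
    """True once `strikes` of the last `window` answers were "Not sure"."""
    recent: deque[bool] = deque(maxlen=window)
    for answer in answers:
        recent.append(answer == "Not sure")
        if sum(recent) >= strikes:
            return True
    return False


def gates_to_recommend(remaining: list[str]) -> list[str]:
    """Remaining gates that may be auto-recommended instead of asked."""
    return [gate for gate in remaining if gate not in NEVER_COLLAPSED]
```

Anything returned by `gates_to_recommend` would then appear in the proposal as an editable assumption rather than a settled decision.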
20
+ ## AI-Assisted Setup Flow
21
+
22
+ This flow creates a meaningful `.story/` project instead of empty scaffolding. Claude analyzes the project, proposes structure, and creates everything via MCP tools.
23
+
24
+ #### 1a. Detect Project Type
25
+
26
+ Check for project indicators to determine if this is an **existing project** or a **new/empty project**:
27
+
28
+ - `package.json` -> npm/node (read `name`, `description`, check for `typescript` dep)
29
+ - `Cargo.toml` -> Rust
30
+ - `go.mod` -> Go
31
+ - `pyproject.toml` / `requirements.txt` -> Python
32
+ - `*.xcodeproj` / `Package.swift` -> Swift/macOS
33
+ - `*.sln` / `*.csproj` -> C#/.NET
34
+ - `Gemfile` -> Ruby
35
+ - `build.gradle.kts` / `build.gradle` -> Android/Kotlin/Java (or Spring Boot)
36
+ - `pubspec.yaml` -> Flutter/Dart
37
+ - `angular.json` -> Angular
38
+ - `svelte.config.js` -> SvelteKit
39
+ - `.git/` -> has version history
40
+
41
+ If none found (empty or near-empty directory) -> skip to **1c. New Project -- Interview**.
42
+
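A minimal sketch of the indicator check, assuming detection only looks at the project root. The function name and return shape are hypothetical; the file patterns and stack labels come from the list above.

```python
from pathlib import Path

# (glob pattern, stack label) pairs taken from the indicator list.
INDICATORS = [
    ("package.json", "npm/node"),
    ("Cargo.toml", "Rust"),
    ("go.mod", "Go"),
    ("pyproject.toml", "Python"),
    ("requirements.txt", "Python"),
    ("*.xcodeproj", "Swift/macOS"),
    ("Package.swift", "Swift/macOS"),
    ("*.sln", "C#/.NET"),
    ("*.csproj", "C#/.NET"),
    ("Gemfile", "Ruby"),
    ("build.gradle.kts", "Android/Kotlin/Java"),
    ("build.gradle", "Android/Kotlin/Java"),
    ("pubspec.yaml", "Flutter/Dart"),
    ("angular.json", "Angular"),
    ("svelte.config.js", "SvelteKit"),
]


def detect_project(root: Path) -> dict:
    """Report detected stacks and which setup step to route to."""
    stacks: list[str] = []
    for pattern, stack in INDICATORS:
        if any(root.glob(pattern)) and stack not in stacks:
            stacks.append(stack)
    has_git = (root / ".git").is_dir()
    # Any indicator, including version history, routes to 1b; none routes to 1c.
    next_step = "1b" if stacks or has_git else "1c"
    return {"stacks": stacks, "has_git": has_git, "next_step": next_step}
```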
43
+ #### 1b. Existing Project -- Analyze
44
+
45
+ Before diving into analysis, briefly introduce claudestory to the user:
46
+
47
+ "Claude Story tracks your project's roadmap, tickets, issues, and session handovers in a `.story/` directory. Every Claude Code session starts by reading this context, so you never re-explain your project from scratch. Sessions build on each other: decisions, blockers, and lessons carry forward automatically. I'll analyze your project and propose a structure. You can adjust everything before I create anything."
48
+
49
+ Keep it to 3-4 sentences. Not a sales pitch, just enough that the user knows what they're opting into and that they're in control.
50
+
51
+ Read these files to understand the project (skip any that don't exist, and skip files larger than 50 KB):
52
+
53
+ 1. **README.md** -- project description, goals, feature list, roadmap/TODO sections
54
+ 2. **Package manifest** -- project name, dependencies, scripts
55
+ 3. **CLAUDE.md** -- existing project spec (if any)
56
+ 4. **Top-level directory listing** -- identify major components (src/, test/, docs/, etc.)
57
+ 5. **Git summary** -- `git log --oneline -20` for recent work patterns
58
+ 6. **GitHub issues (ask user first)** -- `gh issue list --limit 30 --state open --json number,title,labels,body,createdAt`. If gh fails (auth, rate limit, no remote), skip cleanly and note "GitHub import skipped: [reason]"
59
+ 7. **Project brief / PRD scan** -- glob for `*.md` files in project root and `docs/`. For each candidate (exclude CHANGELOG, LICENSE, CONTRIBUTING, and README, which is already read above):
60
+ - If file is >100 lines and contains headings matching "entities", "schema", "architecture", "tech stack", "roadmap", "phases", "milestones", "screens", or "API" -- treat as a project brief
61
+ - Read at most 2 candidate briefs (prefer the longest matching file)
62
+ - Extract into structured notes for use in later steps: entity schemas (names, fields, relationships), technical decisions (stack choices, architecture), screen/page inventory, business rules and domain logic, key constraints
63
+ - Summarize once here; do not re-read the full brief at later steps
64
+
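The brief scan in item 7 could look roughly like this. It is a sketch under assumptions: headings are taken to be ATX-style (`#` prefixed), keywords are matched as whole words, and "longest" is measured in lines; none of these details are specified above.

```python
import re
from pathlib import Path

EXCLUDED = {"CHANGELOG", "LICENSE", "CONTRIBUTING", "README"}
HEADING = re.compile(r"^#{1,6}\s+(.*)$", re.MULTILINE)
KEYWORD = re.compile(
    r"\b(entities|schema|architecture|tech stack|roadmap|phases|milestones|screens|api)\b",
    re.IGNORECASE,
)


def is_brief(text: str) -> bool:
    """More than 100 lines and at least one heading naming a brief-like topic."""
    if len(text.splitlines()) <= 100:
        return False
    return any(KEYWORD.search(heading) for heading in HEADING.findall(text))


def find_briefs(root: Path, limit: int = 2) -> list[Path]:
    """Return at most `limit` brief candidates, longest first."""
    candidates: list[tuple[int, Path]] = []
    for folder in (root, root / "docs"):
        if not folder.is_dir():
            continue
        for path in sorted(folder.glob("*.md")):
            if path.stem.upper() in EXCLUDED:
                continue
            text = path.read_text(encoding="utf-8", errors="replace")
            if is_brief(text):
                candidates.append((len(text.splitlines()), path))
    candidates.sort(key=lambda item: item[0], reverse=True)
    return [path for _, path in candidates[:limit]]
```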
65
+ **Brief precedence:** If multiple sources describe the project:
66
+ - Existing `CLAUDE.md` is the authority for current project state
67
+ - A PRD/brief file is the authority for proposed scope and specifications
68
+ - README is a product overview (may be outdated or aspirational)
69
+ - If two briefs disagree on stack, entities, or milestones, ask the user to choose
70
+
71
+ **Framework-specific deep scan** -- after detecting the project type in 1a, scan deeper into framework conventions to understand architecture:
72
+
73
+ - **Next.js / Nuxt:** Check `app/` vs `pages/` routing, scan `app/api/` or `pages/api/` for API routes, read `next.config.*` / `nuxt.config.*`, check for middleware.
74
+ - **Express / Fastify / Koa:** Scan for route files (`routes/`, `src/routes/`), look for `router.get/post` patterns, identify service/controller layers.
75
+ - **NestJS:** Read `nest-cli.json`, scan `src/` for `*.module.ts`, check for controllers and services.
76
+ - **React (CRA / Vite) / Vue / Svelte:** Check `src/components/` structure, look for state management imports (redux, zustand, pinia), identify routing setup.
77
+ - **Angular:** Read `angular.json`, scan `src/app/` for modules and components, check for services and guards.
78
+ - **Django / FastAPI / Flask:** Check for `manage.py`, scan for app directories or router files, look at models and migrations.
79
+ - **Spring Boot:** Check `pom.xml` or `build.gradle` for Spring deps, scan `src/main/java` for controller/service/repository layers.
80
+ - **Rust:** Check `Cargo.toml` for workspace members, scan for `mod.rs` / `lib.rs` structure, identify crate types.
81
+ - **Swift / Xcode:** Check `.xcodeproj` or `Package.swift`, identify SwiftUI vs UIKit, scan for targets.
82
+ - **Android (Kotlin/Java):** Check `build.gradle.kts`, scan `app/src/main/` for activity/fragment/composable structure, check `AndroidManifest.xml`, identify Compose vs XML layouts.
83
+ - **Flutter / Dart:** Check `pubspec.yaml`, scan `lib/` for feature folders (models/, screens/, widgets/, services/), check for state management imports (provider, riverpod, bloc).
84
+ - **Go:** Check `go.mod`, scan for `cmd/` and `internal/`/`pkg/`, check for `Makefile`.
85
+ - **Monorepo:** If `packages/`, `apps/`, or workspace config detected, list each package with its purpose before proposing phases.
86
+ - **Other:** Scan `src/` two levels deep and identify dominant patterns (MVC, service layers, feature folders).
87
+
88
+ When the detected stack has common architectural variants (e.g., App Router vs Pages Router, Expo vs bare), use AskUserQuestion to confirm instead of guessing. Only ask when the choice changes ticket topology. Otherwise infer silently.
89
+
90
+ **Derive project metadata:**
91
+ - **name**: from package manifest `name` field, or directory name
92
+ - **type**: from package manager (npm, cargo, pip, etc.)
93
+ - **language**: from file extensions and manifest
94
+
95
+ **Assess project stage** from the data -- don't use fixed thresholds. A project with 3 commits and a half-written README is greenfield. A project with 500+ commits, test suites, and release tags is mature. A project with 200 commits and active PRs is active development. Use your judgment.
96
+
97
+ **Propose 3-7 phases** reflecting the project's actual development trajectory. Examples:
98
+ - Library: setup -> core-api -> documentation -> testing -> publishing
99
+ - App: mvp -> auth -> data-layer -> ui-polish -> deployment
100
+ - Mid-development project: capture completed work as early phases, then plan forward
101
+
102
+ **Propose initial tickets** per active phase (2-5 each), based on:
103
+ - README TODOs or roadmap sections (treat as hints, not ground truth)
104
+ - GitHub issues if imported -- infer from label semantics: bug/defect labels -> issues, enhancement/feature labels -> tickets
105
+ - Brief entity specs and roadmap sections (if a brief was found)
106
+ - Obvious gaps (missing tests, no CI, no docs, etc.)
107
+ - If more than 30 GitHub issues exist, note "Showing 30 of N. Additional issues can be imported later."
108
+
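For the GitHub import, the label rule and the 30-issue note can be sketched as follows. The `"unclassified"` fallback and the choice to let bug labels win over feature labels are assumptions for the example; the source only gives the two mappings.

```python
from typing import Iterable

ISSUE_LABELS = {"bug", "defect"}
TICKET_LABELS = {"enhancement", "feature"}
IMPORT_LIMIT = 30  # matches `--limit 30` in the gh command


def classify(labels: Iterable[str]) -> str:
    """Map a GitHub issue's label names to a claudestory item kind."""
    names = {label.lower() for label in labels}
    if names & ISSUE_LABELS:
        return "issue"
    if names & TICKET_LABELS:
        return "ticket"
    return "unclassified"  # assumed fallback: decide from title and body


def import_note(total_open: int) -> str:
    """Note to show when more issues exist than were imported."""
    if total_open > IMPORT_LIMIT:
        return f"Showing {IMPORT_LIMIT} of {total_open}. Additional issues can be imported later."
    return ""
```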
109
+ **Important:** Only mark phases complete if explicitly confirmed by user or docs -- do NOT infer completion from git history alone.
110
+
111
+ After analysis, skip to **1d. Present Proposal**.
112
+
113
+ #### 1c. New Project -- Interview
114
+
115
+ This is a guided funnel using a mix of free text and structured choices via the `AskUserQuestion` tool. The flow adapts to the user's confidence level and project complexity.
116
+
117
+ **--- Cluster 1: Identity ---**
118
+
119
+ **Step 1:** Ask the user: "What are you building?" (free text -- project name and purpose)
120
+
121
+ Parse the answer. If the user already named a stack ("billing app in Next.js"), confirm it and skip surface + stack questions. Do NOT skip characteristics -- stack doesn't tell us if it's AI-powered, realtime, or marketplace.
122
+
123
+ **Step 2a:** Use `AskUserQuestion`:
124
+ - question: "What's the primary surface?"
125
+ - header: "Surface"
126
+ - options:
127
+ - "Web app" -- SaaS, dashboard, admin panel
128
+ - "Mobile app" -- iOS, Android, cross-platform
129
+ - "API / backend service" -- REST, GraphQL, microservice
130
+ - "Website / content site" -- landing page, blog, docs
131
+ - (Other always available: desktop, CLI, library, package, etc. If Other, ask one free-text follow-up for primary delivery target.)
132
+
133
+ **Step 2b:** Use `AskUserQuestion`:
134
+ - question: "Any special characteristics?"
135
+ - header: "Traits"
136
+ - multiSelect: true
137
+ - options:
138
+ - "AI / LLM powered" -- chatbot, RAG, agent, AI features
139
+ - "Realtime / collaboration" -- live updates, websockets, multiplayer
140
+ - "Marketplace / multi-role" -- multiple user types, separate views
141
+ - "Content-heavy / CMS" -- admin panel, editorial workflow
142
+ - (None of these always available)
143
+
144
+ Surface drives stack recommendations. Characteristics drive additional gates and inferred tickets. The two compose cleanly: AI health app = Web + AI. Marketplace = Web + Marketplace. The model scales by adding characteristics, not by adding conflicting top-level types.
145
+
146
+ **Simple project fast path:** If surface is Website/content + no AI/realtime/marketplace traits + no brief detected, offer early exit: "This looks like a straightforward site. Want me to skip the detailed questions and go straight to milestones?" If yes: default to Astro + Vercel/Netlify, no auth, no data model. Skip clusters 2-4 entirely. Defaults shown in proposal as editable assumptions. Portfolio user flow: name -> Website -> None -> "Skip?" -> milestones -> proposal. ~4 steps total.
147
+
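The fast-path condition is a three-part predicate. A sketch, using the option labels from Steps 2a and 2b as string values (the function and constant names are hypothetical):

```python
BLOCKING_TRAITS = {
    "AI / LLM powered",
    "Realtime / collaboration",
    "Marketplace / multi-role",
}

# Defaults applied on early exit; shown in the proposal as editable assumptions.
FAST_PATH_DEFAULTS = {
    "stack": "Astro",
    "deploy": "Vercel/Netlify",
    "auth": None,
    "data_model": None,
}


def offers_fast_path(surface: str, traits: list[str], brief_detected: bool) -> bool:
    """True when the early-exit question should be offered."""
    return (
        surface == "Website / content site"
        and not (set(traits) & BLOCKING_TRAITS)
        and not brief_detected
    )
```

Note that "Content-heavy / CMS" does not block the fast path here, because the condition above names only the AI, realtime, and marketplace traits.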
148
+ **--- Cluster 2: Stack + design ---**
149
+
150
+ **Step 3:** Use `AskUserQuestion` with top 3-4 stacks from the **Default Stack Recommendations** appendix at the bottom of this file. Context-aware: characteristics influence ranking. AI + Web -> Next.js + Vercel AI SDK rises to top. Skip if stack was already confirmed in Step 1.
151
+
152
+ **Step 4:** Framework-specific `AskUserQuestion` only when the choice changes ticket topology:
153
+ - Next.js: App Router (Recommended) vs Pages Router
154
+ - React Native: Expo (Recommended) vs bare
155
+ - ORM: Drizzle (SQL-friendly, lightweight) vs Prisma (higher-level DX, generated types)
156
+ - Other framework-specific choices from the appendix
157
+
158
+ **--- Summary break ---**
159
+ Show: "So far: [name] is a [surface] + [traits] built with [stack]."
160
+
161
+ **Step 4a:** Design source (skip for APIs, CLIs, libraries, backends). Use `AskUserQuestion`:
162
+ - question: "Do you have designs?"
163
+ - header: "Design"
164
+ - options:
165
+ - "Yes, mockups / Figma" -- UI tickets reference designs. Usually skip component library question (implied by mockups, though user can override).
166
+ - "Rough idea / sketches" -- UI tickets include design decisions
167
+ - "No, start from scratch" -- generate design foundation tickets (color palette, typography, layout, component selection)
168
+ - "Not sure yet"
169
+
170
+ If NOT "Yes, mockups / Figma" and project has a UI surface, follow up with component library choice:
171
+ - `AskUserQuestion`: "Component library?"
172
+ - options: shadcn/ui (Recommended for Next.js), Material UI, Chakra UI, None/custom
173
+
174
+ **--- Cluster 3: System structure ---**
175
+
176
+ **Step 4b:** System shape (skip for static sites, CLIs, libs). Use `AskUserQuestion`:
177
+ - question: "How should the system be structured?"
178
+ - header: "Shape"
179
+ - options:
180
+ - "One app does everything" -- monolith, single deployable. Recommended for solo/small projects.
181
+ - "Frontend + backend separately" -- separate frontend and API
182
+ - "Frontend + managed backend (Supabase/Firebase)" -- BaaS handles DB, auth, storage. You build the frontend. Fastest path to MVP.
183
+ - "Not sure -- recommend one"
184
+
185
+ BaaS is a first-class path. If selected: skip ORM choice (BaaS handles it), skip auth gate (BaaS handles it), adjust deployment to match (Vercel + Supabase, or Firebase Hosting + Firebase). This path generates different tickets: BaaS setup, client SDK integration, and security rules, instead of custom API + auth tickets.
186
+
187
+ **BaaS + AI edge case:** If the user also selected "AI / LLM powered" as a characteristic, AI gates still fire normally. For document ownership tickets (RAG + auth), reference "Supabase RLS policies" or "Firebase Security Rules" instead of "custom row-level access." The AI cluster's data model tickets adapt to the BaaS context.
188
+
189
+ **Step 4b-ii:** Execution model (skip for static sites, CLIs, libs, content sites, BaaS projects). Use `AskUserQuestion`:
190
+ - question: "How does processing work?"
191
+ - header: "Processing"
192
+ - options:
193
+ - "Users request, system responds" -- standard web flow
194
+ - "Background processing needed" -- queues, workers, scheduled tasks
195
+ - "Both" -- user-facing + background processing
196
+ - "Not sure -- recommend one"
197
+
198
+ Note: realtime/event-driven is inferred from the "Realtime" characteristic (design rule 4), not asked as an option here.
199
+
200
+ **Step 4b2:** Deployment (skip for libraries, CLIs, packages). Use `AskUserQuestion`:
201
+ - question: "How should this go live?"
202
+ - header: "Deploy"
203
+ - options:
204
+ - "Easiest path" -- Vercel, Netlify, Railway, Fly.io. Connect repo, push to deploy. Minimal infrastructure work.
205
+ - "Full control (AWS/GCP/Azure)" -- you manage everything. More setup: infrastructure-as-code, containers, CI/CD pipelines.
206
+ - "Self-hosted / own servers" -- Docker Compose, nginx. You manage the hardware and networking.
207
+ - "Not sure -- recommend one"
208
+
209
+ Recommendations are contextual: consider prior answers about compliance, long-running processes, websockets, self-hosting preferences -- not just stack.
210
+
211
+ **--- Summary break ---**
212
+ Show: "[name]: [shape], deploying to [platform]. Now let me understand the data."
213
+
214
+ **--- Cluster 4: Data + domain + auth ---**
215
+
216
+ **Step 4c:** Data model (skip for clearly stateless: static sites, simple CLIs, no persistence. Also skip for BaaS -- handled by BaaS schema setup.) Use `AskUserQuestion`:
217
+ - question: "What data does this system store?"
218
+ - header: "Data"
219
+ - options:
220
+ - "I know the main things" -> follow up with free text: "What are the main objects and how they relate? (e.g., users have many projects, invoices belong to projects, projects have status workflows)"
221
+ - "Help me figure it out" -> Claude proposes entities from brief/interview answers
222
+ - "Keep it simple for now" -- start minimal, add later
223
+ - "Nothing -- no database needed"
224
+
225
+ **Step 4d:** Domain complexity (skip for static sites, CLIs, libs, "no database", "keep it simple", BaaS). Use `AskUserQuestion`:
226
+ - question: "What kind of rules does this system have?"
227
+ - header: "Rules"
228
+ - multiSelect: true
229
+ - options:
230
+ - "Workflows / approvals" -- things move through stages, need approval, have status transitions
231
+ - "Multiple organizations / teams" -- different groups see different data, separate access
232
+ - "None of the above" -- straightforward data, no special rules (exclusive: if selected, deselects the others)
233
+ - "Not sure -- recommend one"
234
+
235
+ Multi-select because these are orthogonal: a system can have both workflows AND org scoping. "None of the above" is the exclusive fallback for simple CRUD projects.
236
+
237
+ **Step 4e:** Auth / identity (skip for static sites, CLIs, libs, packages. Do NOT skip for "no database" -- auth can be external/stateless: JWT, Clerk, Auth0, API keys). Use `AskUserQuestion`:
238
+ - question: "How do users log in?"
239
+ - header: "Auth"
240
+ - options:
241
+ - "No login needed" -- single user or public access
242
+ - "Individual accounts" -- email/password, social login. Easiest setup: Firebase Auth, Clerk, or Supabase Auth.
243
+ - "Team / organization accounts" -- multi-user, org scoping
244
+ - "External auth / SSO" -- enterprise IdP, OAuth providers
245
+ - (Other always available: API keys only, guest+optional, machine clients, etc. + "Not sure -- recommend one")
246
+
247
+ When recommending auth setup for individual accounts, suggest Firebase Auth or Clerk as the easiest options. These handle email/password, social login, session management, and JWT with minimal code. Only recommend custom auth when the user has specific requirements.
248
+
249
+ **Step 4f:** Sensitive domain (skip unless project description or characteristics suggest health, legal, finance, compliance, government, or regulated industry). This is the canonical sensitive domain gate for ALL projects. The AI safety cluster (Cluster 5) references this answer instead of re-asking. Use `AskUserQuestion`:
250
+ - question: "Is this in a sensitive/regulated domain?"
251
+ - header: "Domain"
252
+ - options:
253
+ - "Yes (health, legal, finance, compliance)" -- audit logging, privacy controls, disclaimers, stricter testing, compliance tickets
254
+ - "No"
255
+ - "Not sure"
256
+
257
+ This exists outside the AI branch because a non-AI health billing platform still needs audit trails and compliance tickets.
258
+
259
+ **--- Cluster 5: AI-specific (only when AI characteristic selected in Step 2b) ---**
260
+
261
+ **AI pattern:** Use `AskUserQuestion`:
262
+ - question: "What's the primary AI pattern?"
263
+ - header: "AI Pattern"
264
+ - options:
265
+ - "RAG" -- knowledge base Q&A, document search, domain answers
266
+ - "Agentic / tool use" -- AI that takes actions, calls APIs
267
+ - "Conversational" -- chatbot, assistant, guided interaction
268
+ - "Structured generation" -- extract data, classify, generate reports, transform inputs to JSON
269
+ - (Other + "Not sure -- recommend one")
270
+
271
+ Follow up: "Any secondary capabilities?" (multiSelect, optional). Same options minus the one already picked. This handles real AI products that are RAG + conversational, or agentic + structured generation.
272
+
273
+ **LLM provider:** Use `AskUserQuestion`:
274
+ - question: "Which LLM provider?"
275
+ - header: "LLM"
276
+ - options:
277
+ - "Anthropic Claude (product default)" -- Claude API / Agent SDK. claudestory ecosystem default; choose based on your needs.
278
+ - "OpenAI" -- GPT models
279
+ - "Google Gemini" -- multimodal, good pricing
280
+ - "Self-hosted" -- Qwen, Llama, Mistral via Ollama/vLLM
281
+ - (Other: multi-provider, custom, etc.)
282
+
283
+ **AI processing:** Use `AskUserQuestion`:
284
+ - question: "How does AI processing work?"
285
+ - header: "Processing"
286
+ - options:
287
+ - "Synchronous" -- user sends, waits for response
288
+ - "Async / background" -- ingestion, batch, workers
289
+ - "Both" -- sync chat + async ingestion
290
+ - "Not sure -- recommend one"
291
+
292
+ **Vector database (if RAG primary or secondary):** Use `AskUserQuestion`:
293
+ - question: "Vector database?"
294
+ - header: "Vector DB"
295
+ - options:
296
+ - "pgvector (Recommended)" -- PostgreSQL extension, simple
297
+ - "Pinecone" -- managed, scalable
298
+ - "Qdrant" -- open-source, self-hosted
299
+ - "Not sure -- recommend one"
300
+
301
+ **AI audience + safety:** Use `AskUserQuestion`:
302
+ - question: "Who interacts with the AI output?"
303
+ - header: "Audience"
304
+ - options:
305
+ - "Public users" -- guardrails, content filtering, rate limiting
306
+ - "Internal users" -- lighter guardrails
307
+ - "Backend / pipeline" -- skip safety layer
308
+ - "Not sure -- recommend one"
309
+
310
+ If public or internal, check whether sensitive domain was already answered in Step 4f. If yes, use that answer. If Step 4f was skipped (e.g., AI-only project where sensitive domain wasn't obvious from the description), ask now with `AskUserQuestion`:
311
+ - question: "Is this a sensitive domain?"
312
+ - header: "Domain"
313
+ - options:
314
+ - "Yes (health, legal, finance)" -- audit logging, evals, disclaimers, stricter testing
315
+ - "No"
316
+
317
+ Sensitive domain is orthogonal to audience: an internal health app still needs audit logging.
318
+
319
+ AI gates generate tickets the current flow would miss:
320
+ - LLM client setup (provider SDK, error handling, retries, streaming)
321
+ - Prompt engineering / system prompt design
322
+ - RAG pipeline (if RAG): ingestion, chunking, embedding, retrieval, reranking
323
+ - Secondary AI capabilities as additional tickets
324
+ - Guardrails / safety layer (if user-facing)
325
+ - Evaluation framework (how do you know it's working?)
326
+ - Cost monitoring / token tracking
327
+ - Conversation/session storage
328
+ - Background workers (if async)
329
+ - Document ownership model (if RAG + auth)
330
+ - Audit logging + disclaimers (if sensitive domain)
331
+
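The ticket list above maps onto the Cluster 5 answers fairly directly. A sketch, with hypothetical parameter names and ticket titles shortened from the bullets:

```python
def ai_tickets(
    primary: str,
    secondary: list[str],
    user_facing: bool,
    async_processing: bool,
    has_auth: bool,
    sensitive: bool,
) -> list[str]:
    """Derive AI-specific ticket titles from the interview answers."""
    uses_rag = primary == "RAG" or "RAG" in secondary
    tickets = ["LLM client setup", "Prompt engineering / system prompt design"]
    if uses_rag:
        tickets.append("RAG pipeline")
    # RAG as a secondary capability is already covered by the pipeline ticket.
    tickets += [f"Secondary capability: {cap}" for cap in secondary if cap != "RAG"]
    if user_facing:
        tickets.append("Guardrails / safety layer")
    tickets += [
        "Evaluation framework",
        "Cost monitoring / token tracking",
        "Conversation/session storage",
    ]
    if async_processing:
        tickets.append("Background workers")
    if uses_rag and has_auth:
        tickets.append("Document ownership model")
    if sensitive:
        tickets.append("Audit logging + disclaimers")
    return tickets
```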
332
+ **--- Milestones ---**
333
+
334
+ **Step 5:** "What are the major milestones?" (free text)
335
+
336
+ **Step 6:** "What's the first thing to build?" (free text)
337
+
338
+ Propose phases and initial tickets from all gathered answers.
339
+
340
+ #### 1d. Present Proposal
341
+
342
+ Show the user a structured proposal (table format, not raw JSON):
343
+ - **Project:** name, type, language
344
+ - **System shape + execution model**
345
+ - **Deployment target**
346
+ - **Core entities + key relationships** (if defined)
347
+ - **Domain complexity** (workflows, org scoping, or simple CRUD)
348
+ - **Auth / identity model**
349
+ - **AI pattern + provider + processing** (if AI project)
350
+ - **Any inferred concerns** (realtime, billing, etc. per design rule 4)
351
+ - **Editable assumptions** (if three-strike acceleration was used, clearly marked)
352
+ - **Unresolved decisions**
353
+ - **Phases** (table: id, name, description)
354
+ - **Tickets per phase** (title, type, status)
355
+ - **Issues** (if GitHub import was used)
356
+
357
+ Before asking for approval, briefly explain what they're looking at:
358
+
359
+ "**How this works:** Phases are milestones in your project's development. They track progress from setup to shipping. Tickets are specific work items within each phase. After setup, typing `/story` at the start of any Claude Code session loads this context automatically. Claude will know your project's state, what was done last session, and what to work on next."
360
+
361
+ Then use `AskUserQuestion` for approval:
362
+ - question: "How does this proposal look?"
363
+ - header: "Proposal"
364
+ - options:
365
+ - "Looks good" -- approve and continue
366
+ - "Adjust phases" -- iterate on phase structure
367
+ - "Adjust tickets" -- iterate on ticket details
368
+ - "Start over" -- re-analyze from scratch
369
+
370
+ Re-show this `AskUserQuestion` after adjustments. Loop until "Looks good."
371
+
372
+ #### 1d2. Refinement and Review
373
+
374
+ After the user approves the proposal structure, use `AskUserQuestion` for refinement depth:
375
+ - question: "How much refinement before creating?"
376
+ - header: "Depth"
377
+ - options:
378
+ - "Create as-is" -- skip refinement and review, execute immediately
379
+ - "Refine tickets" -- add descriptions, dependencies, sizing
380
+ - "Refine + independent review (if review tools available)" -- full pipeline
381
+
382
+ If "Create as-is" and no brief exists: warn "Note: tickets will have titles only -- you can add descriptions later."
383
+
384
+ **If "Refine tickets" or "Refine + review":**
385
+
386
+ Refine the proposal. If a brief/PRD was found in step 1b, use those structured notes. If no brief exists (e.g., the user came through step 1c interview), infer descriptions from the interview answers and propose standard dependencies based on the tech stack.
387
+
388
+ **Descriptions:** Extract specs from the brief into ticket descriptions -- entity fields, acceptance criteria, API contracts, business rules. If no brief, write descriptions based on the user's interview answers and common patterns for the chosen stack. Cap each description at 3-4 sentences. Keep them actionable, not exhaustive. The goal is "enough to implement without re-reading the brief."
389
+
390
+ **Dependencies:** Infer `blockedBy` relationships from phase ordering and domain logic:
391
+ - Schema/migration tickets block CRUD API tickets
392
+ - Auth tickets block protected route tickets
393
+ - CRUD/model tickets block business logic that depends on them
394
+ - API tickets block UI tickets that consume them
395
+
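One way to apply these rules mechanically is to tag each ticket with a kind and look up what blocks it. The `kind` tags and the dictionary shape are assumptions for this sketch, and it links every ticket of a blocking kind, whereas the rules above only link tickets that actually depend on each other.

```python
# Which ticket kinds block which, per the four rules above.
# "crud_api" stands in for both the CRUD/model and API tickets.
BLOCKED_BY_KIND = {
    "crud_api": ["schema"],
    "protected_route": ["auth"],
    "business_logic": ["crud_api"],
    "ui": ["crud_api"],
}


def infer_blocked_by(tickets: list[dict]) -> dict[str, list[str]]:
    """Map each ticket id to the ids of tickets that must finish first."""
    ids_by_kind: dict[str, list[str]] = {}
    for ticket in tickets:
        ids_by_kind.setdefault(ticket["kind"], []).append(ticket["id"])
    result = {}
    for ticket in tickets:
        blockers: list[str] = []
        for kind in BLOCKED_BY_KIND.get(ticket["kind"], []):
            blockers.extend(ids_by_kind.get(kind, []))
        result[ticket["id"]] = blockers
    return result
```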
396
+ **Sizing check:** Flag tickets that cover more than one major concern:
397
+ - Mentions 3+ distinct entities in one ticket
398
+ - Covers both API implementation and UI in one ticket
399
+ - Handles 3+ distinct models, modes, or billing types in one ticket
400
+ - Offer to split flagged tickets into sub-tasks
401
+
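The sizing check reduces to three threshold tests. A sketch with hypothetical inputs (how entities and variants are extracted from a ticket is not specified above):

```python
def sizing_flags(
    entities: list[str],
    covers_api: bool,
    covers_ui: bool,
    variants: list[str],
) -> list[str]:
    """Return the reasons a ticket looks too large; empty means it passes."""
    flags = []
    if len(set(entities)) >= 3:
        flags.append("mentions 3+ distinct entities")
    if covers_api and covers_ui:
        flags.append("covers both API implementation and UI")
    if len(set(variants)) >= 3:
        flags.append("handles 3+ distinct models, modes, or billing types")
    return flags
```

Any ticket with a non-empty result would be offered for splitting into sub-tasks.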
402
+ **Missing entity detection:** Cross-reference entities and concepts mentioned in the brief against the proposed ticket list. Flag entities that appear in the brief but have no corresponding ticket. Common misses: user profile/settings, notification system, seed data, admin/config screens.
403
+
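The cross-reference can be sketched as a substring match between entity names from the brief and the text of each proposed ticket. The dictionary shape for tickets is an assumption for this example, and a real pass would likely need looser matching (plurals, synonyms).

```python
from typing import Dict, List


def missing_entities(brief_entities: List[str],
                     tickets: List[Dict[str, str]]) -> List[str]:
    """Return brief entities that no ticket title or description mentions."""
    missing = []
    for entity in brief_entities:
        needle = entity.lower()
        mentioned = any(
            needle in (t["title"] + " " + t.get("description", "")).lower()
            for t in tickets
        )
        if not mentioned:
            missing.append(entity)
    return missing


tickets = [
    {"title": "Invoice CRUD API"},
    {"title": "Customer list UI", "description": "Paginated table of customers."},
]
gaps = missing_entities(["Invoice", "Customer", "Notification"], tickets)
```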
404
+ **Core differentiator detection:** Identify the ticket(s) covering what the brief emphasizes most (the main value proposition). If the core differentiator is a single ticket, flag it for decomposition -- it likely needs 3-4 sub-tickets.
405
+
406
+ **Undecided tech choices:** Surface technology decisions mentioned in the brief as "X or Y" that haven't been resolved. Present them as explicit decisions to make before implementation starts (e.g., "ORM: Drizzle or Prisma -- decide before T-002").
407
+
408
+ After refinement, present the updated proposal showing what changed: added descriptions, new blockedBy links, split tickets, newly created tickets for missing entities, and flagged decisions. Wait for the user to approve the refined proposal before continuing.
409
+
410
+ **If "Refine + review":**
411
+
412
+ After refinement, run an independent review of the full proposal (phases, tickets, descriptions, dependencies):
413
+
414
+ **Backend selection:** Use the same review backend selection as autonomous mode -- if the `review_plan` MCP tool is available, use it (pass the full proposal as the plan document); otherwise spawn an independent Claude agent with the brief + proposal and ask it to audit for gaps, sizing issues, missing dependencies, and architectural concerns. If neither is available, skip review with a note.
415
+
416
+ **Review cap:** Maximum 2 review rounds for setup proposals.
417
+
418
+ **After review findings come back:**
419
+ - Present ALL findings to the user as a summary diff: added tickets, changed descriptions, new dependencies, files to be generated.
420
+ - User approves the final version before any execution. Do not auto-incorporate findings.
421
+ - If the user requests changes based on findings, update the proposal and optionally re-review (within the 2-round review cap).
422
+
423
+ #### 1e. Execute on Approval
424
+
425
+ **Two-pass ticket creation:**
426
+
427
+ 1. Call `claudestory_init` with name, type, language -- after this, all MCP tools become available dynamically
428
+ 2. Call `claudestory_phase_create` for each phase -- first phase with `atStart: true`, subsequent with `after: <previous-phase-id>`
429
+ 3. **Pass 1:** Call `claudestory_ticket_create` for each ticket WITHOUT `blockedBy` (ticket IDs don't exist until after creation)
430
+ 4. Call `claudestory_issue_create` for each imported GitHub issue
431
+ 5. **Pass 2:** Call `claudestory_ticket_update` for each ticket that has `blockedBy` dependencies, now that all IDs exist. Validate: no cycles, no self-references.
432
+ 6. Call `claudestory_ticket_update` to mark already-complete tickets as `complete`
433
+ 7. Call `claudestory_snapshot` to save initial baseline
434
+
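Why two passes are needed, and what the Pass 2 validation could look like, can be sketched as follows. This is a minimal illustration, not the claudestory implementation: the function names and the key-to-ID map are assumptions, and the actual tool calls are left out.

```python
from typing import Dict, List


def resolve_blocked_by(proposal: Dict[str, List[str]],
                       key_to_id: Dict[str, str]) -> Dict[str, List[str]]:
    """Translate proposal keys into the real ticket IDs recorded in Pass 1."""
    return {
        key_to_id[key]: [key_to_id[b] for b in blockers]
        for key, blockers in proposal.items()
        if blockers
    }


def validate_blocked_by(blocked_by: Dict[str, List[str]]) -> List[str]:
    """Report self-references and cycles in a ticket -> blockers mapping."""
    errors = [f"{tid} blocks itself"
              for tid, blockers in blocked_by.items() if tid in blockers]

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: Dict[str, int] = {}

    def visit(node: str, path: List[str]) -> None:
        state[node] = GRAY
        for dep in blocked_by.get(node, []):
            if dep == node:
                continue  # self-reference already reported above
            if state.get(dep, WHITE) == GRAY:
                # dep is on the current path, so the path loops back to it
                cycle = path[path.index(dep):] + [dep]
                errors.append("cycle: " + " -> ".join(cycle))
            elif state.get(dep, WHITE) == WHITE:
                visit(dep, path + [dep])
        state[node] = BLACK

    for tid in blocked_by:
        if state.get(tid, WHITE) == WHITE:
            visit(tid, [tid])
    return errors


# Pass 1 would create the tickets and record which ID each proposal key received.
key_to_id = {"schema": "T-001", "api": "T-002", "ui": "T-003"}
updates = resolve_blocked_by({"schema": [], "api": ["schema"], "ui": ["api"]}, key_to_id)
problems = validate_blocked_by(updates)
```

Running the validation before any update call means a bad dependency graph is caught while the proposal can still be corrected, rather than after some tickets have already been linked.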
435
+ **CLAUDE.md generation:** If a brief/PRD was read in step 1b AND no `CLAUDE.md` exists in the project root, use `AskUserQuestion` to ask about governance files:
436
+ - question: "Write project governance files?"
437
+ - header: "Files"
438
+ - options:
439
+ - "Write both" -- CLAUDE.md and RULES.md
440
+ - "CLAUDE.md only"
441
+ - "RULES.md only"
442
+ - "Skip" -- I'll write them manually
443
+
444
+ **If writing CLAUDE.md**, generate with tiered structure:
445
+
446
+ *Always present:*
447
+ - Project purpose (1-2 sentences)
448
+ - Tech stack and key dependencies (including any pivots from the brief, with rationale)
449
+ - Architecture pattern (shape + execution model) + rationale
450
+ - Testing strategy (TDD when applicable -- see RULES.md generation)
451
+
452
+ *Present when relevant:*
453
+ - Deployment target + hosting model
454
+ - Core entities + key relationships (names + relationships, not full schemas)
455
+ - Domain complexity (workflows, org scoping)
456
+ - Auth / identity model
457
+ - AI pattern + provider + processing model
458
+ - Tenancy model
459
+ - State machines / workflows
460
+
461
+ *Flagged for resolution:*
462
+ - Undecided tech choices (flagged as TBD with options)
463
+
464
+ **Sanitization:** Never copy secrets, tokens, credentials, API keys, connection strings, customer-identifying data, or internal-only endpoints into generated files.
465
+
466
+ Show a preview of the generated content to the user. Only write after explicit approval.
467
+
468
+ **If writing RULES.md**, generate capturing:
469
+ - Domain-specific rules (e.g., "all monetary calculations use fixed-point arithmetic, not floats")
470
+ - API design constraints (versioning, auth requirements, response format)
471
+ - Data integrity rules (soft deletes, audit trails, idempotency requirements)
472
+ - Testing requirements for core business logic
473
+
474
+ **TDD recommendation:** Add when domain complexity includes "Workflows/approvals" or "Multiple organizations/teams", or project has AI evaluation needs, or project has a sensitive domain. This is mechanical -- tied directly to gate answers, no judgment calls:
475
+
476
+ ```
477
+ - TDD for business logic: write tests first for core functional code
478
+ (calculations, validation rules, state machines, data transformations,
479
+ AI evaluation harnesses). Tests define the contract before implementation.
480
+ ```
481
+
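Because the recommendation is mechanical, it reduces to a boolean over the gate answers. A sketch, assuming the gate answers are available as the hypothetical parameters below:

```python
from typing import Iterable

# Domain-complexity answers that trigger the recommendation (from the rule above).
TDD_COMPLEXITY_TRIGGERS = {"Workflows/approvals", "Multiple organizations/teams"}


def recommend_tdd(domain_complexity: Iterable[str],
                  has_ai_evaluation: bool,
                  sensitive_domain: bool) -> bool:
    """Return True when the TDD rule should be added to RULES.md."""
    has_trigger = bool(TDD_COMPLEXITY_TRIGGERS & set(domain_complexity))
    return has_trigger or has_ai_evaluation or sensitive_domain


add_tdd_rule = recommend_tdd(["Workflows/approvals"], False, False)
```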
482
+ Same sanitization and preview rules as CLAUDE.md. Only write after explicit approval.
483
+
484
+ #### 1f. Post-Setup
485
+
486
+ After creation completes:
487
+ - Confirm what was created (e.g., "Created 5 phases, 18 tickets, 3 issues, CLAUDE.md, and RULES.md")
488
+ - Check if `.gitignore` includes `.story/snapshots/` (warn if missing -- snapshots should not be committed)
489
+ - Write an initial handover documenting the setup decisions. Explicitly capture which gates were answered and what was chosen: surface, characteristics, stack, system shape, execution model, deployment, data model, domain complexity, auth model, sensitive domain, AI pattern/provider/processing (if applicable), design source. This handover is the source of truth for decisions; CLAUDE.md is the project description.
490
+ - Setup complete. Continue with **Step 2: Load Context** in SKILL.md (already in your context). Execute all 6 steps -- the project now has data to load.
491
+
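The `.gitignore` check in the list above can be approximated with a literal line match. This sketch does not implement full gitignore semantics (globs, negation, nested ignore files); the function name is hypothetical.

```python
def snapshots_ignored(gitignore_text: str) -> bool:
    """Return True if some line literally ignores .story/snapshots."""
    for raw in gitignore_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and negated patterns.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        # Accept the pattern with or without leading/trailing slashes.
        if line.strip("/") == ".story/snapshots":
            return True
    return False


needs_warning = not snapshots_ignored("node_modules/\n# story\n.story/snapshots/\n")
```

When the check returns `False`, the warning from the list above applies.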
492
+ ---
493
+
494
+ ## Appendix: Default Stack Recommendations
495
+
496
+ Choose based on team familiarity, hosting model, and product shape; these are defaults, not absolutes.
497
+
498
+ ### Web application
499
+ - Next.js + TypeScript (Recommended) -- full-stack React, SSR, API routes, easiest deploy via Vercel. Best for: solo devs, startups, fast shipping.
500
+ - SvelteKit + TypeScript -- lighter, less boilerplate, excellent DX. Best for: minimal framework overhead.
501
+ - Django + Python -- batteries-included, built-in admin, ORM, auth. Best for: data-heavy apps, Python teams.
502
+ - Rails -- convention-over-config. Best for: fastest 0-to-MVP.
503
+
504
+ ### AI / LLM application
505
+ - Next.js + TypeScript + Vercel AI SDK (Recommended for web AI) -- streaming UI, provider-agnostic. Best for: AI products with web interface.
506
+ - Python + FastAPI + provider SDK -- direct integration, no orchestration overhead. Best for: most AI apps. Add LangChain only when multi-step orchestration matters.
507
+ - Python + FastAPI + LlamaIndex -- simpler for pure retrieval. Best for: knowledge base Q&A.
508
+ - TypeScript + Anthropic SDK / Agent SDK -- Claude-native. Best for: agents and tool use.
509
+
510
+ ### BaaS / backendless
511
+ - Supabase (Recommended) -- Postgres, auth, realtime, storage built-in. Best for: fast MVP, solo devs, speed over control.
512
+ - Firebase -- Google ecosystem, NoSQL, good mobile support. Best for: mobile-first, Google Cloud teams.
513
+ - Convex -- reactive database, no REST. Best for: realtime-first apps.
514
+
515
+ ### Website / content site
516
+ - Astro (Recommended) -- zero JS default, islands. Best for: marketing, blogs, docs.
517
+ - Next.js -- when site needs dynamic features.
518
+ - Hugo -- pure static, fast builds. Best for: large content sites.
519
+ - Docusaurus -- documentation sites. React-based.
520
+
521
+ ### Mobile app
522
+ - Flutter + Dart -- cross-platform, consistent UI, single codebase. Best for: mobile-primary projects wanting one tightly controlled UI system.
523
+ - React Native + Expo + TypeScript -- cross-platform, shares logic with web. Best for: teams already using React.
524
+ - Swift + SwiftUI -- iOS/macOS native. Best for: iOS-only, deep OS integration.
525
+ - Kotlin + Compose -- Android native. Best for: Android-only.
526
+
527
+ (Neither Flutter nor React Native is a universal winner. Recommendation depends on team context.)
528
+
529
+ ### Full-stack / multi-service
530
+ - Next.js + NestJS + PostgreSQL -- TypeScript end-to-end, structured API. Best for: strong typing, module organization.
531
+ - Next.js + FastAPI + PostgreSQL -- TS frontend, Python API. Best for: AI features in the API.
532
+ - React + Go + PostgreSQL -- lightweight. Best for: high-throughput services.
533
+ - Monorepo (Turborepo/Nx) when sharing types/utils; polyrepo when teams are independent.
534
+
535
+ ### Desktop app
536
+ - Tauri + TypeScript -- cross-platform, Rust backend, small binaries. Best for: lightweight tools.
537
+ - Electron + TypeScript -- cross-platform, largest ecosystem. Best for: feature-rich apps.
538
+ - Swift + SwiftUI -- macOS native. Best for: Mac-only.
539
+ - .NET MAUI -- Windows + macOS. Best for: C#/.NET teams.
540
+
541
+ ### API / backend
542
+ - Node.js + Fastify + TypeScript -- fast, lean. Best for: microservices, performance.
543
+ - NestJS + TypeScript -- structured, enterprise-friendly. Best for: larger teams, complex domains.
544
+ - Python + FastAPI -- auto-docs, async. Best for: Python teams, AI, data-heavy.
545
+ - Go -- compiled, high-throughput. Best for: infrastructure services.
546
+ - Rust + Axum -- maximum performance. Best for: systems-level APIs.
547
+
548
+ ### CLI tool
549
+ - TypeScript + Node.js -- fast to build, npm distribution.
550
+ - Rust -- single binary, fast. Best for: performance, wide distribution.
551
+ - Go -- single binary, fast compile. Best for: DevOps, infra CLIs.
552
+ - Python + Typer -- fastest to prototype. Best for: internal tools.
553
+
554
+ ### Library / package
555
+ - TypeScript -- npm, widest web reach.
556
+ - Rust -- crates.io, WASM target.
557
+ - Python -- PyPI, data science.
558
+
559
+ ### Framework-specific choices (ask only when it affects tickets)
560
+ - Next.js: App Router (Recommended) vs Pages Router
561
+ - React Native: Expo (Recommended) vs bare
562
+ - Node.js ORM: Drizzle (SQL-friendly, lightweight) vs Prisma (higher-level DX, generated types)
563
+ - Python ORM: SQLAlchemy vs Django ORM
564
+ - Database: PostgreSQL (Recommended), SQLite (local), MongoDB (documents), MySQL (legacy)
565
+ - Component library (web): shadcn/ui (Recommended for Next.js), Material UI, Chakra UI, None/custom
566
+ - Component library (mobile): default to framework built-in (Flutter Material, RN Paper/NativeBase)
567
+
568
+ ### Deployment / hosting
569
+ - Vercel (Recommended for Next.js/Astro) -- zero-config, git push deploy
570
+ - Netlify -- static + serverless
571
+ - Railway -- full-stack, databases included
572
+ - Fly.io -- containers, global edge
573
+ - AWS / GCP / Azure -- maximum control, complex setup
574
+ - Self-hosted / VPS -- Docker Compose + nginx, full control
575
+
576
+ ### AI-specific choices
577
+ - LLM: Anthropic Claude (product default), OpenAI, Google Gemini, Self-hosted (Qwen/Llama/Mistral), Multi-provider
578
+ - Pattern: RAG, Agentic, Conversational, Structured generation (composable: primary + secondary)
579
+ - Processing: Sync, Async, Both
580
+ - Vector DB: pgvector (Recommended), Pinecone, Qdrant, Chroma
581
+ - Safety: by audience (public/internal/backend) + domain sensitivity (separate axis)