@zigrivers/scaffold 2.1.1 → 2.28.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (100)
  1. package/README.md +272 -59
  2. package/dist/project/frontmatter.d.ts.map +1 -1
  3. package/dist/project/frontmatter.js +4 -0
  4. package/dist/project/frontmatter.js.map +1 -1
  5. package/knowledge/core/adr-craft.md +53 -0
  6. package/knowledge/core/ai-memory-management.md +246 -0
  7. package/knowledge/core/api-design.md +4 -0
  8. package/knowledge/core/claude-md-patterns.md +254 -0
  9. package/knowledge/core/coding-conventions.md +246 -0
  10. package/knowledge/core/database-design.md +4 -0
  11. package/knowledge/core/design-system-tokens.md +465 -0
  12. package/knowledge/core/dev-environment.md +223 -0
  13. package/knowledge/core/domain-modeling.md +4 -0
  14. package/knowledge/core/eval-craft.md +1008 -0
  15. package/knowledge/core/multi-model-review-dispatch.md +250 -0
  16. package/knowledge/core/operations-runbook.md +37 -226
  17. package/knowledge/core/project-structure-patterns.md +231 -0
  18. package/knowledge/core/review-step-template.md +247 -0
  19. package/knowledge/core/{security-review.md → security-best-practices.md} +5 -1
  20. package/knowledge/core/task-decomposition.md +57 -34
  21. package/knowledge/core/task-tracking.md +225 -0
  22. package/knowledge/core/tech-stack-selection.md +214 -0
  23. package/knowledge/core/testing-strategy.md +63 -70
  24. package/knowledge/core/user-stories.md +69 -60
  25. package/knowledge/core/user-story-innovation.md +57 -0
  26. package/knowledge/core/ux-specification.md +5 -148
  27. package/knowledge/finalization/apply-fixes-and-freeze.md +165 -14
  28. package/knowledge/product/prd-craft.md +55 -34
  29. package/knowledge/review/review-adr.md +32 -0
  30. package/knowledge/review/{review-api-contracts.md → review-api-design.md} +34 -1
  31. package/knowledge/review/{review-database-schema.md → review-database-design.md} +27 -1
  32. package/knowledge/review/review-domain-modeling.md +33 -0
  33. package/knowledge/review/review-implementation-tasks.md +50 -0
  34. package/knowledge/review/review-operations.md +55 -0
  35. package/knowledge/review/review-prd.md +33 -0
  36. package/knowledge/review/review-security.md +53 -0
  37. package/knowledge/review/review-system-architecture.md +28 -0
  38. package/knowledge/review/review-testing-strategy.md +51 -0
  39. package/knowledge/review/review-user-stories.md +54 -0
  40. package/knowledge/review/{review-ux-spec.md → review-ux-specification.md} +37 -1
  41. package/methodology/custom-defaults.yml +32 -3
  42. package/methodology/deep.yml +32 -3
  43. package/methodology/mvp.yml +32 -3
  44. package/package.json +2 -1
  45. package/pipeline/architecture/review-architecture.md +18 -6
  46. package/pipeline/architecture/system-architecture.md +14 -2
  47. package/pipeline/consolidation/claude-md-optimization.md +73 -0
  48. package/pipeline/consolidation/workflow-audit.md +73 -0
  49. package/pipeline/decisions/adrs.md +14 -2
  50. package/pipeline/decisions/review-adrs.md +18 -5
  51. package/pipeline/environment/ai-memory-setup.md +70 -0
  52. package/pipeline/environment/automated-pr-review.md +70 -0
  53. package/pipeline/environment/design-system.md +73 -0
  54. package/pipeline/environment/dev-env-setup.md +65 -0
  55. package/pipeline/environment/git-workflow.md +71 -0
  56. package/pipeline/finalization/apply-fixes-and-freeze.md +1 -1
  57. package/pipeline/finalization/developer-onboarding-guide.md +1 -1
  58. package/pipeline/finalization/implementation-playbook.md +3 -3
  59. package/pipeline/foundation/beads.md +68 -0
  60. package/pipeline/foundation/coding-standards.md +68 -0
  61. package/pipeline/foundation/project-structure.md +69 -0
  62. package/pipeline/foundation/tdd.md +60 -0
  63. package/pipeline/foundation/tech-stack.md +74 -0
  64. package/pipeline/integration/add-e2e-testing.md +65 -0
  65. package/pipeline/modeling/domain-modeling.md +14 -2
  66. package/pipeline/modeling/review-domain-modeling.md +18 -5
  67. package/pipeline/parity/platform-parity-review.md +70 -0
  68. package/pipeline/planning/implementation-plan-review.md +56 -0
  69. package/pipeline/planning/{implementation-tasks.md → implementation-plan.md} +29 -9
  70. package/pipeline/pre/create-prd.md +13 -4
  71. package/pipeline/pre/innovate-prd.md +37 -8
  72. package/pipeline/pre/innovate-user-stories.md +38 -7
  73. package/pipeline/pre/review-prd.md +18 -6
  74. package/pipeline/pre/review-user-stories.md +23 -6
  75. package/pipeline/pre/user-stories.md +12 -2
  76. package/pipeline/quality/create-evals.md +102 -0
  77. package/pipeline/quality/operations.md +38 -13
  78. package/pipeline/quality/review-operations.md +17 -5
  79. package/pipeline/quality/review-security.md +17 -5
  80. package/pipeline/quality/review-testing.md +20 -8
  81. package/pipeline/quality/security.md +25 -3
  82. package/pipeline/quality/story-tests.md +73 -0
  83. package/pipeline/specification/api-contracts.md +17 -2
  84. package/pipeline/specification/database-schema.md +17 -2
  85. package/pipeline/specification/review-api.md +18 -6
  86. package/pipeline/specification/review-database.md +18 -6
  87. package/pipeline/specification/review-ux.md +19 -7
  88. package/pipeline/specification/ux-spec.md +29 -10
  89. package/pipeline/validation/critical-path-walkthrough.md +34 -7
  90. package/pipeline/validation/cross-phase-consistency.md +34 -7
  91. package/pipeline/validation/decision-completeness.md +34 -7
  92. package/pipeline/validation/dependency-graph-validation.md +34 -7
  93. package/pipeline/validation/implementability-dry-run.md +34 -7
  94. package/pipeline/validation/scope-creep-check.md +34 -7
  95. package/pipeline/validation/traceability-matrix.md +34 -7
  96. package/skills/multi-model-dispatch/SKILL.md +326 -0
  97. package/skills/scaffold-pipeline/SKILL.md +195 -0
  98. package/skills/scaffold-runner/SKILL.md +465 -0
  99. package/pipeline/planning/review-tasks.md +0 -38
  100. package/pipeline/quality/testing-strategy.md +0 -42
package/knowledge/core/task-decomposition.md
@@ -4,11 +4,45 @@ description: Breaking architecture into implementable tasks with dependency anal
  topics: [tasks, decomposition, dependencies, user-stories, parallelization, sizing, critical-path]
  ---

- ## User Stories to Tasks
+ # Task Decomposition

- > **Note:** User stories are created as an upstream artifact in the pre-pipeline phase and available at `docs/user-stories.md`. This section covers how to consume stories and derive implementation tasks from them.
+ Expert knowledge for breaking user stories into implementable tasks with dependency analysis, sizing, parallelization, and agent context requirements.
+
+ ## Summary
+
+ ### Story-to-Task Mapping
+
+ User stories bridge PRD features and implementation tasks. Each story decomposes into tasks following the technical layers needed. Every task must trace back to a user story, and every story to a PRD feature (PRD Feature → US-xxx → Task BD-xxx).
+
+ ### Task Sizing
+
+ Each task should be completable in a single AI agent session (30-90 minutes of agent time). A well-sized task has a clear title (usable as commit message), touches 1-5 files, produces a testable result, and has no ambiguity about "done."
+
+ Split large tasks by layer (API, UI, DB, tests), by feature slice (happy path, validation, edge cases), or by entity. Combine tiny tasks that touch the same file and have no independent value.
+
+ ### Dependency Types
+
+ - **Logical** — Task B requires Task A's output (endpoint needs DB schema)
+ - **File contention** — Two tasks modify the same file (merge conflict risk)
+ - **Infrastructure** — Task requires setup that must exist first (DB, auth, CI)
+ - **Knowledge** — Task benefits from understanding gained in another task
+
+ Only logical, file contention, and infrastructure dependencies should be formal constraints.
+
+ ### Definition of Done
+
+ 1. Acceptance criteria from the user story are met
+ 2. Unit tests pass (for new logic)
+ 3. Integration tests pass (for API endpoints or component interactions)
+ 4. No linting or type errors
+ 5. Code follows project coding standards
+ 6. Changes committed with proper message format
+
+ ## Deep Guidance

- ### From Stories to Tasks
+ ### From Stories to Tasks — Extended
+
+ > **Note:** User stories are created as an upstream artifact in the pre-pipeline phase and available at `docs/user-stories.md`. This section covers how to consume stories and derive implementation tasks from them.

  User stories bridge the gap between what the business wants (PRD features) and what developers build (implementation tasks). Every PRD feature maps to one or more user stories (created in the pre-pipeline), and every user story should map to one or more implementation tasks.

@@ -115,9 +149,9 @@ This traceability ensures:
  - No orphan tasks exist (every task serves a purpose)
  - Impact analysis is possible (changing a PRD feature reveals which tasks are affected)

- ## Task Sizing
+ ### Task Sizing — Extended

- ### Right-Sizing for Agent Sessions
+ #### Right-Sizing for Agent Sessions

  Each task should be completable in a single AI agent session (typically 30-90 minutes of agent time). Tasks that are too large overflow the context window; tasks that are too small create unnecessary coordination overhead.

@@ -136,7 +170,7 @@ Each task should be completable in a single AI agent session (typically 30-90 mi
  | "Create Button component" | "Build form components (Input, Select, Textarea) with validation states" | "Create the full design system" |
  | "Add index to users table" | "Create database schema for user management with migration" | "Set up the entire database" |

- ### Splitting Large Tasks
+ #### Splitting Large Tasks

  When a task is too large, split along these axes:

@@ -163,7 +197,7 @@ When a task is too large, split along these axes:
  - The task involves more than 2 architectural boundaries (e.g., database + API + frontend + auth)
  - You can't describe what "done" looks like in 2-3 sentences

- ### Combining Small Tasks
+ #### Combining Small Tasks

  If multiple tiny tasks touch the same file and have no independent value, combine them:

@@ -172,20 +206,9 @@ If multiple tiny tasks touch the same file and have no independent value, combin

  The test: would the small task result in a useful commit on its own? If not, combine.

- ### Definition of Done
-
- Every task needs a clear definition of done. Standard criteria:
-
- 1. All acceptance criteria from the user story are met
- 2. Unit tests pass (for new logic)
- 3. Integration tests pass (for API endpoints or component interactions)
- 4. No linting or type errors
- 5. Code follows project coding standards
- 6. Changes are committed with proper message format
-
- ## Dependency Analysis
+ ### Dependency Analysis — Extended

- ### Types of Dependencies
+ #### Types of Dependencies

  **Logical dependencies:** Task B requires Task A's output. The API endpoint task depends on the database schema task because the endpoint queries tables that must exist first.

@@ -195,7 +218,7 @@ Every task needs a clear definition of done. Standard criteria:

  **Knowledge dependencies:** A task requires understanding gained from completing another task. The developer who builds the auth system understands the auth patterns needed by other features.

- ### Building Dependency Graphs (DAGs)
+ #### Building Dependency Graphs (DAGs)

  A dependency graph is a directed acyclic graph (DAG) where:
  - Nodes are tasks
@@ -210,7 +233,7 @@ A dependency graph is a directed acyclic graph (DAG) where:
  4. Draw an edge from producer to consumer
  5. Check for cycles (if A depends on B and B depends on A, something is wrong — split or reorganize)

- ### Detecting Cycles
+ #### Detecting Cycles

  Cycles indicate a modeling problem. Common causes and fixes:

@@ -218,7 +241,7 @@ Cycles indicate a modeling problem. Common causes and fixes:
  - **Feature interaction:** Feature X needs Feature Y's component, and Feature Y needs Feature X's component. Fix: extract the shared component into its own task.
  - **Testing dependency:** "Can't test A without B, can't test B without A." Fix: use mocks/stubs to break the cycle during testing. The integration test that tests both together becomes a separate task.
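
The cycle check described in this section can be sketched as a depth-first search over the task graph. This is an illustrative sketch, not part of the package; the task IDs are hypothetical, and edges point from prerequisite to dependent.

```typescript
type Graph = Map<string, string[]>; // task -> tasks that depend on it

// Returns one cycle as a list of task IDs (first ID repeated at the end),
// or null if the graph is acyclic.
function findCycle(graph: Graph): string[] | null {
  const state = new Map<string, "visiting" | "done">();
  const stack: string[] = []; // current DFS path

  function visit(node: string): string[] | null {
    if (state.get(node) === "done") return null;
    if (state.get(node) === "visiting") {
      // Back edge: the cycle is the part of the path from `node` onward.
      return [...stack.slice(stack.indexOf(node)), node];
    }
    state.set(node, "visiting");
    stack.push(node);
    for (const next of graph.get(node) ?? []) {
      const cycle = visit(next);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(node, "done");
    return null;
  }

  for (const node of graph.keys()) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// "A depends on B, B depends on A" — the chicken-and-egg shape above.
const cyclic: Graph = new Map([
  ["BD-10", ["BD-11"]],
  ["BD-11", ["BD-10"]],
]);
const acyclic: Graph = new Map([
  ["BD-01", ["BD-10", "BD-11"]],
  ["BD-10", ["BD-12"]],
  ["BD-11", []],
  ["BD-12", []],
]);
```

Running this on `cyclic` surfaces the BD-10/BD-11 loop so it can be split or reorganized; `acyclic` passes cleanly.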

- ### Finding Critical Path
+ #### Finding Critical Path

  The critical path is the longest chain of dependent tasks from start to finish. It determines the minimum project duration.

@@ -235,7 +258,7 @@ The critical path is the longest chain of dependent tasks from start to finish.
  - To shorten the project, focus on splitting or accelerating critical-path tasks
  - Non-critical-path tasks have "float" — they can be delayed without affecting the project end date
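
The longest-chain computation can be sketched as memoized recursion over prerequisites. A sketch with hypothetical task IDs and uniform task durations; weighted durations would replace chain length with summed estimates.

```typescript
type Deps = Map<string, string[]>; // task -> its prerequisite tasks

function criticalPath(deps: Deps): string[] {
  const memo = new Map<string, string[]>();

  // Longest chain of prerequisites ending at `task`, including `task` itself.
  function chainTo(task: string): string[] {
    const cached = memo.get(task);
    if (cached) return cached;
    let best: string[] = [];
    for (const prereq of deps.get(task) ?? []) {
      const chain = chainTo(prereq);
      if (chain.length > best.length) best = chain;
    }
    const result = [...best, task];
    memo.set(task, result);
    return result;
  }

  let longest: string[] = [];
  for (const task of deps.keys()) {
    const chain = chainTo(task);
    if (chain.length > longest.length) longest = chain;
  }
  return longest;
}

const deps: Deps = new Map([
  ["BD-01", []],          // infrastructure setup
  ["BD-10", ["BD-01"]],   // schema depends on setup
  ["BD-12", ["BD-10"]],   // endpoint depends on schema
  ["BD-13", ["BD-10"]],   // second endpoint, has float
  ["BD-20", ["BD-12"]],   // UI flow depends on endpoint
]);
```

Here BD-13 sits off the critical path (it has float), while the BD-01 → BD-10 → BD-12 → BD-20 chain sets the minimum project duration.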

- ### Dependency Documentation
+ #### Dependency Documentation

  For each dependency, document:

@@ -245,9 +268,9 @@ For each dependency, document:
  | BD-12 -> BD-13 | File contention | Both modify src/routes/index.ts | Medium — merge conflict risk |
  | BD-01 -> BD-* | Infrastructure | BD-01 sets up the database; everything needs it | High — blocks all work |

- ## Parallelization
+ ### Parallelization and Wave Planning

- ### Identifying Independent Tasks
+ #### Identifying Independent Tasks

  Tasks are safe to run in parallel when:
  - They have no shared dependencies (no common prerequisite still in progress)
@@ -267,7 +290,7 @@ Tasks are safe to run in parallel when:
  - Tasks that modify the same shared utility file
  - Tasks where one produces test fixtures the other consumes

- ### Managing Shared-State Tasks
+ #### Managing Shared-State Tasks

  When tasks must share state (database, shared configuration, route registry):

@@ -277,7 +300,7 @@ When tasks must share state (database, shared configuration, route registry):

  **Feature flags:** Both tasks can merge independently. A feature flag controls which one is active. Integrate them in a separate task after both complete.

- ### Merge Strategies for Parallel Work
+ #### Merge Strategies for Parallel Work

  When parallel tasks produce branches that must be merged to main:

@@ -285,7 +308,7 @@ When parallel tasks produce branches that must be merged to main:
  - **First-in wins:** The first task to merge gets a clean merge. Subsequent tasks must rebase and resolve conflicts.
  - **Minimize shared files:** Design the task decomposition to minimize file overlap. Feature-based directory structure helps enormously.

- ### Wave Planning
+ #### Wave Planning

  Organize tasks into waves based on the dependency graph:

@@ -298,9 +321,9 @@ Wave 4 (depends on Wave 3): End-to-end tests, performance optimization, polish

  Each wave's tasks can run in parallel. Wave N+1 starts only when all its dependencies in Wave N are complete. The number of parallel agents should match the number of independent tasks in the current wave.
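
Wave grouping falls out of the dependency graph with repeated "all prerequisites done" filtering (a Kahn-style leveling). A sketch with hypothetical task IDs, not part of the package:

```typescript
type Prereqs = Map<string, string[]>; // task -> prerequisite task IDs

function planWaves(prereqs: Prereqs): string[][] {
  const done = new Set<string>();
  const waves: string[][] = [];
  while (done.size < prereqs.size) {
    // A task joins the current wave when every prerequisite is already done.
    const wave = [...prereqs.keys()].filter(
      (task) =>
        !done.has(task) && (prereqs.get(task) ?? []).every((p) => done.has(p))
    );
    if (wave.length === 0) throw new Error("cycle in dependency graph");
    for (const task of wave) done.add(task);
    waves.push(wave);
  }
  return waves;
}

const prereqs: Prereqs = new Map([
  ["BD-01", []],                 // project setup
  ["BD-02", []],                 // CI pipeline
  ["BD-10", ["BD-01"]],          // database schema
  ["BD-11", ["BD-01"]],          // auth scaffolding
  ["BD-20", ["BD-10", "BD-11"]], // registration endpoint
]);
```

Each inner array is one wave; the length of the current wave is the number of agents that can usefully run in parallel.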

- ## Agent Context
+ ### Agent Context Requirements

- ### What Context Each Task Needs
+ #### What Context Each Task Needs

  Every task description should specify what documents and code the implementing agent needs to read:

@@ -321,7 +344,7 @@ Produces:
  - tests/features/auth/register.integration.test.ts
  ```

- ### Handoff Information
+ #### Handoff Information

  When a task produces output that another task consumes, specify the handoff:

@@ -338,7 +361,7 @@ Consuming tasks:
  BD-30 (onboarding flow) expects the response shape above
  ```

- ### Assumed Prior Work
+ #### Assumed Prior Work

  Explicitly state what the agent can assume exists:

@@ -353,7 +376,7 @@ Does NOT assume:
  - Any auth endpoints exist (this is the first)
  ```

- ## Common Pitfalls
+ ### Common Pitfalls

  **Tasks too vague.** "Implement backend" or "Set up auth" with no acceptance criteria, no file paths, and no test requirements. An agent receiving this task will guess wrong about scope, structure, and conventions. Fix: every task must specify exact files to create/modify, acceptance criteria, and test requirements.

package/knowledge/core/task-tracking.md
@@ -0,0 +1,225 @@
+ ---
+ name: task-tracking
+ description: Task tracking patterns including Beads methodology, task hierarchies, progress tracking, and lessons-learned workflows
+ topics: [task-management, beads, progress-tracking, lessons-learned, autonomous-work]
+ ---
+
+ # Task Tracking
+
+ Structured task tracking for AI agents ensures work continuity across sessions, prevents drift, and builds institutional memory. This knowledge covers the Beads methodology, task hierarchies, progress conventions, and the lessons-learned workflow that turns mistakes into permanent improvements.
+
+ ## Summary
+
+ ### Beads Methodology Overview
+
+ Beads is an AI-friendly issue tracker designed for single-developer and AI-agent workflows. Unlike heavyweight project management tools (Jira, Linear), Beads stores task data in the repository itself, making it accessible to AI agents without external API integration.
+
+ Core properties:
+ - **Repository-local** — Task data lives in `.beads/`, committed alongside code
+ - **Git-hook synced** — Task state updates automatically on commit via data-sync hooks
+ - **CLI-driven** — All operations via `bd` commands (create, list, status, ready)
+ - **ID-prefixed commits** — Every commit message includes `[BD-xxx]` for traceability
+
+ ### Task Hierarchy
+
+ Tasks organize into three levels:
+
+ | Level | Scope | Example | Typical Count |
+ |-------|-------|---------|---------------|
+ | **Epic** | Large feature or milestone | "User authentication system" | 3-8 per project |
+ | **Task** | Single agent session (30-90 min) | "Implement login endpoint with validation" | 10-50 per project |
+ | **Subtask** | Atomic unit within a task | "Add password hashing util" | 0-5 per task |
+
+ Epics group related tasks. Tasks are the unit of work assignment — one task per agent session. Subtasks are optional decomposition within a task, useful when a task has distinct testable steps.
+
+ ### Progress Tracking
+
+ Track task status through a simple state machine:
+
+ ```
+ ready → in-progress → review → done
+        ↘ blocked
+ ```
+
+ - **ready** — All dependencies met, can start immediately
+ - **in-progress** — Agent is actively working on it
+ - **review** — Implementation complete, awaiting PR merge
+ - **done** — PR merged, tests passing on main
+ - **blocked** — Cannot proceed, dependency or question unresolved
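
The state machine can be written as an explicit transition table. A sketch, not part of the Beads CLI; it encodes only the transitions drawn in the diagram, and leaves unblocking and review-rejection paths for the project to define.

```typescript
type Status = "ready" | "in-progress" | "review" | "done" | "blocked";

// Allowed moves, taken directly from the state diagram.
const transitions: Record<Status, Status[]> = {
  ready: ["in-progress"],
  "in-progress": ["review", "blocked"],
  review: ["done"],
  done: [],
  blocked: [], // unblocking policy is project-specific, so omitted here
};

function canMove(from: Status, to: Status): boolean {
  return transitions[from].includes(to);
}
```

A guard like `canMove` keeps status updates honest: a task cannot jump from ready straight to done without passing through work and review.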
+
+ ### Lessons-Learned Workflow
+
+ The `tasks/lessons.md` file captures patterns discovered during work. It has three sections:
+
+ 1. **Patterns** — Approaches that worked well (reuse these)
+ 2. **Anti-Patterns** — Approaches that failed (avoid these)
+ 3. **Common Gotchas** — Project-specific traps (watch for these)
+
+ After ANY correction from the user, immediately update `tasks/lessons.md` with the pattern. Write the rule so that it prevents the same mistake in future sessions.
+
+ ## Deep Guidance
+
+ ### Beads Setup and Commands
+
+ #### Initialization
+
+ ```bash
+ bd init    # Creates .beads/ directory with data store and git hooks
+ ```
+
+ Initialization creates:
+ - `.beads/` — Data directory (committed to git)
+ - Git hooks for automatic data sync (these are Beads data hooks, not code-quality hooks like pre-commit linters)
+ - Initial `[BD-0]` bootstrap convention
+
+ #### Core Commands
+
+ | Command | Purpose | When to Use |
+ |---------|---------|-------------|
+ | `bd create "title"` | Create a new task | Starting new work |
+ | `bd list` | Show all tasks | Session start, planning |
+ | `bd status BD-xxx` | Check task state | Before picking up work |
+ | `bd start BD-xxx` | Mark task in-progress | Beginning work on a task |
+ | `bd done BD-xxx` | Mark task complete | After PR merged |
+ | `bd ready` | List tasks ready to start | Picking next task |
+ | `bd block BD-xxx "reason"` | Mark task blocked | When dependency is unmet |
+
+ #### Commit Message Convention
+
+ Every commit references its Beads task:
+
+ ```
+ [BD-42] feat(api): implement user registration endpoint
+
+ - Add POST /api/v1/auth/register
+ - Add input validation with zod schema
+ - Add integration tests for happy path and validation errors
+ ```
+
+ The `[BD-xxx]` prefix enables:
+ - Automatic task-to-commit traceability
+ - Progress tracking based on commit activity
+ - Session reconstruction (which commits belong to which task)
+
+ ### Task Lifecycle Patterns
+
+ #### Session Start Protocol
+
+ 1. Review `tasks/lessons.md` for recent patterns and corrections
+ 2. Run `bd ready` to see available tasks
+ 3. Pick the highest-priority ready task (or continue an in-progress task)
+ 4. Run `bd start BD-xxx` to claim the task
+ 5. Read the task description and acceptance criteria before writing code
+
+ #### Session End Protocol
+
+ 1. Commit all work with `[BD-xxx]` prefix
+ 2. If task is complete: create PR, run `bd done BD-xxx`
+ 3. If task is incomplete: leave clear notes about current state and next steps
+ 4. If lessons were learned: update `tasks/lessons.md`
+
+ #### Task Completion Criteria
+
+ A task is done when:
+ - All acceptance criteria from the task description are met
+ - Tests pass (`make check` or equivalent)
+ - Code follows project coding standards
+ - Changes are committed with proper `[BD-xxx]` message
+ - PR is created (or merged, depending on workflow)
+
+ Do not mark a task done based on "it seems to work." Prove it works — tests pass, logs clean, behavior verified.
+
+ ### Lessons-Learned Workflow — Extended
+
+ #### When to Capture
+
+ Capture a lesson immediately when:
+ - The user corrects your approach or output
+ - A test fails due to a pattern you should have known
+ - You discover a project-specific convention by reading code
+ - A dependency or tool behaves differently than expected
+ - A workaround is needed for a known issue
+
+ #### How to Write Lessons
+
+ Each lesson should be specific, actionable, and preventive:
+
+ **Good lesson:**
+ ```markdown
+ ### Anti-Pattern: Using `git push -f` on shared branches
+ - **Trigger:** Pushed force to a branch with an open PR
+ - **Impact:** Overwrote collaborator's review comments
+ - **Rule:** Never force-push to branches with open PRs. Use `git push --force-with-lease` if force is truly needed.
+ ```
+
+ **Bad lesson:**
+ ```markdown
+ ### Be careful with git
+ - Don't break things
+ ```
+
+ The lesson must contain enough detail that a future agent (or the same agent in a new session) can apply the rule without additional context.
+
+ #### Integration with CLAUDE.md
+
+ The CLAUDE.md Self-Improvement section establishes the contract:
+
+ > After ANY correction from the user: update `tasks/lessons.md` with the pattern.
+ > Write rules that prevent the same mistake recurring.
+ > Review `tasks/lessons.md` at session start before picking up work.
+
+ This creates a feedback loop: correction → lesson → rule → prevention. Each session starts by reviewing lessons, ensuring that past mistakes inform current work.
+
+ #### Cross-Session Memory
+
+ `tasks/lessons.md` is the primary cross-session learning mechanism. It persists in the repository and is loaded via CLAUDE.md references. For projects using MCP memory servers (Tier 2 memory), lessons can also be stored in the knowledge graph for structured querying — but `tasks/lessons.md` remains the canonical file. Do not duplicate entries across both systems.
+
+ ### Progress Tracking Conventions
+
+ #### Status Files
+
+ For complex projects, maintain a progress summary:
+
+ ```markdown
+ # Progress
+
+ ## Current Sprint
+ - [x] BD-10: Database schema migration (done)
+ - [x] BD-11: Auth middleware (done)
+ - [ ] BD-12: User registration endpoint (in-progress)
+ - [ ] BD-13: Login endpoint (ready)
+ - [ ] BD-14: Profile management (blocked — needs BD-12)
+
+ ## Blocked
+ - BD-14: Waiting on BD-12 (user model finalization)
+ ```
+
+ #### Completion Criteria Checklists
+
+ Each task should define explicit completion criteria, not vague goals:
+
+ ```markdown
+ ## BD-12: User registration endpoint
+
+ ### Done when:
+ - [ ] POST /api/v1/auth/register endpoint exists
+ - [ ] Input validation rejects invalid email, weak password
+ - [ ] Password is hashed with bcrypt (cost factor 12)
+ - [ ] Duplicate email returns 409 Conflict
+ - [ ] Integration test covers happy path + 3 error cases
+ - [ ] `make check` passes
+ ```
+
+ ### Common Anti-Patterns
+
+ **Stale tasks.** Tasks created during planning but never updated as the project evolves. The task list says "implement X" but X was descoped two sessions ago. Fix: review the task list at the start of each session. Archive or close tasks that no longer apply.
+
+ **Unclear completion criteria.** "Implement the feature" with no acceptance criteria, no test requirements, no file paths. An agent starting this task has to guess what "done" means. Fix: every task specifies exact deliverables, test requirements, and a verifiable definition of done.
+
+ **Missing lessons.** The user corrects the same mistake three sessions in a row because nobody captured it in `tasks/lessons.md`. Fix: treat lesson capture as mandatory, not optional. After every correction, update the file before continuing with other work.
+
+ **Task ID drift.** Commits stop including `[BD-xxx]` prefixes partway through the project. Traceability breaks down. Fix: make task ID inclusion a habit enforced by review. If using a pre-commit hook, validate the prefix.
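
A prefix check of the kind suggested here can be sketched as a single regular expression. The regex is illustrative, not part of Beads; it assumes the `[BD-xxx]` convention described earlier.

```typescript
// Accepts messages like "[BD-42] feat(api): implement user registration endpoint":
// a [BD-<digits>] prefix, whitespace, then a non-empty subject line.
const TASK_PREFIX = /^\[BD-\d+\]\s+\S/;

function hasTaskPrefix(message: string): boolean {
  return TASK_PREFIX.test(message);
}
```

A commit-msg hook could run this check against the message file and reject the commit when it returns false.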
+
+ **Overloaded tasks.** A single task covers "implement the API, write the UI, add tests, update docs." This overflows a single session and makes progress tracking meaningless. Fix: split into tasks that each fit in one agent session (30-90 minutes).
+
+ **Lessons without rules.** A lesson says "we had trouble with X" but doesn't state a preventive rule. Future sessions read the lesson but don't know what to do differently. Fix: every lesson must include a concrete rule — "Always do Y" or "Never do Z" — not just a description of what went wrong.
package/knowledge/core/tech-stack-selection.md
@@ -0,0 +1,214 @@
1
+ ---
2
+ name: tech-stack-selection
3
+ description: Framework evaluation methodology, decision matrices, and technology tradeoff analysis
4
+ topics: [tech-stack, framework-selection, decision-matrix, tradeoffs, scalability, ecosystem]
5
+ ---
6
+
7
+ # Tech Stack Selection
8
+
9
+ Choosing a technology stack is one of the highest-leverage decisions in a project. A poor choice compounds into years of friction; a good choice becomes invisible. This knowledge covers systematic evaluation frameworks, decision matrices, and the discipline to separate signal from hype.
10
+
11
+ ## Summary
12
+
13
+ ### Selection Criteria Categories
14
+
15
+ Every technology choice should be evaluated across six dimensions:
16
+
17
+ 1. **Ecosystem Maturity** — Package ecosystem breadth, stability of core libraries, frequency of breaking changes, quality of documentation, Stack Overflow answer density.
18
+ 2. **Team Expertise** — Current team proficiency, hiring pool depth in your market, ramp-up time for new developers, availability of training resources.
19
+ 3. **Performance Characteristics** — Throughput, latency, memory footprint, startup time, concurrency model. Match to your workload profile, not benchmarks.
20
+ 4. **Community & Support** — GitHub activity, release cadence, corporate backing stability, conference presence, number of active maintainers.
21
+ 5. **Licensing & Cost** — License type (MIT, Apache, BSL, SSPL), commercial support costs, cloud provider pricing, vendor lock-in implications.
22
+ 6. **Integration Fit** — Compatibility with existing systems, deployment target constraints, team tooling preferences, CI/CD compatibility.
23
+
24
+ ### Decision Matrix Concept
25
+
26
A decision matrix scores each candidate technology against weighted criteria. Weights reflect project priorities — a startup prototype weights "time to first feature" heavily; an enterprise migration weights "long-term support" heavily. The matrix does not make the decision — it structures the conversation and forces explicit tradeoff acknowledgment. Set weights before scoring begins to prevent post-hoc rationalization of a predetermined choice.

### When to Revisit

Stack decisions should be revisited when: the team composition changes significantly, a dependency reaches end-of-life, performance requirements shift by an order of magnitude, or the licensing model changes. Do not revisit because a new framework is trending.

### The Anti-Pattern Shortlist

The most common selection failures: **Resume-Driven Development** (choosing tech the team wants to learn, not what fits), **Hype-Driven Development** (choosing what is trending, not what is proven), **Ignoring Team Skills** (a 20% perf gain is not worth a 200% productivity loss during ramp-up), and **Premature Vendor Lock-In** (building on proprietary services without abstraction layers).

### Documentation Requirement

Every stack decision must produce a written record: what was chosen, what was rejected, why, and under what conditions the decision should be revisited. This lives in `docs/tech-stack.md` or as an Architecture Decision Record (ADR). Undocumented decisions get relitigated every quarter.

## Deep Guidance

### The Evaluation Framework

#### Step 1: Define Non-Negotiable Constraints

Before evaluating options, enumerate hard constraints that eliminate candidates outright:

- **Runtime environment**: Browser, Node, Deno, Bun, JVM, native binary, embedded
- **Deployment target**: Serverless, containers, bare metal, edge, mobile device
- **Compliance requirements**: HIPAA, SOC2, FedRAMP — some libraries/services are pre-approved
- **Existing commitments**: Must integrate with an existing PostgreSQL database, must deploy to AWS, must support IE11
- **Team size and tenure**: A 2-person team cannot maintain a microservices architecture in 4 languages

Hard constraints are binary. If a technology fails any constraint, it is eliminated regardless of how well it scores on other dimensions.
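The elimination step can be sketched as a simple filter that runs before any weighted scoring. This is an illustrative fragment only — the `Candidate` shape and the two example constraints are invented for this sketch, not part of any real API:

```typescript
// Hypothetical sketch: hard constraints as boolean predicates that
// eliminate candidates outright before any weighted scoring happens.
interface Candidate {
  name: string;
  runtimes: string[];
  license: string;
}

type HardConstraint = (c: Candidate) => boolean;

const constraints: HardConstraint[] = [
  (c) => c.runtimes.includes("node"), // must run on our Node deployment target
  (c) => c.license !== "proprietary", // compliance: open-source licenses only
];

// A candidate failing ANY constraint is out, regardless of other merits.
function eliminate(candidates: Candidate[]): Candidate[] {
  return candidates.filter((c) => constraints.every((check) => check(c)));
}

const survivors = eliminate([
  { name: "fastify", runtimes: ["node"], license: "MIT" },
  { name: "closed-sdk", runtimes: ["node"], license: "proprietary" },
]);
// survivors contains only "fastify"
```

The point of expressing constraints as predicates is that they stay binary: there is no partial credit, which is exactly what separates Step 1 from the weighted scoring in Steps 2-3.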
#### Step 2: Weight the Criteria

Assign weights (1-5) to each criterion based on project context:

| Criterion | Startup MVP | Enterprise Migration | Performance-Critical | Open Source Tool |
|-----------|-------------|---------------------|---------------------|-----------------|
| Ecosystem Maturity | 3 | 5 | 3 | 4 |
| Team Expertise | 5 | 4 | 3 | 2 |
| Performance | 2 | 3 | 5 | 3 |
| Community | 4 | 3 | 2 | 5 |
| Licensing | 2 | 5 | 2 | 5 |
| Integration Fit | 3 | 5 | 4 | 3 |

These weights are examples. The team must set them for their specific context before scoring begins — otherwise weights get adjusted post-hoc to justify a predetermined choice.

#### Step 3: Score and Compare

Score each candidate 1-5 per criterion. Multiply by weight. Sum. The highest score is not automatically the winner — it is the starting point for discussion.

```
| Criterion (weight) | React (score) | Vue (score) | Svelte (score) |
|--------------------------|---------------|-------------|----------------|
| Ecosystem Maturity (5) | 5 (25) | 4 (20) | 3 (15) |
| Team Expertise (4) | 5 (20) | 2 (8) | 1 (4) |
| Performance (3) | 3 (9) | 3 (9) | 5 (15) |
| Community (3) | 5 (15) | 4 (12) | 3 (9) |
| Licensing (2) | 5 (10) | 5 (10) | 5 (10) |
| Integration Fit (4) | 4 (16) | 4 (16) | 3 (12) |
| **Total** | **95** | **75** | **65** |
```

The matrix reveals where tradeoffs concentrate. In this example, Svelte wins on performance but loses on ecosystem and team expertise. The conversation is now: "Is the performance gain worth the ramp-up cost and ecosystem risk?"
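The score-and-sum arithmetic is mechanical enough to sketch in a few lines of TypeScript. The criterion names and numbers below mirror the React column of the example matrix; nothing here is a real library:

```typescript
// Hypothetical sketch of the weighted decision matrix: score (1-5) times
// weight (1-5), summed per candidate. Numbers mirror the React column above.
const weights: Record<string, number> = {
  ecosystem: 5, expertise: 4, performance: 3,
  community: 3, licensing: 2, integration: 4,
};

const reactScores: Record<string, number> = {
  ecosystem: 5, expertise: 5, performance: 3,
  community: 5, licensing: 5, integration: 4,
};

function weightedTotal(scores: Record<string, number>): number {
  // Sum of weight * score over every criterion.
  return Object.entries(weights)
    .reduce((sum, [criterion, w]) => sum + w * scores[criterion], 0);
}

console.log(weightedTotal(reactScores)); // 95, matching the matrix total
```

Keeping the weights in one shared object is deliberate: every candidate is scored against the same weights, which is what makes the totals comparable.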
### Category-Specific Evaluation

#### Frontend Frameworks

Key discriminators: bundle size, SSR support, routing model, state management ecosystem, TypeScript support quality, component library availability, build tooling maturity.

**React**: Largest ecosystem, most hiring options, most third-party libraries. Risk: meta-framework churn (Next.js vs Remix vs others). Best when: team knows React, project needs rich component library ecosystem.

**Vue**: Batteries-included official ecosystem (Vue Router, Pinia, Vite). Gentler learning curve. Smaller hiring pool in US/UK, larger in Asia-Pacific. Best when: team is learning frontend, project benefits from cohesive tooling.

**Svelte/SvelteKit**: Best runtime performance, smallest bundles, compiler-based approach. Smaller ecosystem, fewer battle-tested libraries. Best when: performance is critical, team is small and adaptable.

#### Backend Frameworks

Key discriminators: request throughput, cold start time, ORM/database tooling, middleware ecosystem, deployment model compatibility, type safety.

**Node.js (Express/Fastify/Hono)**: Same language as frontend, huge npm ecosystem, excellent serverless support. Risk: callback/async complexity at scale, single-threaded CPU bottlenecks. Best when: team is JavaScript-native, workload is I/O-bound.

**Python (FastAPI/Django)**: Strong ML/data ecosystem, excellent type hints (FastAPI), batteries-included admin (Django). Risk: GIL for CPU-bound work, slower raw throughput. Best when: project involves data processing/ML, team is Python-native.

**Go**: Excellent concurrency, fast compilation, small binaries, low memory footprint. Risk: verbose error handling, less expressive type system, smaller web framework ecosystem. Best when: high-concurrency services, CLI tools, infrastructure software.

#### Database Selection

Key discriminators: data model fit, query patterns, scalability model, operational complexity, backup/restore tooling, managed service availability.

**PostgreSQL**: Default choice for relational data. JSON support bridges document needs. Rich extension ecosystem (PostGIS, pgvector, TimescaleDB). Risk: horizontal scaling requires careful planning. Best when: data is relational, you need ACID guarantees, you want one database.

**SQLite**: Zero-ops, embedded, surprisingly capable for read-heavy workloads. Litestream for replication. Risk: single-writer limitation, no built-in network access. Best when: single-server deployment, edge/embedded, development/testing.

**MongoDB**: True document model, flexible schema, built-in horizontal scaling (sharding). Risk: limited join support pushes denormalization complexity into the application; consistency guarantees depend on read/write concern configuration. Best when: data is genuinely document-shaped, schema evolves rapidly, write-heavy workload.
#### Infrastructure & Deployment

Key discriminators: operational burden, cost model, scaling characteristics, vendor lock-in degree, team DevOps expertise.

**Serverless (Lambda/Cloud Functions)**: Zero idle cost, automatic scaling, no server management. Risk: cold starts, vendor lock-in, debugging complexity, execution time limits. Best when: unpredictable traffic, many small functions, cost-sensitive.

**Containers (ECS/Cloud Run/Fly.io)**: Portable, predictable performance, good local development parity. Risk: orchestration complexity (if self-managed), persistent storage challenges. Best when: consistent workloads, need local dev parity, multi-cloud possible.

**PaaS (Railway/Render/Vercel)**: Fastest time to deploy, managed everything. Risk: cost at scale, limited customization, vendor-specific features. Best when: small team, prototype/MVP, standard web application architecture.

### Common Anti-Patterns

#### Resume-Driven Development

**Pattern**: Choosing technologies because the team wants to learn them, not because they fit the project.
**Signal**: "Let's use Kubernetes" for a single-server app. "Let's rewrite in Rust" for a CRUD API.
**Mitigation**: The decision matrix forces explicit scoring. If a technology wins only on "fun to learn," the matrix will show it.

#### Hype-Driven Development

**Pattern**: Choosing technologies because they are trending on Hacker News or have impressive benchmarks.
**Signal**: Citing benchmarks without mapping them to actual workload characteristics. "X is 10x faster than Y" without asking "do we need that speed?"
**Mitigation**: Require a concrete performance requirement before performance can be weighted heavily.

#### Ignoring Team Skills

**Pattern**: Choosing the "best" technology without accounting for team proficiency.
**Signal**: Picking Go for a team of Python developers because "Go is faster." The 6-month ramp-up and initial low-quality Go code will cost more than Python's slower runtime.
**Mitigation**: Weight team expertise appropriately. A 20% performance gain is rarely worth a 200% productivity loss during ramp-up.

#### Premature Vendor Lock-In

**Pattern**: Building on vendor-specific services without an abstraction layer, making migration prohibitively expensive.
**Signal**: Direct use of DynamoDB-specific APIs throughout business logic. Lambda-specific handler signatures in core code.
**Mitigation**: Score "portability" as part of integration fit. Use repository/adapter patterns for external services.

### Migration Cost Assessment

When evaluating a technology change mid-project, assess migration cost across five dimensions:

1. **Code rewrite volume** — What percentage of the codebase must change? API boundaries, data models, business logic, or just infrastructure wrappers?
2. **Data migration complexity** — Schema changes, data transformation, downtime requirements, rollback capability.
3. **Team retraining** — How long until the team is productive in the new technology? Count weeks, not days.
4. **Integration surface** — How many external systems connect to the component being replaced? Each integration point is a migration risk.
5. **Rollback plan** — Can you run old and new in parallel? Can you revert if the migration fails? If not, the risk multiplier is high.

A migration is justified when: the current technology is end-of-life, the current technology cannot meet a hard requirement, or the migration cost is less than the ongoing maintenance cost of staying.
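One way to make the last condition concrete is a break-even comparison over an explicit planning horizon. The sketch below is purely illustrative — the field names, person-week units, and example figures are invented, and real estimates should come from the five-dimension assessment above:

```typescript
// Hypothetical break-even sketch: migration pays off when its one-time cost
// is recovered by reduced ongoing maintenance within the planning horizon.
interface MigrationCase {
  oneTimeMigrationCost: number;      // person-weeks: rewrite, retraining, data migration
  currentMonthlyMaintenance: number; // person-weeks/month spent on the old stack
  futureMonthlyMaintenance: number;  // person-weeks/month expected on the new stack
  horizonMonths: number;             // how far ahead the team is willing to plan
}

function migrationJustified(c: MigrationCase): boolean {
  const monthlySavings = c.currentMonthlyMaintenance - c.futureMonthlyMaintenance;
  if (monthlySavings <= 0) return false; // no ongoing savings, no business case
  return c.oneTimeMigrationCost < monthlySavings * c.horizonMonths;
}

// 24 weeks of migration work vs. saving 1.5 person-weeks/month over 2 years:
const justified = migrationJustified({
  oneTimeMigrationCost: 24,
  currentMonthlyMaintenance: 2,
  futureMonthlyMaintenance: 0.5,
  horizonMonths: 24,
}); // true: 24 < 1.5 * 24 = 36
```

The short-circuit on non-positive savings mirrors the prose: if the new stack does not actually reduce ongoing cost (or meet a hard requirement), no horizon makes the migration pay off.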
168
+
169
+ ### Vendor Lock-In Evaluation
170
+
171
+ Rate lock-in risk on a scale:
172
+
173
+ | Level | Description | Example | Exit Cost |
174
+ |-------|-------------|---------|-----------|
175
+ | **None** | Standard interface, multiple providers | PostgreSQL, S3-compatible storage | Low |
176
+ | **Low** | Portable with adapter work | Redis (managed vs self-hosted) | Medium |
177
+ | **Medium** | Significant API surface to abstract | Firebase Auth, Stripe Billing | High |
178
+ | **High** | Deep integration, no portable equivalent | DynamoDB single-table design, Vercel Edge Config | Very High |
179
+ | **Total** | No alternative exists | Apple Push Notifications, platform-specific APIs | Impossible |
180
+
181
+ For each dependency, document the lock-in level in `docs/tech-stack.md`. When lock-in is Medium or higher, require an abstraction layer (repository pattern, adapter interface) that isolates vendor-specific code.
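As a sketch of that abstraction layer: business logic depends on a narrow interface, and the vendor-specific adapter lives at the edge of the system. The `ObjectStore` interface and in-memory stand-in below are illustrative, not a real SDK — a production adapter would wrap the actual vendor client behind the same interface:

```typescript
// Hypothetical port/adapter sketch: core code sees only ObjectStore, so a
// Medium-or-higher lock-in vendor can be swapped by writing a new adapter.
interface ObjectStore {
  put(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | undefined>;
}

// A vendor adapter (e.g. one wrapping an S3 or DynamoDB client) would satisfy
// the same interface; an in-memory stand-in keeps this example self-contained.
class InMemoryStore implements ObjectStore {
  private data = new Map<string, string>();
  async put(key: string, value: string): Promise<void> {
    this.data.set(key, value);
  }
  async get(key: string): Promise<string | undefined> {
    return this.data.get(key);
  }
}

// Business logic depends on the interface, never on a vendor SDK.
async function saveDecisionRecord(store: ObjectStore, id: string, body: string) {
  await store.put(`adr/${id}`, body);
}
```

The exit cost of a Medium or High dependency then shrinks to the cost of writing one new adapter, rather than a sweep through all business logic.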
### Decision Record Template

Every technology decision should produce a record:

```markdown
## Decision: [Technology Choice]

**Date**: YYYY-MM-DD
**Status**: Accepted | Superseded by [link]
**Deciders**: [Names]

### Context
What problem are we solving? What constraints exist?

### Options Considered
1. **[Option A]** — Brief description. Pros: ... Cons: ...
2. **[Option B]** — Brief description. Pros: ... Cons: ...
3. **[Option C]** — Brief description. Pros: ... Cons: ...

### Decision
We chose [Option X] because [primary reasons].

### Consequences
- Positive: [what we gain]
- Negative: [what we accept as tradeoffs]
- Neutral: [what doesn't change]

### Revisit Conditions
Revisit this decision if: [specific, measurable conditions]
```

This record prevents "nobody remembers why we chose X" six months later. It also prevents relitigating decisions without new information — if the conditions for revisiting haven't changed, the decision stands.