prjct-cli 1.21.0 → 1.22.0

package/CHANGELOG.md CHANGED
@@ -1,10 +1,107 @@
  # Changelog

+ ## [1.22.0] - 2026-02-10
+
+ ### Features
+
+ - add task-to-analysis feedback loop (PRJ-272) (#165)
+ - add task history array with FIFO eviction (PRJ-281) (#164)
+
+
+ ## [1.22.0] - 2026-02-10
+
+ ### Features
+
+ - **Task-to-analysis feedback loop** (PRJ-272): Tasks report discoveries back into analysis and agent generation
+   - TaskFeedbackSchema: stackConfirmed, patternsDiscovered, agentAccuracy (with rating enum), issuesEncountered
+   - Optional `feedback` field on TaskHistoryEntry for backward compatibility
+   - `getAggregatedFeedback()` consolidates patterns, stack confirmations, and issues across task history
+   - Recurring issues (2+ occurrences) automatically promoted to "known gotchas"
+   - Sync incorporates feedback: patterns populate analysis draft, gotchas become anti-patterns
+   - Agent generator injects "Recent Learnings" section into domain agents with patterns, gotchas, and accuracy notes
+   - Workflow `done()` accepts and passes feedback through to storage
+   - 22 new tests covering schema validation, persistence, aggregation, gotcha promotion, and backward compatibility (1020 total)
+
+ ### Implementation Details
+
+ Closes the knowledge loop between task execution and project analysis. Previously, discoveries made during tasks were lost when sessions ended. Now, structured feedback persists in task history and feeds into the next sync cycle.
+
+ **Data flow:** `p. done` (feedback captured) → `taskHistory[].feedback` → `p. sync` → `analysis.patterns` + `agents/*.md` "Recent Learnings"
+
+ **Modified modules:**
+ - `core/schemas/state.ts` — Added TaskFeedbackSchema, extended TaskHistoryEntrySchema with optional feedback field
+ - `core/storage/state-storage.ts` — completeTask() accepts feedback, createTaskHistoryEntry() attaches it, getAggregatedFeedback() provides read-side API, toMarkdown() shows feedback in context
+ - `core/commands/workflow.ts` — done() passes feedback through options to completeTask()
+ - `core/services/sync-service.ts` — saveDraftAnalysis() loads aggregated feedback, injectFeedbackSection() adds learnings to agents
+ - `core/services/agent-generator.ts` — generate() accepts TaskFeedbackContext, injectFeedbackSection() appends learnings to domain agents
+ - `core/__tests__/storage/state-storage-feedback.test.ts` — 22 comprehensive tests
+
+ ### Learnings
+
+ - **SyncService duplicates AgentGenerator:** Both have their own `generateDomainAgent()` — feedback injection needed in both places
+ - **Write-Through pattern:** All state flows JSON → MD → Event; feedback follows the same pattern
+ - **Backward compatibility via optional fields:** Adding `feedback?: TaskFeedback` to existing schema requires zero migration
+
+ ### Test Plan
+
+ #### For QA
+ 1. Complete a task with `p. done` — verify feedback stored in `taskHistory[0].feedback`
+ 2. Complete multiple tasks with same issue — verify gotcha promotion (2+ occurrences)
+ 3. Run `p. sync` after tasks with feedback — verify analysis draft has patterns
+ 4. Run `p. sync` with agent regeneration — verify "Recent Learnings" in domain agents
+ 5. Complete task WITHOUT feedback — verify backward compatibility
+ 6. Run `bun test` — all 1020 tests pass
+
+ #### For Users
+ **What changed:** Task discoveries now persist and improve future agent context automatically.
+ **How to use:** Automatic via `p. done` template. No user action required.
+ **Breaking changes:** None.
+
  ## [1.21.0] - 2026-02-10

  ### Features

  - add semantic verification for analysis results (PRJ-270) (#163)
+ - **Task history array** (PRJ-281): Replace single previousTask with bounded task history for pattern learning
+   - TaskHistoryEntry schema captures completed task metadata: title, classification, timestamps, subtasks, outcome, branch, Linear IDs
+   - Automatic history push on task completion with FIFO eviction (max 20 entries)
+   - Context injection: shows 3 recent same-type tasks when active, 5 recent when idle
+   - Accessor methods: getTaskHistory(), getMostRecentTask(), getTaskHistoryByType()
+   - Backward compatible: undefined taskHistory initializes as empty array
+   - Comprehensive test suite with 20 test cases (998 tests total pass)
+
+ ### Implementation Details
+
+ Replaced single previousTask field with bounded task history array to enable pattern learning and cross-task context for AI agents. When tasks complete, metadata is automatically captured and stored with FIFO eviction.
+
+ **Modified modules:**
+ - `core/schemas/state.ts` — Added TaskHistoryEntrySchema with 12 fields, updated StateJsonSchema, exported TaskHistoryEntry type, updated DEFAULT_STATE
+ - `core/storage/state-storage.ts` — Updated completeTask() to push history entries, added createTaskHistoryEntry() helper, added 3 accessor methods, updated toMarkdown() for context injection, updated getDefault()
+ - `core/__tests__/storage/state-storage-history.test.ts` (468 lines) — 20 comprehensive tests covering push, eviction, backward compatibility, accessors, and context injection
+ - `README.md` — Added Task History section with usage documentation
+ - `CHANGELOG.md` — Documented task history feature
+
+ ### Learnings
+
+ - **Schema-first design:** Define Zod schemas before implementation ensures type safety and validation at runtime
+ - **Type assertions for extended properties:** Use `taskAny = task as any` to access properties not in CurrentTask schema (type, branch, parentDescription)
+ - **Context injection in toMarkdown():** The state-storage toMarkdown() method is where context is generated, not context-builder.ts
+ - **pathManager mocking for test isolation:** Mock getGlobalProjectPath, getStoragePath, getFilePath to use temp directories in tests
+ - **FIFO over LRU:** Simpler implementation with predictable behavior for bounded history
+
+ ### Test Plan
+
+ #### For QA
+ 1. Complete a task with `p. done` — verify taskHistory entry appears in state.json with all metadata fields
+ 2. Complete 25+ tasks — verify only 20 entries remain (oldest dropped)
+ 3. Start a bug task — verify context markdown shows recent bug tasks only (not features)
+ 4. Test with existing state.json missing taskHistory field — verify backward compatibility
+ 5. Verify accessor methods return correct data: getTaskHistory(), getMostRecentTask(), getTaskHistoryByType()
+
+ #### For Users
+ **What changed:** Completed tasks are now tracked in a history array (max 20) instead of only storing the last paused task
+ **How to use:** No action needed — task history is automatic on `p. done`
+ **Breaking changes:** None — fully backward compatible


  ## [1.20.0] - 2026-02-10
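The bounded history described in the 1.21.0 entry can be pictured with a small sketch. It is only an illustration of FIFO eviction under stated assumptions: `HistoryEntry`, `MAX_HISTORY`, and `pushHistory` are hypothetical names, not identifiers from the package. Only the 20-entry cap, the newest-first order, and the "missing `taskHistory` becomes an empty array" behavior come from the changelog and README text.

```typescript
// Hypothetical, simplified entry shape; the real schema is described as having 12 fields.
interface HistoryEntry {
  title: string
  completedAt: string
}

// Cap taken from the changelog ("max 20 entries").
const MAX_HISTORY = 20

// Put the new entry at the front and drop whatever falls off the end (the oldest entries).
function pushHistory(history: HistoryEntry[] | undefined, entry: HistoryEntry): HistoryEntry[] {
  // A state file written before this release has no taskHistory; treat that as empty.
  const existing = history ?? []
  return [entry, ...existing].slice(0, MAX_HISTORY)
}

// Usage: complete 25 tasks, starting from a state with no history field.
let history = pushHistory(undefined, { title: 'Task 1', completedAt: new Date().toISOString() })
for (let i = 2; i <= 25; i++) {
  history = pushHistory(history, { title: `Task ${i}`, completedAt: new Date().toISOString() })
}

console.log(history.length) // 20
console.log(history[0].title) // Task 25 (newest first)
console.log(history[19].title) // Task 6 (Tasks 1 to 5 were evicted)
```

This mirrors QA step 2 above: after 25 completions only 20 entries remain, and the five oldest are gone.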
package/README.md CHANGED
@@ -123,6 +123,47 @@ All agents share the same project storage, so you can switch between them freely
  | `p. linear` | - | Linear integration |
  | `p. github` | - | GitHub Issues integration |

+ ## Task History
+
+ prjct automatically tracks your completed tasks to help AI agents learn from patterns and make better decisions across sessions.
+
+ ### How It Works
+
+ When you complete a task (`p. done` / `/done`), prjct stores:
+ - Task description and classification (feature, bug, improvement, chore)
+ - Start and completion timestamps
+ - Number of subtasks and their summaries
+ - Git branch name and Linear issue ID (if linked)
+ - PR URL (if shipped)
+
+ This history is:
+ - **Bounded**: Maximum 20 entries with FIFO (First-In-First-Out) eviction
+ - **Contextual**: Filtered by task type when starting similar work
+ - **Persistent**: Survives across sessions and agent types
+
+ ### Context Injection
+
+ Task history is automatically injected into the AI agent's context:
+ - When **starting a task**: Shows 3 most recent tasks of the same type (e.g., recent bug fixes when starting a new bug)
+ - When **idle**: Shows 5 most recent tasks across all types
+ - **Purpose**: Helps agents identify patterns, avoid repeating mistakes, and build on previous solutions
+
+ ### Accessor Methods (for developers)
+
+ ```typescript
+ import { stateStorage } from './storage/state-storage'
+
+ // Get full task history (max 20 entries, newest first)
+ const history = await stateStorage.getTaskHistory(projectId)
+
+ // Get most recent completed task
+ const recent = await stateStorage.getMostRecentTask(projectId)
+
+ // Get tasks by classification
+ const bugs = await stateStorage.getTaskHistoryByType(projectId, 'bug')
+ const features = await stateStorage.getTaskHistoryByType(projectId, 'feature')
+ ```
+
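The context-injection rule in the README section above (3 recent same-type tasks while a task is active, 5 recent tasks when idle) could be sketched roughly as follows. `selectContextTasks` and the trimmed `HistoryEntry` shape are hypothetical and exist only for this illustration. The diff says the actual selection happens inside `toMarkdown()` in `state-storage.ts`, whose source is not shown here.

```typescript
// Classification values as listed in the README text.
type TaskType = 'feature' | 'bug' | 'improvement' | 'chore'

// Hypothetical, trimmed-down entry shape for the example.
interface HistoryEntry {
  title: string
  classification: TaskType
}

// Assumes `history` is ordered newest first, as the README comment states.
function selectContextTasks(history: HistoryEntry[], activeType?: TaskType): HistoryEntry[] {
  if (activeType === undefined) {
    // Idle: the 5 most recent tasks across all types.
    return history.slice(0, 5)
  }
  // Active task: the 3 most recent tasks of the same type.
  return history.filter((t) => t.classification === activeType).slice(0, 3)
}

const sampleHistory: HistoryEntry[] = [
  { title: 'Fix login redirect', classification: 'bug' },
  { title: 'Add export button', classification: 'feature' },
  { title: 'Fix date parsing', classification: 'bug' },
  { title: 'Bump dependencies', classification: 'chore' },
  { title: 'Fix flaky test', classification: 'bug' },
  { title: 'Fix typo in banner', classification: 'bug' },
]

const bugContext = selectContextTasks(sampleHistory, 'bug').map((t) => t.title)
console.log(bugContext) // [ 'Fix login redirect', 'Fix date parsing', 'Fix flaky test' ]
console.log(selectContextTasks(sampleHistory).length) // 5
```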
  ## CLI Commands

  ```bash
@@ -0,0 +1,463 @@
+ /**
+  * State Storage Task Feedback Tests (PRJ-272)
+  *
+  * Tests for the task-to-analysis feedback loop:
+  * - Feedback schema validation
+  * - Feedback persistence in task history
+  * - Feedback aggregation across tasks
+  * - Known gotchas promotion (2+ occurrences)
+  * - Backward compatibility (tasks without feedback)
+  * - Context injection (markdown with feedback)
+  */
+
+ import { afterEach, beforeEach, describe, expect, it } from 'bun:test'
+ import fs from 'node:fs/promises'
+ import os from 'node:os'
+ import path from 'node:path'
+ import pathManager from '../../infrastructure/path-manager'
+ import type { CurrentTask, StateJson, TaskFeedback } from '../../schemas/state'
+ import { TaskFeedbackSchema } from '../../schemas/state'
+ import { prjctDb } from '../../storage/database'
+ import { stateStorage } from '../../storage/state-storage'
+
+ // =============================================================================
+ // Test Setup
+ // =============================================================================
+
+ let tmpRoot: string | null = null
+ let testProjectId: string
+
+ const originalGetGlobalProjectPath = pathManager.getGlobalProjectPath.bind(pathManager)
+ const originalGetStoragePath = pathManager.getStoragePath.bind(pathManager)
+ const originalGetFilePath = pathManager.getFilePath.bind(pathManager)
+
+ beforeEach(async () => {
+   tmpRoot = await fs.mkdtemp(path.join(os.tmpdir(), 'prjct-feedback-test-'))
+   testProjectId = `test-feedback-${Date.now()}`
+
+   pathManager.getGlobalProjectPath = (projectId: string) => {
+     return path.join(tmpRoot!, projectId)
+   }
+
+   pathManager.getStoragePath = (projectId: string, filename: string) => {
+     return path.join(tmpRoot!, projectId, 'storage', filename)
+   }
+
+   pathManager.getFilePath = (projectId: string, layer: string, filename: string) => {
+     return path.join(tmpRoot!, projectId, layer, filename)
+   }
+
+   const storagePath = pathManager.getStoragePath(testProjectId, '')
+   await fs.mkdir(storagePath, { recursive: true })
+
+   const syncPath = path.join(tmpRoot!, testProjectId, 'sync')
+   await fs.mkdir(syncPath, { recursive: true })
+ })
+
+ afterEach(async () => {
+   prjctDb.close()
+
+   pathManager.getGlobalProjectPath = originalGetGlobalProjectPath
+   pathManager.getStoragePath = originalGetStoragePath
+   pathManager.getFilePath = originalGetFilePath
+
+   if (tmpRoot) {
+     await fs.rm(tmpRoot, { recursive: true, force: true })
+     tmpRoot = null
+   }
+ })
+
+ // =============================================================================
+ // Helper Functions
+ // =============================================================================
+
+ function createMockTask(
+   overrides: Partial<CurrentTask> & Record<string, unknown> = {}
+ ): CurrentTask {
+   return {
+     id: `task-${Date.now()}-${Math.random().toString(36).slice(2, 6)}`,
+     description: 'Test task',
+     startedAt: new Date().toISOString(),
+     sessionId: `session-${Date.now()}`,
+     ...overrides,
+   } as CurrentTask
+ }
+
+ async function startAndCompleteWithFeedback(
+   projectId: string,
+   task: CurrentTask,
+   feedback?: TaskFeedback
+ ): Promise<StateJson> {
+   await stateStorage.startTask(projectId, task)
+   await stateStorage.completeTask(projectId, feedback)
+   return await stateStorage.read(projectId)
+ }
+
+ // =============================================================================
+ // Tests: TaskFeedback Schema Validation
+ // =============================================================================
+
+ describe('TaskFeedback Schema', () => {
+   it('should validate a complete feedback object', () => {
+     const feedback: TaskFeedback = {
+       stackConfirmed: ['React 18', 'TypeScript strict mode'],
+       patternsDiscovered: ['API routes follow /api/v1/{resource} pattern'],
+       agentAccuracy: [{ agent: 'backend.md', rating: 'helpful', note: 'Good API patterns' }],
+       issuesEncountered: ['ESLint conflicts with Prettier'],
+     }
+
+     const result = TaskFeedbackSchema.safeParse(feedback)
+     expect(result.success).toBe(true)
+   })
+
+   it('should validate an empty feedback object', () => {
+     const feedback: TaskFeedback = {}
+
+     const result = TaskFeedbackSchema.safeParse(feedback)
+     expect(result.success).toBe(true)
+   })
+
+   it('should validate feedback with only patterns', () => {
+     const feedback: TaskFeedback = {
+       patternsDiscovered: ['Components use barrel exports'],
+     }
+
+     const result = TaskFeedbackSchema.safeParse(feedback)
+     expect(result.success).toBe(true)
+   })
+
+   it('should reject invalid agent accuracy rating', () => {
+     const feedback = {
+       agentAccuracy: [{ agent: 'backend.md', rating: 'invalid_rating' }],
+     }
+
+     const result = TaskFeedbackSchema.safeParse(feedback)
+     expect(result.success).toBe(false)
+   })
+
+   it('should validate all agent accuracy rating values', () => {
+     for (const rating of ['helpful', 'neutral', 'inaccurate'] as const) {
+       const feedback: TaskFeedback = {
+         agentAccuracy: [{ agent: 'test.md', rating }],
+       }
+       const result = TaskFeedbackSchema.safeParse(feedback)
+       expect(result.success).toBe(true)
+     }
+   })
+ })
+
+ // =============================================================================
+ // Tests: Feedback Persistence in Task History
+ // =============================================================================
+
+ describe('Feedback Persistence', () => {
+   it('should store feedback in task history entry', async () => {
+     const task = createMockTask({ description: 'Task with feedback' })
+     const feedback: TaskFeedback = {
+       stackConfirmed: ['TypeScript'],
+       patternsDiscovered: ['Uses Hono framework'],
+     }
+
+     const state = await startAndCompleteWithFeedback(testProjectId, task, feedback)
+
+     expect(state.taskHistory).toBeDefined()
+     expect(state.taskHistory!.length).toBe(1)
+     expect(state.taskHistory![0].feedback).toBeDefined()
+     expect(state.taskHistory![0].feedback?.stackConfirmed).toEqual(['TypeScript'])
+     expect(state.taskHistory![0].feedback?.patternsDiscovered).toEqual(['Uses Hono framework'])
+   })
+
+   it('should store task without feedback (backward compatible)', async () => {
+     const task = createMockTask({ description: 'Task without feedback' })
+
+     const state = await startAndCompleteWithFeedback(testProjectId, task)
+
+     expect(state.taskHistory).toBeDefined()
+     expect(state.taskHistory!.length).toBe(1)
+     expect(state.taskHistory![0].feedback).toBeUndefined()
+   })
+
+   it('should preserve feedback through FIFO eviction', async () => {
+     // Complete first task with feedback
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern from task 1'],
+     })
+
+     // Complete second task with feedback
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern from task 2'],
+     })
+
+     const state = await stateStorage.read(testProjectId)
+     expect(state.taskHistory!.length).toBe(2)
+     // Most recent first (FIFO)
+     expect(state.taskHistory![0].feedback?.patternsDiscovered).toEqual(['Pattern from task 2'])
+     expect(state.taskHistory![1].feedback?.patternsDiscovered).toEqual(['Pattern from task 1'])
+   })
+
+   it('should store full feedback with all fields', async () => {
+     const task = createMockTask({ description: 'Full feedback task' })
+     const feedback: TaskFeedback = {
+       stackConfirmed: ['React 18', 'TypeScript'],
+       patternsDiscovered: ['API routes use /api/v1/{resource}', 'Barrel exports'],
+       agentAccuracy: [
+         { agent: 'backend.md', rating: 'helpful', note: 'Good patterns' },
+         { agent: 'frontend.md', rating: 'inaccurate', note: 'Missing Tailwind' },
+       ],
+       issuesEncountered: ['ESLint conflicts with Prettier'],
+     }
+
+     const state = await startAndCompleteWithFeedback(testProjectId, task, feedback)
+
+     const stored = state.taskHistory![0].feedback!
+     expect(stored.stackConfirmed).toEqual(['React 18', 'TypeScript'])
+     expect(stored.patternsDiscovered).toEqual([
+       'API routes use /api/v1/{resource}',
+       'Barrel exports',
+     ])
+     expect(stored.agentAccuracy).toHaveLength(2)
+     expect(stored.agentAccuracy![0].rating).toBe('helpful')
+     expect(stored.agentAccuracy![1].rating).toBe('inaccurate')
+     expect(stored.issuesEncountered).toEqual(['ESLint conflicts with Prettier'])
+   })
+ })
+
+ // =============================================================================
+ // Tests: Feedback Aggregation
+ // =============================================================================
+
+ describe('Feedback Aggregation', () => {
+   it('should aggregate patterns from multiple tasks', async () => {
+     // Task 1
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern A'],
+     })
+
+     // Task 2
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern B'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.patternsDiscovered).toContain('Pattern A')
+     expect(aggregated.patternsDiscovered).toContain('Pattern B')
+   })
+
+   it('should deduplicate patterns', async () => {
+     // Both tasks discover the same pattern
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Same pattern'],
+     })
+
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Same pattern'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.patternsDiscovered).toEqual(['Same pattern'])
+   })
+
+   it('should deduplicate stack confirmations', async () => {
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       stackConfirmed: ['TypeScript', 'React'],
+     })
+
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       stackConfirmed: ['TypeScript', 'Next.js'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.stackConfirmed).toContain('TypeScript')
+     expect(aggregated.stackConfirmed).toContain('React')
+     expect(aggregated.stackConfirmed).toContain('Next.js')
+     // TypeScript should not be duplicated
+     expect(aggregated.stackConfirmed.filter((s) => s === 'TypeScript')).toHaveLength(1)
+   })
+
+   it('should promote recurring issues to known gotchas', async () => {
+     // Same issue encountered twice
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['ESLint conflicts with Prettier'],
+     })
+
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['ESLint conflicts with Prettier'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.knownGotchas).toContain('ESLint conflicts with Prettier')
+   })
+
+   it('should NOT promote single-occurrence issues to gotchas', async () => {
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['One-time issue'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.issuesEncountered).toContain('One-time issue')
+     expect(aggregated.knownGotchas).not.toContain('One-time issue')
+   })
+
+   it('should aggregate agent accuracy across tasks', async () => {
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       agentAccuracy: [{ agent: 'backend.md', rating: 'helpful' }],
+     })
+
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId, {
+       agentAccuracy: [{ agent: 'backend.md', rating: 'inaccurate', note: 'Missing Hono context' }],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.agentAccuracy).toHaveLength(2)
+     expect(aggregated.agentAccuracy[0].agent).toBe('backend.md')
+     expect(aggregated.agentAccuracy[1].agent).toBe('backend.md')
+   })
+
+   it('should return empty aggregation when no feedback exists', async () => {
+     // Complete task without feedback
+     const task = createMockTask({ description: 'No feedback' })
+     await stateStorage.startTask(testProjectId, task)
+     await stateStorage.completeTask(testProjectId)
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.stackConfirmed).toEqual([])
+     expect(aggregated.patternsDiscovered).toEqual([])
+     expect(aggregated.agentAccuracy).toEqual([])
+     expect(aggregated.issuesEncountered).toEqual([])
+     expect(aggregated.knownGotchas).toEqual([])
+   })
+
+   it('should return empty aggregation when no tasks exist', async () => {
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.stackConfirmed).toEqual([])
+     expect(aggregated.patternsDiscovered).toEqual([])
+     expect(aggregated.knownGotchas).toEqual([])
+   })
+ })
+
+ // =============================================================================
+ // Tests: Context Injection (Markdown Generation)
+ // =============================================================================
+
+ describe('Feedback in Markdown Context', () => {
+   it('should include patterns in task history markdown', async () => {
+     const task = createMockTask({ description: 'Task with patterns' })
+     await stateStorage.startTask(testProjectId, task)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Uses barrel exports'],
+     })
+
+     const state = await stateStorage.read(testProjectId)
+     // Access the private toMarkdown via the generated context
+     const md = (stateStorage as any).toMarkdown(state)
+
+     expect(md).toContain('Patterns: Uses barrel exports')
+   })
+
+   it('should include gotchas in task history markdown', async () => {
+     const task = createMockTask({ description: 'Task with gotchas' })
+     await stateStorage.startTask(testProjectId, task)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['Port 3000 already in use'],
+     })
+
+     const state = await stateStorage.read(testProjectId)
+     const md = (stateStorage as any).toMarkdown(state)
+
+     expect(md).toContain('Gotchas: Port 3000 already in use')
+   })
+
+   it('should NOT show feedback section when feedback is empty', async () => {
+     const task = createMockTask({ description: 'Task without feedback' })
+     await stateStorage.startTask(testProjectId, task)
+     await stateStorage.completeTask(testProjectId)
+
+     const state = await stateStorage.read(testProjectId)
+     const md = (stateStorage as any).toMarkdown(state)
+
+     expect(md).not.toContain('Patterns:')
+     expect(md).not.toContain('Gotchas:')
+   })
+ })
+
+ // =============================================================================
+ // Tests: Mixed Feedback and Non-Feedback Tasks
+ // =============================================================================
+
+ describe('Mixed Tasks (with and without feedback)', () => {
+   it('should handle mix of tasks with and without feedback', async () => {
+     // Task 1: with feedback
+     const task1 = createMockTask({ description: 'With feedback' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern from task 1'],
+     })
+
+     // Task 2: without feedback
+     const task2 = createMockTask({ description: 'Without feedback' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId)
+
+     // Task 3: with feedback
+     const task3 = createMockTask({ description: 'With feedback again' })
+     await stateStorage.startTask(testProjectId, task3)
+     await stateStorage.completeTask(testProjectId, {
+       patternsDiscovered: ['Pattern from task 3'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.patternsDiscovered).toContain('Pattern from task 1')
+     expect(aggregated.patternsDiscovered).toContain('Pattern from task 3')
+     expect(aggregated.patternsDiscovered).toHaveLength(2)
+   })
+
+   it('should correctly count occurrences for gotcha promotion with mixed tasks', async () => {
+     // Task 1: encounters issue
+     const task1 = createMockTask({ description: 'Task 1' })
+     await stateStorage.startTask(testProjectId, task1)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['Build fails on M1'],
+     })
+
+     // Task 2: no feedback
+     const task2 = createMockTask({ description: 'Task 2' })
+     await stateStorage.startTask(testProjectId, task2)
+     await stateStorage.completeTask(testProjectId)
+
+     // Task 3: encounters same issue
+     const task3 = createMockTask({ description: 'Task 3' })
+     await stateStorage.startTask(testProjectId, task3)
+     await stateStorage.completeTask(testProjectId, {
+       issuesEncountered: ['Build fails on M1'],
+     })
+
+     const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
+     expect(aggregated.knownGotchas).toContain('Build fails on M1')
+   })
+ })
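The behavior these tests pin down for `getAggregatedFeedback()` (deduplicated patterns and stack confirmations, concatenated agent accuracy, and issues promoted to gotchas at 2+ occurrences) can be approximated in a standalone sketch. This is not the package's implementation, which is not part of this diff. `aggregateFeedback` is a hypothetical pure function over in-memory entries, and the type shapes are inferred from the test fixtures above.

```typescript
type Rating = 'helpful' | 'neutral' | 'inaccurate'

// Shapes inferred from the test fixtures; the real Zod schemas may differ.
interface AgentAccuracy {
  agent: string
  rating: Rating
  note?: string
}

interface TaskFeedback {
  stackConfirmed?: string[]
  patternsDiscovered?: string[]
  agentAccuracy?: AgentAccuracy[]
  issuesEncountered?: string[]
}

interface AggregatedFeedback {
  stackConfirmed: string[]
  patternsDiscovered: string[]
  agentAccuracy: AgentAccuracy[]
  issuesEncountered: string[]
  knownGotchas: string[]
}

// Hypothetical helper: fold the optional feedback of each history entry into one summary.
function aggregateFeedback(entries: { feedback?: TaskFeedback }[]): AggregatedFeedback {
  const stack = new Set<string>()
  const patterns = new Set<string>()
  const issueCounts = new Map<string, number>()
  const agentAccuracy: AgentAccuracy[] = []

  for (const { feedback } of entries) {
    if (!feedback) continue // entries without feedback contribute nothing
    for (const s of feedback.stackConfirmed ?? []) stack.add(s)
    for (const p of feedback.patternsDiscovered ?? []) patterns.add(p)
    for (const issue of feedback.issuesEncountered ?? []) {
      issueCounts.set(issue, (issueCounts.get(issue) ?? 0) + 1)
    }
    agentAccuracy.push(...(feedback.agentAccuracy ?? []))
  }

  const counted = Array.from(issueCounts.entries())
  return {
    stackConfirmed: Array.from(stack),
    patternsDiscovered: Array.from(patterns),
    agentAccuracy,
    issuesEncountered: counted.map(([issue]) => issue),
    // Promotion threshold from the changelog: 2 or more occurrences.
    knownGotchas: counted.filter(([, count]) => count >= 2).map(([issue]) => issue),
  }
}

const aggregated = aggregateFeedback([
  { feedback: { issuesEncountered: ['Build fails on M1'], stackConfirmed: ['TypeScript'] } },
  {}, // task completed without feedback
  {
    feedback: {
      issuesEncountered: ['Build fails on M1', 'One-time issue'],
      stackConfirmed: ['TypeScript', 'React'],
    },
  },
])

console.log(aggregated.knownGotchas) // [ 'Build fails on M1' ]
console.log(aggregated.stackConfirmed) // [ 'TypeScript', 'React' ]
```

Counting occurrences in a map before filtering keeps the promotion rule in one place, so a different threshold would be a one-line change in this sketch.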