prjct-cli 1.21.0 → 1.22.0
- package/CHANGELOG.md +97 -0
- package/README.md +41 -0
- package/core/__tests__/storage/state-storage-feedback.test.ts +463 -0
- package/core/__tests__/storage/state-storage-history.test.ts +469 -0
- package/core/commands/workflow.ts +5 -2
- package/core/schemas/state.ts +43 -0
- package/core/services/agent-generator.ts +70 -1
- package/core/services/sync-service.ts +115 -4
- package/core/storage/state-storage.ts +190 -3
- package/dist/bin/prjct.mjs +256 -10
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -1,10 +1,107 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## [1.22.0] - 2026-02-10
|
|
4
|
+
|
|
5
|
+
### Features
|
|
6
|
+
|
|
7
|
+
- add task-to-analysis feedback loop (PRJ-272) (#165)
|
|
8
|
+
- add task history array with FIFO eviction (PRJ-281) (#164)
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
## [1.22.0] - 2026-02-10
|
|
12
|
+
|
|
13
|
+
### Features
|
|
14
|
+
|
|
15
|
+
- **Task-to-analysis feedback loop** (PRJ-272): Tasks report discoveries back into analysis and agent generation
|
|
16
|
+
- TaskFeedbackSchema: stackConfirmed, patternsDiscovered, agentAccuracy (with rating enum), issuesEncountered
|
|
17
|
+
- Optional `feedback` field on TaskHistoryEntry for backward compatibility
|
|
18
|
+
- `getAggregatedFeedback()` consolidates patterns, stack confirmations, and issues across task history
|
|
19
|
+
- Recurring issues (2+ occurrences) automatically promoted to "known gotchas"
|
|
20
|
+
- Sync incorporates feedback: patterns populate analysis draft, gotchas become anti-patterns
|
|
21
|
+
- Agent generator injects "Recent Learnings" section into domain agents with patterns, gotchas, and accuracy notes
|
|
22
|
+
- Workflow `done()` accepts and passes feedback through to storage
|
|
23
|
+
- 22 new tests covering schema validation, persistence, aggregation, gotcha promotion, and backward compatibility (1020 total)
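The aggregation and gotcha-promotion rules listed above can be sketched as follows. This is a minimal illustration, not the package's `getAggregatedFeedback()`: the type shapes, the `aggregateFeedback` name, and the `GOTCHA_THRESHOLD` constant are assumptions modeled on the field names in this changelog.

```typescript
// Hypothetical shapes, modeled on the field names in the changelog entry.
type Rating = 'helpful' | 'neutral' | 'inaccurate'

interface AgentAccuracy {
  agent: string
  rating: Rating
  note?: string
}

interface TaskFeedback {
  stackConfirmed?: string[]
  patternsDiscovered?: string[]
  agentAccuracy?: AgentAccuracy[]
  issuesEncountered?: string[]
}

interface AggregatedFeedback {
  stackConfirmed: string[]
  patternsDiscovered: string[]
  agentAccuracy: AgentAccuracy[]
  issuesEncountered: string[]
  knownGotchas: string[]
}

// Assumed threshold: an issue reported this many times becomes a gotcha.
const GOTCHA_THRESHOLD = 2

function aggregateFeedback(history: Array<{ feedback?: TaskFeedback }>): AggregatedFeedback {
  const stack = new Set<string>()
  const patterns = new Set<string>()
  const accuracy: AgentAccuracy[] = []
  const issueCounts = new Map<string, number>()

  for (const entry of history) {
    const fb = entry.feedback
    if (!fb) continue // entries without feedback are skipped, not treated as errors
    for (const s of fb.stackConfirmed ?? []) stack.add(s)
    for (const p of fb.patternsDiscovered ?? []) patterns.add(p)
    for (const a of fb.agentAccuracy ?? []) accuracy.push(a)
    for (const issue of fb.issuesEncountered ?? []) {
      issueCounts.set(issue, (issueCounts.get(issue) ?? 0) + 1)
    }
  }

  return {
    stackConfirmed: Array.from(stack), // Sets deduplicate repeated confirmations
    patternsDiscovered: Array.from(patterns),
    agentAccuracy: accuracy, // kept per report, not deduplicated
    issuesEncountered: Array.from(issueCounts.keys()),
    knownGotchas: Array.from(issueCounts.entries())
      .filter(([, count]) => count >= GOTCHA_THRESHOLD)
      .map(([issue]) => issue),
  }
}
```

Counting with a `Map` keeps single-occurrence issues visible in `issuesEncountered` while only recurring ones reach `knownGotchas`, which matches the behavior the new tests assert.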
|
|
24
|
+
|
|
25
|
+
### Implementation Details
|
|
26
|
+
|
|
27
|
+
Closes the knowledge loop between task execution and project analysis. Previously, discoveries made during tasks were lost when sessions ended. Now, structured feedback persists in task history and feeds into the next sync cycle.
|
|
28
|
+
|
|
29
|
+
**Data flow:** `p. done` (feedback captured) → `taskHistory[].feedback` → `p. sync` → `analysis.patterns` + `agents/*.md` "Recent Learnings"
|
|
30
|
+
|
|
31
|
+
**Modified modules:**
|
|
32
|
+
- `core/schemas/state.ts` — Added TaskFeedbackSchema, extended TaskHistoryEntrySchema with optional feedback field
|
|
33
|
+
- `core/storage/state-storage.ts` — completeTask() accepts feedback, createTaskHistoryEntry() attaches it, getAggregatedFeedback() provides read-side API, toMarkdown() shows feedback in context
|
|
34
|
+
- `core/commands/workflow.ts` — done() passes feedback through options to completeTask()
|
|
35
|
+
- `core/services/sync-service.ts` — saveDraftAnalysis() loads aggregated feedback, injectFeedbackSection() adds learnings to agents
|
|
36
|
+
- `core/services/agent-generator.ts` — generate() accepts TaskFeedbackContext, injectFeedbackSection() appends learnings to domain agents
|
|
37
|
+
- `core/__tests__/storage/state-storage-feedback.test.ts` — 22 comprehensive tests
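As a rough illustration of the last step in the data flow, the sketch below appends a "Recent Learnings" section to an agent's markdown. The real `injectFeedbackSection()` is not shown in this diff excerpt, so the `appendRecentLearnings` name, the `LearningsInput` shape, and the subsection headings are all assumptions.

```typescript
// Hypothetical input: already-aggregated feedback, reduced to plain strings.
interface LearningsInput {
  patterns: string[]
  gotchas: string[]
  accuracyNotes: string[]
}

// Render one subsection, or nothing when the list is empty.
function renderList(title: string, items: string[]): string[] {
  if (items.length === 0) return []
  return [`### ${title}`, ...items.map((item) => `- ${item}`), '']
}

function appendRecentLearnings(agentMarkdown: string, input: LearningsInput): string {
  const body = [
    ...renderList('Patterns', input.patterns),
    ...renderList('Known Gotchas', input.gotchas),
    ...renderList('Accuracy Notes', input.accuracyNotes),
  ]
  // Without feedback the agent file is returned untouched.
  if (body.length === 0) return agentMarkdown

  const base = agentMarkdown.replace(/\s+$/, '')
  const section = body.join('\n').replace(/\s+$/, '')
  return `${base}\n\n## Recent Learnings\n\n${section}\n`
}
```

Returning the input unchanged when there is nothing to report keeps agent files stable for projects that have no task feedback yet.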
|
|
38
|
+
|
|
39
|
+
### Learnings
|
|
40
|
+
|
|
41
|
+
- **SyncService duplicates AgentGenerator:** Both have their own `generateDomainAgent()` — feedback injection needed in both places
|
|
42
|
+
- **Write-Through pattern:** All state flows JSON → MD → Event; feedback follows the same pattern
|
|
43
|
+
- **Backward compatibility via optional fields:** Adding `feedback?: TaskFeedback` to existing schema requires zero migration
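To make the optional-field point concrete, here is a dependency-free sketch of a validator where every feedback field, and `feedback` itself, may be absent. The package defines its schemas with Zod; the hand-written guards and the `isTaskFeedback` / `isHistoryEntryLike` names below are hypothetical stand-ins.

```typescript
// Rating values taken from the tests in this release.
const RATINGS = ['helpful', 'neutral', 'inaccurate']

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === 'string')
}

// Every field is optional, so `{}` is valid feedback.
function isTaskFeedback(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false
  const fb = value as Record<string, unknown>

  for (const key of ['stackConfirmed', 'patternsDiscovered', 'issuesEncountered']) {
    if (fb[key] !== undefined && !isStringArray(fb[key])) return false
  }

  const accuracy = fb['agentAccuracy']
  if (accuracy !== undefined) {
    if (!Array.isArray(accuracy)) return false
    for (const item of accuracy) {
      if (typeof item !== 'object' || item === null) return false
      const entry = item as Record<string, unknown>
      if (typeof entry['agent'] !== 'string') return false
      if (typeof entry['rating'] !== 'string' || RATINGS.indexOf(entry['rating']) === -1) return false
      if (entry['note'] !== undefined && typeof entry['note'] !== 'string') return false
    }
  }
  return true
}

// A history entry written before this release has no `feedback` key and still passes.
function isHistoryEntryLike(value: unknown): boolean {
  if (typeof value !== 'object' || value === null) return false
  const feedback = (value as Record<string, unknown>)['feedback']
  return feedback === undefined || isTaskFeedback(feedback)
}
```

Because absence is valid at both levels, state files saved by earlier versions load without a migration step.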
|
|
44
|
+
|
|
45
|
+
### Test Plan
|
|
46
|
+
|
|
47
|
+
#### For QA
|
|
48
|
+
1. Complete a task with `p. done` — verify feedback stored in `taskHistory[0].feedback`
|
|
49
|
+
2. Complete multiple tasks with same issue — verify gotcha promotion (2+ occurrences)
|
|
50
|
+
3. Run `p. sync` after tasks with feedback — verify analysis draft has patterns
|
|
51
|
+
4. Run `p. sync` with agent regeneration — verify "Recent Learnings" in domain agents
|
|
52
|
+
5. Complete task WITHOUT feedback — verify backward compatibility
|
|
53
|
+
6. Run `bun test` — all 1020 tests pass
|
|
54
|
+
|
|
55
|
+
#### For Users
|
|
56
|
+
**What changed:** Task discoveries now persist and improve future agent context automatically.
|
|
57
|
+
**How to use:** Automatic via `p. done` template. No user action required.
|
|
58
|
+
**Breaking changes:** None.
|
|
59
|
+
|
|
3
60
|
## [1.21.0] - 2026-02-10
|
|
4
61
|
|
|
5
62
|
### Features
|
|
6
63
|
|
|
7
64
|
- add semantic verification for analysis results (PRJ-270) (#163)
|
|
65
|
+
- **Task history array** (PRJ-281): Replace single previousTask with bounded task history for pattern learning
|
|
66
|
+
- TaskHistoryEntry schema captures completed task metadata: title, classification, timestamps, subtasks, outcome, branch, Linear IDs
|
|
67
|
+
- Automatic history push on task completion with FIFO eviction (max 20 entries)
|
|
68
|
+
- Context injection: shows 3 recent same-type tasks when active, 5 recent when idle
|
|
69
|
+
- Accessor methods: getTaskHistory(), getMostRecentTask(), getTaskHistoryByType()
|
|
70
|
+
- Backward compatible: undefined taskHistory initializes as empty array
|
|
71
|
+
- Comprehensive test suite with 20 test cases (998 tests total pass)
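A bounded, newest-first history with FIFO eviction can be sketched in a few lines. The `pushHistory` helper and `MAX_HISTORY` constant are illustrative names, not the package's `createTaskHistoryEntry()` or `completeTask()` code; the cap of 20 and the newest-first ordering come from the changelog and README.

```typescript
// Cap stated in the changelog: at most 20 entries are kept.
const MAX_HISTORY = 20

// Prepend the new entry, then drop whatever falls past the cap (the oldest entries).
// An undefined history is treated as empty, mirroring the backward-compatible default.
function pushHistory<T>(history: T[] | undefined, entry: T, max: number = MAX_HISTORY): T[] {
  return [entry, ...(history ?? [])].slice(0, max)
}

// Usage: completing 25 tasks leaves the 20 most recent, newest first.
let history: number[] | undefined = undefined
for (let taskNumber = 0; taskNumber < 25; taskNumber++) {
  history = pushHistory(history, taskNumber)
}
```

The helper returns a new array rather than mutating its input, which keeps the eviction rule easy to test in isolation.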
|
|
72
|
+
|
|
73
|
+
### Implementation Details
|
|
74
|
+
|
|
75
|
+
Replaced single previousTask field with bounded task history array to enable pattern learning and cross-task context for AI agents. When tasks complete, metadata is automatically captured and stored with FIFO eviction.
|
|
76
|
+
|
|
77
|
+
**Modified modules:**
|
|
78
|
+
- `core/schemas/state.ts` — Added TaskHistoryEntrySchema with 12 fields, updated StateJsonSchema, exported TaskHistoryEntry type, updated DEFAULT_STATE
|
|
79
|
+
- `core/storage/state-storage.ts` — Updated completeTask() to push history entries, added createTaskHistoryEntry() helper, added 3 accessor methods, updated toMarkdown() for context injection, updated getDefault()
|
|
80
|
+
- `core/__tests__/storage/state-storage-history.test.ts` (468 lines) — 20 comprehensive tests covering push, eviction, backward compatibility, accessors, and context injection
|
|
81
|
+
- `README.md` — Added Task History section with usage documentation
|
|
82
|
+
- `CHANGELOG.md` — Documented task history feature
|
|
83
|
+
|
|
84
|
+
### Learnings
|
|
85
|
+
|
|
86
|
+
- **Schema-first design:** Defining Zod schemas before implementation ensures type safety and validation at runtime
|
|
87
|
+
- **Type assertions for extended properties:** Use `taskAny = task as any` to access properties not in CurrentTask schema (type, branch, parentDescription)
|
|
88
|
+
- **Context injection in toMarkdown():** The state-storage toMarkdown() method is where context is generated, not context-builder.ts
|
|
89
|
+
- **pathManager mocking for test isolation:** Mock getGlobalProjectPath, getStoragePath, getFilePath to use temp directories in tests
|
|
90
|
+
- **FIFO over LRU:** Simpler implementation with predictable behavior for bounded history
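The patch-and-restore idea behind the pathManager mocking can be illustrated with a small synchronous helper. The test suites in this release do the equivalent in `beforeEach`/`afterEach`; the `withPatched` helper and the `fakePathManager` object here are hypothetical and exist only for the example.

```typescript
// Temporarily replace one method on an object, run a callback, then restore it.
// This version is synchronous: an async callback would need the restore to wait
// for its promise, which is why the real tests restore in afterEach instead.
function withPatched<T extends object, K extends keyof T, R>(
  target: T,
  key: K,
  replacement: T[K],
  run: () => R
): R {
  const original = target[key]
  target[key] = replacement
  try {
    return run()
  } finally {
    target[key] = original // restored even if run() throws
  }
}

// Usage with a stand-in object; the real pathManager has more methods.
const fakePathManager = {
  getStoragePath: (projectId: string, filename: string) => `/real/${projectId}/storage/${filename}`,
}

const redirected = withPatched(
  fakePathManager,
  'getStoragePath',
  (projectId: string, filename: string) => `/tmp/test/${projectId}/storage/${filename}`,
  () => fakePathManager.getStoragePath('demo', 'state.json')
)
```

Restoring in a `finally` block means a failing assertion cannot leak the patched method into later tests.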
|
|
91
|
+
|
|
92
|
+
### Test Plan
|
|
93
|
+
|
|
94
|
+
#### For QA
|
|
95
|
+
1. Complete a task with `p. done` — verify taskHistory entry appears in state.json with all metadata fields
|
|
96
|
+
2. Complete 25+ tasks — verify only 20 entries remain (oldest dropped)
|
|
97
|
+
3. Start a bug task — verify context markdown shows recent bug tasks only (not features)
|
|
98
|
+
4. Test with existing state.json missing taskHistory field — verify backward compatibility
|
|
99
|
+
5. Verify accessor methods return correct data: getTaskHistory(), getMostRecentTask(), getTaskHistoryByType()
|
|
100
|
+
|
|
101
|
+
#### For Users
|
|
102
|
+
**What changed:** Completed tasks are now tracked in a history array (max 20) instead of only storing the last paused task
|
|
103
|
+
**How to use:** No action needed — task history is automatic on `p. done`
|
|
104
|
+
**Breaking changes:** None — fully backward compatible
|
|
8
105
|
|
|
9
106
|
|
|
10
107
|
## [1.20.0] - 2026-02-10
|
package/README.md
CHANGED
|
@@ -123,6 +123,47 @@ All agents share the same project storage, so you can switch between them freely
|
|
|
123
123
|
| `p. linear` | - | Linear integration |
|
|
124
124
|
| `p. github` | - | GitHub Issues integration |
|
|
125
125
|
|
|
126
|
+
## Task History
|
|
127
|
+
|
|
128
|
+
prjct automatically tracks your completed tasks to help AI agents learn from patterns and make better decisions across sessions.
|
|
129
|
+
|
|
130
|
+
### How It Works
|
|
131
|
+
|
|
132
|
+
When you complete a task (`p. done` / `/done`), prjct stores:
|
|
133
|
+
- Task description and classification (feature, bug, improvement, chore)
|
|
134
|
+
- Start and completion timestamps
|
|
135
|
+
- Number of subtasks and their summaries
|
|
136
|
+
- Git branch name and Linear issue ID (if linked)
|
|
137
|
+
- PR URL (if shipped)
|
|
138
|
+
|
|
139
|
+
This history is:
|
|
140
|
+
- **Bounded**: Maximum 20 entries with FIFO (First-In-First-Out) eviction
|
|
141
|
+
- **Contextual**: Filtered by task type when starting similar work
|
|
142
|
+
- **Persistent**: Survives across sessions and agent types
|
|
143
|
+
|
|
144
|
+
### Context Injection
|
|
145
|
+
|
|
146
|
+
Task history is automatically injected into the AI agent's context:
|
|
147
|
+
- When **starting a task**: Shows 3 most recent tasks of the same type (e.g., recent bug fixes when starting a new bug)
|
|
148
|
+
- When **idle**: Shows 5 most recent tasks across all types
|
|
149
|
+
- **Purpose**: Helps agents identify patterns, avoid repeating mistakes, and build on previous solutions
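The selection rule described above can be sketched as a small pure function. It assumes the history is already ordered newest first; the `HistoryEntry` shape and the `selectContextTasks` name are illustrative and are not taken from the package's `toMarkdown()` implementation.

```typescript
// Hypothetical, reduced entry shape: only the fields the rule needs.
interface HistoryEntry {
  title: string
  classification: string // e.g. 'feature', 'bug', 'improvement', 'chore'
}

const SAME_TYPE_LIMIT = 3 // shown while a task of that type is active
const IDLE_LIMIT = 5 // shown when no task is active

// `history` is assumed to be ordered newest first.
function selectContextTasks(history: HistoryEntry[], activeType?: string): HistoryEntry[] {
  if (activeType !== undefined) {
    return history.filter((task) => task.classification === activeType).slice(0, SAME_TYPE_LIMIT)
  }
  return history.slice(0, IDLE_LIMIT)
}
```

Filtering before slicing matters: slicing first could return fewer than three same-type tasks even when older matches exist further back in the history.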
|
|
150
|
+
|
|
151
|
+
### Accessor Methods (for developers)
|
|
152
|
+
|
|
153
|
+
```typescript
|
|
154
|
+
import { stateStorage } from './storage/state-storage'
|
|
155
|
+
|
|
156
|
+
// Get full task history (max 20 entries, newest first)
|
|
157
|
+
const history = await stateStorage.getTaskHistory(projectId)
|
|
158
|
+
|
|
159
|
+
// Get most recent completed task
|
|
160
|
+
const recent = await stateStorage.getMostRecentTask(projectId)
|
|
161
|
+
|
|
162
|
+
// Get tasks by classification
|
|
163
|
+
const bugs = await stateStorage.getTaskHistoryByType(projectId, 'bug')
|
|
164
|
+
const features = await stateStorage.getTaskHistoryByType(projectId, 'feature')
|
|
165
|
+
```
|
|
166
|
+
|
|
126
167
|
## CLI Commands
|
|
127
168
|
|
|
128
169
|
```bash
|
|
package/core/__tests__/storage/state-storage-feedback.test.ts
@@ -0,0 +1,463 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* State Storage Task Feedback Tests (PRJ-272)
|
|
3
|
+
*
|
|
4
|
+
* Tests for the task-to-analysis feedback loop:
|
|
5
|
+
* - Feedback schema validation
|
|
6
|
+
* - Feedback persistence in task history
|
|
7
|
+
* - Feedback aggregation across tasks
|
|
8
|
+
* - Known gotchas promotion (2+ occurrences)
|
|
9
|
+
* - Backward compatibility (tasks without feedback)
|
|
10
|
+
* - Context injection (markdown with feedback)
|
|
11
|
+
*/
|
|
12
|
+
|
|
13
|
+
import { afterEach, beforeEach, describe, expect, it } from 'bun:test'
|
|
14
|
+
import fs from 'node:fs/promises'
|
|
15
|
+
import os from 'node:os'
|
|
16
|
+
import path from 'node:path'
|
|
17
|
+
import pathManager from '../../infrastructure/path-manager'
|
|
18
|
+
import type { CurrentTask, StateJson, TaskFeedback } from '../../schemas/state'
|
|
19
|
+
import { TaskFeedbackSchema } from '../../schemas/state'
|
|
20
|
+
import { prjctDb } from '../../storage/database'
|
|
21
|
+
import { stateStorage } from '../../storage/state-storage'
|
|
22
|
+
|
|
23
|
+
// =============================================================================
|
|
24
|
+
// Test Setup
|
|
25
|
+
// =============================================================================
|
|
26
|
+
|
|
27
|
+
let tmpRoot: string | null = null
|
|
28
|
+
let testProjectId: string
|
|
29
|
+
|
|
30
|
+
const originalGetGlobalProjectPath = pathManager.getGlobalProjectPath.bind(pathManager)
|
|
31
|
+
const originalGetStoragePath = pathManager.getStoragePath.bind(pathManager)
|
|
32
|
+
const originalGetFilePath = pathManager.getFilePath.bind(pathManager)
|
|
33
|
+
|
|
34
|
+
beforeEach(async () => {
|
|
35
|
+
tmpRoot = await fs.mkdtemp(path.join(os.tmpdir(), 'prjct-feedback-test-'))
|
|
36
|
+
testProjectId = `test-feedback-${Date.now()}`
|
|
37
|
+
|
|
38
|
+
pathManager.getGlobalProjectPath = (projectId: string) => {
|
|
39
|
+
return path.join(tmpRoot!, projectId)
|
|
40
|
+
}
|
|
41
|
+
|
|
42
|
+
pathManager.getStoragePath = (projectId: string, filename: string) => {
|
|
43
|
+
return path.join(tmpRoot!, projectId, 'storage', filename)
|
|
44
|
+
}
|
|
45
|
+
|
|
46
|
+
pathManager.getFilePath = (projectId: string, layer: string, filename: string) => {
|
|
47
|
+
return path.join(tmpRoot!, projectId, layer, filename)
|
|
48
|
+
}
|
|
49
|
+
|
|
50
|
+
const storagePath = pathManager.getStoragePath(testProjectId, '')
|
|
51
|
+
await fs.mkdir(storagePath, { recursive: true })
|
|
52
|
+
|
|
53
|
+
const syncPath = path.join(tmpRoot!, testProjectId, 'sync')
|
|
54
|
+
await fs.mkdir(syncPath, { recursive: true })
|
|
55
|
+
})
|
|
56
|
+
|
|
57
|
+
afterEach(async () => {
|
|
58
|
+
prjctDb.close()
|
|
59
|
+
|
|
60
|
+
pathManager.getGlobalProjectPath = originalGetGlobalProjectPath
|
|
61
|
+
pathManager.getStoragePath = originalGetStoragePath
|
|
62
|
+
pathManager.getFilePath = originalGetFilePath
|
|
63
|
+
|
|
64
|
+
if (tmpRoot) {
|
|
65
|
+
await fs.rm(tmpRoot, { recursive: true, force: true })
|
|
66
|
+
tmpRoot = null
|
|
67
|
+
}
|
|
68
|
+
})
|
|
69
|
+
|
|
70
|
+
// =============================================================================
|
|
71
|
+
// Helper Functions
|
|
72
|
+
// =============================================================================
|
|
73
|
+
|
|
74
|
+
function createMockTask(
|
|
75
|
+
overrides: Partial<CurrentTask> & Record<string, unknown> = {}
|
|
76
|
+
): CurrentTask {
|
|
77
|
+
return {
|
|
78
|
+
id: `task-${Date.now()}-${Math.random().toString(36).slice(2, 6)}`,
|
|
79
|
+
description: 'Test task',
|
|
80
|
+
startedAt: new Date().toISOString(),
|
|
81
|
+
sessionId: `session-${Date.now()}`,
|
|
82
|
+
...overrides,
|
|
83
|
+
} as CurrentTask
|
|
84
|
+
}
|
|
85
|
+
|
|
86
|
+
async function startAndCompleteWithFeedback(
|
|
87
|
+
projectId: string,
|
|
88
|
+
task: CurrentTask,
|
|
89
|
+
feedback?: TaskFeedback
|
|
90
|
+
): Promise<StateJson> {
|
|
91
|
+
await stateStorage.startTask(projectId, task)
|
|
92
|
+
await stateStorage.completeTask(projectId, feedback)
|
|
93
|
+
return await stateStorage.read(projectId)
|
|
94
|
+
}
|
|
95
|
+
|
|
96
|
+
// =============================================================================
|
|
97
|
+
// Tests: TaskFeedback Schema Validation
|
|
98
|
+
// =============================================================================
|
|
99
|
+
|
|
100
|
+
describe('TaskFeedback Schema', () => {
|
|
101
|
+
it('should validate a complete feedback object', () => {
|
|
102
|
+
const feedback: TaskFeedback = {
|
|
103
|
+
stackConfirmed: ['React 18', 'TypeScript strict mode'],
|
|
104
|
+
patternsDiscovered: ['API routes follow /api/v1/{resource} pattern'],
|
|
105
|
+
agentAccuracy: [{ agent: 'backend.md', rating: 'helpful', note: 'Good API patterns' }],
|
|
106
|
+
issuesEncountered: ['ESLint conflicts with Prettier'],
|
|
107
|
+
}
|
|
108
|
+
|
|
109
|
+
const result = TaskFeedbackSchema.safeParse(feedback)
|
|
110
|
+
expect(result.success).toBe(true)
|
|
111
|
+
})
|
|
112
|
+
|
|
113
|
+
it('should validate an empty feedback object', () => {
|
|
114
|
+
const feedback: TaskFeedback = {}
|
|
115
|
+
|
|
116
|
+
const result = TaskFeedbackSchema.safeParse(feedback)
|
|
117
|
+
expect(result.success).toBe(true)
|
|
118
|
+
})
|
|
119
|
+
|
|
120
|
+
it('should validate feedback with only patterns', () => {
|
|
121
|
+
const feedback: TaskFeedback = {
|
|
122
|
+
patternsDiscovered: ['Components use barrel exports'],
|
|
123
|
+
}
|
|
124
|
+
|
|
125
|
+
const result = TaskFeedbackSchema.safeParse(feedback)
|
|
126
|
+
expect(result.success).toBe(true)
|
|
127
|
+
})
|
|
128
|
+
|
|
129
|
+
it('should reject invalid agent accuracy rating', () => {
|
|
130
|
+
const feedback = {
|
|
131
|
+
agentAccuracy: [{ agent: 'backend.md', rating: 'invalid_rating' }],
|
|
132
|
+
}
|
|
133
|
+
|
|
134
|
+
const result = TaskFeedbackSchema.safeParse(feedback)
|
|
135
|
+
expect(result.success).toBe(false)
|
|
136
|
+
})
|
|
137
|
+
|
|
138
|
+
it('should validate all agent accuracy rating values', () => {
|
|
139
|
+
for (const rating of ['helpful', 'neutral', 'inaccurate'] as const) {
|
|
140
|
+
const feedback: TaskFeedback = {
|
|
141
|
+
agentAccuracy: [{ agent: 'test.md', rating }],
|
|
142
|
+
}
|
|
143
|
+
const result = TaskFeedbackSchema.safeParse(feedback)
|
|
144
|
+
expect(result.success).toBe(true)
|
|
145
|
+
}
|
|
146
|
+
})
|
|
147
|
+
})
|
|
148
|
+
|
|
149
|
+
// =============================================================================
|
|
150
|
+
// Tests: Feedback Persistence in Task History
|
|
151
|
+
// =============================================================================
|
|
152
|
+
|
|
153
|
+
describe('Feedback Persistence', () => {
|
|
154
|
+
it('should store feedback in task history entry', async () => {
|
|
155
|
+
const task = createMockTask({ description: 'Task with feedback' })
|
|
156
|
+
const feedback: TaskFeedback = {
|
|
157
|
+
stackConfirmed: ['TypeScript'],
|
|
158
|
+
patternsDiscovered: ['Uses Hono framework'],
|
|
159
|
+
}
|
|
160
|
+
|
|
161
|
+
const state = await startAndCompleteWithFeedback(testProjectId, task, feedback)
|
|
162
|
+
|
|
163
|
+
expect(state.taskHistory).toBeDefined()
|
|
164
|
+
expect(state.taskHistory!.length).toBe(1)
|
|
165
|
+
expect(state.taskHistory![0].feedback).toBeDefined()
|
|
166
|
+
expect(state.taskHistory![0].feedback?.stackConfirmed).toEqual(['TypeScript'])
|
|
167
|
+
expect(state.taskHistory![0].feedback?.patternsDiscovered).toEqual(['Uses Hono framework'])
|
|
168
|
+
})
|
|
169
|
+
|
|
170
|
+
it('should store task without feedback (backward compatible)', async () => {
|
|
171
|
+
const task = createMockTask({ description: 'Task without feedback' })
|
|
172
|
+
|
|
173
|
+
const state = await startAndCompleteWithFeedback(testProjectId, task)
|
|
174
|
+
|
|
175
|
+
expect(state.taskHistory).toBeDefined()
|
|
176
|
+
expect(state.taskHistory!.length).toBe(1)
|
|
177
|
+
expect(state.taskHistory![0].feedback).toBeUndefined()
|
|
178
|
+
})
|
|
179
|
+
|
|
180
|
+
it('should preserve feedback through FIFO eviction', async () => {
|
|
181
|
+
// Complete first task with feedback
|
|
182
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
183
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
184
|
+
await stateStorage.completeTask(testProjectId, {
|
|
185
|
+
patternsDiscovered: ['Pattern from task 1'],
|
|
186
|
+
})
|
|
187
|
+
|
|
188
|
+
// Complete second task with feedback
|
|
189
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
190
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
191
|
+
await stateStorage.completeTask(testProjectId, {
|
|
192
|
+
patternsDiscovered: ['Pattern from task 2'],
|
|
193
|
+
})
|
|
194
|
+
|
|
195
|
+
const state = await stateStorage.read(testProjectId)
|
|
196
|
+
expect(state.taskHistory!.length).toBe(2)
|
|
197
|
+
// Newest entry first (index 0 is the most recently completed task)
|
|
198
|
+
expect(state.taskHistory![0].feedback?.patternsDiscovered).toEqual(['Pattern from task 2'])
|
|
199
|
+
expect(state.taskHistory![1].feedback?.patternsDiscovered).toEqual(['Pattern from task 1'])
|
|
200
|
+
})
|
|
201
|
+
|
|
202
|
+
it('should store full feedback with all fields', async () => {
|
|
203
|
+
const task = createMockTask({ description: 'Full feedback task' })
|
|
204
|
+
const feedback: TaskFeedback = {
|
|
205
|
+
stackConfirmed: ['React 18', 'TypeScript'],
|
|
206
|
+
patternsDiscovered: ['API routes use /api/v1/{resource}', 'Barrel exports'],
|
|
207
|
+
agentAccuracy: [
|
|
208
|
+
{ agent: 'backend.md', rating: 'helpful', note: 'Good patterns' },
|
|
209
|
+
{ agent: 'frontend.md', rating: 'inaccurate', note: 'Missing Tailwind' },
|
|
210
|
+
],
|
|
211
|
+
issuesEncountered: ['ESLint conflicts with Prettier'],
|
|
212
|
+
}
|
|
213
|
+
|
|
214
|
+
const state = await startAndCompleteWithFeedback(testProjectId, task, feedback)
|
|
215
|
+
|
|
216
|
+
const stored = state.taskHistory![0].feedback!
|
|
217
|
+
expect(stored.stackConfirmed).toEqual(['React 18', 'TypeScript'])
|
|
218
|
+
expect(stored.patternsDiscovered).toEqual([
|
|
219
|
+
'API routes use /api/v1/{resource}',
|
|
220
|
+
'Barrel exports',
|
|
221
|
+
])
|
|
222
|
+
expect(stored.agentAccuracy).toHaveLength(2)
|
|
223
|
+
expect(stored.agentAccuracy![0].rating).toBe('helpful')
|
|
224
|
+
expect(stored.agentAccuracy![1].rating).toBe('inaccurate')
|
|
225
|
+
expect(stored.issuesEncountered).toEqual(['ESLint conflicts with Prettier'])
|
|
226
|
+
})
|
|
227
|
+
})
|
|
228
|
+
|
|
229
|
+
// =============================================================================
|
|
230
|
+
// Tests: Feedback Aggregation
|
|
231
|
+
// =============================================================================
|
|
232
|
+
|
|
233
|
+
describe('Feedback Aggregation', () => {
|
|
234
|
+
it('should aggregate patterns from multiple tasks', async () => {
|
|
235
|
+
// Task 1
|
|
236
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
237
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
238
|
+
await stateStorage.completeTask(testProjectId, {
|
|
239
|
+
patternsDiscovered: ['Pattern A'],
|
|
240
|
+
})
|
|
241
|
+
|
|
242
|
+
// Task 2
|
|
243
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
244
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
245
|
+
await stateStorage.completeTask(testProjectId, {
|
|
246
|
+
patternsDiscovered: ['Pattern B'],
|
|
247
|
+
})
|
|
248
|
+
|
|
249
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
250
|
+
expect(aggregated.patternsDiscovered).toContain('Pattern A')
|
|
251
|
+
expect(aggregated.patternsDiscovered).toContain('Pattern B')
|
|
252
|
+
})
|
|
253
|
+
|
|
254
|
+
it('should deduplicate patterns', async () => {
|
|
255
|
+
// Both tasks discover the same pattern
|
|
256
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
257
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
258
|
+
await stateStorage.completeTask(testProjectId, {
|
|
259
|
+
patternsDiscovered: ['Same pattern'],
|
|
260
|
+
})
|
|
261
|
+
|
|
262
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
263
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
264
|
+
await stateStorage.completeTask(testProjectId, {
|
|
265
|
+
patternsDiscovered: ['Same pattern'],
|
|
266
|
+
})
|
|
267
|
+
|
|
268
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
269
|
+
expect(aggregated.patternsDiscovered).toEqual(['Same pattern'])
|
|
270
|
+
})
|
|
271
|
+
|
|
272
|
+
it('should deduplicate stack confirmations', async () => {
|
|
273
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
274
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
275
|
+
await stateStorage.completeTask(testProjectId, {
|
|
276
|
+
stackConfirmed: ['TypeScript', 'React'],
|
|
277
|
+
})
|
|
278
|
+
|
|
279
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
280
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
281
|
+
await stateStorage.completeTask(testProjectId, {
|
|
282
|
+
stackConfirmed: ['TypeScript', 'Next.js'],
|
|
283
|
+
})
|
|
284
|
+
|
|
285
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
286
|
+
expect(aggregated.stackConfirmed).toContain('TypeScript')
|
|
287
|
+
expect(aggregated.stackConfirmed).toContain('React')
|
|
288
|
+
expect(aggregated.stackConfirmed).toContain('Next.js')
|
|
289
|
+
// TypeScript should not be duplicated
|
|
290
|
+
expect(aggregated.stackConfirmed.filter((s) => s === 'TypeScript')).toHaveLength(1)
|
|
291
|
+
})
|
|
292
|
+
|
|
293
|
+
it('should promote recurring issues to known gotchas', async () => {
|
|
294
|
+
// Same issue encountered twice
|
|
295
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
296
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
297
|
+
await stateStorage.completeTask(testProjectId, {
|
|
298
|
+
issuesEncountered: ['ESLint conflicts with Prettier'],
|
|
299
|
+
})
|
|
300
|
+
|
|
301
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
302
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
303
|
+
await stateStorage.completeTask(testProjectId, {
|
|
304
|
+
issuesEncountered: ['ESLint conflicts with Prettier'],
|
|
305
|
+
})
|
|
306
|
+
|
|
307
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
308
|
+
expect(aggregated.knownGotchas).toContain('ESLint conflicts with Prettier')
|
|
309
|
+
})
|
|
310
|
+
|
|
311
|
+
it('should NOT promote single-occurrence issues to gotchas', async () => {
|
|
312
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
313
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
314
|
+
await stateStorage.completeTask(testProjectId, {
|
|
315
|
+
issuesEncountered: ['One-time issue'],
|
|
316
|
+
})
|
|
317
|
+
|
|
318
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
319
|
+
expect(aggregated.issuesEncountered).toContain('One-time issue')
|
|
320
|
+
expect(aggregated.knownGotchas).not.toContain('One-time issue')
|
|
321
|
+
})
|
|
322
|
+
|
|
323
|
+
it('should aggregate agent accuracy across tasks', async () => {
|
|
324
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
325
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
326
|
+
await stateStorage.completeTask(testProjectId, {
|
|
327
|
+
agentAccuracy: [{ agent: 'backend.md', rating: 'helpful' }],
|
|
328
|
+
})
|
|
329
|
+
|
|
330
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
331
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
332
|
+
await stateStorage.completeTask(testProjectId, {
|
|
333
|
+
agentAccuracy: [{ agent: 'backend.md', rating: 'inaccurate', note: 'Missing Hono context' }],
|
|
334
|
+
})
|
|
335
|
+
|
|
336
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
337
|
+
expect(aggregated.agentAccuracy).toHaveLength(2)
|
|
338
|
+
expect(aggregated.agentAccuracy[0].agent).toBe('backend.md')
|
|
339
|
+
expect(aggregated.agentAccuracy[1].agent).toBe('backend.md')
|
|
340
|
+
})
|
|
341
|
+
|
|
342
|
+
it('should return empty aggregation when no feedback exists', async () => {
|
|
343
|
+
// Complete task without feedback
|
|
344
|
+
const task = createMockTask({ description: 'No feedback' })
|
|
345
|
+
await stateStorage.startTask(testProjectId, task)
|
|
346
|
+
await stateStorage.completeTask(testProjectId)
|
|
347
|
+
|
|
348
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
349
|
+
expect(aggregated.stackConfirmed).toEqual([])
|
|
350
|
+
expect(aggregated.patternsDiscovered).toEqual([])
|
|
351
|
+
expect(aggregated.agentAccuracy).toEqual([])
|
|
352
|
+
expect(aggregated.issuesEncountered).toEqual([])
|
|
353
|
+
expect(aggregated.knownGotchas).toEqual([])
|
|
354
|
+
})
|
|
355
|
+
|
|
356
|
+
it('should return empty aggregation when no tasks exist', async () => {
|
|
357
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
358
|
+
expect(aggregated.stackConfirmed).toEqual([])
|
|
359
|
+
expect(aggregated.patternsDiscovered).toEqual([])
|
|
360
|
+
expect(aggregated.knownGotchas).toEqual([])
|
|
361
|
+
})
|
|
362
|
+
})
|
|
363
|
+
|
|
364
|
+
// =============================================================================
|
|
365
|
+
// Tests: Context Injection (Markdown Generation)
|
|
366
|
+
// =============================================================================
|
|
367
|
+
|
|
368
|
+
describe('Feedback in Markdown Context', () => {
|
|
369
|
+
it('should include patterns in task history markdown', async () => {
|
|
370
|
+
const task = createMockTask({ description: 'Task with patterns' })
|
|
371
|
+
await stateStorage.startTask(testProjectId, task)
|
|
372
|
+
await stateStorage.completeTask(testProjectId, {
|
|
373
|
+
patternsDiscovered: ['Uses barrel exports'],
|
|
374
|
+
})
|
|
375
|
+
|
|
376
|
+
const state = await stateStorage.read(testProjectId)
|
|
377
|
+
// Call the private toMarkdown() directly by casting stateStorage to any
|
|
378
|
+
const md = (stateStorage as any).toMarkdown(state)
|
|
379
|
+
|
|
380
|
+
expect(md).toContain('Patterns: Uses barrel exports')
|
|
381
|
+
})
|
|
382
|
+
|
|
383
|
+
it('should include gotchas in task history markdown', async () => {
|
|
384
|
+
const task = createMockTask({ description: 'Task with gotchas' })
|
|
385
|
+
await stateStorage.startTask(testProjectId, task)
|
|
386
|
+
await stateStorage.completeTask(testProjectId, {
|
|
387
|
+
issuesEncountered: ['Port 3000 already in use'],
|
|
388
|
+
})
|
|
389
|
+
|
|
390
|
+
const state = await stateStorage.read(testProjectId)
|
|
391
|
+
const md = (stateStorage as any).toMarkdown(state)
|
|
392
|
+
|
|
393
|
+
expect(md).toContain('Gotchas: Port 3000 already in use')
|
|
394
|
+
})
|
|
395
|
+
|
|
396
|
+
it('should NOT show feedback section when feedback is empty', async () => {
|
|
397
|
+
const task = createMockTask({ description: 'Task without feedback' })
|
|
398
|
+
await stateStorage.startTask(testProjectId, task)
|
|
399
|
+
await stateStorage.completeTask(testProjectId)
|
|
400
|
+
|
|
401
|
+
const state = await stateStorage.read(testProjectId)
|
|
402
|
+
const md = (stateStorage as any).toMarkdown(state)
|
|
403
|
+
|
|
404
|
+
expect(md).not.toContain('Patterns:')
|
|
405
|
+
expect(md).not.toContain('Gotchas:')
|
|
406
|
+
})
|
|
407
|
+
})
|
|
408
|
+
|
|
409
|
+
// =============================================================================
|
|
410
|
+
// Tests: Mixed Feedback and Non-Feedback Tasks
|
|
411
|
+
// =============================================================================
|
|
412
|
+
|
|
413
|
+
describe('Mixed Tasks (with and without feedback)', () => {
|
|
414
|
+
it('should handle mix of tasks with and without feedback', async () => {
|
|
415
|
+
// Task 1: with feedback
|
|
416
|
+
const task1 = createMockTask({ description: 'With feedback' })
|
|
417
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
418
|
+
await stateStorage.completeTask(testProjectId, {
|
|
419
|
+
patternsDiscovered: ['Pattern from task 1'],
|
|
420
|
+
})
|
|
421
|
+
|
|
422
|
+
// Task 2: without feedback
|
|
423
|
+
const task2 = createMockTask({ description: 'Without feedback' })
|
|
424
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
425
|
+
await stateStorage.completeTask(testProjectId)
|
|
426
|
+
|
|
427
|
+
// Task 3: with feedback
|
|
428
|
+
const task3 = createMockTask({ description: 'With feedback again' })
|
|
429
|
+
await stateStorage.startTask(testProjectId, task3)
|
|
430
|
+
await stateStorage.completeTask(testProjectId, {
|
|
431
|
+
patternsDiscovered: ['Pattern from task 3'],
|
|
432
|
+
})
|
|
433
|
+
|
|
434
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
435
|
+
expect(aggregated.patternsDiscovered).toContain('Pattern from task 1')
|
|
436
|
+
expect(aggregated.patternsDiscovered).toContain('Pattern from task 3')
|
|
437
|
+
expect(aggregated.patternsDiscovered).toHaveLength(2)
|
|
438
|
+
})
|
|
439
|
+
|
|
440
|
+
it('should correctly count occurrences for gotcha promotion with mixed tasks', async () => {
|
|
441
|
+
// Task 1: encounters issue
|
|
442
|
+
const task1 = createMockTask({ description: 'Task 1' })
|
|
443
|
+
await stateStorage.startTask(testProjectId, task1)
|
|
444
|
+
await stateStorage.completeTask(testProjectId, {
|
|
445
|
+
issuesEncountered: ['Build fails on M1'],
|
|
446
|
+
})
|
|
447
|
+
|
|
448
|
+
// Task 2: no feedback
|
|
449
|
+
const task2 = createMockTask({ description: 'Task 2' })
|
|
450
|
+
await stateStorage.startTask(testProjectId, task2)
|
|
451
|
+
await stateStorage.completeTask(testProjectId)
|
|
452
|
+
|
|
453
|
+
// Task 3: encounters same issue
|
|
454
|
+
const task3 = createMockTask({ description: 'Task 3' })
|
|
455
|
+
await stateStorage.startTask(testProjectId, task3)
|
|
456
|
+
await stateStorage.completeTask(testProjectId, {
|
|
457
|
+
issuesEncountered: ['Build fails on M1'],
|
|
458
|
+
})
|
|
459
|
+
|
|
460
|
+
const aggregated = await stateStorage.getAggregatedFeedback(testProjectId)
|
|
461
|
+
expect(aggregated.knownGotchas).toContain('Build fails on M1')
|
|
462
|
+
})
|
|
463
|
+
})
|