maestro-flow 0.2.2 → 0.3.0

@@ -80,6 +80,10 @@ Follow the instructions loaded from the role_spec body. This contains the domain
80
80
  - Use CLI tools (`maestro cli`) or direct tools (Read, Grep, Glob) for analysis — see @~/.maestro/templates/search-tools.md for tool selection
81
81
  - If agent delegation is needed, send a request to the coordinator via SendMessage
82
82
 
83
+ ### Context-Aware Signal Emission (Optional)
84
+
85
+ During Phase 2-4 execution, if you detect codebase signals relevant to specialist injection (SQL usage, auth modules, ML imports, performance-sensitive code, etc.), include `tech_profile` in your Phase 5 state_update data. This enables the coordinator to evaluate specialist injection for the pipeline.
86
+
83
87
  ### 6. Publish Results
84
88
 
85
89
  After execution, publish contributions:
@@ -172,14 +176,14 @@ Determine report variant based on loop state:
172
176
 
173
177
  **Loop continuation** (inner_loop=true AND more same-prefix tasks pending):
174
178
  1. `TaskUpdate` -- mark current task `completed`
175
- 2. Log `state_update` via `team_msg` with task results
179
+ 2. Log `state_update` via `team_msg` with task results and optional `tech_profile` (if codebase signals detected in Phase 2-4)
176
180
  3. Accumulate summary to in-memory `context_accumulator`
177
181
  4. Interrupt check: consensus_blocked HIGH or errors >= 3 -- SendMessage and STOP
178
182
  5. Return to step 3 (Task Discovery)
179
183
 
180
184
  **Final report** (no more same-prefix tasks OR inner_loop=false):
181
185
  1. `TaskUpdate` -- mark current task `completed`
182
- 2. Log `state_update` via `team_msg`
186
+ 2. Log `state_update` via `team_msg` (include `tech_profile` if codebase signals detected)
183
187
  3. Compile and send final report via SendMessage to coordinator:
184
188
  - Tasks completed (count + list)
185
189
  - Artifacts produced (paths)
@@ -0,0 +1,131 @@
1
+ ---
2
+ name: manage-harvest
3
+ description: Extract knowledge from workflow artifacts and route to wiki / spec / issue stores
4
+ argument-hint: "[<session-id|path>] [--to wiki|spec|issue|auto] [--source <type>] [--recent N] [--min-confidence N] [--dry-run] [-y]"
5
+ allowed-tools:
6
+ - Read
7
+ - Write
8
+ - Edit
9
+ - Bash
10
+ - Glob
11
+ - Grep
12
+ - Agent
13
+ - AskUserQuestion
14
+ ---
15
+ <purpose>
16
+ Extract knowledge fragments from workflow artifacts (analysis results, brainstorm outputs, debug sessions, lite-plan/fix results, scratchpad notes, completed sessions) and route them into the project's three knowledge stores: wiki entries, spec conventions, and trackable issues.
17
+
18
+ Complements `quality-retrospective` (which is phase-scoped) by harvesting from **any** workflow artifact. Prevents knowledge loss from completed analysis and planning sessions that would otherwise only exist as stale files.
19
+
20
+ **Closed-loop**: harvest extracts → wiki/spec/issue stores → downstream commands consume (wiki-digest, spec-load, manage-issue-plan).
21
+ </purpose>
22
+
23
+ <required_reading>
24
+ @workflows/harvest.md
25
+ </required_reading>
26
+
27
+ <deferred_reading>
28
+ - @workflows/issue.md (issues.jsonl schema for issue routing — read when creating issues in Stage 6c)
29
+ - @workflows/specs-add.md (spec entry format — read when routing to spec in Stage 6b)
30
+ </deferred_reading>
31
+
32
+ <context>
33
+ Arguments: $ARGUMENTS
34
+
35
+ **Modes (auto-detected):**
36
+ - No arguments → `scan` mode: discover all harvestable artifacts, interactive selection
37
+ - `<session-id>` (e.g., `ANL-auth-20260410`, `WFS-xxx`) → `session` mode: harvest specific session
38
+ - `<path>` (e.g., `.workflow/.analysis/ANL-auth-20260410/`) → `path` mode: harvest from explicit directory
39
+
40
+ **Flags:**
41
+ - `--to <target>` — Force routing: `wiki`, `spec`, `issue`, `auto` (default: `auto`)
42
+ - `--source <type>` — Filter source type: `analysis`, `brainstorm`, `debug`, `lite-plan`, `lite-fix`, `scratchpad`, `session`, `learning`, `all` (default: `all`)
43
+ - `--recent N` — Only artifacts updated within last N days (default: 30)
44
+ - `--dry-run` — Preview extraction and routing without writing
45
+ - `-y` / `--yes` — Skip confirmation prompts
46
+ - `--min-confidence N` — Minimum extraction confidence 0.0-1.0 (default: 0.5)
47
+
48
+ **Source registry (scan paths):**
49
+ | Source Type | Scan Path | Key Files |
50
+ |-------------|-----------|-----------|
51
+ | `analysis` | `.workflow/.analysis/ANL-*/` | `conclusions.json`, `*.md` |
52
+ | `brainstorm` | `.workflow/scratch/brainstorm-*/` | `guidance-specification.md` |
53
+ | `lite-plan` | `.workflow/.lite-plan/*/` | `plan.json`, `plan-overview.md` |
54
+ | `lite-fix` | `.workflow/.lite-fix/*/` | `fix-plan.json` |
55
+ | `debug` | `.workflow/.debug/*/` | `debug-log.md`, `hypothesis-*.md` |
56
+ | `scratchpad` | `.workflow/.scratchpad/` | `*.md`, `*.json` |
57
+ | `session` | `.workflow/active/WFS-*/` | `workflow-session.json` |
58
+ | `learning` | `.workflow/learning/` | `lessons.jsonl`, `digest-*.md` |
59
+
60
+ **Storage written:**
61
+ - `.workflow/harvest/harvest-log.jsonl` — provenance log (prevents duplicate harvesting)
62
+ - `.workflow/harvest/harvest-report-{date}.md` — per-run report
63
+ - Wiki entries via `maestro wiki create`
64
+ - Spec entries via `Skill({ skill: "spec-add" })`
65
+ - Issue entries appended to `.workflow/issues/issues.jsonl`
66
+
67
+ **Storage read (never modified):**
68
+ - All artifact source files (read-only until routing stage)
69
+ - `.workflow/harvest/harvest-log.jsonl` (dedup check)
70
+ </context>
71
+
72
+ <execution>
73
+ Follow `workflows/harvest.md` Stages 1–8 in order. Key invariants:
74
+
75
+ 1. **Read-only until Stage 6** — Stages 1–5 must not write anything. All extraction and classification happens in-memory.
76
+ 2. **Dedup before write** — Stage 7 (dedup_check) runs BEFORE each write in Stage 6. Check harvest-log.jsonl, wiki search, issues.jsonl, and learnings.md for existing matches.
77
+ 3. **Stable fragment IDs** — `HRV-{8 hex}` from `hash(source_id + content_hash)` so re-runs on same artifacts do not create duplicates.
78
+ 4. **Reuse existing routing infrastructure**:
79
+ - Wiki: `maestro wiki create --type <type> --slug harvest-<source_type>-<short_id>`
80
+ - Spec: `Skill({ skill: "spec-add", args: "<type> <content>" })`
81
+ - Issue: append to `issues.jsonl` matching canonical schema from `workflows/issue.md`
82
+ 5. **Never modify source artifacts** — harvest is purely extractive. Source files remain untouched.
83
+ 6. **Confidence filtering** — fragments below `--min-confidence` are logged but not routed.
84
+ 7. **Provenance tracking** — every routed item logged to `harvest-log.jsonl` with fragment_id, source reference, and target reference.
85
+
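The provenance and dedup invariants can be illustrated with a small sketch, in Python for illustration only. It covers just the `harvest-log.jsonl` check (not the wiki, `issues.jsonl`, or learnings lookups), and the JSON keys `source` and `target`, the function names, and the sample IDs are assumptions rather than the package's actual schema.

```python
import json
import tempfile
from pathlib import Path


def load_routed_ids(log_path: Path) -> set:
    """Collect fragment IDs already recorded in the JSONL provenance log."""
    if not log_path.exists():
        return set()
    ids = set()
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                ids.add(json.loads(line)["fragment_id"])
    return ids


def record_routing(log_path: Path, fragment_id: str, source: str, target: str) -> bool:
    """Append a provenance entry unless the fragment was routed before.

    Returns True when a new entry was written, False when it was a duplicate.
    """
    if fragment_id in load_routed_ids(log_path):
        return False
    log_path.parent.mkdir(parents=True, exist_ok=True)
    entry = {"fragment_id": fragment_id, "source": source, "target": target}
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return True


# Hypothetical IDs, used only to exercise the helpers.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "harvest" / "harvest-log.jsonl"
    first = record_routing(log, "HRV-1a2b3c4d", "ANL-auth-20260410", "wiki")
    again = record_routing(log, "HRV-1a2b3c4d", "ANL-auth-20260410", "wiki")
```

Re-reading the log before every write keeps the sketch simple; a real run would more likely load the ID set once per harvest.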
86
+ **Fragment extraction uses source-specific parsing** (see harvest.md Stage 3b for per-source patterns). The agent should read each artifact file and identify discrete knowledge items: findings, decisions, patterns, bugs, risks, tasks, lessons, recommendations.
87
+
88
+ **Classification uses category-to-target mapping** (see harvest.md Stage 4). Override with `--to` flag if user wants all items in one store.
89
+
90
+ **Next-step routing on completion:**
91
+ - Review wiki entries → `maestro wiki list --type note`
92
+ - Connect wiki graph → `Skill({ skill: "wiki-connect", args: "--fix" })`
93
+ - Triage issues → `Skill({ skill: "manage-issue", args: "list --source harvest" })`
94
+ - View specs → `Skill({ skill: "spec-load", args: "--category general" })`
95
+ - Full retrospective → `Skill({ skill: "quality-retrospective" })`
96
+ </execution>
97
+
98
+ <error_codes>
99
+ | Code | Severity | Condition | Recovery |
100
+ |------|----------|-----------|----------|
101
+ | E001 | error | `.workflow/` not initialized | Run `Skill({ skill: "maestro-init" })` first |
102
+ | E002 | error | Invalid `--to` target (must be: wiki, spec, issue, auto) | Display valid options |
103
+ | E003 | error | Invalid `--source` type | Display valid source types from registry |
104
+ | E004 | error | Session ID not found in any source path | Show available sessions with `--source all` |
105
+ | E005 | error | Path does not exist or contains no parseable artifacts | Verify path and file structure |
106
+ | W001 | warning | No harvestable artifacts found within `--recent` window | Widen time window or check `.workflow/` contents |
107
+ | W002 | warning | `maestro wiki create` failed — wiki entries saved to `.workflow/harvest/wiki-pending-*.md` | Apply pending entries manually or retry |
108
+ | W003 | warning | Some fragments below confidence threshold — logged but not routed | Lower `--min-confidence` to include |
109
+ | W004 | warning | Duplicate fragments skipped | Review harvest-log.jsonl for prior routing |
110
+ | W005 | warning | `.workflow/issues/` directory missing | Auto-create directory and empty issues.jsonl |
111
+ </error_codes>
112
+
113
+ <success_criteria>
114
+ - [ ] Mode correctly resolved (scan / session / path)
115
+ - [ ] Source artifacts discovered and listed with metadata
116
+ - [ ] User selected artifact(s) to harvest (or auto-selected via session/path mode)
117
+ - [ ] All files in selected artifacts loaded and parsed
118
+ - [ ] Knowledge fragments extracted with category, confidence, tags
119
+ - [ ] Fragments filtered by `--min-confidence`
120
+ - [ ] Routing classification applied (auto or forced by `--to`)
121
+ - [ ] Dedup check passed against harvest-log.jsonl and existing stores
122
+ - [ ] If `--dry-run`: preview displayed, no files written
123
+ - [ ] If not dry-run: all routed items written to target stores
124
+ - [ ] Wiki entries created via `maestro wiki create` (or fallback to pending files)
125
+ - [ ] Spec entries added via `spec-add` mechanism
126
+ - [ ] Issue entries appended to `issues.jsonl` with canonical schema
127
+ - [ ] `harvest-log.jsonl` updated with provenance for each routed item
128
+ - [ ] `harvest-report-{date}.md` written with full summary
129
+ - [ ] No source artifacts modified
130
+ - [ ] Summary displayed with counts and next-step routing
131
+ </success_criteria>
@@ -122,6 +122,12 @@ Quality thresholds from [specs/quality-gates.md](quality-gates.md):
122
122
  - Review 60-79%: report completed with warnings
123
123
  - Fail < 60%: retry Phase 3 (max 2)
124
124
 
125
+ ### Tech Profile Injection
126
+
127
+ When generating role-specs for analysis or exploration roles, append a Tech Profile Scan instruction after their Phase 3:
128
+ - Instruct the role to scan analysis results for codebase signals relevant to its domain
129
+ - Include `tech_profile` in state_update data for coordinator specialist injection evaluation
130
+
125
131
  ### Error Protocol
126
132
 
127
133
  - Primary approach fails → try alternative (different CLI tool / different tool)
@@ -59,6 +59,22 @@ CONTEXT: @**/*
59
59
  EXPECTED: JSON with: tech_stack[], architecture_patterns[], conventions[], integration_points[]" --tool gemini --mode analysis`, run_in_background: false })
60
60
  ```
61
61
 
62
+ ### Tech Profile Scan
63
+
64
+ After codebase exploration, scan results for context-aware trigger signals (based on detected codebase characteristics):
65
+
66
+ 1. Check imports/dependencies → framework signals (`sql_detected`, `auth_detected`, `ml_detected`, `frontend_framework`)
67
+ 2. Check file patterns → infrastructure signals (`devops_detected`, `data_migration`, `realtime_detected`)
68
+ 3. Check code patterns → risk signals (`perf_sensitive`, `crypto_usage`, `legacy_patterns`, `test_gap`)
69
+ 4. Include `tech_profile` in Phase 5 state_update data:
70
+ ```json
71
+ "tech_profile": {
72
+ "signals": ["<detected signals>"],
73
+ "evidence": { "<signal>": ["<file paths>"] },
74
+ "confidence": "high|medium|low"
75
+ }
76
+ ```
77
+
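A minimal sketch of assembling the `tech_profile` payload shown above, in Python for illustration. The helper name `build_tech_profile`, the rule of dropping signals that have no evidence, and the example file paths are assumptions, not part of maestro-flow.

```python
from typing import Dict, List

VALID_CONFIDENCE = {"high", "medium", "low"}


def build_tech_profile(evidence: Dict[str, List[str]], confidence: str) -> dict:
    """Build a dict with the signals / evidence / confidence fields."""
    if confidence not in VALID_CONFIDENCE:
        raise ValueError(f"confidence must be one of {sorted(VALID_CONFIDENCE)}")
    # Assumption: a signal without any supporting file path is not reported.
    kept = {sig: sorted(set(paths)) for sig, paths in evidence.items() if paths}
    return {
        "signals": sorted(kept),
        "evidence": kept,
        "confidence": confidence,
    }


# Hypothetical evidence paths, for illustration only.
profile = build_tech_profile(
    {
        "sql_detected": ["src/db/query.ts"],
        "auth_detected": ["src/auth/login.ts"],
        "ml_detected": [],
    },
    "medium",
)
```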
62
78
  ## Phase 4: Context Packaging
63
79
 
64
80
  1. Write spec-config.json → <session>/spec/
@@ -31,14 +31,63 @@ Worker completed. Process and advance.
31
31
  4. Completion -> mark task done
32
32
  - Resident agent (supervisor) -> keep in active_workers (stays alive for future checkpoints)
33
33
  - Standard worker -> remove from active_workers
34
+ 4.5. **evaluateSpecialistInjection** (based on detected codebase characteristics):
35
+ - If callback from analyst, planner, or executor role:
36
+ a. `get_state(role=<callback_role>)` → extract `tech_profile.signals`
37
+ b. Merge with previously collected signals from other roles
38
+ c. Evaluate against trigger matrix (§4)
39
+ d. P0 matches → TaskCreate with blockedBy on current stage, blocks downstream
40
+ e. P1 matches → TaskCreate parallel with REVIEW/TEST stage
41
+ f. Log: `team_msg(type="specialist_injection", data={ specialist, signals, priority, evidence })`
42
+ g. Dedup: skip if same specialist already injected this session
34
43
  5. Check for checkpoints:
35
44
  - CHECKPOINT-* with verdict "block" -> AskUserQuestion: Override / Revise upstream / Abort
36
45
  - CHECKPOINT-* with verdict "warn" -> log risks to wisdom, proceed normally
37
46
  - CHECKPOINT-* with verdict "pass" -> proceed normally
38
47
  - QUALITY-001 -> display quality gate, pause for user commands
39
- - PLAN-001 -> read plan.json complexity, create dynamic IMPL tasks per specs/pipelines.md routing
48
+ - PLAN-001 -> dynamicImplDispatch (see below)
40
49
  6. -> handleSpawnNext
41
50
 
51
+ ### dynamicImplDispatch (PLAN-001 callback)
52
+
53
+ When PLAN-001 completes, coordinator creates IMPL tasks based on complexity:
54
+
55
+ 1. Read `<session>/plan/plan.json` → extract `complexity`, `tasks[]`
56
+ 2. Route by complexity (per specs/pipelines.md §6):
57
+
58
+ | Complexity | Action |
59
+ |------------|--------|
60
+ | Low (1-2 modules) | Create single IMPL-001, blockedBy: [PLAN-001], InnerLoop: true |
61
+ | Medium (3-4 modules) | Create IMPL-{1..N}, each blockedBy: [PLAN-001] only, InnerLoop: false |
62
+ | High (5+ modules) | Create IMPL-{1..N} with DAG deps from plan.json, InnerLoop per dispatch rules |
63
+
64
+ 3. For each IMPL task: TaskCreate with structured description (dispatch.md template)
65
+ 4. Set blockedBy:
66
+ - **Parallel tasks**: blockedBy: [PLAN-001] (or [CHECKPOINT-003] if supervision enabled)
67
+ - **Serial chain within DAG**: blockedBy includes upstream IMPL task IDs
68
+ 5. Update team-session.json: `pipeline.tasks_total`, `pipeline.impl_topology: "single"|"parallel"|"dag"`
69
+ 6. Log via team_msg: `{ type: "state_update", data: { impl_count: N, topology: "..." } }`
70
+
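The routing table and `blockedBy` rules above can be sketched as follows, in Python for illustration. This is not the coordinator's implementation: the `depends_on` field (1-based indexes of upstream plan tasks), the zero-padded `IMPL-NNN` IDs, and the lowercase comparison of `complexity` are assumptions, and `inner_loop` for the high-complexity case is a placeholder because the dispatch rules are not shown here.

```python
def dispatch_impl_tasks(plan: dict, supervision: bool = False):
    """Return (topology, impl_tasks) derived from a parsed plan.json dict."""
    complexity = plan["complexity"].lower()
    if complexity == "low":
        # Low complexity: a single inner-loop task gated on PLAN-001.
        task = {"id": "IMPL-001", "blockedBy": ["PLAN-001"], "inner_loop": True}
        return "single", [task]

    # Parallel tasks wait on CHECKPOINT-003 when supervision is enabled.
    gate = "CHECKPOINT-003" if supervision else "PLAN-001"
    impl_tasks = []
    for index, task in enumerate(plan["tasks"], start=1):
        blocked_by = [gate]
        if complexity == "high":
            # Hypothetical field: indexes of upstream tasks in the DAG.
            blocked_by += [f"IMPL-{dep:03d}" for dep in task.get("depends_on", [])]
        impl_tasks.append(
            {"id": f"IMPL-{index:03d}", "blockedBy": blocked_by, "inner_loop": False}
        )
    return ("dag" if complexity == "high" else "parallel"), impl_tasks


topology, tasks = dispatch_impl_tasks(
    {"complexity": "High", "tasks": [{"title": "a"}, {"title": "b", "depends_on": [1]}]}
)
```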
42
91
  ## handleCheck
43
92
 
44
93
  Read-only status report, then STOP.
@@ -41,6 +41,15 @@ Codebase-informed implementation planning with complexity assessment.
41
41
  ```
42
42
  4. Store results in <session>/explorations/
43
43
 
44
+ ### Secondary Signal Scan
45
+
46
+ After exploration, supplement upstream tech_profile with planning-phase signals (based on detected codebase characteristics):
47
+
48
+ 1. Check plan complexity → `scaling_concern` if O(n^2)+ patterns found
49
+ 2. Check scope → `breaking_change` if public API modifications planned
50
+ 3. Check data → `data_migration` if schema changes identified
51
+ 4. Include `tech_profile` in Phase 5 state_update (merge with any upstream signals)
52
+
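Step 4 says to merge with upstream signals but does not say how. One possible reading, sketched in Python: union the signals and evidence, and keep the lower of the confidence levels. The function name and the confidence rule are assumptions.

```python
CONFIDENCE_ORDER = ["low", "medium", "high"]


def merge_tech_profiles(upstream: dict, local: dict) -> dict:
    """Union signals and evidence from two tech_profile dicts."""
    evidence = {}
    for profile in (upstream, local):
        for signal, paths in profile.get("evidence", {}).items():
            evidence.setdefault(signal, set()).update(paths)
    signals = set(upstream.get("signals", [])) | set(local.get("signals", []))
    # Assumption: the merged confidence is the most conservative one present.
    levels = [p["confidence"] for p in (upstream, local) if "confidence" in p]
    confidence = min(levels, key=CONFIDENCE_ORDER.index) if levels else "low"
    return {
        "signals": sorted(signals),
        "evidence": {sig: sorted(paths) for sig, paths in sorted(evidence.items())},
        "confidence": confidence,
    }


# Hypothetical profiles, for illustration only.
merged = merge_tech_profiles(
    {"signals": ["sql_detected"], "evidence": {"sql_detected": ["a.sql"]}, "confidence": "high"},
    {
        "signals": ["data_migration", "sql_detected"],
        "evidence": {"data_migration": ["migrations/001.sql"], "sql_detected": ["b.sql"]},
        "confidence": "medium",
    },
)
```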
44
53
  ## Phase 3: Plan Generation
45
54
 
46
55
  Generate plan.json + .task/TASK-*.json:
@@ -53,7 +53,8 @@ Sent via `team_msg(type="state_update")` on task completion.
53
53
  "files_modified": [
54
54
  "path/to/file.ts"
55
55
  ],
56
- "verification": "self-validated | peer-reviewed | tested"
56
+ "verification": "self-validated | peer-reviewed | tested",
57
+ "tech_profile": "<optional, from Phase 2-4 if codebase signals detected>"
57
58
  }
58
59
  ```
59
60
 
@@ -63,6 +64,7 @@ Sent via `team_msg(type="state_update")` on task completion.
63
64
  - `decisions`: Include rationale, not just the choice
64
65
  - `files_modified`: Only for implementation tasks
65
66
  - `verification`: One of `self-validated`, `peer-reviewed`, `tested`
67
+ - `tech_profile`: Optional. Codebase signals for context-aware specialist injection. Schema: `{ signals: string[], evidence: { [signal: string]: string[] }, confidence: "high" | "medium" | "low" }`
66
68
 
67
69
  **Supervisor-specific extensions** (CHECKPOINT tasks only):
68
70
 
@@ -107,19 +107,34 @@ PLAN-001 outputs a complexity assessment that determines the impl topology.
107
107
  | TEST-001 | tester | validation | IMPL-* | - | P0 |
108
108
  | REVIEW-001 | reviewer | review | IMPL-* | - | P0 |
109
109
 
110
- ## 8. Dynamic Specialist Injection
110
+ ## 8. Context-Aware Specialist Injection
111
111
 
112
- When task content or user request matches trigger keywords, inject a specialist task.
112
+ Specialists are injected based on **codebase signals** detected by explorer/analyst/planner workers, not keyword matching. The coordinator evaluates signals emitted in worker state updates against a trigger matrix to determine when specialist roles are needed.
113
113
 
114
- | Trigger Keywords | Specialist Role | Task Prefix | Priority | Insert After |
115
- |------------------|----------------|-------------|----------|--------------|
116
- | security, vulnerability, OWASP | security-expert | SECURITY-* | P0 | PLAN |
117
- | performance, optimization, latency | performance-optimizer | PERF-* | P1 | IMPL |
118
- | data, pipeline, ETL, migration | data-engineer | DATA-* | P0 | parallel with IMPL |
119
- | devops, CI/CD, deployment, infra | devops-engineer | DEVOPS-* | P1 | IMPL |
120
- | ML, model, training, inference | ml-engineer | ML-* | P0 | parallel with IMPL |
114
+ ### Signal Flow
121
115
 
122
- **Injection rules**:
123
- - Specialist tasks inherit the session context and wisdom
116
+ ```
117
+ analyst (RESEARCH-001) emits tech_profile in state_update
118
+ → coordinator evaluateSpecialistInjection (in handleCallback)
119
+ → signal combination matches trigger matrix
120
+ → P0: TaskCreate blocking downstream | P1: TaskCreate parallel with REVIEW/TEST
121
+ ```
122
+
123
+ ### Common Trigger Examples
124
+
125
+ | Signal Combination | Specialist | Priority |
126
+ |-------------------|-----------|----------|
127
+ | `sql_detected` + `auth_detected` | security-expert (SECURITY-*) | P0 |
128
+ | `perf_sensitive` + `scaling_concern` | performance-optimizer (PERF-*) | P0 |
129
+ | `ml_detected` | ml-engineer (ML-*) | P0 |
130
+ | `data_migration` | data-engineer (DATA-*) | P0 |
131
+ | `devops_detected` + CI config changes | devops-engineer (DEVOPS-*) | P1 |
132
+
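As a rough illustration of how the coordinator might evaluate the trigger examples in this table, the Python sketch below requires every signal in a combination to be present and skips any specialist already injected in the session. The function and variable names are assumptions. The `devops_detected` row is omitted because its second condition ("CI config changes") is not a named signal.

```python
# (required signals, specialist, priority) rows taken from the table.
TRIGGER_MATRIX = [
    (frozenset({"sql_detected", "auth_detected"}), "security-expert", "P0"),
    (frozenset({"perf_sensitive", "scaling_concern"}), "performance-optimizer", "P0"),
    (frozenset({"ml_detected"}), "ml-engineer", "P0"),
    (frozenset({"data_migration"}), "data-engineer", "P0"),
]


def evaluate_specialist_injection(signals, already_injected: set) -> list:
    """Return (specialist, priority) pairs whose required signals are all present."""
    present = set(signals)
    injections = []
    for required, specialist, priority in TRIGGER_MATRIX:
        if required <= present and specialist not in already_injected:
            injections.append((specialist, priority))
            already_injected.add(specialist)  # inject each specialist once per session
    return injections


injected = set()
first = evaluate_specialist_injection(
    ["sql_detected", "auth_detected", "ml_detected"], injected
)
second = evaluate_specialist_injection(
    ["sql_detected", "auth_detected", "data_migration"], injected
)
```

In the second call, `security-expert` is skipped even though its signals are present again, because the shared `injected` set carries the session state.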
133
+
134
+
135
+ ### Injection Rules
136
+
137
+ - Specialist tasks inherit session context and wisdom
124
138
  - They publish state_update on completion like any other task
125
139
  - P0 specialists block downstream tasks; P1 run in parallel
140
+ - Same specialist is only injected once per session (dedup)
@@ -62,6 +62,14 @@ TASK: Classify defects by root cause, identify high-density files, analyze cover
62
62
  MODE: analysis
63
63
  ```
64
64
 
65
+ ### Tech Profile Scan
66
+
67
+ After quality analysis, emit context-aware trigger signals (based on detected codebase characteristics):
68
+
69
+ 1. Check defect patterns → signals (`injection_risk`, `auth_detected`, `sql_detected`)
70
+ 2. Check coverage data → risk signals (`test_gap`, `perf_sensitive`, `legacy_patterns`)
71
+ 3. Include `tech_profile` in Phase 5 state_update data
72
+
65
73
  ## Phase 4: Report Generation & Output
66
74
 
67
75
  1. Generate quality report markdown with: score, defect patterns, coverage analysis, test effectiveness, quality trend, recommendations
@@ -58,6 +58,14 @@ After all perspectives complete:
58
58
  - Compare against known defect patterns from .msg/meta.json
59
59
  - Rank by severity: critical > high > medium > low
60
60
 
61
+ ### Tech Profile Scan
62
+
63
+ After scanning, emit context-aware trigger signals (based on detected codebase characteristics):
64
+
65
+ 1. Check scan findings → signals (`sql_detected`, `auth_detected`, `injection_risk`, `eval_usage`)
66
+ 2. Check quality issues → risk signals (`test_gap`, `legacy_patterns`, `perf_sensitive`)
67
+ 3. Include `tech_profile` in Phase 5 state_update data
68
+
61
69
  ## Phase 4: Result Aggregation
62
70
 
63
71
  1. Build `discoveredIssues` array from critical + high findings (with id, severity, perspective, file, line, description)
@@ -61,6 +61,14 @@ Build prompt with target file patterns, toolchain dedup summary, and per-dimensi
61
61
 
62
62
  Execute via `maestro cli --tool gemini --mode analysis --rule analysis-review-code-quality` (fallback: qwen -> codex). Parse JSON array response, validate required fields (dimension, title, location.file), enforce per-dimension limit (max 5 each), filter minimum severity (medium+). Write `<session>/scan/semantic-findings.json`.
63
63
 
64
+ ### Tech Profile Scan
65
+
66
+ After scan execution, emit context-aware trigger signals (based on detected codebase characteristics):
67
+
68
+ 1. Check security findings → signals (`injection_risk`, `eval_usage`, `sql_detected`, `auth_detected`)
69
+ 2. Check quality findings → risk signals (`legacy_patterns`, `test_gap`, `perf_sensitive`)
70
+ 3. Include `tech_profile` in Phase 5 state_update data
71
+
64
72
  ## Phase 4: Aggregate & Output
65
73
 
66
74
  1. Merge toolchain + semantic findings, deduplicate (same file + line + dimension = duplicate)
@@ -61,6 +61,14 @@ Quantitative evaluator for tech debt items. Score each debt item on business imp
61
61
 
62
62
  For CLI mode, prompt gemini with full debt summary requesting JSON array of `{id, impact_score, cost_score, risk_if_unfixed, priority_quadrant}`. Unevaluated items fall back to heuristic scoring.
63
63
 
64
+ ### Tech Profile Scan
65
+
66
+ After assessment, emit context-aware trigger signals (based on detected codebase characteristics):
67
+
68
+ 1. Check debt items → signals (`legacy_patterns`, `perf_sensitive`, `test_gap`)
69
+ 2. Check code patterns → risk signals (`sql_detected`, `auth_detected`, `scaling_concern`)
70
+ 3. Include `tech_profile` in Phase 5 state_update data
71
+
64
72
  ## Phase 4: Generate Priority Matrix
65
73
 
66
74
  1. Build matrix structure: evaluation_date, total_items, by_quadrant (grouped), summary (counts per quadrant)
@@ -74,6 +74,14 @@ Multi-dimension tech debt scanner. Scan codebase across 5 dimensions (code, arch
74
74
  | `suggestion` | Fix suggestion |
75
75
  | `estimated_effort` | small, medium, large, unknown |
76
76
 
77
+ ### Tech Profile Scan
78
+
79
+ After multi-dimension scan, emit context-aware trigger signals (based on detected codebase characteristics):
80
+
81
+ 1. Check debt dimensions → signals (`legacy_patterns`, `test_gap`, `perf_sensitive`)
82
+ 2. Check detected patterns → risk signals (`sql_detected`, `auth_detected`, `scaling_concern`, `injection_risk`)
83
+ 3. Include `tech_profile` in Phase 5 state_update data
84
+
77
85
  ## Phase 4: Aggregate & Save
78
86
 
79
87
  1. Deduplicate findings across Fan-out layers (file:line key), merge cross-references
@@ -79,6 +79,14 @@ Glob("<session>/tests/**/*")
79
79
 
80
80
  Write report to `<session>/analysis/quality-report.md`
81
81
 
82
+ ### Tech Profile Scan
83
+
84
+ After test analysis, emit context-aware trigger signals (based on detected codebase characteristics):
85
+
86
+ 1. Check test findings → signals (`test_gap`, `perf_sensitive`)
87
+ 2. Check tested code → risk signals (`sql_detected`, `auth_detected`, `injection_risk`)
88
+ 3. Include `tech_profile` in Phase 5 state_update data
89
+
82
90
  ## Phase 4: Trend Analysis & State Update
83
91
 
84
92
  **Historical comparison** (if multiple sessions exist):
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "maestro-flow",
3
- "version": "0.2.2",
3
+ "version": "0.3.0",
4
4
  "description": "Workflow orchestration CLI with MCP endpoint support and extensible architecture",
5
5
  "type": "module",
6
6
  "imports": {
@@ -0,0 +1,420 @@
1
+ # Harvest Workflow
2
+
3
+ Extract knowledge from workflow artifacts and route into wiki / spec / issue stores.
4
+
5
+ Unlike `retrospective.md` which is phase-scoped and post-execution, harvest operates on **any workflow session artifact** — analysis results, brainstorm outputs, debug sessions, lite-plan/fix results, scratchpad notes, and completed workflow sessions.
6
+
7
+ ---
8
+
9
+ ## Prerequisites
10
+
11
+ - `.workflow/` initialized (`.workflow/state.json` exists)
12
+ - At least one artifact source present (analysis, brainstorm, debug, lite-plan, lite-fix, scratchpad, or active session)
13
+ - For wiki routing: `maestro wiki` CLI available
14
+
15
+ ---
16
+
17
+ ## Argument Shape
18
+
19
+ ```
20
+ /manage-harvest → scan all sources, interactive selection
21
+ /manage-harvest <session-id> → harvest specific session (ANL-*, WFS-*, etc.)
22
+ /manage-harvest <path> → harvest from explicit directory or file
23
+ /manage-harvest --recent 7 → harvest from artifacts updated in last 7 days
24
+ /manage-harvest --source analysis → harvest only from analysis sessions
25
+ /manage-harvest <target> --to wiki → force all findings to wiki
26
+ /manage-harvest <target> --to spec → force all findings to spec
27
+ /manage-harvest <target> --to issue → force all findings to issue
28
+ /manage-harvest <target> --to auto → auto-classify routing (default)
29
+ /manage-harvest <target> --dry-run → preview without writing
30
+ ```
31
+
32
+ | Flag | Effect |
33
+ |------|--------|
34
+ | `--to <target>` | Force routing target: `wiki`, `spec`, `issue`, `auto` (default: auto) |
35
+ | `--source <type>` | Filter by source type: `analysis`, `brainstorm`, `debug`, `lite-plan`, `lite-fix`, `scratchpad`, `session`, `learning`, `all` (default: `all`) |
36
+ | `--recent N` | Only scan artifacts updated within last N days (default: 30) |
37
+ | `--dry-run` | Preview extracted items without writing to any store |
38
+ | `-y` / `--yes` | Skip confirmation prompts, accept all routing |
39
+ | `--min-confidence N` | Minimum extraction confidence 0.0-1.0 (default: 0.5) |
40
+
41
+ ---
42
+
43
+ ## Stage 1: parse_input
44
+
45
+ ```
46
+ 1. Verify .workflow/ exists; else error E001.
47
+ 2. Tokenize $ARGUMENTS:
48
+ - First non-flag token: session ID, path, or empty (scan mode)
49
+ - Flags: --to, --source, --recent, --dry-run, -y, --min-confidence
50
+ 3. Build:
51
+ mode = "scan" | "session" | "path"
52
+ target_filter = "auto" | "wiki" | "spec" | "issue"
53
+ source_filter = "all" | specific source type
54
+ recent_days = 30 (or --recent value)
55
+ dry_run = false
56
+ auto_yes = false
57
+ min_confidence = 0.5
58
+ 4. Validate --to value. Unknown target → error E002.
59
+ 5. Validate --source value. Unknown source → error E003.
60
+ ```
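A minimal Python sketch of steps 2 to 5, for illustration only. The rule for telling a path from a session ID (contains `/` or starts with `.`) is an assumption, since the workflow only says the mode is auto-detected. The `.workflow/` existence check (E001) is left out so the sketch does not touch the filesystem.

```python
VALID_TARGETS = {"wiki", "spec", "issue", "auto"}
VALID_SOURCES = {
    "analysis", "brainstorm", "debug", "lite-plan", "lite-fix",
    "scratchpad", "session", "learning", "all",
}
# Flags that consume the following token, with the option key and type.
VALUE_FLAGS = {
    "--to": ("target_filter", str),
    "--source": ("source_filter", str),
    "--recent": ("recent_days", int),
    "--min-confidence": ("min_confidence", float),
}


def parse_input(argv: list) -> dict:
    """Tokenize arguments into the option set built in Stage 1."""
    opts = {
        "mode": "scan", "target": None, "target_filter": "auto",
        "source_filter": "all", "recent_days": 30, "dry_run": False,
        "auto_yes": False, "min_confidence": 0.5,
    }
    i = 0
    while i < len(argv):
        tok = argv[i]
        if tok in VALUE_FLAGS:
            if i + 1 >= len(argv):
                raise ValueError(f"{tok} requires a value")
            key, cast = VALUE_FLAGS[tok]
            opts[key] = cast(argv[i + 1])
            i += 2
            continue
        if tok == "--dry-run":
            opts["dry_run"] = True
        elif tok in ("-y", "--yes"):
            opts["auto_yes"] = True
        elif tok.startswith("-"):
            raise ValueError(f"unknown flag: {tok}")
        elif opts["target"] is None:
            # First non-flag token: a path or a session ID (assumed heuristic).
            opts["target"] = tok
            looks_like_path = "/" in tok or tok.startswith(".")
            opts["mode"] = "path" if looks_like_path else "session"
        i += 1
    if opts["target_filter"] not in VALID_TARGETS:
        raise ValueError("E002: --to must be one of wiki, spec, issue, auto")
    if opts["source_filter"] not in VALID_SOURCES:
        raise ValueError("E003: unknown --source type")
    return opts


opts = parse_input(["ANL-auth-20260410", "--to", "wiki", "--recent", "7", "--dry-run"])
```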
61
+
62
+ ---
63
+
64
+ ## Stage 2: discover_artifacts
65
+
66
+ Scan `.workflow/` for harvestable artifacts. Each source type has a known structure:
67
+
68
+ ### Source Registry
69
+
70
+ | Source Type | Scan Path | Key Files | ID Pattern |
71
+ |-------------|-----------|-----------|------------|
72
+ | `analysis` | `.workflow/.analysis/ANL-*/` | `conclusions.json`, `*.md` | `ANL-*` |
73
+ | `brainstorm` | `.workflow/scratch/brainstorm-*/` | `guidance-specification.md`, `brainstorm-*.md` | directory name |
74
+ | `lite-plan` | `.workflow/.lite-plan/*/` | `plan.json`, `plan-overview.md` | directory name |
75
+ | `lite-fix` | `.workflow/.lite-fix/*/` | `fix-plan.json` | directory name |
76
+ | `debug` | `.workflow/.debug/*/` | `debug-log.md`, `hypothesis-*.md` | directory name |
77
+ | `scratchpad` | `.workflow/.scratchpad/` | `*.md`, `*.json` | filename |
78
+ | `session` | `.workflow/active/WFS-*/` | `workflow-session.json` | `WFS-*` |
79
+ | `learning` | `.workflow/learning/` | `lessons.jsonl`, `digest-*.md`, `*.md` | filename |
80
+
81
+ ```
82
+ candidates = []
83
+ FOR each source_type in source_registry:
84
+   IF source_filter != "all" AND source_filter != source_type: SKIP
85
+   Glob for directories/files matching scan_path
86
+   FOR each match:
87
+     stat = file modification time
88
+     IF stat.mtime < (now - recent_days * 24h): SKIP
89
+     Read key files, extract:
90
+       - session_id or directory name
91
+       - title (from JSON title field or markdown H1)
92
+       - created_at / updated_at
93
+       - summary (first paragraph or JSON summary field)
94
+       - file_count (number of artifact files)
95
+     candidates.push({ source_type, id, path, title, updated_at, summary, file_count })
96
+ ```
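The glob-and-recency part of this loop can be sketched in Python with `pathlib`. The function name and the demo directory names are assumptions; the sketch converts `recent_days` to seconds before comparing against the modification time and does not read key files or build full candidate records.

```python
import os
import tempfile
import time
from pathlib import Path


def discover(root: Path, pattern: str, recent_days: int, now: float = None) -> list:
    """Return paths under root matching pattern and modified within recent_days."""
    now = time.time() if now is None else now
    cutoff = now - recent_days * 86400  # days converted to seconds
    return sorted(p for p in root.glob(pattern) if p.stat().st_mtime >= cutoff)


# Demo on a throwaway directory tree with one fresh and one stale artifact.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    fresh = root / ".analysis" / "ANL-fresh"
    stale = root / ".analysis" / "ANL-stale"
    fresh.mkdir(parents=True)
    stale.mkdir(parents=True)
    forty_days_ago = time.time() - 40 * 86400
    os.utime(stale, (forty_days_ago, forty_days_ago))
    found = [p.name for p in discover(root, ".analysis/ANL-*", recent_days=30)]
```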
97
+
98
+ ### Display candidates
99
+
100
+ ```
101
+ === HARVESTABLE ARTIFACTS ===
102
+
103
+ # Source ID Title Updated Files
104
+ ─ ────────── ──────────────────── ─────────────────────── ──────────── ─────
105
+ 1 analysis ANL-auth-20260410 Auth vulnerability scan 2026-04-10 4
106
+ 2 brainstorm brainstorm-cache Cache strategy options 2026-04-08 3
107
+ 3 lite-fix rate-limit-20260405 Rate limiter edge case 2026-04-05 2
108
+ 4 debug debug-memory-leak Memory leak in worker 2026-04-03 5
109
+
110
+ Found: 4 artifacts (filtered by: last 30 days)
111
+ ```
112
+
113
+ ### Selection logic
114
+
115
+ | Mode | Action |
116
+ |------|--------|
117
+ | `scan`, 0 candidates | Print "No harvestable artifacts found", exit 0 |
118
+ | `scan`, ≥1 candidates | AskUserQuestion: select one, multiple (comma-separated), or "all" |
119
+ | `session` | Find matching session ID in candidates; error E004 if not found |
120
+ | `path` | Validate path exists; auto-detect source type from structure |
121
+
122
+ ---
123
+
124
+ ## Stage 3: load_and_extract (per selected artifact)
125
+
126
+ For each selected artifact, load all files and extract knowledge fragments.
127
+
128
+ ### 3a. Load artifact content
129
+
130
+ Read all relevant files in the artifact directory. Build a content bundle:
131
+
132
+ ```
133
+ bundle = {
134
+ source_type: "analysis" | "brainstorm" | ...,
135
+ id: session_id,
136
+ path: artifact_directory,
137
+ files: [{ name, content, type: "json"|"md" }],
138
+ metadata: extracted from key files (conclusions.json, plan.json, etc.)
139
+ }
140
+ ```
141
+
142
+ ### 3b. Extract knowledge fragments
143
+
144
+ Parse content to identify discrete knowledge items. Each source type has specific extraction patterns:
145
+
146
+ **Analysis (`conclusions.json` + markdown):**
147
+ - `findings[]` → each finding is a fragment
148
+ - `recommendations[]` → each recommendation is a fragment
149
+ - `risks[]` → each risk is a fragment
150
+ - Markdown sections with `## ` headings → section-level fragments
151
+
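A possible shape for the JSON part of the analysis extraction, sketched in Python. The item structure inside `findings[]`, `recommendations[]`, and `risks[]` is not specified here, so the sketch keeps strings as-is and serializes anything else; the function name and sample data are assumptions. Markdown section parsing is not covered.

```python
import json

# conclusions.json array name -> fragment category
CATEGORY_BY_KEY = {
    "findings": "finding",
    "recommendations": "recommendation",
    "risks": "risk",
}


def extract_analysis_fragments(conclusions: dict, source_id: str) -> list:
    """Turn each finding, recommendation, and risk into one fragment dict."""
    fragments = []
    for key, category in CATEGORY_BY_KEY.items():
        for item in conclusions.get(key, []):
            # Assumption: non-string items are kept as serialized JSON text.
            content = item if isinstance(item, str) else json.dumps(item, sort_keys=True)
            fragments.append(
                {
                    "source_type": "analysis",
                    "source_id": source_id,
                    "category": category,
                    "content": content,
                }
            )
    return fragments


# Hypothetical conclusions.json content, for illustration only.
fragments = extract_analysis_fragments(
    {"findings": ["Token expiry not enforced"], "risks": [{"level": "high"}]},
    "ANL-auth-20260410",
)
```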
152
+ **Brainstorm (`guidance-specification.md` + notes):**
153
+ - `## Options` or `## Approaches` → each option is a fragment
154
+ - `## Decision` or `## Recommendation` → decision fragment
155
+ - `## Trade-offs` → trade-off fragments
156
+ - Action items (lines starting with `- [ ]` or `TODO`) → task fragments
157
+
158
+ **Lite-plan (`plan.json`):**
159
+ - `tasks[]` → each task that carries a rationale becomes a decision fragment
160
+ - `dependencies[]` → architectural constraint fragments
161
+ - `risks[]` → risk fragments
162
+
163
+ **Lite-fix (`fix-plan.json`):**
164
+ - `root_cause` → bug fragment
165
+ - `fix_strategy` → pattern fragment
166
+ - `verification` → test/validation fragment
167
+
168
+ **Debug (`debug-log.md`, `hypothesis-*.md`):**
169
+ - Final diagnosis → bug fragment
170
+ - Verified hypothesis → pattern/lesson fragment
171
+ - Rejected hypotheses with reasoning → lesson fragment
172
+
173
+ **Scratchpad (`*.md`):**
174
+ - Markdown sections → generic fragments
175
+ - Code blocks with explanations → pattern fragments
176
+
177
+ **Session (`workflow-session.json`):**
178
+ - `completed_tasks[].summary` → pattern/decision fragments
179
+ - `key_decisions[]` → decision fragments
180
+ - `deferred_items[]` → issue fragments
181
+
182
+ **Learning (`lessons.jsonl`):**
183
+ - Each lesson line → lesson fragment (check if already routed to wiki/spec/issue)
184
+
185
+ Each fragment gets:
186
+ ```
187
+ fragment = {
188
+ id: "HRV-{8 hex}" from hash(source_id + content_hash),
189
+ source_type: ...,
190
+ source_id: ...,
191
+ title: extracted or inferred,
192
+ content: raw text,
193
+ tags: extracted from context,
194
+ category: "finding" | "decision" | "pattern" | "bug" | "risk" | "task" | "lesson" | "recommendation",
195
+ confidence: 0.0-1.0 (based on specificity and actionability)
196
+ }
197
+ ```
198
+
199
+ Filter by `--min-confidence`.
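The ID rule and the confidence filter above can be sketched as follows. The text only says the ID comes from `hash(source_id + content_hash)`; the choice of SHA-256 and the helper names `fragment_id` and `filter_by_confidence` are assumptions made for this example.

```python
import hashlib


def fragment_id(source_id: str, content: str) -> str:
    """Build a stable "HRV-" ID with 8 hex characters.

    Assumption: SHA-256 stands in for the unspecified hash function.
    """
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    digest = hashlib.sha256((source_id + content_hash).encode("utf-8")).hexdigest()
    return "HRV-" + digest[:8]


def filter_by_confidence(fragments, min_confidence):
    """Keep fragments whose confidence is at least the threshold."""
    return [f for f in fragments if f["confidence"] >= min_confidence]
```

Because the ID depends only on the source ID and the content, re-harvesting the same artifact yields the same fragment IDs, which is what makes a log-based duplicate check possible.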
200
+
201
+ ---
202
+
203
+ ## Stage 4: classify_routing
204
+
205
+ For each fragment, determine the best routing target (unless `--to` forces a specific target).
206
+
207
+ ### Classification Rules
208
+
209
+ | Category | Default Target | Rationale |
210
+ |----------|---------------|-----------|
211
+ | `finding` | wiki (note) | Observations go to knowledge graph |
212
+ | `decision` | wiki (spec) or spec (decision) | Architectural decisions → spec ADR or wiki spec entry |
213
+ | `pattern` | spec (pattern) | Reusable code patterns → coding conventions |
214
+ | `bug` | issue or spec (bug) | Active bugs → issue; fixed bugs → spec learnings |
215
+ | `risk` | issue | Unmitigated risks → trackable issues |
216
+ | `task` | issue | Unfinished work → trackable issues |
217
+ | `lesson` | wiki (lesson) | Generalizable insights → wiki knowledge |
218
+ | `recommendation` | wiki (note) or issue | Actionable recommendations → issue; informational → wiki |
219
+
220
+ ### Override with `--to`
221
+
222
+ - `--to wiki`: all fragments → wiki entries
222
+ - `--to spec`: all fragments → spec entries
223
+ - `--to issue`: all fragments → issue entries
224
+ - `--to auto`: use the classification rules above
226
+
227
+ ### Build routing plan
228
+
229
+ ```
230
+ routing_plan = {
231
+ wiki: [{ fragment, wiki_type, slug, title, tags, body }],
232
+ spec: [{ fragment, spec_type, content }],
233
+ issue: [{ fragment, title, severity, description }]
234
+ }
235
+ ```
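As a rough illustration of the classification rules and the `--to` override, the sketch below maps fragments to targets and groups them into a plan. Three things are assumptions, not documented behavior: the `fixed` and `actionable` flags used to split `bug` and `recommendation`, the choice of spec as the default for `decision`, and the `subtype` key, which stands in for `wiki_type` / `spec_type`.

```python
from typing import Optional, Tuple

# Categories with a single default target in the classification table.
DEFAULT_TARGET = {
    "finding": ("wiki", "note"),
    "pattern": ("spec", "pattern"),
    "risk": ("issue", None),
    "task": ("issue", None),
    "lesson": ("wiki", "lesson"),
}
TARGETS = ("wiki", "spec", "issue")


def classify(fragment: dict, forced_target: str = "auto") -> Tuple[str, Optional[str]]:
    """Return (target, subtype) for one fragment."""
    if forced_target != "auto":
        if forced_target not in TARGETS:
            raise ValueError(f"unknown target: {forced_target!r}")
        return (forced_target, None)

    category = fragment["category"]
    if category == "bug":
        # Hypothetical flag: fixed bugs become spec learnings, active ones issues.
        return ("spec", "bug") if fragment.get("fixed") else ("issue", None)
    if category == "recommendation":
        # Hypothetical flag: actionable items become issues, the rest wiki notes.
        return ("issue", None) if fragment.get("actionable") else ("wiki", "note")
    if category == "decision":
        # Assumption: prefer the spec entry when the table offers both options.
        return ("spec", "decision")
    return DEFAULT_TARGET[category]


def build_routing_plan(fragments, forced_target: str = "auto") -> dict:
    """Group fragments by target, mirroring the routing_plan shape loosely."""
    plan = {target: [] for target in TARGETS}
    for fragment in fragments:
        target, subtype = classify(fragment, forced_target)
        plan[target].append({"fragment": fragment, "subtype": subtype})
    return plan
```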
236
+
237
+ ---
238
+
239
+ ## Stage 5: preview_and_confirm
240
+
241
+ Display the routing plan:
242
+
243
+ ```
244
+ === HARVEST PLAN ===
245
+ Source: ANL-auth-20260410 (analysis)
246
+ Fragments extracted: 8 (filtered from 12 by confidence ≥ 0.5)
247
+
248
+ → Wiki (3 entries):
249
+ [note] "SQL injection vector in user input" tags: security, sql
250
+ [lesson] "Parameterized queries prevent injection" tags: security, pattern
251
+ [spec] "Auth token rotation policy" tags: auth, security
252
+
253
+ → Spec (2 entries):
254
+ [pattern] "Always use parameterized queries for user input"
255
+ [decision] "JWT refresh tokens over session cookies"
256
+
257
+ → Issue (3 entries):
258
+ [high] "Unvalidated redirect in OAuth callback"
259
+ [medium] "Missing rate limit on token refresh endpoint"
260
+ [low] "Inconsistent error messages leak internal state"
261
+
262
+ Total: 3 wiki + 2 spec + 3 issue = 8 routed items
263
+ ```
264
+
265
+ If `--dry-run`: display and exit.
266
+ If NOT `--dry-run` AND NOT `-y`:
267
+ AskUserQuestion: "Apply this routing plan? (yes/edit/skip)". On `yes`, continue to Stage 6; the other options:
268
+ - `edit`: re-display with per-item accept/reject
269
+ - `skip`: exit without writing
270
+
271
+ ---
272
+
273
+ ## Stage 6: route_outputs
274
+
275
+ Execute the routing plan. Each target uses existing infrastructure:
276
+
277
+ ### 6a. Wiki routing
278
+
279
+ For each wiki item:
280
+ ```bash
281
+ maestro wiki create --type <wiki_type> --slug harvest-<source_type>-<short_id> \
282
+ --title "<title>" --tags "<tags>" --body "<body>"
283
+ ```
284
+
285
+ Wiki types mapping:
286
+ - `note` → `--type note`
287
+ - `lesson` → `--type lesson`
288
+ - `spec` → `--type spec`
289
+
290
+ If `maestro wiki create` fails, fall back to writing `.workflow/harvest/wiki-pending-{id}.md` with frontmatter.
291
+
292
+ ### 6b. Spec routing
293
+
294
+ For each spec item, use the same mechanism as `quality-retrospective` Stage 6:
295
+
296
+ ```
297
+ Skill({ skill: "spec-add", args: "<spec_type> <content>" })
298
+ ```
299
+
300
+ Where `spec_type` maps from fragment category:
301
+ - `pattern` → `pattern`
302
+ - `decision` → `decision`
303
+ - `bug` → `bug`
304
+ - `lesson` → `rule` (if it prescribes a rule)
305
+
306
+ ### 6c. Issue routing
307
+
308
+ For each issue item, append to `.workflow/issues/issues.jsonl` using the canonical schema from `workflows/issue.md`:
309
+
310
+ ```json
311
+ {
312
+ "id": "ISS-{YYYYMMDD}-{NNN}",
313
+ "title": "<title>",
314
+ "description": "<description>",
315
+ "severity": "<high|medium|low>",
316
+ "status": "open",
317
+ "source": "harvest",
318
+ "source_ref": "<source_id>",
319
+ "tags": [],
320
+ "created_at": "<ISO timestamp>",
321
+ "issue_history": [{ "action": "created", "timestamp": "<ISO>", "by": "harvest", "detail": "Extracted from <source_type> <source_id>" }]
322
+ }
323
+ ```
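A minimal sketch of building one such record and appending it as a JSON Lines entry follows. How the `{NNN}` sequence number is allocated is not stated here, so it is passed in by the caller; `build_issue` and `append_issue` are hypothetical names for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

SEVERITIES = {"high", "medium", "low"}


def build_issue(title, description, severity, source_type, source_id,
                sequence, now=None):
    """Build an issue record in the shape shown above."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    now = now or datetime.now(timezone.utc)
    stamp = now.isoformat()
    return {
        "id": f"ISS-{now:%Y%m%d}-{sequence:03d}",
        "title": title,
        "description": description,
        "severity": severity,
        "status": "open",
        "source": "harvest",
        "source_ref": source_id,
        "tags": [],
        "created_at": stamp,
        "issue_history": [{
            "action": "created",
            "timestamp": stamp,
            "by": "harvest",
            "detail": f"Extracted from {source_type} {source_id}",
        }],
    }


def append_issue(path: Path, record: dict) -> None:
    """Append one record as a single line of JSON."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Writing one object per line keeps the file append-only, so earlier issues never need to be rewritten.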
324
+
325
+ ### 6d. Track harvest provenance
326
+
327
+ For each routed item, record in `.workflow/harvest/harvest-log.jsonl`:
328
+
329
+ ```json
330
+ {
331
+ "fragment_id": "HRV-...",
332
+ "source_type": "analysis",
333
+ "source_id": "ANL-auth-20260410",
334
+ "routed_to": "wiki|spec|issue",
335
+ "target_id": "note-harvest-analysis-abc123|ISS-20260413-001|...",
336
+ "timestamp": "<ISO>",
337
+ "title": "<title>",
338
+ "confidence": 0.85
339
+ }
340
+ ```
341
+
342
+ Stage 7 reads this log so that fragments routed in earlier runs are not harvested again.
343
+
344
+ ---
345
+
346
+ ## Stage 7: dedup_check
347
+
348
+ Despite its later stage number, this check runs before each write in Stage 6. For every item, check for duplicates:
349
+
350
+ 1. **harvest-log.jsonl**: Has this `fragment_id` already been routed?
351
+ 2. **Wiki**: `maestro wiki search "<title>"` — does a similar entry exist?
352
+ 3. **Issues**: Search `issues.jsonl` for matching title/description
353
+ 4. **Specs**: Search `learnings.md` for similar content
354
+
355
+ If duplicate found:
356
+ - Skip with `[SKIP-DUP]` marker
357
+ - Log to harvest report
358
+
359
+ ---
360
+
361
+ ## Stage 8: report
362
+
363
+ Write `.workflow/harvest/harvest-report-{date}.md`:
364
+
365
+ ```markdown
366
+ # Harvest Report — {date}
367
+
368
+ ## Source
369
+ - Type: {source_type}
370
+ - ID: {source_id}
371
+ - Path: {path}
372
+
373
+ ## Extraction Summary
374
+ - Fragments found: {total}
375
+ - Filtered by confidence: {filtered_count}
376
+ - Duplicates skipped: {dup_count}
377
+
378
+ ## Routing Results
379
+
380
+ ### Wiki ({N} entries)
381
+ | # | Type | Slug | Title | Status |
382
+ |---|------|------|-------|--------|
383
+ | 1 | note | harvest-analysis-abc | SQL injection vector | CREATED |
384
+ | 2 | lesson | harvest-analysis-def | Parameterized queries | CREATED |
385
+
386
+ ### Spec ({N} entries)
387
+ | # | Type | Content (truncated) | Status |
388
+ |---|------|---------------------|--------|
389
+ | 1 | pattern | Always use parameterized queries... | ADDED |
390
+
391
+ ### Issue ({N} entries)
392
+ | # | Severity | Title | ID | Status |
393
+ |---|----------|-------|-----|--------|
394
+ | 1 | high | Unvalidated redirect in OAuth... | ISS-20260413-001 | CREATED |
395
+
396
+ ## Skipped
397
+ | Fragment | Reason |
398
+ |----------|--------|
399
+ | HRV-abc12345 | Duplicate: existing wiki entry note-sql-injection |
400
+ ```
401
+
402
+ Display summary:
403
+
404
+ ```
405
+ === HARVEST COMPLETE ===
406
+ Source: ANL-auth-20260410 (analysis)
407
+
408
+ Wiki: 3 created, 0 skipped
409
+ Spec: 2 added, 0 skipped
410
+ Issue: 3 created, 1 skipped (dup)
411
+
412
+ Report: .workflow/harvest/harvest-report-2026-04-13.md
413
+ Log: .workflow/harvest/harvest-log.jsonl
414
+
415
+ Next:
416
+ → Review wiki entries: maestro wiki list --type note
417
+ → Triage issues: Skill({ skill: "manage-issue", args: "list --source harvest" })
418
+ → Connect wiki graph: Skill({ skill: "wiki-connect", args: "--fix" })
419
+ → View specs: Skill({ skill: "spec-load", args: "--category general" })
420
+ ```