all-hands-cli 0.1.5 → 0.1.7
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.allhands/flows/COMPOUNDING.md +3 -3
- package/.allhands/flows/EMERGENT_PLANNING.md +1 -1
- package/.allhands/flows/INITIATIVE_STEERING.md +0 -1
- package/.allhands/flows/PROMPT_TASK_EXECUTION.md +0 -1
- package/.allhands/flows/SPEC_PLANNING.md +1 -2
- package/.allhands/flows/harness/WRITING_HARNESS_FLOWS.md +1 -1
- package/.allhands/flows/harness/WRITING_HARNESS_KNOWLEDGE.md +1 -1
- package/.allhands/flows/harness/WRITING_HARNESS_ORCHESTRATION.md +1 -1
- package/.allhands/flows/harness/WRITING_HARNESS_SKILLS.md +1 -1
- package/.allhands/flows/harness/WRITING_HARNESS_TOOLS.md +1 -1
- package/.allhands/flows/harness/WRITING_HARNESS_VALIDATION_TOOLING.md +1 -1
- package/.allhands/flows/shared/CODEBASE_UNDERSTANDING.md +2 -3
- package/.allhands/flows/shared/CREATE_VALIDATION_TOOLING_SPEC.md +1 -1
- package/.allhands/flows/shared/PLAN_DEEPENING.md +2 -3
- package/.allhands/flows/shared/PROMPT_TASKS_CURATION.md +3 -5
- package/.allhands/flows/shared/WRITING_HARNESS_FLOWS.md +1 -1
- package/.allhands/flows/shared/jury/BEST_PRACTICES_REVIEW.md +4 -4
- package/.allhands/harness/src/cli.ts +4 -0
- package/.allhands/harness/src/commands/knowledge.ts +8 -5
- package/.allhands/harness/src/commands/skills.ts +299 -16
- package/.allhands/harness/src/commands/solutions.ts +227 -111
- package/.allhands/harness/src/commands/spawn.ts +6 -13
- package/.allhands/harness/src/hooks/shared.ts +1 -0
- package/.allhands/harness/src/lib/opencode/index.ts +65 -0
- package/.allhands/harness/src/lib/opencode/prompts/skills-aggregator.md +77 -0
- package/.allhands/harness/src/lib/opencode/prompts/solutions-aggregator.md +97 -0
- package/.allhands/harness/src/lib/opencode/runner.ts +98 -5
- package/.allhands/settings.json +2 -1
- package/.allhands/skills/harness-maintenance/SKILL.md +1 -1
- package/.allhands/skills/harness-maintenance/references/harness-skills.md +1 -1
- package/.allhands/skills/harness-maintenance/references/knowledge-compounding.md +5 -10
- package/.allhands/skills/harness-maintenance/references/validation-tooling.md +10 -1
- package/.allhands/skills/harness-maintenance/references/writing-flows.md +1 -1
- package/CLAUDE.md +1 -1
- package/docs/flows/compounding.md +2 -2
- package/docs/flows/plan-deepening-and-research.md +2 -2
- package/docs/flows/validation-and-skills-integration.md +14 -8
- package/docs/flows/wip/wip-flows.md +1 -1
- package/docs/harness/cli/search-commands.md +9 -19
- package/docs/memories.md +1 -1
- package/package.json +1 -1
- package/specs/workflow-domain-configuration.spec.md +2 -2
- package/.allhands/flows/shared/SKILL_EXTRACTION.md +0 -84
- package/.allhands/harness/src/commands/memories.ts +0 -302
package/.allhands/harness/src/lib/opencode/prompts/solutions-aggregator.md ADDED

````diff
@@ -0,0 +1,97 @@
+# Solutions Aggregator
+
+You synthesize documented solutions and project memories into task-relevant guidance. The caller needs actionable knowledge from past learnings -- not a catalog of files.
+
+## Core Principle
+
+Extract what matters for the task. Every piece of guidance must be grounded in solution content or memory entries:
+- BAD: "Consider checking for similar issues in the codebase..."
+- GOOD: "The solution `unix-socket-path-length-hooks-20250115.md` documents that socket paths exceeding 104 chars cause ENOENT -- use path hashing as described in `docs/solutions/infrastructure/unix-socket-path-length-hooks-20250115.md`"
+
+## Input Format
+
+You receive JSON with:
+1. `query`: The user's task description or question
+2. `solutions`: Array of matched solutions, each containing:
+   - `title`: Solution title from frontmatter
+   - `path`: Relative file path
+   - `severity`: Issue severity
+   - `problem_type`: Category of problem
+   - `component`: Affected component
+   - `tags`: Search tags
+   - `content`: Full solution body (without frontmatter)
+3. `memories`: Array of all memory entries from `docs/memories.md`, each containing:
+   - `name`: Memory identifier
+   - `domain`: One of planning, validation, implementation, harness-tooling, ideation
+   - `source`: user-steering or agent-inferred
+   - `description`: Learning description
+
+## Expansion Protocol
+
+Need content from a referenced file (e.g., a linked spec or solution)? Output:
+```
+EXPAND: <file_path>
+```
+
+You'll receive the content. Max 3 expansions. Only expand if the file path suggests direct relevance to the query.
+
+## Output Format
+
+Return ONLY valid JSON:
+
+```json
+{
+  "guidance": "Task-relevant synthesis: what patterns to follow, what to avoid, key learnings. Ground every statement in solution content or memory entries. 3-6 sentences max.",
+  "relevant_solutions": [
+    {
+      "title": "Solution title",
+      "file": "docs/solutions/category/filename.md",
+      "relevance": "Why this solution matters for the query",
+      "key_excerpts": ["Specific actionable insights extracted from the solution"],
+      "related_memories": ["Memory names that relate to this solution"]
+    }
+  ],
+  "memory_insights": [
+    {
+      "name": "Memory name",
+      "domain": "domain",
+      "source": "source",
+      "relevance": "Why this memory matters for the query"
+    }
+  ],
+  "design_notes": ["Architectural constraints or patterns from solutions that affect the task"]
+}
+```
+
+## Field Guidelines
+
+**guidance**:
+- Synthesize across matched solutions and relevant memories into coherent task guidance
+- Include specific patterns, workarounds, and constraints
+- Mention file paths and memory names for attribution
+- If solutions encode anti-patterns, state them as warnings
+
+**relevant_solutions** (ranked by relevance to query):
+- `title`: Solution title from frontmatter
+- `file`: Path to solution file
+- `relevance`: One sentence -- why does this solution matter for the task?
+- `key_excerpts`: 1-4 specific, actionable insights extracted verbatim or closely paraphrased from the solution
+- `related_memories`: Memory names that provide additional context for this solution (may be empty)
+
+**memory_insights** (only include relevant memories):
+- `name`: Memory identifier
+- `domain`: Memory domain
+- `source`: user-steering or agent-inferred
+- `relevance`: One sentence -- why does this memory matter for the query?
+
+**design_notes** (optional, max 3):
+- Only include if solutions or memories explicitly discuss design rationale or constraints
+- Format: "[Constraint]: [Detail]" e.g. "Socket path limit: Unix domain sockets have a 104-char path limit on macOS"
+
+## Anti-patterns
+
+- Generic advice not grounded in solution or memory content
+- Copying entire solution files instead of extracting task-relevant parts
+- Including solutions or memories that aren't relevant to the query
+- Restating the query as guidance
+- Listing every memory entry regardless of relevance
````
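The Expansion Protocol in the new aggregator prompt implies a small driver loop on the caller side. A minimal sketch, assuming hypothetical `send`/`readFile` callbacks and message framing (the actual loop lives in `runner.ts`, not in this form):

```typescript
const EXPAND_LIMIT = 3; // "Max 3 expansions" per the protocol above

// Detect an "EXPAND: <file_path>" request in the aggregator's reply.
function parseExpandRequest(reply: string): string | null {
  const match = reply.match(/^EXPAND:\s*(.+)$/m);
  return match ? match[1].trim() : null;
}

// Drive the loop: feed expanded file content back until the agent
// answers with final JSON or the expansion budget is exhausted.
async function runWithExpansions(
  send: (message: string) => Promise<string>,
  readFile: (path: string) => Promise<string>,
  initialMessage: string,
): Promise<string> {
  let reply = await send(initialMessage);
  for (let i = 0; i < EXPAND_LIMIT; i++) {
    const path = parseExpandRequest(reply);
    if (path === null) break; // no expansion requested -> final answer
    reply = await send(`Content of ${path}:\n${await readFile(path)}`);
  }
  return reply;
}
```

Capping the loop rather than trusting the model to stop is what keeps a confused aggregator from reading files indefinitely.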
package/.allhands/harness/src/lib/opencode/runner.ts CHANGED

````diff
@@ -7,6 +7,7 @@ import { createOpencode } from "@opencode-ai/sdk";
 import { existsSync, readFileSync, statSync } from "fs";
 import { join } from "path";
 import { logCommandStart, logCommandSuccess, logCommandError } from "../trace-store.js";
+import { sendNotification } from "../notification.js";
 import type { AgentConfig, AgentResult } from "./index.js";
 import { loadProjectSettings } from "../../hooks/shared.js";
 
@@ -17,6 +18,7 @@ const DEFAULT_TIMEOUT_MS = 60000;
 // Model from settings, undefined means use opencode default
 const settings = loadProjectSettings();
 const SETTINGS_AGENT_MODEL = settings?.opencodeSdk?.model?.trim() || undefined;
+const SETTINGS_FALLBACK_MODEL = settings?.opencodeSdk?.fallbackModel?.trim() || undefined;
 
 export class AgentRunner {
   private readonly projectRoot: string;
@@ -27,12 +29,99 @@ export class AgentRunner {
 
   /**
    * Execute an agent with given config and user message.
-   *
+   * Tries the primary model first, then falls back to the fallback model if configured.
    */
   async run<T>(config: AgentConfig, userMessage: string): Promise<AgentResult<T>> {
+    const runStart = Date.now();
+    const primaryModel = config.model ?? SETTINGS_AGENT_MODEL;
+    const fallbackModel = SETTINGS_FALLBACK_MODEL;
+
+    const primaryResult = await this.executeWithModel<T>(config, userMessage, primaryModel);
+
+    if (primaryResult.success) {
+      logCommandSuccess("opencode.agent.run.complete", {
+        agent: config.name,
+        model: primaryModel ?? "opencode-default",
+        outcome: "primary_success",
+        time_taken: Date.now() - runStart,
+      });
+      return primaryResult;
+    }
+
+    // Primary failed — attempt fallback
+    if (fallbackModel && fallbackModel !== primaryModel) {
+      sendNotification({
+        title: "Model Fallback",
+        message: `Primary model ${primaryModel ?? "opencode-default"} failed for ${config.name}: ${primaryResult.error}. Retrying with ${fallbackModel}...`,
+        type: "alert",
+      });
+
+      logCommandStart("opencode.agent.run.fallback", {
+        agent: config.name,
+        primaryModel: primaryModel ?? "opencode-default",
+        fallbackModel,
+        primaryError: primaryResult.error,
+      });
+
+      const fallbackResult = await this.executeWithModel<T>(config, userMessage, fallbackModel);
+
+      if (fallbackResult.success) {
+        const result = {
+          ...fallbackResult,
+          metadata: {
+            ...fallbackResult.metadata!,
+            fallback: true,
+            primary_error: primaryResult.error,
+          },
+        };
+        logCommandSuccess("opencode.agent.run.complete", {
+          agent: config.name,
+          model: fallbackModel,
+          outcome: "fallback_success",
+          time_taken: Date.now() - runStart,
+        });
+        return result;
+      }
+
+      // Both failed
+      sendNotification({
+        title: "Model Failure",
+        message: `Both ${primaryModel ?? "opencode-default"} and fallback ${fallbackModel} failed for ${config.name}.`,
+        type: "alert",
+        sound: "Basso",
+      });
+
+      logCommandError("opencode.agent.run.complete", fallbackResult.error ?? "unknown", {
+        agent: config.name,
+        model: fallbackModel,
+        outcome: "both_failed",
+        time_taken: Date.now() - runStart,
+      });
+      return fallbackResult;
+    }
+
+    // No fallback configured
+    sendNotification({
+      title: "Model Failure",
+      message: `Model ${primaryModel ?? "opencode-default"} failed for ${config.name}. No fallback configured.`,
+      type: "alert",
+      sound: "Basso",
+    });
+
+    logCommandError("opencode.agent.run.complete", primaryResult.error ?? "unknown", {
+      agent: config.name,
+      model: primaryModel ?? "opencode-default",
+      outcome: "no_fallback",
+      time_taken: Date.now() - runStart,
+    });
+    return primaryResult;
+  }
+
+  /**
+   * Core execution: spawn server, create session, send prompts, handle expansions, parse JSON.
+   */
+  private async executeWithModel<T>(config: AgentConfig, userMessage: string, model: string | undefined): Promise<AgentResult<T>> {
     const startTime = Date.now();
-    // Use config.model if specified, else settings AGENT_MODEL, else opencode default
-    const model = config.model ?? SETTINGS_AGENT_MODEL;
     logCommandStart("opencode.agent.run", {
       agent: config.name,
       model: model ?? "opencode-default",
@@ -126,12 +215,14 @@ export class AgentRunner {
       throw new Error("Failed to parse agent response as JSON");
     }
 
+    const retryDurationMs = Date.now() - startTime;
     logCommandSuccess("opencode.agent.run", {
      agent: config.name,
       expansions: expansionCount,
       retry: true,
       model: model ?? "opencode-default",
-      duration_ms:
+      duration_ms: retryDurationMs,
+      time_taken: retryDurationMs,
     });
 
     return {
@@ -139,7 +230,7 @@ export class AgentRunner {
       data: retryParsed,
       metadata: {
        model: model ?? "opencode-default",
-        duration_ms:
+        duration_ms: retryDurationMs,
      },
     };
   }
@@ -149,6 +240,7 @@ export class AgentRunner {
       expansions: expansionCount,
       model: model ?? "opencode-default",
       duration_ms: durationMs,
+      time_taken: durationMs,
     });
 
     return {
@@ -167,6 +259,7 @@ export class AgentRunner {
       agent: config.name,
       model: model ?? "opencode-default",
       duration_ms: durationMs,
+      time_taken: durationMs,
     });
 
     return {
````
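Stripped of logging and notifications, the control flow the diff adds to `run` reduces to a primary-then-fallback pattern. A simplified synchronous sketch; the `Result` shape and `attempt` callback are illustrative assumptions, and the real method is async:

```typescript
interface Result {
  success: boolean;
  model?: string;
  fallback?: boolean;
  error?: string;
}

// Try the primary model; retry once on a distinct configured fallback.
function runWithFallback(
  attempt: (model: string) => Result,
  primary: string,
  fallback?: string,
): Result {
  const primaryResult = attempt(primary);
  if (primaryResult.success) return primaryResult;

  if (fallback && fallback !== primary) {
    const fallbackResult = attempt(fallback);
    if (fallbackResult.success) {
      // Mark the result so callers can see a fallback occurred.
      return { ...fallbackResult, fallback: true };
    }
    return fallbackResult; // both failed
  }
  return primaryResult; // no fallback configured
}
```

Note the guard `fallback !== primary`: retrying the same model that just failed would only double the latency of a failure.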
package/.allhands/settings.json CHANGED

````diff
@@ -335,7 +335,7 @@ Per **Context is Precious**, agents only see what they need when they need it.
 
 ### Flow Referencing
 ```markdown
-- Read `.allhands/flows/shared/
+- Read `.allhands/flows/shared/UTILIZE_VALIDATION_TOOLING.md` and follow its instructions
 ```
 
 ### Inputs/Outputs Pattern
````
````diff
@@ -35,7 +35,7 @@ Skills are discovered via glob matching against the files an agent is working on
 3. Matching skill(s) are surfaced to the agent
 4. Agent reads `SKILL.md` hub for routing context
 
-
+Search skills: `ah skills search "<query>" --paths <files>`
 
 ## Hub-and-Spoke Pattern
 
````
````diff
@@ -20,17 +20,12 @@ Run `ah schema <type> body` to see the body format (not just frontmatter).
 
 ## Knowledge Indexes
 
-### Solutions (`docs/solutions/`)
-Reusable patterns discovered during work. Searchable by future agents:
-- `ah solutions search "<keywords>"` — Find relevant past solutions
+### Solutions (`docs/solutions/`) + Memories (`docs/memories.md`)
+Reusable patterns and lightweight learnings discovered during work. Searchable by future agents:
+- `ah solutions search "<keywords>"` — Find relevant past solutions with memory context
 - Solutions are created when an agent discovers a reusable pattern worth preserving
--
-
-### Memories (`ah memories`)
-Agent learnings and engineer preferences that persist across sessions:
-- `ah memories search "<keywords>"` — Find relevant learnings
-- Captures: debugging insights, preference decisions, architectural rationale
-- Per **Knowledge Compounding**, memories prevent repeated mistakes
+- Memories capture debugging insights, preference decisions, architectural rationale
+- Per **Knowledge Compounding**, solutions and memories prevent re-discovery of known patterns
 
 ### Knowledge Docs
 Codebase knowledge indexed for semantic search:
````
````diff
@@ -42,10 +42,19 @@ Per **Frontier Models are Capable** and **Context is Precious**:
 - **`--help` as prerequisite**: Suites MUST instruct agents to pull `<tool> --help` before any exploration — command vocabulary shapes exploration quality. The suite MUST NOT replicate full command docs.
 - **Inline command examples**: Weave brief examples into use-case motivations as calibration anchors — not exhaustive catalogs, not separated command reference sections.
 - **Motivation framing**: Frame around harness value: reducing human-in-loop supervision, verifying code quality, confirming implementation matches expectations.
-- **Exploration categories**: Describe with enough command specificity to orient,
+- **Exploration categories**: Describe with enough command specificity to orient. For untested territory, prefer motivations over prescriptive sequences — the agent extrapolates better from goals than rigid steps. For patterns verified through testing, state them authoritatively (see below).
 
 Formula: **motivations backed by inline command examples + `--help` as prerequisite and progressive disclosure**. Commands woven into use cases give direction; `--help` reveals depth.
 
+### Proven vs Untested Guidance
+
+Validation suites should be grounded in hands-on testing against the actual repo, not theoretical instructions. The level of authority in how guidance is written depends on whether it has been verified:
+
+- **Proven patterns** (verified via the Tool Validation Phase): State authoritatively within use-case motivations — the pattern is established fact, not a suggestion. These override generic tool documentation when they conflict. Example: "`xctrace` requires `--device '<UDID>'` for simulator" is a hard requirement discovered through testing, stated directly alongside the motivation (why: `xctrace` can't find simulator processes without it). The motivation formula still applies — proven patterns are *authoritative examples within motivations*, not raw command catalogs.
+- **Untested edge cases** (not yet exercised in this repo): Define the **motivation** (what the agent should achieve and why) and reference **analogous solved examples** from proven patterns. Do NOT write prescriptive step-by-step instructions for scenarios that haven't been verified — unverified prescriptions can mislead the agent into rigid sequences that don't match reality. Instead, trust that a frontier model given clear motivation and a reference example of how a similar problem was solved will extrapolate the correct approach through stochastic exploration.
+
+**Why this matters**: Frontier models produce emergent, adaptive behavior when given goals and reference points. Unverified prescriptive instructions constrain this emergence and risk encoding incorrect assumptions. Motivation + examples activate the model's reasoning about the problem space; rigid untested instructions bypass it. The Tool Validation Phase exists to convert untested guidance into proven patterns over time — the crystallization lifecycle in action.
+
 ### Evidence Capture
 
 Per **Quality Engineering**, two audiences require different artifacts:
````
````diff
@@ -48,7 +48,7 @@ Per **Knowledge Compounding**:
 
 ### Progressive Disclosure Pattern
 ```markdown
-- Read `.allhands/flows/shared/
+- Read `.allhands/flows/shared/UTILIZE_VALIDATION_TOOLING.md` and follow its instructions
 ```
 
 Sub-flows use `<inputs>` and `<outputs>` tags for execution-agnostic subtasks. This decouples the flow from its caller — any agent can execute it given the right inputs.
````
package/CLAUDE.md CHANGED

````diff
@@ -2,5 +2,5 @@
 
 When debugging .allhands , use the ah trace --help command for the tools to investigate the trace entries.
 
-When modifying ANY `.allhands/` files, discover the `harness-maintenance` skill via `ah skills
+When modifying ANY `.allhands/` files, discover the `harness-maintenance` skill via `ah skills search` and follow its routing table for domain-specific guidance.
 
````
````diff
@@ -79,7 +79,7 @@ The flow produces three distinct knowledge artifacts:
 
 | Artifact | Location | Purpose |
 |----------|----------|---------|
-| Memories | `docs/memories.md` | Lightweight learnings searchable via `ah
+| Memories | `docs/memories.md` | Lightweight learnings searchable via `ah solutions search` |
 | Solutions | `docs/solutions/<category>/` | Detailed problem-solution documentation for non-trivial issues |
 | Spec Finalization | `.planning/<spec>/spec.md` | Historical record with implementation reality vs. original plan |
 
````
````diff
@@ -108,7 +108,7 @@ flowchart TD
     D -->|Defer| F
 ```
 
-Inline updates (skills, validation suites) require engineer approval. Structural changes always go through a spec. Deferred items are documented in `docs/memories.md` under "Deferred Harness Improvements.
+Inline updates (skills, validation suites) require engineer approval. Structural changes always go through a spec. Deferred items are documented in `docs/memories.md` under "Deferred Harness Improvements" (searchable via `ah solutions search`).
 
 ### Crystallization Promotion
 
````
````diff
@@ -36,7 +36,7 @@ This flow governs how agents explore the codebase without consuming excessive co
 | 1st | `ah knowledge docs search` | Any discovery task -- returns engineered knowledge with "why" context |
 | 2nd | `tldr semantic search` / grep | Knowledge search insufficient, need code-level patterns |
 | 3rd | LSP | Known symbol name from knowledge search results |
-| 4th | `ah solutions search`
+| 4th | `ah solutions search` | Similar problem solved before, engineer preferences, or prior insights |
 | 5th | `ast-grep` | Structured code pattern matching as last resort |
 
 Knowledge search results include `insight` (engineering knowledge), `lsp_entry_points` (files with exploration rationale), and `design_notes` (architectural decisions). This is richer than raw file reads and costs fewer tokens.
````
````diff
@@ -86,7 +86,7 @@ flowchart LR
 Each axis runs in parallel:
 
 - **Skill application**: Matches available skills to plan domains, extracts patterns and gotchas
-- **Solutions search**: Checks `ah solutions search`
+- **Solutions search**: Checks `ah solutions search` for relevant past learnings (includes memory context)
 - **Codebase patterns**: Discovers existing implementations of similar patterns via `CODEBASE_UNDERSTANDING.md`
 - **External research**: For novel technologies or high-risk domains via `RESEARCH_GUIDANCE.md`
 
````
````diff
@@ -88,17 +88,23 @@ Stochastic exploration during implementation is not ordered -- agents follow mod
 
 ## Skill Extraction
 
-
-
-This flow finds and distills domain expertise from skill files into actionable prompt guidance. Per **Knowledge Compounding**, skills are "how to do it right" -- expertise that compounds across prompts.
+This is handled by `ah skills search`, which finds and distills domain expertise from skill files into actionable prompt guidance. Per **Knowledge Compounding**, skills are "how to do it right" -- expertise that compounds across prompts.
 
 ### Extraction Pipeline
 
-
-
-
-
-
+```bash
+ah skills search "<task_description>" --paths <files_being_touched...>
+```
+
+Internally, the command performs:
+
+1. **Discover**: Lists all available skills from `.allhands/skills/*/SKILL.md`
+2. **Match**: Keyword scoring (name, description, globs) + path boosting via `--paths` glob matching
+3. **Read**: Reads SKILL.md body content and discovers reference docs for matched skills
+4. **Synthesize**: AI aggregator subagent distills task-relevant knowledge with source attribution
+5. **Return**: Structured JSON with `guidance`, `relevant_skills` (excerpts + references), and `design_notes`
+
+Add skill paths (from `relevant_skills[].file`) to `skills` frontmatter and embed `guidance` in prompt Tasks section.
 
 ### Key Distinction from Validation
 
````
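The Match step's `--paths` boosting can be pictured with a toy glob matcher. A sketch under stated assumptions: the boost value and the `*`/`**` handling are invented for illustration, not taken from the actual `skills.ts` implementation:

```typescript
// Convert a simple glob (supporting * and **) into a RegExp.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000")           // placeholder so * and ** don't collide
    .replace(/\*/g, "[^/]*")              // * stays within one path segment
    .replace(/\u0000/g, ".*");            // ** crosses segment boundaries
  return new RegExp(`^${pattern}$`);
}

// Hypothetical boost: a skill whose globs match any touched file scores higher,
// so skills tied to the files being edited outrank purely keyword matches.
function pathBoost(globs: string[], paths: string[], boost = 5): number {
  const regexps = globs.map(globToRegExp);
  return paths.some((p) => regexps.some((r) => r.test(p))) ? boost : 0;
}
```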
````diff
@@ -99,4 +99,4 @@ Memories are prioritized by domain match, keyword overlap, and source authority
 | Ideation Session | Similar initiatives, prior engineer preferences |
 | Compounding | Verify memory doesn't already exist before adding |
 
-For detailed technical solutions beyond lightweight memories, the recall flow also directs callers to `ah solutions search
+For detailed technical solutions beyond lightweight memories, the recall flow also directs callers to `ah solutions search` (which now includes memory context in its aggregation).
````
````diff
@@ -1,10 +1,10 @@
 ---
-description: "Knowledge retrieval commands for agents: local solution
+description: "Knowledge retrieval commands for agents: local solution search (with memory aggregation) via keyword scoring and AI synthesis, plus external web research through Perplexity, Tavily, and Context7 integrations."
 ---
 
 ## Intent
 
-Agents need to retrieve knowledge from two distinct sources: **local project knowledge** (solutions
+Agents need to retrieve knowledge from two distinct sources: **local project knowledge** (solutions + memories) and **external web knowledge** (documentation, research). The search commands provide a unified CLI surface for both, with consistent JSON output that agents can parse. The local search commands use weighted keyword scoring against structured frontmatter rather than full-text search, trading recall for precision and avoiding embedding infrastructure. Solutions search includes AI aggregation that synthesizes matched solutions with memory context from `docs/memories.md`.
 
 ## Local Knowledge Search
 
````
````diff
@@ -25,29 +25,19 @@ Scoring weights determine field importance:
 
 [ref:.allhands/harness/src/commands/solutions.ts:scoreSolution:19e47dd] computes a cumulative score per keyword across all fields. [ref:.allhands/harness/src/commands/solutions.ts:extractKeywords:19e47dd] handles quoted phrases and whitespace splitting, allowing queries like `"tmux session" timeout`.
 
-###
+### Memory Integration
 
-
+Solutions search automatically loads all memory entries from `docs/memories.md` and includes them in the AI aggregation context. This means a single `ah solutions search` call returns synthesized guidance from both solution files and memory entries, with `memory_insights` in the output identifying relevant memories. Use `--no-aggregate` to get raw keyword matches without memory context.
 
-
+### Search Design
 
-
-|-------|--------|
-| name | 3 |
-| description | 2 |
-| domain | 2 |
-| source | 1 |
-
-Memories support additional filtering by `--domain` and `--source` before scoring, enabling queries like `ah memories search "hook" --domain planning`.
-
-### Shared Search Design
-
-Both local search commands share these characteristics:
+Solutions search has these characteristics:
 - Keyword extraction with quoted phrase support ([ref:.allhands/harness/src/commands/solutions.ts:extractKeywords:19e47dd])
 - Cumulative scoring across multiple fields
 - Results sorted by score descending, truncated to `--limit`
--
--
+- AI aggregation via solutions-aggregator agent (with graceful degradation to raw matches on failure)
+- JSON output with synthesized `guidance`, `relevant_solutions`, `memory_insights`, and `design_notes`
+- `--no-aggregate` flag for raw keyword matches with `matchedFields` array
 
 ## External Web Research
 
````
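The cumulative keyword scoring described above can be illustrated with a toy version. A sketch under stated assumptions; the field names and weights below are invented for the example rather than copied from `solutions.ts`:

```typescript
// Split a query into keywords, keeping quoted phrases intact,
// so '"tmux session" timeout' yields ["tmux session", "timeout"].
function extractKeywords(query: string): string[] {
  const keywords: string[] = [];
  const re = /"([^"]+)"|(\S+)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(query)) !== null) {
    keywords.push((m[1] ?? m[2] ?? "").toLowerCase());
  }
  return keywords;
}

// Cumulative score: every keyword that appears in a field adds that field's weight.
function scoreSolution(
  fields: Record<string, string>,
  weights: Record<string, number>,
  query: string,
): number {
  let score = 0;
  for (const kw of extractKeywords(query)) {
    for (const [field, text] of Object.entries(fields)) {
      if (text.toLowerCase().includes(kw)) score += weights[field] ?? 1;
    }
  }
  return score;
}
```

Because scores accumulate per keyword and per field, a solution matching the same keyword in both its title and its tags outranks one that matches in tags alone.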
package/docs/memories.md CHANGED

````diff
@@ -1,5 +1,5 @@
 ---
-description: "Lightweight learnings from past sessions, searchable via `ah
+description: "Lightweight learnings from past sessions, searchable via `ah solutions search` (included as memory context in aggregation). Captures technical patterns, engineer preferences, and harness behavior discoveries."
 ---
 
 # Memories
````
package/package.json CHANGED

````diff
@@ -63,7 +63,7 @@ Engineer expects the existing milestone workflow behavior to be fully preserved
 - Review options breakdown after jury with actionable options for engineer
 - Plan deepening option for complex/high-risk specs
 - Alignment doc decision recording: only deviations from recommendations (what was recommended, what was chosen, stated reasoning)
-- Solutions search
+- Solutions search (includes memory context) during context gathering
 - Prompt output range: 5-15 coordinated prompts for milestone, 0-3 seed prompts for exploratory
 
 The milestone workflow domain config encodes the domain-specific knowledge (what to explore, what to consider, what to check). The unified flows preserve the orchestration logic (how to interview, how to spawn subtasks, how to sequence phases). The abstraction must not lose any of these practices — they are the result of iterative refinement and represent proven milestone development patterns.
````
|
@@ -114,7 +114,7 @@ For milestone domains, the spec planning flow must preserve the full planning pi
|
|
|
114
114
|
- Plan verification self-check before jury (requirement coverage, task completeness, key links, scope sanity, validation coverage)
|
|
115
115
|
- 4-member jury review (expectations fit, flow analysis, YAGNI, premortem) with review options breakdown
|
|
116
116
|
- Plan deepening option for complex/high-risk specs
|
|
117
|
-
- Solutions
|
|
117
|
+
- Solutions search (includes memory context) during context gathering
|
|
118
118
|
- Decision recording: only deviations from recommendations
|
|
119
119
|
|
|
120
120
|
The domain config determines which of these phases activate. Milestone activates all of them. Exploratory domains activate a subset (focused research, open question interview, seed prompt creation, no jury, no variants).
|
|
package/.allhands/flows/shared/SKILL_EXTRACTION.md REMOVED

````diff
@@ -1,84 +0,0 @@
-<goal>
-Find and extract domain expertise from skills to embed in prompt instructions. Per **Knowledge Compounding**, skills are "how to do it right" - expertise that compounds across prompts.
-</goal>
-
-<inputs>
-- Files/domains involved in the implementation task
-- Nature of the changes (UI, native code, deployment, etc.)
-</inputs>
-
-<outputs>
-- Extracted knowledge distilled for prompt embedding
-- Sources consulted (skill file paths)
-</outputs>
-
-<constraints>
-- MUST run `ah skills list` to discover available skills
-- MUST match skills via both glob patterns AND description inference
-- MUST extract task-relevant knowledge, not copy entire skill files
-- MUST list sources consulted in output
-</constraints>
-
-## Step 1: Discover Available Skills
-
-- Run `ah skills list`
-- Returns JSON with: `name`, `description`, `globs`, `file` path
-
-## Step 2: Identify Relevant Skills
-
-Match skills using two approaches:
-
-**Glob pattern matching** (programmatic):
-- Compare files you're touching against each skill's `globs`
-- Skills with matching patterns are likely relevant
-
-**Description inference** (semantic):
-- Read skill descriptions
-- Match against task nature (UI, deployment, native modules, etc.)
-
-Select all skills that apply to implementation scope.
-
-## Step 3: Read Skill Documentation
-
-For each relevant skill, read the full file:
-- Read `.allhands/skills/<skill-name>/SKILL.md`
-
-Extract:
-- **Key patterns**: Code patterns, library preferences, common pitfalls
-- **Best practices**: Guidelines specific to this domain
-- **References**: Sub-documents within the skill folder
-
-## Step 4: Extract Knowledge for Prompt
-
-Synthesize skill content into actionable prompt guidance:
-- Distill key instructions
-- Include specific examples where relevant
-- Reference sources
-- Avoid duplication - extract what's task-relevant
-
-## Step 5: Output with Sources
-
-Provide:
-
-```
-## Skill-Derived Guidance
-
-### From building-expo-ui:
-- Use `<Link.Preview>` for context menus
-- Prefer `contentInsetAdjustmentBehavior="automatic"` over SafeAreaView
-
-### From react-native-best-practices:
-- Profile with React DevTools before optimizing
-- Use FlashList for lists with >50 items
-
-## Sources Consulted
-- .allhands/skills/building-expo-ui/SKILL.md
-- .allhands/skills/react-native-best-practices/SKILL.md
-```
-
-## For Prompt Curation
-
-When used via PROMPT_TASKS_CURATION:
-- Add skill file paths to prompt's `skills` frontmatter
-- Embed extracted guidance in prompt's Tasks section
-- Makes domain expertise explicit and immediately available to executors
````