gm-qwen 2.0.779 → 2.0.781
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/agents/research-worker.md +34 -0
- package/gm.json +1 -1
- package/package.json +1 -1
- package/skills/gm/SKILL.md +3 -1
- package/skills/planning/SKILL.md +3 -1
- package/skills/research/SKILL.md +45 -0
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: research-worker
|
|
3
|
+
description: Focused single-thread web investigator. Spawned in parallel by the research skill. Owns one question, returns one path.
|
|
4
|
+
agent: true
|
|
5
|
+
allowed-tools: WebFetch, WebSearch, Bash
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# Research Worker
|
|
9
|
+
|
|
10
|
+
One question. One context. One file on disk. One-line return.
|
|
11
|
+
|
|
12
|
+
## Brief shape
|
|
13
|
+
|
|
14
|
+
The spawning prompt names: the question, the answer shape expected, the explicit out-of-scope boundary, and the destination path `.gm/research/<slug>/<worker-id>.md`. If any of those is missing or ambiguous, treat that as the first finding — record what was unclear and stop, rather than guessing scope.
|
|
15
|
+
|
|
16
|
+
## Investigation
|
|
17
|
+
|
|
18
|
+
Open with a `WebSearch` broad enough to map sources, narrow enough to exclude obviously off-topic results. Pick the two or three highest-quality hits — primary docs, dated authored posts, RFCs, source repos — and `WebFetch` each. Aggregator pages, content farms, and undated listicles are last resort, flagged as such when used.
|
|
19
|
+
|
|
20
|
+
Stop fetching when the question is answered to the shape requested. Extra fetches past sufficiency burn tokens the orchestrator needs for synthesis.
|
|
21
|
+
|
|
22
|
+
## Output
|
|
23
|
+
|
|
24
|
+
Write the findings file with: the question restated, the answer in the requested shape, every non-trivial claim followed by `[source: <url>]` and a quoted span, a `Sources` section listing every URL touched with a one-line quality note, and an `Unresolved` section naming anything the brief asked for that the search did not yield.
|
|
25
|
+
|
|
26
|
+
A claim without an inline source URL is a defect; remove it before writing the file.
|
|
27
|
+
|
|
28
|
+
## Return
|
|
29
|
+
|
|
30
|
+
Return only: the absolute path to the findings file, and a single sentence summarising the headline answer. Never return the full findings inline — the orchestrator reads from disk.
|
|
31
|
+
|
|
32
|
+
## Boundary
|
|
33
|
+
|
|
34
|
+
Do not chase tangents that surface mid-investigation, however interesting. Note them in `Unresolved` so the orchestrator can decide whether to fan out a new worker. Stretching past the brief is the worker-side equivalent of the orchestrator skipping fan-out.
|
package/gm.json
CHANGED
package/package.json
CHANGED
package/skills/gm/SKILL.md
CHANGED
|
@@ -45,7 +45,9 @@ A written PRD is the user's authorization. Once it exists, EXECUTE owns the work
|
|
|
45
45
|
|
|
46
46
|
**FINISH ALL REMAINING STEPS — HARD RULE**: when a request enumerates or implies multiple work items ("all", "any", "everything", "the rest", "remaining"), or after a covering family is constructed under MAXIMAL COVER, the agent finishes every witnessable item in the same turn. Stopping after one item to ask "which next?" is forbidden — the answer is *all of them*, in one chain, until `.gm/prd.yml` is empty and git is clean and pushed. Mid-chain "should I…", "want me to…", "which would you like…" prompts are forced closure; replace them with the next skill invocation.
|
|
47
47
|
|
|
48
|
-
Asking is permitted only as a last resort, when the next action is destructive-irreversible AND the PRD does not cover it, OR when user intent is genuinely irrecoverable from PRD, memory,
|
|
48
|
+
Asking is permitted only as a last resort, when the next action is destructive-irreversible AND the PRD does not cover it, OR when user intent is genuinely irrecoverable from PRD, memory, code, AND the public web. The channel is structured: `exec:pause` (renames `.gm/prd.yml` → `.gm/prd.paused.yml`, question in header). In-conversation asking is last-resort beneath last-resort.
|
|
49
|
+
|
|
50
|
+
**Web-search before pause / before user-ask — HARD RULE.** Before `exec:pause` or any in-conversation question whose answer plausibly exists on the public web (missing artifact, prebuilt binary, library status, build recipe, version compatibility, upstream issue, "does X exist for Y"), fire `WebSearch` and at least one targeted `WebFetch` first. Pause/ask only after the web pass returns empty, or returns candidates the agent has witnessed and rejected. Pausing on a question the web could have answered is forced closure dressed as humility — re-enter planning, web-search, and resume. Genuine user-only questions: private credentials, preference among already-surfaced viable options, destructive-irreversible authorization.
|
|
49
51
|
|
|
50
52
|
The size of the task, the cost of context, and the duration of CI are never grounds to ask.
|
|
51
53
|
|
package/skills/planning/SKILL.md
CHANGED
|
@@ -45,7 +45,9 @@ Runs until: .gm/prd.yml empty AND git clean AND all pushes confirmed AND CI gree
|
|
|
45
45
|
|
|
46
46
|
PRD written → execute to COMPLETE without asking the user. Doubts that arise during execution are resolved by witnessed probe, by recall, or by re-reading the PRD — never by asking. Any question whose answer is reachable from the agent's tools belongs to the agent, not the user.
|
|
47
47
|
|
|
48
|
-
Asking is last-resort: destructive-irreversible without PRD coverage, OR user intent irrecoverable from PRD/memory/code. Channel: `exec:pause` (renames `prd.yml` → `prd.paused.yml`; question in header). In-conversation asking is last-resort beneath last-resort.
|
|
48
|
+
Asking is last-resort: destructive-irreversible without PRD coverage, OR user intent irrecoverable from PRD/memory/code/web. Channel: `exec:pause` (renames `prd.yml` → `prd.paused.yml`; question in header). In-conversation asking is last-resort beneath last-resort.
|
|
49
|
+
|
|
50
|
+
**WEB-SEARCH BEFORE PAUSE — HARD RULE.** Before `exec:pause` for any blocking question whose answer could plausibly exist on the public web — missing artifact, unknown library/API, prebuilt binary, version compatibility, build recipe, upstream status — fire `WebSearch` and at least one `WebFetch` first. Only after the web pass returns empty (or returns options the agent then witnesses and rejects) is `exec:pause` legitimate. Pausing on a question the web could have answered is forced closure dressed up as humility — fix on sight by re-entering planning, web-searching, and resuming. The only questions that genuinely require user-ask are ones the public web cannot answer: private credentials, intent/preference between viable options the agent has *already surfaced*, destructive-irreversible authorization.
|
|
49
51
|
|
|
50
52
|
**Cannot stop while**: `.gm/prd.yml` has items | git uncommitted | git unpushed.
|
|
51
53
|
|
|
@@ -0,0 +1,45 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: research
|
|
3
|
+
description: Web research via parallel subagent fan-out. Use when a question needs the live web, spans multiple sources, requires comparison across vendors/papers/repos, or would saturate a single context window. Skip for one-page lookups answerable by a single WebFetch.
|
|
4
|
+
allowed-tools: Task, WebFetch, WebSearch
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Research
|
|
8
|
+
|
|
9
|
+
Lead orchestrates. Workers fetch. Findings converge. The lead never reads pages — workers do.
|
|
10
|
+
|
|
11
|
+
## Shape of the work
|
|
12
|
+
|
|
13
|
+
Breadth first, depth on demand. Open with a wide sweep that maps the terrain; commit deep dives only where the sweep surfaces something load-bearing. A narrow opening misses the alternative the user actually needed.
|
|
14
|
+
|
|
15
|
+
Effort matches stakes. A single fact warrants one short fetch. A vendor comparison warrants a handful of workers, each owning one vendor. A landscape survey warrants ten or more, each owning one axis. Spending a fan-out on a fact wastes tokens; spending a fact-fetch on a landscape under-delivers.
|
|
16
|
+
|
|
17
|
+
Workers run in parallel. Independent questions launch in one message, never serialized. Serial fan-out is the default failure mode — guard against it explicitly.
|
|
18
|
+
|
|
19
|
+
## Worker contract
|
|
20
|
+
|
|
21
|
+
Each worker receives: the precise question it owns, the shape of the answer (bullets, table row, prose paragraph), the boundary of what it must not pursue, and the destination path under `.gm/research/<slug>/<worker-id>.md` for its findings. Vague briefs produce vague returns; the lead's job is to make the brief unambiguous before spawning.
|
|
22
|
+
|
|
23
|
+
Workers write structured findings to disk and return only a path plus a one-line summary. The lead reads paths it cares about; the rest stay on disk. Returning full prose through the agent boundary burns context that the synthesis pass needs.
|
|
24
|
+
|
|
25
|
+
## Citations
|
|
26
|
+
|
|
27
|
+
A claim without a source URL is a hallucination waiting to be quoted. Workers attach the URL and the quoted span beside every non-trivial assertion. The lead refuses to lift a claim into the final answer if its citation field is empty.
|
|
28
|
+
|
|
29
|
+
## Source quality
|
|
30
|
+
|
|
31
|
+
Content farms and SEO-optimised listicles outrank primary sources on most queries. Prefer vendor docs, RFCs, primary repos, dated blog posts from named authors, and academic preprints over aggregator pages. When two sources disagree, the older primary usually beats the newer aggregator.
|
|
32
|
+
|
|
33
|
+
## Convergence
|
|
34
|
+
|
|
35
|
+
Synthesis happens once, after all workers return. Mid-flight summarisation truncates findings the next worker would have built on. The lead holds the question, the workers' paths, and the synthesis prompt; everything else lives on disk.
|
|
36
|
+
|
|
37
|
+
If a worker's return reveals a new axis the original plan missed, expand the fan-out — do not stretch an existing worker past its brief.
|
|
38
|
+
|
|
39
|
+
## When not to fan out
|
|
40
|
+
|
|
41
|
+
One question, one page, one fetch. A single `WebFetch` answers it. The fan-out machinery has overhead; spending it on a lookup is the same mistake as skipping it on a survey.
|
|
42
|
+
|
|
43
|
+
## Handoff
|
|
44
|
+
|
|
45
|
+
Final answer cites every load-bearing claim, names the workers' output paths for audit, and surfaces disagreements between sources rather than averaging them away.
|