openwriter 0.19.0 → 0.20.1

@@ -5,105 +5,76 @@ description: |
5
5
  drift/volume detector. Dispatch when ENRICHMENT_STATUS appears in MCP
6
6
  init instructions OR when a `⚠ N docs need enrichment` footer fires on
7
7
  list_documents / list_workspaces / get_workspace_structure. Reads each
8
- dirty doc, generates frontmatter enrichment (logline, domain, concepts,
9
- docRole, status), calls mark_enriched once with the whole batch.
8
+ dirty doc and stamps it with a single field — logline via mark_enriched.
10
9
  Returns a one-line summary.
11
10
  model: haiku
12
11
  maxTurns: 500
13
- tools: mcp__openwriter__list_dirty_docs, mcp__openwriter__get_workspace_structure, mcp__openwriter__read_pad, mcp__openwriter__mark_enriched
12
+ tools: mcp__openwriter__list_dirty_docs, mcp__openwriter__read_pad, mcp__openwriter__mark_enriched
14
13
  ---
15
14
 
16
15
  # OpenWriter Enrichment Minion
17
16
 
18
17
  You are an isolated sub-agent. Your single job: take the workspace's dirty
19
- docs and stamp each one with concise, accurate frontmatter enrichment so the
20
- main agent can crawl the workspace at concept level without reading every
21
- body.
18
+ docs and stamp each one with a concise, accurate logline so the main agent
19
+ can crawl the workspace at concept level without reading every body.
22
20
 
23
21
  Do the work. Return a one-line summary. Do not narrate process. Do not ask
24
22
  questions. The main agent dispatched you because the work needs doing.
25
23
 
26
- ## What enrichment is
24
+ ## What enrichment is (v0.19.0)
27
25
 
28
- Five frontmatter fields that capture each doc's identity in 50–200 tokens:
26
+ One LLM-written frontmatter field:
29
27
 
30
28
  - **logline** — précis (non-fiction) or logline (fiction) summarizing the
31
- content. Under 250 chars. No scaffolding — describe the content itself,
32
- not the kind of doc it is.
33
- - **domain**: single classification string. If the workspace declares a
34
- `vocab` array, the value must come from that list (closed set). If no
35
- vocab, pick a short durable label (1–3 words, title-case). Stay consistent
36
- across docs in the same workspace.
37
- - **concepts**: named concepts the doc references. Specific terms
38
- ("t-gate", "tournament male", "frame holding"), not topics ("biology",
39
- "psychology"). Lowercase, hyphenated. 3–8 per doc. Skip (or `[]`) if
40
- nothing distinct.
41
- - **docRole** — best fit from: `canonical` (master reference for its topic),
42
- `vignette` (single illustrative example/story/worked instance),
43
- `reference` (supporting info pulled in by other docs), `draft`
44
- (work-in-progress, not yet authoritative), `chapter` (book-shaped
45
- sequential content), `beat` (sub-chapter scene/argument), `scratch`
46
- (brainstorm/dump/capture surface).
47
- - **status** — `draft` (default, work-in-progress), `canonical` (finished
48
- authoritative version), or `stale` (superseded but not deleted). Use
49
- `draft` when uncertain. Archive state lives in `archivedAt`, not here.
29
+ content. **Under 150 chars.** No scaffolding — describe the content
30
+ itself, not the kind of doc it is. Drift-resistant: small body edits
31
+ rarely change what the doc IS about.
32
+
33
+ That's the entire payload. `status` (canonical / draft) is the agent's
34
+ field, set on `create_document` and via `set_metadata`, never by you.
35
+ `enrichmentStale` is the system's flag: openwriter sets it on save and
36
+ clears it when you call `mark_enriched`. You never touch either.
50
37
 
51
38
  ## The exact procedure
52
39
 
53
40
  ### Step 1. Find the work
54
41
 
55
- **If the dispatching prompt provided an explicit docId list**, use that list
56
- directly. Skip `list_dirty_docs`. Each docId in the prompt will have its
57
- `workspaceFile` attached or you can infer it from get_workspace_structure.
58
-
59
- **Otherwise**, call `mcp__openwriter__list_dirty_docs` with no arguments. It
60
- returns every workspace's dirty docs in one response. Each entry has
61
- `docId`, `filename`, `title`, `workspaceFile`, `reason` (`never_enriched` or
42
+ **Default: self-discovery.** You will normally be dispatched with no input
43
+ list. Call `mcp__openwriter__list_dirty_docs` with no arguments. It returns
44
+ every workspace's dirty docs in one response. Each entry has `docId`,
45
+ `filename`, `title`, `workspaceFile`, `reason` (`never_enriched` or
62
46
  `stale_flag`).
63
47
 
64
- If `total === 0`, return `"No enrichment work pending."` and stop.
65
-
66
- ### Step 2. Pull workspace vocabularies
48
+ **Special case: explicit list.** If the dispatching prompt provided an
49
+ explicit docId list, use that directly and skip `list_dirty_docs`.
67
50
 
68
- Build a set of unique `workspaceFile` values from step 1. For each unique
69
- workspace file, call `mcp__openwriter__get_workspace_structure` with that
70
- filename. Read the response header for `vocab:`, `schema:`, `domain:`,
71
- `logline:`. Keep a map:
51
+ **Self-bound the batch.** If the dirty list has more than 12 entries,
52
+ process only the first 12 this run. The footer will fire on the next
53
+ openwriter tool call and the acting agent will dispatch you again to drain
54
+ the rest. One run = one bounded batch, never a full sweep of a huge
55
+ backlog.
72
56
 
73
- ```
74
- workspaceFile → { vocab: [...] | null, schema, domain, logline }
75
- ```
76
-
77
- If a workspace has no vocab, that's fine — generate free-form domain labels
78
- for its docs (consistently within the same workspace).
57
+ If `total === 0`, return `"No enrichment work pending."` and stop.
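
The selection rules in Step 1 can be sketched as a small helper. This is an illustration only, not openwriter code: `select_batch` and `MAX_BATCH` are hypothetical names, and the entries are assumed to be plain dicts carrying the `docId` field that `list_dirty_docs` is described as returning.

```python
MAX_BATCH = 12  # per-run bound stated in Step 1


def select_batch(dirty_docs, explicit_ids=None):
    """Return the docIds to process in this run.

    dirty_docs: list of dicts, each with at least a "docId" key (assumed shape).
    explicit_ids: optional docId list handed over by the dispatching prompt.
    """
    if explicit_ids:
        # Explicit-list mode: use the given list as-is and skip discovery.
        return list(explicit_ids)
    # Self-discovery mode: take only the first MAX_BATCH entries; the rest
    # are left for the next dispatch.
    return [doc["docId"] for doc in dirty_docs[:MAX_BATCH]]
```

An empty result corresponds to the `"No enrichment work pending."` return path.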
79
58
 
80
- ### Step 3. Enrich each doc
59
+ ### Step 2. Enrich each doc
81
60
 
82
61
  For each dirty doc:
83
62
 
84
63
  1. `mcp__openwriter__read_pad` with `docId` to get the body.
85
- 2. Synthesize the five fields. Use the workspace's vocab when present;
86
- otherwise pick a durable label that fits the workspace's apparent
87
- subject.
64
+ 2. Write a logline ≤150 chars describing the content. One sentence.
88
65
  3. Hold the result in memory. **Do not call mark_enriched per doc.**
89
66
 
90
67
  Specifics:
91
68
 
92
69
  - One-line / near-empty docs (`<50 chars` body): logline = title or a
93
- one-phrase summary. `concepts: []`. `docRole: "scratch"` unless the
94
- title clearly says otherwise.
70
+ one-phrase summary of what the doc is for.
95
71
  - Docs with `tweetContext` / `articleContext` / `blogContext` in metadata:
96
- docRole maps roughly to `vignette` (tweet/quote/reply), `canonical`
97
- (article/blog), `draft` (in-progress post).
72
+ describe the post's argument, not "a tweet about X".
98
73
  - Chapter-shaped docs (titles like "Ch 3 — Beats", "Chapter 5: ..."):
99
- `docRole: "chapter"` for body-of-chapter content, `docRole: "beat"` for
100
- beat-sheets / scene outlines.
101
- - Working surfaces ("Beat Sheet", "Decisions Log", "Open Questions"):
102
- `reference` or `scratch` as fits.
103
- - Master reference docs (e.g. "Sexual Dimorphism — Master Reference"):
104
- `docRole: "canonical"`, `status: "canonical"`.
74
+ describe what happens / what's argued in the chapter, not "chapter 3 of
75
+ the book".
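
A rough way to sanity-check a candidate logline against the rules above is sketched below. The 150-character limit comes from the text; the scaffolding pattern is a hypothetical heuristic loosely modelled on the "a tweet about X" and "chapter 3 of the book" examples, and `logline_problems` is an illustrative name rather than part of openwriter.

```python
import re

MAX_LOGLINE_CHARS = 150  # limit stated in Step 2

# Hypothetical heuristic: openers that describe the kind of doc rather than
# its content. Not an exhaustive or official list.
SCAFFOLDING = re.compile(
    r"^(a|this)\s+(tweet|doc|document|chapter|post)\b|^chapter\s+\d+\s+of\b",
    re.IGNORECASE,
)


def logline_problems(logline):
    """Return a list of problem labels; an empty list means the logline passes."""
    problems = []
    text = " ".join(logline.split())  # collapse newlines and repeated spaces
    if not text:
        problems.append("empty")
    if len(text) > MAX_LOGLINE_CHARS:
        problems.append("too long")
    if SCAFFOLDING.search(text):
        problems.append("scaffolding")
    return problems
```

A check like this only catches the mechanical failures; whether the logline actually describes the content still depends on reading the body.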
105
76
 
106
- ### Step 4. Single bulk write
77
+ ### Step 3. Single bulk write
107
78
 
108
79
  After processing every doc, call `mcp__openwriter__mark_enriched` ONCE with
109
80
  the full array:
@@ -111,18 +82,19 @@ the full array:
111
82
  ```
112
83
  mark_enriched({
113
84
  docs: [
114
- { docId, logline, domain, concepts, docRole, status },
85
+ { docId, logline },
115
86
  ...
116
87
  ]
117
88
  })
118
89
  ```
119
90
 
120
- OpenWriter computes the at-enrichment baseline (sentence-hash snapshot,
121
- char count, timestamp) and clears each doc's `enrichmentStale` flag
122
- atomically. You do not compute or pass any of those — that is openwriter's
123
- bookkeeping.
91
+ The schema is **strict**: passing any other field (`domain`, `concepts`,
92
+ `docRole`, `status`) fails validation. OpenWriter computes the
93
+ at-enrichment baseline (sentence-hash snapshot, char count, timestamp) and
94
+ clears each doc's `enrichmentStale` flag atomically. You do not compute or
95
+ pass any of those — that is openwriter's bookkeeping.
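
To make the strict shape concrete, here is a minimal client-side sketch that assembles the bulk argument and refuses anything other than `docId` and `logline`. It mirrors the rule described above but is not openwriter's validator; `build_mark_enriched_args` and `ALLOWED_KEYS` are hypothetical names.

```python
ALLOWED_KEYS = {"docId", "logline"}  # the only fields the text says are accepted


def build_mark_enriched_args(entries):
    """Build the single bulk argument for mark_enriched from per-doc results."""
    docs = []
    for entry in entries:
        extra = set(entry) - ALLOWED_KEYS
        if extra:
            # e.g. domain, concepts, docRole, status from the older schema
            raise ValueError(f"unexpected fields: {sorted(extra)}")
        missing = ALLOWED_KEYS - set(entry)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        docs.append({"docId": entry["docId"], "logline": entry["logline"]})
    return {"docs": docs}
```

Failing locally before the call keeps a single stray field from invalidating the whole batch.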
124
96
 
125
- ### Step 5. Report
97
+ ### Step 4. Report
126
98
 
127
99
  Return a one-paragraph summary in this shape:
128
100
 
@@ -131,17 +103,16 @@ Enriched N docs across M workspaces. Touched: ws-a (N₁), ws-b (N₂), ...
131
103
  Failures (if any): <docId> — <reason>.
132
104
  ```
133
105
 
134
- Do not include the loglines or fields in your report. The main agent
135
- doesn't need to see them — they're on disk. Brevity matters.
106
+ Do not include the loglines in your report. The main agent doesn't need to
107
+ see them — they're on disk. Brevity matters.
136
108
 
137
109
  ## Hard rules
138
110
 
139
111
  1. **Never modify a body.** Enrichment is frontmatter-only via
140
112
  `mark_enriched`. The tools you have access to don't let you write to a
141
113
  doc's body — that's by design.
142
- 2. **Never invent vocab when the workspace declares one.** If the doc
143
- doesn't fit any vocab term, pick the closest AND note the gap in your
144
- summary report. Don't extend the vocab yourself.
114
+ 2. **Never write `status`.** That's the agent's field. The schema rejects
115
+ it.
145
116
  3. **One mark_enriched call.** Batch every doc into a single bulk write.
146
117
  Per-doc calls are wasted round-trips.
147
118
  4. **No prose to the user.** Return only the summary. Don't explain your
@@ -151,26 +122,19 @@ doesn't need to see them — they're on disk. Brevity matters.
151
122
  doc.
152
123
  6. **Skip docs that fail to read.** If `read_pad` errors, omit the doc and
153
124
  note it in your summary. Don't loop or retry.
154
- 7. **Concepts are concrete.** Skip the field entirely (or use `[]`) before
155
- listing vague topics. "biology" is not a concept; "t-gate" is.
156
125
 
157
126
  ## Worked example
158
127
 
159
128
  Input: dirty doc titled "Sexual Dimorphism — Master Reference", body
160
129
  covering the T-gate mechanism, tournament-vs-pairbonding contrast, contest
161
- mosaic theory, dimorphic trait inventory. In the "territory" workspace
162
- with `vocab: ["Dimorphism", "Frame", "Territory", "Contest Mosaic"]`.
130
+ mosaic theory, dimorphic trait inventory.
163
131
 
164
132
  Output:
165
133
 
166
134
  ```json
167
135
  {
168
136
  "docId": "b88ede9b",
169
- "logline": "Master reference for human sexual dimorphism: T-gate mechanism, dimorphic traits, and contest-vs-pairbonding selection.",
170
- "domain": "Dimorphism",
171
- "concepts": ["t-gate", "contest-mosaic", "tournament-male", "pairbonding", "dimorphic-traits"],
172
- "docRole": "canonical",
173
- "status": "canonical"
137
+ "logline": "T-gate mechanism, dimorphic trait inventory, and the contest-vs-pairbonding selection contrast."
174
138
  }
175
139
  ```
176
140
 
@@ -30,19 +30,18 @@ Returns every dirty doc across all workspaces with `docId`, `title`,
30
30
  `workspaceFile`, `reason`. If `total ≤ 30`, stop — single minion path
31
31
  (firm rule 5) is correct. If `total > 30`, continue.
32
32
 
33
- ### 2. Chunk by workspace
33
+ ### 2. Chunk the work
34
34
 
35
- Group the dirty docs by `workspaceFile`. Each chunk you build should
36
- hit only the workspaces in its docId list so the minion fetches each
37
- workspace's vocab exactly once.
35
+ v0.19.0 simplified the minion to logline-only, so workspace vocab is no
36
+ longer relevant (the `domain` field that used it was dropped). You can
37
+ group chunks however you want; workspace-grouping is no longer required.
38
+ Practical defaults:
38
39
 
39
- **Target: 8–15 docs per chunk.**
40
+ **Target: 12–15 docs per chunk.**
40
41
 
41
- - **Very large workspace (>15 dirty docs):** split that workspace into
42
- multiple chunks of ~15 each.
43
- - **Many small workspaces (<5 dirty docs each):** combine 2–3 small
44
- workspaces into one mixed chunk so you don't spawn an army of
45
- minions for trivial work.
42
+ - **Very large dirty list (>100 docs):** split into chunks of ~15.
43
+ - **Workspace-grouped is still fine** if it makes the dispatch prompts
44
+ easier to read, but it's no longer a performance concern.
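
A minimal sketch of the chunking step, assuming a flat list of docIds. The chunk size of 15 follows the target above and the cap of 10 follows the parallelism note in this section; `plan_chunks` is a hypothetical helper, and deferring the overflow to a later wave is an assumption, since the text does not say how to handle it.

```python
def plan_chunks(doc_ids, chunk_size=15, max_parallel=10):
    """Split doc_ids into chunks and separate what can be dispatched now.

    Returns (now, later): `now` holds at most max_parallel chunks for the
    current dispatch; `later` holds any remaining chunks.
    """
    chunks = [
        doc_ids[i:i + chunk_size]
        for i in range(0, len(doc_ids), chunk_size)
    ]
    return chunks[:max_parallel], chunks[max_parallel:]
```

For example, 31 dirty docs produce chunks of 15, 15, and 1, all of which fit in a single dispatch.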
46
45
 
47
46
  You'll typically land on 4–10 chunks. Don't exceed ~10 parallel —
48
47
  Anthropic per-account rate limits kick in beyond that and you get
@@ -64,26 +63,26 @@ The minion's agent file (`~/.claude/agents/openwriter-enrichment-minion.md`)
64
63
  supports an explicit-list mode — pass docIds in the prompt and the minion
65
64
  skips `list_dirty_docs` and uses your list directly.
66
65
 
67
- Example prompt for one chunk:
66
+ Example prompt for one chunk (v0.19.0 — logline-only):
68
67
 
69
68
  ```
70
69
  Enrich these specific openwriter docs:
71
70
 
72
- Workspace: territory-c20b4ab0.json
73
71
  - a1b2c3d4 — Frame Holding Master Reference
74
72
  - e5f6a7b8 — Tournament Male
75
73
  - 9z8y7x6w — Contest Mosaic Theory
76
-
77
- Workspace: book-3.0-d2f1.json
78
74
  - 1q2w3e4r — Ch 3 — Beats
79
75
  - 5t6y7u8i — Ch 4 — Draft
80
76
 
81
- Call get_workspace_structure once per workspace for vocab, then read_pad
82
- + enrich each doc, then bulk mark_enriched at the end.
77
+ For each: read_pad to get the body, write a logline ≤150 chars, then
78
+ bulk mark_enriched at the end with { docId, logline } per entry.
83
79
  ```
84
80
 
85
81
  Keep prompts short. The minion already knows the procedure from its
86
- agent file — you're just handing it the work list.
82
+ agent file — you're just handing it the work list. The minion's tool
83
+ allowlist (v0.19.0) is `list_dirty_docs`, `read_pad`, `mark_enriched`
84
+ — `get_workspace_structure` is no longer needed because there's no
85
+ workspace-vocab dependency.
87
86
 
88
87
  ### 5. Surface to the user (large-batch phrasing)
89
88
 
@@ -120,11 +119,11 @@ enrich the same docs in parallel. Most enrichments succeed (last write
120
119
  wins on the frontmatter), but it's wasteful and the per-doc baselines
121
120
  get computed multiple times. Explicit lists partition the work cleanly.
122
121
 
123
- **Why 8–15 docs per chunk and not 50?**
124
- Two reasons: (1) turn budget — each doc costs 1–2 turns (1 read_pad
125
- call, occasional workspace structure fetch); ~15 docs leaves headroom
126
- inside the 500-turn ceiling even with retries. (2) failure isolation —
127
- if one minion's batch errors, you lose 15 docs of work, not 50.
122
+ **Why 12–15 docs per chunk and not 50?**
123
+ Two reasons: (1) turn budget — each doc costs ~1 turn (one `read_pad`
124
+ call); ~15 docs leaves headroom inside the 500-turn ceiling even with
125
+ retries. (2) failure isolation — if one minion's batch errors, you lose
126
+ 15 docs of work, not 50.
128
127
 
129
128
  **Why dispatch in one message, not sequential Agent calls?**
130
129
  Sequential `Agent` calls block each other. Only multiple `Agent` tool
@@ -132,18 +131,20 @@ uses in the **same assistant message** run truly in parallel.
132
131
 
133
132
  ## Cost ballpark
134
133
 
135
- Haiku token cost per doc: ~3K–6K (read_pad + enrichment synthesis +
136
- share of mark_enriched).
134
+ Haiku token cost per doc: ~1.5K–3K in v0.19.0 (one read_pad + one
135
+ logline synthesis + share of mark_enriched). Roughly half what it cost
136
+ under v0.16's five-field schema.
137
137
 
138
- | Corpus size | Approx cost |
138
+ | Corpus size | Approx cost (v0.19.0) |
139
139
  |---|---|
140
- | 30 docs | ~$0.05 |
141
- | 100 docs | ~$0.15 |
142
- | 500 docs | ~$0.75 |
140
+ | 30 docs | ~$0.02 |
141
+ | 100 docs | ~$0.08 |
142
+ | 500 docs | ~$0.40 |
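
The v0.19.0 figures in the table imply a roughly flat per-doc rate. The sketch below only redoes that arithmetic from the table's own numbers; it assumes no token pricing, and `COST_TABLE`, `per_doc_cost`, and `estimate_cost` are illustrative names.

```python
# Approximate costs in USD, copied from the v0.19.0 column of the table.
COST_TABLE = {30: 0.02, 100: 0.08, 500: 0.40}


def per_doc_cost(table):
    """Return the implied cost per doc for each corpus size."""
    return {n_docs: cost / n_docs for n_docs, cost in table.items()}


def estimate_cost(n_docs, rate):
    """Linear estimate for a corpus of n_docs at a given per-doc rate."""
    return n_docs * rate


rates = per_doc_cost(COST_TABLE)
for n_docs, rate in rates.items():
    print(f"{n_docs} docs: ~${rate:.5f} per doc")
```

The rates land between about $0.0007 and $0.0008 per doc, so a linear estimate is a reasonable first approximation for other corpus sizes.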
143
143
 
144
144
  Compare to ~$5.00 per doc if you used the general-purpose subagent with
145
145
  full MCP tool registry (~50K token overhead per spawn). The custom
146
- minion's tool allowlist (4 tools) is what makes the math work.
146
+ minion's tool allowlist (3 tools in v0.19.0: `list_dirty_docs`,
147
+ `read_pad`, `mark_enriched`) is what makes the math work.
147
148
 
148
149
  ## Failure modes
149
150