@gcunharodrigues/wrxn 0.2.1 → 0.4.0
- package/bin/wrxn.cjs +102 -2
- package/lib/brain.cjs +295 -0
- package/lib/convert.cjs +215 -0
- package/lib/ingest.cjs +174 -0
- package/manifest.json +10 -0
- package/migrations/003-serve-http-door.cjs +44 -0
- package/package.json +2 -2
- package/payload/.claude/hooks/recall-surface.cjs +260 -80
- package/payload/.claude/skills/ingest/SKILL.md +72 -0
- package/payload/.mcp.json +1 -1
- package/payload/.recon-wrxn.json +2 -0
- package/payload/.wrxn/raw/.gitkeep +0 -0
@@ -0,0 +1,72 @@
+---
+name: ingest
+description: Distill a dropped source file (PDF/DOCX/HTML/PPTX/XLSX/TXT) into curated memory-wiki pages — a summary page + N note pages, each linked back to the raw source. Use when an operator drops a file in .wrxn/raw/ (or names one) and says "ingest this", "distill this document", or "turn this source into wiki notes".
+user-invocable: true
+---
+
+# ingest — read → distill → split into notes
+
+The distillation half of `wrxn ingest <file>`. Turns a raw source into compounding wiki knowledge,
+following the Karpathy LLM-wiki pattern (Adler's *How to Read a Book*: read → distill → split into
+notes). This is built as the first concrete slice of the Phase-3 `dream` ingest.
+
+The work is **split** (the kernel executor pattern):
+
+- **The harness is deterministic** (`lib/ingest.cjs` → `wrxn ingest`): convert the source to markdown,
+place/keep the raw under `.wrxn/raw/`, write the pages you produce with a `derived_from:` provenance
+stamp, and enforce the **additive-only** guard. You do NOT re-implement any of that.
+- **You are the distillation step.** You read the converted markdown and produce the *content* — a
+summary page + N note pages. Your job is the curation quality.
+
+## Scope (PRD decision E)
+
+- **Inspectional + analytical depth**: 1 source → **1 summary page + N note pages**. Divide the source
+into its natural documents (sections / themes), one note page each.
+- **Additive-only.** You CREATE new pages. You never edit an existing wiki page and never synthesise
+across sources — that is the `dream` loop, out of scope. The harness refuses to overwrite, so a slug
+that collides with an existing page is silently skipped: choose fresh, source-specific slugs.
+
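The additive-only rule in the Scope section can be illustrated with a minimal sketch. It is not the code in `lib/ingest.cjs`, which this diff view does not show; `writePageIfAbsent` is a hypothetical helper, and treating each page as a plain file is an assumption. The sketch relies on Node's `wx` file flag, which fails with `EEXIST` instead of overwriting.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper (not part of wrxn): create a page only when no file
// exists at that path. Returns 'written' or 'skipped'; it never overwrites.
function writePageIfAbsent(filePath, body) {
  try {
    // 'wx' opens for writing and fails with EEXIST if the path already exists.
    fs.writeFileSync(filePath, body, { flag: 'wx' });
    return 'written';
  } catch (err) {
    if (err.code === 'EEXIST') return 'skipped';
    throw err; // any other error is a real failure and should surface
  }
}

// Usage against a throwaway directory, not a real wiki.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'wiki-sketch-'));
const page = path.join(dir, 'paper-summary.md');
const first = writePageIfAbsent(page, 'first version\n');
const second = writePageIfAbsent(page, 'second version\n');
console.log(first, second); // prints "written skipped"
```

Because the second call is skipped rather than rejected, a colliding slug produces no error in this sketch, which is why the text asks for fresh, source-specific slugs.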
+## Loop
+
+1. **Convert.** Read the converted markdown — either run `wrxn convert <file>` and read its output, or
+read what the harness converted. Do not parse the binary yourself.
+2. **Read for the gist.** One pass for the whole; identify the source's structure and its key claims.
+3. **Summary page.** One page capturing what the source IS and its main points — the inspectional read.
+4. **Note pages.** One page per distinct theme/section — the analytical read. Each note is
+self-contained and titled by its idea, not "Section 3".
+5. **Emit the result** as the structured object below and hand it to the harness.
+
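Step 5 of the loop can be sketched as a small assembly function. This is an illustration under stated assumptions, not part of the package: `toKebab` and `buildDistillation` are hypothetical names, and the input shape (one summary object plus a list of themes) is invented for the example. Only the output shape, with `summary` and `notes` carrying `slug`, `title`, `description`, and `body`, comes from the skill text.

```javascript
// Hypothetical helper: turn free text into a kebab-case fragment.
function toKebab(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse any run outside a-z0-9 into one hyphen
    .replace(/^-+|-+$/g, '');    // trim leading and trailing hyphens
}

// Hypothetical helper: assemble the result object from one summary and N themes.
// Every slug is prefixed with the source name so it stays source-specific.
function buildDistillation(sourceName, summary, themes) {
  const prefix = toKebab(sourceName);
  return {
    summary: {
      slug: `${prefix}-summary`,
      title: summary.title,
      description: summary.description,
      body: summary.body,
    },
    notes: themes.map((theme) => ({
      slug: `${prefix}-${toKebab(theme.title)}`,
      title: theme.title,
      description: theme.description,
      body: theme.body,
    })),
  };
}

const result = buildDistillation(
  'paper',
  { title: 'Paper: overview', description: 'What the paper is', body: 'Summary text.' },
  [
    { title: 'Method', description: 'How it works', body: 'Method notes.' },
    { title: 'Results & Limits', description: 'What was found', body: 'Result notes.' },
  ],
);
console.log(JSON.stringify(result, null, 2));
```

Deriving note slugs from titles keeps them tied to the idea rather than to a section number, but two themes with the same title would produce the same slug, so a real pipeline would need to deduplicate.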
+## Result contract
+
+The harness consumes this exact shape (the CLI accepts it via `--distillation <result.json>`):
+
+```json
+{
+  "summary": { "slug": "paper-summary", "title": "...", "description": "one line", "body": "markdown" },
+  "notes": [
+    { "slug": "paper-method", "title": "...", "description": "one line", "body": "markdown" },
+    { "slug": "paper-results", "title": "...", "description": "one line", "body": "markdown" }
+  ]
+}
+```
+
+- `slug` — kebab-case (`[a-z0-9-]`), unique, source-specific (prefix with the source name to avoid
+collisions). The harness rejects a non-kebab slug.
+- `summary` is required (`{slug, body}` minimum); `notes` is an array (≥1 in practice).
+- `tier` is optional per page (default `concepts`; may be `concepts|decisions|gotchas|sessions`).
+- The harness adds the `derived_from: .wrxn/raw/<file>` stamp, `role`, and `source: wrxn-ingest`
+frontmatter — you do not write frontmatter into `body`.
+
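The contract rules listed above lend themselves to a pre-flight check before the result is handed to the harness. The following is a sketch only: `checkPage` and `checkDistillation` are hypothetical names, the regular expression is built from the `[a-z0-9-]` character class quoted in the text, and the harness's actual validation may be stricter or differ in its error reporting.

```javascript
// Character class taken from the skill text; the harness's exact pattern is not shown.
const KEBAB = /^[a-z0-9-]+$/;
const TIERS = ['concepts', 'decisions', 'gotchas', 'sessions'];

// Hypothetical helper: collect problems for one page into `errors`.
function checkPage(page, label, errors, seen) {
  if (!page || typeof page !== 'object') {
    errors.push(`${label}: missing page`);
    return;
  }
  if (typeof page.slug !== 'string' || !KEBAB.test(page.slug)) {
    errors.push(`${label}: slug must be kebab-case`);
  } else if (seen.has(page.slug)) {
    errors.push(`${label}: duplicate slug ${page.slug}`);
  } else {
    seen.add(page.slug);
  }
  if (typeof page.body !== 'string' || page.body.length === 0) {
    errors.push(`${label}: body is required`);
  }
  // tier is optional; when present it must be one of the four listed values.
  if (page.tier !== undefined && !TIERS.includes(page.tier)) {
    errors.push(`${label}: unknown tier ${page.tier}`);
  }
}

// Hypothetical helper: returns a list of problems, empty when the shape looks valid.
function checkDistillation(result) {
  if (!result || typeof result !== 'object') return ['result: must be an object'];
  const errors = [];
  const seen = new Set();
  checkPage(result.summary, 'summary', errors, seen);
  if (!Array.isArray(result.notes)) {
    errors.push('notes: must be an array');
  } else {
    result.notes.forEach((note, i) => checkPage(note, `notes[${i}]`, errors, seen));
  }
  return errors;
}

const good = {
  summary: { slug: 'paper-summary', body: 'Summary text.' },
  notes: [{ slug: 'paper-method', body: 'Method notes.', tier: 'decisions' }],
};
const bad = {
  summary: { slug: 'Paper_Summary', body: 'Summary text.' },
  notes: [{ slug: 'paper-method', body: 'Method notes.', tier: 'ideas' }],
};
console.log(checkDistillation(good));
console.log(checkDistillation(bad));
```

Returning a list of problems rather than throwing on the first one lets the distillation step fix every slug or tier issue in a single pass.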
+## Run
+
+```bash
+# the harness does convert → raw placement → provenance stamp → additive write:
+wrxn ingest .wrxn/raw/paper.pdf --distillation result.json
+```
+
+Re-running on the same source is safe — existing pages are skipped, never clobbered.
+
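The provenance stamp step named in the Run comment might look roughly like the sketch below. The three field names (`derived_from`, `role`, `source: wrxn-ingest`) come from the skill text; everything else is an assumption, including the hypothetical `stampPage` helper, the field order, the `role` values `summary` and `note`, and whether the harness writes further fields.

```javascript
// Hypothetical helper: render a page with harness-owned frontmatter ahead of the body.
// The distillation step supplies only `body`; it never writes frontmatter itself.
function stampPage(page, rawFile, role) {
  const frontmatter = [
    '---',
    `derived_from: .wrxn/raw/${rawFile}`, // link back to the raw source
    `role: ${role}`,                      // assumed values: 'summary' or 'note'
    'source: wrxn-ingest',
    '---',
  ].join('\n');
  return `${frontmatter}\n\n${page.body}\n`;
}

const text = stampPage({ slug: 'paper-method', body: 'Method notes.' }, 'paper.pdf', 'note');
console.log(text);
```

Keeping the stamp in the harness means every page carries the same provenance fields regardless of what the distillation step wrote in `body`.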
+## Source
+
+WRXN Kernel issue multiformat-distill-06 (PRD decisions D + E). Harness: `lib/ingest.cjs`.
+Converter: `lib/convert.cjs` (slice 05). Wiki: `.wrxn/wiki/` (see the `memory` skill).
package/payload/.mcp.json CHANGED
package/payload/.recon-wrxn.json CHANGED
File without changes