wicked-brain 0.16.0 → 0.17.0
package/package.json CHANGED
package/server/package.json CHANGED
@@ -115,6 +115,7 @@ entities:
 people: [{people/roles}]
 programs: [{programs/initiatives}]
 metrics: ["{metric}: {value}"]
+method: {extraction method — see "Extraction method" below}
 confidence: {0.7 for text, 0.85 for vision}
 indexed_at: {current ISO timestamp}
 narrative_theme: {the "so what" in 8 words or fewer}
@@ -122,6 +123,23 @@ narrative_theme: {the "so what" in 8 words or fewer}
 
 {Extracted content in markdown format}
 
+## Extraction method
+
+The `method:` field records *how* the chunk's content was obtained — the
+provenance answer to "how do we know this?" It is distinct from `source_type`
+(which is the file format, e.g. `pdf`/`md`/`js`). Set it deterministically
+from the path you are already taking:
+
+- `deterministic-parse` — the TEXT path above (Read + split, no model judgement).
+- `llm-vision` — the BINARY path above (content extracted by the model viewing
+the document/image).
+
+Use one of the controlled values: `deterministic-parse`, `llm-vision`,
+`llm-synthesis` (model-generated/inferred content), or `manual` (hand-authored).
+The value is plain frontmatter — it is stored and returned verbatim by the
+server with no schema migration. If omitted, downstream lint treats the chunk
+as `method: unknown`; prefer to set it explicitly.
+
 ## Tag Expansion
 
 After generating the initial `contains:` tags, expand each keyword with 1-3 synonyms or related terms:
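As a rough illustration of the "Extraction method" rule added in the hunk above, the mapping from ingest path to `method` might be sketched as follows. The names `methodForPath` and `effectiveMethod`, and the `"text"`/`"binary"` path labels, are hypothetical and do not come from the package; only the controlled values and the `unknown` fallback are taken from the documentation text.

```javascript
// Hypothetical helpers: not part of wicked-brain. Only the method values
// themselves are taken from the documentation shown in the diff.
const CONTROLLED_METHODS = [
  "deterministic-parse",
  "llm-vision",
  "llm-synthesis",
  "manual",
];

// Map the ingest path already being taken to a provenance value.
// "text" and "binary" are assumed labels for the TEXT and BINARY paths.
function methodForPath(ingestPath) {
  if (ingestPath === "text") return "deterministic-parse";
  if (ingestPath === "binary") return "llm-vision";
  throw new Error(`unknown ingest path: ${ingestPath}`);
}

// Mirror the documented lint behaviour: an omitted method reads as "unknown".
function effectiveMethod(frontmatter) {
  return frontmatter.method ?? "unknown";
}
```

Keeping the choice in one small function means the value depends only on the path taken, which is what "set it deterministically" asks for.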
@@ -334,6 +352,10 @@ async function ingestFile(filePath) {
 " - text",
 "contains:",
 ...keywords.map(k => ` - ${k}`),
+// method = HOW this chunk was obtained (provenance), distinct from
+// source_type (file format). The batch path is a deterministic
+// Read + split with no model judgement.
+`method: deterministic-parse`,
 `confidence: 0.7`,
 `indexed_at: "${ts}"`,
 "---",
@@ -82,6 +82,21 @@ For each wiki article with source_hashes in frontmatter:
 ### Missing frontmatter
 Check each chunk has required frontmatter fields (source, chunk_id, confidence, indexed_at).
 
+Also check the **provenance** field `method` (how the chunk/memory was obtained:
+`deterministic-parse`, `llm-vision`, `llm-synthesis`, `manual`, or
+`session-capture` for memories). `method` is **optional** — it was added after
+some content was written, so a chunk/memory without it is still valid. When it
+is missing, auto-fix by stamping `method: unknown` and report the fix as `info`
+severity, type `missing_field` (do NOT raise it to a warning/error — that would
+invalidate pre-existing content). Surfacing `method: unknown` lets a reviewer
+distinguish facts with known provenance from those whose origin was never
+recorded.
+
+Lightweight provenance check (the "no source ⇒ assumption" rule): if a chunk has
+no `source`/`source_path` and its `method` is not one of the inferred kinds
+(`llm-synthesis`, `unknown`), flag it `info`, type `missing_field`:
+`unsourced fact with method "{method}" — add a source or set method to llm-synthesis`.
+
 ### Tag synonym candidates
 
 Call the server to get all tag frequencies:
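The two provenance lint rules added under "Missing frontmatter" in the hunk above could be sketched roughly as below. This is an assumption-laden illustration, not the package's lint code: `lintProvenance`, the shape of the findings objects, and the wording of the first message are all hypothetical. Only the field names, the `info` severity, the `missing_field` type, and the second message come from the diff.

```javascript
// Hypothetical lint sketch: function name and finding shape are assumptions.
// Methods that already mark a fact as inferred or of unrecorded origin.
const INFERRED_METHODS = new Set(["llm-synthesis", "unknown"]);

function lintProvenance(frontmatter) {
  const findings = [];
  const fixed = { ...frontmatter };

  // Rule 1: a missing method is auto-fixed and reported at info severity only,
  // so content written before the field existed stays valid.
  if (fixed.method === undefined) {
    fixed.method = "unknown";
    findings.push({
      severity: "info",
      type: "missing_field",
      message: "method missing; stamped method: unknown", // assumed wording
    });
  }

  // Rule 2: "no source => assumption". Flag unsourced chunks whose method
  // claims a direct origin.
  const hasSource = Boolean(fixed.source || fixed.source_path);
  if (!hasSource && !INFERRED_METHODS.has(fixed.method)) {
    findings.push({
      severity: "info",
      type: "missing_field",
      message: `unsourced fact with method "${fixed.method}" — add a source or set method to llm-synthesis`,
    });
  }

  return { fixed, findings };
}
```

Note that a chunk with neither `method` nor a source produces only the first finding: once stamped `unknown`, it already counts as an inferred kind.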
@@ -100,6 +100,7 @@ Write to `{brain_path}/memory/{safe_name}.md`:
 ---
 type: {detected or provided type}
 tier: {resolved tier from Step 2b}
+method: {extraction method — see "Extraction method" below}
 confidence: 0.5
 importance: {from type defaults or override}
 ttl_days: {from type defaults or override, null if permanent}
@@ -117,6 +118,21 @@ indexed_at: "{ISO 8601 timestamp}"
 {memory content}
 ```
 
+#### Extraction method
+
+The `method:` field records *how* the memory was obtained — the provenance
+answer to "how do we know this?", mirroring the `method:` field on ingested
+chunks. Set it from how the memory came to be:
+
+- `session-capture` — captured live from the current session (the default for
+"remember this" during work).
+- `manual` — explicitly stated by the user ("we decided X", interview-style).
+- `llm-synthesis` — inferred/derived by the agent rather than directly observed.
+
+Default to `session-capture` when unsure. The value is plain frontmatter,
+stored and returned verbatim by the server (no schema migration). If omitted,
+lint treats the memory as `method: unknown` — prefer to set it explicitly.
+
 #### Tier definitions
 
 - **working**: Active, session-specific context. Expires quickly (hours to days). Use for in-progress decisions, temporary notes, and things only relevant to the current task.
@@ -131,6 +147,7 @@ New memories start at the tier resolved from importance (default `episodic` for
 ---
 type: decision
 tier: semantic
+method: manual
 confidence: 0.9
 importance: 7
 ttl_days: null
@@ -43,8 +43,10 @@ Format:
 ```
 
 Keys are the short/common form. Values are expansions to try when the key
-appears in a search query. The search
-
+appears in a search query. The default search path does NOT read this file —
+`wicked-brain:search` only loads it as a fallback when a direct search returns
+sparse results (0–2 matches), then re-runs the query with matching synonym
+values OR'd in and merges the results.
 
 ## Commands
 
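The fallback behaviour described in the last hunk (direct search first, synonyms only when results are sparse) can be sketched as below. This is a minimal sketch under stated assumptions, not the `wicked-brain:search` implementation: `search` and `loadSynonyms` are injected stand-ins, results are assumed to carry an `id`, and joining terms with `" OR "` is an assumed query syntax.

```javascript
// Hypothetical sketch of the fallback flow; nothing here is a wicked-brain API.
const SPARSE_THRESHOLD = 2; // "sparse" means 0-2 matches, per the text above

async function searchWithSynonymFallback(query, search, loadSynonyms) {
  const direct = await search(query);
  // Default path: enough results, so the synonyms file is never read.
  if (direct.length > SPARSE_THRESHOLD) return direct;

  // Fallback path: load the map of { key: [expansion, ...] } on demand.
  const synonyms = await loadSynonyms();
  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  const expansions = terms.flatMap((t) => synonyms[t] ?? []);
  if (expansions.length === 0) return direct;

  // Re-run with the matching synonym values OR'd in (assumed syntax).
  const expanded = await search([query, ...expansions].join(" OR "));

  // Merge, keeping the first occurrence of each result id.
  const seen = new Set();
  return [...direct, ...expanded].filter((r) => {
    if (seen.has(r.id)) return false;
    seen.add(r.id);
    return true;
  });
}
```

Passing the search and loader functions in as arguments keeps the sketch independent of how the server is actually called, and makes it easy to see that the loader runs only on the sparse path.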