@ara-commons/ara-skills 0.1.0 → 0.3.0
- package/package.json +4 -4
- package/skills/compiler/SKILL.md +208 -180
- package/skills/compiler/references/ara-schema.md +185 -63
- package/skills/compiler/references/exploration-tree-spec.md +6 -7
- package/skills/compiler/references/figure-extraction-guide.md +218 -0
- package/skills/compiler/references/validation-checklist.md +76 -27
- package/skills/research-manager/SKILL.md +57 -102
- package/src/installer.js +1 -1
@@ -4,37 +4,30 @@ These are all checks the Seal validator runs. Fix ALL failures before reporting
 
 ## 1. Directory Existence
 
-
-
-- `logic/solution/`
-- `src/`
-- `src/configs/`
-- `trace/`
-- `evidence/`
+Mandatory-core dirs — all must exist: `logic/`, `logic/solution/`, `src/`, `trace/`, `evidence/`.
+Other dirs (`src/configs/`, `data/`, `evidence/proofs/`, …) exist only when the work warrants them.
 
-## 2. Mandatory File Existence (non-empty)
+## 2. Mandatory File Existence (non-empty, >10 bytes)
 
-All must exist with >10 bytes:
 - `PAPER.md`
 - `logic/problem.md`
 - `logic/claims.md`
 - `logic/concepts.md`
 - `logic/experiments.md`
-- `logic/solution/architecture.md`
-- `logic/solution/algorithm.md`
 - `logic/solution/constraints.md`
-- `logic/solution/heuristics.md`
 - `logic/related_work.md`
-- `src/configs/training.md`
-- `src/configs/model.md`
 - `src/environment.md`
 - `trace/exploration_tree.yaml`
 - `evidence/README.md`
+- an evidence file for every numbered table and figure (see §11)
+
+Additional method/artifact files (`logic/solution/*`, `src/*`, `data/*`) are validated only that,
+where present, they are non-trivial — there is no fixed list. Model-training files
+(`training.md`/`model.md`) should not appear unless the work actually trained a model.
 
 ## 3. PAPER.md Checks
 
-- Starts with `---` (YAML frontmatter)
-- Frontmatter is valid YAML mapping
+- Starts with `---` (YAML frontmatter); valid YAML mapping
 - Contains keys: `title`, `authors`, `year`
 - Body contains "Layer Index" section
 
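The directory and file-existence checks in sections 1 and 2 of the hunk above can be sketched roughly as follows. This is a minimal illustration, not the Seal validator's actual code: the function name `check_core_layout` and the failure-message wording are hypothetical, and the per-table/per-figure evidence files (§11) are deliberately left out.

```python
from pathlib import Path

# Paths copied from the new checklist text; everything else here is an assumption.
MANDATORY_DIRS = ["logic", "logic/solution", "src", "trace", "evidence"]
MANDATORY_FILES = [
    "PAPER.md",
    "logic/problem.md",
    "logic/claims.md",
    "logic/concepts.md",
    "logic/experiments.md",
    "logic/solution/constraints.md",
    "logic/related_work.md",
    "src/environment.md",
    "trace/exploration_tree.yaml",
    "evidence/README.md",
]
MIN_BYTES = 10  # the checklist asks for strictly more than 10 bytes


def check_core_layout(root):
    """Return a list of failure messages; an empty list means both checks pass."""
    root = Path(root)
    failures = []
    for rel in MANDATORY_DIRS:
        if not (root / rel).is_dir():
            failures.append(f"missing directory: {rel}/")
    for rel in MANDATORY_FILES:
        path = root / rel
        if not path.is_file():
            failures.append(f"missing file: {rel}")
        elif path.stat().st_size <= MIN_BYTES:
            failures.append(f"file too small (<= {MIN_BYTES} bytes): {rel}")
    return failures
```

Calling `check_core_layout("ara")` on a freshly scaffolded tree would list every missing mandatory path, which mirrors the "fix ALL failures before reporting" instruction in the checklist header.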
@@ -43,6 +36,8 @@ All must exist with >10 bytes:
 ### logic/claims.md
 - Has `## C\d+` blocks (at least one claim)
 - Contains `**Statement**`
+- Contains `**Sources**`; every load-bearing number in a `Statement` has a `Sources` entry carrying
+a verbatim «quote» plus an `[input]`/`[result]` tag — no bare-path entries, no memory-filled numbers
 - Contains `**Status**`
 - Contains `**Falsification criteria**`
 - Contains `**Proof**`
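As an illustration of the `logic/claims.md` markers listed in this hunk, a file-level check might look like the sketch below. It assumes the validator only tests for the presence of each marker somewhere in the file, which is how the checklist phrases it; the helper name `check_claims_text` is hypothetical, and the deeper `Sources` rules (quote and tag per number) are not covered.

```python
import re

# Field markers copied from the checklist; the check itself is an assumed reading.
REQUIRED_FIELDS = [
    "**Statement**",
    "**Sources**",
    "**Status**",
    "**Falsification criteria**",
    "**Proof**",
]
CLAIM_HEADING = re.compile(r"^## C\d+", re.MULTILINE)


def check_claims_text(text):
    """Return failure messages for a claims.md body; empty list means it passes."""
    failures = []
    if not CLAIM_HEADING.search(text):
        failures.append("no `## C<number>` claim block found")
    for field in REQUIRED_FIELDS:
        if field not in text:
            failures.append(f"missing field marker: {field}")
    return failures
```

A stricter variant would split the file on claim headings and require every block to carry all five markers; the checklist text does not say which of the two the validator does.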
@@ -61,12 +56,17 @@ All must exist with >10 bytes:
 - Contains `**Procedure**`
 - Contains `**Expected outcome**` or `**Expected results**`
 
-### logic/solution/heuristics.md
+### logic/solution/heuristics.md (when present)
 - Has `## H\d+` blocks
 - Contains `**Rationale**`
 - Contains `**Sensitivity**`
 - Contains `**Bounds**`
 
+### logic/solution/ method files
+- `logic/solution/constraints.md` exists (mandatory core)
+- Whatever other method files the work warrants (architecture/algorithm/method/study_design/
+formalization/proofs/…) exist and are non-trivial — there is no required set
+
 ### logic/related_work.md
 - Has `## RW\d+` blocks
 - Contains `**Type**`
@@ -80,10 +80,24 @@ All must exist with >10 bytes:
 
 ## 5. Count Checks
 
--
-
-
-
+Counts are **source-bounded targets, not quotas** (Rule 14): they must be met from genuine source
+content, never by padding with trivial, borrowed, or invented items. A paper that honestly supports
+fewer passes with fewer; what fails is fabricated filler.
+
+- `logic/concepts.md`: aim ≥5 concept sections (`## ` headers) — but only genuine technical terms
+- `logic/experiments.md`: aim ≥3 experiment/analysis blocks (`## E\d+`) — only experiments the paper actually describes
+- `src/execution/`: ≥1 `.py` file only when the work has implementable content (repo code / paper pseudocode / named interface). NOT mandatory otherwise; omitting it (with a note in `environment.md`) beats fabricating one.
+- `evidence/tables/`, `evidence/figures/`, or `evidence/proofs/`: contains the filed evidence (see §11)
+
+### Implementation layer (`src/`) — captured, not re-encoded
+- Concrete artifacts that exist are captured in native form: prompts/templates verbatim in `src/prompts/`, **real repo source code captured into `src/execution/`** (native form, `# Grounding: transcribed`, cite path — not reduced to a pointer), non-code deliverables (released binaries, skill/spec docs, referenced datasets) described in `src/artifacts.md`, config values in `src/configs/`. A lone `environment.md` is wrong when such artifacts exist.
+- **When a code repo/directory was provided as input**: every non-trivial runnable source file in it (a named module, entrypoint, or roughly ≥30 lines) is captured into `src/execution/` (or another `src/` subdir, in native form), not merely named in `artifacts.md`. FAIL on real source code represented only by a pointer, or a set of real files dismissed collectively (e.g. "not a core contribution") without capture.
+- Conversely, a prose-only method (no code, no prompt, no config values) is NOT re-encoded as a `.py` stub or pseudo-code — it lives in `logic/solution/`; a lone `environment.md` is correct here. FAIL on a `.py` stub manufactured from prose (it just duplicates the cognitive layer).
+
+### Code grounding (each `src/execution/*.py`, when present)
+- Declares a `# Grounding: transcribed|reconstructed` tag
+- Docstrings cite the source (§/Eq/repo path), not paraphrases of the compiler skill
+- FAIL if the file invents API names, constants, or function bodies with no traceable source — a hollow fabricated API must be omitted, not shipped
 
 ## 5b. Appendix Coverage
 
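The "Code grounding" bullet added in this hunk asks each `src/execution/*.py` file to declare a `# Grounding: transcribed|reconstructed` tag. One possible way to test for that tag is sketched below; the helper names `grounding_of` and `files_missing_grounding` are hypothetical, and the assumption that the tag sits at the start of a comment line is mine rather than the checklist's.

```python
import re
from pathlib import Path

# Assumption: the tag appears at the start of a line, exactly as written in the checklist.
GROUNDING_TAG = re.compile(r"^# Grounding: (transcribed|reconstructed)\b", re.MULTILINE)


def grounding_of(source_text):
    """Return 'transcribed' or 'reconstructed', or None when no valid tag is declared."""
    match = GROUNDING_TAG.search(source_text)
    return match.group(1) if match else None


def files_missing_grounding(execution_dir):
    """List the .py files directly under execution_dir that declare no grounding tag."""
    directory = Path(execution_dir)
    if not directory.is_dir():
        return []  # src/execution/ is optional, so a missing directory is not a failure
    return sorted(
        path.name
        for path in directory.glob("*.py")
        if grounding_of(path.read_text(encoding="utf-8")) is None
    )
```

This only covers the mechanical half of the rule. Whether a docstring really cites its source, or whether an API was invented, still needs a reader.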
@@ -93,12 +107,20 @@ one ARA file, with the granularity of the source preserved.
 ## 6. Evidence Quality
 
 For each file in `evidence/tables/*.md` and `evidence/figures/*.md`:
-- Must contain a Markdown table (`|...|...|` pattern)
 - Must contain `**Source**` field
+- **Must have a sibling screenshot `.png`** (e.g. `table3.md` ↔ `table3.png`, `figure5.md` ↔ `figure5.png`), declared via a `**Screenshot**` field
+- Table files must contain a Markdown table (`|...|...|` pattern)
 - If the filename includes `table{N}` or `figure{N}`, the `**Source**` field must reference the same identifier
 - If the file is a derived subset, it must say so explicitly via `**Extraction type**: derived_subset` or equivalent
 - Raw source-table files should not silently omit rows while still presenting themselves as the original table
 
+For each file in `evidence/figures/*.md` specifically:
+- Must declare `**Figure type**` in {quantitative_plot, diagram, qualitative_sample, mixed}
+- Must declare `**Extraction method**` in {exact_from_labels, digitized_estimate, visual_description} and `**Reading confidence**` in {high, medium, low}
+- `quantitative_plot` figures must contain either a Markdown data table OR an explicit unreadable statement with `Reading confidence: low` plus a `Trend summary`; their `**Axes**` field must state the scale (linear/log)
+- `diagram` and `qualitative_sample` figures must contain a `Visual description` section and must NOT present a fabricated numeric data table
+- Any estimated numeric reading should be marked approximate (`≈`) and the file's extraction method should be `digitized_estimate` (not `exact_from_labels`)
+
 ## 7. evidence/README.md
 
 - Must contain a Markdown table (file index)
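The enumerated figure fields added in this hunk lend themselves to a table-driven check. The sketch below is one assumed reading: it expects each field as `**Name**: value` on a line (the shape the checklist uses for `**Extraction type**: derived_subset`), and the function name `check_figure_fields` is hypothetical. The content rules for `quantitative_plot` and `diagram` files are not modelled.

```python
import re

# Allowed values copied from the checklist; the parsing convention is an assumption.
ALLOWED = {
    "Figure type": {"quantitative_plot", "diagram", "qualitative_sample", "mixed"},
    "Extraction method": {"exact_from_labels", "digitized_estimate", "visual_description"},
    "Reading confidence": {"high", "medium", "low"},
}
FIELD_LINE = re.compile(r"\*\*(?P<name>[^*]+)\*\*\s*:\s*(?P<value>\S+)")


def check_figure_fields(text):
    """Return failure messages for one evidence/figures/*.md body."""
    declared = {
        m.group("name").strip(): m.group("value") for m in FIELD_LINE.finditer(text)
    }
    failures = []
    for name, allowed in ALLOWED.items():
        value = declared.get(name)
        if value is None:
            failures.append(f"missing **{name}**")
        elif value not in allowed:
            failures.append(f"**{name}** has unexpected value {value!r}")
    # Approximate readings should not be labelled as exact label transcriptions.
    if "≈" in text and declared.get("Extraction method") == "exact_from_labels":
        failures.append("approximate readings (≈) conflict with exact_from_labels")
    return failures
```

Keeping the allowed values in one dictionary means a future checklist revision that adds a figure type only touches data, not control flow.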
@@ -109,10 +131,9 @@ For each file in `evidence/tables/*.md` and `evidence/figures/*.md`:
 
 - Parses as valid YAML
 - Has top-level `tree` key
--
+- ~8+ nodes is the target for a rich paper, but a smaller fully source-backed tree PASSES — do not flag low counts that reflect a paper genuinely exposing little exploration (Rule 14). What fails is invented/unsupported nodes (see Trace Hygiene), not honest small trees.
 - All node types in {question, decision, experiment, dead_end, pivot}
--
-- At least 1 `decision` node exists
+- `dead_end` / `decision` nodes are expected when the paper reveals ablations, rejected alternatives, or design choices — but are NOT required if the source exposes none; never invent one to satisfy this check (Rule 9)
 - Every node has `id` and `type` fields
 - Every node has `support_level` in {explicit, inferred}
 - Type-specific required fields:
@@ -134,10 +155,10 @@ For each file in `evidence/tables/*.md` and `evidence/figures/*.md`:
 ### Experiment Verifies → Claim Resolution
 - Every `C\d+` in an experiment's `**Verifies**` must exist in claims.md
 
-### Heuristic Code Ref → File Resolution
+### Heuristic Code Ref → File Resolution (only when heuristics.md + src/execution/ are both present)
 - Every `src/...` path in `**Code ref**: [...]` must be an existing file
 
-### Architecture Components → Code Stubs (fuzzy)
+### Architecture Components → Code Stubs (fuzzy; only when architecture.md + src/execution/ are both present)
 - Significant words from `## ` headings in architecture.md should appear somewhere in src/execution/ code
 
 ### Tree Evidence → Claims (YAML)
@@ -146,3 +167,31 @@ For each file in `evidence/tables/*.md` and `evidence/figures/*.md`:
 ### Trace Hygiene
 - Do not add dead_end, decision, or experiment nodes that are unsupported by the provided source material
 - If a node is reconstructed from partial evidence rather than stated explicitly, it should be marked as inferred or excluded from Seal Level 1 outputs
+
+## 10. Citation Verification (Rule 15)
+
+- Every repo path / `file:line` referenced (in `src/`, heuristic `Code ref`, environment "Code location") exists in the provided repo; no line reference points past the file's actual length
+- No fact ABOUT a repo artifact (line count, path, internal structure) is transcribed from the paper without checking the real file — when paper and repo disagree, the discrepancy is flagged, not silently resolved to the paper's number
+- Spot-check trace `source_refs` and evidence `**Source**` labels: the cited section/table/appendix actually contains the claimed content
+- A statistic carries its scope/denominator (N, population) in its `Source` — subset figures (e.g. "5 papers / 3,050 reqs") are not juxtaposed with full-corpus figures as if same-denominator
+- **Claim/heuristic number sources** (exhaustive, not spot-checked): each `**Sources**` entry's cited
+`file:line` (or trace `node:field`) exists, the verbatim «quote» is actually present there, and the
+number in the `Statement`/`Rationale` matches the value inside that quote; `[input]` entries cite
+recipe scripts and `[result]` entries cite run logs/trace (not swapped). A bare path with no «quote»,
+a «quote» absent from the cited line, or a value that disagrees with its quote FAILS. `[pending: …]`
+entries pass but are listed for follow-up — an unverified plausible path does not pass
+
+## 11. Evidence Ledger Completeness
+
+- **Every numbered `Table N` and `Figure N` in the source is filed** — a complete, in-order sweep,
+not a sample. Each filed object has BOTH a markdown file and a screenshot `.png`.
+- Every value a claim quotes traces to a filed table/figure.
+- Any numbered object deliberately not filed (e.g. an exact duplicate) is listed in
+`evidence/README.md` with a reason — no silent omissions. A run that quietly filed only some of
+the source's tables/figures FAILS.
+
+## 12. Self-Consistency
+
+- Any ARA-authored derived number (a delta, percentage, or comparison the ARA computes itself) recomputes correctly from its cited cells
+- `PAPER.md` frontmatter/Layer-Index declared counts (claims, concepts, experiments, …) match the actual files
+- Tree `evidence:` references are claim IDs (`C\d+`), not observation IDs (`O\d+`) or other layers
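The exhaustive `**Sources**` check in section 10 of this hunk can be illustrated with a small verifier. The sketch below assumes the entry syntax given in the claim schema later in this diff (`<value> ← <file:line> «quote» [input|result]`, or `<value> ← [pending: reason]`); the function name `verify_entry`, the in-memory `files` mapping, and the result strings are all hypothetical. Trace `node:field` references and the input-versus-result source pairing are not handled.

```python
import re

# Entry shapes follow the schema text; the regular expressions are an assumed parse.
ENTRY = re.compile(
    r"^(?P<value>.+?)\s*←\s*(?P<ref>\S+)\s*«(?P<quote>.+)»\s*\[(?P<tag>input|result)\]$"
)
PENDING = re.compile(r"^(?P<value>.+?)\s*←\s*\[pending:\s*(?P<reason>.+)\]$")


def verify_entry(entry, files):
    """Classify one Sources entry.

    `files` maps a path to its list of lines; it stands in for the real repo.
    """
    entry = entry.strip()
    if PENDING.match(entry):
        return "pending"  # passes, but should be listed for follow-up
    m = ENTRY.match(entry)
    if m is None:
        return "fail: malformed entry"  # e.g. a bare path with no «quote»
    path, _, line_no = m.group("ref").rpartition(":")
    lines = files.get(path)
    if lines is None or not line_no.isdigit() or not 1 <= int(line_no) <= len(lines):
        return "fail: cited file:line does not exist"
    if m.group("quote") not in lines[int(line_no) - 1]:
        return "fail: quote not present at cited line"
    if m.group("value").strip() not in m.group("quote"):
        return "fail: value disagrees with quote"
    return "ok"
```

The value-in-quote test is a plain substring match, so it would accept a value that happens to be a fragment of a longer number. A production check would presumably tokenize numbers first.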
@@ -15,7 +15,7 @@ argument-hint: "[optional: hint about what happened this turn]"
 allowed-tools: Read, Write, Edit, Glob, Grep
 metadata:
 author: ara-commons
-version: "2.
+version: "2.3.0"
 tags: [research, process-recording, provenance, progressive-crystallization, knowledge-management]
 ---
 
@@ -128,8 +128,10 @@ When a signal fires for `O{XX}`:
 
 1. Read O{XX}'s `content`, `context`, `potential_type`, `provenance`, `bound_to`.
 2. Allocate the next ID for the target layer (read the target file first).
-3. Construct a typed entry using the schema (see Schemas below).
-`
+3. Construct a typed entry using the schema (see Schemas below). **Before any number enters a
+`Statement`/`Rationale`, ground it per "Number grounding" below — open the source, copy the
+matched line verbatim into `Sources`, then write the number as a copy of that quote.** Carry
+forward `provenance`. Verbal-affirmation upgrades `ai-suggested` → `user-revised` (or `user` if
 reproduced verbatim). The other three signals do **not** upgrade provenance.
 4. Add fields: `Crystallized via: <signal>`, `From staging: O{XX}`.
 5. Establish forensic bindings (claim→proof, heuristic→code, decision→evidence). Use
@@ -137,6 +139,24 @@ When a signal fires for `O{XX}`:
 6. Update O{XX}: `promoted: true`, `promoted_to: <layer>:<id>`, `crystallized_via: <signal>`.
 **Do not delete the observation** — the trail from raw to typed is part of the record.
 
+#### Number grounding (claims & heuristics)
+
+Every load-bearing number in a `Statement` (or a heuristic's `Rationale`/`Sensitivity`/`Bounds`)
+is grounded the way code is — transcribed from an open source, never written from memory:
+
+1. **Open before you write.** Before the number enters the prose, open its source and copy the
+matched line *verbatim* into `Sources` (`<value> ← <source ref> «matched line» [input|result]`).
+The number you then write in the prose is a copy of the value inside that quote — not a value
+recalled and back-cited. An entry with a bare path and no «quote» is invalid.
+2. **Input vs result.** Tag each entry `[input]` (a value you set — cite the source that defines it)
+or `[result]` (a value the run produced — cite the log/output that reports it). Don't cite a
+measured outcome to the config meant to produce it, or vice versa.
+3. **No inheritance.** Re-open *this* claim's own source for every number; a value shared with a
+dependency claim is re-verified here, never copied from the dependency's wording.
+4. **`[pending]` beats a guess.** Can't open or locate a source this turn? Write
+`<value> ← [pending: what's missing]`. An unverified-but-plausible path is fabrication and is
+worse than `[pending]`.
+
 #### Contradiction trigger
 
 When a new event contradicts something already staged or crystallized:
@@ -155,25 +175,17 @@ researcher to triage — the manager does not auto-discard.
 
 ### Stage 4 — Logic Layer Reconciliation
 
-
-
-
-present evidence.
-
-The trace layer (`ara/trace/`, `ara/staging/`) is append-only and immutable. All history
-of how the logic layer evolved — prior statements, status transitions, revision reasons —
-lives there. The logic file itself carries only the current snapshot plus a `Last revised`
-pointer back to the trace.
-
-This stage operates only on **already-crystallized** entries in `logic/`. Staged
-observations belong to Stage 3.
+Reconcile `logic/` (the current best understanding) with this turn's events so it stays
+internally consistent and faithful to present evidence. Operates only on **already-crystallized**
+entries — staged observations belong to Stage 3. (History lives in the trace; see Layer Mutability.)
 
 #### What Stage 4 may do
 
 1. **Status updates** — flip a claim's `Status` field when evidence warrants.
 2. **Content revisions** — rewrite a `Statement`, `Rationale`, or definition when new
 evidence narrows scope, terminology changed, or wording no longer matches what's
-actually supported.
+actually supported. A rewrite re-grounds every number it now contains (Number grounding);
+any changed value gets its own fresh `Sources` «quote», never a carried-over one.
 3. **Structural changes** — split a claim into two, merge duplicates, repair
 dependencies, rename ids when concepts are renamed.
 4. **Consistency pass** — scan for broken cross-references (claim cites C05 which no
@@ -281,28 +293,13 @@ When a signal fires for entry `E` (claim, heuristic, or concept):
 ## Per-Turn Procedure
 
 ```
-1. Read existing ara/ files
-2. Stage 1 —
-3. Stage 2 —
-
-
-
-
-for each staged observation: check closure signals → crystallize if fired
-for each entry: check contradictions with this turn's events → flag if found
-for long-staged observations (3+ days idle): mark stale: true
-5. Stage 4 — Logic Layer Reconciliation:
-for each crystallized entry in logic/ (claims, heuristics, concepts):
-check status signals → edit Status line if fired
-check content signals → rewrite Statement / Rationale / definition if reconciliation demanded
-check structural signals → split, merge, repair dependencies, fix terminology drift
-run cross-reference consistency pass (broken refs, renamed ids, terminology mismatch)
-record before/after of every change in today's session record (the logic file does not retain history)
-log near-miss signals (considered but rejected) to pm_reasoning_log.yaml
-6. Append turn events to today's session record.
-7. Update or append today's entry in trace/sessions/session_index.yaml.
-8. Append a brief reasoning entry to trace/pm_reasoning_log.yaml (self-continuity).
-9. Print one-line summary, e.g.:
+1. Read existing ara/ files (current state, next IDs).
+2. Stage 1 — harvest this turn's candidate events.
+3. Stage 2 — classify/route each (per event-taxonomy.md): journey facts direct to trace/; interpretive events staged to staging/observations.yaml.
+4. Stage 3 — crystallize staged observations whose closure signal fired; flag contradictions; mark 3+-day-idle observations stale.
+5. Stage 4 — for each crystallized logic/ entry, apply status/content/structural edits when a signal fires; run the cross-ref consistency pass; record before/after in the session record; log near-misses.
+6. Append turn events to today's session record; update session_index.yaml; append a line to pm_reasoning_log.yaml.
+7. Print one-line summary, e.g.:
 [PM] Turn captured: 1 decision (direct), 2 observations staged, 1 claim crystallized via affirmation, C03 testing→supported, C07 revised (scope narrowed).
 Or, for empty turns:
 [PM] Turn skipped: no research events.
@@ -314,20 +311,9 @@ When a signal fires for entry `E` (claim, heuristic, or concept):
 ara/
 PAPER.md # Root manifest + layer index
 logic/ # MUTABLE — current best understanding (Stage 4 reconciles)
-problem.md
-
-
-experiments.md
-solution/
-architecture.md
-algorithm.md
-constraints.md
-heuristics.md # Tricks + rationale + sensitivity
-related_work.md
-src/ # How (code artifacts)
-configs/
-kernel/
-environment.md
+claims.md problem.md concepts.md experiments.md related_work.md
+solution/ # constraints.md + method files per the compiler's domain profile
+src/ # How (artifacts) — configs/code/data per domain profile; always environment.md
 trace/ # APPEND-ONLY — the journey, never rewritten
 exploration_tree.yaml # Research DAG: decisions, experiments, dead_ends, pivots, questions
 pm_reasoning_log.yaml # Manager's own organizational decisions per turn
@@ -377,6 +363,7 @@ tree:
 ```markdown
 ## C{XX}: {title}
 - **Statement**: {current falsifiable assertion}
+- **Sources**: [{one entry per load-bearing number in `Statement`: `<value> ← <file:line | trace-node:field> «verbatim line copied from source» [input|result]`, or `<value> ← [pending: reason]`}] # see "Number grounding"; a bare path with no «quote» is invalid
 - **Status**: hypothesis | untested | testing | supported | weakened | refuted | withdrawn
 - **Provenance**: user | ai-suggested | user-revised
 - **Falsification criteria**: {what would disprove this}
@@ -386,37 +373,26 @@ tree:
 - **Last revised**: YYYY-MM-DD (turn-id) # pointer back to the trace; absent until first revision
 ```
 
-
-
-
-
-
-was promoted) and `staging/observations.yaml` (the source observation, still flagged
-`promoted: true`).
-- Every subsequent edit: `trace/sessions/YYYY-MM-DD_NNN.yaml` under `logic_revisions:`
-with full before/after, signal, and provenance.
-- Reasoning for each edit: `trace/pm_reasoning_log.yaml`.
-
-`refuted` and `withdrawn` are terminal — once set, the claim is not edited further except
-via an explicit revival by the user (which reopens it through a `revised` transition and
-settles to `testing` or `hypothesis`). `revised` itself is a transition marker, not a
-resting state: after the revision is recorded in the trace, `Status` settles back to a
-working value.
+Current-state snapshot only — no prior statements, no `From staging`/`Crystallized via`
+notes. Crystallization and every edit are recorded in the trace (`trace/sessions/…` under
+`logic_revisions:` with before/after; source observation stays in `staging/`; reasoning in
+`pm_reasoning_log.yaml`). `refuted`/`withdrawn` are terminal and `revised` is a transition
+marker, not a resting state — see Stage 4.
 
 ### Heuristic (`logic/solution/heuristics.md`) — crystallized only
 
 ```markdown
 ## H{XX}: {title}
 - **Rationale**: {current best explanation of why this works}
+- **Sources**: [{one entry per load-bearing number in `Rationale`/`Sensitivity`/`Bounds`, same format as claims — see "Number grounding"}]
 - **Status**: active | weakened | retired
 - **Provenance**: user | ai-suggested | user-revised
-- **Sensitivity**: low | medium | high
-- **Code ref**: [{file paths}]
+- **Sensitivity**: low | medium | high | unknown # "unknown" until the turn establishes it — never guess
+- **Code ref**: [{file paths, or "pending"}]
 - **Last revised**: YYYY-MM-DD (turn-id) # absent until first revision
 ```
 
-
-`Crystallized via` clutter. Crystallization and revision history live in the trace.
+Current-state snapshot only (same as claims); history lives in the trace.
 
 ### Observation (`staging/observations.yaml`) — staged
 
@@ -527,7 +503,7 @@ Create the structure on the first turn that contains research-significant activi
 ask unprompted on a purely conversational opener.
 
 ```
-mkdir -p ara/{logic/solution,src
+mkdir -p ara/{logic/solution,src,trace/sessions,evidence/{tables,figures},staging}
 ```
 
 Seed:
@@ -557,32 +533,11 @@ deliver the full briefing.
 
 ## Rules
 
-1. **
-2. **Never fabricate
-3. **Stage by default
-
-
-
-
-
-rewrites, splits/merges, and consistency repairs are allowed but require an explicit
-signal from this turn. Log near-misses. Terminal states (`refuted`, `withdrawn`)
-need explicit triggers — never reach them by silence or staleness.
-7. **Logic layer is a current-state snapshot.** Each edit overwrites the prior value in
-`logic/`. The before/after lives in the trace, not in the logic file. Never carry a
-`Previous statement` line or status history in claim entries.
-8. **Trace and staging are append-only.** Never edit prior entries in `trace/sessions/`,
-`trace/pm_reasoning_log.yaml`, `trace/exploration_tree.yaml`, or
-`staging/observations.yaml` except to set forward-reference pointers (e.g.
-`promoted: true`, `promoted_to:`, appending to today's events). Existing content is
-never rewritten.
-9. **Never silently overwrite contradictions.** Flag both, append unresolved decision
-node, defer.
-10. **Always read existing files first.** Get correct next IDs, avoid duplicates.
-11. **Establish forensic bindings.** claim→proof, heuristic→code, decision→evidence. Use
-`[pending]` + TODO if not yet bindable.
-12. **Every logic-layer edit gets a `logic_revisions:` entry in the session record** with
-full before/after. This is the only place pre-edit content is preserved.
-13. **Skip empty turns.** No record for greetings, ack, pure formatting.
-14. **Keep YAML valid.** Validate structure mentally before writes.
-15. **Be terse in the summary line.** One line per turn, factual, no narration.
+1. **End-of-turn only; never mid-turn.** Skip empty turns (greetings, ack, formatting).
+2. **Never fabricate.** Log only what actually happened or was discussed.
+3. **Stage interpretive events by default; crystallize only on a closure signal** — abandonment / affirmation / resolution / commitment. No counters, no LM-judged maturity.
+4. **Never auto-upgrade provenance.** `ai-suggested` holds until explicit user affirmation.
+5. **Stage 4 defaults to no change.** Edits require an explicit signal this turn; terminal states (`refuted`/`withdrawn`) need explicit triggers, never silence/staleness. Log near-misses.
+6. **Respect layer mutability** (see top): `logic/` overwrites in place; `trace/` and `staging/` are append-only except forward-reference pointers. Every logic edit gets a `logic_revisions:` before/after in the session record — the only place pre-edit content is kept.
+7. **Never silently overwrite contradictions** — flag both, append an `unresolved` decision node, defer.
+8. **Read target files first** (correct IDs, no dupes); establish forensic bindings (claim→proof, heuristic→code, decision→evidence), `[pending]`+TODO if not yet bindable. Keep YAML valid; summary line terse.
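Rule 4 in the hunk above, read together with the crystallization step earlier in this file's diff (affirmation upgrades `ai-suggested` to `user-revised`, or to `user` if reproduced verbatim; the other three signals do not upgrade), can be written as a tiny pure function. This is a sketch under assumptions: the function name and the `reproduced_verbatim` flag are hypothetical, and the skill itself states the rule only in prose.

```python
# Signal names copied from rule 3; the function is an illustrative assumption.
CLOSURE_SIGNALS = {"abandonment", "affirmation", "resolution", "commitment"}


def provenance_after_crystallization(current, signal, reproduced_verbatim=False):
    """Return the provenance a crystallized entry should carry.

    Only an explicit affirmation can lift `ai-suggested`; every other
    closure signal carries the existing provenance forward unchanged.
    """
    if signal not in CLOSURE_SIGNALS:
        raise ValueError(f"unknown closure signal: {signal!r}")
    if signal == "affirmation" and current == "ai-suggested":
        return "user" if reproduced_verbatim else "user-revised"
    return current
```

Making the upgrade a function of the signal alone keeps the "never auto-upgrade" guarantee easy to audit: there is a single branch that can change provenance.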
package/src/installer.js
CHANGED