@ara-commons/ara-skills 0.1.0

package/README.md ADDED
@@ -0,0 +1,67 @@
1
+ # @ara-commons/ara-skills
2
+
3
+ One-command installer for the three **Agent-Native Research Artifact (ARA)** skills:
4
+
5
+ | Skill | Invoke | What it does |
6
+ |-------|--------|--------------|
7
+ | `compiler` | `/compiler <input>` | Convert a paper, repo, or notes into a complete ARA artifact |
8
+ | `research-manager` | `/research-manager` | Post-session recorder that captures decisions, dead ends, and claims |
9
+ | `rigor-reviewer` | `/rigor-reviewer <dir>` | ARA Seal Level 2 semantic epistemic review across six dimensions |
10
+
11
+ ## Quick start
12
+
13
+ ```bash
14
+ # interactive (auto-detects Claude Code, Cursor, Gemini CLI, OpenCode, Codex, Hermes)
15
+ npx @ara-commons/ara-skills
16
+
17
+ # install everything to every detected agent (global / user-level)
18
+ npx @ara-commons/ara-skills install --all
19
+
20
+ # install just the compiler to Claude Code
21
+ npx @ara-commons/ara-skills install --skill compiler --agent claude-code
22
+
23
+ # install into the current project instead of $HOME
24
+ npx @ara-commons/ara-skills install --all --local
25
+ ```
26
+
27
+ ## Commands
28
+
29
+ ```
30
+ ara-skills # interactive
31
+ ara-skills install [--all] [--skill <id>] [--agent <id>] [--local] [--force]
32
+ ara-skills update [--agent <id>] [--local]
33
+ ara-skills uninstall [--skill <id>] [--agent <id>] [--local]
34
+ ara-skills list # what is installed, where
35
+ ara-skills skills # what's available to install
36
+ ara-skills agents # which agents are supported / detected
37
+ ```
38
+
39
+ All `--skill` and `--agent` flags are repeatable.
40
+
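Repeatable flags are usually collected into arrays rather than overwritten. A minimal sketch of that behavior, assuming a hand-rolled parser: `parseFlags` and the shape of the object it returns are hypothetical and are not taken from the package's `src/index.js`, which this diff does not show.

```javascript
// Hypothetical parser for the flags listed in the Commands block above.
// Only the flag names come from the README; everything else is illustrative.
function parseFlags(argv) {
  const opts = { command: null, skill: [], agent: [], all: false, local: false, force: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === '--skill' || arg === '--agent') {
      // Repeatable: each occurrence appends one value to the matching list.
      const value = argv[++i];
      if (value === undefined) throw new Error(`${arg} needs a value`);
      opts[arg.slice(2)].push(value);
    } else if (arg === '--all' || arg === '--local' || arg === '--force') {
      opts[arg.slice(2)] = true;
    } else if (opts.command === null && !arg.startsWith('--')) {
      opts.command = arg; // first bare word is treated as the subcommand
    } else {
      throw new Error(`unexpected argument: ${arg}`);
    }
  }
  return opts;
}

console.log(parseFlags(['install', '--skill', 'compiler', '--skill', 'rigor-reviewer', '--agent', 'cursor']));
```

In this sketch, passing `--skill` twice yields a two-element `skill` array, which is the behavior the sentence above describes.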
41
+ ## Agent targets
42
+
43
+ | Agent | Global dir | Local dir |
44
+ |--------------|-------------------------------|-----------------------------|
45
+ | claude-code | `~/.claude/skills/` | `.claude/skills/` |
46
+ | cursor | `~/.cursor/skills/` | `.cursor/skills/` |
47
+ | gemini-cli | `~/.gemini/skills/` | `.gemini/skills/` |
48
+ | opencode | `~/.opencode/skills/` | `.opencode/skills/` |
49
+ | codex | `~/.codex/skills/` | `.codex/skills/` |
50
+ | hermes | `~/.hermes/skills/` | `.hermes/skills/` |
51
+ | generic | `~/.skills/` | `./skills/` |
52
+
53
+ After install, each skill lives at `<target>/<skill-id>/SKILL.md`. A small `.ara-skills.json` lock file records what was installed so `update` and `uninstall` work.
54
+
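The mapping from agent and scope to a final `SKILL.md` path can be sketched as a lookup over the table above. This is an illustration only: `AGENT_DIRS` transcribes the README table, while `resolveSkillPath` and its `local` and `home` options are hypothetical names, not the package's actual API.

```javascript
// Directory table copied from the README's "Agent targets" section.
const AGENT_DIRS = {
  'claude-code': { global: '~/.claude/skills/', local: '.claude/skills/' },
  'cursor': { global: '~/.cursor/skills/', local: '.cursor/skills/' },
  'gemini-cli': { global: '~/.gemini/skills/', local: '.gemini/skills/' },
  'opencode': { global: '~/.opencode/skills/', local: '.opencode/skills/' },
  'codex': { global: '~/.codex/skills/', local: '.codex/skills/' },
  'hermes': { global: '~/.hermes/skills/', local: '.hermes/skills/' },
  'generic': { global: '~/.skills/', local: './skills/' },
};

// Hypothetical helper: returns <target>/<skill-id>/SKILL.md for one agent.
// `home` replaces the leading "~" so callers can pass a real home directory.
function resolveSkillPath(agent, skillId, { local = false, home = '~' } = {}) {
  const dirs = AGENT_DIRS[agent];
  if (!dirs) throw new Error(`unknown agent: ${agent}`);
  const base = local ? dirs.local : dirs.global.replace(/^~/, home);
  return `${base}${skillId}/SKILL.md`;
}

console.log(resolveSkillPath('claude-code', 'compiler'));
// prints "~/.claude/skills/compiler/SKILL.md"
console.log(resolveSkillPath('generic', 'rigor-reviewer', { local: true }));
// prints "./skills/rigor-reviewer/SKILL.md"
```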
55
+ ## Development
56
+
57
+ ```bash
58
+ cd packages/ara-skills
59
+ npm install
60
+ node bin/cli.js install --skill compiler --local --force
61
+ ```
62
+
63
+ In dev mode the CLI reads skills from the sibling `../../skills/` directory. On `npm pack` / `npm publish`, `prepack` copies that directory into `packages/ara-skills/skills/` so the tarball is self-contained; `postpack` removes the copy afterward.
64
+
65
+ ## Upstream source of truth
66
+
67
+ The three skill directories live at the repo root under `skills/`. Edit them there — never edit the copy inside this package, which is created on demand by `prepack`.
package/bin/cli.js ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env node
2
+ import { main } from '../src/index.js';
3
+
4
+ main(process.argv.slice(2)).catch((err) => {
5
+ console.error(`\nError: ${err?.message ?? err}`);
6
+ if (process.env.ARA_SKILLS_DEBUG) console.error(err?.stack);
7
+ process.exit(1);
8
+ });
package/package.json ADDED
@@ -0,0 +1,57 @@
1
+ {
2
+ "name": "@ara-commons/ara-skills",
3
+ "version": "0.1.0",
4
+ "description": "Install Agent-Native Research Artifact (ARA) skills — compiler, research-manager, rigor-reviewer — into Claude Code, Cursor, OpenCode, Gemini CLI, Codex, and more.",
5
+ "type": "module",
6
+ "bin": {
7
+ "ara-skills": "./bin/cli.js"
8
+ },
9
+ "main": "src/index.js",
10
+ "scripts": {
11
+ "start": "node bin/cli.js",
12
+ "prepack": "node scripts/bundle-skills.mjs",
13
+ "postpack": "node scripts/clean-bundle.mjs",
14
+ "test": "node --test test/*.test.js"
15
+ },
16
+ "files": [
17
+ "bin",
18
+ "src",
19
+ "scripts",
20
+ "skills",
21
+ "README.md"
22
+ ],
23
+ "keywords": [
24
+ "ara",
25
+ "agent-native-research-artifact",
26
+ "research",
27
+ "skills",
28
+ "claude-code",
29
+ "cursor",
30
+ "gemini",
31
+ "opencode",
32
+ "codex",
33
+ "compiler",
34
+ "research-manager",
35
+ "rigor-reviewer",
36
+ "llm",
37
+ "cli"
38
+ ],
39
+ "author": "Knowledge Management",
40
+ "license": "MIT",
41
+ "repository": {
42
+ "type": "git",
43
+ "url": "https://github.com/AmberLJC/Agent-Native-Research-Artifact.git",
44
+ "directory": "packages/ara-skills"
45
+ },
46
+ "homepage": "https://github.com/AmberLJC/Agent-Native-Research-Artifact#readme",
47
+ "bugs": {
48
+ "url": "https://github.com/AmberLJC/Agent-Native-Research-Artifact/issues"
49
+ },
50
+ "engines": {
51
+ "node": ">=18.0.0"
52
+ },
53
+ "dependencies": {
54
+ "@inquirer/prompts": "^7.0.0",
55
+ "chalk": "^5.3.0"
56
+ }
57
+ }
package/scripts/bundle-skills.mjs ADDED
@@ -0,0 +1,34 @@
1
+ #!/usr/bin/env node
2
+ // prepack: copy the monorepo's top-level skills/ into this package so the
3
+ // published tarball is self-contained. Clean-up runs from postpack.
4
+ import fs from 'node:fs';
5
+ import path from 'node:path';
6
+ import { fileURLToPath } from 'node:url';
7
+
8
+ const here = path.dirname(fileURLToPath(import.meta.url));
9
+ const pkgRoot = path.resolve(here, '..');
10
+ const src = path.resolve(pkgRoot, '..', '..', 'skills');
11
+ const dst = path.join(pkgRoot, 'skills');
12
+
13
+ if (!fs.existsSync(src)) {
14
+ console.error(`[ara-skills:prepack] source not found: ${src}`);
15
+ process.exit(1);
16
+ }
17
+
18
+ // A bundle may already exist inside the package (e.g. the user ran
19
+ // `prepack` manually); remove and re-copy it so the result is deterministic.
20
+ fs.rmSync(dst, { recursive: true, force: true });
21
+ fs.mkdirSync(dst, { recursive: true });
22
+
23
+ function copyDir(s, d) {
24
+ fs.mkdirSync(d, { recursive: true });
25
+ for (const e of fs.readdirSync(s, { withFileTypes: true })) {
26
+ const a = path.join(s, e.name);
27
+ const b = path.join(d, e.name);
28
+ if (e.isDirectory()) copyDir(a, b);
29
+ else fs.copyFileSync(a, b);
30
+ }
31
+ }
32
+
33
+ copyDir(src, dst);
34
+ console.log(`[ara-skills:prepack] bundled skills/ from ${src}`);
package/scripts/clean-bundle.mjs ADDED
@@ -0,0 +1,15 @@
1
+ #!/usr/bin/env node
2
+ // postpack: remove the bundled skills/ copy so the working tree stays clean.
3
+ // The published tarball already contains it at this point.
4
+ import fs from 'node:fs';
5
+ import path from 'node:path';
6
+ import { fileURLToPath } from 'node:url';
7
+
8
+ const here = path.dirname(fileURLToPath(import.meta.url));
9
+ const pkgRoot = path.resolve(here, '..');
10
+ const dst = path.join(pkgRoot, 'skills');
11
+
12
+ if (fs.existsSync(dst)) {
13
+ fs.rmSync(dst, { recursive: true, force: true });
14
+ console.log('[ara-skills:postpack] cleaned packages/ara-skills/skills/');
15
+ }
@@ -0,0 +1,255 @@
1
+ ---
2
+ name: compiler
3
+ description: |
4
+ Universal ARA Compiler. Converts ANY research input — PDF papers, GitHub repositories,
5
+ experiment logs, code directories, raw notes, or combinations thereof — into a complete
6
+ Agent-Native Research Artifact (ARA). Produces a structured, machine-executable knowledge
7
+ package with cognitive layer (claims, concepts, heuristics), physical layer (configs, code
8
+ stubs), exploration graph (research DAG), and grounded evidence.
9
+
10
+ TRIGGERS: compile, create ARA, generate artifact, convert paper, build artifact, compile paper,
11
+ ARA from PDF, ARA from repo, ARA from code, structure research, extract knowledge
12
+ argument-hint: "[any input — paths, URLs, descriptions, or nothing]"
13
+ allowed-tools: Read, Write, Edit, Bash(python *|git clone *|ls *|mkdir *), Glob, Grep, Task
14
+ metadata:
15
+ author: ara-commons
16
+ category: research-tooling
17
+ version: "1.0.0"
18
+ tags: [research, compilation, artifacts, knowledge-extraction]
19
+ ---
20
+
21
+ # Universal ARA Compiler
22
+
23
+ You are the ARA Universal Compiler. Your job: take ANY research input and produce a complete,
24
+ validated ARA artifact. You operate as a first-class Claude Code agent — use your native tools
25
+ (Read, Write, Edit, Bash, Glob, Grep) directly. No API wrapper needed.
26
+
27
+ ## Input Philosophy
28
+
29
+ The compiler is **open-ended**. It accepts anything that contains research knowledge — there is
30
+ no fixed input schema. Your job is to figure out what you've been given and extract maximum
31
+ structured knowledge from it.
32
+
33
+ Possible inputs include (but are NOT limited to):
34
+ - PDF papers, arXiv links
35
+ - GitHub repositories (URLs or local paths)
36
+ - Code files, scripts, notebooks (`.py`, `.ipynb`, `.rs`, `.cpp`, etc.)
37
+ - Experiment logs, training outputs, evaluation results
38
+ - Configuration files, hyperparameter sweeps
39
+ - Raw research notes, brainstorm transcripts, meeting notes
40
+ - Data directories with results, checkpoints, figures
41
+ - Slack/email threads describing research decisions
42
+ - Combinations of the above
43
+ - A verbal description or conversation with the user about their research
44
+ - Nothing at all — the user may want to build an ARA interactively through dialogue
45
+
46
+ When arguments are provided (`$ARGUMENTS`), interpret them flexibly:
47
+ - File/directory paths → read them
48
+ - URLs → fetch or clone them
49
+ - `--output <dir>` → where to write the ARA (default: `./ara-output/`)
50
+ - `--rubric <path>` → PaperBench rubric for coverage mapping
51
+ - Anything else → treat as context or ask the user for clarification
52
+
53
+ ### Input Reading Strategy
54
+
55
+ Adapt to whatever you receive:
56
+ 1. **Identify what you have.** Glob, read, and explore the provided paths. Understand the nature
57
+ of the input before committing to a generation plan.
58
+ 2. **Maximize coverage.** Cross-reference all available sources. A PDF gives narrative + claims;
59
+ code gives ground-truth implementation; experiment logs give the exploration trajectory;
60
+ notes give decisions and dead ends that never made it to paper.
61
+ 3. **Ask when stuck.** If the input is ambiguous or incomplete, ask the user to fill gaps rather
62
+ than hallucinating. The user is a collaborator, not a passive consumer.
63
+ 4. **Handle partial inputs gracefully.** Not every ARA field will be fillable from every input.
64
+ Populate what you can with high confidence, mark gaps explicitly with "Not available from
65
+ provided input", and tell the user what's missing so they can supplement later.
66
+
67
+ ## Workflow
68
+
69
+ ```
70
+ 1. READ all inputs
71
+ 2. REASON through the 4-stage epistemic protocol (see below)
72
+ 3. GENERATE all ARA files using Write tool
73
+ 4. COVERAGE CHECK loop (max 3 rounds): re-read source → diff against ARA → patch gaps
74
+ 5. VALIDATE by running Seal Level 1
75
+ 6. FIX any failures, re-validate
76
+ 7. REPORT summary to user
77
+ ```
78
+
79
+ ### Step 1: Read Inputs
80
+
81
+ Read ALL provided inputs thoroughly before generating anything. For PDFs, read every page,
82
+ **including appendices** — appendices often carry reproduction-critical content and should
83
+ be treated with the same priority as main-text pages.
84
+
85
+ For repos, prioritize: README → core algorithm files → configs → environment files.
86
+
87
+ ### Step 2: 4-Stage Epistemic Chain-of-Thought
88
+
89
+ Before writing any files, reason through these 4 stages. Think carefully about each stage.
90
+
91
+ **Stage 1 — Semantic Deconstruction**
92
+ Strip narrative framing. Extract the raw knowledge atoms:
93
+ - Mathematical formulations and equations
94
+ - Architectural specifications and component descriptions
95
+ - Experimental configurations (hyperparameters, hardware, datasets, seeds)
96
+ - ALL numerical results and benchmarks (exact values, never rounded)
97
+ - Citation dependencies and their roles (imports, extends, bounds, refutes)
98
+ - Negative results, ablation findings, rejected alternatives
99
+ - Implementation tricks, convergence hacks, sensitivity observations
100
+
101
+ Before moving on, perform an **evidence capture pass**:
102
+ - For every source table or figure you plan to cite, first capture the original source identifier and caption exactly (`Table 2`, `Figure 4`, etc.)
103
+ - Transcribe the raw table/figure content before making any claim-specific summary
104
+ - If you create a filtered view for one claim, store it as a **derived subset**, not as the original table itself
105
+ - Never label a subset or merged summary as `Table N` unless it reproduces the original source table faithfully
106
+ - If PDF extraction is ambiguous, re-read the page with layout preserved or inspect the page manually before writing evidence files
107
+
108
+ **Stage 2 — Cognitive Mapping**
109
+ Map extracted atoms to `/logic/`:
110
+ - **problem.md**: observations (with numbers) → gaps → key insight → assumptions
111
+ - **claims.md**: falsifiable claims with proof pointers to experiment IDs (E01, E02...), plus a separation between direct evidence basis and higher-level interpretation
112
+ - **concepts.md**: ≥5 formal definitions with notation and boundary conditions
113
+ - **experiments.md**: ≥3 declarative verification plans (NO exact numbers — directional only)
114
+ - **solution/**: architecture (component graph), algorithm (math + pseudocode), constraints, heuristics
115
+ - **related_work.md**: typed dependency graph (imports/extends/bounds/baseline/refutes)
116
+
117
+ Appendix content (worked examples, prompt templates, enumerated taxonomies, annotation
118
+ schemas, extended analyses, prescriptive content) should be routed into the ARA layers
119
+ where it fits best, preserving the granularity the source uses. Never silently drop an
120
+ appendix section.
121
+
122
+ When writing claims:
123
+ - Phrase the main `Statement` at the strongest level directly supported by the cited evidence
124
+ - Put raw support in `Evidence basis`
125
+ - Put any broader synthesis in `Interpretation`
126
+ - If the evidence only shows validation metrics, do not upgrade the claim to training dynamics or optimization quality unless training-side evidence is also captured
127
+
128
+ `related_work.md` should reflect the paper's full citation footprint, not only the
129
+ closest predecessors. Works with a specific technical delta get full `RW` blocks; remaining
130
+ citations from the paper's References list should still be captured (more briefly) so the
131
+ intellectual neighborhood is preserved.
132
+
133
+ **Stage 3 — Physical Stubbing**
134
+ Generate `/src/`:
135
+ - **configs/**: exact hyperparameter values with rationale and sensitivity
136
+ - **execution/**: ≥1 Python code stub implementing the NOVEL contribution (typed signatures, no boilerplate)
137
+ - **environment.md**: Python version, framework, hardware, dependencies, seeds
138
+ - If repo available: use actual code to improve stub precision
139
+ - If rubric provided: produce `rubric/requirements.md` mapping every leaf node
140
+
141
+ **Stage 4 — Exploration Graph Extraction**
142
+ Reconstruct the research DAG for `/trace/exploration_tree.yaml`:
143
+ - Root nodes = central research questions
144
+ - Experiments and decisions nest as children
145
+ - Dead ends from ablations/rejected alternatives = typed leaf nodes
146
+ - ≥8 nodes, must include dead_end and decision types
147
+ - Use `also_depends_on` for DAG convergence points
148
+ - Every node must declare whether it is `explicit` from source material or `inferred` from reconstruction
149
+ - Explicit nodes should carry source references (table/figure/section labels)
150
+ - Inferred nodes are allowed only when they help reconstruct the paper's logic without pretending to be literal session logs
151
+
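The Stage 4 constraints above can be checked mechanically once the YAML has been parsed. The sketch below works on an already-parsed tree and makes several assumptions: the `type` values `dead_end` and `decision` and the `support_level` field come from this skill text, but `id`, `source_refs`, and `children` are hypothetical field names, since the real layout is defined in `references/exploration-tree-spec.md`, which is not shown here.

```javascript
// Assumed node shape: { id, type, support_level, source_refs, children }.
// Flatten nested children into a single list of nodes.
function flatten(nodes) {
  return nodes.flatMap((n) => [n, ...flatten(n.children ?? [])]);
}

// Returns a list of human-readable problems; an empty list means the
// tree satisfies the rules sketched here.
function checkTree(roots) {
  const nodes = flatten(roots);
  const problems = [];
  if (nodes.length < 8) problems.push(`only ${nodes.length} nodes (need >= 8)`);
  for (const required of ['dead_end', 'decision']) {
    if (!nodes.some((n) => n.type === required)) problems.push(`no ${required} node`);
  }
  for (const n of nodes) {
    if (!['explicit', 'inferred'].includes(n.support_level)) {
      problems.push(`${n.id}: support_level must be explicit or inferred`);
    } else if (n.support_level === 'explicit' && !(n.source_refs?.length > 0)) {
      problems.push(`${n.id}: explicit node lacks source references`);
    }
  }
  return problems;
}

// Tiny example tree that deliberately violates several rules.
const example = [{
  id: 'N1', type: 'question', support_level: 'explicit', source_refs: ['Section 1'],
  children: [{ id: 'N2', type: 'decision', support_level: 'guess' }],
}];
console.log(checkTree(example));
```

Returning a list of problems instead of throwing on the first one lets a fix pass address every violation in a single edit round.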
152
+ ### Step 3: Generate Files
153
+
154
+ Write ALL mandatory files. See `${CLAUDE_SKILL_DIR}/references/ara-schema.md` for the complete
155
+ directory structure and field-level requirements for every file.
156
+
157
+ **Mandatory files** (all must exist and be non-trivial):
158
+ - `PAPER.md` — YAML frontmatter (title, authors, year, venue, doi, ara_version, domain, keywords, claims_summary, abstract) + Layer Index
159
+ - `logic/problem.md` — Observations (O1, O2...), Gaps (G1, G2...), Key Insight, Assumptions
160
+ - `logic/claims.md` — Claims (C01, C02...) each with Statement, Status, Falsification criteria, Proof, Evidence basis, Interpretation, Dependencies, Tags
161
+ - `logic/concepts.md` — ≥5 concepts each with Notation, Definition, Boundary conditions, Related concepts
162
+ - `logic/experiments.md` — ≥3 experiments (E01, E02...) each with Verifies, Setup, Procedure, Metrics, Expected outcome (directional only!), Baselines, Dependencies
163
+ - `logic/solution/architecture.md` — Component graph with inputs/outputs
164
+ - `logic/solution/algorithm.md` — Math formulation + pseudocode + complexity
165
+ - `logic/solution/constraints.md` — Boundary conditions and limitations
166
+ - `logic/solution/heuristics.md` — Heuristics (H01, H02...) each with Rationale, Sensitivity, Bounds, Code ref, Source
167
+ - `logic/related_work.md` — Related work (RW01, RW02...) each with DOI, Type, Delta, Claims affected
168
+ - `src/configs/training.md` — Hyperparameters with Value, Rationale, Search range, Sensitivity, Source
169
+ - `src/configs/model.md` — Model/architecture configs
170
+ - `src/execution/{module}.py` — ≥1 code stub with typed signatures
171
+ - `src/environment.md` — Python version, framework, hardware, dependencies, seeds
172
+ - `trace/exploration_tree.yaml` — Research DAG (≥8 nodes, nested YAML)
173
+ - `evidence/README.md` — Index table mapping every evidence file to claims
174
+ - `evidence/tables/*.md` — ALL result tables (exact cell values, never rounded)
175
+ - `evidence/figures/*.md` — ALL quantitative figures (extracted data points)
176
+
177
+ Evidence-generation rules:
178
+ - Preserve **raw source tables** separately from any **derived subset** views
179
+ - A file named after a source object (for example `table3_...`) must match that source object's caption and contents
180
+ - If only a subset is included, the filename must say `derived_`, `subset_`, or equivalent, and the file must state what it was derived from
181
+ - Do not merge rows from different source tables into one evidence file unless the file is explicitly labeled as a derived comparison
182
+
183
+ ### Step 4: Coverage Check Loop (max 3 rounds)
184
+
185
+ Before running Seal validation, verify that the ARA faithfully covers the source material.
186
+ Repeat up to **3 rounds**; stop early if a round produces no patches.
187
+
188
+ **Each round:** re-read the source, identify anything not yet captured or only shallowly
189
+ captured in the ARA, patch those gaps, then note how many fixes were made. If zero, exit
190
+ early. Pay particular attention to appendix content and to citations from the paper's
191
+ References list, which are easy to miss on the first pass.
192
+
193
+ The coverage loop does not replace validation — it ensures the ARA is semantically complete
194
+ before structural checks run.
195
+
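The control flow of the coverage loop (at most three rounds, stop as soon as a round makes no patches) can be sketched as follows. The callback `findAndPatchGaps` is a hypothetical stand-in for the agent's re-read, diff, and patch work; it is assumed to return the number of fixes made in that round.

```javascript
// Runs up to maxRounds coverage rounds and returns the patch count per round.
// Exits early when a round reports zero patches.
function coverageLoop(findAndPatchGaps, maxRounds = 3) {
  const patchesPerRound = [];
  for (let round = 1; round <= maxRounds; round++) {
    const patched = findAndPatchGaps(round);
    patchesPerRound.push(patched);
    if (patched === 0) break; // nothing left to fix, so stop early
  }
  return patchesPerRound;
}

// Example: the first round fixes 4 gaps, the second finds none.
const counts = [4, 0, 7];
console.log(coverageLoop((round) => counts[round - 1]));
// prints [ 4, 0 ]
```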
196
+ ### Step 5: Validate
197
+
198
+ Run ARA Seal Level 1 validation. Perform these checks:
199
+ - All mandatory dirs exist: `logic/`, `logic/solution/`, `src/`, `src/configs/`, `trace/`, `evidence/`
200
+ - All mandatory files exist and are non-empty
201
+ - PAPER.md has YAML frontmatter with title, authors, year
202
+ - PAPER.md has Layer Index section
203
+ - claims.md has C01+ blocks with Statement, Status, Falsification criteria, Proof fields
204
+ - experiments.md has E01+ blocks with Verifies, Setup, Procedure, Expected outcome fields
205
+ - heuristics.md has H01+ blocks with Rationale, Sensitivity, Bounds fields
206
+ - concepts.md has ≥5 concept sections
207
+ - experiments.md has ≥3 experiment plans
208
+ - exploration_tree.yaml parses as valid YAML with ≥8 nodes, has dead_end and decision types
209
+ - Claim Proof references (E01, E02...) resolve to experiments.md
210
+ - Experiment Verifies references (C01, C02...) resolve to claims.md
211
+ - Heuristic Code ref paths resolve to actual files in src/execution/
212
+ - Evidence files contain Markdown tables with **Source** fields
213
+ - Evidence file names, source labels, and captions agree on the original table/figure identifier
214
+ - Any file named like a raw source table is a faithful transcription rather than a filtered subset
215
+ - Claims only cite experiments whose evidence actually contains the compared rows or measurements
216
+ - Claim wording does not outrun the evidence type (for example, validation tables alone should not be used to claim training-dynamics improvements)
217
+ - Trace nodes declare `support_level: explicit|inferred`
218
+ - Trace nodes with `support_level: explicit` include source references
219
+
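One of the checks above, that every experiment ID in a claim's Proof field resolves to `experiments.md`, can be sketched as a plain text scan. The sketch assumes that Proof fields sit on a line containing the word "Proof" and that IDs look like `E01`; the authoritative field format is in `references/ara-schema.md`, which is not shown here, and the function names are hypothetical.

```javascript
// Collect IDs such as E01 or C12 (prefix followed by two or more digits).
function collectIds(text, prefix) {
  const re = new RegExp(`\\b${prefix}\\d{2,}\\b`, 'g');
  return new Set(text.match(re) ?? []);
}

// Returns experiment IDs cited on Proof lines in claims.md that never
// appear anywhere in experiments.md.
function danglingProofRefs(claimsMd, experimentsMd) {
  const defined = collectIds(experimentsMd, 'E');
  const missing = new Set();
  for (const line of claimsMd.split('\n')) {
    if (!/proof/i.test(line)) continue; // only inspect Proof lines
    for (const id of collectIds(line, 'E')) {
      if (!defined.has(id)) missing.add(id);
    }
  }
  return [...missing].sort();
}

const claims = '## C01\n- **Proof**: E01, E03\n## C02\n- **Proof**: E02';
const experiments = '## E01\nSetup...\n## E02\nSetup...';
console.log(danglingProofRefs(claims, experiments));
// prints [ 'E03' ]
```

The reverse check (experiment Verifies fields resolving to claims) would follow the same pattern with the `C` prefix.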
220
+ ### Step 6: Fix & Iterate
221
+
222
+ For each validation failure:
223
+ 1. Read the failing file
224
+ 2. Apply targeted edits (prefer Edit over full rewrite to preserve correct content)
225
+ 3. Re-validate after all fixes
226
+
227
+ Typically converges in 2-3 rounds.
228
+
229
+ ### Step 7: Report
230
+
231
+ Print a summary:
232
+ - Artifact location
233
+ - File count and total size
234
+ - Validation result (pass/fail with details)
235
+ - Key statistics: number of claims, experiments, heuristics, concepts, tree nodes, evidence files
236
+
237
+ ## Critical Rules
238
+
239
+ 1. **Exact numbers**: All numerical values copied EXACTLY from source — never round or approximate
240
+ 2. **No hallucination**: Never invent claims, results, or heuristics not in the source material
241
+ 3. **Experiments have NO exact numbers**: `experiments.md` contains only directional/relative expected outcomes. Exact numbers go in `evidence/`
242
+ 4. **Every claim has proof**: Proof field references experiment IDs (E01, E02), not file paths
243
+ 5. **Cross-layer binding**: Claims ↔ Experiments ↔ Evidence ↔ Code refs must all resolve
244
+ 6. **Dead ends matter**: Include failed approaches, rejected alternatives, ablation findings
245
+ 7. **"Not specified"**: If information is genuinely unavailable, write "Not specified in paper" — never guess
246
+ 8. **No fake source labels**: Never call a derived subset `Table N` or `Figure N` unless it faithfully reproduces the original source object
247
+ 9. **No synthetic trace history**: Do not invent decisions, dead ends, or experiments that are not explicit in the provided inputs; if a trajectory is inferred, mark it as inferred or omit it
248
+ 10. **Evidence-limited wording**: Do not use stronger language than the evidence supports; separate direct observations from interpretation
249
+
250
+ ## Reference Files
251
+
252
+ For detailed schema specifications, load these on demand:
253
+ - `${CLAUDE_SKILL_DIR}/references/ara-schema.md` — Complete ARA directory schema with field-level format for every file
254
+ - `${CLAUDE_SKILL_DIR}/references/exploration-tree-spec.md` — Detailed exploration tree YAML specification with examples
255
+ - `${CLAUDE_SKILL_DIR}/references/validation-checklist.md` — All Seal Level 1 checks (what the validator looks for)