atris 3.0.1 → 3.1.0
- package/README.md +22 -0
- package/atris/skills/endgame/SKILL.md +19 -1
- package/atris/skills/improve/SKILL.md +65 -62
- package/atris/skills/launch/SKILL.md +62 -0
- package/atris/skills/tidy/SKILL.md +84 -0
- package/bin/atris.js +2 -1
- package/commands/autopilot.js +312 -31
- package/commands/business.js +149 -32
- package/commands/sync.js +9 -5
- package/lib/scorecard.js +287 -0
- package/lib/todo.js +12 -2
- package/package.json +1 -1
package/README.md
CHANGED
@@ -73,6 +73,20 @@ Core loop: `plan` -> `do` -> `review`
 
 Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other coding agents.
 
+## Business Workspaces
+
+If you want a real business workspace, use the business command instead of raw `atris init`.
+
+```bash
+atris business init "BLOND:ISH" --owner-email joel@blondish.world
+cd ~/arena/atris-business/blondish
+atris align --fix
+```
+
+That creates the cloud business, writes `.atris/business.json`, and scaffolds the canonical local `atris/` workspace under `~/arena/atris-business/<slug>/`.
+
+If you already have a folder full of source material, run it from there with `atris business init "BLOND:ISH" --here`.
+
 ## Core Commands
 
 | Command | Purpose |
@@ -102,6 +116,14 @@ Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other codin
 - `atris experiments` runs Karpathy-style keep/revert loops in `atris/experiments/`
 - `atris pull` and `atris push` sync cloud workspaces and journals
 
+## Verifiable Feedback Loop
+
+Under the hood, Atris can keep score on real repo work.
+
+- Endgame tasks can carry a `Verify:` command, so work can end on a deterministic check instead of pure prose.
+- `atris autopilot` can run that check after review, record a reward in the journal, and append a scorecard when a horizon closes.
+- Future horizon picks can weight against recent scorecards, so the loop learns from repo-local history without claiming model retraining.
+
 ## Benchmark Harness
 
 Atris ships one public head-to-head benchmark harness for comparing a pinned
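The README bullets above say a `Verify:` command lets work end on a deterministic check, and the endgame skill in this release requires that check to finish in under 30 seconds, exit 0 on pass, need no user input, and run from the project root. A minimal sketch of a runner that enforces those four constraints, assuming Node's built-in `child_process` module. The name `runVerify` and its return shape are hypothetical and are not taken from `commands/autopilot.js`.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: run one Verify command and report whether it passed.
function runVerify(command, projectRoot = process.cwd()) {
  const result = spawnSync(command, {
    cwd: projectRoot,                   // runnable from project root
    shell: true,                        // accept any shell command
    timeout: 30_000,                    // must finish in under 30 seconds
    stdio: ['ignore', 'pipe', 'pipe'],  // stdin closed: no user input
    encoding: 'utf8',
  });
  // On timeout the child is killed, status is null, and error.code is ETIMEDOUT.
  const timedOut = result.error?.code === 'ETIMEDOUT';
  return { passed: result.status === 0, timedOut, exitCode: result.status };
}
```

For example, `runVerify('test -f package.json')` would report `passed: true` only when the file exists in the current directory. How a pass or fail maps to a reward value is not shown in this part of the diff, so the sketch stops at the boolean.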
package/atris/skills/endgame/SKILL.md
CHANGED
@@ -66,15 +66,30 @@ After running the three moves, write the result to `atris/TODO.md`:
 
 ```markdown
 - **T1:** <step 1 description> [endgame]
+**Verify:** <deterministic-check>
 - **T2:** <step 2 description> [endgame]
+**Verify:** <deterministic-check>
 - **T3:** <step 3 description> [endgame]
+**Verify:** <deterministic-check>
 ```
 
 The tag must be exactly `[endgame]` (parser only matches `\w+`, no colons or hyphens). The slug lives in the section header.
 
+3. **Each task must include a `Verify:` line** with a deterministic check:
+- **Test command:** `npm test` or `npm run test:feature`
+- **Grep pattern:** `grep -q "pattern" file.js`
+- **File presence:** `test -f path/to/file.md`
+- **Exit code:** `node -e "process.exit(...)"` (or any shell command)
+
+The verify command must:
+- Complete in <30 seconds
+- Exit 0 on pass, non-zero on fail
+- Not require user input
+- Be runnable from project root
+
 Use `T1`, `T2`, `T3` … as IDs (or `W1`/`E1`/etc per endgame domain). Single uppercase letter + digits, optional trailing lowercase letter (the parser was extended in commit `4db14d9` to accept `W3b`-style validator sub-task IDs).
 
-
+4. **Append the full endgame to today's journal `## Notes`** so the reasoning is preserved:
 
 ```markdown
 ### Endgame picked — HH:MM PDT
@@ -95,6 +110,7 @@ REVERSE PATH
 
 NEXT MOVE
 T1: <description>
+Verify: <check-command>
 Why this first: <one line>
 ```
 
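The first endgame hunk above describes three parsing rules: task IDs are a single uppercase letter plus digits with an optional trailing lowercase letter, the tag must match `\w+`, and each task carries a `**Verify:**` line. The following is a sketch of a parser that applies those rules to `atris/TODO.md` text. The regular expressions are reconstructed from the prose, and the function names are hypothetical; the real parser in the package may differ.

```javascript
// Patterns reconstructed from the skill text; assumptions, not the package's code.
const TASK_LINE = /^- \*\*([A-Z]\d+[a-z]?):\*\*\s*(.*)$/; // T1, W3, W3b
const TAG = /\[(\w+)\]/;                                   // \w+ rules out colons and hyphens
const VERIFY_LINE = /^\s*\*\*Verify:\*\*\s*(.+)$/;

// Parse TODO.md text into tasks, attaching each Verify line to the task before it.
function parseEndgameTasks(markdown) {
  const tasks = [];
  for (const line of markdown.split('\n')) {
    const task = line.match(TASK_LINE);
    if (task) {
      const tag = task[2].match(TAG);
      tasks.push({ id: task[1], tag: tag ? tag[1] : null, verify: null });
      continue;
    }
    const verify = line.match(VERIFY_LINE);
    if (verify && tasks.length > 0) tasks[tasks.length - 1].verify = verify[1].trim();
  }
  return tasks;
}

// IDs of tasks that break the "every task must have a Verify line" rule.
function missingVerify(tasks) {
  return tasks.filter((t) => t.verify === null).map((t) => t.id);
}
```

A caller could refuse to start a horizon while `missingVerify(...)` returns a non-empty list, which keeps the scoring step from meeting a task it cannot check.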
@@ -132,6 +148,7 @@ REVERSE PATH
 
 NEXT MOVE
 [one concrete action, doable in one session]
+Verify: [deterministic check — test, grep, file, exit code]
 Why this first: [one line]
 ```
 
@@ -143,6 +160,7 @@ NEXT MOVE
 - **REVERSE PATH includes eliminate.** Half of strategy is removal. Forward-greedy planning never asks this. Endgame must.
 - **The chain must terminate this week.** If it can't, the horizon is too far — pick a closer one and say so.
 - **5–7 links max in the chain.** More than that = horizon is too vague.
+- **Every task must have a Verify line.** Deterministic check (test, grep, file, exit code). Allows the validator to score the endgame autonomously.
 - **Cite wiki pages** with `[[atris/wiki/...]]` refs.
 - **Ask 1–3 questions max** if the horizon is unclear. Never a wall of text.
 - **One chain, not three.** Pick the shortest defensible one.
package/atris/skills/improve/SKILL.md
CHANGED
@@ -1,84 +1,87 @@
 ---
 name: improve
-description: "
+description: "Run one RL improvement tick on the workspace via POST /api/improve. Ships one verifiable change, scores it, writes the scorecard. The thing you pay for. Triggers on: improve, make this better, ship one thing, run a tick, get smarter."
 version: 1.0.0
 tags:
--
--
--
--
+- rl
+- improve
+- reward
+- tick
+- autopilot
 ---
 
 # /improve
 
-
+Runs one improvement tick on the workspace. Calls `POST /api/improve` on the backend, which plans one task, builds it, verifies it, and scores it. Returns what shipped + the reward. Writes the scorecard locally.
 
-
+This is the product. The thing the user pays for. One call, one verifiable result.
 
-
-- "Clean this up"
-- After a big refactor when docs have drifted
-- Periodically, to keep the knowledge base honest
-- When you suspect MAP.md or wiki pages are out of date
+## How it works
 
-
-
-
-
-
-
-
-
-**Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
-
-**Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
-
-**Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
-
-**Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
-
-**Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
-
-**Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
-
-4. Present findings as a numbered list, sorted by impact. For each:
-- What's wrong
-- Why it matters
-- What you'd do to fix it
-
-5. Ask: "want me to fix these? all / pick numbers / skip"
+```
+/improve
+→ POST /api/improve { workspace: ".", mode: "full" }
+→ backend picks a task, plans, builds, reviews, verifies
+→ returns { task, reward, files_changed, verify_pass, summary }
+→ CLI writes scorecard to atris/scorecards.md
+→ CLI reports result to user
+```
 
-
-- Make the change
-- Update last_compiled if touching wiki pages
-- Commit with a clear message
+The inference is Claude Code (or whatever model the backend uses). The environment is the folder. The endpoint is the bridge.
 
-
+## On invoke
 
-
+1. Read `~/.atris/credentials.json` for auth token
+2. Read `.atris/business.json` for the API base URL (or default to `http://localhost:8000`)
+3. Call `POST /api/improve` with:
+```json
+{
+"workspace": "<current working directory>",
+"mode": "full",
+"model": "sonnet"
+}
+```
+4. Wait for response (may take 1-5 minutes)
+5. On success:
+- Show what shipped (task name, files changed, verify result)
+- Show the reward score
+- Write scorecard to `atris/scorecards.md`
+- Append tick to today's journal
+6. On failure:
+- Show the error
+- Write a lesson to `atris/lessons.md`
+- Do not write a scorecard
+
+## Modes
+
+- `full` — plan, build, review, verify (default)
+- `plan` — just pick the task and show what it would do
+- `dry_run` — run everything but don't commit
+
+## Fallback
+
+If the backend is unreachable (no auth, no network, localhost not running), fall back to local mode: run `atris autopilot --auto --iterations=1` instead. Same loop, just local inference via `claude -p` subprocess. Report that it ran locally.
+
+## Output
 
 ```
-
-
-1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
-These make navigation wrong. I can auto-heal most of them.
-
-2. atris/TODO.md has a task claimed 26 days ago by Executor.
-It's blocking the in-progress slot. Should delete or re-scope.
-
-3. MAP.md hasn't been updated in 25 days.
-Code has changed — the map is drifting from reality.
+improved.
 
-
-
+task: fixed the stale wiki ref in auth-flow.md
+verify: pass (npm test, 143/143)
+reward: +4
+files: atris/wiki/briefs/auth-flow.md
+time: 47s
 
-
+scorecard updated.
 ```
 
 ## Rules
 
--
-- Always
--
--
--
+- One tick only. Never batch.
+- Always verify. No reward without a check.
+- Show what shipped, not what was attempted.
+- Write the scorecard. This is the receipt.
+- If verify fails, halt honestly and write a lesson.
+- Fallback to local if backend is unreachable. Never error silently.
+- The user pays because something real happened. Never fake it.
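The `/improve` skill above describes one tick as: build the request body, call the backend, fall back to the local loop if the backend is unreachable, then write either a scorecard or a lesson. The control flow might be sketched as follows. `callBackend` and `runLocal` are hypothetical caller-supplied async functions standing in for the HTTP call and for `atris autopilot --auto --iterations=1`. Treating a failed verify as the "failure" branch is an assumption drawn from the Rules list, not something the diff states outright.

```javascript
// Request body from step 3 of "On invoke"; field names come from the skill text.
function buildImproveRequest(workspace, mode = 'full') {
  return { workspace, mode, model: 'sonnet' };
}

// Run one tick. callBackend(request) and runLocal() are hypothetical async
// functions; both are assumed to resolve to an object with a verify_pass flag.
async function runTick(callBackend, runLocal, workspace = process.cwd()) {
  let ranLocally = false;
  let response;
  try {
    response = await callBackend(buildImproveRequest(workspace));
  } catch (err) {
    // Backend unreachable: fall back, and say so rather than failing silently.
    console.error(`backend unreachable (${err.message}); running locally`);
    ranLocally = true;
    response = await runLocal();
  }
  // Scorecard only when verify passed; otherwise a lesson and no scorecard.
  const writeTo = response.verify_pass ? 'atris/scorecards.md' : 'atris/lessons.md';
  return { ranLocally, writeTo, response };
}
```

Passing the two transports in as arguments keeps the decision logic testable without a running backend or a `claude` subprocess.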
package/atris/skills/launch/SKILL.md
ADDED
@@ -0,0 +1,62 @@
+---
+name: launch
+description: "Write a release post for Twitter and LinkedIn. 3 emoji bullets, plain English, no jargon. Triggers on: /launch, write a launch post, release announcement, ship post."
+version: 1.0.0
+tags: [launch, release, social, twitter, linkedin]
+---
+
+# /launch
+
+Writes a copy-paste-ready release post for Twitter and LinkedIn.
+
+## Format
+
+```
+<project> <version> update
+
+<emoji> What it does, one sentence. Plain English.
+<emoji> What it does, one sentence. No buzzwords.
+<emoji> What it does, one sentence. Concrete.
+
+<install command>
+<release URL>
+```
+
+## Example (atris v3.0.1)
+
+```
+atris v3.0.1 update
+
+🎯 Write what "done" looks like. The loop plans, builds, and reviews until it gets there.
+🧠 Reads past lessons and notes every run. Better decisions over time.
+📂 One command sets up a workspace with team, wiki, and context wired in.
+
+npm install -g atris
+https://github.com/atrislabs/atris/releases/tag/v3.0.1
+```
+
+## How to invoke
+
+User says "write a launch post", "post the release", "/launch", or "announce this version".
+
+The agent then:
+
+1. Read the latest git tag and release notes (or ask what shipped)
+2. Distill into exactly 3 bullets, one sentence each
+3. Pick an emoji per bullet that fits the content
+4. Add install command + release URL at the bottom
+5. Output the final text ready to copy-paste
+
+## Rules
+
+- 3 bullets max. Never 4, never 5.
+- Each bullet is one sentence about what it does, not what it is
+- No em dashes
+- No jargon ("canonical", "substrate", "two-engine architecture", "self-improving")
+- No mentioning specific tools by name (Claude Code, Cursor, etc.)
+- No model/AI buzzwords ("LLM", "same model", "agentic")
+- No marketing speak. If it sounds like a pitch deck, rewrite it.
+- Plain English a high schooler would understand
+- Install command + release URL always at the bottom
+- Same post works for both Twitter and LinkedIn
+- Optional witty one-liner closer if it fits naturally. Never forced.
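Several of the `/launch` rules above are mechanical enough to check in code: the three-bullet cap, the em dash ban, and the quoted jargon and buzzword lists. A small sketch of such a check follows. The function name `lintLaunchBullets` is hypothetical, and the skill itself does not say that any linter exists; the banned terms are copied from the Rules list.

```javascript
// Terms copied from the "No jargon" and "No model/AI buzzwords" rules.
const BANNED_TERMS = [
  'canonical', 'substrate', 'two-engine architecture', 'self-improving',
  'LLM', 'same model', 'agentic',
];

// Hypothetical linter: returns a list of rule violations for a draft's bullets.
function lintLaunchBullets(bullets) {
  const problems = [];
  if (bullets.length > 3) problems.push('more than 3 bullets');
  for (const bullet of bullets) {
    // \u2014 is the em dash character.
    if (bullet.includes('\u2014')) problems.push(`em dash: ${bullet}`);
    for (const term of BANNED_TERMS) {
      // Terms contain only letters, spaces, and hyphens, so they are regex-safe.
      // Word boundaries avoid false hits such as "llm" inside "hallmark".
      if (new RegExp(`\\b${term}\\b`, 'i').test(bullet)) {
        problems.push(`banned term "${term}": ${bullet}`);
      }
    }
  }
  return problems;
}
```

The softer rules (plain English, no marketing speak, one sentence about what it does) still need a human or model judgment; this only catches the literal violations.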
package/atris/skills/tidy/SKILL.md
ADDED
@@ -0,0 +1,84 @@
+---
+name: tidy
+description: "Workspace maintenance and knowledge hygiene. Finds stale docs, broken refs, abandoned tasks, and fixes them. Use when things feel messy or you want the system to clean itself up. Triggers on: tidy, clean up, maintenance, lint, health check, freshen up."
+version: 1.1.0
+tags:
+- maintenance
+- knowledge
+- hygiene
+- docs
+---
+
+# /tidy
+
+Finds what's rotting in your workspace and fixes it. Stale pages, broken references, abandoned tasks, outdated docs.
+
+## When to use
+
+- "Things feel messy"
+- "Clean this up"
+- After a big refactor when docs have drifted
+- Periodically, to keep the knowledge base honest
+- When you suspect MAP.md or wiki pages are out of date
+
+## On invoke
+
+1. Run `atris clean --dry-run` silently. Collect results.
+2. Read atris/MAP.md, atris/TODO.md, and today's journal for context.
+3. Scan for these problems (in priority order):
+
+### What to look for
+
+**Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
+
+**Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
+
+**Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
+
+**Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
+
+**Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
+
+**Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
+
+4. Present findings as a numbered list, sorted by impact. For each:
+- What's wrong
+- Why it matters
+- What you'd do to fix it
+
+5. Ask: "want me to fix these? all / pick numbers / skip"
+
+6. Fix what they approve. For each fix:
+- Make the change
+- Update last_compiled if touching wiki pages
+- Commit with a clear message
+
+7. After all fixes, run `atris clean` one more time to verify.
+
+## Example
+
+```
+Found 4 things to improve:
+
+1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
+These make navigation wrong. I can auto-heal most of them.
+
+2. atris/TODO.md has a task claimed 26 days ago by Executor.
+It's blocking the in-progress slot. Should delete or re-scope.
+
+3. MAP.md hasn't been updated in 25 days.
+Code has changed — the map is drifting from reality.
+
+4. 2 empty sections in TODO.md.
+Just noise. Can clean them out.
+
+want me to fix these? all / pick numbers / skip
+```
+
+## Rules
+
+- Never delete user content without asking.
+- Always show what you found before fixing.
+- Commit fixes in small, clear commits (one per category).
+- Update last_compiled frontmatter when recompiling wiki pages.
+- Run atris clean at the end to verify everything is actually fixed.
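Three of the `/tidy` checks above reduce to timestamp comparisons: a wiki page is stale if a source file changed after `last_compiled`, a task is abandoned if it was claimed more than 3 days ago, and MAP.md is stale if it is more than 7 days old while code has changed. A sketch of those predicates follows. The thresholds come from the skill text; the input shapes (`claimedAt`, `sourceMtimes`, the `'in-progress'` status string) are hypothetical, since the diff does not show how `atris clean` represents them.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// All timestamps are milliseconds since the epoch. Input shapes are assumptions.
function isAbandoned(task, now = Date.now()) {
  // In-progress and claimed more than 3 days ago.
  return task.status === 'in-progress' && now - task.claimedAt > 3 * DAY_MS;
}

function isWikiPageStale(page) {
  // Stale when any source file changed after the page's last_compiled time.
  return page.sourceMtimes.some((mtime) => mtime > page.lastCompiled);
}

function isMapStale(mapMtime, newestCodeMtime, now = Date.now()) {
  // MAP.md untouched for more than 7 days while code changed after it.
  return now - mapMtime > 7 * DAY_MS && newestCodeMtime > mapMtime;
}
```

Taking `now` as a parameter keeps the checks deterministic, which fits the release's emphasis on verifiable results.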
package/bin/atris.js
CHANGED
@@ -271,13 +271,14 @@ function showHelp() {
 console.log(' wake [business] - Resume workspace (agents restart)');
 console.log('');
 console.log('Business:');
+console.log(' business init <name> - Create canonical business workspace (cloud + local)');
 console.log(' business add <slug> - Connect a business');
 console.log(' business list - Show connected businesses');
 console.log(' business remove <slug> - Disconnect a business');
 console.log(' business team [slug] - Show members, roles, and admin access');
 console.log(' business health <slug> - Health report (members, workspace, issues)');
 console.log(' business audit - One-line health summary of all businesses');
-console.log(' business create <name> - Create new business
+console.log(' business create <name> - Create new business; add --workspace for canonical local scaffold');
 console.log(' business connect <svc> - Wire a skill/integration');
 console.log(' business notify <mode> - Set notification mode (digest/silent/push)');
 console.log(' business deploy <slug> - Push local business to cloud');