atris 3.0.1 → 3.2.0

package/README.md CHANGED
@@ -73,6 +73,20 @@ Core loop: `plan` -> `do` -> `review`
 
  Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other coding agents.
 
+ ## Business Workspaces
+
+ If you want a real business workspace, use the business command instead of raw `atris init`.
+
+ ```bash
+ atris business init "BLOND:ISH" --owner-email joel@blondish.world
+ cd ~/arena/atris-business/blondish
+ atris align --fix
+ ```
+
+ That creates the cloud business, writes `.atris/business.json`, initializes `.atris/state/` for events, episodes, and scorecards, and scaffolds the local `atris/` workspace under `~/arena/atris-business/<slug>/` with starter team lanes, a default recap artifact, and a first-loop starter queue in `atris/TODO.md`.
+
+ If you already have a folder full of source material, run it from there with `atris business init "BLOND:ISH" --here`.
+
  ## Core Commands
 
  | Command | Purpose |
@@ -97,11 +111,20 @@ Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other codin
 
  - `atris learn` stores structured project memory in `atris/learnings.jsonl`
  - `atris wiki` keeps repo memory in `atris/wiki/` by default, with `--cloud` when you want the remote workspace path
+ - `atris wiki --private` uses `.atris/presidio/` for local-only sensitive notes and operating memory
  - `atris loop` refreshes `atris/wiki/STATUS.md` and `atris/wiki/log.md`, flags stale/orphan pages, and suggests the next ingest
  - `atris activate` loads the current wiki status so the next session starts with project memory, not just tasks
  - `atris experiments` runs Karpathy-style keep/revert loops in `atris/experiments/`
  - `atris pull` and `atris push` sync cloud workspaces and journals
 
+ ## Verifiable Feedback Loop
+
+ Under the hood, Atris can keep score on real repo work.
+
+ - Endgame tasks can carry a `Verify:` command, so work can end on a deterministic check instead of pure prose.
+ - `atris autopilot` can run that check after review, record a reward in the journal, and append a local scorecard when a horizon closes.
+ - Future horizon picks can weight against recent scorecards, so the loop learns from repo-local history without claiming model retraining.
+
  ## Benchmark Harness
 
  Atris ships one public head-to-head benchmark harness for comparing a pinned
@@ -156,6 +179,14 @@ atris skill link [--all]
 
  For Codex, copy any skill folder into `~/.codex/skills/`.
 
+ ## v3.2.0
+
+ - **Staleness gate** — tasks tagged `[unverified]` are skipped at the moment of use, not pruned eagerly. Three-state model: actionable / unverified / deleted.
+ - **Lesson gate** — `isLessonResolved` checks whether a lesson already shipped before proposing new horizons from it. Prevents the loop from re-solving solved problems.
+ - **`atris release`** — new command: tags the version, bumps package.json, creates a GitHub release, and drafts a `/launch` post in one shot.
+ - **Shell injection fix** — `checkStaleness` switched from `execSync` string interpolation to `execFileSync` with args arrays. Markdown-derived content (task titles, inbox items) no longer reaches a shell.
+ - **Codex hardening** — `atris activate` and `atris` entry point detect Codex environments and write `AGENTS.md` so Codex sessions start with workspace context.
+
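The shell injection fix in the v3.2.0 notes above rests on a general property of Node's `child_process` module: `execFileSync` hands each array element to the child as a single argument and does not start a shell. The following is a minimal sketch of that property, not the actual `checkStaleness` code. The `echoTitle` helper and the choice of Node itself as the child process are assumptions made for illustration.

```javascript
const { execFileSync } = require('child_process');

// Hypothetical helper: pass a Markdown-derived title to a child process.
// execFileSync does not spawn a shell, so `;`, `$()`, quotes, and backticks
// inside `title` arrive in the child as literal characters.
function echoTitle(title) {
  return execFileSync(
    process.execPath,
    ['-e', 'process.stdout.write(process.argv[1])', title],
    { encoding: 'utf8' }
  );
}

// A title that would run extra commands if interpolated into an execSync string.
const hostile = 'Fix login; echo "$HOME" $(whoami) `id`';
console.log(echoTitle(hostile) === hostile);
```

With string interpolation into `execSync`, the same title would be parsed by the shell, which is the class of problem the release note describes.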
  ## Update
 
  ```bash
@@ -66,15 +66,30 @@ After running the three moves, write the result to `atris/TODO.md`:
 
  ```markdown
  - **T1:** <step 1 description> [endgame]
+ **Verify:** <deterministic-check>
  - **T2:** <step 2 description> [endgame]
+ **Verify:** <deterministic-check>
  - **T3:** <step 3 description> [endgame]
+ **Verify:** <deterministic-check>
  ```
 
  The tag must be exactly `[endgame]` (parser only matches `\w+`, no colons or hyphens). The slug lives in the section header.
 
+ 3. **Each task must include a `Verify:` line** with a deterministic check:
+ - **Test command:** `npm test` or `npm run test:feature`
+ - **Grep pattern:** `grep -q "pattern" file.js`
+ - **File presence:** `test -f path/to/file.md`
+ - **Exit code:** `node -e "process.exit(...)"` (or any shell command)
+
+ The verify command must:
+ - Complete in <30 seconds
+ - Exit 0 on pass, non-zero on fail
+ - Not require user input
+ - Be runnable from project root
+
  Use `T1`, `T2`, `T3` … as IDs (or `W1`/`E1`/etc per endgame domain). Single uppercase letter + digits, optional trailing lowercase letter (the parser was extended in commit `4db14d9` to accept `W3b`-style validator sub-task IDs).
 
- 3. **Append the full endgame to today's journal `## Notes`** so the reasoning is preserved:
+ 4. **Append the full endgame to today's journal `## Notes`** so the reasoning is preserved:
 
  ```markdown
  ### Endgame picked — HH:MM PDT
@@ -95,6 +110,7 @@ REVERSE PATH
 
  NEXT MOVE
  T1: <description>
+ Verify: <check-command>
  Why this first: <one line>
  ```
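The `atris/TODO.md` task format described earlier in this file (an ID of one uppercase letter plus digits with an optional trailing lowercase letter, a `\w+` tag, and a `**Verify:**` line per task) can be sketched as a small parser. This is an illustration only, not the parser from commit `4db14d9`: the regular expressions and the names `parseEndgameTasks` and `tasksMissingVerify` are assumptions derived from the prose.

```javascript
// Assumed line shapes, taken from the TODO.md template in this skill:
//   - **T1:** <description> [endgame]
//   **Verify:** <deterministic-check>
const TASK_RE = /^\s*-\s+\*\*([A-Z]\d+[a-z]?):\*\*\s+(.*?)\s*\[(\w+)\]\s*$/;
const VERIFY_RE = /^\s*\*\*Verify:\*\*\s+(.+?)\s*$/;

function parseEndgameTasks(markdown) {
  const tasks = [];
  for (const line of markdown.split('\n')) {
    const task = TASK_RE.exec(line);
    if (task) {
      tasks.push({ id: task[1], description: task[2], tag: task[3], verify: null });
      continue;
    }
    const verify = VERIFY_RE.exec(line);
    // A Verify line attaches to the most recent task that has none yet.
    if (verify && tasks.length > 0 && tasks[tasks.length - 1].verify === null) {
      tasks[tasks.length - 1].verify = verify[1];
    }
  }
  return tasks;
}

// Lists the IDs that would break the "every task must have a Verify line" rule.
function tasksMissingVerify(tasks) {
  return tasks.filter((t) => t.verify === null).map((t) => t.id);
}
```

Because the tag pattern is `\w+`, a line tagged `[end-game]` does not match `TASK_RE` at all, which mirrors the "no colons or hyphens" constraint.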
 
@@ -132,6 +148,7 @@ REVERSE PATH
 
  NEXT MOVE
  [one concrete action, doable in one session]
+ Verify: [deterministic check — test, grep, file, exit code]
  Why this first: [one line]
  ```
 
@@ -143,6 +160,7 @@ NEXT MOVE
  - **REVERSE PATH includes eliminate.** Half of strategy is removal. Forward-greedy planning never asks this. Endgame must.
  - **The chain must terminate this week.** If it can't, the horizon is too far — pick a closer one and say so.
  - **5–7 links max in the chain.** More than that = horizon is too vague.
+ - **Every task must have a Verify line.** Deterministic check (test, grep, file, exit code). Allows the validator to score the endgame autonomously.
  - **Cite wiki pages** with `[[atris/wiki/...]]` refs.
  - **Ask 1–3 questions max** if the horizon is unclear. Never a wall of text.
  - **One chain, not three.** Pick the shortest defensible one.
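The constraints this skill places on a Verify command (finish within 30 seconds, exit 0 on pass, no user input, runnable from the project root) map directly onto options of Node's `spawnSync`. The sketch below shows one way a validator might enforce them. It is an assumption about how such a runner could look, not the code `atris autopilot` uses, and `runVerify` is a hypothetical name.

```javascript
const { spawnSync } = require('child_process');

// Runs one Verify command and reports pass/fail.
function runVerify(command, projectRoot, timeoutMs = 30000) {
  const result = spawnSync(command, {
    cwd: projectRoot,                  // "runnable from project root"
    shell: true,                       // Verify lines are shell commands by design
    timeout: timeoutMs,                // kill checks that exceed the time budget
    stdio: ['ignore', 'pipe', 'pipe'], // no stdin, so the check cannot prompt
    encoding: 'utf8',
  });
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return { pass: !timedOut && result.status === 0, status: result.status, timedOut };
}

console.log(runVerify('exit 0', process.cwd()).pass);
```

Unlike Markdown-derived titles, a Verify line is meant to be executed by a shell, so it should be treated as trusted, author-written input.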
@@ -1,84 +1,87 @@
  ---
  name: improve
- description: "Workspace maintenance and knowledge hygiene. Finds stale docs, broken refs, abandoned tasks, and fixes them. Use when things feel messy or you want the system to clean itself up. Triggers on: improve, clean up, maintenance, lint, health check, freshen up."
+ description: "Run one RL improvement tick on the workspace via POST /api/improve. Ships one verifiable change, scores it, writes the scorecard. The thing you pay for. Triggers on: improve, make this better, ship one thing, run a tick, get smarter."
  version: 1.0.0
  tags:
- - maintenance
- - knowledge
- - hygiene
- - docs
+ - rl
+ - improve
+ - reward
+ - tick
+ - autopilot
  ---
 
  # /improve
 
- Finds what's rotting in your workspace and fixes it. Stale pages, broken references, abandoned tasks, outdated docs.
+ Runs one improvement tick on the workspace. Calls `POST /api/improve` on the backend, which plans one task, builds it, verifies it, and scores it. Returns what shipped + the reward. Writes the scorecard locally.
 
- ## When to use
+ This is the product. The thing the user pays for. One call, one verifiable result.
 
- - "Things feel messy"
- - "Clean this up"
- - After a big refactor when docs have drifted
- - Periodically, to keep the knowledge base honest
- - When you suspect MAP.md or wiki pages are out of date
+ ## How it works
 
- ## On invoke
-
- 1. Run `atris clean --dry-run` silently. Collect results.
- 2. Read atris/MAP.md, atris/TODO.md, and today's journal for context.
- 3. Scan for these problems (in priority order):
-
- ### What to look for
-
- **Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
-
- **Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
-
- **Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
-
- **Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
-
- **Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
-
- **Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
-
- 4. Present findings as a numbered list, sorted by impact. For each:
- - What's wrong
- - Why it matters
- - What you'd do to fix it
-
- 5. Ask: "want me to fix these? all / pick numbers / skip"
+ ```
+ /improve
+ POST /api/improve { workspace: ".", mode: "full" }
+ backend picks a task, plans, builds, reviews, verifies
+ returns { task, reward, files_changed, verify_pass, summary }
+ → CLI writes scorecard to .atris/presidio/scorecards.md
+ CLI reports result to user
+ ```
 
- 6. Fix what they approve. For each fix:
- - Make the change
- - Update last_compiled if touching wiki pages
- - Commit with a clear message
+ The inference is Claude Code (or whatever model the backend uses). The environment is the folder. The endpoint is the bridge.
 
- 7. After all fixes, run `atris clean` one more time to verify.
+ ## On invoke
 
- ## Example
+ 1. Read `~/.atris/credentials.json` for auth token
+ 2. Read `.atris/business.json` for the API base URL (or default to `http://localhost:8000`)
+ 3. Call `POST /api/improve` with:
+ ```json
+ {
+ "workspace": "<current working directory>",
+ "mode": "full",
+ "model": "sonnet"
+ }
+ ```
+ 4. Wait for response (may take 1-5 minutes)
+ 5. On success:
+ - Show what shipped (task name, files changed, verify result)
+ - Show the reward score
+ - Write scorecard to `.atris/presidio/scorecards.md`
+ - Append tick to today's journal
+ 6. On failure:
+ - Show the error
+ - Write a lesson to `atris/lessons.md`
+ - Do not write a scorecard
+
+ ## Modes
+
+ - `full` — plan, build, review, verify (default)
+ - `plan` — just pick the task and show what it would do
+ - `dry_run` — run everything but don't commit
+
+ ## Fallback
+
+ If the backend is unreachable (no auth, no network, localhost not running), fall back to local mode: run `atris autopilot --auto --iterations=1` instead. Same loop, just local inference via `claude -p` subprocess. Report that it ran locally.
+
+ ## Output
 
  ```
- Found 4 things to improve:
-
- 1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
- These make navigation wrong. I can auto-heal most of them.
-
- 2. atris/TODO.md has a task claimed 26 days ago by Executor.
- It's blocking the in-progress slot. Should delete or re-scope.
-
- 3. MAP.md hasn't been updated in 25 days.
- Code has changed — the map is drifting from reality.
+ improved.
 
- 4. 2 empty sections in TODO.md.
- Just noise. Can clean them out.
+ task: fixed the stale wiki ref in auth-flow.md
+ verify: pass (npm test, 143/143)
+ reward: +4
+ files: atris/wiki/briefs/auth-flow.md
+ time: 47s
 
- want me to fix these? all / pick numbers / skip
+ scorecard updated.
  ```
 
  ## Rules
 
- - Never delete user content without asking.
- - Always show what you found before fixing.
- - Commit fixes in small, clear commits (one per category).
- - Update last_compiled frontmatter when recompiling wiki pages.
- - Run atris clean at the end to verify everything is actually fixed.
+ - One tick only. Never batch.
+ - Always verify. No reward without a check.
+ - Show what shipped, not what was attempted.
+ - Write the scorecard. This is the receipt.
+ - If verify fails, halt honestly and write a lesson.
+ - Fallback to local if backend is unreachable. Never error silently.
+ - The user pays because something real happened. Never fake it.
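The invoke steps in the new `/improve` skill (pick a base URL, send `workspace`, `mode`, and `model`, then report the response fields) can be sketched as two pure functions. Several details here are assumptions, since the skill text does not specify them: the `api_base_url` key in `.atris/business.json`, the `Bearer` authorization scheme, and `files_changed` being an array of paths. The function names are hypothetical as well.

```javascript
const VALID_MODES = ['full', 'plan', 'dry_run'];
const DEFAULT_BASE_URL = 'http://localhost:8000';

// Builds the request described under "On invoke".
// `business.api_base_url` is a hypothetical field name.
function buildImproveRequest({ cwd, mode = 'full', model = 'sonnet', business = {}, token }) {
  if (!VALID_MODES.includes(mode)) {
    throw new Error(`unknown mode: ${mode}`);
  }
  const baseUrl = business.api_base_url || DEFAULT_BASE_URL;
  return {
    url: new URL('/api/improve', baseUrl).toString(),
    method: 'POST',
    // Assumed auth scheme; the skill only says a token is read from credentials.json.
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${token}` },
    body: JSON.stringify({ workspace: cwd, mode, model }),
  };
}

// Formats a success response using the fields named under "How it works".
function formatResult(result) {
  return [
    'improved.',
    '',
    `task: ${result.task}`,
    `verify: ${result.verify_pass ? 'pass' : 'fail'}`,
    `reward: ${result.reward > 0 ? '+' : ''}${result.reward}`,
    `files: ${result.files_changed.join(', ')}`,
  ].join('\n');
}
```

Keeping request construction separate from the network call makes the fallback rule easier to apply: a caller can attempt the request and, on any connection or auth failure, switch to `atris autopilot --auto --iterations=1` and report that it ran locally.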
@@ -0,0 +1,62 @@
+ ---
+ name: launch
+ description: "Write a release post for Twitter and LinkedIn. 3 emoji bullets, plain English, no jargon. Triggers on: /launch, write a launch post, release announcement, ship post."
+ version: 1.0.0
+ tags: [launch, release, social, twitter, linkedin]
+ ---
+
+ # /launch
+
+ Writes a copy-paste-ready release post for Twitter and LinkedIn.
+
+ ## Format
+
+ ```
+ <project> <version> update
+
+ <emoji> What it does, one sentence. Plain English.
+ <emoji> What it does, one sentence. No buzzwords.
+ <emoji> What it does, one sentence. Concrete.
+
+ <install command>
+ <release URL>
+ ```
+
+ ## Example (atris v3.0.1)
+
+ ```
+ atris v3.0.1 update
+
+ 🎯 Write what "done" looks like. The loop plans, builds, and reviews until it gets there.
+ 🧠 Reads past lessons and notes every run. Better decisions over time.
+ 📂 One command sets up a workspace with team, wiki, and context wired in.
+
+ npm install -g atris
+ https://github.com/atrislabs/atris/releases/tag/v3.0.1
+ ```
+
+ ## How to invoke
+
+ User says "write a launch post", "post the release", "/launch", or "announce this version".
+
+ The agent then:
+
+ 1. Read the latest git tag and release notes (or ask what shipped)
+ 2. Distill into exactly 3 bullets, one sentence each
+ 3. Pick an emoji per bullet that fits the content
+ 4. Add install command + release URL at the bottom
+ 5. Output the final text ready to copy-paste
+
+ ## Rules
+
+ - 3 bullets max. Never 4, never 5.
+ - Each bullet is one sentence about what it does, not what it is
+ - No em dashes
+ - No jargon ("canonical", "substrate", "two-engine architecture", "self-improving")
+ - No mentioning specific tools by name (Claude Code, Cursor, etc.)
+ - No model/AI buzzwords ("LLM", "same model", "agentic")
+ - No marketing speak. If it sounds like a pitch deck, rewrite it.
+ - Plain English a high schooler would understand
+ - Install command + release URL always at the bottom
+ - Same post works for both Twitter and LinkedIn
+ - Optional witty one-liner closer if it fits naturally. Never forced.
@@ -0,0 +1,84 @@
+ ---
+ name: tidy
+ description: "Workspace maintenance and knowledge hygiene. Finds stale docs, broken refs, abandoned tasks, and fixes them. Use when things feel messy or you want the system to clean itself up. Triggers on: tidy, clean up, maintenance, lint, health check, freshen up."
+ version: 1.1.0
+ tags:
+ - maintenance
+ - knowledge
+ - hygiene
+ - docs
+ ---
+
+ # /tidy
+
+ Finds what's rotting in your workspace and fixes it. Stale pages, broken references, abandoned tasks, outdated docs.
+
+ ## When to use
+
+ - "Things feel messy"
+ - "Clean this up"
+ - After a big refactor when docs have drifted
+ - Periodically, to keep the knowledge base honest
+ - When you suspect MAP.md or wiki pages are out of date
+
+ ## On invoke
+
+ 1. Run `atris clean --dry-run` silently. Collect results.
+ 2. Read atris/MAP.md, atris/TODO.md, and today's journal for context.
+ 3. Scan for these problems (in priority order):
+
+ ### What to look for
+
+ **Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
+
+ **Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
+
+ **Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
+
+ **Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
+
+ **Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
+
+ **Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
+
+ 4. Present findings as a numbered list, sorted by impact. For each:
+ - What's wrong
+ - Why it matters
+ - What you'd do to fix it
+
+ 5. Ask: "want me to fix these? all / pick numbers / skip"
+
+ 6. Fix what they approve. For each fix:
+ - Make the change
+ - Update last_compiled if touching wiki pages
+ - Commit with a clear message
+
+ 7. After all fixes, run `atris clean` one more time to verify.
+
+ ## Example
+
+ ```
+ Found 4 things to improve:
+
+ 1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
+ These make navigation wrong. I can auto-heal most of them.
+
+ 2. atris/TODO.md has a task claimed 26 days ago by Executor.
+ It's blocking the in-progress slot. Should delete or re-scope.
+
+ 3. MAP.md hasn't been updated in 25 days.
+ Code has changed — the map is drifting from reality.
+
+ 4. 2 empty sections in TODO.md.
+ Just noise. Can clean them out.
+
+ want me to fix these? all / pick numbers / skip
+ ```
+
+ ## Rules
+
+ - Never delete user content without asking.
+ - Always show what you found before fixing.
+ - Commit fixes in small, clear commits (one per category).
+ - Update last_compiled frontmatter when recompiling wiki pages.
+ - Run atris clean at the end to verify everything is actually fixed.
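Two of the `/tidy` checks reduce to timestamp comparisons: a task is abandoned when it was claimed more than 3 days ago, and a wiki page is stale when a source file changed after `last_compiled`. The sketch below illustrates just that arithmetic. The input shapes (`status`, `claimedAt`, `lastCompiled`, `sourceMtimes`, all as millisecond timestamps) are hypothetical, since the skill does not say how `atris clean` represents them.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Abandoned: in progress and claimed more than `maxDays` days before `now`.
function findAbandonedTasks(tasks, now, maxDays = 3) {
  return tasks.filter(
    (t) => t.status === 'in-progress' && now - t.claimedAt > maxDays * DAY_MS
  );
}

// Stale: any source file was modified after the page was last compiled.
function isStaleWikiPage(page) {
  return page.sourceMtimes.some((mtime) => mtime > page.lastCompiled);
}
```

The same comparison with a 7 day window would cover the stale `MAP.md` check, given the file's last update time and the time of the latest code change.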
package/bin/atris.js CHANGED
@@ -237,6 +237,7 @@ function showHelp() {
  console.log(' search - Search journal history (atris search <keyword>)');
  console.log(' clean - Housekeeping (stale tasks, archive journals, broken refs)');
  console.log(' verify - Validate work is done (tests, MAP.md, changes)');
+ console.log(' release - Tag release, bump version, create GitHub release, draft /launch');
  console.log(' learn - Project learnings (patterns, pitfalls, preferences)');
  console.log(' ingest - Local-first wiki ingest into atris/wiki/');
  console.log(' query - Local-first wiki query against atris/wiki/');
@@ -271,13 +272,14 @@ function showHelp() {
  console.log(' wake [business] - Resume workspace (agents restart)');
  console.log('');
  console.log('Business:');
+ console.log(' business init <name> - Create canonical business workspace (cloud + local)');
  console.log(' business add <slug> - Connect a business');
  console.log(' business list - Show connected businesses');
  console.log(' business remove <slug> - Disconnect a business');
  console.log(' business team [slug] - Show members, roles, and admin access');
  console.log(' business health <slug> - Health report (members, workspace, issues)');
  console.log(' business audit - One-line health summary of all businesses');
- console.log(' business create <name> - Create new business (cloud + local)');
+ console.log(' business create <name> - Create new business; add --workspace for canonical local scaffold');
  console.log(' business connect <svc> - Wire a skill/integration');
  console.log(' business notify <mode> - Set notification mode (digest/silent/push)');
  console.log(' business deploy <slug> - Push local business to cloud');
@@ -423,7 +425,7 @@ const { planAtris: planCmd, doAtris: doCmd, reviewAtris: reviewCmd } = require('
  // All other commands are lazy-loaded inline (require() only when invoked)
 
  // Check if this is a known command or natural language input
- const knownCommands = ['init', 'log', 'status', 'analytics', 'visualize', 'brainstorm', 'autopilot', 'run', 'plan', 'do', 'review',
+ const knownCommands = ['init', 'log', 'status', 'analytics', 'visualize', 'brainstorm', 'autopilot', 'run', 'plan', 'do', 'review', 'release',
  'activate', '_activate', 'agent', 'chat', 'console', 'login', 'logout', 'whoami', 'switch', 'use', 'accounts', '_resolve', '_profile-email', '_switch-session', 'shell-init', 'update', 'upgrade', 'version', 'help', 'next', 'atris',
  'clean', 'verify', 'search', 'skill', 'member', 'learn', 'plugin', 'experiments', 'pull', 'push', 'align', 'terminal', 'diff', 'business', 'sync',
  'ingest', 'query', 'lint', 'loop',
@@ -1001,6 +1003,11 @@ if (command === 'init') {
  } else if (command === 'verify') {
  const taskId = process.argv[3] || null;
  require('../commands/verify').verifyAtris(taskId);
+ } else if (command === 'release') {
+ const dryRun = process.argv.includes('--dry-run');
+ require('../commands/release').releaseAtris({ dryRun })
+ .then(() => process.exit(0))
+ .catch((err) => { console.error(`\n✗ Error: ${err.message || err}`); process.exit(1); });
  } else if (command === 'search') {
  const keyword = process.argv.slice(3).join(' ');
  searchJournal(keyword);