atris 3.0.1 → 3.2.0 (npm, version diff)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

package/README.md +31 -0
package/atris/skills/endgame/SKILL.md +19 -1
package/atris/skills/improve/SKILL.md +65 -62
package/atris/skills/launch/SKILL.md +62 -0
package/atris/skills/tidy/SKILL.md +84 -0
package/bin/atris.js +9 -2
package/commands/autopilot.js +847 -43
package/commands/business.js +157 -35
package/commands/experiments.js +1 -1
package/commands/release.js +183 -0
package/commands/research.js +52 -0
package/commands/sync.js +108 -15
package/commands/verify.js +3 -3
package/commands/wiki.js +45 -25
package/lib/reward-config.js +24 -0
package/lib/scorecard.js +301 -0
package/lib/todo.js +12 -2
package/lib/wiki.js +87 -56
package/package.json +3 -2

package/README.md CHANGED

@@ -73,6 +73,20 @@ Core loop: `plan` -> `do` -> `review`
 Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other coding agents.
+## Business Workspaces
+If you want a real business workspace, use the business command instead of raw `atris init`.
+```bash
+atris business init "BLOND:ISH" --owner-email joel@blondish.world
+cd ~/arena/atris-business/blondish
+atris align --fix
+```
+That creates the cloud business, writes `.atris/business.json`, initializes `.atris/state/` for events, episodes, and scorecards, and scaffolds the local `atris/` workspace under `~/arena/atris-business/<slug>/` with starter team lanes, a default recap artifact, and a first-loop starter queue in `atris/TODO.md`.
+If you already have a folder full of source material, run it from there with `atris business init "BLOND:ISH" --here`.
 ## Core Commands
 | Command | Purpose |
@@ -97,11 +111,20 @@ Works with Claude Code, Cursor, Windsurf, Codex, GitHub Copilot, and other codin
 - `atris learn` stores structured project memory in `atris/learnings.jsonl`
 - `atris wiki` keeps repo memory in `atris/wiki/` by default, with `--cloud` when you want the remote workspace path
+- `atris wiki --private` uses `.atris/presidio/` for local-only sensitive notes and operating memory
 - `atris loop` refreshes `atris/wiki/STATUS.md` and `atris/wiki/log.md`, flags stale/orphan pages, and suggests the next ingest
 - `atris activate` loads the current wiki status so the next session starts with project memory, not just tasks
 - `atris experiments` runs Karpathy-style keep/revert loops in `atris/experiments/`
 - `atris pull` and `atris push` sync cloud workspaces and journals
+## Verifiable Feedback Loop
+Under the hood, Atris can keep score on real repo work.
+- Endgame tasks can carry a `Verify:` command, so work can end on a deterministic check instead of pure prose.
+- `atris autopilot` can run that check after review, record a reward in the journal, and append a local scorecard when a horizon closes.
+- Future horizon picks can weight against recent scorecards, so the loop learns from repo-local history without claiming model retraining.
 ## Benchmark Harness
 Atris ships one public head-to-head benchmark harness for comparing a pinned
@@ -156,6 +179,14 @@ atris skill link [--all]
 For Codex, copy any skill folder into `~/.codex/skills/`.
+## v3.2.0
+- **Staleness gate** — tasks tagged `[unverified]` are skipped at the moment of use, not pruned eagerly. Three-state model: actionable / unverified / deleted.
+- **Lesson gate** — `isLessonResolved` checks whether a lesson already shipped before proposing new horizons from it. Prevents the loop from re-solving solved problems.
+- **`atris release`** — new command: tags the version, bumps package.json, creates a GitHub release, and drafts a `/launch` post in one shot.
+- **Shell injection fix** — `checkStaleness` switched from `execSync` string interpolation to `execFileSync` with args arrays. Markdown-derived content (task titles, inbox items) no longer reaches a shell.
+- **Codex hardening** — `atris activate` and `atris` entry point detect Codex environments and write `AGENTS.md` so Codex sessions start with workspace context.
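
Note on the shell injection fix listed above: the bullet describes replacing `execSync` string interpolation with `execFileSync` and an args array. The sketch below illustrates that general technique only; it is not the `checkStaleness` code from the package, and `echoViaArgs` is a hypothetical helper invented for this example. It passes an untrusted string to a child process as a single argv entry, so shell metacharacters in it are never interpreted.

```javascript
const { execFileSync } = require('node:child_process');

// Hypothetical helper (not from atris): hands untrusted text to a child
// process as one argv entry. No shell is involved, so `;`, `$()` and quotes
// inside the text are treated as plain characters.
function echoViaArgs(untrusted) {
  // The child is the current Node binary; it writes its first argument back.
  const script = 'process.stdout.write(process.argv[1])';
  return execFileSync(process.execPath, ['-e', script, untrusted], {
    encoding: 'utf8',
  });
}

// A task title as it might appear in Markdown, with shell metacharacters.
const title = 'Fix login"; echo INJECTED; $(whoami) #';
console.log(echoViaArgs(title) === title);
```

With `execSync` and a template string, the same title would have been parsed by a shell. With an args array it reaches the child process unchanged.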
 ## Update
 ```bash

package/atris/skills/endgame/SKILL.md CHANGED

@@ -66,15 +66,30 @@ After running the three moves, write the result to `atris/TODO.md`:
 ```markdown
 - **T1:** <step 1 description> [endgame]
+  **Verify:** <deterministic-check>
 - **T2:** <step 2 description> [endgame]
+  **Verify:** <deterministic-check>
 - **T3:** <step 3 description> [endgame]
+  **Verify:** <deterministic-check>
 ```
 The tag must be exactly `[endgame]` (parser only matches `\w+`, no colons or hyphens). The slug lives in the section header.
+3. **Each task must include a `Verify:` line** with a deterministic check:
+   - **Test command:** `npm test` or `npm run test:feature`
+   - **Grep pattern:** `grep -q "pattern" file.js`
+   - **File presence:** `test -f path/to/file.md`
+   - **Exit code:** `node -e "process.exit(...)"` (or any shell command)
+   The verify command must:
+   - Complete in <30 seconds
+   - Exit 0 on pass, non-zero on fail
+   - Not require user input
+   - Be runnable from project root
 Use `T1`, `T2`, `T3` … as IDs (or `W1`/`E1`/etc per endgame domain). Single uppercase letter + digits, optional trailing lowercase letter (the parser was extended in commit `4db14d9` to accept `W3b`-style validator sub-task IDs).
-3. **Append the full endgame to today's journal `## Notes`** so the reasoning is preserved:
+4. **Append the full endgame to today's journal `## Notes`** so the reasoning is preserved:
 ```markdown
 ### Endgame picked — HH:MM PDT
@@ -95,6 +110,7 @@ REVERSE PATH
 NEXT MOVE
   T1: <description>
+  Verify: <check-command>
   Why this first: <one line>
 ```
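
Note on the task line format in this file: the skill text says tags must match `\w+` and that IDs are a single uppercase letter plus digits with an optional trailing lowercase letter. A minimal sketch of what such a parser could look like follows. The regular expressions and the `parseTaskLine` name are assumptions reconstructed from that prose, not the patterns used by the atris parser.

```javascript
// Assumed patterns, derived from the prose rules in the skill file.
const TASK_ID = /^[A-Z]\d+[a-z]?$/;   // T1, W3, W3b
const TAG = /\[(\w+)\]\s*$/;          // [endgame]; colons and hyphens do not match

// Hypothetical parser for lines such as "- **T1:** description [endgame]".
function parseTaskLine(line) {
  const m = line.match(/^- \*\*([^:*]+):\*\*\s+(.*)$/);
  if (!m || !TASK_ID.test(m[1])) return null;
  const tag = m[2].match(TAG);
  return {
    id: m[1],
    tag: tag ? tag[1] : null,
    title: m[2].replace(TAG, '').trim(),
  };
}

console.log(parseTaskLine('- **W3b:** Validate the scorecard [endgame]'));
```

Under these assumed patterns, a tag like `[end-game]` yields `tag: null`, which matches the warning that the tag must be exactly `[endgame]`.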
@@ -132,6 +148,7 @@ REVERSE PATH
 NEXT MOVE
   [one concrete action, doable in one session]
+  Verify: [deterministic check — test, grep, file, exit code]
   Why this first: [one line]
 ```
@@ -143,6 +160,7 @@ NEXT MOVE
 - **REVERSE PATH includes eliminate.** Half of strategy is removal. Forward-greedy planning never asks this. Endgame must.
 - **The chain must terminate this week.** If it can't, the horizon is too far — pick a closer one and say so.
 - **5–7 links max in the chain.** More than that = horizon is too vague.
+- **Every task must have a Verify line.** Deterministic check (test, grep, file, exit code). Allows the validator to score the endgame autonomously.
 - **Cite wiki pages** with `[[atris/wiki/...]]` refs.
 - **Ask 1–3 questions max** if the horizon is unclear. Never a wall of text.
 - **One chain, not three.** Pick the shortest defensible one.
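
Note on the `Verify:` contract added in this file: the command must finish in under 30 seconds, exit 0 on pass, need no user input, and run from the project root. The sketch below shows one way a runner could enforce those four constraints. `runVerify` is a hypothetical name, not a function from the package. Because a Verify line is by definition a shell command, this sketch runs it through a shell and therefore assumes `atris/TODO.md` is trusted input.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical runner (not from atris) for a single Verify command.
function runVerify(command, cwd = process.cwd(), timeoutMs = 30_000) {
  const result = spawnSync(command, {
    cwd,                 // run from the project root
    shell: true,         // Verify lines are shell commands, e.g. `test -f file`
    stdio: 'ignore',     // no stdin, so the check cannot wait for user input
    timeout: timeoutMs,  // kill the check once the 30 second budget is spent
  });
  const timedOut = Boolean(result.error && result.error.code === 'ETIMEDOUT');
  return {
    pass: result.status === 0 && !timedOut,
    status: result.status,
    timedOut,
  };
}

console.log(runVerify('exit 0'));
console.log(runVerify('exit 3'));
```

A timed-out check is reported as a failure rather than an error, so the caller can score it the same way as a non-zero exit.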

package/atris/skills/improve/SKILL.md CHANGED

@@ -1,84 +1,87 @@
 ---
 name: improve
-description: "Workspace maintenance and knowledge hygiene. Finds stale docs, broken refs, abandoned tasks, and fixes them. Use when things feel messy or you want the system to clean itself up. Triggers on: improve, clean up, maintenance, lint, health check, freshen up."
+description: "Run one RL improvement tick on the workspace via POST /api/improve. Ships one verifiable change, scores it, writes the scorecard. The thing you pay for. Triggers on: improve, make this better, ship one thing, run a tick, get smarter."
 version: 1.0.0
 tags:
-  - maintenance
-  - knowledge
-  - hygiene
-  - docs
+  - rl
+  - improve
+  - reward
+  - tick
+  - autopilot
 ---
 # /improve
-Finds what's rotting in your workspace and fixes it. Stale pages, broken references, abandoned tasks, outdated docs.
+Runs one improvement tick on the workspace. Calls `POST /api/improve` on the backend, which plans one task, builds it, verifies it, and scores it. Returns what shipped + the reward. Writes the scorecard locally.
-## When to use
+This is the product. The thing the user pays for. One call, one verifiable result.
-- "Things feel messy"
-- "Clean this up"
-- After a big refactor when docs have drifted
-- Periodically, to keep the knowledge base honest
-- When you suspect MAP.md or wiki pages are out of date
+## How it works
-## On invoke
-1. Run `atris clean --dry-run` silently. Collect results.
-2. Read atris/MAP.md, atris/TODO.md, and today's journal for context.
-3. Scan for these problems (in priority order):
-### What to look for
-**Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
-**Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
-**Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
-**Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
-**Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
-**Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
-4. Present findings as a numbered list, sorted by impact. For each:
-   - What's wrong
-   - Why it matters
-   - What you'd do to fix it
-5. Ask: "want me to fix these? all / pick numbers / skip"
+```
+/improve
+  → POST /api/improve { workspace: ".", mode: "full" }
+  → backend picks a task, plans, builds, reviews, verifies
+  → returns { task, reward, files_changed, verify_pass, summary }
+  → CLI writes scorecard to .atris/presidio/scorecards.md
+  → CLI reports result to user
+```
-6. Fix what they approve. For each fix:
-   - Make the change
-   - Update last_compiled if touching wiki pages
-   - Commit with a clear message
+The inference is Claude Code (or whatever model the backend uses). The environment is the folder. The endpoint is the bridge.
-7. After all fixes, run `atris clean` one more time to verify.
+## On invoke
-## Example
+1. Read `~/.atris/credentials.json` for auth token
+2. Read `.atris/business.json` for the API base URL (or default to `http://localhost:8000`)
+3. Call `POST /api/improve` with:
+   ```json
+   {
+     "workspace": "<current working directory>",
+     "mode": "full",
+     "model": "sonnet"
+   }
+   ```
+4. Wait for response (may take 1-5 minutes)
+5. On success:
+   - Show what shipped (task name, files changed, verify result)
+   - Show the reward score
+   - Write scorecard to `.atris/presidio/scorecards.md`
+   - Append tick to today's journal
+6. On failure:
+   - Show the error
+   - Write a lesson to `atris/lessons.md`
+   - Do not write a scorecard
+## Modes
+- `full` — plan, build, review, verify (default)
+- `plan` — just pick the task and show what it would do
+- `dry_run` — run everything but don't commit
+## Fallback
+If the backend is unreachable (no auth, no network, localhost not running), fall back to local mode: run `atris autopilot --auto --iterations=1` instead. Same loop, just local inference via `claude -p` subprocess. Report that it ran locally.
+## Output
 ```
-Found 4 things to improve:
-1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
-   These make navigation wrong. I can auto-heal most of them.
-2. atris/TODO.md has a task claimed 26 days ago by Executor.
-   It's blocking the in-progress slot. Should delete or re-scope.
-3. MAP.md hasn't been updated in 25 days.
-   Code has changed — the map is drifting from reality.
+improved.
-4. 2 empty sections in TODO.md.
-   Just noise. Can clean them out.
+  task:    fixed the stale wiki ref in auth-flow.md
+  verify:  pass (npm test, 143/143)
+  reward:  +4
+  files:   atris/wiki/briefs/auth-flow.md
+  time:    47s
-want me to fix these? all / pick numbers / skip
+  scorecard updated.
 ```
 ## Rules
-- Never delete user content without asking.
-- Always show what you found before fixing.
-- Commit fixes in small, clear commits (one per category).
-- Update last_compiled frontmatter when recompiling wiki pages.
-- Run atris clean at the end to verify everything is actually fixed.
+- One tick only. Never batch.
+- Always verify. No reward without a check.
+- Show what shipped, not what was attempted.
+- Write the scorecard. This is the receipt.
+- If verify fails, halt honestly and write a lesson.
+- Fallback to local if backend is unreachable. Never error silently.
+- The user pays because something real happened. Never fake it.
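
Note on the `/improve` flow described in this file: the skill calls `POST /api/improve`, falls back to a local run when the backend is unreachable, and writes a scorecard only when something verifiably shipped. A minimal control-flow sketch follows. `improveTick`, `post`, and `runLocal` are hypothetical names injected for illustration. Treating any thrown error from `post` as "unreachable", and gating the scorecard on `verify_pass`, are this sketch's reading of the rules, not confirmed behavior of the CLI.

```javascript
// Hypothetical sketch of one improvement tick (names are not from atris).
// `post` sends the request to the backend; `runLocal` stands in for the
// documented fallback, `atris autopilot --auto --iterations=1`.
async function improveTick({ post, runLocal, workspace = process.cwd(), mode = 'full' }) {
  let source = 'backend';
  let result;
  try {
    result = await post('/api/improve', { workspace, mode });
  } catch (err) {
    // Assumption: a thrown error means no auth, no network, or no server.
    source = 'local';
    result = await runLocal();
  }
  return {
    ...result,
    source,                                        // report where the tick ran
    writeScorecard: result.verify_pass === true,   // no check, no receipt
  };
}

// Usage with stubbed dependencies: the backend is down, the local run passes.
improveTick({
  post: async () => { throw new Error('connect ECONNREFUSED'); },
  runLocal: async () => ({ task: 'example task', reward: 1, verify_pass: true }),
}).then((r) => console.log(r.source, r.writeScorecard));
```

Injecting `post` and `runLocal` keeps the decision logic testable without a running backend.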

package/atris/skills/launch/SKILL.md ADDED

@@ -0,0 +1,62 @@
+---
+name: launch
+description: "Write a release post for Twitter and LinkedIn. 3 emoji bullets, plain English, no jargon. Triggers on: /launch, write a launch post, release announcement, ship post."
+version: 1.0.0
+tags: [launch, release, social, twitter, linkedin]
+---
+# /launch
+Writes a copy-paste-ready release post for Twitter and LinkedIn.
+## Format
+```
+<project> <version> update
+<emoji> What it does, one sentence. Plain English.
+<emoji> What it does, one sentence. No buzzwords.
+<emoji> What it does, one sentence. Concrete.
+<install command>
+<release URL>
+```
+## Example (atris v3.0.1)
+```
+atris v3.0.1 update
+🎯 Write what "done" looks like. The loop plans, builds, and reviews until it gets there.
+🧠 Reads past lessons and notes every run. Better decisions over time.
+📂 One command sets up a workspace with team, wiki, and context wired in.
+npm install -g atris
+https://github.com/atrislabs/atris/releases/tag/v3.0.1
+```
+## How to invoke
+User says "write a launch post", "post the release", "/launch", or "announce this version".
+The agent then:
+1. Read the latest git tag and release notes (or ask what shipped)
+2. Distill into exactly 3 bullets, one sentence each
+3. Pick an emoji per bullet that fits the content
+4. Add install command + release URL at the bottom
+5. Output the final text ready to copy-paste
+## Rules
+- 3 bullets max. Never 4, never 5.
+- Each bullet is one sentence about what it does, not what it is
+- No em dashes
+- No jargon ("canonical", "substrate", "two-engine architecture", "self-improving")
+- No mentioning specific tools by name (Claude Code, Cursor, etc.)
+- No model/AI buzzwords ("LLM", "same model", "agentic")
+- No marketing speak. If it sounds like a pitch deck, rewrite it.
+- Plain English a high schooler would understand
+- Install command + release URL always at the bottom
+- Same post works for both Twitter and LinkedIn
+- Optional witty one-liner closer if it fits naturally. Never forced.
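
Note on the `/launch` rules in this file: several of them are mechanical enough to check in code (at most 3 bullets, no em dashes, release URL at the bottom). The sketch below is a hypothetical linter for a drafted post, assuming bullets are the lines that start with an emoji as in the skill's format block. `checkLaunchPost` is not part of the package.

```javascript
// Hypothetical linter (not from atris) for a drafted /launch post.
function checkLaunchPost(text) {
  const lines = text.split('\n').map((l) => l.trim()).filter(Boolean);
  // Assumption: a bullet is any line that begins with a pictographic emoji.
  const bullets = lines.filter((l) => /^\p{Extended_Pictographic}/u.test(l));
  const problems = [];
  if (bullets.length > 3) problems.push('more than 3 bullets');
  if (text.includes('\u2014')) problems.push('contains an em dash');
  const last = lines[lines.length - 1] || '';
  if (!/^https?:\/\//.test(last)) problems.push('release URL is not the last line');
  return problems;
}

const post = [
  'atris v3.0.1 update',
  '🎯 Write what "done" looks like. The loop plans, builds, and reviews until it gets there.',
  '🧠 Reads past lessons and notes every run. Better decisions over time.',
  '📂 One command sets up a workspace with team, wiki, and context wired in.',
  'npm install -g atris',
  'https://github.com/atrislabs/atris/releases/tag/v3.0.1',
].join('\n');

console.log(checkLaunchPost(post));
```

The jargon and tone rules are judgment calls and are left out of this sketch on purpose.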

package/atris/skills/tidy/SKILL.md ADDED

@@ -0,0 +1,84 @@
+---
+name: tidy
+description: "Workspace maintenance and knowledge hygiene. Finds stale docs, broken refs, abandoned tasks, and fixes them. Use when things feel messy or you want the system to clean itself up. Triggers on: tidy, clean up, maintenance, lint, health check, freshen up."
+version: 1.1.0
+tags:
+  - maintenance
+  - knowledge
+  - hygiene
+  - docs
+---
+# /tidy
+Finds what's rotting in your workspace and fixes it. Stale pages, broken references, abandoned tasks, outdated docs.
+## When to use
+- "Things feel messy"
+- "Clean this up"
+- After a big refactor when docs have drifted
+- Periodically, to keep the knowledge base honest
+- When you suspect MAP.md or wiki pages are out of date
+## On invoke
+1. Run `atris clean --dry-run` silently. Collect results.
+2. Read atris/MAP.md, atris/TODO.md, and today's journal for context.
+3. Scan for these problems (in priority order):
+### What to look for
+**Stale wiki pages** — pages with `last_compiled` frontmatter where the source files have been modified since. The page content may be wrong.
+**Broken MAP.md references** — file:line refs that point to code that moved or was deleted. The auto-healer fixes what it can; report what it can't.
+**Abandoned tasks** — in-progress tasks claimed more than 3 days ago. Either finish them, re-scope them, or delete them.
+**Orphan docs** — markdown pages under atris/ that nothing links to. They're invisible and probably stale.
+**Stale MAP.md** — if MAP.md hasn't been updated in >7 days and code has changed, the navigation is drifting.
+**Empty sections** — TODO.md sections with placeholder text like "(empty)" or "(clean)".
+4. Present findings as a numbered list, sorted by impact. For each:
+   - What's wrong
+   - Why it matters
+   - What you'd do to fix it
+5. Ask: "want me to fix these? all / pick numbers / skip"
+6. Fix what they approve. For each fix:
+   - Make the change
+   - Update last_compiled if touching wiki pages
+   - Commit with a clear message
+7. After all fixes, run `atris clean` one more time to verify.
+## Example
+```
+Found 4 things to improve:
+1. MAP.md has 11 broken refs — 3 files moved, 8 functions renamed.
+   These make navigation wrong. I can auto-heal most of them.
+2. atris/TODO.md has a task claimed 26 days ago by Executor.
+   It's blocking the in-progress slot. Should delete or re-scope.
+3. MAP.md hasn't been updated in 25 days.
+   Code has changed — the map is drifting from reality.
+4. 2 empty sections in TODO.md.
+   Just noise. Can clean them out.
+want me to fix these? all / pick numbers / skip
+```
+## Rules
+- Never delete user content without asking.
+- Always show what you found before fixing.
+- Commit fixes in small, clear commits (one per category).
+- Update last_compiled frontmatter when recompiling wiki pages.
+- Run atris clean at the end to verify everything is actually fixed.
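
Note on the thresholds in this file: the skill flags in-progress tasks claimed more than 3 days ago and a `MAP.md` that has not been updated in more than 7 days while code has changed. A minimal sketch of those two checks follows. The function names and the idea of comparing timestamps directly are assumptions for illustration. How `atris clean` actually derives these dates is not shown in this diff.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical check: a claimed task counts as abandoned after more than 3 days.
function isAbandoned(claimedAt, now = new Date()) {
  return now - claimedAt > 3 * DAY_MS;
}

// Hypothetical check: MAP.md is stale when it is more than 7 days old
// and code has changed since it was last updated.
function isMapStale(mapUpdatedAt, lastCodeChangeAt, now = new Date()) {
  return now - mapUpdatedAt > 7 * DAY_MS && lastCodeChangeAt > mapUpdatedAt;
}

const now = new Date('2025-01-31T00:00:00Z');
console.log(isAbandoned(new Date('2025-01-05T00:00:00Z'), now));
console.log(isMapStale(new Date('2025-01-06T00:00:00Z'), new Date('2025-01-20T00:00:00Z'), now));
```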

package/bin/atris.js CHANGED

@@ -237,6 +237,7 @@ function showHelp() {
   console.log('  search     - Search journal history (atris search <keyword>)');
   console.log('  clean      - Housekeeping (stale tasks, archive journals, broken refs)');
   console.log('  verify     - Validate work is done (tests, MAP.md, changes)');
+  console.log('  release    - Tag release, bump version, create GitHub release, draft /launch');
   console.log('  learn      - Project learnings (patterns, pitfalls, preferences)');
   console.log('  ingest     - Local-first wiki ingest into atris/wiki/');
   console.log('  query      - Local-first wiki query against atris/wiki/');
@@ -271,13 +272,14 @@ function showHelp() {
   console.log('  wake [business]    - Resume workspace (agents restart)');
   console.log('');
   console.log('Business:');
+  console.log('  business init <name>   - Create canonical business workspace (cloud + local)');
   console.log('  business add <slug>    - Connect a business');
   console.log('  business list          - Show connected businesses');
   console.log('  business remove <slug> - Disconnect a business');
   console.log('  business team [slug]   - Show members, roles, and admin access');
   console.log('  business health <slug> - Health report (members, workspace, issues)');
   console.log('  business audit         - One-line health summary of all businesses');
-  console.log('  business create <name> - Create new business (cloud + local)');
+  console.log('  business create <name> - Create new business; add --workspace for canonical local scaffold');
   console.log('  business connect <svc> - Wire a skill/integration');
   console.log('  business notify <mode> - Set notification mode (digest/silent/push)');
   console.log('  business deploy <slug> - Push local business to cloud');
@@ -423,7 +425,7 @@ const { planAtris: planCmd, doAtris: doCmd, reviewAtris: reviewCmd } = require('
 // All other commands are lazy-loaded inline (require() only when invoked)
 // Check if this is a known command or natural language input
-const knownCommands = ['init', 'log', 'status', 'analytics', 'visualize', 'brainstorm', 'autopilot', 'run', 'plan', 'do', 'review',
+const knownCommands = ['init', 'log', 'status', 'analytics', 'visualize', 'brainstorm', 'autopilot', 'run', 'plan', 'do', 'review', 'release',
                        'activate', '_activate', 'agent', 'chat', 'console', 'login', 'logout', 'whoami', 'switch', 'use', 'accounts', '_resolve', '_profile-email', '_switch-session', 'shell-init', 'update', 'upgrade', 'version', 'help', 'next', 'atris',
                        'clean', 'verify', 'search', 'skill', 'member', 'learn', 'plugin', 'experiments', 'pull', 'push', 'align', 'terminal', 'diff', 'business', 'sync',
                        'ingest', 'query', 'lint', 'loop',
@@ -1001,6 +1003,11 @@ if (command === 'init') {
 } else if (command === 'verify') {
   const taskId = process.argv[3] || null;
   require('../commands/verify').verifyAtris(taskId);
+} else if (command === 'release') {
+  const dryRun = process.argv.includes('--dry-run');
+  require('../commands/release').releaseAtris({ dryRun })
+    .then(() => process.exit(0))
+    .catch((err) => { console.error(`\n✗ Error: ${err.message || err}`); process.exit(1); });
 } else if (command === 'search') {
   const keyword = process.argv.slice(3).join(' ');
   searchJournal(keyword);