selftune 0.2.0 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (122)

package/.claude/agents/diagnosis-analyst.md +20 -10
package/.claude/agents/evolution-reviewer.md +14 -1
package/.claude/agents/integration-guide.md +18 -6
package/.claude/agents/pattern-analyst.md +18 -5
package/CHANGELOG.md +12 -4
package/README.md +43 -35
package/apps/local-dashboard/dist/assets/geist-cyrillic-wght-normal-CHSlOQsW.woff2 +0 -0
package/apps/local-dashboard/dist/assets/geist-latin-ext-wght-normal-DMtmJ5ZE.woff2 +0 -0
package/apps/local-dashboard/dist/assets/geist-latin-wght-normal-Dm3htQBi.woff2 +0 -0
package/apps/local-dashboard/dist/assets/index-C4EOTFZ2.js +15 -0
package/apps/local-dashboard/dist/assets/index-bl-Webyd.css +1 -0
package/apps/local-dashboard/dist/assets/vendor-react-U7zYD9Rg.js +60 -0
package/apps/local-dashboard/dist/assets/vendor-table-B7VF2Ipl.js +26 -0
package/apps/local-dashboard/dist/assets/vendor-ui-D7_zX_qy.js +346 -0
package/apps/local-dashboard/dist/favicon.png +0 -0
package/apps/local-dashboard/dist/index.html +17 -0
package/apps/local-dashboard/dist/logo.png +0 -0
package/apps/local-dashboard/dist/logo.svg +9 -0
package/cli/selftune/badge/badge-data.ts +1 -1
package/cli/selftune/badge/badge.ts +4 -8
package/cli/selftune/canonical-export.ts +183 -0
package/cli/selftune/constants.ts +28 -0
package/cli/selftune/contribute/contribute.ts +1 -1
package/cli/selftune/cron/setup.ts +17 -17
package/cli/selftune/dashboard-contract.ts +202 -0
package/cli/selftune/dashboard-server.ts +653 -186
package/cli/selftune/dashboard.ts +41 -176
package/cli/selftune/eval/baseline.ts +5 -4
package/cli/selftune/eval/composability-v2.ts +273 -0
package/cli/selftune/eval/hooks-to-evals.ts +34 -15
package/cli/selftune/eval/unit-test-cli.ts +1 -1
package/cli/selftune/evolution/evidence.ts +26 -0
package/cli/selftune/evolution/evolve-body.ts +105 -11
package/cli/selftune/evolution/evolve.ts +371 -25
package/cli/selftune/evolution/extract-patterns.ts +87 -29
package/cli/selftune/evolution/rollback.ts +2 -2
package/cli/selftune/grading/auto-grade.ts +200 -0
package/cli/selftune/grading/grade-session.ts +448 -97
package/cli/selftune/grading/results.ts +42 -0
package/cli/selftune/hooks/prompt-log.ts +172 -2
package/cli/selftune/hooks/session-stop.ts +123 -3
package/cli/selftune/hooks/skill-eval.ts +119 -3
package/cli/selftune/index.ts +395 -116
package/cli/selftune/ingestors/claude-replay.ts +140 -114
package/cli/selftune/ingestors/codex-rollout.ts +345 -46
package/cli/selftune/ingestors/codex-wrapper.ts +207 -39
package/cli/selftune/ingestors/openclaw-ingest.ts +141 -8
package/cli/selftune/ingestors/opencode-ingest.ts +193 -17
package/cli/selftune/init.ts +227 -14
package/cli/selftune/last.ts +14 -5
package/cli/selftune/localdb/db.ts +63 -0
package/cli/selftune/localdb/materialize.ts +428 -0
package/cli/selftune/localdb/queries.ts +376 -0
package/cli/selftune/localdb/schema.ts +204 -0
package/cli/selftune/monitoring/watch.ts +66 -15
package/cli/selftune/normalization.ts +682 -0
package/cli/selftune/observability.ts +19 -44
package/cli/selftune/orchestrate.ts +1073 -0
package/cli/selftune/quickstart.ts +203 -0
package/cli/selftune/repair/skill-usage.ts +576 -0
package/cli/selftune/schedule.ts +561 -0
package/cli/selftune/status.ts +48 -26
package/cli/selftune/sync.ts +627 -0
package/cli/selftune/types.ts +148 -0
package/cli/selftune/utils/canonical-log.ts +45 -0
package/cli/selftune/utils/hooks.ts +41 -0
package/cli/selftune/utils/html.ts +27 -0
package/cli/selftune/utils/llm-call.ts +78 -20
package/cli/selftune/utils/math.ts +10 -0
package/cli/selftune/utils/query-filter.ts +139 -0
package/cli/selftune/utils/skill-discovery.ts +340 -0
package/cli/selftune/utils/skill-log.ts +68 -0
package/cli/selftune/utils/skill-usage-confidence.ts +18 -0
package/cli/selftune/utils/transcript.ts +272 -26
package/cli/selftune/workflows/discover.ts +254 -0
package/cli/selftune/workflows/skill-md-writer.ts +288 -0
package/cli/selftune/workflows/workflows.ts +188 -0
package/package.json +21 -8
package/packages/telemetry-contract/README.md +11 -0
package/packages/telemetry-contract/fixtures/golden.json +87 -0
package/packages/telemetry-contract/fixtures/golden.test.ts +42 -0
package/packages/telemetry-contract/index.ts +1 -0
package/packages/telemetry-contract/package.json +19 -0
package/packages/telemetry-contract/src/index.ts +2 -0
package/packages/telemetry-contract/src/types.ts +163 -0
package/packages/telemetry-contract/src/validators.ts +109 -0
package/skill/SKILL.md +84 -53
package/skill/Workflows/AutoActivation.md +17 -16
package/skill/Workflows/Badge.md +6 -0
package/skill/Workflows/Baseline.md +46 -23
package/skill/Workflows/Composability.md +12 -5
package/skill/Workflows/Contribute.md +17 -14
package/skill/Workflows/Cron.md +56 -79
package/skill/Workflows/Dashboard.md +45 -34
package/skill/Workflows/Doctor.md +30 -17
package/skill/Workflows/Evals.md +64 -40
package/skill/Workflows/EvolutionMemory.md +2 -0
package/skill/Workflows/Evolve.md +102 -47
package/skill/Workflows/EvolveBody.md +6 -6
package/skill/Workflows/Grade.md +36 -31
package/skill/Workflows/ImportSkillsBench.md +11 -5
package/skill/Workflows/Ingest.md +43 -36
package/skill/Workflows/Initialize.md +44 -30
package/skill/Workflows/Orchestrate.md +139 -0
package/skill/Workflows/Replay.md +39 -18
package/skill/Workflows/Rollback.md +3 -3
package/skill/Workflows/Schedule.md +61 -0
package/skill/Workflows/Sync.md +88 -0
package/skill/Workflows/UnitTest.md +34 -22
package/skill/Workflows/Watch.md +14 -4
package/skill/Workflows/Workflows.md +129 -0
package/skill/assets/activation-rules-default.json +26 -0
package/skill/assets/multi-skill-settings.json +63 -0
package/skill/assets/single-skill-settings.json +57 -0
package/skill/references/invocation-taxonomy.md +2 -2
package/skill/references/logs.md +164 -2
package/skill/references/setup-patterns.md +65 -0
package/skill/references/version-history.md +40 -0
package/skill/settings_snippet.json +1 -1
package/templates/multi-skill-settings.json +7 -7
package/templates/single-skill-settings.json +6 -6
package/dashboard/index.html +0 -1680

package/.claude/agents/diagnosis-analyst.md CHANGED

@@ -11,12 +11,22 @@ Investigate why a specific skill is underperforming. Analyze telemetry logs,
 grading results, and session transcripts to identify root causes and recommend
 targeted fixes.
-**Activate when the user says:**
-- "diagnose skill issues"
-- "why is skill X underperforming"
-- "what's wrong with this skill"
-- "skill failure analysis"
-- "debug skill performance"
+**Activation policy:** This is a subagent-only role, spawned by the main agent.
+If a user asks for diagnosis directly, the main agent should route to this subagent.
+## Connection to Workflows
+This agent is spawned by the main agent as a subagent when deeper analysis is
+needed — it is not called directly by the user.
+**Connected workflows:**
+- **Doctor** — when `selftune doctor` reveals persistent issues with a specific skill, spawn this agent for root cause analysis
+- **Grade** — when grades are consistently low for a skill, spawn this agent to investigate why
+- **Status** — when `selftune status` shows CRITICAL or WARNING flags on a skill, spawn this agent for a deep dive
+The main agent decides when to escalate to this subagent based on severity
+and persistence of the issue. One-off failures are handled inline; recurring
+or unexplained failures warrant spawning this agent.
 ## Context
@@ -48,7 +58,7 @@ any warnings or regression flags.
 ### Step 3: Pull telemetry stats
 ```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
 ```
 Review aggregate metrics:
@@ -59,7 +69,7 @@ Review aggregate metrics:
 ### Step 4: Analyze trigger coverage
 ```bash
-selftune evals --skill <name> --max 50
+selftune eval generate --skill <name> --max 50
 ```
 Review the generated eval set. Count entries by invocation type:
@@ -106,8 +116,8 @@ Compile findings into a structured report.
 |---------|---------|
 | `selftune status` | Overall health snapshot |
 | `selftune last` | Most recent session details |
-| `selftune evals --skill <name> --stats` | Aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set for coverage analysis |
+| `selftune eval generate --skill <name> --stats` | Aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set for coverage analysis |
 | `selftune doctor` | Check infrastructure health |
 ## Output

package/.claude/agents/evolution-reviewer.md CHANGED

@@ -18,6 +18,19 @@ vs. new descriptions, and provides an approve/reject verdict with reasoning.
 - "review pending changes"
 - "should I deploy this evolution"
+## Connection to Workflows
+This agent is spawned by the main agent as a subagent to provide a safety
+review before deploying an evolution.
+**Connected workflows:**
+- **Evolve** — in the review-before-deploy step, spawn this agent to evaluate the proposal for regressions, scope creep, and eval set quality
+- **EvolveBody** — same role for full-body and routing-table evolutions
+**Mode behavior:**
+- **Interactive mode** — spawn this agent before deploying an evolution to get a human-readable safety review with an approve/reject verdict
+- **Autonomous mode** — the orchestrator handles validation internally using regression thresholds and auto-rollback; this agent is for interactive safety reviews only
 ## Context
 You need access to:
@@ -114,7 +127,7 @@ Issue an approve or reject decision with full reasoning.
 | Command | Purpose |
 |---------|---------|
 | `selftune evolve --skill <name> --skill-path <path> --dry-run` | Generate proposal without deploying |
-| `selftune evals --skill <name>` | Check eval set used for validation |
+| Read eval file from evolve output or audit log | Inspect the exact eval set used for validation |
 | `selftune watch --skill <name> --skill-path <path>` | Check current performance baseline |
 | `selftune status` | Overall skill health context |

package/.claude/agents/integration-guide.md CHANGED

@@ -19,6 +19,18 @@ verify the setup is working end-to-end.
 - "get selftune working"
 - "selftune setup guide"
+## Connection to Workflows
+This agent is the deep-dive version of the Initialize workflow, spawned by
+the main agent as a subagent when the project structure is complex.
+**Connected workflows:**
+- **Initialize** — for complex project structures (monorepos, multi-skill repos, mixed agent platforms), spawn this agent instead of running the basic init workflow
+**When to spawn:** when the project has multiple SKILL.md files, multiple
+packages or workspaces, mixed agent platforms (Claude + Codex), or any
+structure where the standard `selftune init` needs project-specific guidance.
 ## Context
 You need access to:
@@ -90,8 +102,8 @@ Parse the output to confirm `~/.selftune/config.json` was created. Note the
 detected `agent_type` and `cli_path`.
 If the user is on a non-Claude agent platform:
-- **Codex** — inform about `wrap-codex` and `ingest-codex` options
-- **OpenCode** — inform about `ingest-opencode` option
+- **Codex** — inform about `ingest wrap-codex` and `ingest codex` options
+- **OpenCode** — inform about `ingest opencode` option
 ### Step 5: Install hooks
@@ -106,8 +118,8 @@ into `~/.claude/settings.json`. Three hooks are required:
 Derive script paths from `cli_path` in `~/.selftune/config.json`.
-For **Codex**: use `selftune wrap-codex` or `selftune ingest-codex`.
-For **OpenCode**: use `selftune ingest-opencode`.
+For **Codex**: use `selftune ingest wrap-codex` or `selftune ingest codex`.
+For **OpenCode**: use `selftune ingest opencode`.
 ### Step 6: Verify with doctor
@@ -159,7 +171,7 @@ from any package directory.
 Tell the user what to do next based on their goals:
 - **"I want to see how my skills are doing"** — run `selftune status`
-- **"I want to improve a skill"** — run `selftune evals --skill <name>` then `selftune evolve`
+- **"I want to improve a skill"** — run `selftune eval generate --skill <name>` then `selftune evolve --skill <name>`
 - **"I want to grade a session"** — run `selftune grade --skill <name>`
 ## Commands
@@ -170,7 +182,7 @@ Tell the user what to do next based on their goals:
 | `selftune doctor` | Verify installation health |
 | `selftune status` | Post-setup health check |
 | `selftune last` | Verify telemetry capture |
-| `selftune evals --list-skills` | Confirm skills are being tracked |
+| `selftune eval generate --list-skills` | Confirm skills are being tracked |
 ## Output

package/.claude/agents/pattern-analyst.md CHANGED

@@ -19,6 +19,19 @@ opportunities, and identify systemic issues affecting multiple skills.
 - "skill trigger conflicts"
 - "optimize my skills"
+## Connection to Workflows
+This agent is spawned by the main agent as a subagent for deep cross-skill
+analysis.
+**Connected workflows:**
+- **Composability** — when `selftune eval composability` identifies conflict candidates, spawn this agent for deeper investigation of trigger overlaps and resolution strategies
+- **Evals** — when analyzing cross-skill patterns or systemwide undertriggering, spawn this agent to find optimization opportunities
+**When to spawn:** when the user asks about conflicts between skills,
+cross-skill optimization, or when composability scores indicate moderate-to-severe
+conflicts (score > 0.3).
 ## Context
 You need access to:
@@ -33,7 +46,7 @@ You need access to:
 ### Step 1: Inventory all skills
 ```bash
-selftune evals --list-skills
+selftune eval generate --list-skills
 ```
 Parse the JSON output to get a complete list of skills with their query
@@ -77,7 +90,7 @@ Read `skill_usage_log.jsonl` and group by query text. Look for:
 For each skill, pull stats:
 ```bash
-selftune evals --skill <name> --stats
+selftune eval generate --skill <name> --stats
 ```
 Compare across skills:
@@ -100,10 +113,10 @@ Compile a cross-skill analysis report.
 | Command | Purpose |
 |---------|---------|
-| `selftune evals --list-skills` | Inventory all skills with query counts |
+| `selftune eval generate --list-skills` | Inventory all skills with query counts |
 | `selftune status` | Health snapshot across all skills |
-| `selftune evals --skill <name> --stats` | Per-skill aggregate telemetry |
-| `selftune evals --skill <name> --max 50` | Generate eval set per skill |
+| `selftune eval generate --skill <name> --stats` | Per-skill aggregate telemetry |
+| `selftune eval generate --skill <name> --max 50` | Generate eval set per skill |
 ## Output
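Note on the hunk above: the new "When to spawn" rule escalates to the pattern analyst when composability scores indicate moderate-to-severe conflicts (score > 0.3). A minimal sketch of that check follows. The `ConflictCandidate` shape and the function name are hypothetical and are not taken from the selftune source; only the 0.3 threshold comes from the diff.

```typescript
// Threshold quoted in pattern-analyst.md: scores above 0.3 count as
// moderate-to-severe conflicts.
const CONFLICT_THRESHOLD = 0.3;

// Hypothetical shape for one conflict candidate; the real output of
// `selftune eval composability` may look different.
interface ConflictCandidate {
  skillA: string;
  skillB: string;
  score: number;
}

// Returns only the candidates that would justify spawning the subagent.
function conflictsNeedingAnalysis(
  candidates: ConflictCandidate[],
): ConflictCandidate[] {
  return candidates.filter((c) => c.score > CONFLICT_THRESHOLD);
}
```

The comparison is strict because the document writes "score > 0.3"; a score of exactly 0.3 would not trigger escalation under this reading.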

package/CHANGELOG.md CHANGED

@@ -7,15 +7,23 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 ## [Unreleased]
+### Added
+- **Real-time improvement signal detection** — `prompt-log` hook detects user corrections ("why didn't you use X?") and explicit skill requests via pure regex patterns. Signals are logged to `~/.claude/improvement_signals.jsonl` with skill name extraction from installed skills.
+- **Signal-reactive orchestration** — `session-stop` hook checks for pending improvement signals and spawns a focused `selftune orchestrate --max-skills 2` run in the background. Respects a 30-minute lockfile to prevent concurrent runs.
+- **Signal-aware candidate selection** — Orchestrator reads pending signals and boosts priority for mentioned skills (+150 per signal, capped at +450). Signaled skills bypass the minimum evidence gate and the "UNGRADED with 0 missed queries" gate.
+- **Orchestrate lockfile** — `acquireLock()`/`releaseLock()` with PID+timestamp in `~/.claude/.orchestrate.lock`. 30-minute stale threshold prevents deadlocks from crashed runs.
+- **Signal consumption** — After an orchestrate run completes, consumed signals are marked with `consumed: true`, `consumed_at`, and `consumed_by_run` so they don't affect subsequent runs.
 ## [0.2.0] — 2026-03-08
 ### Added
 - **Full skill body evolution** — Teacher-student model for evolving routing tables and complete skill bodies with 3-gate validation (structural, trigger, quality)
-- **Synthetic eval generation** — `selftune evals --synthetic --skill <name> --skill-path <path>` generates eval sets from SKILL.md via LLM without needing real session logs. Solves cold-start for new skills.
+- **Synthetic eval generation** — `selftune eval generate --synthetic --skill <name> --skill-path <path>` generates eval sets from SKILL.md via LLM without needing real session logs. Solves cold-start for new skills.
 - **Batch trigger validation** — `validateProposalBatched()` batches 10 queries per LLM call (configurable via `TRIGGER_CHECK_BATCH_SIZE`). ~10x faster evolution loops. Sequential `validateProposalSequential()` kept for backward compat.
 - **Cheap-loop evolution mode** — `selftune evolve --cheap-loop` uses haiku for proposal generation and validation, sonnet only for the final deployment gate. New `--gate-model` and `--proposal-model` flags for manual per-stage control.
-- **Validation model selection** — `--validation-model` flag on `evolve` and `evolve-body` commands (default: `haiku`).
+- **Validation model selection** — `--validation-model` flag on `evolve` and `evolve body` commands (default: `haiku`).
 - **Proposal model selection** — `--proposal-model` flag on `evolve`, passed through to `generateProposal()` and `generateMultipleProposals()`.
 - **Gate validation dependency injection** — `gateValidateProposal` added to `EvolveDeps` for testability.
 - **Auto-activation system** — `auto-activate.ts` UserPromptSubmit hook detects when selftune should run and outputs formatted suggestions; session state tracking prevents repeated nags; PAI coexistence support
@@ -47,7 +55,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 - `selftune status` — CLI skill health summary with pass rates, trends, and system health
 - `selftune last` — Quick insight from the most recent session
 - `selftune dashboard` — Skill-health-centric HTML dashboard with grid view and drill-down
-- `selftune replay` — Claude Code transcript replay for retroactive log backfill
+- `selftune ingest claude` — Claude Code transcript replay for retroactive log backfill
 - `selftune contribute` — Opt-in anonymized data export for community contribution
 - CI/CD workflows: publish, auto-bump, CodeQL, scorecard
 - FOSS governance: LICENSE (MIT), CODE_OF_CONDUCT, CONTRIBUTING, SECURITY
@@ -57,7 +65,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/).
 ### Added
-- CLI entry point with 10 commands: `init`, `evals`, `grade`, `evolve`, `rollback`, `watch`, `doctor`, `ingest-codex`, `ingest-opencode`, `wrap-codex`
+- CLI entry point with 10 commands: `init`, `eval generate`, `grade`, `evolve`, `evolve rollback`, `watch`, `doctor`, `ingest codex`, `ingest opencode`, `ingest wrap-codex`
 - Agent auto-detection for Claude Code, Codex, and OpenCode
 - Telemetry hooks for Claude Code (`prompt-log`, `skill-eval`, `session-stop`)
 - Codex wrapper and batch ingestor for rollout logs
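Note on the [Unreleased] entries above: two of the rules are plain arithmetic. The priority boost is +150 per signal, capped at +450, and the lock becomes stale after 30 minutes. The sketch below illustrates both under stated assumptions. The function names, the `skill` field, and the strict "older than" comparison are hypothetical; only the three constants and the `consumed` flag come from the changelog text.

```typescript
// Constants quoted in the changelog entries.
const BOOST_PER_SIGNAL = 150;
const MAX_SIGNAL_BOOST = 450;
const STALE_LOCK_MS = 30 * 60 * 1000; // 30-minute stale threshold

// Hypothetical record shape; `consumed` mirrors the flag named in the
// changelog, `skill` is an assumed field name.
interface ImprovementSignal {
  skill: string;
  consumed?: boolean;
}

// Priority bonus for one skill, counting only signals not yet consumed.
function signalBoost(skill: string, signals: ImprovementSignal[]): number {
  const pending = signals.filter(
    (s) => s.skill === skill && !s.consumed,
  ).length;
  return Math.min(pending * BOOST_PER_SIGNAL, MAX_SIGNAL_BOOST);
}

// Assumption: a lock strictly older than the threshold is treated as
// abandoned, so a crashed run cannot block later runs indefinitely.
function isLockStale(lockTimestampMs: number, nowMs: number): boolean {
  return nowMs - lockTimestampMs > STALE_LOCK_MS;
}
```

With these constants the cap is reached at three pending signals, so a fourth or fifth mention of the same skill adds nothing further.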

package/README.md CHANGED

@@ -6,9 +6,9 @@
 **Self-improving skills for AI agents.**
-[![CI](https://github.com/WellDunDun/selftune/actions/workflows/ci.yml/badge.svg)](https://github.com/WellDunDun/selftune/actions/workflows/ci.yml)
-[![CodeQL](https://github.com/WellDunDun/selftune/actions/workflows/codeql.yml/badge.svg)](https://github.com/WellDunDun/selftune/actions/workflows/codeql.yml)
-[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/WellDunDun/selftune/badge)](https://securityscorecards.dev/viewer/?uri=github.com/WellDunDun/selftune)
+[![CI](https://github.com/selftune-dev/selftune/actions/workflows/ci.yml/badge.svg)](https://github.com/selftune-dev/selftune/actions/workflows/ci.yml)
+[![CodeQL](https://github.com/selftune-dev/selftune/actions/workflows/codeql.yml/badge.svg)](https://github.com/selftune-dev/selftune/actions/workflows/codeql.yml)
+[![OpenSSF Scorecard](https://api.securityscorecards.dev/projects/github.com/selftune-dev/selftune/badge)](https://securityscorecards.dev/viewer/?uri=github.com/selftune-dev/selftune)
 [![npm version](https://img.shields.io/npm/v/selftune)](https://www.npmjs.com/package/selftune)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![TypeScript](https://img.shields.io/badge/TypeScript-blue.svg)](https://www.typescriptlang.org/)
@@ -25,17 +25,17 @@ Your agent skills learn how you work. Detect what's broken. Fix it automatically
 Your skills don't understand how you talk. You say "make me a slide deck" and nothing happens — no error, no log, no signal. selftune watches your real sessions, learns how you actually speak, and rewrites skill descriptions to match. Automatically.
-Works with **Claude Code**, **Codex**, **OpenCode**, and **OpenClaw**. Zero runtime dependencies.
+Works with **Claude Code** (primary). Codex, OpenCode, and OpenClaw adapters are experimental. Zero runtime dependencies.
 ## Install
 ```bash
-npx skills add WellDunDun/selftune
+npx skills add selftune-dev/selftune
 ```
 Then tell your agent: **"initialize selftune"**
-Two minutes. No API keys. No external services. No configuration ceremony. Uses your existing agent subscription. Within minutes you'll see which skills are undertriggering.
+Two minutes. No API keys. No external services. No configuration ceremony. Uses your existing agent subscription. You'll see which skills are undertriggering.
 **CLI only** (no skill, just the CLI):
@@ -53,11 +53,11 @@ selftune learned that real users say "slides", "deck", "presentation for Monday"
 ## Built for How You Actually Work
-**I write and use my own skills** — You built skills for your workflow but your descriptions don't match how you actually talk. selftune learns your language from real sessions and evolves descriptions to match — no more manual tuning. `selftune status` · `selftune evolve` · `selftune baseline`
+**I write and use my own skills** — Your skill descriptions don't match how you actually talk. Tell your agent "improve my skills" and selftune learns your language from real sessions, evolves descriptions to match, and validates before deploying. No manual tuning.
-**I publish skills others install** — Your skill works for you, but every user talks differently. selftune ships skills that get better for every user automatically — adapting descriptions to how each person actually works. `selftune status` · `selftune evals` · `selftune badge`
+**I publish skills others install** — Your skill works for you, but every user talks differently. selftune ships skills that get better for every user automatically — adapting descriptions to how each person actually works.
-**I manage an agent setup with many skills** — You have 15+ skills installed. Some work. Some don't. Some conflict. selftune gives you a health dashboard and automatically improves the skills that aren't keeping up with how your team works. `selftune dashboard` · `selftune composability` · `selftune doctor`
+**I manage an agent setup with many skills** — You have 15+ skills installed. Some work. Some don't. Some conflict. Tell your agent "how are my skills doing?" and selftune gives you a health dashboard and automatically improves the skills that aren't keeping up.
 ## How It Works
@@ -65,20 +65,22 @@ selftune learned that real users say "slides", "deck", "presentation for Monday"
   <img src="./assets/FeedbackLoop.gif" alt="Observe → Detect → Evolve → Watch" width="800">
 </p>
-A continuous feedback loop that makes your skills learn and adapt. Automatically.
+A continuous feedback loop that makes your skills learn and adapt. Automatically. Your agent runs everything — you just install the skill and talk naturally.
-**Observe** — Hooks capture every user query and which skills fired. On Claude Code, hooks install automatically. Use `selftune replay` to backfill existing transcripts. This is how your skills start learning.
+**Observe** — Hooks capture every query and which skills fired. On Claude Code, hooks install automatically during `selftune init`. Backfill existing transcripts with `selftune ingest claude`.
-**Detect** — selftune finds the gap between how you talk and how your skills are described. You say "make me a slide deck" and your pptx skill stays silent — selftune catches that mismatch.
+**Detect** — Finds the gap between how you talk and how your skills are described. You say "make me a slide deck" and your pptx skill stays silent — selftune catches that mismatch. Real-time correction signals ("why didn't you use X?") are detected and trigger immediate improvement.
-**Evolve** — Rewrites skill descriptions — and full skill bodies — to match how you actually work. Batched validation with per-stage model control (`--cheap-loop` uses haiku for the loop, sonnet for the gate). Teacher-student body evolution with 3-gate validation. Baseline comparison gates on measurable lift. Automatic backup.
+**Evolve** — Rewrites skill descriptions — and full skill bodies — to match how you actually work. Cheap-loop mode uses haiku for the loop, sonnet for the gate (~80% cost reduction). Teacher-student body evolution with 3-gate validation. Automatic backup.
-**Watch** — After deploying changes, selftune monitors skill trigger rates. If anything regresses, it rolls back automatically. Your skills keep improving without you touching them.
+**Watch** — After deploying changes, selftune monitors skill trigger rates. If anything regresses, it rolls back automatically.
+**Automate** — Run `selftune cron setup` to install OS-level scheduling. selftune syncs, evaluates, evolves, and watches on a schedule — no manual intervention needed.
 ## What's New in v0.2.0
 - **Full skill body evolution** — Beyond descriptions: evolve routing tables and entire skill bodies using teacher-student model with structural, trigger, and quality gates
-- **Synthetic eval generation** — `selftune evals --synthetic` generates eval sets from SKILL.md via LLM, no session logs needed. Solves cold-start: new skills get evals immediately.
+- **Synthetic eval generation** — `selftune eval generate --synthetic` generates eval sets from SKILL.md via LLM, no session logs needed. Solves cold-start: new skills get evals immediately.
 - **Cheap-loop evolution** — `selftune evolve --cheap-loop` uses haiku for proposal generation and validation, sonnet only for the final deployment gate. ~80% cost reduction.
 - **Batch trigger validation** — Validation now batches 10 queries per LLM call instead of one-per-query. ~10x faster evolution loops.
 - **Per-stage model control** — `--validation-model`, `--proposal-model`, and `--gate-model` flags give fine-grained control over which model runs each evolution stage.
@@ -91,21 +93,27 @@ A continuous feedback loop that makes your skills learn and adapt. Automatically
 ## Commands
-| Command | What it does |
-|---|---|
-| `selftune status` | See which skills are undertriggering and why |
-| `selftune evals --skill <name>` | Generate eval sets from real session data (`--synthetic` for cold-start) |
-| `selftune evolve --skill <name>` | Propose, validate, and deploy improved descriptions (`--cheap-loop`, `--with-baseline`) |
-| `selftune evolve-body --skill <name>` | Evolve full skill body or routing table (teacher-student, 3-gate validation) |
-| `selftune baseline --skill <name>` | Measure skill value vs no-skill baseline |
-| `selftune unit-test --skill <name>` | Run or generate skill-level unit tests |
-| `selftune composability --skill <name>` | Detect conflicts between co-occurring skills |
-| `selftune import-skillsbench` | Import external eval corpus from [SkillsBench](https://github.com/benchflow-ai/skillsbench) |
-| `selftune badge --skill <name>` | Generate skill health badge SVG |
-| `selftune watch --skill <name>` | Monitor after deploy. Auto-rollback on regression. |
-| `selftune dashboard` | Open the visual skill health dashboard |
-| `selftune replay` | Backfill data from existing Claude Code transcripts |
-| `selftune doctor` | Health check: logs, hooks, config, permissions |
+Your agent runs these — you just say what you want ("improve my skills", "show the dashboard").
+| Group | Command | What it does |
+|-------|---------|-------------|
+| | `selftune status` | See which skills are undertriggering and why |
+| | `selftune orchestrate` | Run the full autonomous loop (sync → evolve → watch) |
+| | `selftune dashboard` | Open the visual skill health dashboard |
+| | `selftune doctor` | Health check: logs, hooks, config, permissions |
+| **ingest** | `selftune ingest claude` | Backfill from Claude Code transcripts |
+| | `selftune ingest codex` | Import Codex rollout logs (experimental) |
+| **grade** | `selftune grade --skill <name>` | Grade a skill session with evidence |
+| | `selftune grade baseline --skill <name>` | Measure skill value vs no-skill baseline |
+| **evolve** | `selftune evolve --skill <name>` | Propose, validate, and deploy improved descriptions |
+| | `selftune evolve body --skill <name>` | Evolve full skill body or routing table |
+| | `selftune evolve rollback --skill <name>` | Rollback a previous evolution |
+| **eval** | `selftune eval generate --skill <name>` | Generate eval sets (`--synthetic` for cold-start) |
+| | `selftune eval unit-test --skill <name>` | Run or generate skill-level unit tests |
+| | `selftune eval composability --skill <name>` | Detect conflicts between co-occurring skills |
+| | `selftune eval import` | Import external eval corpus from [SkillsBench](https://github.com/benchflow-ai/skillsbench) |
+| **auto** | `selftune cron setup` | Install OS-level scheduling (cron/launchd/systemd) |
+| | `selftune watch --skill <name>` | Monitor after deploy. Auto-rollback on regression. |
 Full command reference: `selftune --help`
@@ -135,13 +143,13 @@ selftune is complementary to these tools, not competitive. They trace what happe
 ## Platforms
-**Claude Code** — Hooks install automatically. `selftune replay` backfills existing transcripts.
+**Claude Code** (fully supported) — Hooks install automatically. `selftune ingest claude` backfills existing transcripts. This is the primary supported platform.
-**Codex** — `selftune wrap-codex -- <args>` or `selftune ingest-codex`
+**Codex** (experimental) — `selftune ingest wrap-codex -- <args>` or `selftune ingest codex`. Adapter exists but is not actively tested.
-**OpenCode** — `selftune ingest-opencode`
+**OpenCode** (experimental) — `selftune ingest opencode`. Adapter exists but is not actively tested.
-**OpenClaw** — `selftune ingest-openclaw` + `selftune cron setup` for autonomous evolution
+**OpenClaw** (experimental) — `selftune ingest openclaw` + `selftune cron setup` for autonomous evolution. Adapter exists but is not actively tested.
 Requires [Bun](https://bun.sh) or Node.js 18+. No extra API keys.
@@ -151,6 +159,6 @@ Requires [Bun](https://bun.sh) or Node.js 18+. No extra API keys.
 [Architecture](ARCHITECTURE.md) · [Contributing](CONTRIBUTING.md) · [Security](SECURITY.md) · [Integration Guide](docs/integration-guide.md) · [Sponsor](https://github.com/sponsors/WellDunDun)
-MIT licensed. Free forever. Works with Claude Code, Codex, OpenCode, and OpenClaw.
+MIT licensed. Free forever. Primary support for Claude Code; experimental adapters for Codex, OpenCode, and OpenClaw.
 </div>
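Note on the command changes above: across the agent docs, CHANGELOG, and README, this release regroups the flat subcommands under `ingest`, `grade`, `evolve`, and `eval`. The sketch below collects the renames visible in these hunks into one lookup table and rewrites an old-style command line. It is an illustrative helper for updating scripts or docs, not something shipped by selftune, and `migrateCommand` is a hypothetical name.

```typescript
// Old subcommand -> new grouped form, as shown in the removed/added lines.
const RENAMED = new Map<string, string>([
  ["evals", "eval generate"],
  ["evolve-body", "evolve body"],
  ["rollback", "evolve rollback"],
  ["baseline", "grade baseline"],
  ["unit-test", "eval unit-test"],
  ["composability", "eval composability"],
  ["import-skillsbench", "eval import"],
  ["replay", "ingest claude"],
  ["ingest-codex", "ingest codex"],
  ["ingest-opencode", "ingest opencode"],
  ["ingest-openclaw", "ingest openclaw"],
  ["wrap-codex", "ingest wrap-codex"],
]);

// Rewrites `selftune <old> ...` to `selftune <new> ...`; any other input
// (unknown subcommand, different program) is returned unchanged.
function migrateCommand(line: string): string {
  const tokens = line.trim().split(/\s+/);
  const sub = tokens[1];
  if (tokens[0] !== "selftune" || sub === undefined) return line;
  const replacement = RENAMED.get(sub);
  if (replacement === undefined) return line;
  return ["selftune", replacement, ...tokens.slice(2)].join(" ");
}
```

For example, `migrateCommand("selftune evals --skill <name> --stats")` yields `selftune eval generate --skill <name> --stats`, matching the replacement made in the agent docs. Commands that kept their names, such as `selftune status`, pass through untouched.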

package/apps/local-dashboard/dist/assets/geist-cyrillic-wght-normal-CHSlOQsW.woff2 ADDED

Binary file

package/apps/local-dashboard/dist/assets/geist-latin-ext-wght-normal-DMtmJ5ZE.woff2 ADDED

Binary file

package/apps/local-dashboard/dist/assets/geist-latin-wght-normal-Dm3htQBi.woff2 ADDED

Binary file