selftune 0.2.9 → 0.2.10
- package/README.md +35 -35
- package/apps/local-dashboard/dist/assets/index-BZVLv70T.js +16 -0
- package/apps/local-dashboard/dist/assets/{vendor-react-BQH_6WrG.js → vendor-react-BXP54cYo.js} +4 -4
- package/apps/local-dashboard/dist/assets/{vendor-table-dK1QMLq9.js → vendor-table-DTF_SXoy.js} +1 -1
- package/apps/local-dashboard/dist/assets/{vendor-ui-CO2mrx6e.js → vendor-ui-CWU0d1wd.js} +66 -66
- package/apps/local-dashboard/dist/index.html +15 -15
- package/bin/selftune.cjs +1 -1
- package/cli/selftune/activation-rules.ts +1 -0
- package/cli/selftune/alpha-upload/build-payloads.ts +18 -2
- package/cli/selftune/alpha-upload/stage-canonical.ts +94 -0
- package/cli/selftune/auth/device-code.ts +32 -0
- package/cli/selftune/auto-update.ts +12 -0
- package/cli/selftune/badge/badge.ts +1 -0
- package/cli/selftune/canonical-export.ts +5 -0
- package/cli/selftune/claude-agents.ts +154 -0
- package/cli/selftune/contribute/bundle.ts +1 -0
- package/cli/selftune/contribute/contribute.ts +1 -0
- package/cli/selftune/cron/setup.ts +2 -2
- package/cli/selftune/dashboard-server.ts +1 -0
- package/cli/selftune/eval/hooks-to-evals.ts +1 -0
- package/cli/selftune/eval/import-skillsbench.ts +1 -0
- package/cli/selftune/eval/synthetic-evals.ts +2 -3
- package/cli/selftune/eval/unit-test.ts +1 -0
- package/cli/selftune/evolution/deploy-proposal.ts +1 -0
- package/cli/selftune/evolution/evolve-body.ts +93 -6
- package/cli/selftune/evolution/evolve.ts +0 -1
- package/cli/selftune/evolution/propose-body.ts +3 -2
- package/cli/selftune/evolution/propose-routing.ts +3 -2
- package/cli/selftune/evolution/refine-body.ts +3 -2
- package/cli/selftune/export.ts +1 -0
- package/cli/selftune/grading/grade-session.ts +8 -0
- package/cli/selftune/hooks/auto-activate.ts +1 -0
- package/cli/selftune/hooks/evolution-guard.ts +1 -1
- package/cli/selftune/hooks/prompt-log.ts +1 -0
- package/cli/selftune/hooks/session-stop.ts +34 -40
- package/cli/selftune/hooks/skill-change-guard.ts +1 -0
- package/cli/selftune/hooks/skill-eval.ts +1 -1
- package/cli/selftune/index.ts +23 -14
- package/cli/selftune/ingestors/claude-replay.ts +1 -0
- package/cli/selftune/ingestors/codex-rollout.ts +1 -0
- package/cli/selftune/ingestors/codex-wrapper.ts +1 -0
- package/cli/selftune/ingestors/openclaw-ingest.ts +1 -0
- package/cli/selftune/ingestors/opencode-ingest.ts +1 -0
- package/cli/selftune/init.ts +121 -29
- package/cli/selftune/localdb/db.ts +1 -0
- package/cli/selftune/localdb/direct-write.ts +39 -0
- package/cli/selftune/localdb/materialize.ts +2 -0
- package/cli/selftune/localdb/queries.ts +53 -0
- package/cli/selftune/localdb/schema.ts +28 -0
- package/cli/selftune/normalization.ts +1 -0
- package/cli/selftune/observability.ts +1 -0
- package/cli/selftune/repair/skill-usage.ts +1 -0
- package/cli/selftune/routes/orchestrate-runs.ts +1 -0
- package/cli/selftune/routes/overview.ts +1 -0
- package/cli/selftune/routes/skill-report.ts +1 -0
- package/cli/selftune/sync.ts +30 -1
- package/cli/selftune/uninstall.ts +412 -0
- package/cli/selftune/utils/canonical-log.ts +2 -0
- package/cli/selftune/utils/jsonl.ts +1 -0
- package/cli/selftune/utils/llm-call.ts +131 -3
- package/cli/selftune/utils/skill-log.ts +1 -0
- package/cli/selftune/utils/transcript.ts +1 -0
- package/cli/selftune/utils/trigger-check.ts +1 -1
- package/cli/selftune/workflows/skill-md-writer.ts +5 -5
- package/cli/selftune/workflows/workflows.ts +1 -0
- package/package.json +37 -33
- package/packages/telemetry-contract/fixtures/golden.test.ts +1 -0
- package/packages/telemetry-contract/package.json +1 -1
- package/packages/telemetry-contract/src/schemas.ts +1 -0
- package/packages/telemetry-contract/tests/compatibility.test.ts +1 -0
- package/packages/ui/README.md +35 -34
- package/packages/ui/package.json +3 -3
- package/packages/ui/src/components/ActivityTimeline.tsx +49 -42
- package/packages/ui/src/components/EvidenceViewer.tsx +306 -182
- package/packages/ui/src/components/EvolutionTimeline.tsx +83 -72
- package/packages/ui/src/components/InfoTip.tsx +4 -3
- package/packages/ui/src/components/OrchestrateRunsPanel.tsx +60 -53
- package/packages/ui/src/components/section-cards.tsx +19 -24
- package/packages/ui/src/components/skill-health-grid.tsx +213 -193
- package/packages/ui/src/lib/constants.tsx +1 -0
- package/packages/ui/src/primitives/badge.tsx +12 -15
- package/packages/ui/src/primitives/button.tsx +7 -7
- package/packages/ui/src/primitives/card.tsx +15 -26
- package/packages/ui/src/primitives/checkbox.tsx +7 -8
- package/packages/ui/src/primitives/collapsible.tsx +5 -5
- package/packages/ui/src/primitives/dropdown-menu.tsx +45 -55
- package/packages/ui/src/primitives/label.tsx +6 -6
- package/packages/ui/src/primitives/select.tsx +28 -37
- package/packages/ui/src/primitives/table.tsx +17 -44
- package/packages/ui/src/primitives/tabs.tsx +14 -21
- package/packages/ui/src/primitives/tooltip.tsx +10 -22
- package/skill/SKILL.md +70 -57
- package/skill/Workflows/AlphaUpload.md +4 -4
- package/skill/Workflows/AutoActivation.md +11 -6
- package/skill/Workflows/Badge.md +22 -16
- package/skill/Workflows/Baseline.md +34 -36
- package/skill/Workflows/Composability.md +16 -11
- package/skill/Workflows/Contribute.md +26 -21
- package/skill/Workflows/Cron.md +23 -22
- package/skill/Workflows/Dashboard.md +32 -27
- package/skill/Workflows/Doctor.md +33 -27
- package/skill/Workflows/Evals.md +48 -47
- package/skill/Workflows/EvolutionMemory.md +31 -21
- package/skill/Workflows/Evolve.md +84 -82
- package/skill/Workflows/EvolveBody.md +58 -47
- package/skill/Workflows/Grade.md +16 -13
- package/skill/Workflows/ImportSkillsBench.md +9 -6
- package/skill/Workflows/Ingest.md +36 -21
- package/skill/Workflows/Initialize.md +108 -40
- package/skill/Workflows/Orchestrate.md +22 -16
- package/skill/Workflows/Replay.md +12 -7
- package/skill/Workflows/Rollback.md +13 -6
- package/skill/Workflows/Schedule.md +6 -6
- package/skill/Workflows/Sync.md +18 -11
- package/skill/Workflows/UnitTest.md +28 -17
- package/skill/Workflows/Watch.md +28 -21
- package/skill/agents/diagnosis-analyst.md +11 -0
- package/skill/agents/evolution-reviewer.md +15 -1
- package/skill/agents/integration-guide.md +10 -0
- package/skill/agents/pattern-analyst.md +12 -1
- package/skill/references/grading-methodology.md +23 -24
- package/skill/references/interactive-config.md +7 -7
- package/skill/references/invocation-taxonomy.md +22 -20
- package/skill/references/logs.md +14 -6
- package/skill/references/setup-patterns.md +4 -2
- package/.claude/agents/diagnosis-analyst.md +0 -156
- package/.claude/agents/evolution-reviewer.md +0 -180
- package/.claude/agents/integration-guide.md +0 -212
- package/.claude/agents/pattern-analyst.md +0 -160
- package/apps/local-dashboard/dist/assets/index-C4UYGWKr.js +0 -15
package/README.md
CHANGED
@@ -105,38 +105,38 @@ A continuous feedback loop that makes your skills learn and adapt. Automatically
 
 Your agent runs these — you just say what you want ("improve my skills", "show the dashboard").
 
-| Group
-
-|
-|
-|
-|
-| **ingest** | `selftune ingest claude`
-|
-| **grade**
-|
-| **evolve** | `selftune evolve --skill <name>`
-|
-|
-| **eval**
-|
-|
-|
-| **auto**
-|
-| **other**
-|
+| Group | Command | What it does |
+| ---------- | -------------------------------------------- | ------------------------------------------------------------------------------------------- |
+| | `selftune status` | See which skills are undertriggering and why |
+| | `selftune orchestrate` | Run the full autonomous loop (sync → evolve → watch) |
+| | `selftune dashboard` | Open the visual skill health dashboard |
+| | `selftune doctor` | Health check: logs, hooks, config, permissions |
+| **ingest** | `selftune ingest claude` | Backfill from Claude Code transcripts |
+| | `selftune ingest codex` | Import Codex rollout logs (experimental) |
+| **grade** | `selftune grade --skill <name>` | Grade a skill session with evidence |
+| | `selftune grade baseline --skill <name>` | Measure skill value vs no-skill baseline |
+| **evolve** | `selftune evolve --skill <name>` | Propose, validate, and deploy improved descriptions |
+| | `selftune evolve body --skill <name>` | Evolve full skill body or routing table |
+| | `selftune evolve rollback --skill <name>` | Rollback a previous evolution |
+| **eval** | `selftune eval generate --skill <name>` | Generate eval sets (`--synthetic` for cold-start) |
+| | `selftune eval unit-test --skill <name>` | Run or generate skill-level unit tests |
+| | `selftune eval composability --skill <name>` | Detect conflicts between co-occurring skills |
+| | `selftune eval import` | Import external eval corpus from [SkillsBench](https://github.com/benchflow-ai/skillsbench) |
+| **auto** | `selftune cron setup` | Install OS-level scheduling (cron/launchd/systemd) |
+| | `selftune watch --skill <name>` | Monitor after deploy. Auto-rollback on regression. |
+| **other** | `selftune telemetry` | Manage anonymous usage analytics (status, enable, disable) |
+| | `selftune alpha upload` | Run a manual alpha upload cycle and emit a JSON send summary |
 
 Full command reference: `selftune --help`
 
 ## Why Not Just Rewrite Skills Manually?
 
-| Approach
-
-| Rewrite the description yourself
-| Add "ALWAYS invoke when..." directives | Brittle. One agent rewrite away from breaking.
-| Force-load skills on every prompt
-| **selftune**
+| Approach | Problem |
+| -------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
+| Rewrite the description yourself | No data on how users actually talk. No validation. No regression detection. |
+| Add "ALWAYS invoke when..." directives | Brittle. One agent rewrite away from breaking. |
+| Force-load skills on every prompt | Doesn't fix the description. Expensive band-aid. |
+| **selftune** | Learns from real usage, rewrites descriptions to match how you work, validates against eval sets, auto-rollbacks on regressions. |
 
 ## Different Layer, Different Problem
 
@@ -144,14 +144,14 @@ LLM observability tools trace API calls. Infrastructure tools monitor servers. N
 
 selftune is complementary to these tools, not competitive. They trace what happens inside the LLM. selftune makes sure the right skill is called in the first place.
 
-| Dimension
-
-| **Layer**
-| **Detects**
-| **Improves** | Descriptions, body, and routing automatically
-| **Setup**
-| **Price**
-| **Unique**
+| Dimension | selftune | Langfuse | LangSmith | OpenLIT |
+| ------------ | ------------------------------------------------- | -------------------- | -------------- | -------------- |
+| **Layer** | Skill-specific | LLM call | Agent trace | Infrastructure |
+| **Detects** | Missed triggers, false negatives, skill conflicts | Token usage, latency | Chain failures | System metrics |
+| **Improves** | Descriptions, body, and routing automatically | — | — | — |
+| **Setup** | Zero deps, zero API keys | Self-host or cloud | Cloud required | Helm chart |
+| **Price** | Free (MIT) | Freemium | Paid | Free |
+| **Unique** | Self-improving skills + auto-rollback | Prompt management | Evaluations | Dashboards |
 
 ## Platforms
 