npm - selftune - Versions diffs - 0.2.6 → 0.2.9 - Mend

selftune 0.2.6 → 0.2.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (119) hide show

package/README.md +1 -0
package/apps/local-dashboard/dist/assets/index-Bs3Y4ixf.css +1 -0
package/apps/local-dashboard/dist/assets/index-C4UYGWKr.js +15 -0
package/apps/local-dashboard/dist/assets/vendor-react-BQH_6WrG.js +60 -0
package/apps/local-dashboard/dist/assets/{vendor-table-B7VF2Ipl.js → vendor-table-dK1QMLq9.js} +1 -1
package/apps/local-dashboard/dist/assets/{vendor-ui-r2k_Ku_V.js → vendor-ui-CO2mrx6e.js} +60 -65
package/apps/local-dashboard/dist/index.html +5 -5
package/cli/selftune/activation-rules.ts +57 -18
package/cli/selftune/agent-guidance.ts +96 -0
package/cli/selftune/alpha-identity.ts +156 -0
package/cli/selftune/alpha-upload/build-payloads.ts +151 -0
package/cli/selftune/alpha-upload/client.ts +113 -0
package/cli/selftune/alpha-upload/flush.ts +191 -0
package/cli/selftune/alpha-upload/index.ts +194 -0
package/cli/selftune/alpha-upload/queue.ts +252 -0
package/cli/selftune/alpha-upload/stage-canonical.ts +251 -0
package/cli/selftune/alpha-upload-contract.ts +52 -0
package/cli/selftune/auth/device-code.ts +110 -0
package/cli/selftune/auto-update.ts +130 -0
package/cli/selftune/badge/badge.ts +19 -9
package/cli/selftune/canonical-export.ts +16 -3
package/cli/selftune/constants.ts +28 -8
package/cli/selftune/contribute/bundle.ts +33 -5
package/cli/selftune/dashboard-contract.ts +32 -1
package/cli/selftune/dashboard-server.ts +215 -693
package/cli/selftune/dashboard.ts +1 -1
package/cli/selftune/eval/baseline.ts +11 -7
package/cli/selftune/eval/hooks-to-evals.ts +39 -15
package/cli/selftune/eval/synthetic-evals.ts +54 -1
package/cli/selftune/evolution/audit.ts +24 -19
package/cli/selftune/evolution/constitutional.ts +176 -0
package/cli/selftune/evolution/evidence.ts +18 -13
package/cli/selftune/evolution/evolve-body.ts +104 -7
package/cli/selftune/evolution/evolve.ts +195 -22
package/cli/selftune/evolution/propose-body.ts +18 -1
package/cli/selftune/evolution/propose-description.ts +27 -2
package/cli/selftune/evolution/rollback.ts +11 -15
package/cli/selftune/export.ts +84 -0
package/cli/selftune/grading/auto-grade.ts +14 -4
package/cli/selftune/grading/grade-session.ts +17 -6
package/cli/selftune/hooks/auto-activate.ts +5 -0
package/cli/selftune/hooks/evolution-guard.ts +25 -11
package/cli/selftune/hooks/prompt-log.ts +23 -9
package/cli/selftune/hooks/session-stop.ts +78 -15
package/cli/selftune/hooks/skill-eval.ts +189 -10
package/cli/selftune/index.ts +274 -2
package/cli/selftune/ingestors/claude-replay.ts +48 -21
package/cli/selftune/init.ts +260 -49
package/cli/selftune/last.ts +7 -7
package/cli/selftune/localdb/db.ts +90 -10
package/cli/selftune/localdb/direct-write.ts +573 -0
package/cli/selftune/localdb/materialize.ts +296 -42
package/cli/selftune/localdb/queries.ts +482 -32
package/cli/selftune/localdb/schema.ts +153 -1
package/cli/selftune/monitoring/watch.ts +27 -8
package/cli/selftune/normalization.ts +88 -15
package/cli/selftune/observability.ts +257 -5
package/cli/selftune/orchestrate.ts +176 -53
package/cli/selftune/quickstart.ts +34 -10
package/cli/selftune/repair/skill-usage.ts +15 -2
package/cli/selftune/routes/actions.ts +77 -0
package/cli/selftune/routes/badge.ts +66 -0
package/cli/selftune/routes/doctor.ts +12 -0
package/cli/selftune/routes/index.ts +14 -0
package/cli/selftune/routes/orchestrate-runs.ts +13 -0
package/cli/selftune/routes/overview.ts +14 -0
package/cli/selftune/routes/report.ts +293 -0
package/cli/selftune/routes/skill-report.ts +230 -0
package/cli/selftune/status.ts +203 -7
package/cli/selftune/sync.ts +14 -1
package/cli/selftune/types.ts +52 -2
package/cli/selftune/utils/jsonl.ts +58 -1
package/cli/selftune/utils/selftune-meta.ts +38 -0
package/cli/selftune/utils/skill-log.ts +30 -4
package/cli/selftune/utils/transcript.ts +15 -0
package/cli/selftune/workflows/workflows.ts +7 -6
package/package.json +11 -6
package/packages/telemetry-contract/fixtures/complete-push.ts +184 -0
package/packages/telemetry-contract/fixtures/evidence-only-push.ts +58 -0
package/packages/telemetry-contract/fixtures/golden.json +1 -0
package/packages/telemetry-contract/fixtures/index.ts +4 -0
package/packages/telemetry-contract/fixtures/partial-push-no-sessions.ts +40 -0
package/packages/telemetry-contract/fixtures/partial-push-unresolved-parents.ts +79 -0
package/packages/telemetry-contract/package.json +6 -1
package/packages/telemetry-contract/src/schemas.ts +196 -0
package/packages/telemetry-contract/src/types.ts +3 -1
package/packages/telemetry-contract/src/validators.ts +3 -1
package/packages/telemetry-contract/tests/compatibility.test.ts +144 -0
package/packages/ui/package.json +4 -0
package/packages/ui/src/components/ActivityTimeline.tsx +61 -29
package/packages/ui/src/components/section-cards.tsx +31 -14
package/packages/ui/src/types.ts +1 -0
package/skill/SKILL.md +214 -174
package/skill/Workflows/AlphaUpload.md +45 -0
package/skill/Workflows/Baseline.md +18 -12
package/skill/Workflows/Composability.md +3 -3
package/skill/Workflows/Dashboard.md +39 -91
package/skill/Workflows/Doctor.md +93 -66
package/skill/Workflows/Evals.md +49 -40
package/skill/Workflows/Evolve.md +76 -28
package/skill/Workflows/EvolveBody.md +37 -38
package/skill/Workflows/Initialize.md +145 -26
package/skill/Workflows/Orchestrate.md +11 -2
package/skill/Workflows/Sync.md +23 -0
package/skill/Workflows/Watch.md +2 -5
package/skill/agents/diagnosis-analyst.md +163 -0
package/skill/agents/evolution-reviewer.md +149 -0
package/skill/agents/integration-guide.md +154 -0
package/skill/agents/pattern-analyst.md +149 -0
package/skill/assets/multi-skill-settings.json +1 -1
package/skill/assets/single-skill-settings.json +1 -1
package/skill/references/interactive-config.md +39 -0
package/skill/references/invocation-taxonomy.md +34 -0
package/skill/references/logs.md +15 -1
package/skill/references/setup-patterns.md +3 -3
package/skill/settings_snippet.json +1 -1
package/apps/local-dashboard/dist/assets/index-C75H1Q3n.css +0 -1
package/apps/local-dashboard/dist/assets/index-axE4kz3Q.js +0 -15
package/apps/local-dashboard/dist/assets/vendor-react-U7zYD9Rg.js +0 -60

package/skill/Workflows/Evolve.md CHANGED Viewed

@@ -30,7 +30,13 @@ selftune evolve --skill <name> --skill-path <path> [options]
 | `--confidence <n>` | Minimum confidence threshold (0-1) | 0.6 |
 | `--max-iterations <n>` | Maximum retry iterations | 3 |
 | `--validation-model <model>` | Model for trigger-check validation LLM calls | `haiku` |
-| `--cheap-loop` | Use cheap models for loop, expensive for final gate | Off |
+| `--pareto` | Generate multiple candidates per iteration | Off |
+| `--candidates <n>` | Number of candidates per iteration (with `--pareto`) | 3 |
+| `--token-efficiency` | Optimize for token efficiency in proposals | Off |
+| `--with-baseline` | Include a no-skill baseline comparison | Off |
+| `--cheap-loop` | Use cheap models for loop, expensive for final gate | On |
+| `--full-model` | Use full-cost model throughout (disables cheap-loop) | Off |
+| `--verbose` | Print detailed progress during evolution | Off |
 | `--gate-model <model>` | Model for final gate validation | `sonnet` (when `--cheap-loop`) |
 | `--proposal-model <model>` | Model for proposal generation LLM calls | None |
 | `--sync-first` | Refresh source-truth telemetry before generating evals/failure patterns | Off |
@@ -89,34 +95,38 @@ The evolution process writes multiple audit entries:
 ### 0. Pre-Flight Configuration
-Before running the evolve command, present numbered configuration options to the user inline in your response, then wait for the user's answer before proceeding.
+Before running the evolve command, use the `AskUserQuestion` tool to present structured configuration options. If the user responds with "use defaults" or similar shorthand, skip to step 1 using the recommended defaults. If the user cancels, stop and do not continue.
-If the user responds with "use defaults", "just run it", or similar shorthand, skip to step 1 using the recommended defaults marked below.
+Use `AskUserQuestion` with these questions (max 4 per call — split if needed):
-Present the following options inline in your response:
+**Call 1:**
-1. **Execution Mode**
-   - a) Dry run — preview proposal without deploying (recommended for first run)
-   - b) Live — validate and deploy if improved
-2. **Model Tier** (see SKILL.md Model Tier Reference)
-   - a) Fast (haiku) — cheapest, ~2s/call (recommended with cheap-loop)
-   - b) Balanced (sonnet) — good quality, ~5s/call
-   - c) Best (opus) — highest quality, ~10s/call
-3. **Cost Optimization**
-   - a) Cheap loop — haiku for iteration, sonnet for final gate (recommended)
-   - b) Single model — use one model throughout
-4. **Confidence Threshold:** 0.6 (default, higher = stricter)
-5. **Max Iterations:** 3 (default, more = longer but better results)
+```json
+{
+  "questions": [
+    {
+      "question": "Execution Mode",
+      "options": ["Dry run — preview without deploying (recommended for first run)", "Live — validate and deploy if improved"]
+    },
+    {
+      "question": "Model Tier (see SKILL.md reference)",
+      "options": ["Fast (haiku) — cheapest, ~2s/call (recommended with cheap-loop)", "Balanced (sonnet) — good quality, ~5s/call", "Best (opus) — highest quality, ~10s/call"]
+    },
+    {
+      "question": "Cost Optimization",
+      "options": ["Cheap loop — haiku for iteration, sonnet for final gate (recommended)", "Single model — use one model throughout"]
+    },
+    {
+      "question": "Advanced Options",
+      "options": ["Defaults (0.6 confidence, 3 iterations, single candidate) (recommended)", "Stricter (0.7 confidence, 5 iterations)", "Pareto mode (multiple candidates per iteration)"]
+    }
+  ]
+}
+```
-6. **Multi-Candidate Selection**
-   - a) Single candidate — one proposal per iteration (recommended)
-   - b) Pareto mode — generate multiple candidates, pick best on frontier
+If `AskUserQuestion` is not available, fall back to presenting these as inline numbered options.
-Ask: "Reply with your choices (e.g., '1a, 2a, 3a, defaults for rest') or 'use defaults' for recommended settings."
+If the user cancels, stop -- do not proceed with defaults. If the user selects "use defaults", skip to step 1 with recommended defaults.
 After the user responds, parse their selections and map each choice to the corresponding CLI flags:
@@ -193,6 +203,26 @@ The command groups missed queries by invocation type:
 See `references/invocation-taxonomy.md` for the taxonomy.
+### 4b. Constitutional Pre-Validation Gate
+Before any LLM-based validation, each proposal passes through a
+deterministic constitutional check that rejects obviously bad proposals
+at zero cost. Four principles are enforced:
+1. **Size constraint** — description must be ≤1024 characters and within
+   0.3x–3.0x word count of the original.
+2. **No XML injection** — reject proposals containing XML/HTML tags.
+3. **No unbounded broadening** — reject bare "all", "any", "every",
+   "everything" unless qualified by enumeration markers ("including",
+   "such as", "like", "e.g.", or a comma-separated list).
+4. **Anchor preservation** — if the original contains `USE WHEN` trigger
+   phrases or `$skillName` references, those must appear in the proposal.
+If a proposal fails any principle, it is rejected with a descriptive
+violation message and the pipeline retries (if iterations remain).
+For body evolution (`evolve body`), only the size constraint applies.
 ### 5. Propose Description Changes
 An LLM generates a candidate description that would catch the missed
@@ -211,6 +241,23 @@ The candidate is tested against the full eval set:
 If validation fails, the command retries up to `--max-iterations` times
 with adjusted proposals.
+### Aggregate Metrics To Report
+When summarizing an evolution run, include these aggregate metrics rather
+than only saying "passed" or "failed":
+| Metric | Meaning |
+|--------|---------|
+| `original_pass_rate` | Baseline pass rate before the proposal |
+| `proposed_pass_rate` | Pass rate after applying the proposal |
+| `regression_count` | Eval entries that passed before and failed after |
+| `net_change` | Total passes gained minus regressions introduced |
+| `iteration` / `iterations_used` | Which retry produced the current candidate |
+| `baseline_lift` | Additional lift over the no-skill baseline when `--with-baseline` is enabled |
+These metrics explain whether the proposal is genuinely better, merely
+different, or too risky to deploy.
 ### 7. Deploy (or Preview)
 If `--dry-run`, the proposal is printed but not deployed. The audit log
@@ -292,10 +339,11 @@ Use `--agent <name>` to override (claude, codex, opencode).
 ## Subagent Escalation
-For high-stakes evolutions, consider spawning the `evolution-reviewer` agent
-as a subagent to review the proposal before deploying. This is especially
-valuable when the skill has a history of regressions, the evolution touches
-many trigger phrases, or the confidence score is near the threshold.
+For high-stakes evolutions, read `skill/agents/evolution-reviewer.md` and spawn a
+subagent with those instructions to review the proposal before deploying.
+This is especially valuable when the skill has a history of regressions,
+the evolution touches many trigger phrases, or the confidence score is near
+the threshold.
 ## Autonomous Mode

package/skill/Workflows/EvolveBody.md CHANGED Viewed

@@ -16,7 +16,7 @@ selftune evolve body --skill <name> --skill-path <path> --target <target> [optio
 |------|-------------|---------|
 | `--skill <name>` | Skill name | Required |
 | `--skill-path <path>` | Path to the skill's SKILL.md | Required |
-| `--target <type>` | Evolution target: `routing_table` or `full_body` | Required |
+| `--target <type>` | Evolution target: `routing` or `body` | Required |
 | `--teacher-agent <name>` | Agent CLI for proposal generation | Auto-detected |
 | `--student-agent <name>` | Agent CLI for validation | Same as teacher |
 | `--teacher-model <flag>` | Model flag for teacher (e.g. `opus`) | Agent default |
@@ -30,13 +30,13 @@ selftune evolve body --skill <name> --skill-path <path> --target <target> [optio
 ## Evolution Targets
-### `routing_table`
+### `routing`
 Optimizes the `## Workflow Routing` markdown table in SKILL.md. The teacher
 LLM analyzes missed triggers and proposes new routing entries that map
 trigger keywords to the correct workflow files.
-### `full_body`
+### `body`
 Rewrites the entire SKILL.md body below the frontmatter. This includes
 the description, routing table, examples, and all other sections. The
@@ -59,42 +59,41 @@ a refined proposal. This repeats up to `--max-iterations` times.
 ### 0. Pre-Flight Configuration
-Before running evolve-body, present configuration options to the user.
-If the user says "use defaults" or similar, skip to step 1 with recommended defaults.
-Present these options:
+Before running evolve-body, use the `AskUserQuestion` tool to present structured configuration options.
+If the user says "use defaults" or similar, skip to step 1 with recommended defaults. If the user cancels, abort the workflow -- do not proceed with defaults.
+Use `AskUserQuestion` with these questions:
+```json
+{
+  "questions": [
+    {
+      "question": "Evolution Target",
+      "options": ["Routing table — optimize workflow routing only (recommended)", "Full body — rewrite entire SKILL.md (more aggressive)"]
+    },
+    {
+      "question": "Execution Mode",
+      "options": ["Dry run — preview without deploying (recommended)", "Live — validate and deploy if improved"]
+    },
+    {
+      "question": "Teacher Model (generates proposals)",
+      "options": ["Balanced (sonnet) — good quality (recommended)", "Best (opus) — highest quality, slower"]
+    },
+    {
+      "question": "Student Model & Iterations",
+      "options": ["Fast (haiku) + 3 iterations (recommended)", "Balanced (sonnet) + 3 iterations", "Fast (haiku) + 5 iterations"]
+    }
+  ]
+}
 ```
-selftune evolve body — Pre-Flight Configuration
-1. Evolution Target
-   a) Routing table — optimize the workflow routing table only
-   b) Full body — rewrite entire SKILL.md body (more aggressive)
-2. Execution Mode
-   a) Dry run — preview proposal without deploying (recommended)
-   b) Live — validate and deploy if improved
-3. Teacher Model (generates proposals)
-   a) Balanced (sonnet) — good quality proposals (recommended)
-   b) Best (opus) — highest quality, slower and more expensive
-4. Student Model (validates proposals)
-   a) Fast (haiku) — cheap validation (recommended)
-   b) Balanced (sonnet) — higher quality validation
-5. Max Iterations: [3] (default)
-6. Few-Shot Examples: [none] (paths to example SKILL.md files for guidance)
-→ Reply with your choices or "use defaults" for recommended settings.
-```
+If `AskUserQuestion` is not available, fall back to presenting these as inline numbered options.
 After the user responds, show a confirmation summary:
 ```
 Configuration Summary:
-  Target:        routing_table
+  Target:        routing
   Mode:          dry-run
   Teacher model: sonnet
   Student model: haiku
@@ -126,8 +125,8 @@ pipeline. See `references/invocation-taxonomy.md`.
 ### 4. Generate Proposal (Teacher)
 The teacher LLM generates a proposal based on the target:
-- **routing_table**: Optimized `## Workflow Routing` markdown table
-- **full_body**: Complete SKILL.md body replacement
+- **routing**: Optimized `## Workflow Routing` markdown table
+- **body**: Complete SKILL.md body replacement
 Few-shot examples from `--few-shot` paths provide structural guidance.
@@ -140,20 +139,20 @@ failure details and generates a refined proposal.
 If `--dry-run`, prints the proposal without deploying. Otherwise:
 1. Creates a timestamped backup of the current SKILL.md
-2. Applies the change: `replaceSection()` for routing, `replaceBody()` for full_body
+2. Applies the change: `replaceSection()` for routing, `replaceBody()` for body
 3. Records audit entries
 4. Updates evolution memory
 ## Common Patterns
 **"Evolve the routing table for the Research skill"**
-> `selftune evolve body --skill Research --skill-path ~/.claude/skills/Research/SKILL.md --target routing_table`
+> `selftune evolve body --skill Research --skill-path ~/.claude/skills/Research/SKILL.md --target routing`
 **"Rewrite the entire skill body"**
-> `selftune evolve body --skill Research --skill-path ~/.claude/skills/Research/SKILL.md --target full_body --dry-run`
+> `selftune evolve body --skill Research --skill-path ~/.claude/skills/Research/SKILL.md --target body --dry-run`
 **"Use a stronger model for generation"**
-> `selftune evolve body --skill pptx --skill-path /path/SKILL.md --target full_body --teacher-model opus --student-model haiku`
+> `selftune evolve body --skill pptx --skill-path /path/SKILL.md --target body --teacher-model opus --student-model haiku`
 **"Preview what would change"**
 > Always start with `--dry-run` to review the proposal before deploying.

package/skill/Workflows/Initialize.md CHANGED Viewed

@@ -12,15 +12,23 @@ Bootstrap selftune for first-time use or after changing environments.
 ```bash
 selftune init [--agent <type>] [--cli-path <path>] [--force]
+selftune init --alpha --alpha-email <email> [--alpha-name "Name"] [--force]
+selftune init --no-alpha [--force]
 ```
 ## Options
 | Flag | Description | Default |
 |------|-------------|---------|
-| `--agent <type>` | Agent platform: `claude`, `codex`, `opencode` | Auto-detected |
+| `--agent <type>` | Agent platform: `claude_code`, `codex`, `opencode`, `openclaw` | Auto-detected |
 | `--cli-path <path>` | Override auto-detected CLI entry-point path | Auto-detected |
 | `--force` | Reinitialize even if config already exists | Off |
+| `--enable-autonomy` | Enable autonomous scheduling during init | Off |
+| `--schedule-format <fmt>` | Schedule format: `cron`, `launchd`, `systemd` | Auto-detected |
+| `--alpha` | Enroll in the selftune alpha program (opens browser for device-code auth) | Off |
+| `--no-alpha` | Unenroll from the alpha program (preserves user_id) | Off |
+| `--alpha-email <email>` | Email for alpha enrollment (required with `--alpha`) | - |
+| `--alpha-name <name>` | Display name for alpha enrollment | - |
 ## Output Format
@@ -28,12 +36,22 @@ Creates `~/.selftune/config.json`:
 ```json
 {
-  "agent_type": "claude",
+  "agent_type": "claude_code",
   "cli_path": "/Users/you/selftune/cli/selftune/index.ts",
   "llm_mode": "agent",
   "agent_cli": "claude",
   "hooks_installed": true,
-  "initialized_at": "2026-02-28T10:00:00Z"
+  "initialized_at": "2026-02-28T10:00:00Z",
+  "alpha": {
+    "enrolled": true,
+    "user_id": "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d",
+    "cloud_user_id": "cloud-uuid-...",
+    "cloud_org_id": "org-uuid-...",
+    "email": "user@example.com",
+    "display_name": "User Name",
+    "consent_timestamp": "2026-02-28T10:00:00Z",
+    "api_key": "<provisioned automatically via device-code flow>"
+  }
 }
 ```
@@ -47,6 +65,15 @@ Creates `~/.selftune/config.json`:
 | `agent_cli` | string | CLI binary name for the detected agent |
 | `hooks_installed` | boolean | Whether telemetry hooks are installed |
 | `initialized_at` | string | ISO 8601 timestamp |
+| `alpha` | object? | Alpha program enrollment (present only if enrolled) |
+| `alpha.enrolled` | boolean | Whether the user is currently enrolled |
+| `alpha.user_id` | string | Stable UUID, generated once, preserved across reinits |
+| `alpha.cloud_user_id` | string? | Cloud account UUID (set by device-code flow) |
+| `alpha.cloud_org_id` | string? | Cloud organization UUID (set by device-code flow) |
+| `alpha.email` | string? | Email provided at enrollment |
+| `alpha.display_name` | string? | Optional display name |
+| `alpha.consent_timestamp` | string | ISO 8601 timestamp of consent |
+| `alpha.api_key` | string? | Upload credential (provisioned automatically by device-code flow) |
 ## Steps
@@ -98,7 +125,7 @@ The init output will report what was installed, e.g.:
 | `UserPromptSubmit` | `hooks/auto-activate.ts` | Suggest skills before prompt processing |
 | `PreToolUse` (Write/Edit) | `hooks/skill-change-guard.ts` | Detect uncontrolled skill edits |
 | `PreToolUse` (Write/Edit) | `hooks/evolution-guard.ts` | Block SKILL.md edits on monitored skills |
-| `PostToolUse` (Read) | `hooks/skill-eval.ts` | Track skill triggers |
+| `PostToolUse` (Read/Skill) | `hooks/skill-eval.ts` | Track skill triggers and Skill tool invocations |
 | `Stop` | `hooks/session-stop.ts` | Capture session telemetry |
 **Codex agents:**
@@ -135,21 +162,7 @@ The activation rules file configures auto-activation behavior -- which skills
 get suggested and under what conditions. Edit `~/.selftune/activation-rules.json`
 to customize thresholds and skill mappings for your project.
-### 7. Verify Agent Availability
-`selftune init` installs the specialized agent files to `~/.claude/agents/`
-automatically. Verify they are present:
-```bash
-ls ~/.claude/agents/
-```
-Expected agents: `diagnosis-analyst.md`, `pattern-analyst.md`,
-`evolution-reviewer.md`, `integration-guide.md`. These are used by evolve
-and doctor workflows for deeper analysis. If missing, run `selftune init --force`
-to reinstall them.
-### 8. Verify with Doctor
+### 7. Verify with Doctor
 ```bash
 selftune doctor
@@ -163,17 +176,118 @@ reported issues before proceeding.
 For project-type-specific setup (single-skill, multi-skill, monorepo, Codex,
 OpenCode, mixed agents), see [docs/integration-guide.md](../../docs/integration-guide.md).
-Templates for each project type are in the `templates/` directory:
-- `templates/single-skill-settings.json` — hooks for single-skill projects
-- `templates/multi-skill-settings.json` — hooks for multi-skill projects with activation rules
-- `templates/activation-rules-default.json` — default auto-activation rule configuration
+Templates for each project type are bundled with the skill:
+- `skill/settings_snippet.json` — hooks for Claude Code projects
+- `assets/activation-rules-default.json` — default auto-activation rule configuration
 ## Subagent Escalation
 For complex project structures (monorepos, multi-skill repos, mixed agent
-platforms), spawn the `integration-guide` agent as a subagent for guided
-setup. This agent handles project-type detection, per-package configuration,
-and verification steps that go beyond what the basic init workflow covers.
+platforms), read `agents/integration-guide.md` and spawn a subagent with
+those instructions. That agent handles project-type detection, per-package
+configuration, and verification steps that go beyond what the basic init
+workflow covers.
+## Alpha Enrollment
+Enroll the user in the selftune alpha program for early access features.
+Before running the alpha command:
+1. Ask whether the user wants to opt into the selftune alpha data-sharing program
+2. If they opt in, ask for their email and optional display name
+3. If they decline, skip alpha enrollment and continue with plain `selftune init`
+The CLI stays non-interactive. The agent is responsible for collecting consent
+and the required `--alpha-email` value before invoking the command.
+## Alpha Enrollment (Device-Code Flow)
+The alpha program sends canonical telemetry to the selftune cloud for analysis.
+Enrollment uses a device-code flow — one command, one browser approval, fully automatic.
+### Setup Sequence
+1. **Check local config**: Run `selftune status` — look for the "Alpha Upload" section
+2. **If not linked**: Collect the user's email and run:
+   ```bash
+   selftune init --alpha --alpha-email <user-email> --force
+   ```
+3. **Browser opens automatically**: The CLI requests a device code, opens the verification URL in the browser with the code pre-filled, and polls for approval.
+4. **User approves in browser**: One click to authorize.
+5. **CLI receives credentials**: API key, cloud_user_id, and org_id are automatically provisioned and stored in `~/.selftune/config.json` with `0600` permissions.
+6. **Verify readiness**: The init command prints a readiness check. If all checks pass, alpha upload is active.
+   The readiness JSON includes a `guidance` object with:
+   - `message`
+   - `next_command`
+   - `suggested_commands[]`
+   - `blocking`
+7. **If readiness fails**: Run `selftune doctor` to diagnose. Common issues:
+   - `not enrolled` → re-run `selftune init --alpha --alpha-email <email> --force`
+   - Device-code expired → re-run the init command (codes expire after ~15 minutes)
+### Key Principle
+The cloud app is used **only** for the one-time browser approval during device-code auth. All other selftune operations happen through the local CLI and this agent.
+### Enroll
+```bash
+selftune init --alpha --alpha-email user@example.com --alpha-name "User Name" --force
+```
+The `--alpha-email` flag is required. The command will:
+1. Generate a stable UUID (preserved across reinits)
+2. Request a device code from the cloud API
+3. Open the browser to the verification URL
+4. Poll until the user approves
+5. Receive and store the API key, cloud_user_id, and org_id automatically
+6. Write the alpha block to `~/.selftune/config.json` with `0600` permissions
+7. Print an `alpha_enrolled` JSON message to stdout
+8. Print the consent notice to stderr
+The consent notice explicitly states that the friendly alpha cohort shares raw
+prompt/query text in addition to skill/session/evolution metadata.
+### Upload Behavior
+Once enrolled, `selftune orchestrate` automatically uploads new session,
+invocation, and evolution data to the cloud API at the end of each run.
+This upload step is fail-open -- errors never block the orchestrate loop.
+Use `selftune alpha upload` for manual uploads or `selftune alpha upload --dry-run`
+to preview what would be sent.
+The upload endpoint is `https://api.selftune.dev/api/v1/push`, authenticated with
+the stored API key via `Authorization: Bearer` header. The endpoint can be
+overridden with the `SELFTUNE_ALPHA_ENDPOINT` environment variable.
+### Unenroll
+```bash
+selftune init --no-alpha --force
+```
+Sets `enrolled: false` in the alpha block but preserves the `user_id` so re-enrollment does not create a new identity.
+### Error Handling
+If `--alpha` is passed without `--alpha-email`, the CLI throws a JSON error:
+```json
+{
+  "code": "alpha_email_required",
+  "error": "alpha_email_required",
+  "message": "The --alpha-email flag is required for alpha enrollment.",
+  "next_command": "selftune init --alpha --alpha-email <email>",
+  "suggested_commands": ["selftune status", "selftune doctor"],
+  "blocking": true
+}
+```
+If the device-code flow fails (network error, timeout, user denied), the CLI throws
+with a descriptive error message. The agent should relay this to the user and suggest
+retrying with `selftune init --alpha --alpha-email <email> --force`.
 ## Common Patterns
@@ -182,6 +296,11 @@ and verification steps that go beyond what the basic init workflow covers.
 > `npm install -g selftune`. Run `selftune init`, then verify with
 > `selftune doctor`. Report results to the user.
+**User wants alpha enrollment**
+> Ask whether they want to opt into alpha data sharing. If yes, collect email
+> and optional display name, then run `selftune init --alpha --alpha-email ...`.
+> The browser opens automatically for approval. No manual key management needed.
 **Hooks not capturing data**
 > Run `selftune doctor` to check hook installation. Parse the JSON output
 > for failed hook checks. If paths are wrong, update

package/skill/Workflows/Orchestrate.md CHANGED Viewed

@@ -26,10 +26,13 @@ selftune orchestrate
 |------|-------------|---------|
 | `--dry-run` | Plan and validate without deploying changes | Off |
 | `--review-required` | Keep validated changes in review mode instead of deploying | Off |
+| `--auto-approve` | *(Deprecated)* Autonomous mode is now the default | — |
 | `--skill <name>` | Limit the loop to one skill | All skills |
-| `--max-skills <n>` | Cap how many candidates are processed in one run | `3` |
-| `--recent-window <hours>` | Window for post-deploy watch/rollback checks | `24` |
+| `--max-skills <n>` | Cap how many candidates are processed in one run | `5` |
+| `--recent-window <hours>` | Window for post-deploy watch/rollback checks | `48` |
 | `--sync-force` | Force a full source replay before candidate selection | Off |
+| `--loop` | Run as a long-lived process that cycles continuously | Off |
+| `--loop-interval <seconds>` | Pause between cycles (minimum 60) | `3600` |
 ## Default Behavior
@@ -133,6 +136,12 @@ In autonomous mode, orchestrate calls sub-workflows in this fixed order:
 2. **Status** — compute skill health using existing grade results (reads `grading.json` outputs from previous sessions)
 3. **Evolve** — run evolution on selected candidates (pre-flight is skipped, cheap-loop mode enabled, defaults used)
 4. **Watch** — monitor recently evolved skills (auto-rollback enabled by default, `--recent-window` hours lookback)
+5. **Alpha Upload** — if enrolled in the alpha program (`config.alpha.enrolled === true`) and an API key is configured, stage new canonical records (sessions, invocations, evolution evidence, orchestrate runs) into `canonical_upload_staging`, build V2 push payloads, and flush to the cloud API (`POST /api/v1/push`) with Bearer auth. Fail-open: upload errors never block the orchestrate loop. Respects `--dry-run`.
+Between candidate selection and evolution, orchestrate checks for
+**cross-skill eval set overlap**. When two or more evolution candidates
+share >30% of their positive eval queries, a warning is logged to stderr.
+This is an informational diagnostic only — it does not block evolution.
 All sub-workflows run with defaults and no user interaction. The safety
 model relies on regression thresholds, automatic rollback, and SKILL.md

package/skill/Workflows/Sync.md CHANGED Viewed

@@ -29,6 +29,7 @@ selftune sync
 | `--no-opencode` | Skip OpenCode ingest |
 | `--no-openclaw` | Skip OpenClaw ingest |
 | `--no-repair` | Skip rebuilding `skill_usage_repaired.jsonl` |
+| `--json` | Output results as JSON |
 ## Output
@@ -66,6 +67,28 @@ After sync completes, proceed with the user's intended workflow:
 `selftune status`, `selftune dashboard`, `selftune watch --sync-first`,
 or `selftune evolve --sync-first`.
+## `--json` Usage
+```bash
+selftune sync --json
+```
+Sample output:
+```json
+{
+  "sources": {
+    "claude": { "scanned": 12, "synced": 3, "skipped": 9 },
+    "codex": { "scanned": 0, "synced": 0, "skipped": 0 }
+  },
+  "repaired": { "total": 42 },
+  "errors": []
+}
+```
+Use `--json` when the agent needs to parse sync results programmatically
+(e.g., to decide whether to proceed with evolution or surface counts to the user).
 ## Common Patterns
 **User wants to refresh telemetry data**

package/skill/Workflows/Watch.md CHANGED Viewed

@@ -17,8 +17,9 @@ selftune watch --skill <name> --skill-path <path> [options]
 | `--skill-path <path>` | Path to the skill's SKILL.md | Required |
 | `--window <n>` | Sliding window size (number of sessions) | 20 |
 | `--threshold <n>` | Regression threshold (drop from baseline) | 0.1 |
-| `--baseline <n>` | Explicit baseline pass rate (0-1) | Auto-detected from last deploy |
 | `--auto-rollback` | Automatically rollback on detected regression | Off |
+| `--sync-first` | Refresh source-truth telemetry before evaluating | Off |
+| `--sync-force` | Force a full source rescan during `--sync-first` | Off |
 ## Output Format
@@ -138,10 +139,6 @@ context window resets before the user acts on the results.
 > Use `--auto-rollback`. The command will restore the previous description
 > automatically if pass rate drops below baseline minus threshold.
-**"Set a custom baseline"**
-> Use `--baseline 0.85` to override auto-detection. Useful when the
-> auto-detected baseline is from an older evolution.
 ## Autonomous Mode
 When called by `selftune orchestrate`, watch runs automatically on recently