claude-turing 4.6.0 → 4.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (333)
  1. package/.claude-plugin/plugin.json +2 -2
  2. package/README.md +1 -1
  3. package/commands/ablate.md +0 -1
  4. package/commands/annotate.md +0 -1
  5. package/commands/archive.md +0 -1
  6. package/commands/audit.md +0 -1
  7. package/commands/baseline.md +0 -1
  8. package/commands/brief.md +0 -1
  9. package/commands/budget.md +0 -1
  10. package/commands/calibrate.md +0 -1
  11. package/commands/card.md +0 -1
  12. package/commands/changelog.md +0 -1
  13. package/commands/checkpoint.md +0 -1
  14. package/commands/cite.md +0 -1
  15. package/commands/compare.md +0 -1
  16. package/commands/counterfactual.md +0 -1
  17. package/commands/curriculum.md +0 -1
  18. package/commands/design.md +0 -1
  19. package/commands/diagnose.md +0 -1
  20. package/commands/diff.md +0 -1
  21. package/commands/distill.md +0 -1
  22. package/commands/doctor.md +0 -1
  23. package/commands/ensemble.md +0 -1
  24. package/commands/explore.md +0 -1
  25. package/commands/export.md +0 -1
  26. package/commands/feature.md +0 -1
  27. package/commands/flashback.md +0 -1
  28. package/commands/fork.md +0 -1
  29. package/commands/frontier.md +0 -1
  30. package/commands/init.md +0 -1
  31. package/commands/leak.md +0 -1
  32. package/commands/lit.md +0 -1
  33. package/commands/logbook.md +0 -1
  34. package/commands/merge.md +0 -1
  35. package/commands/mode.md +0 -1
  36. package/commands/onboard.md +0 -1
  37. package/commands/paper.md +0 -1
  38. package/commands/plan.md +0 -1
  39. package/commands/poster.md +0 -1
  40. package/commands/postmortem.md +0 -1
  41. package/commands/preflight.md +0 -1
  42. package/commands/present.md +0 -1
  43. package/commands/profile.md +0 -1
  44. package/commands/prune.md +0 -1
  45. package/commands/quantize.md +0 -1
  46. package/commands/queue.md +0 -1
  47. package/commands/registry.md +0 -1
  48. package/commands/regress.md +0 -1
  49. package/commands/replay.md +0 -1
  50. package/commands/report.md +0 -1
  51. package/commands/reproduce.md +0 -1
  52. package/commands/retry.md +0 -1
  53. package/commands/review.md +0 -1
  54. package/commands/sanity.md +0 -1
  55. package/commands/scale.md +0 -1
  56. package/commands/search.md +0 -1
  57. package/commands/seed.md +0 -1
  58. package/commands/sensitivity.md +0 -1
  59. package/commands/share.md +0 -1
  60. package/commands/simulate.md +0 -1
  61. package/commands/status.md +0 -1
  62. package/commands/stitch.md +0 -1
  63. package/commands/suggest.md +0 -1
  64. package/commands/surgery.md +0 -1
  65. package/commands/sweep.md +0 -1
  66. package/commands/template.md +0 -1
  67. package/commands/train.md +0 -1
  68. package/commands/transfer.md +0 -1
  69. package/commands/trend.md +0 -1
  70. package/commands/try.md +0 -1
  71. package/commands/turing.md +3 -3
  72. package/commands/update.md +0 -1
  73. package/commands/validate.md +0 -1
  74. package/commands/warm.md +0 -1
  75. package/commands/watch.md +0 -1
  76. package/commands/whatif.md +0 -1
  77. package/commands/xray.md +0 -1
  78. package/config/commands.yaml +74 -74
  79. package/package.json +10 -3
  80. package/skills/turing/SKILL.md +180 -0
  81. package/skills/turing/ablate/SKILL.md +46 -0
  82. package/skills/turing/annotate/SKILL.md +22 -0
  83. package/skills/turing/archive/SKILL.md +22 -0
  84. package/skills/turing/audit/SKILL.md +55 -0
  85. package/skills/turing/baseline/SKILL.md +44 -0
  86. package/skills/turing/brief/SKILL.md +94 -0
  87. package/skills/turing/budget/SKILL.md +51 -0
  88. package/skills/turing/calibrate/SKILL.md +46 -0
  89. package/skills/turing/card/SKILL.md +35 -0
  90. package/skills/turing/changelog/SKILL.md +21 -0
  91. package/skills/turing/checkpoint/SKILL.md +46 -0
  92. package/skills/turing/cite/SKILL.md +22 -0
  93. package/skills/turing/compare/SKILL.md +23 -0
  94. package/skills/turing/counterfactual/SKILL.md +26 -0
  95. package/skills/turing/curriculum/SKILL.md +42 -0
  96. package/skills/turing/design/SKILL.md +96 -0
  97. package/skills/turing/diagnose/SKILL.md +51 -0
  98. package/skills/turing/diff/SKILL.md +47 -0
  99. package/skills/turing/distill/SKILL.md +55 -0
  100. package/skills/turing/doctor/SKILL.md +30 -0
  101. package/skills/turing/ensemble/SKILL.md +53 -0
  102. package/skills/turing/explore/SKILL.md +106 -0
  103. package/skills/turing/export/SKILL.md +47 -0
  104. package/skills/turing/feature/SKILL.md +41 -0
  105. package/skills/turing/flashback/SKILL.md +21 -0
  106. package/skills/turing/fork/SKILL.md +39 -0
  107. package/skills/turing/frontier/SKILL.md +44 -0
  108. package/skills/turing/init/SKILL.md +153 -0
  109. package/skills/turing/leak/SKILL.md +46 -0
  110. package/skills/turing/lit/SKILL.md +46 -0
  111. package/skills/turing/logbook/SKILL.md +50 -0
  112. package/skills/turing/merge/SKILL.md +23 -0
  113. package/skills/turing/mode/SKILL.md +42 -0
  114. package/skills/turing/onboard/SKILL.md +19 -0
  115. package/skills/turing/paper/SKILL.md +43 -0
  116. package/skills/turing/plan/SKILL.md +26 -0
  117. package/skills/turing/poster/SKILL.md +88 -0
  118. package/skills/turing/postmortem/SKILL.md +27 -0
  119. package/skills/turing/preflight/SKILL.md +74 -0
  120. package/skills/turing/present/SKILL.md +22 -0
  121. package/skills/turing/profile/SKILL.md +42 -0
  122. package/skills/turing/prune/SKILL.md +25 -0
  123. package/skills/turing/quantize/SKILL.md +23 -0
  124. package/skills/turing/queue/SKILL.md +47 -0
  125. package/skills/turing/registry/SKILL.md +30 -0
  126. package/skills/turing/regress/SKILL.md +52 -0
  127. package/skills/turing/replay/SKILL.md +22 -0
  128. package/skills/turing/report/SKILL.md +96 -0
  129. package/skills/turing/reproduce/SKILL.md +47 -0
  130. package/skills/turing/retry/SKILL.md +40 -0
  131. package/skills/turing/review/SKILL.md +19 -0
  132. package/skills/turing/rules/loop-protocol.md +91 -0
  133. package/skills/turing/sanity/SKILL.md +47 -0
  134. package/skills/turing/scale/SKILL.md +54 -0
  135. package/skills/turing/search/SKILL.md +21 -0
  136. package/skills/turing/seed/SKILL.md +46 -0
  137. package/skills/turing/sensitivity/SKILL.md +40 -0
  138. package/skills/turing/share/SKILL.md +19 -0
  139. package/skills/turing/simulate/SKILL.md +27 -0
  140. package/skills/turing/status/SKILL.md +23 -0
  141. package/skills/turing/stitch/SKILL.md +48 -0
  142. package/skills/turing/suggest/SKILL.md +158 -0
  143. package/skills/turing/surgery/SKILL.md +26 -0
  144. package/skills/turing/sweep/SKILL.md +44 -0
  145. package/skills/turing/template/SKILL.md +21 -0
  146. package/skills/turing/train/SKILL.md +74 -0
  147. package/skills/turing/transfer/SKILL.md +53 -0
  148. package/skills/turing/trend/SKILL.md +20 -0
  149. package/skills/turing/try/SKILL.md +62 -0
  150. package/skills/turing/update/SKILL.md +26 -0
  151. package/skills/turing/validate/SKILL.md +33 -0
  152. package/skills/turing/warm/SKILL.md +52 -0
  153. package/skills/turing/watch/SKILL.md +59 -0
  154. package/skills/turing/whatif/SKILL.md +30 -0
  155. package/skills/turing/xray/SKILL.md +42 -0
  156. package/src/command-registry.js +21 -0
  157. package/src/install.js +4 -3
  158. package/src/sync-commands-layout.js +149 -0
  159. package/src/sync-skills-layout.js +20 -0
  160. package/templates/__pycache__/evaluate.cpython-312.pyc +0 -0
  161. package/templates/__pycache__/evaluate.cpython-314.pyc +0 -0
  162. package/templates/__pycache__/prepare.cpython-312.pyc +0 -0
  163. package/templates/__pycache__/prepare.cpython-314.pyc +0 -0
  164. package/templates/features/__pycache__/__init__.cpython-312.pyc +0 -0
  165. package/templates/features/__pycache__/__init__.cpython-314.pyc +0 -0
  166. package/templates/features/__pycache__/featurizers.cpython-312.pyc +0 -0
  167. package/templates/features/__pycache__/featurizers.cpython-314.pyc +0 -0
  168. package/templates/scripts/__pycache__/__init__.cpython-312.pyc +0 -0
  169. package/templates/scripts/__pycache__/__init__.cpython-314.pyc +0 -0
  170. package/templates/scripts/__pycache__/ablation_study.cpython-312.pyc +0 -0
  171. package/templates/scripts/__pycache__/ablation_study.cpython-314.pyc +0 -0
  172. package/templates/scripts/__pycache__/architecture_surgery.cpython-312.pyc +0 -0
  173. package/templates/scripts/__pycache__/architecture_surgery.cpython-314.pyc +0 -0
  174. package/templates/scripts/__pycache__/budget_manager.cpython-312.pyc +0 -0
  175. package/templates/scripts/__pycache__/budget_manager.cpython-314.pyc +0 -0
  176. package/templates/scripts/__pycache__/build_ensemble.cpython-312.pyc +0 -0
  177. package/templates/scripts/__pycache__/build_ensemble.cpython-314.pyc +0 -0
  178. package/templates/scripts/__pycache__/calibration.cpython-312.pyc +0 -0
  179. package/templates/scripts/__pycache__/calibration.cpython-314.pyc +0 -0
  180. package/templates/scripts/__pycache__/check_convergence.cpython-312.pyc +0 -0
  181. package/templates/scripts/__pycache__/check_convergence.cpython-314.pyc +0 -0
  182. package/templates/scripts/__pycache__/checkpoint_manager.cpython-312.pyc +0 -0
  183. package/templates/scripts/__pycache__/checkpoint_manager.cpython-314.pyc +0 -0
  184. package/templates/scripts/__pycache__/citation_manager.cpython-312.pyc +0 -0
  185. package/templates/scripts/__pycache__/citation_manager.cpython-314.pyc +0 -0
  186. package/templates/scripts/__pycache__/cost_frontier.cpython-312.pyc +0 -0
  187. package/templates/scripts/__pycache__/cost_frontier.cpython-314.pyc +0 -0
  188. package/templates/scripts/__pycache__/counterfactual_explanation.cpython-312.pyc +0 -0
  189. package/templates/scripts/__pycache__/counterfactual_explanation.cpython-314.pyc +0 -0
  190. package/templates/scripts/__pycache__/critique_hypothesis.cpython-312.pyc +0 -0
  191. package/templates/scripts/__pycache__/critique_hypothesis.cpython-314.pyc +0 -0
  192. package/templates/scripts/__pycache__/curriculum_optimizer.cpython-312.pyc +0 -0
  193. package/templates/scripts/__pycache__/curriculum_optimizer.cpython-314.pyc +0 -0
  194. package/templates/scripts/__pycache__/diagnose_errors.cpython-312.pyc +0 -0
  195. package/templates/scripts/__pycache__/diagnose_errors.cpython-314.pyc +0 -0
  196. package/templates/scripts/__pycache__/draft_paper_sections.cpython-312.pyc +0 -0
  197. package/templates/scripts/__pycache__/draft_paper_sections.cpython-314.pyc +0 -0
  198. package/templates/scripts/__pycache__/equivalence_checker.cpython-312.pyc +0 -0
  199. package/templates/scripts/__pycache__/equivalence_checker.cpython-314.pyc +0 -0
  200. package/templates/scripts/__pycache__/experiment_annotations.cpython-312.pyc +0 -0
  201. package/templates/scripts/__pycache__/experiment_annotations.cpython-314.pyc +0 -0
  202. package/templates/scripts/__pycache__/experiment_archive.cpython-312.pyc +0 -0
  203. package/templates/scripts/__pycache__/experiment_archive.cpython-314.pyc +0 -0
  204. package/templates/scripts/__pycache__/experiment_diff.cpython-312.pyc +0 -0
  205. package/templates/scripts/__pycache__/experiment_diff.cpython-314.pyc +0 -0
  206. package/templates/scripts/__pycache__/experiment_index.cpython-312.pyc +0 -0
  207. package/templates/scripts/__pycache__/experiment_index.cpython-314.pyc +0 -0
  208. package/templates/scripts/__pycache__/experiment_queue.cpython-312.pyc +0 -0
  209. package/templates/scripts/__pycache__/experiment_queue.cpython-314.pyc +0 -0
  210. package/templates/scripts/__pycache__/experiment_replay.cpython-312.pyc +0 -0
  211. package/templates/scripts/__pycache__/experiment_replay.cpython-314.pyc +0 -0
  212. package/templates/scripts/__pycache__/experiment_search.cpython-312.pyc +0 -0
  213. package/templates/scripts/__pycache__/experiment_search.cpython-314.pyc +0 -0
  214. package/templates/scripts/__pycache__/experiment_simulator.cpython-312.pyc +0 -0
  215. package/templates/scripts/__pycache__/experiment_simulator.cpython-314.pyc +0 -0
  216. package/templates/scripts/__pycache__/experiment_templates.cpython-312.pyc +0 -0
  217. package/templates/scripts/__pycache__/experiment_templates.cpython-314.pyc +0 -0
  218. package/templates/scripts/__pycache__/export_card.cpython-312.pyc +0 -0
  219. package/templates/scripts/__pycache__/export_card.cpython-314.pyc +0 -0
  220. package/templates/scripts/__pycache__/export_formats.cpython-312.pyc +0 -0
  221. package/templates/scripts/__pycache__/export_formats.cpython-314.pyc +0 -0
  222. package/templates/scripts/__pycache__/failure_postmortem.cpython-312.pyc +0 -0
  223. package/templates/scripts/__pycache__/failure_postmortem.cpython-314.pyc +0 -0
  224. package/templates/scripts/__pycache__/feature_intelligence.cpython-312.pyc +0 -0
  225. package/templates/scripts/__pycache__/feature_intelligence.cpython-314.pyc +0 -0
  226. package/templates/scripts/__pycache__/fork_experiment.cpython-312.pyc +0 -0
  227. package/templates/scripts/__pycache__/fork_experiment.cpython-314.pyc +0 -0
  228. package/templates/scripts/__pycache__/generate_baselines.cpython-312.pyc +0 -0
  229. package/templates/scripts/__pycache__/generate_baselines.cpython-314.pyc +0 -0
  230. package/templates/scripts/__pycache__/generate_brief.cpython-312.pyc +0 -0
  231. package/templates/scripts/__pycache__/generate_brief.cpython-314.pyc +0 -0
  232. package/templates/scripts/__pycache__/generate_changelog.cpython-312.pyc +0 -0
  233. package/templates/scripts/__pycache__/generate_changelog.cpython-314.pyc +0 -0
  234. package/templates/scripts/__pycache__/generate_figures.cpython-312.pyc +0 -0
  235. package/templates/scripts/__pycache__/generate_figures.cpython-314.pyc +0 -0
  236. package/templates/scripts/__pycache__/generate_logbook.cpython-312.pyc +0 -0
  237. package/templates/scripts/__pycache__/generate_logbook.cpython-314.pyc +0 -0
  238. package/templates/scripts/__pycache__/generate_model_card.cpython-312.pyc +0 -0
  239. package/templates/scripts/__pycache__/generate_model_card.cpython-314.pyc +0 -0
  240. package/templates/scripts/__pycache__/generate_onboarding.cpython-312.pyc +0 -0
  241. package/templates/scripts/__pycache__/generate_onboarding.cpython-314.pyc +0 -0
  242. package/templates/scripts/__pycache__/harness_doctor.cpython-312.pyc +0 -0
  243. package/templates/scripts/__pycache__/harness_doctor.cpython-314.pyc +0 -0
  244. package/templates/scripts/__pycache__/incremental_update.cpython-312.pyc +0 -0
  245. package/templates/scripts/__pycache__/incremental_update.cpython-314.pyc +0 -0
  246. package/templates/scripts/__pycache__/knowledge_transfer.cpython-312.pyc +0 -0
  247. package/templates/scripts/__pycache__/knowledge_transfer.cpython-314.pyc +0 -0
  248. package/templates/scripts/__pycache__/latency_benchmark.cpython-312.pyc +0 -0
  249. package/templates/scripts/__pycache__/latency_benchmark.cpython-314.pyc +0 -0
  250. package/templates/scripts/__pycache__/leakage_detector.cpython-312.pyc +0 -0
  251. package/templates/scripts/__pycache__/leakage_detector.cpython-314.pyc +0 -0
  252. package/templates/scripts/__pycache__/literature_search.cpython-312.pyc +0 -0
  253. package/templates/scripts/__pycache__/literature_search.cpython-314.pyc +0 -0
  254. package/templates/scripts/__pycache__/log_experiment.cpython-312.pyc +0 -0
  255. package/templates/scripts/__pycache__/log_experiment.cpython-314.pyc +0 -0
  256. package/templates/scripts/__pycache__/manage_hypotheses.cpython-312.pyc +0 -0
  257. package/templates/scripts/__pycache__/manage_hypotheses.cpython-314.pyc +0 -0
  258. package/templates/scripts/__pycache__/methodology_audit.cpython-312.pyc +0 -0
  259. package/templates/scripts/__pycache__/methodology_audit.cpython-314.pyc +0 -0
  260. package/templates/scripts/__pycache__/model_distiller.cpython-312.pyc +0 -0
  261. package/templates/scripts/__pycache__/model_distiller.cpython-314.pyc +0 -0
  262. package/templates/scripts/__pycache__/model_lifecycle.cpython-312.pyc +0 -0
  263. package/templates/scripts/__pycache__/model_lifecycle.cpython-314.pyc +0 -0
  264. package/templates/scripts/__pycache__/model_merger.cpython-312.pyc +0 -0
  265. package/templates/scripts/__pycache__/model_merger.cpython-314.pyc +0 -0
  266. package/templates/scripts/__pycache__/model_pruning.cpython-312.pyc +0 -0
  267. package/templates/scripts/__pycache__/model_pruning.cpython-314.pyc +0 -0
  268. package/templates/scripts/__pycache__/model_quantization.cpython-312.pyc +0 -0
  269. package/templates/scripts/__pycache__/model_quantization.cpython-314.pyc +0 -0
  270. package/templates/scripts/__pycache__/model_xray.cpython-312.pyc +0 -0
  271. package/templates/scripts/__pycache__/model_xray.cpython-314.pyc +0 -0
  272. package/templates/scripts/__pycache__/novelty_guard.cpython-312.pyc +0 -0
  273. package/templates/scripts/__pycache__/novelty_guard.cpython-314.pyc +0 -0
  274. package/templates/scripts/__pycache__/package_experiments.cpython-312.pyc +0 -0
  275. package/templates/scripts/__pycache__/package_experiments.cpython-314.pyc +0 -0
  276. package/templates/scripts/__pycache__/pareto_frontier.cpython-312.pyc +0 -0
  277. package/templates/scripts/__pycache__/pareto_frontier.cpython-314.pyc +0 -0
  278. package/templates/scripts/__pycache__/parse_metrics.cpython-312.pyc +0 -0
  279. package/templates/scripts/__pycache__/parse_metrics.cpython-314.pyc +0 -0
  280. package/templates/scripts/__pycache__/pipeline_manager.cpython-312.pyc +0 -0
  281. package/templates/scripts/__pycache__/pipeline_manager.cpython-314.pyc +0 -0
  282. package/templates/scripts/__pycache__/profile_training.cpython-312.pyc +0 -0
  283. package/templates/scripts/__pycache__/profile_training.cpython-314.pyc +0 -0
  284. package/templates/scripts/__pycache__/regression_gate.cpython-312.pyc +0 -0
  285. package/templates/scripts/__pycache__/regression_gate.cpython-314.pyc +0 -0
  286. package/templates/scripts/__pycache__/reproduce_experiment.cpython-312.pyc +0 -0
  287. package/templates/scripts/__pycache__/reproduce_experiment.cpython-314.pyc +0 -0
  288. package/templates/scripts/__pycache__/research_planner.cpython-312.pyc +0 -0
  289. package/templates/scripts/__pycache__/research_planner.cpython-314.pyc +0 -0
  290. package/templates/scripts/__pycache__/sanity_checks.cpython-312.pyc +0 -0
  291. package/templates/scripts/__pycache__/sanity_checks.cpython-314.pyc +0 -0
  292. package/templates/scripts/__pycache__/scaffold.cpython-312.pyc +0 -0
  293. package/templates/scripts/__pycache__/scaffold.cpython-314.pyc +0 -0
  294. package/templates/scripts/__pycache__/scaling_estimator.cpython-312.pyc +0 -0
  295. package/templates/scripts/__pycache__/scaling_estimator.cpython-314.pyc +0 -0
  296. package/templates/scripts/__pycache__/seed_runner.cpython-312.pyc +0 -0
  297. package/templates/scripts/__pycache__/seed_runner.cpython-314.pyc +0 -0
  298. package/templates/scripts/__pycache__/sensitivity_analysis.cpython-312.pyc +0 -0
  299. package/templates/scripts/__pycache__/sensitivity_analysis.cpython-314.pyc +0 -0
  300. package/templates/scripts/__pycache__/session_flashback.cpython-312.pyc +0 -0
  301. package/templates/scripts/__pycache__/session_flashback.cpython-314.pyc +0 -0
  302. package/templates/scripts/__pycache__/show_experiment_tree.cpython-312.pyc +0 -0
  303. package/templates/scripts/__pycache__/show_experiment_tree.cpython-314.pyc +0 -0
  304. package/templates/scripts/__pycache__/show_families.cpython-312.pyc +0 -0
  305. package/templates/scripts/__pycache__/show_families.cpython-314.pyc +0 -0
  306. package/templates/scripts/__pycache__/simulate_review.cpython-312.pyc +0 -0
  307. package/templates/scripts/__pycache__/simulate_review.cpython-314.pyc +0 -0
  308. package/templates/scripts/__pycache__/smart_retry.cpython-312.pyc +0 -0
  309. package/templates/scripts/__pycache__/smart_retry.cpython-314.pyc +0 -0
  310. package/templates/scripts/__pycache__/statistical_compare.cpython-312.pyc +0 -0
  311. package/templates/scripts/__pycache__/statistical_compare.cpython-314.pyc +0 -0
  312. package/templates/scripts/__pycache__/suggest_next.cpython-312.pyc +0 -0
  313. package/templates/scripts/__pycache__/suggest_next.cpython-314.pyc +0 -0
  314. package/templates/scripts/__pycache__/sweep.cpython-312.pyc +0 -0
  315. package/templates/scripts/__pycache__/sweep.cpython-314.pyc +0 -0
  316. package/templates/scripts/__pycache__/synthesize_decision.cpython-312.pyc +0 -0
  317. package/templates/scripts/__pycache__/synthesize_decision.cpython-314.pyc +0 -0
  318. package/templates/scripts/__pycache__/training_monitor.cpython-312.pyc +0 -0
  319. package/templates/scripts/__pycache__/training_monitor.cpython-314.pyc +0 -0
  320. package/templates/scripts/__pycache__/treequest_suggest.cpython-312.pyc +0 -0
  321. package/templates/scripts/__pycache__/treequest_suggest.cpython-314.pyc +0 -0
  322. package/templates/scripts/__pycache__/trend_analysis.cpython-312.pyc +0 -0
  323. package/templates/scripts/__pycache__/trend_analysis.cpython-314.pyc +0 -0
  324. package/templates/scripts/__pycache__/turing_io.cpython-312.pyc +0 -0
  325. package/templates/scripts/__pycache__/turing_io.cpython-314.pyc +0 -0
  326. package/templates/scripts/__pycache__/update_state.cpython-312.pyc +0 -0
  327. package/templates/scripts/__pycache__/update_state.cpython-314.pyc +0 -0
  328. package/templates/scripts/__pycache__/verify_placeholders.cpython-312.pyc +0 -0
  329. package/templates/scripts/__pycache__/verify_placeholders.cpython-314.pyc +0 -0
  330. package/templates/scripts/__pycache__/warm_start.cpython-312.pyc +0 -0
  331. package/templates/scripts/__pycache__/warm_start.cpython-314.pyc +0 -0
  332. package/templates/scripts/__pycache__/whatif_engine.cpython-312.pyc +0 -0
  333. package/templates/scripts/__pycache__/whatif_engine.cpython-314.pyc +0 -0
@@ -0,0 +1,74 @@
1
+ ---
2
+ name: train
3
+ description: Run the autonomous ML experiment loop. Iteratively hypothesizes, trains, evaluates, and decides — keeping only improvements. Implements the autoresearch pattern with formal convergence detection and git-disciplined rollback.
4
+ argument-hint: "[max_iterations]"
5
+ allowed-tools: Read, Write, Edit, Bash(python train.py:*, python scripts/*:*, git:*, source .venv/bin/activate:*, pip:*), Grep, Glob
6
+ ---
7
+
8
+ You are an autonomous ML researcher. Your goal: iteratively improve a model by following the experiment loop protocol — the scientific method applied to machine learning.
9
+
10
+ Read `program.md` in the ML project directory for the complete protocol. Follow it exactly.
11
+
12
+ ## Arguments
13
+
14
+ `$ARGUMENTS` — accepts a project path (e.g., `ml/coding`), a number for max_iterations, or both (e.g., `ml/coding 10`). If no number, run until convergence (as defined in `config.yaml` convergence settings).
15
+
16
+ ## Bootstrap Sequence
17
+
18
+ 0. **Detect project directory:**
19
+ - If `$ARGUMENTS` contains a path (e.g., `ml/coding`), use that as the project directory
20
+ - Else if cwd contains `config.yaml` and `train.py`, use cwd
21
+ - Else search for `ml/*/` subdirectories containing `config.yaml`
22
+ - If exactly one found, use it
23
+ - If multiple found, list them and ask the user which to target
24
+ - All subsequent commands run from the detected project directory
25
+ - Memory path: `.claude/agent-memory/ml-researcher-{project_name}/MEMORY.md`
26
+
27
+ 1. **Restore memory:** Read `.claude/agent-memory/ml-researcher-{project_name}/MEMORY.md` for prior observations and best results.
28
+ 2. **Read protocol:** Read `program.md` completely — it defines the experiment loop, constraints, and output format.
29
+ 3. **Bootstrap data:** Check for training data at `config.yaml` → `data.source`. If no splits exist, run `python prepare.py`.
30
+ 4. **Bootstrap venv:** `test -d .venv || (python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt)`
31
+ 5. **Assess state:** `source .venv/bin/activate && python scripts/show_metrics.py --last 5`
32
+ 6. **Begin the loop** from program.md.
33
+
34
+ ## The Loop
35
+
36
+ Each iteration follows the experiment lifecycle (`config/lifecycle.toml`):
37
+
38
+ ```
39
+ proposed -> running -> evaluating -> kept/discarded -> (next iteration)
40
+ ```
41
+
42
+ The agent proposes a hypothesis, executes it, measures the result against the immutable evaluation harness, and decides whether to keep or discard. Only improvements survive in git history.
43
+
44
+ ## Delegation
45
+
46
+ Use `@ml-evaluator` for analysis tasks. It is read-only (no Write/Edit) and cannot accidentally modify the pipeline.
47
+
48
+ ## Context Management
49
+
50
+ - Redirect all training output: `python train.py > run.log 2>&1`
51
+ - Parse metrics with grep, never read full output
52
+ - Persist observations to MEMORY.md after each experiment
53
+
54
+ ## Convergence
55
+
56
+ - Stop after `max_iterations` if provided
57
+ - Otherwise, stop after N consecutive non-improvements (`config.yaml` → `convergence.patience`)
58
+ - Report final best experiment and recommend next steps
59
+
60
+ ## /loop Integration
61
+
62
+ For fully hands-off training:
63
+ ```
64
+ /loop 5m /turing:train
65
+ ```
66
+
67
+ The Stop hook automatically detects convergence and halts the loop. Recommended intervals:
68
+ - `3m` — fast iterations, small datasets
69
+ - `5m` — standard training runs
70
+ - `10m` — deep training with large models
71
+
72
+ ## Rules
73
+
74
+ See `rules/loop-protocol.md` for safety constraints governing the experiment loop.
@@ -0,0 +1,53 @@
1
+ ---
2
+ name: transfer
3
+ description: Cross-project knowledge transfer — find similar prior projects and surface what worked. Builds institutional ML memory.
4
+ argument-hint: "[--from project-path] [--auto]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Find similar prior projects and surface what worked. "Last time you had tabular classification with class imbalance, LightGBM beat everything by 3%."
9
+
10
+ ## Steps
11
+
12
+ 1. **Activate environment:**
13
+ ```bash
14
+ source .venv/bin/activate
15
+ ```
16
+
17
+ 2. **Parse arguments from `$ARGUMENTS`:**
18
+ - `--from ~/projects/fraud-detection` — transfer from a specific project
19
+ - `--auto` — auto-queue hypotheses from recommendations
20
+ - `--index ~/.turing/project_index.yaml` — custom index path
21
+ - `--json` — raw JSON output
22
+
23
+ 3. **Run knowledge transfer:**
24
+ ```bash
25
+ python scripts/knowledge_transfer.py $ARGUMENTS
26
+ ```
27
+
28
+ 4. **Report includes:**
29
+ - Similar prior projects ranked by similarity score
30
+ - Per project: task type, winner model, key insights
31
+ - Suggested hypotheses from winning strategies
32
+ - Auto-queued hypotheses (with `--auto`)
33
+
34
+ 5. **Similarity matching** uses:
35
+ - Task type (classification/regression) — highest weight
36
+ - Dataset size (log-scale comparison)
37
+ - Feature types (tabular/image/text)
38
+ - Class balance characteristics
39
+ - Dimensionality
40
+
41
+ 6. **Project index** at `~/.turing/project_index.yaml` — local only, never uploaded
42
+
43
+ 7. **If no similar projects found:** suggest running on more projects first or specifying one with `--from`
44
+
45
+ 8. **Saved output:** report in `experiments/transfers/transfer-*.yaml`
46
+
47
+ ## Examples
48
+
49
+ ```
50
+ /turing:transfer # Search index for similar projects
51
+ /turing:transfer --from ~/projects/fraud-detection # Transfer from specific project
52
+ /turing:transfer --auto # Auto-queue hypotheses
53
+ ```
@@ -0,0 +1,20 @@
1
+ ---
2
+ name: trend
3
+ description: Long-term trend analysis — improvement velocity, family ROI, diminishing returns detection, strategic research direction.
4
+ argument-hint: "[--window 30d] [--metric accuracy]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ See the arc of your research, not just the latest results. Strategic view over 100+ experiments.
9
+
10
+ ## Steps
11
+ 1. **Activate environment:** `source .venv/bin/activate`
12
+ 2. **Run:** `python scripts/trend_analysis.py $ARGUMENTS`
13
+ 3. **Report:** improvement velocity over time windows, family ROI ranking, diminishing returns prediction, phase transitions
14
+ 4. **Saved output:** `experiments/trends/trend-*.yaml`
15
+
16
+ ## Examples
17
+ ```
18
+ /turing:trend # Full trend analysis
19
+ /turing:trend --window 14d # Last 2 weeks
20
+ ```
@@ -0,0 +1,62 @@
1
+ ---
2
+ name: try
3
+ description: Inject a hypothesis into the agent's experiment queue. This is how research taste reaches the agent — the human selects which coins to flip, the agent flips them.
4
+ argument-hint: "<hypothesis description>"
5
+ allowed-tools: Read, Write, Edit, Bash(python scripts/*:*, source .venv/bin/activate:*), Grep, Glob
6
+ ---
7
+
8
+ Inject a human hypothesis into the experiment queue for the next `/turing:train` iteration.
9
+
10
+ This is the taste-leverage mechanism: you provide judgment about what's worth trying, the agent provides disciplined execution.
11
+
12
+ ## Steps
13
+
14
+ 1. **Parse the hypothesis** from `$ARGUMENTS`. If empty, ask the user what they want the agent to try.
15
+
16
+ 2. **Check for archetype syntax.** If the argument starts with `archetype:`, expand it:
17
+ ```bash
18
+ source .venv/bin/activate && python scripts/manage_hypotheses.py add --archetype <name> --priority high --source human
19
+ ```
20
+
21
+ Otherwise, use the raw description:
22
+ ```bash
23
+ source .venv/bin/activate && python scripts/manage_hypotheses.py add "$ARGUMENTS" --priority high --source human
24
+ ```
25
+
26
+ 3. **Confirm** with the hypothesis ID and instructions:
27
+ - "Queued as hyp-NNN (high priority, human-injected)"
28
+ - "The agent will prioritize this on the next `/turing:train` iteration"
29
+ - Show current queue: `python scripts/manage_hypotheses.py list --status queued`
30
+
31
+ ## Examples
32
+
33
+ ```
34
+ # Free-text hypotheses
35
+ /turing:try switch to LightGBM with dart boosting and lower learning rate
36
+ /turing:try add polynomial features for the numeric columns
37
+ /turing:try increase regularization, the train/val gap suggests overfitting
38
+
39
+ # Archetype-based structured strategies
40
+ /turing:try archetype:model_comparison
41
+ /turing:try archetype:feature_sweep
42
+ /turing:try archetype:ensemble_construction
43
+ /turing:try archetype:regularization_search
44
+ /turing:try archetype:ablation_study
45
+ ```
46
+
47
+ ## Available Archetypes
48
+
49
+ | Archetype | What it does | Expected experiments |
50
+ |-----------|-------------|---------------------|
51
+ | `model_comparison` | Compare XGBoost, LightGBM, RF, LR, MLP with statistical tests | ~5 |
52
+ | `hyperparameter_sweep` | Grid search with multi-seed validation | 15-36 |
53
+ | `feature_sweep` | Add/remove feature transforms one at a time | 6-10 |
54
+ | `regularization_search` | Binary search for optimal regularization | 4-6 |
55
+ | `ensemble_construction` | Voting, stacking, blending of top models | 4-6 |
56
+ | `learning_rate_schedule` | lr vs n_estimators tradeoff | 4-5 |
57
+ | `data_quality_audit` | Class balance, label noise, leakage checks | 3-5 |
58
+ | `ablation_study` | Remove features one at a time to measure importance | N+1 |
59
+
60
+ ## How It Connects
61
+
62
+ The `/turing:train` loop checks `hypotheses.yaml` during the OBSERVE step. Human-injected hypotheses (high priority) are tried before the agent generates its own. After testing, the hypothesis is marked as `tested`, `promising`, or `dead-end` with a link to the resulting experiment.
@@ -0,0 +1,26 @@
1
+ ---
2
+ name: update
3
+ description: Incremental model update — add new data without full retraining, with forgetting detection.
4
+ argument-hint: "<exp-id> --new-data <path> [--replay-ratio 0.1] [--tolerance 0.005]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Add new data to an existing model without starting from scratch. Detects catastrophic forgetting.
9
+
10
+ ## Steps
11
+ 1. `source .venv/bin/activate`
12
+ 2. `python scripts/incremental_update.py $ARGUMENTS`
13
+ 3. **Saved:** `experiments/updates/`
14
+
15
+ ## Model-specific strategies
16
+ - **XGBoost/LightGBM:** continued boosting with additional rounds
17
+ - **Neural networks:** fine-tune with reduced LR + replay buffer from old data
18
+ - **scikit-learn:** partial_fit() or warm_start=True
19
+
20
+ ## Examples
21
+ ```
22
+ /turing:update exp-089 --new-data data/new_batch.csv
23
+ /turing:update exp-089 --new-data data/new.csv --replay-ratio 0.2
24
+ /turing:update exp-089 --new-data data/new.csv --tolerance 0.01
25
+ /turing:update exp-089 --new-data data/new.csv --json
26
+ ```
@@ -0,0 +1,33 @@
1
+ ---
2
+ name: validate
3
+ description: Run stability validation on the current experiment configuration. Executes N runs to measure metric variance and auto-configures multi-run evaluation if variance is too high.
4
+ argument-hint: "[--auto]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Validate the stability of the current ML pipeline by running it multiple times and measuring variance.
9
+
10
+ ## Steps
11
+
12
+ 1. **Activate environment:**
13
+ ```bash
14
+ source .venv/bin/activate
15
+ ```
16
+
17
+ 2. **Run stability check:**
18
+ ```bash
19
+ python scripts/validate_stability.py
20
+ ```
21
+
22
+ 3. **If `$ARGUMENTS` contains `--auto`:**
23
+ ```bash
24
+ python scripts/validate_stability.py --auto
25
+ ```
26
+ This auto-writes `evaluation.n_runs: 3` to `config.yaml` if CV >= 5%.
27
+
28
+ 4. **Report results:**
29
+ - **Stable (CV < 5%):** metric is reliable, single-run evaluation is sufficient
30
+ - **Unstable (CV >= 5%):** metric has high variance, multi-run with median is recommended
31
+ - If `--auto` was used, report what was changed in config.yaml
32
+
33
+ 5. **If no training pipeline exists:** suggest `/turing:init` first.
@@ -0,0 +1,52 @@
1
+ ---
2
+ name: warm
3
+ description: Warm-start from a prior model — load checkpoint, optionally freeze layers, adjust learning rate, and continue training.
4
+ argument-hint: "<exp-id> [--freeze-layers encoder] [--unfreeze-after 5]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Take a trained checkpoint and use it as initialization for a new experiment. Automates the "start from here but change X" pattern.
9
+
10
+ ## Steps
11
+
12
+ 1. **Activate environment:**
13
+ ```bash
14
+ source .venv/bin/activate
15
+ ```
16
+
17
+ 2. **Parse arguments from `$ARGUMENTS`:**
18
+ - First argument is the source experiment ID (required)
19
+ - `--freeze-layers encoder decoder` — layer names to freeze (neural only)
20
+ - `--unfreeze-after 5` — unfreeze all layers after N epochs (gradual unfreezing)
21
+ - `--lr-factor 0.1` — learning rate reduction factor (default: 0.1x)
22
+ - `--json` — raw JSON output
23
+
24
+ 3. **Run warm-start planner:**
25
+ ```bash
26
+ python scripts/warm_start.py $ARGUMENTS
27
+ ```
28
+
29
+ 4. **Report results:**
30
+ - Model type detection (tree, neural, sklearn)
31
+ - Strategy: continue_boosting, load_weights, or warm_start_param
32
+ - Numbered step-by-step instructions
33
+ - Config changes to apply
34
+ - Checkpoint info (path, format, size)
35
+
36
+ 5. **Strategies by model type:**
37
+ - **Tree models (XGBoost/LightGBM):** continue boosting from existing trees with more estimators
38
+ - **Neural networks:** load weights, optionally freeze layers, reset optimizer, reduce LR
39
+ - **scikit-learn:** use `warm_start=True` parameter for incremental learning
40
+
41
+ 6. **If no checkpoint found:** plan is still generated, but warns that checkpoint is needed
42
+
43
+ 7. **Saved output:** report written to `experiments/warm_starts/warm-<exp-id>.yaml`
44
+
45
+ ## Examples
46
+
47
+ ```
48
+ /turing:warm exp-042 # Auto-detect strategy
49
+ /turing:warm exp-042 --freeze-layers encoder # Freeze encoder layers
50
+ /turing:warm exp-042 --freeze-layers encoder --unfreeze-after 5 # Gradual unfreezing
51
+ /turing:warm exp-042 --lr-factor 0.01 # Very small fine-tuning LR
52
+ ```
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: watch
3
+ description: Live training monitor with early-warning alerts for loss spikes, NaN, overfitting, and metric plateaus.
4
+ argument-hint: "[--alerts] [--interval 10] [--analyze run.log]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Stream metrics during training with early-warning alerts. Catches problems mid-run instead of at the end.
9
+
10
+ ## Steps
11
+
12
+ 1. **Activate environment:**
13
+ ```bash
14
+ source .venv/bin/activate
15
+ ```
16
+
17
+ 2. **Parse arguments from `$ARGUMENTS`:**
18
+ - `--analyze run.log` — post-hoc analysis of a completed log (non-blocking)
19
+ - `--alerts` — show only alert lines, suppress normal output
20
+ - `--interval 10` — check interval in seconds (default: 10)
21
+ - `--alerts-config config/watch_alerts.yaml` — custom alert rules
22
+ - `--json` — raw JSON output (for `--analyze` mode)
23
+
24
+ 3. **For post-hoc analysis:**
25
+ ```bash
26
+ python scripts/training_monitor.py --analyze run.log
27
+ ```
28
+
29
+ 4. **For live monitoring (inform user):**
30
+ Live monitoring requires a running training process. Suggest the user run in a separate terminal:
31
+ ```bash
32
+ python scripts/training_monitor.py --log run.log --interval 10
33
+ ```
34
+
35
+ 5. **Alert types:**
36
+ - **Loss spike:** loss > 3x rolling mean (configurable multiplier)
37
+ - **NaN detected:** any metric is NaN — CRITICAL, suggests pausing
38
+ - **Overfitting onset:** train/val gap widening for 3+ consecutive epochs
39
+ - **Plateau:** metric improvement < 0.001 for 5+ consecutive epochs
40
+
41
+ 6. **Dashboard line format:**
42
+ ```
43
+ Epoch 23/100 | loss: 0.342 ↓ | acc: 0.865 ↑ | gap: 0.018 | ⚠ plateau
44
+ ```
45
+
46
+ 7. **Alert config:** rules are in `config/watch_alerts.yaml` — users can customize thresholds.
47
+
48
+ 8. **Saved output:** analysis report written to `experiments/monitors/analysis-*.yaml`
49
+
50
+ 9. **If no training log exists:** suggest running `/turing:train` first.
51
+
52
+ ## Examples
53
+
54
+ ```
55
+ /turing:watch --analyze run.log # Analyze completed training
56
+ /turing:watch --analyze run.log --json # JSON output for scripting
57
+ /turing:watch --alerts # Live: show only alerts
58
+ /turing:watch --interval 5 # Live: check every 5 seconds
59
+ ```
@@ -0,0 +1,30 @@
1
+ ---
2
+ name: whatif
3
+ description: What-if analysis — answer hypotheticals from existing experiment data without running new experiments.
4
+ argument-hint: "\"<question>\" [--json]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ Answer "what if?" questions using existing experiment data. Routes to the right estimator automatically.
9
+
10
+ ## Steps
11
+ 1. `source .venv/bin/activate`
12
+ 2. `python scripts/whatif_engine.py $ARGUMENTS`
13
+ 3. **Saved:** `experiments/whatif/`
14
+
15
+ ## Supported question types
16
+ - **Data scaling:** "what if I had 2x more data" → scaling law extrapolation
17
+ - **Ablation:** "what if I removed class 3" → ablation study data
18
+ - **Pipeline stitch:** "what if I combined exp-031 with exp-042" → stitch estimation
19
+ - **Hyperparameters:** "what if learning_rate was 0.01" → sensitivity interpolation
20
+ - **Ensemble:** "what if I ensembled the top models" → correlation analysis
21
+ - **Pruning:** "what if I pruned to 50% sparsity" → pruning sweep interpolation
22
+ - **Budget:** "what if I spent my budget on X vs Y" → budget allocation
23
+
24
+ ## Examples
25
+ ```
26
+ /turing:whatif "what if I had 2x more data"
27
+ /turing:whatif "what if I removed class 3"
28
+ /turing:whatif "what if I combined exp-031 with exp-042"
29
+ /turing:whatif "what if learning_rate was 0.01" --json
30
+ ```
@@ -0,0 +1,42 @@
1
+ ---
2
+ name: xray
3
+ description: Internal model diagnostics — gradient flow, dead neurons, activation stats, weight distributions, tree depth analysis.
4
+ argument-hint: "[exp-id] [--layer encoder.layer.2] [--compare exp-a exp-b]"
5
+ allowed-tools: Read, Bash(*), Grep, Glob
6
+ ---
7
+
8
+ See inside the model. When it underperforms, the fix depends on *why*.
9
+
10
+ ## Steps
11
+
12
+ 1. **Activate environment:**
13
+ ```bash
14
+ source .venv/bin/activate
15
+ ```
16
+
17
+ 2. **Parse arguments from `$ARGUMENTS`:**
18
+ - Optional experiment ID
19
+ - `--layer "name"` — focus on specific layer
20
+ - `--compare exp-a exp-b` — side-by-side diagnostics
21
+ - `--json` — raw JSON output
22
+
23
+ 3. **Run model diagnostics:**
24
+ ```bash
25
+ python scripts/model_xray.py $ARGUMENTS
26
+ ```
27
+
28
+ 4. **Diagnostics by model type:**
29
+ - **Neural networks:** gradient magnitudes, activation stats, dead neuron %, weight distributions, gradient-to-weight ratio
30
+ - **Tree models:** depth utilization, leaf purity, feature split dominance
31
+ - **scikit-learn:** coefficient magnitudes, feature importance concentration
32
+
33
+ 5. **Issues detected:** dead gradients, vanishing/exploding gradients, dead neurons, sparse weights, feature dominance, overfitting risk
34
+
35
+ 6. **Saved output:** report in `experiments/xrays/<exp-id>-xray.yaml`
36
+
37
+ ## Examples
38
+
39
+ ```
40
+ /turing:xray exp-042 # Full diagnostics
41
+ /turing:xray # Best experiment
42
+ ```
@@ -140,11 +140,32 @@ export async function getCommandNames(registryPath) {
140
140
  return registry.commandNames;
141
141
  }
142
142
 
143
+ // Installed public layout under .claude/commands/turing.
143
144
  export async function getExpectedCommandPaths(registryPath) {
144
145
  const names = await getCommandNames(registryPath);
145
146
  return ['SKILL.md', ...names.map((name) => `${name}/SKILL.md`)];
146
147
  }
147
148
 
149
+ // Editable repository source layout.
150
+ export async function getExpectedSkillSourcePaths(registryPath) {
151
+ const names = await getCommandNames(registryPath);
152
+ return [
153
+ 'skills/turing/SKILL.md',
154
+ ...names.map((name) => `skills/turing/${name}/SKILL.md`),
155
+ 'skills/turing/rules/loop-protocol.md',
156
+ ];
157
+ }
158
+
159
+ // Generated repository compatibility layout.
160
+ export async function getExpectedLegacyCommandCompatPaths(registryPath) {
161
+ const names = await getCommandNames(registryPath);
162
+ return [
163
+ 'commands/turing.md',
164
+ ...names.map((name) => `commands/${name}.md`),
165
+ 'commands/rules/loop-protocol.md',
166
+ ];
167
+ }
168
+
148
169
  export async function getConfigFiles(registryPath) {
149
170
  const registry = await loadCommandRegistry(registryPath);
150
171
  return registry.configFiles;
package/src/install.js CHANGED
@@ -18,6 +18,7 @@ import { getCommandNames, getConfigFiles } from "./command-registry.js";
18
18
 
19
19
  const __dirname = dirname(fileURLToPath(import.meta.url));
20
20
  const PLUGIN_ROOT = join(__dirname, "..");
21
+ const SKILL_SOURCE_ROOT = join(PLUGIN_ROOT, "skills", "turing");
21
22
 
22
23
 
23
24
  export async function install(opts = {}) {
@@ -37,7 +38,7 @@ export async function install(opts = {}) {
37
38
 
38
39
  // Copy root command (router) as SKILL.md
39
40
  await copyFile(
40
- join(PLUGIN_ROOT, "commands", "turing.md"),
41
+ join(SKILL_SOURCE_ROOT, "SKILL.md"),
41
42
  join(paths.commands, "SKILL.md"),
42
43
  );
43
44
  console.log(" Router -> SKILL.md");
@@ -45,7 +46,7 @@ export async function install(opts = {}) {
45
46
  // Copy sub-commands as <name>/SKILL.md
46
47
  for (const cmd of subCommands) {
47
48
  await copyFile(
48
- join(PLUGIN_ROOT, "commands", `${cmd}.md`),
49
+ join(SKILL_SOURCE_ROOT, cmd, "SKILL.md"),
49
50
  join(paths.commands, cmd, "SKILL.md"),
50
51
  );
51
52
  }
@@ -53,7 +54,7 @@ export async function install(opts = {}) {
53
54
 
54
55
  // Copy rules
55
56
  await copyFile(
56
- join(PLUGIN_ROOT, "commands", "rules", "loop-protocol.md"),
57
+ join(SKILL_SOURCE_ROOT, "rules", "loop-protocol.md"),
57
58
  join(paths.commands, "rules", "loop-protocol.md"),
58
59
  );
59
60
  console.log(" Rules installed");
@@ -0,0 +1,149 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Synchronize the legacy commands/ compatibility tree from skills/turing/.
4
+ *
5
+ * Usage:
6
+ * node src/sync-commands-layout.js [--check]
7
+ */
8
+
9
import { mkdir, readdir, readFile, rm, writeFile } from "fs/promises";
import { dirname, join, relative, resolve } from "path";
import { fileURLToPath } from "url";
import { getCommandNames } from "./command-registry.js";
13
+
14
+ const __dirname = dirname(fileURLToPath(import.meta.url));
15
+ const PLUGIN_ROOT = join(__dirname, "..");
16
+ const SKILLS_DIR = join(PLUGIN_ROOT, "skills", "turing");
17
+ const COMMANDS_DIR = join(PLUGIN_ROOT, "commands");
18
+
19
+ async function readUtf8(path) {
20
+ return readFile(path, "utf8");
21
+ }
22
+
23
+ async function copyTextFile(source, target) {
24
+ await mkdir(dirname(target), { recursive: true });
25
+ await writeFile(target, await readUtf8(source));
26
+ }
27
+
28
+ async function compatibilityEntries() {
29
+ const names = await getCommandNames();
30
+ return [
31
+ {
32
+ source: join(SKILLS_DIR, "SKILL.md"),
33
+ target: join(COMMANDS_DIR, "turing.md"),
34
+ },
35
+ ...names.map((name) => ({
36
+ source: join(SKILLS_DIR, name, "SKILL.md"),
37
+ target: join(COMMANDS_DIR, `${name}.md`),
38
+ })),
39
+ {
40
+ source: join(SKILLS_DIR, "rules", "loop-protocol.md"),
41
+ target: join(COMMANDS_DIR, "rules", "loop-protocol.md"),
42
+ },
43
+ ];
44
+ }
45
+
46
+ async function existingCompatibilityEntries(dir = COMMANDS_DIR) {
47
+ let entries;
48
+ try {
49
+ entries = await readdir(dir, { withFileTypes: true });
50
+ } catch (error) {
51
+ if (error.code === "ENOENT") {
52
+ return [];
53
+ }
54
+ throw error;
55
+ }
56
+
57
+ const paths = [];
58
+ for (const entry of entries) {
59
+ const path = join(dir, entry.name);
60
+ paths.push(path);
61
+ if (entry.isDirectory()) {
62
+ paths.push(...await existingCompatibilityEntries(path));
63
+ }
64
+ }
65
+ return paths;
66
+ }
67
+
68
+ async function findDrift() {
69
+ const entries = await compatibilityEntries();
70
+ const expectedTargets = new Set(entries.map(({ target }) => target));
71
+ const expectedPaths = new Set([COMMANDS_DIR]);
72
+ for (const target of expectedTargets) {
73
+ let current = target;
74
+ while (current.startsWith(COMMANDS_DIR)) {
75
+ expectedPaths.add(current);
76
+ if (current === COMMANDS_DIR) {
77
+ break;
78
+ }
79
+ current = dirname(current);
80
+ }
81
+ }
82
+ const issues = [];
83
+
84
+ for (const { source, target } of entries) {
85
+ let sourceText;
86
+ try {
87
+ sourceText = await readUtf8(source);
88
+ } catch (error) {
89
+ issues.push(`missing source ${relative(PLUGIN_ROOT, source)}: ${error.message}`);
90
+ continue;
91
+ }
92
+
93
+ let targetText;
94
+ try {
95
+ targetText = await readUtf8(target);
96
+ } catch (error) {
97
+ if (error.code === "ENOENT") {
98
+ issues.push(`missing compatibility file ${relative(PLUGIN_ROOT, target)}`);
99
+ } else {
100
+ issues.push(`cannot read compatibility file ${relative(PLUGIN_ROOT, target)}: ${error.message}`);
101
+ }
102
+ continue;
103
+ }
104
+
105
+ if (targetText !== sourceText) {
106
+ issues.push(`diverged compatibility file ${relative(PLUGIN_ROOT, target)}`);
107
+ }
108
+ }
109
+
110
+ for (const path of await existingCompatibilityEntries()) {
111
+ if (!expectedPaths.has(path)) {
112
+ issues.push(`stale compatibility path ${relative(PLUGIN_ROOT, path)}`);
113
+ }
114
+ }
115
+
116
+ return issues;
117
+ }
118
+
119
+ export async function syncCommandsLayout({ check = false } = {}) {
120
+ if (check) {
121
+ const issues = await findDrift();
122
+ if (issues.length > 0) {
123
+ for (const issue of issues) {
124
+ console.error(issue);
125
+ }
126
+ process.exitCode = 1;
127
+ return;
128
+ }
129
+ console.log("commands compatibility tree is in sync");
130
+ return;
131
+ }
132
+
133
+ await rm(COMMANDS_DIR, { recursive: true, force: true });
134
+ for (const { source, target } of await compatibilityEntries()) {
135
+ await copyTextFile(source, target);
136
+ }
137
+ console.log("commands compatibility tree synchronized");
138
+ }
139
+
140
+ const isDirectRun =
141
+ process.argv[1] &&
142
+ fileURLToPath(import.meta.url).endsWith(process.argv[1].replace(/^.*\//, ""));
143
+
144
+ if (isDirectRun) {
145
+ syncCommandsLayout({ check: process.argv.includes("--check") }).catch((error) => {
146
+ console.error(error.message);
147
+ process.exitCode = 1;
148
+ });
149
+ }