npm - claude-turing - Versions diffs - 4.7.0 → 4.8.1 - Mend

claude-turing 4.7.0 → 4.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (172)

package/.claude-plugin/plugin.json +2 -2
package/README.md +1 -1
package/agents/ml-evaluator.md +4 -4
package/agents/ml-researcher.md +2 -2
package/bin/turing-init.sh +2 -2
package/commands/ablate.md +3 -4
package/commands/annotate.md +2 -3
package/commands/archive.md +2 -3
package/commands/audit.md +3 -4
package/commands/baseline.md +3 -4
package/commands/brief.md +5 -6
package/commands/budget.md +3 -4
package/commands/calibrate.md +3 -4
package/commands/card.md +3 -4
package/commands/changelog.md +2 -3
package/commands/checkpoint.md +3 -4
package/commands/cite.md +2 -3
package/commands/compare.md +1 -2
package/commands/counterfactual.md +2 -3
package/commands/curriculum.md +3 -4
package/commands/design.md +3 -4
package/commands/diagnose.md +4 -5
package/commands/diff.md +3 -4
package/commands/distill.md +3 -4
package/commands/doctor.md +2 -3
package/commands/ensemble.md +3 -4
package/commands/explore.md +4 -5
package/commands/export.md +3 -4
package/commands/feature.md +3 -4
package/commands/flashback.md +2 -3
package/commands/fork.md +3 -4
package/commands/frontier.md +3 -4
package/commands/init.md +5 -6
package/commands/leak.md +3 -4
package/commands/lit.md +3 -4
package/commands/logbook.md +5 -6
package/commands/merge.md +2 -3
package/commands/mode.md +1 -2
package/commands/onboard.md +2 -3
package/commands/paper.md +3 -4
package/commands/plan.md +2 -3
package/commands/poster.md +3 -4
package/commands/postmortem.md +2 -3
package/commands/preflight.md +5 -6
package/commands/present.md +2 -3
package/commands/profile.md +3 -4
package/commands/prune.md +2 -3
package/commands/quantize.md +2 -3
package/commands/queue.md +3 -4
package/commands/registry.md +2 -3
package/commands/regress.md +3 -4
package/commands/replay.md +2 -3
package/commands/report.md +3 -4
package/commands/reproduce.md +3 -4
package/commands/retry.md +3 -4
package/commands/review.md +2 -3
package/commands/rules/loop-protocol.md +11 -11
package/commands/sanity.md +3 -4
package/commands/scale.md +4 -5
package/commands/search.md +2 -3
package/commands/seed.md +3 -4
package/commands/sensitivity.md +3 -4
package/commands/share.md +2 -3
package/commands/simulate.md +2 -3
package/commands/status.md +1 -2
package/commands/stitch.md +3 -4
package/commands/suggest.md +5 -6
package/commands/surgery.md +2 -3
package/commands/sweep.md +8 -9
package/commands/template.md +2 -3
package/commands/train.md +5 -6
package/commands/transfer.md +3 -4
package/commands/trend.md +2 -3
package/commands/try.md +4 -5
package/commands/turing.md +3 -3
package/commands/update.md +2 -3
package/commands/validate.md +4 -5
package/commands/warm.md +3 -4
package/commands/watch.md +4 -5
package/commands/whatif.md +2 -3
package/commands/xray.md +3 -4
package/config/commands.yaml +75 -75
package/package.json +3 -2
package/skills/turing/SKILL.md +3 -3
package/skills/turing/ablate/SKILL.md +3 -4
package/skills/turing/annotate/SKILL.md +2 -3
package/skills/turing/archive/SKILL.md +2 -3
package/skills/turing/audit/SKILL.md +3 -4
package/skills/turing/baseline/SKILL.md +3 -4
package/skills/turing/brief/SKILL.md +5 -6
package/skills/turing/budget/SKILL.md +3 -4
package/skills/turing/calibrate/SKILL.md +3 -4
package/skills/turing/card/SKILL.md +3 -4
package/skills/turing/changelog/SKILL.md +2 -3
package/skills/turing/checkpoint/SKILL.md +3 -4
package/skills/turing/cite/SKILL.md +2 -3
package/skills/turing/compare/SKILL.md +1 -2
package/skills/turing/counterfactual/SKILL.md +2 -3
package/skills/turing/curriculum/SKILL.md +3 -4
package/skills/turing/design/SKILL.md +3 -4
package/skills/turing/diagnose/SKILL.md +4 -5
package/skills/turing/diff/SKILL.md +3 -4
package/skills/turing/distill/SKILL.md +3 -4
package/skills/turing/doctor/SKILL.md +2 -3
package/skills/turing/ensemble/SKILL.md +3 -4
package/skills/turing/explore/SKILL.md +4 -5
package/skills/turing/export/SKILL.md +3 -4
package/skills/turing/feature/SKILL.md +3 -4
package/skills/turing/flashback/SKILL.md +2 -3
package/skills/turing/fork/SKILL.md +3 -4
package/skills/turing/frontier/SKILL.md +3 -4
package/skills/turing/init/SKILL.md +5 -6
package/skills/turing/leak/SKILL.md +3 -4
package/skills/turing/lit/SKILL.md +3 -4
package/skills/turing/logbook/SKILL.md +5 -6
package/skills/turing/merge/SKILL.md +2 -3
package/skills/turing/mode/SKILL.md +1 -2
package/skills/turing/onboard/SKILL.md +2 -3
package/skills/turing/paper/SKILL.md +3 -4
package/skills/turing/plan/SKILL.md +2 -3
package/skills/turing/poster/SKILL.md +3 -4
package/skills/turing/postmortem/SKILL.md +2 -3
package/skills/turing/preflight/SKILL.md +5 -6
package/skills/turing/present/SKILL.md +2 -3
package/skills/turing/profile/SKILL.md +3 -4
package/skills/turing/prune/SKILL.md +2 -3
package/skills/turing/quantize/SKILL.md +2 -3
package/skills/turing/queue/SKILL.md +3 -4
package/skills/turing/registry/SKILL.md +2 -3
package/skills/turing/regress/SKILL.md +3 -4
package/skills/turing/replay/SKILL.md +2 -3
package/skills/turing/report/SKILL.md +3 -4
package/skills/turing/reproduce/SKILL.md +3 -4
package/skills/turing/retry/SKILL.md +3 -4
package/skills/turing/review/SKILL.md +2 -3
package/skills/turing/rules/loop-protocol.md +11 -11
package/skills/turing/sanity/SKILL.md +3 -4
package/skills/turing/scale/SKILL.md +4 -5
package/skills/turing/search/SKILL.md +2 -3
package/skills/turing/seed/SKILL.md +3 -4
package/skills/turing/sensitivity/SKILL.md +3 -4
package/skills/turing/share/SKILL.md +2 -3
package/skills/turing/simulate/SKILL.md +2 -3
package/skills/turing/status/SKILL.md +1 -2
package/skills/turing/stitch/SKILL.md +3 -4
package/skills/turing/suggest/SKILL.md +5 -6
package/skills/turing/surgery/SKILL.md +2 -3
package/skills/turing/sweep/SKILL.md +8 -9
package/skills/turing/template/SKILL.md +2 -3
package/skills/turing/train/SKILL.md +5 -6
package/skills/turing/transfer/SKILL.md +3 -4
package/skills/turing/trend/SKILL.md +2 -3
package/skills/turing/try/SKILL.md +4 -5
package/skills/turing/update/SKILL.md +2 -3
package/skills/turing/validate/SKILL.md +4 -5
package/skills/turing/warm/SKILL.md +3 -4
package/skills/turing/watch/SKILL.md +4 -5
package/skills/turing/whatif/SKILL.md +2 -3
package/skills/turing/xray/SKILL.md +3 -4
package/src/command-registry.js +12 -0
package/src/install.js +4 -3
package/src/sync-commands-layout.js +149 -0
package/src/sync-skills-layout.js +4 -133
package/templates/README.md +5 -8
package/templates/program.md +18 -18
package/templates/pyproject.toml +10 -0
package/templates/requirements.txt +4 -1
package/templates/scripts/generate_onboarding.py +1 -1
package/templates/scripts/post-train-hook.sh +7 -8
package/templates/scripts/scaffold.py +24 -26
package/templates/scripts/stop-hook.sh +2 -3
package/templates/scripts/turing-run-python.sh +9 -0

package/skills/turing/watch/SKILL.md CHANGED

@@ -1,7 +1,6 @@
 ---
 name: watch
 description: Live training monitor with early-warning alerts for loss spikes, NaN, overfitting, and metric plateaus.
-disable-model-invocation: true
 argument-hint: "[--alerts] [--interval 10] [--analyze run.log]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,9 +9,9 @@ Stream metrics during training with early-warning alerts. Catches problems mid-r
 ## Steps
-1. **Activate environment:**
+1. **Sync environment:**
    ```bash
-   source .venv/bin/activate
+   uv sync
    ```
 2. **Parse arguments from `$ARGUMENTS`:**
@@ -24,13 +23,13 @@ Stream metrics during training with early-warning alerts. Catches problems mid-r
 3. **For post-hoc analysis:**
    ```bash
-   python scripts/training_monitor.py --analyze run.log
+   uv run python scripts/training_monitor.py --analyze run.log
    ```
 4. **For live monitoring (inform user):**
    Live monitoring requires a running training process. Suggest the user run in a separate terminal:
    ```bash
-   python scripts/training_monitor.py --log run.log --interval 10
+   uv run python scripts/training_monitor.py --log run.log --interval 10
    ```
 5. **Alert types:**

package/skills/turing/whatif/SKILL.md CHANGED

@@ -1,7 +1,6 @@
 ---
 name: whatif
 description: What-if analysis — answer hypotheticals from existing experiment data without running new experiments.
-disable-model-invocation: true
 argument-hint: "\"<question>\" [--json]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -9,8 +8,8 @@ allowed-tools: Read, Bash(*), Grep, Glob
 Answer "what if?" questions using existing experiment data. Routes to the right estimator automatically.
 ## Steps
-1. `source .venv/bin/activate`
-2. `python scripts/whatif_engine.py $ARGUMENTS`
+1. `uv sync`
+2. `uv run python scripts/whatif_engine.py $ARGUMENTS`
 3. **Saved:** `experiments/whatif/`
 ## Supported question types

package/skills/turing/xray/SKILL.md CHANGED

@@ -1,7 +1,6 @@
 ---
 name: xray
 description: Internal model diagnostics — gradient flow, dead neurons, activation stats, weight distributions, tree depth analysis.
-disable-model-invocation: true
 argument-hint: "[exp-id] [--layer encoder.layer.2] [--compare exp-a exp-b]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,9 +9,9 @@ See inside the model. When it underperforms, the fix depends on *why*.
 ## Steps
-1. **Activate environment:**
+1. **Sync environment:**
    ```bash
-   source .venv/bin/activate
+   uv sync
    ```
 2. **Parse arguments from `$ARGUMENTS`:**
@@ -23,7 +22,7 @@ See inside the model. When it underperforms, the fix depends on *why*.
 3. **Run model diagnostics:**
    ```bash
-   python scripts/model_xray.py $ARGUMENTS
+   uv run python scripts/model_xray.py $ARGUMENTS
    ```
 4. **Diagnostics by model type:**

package/src/command-registry.js CHANGED

@@ -140,11 +140,13 @@ export async function getCommandNames(registryPath) {
   return registry.commandNames;
 }
+// Installed public layout under .claude/commands/turing.
 export async function getExpectedCommandPaths(registryPath) {
   const names = await getCommandNames(registryPath);
   return ['SKILL.md', ...names.map((name) => `${name}/SKILL.md`)];
 }
+// Editable repository source layout.
 export async function getExpectedSkillSourcePaths(registryPath) {
   const names = await getCommandNames(registryPath);
   return [
@@ -154,6 +156,16 @@ export async function getExpectedSkillSourcePaths(registryPath) {
   ];
 }
+// Generated repository compatibility layout.
+export async function getExpectedLegacyCommandCompatPaths(registryPath) {
+  const names = await getCommandNames(registryPath);
+  return [
+    'commands/turing.md',
+    ...names.map((name) => `commands/${name}.md`),
+    'commands/rules/loop-protocol.md',
+  ];
+}
 export async function getConfigFiles(registryPath) {
   const registry = await loadCommandRegistry(registryPath);
   return registry.configFiles;
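
The hunk above distinguishes an installed layout from a generated compatibility layout. As a rough illustration only, the sketch below reproduces the two path shapes for a hypothetical list of command names; the function names `installedPaths` and `legacyCompatPaths` and the sample names are assumptions for this example, and the real functions read the names from the registry via `getCommandNames`.

```javascript
// Hypothetical command names; the package loads the real list from its registry.
const sampleNames = ["watch", "whatif", "xray"];

// Installed layout, same shape as getExpectedCommandPaths in the diff.
function installedPaths(names) {
  return ["SKILL.md", ...names.map((name) => `${name}/SKILL.md`)];
}

// Generated compatibility layout, same shape as
// getExpectedLegacyCommandCompatPaths in the diff.
function legacyCompatPaths(names) {
  return [
    "commands/turing.md",
    ...names.map((name) => `commands/${name}.md`),
    "commands/rules/loop-protocol.md",
  ];
}

console.log(installedPaths(sampleNames));
console.log(legacyCompatPaths(sampleNames));
```

Each command name therefore yields one `<name>/SKILL.md` entry in the installed layout and one flat `commands/<name>.md` entry in the compatibility layout.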

package/src/install.js CHANGED

@@ -18,6 +18,7 @@ import { getCommandNames, getConfigFiles } from "./command-registry.js";
 const __dirname = dirname(fileURLToPath(import.meta.url));
 const PLUGIN_ROOT = join(__dirname, "..");
+const SKILL_SOURCE_ROOT = join(PLUGIN_ROOT, "skills", "turing");
 export async function install(opts = {}) {
@@ -37,7 +38,7 @@ export async function install(opts = {}) {
   // Copy root command (router) as SKILL.md
   await copyFile(
-    join(PLUGIN_ROOT, "commands", "turing.md"),
+    join(SKILL_SOURCE_ROOT, "SKILL.md"),
     join(paths.commands, "SKILL.md"),
   );
   console.log("  Router -> SKILL.md");
@@ -45,7 +46,7 @@ export async function install(opts = {}) {
   // Copy sub-commands as <name>/SKILL.md
   for (const cmd of subCommands) {
     await copyFile(
-      join(PLUGIN_ROOT, "commands", `${cmd}.md`),
+      join(SKILL_SOURCE_ROOT, cmd, "SKILL.md"),
       join(paths.commands, cmd, "SKILL.md"),
     );
   }
@@ -53,7 +54,7 @@ export async function install(opts = {}) {
   // Copy rules
   await copyFile(
-    join(PLUGIN_ROOT, "commands", "rules", "loop-protocol.md"),
+    join(SKILL_SOURCE_ROOT, "rules", "loop-protocol.md"),
     join(paths.commands, "rules", "loop-protocol.md"),
   );
   console.log("  Rules installed");

package/src/sync-commands-layout.js ADDED

@@ -0,0 +1,149 @@
+#!/usr/bin/env node
+/**
+ * Synchronize the legacy commands/ compatibility tree from skills/turing/.
+ *
+ * Usage:
+ *   node src/sync-commands-layout.js [--check]
+ */
+import { mkdir, readdir, readFile, rm, writeFile } from "fs/promises";
+import { dirname, join, relative } from "path";
+import { fileURLToPath } from "url";
+import { getCommandNames } from "./command-registry.js";
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const PLUGIN_ROOT = join(__dirname, "..");
+const SKILLS_DIR = join(PLUGIN_ROOT, "skills", "turing");
+const COMMANDS_DIR = join(PLUGIN_ROOT, "commands");
+async function readUtf8(path) {
+  return readFile(path, "utf8");
+}
+async function copyTextFile(source, target) {
+  await mkdir(dirname(target), { recursive: true });
+  await writeFile(target, await readUtf8(source));
+}
+async function compatibilityEntries() {
+  const names = await getCommandNames();
+  return [
+    {
+      source: join(SKILLS_DIR, "SKILL.md"),
+      target: join(COMMANDS_DIR, "turing.md"),
+    },
+    ...names.map((name) => ({
+      source: join(SKILLS_DIR, name, "SKILL.md"),
+      target: join(COMMANDS_DIR, `${name}.md`),
+    })),
+    {
+      source: join(SKILLS_DIR, "rules", "loop-protocol.md"),
+      target: join(COMMANDS_DIR, "rules", "loop-protocol.md"),
+    },
+  ];
+}
+async function existingCompatibilityEntries(dir = COMMANDS_DIR) {
+  let entries;
+  try {
+    entries = await readdir(dir, { withFileTypes: true });
+  } catch (error) {
+    if (error.code === "ENOENT") {
+      return [];
+    }
+    throw error;
+  }
+  const paths = [];
+  for (const entry of entries) {
+    const path = join(dir, entry.name);
+    paths.push(path);
+    if (entry.isDirectory()) {
+      paths.push(...await existingCompatibilityEntries(path));
+    }
+  }
+  return paths;
+}
+async function findDrift() {
+  const entries = await compatibilityEntries();
+  const expectedTargets = new Set(entries.map(({ target }) => target));
+  const expectedPaths = new Set([COMMANDS_DIR]);
+  for (const target of expectedTargets) {
+    let current = target;
+    while (current.startsWith(COMMANDS_DIR)) {
+      expectedPaths.add(current);
+      if (current === COMMANDS_DIR) {
+        break;
+      }
+      current = dirname(current);
+    }
+  }
+  const issues = [];
+  for (const { source, target } of entries) {
+    let sourceText;
+    try {
+      sourceText = await readUtf8(source);
+    } catch (error) {
+      issues.push(`missing source ${relative(PLUGIN_ROOT, source)}: ${error.message}`);
+      continue;
+    }
+    let targetText;
+    try {
+      targetText = await readUtf8(target);
+    } catch (error) {
+      if (error.code === "ENOENT") {
+        issues.push(`missing compatibility file ${relative(PLUGIN_ROOT, target)}`);
+      } else {
+        issues.push(`cannot read compatibility file ${relative(PLUGIN_ROOT, target)}: ${error.message}`);
+      }
+      continue;
+    }
+    if (targetText !== sourceText) {
+      issues.push(`diverged compatibility file ${relative(PLUGIN_ROOT, target)}`);
+    }
+  }
+  for (const path of await existingCompatibilityEntries()) {
+    if (!expectedPaths.has(path)) {
+      issues.push(`stale compatibility path ${relative(PLUGIN_ROOT, path)}`);
+    }
+  }
+  return issues;
+}
+export async function syncCommandsLayout({ check = false } = {}) {
+  if (check) {
+    const issues = await findDrift();
+    if (issues.length > 0) {
+      for (const issue of issues) {
+        console.error(issue);
+      }
+      process.exitCode = 1;
+      return;
+    }
+    console.log("commands compatibility tree is in sync");
+    return;
+  }
+  await rm(COMMANDS_DIR, { recursive: true, force: true });
+  for (const { source, target } of await compatibilityEntries()) {
+    await copyTextFile(source, target);
+  }
+  console.log("commands compatibility tree synchronized");
+}
+const isDirectRun =
+  process.argv[1] &&
+  fileURLToPath(import.meta.url).endsWith(process.argv[1].replace(/^.*\//, ""));
+if (isDirectRun) {
+  syncCommandsLayout({ check: process.argv.includes("--check") }).catch((error) => {
+    console.error(error.message);
+    process.exitCode = 1;
+  });
+}
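
The `--check` path in the file above reports four kinds of drift: missing source, missing compatibility file, diverged compatibility file, and stale compatibility path. The sketch below is a simplified in-memory variant, not the package's code: `findDriftInMemory` is a hypothetical name, it takes `Map` objects of path to file text instead of reading the filesystem, and it does not track directories the way the real `findDrift` does.

```javascript
// Hypothetical in-memory variant of findDrift.
// pairs:   array of { source, target } path pairs
// sources: Map of source path -> file text
// targets: Map of target path -> file text
function findDriftInMemory(pairs, sources, targets) {
  const issues = [];
  const expectedTargets = new Set(pairs.map(({ target }) => target));

  for (const { source, target } of pairs) {
    if (!sources.has(source)) {
      issues.push(`missing source ${source}`);
      continue;
    }
    if (!targets.has(target)) {
      issues.push(`missing compatibility file ${target}`);
      continue;
    }
    if (targets.get(target) !== sources.get(source)) {
      issues.push(`diverged compatibility file ${target}`);
    }
  }

  // Anything present in the target tree that no pair produces is stale.
  for (const path of targets.keys()) {
    if (!expectedTargets.has(path)) {
      issues.push(`stale compatibility path ${path}`);
    }
  }
  return issues;
}

// Sample data (all file contents are made up for the example).
const samplePairs = [
  { source: "skills/turing/SKILL.md", target: "commands/turing.md" },
  { source: "skills/turing/watch/SKILL.md", target: "commands/watch.md" },
];
const sampleSources = new Map([
  ["skills/turing/SKILL.md", "router"],
  ["skills/turing/watch/SKILL.md", "watch v2"],
]);
const sampleTargets = new Map([
  ["commands/turing.md", "router"],
  ["commands/watch.md", "watch v1"],
  ["commands/old.md", "leftover"],
]);

console.log(findDriftInMemory(samplePairs, sampleSources, sampleTargets));
```

With this sample data the sketch reports one diverged file (`commands/watch.md`) and one stale path (`commands/old.md`); an empty result corresponds to the "in sync" message in the real script.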

package/src/sync-skills-layout.js CHANGED

@@ -1,148 +1,19 @@
 #!/usr/bin/env node
 /**
- * Synchronize the modern skills/turing package mirror from commands/.
+ * Backward-compatible wrapper for the flipped source layout.
  *
- * Usage:
- *   node src/sync-skills-layout.js [--check]
+ * The editable source is now skills/turing/, and sync generates commands/.
  */
-import { mkdir, readdir, readFile, rm, writeFile } from "fs/promises";
-import { dirname, join, relative } from "path";
 import { fileURLToPath } from "url";
-import { getCommandNames } from "./command-registry.js";
-const __dirname = dirname(fileURLToPath(import.meta.url));
-const PLUGIN_ROOT = join(__dirname, "..");
-const COMMANDS_DIR = join(PLUGIN_ROOT, "commands");
-const SKILLS_DIR = join(PLUGIN_ROOT, "skills", "turing");
-async function readUtf8(path) {
-  return readFile(path, "utf8");
-}
-async function copyTextFile(source, target) {
-  await mkdir(dirname(target), { recursive: true });
-  await writeFile(target, await readUtf8(source));
-}
-async function mirrorEntries() {
-  const names = await getCommandNames();
-  return [
-    {
-      source: join(COMMANDS_DIR, "turing.md"),
-      target: join(SKILLS_DIR, "SKILL.md"),
-    },
-    ...names.map((name) => ({
-      source: join(COMMANDS_DIR, `${name}.md`),
-      target: join(SKILLS_DIR, name, "SKILL.md"),
-    })),
-    {
-      source: join(COMMANDS_DIR, "rules", "loop-protocol.md"),
-      target: join(SKILLS_DIR, "rules", "loop-protocol.md"),
-    },
-  ];
-}
-async function existingMirrorEntries(dir = SKILLS_DIR) {
-  let entries;
-  try {
-    entries = await readdir(dir, { withFileTypes: true });
-  } catch (error) {
-    if (error.code === "ENOENT") {
-      return [];
-    }
-    throw error;
-  }
-  const paths = [];
-  for (const entry of entries) {
-    const path = join(dir, entry.name);
-    paths.push(path);
-    if (entry.isDirectory()) {
-      paths.push(...await existingMirrorEntries(path));
-    }
-  }
-  return paths;
-}
-async function findDrift() {
-  const entries = await mirrorEntries();
-  const expectedTargets = new Set(entries.map(({ target }) => target));
-  const expectedPaths = new Set([SKILLS_DIR]);
-  for (const target of expectedTargets) {
-    let current = target;
-    while (current.startsWith(SKILLS_DIR)) {
-      expectedPaths.add(current);
-      if (current === SKILLS_DIR) {
-        break;
-      }
-      current = dirname(current);
-    }
-  }
-  const issues = [];
-  for (const { source, target } of entries) {
-    let sourceText;
-    try {
-      sourceText = await readUtf8(source);
-    } catch (error) {
-      issues.push(`missing source ${relative(PLUGIN_ROOT, source)}: ${error.message}`);
-      continue;
-    }
-    let targetText;
-    try {
-      targetText = await readUtf8(target);
-    } catch (error) {
-      if (error.code === "ENOENT") {
-        issues.push(`missing mirror ${relative(PLUGIN_ROOT, target)}`);
-      } else {
-        issues.push(`cannot read mirror ${relative(PLUGIN_ROOT, target)}: ${error.message}`);
-      }
-      continue;
-    }
-    if (targetText !== sourceText) {
-      issues.push(`diverged mirror ${relative(PLUGIN_ROOT, target)}`);
-    }
-  }
-  for (const path of await existingMirrorEntries()) {
-    if (!expectedPaths.has(path)) {
-      issues.push(`stale mirror ${relative(PLUGIN_ROOT, path)}`);
-    }
-  }
-  return issues;
-}
-export async function syncSkillsLayout({ check = false } = {}) {
-  if (check) {
-    const issues = await findDrift();
-    if (issues.length > 0) {
-      for (const issue of issues) {
-        console.error(issue);
-      }
-      process.exitCode = 1;
-      return;
-    }
-    console.log("skills/turing mirror is in sync");
-    return;
-  }
-  await rm(SKILLS_DIR, { recursive: true, force: true });
-  for (const { source, target } of await mirrorEntries()) {
-    await copyTextFile(source, target);
-  }
-  console.log("skills/turing mirror synchronized");
-}
+import { syncCommandsLayout } from "./sync-commands-layout.js";
 const isDirectRun =
   process.argv[1] &&
   fileURLToPath(import.meta.url).endsWith(process.argv[1].replace(/^.*\//, ""));
 if (isDirectRun) {
-  syncSkillsLayout({ check: process.argv.includes("--check") }).catch((error) => {
+  syncCommandsLayout({ check: process.argv.includes("--check") }).catch((error) => {
     console.error(error.message);
     process.exitCode = 1;
   });
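
Both sync scripts keep the same `isDirectRun` guard, and the wrapper now imports `sync-commands-layout.js`. The sketch below isolates that expression as a hypothetical standalone function (the paths are made up) to show why, under this reading, running the wrapper should not also trigger the imported module's own direct-run branch. Unlike the original expression it always returns a boolean.

```javascript
// Hypothetical standalone version of the isDirectRun expression: compares the
// module's file path with the basename of the script path passed to node.
function isDirectRun(modulePath, argv1) {
  if (!argv1) return false;
  const scriptName = argv1.replace(/^.*\//, ""); // drop everything up to the last "/"
  return modulePath.endsWith(scriptName);
}

// Wrapper module evaluated while running the wrapper: matches.
console.log(isDirectRun("/pkg/src/sync-skills-layout.js", "src/sync-skills-layout.js"));

// Imported module evaluated while running the wrapper: does not match.
console.log(isDirectRun("/pkg/src/sync-commands-layout.js", "src/sync-skills-layout.js"));
```

Because the comparison is a suffix match on the file name, a script whose name is a suffix of the module's file name (for example a hypothetical `layout.js`) would also match; the sketch keeps that behavior rather than correcting it.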

package/templates/README.md CHANGED

@@ -21,23 +21,21 @@ This separation is the invariant that makes experiment comparisons valid.
 ```bash
 # 1. Set up the environment
-python -m venv .venv
-source .venv/bin/activate
-pip install -r requirements.txt
+uv sync
 # 2. Add your training data to {{DATA_SOURCE}}
 # 3. Create train/val/test splits
-python prepare.py
+uv run python prepare.py
 # 4. Run training
-python train.py > run.log 2>&1
+uv run python train.py > run.log 2>&1
 # 5. Check results
 grep -A 10 "^---" run.log
 # 6. View experiment history
-python scripts/show_metrics.py
+uv run python scripts/show_metrics.py
 ```
 ## Using the Autoresearch Agent
@@ -88,6 +86,5 @@ For hands-off mode: `/loop 5m /turing:train`
 ## Running Tests
 ```bash
-source .venv/bin/activate
-python -m pytest tests/ -v
+uv run pytest tests/ -v
 ```

package/templates/program.md CHANGED

@@ -54,11 +54,11 @@ Update it after each experiment with:
 For systematic hyperparameter search:
 1. Edit `sweep_config.yaml` with parameter ranges
-2. Generate queue: `python scripts/sweep.py`
-3. Check status: `python scripts/sweep.py --status`
-4. Get next: `python scripts/sweep.py --next`
+2. Generate queue: `uv run python scripts/sweep.py`
+3. Check status: `uv run python scripts/sweep.py --status`
+4. Get next: `uv run python scripts/sweep.py --next`
 5. Apply overrides, create branch, run training
-6. Mark done: `python scripts/sweep.py --mark <name> complete|failed`
+6. Mark done: `uv run python scripts/sweep.py --mark <name> complete|failed`
 ## THE LOOP
@@ -66,8 +66,8 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
 1. **OBSERVE** — Read recent results, check hypothesis queue, research plan, and review failed diffs:
    ```bash
-   python scripts/show_metrics.py --last 5
-   python scripts/manage_hypotheses.py next 2>/dev/null || echo "No queued hypotheses"
+   uv run python scripts/show_metrics.py --last 5
+   uv run python scripts/manage_hypotheses.py next 2>/dev/null || echo "No queued hypotheses"
    cat RESEARCH_PLAN.md 2>/dev/null || true
    ```
@@ -88,12 +88,12 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
    **If using a queued hypothesis:**
    ```bash
-   python scripts/manage_hypotheses.py mark hyp-NNN in-progress
+   uv run python scripts/manage_hypotheses.py mark hyp-NNN in-progress
    ```
    **If generating your own hypothesis**, register it with structured detail:
    ```bash
-   python scripts/manage_hypotheses.py add "your hypothesis description" \
+   uv run python scripts/manage_hypotheses.py add "your hypothesis description" \
      --priority medium --source agent \
      --model-type xgboost \
      --hyperparams '{"max_depth": 8, "n_estimators": 200}' \
@@ -101,7 +101,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
      --tags "depth,estimators" \
      --parent exp-NNN \
      --expected "deeper trees should capture feature interactions"
-   python scripts/manage_hypotheses.py mark hyp-NNN in-progress
+   uv run python scripts/manage_hypotheses.py mark hyp-NNN in-progress
    ```
    This creates both an index entry in `hypotheses.yaml` and a detailed file at `hypotheses/hyp-NNN.yaml` with full architecture, hyperparameters, expected outcome, and lineage.
@@ -110,7 +110,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
    To read a hypothesis's full detail:
    ```bash
-   python scripts/manage_hypotheses.py show hyp-NNN
+   uv run python scripts/manage_hypotheses.py show hyp-NNN
    ```
 3. **PREPARE** — Modify `config.yaml` for hyperparameter changes. Only modify `train.py` for structural code changes.
@@ -122,7 +122,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
 5. **EXECUTE** training:
    ```bash
-   source .venv/bin/activate && python train.py > run.log 2>&1
+   uv run python train.py > run.log 2>&1
    ```
 6. **MEASURE** — Parse metrics from run.log:
@@ -144,7 +144,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
 8. **RECORD** — Log the experiment (kept or discarded):
    ```bash
-   python scripts/log_experiment.py experiments/log.jsonl exp-NNN kept|discarded \
+   uv run python scripts/log_experiment.py experiments/log.jsonl exp-NNN kept|discarded \
      '{"{{TARGET_METRIC}}": X.XX, ...}' \
      '{"model_type": "xgboost", "hyperparams": {...}}' \
      models/model.joblib "Description of hypothesis and outcome"
@@ -152,7 +152,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
    Update the hypothesis status with result metrics:
    ```bash
-   python scripts/manage_hypotheses.py mark hyp-NNN tested \
+   uv run python scripts/manage_hypotheses.py mark hyp-NNN tested \
      --result exp-NNN \
      --metrics '{"{{TARGET_METRIC}}": X.XX, ...}' \
      --notes "Brief explanation of what happened and why"
@@ -162,7 +162,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
    Then synthesize a decision packet and auto-queue follow-ups:
    ```bash
-   python scripts/synthesize_decision.py --experiment exp-NNN --auto-queue
+   uv run python scripts/synthesize_decision.py --experiment exp-NNN --auto-queue
    ```
    This produces a verdict (promote/branch_followup/abandon/fix_and_retry) and automatically queues follow-up hypotheses for `branch_followup` and `fix_and_retry` outcomes.
@@ -172,7 +172,7 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
    - Report final best model and recommend next steps
    - **Before declaring final results**, run a seed study to verify robustness:
      ```bash
-     python scripts/seed_runner.py --quick
+     uv run python scripts/seed_runner.py --quick
      ```
      If CV > 5%, the result is seed-sensitive — report mean ± std, not a single-seed number.
@@ -180,9 +180,9 @@ The autoresearch experiment loop. Each iteration is one experiment — one hypot
 ## Execution Rules
-- **ALWAYS redirect output:** `python train.py > run.log 2>&1`
+- **ALWAYS redirect output:** `uv run python train.py > run.log 2>&1`
 - **ALWAYS parse with grep:** `grep -A 10 "^---" run.log | head -10`
-- **ALWAYS activate venv:** `source .venv/bin/activate`
+- **ALWAYS run Python through uv:** `uv run python ...`
 - **NEVER install packages** without human approval
 ## Strategy Escalation Protocol
@@ -219,5 +219,5 @@ Starting suggestions (ordered by expected impact):
 ## Comparing Runs
 ```bash
-python scripts/compare_runs.py exp-001 exp-002
+uv run python scripts/compare_runs.py exp-001 exp-002
 ```

package/templates/pyproject.toml CHANGED

@@ -2,6 +2,16 @@
 name = "{{PROJECT_NAME}}-ml"
 version = "0.1.0"
 requires-python = ">=3.12"
+dependencies = [
+    "scikit-learn>=1.6",
+    "xgboost>=3.2",
+    "lightgbm>=4.6",
+    "pandas>=2.2",
+    "numpy>=2.0",
+    "joblib>=1.4",
+    "pyyaml>=6.0",
+    "pytest>=8.0",
+]
 [tool.pytest.ini_options]
 testpaths = ["tests"]

package/templates/requirements.txt CHANGED

@@ -1,3 +1,6 @@
+# Compatibility export only. pyproject.toml is canonical for dependencies.
+# Prefer: uv sync
 scikit-learn>=1.6
 xgboost>=3.2
 lightgbm>=4.6
@@ -8,5 +11,5 @@ pyyaml>=6.0
 pytest>=8.0
 # Optional: tree-search-guided hypothesis exploration
-# Install with: pip install "treequest[all]"
+# Install with: uv add "treequest[all]"
 # treequest>=0.1

package/templates/scripts/generate_onboarding.py CHANGED

@@ -210,7 +210,7 @@ def format_onboarding_report(config, experiments, families, best, decisions,
         "5. `/turing:try \"your hypothesis\"` — inject ideas",
         "6. `/turing:train` — run next experiment",
     ], "engineer": [
-        "1. `pip install -r requirements.txt`",
+        "1. `uv sync`",
         "2. Review `config.yaml` for data paths",
         "3. `/turing:status` — where things stand",
         "4. Check `train.py` for current model",