claude-turing 4.7.0 → 4.8.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/plugin.json +2 -2
- package/README.md +1 -1
- package/agents/ml-evaluator.md +4 -4
- package/agents/ml-researcher.md +2 -2
- package/bin/turing-init.sh +2 -2
- package/commands/ablate.md +3 -4
- package/commands/annotate.md +2 -3
- package/commands/archive.md +2 -3
- package/commands/audit.md +3 -4
- package/commands/baseline.md +3 -4
- package/commands/brief.md +5 -6
- package/commands/budget.md +3 -4
- package/commands/calibrate.md +3 -4
- package/commands/card.md +3 -4
- package/commands/changelog.md +2 -3
- package/commands/checkpoint.md +3 -4
- package/commands/cite.md +2 -3
- package/commands/compare.md +1 -2
- package/commands/counterfactual.md +2 -3
- package/commands/curriculum.md +3 -4
- package/commands/design.md +3 -4
- package/commands/diagnose.md +4 -5
- package/commands/diff.md +3 -4
- package/commands/distill.md +3 -4
- package/commands/doctor.md +2 -3
- package/commands/ensemble.md +3 -4
- package/commands/explore.md +4 -5
- package/commands/export.md +3 -4
- package/commands/feature.md +3 -4
- package/commands/flashback.md +2 -3
- package/commands/fork.md +3 -4
- package/commands/frontier.md +3 -4
- package/commands/init.md +5 -6
- package/commands/leak.md +3 -4
- package/commands/lit.md +3 -4
- package/commands/logbook.md +5 -6
- package/commands/merge.md +2 -3
- package/commands/mode.md +1 -2
- package/commands/onboard.md +2 -3
- package/commands/paper.md +3 -4
- package/commands/plan.md +2 -3
- package/commands/poster.md +3 -4
- package/commands/postmortem.md +2 -3
- package/commands/preflight.md +5 -6
- package/commands/present.md +2 -3
- package/commands/profile.md +3 -4
- package/commands/prune.md +2 -3
- package/commands/quantize.md +2 -3
- package/commands/queue.md +3 -4
- package/commands/registry.md +2 -3
- package/commands/regress.md +3 -4
- package/commands/replay.md +2 -3
- package/commands/report.md +3 -4
- package/commands/reproduce.md +3 -4
- package/commands/retry.md +3 -4
- package/commands/review.md +2 -3
- package/commands/rules/loop-protocol.md +11 -11
- package/commands/sanity.md +3 -4
- package/commands/scale.md +4 -5
- package/commands/search.md +2 -3
- package/commands/seed.md +3 -4
- package/commands/sensitivity.md +3 -4
- package/commands/share.md +2 -3
- package/commands/simulate.md +2 -3
- package/commands/status.md +1 -2
- package/commands/stitch.md +3 -4
- package/commands/suggest.md +5 -6
- package/commands/surgery.md +2 -3
- package/commands/sweep.md +8 -9
- package/commands/template.md +2 -3
- package/commands/train.md +5 -6
- package/commands/transfer.md +3 -4
- package/commands/trend.md +2 -3
- package/commands/try.md +4 -5
- package/commands/turing.md +3 -3
- package/commands/update.md +2 -3
- package/commands/validate.md +4 -5
- package/commands/warm.md +3 -4
- package/commands/watch.md +4 -5
- package/commands/whatif.md +2 -3
- package/commands/xray.md +3 -4
- package/config/commands.yaml +75 -75
- package/package.json +3 -2
- package/skills/turing/SKILL.md +3 -3
- package/skills/turing/ablate/SKILL.md +3 -4
- package/skills/turing/annotate/SKILL.md +2 -3
- package/skills/turing/archive/SKILL.md +2 -3
- package/skills/turing/audit/SKILL.md +3 -4
- package/skills/turing/baseline/SKILL.md +3 -4
- package/skills/turing/brief/SKILL.md +5 -6
- package/skills/turing/budget/SKILL.md +3 -4
- package/skills/turing/calibrate/SKILL.md +3 -4
- package/skills/turing/card/SKILL.md +3 -4
- package/skills/turing/changelog/SKILL.md +2 -3
- package/skills/turing/checkpoint/SKILL.md +3 -4
- package/skills/turing/cite/SKILL.md +2 -3
- package/skills/turing/compare/SKILL.md +1 -2
- package/skills/turing/counterfactual/SKILL.md +2 -3
- package/skills/turing/curriculum/SKILL.md +3 -4
- package/skills/turing/design/SKILL.md +3 -4
- package/skills/turing/diagnose/SKILL.md +4 -5
- package/skills/turing/diff/SKILL.md +3 -4
- package/skills/turing/distill/SKILL.md +3 -4
- package/skills/turing/doctor/SKILL.md +2 -3
- package/skills/turing/ensemble/SKILL.md +3 -4
- package/skills/turing/explore/SKILL.md +4 -5
- package/skills/turing/export/SKILL.md +3 -4
- package/skills/turing/feature/SKILL.md +3 -4
- package/skills/turing/flashback/SKILL.md +2 -3
- package/skills/turing/fork/SKILL.md +3 -4
- package/skills/turing/frontier/SKILL.md +3 -4
- package/skills/turing/init/SKILL.md +5 -6
- package/skills/turing/leak/SKILL.md +3 -4
- package/skills/turing/lit/SKILL.md +3 -4
- package/skills/turing/logbook/SKILL.md +5 -6
- package/skills/turing/merge/SKILL.md +2 -3
- package/skills/turing/mode/SKILL.md +1 -2
- package/skills/turing/onboard/SKILL.md +2 -3
- package/skills/turing/paper/SKILL.md +3 -4
- package/skills/turing/plan/SKILL.md +2 -3
- package/skills/turing/poster/SKILL.md +3 -4
- package/skills/turing/postmortem/SKILL.md +2 -3
- package/skills/turing/preflight/SKILL.md +5 -6
- package/skills/turing/present/SKILL.md +2 -3
- package/skills/turing/profile/SKILL.md +3 -4
- package/skills/turing/prune/SKILL.md +2 -3
- package/skills/turing/quantize/SKILL.md +2 -3
- package/skills/turing/queue/SKILL.md +3 -4
- package/skills/turing/registry/SKILL.md +2 -3
- package/skills/turing/regress/SKILL.md +3 -4
- package/skills/turing/replay/SKILL.md +2 -3
- package/skills/turing/report/SKILL.md +3 -4
- package/skills/turing/reproduce/SKILL.md +3 -4
- package/skills/turing/retry/SKILL.md +3 -4
- package/skills/turing/review/SKILL.md +2 -3
- package/skills/turing/rules/loop-protocol.md +11 -11
- package/skills/turing/sanity/SKILL.md +3 -4
- package/skills/turing/scale/SKILL.md +4 -5
- package/skills/turing/search/SKILL.md +2 -3
- package/skills/turing/seed/SKILL.md +3 -4
- package/skills/turing/sensitivity/SKILL.md +3 -4
- package/skills/turing/share/SKILL.md +2 -3
- package/skills/turing/simulate/SKILL.md +2 -3
- package/skills/turing/status/SKILL.md +1 -2
- package/skills/turing/stitch/SKILL.md +3 -4
- package/skills/turing/suggest/SKILL.md +5 -6
- package/skills/turing/surgery/SKILL.md +2 -3
- package/skills/turing/sweep/SKILL.md +8 -9
- package/skills/turing/template/SKILL.md +2 -3
- package/skills/turing/train/SKILL.md +5 -6
- package/skills/turing/transfer/SKILL.md +3 -4
- package/skills/turing/trend/SKILL.md +2 -3
- package/skills/turing/try/SKILL.md +4 -5
- package/skills/turing/update/SKILL.md +2 -3
- package/skills/turing/validate/SKILL.md +4 -5
- package/skills/turing/warm/SKILL.md +3 -4
- package/skills/turing/watch/SKILL.md +4 -5
- package/skills/turing/whatif/SKILL.md +2 -3
- package/skills/turing/xray/SKILL.md +3 -4
- package/src/command-registry.js +12 -0
- package/src/install.js +4 -3
- package/src/sync-commands-layout.js +149 -0
- package/src/sync-skills-layout.js +4 -133
- package/templates/README.md +5 -8
- package/templates/program.md +18 -18
- package/templates/pyproject.toml +10 -0
- package/templates/requirements.txt +4 -1
- package/templates/scripts/generate_onboarding.py +1 -1
- package/templates/scripts/post-train-hook.sh +7 -8
- package/templates/scripts/scaffold.py +24 -26
- package/templates/scripts/stop-hook.sh +2 -3
- package/templates/scripts/turing-run-python.sh +9 -0
package/commands/turing.md
CHANGED
@@ -7,10 +7,10 @@ You are the Turing ML research router. Detect the user's intent and identify the
 
 ## Execution Contract
 
-Turing sub-commands are
+Turing sub-commands are slash-command skills that allow model invocation, so router handling may select the focused skill when the user's intent matches a sub-command.
 
-- If the user explicitly invokes `/turing:<cmd>`,
-- If the user invokes `/turing` as a router and the detected command is `slash_only`,
+- If the user explicitly invokes `/turing:<cmd>`, handle that focused sub-command directly.
+- If the user invokes `/turing` as a router and the detected command is `slash_only`, route to the focused sub-command skill when appropriate.
 - If a command has a documented safe equivalent script, the assistant may execute those documented steps inline when safe and appropriate.
 
 ## Routing Table
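The three rules in the updated Execution Contract can be read as a small decision procedure. The sketch below is illustrative only: `Command`, `route`, and the returned labels are hypothetical names that do not appear in the package, the order in which the rules are checked is an assumption, and "when safe and appropriate" is not modeled at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    """Hypothetical stand-in for one routing-table entry."""
    name: str
    slash_only: bool = False
    safe_script: Optional[str] = None  # documented safe equivalent, if any


def route(user_input: str, detected: Optional[Command]) -> str:
    """Return a label describing how a request would be handled (illustrative)."""
    prefix = "/turing:"
    # Rule 1: an explicit `/turing:<cmd>` invocation is handled directly.
    if user_input.startswith(prefix):
        parts = user_input[len(prefix):].split()
        if parts:
            return f"direct:{parts[0]}"
        return "no-match"
    if detected is None:
        return "no-match"
    # Rule 2: router invocation with a `slash_only` command goes to the focused skill.
    if detected.slash_only:
        return f"skill:{detected.name}"
    # Rule 3: a documented safe equivalent script may be run inline.
    if detected.safe_script is not None:
        return f"inline:{detected.safe_script}"
    return f"skill:{detected.name}"
```

Here `route("/turing:validate --auto", None)` takes the direct branch, while a plain `/turing ...` request falls through to whichever command the router detected.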
package/commands/update.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: update
 description: Incremental model update — add new data without full retraining, with forgetting detection.
-disable-model-invocation: true
 argument-hint: "<exp-id> --new-data <path> [--replay-ratio 0.1] [--tolerance 0.005]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -9,8 +8,8 @@ allowed-tools: Read, Bash(*), Grep, Glob
 Add new data to an existing model without starting from scratch. Detects catastrophic forgetting.
 
 ## Steps
-1. `
-2. `python scripts/incremental_update.py $ARGUMENTS`
+1. `uv sync`
+2. `uv run python scripts/incremental_update.py $ARGUMENTS`
 3. **Saved:** `experiments/updates/`
 
 ## Model-specific strategies
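The `--tolerance 0.005` argument hint suggests a threshold on how far performance on earlier data may drop after an incremental update. A minimal sketch of such a check follows, assuming a higher-is-better metric and an absolute (not relative) drop. `forgetting_detected` and its parameters are hypothetical names, and this is not the logic of `scripts/incremental_update.py`, which the diff does not show.

```python
def forgetting_detected(metric_before: float, metric_after: float,
                        tolerance: float = 0.005) -> bool:
    """Return True if the old-data metric dropped by more than `tolerance`.

    Assumes a higher-is-better metric measured on the original data,
    once before and once after the incremental update.
    """
    drop = metric_before - metric_after
    return drop > tolerance
```

Under these assumptions, a drop from 0.90 to 0.89 would be flagged, while a drop from 0.90 to 0.897 would stay within tolerance.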
package/commands/validate.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: validate
 description: Run stability validation on the current experiment configuration. Executes N runs to measure metric variance and auto-configures multi-run evaluation if variance is too high.
-disable-model-invocation: true
 argument-hint: "[--auto]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,19 +9,19 @@ Validate the stability of the current ML pipeline by running it multiple times a
 
 ## Steps
 
-1. **
+1. **Sync environment:**
 ```bash
-
+uv sync
 ```
 
 2. **Run stability check:**
 ```bash
-python scripts/validate_stability.py
+uv run python scripts/validate_stability.py
 ```
 
 3. **If `$ARGUMENTS` contains `--auto`:**
 ```bash
-python scripts/validate_stability.py --auto
+uv run python scripts/validate_stability.py --auto
 ```
 This auto-writes `evaluation.n_runs: 3` to `config.yaml` if CV > 5%.
 
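The `--auto` step depends on the coefficient of variation (CV): the standard deviation of a metric across runs divided by its mean. A minimal sketch of that decision follows, assuming the sample standard deviation and the 5% threshold quoted above. `coefficient_of_variation` and `needs_multi_run` are hypothetical helpers, not the contents of `scripts/validate_stability.py`.

```python
import statistics
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Sample standard deviation divided by the absolute mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # statistics.stdev needs at least two values and uses the n-1 denominator.
    return statistics.stdev(values) / abs(mean)


def needs_multi_run(values: Sequence[float], threshold: float = 0.05) -> bool:
    """True if run-to-run variation exceeds the threshold (5% by default)."""
    return coefficient_of_variation(values) > threshold
```

For example, metrics of 0.80, 0.81, and 0.79 across three runs give a CV of 1.25%, which would not trigger the multi-run setting under these assumptions.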
package/commands/warm.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: warm
 description: Warm-start from a prior model — load checkpoint, optionally freeze layers, adjust learning rate, and continue training.
-disable-model-invocation: true
 argument-hint: "<exp-id> [--freeze-layers encoder] [--unfreeze-after 5]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,9 +9,9 @@ Take a trained checkpoint and use it as initialization for a new experiment. Aut
 
 ## Steps
 
-1. **
+1. **Sync environment:**
 ```bash
-
+uv sync
 ```
 
 2. **Parse arguments from `$ARGUMENTS`:**
@@ -24,7 +23,7 @@ Take a trained checkpoint and use it as initialization for a new experiment. Aut
 
 3. **Run warm-start planner:**
 ```bash
-python scripts/warm_start.py $ARGUMENTS
+uv run python scripts/warm_start.py $ARGUMENTS
 ```
 
 4. **Report results:**
package/commands/watch.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: watch
 description: Live training monitor with early-warning alerts for loss spikes, NaN, overfitting, and metric plateaus.
-disable-model-invocation: true
 argument-hint: "[--alerts] [--interval 10] [--analyze run.log]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,9 +9,9 @@ Stream metrics during training with early-warning alerts. Catches problems mid-r
 
 ## Steps
 
-1. **
+1. **Sync environment:**
 ```bash
-
+uv sync
 ```
 
 2. **Parse arguments from `$ARGUMENTS`:**
@@ -24,13 +23,13 @@ Stream metrics during training with early-warning alerts. Catches problems mid-r
 
 3. **For post-hoc analysis:**
 ```bash
-python scripts/training_monitor.py --analyze run.log
+uv run python scripts/training_monitor.py --analyze run.log
 ```
 
 4. **For live monitoring (inform user):**
 Live monitoring requires a running training process. Suggest the user run in a separate terminal:
 ```bash
-python scripts/training_monitor.py --log run.log --interval 10
+uv run python scripts/training_monitor.py --log run.log --interval 10
 ```
 
 5. **Alert types:**
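The `watch` description lists alerts for NaN losses and loss spikes. The sketch below shows how such checks could look on a list of per-step loss values, assuming a spike means a loss above some multiple of the mean of the preceding window. `check_losses`, `window`, and `spike_factor` are hypothetical and are not taken from `scripts/training_monitor.py`, whose contents the diff does not show.

```python
import math
from typing import List, Sequence, Tuple


def check_losses(losses: Sequence[float], window: int = 5,
                 spike_factor: float = 2.0) -> List[Tuple[int, str]]:
    """Return (step, alert_kind) pairs for NaN losses and loss spikes."""
    alerts: List[Tuple[int, str]] = []
    for step, loss in enumerate(losses):
        if math.isnan(loss):
            alerts.append((step, "nan"))
            continue
        # Compare against the mean of up to `window` preceding finite losses.
        recent = [x for x in losses[max(0, step - window):step]
                  if not math.isnan(x)]
        if recent and loss > spike_factor * (sum(recent) / len(recent)):
            alerts.append((step, "spike"))
    return alerts
```

Overfitting and plateau alerts would also need validation metrics and a patience setting, so they are left out of this sketch.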
package/commands/whatif.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: whatif
 description: What-if analysis — answer hypotheticals from existing experiment data without running new experiments.
-disable-model-invocation: true
 argument-hint: "\"<question>\" [--json]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -9,8 +8,8 @@ allowed-tools: Read, Bash(*), Grep, Glob
 Answer "what if?" questions using existing experiment data. Routes to the right estimator automatically.
 
 ## Steps
-1. `
-2. `python scripts/whatif_engine.py $ARGUMENTS`
+1. `uv sync`
+2. `uv run python scripts/whatif_engine.py $ARGUMENTS`
 3. **Saved:** `experiments/whatif/`
 
 ## Supported question types
package/commands/xray.md
CHANGED
@@ -1,7 +1,6 @@
 ---
 name: xray
 description: Internal model diagnostics — gradient flow, dead neurons, activation stats, weight distributions, tree depth analysis.
-disable-model-invocation: true
 argument-hint: "[exp-id] [--layer encoder.layer.2] [--compare exp-a exp-b]"
 allowed-tools: Read, Bash(*), Grep, Glob
 ---
@@ -10,9 +9,9 @@ See inside the model. When it underperforms, the fix depends on *why*.
 
 ## Steps
 
-1. **
+1. **Sync environment:**
 ```bash
-
+uv sync
 ```
 
 2. **Parse arguments from `$ARGUMENTS`:**
@@ -23,7 +22,7 @@ See inside the model. When it underperforms, the fix depends on *why*.
 
 3. **Run model diagnostics:**
 ```bash
-python scripts/model_xray.py $ARGUMENTS
+uv run python scripts/model_xray.py $ARGUMENTS
 ```
 
 4. **Diagnostics by model type:**