researchloop 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,30 @@
+ # Architecture Optimization Playbook
+
+ Your goal is to explore small architecture changes without losing the baseline.
+
+ Use this order:
+
+ 1. Lock the current baseline and metric.
+ 2. Make one architecture change at a time.
+ 3. Prefer the smallest change that can plausibly matter.
+ 4. Keep data, evaluation, and training loop fixed.
+ 5. Log the exact config diff and the result.
+
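Step 5 above can be sketched as a small helper. This is a minimal sketch, assuming the config is a flat dict and that records go to an append-only JSONL file such as the runs log used elsewhere in this package; the record shape is an assumption, not the package's actual schema:

```python
import json

def config_diff(baseline: dict, candidate: dict) -> dict:
    """Return only the keys whose values changed between two flat configs."""
    keys = set(baseline) | set(candidate)
    return {
        k: {"old": baseline.get(k), "new": candidate.get(k)}
        for k in keys
        if baseline.get(k) != candidate.get(k)
    }

def log_run(path: str, baseline: dict, candidate: dict, result: dict) -> None:
    """Append one JSON line holding the exact config diff and the measured result."""
    record = {"config_diff": config_diff(baseline, candidate), "result": result}
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Logging only the diff, rather than the full config, keeps each record small and makes the one-change-at-a-time rule auditable after the fact.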
+ Useful architecture knobs:
+
+ - `d_model`
+ - `n_heads`
+ - `n_layers`
+ - `d_ff`
+ - `n_kv_heads`
+ - `activation_variant`
+ - `activation_slope`
+
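The playbook does not define the activation variants it names. A plausible sketch of the `activation_variant` family, under the assumption that `activation_slope` is the negative-side slope of the leaky variant (both definitions are guesses, not taken from this package):

```python
def relu(x: float) -> float:
    return max(x, 0.0)

def squared_relu(x: float) -> float:
    # ReLU followed by squaring; zero for non-positive inputs.
    return relu(x) ** 2

def leaky_squared_relu(x: float, activation_slope: float = 0.01) -> float:
    # Assumed variant: squared ReLU on the positive side,
    # a small linear leak on the negative side.
    return x ** 2 if x > 0 else activation_slope * x
```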
+ Suggested first experiments:
+
+ - widen `d_model` slightly
+ - reduce or increase `n_layers` by one step
+ - change `n_kv_heads` while keeping `d_model` and `n_heads` consistent
+ - compare `squared_relu` against `relu` or `leaky_squared_relu`
+
+ Do not change architecture and optimizer in the same first-pass experiment unless the goal explicitly requires it.
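Keeping `n_kv_heads` consistent with `d_model` and `n_heads` usually comes down to two divisibility constraints. A hedged sketch of that check, following the common grouped-query-attention convention (the exact rule in this codebase is an assumption):

```python
def check_head_config(d_model: int, n_heads: int, n_kv_heads: int) -> None:
    # Common constraints for grouped-query attention: each head gets an
    # integer dimension, and query heads are shared evenly across KV heads.
    if d_model % n_heads != 0:
        raise ValueError(f"d_model={d_model} not divisible by n_heads={n_heads}")
    if n_heads % n_kv_heads != 0:
        raise ValueError(f"n_heads={n_heads} not divisible by n_kv_heads={n_kv_heads}")
```

Running this check before launching a run turns a silent shape mismatch into an immediate, explainable failure.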
@@ -0,0 +1,27 @@
+ # Attention Optimization Playbook
+
+ Your goal is to test attention-related changes while keeping the rest of the loop stable.
+
+ Use this order:
+
+ 1. Freeze the baseline model and metric.
+ 2. Change one attention-related setting at a time.
+ 3. Keep data, optimizer, and evaluation fixed.
+ 4. Measure the effect quickly.
+ 5. Re-run any promising result before calling it a win.
+
+ Useful attention knobs:
+
+ - `n_heads`
+ - `n_kv_heads`
+ - RoPE sequence length
+ - attention dropout
+ - `compile_model` (only when the backend supports it)
+
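The `compile_model` caveat above can be enforced with a small guard. A sketch assuming a torch-style setup, written as a pure decision function so the logic is testable without any framework installed; the backend allowlist is an assumption to adjust for your local stack:

```python
def should_compile(backend: str, compile_model: bool) -> bool:
    # torch.compile support varies by backend: it is most mature on
    # CUDA and CPU, and has historically been unreliable on MPS, so
    # gate it conservatively (assumption: extend the allowlist once
    # you have verified compilation on another backend locally).
    return compile_model and backend in {"cuda", "cpu"}
```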
+ Suggested first experiments:
+
+ - compare fewer KV heads against more KV heads
+ - adjust attention dropout if overfitting appears
+ - keep the RoPE sequence length fixed unless the dataset truly requires a change
+
+ If the repository runs on a MacBook, stay on the backend that actually works locally and do not force CUDA-only paths.
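The MacBook rule can be sketched as a backend picker. This assumes torch-style device names; the availability flags would come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, but they are parameters here so the selection logic runs anywhere:

```python
def pick_backend(cuda_available: bool, mps_available: bool) -> str:
    """Pick the best locally working backend without forcing CUDA-only paths."""
    if cuda_available:
        return "cuda"
    if mps_available:
        # Apple-silicon MacBooks land here instead of failing on CUDA.
        return "mps"
    return "cpu"
```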
@@ -0,0 +1,32 @@
+ # Hyperparameter Optimization Playbook
+
+ Your goal is to test the cheapest high-signal hyperparameter changes first.
+
+ Use this order:
+
+ 1. Establish the current baseline exactly as it exists.
+ 2. Sweep the learning rate before changing architecture.
+ 3. Hold data, sequence length, batch size, and evaluation constant.
+ 4. Change one knob at a time.
+ 5. Record every run in `.researchloop/scratchpad/runs.jsonl`.
+
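Step 5 above can be sketched as an append-and-query pair. A minimal sketch, assuming one JSON object per line in `runs.jsonl` and that lower metric values (e.g. loss) are better; the record fields are illustrative, not the package's actual schema:

```python
import json
from pathlib import Path

def record_run(path: Path, knob: str, value, metric: float) -> None:
    """Append one run as a JSON line to the runs log."""
    with open(path, "a") as f:
        f.write(json.dumps({"knob": knob, "value": value, "metric": metric}) + "\n")

def best_run(path: Path) -> dict:
    """Return the logged run with the lowest metric (assumes lower is better)."""
    with open(path) as f:
        runs = [json.loads(line) for line in f if line.strip()]
    return min(runs, key=lambda r: r["metric"])
```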
+ Useful hyperparameters to try:
+
+ - `muon_lr`
+ - `adamw_lr`
+ - `warmup_ratio`
+ - `weight_decay`
+ - `dropout`
+ - `grad_clip`
+ - `activation_variant`
+ - `activation_slope`
+
+ For each candidate:
+
+ - state the hypothesis
+ - define the kill criterion
+ - run the smallest proof that can fail or improve
+ - compare against baseline
+ - only keep wins that are reproducible
+
+ Do not stack multiple changes in the first pass.
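The per-candidate checklist above can be sketched as a keep/kill decision. A minimal sketch under two stated assumptions: the metric is a loss (lower is better), and a "reproducible win" means the rerun lands within a tolerance of the first result:

```python
def keep_candidate(
    baseline: float,
    first_run: float,
    rerun: float,
    kill_threshold: float,
    tolerance: float = 0.02,
) -> bool:
    # Kill criterion: the first run must beat the baseline by at least
    # kill_threshold, otherwise the idea is ruled out immediately.
    if baseline - first_run < kill_threshold:
        return False
    # Reproducibility: the rerun must land close to the first result.
    return abs(first_run - rerun) <= tolerance
```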
@@ -0,0 +1,8 @@
+ You are an autonomous research engineer in this repository.
+
+ Read `.researchloop/AGENTS.md`, `.researchloop/goal.md`, `.researchloop/plan.md`, and `.researchloop/scratchpad/THREAD.md`.
+
+ Goal:
+ {{GOAL}}
+
+ Design and run small experiments. Track commands, metrics, code changes, and decisions. Do not claim results without evidence. Keep the research loop moving.
@@ -0,0 +1,26 @@
+ You are Hermes acting as an autonomous AI research orchestrator in this repository.
+
+ First read:
+ - `.researchloop/AGENTS.md`
+ - `.researchloop/goal.md`
+ - `.researchloop/plan.md`
+ - `.researchloop/scratchpad/THREAD.md`
+ - `.researchloop/repo-profile.json` if present
+
+ Goal:
+ {{GOAL}}
+
+ Coordinate the research loop:
+ - Inspect the repo and summarize the experiment surface.
+ - Choose the smallest high-signal experiment.
+ - Delegate or execute implementation and validation.
+ - Store durable state in `.researchloop/`.
+ - Keep raw logs out of the main context; summarize and link paths.
+ - Maintain a picklist of active, backlog, and ruled-out ideas.
+ - Preserve reproducibility: command, config, metric, hardware, git diff.
+
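The reproducibility bullet above can be sketched as a record builder. A hedged sketch: the field names are assumptions, and the `git` call is an assumption about the environment that degrades gracefully when the directory is not a git checkout:

```python
import json
import platform
import subprocess

def reproducibility_record(command: str, config: dict, metric: float) -> str:
    """Assemble the reproducibility fields as one JSON line."""
    try:
        git_sha = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        # Not a git checkout, or git is unavailable.
        git_sha = "unknown"
    record = {
        "command": command,
        "config": config,
        "metric": metric,
        "hardware": {"machine": platform.machine(), "system": platform.system()},
        "git_sha": git_sha,
    }
    return json.dumps(record)
```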
+ Return with:
+ - Research state
+ - Experiment queue
+ - Evidence gathered
+ - Next action