opencode-autoresearch 3.1.0-beta.2
- package/.opencode-plugin/plugin.json +37 -0
- package/LICENSE +21 -0
- package/README.md +74 -0
- package/commands/autoresearch/debug.md +23 -0
- package/commands/autoresearch/fix.md +21 -0
- package/commands/autoresearch/learn.md +21 -0
- package/commands/autoresearch/plan.md +25 -0
- package/commands/autoresearch/predict.md +21 -0
- package/commands/autoresearch/scenario.md +21 -0
- package/commands/autoresearch/security.md +21 -0
- package/commands/autoresearch/ship.md +22 -0
- package/commands/autoresearch.md +45 -0
- package/dist/cli.d.ts +3 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +202 -0
- package/dist/cli.js.map +1 -0
- package/dist/constants.d.ts +13 -0
- package/dist/constants.d.ts.map +1 -0
- package/dist/constants.js +13 -0
- package/dist/constants.js.map +1 -0
- package/dist/helpers.d.ts +19 -0
- package/dist/helpers.d.ts.map +1 -0
- package/dist/helpers.js +137 -0
- package/dist/helpers.js.map +1 -0
- package/dist/index.d.ts +4 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +3 -0
- package/dist/index.js.map +1 -0
- package/dist/run-manager.d.ts +9 -0
- package/dist/run-manager.d.ts.map +1 -0
- package/dist/run-manager.js +239 -0
- package/dist/run-manager.js.map +1 -0
- package/dist/subagent-pool.d.ts +7 -0
- package/dist/subagent-pool.d.ts.map +1 -0
- package/dist/subagent-pool.js +101 -0
- package/dist/subagent-pool.js.map +1 -0
- package/dist/types.d.ts +129 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +2 -0
- package/dist/types.js.map +1 -0
- package/dist/wizard.d.ts +3 -0
- package/dist/wizard.d.ts.map +1 -0
- package/dist/wizard.js +56 -0
- package/dist/wizard.js.map +1 -0
- package/docs/ARCHITECTURE.md +88 -0
- package/docs/CNAME +1 -0
- package/docs/OPENCODE_INSTALL.md +48 -0
- package/docs/RELEASE.md +67 -0
- package/docs/autoresearch-loop.svg +95 -0
- package/docs/index.html +249 -0
- package/hooks/init.sh +21 -0
- package/hooks/status.sh +23 -0
- package/hooks/stop.sh +27 -0
- package/package.json +49 -0
- package/skills/autoresearch/SKILL.md +77 -0
- package/skills/autoresearch/references/core-principles.md +20 -0
- package/skills/autoresearch/references/debug-workflow.md +31 -0
- package/skills/autoresearch/references/fix-workflow.md +25 -0
- package/skills/autoresearch/references/interaction-wizard.md +33 -0
- package/skills/autoresearch/references/learn-workflow.md +15 -0
- package/skills/autoresearch/references/loop-workflow.md +35 -0
- package/skills/autoresearch/references/plan-workflow.md +42 -0
- package/skills/autoresearch/references/predict-workflow.md +15 -0
- package/skills/autoresearch/references/results-logging.md +29 -0
- package/skills/autoresearch/references/runtime-hard-invariants.md +13 -0
- package/skills/autoresearch/references/scenario-workflow.md +15 -0
- package/skills/autoresearch/references/security-workflow.md +15 -0
- package/skills/autoresearch/references/ship-workflow.md +15 -0
- package/skills/autoresearch/references/state-management.md +39 -0
- package/skills/autoresearch/references/structured-output-spec.md +34 -0
- package/skills/autoresearch/references/subagent-orchestration.md +42 -0
@@ -0,0 +1,29 @@
+# Results Logging
+
+`research-results.tsv` is the primary append-only results log per run.
+
+The runtime maintains `autoresearch-results.tsv` as the canonical iteration log.
+
+## Required Columns
+
+The TSV header is:
+
+`timestamp iteration decision metric_value verify_status guard_status hypothesis change_summary labels note`
+
+## Logging Rules
+
+1. Record exactly one row per completed iteration.
+2. Treat each row as the orchestrator-owned result for that iteration, even when several subagents contributed evidence.
+3. Use `keep`, `discard`, or `needs_human` as the decision.
+4. Store the observed metric value as text; leave it blank only when no metric was produced.
+5. Keep `change_summary` short and specific to the experiment that just finished.
+6. Use `labels` for compact tags such as `test`, `perf`, `retry`, `docs`, or `security`.
+7. Put blocker details, rollback notes, or subagent evidence worth preserving in `note`.
+
+## Interpretation
+
+- `verify_status=pass` means the primary metric command completed successfully.
+- `guard_status=pass` means the regression guard also passed.
+- `decision=keep` means the change survived verification and stays in the working tree.
+- `decision=discard` means the change should be rolled back before the next experiment.
+- `decision=needs_human` means the run hit ambiguity or risk that should stop autonomous progress.
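The logging rules above can be illustrated with a minimal sketch. It assumes the header columns are tab-separated in the real file and that timestamps are ISO 8601 in UTC; neither detail is stated in the diff. The helper name `format_result_row` and its parameters are hypothetical and are not part of the package.

```python
import csv
import io
from datetime import datetime, timezone

# Column order copied from the header line in results-logging.md.
COLUMNS = [
    "timestamp", "iteration", "decision", "metric_value", "verify_status",
    "guard_status", "hypothesis", "change_summary", "labels", "note",
]
DECISIONS = {"keep", "discard", "needs_human"}


def format_result_row(iteration, decision, metric_value=None, **fields):
    """Return one TSV line for a completed iteration (hypothetical helper)."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    row = {name: "" for name in COLUMNS}
    row.update(fields)  # unknown column names make DictWriter raise ValueError
    # Assumption: ISO 8601 UTC timestamps; the source does not fix a format.
    row["timestamp"] = datetime.now(timezone.utc).isoformat()
    row["iteration"] = str(iteration)
    row["decision"] = decision
    # Rule 4: store the metric as text, blank only when none was produced.
    row["metric_value"] = "" if metric_value is None else str(metric_value)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writerow(row)
    return buffer.getvalue()
```

Calling this once per completed iteration and writing the returned line to a file opened in append mode would keep the log append-only, with one row per iteration.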
@@ -0,0 +1,13 @@
+# Runtime Hard Invariants
+
+Use this checklist when a run starts, resumes, or feels out of sync.
+
+## Re-anchor
+
+- Re-state the goal, scope, metric, and verify command before making the next change.
+- Re-check the latest state and results artifacts so you are not acting on stale context.
+- Re-anchor the standing subagent pool with the latest findings, objections, and open questions before the next iteration.
+- Make one focused change at a time and record it before moving on.
+- Re-run the verify command after a keep-worthy change and before the next iteration.
+- Honor any keep/stop label gates, duration limits, and human-blocker flags.
+- If the current state no longer matches reality, stop and surface the mismatch instead of guessing.
@@ -0,0 +1,15 @@
+# Scenario Workflow
+
+Use this when the user wants edge cases, use cases, or stress scenarios derived from a seed idea.
+
+## Goal
+
+Expand a seed situation into a practical set of testable scenarios.
+
+## Steps
+
+1. Read the seed scenario and target subsystem.
+2. Expand along scale, failure, misuse, timing, dependency, and data-shape axes.
+3. Deduplicate similar cases.
+4. Rank the scenarios by risk or value.
+5. Feed the selected scenarios into tests, debug, or fix work if requested.
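Steps 2 to 4 of the scenario workflow above can be sketched as a small filter-and-rank pass. This is only an illustration: the `Scenario` type, the integer `risk` score, and the whitespace-and-case normalization used for deduplication are assumptions, not something the package defines.

```python
from dataclasses import dataclass

# Expansion axes named in step 2 of the workflow.
AXES = ("scale", "failure", "misuse", "timing", "dependency", "data-shape")


@dataclass(frozen=True)
class Scenario:
    axis: str
    description: str
    risk: int  # hypothetical reviewer-assigned score; higher means riskier


def select_scenarios(candidates, limit=5):
    """Deduplicate by normalized description, then rank by risk, highest first."""
    seen = set()
    unique = []
    for scenario in candidates:
        if scenario.axis not in AXES:
            raise ValueError(f"unknown axis: {scenario.axis!r}")
        # Treat descriptions that differ only in case or spacing as duplicates.
        key = " ".join(scenario.description.lower().split())
        if key in seen:
            continue
        seen.add(key)
        unique.append(scenario)
    # sorted() is stable, so scenarios with equal risk keep their input order.
    return sorted(unique, key=lambda s: s.risk, reverse=True)[:limit]
```

Keeping the first occurrence of a duplicate is a design choice of this sketch; a real pass might instead keep the higher-risk variant.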
@@ -0,0 +1,15 @@
+# Security Workflow
+
+Use this when the user wants a structured security pass.
+
+## Goal
+
+Find concrete security issues with enough evidence to prioritize and fix them.
+
+## Steps
+
+1. Define the in-scope attack surface.
+2. Review it through common categories such as input handling, authz/authn, secrets, injection, unsafe deserialization, file access, and dependency exposure.
+3. Prefer reproducible evidence over broad claims.
+4. Rank findings by severity and exploitability.
+5. If the user wants remediation, hand the highest-value issue into the fix workflow.
@@ -0,0 +1,15 @@
+# Ship Workflow
+
+Use this when the user wants release-readiness or structured closeout.
+
+## Goal
+
+Move from "changes exist" to "this is ready to ship" with a bounded checklist.
+
+## Steps
+
+1. Confirm the target artifact or release unit.
+2. Verify tests and guards.
+3. Review open risks and missing docs.
+4. Summarize user-visible changes.
+5. Produce the final checklist: verification, rollback note, release note, and follow-up monitoring items.
@@ -0,0 +1,39 @@
+# State Management
+
+`autoresearch-state.json` is the run checkpoint.
+
+## Core Fields
+
+- `run_id`: stable identifier for the current run
+- `status`: `initialized`, `running`, `stopping`, `stopped`, or `completed`
+- `mode`: `foreground` or `background`
+- `goal`: human-readable objective
+- `metric`: name, direction, baseline, latest, and best values
+- `subagent_pool`: standing-pool plan, role activation, and re-anchor guidance
+- `continuation_policy`: launch approval boundary and post-launch stop rules
+- `stats`: total iterations, keep/discard counters, best iteration, and discard streak
+- `flags`: stop request, needs-human marker, and background activity
+- `last_iteration`: summary of the latest completed iteration
+
+## Rules
+
+1. Baseline exactly once when the run is initialized.
+2. Update `updated_at` on every state mutation.
+3. Keep `metric.latest` aligned with the most recent finished iteration.
+4. Only update `metric.best` on strict improvement.
+5. Only the orchestrator records iterations or mutates the authoritative run state.
+6. Set `flags.needs_human=true` when autonomous progress should stop for user input.
+7. For detached runs, `flags.background_active` reflects whether a background owner is currently expected to continue the loop.
+8. If the pool metadata is missing from an older state file, reconstruct it from the goal, scope, and mode before resuming.
+
+## Resume Semantics
+
+- `python scripts/autoresearch_runtime_ctl.py resume` clears `stop_requested` and marks the background run active again.
+- Resume does not create a new run; it continues the existing state snapshot.
+- Resume should re-anchor the standing pool with the latest metric, last iteration, and active role guidance before the next handoff.
+- Completed runs are not resumable; return to the previous state by starting a new run.
+- If the run is not a background run, resume should fail fast.
+
+## Completion Semantics
+
+- `python scripts/autoresearch_runtime_ctl.py complete` moves a background run to `completed`, clears `background_active`, and ends the detached session lifecycle.
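Rules 2 to 4 and the resume semantics above can be sketched against an in-memory dict shaped like `autoresearch-state.json`. The exact JSON layout is an assumption: the diff lists the fields but not their schema, so the `direction` values `"higher"` and `"lower"`, the key `stats.best_iteration`, and both function names are hypothetical. This is not the behavior of `autoresearch_runtime_ctl.py`, only an illustration of the stated rules.

```python
from datetime import datetime, timezone


def _touch(state):
    # Rule 2: refresh updated_at on every state mutation.
    state["updated_at"] = datetime.now(timezone.utc).isoformat()


def record_iteration(state, iteration, value):
    """Update metric.latest and, on strict improvement only, metric.best."""
    metric = state["metric"]
    metric["latest"] = value  # Rule 3
    best = metric.get("best")
    if metric["direction"] == "higher":  # assumed direction encoding
        improved = best is None or value > best
    else:
        improved = best is None or value < best
    if improved:  # Rule 4: a tie is not an improvement
        metric["best"] = value
        state["stats"]["best_iteration"] = iteration
    _touch(state)
    return state


def resume(state):
    """Fail fast for non-background or completed runs, then reactivate."""
    if state["mode"] != "background":
        raise RuntimeError("resume only applies to background runs")
    if state["status"] == "completed":
        raise RuntimeError("completed runs are not resumable")
    state["flags"]["stop_requested"] = False
    state["flags"]["background_active"] = True
    _touch(state)
    return state
```

Both functions mutate the same state object rather than building a new one, which mirrors the note that resume continues the existing snapshot instead of creating a new run.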
@@ -0,0 +1,34 @@
+# Structured Output Spec
+
+Interactive runs should use three output phases.
+
+## Setup Summary
+
+Before launch, summarize:
+
+- goal
+- scope
+- metric and direction
+- verify command
+- guard command, if any
+- run mode
+- stop condition or iteration cap
+
+## Iteration Update
+
+After each completed iteration, report:
+
+- iteration number
+- decision (`keep`, `discard`, or `needs_human`)
+- short explanation
+- current best-known metric, if available
+
+## Completion Summary
+
+When the run ends, report:
+
+- why it ended
+- total iterations
+- kept vs discarded counts
+- best recorded metric, if available
+- next action, if a blocker remains
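As one possible reading of the completion summary above, the sketch below renders the five items as plain text and omits the two optional ones when they are absent. The function name, the label wording, and the `stats` keys (`total`, `kept`, `discarded`) are assumptions for illustration; the spec names the items but does not prescribe a layout.

```python
def completion_summary(reason, stats, best_metric=None, next_action=None):
    """Render the completion-phase report as plain text (hypothetical layout)."""
    lines = [
        f"Ended: {reason}",
        f"Total iterations: {stats['total']}",
        f"Kept vs discarded: {stats['kept']} / {stats['discarded']}",
    ]
    # "if available": skip the metric line when nothing was recorded.
    if best_metric is not None:
        lines.append(f"Best metric: {best_metric}")
    # "if a blocker remains": only mention a next action when one is given.
    if next_action:
        lines.append(f"Next action: {next_action}")
    return "\n".join(lines)
```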
@@ -0,0 +1,42 @@
+# Subagent Orchestration
+
+Use this reference when a run should be subagent-first.
+
+## Orchestration Model
+
+- The main agent is the orchestrator. It owns the goal, scope, metric, direction, verify command, guard, and final keep/discard decision.
+- Subagents form a standing pool for parallel context gathering, alternative synthesis, verification, and critique.
+- The main agent hands bounded questions to the pool, waits for findings, and folds those findings into the next iteration before changing code again.
+- Subagents surface evidence, objections, risks, and candidate next steps. They do not independently advance the run state.
+- Approval belongs before launch. Once the run is launched, keep the same pool moving until the user stops the run, the configured stop condition is met, or a real `needs_human` blocker appears.
+
+## Launch Decision
+
+- Decide whether the standing pool should already be active during setup or only after launch.
+- Default to an active pool for multi-step, uncertain, or unattended work.
+- Fall back to orchestrator-only serial execution when the task is tiny or the environment cannot support clean parallel work.
+
+## Pool Rules
+
+- Keep the pool alive across iterations unless context drift forces a reset.
+- Reuse roles where possible so context compounds instead of resetting.
+- Prefer a small pool with distinct jobs over one-off ad hoc spawning.
+- Re-anchor the pool after every keep/discard decision with the latest goal, state, and results.
+- Feed findings back into the loop before the next code change.
+
+## State Ownership
+
+- Only the orchestrator records iterations, mutates `autoresearch-state.json`, and decides whether the latest step is `keep`, `discard`, or `needs_human`.
+- Subagents may disagree, critique, or verify, but their output is supporting evidence.
+- If several subagents contribute to one change, roll that evidence into one orchestrator-owned iteration result.
+
+## Fallback Rules
+
+- If one subagent times out, conflicts with another role, or stops adding value, replace or drop that role without stopping the whole run.
+- If the whole pool becomes unhelpful, continue serially under the orchestrator instead of rerunning setup.
+- Only surface `needs_human` when the orchestrator cannot continue safely after folding in the latest pool findings.
+
+## Local Scope
+
+- This bundle stays compact and narrower than the reference repo.
+- Prefer the references in this repository over broader upstream orchestration patterns unless the user explicitly asks for them.
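The first fallback rule above (drop a role that times out without stopping the run) can be sketched with `asyncio`. This is a simplified model, not the package's `subagent-pool` implementation: roles are represented as plain async callables, and the name `ask_pool` and its parameters are hypothetical.

```python
import asyncio


async def ask_pool(roles, question, timeout=30.0):
    """Ask every role in parallel; drop roles that time out or fail.

    `roles` maps a role name to an async callable taking the question.
    Returns (findings, dropped); only the caller, acting as orchestrator,
    decides what to record from the findings.
    """
    async def ask(name, handler):
        try:
            return name, await asyncio.wait_for(handler(question), timeout)
        except Exception:
            # Broad on purpose in this sketch: a timeout or any role error
            # removes that role from this round instead of aborting the run.
            return name, None

    results = await asyncio.gather(*(ask(n, h) for n, h in roles.items()))
    findings = {name: answer for name, answer in results if answer is not None}
    dropped = [name for name, answer in results if answer is None]
    return findings, dropped
```

If `dropped` ends up containing every role, the caller could continue serially, in line with the second fallback rule. A handler that returns `None` is treated as dropped here, which is a simplification of this sketch.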