opencode-autoresearch 3.1.0-beta.2

Files changed (71)
  1. package/.opencode-plugin/plugin.json +37 -0
  2. package/LICENSE +21 -0
  3. package/README.md +74 -0
  4. package/commands/autoresearch/debug.md +23 -0
  5. package/commands/autoresearch/fix.md +21 -0
  6. package/commands/autoresearch/learn.md +21 -0
  7. package/commands/autoresearch/plan.md +25 -0
  8. package/commands/autoresearch/predict.md +21 -0
  9. package/commands/autoresearch/scenario.md +21 -0
  10. package/commands/autoresearch/security.md +21 -0
  11. package/commands/autoresearch/ship.md +22 -0
  12. package/commands/autoresearch.md +45 -0
  13. package/dist/cli.d.ts +3 -0
  14. package/dist/cli.d.ts.map +1 -0
  15. package/dist/cli.js +202 -0
  16. package/dist/cli.js.map +1 -0
  17. package/dist/constants.d.ts +13 -0
  18. package/dist/constants.d.ts.map +1 -0
  19. package/dist/constants.js +13 -0
  20. package/dist/constants.js.map +1 -0
  21. package/dist/helpers.d.ts +19 -0
  22. package/dist/helpers.d.ts.map +1 -0
  23. package/dist/helpers.js +137 -0
  24. package/dist/helpers.js.map +1 -0
  25. package/dist/index.d.ts +4 -0
  26. package/dist/index.d.ts.map +1 -0
  27. package/dist/index.js +3 -0
  28. package/dist/index.js.map +1 -0
  29. package/dist/run-manager.d.ts +9 -0
  30. package/dist/run-manager.d.ts.map +1 -0
  31. package/dist/run-manager.js +239 -0
  32. package/dist/run-manager.js.map +1 -0
  33. package/dist/subagent-pool.d.ts +7 -0
  34. package/dist/subagent-pool.d.ts.map +1 -0
  35. package/dist/subagent-pool.js +101 -0
  36. package/dist/subagent-pool.js.map +1 -0
  37. package/dist/types.d.ts +129 -0
  38. package/dist/types.d.ts.map +1 -0
  39. package/dist/types.js +2 -0
  40. package/dist/types.js.map +1 -0
  41. package/dist/wizard.d.ts +3 -0
  42. package/dist/wizard.d.ts.map +1 -0
  43. package/dist/wizard.js +56 -0
  44. package/dist/wizard.js.map +1 -0
  45. package/docs/ARCHITECTURE.md +88 -0
  46. package/docs/CNAME +1 -0
  47. package/docs/OPENCODE_INSTALL.md +48 -0
  48. package/docs/RELEASE.md +67 -0
  49. package/docs/autoresearch-loop.svg +95 -0
  50. package/docs/index.html +249 -0
  51. package/hooks/init.sh +21 -0
  52. package/hooks/status.sh +23 -0
  53. package/hooks/stop.sh +27 -0
  54. package/package.json +49 -0
  55. package/skills/autoresearch/SKILL.md +77 -0
  56. package/skills/autoresearch/references/core-principles.md +20 -0
  57. package/skills/autoresearch/references/debug-workflow.md +31 -0
  58. package/skills/autoresearch/references/fix-workflow.md +25 -0
  59. package/skills/autoresearch/references/interaction-wizard.md +33 -0
  60. package/skills/autoresearch/references/learn-workflow.md +15 -0
  61. package/skills/autoresearch/references/loop-workflow.md +35 -0
  62. package/skills/autoresearch/references/plan-workflow.md +42 -0
  63. package/skills/autoresearch/references/predict-workflow.md +15 -0
  64. package/skills/autoresearch/references/results-logging.md +29 -0
  65. package/skills/autoresearch/references/runtime-hard-invariants.md +13 -0
  66. package/skills/autoresearch/references/scenario-workflow.md +15 -0
  67. package/skills/autoresearch/references/security-workflow.md +15 -0
  68. package/skills/autoresearch/references/ship-workflow.md +15 -0
  69. package/skills/autoresearch/references/state-management.md +39 -0
  70. package/skills/autoresearch/references/structured-output-spec.md +34 -0
  71. package/skills/autoresearch/references/subagent-orchestration.md +42 -0
@@ -0,0 +1,29 @@
1
+ # Results Logging
2
+
3
+ `research-results.tsv` is the primary append-only results log per run.
4
+
5
+ The runtime maintains `autoresearch-results.tsv` as the canonical iteration log.
6
+
7
+ ## Required Columns
8
+
9
+ The TSV header is:
10
+
11
+ `timestamp iteration decision metric_value verify_status guard_status hypothesis change_summary labels note`
12
+
13
+ ## Logging Rules
14
+
15
+ 1. Record exactly one row per completed iteration.
16
+ 2. Treat each row as the orchestrator-owned result for that iteration, even when several subagents contributed evidence.
17
+ 3. Use `keep`, `discard`, or `needs_human` as the decision.
18
+ 4. Store the observed metric value as text; leave it blank only when no metric was produced.
19
+ 5. Keep `change_summary` short and specific to the experiment that just finished.
20
+ 6. Use `labels` for compact tags such as `test`, `perf`, `retry`, `docs`, or `security`.
21
+ 7. Put blocker details, rollback notes, or subagent evidence worth preserving in `note`.
22
+
23
+ ## Interpretation
24
+
25
+ - `verify_status=pass` means the primary metric command completed successfully.
26
+ - `guard_status=pass` means the regression guard also passed.
27
+ - `decision=keep` means the change survived verification and stays in the working tree.
28
+ - `decision=discard` means the change should be rolled back before the next experiment.
29
+ - `decision=needs_human` means the run hit ambiguity or risk that should stop autonomous progress.
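The logging rules above can be sketched in Python as follows. This is a minimal illustration, not the package's runtime: the helper name `append_result`, the UTC ISO timestamp format, and the tab delimiter are assumptions. Only the column order and the three decision values come from the reference.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Column order copied from the "Required Columns" section.
COLUMNS = [
    "timestamp", "iteration", "decision", "metric_value", "verify_status",
    "guard_status", "hypothesis", "change_summary", "labels", "note",
]
DECISIONS = {"keep", "discard", "needs_human"}


def append_result(path, iteration, decision, **fields):
    """Append exactly one row for a completed iteration (hypothetical helper)."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    unknown = set(fields) - set(COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")

    # Start from blanks so a missing metric stays an empty cell.
    row = {name: "" for name in COLUMNS}
    row.update({key: str(value) for key, value in fields.items()})
    row["timestamp"] = datetime.now(timezone.utc).isoformat()  # assumed format
    row["iteration"] = str(iteration)
    row["decision"] = decision

    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    # Append mode keeps earlier rows untouched, matching "append-only".
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n"
        )
        if write_header:
            writer.writeheader()
        writer.writerow(row)
    return row
```

A call such as `append_result("autoresearch-results.tsv", 1, "keep", metric_value=0.91, verify_status="pass", guard_status="pass", labels="perf")` writes the header on first use and then one row; the metric is stored as text, as rule 4 asks.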
@@ -0,0 +1,13 @@
1
+ # Runtime Hard Invariants
2
+
3
+ Use this checklist when a run starts, resumes, or feels out of sync.
4
+
5
+ ## Re-anchor
6
+
7
+ - Re-state the goal, scope, metric, and verify command before making the next change.
8
+ - Re-check the latest state and results artifacts so you are not acting on stale context.
9
+ - Re-anchor the standing subagent pool with the latest findings, objections, and open questions before the next iteration.
10
+ - Make one focused change at a time and record it before moving on.
11
+ - Re-run the verify command after a keep-worthy change and before the next iteration.
12
+ - Honor any keep/stop label gates, duration limits, and human-blocker flags.
13
+ - If the current state no longer matches reality, stop and surface the mismatch instead of guessing.
@@ -0,0 +1,15 @@
1
+ # Scenario Workflow
2
+
3
+ Use this when the user wants edge cases, use cases, or stress scenarios derived from a seed idea.
4
+
5
+ ## Goal
6
+
7
+ Expand a seed situation into a practical set of testable scenarios.
8
+
9
+ ## Steps
10
+
11
+ 1. Read the seed scenario and target subsystem.
12
+ 2. Expand along scale, failure, misuse, timing, dependency, and data-shape axes.
13
+ 3. Deduplicate similar cases.
14
+ 4. Rank the scenarios by risk or value.
15
+ 5. Feed the selected scenarios into tests, debug, or fix work if requested.
@@ -0,0 +1,15 @@
1
+ # Security Workflow
2
+
3
+ Use this when the user wants a structured security pass.
4
+
5
+ ## Goal
6
+
7
+ Find concrete security issues with enough evidence to prioritize and fix them.
8
+
9
+ ## Steps
10
+
11
+ 1. Define the in-scope attack surface.
12
+ 2. Review it through common categories such as input handling, authz/authn, secrets, injection, unsafe deserialization, file access, and dependency exposure.
13
+ 3. Prefer reproducible evidence over broad claims.
14
+ 4. Rank findings by severity and exploitability.
15
+ 5. If the user wants remediation, hand the highest-value issue into the fix workflow.
@@ -0,0 +1,15 @@
1
+ # Ship Workflow
2
+
3
+ Use this when the user wants release-readiness or structured closeout.
4
+
5
+ ## Goal
6
+
7
+ Move from "changes exist" to "this is ready to ship" with a bounded checklist.
8
+
9
+ ## Steps
10
+
11
+ 1. Confirm the target artifact or release unit.
12
+ 2. Verify tests and guards.
13
+ 3. Review open risks and missing docs.
14
+ 4. Summarize user-visible changes.
15
+ 5. Produce the final checklist: verification, rollback note, release note, and follow-up monitoring items.
@@ -0,0 +1,39 @@
1
+ # State Management
2
+
3
+ `autoresearch-state.json` is the run checkpoint.
4
+
5
+ ## Core Fields
6
+
7
+ - `run_id`: stable identifier for the current run
8
+ - `status`: `initialized`, `running`, `stopping`, `stopped`, or `completed`
9
+ - `mode`: `foreground` or `background`
10
+ - `goal`: human-readable objective
11
+ - `metric`: name, direction, baseline, latest, and best values
12
+ - `subagent_pool`: standing-pool plan, role activation, and re-anchor guidance
13
+ - `continuation_policy`: launch approval boundary and post-launch stop rules
14
+ - `stats`: total iterations, keep/discard counters, best iteration, and discard streak
15
+ - `flags`: stop request, needs-human marker, and background activity
16
+ - `last_iteration`: summary of the latest completed iteration
17
+
18
+ ## Rules
19
+
20
+ 1. Baseline exactly once when the run is initialized.
21
+ 2. Update `updated_at` on every state mutation.
22
+ 3. Keep `metric.latest` aligned with the most recent finished iteration.
23
+ 4. Only update `metric.best` on strict improvement.
24
+ 5. Only the orchestrator records iterations or mutates the authoritative run state.
25
+ 6. Set `flags.needs_human=true` when autonomous progress should stop for user input.
26
+ 7. For detached runs, `flags.background_active` reflects whether a background owner is currently expected to continue the loop.
27
+ 8. If the pool metadata is missing from an older state file, reconstruct it from the goal, scope, and mode before resuming.
28
+
29
+ ## Resume Semantics
30
+
31
+ - `python scripts/autoresearch_runtime_ctl.py resume` clears `stop_requested` and marks the background run active again.
32
+ - Resume does not create a new run; it continues the existing state snapshot.
33
+ - Resume should re-anchor the standing pool with the latest metric, last iteration, and active role guidance before the next handoff.
34
+ - Completed runs are not resumable; to continue the work, start a new run.
35
+ - If the run is not a background run, resume should fail fast.
36
+
37
+ ## Completion Semantics
38
+
39
+ - `python scripts/autoresearch_runtime_ctl.py complete` moves a background run to `completed`, clears `background_active`, and ends the detached session lifecycle.
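A minimal Python sketch of rules 2 to 4 and the resume semantics described above, operating on an in-memory state dict. The function names, the `"lower"`/`"higher"` direction values, and the placement of `stop_requested` under `flags` are assumptions for illustration; the actual runtime control script may behave differently.

```python
from datetime import datetime, timezone


def _touch(state):
    # Rule 2: update `updated_at` on every state mutation.
    state["updated_at"] = datetime.now(timezone.utc).isoformat()


def record_metric(state, value):
    """Record the metric from a finished iteration; return True on strict improvement."""
    metric = state["metric"]
    metric["latest"] = value  # Rule 3: latest tracks the most recent iteration.
    best = metric.get("best")
    if best is None:
        improved = True
    elif metric["direction"] == "lower":  # assumed direction vocabulary
        improved = value < best
    else:
        improved = value > best
    if improved:
        metric["best"] = value  # Rule 4: only on strict improvement.
    _touch(state)
    return improved


def resume(state):
    """Continue an existing background run without creating a new one."""
    if state["mode"] != "background":
        raise RuntimeError("resume only applies to background runs")
    if state["status"] == "completed":
        raise RuntimeError("completed runs are not resumable")
    state["flags"]["stop_requested"] = False
    state["flags"]["background_active"] = True
    _touch(state)
    return state
```

Ties do not count as improvement here, so a repeated best value leaves `metric.best` and the best-iteration bookkeeping unchanged.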
@@ -0,0 +1,34 @@
1
+ # Structured Output Spec
2
+
3
+ Interactive runs should use three output phases.
4
+
5
+ ## Setup Summary
6
+
7
+ Before launch, summarize:
8
+
9
+ - goal
10
+ - scope
11
+ - metric and direction
12
+ - verify command
13
+ - guard command, if any
14
+ - run mode
15
+ - stop condition or iteration cap
16
+
17
+ ## Iteration Update
18
+
19
+ After each completed iteration, report:
20
+
21
+ - iteration number
22
+ - decision (`keep`, `discard`, or `needs_human`)
23
+ - short explanation
24
+ - current best-known metric, if available
25
+
26
+ ## Completion Summary
27
+
28
+ When the run ends, report:
29
+
30
+ - why it ended
31
+ - total iterations
32
+ - kept vs discarded counts
33
+ - best recorded metric, if available
34
+ - next action, if a blocker remains
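As an illustration of the completion summary, the sketch below builds the report from a list of per-iteration decisions. The function name `completion_summary` and the exact wording of each output line are assumptions; the spec above only fixes which items are reported.

```python
from collections import Counter


def completion_summary(reason, decisions, best_metric=None, next_action=None):
    """Format the end-of-run report from the decisions logged per iteration."""
    counts = Counter(decisions)
    lines = [
        f"ended: {reason}",
        f"total iterations: {len(decisions)}",
        f"kept: {counts['keep']}, discarded: {counts['discard']}",
    ]
    # Optional items are included only when they are available.
    if best_metric is not None:
        lines.append(f"best metric: {best_metric}")
    if next_action:
        lines.append(f"next action: {next_action}")
    return "\n".join(lines)
```

Iterations that ended in `needs_human` count toward the total but toward neither the kept nor the discarded tally.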
@@ -0,0 +1,42 @@
1
+ # Subagent Orchestration
2
+
3
+ Use this reference when a run should be subagent-first.
4
+
5
+ ## Orchestration Model
6
+
7
+ - The main agent is the orchestrator. It owns the goal, scope, metric, direction, verify command, guard, and final keep/discard decision.
8
+ - Subagents form a standing pool for parallel context gathering, alternative synthesis, verification, and critique.
9
+ - The main agent hands bounded questions to the pool, waits for findings, and folds those findings into the next iteration before changing code again.
10
+ - Subagents surface evidence, objections, risks, and candidate next steps. They do not independently advance the run state.
11
+ - Approval belongs before launch. Once the run is launched, keep the same pool moving until the user stops the run, the configured stop condition is met, or a real `needs_human` blocker appears.
12
+
13
+ ## Launch Decision
14
+
15
+ - Decide whether the standing pool should already be active during setup or only after launch.
16
+ - Default to an active pool for multi-step, uncertain, or unattended work.
17
+ - Fall back to orchestrator-only serial execution when the task is tiny or the environment cannot support clean parallel work.
18
+
19
+ ## Pool Rules
20
+
21
+ - Keep the pool alive across iterations unless context drift forces a reset.
22
+ - Reuse roles where possible so context compounds instead of resetting.
23
+ - Prefer a small pool with distinct jobs over one-off ad hoc spawning.
24
+ - Re-anchor the pool after every keep/discard decision with the latest goal, state, and results.
25
+ - Feed findings back into the loop before the next code change.
26
+
27
+ ## State Ownership
28
+
29
+ - Only the orchestrator records iterations, mutates `autoresearch-state.json`, and decides whether the latest step is `keep`, `discard`, or `needs_human`.
30
+ - Subagents may disagree, critique, or verify, but their output is supporting evidence.
31
+ - If several subagents contribute to one change, roll that evidence into one orchestrator-owned iteration result.
32
+
33
+ ## Fallback Rules
34
+
35
+ - If one subagent times out, conflicts with another role, or stops adding value, replace or drop that role without stopping the whole run.
36
+ - If the whole pool becomes unhelpful, continue serially under the orchestrator instead of rerunning setup.
37
+ - Only surface `needs_human` when the orchestrator cannot continue safely after folding in the latest pool findings.
38
+
39
+ ## Local Scope
40
+
41
+ - This bundle stays compact and narrower than the reference repo.
42
+ - Prefer the references in this repository over broader upstream orchestration patterns unless the user explicitly asks for them.