agentera 0.0.0 → 3.0.0-dev.1

Files changed (256)
  1. package/README.md +6 -45
  2. package/bundle/.agentera-npx-bundle.json +4 -0
  3. package/bundle/references/adapters/cursor.md +213 -0
  4. package/bundle/references/adapters/opencode.md +530 -0
  5. package/bundle/references/adapters/package-manifest-interface-model.yaml +337 -0
  6. package/bundle/references/adapters/package-registry.yaml +247 -0
  7. package/bundle/references/adapters/package-surface-characterization.md +48 -0
  8. package/bundle/references/adapters/runtime-adapter-characterization.md +79 -0
  9. package/bundle/references/adapters/runtime-adapter-interface-model.yaml +200 -0
  10. package/bundle/references/adapters/runtime-adapter-registry.yaml +548 -0
  11. package/bundle/references/adapters/runtime-feature-parity.md +189 -0
  12. package/bundle/references/analysis/benchmark.md +267 -0
  13. package/bundle/references/analysis/startup-measurement-contract.yaml +424 -0
  14. package/bundle/references/artifacts/artifact-registry-interface-model.yaml +288 -0
  15. package/bundle/references/cli/agent-ready-state-contract.yaml +950 -0
  16. package/bundle/references/cli/app-lifecycle-vocabulary.yaml +241 -0
  17. package/bundle/references/cli/audience-namespace-cli-migration.yaml +355 -0
  18. package/bundle/references/cli/bundle-skill-vocabulary.yaml +278 -0
  19. package/bundle/references/cli/capability-instruction-contract.yaml +123 -0
  20. package/bundle/references/cli/capability-tool-classification.yaml +53 -0
  21. package/bundle/references/cli/routing-execution-vocabulary.yaml +281 -0
  22. package/bundle/references/cli/update-channels.yaml +147 -0
  23. package/bundle/references/cli/vocabulary-index.yaml +160 -0
  24. package/bundle/references/cli/vocabulary.md +566 -0
  25. package/bundle/references/meta/documentation-inventory.md +43 -0
  26. package/bundle/references/v1-section-mapping.md +47 -0
  27. package/bundle/registry.json +39 -0
  28. package/bundle/skills/agentera/.claude-plugin/plugin.json +27 -0
  29. package/bundle/skills/agentera/SKILL.md +470 -0
  30. package/bundle/skills/agentera/agents/dokumentera.toml +6 -0
  31. package/bundle/skills/agentera/agents/hej.toml +6 -0
  32. package/bundle/skills/agentera/agents/inspektera.toml +6 -0
  33. package/bundle/skills/agentera/agents/inspirera.toml +6 -0
  34. package/bundle/skills/agentera/agents/optimera.toml +6 -0
  35. package/bundle/skills/agentera/agents/orkestrera.toml +6 -0
  36. package/bundle/skills/agentera/agents/planera.toml +6 -0
  37. package/bundle/skills/agentera/agents/profilera.toml +6 -0
  38. package/bundle/skills/agentera/agents/realisera.toml +6 -0
  39. package/bundle/skills/agentera/agents/resonera.toml +6 -0
  40. package/bundle/skills/agentera/agents/visionera.toml +6 -0
  41. package/bundle/skills/agentera/agents/visualisera.toml +6 -0
  42. package/bundle/skills/agentera/capabilities/dokumentera/instructions.md +428 -0
  43. package/bundle/skills/agentera/capabilities/dokumentera/schemas/artifacts.yaml +73 -0
  44. package/bundle/skills/agentera/capabilities/dokumentera/schemas/exit.yaml +35 -0
  45. package/bundle/skills/agentera/capabilities/dokumentera/schemas/triggers.yaml +35 -0
  46. package/bundle/skills/agentera/capabilities/dokumentera/schemas/validation.yaml +139 -0
  47. package/bundle/skills/agentera/capabilities/hej/instructions.md +331 -0
  48. package/bundle/skills/agentera/capabilities/hej/schemas/artifacts.yaml +69 -0
  49. package/bundle/skills/agentera/capabilities/hej/schemas/exit.yaml +32 -0
  50. package/bundle/skills/agentera/capabilities/hej/schemas/triggers.yaml +58 -0
  51. package/bundle/skills/agentera/capabilities/hej/schemas/validation.yaml +55 -0
  52. package/bundle/skills/agentera/capabilities/inspektera/instructions.md +514 -0
  53. package/bundle/skills/agentera/capabilities/inspektera/schemas/artifacts.yaml +76 -0
  54. package/bundle/skills/agentera/capabilities/inspektera/schemas/exit.yaml +36 -0
  55. package/bundle/skills/agentera/capabilities/inspektera/schemas/triggers.yaml +38 -0
  56. package/bundle/skills/agentera/capabilities/inspektera/schemas/validation.yaml +113 -0
  57. package/bundle/skills/agentera/capabilities/inspirera/instructions.md +280 -0
  58. package/bundle/skills/agentera/capabilities/inspirera/schemas/artifacts.yaml +24 -0
  59. package/bundle/skills/agentera/capabilities/inspirera/schemas/exit.yaml +33 -0
  60. package/bundle/skills/agentera/capabilities/inspirera/schemas/triggers.yaml +34 -0
  61. package/bundle/skills/agentera/capabilities/inspirera/schemas/validation.yaml +58 -0
  62. package/bundle/skills/agentera/capabilities/optimera/instructions.md +437 -0
  63. package/bundle/skills/agentera/capabilities/optimera/schemas/artifacts.yaml +69 -0
  64. package/bundle/skills/agentera/capabilities/optimera/schemas/exit.yaml +35 -0
  65. package/bundle/skills/agentera/capabilities/optimera/schemas/triggers.yaml +39 -0
  66. package/bundle/skills/agentera/capabilities/optimera/schemas/validation.yaml +91 -0
  67. package/bundle/skills/agentera/capabilities/orkestrera/instructions.md +433 -0
  68. package/bundle/skills/agentera/capabilities/orkestrera/schemas/artifacts.yaml +64 -0
  69. package/bundle/skills/agentera/capabilities/orkestrera/schemas/exit.yaml +34 -0
  70. package/bundle/skills/agentera/capabilities/orkestrera/schemas/triggers.yaml +42 -0
  71. package/bundle/skills/agentera/capabilities/orkestrera/schemas/validation.yaml +107 -0
  72. package/bundle/skills/agentera/capabilities/planera/instructions.md +368 -0
  73. package/bundle/skills/agentera/capabilities/planera/schemas/artifacts.yaml +62 -0
  74. package/bundle/skills/agentera/capabilities/planera/schemas/exit.yaml +33 -0
  75. package/bundle/skills/agentera/capabilities/planera/schemas/triggers.yaml +34 -0
  76. package/bundle/skills/agentera/capabilities/planera/schemas/validation.yaml +61 -0
  77. package/bundle/skills/agentera/capabilities/profilera/instructions.md +419 -0
  78. package/bundle/skills/agentera/capabilities/profilera/schemas/artifacts.yaml +18 -0
  79. package/bundle/skills/agentera/capabilities/profilera/schemas/exit.yaml +34 -0
  80. package/bundle/skills/agentera/capabilities/profilera/schemas/triggers.yaml +45 -0
  81. package/bundle/skills/agentera/capabilities/profilera/schemas/validation.yaml +57 -0
  82. package/bundle/skills/agentera/capabilities/realisera/instructions.md +403 -0
  83. package/bundle/skills/agentera/capabilities/realisera/schemas/artifacts.yaml +80 -0
  84. package/bundle/skills/agentera/capabilities/realisera/schemas/exit.yaml +35 -0
  85. package/bundle/skills/agentera/capabilities/realisera/schemas/triggers.yaml +39 -0
  86. package/bundle/skills/agentera/capabilities/realisera/schemas/validation.yaml +110 -0
  87. package/bundle/skills/agentera/capabilities/resonera/instructions.md +329 -0
  88. package/bundle/skills/agentera/capabilities/resonera/schemas/artifacts.yaml +47 -0
  89. package/bundle/skills/agentera/capabilities/resonera/schemas/exit.yaml +35 -0
  90. package/bundle/skills/agentera/capabilities/resonera/schemas/triggers.yaml +46 -0
  91. package/bundle/skills/agentera/capabilities/resonera/schemas/validation.yaml +77 -0
  92. package/bundle/skills/agentera/capabilities/visionera/instructions.md +309 -0
  93. package/bundle/skills/agentera/capabilities/visionera/schemas/artifacts.yaml +57 -0
  94. package/bundle/skills/agentera/capabilities/visionera/schemas/exit.yaml +35 -0
  95. package/bundle/skills/agentera/capabilities/visionera/schemas/triggers.yaml +41 -0
  96. package/bundle/skills/agentera/capabilities/visionera/schemas/validation.yaml +74 -0
  97. package/bundle/skills/agentera/capabilities/visualisera/instructions.md +400 -0
  98. package/bundle/skills/agentera/capabilities/visualisera/schemas/artifacts.yaml +44 -0
  99. package/bundle/skills/agentera/capabilities/visualisera/schemas/exit.yaml +34 -0
  100. package/bundle/skills/agentera/capabilities/visualisera/schemas/triggers.yaml +33 -0
  101. package/bundle/skills/agentera/capabilities/visualisera/schemas/validation.yaml +80 -0
  102. package/bundle/skills/agentera/capability_schema_contract.yaml +385 -0
  103. package/bundle/skills/agentera/protocol.yaml +463 -0
  104. package/bundle/skills/agentera/references/contract.md +1039 -0
  105. package/bundle/skills/agentera/schemas/artifacts/changelog.yaml +60 -0
  106. package/bundle/skills/agentera/schemas/artifacts/decisions.yaml +461 -0
  107. package/bundle/skills/agentera/schemas/artifacts/design.yaml +55 -0
  108. package/bundle/skills/agentera/schemas/artifacts/docs.yaml +402 -0
  109. package/bundle/skills/agentera/schemas/artifacts/experiments.yaml +373 -0
  110. package/bundle/skills/agentera/schemas/artifacts/health.yaml +484 -0
  111. package/bundle/skills/agentera/schemas/artifacts/objective.yaml +399 -0
  112. package/bundle/skills/agentera/schemas/artifacts/plan.yaml +342 -0
  113. package/bundle/skills/agentera/schemas/artifacts/progress.yaml +325 -0
  114. package/bundle/skills/agentera/schemas/artifacts/todo.yaml +110 -0
  115. package/bundle/skills/agentera/schemas/artifacts/vision.yaml +262 -0
  116. package/bundle/skills/hej/.claude-plugin/plugin.json +6 -0
  117. package/bundle/skills/hej/SKILL.md +69 -0
  118. package/bundle/skills/hej/agents/hej.toml +11 -0
  119. package/bundle/skills/hej/agents/openai.yaml +8 -0
  120. package/dist/analytics/extractCorpus.js +1791 -0
  121. package/dist/analytics/extractCorpus.js.map +1 -0
  122. package/dist/analytics/usageStats.js +487 -0
  123. package/dist/analytics/usageStats.js.map +1 -0
  124. package/dist/bin/agentera.js +4 -0
  125. package/dist/bin/agentera.js.map +1 -0
  126. package/dist/cli/appContext.js +226 -0
  127. package/dist/cli/appContext.js.map +1 -0
  128. package/dist/cli/argvalidate.js +41 -0
  129. package/dist/cli/argvalidate.js.map +1 -0
  130. package/dist/cli/capabilityContext.js +2421 -0
  131. package/dist/cli/capabilityContext.js.map +1 -0
  132. package/dist/cli/commands/backfill.js +84 -0
  133. package/dist/cli/commands/backfill.js.map +1 -0
  134. package/dist/cli/commands/capability.js +44 -0
  135. package/dist/cli/commands/capability.js.map +1 -0
  136. package/dist/cli/commands/compact.js +148 -0
  137. package/dist/cli/commands/compact.js.map +1 -0
  138. package/dist/cli/commands/doctor.js +180 -0
  139. package/dist/cli/commands/doctor.js.map +1 -0
  140. package/dist/cli/commands/lint.js +179 -0
  141. package/dist/cli/commands/lint.js.map +1 -0
  142. package/dist/cli/commands/prime.js +544 -0
  143. package/dist/cli/commands/prime.js.map +1 -0
  144. package/dist/cli/commands/query.js +346 -0
  145. package/dist/cli/commands/query.js.map +1 -0
  146. package/dist/cli/commands/report.js +210 -0
  147. package/dist/cli/commands/report.js.map +1 -0
  148. package/dist/cli/commands/schema.js +306 -0
  149. package/dist/cli/commands/schema.js.map +1 -0
  150. package/dist/cli/commands/state.js +1012 -0
  151. package/dist/cli/commands/state.js.map +1 -0
  152. package/dist/cli/commands/upgrade.js +48 -0
  153. package/dist/cli/commands/upgrade.js.map +1 -0
  154. package/dist/cli/commands/validate.js +519 -0
  155. package/dist/cli/commands/validate.js.map +1 -0
  156. package/dist/cli/commands/verify.js +204 -0
  157. package/dist/cli/commands/verify.js.map +1 -0
  158. package/dist/cli/dispatch.js +958 -0
  159. package/dist/cli/dispatch.js.map +1 -0
  160. package/dist/cli/orientation.js +595 -0
  161. package/dist/cli/orientation.js.map +1 -0
  162. package/dist/cli/prime-blob.js +3 -0
  163. package/dist/cli/prime-blob.js.map +1 -0
  164. package/dist/cli/stateQuery.js +292 -0
  165. package/dist/cli/stateQuery.js.map +1 -0
  166. package/dist/cli/structured.js +18 -0
  167. package/dist/cli/structured.js.map +1 -0
  168. package/dist/core/difflib.js +274 -0
  169. package/dist/core/difflib.js.map +1 -0
  170. package/dist/core/git.js +43 -0
  171. package/dist/core/git.js.map +1 -0
  172. package/dist/core/paths.js +50 -0
  173. package/dist/core/paths.js.map +1 -0
  174. package/dist/core/pyjson.js +101 -0
  175. package/dist/core/pyjson.js.map +1 -0
  176. package/dist/core/sourceRoot.js +72 -0
  177. package/dist/core/sourceRoot.js.map +1 -0
  178. package/dist/core/toml.js +11 -0
  179. package/dist/core/toml.js.map +1 -0
  180. package/dist/core/yaml.js +25 -0
  181. package/dist/core/yaml.js.map +1 -0
  182. package/dist/eval/evalSkills.js +258 -0
  183. package/dist/eval/evalSkills.js.map +1 -0
  184. package/dist/eval/semanticEval.js +148 -0
  185. package/dist/eval/semanticEval.js.map +1 -0
  186. package/dist/eval/semanticFixtures.js +227 -0
  187. package/dist/eval/semanticFixtures.js.map +1 -0
  188. package/dist/hooks/common.js +160 -0
  189. package/dist/hooks/common.js.map +1 -0
  190. package/dist/hooks/compaction.js +935 -0
  191. package/dist/hooks/compaction.js.map +1 -0
  192. package/dist/hooks/cursorPreToolUse.js +19 -0
  193. package/dist/hooks/cursorPreToolUse.js.map +1 -0
  194. package/dist/hooks/cursorSessionStart.js +71 -0
  195. package/dist/hooks/cursorSessionStart.js.map +1 -0
  196. package/dist/hooks/sessionStart.js +209 -0
  197. package/dist/hooks/sessionStart.js.map +1 -0
  198. package/dist/hooks/sessionStop.js +212 -0
  199. package/dist/hooks/sessionStop.js.map +1 -0
  200. package/dist/hooks/validateArtifact.js +933 -0
  201. package/dist/hooks/validateArtifact.js.map +1 -0
  202. package/dist/registries/artifactRegistry.js +206 -0
  203. package/dist/registries/artifactRegistry.js.map +1 -0
  204. package/dist/registries/capabilityContract.js +310 -0
  205. package/dist/registries/capabilityContract.js.map +1 -0
  206. package/dist/registries/packageRegistry.js +641 -0
  207. package/dist/registries/packageRegistry.js.map +1 -0
  208. package/dist/registries/runtimeAdapterRegistry.js +315 -0
  209. package/dist/registries/runtimeAdapterRegistry.js.map +1 -0
  210. package/dist/setup/codex.js +1056 -0
  211. package/dist/setup/codex.js.map +1 -0
  212. package/dist/setup/copilot.js +227 -0
  213. package/dist/setup/copilot.js.map +1 -0
  214. package/dist/setup/cursor.js +127 -0
  215. package/dist/setup/cursor.js.map +1 -0
  216. package/dist/setup/doctor.js +1276 -0
  217. package/dist/setup/doctor.js.map +1 -0
  218. package/dist/state/installRoot.js +279 -0
  219. package/dist/state/installRoot.js.map +1 -0
  220. package/dist/state/progressCommit.js +289 -0
  221. package/dist/state/progressCommit.js.map +1 -0
  222. package/dist/state/startupAnalysis.js +1953 -0
  223. package/dist/state/startupAnalysis.js.map +1 -0
  224. package/dist/upgrade/appModel.js +189 -0
  225. package/dist/upgrade/appModel.js.map +1 -0
  226. package/dist/upgrade/channels.js +208 -0
  227. package/dist/upgrade/channels.js.map +1 -0
  228. package/dist/upgrade/compatibility.js +201 -0
  229. package/dist/upgrade/compatibility.js.map +1 -0
  230. package/dist/upgrade/doctor.js +373 -0
  231. package/dist/upgrade/doctor.js.map +1 -0
  232. package/dist/upgrade/migrateArtifactsV2ToV3.js +332 -0
  233. package/dist/upgrade/migrateArtifactsV2ToV3.js.map +1 -0
  234. package/dist/upgrade/runtimeMigration.js +484 -0
  235. package/dist/upgrade/runtimeMigration.js.map +1 -0
  236. package/dist/upgrade/upgradeCommands.js +36 -0
  237. package/dist/upgrade/upgradeCommands.js.map +1 -0
  238. package/dist/upgrade/upgradeOrchestrator.js +299 -0
  239. package/dist/upgrade/upgradeOrchestrator.js.map +1 -0
  240. package/dist/upgrade/versionResolution.js +179 -0
  241. package/dist/upgrade/versionResolution.js.map +1 -0
  242. package/dist/validate/appHomeContract.js +150 -0
  243. package/dist/validate/appHomeContract.js.map +1 -0
  244. package/dist/validate/capability.js +412 -0
  245. package/dist/validate/capability.js.map +1 -0
  246. package/dist/validate/crossCapability.js +145 -0
  247. package/dist/validate/crossCapability.js.map +1 -0
  248. package/dist/validate/lifecycleAdapters.js +772 -0
  249. package/dist/validate/lifecycleAdapters.js.map +1 -0
  250. package/dist/validate/selfAudit.js +107 -0
  251. package/dist/validate/selfAudit.js.map +1 -0
  252. package/package.json +28 -8
  253. package/LICENSE +0 -201
  254. package/bin/agentera.mjs +0 -50
  255. package/lib/exec.mjs +0 -116
  256. package/lib/resolve.mjs +0 -129
@@ -0,0 +1,437 @@
1
+ # OPTIMERA
2
+
3
+ **Objective Pursuit: Targeted Iterative Measurement. Experiment, Record, Advance.**
4
+
5
+ Metric-driven optimization: improve any measurable property one experiment at a time. User defines the objective, agent writes an eval harness, harness becomes the immutable judge. Improve + pass regression = keep; everything else is discarded.
6
+
7
+ Each invocation = one experiment. `/loop` handles recurrence.
8
+
9
+ ---
10
+
11
+ ## Visual identity
12
+
13
+ Glyph: **⎘** (protocol ref: SG7). Used in the mandatory exit marker.
14
+
15
+ ---
16
+
17
+ ## State artifacts
18
+
19
+ Three artifacts per objective, under `.agentera/optimera/<objective-name>/`, bootstrapped if absent.
20
+
21
+ | Artifact | Purpose | Bootstrap |
22
+ |----------|---------|-----------|
23
+ | `.agentera/optimera/<objective-name>/objective.yaml` | What we're optimizing, why, how we measure it, and what "done" looks like. | Via inline brainstorm session with the user (see below). |
24
+ | `.agentera/optimera/<objective-name>/harness` | Eval script that measures the metric. Locked after user approval. | Written by the agent during brainstorm, approved by the user. |
25
+ | `.agentera/optimera/<objective-name>/experiments.yaml` | Log of every experiment: what was tried, what the metric said, kept or discarded. | First experiment entry in YAML form. |
26
+
27
+ ### Artifact path resolution
28
+
29
+ Before reading or writing any artifact, check if `.agentera/docs.yaml` exists. If it has an Artifact Mapping section, use the path specified for each canonical filename. If `.agentera/docs.yaml` doesn't exist or has no mapping for a given artifact, use the default layout: TODO.md, CHANGELOG.md, and DESIGN.md at the project root; canonical VISION.md at `.agentera/vision.yaml`; other agent-facing artifacts at `.agentera/*.yaml`. This applies to all artifact references in this capability, including cross-capability reads (`.agentera/decisions.yaml`). objective.yaml and experiments.yaml are NOT resolved via the docs.yaml mapping; they always live under `.agentera/optimera/<objective-name>/` for whichever objective is active.
30
+
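A minimal sketch of the default-layout fallback described above, in Python. The `mapping` argument stands in for whatever the Artifact Mapping section of `.agentera/docs.yaml` yields once parsed. Its shape (canonical filename to path) and the lowercase-stem rule for "other" artifacts are assumptions made for illustration, not Agentera's actual resolver.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional

# Artifacts that default to the project root, per the paragraph above.
ROOT_ARTIFACTS = {"TODO.md", "CHANGELOG.md", "DESIGN.md"}


def resolve_artifact(canonical: str, mapping: Optional[Dict[str, str]] = None) -> str:
    """Return the path to use for a canonical artifact filename."""
    # An explicit mapping entry wins when present.
    if mapping and canonical in mapping:
        return mapping[canonical]
    if canonical in ROOT_ARTIFACTS:
        return canonical
    # Assumption: other artifacts live at .agentera/<lowercase stem>.yaml,
    # e.g. VISION.md -> .agentera/vision.yaml.
    stem = PurePosixPath(canonical).stem.lower()
    return f".agentera/{stem}.yaml"
```

Per-objective files (objective.yaml, experiments.yaml, harness) are deliberately outside this lookup, since the text says they always live under the active objective directory.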
31
+ ### Contract
32
+
33
+ Before starting, read `references/contract.md` (at v2 skill location `skills/agentera/references/contract.md`) for authoritative values: token budgets, severity levels, format contracts, and other shared conventions referenced in the steps below. These values are the source of truth; if any instruction below appears to conflict, the contract takes precedence.
34
+
35
+ ### Benchmark context source contract
36
+
37
+ For benchmark-oriented optimization work, start from:
38
+
39
+ ```bash
40
+ agentera prime --context optimera --format json
41
+ ```
42
+
43
+ Use `benchmark_context` before direct retained startup benchmark file access. If `benchmark_context.source_contract.complete_for_benchmark_context` is true, do not read `latest-report.json`, `latest-report.md`, or `runs.jsonl` during normal Optimera startup. Use the bounded fields in `benchmark_context.latest_report`, `benchmark_context.history_summary`, `benchmark_context.runtime_coverage`, `benchmark_context.state_access_metrics`, `benchmark_context.token_impact`, `benchmark_context.comparison`, `benchmark_context.recommendation`, and `benchmark_context.manual_refresh` instead.
44
+
45
+ If `benchmark_context` is incomplete, follow `benchmark_context.fallback_commands` and `benchmark_context.manual_refresh` first. Direct reads of retained benchmark files are last-resort diagnostics only, and must preserve the context caveats rather than reconstructing hidden state. Never run `mage bench:startupState` automatically; it is manual-only.
46
+
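The gate described above can be sketched in Python as follows. This is only an illustration: it assumes `benchmark_context` sits at the top level of the JSON printed by `agentera prime --context optimera --format json` and that `fallback_commands` is a list of strings. Neither detail is confirmed by this diff, and `benchmark_source` is a hypothetical helper name.

```python
import json
from typing import List, Tuple


def benchmark_source(prime_json: str) -> Tuple[str, List[str]]:
    """Decide where benchmark evidence should come from at startup.

    Returns ("context", []) when the bounded context is complete, otherwise
    ("fallback", commands) listing what to follow before any direct read.
    """
    data = json.loads(prime_json)
    ctx = data.get("benchmark_context") or {}
    contract = ctx.get("source_contract") or {}
    if contract.get("complete_for_benchmark_context") is True:
        # Complete context: no reads of latest-report.* or runs.jsonl.
        return ("context", [])
    # Incomplete context: follow fallback commands first. Direct file reads
    # stay a last-resort diagnostic and are never chosen automatically here.
    return ("fallback", list(ctx.get("fallback_commands") or []))
```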
47
+ When reporting benchmark evidence, preserve caveats about manual-only execution, missing or malformed retained evidence, empty local history, runtime coverage degradation, missing token estimates, non-comparable previous rows, and privacy boundaries. Do not expose raw transcripts, raw corpus files, raw intermediates, raw runtime store paths, raw session IDs, private salts, generated salted hashes, raw benchmark report bodies, or full local benchmark paths.
48
+
49
+ ### objective.yaml
50
+
51
+ Evergreen. Created via brainstorm on first run, refined only when the user explicitly asks. Outside those two cases, the agent reads it but never writes it. Typical structure:
52
+
53
+ ```yaml
54
+ target: Optimization target name
55
+ status: active
56
+ objective: >-
57
+   Precise metric, current value, and target value, for example reduce p95
58
+   latency of /api/search from 320ms to under 100ms.
59
+ why: >-
60
+   What changes when the target is hit, who benefits, and what tradeoffs matter.
61
+ measurement:
62
+   command: .agentera/optimera/<objective-name>/harness
63
+   metric: p95_latency_ms
64
+   direction: lower
65
+   baseline: 320
66
+   target: 100
67
+   budget:
68
+     runs: 5
69
+     time_limit: 10m
70
+ constraints:
71
+   - Existing tests must pass.
72
+   - Public API must not change.
73
+ scope:
74
+   included: [api/search]
75
+   excluded: [public_api]
76
+ ```
77
+
78
+ The objective must be precise enough to measure, constraints clear enough to enforce, and scope defined enough to prevent wandering.
79
+
80
+ Fixed budgets are part of the measurement contract, not experiment strategy. Keep them in objective.yaml and the locked harness. Do not store budget state in root artifacts, registries, symlinks, or DOCS.md mappings. experiments.yaml records the budget actually used only when that evidence matters to interpret the result.
81
+
82
+ ### `.agentera/optimera/<objective-name>/harness`
83
+
84
+ Script that measures the metric and outputs structured JSON. Written during brainstorm, approved by the user, then **locked**. Never modified during optimization cycles.
85
+
86
+ Wraps the project's own tooling (test runners, benchmarks, linters) and translates output into a consistent format. The project's tooling is the source of truth.
87
+
88
+ **Before writing a harness**, inspect the project's existing test, benchmark, lint, or measurement commands. The Agentera app currently ships only the shared contract reference, so harness specifics come from project tooling and the objective's measurement fields.
89
+
90
+ **Output contract** (minimal):
91
+
92
+ ```json
93
+ {"metric": <number>, "direction": "higher"|"lower"}
94
+ ```
95
+
96
+ **Output contract** (with optional fields for richer signal):
97
+
98
+ ```json
99
+ {"metric": 85.5, "direction": "higher", "unit": "%", "detail": "42/50 tests passing", "breakdown": [{"name": "unit", "value": 95.0}, {"name": "integration", "value": 60.0}]}
100
+ ```
101
+
102
+ The harness is the **immutable ground truth**, separating measurement from optimization. If wrong, the user must explicitly ask to rebuild it.
103
+
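As an illustration of the minimal output contract above, a caller could check the harness output before trusting it. This Python sketch is not part of the package; `parse_harness_output` is a hypothetical helper name, and only the two required fields (`metric`, `direction`) are validated.

```python
import json
from numbers import Real


def parse_harness_output(raw: str) -> dict:
    """Parse harness stdout and enforce the minimal output contract."""
    result = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(result, dict):
        raise ValueError("harness output must be a JSON object")
    metric = result.get("metric")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(metric, bool) or not isinstance(metric, Real):
        raise ValueError("'metric' must be a number")
    if result.get("direction") not in ("higher", "lower"):
        raise ValueError("'direction' must be 'higher' or 'lower'")
    return result  # optional fields (unit, detail, breakdown) pass through
```

Optional fields are left untouched, so a richer harness output remains valid under the same check.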
104
+ ### experiments.yaml
105
+
106
+ When presenting experiment results, open with your interpretation of what happened before the structured data. "Here's what I tried and what it told us"; then the metrics table backs it up. Call out surprises, dead ends, and what the result changes about the approach.
107
+
108
+ ```yaml
109
+ experiments:
110
+ - number: N
111
+ timestamp: YYYY-MM-DD HH:MM
112
+ hypothesis: What we expected to improve and why.
113
+ method: The approach taken to test the hypothesis.
114
+ change: One-line summary of the code change.
115
+ metric:
116
+ before: 320
117
+ after: 250
118
+ direction: lower
119
+ verdict: better
120
+ regression: pass
121
+ status: kept
122
+ commit: <hash>
123
+ inspiration: External source, if any.
124
+ conclusion: What the experiment taught.
125
+ next: What the result suggests trying next.
126
+ ```
127
+
128
+ Closure entries are appended once when the objective reaches its target:
129
+
130
+ ```yaml
131
+ closure:
132
+   timestamp: YYYY-MM-DDTHH:MM:SSZ
133
+   final_value: <value>
134
+   target: <target>
135
+   reason: already met at startup
136
+ ```
137
+
138
+ The `next` field from the previous experiment is a suggestion, not a mandate. Re-evaluate fresh each cycle based on the full experiment history.
139
+
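The keep-or-discard rule from the introduction ("Improve + pass regression = keep; everything else is discarded") maps onto the `verdict`, `regression`, and `status` fields roughly as in this Python sketch. The values `better`, `pass`, `kept`, and `discarded` appear in the text above. The values `worse`, `same`, and `fail` are assumed here for illustration, and `judge` is a hypothetical helper name.

```python
def judge(before: float, after: float, direction: str, regression_passed: bool) -> dict:
    """Derive verdict, regression, and status for one experiment."""
    if direction not in ("higher", "lower"):
        raise ValueError("direction must be 'higher' or 'lower'")
    if after == before:
        verdict = "same"  # assumed label
    elif (after > before) == (direction == "higher"):
        verdict = "better"
    else:
        verdict = "worse"  # assumed label
    # Keep only when the metric improved AND the regression check passed.
    kept = verdict == "better" and regression_passed
    return {
        "verdict": verdict,
        "regression": "pass" if regression_passed else "fail",  # "fail" assumed
        "status": "kept" if kept else "discarded",
    }
```

With the example values from the entry above, `judge(320, 250, "lower", True)` yields a kept experiment, while the same improvement with a failed regression check is discarded.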
140
+ ### Experiment history analyzer contract
141
+
142
+ `npx -y agentera experiments` is the read-only summary layer for rich experiments.yaml records. It must inspect the active objective directory only. The command never creates root objective artifacts, registries, symlinks, DOCS.md fixed mappings, or sidecar ledgers.
143
+
144
+ ---
145
+
146
+ ## Brainstorm: bootstrapping or refining the objective
147
+
148
+ This runs in two situations:
149
+
150
+ 1. **objective.yaml doesn't exist**: the first time optimera runs on a project
151
+ 2. **User explicitly asks** to refine the objective (e.g., "change the target", "update objective.yaml")
152
+
153
+ In all other cases, skip straight to the cycle.
154
+
155
+ ### How the brainstorm works
156
+
157
+ Act as the sharp colleague figuring out what to optimize. Ask one question at a time, push for precision, and push back on vague targets. Call out when an objective is too fuzzy to measure or when constraints are missing.
158
+
159
+ 1. **Objective**: "What metric, current value, target?" If code exists, run existing test/bench/lint commands first.
160
+ 2. **Motivation**: "Why does this matter? What breaks at current value? What's possible at target?"
161
+ 3. **Constraints**: "What must NOT break? Off-limits files? Resource limits?" If a decision profile exists, propose constraints from it.
162
+ 4. **Scope**: "Which parts to focus on? Where are the biggest gains?" Read codebase to propose informed boundaries.
163
+ 5. **Pre-write self-audit**: run `agentera lint --artifact <ARTIFACT> --text "<DRAFT>"` (or `--file <PATH>`; schema names such as `decisions` auto-resolve the artifact file when no input is given) on the draft entry to check verbosity overruns, abstraction creep, and filler accumulation. Max 3 revision attempts. Flag with [post-audit-flagged] if still failing.
164
+ 6. **Write objective.yaml**: synthesize into a precise charter. Write to `.agentera/optimera/<objective-name>/objective.yaml`. Present for approval.
165
+ 7. **Write the eval harness**: use the project's own tooling and the objective's measurement fields. Write `.agentera/optimera/<objective-name>/harness` so it outputs JSON with at least `metric` and `direction`. Present, explain, get approval, run once to establish baseline.
166
+
167
+ Artifact writing follows contract Artifact Writing Conventions: banned verbosity patterns, 25-word sentence cap, preferred vocabulary, and lead-with-conclusion structure.
168
+
169
+ When **refining**, read current objective.yaml, show proposed changes with rationale, get confirmation. If the harness changes, the user must approve the new version. After brainstorm, proceed to experiment 1.
170
+
171
+ ---
172
+
173
+ ## The cycle
174
+
175
+ Skill introduction: `─── ⎘ optimera · experiment N ───`
176
+
177
+ Step markers: display `── step N/8: verb` before each step.
178
+ Steps: orient, analyze, hypothesize, implement, measure, decide, audit, log.
179
+
180
+ ### Step 1: Orient
181
+
182
+ **Benchmark context**: for benchmark-oriented work, use `benchmark_context` from `agentera prime --context optimera --format json` before direct retained benchmark files. Raw benchmark file reads are last-resort diagnostics.
183
+
184
+ **Active-objective inference**: before reading any per-objective artifact, determine which objective is active by inspecting `.agentera/optimera/`:
185
+
186
+ - If no objective subdirectories exist, keep the existing new-objective path: run the brainstorm.
187
+ - For each objective subdirectory with an objective.yaml, classify it as closed when it contains `status: closed`, and do so before selecting an active objective. Do not reopen closed objectives.
188
+ - If the user explicitly names a closed objective, load its objective.yaml and experiments.yaml read-only for context, summarize that it is closed, and ask before defining successor work.
189
+ - If one or more objective subdirectories exist and all are closed, ask the user for a successor objective.
190
+ - If only one non-closed subdirectory exists, use it.
191
+ - If multiple non-closed subdirectories exist, run `git log -1 --format=%aI -- .agentera/optimera/<name>/experiments.yaml` for each and pick the one with the most recent commit timestamp (the author date of the last commit that touched the file).
192
+ - If the result is ambiguous, ask the user to specify the active objective by name.
193
+
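The inference rules in the list above can be sketched as a pure function. In this Python illustration the input is a hypothetical dict of objective name to `{"status": ..., "last_commit": ...}`, where `last_commit` holds the `%aI` output from `git log`. Treating a missing or tied timestamp as the "ambiguous" case is an assumption, as are the action labels. The explicitly named closed objective case is left out.

```python
from datetime import datetime
from typing import Dict, Optional, Tuple


def select_active(objectives: Dict[str, dict]) -> Tuple[str, Optional[str]]:
    """Return (action, objective_name) following the inference rules."""
    if not objectives:
        return ("brainstorm", None)  # no objective subdirectories yet
    # Closed objectives are filtered out first and never reopened.
    open_objs = {n: o for n, o in objectives.items() if o.get("status") != "closed"}
    if not open_objs:
        return ("ask_successor", None)  # everything is closed
    if len(open_objs) == 1:
        return ("use", next(iter(open_objs)))
    # Several open objectives: compare last-commit timestamps.
    if any(not o.get("last_commit") for o in open_objs.values()):
        return ("ask_user", None)  # assumed: missing timestamp is ambiguous
    dated = sorted(
        ((datetime.fromisoformat(o["last_commit"]), n) for n, o in open_objs.items()),
        reverse=True,
    )
    if dated[0][0] == dated[1][0]:
        return ("ask_user", None)  # assumed: a tie is ambiguous
    return ("use", dated[0][1])
```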
194
+ All subsequent references to objective.yaml, experiments.yaml, and harness refer to the files under `.agentera/optimera/<active-objective-name>/`.
195
+
196
+ 1. **experiments.yaml**: last 5 experiments only (check for plateau patterns)
197
+ 2. **objective.yaml**: the metric, target, constraints, and scope
198
+ 3. **Decision profile**: read `$PROFILERA_PROFILE_DIR/PROFILE.md` directly when it exists. Apply confidence thresholds per contract profile consumption conventions. If missing, proceed without persona grounding but flag it.
199
+ 4. **Project discovery** (experiment 1 or when unfamiliar): map directory structure within scope, read dependency manifests, and read README.md, CLAUDE.md, AGENTS.md.
200
+ 5. `git log --oneline -20` for recent changes
201
+
202
+ Before experimenting: in your response, list the current baseline, target, status, and constraints from objective.yaml.
203
+
204
+ **Objective closure procedure**: when closing an objective, update objective.yaml with canonical closed state: `status: closed`, `closed_at: <ISO-8601 UTC timestamp>`, `final_value: <value>`, `target: <target>`, and `reason: <reason>`. Append one experiments.yaml closure entry. Do not append duplicates.
205
+
206
+ **Exit-early stop condition**: If objective.yaml or experiments.yaml evidence shows the target is already met and the objective is not already closed, run the objective closure procedure with reason `already met at startup`, report exit signal `complete: objective achieved`, and stop before Analyze.
207
+
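A small Python sketch of the exit-early check and the closure entry it produces. Treating a value equal to the target as "met" is an assumption (an objective phrased as "under 100ms" may want a strict comparison), and both helper names are hypothetical. The entry fields follow the closure example shown earlier.

```python
from datetime import datetime, timezone
from typing import Optional


def target_met(value: float, target: float, direction: str) -> bool:
    """True when the current value already satisfies the target."""
    if direction not in ("higher", "lower"):
        raise ValueError("direction must be 'higher' or 'lower'")
    # Assumption: reaching the target exactly counts as met.
    return value <= target if direction == "lower" else value >= target


def closure_entry(value: float, target: float, reason: str,
                  now: Optional[datetime] = None) -> dict:
    """Build the single closure entry appended to experiments.yaml."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return {
        "closure": {
            "timestamp": now.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "final_value": value,
            "target": target,
            "reason": reason,
        }
    }
```

A caller would still need to check that no closure entry exists before appending, since the procedure forbids duplicates.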
208
+ ### Step 2: Analyze
209
+
210
+ Run two things:
211
+
212
+ **2a. Experiment history analysis**: if experiments.yaml has prior entries, run:
213
+
214
+ ```bash
215
+ npx -y agentera experiments
216
+ ```
217
+
218
+ Outputs recent experiment status counts, metric deltas, conclusions, and next-step notes.
219
+
220
+ **2b. Current metric**: run the eval harness to get the baseline for this experiment:
221
+
222
+ ```bash
223
+ chmod +x .agentera/optimera/<objective-name>/harness && .agentera/optimera/<objective-name>/harness
224
+ ```
225
+
226
+ Parse the JSON output. Record the current metric as the baseline.
227
+
228
+ **Plateau detection**: if `plateau_detected: true` (no improvement in 3+ experiments), flag explicitly. Consider a radically different approach, ⬚ inspirera, or escalate to the user.
229
+
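How `plateau_detected` is computed is not shown in this diff. One plausible reading of "no improvement in 3+ experiments" is that none of the most recent experiments was kept, which this Python sketch illustrates. The use of the `status` field as the improvement signal and the default window of 3 are assumptions.

```python
from typing import List


def plateau_detected(statuses: List[str], window: int = 3) -> bool:
    """Return True when the last `window` experiments were all non-kept.

    `statuses` lists experiment status values, oldest first.
    """
    if len(statuses) < window:
        return False  # not enough history to call it a plateau
    return all(status != "kept" for status in statuses[-window:])
```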
230
+ ### Step 3: Hypothesize
231
+
232
+ Formulate a single, focused hypothesis.
233
+
234
+ Effort-bias check: if one hypothesis took more effort to construct, reset before selection. Choose by experiment history, expected metric impact, risk, constraints, and smallest falsifiable test; construction effort is not evidence.
235
+
236
+ 1. **Review history**: what's been tried, what worked, what failed?
237
+ 2. **Seek inspiration**: for non-trivial domains, 2-3 targeted web queries for techniques, libraries, or patterns.
238
+ 3. **Formulate**: "I expect [change] to improve the metric because [reasoning]." Must be falsifiable.
239
+
240
+ Be conservative early; escalate if conservative approaches plateau.
241
+
242
+ ### Step 4: Implement
243
+
244
+ **Pre-spawn Git commit**: before creating the worktree, commit any pending artifact changes so the subagent branches from current state.
245
+
246
+ 1. Run `git status --porcelain`. If empty, skip to spawn.
247
+ 2. Stage only the artifact files this session wrote.
248
+ 3. Commit with `chore(optimera): checkpoint before worktree dispatch`. Do not pass `--no-verify`.
249
+ 4. If pre-commit hooks reject the commit: fix and retry. If retry also fails, abort the spawn.
250
+
251
+ **Stale-base awareness**: some harnesses create the worktree branch from `origin/main` rather than local `HEAD`. Before spawning, run `git rev-list --count origin/main..HEAD`. If the count is greater than zero, the worktree will be based on a stale commit. Proceed with spawn, but in Step 5 do NOT merge the worktree branch: fetch the diff and apply it to the main checkout. Re-run the eval harness in the main checkout.
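
A minimal sketch of the stale-base check, assuming Python is available to the agent. The `git rev-list --count origin/main..HEAD` command is the one named above; the function names and the `"merge"` / `"apply-diff"` labels are hypothetical.

```python
import subprocess


def commits_ahead_of_origin_main(repo_dir="."):
    """Run `git rev-list --count origin/main..HEAD` and return the count."""
    result = subprocess.run(
        ["git", "rev-list", "--count", "origin/main..HEAD"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git fails, e.g. origin/main is unknown
    )
    return int(result.stdout.strip())


def integration_mode(ahead_count):
    """Choose how to integrate the worktree result after the experiment."""
    # A non-zero count means the worktree may branch from a stale commit,
    # so the diff is applied to the main checkout instead of merging.
    return "apply-diff" if ahead_count > 0 else "merge"
```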
252
+
253
+ Runtime subagent mechanisms:
254
+
255
+ | Runtime | Substrate | Limitation |
256
+ |---------|-----------|------------|
257
+ | Claude Code | Task tool with worktree-aware prompt | Native in-session spawn. |
258
+ | OpenCode | `@<capability>` descriptors from `~/.config/opencode/agents/*.md` or a host Task subagent | Same working tree unless this step explicitly creates and targets a manual git worktree. |
259
+ | Codex CLI | `~/.codex/agents/*.toml` descriptors plus `[agents]` limits | Agentera setup installs descriptor files; do not write legacy `[agents.<name>]` config blocks. |
260
+ | Copilot CLI | User-driven `/fleet` or equivalent host action | No guaranteed programmatic in-session spawn. |
261
+
262
+ Never spawn workers by running unsupported capability-name CLI commands such as `agentera optimera`; use the runtime-native subagent surface with the experiment prompt below.
263
+
264
+ Spawn an implementation sub-agent in a worktree (`isolation: "worktree"`) with:
265
+
266
+ - The hypothesis from step 3
267
+ - Relevant context files (objective.yaml, recent experiments, source files being modified)
268
+ - Clear constraint: implement the hypothesis and nothing else
269
+
270
+ ```
271
+ You are implementing one optimization experiment for [project].
272
+
273
+ ## Hypothesis
274
+ [The hypothesis]
275
+
276
+ ## Context
277
+ - Current metric: [value] ([unit])
278
+ - Target: [target value]
279
+ - Scope: [files/modules in scope from objective.yaml]
280
+
281
+ ## Constraints
282
+ - Implement ONLY what the hypothesis describes. No scope creep.
283
+ - Do NOT modify the eval harness at .agentera/optimera/<objective-name>/harness.
284
+ - Do NOT modify objective.yaml or experiments.yaml.
285
+ - Follow existing code patterns and conventions.
286
+ - Read the files you are modifying before changing them.
287
+ - Keep the change as small as possible while testing the hypothesis.
288
+ - If you encounter a bug unrelated to your task, note it but do not fix it.
289
+ ```
290
+
291
+ Wait for the implementation agent to complete before proceeding.
292
+
293
+ ### Step 5: Measure
294
+
295
+ After implementation completes, run two checks in sequence:
296
+
297
+ **5a. Regression check**: run the project's existing test/build/lint suite. If the regression check fails, **stop here**. The experiment is discarded. Do not run the eval harness. Log the regression failure and move to Step 7.
298
+
299
+ **5b. Metric measurement**: run the eval harness. Parse the JSON output. Compare the new metric against the baseline from Step 2.
300
+
301
+ ### Step 6: Decide
302
+
303
+ Present the decision conversationally: what the numbers say and what you'd recommend, then the structured gate below makes it official.
304
+
305
+ Apply the decision gate. **Both conditions must be true** to keep an experiment:
306
+
307
+ 1. **Regression check passed** (from Step 5a)
308
+ 2. **Metric improved**: the new value is strictly better than the baseline, in the direction declared by the harness (lower for "lower", higher for "higher")
309
+
310
+ If both pass: **keep**. Merge the worktree branch into the current branch. Commit with a conventional commit message:
311
+
312
+ ```
313
+ perf(scope): summary of what improved the metric
314
+
315
+ Metric: <before> → <after> ⮉ (<unit>)
316
+ ```
317
+
318
+ If either fails: **discard**. The worktree is abandoned. No merge. No commit.
319
+
320
+ If the kept experiment's new metric also meets the target in the harness direction, mark the objective as ready for closure after the experiment entry is logged in Step 8.
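
The gate above reduces to a small pure function. The sketch below is illustrative only: the function names and return shape are assumptions, and "meets the target" is read as inclusive (`<=` or `>=`), which the source does not state explicitly.

```python
def is_better(new, baseline, direction):
    """Strict improvement in the direction declared by the harness."""
    if direction == "lower":
        return new < baseline
    if direction == "higher":
        return new > baseline
    raise ValueError(f"unknown direction: {direction!r}")


def meets_target(value, target, direction):
    """Assumed inclusive comparison against the target."""
    return value <= target if direction == "lower" else value >= target


def decide(regression_passed, baseline, new, direction, target=None):
    """Apply the two-condition gate; `new` may be None if 5b was skipped."""
    keep = bool(regression_passed and is_better(new, baseline, direction))
    ready = bool(keep and target is not None and meets_target(new, target, direction))
    return {"decision": "keep" if keep else "discard", "ready_for_closure": ready}
```

An unchanged metric is discarded because the comparison is strict, and a failed regression check short-circuits before the metric is inspected.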
321
+
322
+ ### Step 7: Pre-write self-audit
323
+
324
+ Run `agentera lint --artifact <ARTIFACT> --text "<DRAFT>"` (or `--file <PATH>`; schema names such as `decisions` auto-resolve the artifact file when no input is given) on the draft entry. The lint checks for verbosity overruns (per-artifact budget), abstraction creep (>=1 concrete anchor), and filler accumulation (banned patterns table). Make at most 3 revision attempts. If the draft still fails, flag it with [post-audit-flagged].
325
+
326
+ Narration voice (riff, don't script):
327
+ "Tightening this up..." · "Cutting the filler first..." · "One more pass..."
328
+
329
+ ### Step 8: Log
330
+
331
+ Summarize the experiment for the user before writing the log: what moved, what didn't, and what it suggests trying next. Then write the structured record.
332
+
333
+ Update **experiments.yaml**: append the experiment entry. Output constraint per contract token budgets.
334
+
335
+ If Step 6 marked the objective as ready for closure, immediately run the objective closure procedure with reason `experiment met target`. This closure is part of the same log step, after the experiment result is recorded.
336
+
337
+ When appending a new experiment entry to experiments.yaml, first check the schema COMPACTION thresholds. If they are exceeded, apply the compaction rules before writing: keep 10 full experiments, keep up to 40 one-line archive entries, and drop anything beyond 50 total.
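
The 10 / 40 / 50 thresholds can be sketched as below. This assumes entries are dicts in chronological order with `id` and `status` keys and that a one-line archive entry is just `id: status`; the real schema and archive format are not shown in the source.

```python
def compact(entries, full_keep=10, archive_keep=40):
    """Split entries into one-line archive summaries and full records.

    entries: experiment dicts, oldest first (assumed ordering).
    Returns (archive_lines, full_entries); anything older than the
    newest full_keep + archive_keep entries is dropped.
    """
    full = entries[-full_keep:]
    older = entries[:-full_keep]
    archived = older[-archive_keep:]
    # Hypothetical one-line format; the real archive format may differ.
    archive_lines = [f"{e['id']}: {e['status']}" for e in archived]
    return archive_lines, full
```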
338
+
339
+ Artifact writing follows contract Artifact Writing Conventions: banned verbosity patterns, 25-word sentence cap, preferred vocabulary, and lead-with-conclusion structure.
340
+
341
+ Then stop. One experiment complete.
342
+
343
+ ---
344
+
345
+ ## Safety rails
346
+
347
+ <critical>
348
+
349
+ - NEVER push to any remote. Local commits only.
350
+ - NEVER modify the eval harness (`.agentera/optimera/<objective-name>/harness`) during an optimization cycle. Only touch it during a brainstorm (bootstrap or user-requested refinement).
351
+ - NEVER modify objective.yaml during a cycle except to record canonical closure when the target is met. Other objective.yaml edits only happen during brainstorm or refine.
352
+ - NEVER bypass the project's test/lint/build suite. Regression check before every metric measurement. Regression failure = automatic discard.
353
+ - NEVER modify git config or skip git hooks.
354
+ - NEVER force push, amend published commits, or run destructive git operations.
355
+ - NEVER keep an experiment that causes a regression, even if the metric improved.
356
+ - NEVER add placeholder data or functionality. All code must be real and functional.
357
+ - NEVER modify files outside the scope declared in objective.yaml (when scope is declared).
358
+ - One experiment per invocation. Do not attempt multiple experiments.
359
+
360
+ </critical>
361
+
362
+ ---
363
+
364
+ ## Handling blocked experiments
365
+
366
+ If blocked (missing dependency, ambiguous constraint, too risky):
367
+
368
+ 1. Log blocked hypothesis in experiments.yaml with context and decision needed
369
+ 2. Formulate a different hypothesis and complete a full experiment on that instead
370
+
371
+ ---
372
+
373
+ ## Exit signals
374
+
375
+ Report one of these statuses at workflow completion (protocol refs: EX1-EX4).
376
+
377
+ Format: `─── ⎘ optimera · status ───` followed by a summary sentence.
378
+ For flagged, stuck, and waiting: add `▸` bullet details below the summary.
379
+
380
+ - **complete** (EX1): One experiment completed the full cycle: hypothesis formulated, implementation dispatched, regression check passed, metric measured, decision made (kept or discarded), and experiments.yaml updated.
381
+ - **flagged** (EX2): The experiment cycle completed but with issues worth noting: the metric did not improve after multiple attempts, a plateau was detected, or the experiment had to be discarded due to a regression.
382
+ - **stuck** (EX3): Cannot proceed because objective.yaml is missing and the brainstorm cannot be completed without user input, the eval harness is broken and cannot be repaired without user approval, or the regression check infrastructure is unavailable.
383
+ - **waiting** (EX4): The optimization objective is too vague to experiment against, the metric cannot be measured by any available tooling, or the scope is undefined and cannot be safely inferred.
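
The exit-signal format can be rendered with a small helper. This is a sketch under the format string given above; the function name and the choice to ignore details for `complete` are assumptions.

```python
VALID_STATUSES = ("complete", "flagged", "stuck", "waiting")
DETAIL_STATUSES = {"flagged", "stuck", "waiting"}


def format_exit_signal(status, summary, details=()):
    """Render the marker line, the summary, and optional detail bullets."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    lines = [f"─── ⎘ optimera · {status} ───", summary]
    if status in DETAIL_STATUSES:
        # Detail bullets only apply to flagged, stuck, and waiting.
        lines.extend(f"▸ {detail}" for detail in details)
    return "\n".join(lines)
```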
384
+
385
+ Before reporting any status, inspect the last 3 entries in PROGRESS.md. If all 3 entries record failed or discarded experiments, this constitutes 3 consecutive failures: **stop the cycle**, log the failure pattern to TODO.md, and surface the situation to the user with a recommended course of action. Do not attempt a 4th consecutive experiment on the same problem.
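
The three-consecutive-failures check amounts to inspecting the tail of the progress log. A minimal sketch, assuming each entry has already been reduced to a status string and that `failed` and `discarded` are the literal values used:

```python
FAILURE_STATUSES = {"failed", "discarded"}  # assumed literal status values


def three_consecutive_failures(statuses):
    """Return True if the last 3 recorded statuses are all failures.

    statuses: status strings in chronological order, oldest first.
    """
    last_three = statuses[-3:]
    return len(last_three) == 3 and all(s in FAILURE_STATUSES for s in last_three)
```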
386
+
387
+ ---
388
+
389
+ ## Cross-capability integration
390
+
391
+ Optimera is part of a twelve-capability suite. Each capability can invoke the others when the work calls for it.
392
+
393
+ ### Optimera invokes ⬚ inspirera
394
+
395
+ When the Hypothesize step needs external techniques (especially after a plateau), search for approaches the way ⬚ inspirera would. Read the source deeply, extract transferable patterns, and fold them into the next hypothesis.
396
+
397
+ ### Realisera invokes ⎘ optimera
398
+
399
+ When realisera picks work that is optimization-shaped (e.g., "improve test performance by 20%", "reduce build time", "increase coverage"), it can delegate to optimera. Realisera provides the context; optimera runs the optimization loop.
400
+
401
+ ### Optimera reads ♾ profilera output
402
+
403
+ Every experiment reads `$PROFILERA_PROFILE_DIR/PROFILE.md` when it exists and applies confidence thresholds per contract profile consumption conventions. Effective confidence weighting ensures stale preferences don't over-constrain experiments.
404
+
405
+ ### Optimera uses ❈ resonera for objective decisions
406
+
407
+ When the brainstorm session surfaces ambiguity about what to optimize (competing metrics, unclear constraints, or tradeoffs between measurement approaches), suggest ❈ resonera to deliberate first. Resonera can produce or refine objective.yaml directly, and its DECISIONS.md entries give optimera context for why the objective was chosen. During Orient, use `agentera decisions --format json` for prior deliberation context and preserve returned `missing_fields`, `compacted`, `caveats`, and `satisfaction.review_needed` pressure instead of raw-reading missing historical context.
408
+
409
+ ### Inspektera feeds ⎘ optimera
410
+
411
+ When an inspektera audit reveals a poor dimension grade with a clearly measurable improvement path (test coverage, complexity score, dependency count), the finding can become an optimization objective. ⛶ inspektera may suggest ⎘ optimera when the metric and direction are clear.
412
+
413
+ ---
414
+
415
+ ## Getting started
416
+
417
+ ### First optimization
418
+
419
+ 1. `/agentera profile`: generate or refresh the decision profile (skip if recent)
420
+ 2. `/agentera optimize`: the first run detects no objective.yaml, runs a brainstorm with you to define the objective and write the eval harness, then proceeds to experiment 1
421
+ 3. Host loop + `/agentera optimize`: set up continuous optimization where supported
422
+
423
+ ### Resuming optimization
424
+
425
+ 1. `/agentera optimize`: if objective.yaml and the eval harness exist, starts experimenting immediately. Reads experiments.yaml to understand what's been tried.
426
+
427
+ ### Changing the target
428
+
429
+ Edit objective.yaml directly to adjust the target value or constraints, or tell optimera to "refine the objective" for a guided session. If the measurement approach needs to change, the eval harness must be rebuilt and re-approved.
430
+
431
+ ### Optimera is fed by ≡ planera
432
+
433
+ When a plan includes optimization-shaped tasks (improving a measurable metric), planera can delegate those tasks to optimera. The plan's acceptance criteria inform the optimization objective.
434
+
435
+ ### Drawing in external techniques
436
+
437
+ Run `/agentera research <url>` with a relevant article, repo, or resource. The analysis surfaces optimization techniques applicable to the objective, and the next experiment picks them up from the inspiration analysis.
@@ -0,0 +1,69 @@
1
+ ARTIFACTS:
2
+ 1:
3
+ id: A1
4
+ artifact_id: objective
5
+ local_role: produces_and_consumes
6
+ description: >-
7
+ Optimization target definition: metric, current value, target, constraints,
8
+ scope, and measurement approach. Optimera writes this during brainstorm
9
+ and reads it every cycle. Written to only during brainstorm, refine, or
10
+ canonical closure.
11
+ 2:
12
+ id: A2
13
+ artifact_id: optimera_harness
14
+ local_role: produces
15
+ description: >-
16
+ Eval script that measures the metric and outputs structured JSON. Written
17
+ during brainstorm, approved by user, then locked.
18
+ Never modified during optimization cycles.
19
+ 3:
20
+ id: A3
21
+ artifact_id: experiments
22
+ local_role: produces_and_consumes
23
+ description: >-
24
+ Log of every experiment: hypothesis, method, change, metric before/after,
25
+ regression result, status (kept/discarded/error), conclusion, and next
26
+ suggestion. Optimera appends to this each cycle.
27
+ 4:
28
+ id: A4
29
+ artifact_id: progress
30
+ local_role: consumes
31
+ description: >-
32
+ Optimera inspects the last 3 entries to detect 3 consecutive failures
33
+ before reporting exit status.
34
+ 5:
35
+ id: A5
36
+ artifact_id: todo
37
+ local_role: produces
38
+ description: >-
39
+ Optimera writes to this when logging a failure pattern after 3
40
+ consecutive failures.
41
+ 6:
42
+ id: A6
43
+ artifact_id: decisions
44
+ local_role: consumes
45
+ description: >-
46
+ Optimera reads this during Orient for context on prior deliberations
47
+ about the optimization objective.
48
+ 7:
49
+ id: A7
50
+ artifact_id: profile
51
+ local_role: consumes
52
+ description: >-
53
+ Optimera reads this to modulate experiment aggressiveness and constraint
54
+ tolerance based on user preferences.
55
+ 8:
56
+ id: A8
57
+ artifact_id: docs
58
+ local_role: consumes
59
+ description: >-
60
+ Optimera reads this first to resolve project-local artifact mappings
61
+ before accessing other artifacts.
62
+ 9:
63
+ id: A9
64
+ artifact_id: benchmark_context
65
+ local_role: consumes
66
+ description: >-
67
+ CLI-provided startup benchmark summary from `agentera prime --context
68
+ optimera --format json`. Optimera consumes this before any direct
69
+ latest-report.json, latest-report.md, or runs.jsonl diagnostic read.
@@ -0,0 +1,35 @@
1
+ EXIT_CONDITIONS:
2
+ 1:
3
+ id: E1
4
+ condition: complete
5
+ description: >-
6
+ One experiment completed the full cycle: hypothesis formulated,
7
+ implementation dispatched, regression check passed, metric measured,
8
+ decision made (kept or discarded), and experiments.yaml updated.
9
+ exit_signal: complete
10
+ 2:
11
+ id: E2
12
+ condition: flagged
13
+ description: >-
14
+ The experiment cycle completed but with issues worth noting: the metric
15
+ did not improve after multiple attempts, a plateau was detected, or the
16
+ experiment had to be discarded due to a regression. Each concern is
17
+ listed explicitly.
18
+ exit_signal: flagged
19
+ 3:
20
+ id: E3
21
+ condition: stuck
22
+ description: >-
23
+ Cannot proceed because objective.yaml is missing and the brainstorm cannot
24
+ be completed without user input, the eval harness is broken and cannot
25
+ be repaired without user approval, or the regression check infrastructure
26
+ is unavailable.
27
+ exit_signal: stuck
28
+ 4:
29
+ id: E4
30
+ condition: waiting
31
+ description: >-
32
+ The optimization objective is too vague to experiment against, the metric
33
+ cannot be measured by any available tooling, or the scope is undefined
34
+ and cannot be safely inferred.
35
+ exit_signal: waiting
@@ -0,0 +1,39 @@
1
+ TRIGGERS:
2
+ 1:
3
+ id: T1
4
+ description: >-
5
+ Direct invocation by name or slash command. Matches when the user
6
+ explicitly requests optimera.
7
+ priority: high
8
+ patterns:
9
+ - "optimera"
10
+ - "/optimera"
11
+ 2:
12
+ id: T2
13
+ description: >-
14
+ Metric-driven optimization requests. Matches when the user wants to
15
+ improve a concrete, quantifiable property of their codebase.
16
+ priority: medium
17
+ patterns:
18
+ - "optimize"
19
+ - "improve performance"
20
+ - "reduce latency"
21
+ - "increase test coverage"
22
+ - "lower bundle size"
23
+ - "speed up"
24
+ - "make faster"
25
+ - "make smaller"
26
+ - "get the score up"
27
+ - "hit the target"
28
+ - "improve the metric"
29
+ 3:
30
+ id: T3
31
+ description: >-
32
+ Iterative experimentation requests. Matches when the user wants to
33
+ run experiments, benchmark, or iterate on a measurable objective.
34
+ priority: medium
35
+ patterns:
36
+ - "benchmark and iterate"
37
+ - "run experiments"
38
+ - "tune"
39
+ - "experiment until"
@@ -0,0 +1,91 @@
1
+ VALIDATION:
2
+ 1:
3
+ id: V1
4
+ rule: harness_locked_during_cycle
5
+ description: >-
6
+ The eval harness MUST NOT be modified during an optimization cycle.
7
+ It may only be written during brainstorm (bootstrap) or user-requested
8
+ refinement. This ensures measurement consistency across experiments.
9
+ severity: critical
10
+ checks:
11
+ - "Harness file is not modified between Step 1 and Step 8"
12
+ 2:
13
+ id: V2
14
+ rule: regression_first
15
+ description: >-
16
+ The regression check MUST pass before the eval harness runs. If the
17
+ project's test/build/lint suite fails, the experiment is automatically
18
+ discarded. Metric improvement does not override regression failure.
19
+ severity: critical
20
+ checks:
21
+ - "Regression check runs before metric measurement"
22
+ - "Regression failure results in experiment discard"
23
+ 3:
24
+ id: V3
25
+ rule: objective_readonly_during_cycle
26
+ description: >-
27
+ objective.yaml MUST NOT be modified during a cycle except for canonical
28
+ closure when the target is met. Other edits only happen during
29
+ brainstorm or refine.
30
+ severity: critical
31
+ checks:
32
+ - "objective.yaml only modified for closure during cycle"
33
+ 4:
34
+ id: V4
35
+ rule: one_experiment_per_invocation
36
+ description: >-
37
+ Each optimera invocation performs exactly one experiment. Multiple
38
+ experiments require multiple invocations or /loop setup.
39
+ severity: critical
40
+ checks:
41
+ - "At most one experiment logged per invocation"
42
+ 5:
43
+ id: V5
44
+ rule: decision_gate_both_conditions
45
+ description: >-
46
+ An experiment is kept only when BOTH conditions are true:
47
+ (1) regression check passed, and (2) metric improved in the declared
48
+ direction. If either fails, the experiment is discarded.
49
+ severity: critical
50
+ checks:
51
+ - "Kept experiments have both regression pass and metric improvement"
52
+ 6:
53
+ id: V6
54
+ rule: exit_marker_required
55
+ description: >-
56
+ Every optimera invocation MUST emit an exit marker. The marker uses
57
+ the canonical glyph ⎘ (SG7) in the format ⎘ optimera · <status>
58
+ where status is one of EX1-EX4.
59
+ severity: critical
60
+ checks:
61
+ - "Exit marker present after experiment"
62
+ - "Exit marker uses glyph ⎘ (SG7)"
63
+ - "Exit marker status is one of complete, flagged, stuck, waiting"
64
+ 7:
65
+ id: V7
66
+ rule: benchmark_context_first
67
+ description: >-
68
+ For benchmark-oriented startup, Optimera MUST start from `agentera prime
69
+ --context optimera --format json` and consume complete
70
+ benchmark_context before direct retained benchmark file reads. Direct
71
+ latest-report.json, latest-report.md, or runs.jsonl reads are last-resort
72
+ diagnostics only when the CLI context is incomplete.
73
+ severity: critical
74
+ checks:
75
+ - "Optimera startup uses benchmark_context before direct retained benchmark files"
76
+ - "Incomplete benchmark_context follows listed CLI fallback and manual refresh guidance"
77
+ - "Direct benchmark file reads are treated as last-resort diagnostics"
78
+ 8:
79
+ id: V8
80
+ rule: benchmark_caveat_preservation
81
+ description: >-
82
+ Benchmark caveats from benchmark_context MUST be preserved when Optimera
83
+ reports measurement evidence. Agents must not hide, upgrade, reconstruct,
84
+ or infer away caveats for manual-only execution, missing local history,
85
+ runtime coverage degradation, missing token estimates, non-comparable
86
+ previous rows, or privacy boundaries.
87
+ severity: critical
88
+ checks:
89
+ - "Manual-only benchmark execution caveats are reported"
90
+ - "Missing or malformed retained evidence remains incomplete"
91
+ - "Runtime coverage, token-impact, comparison, and privacy caveats are preserved"