helixevo 0.6.1 → 0.8.0

package/CHANGELOG.md CHANGED
@@ -4,6 +4,32 @@ All notable changes to HelixEvo are documented here.
 
  ## [Unreleased]
 
+ ## [0.8.0] - 2026-03-25
+
+ ### Added
+ - Persisted `topology-optimize-status.json` so the dashboard and CLI can distinguish full optimize refresh from partial/degraded conflict enrichment
+ - Persisted `llm-runtime-state.json` so HelixEvo can track default provider, per-provider health, last execution, and explicit fallback truth across Claude Code, Codex, and Ollama
+ - New provider-control layer that keeps Claude Code as default while adding optional Codex and Ollama support for shared prompt-in / text-out operations
+
+ ### Changed
+ - `graph --optimize` now refreshes the topology review queue first and reports partial-vs-full enrichment truthfully instead of hiding useful structural backlog behind brittle enrichment failures
+ - Dashboard run actions now prefer the local built CLI when available, improving live control coherence during local execution
+ - Overview, Topology, and Proof now provide stronger next-step guidance around degraded optimize runs and measuring/regressed proof states
+ - `status`, Overview, Commands, Guide, and README now expose provider-control truth, including Claude default state, optional Codex/Ollama support, and explicit fallback/degraded behavior
+ - Claude-backed web search and research remain explicitly Claude-scoped rather than pretending provider symmetry where it does not exist yet
+
+ ## [0.7.0] - 2026-03-25
+
+ ### Added
+ - New `helixevo proof` command for bounded outcome attribution and explicit proof review across interventions, transfer, topology execution, semantic adoption, and legacy evolution impact
+ - New dashboard `Proof` route and `/api/proof` operator control surface for first-class prove-stage review
+ - New `~/.helix/proof-reviews.jsonl` ledger for verify / defer / contest decisions on derived proof records
+
+ ### Changed
+ - The dashboard operator loop now routes the Prove stage to `/proof` instead of only the Guide metrics anchor
+ - Overview, Co-Evolution, Ontology, Topology, Guide, Commands, and README now surface the new proof layer and the broader prove-stage framing
+ - `metrics`, `status`, and `report` now point operators toward `helixevo proof --status` for broader post-action review
+
  ## [0.6.1] - 2026-03-24
 
  ### Added
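The 0.8.0 entries describe a default provider, per-provider health, and explicit fallback, but the diff does not show the `llm-runtime-state.json` schema or the selection logic. The following is only a minimal sketch of how such a rule could look; the `ProviderHealth` shape, the `selectProvider` name, and the fallback order are all assumptions for illustration, not HelixEvo's actual implementation.

```typescript
// Hypothetical types: the real llm-runtime-state.json schema is not shown in this diff.
type ProviderId = 'claude' | 'codex' | 'ollama'

interface ProviderHealth {
  available: boolean
}

interface ProviderSelection {
  provider: ProviderId
  fallbackUsed: boolean // recorded explicitly so a fallback is never silent
  reason: string
}

function selectProvider(
  defaultProvider: ProviderId,
  health: Record<ProviderId, ProviderHealth>,
  claudeScoped: boolean,
): ProviderSelection | null {
  // Claude-scoped operations (web search, research) only ever run on Claude.
  if (claudeScoped) {
    return health.claude.available
      ? { provider: 'claude', fallbackUsed: false, reason: 'Claude-scoped operation' }
      : null
  }
  if (health[defaultProvider].available) {
    return { provider: defaultProvider, fallbackUsed: false, reason: 'default provider healthy' }
  }
  // Assumed fallback order; the changelog does not specify one.
  const order: ProviderId[] = ['claude', 'codex', 'ollama']
  for (const candidate of order) {
    if (candidate !== defaultProvider && health[candidate].available) {
      return { provider: candidate, fallbackUsed: true, reason: `${defaultProvider} unavailable` }
    }
  }
  return null
}
```

Returning `null` instead of quietly switching providers keeps the "fallback is explicit" rule visible to the caller, which can then report a degraded state.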
package/README.md CHANGED
@@ -1,6 +1,6 @@
  # HelixEvo
 
- Co-evolving skill and project brain for AI agents. HelixEvo captures failures, traces activations, models pressure, routes governed responses, promotes cross-project transfer, reviews structural topology changes, safely executes accepted topology transitions with rollback, and now lets approved ontology concepts become active semantic consumers inside the live control loop.
+ Co-evolving skill and project brain for AI agents. HelixEvo captures failures, traces activations, models pressure, routes governed responses, promotes cross-project transfer, reviews structural topology changes, safely executes accepted topology transitions with rollback, lets approved ontology concepts become active semantic consumers inside the live control loop, and now exposes a first-class proof layer for bounded outcome attribution across the brain loop.
 
  ## How it works
 
@@ -21,15 +21,21 @@ Every proposed change goes through:
  - **[Bun](https://bun.sh)** — used for building (`curl -fsSL https://bun.sh/install | bash`)
  - **[Claude CLI](https://docs.anthropic.com/en/docs/claude-code)** — installed and authenticated
  - Requires a **Claude Max plan** subscription
- - HelixEvo uses `claude --print` for all LLM operations (no API key needed)
+ - Claude Code remains the **default provider** for HelixEvo
  - Prefer `claude auth login` managed credentials over exporting a hardcoded `CLAUDE_CODE_OAUTH_TOKEN`
  - HelixEvo now retries once without an inherited `CLAUDE_CODE_OAUTH_TOKEN` if that override is stale but local Claude auth is valid
+ - **Optional providers**
+ - **Codex CLI** (`codex`) for GPT Codex on shared prompt-in / text-out paths
+ - **Ollama** (`ollama` + local daemon) for shared local-model prompt-in / text-out paths
+ - Claude-only web-search and research tooling remain explicitly Claude-scoped
 
  Verify prerequisites:
  ```bash
  node --version # v18+
  bun --version # any
- claude --version # any
+ claude --version # default provider
+ codex --version # optional
+ ollama --version # optional
  ```
 
  ## Install
@@ -81,6 +87,7 @@ helixevo dashboard
  |---------|-------------|
  | `helixevo watch` | Always-on learning: auto-capture + auto-evolve |
  | `helixevo metrics` | Correction rates, skill trends, evolution impact |
+ | `helixevo proof` | Outcome attribution and proof review across interventions, transfer, topology, ontology, and evolution |
  | `helixevo health` | Network health: cohesion, coverage, balance, transfer |
  | `helixevo init` | Import existing skills + generate skill tests |
  | `helixevo capture <session>` | Extract failures from a session file |
@@ -91,9 +98,9 @@ helixevo dashboard
  | `helixevo graph` | View skill network in terminal |
  | `helixevo ontology` | Refresh, review, adopt, and inspect ontology concepts plus semantic control coverage |
  | `helixevo topology` | Prepare, apply, roll back, and inspect reviewed topology execution |
- | `helixevo research` | Proactive web research for skill improvement |
+ | `helixevo research` | Proactive web research for skill improvement (Claude-scoped web-tool path) |
  | `helixevo dashboard [--port <n>]` | Open web dashboard, preferring localhost:3847 and falling forward if occupied |
- | `helixevo status` | Show system health |
+ | `helixevo status` | Show system health plus provider-control truth |
  | `helixevo report` | Generate evolution report |
 
  ### Common options
@@ -109,7 +116,7 @@ helixevo graph # TUI view (instant, cached)
  helixevo graph --mermaid # Open in browser as Mermaid diagram
  helixevo graph --obsidian ~/vault # Sync to Obsidian vault
  helixevo graph --rebuild # Re-infer relationships (LLM call)
- helixevo graph --optimize # Detect structural candidates + refresh topology review queue
+ helixevo graph --optimize # Refresh topology review queue first, then report full vs partial conflict enrichment
  helixevo ontology --status # Show ontology kernel / frontier / extension / adoption state
  helixevo ontology --status --verbose
  # Show top active concepts, unused extensions, and deprecation-sensitive concepts
@@ -120,6 +127,9 @@ helixevo topology --status # Show reviewed topology execution state
  helixevo topology --prepare <id> # Prepare an accepted topology candidate
  helixevo topology --apply <id> # Apply a safe prepared topology plan
  helixevo topology --rollback <id> # Roll back an applied topology plan
+ helixevo proof --status # Review proof state across the live loop
+ helixevo proof --review <id> --decision verify
+ # Verify a proof record after operator review
  ```
 
  ### Research options
@@ -144,13 +154,16 @@ All data is stored in `~/.helix/`:
  ├── pressure-interventions.jsonl # Routed intervention ledger across response lanes
  ├── transfer-events.jsonl # Promotion / transfer evidence across motifs and projects
  ├── governance-state.json # Operator steering for active governance mode
+ ├── llm-runtime-state.json # Default provider, per-provider health, last execution, and fallback truth
  ├── topology-review-candidates.json # Persisted structural review queue
  ├── topology-review-decisions.jsonl # Operator accept/reject/defer decision ledger
+ ├── topology-optimize-status.json # Last full/partial optimize refresh status + queue/enrichment summary
  ├── topology-overrides.json # Applied safe structural topology overrides
  ├── topology-snapshots.json # Snapshot refs for reviewed execution and rollback
  ├── topology-apply-plans.json # Prepared reviewed topology plans
  ├── topology-executions.jsonl # Prepared/applied/rolled-back execution ledger
  ├── topology-artifacts.jsonl # Evidence artifacts for reviewed structural execution
+ ├── proof-reviews.jsonl # Operator verify/defer/contest ledger for derived proof records
  ├── evolution-artifacts.jsonl # Evolution + ontology-review evidence artifacts
  ├── ontology/
  │ ├── kernel.json # Materialized ontology kernel snapshot
@@ -184,11 +197,12 @@ helixevo dashboard --port 3900
  ```
 
  **Tabs:**
- - **Overview** — Premium control cockpit with frontier signals, brain foundation, semantic backbone, ontology adoption visibility, pressure counts, topology review visibility, and prepared/applied structural state
+ - **Overview** — Premium control cockpit with frontier signals, brain foundation, provider-control truth, semantic backbone, ontology adoption visibility, proof review visibility, pressure counts, topology review visibility, and prepared/applied structural state
  - **Skill Network** — Interactive graph, premium inspector, co-evolution routing signals, and topology review/execution handoff links
  - **Co-Evolution** — Operator cockpit for routed pressure response, governance mode visibility, promotion queues, transfer evidence, semantic route influence, and topology handoff
  - **Ontology** — Semantic control surface for kernel visibility, frontier concept review, approved ontology extensions, adoption coverage, deprecation risk, and native ontology change events
  - **Topology** — Governance steering plus a persistent operator pipeline for review → prepare → apply → rollback across merge / split / promote / rewire / consolidate candidates
+ - **Proof** — Outcome-attribution and proof-review cockpit for bounded effectiveness review across interventions, transfer, topology execution, semantic adoption, and evolution impact
  - **Projects** — Project intake studio, live project analysis, gap routing, per-project pressure hotspots, and promotion feeders
  - **Evolution** — Timeline of evolution runs with judge scores, artifact provenance, and activation-aware context
  - **Research** — Knowledge buffer plus a live “why research now” handoff from current pressure, governed routing, and recurring gaps
@@ -235,6 +249,7 @@ Failures → Cluster → Propose → Replay → Multi-Judge → Regression → C
  - **Governance steering** lets the operator pin or release the active adaptation mode rather than relying only on derived routing.
  - **Topology review** persists merge / split / promote / rewire / consolidate candidates so manual review is a real workflow.
  - **Reviewed topology execution** turns accepted safe candidates into prepared plans, snapshot-backed applies, and rollbackable structural transitions.
+ - **Proof control** turns bounded outcome attribution into an explicit operator layer where interventions, transfer, topology execution, semantic adoption, and evolution impact can be verified, deferred, or contested.
  - **Evolution artifacts** preserve proposal-level evidence so the dashboard can show what changed, why, and with what provenance.
 
  **Three-layer hierarchy:**
@@ -0,0 +1,71 @@
+ import { NextResponse } from 'next/server'
+ import { spawn } from 'child_process'
+ import { existsSync } from 'fs'
+ import { join } from 'path'
+ import { loadProofDashboardSummary } from '@/lib/proof'
+ import type { ProofReviewDecisionStatus } from '@/lib/proof'
+
+ export const dynamic = 'force-dynamic'
+
+ function resolveProofRunner(): { cmd: string; argsPrefix: string[] } {
+   const candidates = [
+     join(process.cwd(), '..', 'dist', 'cli.js'),
+     join(process.cwd(), 'dist', 'cli.js'),
+   ]
+
+   for (const candidate of candidates) {
+     if (existsSync(candidate)) return { cmd: process.execPath, argsPrefix: [candidate] }
+   }
+
+   return { cmd: 'helixevo', argsPrefix: [] }
+ }
+
+ function runProofCommand(args: string[]): Promise<{ success: boolean; output: string }> {
+   return new Promise((resolve) => {
+     const runner = resolveProofRunner()
+     const child = spawn(runner.cmd, [...runner.argsPrefix, 'proof', ...args], {
+       env: { ...process.env },
+       stdio: ['ignore', 'pipe', 'pipe'],
+     })
+
+     let output = ''
+     child.stdout?.on('data', (chunk: Buffer) => { output += chunk.toString() })
+     child.stderr?.on('data', (chunk: Buffer) => { output += chunk.toString() })
+     child.on('close', (code) => resolve({ success: code === 0, output }))
+     child.on('error', (err) => resolve({ success: false, output: `Error: ${err.message}` }))
+   })
+ }
+
+ export async function GET() {
+   return NextResponse.json(loadProofDashboardSummary())
+ }
+
+ export async function POST(request: Request) {
+   const body = await request.json() as {
+     action?: 'review'
+     recordId?: string
+     decision?: ProofReviewDecisionStatus
+     rationale?: string
+   }
+
+   if (body.action !== 'review') {
+     return NextResponse.json({ success: false, error: 'action must be review' }, { status: 400 })
+   }
+   if (!body.recordId || !body.decision) {
+     return NextResponse.json({ success: false, error: 'recordId and decision are required' }, { status: 400 })
+   }
+
+   const args = ['--review', body.recordId, '--decision', body.decision]
+   if (body.rationale?.trim()) args.push('--rationale', body.rationale.trim())
+
+   const result = await runProofCommand(args)
+   if (!result.success) {
+     return NextResponse.json({ success: false, error: result.output || 'Proof command failed' }, { status: 500 })
+   }
+
+   return NextResponse.json({
+     success: true,
+     output: result.output,
+     dashboard: loadProofDashboardSummary(),
+   })
+ }
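The `POST` handler above casts the request body to a typed shape but, as far as this hunk shows, does not check at runtime that `decision` is one of the documented values before passing it to the CLI. A hedged sketch of the same argument-building steps as a pure function, with that one extra check added, might look like this. `buildProofReviewArgs`, `ProofReviewBody`, and `ProofArgsResult` are hypothetical names, and the extra decision check is an assumption about what a reviewer might want, not part of the released code.

```typescript
// Hypothetical helper mirroring the validation order of the POST handler in this hunk.
interface ProofReviewBody {
  action?: string
  recordId?: string
  decision?: string
  rationale?: string
}

interface ProofArgsResult {
  args: string[] | null
  error: string | null
}

// Values taken from the documented flag: --decision <verify|defer|contest>
const PROOF_DECISIONS = ['verify', 'defer', 'contest']

function buildProofReviewArgs(body: ProofReviewBody): ProofArgsResult {
  if (body.action !== 'review') {
    return { args: null, error: 'action must be review' }
  }
  if (!body.recordId || !body.decision) {
    return { args: null, error: 'recordId and decision are required' }
  }
  // Extra runtime check (not in the route): reject decisions outside the documented set.
  if (PROOF_DECISIONS.indexOf(body.decision) === -1) {
    return { args: null, error: 'decision must be verify, defer, or contest' }
  }
  const args = ['--review', body.recordId, '--decision', body.decision]
  const rationale = body.rationale ? body.rationale.trim() : ''
  if (rationale) args.push('--rationale', rationale)
  return { args, error: null }
}
```

Keeping this step free of `spawn` and `NextResponse` makes it straightforward to unit test separately from the route.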
@@ -1,5 +1,7 @@
  import { NextResponse } from 'next/server'
  import { spawn, type ChildProcess } from 'child_process'
+ import { existsSync } from 'fs'
+ import { join } from 'path'
 
  export const dynamic = 'force-dynamic'
 
@@ -7,6 +9,7 @@ const ALLOWED_COMMANDS: Record<string, { cmd: string; args: string[]; timeout: n
  'status': { cmd: 'helixevo', args: ['status'], timeout: 15000 },
  'health': { cmd: 'helixevo', args: ['health', '--verbose'], timeout: 120000 },
  'metrics': { cmd: 'helixevo', args: ['metrics', '--verbose'], timeout: 15000 },
+ 'proof': { cmd: 'helixevo', args: ['proof', '--verbose'], timeout: 20000 },
  'evolve': { cmd: 'helixevo', args: ['evolve', '--verbose'], timeout: 300000 },
  'evolve-dry': { cmd: 'helixevo', args: ['evolve', '--dry-run', '--verbose'], timeout: 300000 },
  'generalize': { cmd: 'helixevo', args: ['generalize', '--verbose'], timeout: 300000 },
@@ -45,6 +48,21 @@ function buildCommandEntry(body: { command: string; project?: string; path?: str
  return null
  }
 
+ function resolveRunRunner(): { cmd: string; argsPrefix: string[] } {
+   const candidates = [
+     join(process.cwd(), '..', 'dist', 'cli.js'),
+     join(process.cwd(), 'dist', 'cli.js'),
+   ]
+
+   for (const candidate of candidates) {
+     if (existsSync(candidate)) {
+       return { cmd: process.execPath, argsPrefix: [candidate] }
+     }
+   }
+
+   return { cmd: 'helixevo', argsPrefix: [] }
+ }
+
  let activeProcess: ChildProcess | null = null
  let activeCommand: string | null = null
 
@@ -70,7 +88,8 @@ export async function POST(request: Request) {
 
  const stream = new ReadableStream({
  start(controller) {
- const child = spawn(entry.cmd, entry.args, {
+ const runner = resolveRunRunner()
+ const child = spawn(runner.cmd, [...runner.argsPrefix, ...entry.args], {
  env: { ...process.env },
  stdio: ['ignore', 'pipe', 'pipe'],
  })
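`resolveRunRunner` here and `resolveProofRunner` in the proof route carry the same candidate lookup, and after this change the `spawn` call uses `runner.cmd` rather than `entry.cmd`. One way the duplicated lookup could be shared is sketched below. `resolveRunner` and its parameters are hypothetical; the file-existence check and the Node executable path are injected so the function stays pure, where the routes themselves use `existsSync` and `process.execPath` directly.

```typescript
// Hypothetical shared helper; not part of the released package.
interface Runner {
  cmd: string
  argsPrefix: string[]
}

function resolveRunner(
  candidates: string[],
  exists: (path: string) => boolean,
  nodePath: string,
): Runner {
  // Prefer the first locally built CLI that exists, run through the given Node binary.
  for (const candidate of candidates) {
    if (exists(candidate)) return { cmd: nodePath, argsPrefix: [candidate] }
  }
  // Otherwise fall back to whatever `helixevo` resolves to on PATH.
  return { cmd: 'helixevo', argsPrefix: [] }
}
```

In a route this would be called with the two `dist/cli.js` candidates, `existsSync`, and `process.execPath`, which reproduces the behavior shown in both hunks.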
@@ -9,6 +9,7 @@ import { SectionFrame } from '@/components/section-frame'
  import { OperatorLoopTrail } from '@/components/operator-loop-trail'
  import { SurfaceJumpLinks } from '@/components/surface-jump-links'
  import { NextStepEmptyState } from '@/components/next-step-empty-state'
+ import type { ProofDashboardSummary } from '@/lib/proof'
 
  type RunState = 'idle' | 'running' | 'success' | 'error' | 'stopped'
  type CommandName = 'evolve' | 'research' | 'generalize'
@@ -127,6 +128,7 @@ interface Props {
  }
  }
  }
+ proof: ProofDashboardSummary
  }
 
  function consoleTone(state: RunState): 'neutral' | 'green' | 'red' | 'yellow' {
@@ -169,7 +171,7 @@ function formatMode(mode: Summary['governance']['activeMode']) {
  return mode.split('-').join(' ')
  }
 
- export default function CoEvolutionClient({ summary, ontology }: Props) {
+ export default function CoEvolutionClient({ summary, ontology, proof }: Props) {
  const [runState, setRunState] = useState<RunState>('idle')
  const [activeCommand, setActiveCommand] = useState<CommandName | null>(null)
  const [output, setOutput] = useState('')
@@ -272,6 +274,7 @@ export default function CoEvolutionClient({ summary, ontology }: Props) {
  { label: `${summary.topologyReviews.open} topology reviews`, tone: summary.topologyReviews.open > 0 ? 'yellow' : 'green' },
  { label: `${ontology.ontologyLoop.frontier} ontology frontier`, tone: ontology.ontologyLoop.reviewOpen > 0 ? 'blue' : 'neutral' },
  { label: `${ontology.ontologyLoop.adoption.activeConcepts} active concepts`, tone: ontology.ontologyLoop.adoption.activeConcepts > 0 ? 'green' : 'neutral' },
+ { label: `${proof.summary.reviewOpen} proof review`, tone: proof.summary.reviewOpen > 0 ? 'yellow' : proof.summary.effective > 0 ? 'green' : 'neutral' },
  { label: `${summary.recentTransfers.length} recent transfers`, tone: summary.recentTransfers.length > 0 ? 'green' : 'neutral' },
  { label: formatMode(summary.governance.activeMode), tone: toneForMode(summary.governance.activeMode) },
  ]}
@@ -284,6 +287,7 @@ export default function CoEvolutionClient({ summary, ontology }: Props) {
  <div style={{ marginTop: 8, display: 'flex', gap: 6, flexWrap: 'wrap' }}>
  <span className="badge badge-gray">source: {summary.governance.source}</span>
  <span className="badge badge-gray">review threshold {(summary.governance.profile.reviewThreshold * 100).toFixed(0)}%</span>
+ <Link href="/proof" className="badge badge-gray" style={{ textDecoration: 'none' }}>open proof</Link>
  </div>
  </div>
  <div style={{ display: 'grid', gap: 10 }}>
@@ -307,6 +311,7 @@ export default function CoEvolutionClient({ summary, ontology }: Props) {
  <MetricCard label="Prepared topology" value={summary.topologyExecution.prepared} sublabel={`${summary.topologyExecution.applied} applied • ${summary.topologyExecution.rolledBack} rolled back`} tone={summary.topologyExecution.prepared > 0 ? 'blue' : summary.topologyExecution.applied > 0 ? 'green' : 'neutral'} icon="↑" />
  <MetricCard label="Active semantics" value={ontology.ontologyLoop.adoption.activeConcepts} sublabel={`${ontology.ontologyLoop.adoption.totalBindings} bindings • ${ontology.ontologyLoop.adoption.routesInfluenced} influenced routes`} tone={ontology.ontologyLoop.adoption.activeConcepts > 0 ? 'green' : 'neutral'} icon="◎" />
  <MetricCard label="Recorded interventions" value={summary.pressureInterventions.total} sublabel={`${summary.pressureInterventions.completed} completed • ${summary.pressureInterventions.dryRun} dry-run`} tone="blue" icon="↺" />
+ <MetricCard label="Proof review" value={proof.summary.reviewOpen} sublabel={`${proof.summary.effective} effective • ${proof.summary.regressed} regressed`} tone={proof.summary.reviewOpen > 0 ? 'yellow' : proof.summary.effective > 0 ? 'green' : 'neutral'} icon="◇" />
  <MetricCard label="Realized transfers" value={summary.recentTransfers.filter((event) => event.status === 'realized').length} sublabel={`${summary.pressureMotifs.addressed} motifs now addressed`} tone="green" icon="↑" />
  </div>
 
@@ -1,4 +1,5 @@
  import { loadCoEvolutionSummary, loadOntologySummary } from '@/lib/data'
+ import { loadProofDashboardSummary } from '@/lib/proof'
  import CoEvolutionClient from './client'
 
  export const dynamic = 'force-dynamic'
@@ -6,5 +7,6 @@ export const dynamic = 'force-dynamic'
  export default function CoEvolutionPage() {
  const summary = loadCoEvolutionSummary()
  const ontology = loadOntologySummary()
- return <CoEvolutionClient summary={summary} ontology={ontology} />
+ const proof = loadProofDashboardSummary()
+ return <CoEvolutionClient summary={summary} ontology={ontology} proof={proof} />
  }
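The Co-Evolution client hunks above repeat the same nested ternary for the proof chip and the proof `MetricCard`: open reviews map to yellow, otherwise any effective record maps to green, otherwise neutral. As a sketch only, that rule could be lifted into a small helper. `proofTone` and `ProofCounts` are hypothetical names, and only the two fields the ternary reads are modeled here.

```typescript
// Hypothetical helper capturing the tone rule used twice in the client hunks.
type ProofTone = 'neutral' | 'green' | 'yellow'

interface ProofCounts {
  reviewOpen: number
  effective: number
}

function proofTone(counts: ProofCounts): ProofTone {
  // Open reviews take priority: the operator still has a decision to make.
  if (counts.reviewOpen > 0) return 'yellow'
  // With nothing open, any effective record marks the surface as healthy.
  if (counts.effective > 0) return 'green'
  return 'neutral'
}
```

Both call sites could then pass `proofTone(proof.summary)`, assuming `proof.summary` exposes at least these two numeric fields as the hunks suggest.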
@@ -111,7 +111,7 @@ const COMMANDS: CommandInfo[] = [
  },
  {
  name: 'graph',
- description: 'Visualize and manage the skill network graph. Shows relationships between skills (depends, enhances, conflicts, co-evolves), and graph optimize now persists a topology review queue for merge, split, promote, and rewire candidates.',
+ description: 'Visualize and manage the skill network graph. Shows relationships between skills (depends, enhances, conflicts, co-evolves), and graph optimize now refreshes a truthful topology review queue first, then reports whether conflict enrichment completed fully or only partially.',
  usage: 'helixevo graph [options]',
  examples: [
  { cmd: 'helixevo graph', desc: 'Show skill network in terminal (instant)' },
@@ -130,6 +130,7 @@ const COMMANDS: CommandInfo[] = [
  category: 'network',
  needsLLM: true,
  runnable: { command: 'graph-rebuild', label: 'Rebuild Graph', icon: 'M13.828 10.172a4 4 0 00-5.656 0l-4 4a4 4 0 105.656 5.656l1.102-1.101m-.758-4.899a4 4 0 005.656 0l4-4a4 4 0 00-5.656-5.656l-1.1 1.1', color: 'var(--purple)' },
+ note: 'graph --optimize now distinguishes queue refresh from conflict enrichment. In degraded mode it can still surface a real review queue while clearly marking enrichment as partial rather than silently pretending full optimize succeeded.',
  },
  {
  name: 'ontology',
@@ -178,7 +179,7 @@ const COMMANDS: CommandInfo[] = [
  },
  {
  name: 'research',
- description: 'Proactive skill discovery via web research. Identifies gaps in your skill network, generates hypotheses, searches the web for solutions, and creates draft skills from discoveries.',
+ description: 'Proactive skill discovery via web research. Identifies gaps in your skill network, generates hypotheses, searches the web for solutions, and creates draft skills from discoveries. This lane remains explicitly Claude-scoped because it depends on Claude tool-enabled web search rather than provider-neutral prompting.',
  usage: 'helixevo research [options]',
  examples: [
  { cmd: 'helixevo research', desc: 'Run proactive research' },
@@ -212,7 +213,7 @@ const COMMANDS: CommandInfo[] = [
  },
  {
  name: 'metrics',
- description: 'Show correction rates, skill improvement trends, and evolution impact over time. Helps you understand how your skills are improving and where attention is needed.',
+ description: 'Show correction rates, skill improvement trends, and legacy evolution impact over time. Metrics remains the quantitative prove surface, while the newer proof layer now expands outcome review across interventions, topology, transfer, and semantic adoption.',
  usage: 'helixevo metrics [options]',
  examples: [
  { cmd: 'helixevo metrics', desc: 'Show summary metrics' },
@@ -225,9 +226,30 @@ const COMMANDS: CommandInfo[] = [
  needsLLM: false,
  runnable: { command: 'metrics', label: 'Show Metrics', icon: 'M9 19v-6a2 2 0 00-2-2H5a2 2 0 00-2 2v6a2 2 0 002 2h2a2 2 0 002-2zm0 0V9a2 2 0 012-2h2a2 2 0 012 2v10m-6 0a2 2 0 002 2h2a2 2 0 002-2m0 0V5a2 2 0 012-2h2a2 2 0 012 2v14a2 2 0 01-2 2h-2a2 2 0 01-2-2z', color: 'var(--text-secondary)' },
  },
+ {
+ name: 'proof',
+ description: 'Review bounded outcome attribution across interventions, transfer, topology execution, semantic adoption, and legacy evolution impact. Proof is where the newer brain loop becomes operator-reviewable instead of relying only on passive heuristics.',
+ usage: 'helixevo proof [options]',
+ examples: [
+ { cmd: 'helixevo proof --status', desc: 'Show proof summary plus the current open review queue' },
+ { cmd: 'helixevo proof --status --verbose', desc: 'Show detailed proof records, reasons, and next actions' },
+ { cmd: 'helixevo proof --review <recordId> --decision verify', desc: 'Verify a derived proof record after operator review' },
+ ],
+ options: [
+ { flag: '--status', desc: 'Show proof summary and open review state' },
+ { flag: '--review <recordId>', desc: 'Review a derived proof record' },
+ { flag: '--decision <verify|defer|contest>', desc: 'Decision for --review' },
+ { flag: '--rationale <text>', desc: 'Optional rationale for the proof review decision' },
+ { flag: '--verbose', desc: 'Show detailed proof records and derived reasons' },
+ ],
+ category: 'analysis',
+ needsLLM: false,
+ runnable: { command: 'proof', label: 'Open Proof State', icon: 'M9 17v-2m3 2v-4m3 4v-6m2 10H7a2 2 0 01-2-2V5a2 2 0 012-2h5.586a1 1 0 01.707.293l5.414 5.414a1 1 0 01.293.707V19a2 2 0 01-2 2z', color: 'var(--blue)' },
+ note: 'Proof stays bounded and reviewable. Measuring means live but not yet proven, regressed means explicit negative evidence, and verified strengthens review trust without pretending stronger causal certainty than the evidence supports.',
+ },
  {
  name: 'status',
- description: 'Quick overview of system state: total skills, frontier size, failure count, skill tests, and network health. Like a health check but without LLM analysis.',
+ description: 'Quick overview of system state: total skills, frontier size, failure count, skill tests, provider control health, and the last recorded provider execution. Like a health check but without deep model analysis.',
  usage: 'helixevo status',
  examples: [
  { cmd: 'helixevo status', desc: 'Show system status' },
@@ -313,7 +335,7 @@ const WORKFLOW = [
  { label: 'evolve', desc: 'Improve skills', tone: 'green' as const },
  { label: 'generalize', desc: 'Abstract patterns', tone: 'purple' as const },
  { label: 'graph --rebuild', desc: 'Map relationships', tone: 'yellow' as const },
- { label: 'health', desc: 'Assess quality', tone: 'blue' as const },
+ { label: 'proof --status', desc: 'Review outcomes', tone: 'blue' as const },
  ]
 
  const WORKFLOW_RECIPES = [
@@ -341,6 +363,12 @@ const WORKFLOW_RECIPES = [
  summary: 'Refresh structural review candidates, prepare accepted safe plans, apply them, and keep rollback available.',
  steps: ['helixevo graph --optimize', 'helixevo topology --prepare <candidateId>', 'helixevo topology --apply <planId>'],
  },
+ {
+ title: 'Proof review loop',
+ tone: 'blue' as const,
+ summary: 'Inspect outcome attribution across the live loop, then verify, defer, or contest proof records explicitly.',
+ steps: ['helixevo proof --status', 'helixevo proof --status --verbose', 'helixevo proof --review <recordId> --decision verify'],
+ },
  ]
 
  const CATEGORIES: Array<{
@@ -384,13 +412,14 @@ export default function CommandsPage() {
  { label: `${COMMANDS.length} commands`, tone: 'blue' },
  { label: `${stats.llmCommands} need LLM access`, tone: 'purple' },
  { label: `${stats.runnableCommands} runnable here`, tone: 'green' },
+ { label: 'Claude default • Codex + Ollama optional', tone: 'blue' },
  { label: `${stats.optionFlags} documented flags`, tone: 'yellow' },
  ]}
  actions={
  <div className="hero-note-card">
  <div className="hero-note-label">Recommended operating loop</div>
- <div className="hero-note-title">Project Setup → Watch → Co-Evolution → Ontology → Topology</div>
- <div className="hero-note-copy">Use the Commands page as the practical bridge between HelixEvo’s CLI, semantic control loop, and the premium dashboard cockpit.</div>
+ <div className="hero-note-title">Project Setup → Watch → Co-Evolution → Ontology → Topology → Proof</div>
+ <div className="hero-note-copy">Use the Commands page as the practical bridge between HelixEvo’s CLI, semantic control loop, structural control, and the new prove surface.</div>
  </div>
  }
  />
@@ -402,6 +431,28 @@ export default function CommandsPage() {
  <MetricCard label="Flags documented" value={stats.optionFlags} sublabel="Total option flags surfaced across the current command reference." tone="yellow" />
  </div>
 
+ <SectionFrame
+ eyebrow="Provider control"
+ title="Claude default, Codex and Ollama optional"
+ description="Shared prompt-in / text-out operations now run through provider control. Claude Code remains the default provider, while Codex and Ollama can be enabled for supported operations. Claude-only web search and research tooling stay explicitly Claude-scoped."
+ tone="blue"
+ >
+ <div className="grid-3" style={{ gap: 12 }}>
+ <div className="card" style={{ padding: '18px 18px 16px' }}>
+ <div style={{ fontSize: 12, fontWeight: 700, color: 'var(--blue)', marginBottom: 8 }}>Default path</div>
+ <div style={{ fontSize: 12.5, color: 'var(--text-dim)', lineHeight: 1.65 }}>Claude Code stays the default provider. Existing CLI flows keep working without forcing a provider switch.</div>
+ </div>
+ <div className="card" style={{ padding: '18px 18px 16px' }}>
+ <div style={{ fontSize: 12, fontWeight: 700, color: 'var(--purple)', marginBottom: 8 }}>Optional providers</div>
+ <div style={{ fontSize: 12.5, color: 'var(--text-dim)', lineHeight: 1.65 }}>GPT Codex and Ollama can now be enabled for shared chat / JSON / judge-style paths when you want alternate cloud or local execution.</div>
+ </div>
+ <div className="card" style={{ padding: '18px 18px 16px' }}>
+ <div style={{ fontSize: 12, fontWeight: 700, color: 'var(--yellow)', marginBottom: 8 }}>Truthfulness rule</div>
+ <div style={{ fontSize: 12.5, color: 'var(--text-dim)', lineHeight: 1.65 }}>Fallback is explicit, not silent. If a command is Claude-scoped or a fallback path was used, status and dashboard surfaces now record that truth explicitly.</div>
+ </div>
+ </div>
+ </SectionFrame>
+
  <SectionFrame
  eyebrow="Workflow framing"
  title="Typical operating sequence"
@@ -433,7 +484,7 @@ export default function CommandsPage() {
  <SectionFrame
  eyebrow="Operator recipes"
  title="Fast command loops for the live product"
- description="These compact sequences make the M9 dashboard and CLI feel like one coordinated operating surface instead of separate references."
+ description="These compact sequences make the current dashboard, CLI, and prove surface feel like one coordinated operating system instead of separate references."
  tone="blue"
  >
  <div className="grid-2" style={{ gap: 14 }}>