martin-loop 0.1.5 → 1.3.0
This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between those versions as they appear in their public registries.
- package/CODE_OF_CONDUCT.md +32 -0
- package/LICENSE +21 -21
- package/README.md +307 -398
- package/demo/seeded-workspace/README.md +35 -35
- package/demo/seeded-workspace/TASKS.md +29 -29
- package/demo/seeded-workspace/martin.config.yaml +11 -11
- package/demo/seeded-workspace/package.json +8 -8
- package/demo/seeded-workspace/src/invoice-summary.js +11 -11
- package/demo/seeded-workspace/test/invoice-summary.test.js +20 -20
- package/dist/bin/martin-loop.js +0 -0
- package/dist/vendor/adapters/counter.d.ts +1 -0
- package/dist/vendor/adapters/counter.js +4 -0
- package/dist/vendor/adapters/git-baseline.d.ts +50 -0
- package/dist/vendor/adapters/git-baseline.js +233 -0
- package/dist/vendor/adapters/openrouter-adapter.d.ts +15 -0
- package/dist/vendor/adapters/openrouter-adapter.js +302 -0
- package/dist/vendor/adapters/usage.d.ts +48 -0
- package/dist/vendor/adapters/usage.js +66 -0
- package/dist/vendor/cli/bin/exit.d.ts +12 -0
- package/dist/vendor/cli/bin/exit.js +28 -0
- package/dist/vendor/cli/commands/analyze.d.ts +5 -0
- package/dist/vendor/cli/commands/analyze.js +58 -0
- package/dist/vendor/cli/commands/audit-log-verify.d.ts +34 -0
- package/dist/vendor/cli/commands/audit-log-verify.js +99 -0
- package/dist/vendor/cli/commands/audit.d.ts +8 -0
- package/dist/vendor/cli/commands/audit.js +199 -0
- package/dist/vendor/cli/commands/corpus.d.ts +5 -0
- package/dist/vendor/cli/commands/corpus.js +60 -0
- package/dist/vendor/cli/commands/doctor.d.ts +8 -0
- package/dist/vendor/cli/commands/doctor.js +219 -0
- package/dist/vendor/cli/commands/explain.d.ts +17 -0
- package/dist/vendor/cli/commands/explain.js +176 -0
- package/dist/vendor/cli/commands/export.d.ts +5 -0
- package/dist/vendor/cli/commands/export.js +60 -0
- package/dist/vendor/cli/commands/governance.d.ts +8 -0
- package/dist/vendor/cli/commands/governance.js +95 -0
- package/dist/vendor/cli/commands/improve.d.ts +18 -0
- package/dist/vendor/cli/commands/improve.js +396 -0
- package/dist/vendor/cli/commands/init.d.ts +8 -0
- package/dist/vendor/cli/commands/init.js +281 -0
- package/dist/vendor/cli/commands/migration.d.ts +8 -0
- package/dist/vendor/cli/commands/migration.js +67 -0
- package/dist/vendor/cli/commands/prior.d.ts +23 -0
- package/dist/vendor/cli/commands/prior.js +145 -0
- package/dist/vendor/cli/commands/resume.d.ts +21 -0
- package/dist/vendor/cli/commands/resume.js +73 -0
- package/dist/vendor/cli/commands/verify.d.ts +6 -0
- package/dist/vendor/cli/commands/verify.js +43 -0
- package/dist/vendor/cli/research/public-corpus.d.ts +43 -0
- package/dist/vendor/cli/research/public-corpus.js +151 -0
- package/dist/vendor/cli/ui/error-card.d.ts +38 -0
- package/dist/vendor/cli/ui/error-card.js +103 -0
- package/dist/vendor/cli/ui/mission-brief.d.ts +41 -0
- package/dist/vendor/cli/ui/mission-brief.js +173 -0
- package/dist/vendor/cli/ui/summary-card.d.ts +34 -0
- package/dist/vendor/cli/ui/summary-card.js +102 -0
- package/dist/vendor/contracts/audit.d.ts +46 -0
- package/dist/vendor/contracts/audit.js +360 -0
- package/dist/vendor/contracts/post-phase15.d.ts +240 -0
- package/dist/vendor/contracts/post-phase15.js +166 -0
- package/dist/vendor/core/agent/mandates.d.ts +46 -0
- package/dist/vendor/core/agent/mandates.js +178 -0
- package/dist/vendor/core/agent/receipts.d.ts +38 -0
- package/dist/vendor/core/agent/receipts.js +131 -0
- package/dist/vendor/core/agent/signing.d.ts +17 -0
- package/dist/vendor/core/agent/signing.js +91 -0
- package/dist/vendor/core/attestation/sign.d.ts +25 -0
- package/dist/vendor/core/attestation/sign.js +216 -0
- package/dist/vendor/core/autonomy/autonomous-promotion.d.ts +120 -0
- package/dist/vendor/core/autonomy/autonomous-promotion.js +346 -0
- package/dist/vendor/core/autonomy/envelope-v2.d.ts +29 -0
- package/dist/vendor/core/autonomy/envelope-v2.js +60 -0
- package/dist/vendor/core/autonomy/envelope.d.ts +17 -0
- package/dist/vendor/core/autonomy/envelope.js +27 -0
- package/dist/vendor/core/autonomy/escalation-ledger.d.ts +20 -0
- package/dist/vendor/core/autonomy/escalation-ledger.js +18 -0
- package/dist/vendor/core/autonomy/resume.d.ts +15 -0
- package/dist/vendor/core/autonomy/resume.js +23 -0
- package/dist/vendor/core/circuit/circuit-breaker.d.ts +60 -0
- package/dist/vendor/core/circuit/circuit-breaker.js +143 -0
- package/dist/vendor/core/context-distillation.d.ts +3 -0
- package/dist/vendor/core/context-distillation.js +44 -0
- package/dist/vendor/core/context-flow/compile-context.d.ts +8 -0
- package/dist/vendor/core/context-flow/compile-context.js +111 -0
- package/dist/vendor/core/context-flow/entities.d.ts +2 -0
- package/dist/vendor/core/context-flow/entities.js +44 -0
- package/dist/vendor/core/context-flow/evaluate-policy.d.ts +2 -0
- package/dist/vendor/core/context-flow/evaluate-policy.js +42 -0
- package/dist/vendor/core/context-flow/index.d.ts +11 -0
- package/dist/vendor/core/context-flow/index.js +24 -0
- package/dist/vendor/core/context-flow/labels.d.ts +3 -0
- package/dist/vendor/core/context-flow/labels.js +17 -0
- package/dist/vendor/core/context-flow/normalizer.d.ts +9 -0
- package/dist/vendor/core/context-flow/normalizer.js +69 -0
- package/dist/vendor/core/context-flow/profiles.d.ts +33 -0
- package/dist/vendor/core/context-flow/profiles.js +36 -0
- package/dist/vendor/core/context-flow/redaction.d.ts +1 -0
- package/dist/vendor/core/context-flow/redaction.js +6 -0
- package/dist/vendor/core/context-flow/sensitivity.d.ts +2 -0
- package/dist/vendor/core/context-flow/sensitivity.js +27 -0
- package/dist/vendor/core/context-flow/sync-preview.d.ts +2 -0
- package/dist/vendor/core/context-flow/sync-preview.js +22 -0
- package/dist/vendor/core/context-flow/token-estimator.d.ts +3 -0
- package/dist/vendor/core/context-flow/token-estimator.js +13 -0
- package/dist/vendor/core/context-flow/types.d.ts +91 -0
- package/dist/vendor/core/context-flow/types.js +2 -0
- package/dist/vendor/core/context-utility.d.ts +47 -0
- package/dist/vendor/core/context-utility.js +405 -0
- package/dist/vendor/core/cost/pipeline.d.ts +92 -0
- package/dist/vendor/core/cost/pipeline.js +141 -0
- package/dist/vendor/core/cost/tagged-cost.d.ts +27 -0
- package/dist/vendor/core/cost/tagged-cost.js +55 -0
- package/dist/vendor/core/cost-governor.d.ts +2 -0
- package/dist/vendor/core/cost-governor.js +50 -0
- package/dist/vendor/core/cve/cve-check.d.ts +80 -0
- package/dist/vendor/core/cve/cve-check.js +172 -0
- package/dist/vendor/core/digital-twin/index.d.ts +27 -0
- package/dist/vendor/core/digital-twin/index.js +90 -0
- package/dist/vendor/core/drift/drift-graph.d.ts +47 -0
- package/dist/vendor/core/drift/drift-graph.js +100 -0
- package/dist/vendor/core/drift/objective-lock.d.ts +69 -0
- package/dist/vendor/core/drift/objective-lock.js +88 -0
- package/dist/vendor/core/drift/scope.d.ts +46 -0
- package/dist/vendor/core/drift/scope.js +102 -0
- package/dist/vendor/core/drift/signature-lock.d.ts +48 -0
- package/dist/vendor/core/drift/signature-lock.js +202 -0
- package/dist/vendor/core/drift/stale-proof-gate.d.ts +21 -0
- package/dist/vendor/core/drift/stale-proof-gate.js +19 -0
- package/dist/vendor/core/eval/known-bad-world-runner.d.ts +24 -0
- package/dist/vendor/core/eval/known-bad-world-runner.js +256 -0
- package/dist/vendor/core/evidence/claim-audit.d.ts +18 -0
- package/dist/vendor/core/evidence/claim-audit.js +89 -0
- package/dist/vendor/core/exit-intelligence.d.ts +2 -0
- package/dist/vendor/core/exit-intelligence.js +58 -0
- package/dist/vendor/core/explain/formatter.d.ts +42 -0
- package/dist/vendor/core/explain/formatter.js +171 -0
- package/dist/vendor/core/explain/timeline.d.ts +29 -0
- package/dist/vendor/core/explain/timeline.js +213 -0
- package/dist/vendor/core/failure-taxonomy.d.ts +2 -0
- package/dist/vendor/core/failure-taxonomy.js +76 -0
- package/dist/vendor/core/gateway/index.d.ts +10 -0
- package/dist/vendor/core/gateway/index.js +12 -0
- package/dist/vendor/core/gateway/registry.d.ts +40 -0
- package/dist/vendor/core/gateway/registry.js +97 -0
- package/dist/vendor/core/gateway/transport.d.ts +31 -0
- package/dist/vendor/core/gateway/transport.js +82 -0
- package/dist/vendor/core/gateway/vault.d.ts +19 -0
- package/dist/vendor/core/gateway/vault.js +29 -0
- package/dist/vendor/core/graph/adapters.d.ts +43 -0
- package/dist/vendor/core/graph/adapters.js +91 -0
- package/dist/vendor/core/graph/hotspots.d.ts +22 -0
- package/dist/vendor/core/graph/hotspots.js +30 -0
- package/dist/vendor/core/graph/index.d.ts +1 -0
- package/dist/vendor/core/graph/index.js +2 -0
- package/dist/vendor/core/honey/honey-tokens.d.ts +32 -0
- package/dist/vendor/core/honey/honey-tokens.js +44 -0
- package/dist/vendor/core/index.d.ts +2 -2
- package/dist/vendor/core/index.js +38 -12
- package/dist/vendor/core/learning/bayesian-update.d.ts +31 -0
- package/dist/vendor/core/learning/bayesian-update.js +60 -0
- package/dist/vendor/core/learning/prior-sets.d.ts +42 -0
- package/dist/vendor/core/learning/prior-sets.js +111 -0
- package/dist/vendor/core/learning/promotion-gate.d.ts +17 -0
- package/dist/vendor/core/learning/promotion-gate.js +23 -0
- package/dist/vendor/core/leash/blast-radius.d.ts +42 -0
- package/dist/vendor/core/leash/blast-radius.js +156 -0
- package/dist/vendor/core/leash/policy-leash.d.ts +31 -0
- package/dist/vendor/core/leash/policy-leash.js +117 -0
- package/dist/vendor/core/memo/memo.d.ts +63 -0
- package/dist/vendor/core/memo/memo.js +97 -0
- package/dist/vendor/core/memory/learning-pipeline.d.ts +154 -0
- package/dist/vendor/core/memory/learning-pipeline.js +391 -0
- package/dist/vendor/core/memory/palace.d.ts +84 -0
- package/dist/vendor/core/memory/palace.js +379 -0
- package/dist/vendor/core/merge/ast-merge.d.ts +22 -0
- package/dist/vendor/core/merge/ast-merge.js +350 -0
- package/dist/vendor/core/merge/text-merge.d.ts +12 -0
- package/dist/vendor/core/merge/text-merge.js +182 -0
- package/dist/vendor/core/otel/tracer.d.ts +45 -0
- package/dist/vendor/core/otel/tracer.js +116 -0
- package/dist/vendor/core/parallel/parallel-attempts.d.ts +28 -0
- package/dist/vendor/core/parallel/parallel-attempts.js +41 -0
- package/dist/vendor/core/parallel/scorer.d.ts +24 -0
- package/dist/vendor/core/parallel/scorer.js +65 -0
- package/dist/vendor/core/pattern-detection.d.ts +64 -0
- package/dist/vendor/core/pattern-detection.js +108 -0
- package/dist/vendor/core/persistence/checkpoint.d.ts +44 -0
- package/dist/vendor/core/persistence/checkpoint.js +156 -0
- package/dist/vendor/core/persistence/cleanup.d.ts +22 -0
- package/dist/vendor/core/persistence/cleanup.js +131 -0
- package/dist/vendor/core/persistence/index.d.ts +2 -0
- package/dist/vendor/core/persistence/index.js +1 -0
- package/dist/vendor/core/persistence/runs-reader.d.ts +52 -0
- package/dist/vendor/core/persistence/runs-reader.js +84 -0
- package/dist/vendor/core/persistence/store.d.ts +6 -1
- package/dist/vendor/core/persistence/store.js +5 -0
- package/dist/vendor/core/policy/file-touch-quota.d.ts +60 -0
- package/dist/vendor/core/policy/file-touch-quota.js +105 -0
- package/dist/vendor/core/policy/policy-loader.d.ts +30 -0
- package/dist/vendor/core/policy/policy-loader.js +170 -0
- package/dist/vendor/core/policy/policy-schema.d.ts +55 -0
- package/dist/vendor/core/policy/policy-schema.js +78 -0
- package/dist/vendor/core/probe/probe.d.ts +49 -0
- package/dist/vendor/core/probe/probe.js +115 -0
- package/dist/vendor/core/proof/patch-proof.d.ts +58 -0
- package/dist/vendor/core/proof/patch-proof.js +84 -0
- package/dist/vendor/core/proof/semantic-probe.d.ts +25 -0
- package/dist/vendor/core/proof/semantic-probe.js +82 -0
- package/dist/vendor/core/recovery/failure-mode-runner.d.ts +29 -0
- package/dist/vendor/core/recovery/failure-mode-runner.js +39 -0
- package/dist/vendor/core/red-blue/red-phase.d.ts +64 -0
- package/dist/vendor/core/red-blue/red-phase.js +141 -0
- package/dist/vendor/core/red-blue/risk-tiers.d.ts +22 -0
- package/dist/vendor/core/red-blue/risk-tiers.js +33 -0
- package/dist/vendor/core/replay/replay.d.ts +85 -0
- package/dist/vendor/core/replay/replay.js +109 -0
- package/dist/vendor/core/router/engine.d.ts +54 -0
- package/dist/vendor/core/router/engine.js +131 -0
- package/dist/vendor/core/router/index.d.ts +1 -0
- package/dist/vendor/core/router/index.js +2 -0
- package/dist/vendor/core/router/trust-calibration.d.ts +57 -0
- package/dist/vendor/core/router/trust-calibration.js +127 -0
- package/dist/vendor/core/run-martin.d.ts +2 -0
- package/dist/vendor/core/run-martin.js +287 -0
- package/dist/vendor/core/security/cve-scanner.d.ts +62 -0
- package/dist/vendor/core/security/cve-scanner.js +178 -0
- package/dist/vendor/core/sentinel/efficiency-sentinel.d.ts +29 -0
- package/dist/vendor/core/sentinel/efficiency-sentinel.js +30 -0
- package/dist/vendor/core/sentinel/progress-guard.d.ts +35 -0
- package/dist/vendor/core/sentinel/progress-guard.js +46 -0
- package/dist/vendor/core/siem/siem-emitter.d.ts +49 -0
- package/dist/vendor/core/siem/siem-emitter.js +157 -0
- package/dist/vendor/core/strategy/attempt-brief.d.ts +22 -0
- package/dist/vendor/core/strategy/attempt-brief.js +89 -0
- package/dist/vendor/core/summarize/diff-summary.d.ts +35 -0
- package/dist/vendor/core/summarize/diff-summary.js +204 -0
- package/dist/vendor/core/surface-signals.d.ts +21 -0
- package/dist/vendor/core/surface-signals.js +139 -0
- package/dist/vendor/core/truth/truth-wall.d.ts +51 -0
- package/dist/vendor/core/truth/truth-wall.js +69 -0
- package/dist/vendor/core/truth-spine.d.ts +26 -0
- package/dist/vendor/core/truth-spine.js +62 -0
- package/dist/vendor/core/types.d.ts +115 -0
- package/dist/vendor/core/types.js +2 -0
- package/dist/vendor/core/verification/tiered-verify.d.ts +17 -0
- package/dist/vendor/core/verification/tiered-verify.js +29 -0
- package/dist/vendor/core/verifier-pyramid.d.ts +32 -0
- package/dist/vendor/core/verifier-pyramid.js +111 -0
- package/dist/vendor/core/workflow-artifacts.d.ts +99 -0
- package/dist/vendor/core/workflow-artifacts.js +668 -0
- package/dist/vendor/core/wrap/supervised-run.d.ts +96 -0
- package/dist/vendor/core/wrap/supervised-run.js +178 -0
- package/docs/assets/cli-animated.svg +139 -0
- package/docs/assets/cli-static.svg +34 -0
- package/docs/assets/github-hero-v2.svg +23 -0
- package/docs/assets/martin-raplph.png.jpg +0 -0
- package/docs/assets/martinloop-logo.png +0 -0
- package/docs/assets/nvidia-inception-program-light.png +0 -0
- package/docs/assets/nvidia-inception-program.png +0 -0
- package/docs/assets/phase3c-sidesidebyside-demo.html +228 -0
- package/docs/assets/side-by-side.svg +134 -0
- package/docs/oss/CLAUDE-CODE-WALKTHROUGH.md +142 -142
- package/docs/oss/EXAMPLES.md +134 -134
- package/docs/oss/OSS-BOUNDARY-REPORT.json +1 -1
- package/docs/oss/OSS-BOUNDARY-REPORT.md +1 -1
- package/docs/oss/QUICKSTART.md +170 -165
- package/docs/oss/RALPH-LOOP-SAFETY.md +113 -113
- package/docs/oss/README.md +96 -96
- package/docs/oss/RELEASE-SURFACE-REPORT.json +2 -1
- package/docs/oss/RELEASE-SURFACE-REPORT.md +2 -1
- package/package.json +130 -58
- package/docs/distribution/DIRECTORY-SUBMISSIONS.md +0 -89
- package/docs/distribution/INTEGRATION-OUTREACH.md +0 -61
- package/docs/distribution/UNDER-3-CHALLENGE.md +0 -65
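The per-file counts in the list above can be totaled mechanically. The TypeScript sketch below assumes every entry has the form `<path> +<added> -<removed>` with an optional leading `- ` bullet, as in this listing. `parseFileChange` and `summarize` are hypothetical helper names introduced for illustration; they are not part of martin-loop or of the registry's tooling.

```typescript
// Hypothetical helpers for totaling "+added -removed" counts from a
// file-change listing. The line format is an assumption based on the
// list above, not a documented registry format.
interface FileChange {
  path: string;
  added: number;
  removed: number;
}

function parseFileChange(line: string): FileChange | null {
  // Optional "- " bullet, then a path without spaces, then "+N -M".
  const match = /^(?:-\s+)?(\S+)\s+\+(\d+)\s+-(\d+)$/.exec(line.trim());
  if (match === null) {
    return null;
  }
  const [, path = "", added = "0", removed = "0"] = match;
  return { path, added: Number(added), removed: Number(removed) };
}

function summarize(lines: string[]): { files: number; added: number; removed: number } {
  let files = 0;
  let added = 0;
  let removed = 0;
  for (const line of lines) {
    const change = parseFileChange(line);
    if (change === null) {
      continue; // skip anything that is not a change entry
    }
    files += 1;
    added += change.added;
    removed += change.removed;
  }
  return { files, added, removed };
}

// Three entries copied from the listing above.
const sample = [
  "- package/CODE_OF_CONDUCT.md +32 -0",
  "- package/LICENSE +21 -21",
  "- package/docs/distribution/UNDER-3-CHALLENGE.md +0 -65",
];
console.log(summarize(sample));
```

Entries such as `package/LICENSE +21 -21`, where the counts match, often indicate a whitespace or line-ending rewrite rather than a content change, so raw totals can overstate how much text actually differs.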
package/docs/oss/EXAMPLES.md
CHANGED
@@ -1,134 +1,134 @@
-# Examples
-
-These examples are grounded in the current CLI and MCP surfaces in this repo. Where an example depends on a real provider path, it is labeled that way explicitly.
-
-These are still primarily repo-local RC examples. The root `martin-loop` package facade is now real and smoke-validated, but registry publication remains a later release step.
-
-## 1. Stub-backed hello world
-
-Use this when you want a safe first pass through the loop without real model spend.
-
-### PowerShell
-
-```powershell
-$env:MARTIN_LIVE='false'
-pnpm run:cli -- run `
---workspace ws_demo `
---project proj_demo `
---objective "Describe the current Martin run lifecycle in one paragraph" `
---verify "pnpm --filter @martin/core test"
-Remove-Item Env:MARTIN_LIVE
-```
-
-Why this is useful:
-
-- exercises `runMartin`
-- writes a real loop record and artifacts
-- avoids external provider dependencies
-
-## 2. Repo-backed task with explicit scope
-
-Use allow and deny paths so the task contract is narrow and reviewable.
-
-```bash
-pnpm run:cli -- run \
---cwd . \
---objective "Tighten README wording for the OSS quickstart" \
---verify "pnpm --filter @martin/core test" \
---allow-path README.md \
---allow-path docs/oss/** \
---deny-path apps/control-plane/** \
---accept "Only update documentation files" \
---accept "Do not modify runtime code"
-```
-
-What this demonstrates:
-
-- repo root selection with `--cwd`
-- scoped file-edit boundaries
-- acceptance criteria injection into the task contract
-
-## 3. Safety-block example
-
-This example is expected to block before execution because the verifier command is unsafe.
-
-```bash
-pnpm run:cli -- run \
---objective "Try to run an unsafe verifier" \
---verify "rm -rf ."
-```
-
-Expected behavior:
-
-- the leash blocks the verifier command before adapter execution
-- the run exits through a safety-oriented path rather than pretending the command was acceptable
-- the attempt artifact set includes a persisted leash artifact when applicable
-
-The point of this example is not that `rm` exists on every machine. The point is that the raw verifier text is evaluated before the process would be allowed to run.
-
-## 4. Budget-constrained live run
-
-This is a live-provider example. Only use it when you have the relevant CLI and credentials configured.
-
-```bash
-pnpm run:cli -- run \
---engine codex \
---model o3 \
---objective "Refactor the CLI argument parser for clarity" \
---verify "pnpm --filter @martin/cli test" \
---budget-usd 2 \
---soft-limit-usd 1 \
---max-iterations 2
-```
-
-What to review afterward:
-
-- admission and settlement events in `ledger.jsonl`
-- cost provenance labels in the run artifacts
-- whether the loop stopped for completion, budget pressure, or lack of progress
-
-## 5. MCP invocation shape
-
-The MCP server exposes `martin_run`, `martin_inspect`, and `martin_status`.
-
-Example `martin_run` payload:
-
-```json
-{
-"objective": "Tighten the local dashboard copy",
-"workingDirectory": ".",
-"engine": "claude",
-"verificationPlan": ["pnpm --filter @martin/control-plane test"],
-"maxUsd": 5,
-"maxIterations": 2,
-"maxTokens": 20000,
-"workspaceId": "ws_mcp",
-"projectId": "proj_mcp"
-}
-```
-
-## 6. GitHub Actions budget gate example
-
-See [`examples/github-actions-budget-gate/`](../../examples/github-actions-budget-gate/) for a CI-safe example that runs MartinLoop with a budget cap, an explicit verifier, and an uploaded JSONL run record artifact.
-
-## 7. OpenCode-style adapter example
-
-If you want a runnable, no-credentials-required adapter sketch for another coding runtime, see [`examples/opencode-adapter/`](../../examples/opencode-adapter/). It shows how to keep MartinLoop's budget, verifier, and JSONL record shape stable around an OpenCode-style workflow without claiming a native adapter already exists.
-
-## 8. What to inspect in artifacts
-
-For a repo-backed attempt, look at:
-
-- `contract.json`
-- `state.json`
-- `ledger.jsonl`
-- `artifacts/attempt-XXX/compiled-context.json`
-- `artifacts/attempt-XXX/diff.patch`
-- `artifacts/attempt-XXX/grounding-scan.json`
-- `artifacts/attempt-XXX/leash.json`
-- `artifacts/attempt-XXX/patch-score.json`
-- `artifacts/attempt-XXX/patch-decision.json`
-- `artifacts/attempt-XXX/rollback-boundary.json`
-- `artifacts/attempt-XXX/rollback-outcome.json`
-
-Those files are the evidence trail that backs the runtime’s claims.
+# Examples
+
+These examples are grounded in the current CLI and MCP surfaces in this repo. Where an example depends on a real provider path, it is labeled that way explicitly.
+
+These are still primarily repo-local RC examples. The root `martin-loop` package facade is now real and smoke-validated, but registry publication remains a later release step.
+
+## 1. Stub-backed hello world
+
+Use this when you want a safe first pass through the loop without real model spend.
+
+### PowerShell
+
+```powershell
+$env:MARTIN_LIVE='false'
+pnpm run:cli -- run `
+--workspace ws_demo `
+--project proj_demo `
+--objective "Describe the current Martin run lifecycle in one paragraph" `
+--verify "pnpm --filter @martin/core test"
+Remove-Item Env:MARTIN_LIVE
+```
+
+Why this is useful:
+
+- exercises `runMartin`
+- writes a real loop record and artifacts
+- avoids external provider dependencies
+
+## 2. Repo-backed task with explicit scope
+
+Use allow and deny paths so the task contract is narrow and reviewable.
+
+```bash
+pnpm run:cli -- run \
+--cwd . \
+--objective "Tighten README wording for the OSS quickstart" \
+--verify "pnpm --filter @martin/core test" \
+--allow-path README.md \
+--allow-path docs/oss/** \
+--deny-path apps/control-plane/** \
+--accept "Only update documentation files" \
+--accept "Do not modify runtime code"
+```
+
+What this demonstrates:
+
+- repo root selection with `--cwd`
+- scoped file-edit boundaries
+- acceptance criteria injection into the task contract
+
+## 3. Safety-block example
+
+This example is expected to block before execution because the verifier command is unsafe.
+
+```bash
+pnpm run:cli -- run \
+--objective "Try to run an unsafe verifier" \
+--verify "rm -rf ."
+```
+
+Expected behavior:
+
+- the leash blocks the verifier command before adapter execution
+- the run exits through a safety-oriented path rather than pretending the command was acceptable
+- the attempt artifact set includes a persisted leash artifact when applicable
+
+The point of this example is not that `rm` exists on every machine. The point is that the raw verifier text is evaluated before the process would be allowed to run.
+
+## 4. Budget-constrained live run
+
+This is a live-provider example. Only use it when you have the relevant CLI and credentials configured.
+
+```bash
+pnpm run:cli -- run \
+--engine codex \
+--model o3 \
+--objective "Refactor the CLI argument parser for clarity" \
+--verify "pnpm --filter @martin/cli test" \
+--budget-usd 2 \
+--soft-limit-usd 1 \
+--max-iterations 2
+```
+
+What to review afterward:
+
+- admission and settlement events in `ledger.jsonl`
+- cost provenance labels in the run artifacts
+- whether the loop stopped for completion, budget pressure, or lack of progress
+
+## 5. MCP invocation shape
+
+The MCP server exposes `martin_run`, `martin_inspect`, and `martin_status`.
+
+Example `martin_run` payload:
+
+```json
+{
+"objective": "Tighten the local dashboard copy",
+"workingDirectory": ".",
+"engine": "claude",
+"verificationPlan": ["pnpm --filter @martin/control-plane test"],
+"maxUsd": 5,
+"maxIterations": 2,
+"maxTokens": 20000,
+"workspaceId": "ws_mcp",
+"projectId": "proj_mcp"
+}
+```
+
+## 6. GitHub Actions budget gate example
+
+See [`examples/github-actions-budget-gate/`](../../examples/github-actions-budget-gate/) for a CI-safe example that runs MartinLoop with a budget cap, an explicit verifier, and an uploaded JSONL run record artifact.
+
+## 7. OpenCode-style adapter example
+
+If you want a runnable, no-credentials-required adapter sketch for another coding runtime, see [`examples/opencode-adapter/`](../../examples/opencode-adapter/). It shows how to keep MartinLoop's budget, verifier, and JSONL record shape stable around an OpenCode-style workflow without claiming a native adapter already exists.
+
+## 8. What to inspect in artifacts
+
+For a repo-backed attempt, look at:
+
+- `contract.json`
+- `state.json`
+- `ledger.jsonl`
+- `artifacts/attempt-XXX/compiled-context.json`
+- `artifacts/attempt-XXX/diff.patch`
+- `artifacts/attempt-XXX/grounding-scan.json`
+- `artifacts/attempt-XXX/leash.json`
+- `artifacts/attempt-XXX/patch-score.json`
+- `artifacts/attempt-XXX/patch-decision.json`
+- `artifacts/attempt-XXX/rollback-boundary.json`
+- `artifacts/attempt-XXX/rollback-outcome.json`
+
+Those files are the evidence trail that backs the runtime’s claims.
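The `martin_run` payload in section 5 of EXAMPLES.md can be mirrored as a typed shape for callers that build it programmatically. In this TypeScript sketch the field names come from that example payload; the types, which fields are optional, and the validation rules are assumptions made for illustration and do not represent the MCP server's actual schema. `MartinRunPayload` and `validatePayload` are hypothetical names.

```typescript
// Field names follow the example payload in EXAMPLES.md section 5.
// Types, optionality, and the checks below are assumptions, not the
// server's published schema.
interface MartinRunPayload {
  objective: string;
  verificationPlan: string[];
  workingDirectory?: string;
  engine?: string;
  maxUsd?: number;
  maxIterations?: number;
  maxTokens?: number;
  workspaceId?: string;
  projectId?: string;
}

// Returns a list of human-readable problems; an empty list means the
// payload passed these (assumed) client-side checks.
function validatePayload(payload: MartinRunPayload): string[] {
  const problems: string[] = [];
  if (payload.objective.trim() === "") {
    problems.push("objective must not be empty");
  }
  if (payload.verificationPlan.length === 0) {
    problems.push("verificationPlan needs at least one command");
  }
  const limits: Array<[string, number | undefined]> = [
    ["maxUsd", payload.maxUsd],
    ["maxIterations", payload.maxIterations],
    ["maxTokens", payload.maxTokens],
  ];
  for (const [name, value] of limits) {
    if (value !== undefined && !(Number.isFinite(value) && value > 0)) {
      problems.push(`${name} must be a positive number`);
    }
  }
  return problems;
}

// The payload from the documentation, unchanged.
const examplePayload: MartinRunPayload = {
  objective: "Tighten the local dashboard copy",
  workingDirectory: ".",
  engine: "claude",
  verificationPlan: ["pnpm --filter @martin/control-plane test"],
  maxUsd: 5,
  maxIterations: 2,
  maxTokens: 20000,
  workspaceId: "ws_mcp",
  projectId: "proj_mcp",
};

console.log(validatePayload(examplePayload));
```

Checking the budget and iteration limits on the client side is only a convenience here; the documentation describes the runtime itself as the place where budget and safety decisions are enforced.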
package/docs/oss/QUICKSTART.md
CHANGED
|
@@ -1,165 +1,170 @@
|
|
|
1
|
-
# Quickstart
|
|
2
|
-
|
|
3
|
-
This quickstart is intentionally conservative. It is written for a fresh engineer validating the
|
|
4
|
-
|
|
5
|
-
## Public launch target vs current RC path
|
|
6
|
-
|
|
7
|
-
The frozen public launch target is:
|
|
8
|
-
|
|
9
|
-
- `npm install martin-loop`
|
|
10
|
-
- `npx martin-loop ...`
|
|
11
|
-
- `import { MartinLoop } from "martin-loop"`
|
|
12
|
-
- `npx @martinloop/mcp`
|
|
13
|
-
|
|
14
|
-
That runtime launch surface is implemented in the root package facade and smoke-validated from a clean temporary install. The MCP package shape is also smoke-validated from a packed tarball. This quickstart still documents the honest RC-from-source path because public registry publication is a
|
|
15
|
-
|
|
16
|
-
## Prerequisites
|
|
17
|
-
|
|
18
|
-
- Node.js 20+ recommended
|
|
19
|
-
- `pnpm` 10.x
|
|
20
|
-
- A clean local checkout of this repo
|
|
21
|
-
|
|
22
|
-
Optional for live runs:
|
|
23
|
-
|
|
24
|
-
- Claude Code CLI for the Claude adapter path
|
|
25
|
-
- OpenAI Codex CLI plus credentials for the Codex adapter path
|
|
26
|
-
|
|
27
|
-
## Install and build
|
|
28
|
-
|
|
29
|
-
From the repo root:
|
|
30
|
-
|
|
31
|
-
```bash
|
|
32
|
-
pnpm install
|
|
33
|
-
pnpm build
|
|
34
|
-
```
|
|
35
|
-
|
|
36
|
-
## Run the RC validation matrix
|
|
37
|
-
|
|
38
|
-
```bash
|
|
39
|
-
pnpm rc:validate
|
|
40
|
-
```
|
|
41
|
-
|
|
42
|
-
What this does:
|
|
43
|
-
|
|
44
|
-
- creates an isolated temporary home or profile directory
|
|
45
|
-
- points Martin run artifacts at that clean location
|
|
46
|
-
- runs the current build, lint, test, benchmark, and certification matrix
|
|
47
|
-
- writes step logs into a temp `martin-rc-validation-*` directory
|
|
48
|
-
|
|
49
|
-
Use this when you want to answer, "Can a fresh environment still reproduce the current RC baseline?"
|
|
50
|
-
|
|
51
|
-
## RC gate commands
|
|
52
|
-
|
|
53
|
-
The current Phase 13 RC gate is made of these commands:
|
|
54
|
-
|
|
55
|
-
- `pnpm oss:validate`
|
|
56
|
-
- `pnpm public:smoke`
|
|
57
|
-
- `pnpm
|
|
58
|
-
- `pnpm
|
|
59
|
-
- `pnpm
|
|
60
|
-
- `pnpm
|
|
61
|
-
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
pnpm
|
|
67
|
-
pnpm
|
|
68
|
-
pnpm
|
|
69
|
-
pnpm
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
82
|
-
|
|
83
|
-
|
|
84
|
-
|
|
85
|
-
|
|
86
|
-
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
|
|
90
|
-
```
|
|
91
|
-
|
|
92
|
-
|
|
93
|
-
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
```
|
|
105
|
-
|
|
106
|
-
|
|
107
|
-
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
111
|
-
|
|
112
|
-
```
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
120
|
-
|
|
121
|
-
|
|
122
|
-
```
|
|
123
|
-
|
|
124
|
-
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
|
|
129
|
-
|
|
130
|
-
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
|
|
144
|
-
|
|
145
|
-
|
|
146
|
-
|
|
147
|
-
|
|
148
|
-
|
|
149
|
-
|
|
150
|
-
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
155
|
-
|
|
156
|
-
|
|
157
|
-
|
|
158
|
-
|
|
159
|
-
|
|
160
|
-
|
|
161
|
-
|
|
162
|
-
-
|
|
163
|
-
-
|
|
164
|
-
|
|
165
|
-
|
|
1
|
+
# Quickstart
|
|
2
|
+
|
|
3
|
+
This quickstart is intentionally conservative. It is written for a fresh engineer validating the active Phase 15 release lane, not for a hypothetical future public release.
|
|
4
|
+
|
|
5
|
+
## Public launch target vs current RC path
|
|
6
|
+
|
|
7
|
+
The frozen public launch target is:
|
|
8
|
+
|
|
9
|
+
- `npm install martin-loop`
|
|
10
|
+
- `npx martin-loop ...`
|
|
11
|
+
- `import { MartinLoop } from "martin-loop"`
|
|
12
|
+
- `npx @martinloop/mcp`
|
|
13
|
+
|
|
14
|
+
That runtime launch surface is implemented in the root package facade and smoke-validated from a clean temporary install. The MCP package shape is also smoke-validated from a packed tarball. This quickstart still documents the honest RC-from-source path because public registry publication is a later release step.
|
|
15
|
+
|
|
16
|
+
## Prerequisites
|
|
17
|
+
|
|
18
|
+
- Node.js 20+ recommended
|
|
19
|
+
- `pnpm` 10.x
|
|
20
|
+
- A clean local checkout of this repo
|
|
21
|
+
|
|
22
|
+
Optional for live runs:
|
|
23
|
+
|
|
24
|
+
- Claude Code CLI for the Claude adapter path
|
|
25
|
+
- OpenAI Codex CLI plus credentials for the Codex adapter path
|
|
26
|
+
|
|
27
|
+
## Install and build
|
|
28
|
+
|
|
29
|
+
From the repo root:
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
pnpm install
|
|
33
|
+
pnpm build
|
|
34
|
+
```
|
|
35
|
+
|
|
36
|
+
## Run the RC validation matrix
|
|
37
|
+
|
|
38
|
+
```bash
|
|
39
|
+
pnpm rc:validate
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
What this does:
|
|
43
|
+
|
|
44
|
+
- creates an isolated temporary home or profile directory
|
|
45
|
+
- points Martin run artifacts at that clean location
|
|
46
|
+
- runs the current build, lint, test, benchmark, and certification matrix
|
|
47
|
+
- writes step logs into a temp `martin-rc-validation-*` directory
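
The isolation idea in the list above can be sketched in a few lines of Node.js. This is not the `rc:validate` implementation, which this quickstart does not show. `makeIsolatedEnv` is a hypothetical helper, and redirecting `HOME`, `USERPROFILE`, and `MARTIN_RUNS_DIR` is an assumption about how a clean profile might be wired. Only the `martin-rc-validation-` prefix and the `MARTIN_RUNS_DIR` override come from this document.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: builds an environment rooted in a fresh temp directory.
function makeIsolatedEnv(baseEnv = process.env) {
  // mkdtempSync appends random characters to the prefix, so every call
  // creates a new, empty directory.
  const root = fs.mkdtempSync(path.join(os.tmpdir(), "martin-rc-validation-"));
  const home = path.join(root, "home");
  const runsDir = path.join(home, ".martin", "runs");
  fs.mkdirSync(runsDir, { recursive: true });

  return {
    root,
    env: {
      ...baseEnv,
      HOME: home, // POSIX home directory (assumed to be redirected)
      USERPROFILE: home, // Windows profile directory (assumed to be redirected)
      MARTIN_RUNS_DIR: runsDir, // override for run artifacts named in this quickstart
    },
  };
}

const isolated = makeIsolatedEnv();
console.log(`isolated root: ${isolated.root}`);
```

A child process started with `isolated.env` would then see an empty home directory instead of a long-lived `~/.martin`.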
|
|
48
|
+
|
|
49
|
+
Use this when you want to answer, "Can a fresh environment still reproduce the current RC baseline?"
|
|
50
|
+
|
|
51
|
+
## RC gate commands
|
|
52
|
+
|
|
53
|
+
The current Phase 13 RC gate is made of these commands:
|
|
54
|
+
|
|
55
|
+
- `pnpm oss:validate`
|
|
56
|
+
- `pnpm public:smoke`
|
|
57
|
+
- `pnpm mcp:published:smoke`
|
|
58
|
+
- `pnpm repo:smoke`
|
|
59
|
+
- `pnpm rc:validate`
|
|
60
|
+
- `pnpm pilot:prep:validate`
|
|
61
|
+
- `pnpm release:matrix:local`
|
|
62
|
+
|
|
63
|
+
Recommended order for a fresh local reviewer:
|
|
64
|
+
|
|
65
|
+
```bash
|
|
66
|
+
pnpm oss:validate
|
|
67
|
+
pnpm public:smoke
|
|
68
|
+
pnpm mcp:published:smoke
|
|
69
|
+
pnpm repo:smoke
|
|
70
|
+
pnpm rc:validate
|
|
71
|
+
pnpm release:matrix:local
|
|
72
|
+
```
|
|
73
|
+
|
|
74
|
+
`pnpm release:matrix:local` runs the full local OS lane for the current machine. The repository also defines Windows, macOS, and Linux CI lanes in `.github/workflows/phase13-release-matrix.yml`.
|
|
75
|
+
|
|
76
|
+
## Stub-safe CLI run
|
|
77
|
+
|
|
78
|
+
This is the safest first run because it avoids real provider spend.
|
|
79
|
+
|
|
80
|
+
### PowerShell
|
|
81
|
+
|
|
82
|
+
```powershell
|
|
83
|
+
$env:MARTIN_LIVE='false'
|
|
84
|
+
pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
|
|
85
|
+
Remove-Item Env:MARTIN_LIVE
|
|
86
|
+
```
|
|
87
|
+
|
|
88
|
+
### Bash
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
MARTIN_LIVE=false pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
This path uses the stub adapter and still exercises the loop, persistence, and policy surfaces.
|
|
95
|
+
|
|
96
|
+
## Config-driven run
|
|
97
|
+
|
|
98
|
+
The repo ships an example config at `martin.config.example.yaml`.
|
|
99
|
+
|
|
100
|
+
Martin automatically looks for `martin.config.yaml` in the invocation root. To use a different file, pass `--config <path>`.
|
|
101
|
+
|
|
102
|
+
Example:
|
|
103
|
+
|
|
104
|
+
```bash
|
|
105
|
+
pnpm run:cli -- run --config martin.config.example.yaml --objective "Run with repo defaults" --verify "pnpm --filter @martin/core test"
|
|
106
|
+
```
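
The config lookup described in this section can be pictured with a minimal sketch. `resolveConfigPath` is a hypothetical helper, not Martin's API, and the precedence it encodes (an explicit `--config` path wins over the auto-discovered file) is an assumption. The text states only that both mechanisms exist.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper. Assumed precedence: explicit path first, then
// martin.config.yaml in the invocation root, otherwise no config file.
function resolveConfigPath(explicitPath, invocationRoot = process.cwd()) {
  if (explicitPath) {
    return path.resolve(invocationRoot, explicitPath);
  }
  const candidate = path.join(invocationRoot, "martin.config.yaml");
  return fs.existsSync(candidate) ? candidate : null;
}

// An explicit path resolves against the invocation root.
console.log(resolveConfigPath("martin.config.example.yaml"));
// Without one, the result is the auto-discovered file or null.
console.log(resolveConfigPath(undefined));
```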
|
|
107
|
+
|
|
108
|
+
## Inspect a saved run
|
|
109
|
+
|
|
110
|
+
Martin persists runs under `~/.martin/runs/` by default, or under `MARTIN_RUNS_DIR` if you override it.
|
|
111
|
+
|
|
112
|
+
```bash
|
|
113
|
+
pnpm run:cli -- inspect --file path/to/loop-record.json
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
For persisted run folders, inspect the `contract.json`, `state.json`, `ledger.jsonl`, and `artifacts/attempt-XXX/` files together. Those artifacts are the source of truth for runtime behavior.
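
A small script can make that combined inspection repeatable. The sketch below assumes only the file names given above. `resolveRunsDir` and `summarizeRunFolder` are hypothetical helpers, and the script makes no claim about the schema inside each file beyond `ledger.jsonl` holding one JSON document per line.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Default location from the text, unless MARTIN_RUNS_DIR overrides it.
function resolveRunsDir(env = process.env) {
  return env.MARTIN_RUNS_DIR || path.join(os.homedir(), ".martin", "runs");
}

// Hypothetical helper: reports which expected files exist, counts ledger
// entries, and lists attempt folders for one persisted run.
function summarizeRunFolder(runDir) {
  const expected = ["contract.json", "state.json", "ledger.jsonl"];
  const present = expected.filter((name) => fs.existsSync(path.join(runDir, name)));

  // JSON Lines: parse every non-empty line as its own JSON document.
  const ledgerPath = path.join(runDir, "ledger.jsonl");
  const ledgerEntries = fs.existsSync(ledgerPath)
    ? fs
        .readFileSync(ledgerPath, "utf8")
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "")
        .map((line) => JSON.parse(line))
    : [];

  // Attempt folders live under artifacts/ and are named attempt-XXX.
  const artifactsDir = path.join(runDir, "artifacts");
  const attempts = fs.existsSync(artifactsDir)
    ? fs.readdirSync(artifactsDir).filter((name) => name.startsWith("attempt-")).sort()
    : [];

  return { present, ledgerEntryCount: ledgerEntries.length, attempts };
}

console.log(`runs directory: ${resolveRunsDir()}`);
```

Calling `summarizeRunFolder` on a single run folder shows at a glance whether any of the expected artifacts is missing before you read them in detail.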
|
|
117
|
+
|
|
118
|
+
## MCP server
|
|
119
|
+
|
|
120
|
+
The publish-ready MCP install target is:
|
|
121
|
+
|
|
122
|
+
```bash
|
|
123
|
+
npx @martinloop/mcp
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
Claude Code one-line install:
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# macOS/Linux
|
|
130
|
+
claude mcp add --scope user martin-loop -- npx @martinloop/mcp
|
|
131
|
+
|
|
132
|
+
# Windows PowerShell/cmd
|
|
133
|
+
claude mcp add --scope user martin-loop cmd /c "npx @martinloop/mcp"
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
Official MCP Registry publication has an extra metadata step beyond npm packaging. Do not mark `@martinloop/mcp` registry-ready unless both of these exist and match:
|
|
137
|
+
|
|
138
|
+
- `packages/mcp/package.json` with `mcpName`
|
|
139
|
+
- `packages/mcp/server.json` with the official server metadata
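
The "exist and match" check on those two files could be automated before publishing. The sketch below is a hypothetical pre-publish helper, not part of the repo. It assumes that the field in `server.json` to compare against `mcpName` is `name`, so confirm that against the registry's metadata requirements before relying on it.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical check. Assumes server.json stores the server name in "name".
function checkRegistryMetadata(mcpPackageDir) {
  const readJson = (file) =>
    JSON.parse(fs.readFileSync(path.join(mcpPackageDir, file), "utf8"));
  const pkg = readJson("package.json");
  const server = readJson("server.json");

  const problems = [];
  if (!pkg.mcpName) problems.push("package.json is missing mcpName");
  if (!server.name) problems.push("server.json is missing name");
  if (pkg.mcpName && server.name && pkg.mcpName !== server.name) {
    problems.push(`mcpName "${pkg.mcpName}" does not match server.json name "${server.name}"`);
  }
  return problems;
}

// Run only when both files exist, so the sketch is harmless outside the repo.
const mcpDir = path.join("packages", "mcp");
if (["package.json", "server.json"].every((f) => fs.existsSync(path.join(mcpDir, f)))) {
  console.log(checkRegistryMetadata(mcpDir));
}
```

An empty array means both files exist and the two names agree. Any other result lists what to fix before marking the package registry-ready.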
|
|
140
|
+
|
|
141
|
+
After publishing `@martinloop/mcp` to npm, run the official registry publisher from `packages/mcp`:
|
|
142
|
+
|
|
143
|
+
```bash
|
|
144
|
+
mcp-publisher login github
|
|
145
|
+
mcp-publisher publish
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
For repo-local verification from source:
|
|
149
|
+
|
|
150
|
+
```bash
|
|
151
|
+
pnpm --filter @martinloop/mcp lint
|
|
152
|
+
pnpm --filter @martinloop/mcp test
|
|
153
|
+
pnpm --filter @martinloop/mcp build
|
|
154
|
+
pnpm --filter @martinloop/mcp smoke:pack
|
|
155
|
+
pnpm --filter @martinloop/mcp smoke:published
|
|
156
|
+
node packages/mcp/dist/server.js
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
The current MCP tools are:
|
|
160
|
+
|
|
161
|
+
- `martin_run`
|
|
162
|
+
- `martin_inspect`
|
|
163
|
+
- `martin_status`
|
|
164
|
+
|
|
165
|
+
## Notes for reviewers
|
|
166
|
+
|
|
167
|
+
- Fresh-home behavior matters. Do not rely only on a long-lived `~/.martin` directory.
|
|
168
|
+
- Exact-versus-estimated cost labels are meaningful and should not be merged in docs or dashboards.
|
|
169
|
+
- The repo contains control-plane code, but the public OSS boundary is still being finalized during Phase 13.
|
|
170
|
+
- The benchmark harness remains a workspace-level RC surface; `martin bench` is not part of the publishable CLI boundary yet.
|