martin-loop 0.1.1 → 0.1.2
- package/README.md +344 -89
- package/docs/oss/EXAMPLES.md +126 -0
- package/docs/oss/OSS-BOUNDARY-REPORT.json +113 -0
- package/docs/oss/OSS-BOUNDARY-REPORT.md +48 -0
- package/docs/oss/QUICKSTART.md +135 -0
- package/docs/{README.md → oss/README.md} +93 -89
- package/docs/oss/RELEASE-SURFACE-REPORT.json +45 -0
- package/docs/oss/RELEASE-SURFACE-REPORT.md +35 -0
- package/package.json +54 -64
- package/dist/bin/martin-loop.d.ts +0 -2
- package/dist/bin/martin-loop.js +0 -19
- package/dist/bin/martin-loop.js.map +0 -1
- package/dist/index.d.ts +0 -9
- package/dist/index.js +0 -9
- package/dist/index.js.map +0 -1
- package/docs/EXAMPLES.md +0 -96
- package/docs/QUICKSTART.md +0 -127
- package/docs/release/CLAIM-TO-CAPABILITY.md +0 -19
package/docs/EXAMPLES.md
DELETED
@@ -1,96 +0,0 @@
-# Examples
-
-Runnable examples for the Martin Loop CLI and SDK.
-
-## 1. Stub-backed hello world
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --objective "Describe the current Martin run lifecycle" \
-  --verify "echo verified"
-```
-
-## 2. Scoped task with path boundaries
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --engine claude \
-  --objective "Tighten the README wording for the quickstart section" \
-  --verify "pnpm --filter @martin/core test" \
-  --allow-path README.md \
-  --allow-path docs/** \
-  --deny-path apps/** \
-  --budget-usd 0.25 \
-  --accept "Only update documentation files" \
-  --accept "Do not modify runtime source code"
-```
-
-## 3. Leash block
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --objective "Run a dangerous verifier" \
-  --verify "rm -rf ."
-```
-
-## 4. Budget-constrained live run
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --engine claude \
-  --objective "Refactor the CLI argument parser for clarity" \
-  --verify "pnpm --filter @martin/cli test" \
-  --budget-usd 1.00 \
-  --soft-limit-usd 0.60 \
-  --max-iterations 3
-```
-
-## 5. Multi-adapter fallback chain
-
-```ts
-import { runMartin } from "@martin/core";
-
-const result = await runMartin({
-  workspaceId: "ws_local",
-  projectId: "proj_example",
-  task: {
-    title: "Fix the failing test",
-    objective: "Fix the failing test without widening scope.",
-    verificationPlan: ["pnpm test"]
-  },
-  budget: {
-    maxUsd: 2,
-    softLimitUsd: 1,
-    maxIterations: 6,
-    maxTokens: 20_000
-  },
-  adapter,
-  fallbackAdapters: [fallbackAdapter]
-});
-
-console.log(result.decision.lifecycleState);
-```
-
-## 6. Inspect a completed run
-
-```bash
-pnpm --filter @martin/cli exec martin inspect --file ~/.martin/runs/<run-id>/loop-record.json
-```
-
-## 7. MCP invocation
-
-```json
-{
-  "tool": "martin_run",
-  "arguments": {
-    "objective": "Repair the flaky test in auth.test.ts",
-    "workingDirectory": ".",
-    "engine": "claude",
-    "verificationPlan": ["pnpm test"],
-    "maxUsd": 1.0,
-    "maxIterations": 4,
-    "workspaceId": "ws_local",
-    "projectId": "proj_auth"
-  }
-}
-```
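Example 2 in the removed EXAMPLES.md combines `--allow-path` and `--deny-path` globs. The diff does not show how Martin Loop evaluates them, so the following is only a minimal sketch of one plausible reading. It assumes that deny rules take precedence and that an empty allow list permits everything. `globToRegExp` and `isPathAllowed` are hypothetical helper names, not part of `@martin/core` or the CLI.

```ts
// Hypothetical helper: convert a simple glob to a RegExp.
// "**" matches across "/" and "*" matches within one path segment.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters except "*", which is handled below.
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  const pattern = escaped
    .replace(/\*\*/g, "\u0000") // placeholder so "**" survives the next step
    .replace(/\*/g, "[^/]*")
    .replace(/\u0000/g, ".*");
  return new RegExp(`^${pattern}$`);
}

// Assumption: a deny match always wins; otherwise the path must match
// at least one allow glob (or the allow list must be empty).
function isPathAllowed(path: string, allow: string[], deny: string[]): boolean {
  if (deny.some((glob) => globToRegExp(glob).test(path))) {
    return false;
  }
  return allow.length === 0 || allow.some((glob) => globToRegExp(glob).test(path));
}

// Globs taken from example 2 above.
const allow = ["README.md", "docs/**"];
const deny = ["apps/**"];
console.log(isPathAllowed("docs/oss/QUICKSTART.md", allow, deny));
console.log(isPathAllowed("apps/web/index.ts", allow, deny));
```

Under these assumptions a documentation file passes and anything under `apps/` is rejected. A path that matches neither list, such as a file under `packages/`, is also rejected because the allow list is non-empty.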
package/docs/QUICKSTART.md
DELETED
@@ -1,127 +0,0 @@
-# Quickstart
-
-Martin Loop runs AI coding agents with hard budget caps, grounding enforcement, and a full audit trail. This guide gets you running in under 5 minutes.
-
-## Prerequisites
-
-- Node.js 20+
-- `pnpm` 10.x
-- Optional: Claude Code CLI for live Claude runs
-- Optional: OpenAI Codex CLI plus credentials for Codex runs
-
-## Install
-
-### From source
-
-```bash
-git clone https://github.com/martinloop/martin-loop
-cd martin-loop
-pnpm install
-pnpm build
-```
-
-This OSS snapshot is validated through the workspace packages, so the examples below use the repo-local CLI entrypoint.
-
-## Your first run (stub mode, no spend)
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --objective "Summarize the current runtime state" \
-  --verify "echo ok"
-```
-
-This exercises the full loop using a stub adapter, so no model is called. Check what was written:
-
-```bash
-pnpm --filter @martin/cli exec martin inspect --file ~/.martin/runs/latest/loop-record.json
-```
-
-## Live run with a budget cap
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --engine claude \
-  --objective "Fix the failing test in packages/core/tests/leash.test.ts" \
-  --verify "pnpm --filter @martin/core test" \
-  --budget-usd 0.50 \
-  --max-iterations 4
-```
-
-Martin will stop at $0.50 regardless of task completion. Budget is a hard cap, not a soft suggestion.
-
-## Safety demo
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --objective "Run an unsafe verifier" \
-  --verify "rm -rf ."
-```
-
-Expected: the run exits immediately with a leash violation.
-
-## Scoped run with path restrictions
-
-```bash
-pnpm --filter @martin/cli exec martin run \
-  --engine claude \
-  --objective "Improve the README wording" \
-  --verify "echo docs-only" \
-  --allow-path README.md \
-  --allow-path docs/** \
-  --deny-path packages/** \
-  --budget-usd 0.25
-```
-
-## Config file
-
-Martin reads `martin.config.yaml` from the current directory automatically:
-
-```yaml
-engine: claude
-budgetUsd: 1.00
-maxIterations: 6
-verificationPlan:
-  - pnpm test
-allowedPaths:
-  - src/**
-deniedPaths:
-  - .env
-  - secrets/**
-```
-
-Then run:
-
-```bash
-pnpm --filter @martin/cli exec martin run --objective "Refactor the auth handler"
-```
-
-## MCP server
-
-```bash
-node packages/mcp/dist/server.js
-```
-
-Tools exposed: `martin_run`, `martin_inspect`, `martin_status`
-
-## What to inspect after a run
-
-```text
-~/.martin/runs/<run-id>/
-  contract.json
-  state.json
-  ledger.jsonl
-  artifacts/attempt-001/
-    diff.patch
-    grounding-scan.json
-    leash.json
-    patch-decision.json
-    rollback-outcome.json
-```
-
-## Validation check
-
-```bash
-pnpm test
-pnpm build
-pnpm typecheck
-```
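The removed quickstart describes the budget as a hard cap and the examples pair it with a soft limit and an iteration ceiling. As a rough illustration of how those three limits could interact, here is a minimal sketch. It is not the Martin Loop runtime. `Budget`, `LoopResult`, `runBudgetedLoop`, and the `runAttempt` callback are hypothetical names, and the sketch assumes the cap is checked before each attempt, so a single attempt can still overshoot it.

```ts
// Hypothetical budget shape, loosely mirroring the flags shown above.
interface Budget {
  maxUsd: number; // hard cap: no new attempt starts once this is reached
  softLimitUsd: number; // soft limit: only flagged in this sketch
  maxIterations: number;
}

interface LoopResult {
  status: "verified" | "budget_exhausted" | "iterations_exhausted";
  spentUsd: number;
  iterations: number;
  softLimitCrossed: boolean;
}

// runAttempt stands in for one agent attempt plus its verification step.
function runBudgetedLoop(
  budget: Budget,
  runAttempt: (iteration: number) => { costUsd: number; verified: boolean }
): LoopResult {
  let spentUsd = 0;
  let softLimitCrossed = false;
  for (let i = 1; i <= budget.maxIterations; i++) {
    // Hard cap check happens before spending more.
    if (spentUsd >= budget.maxUsd) {
      return { status: "budget_exhausted", spentUsd, iterations: i - 1, softLimitCrossed };
    }
    const attempt = runAttempt(i);
    spentUsd += attempt.costUsd;
    if (spentUsd >= budget.softLimitUsd) {
      softLimitCrossed = true;
    }
    if (attempt.verified) {
      return { status: "verified", spentUsd, iterations: i, softLimitCrossed };
    }
  }
  const status = spentUsd >= budget.maxUsd ? "budget_exhausted" : "iterations_exhausted";
  return { status, spentUsd, iterations: budget.maxIterations, softLimitCrossed };
}

// Every attempt costs $0.20 and never verifies, so the $0.50 cap ends the run.
const result = runBudgetedLoop(
  { maxUsd: 0.5, softLimitUsd: 0.3, maxIterations: 4 },
  () => ({ costUsd: 0.2, verified: false })
);
console.log(result.status, result.iterations);
```

A cap that must never be exceeded would also need enforcement inside each attempt, which this sketch leaves out.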
package/docs/release/CLAIM-TO-CAPABILITY.md
DELETED
@@ -1,19 +0,0 @@
-# Claim To Capability
-
-This matrix keeps the public story tied to proof. Every public claim category must point either to repo-owned artifacts or to a frozen external reference record. If a row cannot be defended by evidence, the claim must stay softened or out of market copy.
-
-| Claim category | Current boundary | Evidence type | Evidence reference | Status |
-|---|---|---|---|---|
-| Runtime and artifact truth | Artifact-backed runtime lifecycle, grounding, accounting, and rollback behavior only | repo | `docs/oss/RELEASE-SURFACE-REPORT.md`, `docs/oss/OSS-BOUNDARY-REPORT.md`, `pnpm rc:validate` | ready |
-| Evidence-backed contradiction detection | Completion claims are accepted only when repo-backed change evidence and verifier truth support them; this is contradiction detection, not semantic intent reading | repo | `packages/core/src/evidence/claim-audit.ts`, `packages/core/tests/runtime.test.ts` | ready |
-| Deterministic supported-path recovery | Recovery is deterministic only across the declared adapter/model matrix the runtime and CLI construct; unsupported paths must be surfaced honestly | repo | `packages/core/tests/runtime.test.ts`, `packages/cli/tests/cli-recovery-topology.test.ts` | ready |
-| Public package install and CLI surface | Public install target stays `martin-loop`; operator truth still starts from the repo until human publish | repo | `pnpm public:smoke`, `pnpm release:package:validate` | ready |
-| Repo-backed safety, rollback, and grounding proof | Repo-backed mutations must remain explainable from artifacts alone | repo | `pnpm repo:smoke`, `docs/pilot/PILOT-GATE-REVIEW.md` | ready |
-| Pilot closeout evidence | Phase 14 claims stay bounded to 2 accepted `disposable_internal`, 2 accepted `low_risk_real`, and gate `GO` | repo | `docs/pilot/PILOT-RUN-TRACKER.md`, `docs/review-packs/2026-04-05-phase14-pilot-04/review.md` | ready |
-| Website copy and positioning | Website copy must not outrun the shipped public surface or the pilot evidence | external reference | `docs/release/external-evidence/WEBSITE-SURFACE-REFERENCE.md` | ready_for_signoff |
-| Pricing and feature gating | Pricing and packaging statements must align with the single-facade `martin-loop` release and the current feature gate truth | external reference | `docs/release/external-evidence/PRICING-SURFACE-REFERENCE.md` | ready_for_signoff |
-| Privacy commitments | Privacy claims must stay inside the shipped operator and data-handling behavior | external reference | `docs/release/external-evidence/PRIVACY-POLICY-REFERENCE.md` | ready_for_signoff |
-| Terms and access control | Terms must reflect the current access model, manual release sequence, and support boundaries | external reference | `docs/release/external-evidence/TERMS-OF-SERVICE-REFERENCE.md` | ready_for_signoff |
-| Product claim matrix | Marketing or launch claims stay frozen to the reviewed matrix until a human owner widens them intentionally | external reference | `docs/release/external-evidence/PRODUCT-CLAIM-MATRIX-REFERENCE.md` | ready_for_signoff |
-| Semantic hallucination detection | Not claimed. The live repo still exposes a structural short-response heuristic, but release language must not present that as semantic truth understanding | repo | `PRODUCTION-READINESS-AUDIT-REPORT.md`, `docs/handoffs/2026-04-08-h2-h3-complete-handoff.md` | not_claimed |
-| Universal autonomous self-recovery | Not claimed. Release language is limited to deterministic recovery across the declared supported adapter/model matrix and explicit single-path disclosures | repo | `PRODUCTION-READINESS-AUDIT-REPORT.md`, `packages/cli/tests/cli-recovery-topology.test.ts` | not_claimed |