martin-loop 0.1.0 → 0.1.2

package/docs/EXAMPLES.md DELETED
@@ -1,96 +0,0 @@
- # Examples
-
- Runnable examples for the Martin Loop CLI and SDK.
-
- ## 1. Stub-backed hello world
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --objective "Describe the current Martin run lifecycle" \
-   --verify "echo verified"
- ```
-
- ## 2. Scoped task with path boundaries
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --engine claude \
-   --objective "Tighten the README wording for the quickstart section" \
-   --verify "pnpm --filter @martin/core test" \
-   --allow-path README.md \
-   --allow-path docs/** \
-   --deny-path apps/** \
-   --budget-usd 0.25 \
-   --accept "Only update documentation files" \
-   --accept "Do not modify runtime source code"
- ```
-
- ## 3. Leash block
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --objective "Run a dangerous verifier" \
-   --verify "rm -rf ."
- ```
-
- ## 4. Budget-constrained live run
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --engine claude \
-   --objective "Refactor the CLI argument parser for clarity" \
-   --verify "pnpm --filter @martin/cli test" \
-   --budget-usd 1.00 \
-   --soft-limit-usd 0.60 \
-   --max-iterations 3
- ```
-
- ## 5. Multi-adapter fallback chain
-
- ```ts
- import { runMartin } from "@martin/core";
-
- const result = await runMartin({
-   workspaceId: "ws_local",
-   projectId: "proj_example",
-   task: {
-     title: "Fix the failing test",
-     objective: "Fix the failing test without widening scope.",
-     verificationPlan: ["pnpm test"]
-   },
-   budget: {
-     maxUsd: 2,
-     softLimitUsd: 1,
-     maxIterations: 6,
-     maxTokens: 20_000
-   },
-   adapter,
-   fallbackAdapters: [fallbackAdapter]
- });
-
- console.log(result.decision.lifecycleState);
- ```
-
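The deleted example passes `adapter` and `fallbackAdapters` without showing how a fallback chain behaves. The sketch below illustrates one plausible reading, assuming that "fallback" means trying the next adapter when the previous one throws; the source does not confirm this. The `Adapter` interface and `runWithFallback` function are hypothetical names for illustration and are not part of `@martin/core`.

```typescript
// Hypothetical adapter shape for illustration; not the @martin/core API.
interface Adapter {
  name: string;
  run(objective: string): Promise<string>;
}

// Try each adapter in order and return the first successful result.
// Assumption: a thrown error is what triggers the fallback.
async function runWithFallback(
  adapters: Adapter[],
  objective: string
): Promise<{ adapter: string; output: string }> {
  const errors: string[] = [];
  for (const adapter of adapters) {
    try {
      const output = await adapter.run(objective);
      return { adapter: adapter.name, output };
    } catch (err) {
      // Record the failure and move on to the next adapter in the chain.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${adapter.name}: ${message}`);
    }
  }
  throw new Error(`All adapters failed: ${errors.join("; ")}`);
}
```

Collecting every failure message before throwing keeps the final error explainable when the whole chain is exhausted.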
- ## 6. Inspect a completed run
-
- ```bash
- pnpm --filter @martin/cli exec martin inspect --file ~/.martin/runs/<run-id>/loop-record.json
- ```
-
- ## 7. MCP invocation
-
- ```json
- {
-   "tool": "martin_run",
-   "arguments": {
-     "objective": "Repair the flaky test in auth.test.ts",
-     "workingDirectory": ".",
-     "engine": "claude",
-     "verificationPlan": ["pnpm test"],
-     "maxUsd": 1.0,
-     "maxIterations": 4,
-     "workspaceId": "ws_local",
-     "projectId": "proj_auth"
-   }
- }
- ```
@@ -1,127 +0,0 @@
- # Quickstart
-
- Martin Loop runs AI coding agents with hard budget caps, grounding enforcement, and a full audit trail. This guide gets you running in under 5 minutes.
-
- ## Prerequisites
-
- - Node.js 20+
- - `pnpm` 10.x
- - Optional: Claude Code CLI for live Claude runs
- - Optional: OpenAI Codex CLI plus credentials for Codex runs
-
- ## Install
-
- ### From source
-
- ```bash
- git clone https://github.com/martinloop/martin-loop
- cd martin-loop
- pnpm install
- pnpm build
- ```
-
- This OSS snapshot is validated through the workspace packages, so the examples below use the repo-local CLI entrypoint.
-
- ## Your first run (stub mode, no spend)
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --objective "Summarize the current runtime state" \
-   --verify "echo ok"
- ```
-
- This exercises the full loop using a stub adapter, so no model is called. Check what was written:
-
- ```bash
- pnpm --filter @martin/cli exec martin inspect --file ~/.martin/runs/latest/loop-record.json
- ```
-
- ## Live run with a budget cap
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --engine claude \
-   --objective "Fix the failing test in packages/core/tests/leash.test.ts" \
-   --verify "pnpm --filter @martin/core test" \
-   --budget-usd 0.50 \
-   --max-iterations 4
- ```
-
- Martin will stop at $0.50 regardless of task completion. Budget is a hard cap, not a soft suggestion.
-
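The hard-cap behavior described above can be sketched as a loop that stops once accumulated spend reaches the cap or the iteration limit is hit. This is a minimal illustration, not Martin Loop's implementation: `Attempt`, `StopReason`, and `runLoop` are hypothetical names, amounts are tracked in integer cents to avoid floating-point drift, and the cap is only checked between attempts, so a single attempt can still overshoot it in this simplified version.

```typescript
// Hypothetical types for illustration; not Martin Loop's API.
interface Attempt {
  costCents: number;
  verified: boolean;
}

type StopReason = "verified" | "budget_exhausted" | "max_iterations";

interface LoopResult {
  spentCents: number;
  iterations: number;
  reason: StopReason;
}

// Run attempts until one verifies, the budget is spent, or iterations run out.
function runLoop(
  attempt: (iteration: number) => Attempt,
  budgetCents: number,
  maxIterations: number
): LoopResult {
  let spentCents = 0;
  for (let i = 1; i <= maxIterations; i++) {
    const result = attempt(i);
    spentCents += result.costCents;
    if (result.verified) {
      return { spentCents, iterations: i, reason: "verified" };
    }
    // Hard cap: no further attempts once the budget is reached.
    if (spentCents >= budgetCents) {
      return { spentCents, iterations: i, reason: "budget_exhausted" };
    }
  }
  return { spentCents, iterations: maxIterations, reason: "max_iterations" };
}
```

A stricter variant would estimate the cost of the next attempt up front and refuse to start it if the estimate would exceed the remaining budget.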
- ## Safety demo
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --objective "Run an unsafe verifier" \
-   --verify "rm -rf ."
- ```
-
- Expected: the run exits immediately with a leash violation.
-
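The source does not show how the leash decides that `rm -rf .` is unsafe. As a deliberately narrow sketch of the idea, the check below rejects a verifier command before it runs if it is an `rm` invocation with both recursive and force flags. `LeashVerdict` and `checkVerifierCommand` are hypothetical names, and a real check would also need to handle chained commands, `sudo`, and other destructive tools.

```typescript
// Hypothetical verdict shape; the real leash rules are not shown in the source.
interface LeashVerdict {
  allowed: boolean;
  reason?: string;
}

// Reject `rm` invocations that combine recursive and force flags.
function checkVerifierCommand(command: string): LeashVerdict {
  const tokens = command.trim().split(/\s+/);
  if (tokens[0] === "rm") {
    // Short flags look like -rf; long flags look like --recursive.
    const shortFlags = tokens.filter((t) => /^-[^-]/.test(t)).join("");
    const longFlags = tokens.filter((t) => t.startsWith("--"));
    const recursive = /[rR]/.test(shortFlags) || longFlags.includes("--recursive");
    const force = shortFlags.includes("f") || longFlags.includes("--force");
    if (recursive && force) {
      return { allowed: false, reason: "recursive forced delete" };
    }
  }
  return { allowed: true };
}
```

Running the check before the verifier is spawned is what lets the run exit immediately instead of after damage is done.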
- ## Scoped run with path restrictions
-
- ```bash
- pnpm --filter @martin/cli exec martin run \
-   --engine claude \
-   --objective "Improve the README wording" \
-   --verify "echo docs-only" \
-   --allow-path README.md \
-   --allow-path docs/** \
-   --deny-path packages/** \
-   --budget-usd 0.25
- ```
-
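To make the effect of `--allow-path` and `--deny-path` concrete, here is a small sketch of glob-based path scoping. It rests on two assumptions the source does not state: deny patterns take precedence over allow patterns, and a path must match at least one allow pattern to be writable. `globToRegExp` and `isPathAllowed` are hypothetical helpers that support only `*` and `**`.

```typescript
// Convert a simple glob (supporting * and **) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000") // protect ** before handling single *
    .replace(/\*/g, "[^/]*") // * matches within one path segment
    .replace(/\u0000/g, ".*"); // ** matches across segments
  return new RegExp(`^${pattern}$`);
}

// Assumption: deny wins over allow, and unmatched paths are rejected.
function isPathAllowed(path: string, allow: string[], deny: string[]): boolean {
  if (deny.some((glob) => globToRegExp(glob).test(path))) {
    return false;
  }
  return allow.some((glob) => globToRegExp(glob).test(path));
}
```

With the flags from the example above, `docs/guide/intro.md` would be in scope while `packages/core/index.ts` would not.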
- ## Config file
-
- Martin reads `martin.config.yaml` from the current directory automatically:
-
- ```yaml
- engine: claude
- budgetUsd: 1.00
- maxIterations: 6
- verificationPlan:
-   - pnpm test
- allowedPaths:
-   - src/**
- deniedPaths:
-   - .env
-   - secrets/**
- ```
-
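The keys in the config above map naturally onto a typed shape. The sketch below validates an already-parsed config object; it assumes every key is optional and does not parse YAML itself. `MartinConfig` and `validateConfig` are hypothetical names, and the real schema may accept more keys or enforce different rules.

```typescript
// Hypothetical shape mirroring the keys in the YAML example above.
interface MartinConfig {
  engine?: string;
  budgetUsd?: number;
  maxIterations?: number;
  verificationPlan?: string[];
  allowedPaths?: string[];
  deniedPaths?: string[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

// Return a list of human-readable problems; an empty list means the config looks valid.
function validateConfig(raw: unknown): string[] {
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) {
    return ["config must be a mapping"];
  }
  const config = raw as Record<string, unknown>;
  const errors: string[] = [];
  const engine = config["engine"];
  if (engine !== undefined && typeof engine !== "string") {
    errors.push("engine must be a string");
  }
  const budget = config["budgetUsd"];
  if (budget !== undefined && !(typeof budget === "number" && budget > 0)) {
    errors.push("budgetUsd must be a positive number");
  }
  const iterations = config["maxIterations"];
  if (iterations !== undefined && !(Number.isInteger(iterations) && (iterations as number) > 0)) {
    errors.push("maxIterations must be a positive integer");
  }
  for (const key of ["verificationPlan", "allowedPaths", "deniedPaths"]) {
    const value = config[key];
    if (value !== undefined && !isStringArray(value)) {
      errors.push(`${key} must be a list of strings`);
    }
  }
  return errors;
}
```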
- Then run:
-
- ```bash
- pnpm --filter @martin/cli exec martin run --objective "Refactor the auth handler"
- ```
-
- ## MCP server
-
- ```bash
- node packages/mcp/dist/server.js
- ```
-
- Tools exposed: `martin_run`, `martin_inspect`, `martin_status`
-
- ## What to inspect after a run
-
- ```text
- ~/.martin/runs/<run-id>/
-   contract.json
-   state.json
-   ledger.jsonl
-   artifacts/attempt-001/
-     diff.patch
-     grounding-scan.json
-     leash.json
-     patch-decision.json
-     rollback-outcome.json
- ```
-
- ## Validation check
-
- ```bash
- pnpm test
- pnpm build
- pnpm typecheck
- ```
@@ -1,19 +0,0 @@
- # Claim To Capability
-
- This matrix keeps the public story tied to proof. Every public claim category must point either to repo-owned artifacts or to a frozen external reference record. If a row cannot be defended by evidence, the claim must stay softened or out of market copy.
-
- | Claim category | Current boundary | Evidence type | Evidence reference | Status |
- |---|---|---|---|---|
- | Runtime and artifact truth | Artifact-backed runtime lifecycle, grounding, accounting, and rollback behavior only | repo | `docs/oss/RELEASE-SURFACE-REPORT.md`, `docs/oss/OSS-BOUNDARY-REPORT.md`, `pnpm rc:validate` | ready |
- | Evidence-backed contradiction detection | Completion claims are accepted only when repo-backed change evidence and verifier truth support them; this is contradiction detection, not semantic intent reading | repo | `packages/core/src/evidence/claim-audit.ts`, `packages/core/tests/runtime.test.ts` | ready |
- | Deterministic supported-path recovery | Recovery is deterministic only across the declared adapter/model matrix the runtime and CLI construct; unsupported paths must be surfaced honestly | repo | `packages/core/tests/runtime.test.ts`, `packages/cli/tests/cli-recovery-topology.test.ts` | ready |
- | Public package install and CLI surface | Public install target stays `martin-loop`; operator truth still starts from the repo until human publish | repo | `pnpm public:smoke`, `pnpm release:package:validate` | ready |
- | Repo-backed safety, rollback, and grounding proof | Repo-backed mutations must remain explainable from artifacts alone | repo | `pnpm repo:smoke`, `docs/pilot/PILOT-GATE-REVIEW.md` | ready |
- | Pilot closeout evidence | Phase 14 claims stay bounded to 2 accepted `disposable_internal`, 2 accepted `low_risk_real`, and gate `GO` | repo | `docs/pilot/PILOT-RUN-TRACKER.md`, `docs/review-packs/2026-04-05-phase14-pilot-04/review.md` | ready |
- | Website copy and positioning | Website copy must not outrun the shipped public surface or the pilot evidence | external reference | `docs/release/external-evidence/WEBSITE-SURFACE-REFERENCE.md` | ready_for_signoff |
- | Pricing and feature gating | Pricing and packaging statements must align with the single-facade `martin-loop` release and the current feature gate truth | external reference | `docs/release/external-evidence/PRICING-SURFACE-REFERENCE.md` | ready_for_signoff |
- | Privacy commitments | Privacy claims must stay inside the shipped operator and data-handling behavior | external reference | `docs/release/external-evidence/PRIVACY-POLICY-REFERENCE.md` | ready_for_signoff |
- | Terms and access control | Terms must reflect the current access model, manual release sequence, and support boundaries | external reference | `docs/release/external-evidence/TERMS-OF-SERVICE-REFERENCE.md` | ready_for_signoff |
- | Product claim matrix | Marketing or launch claims stay frozen to the reviewed matrix until a human owner widens them intentionally | external reference | `docs/release/external-evidence/PRODUCT-CLAIM-MATRIX-REFERENCE.md` | ready_for_signoff |
- | Semantic hallucination detection | Not claimed. The live repo still exposes a structural short-response heuristic, but release language must not present that as semantic truth understanding | repo | `PRODUCTION-READINESS-AUDIT-REPORT.md`, `docs/handoffs/2026-04-08-h2-h3-complete-handoff.md` | not_claimed |
- | Universal autonomous self-recovery | Not claimed. Release language is limited to deterministic recovery across the declared supported adapter/model matrix and explicit single-path disclosures | repo | `PRODUCTION-READINESS-AUDIT-REPORT.md`, `packages/cli/tests/cli-recovery-topology.test.ts` | not_claimed |