martin-loop 0.1.4 → 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. package/LICENSE +21 -21
  2. package/README.md +398 -362
  3. package/demo/seeded-workspace/README.md +35 -0
  4. package/demo/seeded-workspace/TASKS.md +29 -0
  5. package/demo/seeded-workspace/martin.config.yaml +11 -0
  6. package/demo/seeded-workspace/package.json +8 -0
  7. package/demo/seeded-workspace/src/invoice-summary.js +11 -0
  8. package/demo/seeded-workspace/test/invoice-summary.test.js +20 -0
  9. package/dist/vendor/adapters/claude-cli.d.ts +19 -4
  10. package/dist/vendor/adapters/claude-cli.js +55 -24
  11. package/dist/vendor/adapters/cli-bridge.d.ts +1 -0
  12. package/dist/vendor/adapters/cli-bridge.js +154 -28
  13. package/dist/vendor/adapters/index.d.ts +1 -0
  14. package/dist/vendor/adapters/index.js +1 -0
  15. package/dist/vendor/adapters/verifier-only.d.ts +7 -0
  16. package/dist/vendor/adapters/verifier-only.js +57 -0
  17. package/dist/vendor/cli/index.d.ts +6 -1
  18. package/dist/vendor/cli/index.js +124 -7
  19. package/dist/vendor/contracts/index.d.ts +3 -1
  20. package/dist/vendor/core/compiler.d.ts +2 -0
  21. package/dist/vendor/core/compiler.js +10 -4
  22. package/dist/vendor/core/context-integrity.d.ts +26 -0
  23. package/dist/vendor/core/context-integrity.js +56 -0
  24. package/dist/vendor/core/index.d.ts +5 -2
  25. package/dist/vendor/core/index.js +186 -54
  26. package/dist/vendor/core/policy.d.ts +6 -0
  27. package/docs/distribution/DIRECTORY-SUBMISSIONS.md +89 -0
  28. package/docs/distribution/INTEGRATION-OUTREACH.md +61 -0
  29. package/docs/distribution/UNDER-3-CHALLENGE.md +65 -0
  30. package/docs/oss/CLAUDE-CODE-WALKTHROUGH.md +142 -0
  31. package/docs/oss/EXAMPLES.md +134 -126
  32. package/docs/oss/OSS-BOUNDARY-REPORT.json +109 -113
  33. package/docs/oss/OSS-BOUNDARY-REPORT.md +48 -48
  34. package/docs/oss/QUICKSTART.md +165 -135
  35. package/docs/oss/RALPH-LOOP-SAFETY.md +113 -0
  36. package/docs/oss/README.md +96 -93
  37. package/docs/oss/RELEASE-SURFACE-REPORT.json +45 -45
  38. package/docs/oss/RELEASE-SURFACE-REPORT.md +35 -35
  39. package/package.json +19 -11
package/docs/oss/OSS-BOUNDARY-REPORT.md
@@ -1,48 +1,48 @@
- # Martin Loop Phase 13 OSS Core Boundary
-
- Generated: 2026-04-23T15:03:09.849Z
-
- ## Verdict
- **GO**
-
- ## Summary
- - Public package target: martin-loop
- - Canonical public package manager: npm
- - Intended OSS core packages: 5
- - Non-OSS workspace packages: 2
- - Local-only surfaces: 1
- - Private OSS-core packages still gated from publish: 3
- - OSS-core packages already publish-configured: 2
- - Dependency leaks: 0
- - No workspace dependency leaks detected between the intended OSS core and the non-OSS workspace surfaces.
-
- ## Public Package Surface
- - Install target: `npm install martin-loop`
- - CLI target: `npx martin-loop`
- - SDK target: `import { MartinLoop } from 'martin-loop'`
- - Root `npx martin-loop` support shipped: yes
- - Root SDK import shipped: yes
-
- ## Intended OSS Core Packages
- | Package | Path | Private | Publish Access | Workspace Deps |
- |---|---|---|---|---|
- | @martin/contracts | packages/contracts | yes | n/a | none |
- | @martin/core | packages/core | yes | n/a | @martin/contracts |
- | @martin/adapters | packages/adapters | yes | n/a | @martin/core |
- | @martin/cli | packages/cli | no | public | @martin/adapters, @martin/contracts, @martin/core |
- | @martin/mcp | packages/mcp | no | public | @martin/adapters, @martin/contracts, @martin/core |
-
- ## Non-OSS Workspace Packages
- | Package | Path | Reason |
- |---|---|---|
- | @martin/control-plane | apps/control-plane | Managed or RC-only workspace surface that stays out of the initial OSS boundary. |
- | @martin/benchmarks | benchmarks | Managed or RC-only workspace surface that stays out of the initial OSS boundary. |
-
- ## Local-Only Surfaces
- | Path | Reason |
- |---|---|
- | apps/local-dashboard | Local read-model viewer that is not yet packaged as a publishable OSS workspace. |
-
- ## Dependency Leak Review
- - No workspace dependency leaks detected.
-
+ # Martin Loop Phase 13 OSS Core Boundary
+
+ Generated: 2026-05-11T21:47:36.834Z
+
+ ## Verdict
+ **GO**
+
+ ## Summary
+ - Public package target: martin-loop
+ - Canonical public package manager: npm
+ - Intended OSS core packages: 5
+ - Non-OSS workspace packages: 2
+ - Local-only surfaces: 1
+ - Private OSS-core packages still gated from publish: 3
+ - OSS-core packages already publish-configured: 2
+ - Dependency leaks: 0
+ - No workspace dependency leaks detected between the intended OSS core and the non-OSS workspace surfaces.
+
+ ## Public Package Surface
+ - Install target: `npm install martin-loop`
+ - CLI target: `npx martin-loop`
+ - SDK target: `import { MartinLoop } from 'martin-loop'`
+ - Root `npx martin-loop` support shipped: yes
+ - Root SDK import shipped: yes
+
+ ## Intended OSS Core Packages
+ | Package | Path | Private | Publish Access | Workspace Deps |
+ |---|---|---|---|---|
+ | @martin/contracts | packages/contracts | yes | n/a | none |
+ | @martin/core | packages/core | yes | n/a | @martin/contracts |
+ | @martin/adapters | packages/adapters | yes | n/a | @martin/core |
+ | @martin/cli | packages/cli | no | public | @martin/adapters, @martin/contracts, @martin/core |
+ | @martinloop/mcp | packages/mcp | no | public | none |
+
+ ## Non-OSS Workspace Packages
+ | Package | Path | Reason |
+ |---|---|---|
+ | @martin/control-plane | apps/control-plane | Managed or RC-only workspace surface that stays out of the initial OSS boundary. |
+ | @martin/benchmarks | benchmarks | Managed or RC-only workspace surface that stays out of the initial OSS boundary. |
+
+ ## Local-Only Surfaces
+ | Path | Reason |
+ |---|---|
+ | apps/local-dashboard | Local read-model viewer that is not yet packaged as a publishable OSS workspace. |
+
+ ## Dependency Leak Review
+ - No workspace dependency leaks detected.
+
package/docs/oss/QUICKSTART.md
@@ -1,135 +1,165 @@
- # Quickstart
-
- This quickstart is intentionally conservative. It is written for a fresh engineer validating the current Phase 13 release-candidate state, not for a hypothetical future public release.
-
- ## Public launch target vs current RC path
-
- The frozen public launch target is:
-
- - `npm install martin-loop`
- - `npx martin-loop ...`
- - `import { MartinLoop } from "martin-loop"`
-
- That launch surface is now implemented in the root package facade and smoke-validated from a clean temporary install. This quickstart still documents the honest RC-from-source path because public registry publication is a later release step.
-
- ## Prerequisites
-
- - Node.js 20+ recommended
- - `pnpm` 10.x
- - A clean local checkout of this repo
-
- Optional for live runs:
-
- - Claude Code CLI for the Claude adapter path
- - OpenAI Codex CLI plus credentials for the Codex adapter path
-
- ## Install and build
-
- From the repo root:
-
- ```bash
- pnpm install
- pnpm build
- ```
-
- ## Run the RC validation matrix
-
- ```bash
- pnpm rc:validate
- ```
-
- What this does:
-
- - creates an isolated temporary home or profile directory
- - points Martin run artifacts at that clean location
- - runs the current build, lint, test, benchmark, and certification matrix
- - writes step logs into a temp `martin-rc-validation-*` directory
-
- Use this when you want to answer, "Can a fresh environment still reproduce the current RC baseline?"
-
- ## RC gate commands
-
- The current Phase 13 RC gate is made of these commands:
-
- - `pnpm oss:validate`
- - `pnpm public:smoke`
- - `pnpm repo:smoke`
- - `pnpm rc:validate`
- - `pnpm pilot:prep:validate`
- - `pnpm release:matrix:local`
-
- Recommended order for a fresh local reviewer:
-
- ```bash
- pnpm oss:validate
- pnpm public:smoke
- pnpm repo:smoke
- pnpm rc:validate
- pnpm release:matrix:local
- ```
-
- `pnpm release:matrix:local` runs the full local OS lane for the current machine. The repository also defines Windows, macOS, and Linux CI lanes in `.github/workflows/phase13-release-matrix.yml`.
-
- ## Stub-safe CLI run
-
- This is the safest first run because it avoids real provider spend.
-
- ### PowerShell
-
- ```powershell
- $env:MARTIN_LIVE='false'
- pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
- Remove-Item Env:MARTIN_LIVE
- ```
-
- ### Bash
-
- ```bash
- MARTIN_LIVE=false pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
- ```
-
- This path uses the stub adapter and still exercises the loop, persistence, and policy surfaces.
-
- ## Config-driven run
-
- The repo ships an example config at `martin.config.example.yaml`.
-
- Martin auto-looks for `martin.config.yaml` in the invocation root, or you can pass `--config <path>`.
-
- Example:
-
- ```bash
- pnpm run:cli -- run --config martin.config.example.yaml --objective "Run with repo defaults" --verify "pnpm --filter @martin/core test"
- ```
-
- ## Inspect a saved run
-
- Martin persists runs under `~/.martin/runs/` by default, or under `MARTIN_RUNS_DIR` if you override it.
-
- ```bash
- pnpm run:cli -- inspect --file path/to/loop-record.json
- ```
-
- For persisted run folders, inspect the `contract.json`, `state.json`, `ledger.jsonl`, and `artifacts/attempt-XXX/` files together. Those artifacts are the source of truth for runtime behavior.
-
- ## MCP server
-
- Build first, then start the server from the workspace:
-
- ```bash
- pnpm --filter @martin/mcp build
- node packages/mcp/dist/server.js
- ```
-
- The current MCP tools are:
-
- - `martin_run`
- - `martin_inspect`
- - `martin_status`
-
- ## Notes for reviewers
-
- - Fresh-home behavior matters. Do not rely only on a long-lived `~/.martin` directory.
- - Exact-versus-estimated cost labels are meaningful and should not be merged in docs or dashboards.
- - The repo contains control-plane code, but the public OSS boundary is still being finalized during Phase 13.
- - The benchmark harness remains a workspace-level RC surface; `martin bench` is not part of the publishable CLI boundary yet.
+ # Quickstart
+
+ This quickstart is intentionally conservative. It is written for a fresh engineer validating the current Phase 13 release-candidate state, not for a hypothetical future public release.
+
+ ## Public launch target vs current RC path
+
+ The frozen public launch target is:
+
+ - `npm install martin-loop`
+ - `npx martin-loop ...`
+ - `import { MartinLoop } from "martin-loop"`
+ - `npx @martinloop/mcp`
+
+ That runtime launch surface is implemented in the root package facade and smoke-validated from a clean temporary install. The MCP package shape is also smoke-validated from a packed tarball. This quickstart still documents the honest RC-from-source path because public registry publication is a separate release step.
+
+ ## Prerequisites
+
+ - Node.js 20+ recommended
+ - `pnpm` 10.x
+ - A clean local checkout of this repo
+
+ Optional for live runs:
+
+ - Claude Code CLI for the Claude adapter path
+ - OpenAI Codex CLI plus credentials for the Codex adapter path
+
+ ## Install and build
+
+ From the repo root:
+
+ ```bash
+ pnpm install
+ pnpm build
+ ```
+
+ ## Run the RC validation matrix
+
+ ```bash
+ pnpm rc:validate
+ ```
+
+ What this does:
+
+ - creates an isolated temporary home or profile directory
+ - points Martin run artifacts at that clean location
+ - runs the current build, lint, test, benchmark, and certification matrix
+ - writes step logs into a temp `martin-rc-validation-*` directory
+
+ Use this when you want to answer, "Can a fresh environment still reproduce the current RC baseline?"
+
+ ## RC gate commands
+
+ The current Phase 13 RC gate is made of these commands:
+
+ - `pnpm oss:validate`
+ - `pnpm public:smoke`
+ - `pnpm repo:smoke`
+ - `pnpm rc:validate`
+ - `pnpm pilot:prep:validate`
+ - `pnpm release:matrix:local`
+
+ Recommended order for a fresh local reviewer:
+
+ ```bash
+ pnpm oss:validate
+ pnpm public:smoke
+ pnpm repo:smoke
+ pnpm rc:validate
+ pnpm release:matrix:local
+ ```
+
+ `pnpm release:matrix:local` runs the full local OS lane for the current machine. The repository also defines Windows, macOS, and Linux CI lanes in `.github/workflows/phase13-release-matrix.yml`.
+
+ ## Stub-safe CLI run
+
+ This is the safest first run because it avoids real provider spend.
+
+ ### PowerShell
+
+ ```powershell
+ $env:MARTIN_LIVE='false'
+ pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
+ Remove-Item Env:MARTIN_LIVE
+ ```
+
+ ### Bash
+
+ ```bash
+ MARTIN_LIVE=false pnpm run:cli -- run --objective "Summarize the current runtime state" --verify "pnpm --filter @martin/core test"
+ ```
+
+ This path uses the stub adapter and still exercises the loop, persistence, and policy surfaces.
+
+ ## Config-driven run
+
+ The repo ships an example config at `martin.config.example.yaml`.
+
+ Martin auto-looks for `martin.config.yaml` in the invocation root, or you can pass `--config <path>`.
+
+ Example:
+
+ ```bash
+ pnpm run:cli -- run --config martin.config.example.yaml --objective "Run with repo defaults" --verify "pnpm --filter @martin/core test"
+ ```
+
+ ## Inspect a saved run
+
+ Martin persists runs under `~/.martin/runs/` by default, or under `MARTIN_RUNS_DIR` if you override it.
+
+ ```bash
+ pnpm run:cli -- inspect --file path/to/loop-record.json
+ ```
+
+ For persisted run folders, inspect the `contract.json`, `state.json`, `ledger.jsonl`, and `artifacts/attempt-XXX/` files together. Those artifacts are the source of truth for runtime behavior.
+
+ ## MCP server
+
+ The publish-ready MCP install target is:
+
+ ```bash
+ npx @martinloop/mcp
+ ```
+
+ Claude Code one-line install:
+
+ ```bash
+ # macOS/Linux
+ claude mcp add --scope user martin-loop -- npx @martinloop/mcp
+
+ # Windows PowerShell/cmd
+ claude mcp add --scope user martin-loop cmd /c "npx @martinloop/mcp"
+ ```
+
+ Official MCP Registry publication has an extra metadata step beyond npm packaging. Do not mark `@martinloop/mcp` registry-ready unless both of these exist and match:
+
+ - `packages/mcp/package.json` with `mcpName`
+ - `packages/mcp/server.json` with the official server metadata
+
+ After publishing `@martinloop/mcp` to npm, run the official registry publisher from `packages/mcp`:
+
+ ```bash
+ mcp-publisher login github
+ mcp-publisher publish
+ ```
+
+ For repo-local verification from source:
+
+ ```bash
+ pnpm --filter @martinloop/mcp build
+ pnpm --filter @martinloop/mcp smoke:pack
+ node packages/mcp/dist/server.js
+ ```
+
+ The current MCP tools are:
+
+ - `martin_run`
+ - `martin_inspect`
+ - `martin_status`
+
+ ## Notes for reviewers
+
+ - Fresh-home behavior matters. Do not rely only on a long-lived `~/.martin` directory.
+ - Exact-versus-estimated cost labels are meaningful and should not be merged in docs or dashboards.
+ - The repo contains control-plane code, but the public OSS boundary is still being finalized during Phase 13.
+ - The benchmark harness remains a workspace-level RC surface; `martin bench` is not part of the publishable CLI boundary yet.
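
Reviewer note on the "Inspect a saved run" section above: the quickstart says runs are persisted under `~/.martin/runs/` by default, or under `MARTIN_RUNS_DIR` when that variable is set. The following is a minimal sketch of that lookup order, assuming an empty or whitespace-only override counts as unset. The helper name `resolveRunsDir` is hypothetical and is not taken from the package; Martin's actual resolver may differ.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper (not part of martin-loop): mirrors the documented
// lookup order of MARTIN_RUNS_DIR first, then ~/.martin/runs.
// Assumption: an empty or whitespace-only override is treated as unset.
function resolveRunsDir(env = process.env, homeDir = os.homedir()) {
  const override = env.MARTIN_RUNS_DIR;
  if (typeof override === "string" && override.trim() !== "") {
    // Resolve relative overrides against the current working directory.
    return path.resolve(override);
  }
  return path.join(homeDir, ".martin", "runs");
}

// Print the directory a reviewer would inspect on this machine.
console.log(resolveRunsDir());
```

Passing `env` and `homeDir` as parameters keeps the sketch easy to exercise against a fresh temporary home, which matches the "fresh-home behavior matters" note in the quickstart.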
package/docs/oss/RALPH-LOOP-SAFETY.md
@@ -0,0 +1,113 @@
+ # Ralph-Style Loop Safety Guide
+
+ Ralph-style loops are useful because they keep trying until a coding task reaches a stopping condition. MartinLoop is not a replacement for that pattern. It is the governance layer that makes the pattern safer to run unattended.
+
+ For install and first-run steps, start with the repo quickstart: [README.md#quick-start](../../README.md#quick-start)
+
+ ## 1. What Ralph-style loops do well
+
+ Ralph-style loops are good at persistence:
+
+ - they retry after a failed attempt
+ - they keep working toward a concrete objective
+ - they help teams automate long-running coding tasks that would otherwise need constant supervision
+
+ That persistence is the reason teams use them. The problem is not the existence of the loop. The problem is what happens when the loop keeps running without a clear governance contract.
+
+ ## 2. Where unattended loops fail
+
+ An unattended coding loop can fail in ways that are expensive even when no single attempt looks dramatic on its own:
+
+ - spend keeps accumulating across retries
+ - verifier failures repeat without a meaningful strategy change
+ - file edits drift outside the intended task boundary
+ - the final outcome is hard to audit because the reasoning trail is incomplete
+ - operators know that the loop stopped, but not whether it stopped for success, safety, or exhaustion
+
+ Those are governance failures, not only model failures.
+
+ ## 3. Why max iterations alone are not enough
+
+ A max-iteration limit is helpful, but it only answers one question: "How many times may this loop try?"
+
+ It does not answer:
+
+ - how much budget can be spent before the next attempt is rejected
+ - whether the verifier command is safe to run
+ - whether the patch stayed inside the approved file scope
+ - whether a failed run left rollback evidence behind
+ - whether the recorded outcome is trustworthy enough to resume or inspect later
+
+ Iteration caps are one guardrail. They are not a full control layer.
+
+ ## 4. What MartinLoop adds
+
+ MartinLoop governs the loop before, during, and after execution:
+
+ - **Budget governance** rejects work that would exceed the configured spend, token, or iteration envelope
+ - **Verifier gates** only allow a run to finish as `completed` when the agent result and verification state both pass
+ - **Safety leash checks** evaluate verifier commands, file boundaries, and approval-sensitive actions before work is accepted
+ - **Stop reasons** make the final lifecycle state explicit, such as `completed`, `budget_exit`, or `human_escalation`
+ - **Run records** append JSONL evidence under `~/.martin/runs/` so operators can inspect what happened later
+ - **Rollback evidence** preserves the recovery boundary for repo-backed runs when persistence is configured
+
+ That is why MartinLoop should be thought of as a companion governance layer around a Ralph-style loop, not an argument against using one.
+
+ ## 5. Example governed run
+
+ ```bash
+ martin run "fix the auth regression" \
+   --budget 3.00 \
+   --soft-limit-usd 2.00 \
+   --max-iterations 2 \
+   --verify "pnpm test"
+ ```
+
+ This changes the operator contract in a few important ways:
+
+ - the next attempt can be rejected before overspend happens
+ - the run still has to satisfy the verifier
+ - the final state is inspectable instead of being inferred from logs alone
+
+ ## 6. Example stop reason
+
+ MartinLoop returns an explicit lifecycle state and reason when a run stops:
+
+ ```json
+ {
+   "decision": {
+     "shouldExit": true,
+     "lifecycleState": "budget_exit",
+     "status": "exited",
+     "reason": "Martin exited because the budget governor hit a hard limit."
+   }
+ }
+ ```
+
+ That answer is more useful than "the loop stopped" because it tells the operator whether the run ended for success, safety, or exhaustion.
+
+ ## 7. Example JSONL run record
+
+ Each run appends a JSONL record shaped like:
+
+ ```json
+ {
+   "loopId": "loop_example123",
+   "workspaceId": "ws_demo",
+   "projectId": "proj_demo",
+   "status": "exited",
+   "lifecycleState": "budget_exit",
+   "budget": {
+     "maxUsd": 3,
+     "softLimitUsd": 2,
+     "maxIterations": 2,
+     "maxTokens": 20000
+   },
+   "metadata": {
+     "policyProfile": "balanced",
+     "telemetryDestination": "local-only"
+   }
+ }
+ ```
+
+ The full record can also include attempts, events, verifier outcomes, and persisted artifact references. That is the evidence trail MartinLoop adds around a retrying coding loop.
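
Reviewer note on sections 6 and 7 above: because each run appends a JSONL record with an explicit `lifecycleState`, an operator could tally how runs ended without reading logs. The following is a minimal sketch of that idea, assuming each record occupies exactly one line of the JSONL file (the guide shows the record pretty-printed for readability). The helper names `parseJsonl` and `countByLifecycleState`, and the second sample `loopId`, are hypothetical; only the field names `loopId`, `status`, and `lifecycleState` come from the example record.

```javascript
// Hypothetical helper: split JSONL text into parsed records.
// Assumption: one complete JSON object per non-empty line.
function parseJsonl(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}

// Hypothetical helper: count records per lifecycleState.
// Records without the field are grouped under "unknown".
function countByLifecycleState(records) {
  const counts = {};
  for (const record of records) {
    const state = record.lifecycleState ?? "unknown";
    counts[state] = (counts[state] ?? 0) + 1;
  }
  return counts;
}

// Illustrative input only: two single-line records built in memory.
const sample = [
  JSON.stringify({ loopId: "loop_example123", status: "exited", lifecycleState: "budget_exit" }),
  JSON.stringify({ loopId: "loop_example456", lifecycleState: "completed" }),
].join("\n");

const counts = countByLifecycleState(parseJsonl(sample));
console.log(counts);
```

Grouping on `lifecycleState` rather than `status` follows the guide's point that the stop reason, not just the fact of stopping, is what distinguishes success, safety, and exhaustion.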