@precode/mcp 0.1.0 → 0.2.0

package/README.md CHANGED
@@ -1,13 +1,15 @@
  # @precode/mcp
 
- Open, agent-agnostic MCP server that turns a spec into a **self-correcting, verified build**. Works with **any** spec (`SPEC.md`, `.specify`, `.spec-workflow`) and **best** with a PreCode [`.precode/`](https://precode.dev) package. MIT licensed, free, runs on your agent and your tokens.
+ Open, agent-agnostic MCP server that turns a spec into a **self-correcting, verified build**. Works with **any** spec (`SPEC.md`, `.specify`, `.spec-workflow`) and **best** with a PreCode [`.precode/`](https://useprecode.vercel.app/mcp) package. Runs locally on your agent and your tokens.
+
+ Install: [`@precode/mcp` on npm](https://www.npmjs.com/package/@precode/mcp). Docs: [useprecode.vercel.app/mcp](https://useprecode.vercel.app/mcp).
 
  It fixes the loop/goal failure other spec MCPs ship with:
 
  - **Goal re-anchoring** — the agent re-reads the goal every phase, so long builds don't drift.
- - **Hard definition-of-done** — a task is marked done **only when its checks actually pass**, never when the agent merely claims so.
+ - **Hard definition-of-done** — when you declare checks in `.precode/checks.json`, **the MCP runs them itself** and the verdict comes from the real exit codes. A fabricated "it passed" cannot move a task to done.
  - **Closed verify → fix → re-verify** — gaps come back with fixes and the loop will not advance past them.
- - **Deterministic-first** — trusts the build/type-check/lint/test results your agent runs in its terminal.
+ - **Honest escapes** — a check that genuinely can't pass here (needs a secret, a deploy, a paid service) is `precode.defer_task`'d with a reason and surfaced in `TODO_FOR_YOU.md`, never silently skipped.
 
  At the end it writes an **`IMPLEMENTATION.md`** (what was built) and a **`TODO_FOR_YOU.md`** (what you still need to do — secrets, env, deploy).
 
@@ -31,7 +33,7 @@ Runs over stdio. Point any MCP host at it.
  claude mcp add precode -- npx -y @precode/mcp
  ```
 
- **Codex / Windsurf / VS Code** — add the same `command: npx`, `args: ["-y", "@precode/mcp"]` to their MCP config.
+ **Codex / Antigravity / VS Code** — add the same `command: npx`, `args: ["-y", "@precode/mcp"]` to their MCP config.
 
  The server reads `.precode/` from the directory the host runs it in (`PRECODE_ROOT` overrides). Local stdio only — the build loop needs your filesystem.
 
@@ -59,10 +61,34 @@ The smoke test launches the built MCP server over stdio, verifies the tool list,
  | `precode.get_goal` | Re-anchor on the goal + progress |
  | `precode.next_phase` | Alias for the next phased build step |
  | `precode.next_task` | Pull the next step + acceptance criteria |
- | `precode.verify` | Report check results; marks done **only if they pass** |
+ | `precode.verify` | **Runs** the task's `checks.json` checks; marks done only on real exit-0 |
+ | `precode.defer_task` | Honest escape for a check that can't pass here (logged to TODO) |
  | `precode.record_implementation` | Append to the implementation ledger |
  | `precode.finalize` | Write `TODO_FOR_YOU.md` |
- | `precode.adopt_spec` | Map a non-PreCode spec into `.precode/` |
+ | `precode.adopt_spec` | Map a non-PreCode spec into `.precode/` (+ starter `checks.json`) |
+
+ ## Checks — the executable done-gate
+
+ Declare checks in `.precode/checks.json`. Each `auto` check is **run by the MCP** from the project root on every `precode.verify`; the task closes only when its real exit code is `0`.
+
+ ```json
+ {
+ "checks": [
+ { "id": "typecheck", "name": "Type-check", "kind": "auto", "cmd": "npm run typecheck" },
+ { "id": "build", "name": "Build", "kind": "auto", "cmd": "npm run build" },
+ { "id": "test", "name": "Tests", "kind": "auto", "cmd": "npm test", "expect": "/\\d+ passing/" },
+ { "id": "deploy-task-3", "name": "Smoke route /health", "kind": "auto", "cmd": "curl -fsS localhost:3000/health", "taskIndex": 3 },
+ { "id": "secrets", "name": "Prod secrets set in real accounts", "kind": "manual" }
+ ]
+ }
+ ```
+
+ - `kind: "auto"` — executed; pass = exit 0 (and `expect` substring/`/regex/` matches output if set). Optional `timeoutMs` (default 120s).
+ - `kind: "manual"` — never auto-passed; surfaced as a human gate in `TODO_FOR_YOU.md`.
+ - `taskIndex` — scope a check to one task; omit for a global check that runs on every task.
+ - No auto check for a task? `verify` falls back to your self-reported results but stamps the pass **UNVERIFIED**, so it's never mistaken for a machine-verified one.
+
+ **Security:** the runner only executes commands you put in your own repo's `checks.json` — the same trust level as the agent already running your build. Commands run with a timeout and captured output. Because `checks.json` lives in your repo, gate tampering shows up in your diff.
 
  ## Resources
 
@@ -73,3 +99,7 @@ The smoke test launches the built MCP server over stdio, verifies the tool list,
  | `precode://tasks` | Current phased task plan |
  | `precode://acceptance` | Acceptance criteria and self-checks |
  | `precode://progress` | Implementation ledger and human TODOs |
+
+ ## Release
+
+ Tagged releases publish to npm via GitHub Actions. See [RELEASE.md](./RELEASE.md).
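
Reading aid for the README's new "Checks" section: a minimal sketch, not the package's own code, of how the documented `kind` and `taskIndex` fields could decide what applies to a given task. The helper name `selectChecks` is hypothetical; the sample entries are copied from the README example in this diff.

```javascript
// Hypothetical helper (not part of @precode/mcp): pick the checks that apply
// to one task, following the README rules for `kind` and `taskIndex`.
const checks = [
  { id: "typecheck", name: "Type-check", kind: "auto", cmd: "npm run typecheck" },
  { id: "deploy-task-3", name: "Smoke route /health", kind: "auto", cmd: "curl -fsS localhost:3000/health", taskIndex: 3 },
  { id: "secrets", name: "Prod secrets set in real accounts", kind: "manual" },
];

function selectChecks(all, taskIndex) {
  // A check without taskIndex is global; otherwise it must match the task.
  const scoped = all.filter((c) => c.taskIndex === undefined || c.taskIndex === taskIndex);
  return {
    auto: scoped.filter((c) => c.kind === "auto"),     // executed by the gate
    manual: scoped.filter((c) => c.kind === "manual"), // surfaced for a human
  };
}

const forTask3 = selectChecks(checks, 3);
const forTask1 = selectChecks(checks, 1);
console.log(forTask3.auto.map((c) => c.id)); // typecheck and deploy-task-3
console.log(forTask1.auto.map((c) => c.id)); // typecheck only
```

The task-scoped smoke check only joins the gate for task 3, while the manual entry is never counted as runnable for any task.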
package/dist/checks.js ADDED
@@ -0,0 +1,120 @@
+ import { spawn } from "node:child_process";
+ const DEFAULT_TIMEOUT_MS = 120_000;
+ const OUTPUT_TAIL = 1600;
+ function tail(s, n = OUTPUT_TAIL) {
+ const trimmed = s.replace(/\s+$/, "");
+ return trimmed.length > n ? "…" + trimmed.slice(-n) : trimmed;
+ }
+ function matchesExpect(output, expect) {
+ if (expect.length > 1 && expect.startsWith("/") && expect.endsWith("/")) {
+ try {
+ return new RegExp(expect.slice(1, -1)).test(output);
+ }
+ catch {
+ // Malformed regex => fall back to substring so a bad spec can't crash the gate.
+ return output.includes(expect);
+ }
+ }
+ return output.includes(expect);
+ }
+ /**
+ * Resolve a spec to one of three states:
+ * - "manual": an explicit human gate (surfaced, never auto-passed).
+ * - "auto": has a runnable command (executed; authoritative).
+ * - "skip": an unconfigured placeholder (no command, not manual) — ignored
+ * until the user fills in `cmd`, so an empty starter check never gates.
+ */
+ export function checkKind(spec) {
+ if (spec.kind === "manual")
+ return "manual";
+ if (spec.cmd && spec.cmd.trim())
+ return "auto";
+ return "skip";
+ }
+ /** Run a single auto check. Manual checks are returned as skipped (human gate). */
+ export function runCheck(spec, root) {
+ const kind = checkKind(spec);
+ if (kind !== "auto") {
+ return Promise.resolve({
+ id: spec.id,
+ name: spec.name,
+ kind: kind === "manual" ? "manual" : "auto",
+ passed: false,
+ skipped: true,
+ exitCode: null,
+ timedOut: false,
+ detail: kind === "manual"
+ ? "manual check — requires human verification, not auto-passed."
+ : "unconfigured placeholder — no command set, ignored.",
+ });
+ }
+ const timeoutMs = typeof spec.timeoutMs === "number" && spec.timeoutMs > 0
+ ? spec.timeoutMs
+ : DEFAULT_TIMEOUT_MS;
+ return new Promise((resolve) => {
+ let out = "";
+ let settled = false;
+ let timedOut = false;
+ const child = spawn(spec.cmd, {
+ cwd: root,
+ shell: true,
+ env: process.env,
+ });
+ const timer = setTimeout(() => {
+ timedOut = true;
+ child.kill("SIGKILL");
+ }, timeoutMs);
+ const onData = (d) => {
+ out += d.toString();
+ if (out.length > OUTPUT_TAIL * 4)
+ out = out.slice(-OUTPUT_TAIL * 4);
+ };
+ child.stdout?.on("data", onData);
+ child.stderr?.on("data", onData);
+ const finish = (exitCode) => {
+ if (settled)
+ return;
+ settled = true;
+ clearTimeout(timer);
+ let passed = !timedOut && exitCode === 0;
+ let detail;
+ if (timedOut) {
+ detail = `timed out after ${timeoutMs}ms. ${tail(out)}`;
+ }
+ else if (exitCode !== 0) {
+ detail = `exit ${exitCode}. ${tail(out)}`;
+ }
+ else if (spec.expect && !matchesExpect(out, spec.expect)) {
+ passed = false;
+ detail = `exit 0 but output did not match expect=${spec.expect}. ${tail(out)}`;
+ }
+ else {
+ detail = tail(out) || "passed (exit 0).";
+ }
+ resolve({
+ id: spec.id,
+ name: spec.name,
+ kind,
+ passed,
+ skipped: false,
+ exitCode,
+ timedOut,
+ detail,
+ });
+ };
+ child.on("error", (err) => {
+ // e.g. spawn failure; surface as a real failure rather than throwing.
+ finish(null);
+ void err;
+ });
+ child.on("close", (code) => finish(code));
+ });
+ }
+ /** Run checks sequentially for deterministic, reproducible ordering. */
+ export async function runChecks(specs, root) {
+ const results = [];
+ for (const spec of specs) {
+ results.push(await runCheck(spec, root));
+ }
+ return results;
+ }
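
Reading aid for the added `checks.js`: a minimal sketch of the pass rule that `runCheck` applies once the child process has ended, written without spawning anything so it can be tried in isolation. `matchesExpect` follows the diff above; the `verdict` helper and its argument shape are hypothetical and only restate the branch order (timeout, then exit code, then `expect`).

```javascript
// Same matching rule as in the diff: "/.../" is a regex, anything else a substring.
function matchesExpect(output, expect) {
  if (expect.length > 1 && expect.startsWith("/") && expect.endsWith("/")) {
    try {
      return new RegExp(expect.slice(1, -1)).test(output);
    } catch {
      // Malformed regex: fall back to a plain substring test.
      return output.includes(expect);
    }
  }
  return output.includes(expect);
}

// Hypothetical helper: combine timeout flag, exit code, and optional expect.
function verdict({ exitCode, timedOut, output, expect }) {
  if (timedOut) return false;       // a killed process never passes
  if (exitCode !== 0) return false; // non-zero or null (spawn error) fails
  if (expect && !matchesExpect(output, expect)) return false;
  return true;
}

console.log(verdict({ exitCode: 0, timedOut: false, output: "12 passing", expect: "/\\d+ passing/" }));
console.log(verdict({ exitCode: 0, timedOut: false, output: "0 tests ran", expect: "/\\d+ passing/" }));
```

Under this rule an exit code of 0 is necessary but not sufficient when `expect` is set, which is why the README's test check pairs `npm test` with a `passing` pattern.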
package/dist/index.js CHANGED
@@ -3,6 +3,7 @@ import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
  import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
  import { z } from "zod";
  import { PrecodeStore, adoptSpec } from "./store.js";
+ import { checkKind, runChecks } from "./checks.js";
  import { recordTelemetry, taskTelemetry } from "./telemetry.js";
  /**
  * @precode/mcp — drives a phased build → recheck → fix loop over a `.precode/`
@@ -90,83 +91,123 @@ server.tool("precode.next_phase", "Alias for precode.next_task. Pulls the next p
  ].join("\n"));
  });
  // --- Tool: verify (the hard done-gate + fix loop) -------------------------
- server.tool("precode.verify", "Report the results of the deterministic checks you RAN (build/type-check/lint/tests). The task is marked done ONLY if every reported check passed and at least one ran. Otherwise you get the gaps back and the task stays open — fix and call precode.verify again.", {
+ server.tool("precode.verify", "Close the done-gate for the current task. If the project declares checks in .precode/checks.json, the MCP RUNS them itself and the verdict comes from real exit codes — your reported results cannot fake a pass. A task is marked done only when every auto check it runs passes. Otherwise the gaps come back and the task stays OPEN — fix and call precode.verify again.", {
  checkResults: z
  .array(z.object({
  name: z.string().describe("e.g. 'build', 'typecheck', 'lint', 'smoke test'"),
  passed: z.boolean(),
  detail: z.string().optional(),
  }))
- .describe("Results of checks you actually executed in the terminal."),
+ .optional()
+ .describe("Optional supplementary notes about checks you ran. Authoritative only as a FALLBACK when checks.json defines no auto check for this task."),
  notes: z.string().optional(),
  task: z.string().optional().describe("Optional host-side task identifier or title."),
  }, async ({ checkResults, notes, task: taskLabel }) => {
  const s = await store();
  if (!s)
  return NO_STORE;
- const task = await s.nextOpenTask();
- if (!task)
- return text("No open task to verify. Run precode.finalize.");
- if (checkResults.length === 0) {
- return text(`HOLD. Task #${task.index} cannot be marked done — you reported no checks. ` +
- "Run the acceptance checks (build, type-check, lint, smoke test) in your terminal, then report results.");
+ if (!(await s.acquireLock())) {
+ return text("Another precode.verify is in progress for this package. Wait for it to finish, then retry.");
  }
- const failed = checkResults.filter((c) => !c.passed);
- if (failed.length > 0) {
- const failureLines = failed.map((c) => `${c.name}: FAILED${c.detail ? ` — ${c.detail}` : ""}`);
- const retry = await s.recordVerificationFailure({
- task,
- failures: failureLines,
- });
+ try {
+ const task = await s.nextOpenTask();
+ if (!task)
+ return text("No open task to verify. Run precode.finalize.");
+ const specs = await s.checksForTask(task.index);
+ const autoSpecs = specs.filter((c) => checkKind(c) === "auto");
+ const manualSpecs = specs.filter((c) => checkKind(c) === "manual");
+ // Route manual checks to the human TODO when the task closes — never auto-passed.
+ const routeManual = async () => {
+ if (!manualSpecs.length)
+ return;
+ await s.appendTodo([
+ `## Task #${task.index} — manual verification required`,
+ "",
+ ...manualSpecs.map((c) => `- [ ] ${c.name}`),
+ ].join("\n"));
+ };
+ const closeTask = async (verifiedLine) => {
+ await s.markTaskDone(task.index);
+ await routeManual();
+ await s.appendImplementation([
+ `## Task #${task.index}: ${task.text}`,
+ taskLabel ? `Host task: ${taskLabel}` : "",
+ verifiedLine,
+ manualSpecs.length
+ ? `Manual gates routed to TODO_FOR_YOU.md: ${manualSpecs.map((c) => c.name).join(", ")}`
+ : "",
+ notes ? `Notes: ${notes}` : "",
+ ]
+ .filter(Boolean)
+ .join("\n"));
+ const next = await s.nextOpenTask();
+ const manualNote = manualSpecs.length
+ ? ` ${manualSpecs.length} manual check(s) routed to TODO_FOR_YOU.md for you.`
+ : "";
+ return text((next
+ ? `PASS. Task #${task.index} marked done.${manualNote} Call precode.next_task for the next step (#${next.index}).`
+ : `PASS. Task #${task.index} marked done.${manualNote} All tasks complete — call precode.finalize.`));
+ };
+ const gateFailure = async (failureLines, fixLines) => {
+ const retry = await s.recordVerificationFailure({ task, failures: failureLines });
+ await recordTelemetry({
+ eventName: "mcp_verify_fail",
+ root: ROOT,
+ task: taskTelemetry(task),
+ metadata: { failedCount: failureLines.length, attempts: retry.attempts, escalated: retry.escalated },
+ });
+ return text([
+ `GATE: Task #${task.index} stays OPEN. ${failureLines.length} check(s) failed:`,
+ ...failureLines.map((l) => `- ${l}`),
+ "",
+ "Concrete next fixes:",
+ ...fixLines.map((l) => `- ${l}`),
+ "",
+ `Verify attempt ${retry.attempts}/3 for this task.`,
+ retry.escalated
+ ? "Escalated honestly into .precode/progress/TODO_FOR_YOU.md because the retry bound was reached. If a check genuinely cannot pass here (needs a secret, a deploy, an external service), call precode.defer_task with a reason instead of looping."
+ : "Fix these, re-run, and call precode.verify again. Do not call precode.next_task until this passes.",
+ ].join("\n"));
+ };
+ // --- Authoritative path: the MCP executes the declared auto checks. ---
+ if (autoSpecs.length > 0) {
+ const runs = await runChecks(autoSpecs, ROOT);
+ const failed = runs.filter((r) => !r.passed);
+ if (failed.length > 0) {
+ return await gateFailure(failed.map((r) => `${r.name}: FAILED — ${r.detail}`), failed.map((r) => `${r.name}: ${fixHint(r.name)}`));
+ }
+ await recordTelemetry({
+ eventName: "mcp_verify_pass",
+ root: ROOT,
+ task: taskTelemetry(task),
+ metadata: { checkCount: runs.length, executed: true },
+ });
+ return await closeTask(`Verified by MCP execution: ${runs.map((r) => `${r.name} ✓ (exit ${r.exitCode})`).join(", ")}`);
+ }
+ // --- Fallback: no auto check defined for this task. Self-report is
+ // accepted but explicitly stamped UNVERIFIED so it is never mistaken
+ // for a machine-verified pass. ---
+ const reported = checkResults ?? [];
+ if (reported.length === 0) {
+ return text(`HOLD. Task #${task.index} has no executable check in .precode/checks.json and you reported none. ` +
+ "Add a real command to checks.json (preferred — then the gate runs it), or report the checks you ran. " +
+ "Nothing is marked done on an empty report.");
+ }
+ const failedReported = reported.filter((c) => !c.passed);
+ if (failedReported.length > 0) {
+ return await gateFailure(failedReported.map((c) => `${c.name}: FAILED${c.detail ? ` — ${c.detail}` : ""}`), failedReported.map((c) => `${c.name}: ${fixHint(c.name)}`));
+ }
  await recordTelemetry({
- eventName: "mcp_verify_fail",
+ eventName: "mcp_verify_pass",
  root: ROOT,
  task: taskTelemetry(task),
- metadata: {
- checkCount: checkResults.length,
- failedCount: failed.length,
- attempts: retry.attempts,
- escalated: retry.escalated,
- },
+ metadata: { checkCount: reported.length, executed: false },
  });
- const gaps = failed
- .map((c) => `- ${c.name}: FAILED${c.detail ? ` — ${c.detail}` : ""}`)
- .join("\n");
- const fixes = failed
- .map((c) => `- ${c.name}: ${fixHint(c.name)}`)
- .join("\n");
- return text([
- `GATE: Task #${task.index} stays OPEN. ${failed.length} check(s) failed:`,
- gaps,
- "",
- "Concrete next fixes:",
- fixes,
- "",
- `Verify attempt ${retry.attempts}/3 for this task.`,
- retry.escalated
- ? "Escalated honestly into .precode/progress/TODO_FOR_YOU.md because the retry bound was reached."
- : "Fix these, re-run the checks, and call precode.verify again. Do not call precode.next_task until this passes.",
- ].join("\n"));
+ return await closeTask(`UNVERIFIED (self-reported, no checks.json command): ${reported.map((c) => `${c.name} ✓`).join(", ")}`);
+ }
+ finally {
+ await s.releaseLock();
  }
- await s.markTaskDone(task.index);
- await recordTelemetry({
- eventName: "mcp_verify_pass",
- root: ROOT,
- task: taskTelemetry(task),
- metadata: { checkCount: checkResults.length },
- });
- await s.appendImplementation([
- `## Task #${task.index}: ${task.text}`,
- taskLabel ? `Host task: ${taskLabel}` : "",
- `Verified: ${checkResults.map((c) => `${c.name} ✓`).join(", ")}`,
- notes ? `Notes: ${notes}` : "",
- ]
- .filter(Boolean)
- .join("\n"));
- const next = await s.nextOpenTask();
- return text(next
- ? `PASS. Task #${task.index} marked done. Call precode.next_task for the next step (#${next.index}).`
- : `PASS. Task #${task.index} marked done. All tasks complete — call precode.finalize.`);
  });
  // --- Tool: record implementation (ledger) ---------------------------------
  server.tool("precode.record_implementation", "Log what you built for a task and any undocumented decisions, into .precode/progress/IMPLEMENTATION.md.", {
@@ -210,17 +251,29 @@ server.tool("precode.finalize", "Write the handoff: confirm all tasks done and r
  if (!s)
  return NO_STORE;
  const tasks = await s.tasks();
- const open = tasks.filter((t) => !t.done);
+ const deferred = await s.deferredEntries();
+ const deferredSet = new Set(deferred.map((e) => e.index));
+ const open = tasks.filter((t) => !t.done && !deferredSet.has(t.index));
+ // Manual gates the human still owns, regenerated from checks.json for done tasks.
+ const manualGates = [];
+ for (const t of tasks.filter((t) => t.done)) {
+ const manual = (await s.checksForTask(t.index)).filter((c) => checkKind(c) === "manual");
+ for (const c of manual)
+ manualGates.push(`Task #${t.index}: ${c.name}`);
+ }
  await recordTelemetry({
  eventName: "mcp_finalize",
  root: ROOT,
  metadata: {
  taskCount: tasks.length,
  openTaskCount: open.length,
+ deferredCount: deferred.length,
+ manualGateCount: manualGates.length,
  userTodoCount: userTodos.length,
  unresolvedCount: unresolved?.length ?? 0,
  },
  });
+ const taskText = (i) => tasks.find((t) => t.index === i)?.text ?? "(unknown)";
  const body = [
  "# What you still need to do",
  "",
@@ -229,6 +282,14 @@ server.tool("precode.finalize", "Write the handoff: confirm all tasks done and r
  "## Your steps",
  ...(userTodos.length ? userTodos.map((t) => `- [ ] ${t}`) : ["- [ ] (none reported)"]),
  "",
+ "## Manual verification still required",
+ ...(manualGates.length ? manualGates.map((m) => `- [ ] ${m}`) : ["- (none)"]),
+ "",
+ "## Deferred tasks (could not be auto-verified here)",
+ ...(deferred.length
+ ? deferred.map((e) => `- [ ] Task #${e.index} (${taskText(e.index)}) — ${e.reason}`)
+ : ["- (none)"]),
+ "",
  "## Not yet satisfied by the build",
  ...(unresolved?.length
  ? unresolved.map((u) => `- ${u}`)
@@ -238,9 +299,42 @@ server.tool("precode.finalize", "Write the handoff: confirm all tasks done and r
  "",
  ].join("\n");
  await s.writeTodo(body);
- return text(`Finalized. ${tasks.length - open.length}/${tasks.length} tasks done. ` +
+ const doneCount = tasks.filter((t) => t.done).length;
+ const flags = [
+ open.length ? `${open.length} task(s) still OPEN` : "",
+ deferred.length ? `${deferred.length} deferred` : "",
+ manualGates.length ? `${manualGates.length} manual gate(s) for you` : "",
+ ].filter(Boolean);
+ return text(`Finalized. ${doneCount}/${tasks.length} tasks done. ` +
  `Wrote .precode/progress/TODO_FOR_YOU.md and IMPLEMENTATION.md. ` +
- (open.length ? `WARNING: ${open.length} task(s) still open — surfaced honestly in the TODO.` : "All tasks verified."));
+ (flags.length
+ ? `Surfaced honestly in the TODO: ${flags.join(", ")}.`
+ : "All tasks verified, nothing outstanding."));
+ });
+ // --- Tool: defer a task that genuinely cannot be auto-verified here --------
+ server.tool("precode.defer_task", "Honest escape hatch for the CURRENT open task when an auto check cannot pass in this environment (needs a secret, a real deploy, a paid/external service). Records the reason, skips the task in the loop, and surfaces it in TODO_FOR_YOU.md. Use this instead of looping forever — never to dodge a real, fixable failure.", {
+ reason: z
+ .string()
+ .min(8)
+ .describe("Why it cannot be auto-verified here. Be specific and honest."),
+ }, async ({ reason }) => {
+ const s = await store();
+ if (!s)
+ return NO_STORE;
+ const task = await s.nextOpenTask();
+ if (!task)
+ return text("No open task to defer. Run precode.finalize.");
+ await s.deferTask(task.index, reason);
+ await recordTelemetry({
+ eventName: "mcp_defer_task",
+ root: ROOT,
+ task: taskTelemetry(task),
+ });
+ const next = await s.nextOpenTask();
+ return text(`Deferred task #${task.index} and logged it to TODO_FOR_YOU.md: ${reason}\n` +
+ (next
+ ? `Next open task is #${next.index}. Call precode.next_task.`
+ : "No more open tasks — call precode.finalize."));
  });
  // --- Tool: adopt any spec -------------------------------------------------
  server.tool("precode.adopt_spec", "If there is no .precode/ here, map a plain SPEC.md / .specify / .spec-workflow / docs folder into a minimal .precode/ package so this loop works with ANY spec.", { specPath: z.string().optional().describe("Optional path hint to the spec file or folder.") }, async ({ specPath }) => {
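
Reading aid for the reworked `precode.verify` handler: a minimal sketch of its branch order as a pure function, assuming the executed results and the agent's self-report are already in hand. `decideGate` and the shape of its return value are hypothetical; the real handler also takes a lock, records telemetry, and writes to the ledger.

```javascript
// Hypothetical pure function mirroring the branch order in precode.verify.
function decideGate(autoResults, reported) {
  if (autoResults.length > 0) {
    // Authoritative path: only results from executed checks count.
    const failed = autoResults.filter((r) => !r.passed);
    return failed.length > 0
      ? { outcome: "GATE", failed: failed.map((r) => r.name) }
      : { outcome: "PASS", stamp: "verified" };
  }
  // Fallback path: no auto check is declared for this task.
  const claims = reported ?? [];
  if (claims.length === 0) return { outcome: "HOLD" };
  const failedClaims = claims.filter((c) => !c.passed);
  return failedClaims.length > 0
    ? { outcome: "GATE", failed: failedClaims.map((c) => c.name) }
    : { outcome: "PASS", stamp: "UNVERIFIED" };
}

// An executed failure wins even if the agent reports the same check as passed.
console.log(decideGate([{ name: "build", passed: false }], [{ name: "build", passed: true }]));
// With no auto checks, a self-reported pass is accepted but stamped UNVERIFIED.
console.log(decideGate([], [{ name: "build", passed: true }]));
```

The key design point visible in the diff is that `checkResults` is consulted only when no auto check exists for the task, so a self-report cannot override an executed result.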
package/dist/store.js CHANGED
@@ -65,7 +65,109 @@ export class PrecodeStore {
  return tasks;
  }
  async nextOpenTask() {
- return (await this.tasks()).find((t) => !t.done) ?? null;
+ const deferred = new Set(await this.deferredIndices());
+ return ((await this.tasks()).find((t) => !t.done && !deferred.has(t.index)) ?? null);
+ }
+ /** Declared, executable checks (`.precode/checks.json`). [] if absent/malformed. */
+ async checks() {
+ const raw = await this.read("checks.json");
+ if (!raw)
+ return [];
+ let parsed;
+ try {
+ parsed = JSON.parse(raw);
+ }
+ catch {
+ return [];
+ }
+ const arr = Array.isArray(parsed)
+ ? parsed
+ : Array.isArray(parsed.checks)
+ ? parsed.checks
+ : [];
+ const specs = [];
+ arr.forEach((item, i) => {
+ if (!item || typeof item !== "object")
+ return;
+ const o = item;
+ const name = typeof o.name === "string" ? o.name : `check ${i + 1}`;
+ specs.push({
+ id: typeof o.id === "string" ? o.id : `c${i + 1}`,
+ name,
+ cmd: typeof o.cmd === "string" ? o.cmd : undefined,
+ kind: o.kind === "manual" ? "manual" : o.kind === "auto" ? "auto" : undefined,
+ expect: typeof o.expect === "string" ? o.expect : undefined,
+ timeoutMs: typeof o.timeoutMs === "number" ? o.timeoutMs : undefined,
+ taskIndex: typeof o.taskIndex === "number" ? o.taskIndex : undefined,
+ });
+ });
+ return specs;
+ }
+ /** Checks that apply to a task: global (no taskIndex) + that task's own. */
+ async checksForTask(taskIndex) {
+ return (await this.checks()).filter((c) => c.taskIndex === undefined || c.taskIndex === taskIndex);
+ }
+ // --- Deferred tasks: an honest escape from a check that cannot pass --------
+ async deferredEntries() {
+ const raw = await this.read("progress/deferred.json");
+ if (!raw)
+ return [];
+ try {
+ const v = JSON.parse(raw);
+ if (!Array.isArray(v))
+ return [];
+ return v
+ .filter((e) => e && typeof e === "object" && typeof e.index === "number")
+ .map((e) => ({ index: e.index, reason: String(e.reason ?? "") }));
+ }
+ catch {
+ return [];
+ }
+ }
+ async deferredIndices() {
+ return (await this.deferredEntries()).map((e) => e.index);
+ }
+ async deferTask(taskIndex, reason) {
+ await fs.mkdir(this.file("progress"), { recursive: true });
+ const cur = await this.deferredEntries();
+ if (!cur.some((e) => e.index === taskIndex)) {
+ cur.push({ index: taskIndex, reason });
+ }
+ await fs.writeFile(this.file("progress/deferred.json"), JSON.stringify(cur, null, 2), "utf8");
+ const task = (await this.tasks()).find((t) => t.index === taskIndex);
+ await this.appendTodo([
+ `## Task #${taskIndex} deferred — needs you`,
+ "",
+ `Task: ${task?.text ?? "(unknown)"}`,
+ `Reason it could not be auto-verified: ${reason}`,
+ ].join("\n"));
+ }
+ // --- Verify lock: stop two concurrent verifies racing on tasks.md ----------
+ async acquireLock(staleMs = 5 * 60 * 1000) {
+ const lock = this.file(".lock");
+ try {
+ const handle = await fs.open(lock, "wx");
+ await handle.writeFile(String(Date.now()));
+ await handle.close();
+ return true;
+ }
+ catch {
+ // Lock exists — steal it only if it is stale (crashed prior run).
+ try {
+ const stat = await fs.stat(lock);
+ if (Date.now() - stat.mtimeMs > staleMs) {
+ await fs.writeFile(lock, String(Date.now()), "utf8");
+ return true;
+ }
+ }
+ catch {
+ /* race: gone between checks — treat as not acquired */
+ }
+ return false;
+ }
+ }
+ async releaseLock() {
+ await fs.rm(this.file(".lock"), { force: true });
  }
  /** Hard done-gate: only flips a task to [x]. Caller must verify checks first. */
  async markTaskDone(taskIndex) {
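
Reading aid for the `nextOpenTask` change in this hunk: a minimal in-memory sketch of how deferred entries take a task out of the loop without marking it done. The standalone function and the sample data are hypothetical stand-ins; the real method reads `tasks.md` and `progress/deferred.json` through the store.

```javascript
// Hypothetical stand-in for PrecodeStore.nextOpenTask, using in-memory data.
function nextOpenTask(tasks, deferredEntries) {
  const deferred = new Set(deferredEntries.map((e) => e.index));
  // First task that is neither done nor deferred, or null when none remain.
  return tasks.find((t) => !t.done && !deferred.has(t.index)) ?? null;
}

const tasks = [
  { index: 1, text: "Scaffold app", done: true },
  { index: 2, text: "Deploy to staging", done: false },
  { index: 3, text: "Add health route", done: false },
];
const deferredLog = [{ index: 2, reason: "needs a staging deploy token" }];

const next = nextOpenTask(tasks, deferredLog);
console.log(next.index); // 3: task 2 is skipped but still not done
```

Because a deferred task keeps `done: false`, `precode.finalize` can still list it separately instead of counting it as verified.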
@@ -140,6 +242,12 @@ export class PrecodeStore {
  await fs.mkdir(this.file("progress"), { recursive: true });
  await fs.writeFile(this.file("progress/TODO_FOR_YOU.md"), content, "utf8");
  }
+ async appendTodo(entry) {
+ await fs.mkdir(this.file("progress"), { recursive: true });
+ const cur = (await this.read("progress/TODO_FOR_YOU.md")) ??
+ "# TODO for you\n\nNo manual follow-up has been recorded yet.\n";
+ await fs.writeFile(this.file("progress/TODO_FOR_YOU.md"), `${cur.trimEnd()}\n\n${entry}\n`, "utf8");
+ }
  async listDocFiles() {
  try {
  const files = await fs.readdir(this.file("docs"));
@@ -216,18 +324,21 @@ export async function adoptSpec(searchRoot, specPathHint) {
  error: "No spec found. Provide a SPEC.md, a .specify/.spec-workflow folder, or a docs/ folder of markdown.",
  };
  }
- // Derive tasks from markdown headings / existing checkboxes.
- const tasks = [];
+ // Derive tasks: prefer explicit checkboxes; fall back to headings only when
+ // the spec has none, so heading text never pollutes a real task list.
+ const checkboxes = [];
+ const headings = [];
  for (const line of specBody.split("\n")) {
  const cb = TASK_RE.exec(line);
  if (cb) {
- tasks.push(`- [ ] ${cb[3]}`);
+ checkboxes.push(`- [ ] ${cb[3]}`);
  continue;
  }
  const h = /^#{2,3}\s+(.*\S)\s*$/.exec(line);
- if (h && tasks.length < 40)
- tasks.push(`- [ ] Implement: ${h[1]}`);
+ if (h && headings.length < 40)
+ headings.push(`- [ ] Implement: ${h[1]}`);
  }
+ const tasks = checkboxes.length ? checkboxes : headings;
  if (!tasks.length)
  tasks.push("- [ ] Implement the full spec, then verify.");
  await fs.mkdir(path.join(dir, "progress"), { recursive: true });
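
Reading aid for the task-derivation change in this hunk: a minimal standalone sketch of "checkboxes first, headings only as a fallback". The `TASK_RE` pattern below is an assumption, since the diff uses `cb[3]` as the task text but never shows the pattern itself; the heading pattern and the 40-heading cap are copied from the diff. `deriveTasks` is a hypothetical wrapper name.

```javascript
// ASSUMPTION: the real TASK_RE is not shown in the diff. This stand-in only
// keeps the one property the diff relies on: capture group 3 is the task text.
const TASK_RE = /^(\s*)[-*]\s+\[([ xX])\]\s+(.*\S)\s*$/;
// Copied from the diff: level-2 and level-3 markdown headings.
const HEADING_RE = /^#{2,3}\s+(.*\S)\s*$/;

function deriveTasks(specBody) {
  const checkboxes = [];
  const headings = [];
  for (const line of specBody.split("\n")) {
    const cb = TASK_RE.exec(line);
    if (cb) {
      checkboxes.push(`- [ ] ${cb[3]}`);
      continue;
    }
    const h = HEADING_RE.exec(line);
    if (h && headings.length < 40) headings.push(`- [ ] Implement: ${h[1]}`);
  }
  // Headings are used only when the spec has no checkboxes at all.
  const tasks = checkboxes.length ? checkboxes : headings;
  if (!tasks.length) tasks.push("- [ ] Implement the full spec, then verify.");
  return tasks;
}

console.log(deriveTasks("## Auth\n- [ ] Add login\n- [x] Add logout"));
console.log(deriveTasks("## Auth\n### Billing"));
```

In the first call the `## Auth` heading is dropped because explicit checkboxes exist; in the second, both headings become `Implement:` tasks.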
@@ -254,5 +365,24 @@ export async function adoptSpec(searchRoot, specPathHint) {
  "",
  ].join("\n"), "utf8");
  await fs.writeFile(path.join(dir, "manifest.json"), JSON.stringify({ project: { name: "Adopted spec", appType: "app" }, adoptedFrom: found }, null, 2), "utf8");
+ // Starter checks.json. Commands are commented placeholders the user edits to
+ // their real build — until then the gate runs nothing auto and stamps passes
+ // UNVERIFIED, so it never silently claims a verified build.
+ await fs.writeFile(path.join(dir, "checks.json"), JSON.stringify({
+ $comment: "Each 'auto' check is RUN by the MCP from the project root; the task is " +
+ "done only when its real exit code is 0. Replace cmd values with your " +
+ "build. 'manual' checks are surfaced for human verification, never auto-passed.",
+ checks: [
+ { id: "typecheck", name: "Type-check", kind: "auto", cmd: "" },
+ { id: "lint", name: "Lint", kind: "auto", cmd: "" },
+ { id: "build", name: "Production build", kind: "auto", cmd: "" },
+ { id: "test", name: "Tests", kind: "auto", cmd: "" },
+ {
+ id: "secrets",
+ name: "Secrets & deploy env verified in real accounts",
+ kind: "manual",
+ },
+ ],
+ }, null, 2), "utf8");
  return { ok: true, dir };
  }
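
Reading aid for the starter `checks.json` written by `adoptSpec`: a minimal sketch showing why its placeholder entries do not gate anything until a `cmd` is filled in. `checkKind` is restated from the `checks.js` diff and the entries are copied from the starter file; the variable names are illustrative.

```javascript
// checkKind as shown in the checks.js diff: "manual", "auto", or "skip".
function checkKind(spec) {
  if (spec.kind === "manual") return "manual";
  if (spec.cmd && spec.cmd.trim()) return "auto";
  return "skip"; // declared kind "auto" but no command yet
}

// Entries copied from the starter checks.json written by adoptSpec.
const starter = [
  { id: "typecheck", name: "Type-check", kind: "auto", cmd: "" },
  { id: "lint", name: "Lint", kind: "auto", cmd: "" },
  { id: "build", name: "Production build", kind: "auto", cmd: "" },
  { id: "test", name: "Tests", kind: "auto", cmd: "" },
  { id: "secrets", name: "Secrets & deploy env verified in real accounts", kind: "manual" },
];

const kinds = starter.map((c) => checkKind(c));
const runnable = starter.filter((c) => checkKind(c) === "auto");
console.log(kinds);           // four "skip" entries, then "manual"
console.log(runnable.length); // 0 until a cmd is filled in
```

With zero runnable checks, `precode.verify` takes its fallback path, so a freshly adopted spec produces UNVERIFIED passes until real commands are added.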
package/package.json CHANGED
@@ -1,8 +1,9 @@
  {
  "name": "@precode/mcp",
- "version": "0.1.0",
+ "version": "0.2.0",
  "description": "Open, agent-agnostic MCP server that turns a spec (any SPEC.md, best with a PreCode .precode/ package) into a self-correcting, verified build. Drives a phased build → recheck → fix loop with hard definition-of-done gates and an implemented-vs-todo ledger.",
  "license": "MIT",
+ "homepage": "https://useprecode.vercel.app/mcp",
  "type": "module",
  "bin": {
  "precode-mcp": "dist/index.js"
@@ -16,7 +17,9 @@
  "dev": "tsc --watch",
  "start": "node dist/index.js",
  "test:stdio": "npm run build && node scripts/stdio-smoke.mjs",
- "typecheck": "tsc --noEmit"
+ "test": "npm run build && node test/run.mjs",
+ "typecheck": "tsc --noEmit",
+ "prepublishOnly": "npm run build && node scripts/stdio-smoke.mjs && node test/run.mjs"
  },
  "keywords": [
  "mcp",