loop-engineering 1.0.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Rithvik Shetty
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,115 @@
+ # loop-engineering
+
+ Turn a goal description into a loop-ready spec, install it into your AI coding tool, then verify, run, audit, and cost-estimate the loop — all from one CLI.
+
+ ```bash
+ npx loop-engineering install
+ ```
+
+ Most goal descriptions do not convert cleanly into something an autonomous agent loop can terminate on correctly by itself. "Make it look good" has no pass/fail check; a loop built on it either declares false victory or never stops. This package gives your agent a skill that refuses to start looping until the goal has five real things: a concrete end state, a verification command, termination conditions (success + cap + no-progress exit), scope, and an escalation path. It then gives you a small CLI to check, run, score, and estimate that loop mechanically, without trusting the agent's own promise to behave.
+
+ ## Install the skill into your tool
+
+ ```bash
+ npx loop-engineering install # auto-detects what's in your project, installs there
+ npx loop-engineering install --tool all # installs for every supported tool
+ npx loop-engineering install --tool cursor # installs for one specific tool
+ ```
+
+ Supported tools:
+
+ | Tool | Install path | Detection signal |
+ |---|---|---|
+ | Claude Code | `.claude/skills/loop-engineering/` | `.claude/` exists |
+ | Codex | `.agents/skills/loop-engineering/` + `~/.codex/skills/…` | `.agents/skills/` exists |
+ | Windsurf | `.windsurf/skills/loop-engineering/` | `.windsurf/` exists |
+ | Cursor | `.cursor/commands/loop.md` + `.loop/scripts/` | `.cursor/` exists |
+ | Kiro | `.kiro/steering/loop-engineering.md` + `.loop/scripts/` | `.kiro/` exists |
+ | Trae | `.trae/rules/loop-engineering.md` + `.loop/scripts/` | `.trae/` exists |
+ | OpenCode | `.opencode/skills/loop-engineering/` + `~/.config/opencode/skills/…` | `.opencode/` exists |
+ | Rovodev | `.rovodev/skills/loop-engineering/` + `~/.rovodev/skills/…` | `.rovodev/` exists |
+ | Qoder | `.qoder/skills/loop-engineering/` + `~/.qoder/skills/…` | `.qoder/` exists |
+ | Antigravity | `~/.gemini/antigravity/skills/loop-engineering/` (global only) | `--tool antigravity` |
+
+ Auto-detection is presence-based: if the tool's config directory exists in your project, it's detected. If nothing is detected, `install` falls back to installing for every supported tool. Antigravity is global-only so it's never auto-detected; use `--tool antigravity` or `--tool all` explicitly.
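
As a rough illustration of presence-based detection, the sketch below maps each tool to the directory named in the "Detection signal" column and keeps the tools whose directory exists. The ids `claude-code` and `cursor` appear elsewhere in this package; the other ids, the `DETECTION_SIGNALS` table, and the injected `dirExists` predicate are assumptions made for the example, not the contents of `src/lib/detect.mjs`.

```javascript
// Hypothetical detection table built from the README's "Detection signal" column.
// Antigravity is omitted because it is global-only and never auto-detected.
const DETECTION_SIGNALS = {
  "claude-code": ".claude",
  codex: ".agents/skills",
  windsurf: ".windsurf",
  cursor: ".cursor",
  kiro: ".kiro",
  trae: ".trae",
  opencode: ".opencode",
  rovodev: ".rovodev",
  qoder: ".qoder",
};

// `dirExists` is injected so the sketch runs without touching the disk;
// a real caller could pass (p) => fs.existsSync(p).
function detectTools(projectRoot, dirExists) {
  return Object.entries(DETECTION_SIGNALS)
    .filter(([, dir]) => dirExists(`${projectRoot}/${dir}`))
    .map(([id]) => id);
}

// A fake project that contains only .cursor/ and .claude/
const fakeDisk = new Set(["/repo/.cursor", "/repo/.claude"]);
console.log(detectTools("/repo", (p) => fakeDisk.has(p)));
// -> [ 'claude-code', 'cursor' ]
```

Results come back in table order rather than in the order the directories were created, which keeps the install report stable between runs.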
+
+ Once installed, ask your agent (inside Claude Code, Cursor, Kiro, etc.) to build a loop spec for your goal — it loads the skill and interviews you for whatever's missing. Authoring the spec (`new`/`harden`) happens inside the agent conversation, not on this CLI; the CLI handles everything mechanical around it.
+
+ ## CLI commands
+
+ ```bash
+ npx loop-engineering verify [path] # checks LOOP_SPEC.md has all 5 required sections, non-placeholder
+ npx loop-engineering run [path] # runs one verification pass, tracks iteration + no-progress state
+ npx loop-engineering status [path] # reads iteration history, no execution
+ npx loop-engineering audit # scores a project's loop-readiness 0-100
+ npx loop-engineering cost [path] # rough token cost estimate before committing to a run
+ ```
+
+ ### `verify`
+
+ Exits 0 if `LOOP_SPEC.md` has a real Goal, a runnable Verification command (or an explicitly-flagged LLM-judge fallback), all three Termination conditions (success / max-iterations / no-progress), a Scope with at least one forbidden action, and an Escalation behavior. Exits 1 with an itemized list otherwise — no partial credit for "looks complete."
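
To make the shape of these checks concrete, here is a minimal sketch of a section linter. The five section names and the three termination conditions come from the description above; the regular expressions, the issue messages, and the `lintSpec` / `splitSections` names are illustrative assumptions, not the logic of the real `lint_spec.mjs`.

```javascript
const REQUIRED_SECTIONS = ["Goal", "Verification", "Termination", "Scope", "Escalation"];

// Split a spec into { sectionName: bodyText } keyed by its "## " headings.
function splitSections(spec) {
  const sections = {};
  let current = null;
  for (const line of spec.split("\n")) {
    const heading = line.match(/^##\s+(.+?)\s*$/);
    if (heading) {
      current = heading[1];
      sections[current] = "";
    } else if (current !== null) {
      sections[current] += line + "\n";
    }
  }
  return sections;
}

// Returns an itemized issue list; an empty list corresponds to exit code 0.
function lintSpec(spec) {
  const sections = splitSections(spec);
  const issues = [];
  for (const name of REQUIRED_SECTIONS) {
    const body = (sections[name] || "").trim();
    if (!body) issues.push(`${name}: section missing or empty`);
    else if (/^<[^>]*>$/.test(body)) issues.push(`${name}: still a placeholder`);
  }
  for (const exit of ["Success", "Max iterations", "No-progress"]) {
    const pattern = new RegExp(`^-\\s*${exit}:[ \\t]*\\S`, "mi");
    if (!pattern.test(sections.Termination || "")) {
      issues.push(`Termination: missing "${exit}" condition`);
    }
  }
  if (!/^-\s*Forbidden:[ \t]*\S/mi.test(sections.Scope || "")) {
    issues.push("Scope: needs at least one forbidden action");
  }
  return { ok: issues.length === 0, issues };
}

// A goal with nothing else fails on every remaining section.
const weakSpec = "# Loop: demo\n\n## Goal\nMake it look good\n";
console.log(lintSpec(weakSpec).ok); // false
console.log(lintSpec(weakSpec).issues.join("\n"));
```

A real linter would also need to inspect the Verification block for a runnable command, which this sketch leaves out.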
+
+ ### `run`
+
+ Runs the spec's verification command once, hashes both its output and the working tree (so two failures that print an identical generic error but came from genuinely different code states aren't wrongly flagged as stuck), and reports `success`, `continue`, or `stop-escalate`. State persists to `.loop/state.json` so the iteration count survives across turns — the calling agent can't lose count or talk itself past the cap.
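
The decision described above can be sketched as a pure function over the previous state and the latest verification result. The three status strings come from this README; the state shape, the `reason` field, and the `decide` name are assumptions for illustration, and real hashing of the output and working tree is replaced by precomputed `outputHash` / `treeHash` strings.

```javascript
// One "check" step. `state` is what a caller might persist between turns
// (for example as JSON in .loop/state.json); its shape here is hypothetical.
function decide(state, result, maxIterations) {
  // Composite fingerprint: same verifier output AND same tree means no progress.
  const fingerprint = `${result.outputHash}:${result.treeHash}`;
  const iteration = state.iteration + 1;
  const repeats = fingerprint === state.lastFingerprint ? state.repeats + 1 : 1;
  const next = { iteration, lastFingerprint: fingerprint, repeats };

  if (result.exitCode === 0) return { status: "success", state: next };
  if (repeats >= 2) return { status: "stop-escalate", reason: "no-progress", state: next };
  if (iteration >= maxIterations) return { status: "stop-escalate", reason: "max-iterations", state: next };
  return { status: "continue", state: next };
}

// Same error text three times, but the tree changed between runs 1 and 2,
// so only the third run counts as stuck.
let loopState = { iteration: 0, lastFingerprint: null, repeats: 0 };
const runs = [
  { exitCode: 1, outputHash: "err-a", treeHash: "tree-1" },
  { exitCode: 1, outputHash: "err-a", treeHash: "tree-2" },
  { exitCode: 1, outputHash: "err-a", treeHash: "tree-2" },
];
for (const run of runs) {
  const step = decide(loopState, run, 6);
  loopState = step.state;
  console.log(loopState.iteration, step.status, step.reason || "");
}
// 1 continue
// 2 continue
// 3 stop-escalate no-progress
```

Keeping the function pure means the cap and the no-progress rule depend only on persisted state, never on what the calling agent remembers.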
+
+ ### `audit`
+
+ ```
+ Loop Readiness Score: 70/100
+
+ ✓ LOOP_SPEC.md exists: 20/20
+ ✓ Spec passes lint: 30/30 — PASS
+ ✗ Run history exists: 0/15 — no .loop/state.json
+ ✗ Last run resolved cleanly: 0/15 — no run history
+ ✓ Skill installed for at least one tool: 10/10 — claude-code
+ ✓ Project is a git repo: 10/10 — git detected — reliable no-progress signal
+ ```
+
+ Add `--badge` to print a shields.io badge for your README.
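
The arithmetic behind the sample report is a plain sum of per-check points. The sketch below recomputes it and builds the badge URL using the same color thresholds that `cmdAudit` in `bin/loop-engineering.mjs` applies (80 and above is `brightgreen`, 50 and above is `yellow`, otherwise `red`). The `summarize` helper and the hard-coded check list are assumptions for the example; the actual checks are produced by `src/lib/audit.mjs`, which is not shown in this diff.

```javascript
// Labels, points, and maxima copied from the sample output above.
const sampleChecks = [
  { label: "LOOP_SPEC.md exists", points: 20, max: 20 },
  { label: "Spec passes lint", points: 30, max: 30 },
  { label: "Run history exists", points: 0, max: 15 },
  { label: "Last run resolved cleanly", points: 0, max: 15 },
  { label: "Skill installed for at least one tool", points: 10, max: 10 },
  { label: "Project is a git repo", points: 10, max: 10 },
];

function summarize(checks) {
  const score = checks.reduce((sum, c) => sum + c.points, 0);
  const max = checks.reduce((sum, c) => sum + c.max, 0);
  const color = score >= 80 ? "brightgreen" : score >= 50 ? "yellow" : "red";
  const badge = `https://img.shields.io/badge/Loop%20Readiness-${score}%2F100-${color}`;
  return { score, max, color, badge };
}

const summary = summarize(sampleChecks);
console.log(`${summary.score}/${summary.max} ${summary.color}`); // 70/100 yellow
console.log(summary.badge);
```

In this sample, a single clean run would add the two missing 15-point checks and move the badge from yellow to brightgreen.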
+
+ ### `cost`
+
+ ```
+ Estimated token cost for up to 6 iterations:
+ Best case (succeeds iteration 1): ~4,200 tokens
+ Worst case (runs full cap): ~17,700 tokens
+ Per-iteration: ~2,700 tokens
+ ```
+
+ A rough range, not a prediction — override the dominant term with `--edit-tokens N` if you have a sense of how big the actual code changes will be. The point is catching "this will burn through a plan before lunch" before it happens, not billing precisely.
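
The sample numbers are consistent with a fixed one-off cost plus a constant per-iteration cost: 4,200 = 1,500 + 2,700 and 17,700 = 1,500 + 6 × 2,700. The sketch below encodes that reading. The result field names and the `agentEditTokensPerIteration` / `verifierOutputTokens` options match what `bin/loop-engineering.mjs` passes around, but `setupTokens` and all default values are assumptions chosen only to reproduce the sample; the real formula in `src/lib/cost.mjs` is not shown in this diff.

```javascript
function estimateCost({
  maxIterations,
  setupTokens = 1500,                 // hypothetical one-off cost (reading the spec)
  agentEditTokensPerIteration = 2000, // assumed default; --edit-tokens overrides it
  verifierOutputTokens = 700,         // assumed default; --verify-tokens overrides it
}) {
  const perIterationTokens = agentEditTokensPerIteration + verifierOutputTokens;
  return {
    maxIterations,
    perIterationTokens,
    lowEstimateTokens: setupTokens + perIterationTokens,                  // succeeds on iteration 1
    highEstimateTokens: setupTokens + perIterationTokens * maxIterations, // runs the full cap
  };
}

const estimate = estimateCost({ maxIterations: 6 });
console.log(estimate.lowEstimateTokens, estimate.highEstimateTokens, estimate.perIterationTokens);
// 4200 17700 2700
```

Because the edit term dominates each iteration, overriding it (for example `agentEditTokensPerIteration: 5000`) moves the worst case far more than any change to the verifier term.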
+
+ ## The spec format
+
+ ````markdown
+ # Loop: <name>
+
+ ## Goal
+ <concrete, observable end state — not an adjective>
+
+ ## Verification
+ ```
+ <exact command that returns pass/fail>
+ ```
+
+ ## Termination
+ - Success: verification exits 0
+ - Max iterations: <N>
+ - No-progress: stop if 2 consecutive iterations produce identical result + unchanged tree
+ - Budget: <optional>
+
+ ## Scope
+ - Allowed: <paths/stack>
+ - Forbidden: <at least one explicit thing>
+
+ ## Escalation
+ On cap or no-progress: stop, summarize attempts, wait for human review.
+ ````
+
+ ## Why a script, not just instructions
+
+ Most loop-engineering failures are goal-description failures, not model failures. `skill/scripts/lint_spec.mjs` is the actual gate — it parses the file and checks each section against specific criteria (not "does this look done"). `skill/scripts/run_loop.mjs` is the actual executor — it enforces the cap and no-progress exit itself rather than asking the agent to count correctly. Both are plain Node scripts with no dependencies, auditable in under 200 lines each.
+
+ ## License
+
+ MIT
package/bin/loop-engineering.mjs ADDED
@@ -0,0 +1,225 @@
+ #!/usr/bin/env node
+ // bin/loop-engineering.mjs — CLI entry point.
+ //
+ // Subcommands:
+ //   install [--tool <id>|all] [--target <path>]
+ //   verify [path]
+ //   run [path]
+ //   status [path]
+ //   audit [--target <path>] [--badge]
+ //   cost [path] [--edit-tokens N] [--verify-tokens N] [--max-iterations N]
+ //
+ // Deliberately dependency-free (no commander/yargs): the command surface is small enough
+ // that hand-rolled parsing is more auditable than pulling in a flag-parsing dependency for
+ // a tool whose whole pitch is "minimal, mechanical, no magic."
+
+ import { dirname, join, resolve } from "node:path";
+ import { fileURLToPath } from "node:url";
+ import { execFileSync } from "node:child_process";
+ import { readFileSync, existsSync } from "node:fs";
+
+ import { detectTools, allToolIds, TOOLS } from "../src/lib/detect.mjs";
+ import { installTools } from "../src/lib/install.mjs";
+ import { auditProject } from "../src/lib/audit.mjs";
+ import { estimateCost, extractMaxIterationsFromSpec } from "../src/lib/cost.mjs";
+
+ const __dirname = dirname(fileURLToPath(import.meta.url));
+ const SKILL_SCRIPTS_DIR = join(__dirname, "..", "skill", "scripts");
+ const LINT_SCRIPT = join(SKILL_SCRIPTS_DIR, "lint_spec.mjs");
+ const RUN_SCRIPT = join(SKILL_SCRIPTS_DIR, "run_loop.mjs");
+
+ function parseFlags(argv) {
+   const flags = {};
+   const positional = [];
+   for (let i = 0; i < argv.length; i++) {
+     const arg = argv[i];
+     if (arg.startsWith("--")) {
+       const key = arg.slice(2);
+       const next = argv[i + 1];
+       if (next !== undefined && !next.startsWith("--")) {
+         flags[key] = next;
+         i++;
+       } else {
+         flags[key] = true;
+       }
+     } else {
+       positional.push(arg);
+     }
+   }
+   return { flags, positional };
+ }
+
+ function cmdInstall(argv) {
+   const { flags, positional } = parseFlags(argv);
+   const target = resolve(flags.target || process.cwd());
+   let toolIds;
+
+   if (!flags.tool && positional.length === 0) { // no explicit choice: auto-detect
+     const detected = detectTools(target);
+     toolIds = detected.length > 0 ? detected : allToolIds();
+     console.log(
+       detected.length > 0
+         ? `Detected: ${detected.join(", ")}. Installing for detected tools only. Use --tool all to install everywhere.`
+         : `No tool config detected in ${target}. Installing for all supported tools as a safe default.`
+     );
+   } else if (flags.tool) {
+     toolIds = flags.tool === "all" ? allToolIds() : [flags.tool]; // --tool all skips detection
+   } else {
+     toolIds = positional; // allow: loop-engineering install claude-code cursor
+   }
+
+   const unknown = toolIds.filter((id) => !allToolIds().includes(id));
+   if (unknown.length > 0) {
+     console.error(`Unknown tool id(s): ${unknown.join(", ")}. Valid: ${allToolIds().join(", ")}`);
+     process.exit(1);
+   }
+
+   const report = installTools(toolIds, target);
+   for (const [id, results] of Object.entries(report)) {
+     const label = TOOLS[id].label;
+     for (const r of results) {
+       const status = r.installed ? "installed" : `skipped (${r.reason})`;
+       console.log(`[${label}] ${status}: ${r.path}`);
+     }
+   }
+   console.log("\nDone. Restart your tool / start a new session to load the skill.");
+ }
+
+ function resolveSpecPath(positional) {
+   return resolve(positional[0] || "LOOP_SPEC.md");
+ }
+
+ function cmdVerify(argv) {
+   const { positional } = parseFlags(argv);
+   const specPath = resolveSpecPath(positional);
+   try {
+     execFileSync("node", [LINT_SCRIPT, specPath], { stdio: "inherit" });
+     process.exit(0);
+   } catch (err) {
+     process.exit(err.status ?? 1);
+   }
+ }
+
+ function cmdRun(argv) {
+   const { positional } = parseFlags(argv);
+   const specPath = resolveSpecPath(positional);
+   try {
+     execFileSync("node", [RUN_SCRIPT, "check", specPath], { stdio: "inherit" });
+     process.exit(0);
+   } catch (err) {
+     process.exit(err.status ?? 1);
+   }
+ }
+
+ function cmdStatus(argv) {
+   const { positional } = parseFlags(argv);
+   const specPath = resolveSpecPath(positional);
+   try {
+     execFileSync("node", [RUN_SCRIPT, "status", specPath], { stdio: "inherit" });
+   } catch (err) {
+     process.exit(err.status ?? 1);
+   }
+ }
+
+ function cmdAudit(argv) {
+   const { flags } = parseFlags(argv);
+   const target = resolve(flags.target || process.cwd());
+   const result = auditProject(target, LINT_SCRIPT);
+   console.log(`Loop Readiness Score: ${result.score}/${result.max}\n`);
+   for (const check of result.checks) {
+     const mark = check.points === check.max ? "✓" : check.points > 0 ? "~" : "✗";
+     console.log(` ${mark} ${check.label}: ${check.points}/${check.max}${check.detail ? ` — ${check.detail}` : ""}`);
+   }
+   if (flags.badge) {
+     const color = result.score >= 80 ? "brightgreen" : result.score >= 50 ? "yellow" : "red";
+     const url = `https://img.shields.io/badge/Loop%20Readiness-${result.score}%2F100-${color}`;
+     console.log(`\nBadge markdown:\n![Loop Readiness](${url})`);
+   }
+ }
+
+ function cmdCost(argv) {
+   const { flags, positional } = parseFlags(argv);
+   const specPath = resolveSpecPath(positional);
+   let maxIterations = flags["max-iterations"] ? parseInt(flags["max-iterations"], 10) : null;
+
+   if (!maxIterations) {
+     if (!existsSync(specPath)) {
+       console.error(`${specPath} not found, and no --max-iterations given. Either point at a spec or pass --max-iterations N.`);
+       process.exit(1);
+     }
+     const content = readFileSync(specPath, "utf-8");
+     maxIterations = extractMaxIterationsFromSpec(content);
+     if (!maxIterations) {
+       console.error(`Could not find "Max iterations: N" in ${specPath}'s Termination section. Pass --max-iterations N explicitly.`);
+       process.exit(1);
+     }
+   }
+
+   const opts = { maxIterations };
+   if (flags["edit-tokens"]) opts.agentEditTokensPerIteration = parseInt(flags["edit-tokens"], 10);
+   if (flags["verify-tokens"]) opts.verifierOutputTokens = parseInt(flags["verify-tokens"], 10);
+
+   const result = estimateCost(opts);
+   console.log(`Estimated token cost for up to ${result.maxIterations} iterations:`);
+   console.log(` Best case (succeeds iteration 1): ~${result.lowEstimateTokens.toLocaleString()} tokens`);
+   console.log(` Worst case (runs full cap): ~${result.highEstimateTokens.toLocaleString()} tokens`);
+   console.log(` Per-iteration: ~${result.perIterationTokens.toLocaleString()} tokens`);
+   console.log(`\nAssumptions (override with --edit-tokens / --verify-tokens):`);
+   console.log(` ${JSON.stringify(result.assumptions)}`);
+   console.log(`\n${result.note}`);
+ }
+
+ function printHelp() {
+   console.log(`loop-engineering — turn a goal description into a loop-ready spec, then run it.
+
+ Usage:
+ npx loop-engineering install [--tool <id>|all] [--target <path>]
+ npx loop-engineering verify [path] (default path: ./LOOP_SPEC.md)
+ npx loop-engineering run [path]
+ npx loop-engineering status [path]
+ npx loop-engineering audit [--target <path>] [--badge]
+ npx loop-engineering cost [path] [--max-iterations N] [--edit-tokens N] [--verify-tokens N]
+
+ Tool ids for install: ${allToolIds().join(", ")}, or "all"
+
+ Writing the spec itself (the "new" and "harden" workflows) happens inside your AI coding
+ tool, not on this CLI — install the skill, then ask your agent to author or fix a spec.
+ This CLI handles the mechanical parts: installing, linting, running, scoring, estimating.
+ `);
+ }
+
+ function main() {
+   const [command, ...rest] = process.argv.slice(2);
+   switch (command) {
+     case "install":
+       cmdInstall(rest);
+       break;
+     case "verify":
+       cmdVerify(rest);
+       break;
+     case "run":
+       cmdRun(rest);
+       break;
+     case "status":
+       cmdStatus(rest);
+       break;
+     case "audit":
+       cmdAudit(rest);
+       break;
+     case "cost":
+       cmdCost(rest);
+       break;
+     case undefined:
+     case "help":
+     case "--help":
+     case "-h":
+       printHelp();
+       break;
+     default:
+       console.error(`Unknown command: ${command}\n`);
+       printHelp();
+       process.exit(1);
+   }
+ }
+
+ main();
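
A note on the hand-rolled parser in this file: any `--flag` followed by a token that does not start with `--` is stored as a key/value pair, and every value stays a string. The demo below repeats `parseFlags` from the file so it runs alone; the sample argument lists are made up for illustration.

```javascript
// Copy of parseFlags from bin/loop-engineering.mjs, repeated so the demo is self-contained.
function parseFlags(argv) {
  const flags = {};
  const positional = [];
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg.startsWith("--")) {
      const key = arg.slice(2);
      const next = argv[i + 1];
      if (next !== undefined && !next.startsWith("--")) {
        flags[key] = next; // the following token is consumed as this flag's value
        i++;
      } else {
        flags[key] = true; // bare flag
      }
    } else {
      positional.push(arg);
    }
  }
  return { flags, positional };
}

console.log(parseFlags(["LOOP_SPEC.md", "--max-iterations", "6"]));
// flags: { 'max-iterations': '6' }, positional: [ 'LOOP_SPEC.md' ]

console.log(parseFlags(["--badge", "--target", "./app"]));
// flags: { badge: true, target: './app' }, positional: []

console.log(parseFlags(["--badge", "./app"]));
// flags: { badge: './app' }, positional: []  (the path became the flag's value)
```

The last case is the one to watch: a boolean flag placed directly before a path swallows the path, so positional arguments are safest written before any flags.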
package/package.json ADDED
@@ -0,0 +1,50 @@
+ {
+   "name": "loop-engineering",
+   "version": "1.0.0",
+   "description": "Turn a goal description into a loop-ready spec, install it into Claude Code, Codex, Windsurf, Cursor, Kiro, Trae, OpenCode, Rovodev, Qoder, or Antigravity, then verify, run, audit, and cost-estimate the loop.",
+   "type": "module",
+   "license": "MIT",
+   "author": "Rithvik Shetty",
+   "keywords": [
+     "loop-engineering",
+     "ai-agents",
+     "claude-code",
+     "codex",
+     "cursor",
+     "windsurf",
+     "antigravity",
+     "kiro",
+     "trae",
+     "opencode",
+     "rovodev",
+     "qoder",
+     "skill",
+     "agentic"
+   ],
+   "bin": {
+     "loop-engineering": "bin/loop-engineering.mjs"
+   },
+   "files": [
+     "bin",
+     "src",
+     "skill",
+     "patterns",
+     "README.md",
+     "LICENSE"
+   ],
+   "engines": {
+     "node": ">=18"
+   },
+   "scripts": {
+     "test": "node --test test/*.test.mjs",
+     "build": "node scripts/build.mjs"
+   },
+   "repository": {
+     "type": "git",
+     "url": "git+https://github.com/<your-username>/loop-engineering.git"
+   },
+   "homepage": "https://github.com/<your-username>/loop-engineering#readme",
+   "bugs": {
+     "url": "https://github.com/<your-username>/loop-engineering/issues"
+   }
+ }
@@ -0,0 +1,57 @@
+ Act as a loop-engineering specialist. The user is invoking `/loop`, optionally followed by a sub-command and an argument: `/loop <new|harden|verify|run|status> [goal description or path]`.
+
+ This command does NOT build the feature directly. It authors/checks/runs a **loop spec** — a goal description hardened into something an autonomous agent loop can act on without a human re-prompting every turn.
+
+ ## The hard rule
+
+ A spec is never "loop-ready" because it reads complete. It is loop-ready when `node .loop/scripts/lint_spec.mjs LOOP_SPEC.md` exits 0 (this repo's installer places the scripts at `.loop/scripts/` specifically because Cursor has no native skills folder — if that path doesn't exist, check whether `.claude/skills/loop-engineering/scripts/`, `.agents/skills/loop-engineering/scripts/`, or `.windsurf/skills/loop-engineering/scripts/` exists instead, from another tool's install). Do not eyeball a spec and call it done. Run the script.
+
+ Iteration counts, the no-progress check, and the cap are tracked by `.loop/scripts/run_loop.mjs`, not by you counting in your head. It persists state to `.loop/state.json` so the count can't be lost or argued past.
+
+ ## Routing
+
+ - **`new <goal>`** (or any goal description with no existing `LOOP_SPEC.md`): interview for Goal precision and a real Verification command — don't accept vague adjectives like "better" or "cleaner" without converting them to an observable, checkable end state. Default Termination (max iterations 8–10, no-progress = 2 identical consecutive results), Scope (ask for at least one forbidden action), and Escalation (stop + summarize + wait for human) unless told otherwise. Write `LOOP_SPEC.md` with sections `## Goal`, `## Verification`, `## Termination`, `## Scope`, `## Escalation`. Run the lint script immediately after writing it; fix what it flags; re-run until it exits 0.
+
+ - **`harden [path]`**: run the lint script first, always — don't guess what's wrong by reading the file. Fix exactly what it reports, in the order reported. Re-run until clean. Report before/after issue list.
+
+ - **`verify [path]`**: run the lint script only. Report exit 0 (ready) or the itemized issue list verbatim. Do not execute the verification command itself or modify the file — that's `run` and `harden` respectively.
+
+ - **`run [path]`**: precondition — must pass `verify` first. Loop: call `node .loop/scripts/run_loop.mjs check <path>`, read the JSON `status` field. `"success"` → stop, report. `"continue"` → make a real code change closing the gap described in the failure, then check again. `"stop-escalate"` → stop immediately, report the reason and history verbatim, wait for the user — never bypass the script by running the verification command directly to "see if it would have passed."
+
+ - **`status [path]`**: run `node .loop/scripts/run_loop.mjs status <path>`, report iteration history read-only. No execution, no file changes.
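
The `run` rule in the list above amounts to a small driver loop around the `status` field. The sketch below shows that control flow with both sides injected: `check` stands in for running `node .loop/scripts/run_loop.mjs check <path>` and parsing its JSON, and `makeChange` stands in for the agent's code edit. Both function names, and the shape of the returned object, are hypothetical.

```javascript
// The driver holds no iteration counter of its own: the cap and the
// no-progress rule are the script's job, so the loop only reacts to `status`.
function runLoop(check, makeChange) {
  for (;;) {
    const report = check();
    if (report.status === "success") return { done: true, report };
    if (report.status === "stop-escalate") return { done: false, report }; // hand back to the human
    if (report.status !== "continue") throw new Error(`unexpected status: ${report.status}`);
    makeChange(report); // a real edit aimed at the reported failure, then check again
  }
}

// Scripted stand-in for the script's verdicts across three checks.
const scripted = [
  { status: "continue" },
  { status: "continue" },
  { status: "stop-escalate", reason: "no-progress" },
];
let edits = 0;
const outcome = runLoop(() => scripted.shift(), () => { edits += 1; });
console.log(outcome.done, outcome.report.reason, edits); // false no-progress 2
```

On `stop-escalate` the driver returns straight away and never calls `makeChange` or `check` again, which is the behavior the rule demands.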
+
+ ## Absolute bans
+
+ - Never mark a spec ready without running the lint script.
+ - Never mark a loop iteration "success" without the literal exit code being 0.
+ - Never silently exceed the stated iteration cap, and never re-run the verifier manually outside `run_loop.mjs` to dodge a `stop-escalate` verdict.
+ - Never write a Scope section with zero forbidden actions.
+ - Never present LLM-as-judge verification as equivalent to a deterministic check — flag it as the weakest link every time it's used.
+ - Never accept a goal phrased only as a vague adjective without converting it to an observable end state first.
+
+ ## Spec shape (for `new`/`harden` to produce)
+
+ ````markdown
+ # Loop: <name>
+
+ ## Goal
+ <concrete, observable end state>
+
+ ## Verification
+ ```
+ <exact command>
+ ```
+
+ ## Termination
+ - Success: ...
+ - Max iterations: ...
+ - No-progress: ...
+ - Budget: ...
+
+ ## Scope
+ - Allowed: ...
+ - Forbidden: ...
+
+ ## Escalation
+ ...
+ ````
package/skill/SKILL.md ADDED
@@ -0,0 +1,79 @@
+ ---
+ name: loop-engineering
+ description: Turn a goal description into a loop spec an AI agent can iterate against autonomously, then optionally run that loop with enforced caps. Use when the user invokes /loop or any sub-command (init, new, harden, verify, run, status), asks to "build until done" or "iterate until X passes," describes a goal they want an agent to work toward without manual re-prompting each turn, or asks for a spec/PRD meant to drive an autonomous coding agent (Claude Code, Codex, Cursor, Windsurf, Antigravity). Also trigger when a goal description is vague, untestable, or missing a stop condition — hardening weak goal descriptions into loop-ready specs is this skill's primary job, and it refuses to hand a vague goal to an agent loop without fixing it first.
+ version: 1.1.0
+ user-invocable: true
+ argument-hint: "[init|new|harden|verify|run|status] [goal description or path to LOOP_SPEC.md]"
+ license: MIT
+ ---
+
+ # Loop Engineering
+
+ Converts a goal description into a **loop spec**: Goal, Verification, Termination, Scope, Escalation. Then, optionally, runs the loop — act, verify, decide continue/stop — with caps enforced by a script, not by the agent's own promise to behave.
+
+ ## Why this exists
+
+ Old model: human types a prompt, agent responds, human reads and re-prompts. Human is the loop.
+
+ Loop engineering: human writes the goal once; a system prompts the agent, checks the result, and decides whether to continue — without the human in the cycle. The loop's quality is bottlenecked entirely on the goal description. A vague goal makes a loop that either declares false victory or never stops.
+
+ **Most "loop engineering" failures are goal-description failures, not model failures.** This skill's job is to catch that before iteration 1, and to make the cap/no-progress/escalation logic mechanical rather than a promise the agent makes to itself.
+
+ ## Setup (run before any sub-command)
+
+ 1. **Resolve the scripts directory once per session.** The two scripts this skill depends on (`lint_spec.mjs`, `run_loop.mjs`) live alongside this SKILL.md, but their absolute path depends on which tool installed them. Check, in order, and use the first that exists: `.claude/skills/loop-engineering/scripts/`, `.agents/skills/loop-engineering/scripts/`, `.windsurf/skills/loop-engineering/scripts/`, `.opencode/skills/loop-engineering/scripts/`, `.rovodev/skills/loop-engineering/scripts/`, `.qoder/skills/loop-engineering/scripts/`, `.loop/scripts/` (Cursor, Kiro, and Trae install scripts here), `~/.codex/skills/loop-engineering/scripts/`, `~/.gemini/antigravity/skills/loop-engineering/scripts/`, `~/.config/opencode/skills/loop-engineering/scripts/`, `~/.rovodev/skills/loop-engineering/scripts/`, `~/.qoder/skills/loop-engineering/scripts/`. Every reference file in this skill writes commands as `node <skill-dir>/scripts/lint_spec.mjs` — substitute the real resolved path for `<skill-dir>/scripts/`, don't run `node scripts/...` relative to an assumed cwd.
+ 2. Check whether `LOOP_SPEC.md` (or a user-specified spec path) exists in the working directory. Also check whether `LOOP_CONTEXT.md` exists — if it does, read it now; it contains standing project conventions (test command, forbidden paths) that pre-fill defaults for `new` and `harden`.
+ 3. If a sub-command was given (`init`, `new`, `harden`, `verify`, `run`, `status`), read `reference/<command>.md` next — non-optional, it defines that command's exact flow.
+ 4. Never skip the linter. The lint script is the actual gate, not a suggestion — see "The hard rule" below.
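
Setup step 1 is a first-match search over an ordered list. A minimal sketch follows, with the candidate paths copied from that step; `resolveScriptsDir`, its `home` argument, and the injected `dirExists` predicate are hypothetical names introduced so the example can run without a real filesystem.

```javascript
// Candidate directories, in the order Setup step 1 lists them.
const CANDIDATES = [
  ".claude/skills/loop-engineering/scripts/",
  ".agents/skills/loop-engineering/scripts/",
  ".windsurf/skills/loop-engineering/scripts/",
  ".opencode/skills/loop-engineering/scripts/",
  ".rovodev/skills/loop-engineering/scripts/",
  ".qoder/skills/loop-engineering/scripts/",
  ".loop/scripts/",
  "~/.codex/skills/loop-engineering/scripts/",
  "~/.gemini/antigravity/skills/loop-engineering/scripts/",
  "~/.config/opencode/skills/loop-engineering/scripts/",
  "~/.rovodev/skills/loop-engineering/scripts/",
  "~/.qoder/skills/loop-engineering/scripts/",
];

// Project-relative candidates resolve against projectRoot, "~/" ones against home.
function resolveScriptsDir(projectRoot, home, dirExists) {
  for (const candidate of CANDIDATES) {
    const full = candidate.startsWith("~/")
      ? `${home}/${candidate.slice(2)}`
      : `${projectRoot}/${candidate}`;
    if (dirExists(full)) return full;
  }
  return null; // nothing installed: report that instead of guessing a path
}

// A Cursor-style project install plus a global Codex install: the project one wins.
const installed = new Set([
  "/repo/.loop/scripts/",
  "/home/me/.codex/skills/loop-engineering/scripts/",
]);
console.log(resolveScriptsDir("/repo", "/home/me", (p) => installed.has(p)));
// /repo/.loop/scripts/
```

Because every project-level path comes before every home-directory path, a project install always shadows a global one.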
+
+ ## The hard rule
+
+ **A spec is never "loop-ready" because the agent says so. It is loop-ready when `node <skill-dir>/scripts/lint_spec.mjs <path>` exits 0.** Do not eyeball a spec and decide it looks fine. Do not skip the lint step because the spec "seems complete." Run the script, read its actual output, fix exactly what it flags. This is the difference between a skill that's foolproof and one that's just well-written prose — prose can be rationalized past; a non-zero exit code cannot.
+
+ Same for running a loop: **iteration counts, the no-progress check, and the cap are tracked by `<skill-dir>/scripts/run_loop.mjs`, not by the agent counting in its head.** The script persists state to `.loop/state.json` specifically so a model can't lose count, restart the count by accident, or talk itself into "just one more try" past the cap.
+
+ ## Commands
+
+ | Command | Category | Description | Reference |
+ |---|---|---|---|
+ | `init` | Setup | Capture standing project conventions (test command, build command, forbidden paths) into `LOOP_CONTEXT.md` so future specs don't re-ask for them | [reference/init.md](reference/init.md) |
+ | `new [goal]` | Build | Interview the user, author a fresh `LOOP_SPEC.md` from a goal description | [reference/new.md](reference/new.md) |
+ | `harden [path]` | Build | Take an existing weak/incomplete spec and fix exactly what the linter flags | [reference/harden.md](reference/harden.md) |
+ | `verify [path]` | Evaluate | Run the lint check only — confirm a spec is loop-ready without executing anything | [reference/verify.md](reference/verify.md) |
+ | `run [path]` | Iterate | Actually execute the loop: verify, report, decide continue/stop/escalate, repeat | [reference/run.md](reference/run.md) |
+ | `status [path]` | Evaluate | Read `.loop/state.json` and report iteration history without running anything | [reference/status.md](reference/status.md) |
+
+ ### Routing rules
+
+ 1. **No argument**: ask what goal they want to loop, or if a `LOOP_SPEC.md` already exists in the working directory, ask whether they want to `verify`, `run`, or `harden` it. Don't guess — a missing argument here means genuine ambiguity between "I have no spec yet" and "I have one, do something with it." If neither a spec nor a `LOOP_CONTEXT.md` exists and the user seems to be starting fresh in a project, suggest `init` first.
+ 2. **First word matches a command** (table above): load its reference file, treat everything after the command name as its argument, follow the reference's flow exactly.
+ 3. **First word doesn't match, but intent clearly maps to one command** (e.g. "is this spec actually ready?" → `verify`; "just start building and don't stop till tests pass" → `new` then `run`; "this spec sucks, fix it" → `harden`): load that reference and proceed as if invoked. If genuinely two could fit (e.g. they want both `new` and `run`), do `new` first, confirm the spec passes `verify`, then ask before `run`.
+ 4. **A goal description with no command word, and no existing spec in the working directory**: treat as `new`.
+ 5. **A goal description with no command word, and a spec already exists**: ask whether they mean to replace it (`new`) or check it (`verify`) — don't silently overwrite an existing spec.
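
The mechanical parts of these routing rules (1, 2, 4 and 5) can be sketched as a small dispatch function. Rule 3, which infers intent from free text, needs judgment and is deliberately left out, as is the `init` suggestion in rule 1. The `route` function and the shape of its return value are assumptions made for illustration only.

```javascript
const COMMANDS = ["init", "new", "harden", "verify", "run", "status"];

// `input` is whatever followed /loop; `specExists` says whether LOOP_SPEC.md is present.
function route(input, specExists) {
  const text = input.trim();
  if (text === "") {
    // Rule 1: no argument is ambiguous, so ask rather than guess.
    return specExists
      ? { action: "ask", options: ["verify", "run", "harden"] }
      : { action: "ask-goal" };
  }
  const [first, ...rest] = text.split(/\s+/);
  if (COMMANDS.includes(first)) {
    // Rule 2: everything after the command word is its argument.
    return { action: "run-command", command: first, argument: rest.join(" ") };
  }
  // Rules 4 and 5: a bare goal description.
  return specExists
    ? { action: "ask", options: ["new", "verify"] } // never silently overwrite a spec
    : { action: "run-command", command: "new", argument: text };
}

console.log(route("harden specs/LOOP_SPEC.md", true));
console.log(route("get the checkout tests passing", false));
console.log(route("get the checkout tests passing", true));
```

The first call dispatches to `harden` with the path as its argument, the second is treated as `new`, and the third returns an `ask` action because a spec already exists.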
+
+ ## Absolute bans
+
+ Match-and-refuse. If you're about to do any of these, stop and do the alternative instead.
+
+ - **Never mark a spec loop-ready without running `lint_spec.mjs`.** Reading it and thinking "looks complete" is not verification. Run the script.
+ - **Never mark a loop iteration "success" without the verification command's actual exit code being 0.** Not "the code looks right," not "this should pass" — the literal exit code from `run_loop.mjs check`.
+ - **Never silently exceed the stated iteration cap.** If `run_loop.mjs` reports `stop-escalate`, stop. Do not run "just one more" manually outside the script to see if it would have worked — that defeats the entire point of a mechanical cap.
+ - **Never write a Scope section with no forbidden actions**, and never let a user's "just do whatever it takes" stand unchallenged — ask for at least one explicit boundary (untouched files, no force-push, no editing tests to force a pass).
+ - **Never propose LLM-as-judge as the verification method without saying out loud that it's the weakest link.** It's sometimes the only option; it must never be presented as equivalent to a deterministic check.
+ - **Never accept a goal phrased only as a vague adjective** ("better," "polished," "cleaner") without converting it to an observable end state during `new` or `harden`. That conversion is most of the actual work — see `reference/examples.md`.
+
+ ## Reference files
+
+ - `reference/init.md` — interview flow for capturing project conventions into `LOOP_CONTEXT.md`
+ - `reference/new.md` — interview flow for authoring a fresh spec
+ - `reference/harden.md` — fixing an existing spec against linter output
+ - `reference/verify.md` — running the lint check, reporting results
+ - `reference/run.md` — executing the loop end-to-end, including how to act between iterations
+ - `reference/status.md` — reading and reporting iteration history
+ - `reference/ide-natives.md` — which native loop primitive to hand off to per IDE (Codex's `/goal`, Windsurf Workflows, etc.)
+ - `reference/examples.md` — worked weak-goal → loop-ready-spec conversions
+
+ ## Scripts
+
+ - `<skill-dir>/scripts/lint_spec.mjs` — the hard gate; validates a spec has all 5 sections, non-placeholder, with section-specific deep checks (verification has a real command or flags LLM-judge; termination has all three exits; scope names a forbidden action; escalation names a stop behavior)
+ - `<skill-dir>/scripts/run_loop.mjs check|reset|status` — executes one verification pass, tracks iteration + no-progress state via a composite hash of (verifier output) + (working tree changes), enforces the cap, never trusts the calling agent's own iteration count
package/skill/reference/examples.md ADDED
@@ -0,0 +1,72 @@
1
+ # Worked examples: weak goal → loop-ready spec
2
+
3
+ The actual work of loop engineering is converting a subjective adjective into something `lint_spec.mjs` and `run_loop.mjs` can both act on mechanically. These are full conversions, both passing the linter.
4
+
5
+ ## Example 1 — UI build
6
+
7
+ **Weak (fails `verify` immediately):**
8
+ > "Build me a landing page that looks good and works well."
9
+
10
+ Problems: no verifier, no termination condition, and "looks good" cannot be judged mechanically. `lint_spec.mjs` would report a missing Verification command, missing Termination, a missing Scope forbidden-action, and missing Escalation.
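To make the failure concrete, here is a minimal sketch of a heading-level check, assuming the five sections are named as in the loop-ready spec (`Goal`, `Verification`, `Termination`, `Scope`, `Escalation`). The function name and the reporting format are hypothetical; the real `lint_spec.mjs` performs deeper, section-specific checks and words its findings differently.

```javascript
// Hypothetical sketch of a heading-level check; the real lint_spec.mjs does more.
const REQUIRED_SECTIONS = ["Goal", "Verification", "Termination", "Scope", "Escalation"];

// Return the required section names that have no matching "## Name" heading.
function findMissingSections(specText) {
  const headings = new Set(
    specText
      .split("\n")
      .map((line) => line.match(/^##\s+(.+?)\s*$/))
      .filter(Boolean)
      .map((match) => match[1])
  );
  return REQUIRED_SECTIONS.filter((name) => !headings.has(name));
}

const weakGoal = "Build me a landing page that looks good and works well.";
console.log(findMissingSections(weakGoal)); // every required heading is missing
```

A one-sentence goal has no headings at all, so even this shallow check rejects it before any deeper rule runs.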
11
+
12
+ **Loop-ready (passes `verify`):**
13
+ ```markdown
14
+ # Loop: landing page build
15
+
16
+ ## Goal
17
+ Landing page at /index.html: hero, 3-feature grid, signup form. Mobile-responsive at 375px and 1440px widths.
18
+
19
+ ## Verification
20
+ ```
21
+ npm run build && npm run lint && npx playwright test tests/landing.spec.ts
22
+ ```
23
+ (visual check folded in: Lighthouse mobile score >= 90, captured via `npx lighthouse --output=json`)
24
+
25
+ ## Termination
26
+ - Success: all commands above exit 0 AND Lighthouse score >= 90
27
+ - Max iterations: 8
28
+ - No-progress: stop if 2 consecutive iterations produce identical result + unchanged working tree
29
+ - Budget: none
30
+
31
+ ## Scope
32
+ - Allowed: /src, /tests, /public
33
+ - Forbidden: do not edit playwright config to skip failing assertions; do not touch /api
34
+
35
+ ## Escalation
36
+ On cap: write attempts.md summarizing each iteration's score + diff, stop, wait for human review.
37
+ ```
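The spec's success condition depends on a Lighthouse score, so that score has to become an exit code for the loop to act on it. A minimal sketch follows, assuming the JSON report exposes `categories.<id>.score` as a number between 0 and 1. The function name is hypothetical, and checking the `performance` category is an assumption, since the spec above does not say which category "mobile score" refers to.

```javascript
// Hypothetical gate: turns a parsed Lighthouse JSON report into a pass/fail exit code.
function lighthouseExitCode(report, categoryId, minScore) {
  const category = report.categories && report.categories[categoryId];
  // A missing or null score counts as failure rather than a silent pass.
  if (!category || typeof category.score !== "number") return 1;
  // Scores are reported on a 0..1 scale; convert to the familiar 0..100.
  const score = Math.round(category.score * 100);
  return score >= minScore ? 0 : 1;
}

// Inline sample data for illustration; a real run would parse the report file.
const sampleReport = { categories: { performance: { score: 0.93 } } };
console.log(lighthouseExitCode(sampleReport, "performance", 90)); // 0 (pass)
```

Treating an absent score as a failure keeps a broken Lighthouse run from being mistaken for success.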
38
+
39
+ ## Example 2 — Bug fix / refactor
40
+
41
+ **Weak:**
42
+ > "Fix the flaky tests."
43
+
44
+ **Loop-ready:**
45
+ ```markdown
46
+ # Loop: stabilize integration tests
47
+
48
+ ## Goal
49
+ All tests in /tests/integration pass on 5 consecutive runs (currently ~30% flake rate).
50
+
51
+ ## Verification
52
+ ```
53
+ for i in 1 2 3 4 5; do npm test -- tests/integration || exit 1; done
54
+ ```
55
+
56
+ ## Termination
57
+ - Success: loop above exits 0
58
+ - Max iterations: 10
59
+ - No-progress: stop if same test name fails with the same error message 2 iterations in a row with no code change
60
+ - Budget: none
61
+
62
+ ## Scope
63
+ - Allowed: /src, /tests/integration
64
+ - Forbidden: do not increase timeouts as the only fix; do not mark tests as skip/pending to "pass"
65
+
66
+ ## Escalation
67
+ On cap: report which test(s) still flake + last 3 stack traces, stop.
68
+ ```
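The verification loop in this spec stops at the first failing run. The sketch below mirrors that logic and adds a rough estimate of how often an unfixed suite would still pass five times in a row. Both function names are hypothetical, `runOnce` stands in for actually spawning the test command, and the estimate assumes the ~30% flake rate applies per full run and that runs are independent.

```javascript
// Hypothetical helper mirroring the shell loop: stop at the first failing run.
// `runOnce` stands in for spawning `npm test -- tests/integration`.
function passesConsecutively(runOnce, times) {
  for (let run = 1; run <= times; run++) {
    if (!runOnce(run)) return { passed: false, failedOnRun: run };
  }
  return { passed: true, failedOnRun: null };
}

// Chance that an unfixed suite slips through, assuming independent runs.
function falsePassProbability(flakeRate, times) {
  return Math.pow(1 - flakeRate, times);
}

console.log(passesConsecutively((run) => run !== 3, 5)); // fails on the third run
console.log(falsePassProbability(0.3, 5).toFixed(3)); // 0.168
```

Under those assumptions, five consecutive passes would still occur by luck roughly one time in six, which is one reason to pair the check with a no-progress rule and a hard cap rather than rely on it alone.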
69
+
70
+ ## Pattern to notice
71
+
72
+ Both conversions turn a subjective description ("looks good," "flaky tests") into something a shell command returns pass/fail on. If you can't find such a command, you're stuck with LLM-as-judge. Say so explicitly in the Verification section rather than hiding it, and expect `lint_spec.mjs` to accept it only because it was named, not because it's as trustworthy as a real test command.
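The distinction between a real command, a named LLM judge, and nothing at all can be sketched as a small classifier. This is a hypothetical heuristic, not the rule set `lint_spec.mjs` actually applies: it assumes a command appears inside a fenced block in the Verification section and that an LLM judge is flagged with a phrase such as "LLM-judge" or "LLM-as-judge".

```javascript
// Hypothetical heuristic; the real lint_spec.mjs rules may differ.
const FENCE = "`".repeat(3); // built at runtime to keep this example's own fence intact

function classifyVerification(sectionText) {
  const parts = sectionText.split(FENCE);
  // Odd-indexed parts sit between an opening and a closing fence.
  const hasCommand = parts.some((part, i) => {
    if (i % 2 === 0) return false;
    // Drop the first line, which holds only the optional language tag.
    return part.split("\n").slice(1).join("\n").trim() !== "";
  });
  if (hasCommand) return "command";
  if (/llm[- ]?(as[- ]?)?judge/i.test(sectionText)) return "llm-judge";
  return "missing";
}

console.log(classifyVerification(FENCE + "\nnpm test\n" + FENCE)); // command
console.log(classifyVerification("LLM-as-judge: rubric below, weakest link.")); // llm-judge
console.log(classifyVerification("It should look good.")); // missing
```

Keeping "llm-judge" as its own result rather than folding it into "command" preserves the point made above: a named judge is accepted, but it is not treated as equivalent to a test command.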
@@ -0,0 +1,31 @@
1
+ # `harden` — fix an existing weak spec
2
+
3
+ Triggered by: `/loop harden [path]`, phrases such as "this spec isn't good enough" or "fix my loop spec", or any intent that clearly means improving an existing spec rather than writing a new one.
4
+
5
+ ## Flow
6
+
7
+ 1. **Run the linter first, always.** Do not read the spec and guess what's wrong — run it:
8
+
9
+ ```bash
10
+ node <skill-dir>/scripts/lint_spec.mjs LOOP_SPEC.md
11
+ ```
12
+
13
+ (or whatever path the user gave). This produces a specific, itemized list of what's missing or weak. That list is your TODO list — not your own read of the document.
14
+
15
+ 2. **Fix exactly what's flagged, in the order reported.** Common fixes:
16
+ - Missing section header → add it with real content, not a placeholder.
17
+ - Placeholder text still present (`<...>`, `TBD`, `...`) → replace with the actual concrete answer; ask the user if you don't know it.
18
+ - Verification has no command and doesn't flag LLM-judge → either find the actual test/build/lint command for this project (check `package.json` scripts, a Makefile, CI config) or explicitly write the LLM-judge rubric and flag it as the weakest link.
19
+ - Termination missing success/cap/no-progress → add whichever is missing. Don't just add a stub; make it match the actual goal (e.g. a cap of 3 for a destructive task, not the generic default of 8).
20
+ - Scope has no forbidden action → ask the user for at least one real boundary; don't invent a generic one if you can ask.
21
+ - Escalation doesn't describe a stop behavior → default to "stop, summarize, wait for human" unless told otherwise.
22
+
23
+ 3. **Re-run the linter after every fix pass.** Don't fix all 5 issues blind and assume it's clean — re-run:
24
+
25
+ ```bash
26
+ node <skill-dir>/scripts/lint_spec.mjs LOOP_SPEC.md
27
+ ```
28
+
29
+ Iterate until it exits 0. Report the before/after issue list to the user so they can see what changed.
30
+
31
+ 4. If the user provided a goal description instead of a path (no existing file), this is actually a `new` request — load `reference/new.md` instead.
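Two parts of the flow above lend themselves to a short sketch: spotting leftover placeholder text, and reporting the before/after issue list. Both helpers are hypothetical and are not taken from `lint_spec.mjs`. The placeholder pattern assumes `<...>` means any angle-bracketed token, and the issue lists are assumed to be plain strings, one per linter finding.

```javascript
// Hypothetical pattern for the placeholders named above: <...>, TBD, and "...".
const PLACEHOLDER = /<[^<>\n]+>|\bTBD\b|\.\.\./;

// Return the 1-based line numbers (with text) that still contain a placeholder.
function findPlaceholderLines(specText) {
  return specText
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => PLACEHOLDER.test(text));
}

// Compare linter findings from before and after a fix pass.
function compareIssueLists(before, after) {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  return {
    fixed: before.filter((issue) => !afterSet.has(issue)),
    remaining: after.filter((issue) => beforeSet.has(issue)),
    introduced: after.filter((issue) => !beforeSet.has(issue)),
  };
}

const draft = "## Goal\nTBD\n## Scope\n- Forbidden: <fill in>";
console.log(findPlaceholderLines(draft).map((hit) => hit.line)); // [ 2, 4 ]
console.log(compareIssueLists(["missing Scope", "missing Escalation"], ["missing Escalation"]));
```

Tracking `introduced` separately makes it visible when a fix pass trades one finding for another, which is the case re-running the linter after every pass is meant to catch.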