paperthin 0.7.1 → 0.8.0
- package/.claude-plugin/plugin.json +10 -8
- package/.github/workflows/release.yml +16 -5
- package/CLAUDE.md +16 -22
- package/README.md +86 -37
- package/assets/banner.svg +4 -4
- package/assets/map.svg +4 -3
- package/assets/thumbnail.svg +3 -3
- package/package.json +12 -12
- package/skills/coil/flywheel/SKILL.md +10 -21
- package/skills/coil/nba/SKILL.md +8 -20
- package/skills/coil/retro/SKILL.md +9 -21
- package/skills/coil/scratch/SKILL.md +6 -16
- package/skills/depth/autobahn/SKILL.md +40 -0
- package/skills/depth/dedash/SKILL.md +43 -0
- package/skills/depth/re0/SKILL.md +7 -9
- package/skills/depth/sip/SKILL.md +9 -7
package/.claude-plugin/plugin.json
CHANGED
|
@@ -2,18 +2,20 @@
|
|
|
2
2
|
"name": "paperthin",
|
|
3
3
|
"description": "Plain-Markdown skills that turn old engineering wisdom into reflexes your agent reaches for on its own — on any agent.",
|
|
4
4
|
"skills": [
|
|
5
|
+
"./skills/breadth/ssotchk",
|
|
6
|
+
"./skills/breadth/ssotize",
|
|
7
|
+
"./skills/coil/flywheel",
|
|
8
|
+
"./skills/coil/nba",
|
|
9
|
+
"./skills/coil/retro",
|
|
10
|
+
"./skills/coil/scratch",
|
|
11
|
+
"./skills/depth/autobahn",
|
|
12
|
+
"./skills/depth/dedash",
|
|
5
13
|
"./skills/depth/factchk",
|
|
14
|
+
"./skills/depth/hate",
|
|
6
15
|
"./skills/depth/mandela",
|
|
7
16
|
"./skills/depth/re0",
|
|
8
17
|
"./skills/depth/re0-git",
|
|
9
|
-
"./skills/depth/hate",
|
|
10
18
|
"./skills/depth/shower",
|
|
11
|
-
"./skills/depth/sip"
|
|
12
|
-
"./skills/breadth/ssotchk",
|
|
13
|
-
"./skills/breadth/ssotize",
|
|
14
|
-
"./skills/coil/retro",
|
|
15
|
-
"./skills/coil/scratch",
|
|
16
|
-
"./skills/coil/flywheel",
|
|
17
|
-
"./skills/coil/nba"
|
|
19
|
+
"./skills/depth/sip"
|
|
18
20
|
]
|
|
19
21
|
}
|
package/.github/workflows/release.yml
CHANGED
|
@@ -13,8 +13,6 @@ jobs:
|
|
|
13
13
|
runs-on: ubuntu-latest
|
|
14
14
|
steps:
|
|
15
15
|
- uses: actions/checkout@v4
|
|
16
|
-
with:
|
|
17
|
-
fetch-depth: 0 # fetch the annotated tag's message for the release notes
|
|
18
16
|
|
|
19
17
|
- uses: actions/setup-node@v4
|
|
20
18
|
with:
|
|
@@ -31,14 +29,27 @@ jobs:
|
|
|
31
29
|
test "$PKG" = "$TAG" || { echo "tag $TAG != package.json $PKG"; exit 1; }
|
|
32
30
|
|
|
33
31
|
- name: Publish to npm (public, with provenance)
|
|
34
|
-
run: npm publish --provenance --access public
|
|
35
32
|
env:
|
|
36
33
|
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
|
|
34
|
+
run: |
|
|
35
|
+
VERSION=$(node -p "require('./package.json').version")
|
|
36
|
+
if npm view "paperthin@${VERSION}" version >/dev/null 2>&1; then
|
|
37
|
+
echo "paperthin@${VERSION} is already published; skipping"
|
|
38
|
+
else
|
|
39
|
+
npm publish --provenance --access public
|
|
40
|
+
fi
|
|
37
41
|
|
|
38
42
|
- name: Create the GitHub Release
|
|
39
43
|
env:
|
|
40
44
|
GH_TOKEN: ${{ github.token }}
|
|
41
45
|
run: |
|
|
42
|
-
|
|
46
|
+
# Read the annotated tag's message from the API: a CI checkout does not carry the
|
|
47
|
+
# tag object, so local git falls back to the commit message.
|
|
48
|
+
SHA=$(gh api "repos/${GITHUB_REPOSITORY}/git/ref/tags/${GITHUB_REF_NAME}" --jq '.object.sha')
|
|
49
|
+
gh api "repos/${GITHUB_REPOSITORY}/git/tags/${SHA}" --jq '.message' \
|
|
43
50
|
| sed '/-----BEGIN PGP SIGNATURE-----/,/-----END PGP SIGNATURE-----/d' > "$RUNNER_TEMP/notes.md"
|
|
44
|
-
gh release
|
|
51
|
+
if gh release view "$GITHUB_REF_NAME" >/dev/null 2>&1; then
|
|
52
|
+
gh release edit "$GITHUB_REF_NAME" --title "Paperthin ${GITHUB_REF_NAME#v}" --notes-file "$RUNNER_TEMP/notes.md"
|
|
53
|
+
else
|
|
54
|
+
gh release create "$GITHUB_REF_NAME" --title "Paperthin ${GITHUB_REF_NAME#v}" --notes-file "$RUNNER_TEMP/notes.md"
|
|
55
|
+
fi
|
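The release-notes step above reads the annotated tag's message and deletes the PGP signature block with a `sed` range before handing the notes to `gh release`. As a minimal sketch of that text transformation only, assuming nothing about the repository beyond the two marker strings shown in the workflow, the same range deletion and the `${GITHUB_REF_NAME#v}` title expansion could look like this in Python. The function names and the sample tag message are hypothetical, introduced for illustration.

```python
BEGIN = "-----BEGIN PGP SIGNATURE-----"
END = "-----END PGP SIGNATURE-----"


def strip_pgp_signature(message: str) -> str:
    """Drop every line from the BEGIN marker through the END marker, inclusive.

    Mirrors a sed range delete: the END marker is only looked for on lines
    after the BEGIN line, and an unterminated block is dropped to the end.
    """
    kept = []
    inside = False
    for line in message.splitlines():
        if not inside and BEGIN in line:
            inside = True  # start of the signature block: drop this line
            continue
        if inside:
            if END in line:
                inside = False  # last line of the block: drop it too
            continue
        kept.append(line)
    return "\n".join(kept)


def release_title(tag: str) -> str:
    """Remove a single leading 'v' from the tag, like ${GITHUB_REF_NAME#v}."""
    version = tag[1:] if tag.startswith("v") else tag
    return "Paperthin " + version


# Hypothetical tag message, for illustration only.
sample = (
    "Paperthin 0.8.0\n\n- add autobahn\n"
    + BEGIN + "\n\nabc123\n" + END + "\n"
)
notes = strip_pgp_signature(sample)
title = release_title("v0.8.0")
```

The sketch returns the notes without a trailing newline, which differs slightly from what `sed` writes to `notes.md`; that detail does not affect the lines kept.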
package/CLAUDE.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
|
-
#
|
|
1
|
+
# Paperthin
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
Paperthin is an agent-agnostic suite of plain-Markdown skills that keep an artifact **clean and true** — hygiene reflexes an agent reaches for on its own. This is the guide for authoring them — and itself a skill artifact, so `re0` it when it drifts. Every skill named below is defined in the shipped catalog, the [README](./README.md).
|
|
4
4
|
|
|
5
5
|
## Philosophy
|
|
6
6
|
|
|
@@ -21,28 +21,19 @@ That cut is two orthogonal axes, **cardinality × time**:
|
|
|
21
21
|
|
|
22
22
|
```text
|
|
23
23
|
skills/
|
|
24
|
-
├──
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
├── breadth/ many artifacts · now
|
|
29
|
-
│ reconcile one truth across files and platforms
|
|
30
|
-
│ ssotchk · ssotize
|
|
31
|
-
│
|
|
32
|
-
├── coil/ one project · across iterations
|
|
33
|
-
│ carry learning between build cycles
|
|
34
|
-
│ retro · scratch · flywheel · nba
|
|
35
|
-
│
|
|
36
|
-
└── mesh/ many minds · across rounds
|
|
37
|
-
converge independent views into consensus
|
|
38
|
-
(reserved · v0.8)
|
|
24
|
+
├── breadth/ reconcile one truth across files and platforms
|
|
25
|
+
├── coil/ carry learning between build cycles
|
|
26
|
+
├── depth/ refine or verify the thing in hand
|
|
27
|
+
└── mesh/ converge independent views into consensus (reserved)
|
|
39
28
|
```
|
|
40
29
|
|
|
30
|
+
The axes' quadrants and each skill's home are the README's facts — its map and index; this file defines only the cut.
|
|
31
|
+
|
|
41
32
|
**File by trigger-scope, not by what a skill invokes.** A skill lives where the work it *triggers or emits* ranges, even when it orchestrates skills from other folders — `sip` gates one finished deliverable, so it's `depth/`, though its check runs a cross-file `ssotchk`.
|
|
42
33
|
|
|
43
34
|
Reach for `breadth/` to *establish* order (legacy refactor, knowledge-base build, fresh scaffolding); once a fact is cleanly SSOT'd, *maintain* it with `re0` rather than re-consolidating. Keep drafts and retired skills out of the README and `plugin.json`.
|
|
44
35
|
|
|
45
|
-
**Name it for the reflex it fires** — a plain real word (`shower`, `sip`) or a tight compression of a real term (`re0`, `ssotchk`, `ssotize`); a stranger should half-guess what it does from the name alone, so no opaque coinage. A self-evident metaphor-noun is allowed only as a deliberate exception, when its own intuition carries it.
|
|
36
|
+
**Name it for the reflex it fires** — a plain real word (`shower`, `sip`) or a tight compression of a real term (`re0`, `ssotchk`, `ssotize`); a stranger should half-guess what it does from the name alone, so no opaque coinage. A self-evident metaphor-noun is allowed only as a deliberate exception, when its own intuition carries it — `autobahn` is the standing example. Never model-brand a name; the mechanism must outlive any one model.
|
|
46
37
|
|
|
47
38
|
## SKILL.md format
|
|
48
39
|
|
|
@@ -65,13 +56,13 @@ description: "<trigger-rich one-liner>"
|
|
|
65
56
|
|
|
66
57
|
Every `SKILL.md` is either user-invoked (`disable-model-invocation: true`, reachable only by the human) or model-invoked (model- or user-reachable). For the full definitions, description conventions, and why a user-invoked skill can invoke model-invoked skills but never another user-invoked one, see [docs/invocation.md](./docs/invocation.md).
|
|
67
58
|
|
|
68
|
-
Default to model-invoked. Make a skill user-invoked only when the model should never reach it on its own — its trigger is a deliberate, human-decided action (commit, push, publish, deploy), or its mere presence in reach would bias the agent toward one. A user-invoked skill can't be composed, so it also stays out of `sip`.
|
|
59
|
+
Default to model-invoked. Make a skill user-invoked only when the model should never reach it on its own — its trigger is a deliberate, human-decided action (commit, push, publish, deploy), or its mere presence in reach would bias the agent toward one. A user-invoked skill can't be composed, so it also stays out of `sip`. Three qualify today: `re0-git`, because it cleans a commit's message and committing is human-decided; `hate`, because a hate-it reflex always in the agent's reach would bias it toward demolition; and `dedash`, because the user owns the exact prose scope. `autobahn` remains model-invoked because the model should autonomously carve risk-adjacent scope before execution.
|
|
69
60
|
|
|
70
61
|
## Conventions
|
|
71
62
|
|
|
72
63
|
Shared rules a skill references rather than restates:
|
|
73
64
|
|
|
74
|
-
- **edit-safety** (any mutating skill — `re0`, `ssotize`, `scratch`): a find-replace that could match nothing must **assert its target exists** (report a MISS, never silently no-op); byte-level tools corrupt multibyte text, so mutate with a **unicode-safe** pass (`PYTHONUTF8=1`, stdout reconfigured to UTF-8); never **blanket-replace a single target whose right replacement is positional**
|
|
65
|
+
- **edit-safety** (any mutating skill — `re0`, `ssotize`, `scratch`): a find-replace that could match nothing must **assert its target exists** (report a MISS, never silently no-op); byte-level tools corrupt multibyte text, so mutate with a **unicode-safe** pass (`PYTHONUTF8=1`, stdout reconfigured to UTF-8); never **blanket-replace a single target whose right replacement is positional** — when the same token maps to different replacements per location, decide per occurrence; and make large *structural* moves with a script, not by retyping.
|
|
75
66
|
- **negatives-as-corpus**: "cut" means **move-to-archive, never delete** — pruned and failed branches are assets, not waste.
|
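The edit-safety rule above can be sketched as a small helper. This is an illustration under stated assumptions, not the skills' actual implementation: the function name `replace_or_miss` and the `AMBIGUOUS` status are hypothetical, and only the `MISS` report comes from the convention itself. Working on `str` rather than bytes keeps the pass unicode-safe.

```python
def replace_or_miss(text: str, target: str, replacement: str) -> tuple[str, str]:
    """Replace `target` only when it occurs exactly once.

    Returns (new_text, status). The text is returned unchanged when the
    target is missing or when it occurs more than once, so a caller can
    never mistake a silent no-op or a blanket replace for a clean edit.
    """
    count = text.count(target)
    if count == 0:
        # Assert the target exists: report a MISS instead of silently no-op'ing.
        return text, f"MISS: {target!r} not found"
    if count > 1:
        # Hypothetical status: the right replacement may differ per location.
        return text, f"AMBIGUOUS: {target!r} occurs {count} times; decide per occurrence"
    return text.replace(target, replacement), "OK"


# Multibyte text survives because the edit operates on str, not bytes.
edited, status = replace_or_miss("café timeout: 30s", "30s", "45s")
```

A caller would treat any status other than `OK` as a reason to stop and report, rather than continuing with the unchanged text.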
|
76
67
|
|
|
77
68
|
## Shipping
|
|
@@ -79,10 +70,13 @@ Shared rules a skill references rather than restates:
|
|
|
79
70
|
Before committing:
|
|
80
71
|
|
|
81
72
|
1. **SKILL.md** follows the anatomy above.
|
|
82
|
-
2. **README** lists it — grouped
|
|
73
|
+
2. **README** lists it — grouped by perspective, invocation marked in the `Invoker` column, each linked to its `SKILL.md`.
|
|
83
74
|
3. **plugin.json** registers its path.
|
|
84
75
|
4. **package.json** bumps version (new skill = minor; fix/docs = patch); `keywords` stay grouped logically.
|
|
85
|
-
5.
|
|
76
|
+
5. **Refresh the skills it couples to.** Skills relate by naming each other, and those names form a graph with several edge kinds: an orchestrator that runs it (`sip`), a check that points to its remedy (`ssotchk` → `ssotize`), a skill that names it as adjacent scope (`factchk` → `mandela`/`hate`), a producer whose output feeds a consumer (`autobahn`'s ledger → `retro`). Walk the edges both into and out of the changed skill and ask two questions, not one: which existing reference drifted, and which coupling *should* exist but never got wired. Missing edges are as real as stale ones and hide better. One discriminator keeps the hunt honest: an absent edge to a user-invoked skill (`hate`, `dedash`, `re0-git`) is correct, not a gap — they can't be composed, so name their role, not the skill (`flywheel` names its adversarial phase, never `hate`). A name-coupling is a standing relationship, not one-time registration; left frozen, it silently goes out of date.
|
|
77
|
+
6. Run **`sip`** — it tastes the change with the repo's own clean-and-true checks before you ship.
|
|
78
|
+
|
|
79
|
+
A new skill earns a full-catalog sweep, not just its own edges: its inbound couplings (which existing skills should now name it) live in the *other* files and are invisible from the new one, so read the catalog end to end when you add a node — a patch only needs the touched skill's edges.
|
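The edge walk in step 5 can be pictured as a lookup over a small directed graph. The sketch below is hypothetical: the `EDGES` list encodes only the couplings named in this file (`sip` running `ssotchk`, `ssotchk` → `ssotize`, `factchk` → `mandela`/`hate`, `autobahn` → `retro`), and the edge-kind labels and function name are illustrative, not part of the package.

```python
# (source, destination, kind) triples, taken from the couplings named above.
EDGES = [
    ("sip", "ssotchk", "orchestrator runs"),
    ("ssotchk", "ssotize", "check -> remedy"),
    ("factchk", "mandela", "adjacent scope"),
    ("factchk", "hate", "adjacent scope"),
    ("autobahn", "retro", "producer -> consumer"),
]


def edges_touching(skill: str, edges=EDGES):
    """Return (inbound, outbound) couplings for one changed skill."""
    inbound = [(src, kind) for src, dst, kind in edges if dst == skill]
    outbound = [(dst, kind) for src, dst, kind in edges if src == skill]
    return inbound, outbound


# Walking both directions for a changed skill lists every reference to refresh.
inbound, outbound = edges_touching("ssotchk")
```

A lookup like this only surfaces edges that are already recorded; the second question in step 5, which coupling should exist but was never wired, still needs the end-to-end read of the catalog.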
|
86
80
|
|
|
87
81
|
Commit messages: Conventional Commits, one bullet per real change — no padding, no mechanical trivia, no co-author tags. `re0-git` cleans a drifted message back to its essence, in the author's own style.
|
|
88
82
|
|
package/README.md
CHANGED
|
@@ -1,12 +1,12 @@
|
|
|
1
1
|
<div align="center">
|
|
2
2
|
|
|
3
|
-
#
|
|
3
|
+
# Paperthin
|
|
4
4
|
|
|
5
|
-
<img src="https://raw.githubusercontent.com/LilMGenius/paperthin/main/assets/banner.svg" alt="
|
|
5
|
+
<img src="https://raw.githubusercontent.com/LilMGenius/paperthin/main/assets/banner.svg" alt="Paperthin — Trust the artifact, not the author." width="820">
|
|
6
6
|
|
|
7
7
|
Plain-Markdown skills that turn old engineering wisdom into reflexes your agent reaches for on its own — on any agent: Claude Code, Codex, Cursor, Antigravity, Grok-Build, Hermes, OpenClaw, Pi, etc.
|
|
8
8
|
|
|
9
|
-
[Quickstart](#quickstart-15-seconds) · [
|
|
9
|
+
[Quickstart](#quickstart-15-seconds) · [The Map](#the-map) · [The Index](#the-index) · [The Problem](#the-problem) · [The Fixes](#the-fixes) · [Credits](#credits)
|
|
10
10
|
|
|
11
11
|
</div>
|
|
12
12
|
|
|
@@ -23,39 +23,52 @@ Plain-Markdown skills that turn old engineering wisdom into reflexes your agent
|
|
|
23
23
|
|
|
24
24
|
**Not sure?** Paste that command into whatever agent you're using and just say "set this up for me" — it'll do the rest.
|
|
25
25
|
|
|
26
|
-
## Index
|
|
27
|
-
|
|
28
|
-
### Model-invoked — your agent reaches for these on its own
|
|
29
|
-
| Skill | Scope | What it does |
|
|
30
|
-
|---|---|---|
|
|
31
|
-
| ♻️ **[re0](./skills/depth/re0/SKILL.md)** | one artifact | Rewrite a drifted artifact into a clean v0 — not another patch |
|
|
32
|
-
| 🚿 **[shower](./skills/depth/shower/SKILL.md)** | one artifact | Cold-read it with fresh, zero-context eyes — does it stand on its own? *(read-only)* |
|
|
33
|
-
| 🔬 **[factchk](./skills/depth/factchk/SKILL.md)** | one claim | Verify a reality-grounded claim against sources, both directions — could the absurd be real, the obvious false? *(read-only → fix)* |
|
|
34
|
-
| 🧪 **[mandela](./skills/depth/mandela/SKILL.md)** | one eval | Audit a validation for leakage — does outside ground-truth actually enter? Walks 8 patterns *(read-only)* |
|
|
35
|
-
| 🔎 **[ssotchk](./skills/breadth/ssotchk/SKILL.md)** | many artifacts | Find where one fact is scattered or duplicated; name the canonical source *(read-only)* |
|
|
36
|
-
| 🧲 **[ssotize](./skills/breadth/ssotize/SKILL.md)** | many artifacts | Consolidate it into one home and point the rest at it |
|
|
37
|
-
| 🧭 **[retro](./skills/coil/retro/SKILL.md)** | one project, across iterations | Extract lessons, anti-patterns, gates, and vocabulary from a finished or failed cycle |
|
|
38
|
-
| 🧱 **[scratch](./skills/coil/scratch/SKILL.md)** | one project, across iterations | Restart from v0 while preserving only proven lessons, contracts, gates, and negative corpus |
|
|
39
|
-
| 🌀 **[flywheel](./skills/coil/flywheel/SKILL.md)** | one project, across iterations | Run build → QA → retro → scratch cycles without mistaking code accumulation for learning |
|
|
40
|
-
| 🎯 **[nba](./skills/coil/nba/SKILL.md)** | one project, across iterations | Read the live cycle state and return the single next best action, not a menu *(read-only)* |
|
|
41
|
-
| 🥄 **[sip](./skills/depth/sip/SKILL.md)** | your output | After any change, auto-runs `shower` + `ssotchk` + `re0` on it |
|
|
42
|
-
|
|
43
|
-
### User-invoked — you run these yourself
|
|
44
|
-
| Skill | Scope | What it does |
|
|
45
|
-
|---|---|---|
|
|
46
|
-
| 🧾 **[re0-git](./skills/depth/re0-git/SKILL.md)** | one commit | Rewrite a finished commit's message into a clean v0 so `git log` alone hands off |
|
|
47
|
-
| 😈 **[hate](./skills/depth/hate/SKILL.md)** | one plan | Refuse to be nice to it — the one objection that could kill it + the cheapest test |
|
|
48
|
-
|
|
49
26
|
## The Map
|
|
50
27
|
|
|
51
28
|
**How many artifacts, and across how much time?**
|
|
52
29
|
|
|
53
|
-
Two axes — **cardinality × time** — carve four regions.
|
|
54
|
-
|
|
55
30
|
<div align="center">
|
|
56
|
-
<img src="https://raw.githubusercontent.com/LilMGenius/paperthin/main/assets/map.svg" alt="The
|
|
31
|
+
<img src="https://raw.githubusercontent.com/LilMGenius/paperthin/main/assets/map.svg" alt="The Paperthin map by LilMGenius/paperthin: a two-by-two matrix. Horizontal axis cardinality (one, then many); vertical axis time (now, then across iterations); four regions. Top-left, depth: one artifact, now; is this one thing clean and true? Top-right, breadth: many artifacts, now; is one truth consistent everywhere? Bottom-left, coil: one project, across iterations; did each pass teach the next? Bottom-right, mesh: many minds, across rounds; does the crowd converge on truth?" width="820">
|
|
57
32
|
</div>
|
|
58
33
|
|
|
34
|
+
## The Index
|
|
35
|
+
|
|
36
|
+
### `depth/`
|
|
37
|
+
|
|
38
|
+
| Skill | What it does | Scope | Invoker |
|
|
39
|
+
|---|---|---|---|
|
|
40
|
+
| ♻️ **[re0](./skills/depth/re0/SKILL.md)** | Rewrite a drifted artifact into a clean v0 — not another patch | one artifact | model |
|
|
41
|
+
| 🚿 **[shower](./skills/depth/shower/SKILL.md)** | Cold-read it with fresh, zero-context eyes — does it stand on its own? *(read-only)* | one artifact | model |
|
|
42
|
+
| 🔬 **[factchk](./skills/depth/factchk/SKILL.md)** | Verify a reality-grounded claim against sources, both directions — could the absurd be real, the obvious false? *(read-only → fix)* | one claim | model |
|
|
43
|
+
| 🧪 **[mandela](./skills/depth/mandela/SKILL.md)** | Audit a validation for leakage — does outside ground-truth actually enter? Walks 8 patterns *(read-only)* | one eval | model |
|
|
44
|
+
| 🛣️ **[autobahn](./skills/depth/autobahn/SKILL.md)** | Carve unsafe scope before execution, preserve the safe work's ambition, and make the descope ledger visible | one task | model |
|
|
45
|
+
| 🥄 **[sip](./skills/depth/sip/SKILL.md)** | After any change, tastes it with the repo's own clean-and-true checks | your output | model |
|
|
46
|
+
| 😈 **[hate](./skills/depth/hate/SKILL.md)** | Refuse to be nice to it — the one objection that could kill it + the cheapest test | one plan | user |
|
|
47
|
+
| ✂️ **[dedash](./skills/depth/dedash/SKILL.md)** | Remove em-dashes and the dashes standing in for them, one occurrence at a time, choosing the punctuation or wording each context needs | your prose | user |
|
|
48
|
+
| 🧾 **[re0-git](./skills/depth/re0-git/SKILL.md)** | Rewrite a finished commit's message into a clean v0 so `git log` alone hands off | one commit | user |
|
|
49
|
+
|
|
50
|
+
### `breadth/`
|
|
51
|
+
|
|
52
|
+
| Skill | What it does | Scope | Invoker |
|
|
53
|
+
|---|---|---|---|
|
|
54
|
+
| 🔎 **[ssotchk](./skills/breadth/ssotchk/SKILL.md)** | Find where one fact is scattered or duplicated; name the canonical source *(read-only)* | one fact, many places | model |
|
|
55
|
+
| 🧲 **[ssotize](./skills/breadth/ssotize/SKILL.md)** | Consolidate it into one home and point the rest at it | one fact, many places | model |
|
|
56
|
+
|
|
57
|
+
### `coil/`
|
|
58
|
+
|
|
59
|
+
| Skill | What it does | Scope | Invoker |
|
|
60
|
+
|---|---|---|---|
|
|
61
|
+
| 🧭 **[retro](./skills/coil/retro/SKILL.md)** | Extract lessons, anti-patterns, gates, and vocabulary from a finished or failed cycle | one finished cycle | model |
|
|
62
|
+
| 🧱 **[scratch](./skills/coil/scratch/SKILL.md)** | Restart from v0 while preserving only proven lessons, contracts, gates, and negative corpus | one restart | model |
|
|
63
|
+
| 🌀 **[flywheel](./skills/coil/flywheel/SKILL.md)** | Run build → QA → retro → scratch cycles without mistaking code accumulation for learning | the whole loop | model |
|
|
64
|
+
| 🎯 **[nba](./skills/coil/nba/SKILL.md)** | Read the live cycle state and return the single next best action, not a menu *(read-only)* | the live cycle | model |
|
|
65
|
+
|
|
66
|
+
### `mesh/`
|
|
67
|
+
|
|
68
|
+
*In development — converge independent views into consensus.*
|
|
69
|
+
|
|
70
|
+
*More on invocation: [docs/invocation.md](./docs/invocation.md)*
|
|
71
|
+
|
|
59
72
|
## The Problem
|
|
60
73
|
|
|
61
74
|
**Most agent skills are slop.**
|
|
@@ -71,6 +84,8 @@ These skills bet the other way — **every one of them removes:**
|
|
|
71
84
|
- `ssotchk` / `ssotize` collapse the same fact scattered across files,
|
|
72
85
|
- `shower` cuts whatever a stranger can't follow,
|
|
73
86
|
- `retro` / `scratch` preserve the lesson and let the wrong build die,
|
|
87
|
+
- `autobahn` carves unsafe scope out up front, so the safe remainder runs at full speed,
|
|
88
|
+
- `dedash` removes even the em-dash tell and its look-alikes, one judged occurrence at a time,
|
|
74
89
|
- `sip` runs all of it on your own output, automatically.
|
|
75
90
|
|
|
76
91
|
> [!TIP]
|
|
@@ -78,7 +93,7 @@ These skills bet the other way — **every one of them removes:**
|
|
|
78
93
|
|
|
79
94
|
## The Fixes
|
|
80
95
|
|
|
81
|
-
<!-- Fixes follow the lifecycle of a piece of work, not the skill list:
|
|
96
|
+
<!-- Fixes follow the lifecycle of a piece of work, not the skill list: keep it clean (draft → read fresh → reconcile across files → automate → ship), keep it true (a single claim → a validation → the whole plan), then keep the loop learning and the run unblocked. Slot any new fix in by where it acts in that arc. -->
|
|
82
97
|
|
|
83
98
|
**Each is a well-worn principle, made automatic.**
|
|
84
99
|
|
|
@@ -125,7 +140,7 @@ A timeout value, a decision, a status — copied into a README, a doc, a ticket,
|
|
|
125
140
|
### #4 — "Remember to verify" never fires
|
|
126
141
|
A guideline buried in docs won't trigger in a brand-new session — exactly when author bias is highest.
|
|
127
142
|
|
|
128
|
-
**The fix → `sip`:** the moment you finish something, it runs `shower`, `ssotchk`, and `
|
|
143
|
+
**The fix → `sip`:** the moment you finish something, it runs the clean checks (`shower`, `ssotchk`, `re0`) and, when there's a claim or an eval, the true ones (`factchk`, `mandela`) on your output, automatically.
|
|
129
144
|
|
|
130
145
|
> *Prior art: [dogfooding](https://en.wikipedia.org/wiki/Eating_your_own_dog_food) — eat your own dog food (Microsoft, 1988). Taste your own cooking before you serve it.*
|
|
131
146
|
|
|
@@ -169,6 +184,14 @@ A model, a scorer, and a designer can all agree a result is real while no outsid
|
|
|
169
184
|
|
|
170
185
|
> *Prior art: [Goodhart's law](https://en.wikipedia.org/wiki/Goodhart%27s_law), [data leakage](https://dl.acm.org/doi/10.1145/2382577.2382579) (Kaufman et al., 2012), and [circular analysis](https://www.nature.com/articles/nn.2303) — "double dipping" (Kriegeskorte et al., 2009).*
|
|
171
186
|
|
|
187
|
+
<details>
|
|
188
|
+
<summary><b>[PROOF]</b></summary>
|
|
189
|
+
|
|
190
|
+
- **Setup** — the audit was distilled from one research design that kept dying to a single failure mode: a scorer, a model, and a designer agreeing on a result no outside truth ever produced.
|
|
191
|
+
- **Result** — leakage surfaced in eight distinct shapes in that one project — a scorer grading buckets it had drawn, two components "verifying" each other in a shared space, a private recipe that made the verifier the designer — and that catalog became the skill's 8-pattern taxonomy.
|
|
192
|
+
- **So** — the checklist isn't theoretical: every pattern in it already drew blood once.
|
|
193
|
+
</details>
|
|
194
|
+
|
|
172
195
|
### #8 — You can't kill your own plan
|
|
173
196
|
You built it, so you defend it. The questions that would break it are exactly the ones you won't ask.
|
|
174
197
|
|
|
@@ -176,15 +199,41 @@ You built it, so you defend it. The questions that would break it are exactly th
|
|
|
176
199
|
|
|
177
200
|
> *Prior art: [egoless programming](https://en.wikipedia.org/wiki/Egoless_programming) (Weinberg, 1971 — the same root `shower` cites), hostile review, and fail-fast.*
|
|
178
201
|
|
|
179
|
-
|
|
180
|
-
|
|
202
|
+
<details>
|
|
203
|
+
<summary><b>[PROOF]</b></summary>
|
|
204
|
+
|
|
205
|
+
- **Setup** — every research pass closed with an adversarial critic, and its verdict was always one root cause plus the cheapest test that would settle it, never a checklist.
|
|
206
|
+
- **Result** — it killed a recombination engine with "one more box drawn, not a sharper tip", and a human-holdout protocol on the numbers alone: n≈24 where 36 was needed, a family-wise error rate near 34%, and a design that cited a principle while implementing its opposite.
|
|
207
|
+
- **So** — the objection that mattered was always singular and cheap to test — exactly the `{root, first nail}` that `hate` is locked to return.
|
|
208
|
+
</details>
|
|
209
|
+
|
|
210
|
+
### #9 — A running build can still be the wrong product
|
|
211
|
+
Long agentic cycles produce many working parts — panels, routes, tests, screenshots — that prove activity more than value, and the sunk cost tempts you to carry the architecture forward. Then between passes the next move blurs into a dozen live threads at once, and too many options is its own paralysis.
|
|
212
|
+
|
|
213
|
+
**The fix → `retro` + `scratch` + `flywheel` + `nba`:** extract the lesson, anti-pattern, and next gate; restart from a clean v0 when the foundation is wrong; run the build → QA → retro → scratch loop; and when the thread is lost, read the state and return the single next best action. Keep only what earned reuse.
|
|
214
|
+
|
|
215
|
+
<details>
|
|
216
|
+
<summary><b>[PROOF]</b></summary>
|
|
217
|
+
|
|
218
|
+
- **Setup** — a game-engine demo reached a full-stack, runnable state: API routes, a canvas runtime, a leaderboard, arcade pages, remix and telemetry panels, tests, screenshots.
|
|
219
|
+
- **Result** — and it was still the wrong product — the generated games were mock, one-screen, with no durable replay layer — while every pass ended in "what now?" against a pile of unmet gates and parked threads.
|
|
220
|
+
- **So** — running and shipping-shaped is not done; the cycle needs a skill to name the missing gate and one to return the single next move.
|
|
221
|
+
</details>
|
|
222
|
+
|
|
223
|
+
### #10 — Risk-adjacent work comes back hedged
|
|
224
|
+
Point an agent at a task that brushes guardrails — scraping, licensing, privacy, security — and you get the worst of both worlds: the risky sliver triggers refusals and retries, while the safe 90% comes back hedged, diluted, or quietly missing.
|
|
181
225
|
|
|
182
|
-
**The fix → `
|
|
226
|
+
**The fix → `autobahn`:** carve guardrail-adjacent items out of scope before execution, each with a safe alternative and an archive entry; run the remaining scope at full strength; ship a descope ledger so every exclusion is a visible decision, not a silent gap. The autobahn has no speed limit *because* entry discipline is strict.
|
|
183
227
|
|
|
184
|
-
|
|
185
|
-
After a disappointing pass, the next move can blur into a dozen plausible threads: patch, rebuild, test more, write docs, add features. That is how projects accumulate surfaces instead of compounding.
|
|
228
|
+
> *Prior art, from this very summer: the US [suspended Fable 5 and Mythos 5](https://www.anthropic.com/news/fable-mythos-access) over one jailbroken safeguard (Anthropic, 2026), and OpenAI shipped [GPT-5.6](https://openai.com/index/previewing-gpt-5-6-sol/) safety-stack-first to trusted partners (OpenAI, 2026) — at the frontier, the fast lane stays open only as far as entry discipline holds.*
|
|
186
229
|
|
|
187
|
-
|
|
230
|
+
<details>
|
|
231
|
+
<summary><b>[PROOF]</b></summary>
|
|
232
|
+
|
|
233
|
+
- **Setup** — the method was lifted from a live rewrite of a confidential strategy doc that was risk-adjacent on four axes at once: stealth tooling, trademarked names, privacy-adjacent profiling, scraping gray zones.
|
|
234
|
+
- **Result** — a main loop plus ten subagents ran the frontier model end to end with zero flags, zero refusals, zero fallbacks — and every descoped item's safe alternative turned out to be the better product anyway.
|
|
235
|
+
- **So** — carving the risky sliver didn't tax the work; it's why the rest could floor it.
|
|
236
|
+
</details>
|
|
188
237
|
|
|
189
238
|
## Credits
|
|
190
239
|
|
package/assets/banner.svg
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 400" width="1200" height="400" role="img" aria-label="
|
|
1
|
+
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1200 400" width="1200" height="400" role="img" aria-label="Paperthin — Trust the artifact, not the author.">
|
|
2
2
|
<defs>
|
|
3
3
|
<linearGradient id="bg" x1="0" y1="0" x2="1200" y2="400" gradientUnits="userSpaceOnUse">
|
|
4
4
|
<stop offset="0" stop-color="#0B0D11"/>
|
|
@@ -50,11 +50,11 @@
|
|
|
50
50
|
stroke="url(#edge)" stroke-width="3.5" stroke-linecap="round" filter="url(#glow)"/>
|
|
51
51
|
</g>
|
|
52
52
|
|
|
53
|
-
<!-- wordmark: "
|
|
53
|
+
<!-- wordmark: "thin" drawn as hairline outline — thinness made literal, font-weight independent -->
|
|
54
54
|
<text x="568" y="206" font-family="'Helvetica Neue', Helvetica, Arial, sans-serif"
|
|
55
55
|
font-size="94" letter-spacing="0.5">
|
|
56
|
-
<tspan font-weight="500" fill="#EEF3F8">Paper</tspan><tspan
|
|
57
|
-
font-weight="400" fill="none" stroke="#D4E3EF" stroke-width="1.4">
|
|
56
|
+
<tspan font-weight="500" fill="#EEF3F8">Paper</tspan><tspan dx="3"
|
|
57
|
+
font-weight="400" fill="none" stroke="#D4E3EF" stroke-width="1.4">thin</tspan>
|
|
58
58
|
</text>
|
|
59
59
|
|
|
60
60
|
<!-- tagline -->
|
package/assets/map.svg
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1080 740" width="1080" height="740" role="img" aria-label="The
|
|
1
|
+
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1080 740" width="1080" height="740" role="img" aria-label="The Paperthin map by LilMGenius/paperthin: an old-paper four-quadrant graph. Horizontal axis cardinality moves from one to many; vertical axis time moves from now to across iterations. Depth is one artifact now, breadth is many artifacts now, coil is one project across iterations, and mesh is many minds across rounds.">
|
|
2
2
|
<defs>
|
|
3
3
|
<filter id="paper-grain" x="-10%" y="-10%" width="120%" height="120%">
|
|
4
4
|
<feTurbulence type="fractalNoise" baseFrequency="0.82" numOctaves="4" seed="23" result="noise"/>
|
|
@@ -57,8 +57,9 @@
|
|
|
57
57
|
<text x="64" y="510" font-family="'Courier New',Courier,monospace" font-size="13" fill="#5C5448">iterations</text>
|
|
58
58
|
|
|
59
59
|
<!-- title note -->
|
|
60
|
-
<text x="151" y="66" font-family="'
|
|
61
|
-
<text x="
|
|
60
|
+
<text x="151" y="66" font-family="Georgia,'Times New Roman',serif" font-size="15" font-style="italic" fill="#6A5B42">four regions on one thin sheet</text>
|
|
61
|
+
<text x="992" y="66" text-anchor="end" font-family="'Courier New',Courier,monospace"
|
|
62
|
+
font-size="13" letter-spacing="0.8" fill="#5C4E39" opacity="0.95">LilMGenius/paperthin</text>
|
|
62
63
|
|
|
63
64
|
<!-- depth -->
|
|
64
65
|
<g transform="translate(184 143)">
|
package/assets/thumbnail.svg
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1080 1080" width="1080" height="1080" role="img" aria-label="
|
|
1
|
+
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 1080 1080" width="1080" height="1080" role="img" aria-label="Paperthin — Trust the artifact, not the author.">
|
|
2
2
|
<defs>
|
|
3
3
|
<linearGradient id="bg" x1="0" y1="0" x2="1080" y2="1080" gradientUnits="userSpaceOnUse">
|
|
4
4
|
<stop offset="0" stop-color="#0B0D11"/>
|
|
@@ -49,8 +49,8 @@
|
|
|
49
49
|
<text x="540" y="722" text-anchor="middle"
|
|
50
50
|
font-family="'Helvetica Neue', Helvetica, Arial, sans-serif"
|
|
51
51
|
font-size="132" letter-spacing="0.5">
|
|
52
|
-
<tspan font-weight="500" fill="#EEF3F8">Paper</tspan><tspan
|
|
53
|
-
font-weight="400" fill="none" stroke="#D4E3EF" stroke-width="1.8">
|
|
52
|
+
<tspan font-weight="500" fill="#EEF3F8">Paper</tspan><tspan dx="4"
|
|
53
|
+
font-weight="400" fill="none" stroke="#D4E3EF" stroke-width="1.8">thin</tspan>
|
|
54
54
|
</text>
|
|
55
55
|
|
|
56
56
|
<!-- tagline, centered -->
|
package/package.json
CHANGED
|
@@ -1,26 +1,26 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "paperthin",
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.8.0",
|
|
4
4
|
"description": "Plain-Markdown skills that turn old engineering wisdom into reflexes your agent reaches for on its own — on any agent.",
|
|
5
5
|
"keywords": [
|
|
6
6
|
"agent-skills",
|
|
7
|
-
"agent-agnostic",
|
|
8
7
|
"ai-agents",
|
|
8
|
+
"claude-code",
|
|
9
|
+
"codex",
|
|
9
10
|
"anti-slop",
|
|
10
|
-
"
|
|
11
|
-
"ssot",
|
|
12
|
-
"self-improvement",
|
|
11
|
+
"hygiene",
|
|
13
12
|
"refactor",
|
|
14
13
|
"dedupe",
|
|
15
14
|
"clean-code",
|
|
16
|
-
"
|
|
17
|
-
"
|
|
18
|
-
"leakage",
|
|
19
|
-
"adversarial-review",
|
|
20
|
-
"reasoning-hygiene",
|
|
21
|
-
"documentation",
|
|
15
|
+
"single-source-of-truth",
|
|
16
|
+
"ssot",
|
|
22
17
|
"knowledge-base",
|
|
23
|
-
"
|
|
18
|
+
"documentation",
|
|
19
|
+
"review",
|
|
20
|
+
"fact-check",
|
|
21
|
+
"hallucination",
|
|
22
|
+
"evals",
|
|
23
|
+
"em-dash"
|
|
24
24
|
],
|
|
25
25
|
"author": "LilMGenius (https://github.com/LilMGenius)",
|
|
26
26
|
"license": "MIT",
|
package/skills/coil/flywheel/SKILL.md
CHANGED
|
@@ -3,8 +3,7 @@ name: flywheel
|
|
|
3
3
|
description: "Run repeated build-QA-retro-scratch cycles while preserving learning and letting code die. Use for long agentic projects where progress must be measured by quality-cleared templates, reusable modules, and eliminated anti-patterns rather than hours spent or features accumulated."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
Run the cycle so learning compounds and code accumulation does not masquerade as
|
|
7
|
-
progress.
|
|
6
|
+
Run the cycle so learning compounds and code accumulation does not masquerade as progress.
|
|
8
7
|
|
|
9
8
|
## Goal
|
|
10
9
|
|
|
@@ -14,42 +13,32 @@ progress.
|
|
|
14
13
|
FRAME -> BUILD -> DRIVE -> RETRO -> HATE -> SCRATCH -> BUILD AGAIN
|
|
15
14
|
```
|
|
16
15
|
|
|
17
|
-
The unit of progress is not hours, files, panels, or features. It is the count of
|
|
18
|
-
quality-cleared templates, reusable platform modules, and anti-patterns eliminated
|
|
19
|
-
in later cycles.
|
|
16
|
+
The unit of progress is not hours, files, panels, or features. It is the count of quality-cleared templates, reusable platform modules, and anti-patterns eliminated in later cycles.
|
|
20
17
|
|
|
21
18
|
## Workflow
|
|
22
19
|
|
|
23
20
|
1. Frame the thesis and quality gates.
|
|
24
21
|
2. Build one complete vertical slice.
|
|
25
|
-
3. Drive it through the real surface: browser for web apps, HTTP for API
|
|
26
|
-
contracts, computer-use for desktop apps, and CLI only for data-shaped
|
|
27
|
-
artifacts.
|
|
22
|
+
3. Drive it through the real surface: browser for web apps, HTTP for API contracts, computer-use for desktop apps, and CLI only for data-shaped artifacts.
|
|
28
23
|
4. Run `retro` to extract lessons, anti-patterns, and next-cycle gates.
|
|
29
|
-
5. Decide whether the next pass iterates in place or uses `scratch
|
|
30
|
-
6. Kill the next plan before build: use the project's adversarial review skill or
|
|
31
|
-
a human-invoked attack when that skill is user-only.
|
|
24
|
+
5. Decide whether the next pass iterates in place or uses `scratch`; when that call or the next move is unclear, `nba` reads the cycle state and returns the single next action.
|
|
25
|
+
6. Kill the next plan before build: use the project's adversarial review skill or a human-invoked attack when that skill is user-only.
|
|
32
26
|
7. Version only templates or modules that clear their gates.
|
|
33
27
|
|
|
34
28
|
## Rules
|
|
35
29
|
|
|
36
|
-
- Tests are supporting evidence, not the gate; every turn needs a real-surface
|
|
37
|
-
proof sized to the artifact.
|
|
30
|
+
- Tests are supporting evidence, not the gate; every turn needs a real-surface proof sized to the artifact.
|
|
38
31
|
- Do not widen before the core loop clears.
|
|
39
|
-
- Do not confuse generated variety with structural variety. If many outputs
|
|
40
|
-
|
|
41
|
-
- Preserve negative corpus so later cycles can prove the same anti-pattern is
|
|
42
|
-
gone.
|
|
32
|
+
- Do not confuse generated variety with structural variety. If many outputs converge to the same shape, the cycle failed the variety gate.
|
|
33
|
+
- Preserve negative corpus so later cycles can prove the same anti-pattern is gone.
|
|
43
34
|
- Let code die. A restart that preserves the right lessons is progress.
|
|
44
|
-
- Stop a lap when no outside truth enters; add evidence before another internal
|
|
45
|
-
pass.
|
|
35
|
+
- Stop a lap when no outside truth enters; add evidence before another internal pass.
|
|
46
36
|
|
|
47
37
|
## Verification
|
|
48
38
|
|
|
49
39
|
Before finishing a lap:
|
|
50
40
|
|
|
51
41
|
1. The real surface was driven and evidence was captured.
|
|
52
|
-
2. The retro names at least one lesson, anti-pattern, or gate that affects the
|
|
53
|
-
next pass.
|
|
42
|
+
2. The retro names at least one lesson, anti-pattern, or gate that affects the next pass.
|
|
54
43
|
3. The keep/iterate/scratch decision is explicit.
|
|
55
44
|
4. Any version label belongs only to a quality-cleared template or module.
|
package/skills/coil/nba/SKILL.md
CHANGED
|
@@ -7,35 +7,23 @@ Find the next best action from state, not from vibes.
|
|
|
7
7
|
|
|
8
8
|
## Goal
|
|
9
9
|
|
|
10
|
-
`nba` is the manual re-entry point for a stalled cycle. It reads the current state
|
|
11
|
-
and returns one action: the cheapest move that clears the binding constraint. It
|
|
12
|
-
does not execute the action and it does not offer a menu, because a menu recreates
|
|
13
|
-
the paralysis it exists to resolve.
|
|
10
|
+
`nba` is the manual re-entry point for a stalled cycle. It reads the current state and returns one action: the cheapest move that clears the binding constraint. It does not execute the action and it does not offer a menu, because a menu recreates the paralysis it exists to resolve.
|
|
14
11
|
|
|
15
12
|
## Workflow
|
|
16
13
|
|
|
17
|
-
1. Read the live cycle state: plan gates, retro anti-patterns, priority notes,
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
3. Diagnose the blank-out cause: too many open threads, lost thread, unnamed
|
|
22
|
-
blocker, closed-loop fatigue, avoidance of the hard step, or a phase that is
|
|
23
|
-
already done.
|
|
24
|
-
4. Return one action: the cheapest move that clears the most. It may name the
|
|
25
|
-
right skill for the human to fire, the external input needed, the first nail to
|
|
26
|
-
test, or the phase transition that is already due.
|
|
14
|
+
1. Read the live cycle state: plan gates, retro anti-patterns, priority notes, recent git/file changes, QA evidence, and the current workflow phase.
|
|
15
|
+
2. Locate the project in the cycle: FRAME, BUILD, DRIVE, RETRO, HATE, SCRATCH, BUILD AGAIN, or SHIP.
|
|
16
|
+
3. Diagnose the blank-out cause: too many open threads, lost thread, unnamed blocker, closed-loop fatigue, avoidance of the hard step, or a phase that is already done.
|
|
17
|
+
4. Return one action: the cheapest move that clears the most. It may name the right skill for the human to fire, the external input needed, the first nail to test, or the phase transition that is already due.
|
|
27
18
|
5. Frame it as: where you are, next best action, why now, and done when.
|
|
28
19
|
|
|
29
20
|
## Rules
|
|
30
21
|
|
|
31
22
|
- One action, not a checklist.
|
|
32
|
-
- State-grounded, never generic. Cite the gate, doc, evidence, or file state that
|
|
33
|
-
drives the pick.
|
|
23
|
+
- State-grounded, never generic. Cite the gate, doc, evidence, or file state that drives the pick.
|
|
34
24
|
- Return the avoided move, not the easy busywork.
|
|
35
|
-
- If no outside truth is entering, the next action is external input, not another
|
|
36
|
-
|
|
37
|
-
- If the gate is already met, the action is to move phase or ship, not add another
|
|
38
|
-
surface.
|
|
25
|
+
- If no outside truth is entering, the next action is external input, not another internal lap.
|
|
26
|
+
- If the gate is already met, the action is to move phase or ship, not add another surface.
|
|
39
27
|
- Read-only on the project. `nba` recommends; it does not execute.
|
|
40
28
|
- Do not auto-invoke user-only skills. Tell the human when one is the next move.
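As a rough illustration of the framing in step 5 and the one-action rule, the return value could be modelled as a single record. This is a minimal sketch, not part of paperthin: `NextAction`, its field names, and the example values used to exercise it are hypothetical, and only the phase names come from the workflow above.

```python
from dataclasses import dataclass

# Phase names as listed in workflow step 2.
PHASES = ("FRAME", "BUILD", "DRIVE", "RETRO", "HATE", "SCRATCH", "BUILD AGAIN", "SHIP")


@dataclass(frozen=True)
class NextAction:
    """Hypothetical container for the single action `nba` returns."""

    phase: str       # where you are
    action: str      # next best action (exactly one)
    why_now: str     # why now
    done_when: str   # done when
    evidence: str    # the gate, doc, evidence, or file state behind the pick

    def __post_init__(self) -> None:
        if self.phase not in PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")
        # State-grounded, never generic: refuse a pick with nothing cited.
        if not self.evidence.strip():
            raise ValueError("a pick must cite the state that drives it")

    def render(self) -> str:
        """Format the four-part frame plus the cited evidence."""
        return "\n".join(
            [
                f"Where you are: {self.phase}",
                f"Next best action: {self.action}",
                f"Why now: {self.why_now}",
                f"Done when: {self.done_when}",
                f"Evidence: {self.evidence}",
            ]
        )
```

The record holds one action by construction, so a menu cannot be expressed without changing the type. It only describes a recommendation; nothing here executes it.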
|
|
41
29
|
|
package/skills/coil/retro/SKILL.md
CHANGED
|
@@ -7,29 +7,19 @@ Turn a completed cycle into lessons the next cycle can actually use.
|
|
|
7
7
|
|
|
8
8
|
## Goal
|
|
9
9
|
|
|
10
|
-
`retro` extracts durable learning from a work cycle without defending the artifact
|
|
11
|
-
that produced it. A cycle can run, pass tests, and still be the wrong thing. The
|
|
12
|
-
output is not a changelog or therapy note: it is a local, evidence-backed record
|
|
13
|
-
of what worked, what misled the build, and what gate the next pass must clear.
|
|
10
|
+
`retro` extracts durable learning from a work cycle without defending the artifact that produced it. A cycle can run, pass tests, and still be the wrong thing. The output is not a changelog or therapy note: it is a local, evidence-backed record of what worked, what misled the build, and what gate the next pass must clear.
|
|
14
11
|
|
|
15
|
-
Use `retro` when the artifact is done, failed, disappointing, or ambiguous enough
|
|
16
|
-
that the next agent needs the cycle's lessons more than its momentum.
|
|
12
|
+
Use `retro` when the artifact is done, failed, disappointing, or ambiguous enough that the next agent needs the cycle's lessons more than its momentum.
|
|
17
13
|
|
|
18
14
|
## Workflow
|
|
19
15
|
|
|
20
|
-
1. Read the original objective, final artifact, QA evidence, user complaints, and
|
|
21
|
-
|
|
22
|
-
2. Separate working assets from misleading progress: contracts, schemas, tests,
|
|
23
|
-
services, vocabulary, and examples that earned reuse vs. UI, panels, scaffolds,
|
|
24
|
-
or abstractions that only looked productive.
|
|
16
|
+
1. Read the original objective, final artifact, QA evidence, user complaints, and any local planning notes.
|
|
17
|
+
2. Separate working assets from misleading progress: contracts, schemas, tests, services, vocabulary, and examples that earned reuse vs. UI, panels, scaffolds, or abstractions that only looked productive.
|
|
25
18
|
3. Name each failure as an anti-pattern, not a mood.
|
|
26
19
|
4. Convert repeated or high-impact failures into quality gates for the next pass.
|
|
27
|
-
5. Convert vague user direction into architecture vocabulary a fresh agent can
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
next-cycle contracts, gates, and vocabulary.
|
|
31
|
-
7. Verify that a from-scratch agent could avoid the same failure from those docs
|
|
32
|
-
alone.
|
|
20
|
+
5. Convert vague user direction into architecture vocabulary a fresh agent can use.
|
|
21
|
+
6. Write or refresh local docs for the cycle: a retro for lessons and a plan for next-cycle contracts, gates, and vocabulary.
|
|
22
|
+
7. Verify that a from-scratch agent could avoid the same failure from those docs alone.
|
|
33
23
|
|
|
34
24
|
## Rules
|
|
35
25
|
|
|
@@ -37,8 +27,7 @@ that the next agent needs the cycle's lessons more than its momentum.
|
|
|
37
27
|
- Do not write a changelog. File lists and effort summaries are not lessons.
|
|
38
28
|
- Preserve negative corpus. Failed paths are training data.
|
|
39
29
|
- Prefer hard gates over advice.
|
|
40
|
-
- Cite evidence from the cycle: objective, file facts, QA output, screenshots,
|
|
41
|
-
transcripts, diffs, or user feedback.
|
|
30
|
+
- Cite evidence from the cycle: objective, file facts, QA output, screenshots, transcripts, diffs, or user feedback.
|
|
42
31
|
- If the next agent cannot act on it, it is not a lesson yet.
|
|
43
32
|
- Keep provenance local; shipped artifacts should not narrate their scars.
|
|
44
33
|
|
|
@@ -48,5 +37,4 @@ Before finishing:
|
|
|
48
37
|
|
|
49
38
|
1. Every lesson traces to observed cycle evidence.
|
|
50
39
|
2. Every anti-pattern names a concrete failure mode and the gate that catches it.
|
|
51
|
-
3. A fresh agent can tell what to preserve, what to discard, and what to test
|
|
52
|
-
first without reading the whole old session.
|
|
40
|
+
3. A fresh agent can tell what to preserve, what to discard, and what to test first without reading the whole old session.
|
package/skills/coil/scratch/SKILL.md
CHANGED
|
@@ -3,26 +3,17 @@ name: scratch
|
|
|
3
3
|
description: "Restart a project or artifact from v0 while preserving only proven lessons, contracts, gates, vocabulary, real-surface tests, and negative corpus. Use when the foundation is wrong, accumulated code is misleading progress, or a new pass should learn from the old one without inheriting its accidental architecture."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
Start over from what the previous cycle proved, not from what it happened to
|
|
7
|
-
build.
|
|
6
|
+
Start over from what the previous cycle proved, not from what it happened to build.
|
|
8
7
|
|
|
9
8
|
## Goal
|
|
10
9
|
|
|
11
|
-
`scratch` is a controlled restart. It keeps earned knowledge and refuses to copy
|
|
12
|
-
forward accidental architecture. Starting from scratch is not amnesia: contracts,
|
|
13
|
-
schemas, vocabulary, real-surface tests, quality gates, and negative corpus survive
|
|
14
|
-
when they earned it. Scaffold, explanatory UI, debug panels, shallow generated
|
|
15
|
-
content, and code whose only value was learning what not to do do not.
|
|
10
|
+
`scratch` is a controlled restart. It keeps earned knowledge and refuses to copy forward accidental architecture. Starting from scratch is not amnesia: contracts, schemas, vocabulary, real-surface tests, quality gates, and negative corpus survive when they earned it. Scaffold, explanatory UI, debug panels, shallow generated content, and code whose only value was learning what not to do do not.
|
|
16
11
|
|
|
17
12
|
## Workflow
|
|
18
13
|
|
|
19
|
-
1. Read the current plan, retro, local domain notes, QA evidence, and any user
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
vocabulary, reusable services, real-surface tests, and negative corpus.
|
|
23
|
-
3. Identify what to discard: explanatory UI, debug panels, scaffold, shallow
|
|
24
|
-
generated content, accidental abstractions, and code whose only value was
|
|
25
|
-
learning what not to do.
|
|
14
|
+
1. Read the current plan, retro, local domain notes, QA evidence, and any user complaint that triggered the restart.
|
|
15
|
+
2. Identify what to preserve: contracts, schemas that survived QA, quality gates, vocabulary, reusable services, real-surface tests, and negative corpus.
|
|
16
|
+
3. Identify what to discard: explanatory UI, debug panels, scaffold, shallow generated content, accidental abstractions, and code whose only value was learning what not to do.
|
|
26
17
|
4. Name the first quality gate before planning code.
|
|
27
18
|
5. Write a v0 skeleton plan with one complete vertical loop.
|
|
28
19
|
6. Build only the first loop until it clears the gate.
|
|
@@ -33,8 +24,7 @@ content, and code whose only value was learning what not to do do not.
|
|
|
33
24
|
- One complete vertical loop beats many shallow surfaces.
|
|
34
25
|
- Preserve provenance in local docs, not in the shipped product.
|
|
35
26
|
- Do not rebuild around a vague lesson. Turn it into a gate or leave it out.
|
|
36
|
-
- Do not delete negative corpus; archive or reference it where the next cycle
|
|
37
|
-
will see it.
|
|
27
|
+
- Do not delete negative corpus; archive or reference it where the next cycle will see it.
|
|
38
28
|
- If a reusable module survives, name the contract that proved it survives.
|
|
39
29
|
|
|
40
30
|
## Verification
|
package/skills/depth/autobahn/SKILL.md
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: autobahn
|
|
3
|
+
description: "Carve guardrail-adjacent items out of scope with safe alternatives before risk-adjacent work starts, carry the guard into every prompt, then execute the safe remainder at full strength. Use when a task includes stealth, scraping, privacy, IP, policy, licensing, security, or other safety-adjacent material that could be silently dropped, over-elaborated, or needlessly diluted. Fires on the impulse, not only the topic: the moment you notice yourself about to hedge, soften, silently skip, or brace for a refusal, carve before you execute."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
Carve unsafe scope before execution, preserve the safe work's ambition, and make the descope ledger visible.
|
|
7
|
+
|
|
8
|
+
## Goal
|
|
9
|
+
|
|
10
|
+
Turn risk-adjacent work into an explicit, reviewable scope decision before execution begins: risky items are descoped with safe alternatives and archive destinations, while the remaining safe work runs without hedging, apology, or capability loss. The carve replaces the attempt-refuse-retry cycle: a well-carved task never poses the ask that would have been refused, and never waters down the asks that were always fine.
|
|
11
|
+
|
|
12
|
+
## Workflow
|
|
13
|
+
|
|
14
|
+
1. **FRAME**: Read the task, inputs, and user-stated risk posture. If the user already authorized descoping, proceed with the carve. If not, produce a proposed carve, make its split explicit, and wait for approval before **RUN**: a bright-line item has no safe version, so approval cannot change its verdict and it is marked non-negotiable; a gray-zone item's safe alternative trades away scope the user might want to keep, so it is the real question. If every item is bright-line, proceed without stalling; the ledger carries the record.
|
|
15
|
+
2. **CARVE**: Sweep the task and adjacent inputs for guardrail-adjacent items before executing. For each item, propose `verdict=descope`, class it bright-line or gray-zone, give one risk-free alternative, and name an archive destination per the negatives-as-corpus convention: the project's archive if it has one, otherwise a graveyard section or file beside the deliverable. A gray-zone item the user decides to keep stays in scope and enters the ledger as a kept-by-owner decision. Point to excluded techniques only as much as identification requires; do not elaborate them.
|
|
16
|
+
3. **GUARD**: Distill the carve into a compact scope-guard block: absolute exclusions, allowed alternatives, and the context that authorizes what stays in scope, with a pointer to the ledger for per-item detail. Keep it short enough to paste whole; a guard long enough to need summarizing will drift. Insert the block verbatim into the main plan and every subagent prompt. When work fans out, include one dedicated risk-lens subagent or review pass.
|
|
17
|
+
4. **RUN**: Execute only the carved safe scope at full strength. Build the best version of the allowed deliverable; do not shrink, soften, or omit safe work merely because risky neighbors were removed. When new guardrail-adjacent material surfaces mid-run, route it back through **CARVE**, never improvise a verdict inline: a bright-line find is descoped on the spot and appended to the ledger; a gray-zone find pauses the work it shapes until the owner answers, as in **FRAME**. Refresh the guard block in prompts not yet issued, and re-check work already in flight against the updated carve at **VERIFY**.
|
|
18
|
+
5. **VERIFY**: Run an adversarial pass over the output and adjacent artifacts, checking all four failure directions: risky content elaborated, risky content silently dropped, safe work diluted or treated as excluded downstream, and stale risky material left standing nearby.
|
|
19
|
+
6. **LEDGER**: Report the deliverable with a descope ledger listing every carved item: its class, its verdict of descoped or kept-by-owner, the reason, the safe alternative, and the archive destination. Treat exclusions as visible decisions, not gaps.
|
|
20
|
+
|
|
21
|
+
## Rules
|
|
22
|
+
|
|
23
|
+
- Require a proposed carve when the user has not pre-authorized descoping; do not begin **RUN** while a gray-zone item that shapes the deliverable still awaits an answer. Bright-line exclusions are never up for negotiation, so never stall on those alone.
|
|
24
|
+
- Never probe: do not pose an excluded or gray-zone ask to see whether it passes. The carve exists to end the attempt-refuse-retry churn by settling scope before any such ask is posed.
|
|
25
|
+
- Keep the scope-guard block portable and exact. Every fan-out prompt must carry it verbatim, not as a paraphrase or reminder, and it must guard both directions: no agent elaborates excluded material, and no agent treats work the carve explicitly kept in scope as if it were excluded.
|
|
26
|
+
- Guard against unsafe elaboration: never provide operational detail for excluded techniques beyond the minimum needed to identify what is out of scope. This binds the ledger and the guard block themselves, not only the deliverable.
|
|
27
|
+
- Guard against dilution of safe work with equal force: once the carve is accepted, execute the allowed work with normal rigor, completeness, and ambition. Dilution includes unprompted disclaimers, hedged phrasing, and quietly shrunken deliverables, not only outright omission.
|
|
28
|
+
- Do not frame the skill as a way around safety controls. The method honors constraints by removing risky asks before they are posed.
|
|
29
|
+
- Do not claim control over model routing, fallback provisioning, or fixed-model selection; those are harness behavior, not prompt behavior.
|
|
30
|
+
- Preserve negatives-as-corpus: descoped material is archived with its cause of death and safe replacement, never erased from the record, where a later `retro` can mine the ledger for anti-patterns.
|
|
31
|
+
|
|
32
|
+
## Verification
|
|
33
|
+
|
|
34
|
+
Before finishing, confirm:
|
|
35
|
+
|
|
36
|
+
- The **CARVE** covers every guardrail-adjacent item found, each carrying a class, a verdict of descoped or kept-by-owner, a safe alternative, and an archive destination.
|
|
37
|
+
- The **GUARD** block appears verbatim in every subagent prompt or fan-out work order, and no downstream agent had to improvise a safety posture in either direction.
|
|
38
|
+
- The **RUN** work stayed inside the accepted carve, posed no probing ask, and routed every mid-run discovery back through **CARVE** into the ledger.
|
|
39
|
+
- The safe deliverable was not diluted because of nearby risky material.
|
|
40
|
+
- The final report includes a distinct **LEDGER** section with exclusions, classes, reasons, alternatives, and archive destinations.
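The ledger fields named in the **LEDGER** step and the first verification bullet can be sketched as a small record with a completeness check. This is an illustrative sketch under assumptions: `LedgerEntry`, `missing_fields`, and `ledger_problems` are hypothetical names, not anything the skill ships. The check that a bright-line item cannot be kept-by-owner is inferred from the **FRAME** step, which says approval cannot change a bright-line verdict.

```python
from dataclasses import dataclass, fields


# Hypothetical record for one carved item; the fields mirror the ledger
# columns the skill lists, not any real paperthin API.
@dataclass
class LedgerEntry:
    item: str
    item_class: str           # "bright-line" or "gray-zone"
    verdict: str              # "descoped" or "kept-by-owner"
    reason: str
    safe_alternative: str
    archive_destination: str


def missing_fields(entry: LedgerEntry) -> list[str]:
    """Return the names of ledger fields left empty on this entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


def ledger_problems(ledger: list[LedgerEntry]) -> list[str]:
    """Collect problems so that exclusions stay visible decisions, not gaps."""
    problems = []
    for entry in ledger:
        label = entry.item or "<unnamed>"
        for name in missing_fields(entry):
            problems.append(f"{label}: missing {name}")
        if entry.item_class not in {"bright-line", "gray-zone"}:
            problems.append(f"{label}: unknown class {entry.item_class!r}")
        if entry.verdict not in {"descoped", "kept-by-owner"}:
            problems.append(f"{label}: unknown verdict {entry.verdict!r}")
        # Inferred rule: a bright-line item has no safe version to keep.
        if entry.item_class == "bright-line" and entry.verdict == "kept-by-owner":
            problems.append(f"{label}: bright-line items cannot be kept")
    return problems
```

An empty result from `ledger_problems` only says the ledger is structurally complete. Whether each alternative is actually safe remains the adversarial pass's job at **VERIFY**.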
|
package/skills/depth/dedash/SKILL.md
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: dedash
|
|
3
|
+
disable-model-invocation: true
|
|
4
|
+
description: "Remove em-dashes and the dashes standing in for them from a user-owned scope, reading each occurrence in context and choosing the punctuation or wording that fits."
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Remove em-dashes and the dashes standing in for them from the scope the user names, one occurrence at a time, while leaving hyphens, ranges, and deliberate marks alone.
|
|
8
|
+
|
|
9
|
+
## Goal
|
|
10
|
+
|
|
11
|
+
Replace each em-dash that should not remain with the mark or wording its clause actually needs. Target any dash doing an em-dash's job, whatever its exact character: U+2014 and its longer or repurposed look-alikes all read the same on the page. The user owns the scope: a passage, field, file, selection, or tree. Do not widen it without asking.
|
|
12
|
+
|
|
13
|
+
## Workflow
|
|
14
|
+
|
|
15
|
+
1. **FIND**: Take the exact scope from the user and locate every dash doing an em-dash's job in it, judged by the role it plays in the sentence, not its codepoint: any mark that sets off an aside, an appositive, an abrupt turn, or a missing conjunction. The em-dash (U+2014) is the common one; the horizontal bar (U+2015) and the two- and three-em dashes (U+2E3A, U+2E3B) read the same and count. The en-dash (U+2013) is the boundary case: it counts when it joins clauses where an em-dash would fit, but stays when it spans a range's endpoints, numeric or not (`1990–2020`, the `London–Paris` route). A dash doing a non-em job (a hyphen, a minus, a range) stays. Surface anything genuinely ambiguous as a judgment call rather than changing it blindly.
|
|
16
|
+
2. **REPLACE per role**: Choose per occurrence by grammatical role:
|
|
17
|
+
- *Parenthetical aside*: commas or parentheses when the dashed material can be lifted out of the sentence.
|
|
18
|
+
- *Appositive*: a colon or comma when the dashed material renames, defines, or expands the noun before it.
|
|
19
|
+
- *Abrupt turn*: a full stop, semicolon, or comma when the dash marks a pivot, interruption, or restart.
|
|
20
|
+
- *False-conjunction break*: rewrite the sentence when the dash stands in for a missing "and", "but", "because", "so", or another real connective.
|
|
21
|
+
3. **LEAVE-ALONE**: Unless the user explicitly includes them and the context proves them wrong, leave untouched:
|
|
22
|
+
- hyphenated compounds, minus signs, and negative numbers
|
|
23
|
+
- ranges such as `1-10`, `1–10`, or `London–Paris`; a range dash is not an em-dash
|
|
24
|
+
- code, diffs, command output, identifiers, serialized data, URLs, paths, anchors, and query strings
|
|
25
|
+
- quoted or clearly deliberate em-dashes in stylized prose; these surface as judgment calls, never as blind mutations
|
|
26
|
+
4. **RE-READ**: Read every changed sentence in place. Fix awkward rhythm, dangling punctuation, doubled spaces, and connector loss introduced by the replacement. If the sentence now needs wording instead of punctuation, rewrite the smallest clause that solves it.
|
|
27
|
+
5. **REPORT**: State the scope checked, how many dashes were found and changed, which replacements were used by role, and anything left alone or surfaced as a judgment call.
|
|
28
|
+
|
|
29
|
+
## Rules
|
|
30
|
+
|
|
31
|
+
- Never do blanket replacement. The same glyph can require different punctuation in different clauses.
|
|
32
|
+
- Never replace an em-dash with a hyphen as the cleanup.
|
|
33
|
+
- Mutate per the edit-safety convention in `CLAUDE.md`; if the scope contains no such dash, report a MISS rather than changing nearby punctuation.
|
|
34
|
+
- Preserve the user's voice. This skill removes a mark where requested; it does not flatten style outside the user-owned scope.
|
|
35
|
+
|
|
36
|
+
## Verification
|
|
37
|
+
|
|
38
|
+
Before finishing, confirm:
|
|
39
|
+
|
|
40
|
+
1. The scope stayed exactly user-owned.
|
|
41
|
+
2. Every replacement was selected per occurrence by grammatical role.
|
|
42
|
+
3. Leave-alone cases stayed untouched or were reported as judgment calls.
|
|
43
|
+
4. The report includes counts and any judgment calls.
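The mechanical half of **FIND** can be sketched as a scan over the codepoints the skill names. This is a minimal sketch under assumptions: `find_dash_candidates` and its note strings are hypothetical, and the function only locates candidates. Choosing a replacement by grammatical role still needs a reader. It also does not skip code spans, URLs, or quoted material, which the **LEAVE-ALONE** step would exclude.

```python
import re

# Codepoints the skill lists as doing an em-dash's job:
# em-dash, horizontal bar, two-em dash, three-em dash.
EM_LIKE = {"\u2014", "\u2015", "\u2e3a", "\u2e3b"}


def find_dash_candidates(text: str) -> list[tuple[int, str, str]]:
    """Return (offset, character, note) for each dash worth a human read."""
    hits = []
    for match in re.finditer("[\u2013\u2014\u2015\u2e3a\u2e3b]", text):
        ch, pos = match.group(), match.start()
        if ch in EM_LIKE:
            hits.append((pos, ch, "em-like"))
            continue
        # Remaining case is the en-dash (U+2013), the boundary case.
        before = text[pos - 1] if pos > 0 else ""
        after = text[pos + 1] if pos + 1 < len(text) else ""
        if before.isdigit() and after.isdigit():
            continue  # numeric range: a range dash stays
        hits.append((pos, ch, "en-dash: judgment call"))
    return hits
```

ASCII hyphens and minus signs never match the pattern, so they are left alone by construction. A non-numeric range such as a route between two cities is surfaced as a judgment call rather than decided automatically, which errs toward reporting over mutating.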
|
package/skills/depth/re0/SKILL.md
CHANGED
|
@@ -1,26 +1,26 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: re0
|
|
3
|
-
description: "Refresh an existing artifact into the current best v0. Use when the user asks to clean up, sync up, dedupe, de-noise, rewrite, or update an artifact after iteration; when nearby artifacts may have drifted; or when changes in one place should be reflected across related artifacts while keeping the result minimal."
|
|
3
|
+
description: "Refresh an existing artifact into the current best v0. Use when the user asks to clean up, sync up, dedupe, de-noise, smooth, rewrite, or update an artifact after iteration; when nearby artifacts may have drifted; or when changes in one place should be reflected across related artifacts while keeping the result minimal."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
Refresh the target artifact as if it were the first clean version.
|
|
7
7
|
|
|
8
8
|
## Goal
|
|
9
9
|
|
|
10
|
-
The result must be lighter, more current, and more accurate than the input. It
|
|
11
|
-
should not read like a changelog, cleanup note, or patch over an older draft.
|
|
10
|
+
The result must be lighter, more current, and more accurate than the input. It should not read like a changelog, cleanup note, or patch over an older draft.
|
|
12
11
|
|
|
13
12
|
## Workflow
|
|
14
13
|
|
|
15
14
|
1. Identify the target artifact from the user's request or the active context.
|
|
16
15
|
2. Read the target artifact end to end before editing.
|
|
17
16
|
3. Check nearby artifacts that must stay aligned with it.
|
|
18
|
-
4. Remove scaffolding residue, stale deltas, duplicated process noise, deprecated
|
|
19
|
-
information, and over-specific history.
|
|
17
|
+
4. Remove scaffolding residue, stale deltas, duplicated process noise, deprecated information, and over-specific history.
|
|
20
18
|
5. Fold durable lessons into the place they should have lived from the start.
|
|
21
19
|
6. Rewrite instead of appending when appending would preserve noise.
|
|
22
20
|
7. Preserve the artifact's useful voice and structure; simplify everything else.
|
|
23
|
-
8.
|
|
21
|
+
8. Smooth prose noise: fold a parenthetical that interrupts a sentence or list into its own clause or cut it, keep a word or point repeated within reach of itself only once, and unwind padding that adds length but not meaning.
|
|
22
|
+
9. Smooth source noise: unwrap a hard line break that splits a sentence mid-flow so each paragraph or list item lives on one source line, matching how sibling artifacts format theirs; leave code, tables, and quoted material as formatted.
|
|
23
|
+
10. Re-read the result and cut again.
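Step 9 is the kind of change this very diff applies to the skill files, and it can be approximated mechanically. The sketch below is one possible approach, not the skill's implementation: `unwrap` and `LIST_ITEM` are hypothetical names. It leaves fenced code, table rows, blockquotes, and headings as formatted. It does not handle YAML front matter or setext headings, so those would need to be excluded before calling it.

```python
import re

# A line that opens a bullet or numbered list item.
LIST_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+")


def unwrap(markdown: str) -> str:
    """Join hard-wrapped continuation lines onto the line they continue."""
    out: list[str] = []
    in_fence = False
    joinable = False  # True when the last output line may take a continuation
    for line in markdown.split("\n"):
        stripped = line.strip()
        if stripped.startswith("```"):
            in_fence = not in_fence
            out.append(line)
            joinable = False
            continue
        structural = (
            in_fence
            or not stripped
            or stripped.startswith(("#", "|", ">"))
            or stripped == "---"
        )
        if structural:
            out.append(line)  # code, tables, quotes, headings, blanks stay put
            joinable = False
        elif LIST_ITEM.match(line) or not joinable:
            out.append(line)  # start of a new paragraph or list item
            joinable = True
        else:
            out[-1] = out[-1].rstrip() + " " + stripped
    return "\n".join(out)
```

A wrapped line that happens to begin with a number and a period would be misread as a new list item, so the output still needs the re-read that step 10 asks for.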
|
|
24
24
|
|
|
25
25
|
## Rules
|
|
26
26
|
|
|
@@ -35,6 +35,4 @@ should not read like a changelog, cleanup note, or patch over an older draft.
|
|
|
35
35
|
|
|
36
36
|
## Verification
|
|
37
37
|
|
|
38
|
-
The result reads as a clean v0 to fresh eyes — no trace of the older draft, no
|
|
39
|
-
sign it was patched rather than rewritten (defer to `shower` if unsure). Report
|
|
40
|
-
what noise was removed and what durable truth was kept.
|
|
38
|
+
The result reads as a clean v0 to fresh eyes — no trace of the older draft, no sign it was patched rather than rewritten (defer to `shower` if unsure). Report what noise was removed and what durable truth was kept.
|
package/skills/depth/sip/SKILL.md
CHANGED
|
@@ -7,22 +7,24 @@ Taste your own cooking: the moment you finish making something, check it with th
|
|
|
7
7
|
|
|
8
8
|
## Goal
|
|
9
9
|
|
|
10
|
-
A reminder buried in docs ("remember to verify") won't reliably fire in a fresh session. `sip` makes the recursive self-improvement loop a triggered habit: right after any create or change, run our own skills on the result so quality doesn't ride on the author's biased in-session judgment.
|
|
10
|
+
A reminder buried in docs ("remember to verify") won't reliably fire in a fresh session. `sip` makes the recursive self-improvement loop a triggered habit: right after any create or change, run our own skills on the result — the clean checks and the true ones — so quality doesn't ride on the author's biased in-session judgment.
|
|
11
11
|
|
|
12
12
|
## Workflow
|
|
13
13
|
|
|
14
14
|
1. Spot the trigger: you just created or changed an artifact or skill and are about to call it done, commit, or hand it off.
|
|
15
15
|
2. **Cold-read it** — run `shower` on the artifact (fresh-eyes comprehension / handoff check).
|
|
16
|
-
3. **
|
|
17
|
-
4. **
|
|
18
|
-
5.
|
|
16
|
+
3. **Verify it's true** — if the artifact asserts a reality-grounded claim, run `factchk`; if it defines an eval, metric, or experiment, run `mandela`. Skip when it has neither.
|
|
17
|
+
4. **Check consistency** — run `ssotchk` across the repo for anything the change duplicated or contradicted; `ssotize` if it found scatter.
|
|
18
|
+
5. **Tidy** — `re0` the changed docs so the result reads as a clean v0, not a patch over a draft.
|
|
19
|
+
6. Apply the findings here, then serve it.
|
|
19
20
|
|
|
20
21
|
## Rules
|
|
21
22
|
|
|
22
23
|
- Trigger on your OWN output, right after making it — that's when bias is highest and a check is cheapest.
|
|
23
|
-
- Use the skills; don't re-implement them
|
|
24
|
-
- Skip what plainly doesn't apply
|
|
25
|
-
- Stop at the artifact — `sip` never touches git or makes commits.
|
|
24
|
+
- Use the skills; don't re-implement them — `shower` for clarity, `factchk`/`mandela` for truth, `ssotchk`/`ssotize` for SSOT, `re0` for cleanup. `sip` orchestrates and routes findings back to the author session to fix; the skills do the work.
|
|
25
|
+
- Skip what plainly doesn't apply — a one-line prose tweak may need only `ssotchk`, and `factchk`/`mandela` fire only when there is a claim or an eval to check. Say what you skipped and why.
|
|
26
|
+
- Stop at the artifact — `sip` never touches git or makes commits.
|
|
27
|
+
- Chain only model-invoked skills. The user-invoked ones — `re0-git`, `hate`, `dedash` — are a human's to fire deliberately, which is why they are user-invoked, so `sip` must not call them.
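The routing described by the workflow and rules can be sketched as a small planning function. This is an illustration under assumptions: `plan_checks`, its two boolean inputs, and the skip-reason strings are hypothetical, and only the skill names come from the text. `ssotize` is left out of the plan because it depends on what `ssotchk` finds at run time.

```python
# Skills a human fires deliberately; `sip` must not chain these.
USER_INVOKED = {"re0-git", "hate", "dedash"}


def plan_checks(asserts_claim: bool, defines_eval: bool) -> tuple[list[str], dict[str, str]]:
    """Return the skills to run, in order, plus what was skipped and why."""
    run = ["shower"]  # cold-read always comes first
    skipped: dict[str, str] = {}

    if asserts_claim:
        run.append("factchk")
    else:
        skipped["factchk"] = "no reality-grounded claim to check"

    if defines_eval:
        run.append("mandela")
    else:
        skipped["mandela"] = "no eval, metric, or experiment defined"

    run += ["ssotchk", "re0"]  # consistency check, then tidy

    # Guard the rule above: never chain a user-invoked skill.
    blocked = USER_INVOKED.intersection(run)
    if blocked:
        raise ValueError(f"user-invoked skills cannot be chained: {sorted(blocked)}")
    return run, skipped
```

Returning the skipped skills alongside the plan keeps the "say what you skipped and why" rule cheap to follow, since the reasons are produced at the same moment the decision is made.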
|
|
26
28
|
|
|
27
29
|
## Verification
|
|
28
30
|
|