npm - ppdevskill - Versions diffs - 1.3.0 → 1.3.1 - Mend

ppdevskill 1.3.0 → 1.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (2) hide show

package/README.md +91 -134
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -1,114 +1,122 @@
-# ppdevskill — Engineering Partner Skill
+# ppdevskill — your engineering partner, not a code vending machine
-A Claude skill that turns Claude from *"an AI that throws out throwaway code to finish fast"* into a **real engineering thought-partner** — one that understands the actual need, fixes the right spot, avoids over-engineering, and writes principled code.
+[![npm version](https://img.shields.io/npm/v/ppdevskill.svg)](https://www.npmjs.com/package/ppdevskill)
+[![license](https://img.shields.io/npm/l/ppdevskill.svg)](./LICENSE)
-> **Language note:** by design, the skill **replies to you in Thai**, keeping technical terms — function names, paths, errors, commands, code — in English. That is why the `examples/` walkthroughs and a few quoted strings below are in Thai: they are not untranslated debt, they are the demonstrated behavior.
+> Turn Claude from *"an AI that throws out throwaway code to finish fast"* into a **real engineering thought-partner** — one that understands the actual need, fixes the right spot, refuses to over-engineer, and never says *"it's done"* about something it never ran.
----
-## What it is
-`ppdevskill` is a single skill that unifies the software-engineering workflow into **six modes plus a commit action**. Every mode has a **hard gate** that cannot be skipped — the gate stops Claude from writing code before the information is complete and before you have approved a plan.
+`ppdevskill` is a single [Claude](https://claude.ai/code) skill that unifies the whole software-engineering workflow into **six modes + a commit action**, each guarded by a **hard gate** that cannot be skipped. The gate stops Claude from writing code before the information is complete and before you have approved a plan.
-The core idea is **zero drift**. LLMs tend to "slip" their rules as a conversation grows longer or as the user grows impatient. This skill forces Claude to follow every rule literally, every time. A user saying *"just do it"* or *"skip the gate"* is **not** authorization to break a rule — the rule is restated and the work stops until the gate is genuinely satisfied.
+> **Language note:** by design the skill **replies in Thai**, keeping technical terms — function names, paths, errors, commands, code — in English. The technical docs below are English for reach; the demonstrated behavior is Thai.
 ---
-## Modes & commands
+## The problem it solves
-| Mode | Command | Use when |
-|---|---|---|
-| Plan | `#plan` | A big multi-step arc spanning more than one mode or more than 3 slices — orchestrate and decompose |
-| Debug | `#dbg` | A bug, an error / stack trace, something broken or throwing |
-| Feature | `#ft` | Add / build / implement a new capability |
-| Refactor | `#rf` | Clean up / restructure code with **no behavior change** |
-| Review | `#rv` | Review / audit a PR, diff, plan, or design doc |
-| Post-mortem | `#pm` | Write the RCA / post-mortem after a fix has landed |
+LLM coding assistants share three expensive failure modes:
-Plus two routers and one action:
+1. **Code from thin air** — they start editing before they understand the need.
+2. **"It works" that doesn't** — they claim success they never verified.
+3. **Unguarded boundaries** — they touch auth / SQL / file-upload without a threat model.
-- **`#pp`** — auto-route: Claude picks the mode from context; if it is ambiguous, it asks one question and stops.
-- **`#plan`** — orchestrator: breaks a big arc into slices (ordered by dependency), tags each slice with a mode, then hands off. It **never writes code itself**.
-- **`#cp`** — commit / push **action** (not a mode): clean message, no AI attribution, stage only what the change touched. Runs *after* the work is verified.
-- **`#bs`** — brainstorm-partner is a **separate skill** the workflow hands off to when the *approach itself* is undecided (it generates and selects, never builds).
+And a fourth that grows over time: **drift** — the longer the session, the more they "slip" their own rules.
-```mermaid
-flowchart TD
-    U[User request or command] --> PP{"#pp — auto-route"}
-    PP -->|"bug / stack trace / broken"| DBG["#dbg Debug"]
-    PP -->|"add / build / implement"| FT["#ft Feature"]
-    PP -->|"clean up / restructure"| RF["#rf Refactor"]
-    PP -->|"review / audit PR or diff"| RV["#rv Review"]
-    PP -->|"write RCA / post-mortem"| PM["#pm Post-mortem"]
-    PP -->|"big multi-step arc"| PLAN["#plan Orchestrate"]
-    PP -->|"ambiguous"| ASK["Ask one question, stop"]
-    PLAN -.->|"decompose into slices,<br/>each faces its own gate"| DBG
-    PLAN -.-> FT
-    PLAN -.-> RF
-    PP -.->|"approach undecided"| BS["#bs Brainstorm<br/>(separate skill)"]
-```
+`ppdevskill` attacks all four with **gates, a verification discipline, a mechanical Stop-hook, and an on-disk ledger** — discipline backed by mechanism, not willpower.
 ---
-## The gate model
+## Modes & commands
+| Mode | Command | Use when | Gate |
+|---|---|---|---|
+| Plan | `#plan` | A big multi-step arc spanning >1 mode or >3 slices — orchestrate & decompose | GATE 0 · outcome + size + DoD |
+| Debug | `#dbg` | A bug, error / stack trace, something broken or throwing | GATE 1 · reliable repro |
+| Feature | `#ft` | Add / build / implement a new capability | GATE 2 · need + 3 scenarios + scope |
+| Refactor | `#rf` | Clean up / restructure with **no behavior change** | GATE 3 · safety net + smell + pin |
+| Review | `#rv` | Review / audit a PR, diff, plan, or design doc | GATE 4 · outsider trace + cite |
+| Post-mortem | `#pm` | Write the RCA after a fix has landed | GATE 5 · repro + cause + fix + validation |
-Each mode opens with a gate. **No gate, no proceed** — if the gate is not satisfied, Claude states exactly what is missing, stops, and waits. Security is a cross-cutting gate that **cannot be waved off**.
+Plus: **`#pp`** auto-routes from context · **`#cp`** is the commit/push action (clean message, no AI attribution, runs only after work is verified) · **`#bs`** hands off to a separate brainstorm skill when the *approach itself* is undecided.
-- **GATE 0 `#plan`** — outcome stated as an end state + size gate passed + unknowns surfaced + whole-arc definition of done.
-- **GATE 1 `#dbg`** — a reliable reproduction exists; no repro → full stop, no hypothesizing.
-- **GATE 2 `#ft`** — real need stated + at least 3 given/when/then acceptance scenarios + scope bounded IN/OUT.
-- **GATE 3 `#rf`** — a safety net + a concrete motivation (a named smell) + behavior pinned in one sentence.
-- **GATE 4 `#rv`** — outsider stance, end-to-end trace, cite `file:line`, no rubber stamps.
-- **GATE 5 `#pm`** — repro + root cause + fix + validation all in hand, else refuse.
-- **SECURITY GATE (cross-cutting)** — any change touching a trust boundary (input / auth / token / file / SQL / shell / crypto / secret / network / access control / new dependency) trips this gate; the abuse case is written and the relevant OWASP Top 10 items are exercised, not assumed.
+**No gate, no proceed.** If the gate is not satisfied, Claude states exactly what is missing, stops, and waits. Security is a **cross-cutting gate that cannot be waved off** — any change touching a trust boundary (input / auth / token / file / SQL / shell / crypto / secret / network / access control / new dependency) trips it, and the abuse case is *exercised, not assumed*.
 ```mermaid
 flowchart LR
-    PLAN["#plan"] --> G0["GATE 0<br/>outcome + size + DoD"]
-    DBG["#dbg"] --> G1["GATE 1<br/>reliable repro"]
-    FT["#ft"] --> G2["GATE 2<br/>need + 3 scenarios + scope"]
-    RF["#rf"] --> G3["GATE 3<br/>safety net + smell + pin"]
-    RV["#rv"] --> G4["GATE 4<br/>outsider trace + cite"]
-    PM["#pm"] --> G5["GATE 5<br/>repro + cause + fix + validation"]
-    G0 --> D{Gate satisfied?}
-    G1 --> D
-    G2 --> D
-    G3 --> D
-    G4 --> D
-    G5 --> D
-    D -->|No| B["BLOCKED — state what's missing, stop"]
-    D -->|Yes| P["Proceed to the mode's steps"]
-    SEC["SECURITY GATE<br/>cross-cutting, cannot be waived"] -.-> P
+    REQ["request / command"] --> ROUTE["pick mode + gate"]
+    ROUTE --> G{"gate satisfied?<br/>(info + approval)"}
+    G -->|No| B["BLOCKED — name what's missing, stop"]
+    G -->|Yes| P["proceed + VERIFIED block"]
+    SEC["security gate · cannot be waived"] -.-> P
 ```
+A user saying *"just do it"* / *"ทำเลย"* / *"trust me"* is **not** authorization to break a rule. The rule is restated and the work stops until the gate is genuinely satisfied. This is the **zero-drift** core.
 ---
 ## VERIFIED discipline + mechanical enforcement
-The skill never claims success it has not observed. Any claim-word — *done*, *works*, *complete*, `เสร็จ`, `เรียบร้อย` — must be backed by a `VERIFIED:` block (the actual commands run + their actual output) **immediately above it**. The escape hatch is `NOT VERIFIED:`, which lists every skipped step, the reason, and the concrete checks the user must perform. Static checks (type-check, lint) do not count as verification.
+The skill never claims success it has not observed. Any claim-word — *done*, *works*, *complete*, `เสร็จ`, `เรียบร้อย` — must be backed by a `VERIFIED:` block (the **actual** commands run + their **actual** output) immediately above it. The escape hatch is `NOT VERIFIED:`, which lists every skipped step, the reason, and the concrete checks you must perform. Static checks (type-check, lint) do **not** count as verification.
-This is not left to willpower. A **Stop hook** (`hooks/verify-guard.sh`) blocks the turn from ending when a ppdevskill response carries a banner and a claim-word but no verification block, and feeds the reason back so the model fixes it in the same turn. It is **self-scoping** (fires only on responses with a ppdevskill banner — other workflows are untouched), **fail-open** (any error → allow; a discipline hook must never brick a session), and catches **Thai claim-words too**.
+This is not left to willpower. A **Stop hook** (`hooks/verify-guard.sh`) blocks the turn from ending when a ppdevskill response carries a banner + a claim-word but no verification block, and feeds the reason back so the model fixes it in the same turn. It is **self-scoping** (fires only on ppdevskill responses — other workflows untouched), **fail-open** (any error → allow; a discipline hook must never brick a session), and catches **Thai claim-words too**.
 ```mermaid
 flowchart TD
-    R["Assistant reply at turn end"] --> H["verify-guard.sh (Stop hook)"]
-    H --> S{"ppdevskill banner present?"}
-    S -->|No| A["Allow — not a ppdevskill turn"]
+    R["reply at turn end"] --> H["verify-guard.sh (Stop hook)"]
+    H --> S{"ppdevskill banner?"}
+    S -->|No| A["allow — not our turn"]
     S -->|Yes| V{"VERIFIED: / NOT VERIFIED: block?"}
     V -->|Yes| A
     V -->|No| C{"claim-word? (done / works / เสร็จ ...)"}
     C -->|No| A
-    C -->|Yes| BL["Block — feed reason back, fix this turn"]
+    C -->|Yes| BL["block — feed reason back, fix this turn"]
     H -.->|"any error / no jq / no banner"| A
 ```
+**Ledger — anti-drift persistence.** `#plan` / `#dbg` / `#ft` / `#rf` persist their gate state (slice table, hypotheses, scope) to `.ppdev/<mode>-ledger.md`, so it **survives context compaction** — Claude re-anchors from the file, not from memory. It is bounded to one active unit (overwrite on a new unit, mark `[x]` in place, clear when done).
 ---
-## Ledger — anti-drift persistence
+## Does it actually change anything? (benchmark)
+Measured on a 27-trial A/B benchmark — **same model both arms**, identical prompts, with-skill vs without-skill, across all 7 modes + 3 security classes. The skill does not make a weak model smart; it makes a capable model **consistently, mechanically disciplined**.
+| Behavior | Without skill | With skill |
+|---|---:|---:|
+| Wrote code with no plan/approval | **64%** of turns | **0%** |
+| Claimed "works/done" without running it | **100%** (2/2) | **0%** |
+| Enumerated abuse-cases on a boundary task | **0%** (0/7) | **100%** (6/6) |
+| Guessed a root cause before a repro existed | yes | **no** |
+| Mixed a bug-fix into a refactor diff | yes | **no** |
+| Discipline banner + gate state every reply | 0% | **100%** |
+The underrated win is **consistency**: the with-skill arm has near-zero variance; the baseline is disciplined only some of the time, which is exactly what you can't trust in production.
+---
+## Pros & cons — the honest list
+**Pros**
-`#plan` / `#dbg` / `#ft` / `#rf` persist their gate state (slice table, hypotheses, scope) to `.ppdev/<mode>-ledger.md` in the working repo, so it **survives context compaction** — Claude re-anchors from the file, not from memory. Add `.ppdev/` to that repo's `.gitignore`.
+- ✅ **Kills "fake done."** Every claim is backed by real commands + real output, or an explicit `NOT VERIFIED:`.
+- ✅ **No code from thin air.** No gate, no code; incomplete info → it asks, never guesses.
+- ✅ **Security is non-negotiable.** Trust-boundary changes always go through the OWASP gate, and it cannot be waved off.
+- ✅ **Finds the root cause.** Demands a repro before hypothesizing; fixes the cause, not the symptom.
+- ✅ **Clean separation.** Behavior change and refactor never share a diff.
+- ✅ **Mechanical, not aspirational.** The Stop hook enforces verification; the ledger fights drift in long sessions.
+- ✅ **Consistent floor.** Near-zero variance — it behaves the same on turn 50 as on turn 1.
+- ✅ **An honest stance.** No flattery, no hedging, no rubber-stamp "LGTM" — an opinionated recommendation with tradeoffs.
+- ✅ **Plays nice.** The hook is self-scoping and fail-open: it never touches other workflows and never bricks a session.
-The ledger is **bounded to one active unit, never append-only**: a new unit overwrites the file, a finished slice is marked `[x]` in place, a replan edits the table in place, and a met definition-of-done clears the file. Steady-state size is one unit's worth (a slice table is ~4–12 rows). If a ledger grows past that, it is being mis-appended — truncate it to the current unit.
+**Cons** *(so you can decide with eyes open)*
+- ⚠️ **It costs more per turn.** ~1.8× tokens and latency on a *cold* turn (it reads its own rules). This amortizes over a session, but it is real upfront cost.
+- ⚠️ **It adds friction to trivial work.** A one-line rename or a throwaway script still meets some ceremony. For spikes / REPL experiments, that friction is not worth it.
+- ⚠️ **It replies in Thai by design.** Technical terms stay English, but the prose is Thai. If you need English replies, this is a mismatch.
+- ⚠️ **The Stop hook needs `jq`.** No `jq`, the hook fails open (so nothing breaks) but you lose the mechanical net.
+- ⚠️ **It pushes back.** It will ask for a repro, acceptance scenarios, or a security abuse-case before writing code. If you just want code *now*, that refusal is friction — by design ("refusal is a feature").
+- ⚠️ **Benchmark caveat.** Gains are measured against the *same* model, so the value is *consistency and guarantees*, not raw capability. Multi-turn anti-drift benefits are real but harder to quantify.
+**Bottom line:** use it for production code, anything touching auth / payments / data deletion, PR reviews, and long multi-day arcs. Skip it for throwaway scripts and quick learning spikes.
 ---
@@ -122,7 +130,7 @@ npx ppdevskill install --with-hook    # wire the Stop hook immediately, no promp
 npx ppdevskill install --no-hook      # don't touch settings.json (prints the snippet to paste)
 ```
-The installer backs up any existing install first, `chmod +x` the hook, and never clobbers unrelated keys in `settings.json`. Restart Claude Code afterward.
+The installer backs up any existing install, `chmod +x` the hook, and never clobbers unrelated keys in `settings.json`. Restart Claude Code afterward. The hook requires `jq`.
 **Or git clone:**
@@ -130,45 +138,11 @@ The installer backs up any existing install first, `chmod +x` the hook, and neve
 git clone https://github.com/Kamisadev/ppdevskill.git ~/.claude/skills/ppdevskill
 ```
-**Enable the hook manually** (if you used `--no-hook` or cloned): merge `hooks/settings.snippet.json` into `~/.claude/settings.json` (global) or `.claude/settings.json` (per project) — if a `hooks` key already exists, add the `Stop` entry, **do not overwrite**. The hook requires `jq`.
-```bash
-chmod +x ~/.claude/skills/ppdevskill/hooks/verify-guard.sh
-```
-File layout:
-```
-ppdevskill/
-├── SKILL.md              # main hub — rules, principles, routing
-├── references/
-│   ├── plan.md           # GATE 0 + steps for the ultra-plan orchestrator
-│   ├── dbg.md            # gate + steps for debug
-│   ├── ft.md             # gate + steps for feature
-│   ├── rf.md             # gate + steps for refactor
-│   ├── rv.md             # gate + steps for review
-│   ├── pm.md             # gate + steps for post-mortem
-│   ├── sec.md            # security gate (OWASP Top 10)
-│   ├── verify.md         # verification recipes per work type
-│   └── git-auto.md        # #cp commit / push procedure
-├── hooks/
-│   ├── verify-guard.sh        # Stop hook — mechanically enforces the VERIFIED block
-│   └── settings.snippet.json  # config to merge into settings.json
-└── examples/                  # one worked example per mode — read on demand
-    ├── plan.md           # ultra-plan example (GATE 0 → slice table)
-    ├── dbg.md            # debug example (gate → VERIFIED)
-    ├── ft.md             # feature example
-    ├── rf.md             # refactor example
-    ├── rv.md             # review example
-    ├── pm.md             # post-mortem example
-    └── cp.md             # commit / push example
-```
 ---
 ## Usage
-Type a mode command in chat, or let the triggers fire on their own. (Realistic input is in Thai — the skill is built for a Thai-replying workflow.)
+Type a mode command in chat, or let the triggers fire on their own:
 ```
 #dbg API /users คืน 500 ตอน login
@@ -177,38 +151,21 @@ Type a mode command in chat, or let the triggers fire on their own. (Realistic i
 #pp <describe the task>   ← let Claude pick the mode
 ```
-Claude replies in Thai (technical terms / function names / paths / errors stay English), with a one-line banner stating the mode and gate status on every response, e.g.:
-```
-> #dbg | GATE 1 PASS | STEP 1.2
-```
+Every reply carries a one-line banner stating mode + gate status, e.g. `> #dbg | GATE 1 PASS | STEP 1.2`.
 ---
-## What it buys you
+## Thank you 🙏 — no strings attached
-- **No code from thin air** — no gate, no code; incomplete information means brainstorm first, never guess.
-- **Fix the right spot** — find the root cause, don't patch the symptom.
-- **No over-engineering** — YAGNI: no abstraction before its time.
-- **An honest stance** — no flattery, no hedging, no rubber-stamps; an opinionated recommendation with the tradeoffs.
-- **No hollow "it's done"** — every *done / works / เสร็จ* is backed by a `VERIFIED:` block (real commands + real output).
-- **Security first** — touching a trust boundary always goes through the security gate (OWASP Top 10), and it cannot be waved off.
-- **Enforced by mechanism, not just by asking** — the Stop hook enforces the VERIFIED block; the ledger persists to disk to fight drift in long sessions.
-- **Worked examples + an automatic commit offer** — `examples/<mode>.md` shows each mode from gate to VERIFIED block; once work is verified, the skill **offers `#cp`** on its own (never auto-commits, never offers on broken work).
-- **Clean separation of concerns** — behavior change and refactor never share a diff; one response, one mode.
+If you are reading this, you might be about to try this skill — and that already means a lot.
-```mermaid
-flowchart TD
-    P0["#plan GATE 0"] --> ST["slice table → .ppdev/plan-ledger.md"]
-    ST --> S1["slice 1 → its mode (full gate)"]
-    S1 --> Vf{"VERIFIED?"}
-    Vf -->|"No"| S1
-    Vf -->|"Yes"| CP["offer #cp → commit"]
-    CP --> NEXT["next slice ..."]
-    NEXT --> DOD{"arc DoD met?"}
-    DOD -->|"No"| S1
-    DOD -->|"Yes"| RVW["#rv over the whole arc"]
-```
+**Even a single download is real encouragement to me.** It tells me this skill is useful to someone out there, and that is the whole reason I keep building it. You don't owe me anything: use it for months on a serious codebase, or just `npx` it once to see how it feels and walk away. Both are completely fine. **No commitment, no lock-in, no pressure.**
+> ขอบคุณจริง ๆ สำหรับการดาวน์โหลดและทดลองใช้ครับ 🙏
+> แค่ **1 download** ก็เป็นกำลังใจดี ๆ ให้ผมแล้วว่า skill นี้มีประโยชน์ต่อใครสักคน
+> จะใช้ยาว ๆ กับงานจริง หรือลองเล่นแล้วเลิกก็ได้ — **ไม่ผูกมัดอะไรทั้งนั้น** ขอบคุณที่ให้โอกาสครับ
+If it saved you from one "it works" that didn't, it did its job. ❤️
 ---

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "ppdevskill",
-  "version": "1.3.0",
+  "version": "1.3.1",
   "description": "Unified engineering-partner workflow for Claude Code — debug, build, refactor, review, post-mortem — with hard gates and mechanical (hook-enforced) verification discipline.",
   "bin": {
     "ppdevskill": "bin/cli.js"