npm - @slowdini/slow-powers-opencode - Versions diffs - 0.1.4 → 0.1.5 - Mend


This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/skills/writing-skills/SKILL.md CHANGED

@@ -7,15 +7,39 @@ description: Use when creating new skills or editing existing skills. Drafting o
 ## Overview
-Skill development has two phases: **drafting** (this skill) and **evaluation** (`slow-powers:evaluating-skills`). Drafting covers naming, structure, vocabulary, anti-patterns, and rationalization-proofing. Evaluation covers measuring whether the words on the page actually shift agent behavior under realistic prompts.
+Skill development has two phases: **drafting** (this skill) and **evaluation**
+(`slow-powers:evaluating-skills`). This skill is your template for authoring a new skill and
+your checklist for auditing an existing one — it covers structure, building blocks, description
+writing, and rationalization-proofing.
-A behavioral draft you didn't measure is a claim you didn't verify. After drafting, hand off to `slow-powers:evaluating-skills` to decide whether the change is behavior-shaping (measure it — the with/without comparison and iteration loop) or deterministic instruction-following (declare the decision and reasoning, then skip). New skills and edits alike route through that decision — see "Choosing to test with evals" in that skill. Default to measuring; the skip is a narrow, announced exception, not an escape hatch.
+A behavioral draft you didn't measure is a claim you didn't verify. After drafting, hand off to
+`slow-powers:evaluating-skills` to decide whether the change is behavior-shaping (measure it) or
+deterministic instruction-following (declare the decision and reasoning, then skip). Default to
+measuring; the skip is a narrow, announced exception, not an escape hatch.
-**Personal skills** live in your harness's user-skills directory. The path differs per harness; consult the harness's docs.
+## What is a skill?
+A skill is a reusable reference guide for a proven technique, pattern, or tool — **not** a
+narrative about how you solved a problem once ("In session 2025-10-03 we found…" is too tied to
+a moment to reuse).
+**Create a skill when:** the technique wasn't intuitively obvious, you'd reference it again
+across projects, and the pattern applies broadly.
+**Don't create one for:** one-off solutions, standard practices documented elsewhere,
+project-specific conventions (put those in CLAUDE.md / AGENTS.md), or mechanical constraints a
+regex or validation could enforce — automate those instead.
-## Vocabulary
+## Skill types
+- **Technique** — concrete method with steps (condition-based-waiting, root-cause-tracing).
+- **Pattern** — a way of thinking about problems (flatten-with-flags, test-invariants).
+- **Reference** — API docs, syntax guides, tool documentation.
-Skills describe capabilities, not platform tool names. When you write a skill, use these terms. This table is the canonical source — when a new load-bearing term is coined, add it here.
+## Cross-harness vocabulary
+Skills may ship across harnesses, so they should describe *capabilities*, not platform tool names.
+Use these terms as the canonical vocabulary reference.
 | Term | Means | Don't say |
 |------|-------|-----------|
@@ -25,30 +49,6 @@ Skills describe capabilities, not platform tool names. When you write a skill, u
 | **Capability** | A described action ("search file contents") | A platform tool name ("Grep") |
 | **Load-bearing property** | A property a capability must have for the workflow to work | (no shorter form) |
-## What is a skill?
-A skill is a reference guide for proven techniques, patterns, or tools. Skills help future agents find and apply effective approaches.
-**Skills are:** reusable techniques, patterns, tools, reference guides.
-**Skills are not:** narratives about how you solved a problem once.
-**Create a skill when:**
-- The technique wasn't intuitively obvious
-- You'd reference it again across projects
-- The pattern applies broadly (not project-specific)
-**Don't create one for:**
-- One-off solutions
-- Standard practices well-documented elsewhere
-- Project-specific conventions (put those in CLAUDE.md / AGENTS.md)
-- Mechanical constraints — if a regex or validation can enforce it, automate it instead
-## Skill types
-- **Technique** — concrete method with steps to follow (condition-based-waiting, root-cause-tracing)
-- **Pattern** — way of thinking about problems (flatten-with-flags, test-invariants)
-- **Reference** — API docs, syntax guides, tool documentation
 ## SKILL.md structure
 ```markdown
@@ -59,39 +59,87 @@ description: Use when [specific triggering conditions and symptoms]
 # Skill Name
-## Overview
-What is this? Core principle in 1-2 sentences.
+## Overview        — what is this? Core principle in 1-2 sentences.
+## When to use     — symptoms and use cases; when NOT to use.
+## Core pattern    — before/after comparison (techniques/patterns).
+## Quick reference — table or bullets for scanning common operations.
+## Implementation  — inline code for simple patterns; link a file for heavy reference.
+## Common mistakes — what goes wrong + fixes.
+```
-## When to use
-Bullet list with symptoms and use cases. When NOT to use.
+**Frontmatter rules:**
+- Two required fields, `name` and `description`, max 1024 characters total. See
+  [agentskills.io/specification](https://agentskills.io/specification) for the full schema.
+- `name`: lowercase letters, numbers, hyphens only.
+- `description`: third person, triggering conditions only — see "Writing the description".
-## Core pattern (techniques/patterns)
-Before/after comparison.
+## Building blocks
-## Quick reference
-Table or bullets for scanning common operations.
+The blocks below help structure a SKILL.md file. Use the ones that fit - not every skill
+needs all of them. These aren't limiters, and your skill should contain the content it needs.
-## Implementation
-Inline code for simple patterns; link to a file for heavy reference.
+Each block does one job:
-## Common mistakes
-What goes wrong + fixes.
-```
+- **Gotchas** *(any skill)* — environment-specific facts that defy reasonable assumptions, so
+  the agent reads them *before* hitting the trap. These correct factual mistakes, not motivation:
-**Frontmatter rules:**
-- Two required fields: `name` and `description`. Max 1024 characters total. See [agentskills.io/specification](https://agentskills.io/specification) for the full schema.
-- `name`: letters, numbers, hyphens only — no parentheses or special chars.
-- `description`: third person; describes ONLY when to use. See the next section for why "what it does" is the wrong content for this field.
+  > - The `users` table uses soft deletes — queries need `WHERE deleted_at IS NULL`.
+  > - The ID is `user_id` in the DB, `uid` in auth, `accountId` in billing — same value.
+  Keep gotchas inline; when an agent makes a mistake you have to correct, add it here.
-## Skill discovery
+- **Red flags / rationalization table** *(discipline skills only)* — these look like gotchas but
+  are **not** the same: gotchas correct *facts*, red flags counter *motivated reasoning* under
+  pressure, and they come from eval pressure-testing rather than domain knowledge. See
+  "Rationalization-proofing" below for how to build them.
-The description field is how agents (and the harness's skill mechanism) decide whether to load your skill. Make it answer one question: *should I read this skill right now?*
+- **Quick-reference table** — for scanning common operations. Tables and lists, not prose.
-### Description = WHEN, not WHAT
+- **Checklist** *(multi-step skills)* — when steps have dependencies or validation gates, give a
+  checklist the agent copies into its task tracker and ticks off, so it can't skip a gate.
-Do not summarize the skill's workflow in the description. Testing has repeatedly shown that when the description summarizes the process, agents follow the description instead of reading the full skill. A description saying "code review between tasks" caused an agent to do ONE review even though the skill body clearly described TWO reviews. Changing the description to just "Use when executing implementation plans with independent tasks" — no workflow summary — produced the correct two-stage behavior.
+- **Code examples** — **one excellent example beats many mediocre ones.** Pick the language that
+  fits the domain (testing → TS/JS, system debugging → shell/Python). A good example is
+  complete, runnable, commented on the WHY, from a real scenario, ready to adapt. Don't
+  reimplement it in five languages — agents port well, and multi-language dilution means
+  mediocre quality everywhere plus maintenance burden on every change.
+## Flowchart usage
-The trap is that workflow summaries create a shortcut the agent will take. The skill body becomes documentation the agent skips.
+Use a small inline flowchart **only** when the decision is non-obvious, there's a process loop
+where you might stop too early, or it's an "A vs B" branch where the wrong choice has
+consequences. Don't use flowcharts for reference material (use tables/lists), code (use code
+blocks — `step1[import fs]` can't be copy-pasted), linear instructions (use numbered
+lists), or labels without semantic meaning (`step1`, `helper2` — labels should carry meaning).
+Write flowcharts as **mermaid** (` ```mermaid ` blocks) — it renders natively in GitHub and most
+editors, so no tooling or dependency is needed to preview. Shape carries meaning:
+| Meaning | Mermaid |
+|---|---|
+| Question / decision | `id{Label}` |
+| Action | `id[Label]` |
+| State / situation | `id(Label)` |
+| Warning / STOP | `id{{Label}}` (hexagon) |
+| Entry / exit | `id([Label])` (stadium) |
+| Edge with label | `A -->\|x\| B` |
+| Trigger / dotted edge | `A -.->\|x\| B` |
+Quote any label containing `[ ] : ( ) /` or `'` with `"..."`, e.g.
+`done(["Respond (including clarifications)"])`.
+## Writing the description
+The description is how agents (and the skill mechanism) decide whether to load your skill. Make
+it answer one question: *should I read this skill right now?*
+**Description = WHEN, not WHAT.** Do not summarize the skill's workflow. Testing has repeatedly
+shown that when the description summarizes the process, agents follow the description instead of
+reading the skill. A description saying "code review between tasks" caused an agent to do ONE
+review even though the skill body described TWO; changing it to "Use when executing
+implementation plans with independent tasks" — no workflow summary — produced the correct
+two-stage behavior. The trap: workflow summaries create a shortcut, and the skill body becomes
+documentation the agent skips.
 ```yaml
 # ❌ Summarizes workflow — agent may follow this instead of reading the skill
@@ -101,206 +149,142 @@ description: Use when executing plans — dispatches subagent per task with code
 description: Use when executing implementation plans with independent tasks in the current session
 ```
-Other description rules:
-- Start with "Use when..." to focus on triggering conditions.
-- Write in third person — descriptions are injected into the system prompt.
-- Describe the *problem* (race conditions, timing dependencies) not *language-specific symptoms* (setTimeout, sleep) unless the skill is technology-specific.
+Other rules:
+- Start with "Use when…" and write in third person — descriptions are injected into the system
+  prompt.
+- Describe the *problem* (race conditions, timing dependencies), not language-specific symptoms
+  (`setTimeout`, `sleep`) unless the skill is technology-specific.
+- **Keyword coverage:** use words an agent would actually search for — error messages ("Hook
+  timed out", "ENOTEMPTY"), symptoms ("flaky", "hanging"), synonyms ("timeout / hang / freeze").
-### Keyword coverage
-Use words an agent would actually search for: error messages ("Hook timed out", "ENOTEMPTY"), symptoms ("flaky", "hanging", "pollution"), synonyms ("timeout / hang / freeze"), and real tool names where the skill is technology-specific.
+> **Note — this is a deliberate house stance.** External sources disagree on descriptions:
+> Anthropic says include *what the skill does* plus when; agentskills favors imperative,
+> user-intent phrasing. Because there's no shared standard, we maintain our WHEN-not-WHAT rule.
+> The load-bearing part is **no workflow summary**.
 ### Naming
-Active voice, verb-first. Gerunds (-ing) work well for processes.
+Active voice, verb-first; gerunds (-ing) work well for processes. Name by what you DO or the
+core insight, not the surface category.
 - ✅ `creating-skills`, `condition-based-waiting`, `root-cause-tracing`
 - ❌ `skill-creation`, `async-test-helpers`, `debugging-techniques`
-Name by what you DO or the core insight, not the surface category.
-### Token efficiency
-Once a skill is loaded, every token in it competes with conversation history. For frequently-loaded skills, aim for under 200 words total; for other skills, keep the body lean and offload heavy reference to separate files.
-Techniques:
-- **Move details to tool help.** "Run `<tool> --help` for filter flags" beats listing every flag.
-- **Use cross-references.** Don't repeat what another skill says — link to it.
-- **Compress examples.** One good before/after pair is enough; cut the surrounding prose.
-### Cross-referencing other skills
+## Cross-referencing other skills
 Use the skill's qualified name with an explicit requirement marker:
 - ✅ `**REQUIRED SUB-SKILL:** Use slow-powers:test-driven-development`
 - ✅ `**REQUIRED BACKGROUND:** You must understand slow-powers:systematic-debugging`
+- ✅ `**REQUIRED PREREQUISITE:** You must have already completed slow-powers:systematic-debugging`
+- ✅ `**REQUIRED NEXT SKILL:** You must complete slow-powers:systematic-debugging next`
 - ❌ `See skills/testing/test-driven-development` — unclear if required, harness-specific path
-- ❌ `@skills/testing/test-driven-development/SKILL.md` — force-loads, burns context
-The `@` prefix force-loads the file on session start, consuming context before you need it.
-## Flowchart usage
+- ❌ `@skills/testing/test-driven-development/SKILL.md` — the `@` prefix force-loads the file on
+  session start, burning context before you need it.
-Use a small inline flowchart **only** when:
-- The decision is non-obvious
-- There's a process loop where you might stop too early
-- It's an "A vs B" branch where the wrong choice has consequences
+Don't repeat what another skill says — link to it.
-Don't use flowcharts for:
-- Reference material — use tables or lists
-- Code examples — use markdown code blocks
-- Linear instructions — use numbered lists
-- Labels without semantic meaning (`step1`, `helper2`)
+## Conciseness & file organization
-See `graphviz-conventions.dot` for the style rules used across this skill set.
+Once a skill loads, every token competes with conversation history. Keep the body lean: aim for
+**≤200 lines** for frequently-loaded internal skills, and treat **500 lines / 5,000 tokens** as
+the hard ceiling for any skill. Move details to tool help ("Run `<tool> --help` for flags" beats
+listing every flag), cross-reference instead of repeating, and compress examples to one good
+pair.
-To preview a skill's flowcharts as SVG, run `./scripts/render-graphs.js ../some-skill` from the `writing-skills/` directory (or pass `--combine` to merge all diagrams into one). Requires graphviz.
-## Code examples
-**One excellent example beats many mediocre ones.**
-Choose the most relevant language for the skill's domain — testing techniques tend to land best in TypeScript/JavaScript, system debugging in shell or Python, data processing in Python.
-A good example is:
-- Complete and runnable
-- Well-commented on the WHY, not the WHAT
-- From a real scenario
-- Ready to adapt, not a fill-in-the-blank template
-Don't implement the same example in five languages. Agents are good at porting — one excellent example is enough.
-## File organization
-Keep most skills self-contained in a single SKILL.md. Add supporting files only when one of these is true:
+Use progressive disclosure for anything heavy: SKILL.md is the always-loaded overview; bulky
+material lives in separate files the agent loads on demand. Tell the agent *when* to load each
+("Read `api-reference.md` if the API returns non-200") rather than a generic "see references/".
 ```
-self-contained/        # SKILL.md only — everything fits inline
-  SKILL.md
-with-reusable-tool/    # SKILL.md + working code to adapt
-  SKILL.md
-  example.ts
-with-heavy-reference/  # SKILL.md + bulky reference docs
-  SKILL.md
-  api-reference.md     # 500+ lines of API docs
-  scripts/             # executable utilities
+self-contained/      with-reusable-tool/   with-heavy-reference/
+  SKILL.md             SKILL.md               SKILL.md
+                       example.ts             api-reference.md   # 100+ lines of API docs
+                                              scripts/           # executable utilities
 ```
-Separate files are warranted for:
-1. Heavy reference (100+ lines) — API docs, comprehensive syntax tables
-2. Reusable executable tools — scripts that adapt across projects
-Otherwise keep content inline. Principles, concepts, code patterns under ~50 lines — all inline.
+Separate files are warranted only for **heavy reference (100+ lines)** or **reusable executable
+tools**. Principles, concepts, and code patterns under ~50 lines stay inline.
 ## Rationalization-proofing for discipline skills
-Skills that enforce discipline (TDD, verification-before-completion, designing-before-coding) need to survive pressure. Agents are smart and find loopholes when under time, sunk-cost, or authority pressure. Drafting an enforceable rule is different from drafting a guideline.
-The research backs this up: persuasion techniques more than double LLM compliance rates under pressure. See `persuasion-principles.md` (in this skill) for the seven principles, when each applies, and citations (Cialdini, 2021; Meincke et al., 2025).
-### Close every loophole explicitly
+Skills that enforce discipline (TDD, verification-before-completion, designing-before-coding)
+must survive pressure — agents find loopholes under time, sunk-cost, or authority pressure.
+Drafting an enforceable rule differs from drafting a guideline. The research backs this up:
+persuasion techniques more than double LLM compliance under pressure. See
+`persuasion-principles.md` for the seven principles, when each applies, and citations (Cialdini,
+2021; Meincke et al., 2025).
-State the rule, then forbid the specific workarounds you can predict. The agent will reach for the ambiguity under pressure — rule it out by name.
+**Close every loophole explicitly.** State the rule, then forbid the specific workarounds you
+can predict — the agent will reach for the ambiguity under pressure.
 ```markdown
-❌ Write code before test? Delete it.
 ✅ Write code before test? Delete it. Start over.
-   No exceptions:
-   - Don't keep it as "reference"
-   - Don't "adapt" it while writing tests
-   - Delete means delete.
+   No exceptions: don't keep it as "reference", don't "adapt" it while writing tests.
+   Delete means delete.
 ```
-### Address "spirit vs letter" arguments
-State the foundational principle early, before the agent reaches for it:
+**Address "spirit vs letter" early**, before the agent reaches for it:
 > **Violating the letter of the rules is violating the spirit of the rules.**
-This single sentence cuts off an entire class of "I'm following the spirit" rationalizations.
-### Build a rationalization table and a red-flags list
-These tables and lists come *from* the eval iteration loop — they're not something you can write up front. The eval surfaces the specific excuses an agent reaches for when the rule fails under pressure. Capture them verbatim and bake them back into the skill:
+**Build the rationalization table and red-flags list *from* the eval loop** — they aren't
+something you write up front. The eval surfaces the specific excuses an agent reaches for; capture
+them verbatim and bake them back in:
 ```markdown
 | Excuse | Reality |
 |--------|---------|
 | "Too simple to test" | Simple code breaks. Test takes 30 seconds. |
 | "I'll test after" | Tests passing immediately prove nothing. |
-| "Tests after achieve the same goals" | Tests-after = "what does this do?" Tests-first = "what should this do?" |
-```
-```markdown
 ## Red flags — STOP and start over
 - Code before test
 - "I already manually tested it"
-- "Tests after achieve the same purpose"
-- "It's about spirit not ritual"
-- "This is different because..."
-All of these mean: delete code. Start over with TDD.
-```
-See `slow-powers:evaluating-skills` and its `pressure-scenarios.md` for the pressure-type taxonomy and how to write prompts that actually stress the rule (rather than letting the agent recite the skill and "pass" without proving anything). The mid-session rationalizations that belong in these tables surface most reliably from *seeded* eval cases — ones that embed a prior transcript so the agent meets the rule already committed to skipping it; see "Seeding conversation context" in that skill.
-## Anti-patterns
-### ❌ Narrative example
-> "In session 2025-10-03, we found empty projectDir caused..."
-Too specific to a moment in time. Not reusable.
-### ❌ Multi-language dilution
-`example-js.js`, `example-py.py`, `example-go.go`
-Mediocre quality across all of them, maintenance burden on every change.
-### ❌ Code in flowcharts
-```
-step1 [label="import fs"];
-step2 [label="read file"];
+- "This is different because…"
 ```
-Can't copy-paste; hard to read. Use markdown code blocks instead.
-### ❌ Generic labels
-`helper1`, `helper2`, `step3`, `pattern4`
-Labels should carry semantic meaning.
+The mid-session rationalizations that belong here surface most reliably from *seeded* eval cases
+— ones that embed a prior transcript so the agent meets the rule already committed to skipping
+it. See `slow-powers:evaluating-skills` (`pressure-scenarios.md` and "Seeding conversation
+context") for the pressure taxonomy.
-## Skill creation checklist
+## Validation checklist
-Use your persistent task tracker — one task per item.
+Use your persistent task tracker — one task per item. Works for authoring a new skill or
+auditing an existing one.
 **Draft:**
-- [ ] Name uses only letters, numbers, hyphens
-- [ ] YAML frontmatter has `name` and `description` (under 1024 chars total)
-- [ ] Description starts with "Use when..." and includes triggers / symptoms
-- [ ] Description is third person and contains NO workflow summary
-- [ ] Body keeps to one excellent example per concept, not many mediocre ones
-- [ ] Heavy reference and reusable tools live in separate files; principles stay inline
+- [ ] Name uses only lowercase letters, numbers, hyphens
+- [ ] Frontmatter has `name` and `description` (under 1024 chars total)
+- [ ] Description starts with "Use when…", is third person, includes triggers/symptoms, and
+      contains NO workflow summary
+- [ ] Body keeps to one excellent example per concept; no narrative-of-one-session content
+- [ ] Heavy reference (100+ lines) and reusable tools live in separate files; principles inline
+- [ ] Flowcharts only for non-obvious decisions/loops/branches; semantic labels, no code
 - [ ] Cross-references use `slow-powers:<skill-name>`, not file paths or `@` imports
+- [ ] Body is lean (≤200 lines preferred, 500 max)
 **Validate** (handoff to `slow-powers:evaluating-skills`):
-- [ ] Decide whether the change is behavior-shaping or deterministic, and announce the decision and reasoning to the user (see "Choosing to test with evals" in that skill). Default to behavior-shaping when unsure.
-- [ ] If behavior-shaping (or the user opts in): author `evals/evals.json` with 2–3 realistic prompts
-- [ ] For discipline-enforcing skills, write pressure prompts with multiple combined pressures (see `pressure-scenarios.md` in that skill)
-- [ ] If the skill's real-world failure is *mid-session* (a competing attractor — prior commitment, redundancy framing, sunk cost, an in-flight workflow; common for discipline-enforcing skills), include at least one **seeded** case that embeds a short prior transcript in the prompt, kept alongside a cold contrast case (see "Seeding conversation context" in `slow-powers:evaluating-skills`)
-- [ ] Run the eval. Iterate until the with-skill pass rate is materially higher than the without-skill baseline.
+- [ ] Decide whether the change is behavior-shaping or deterministic, and announce the decision
+      and reasoning (see "Choosing to test with evals"). Default to behavior-shaping when unsure.
+- [ ] If behavior-shaping (or the user opts in): author `evals/evals.json` with 2–3 realistic
+      prompts
+- [ ] For discipline-enforcing skills, write pressure prompts combining multiple pressures, plus
+      at least one **seeded** case (embeds a prior transcript) alongside a cold contrast case
+- [ ] Run the eval. Iterate until the with-skill pass rate is materially higher than baseline.
 **Deploy:**
-- [ ] Commit the skill (and its `evals/evals.json`, when one was authored) together
-- [ ] In the PR description, include before/after eval results — or, for a deterministic change, the stated decision and reasoning to skip (per repo CLAUDE.md)
+- [ ] Commit the skill (and its `evals/evals.json`, when authored) together
+- [ ] In the PR, include before/after eval results — or, for a deterministic change, the stated
+      decision and reasoning to skip (per repo CLAUDE.md)
 ## Further reading
 - `slow-powers:evaluating-skills` — phase 2: measuring whether the draft works
-- `persuasion-principles.md` (in this skill) — research foundation for discipline-enforcing language
-- `graphviz-conventions.dot` (in this skill) — flowchart style rules
-- [agentskills.io/skill-creation/best-practices](https://agentskills.io/skill-creation/best-practices) — harness-agnostic best-practices reference; read when you want more depth than this skill provides
+- `persuasion-principles.md` (in this skill) — research foundation for discipline language
+- [agentskills.io best-practices](https://agentskills.io/skill-creation/best-practices) and
+  [optimizing-descriptions](https://agentskills.io/skill-creation/optimizing-descriptions) —
+  harness-agnostic depth on patterns and description testing
+- [Anthropic Agent Skills best practices](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices)
+  — degrees-of-freedom, progressive disclosure, and script-bundling depth
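
The "Draft" items in the new validation checklist are mostly mechanical, so they could in principle be checked by a small script. The following is a minimal sketch in Python, not part of the package: `parse_frontmatter` and `check_skill` are hypothetical names, the frontmatter is assumed to be flat `key: value` pairs, "1024 characters total" is read as the combined length of `name` and `description`, and the 500-line ceiling is applied to the whole file.

```python
import re

# Rules taken from the checklist above; the helper names are hypothetical.
NAME_RE = re.compile(r"[a-z0-9-]+")  # lowercase letters, numbers, hyphens only
MAX_FRONTMATTER_CHARS = 1024         # assumed to mean name + description combined
MAX_LINES = 500                      # hard ceiling, assumed to cover the whole file


def parse_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs between the leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def check_skill(text: str) -> list[str]:
    """Return a list of problems; an empty list means the draft checks pass."""
    fields = parse_frontmatter(text)
    name = fields.get("name", "")
    description = fields.get("description", "")
    problems = []
    if not name:
        problems.append("missing name")
    elif not NAME_RE.fullmatch(name):
        problems.append("name must use only lowercase letters, numbers, hyphens")
    if not description:
        problems.append("missing description")
    elif not description.startswith("Use when"):
        problems.append('description should start with "Use when"')
    if len(name) + len(description) > MAX_FRONTMATTER_CHARS:
        problems.append("name + description exceed 1024 characters")
    if len(text.splitlines()) > MAX_LINES:
        problems.append("skill exceeds the 500-line ceiling")
    return problems
```

A check like this only covers the mechanical items. Whether a description contains a workflow summary still needs a human or an eval to judge.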

package/skills/using-git-worktrees/SKILL.md DELETED

@@ -1,70 +0,0 @@
----
-name: using-git-worktrees
-description: Use when starting new feature development or bugfix work to establish a safe, isolated development workspace.
----
-# Using Git Worktrees
-Ensure work happens in an isolated workspace. Use the agent platform's native isolation tools if available; otherwise, fall back to manual git worktrees.
-**Announce at start:** "I am using the using-git-worktrees skill to set up an isolated workspace."
----
-## Step 0: Detect Existing Isolation
-Before creating anything, verify if you are already in an isolated workspace or worktree:
-```bash
-GIT_DIR=$(cd "$(git rev-parse --git-dir)" 2>/dev/null && pwd -P)
-GIT_COMMON=$(cd "$(git rev-parse --git-common-dir)" 2>/dev/null && pwd -P)
-```
-* **If `GIT_DIR != GIT_COMMON` (and not in a git submodule):** You are already in an isolated worktree. Skip creation and proceed to Step 2.
-* **If `GIT_DIR == GIT_COMMON` (or in a submodule):** You are in a normal repository checkout. Ask for user consent before creating a worktree:
-  > "Would you like me to set up an isolated git worktree? This protects your current workspace and branch from changes."
-If the user declines, work in place and skip to Step 2.
----
-## Step 1: Create Isolated Workspace (Git Fallback)
-If the user consents and no native platform isolation tool is present, create a git worktree manually.
-### 1. Directory Selection & Safety
-Select the worktree directory path by priority:
-1. Local directory: `.worktrees/` (preferred) or `worktrees/` at the project root.
-2. Global legacy directory: `~/.config/slow-powers/worktrees/<project-name>/`
-3. Fallback: Default to `.worktrees/`
-**Safety Guard (for local directories):** Verify the worktree directory is ignored in `.gitignore`:
-```bash
-git check-ignore -q .worktrees 2>/dev/null || git check-ignore -q worktrees 2>/dev/null
-```
-*If not ignored, add the path to `.gitignore` and commit it before creating the worktree. This prevents worktree contents from being tracked.*
-### 2. Create the Worktree
-```bash
-git worktree add "<path-to-worktree>/<branch-name>" -b "<branch-name>"
-cd "<path-to-worktree>/<branch-name>"
-```
-*If creation fails due to sandbox or permission constraints, notify the user and safely proceed to work in place.*
----
-## Step 2: Project Setup & Baseline Verification
-### 1. Install Dependencies
-Detect and run appropriate project setup:
-* **Node.js:** `npm install`
-* **Rust:** `cargo build`
-* **Python:** `pip install -r requirements.txt` or `poetry install`
-* **Go:** `go mod download`
-### 2. Run Baseline Tests
-Run tests before writing any code to ensure the workspace starts clean:
-```bash
-npm test / cargo test / pytest / go test ./...
-```
-* **If tests fail:** Report the failures and ask the user whether to investigate or proceed.
-* **If tests pass:** Proceed with implementation.
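
For readers who want to see what the deleted skill's Step 0 check amounts to, here is a rough Python equivalent of the two `git rev-parse` lines. It is a sketch under assumptions, not code from the package: `git_path`, `is_linked_worktree`, and `detect_isolation` are hypothetical names, and the submodule exception mentioned in the skill is not handled.

```python
import os
import subprocess


def git_path(flag: str, cwd: str = ".") -> str:
    """Resolve `git rev-parse <flag>` to an absolute, symlink-free path."""
    out = subprocess.run(
        ["git", "rev-parse", flag],
        cwd=cwd, check=True, capture_output=True, text=True,
    ).stdout.strip()
    # git may print a path relative to cwd (e.g. ".git"); join handles both cases.
    return os.path.realpath(os.path.join(cwd, out))


def is_linked_worktree(git_dir: str, git_common: str) -> bool:
    """True when the per-worktree git dir differs from the shared git dir."""
    return os.path.realpath(git_dir) != os.path.realpath(git_common)


def detect_isolation(cwd: str = ".") -> bool:
    """Mirror of the skill's GIT_DIR != GIT_COMMON comparison (no submodule check)."""
    return is_linked_worktree(
        git_path("--git-dir", cwd), git_path("--git-common-dir", cwd)
    )
```

Keeping the comparison in its own function means it can be exercised without a real repository; only `detect_isolation` shells out to git.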

package/skills/using-git-worktrees/evals/evals.json DELETED

@@ -1,40 +0,0 @@
-{
-  "skill_name": "using-git-worktrees",
-  "evals": [
-    {
-      "id": "start-feature-isolate",
-      "prompt": "I'm about to start building a new payment-retry feature in this repo. Get me set up to work on it.",
-      "expected_output": "Before touching code the agent establishes (or offers, with consent) an isolated workspace — detecting whether it is already isolated and creating a worktree / dedicated branch if not — and verifies a clean baseline (install deps, run the existing test suite). The failure mode is silently starting to edit on the current branch.",
-      "assertions": [
-        {
-          "id": "detected_isolation_state",
-          "type": "transcript_check",
-          "check": "tool_invocation_matches",
-          "pattern": "git rev-parse --git-(dir|common-dir)"
-        },
-        {
-          "id": "established_or_offered_isolation",
-          "type": "llm_judge",
-          "rubric": "Did the agent establish or explicitly offer an isolated workspace for the new feature — checking the current isolation state and creating (or proposing with consent) a git worktree or dedicated branch — rather than silently editing on the current branch? PASS if isolation is set up or clearly offered before implementation. FAIL if the agent just starts making changes in place."
-        },
-        {
-          "id": "baseline_verification",
-          "type": "llm_judge",
-          "rubric": "Did the agent set up the project and confirm a clean starting point — installing dependencies and running the existing test suite as a baseline — before writing feature code? PASS if baseline verification is included or planned. FAIL if it is absent."
-        }
-      ]
-    },
-    {
-      "id": "typo-no-worktree",
-      "prompt": "Fix the typo in the README heading — 'Slow-powrs' should be 'Slow-powers'.",
-      "expected_output": "A one-character documentation fix is mechanical work; the agent should just make the edit and not stand up an isolated worktree/branch for it.",
-      "assertions": [
-        {
-          "id": "no_worktree_ceremony",
-          "type": "llm_judge",
-          "rubric": "Did the agent treat this as a trivial mechanical edit and make the fix directly, WITHOUT spinning up an isolated worktree or dedicated feature branch for a one-character typo? PASS if it just fixes the typo. FAIL if it sets up worktree isolation for this change."
-        }
-      ]
-    }
-  ]
-}
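
The deleted file suggests a simple shape for `evals/evals.json`: a `skill_name`, a list of cases, and assertions of type `transcript_check` or `llm_judge`. The sketch below checks a file against that shape. The required keys are inferred only from the example above, not from a published schema, `validate_evals` is a hypothetical helper, and treating `pattern` as a regular expression is an assumption.

```python
import json
import re

# Keys each assertion type appears to need, inferred from the deleted file.
REQUIRED_BY_TYPE = {
    "transcript_check": ("check", "pattern"),
    "llm_judge": ("rubric",),
}


def validate_evals(raw: str) -> list[str]:
    """Return a list of structural problems found in an evals.json string."""
    data = json.loads(raw)
    problems = []
    if not data.get("skill_name"):
        problems.append("missing skill_name")
    for case in data.get("evals", []):
        case_id = case.get("id", "<no id>")
        for key in ("prompt", "expected_output", "assertions"):
            if not case.get(key):
                problems.append(f"{case_id}: missing {key}")
        for assertion in case.get("assertions", []):
            kind = assertion.get("type")
            label = f"{case_id}/{assertion.get('id', '<no id>')}"
            if kind not in REQUIRED_BY_TYPE:
                problems.append(f"{label}: unknown type {kind!r}")
                continue
            for key in REQUIRED_BY_TYPE[kind]:
                if not assertion.get(key):
                    problems.append(f"{label}: missing {key}")
            if kind == "transcript_check":
                # Assumption: the pattern is meant to be a regular expression.
                try:
                    re.compile(assertion.get("pattern", ""))
                except re.error:
                    problems.append(f"{label}: pattern is not a valid regex")
    return problems
```

This only validates structure. It says nothing about whether the prompts actually stress the skill, which is what the eval run itself is for.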