npm - litclaude-ai - Versions diffs - 0.2.2 - Mend

litclaude-ai 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (156) hide show

package/plugins/litclaude/skills/lit-loop/SKILL.md ADDED Viewed

@@ -0,0 +1,144 @@
+---
+name: lit-loop
+description: Evidence-bound LitClaude execution loop with goal/workflow/worktree bootstrap, RED->GREEN tests, manual QA artifacts, cleanup receipts, and continuation ledgers.
+---
+# Lit Loop
+Use this skill when the user asks for `lit`, `litwork`, `$lit-loop`, or an
+evidence-bound implementation loop. It is the Claude Code port of LitClaude
+Litwork/Litgoal discipline.
+When the request is still underspecified, pause execution and route to
+`/litclaude:deep-interview` or a completed `deep-interview/<slug>-spec.md`
+before planning or editing. Lit-loop execution should not invent non-goals, decision
+boundaries, or acceptance criteria.
+## Role
+Deliver the user's objective end-to-end. Convert the objective into explicit
+success criteria before editing. Keep decisions, evidence, cleanup receipts,
+and remaining work in a durable notepad, plan, or ledger when the task spans
+multiple steps.
+## Bootstrap
+Before implementation:
+1. Read the relevant files and existing plans/handoffs.
+2. PIN the objective, non-goals, commit/push boundary, publish boundary, and
+   any dirty worktree files that must not be reverted.
+3. Define 3+ success criteria covering happy path, edge/regression, and
+   adversarial risk.
+4. Pair each criterion with a failing automated test or reproduction.
+5. Pair each criterion with one real Manual-QA channel.
+6. Register cleanup for every tmux session, process, port, browser context, or
+   temp directory as soon as it is created.
+7. Reread stale state before action: plans, handoffs, package metadata,
+   registry facts, test output, and resumed ledgers can drift between turns.
+## Native Goal + Dynamic Workflow
+Use Claude Code's native goal surface when available:
+- If model-facing tools exist, call `get_goal`, call `create_goal` only when no
+  matching active goal exists, and call `update_goal` only after every success
+  criterion has evidence or the work is genuinely blocked.
+- If the user chooses `/goal`, treat it as the native Claude Code session goal.
+  LitClaude must not auto-type or send `/goal` text.
+- If model-facing goal tools are not exposed, continue quietly with the local
+  evidence ledger. Mention `/goal <completion condition>` only when it helps a
+  long-running task; do not spam fallback status.
+Dynamic workflow orchestration is mandatory for broad, risky, parallel, or
+long-running work: call `Workflow` when exposed instead of merely mentioning it.
+Dynamic worktree isolation is mandatory for risky edit lanes: call
+`EnterWorktree` when exposed. If only the CLI surface is available, use or
+recommend `claude --worktree <short-name> --tmux`.
+## Subagent Assignment Contract
+When Claude Code subagents or Dynamic workflow lanes are available, delegate
+work as executable assignments, not as loose context handoffs. Each child-agent
+message starts with `TASK:` and includes `DELIVERABLE`, `SCOPE`, and `VERIFY`.
+Keep the scope small, name exact files or directories, include the required
+test/reproduction, and name the Manual-QA channel plus cleanup receipt.
+Run independent child work in the background only when lanes do not touch the
+same files. Use short wait cycles for mailbox updates, and treat a timeout as
+"no update yet", not as a pass or fail. If a child returns no deliverable,
+stays silent after a targeted follow-up, or reports `BLOCKED:`, record the
+missing deliverable and use a smaller fallback assignment. Review fallback must
+preserve a reviewer role; it is not a generic worker task.
+## Manual-QA Channels
+Pick one channel per criterion and run it:
+- HTTP: `curl -i <url>` with status, headers, and body captured.
+- tmux: `tmux new-session -d -s <name> '<command>'`, capture pane output, then
+  kill the session and prove cleanup.
+- Browser: drive the real page, capture action log and screenshot.
+- Computer use: automate the actual desktop app and capture action log.
+Tests are the floor. Manual QA is the ceiling.
+## Execution Loop
+For each criterion:
+1. PIN: restate the criterion, non-goal, stale state to reread, and rollback
+   limits before editing.
+2. RED: write the failing test first and capture the failing output.
+3. GREEN: implement the smallest correct change.
+4. VERIFY: run targeted tests and relevant full suite.
+5. SURFACE: run the manual QA scenario through the chosen channel.
+6. REVIEW: run the reviewer gate or request a review when the change is broad,
+   risky, security-sensitive, or release-facing.
+7. CLEAN: tear down every spawned resource and record the cleanup receipt.
+   Cleanup must be bounded: include timeout/kill commands for long-running
+   tmux sessions, servers, child processes, and browser contexts.
+8. RECORD: update the notepad/ledger with PASS, FAIL, or BLOCKED.
+Repeat the same criterion after a fix; do not jump to a new criterion while the
+current one has missing evidence.
+Use the shorthand `PIN -> RED -> GREEN -> VERIFY -> SURFACE -> REVIEW -> CLEAN
+-> RECORD` when writing plans or ledgers. The SURFACE step must include a
+Manual-QA artifact, not only automated tests.
+## Adversarial Classes
+Probe relevant classes:
+- malformed input
+- prompt injection
+- cancel/resume
+- stale state
+- dirty worktree
+- hung or long commands
+- flaky tests
+- misleading success output
+- repeated interruptions
+If a class does not apply, record the one-line reason.
+## Stop rules
+Stop only when all criteria have green automated evidence, real channel
+artifacts, cleanup receipts, and the required commit/push or publish instruction
+has been handled. If blocked, record the exact blocker and the next executable
+command.
+Stop immediately before remote mutations unless the user has requested the
+commit/push or publish action in this session. On resume, reread the ledger and
+latest git status before continuing. Never revert dirty worktree changes you did
+not make; either work around them or ask when they block the task.
+## Porting Note
+LitClaude original Litwork text is intentionally demanding. LitClaude keeps
+that pressure but replaces source-specific mechanics with Claude Code surfaces. If a
+future Claude release exposes native goal tools directly, keep the same
+evidence gate and simply swap the local ledger bootstrap for the native tool
+calls.

package/plugins/litclaude/skills/lit-plan/SKILL.md ADDED Viewed

@@ -0,0 +1,125 @@
+---
+name: lit-plan
+description: Strategic LitClaude planner for decision-complete plans with goal/workflow/worktree guidance, tests, manual QA, cleanup, and publish guardrails.
+---
+# Lit Plan
+You are a planner, not an implementer. Explore first, resolve discoverable
+facts, and write one execution-ready plan under `plans/`. The plan should let a
+fresh Claude Code session execute without re-solving the architecture.
+## Planning Contract
+Every plan must include:
+- objective and non-goals
+- Bootstrap instructions that PIN mutable facts, dirty worktree boundaries,
+  commit/push approval, publish approval, stale state checks, and resume points
+- current evidence from files, docs, or external source links
+- ordered TODOs with top-level checkboxes
+- acceptance criteria for each meaningful deliverable
+- test commands and expected failure/success observables
+- one Manual-QA channel per user-facing criterion
+- cleanup receipt requirements for spawned resources
+- reviewer gate criteria for risky, shared, security-sensitive, or
+  release-facing changes
+- commit/push and publish guardrails
+- handoff expectations if the plan is long-running
+## Exploration
+Read code before planning. Use search tools and read-only subagents when the
+surface is broad. For private workflow references, inspect primary
+source files and pin the source URL or local clone path in the plan.
+If the user's brief lacks non-goals, decision boundaries, or testable
+acceptance criteria, recommend `/litclaude:deep-interview` first instead of
+inventing requirements. Treat a completed `deep-interview/<slug>-spec.md` as
+the requirements source of truth for the plan.
+Do not invent API behavior. If Claude Code exposes a model-facing tool, plan to
+use it. If it only exposes a UI slash command, write the user-visible command
+and do not pretend the skill can silently execute it.
+## Native Goal + Dynamic Workflow
+Include native goal handling:
+- If model-facing goal tools are exposed, call `get_goal`, create with
+  `create_goal` only when no matching active goal exists, and delay
+  `update_goal` until verified completion or a real blocker.
+- If goal tools are not exposed, plan a quiet local ledger and a user-visible
+  `/goal <completion condition>` option instead of claiming native activation.
+- If only the `/goal` UI surface is available, include an exact
+  `/goal <completion condition>` line for the user or wrapper to run.
+- Do not auto-type or send `/goal` text from a skill.
+For broad, risky, parallel, or long-running implementation, include Dynamic workflow
+and Dynamic worktree instructions:
+- call `Workflow` when Claude Code exposes it
+- call `EnterWorktree` for isolated model-facing edit lanes when available
+- otherwise recommend `claude --worktree <short-name> --tmux` for explicit
+  operator-managed isolation
+## Subagent Assignment Contract
+When a plan calls for child agents or Dynamic workflow lanes, write assignments
+as executable work orders. Each lane starts with `TASK:` and names
+`DELIVERABLE`, `SCOPE`, and `VERIFY`. The scope must be small enough for one
+worker or reviewer to finish without guessing, and it must name exact files,
+tests, Manual-QA channel, artifact path, and cleanup receipt.
+Plan independent lanes so they can run in the background, but document where
+serialization is required. Require short wait cycles for mailbox updates; a
+timeout only means no new update arrived. If a lane has a missing deliverable,
+returns only acknowledgement, or reports `BLOCKED:`, the executor should use a
+targeted follow-up once, then a smaller fallback assignment. Reviewer fallback
+must keep a reviewer role and is not a generic worker.
+## Plan Structure
+Use this shape:
+1. TL;DR with status, deliverables, effort, and risk.
+2. Scope: must-have and must-not-have.
+3. Bootstrap: PIN facts to reread, dirty worktree constraints, stale state
+   risks, resume ledger path, and bounded cleanup rules.
+4. Current evidence and source references.
+5. Execution waves with dependency matrix.
+6. TODOs with acceptance criteria and QA scenario per task.
+7. TDD loop per task: `PIN -> RED -> GREEN -> VERIFY -> SURFACE -> REVIEW ->
+   CLEAN -> RECORD`.
+8. Verification commands.
+9. Release/commit/publish instructions.
+## QA Scenario Requirements
+Each scenario must name:
+- exact tool: `curl`, `tmux`, browser action, or desktop automation
+- exact command or steps
+- binary PASS/FAIL observable
+- artifact path
+- cleanup command and cleanup receipt path
+- bounded timeout or kill strategy for long-running sessions, servers, ports,
+  browser contexts, temp directories, and child processes
+`--dry-run` alone is not enough for final user-facing proof, but it is useful
+for package/install smoke when paired with real command output.
+## Planner Restraints
+Do not edit production files. Do not mark tasks complete. Do not publish. Do
+not open-endedly brainstorm when the user asked for an executable plan. If the
+task is too vague, ask the smallest clarifying question or write assumptions
+explicitly.
+## Stop rules
+The plan must tell the executor to stop before commit/push or publish unless
+the user has approved that remote mutation. It must also tell the executor to
+stop and reread state on resume, repeated interruption, stale registry/package
+facts, dirty worktree conflicts, malformed plan sections, or misleading success
+output without artifacts plus STATUS lines.

package/plugins/litclaude/skills/litgoal/SKILL.md ADDED Viewed

@@ -0,0 +1,219 @@
+---
+name: litgoal
+description: Durable LitClaude goal orchestration with CLI-backed state, explicit success criteria, evidence ledgers, checkpoints, steering, quality gates, and cleanup receipts.
+---
+# Litgoal
+Use this skill when work must survive turns, compaction, interrupted sessions,
+or handoff to another agent. The core rule is strict: a goal is not complete
+until every success criterion has observable evidence from a real surface and a
+cleanup receipt.
+LitClaude ships a local runtime for this workflow. Use the package CLI as the
+source of durable state:
+```sh
+litclaude-ai litgoal create-goals --brief "<brief>" --json
+litclaude-ai litgoal status --json
+litclaude-ai litgoal criteria --json
+litclaude-ai litgoal record-evidence --criterion <id> --status pass --json '{"artifact":"...","cleanup":"..."}'
+litclaude-ai litgoal checkpoint --status complete --note "<summary>" --json
+litclaude-ai litgoal steer --kind scope --note "<evidence-backed steering>" --json
+litclaude-ai litgoal record-review-blockers --blocker "<finding>" --json
+```
+The executable runtime lives in `plugins/litclaude/lib/litgoal/`. Runtime
+state lives under `.litclaude/litgoal/` in the active working directory:
+- `.litclaude/litgoal/brief.md` stores the original brief.
+- `.litclaude/litgoal/goals.json` stores the durable objective, status, criteria,
+  blockers, steering decisions, and checkpoints.
+- `.litclaude/litgoal/ledger.jsonl` is the append-only evidence ledger.
+The runtime uses atomic JSON writes and a lock-safe mutation path so concurrent
+recording attempts do not corrupt `goals.json` or `ledger.jsonl`. Never hand-edit
+these files as a substitute for CLI commands.
+## Goal Shape
+Represent every litgoal run as:
+- objective
+- non-goals and user constraints
+- success criteria
+- expected evidence per criterion
+- Manual-QA channel per criterion
+- current state
+- blockers
+- cleanup receipts
+- checkpoint history
+- final quality gate
+If model-facing goal tools are exposed, inspect them before creating a new
+active goal. Treat `/goal` as a user-visible Claude Code slash command. Never
+auto-type `/goal`, never claim slash-command activation without evidence, and
+never mark a goal complete through native goal tooling until the local
+litgoal criteria have passed.
+## Bootstrap
+Do these steps before implementation edits:
+1. Read `HANDOFF.md`, the active plan, and `.litclaude/litgoal/ledger.jsonl` when
+   they exist.
+2. Run `litclaude-ai litgoal status --json`.
+3. If no state exists for the current objective, run
+   `litclaude-ai litgoal create-goals --brief "<brief>" --json`.
+4. Run `litclaude-ai litgoal criteria --json` and inspect the criterion ids.
+5. Refine the working checklist so each criterion has exact expected evidence
+   and one Manual-QA channel.
+Do not start production edits when the objective, constraints, or evidence
+surface is ambiguous. Resolve the ambiguity or record a blocker first.
+## Success Criteria Refinement
+Each criterion must have:
+- `id`
+- scenario
+- exact expected evidence
+- pass/fail boundary
+- adversarial classes to probe
+- Manual-QA channel: HTTP call, tmux, browser use, or computer use
+- cleanup receipt requirement
+Tests are supporting evidence. They are not sufficient completion proof for
+user-facing behavior. A criterion can pass only when its real-surface artifact
+exists and the cleanup receipt proves no QA resource remains alive.
+## Execution Loop
+Work one criterion at a time:
+1. PLAN: read the criterion, prior ledger entries, constraints, and blockers.
+2. PIN: when touching existing behavior, capture the current observable behavior
+   with a characterization test or transcript before changing it.
+3. RED: add a failing test or reproduction for the desired behavior.
+4. GREEN: implement the smallest change that satisfies the criterion.
+5. SURFACE: run the criterion's Manual-QA channel.
+6. CAPTURE: save the artifact path, command, status, and observable result.
+7. CLEAN: tear down every spawned process, tmux session, browser context, port,
+   temp directory, container, worktree, or QA file.
+8. RECORD: call `record-evidence` with pass, fail, or blocked.
+9. CHECKPOINT: call `checkpoint` when a criterion, blocker, or steering change
+   alters durable state.
+Do not batch unrelated criteria. If discovery changes the goal, use steering
+instead of silently changing the plan.
+## Recording Evidence
+Use `record-evidence` for exactly one criterion result:
+```sh
+litclaude-ai litgoal record-evidence \
+  --criterion criterion-1 \
+  --status pass \
+  --json '{"artifact":".litclaude/lit-loop/evidence/criterion-1.txt","manualQa":"tmux transcript","cleanup":"tmux kill-session ...; HAS_SESSION_STATUS:1","adversarial":["malformed_input","stale_state"]}'
+```
+Allowed statuses are `pass`, `fail`, and `blocked`. A malformed evidence JSON
+payload must fail with a controlled CLI error. A missing criterion id must fail
+with a controlled state error. Never record `pass` while a QA resource is still
+alive or while the artifact is only a dry-run claim.
+## Manual-QA Channels
+Manual-QA channels are required for criterion completion.
+Pick one real channel per criterion:
+- HTTP call: `curl -i` against the live endpoint.
+- tmux: create a named session, drive the command, capture the pane transcript,
+  kill the session, and verify `HAS_SESSION_STATUS:1`.
+- Browser use: drive the real page, capture an action log and screenshot, then
+  close the context.
+- Computer use: automate the desktop surface, capture an action log and
+  screenshot, then close the app/session.
+Auxiliary CLI output can prove data-shaped behavior, but it does not replace
+the selected Manual-QA channel for user-visible behavior.
+## Checkpoints
+Use checkpoints for resumability:
+```sh
+litclaude-ai litgoal checkpoint --status active --note "<progress>" --json
+litclaude-ai litgoal checkpoint --status blocked --note "<blocker>" --json
+litclaude-ai litgoal checkpoint --status complete --note "<evidence summary>" --json
+```
+The runtime must reject `--status complete` until every criterion in
+`.litclaude/litgoal/goals.json` is `pass`. If completion is rejected, inspect
+`criteria --json`, fix or record the pending criterion, and retry.
+## Steering
+Use `steer` only for evidence-backed changes in direction:
+```sh
+litclaude-ai litgoal steer --kind scope --note "<what changed and why>" --json
+litclaude-ai litgoal steer --kind priority --note "<new order evidence>" --json
+litclaude-ai litgoal steer --kind blocker --note "<blocking evidence>" --json
+litclaude-ai litgoal steer --kind quality --note "<quality gate adjustment>" --json
+litclaude-ai litgoal steer --kind handoff --note "<handoff update>" --json
+```
+Normal prose does not mutate durable state. Steering must explain the observed
+evidence and the exact next action. Invalid steering kinds must fail with a
+controlled CLI error.
+## Final Quality Gate
+Before final completion:
+1. All criteria are `pass` in `litclaude-ai litgoal criteria --json`.
+2. Targeted tests for changed behavior pass.
+3. The relevant full validation suite passes for the blast radius.
+4. Manual-QA artifacts exist for every user-facing criterion.
+5. Cleanup receipts exist for every QA resource.
+6. Review has no blocking findings for broad or risky changes.
+7. Commit, push, tag, and publish actions match the user's explicit request.
+Only after this gate may you call:
+```sh
+litclaude-ai litgoal checkpoint --status complete --note "<final evidence>" --json
+```
+## Recovery
+When resuming:
+1. Read the handoff and ledger before doing new work.
+2. Run `litclaude-ai litgoal status --json`.
+3. Run `litclaude-ai litgoal criteria --json`.
+4. Reproduce the latest blocker or verify the latest pass before proceeding.
+5. Continue from the first pending or failed criterion.
+Do not create a second parallel goal for the same objective just because the old
+state is inconvenient. If state is corrupt, capture the controlled error and
+record a blocker or steering note instead of inventing state.
+## Stop Rules
+Stop and surface the state when:
+- All criteria pass and the final quality gate is complete.
+- The same criterion fails three times with the same cause.
+- Five cycles on one goal do not produce all-pass criteria.
+- A safety boundary appears: destructive command, secret exposure, production
+  write, or unapproved publish/push.
+- Required external access, credentials, hardware, or user approval is missing.
+- A QA process, tmux session, browser context, bound port, container, temp dir,
+  or worktree cannot be cleaned up.
+Leftover runtime state means `blocked`, not `pass`.

package/plugins/litclaude/skills/lsp/SKILL.md ADDED Viewed

@@ -0,0 +1,63 @@
+---
+name: lsp
+description: Claude Code language-server workflow for diagnostics, definitions, references, rename safety, and post-edit checks.
+---
+# LSP
+Use this skill when edits touch code where language-server feedback is useful.
+LitClaude ships `.lsp.json` to describe plugin-local TypeScript/JavaScript LSP
+settings, and this skill uses LitClaude LSP habit into Claude Code wording.
+## First Checks
+1. Identify the edited language and whether a configured server exists.
+2. Run diagnostics after the edit, preferably on the changed file first.
+3. If diagnostics fail because the language server is missing, report the setup
+   command or controlled skip rather than inventing a pass.
+4. Do not block docs-only work on language-server availability.
+## Claude Code Tool Surface
+When Claude Code exposes LSP tools, prefer those model-facing tools for:
+- diagnostics for one file or directory
+- go-to-definition before editing a referenced symbol
+- find-references before rename or API removal
+- document/workspace symbols for large files
+- prepare-rename before rename
+- apply rename only when the server validates it
+If the LSP tool is not exposed, use project commands such as `npm test`,
+`tsc --noEmit`, `biome check`, `eslint`, `pyright`, `ruff`, `cargo check`, or
+`go test` according to the repo's stack.
+## LitClaude Defaults
+The MVP `.lsp.json` declares TypeScript and JavaScript support through
+`typescript-language-server`. `scripts/validate-plugin.mjs` and
+`plugins/litclaude/bin/litclaude-lsp-doctor.js` should explain missing
+language-server binaries as actionable setup, not as mysterious failures.
+## Editing Rules
+- For public API changes, inspect references before editing.
+- For rename, prefer LSP rename or a structured search/replace with tests.
+- For generated files, avoid direct edits unless the generator is unavailable
+  and the handoff marks the file as generated.
+- For prompt/markdown-only changes, LSP may not apply; use docs tests instead.
+## Evidence
+Record the command or tool used and the diagnostic result. Good examples:
+- `npm test` passed all Node tests.
+- `node scripts/validate-plugin.mjs` reported `VALIDATE_PLUGIN_PASS`.
+- `lsp.diagnostics` returned no errors for `bin/litclaude-ai.js`.
+- `typescript-language-server` missing; controlled skip documented by doctor.
+## Failure Handling
+Diagnostics are signals, not noise. Do not suppress them. If an LSP failure is
+unrelated to the change, label it as pre-existing with the exact output and
+continue only when the task can be safely completed without hiding it.

package/plugins/litclaude/skills/programming/SKILL.md ADDED Viewed

@@ -0,0 +1,106 @@
+---
+name: programming
+description: "Strict Claude Code implementation discipline adapted for LitClaude: TDD, typed boundaries, parse-once input, small files, real integration checks, and no silent lint/type escapes."
+---
+# Programming
+Use this skill for implementation work in JavaScript, TypeScript, Python, Rust,
+Go, shell glue, plugin manifests, and test harnesses. It is the Claude-native
+port of LitClaude programming discipline: strong types, small units, explicit
+boundaries, and tests that prove behavior before code is accepted.
+## Phase 0: Language Gate
+Before editing, identify the language and the project-local conventions. Prefer
+the repository's existing tools over introducing a new stack. If the task spans
+multiple languages, write down the boundary between them before coding.
+- JavaScript/TypeScript: keep ESM when the package is ESM, use `node:test` or
+  the repo's configured runner, avoid `any`, `as` narrowing, and hidden globals.
+- Python: prefer typed dataclasses/Pydantic boundaries, `pathlib`, explicit
+  exception types, and deterministic scripts.
+- Rust: prefer typed errors, exhaustive matches, `?`, and no `unwrap` outside
+  tests or tiny `main` wrappers.
+- Go: use `context.Context` at I/O boundaries, typed errors, no unchecked
+  `err`, and focused packages.
+- Shell: quote variables, use `set -euo pipefail` in scripts, and make cleanup
+  explicit.
+## Parse, don't validate
+Parse untrusted input exactly once at the boundary. After parsing, pass typed or
+structured values inward. Repeated ad-hoc validation inside the core is a smell:
+it means the boundary did not create a useful value.
+Examples:
+- CLI args become a command object or option record before command dispatch.
+- JSON from hooks is parsed once; malformed input exits with a controlled error.
+- Plugin manifests are loaded as structured JSON, then checked by named fields.
+- Environment variables are read at the edge and converted into explicit paths,
+  booleans, or enums.
+## TDD DISCIPLINE
+For production behavior, use red -> green -> refactor.
+1. Red: add a failing test that names the missing behavior and fails for the
+   right reason.
+2. Green: write the smallest change that passes that test.
+3. Refactor: simplify with the tests still green.
+Tests should follow Given / When / Then:
+- Given: exact fixture and preconditions.
+- When: one action.
+- Then: one observable outcome or one assertion class.
+For prompt and skill text, assert on behavior-bearing rules and structure, not
+on fragile full-string snapshots. A good prompt test checks that a required
+decision, tool boundary, or safety rule is present.
+## Test Shape
+Use the fastest truthful surface:
+- Unit tests for pure parsing and decision logic.
+- Integration tests for file-system state, plugin manifests, and CLI behavior.
+- Manual QA for the user-facing surface: tmux transcript, HTTP status/body,
+  browser screenshot, or desktop automation log.
+Tests alone are not completion proof when the change affects user workflows.
+Pair green tests with one real scenario artifact.
+## Type and Error Discipline
+- Make illegal states difficult to represent.
+- Use semantic names for IDs, paths, versions, plugin keys, and command names.
+- Return or throw typed errors at boundaries; do not rely on vague strings.
+- Never suppress lint/type errors with comments unless a local policy explicitly
+  allows it and the reason is documented next to the suppression.
+- Avoid defensive code that checks impossibilities. If a case is possible, name
+  it and test it. If it is impossible, encode that in parsing or types.
+## 250 LOC Ceiling
+Treat 250 non-comment lines in a source file as a design warning. If a touched
+file is already too large and the change adds logic, split the cohesive unit you
+are modifying before adding more. Keep orchestrators thin, parsing isolated,
+and command implementations focused.
+## Resource and State Hygiene
+Every spawned process, tmux session, temp directory, port, browser context, or
+container needs a cleanup receipt. Register cleanup when the resource is
+created, not after the test passes. A leftover process means the work is not
+done.
+## Claude Code Adaptation
+Prefer Claude Code's available tools and project conventions. If a Private reference
+instruction names a Codex-only tool, translate the principle instead of copying
+the tool name. For example, goal tools become conditional `get_goal` /
+`create_goal` / `update_goal` guidance only when exposed; workflow isolation
+uses `Workflow`, `EnterWorktree`, or an explicit `claude --worktree ...`
+operator path when that is the available Claude surface.