npm - @codename_inc/spectre - Versions diffs - 3.7.0 → 5.0.0 - Mend

@codename_inc/spectre 3.7.0 → 5.0.0

Files changed (106)

package/plugins/spectre-codex/skills/ship/SKILL.md ADDED

@@ -0,0 +1,181 @@
+---
+name: "ship"
+description: "👻 | Autonomous end-to-end: brain dump -> scope -> TDD -> commit -> rebase -> PR"
+user-invocable: true
+---
+# ship
+## Input Handling
+Treat the current command arguments as this workflow's input. When invoked from a slash command, use the forwarded `$ARGUMENTS` value.
+# ship: Autonomous End-to-End Delivery
+Take a brain dump and autonomously produce a reviewable PR. Zero confirmation gates — scope, implement with TDD, sweep, rebase, and open a PR.
+**Execution Style**: Fully autonomous. No user approval gates. Parse intent, build it, ship the PR.
+## ARGUMENTS
+<ARGUMENTS> $ARGUMENTS </ARGUMENTS>
+## Step (1/8) - Parse Context
+- **Action** — ParseBrainDump: Extract from ARGUMENTS:
+  - `INTENT_TYPE`: `feat` or `fix` (infer from context — new behavior = feat, broken behavior = fix)
+  - `TARGET_BRANCH`: Extract if specified (e.g., "rebase onto develop"), default `origin/main`
+  - `SCOPE_SUMMARY`: 1-2 sentence distillation of what to build/fix
+  - `RELEVANT_FILES`: Any files, components, or areas mentioned
+  - `CONTEXT`: Remaining context, constraints, preferences
+- **Action** — ValidateInput:
+  - **If** ARGUMENTS empty → ask user for brain dump, then proceed autonomously
+  - **Else** → proceed
+## Step (2/8) - Ensure Worktree
+- **Action** — DetectBranch: `git rev-parse --abbrev-ref HEAD`
+  - **If** already in a worktree or on a non-main branch (not `main`, not `master`) → use current context, proceed
+  - **If** on `main` or `master` → use `EnterWorktree` to create an isolated worktree
+- **Action** — SetBranchName: Capture current branch name as `BRANCH_NAME` for artifact paths
+## Step (3/8) - Quick Scope
+- **Action** — DispatchResearch: Spawn parallel lightweight agents:
+  - `@finder` — Locate files related to `RELEVANT_FILES` and `SCOPE_SUMMARY`
+  - `@analyst` — Understand the relevant code area, key interfaces, existing patterns
+- **Action** — WriteScopeDoc: Create `docs/tasks/{BRANCH_NAME}/concepts/scope.md`:
+  ```markdown
+  # Scope: {SCOPE_SUMMARY}
+  ## Objective
+  {1-2 sentences from brain dump}
+  ## Type
+  {feat or fix}
+  ## In Scope
+  - {bullet list of what will be done}
+  ## Out of Scope
+  - {what this explicitly won't touch}
+  ## Target Branch
+  {TARGET_BRANCH}
+  ## Key Files
+  {from research — relevant files and their roles}
+  ```
+  Keep it ~20 lines. This is a lightweight scope, not full `scope`.
+## Step (4/8) - Create Tasks
+- **Action** — CreateTasks: Use `TaskCreate` to create 3-8 tasks proportional to scope complexity.
+  - Each task gets: clear `subject` (imperative), `description` (what to do + acceptance criteria), `activeForm` (present continuous)
+  - Order tasks by dependency — foundational work first
+  - Tasks are ephemeral and operational — no file artifact needed
+## Step (5/8) - Execute with TDD
+- **Action** — ExecuteLoop: For each task sequentially:
+  1. `TaskUpdate` → `in_progress`
+  2. Load `Skill(spectre-tdd)` for TDD methodology
+  3. Execute: RED (write failing test) → GREEN (minimal implementation) → REFACTOR (clean up)
+  4. Commit with conventional format: `{INTENT_TYPE}({scope}): {description}`
+  5. `TaskUpdate` → `completed`
+  **Rules**:
+  - One commit per task minimum
+  - Conventional commit format always
+  - TDD methodology for implementation tasks (skip for config/doc-only tasks)
+  - If a task reveals new work, create additional tasks rather than scope-creeping the current one
+## Step (6/8) - Sweep
+Inline sweep — same checklist as `sweep`, no subagents:
+### 6.1 Diff Sanity Check
+- Review full diff for unintentional changes (whitespace-only edits, merge artifacts)
+- Verify no accidentally staged files outside the intended scope
+- Confirm no secrets, API keys, credentials, or sensitive data in diff
+### 6.2 Logging Audit
+- Remove temporary/debug logging (console.log, print, debug flags)
+- Preserve intentional logs: errors, critical warnings, key state transitions
+- Verify log levels are appropriate for production context
+### 6.3 Code Hygiene
+- Remove commented-out code (it's in git history if needed)
+- Resolve or document any TODO/FIXME/HACK introduced in this session
+- Remove hardcoded test values that should be config/env
+### 6.4 Opportunistic Dead Code Cleanup
+- Orphaned imports with no usage in the file
+- Unused variables or functions declared but never referenced
+- Debug artifacts (debugger statements, leftover TODO/FIXME from this work)
+### 6.5 Lint (Strict)
+- Run the project linter and **fix all violations** — no skipping, no eslint-disable
+- Address structural lint issues by refactoring, not suppressing
+### 6.6 Test
+- Run affected tests + full test suite
+- Fix any failures caused by the changes
+- Do NOT write new tests here — that was done in Step 5
+### 6.7 Commit Sweep Fixes
+- If sweep produced changes, commit: `chore({scope}): sweep cleanup`
+## Step (7/8) - Rebase
+- **Action** — FetchLatest: `git fetch origin`
+- **Action** — CreateSafetyRef: `git branch backup/ship-$(date +%Y%m%d-%H%M%S)`
+- **Action** — Rebase: `git rebase {TARGET_BRANCH}`
+  - **If** conflicts → resolve automatically, favoring target branch conventions
+  - Track resolution decisions for PR summary
+- **Action** — VerifyPostRebase:
+  - Run linter — fix violations
+  - Run full test suite — fix failures
+  - Confirm commit count and no unexpected changes
+## Step (8/8) - Create PR
+- **Action** — PushBranch: `git push -u origin {BRANCH_NAME}`
+- **Action** — CreatePR: `gh pr create` with:
+  **Title**: `{INTENT_TYPE}({scope}): {SCOPE_SUMMARY}` (under 70 chars)
+  **Body**:
+  ```markdown
+  ## Summary
+  {From scope doc — objective and what was done}
+  ## Changes
+  {Bulleted list derived from completed tasks}
+  ## Test Plan
+  {Bulleted checklist — what was tested, what to verify manually}
+  Shipped autonomously via `ship`
+  ```
+- **Action** — OutputPRUrl: Display the PR URL as the final deliverable
+## Next Steps
+Use `Skill(spectre-guide)` skill to render the Next Steps footer.
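
Reviewer note on the `ship` skill: Step 8 asks for a PR title of the form `{INTENT_TYPE}({scope}): {SCOPE_SUMMARY}` kept under 70 characters, but does not say what to do when the summary is too long. The sketch below is one possible reading, not code from the package; `buildPrTitle`, `IntentType`, and the truncate-with-ellipsis policy are assumptions made for illustration.

```typescript
// Hypothetical helper, not part of @codename_inc/spectre.
type IntentType = "feat" | "fix";

// The skill asks for titles "under 70 chars", so 69 is the longest allowed.
const MAX_TITLE_LENGTH = 69;

function buildPrTitle(intent: IntentType, scope: string, summary: string): string {
  const prefix = `${intent}(${scope}): `;
  const cleanSummary = summary.trim();
  const title = prefix + cleanSummary;
  if (title.length <= MAX_TITLE_LENGTH) {
    return title;
  }
  // Assumed policy: cut the summary and mark the cut with "...".
  // If the prefix alone is very long, the result can still exceed the limit.
  const room = Math.max(MAX_TITLE_LENGTH - prefix.length - 3, 0);
  return prefix + cleanSummary.slice(0, room).trimEnd() + "...";
}

const shortTitle = buildPrTitle("fix", "auth", "handle expired tokens");
const longTitle = buildPrTitle("feat", "billing", "x".repeat(200));
```

Truncating is only one option; an agent could instead re-summarize the scope so the title stays readable.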

package/plugins/spectre-codex/skills/sweep/SKILL.md ADDED

@@ -0,0 +1,91 @@
+---
+name: "sweep"
+description: "👻 | Light pass cleanup - clean, lint, test, commit"
+user-invocable: true
+---
+# sweep
+## Input Handling
+Treat the current command arguments as this workflow's input. When invoked from a slash command, use the forwarded `$ARGUMENTS` value.
+## Pre-Commit Sweep
+You are preparing uncommitted or recently committed changes for check-in. Perform a systematic cleanup, then commit with descriptive conventional commits.
+**Execution Style**: Fast, formulaic checklist. No subagents, no user approval gates. Execute each step and move on.
+### 1. Diff Sanity Check
+- Review full diff for unintentional changes (whitespace-only edits, merge artifacts)
+- Verify no accidentally staged files outside the intended scope
+- Confirm no secrets, API keys, credentials, or sensitive data in diff
+### 2. Logging Audit
+- Remove temporary/debug logging (console.log, print, debug flags, etc.)
+- Preserve intentional logs: errors, critical warnings, key state transitions
+- Verify log levels are appropriate for production context
+### 3. Code Hygiene
+- Remove commented-out code (it's in git history if needed)
+- Resolve or document any TODO/FIXME/HACK introduced in this session
+- Remove hardcoded test values that should be config/env
+### 4. Opportunistic Dead Code Cleanup
+Quick scan of changed files only — remove anything obviously dead, no deep investigation:
+- Orphaned imports with no usage in the file
+- Unused variables or functions declared but never referenced
+- Commented-out code blocks
+- Debug artifacts (debugger statements, leftover TODO/FIXME from this work)
+Do not hunt for dead code beyond the changed files. This is opportunistic, not forensic.
+### 5. Lint (Strict)
+- Run the project linter and **fix all violations** — no skipping, no eslint-disable
+- Address structural lint issues (file size, complexity thresholds) by refactoring, not suppressing
+- Verify .gitignore coverage (no temp files, build artifacts, IDE configs)
+### 6. Test
+- Identify test files related to the changed files (co-located tests, imports, shared modules)
+- Run those tests and the broader test suite
+- Fix any failures caused by the changes
+- Do NOT write new tests in this step — this is a sweep, not a test authoring pass
+### 7. Commit
+Group changes into logical conventional commits. Commits are project history and critical context for LLMs and future developers — invest in making them descriptive.
+**Format**: `type(scope): description`
+**Types**: feat, fix, refactor, test, chore, docs, style, perf
+**Grouping** — separate commits by concern:
+- Feature/behavior additions → `feat`
+- Refactors/cleanup with no behavior change → `refactor`
+- Bug fixes → `fix`
+- Test additions/updates → `test`
+- Config/dependency changes → `chore`
+- Documentation → `docs`
+**Commit message quality**:
+- Subject line answers: what changed and why (not "fix stuff" or "update files")
+- Include scope to locate the change: `feat(auth): add token refresh on 401 response`
+- If the commit touches multiple concerns, it's too big — split it
+- Body (optional) adds context: motivation, trade-offs, what was considered and rejected
+**Anti-patterns**:
+- `fix: updates` — says nothing
+- `refactor: clean up` — clean up what? why?
+- One giant commit for unrelated changes
+### 8. Render Footer
+Use `Skill(spectre-guide)` skill for Next Steps footer.
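
Reviewer note on the `sweep` skill: the commit rules in step 7 (the `type(scope): description` format, the listed types, and the anti-patterns) are mechanical enough to check automatically. The following is a minimal sketch under stated assumptions, not something shipped in the package; `checkCommitSubject` and the `VAGUE_DESCRIPTIONS` list are hypothetical, with the list seeded from the examples the skill itself calls out.

```typescript
// Hypothetical checker, not part of @codename_inc/spectre.
// Types copied from the skill's "Types" line.
const COMMIT_TYPES: string[] = [
  "feat", "fix", "refactor", "test", "chore", "docs", "style", "perf",
];

// Assumed list of vague descriptions, taken from the skill's own examples.
const VAGUE_DESCRIPTIONS: string[] = ["updates", "clean up", "fix stuff", "update files"];

function checkCommitSubject(subject: string): string[] {
  const match = /^([a-z]+)(?:\(([^)]+)\))?: (.+)$/.exec(subject);
  if (!match) {
    return ["does not match type(scope): description"];
  }
  const type = match[1] ?? "";
  const scope = match[2]; // undefined when no "(scope)" is present
  const description = (match[3] ?? "").trim().toLowerCase();

  const problems: string[] = [];
  if (COMMIT_TYPES.indexOf(type) === -1) {
    problems.push(`unknown type "${type}"`);
  }
  if (!scope) {
    problems.push("missing scope");
  }
  if (VAGUE_DESCRIPTIONS.indexOf(description) !== -1) {
    problems.push("description is too vague");
  }
  return problems;
}

const good = checkCommitSubject("feat(auth): add token refresh on 401 response");
const vague = checkCommitSubject("fix: updates");
const malformed = checkCommitSubject("cleaned some things");
```

A check like this only catches the literal anti-patterns; whether a subject really explains what changed and why still needs judgment.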

package/plugins/{spectre/skills/spectre-tdd → spectre-codex/skills/tdd}/SKILL.md RENAMED

@@ -1,5 +1,5 @@
 ---
-name: spectre-tdd
+name: "tdd"
 description: "Load this skill when executing TDD (Test-Driven Development) methodology. Use when implementing features via strict RED-GREEN-REFACTOR cycles, or when a prompt instructs execution via TDD."
 ---

package/plugins/spectre-codex/skills/test/SKILL.md ADDED

@@ -0,0 +1,389 @@
+---
+name: "test"
+description: "👻 | Risk-aware test coverage & commit - primary agent"
+user-invocable: true
+---
+# test
+## Input Handling
+Treat the current command arguments as this workflow's input. When invoked from a slash command, use the forwarded `$ARGUMENTS` value.
+# test: test coverage with risk-aware focus
+## Description
+- **What** — Triage changes into risk tiers, dispatch @tester subagents to fix lint and write **risk-appropriate tests** (not brute-force 100% line coverage), and commit after each passing batch.
+- **Outcome** — All lint issues fixed, surgical test coverage that maximizes confidence while minimizing token cost, incremental commits per batch, artifacts recorded in OUT_DIR.
+- **Philosophy** — Test behaviors at boundaries, not implementation details. Prioritize code that can hurt users when it breaks. Skip tests that only measure "was this line executed?" without verifying correctness.
+## ARGUMENTS Input
+Optional scope hint or specific files to focus on.
+<ARGUMENTS> $ARGUMENTS </ARGUMENTS>
+## Instructions
+- Primary agent plans and verifies; @tester subagents write test code
+- Maximize parallelism: dispatch multiple @tester agents simultaneously, not sequentially
+- Primary agent coordinates; subagents execute test writing in parallel batches
+- No OUT_DIR artifacts — this is a lightweight flow
+- Risk assessment is inline reasoning, not a classification phase
+- Test behaviors at boundaries, not implementation details
+- Skip tests for P3 files (types, configs, simple wrappers)
+- When committing, using `--no-verify` or adding `eslint-disable`, or committing code that contains `eslint-disable`, is expressly forbidden without the user's explicit permission.
+### Why Risk-Weighted > 100% Line Coverage
+**100% line coverage is a vanity metric that:**
+- Treats all code equally (a payment handler vs a string formatter)
+- Tests implementation details (brittle, breaks on refactor)
+- Creates maintenance burden (tests that slow you down)
+- Gives false confidence (100% coverage ≠ 100% correctness)
+- Burns tokens on code that can't break in production
+**Risk-weighted coverage instead:**
+- Focuses testing effort where bugs cause user pain
+- Tests behaviors and contracts, not internal wiring
+- Creates tests that survive refactoring
+- Catches actual bugs via mutation-resistant assertions
+- Dramatically reduces token cost while increasing safety
+### Risk Tier Definitions
+#### P0 — Critical (Must Test Thoroughly)
+**Identification patterns:**
+- Path contains: `auth`, `payment`, `security`, `crypto`, `session`, `token`
+- File has `@critical` JSDoc/comment annotation
+- Handles: user data mutations, financial transactions, PII, permissions
+- API handlers for external consumers
+- Database migration files
+**Coverage requirements:**
+- 100% **behavioral** coverage (every user-facing outcome has a test)
+- All error paths tested with specific error assertions
+- Edge cases for security-sensitive inputs (null, empty, malformed, overflow)
+- Contract tests for all public APIs (schema validation)
+- Mutation-resistant assertions (would a bug actually fail this test?)
+**Test quality bar:**
+```typescript
+// ✅ GOOD P0 test - tests behavior and catches real bugs
+it('rejects payment when card is expired', async () => {
+  const result = await processPayment({ card: expiredCard, amount: 100 });
+  expect(result.status).toBe('DECLINED');
+  expect(result.reason).toBe('CARD_EXPIRED');
+  expect(chargeService.charge).not.toHaveBeenCalled(); // Side effect prevented
+});
+// ❌ BAD test - tests implementation, not behavior
+it('calls validateCard', async () => {
+  await processPayment({ card, amount: 100 });
+  expect(validateCard).toHaveBeenCalledWith(card);
+});
+```
+#### P1 — Core (Test Key Behaviors)
+**Identification patterns:**
+- Main feature components (not utility wrappers)
+- API route handlers (internal)
+- State management (stores, reducers, contexts)
+- Core business logic
+- Data fetching/caching layers
+**Coverage requirements:**
+- Happy path coverage for all public functions
+- Critical error paths (ones users would see)
+- Contract tests at team boundaries (exported APIs other modules consume)
+- No need to test internal helper functions
+- No need to test every code branch, just primary behaviors
+**Test quality bar:**
+```typescript
+// ✅ GOOD P1 test - covers the behavior users care about
+it('fetches and caches user profile', async () => {
+  const profile = await getUserProfile(userId);
+  expect(profile.name).toBe('Joe');
+  // Second call uses cache
+  await getUserProfile(userId);
+  expect(api.get).toHaveBeenCalledTimes(1);
+});
+// ❌ SKIP - internal implementation detail
+it('calls normalizeUserData internally', () => { /* ... */ });
+```
+#### P2 — Supporting (Test Public Surface Only)
+**Identification patterns:**
+- Utility functions and helpers
+- Internal services not exposed to other teams
+- Formatters, validators, transformers
+- Hooks that compose other hooks
+- Adapters and wrappers
+**Coverage requirements:**
+- Public exported functions: happy path only
+- Skip internal/private functions entirely
+- Skip trivial functions (single-line returns, simple compositions)
+- Only test if the function has logic worth verifying
+**Test quality bar:**
+```typescript
+// ✅ GOOD P2 test - public util with actual logic
+it('formats currency correctly', () => {
+  expect(formatCurrency(1234.5, 'USD')).toBe('$1,234.50');
+  expect(formatCurrency(1234.5, 'EUR')).toBe('€1,234.50');
+});
+// ❌ SKIP - trivial wrapper with no logic
+// export const getFullName = (u) => `${u.first} ${u.last}`;
+```
+#### P3 — Low Risk (Skip Testing)
+**Identification patterns:**
+- TypeScript type definitions (`.d.ts`)
+- JSON/YAML configuration files
+- CSS/SCSS/Tailwind styles
+- Markdown documentation
+- Constants and enums (no logic)
+- Re-export barrels (`index.ts` that just re-exports)
+- Simple component wrappers (just pass props through)
+- Build scripts and tooling config
+**Coverage requirements:**
+- **NO TESTS REQUIRED** — Types are the test
+- Type checking + linting is sufficient
+- These files cannot break at runtime in ways tests would catch
+### Test Quality Requirements (All Tiers)
+#### Each test MUST:
+- **Test ONE behavior** — Single assertion focus, clear failure message
+- **Use descriptive names** — `when_[condition]_then_[outcome]` or `[action]_should_[result]`
+- **Assert outcomes, not calls** — Verify what happened, not what was invoked
+- **Be refactor-resilient** — Test should pass if behavior unchanged, even if internals change
+- **Catch real bugs** — Ask: "If I introduced a bug, would this test fail?"
+#### Each test MUST NOT:
+- **Mock implementation details** — Don't mock internal functions
+- **Assert on call counts** — Unless testing side-effect prevention
+- **Duplicate type coverage** — Don't test that TS types are correct
+- **Test framework behavior** — Don't test that React renders or Express routes
+#### Mutation Testing Mindset
+For every test, ask: "If I changed the implementation to return a wrong value, would this test catch it?"
+```typescript
+// ✅ Mutation-resistant — changing the discount calculation would fail this
+it('applies 20% discount for premium users', () => {
+  expect(calculateTotal({ items: [100], userTier: 'premium' })).toBe(80);
+});
+// ❌ NOT mutation-resistant — always passes regardless of implementation
+it('calls calculateDiscount', () => {
+  calculateTotal({ items: [100], userTier: 'premium' });
+  expect(calculateDiscount).toHaveBeenCalled();
+});
+```
+### Contract Tests at Team Boundaries
+When your code is consumed by other teams/modules, add contract tests:
+```typescript
+// API Contract Test
+describe('UserAPI contract', () => {
+  it('GET /users/:id returns UserResponse schema', async () => {
+    const response = await request(app).get('/users/123');
+    expect(response.body).toMatchSchema(UserResponseSchema);
+  });
+  it('returns standard APIError shape on 404', async () => {
+    const response = await request(app).get('/users/nonexistent');
+    expect(response.status).toBe(404);
+    expect(response.body).toMatchSchema(APIErrorSchema);
+  });
+});
+// Event Contract Test
+describe('UserCreated event contract', () => {
+  it('emits event matching UserCreatedEvent schema', async () => {
+    await createUser({ name: 'Test' });
+    expect(eventBus.lastEvent).toMatchSchema(UserCreatedEventSchema);
+  });
+});
+```
+## Steps
+### Step (1/4) - Discover Full Working Set and Plan
+- **Action** — DiscoverFullWorkingSet:
+  - Validate commit_id if provided
+  - Gather: committed changes + staged + unstaged + untracked
+  - **Full Working Set** = UNION of all sources
+- **Action** — RecordWorkingSet: Write `OUT_DIR/working_set.json`
+- **Action** — BaselineLintFull: Run lint on ALL files in Full Working Set
+- **Action** — MapDependencies: Build import/dep snapshot
+### Step (2/4) - Risk Assessment & Test Plan
+- **Action** — InlineRiskCheck: Quick mental triage of changed files
+  **P0 Critical** (thorough coverage required):
+  - Paths containing: `auth`, `payment`, `security`, `crypto`, `session`, `token`
+  - Handles: user data mutations, financial transactions, PII, permissions
+  - Has `@critical` annotation
+  **P1 Core** (key behaviors):
+  - API handlers, feature components, state management, services
+  **P2 Supporting** (public surface only):
+  - Utils, helpers, hooks, formatters
+  **P3 Skip** (no tests):
+  - Type definitions (`.d.ts`), configs, styles, index barrels, simple wrappers
+- **Action** — CreateTestPlan: Write 3-7 bullet test plan
+  - Format: `- [P{tier}] {file}: {behavior to test}`
+  - P0 files get multiple bullets (all behaviors + error paths)
+  - P1 files get 1-2 bullets (happy path + critical errors)
+  - P2 files get 1 bullet (public function smoke test)
+  - P3 files listed as "SKIP — {reason}"
+- **Action** - Update `OUT_DIR/working_set.json` with risk tier categorization.
+### Step (3/4) - Write Tests & Verify
+- **Action** — DispatchTestWriter: Spawn MULTIPLE @tester subagents IN PARALLEL
+  - **Parallelization Strategy**:
+    - Partition test plan items into independent batches (by file or logical grouping)
+    - Dispatch one @tester per batch — aim for 3-5 parallel agents for medium scope, up to 8 for large scope
+    - Each agent receives: its batch of test plan items, file paths, risk tier context
+    - **Critical**: Use a single message with multiple Task tool calls to launch all agents simultaneously
+  - **Batching Heuristics**:
+    - P0 files: 1 agent per file (thorough coverage requires focus)
+    - P1 files: Group 2-3 related files per agent
+    - P2 files: Group 3-5 files per agent (lighter coverage)
+  - Instruct each: "Write behavioral tests, assert outcomes not calls, mutation-resistant"
+  - Wait for all agents to complete before proceeding to lint/test verification
+- **Action** — RunLint: Execute linter; fix violations
+  - **If** lint fails → autofix first, then manual fix
+  - **Else** → continue
+- **Action** — RunTests: Execute full test suite
+  - **If** tests fail → analyze failure, fix via @tester or direct edit
+  - **Else** → continue
+- **Action** — VerifyQuality: Spot-check 1-2 tests
+  - Confirm: tests assert behaviors, would catch real bugs, survive refactoring
+  - **If** test quality poor → rework via @tester
+  - **Else** → continue
+### Step (4/4) - Commit
+- **Action** — CommitPlanningArtifacts: Gather and commit planning/working docs FIRST
+  - Check for uncommitted files in `OUT_DIR/`:
+    - `working_set.json` (scope and risk tier categorization)
+    - Any other `.md` or `.json` artifacts created during this flow
+  - Check for uncommitted docs in `docs/tasks/{branch_name}/` or related planning directories
+  - **If** uncommitted planning artifacts exist:
+    - Stage all: `git add docs/tasks/{branch_name}/ OUT_DIR/`
+    - Commit: `docs(test): add test planning artifacts for {branch_name}`
+- **Action** — GroupChanges: Organize code changes into logical commits
+  - Group by: feat/fix/refactor/test/chore
+  - Tests can be bundled with their feature or separate (your judgment)
+- **Action** — CommitAll: Create conventional commits for code changes
+  - Format: `type(scope): description`
+  - Each commit answers: What changed and why?
+- **Action** — RenderFooter: Render Next Steps footer using `Skill(spectre-guide)` skill (contains format template and spectre command options)
+## Next Steps
+See `Skill(spectre-guide)` skill for footer format and command options.
+## Success Criteria
+**Step 1 - Discover Full Working Set and Plan**:
+- [ ] Scope identified (files changed) and documented
+- [ ] Behaviors changed listed (not just file names)
+**Step 2 - Risk Assessment & Test Plan**:
+- [ ] Each changed file assigned P0-P3 tier
+- [ ] Test plan created with 3-7 bullets
+- [ ] P3 files explicitly marked SKIP
+**Step 3 - Write Tests & Verify**:
+- [ ] Multiple @tester agents dispatched in parallel (not sequential)
+- [ ] Test plan partitioned into independent batches
+- [ ] All agents launched in single message (parallel tool calls)
+- [ ] P0 files have thorough behavioral coverage
+- [ ] P1 files have key path coverage
+- [ ] P2 files have public surface coverage
+- [ ] P3 files have NO tests (confirmed skipped)
+- [ ] Lint passes
+- [ ] All tests pass
+- [ ] Test quality spot-checked
+**Step 4 - Commit**:
+- [ ] Changes grouped logically
+- [ ] Conventional commit format used
+- [ ] Single Next Steps footer rendered
+- [ ] Next steps guide read and options sourced
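
Reviewer note on the `test` skill: the tier definitions and the batching heuristics in Step 3 (one agent per P0 file, 2-3 P1 files per agent, 3-5 P2 files per agent, P3 skipped) can be approximated with a path-based heuristic. The sketch below is an illustration under loud assumptions, not the package's implementation: `classify`, `planBatches`, and the keyword and pattern lists are hypothetical. It treats every `index.ts` as a re-export barrel, ignores `@critical` annotations and file contents, checks P0 keywords first so that ambiguous files are over-tested rather than skipped, and does not group "related" files or cap the number of agents.

```typescript
// Hypothetical sketch, not part of @codename_inc/spectre.
type Tier = "P0" | "P1" | "P2" | "P3";

// Keywords copied from the skill's P0 "Path contains" list.
const P0_KEYWORDS = ["auth", "payment", "security", "crypto", "session", "token"];
// Assumed approximations of the P3 and P2 descriptions.
const P3_PATTERNS = [/\.d\.ts$/, /\.(json|ya?ml|css|scss|md)$/, /(^|\/)index\.ts$/];
const P2_KEYWORDS = ["util", "helper", "hook", "format"];

function classify(path: string): Tier {
  const lower = path.toLowerCase();
  const hasAny = (words: string[]) => words.some((w) => lower.indexOf(w) !== -1);
  if (hasAny(P0_KEYWORDS)) return "P0"; // conservative: checked before P3
  if (P3_PATTERNS.some((re) => re.test(lower))) return "P3";
  if (hasAny(P2_KEYWORDS)) return "P2";
  return "P1"; // assumed default for everything else
}

// Upper bounds taken from the skill's batching heuristics.
const BATCH_SIZE = { P0: 1, P1: 3, P2: 5 };

function planBatches(paths: string[]): string[][] {
  const batches: string[][] = [];
  for (const tier of ["P0", "P1", "P2"] as const) {
    const files = paths.filter((p) => classify(p) === tier);
    for (let i = 0; i < files.length; i += BATCH_SIZE[tier]) {
      batches.push(files.slice(i, i + BATCH_SIZE[tier]));
    }
  }
  return batches; // P3 files never appear: they are skipped
}

const batches = planBatches([
  "src/auth/login.ts",
  "src/payment/charge.ts",
  "src/cart.ts",
  "src/orders.ts",
  "src/profile.ts",
  "src/search.ts",
  "src/styles/app.css",
]);
```

Each inner array would correspond to one @tester dispatch; in practice the primary agent's inline reasoning about what a file actually does should override a path match.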