@kodrunhq/opencode-autopilot 1.4.0 → 1.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,240 @@
1
+ ---
2
+ name: go-patterns
3
+ description: Idiomatic Go patterns covering error handling, concurrency, interfaces, and testing conventions
4
+ stacks:
5
+ - go
6
+ requires: []
7
+ ---
8
+
9
+ # Go Patterns
10
+
11
+ Idiomatic Go patterns for writing clean, concurrent, and testable code. Covers error handling, concurrency primitives, interface design, testing, package organization, and common anti-patterns. Apply these when writing, reviewing, or refactoring Go code.
12
+
13
+ ## 1. Error Handling
14
+
15
+ **DO:** Treat errors as values. Check every error and provide context for debugging.
16
+
17
+ - Always check errors immediately after the call:
18
+ ```go
19
+ f, err := os.Open(path)
20
+ if err != nil {
21
+ return fmt.Errorf("open config %s: %w", path, err)
22
+ }
23
+ defer f.Close()
24
+ ```
25
+ - Wrap errors with `%w` for unwrapping with `errors.Is()` and `errors.As()`:
26
+ ```go
27
+ if err := db.Connect(); err != nil {
28
+ return fmt.Errorf("database connection: %w", err)
29
+ }
30
+ ```
31
+ - Use sentinel errors for expected conditions:
32
+ ```go
33
+ var ErrNotFound = errors.New("not found")
34
+ var ErrConflict = errors.New("conflict")
35
+
36
+ // Caller checks:
37
+ if errors.Is(err, ErrNotFound) { ... }
38
+ ```
39
+ - Create custom error types for rich context:
40
+ ```go
41
+ type ValidationError struct {
42
+ Field string
43
+ Message string
44
+ }
45
+ func (e *ValidationError) Error() string {
46
+ return fmt.Sprintf("validation: %s: %s", e.Field, e.Message)
47
+ }
48
+ ```
49
+ - Add context that helps debugging -- include the operation, the input, and the wrapped cause
50
+
51
+ **DON'T:**
52
+
53
+ - Ignore errors with `_` unless there is a comment explaining why: `_ = f.Close() // best-effort cleanup`
54
+ - Use `panic` for recoverable errors -- reserve `panic` for truly unrecoverable bugs (nil dereference, impossible state)
55
+ - Return generic `errors.New("something failed")` without context
56
+ - Log AND return the same error -- choose one to avoid duplicate noise
57
+ - Use `fmt.Errorf` without `%w` when the caller might need to inspect the cause
58
+
59
+ ## 2. Concurrency Patterns
60
+
61
+ **DO:** Use goroutines and channels for communication, mutexes for shared state protection.
62
+
63
+ - Always pass `context.Context` as the first parameter for cancellation and timeouts:
64
+ ```go
65
+ func fetchUser(ctx context.Context, id string) (*User, error) {
66
+ select {
67
+ case <-ctx.Done():
68
+ return nil, ctx.Err()
69
+ default:
70
+ }
71
+ // ... fetch logic
72
+ }
73
+ ```
74
+ - Use `errgroup.Group` for parallel tasks with error collection:
75
+ ```go
76
+ g, ctx := errgroup.WithContext(ctx)
77
+ for _, url := range urls {
78
+ g.Go(func() error {
79
+ return fetch(ctx, url) // loop variable is per-iteration since Go 1.22; on older Go, add url := url before g.Go
80
+ })
81
+ }
82
+ if err := g.Wait(); err != nil {
83
+ return fmt.Errorf("parallel fetch: %w", err)
84
+ }
85
+ ```
86
+ - Every goroutine must have a clear shutdown path:
87
+ ```go
88
+ func worker(ctx context.Context, jobs <-chan Job) {
89
+ for {
90
+ select {
91
+ case <-ctx.Done():
92
+ return
93
+ case job, ok := <-jobs:
94
+ if !ok { return }
95
+ process(job)
96
+ }
97
+ }
98
+ }
99
+ ```
100
+ - Use `sync.WaitGroup` for fan-out when you don't need error collection
101
+ - Use `sync.Once` for lazy initialization of shared resources
102
+ - Use `sync.Mutex` only when channels are impractical (protecting a shared map, counter, or cache)
103
+
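A minimal sketch of `sync.Once` guarding lazy initialization; the `loadConfig` helper and its contents are illustrative, not part of any real API:

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical lazily initialized shared resource guarded by sync.Once.
var (
	once   sync.Once
	config map[string]string
)

func loadConfig() map[string]string {
	once.Do(func() {
		fmt.Println("loading config") // runs exactly once, even under concurrency
		config = map[string]string{"env": "prod"}
	})
	return config
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = loadConfig() // all goroutines share the single initialization
		}()
	}
	wg.Wait()
	fmt.Println(loadConfig()["env"]) // prints "prod"
}
```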
104
+ **DON'T:**
105
+
106
+ - Start a goroutine without a way to stop it -- every goroutine needs a cancellation signal
107
+ - Use `time.Sleep()` for synchronization -- use channels or `sync.WaitGroup`
108
+ - Communicate by sharing memory -- share memory by communicating instead (Go proverb)
109
+ - Use unbuffered channels when the producer and consumer run at different speeds
110
+ - Forget `defer mu.Unlock()` after `mu.Lock()` -- always pair them on adjacent lines
111
+
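The mutex guidance above can be sketched as a small counter type. `Counter` is a hypothetical example, shown with `Lock` and `defer Unlock` on adjacent lines:

```go
package main

import (
	"fmt"
	"sync"
)

// Counter protects a shared map with a mutex -- a case where channels
// would be more awkward than simple lock/unlock pairing.
type Counter struct {
	mu     sync.Mutex
	counts map[string]int
}

func NewCounter() *Counter {
	return &Counter{counts: make(map[string]int)}
}

func (c *Counter) Inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock() // paired on the adjacent line, per the rule above
	c.counts[key]++
}

func (c *Counter) Get(key string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[key]
}

func main() {
	c := NewCounter()
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc("hits")
		}()
	}
	wg.Wait()
	fmt.Println(c.Get("hits")) // prints 100
}
```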
112
+ ## 3. Interface Design
113
+
114
+ **DO:** Keep interfaces small, define them at the consumer, and accept them as parameters.
115
+
116
+ - Keep interfaces to 1-3 methods. The smaller the interface, the more types satisfy it:
117
+ ```go
118
+ type Reader interface {
119
+ Read(p []byte) (n int, err error)
120
+ }
121
+ ```
122
+ - Define interfaces where they are used, not where they are implemented:
123
+ ```go
124
+ // In the service package (consumer), not the repository package (provider)
125
+ type UserStore interface {
126
+ FindByID(ctx context.Context, id string) (*User, error)
127
+ }
128
+ ```
129
+ - Accept interfaces, return structs:
130
+ ```go
131
+ func NewService(store UserStore) *Service { ... } // accepts interface
132
+ func NewPostgresStore(db *sql.DB) *PostgresStore { ... } // returns concrete type
133
+ ```
134
+ - Use the standard library interfaces (`io.Reader`, `io.Writer`, `fmt.Stringer`) when possible
135
+ - Implicit satisfaction -- no `implements` keyword. If the methods match, the type satisfies the interface
136
+
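Implicit satisfaction in one runnable sketch (the `Version` type is hypothetical): nothing declares a relationship to `fmt.Stringer`, yet the assignment compiles because the method set matches:

```go
package main

import "fmt"

// Version satisfies fmt.Stringer implicitly: it declares String() string,
// so fmt's printing verbs use it with no "implements" keyword anywhere.
type Version struct {
	Major, Minor, Patch int
}

func (v Version) String() string {
	return fmt.Sprintf("v%d.%d.%d", v.Major, v.Minor, v.Patch)
}

func main() {
	var s fmt.Stringer = Version{1, 5, 0} // compiles: the methods match
	fmt.Println(s)                        // prints "v1.5.0"
}
```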
137
+ **DON'T:**
138
+
139
+ - Create interfaces with 5+ methods -- break them into smaller, composable interfaces
140
+ - Define interfaces before you need them -- extract when you have 2+ implementations or need testing
141
+ - Put all interfaces in a single `interfaces.go` file -- define them next to their consumers
142
+ - Use empty interface (`interface{}` or `any`) when a more specific type is possible
143
+
144
+ ## 4. Testing Patterns
145
+
146
+ **DO:** Write table-driven tests, use `t.Helper()`, and keep tests close to the code they test.
147
+
148
+ - Table-driven tests for multiple inputs:
149
+ ```go
150
+ func TestValidate(t *testing.T) {
151
+ tests := []struct {
152
+ name string
153
+ input string
154
+ want error
155
+ }{
156
+ {"valid email", "a@b.com", nil},
157
+ {"missing @", "ab.com", ErrInvalidEmail},
158
+ {"empty", "", ErrRequired},
159
+ }
160
+ for _, tt := range tests {
161
+ t.Run(tt.name, func(t *testing.T) {
162
+ err := Validate(tt.input)
163
+ if !errors.Is(err, tt.want) {
164
+ t.Errorf("Validate(%q) = %v, want %v", tt.input, err, tt.want)
165
+ }
166
+ })
167
+ }
168
+ }
169
+ ```
170
+ - Use `t.Helper()` in test helper functions for better error locations:
171
+ ```go
172
+ func assertNoError(t *testing.T, err error) {
173
+ t.Helper()
174
+ if err != nil {
175
+ t.Fatalf("unexpected error: %v", err)
176
+ }
177
+ }
178
+ ```
179
+ - Use `t.Parallel()` for independent tests to run faster
180
+ - Use `testdata/` directory for test fixtures (Go tooling ignores this directory)
181
+ - Use `package foo_test` for black-box testing of the public API
182
+ - Use `package foo` for white-box testing of internal behavior
183
+
184
+ **DON'T:**
185
+
186
+ - Use `assert` libraries that hide what's being tested -- prefer standard `t.Errorf` with context
187
+ - Test private functions directly -- test through the public API
188
+ - Use global test state -- each test case should be independent
189
+ - Skip `t.Run` -- subtest names appear in failure output and make debugging easier
190
+
191
+ ## 5. Package Organization
192
+
193
+ **DO:** Keep packages flat, focused, and named by what they provide.
194
+
195
+ - Flat structure -- avoid deep nesting:
196
+ ```
197
+ // DO
198
+ auth/
199
+ user/
200
+ order/
201
+
202
+ // DON'T
203
+ pkg/services/auth/handlers/middleware/
204
+ ```
205
+ - Use `internal/` for implementation details that other packages should not import
206
+ - Use `cmd/` for entry points -- one `main.go` per binary:
207
+ ```
208
+ cmd/server/main.go
209
+ cmd/cli/main.go
210
+ ```
211
+ - Name packages by what they provide, not what they contain: `auth` not `authutils`, `http` not `httphandlers`
212
+ - One package per concern -- don't create `utils` or `helpers` grab-bag packages
213
+ - Keep `main.go` thin -- parse flags, wire dependencies, call `Run()`
214
+
215
+ **DON'T:**
216
+
217
+ - Create a `models` or `types` package -- put types with the code that uses them
218
+ - Use package names that stutter: `user.UserService` -- prefer `user.Service`
219
+ - Import from `internal/` across module boundaries -- it won't compile
220
+ - Put everything in one package to avoid import cycles -- fix the design instead
221
+
222
+ ## 6. Anti-Pattern Catalog
223
+
224
+ **Anti-Pattern: Naked Returns**
225
+ Using `return` without values in functions with named return parameters. Named returns are fine for documentation, but naked returns obscure what's being returned. Be explicit: `return user, nil`.
226
+
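A short illustration of the fix, using a hypothetical `parsePair` function: named results document the outputs, but every `return` still lists its values explicitly:

```go
package main

import "fmt"

// Named returns (key, value, err) document the outputs; the return
// statements stay explicit instead of a naked "return".
func parsePair(s string) (key, value string, err error) {
	for i := 0; i < len(s); i++ {
		if s[i] == '=' {
			return s[:i], s[i+1:], nil // explicit, not naked
		}
	}
	return "", "", fmt.Errorf("parse pair %q: missing '='", s)
}

func main() {
	k, v, err := parsePair("env=prod")
	fmt.Println(k, v, err) // prints "env prod <nil>"
}
```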
227
+ **Anti-Pattern: Interface Pollution**
228
+ Defining an interface before it has two implementations or a testing need. Interfaces are for decoupling -- premature interfaces add indirection without benefit. Wait until you need polymorphism, then extract.
229
+
230
+ **Anti-Pattern: Global State**
231
+ Package-level `var db *sql.DB` or `var logger *Logger`. Global state makes testing painful, creates hidden coupling, and breaks concurrent test execution. Pass dependencies via function parameters or struct fields.
232
+
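A sketch of the alternative, with a hypothetical `Logger` interface injected through `NewService` instead of read from a package-level variable:

```go
package main

import "fmt"

// Logger is a minimal hypothetical dependency; the service receives it
// explicitly rather than reaching for global state.
type Logger interface {
	Printf(format string, args ...any)
}

type stdoutLogger struct{}

func (stdoutLogger) Printf(format string, args ...any) {
	fmt.Printf(format+"\n", args...)
}

type Service struct {
	log Logger
}

func NewService(log Logger) *Service {
	return &Service{log: log}
}

func (s *Service) Greet(name string) string {
	s.log.Printf("greeting %s", name)
	return "hello, " + name
}

func main() {
	// Tests can pass a fake Logger here; no hidden coupling to a global.
	svc := NewService(stdoutLogger{})
	fmt.Println(svc.Greet("gopher"))
}
```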
233
+ **Anti-Pattern: Init Function Overuse**
234
+ Putting complex logic in `func init()` -- database connections, HTTP clients, file parsing. Init functions run at import time with no error handling. Move initialization to explicit `New()` or `Setup()` functions that return errors.
235
+
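A sketch of the recommended shape, using a hypothetical `Store`: setup lives in a `New` function that can return an error, which `func init()` cannot:

```go
package main

import (
	"fmt"
	"os"
)

// Store is a hypothetical resource with fallible setup.
type Store struct {
	path string
}

// NewStore returns an error instead of panicking at import time.
func NewStore(path string) (*Store, error) {
	if path == "" {
		return nil, fmt.Errorf("new store: path is required")
	}
	return &Store{path: path}, nil
}

func main() {
	if _, err := NewStore(""); err != nil {
		fmt.Fprintln(os.Stderr, err) // the caller decides how to handle setup failure
	}
	s, err := NewStore("/tmp/data")
	fmt.Println(s.path, err) // prints "/tmp/data <nil>"
}
```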
236
+ **Anti-Pattern: Error String Matching**
237
+ Checking `err.Error() == "not found"` instead of using `errors.Is(err, ErrNotFound)`. String matching is fragile -- error messages change, wrapping adds context. Use sentinel errors or custom types.
238
+
239
+ **Anti-Pattern: Goroutine Leak**
240
+ Starting a goroutine that blocks forever on a channel or context that's never cancelled. Every goroutine must have a clear exit condition. Use `context.WithCancel`, `context.WithTimeout`, or close the channel.
@@ -0,0 +1,258 @@
1
+ ---
2
+ name: plan-executing
3
+ description: Batch execution methodology for implementing plans with verification checkpoints after each task
4
+ stacks: []
5
+ requires:
6
+ - plan-writing
7
+ ---
8
+
9
+ # Plan Executing
10
+
11
+ A systematic methodology for working through implementation plans task by task. Each task is executed, verified, and committed before moving to the next. Deviations are logged, failures are diagnosed, and progress is tracked throughout.
12
+
13
+ ## When to Use
14
+
15
+ - **After writing a plan** (using the plan-writing skill) — the plan provides the task list, this skill provides the execution discipline
16
+ - **When implementing a multi-task feature** — any work with more than 2 tasks benefits from structured execution
17
+ - **When running through a task list systematically** — avoids skipping steps, forgetting verification, or losing track of progress
18
+ - **When multiple people are implementing the same plan** — consistent execution methodology keeps everyone aligned
19
+ - **When resuming work after a break** — the execution log tells you exactly where you left off and what state things are in
20
+
21
+ ## The Execution Process
22
+
23
+ ### Step 1: Read the Full Plan
24
+
25
+ Before implementing anything, read every task in the plan. Do not start coding after reading just the first task.
26
+
27
+ **Process:**
28
+ 1. Read the plan objective — what must be true when this work is complete?
29
+ 2. Read every task, including its files, action, verification, and done criteria
30
+ 3. Understand the dependency graph — which tasks depend on which?
31
+ 4. Identify the critical path — which tasks, if delayed, delay everything?
32
+ 5. Note any tasks that can run in parallel (same wave, no shared files)
33
+
34
+ **Why read everything first:**
35
+ - You may spot dependency errors before they block you
36
+ - You will understand how early tasks set up later tasks
37
+ - You can identify shared patterns and avoid redundant work
38
+ - You will catch scope issues before investing implementation time
39
+
40
+ ### Step 2: Execute Wave by Wave
41
+
42
+ Start with Wave 1 tasks (no dependencies). Complete each task fully before starting the next.
43
+
44
+ **Per-task execution flow:**
45
+ 1. **Read the task** — files, action, verification, done criteria
46
+ 2. **Check prerequisites** — are all dependency tasks complete? Are their outputs available?
47
+ 3. **Implement** — follow the action description. If it says "create X with Y," create X with Y
48
+ 4. **Run verification** — execute the task's verification command
49
+ 5. **Check done criteria** — does the implementation meet the stated criteria?
50
+ 6. **Commit** — one commit per task, referencing the task number
51
+
52
+ **Wave transition:**
53
+ - After all Wave N tasks are complete and verified, move to Wave N+1
54
+ - Do not start a Wave N+1 task until all its Wave N dependencies are complete
55
+ - If a Wave N task fails, fix it before moving forward
56
+
57
+ ### Step 3: Verify After Each Task
58
+
59
+ Verification is not optional. Every task has a verification step, and you must run it.
60
+
61
+ **Verification hierarchy:**
62
+ 1. **Task-specific verification** — the command listed in the task (e.g., `bun test tests/auth/token.test.ts`)
63
+ 2. **Build check** — `bunx tsc --noEmit` to catch type errors across the project
64
+ 3. **Full test suite** — `bun test` to catch regressions in other modules
65
+ 4. **Lint check** — `bun run lint` to catch formatting and style issues
66
+
67
+ **Rules:**
68
+ - Run at least the task-specific verification after every task
69
+ - Run the full test suite after every 2-3 tasks (or after every task if the project is small)
70
+ - If any verification fails, fix it before proceeding — do NOT continue with a broken base
71
+ - If a test that was passing before your change is now failing, you introduced a regression — fix it
72
+
73
+ ### Step 4: Track Progress
74
+
75
+ Keep a running log of what is done, what deviated from the plan, and what remains.
76
+
77
+ **Track:**
78
+ - Completed tasks with commit hashes
79
+ - Time spent per task (helps calibrate future estimates)
80
+ - Deviations from the plan (scope changes, unexpected issues, reordered tasks)
81
+ - New tasks discovered during implementation (add to the plan, do not just do them ad hoc)
82
+ - Blockers encountered and how they were resolved
83
+
84
+ **Why track:**
85
+ - If you are interrupted, you (or someone else) can resume from the log
86
+ - Deviations documented during implementation are easier to review than deviations discovered later
87
+ - Time tracking reveals whether your task sizing is accurate (improving future plans)
88
+ - New tasks discovered during implementation are visible for review (preventing scope creep)
89
+
90
+ ### Step 5: Handle Failures
91
+
92
+ When something goes wrong (and it will), follow a structured response.
93
+
94
+ **Task verification fails:**
95
+ 1. Read the error message carefully — what specifically failed?
96
+ 2. Is this a problem with the implementation or the test?
97
+ 3. Use the systematic-debugging skill for non-obvious failures
98
+ 4. Fix the issue, re-run verification, confirm it passes
99
+ 5. Log the failure and fix as a deviation
100
+
101
+ **Unexpected dependency discovered:**
102
+ 1. The task requires something that is not in the plan
103
+ 2. Check: is this a missing task, or a missing prerequisite from an existing task?
104
+ 3. Add the missing work to the plan (new task or expanded existing task)
105
+ 4. Re-evaluate wave assignments — does this change the dependency graph?
106
+ 5. Log as a deviation
107
+
108
+ **Scope creep detected:**
109
+ 1. While implementing Task N, you discover that "it would be nice to also do X"
110
+ 2. Ask: is X required for the plan's goal, or just a nice-to-have?
111
+ 3. If required: add it to the plan as a new task with proper sizing and dependencies
112
+ 4. If nice-to-have: log it as a follow-up item; do NOT implement it now
113
+ 5. Every unplanned addition increases risk — be disciplined
114
+
115
+ **Blocked by external factor:**
116
+ 1. Cannot proceed due to missing API key, unavailable service, pending PR review, etc.
117
+ 2. Document the blocker with: what is blocked, what is needed, who can unblock it
118
+ 3. Skip to the next non-blocked task (if one exists in the current wave)
119
+ 4. Do NOT implement workarounds that will need to be undone later
120
+
121
+ ### Step 6: Final Verification
122
+
123
+ After all tasks are complete, run the plan-level verification.
124
+
125
+ **Process:**
126
+ 1. Run the full test suite: `bun test`
127
+ 2. Run the linter: `bun run lint`
128
+ 3. Run the type checker: `bunx tsc --noEmit`
129
+ 4. Verify the plan objective — is the stated goal actually achieved?
130
+ 5. Check for regressions — are all previously passing tests still passing?
131
+ 6. Review all deviations — do they make sense? Are they documented?
132
+
133
+ **This is the "ship it" gate.** If final verification passes, the work is complete. If it fails, the work is not complete — regardless of how many tasks are checked off.
134
+
135
+ ## Commit Strategy
136
+
137
+ One commit per task. No exceptions.
138
+
139
+ **Commit message format:**
140
+ ```
141
+ type(scope): concise description (task N/M)
142
+
143
+ - Key change 1
144
+ - Key change 2
145
+ ```
146
+
147
+ **Examples:**
148
+ ```
149
+ feat(auth): create login types and token utilities (task 1/5)
150
+
151
+ - Add LoginRequest and LoginResponse types
152
+ - Implement createToken and verifyToken with jose
153
+ - Add tests for token creation and expired token handling
154
+ ```
155
+
156
+ ```
157
+ fix(auth): add rate limiting to login endpoint (task 4/5)
158
+
159
+ - Limit to 5 attempts per minute per IP
160
+ - Return 429 with retry-after header
161
+ ```
162
+
163
+ **Rules:**
164
+ - Each commit should leave the codebase in a working state (tests pass, builds succeed)
165
+ - Never commit broken code — if verification fails, fix first, then commit
166
+ - Never batch multiple tasks into one commit — the commit history should match the plan
167
+ - If a task requires no code changes (e.g., documentation-only), commit the docs
168
+
169
+ ## Anti-Pattern Catalog
170
+
171
+ ### Anti-Pattern: Skipping Verification
172
+
173
+ **What goes wrong:** "I will test it all at the end." You implement 5 tasks, run the tests, and 3 fail. Now you have to debug failures across 5 tasks worth of changes with no idea which task introduced which failure.
174
+
175
+ **Instead:** Verify after every task. When a test fails, you know exactly which change caused it (the one you just made).
176
+
177
+ ### Anti-Pattern: Continuing on Failures
178
+
179
+ **What goes wrong:** Task 2 verification fails, but you start Task 3 anyway because "I will fix it later." Task 3 depends on Task 2 working correctly, so now Task 3 is also broken. The failure cascades.
180
+
181
+ **Instead:** Fix Task 2 before starting Task 3. A broken foundation makes everything built on top of it unreliable.
182
+
183
+ ### Anti-Pattern: Not Committing
184
+
185
+ **What goes wrong:** You complete 5 tasks and make one giant commit. If something goes wrong, you cannot revert a single task — you revert everything. Code review is painful because the diff is enormous.
186
+
187
+ **Instead:** Commit after each verified task. Small, focused commits are easier to review, revert, and bisect.
188
+
189
+ ### Anti-Pattern: Deviating Without Logging
190
+
191
+ **What goes wrong:** You change the plan on the fly — reorder tasks, add new ones, modify scope — without documenting why. Later, reviewers do not understand why the implementation differs from the plan.
192
+
193
+ **Instead:** Log every deviation with: what changed, why, and what impact it has. Deviations are normal — undocumented deviations are not.
194
+
195
+ ### Anti-Pattern: Gold Plating
196
+
197
+ **What goes wrong:** Task 3 says "implement the login endpoint." You implement login, registration, password reset, and email verification because "we will need them eventually."
198
+
199
+ **Instead:** Implement exactly what the task says. Nothing more. Additional features go into additional tasks in additional plans. Scope discipline is the difference between plans that finish on time and plans that never finish.
200
+
201
+ ### Anti-Pattern: Parallelizing Without Understanding
202
+
203
+ **What goes wrong:** You see two tasks in the same wave and assume they can be done simultaneously. But they modify the same file, causing merge conflicts.
204
+
205
+ **Instead:** Check for file conflicts before parallelizing. Two tasks in the same wave can run in parallel only if they do not modify the same files.
206
+
207
+ ## Integration with Our Tools
208
+
209
+ - **`oc_orchestrate`** — Autonomous plan execution. The orchestrator reads the plan, dispatches tasks to agents, verifies each task, and tracks progress automatically. Use for hands-off execution of well-defined plans.
210
+ - **`oc_quick`** — For single-task execution when you want to implement one specific task from the plan.
211
+ - **`oc_review`** — Run after each task for automated code review. Catches issues the verification command might miss (code quality, security, naming).
212
+ - **`oc_state`** — Track pipeline state during execution. Shows current phase, completed tasks, and any blockers.
213
+ - **`oc_phase`** — Check phase transitions. Useful when a plan spans the boundary between two pipeline phases.
214
+ - **`oc_session_stats`** — Monitor session health during long execution runs. Check for accumulating errors or performance degradation.
215
+
216
+ ## Failure Modes
217
+
218
+ ### All Tasks Fail
219
+
220
+ **Symptom:** Every task's verification fails. Nothing works.
221
+
222
+ **Diagnosis:** The plan itself may be fundamentally flawed — wrong assumptions, missing infrastructure, incorrect dependency ordering. Go back to the plan-writing skill and re-plan from scratch. Examine: are the dependencies right? Are the task actions actually implementable?
223
+
224
+ ### Velocity Is Too Slow
225
+
226
+ **Symptom:** Tasks that were estimated at 30 minutes are taking 2 hours each. The plan will take 3x longer than expected.
227
+
228
+ **Diagnosis:** Tasks are too large or too vaguely defined. Split them. A task taking 2 hours probably has 3-4 sub-tasks hiding inside it. Re-plan the remaining tasks with smaller granularity.
229
+
230
+ ### Tests Pass but Feature Does Not Work
231
+
232
+ **Symptom:** All unit tests pass, but the feature fails when used for real. The tests are testing the wrong things.
233
+
234
+ **Diagnosis:** Missing integration or end-to-end test. Unit tests verify individual pieces; integration tests verify that the pieces work together. Add an integration test that exercises the actual feature path.
235
+
236
+ ### Cascading Failures After One Task
237
+
238
+ **Symptom:** Task 3 passes verification, but then tasks 4, 5, and 6 all fail because Task 3 changed something they depend on.
239
+
240
+ **Diagnosis:** Task 3's verification was insufficient — it checked its own output but not its impact on downstream consumers. Add broader verification (full test suite) after tasks that modify shared interfaces.
241
+
242
+ ### Plan Becomes Obsolete Mid-Execution
243
+
244
+ **Symptom:** After implementing 3 of 6 tasks, you realize the remaining tasks no longer make sense because the first 3 revealed a better approach.
245
+
246
+ **Diagnosis:** This is normal. Plans are a best estimate based on current knowledge. When the plan becomes obsolete, stop and re-plan the remaining tasks. Do not force an outdated plan. The work already completed is not wasted — it informed the better approach.
247
+
248
+ ## Quick Reference
249
+
250
+ **Per-task cycle:**
251
+ 1. Read task
252
+ 2. Check prerequisites
253
+ 3. Implement
254
+ 4. Verify
255
+ 5. Commit
256
+ 6. Log progress
257
+
258
+ **Verification after every task. Commit after every task. Log deviations in real time.**