npm - crewkit - Versions diffs - 0.1.0 → 1.0.0 - Mend

crewkit 0.1.0 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (14)

package/README.md +142 -1
package/package.json +1 -1
package/skill/SKILL.md +38 -2
package/skill/adapters/copilot.md +239 -0
package/skill/adapters/cursor.md +215 -0
package/skill/templates/skills/dev-metrics/SKILL.md +126 -0
package/skill/templates/skills/impact/SKILL.md +157 -0
package/skill/templates/skills/retro/SKILL.md +134 -0
package/skill/templates/skills/review-pr/SKILL.md +39 -5
package/skill/templates/skills/security-scan/SKILL.md +157 -0
package/src/add.js +45 -0
package/src/cli.js +15 -6
package/src/install.js +4 -1
package/src/update.js +28 -0

package/skill/templates/skills/dev-metrics/SKILL.md ADDED

@@ -0,0 +1,126 @@
+---
+name: dev-metrics
+description: "Generate development metrics from git history — commit patterns, fix loop frequency, agent usage, file hotspots, and workflow efficiency."
+---
+Generate development metrics for: $ARGUMENTS
+If $ARGUMENTS is empty, analyze the last 90 days of git history.
+If $ARGUMENTS is a number (e.g. `30`), use that many days.
+If $ARGUMENTS is a branch or date range, scope to that.
+---
+## Steps
+### 1. Collect raw data
+Run all commands in parallel:
+```bash
+# Commit volume and cadence
+git log --oneline --since="90 days ago" --format="%ad %s" --date=short
+# File change frequency (hotspots)
+git log --since="90 days ago" --name-only --pretty=format: | sort | uniq -c | sort -rn | head -30
+# Author breakdown (if multi-contributor)
+git shortlog -sn --since="90 days ago"
+# Fix/correction commits (heuristic: commit message contains fix, hotfix, correction, revert)
+git log --oneline --since="90 days ago" --grep="fix\|hotfix\|correction\|revert\|bugfix" -i
+# Merge commits (PR merges)
+git log --oneline --since="90 days ago" --merges
+```
+Adapt the `--since` window to match $ARGUMENTS if provided.
+### 2. Compute metrics
+From raw data, derive:
+#### Commit patterns
+| Metric | Value |
+|--------|-------|
+| Total commits | count |
+| Commits per week (avg) | count |
+| Peak activity day/week | date or range |
+| Fix/correction commit ratio | fix commits / total commits (%) |
+#### Fix loop frequency
+Estimate fix loop frequency by counting sequential commits on the same file within a 24h window.
+Flag any file with 3+ consecutive fix commits as a **fix loop hotspot**.
+| File | Fix loop count | Most recent |
+|------|---------------|-------------|
+| ... | ... | ... |
+#### File hotspots
+Top 10 most-changed files:
+| File | Change count | Risk level |
+|------|-------------|-----------|
+| ... | ... | HIGH if auth/tenant/migration, MEDIUM if handler/service, LOW otherwise |
+Apply risk classification using `.ai/memory/conventions.md` if present (read it).
+#### Workflow efficiency signals
+| Signal | Value | Health |
+|--------|-------|--------|
+| Fix commit ratio | X% | GREEN <15%, YELLOW 15-30%, RED >30% |
+| Revert count | N | GREEN 0, YELLOW 1-2, RED 3+ |
+| Hotfix commits | N | flag if >2 in window |
+| Files changed per commit (avg) | N | GREEN <5, YELLOW 5-10, RED >10 |
+#### Agent usage (if detectable from commit messages)
+If commit messages contain agent names (coder, tester, reviewer, architect, explorer),
+tally usage per agent. Otherwise, skip this section.
+### 3. Identify systemic risks
+Cross-reference hotspot files with risk classification:
+- If a HIGH-risk file (auth, tenant, billing, migration) is in the top 5 hotspots → flag as **systemic risk**
+- If fix commit ratio > 30% → flag as **process health concern**
+- If the same file appears in both hotspots and fix loops → flag as **instability candidate**
+### 4. Suggest improvements
+For each systemic risk or process health concern, propose one concrete action:
+- Do not propose vague items ("write more tests")
+- Each suggestion must reference a specific file, module, or metric
+---
+## Return Format
+```markdown
+---
+**Dev Metrics Report**
+**Period:** [date range]
+**Total commits analyzed:** N
+## Commit Patterns
+[table]
+## Fix Loop Frequency
+[table or "No fix loop hotspots detected"]
+## File Hotspots (top 10)
+[table]
+## Workflow Efficiency
+[table with GREEN/YELLOW/RED indicators]
+## Agent Usage
+[table or "Not detectable from commit messages"]
+## Systemic Risks
+[list or "None identified"]
+## Suggested Improvements
+[numbered list, max 5, concrete and actionable]
+---
+```
+Keep the report structured and scannable. Do not include raw git output.
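
Note on the hunk above: the health bands in the "Workflow efficiency signals" table can be read as a small classifier. The following is a minimal sketch and not part of the package. The names `fixRatioHealth` and `revertHealth` are hypothetical, and treating exactly 15% and exactly 30% as YELLOW is an assumption drawn from the "15-30%" band in the table.

```javascript
// Hypothetical helpers illustrating the thresholds in the skill text.
// They are NOT part of crewkit.

// Fix commit ratio: GREEN <15%, YELLOW 15-30%, RED >30%.
// Comparisons use integer arithmetic so the 15% and 30% boundaries
// are not distorted by floating-point rounding.
function fixRatioHealth(fixCommits, totalCommits) {
  if (totalCommits <= 0) {
    // Assumption: with no commits in the window there is nothing to grade.
    return { ratio: null, health: 'N/A' };
  }
  const scaled = fixCommits * 100;
  let health = 'YELLOW';
  if (scaled < 15 * totalCommits) health = 'GREEN';
  else if (scaled > 30 * totalCommits) health = 'RED';
  return { ratio: scaled / totalCommits, health };
}

// Revert count: GREEN 0, YELLOW 1-2, RED 3+.
function revertHealth(revertCount) {
  if (revertCount === 0) return 'GREEN';
  return revertCount <= 2 ? 'YELLOW' : 'RED';
}
```

For example, `fixRatioHealth(4, 10)` grades a 40% fix ratio as RED, which is the same condition Step 3 flags as a process health concern.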

package/skill/templates/skills/impact/SKILL.md ADDED

@@ -0,0 +1,157 @@
+---
+name: impact
+description: "Analyze blast radius of changing a file, handler, entity, or module. Maps callers, tests, endpoints, and UI pages affected."
+---
+Analyze blast radius of: $ARGUMENTS
+$ARGUMENTS must be a file path, handler name, entity name, module name, or endpoint.
+Examples: `src/Orders/OrderHandler.cs`, `OrderEntity`, `POST /api/orders`, `Orders module`
+---
+## When to use
+Use before starting any MEDIUM or LARGE task to understand the full scope of change.
+Use before `/explore-and-plan` when the target is already known but blast radius is uncertain.
+Use after a production incident to understand what else might be affected by the fix.
+---
+## Steps
+### 1. Identify the target
+From $ARGUMENTS, determine:
+- **Target type:** file / class / handler / entity / endpoint / module
+- **Target location:** resolve to exact file path(s) if not already a path
+- **Stack:** infer from path extension and `.ai/memory/architecture.md`
+Read `.ai/memory/architecture.md` and `.ai/memory/conventions.md` to understand layer rules
+and naming conventions before searching.
+### 2. Map direct callers
+Search for all direct references to the target:
+```bash
+# Search for imports, usages, and references
+# Adapt search patterns to the detected stack:
+# - .NET: class name, interface name, constructor injection, handler registration
+# - Node.js: require/import of the file, function call sites
+# - Blazor: component references, @inject, @page routes, event handlers
+# - SQL/migrations: table name, column name in queries and seeders
+```
+Build the direct caller list:
+| File | Reference type | Layer |
+|------|---------------|-------|
+| ... | import / call / inject / inherit | controller / service / handler / UI / test |
+### 3. Map transitive impact
+For each direct caller, check if it is itself called by other files:
+- Go one level deeper if the direct caller is an interface, base class, or shared service
+- Stop at two levels unless the target is a core shared abstraction (entity, base class, shared interface)
+- Flag if the dependency graph is too wide to enumerate (>20 unique callers at any level)
+### 4. Map tests
+Find all test files that directly or indirectly test the target:
+```bash
+# Search test directories for the target name, class name, or endpoint path
+# Look for test doubles (mocks, fakes, stubs) of the target
+```
+| Test file | Tests what | Has mock/fake of target? |
+|-----------|-----------|--------------------------|
+| ... | ... | yes / no |
+Flag any test file that uses a mock/fake of the target — changing the target's interface or
+exception types will require updating those fakes.
+### 5. Map API endpoints and UI pages
+If the target is a handler, service, or entity:
+- Find which API endpoints call it (controller/route → handler)
+- Find which UI pages or components consume those endpoints (if frontend source is available)
+| Endpoint | Method | UI page/component | Consumer type |
+|----------|--------|------------------|---------------|
+| ... | ... | ... | internal / public API |
+Mark endpoints as **public API** if they are exposed externally — changes to those have higher blast radius.
+### 6. Classify blast radius
+| Dimension | Count | Assessment |
+|-----------|-------|-----------|
+| Direct callers | N | — |
+| Transitive callers | N | — |
+| Test files affected | N | — |
+| API endpoints affected | N | — |
+| UI pages affected | N | — |
+| Public API contracts affected | N | HIGH risk if >0 |
+| Auth/tenant code affected | yes/no | HIGH risk if yes |
+| DB schema affected | yes/no | HIGH risk if yes |
+**Overall blast radius:**
+- **LOW** — 1-2 files, same layer, no public API, no auth/schema
+- **MEDIUM** — 3-7 files, cross-layer, no public API change
+- **HIGH** — 8+ files, or public API, or auth/tenant, or DB schema change
+### 7. Identify change categories
+Classify what types of changes to the target would cause breakage vs. safe changes:
+| Change type | Breakage risk | Affected consumers |
+|-------------|--------------|-------------------|
+| Add new field (non-breaking) | LOW | none |
+| Rename field or method | HIGH | all callers + test fakes |
+| Change return type | HIGH | all callers |
+| Change exception thrown | MEDIUM | test fakes + callers that catch |
+| Add required parameter | HIGH | all call sites |
+| Add optional parameter | LOW | none |
+| Split into two classes | HIGH | all callers + DI registrations |
+| Change DB column | HIGH | queries + migrations |
+---
+## Return Format
+```markdown
+---
+**Impact Analysis: [target name]**
+**Target type:** [file / class / handler / entity / endpoint / module]
+**Stack:** [detected]
+## Direct Callers
+[table from Step 2]
+## Transitive Impact
+[table or "None — direct callers are leaf nodes"]
+## Tests Affected
+[table from Step 4]
+[Flag: "N test files use a mock/fake of this target — update them if interface changes"]
+## Endpoints and UI Pages
+[table from Step 5, or "Not applicable"]
+## Blast Radius Summary
+[table from Step 6]
+**Blast radius: LOW / MEDIUM / HIGH**
+## Safe vs. Breaking Changes
+[table from Step 7]
+## Recommendation
+[1-3 sentences: what to do before making this change, and what to watch for]
+---
+```
+If $ARGUMENTS does not resolve to a known file or name, ask for clarification before proceeding.
+Do not guess at the target.
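
Note on the hunk above: the "Overall blast radius" rules in Step 6 can be expressed as a short decision function. This is a sketch under stated assumptions, not code from the package. The function name and field names are hypothetical. The skill text does not say how to grade a 1-2 file change that crosses layers; the sketch assumes such a change is at least MEDIUM.

```javascript
// Hypothetical classifier for the Step 6 rules. NOT part of crewkit.
function classifyBlastRadius({
  filesAffected,
  crossLayer = false,
  publicApi = false,
  authOrTenant = false,
  dbSchema = false,
}) {
  // HIGH: 8+ files, or public API, or auth/tenant, or DB schema change.
  if (filesAffected >= 8 || publicApi || authOrTenant || dbSchema) {
    return 'HIGH';
  }
  // MEDIUM: 3-7 files, or (assumption) any cross-layer change.
  if (filesAffected >= 3 || crossLayer) {
    return 'MEDIUM';
  }
  // LOW: 1-2 files, same layer, no public API, no auth/schema.
  return 'LOW';
}
```

Checking the HIGH conditions first means a two-file change that touches a public API contract is still graded HIGH, matching the "HIGH risk if >0" rows in the Step 6 table.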

package/skill/templates/skills/retro/SKILL.md ADDED

@@ -0,0 +1,134 @@
+---
+name: retro
+description: "Post-mortem of a completed task or plan — analyzes fix loops, reviewer findings, lessons learned, and suggests process improvements."
+---
+Post-mortem for: $ARGUMENTS
+If $ARGUMENTS is empty, analyze the most recent completed task from git log.
+---
+## When to use
+Use after a task is completed (or after a painful iteration) to extract durable lessons
+and identify process improvements. Not a blame exercise — a signal extraction exercise.
+---
+## Steps
+### 1. Gather signals
+Run in parallel:
+```bash
+git log --oneline -30
+git diff HEAD~10..HEAD --stat
+```
+Also read (if they exist):
+- `.ai/plans/` — any plan file matching $ARGUMENTS
+- `.ai/memory/lessons-*.md` — to avoid duplicating existing lessons
+If $ARGUMENTS points to a specific plan file, read it directly.
+### 2. Reconstruct the timeline
+From git log and plan file, reconstruct:
+| Phase | What happened | Agent/actor |
+|-------|--------------|-------------|
+| Initial implementation | ... | coder |
+| First test run | PASS / FAIL | tester |
+| First review | APPROVED / NEEDS_CHANGES | reviewer |
+| Fix loop iterations | count + what was fixed | coder |
+| Final state | PASS + APPROVED | — |
+Flag any phase that repeated more than once.
+### 3. Analyze fix loops
+For each fix loop iteration, identify:
+- **Trigger:** what caused the loop (test failure / reviewer finding / build error)
+- **Root cause category:** one of:
+  - `spec-gap` — requirement was ambiguous or incomplete
+  - `scope-underestimate` — task was classified smaller than it was
+  - `missing-context` — coder lacked critical info (architecture, conventions, existing pattern)
+  - `test-gap` — test didn't cover the scenario before the fix
+  - `review-gap` — reviewer finding could have been caught earlier
+  - `execution-error` — correct spec, wrong implementation (typo, off-by-one, wrong field)
+  - `external-dependency` — blocked by something outside the task
+### 4. Classify reviewer findings
+If reviewer output is available (from PR diff, plan notes, or conversation context), classify findings:
+| Severity | Count | Recurring? | Category |
+|----------|-------|-----------|----------|
+| CRITICAL | ... | yes/no | ... |
+| IMPORTANT | ... | yes/no | ... |
+| MINOR | ... | yes/no | ... |
+"Recurring" = same finding appeared in a previous retro or lesson file.
+### 5. Identify process improvements
+For each fix loop trigger, propose one concrete process change:
+| Finding | Proposed change | Target phase |
+|---------|----------------|--------------|
+| e.g. coder missed multi-tenant rule | Add explicit tenant check to coder prompt | Step 0 (classify) |
+| e.g. test missed edge case | Add edge case checklist to tester for this module | tester |
+Keep proposals concrete and actionable. Do not propose vague "be more careful" items.
+### 6. Extract durable lessons
+For each lesson that would prevent future recurrence, format as:
+```markdown
+### [YYYY-MM-DD] Retro: <short title>
+- **Task:** [what was being built/fixed]
+- **What happened:** [1-2 sentences]
+- **Root cause:** [category from Step 3]
+- **Lesson:** [actionable guidance for next time]
+- **Applies to:** [domain: .NET / gateway / Blazor / process / all]
+```
+Append to the correct `.ai/memory/lessons-{domain}.md`.
+If lesson is process-level, append to `lessons-process.md` (create if missing).
+### 7. Update plan status
+If a plan file was identified, update its status to **DONE** (if not already).
+---
+## Return Format
+```markdown
+---
+**Retro: [task name or git range]**
+**Period:** [date range from git log]
+**Fix loop count:** [N]
+**Timeline summary:**
+[reconstructed table from Step 2]
+**Fix loop analysis:**
+[table from Step 3]
+**Reviewer findings:**
+[table from Step 4, or "not available"]
+**Process improvements:**
+[table from Step 5]
+**Lessons documented:** [N lessons → file(s)]
+**Top recommendation:** [single most impactful process change]
+---
+```
+If no fix loops occurred and review was clean on the first pass, state that explicitly — it is a positive signal worth noting.
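
Note on the hunk above: Step 6 defines a fixed lesson template and Step 3 a closed list of root cause categories. A minimal sketch of how an entry could be rendered and validated follows. The function name `formatLesson` and its field names are hypothetical and do not appear in the package; only the template lines and category names come from the skill text.

```javascript
// Root cause categories listed in Step 3 of the retro skill.
const ROOT_CAUSES = [
  'spec-gap',
  'scope-underestimate',
  'missing-context',
  'test-gap',
  'review-gap',
  'execution-error',
  'external-dependency',
];

// Hypothetical formatter for the Step 6 lesson template. NOT part of crewkit.
function formatLesson({ date, title, task, whatHappened, rootCause, lesson, appliesTo }) {
  if (!ROOT_CAUSES.includes(rootCause)) {
    throw new Error(`Unknown root cause category: ${rootCause}`);
  }
  return [
    `### [${date}] Retro: ${title}`,
    `- **Task:** ${task}`,
    `- **What happened:** ${whatHappened}`,
    `- **Root cause:** ${rootCause}`,
    `- **Lesson:** ${lesson}`,
    `- **Applies to:** ${appliesTo}`,
  ].join('\n');
}
```

Rejecting unknown categories keeps the lesson files consistent enough that later retros can detect recurring findings, as Step 4 expects.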

package/skill/templates/skills/review-pr/SKILL.md CHANGED

@@ -9,7 +9,10 @@ Review pull request: $ARGUMENTS
 ### 1. Fetch PR data
-Run in parallel:
+Try each source in order, stopping at the first that succeeds.
+#### 1a. GitHub (gh CLI)
 ```bash
 gh pr view $ARGUMENTS --json number,title,body,author,baseRefName,headRefName,additions,deletions,changedFiles
 gh pr diff $ARGUMENTS
@@ -17,6 +20,32 @@ gh pr diff $ARGUMENTS
 If $ARGUMENTS is empty, use `gh pr view` (current branch's PR).
+If `gh` is not installed or returns an error, proceed to **1b**.
+#### 1b. GitLab (glab CLI)
+```bash
+glab mr view $ARGUMENTS --output json
+glab mr diff $ARGUMENTS
+```
+If $ARGUMENTS is empty, use `glab mr view` (current branch's MR).
+If `glab` is not installed or returns an error, proceed to **1c**.
+#### 1c. Pure git fallback
+```bash
+git log main..HEAD --oneline
+git diff main...HEAD
+```
+When using this fallback:
+- PR metadata (title, description, author, reviewer comments) is **not available**.
+- Communicate this clearly to the reviewer agent.
+- Ask the user to provide context about the changes if possible:
+  > "No GitHub/GitLab CLI was found. I'm reviewing based on raw git diff only. PR title, description, and comments are unavailable. If you can share what this change is about, it will improve the review."
 ### 2. Load project context
 Read `.ai/memory/architecture.md` and `.ai/memory/conventions.md`.
@@ -24,9 +53,10 @@ Read `.ai/memory/architecture.md` and `.ai/memory/conventions.md`.
 ### 3. Run reviewer agent
 Pass to **reviewer** subagent:
-- Full PR diff
-- PR title and description
-- File count and change size
+- Full PR/MR diff (or git diff if fallback)
+- PR/MR title and description (if available; otherwise note "unavailable — git fallback")
+- File count and change size (derive from diff if metadata unavailable)
+- Source used: GitHub / GitLab / git fallback
 - Project context from step 2
 The reviewer applies all checks from its instructions and `.ai/memory/conventions.md`, including project-specific rules (e.g., multi-tenant enforcement, architecture layer violations, forbidden patterns).
@@ -37,6 +67,7 @@ The reviewer applies all checks from its instructions and `.ai/memory/convention
 ---
 **PR #[number] — [title]**
 **Author:** [author] | **Branch:** [head] → [base]
+**Source:** GitHub / GitLab / git fallback (no PR metadata)
 **Size:** +[additions] / -[deletions] in [changedFiles] files
 **Findings:**
@@ -50,4 +81,7 @@ The reviewer applies all checks from its instructions and `.ai/memory/convention
 ---
 ```
-If no PR number provided and no current branch PR exists, ask for the PR number.
+If using git fallback, omit fields that are unavailable (number, author, branch names) and add a note:
+> "Review based on git diff only — PR metadata was not available."
+If no PR/MR number provided and no current branch PR/MR exists and git fallback also fails, ask the user for a PR number or a base branch to diff against.
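
Note on the hunk above: the new Step 1 is a first-success fallback chain (GitHub, then GitLab, then plain git). The sketch below illustrates that control flow only. It is not code from the package: `pickReviewSource` is a hypothetical name, and the probe functions are injected stand-ins for whatever would actually run `gh`, `glab`, or `git`.

```javascript
// Hypothetical illustration of the 1a -> 1b -> 1c order. NOT part of crewkit.
// `probes` maps a source name to a function that returns PR data or throws.
function pickReviewSource(probes) {
  const order = ['GitHub', 'GitLab', 'git fallback'];
  for (const name of order) {
    const probe = probes[name];
    if (typeof probe !== 'function') continue;
    try {
      const data = probe();
      return {
        source: name,
        data,
        // Title, description, and author are unavailable on the git fallback.
        metadataAvailable: name !== 'git fallback',
      };
    } catch {
      // Tool missing or command failed: try the next source.
    }
  }
  // All sources failed: the skill says to ask the user for a PR number
  // or a base branch to diff against.
  return null;
}
```

The `metadataAvailable` flag corresponds to the new `Source:` line in the return format, which tells the reader whether PR metadata backed the review.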

package/skill/templates/skills/security-scan/SKILL.md ADDED

@@ -0,0 +1,157 @@
+---
+name: security-scan
+description: "Scan for known CRITICAL security issues and OWASP top 10 vulnerabilities specific to this project. Reports status of each known issue."
+---
+Run security scan for: $ARGUMENTS
+If $ARGUMENTS is empty, scan the full codebase.
+If $ARGUMENTS is a path or module name, scope the scan to that area.
+---
+## When to use
+Use before merging changes that touch auth, tenant isolation, input handling, external integrations,
+or any area that directly processes user-supplied data. Not a replacement for automated SAST —
+a targeted, context-aware review using project conventions.
+---
+## Steps
+### 1. Load project context
+Read `.ai/memory/conventions.md` and `.ai/memory/architecture.md`.
+Extract:
+- Security rules declared in conventions (e.g., TenantId must come from JWT, not body)
+- Auth mechanism (JWT, sessions, API keys)
+- Data flow: where external input enters the system
+- Stack-specific concerns (e.g., SQL via ORM, HTML rendering, file uploads)
+### 2. Map attack surface
+Identify entry points within the scan scope:
+```bash
+# Find API endpoint definitions
+# (adapt pattern to the project's stack — controllers, routes, handlers, etc.)
+# Find places that read from request body, query string, or headers
+# Find places that write to DB or execute queries
+# Find places that render HTML or return user-supplied content
+# Find places that call external services
+# Find files that handle auth or session state
+```
+Run searches appropriate to the detected stack. Do not run all commands if the stack is clear
+from `.ai/memory/architecture.md`.
+### 3. Check OWASP Top 10
+For each category, determine status based on code inspection:
+| # | Category | Status | Evidence |
+|---|----------|--------|----------|
+| A01 | Broken Access Control | PASS / FAIL / PARTIAL / SKIP | [file:line or reason for skip] |
+| A02 | Cryptographic Failures | PASS / FAIL / PARTIAL / SKIP | |
+| A03 | Injection (SQL, NoSQL, cmd) | PASS / FAIL / PARTIAL / SKIP | |
+| A04 | Insecure Design | PASS / FAIL / PARTIAL / SKIP | |
+| A05 | Security Misconfiguration | PASS / FAIL / PARTIAL / SKIP | |
+| A06 | Vulnerable Components | PASS / FAIL / PARTIAL / SKIP | |
+| A07 | Auth & Session Management | PASS / FAIL / PARTIAL / SKIP | |
+| A08 | Software & Data Integrity | PASS / FAIL / PARTIAL / SKIP | |
+| A09 | Security Logging & Monitoring | PASS / FAIL / PARTIAL / SKIP | |
+| A10 | Server-Side Request Forgery | PASS / FAIL / PARTIAL / SKIP | |
+**Status definitions:**
+- `PASS` — checked, no issue found
+- `FAIL` — issue found, report exact location
+- `PARTIAL` — partially mitigated, describe gap
+- `SKIP` — not applicable to this scope (explain why)
+### 4. Check project-specific security rules
+From the rules extracted in Step 1, verify each one explicitly.
+Common examples (adapt to what conventions.md actually says):
+| Rule | Status | Evidence |
+|------|--------|----------|
+| TenantId sourced from JWT only (never from body/query) | PASS / FAIL | |
+| No hardcoded secrets or API keys in source files | PASS / FAIL | |
+| Auth enforced on all non-public endpoints | PASS / FAIL | |
+| User-supplied data never passed to raw SQL or shell | PASS / FAIL | |
+| File uploads validated for type and size | PASS / FAIL / N/A | |
+| External service calls use timeout and error handling | PASS / FAIL / N/A | |
+### 5. Check for secrets in code
+Search for common secret patterns:
+```bash
+# Patterns to search: password=, secret=, apikey=, token=, connectionstring=
+# in non-test, non-example source files
+# Flag any hardcoded value that is not an environment variable reference
+```
+Report any findings with file path and line number. Flag `FAIL` in A02 if found.
+### 6. Classify findings
+| Severity | Criteria |
+|----------|----------|
+| CRITICAL | Exploitable without authentication, or exposes tenant data across boundaries |
+| HIGH | Exploitable by authenticated users, privilege escalation, or data exposure |
+| MEDIUM | Requires specific conditions, indirect exposure, or defense-in-depth gap |
+| LOW | Best practice violation, minor information leakage, or hardening opportunity |
+---
+## Return Format
+```markdown
+---
+**Security Scan Report**
+**Scope:** [full codebase or specific module]
+**Stack:** [detected from architecture.md]
+## OWASP Top 10 Status
+[table from Step 3]
+## Project-Specific Rules
+[table from Step 4]
+## Findings
+### CRITICAL
+- [none | list with file:line, description, remediation]
+### HIGH
+- [none | list]
+### MEDIUM
+- [none | list]
+### LOW
+- [none | list]
+## Summary
+| Total findings | CRITICAL | HIGH | MEDIUM | LOW |
+|---------------|----------|------|--------|-----|
+| N | N | N | N | N |
+**Overall status:** CLEAN / NEEDS_ATTENTION / CRITICAL_ACTION_REQUIRED
+## Recommended next steps
+[numbered list, prioritized by severity]
+---
+```
+**CLEAN** = zero CRITICAL or HIGH findings.
+**NEEDS_ATTENTION** = one or more MEDIUM findings, zero CRITICAL/HIGH.
+**CRITICAL_ACTION_REQUIRED** = any CRITICAL or HIGH finding present.
+Do not suggest generic remediation. Every recommendation must reference the specific file,
+function, or rule from conventions.md.
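
Note on the hunk above: the three overall status definitions overlap as written, since a scan with only MEDIUM findings has "zero CRITICAL or HIGH findings" and so satisfies both CLEAN and NEEDS_ATTENTION. The sketch below shows one possible reading, not the package's behavior. It assumes that any MEDIUM finding yields NEEDS_ATTENTION and that LOW-only results count as CLEAN. The function name `overallStatus` is hypothetical.

```javascript
// Hypothetical resolution of the overall status rules. NOT part of crewkit.
// Assumptions: MEDIUM findings take precedence over CLEAN,
// and LOW findings alone do not change a CLEAN status.
function overallStatus({ critical = 0, high = 0, medium = 0, low = 0 }) {
  if (critical > 0 || high > 0) return 'CRITICAL_ACTION_REQUIRED';
  if (medium > 0) return 'NEEDS_ATTENTION';
  return 'CLEAN';
}
```

Evaluating the most severe condition first gives each combination of counts exactly one status.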

package/src/add.js ADDED

@@ -0,0 +1,45 @@
+import { cpSync, mkdirSync, existsSync } from 'node:fs';
+import { join } from 'node:path';
+import { homedir } from 'node:os';
+const OPTIONAL_SKILLS = ['retro', 'dev-metrics', 'security-scan', 'impact'];
+export function add(skillName) {
+  if (!skillName) {
+    console.error('Error: skill name is required.');
+    console.log(`\n  Available optional skills:\n${OPTIONAL_SKILLS.map(s => `    - ${s}`).join('\n')}\n`);
+    process.exit(1);
+  }
+  if (!OPTIONAL_SKILLS.includes(skillName)) {
+    console.error(`Error: "${skillName}" is not a known optional skill.`);
+    console.log(`\n  Available optional skills:\n${OPTIONAL_SKILLS.map(s => `    - ${s}`).join('\n')}\n`);
+    process.exit(1);
+  }
+  const source = join(homedir(), '.claude', 'skills', 'crewkit-setup', 'templates', 'skills', skillName, 'SKILL.md');
+  if (!existsSync(source)) {
+    console.error(`Error: source not found at ${source}`);
+    console.log('  Make sure crewkit is installed first: npx crewkit install');
+    process.exit(1);
+  }
+  const targetDir = join(process.cwd(), '.claude', 'skills', skillName);
+  const target = join(targetDir, 'SKILL.md');
+  if (existsSync(target)) {
+    console.log(`  Warning: ${target} already exists. Overwriting.`);
+  }
+  mkdirSync(targetDir, { recursive: true });
+  cpSync(source, target);
+  console.log(`
+  ✓ Skill "${skillName}" installed
+  Copied to: ${target}
+  Use /${skillName} in Claude Code to run it.
+  `);
+}
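
Note on the hunk above: both validation branches in `add()` build the same "Available optional skills" message inline. The sketch below shows how that logic could be separated into pure helpers that are easy to test without calling `process.exit`. It is an illustration only, not the package's code: `availableSkillsMessage` and `validateSkillName` are hypothetical names, while the skill list and message strings are copied from the diff.

```javascript
// Skill list as it appears in package/src/add.js.
const OPTIONAL_SKILLS = ['retro', 'dev-metrics', 'security-scan', 'impact'];

// Hypothetical helper: builds the list shown after either validation error.
function availableSkillsMessage(skills = OPTIONAL_SKILLS) {
  return `\n  Available optional skills:\n${skills.map(s => `    - ${s}`).join('\n')}\n`;
}

// Hypothetical helper: returns an error string, or null if the name is valid.
function validateSkillName(skillName, skills = OPTIONAL_SKILLS) {
  if (!skillName) return 'Error: skill name is required.';
  if (!skills.includes(skillName)) {
    return `Error: "${skillName}" is not a known optional skill.`;
  }
  return null;
}
```

A caller could print the error and `availableSkillsMessage()` and then exit with status 1, which would keep the behavior of the original two branches.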