npm - singularity-claude - Versions diffs - 0.1.0 - Mend

singularity-claude 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23) hide show

package/.claude-plugin/marketplace.json +15 -0
package/.claude-plugin/plugin.json +21 -0
package/CHANGELOG.md +20 -0
package/LICENSE +21 -0
package/README.md +269 -0
package/agents/gap-detector.md +74 -0
package/agents/skill-assessor.md +55 -0
package/hooks/hooks.json +16 -0
package/hooks/run-hook.cmd +5 -0
package/hooks/session-start +97 -0
package/package.json +29 -0
package/scripts/score-manager.sh +301 -0
package/scripts/telemetry-writer.sh +191 -0
package/skills/creating-skills/SKILL.md +113 -0
package/skills/creating-skills/references/scoring-template.json +17 -0
package/skills/creating-skills/references/skill-template.md +36 -0
package/skills/crystallizing/SKILL.md +91 -0
package/skills/dashboard/SKILL.md +67 -0
package/skills/repairing/SKILL.md +109 -0
package/skills/reviewing/SKILL.md +85 -0
package/skills/scoring/SKILL.md +81 -0
package/skills/scoring/references/scoring-rubric.md +75 -0
package/skills/using-singularity/SKILL.md +60 -0

package/skills/crystallizing/SKILL.md ADDED Viewed

@@ -0,0 +1,91 @@
+---
+name: crystallizing
+description: Use when a skill has proven itself with consistently high scores across multiple executions and should be locked as a stable, immutable version via /singularity-crystallize
+---
+# Crystallize a Validated Skill
+Lock a battle-tested skill version as production-grade. Crystallized skills are immutable — further changes require a new version.
+## When to Use
+- Skill average score >= 90 with 5+ executions
+- Skill has handled at least one edge case
+- User confirms the skill is ready
+## Workflow
+### Step 1: Validate Readiness
+Read score file and config:
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/score-manager.sh" trend <skill-name>
+```
+Check requirements:
+- [ ] Average score >= `crystallizationThreshold` (default: 90)
+- [ ] Execution count >= `crystallizationMinExecutions` (default: 5)
+- [ ] At least one edge case recorded in score history
+- [ ] Maturity is `hardened` (not `draft` or `tested`)
+If not ready, explain what's missing:
+*"This skill needs <X more runs / higher scores / edge case coverage> before crystallization."*
+### Step 2: Confirm with User
+Show the skill's full score summary and ask:
+*"Ready to crystallize <skill-name> v<version>? This locks the current version as immutable."*
+### Step 3: Create Git Tag
+If `~/.claude/skills/` is a git repository:
+```bash
+cd ~/.claude/skills
+git add <skill-name>/
+git commit -m "singularity: crystallize <skill-name> v<version>"
+git tag "singularity/<skill-name>/v<version>"
+```
+If not a git repo, create a backup copy:
+```bash
+mkdir -p ~/.claude/singularity/crystallized/<skill-name>/
+cp -r ~/.claude/skills/<skill-name>/ ~/.claude/singularity/crystallized/<skill-name>/v<version>/
+```
+### Step 4: Update Records
+Update score file:
+- Set maturity to `"crystallized"` for the current version
+Update registry:
+- Set `maturity` to `"crystallized"`
+### Step 5: Log Telemetry
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/telemetry-writer.sh" log <skill-name> \
+  --trigger "crystallization" \
+  --summary "Crystallized v<version> (avg: <score>/100, <count> runs)"
+```
+## Report
+```
+Crystallized: <skill-name> v<version>
+  Average score: <avg>/100 over <count> executions
+  Edge cases handled: <count>
+  Git tag: singularity/<skill-name>/v<version>
+  Status: LOCKED — further changes require a new version
+```
+## Rollback
+To recover a crystallized version later:
+```bash
+git checkout singularity/<skill-name>/v<version> -- <skill-name>/
+```
+Or restore from backup:
+```bash
+cp -r ~/.claude/singularity/crystallized/<skill-name>/v<version>/ ~/.claude/skills/<skill-name>/
+```

package/skills/dashboard/SKILL.md ADDED Viewed

@@ -0,0 +1,67 @@
+---
+name: dashboard
+description: Use when wanting an overview of all singularity-managed skills showing their scores, maturity levels, trends, and alerts via /singularity-dashboard
+---
+# Singularity Dashboard
+Display a comprehensive overview of all managed skills with health indicators and actionable alerts.
+## Workflow
+### Step 1: Load Registry
+Read `~/.claude/singularity/registry.json` to get the list of all managed skills.
+If no skills are registered:
+*"No skills tracked yet. Create your first skill with `/singularity-create`."*
+### Step 2: Gather Data
+For each skill in the registry:
+1. Read the score file from `~/.claude/singularity/scores/<skill-name>.json`
+2. Compute trend: compare last 2 scores (if available)
+   - Score increased by 5+ → `↑`
+   - Score decreased by 5+ → `↓`
+   - Otherwise → `→`
+### Step 3: Display Table
+```
+Singularity Dashboard
+═══════════════════════════════════════════════════════════════════
+| Skill              | Version | Maturity      | Avg  | Runs | Trend | Last Used  |
+|--------------------|---------|---------------|------|------|-------|------------|
+| my-api-client      | v1.2.0  | hardened      | 87   | 12   | ↑     | 2026-03-17 |
+| data-transformer   | v1.0.1  | tested        | 72   | 5    | ↓     | 2026-03-15 |
+| form-generator     | v1.0.0  | draft         | 45   | 2    | →     | 2026-03-10 |
+═══════════════════════════════════════════════════════════════════
+```
+### Step 4: Show Alerts
+Highlight skills needing attention:
+**Needs Repair** (avg < 50, 2+ runs):
+```
+⚠ <skill-name>: avg score <n>/100 — run /singularity-repair
+```
+**Ready to Crystallize** (avg >= 90, 5+ runs, hardened):
+```
+✦ <skill-name>: avg score <n>/100, <n> runs — run /singularity-crystallize
+```
+**Stale** (not used in 30+ days):
+```
+⏳ <skill-name>: last used <n> days ago — review relevance
+```
+### Step 5: Summary Stats
+```
+Total skills: <n>
+  Draft: <n>  |  Tested: <n>  |  Hardened: <n>  |  Crystallized: <n>
+  Average health: <overall-avg>/100
+  Alerts: <n> needing repair, <n> ready to crystallize
+```

package/skills/repairing/SKILL.md ADDED Viewed

@@ -0,0 +1,109 @@
+---
+name: repairing
+description: Use when a singularity-managed skill is failing, scoring below threshold, or producing incorrect outputs that need automated correction via /singularity-repair
+---
+# Auto-Repair a Failing Skill
+Diagnose why a skill is underperforming and rewrite it to fix identified weaknesses. This is the core of the recursive evolution loop.
+## When to Use
+- Skill average score < 50 (auto-repair threshold)
+- Scoring skill suggested repair
+- User noticed skill producing wrong or incomplete output
+## Workflow
+### Step 1: Diagnose
+Read the skill's score history:
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/score-manager.sh" list <skill-name>
+```
+Read recent telemetry:
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/telemetry-writer.sh" list <skill-name> --last 5
+```
+Identify patterns:
+- Which rubric dimensions score lowest consistently?
+- What edge cases caused failures?
+- Are errors recurring?
+### Step 2: Read Current Skill
+Load the skill's SKILL.md:
+```
+~/.claude/skills/<skill-name>/SKILL.md
+```
+Also read any supporting files in `references/`.
+### Step 3: Identify Repair Targets
+Map low-scoring dimensions to specific skill content:
+| Low Dimension | Likely Cause | Repair Focus |
+|---------------|-------------|--------------|
+| Correctness | Wrong instructions or logic | Rewrite core workflow steps |
+| Completeness | Missing requirements | Add missing steps or checks |
+| Edge Cases | No error handling guidance | Add edge case handling section |
+| Efficiency | Verbose or roundabout | Streamline workflow, remove unnecessary steps |
+| Reusability | Hardcoded or too specific | Parameterize, add flexibility |
+### Step 4: Rewrite
+Edit the SKILL.md to address identified weaknesses. Preserve what works (high-scoring dimensions) and focus changes on the lowest-scoring areas.
+**Rules:**
+- Don't rewrite from scratch — surgical fixes only
+- Preserve the existing structure and naming
+- Add a "## Repair History" section at the bottom documenting what changed and why
+### Step 5: Bump Version
+The skill is now a new version. Update:
+1. Score file — add new version entry:
+   - Read current version from `~/.claude/singularity/scores/<skill-name>.json`
+   - Increment patch: v1.0.0 → v1.0.1 (or minor if significant changes: v1.0.0 → v1.1.0)
+2. Registry — update `currentVersion`
+### Step 6: Test
+Invoke the repaired skill with a scenario that previously failed or scored low.
+### Step 7: Score
+Run `singularity-claude:scoring` on the test output.
+### Step 8: Compare
+| New > Old avg | Keep new version |
+|---------------|-----------------|
+| New <= Old avg | Revert: restore previous SKILL.md from git history and keep old version as current |
+If repair failed (new version scores equal or worse):
+- Log the failed repair attempt in telemetry
+- Suggest user intervention: *"Auto-repair didn't improve the skill. Consider manual editing or `/singularity-review` for deeper analysis."*
+### Step 9: Log Telemetry
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/telemetry-writer.sh" log <skill-name> \
+  --trigger "auto-repair" \
+  --summary "Repaired: <what changed>. Score: <old-avg> → <new-score>"
+```
+## Report
+After repair, show:
+```
+Repair Summary for <skill-name>:
+  Previous version: v1.0.0 (avg: 42/100)
+  New version: v1.0.1
+  Changes: <bullet list of what was fixed>
+  Test score: <score>/100
+  Status: <Improved ✓ / Failed ✗ — reverted>
+```

package/skills/reviewing/SKILL.md ADDED Viewed

@@ -0,0 +1,85 @@
+---
+name: reviewing
+description: Use when wanting to assess the health of a singularity-managed skill, check its maturity, decide whether it needs repair or is ready for crystallization via /singularity-review
+---
+# Review Skill Health
+Produce a comprehensive health report for a singularity-managed skill and recommend next actions.
+## Workflow
+### Step 1: Load Data
+Read the skill's data from three sources:
+1. **Score history:**
+   ```bash
+   "${CLAUDE_PLUGIN_ROOT}/scripts/score-manager.sh" list <skill-name>
+   ```
+2. **Registry entry:** Read from `~/.claude/singularity/registry.json`
+3. **Recent telemetry:**
+   ```bash
+   "${CLAUDE_PLUGIN_ROOT}/scripts/telemetry-writer.sh" list <skill-name> --last 10
+   ```
+### Step 2: Compute Health Metrics
+| Metric | How to Calculate |
+|--------|-----------------|
+| **Current version** | From score file `currentVersion` |
+| **Maturity** | From score file version entry |
+| **Average score** | From score file version `averageScore` |
+| **Score trend** | Compare last 3 scores: improving (+), declining (-), stable (=) |
+| **Execution count** | From score file version `executionCount` |
+| **Lowest dimension** | Find the rubric dimension with lowest average across scores |
+| **Edge cases handled** | Count unique edge cases from score entries |
+| **Staleness** | Days since `lastExecuted` in registry |
+| **Repair history** | Count telemetry entries with trigger "auto-repair" |
+### Step 3: Generate Recommendation
+| Condition | Recommendation |
+|-----------|---------------|
+| Average declining over last 3 scores | "Skill is degrading. Consider `/singularity-repair`" |
+| Average < 50 with 2+ runs | "Skill needs repair. Run `/singularity-repair`" |
+| Average >= 90, 5+ runs, has edge cases, maturity is `hardened` | "Ready for crystallization. Run `/singularity-crystallize`" |
+| Execution count < 3 | "Needs more usage data. Use the skill more before evaluating." |
+| Not executed in 30+ days | "Stale skill. Review if still relevant." |
+| One dimension consistently low | "Weakest area: <dimension>. Targeted repair recommended." |
+| All metrics healthy | "Skill is healthy. Continue using." |
+### Step 4: Present Report
+```
+Health Report: <skill-name>
+═══════════════════════════════════════
+Version:     <version>
+Maturity:    <level>
+Avg Score:   <avg>/100 (<trend>)
+Executions:  <count>
+Last Used:   <date> (<days> days ago)
+Edge Cases:  <count> handled
+Dimension Breakdown:
+  Correctness:    <avg>/20
+  Completeness:   <avg>/20
+  Edge Cases:     <avg>/20
+  Efficiency:     <avg>/20
+  Reusability:    <avg>/20
+Weakest Area: <dimension> (<avg>/20)
+Repairs:     <count> attempted
+Recommendation: <recommendation>
+═══════════════════════════════════════
+```
+### Optional: Deep Analysis
+If the user wants more detail, dispatch the `singularity-claude:gap-detector` agent to analyze whether this skill's weaknesses indicate a need for:
+- Splitting into multiple focused skills
+- Merging with another skill
+- Fundamentally different approach

package/skills/scoring/SKILL.md ADDED Viewed

@@ -0,0 +1,81 @@
+---
+name: scoring
+description: Use after any singularity-managed skill execution to rate performance 0-100, track quality over time, and trigger repair when scores drop below threshold
+---
+# Score a Skill Execution
+Evaluate skill performance using a structured 5-dimension rubric. Scores drive the evolution loop: low scores trigger repair, high scores enable crystallization.
+## Workflow
+### Step 1: Identify the Skill
+Determine which singularity-managed skill was just executed. Check `~/.claude/singularity/registry.json` to confirm it's tracked.
+If the skill isn't registered, ask: *"This skill isn't tracked by singularity. Want me to register it first?"*
+### Step 2: Assess Performance
+Dispatch the `singularity-claude:skill-assessor` agent (haiku model, fast and cheap) with:
+- **Skill name** and version
+- **What was requested** (the user's original task)
+- **What was produced** (the skill's output/changes)
+- **The scoring rubric** from `references/scoring-rubric.md`
+The assessor returns a structured JSON score.
+### Step 3: Record the Score
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/score-manager.sh" add <skill-name> <total-score> \
+  --context "<what the skill was used for>" \
+  --strengths '["<strength1>", "<strength2>"]' \
+  --weaknesses '["<weakness1>"]' \
+  --edge-cases '["<edge-case-if-any>"]'
+```
+### Step 4: Check Thresholds
+Read config from `~/.claude/singularity/config.json`:
+| Condition | Action |
+|-----------|--------|
+| Average < `autoRepairThreshold` (50) for 2+ runs | Suggest: *"This skill is underperforming. Run `/singularity-repair` to fix it."* |
+| Average >= `crystallizationThreshold` (90) with 5+ runs | Suggest: *"This skill is ready for crystallization. Run `/singularity-crystallize` to lock it."* |
+### Step 5: Update Registry
+Update `~/.claude/singularity/registry.json` with:
+- `lastExecuted`: current timestamp
+- `executionCount`: increment
+- `averageScore`: from score file
+### Step 6: Log Telemetry
+```bash
+"${CLAUDE_PLUGIN_ROOT}/scripts/telemetry-writer.sh" log <skill-name> \
+  --trigger "scoring" \
+  --score <total-score> \
+  --summary "<brief assessment>"
+```
+### Step 7: Report
+Show the user:
+```
+Score: <total>/100 (avg: <average>/100 over <count> runs)
+  Correctness:    <n>/20
+  Completeness:   <n>/20
+  Edge Cases:     <n>/20
+  Efficiency:     <n>/20
+  Reusability:    <n>/20
+Maturity: <level> → <new-level-if-changed>
+```
+## Scoring Modes
+From `config.json`:
+- `"auto"` — Dispatch assessor agent automatically (default)
+- `"manual"` — Ask user for the 0-100 score directly
+- `"hybrid"` — Auto-assess, show result, let user override

package/skills/scoring/references/scoring-rubric.md ADDED Viewed

@@ -0,0 +1,75 @@
+# Singularity Scoring Rubric
+Rate each dimension 0-20. Total: 0-100.
+## Dimensions
+### 1. Correctness (0-20)
+Did the skill achieve the stated goal?
+| Score | Meaning |
+|-------|---------|
+| 0-5   | Failed completely or produced wrong output |
+| 6-10  | Partially correct, major issues |
+| 11-15 | Mostly correct, minor issues |
+| 16-20 | Fully correct, goal achieved |
+### 2. Completeness (0-20)
+Were all requirements addressed?
+| Score | Meaning |
+|-------|---------|
+| 0-5   | Most requirements missed |
+| 6-10  | Some requirements addressed |
+| 11-15 | Most requirements addressed, minor gaps |
+| 16-20 | All requirements fully addressed |
+### 3. Edge Case Handling (0-20)
+Did it handle unusual inputs, errors, and boundary conditions?
+| Score | Meaning |
+|-------|---------|
+| 0-5   | No edge case handling, fragile |
+| 6-10  | Basic error handling only |
+| 11-15 | Handles common edge cases |
+| 16-20 | Robust against unusual inputs and failures |
+### 4. Efficiency (0-20)
+Was the approach direct and token-efficient?
+| Score | Meaning |
+|-------|---------|
+| 0-5   | Wasteful, unnecessary steps, verbose |
+| 6-10  | Some unnecessary work |
+| 11-15 | Mostly efficient, minor overhead |
+| 16-20 | Direct, minimal, no wasted effort |
+### 5. Reusability (0-20)
+Could the output be reused or adapted for similar tasks?
+| Score | Meaning |
+|-------|---------|
+| 0-5   | One-off, hardcoded, not adaptable |
+| 6-10  | Limited reusability |
+| 11-15 | Reasonably adaptable |
+| 16-20 | Highly reusable, well-parameterized |
+## Output Format
+Return as JSON:
+```json
+{
+  "totalScore": 75,
+  "dimensions": {
+    "correctness": { "score": 18, "rationale": "..." },
+    "completeness": { "score": 15, "rationale": "..." },
+    "edgeCases": { "score": 12, "rationale": "..." },
+    "efficiency": { "score": 16, "rationale": "..." },
+    "reusability": { "score": 14, "rationale": "..." }
+  },
+  "strengths": ["...", "..."],
+  "weaknesses": ["..."],
+  "edgeCasesEncountered": ["..."]
+}
+```

package/skills/using-singularity/SKILL.md ADDED Viewed

@@ -0,0 +1,60 @@
+---
+name: using-singularity
+description: Use when starting any conversation where skills may need to be created, evaluated, repaired, or when a capability gap is detected during task execution
+---
+# Singularity — Self-Evolving Skill Engine
+You have singularity-claude, a system that creates, scores, repairs, and hardens Claude Code skills through recursive improvement cycles.
+## Decision Flow
+When a task arrives:
+1. Task arrives → Does a skill exist?
+   - **Yes** → Execute it → Score with `/singularity-score` → Maturity auto-updates
+     - Avg < 50 (2+ runs) → Suggest `/singularity-repair`
+     - Avg ≥ 90, 5+ runs, hardened, edge cases → Suggest `/singularity-crystallize`
+     - Otherwise → Keep using and scoring
+   - **No** → Is this a recurring need?
+     - **Yes** → Create with `/singularity-create`
+     - **No** → Do manually
+## Available Commands
+| Command | Skill | Purpose |
+|---------|-------|---------|
+| `/singularity-create` | `singularity-claude:creating-skills` | Build a new skill |
+| `/singularity-score` | `singularity-claude:scoring` | Rate a skill execution 0-100 |
+| `/singularity-review` | `singularity-claude:reviewing` | Health check a skill |
+| `/singularity-repair` | `singularity-claude:repairing` | Auto-fix a failing skill |
+| `/singularity-crystallize` | `singularity-claude:crystallizing` | Lock a validated version |
+| `/singularity-dashboard` | `singularity-claude:dashboard` | Overview of all managed skills |
+## Capability Gap Detection
+Watch for these signals during task execution — they indicate a new skill should be created:
+1. **Repetition across sessions** — You're doing the same multi-step procedure you've done before
+2. **Task failure without skill coverage** — No existing skill addresses this capability
+3. **Generalizable pattern** — The procedure would apply beyond this specific task
+4. **Complex workflow** — The task requires 5+ steps that could be encoded
+When you detect a gap, suggest: *"This looks like a capability gap. Want me to create a skill for this with `/singularity-create`?"*
+## Data Locations
+- **Registry:** `~/.claude/singularity/registry.json`
+- **Scores:** `~/.claude/singularity/scores/<skill-name>.json`
+- **Telemetry:** `~/.claude/singularity/telemetry/<skill-name>/`
+- **Config:** `~/.claude/singularity/config.json`
+- **Created skills:** `~/.claude/skills/<skill-name>/SKILL.md`
+## Maturity Levels
+| Level | Criteria | Meaning |
+|-------|----------|---------|
+| `draft` | < 3 executions | New, unproven |
+| `tested` | 3+ runs, avg >= 60 | Working but not battle-tested |
+| `hardened` | 5+ runs, avg >= 80, edge cases handled | Reliable |
+| `crystallized` | Locked via git tag | Production-grade, immutable |