npm - oh-my-customcode - Versions diffs - 0.64.0 → 0.64.1 - Mend

oh-my-customcode 0.64.0 → 0.64.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (23)

package/README.md +3 -3
package/dist/cli/index.js +1 -1
package/dist/index.js +1 -1
package/package.json +1 -1
package/templates/.claude/agents/arch-documenter.md +2 -0
package/templates/.claude/agents/arch-speckit-agent.md +4 -0
package/templates/.claude/agents/fe-design-expert.md +5 -0
package/templates/.claude/agents/mgr-claude-code-bible.md +1 -0
package/templates/.claude/agents/mgr-creator.md +1 -0
package/templates/.claude/agents/mgr-gitnerd.md +4 -0
package/templates/.claude/agents/mgr-sauron.md +1 -0
package/templates/.claude/agents/mgr-supplier.md +5 -0
package/templates/.claude/agents/mgr-updater.md +4 -0
package/templates/.claude/agents/qa-engineer.md +3 -0
package/templates/.claude/agents/qa-planner.md +2 -0
package/templates/.claude/agents/qa-writer.md +5 -0
package/templates/.claude/agents/sys-memory-keeper.md +4 -0
package/templates/.claude/agents/sys-naggy.md +5 -0
package/templates/.claude/agents/tool-optimizer.md +3 -0
package/templates/.claude/skills/evaluator-optimizer/SKILL.md +4 -0
package/templates/.claude/skills/harness-eval/SKILL.md +95 -0
package/templates/CLAUDE.md +2 -1
package/templates/manifest.json +2 -2

package/README.md CHANGED

@@ -13,7 +13,7 @@
 **[한국어 문서 (Korean)](./README_ko.md)**
-46 agents. 97 skills. 21 rules. One command.
+46 agents. 98 skills. 21 rules. One command.
 ```bash
 npm install -g oh-my-customcode && cd your-project && omcustom init
@@ -146,7 +146,7 @@ Each agent declares its tools, model, memory scope, and limitations in YAML fron
 ---
-### Skills (97)
+### Skills (98)
 | Category | Count | Includes |
 |----------|-------|----------|
@@ -282,7 +282,7 @@ your-project/
 ├── CLAUDE.md                   # Entry point
 ├── .claude/
 │   ├── agents/                 # 46 agent definitions
-│   ├── skills/                 # 97 skill modules
+│   ├── skills/                 # 98 skill modules
 │   ├── rules/                  # 21 governance rules (R000-R021)
 │   ├── hooks/                  # 15 lifecycle hook scripts
 │   ├── schemas/                # Tool input validation schemas

package/dist/cli/index.js CHANGED

@@ -9325,7 +9325,7 @@ var init_package = __esm(() => {
     workspaces: [
       "packages/*"
     ],
-    version: "0.64.0",
+    version: "0.64.1",
     description: "Batteries-included agent harness for Claude Code",
     type: "module",
     bin: {

package/dist/index.js CHANGED

@@ -1674,7 +1674,7 @@ var package_default = {
   workspaces: [
     "packages/*"
   ],
-  version: "0.64.0",
+  version: "0.64.1",
   description: "Batteries-included agent harness for Claude Code",
   type: "module",
   bin: {

package/package.json CHANGED

@@ -3,7 +3,7 @@
   "workspaces": [
     "packages/*"
   ],
-  "version": "0.64.0",
+  "version": "0.64.1",
   "description": "Batteries-included agent harness for Claude Code",
   "type": "module",
   "bin": {

package/templates/.claude/agents/arch-documenter.md CHANGED

@@ -14,6 +14,8 @@ tools:
   - Edit
   - Grep
   - Glob
+maxTurns: 20
+disallowedTools: [Bash]
 ---
 You handle software architecture documentation: system design docs, API specs, ADRs, and technical doc maintenance.

package/templates/.claude/agents/arch-speckit-agent.md CHANGED

@@ -12,6 +12,10 @@ tools:
   - Grep
   - Glob
   - Bash
+maxTurns: 20
+limitations:
+  - "cannot execute code"
+  - "cannot deploy infrastructure"
 ---
 You are a Spec-Driven Development agent that transforms requirements into executable specifications.

package/templates/.claude/agents/fe-design-expert.md CHANGED

@@ -9,6 +9,11 @@ skills:
   - impeccable-design
   - web-design-guidelines
 tools: [Read, Write, Edit, Grep, Glob, Bash]
+maxTurns: 20
+disallowedTools: [Bash]
+limitations:
+  - "cannot modify backend code"
+  - "cannot execute shell commands"
 source:
   type: external
   origin: github
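
Reviewer note: in this hunk `Bash` stays in `tools` while also being added to `disallowedTools`. The diff does not say which field takes precedence at runtime, so the snippet below is only a minimal sketch of a lint-style check that surfaces such overlaps. The function name `tool_conflicts` and the dictionary layout are hypothetical; the field values are copied from the hunk above.

```python
def tool_conflicts(frontmatter: dict) -> list:
    """Return tool names listed in both `tools` and `disallowedTools`."""
    allowed = frontmatter.get("tools", [])
    denied = set(frontmatter.get("disallowedTools", []))
    # Preserve the order of `tools` so the report is stable.
    return [name for name in allowed if name in denied]


# Frontmatter values as they appear after this change (hand-copied, not parsed).
fe_design_expert = {
    "tools": ["Read", "Write", "Edit", "Grep", "Glob", "Bash"],
    "maxTurns": 20,
    "disallowedTools": ["Bash"],
    "limitations": [
        "cannot modify backend code",
        "cannot execute shell commands",
    ],
}

print(tool_conflicts(fe_design_expert))  # ['Bash']
```

An empty result would mean the two lists are disjoint; a non-empty one is worth a second look, whichever way the harness resolves it.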

package/templates/.claude/agents/mgr-claude-code-bible.md CHANGED

@@ -5,6 +5,7 @@ model: sonnet
 domain: universal
 memory: project
 effort: medium
+maxTurns: 20
 skills:
   - claude-code-bible
 tools:

package/templates/.claude/agents/mgr-creator.md CHANGED

@@ -14,6 +14,7 @@ tools:
   - Grep
   - Glob
   - Bash
+maxTurns: 25
 ---
 You are an agent creation specialist following R006 (MUST-agent-design.md) rules.

package/templates/.claude/agents/mgr-gitnerd.md CHANGED

@@ -5,6 +5,10 @@ model: sonnet
 domain: universal
 memory: project
 effort: medium
+maxTurns: 20
+limitations:
+  - "cannot modify source code"
+  - "cannot create agents"
 tools:
   - Read
   - Write

package/templates/.claude/agents/mgr-sauron.md CHANGED

@@ -14,6 +14,7 @@ tools:
   - Grep
   - Glob
   - Bash
+maxTurns: 25
 ---
 You are an automated verification specialist that executes the mandatory R017 verification process, acting as the "all-seeing eye" that ensures system integrity through comprehensive multi-round verification.

package/templates/.claude/agents/mgr-supplier.md CHANGED

@@ -5,6 +5,11 @@ model: haiku
 domain: universal
 memory: local
 effort: low
+maxTurns: 10
+limitations:
+  - "cannot modify agent files"
+  - "cannot create new agents"
+disallowedTools: [Bash, Write, Edit]
 skills:
   - audit-agents
 tools:

package/templates/.claude/agents/mgr-updater.md CHANGED

@@ -5,6 +5,10 @@ model: sonnet
 domain: universal
 memory: project
 effort: medium
+maxTurns: 20
+limitations:
+  - "cannot create new agents"
+  - "cannot modify rules"
 skills:
   - update-external
   - update-docs

package/templates/.claude/agents/qa-engineer.md CHANGED

@@ -5,6 +5,9 @@ model: sonnet
 domain: universal
 memory: project
 effort: medium
+maxTurns: 20
+limitations:
+  - "cannot modify source code in production branches"
 tools:
   - Read
   - Write

package/templates/.claude/agents/qa-planner.md CHANGED

@@ -5,6 +5,8 @@ model: sonnet
 domain: universal
 memory: project
 effort: high
+maxTurns: 20
+disallowedTools: [Bash]
 limitations:
   - "cannot execute tests"
   - "cannot modify code"

package/templates/.claude/agents/qa-writer.md CHANGED

@@ -5,6 +5,11 @@ model: sonnet
 domain: universal
 memory: project
 effort: medium
+maxTurns: 20
+limitations:
+  - "cannot execute tests"
+  - "cannot modify source code"
+disallowedTools: [Bash]
 tools:
   - Read
   - Write

package/templates/.claude/agents/sys-memory-keeper.md CHANGED

@@ -16,6 +16,10 @@ tools:
   - Grep
   - Glob
   - Bash
+maxTurns: 15
+limitations:
+  - "cannot modify source code"
+  - "cannot execute tests"
 ---
 You are a session memory management specialist ensuring context survives across session compactions using claude-mem.

package/templates/.claude/agents/sys-naggy.md CHANGED

@@ -5,6 +5,11 @@ model: sonnet
 domain: universal
 memory: local
 effort: low
+maxTurns: 10
+limitations:
+  - "cannot modify project files"
+  - "cannot execute external commands"
+disallowedTools: [Bash]
 tools:
   - Read
   - Write

package/templates/.claude/agents/tool-optimizer.md CHANGED

@@ -14,6 +14,9 @@ tools:
   - Grep
   - Glob
   - Bash
+maxTurns: 20
+limitations:
+  - "cannot modify source code"
 ---
 You analyze and optimize application bundles, detect performance issues, and provide actionable recommendations.

package/templates/.claude/skills/evaluator-optimizer/SKILL.md CHANGED

@@ -363,3 +363,7 @@ evaluator-optimizer:
 Weight ordering (originality > craft > functionality) follows Anthropic's anti-slop principle: functionality is table stakes, but originality and craft distinguish quality output from generic AI generation.
 Integration: Works with [impeccable-design](/skills/impeccable-design) skill for design language enforcement.
+### Harness Eval Preset
+The `harness-eval` skill provides a structured 15-task SE benchmark rubric that can be used as a preset for the evaluator-optimizer pipeline. When invoked via `/omcustom:harness-eval`, the harness rubric dimensions (Test Coverage 30%, Architecture 25%, Error Handling 25%, Extensibility 20%) are loaded as the sprint contract criteria.

package/templates/.claude/skills/harness-eval/SKILL.md ADDED

@@ -0,0 +1,95 @@
+---
+name: harness-eval
+description: Structured SE task evaluation using 15 benchmark definitions from claude-code-harness research
+scope: harness
+user-invocable: true
+argument-hint: "[--preset all|quick] [--task task-name]"
+effort: high
+version: 1.0.0
+---
+# Harness Eval — Structured SE Task Benchmark
+## Purpose
+Evaluate agent quality using 15 structured software engineering task definitions with quantitative scoring. Based on research from [revfactory/claude-code-harness](https://github.com/revfactory/claude-code-harness) which demonstrated 60% improvement (49.5 → 79.3 points) through structured pre-configuration.
+## Usage
+```
+/omcustom:harness-eval                    # Run all 15 benchmarks
+/omcustom:harness-eval --preset quick     # Run top 5 high-impact benchmarks
+/omcustom:harness-eval --task api-design  # Run specific task benchmark
+```
+## Quality Dimensions
+| Dimension | Weight | Description |
+|-----------|--------|-------------|
+| Test Coverage | 30% | Unit test count, edge case coverage, assertion quality |
+| Architecture Design | 25% | Separation of concerns, dependency management, scalability |
+| Error Handling | 25% | Input validation, error propagation, recovery strategies |
+| Extensibility | 20% | Plugin points, configuration flexibility, API surface |
+## 15 SE Task Benchmark Suite
+| # | Task | Category | Key Evaluation Criteria |
+|---|------|----------|------------------------|
+| 1 | API Design | Architecture | RESTful conventions, versioning, error responses |
+| 2 | Data Modeling | Architecture | Schema normalization, relationships, indexing |
+| 3 | Authentication Flow | Security | Token management, session handling, OWASP compliance |
+| 4 | Test Suite Creation | Quality | Coverage breadth, assertion quality, edge cases |
+| 5 | Error Handler | Reliability | Error classification, recovery, user feedback |
+| 6 | Logging System | Observability | Structured logging, levels, correlation IDs |
+| 7 | Configuration Manager | Operations | Env-based config, validation, secrets handling |
+| 8 | CLI Tool | UX | Argument parsing, help text, exit codes |
+| 9 | Database Migration | Data | Reversibility, data preservation, zero-downtime |
+| 10 | Cache Layer | Performance | Invalidation strategy, TTL, cache-aside pattern |
+| 11 | Queue Consumer | Reliability | Idempotency, retry logic, dead letter handling |
+| 12 | Middleware Chain | Architecture | Composability, ordering, short-circuiting |
+| 13 | File Processor | I/O | Streaming, error recovery, format validation |
+| 14 | Webhook Handler | Integration | Signature verification, retry tolerance, idempotency |
+| 15 | Rate Limiter | Security | Algorithm choice, distributed state, fairness |
+## Scoring Rubric
+Each task is scored 0-100 across the 4 quality dimensions:
+```
+Score = (test_coverage × 0.30) + (architecture × 0.25) + (error_handling × 0.25) + (extensibility × 0.20)
+```
+### Score Thresholds
+| Score Range | Grade | Interpretation |
+|-------------|-------|----------------|
+| 80-100 | A | Production-ready, well-structured |
+| 60-79 | B | Functional with minor gaps |
+| 40-59 | C | Works but needs improvement |
+| 0-39 | D | Significant structural issues |
+## Presets
+### `all` (default)
+Run all 15 tasks. Full evaluation ~45 minutes.
+### `quick`
+Run top 5 high-impact tasks (1, 3, 4, 5, 12). Quick evaluation ~15 minutes.
+## Integration with evaluator-optimizer
+This skill provides preset rubrics for the evaluator-optimizer pipeline:
+```
+/omcustom:harness-eval → loads rubric → evaluator-optimizer executes → scoring → report
+```
+The evaluator-optimizer skill's `pre_negotiation` phase accepts harness-eval rubric dimensions as sprint contract criteria.
+## Output
+Results saved to `.claude/outputs/sessions/{YYYY-MM-DD}/harness-eval-{HHmmss}.md` with per-task scores and aggregate grade.
+## Attribution
+Evaluation framework based on research by [revfactory/claude-code-harness](https://github.com/revfactory/claude-code-harness). Adapted for oh-my-customcode's evaluator-optimizer pipeline with permission.
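
Reviewer note: the scoring rubric and grade thresholds in the added SKILL.md can be sketched as below. This is an illustration under stated assumptions, not the skill's implementation (the file ships prose only). The names `WEIGHTS`, `harness_score`, and `grade` are hypothetical, and the sketch assumes each range's lower bound is an inclusive floor, so a fractional score such as 79.5 is graded B.

```python
# Dimension weights as listed in the Quality Dimensions table.
WEIGHTS = {
    "test_coverage": 0.30,
    "architecture": 0.25,
    "error_handling": 0.25,
    "extensibility": 0.20,
}

# (inclusive lower bound, grade), highest first, from the Score Thresholds table.
GRADE_FLOORS = [(80, "A"), (60, "B"), (40, "C"), (0, "D")]


def harness_score(dims: dict) -> float:
    """Weighted sum of the four dimension scores, each expected in 0-100."""
    for name in WEIGHTS:
        if not 0 <= dims[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(dims[name] * weight for name, weight in WEIGHTS.items())


def grade(score: float) -> str:
    """Map a 0-100 score to a letter grade (assumes inclusive floors)."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    raise ValueError("score must be non-negative")


example = {
    "test_coverage": 90,
    "architecture": 80,
    "error_handling": 70,
    "extensibility": 60,
}
score = harness_score(example)
print(f"{score:.1f} {grade(score)}")  # 76.5 B
```

With these inputs the weighted terms are 27 + 20 + 17.5 + 12 = 76.5, which falls in the 60-79 band.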

package/templates/CLAUDE.md CHANGED

@@ -101,6 +101,7 @@ Powered by oh-my-customcode.
 | `/omcustom:update-external` | Update agents from external sources |
 | `/omcustom:audit-agents` | Audit agent dependencies |
 | `/omcustom:fix-refs` | Fix broken references |
+| `/omcustom:harness-eval` | Structured benchmark evaluation over 15 SE tasks |
 | `/omcustom:auto-improve` | Workflow that applies improvements automatically |
 | `/omcustom:improve-report` | eval-core-based improvement status report |
 | `/omcustom-takeover` | Extract a canonical spec from existing agents/skills |
@@ -138,7 +139,7 @@ project/
 +-- CLAUDE.md                    # Entry point
 +-- .claude/
 |   +-- agents/                  # Subagent definitions (46 files)
-|   +-- skills/                  # Skills (97 directories)
+|   +-- skills/                  # Skills (98 directories)
 |   +-- rules/                   # Global rules (R000-R021)
 |   +-- hooks/                   # Hook scripts (security, validation, HUD)
 |   +-- contexts/                # Context files (ecomode)

package/templates/manifest.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "version": "0.64.0",
+  "version": "0.64.1",
   "lastUpdated": "2026-03-24T00:00:00.000Z",
   "components": [
     {
@@ -18,7 +18,7 @@
       "name": "skills",
       "path": ".claude/skills",
       "description": "Reusable skill modules (includes slash commands)",
-      "files": 97
+      "files": 98
     },
     {
       "name": "guides",
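
Reviewer note: the skill count is bumped by hand in four places (README, CLAUDE.md template, and `manifest.json`), which is easy to get out of sync. The snippet below is a hedged sketch of a consistency check for the manifest alone. It assumes the `files` value for the `skills` component counts skill directories, as the CLAUDE.md template's tree comment suggests; the helper name `skill_count_mismatch` is hypothetical and is not part of the package.

```python
from pathlib import Path
from typing import Optional, Tuple


def skill_count_mismatch(project_root: Path, manifest: dict) -> Optional[Tuple[int, int]]:
    """Compare the declared skill count with the directories on disk.

    Returns None when they agree, otherwise (declared, actual).
    """
    # Locate the "skills" component entry shown in the manifest hunk.
    entry = next(c for c in manifest["components"] if c["name"] == "skills")
    skills_dir = project_root / entry["path"]
    # Assumption: one skill per immediate subdirectory; loose files are ignored.
    actual = sum(1 for child in skills_dir.iterdir() if child.is_dir())
    declared = entry["files"]
    return None if actual == declared else (declared, actual)
```

A caller would load the manifest with `json.loads(...)` and pass the directory that contains `.claude/`; a non-`None` result means the declared count needs updating.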