npm - guild-agents - Versions diffs - 1.5.0 → 2.1.0 - Mend

guild-agents 1.5.0 → 2.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (51) hide show

package/README.md +71 -67
package/bin/guild.js +4 -85
package/package.json +1 -1
package/src/commands/doctor.js +11 -33
package/src/commands/init.js +1 -1
package/src/templates/skills/build-feature/SKILL.md +7 -38
package/src/templates/skills/build-feature/evals/evals.json +2 -2
package/src/templates/skills/council/SKILL.md +4 -14
package/src/templates/skills/council/evals/evals.json +3 -13
package/src/templates/skills/create-pr/SKILL.md +2 -5
package/src/templates/skills/guild-specialize/SKILL.md +2 -5
package/src/templates/skills/qa-cycle/SKILL.md +0 -7
package/src/templates/skills/re-specialize/SKILL.md +0 -3
package/src/templates/skills/session-end/SKILL.md +77 -30
package/src/templates/skills/session-start/SKILL.md +51 -20
package/src/utils/eval-runner.js +2 -8
package/src/utils/generators.js +3 -4
package/src/utils/skill-parser.js +83 -0
package/src/utils/trigger-runner.js +1 -1
package/src/commands/logs.js +0 -63
package/src/commands/reset-learnings.js +0 -44
package/src/commands/run.js +0 -105
package/src/commands/stats.js +0 -147
package/src/templates/agents/learnings-extractor.md +0 -49
package/src/templates/skills/dev-flow/SKILL.md +0 -81
package/src/templates/skills/dev-flow/evals/evals.json +0 -36
package/src/templates/skills/dev-flow/evals/triggers.json +0 -16
package/src/templates/skills/new-feature/SKILL.md +0 -119
package/src/templates/skills/new-feature/evals/evals.json +0 -41
package/src/templates/skills/new-feature/evals/triggers.json +0 -16
package/src/templates/skills/review/SKILL.md +0 -97
package/src/templates/skills/review/evals/evals.json +0 -43
package/src/templates/skills/review/evals/triggers.json +0 -16
package/src/templates/skills/status/SKILL.md +0 -100
package/src/templates/skills/status/evals/evals.json +0 -40
package/src/templates/skills/status/evals/triggers.json +0 -16
package/src/templates/skills/verify/SKILL.md +0 -114
package/src/templates/skills/verify/evals/triggers.json +0 -16
package/src/utils/accounting.js +0 -139
package/src/utils/dispatch-protocol.js +0 -71
package/src/utils/dispatch.js +0 -172
package/src/utils/executor.js +0 -293
package/src/utils/learnings-io.js +0 -76
package/src/utils/learnings.js +0 -204
package/src/utils/orchestrator-io.js +0 -356
package/src/utils/orchestrator.js +0 -590
package/src/utils/pricing.js +0 -28
package/src/utils/providers/claude-code.js +0 -43
package/src/utils/skill-loader.js +0 -83
package/src/utils/trace.js +0 -400
package/src/utils/workflow-parser.js +0 -225

package/README.md CHANGED Viewed

@@ -7,7 +7,7 @@
 **Guild makes Claude Code think before it builds.**
-Guild is a spec-driven development CLI for Claude Code. It installs structured design and development workflows as `.claude/` markdown files in any project. Before code is written, features are evaluated, debated by independent AI perspectives, and specified in a design doc. Everything is markdown, tracked by git, works offline, zero infrastructure.
+Without Guild, Claude Code writes code immediately. No evaluation, no design, no review. With Guild, every feature goes through structured phases — evaluated by an advisor, designed by a tech lead, reviewed by a code reviewer, validated by QA — before anything ships. Everything is markdown in `.claude/`, tracked by git, works offline, zero infrastructure.
 ## The Problem
@@ -20,10 +20,11 @@ Without structure, Claude Code:
 ## How Guild Solves It
-- **Spec before code**: every feature starts with a design doc
-- **Structured deliberation**: `/council` runs parallel independent analysis -- multiple perspectives evaluate independently, then synthesize
-- **Decisions that persist**: design docs, session state, and project context live in git-tracked markdown
-- **Zero infrastructure**: no servers, no APIs, just markdown files and Claude Code
+- **Spec before code**: `/build-feature` enforces evaluation, design, and review phases — code comes after the design doc
+- **Independent perspectives**: `/council` spawns parallel agents that each analyze your idea independently, then synthesize into a decision
+- **Session continuity**: `/session-start` and `/session-end` combine SESSION.md with Claude Code's memory system — you never lose context between sessions
+- **Behavioral discipline**: `/tdd` and `/debug` prevent the most common LLM anti-patterns: code before tests, fixes before root cause analysis
+- **Quality you can measure**: `guild eval` validates skill structure, trigger accuracy, and description quality with automated benchmarks
 ## Quick Start
@@ -43,46 +44,56 @@ Then use skills as slash commands in Claude Code:
 ## The Pipeline
 ```text
-You ──> /council "Add JWT auth"
+You ──> /build-feature "Add JWT auth"
          │
          ▼
-    ┌──────────┐     ┌──────────────┐     ┌──────────┐
-    │ Evaluate │────>│  Design Doc  │────>│  Build   │
-    │ debate   │     │  spec        │     │ implement│
-    └──────────┘     └──────────────┘     └────┬─────┘
-                                               │
-                                         ┌─────┴─────┐
-                                         ▼           ▼
-                                   ┌──────────┐┌──────────┐
-                                   │  Review  ││    QA    │
-                                   └──────────┘└──────────┘
+    ┌──────────┐     ┌──────────┐     ┌──────────┐
+    │ Evaluate │────>│  Design  │────>│  Build   │
+    │ advisor  │     │ tech-lead│     │developer │
+    └──────────┘     └──────────┘     └────┬─────┘
+                                           │
+                                     ┌─────┴─────┐
+                                     ▼           ▼
+                               ┌──────────┐┌──────────┐
+                               │  Review  ││    QA    │
+                               └──────────┘└──────────┘
 ```
 Five phases: **evaluate**, **design**, **implement**, **review**, **validate**. Phases 1-2 happen before any code is written.
-## Skills Reference
-All 15 skills, grouped by function:
-| Skill | Group | Description |
-| --- | --- | --- |
-| `/build-feature` | Pipeline | Full pipeline: evaluate, spec, implement, review, QA |
-| `/new-feature` | Pipeline | Create branch and scaffold for a new feature |
-| `/create-pr` | Pipeline | Create a structured pull request from current branch |
-| `/council` | Decision | Multi-perspective deliberation on a decision or feature |
-| `/review` | Quality | Code review on the current diff |
-| `/qa-cycle` | Quality | QA and bugfix loop until clean |
-| `/tdd` | Discipline | TDD red-green-refactor cycle |
-| `/debug` | Discipline | Systematic 4-phase debugging |
-| `/verify` | Discipline | Evidence-before-claims verification |
-| `/guild-specialize` | Context | Explore codebase, enrich CLAUDE.md with real conventions |
-| `/re-specialize` | Context | Incremental update of auto-generated CLAUDE.md zones |
-| `/session-start` | Context | Load context and resume work |
-| `/session-end` | Context | Save state to SESSION.md |
-| `/status` | Context | Project and session state overview |
-| `/dev-flow` | Context | Show current pipeline phase and next step |
-## CLI Commands
+## Skills
+10 skills, available as slash commands in Claude Code:
+| Skill | What it does |
+| --- | --- |
+| `/build-feature` | Full pipeline: evaluate, design, implement, review, QA |
+| `/council` | Multi-perspective deliberation — 3 agents debate independently, then synthesize |
+| `/create-pr` | Structured pull request from current branch |
+| `/qa-cycle` | QA and bugfix loop until clean |
+| `/tdd` | TDD red-green-refactor — no code without a failing test |
+| `/debug` | Systematic 4-phase debugging — no fixes without root cause |
+| `/guild-specialize` | Explore your codebase, enrich CLAUDE.md with real conventions |
+| `/re-specialize` | Incremental update of CLAUDE.md when your stack changes |
+| `/session-start` | Resume work from SESSION.md + Claude Code memory |
+| `/session-end` | Save state to SESSION.md + durable learnings to memory |
+## Agents
+6 specialized roles that give Claude Code distinct perspectives:
+| Agent | Role |
+| --- | --- |
+| advisor | Evaluates ideas and provides strategic direction. First gate before any work begins |
+| tech-lead | Breaks features into tasks. Defines technical approach and architecture |
+| developer | Implements features following project conventions. Writes tests, makes atomic commits |
+| code-reviewer | Reviews quality, patterns, and technical debt |
+| qa | Testing, edge cases, regression. Validates the implementation meets acceptance criteria |
+| bugfix | Diagnosis and bug resolution. Isolates root causes and applies targeted fixes |
+Each agent is a flat `.md` file with identity, responsibilities, and boundaries. Claude Code reads them via its native Agent tool and assumes the role.
+## CLI
 ```bash
 guild init              # Interactive project onboarding
@@ -90,45 +101,38 @@ guild new-agent <name>  # Create a custom agent
 guild status            # Show project status
 guild doctor            # Diagnose setup
 guild list              # List agents and skills
-guild run <skill>       # Preview a skill's execution plan (dry-run)
-guild logs              # View execution traces
-guild logs clean        # Remove old traces (--days N, --all)
-guild stats             # Token usage and cost estimates
 guild eval              # Run structural skill evaluations
-guild eval --triggers   # Run trigger accuracy tests (keyword matcher)
-guild eval --semantic   # Run trigger tests with LLM semantic matcher
-guild eval --suggest    # Show description improvement suggestions
-guild workspace init <name> <members...>  # Create a workspace
-guild workspace add <path>                # Add a member repo
-guild workspace status                    # Show workspace state
+guild eval --triggers   # Run trigger accuracy tests
+guild eval --semantic   # LLM-based trigger tests (requires ANTHROPIC_API_KEY)
+guild eval --suggest    # Description improvement suggestions
+guild workspace init    # Create a multi-repo workspace
 ```
 ## Skill Evaluations
-Guild includes a built-in evaluation framework for validating skill quality:
+Guild includes a built-in framework for measuring skill quality:
-- **Structural evals** (`guild eval`) -- assert workflow structure: steps exist, roles are correct, gates are present
-- **Trigger tests** (`guild eval --triggers`) -- verify that user prompts route to the correct skill using keyword overlap scoring
-- **Semantic matcher** (`guild eval --semantic`) -- optional LLM-based scoring via Anthropic Haiku for higher-fidelity trigger testing (requires `ANTHROPIC_API_KEY`)
-- **Description suggestions** (`guild eval --suggest`) -- analyzes keyword gaps in skill descriptions based on failed triggers
+- **Structural evals** -- assert workflow structure: steps exist, roles are correct, gates are present
+- **Trigger tests** -- verify that user prompts route to the correct skill
+- **Semantic matcher** -- LLM-based scoring for higher-fidelity trigger testing
+- **Benchmarks** -- rolling history with per-skill accuracy, precision, recall, and regression detection
-Every trigger run automatically records results to `benchmarks/benchmark.json` (rolling 30-entry history) and generates `benchmarks/benchmark.md` with per-skill accuracy, precision, recall, and delta vs previous run. Regressions (>5% accuracy drop with 2+ tests flipped) are flagged automatically.
+## How It Works
-## Under the Hood
+Guild installs agent definitions and skill workflows as markdown files in your project's `.claude/` directory. Claude Code discovers and executes them natively — no custom runtime, no extra process, no API calls. When you type `/build-feature`, Claude Code reads the skill, follows the phases, and spawns agents using its own Agent tool.
-Guild coordinates 7 specialized agents through the pipeline. Each agent handles one phase.
+Guild defines **what** happens. Claude Code decides **how** to execute it.
-| Agent | Role |
-| --- | --- |
-| advisor | Evaluates ideas and provides strategic direction |
-| tech-lead | Defines technical approach, tasks, and architecture |
-| developer | Implements features following project conventions |
-| code-reviewer | Reviews quality, patterns, and technical debt |
-| qa | Testing, edge cases, regression validation |
-| bugfix | Bug diagnosis and resolution |
-| learnings-extractor | Extracts compound learnings from pipeline executions |
+## Session Continuity
+Claude Code's native memory system remembers who you are, lessons learned, and project context — knowledge that lasts months. But it explicitly does not store ephemeral work state: what you were building, which branch, what phase, what's next. That's the gap Guild fills.
+`/session-end` writes to **both layers**:
+- **SESSION.md** — where you stopped: task, branch, phase, next steps (overwritten each session)
+- **Claude Code memory** — what you learned: decisions, lessons, references (persists across sessions)
-Agents are flat `.md` files with identity and expertise. Skills orchestrate agents through structured pipelines. Everything lives in `.claude/`, readable by humans, tracked by git.
+`/session-start` reads from **both** and presents a unified summary. You resume exactly where you left off, with full context of what you know and what you were doing.
 ## Guild Builds Itself

package/bin/guild.js CHANGED Viewed

@@ -1,14 +1,15 @@
 #!/usr/bin/env node
 /**
- * Guild v1 — CLI entry point
+ * Guild v2 — CLI entry point
  * Usage:
- *   guild init           — interactive onboarding v1
+ *   guild init           — interactive onboarding
  *   guild new-agent      — create a new agent
  *   guild status         — view project status
  *   guild doctor         — verify setup and report issues
  *   guild list           — list installed agents and skills
- *   guild stats          — view token usage and cost stats
+ *   guild eval           — run skill structural evaluations
+ *   guild workspace      — manage multi-repo workspaces
  */
 import { program } from 'commander';
@@ -106,69 +107,6 @@ program
     }
   });
-// guild reset-learnings
-program
-  .command('reset-learnings')
-  .description('Reset the compound learnings file')
-  .option('-f, --force', 'Skip confirmation prompt')
-  .action(async (options) => {
-    try {
-      const { runResetLearnings } = await import('../src/commands/reset-learnings.js');
-      await runResetLearnings(options);
-    } catch (err) {
-      console.error(err.message);
-      process.exit(1);
-    }
-  });
-// guild run
-program
-  .command('run')
-  .description('Execute a skill workflow')
-  .argument('<skill>', 'Skill name to run')
-  .argument('[input]', 'Input text for the skill', '')
-  .option('--profile <profile>', 'Model profile (max, pro)', 'max')
-  .option('--dry-run', 'Display the execution plan without running it')
-  .action(async (skill, input, options) => {
-    try {
-      const { runRun } = await import('../src/commands/run.js');
-      await runRun(skill, input, options);
-    } catch (err) {
-      console.error(err.message);
-      process.exit(1);
-    }
-  });
-// guild logs (list traces)
-const logsCmd = program
-  .command('logs')
-  .description('View and manage execution traces')
-  .action(async (options) => {
-    try {
-      const { runLogs } = await import('../src/commands/logs.js');
-      await runLogs('list', options);
-    } catch (err) {
-      console.error(err.message);
-      process.exit(1);
-    }
-  });
-// guild logs clean
-logsCmd
-  .command('clean')
-  .description('Remove old execution traces')
-  .option('--days <days>', 'Remove traces older than N days', '30')
-  .option('--all', 'Remove all traces')
-  .action(async (options) => {
-    try {
-      const { runLogs } = await import('../src/commands/logs.js');
-      await runLogs('clean', options);
-    } catch (err) {
-      console.error(err.message);
-      process.exit(1);
-    }
-  });
 // guild eval
 program
   .command('eval')
@@ -195,25 +133,6 @@ program
     }
   });
-// guild stats
-program
-  .command('stats')
-  .description('View token usage stats and cost estimates')
-  .option('--period <period>', 'Filter by period: today, week, month, all', 'month')
-  .option('--compare', 'Compare cost across model profiles')
-  .option('--reset', 'Delete all usage history')
-  .option('-f, --force', 'Skip confirmation prompt (for --reset)')
-  .option('--export <format>', 'Export data (csv)')
-  .action(async (options) => {
-    try {
-      const { runStats } = await import('../src/commands/stats.js');
-      await runStats(options);
-    } catch (err) {
-      console.error(err.message);
-      process.exit(1);
-    }
-  });
 // guild workspace
 const workspaceCmd = program
   .command('workspace')

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "guild-agents",
-  "version": "1.5.0",
+  "version": "2.1.0",
   "description": "Specification-driven development CLI for Claude Code — think before you build",
   "type": "module",
   "files": [

package/src/commands/doctor.js CHANGED Viewed

@@ -4,10 +4,10 @@
 import * as p from '@clack/prompts';
 import chalk from 'chalk';
-import { existsSync, readdirSync } from 'fs';
+import { existsSync, readdirSync, readFileSync } from 'fs';
 import { join } from 'path';
 import { resolveProjectRoot } from '../utils/files.js';
-import { loadAllSkills } from '../utils/skill-loader.js';
+import { parseSkill } from '../utils/skill-parser.js';
 export async function runDoctor() {
   const root = resolveProjectRoot();
@@ -86,27 +86,26 @@ export async function runDoctor() {
   // Check workflow validation in skills
   if (existsSync(skillsDir)) {
-    const skills = loadAllSkills(skillsDir);
+    const skillDirs = readdirSync(skillsDir, { withFileTypes: true })
+      .filter(d => d.isDirectory())
+      .filter(d => existsSync(join(skillsDir, d.name, 'SKILL.md')));
     let workflowCount = 0;
     let workflowErrors = 0;
     const errorDetails = [];
-    for (const [name, skill] of skills) {
+    for (const dir of skillDirs) {
+      const content = readFileSync(join(skillsDir, dir.name, 'SKILL.md'), 'utf8');
+      const skill = parseSkill(content);
       if (skill.workflow) {
         workflowCount++;
-        if (skill.errors.length > 0) {
-          workflowErrors++;
-          errorDetails.push(`${name}: ${skill.errors.join('; ')}`);
-        }
-      }
-      // Check that agent references exist
-      if (skill.workflow) {
         for (const step of skill.workflow.steps) {
           if (step.role !== 'system' && step.role !== 'dynamic') {
             const agentPath = join(agentsDir, `${step.role}.md`);
             if (!existsSync(agentPath)) {
-              errorDetails.push(`${name}: step "${step.id}" references agent "${step.role}" — agent not found`);
+              errorDetails.push(`${dir.name}: step "${step.id}" references agent "${step.role}" — agent not found`);
               workflowErrors++;
             }
           }
@@ -124,27 +123,6 @@ export async function runDoctor() {
       });
       healthy = false;
     }
-    // If workflowCount === 0, don't add a check (no workflows to validate)
-    // Check for dual-format skills (workflow frontmatter + body step/phase headings)
-    // Matches "### Step 1", "## Phase 2", etc. — requires digit after Step/Phase
-    const STEP_PHASE_RE = /^#{1,3}\s+(Step|Phase)\s+\d/im;
-    const dualFormatWarnings = [];
-    for (const [name, skill] of skills) {
-      if (skill.workflow && skill.body && STEP_PHASE_RE.test(skill.body)) {
-        dualFormatWarnings.push(name);
-      }
-    }
-    if (dualFormatWarnings.length > 0) {
-      checks.push({
-        name: `Dual-format skills (${dualFormatWarnings.length} warning(s))`,
-        pass: true,
-        warn: true,
-        detail: `Skills with both workflow frontmatter and body step/phase headings: ${dualFormatWarnings.join(', ')}. Workflow steps take precedence — consider removing prose steps from body.`,
-      });
-    }
   }
   // Display results

package/src/commands/init.js CHANGED Viewed

@@ -135,7 +135,7 @@ export async function runInit() {
   const relevantSkills = projectData.hasExistingCode
     ? ['/guild-specialize', '/council', '/build-feature']
-    : ['/council', '/build-feature', '/new-feature'];
+    : ['/council', '/build-feature'];
   p.log.info(`Start with: ${relevantSkills.join('  ')}`);
   const quickStart = projectData.hasExistingCode

package/src/templates/skills/build-feature/SKILL.md CHANGED Viewed

@@ -11,14 +11,12 @@ workflow:
       requires: [feature-description]
       produces: [evaluation-report, verdict]
       model-tier: reasoning
-      on-failure: abort
     - id: design
       role: tech-lead
       intent: "Break the feature into concrete tasks with acceptance criteria. Define implementation approach: files to modify, patterns to follow, interfaces, and technical risks."
       requires: [feature-description, evaluation-report]
       produces: [task-list, acceptance-criteria, technical-plan]
       model-tier: reasoning
-      condition: step.evaluate.verdict != rejected
     - id: implement
       role: developer
       intent: "Implement the feature following the technical plan. Write unit tests. Make atomic commits."
@@ -28,13 +26,11 @@ workflow:
     - id: gate-pre-review
       role: system
       intent: "Run project tests and lint. Both must pass before review."
-      commands: [npm test, npm run lint]
       gate: true
       produces: [gate-pre-review-result]
-      on-failure: goto:implement
-    - id: checkpoint-phase4
+    - id: checkpoint
       role: system
-      intent: "Create checkpoint commit and write partial pipeline trace (phases 1-4) to spec file."
+      intent: "Create checkpoint commit and write partial pipeline trace to spec file."
       requires: [implementation, gate-pre-review-result]
       produces: [checkpoint-commit]
       gate: true
@@ -44,56 +40,29 @@ workflow:
       requires: [implementation, gate-pre-review-result]
       produces: [review-report]
       model-tier: reasoning
-      retry:
-        max: 2
-        on: has-blockers
     - id: fix-review-blockers
       role: developer
       intent: "Fix blocker findings from code review. Run tests after fixing."
       requires: [review-report]
       produces: [implementation]
       model-tier: execution
-      condition: step.review.has-blockers
-      on-failure: goto:review
     - id: qa-phase
-      role: system
-      intent: "Run QA validation with bugfix cycles."
-      delegates-to: qa-cycle
+      role: qa
+      intent: "Validate acceptance criteria, test edge cases, run bugfix cycles if needed."
       requires: [acceptance-criteria, implementation]
       produces: [qa-report]
-      retry:
-        max: 2
-        on: has-bugs
-    - id: post-qa-review
-      role: code-reviewer
-      intent: "Review changes introduced during QA bugfix cycles."
-      requires: [qa-report, implementation]
-      produces: [post-qa-review-report]
-      model-tier: reasoning
-      condition: step.qa-phase.had-significant-changes
-      retry:
-        max: 2
-        on: has-blockers
+      model-tier: execution
     - id: gate-final
       role: system
       intent: "Run project tests and lint as final verification. Both must pass."
-      commands: [npm test, npm run lint]
       gate: true
       produces: [final-gate-result]
-      on-failure: goto:qa-phase
     - id: completion
       role: system
-      intent: "Write complete pipeline trace to spec file. Update SESSION.md. Present summary to user."
+      intent: "Update SESSION.md. Present summary to user."
       requires: [final-gate-result, review-report, qa-report]
-      produces: [pipeline-trace, session-update]
+      produces: [session-update]
       gate: true
-    - id: extract-learnings
-      role: learnings-extractor
-      intent: "Extract compound learnings from this pipeline execution."
-      requires: [pipeline-trace]
-      produces: [updated-learnings]
-      model-tier: routine
-      blocking: false
 ---
 # Build Feature

package/src/templates/skills/build-feature/evals/evals.json CHANGED Viewed

@@ -43,9 +43,9 @@
     },
     {
       "id": "bf-minimum-steps",
-      "description": "Plan has at least 10 steps",
+      "description": "Plan has at least 8 steps",
       "expectations": [
-        { "text": "At least 10 steps", "assertion": "step-count:10" }
+        { "text": "At least 8 steps", "assertion": "step-count:8" }
       ]
     }
   ]

package/src/templates/skills/council/SKILL.md CHANGED Viewed

@@ -11,33 +11,24 @@ workflow:
       requires: [user-question]
       produces: [council-type, participant-roles]
       gate: true
-    - id: workspace-context
-      role: system
-      intent: "Detect workspace membership. If in a workspace, collect context from sibling repos (CLAUDE.md, PROJECT.md, SESSION.md) and build workspace context block."
-      requires: [council-type]
-      produces: [workspace-context]
-      condition: in-workspace
     - id: agent-1
       role: dynamic
-      intent: "Analyze the question from specialized perspective. State position with concrete arguments."
-      requires: [user-question, council-type, workspace-context]
+      intent: "Analyze the question from specialized perspective. State position with concrete arguments. Spawn via Agent tool IN PARALLEL with agent-2 and agent-3."
+      requires: [user-question, council-type]
       produces: [perspective-1]
       model-tier: reasoning
-      parallel: [agent-2, agent-3]
     - id: agent-2
       role: dynamic
       intent: "Analyze the question from specialized perspective. State position with concrete arguments."
-      requires: [user-question, council-type, workspace-context]
+      requires: [user-question, council-type]
       produces: [perspective-2]
       model-tier: reasoning
-      parallel: [agent-1, agent-3]
     - id: agent-3
       role: dynamic
       intent: "Analyze the question from specialized perspective. State position with concrete arguments."
-      requires: [user-question, council-type, workspace-context]
+      requires: [user-question, council-type]
       produces: [perspective-3]
       model-tier: reasoning
-      parallel: [agent-1, agent-2]
     - id: synthesize
       role: system
       intent: "Synthesize debate: points of agreement, disagreement, risks. Present options to user."
@@ -49,7 +40,6 @@ workflow:
       intent: "After user decides, write spec document to docs/specs/."
       requires: [synthesis, user-decision]
       produces: [spec-document]
-      condition: user-wants-spec
       gate: true
 ---

package/src/templates/skills/council/evals/evals.json CHANGED Viewed

@@ -2,15 +2,12 @@
   "skill": "council",
   "evals": [
     {
-      "id": "council-three-parallel-agents",
-      "description": "Council has 3 agent steps in parallel",
+      "id": "council-three-agents",
+      "description": "Council has 3 agent steps",
       "expectations": [
         { "text": "Agent-1 exists", "assertion": "step-exists:agent-1" },
         { "text": "Agent-2 exists", "assertion": "step-exists:agent-2" },
-        { "text": "Agent-3 exists", "assertion": "step-exists:agent-3" },
-        { "text": "Agent-1 is parallel", "assertion": "step-parallel:agent-1" },
-        { "text": "Agent-2 is parallel", "assertion": "step-parallel:agent-2" },
-        { "text": "Agent-3 is parallel", "assertion": "step-parallel:agent-3" }
+        { "text": "Agent-3 exists", "assertion": "step-exists:agent-3" }
       ]
     },
     {
@@ -29,13 +26,6 @@
         { "text": "Synthesize step exists", "assertion": "step-exists:synthesize" },
         { "text": "Synthesize has gate", "assertion": "gate-exists:synthesize" }
       ]
-    },
-    {
-      "id": "council-workspace-context",
-      "description": "Workspace context step exists with condition",
-      "expectations": [
-        { "text": "Workspace-context step exists", "assertion": "step-exists:workspace-context" }
-      ]
     }
   ]
 }

package/src/templates/skills/create-pr/SKILL.md CHANGED Viewed

@@ -8,12 +8,10 @@ workflow:
     - id: verify-branch
       role: system
       intent: "Verify not on main/develop, check for uncommitted changes, get commits ahead of main."
-      commands: [git branch --show-current, git status, git log main..HEAD --oneline]
       produces: [branch-name, branch-state, commit-list]
     - id: gather-context
       role: system
       intent: "Collect diff stats, run tests and lint for PR description context."
-      commands: [git diff main..HEAD --stat, npm test, npm run lint]
       requires: [branch-state]
       produces: [diff-summary, test-result, lint-result]
     - id: generate-description
@@ -25,7 +23,6 @@ workflow:
     - id: create-pr
       role: system
       intent: "Push branch to origin and create PR via gh CLI."
-      commands: [git push -u origin, gh pr create]
       requires: [pr-description, pr-title, branch-name]
       produces: [pr-url]
     - id: post-creation
@@ -102,7 +99,7 @@ Build a structured PR description:
 1. Display the PR URL
 2. Suggest next steps:
    - "Request review from a teammate"
-   - "Run `/review` for an AI code review"
+   - "Run `/code-review` for an AI code review"
    - "Merge when ready with `gh pr merge [number]`"
 ## Example Session
@@ -121,7 +118,7 @@ https://github.com/org/repo/pull/42
 Next steps:
 - Request review from a teammate
-- Run /review for AI code review
+- Run /code-review for AI code review
 - Merge when ready
 ```