npm - claude-tempo - Versions diffs - 0.12.0 → 0.13.1 - Mend

claude-tempo 0.12.0 → 0.13.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29)

package/CLAUDE.md +4 -0
package/README.md +21 -0
package/dist/cli/commands.js +1 -1
package/dist/ensemble/loader.js +9 -0
package/dist/ensemble/saver.js +3 -0
package/dist/ensemble/schema.d.ts +1 -0
package/dist/server.js +13 -3
package/dist/tools/evaluate-gate.d.ts +3 -0
package/dist/tools/evaluate-gate.js +40 -0
package/dist/tools/gates.d.ts +3 -0
package/dist/tools/gates.js +51 -0
package/dist/tools/quality-gate.d.ts +3 -0
package/dist/tools/quality-gate.js +34 -0
package/dist/types.d.ts +18 -0
package/dist/utils/validation.d.ts +8 -0
package/dist/utils/validation.js +9 -1
package/dist/workflows/session.js +47 -1
package/dist/workflows/signals.d.ts +17 -2
package/dist/workflows/signals.js +5 -1
package/examples/agents/tempo-composer.md +10 -0
package/examples/agents/tempo-conductor.md +10 -0
package/examples/agents/tempo-critic.md +28 -1
package/examples/agents/tempo-improv.md +10 -0
package/examples/agents/tempo-liner.md +10 -0
package/examples/agents/tempo-roadie.md +10 -0
package/examples/agents/tempo-soloist.md +10 -0
package/examples/agents/tempo-tuner.md +28 -0
package/package.json +1 -1
package/workflow-bundle.js +53 -3

package/CLAUDE.md CHANGED

@@ -62,6 +62,9 @@ src/
 │   ├── schedule.ts    # Create one-shot or recurring schedules
 │   ├── unschedule.ts  # Cancel a named schedule
 │   ├── schedules.ts   # List active schedules
+│   ├── quality-gate.ts # Define quality gates for tasks (conductor only)
+│   ├── evaluate-gate.ts # Mark gate criteria as passed/failed (conductor only)
+│   ├── gates.ts       # List quality gates and their status (conductor only)
 │   └── helpers.ts     # Zod/MCP tool registration wrapper
 ├── utils/
 │   └── validation.ts  # Shared validation constants (name/message/path limits, encore defaults) and helpers
@@ -116,6 +119,7 @@ npm test
 - **Agent type discovery**: The `agent_types` MCP tool and `claude-tempo agent-types` CLI command let conductors discover available player types. Shipped examples (tempo-conductor, tempo-composer, tempo-soloist, tempo-tuner, tempo-critic, tempo-roadie, tempo-improv, tempo-liner) work out of the box. Ensemble lineups: tempo-big-band (full lifecycle), tempo-dev-team (feature work), tempo-review-squad (parallel review), tempo-jam-session (exploration).
 - **Schedule**: A one-shot or recurring message delivery configured via the `schedule` tool. Backed by a durable `claudeSchedulerWorkflow` — survives restarts. Supports delay (`delay`), fixed time (`at`), recurring interval (`every`), and cron expressions (`cron`) with optional IANA timezone (`timezone`). Cron schedules use `croner` for expression parsing and next-fire computation. Managed via `schedule`, `unschedule`, and `schedules` tools.
 - **Lineup**: A YAML file defining an ensemble configuration — which players to recruit, their types, working directories, and optional startup messages. Load via `load_lineup` to bootstrap a full ensemble in one step; save via `save_lineup` to snapshot a running ensemble's state for later reuse.
+- **Quality Gate**: A named checklist of criteria a conductor tracks to verify a task is complete. Created via `quality_gate` (conductor only), evaluated via `evaluate_gate`, and listed via `gates`. Each criterion has a `pending` → `passed` | `failed` status; the gate's aggregate status is derived automatically (all passed → `passed`, any failed → `failed`, else `open`). Gates are stored in the conductor workflow and survive `continueAsNew`.
 - **Wire protocol**: All Temporal signal, query, update, and workflow names are documented in [`docs/WIRE-PROTOCOL.md`](docs/WIRE-PROTOCOL.md). These names are stable as of v0.10 — renaming or removing any is a breaking change requiring a major version bump.
 ## Dashboard
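The status rule in the Quality Gate entry above can be sketched in a few lines of plain JavaScript. This is an illustration only, not an API the package exports: the function signature (a bare criteria array) and the sample data are assumptions made for the example.

```javascript
// Sketch of the documented rule: any failed -> failed, all passed -> passed, else open.
// Taking a bare criteria array is an assumption made for this example.
function deriveGateStatus(criteria) {
  if (criteria.length === 0) return 'open'; // nothing to evaluate yet
  if (criteria.some((c) => c.status === 'failed')) return 'failed';
  if (criteria.every((c) => c.status === 'passed')) return 'passed';
  return 'open'; // at least one criterion is still pending
}

// Hypothetical gate contents.
const prReady = [
  { text: 'Tests pass', status: 'passed' },
  { text: 'No lint errors', status: 'passed' },
  { text: 'Code reviewed', status: 'pending' },
];

console.log(deriveGateStatus(prReady)); // open
```

A single failed criterion outweighs any number of passed ones, so a gate can read `failed` while other criteria are still pending.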

package/README.md CHANGED

@@ -133,6 +133,9 @@ These tools are available inside Claude Code sessions connected to claude-tempo:
 | `broadcast` | Send a message to all active players. Optional `type` filter limits to a specific player type. |
 | `encore` | Revive a stale player session — restarts the process and reconnects to the existing workflow with context restored. |
 | `recall` | Read your own message history. Shows received messages by default; pass `includeSent: true` for the full timeline. |
+| `quality_gate` | Define or replace a quality gate for a task — a named checklist of criteria that must pass. Conductor only. |
+| `evaluate_gate` | Mark one or more criteria on a quality gate as passed or failed. Conductor only. |
+| `gates` | List quality gates and their status. Filter by task name or status (`open`, `passed`, `failed`). Conductor only. |
 ## Scheduling
@@ -174,6 +177,24 @@ The `timezone` parameter accepts any IANA timezone (e.g. `"America/New_York"`, `
 - `claude-tempo status` shows active schedules alongside sessions
 - A single durable scheduler workflow per ensemble manages all schedules using Temporal timers
+## Quality Gates
+Conductors can define named checklists of criteria to verify task completion. Three conductor-only tools are available: `quality_gate` (create or replace a gate), `evaluate_gate` (mark criteria as passed or failed), and `gates` (list all gates with optional filters).
+### Examples
+Tell your conductor things like:
+- *"Set a quality gate 'pr-ready' with criteria: tests pass, no lint errors, code reviewed"*
+- *"Mark criteria 0 and 1 on 'pr-ready' as passed"*
+- *"Show me all open quality gates"*
+- *"Check whether 'deploy-staging' has passed"*
+### How it works
+- Gate status is derived from criteria: all passed → `passed`; any failed → `failed`; otherwise `open`
+- Gates survive `continueAsNew` for the conductor workflow's lifetime
 ## Ensemble Lineups
 Define reusable ensemble configurations as YAML files. A lineup specifies which players to recruit, what instructions to give them, what schedules to create, and optionally which custom agent files to use.
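For orientation, the README prompts in the Quality Gates section might translate into tool arguments shaped roughly like the objects below. The field names follow the tool schemas elsewhere in this diff; the concrete values, and how a conductor actually maps a prompt to a call, are assumptions.

```javascript
// Hypothetical arguments for `quality_gate`: one task name plus an ordered criteria list.
const setGateArgs = {
  task: 'pr-ready',
  criteria: ['Tests pass', 'No lint errors', 'Code reviewed'],
};

// Hypothetical arguments for `evaluate_gate`: criteria are addressed by
// zero-based index, in the order they were defined.
const evaluateArgs = {
  task: 'pr-ready',
  evaluations: [
    { index: 0, status: 'passed' },
    { index: 1, status: 'passed', notes: 'lint run was clean' },
  ],
};

// Hypothetical arguments for `gates`: both filters are optional.
const listOpenArgs = { status: 'open' };
const checkOneArgs = { task: 'deploy-staging' };

console.log(evaluateArgs.evaluations.map((e) => e.index).join(',')); // 0,1
```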

package/dist/cli/commands.js CHANGED

@@ -661,7 +661,7 @@ async function up(opts) {
     }
     else {
         // Default conductor name so the Claude Code session name matches the ensemble role
-        const sessionName = opts.name || 'conductor';
+        const sessionName = opts.name || lineup?.conductor?.name || 'conductor';
         // Resolve conductor agent type from lineup
         const conductorType = lineup?.conductor?.agent && lineup.conductor.agent !== 'default' && lineup.conductor.agent !== 'copilot'
             ? lineup.conductor.agent // custom agent path
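The changed line introduces a three-step fallback for the conductor's session name. A standalone sketch of that precedence follows; `resolveSessionName` is a hypothetical helper name, and the inputs are made up.

```javascript
// Precedence: explicit CLI option, then the lineup's conductor.name, then the default.
// `resolveSessionName` is a hypothetical name used only for this sketch.
function resolveSessionName(opts, lineup) {
  return opts.name || lineup?.conductor?.name || 'conductor';
}

console.log(resolveSessionName({ name: 'maestro-1' }, { conductor: { name: 'lead' } })); // maestro-1
console.log(resolveSessionName({}, { conductor: { name: 'lead' } })); // lead
console.log(resolveSessionName({}, undefined)); // conductor
```

Because the chain uses `||`, an empty-string name at either level falls through to the next candidate.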

package/dist/ensemble/loader.js CHANGED

@@ -50,6 +50,15 @@ function loadLineup(filePath) {
             }
         }
     }
+    // Validate conductor name if present
+    if (doc.conductor?.name != null) {
+        if (typeof doc.conductor.name !== 'string' || !doc.conductor.name) {
+            throw new Error(`Invalid lineup: conductor.name must be a non-empty string`);
+        }
+        if (!/^[a-zA-Z0-9_-]+$/.test(doc.conductor.name)) {
+            throw new Error(`Invalid lineup: conductor.name "${doc.conductor.name}" contains invalid characters`);
+        }
+    }
     return {
         name: doc.name,
         description: doc.description,
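To see which lineup values the new check accepts, the validation can be exercised in isolation. The sketch below copies the two conditions from the hunk into a hypothetical `validateConductorName` helper; the sample documents are invented.

```javascript
// Mirrors the added checks: conductor.name is optional, but when present it must be
// a non-empty string of letters, digits, underscores, or hyphens.
function validateConductorName(doc) {
  const name = doc.conductor?.name;
  if (name == null) return; // field omitted: nothing to validate
  if (typeof name !== 'string' || !name) {
    throw new Error('Invalid lineup: conductor.name must be a non-empty string');
  }
  if (!/^[a-zA-Z0-9_-]+$/.test(name)) {
    throw new Error(`Invalid lineup: conductor.name "${name}" contains invalid characters`);
  }
}

// Small harness for the example: returns 'ok' or the error message.
function check(doc) {
  try {
    validateConductorName(doc);
    return 'ok';
  } catch (err) {
    return err.message;
  }
}

console.log(check({ conductor: { name: 'release-lead_2' } })); // ok
console.log(check({ conductor: { name: 'release lead' } }));   // invalid characters (space)
console.log(check({ conductor: { name: '' } }));               // must be a non-empty string
console.log(check({}));                                        // ok
```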

package/dist/ensemble/saver.js CHANGED

@@ -36,7 +36,10 @@ async function saveLineup(client, ensemble, filePath, name) {
             const agentType = meta.agentType || 'claude';
             const workDir = meta.workDir || undefined;
             if (isConductor) {
+                const conductorName = meta.playerId || undefined;
                 conductor = {
+                    // Only save name if it's not the default 'conductor'
+                    ...(conductorName && conductorName !== 'conductor' ? { name: conductorName } : {}),
                     agent: agentType === 'copilot' ? 'copilot' : undefined,
                 };
             }
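The conditional spread keeps the default name out of saved lineups. A minimal sketch of that behavior, with the surrounding loop stripped away and a hypothetical `conductorEntry` wrapper around the same expressions:

```javascript
// Builds the conductor entry the way the hunk does; `conductorEntry` and the
// sample metadata objects are assumptions for this example.
function conductorEntry(meta) {
  const agentType = meta.agentType || 'claude';
  const conductorName = meta.playerId || undefined;
  return {
    // Spread an empty object when the name is missing or is the default 'conductor'.
    ...(conductorName && conductorName !== 'conductor' ? { name: conductorName } : {}),
    agent: agentType === 'copilot' ? 'copilot' : undefined,
  };
}

console.log('name' in conductorEntry({ playerId: 'conductor' })); // false
console.log(conductorEntry({ playerId: 'lead' }).name);           // lead
```

Spreading `{}` leaves no `name` key at all, which is different from writing `name: undefined`.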

package/dist/ensemble/schema.d.ts CHANGED

@@ -2,6 +2,7 @@ export interface EnsembleLineup {
     name: string;
     description?: string;
     conductor?: {
+        name?: string;
         type?: string;
         agent?: string;
         instructions?: string;

package/dist/server.js CHANGED

@@ -64,6 +64,9 @@ const who_am_i_1 = require("./tools/who-am-i");
 const broadcast_1 = require("./tools/broadcast");
 const recall_1 = require("./tools/recall");
 const encore_1 = require("./tools/encore");
+const quality_gate_1 = require("./tools/quality-gate");
+const evaluate_gate_1 = require("./tools/evaluate-gate");
+const gates_1 = require("./tools/gates");
 const channel_1 = require("./channel");
 const agent_types_2 = require("./ensemble/agent-types");
 const log = (...args) => console.error('[claude-tempo]', ...args);
@@ -80,10 +83,11 @@ async function main() {
     const config = (0, config_1.getConfig)();
     const isConductor = process.env[config_1.ENV.CONDUCTOR] === 'true';
     const requestedName = process.env[config_1.ENV.PLAYER_NAME] || '';
-    // Prevent non-conductor sessions from using "conductor" as a name,
+    // Conductors use their requested name or fall back to 'conductor'.
+    // Non-conductors are prevented from using "conductor" as a name,
     // which would collide with the conductor's deterministic workflow ID.
     let playerId = isConductor
-        ? 'conductor'
+        ? (requestedName || 'conductor')
         : (requestedName && requestedName !== 'conductor' ? requestedName : '') || crypto.randomBytes(4).toString('hex');
     const getPlayerId = () => playerId;
     const setPlayerId = (id) => { playerId = id; };
@@ -236,7 +240,7 @@ async function main() {
         }
     }
     // Create MCP server
-    const hasRequestedName = Boolean(requestedName && requestedName !== 'conductor');
+    const hasRequestedName = isConductor || Boolean(requestedName && requestedName !== 'conductor');
     const playerTypeLine = playerType
         ? `Your player type is "${playerType}"${playerTypeDescription ? ` (${playerTypeDescription})` : ''}. `
         : '';
@@ -279,6 +283,12 @@ async function main() {
     (0, broadcast_1.registerBroadcastTool)(mcpServer, client, config, getPlayerId, handle);
     (0, recall_1.registerRecallTool)(mcpServer, handle, getPlayerId);
     (0, encore_1.registerEncoreTool)(mcpServer, client, config, getPlayerId, handle);
+    // Conductor-only tools
+    if (isConductor) {
+        (0, quality_gate_1.registerQualityGateTool)(mcpServer, handle, getPlayerId);
+        (0, evaluate_gate_1.registerEvaluateGateTool)(mcpServer, handle, getPlayerId);
+        (0, gates_1.registerGatesTool)(mcpServer, handle);
+    }
     const MAESTRO_ACK = '\n\n[IMPORTANT: This message is from a human (Maestro). Immediately cue the sender back with a brief acknowledgment and your planned next step before doing the work.]';
     // Start message poller — push messages into Claude Code via channel notifications.
     // Skip when running under the Copilot bridge: the bridge has its own poller that
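The revised `playerId` expression is dense, so here is the same decision written out as a standalone function. `resolvePlayerId` is a hypothetical name; in the package the logic sits inline in `main()` and reads its inputs from environment variables.

```javascript
const crypto = require('node:crypto');

// Conductors keep their requested name, falling back to 'conductor'.
// Other players may not use the reserved name 'conductor'; without a usable
// name they get 4 random bytes rendered as 8 hex characters.
function resolvePlayerId(isConductor, requestedName) {
  if (isConductor) return requestedName || 'conductor';
  const allowed = requestedName && requestedName !== 'conductor' ? requestedName : '';
  return allowed || crypto.randomBytes(4).toString('hex');
}

console.log(resolvePlayerId(true, 'lead'));   // lead
console.log(resolvePlayerId(true, ''));       // conductor
console.log(resolvePlayerId(false, 'alice')); // alice
console.log(resolvePlayerId(false, 'conductor').length); // 8
```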

package/dist/tools/evaluate-gate.d.ts ADDED

@@ -0,0 +1,3 @@
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
+import { WorkflowHandle } from '@temporalio/client';
+export declare function registerEvaluateGateTool(server: McpServer, handle: WorkflowHandle, getPlayerId: () => string): void;

package/dist/tools/evaluate-gate.js ADDED

@@ -0,0 +1,40 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerEvaluateGateTool = registerEvaluateGateTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerEvaluateGateTool(server, handle, getPlayerId) {
+    (0, helpers_1.defineTool)(server, 'evaluate_gate', 'Mark one or more criteria on a quality gate as passed or failed. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).describe('The task name of the gate to evaluate'),
+        evaluations: zod_1.z.array(zod_1.z.object({
+            index: zod_1.z.number().int().min(0).describe('Zero-based index of the criterion'),
+            status: zod_1.z.enum(['passed', 'failed']).describe('Whether this criterion passed or failed'),
+            notes: zod_1.z.string().max(validation_1.GATE_NOTES_MAX).optional().describe('Optional notes explaining the evaluation'),
+        })).min(1).describe('List of criterion evaluations'),
+    }, async (args) => {
+        const { task, evaluations } = args;
+        try {
+            await handle.signal('evaluateGateCriteria', {
+                task,
+                evaluations,
+                evaluatedBy: getPlayerId(),
+            });
+            const summary = evaluations
+                .map((ev) => `  ${ev.index}: ${ev.status === 'passed' ? '\u2705' : '\u274c'} ${ev.status}${ev.notes ? ` — ${ev.notes}` : ''}`)
+                .join('\n');
+            return {
+                content: [{
+                        type: 'text',
+                        text: `Evaluated ${evaluations.length} criteria on gate **${task}**:\n${summary}`,
+                    }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to evaluate gate: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}

package/dist/tools/gates.d.ts ADDED

@@ -0,0 +1,3 @@
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
+import { WorkflowHandle } from '@temporalio/client';
+export declare function registerGatesTool(server: McpServer, handle: WorkflowHandle): void;

package/dist/tools/gates.js ADDED

@@ -0,0 +1,51 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerGatesTool = registerGatesTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerGatesTool(server, handle) {
+    (0, helpers_1.defineTool)(server, 'gates', 'List quality gates and their status. Optionally filter by task name or status. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).optional().describe('Filter by specific task name'),
+        status: zod_1.z.enum(['open', 'passed', 'failed']).optional().describe('Filter by gate status'),
+    }, async (args) => {
+        const { task, status } = args;
+        try {
+            const gates = await handle.query('qualityGates');
+            let filtered = gates;
+            if (task) {
+                filtered = filtered.filter((g) => g.task === task);
+            }
+            if (status) {
+                filtered = filtered.filter((g) => g.status === status);
+            }
+            if (filtered.length === 0) {
+                return {
+                    content: [{ type: 'text', text: 'No quality gates found matching the filter.' }],
+                };
+            }
+            const lines = filtered.map((g) => {
+                const icon = g.status === 'passed' ? '\u2705' : g.status === 'failed' ? '\u274c' : '\u23f3';
+                const criteriaLines = g.criteria.map((c, i) => {
+                    const cIcon = c.status === 'passed' ? '\u2705' : c.status === 'failed' ? '\u274c' : '\u2b1c';
+                    const evaluator = c.evaluatedBy ? ` (by ${c.evaluatedBy})` : '';
+                    const notes = c.notes ? ` — ${c.notes}` : '';
+                    return `    ${i}. ${cIcon} ${c.text}${evaluator}${notes}`;
+                });
+                return `${icon} **${g.task}** [${g.status}] (by ${g.createdBy}, ${g.createdAt})\n${criteriaLines.join('\n')}`;
+            });
+            return {
+                content: [{
+                        type: 'text',
+                        text: `${filtered.length} quality gate${filtered.length === 1 ? '' : 's'}:\n\n${lines.join('\n\n')}`,
+                    }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to query gates: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}
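The filtering in `gates` happens on the client side, after the full list comes back from the query. A sketch of just that step, using a hypothetical `filterGates` helper and invented sample gates:

```javascript
// Applies the optional task and status filters in sequence, as the tool handler does.
// `filterGates` and the sample data are assumptions for this example.
function filterGates(gates, { task, status } = {}) {
  let filtered = gates;
  if (task) filtered = filtered.filter((g) => g.task === task);
  if (status) filtered = filtered.filter((g) => g.status === status);
  return filtered;
}

const sample = [
  { task: 'pr-ready', status: 'open' },
  { task: 'deploy-staging', status: 'passed' },
  { task: 'docs', status: 'open' },
];

console.log(filterGates(sample, { status: 'open' }).map((g) => g.task).join(', ')); // pr-ready, docs
console.log(filterGates(sample, { task: 'pr-ready', status: 'passed' }).length);    // 0
```

The two filters combine with AND semantics, and task matching is exact string equality rather than a prefix or pattern match.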

package/dist/tools/quality-gate.d.ts ADDED

@@ -0,0 +1,3 @@
+import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
+import { WorkflowHandle } from '@temporalio/client';
+export declare function registerQualityGateTool(server: McpServer, handle: WorkflowHandle, getPlayerId: () => string): void;

package/dist/tools/quality-gate.js ADDED

@@ -0,0 +1,34 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerQualityGateTool = registerQualityGateTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerQualityGateTool(server, handle, getPlayerId) {
+    (0, helpers_1.defineTool)(server, 'quality_gate', 'Define or replace a quality gate for a task. Each gate has a list of criteria that must pass before the task is considered complete. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).describe('Unique task name for this gate (e.g. "pr-review", "deploy-staging")'),
+        criteria: zod_1.z.array(zod_1.z.string().max(validation_1.GATE_CRITERION_TEXT_MAX)).min(1).max(validation_1.GATE_CRITERIA_MAX).describe('List of criteria that must be evaluated (e.g. ["Tests pass", "No lint errors", "Code reviewed"])'),
+    }, async (args) => {
+        const { task, criteria } = args;
+        try {
+            await handle.signal('setQualityGate', {
+                task,
+                criteria,
+                createdBy: getPlayerId(),
+            });
+            const lines = criteria.map((c, i) => `  ${i}. [ ] ${c}`);
+            return {
+                content: [{
+                        type: 'text',
+                        text: `Quality gate **${task}** set with ${criteria.length} criteria:\n${lines.join('\n')}`,
+                    }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to set quality gate: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}

package/dist/types.d.ts CHANGED

@@ -42,6 +42,8 @@ export interface SessionInput {
     autoSummary?: string;
     /** Disable stale session detection (for passive mailbox workflows like maestro) */
     disableStaleDetection?: boolean;
+    /** Restored from continue-as-new (conductor only) */
+    qualityGates?: QualityGate[];
     /** Temporal config passed through for outbox activities (non-secret fields only). */
     temporalConfig?: {
         temporalAddress: string;
@@ -134,6 +136,22 @@ export type OutboxEntry = CueOutboxEntry | RecruitOutboxEntry | ReportOutboxEntr
 type DistributiveOmit<T, K extends keyof any> = T extends any ? Omit<T, K> : never;
 /** Input type for submitting outbox entries — auto-fields (id, createdAt, status, error, deliveredAt) are added by the workflow. */
 export type OutboxEntryInput = DistributiveOmit<OutboxEntry, 'id' | 'createdAt' | 'status' | 'error' | 'deliveredAt'>;
+export interface QualityGateCriterion {
+    text: string;
+    status: 'pending' | 'passed' | 'failed';
+    evaluatedBy?: string;
+    evaluatedAt?: string;
+    notes?: string;
+}
+export interface QualityGate {
+    /** Unique key identifying the task this gate covers. */
+    task: string;
+    criteria: QualityGateCriterion[];
+    createdBy: string;
+    createdAt: string;
+    /** Derived: all passed → passed, any failed → failed, else open. */
+    status: 'open' | 'passed' | 'failed';
+}
 export interface ScheduleEntry {
     /** Unique name for this schedule (used as key for add/replace/remove). */
     name: string;

package/dist/utils/validation.d.ts CHANGED

@@ -22,6 +22,14 @@ export declare const SCHEDULE_NAME_MAX = 64;
 export declare const SCHEDULE_MESSAGE_MAX = 10240;
 /** Maximum cron expression length. */
 export declare const CRON_EXPRESSION_MAX = 128;
+/** Maximum quality gate task name length. */
+export declare const GATE_TASK_MAX = 64;
+/** Maximum number of criteria per quality gate. */
+export declare const GATE_CRITERIA_MAX = 20;
+/** Maximum length for individual criterion text. */
+export declare const GATE_CRITERION_TEXT_MAX = 512;
+/** Maximum length for gate criterion notes. */
+export declare const GATE_NOTES_MAX = 1024;
 /** Default number of recent messages to include as context in an encore. */
 export declare const ENCORE_DEFAULT_CONTEXT_MESSAGES = 10;
 /** Maximum length for message preview truncation. */
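The new limits are enforced through Zod schemas in the tool definitions. To show what they mean in practice without pulling in Zod, the sketch below re-declares three of the constants and applies them by hand. `checkGateInput` is a hypothetical helper and its messages are invented; it is not how the package reports validation errors.

```javascript
// Values copied from the declarations in this diff.
const GATE_TASK_MAX = 64;
const GATE_CRITERIA_MAX = 20;
const GATE_CRITERION_TEXT_MAX = 512;

// Returns a list of problems; an empty list means the input is within limits.
function checkGateInput(task, criteria) {
  const problems = [];
  if (typeof task !== 'string' || task.length > GATE_TASK_MAX) {
    problems.push(`task must be a string of at most ${GATE_TASK_MAX} characters`);
  }
  if (!Array.isArray(criteria) || criteria.length < 1 || criteria.length > GATE_CRITERIA_MAX) {
    problems.push(`criteria must contain 1 to ${GATE_CRITERIA_MAX} entries`);
  } else {
    criteria.forEach((text, i) => {
      if (typeof text !== 'string' || text.length > GATE_CRITERION_TEXT_MAX) {
        problems.push(`criterion ${i} must be a string of at most ${GATE_CRITERION_TEXT_MAX} characters`);
      }
    });
  }
  return problems;
}

console.log(checkGateInput('pr-ready', ['Tests pass']).length); // 0
console.log(checkGateInput('x'.repeat(65), []).length);         // 2
```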

package/dist/utils/validation.js CHANGED

@@ -4,7 +4,7 @@
  * Used by MCP tool Zod schemas and config validation.
  */
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.PREVIEW_MAX_LENGTH = exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = exports.CRON_EXPRESSION_MAX = exports.SCHEDULE_MESSAGE_MAX = exports.SCHEDULE_NAME_MAX = exports.PATH_MAX = exports.PART_MAX = exports.MESSAGE_MAX = exports.ENSEMBLE_NAME_REGEX = exports.PLAYER_NAME_MAX = exports.PLAYER_NAME_REGEX = void 0;
+exports.PREVIEW_MAX_LENGTH = exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = exports.GATE_NOTES_MAX = exports.GATE_CRITERION_TEXT_MAX = exports.GATE_CRITERIA_MAX = exports.GATE_TASK_MAX = exports.CRON_EXPRESSION_MAX = exports.SCHEDULE_MESSAGE_MAX = exports.SCHEDULE_NAME_MAX = exports.PATH_MAX = exports.PART_MAX = exports.MESSAGE_MAX = exports.ENSEMBLE_NAME_REGEX = exports.PLAYER_NAME_MAX = exports.PLAYER_NAME_REGEX = void 0;
 exports.shouldIncludeInBroadcast = shouldIncludeInBroadcast;
 exports.validatePlayerName = validatePlayerName;
 exports.validateEnsembleName = validateEnsembleName;
@@ -28,6 +28,14 @@ exports.SCHEDULE_NAME_MAX = 64;
 exports.SCHEDULE_MESSAGE_MAX = 10240;
 /** Maximum cron expression length. */
 exports.CRON_EXPRESSION_MAX = 128;
+/** Maximum quality gate task name length. */
+exports.GATE_TASK_MAX = 64;
+/** Maximum number of criteria per quality gate. */
+exports.GATE_CRITERIA_MAX = 20;
+/** Maximum length for individual criterion text. */
+exports.GATE_CRITERION_TEXT_MAX = 512;
+/** Maximum length for gate criterion notes. */
+exports.GATE_NOTES_MAX = 1024;
 /** Default number of recent messages to include as context in an encore. */
 exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = 10;
 /** Maximum length for message preview truncation. */

package/dist/workflows/session.js CHANGED

@@ -24,6 +24,7 @@ async function claudeSessionWorkflow(input) {
     // non-determinism errors during rolling deploys.
     (0, workflow_1.patched)('v0.10-initial');
     (0, workflow_1.patched)('v0.11-check-and-set-status');
+    (0, workflow_1.patched)('v0.13-quality-gates');
     // Ensure search attributes are always current — critical when reconnecting
     // via WorkflowIdConflictPolicy.USE_EXISTING, which skips the attributes
     // passed to client.workflow.start().
@@ -165,6 +166,7 @@ async function claudeSessionWorkflow(input) {
     // ── Conductor State ──
     const commandHistory = input.commandHistory ?? [];
     const reportHistory = input.reportHistory ?? [];
+    const qualityGates = input.qualityGates ?? [];
     // ── Conductor-specific Handlers ──
     if (input.metadata.isConductor) {
         (0, workflow_1.setHandler)(signals_1.commandSignal, (cmd) => {
@@ -210,6 +212,50 @@ async function claudeSessionWorkflow(input) {
             ];
             return entries.sort((a, b) => a.timestamp.localeCompare(b.timestamp));
         });
+        // ── Quality Gate Handlers ──
+        /** Derive aggregate gate status from individual criteria. */
+        function deriveGateStatus(gate) {
+            if (gate.criteria.length === 0)
+                return 'open';
+            if (gate.criteria.some((c) => c.status === 'failed'))
+                return 'failed';
+            if (gate.criteria.every((c) => c.status === 'passed'))
+                return 'passed';
+            return 'open';
+        }
+        (0, workflow_1.setHandler)(signals_1.setQualityGateSignal, ({ task, criteria, createdBy }) => {
+            const existing = qualityGates.findIndex((g) => g.task === task);
+            const gate = {
+                task,
+                criteria: criteria.map((text) => ({ text, status: 'pending' })),
+                createdBy,
+                createdAt: new Date().toISOString(),
+                status: 'open',
+            };
+            if (existing >= 0) {
+                qualityGates[existing] = gate;
+            }
+            else {
+                qualityGates.push(gate);
+            }
+        });
+        (0, workflow_1.setHandler)(signals_1.evaluateGateCriteriaSignal, ({ task, evaluations, evaluatedBy }) => {
+            const gate = qualityGates.find((g) => g.task === task);
+            if (!gate)
+                return;
+            const now = new Date().toISOString();
+            for (const ev of evaluations) {
+                if (ev.index >= 0 && ev.index < gate.criteria.length) {
+                    gate.criteria[ev.index].status = ev.status;
+                    gate.criteria[ev.index].evaluatedBy = evaluatedBy;
+                    gate.criteria[ev.index].evaluatedAt = now;
+                    if (ev.notes)
+                        gate.criteria[ev.index].notes = ev.notes;
+                }
+            }
+            gate.status = deriveGateStatus(gate);
+        });
+        (0, workflow_1.setHandler)(signals_1.qualityGatesQuery, () => qualityGates);
     }
     // ── Main Loop ──
     const hasPendingOutbox = () => outbox.some((e) => e.status === 'pending');
@@ -368,7 +414,7 @@ async function claudeSessionWorkflow(input) {
                 messages: messages.filter((m) => !m.delivered),
                 sentMessages: sentMessages.slice(-50),
                 outbox: outbox.filter((e) => e.status === 'pending' || e.status === 'processing'),
-                ...(input.metadata.isConductor ? { commandHistory, reportHistory } : {}),
+                ...(input.metadata.isConductor ? { commandHistory, reportHistory, qualityGates } : {}),
             });
         }
     }
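Three behaviors of the new handlers are easy to miss when reading the hunk: redefining a gate discards earlier evaluations, evaluations for an unknown task are dropped silently, and out-of-range indexes are skipped. The sketch below replays the handler logic against a plain in-memory array so these cases can be observed without a Temporal server. The free-standing functions are stand-ins for the signal handlers, not part of the package.

```javascript
// In-memory stand-in for the conductor workflow's gate state.
const qualityGates = [];

function deriveGateStatus(gate) {
  if (gate.criteria.length === 0) return 'open';
  if (gate.criteria.some((c) => c.status === 'failed')) return 'failed';
  if (gate.criteria.every((c) => c.status === 'passed')) return 'passed';
  return 'open';
}

// Stand-in for the setQualityGate handler: create, or replace by task name.
function setQualityGate({ task, criteria, createdBy }) {
  const gate = {
    task,
    criteria: criteria.map((text) => ({ text, status: 'pending' })),
    createdBy,
    createdAt: new Date().toISOString(),
    status: 'open',
  };
  const existing = qualityGates.findIndex((g) => g.task === task);
  if (existing >= 0) qualityGates[existing] = gate; // earlier evaluations are lost
  else qualityGates.push(gate);
}

// Stand-in for the evaluateGateCriteria handler.
function evaluateGateCriteria({ task, evaluations, evaluatedBy }) {
  const gate = qualityGates.find((g) => g.task === task);
  if (!gate) return; // unknown task: ignored without an error
  const now = new Date().toISOString();
  for (const ev of evaluations) {
    if (ev.index < 0 || ev.index >= gate.criteria.length) continue; // skipped
    const criterion = gate.criteria[ev.index];
    criterion.status = ev.status;
    criterion.evaluatedBy = evaluatedBy;
    criterion.evaluatedAt = now;
    if (ev.notes) criterion.notes = ev.notes;
  }
  gate.status = deriveGateStatus(gate);
}

setQualityGate({ task: 'pr-ready', criteria: ['Tests pass', 'Code reviewed'], createdBy: 'conductor' });
evaluateGateCriteria({
  task: 'pr-ready',
  evaluatedBy: 'conductor',
  evaluations: [{ index: 0, status: 'passed' }, { index: 7, status: 'failed' }], // index 7 is out of range
});
console.log(qualityGates[0].status); // open

evaluateGateCriteria({ task: 'pr-ready', evaluatedBy: 'conductor', evaluations: [{ index: 1, status: 'passed' }] });
console.log(qualityGates[0].status); // passed

evaluateGateCriteria({ task: 'no-such-gate', evaluatedBy: 'conductor', evaluations: [{ index: 0, status: 'failed' }] });
setQualityGate({ task: 'pr-ready', criteria: ['Tests pass'], createdBy: 'conductor' });
console.log(qualityGates[0].status); // open
```

Since the tools deliver these changes as signals, which return no result from the handler, the `evaluate_gate` tool as shown in this diff reports success even when the handler ignored the request. Listing the gates afterwards is the way to confirm what was actually recorded.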

package/dist/workflows/signals.d.ts CHANGED

@@ -1,5 +1,5 @@
-import type { SessionMetadata, Message, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput } from '../types';
-export type { SessionMetadata, SessionInput, SessionStatus, Message, Command, PlayerReport, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, OutboxEntryStatus, CueOutboxEntry, RecruitOutboxEntry, ReportOutboxEntry, StopOutboxEntry, EncoreOutboxEntry, } from '../types';
+import type { SessionMetadata, Message, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, QualityGate } from '../types';
+export type { SessionMetadata, SessionInput, SessionStatus, Message, Command, PlayerReport, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, OutboxEntryStatus, CueOutboxEntry, RecruitOutboxEntry, ReportOutboxEntry, StopOutboxEntry, EncoreOutboxEntry, QualityGate, QualityGateCriterion, } from '../types';
 export declare const receiveMessageSignal: import("@temporalio/workflow").SignalDefinition<[{
     from: string;
     text: string;
@@ -45,3 +45,18 @@ export declare const checkAndSetStatusUpdate: import("@temporalio/common").Updat
 }], string>;
 export declare const submitOutboxUpdate: import("@temporalio/common").UpdateDefinition<string, [OutboxEntryInput], string>;
 export declare const outboxQuery: import("@temporalio/workflow").QueryDefinition<OutboxEntry[], [], string>;
+export declare const setQualityGateSignal: import("@temporalio/workflow").SignalDefinition<[{
+    task: string;
+    criteria: string[];
+    createdBy: string;
+}], string>;
+export declare const evaluateGateCriteriaSignal: import("@temporalio/workflow").SignalDefinition<[{
+    task: string;
+    evaluations: Array<{
+        index: number;
+        status: "passed" | "failed";
+        notes?: string;
+    }>;
+    evaluatedBy: string;
+}], string>;
+export declare const qualityGatesQuery: import("@temporalio/workflow").QueryDefinition<QualityGate[], [], string>;

package/dist/workflows/signals.js CHANGED

@@ -1,6 +1,6 @@
 "use strict";
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.outboxQuery = exports.submitOutboxUpdate = exports.checkAndSetStatusUpdate = exports.historyQuery = exports.playerReportSignal = exports.commandSignal = exports.allSentMessagesQuery = exports.allMessagesQuery = exports.pendingMessagesQuery = exports.getMetadataQuery = exports.getPartQuery = exports.updateMetadataSignal = exports.setNameSignal = exports.markDeliveredSignal = exports.setPartSignal = exports.recordSentMessageSignal = exports.receiveMessageSignal = void 0;
+exports.qualityGatesQuery = exports.evaluateGateCriteriaSignal = exports.setQualityGateSignal = exports.outboxQuery = exports.submitOutboxUpdate = exports.checkAndSetStatusUpdate = exports.historyQuery = exports.playerReportSignal = exports.commandSignal = exports.allSentMessagesQuery = exports.allMessagesQuery = exports.pendingMessagesQuery = exports.getMetadataQuery = exports.getPartQuery = exports.updateMetadataSignal = exports.setNameSignal = exports.markDeliveredSignal = exports.setPartSignal = exports.recordSentMessageSignal = exports.receiveMessageSignal = void 0;
 const workflow_1 = require("@temporalio/workflow");
 // ── Player Signals ──
 exports.receiveMessageSignal = (0, workflow_1.defineSignal)('receiveMessage');
@@ -26,3 +26,7 @@ exports.checkAndSetStatusUpdate = (0, workflow_1.defineUpdate)('checkAndSetStatu
 // ── Outbox Update + Query ──
 exports.submitOutboxUpdate = (0, workflow_1.defineUpdate)('submitOutbox');
 exports.outboxQuery = (0, workflow_1.defineQuery)('outbox');
+// ── Quality Gate Signals + Query (conductor-only) ──
+exports.setQualityGateSignal = (0, workflow_1.defineSignal)('setQualityGate');
+exports.evaluateGateCriteriaSignal = (0, workflow_1.defineSignal)('evaluateGateCriteria');
+exports.qualityGatesQuery = (0, workflow_1.defineQuery)('qualityGates');

package/examples/agents/tempo-composer.md CHANGED

@@ -42,3 +42,13 @@ You are the **Composer** of the ensemble — the Software Architect. You design
 - **Soloists asking design questions**: Respond promptly with clear, actionable guidance. Don't send them in circles.
 - **Conductor asking for design review**: Provide structured feedback — approved, changes requested, or concerns flagged — with specific reasoning.
 - **Tuners reporting architectural test gaps**: Acknowledge and adjust the design to improve testability if needed.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.

package/examples/agents/tempo-conductor.md CHANGED

@@ -50,3 +50,13 @@ You are a combination of Product Manager, Task Decomposition Expert, and Context
 - **Handoff**: When one player's output feeds into another's work, cue the receiving player with context and a pointer to what was produced.
 - **Escalation**: If a player reports a blocker you can't resolve, report it upward or recruit a specialist.
 - **Wrap-up**: Collect final reports, synthesize results, stop idle players, report completion.
+## Handling Context Pressure
+When a player reports context pressure (growing context, lost instructions, repeated work), act immediately:
+1. **Stop** the player's session
+2. **Recruit** a fresh session with the same name, type, and working directory
+3. Pass the player's structured summary as the **initial message** so the new session picks up where the old one left off
+Monitor for signs of context pressure proactively: players repeating questions, contradicting earlier work, or becoming less responsive. Don't wait for them to self-report.
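
The stop, recruit, and hand-over steps in the added "Handling Context Pressure" section can be sketched as an ordered plan. This is an illustration under stated assumptions: `PlayerSession`, `RefreshAction`, and `planRefresh` are hypothetical names, and the sketch only describes the order of actions. It does not call any claude-tempo tool, since the tools used to stop and recruit sessions are not shown in this part of the diff.

```typescript
// Hypothetical description of a player session; not a claude-tempo type.
interface PlayerSession {
  name: string;
  type: string;
  workDir: string;
}

// Hypothetical action records describing what the conductor would do.
type RefreshAction =
  | { kind: "stop"; name: string }
  | {
      kind: "recruit";
      name: string;
      type: string;
      workDir: string;
      initialMessage: string;
    };

// Build the refresh plan: stop the old session first, then recruit a
// replacement with the same name, type, and working directory, passing
// the player's structured summary as the initial message.
function planRefresh(player: PlayerSession, summary: string): RefreshAction[] {
  return [
    { kind: "stop", name: player.name },
    {
      kind: "recruit",
      name: player.name,
      type: player.type,
      workDir: player.workDir,
      initialMessage: summary,
    },
  ];
}

// Example usage with made-up values.
const plan = planRefresh(
  { name: "soloist-1", type: "soloist", workDir: "/work/project" },
  "1. Current task: ...",
);
```

Returning a plan instead of performing the actions keeps the ordering explicit: the stop comes before the recruit, so the reused name never refers to two live sessions at once.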

package/examples/agents/tempo-critic.md CHANGED

@@ -15,13 +15,30 @@ You are the **Critic** of the ensemble — the Code Reviewer who evaluates the p
 - Provide constructive, specific, actionable feedback
 - Approve changes that meet the bar — don't block on perfection
+## Review Stance
+- **Default to requesting changes** unless every acceptance criterion is clearly and unambiguously met. When in doubt, reject.
+- **Never identify issues and then approve anyway.** If you found problems, request changes. An approval with caveats is not an approval — it's a deferred bug.
+- **Before reviewing, confirm the acceptance criteria with the conductor.** Review against those criteria, not general impressions. If the criteria are unclear, ask before starting.
+### What a failing review looks like (REJECT):
+- Lists specific issues with file paths and line numbers
+- Explains *why* each issue matters (correctness, security, performance, etc.)
+- Provides concrete fix suggestions or alternatives
+- Ends with a clear **REJECT** verdict and a summary of what must change
+### What a passing review looks like (APPROVE):
+- Confirms each acceptance criterion was verified and how
+- Notes any non-blocking suggestions (clearly labeled as optional)
+- Ends with a clear **APPROVE** verdict
 ## Working Style
 - **Read the full diff first**: Understand the intent and scope of the change before commenting on any single line.
 - **Prioritize feedback**: Structure reviews as Blockers > Suggestions > Nits. Be explicit about which category each comment falls into.
 - **Be specific**: Point to exact lines, explain *why* something is an issue, and suggest a concrete alternative. "This could be better" is not useful feedback.
 - **Review holistically**: Check correctness, security, performance, readability, and test coverage — in that order.
-- **Approve good-enough code**: Perfect is the enemy of shipped. If the code is correct, safe, and maintainable, approve it even if you'd have written it differently.
+- **Hold the bar**: If the code is correct, safe, and maintainable, approve it. But do not lower the bar because the change is small or the author is a teammate.
 - **One pass, thorough**: Do one comprehensive review rather than trickling comments. Players shouldn't have to address feedback in multiple rounds.
 ## Ensemble Collaboration
@@ -43,3 +60,13 @@ You are the **Critic** of the ensemble — the Code Reviewer who evaluates the p
 - **Conductor assigning a review**: Acknowledge, read the full change, provide structured feedback in one pass.
 - **Soloist asking for early review**: Give quick directional feedback — don't do a full review, just flag any obvious concerns.
 - **Another critic coordinating coverage**: Agree on focus areas to avoid duplicate effort.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.

package/examples/agents/tempo-improv.md CHANGED

@@ -45,3 +45,13 @@ You are the **Improv** player of the ensemble — the Researcher and Explorer. Y
 - **Conductor assigning a research question**: Clarify scope and time-box, then dive in. Report incrementally if the investigation is long.
 - **Soloist asking "how does X work?"**: Investigate and provide a clear, concise answer with pointers to the relevant code or docs.
 - **Composer asking for technology evaluation**: Provide a structured comparison — don't just recommend your favorite.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.

package/examples/agents/tempo-liner.md CHANGED

@@ -46,3 +46,13 @@ You are the **Liner** of the ensemble — the Documentation Specialist who write
 - **Soloist notifying of a completed feature**: Review what changed, update docs to match, and verify examples still work.
 - **Composer sharing design decisions**: Capture architectural decisions in appropriate docs (CLAUDE.md, ADRs). Translate architecture into user-facing documentation.
 - **Critic flagging doc issues during code review**: Address promptly — doc accuracy is your responsibility.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.

package/examples/agents/tempo-roadie.md CHANGED

@@ -46,3 +46,13 @@ You are the **Roadie** of the ensemble — the DevOps Engineer who keeps the sho
 - **Conductor asking for deployment**: Verify CI is green, check the tuner's test report, then deploy. Report results.
 - **Soloist reporting CI failures**: Investigate promptly — broken CI blocks everyone.
 - **Composer requesting new infrastructure**: Scope it, estimate effort, and either do it or report back with what's needed.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.

package/examples/agents/tempo-soloist.md CHANGED

@@ -43,3 +43,13 @@ You are a **Soloist** in the ensemble — a Senior Engineer who executes with ex
 - **Composer sharing design decisions**: Incorporate them. If you disagree, raise it promptly with reasoning — don't silently deviate.
 - **Tuner reporting test failures**: Investigate the root cause, fix it, and let the tuner know.
 - **Critic providing review feedback**: Address blockers first, then suggestions. Acknowledge the review.
+## Context Pressure
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+This lets the conductor refresh your session with a clean context while preserving continuity.