claude-tempo 0.12.0 → 0.13.1
- package/CLAUDE.md +4 -0
- package/README.md +21 -0
- package/dist/cli/commands.js +1 -1
- package/dist/ensemble/loader.js +9 -0
- package/dist/ensemble/saver.js +3 -0
- package/dist/ensemble/schema.d.ts +1 -0
- package/dist/server.js +13 -3
- package/dist/tools/evaluate-gate.d.ts +3 -0
- package/dist/tools/evaluate-gate.js +40 -0
- package/dist/tools/gates.d.ts +3 -0
- package/dist/tools/gates.js +51 -0
- package/dist/tools/quality-gate.d.ts +3 -0
- package/dist/tools/quality-gate.js +34 -0
- package/dist/types.d.ts +18 -0
- package/dist/utils/validation.d.ts +8 -0
- package/dist/utils/validation.js +9 -1
- package/dist/workflows/session.js +47 -1
- package/dist/workflows/signals.d.ts +17 -2
- package/dist/workflows/signals.js +5 -1
- package/examples/agents/tempo-composer.md +10 -0
- package/examples/agents/tempo-conductor.md +10 -0
- package/examples/agents/tempo-critic.md +28 -1
- package/examples/agents/tempo-improv.md +10 -0
- package/examples/agents/tempo-liner.md +10 -0
- package/examples/agents/tempo-roadie.md +10 -0
- package/examples/agents/tempo-soloist.md +10 -0
- package/examples/agents/tempo-tuner.md +28 -0
- package/package.json +1 -1
- package/workflow-bundle.js +53 -3
package/CLAUDE.md
CHANGED
@@ -62,6 +62,9 @@ src/
 │ ├── schedule.ts # Create one-shot or recurring schedules
 │ ├── unschedule.ts # Cancel a named schedule
 │ ├── schedules.ts # List active schedules
+│ ├── quality-gate.ts # Define quality gates for tasks (conductor only)
+│ ├── evaluate-gate.ts # Mark gate criteria as passed/failed (conductor only)
+│ ├── gates.ts # List quality gates and their status (conductor only)
 │ └── helpers.ts # Zod/MCP tool registration wrapper
 ├── utils/
 │ └── validation.ts # Shared validation constants (name/message/path limits, encore defaults) and helpers
@@ -116,6 +119,7 @@ npm test
 - **Agent type discovery**: The `agent_types` MCP tool and `claude-tempo agent-types` CLI command let conductors discover available player types. Shipped examples (tempo-conductor, tempo-composer, tempo-soloist, tempo-tuner, tempo-critic, tempo-roadie, tempo-improv, tempo-liner) work out of the box. Ensemble lineups: tempo-big-band (full lifecycle), tempo-dev-team (feature work), tempo-review-squad (parallel review), tempo-jam-session (exploration).
 - **Schedule**: A one-shot or recurring message delivery configured via the `schedule` tool. Backed by a durable `claudeSchedulerWorkflow` — survives restarts. Supports delay (`delay`), fixed time (`at`), recurring interval (`every`), and cron expressions (`cron`) with optional IANA timezone (`timezone`). Cron schedules use `croner` for expression parsing and next-fire computation. Managed via `schedule`, `unschedule`, and `schedules` tools.
 - **Lineup**: A YAML file defining an ensemble configuration — which players to recruit, their types, working directories, and optional startup messages. Load via `load_lineup` to bootstrap a full ensemble in one step; save via `save_lineup` to snapshot a running ensemble's state for later reuse.
+- **Quality Gate**: A named checklist of criteria a conductor tracks to verify a task is complete. Created via `quality_gate` (conductor only), evaluated via `evaluate_gate`, and listed via `gates`. Each criterion has a `pending` → `passed` | `failed` status; the gate's aggregate status is derived automatically (all passed → `passed`, any failed → `failed`, else `open`). Gates are stored in the conductor workflow and survive `continueAsNew`.
 - **Wire protocol**: All Temporal signal, query, update, and workflow names are documented in [`docs/WIRE-PROTOCOL.md`](docs/WIRE-PROTOCOL.md). These names are stable as of v0.10 — renaming or removing any is a breaking change requiring a major version bump.
 
 ## Dashboard
package/README.md
CHANGED
@@ -133,6 +133,9 @@ These tools are available inside Claude Code sessions connected to claude-tempo:
 | `broadcast` | Send a message to all active players. Optional `type` filter limits to a specific player type. |
 | `encore` | Revive a stale player session — restarts the process and reconnects to the existing workflow with context restored. |
 | `recall` | Read your own message history. Shows received messages by default; pass `includeSent: true` for the full timeline. |
+| `quality_gate` | Define or replace a quality gate for a task — a named checklist of criteria that must pass. Conductor only. |
+| `evaluate_gate` | Mark one or more criteria on a quality gate as passed or failed. Conductor only. |
+| `gates` | List quality gates and their status. Filter by task name or status (`open`, `passed`, `failed`). Conductor only. |
 
 ## Scheduling
 
@@ -174,6 +177,24 @@ The `timezone` parameter accepts any IANA timezone (e.g. `"America/New_York"`, `
 - `claude-tempo status` shows active schedules alongside sessions
 - A single durable scheduler workflow per ensemble manages all schedules using Temporal timers
 
+## Quality Gates
+
+Conductors can define named checklists of criteria to verify task completion. Three conductor-only tools are available: `quality_gate` (create or replace a gate), `evaluate_gate` (mark criteria as passed or failed), and `gates` (list all gates with optional filters).
+
+### Examples
+
+Tell your conductor things like:
+
+- *"Set a quality gate 'pr-ready' with criteria: tests pass, no lint errors, code reviewed"*
+- *"Mark criteria 0 and 1 on 'pr-ready' as passed"*
+- *"Show me all open quality gates"*
+- *"Check whether 'deploy-staging' has passed"*
+
+### How it works
+
+- Gate status is derived from criteria: all passed → `passed`; any failed → `failed`; otherwise `open`
+- Gates survive `continueAsNew` for the conductor workflow's lifetime
+
 ## Ensemble Lineups
 
 Define reusable ensemble configurations as YAML files. A lineup specifies which players to recruit, what instructions to give them, what schedules to create, and optionally which custom agent files to use.
|
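The status rule from the README's "How it works" list can be sketched in a few lines. This is a minimal illustration only, assuming criteria are plain objects with a `status` field as in the `QualityGateCriterion` type; the helper name `deriveStatus` and the sample data are hypothetical and not part of the package.

```javascript
// Hypothetical helper restating the README rule:
// any failed -> 'failed', all passed -> 'passed', otherwise 'open'.
function deriveStatus(criteria) {
  if (criteria.some((c) => c.status === 'failed')) return 'failed';
  if (criteria.length > 0 && criteria.every((c) => c.status === 'passed')) return 'passed';
  return 'open';
}

// Illustrative data only, modelled on the 'pr-ready' example prompt.
const prReady = [
  { text: 'tests pass', status: 'passed' },
  { text: 'no lint errors', status: 'passed' },
  { text: 'code reviewed', status: 'pending' },
];

console.log(deriveStatus(prReady)); // open: one criterion is still pending
prReady[2].status = 'passed';
console.log(deriveStatus(prReady)); // passed
prReady[1].status = 'failed';
console.log(deriveStatus(prReady)); // failed
```

A single failed criterion is enough to fail the gate, so a gate can move from `passed` back to `failed` if a criterion is re-evaluated.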
package/dist/cli/commands.js
CHANGED
@@ -661,7 +661,7 @@ async function up(opts) {
     }
     else {
         // Default conductor name so the Claude Code session name matches the ensemble role
-        const sessionName = opts.name || 'conductor';
+        const sessionName = opts.name || lineup?.conductor?.name || 'conductor';
         // Resolve conductor agent type from lineup
         const conductorType = lineup?.conductor?.agent && lineup.conductor.agent !== 'default' && lineup.conductor.agent !== 'copilot'
             ? lineup.conductor.agent // custom agent path
|
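The changed line gives the session name a three-step fallback: an explicit option, then the lineup's conductor name, then the literal `'conductor'`. A small sketch of that precedence, assuming `opts` and `lineup` are plain objects; the helper name `resolveSessionName` is hypothetical and does not exist in the package.

```javascript
// Hypothetical helper with the same precedence as the changed line in up().
function resolveSessionName(opts, lineup) {
  return opts.name || lineup?.conductor?.name || 'conductor';
}

// Explicit option wins over the lineup.
console.log(resolveSessionName({ name: 'lead' }, { conductor: { name: 'maestro' } })); // lead
// No option: the lineup's conductor name is used.
console.log(resolveSessionName({}, { conductor: { name: 'maestro' } })); // maestro
// Neither is set (or no lineup at all): fall back to the default.
console.log(resolveSessionName({}, undefined)); // conductor
```

Because the chain uses `||`, an empty string at any step is treated the same as a missing value.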
package/dist/ensemble/loader.js
CHANGED
@@ -50,6 +50,15 @@ function loadLineup(filePath) {
             }
         }
     }
+    // Validate conductor name if present
+    if (doc.conductor?.name != null) {
+        if (typeof doc.conductor.name !== 'string' || !doc.conductor.name) {
+            throw new Error(`Invalid lineup: conductor.name must be a non-empty string`);
+        }
+        if (!/^[a-zA-Z0-9_-]+$/.test(doc.conductor.name)) {
+            throw new Error(`Invalid lineup: conductor.name "${doc.conductor.name}" contains invalid characters`);
+        }
+    }
     return {
         name: doc.name,
         description: doc.description,
|
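To see which names the new check accepts, the validation can be lifted into a standalone function. This is a sketch under the assumption that only the two conditions shown in the hunk apply; `validateConductorName` is a hypothetical name, and the real check runs inline inside `loadLineup()`.

```javascript
// Hypothetical standalone version of the conductor.name check added to loadLineup().
function validateConductorName(name) {
  if (name == null) return; // the name is optional; null/undefined skip validation
  if (typeof name !== 'string' || !name) {
    throw new Error('Invalid lineup: conductor.name must be a non-empty string');
  }
  if (!/^[a-zA-Z0-9_-]+$/.test(name)) {
    throw new Error(`Invalid lineup: conductor.name "${name}" contains invalid characters`);
  }
}

// Try a few candidate values (illustrative only).
for (const candidate of [undefined, 'lead_1', 'first-chair', '', 'lead conductor', 42]) {
  try {
    validateConductorName(candidate);
    console.log(JSON.stringify(candidate), 'accepted');
  } catch (err) {
    console.log(JSON.stringify(candidate), 'rejected:', err.message);
  }
}
```

Only ASCII letters, digits, underscores, and hyphens pass, so names containing spaces or dots are rejected at load time.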
package/dist/ensemble/saver.js
CHANGED
@@ -36,7 +36,10 @@ async function saveLineup(client, ensemble, filePath, name) {
         const agentType = meta.agentType || 'claude';
         const workDir = meta.workDir || undefined;
         if (isConductor) {
+            const conductorName = meta.playerId || undefined;
             conductor = {
+                // Only save name if it's not the default 'conductor'
+                ...(conductorName && conductorName !== 'conductor' ? { name: conductorName } : {}),
                 agent: agentType === 'copilot' ? 'copilot' : undefined,
             };
         }
package/dist/server.js
CHANGED
@@ -64,6 +64,9 @@ const who_am_i_1 = require("./tools/who-am-i");
 const broadcast_1 = require("./tools/broadcast");
 const recall_1 = require("./tools/recall");
 const encore_1 = require("./tools/encore");
+const quality_gate_1 = require("./tools/quality-gate");
+const evaluate_gate_1 = require("./tools/evaluate-gate");
+const gates_1 = require("./tools/gates");
 const channel_1 = require("./channel");
 const agent_types_2 = require("./ensemble/agent-types");
 const log = (...args) => console.error('[claude-tempo]', ...args);
@@ -80,10 +83,11 @@ async function main() {
     const config = (0, config_1.getConfig)();
     const isConductor = process.env[config_1.ENV.CONDUCTOR] === 'true';
     const requestedName = process.env[config_1.ENV.PLAYER_NAME] || '';
-    //
+    // Conductors use their requested name or fall back to 'conductor'.
+    // Non-conductors are prevented from using "conductor" as a name,
     // which would collide with the conductor's deterministic workflow ID.
     let playerId = isConductor
-        ? 'conductor'
+        ? (requestedName || 'conductor')
         : (requestedName && requestedName !== 'conductor' ? requestedName : '') || crypto.randomBytes(4).toString('hex');
     const getPlayerId = () => playerId;
     const setPlayerId = (id) => { playerId = id; };
@@ -236,7 +240,7 @@ async function main() {
         }
     }
     // Create MCP server
-    const hasRequestedName = Boolean(requestedName && requestedName !== 'conductor');
+    const hasRequestedName = isConductor || Boolean(requestedName && requestedName !== 'conductor');
     const playerTypeLine = playerType
         ? `Your player type is "${playerType}"${playerTypeDescription ? ` (${playerTypeDescription})` : ''}. `
         : '';
@@ -279,6 +283,12 @@ async function main() {
     (0, broadcast_1.registerBroadcastTool)(mcpServer, client, config, getPlayerId, handle);
     (0, recall_1.registerRecallTool)(mcpServer, handle, getPlayerId);
     (0, encore_1.registerEncoreTool)(mcpServer, client, config, getPlayerId, handle);
+    // Conductor-only tools
+    if (isConductor) {
+        (0, quality_gate_1.registerQualityGateTool)(mcpServer, handle, getPlayerId);
+        (0, evaluate_gate_1.registerEvaluateGateTool)(mcpServer, handle, getPlayerId);
+        (0, gates_1.registerGatesTool)(mcpServer, handle);
+    }
     const MAESTRO_ACK = '\n\n[IMPORTANT: This message is from a human (Maestro). Immediately cue the sender back with a brief acknowledgment and your planned next step before doing the work.]';
     // Start message poller — push messages into Claude Code via channel notifications.
     // Skip when running under the Copilot bridge: the bridge has its own poller that
|
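The `playerId` change in `main()` above is easier to follow as a function with a few sample inputs. The sketch below restates the same expression; `resolvePlayerId` is a hypothetical name introduced here for illustration, and the environment-variable lookup from the original is replaced by plain arguments.

```javascript
const crypto = require('crypto');

// Hypothetical helper restating the playerId expression from main().
function resolvePlayerId(isConductor, requestedName) {
  // Conductors keep their requested name, or fall back to 'conductor'.
  if (isConductor) return requestedName || 'conductor';
  // Non-conductors may not claim the reserved name 'conductor'.
  const allowed = requestedName && requestedName !== 'conductor' ? requestedName : '';
  // Without a usable name, a random 8-character hex id is generated.
  return allowed || crypto.randomBytes(4).toString('hex');
}

console.log(resolvePlayerId(true, ''));           // conductor
console.log(resolvePlayerId(true, 'maestro'));    // maestro
console.log(resolvePlayerId(false, 'soloist-1')); // soloist-1
console.log(resolvePlayerId(false, 'conductor')); // random hex id, value varies per run
```

Before this change a conductor was always named `'conductor'`, whatever name was requested.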
package/dist/tools/evaluate-gate.js
ADDED
@@ -0,0 +1,40 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerEvaluateGateTool = registerEvaluateGateTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerEvaluateGateTool(server, handle, getPlayerId) {
+    (0, helpers_1.defineTool)(server, 'evaluate_gate', 'Mark one or more criteria on a quality gate as passed or failed. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).describe('The task name of the gate to evaluate'),
+        evaluations: zod_1.z.array(zod_1.z.object({
+            index: zod_1.z.number().int().min(0).describe('Zero-based index of the criterion'),
+            status: zod_1.z.enum(['passed', 'failed']).describe('Whether this criterion passed or failed'),
+            notes: zod_1.z.string().max(validation_1.GATE_NOTES_MAX).optional().describe('Optional notes explaining the evaluation'),
+        })).min(1).describe('List of criterion evaluations'),
+    }, async (args) => {
+        const { task, evaluations } = args;
+        try {
+            await handle.signal('evaluateGateCriteria', {
+                task,
+                evaluations,
+                evaluatedBy: getPlayerId(),
+            });
+            const summary = evaluations
+                .map((ev) => ` ${ev.index}: ${ev.status === 'passed' ? '\u2705' : '\u274c'} ${ev.status}${ev.notes ? ` — ${ev.notes}` : ''}`)
+                .join('\n');
+            return {
+                content: [{
+                    type: 'text',
+                    text: `Evaluated ${evaluations.length} criteria on gate **${task}**:\n${summary}`,
+                }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to evaluate gate: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}
package/dist/tools/gates.js
ADDED
@@ -0,0 +1,51 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerGatesTool = registerGatesTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerGatesTool(server, handle) {
+    (0, helpers_1.defineTool)(server, 'gates', 'List quality gates and their status. Optionally filter by task name or status. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).optional().describe('Filter by specific task name'),
+        status: zod_1.z.enum(['open', 'passed', 'failed']).optional().describe('Filter by gate status'),
+    }, async (args) => {
+        const { task, status } = args;
+        try {
+            const gates = await handle.query('qualityGates');
+            let filtered = gates;
+            if (task) {
+                filtered = filtered.filter((g) => g.task === task);
+            }
+            if (status) {
+                filtered = filtered.filter((g) => g.status === status);
+            }
+            if (filtered.length === 0) {
+                return {
+                    content: [{ type: 'text', text: 'No quality gates found matching the filter.' }],
+                };
+            }
+            const lines = filtered.map((g) => {
+                const icon = g.status === 'passed' ? '\u2705' : g.status === 'failed' ? '\u274c' : '\u23f3';
+                const criteriaLines = g.criteria.map((c, i) => {
+                    const cIcon = c.status === 'passed' ? '\u2705' : c.status === 'failed' ? '\u274c' : '\u2b1c';
+                    const evaluator = c.evaluatedBy ? ` (by ${c.evaluatedBy})` : '';
+                    const notes = c.notes ? ` — ${c.notes}` : '';
+                    return ` ${i}. ${cIcon} ${c.text}${evaluator}${notes}`;
+                });
+                return `${icon} **${g.task}** [${g.status}] (by ${g.createdBy}, ${g.createdAt})\n${criteriaLines.join('\n')}`;
+            });
+            return {
+                content: [{
+                    type: 'text',
+                    text: `${filtered.length} quality gate${filtered.length === 1 ? '' : 's'}:\n\n${lines.join('\n\n')}`,
+                }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to query gates: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}
package/dist/tools/quality-gate.js
ADDED
@@ -0,0 +1,34 @@
+"use strict";
+Object.defineProperty(exports, "__esModule", { value: true });
+exports.registerQualityGateTool = registerQualityGateTool;
+const zod_1 = require("zod");
+const helpers_1 = require("./helpers");
+const validation_1 = require("../utils/validation");
+function registerQualityGateTool(server, handle, getPlayerId) {
+    (0, helpers_1.defineTool)(server, 'quality_gate', 'Define or replace a quality gate for a task. Each gate has a list of criteria that must pass before the task is considered complete. Conductor only.', {
+        task: zod_1.z.string().max(validation_1.GATE_TASK_MAX).describe('Unique task name for this gate (e.g. "pr-review", "deploy-staging")'),
+        criteria: zod_1.z.array(zod_1.z.string().max(validation_1.GATE_CRITERION_TEXT_MAX)).min(1).max(validation_1.GATE_CRITERIA_MAX).describe('List of criteria that must be evaluated (e.g. ["Tests pass", "No lint errors", "Code reviewed"])'),
+    }, async (args) => {
+        const { task, criteria } = args;
+        try {
+            await handle.signal('setQualityGate', {
+                task,
+                criteria,
+                createdBy: getPlayerId(),
+            });
+            const lines = criteria.map((c, i) => ` ${i}. [ ] ${c}`);
+            return {
+                content: [{
+                    type: 'text',
+                    text: `Quality gate **${task}** set with ${criteria.length} criteria:\n${lines.join('\n')}`,
+                }],
+            };
+        }
+        catch (err) {
+            return {
+                content: [{ type: 'text', text: `Failed to set quality gate: ${err}` }],
+                isError: true,
+            };
+        }
+    });
+}
package/dist/types.d.ts
CHANGED
@@ -42,6 +42,8 @@ export interface SessionInput {
     autoSummary?: string;
     /** Disable stale session detection (for passive mailbox workflows like maestro) */
     disableStaleDetection?: boolean;
+    /** Restored from continue-as-new (conductor only) */
+    qualityGates?: QualityGate[];
     /** Temporal config passed through for outbox activities (non-secret fields only). */
     temporalConfig?: {
         temporalAddress: string;
@@ -134,6 +136,22 @@ export type OutboxEntry = CueOutboxEntry | RecruitOutboxEntry | ReportOutboxEntr
 type DistributiveOmit<T, K extends keyof any> = T extends any ? Omit<T, K> : never;
 /** Input type for submitting outbox entries — auto-fields (id, createdAt, status, error, deliveredAt) are added by the workflow. */
 export type OutboxEntryInput = DistributiveOmit<OutboxEntry, 'id' | 'createdAt' | 'status' | 'error' | 'deliveredAt'>;
+export interface QualityGateCriterion {
+    text: string;
+    status: 'pending' | 'passed' | 'failed';
+    evaluatedBy?: string;
+    evaluatedAt?: string;
+    notes?: string;
+}
+export interface QualityGate {
+    /** Unique key identifying the task this gate covers. */
+    task: string;
+    criteria: QualityGateCriterion[];
+    createdBy: string;
+    createdAt: string;
+    /** Derived: all passed → passed, any failed → failed, else open. */
+    status: 'open' | 'passed' | 'failed';
+}
 export interface ScheduleEntry {
     /** Unique name for this schedule (used as key for add/replace/remove). */
     name: string;
package/dist/utils/validation.d.ts
CHANGED
@@ -22,6 +22,14 @@ export declare const SCHEDULE_NAME_MAX = 64;
 export declare const SCHEDULE_MESSAGE_MAX = 10240;
 /** Maximum cron expression length. */
 export declare const CRON_EXPRESSION_MAX = 128;
+/** Maximum quality gate task name length. */
+export declare const GATE_TASK_MAX = 64;
+/** Maximum number of criteria per quality gate. */
+export declare const GATE_CRITERIA_MAX = 20;
+/** Maximum length for individual criterion text. */
+export declare const GATE_CRITERION_TEXT_MAX = 512;
+/** Maximum length for gate criterion notes. */
+export declare const GATE_NOTES_MAX = 1024;
 /** Default number of recent messages to include as context in an encore. */
 export declare const ENCORE_DEFAULT_CONTEXT_MESSAGES = 10;
 /** Maximum length for message preview truncation. */
package/dist/utils/validation.js
CHANGED
@@ -4,7 +4,7 @@
  * Used by MCP tool Zod schemas and config validation.
  */
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.PREVIEW_MAX_LENGTH = exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = exports.CRON_EXPRESSION_MAX = exports.SCHEDULE_MESSAGE_MAX = exports.SCHEDULE_NAME_MAX = exports.PATH_MAX = exports.PART_MAX = exports.MESSAGE_MAX = exports.ENSEMBLE_NAME_REGEX = exports.PLAYER_NAME_MAX = exports.PLAYER_NAME_REGEX = void 0;
+exports.PREVIEW_MAX_LENGTH = exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = exports.GATE_NOTES_MAX = exports.GATE_CRITERION_TEXT_MAX = exports.GATE_CRITERIA_MAX = exports.GATE_TASK_MAX = exports.CRON_EXPRESSION_MAX = exports.SCHEDULE_MESSAGE_MAX = exports.SCHEDULE_NAME_MAX = exports.PATH_MAX = exports.PART_MAX = exports.MESSAGE_MAX = exports.ENSEMBLE_NAME_REGEX = exports.PLAYER_NAME_MAX = exports.PLAYER_NAME_REGEX = void 0;
 exports.shouldIncludeInBroadcast = shouldIncludeInBroadcast;
 exports.validatePlayerName = validatePlayerName;
 exports.validateEnsembleName = validateEnsembleName;
@@ -28,6 +28,14 @@ exports.SCHEDULE_NAME_MAX = 64;
 exports.SCHEDULE_MESSAGE_MAX = 10240;
 /** Maximum cron expression length. */
 exports.CRON_EXPRESSION_MAX = 128;
+/** Maximum quality gate task name length. */
+exports.GATE_TASK_MAX = 64;
+/** Maximum number of criteria per quality gate. */
+exports.GATE_CRITERIA_MAX = 20;
+/** Maximum length for individual criterion text. */
+exports.GATE_CRITERION_TEXT_MAX = 512;
+/** Maximum length for gate criterion notes. */
+exports.GATE_NOTES_MAX = 1024;
 /** Default number of recent messages to include as context in an encore. */
 exports.ENCORE_DEFAULT_CONTEXT_MESSAGES = 10;
 /** Maximum length for message preview truncation. */
|
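The new limits are applied through Zod schemas in the tool definitions. For readers who want to see what the limits mean without pulling in Zod, here is a dependency-free sketch. The constant values are copied from the hunk above; the function `checkGateInput` and its messages are hypothetical and only approximate what the `quality_gate` schema enforces.

```javascript
// Limits copied from the validation constants in the diff.
const GATE_TASK_MAX = 64;
const GATE_CRITERIA_MAX = 20;
const GATE_CRITERION_TEXT_MAX = 512;

// Hypothetical check returning a list of problems (empty list means the input is within limits).
function checkGateInput(task, criteria) {
  const problems = [];
  if (typeof task !== 'string' || task.length > GATE_TASK_MAX) {
    problems.push(`task must be a string of at most ${GATE_TASK_MAX} characters`);
  }
  if (!Array.isArray(criteria) || criteria.length < 1 || criteria.length > GATE_CRITERIA_MAX) {
    problems.push(`criteria must contain 1 to ${GATE_CRITERIA_MAX} entries`);
  } else {
    criteria.forEach((text, i) => {
      if (typeof text !== 'string' || text.length > GATE_CRITERION_TEXT_MAX) {
        problems.push(`criterion ${i} must be a string of at most ${GATE_CRITERION_TEXT_MAX} characters`);
      }
    });
  }
  return problems;
}

console.log(checkGateInput('pr-ready', ['tests pass', 'no lint errors'])); // []
console.log(checkGateInput('x'.repeat(65), []).length); // 2
```

Notes on individual evaluations have their own limit (`GATE_NOTES_MAX`, 1024), which applies to `evaluate_gate` rather than `quality_gate` and is left out of this sketch.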
package/dist/workflows/session.js
CHANGED
@@ -24,6 +24,7 @@ async function claudeSessionWorkflow(input) {
     // non-determinism errors during rolling deploys.
     (0, workflow_1.patched)('v0.10-initial');
     (0, workflow_1.patched)('v0.11-check-and-set-status');
+    (0, workflow_1.patched)('v0.13-quality-gates');
     // Ensure search attributes are always current — critical when reconnecting
     // via WorkflowIdConflictPolicy.USE_EXISTING, which skips the attributes
     // passed to client.workflow.start().
@@ -165,6 +166,7 @@ async function claudeSessionWorkflow(input) {
     // ── Conductor State ──
     const commandHistory = input.commandHistory ?? [];
     const reportHistory = input.reportHistory ?? [];
+    const qualityGates = input.qualityGates ?? [];
     // ── Conductor-specific Handlers ──
     if (input.metadata.isConductor) {
         (0, workflow_1.setHandler)(signals_1.commandSignal, (cmd) => {
@@ -210,6 +212,50 @@ async function claudeSessionWorkflow(input) {
             ];
             return entries.sort((a, b) => a.timestamp.localeCompare(b.timestamp));
         });
+        // ── Quality Gate Handlers ──
+        /** Derive aggregate gate status from individual criteria. */
+        function deriveGateStatus(gate) {
+            if (gate.criteria.length === 0)
+                return 'open';
+            if (gate.criteria.some((c) => c.status === 'failed'))
+                return 'failed';
+            if (gate.criteria.every((c) => c.status === 'passed'))
+                return 'passed';
+            return 'open';
+        }
+        (0, workflow_1.setHandler)(signals_1.setQualityGateSignal, ({ task, criteria, createdBy }) => {
+            const existing = qualityGates.findIndex((g) => g.task === task);
+            const gate = {
+                task,
+                criteria: criteria.map((text) => ({ text, status: 'pending' })),
+                createdBy,
+                createdAt: new Date().toISOString(),
+                status: 'open',
+            };
+            if (existing >= 0) {
+                qualityGates[existing] = gate;
+            }
+            else {
+                qualityGates.push(gate);
+            }
+        });
+        (0, workflow_1.setHandler)(signals_1.evaluateGateCriteriaSignal, ({ task, evaluations, evaluatedBy }) => {
+            const gate = qualityGates.find((g) => g.task === task);
+            if (!gate)
+                return;
+            const now = new Date().toISOString();
+            for (const ev of evaluations) {
+                if (ev.index >= 0 && ev.index < gate.criteria.length) {
+                    gate.criteria[ev.index].status = ev.status;
+                    gate.criteria[ev.index].evaluatedBy = evaluatedBy;
+                    gate.criteria[ev.index].evaluatedAt = now;
+                    if (ev.notes)
+                        gate.criteria[ev.index].notes = ev.notes;
+                }
+            }
+            gate.status = deriveGateStatus(gate);
+        });
+        (0, workflow_1.setHandler)(signals_1.qualityGatesQuery, () => qualityGates);
     }
     // ── Main Loop ──
     const hasPendingOutbox = () => outbox.some((e) => e.status === 'pending');
@@ -368,7 +414,7 @@ async function claudeSessionWorkflow(input) {
                 messages: messages.filter((m) => !m.delivered),
                 sentMessages: sentMessages.slice(-50),
                 outbox: outbox.filter((e) => e.status === 'pending' || e.status === 'processing'),
-                ...(input.metadata.isConductor ? { commandHistory, reportHistory } : {}),
+                ...(input.metadata.isConductor ? { commandHistory, reportHistory, qualityGates } : {}),
             });
         }
     }
|
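The quality gate handlers added to the session workflow above have a few behaviours that are easy to miss when reading the diff: setting a gate for an existing task replaces it and resets every criterion to `pending`, and evaluations aimed at an unknown task or an out-of-range index are dropped without an error. The sketch below replays that logic in memory to make this visible. It is an illustration only: it runs without Temporal, the plain functions stand in for the signal handlers, and the sample task names are made up.

```javascript
// In-memory stand-in for the conductor workflow state (no Temporal involved).
const qualityGates = [];

function deriveGateStatus(gate) {
  if (gate.criteria.length === 0) return 'open';
  if (gate.criteria.some((c) => c.status === 'failed')) return 'failed';
  if (gate.criteria.every((c) => c.status === 'passed')) return 'passed';
  return 'open';
}

// Stand-in for the setQualityGate signal handler.
function setQualityGate({ task, criteria, createdBy }) {
  const gate = {
    task,
    criteria: criteria.map((text) => ({ text, status: 'pending' })),
    createdBy,
    createdAt: new Date().toISOString(),
    status: 'open',
  };
  const existing = qualityGates.findIndex((g) => g.task === task);
  if (existing >= 0) qualityGates[existing] = gate; // replace: earlier evaluations are discarded
  else qualityGates.push(gate);
}

// Stand-in for the evaluateGateCriteria signal handler.
function evaluateGateCriteria({ task, evaluations, evaluatedBy }) {
  const gate = qualityGates.find((g) => g.task === task);
  if (!gate) return; // unknown task: silently ignored
  const now = new Date().toISOString();
  for (const ev of evaluations) {
    if (ev.index >= 0 && ev.index < gate.criteria.length) {
      const criterion = gate.criteria[ev.index];
      criterion.status = ev.status;
      criterion.evaluatedBy = evaluatedBy;
      criterion.evaluatedAt = now;
      if (ev.notes) criterion.notes = ev.notes;
    } // out-of-range index: silently ignored
  }
  gate.status = deriveGateStatus(gate);
}

setQualityGate({ task: 'pr-ready', criteria: ['tests pass', 'code reviewed'], createdBy: 'conductor' });
evaluateGateCriteria({
  task: 'pr-ready',
  evaluations: [{ index: 0, status: 'passed' }, { index: 5, status: 'failed' }],
  evaluatedBy: 'conductor',
});
console.log(qualityGates[0].status); // open: index 5 was ignored, criterion 1 is still pending

evaluateGateCriteria({ task: 'no-such-gate', evaluations: [{ index: 0, status: 'failed' }], evaluatedBy: 'conductor' });
console.log(qualityGates.length); // 1: nothing was created for the unknown task

setQualityGate({ task: 'pr-ready', criteria: ['tests pass'], createdBy: 'conductor' });
console.log(qualityGates[0].criteria[0].status); // pending: replacing the gate reset it
```

Since the `evaluate_gate` tool builds its confirmation text from its own arguments, it appears that a dropped evaluation would still be reported as evaluated; running `gates` afterwards shows the state the workflow actually holds.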
package/dist/workflows/signals.d.ts
CHANGED
@@ -1,5 +1,5 @@
-import type { SessionMetadata, Message, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput } from '../types';
-export type { SessionMetadata, SessionInput, SessionStatus, Message, Command, PlayerReport, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, OutboxEntryStatus, CueOutboxEntry, RecruitOutboxEntry, ReportOutboxEntry, StopOutboxEntry, EncoreOutboxEntry, } from '../types';
+import type { SessionMetadata, Message, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, QualityGate } from '../types';
+export type { SessionMetadata, SessionInput, SessionStatus, Message, Command, PlayerReport, SentMessage, HistoryEntry, OutboxEntry, OutboxEntryInput, OutboxEntryStatus, CueOutboxEntry, RecruitOutboxEntry, ReportOutboxEntry, StopOutboxEntry, EncoreOutboxEntry, QualityGate, QualityGateCriterion, } from '../types';
 export declare const receiveMessageSignal: import("@temporalio/workflow").SignalDefinition<[{
     from: string;
     text: string;
@@ -45,3 +45,18 @@ export declare const checkAndSetStatusUpdate: import("@temporalio/common").Updat
 }], string>;
 export declare const submitOutboxUpdate: import("@temporalio/common").UpdateDefinition<string, [OutboxEntryInput], string>;
 export declare const outboxQuery: import("@temporalio/workflow").QueryDefinition<OutboxEntry[], [], string>;
+export declare const setQualityGateSignal: import("@temporalio/workflow").SignalDefinition<[{
+    task: string;
+    criteria: string[];
+    createdBy: string;
+}], string>;
+export declare const evaluateGateCriteriaSignal: import("@temporalio/workflow").SignalDefinition<[{
+    task: string;
+    evaluations: Array<{
+        index: number;
+        status: "passed" | "failed";
+        notes?: string;
+    }>;
+    evaluatedBy: string;
+}], string>;
+export declare const qualityGatesQuery: import("@temporalio/workflow").QueryDefinition<QualityGate[], [], string>;
package/dist/workflows/signals.js
CHANGED
@@ -1,6 +1,6 @@
 "use strict";
 Object.defineProperty(exports, "__esModule", { value: true });
-exports.outboxQuery = exports.submitOutboxUpdate = exports.checkAndSetStatusUpdate = exports.historyQuery = exports.playerReportSignal = exports.commandSignal = exports.allSentMessagesQuery = exports.allMessagesQuery = exports.pendingMessagesQuery = exports.getMetadataQuery = exports.getPartQuery = exports.updateMetadataSignal = exports.setNameSignal = exports.markDeliveredSignal = exports.setPartSignal = exports.recordSentMessageSignal = exports.receiveMessageSignal = void 0;
+exports.qualityGatesQuery = exports.evaluateGateCriteriaSignal = exports.setQualityGateSignal = exports.outboxQuery = exports.submitOutboxUpdate = exports.checkAndSetStatusUpdate = exports.historyQuery = exports.playerReportSignal = exports.commandSignal = exports.allSentMessagesQuery = exports.allMessagesQuery = exports.pendingMessagesQuery = exports.getMetadataQuery = exports.getPartQuery = exports.updateMetadataSignal = exports.setNameSignal = exports.markDeliveredSignal = exports.setPartSignal = exports.recordSentMessageSignal = exports.receiveMessageSignal = void 0;
 const workflow_1 = require("@temporalio/workflow");
 // ── Player Signals ──
 exports.receiveMessageSignal = (0, workflow_1.defineSignal)('receiveMessage');
@@ -26,3 +26,7 @@ exports.checkAndSetStatusUpdate = (0, workflow_1.defineUpdate)('checkAndSetStatu
 // ── Outbox Update + Query ──
 exports.submitOutboxUpdate = (0, workflow_1.defineUpdate)('submitOutbox');
 exports.outboxQuery = (0, workflow_1.defineQuery)('outbox');
+// ── Quality Gate Signals + Query (conductor-only) ──
+exports.setQualityGateSignal = (0, workflow_1.defineSignal)('setQualityGate');
+exports.evaluateGateCriteriaSignal = (0, workflow_1.defineSignal)('evaluateGateCriteria');
+exports.qualityGatesQuery = (0, workflow_1.defineQuery)('qualityGates');
package/examples/agents/tempo-composer.md
CHANGED
@@ -42,3 +42,13 @@ You are the **Composer** of the ensemble — the Software Architect. You design
 - **Soloists asking design questions**: Respond promptly with clear, actionable guidance. Don't send them in circles.
 - **Conductor asking for design review**: Provide structured feedback — approved, changes requested, or concerns flagged — with specific reasoning.
 - **Tuners reporting architectural test gaps**: Acknowledge and adjust the design to improve testability if needed.
+
+## Context Pressure
+
+If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
+
+1. **Current task**: What you're working on right now
+2. **Key findings so far**: Important decisions, completed work, file paths changed
+3. **Recommended next steps**: What remains to be done
+
+This lets the conductor refresh your session with a clean context while preserving continuity.
|
@@ -50,3 +50,13 @@ You are a combination of Product Manager, Task Decomposition Expert, and Context
|
|
|
50
50
|
- **Handoff**: When one player's output feeds into another's work, cue the receiving player with context and a pointer to what was produced.
|
|
51
51
|
- **Escalation**: If a player reports a blocker you can't resolve, report it upward or recruit a specialist.
|
|
52
52
|
- **Wrap-up**: Collect final reports, synthesize results, stop idle players, report completion.
|
|
53
|
+
|
|
54
|
+
## Handling Context Pressure
|
|
55
|
+
|
|
56
|
+
When a player reports context pressure (growing context, lost instructions, repeated work), act immediately:
|
|
57
|
+
|
|
58
|
+
1. **Stop** the player's session
|
|
59
|
+
2. **Recruit** a fresh session with the same name, type, and working directory
|
|
60
|
+
3. Pass the player's structured summary as the **initial message** so the new session picks up where the old one left off
|
|
61
|
+
|
|
62
|
+
Monitor for signs of context pressure proactively: players repeating questions, contradicting earlier work, or becoming less responsive. Don't wait for them to self-report.
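The stop, recruit, and hand-off steps above can be sketched as one function. `EnsembleOps`, `PlayerSession`, and `refreshPlayer` are hypothetical names; the package's real stop and recruit tools may take different arguments and would likely be asynchronous.

```typescript
// Hypothetical session description; not the package's actual types.
interface PlayerSession {
  name: string;
  type: string;
  workDir: string;
}

// Hypothetical wrapper around whatever stop/recruit tools the conductor has.
interface EnsembleOps {
  stop(name: string): void;
  recruit(session: PlayerSession, initialMessage: string): void;
}

function refreshPlayer(ops: EnsembleOps, session: PlayerSession, summary: string): void {
  // 1. Stop the session that reported context pressure.
  ops.stop(session.name);
  // 2 + 3. Recruit a fresh session with the same name, type, and working
  // directory, passing the player's summary as its initial message.
  ops.recruit({ ...session }, summary);
}

// Recording stub so the call order can be inspected without a live ensemble.
const calls: string[] = [];
const recordingOps: EnsembleOps = {
  stop: (name) => {
    calls.push(`stop:${name}`);
  },
  recruit: (s, msg) => {
    calls.push(`recruit:${s.name}:${s.type}:${s.workDir}:${msg}`);
  },
};

refreshPlayer(
  recordingOps,
  { name: "soloist-1", type: "tempo-soloist", workDir: "/repo" },
  "Current task: wire up gates",
);
```

Stopping before recruiting matters here because the replacement reuses the same name; recruiting first would leave two sessions claiming one identity.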
|
|
@@ -15,13 +15,30 @@ You are the **Critic** of the ensemble — the Code Reviewer who evaluates the p
|
|
|
15
15
|
- Provide constructive, specific, actionable feedback
|
|
16
16
|
- Approve changes that meet the bar — don't block on perfection
|
|
17
17
|
|
|
18
|
+
## Review Stance
|
|
19
|
+
|
|
20
|
+
- **Default to requesting changes** unless every acceptance criterion is clearly and unambiguously met. When in doubt, reject.
|
|
21
|
+
- **Never identify issues and then approve anyway.** If you found problems, request changes. An approval with caveats is not an approval — it's a deferred bug.
|
|
22
|
+
- **Before reviewing, confirm the acceptance criteria with the conductor.** Review against those criteria, not general impressions. If the criteria are unclear, ask before starting.
|
|
23
|
+
|
|
24
|
+
### What a failing review looks like (REJECT):
|
|
25
|
+
- Lists specific issues with file paths and line numbers
|
|
26
|
+
- Explains *why* each issue matters (correctness, security, performance, etc.)
|
|
27
|
+
- Provides concrete fix suggestions or alternatives
|
|
28
|
+
- Ends with a clear **REJECT** verdict and a summary of what must change
|
|
29
|
+
|
|
30
|
+
### What a passing review looks like (APPROVE):
|
|
31
|
+
- Confirms each acceptance criterion was verified and how
|
|
32
|
+
- Notes any non-blocking suggestions (clearly labeled as optional)
|
|
33
|
+
- Ends with a clear **APPROVE** verdict
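One way to read the stance above as a decision rule, offered as a sketch rather than anything the package implements: approve only when every confirmed criterion is met and no issues were found, and reject in every other case. All type and function names here are hypothetical.

```typescript
// Hypothetical review records; names are illustrative.
interface CriterionCheck {
  criterion: string;
  met: boolean;
  evidence: string; // how the criterion was verified
}

interface ReviewIssue {
  file: string;
  line: number;
  why: string; // correctness, security, performance, etc.
  fix: string; // concrete suggestion or alternative
}

type Verdict = "APPROVE" | "REJECT";

function decideVerdict(checks: CriterionCheck[], issues: ReviewIssue[]): Verdict {
  // No confirmed criteria means there is nothing to review against: reject.
  if (checks.length === 0) return "REJECT";
  // Any unmet criterion or any identified issue forces a rejection;
  // optional suggestions are tracked separately and never enter `issues`.
  const allMet = checks.every((c) => c.met);
  return allMet && issues.length === 0 ? "APPROVE" : "REJECT";
}

const metCheck: CriterionCheck = {
  criterion: "gates persist across restarts",
  met: true,
  evidence: "ran the restart test",
};
const sampleIssue: ReviewIssue = {
  file: "src/tools/gates.ts",
  line: 12,
  why: "correctness",
  fix: "guard against a missing gate",
};
```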
|
|
34
|
+
|
|
18
35
|
## Working Style
|
|
19
36
|
|
|
20
37
|
- **Read the full diff first**: Understand the intent and scope of the change before commenting on any single line.
|
|
21
38
|
- **Prioritize feedback**: Structure reviews as Blockers > Suggestions > Nits. Be explicit about which category each comment falls into.
|
|
22
39
|
- **Be specific**: Point to exact lines, explain *why* something is an issue, and suggest a concrete alternative. "This could be better" is not useful feedback.
|
|
23
40
|
- **Review holistically**: Check correctness, security, performance, readability, and test coverage — in that order.
|
|
24
|
-
- **
|
|
41
|
+
- **Hold the bar**: If the code is correct, safe, and maintainable, approve it. But do not lower the bar because the change is small or the author is a teammate.
|
|
25
42
|
- **One pass, thorough**: Do one comprehensive review rather than trickling comments. Players shouldn't have to address feedback in multiple rounds.
|
|
26
43
|
|
|
27
44
|
## Ensemble Collaboration
|
|
@@ -43,3 +60,13 @@ You are the **Critic** of the ensemble — the Code Reviewer who evaluates the p
|
|
|
43
60
|
- **Conductor assigning a review**: Acknowledge, read the full change, provide structured feedback in one pass.
|
|
44
61
|
- **Soloist asking for early review**: Give quick directional feedback — don't do a full review, just flag any obvious concerns.
|
|
45
62
|
- **Another critic coordinating coverage**: Agree on focus areas to avoid duplicate effort.
|
|
63
|
+
|
|
64
|
+
## Context Pressure
|
|
65
|
+
|
|
66
|
+
If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
|
|
67
|
+
|
|
68
|
+
1. **Current task**: What you're working on right now
|
|
69
|
+
2. **Key findings so far**: Important decisions, completed work, file paths changed
|
|
70
|
+
3. **Recommended next steps**: What remains to be done
|
|
71
|
+
|
|
72
|
+
This lets the conductor refresh your session with a clean context while preserving continuity.
|
|
@@ -45,3 +45,13 @@ You are the **Improv** player of the ensemble — the Researcher and Explorer. Y
|
|
|
45
45
|
- **Conductor assigning a research question**: Clarify scope and time-box, then dive in. Report incrementally if the investigation is long.
|
|
46
46
|
- **Soloist asking "how does X work?"**: Investigate and provide a clear, concise answer with pointers to the relevant code or docs.
|
|
47
47
|
- **Composer asking for technology evaluation**: Provide a structured comparison — don't just recommend your favorite.
|
|
48
|
+
|
|
49
|
+
## Context Pressure
|
|
50
|
+
|
|
51
|
+
If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
|
|
52
|
+
|
|
53
|
+
1. **Current task**: What you're working on right now
|
|
54
|
+
2. **Key findings so far**: Important decisions, completed work, file paths changed
|
|
55
|
+
3. **Recommended next steps**: What remains to be done
|
|
56
|
+
|
|
57
|
+
This lets the conductor refresh your session with a clean context while preserving continuity.
|
|
@@ -46,3 +46,13 @@ You are the **Liner** of the ensemble — the Documentation Specialist who write
|
|
|
46
46
|
- **Soloist notifying of a completed feature**: Review what changed, update docs to match, and verify examples still work.
|
|
47
47
|
- **Composer sharing design decisions**: Capture architectural decisions in appropriate docs (CLAUDE.md, ADRs). Translate architecture into user-facing documentation.
|
|
48
48
|
- **Critic flagging doc issues during code review**: Address promptly — doc accuracy is your responsibility.
|
|
49
|
+
|
|
50
|
+
## Context Pressure
|
|
51
|
+
|
|
52
|
+
If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
|
|
53
|
+
|
|
54
|
+
1. **Current task**: What you're working on right now
|
|
55
|
+
2. **Key findings so far**: Important decisions, completed work, file paths changed
|
|
56
|
+
3. **Recommended next steps**: What remains to be done
|
|
57
|
+
|
|
58
|
+
This lets the conductor refresh your session with a clean context while preserving continuity.
|
|
@@ -46,3 +46,13 @@ You are the **Roadie** of the ensemble — the DevOps Engineer who keeps the sho
|
|
|
46
46
|
- **Conductor asking for deployment**: Verify CI is green, check the tuner's test report, then deploy. Report results.
|
|
47
47
|
- **Soloist reporting CI failures**: Investigate promptly — broken CI blocks everyone.
|
|
48
48
|
- **Composer requesting new infrastructure**: Scope it, estimate effort, and either do it or report back with what's needed.
|
|
49
|
+
|
|
50
|
+
## Context Pressure
|
|
51
|
+
|
|
52
|
+
If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
|
|
53
|
+
|
|
54
|
+
1. **Current task**: What you're working on right now
|
|
55
|
+
2. **Key findings so far**: Important decisions, completed work, file paths changed
|
|
56
|
+
3. **Recommended next steps**: What remains to be done
|
|
57
|
+
|
|
58
|
+
This lets the conductor refresh your session with a clean context while preserving continuity.
|
|
@@ -43,3 +43,13 @@ You are a **Soloist** in the ensemble — a Senior Engineer who executes with ex
|
|
|
43
43
|
- **Composer sharing design decisions**: Incorporate them. If you disagree, raise it promptly with reasoning — don't silently deviate.
|
|
44
44
|
- **Tuner reporting test failures**: Investigate the root cause, fix it, and let the tuner know.
|
|
45
45
|
- **Critic providing review feedback**: Address blockers first, then suggestions. Acknowledge the review.
|
|
46
|
+
|
|
47
|
+
## Context Pressure
|
|
48
|
+
|
|
49
|
+
If you notice your context growing large, you're losing track of earlier instructions, or you find yourself repeating work, report to the conductor immediately with a structured summary:
|
|
50
|
+
|
|
51
|
+
1. **Current task**: What you're working on right now
|
|
52
|
+
2. **Key findings so far**: Important decisions, completed work, file paths changed
|
|
53
|
+
3. **Recommended next steps**: What remains to be done
|
|
54
|
+
|
|
55
|
+
This lets the conductor refresh your session with a clean context while preserving continuity.
|