teamcast 1.0.3 → 1.2.0

package/README.md CHANGED
@@ -69,7 +69,7 @@ TeamCast now uses a canonical manifest shape with target-specific blocks:
69
69
  - `claude.agents.<name>` - native Claude Code runtime fields and doc outputs
70
70
  - `codex.agents.<name>` - native Codex runtime fields and TOML outputs
71
71
  - `<target>.agents.<name>.forge` - TeamCast-only metadata such as delegation graph
72
- - `project.environments` - active project environments such as `node`, `python` — auto-detected or explicit
72
+ - `project.environments` - active project environments (`node`, `python`, `go`, `rust`, `java`, `ruby`, `docker`, `terraform`) — auto-detected or explicit
73
73
 
74
74
  TeamCast includes a built-in registry of capabilities, traits, instruction fragments, policy fragments, models, and skills. These are not serialized into `teamcast.yaml`.
75
75
 
@@ -117,6 +117,7 @@ teamcast clean --yes # skip clean confirmation
117
117
  | `solo-dev` | developer | single full-stack agent handles end-to-end: plan, implement, test, verify |
118
118
  | `research-and-build` | orchestrator, researcher, planner, developer | research-first: orchestrator routes to researcher for external info, planner integrates findings, developer implements |
119
119
  | `secure-dev` | orchestrator, planner, developer, security-auditor, reviewer | mandatory security pipeline: planner includes threat model, developer follows OWASP, security-auditor gates every change, reviewer checks quality |
120
+ | `red-blue` | orchestrator, red-agent, blue-agent, judge | adversarial hardening: red attacks with failing tests, blue fixes without weakening them, judge decides SHIP or next round |
120
121
 
121
122
  The built-in preset files live in `templates/presets/` and are valid TeamCast YAML. Use them as a reference when creating custom presets, or copy one as a starting point:
122
123
 
@@ -526,6 +527,50 @@ This means a **reviewer** (read + execute, no write) gets code patterns but NOT
526
527
 
527
528
  Custom agents work the same way — a `react-dev` with `write_files` + `execute` automatically gets the right fragments without any role-name matching.
528
529
 
530
+ #### Built-in environments
531
+
532
+ | Environment | Auto-detected by | Policy allows |
533
+ |---|---|---|
534
+ | `node` | `package.json` | `npm`, `npx`, `node` |
535
+ | `python` | `pyproject.toml`, `requirements.txt`, `setup.py` | `pytest`, `python`, `uv`, `poetry` |
536
+ | `go` | `go.mod` | `go build/test/run/vet/mod` |
537
+ | `rust` | `Cargo.toml` | `cargo`, `rustfmt`, `clippy` |
538
+ | `java` | `pom.xml`, `build.gradle` | `mvn`, `gradle`, `./gradlew` |
539
+ | `ruby` | `Gemfile` | `bundle`, `rake`, `rspec` |
540
+ | `docker` | `Dockerfile`, `docker-compose.yml` | `docker`, `docker compose` |
541
+ | `terraform` | `main.tf`, `terraform.tf` | `terraform init/plan/validate/fmt` |
542
+
543
+ #### Custom environments
544
+
545
+ Drop a YAML file into `.agentforge/environments/` to add a new environment or override a builtin:
546
+
547
+ ```yaml
548
+ # .agentforge/environments/bun.yaml
549
+ id: bun
550
+ description: "Bun runtime environment"
551
+ detect_files:
552
+   - bun.lockb
553
+ policy_rules:
554
+   sandbox:
555
+     enabled: true
556
+   allow:
557
+     - "Bash(bun *)"
558
+ instruction_fragments:
559
+   bun_patterns:
560
+     content: |
561
+       This project uses Bun.
562
+       Use `bun install`, `bun run`, and `bun test`.
563
+     requires_capabilities:
564
+       - read_files
565
+ ```
566
+
567
+ Reference it in `teamcast.yaml` by id:
568
+
569
+ ```yaml
570
+ project:
571
+   environments: [node, bun]
572
+ ```
573
+
529
574
  ### Instruction Layers
530
575
 
531
576
  Agent prompts are composed from three layers:
@@ -534,7 +579,7 @@ Agent prompts are composed from three layers:
534
579
  |-------|--------|-------|
535
580
  | **instruction_blocks** | `teamcast.yaml` or preset | Project-specific behavior, workflow rules |
536
581
  | **instruction_fragments** | Built-in registry | Reusable role patterns (e.g. `feature-developer-workflow`) |
537
- | **environment instructions** | Built-in environments | Toolchain best practices, injected by capability |
582
+ | **environment instructions** | Built-in + custom environments | Toolchain best practices, injected by capability |
538
583
 
539
584
  Presets provide sensible defaults for `instruction_blocks` and `instruction_fragments`. For deeper customization, edit `teamcast.yaml` and run `teamcast generate`.
540
585
 
@@ -7,6 +7,7 @@ import { hasErrors, printManifestValidationSummary, } from '../validator/reporte
7
7
  import { getTarget, getRegisteredTargetNames } from '../renderers/registry.js';
8
8
  import { applyEnvironmentInstructions, resolveEnvironmentIds, resolveEnvironmentPolicies, } from '../core/environment-resolver.js';
9
9
  import { checkManifestRegistry } from '../validator/checks/manifest-registry.js';
10
+ import { builtinResourceLoader } from '../registry/resource-loader.js';
10
11
  export function evaluateTeam(manifest, options) {
11
12
  const schemaResult = validateSchema(manifest);
12
13
  if (!schemaResult.valid) {
@@ -15,6 +16,8 @@ export function evaluateTeam(manifest, options) {
15
16
  validationResults: [],
16
17
  };
17
18
  }
19
+ if (options?.cwd)
20
+ builtinResourceLoader.loadUserResources(options.cwd);
18
21
  const rawManifest = applyDefaults(schemaResult.data);
19
22
  const resolvedManifest = options?.cwd ? resolveEnvironmentPolicies(rawManifest, options.cwd) : rawManifest;
20
23
  const manifestRegistryResults = checkManifestRegistry(resolvedManifest);
@@ -6,7 +6,7 @@ import { writeManifest } from '../manifest/writer.js';
6
6
  import { expandCapabilities } from '../core/capability-resolver.js';
7
7
  import { generate } from '../generator/index.js';
8
8
  import { defaultRegistry } from '../registry/index.js';
9
- import { printSuccess, printError, printHeader, printCommandSuccess, } from '../utils/chalk-helpers.js';
9
+ import { printSuccess, printError, printHeader, printCommandSuccess, printNextSteps, } from '../utils/chalk-helpers.js';
10
10
  import { getTarget, getRegisteredTargetNames } from '../renderers/registry.js';
11
11
  import { evaluateTeam, teamHasBlockingIssues, printManifestValidation, } from '../application/validate-team.js';
12
12
  import { promptConfirm, promptInput, promptList, promptCheckbox, } from '../utils/prompts.js';
@@ -248,6 +248,10 @@ export async function runAddAgentCommand(name, options) {
248
248
  const nextTeam = addAgentToTeam(team, name, agent);
249
249
  applyManifestChanges(cwd, manifest, targetName, nextTeam);
250
250
  printCommandSuccess(`Agent "${name}" added and configuration regenerated`);
251
+ printNextSteps([
252
+ `Open ${chalk.bold('teamcast.yaml')} and fill in agent instructions based on ${chalk.yellow('// TODO')} comments`,
253
+ `Run ${chalk.bold('teamcast generate')} to apply your changes`,
254
+ ]);
251
255
  }
252
256
  export async function runCreateSkillCommand(name, options) {
253
257
  const cwd = process.cwd();
@@ -407,7 +411,11 @@ async function promptAgentConfig(name, targetContext) {
407
411
  instructions: [
408
412
  {
409
413
  kind: 'behavior',
410
- content: `You are ${name}. Focus on the responsibilities described in your role and use your allowed tools appropriately.`,
414
+ content: `You are ${name}.\n// TODO: Describe the agent's core personality, rules, and constraints here.\n// Example: "You are a strict security auditor. Never trust user input."`,
415
+ },
416
+ {
417
+ kind: 'workflow',
418
+ content: `// TODO: Define the step-by-step process the agent should follow.\n// 1. Read the provided context.\n// 2. Perform analysis.\n// 3. Output the result.`,
411
419
  },
412
420
  ],
413
421
  };
@@ -1,6 +1,6 @@
1
1
  import { TARGET_NAMES, getManifestTargetConfig, setManifestTargetConfig } from '../manifest/targets.js';
2
2
  import { getEnvironment, detectEnvironments } from '../registry/environments.js';
3
- import { isEnvironmentId } from '../registry/types.js';
3
+ import { isEnvironmentId } from '../registry/environments.js';
4
4
  import { agentHasCapability } from './capability-resolver.js';
5
5
  /**
6
6
  * Resolves environment IDs from the manifest, combining:
@@ -80,31 +80,30 @@ function mergePoliciesSimple(base, extra) {
80
80
  }
81
81
  : undefined,
82
82
  hooks: base.hooks || extra.hooks
83
- ? {
84
- pre_tool_use: [...(base.hooks?.pre_tool_use ?? []), ...(extra.hooks?.pre_tool_use ?? [])].length > 0
85
- ? [...(base.hooks?.pre_tool_use ?? []), ...(extra.hooks?.pre_tool_use ?? [])]
86
- : undefined,
87
- post_tool_use: [...(base.hooks?.post_tool_use ?? []), ...(extra.hooks?.post_tool_use ?? [])].length > 0
88
- ? [...(base.hooks?.post_tool_use ?? []), ...(extra.hooks?.post_tool_use ?? [])]
89
- : undefined,
90
- notification: [...(base.hooks?.notification ?? []), ...(extra.hooks?.notification ?? [])].length > 0
91
- ? [...(base.hooks?.notification ?? []), ...(extra.hooks?.notification ?? [])]
92
- : undefined,
93
- }
83
+ ? (() => {
84
+ const pre = [...(base.hooks?.pre_tool_use ?? []), ...(extra.hooks?.pre_tool_use ?? [])];
85
+ const post = [...(base.hooks?.post_tool_use ?? []), ...(extra.hooks?.post_tool_use ?? [])];
86
+ const notif = [...(base.hooks?.notification ?? []), ...(extra.hooks?.notification ?? [])];
87
+ return {
88
+ pre_tool_use: pre.length > 0 ? pre : undefined,
89
+ post_tool_use: post.length > 0 ? post : undefined,
90
+ notification: notif.length > 0 ? notif : undefined,
91
+ };
92
+ })()
94
93
  : undefined,
95
94
  network: base.network || extra.network
96
- ? {
97
- allowed_domains: [...new Set([
95
+ ? (() => {
96
+ const domains = [...new Set([
98
97
  ...(base.network?.allowed_domains ?? []),
99
98
  ...(extra.network?.allowed_domains ?? []),
100
- ])].length > 0
101
- ? [...new Set([...(base.network?.allowed_domains ?? []), ...(extra.network?.allowed_domains ?? [])])]
102
- : undefined,
103
- }
104
- : undefined,
105
- assertions: [...(base.assertions ?? []), ...(extra.assertions ?? [])].length > 0
106
- ? [...(base.assertions ?? []), ...(extra.assertions ?? [])]
99
+ ])];
100
+ return { allowed_domains: domains.length > 0 ? domains : undefined };
101
+ })()
107
102
  : undefined,
103
+ assertions: (() => {
104
+ const merged = [...(base.assertions ?? []), ...(extra.assertions ?? [])];
105
+ return merged.length > 0 ? merged : undefined;
106
+ })(),
108
107
  };
109
108
  }
110
109
  function resolveTargetPolicies(envPolicies, targetConfig) {
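The refactor in this hunk computes each merged list once and returns `undefined` when it is empty. As a minimal sketch of the same pattern, the two helpers below (`concatOrUndefined` and `uniqueOrUndefined` are hypothetical names, not part of the package) factor out the repeated expression. The hook values are placeholder strings, since the real hook shape is not shown in this diff.

```javascript
// Hypothetical helper: concatenate optional lists, returning undefined when empty.
function concatOrUndefined(...lists) {
  const merged = lists.flatMap((list) => list ?? []);
  return merged.length > 0 ? merged : undefined;
}

// Hypothetical helper: same idea with de-duplication, as done for allowed_domains.
function uniqueOrUndefined(...lists) {
  const merged = [...new Set(lists.flatMap((list) => list ?? []))];
  return merged.length > 0 ? merged : undefined;
}

// Placeholder policies; hook entries are plain strings for illustration only.
const base = { hooks: { pre_tool_use: ['lint'] }, network: { allowed_domains: ['a.example'] } };
const extra = { network: { allowed_domains: ['a.example', 'b.example'] } };

const mergedHooks = {
  pre_tool_use: concatOrUndefined(base.hooks?.pre_tool_use, extra.hooks?.pre_tool_use),
  post_tool_use: concatOrUndefined(base.hooks?.post_tool_use, extra.hooks?.post_tool_use),
};
const mergedDomains = uniqueOrUndefined(base.network?.allowed_domains, extra.network?.allowed_domains);

console.log(mergedHooks, mergedDomains);
```

Here `mergedHooks.post_tool_use` stays `undefined` because neither side defines it, and the duplicate `a.example` domain appears once in `mergedDomains`.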
@@ -3,7 +3,9 @@ import { buildGeneratedOutputs } from '../application/build-generated-files.js';
3
3
  import { getRegisteredTargetNames, getTarget } from '../renderers/registry.js';
4
4
  import { normalizeManifest } from '../manifest/normalize.js';
5
5
  import { applyEnvironmentInstructions, resolveEnvironmentIds, resolveEnvironmentPolicies, } from '../core/environment-resolver.js';
6
+ import { builtinResourceLoader } from '../registry/resource-loader.js';
6
7
  export function generate(manifest, options) {
8
+ builtinResourceLoader.loadUserResources(options.cwd);
7
9
  const rawManifest = resolveEnvironmentPolicies(applyDefaults(manifest), options.cwd);
8
10
  const envIds = resolveEnvironmentIds(rawManifest, options.cwd);
9
11
  const rawManifestRecord = rawManifest;
@@ -0,0 +1,65 @@
1
+ // Environment YAML schema — parse and convert YAML environment definitions
2
+ // to runtime EnvironmentDef objects.
3
+ import { existsSync } from 'fs';
4
+ import { join } from 'path';
5
+ import { isCapability } from './types.js';
6
+ // --- Validation ---
7
+ export function parseEnvironmentYaml(raw) {
8
+ if (!raw || typeof raw !== 'object') {
9
+ throw new Error('Environment definition must be an object');
10
+ }
11
+ const obj = raw;
12
+ if (typeof obj.id !== 'string' || !obj.id) {
13
+ throw new Error('Environment definition requires a non-empty "id" field');
14
+ }
15
+ if (typeof obj.description !== 'string') {
16
+ throw new Error(`Environment "${obj.id}": "description" must be a string`);
17
+ }
18
+ if (obj.detect_files !== undefined) {
19
+ if (!Array.isArray(obj.detect_files) || !obj.detect_files.every((f) => typeof f === 'string')) {
20
+ throw new Error(`Environment "${obj.id}": "detect_files" must be a string array`);
21
+ }
22
+ }
23
+ if (!obj.policy_rules || typeof obj.policy_rules !== 'object') {
24
+ throw new Error(`Environment "${obj.id}": "policy_rules" must be an object`);
25
+ }
26
+ const policies = obj.policy_rules;
27
+ if (policies.allow !== undefined) {
28
+ if (!Array.isArray(policies.allow) || !policies.allow.every((a) => typeof a === 'string')) {
29
+ throw new Error(`Environment "${obj.id}": "policy_rules.allow" must be a string array`);
30
+ }
31
+ }
32
+ if (!obj.instruction_fragments || typeof obj.instruction_fragments !== 'object') {
33
+ throw new Error(`Environment "${obj.id}": "instruction_fragments" must be an object`);
34
+ }
35
+ return obj;
36
+ }
37
+ // --- Conversion to runtime EnvironmentDef ---
38
+ function toEnvironmentInstruction(value) {
39
+ if (typeof value === 'string')
40
+ return value;
41
+ return {
42
+ content: value.content,
43
+ requires_capabilities: value.requires_capabilities.filter(isCapability),
44
+ };
45
+ }
46
+ export function environmentYamlToDef(yaml) {
47
+ const fragments = {};
48
+ for (const [key, value] of Object.entries(yaml.instruction_fragments)) {
49
+ fragments[key] = toEnvironmentInstruction(value);
50
+ }
51
+ const def = {
52
+ id: yaml.id,
53
+ description: yaml.description,
54
+ policyRules: {
55
+ sandbox: yaml.policy_rules.sandbox,
56
+ allow: yaml.policy_rules.allow,
57
+ },
58
+ instructionFragments: fragments,
59
+ };
60
+ if (yaml.detect_files?.length) {
61
+ const files = yaml.detect_files;
62
+ def.detect = (cwd) => files.some((file) => existsSync(join(cwd, file)));
63
+ }
64
+ return def;
65
+ }
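One edge case in the new schema file: `parseEnvironmentYaml` checks that `instruction_fragments` is an object but does not check the shape of each fragment, so `value.requires_capabilities.filter(...)` throws a `TypeError` when a fragment object omits that list. A defensive variant could look like the sketch below. `toEnvironmentInstructionSafe` and `KNOWN_CAPABILITIES` are hypothetical names, the capability list contains only the three ids visible in this diff, and treating a missing list as empty is an assumption rather than the package's documented behavior.

```javascript
// Assumption: only the capability ids that appear in this diff; the real list is not shown.
const KNOWN_CAPABILITIES = ['read_files', 'write_files', 'execute'];

// Hypothetical defensive variant of toEnvironmentInstruction.
function toEnvironmentInstructionSafe(envId, key, value) {
  // Plain strings pass through unchanged, as in the original function.
  if (typeof value === 'string') return value;
  if (!value || typeof value !== 'object' || typeof value.content !== 'string') {
    throw new Error(`Environment "${envId}": fragment "${key}" needs a string "content"`);
  }
  // Assumption: a missing list means "no capability requirements".
  const required = value.requires_capabilities ?? [];
  if (!Array.isArray(required)) {
    throw new Error(`Environment "${envId}": fragment "${key}": "requires_capabilities" must be an array`);
  }
  return {
    content: value.content,
    // Unknown capability names are dropped, mirroring the original filter.
    requires_capabilities: required.filter((cap) => KNOWN_CAPABILITIES.includes(cap)),
  };
}

const withoutList = toEnvironmentInstructionSafe('bun', 'bun_patterns', { content: 'Use bun.' });
const withUnknown = toEnvironmentInstructionSafe('bun', 'bun_testing', {
  content: 'Run bun test.',
  requires_capabilities: ['execute', 'teleport'],
});
console.log(withoutList, withUnknown);
```

With this shape, a malformed user file would produce a warning that names the environment and fragment instead of a bare `TypeError` message.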
@@ -1,105 +1,17 @@
1
- import { existsSync } from 'fs';
2
- import { join } from 'path';
3
- const ENVIRONMENTS = {
4
- node: {
5
- id: 'node',
6
- description: 'Node.js environment, auto-detected via package.json',
7
- detect: (cwd) => existsSync(join(cwd, 'package.json')),
8
- policyRules: {
9
- sandbox: { enabled: true },
10
- allow: [
11
- 'Bash(npm run *)',
12
- 'Bash(npm test *)',
13
- 'Bash(npx *)',
14
- 'Bash(npm install)',
15
- 'Bash(node *)',
16
- ],
17
- },
18
- instructionFragments: {
19
- node_code_patterns: {
20
- content: [
21
- 'This is a Node.js project.',
22
- 'Use ESM module syntax (import/export). All relative imports must use .js extensions.',
23
- 'Prefer named exports over default exports.',
24
- 'Use TypeScript strict mode when tsconfig.json is present.',
25
- ].join('\n'),
26
- requires_capabilities: ['read_files'],
27
- },
28
- node_development: {
29
- content: [
30
- 'Install dependencies with `npm install`.',
31
- 'Use `npm run <script>` to execute package.json scripts.',
32
- 'Prefer async/await over raw Promises or callbacks.',
33
- 'Handle errors at system boundaries. Use typed error classes where the project defines them.',
34
- ].join('\n'),
35
- requires_capabilities: ['write_files'],
36
- },
37
- node_testing: {
38
- content: [
39
- 'Run tests with `npm test`.',
40
- 'Run a specific test file with `npx vitest run <path>` (vitest) or `npx jest <path>` (jest).',
41
- 'Always run tests after making changes to verify nothing broke.',
42
- 'Follow existing test patterns: check the tests/ directory for conventions before writing new tests.',
43
- ].join('\n'),
44
- requires_capabilities: ['execute', 'write_files'],
45
- },
46
- },
47
- },
48
- python: {
49
- id: 'python',
50
- description: 'Python environment, auto-detected via pyproject.toml or requirements.txt',
51
- detect: (cwd) => existsSync(join(cwd, 'pyproject.toml')) ||
52
- existsSync(join(cwd, 'requirements.txt')) ||
53
- existsSync(join(cwd, 'setup.py')),
54
- policyRules: {
55
- sandbox: { enabled: true },
56
- allow: [
57
- 'Bash(pytest *)',
58
- 'Bash(python -m pytest *)',
59
- 'Bash(uv run *)',
60
- 'Bash(poetry run *)',
61
- 'Bash(python *)',
62
- ],
63
- },
64
- instructionFragments: {
65
- python_code_patterns: {
66
- content: [
67
- 'This is a Python project.',
68
- 'Follow PEP 8 style conventions.',
69
- 'Use type hints for function signatures and class attributes.',
70
- 'Prefer pathlib.Path over os.path for file operations.',
71
- ].join('\n'),
72
- requires_capabilities: ['read_files'],
73
- },
74
- python_development: {
75
- content: [
76
- 'If using poetry: `poetry install` and `poetry run <cmd>`. If using uv: `uv sync` and `uv run <cmd>`.',
77
- 'Otherwise use pip and virtualenv.',
78
- 'Use structured logging (logging module) instead of print statements.',
79
- 'Handle exceptions with specific types, not bare except clauses.',
80
- ].join('\n'),
81
- requires_capabilities: ['write_files'],
82
- },
83
- python_testing: {
84
- content: [
85
- 'Run tests with `pytest`. If using poetry or uv, prefix with `poetry run` or `uv run`.',
86
- 'Run a specific test: `pytest <path>::<test_name>`.',
87
- 'Always run tests after changes. Follow existing test patterns in the tests/ directory.',
88
- 'Use fixtures for shared setup. Prefer parametrize for similar test cases.',
89
- ].join('\n'),
90
- requires_capabilities: ['execute', 'write_files'],
91
- },
92
- },
93
- },
94
- };
1
+ // Environment registry delegates to ResourceLoader (YAML is the sole source).
2
+ import { builtinResourceLoader } from './resource-loader.js';
3
+ export function isEnvironmentId(value) {
4
+ return builtinResourceLoader.hasEnvironment(value);
5
+ }
95
6
  export function getEnvironment(id) {
96
- return ENVIRONMENTS[id];
7
+ const env = builtinResourceLoader.getEnvironment(id);
8
+ if (!env)
9
+ throw new Error(`Unknown environment "${id}"`);
10
+ return env;
97
11
  }
98
12
  export function listEnvironments() {
99
- return Object.values(ENVIRONMENTS);
13
+ return builtinResourceLoader.listEnvironments();
100
14
  }
101
15
  export function detectEnvironments(cwd) {
102
- return Object.values(ENVIRONMENTS)
103
- .filter((env) => env.detect(cwd))
104
- .map((env) => env.id);
16
+ return builtinResourceLoader.detectEnvironments(cwd);
105
17
  }
@@ -46,12 +46,17 @@ const INSTRUCTION_FRAGMENTS = {
46
46
  ].join('\n')),
47
47
  'solo-dev-style': block('style', 'Follow existing code style. Keep changes minimal and focused.'),
48
48
  'feature-orchestrator-workflow': block('workflow', [
49
- 'Always start by reading the task carefully. Then decide:',
50
- '- Does this need research or planning first? -> delegate to planner',
51
- '- Is the plan ready and implementation needed? -> delegate to developer',
52
- '- Is the implementation done and needs review? -> delegate to reviewer',
49
+ 'Classify every incoming task before acting:',
50
+ '- META (git operations, read file, explain code, answer a question) -> handle directly',
51
+ '- MICRO (typo, rename, 1-2 line fix) -> handle directly',
52
+ '- SMALL (bug fix, isolated change, single module, <50 lines) -> delegate to developer only',
53
+ '- MEDIUM (new feature, refactor touching multiple files) -> planner -> developer -> reviewer',
54
+ '- LARGE (complex feature, cross-cutting concern, new subsystem) -> planner -> developer -> reviewer with detailed handoff context',
55
+ ].join('\n')),
56
+ 'feature-orchestrator-output': block('delegation', [
57
+ 'When handling directly: be concise, do not explain your triage decision.',
58
+ 'When delegating: state the goal, relevant files, and expected output format.',
53
59
  ].join('\n')),
54
- 'feature-orchestrator-output': block('delegation', 'Never write code or modify files yourself. Your output is always a delegation or a final summary.'),
55
60
  'feature-planner-workflow': block('workflow', 'Always read the relevant files before making conclusions. Search for existing patterns and utilities that can be reused.'),
56
61
  'feature-planner-read-only': block('safety', 'Your output is always a plan - never code changes.'),
57
62
  'feature-reviewer-checklist': block('workflow', [
@@ -0,0 +1,78 @@
1
+ // ResourceLoader — scans directories for YAML resource files and registers them.
2
+ import { readFileSync, readdirSync } from 'fs';
3
+ import { dirname, join } from 'path';
4
+ import { fileURLToPath } from 'url';
5
+ import { parse } from 'yaml';
6
+ import { parseEnvironmentYaml, environmentYamlToDef } from './environment-schema.js';
7
+ const __filename = fileURLToPath(import.meta.url);
8
+ const __dirname = dirname(__filename);
9
+ const BUILTIN_ENVIRONMENTS_DIR = join(__dirname, '../../templates/environments');
10
+ /** Check whether an environment matches the given cwd. */
11
+ function matchesEnv(env, cwd) {
12
+ return env.detect ? env.detect(cwd) : false;
13
+ }
14
+ export class ResourceLoader {
15
+ environments = new Map();
16
+ loadedDirs = new Set();
17
+ /** Load all *.yaml files from a directory as environment definitions.
18
+ * Each directory is only loaded once — adding files after the first call has no effect. */
19
+ loadEnvironmentsFromDir(dir, allowOverride = false) {
20
+ if (this.loadedDirs.has(dir))
21
+ return;
22
+ this.loadedDirs.add(dir);
23
+ let files;
24
+ try {
25
+ files = readdirSync(dir).filter((f) => f.endsWith('.yaml') || f.endsWith('.yml'));
26
+ }
27
+ catch {
28
+ return; // Directory does not exist or is inaccessible
29
+ }
30
+ for (const file of files) {
31
+ const filePath = join(dir, file);
32
+ try {
33
+ const raw = parse(readFileSync(filePath, 'utf-8'));
34
+ const yaml = parseEnvironmentYaml(raw);
35
+ const def = environmentYamlToDef(yaml);
36
+ if (this.environments.has(def.id) && !allowOverride)
37
+ continue;
38
+ this.environments.set(def.id, def);
39
+ }
40
+ catch (err) {
41
+ if (allowOverride) {
42
+ // User-defined file — surface the error so the user can fix it
43
+ process.stderr.write(`[agentforge] Warning: skipping "${filePath}": ${err instanceof Error ? err.message : String(err)}\n`);
44
+ }
45
+ // Builtin files should never fail — skip silently
46
+ }
47
+ }
48
+ }
49
+ /** Load user-defined resources from a project's .agentforge/ directory. */
50
+ loadUserResources(projectDir) {
51
+ const envDir = join(projectDir, '.agentforge', 'environments');
52
+ this.loadEnvironmentsFromDir(envDir, true);
53
+ }
54
+ hasEnvironment(id) {
55
+ return this.environments.has(id);
56
+ }
57
+ getEnvironment(id) {
58
+ return this.environments.get(id);
59
+ }
60
+ listEnvironments() {
61
+ return [...this.environments.values()];
62
+ }
63
+ listEnvironmentIds() {
64
+ return [...this.environments.keys()];
65
+ }
66
+ detectEnvironments(cwd) {
67
+ return this.listEnvironments()
68
+ .filter((env) => matchesEnv(env, cwd))
69
+ .map((env) => env.id);
70
+ }
71
+ }
72
+ // Singleton — loads builtin environments from templates/environments/
73
+ function createBuiltinLoader() {
74
+ const loader = new ResourceLoader();
75
+ loader.loadEnvironmentsFromDir(BUILTIN_ENVIRONMENTS_DIR);
76
+ return loader;
77
+ }
78
+ export const builtinResourceLoader = createBuiltinLoader();
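The loader's registration rules (builtins first, user files allowed to override, each directory read only once) can be illustrated without touching the file system. The sketch below is an in-memory stand-in, not the package's class: `MiniLoader`, its `load` method, and the source names are hypothetical, and definitions are passed in directly instead of being parsed from YAML.

```javascript
// Hypothetical in-memory stand-in for ResourceLoader.
class MiniLoader {
  environments = new Map();
  loadedSources = new Set();

  // Register definitions from a named source; each source is processed once.
  load(sourceName, defs, allowOverride = false) {
    if (this.loadedSources.has(sourceName)) return;
    this.loadedSources.add(sourceName);
    for (const def of defs) {
      // Without allowOverride, the first definition for an id wins.
      if (this.environments.has(def.id) && !allowOverride) continue;
      this.environments.set(def.id, def);
    }
  }
}

const loader = new MiniLoader();
loader.load('builtin', [{ id: 'node', description: 'builtin node' }]);
loader.load('user', [
  { id: 'node', description: 'custom node' },
  { id: 'bun', description: 'Bun runtime environment' },
], true);
// A second load from the same source is ignored, like a directory that was already scanned.
loader.load('user', [{ id: 'deno', description: 'late addition' }], true);

const ids = [...loader.environments.keys()];
console.log(ids, loader.environments.get('node').description);
```

In this sketch the user definition replaces the builtin `node` entry, while `deno` is never registered because the `user` source was already loaded. The same once-per-directory rule in the real loader means files added to `.agentforge/environments/` after the first scan are not picked up within that process.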
@@ -14,7 +14,74 @@ export const CAPABILITY_IDS = [
14
14
  export function isCapability(value) {
15
15
  return CAPABILITY_IDS.includes(value);
16
16
  }
17
- export const ENVIRONMENT_IDS = ['node', 'python'];
18
- export function isEnvironmentId(value) {
19
- return ENVIRONMENT_IDS.includes(value);
20
- }
17
+ // --- Capability Trait (named bundle of capabilities) ---
18
+ export const BUILTIN_CAPABILITY_TRAIT_IDS = [
19
+ 'base-read',
20
+ 'file-authoring',
21
+ 'command-execution',
22
+ 'web-research',
23
+ 'delegation',
24
+ 'interaction',
25
+ 'notebook-editing',
26
+ 'no-file-edits',
27
+ 'no-commands',
28
+ 'no-web',
29
+ 'full-access',
30
+ ];
31
+ // --- Policy Fragment ---
32
+ export const BUILTIN_POLICY_FRAGMENT_IDS = [
33
+ 'allow-git-read',
34
+ 'allow-git-write',
35
+ 'ask-git-push',
36
+ 'deny-destructive-shell',
37
+ 'deny-network-downloads',
38
+ 'deny-dynamic-exec',
39
+ 'deny-env-files',
40
+ 'sandbox-default',
41
+ ];
42
+ export const BUILTIN_INSTRUCTION_FRAGMENT_IDS = [
43
+ 'coordination-core',
44
+ 'delegate-first',
45
+ 'planning-core',
46
+ 'planning-read-only',
47
+ 'research-core',
48
+ 'research-citation',
49
+ 'research-no-file-edits',
50
+ 'development-core',
51
+ 'development-workflow',
52
+ 'tester-core',
53
+ 'tester-read-only',
54
+ 'review-core',
55
+ 'review-feedback',
56
+ 'security-audit-core',
57
+ 'security-audit-severity',
58
+ 'research-handoff',
59
+ 'secure-planning',
60
+ 'secure-development',
61
+ 'secure-development-tests',
62
+ 'security-review-gate',
63
+ 'post-audit-review',
64
+ 'solo-dev-core',
65
+ 'solo-dev-workflow',
66
+ 'solo-dev-style',
67
+ 'feature-orchestrator-workflow',
68
+ 'feature-orchestrator-output',
69
+ 'feature-planner-workflow',
70
+ 'feature-planner-read-only',
71
+ 'feature-developer-core',
72
+ 'feature-developer-workflow',
73
+ 'feature-developer-summary',
74
+ 'feature-reviewer-checklist',
75
+ 'feature-reviewer-style',
76
+ 'research-orchestrator-core',
77
+ 'research-orchestrator-workflow',
78
+ 'research-orchestrator-output',
79
+ 'research-planner-core',
80
+ 'research-planner-constraints',
81
+ 'research-developer-core',
82
+ 'research-developer-tests',
83
+ 'secure-orchestrator-core',
84
+ 'secure-orchestrator-workflow',
85
+ 'secure-orchestrator-gate',
86
+ 'post-audit-review-core',
87
+ ];
@@ -2,7 +2,7 @@ import { defaultRegistry } from '../../registry/index.js';
2
2
  // --- Frontmatter ---
3
3
  function buildFrontmatter(skill) {
4
4
  const lines = ['---'];
5
- lines.push(`name: ${skill.name}`);
5
+ lines.push(`name: ${skill.id}`);
6
6
  lines.push(`description: ${skill.description}`);
7
7
  if (skill.allowed_tools?.length) {
8
8
  lines.push(`allowed-tools:`);
@@ -20,7 +20,7 @@ function generateSkillStub(skillName) {
20
20
  .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
21
21
  .join(' ');
22
22
  return `---
23
- name: ${title}
23
+ name: ${skillName}
24
24
  description: <!-- describe when this skill triggers -->
25
25
  ---
26
26
 
@@ -5,7 +5,7 @@ function getSkillBasePath(skillId) {
5
5
  }
6
6
  // --- Frontmatter (Codex: name + description only) ---
7
7
  function buildFrontmatter(name, description) {
8
- return ['---', `name: ${name}`, `description: ${description}`, '---'].join('\n');
8
+ return ['---', `name: ${JSON.stringify(name)}`, `description: ${JSON.stringify(description)}`, '---'].join('\n');
9
9
  }
10
10
  // --- Stub for unknown skills ---
11
11
  function generateSkillStub(skillName) {
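The switch to `JSON.stringify` in this hunk quotes both frontmatter values. A short sketch of why that matters: an unquoted description containing `: ` or ` #` would be misread by a YAML parser, while a JSON string literal is generally also a valid YAML double-quoted scalar. The skill name and description below are made-up sample values.

```javascript
// Same body as the updated Codex buildFrontmatter in this hunk.
function buildFrontmatter(name, description) {
  return ['---', `name: ${JSON.stringify(name)}`, `description: ${JSON.stringify(description)}`, '---'].join('\n');
}

// Sample values chosen to include characters that are significant in YAML.
const frontmatter = buildFrontmatter('code-review', 'Use when: reviewing a diff # strict mode');
console.log(frontmatter);
// ---
// name: "code-review"
// description: "Use when: reviewing a diff # strict mode"
// ---
```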
@@ -1,7 +1,7 @@
1
1
  import { isCapabilityTraitName } from '../../registry/traits.js';
2
2
  import { isPolicyFragmentId } from '../../registry/policy-fragments.js';
3
3
  import { isInstructionFragmentId } from '../../registry/instruction-fragments.js';
4
- import { isEnvironmentId } from '../../registry/types.js';
4
+ import { isEnvironmentId } from '../../registry/environments.js';
5
5
  import { getManifestTargetEntries } from '../../manifest/targets.js';
6
6
  /**
7
7
  * Pre-normalization manifest-level registry checks.
@@ -66,5 +66,24 @@ export function checkTeamGraphEnhanced(team) {
66
66
  message: `Multiple root agents detected: ${roots.join(', ')} — consider a single orchestrator entry point`,
67
67
  });
68
68
  }
69
+ // HANDOFF_CAPABILITY_MISMATCH — delegating to an agent with no tools
70
+ for (const [agentId, agent] of agentEntries) {
71
+ for (const target of agent.metadata?.handoffs ?? []) {
72
+ const targetAgent = team.agents[target];
73
+ if (!targetAgent)
74
+ continue; // already caught by checkHandoffGraph
75
+ const targetTools = targetAgent.runtime.tools ?? [];
76
+ if (targetTools.length === 0) {
77
+ results.push({
78
+ severity: 'warning',
79
+ category: 'Team graph',
80
+ code: 'HANDOFF_CAPABILITY_MISMATCH',
81
+ phase: 'team-graph',
82
+ message: `Agent "${agentId}" hands off to "${target}" but "${target}" has no capabilities — delegation may be ineffective`,
83
+ agent: agentId,
84
+ });
85
+ }
86
+ }
87
+ }
69
88
  return results;
70
89
  }
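The new `HANDOFF_CAPABILITY_MISMATCH` check can be exercised on a small team object. The sketch below reimplements only the loop from this hunk as a standalone function. `findToollessHandoffs` is a hypothetical name, the assumption that `agentEntries` is `Object.entries(team.agents)` is not confirmed by the diff, and the agent and tool names are illustrative.

```javascript
// Hypothetical standalone version of the handoff check added in this hunk.
function findToollessHandoffs(team) {
  const warnings = [];
  // Assumption: agentEntries in the original is Object.entries(team.agents).
  for (const [agentId, agent] of Object.entries(team.agents)) {
    for (const target of agent.metadata?.handoffs ?? []) {
      const targetAgent = team.agents[target];
      if (!targetAgent) continue; // unknown targets are reported by a different check
      if ((targetAgent.runtime.tools ?? []).length === 0) {
        warnings.push({ code: 'HANDOFF_CAPABILITY_MISMATCH', agent: agentId, target });
      }
    }
  }
  return warnings;
}

// Illustrative team: "scribe" has no tools, "ghost" is not defined at all.
const team = {
  agents: {
    orchestrator: { metadata: { handoffs: ['planner', 'scribe', 'ghost'] }, runtime: { tools: ['Task'] } },
    planner: { runtime: { tools: ['Read', 'Grep'] } },
    scribe: { runtime: { tools: [] } },
  },
};

const warnings = findToollessHandoffs(team);
console.log(warnings);
```

Only the handoff to `scribe` is flagged. The undefined `ghost` target is skipped here because the original comment says `checkHandoffGraph` already reports it.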
@@ -15,8 +15,8 @@ import { checkSkillRequirements } from './checks/skill-requirements.js';
15
15
  import { checkMcpServers } from './checks/mcp.js';
16
16
  import { checkTeamGraphEnhanced } from './checks/team-graph-enhanced.js';
17
17
  const CHECKERS = (skillMap, targetName) => [
18
- checkRegistryReferences,
19
- checkEnvironments,
18
+ checkRegistryReferences, // Phase 1
19
+ checkEnvironments, // Phase 9
20
20
  checkTraitCapabilities, // Phase 2
21
21
  (team) => checkCapabilityTools(team, skillMap), // Phase 3
22
22
  (team) => checkHandoffGraph(team, skillMap),
@@ -76,7 +76,8 @@ export async function runWizard(options) {
76
76
  printCommandSuccess(`Agent team initialized for project "${rawManifest.project.name}"`);
77
77
  printManifestValidation(validation);
78
78
  printNextSteps([
79
+ `Open ${chalk.bold('teamcast.yaml')} and fill in agent instructions based on ${chalk.yellow('// TODO')} comments`,
79
80
  `${chalk.bold('teamcast explain')} - view the team structure`,
80
- `Edit ${chalk.bold('teamcast.yaml')} and run ${chalk.bold('teamcast generate')} to apply changes`,
81
+ `Run ${chalk.bold('teamcast generate')} to apply your changes`,
81
82
  ]);
82
83
  }
@@ -1,6 +1,6 @@
1
1
  import chalk from 'chalk';
2
2
  import { detectEnvironments, listEnvironments } from '../../registry/environments.js';
3
- import { isEnvironmentId } from '../../registry/types.js';
3
+ import { isEnvironmentId } from '../../registry/environments.js';
4
4
  import { promptCheckbox } from '../../utils/prompts.js';
5
5
  function mergeEnvironmentIds(...lists) {
6
6
  return [...new Set(lists.flatMap((list) => list ?? []))].filter(isEnvironmentId);
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "teamcast",
3
- "version": "1.0.3",
3
+ "version": "1.2.0",
4
4
  "description": "YAML-driven CLI to design, validate, and generate multi-target agent teams for Claude Code and Codex",
5
5
  "type": "module",
6
6
  "bin": {
@@ -0,0 +1,36 @@
1
+ id: docker
2
+ description: "Docker environment, auto-detected via Dockerfile"
3
+ detect_files:
4
+   - Dockerfile
5
+   - docker-compose.yml
6
+   - docker-compose.yaml
7
+   - compose.yml
8
+   - compose.yaml
9
+ policy_rules:
10
+   sandbox:
11
+     enabled: true
12
+   allow:
13
+     - "Bash(docker build *)"
14
+     - "Bash(docker run *)"
15
+     - "Bash(docker compose *)"
16
+     - "Bash(docker-compose *)"
17
+     - "Bash(docker ps *)"
18
+     - "Bash(docker logs *)"
19
+     - "Bash(docker images *)"
20
+ instruction_fragments:
21
+   docker_patterns:
22
+     content: |
23
+       This project uses Docker.
24
+       Use multi-stage builds to minimize image size.
25
+       Prefer alpine or slim base images where practical.
26
+       Use .dockerignore to exclude unnecessary files from build context.
27
+     requires_capabilities:
28
+       - read_files
29
+   docker_development:
30
+     content: |
31
+       Build images with `docker build -t <tag> .`.
32
+       Use `docker compose up` for multi-container setups.
33
+       Pin base image versions for reproducible builds.
34
+       Order Dockerfile instructions to maximize layer caching (dependencies before source code).
35
+     requires_capabilities:
36
+       - write_files
@@ -0,0 +1,40 @@
1
+ id: go
2
+ description: "Go environment, auto-detected via go.mod"
3
+ detect_files:
4
+ - go.mod
5
+ policy_rules:
6
+ sandbox:
7
+ enabled: true
8
+ allow:
9
+ - "Bash(go build *)"
10
+ - "Bash(go test *)"
11
+ - "Bash(go run *)"
12
+ - "Bash(go vet *)"
13
+ - "Bash(go mod *)"
14
+ - "Bash(go generate *)"
15
+ instruction_fragments:
16
+ go_code_patterns:
17
+ content: |
18
+ This is a Go project.
19
+ Follow standard Go conventions and idiomatic patterns.
20
+ Use gofmt/goimports for formatting.
21
+ Prefer short variable names in small scopes, descriptive names in larger scopes.
22
+ requires_capabilities:
23
+ - read_files
24
+ go_development:
25
+ content: |
26
+ Use `go build ./...` to compile all packages.
27
+ Use `go mod tidy` to clean up dependencies.
28
+ Handle errors explicitly — do not ignore returned errors.
29
+ Prefer returning errors over panicking.
30
+ requires_capabilities:
31
+ - write_files
32
+ go_testing:
33
+ content: |
34
+ Run tests with `go test ./...`.
35
+ Run a specific test: `go test -run TestName ./path/to/package`.
36
+ Use table-driven tests for multiple cases.
37
+ Always run tests after changes to verify nothing broke.
38
+ requires_capabilities:
39
+ - execute
40
+ - write_files
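The `allow` entries above are written as `Bash(<pattern>)` rules. As a rough illustration of how such rules could gate a command, the sketch below treats the text inside `Bash(...)` as a shell-style glob. That glob reading is an assumption: the real matching is done by the target runtime and may differ, for example in how it treats a bare `go test` with no arguments. `is_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

# Allow rules copied from the go environment file in this diff.
GO_ALLOW = [
    "Bash(go build *)",
    "Bash(go test *)",
    "Bash(go run *)",
    "Bash(go vet *)",
    "Bash(go mod *)",
    "Bash(go generate *)",
]


def is_allowed(command: str, allow_rules: list[str]) -> bool:
    """Return True if command matches the pattern of any Bash(...) rule.

    ASSUMPTION: the pattern is a case-sensitive glob in which '*' matches
    any run of characters. This is not the runtime's actual matcher.
    """
    for rule in allow_rules:
        if not (rule.startswith("Bash(") and rule.endswith(")")):
            continue  # ignore rules for other tools
        pattern = rule[len("Bash("):-1]
        if fnmatchcase(command, pattern):
            return True
    return False
```

With this reading, `go test ./...` is allowed and `go get example.com/x` is not. A bare `go test` is also rejected, because the pattern `go test *` requires a space after `test`.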
@@ -0,0 +1,41 @@
1
+ id: java
2
+ description: "Java environment, auto-detected via pom.xml, build.gradle, or build.gradle.kts"
3
+ detect_files:
4
+ - pom.xml
5
+ - build.gradle
6
+ - build.gradle.kts
7
+ policy_rules:
8
+ sandbox:
9
+ enabled: true
10
+ allow:
11
+ - "Bash(mvn *)"
12
+ - "Bash(gradle *)"
13
+ - "Bash(./gradlew *)"
14
+ - "Bash(java *)"
15
+ - "Bash(javac *)"
16
+ instruction_fragments:
17
+ java_code_patterns:
18
+ content: |
19
+ This is a Java project.
20
+ Follow standard Java naming conventions (camelCase for methods, PascalCase for classes).
21
+ Use appropriate access modifiers.
22
+ Prefer composition over inheritance where practical.
23
+ requires_capabilities:
24
+ - read_files
25
+ java_development:
26
+ content: |
27
+ If using Maven: `mvn compile` to build, `mvn package` to create artifacts.
28
+ If using Gradle: `./gradlew build` or `gradle build`.
29
+ Handle exceptions with specific types, not bare catch blocks.
30
+ Use try-with-resources for AutoCloseable resources.
31
+ requires_capabilities:
32
+ - write_files
33
+ java_testing:
34
+ content: |
35
+ Run tests with `mvn test` (Maven) or `./gradlew test` (Gradle).
36
+ Run a specific test: `mvn -Dtest=TestClassName test` or `./gradlew test --tests TestClassName`.
37
+ Use JUnit 5 annotations. Follow existing test patterns in the project.
38
+ Always run tests after changes to verify nothing broke.
39
+ requires_capabilities:
40
+ - execute
41
+ - write_files
@@ -0,0 +1,39 @@
1
+ id: node
2
+ description: "Node.js environment, auto-detected via package.json"
3
+ detect_files:
4
+ - package.json
5
+ policy_rules:
6
+ sandbox:
7
+ enabled: true
8
+ allow:
9
+ - "Bash(npm run *)"
10
+ - "Bash(npm test *)"
11
+ - "Bash(npx *)"
12
+ - "Bash(npm install)"
13
+ - "Bash(node *)"
14
+ instruction_fragments:
15
+ node_code_patterns:
16
+ content: |
17
+ This is a Node.js project.
18
+ Use ES module syntax (import/export). All relative imports must use .js extensions.
19
+ Prefer named exports over default exports.
20
+ Use TypeScript strict mode when tsconfig.json is present.
21
+ requires_capabilities:
22
+ - read_files
23
+ node_development:
24
+ content: |
25
+ Install dependencies with `npm install`.
26
+ Use `npm run <script>` to execute package.json scripts.
27
+ Prefer async/await over raw Promises or callbacks.
28
+ Handle errors at system boundaries. Use typed error classes where the project defines them.
29
+ requires_capabilities:
30
+ - write_files
31
+ node_testing:
32
+ content: |
33
+ Run tests with `npm test`.
34
+ Run a specific test file with `npx vitest run <path>` (vitest) or `npx jest <path>` (jest).
35
+ Always run tests after making changes to verify nothing broke.
36
+ Follow existing test patterns: check the tests/ directory for conventions before writing new tests.
37
+ requires_capabilities:
38
+ - execute
39
+ - write_files
@@ -0,0 +1,41 @@
1
+ id: python
2
+ description: "Python environment, auto-detected via pyproject.toml, requirements.txt, or setup.py"
3
+ detect_files:
4
+ - pyproject.toml
5
+ - requirements.txt
6
+ - setup.py
7
+ policy_rules:
8
+ sandbox:
9
+ enabled: true
10
+ allow:
11
+ - "Bash(pytest *)"
12
+ - "Bash(python -m pytest *)"
13
+ - "Bash(uv run *)"
14
+ - "Bash(poetry run *)"
15
+ - "Bash(python *)"
16
+ instruction_fragments:
17
+ python_code_patterns:
18
+ content: |
19
+ This is a Python project.
20
+ Follow PEP 8 style conventions.
21
+ Use type hints for function signatures and class attributes.
22
+ Prefer pathlib.Path over os.path for file operations.
23
+ requires_capabilities:
24
+ - read_files
25
+ python_development:
26
+ content: |
27
+ If using poetry: `poetry install` and `poetry run <cmd>`. If using uv: `uv sync` and `uv run <cmd>`.
28
+ Otherwise use pip and virtualenv.
29
+ Use structured logging (logging module) instead of print statements.
30
+ Handle exceptions with specific types, not bare except clauses.
31
+ requires_capabilities:
32
+ - write_files
33
+ python_testing:
34
+ content: |
35
+ Run tests with `pytest`. If using poetry or uv, prefix with `poetry run` or `uv run`.
36
+ Run a specific test: `pytest <path>::<test_name>`.
37
+ Always run tests after changes. Follow existing test patterns in the tests/ directory.
38
+ Use fixtures for shared setup. Prefer parametrize for similar test cases.
39
+ requires_capabilities:
40
+ - execute
41
+ - write_files
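Each instruction fragment above declares `requires_capabilities`. One plausible reading is that a fragment is attached to an agent only when the agent holds every listed capability. The sketch below illustrates that reading. The "all of" semantics and the `select_fragments` helper are assumptions, not behavior confirmed by this diff.

```python
# Fragment names and required capabilities copied from the python
# environment file in this diff. The dict layout itself is hypothetical.
PYTHON_FRAGMENTS = {
    "python_code_patterns": {"read_files"},
    "python_development": {"write_files"},
    "python_testing": {"execute", "write_files"},
}


def select_fragments(
    agent_capabilities: set[str],
    fragments: dict[str, set[str]],
) -> list[str]:
    """Return the fragment names whose requirements the agent fully satisfies.

    ASSUMPTION: every capability in requires_capabilities must be present.
    Declaration order is preserved.
    """
    return [
        name
        for name, required in fragments.items()
        if required <= agent_capabilities  # subset test: all required present
    ]
```

Under this assumption, a read-only agent would receive only `python_code_patterns`, while an agent with `read_files`, `write_files`, and `execute` would receive all three fragments.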
@@ -0,0 +1,39 @@
1
+ id: ruby
2
+ description: "Ruby environment, auto-detected via Gemfile"
3
+ detect_files:
4
+ - Gemfile
5
+ policy_rules:
6
+ sandbox:
7
+ enabled: true
8
+ allow:
9
+ - "Bash(bundle *)"
10
+ - "Bash(rake *)"
11
+ - "Bash(rspec *)"
12
+ - "Bash(ruby *)"
13
+ - "Bash(rails *)"
14
+ instruction_fragments:
15
+ ruby_code_patterns:
16
+ content: |
17
+ This is a Ruby project.
18
+ Follow Ruby style conventions (snake_case for methods/variables, PascalCase for classes).
19
+ Use frozen string literal comments where the project follows that convention.
20
+ Prefer blocks and enumerators over manual loops.
21
+ requires_capabilities:
22
+ - read_files
23
+ ruby_development:
24
+ content: |
25
+ Install dependencies with `bundle install`.
26
+ Use `bundle exec` to run commands in the context of the bundle.
27
+ Prefer keyword arguments for methods with multiple optional parameters.
28
+ Handle errors with specific exception classes.
29
+ requires_capabilities:
30
+ - write_files
31
+ ruby_testing:
32
+ content: |
33
+ Run tests with `bundle exec rspec` (RSpec) or `bundle exec rake test` (Minitest).
34
+ Run a specific test: `bundle exec rspec path/to/spec.rb:LINE`.
35
+ Follow existing test patterns. Use shared examples for reusable test behaviors.
36
+ Always run tests after changes to verify nothing broke.
37
+ requires_capabilities:
38
+ - execute
39
+ - write_files
@@ -0,0 +1,41 @@
1
+ id: rust
2
+ description: "Rust environment, auto-detected via Cargo.toml"
3
+ detect_files:
4
+ - Cargo.toml
5
+ policy_rules:
6
+ sandbox:
7
+ enabled: true
8
+ allow:
9
+ - "Bash(cargo build *)"
10
+ - "Bash(cargo test *)"
11
+ - "Bash(cargo run *)"
12
+ - "Bash(cargo clippy *)"
13
+ - "Bash(cargo fmt *)"
14
+ - "Bash(cargo check *)"
15
+ - "Bash(rustfmt *)"
16
+ instruction_fragments:
17
+ rust_code_patterns:
18
+ content: |
19
+ This is a Rust project.
20
+ Follow Rust idioms: prefer ownership over borrowing when practical.
21
+ Use clippy lints to catch common mistakes.
22
+ Prefer Result/Option over panicking.
23
+ requires_capabilities:
24
+ - read_files
25
+ rust_development:
26
+ content: |
27
+ Use `cargo build` to compile the project.
28
+ Use `cargo check` for fast feedback without full compilation.
29
+ Run `cargo clippy` before committing to catch lint issues.
30
+ Use `cargo fmt` to format code consistently.
31
+ requires_capabilities:
32
+ - write_files
33
+ rust_testing:
34
+ content: |
35
+ Run tests with `cargo test`.
36
+ Run a specific test: `cargo test test_name`.
37
+ Use `#[cfg(test)]` modules for unit tests within source files.
38
+ Always run tests after changes to verify nothing broke.
39
+ requires_capabilities:
40
+ - execute
41
+ - write_files
@@ -0,0 +1,31 @@
1
+ id: terraform
2
+ description: "Terraform environment, auto-detected via main.tf or terraform.tf"
3
+ detect_files:
4
+ - main.tf
5
+ - terraform.tf
6
+ policy_rules:
7
+ sandbox:
8
+ enabled: true
9
+ allow:
10
+ - "Bash(terraform init *)"
11
+ - "Bash(terraform plan *)"
12
+ - "Bash(terraform validate *)"
13
+ - "Bash(terraform fmt *)"
14
+ - "Bash(terraform state *)"
15
+ instruction_fragments:
16
+ terraform_patterns:
17
+ content: |
18
+ This project uses Terraform for infrastructure as code.
19
+ Follow HCL conventions: use snake_case for resource names and variables.
20
+ Organize configuration into logical files (main.tf, variables.tf, outputs.tf).
21
+ Use modules for reusable infrastructure components.
22
+ requires_capabilities:
23
+ - read_files
24
+ terraform_development:
25
+ content: |
26
+ Run `terraform init` to initialize providers and modules.
27
+ Run `terraform plan` to preview changes before applying.
28
+ Run `terraform validate` and `terraform fmt` before committing.
29
+ Never apply changes without reviewing the plan first.
30
+ requires_capabilities:
31
+ - write_files
@@ -27,15 +27,14 @@ claude:
27
27
  - planner
28
28
  - developer
29
29
  - reviewer
30
- description: Coordinates the team. Analyzes tasks, decomposes them into
31
- subtasks, and delegates to the right specialist. Never writes code
32
- directly.
30
+ description: Tech lead. Handles simple tasks directly, delegates complex
31
+ work to the right specialist. Triages by task size before acting.
33
32
  model: opus
34
33
  capability_traits:
35
34
  - base-read
35
+ - file-authoring
36
+ - command-execution
36
37
  - delegation
37
- - no-file-edits
38
- - no-commands
39
38
  - no-web
40
39
  skills:
41
40
  - triage
@@ -44,12 +43,7 @@ claude:
44
43
  instruction_blocks:
45
44
  - kind: behavior
46
45
  content: |
47
- You are the team coordinator. Triage every incoming task before acting:
48
- - Bug report or regression -> delegate to planner for root-cause analysis, then developer to fix
49
- - New feature or enhancement -> delegate to planner for design, then developer to implement, then reviewer to sign off
50
- - Refactor or cleanup -> delegate directly to developer, then reviewer
51
- Never write or edit code yourself. Your output is always a delegation message or a final summary.
52
- When delegating, state the goal, the relevant files, and the expected output format.
46
+ You are the tech lead. Handle simple tasks directly. Delegate complex work to the right specialist.
53
47
  instruction_fragments:
54
48
  - feature-orchestrator-workflow
55
49
  - feature-orchestrator-output
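The hunks above swap `no-file-edits` and `no-commands` for `file-authoring` and `command-execution` on the orchestrator. The sketch below shows how a manifest check could flag an agent that lists both a grant and its matching restriction. The conflict pairs are inferred from the trait names only, so they are an assumption, and `find_trait_conflicts` is a hypothetical helper, not part of TeamCast.

```python
# ASSUMPTION: these grant/restriction pairs are mutually exclusive.
# The diff only shows them being swapped, not any rule forbidding both.
CONFLICTING_TRAITS = [
    ("file-authoring", "no-file-edits"),
    ("command-execution", "no-commands"),
]


def find_trait_conflicts(traits: list[str]) -> list[tuple[str, str]]:
    """Return every (grant, restriction) pair that appears together in traits."""
    present = set(traits)
    return [
        (grant, restriction)
        for grant, restriction in CONFLICTING_TRAITS
        if grant in present and restriction in present
    ]


# Orchestrator traits as they appear after this change.
NEW_ORCHESTRATOR_TRAITS = [
    "base-read",
    "file-authoring",
    "command-execution",
    "delegation",
    "no-web",
]
```

Under these assumed pairs, the updated orchestrator trait list produces no conflicts, whereas a list that kept `no-file-edits` alongside `file-authoring` would be reported.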
@@ -0,0 +1,127 @@
1
+ version: "2"
2
+ preset_meta:
3
+ tags:
4
+ - security
5
+ - testing
6
+ - adversarial
7
+ min_version: "2"
8
+ project:
9
+ name: placeholder
10
+ preset: red-blue
11
+ description: Adversarial hardening team. Red agent attacks with failing tests, blue agent defends with fixes, judge decides when to ship.
12
+ claude:
13
+ policies:
14
+ fragments:
15
+ - allow-git-read
16
+ - allow-git-write
17
+ - ask-git-push
18
+ - deny-destructive-shell
19
+ - deny-network-downloads
20
+ - deny-dynamic-exec
21
+ - deny-env-files
22
+ - sandbox-default
23
+ agents:
24
+ orchestrator:
25
+ forge:
26
+ handoffs:
27
+ - red-agent
28
+ - blue-agent
29
+ - judge
30
+ description: Manages adversarial rounds. Routes between red, blue, and judge.
31
+ Ships when judge approves or max rounds reached.
32
+ model: opus
33
+ capability_traits:
34
+ - base-read
35
+ - delegation
36
+ - no-file-edits
37
+ - no-commands
38
+ - no-web
39
+ max_turns: 30
40
+ instruction_blocks:
41
+ - kind: behavior
42
+ content: |
43
+ You are the round manager. Run adversarial rounds between red-agent and blue-agent.
44
+ Track the current round number (start at 1, maximum 3).
45
+ Round flow: red-agent -> blue-agent -> judge.
46
+ If judge returns SHIP or round >= 3: deliver final report and stop.
47
+ If judge returns ROUND N+1: increment round, pass the judge hint to red-agent.
48
+ Never write or modify code yourself.
49
+ - kind: delegation
50
+ content: |
51
+ When delegating to red-agent: include the target scope and any judge hint from the previous round.
52
+ When delegating to blue-agent: include red's attack report and the list of failing test files.
53
+ When delegating to judge: include both red's attack report and blue's fix report.
54
+ red-agent:
55
+ description: Attacker. Finds weaknesses and writes failing tests. Never
56
+ modifies production code.
57
+ model: sonnet
58
+ capability_traits:
59
+ - base-read
60
+ - file-authoring
61
+ - command-execution
62
+ - no-web
63
+ skills:
64
+ - security-check
65
+ - test-first
66
+ instruction_blocks:
67
+ - kind: behavior
68
+ content: |
69
+ You are the attacker. Your goal is to break the code through tests.
70
+ Read the target code carefully. Find: edge cases, invalid inputs, null paths,
71
+ boundary conditions, type coercion issues, missing error handling, race conditions.
72
+ Write tests that expose these weaknesses. Run them — confirm they FAIL before reporting.
73
+ If the judge gave you a hint for this round, focus your attack on that angle.
74
+ - kind: safety
75
+ content: |
76
+ Write only test files. Never edit, create, or delete production source files.
77
+ Each test must have a clear name describing what weakness it exposes.
78
+ Only include tests that actually fail in your report.
79
+ blue-agent:
80
+ description: Defender. Makes red's failing tests pass without deleting or
81
+ weakening them.
82
+ model: sonnet
83
+ capability_traits:
84
+ - base-read
85
+ - file-authoring
86
+ - command-execution
87
+ - no-web
88
+ skills:
89
+ - clean-code
90
+ - secure-coding
91
+ instruction_blocks:
92
+ - kind: behavior
93
+ content: |
94
+ You are the defender. Make every failing test from red-agent pass.
95
+ Fix the root cause — do not delete, skip, or weaken any test.
96
+ Run the full test suite after your fixes. All tests must be green before reporting.
97
+ Keep changes minimal and focused. Do not refactor beyond what is needed to pass the tests.
98
+ - kind: safety
99
+ content: |
100
+ Never delete, skip (.skip), or modify red's test files.
101
+ If a test appears wrong, flag it in your report — do not remove it.
102
+ judge:
103
+ description: Evaluates blue's fixes. Returns SHIP if solid, or ROUND N+1
104
+ with a new attack hint for red.
105
+ model: sonnet
106
+ capability_traits:
107
+ - base-read
108
+ - command-execution
109
+ - no-file-edits
110
+ - no-web
111
+ skills:
112
+ - code-review
113
+ - security-check
114
+ instruction_blocks:
115
+ - kind: behavior
116
+ content: |
117
+ You are the judge. Read red's attack report and blue's fix report.
118
+ Evaluate: did blue fix the root cause, or just suppress the symptom?
119
+ Look for new attack surfaces introduced by blue's changes.
120
+ Check for: try/catch that swallows errors, conditions that only handle
121
+ the tested input, hardcoded values that mask the real problem.
122
+ - kind: style
123
+ content: |
124
+ Return exactly one of:
125
+ SHIP — fixes are solid, no new surfaces, ready to merge.
126
+ ROUND N+1: <specific hint> — what angle red should try next.
127
+ Be concrete in hints (e.g. "try concurrent calls to X", "pass null for Y").
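The round flow described in the orchestrator and judge instructions (red, then blue, then judge, for at most 3 rounds, ending on SHIP) can be sketched as a small loop. This only illustrates the protocol as written in the preset. In the actual preset the agents are LLM-driven and the orchestrator follows prose instructions rather than code. `parse_verdict`, `run_rounds`, and the callable signatures are hypothetical.

```python
import re
from typing import Callable, Optional

MAX_ROUNDS = 3  # "start at 1, maximum 3" in the orchestrator instructions


def parse_verdict(text: str) -> tuple[str, Optional[str]]:
    """Parse a judge reply into ("SHIP", None) or ("ROUND", hint).

    ASSUMPTION: the reply starts with SHIP or with "ROUND <n>: <hint>".
    """
    stripped = text.strip()
    if stripped.startswith("SHIP"):
        return "SHIP", None
    match = re.match(r"ROUND\s+\S+:\s*(.+)", stripped, re.DOTALL)
    if match:
        return "ROUND", match.group(1).strip()
    raise ValueError(f"unrecognized judge verdict: {text!r}")


def run_rounds(
    red: Callable[[Optional[str]], str],
    blue: Callable[[str], str],
    judge: Callable[[str, str], str],
) -> tuple[str, int]:
    """Run red -> blue -> judge until the judge ships or rounds run out.

    Returns (outcome, rounds_used), where outcome is "SHIP" or "MAX_ROUNDS".
    """
    hint: Optional[str] = None
    for round_number in range(1, MAX_ROUNDS + 1):
        attack_report = red(hint)           # red gets the previous judge hint
        fix_report = blue(attack_report)    # blue gets red's attack report
        verdict, hint = parse_verdict(judge(attack_report, fix_report))
        if verdict == "SHIP":
            return "SHIP", round_number
    return "MAX_ROUNDS", MAX_ROUNDS
```

Passing plain callables for the three roles keeps the loop testable with stubs. A judge stub that never returns SHIP makes the loop stop after the third round, matching the `round >= 3` stop condition in the preset.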