bmad-method-test-architecture-enterprise 1.4.0 → 1.5.0
This diff reflects the changes between publicly available package versions as they appear in their respective public registries, and is provided for informational purposes only.
- package/docs/explanation/subagent-architecture.md +115 -506
- package/docs/reference/configuration.md +53 -40
- package/package.json +1 -1
- package/release_notes.md +5 -7
- package/src/testarch/knowledge/ci-burn-in.md +42 -0
- package/src/workflows/testarch/ci/checklist.md +1 -0
- package/src/workflows/testarch/ci/github-actions-template.yaml +118 -0
- package/src/workflows/testarch/ci/steps-c/step-02-generate-pipeline.md +38 -0
- package/src/workflows/testarch/ci/steps-c/step-03-configure-quality-gates.md +23 -0
- package/src/workflows/testarch/ci/steps-v/step-01-validate.md +14 -0
- package/website/astro.config.mjs +0 -1
- package/docs/explanation/subagent-implementation-status.md +0 -327
@@ -1,580 +1,189 @@
 ---
 title: Subagent Architecture
-description:
+description: How TEA uses subagents and agent teams across workflows
 ---

-#
+# Subagents and Agent Teams in TEA

-
-
-**Status**: Implementation Guide
+This guide explains how TEA orchestrates work when a workflow can split into
+worker steps (independent workers or dependency-ordered work units).

-
-
-## Overview
-
-TEA workflows use **subagent patterns** to parallelize independent tasks, improving performance and maintaining clean separation of concerns. Five workflows benefit from this architecture:
-
-1. **automate** - Parallel test generation (API + E2E)
-2. **atdd** - Parallel failing test generation (API + E2E)
-3. **test-review** - Parallel quality dimension checks
-4. **nfr-assess** - Parallel NFR domain assessments
-5. **trace** - Two-phase workflow separation
-
----
-
-## Core Subagent Pattern
-
-### Architecture
-
-```
-Main Workflow (Orchestrator)
-├── Step 1: Setup & Context Loading
-├── Step 2: Launch Subagents
-│   ├── Subagent A → temp-file-a.json
-│   ├── Subagent B → temp-file-b.json
-│   ├── Subagent C → temp-file-c.json
-│   └── (All run in parallel, isolated 200k containers)
-└── Step 3: Aggregate Results
-    ├── Read all temp files
-    ├── Merge/synthesize outputs
-    └── Generate final artifact
-```
-
-### Key Principles
-
-1. **Independence**: Each subagent is completely independent (no shared state)
-2. **Isolation**: Each subagent runs in separate 200k context container
-3. **Output Format**: All subagents output structured JSON to temp files
-4. **Aggregation**: Main workflow reads temp files and synthesizes final output
-5. **Error Handling**: Each subagent reports success/failure in JSON output
-
----
-
-## Workflow-Specific Designs
-
-### 1. automate - Parallel Test Generation
-
-**Goal**: Generate API and E2E tests in parallel
-
-#### Architecture
+## Scope

-
-automate workflow
-├── Step 1: Analyze codebase & identify features
-├── Step 2: Load relevant knowledge fragments
-├── Step 3: Launch parallel test generation
-│   ├── Subagent A: Generate API tests → /tmp/api-tests-{timestamp}.json
-│   └── Subagent B: Generate E2E tests → /tmp/e2e-tests-{timestamp}.json
-├── Step 4: Aggregate tests
-│   ├── Read API tests JSON
-│   ├── Read E2E tests JSON
-│   └── Generate fixtures (if needed)
-├── Step 5: Verify all tests pass
-└── Step 6: Generate DoD summary
-```
-
-#### Subagent A: API Tests
-
-**Input** (passed via temp file):
-
-```json
-{
-  "features": ["feature1", "feature2"],
-  "knowledge_fragments": ["api-request", "data-factories"],
-  "config": {
-    "use_playwright_utils": true,
-    "framework": "playwright"
-  }
-}
-```
-
-**Output** (`/tmp/api-tests-{timestamp}.json`):
-
-```json
-{
-  "success": true,
-  "tests": [
-    {
-      "file": "tests/api/feature1.spec.ts",
-      "content": "import { test, expect } from '@playwright/test';\n...",
-      "description": "API tests for feature1"
-    }
-  ],
-  "fixtures": [],
-  "summary": "Generated 5 API test cases"
-}
-```
+This applies to these workflows:

-
+- `automate`
+- `atdd`
+- `test-review`
+- `nfr-assess`
+- `framework`
+- `ci`
+- `test-design`
+- `trace`

-
-
-```json
-{
-  "features": ["feature1", "feature2"],
-  "knowledge_fragments": ["fixture-architecture", "network-first"],
-  "config": {
-    "use_playwright_utils": true,
-    "framework": "playwright"
-  }
-}
-```
-
-**Output** (`/tmp/e2e-tests-{timestamp}.json`):
-
-```json
-{
-  "success": true,
-  "tests": [
-    {
-      "file": "tests/e2e/feature1.spec.ts",
-      "content": "import { test, expect } from '@playwright/test';\n...",
-      "description": "E2E tests for feature1 user journey"
-    }
-  ],
-  "fixtures": ["authFixture", "dataFixture"],
-  "summary": "Generated 8 E2E test cases"
-}
-```
-
-#### Step 4: Aggregation Logic
-
-```javascript
-// Read both subagent outputs
-const apiTests = JSON.parse(fs.readFileSync('/tmp/api-tests-{timestamp}.json', 'utf8'));
-const e2eTests = JSON.parse(fs.readFileSync('/tmp/e2e-tests-{timestamp}.json', 'utf8'));
-
-// Merge test suites
-const allTests = [...apiTests.tests, ...e2eTests.tests];
-
-// Collect unique fixtures
-const allFixtures = [...new Set([...apiTests.fixtures, ...e2eTests.fixtures])];
-
-// Generate combined DoD summary
-const summary = {
-  total_tests: allTests.length,
-  api_tests: apiTests.tests.length,
-  e2e_tests: e2eTests.tests.length,
-  fixtures: allFixtures,
-  status: apiTests.success && e2eTests.success ? 'PASS' : 'FAIL',
-};
-```
+It does not apply to `teach-me-testing`.

 ---

-
-
-**Goal**: Generate failing API and E2E tests in parallel (TDD red phase)
+## Core Model

-
+TEA orchestration has three parts:

-
-
-
-├── Step 2: Load relevant knowledge fragments
-├── Step 3: Launch parallel test generation
-│   ├── Subagent A: Generate failing API tests → /tmp/atdd-api-{timestamp}.json
-│   └── Subagent B: Generate failing E2E tests → /tmp/atdd-e2e-{timestamp}.json
-├── Step 4: Aggregate tests
-├── Step 5: Verify tests fail (red phase)
-└── Step 6: Output ATDD checklist
-```
-
-**Key Difference from automate**: Tests must be written to **fail** before implementation exists.
-
-#### Subagent Outputs
-
-Same JSON structure as automate, but:
+1. Resolve execution mode (`tea_execution_mode` + optional runtime probe)
+2. Dispatch worker steps (independent or dependency-ordered, depending on workflow)
+3. Aggregate worker outputs into one deterministic final artifact

-
-
+Workers are isolated and exchange data through structured outputs that the
+aggregation step validates.

 ---

-
+## Execution Modes

-
+TEA supports four modes:

-
+- `auto`
+- `agent-team`
+- `subagent`
+- `sequential`

-
-test-review workflow
-├── Step 1: Load test files & context
-├── Step 2: Launch parallel quality checks
-│   ├── Subagent A: Determinism check → /tmp/determinism-{timestamp}.json
-│   ├── Subagent B: Isolation check → /tmp/isolation-{timestamp}.json
-│   ├── Subagent C: Maintainability check → /tmp/maintainability-{timestamp}.json
-│   ├── Subagent D: Coverage check → /tmp/coverage-{timestamp}.json
-│   └── Subagent E: Performance check → /tmp/performance-{timestamp}.json
-└── Step 3: Aggregate findings
-    ├── Calculate weighted score (0-100)
-    ├── Synthesize violations
-    └── Generate review report with suggestions
-```
+### What Each Mode Means

-
-
-
-
-```json
-{
-  "dimension": "determinism",
-  "score": 85,
-  "max_score": 100,
-  "violations": [
-    {
-      "file": "tests/api/user.spec.ts",
-      "line": 42,
-      "severity": "HIGH",
-      "description": "Test uses Math.random() - non-deterministic",
-      "suggestion": "Use faker with fixed seed"
-    }
-  ],
-  "passed_checks": 12,
-  "failed_checks": 3,
-  "summary": "Tests are mostly deterministic with 3 violations"
-}
-```
+- `auto`: Choose the best supported mode at runtime.
+- `agent-team`: Prefer team/delegation orchestration when the runtime supports it.
+- `subagent`: Prefer isolated worker orchestration when the runtime supports it.
+- `sequential`: Run worker steps one by one.

-
-
-```javascript
-// Read all dimension outputs
-const dimensions = ['determinism', 'isolation', 'maintainability', 'coverage', 'performance'];
-const results = dimensions.map((d) => JSON.parse(fs.readFileSync(`/tmp/${d}-{timestamp}.json`, 'utf8')));
-
-// Calculate weighted score
-const weights = { determinism: 0.25, isolation: 0.25, maintainability: 0.2, coverage: 0.15, performance: 0.15 };
-const totalScore = results.reduce((sum, r) => sum + r.score * weights[r.dimension], 0);
-
-// Aggregate violations by severity
-const allViolations = results.flatMap((r) => r.violations);
-const highSeverity = allViolations.filter((v) => v.severity === 'HIGH');
-const mediumSeverity = allViolations.filter((v) => v.severity === 'MEDIUM');
-const lowSeverity = allViolations.filter((v) => v.severity === 'LOW');
-
-// Generate final report
-const report = {
-  overall_score: Math.round(totalScore),
-  grade: getGrade(totalScore), // A/B/C/D/F
-  dimensions: results,
-  violations_summary: {
-    high: highSeverity.length,
-    medium: mediumSeverity.length,
-    low: lowSeverity.length,
-    total: allViolations.length,
-  },
-  top_suggestions: prioritizeSuggestions(allViolations),
-};
-```
+### Fallback Behavior

-
+When `tea_capability_probe: true`, TEA can fall back safely:

-
+- `auto` falls back in order: `agent-team` -> `subagent` -> `sequential`
+- explicit `agent-team` or `subagent` falls back to the next supported mode
+- `sequential` always stays sequential

-
+When `tea_capability_probe: false`, TEA honors the requested mode strictly and
+fails if the runtime cannot execute it.

-
+### Runtime Scheduling

-
-
-├── Step 1: Load system context
-├── Step 2: Launch parallel NFR assessments
-│   ├── Subagent A: Security assessment → /tmp/nfr-security-{timestamp}.json
-│   ├── Subagent B: Performance assessment → /tmp/nfr-performance-{timestamp}.json
-│   ├── Subagent C: Reliability assessment → /tmp/nfr-reliability-{timestamp}.json
-│   └── Subagent D: Scalability assessment → /tmp/nfr-scalability-{timestamp}.json
-└── Step 3: Aggregate NFR report
-    ├── Synthesize domain assessments
-    ├── Identify cross-domain risks
-    └── Generate compliance documentation
-```
-
-#### Subagent Output Format
-
-Each NFR domain subagent outputs:
-
-```json
-{
-  "domain": "security",
-  "risk_level": "MEDIUM",
-  "findings": [
-    {
-      "category": "Authentication",
-      "status": "PASS",
-      "description": "OAuth2 with JWT tokens implemented",
-      "recommendations": []
-    },
-    {
-      "category": "Data Encryption",
-      "status": "CONCERN",
-      "description": "Database encryption at rest not enabled",
-      "recommendations": ["Enable database encryption", "Use AWS KMS for key management"]
-    }
-  ],
-  "compliance": {
-    "SOC2": "PARTIAL",
-    "GDPR": "PASS",
-    "HIPAA": "N/A"
-  },
-  "priority_actions": ["Enable database encryption within 30 days"]
-}
-```
-
-#### Step 3: Aggregation Logic
-
-```javascript
-// Read all NFR domain outputs
-const domains = ['security', 'performance', 'reliability', 'scalability'];
-const assessments = domains.map((d) => JSON.parse(fs.readFileSync(`/tmp/nfr-${d}-{timestamp}.json`, 'utf8')));
-
-// Calculate overall risk
-const riskLevels = { HIGH: 3, MEDIUM: 2, LOW: 1, NONE: 0 };
-const maxRiskLevel = Math.max(...assessments.map((a) => riskLevels[a.risk_level]));
-const overallRisk = Object.keys(riskLevels).find((k) => riskLevels[k] === maxRiskLevel);
-
-// Aggregate compliance status
-const allCompliance = assessments.flatMap((a) => Object.entries(a.compliance));
-const complianceSummary = {};
-allCompliance.forEach(([std, status]) => {
-  if (!complianceSummary[std]) complianceSummary[std] = [];
-  complianceSummary[std].push(status);
-});
-
-// Synthesize cross-domain risks
-const crossDomainRisks = identifyCrossDomainRisks(assessments); // e.g., "Performance + scalability concern"
-
-// Generate final report
-const report = {
-  overall_risk: overallRisk,
-  domains: assessments,
-  compliance_summary: complianceSummary,
-  cross_domain_risks: crossDomainRisks,
-  priority_actions: assessments.flatMap((a) => a.priority_actions),
-  executive_summary: generateExecutiveSummary(assessments),
-};
-```
+In `agent-team` and `subagent` modes, the runtime decides concurrency and timing.
+TEA does not impose its own parallel worker limit.

 ---

-
+## Verbal Override Rules

-
+During a run, explicit user phrasing can override config for that run only.

-
+Supported normalized terms:

-
-
-
-
-
-│   └── Step 3: Generate traceability matrix → /tmp/trace-matrix-{timestamp}.json
-└── Phase 2: Gate Decision (depends on Phase 1 output)
-    ├── Step 4: Read coverage matrix
-    ├── Step 5: Apply decision tree logic
-    ├── Step 6: Calculate coverage percentages
-    └── Step 7: Generate gate decision (PASS/CONCERNS/FAIL/WAIVED)
-```
+- `agent team` or `agent teams` -> `agent-team`
+- `agentteam` -> `agent-team`
+- `subagent`, `subagents`, `sub agent`, or `sub agents` -> `subagent`
+- `sequential` -> `sequential`
+- `auto` -> `auto`

-
-
-#### Phase 1 Output Format
-
-```json
-{
-  "requirements": [
-    {
-      "id": "REQ-001",
-      "description": "User can login with email/password",
-      "priority": "P0",
-      "tests": ["tests/auth/login.spec.ts::should login with valid credentials"],
-      "coverage": "FULL"
-    },
-    {
-      "id": "REQ-002",
-      "description": "User can reset password",
-      "priority": "P1",
-      "tests": [],
-      "coverage": "NONE"
-    }
-  ],
-  "total_requirements": 50,
-  "covered_requirements": 42,
-  "coverage_percentage": 84
-}
-```
+Resolution precedence:

-
-
-
-// Read Phase 1 output
-const matrix = JSON.parse(fs.readFileSync('/tmp/trace-matrix-{timestamp}.json', 'utf8'));
-
-// Apply decision tree
-const p0Coverage = matrix.requirements.filter((r) => r.priority === 'P0' && r.coverage === 'FULL').length;
-const totalP0 = matrix.requirements.filter((r) => r.priority === 'P0').length;
-
-let gateDecision;
-if (p0Coverage === totalP0 && matrix.coverage_percentage >= 90) {
-  gateDecision = 'PASS';
-} else if (p0Coverage === totalP0 && matrix.coverage_percentage >= 75) {
-  gateDecision = 'CONCERNS';
-} else if (p0Coverage < totalP0) {
-  gateDecision = 'FAIL';
-} else {
-  gateDecision = 'WAIVED'; // Manual review required
-}
-
-// Generate gate report
-const report = {
-  decision: gateDecision,
-  coverage_matrix: matrix,
-  p0_coverage: `${p0Coverage}/${totalP0}`,
-  overall_coverage: `${matrix.coverage_percentage}%`,
-  recommendations: generateRecommendations(matrix, gateDecision),
-  uncovered_requirements: matrix.requirements.filter((r) => r.coverage === 'NONE'),
-};
-```
+1. Explicit run-level request (if present)
+2. `tea_execution_mode` in config
+3. Runtime fallback (when probing is enabled)

 ---

-##
+## Workflow Coverage Map

-###
+### `automate`

-
+- Worker split: API + E2E/backend test generation workers
+- Aggregation: merges generated tests, fixtures, and summary stats
+- Mode effect: changes orchestration style only, not output contract

-
-/tmp/{workflow}-{subagent-name}-{timestamp}.json
-```
+### `atdd`

-
+- Worker split: failing API + failing E2E test generation workers
+- Aggregation: validates red-phase output and merges artifacts
+- Mode effect: changes orchestration style only, not red-phase requirements

-
-- `/tmp/test-review-determinism-20260127-143022.json`
-- `/tmp/nfr-security-20260127-143022.json`
+### `test-review`

-
+- Worker split: quality-dimension evaluations (determinism, isolation,
+  maintainability, performance)
+- Aggregation: computes combined quality score/report
+- Mode effect: changes orchestration style only, not scoring schema

--
-- Keep temp files on error for debugging
-- Implement retry logic for temp file reads (race conditions)
+### `nfr-assess`

-
+- Worker split: security, performance, reliability, scalability assessments
+- Aggregation: computes overall risk, compliance summary, priority actions
+- Mode effect: changes orchestration style only, not report schema

-
-
-```json
-{
-  "success": true|false,
-  "error": "Error message if failed",
-  "data": { ... }
-}
-```
+### `framework`

-
+- Worker split: scaffold work units (structure/config, fixtures, samples)
+- Aggregation: consolidates generated framework setup outputs
+- Mode effect: changes orchestration style only

-
-2. If any subagent failed, aggregate error messages
-3. Decide whether to continue (partial success) or fail (critical subagent failed)
+### `ci`

-
+- Worker split: orchestration-capable mode resolution for pipeline generation
+- Aggregation: deterministic single pipeline artifact
+- Mode effect: mostly impacts orchestration policy; final pipeline contract is
+  unchanged

-
+### `test-design`

--
--
--
+- Worker split: orchestration-capable mode resolution for output generation
+- Aggregation: deterministic design artifact output
+- Mode effect: orchestration policy only; output schema unchanged

-
+### `trace`

--
--
--
-
-- Implement proper synchronization (wait for all subagents to complete)
+- Worker split: phase/work-unit separation with dependency ordering
+- Aggregation: merges gap analysis + coverage/gate data
+- Mode effect: orchestration policy only; final decision/report contract
+  unchanged

 ---

-##
-
-### Test Checklist
-
-For each workflow with subagents:
-
-- [ ] **Unit Test**: Test each subagent in isolation
-  - Provide mock input JSON
-  - Verify output JSON structure
-  - Test error scenarios
-
-- [ ] **Integration Test**: Test full workflow
-  - Launch all subagents
-  - Verify parallel execution
-  - Verify aggregation logic
-  - Test with real project data
-
-- [ ] **Performance Test**: Measure speedup
-  - Benchmark sequential vs parallel
-  - Measure subagent overhead
-  - Verify memory usage acceptable
+## Design Guarantees

-
-- One subagent fails
-- Multiple subagents fail
-- Temp file read/write errors
-- Timeout scenarios
+TEA maintains these guarantees across all modes:

-
+- Same output schema for a given workflow
+- Same validation and aggregation rules
+- Same deterministic fallback semantics
+- Same failure behavior for missing/invalid worker outputs

-
-
-- Sequential: ~5-10 minutes (API then E2E)
-- Parallel: ~3-6 minutes (both at once)
-- **Speedup: ~40-50%**
-
-**test-review**:
-
-- Sequential: ~3-5 minutes (5 quality checks)
-- Parallel: ~1-2 minutes (all checks at once)
-- **Speedup: ~60-70%**
-
-**nfr-assess**:
-
-- Sequential: ~8-12 minutes (4 NFR domains)
-- Parallel: ~3-5 minutes (all domains at once)
-- **Speedup: ~60-70%**
+Mode selection changes orchestration behavior, not artifact contracts.

 ---

-##
+## Practical Guidance

-
+Recommended defaults:

-
-
-
-
-
----
+```yaml
+tea_execution_mode: 'auto'
+tea_capability_probe: true
+```

-
+Use `sequential` when you need strict single-threaded execution or debugging
+clarity.

-
-
-3. **Progress Reporting**: Real-time progress updates from each subagent
-4. **Caching**: Cache subagent outputs for identical inputs (idempotent operations)
-5. **Distributed Execution**: Run subagents on different machines for massive parallelization
+Use explicit `agent-team` or `subagent` only when you intentionally want that
+mode and understand runtime support in your environment.

 ---

-##
+## Troubleshooting Signals

-
-- Runtime-specific agent/subagent documentation (Codex, Claude Code, etc.)
-- TEA Workflow validation reports (proof of 100% compliance)
+Common causes of orchestration confusion:

-
+- Explicit run-level override text was provided and took precedence over config
+- The runtime did not support the requested mode and fallback changed the final mode
+- Probe disabled (`tea_capability_probe: false`) with an unsupported explicit mode

-
-
+Check the resolved mode logs in the workflow execution report to confirm what mode
+actually ran.