valent-pipeline 0.2.20 → 0.2.21
- package/README.md +438 -0
- package/package.json +1 -1
- package/pipeline/agents-manifest.yaml +61 -1
- package/pipeline/docs/agent-reference.md +82 -23
- package/pipeline/docs/design/refactor-checklist.md +111 -0
- package/pipeline/docs/index.md +60 -0
- package/pipeline/docs/pipeline-overview.md +4 -0
- package/pipeline/prompts/bend.md +5 -11
- package/pipeline/prompts/critic.md +9 -0
- package/pipeline/prompts/data.md +59 -0
- package/pipeline/prompts/docgen.md +61 -0
- package/pipeline/prompts/fend.md +3 -10
- package/pipeline/prompts/iac.md +70 -0
- package/pipeline/prompts/lead.md +81 -3
- package/pipeline/prompts/libdev.md +61 -0
- package/pipeline/prompts/mcp-dev.md +59 -0
- package/pipeline/prompts/mobile.md +92 -0
- package/pipeline/prompts/qa-a.md +1 -1
- package/pipeline/prompts/qa-b.md +1 -1
- package/pipeline/prompts/reqs.md +5 -1
- package/pipeline/scripts/db-bootstrap.ts +1 -1
- package/pipeline/scripts/embed-sqlite.ts +5 -0
- package/pipeline/steps/common/quality-standards.md +19 -0
- package/pipeline/steps/critic/data-pipeline.md +28 -0
- package/pipeline/steps/critic/document-generation.md +21 -0
- package/pipeline/steps/critic/iac.md +29 -0
- package/pipeline/steps/critic/library.md +24 -0
- package/pipeline/steps/critic/mcp-server.md +24 -0
- package/pipeline/steps/critic/mobile-app.md +29 -0
- package/pipeline/steps/data/estimate.md +51 -0
- package/pipeline/steps/data/handoff.md +9 -0
- package/pipeline/steps/data/implement.md +16 -0
- package/pipeline/steps/data/read-inputs.md +13 -0
- package/pipeline/steps/data/write-tests.md +13 -0
- package/pipeline/steps/docgen/estimate.md +49 -0
- package/pipeline/steps/docgen/handoff.md +9 -0
- package/pipeline/steps/docgen/implement.md +19 -0
- package/pipeline/steps/docgen/read-inputs.md +13 -0
- package/pipeline/steps/docgen/write-tests.md +15 -0
- package/pipeline/steps/iac/estimate.md +50 -0
- package/pipeline/steps/iac/handoff.md +9 -0
- package/pipeline/steps/iac/implement.md +19 -0
- package/pipeline/steps/iac/read-inputs.md +13 -0
- package/pipeline/steps/iac/write-tests.md +20 -0
- package/pipeline/steps/judge/ship-decision.md +14 -1
- package/pipeline/steps/libdev/estimate.md +49 -0
- package/pipeline/steps/libdev/handoff.md +9 -0
- package/pipeline/steps/libdev/implement.md +19 -0
- package/pipeline/steps/libdev/read-inputs.md +13 -0
- package/pipeline/steps/libdev/write-tests.md +16 -0
- package/pipeline/steps/mcp-dev/estimate.md +49 -0
- package/pipeline/steps/mcp-dev/handoff.md +9 -0
- package/pipeline/steps/mcp-dev/implement.md +29 -0
- package/pipeline/steps/mcp-dev/read-inputs.md +13 -0
- package/pipeline/steps/mcp-dev/write-tests.md +19 -0
- package/pipeline/steps/mobile/emulator-lifecycle.md +67 -0
- package/pipeline/steps/mobile/estimate.md +51 -0
- package/pipeline/steps/mobile/flutter.md +30 -0
- package/pipeline/steps/mobile/handoff.md +18 -0
- package/pipeline/steps/mobile/implement.md +20 -0
- package/pipeline/steps/mobile/react-native.md +32 -0
- package/pipeline/steps/mobile/read-inputs.md +10 -0
- package/pipeline/steps/mobile/write-tests.md +59 -0
- package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +1 -1
- package/pipeline/steps/orchestration/sprint-groom.md +4 -0
- package/pipeline/steps/orchestration/sprint-size.md +19 -12
- package/pipeline/steps/orchestration/validate-story-inputs.md +9 -0
- package/pipeline/steps/qa-a/data-pipeline.md +32 -0
- package/pipeline/steps/qa-a/document-generation.md +52 -0
- package/pipeline/steps/qa-a/iac.md +30 -0
- package/pipeline/steps/qa-a/library.md +42 -0
- package/pipeline/steps/qa-a/mcp-server.md +31 -0
- package/pipeline/steps/qa-a/mobile-app.md +59 -0
- package/pipeline/steps/qa-b/data-pipeline.md +48 -0
- package/pipeline/steps/qa-b/document-generation.md +47 -0
- package/pipeline/steps/qa-b/iac.md +44 -0
- package/pipeline/steps/qa-b/library.md +61 -0
- package/pipeline/steps/qa-b/mcp-server.md +40 -0
- package/pipeline/steps/qa-b/mobile-app.md +71 -0
- package/pipeline/steps/readiness/standalone-review.md +7 -2
- package/pipeline/steps/reqs/data-pipeline.md +56 -0
- package/pipeline/steps/reqs/document-generation.md +55 -0
- package/pipeline/steps/reqs/draft-brief.md +10 -0
- package/pipeline/steps/reqs/iac.md +63 -0
- package/pipeline/steps/reqs/library.md +56 -0
- package/pipeline/steps/reqs/mcp-server.md +48 -0
- package/pipeline/steps/reqs/mobile-app.md +54 -0
- package/pipeline/steps/reqs/self-review.md +5 -3
- package/pipeline/task-graphs/backend-api.yaml +19 -2
- package/pipeline/task-graphs/data-pipeline.yaml +29 -12
- package/pipeline/task-graphs/document-generation.yaml +29 -12
- package/pipeline/task-graphs/frontend-only.yaml +19 -2
- package/pipeline/task-graphs/fullstack-web.yaml +19 -2
- package/pipeline/task-graphs/library.yaml +29 -12
- package/pipeline/task-graphs/mcp-server.yaml +29 -12
- package/pipeline/task-graphs/mobile-app.yaml +171 -0
- package/pipeline/templates/bugs.template.md +1 -1
- package/pipeline/templates/critic-review.template.md +1 -1
- package/pipeline/templates/data-handoff.template.md +96 -0
- package/pipeline/templates/docgen-handoff.template.md +83 -0
- package/pipeline/templates/iac-handoff.template.md +83 -0
- package/pipeline/templates/judge-decision.template.md +11 -1
- package/pipeline/templates/libdev-handoff.template.md +82 -0
- package/pipeline/templates/mcp-dev-handoff.template.md +87 -0
- package/pipeline/templates/mobile-handoff.template.md +122 -0
- package/pipeline/templates/reqs-brief.template.md +60 -4
- package/skills/valent-run-deferred-tests/SKILL.md +109 -0
- package/src/commands/db-rebuild.js +5 -0
- package/src/lib/config-schema.js +1 -1
- package/src/lib/db.js +1 -1
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# MOBILE Step: React Native Specifics
|
|
2
|
+
|
|
3
|
+
This step is loaded conditionally when `{tech_stack.mobile_framework}` is `react-native`. Read before implementing.
|
|
4
|
+
|
|
5
|
+
## Metro Bundler Management
|
|
6
|
+
- Start Metro before any test execution: `npx react-native start --reset-cache &`
|
|
7
|
+
- Monitor Metro for JS bundle errors. If bundling fails, fix the code and retry.
|
|
8
|
+
- Handle port conflicts: check if port 8081 is in use before starting (`lsof -i :8081` on Mac/Linux, `netstat -ano | findstr 8081` on Windows). Kill stale Metro processes if needed.
|
|
9
|
+
- Kill Metro after all tests complete.
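The port check above can also be scripted so it behaves the same on Mac, Linux, and Windows. The following is a minimal sketch, assuming Metro listens on its default port 8081 on the local machine; `port_in_use` is a hypothetical helper name, not part of the pipeline.

```python
import socket

METRO_PORT = 8081  # default Metro port named in the step above


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(0.5)
        # connect_ex returns 0 when the connection succeeds (port is occupied)
        return probe.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use(METRO_PORT):
        print(f"Port {METRO_PORT} is busy; stop the stale Metro process first")
    else:
        print(f"Port {METRO_PORT} is free; safe to start Metro")
```

This only detects the conflict; identifying and killing the owning process still needs `lsof` or `netstat` as described above.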
|
|
10
|
+
|
|
11
|
+
## React Native Build Configuration
|
|
12
|
+
- Use `react-native.config.js` for native module auto-linking
|
|
13
|
+
- Hermes engine: verify Hermes is enabled for Android (check `android/app/build.gradle` for `enableHermes: true`; newer React Native versions set `hermesEnabled=true` in `android/gradle.properties` instead)
|
|
14
|
+
- Flipper: disable in release builds, optional in debug
|
|
15
|
+
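- Fast Refresh: disable during E2E to avoid test flakiness (note that Metro's `--no-interactive` flag only turns off interactive key commands; it does not by itself disable Fast Refresh)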
|
|
16
|
+
|
|
17
|
+
## React Native Testing Patterns
|
|
18
|
+
- Component tests: use React Native Testing Library (`@testing-library/react-native`)
|
|
19
|
+
- Navigation tests: verify deep link resolution via React Navigation linking config, verify screen transitions
|
|
20
|
+
- Platform-specific code: test `.android.tsx` and `.ios.tsx` variants separately when they exist
|
|
21
|
+
- AsyncStorage / MMKV: clear between test suites to prevent state leakage
|
|
22
|
+
|
|
23
|
+
## Native Module Considerations
|
|
24
|
+
- If the story uses native modules, verify auto-linking succeeded on both platforms
|
|
25
|
+
- Bridge calls must be tested with real native code, not mocks
|
|
26
|
+
- Pod install (iOS): `cd ios && pod install` after adding native dependencies
|
|
27
|
+
- Gradle sync (Android): `cd android && ./gradlew --refresh-dependencies` if native module issues
|
|
28
|
+
|
|
29
|
+
## Area Labels for Testing
|
|
30
|
+
- Use `testID` prop for React Native components (maps to `accessibilityIdentifier` on iOS, `resource-id` on Android)
|
|
31
|
+
- Maestro `tapOn` uses `id:` selector which reads `testID` on both platforms
|
|
32
|
+
- Follow the area label convention from uxa-spec.md: `{screen}-{section}-{element}`
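The `{screen}-{section}-{element}` convention can be enforced with a small helper so that `testID` values stay consistent. This is a sketch only: the rule that each segment is lowercase alphanumeric with no inner hyphens is an assumption (uxa-spec.md may allow more), and `area_label` is a hypothetical name.

```python
import re

# Assumption: each segment is lowercase letters/digits only, so the
# hyphen-separated label can always be split back into three parts.
_SEGMENT = re.compile(r"^[a-z0-9]+$")


def area_label(screen: str, section: str, element: str) -> str:
    """Build a testID following the {screen}-{section}-{element} convention."""
    for name, value in (("screen", screen), ("section", section), ("element", element)):
        if not _SEGMENT.match(value):
            raise ValueError(f"{name} segment {value!r} is not lowercase alphanumeric")
    return f"{screen}-{section}-{element}"
```

For example, `area_label("login", "form", "submit")` yields `login-form-submit`, which a Maestro flow could then target with the `id:` selector.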
|
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
# MOBILE Step: Read Inputs
|
|
2
|
+
|
|
3
|
+
## Step 1: Read inputs
|
|
4
|
+
Read `reqs-brief.md`, `uxa-spec.md` (if UI profile active), and `qa-test-spec.md`. Understand: acceptance criteria, screen specifications, navigation flows, component hierarchy, platform-specific behaviors, Maestro flow specifications, test specifications.
|
|
5
|
+
|
|
6
|
+
## Step 2: Read correction directives
|
|
7
|
+
Read `{correction_directives}`. Apply all directives targeting MOBILE. Note any conflicts with default behavior and follow the directive.
|
|
8
|
+
|
|
9
|
+
## Step 2b: Query Knowledge Agent (Conditional)
|
|
10
|
+
If a Knowledge Agent is available in the team config, send: `[KNOWLEDGE-QUERY] What mobile patterns, navigation conventions, and platform-specific constraints should I know? Context: I am MOBILE implementing {story_id} using {tech_stack.mobile_framework}.` If no Knowledge Agent is spawned, or it does not respond within a reasonable time, proceed without it.
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# MOBILE Step: Write Tests
|
|
2
|
+
|
|
3
|
+
## Step 8: Write Maestro YAML test flows
|
|
4
|
+
For each AC in qa-test-spec.md, write a Maestro YAML flow file. Place flows in `e2e/maestro/` directory.
|
|
5
|
+
|
|
6
|
+
Each flow file structure:
|
|
7
|
+
```yaml
|
|
8
|
+
appId: {app_package_name}
|
|
9
|
+
name: {descriptive flow name}
|
|
10
|
+
---
|
|
11
|
+
- clearState
|
|
12
|
+
- launchApp
|
|
13
|
+
# ... test steps: tapOn, assertVisible, inputText, scroll, back, swipe, etc.
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
Rules:
|
|
17
|
+
- Every flow must start with `clearState` and `launchApp`
|
|
18
|
+
- Use `assertVisible` and `assertNotVisible` for assertions, not fixed-time waits
|
|
19
|
+
- Use `waitForAnimationToEnd` instead of hardcoded `extendedWaitUntil` timeouts
|
|
20
|
+
- Deep link tests: use `openLink` command with the URI pattern from reqs-brief
|
|
21
|
+
- Screenshot capture: use `takeScreenshot` at assertion points for evidence
|
|
22
|
+
|
|
23
|
+
Record in `mobile-handoff.md#maestro-flow-files`.
|
|
24
|
+
|
|
25
|
+
## Step 8b: Write unit tests
|
|
26
|
+
Write unit tests per qa-test-spec.md using `{tech_stack.test_framework_unit}`. Unit tests MAY mock API clients for isolated component logic. Every mocked unit test for an API-calling AC must be paired with a real-API Maestro flow for the same AC. Record in `mobile-handoff.md#test-files-written`.
|
|
27
|
+
|
|
28
|
+
## Step 9: Run unit tests, verify all pass
|
|
29
|
+
Run the unit test suite. All tests must pass. Record results in `mobile-handoff.md#test-results-summary`. If tests fail, fix the code -- do not skip or weaken tests.
|
|
30
|
+
|
|
31
|
+
## Step 9b: App-Level Smoke Test
|
|
32
|
+
|
|
33
|
+
Write one test that bootstraps the application from its entry point and asserts the story's deliverable is present and reachable. This catches "unwired entry point" bugs where a screen exists but is never registered in the navigation. Mandatory for the first mobile story in a project, recommended for all subsequent stories.
|
|
34
|
+
|
|
35
|
+
Record in `mobile-handoff.md#test-files-written`.
|
|
36
|
+
|
|
37
|
+
## Step 10: Run Maestro flows
|
|
38
|
+
|
|
39
|
+
### Android (always)
|
|
40
|
+
For each flow file:
|
|
41
|
+
1. State isolation (Step 7d)
|
|
42
|
+
2. Execute: `maestro test {flow_file}`
|
|
43
|
+
3. Record per-flow result (pass/fail with output)
|
|
44
|
+
|
|
45
|
+
### iOS (Mac only)
|
|
46
|
+
For each flow file tagged as `both` or `ios`:
|
|
47
|
+
1. State isolation (Step 7d, iOS variant)
|
|
48
|
+
2. Execute: `maestro test {flow_file} --device {ios_simulator_name}`
|
|
49
|
+
3. Record per-flow result
|
|
50
|
+
|
|
51
|
+
### iOS Deferred (Windows/Linux)
|
|
52
|
+
If not on Mac, record all iOS-targeted flows in `mobile-handoff.md#deferred-ios-tests` with reason "Host OS lacks iOS simulator". This is expected, not a bug.
|
|
53
|
+
|
|
54
|
+
E2E tests run serially against the single emulator -- the emulator is shared mutable state. The 1.5-minute timeout per story applies to test execution time excluding emulator boot time.
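The platform rules in Step 10 amount to a small planning function. The sketch below assumes flows carry a tag of `both`, `android`, or `ios`, and that `ios`-tagged flows are skipped on Android (the step text does not say so explicitly). `FlowPlan` and `plan_flows` are hypothetical names.

```python
import platform
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class FlowPlan:
    android: List[str] = field(default_factory=list)       # run on the Android emulator
    ios: List[str] = field(default_factory=list)           # run on the iOS simulator
    deferred_ios: List[str] = field(default_factory=list)  # record in deferred-ios-tests


def plan_flows(flows: Dict[str, str], host_os: Optional[str] = None) -> FlowPlan:
    """Split flow files by platform; flows maps file path to its platform tag."""
    host_os = host_os or platform.system()  # "Darwin" on a Mac
    plan = FlowPlan()
    for flow_file, tag in flows.items():
        if tag in ("both", "android"):
            plan.android.append(flow_file)
        if tag in ("both", "ios"):
            # iOS flows only run on a Mac host; otherwise they are deferred
            target = plan.ios if host_os == "Darwin" else plan.deferred_ios
            target.append(flow_file)
    return plan
```

Each list would then be executed serially, since the emulator is shared mutable state.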
|
|
55
|
+
|
|
56
|
+
## Step 10b: Signal integration readiness
|
|
57
|
+
When mobile code is complete and all available-platform tests pass, send to BEND via inbox:
|
|
58
|
+
`[INTEGRATION-READY] Mobile code complete. Run integration tests against my app.`
|
|
59
|
+
Wait for BEND's `[INTEGRATION-READY]` message before running integration verification. Once both sides are ready, verify that API calls from the app resolve correctly against BEND's running server.
|
|
@@ -99,7 +99,7 @@ Otherwise, substitute variables in the knowledge spawn template (`{{story_id}}`,
|
|
|
99
99
|
|
|
100
100
|
| Wave | Spawn Trigger | Agents |
|
|
101
101
|
|---|---|---|
|
|
102
|
-
| 2 | QA-A sends `[HANDOFF]` (completes) | BEND, FEND, CRITIC |
|
|
102
|
+
| 2 | QA-A sends `[HANDOFF]` (completes) | BEND, FEND, DATA, MCP-DEV, LIBDEV, DOCGEN, IAC, CRITIC (each only if not skipped by testing_profiles) |
|
|
103
103
|
| 3 | CRITIC task becomes `in_progress` | QA-B, PMCP (if ui profile) |
|
|
104
104
|
| 4 | JUDGE bug-review task becomes `in_progress` | (reserved) |
|
|
105
105
|
|
|
@@ -14,6 +14,10 @@ For each pending story in the grooming batch:
|
|
|
14
14
|
- `api` — story has API endpoints, backend logic, or database changes
|
|
15
15
|
- `ui` — story has UI components, pages, or visual elements
|
|
16
16
|
- `data-pipeline` — story has ETL, data transformation, or batch processing
|
|
17
|
+
- `mcp-server` — story has MCP server tools, handlers, or protocol work
|
|
18
|
+
- `library` — story is a shared library/package (exports, packaging, versioning)
|
|
19
|
+
- `document-generation` — story has document/report template or generation pipeline work
|
|
20
|
+
- `iac` — story has infrastructure work (Terraform, CloudFormation, Kubernetes, CI/CD)
|
|
17
21
|
3. Write `testing_profiles: [api, ui]` (or whichever apply) to the story's backlog entry
|
|
18
22
|
|
|
19
23
|
This must complete before Step 1. Downstream agents rely on `testing_profiles` to determine conditional steps.
|
|
@@ -2,12 +2,17 @@
|
|
|
2
2
|
|
|
3
3
|
**Condition:** Only execute in sprint mode (`{is_sprint_mode}` is true).
|
|
4
4
|
|
|
5
|
-
## Step 1: Spawn
|
|
5
|
+
## Step 1: Spawn Developer Agents
|
|
6
6
|
|
|
7
7
|
Scan groomed stories' `testing_profiles` in `{backlog_path}`:
|
|
8
8
|
|
|
9
|
-
- Spawn BEND if any groomed story has `api`
|
|
9
|
+
- Spawn BEND if any groomed story has `api` in `testing_profiles`
|
|
10
10
|
- Spawn FEND if any groomed story has `ui` in `testing_profiles`
|
|
11
|
+
- Spawn DATA if any groomed story has `data-pipeline` in `testing_profiles`
|
|
12
|
+
- Spawn MCP-DEV if any groomed story has `mcp-server` in `testing_profiles`
|
|
13
|
+
- Spawn LIBDEV if any groomed story has `library` in `testing_profiles`
|
|
14
|
+
- Spawn DOCGEN if any groomed story has `document-generation` in `testing_profiles`
|
|
15
|
+
- Spawn IAC if any groomed story has `iac` in `testing_profiles`
|
|
11
16
|
|
|
12
17
|
Spawn with their normal prompt template and pass `.valent-pipeline/steps/{agent}/estimate.md` as the first step. Pass `{estimation_model}` and `{correction_directives}` (calibration directives) in the spawn context.
|
|
13
18
|
|
|
@@ -19,16 +24,18 @@ For each story with status `groomed`:
|
|
|
19
24
|
|
|
20
25
|
1. Update status to `sizing` in `{backlog_path}`
|
|
21
26
|
2. Read story's `testing_profiles` from `{backlog_path}`
|
|
22
|
-
3. Dispatch based on profiles
|
|
23
|
-
- `
|
|
24
|
-
- `
|
|
25
|
-
- `
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
-
|
|
29
|
-
-
|
|
30
|
-
|
|
31
|
-
|
|
27
|
+
3. Dispatch based on profiles — send story context to **every agent whose profile is present**:
|
|
28
|
+
- `api` in profiles → send to BEND
|
|
29
|
+
- `ui` in profiles → send to FEND
|
|
30
|
+
- `data-pipeline` in profiles → send to DATA
|
|
31
|
+
- `mcp-server` in profiles → send to MCP-DEV
|
|
32
|
+
- `library` in profiles → send to LIBDEV
|
|
33
|
+
- `document-generation` in profiles → send to DOCGEN
|
|
34
|
+
- `iac` in profiles → send to IAC
|
|
35
|
+
Multiple profiles can be active (e.g., `[api, data-pipeline]` sends to both BEND and DATA).
|
|
36
|
+
4. Agents write estimation files (`{agent}-estimation.md`)
|
|
37
|
+
5. **Record points:** sum all agent estimates for the story.
|
|
38
|
+
`story_points = sum of all agent estimates received`
|
|
32
39
|
6. Update story's `story_points` field in `{backlog_path}`
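The dispatch and point-recording steps above can be summarized as a lookup plus a sum. This is an illustrative sketch; the function names are hypothetical and the mapping simply mirrors the seven profiles listed in this step.

```python
from typing import Dict, List

# Profile to agent mapping as listed in the dispatch step
PROFILE_TO_AGENT = {
    "api": "BEND",
    "ui": "FEND",
    "data-pipeline": "DATA",
    "mcp-server": "MCP-DEV",
    "library": "LIBDEV",
    "document-generation": "DOCGEN",
    "iac": "IAC",
}


def agents_for(testing_profiles: List[str]) -> List[str]:
    """Return every agent whose profile is present, in profile order."""
    return [PROFILE_TO_AGENT[p] for p in testing_profiles if p in PROFILE_TO_AGENT]


def story_points(estimates: Dict[str, int]) -> int:
    """story_points = sum of all agent estimates received."""
    return sum(estimates.values())
```

A story with `[api, data-pipeline]` is sent to BEND and DATA, and estimates of 3 and 5 would be recorded as 8 story points.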
|
|
33
40
|
|
|
34
41
|
## Step 3: Update Sprint State
|
|
@@ -33,11 +33,20 @@ Based on the story scope and project type, determine which testing profiles are
|
|
|
33
33
|
| Story has API endpoints (backend routes, REST/GraphQL) | `api` |
|
|
34
34
|
| Story has UI components (pages, components, visual changes) | `ui` |
|
|
35
35
|
| Story has data pipeline work (ETL, transformations, migrations) | `data-pipeline` |
|
|
36
|
+
| Story has MCP server tools, handlers, or protocol work | `mcp-server` |
|
|
37
|
+
| Story is a shared library/package (exports, packaging, versioning) | `library` |
|
|
38
|
+
| Story has document/report template or generation pipeline work | `document-generation` |
|
|
39
|
+
| Story has infrastructure work (Terraform, CloudFormation, Kubernetes, CI/CD) | `iac` |
|
|
36
40
|
|
|
37
41
|
Multiple profiles can be active. Examples:
|
|
38
42
|
- Backend-only story: `[api]`
|
|
39
43
|
- Frontend-only story: `[ui]`
|
|
40
44
|
- Fullstack story with both API and UI work: `[api, ui]`
|
|
41
45
|
- Data pipeline story: `[data-pipeline]`
|
|
46
|
+
- MCP server story: `[mcp-server]`
|
|
47
|
+
- Library/package story: `[library]`
|
|
48
|
+
- Document generation story: `[document-generation]`
|
|
49
|
+
- Infrastructure story: `[iac]`
|
|
50
|
+
- Fullstack story with infrastructure: `[api, ui, iac]`
|
|
42
51
|
|
|
43
52
|
Set `{testing_profiles}` for use in shared context.
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# QA-A Step: Data Pipeline Testing
|
|
2
|
+
|
|
3
|
+
## Pipeline Smoke Test Specification
|
|
4
|
+
|
|
5
|
+
For every pipeline stage in this story, write a **Pipeline Smoke Test** table:
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
## Pipeline Smoke Tests
|
|
9
|
+
|
|
10
|
+
| ID | Input Dataset | Transform Step | Expected Output | Row Count Delta | Idempotency Check |
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
Rules:
|
|
14
|
+
- One row per pipeline stage (ingest, each transform, output)
|
|
15
|
+
- Input dataset: exact description of seed data (file path, format, row count, key characteristics)
|
|
16
|
+
- Transform step: the specific stage being tested
|
|
17
|
+
- Expected output: key fields, values, and format QA-B must verify
|
|
18
|
+
- Row count delta: expected rows in vs rows out with reason for any difference
|
|
19
|
+
- Idempotency check: "Run twice, assert identical output" for every write stage
|
|
20
|
+
- Minimum per pipeline: one happy path per stage, one null/malformed input, one empty input
|
|
21
|
+
- Every filter/join stage MUST have a row asserting dropped rows are logged with reason
|
|
22
|
+
- Checkpoint/resume: at least one test row that simulates mid-pipeline failure and verifies resume produces correct final output
|
|
23
|
+
|
|
24
|
+
## Quality Gate Additions
|
|
25
|
+
|
|
26
|
+
- [ ] Smoke test table covers every pipeline stage (ingest, transform, output)
|
|
27
|
+
- [ ] Every filter/join has a row count delta assertion with drop reason verification
|
|
28
|
+
- [ ] Idempotency test specified for every write stage
|
|
29
|
+
- [ ] Null and malformed input test cases included
|
|
30
|
+
- [ ] Empty input test case included
|
|
31
|
+
- [ ] Checkpoint/resume test case included (if pipeline supports checkpointing)
|
|
32
|
+
- [ ] Row counts verified at each stage boundary
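To make the row count delta and idempotency rules concrete, here is a toy sketch of the two assertions. The stage (`drop_null_ids`) and the writer (`upsert_rows`) are hypothetical stand-ins for a real pipeline; only the shape of the checks is the point.

```python
import copy
from typing import Callable, Dict, List, Tuple


def drop_null_ids(rows: List[dict]) -> Tuple[List[dict], List[dict]]:
    """Toy filter stage: keep rows with an id, log every dropped row with a reason."""
    kept, dropped = [], []
    for row in rows:
        if row.get("id") is None:
            dropped.append({"row": row, "reason": "null id"})
        else:
            kept.append(row)
    return kept, dropped


def row_delta_explained(rows_in: List[dict], rows_out: List[dict], dropped: List[dict]) -> bool:
    """Rows in minus rows out must equal the dropped rows, each carrying a reason."""
    return len(rows_in) - len(rows_out) == len(dropped) and all(d.get("reason") for d in dropped)


def upsert_rows(store: Dict, rows: List[dict]) -> None:
    """Toy write stage: keyed upsert, so re-running does not duplicate rows."""
    for row in rows:
        store[row["id"]] = row


def run_twice_is_identical(write: Callable[[Dict, List[dict]], None], rows: List[dict]) -> bool:
    """Idempotency check: run the write stage twice and compare the stored state."""
    store: Dict = {}
    write(store, rows)
    after_first = copy.deepcopy(store)
    write(store, rows)
    return store == after_first
```

An append-only writer would fail `run_twice_is_identical`, which is exactly the defect the "Run twice, assert identical output" row is meant to surface.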
|
|
@@ -0,0 +1,52 @@
|
|
|
1
|
+
# QA-A Step: Document Generation Testing
|
|
2
|
+
|
|
3
|
+
## Render Smoke Test Specification
|
|
4
|
+
|
|
5
|
+
For every document template in this story, write a **Render Smoke Test** table:
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
## Render Smoke Tests
|
|
9
|
+
|
|
10
|
+
| ID | Template | Input Data | Expected Output Format | Validation Check |
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
Rules:
|
|
14
|
+
- One row per template + scenario (happy path + key edge cases)
|
|
15
|
+
- Input Data: exact JSON payload or reference to fixture file
|
|
16
|
+
- Expected Output Format: PDF, HTML, Markdown, etc. with expected MIME type
|
|
17
|
+
- Validation Check: what QA-B must verify in the generated output
|
|
18
|
+
- Minimum per template: one happy path with all variables populated, one with null/missing optional variables, one with edge-case data (unicode, long strings, special characters)
|
|
19
|
+
|
|
20
|
+
### Variable Substitution Tests
|
|
21
|
+
|
|
22
|
+
- **Normal substitution:** all required variables present and correctly typed -- verify they appear in output at expected positions
|
|
23
|
+
- **Null variables:** optional variables set to null -- verify graceful handling (omitted or default value), no literal `null` in output
|
|
24
|
+
- **Missing variables:** required variables omitted -- verify clear error, no unsubstituted markers (`{{varName}}`, `${varName}`, etc.) in output
|
|
25
|
+
|
|
26
|
+
### Conditional Section Tests
|
|
27
|
+
|
|
28
|
+
- Templates with conditional sections must have test rows for each branch (true and false conditions)
|
|
29
|
+
- Templates with loops must have test rows for empty collection, single item, and multiple items
|
|
30
|
+
|
|
31
|
+
### Output Format Validation
|
|
32
|
+
|
|
33
|
+
- Every declared output format must have at least one test row
|
|
34
|
+
- Validation must confirm correct MIME type and parseable structure (valid HTML, valid PDF, valid Markdown)
|
|
35
|
+
|
|
36
|
+
### Encoding and Unicode Tests
|
|
37
|
+
|
|
38
|
+
- At least one test row with CJK characters, emoji, or RTL text in variable data
|
|
39
|
+
- Verify output preserves unicode correctly (no mojibake, no encoding errors)
|
|
40
|
+
|
|
41
|
+
### No Unsubstituted Markers
|
|
42
|
+
|
|
43
|
+
- Every test must verify that no raw template markers appear in the final output
|
|
44
|
+
|
|
45
|
+
## Quality Gate Additions
|
|
46
|
+
|
|
47
|
+
- [ ] Render smoke test table covers every template (happy path + null/missing + edge-case data)
|
|
48
|
+
- [ ] Variable substitution tested for normal, null, and missing cases
|
|
49
|
+
- [ ] Conditional sections tested for all branches
|
|
50
|
+
- [ ] Every output format has at least one validation test
|
|
51
|
+
- [ ] Encoding/unicode test included
|
|
52
|
+
- [ ] No unsubstituted markers assertion included in every test
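The "no unsubstituted markers" and "no literal `null`" assertions lend themselves to a simple text scan. The sketch below covers only the two marker styles named above (`{{varName}}` and `${varName}`); other template syntaxes would need their own patterns, and the function names are hypothetical.

```python
import re
from typing import List

# Matches {{varName}} and ${varName}, allowing dotted names and inner whitespace
_MARKER = re.compile(r"\{\{\s*[\w.]+\s*\}\}|\$\{\s*[\w.]+\s*\}")
_LITERAL_NULL = re.compile(r"\bnull\b")


def find_unsubstituted_markers(rendered: str) -> List[str]:
    """Return every raw template marker left in the rendered output."""
    return _MARKER.findall(rendered)


def has_literal_null(rendered: str) -> bool:
    """True if the word 'null' leaked into the output from a null variable."""
    return _LITERAL_NULL.search(rendered) is not None
```

For binary formats such as PDF, the scan would run over extracted text rather than the raw bytes.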
|
|
@@ -0,0 +1,30 @@
|
|
|
1
|
+
# QA-A Step: Infrastructure Testing
|
|
2
|
+
|
|
3
|
+
## Infrastructure Smoke Test Specification
|
|
4
|
+
|
|
5
|
+
For every infrastructure resource in this story, write an **Infrastructure Smoke Test** table:
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
## Infrastructure Smoke Tests
|
|
9
|
+
|
|
10
|
+
| ID | Resource | Operation | Expected State | Validation Method |
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
Rules:
|
|
14
|
+
- One row per resource provisioned or modified
|
|
15
|
+
- Resource: resource type and logical name (e.g., `aws_s3_bucket.data_lake`)
|
|
16
|
+
- Operation: plan, apply, destroy, or drift-check
|
|
17
|
+
- Expected state: the desired state after the operation (e.g., "exists with tags", "no diff on re-apply")
|
|
18
|
+
- Validation method: how QA-B verifies (e.g., "terraform plan output", "aws cli describe", "policy check output")
|
|
19
|
+
- Minimum per resource: one plan validation, one tagging check
|
|
20
|
+
- Every story must include: plan output validation (no errors), drift check (plan after apply = no changes), tagging check (all resources tagged), security policy check (no overly permissive IAM)
|
|
21
|
+
- Idempotency row required: "apply twice, second plan shows no changes"
|
|
22
|
+
|
|
23
|
+
## Quality Gate Additions
|
|
24
|
+
|
|
25
|
+
- [ ] Smoke test table covers every infrastructure resource (plan + tagging + security)
|
|
26
|
+
- [ ] Plan output validation row present (terraform plan succeeds without errors)
|
|
27
|
+
- [ ] Drift check row present (plan after apply = no changes)
|
|
28
|
+
- [ ] Tagging check row present (all resources have standard tags)
|
|
29
|
+
- [ ] Security policy check row present (no wildcard IAM, no hardcoded secrets)
|
|
30
|
+
- [ ] Idempotency row present (apply twice = no changes)
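The drift and tagging checks can be evaluated against Terraform's machine-readable plan (the output of `terraform show -json` on a saved plan). The sketch below assumes that output's `resource_changes` list with `change.actions` and `change.after`, and that tags live under `after.tags` as they do for AWS resources. The required tag set and function names are hypothetical.

```python
from typing import Dict, Iterable, List


def drifted_resources(plan: dict) -> List[str]:
    """Addresses whose planned actions are anything other than no-op."""
    return [
        rc["address"]
        for rc in plan.get("resource_changes", [])
        if rc["change"]["actions"] != ["no-op"]
    ]


def untagged_resources(plan: dict, required_tags: Iterable[str]) -> Dict[str, List[str]]:
    """Map each resource address to the required tags it is missing."""
    missing: Dict[str, List[str]] = {}
    for rc in plan.get("resource_changes", []):
        change = rc["change"]
        if change["actions"] == ["delete"]:
            continue  # resources being destroyed have no desired tags
        tags = (change.get("after") or {}).get("tags") or {}
        absent = sorted(set(required_tags) - set(tags))
        if absent:
            missing[rc["address"]] = absent
    return missing
```

An empty result from `drifted_resources` on a plan taken after apply corresponds to the "plan after apply = no changes" row. Resource types that do not support tags would need to be excluded from `untagged_resources` separately.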
|
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
# QA-A Step: Library Testing
|
|
2
|
+
|
|
3
|
+
## Export Smoke Test Specification
|
|
4
|
+
|
|
5
|
+
For every public export in this story, write an **Export Smoke Test** table:
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
## Export Smoke Tests
|
|
9
|
+
|
|
10
|
+
| ID | Import Method | Module Path | Expected Export | Verification |
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
Rules:
|
|
14
|
+
- One row per export + import method (CJS `require()` and ESM `import` for each export)
|
|
15
|
+
- Module path: exact path from the exports map (e.g., `"./utils"`, `"."`)
|
|
16
|
+
- Expected export: the named or default export and its expected type/signature
|
|
17
|
+
- Verification: what to assert (typeof, instanceof, return value shape, callable, etc.)
|
|
18
|
+
- Minimum per export: one CJS row, one ESM row
|
|
19
|
+
- Type declaration exports must have a verification row confirming .d.ts resolution
|
|
20
|
+
- Backwards compatibility: if this is an update to an existing library, include rows verifying that previously documented imports still resolve
|
|
21
|
+
|
|
22
|
+
## Tree-Shaking Specification
|
|
23
|
+
|
|
24
|
+
If the library declares `sideEffects: false`, write a **Tree-Shaking Test** table:
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
## Tree-Shaking Tests
|
|
28
|
+
|
|
29
|
+
| ID | Import Statement | Expected Included | Expected Excluded | Verification |
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
Rules:
|
|
33
|
+
- Selective import must not pull in unrelated modules
|
|
34
|
+
- Bundle output must not contain code from unused exports
|
|
35
|
+
- Side-effect-free imports must produce no console output or global mutations
|
|
36
|
+
|
|
37
|
+
## Quality Gate Additions
|
|
38
|
+
|
|
39
|
+
- [ ] Export smoke test table covers every public export (CJS + ESM rows)
|
|
40
|
+
- [ ] Type declaration verification rows present for all typed exports
|
|
41
|
+
- [ ] Backwards compatibility rows present for updated libraries
|
|
42
|
+
- [ ] Tree-shaking tests present if sideEffects: false is declared
|
|
@@ -0,0 +1,31 @@
|
|
|
1
|
+
# QA-A Step: MCP Server Testing
|
|
2
|
+
|
|
3
|
+
## Protocol Smoke Test Specification
|
|
4
|
+
|
|
5
|
+
For every MCP tool in this story, write a **Protocol Smoke Test** table:
|
|
6
|
+
|
|
7
|
+
```
|
|
8
|
+
## Protocol Smoke Tests
|
|
9
|
+
|
|
10
|
+
| ID | JSON-RPC Method | Params | Expected Result Shape | Expected Error Code |
|
|
11
|
+
```
|
|
12
|
+
|
|
13
|
+
Rules:
|
|
14
|
+
- Initialize handshake first: `initialize` request must be the first row, verifying server info and capabilities
|
|
15
|
+
- `tools/list` must follow, verifying all tools are registered with correct inputSchema
|
|
16
|
+
- One `tools/call` row per tool with valid params (happy path)
|
|
17
|
+
- One `tools/call` row per tool with invalid params (schema violation, expecting `-32602`)
|
|
18
|
+
- Two-tier error model coverage: at least one row triggering `isError: true` (tool failure) and at least one row triggering a JSON-RPC error code (protocol failure)
|
|
19
|
+
- One row for unknown tool name (expecting `-32601` Method not found or `-32602` Invalid params)
|
|
20
|
+
- One row for malformed JSON-RPC (expecting `-32700` or `-32600`)
|
|
21
|
+
- Expected result shape: key fields and content types QA-B must verify
|
|
22
|
+
- Params: exact JSON payload for the JSON-RPC params field
|
|
23
|
+
|
|
24
|
+
## Quality Gate Additions
|
|
25
|
+
|
|
26
|
+
- [ ] Protocol smoke test table covers initialize handshake
|
|
27
|
+
- [ ] Protocol smoke test table covers tools/list
|
|
28
|
+
- [ ] Protocol smoke test table covers tools/call for every tool (happy path + error paths)
|
|
29
|
+
- [ ] Two-tier error model tested: at least one JSON-RPC error code and one isError:true
|
|
30
|
+
- [ ] Invalid args test included for every tool
|
|
31
|
+
- [ ] Unknown tool / unknown method test included
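The two-tier error model is easier to specify once the two failure shapes are side by side: a protocol failure carries a top-level JSON-RPC `error` object, while a tool failure is a normal `result` with `isError: true`. The sketch below builds requests and classifies responses; the helper names and the `echo` tool in the usage note are hypothetical.

```python
from itertools import count
from typing import Optional

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def make_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request object for the given method."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def classify(response: dict) -> str:
    """Classify a response as protocol error, tool error, or success."""
    if "error" in response:
        # Tier 1: JSON-RPC level failure, e.g. -32602 Invalid params
        return f"protocol-error:{response['error']['code']}"
    if response.get("result", {}).get("isError") is True:
        # Tier 2: the call was valid but the tool itself reported failure
        return "tool-error"
    return "ok"
```

A happy-path row might send `make_request("tools/call", {"name": "echo", "arguments": {"text": "hi"}})` and expect `classify` to return `ok`.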
|
|
@@ -0,0 +1,59 @@
|
|
|
1
|
+
# QA-A Step: Mobile App Testing
|
|
2
|
+
|
|
3
|
+
## Maestro Flow Specification
|
|
4
|
+
|
|
5
|
+
For every mobile screen/flow in this story, write a **Maestro Flow Specification** table:
|
|
6
|
+
|
|
7
|
+
### Maestro Flow Specifications
|
|
8
|
+
|
|
9
|
+
| ID | Flow Name | App State Setup | Steps | Expected Result | Platform |
|
|
10
|
+
|----|-----------|-----------------|-------|-----------------|----------|
|
|
11
|
+
|
|
12
|
+
Column rules:
|
|
13
|
+
- **ID:** Sequential flow identifier (MF-001, MF-002, ...)
|
|
14
|
+
- **Flow Name:** Descriptive name matching the AC being tested
|
|
15
|
+
- **App State Setup:** How to reach the required starting state. Every flow starts with `clearState` + `launchApp`. If seed data is needed, specify the API call or fixture.
|
|
16
|
+
- **Steps:** Sequence of Maestro actions (`launchApp`, `tapOn`, `assertVisible`, `inputText`, `scroll`, `back`, `swipe`, `openLink`, `takeScreenshot`)
|
|
17
|
+
- **Expected Result:** What the user should see after the flow completes (assert conditions)
|
|
18
|
+
- **Platform:** `both` (default) | `android-only` | `ios-only`
|
|
19
|
+
|
|
20
|
+
### Flow Writing Rules
|
|
21
|
+
|
|
22
|
+
1. Every flow MUST start with `clearState` or `launchApp` with `clearState: true` — no state carryover from previous flows
|
|
23
|
+
2. Every flow MUST be independent — no ordering dependency between flows
|
|
24
|
+
3. Use `assertVisible` and `assertNotVisible` for assertions, not fixed-time waits
|
|
25
|
+
4. Use `waitForAnimationToEnd` for animation settling, not `extendedWaitUntil` with hardcoded durations
|
|
26
|
+
5. For deep link tests, use `openLink` command with the URI pattern from reqs-brief
|
|
27
|
+
6. Include `takeScreenshot` at key assertion points for evidence capture
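Rule 1 can be checked mechanically once a flow's command list has been parsed from YAML. The sketch below operates on the already-parsed list (plain strings for bare commands, single-key dicts for commands with options); `starts_clean` is a hypothetical helper and YAML parsing is left out.

```python
from typing import List, Union

Command = Union[str, dict]


def starts_clean(commands: List[Command]) -> bool:
    """True if the flow begins with clearState, or launchApp with clearState: true."""
    if not commands:
        return False
    first = commands[0]
    if first == "clearState":
        return True
    if isinstance(first, dict) and "launchApp" in first:
        options = first["launchApp"] or {}
        return isinstance(options, dict) and options.get("clearState") is True
    return False
```

A flow that opens with a bare `launchApp` fails the check, because state from a previous flow could carry over.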
|
|
28
|
+
|
|
29
|
+
## Platform-Conditional Test Requirements
|
|
30
|
+
|
|
31
|
+
For each AC, specify:
|
|
32
|
+
- **Both platforms** (default): tests that verify identical behavior on Android and iOS
|
|
33
|
+
- **Android-specific:** tests for Android-only behavior (hardware back button, specific permissions, Android notification channels)
|
|
34
|
+
- **iOS-specific:** tests for iOS-only behavior (swipe-to-go-back, Face ID/Touch ID, iOS notification categories)
|
|
35
|
+
|
|
36
|
+
Mark iOS-specific tests with `[DEFER-IOS]` if the pipeline host may be Windows/Linux. The MOBILE agent will move these to the deferred queue.

## State Isolation Requirements

- Every Maestro flow must be independent (no ordering dependency)
- App state is cleared between flows via `adb shell pm clear {package}` on Android, or the simulator equivalent on iOS
- If a flow requires seed data, specify the exact API call or fixture to set it up before the flow
- If a flow requires specific permissions, specify which permissions must be pre-granted

## API Integration for Mobile

When a story's mobile app calls backend APIs, the spec MUST include:
- At least one Maestro flow per API-calling AC that exercises the real API round-trip (app → API → database → response → UI update)
- No mocked API responses in Maestro flows (Maestro does not support API interception, so this is enforced by design)
- Infrastructure prerequisite: "API server must be running before Maestro flow execution"

## Quality Gate Additions

- [ ] Every AC has at least one Maestro flow specification
- [ ] State isolation documented for every flow (clearState + seed data if needed)
- [ ] Platform-specific tests explicitly tagged (both / android-only / ios-only)
- [ ] No flow depends on another flow's output state
- [ ] Deep link tests included for screens with URI patterns
- [ ] Infrastructure prerequisites documented (API server, emulator, permissions)
@@ -0,0 +1,48 @@
# QA-B Step: Data Pipeline Testing

## Pipeline Execution Tests

Mandatory for all stories with data pipeline stages. Run the real pipeline against real data and real data stores.

**Procedure:**

1. **Seed sample data.** Per smoke test table in qa-test-spec.md. Use fixture files, seed scripts, or direct data store insertion. Include happy-path data, null/malformed records, and edge cases.
2. **Run pipeline.** Execute the full pipeline from ingest to output. Capture all logs.
3. **Validate row counts.** At each stage boundary, verify row counts match expected values from the smoke test table. Record actual vs expected.
4. **Spot-check values.** For each transform stage, verify a sample of output records against expected values. Check data types, formats, null handling, and edge cases.
5. **Re-run pipeline (idempotency).** Run the same pipeline again with the same input. Assert that the output is identical -- no duplicate rows, no changed values, no side effects from the second run.
6. **Kill and restart (checkpoint/resume).** If the pipeline supports checkpointing: run the pipeline, kill it mid-execution (after at least one checkpoint), restart, and verify that the final output matches a clean full run.
7. **Record results** in `## Pipeline Execution Results` of execution-report.md.
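The idempotency and checkpoint/resume checks in steps 5 and 6 both reduce to comparing two output snapshots. A minimal sketch follows, assuming each output has been read back from the data store as a list of hashable row tuples; the function name `compare_runs` and the result keys are hypothetical. A multiset comparison is used so that duplicate rows introduced by a re-run are detected even when the set of distinct rows is unchanged.

```python
from collections import Counter
from typing import Any, Dict, Hashable, Iterable, Tuple

Row = Tuple[Hashable, ...]


def compare_runs(first: Iterable[Row], second: Iterable[Row]) -> Dict[str, Any]:
    """Compare two pipeline outputs as multisets of rows."""
    a, b = Counter(first), Counter(second)
    return {
        # True only if every row appears the same number of times in both runs.
        "idempotent": a == b,
        # Rows (with multiplicity) present in the first run but not the second.
        "missing_after_rerun": list((a - b).elements()),
        # Rows (with multiplicity) the second run added, including duplicates.
        "extra_after_rerun": list((b - a).elements()),
    }
```

The two lists give the "both run outputs" evidence that the failure-handling rules ask for when a P1 idempotency bug is filed.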

**Pipeline Execution Results table:**

```
| ID | Stage | Input Rows | Expected Output Rows | Actual Output Rows | Spot-Check Values | Idempotency | Result |
```

**Row Count Reconciliation:**

```
| Stage | Input | Output | Dropped | Drop Reason | Expected Delta | Actual Delta | Match |
```
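One way to fill a Row Count Reconciliation row is sketched below. The source does not define "delta", so this assumes it means output rows minus input rows for the stage; the function name `reconciliation_row` and the `yes`/`no` values in the Match column are likewise illustrative.

```python
def reconciliation_row(
    stage: str,
    input_rows: int,
    output_rows: int,
    dropped: int,
    drop_reason: str,
    expected_delta: int,
) -> str:
    """Format one Markdown row for the Row Count Reconciliation table."""
    # Assumption: delta = output rows - input rows for this stage.
    actual_delta = output_rows - input_rows
    match = "yes" if actual_delta == expected_delta else "no"
    cells = [
        stage, input_rows, output_rows, dropped,
        drop_reason, expected_delta, actual_delta, match,
    ]
    return "| " + " | ".join(str(c) for c in cells) + " |"
```

For example, `reconciliation_row("dedupe", 100, 97, 3, "duplicate key", -3)` yields a row whose Match column is `yes`.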

Include full pipeline logs and commands for reproducibility.

**Failure handling:**
- Pipeline fails to start: file P1 bug, record error, continue.
- Stage produces wrong row count: file bug at appropriate priority, continue remaining stages.
- Idempotency fails (duplicates on re-run): file P1 bug, record both run outputs.
- Checkpoint/resume produces different output than clean run: file P1 bug, record both outputs.
- Pipeline crashes mid-run: file P1 bug, record crash output, attempt restart.

**This step cannot be skipped.** If qa-test-spec.md lacks a Pipeline Smoke Tests section, construct the table from pipeline stages in reqs-brief.md and execute.

## Execution Report Additions

The execution report MUST include:
- `## Pipeline Execution Results` table with actual vs expected for every stage
- `## Row Count Reconciliation` table showing data flow through the pipeline
- Full pipeline execution logs
- Pipeline start command and configuration
- Idempotency verification results (both runs compared)
- Checkpoint/resume verification results (if applicable)
@@ -0,0 +1,47 @@
# QA-B Step: Document Generation Testing

## Render Validation Tests

Mandatory for all stories with document generation. Invoke the real render pipeline and validate actual output.

**Procedure:**

1. **Seed template and input data.** Per render smoke test table in qa-test-spec.md. Load templates and prepare input data fixtures (JSON, database records, or programmatic setup).
2. **Invoke document generation.** Call the render pipeline with the seeded template and input data. Capture the generated output (file, buffer, or stream).
3. **Verify output format and MIME type.** Confirm the output matches the expected format (PDF, HTML, Markdown) and has the correct MIME type.
4. **Parse output structure.** For HTML: parse the DOM. For PDF: extract text and metadata. For Markdown: parse structure. Verify the output is well-formed and parseable.
5. **Check variable substitution.** Verify all expected variable values appear in the output at correct positions. Verify no unsubstituted template markers (`{{varName}}`, `${varName}`, `{% raw %}`, etc.) remain.
6. **Verify encoding.** Confirm UTF-8 encoding. Test with unicode data (CJK, emoji, RTL) and verify output preserves characters correctly -- no mojibake, no replacement characters.
7. **Execute edge-case data tests.** Per render smoke test table: null variables, missing optional fields, empty collections, extremely long strings, special characters. Verify graceful handling.
8. **Record results** in `## Render Validation Results` of execution-report.md.
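Steps 5 and 6 can be partly automated with a text scan over the rendered output. The sketch below is one possible approach, assuming the output has already been decoded (or extracted, for PDF) into a Python string; the pattern list covers only the three marker styles named in step 5, and the function names are hypothetical. Note that `${...}` can legitimately occur in some outputs (for example, inline JavaScript in HTML), so hits may need manual review.

```python
import re
from typing import List

# Patterns for the marker styles named in step 5.
MARKER_PATTERNS = [
    re.compile(r"\{\{.*?\}\}"),  # {{varName}}
    re.compile(r"\$\{[^}]*\}"),  # ${varName}
    re.compile(r"\{%.*?%\}"),    # {% tag %}
]

# U+FFFD is what decoders emit when bytes cannot be decoded.
REPLACEMENT_CHAR = "\ufffd"


def find_unsubstituted_markers(text: str) -> List[str]:
    """Return every leftover template marker found in the rendered text."""
    found: List[str] = []
    for pattern in MARKER_PATTERNS:
        found.extend(pattern.findall(text))
    return found


def has_replacement_chars(text: str) -> bool:
    """Return True if the text contains the Unicode replacement character."""
    return REPLACEMENT_CHAR in text
```

An empty list from `find_unsubstituted_markers` corresponds to a zero in the "Unsubstituted Markers Found" audit column.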

```
## Render Validation Results

| ID | Template | Input Data | Expected Format | Actual Format | Substitution Check | Encoding Check | Result |
```

Include raw generation commands and output excerpts for reproducibility.

### Variable Substitution Audit

After all render tests execute, build a substitution audit:

| Template | Total Variables | Substituted Correctly | Null Handled | Missing Handled | Unsubstituted Markers Found |
|----------|----------------|----------------------|--------------|-----------------|---------------------------|

**Failure handling:**
- Render pipeline fails to start: file P1 bug, record error, continue.
- Render test fails: file bug at appropriate priority, continue remaining tests.
- Output contains unsubstituted markers: file P1 bug -- this is a data leak / presentation defect.
- Encoding errors (mojibake, replacement characters): file P2 bug.

**This step cannot be skipped.** If qa-test-spec.md lacks a Render Smoke Tests section, construct the table from template definitions in reqs-brief.md and execute.

## Execution Report Additions

The execution report MUST include:
- `## Render Validation Results` table with actual vs expected for every row
- Variable substitution audit table
- Raw generation commands and output excerpts
- Encoding verification details (input data with unicode, output confirmation)
@@ -0,0 +1,44 @@
# QA-B Step: Infrastructure Testing

## Infrastructure Validation Tests

Mandatory for all stories with infrastructure resources. Validate infrastructure definitions against real plan output and live state.

**Procedure:**

1. **Initialize.** Run `terraform init` (or equivalent) to initialize providers and modules. Verify initialization succeeds without errors.
2. **Plan validation.** Run `terraform plan` (or equivalent). Verify no errors. Capture plan output for review.
3. **Apply (if test environment available).** Run `terraform apply` against the test environment. Capture apply output.
4. **Verify resources exist.** For each resource in the smoke test table, verify it exists in the target environment using CLI or API queries.
5. **Verify tags.** For each resource, verify all standard tags are present (environment, project, owner, managed-by).
6. **Verify IAM policies.** For each IAM role/policy, verify least-privilege: no wildcard actions, no overly broad resource scopes.
7. **Idempotency check.** Run `terraform plan` again after apply. Expect no changes (zero diff). Any unexpected diff is a bug.
8. **Record results** in `## Infrastructure Validation Results` of execution-report.md.
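The tag and IAM checks in steps 5 and 6 can be sketched as pure functions over data fetched from the environment. This is an illustration under stated assumptions: the tag keys are taken literally from step 5, the policy is assumed to be an AWS-style JSON policy document (`Statement`, `Effect`, `Action`, `Resource`), which the source does not confirm, and the function names are hypothetical. It flags only literal `*` patterns; judging whether a scope is "overly broad" still needs review.

```python
from typing import Any, Dict, List

# Tag keys as listed in step 5; exact casing is an assumption.
REQUIRED_TAGS = ("environment", "project", "owner", "managed-by")


def missing_tags(tags: Dict[str, str]) -> List[str]:
    """Return required tag keys that are absent or empty on a resource."""
    return [key for key in REQUIRED_TAGS if not tags.get(key)]


def _as_list(value: Any) -> List[Any]:
    # Policy fields may hold a single value or a list of values.
    return value if isinstance(value, list) else [value]


def wildcard_findings(policy: Dict[str, Any]) -> List[str]:
    """Flag Allow statements with wildcard actions or a '*' resource."""
    findings: List[str] = []
    for index, stmt in enumerate(_as_list(policy.get("Statement", []))):
        if stmt.get("Effect") != "Allow":
            continue
        for action in _as_list(stmt.get("Action", [])):
            if "*" in action:
                findings.append(f"statement {index}: wildcard action {action!r}")
        for resource in _as_list(stmt.get("Resource", [])):
            if resource == "*":
                findings.append(f"statement {index}: resource '*'")
    return findings
```

Non-empty results map to the "Tags Valid" and "IAM Valid" columns of the results table as failures.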

**Infrastructure Validation Results table:**

```
| ID | Resource | Operation | Expected State | Actual State | Tags Valid | IAM Valid | Result |
```

Include full command output for reproducibility.

**Failure handling:**
- Init fails: file P1 bug, record error, continue.
- Plan fails: file P1 bug, record plan output, continue.
- Apply fails: file P1 bug, record apply output, continue.
- Missing resource after apply: file P1 bug, continue remaining checks.
- Missing tags: file P2 bug, continue.
- Overly permissive IAM: file P1 bug, continue.
- Idempotency fails (diff after apply): file P1 bug, record both plan outputs.
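The idempotency check (a second `terraform plan` after apply) can be scripted with the `-detailed-exitcode` flag, under which `terraform plan` exits with 0 when there are no changes, 1 on error, and 2 when changes are present. The sketch below assumes Terraform is on the `PATH`; the function names and the result labels are hypothetical.

```python
import subprocess


def classify_plan_exit_code(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a label."""
    if code == 0:
        return "no-changes"       # zero diff: idempotency holds
    if code == 2:
        return "changes-present"  # diff after apply: file a bug
    return "error"                # plan itself failed


def run_second_plan(workdir: str) -> str:
    """Run the post-apply plan in `workdir` and classify the outcome."""
    proc = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit_code(proc.returncode)
```

The captured `proc.stdout` would still need to be saved alongside the first plan output so that both are available when a bug is filed.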

**This step cannot be skipped.** If qa-test-spec.md lacks an Infrastructure Smoke Tests section, construct the table from infrastructure resources in reqs-brief.md and execute.

## Execution Report Additions

The execution report MUST include:
- `## Infrastructure Validation Results` table with actual vs expected for every resource
- Full terraform init/plan/apply output
- Tag verification results per resource
- IAM policy verification results
- Idempotency verification (second plan output showing no changes)