valent-pipeline 0.2.19 → 0.2.21

Files changed (115)
  1. package/README.md +438 -0
  2. package/package.json +1 -1
  3. package/pipeline/agents-manifest.yaml +61 -1
  4. package/pipeline/docs/agent-reference.md +82 -23
  5. package/pipeline/docs/design/refactor-checklist.md +111 -0
  6. package/pipeline/docs/index.md +60 -0
  7. package/pipeline/docs/lead-lifecycle.md +1 -1
  8. package/pipeline/docs/pipeline-overview.md +4 -0
  9. package/pipeline/prompts/bend.md +5 -11
  10. package/pipeline/prompts/critic.md +9 -0
  11. package/pipeline/prompts/data.md +59 -0
  12. package/pipeline/prompts/docgen.md +61 -0
  13. package/pipeline/prompts/fend.md +3 -10
  14. package/pipeline/prompts/iac.md +70 -0
  15. package/pipeline/prompts/knowledge.md +2 -0
  16. package/pipeline/prompts/lead.md +97 -6
  17. package/pipeline/prompts/libdev.md +61 -0
  18. package/pipeline/prompts/mcp-dev.md +59 -0
  19. package/pipeline/prompts/mobile.md +92 -0
  20. package/pipeline/prompts/qa-a.md +1 -1
  21. package/pipeline/prompts/qa-b.md +1 -1
  22. package/pipeline/prompts/reqs.md +5 -1
  23. package/pipeline/scripts/db-bootstrap.ts +1 -1
  24. package/pipeline/scripts/embed-sqlite.ts +5 -0
  25. package/pipeline/steps/common/quality-standards.md +19 -0
  26. package/pipeline/steps/critic/data-pipeline.md +28 -0
  27. package/pipeline/steps/critic/document-generation.md +21 -0
  28. package/pipeline/steps/critic/iac.md +29 -0
  29. package/pipeline/steps/critic/library.md +24 -0
  30. package/pipeline/steps/critic/mcp-server.md +24 -0
  31. package/pipeline/steps/critic/mobile-app.md +29 -0
  32. package/pipeline/steps/data/estimate.md +51 -0
  33. package/pipeline/steps/data/handoff.md +9 -0
  34. package/pipeline/steps/data/implement.md +16 -0
  35. package/pipeline/steps/data/read-inputs.md +13 -0
  36. package/pipeline/steps/data/write-tests.md +13 -0
  37. package/pipeline/steps/docgen/estimate.md +49 -0
  38. package/pipeline/steps/docgen/handoff.md +9 -0
  39. package/pipeline/steps/docgen/implement.md +19 -0
  40. package/pipeline/steps/docgen/read-inputs.md +13 -0
  41. package/pipeline/steps/docgen/write-tests.md +15 -0
  42. package/pipeline/steps/iac/estimate.md +50 -0
  43. package/pipeline/steps/iac/handoff.md +9 -0
  44. package/pipeline/steps/iac/implement.md +19 -0
  45. package/pipeline/steps/iac/read-inputs.md +13 -0
  46. package/pipeline/steps/iac/write-tests.md +20 -0
  47. package/pipeline/steps/judge/ship-decision.md +14 -1
  48. package/pipeline/steps/libdev/estimate.md +49 -0
  49. package/pipeline/steps/libdev/handoff.md +9 -0
  50. package/pipeline/steps/libdev/implement.md +19 -0
  51. package/pipeline/steps/libdev/read-inputs.md +13 -0
  52. package/pipeline/steps/libdev/write-tests.md +16 -0
  53. package/pipeline/steps/mcp-dev/estimate.md +49 -0
  54. package/pipeline/steps/mcp-dev/handoff.md +9 -0
  55. package/pipeline/steps/mcp-dev/implement.md +29 -0
  56. package/pipeline/steps/mcp-dev/read-inputs.md +13 -0
  57. package/pipeline/steps/mcp-dev/write-tests.md +19 -0
  58. package/pipeline/steps/mobile/emulator-lifecycle.md +67 -0
  59. package/pipeline/steps/mobile/estimate.md +51 -0
  60. package/pipeline/steps/mobile/flutter.md +30 -0
  61. package/pipeline/steps/mobile/handoff.md +18 -0
  62. package/pipeline/steps/mobile/implement.md +20 -0
  63. package/pipeline/steps/mobile/react-native.md +32 -0
  64. package/pipeline/steps/mobile/read-inputs.md +10 -0
  65. package/pipeline/steps/mobile/write-tests.md +59 -0
  66. package/pipeline/steps/orchestration/adopt-lead-and-create-team.md +1 -1
  67. package/pipeline/steps/orchestration/sprint-execute.md +3 -2
  68. package/pipeline/steps/orchestration/sprint-groom.md +4 -0
  69. package/pipeline/steps/orchestration/sprint-size.md +26 -16
  70. package/pipeline/steps/orchestration/validate-story-inputs.md +9 -0
  71. package/pipeline/steps/qa-a/data-pipeline.md +32 -0
  72. package/pipeline/steps/qa-a/document-generation.md +52 -0
  73. package/pipeline/steps/qa-a/iac.md +30 -0
  74. package/pipeline/steps/qa-a/library.md +42 -0
  75. package/pipeline/steps/qa-a/mcp-server.md +31 -0
  76. package/pipeline/steps/qa-a/mobile-app.md +59 -0
  77. package/pipeline/steps/qa-b/data-pipeline.md +48 -0
  78. package/pipeline/steps/qa-b/document-generation.md +47 -0
  79. package/pipeline/steps/qa-b/iac.md +44 -0
  80. package/pipeline/steps/qa-b/library.md +61 -0
  81. package/pipeline/steps/qa-b/mcp-server.md +40 -0
  82. package/pipeline/steps/qa-b/mobile-app.md +71 -0
  83. package/pipeline/steps/readiness/standalone-review.md +7 -2
  84. package/pipeline/steps/reqs/data-pipeline.md +56 -0
  85. package/pipeline/steps/reqs/document-generation.md +55 -0
  86. package/pipeline/steps/reqs/draft-brief.md +10 -0
  87. package/pipeline/steps/reqs/iac.md +63 -0
  88. package/pipeline/steps/reqs/library.md +56 -0
  89. package/pipeline/steps/reqs/mcp-server.md +48 -0
  90. package/pipeline/steps/reqs/mobile-app.md +54 -0
  91. package/pipeline/steps/reqs/self-review.md +5 -3
  92. package/pipeline/task-graphs/backend-api.yaml +19 -2
  93. package/pipeline/task-graphs/data-pipeline.yaml +29 -12
  94. package/pipeline/task-graphs/document-generation.yaml +29 -12
  95. package/pipeline/task-graphs/frontend-only.yaml +19 -2
  96. package/pipeline/task-graphs/fullstack-web.yaml +19 -2
  97. package/pipeline/task-graphs/library.yaml +29 -12
  98. package/pipeline/task-graphs/mcp-server.yaml +29 -12
  99. package/pipeline/task-graphs/mobile-app.yaml +171 -0
  100. package/pipeline/templates/bugs.template.md +1 -1
  101. package/pipeline/templates/critic-review.template.md +1 -1
  102. package/pipeline/templates/data-handoff.template.md +96 -0
  103. package/pipeline/templates/docgen-handoff.template.md +83 -0
  104. package/pipeline/templates/iac-handoff.template.md +83 -0
  105. package/pipeline/templates/judge-decision.template.md +11 -1
  106. package/pipeline/templates/libdev-handoff.template.md +82 -0
  107. package/pipeline/templates/mcp-dev-handoff.template.md +87 -0
  108. package/pipeline/templates/mobile-handoff.template.md +122 -0
  109. package/pipeline/templates/reqs-brief.template.md +60 -4
  110. package/skills/valent-run-deferred-tests/SKILL.md +109 -0
  111. package/skills/valent-run-epic/SKILL.md +1 -1
  112. package/skills/valent-run-project/SKILL.md +1 -1
  113. package/src/commands/db-rebuild.js +5 -0
  114. package/src/lib/config-schema.js +1 -1
  115. package/src/lib/db.js +1 -1
@@ -0,0 +1,13 @@
1
+ # MCP-DEV Step: Read Inputs
2
+
3
+ ## Step 1: Read reqs-brief.md
4
+ Understand: acceptance criteria, business rules, tool definitions (names, descriptions, inputSchema), transport requirements (stdio/SSE/HTTP), capability declarations, error handling expectations, cross-cutting concerns.
5
+
6
+ ## Step 2: Read qa-test-spec.md
7
+ Understand: what tests to write for each AC, expected assertions, protocol compliance verification requirements, test case names and structure.
8
+
9
+ ## Step 3: Read correction directives
10
+ Read `{correction_directives}`. Apply all directives targeting MCP-DEV. Note any conflicts with default behavior and follow the directive.
11
+
12
+ ## Step 3b: Query Knowledge Agent (Conditional)
13
+ If a Knowledge Agent is available in the team config, send: `[KNOWLEDGE-QUERY] What codebase conventions, implementation patterns, and known pitfalls should I know? Context: I am MCP-DEV implementing {story_id} using {tech_stack.mcp_sdk} with {tech_stack.transport_type} transport.` If no response within a reasonable time or no Knowledge Agent is spawned, proceed without.
@@ -0,0 +1,19 @@
1
+ # MCP-DEV Step: Write Tests
2
+
3
+ ## Step 10: Write test code
4
+ Satisfy qa-test-spec for each AC. Every test case named in qa-test-spec must have a corresponding test. Follow quality standards from the core prompt. Record in `mcp-dev-handoff.md#test-files-written`.
5
+
6
+ **Critical requirement: real transport, no mocked transport.** Tests must spawn a real MCP server instance and communicate over the actual transport (stdio pipe, SSE connection, or HTTP). Do not mock the transport layer. The test client sends real JSON-RPC messages and asserts on real responses.
7
+
8
+ ## Step 11: Test protocol compliance
9
+ Tests must cover the full protocol handshake and lifecycle:
10
+ 1. `initialize` request returns correct server info and capabilities
11
+ 2. `tools/list` returns all registered tools with correct inputSchema
12
+ 3. `tools/call` for each tool with valid params returns expected result shape
13
+ 4. `tools/call` with invalid params returns JSON-RPC `-32602`
14
+ 5. `tools/call` triggering tool failure returns result with `isError: true`
15
+ 6. Unknown method returns JSON-RPC `-32601`
16
+ 7. Malformed JSON returns JSON-RPC `-32700`
17
+
18
+ ## Step 12: Run tests, verify all pass
19
+ Run the full test suite. All tests must pass. Record results in `mcp-dev-handoff.md#test-results-summary`. If tests fail, fix the code -- do not skip or weaken tests.
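The protocol compliance cases in Step 11 lend themselves to a small assertion helper. The following is a minimal sketch, not part of valent-pipeline or any MCP SDK: `classify_response` and its return labels are hypothetical names chosen for illustration. It assumes the response has already been read from the real transport and parsed into a dictionary; the error codes are the JSON-RPC ones listed in Step 11.

```python
from typing import Any

# JSON-RPC error codes referenced in Step 11.
PARSE_ERROR = -32700
METHOD_NOT_FOUND = -32601
INVALID_PARAMS = -32602

_ERROR_NAMES = {
    PARSE_ERROR: "parse_error",
    METHOD_NOT_FOUND: "method_not_found",
    INVALID_PARAMS: "invalid_params",
}


def classify_response(response: dict[str, Any]) -> str:
    """Label a parsed JSON-RPC response (hypothetical helper for test assertions)."""
    if "error" in response:
        code = response["error"].get("code")
        # Fall back to a generic label for codes outside the Step 11 list.
        return _ERROR_NAMES.get(code, f"rpc_error:{code}")
    result = response.get("result", {})
    # Tool failures arrive as a normal result flagged with isError, not as an RPC error.
    if isinstance(result, dict) and result.get("isError") is True:
        return "tool_error"
    return "ok"
```

A test could then assert, for example, that a `tools/call` with invalid params classifies as `invalid_params` while a failing tool classifies as `tool_error`, which keeps the two failure paths from being confused.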
@@ -0,0 +1,67 @@
1
+ # MOBILE Step: Emulator Lifecycle Management
2
+
3
+ ## Step 7b: Boot Emulator
4
+
5
+ ### Android Emulator
6
+ 1. List available AVDs: `emulator -list-avds`
7
+ 2. Boot emulator: `emulator -avd {avd_name} -no-snapshot-load -no-audio -no-window &`
8
+ 3. Wait for boot: `adb wait-for-device` then poll `adb shell getprop sys.boot_completed` until it returns `1` (max 120s, 10 retries at 12s intervals)
9
+ 4. If boot fails after 120s: kill process (`adb emu kill`), retry once with fresh boot. If second attempt fails, file `[BLOCKER]` to Lead with emulator logs.
10
+
11
+ Record emulator config in `mobile-handoff.md#emulator-configuration`.
12
+
13
+ ### iOS Simulator (Mac Only)
14
+ 1. Verify Mac host: `uname -s` must return `Darwin`. If not Mac, skip iOS entirely.
15
+ 2. List available simulators: `xcrun simctl list devices available`
16
+ 3. Boot simulator: `xcrun simctl boot {device_udid}`
17
+ 4. Wait for boot: poll `xcrun simctl list devices | grep Booted` (max 60s)
18
+ 5. If boot fails: `xcrun simctl shutdown all`, retry once. If second attempt fails, file `[BLOCKER]` to Lead.
19
+
20
+ Record simulator config in `mobile-handoff.md#emulator-configuration`.
21
+
22
+ ## Step 7c: Build and Install App
23
+
24
+ ### React Native
25
+ 1. Start Metro bundler: `npx react-native start --reset-cache &`
26
+ 2. Wait for Metro ready: poll for `http://localhost:8081/status` returning `packager-status:running` (max 60s). Handle port conflicts by checking if port 8081 is in use.
27
+ 3. Android build + install: `npx react-native run-android`
28
+ 4. iOS build + install (Mac only): `npx react-native run-ios --simulator="{simulator_name}"`
29
+ 5. Verify main activity/screen renders within 10s of launch. If not, capture `adb logcat` output and file P1 bug.
30
+
31
+ ### Flutter
32
+ 1. Resolve dependencies: `flutter pub get`
33
+ 2. Android build + install: `flutter build apk --debug && flutter install --device-id {emulator_id}`
34
+ 3. iOS build + install (Mac only): `flutter build ios --debug --simulator && flutter install --device-id {simulator_id}`
35
+ 4. Verify app launches and main screen renders within 10s.
36
+
37
+ ### Native Module Recovery (React Native)
38
+ If native module errors occur during build:
39
+ - iOS: run `cd ios && pod install && cd ..` and retry build
40
+ - Android: run `cd android && ./gradlew clean && cd ..` and retry build
41
+ - If native module build fails after retry, file P1 bug with full build output.
42
+
43
+ ## Step 7d: State Isolation Between Maestro Flows
44
+
45
+ Before each Maestro flow execution:
46
+ - **Android:** `adb shell pm clear {app_package_name}`
47
+ - **iOS:** `xcrun simctl terminate {device_udid} {bundle_id}` followed by `xcrun simctl privacy {device_udid} reset all {bundle_id}`
48
+
49
+ This ensures no state leakage between test flows. Every flow starts from a clean app state.
50
+
51
+ ## Step 7e: Pre-Grant Permissions
52
+
53
+ Before test execution, pre-grant required permissions to avoid UI dialog interference:
54
+ - **Android:** `adb shell pm grant {package} android.permission.{PERMISSION}` for each required permission
55
+ - **iOS:** `xcrun simctl privacy {device_udid} grant {permission-type} {bundle_id}`
56
+
57
+ Never depend on UI dialogs for permission grants during E2E tests.
58
+
59
+ ## Step 7f: Crash Recovery
60
+
61
+ If emulator/simulator crashes or becomes unresponsive during test execution:
62
+ 1. Detect via `adb devices` showing offline or Maestro flow timeout
63
+ 2. Capture crash logs: `adb logcat -d > crash-{timestamp}.log`
64
+ 3. Kill stale processes: `adb emu kill` / `xcrun simctl shutdown all`
65
+ 4. Re-boot with clean state (Step 7b)
66
+ 5. Resume from the last incomplete Maestro flow (do not re-run passed flows)
67
+ 6. Max 2 crash recovery attempts per platform. After 2 crashes, file P1 bug with crash logs and stop testing on that platform.
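The Android boot wait in Step 7b (poll until `sys.boot_completed` returns `1`, 10 retries at 12s intervals, one fresh retry on failure) can be sketched as below. This is an illustration under stated assumptions, not the pipeline's implementation: `wait_for_boot`, `boot_with_one_retry`, and the injected `probe`, `boot`, and `kill` callables are hypothetical. In practice `probe` would wrap `adb shell getprop sys.boot_completed`, and `kill` would wrap `adb emu kill`.

```python
import time
from typing import Callable


def wait_for_boot(
    probe: Callable[[], str],
    retries: int = 10,
    interval_s: float = 12.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns "1" or the retry budget is spent."""
    for _ in range(retries):
        if probe().strip() == "1":
            return True
        # 10 retries x 12s intervals gives the 120s budget from Step 7b.
        sleep(interval_s)
    return False


def boot_with_one_retry(
    boot: Callable[[], None],
    kill: Callable[[], None],
    probe: Callable[[], str],
    **wait_kwargs,
) -> bool:
    """Boot and wait; on failure kill the emulator and try one fresh boot."""
    for _ in range(2):
        boot()
        if wait_for_boot(probe, **wait_kwargs):
            return True
        kill()
    return False  # caller would file the [BLOCKER] with emulator logs
```

Injecting `sleep` and the command wrappers keeps the retry logic testable without a running emulator.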
@@ -0,0 +1,51 @@
1
+ # Mobile Estimation
2
+
3
+ **Purpose:** Assign a Fibonacci story point estimate for mobile implementation complexity. This is a lightweight estimation step — no code tools, no implementation. Read specs, assess complexity, output a number with rationale.
4
+
5
+ **Fibonacci scale:** 1, 2, 3, 5, 8, 13, 21
6
+
7
+ ## Step 1: Read Groomed Specs
8
+
9
+ Read and assess:
10
+ - `{story_output_dir}/reqs-brief.md` — REQUIRED
11
+ - `{story_output_dir}/uxa-spec.md` — REQUIRED (if UI profile active)
12
+ - `{story_output_dir}/qa-test-spec.md` — REQUIRED
13
+
14
+ ## Step 2: Assess Complexity Factors
15
+
16
+ Evaluate each factor and record your assessment:
17
+
18
+ | Factor | Assessment | Weight |
19
+ |--------|-----------|--------|
20
+ | **Screen count** | How many new or modified screens? Simple displays vs complex interactive screens? | High |
21
+ | **Navigation complexity** | Deep linking, nested stacks/tabs/drawers, modal flows, conditional navigation? | High |
22
+ | **Platform-specific requirements** | Android-only vs cross-platform? Platform-divergent behavior? | Medium |
23
+ | **Native module integration** | Camera, GPS, push notifications, biometrics, file system? | Medium |
24
+ | **State management complexity** | Local state vs global state? Offline persistence? Optimistic updates? | Medium |
25
+ | **API integration surface** | Number of endpoints consumed, real-time updates, file uploads? | Medium |
26
+
27
+ ## Step 3: Select Fibonacci Value
28
+
29
+ Map your assessment to the Fibonacci scale:
30
+
31
+ | Points | Typical Mobile Scope |
32
+ |--------|---------------------|
33
+ | 1 | Text change, style tweak, single prop addition |
34
+ | 2 | Simple display screen, minor layout change |
35
+ | 3 | Interactive screen with local state, form with validation |
36
+ | 5 | Multi-screen feature, navigation setup, API integration |
37
+ | 8 | Complex interactive feature, cross-platform divergence, native modules |
38
+ | 13 | Large feature with offline support, complex navigation, extensive platform handling |
39
+ | 21 | Epic-scale: new navigation paradigm or major platform integration (consider splitting) |
40
+
41
+ **Calibration context (if `{estimation_model}` is `calibrated`):**
42
+ If calibration directives are provided in `{correction_directives}`, factor them into your estimate. These are learned patterns from prior sprints.
43
+
44
+ ## Step 4: Write Estimate
45
+
46
+ Write to `{story_output_dir}/mobile-estimation.md` using `.valent-pipeline/templates/estimation.template.md`:
47
+ - Fibonacci value with brief rationale (2-3 sentences)
48
+ - Factor assessments from Step 2
49
+ - Calibration adjustments applied (if any)
50
+
51
+ Send: `[ESTIMATION] MOBILE estimates {story_id} at {points} points. See mobile-estimation.md.`
@@ -0,0 +1,30 @@
1
+ # MOBILE Step: Flutter Specifics
2
+
3
+ This step is loaded conditionally when `{tech_stack.mobile_framework}` is `flutter`. Read before implementing.
4
+
5
+ ## Flutter Build Configuration
6
+ - Debug builds for testing: `flutter build apk --debug` / `flutter build ios --debug --simulator`
7
+ - Resolve dependencies before build: `flutter pub get`
8
+ - Verify Flutter SDK version matches project constraints in `pubspec.yaml`
9
+ - Hot reload: disable during E2E (use cold start for each Maestro flow via `clearState`)
10
+
11
+ ## Flutter Testing Patterns
12
+ - Widget tests: use `flutter_test` with `WidgetTester` for component isolation
13
+ - Integration tests: use Maestro YAML flows (NOT `flutter_driver` or `integration_test` for pipeline E2E)
14
+ - State management: clear providers/blocs/cubits between test suites
15
+ - Platform channels: test with real native code, not mock method channel handlers
16
+
17
+ ## Flutter-Specific Emulator Setup
18
+ - Android: standard AVD boot, then `flutter install --device-id {emulator_id}`
19
+ - iOS: `open -a Simulator` if not already running, then `flutter install --device-id {simulator_id}`
20
+ - Verify device connection: `flutter devices` must list the target device
21
+
22
+ ## Area Labels for Testing
23
+ - Use `Key` with `ValueKey('testID')` for Flutter widgets
24
+ - Maestro `tapOn` with `id:` selector reads the `ValueKey` on both platforms
25
+ - Follow the area label convention from uxa-spec.md: `{screen}-{section}-{element}`
26
+
27
+ ## Offline Testing
28
+ - Use emulator console commands for network simulation: `adb emu network delay gprs` / `adb emu network speed gsm`
29
+ - Do NOT use `adb shell svc wifi disable` (unreliable on emulators)
30
+ - Test offline-capable features per reqs-brief offline requirements
@@ -0,0 +1,18 @@
1
+ # MOBILE Step: Handoff
2
+
3
+ Read `.valent-pipeline/steps/common/distilled-handoff-format.md` before writing output.
4
+
5
+ ## Step 11: Write mobile-handoff.md
6
+ Complete all sections of the handoff document using the template at `.valent-pipeline/templates/mobile-handoff.template.md`. Set `status: completed` in frontmatter.
7
+
8
+ If iOS tests were deferred (host is not Mac):
9
+ - Set `ios_deferred: true` in frontmatter
10
+ - Complete the `Deferred iOS Tests` section listing all unexecuted iOS flows
11
+ - Include in the inbox message: `[IOS-DEFERRED] {count} iOS Maestro flows deferred. Run /run-deferred-tests on Mac to complete.`
12
+
13
+ Notify lead via inbox: `[DONE] Mobile implementation complete. See mobile-handoff.md#orchestrator-summary.`
14
+
15
+ ## Independent Verification Requirement
16
+ All Android tests must pass before marking complete. If on Mac, all iOS tests must also pass. Do not mark complete with failing Android tests. Do not rely on BEND or CRITIC to catch your failures.
17
+
18
+ **Smoke test gate:** The app-level smoke test (Step 9b) must pass before sending `[DONE]`. If it fails, the app's entry point is not wired to your deliverable -- fix the wiring before marking complete.
@@ -0,0 +1,20 @@
1
+ # MOBILE Step: Implement
2
+
3
+ ## Step 3: Detect host platform
4
+ Run platform detection to determine available targets:
5
+ - `uname -s` returns `Darwin` → Mac: both Android and iOS targets available
6
+ - `uname -s` returns `Linux` or `MINGW*`/`MSYS*` → Windows/Linux: Android only, iOS deferred
7
+
8
+ Record platform capabilities in `mobile-handoff.md#platform-coverage`. If iOS is unavailable, set `ios_deferred: true` in handoff frontmatter.
9
+
10
+ ## Step 4: Plan screen architecture
11
+ From uxa-spec.md screen specifications (if present) or reqs-brief.md: identify screens, navigation structure (stack, tab, drawer), shared components, deep link URI patterns. Map to framework conventions for `{tech_stack.mobile_framework}`.
12
+
13
+ ## Step 5: Implement screens and navigation
14
+ Per spec: create screen components, navigation setup (React Navigation / Flutter Navigator), deep linking configuration. Apply `testID` attributes matching the area label system from uxa-spec.md. Record in `mobile-handoff.md#screens-implemented`.
15
+
16
+ ## Step 6: Implement components
17
+ Per spec: forms, lists, modals, gesture handlers, platform-specific components. Wire to backend API endpoints per `bend-handoff.md#api-endpoints-implemented` (if BEND is active). Record in `mobile-handoff.md#components-created`.
18
+
19
+ ## Step 7: Implement platform-specific behavior
20
+ Handle platform divergences: permissions (camera, location, notifications), native modules, platform-specific UI (Android back button, iOS swipe-to-go-back, safe areas, notch handling). Use `Platform.OS` / `Platform.select` for divergent behavior. Record decisions in `mobile-handoff.md#implementation-decisions`.
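The host detection in Step 3 amounts to a small mapping from `uname -s` output to available targets. A minimal sketch follows; `available_targets` and the returned dictionary keys are hypothetical, and raising on an unrecognized kernel name is an assumption, since the step does not say what to do in that case.

```python
def available_targets(uname_s: str) -> dict[str, bool]:
    """Map `uname -s` output to platform capabilities (hypothetical helper)."""
    kernel = uname_s.strip()
    if kernel == "Darwin":
        # Mac host: both Android and iOS targets are available.
        return {"android": True, "ios": True, "ios_deferred": False}
    if kernel == "Linux" or kernel.startswith(("MINGW", "MSYS")):
        # Linux or Windows (Git Bash / MSYS) host: Android only, iOS deferred.
        return {"android": True, "ios": False, "ios_deferred": True}
    # Assumption: fail loudly rather than guess on an unknown host.
    raise ValueError(f"unrecognized host platform: {kernel!r}")
```

The `ios_deferred` value is what would be written to the handoff frontmatter.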
@@ -0,0 +1,32 @@
1
+ # MOBILE Step: React Native Specifics
2
+
3
+ This step is loaded conditionally when `{tech_stack.mobile_framework}` is `react-native`. Read before implementing.
4
+
5
+ ## Metro Bundler Management
6
+ - Start Metro before any test execution: `npx react-native start --reset-cache &`
7
+ - Monitor Metro for JS bundle errors. If bundling fails, fix the code and retry.
8
+ - Handle port conflicts: check if port 8081 is in use before starting (`lsof -i :8081` on Mac/Linux, `netstat -ano | findstr 8081` on Windows). Kill stale Metro processes if needed.
9
+ - Kill Metro after all tests complete.
10
+
11
+ ## React Native Build Configuration
12
+ - Use `react-native.config.js` for native module auto-linking
13
+ - Hermes engine: verify Hermes is enabled for Android (check `android/app/build.gradle` for `enableHermes: true`)
14
+ - Flipper: disable in release builds, optional in debug
15
+ - Fast Refresh: disable during E2E to avoid test flakiness (`--no-interactive` flag)
16
+
17
+ ## React Native Testing Patterns
18
+ - Component tests: use React Native Testing Library (`@testing-library/react-native`)
19
+ - Navigation tests: verify deep link resolution via React Navigation linking config, verify screen transitions
20
+ - Platform-specific code: test `.android.tsx` and `.ios.tsx` variants separately when they exist
21
+ - AsyncStorage / MMKV: clear between test suites to prevent state leakage
22
+
23
+ ## Native Module Considerations
24
+ - If the story uses native modules, verify auto-linking succeeded on both platforms
25
+ - Bridge calls must be tested with real native code, not mocks
26
+ - Pod install (iOS): `cd ios && pod install` after adding native dependencies
27
+ - Gradle sync (Android): `cd android && ./gradlew --refresh-dependencies` if native module issues
28
+
29
+ ## Area Labels for Testing
30
+ - Use `testID` prop for React Native components (maps to `accessibilityIdentifier` on iOS, `resource-id` on Android)
31
+ - Maestro `tapOn` uses `id:` selector which reads `testID` on both platforms
32
+ - Follow the area label convention from uxa-spec.md: `{screen}-{section}-{element}`
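The `{screen}-{section}-{element}` area label convention can be enforced with a small builder. The sketch below is illustrative: `area_label` is a hypothetical helper, and the per-segment character rule (lowercase alphanumerics with optional underscores, no hyphens) is an assumption made so that a label always splits back into exactly three parts. uxa-spec.md may define different rules.

```python
import re

# Assumed segment rule: lowercase alphanumeric words joined by underscores.
_SEGMENT = re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)*$")


def area_label(screen: str, section: str, element: str) -> str:
    """Build a `{screen}-{section}-{element}` label, rejecting ambiguous segments."""
    parts = (screen, section, element)
    for part in parts:
        if not _SEGMENT.match(part):
            raise ValueError(f"invalid area label segment: {part!r}")
    return "-".join(parts)
```

For example, `area_label("login", "form", "submit")` yields `login-form-submit`, which would be used as the `testID` value and as the `id:` selector in a Maestro flow.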
@@ -0,0 +1,10 @@
1
+ # MOBILE Step: Read Inputs
2
+
3
+ ## Step 1: Read inputs
4
+ Read `reqs-brief.md`, `uxa-spec.md` (if UI profile active), and `qa-test-spec.md`. Understand: acceptance criteria, screen specifications, navigation flows, component hierarchy, platform-specific behaviors, Maestro flow specifications, test specifications.
5
+
6
+ ## Step 2: Read correction directives
7
+ Read `{correction_directives}`. Apply all directives targeting MOBILE. Note any conflicts with default behavior and follow the directive.
8
+
9
+ ## Step 2b: Query Knowledge Agent (Conditional)
10
+ If a Knowledge Agent is available in the team config, send: `[KNOWLEDGE-QUERY] What mobile patterns, navigation conventions, and platform-specific constraints should I know? Context: I am MOBILE implementing {story_id} using {tech_stack.mobile_framework}.` If no response within a reasonable time or no Knowledge Agent is spawned, proceed without.
@@ -0,0 +1,59 @@
1
+ # MOBILE Step: Write Tests
2
+
3
+ ## Step 8: Write Maestro YAML test flows
4
+ For each AC in qa-test-spec.md, write a Maestro YAML flow file. Place flows in `e2e/maestro/` directory.
5
+
6
+ Each flow file structure:
7
+ ```yaml
8
+ appId: {app_package_name}
9
+ name: {descriptive flow name}
10
+ ---
11
+ - clearState
12
+ - launchApp
13
+ # ... test steps: tapOn, assertVisible, inputText, scroll, back, swipe, etc.
14
+ ```
15
+
16
+ Rules:
17
+ - Every flow must start with `clearState` and `launchApp`
18
+ - Use `assertVisible` and `assertNotVisible` for assertions, not fixed-time waits
19
+ - Use `waitForAnimationToEnd` instead of hardcoded `extendedWaitUntil` timeouts
20
+ - Deep link tests: use `openLink` command with the URI pattern from reqs-brief
21
+ - Screenshot capture: use `takeScreenshot` at assertion points for evidence
22
+
23
+ Record in `mobile-handoff.md#maestro-flow-files`.
24
+
25
+ ## Step 8b: Write unit tests
26
+ Write unit tests per qa-test-spec.md using `{tech_stack.test_framework_unit}`. Unit tests MAY mock API clients for isolated component logic. Every mocked unit test for an API-calling AC must be paired with a real-API Maestro flow for the same AC. Record in `mobile-handoff.md#test-files-written`.
27
+
28
+ ## Step 9: Run unit tests, verify all pass
29
+ Run the unit test suite. All tests must pass. Record results in `mobile-handoff.md#test-results-summary`. If tests fail, fix the code -- do not skip or weaken tests.
30
+
31
+ ## Step 9b: App-Level Smoke Test
32
+
33
+ Write one test that bootstraps the application from its entry point and asserts the story's deliverable is present and reachable. This catches "unwired entry point" bugs where a screen exists but is never registered in the navigation. Mandatory for the first mobile story in a project, recommended for all subsequent stories.
34
+
35
+ Record in `mobile-handoff.md#test-files-written`.
36
+
37
+ ## Step 10: Run Maestro flows
38
+
39
+ ### Android (always)
40
+ For each flow file:
41
+ 1. State isolation (Step 7d)
42
+ 2. Execute: `maestro test {flow_file}`
43
+ 3. Record per-flow result (pass/fail with output)
44
+
45
+ ### iOS (Mac only)
46
+ For each flow file tagged as `both` or `ios`:
47
+ 1. State isolation (Step 7d, iOS variant)
48
+ 2. Execute: `maestro test {flow_file} --device {ios_simulator_name}`
49
+ 3. Record per-flow result
50
+
51
+ ### iOS Deferred (Windows/Linux)
52
+ If not on Mac, record all iOS-targeted flows in `mobile-handoff.md#deferred-ios-tests` with reason "Host OS lacks iOS simulator". This is expected, not a bug.
53
+
54
+ E2E tests run serially against the single emulator -- the emulator is shared mutable state. The 1.5-minute timeout per story applies to test execution time excluding emulator boot time.
55
+
56
+ ## Step 10b: Signal integration readiness
57
+ When mobile code is complete and all available-platform tests pass, send to BEND via inbox:
58
+ `[INTEGRATION-READY] Mobile code complete. Run integration tests against my app.`
59
+ Wait for BEND's `[INTEGRATION-READY]` message before running integration verification. Once both sides are ready, verify that API calls from the app resolve correctly against BEND's running server.
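The Step 8 rule that every flow starts with `clearState` and `launchApp` can be checked mechanically before running anything. The following is a rough, line-based sketch: `flow_starts_clean` is a hypothetical helper, and it assumes the flow file layout shown in Step 8 (a config section, a `---` separator, then top-level `- command` entries). A real linter would parse the YAML instead.

```python
def flow_starts_clean(flow_text: str) -> bool:
    """Return True if the first two commands are clearState then launchApp."""
    _, separator, body = flow_text.partition("\n---")
    if not separator:
        return False  # no config/commands separator found
    commands = []
    for line in body.splitlines():
        # Only top-level list entries count; nested arguments are indented.
        if line.startswith("- "):
            commands.append(line[2:].split(":", 1)[0].strip())
    return commands[:2] == ["clearState", "launchApp"]
```

Running this over every file in `e2e/maestro/` before Step 10 would catch flows that could leak state from a previous run.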
@@ -99,7 +99,7 @@ Otherwise, substitute variables in the knowledge spawn template (`{{story_id}}`,
99
99
 
100
100
  | Wave | Spawn Trigger | Agents |
101
101
  |---|---|---|
102
- | 2 | QA-A sends `[HANDOFF]` (completes) | BEND, FEND, CRITIC |
102
+ | 2 | QA-A sends `[HANDOFF]` (completes) | BEND, FEND, DATA, MCP-DEV, LIBDEV, DOCGEN, IAC, CRITIC (each only if not skipped by testing_profiles) |
103
103
  | 3 | CRITIC task becomes `in_progress` | QA-B, PMCP (if ui profile) |
104
104
  | 4 | JUDGE bug-review task becomes `in_progress` | (reserved) |
105
105
 
@@ -2,7 +2,7 @@
2
2
 
3
3
  **Condition:** Only execute in sprint mode (`{is_sprint_mode}` is true).
4
4
 
5
- Execute sprint stories sequentially. Phase 2 agents (BEND, FEND, CRITIC, QA-B, JUDGE) are spawned fresh per story and killed after each.
5
+ Execute sprint stories sequentially. BEND/FEND persist from the sizing phase into story 1. CRITIC, QA-B, and JUDGE are spawned fresh for story 1. For story 2+, all Phase 2 agents are spawned fresh and killed after each.
6
6
 
7
7
  ## For Each Sprint Story (Sequential)
8
8
 
@@ -22,7 +22,8 @@ For each story in sprint order:
22
22
  1. Update story status to `development` in both `{backlog_path}` and `sprint-{n}-status.yaml`
23
23
  2. Execute the standard story flow:
24
24
  - Create story branch
25
- - Spawn Phase 2 agents per the task graph (BEND, FEND, CRITIC, QA-B, JUDGE)
25
+ - **Story 1:** BEND/FEND are already alive from sizing; spawn only CRITIC, QA-B, and JUDGE fresh. Send implementation context to the existing BEND/FEND.
26
+ - **Story 2+:** Spawn all Phase 2 agents fresh per the task graph (BEND, FEND, CRITIC, QA-B, JUDGE)
26
27
  - Agents query Knowledge/SQLite for grooming context (NOT in-context from Phase 1)
27
28
  - Monitor execution, handle rejections, gate verdicts
28
29
  - On JUDGE SHIP: merge branch, record actuals
@@ -14,6 +14,10 @@ For each pending story in the grooming batch:
14
14
  - `api` — story has API endpoints, backend logic, or database changes
15
15
  - `ui` — story has UI components, pages, or visual elements
16
16
  - `data-pipeline` — story has ETL, data transformation, or batch processing
17
+ - `mcp-server` — story has MCP server tools, handlers, or protocol work
18
+ - `library` — story is shared library/package (exports, packaging, versioning)
19
+ - `document-generation` — story has document/report template or generation pipeline work
20
+ - `iac` — story has infrastructure work (Terraform, CloudFormation, Kubernetes, CI/CD)
17
21
  3. Write `testing_profiles: [api, ui]` (or whichever apply) to the story's backlog entry
18
22
 
19
23
  This must complete before Step 1. Downstream agents rely on `testing_profiles` to determine conditional steps.
@@ -2,33 +2,43 @@
2
2
 
3
3
  **Condition:** Only execute in sprint mode (`{is_sprint_mode}` is true).
4
4
 
5
- ## Step 1: Spawn Estimation Agents
5
+ ## Step 1: Spawn Developer Agents
6
6
 
7
- Spawn BEND with `.valent-pipeline/steps/bend/estimate.md` step file only (no implementation tools).
7
+ Scan groomed stories' `testing_profiles` in `{backlog_path}`:
8
8
 
9
- If any groomed stories have `fullstack-web` or `frontend-only` surface, also spawn FEND with `.valent-pipeline/steps/fend/estimate.md`.
9
+ - Spawn BEND if any groomed story has `api` in `testing_profiles`
10
+ - Spawn FEND if any groomed story has `ui` in `testing_profiles`
11
+ - Spawn DATA if any groomed story has `data-pipeline` in `testing_profiles`
12
+ - Spawn MCP-DEV if any groomed story has `mcp-server` in `testing_profiles`
13
+ - Spawn LIBDEV if any groomed story has `library` in `testing_profiles`
14
+ - Spawn DOCGEN if any groomed story has `document-generation` in `testing_profiles`
15
+ - Spawn IAC if any groomed story has `iac` in `testing_profiles`
10
16
 
11
- Pass `{estimation_model}` and `{correction_directives}` (calibration directives) in the spawn context.
17
+ Spawn with their normal prompt template and pass `.valent-pipeline/steps/{agent}/estimate.md` as the first step. Pass `{estimation_model}` and `{correction_directives}` (calibration directives) in the spawn context.
18
+
19
+ These agents persist into execution — they are NOT killed after sizing.
12
20
 
13
21
  ## Step 2: Size Each Groomed Story
14
22
 
15
23
  For each story with status `groomed`:
16
24
 
17
25
  1. Update status to `sizing` in `{backlog_path}`
18
- 2. Send story context (reqs-brief, uxa-spec, qa-test-spec) to BEND
19
- 3. BEND writes `bend-estimation.md` with Fibonacci points
20
- 4. If full-stack: FEND writes `fend-estimation.md` with Fibonacci points
21
- 5. **Record points:**
22
- - Backend-only: `story_points = BEND estimate`
23
- - Full-stack: `story_points = BEND estimate + FEND estimate`
24
- - Data-pipeline: `story_points = BEND estimate`
26
+ 2. Read story's `testing_profiles` from `{backlog_path}`
27
+ 3. Dispatch based on profiles, sending story context to **every agent whose profile is present**:
28
+ - `api` in profiles → send to BEND
29
+ - `ui` in profiles → send to FEND
30
+ - `data-pipeline` in profiles → send to DATA
31
+ - `mcp-server` in profiles → send to MCP-DEV
32
+ - `library` in profiles → send to LIBDEV
33
+ - `document-generation` in profiles → send to DOCGEN
34
+ - `iac` in profiles → send to IAC
35
+ Multiple profiles can be active (e.g., `[api, data-pipeline]` sends to both BEND and DATA).
36
+ 4. Agents write estimation files (`{agent}-estimation.md`)
37
+ 5. **Record points:** sum all agent estimates for the story.
38
+ `story_points = sum of all agent estimates received`
25
39
  6. Update story's `story_points` field in `{backlog_path}`
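
The dispatch and point-recording rules in the added Step 2 can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual implementation: the `Story` shape and the function names `agentsForStory` and `recordPoints` are hypothetical, and only the profile-to-agent pairs and the "sum all agent estimates" rule come from the step file.

```typescript
// Profile-to-agent pairs copied from the dispatch list in Step 2.
const PROFILE_AGENTS: Record<string, string> = {
  "api": "BEND",
  "ui": "FEND",
  "data-pipeline": "DATA",
  "mcp-server": "MCP-DEV",
  "library": "LIBDEV",
  "document-generation": "DOCGEN",
  "iac": "IAC",
};

// Hypothetical story shape; only `testing_profiles` and `story_points`
// are field names taken from the source.
interface Story {
  testing_profiles: string[];
  story_points?: number;
}

// Return every agent whose profile is present on the story (no duplicates).
function agentsForStory(story: Story): string[] {
  const agents = story.testing_profiles
    .map((profile) => PROFILE_AGENTS[profile])
    .filter((agent): agent is string => agent !== undefined);
  return Array.from(new Set(agents));
}

// story_points = sum of the estimates received from the dispatched agents.
// An agent with no entry in `estimates` contributes nothing to the sum.
function recordPoints(story: Story, estimates: Record<string, number>): number {
  const total = agentsForStory(story).reduce(
    (sum, agent) => sum + (estimates[agent] ?? 0),
    0
  );
  story.story_points = total;
  return total;
}

const exampleStory: Story = { testing_profiles: ["api", "data-pipeline"] };
console.log(agentsForStory(exampleStory)); // [ 'BEND', 'DATA' ]
console.log(recordPoints(exampleStory, { BEND: 5, DATA: 3 })); // 8
```

Keeping the mapping in one table means a new profile only needs a new entry, which matches how the step file lists one line per profile.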
26
40
 
27
- ## Step 3: Kill Estimation Agents
28
-
29
- In epic/project mode: kill BEND and FEND after sizing all stories. They will be respawned fresh per story during execution (each story needs clean code context).
30
-
31
- ## Step 4: Update Sprint State
41
+ ## Step 3: Update Sprint State
32
42
 
33
43
  Update `pipeline-state.json`: `current_sprint.phase = "planning"`.
34
44
 
@@ -33,11 +33,20 @@ Based on the story scope and project type, determine which testing profiles are
33
33
  | Story has API endpoints (backend routes, REST/GraphQL) | `api` |
34
34
  | Story has UI components (pages, components, visual changes) | `ui` |
35
35
  | Story has data pipeline work (ETL, transformations, migrations) | `data-pipeline` |
36
+ | Story has MCP server tools, handlers, or protocol work | `mcp-server` |
37
+ | Story is a shared library/package (exports, packaging, versioning) | `library` |
38
+ | Story has document/report template or generation pipeline work | `document-generation` |
39
+ | Story has infrastructure work (Terraform, CloudFormation, Kubernetes, CI/CD) | `iac` |
36
40
 
37
41
  Multiple profiles can be active. Examples:
38
42
  - Backend-only story: `[api]`
39
43
  - Frontend-only story: `[ui]`
40
44
  - Fullstack story with both API and UI work: `[api, ui]`
41
45
  - Data pipeline story: `[data-pipeline]`
46
+ - MCP server story: `[mcp-server]`
47
+ - Library/package story: `[library]`
48
+ - Document generation story: `[document-generation]`
49
+ - Infrastructure story: `[iac]`
50
+ - Fullstack story with infrastructure: `[api, ui, iac]`
42
51
 
43
52
  Set `{testing_profiles}` for use in shared context.
@@ -0,0 +1,32 @@
1
+ # QA-A Step: Data Pipeline Testing
2
+
3
+ ## Pipeline Smoke Test Specification
4
+
5
+ For every pipeline stage in this story, write a **Pipeline Smoke Test** table:
6
+
7
+ ```
8
+ ## Pipeline Smoke Tests
9
+
10
+ | ID | Input Dataset | Transform Step | Expected Output | Row Count Delta | Idempotency Check |
11
+ ```
12
+
13
+ Rules:
14
+ - One row per transform stage (ingest, each transform, output)
15
+ - Input dataset: exact description of seed data (file path, format, row count, key characteristics)
16
+ - Transform step: the specific stage being tested
17
+ - Expected output: key fields, values, and format QA-B must verify
18
+ - Row count delta: expected rows in vs rows out with reason for any difference
19
+ - Idempotency check: "Run twice, assert identical output" for every write stage
20
+ - Minimum per pipeline: one happy path per stage, one null/malformed input, one empty input
21
+ - Every filter/join stage MUST have a row asserting dropped rows are logged with reason
22
+ - Checkpoint/resume: at least one test row that simulates mid-pipeline failure and verifies resume produces correct final output
23
+
24
+ ## Quality Gate Additions
25
+
26
+ - [ ] Smoke test table covers every pipeline stage (ingest, transform, output)
27
+ - [ ] Every filter/join has a row count delta assertion with drop reason verification
28
+ - [ ] Idempotency test specified for every write stage
29
+ - [ ] Null and malformed input test cases included
30
+ - [ ] Empty input test case included
31
+ - [ ] Checkpoint/resume test case included (if pipeline supports checkpointing)
32
+ - [ ] Row counts verified at each stage boundary
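
As an illustration of the "Run twice, assert identical output" rule for write stages, the check might look like the sketch below. Everything here is assumed for the example: the `Row` and `Sink` types, the in-memory `Map` standing in for a real output store, and the two sample stages. A real pipeline would compare whatever its write stage actually persists.

```typescript
// Hypothetical row and sink types; a Map stands in for the real output store.
type Row = { id: number; value: string };
type Sink = Map<number, Row>;
type WriteStage = (input: Row[], sink: Sink) => void;

// Upsert-style write: re-running with the same input leaves the sink unchanged.
const upsertStage: WriteStage = (input, sink) => {
  for (const row of input) sink.set(row.id, row);
};

// Append-style write: every run adds new entries, so it is not idempotent.
const appendStage: WriteStage = (input, sink) => {
  for (const row of input) sink.set(sink.size, row);
};

// Serialize the sink in key order so two snapshots can be compared as strings.
function snapshot(sink: Sink): string {
  const entries = Array.from(sink.entries()).sort((a, b) => a[0] - b[0]);
  return JSON.stringify(entries);
}

// Run the stage twice against the same sink and compare the resulting state.
function isIdempotent(stage: WriteStage, input: Row[]): boolean {
  const sink: Sink = new Map();
  stage(input, sink);
  const afterFirstRun = snapshot(sink);
  stage(input, sink);
  return snapshot(sink) === afterFirstRun;
}

const seed: Row[] = [
  { id: 1, value: "a" },
  { id: 2, value: "b" },
];
console.log(isIdempotent(upsertStage, seed)); // true
console.log(isIdempotent(appendStage, seed)); // false
```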
@@ -0,0 +1,52 @@
1
+ # QA-A Step: Document Generation Testing
2
+
3
+ ## Render Smoke Test Specification
4
+
5
+ For every document template in this story, write a **Render Smoke Test** table:
6
+
7
+ ```
8
+ ## Render Smoke Tests
9
+
10
+ | ID | Template | Input Data | Expected Output Format | Validation Check |
11
+ ```
12
+
13
+ Rules:
14
+ - One row per template + scenario (happy path + key edge cases)
15
+ - Input Data: exact JSON payload or reference to fixture file
16
+ - Expected Output Format: PDF, HTML, Markdown, etc. with expected MIME type
17
+ - Validation Check: what QA-B must verify in the generated output
18
+ - Minimum per template: one happy path with all variables populated, one with null/missing optional variables, one with edge-case data (unicode, long strings, special characters)
19
+
20
+ ### Variable Substitution Tests
21
+
22
+ - **Normal substitution:** all required variables present and correctly typed -- verify they appear in output at expected positions
23
+ - **Null variables:** optional variables set to null -- verify graceful handling (omitted or default value), no literal `null` in output
24
+ - **Missing variables:** required variables omitted -- verify clear error, no unsubstituted markers (`{{varName}}`, `${varName}`, etc.) in output
25
+
26
+ ### Conditional Section Tests
27
+
28
+ - Templates with conditional sections must have test rows for each branch (true and false conditions)
29
+ - Templates with loops must have test rows for empty collection, single item, and multiple items
30
+
31
+ ### Output Format Validation
32
+
33
+ - Every declared output format must have at least one test row
34
+ - Validation must confirm correct MIME type and parseable structure (valid HTML, valid PDF, valid Markdown)
35
+
36
+ ### Encoding and Unicode Tests
37
+
38
+ - At least one test row with CJK characters, emoji, or RTL text in variable data
39
+ - Verify output preserves unicode correctly (no mojibake, no encoding errors)
40
+
41
+ ### No Unsubstituted Markers
42
+
43
+ - Every test must verify that no raw template markers appear in the final output
44
+
45
+ ## Quality Gate Additions
46
+
47
+ - [ ] Render smoke test table covers every template (happy path + null/missing + edge-case data)
48
+ - [ ] Variable substitution tested for normal, null, and missing cases
49
+ - [ ] Conditional sections tested for all branches
50
+ - [ ] Every output format has at least one validation test
51
+ - [ ] Encoding/unicode test included
52
+ - [ ] No unsubstituted markers assertion included in every test
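
The "no unsubstituted markers" and "no literal `null`" assertions could be implemented along these lines. This is a sketch under assumptions: it covers only the two marker styles named in the step file (`{{varName}}` and `${varName}`), and the helper names are hypothetical. Other template engines would need their own patterns.

```typescript
// Patterns for the two marker styles the step file mentions.
const MARKER_PATTERNS: RegExp[] = [
  /\{\{\s*[\w.]+\s*\}\}/g, // {{varName}}
  /\$\{\s*[\w.]+\s*\}/g, // ${varName}
];

// Return every raw template marker still present in the rendered output.
function findUnsubstitutedMarkers(output: string): string[] {
  const found: string[] = [];
  for (const pattern of MARKER_PATTERNS) {
    const matches = output.match(pattern);
    if (matches) found.push(...matches);
  }
  return found;
}

// Detect a standalone "null" word, which suggests a null variable was
// rendered literally instead of being omitted or defaulted.
function hasLiteralNull(output: string): boolean {
  return /\bnull\b/.test(output);
}

console.log(findUnsubstitutedMarkers("Hello {{name}}, total ${amount}"));
// [ '{{name}}', '${amount}' ]
console.log(findUnsubstitutedMarkers("Hello Ada, total 42")); // []
console.log(hasLiteralNull("Dear null,")); // true
```

Note that `hasLiteralNull` would also flag documents that legitimately contain the word "null", so a real test would likely scope it to the positions where variables are rendered.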
@@ -0,0 +1,30 @@
1
+ # QA-A Step: Infrastructure Testing
2
+
3
+ ## Infrastructure Smoke Test Specification
4
+
5
+ For every infrastructure resource in this story, write an **Infrastructure Smoke Test** table:
6
+
7
+ ```
8
+ ## Infrastructure Smoke Tests
9
+
10
+ | ID | Resource | Operation | Expected State | Validation Method |
11
+ ```
12
+
13
+ Rules:
14
+ - One row per resource provisioned or modified
15
+ - Resource: resource type and logical name (e.g., `aws_s3_bucket.data_lake`)
16
+ - Operation: plan, apply, destroy, or drift-check
17
+ - Expected state: the desired state after the operation (e.g., "exists with tags", "no diff on re-apply")
18
+ - Validation method: how QA-B verifies (e.g., "terraform plan output", "aws cli describe", "policy check output")
19
+ - Minimum per resource: one plan validation, one tagging check
20
+ - Every story must include: plan output validation (no errors), drift check (plan after apply = no changes), tagging check (all resources tagged), security policy check (no overly permissive IAM)
21
+ - Idempotency row required: "apply twice, second plan shows no changes"
22
+
23
+ ## Quality Gate Additions
24
+
25
+ - [ ] Smoke test table covers every infrastructure resource (plan + tagging + security)
26
+ - [ ] Plan output validation row present (terraform plan succeeds without errors)
27
+ - [ ] Drift check row present (plan after apply = no changes)
28
+ - [ ] Tagging check row present (all resources have standard tags)
29
+ - [ ] Security policy check row present (no wildcard IAM, no hardcoded secrets)
30
+ - [ ] Idempotency row present (apply twice = no changes)
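
For the security policy check row, one possible shape of a "no wildcard IAM" assertion is sketched below. It assumes the policy has already been extracted as a JSON document with a `Statement` array (real IAM policies may also use a single statement object, which this sketch does not handle), and the definition of "overly permissive" used here (an `Allow` statement with a wildcard action or a `*` resource) is an assumption, not something the step file specifies.

```typescript
// Simplified view of an IAM policy document; only the fields this check reads.
interface Statement {
  Effect: "Allow" | "Deny";
  Action: string | string[];
  Resource: string | string[];
}

interface PolicyDocument {
  Statement: Statement[];
}

// IAM allows a single string or a list for Action and Resource; normalize to a list.
function toList(value: string | string[]): string[] {
  return Array.isArray(value) ? value : [value];
}

// Assumed rule: an Allow statement is flagged if any action is "*" or
// "service:*", or if any resource is exactly "*".
function wildcardStatements(policy: PolicyDocument): Statement[] {
  return policy.Statement.filter((statement) => {
    if (statement.Effect !== "Allow") return false;
    const wildcardAction = toList(statement.Action).some((action) =>
      /(^|:)\*$/.test(action)
    );
    const wildcardResource = toList(statement.Resource).indexOf("*") !== -1;
    return wildcardAction || wildcardResource;
  });
}

const examplePolicy: PolicyDocument = {
  Statement: [
    { Effect: "Allow", Action: "s3:GetObject", Resource: "arn:aws:s3:::data-lake/*" },
    { Effect: "Allow", Action: "s3:*", Resource: "arn:aws:s3:::data-lake/*" },
  ],
};
console.log(wildcardStatements(examplePolicy).length); // 1
```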
@@ -0,0 +1,42 @@
1
+ # QA-A Step: Library Testing
2
+
3
+ ## Export Smoke Test Specification
4
+
5
+ For every public export in this story, write an **Export Smoke Test** table:
6
+
7
+ ```
8
+ ## Export Smoke Tests
9
+
10
+ | ID | Import Method | Module Path | Expected Export | Verification |
11
+ ```
12
+
13
+ Rules:
14
+ - One row per export + import method (CJS `require()` and ESM `import` for each export)
15
+ - Module path: exact path from the exports map (e.g., `"./utils"`, `"."`)
16
+ - Expected export: the named or default export and its expected type/signature
17
+ - Verification: what to assert (typeof, instanceof, return value shape, callable, etc.)
18
+ - Minimum per export: one CJS row, one ESM row
19
+ - Type declaration exports must have a verification row confirming .d.ts resolution
20
+ - Backwards compatibility: if this is an update to an existing library, include rows verifying that previously documented imports still resolve
21
+
22
+ ## Tree-Shaking Specification
23
+
24
+ If the library declares `sideEffects: false`, write a **Tree-Shaking Test** table:
25
+
26
+ ```
27
+ ## Tree-Shaking Tests
28
+
29
+ | ID | Import Statement | Expected Included | Expected Excluded | Verification |
30
+ ```
31
+
32
+ Rules:
33
+ - Selective import must not pull in unrelated modules
34
+ - Bundle output must not contain code from unused exports
35
+ - Side-effect-free imports must produce no console output or global mutations
36
+
37
+ ## Quality Gate Additions
38
+
39
+ - [ ] Export smoke test table covers every public export (CJS + ESM rows)
40
+ - [ ] Type declaration verification rows present for all typed exports
41
+ - [ ] Backwards compatibility rows present for updated libraries
42
+ - [ ] Tree-shaking tests present if sideEffects: false is declared
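
The first quality gate (every public export has both a CJS and an ESM row) lends itself to a mechanical check. The sketch below assumes the smoke test table has been parsed into objects. The `ExportRow` field names mirror the table columns, but the type itself and the `missingRows` helper are hypothetical.

```typescript
type ImportMethod = "CJS" | "ESM";

// One parsed row of the Export Smoke Test table (field names are illustrative).
interface ExportRow {
  id: string;
  importMethod: ImportMethod;
  modulePath: string;
  expectedExport: string;
}

const REQUIRED_METHODS: ImportMethod[] = ["CJS", "ESM"];

// List every (export, import method) pair that has no row in the table.
function missingRows(publicExports: string[], rows: ExportRow[]): string[] {
  const missing: string[] = [];
  for (const name of publicExports) {
    for (const method of REQUIRED_METHODS) {
      const covered = rows.some(
        (row) => row.expectedExport === name && row.importMethod === method
      );
      if (!covered) missing.push(`${name} (${method})`);
    }
  }
  return missing;
}

const exampleRows: ExportRow[] = [
  { id: "EX-1", importMethod: "CJS", modulePath: ".", expectedExport: "parse" },
  { id: "EX-2", importMethod: "ESM", modulePath: ".", expectedExport: "parse" },
  { id: "EX-3", importMethod: "ESM", modulePath: "./utils", expectedExport: "slugify" },
];
console.log(missingRows(["parse", "slugify"], exampleRows)); // [ 'slugify (CJS)' ]
```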