@jterrats/open-orchestra 1.0.2 → 1.0.4

Files changed (151)
  1. package/AGENTS.md +7 -2
  2. package/CLAUDE.md +2 -2
  3. package/README.md +3 -0
  4. package/dist/args.js +12 -2
  5. package/dist/args.js.map +1 -1
  6. package/dist/assets/web-console.js +44 -0
  7. package/dist/autonomous-phase-lifecycle.js +23 -3
  8. package/dist/autonomous-phase-lifecycle.js.map +1 -1
  9. package/dist/autonomous-run-state.js +2 -0
  10. package/dist/autonomous-run-state.js.map +1 -1
  11. package/dist/benchmark.js +6 -0
  12. package/dist/benchmark.js.map +1 -1
  13. package/dist/cli.js +4 -1
  14. package/dist/cli.js.map +1 -1
  15. package/dist/command-manifest.js +4 -3
  16. package/dist/command-manifest.js.map +1 -1
  17. package/dist/command-utils.js +4 -5
  18. package/dist/command-utils.js.map +1 -1
  19. package/dist/commands.d.ts +1 -1
  20. package/dist/commands.js +1 -1
  21. package/dist/commands.js.map +1 -1
  22. package/dist/metrics-commands.js +8 -0
  23. package/dist/metrics-commands.js.map +1 -1
  24. package/dist/phase-playbooks.js +27 -1
  25. package/dist/phase-playbooks.js.map +1 -1
  26. package/dist/roles/core-roles.js +10 -5
  27. package/dist/roles/core-roles.js.map +1 -1
  28. package/dist/skills-catalog.js +136 -0
  29. package/dist/skills-catalog.js.map +1 -1
  30. package/dist/skills-commands.d.ts +1 -0
  31. package/dist/skills-commands.js +37 -1
  32. package/dist/skills-commands.js.map +1 -1
  33. package/dist/skills-planning.d.ts +2 -1
  34. package/dist/skills-planning.js +79 -11
  35. package/dist/skills-planning.js.map +1 -1
  36. package/dist/skills.d.ts +1 -1
  37. package/dist/skills.js +1 -1
  38. package/dist/skills.js.map +1 -1
  39. package/dist/task-graph-commands.js +36 -8
  40. package/dist/task-graph-commands.js.map +1 -1
  41. package/dist/types/metrics.d.ts +2 -0
  42. package/dist/types/skills.d.ts +9 -0
  43. package/dist/types/tasks.d.ts +8 -1
  44. package/dist/types.d.ts +2 -2
  45. package/dist/types.js.map +1 -1
  46. package/dist/web-api.js +80 -7
  47. package/dist/web-api.js.map +1 -1
  48. package/dist/workflow-approval-service.js +13 -0
  49. package/dist/workflow-approval-service.js.map +1 -1
  50. package/dist/workflow-evidence-service.js +37 -2
  51. package/dist/workflow-evidence-service.js.map +1 -1
  52. package/dist/workflow-gates.js +56 -1
  53. package/dist/workflow-gates.js.map +1 -1
  54. package/dist/workflow-phase-planner.js +86 -13
  55. package/dist/workflow-phase-planner.js.map +1 -1
  56. package/dist/workflow-run-commands.d.ts +1 -0
  57. package/dist/workflow-run-commands.js +11 -6
  58. package/dist/workflow-run-commands.js.map +1 -1
  59. package/dist/workflow-services.js +24 -0
  60. package/dist/workflow-services.js.map +1 -1
  61. package/dist/workflow-task-service.js +27 -2
  62. package/dist/workflow-task-service.js.map +1 -1
  63. package/docs/adoption-guide.md +22 -1
  64. package/docs/advisory-supervisor-architecture.md +206 -0
  65. package/docs/architecture.md +47 -41
  66. package/docs/autonomous-workflow.md +2 -2
  67. package/docs/backlog/ac-evidence-bugfix-stories-20260517.md +76 -0
  68. package/docs/backlog/chaos-testing-stack-strategy.md +146 -0
  69. package/docs/backlog/dev-best-practices-hardening-story.md +69 -0
  70. package/docs/backlog/docs-public-internal-package-hygiene-story.md +62 -0
  71. package/docs/backlog/project-persona-registry-epic.md +350 -0
  72. package/docs/backlog/prompt-bank-registry-epic.md +159 -0
  73. package/docs/backlog/site-docs-manifest-story.md +56 -0
  74. package/docs/dev-team-specialist-role-profiles.md +1 -1
  75. package/docs/diagrams/diagram-master-prompt.md +207 -0
  76. package/docs/diagrams/enterprise-set/README.md +22 -0
  77. package/docs/diagrams/enterprise-set/lead-to-account-swimlanes.svg +38 -0
  78. package/docs/diagrams/enterprise-set/product-implementation-timeline.svg +45 -0
  79. package/docs/diagrams/enterprise-set/salesforce-enterprise-architecture.svg +54 -0
  80. package/docs/diagrams/experiments/pixel-v2-review.md +124 -0
  81. package/docs/diagrams/experiments/roadmap/diagram.mmd +14 -0
  82. package/docs/diagrams/experiments/roadmap/diagram.svg +48 -0
  83. package/docs/diagrams/experiments/roadmap/experiment.md +44 -0
  84. package/docs/diagrams/experiments/sfdc-implementation/diagram.mmd +54 -0
  85. package/docs/diagrams/experiments/sfdc-implementation/diagram.svg +72 -0
  86. package/docs/diagrams/experiments/sfdc-implementation/experiment.md +41 -0
  87. package/docs/diagrams/experiments/swimlane/diagram.mmd +40 -0
  88. package/docs/diagrams/experiments/swimlane/diagram.svg +70 -0
  89. package/docs/diagrams/experiments/swimlane/experiment.md +50 -0
  90. package/docs/diagrams/experiments/timeline/diagram.mmd +9 -0
  91. package/docs/diagrams/experiments/timeline/diagram.svg +29 -0
  92. package/docs/diagrams/experiments/timeline/experiment.md +34 -0
  93. package/docs/diagrams/final-artifact-hygiene.md +40 -0
  94. package/docs/diagrams/mermaid-target-strategy.md +106 -0
  95. package/docs/diagrams/payment-gateway/architecture.md +57 -0
  96. package/docs/diagrams/payment-gateway/architecture.mmd +39 -0
  97. package/docs/diagrams/payment-gateway/architecture.svg +171 -0
  98. package/docs/diagrams/prompt-bank.md +48 -0
  99. package/docs/diagrams/salesforce-integration/architecture.md +56 -0
  100. package/docs/diagrams/salesforce-integration/architecture.mmd +26 -0
  101. package/docs/diagrams/salesforce-integration/architecture.svg +123 -0
  102. package/docs/diagrams/source-fidelity-review.md +116 -0
  103. package/docs/diagrams/state-uml-recreated.drawio +336 -0
  104. package/docs/diagrams/state-uml-recreated.prompt.md +114 -0
  105. package/docs/diagrams/state-uml-recreated.prompt.v10.md +52 -0
  106. package/docs/diagrams/state-uml-recreated.prompt.v11.md +52 -0
  107. package/docs/diagrams/state-uml-recreated.prompt.v12.md +50 -0
  108. package/docs/diagrams/state-uml-recreated.prompt.v14.md +91 -0
  109. package/docs/diagrams/state-uml-recreated.prompt.v2.md +31 -0
  110. package/docs/diagrams/state-uml-recreated.prompt.v3.md +36 -0
  111. package/docs/diagrams/state-uml-recreated.prompt.v4.md +35 -0
  112. package/docs/diagrams/state-uml-recreated.prompt.v5.md +35 -0
  113. package/docs/diagrams/state-uml-recreated.prompt.v6.md +39 -0
  114. package/docs/diagrams/state-uml-recreated.prompt.v7.md +37 -0
  115. package/docs/diagrams/state-uml-recreated.prompt.v8.md +41 -0
  116. package/docs/diagrams/state-uml-recreated.prompt.v9.md +32 -0
  117. package/docs/diagrams/state-uml-recreated.svg +159 -0
  118. package/docs/diagrams/v14-stress-test/README.md +33 -0
  119. package/docs/diagrams/v14-stress-test/stress-test.svg +114 -0
  120. package/docs/external-artifact-import-bridge.md +56 -0
  121. package/docs/{setup-agents-applicability-review.md → external-baseline-applicability-review.md} +37 -40
  122. package/docs/{setup-agents-dogfooding-findings.md → external-baseline-dogfooding-findings.md} +10 -9
  123. package/docs/multi-agent-orchestrator-backlog.md +1 -1
  124. package/docs/orchestra-mvp.md +19 -0
  125. package/docs/persona-workflows.md +42 -0
  126. package/docs/release-test-matrix.md +21 -9
  127. package/docs/reports/ac-evidence-backfill-20260517.md +256 -0
  128. package/docs/reports/ac-evolution-reconciliation-20260517.md +366 -0
  129. package/docs/reports/ac-failure-evidence-20260517.md +115 -0
  130. package/docs/reports/ac-history-dry-run-20260517.md +434 -0
  131. package/docs/runtime-llm-flow.md +8 -0
  132. package/docs/site-content-workflow.md +96 -0
  133. package/docs/site-manifest.json +143 -0
  134. package/docs/skill-loading-strategy.md +18 -7
  135. package/docs/story-mapping-adoption-review.md +99 -0
  136. package/docs/workspace-repo-strategy.md +63 -0
  137. package/package.json +3 -1
  138. package/rules/agent-collaboration.mdc +2 -0
  139. package/rules/code-review-engineering.mdc +2 -0
  140. package/rules/delivery-quality-gates.mdc +12 -0
  141. package/rules/development-engineering.mdc +3 -0
  142. package/rules/diagram-quality.mdc +35 -0
  143. package/rules/module-boundaries.mdc +71 -0
  144. package/rules/testing-discipline.mdc +13 -0
  145. package/skills/chaos-resilience-testing/SKILL.md +127 -0
  146. package/skills/chaos-resilience-testing/manifest.json +61 -0
  147. package/skills/collection-standards/SKILL.md +2 -0
  148. package/skills/diagram-export/SKILL.md +30 -0
  149. package/skills/qa-evidence-pack/SKILL.md +110 -0
  150. package/skills/qa-evidence-pack/manifest.json +60 -0
  151. package/docs/setup-agents-bridge.md +0 -61
package/rules/module-boundaries.mdc +71 -0
@@ -0,0 +1,71 @@
1
+ ---
2
+ description: Module boundaries, god-file prevention, and thin adapter standards
3
+ alwaysApply: true
4
+ ---
5
+
6
+ # Module Boundaries
7
+
8
+ Every code change must preserve clear ownership boundaries. Before adding code
9
+ to an existing file, check whether the file is already large, multi-purpose, or
10
+ adapter-shaped. If the change would make the file harder to review, create or
11
+ reuse the correct domain, model, service, repository, or adapter module instead.
12
+
13
+ ## Pre-Write Check
14
+
15
+ - Inspect the target file's current responsibility, exported surface, and size
16
+ before editing.
17
+ - Treat files over 300 lines, functions over 30 lines, and command/controller
18
+ files with business logic as god-file risk.
19
+ - A large existing file is not a reason to keep adding to it. If the new change
20
+ is separable, extract the new behavior into a focused module and wire it from
21
+ the existing entry point.
22
+ - If extraction is unsafe in the current task, record a follow-up debt task with
23
+ the reason, affected file, and proposed boundary.
24
+
25
+ ## Expected Layers
26
+
27
+ - `model` or `types`: narrow public data contracts, discriminated unions,
28
+ schemas, and DTOs.
29
+ - `domain`: pure invariants, policy decisions, validation rules, state
30
+ transitions, and calculations.
31
+ - `service` or `use-case`: orchestration of domain logic, repositories, clients,
32
+ and side effects for one workflow.
33
+ - `repository`, `store`, or `gateway`: persistence, file I/O, network I/O, and
34
+ external system adapters.
35
+ - `commands`: CLI adapter only. Parse arguments, call services, format output,
36
+ and convert errors to user-safe messages.
37
+ - `web` or `api`: HTTP/UI adapter only. Parse requests, call services, serialize
38
+ responses, and map errors.
39
+
40
+ ## Logicless Commands
41
+
42
+ Command modules must remain nearly logicless. They may:
43
+
44
+ - parse flags and positional arguments;
45
+ - choose output format;
46
+ - call one service/use-case function;
47
+ - map expected errors to CLI messages and exit codes.
48
+
49
+ Command modules must not:
50
+
51
+ - own business rules or workflow policy;
52
+ - perform direct persistence when a repository/service should own it;
53
+ - contain repeated hardcoded registries, option lists, status sets, or provider
54
+ matrices;
55
+ - implement complex loops, joins, retries, or batching;
56
+ - become the primary test target for domain behavior.
57
+
58
+ ## Hardcoded Collections
59
+
60
+ Repeated hardcoded values must move to a typed source of truth. This applies to
61
+ roles, statuses, providers, commands, option lists, validators, selectors,
62
+ fixtures, CI matrices, and any key/value collection reused by more than one
63
+ consumer. Load `collection-standards` when this risk appears.
64
+
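The "Hardcoded Collections" rule above can be illustrated with a minimal TypeScript sketch. The status names and helper functions are hypothetical and are not taken from the package; the point is only that the union type, the runtime guard, and the help text all derive from one array.

```typescript
// Hypothetical example: one typed source of truth for task statuses.
// The status names are illustrative, not taken from the package.
const TASK_STATUSES = ["pending", "in_progress", "blocked", "done"] as const;

// The union type is derived from the array, so the two cannot drift apart.
type TaskStatus = (typeof TASK_STATUSES)[number];

// The runtime guard reuses the same collection instead of repeating literals.
function isTaskStatus(value: string): value is TaskStatus {
  return (TASK_STATUSES as readonly string[]).indexOf(value) !== -1;
}

// A second consumer (for example a CLI option description) reads the same source.
function statusHelpText(): string {
  return `Allowed statuses: ${TASK_STATUSES.join(", ")}`;
}
```

Adding a status then means editing one line, and every consumer picks it up.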
65
+ ## Review Checklist
66
+
67
+ - Did the author check target file size and responsibility before writing?
68
+ - Did new logic land in the correct layer?
69
+ - Is the command/controller/route still a thin adapter?
70
+ - Are repeated hardcoded collections extracted to a typed source of truth?
71
+ - Is there a follow-up debt task when extraction was intentionally deferred?
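One way to read the "Logicless Commands" section is sketched below. All names (`approveTask`, `approveCommand`, the `--task` and `--approver` flags, exit code 2) are hypothetical and do not describe the package's real CLI. Both layers appear in one block only for brevity; under the rule they would live in separate modules.

```typescript
// Domain layer: a pure rule with no I/O. All names here are hypothetical.
type ApprovalRequest = { taskId: string; approver: string };
type ApprovalResult =
  | { ok: true; taskId: string }
  | { ok: false; reason: "missing_task" | "missing_approver" };

function approveTask(request: ApprovalRequest): ApprovalResult {
  if (request.taskId === "") return { ok: false, reason: "missing_task" };
  if (request.approver === "") return { ok: false, reason: "missing_approver" };
  return { ok: true, taskId: request.taskId };
}

// Command layer: parse flags, call one function, map the result to output.
type CommandOutput = { exitCode: number; message: string };

// Returns the token after `--name`, or "" when the flag is absent.
function parseFlag(argv: string[], name: string): string {
  const index = argv.indexOf(`--${name}`);
  const value = index === -1 ? undefined : argv[index + 1];
  return value === undefined ? "" : value;
}

function approveCommand(argv: string[]): CommandOutput {
  const result = approveTask({
    taskId: parseFlag(argv, "task"),
    approver: parseFlag(argv, "approver"),
  });
  if (result.ok) return { exitCode: 0, message: `Approved ${result.taskId}` };
  // Expected errors are mapped to user-safe messages; no business rule lives here.
  const messages = {
    missing_task: "Missing required flag --task",
    missing_approver: "Missing required flag --approver",
  };
  return { exitCode: 2, message: messages[result.reason] };
}
```

With this split, domain behavior is tested through `approveTask`, and the command only needs a thin test for flag parsing and error mapping.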
package/rules/testing-discipline.mdc +13 -0
@@ -6,40 +6,53 @@ alwaysApply: true
6
6
  # Testing Discipline
7
7
 
8
8
  ## Test-Driven Development (TDD)
9
+
9
10
  - Write the test **before** or **alongside** the implementation. At minimum, tests must exist before the PR.
10
11
  - Red → Green → Refactor. Start with a failing test, make it pass with minimal code, then clean up.
11
12
  - Every development task must include unit tests for new or changed business logic before it is handed to QA.
12
13
 
13
14
  ## Behavior-Driven Development (BDD)
15
+
14
16
  - Test **behavior**, not implementation. Test what the function does, not how it does it.
15
17
  - Name tests as specifications: `it('rejects orders with zero quantity')`, not `it('test1')`.
16
18
  - One assertion per test method. If you need multiple, it's multiple behaviors — split them.
17
19
 
18
20
  ## Test Structure
21
+
19
22
  - **Arrange → Act → Assert.** Separate setup, execution, and verification with blank lines.
20
23
  - Use factory functions or builders for test data — never copy-paste fixtures across test files.
21
24
  - QA automation, E2E suites, contract tests, and test scripts that repeat fixture collections, selectors, expected outputs, or command matrices must load the `collection-standards` skill.
22
25
  - Tests must be deterministic. No reliance on system clock, network, or random values without seeding.
23
26
 
24
27
  ## Sync Tests
28
+
25
29
  - If data is duplicated across packages (e.g., type definitions, config arrays), a test must assert both copies are identical.
26
30
  - Schema changes in a source of truth must break a test somewhere — if they don't, add one.
27
31
 
28
32
  ## Coverage
33
+
29
34
  - Target **90%+ line coverage** for business logic. Infrastructure/glue code can be lower.
30
35
  - Coverage is a floor, not a goal. 100% coverage with bad assertions is worse than 80% with good ones.
31
36
 
32
37
  ## E2E / Integration
38
+
33
39
  - Prefer Playwright for browser-based E2E, smoke, and regression automation.
34
40
  - Use the Page Object pattern for UI tests. Selectors live in page objects, not test bodies.
35
41
  - Tag tests by speed/scope (`@smoke`, `@regression`) so CI can run fast feedback loops.
36
42
  - Capture evidence for E2E failures with traces, screenshots, or videos when supported by the framework.
43
+ - QA, SDET, Developer, BA, Architect, and Release work that produces or reviews evidence must load the `qa-evidence-pack` skill when it involves acceptance criteria coverage, Playwright/browser artifacts, CLI stdout/stderr, API contracts, integration side effects, screenshots, visual diffs, or annotated defect evidence.
44
+ - Keep large screenshots, videos, traces, logs, API payloads, and visual diffs as files. Summarize them in a compact evidence report so agents do not consume context with raw artifacts.
37
45
 
38
46
  ## QA Handoff
47
+
39
48
  - Developer must provide QA with test commands run, pass/fail results, covered scenarios, and known gaps.
40
49
  - QA must produce a test plan before release approval and map every acceptance criterion to automated, manual, contract/mock, or deferred evidence.
41
50
  - QA evidence must validate observable outcomes, not only execution. CLI checks assert exit code, stdout/stderr, files, events, or final state; browser checks assert visible user-facing state; API checks assert response contract and side effects; integration checks assert sandbox/mock/contract/webhook/event/log outcomes or defer with owner and rationale.
42
51
  - Evidence summaries or metadata must name the covered acceptance criterion or explicitly state that all acceptance criteria are covered. Smoke and regression checks are useful but do not count as acceptance coverage unless they map to an acceptance criterion.
52
+ - Visual/UI/diagram defect evidence must include source or expected image when available, actual screenshot/render, diff image when practical, and an annotated screenshot for ambiguous failures. Use red boxes for broken bounds/overlap, orange arrows for wrong connectors or flow, yellow translucent areas for excess spacing, blue guide lines for alignment, and short defect labels.
53
+ - Executed QA evidence must receive a sprint-review-style cross-review before release: Analyst/Business Analyst compares the evidence against the GitHub issue, user story, acceptance criteria, and Orchestra task, while Architect validates technical contract, integration, boundary, data-flow, and risk coverage.
54
+ - Analyst/Business Analyst must comment on the GitHub issue/user story and Orchestra task when the evidence does not prove the requested behavior, misses acceptance criteria, or exposes workflow gaps. These findings block release until fixed or explicitly risk-accepted by the Product Owner.
55
+ - If Analyst/BA or Architect review is not applicable, QA must record the rationale and Product Owner acceptance before release.
43
56
  - QA and Developer must decide which manual checks should be automated, preferring Playwright for browser flows.
44
57
  - User-facing QA plans must include responsive, accessibility, copy, tooltip, loading, empty, error, success, and recovery-state checks.
45
58
  - API, data, async, performance, and config changes must include targeted regression checks for contract, migration, idempotency, latency, and environment behavior when applicable.
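The "Test Structure" rules in this file (Arrange, Act, Assert; factory functions for test data; no reliance on the system clock) can be sketched without a test framework. The `Order` type, `buildOrder`, and `isExpired` are hypothetical names chosen for illustration. In a real suite the test function would be an `it('expires orders older than the TTL', ...)` block.

```typescript
type Order = { id: string; quantity: number; placedAt: number };

// Factory function: each test overrides only the fields it cares about.
function buildOrder(overrides: Partial<Order> = {}): Order {
  return { id: "order-1", quantity: 1, placedAt: 0, ...overrides };
}

// The unit under test receives the clock instead of calling Date.now().
function isExpired(order: Order, now: () => number, ttlMs: number): boolean {
  return now() - order.placedAt > ttlMs;
}

// Specification-style test: "expires orders older than the TTL".
function testExpiresOrdersOlderThanTtl(): boolean {
  // Arrange
  const order = buildOrder({ placedAt: 1000 });
  const fixedClock = () => 61001;

  // Act
  const expired = isExpired(order, fixedClock, 60000);

  // Assert
  return expired === true;
}
```

Because the clock is injected, the result does not depend on when or where the test runs.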
package/skills/chaos-resilience-testing/SKILL.md +127 -0
@@ -0,0 +1,127 @@
1
+ # Chaos Resilience Testing
2
+
3
+ Design deterministic failure scenarios that prove workflows, APIs, providers,
4
+ gates, budgets, and regulated flows degrade safely.
5
+
6
+ ## When To Load
7
+
8
+ - Trigger: `chaos`
9
+ - Trigger: `resilience`
10
+ - Trigger: `fault injection`
11
+ - Trigger: `failure mode`
12
+ - Trigger: `provider timeout`
13
+ - Trigger: `provider unavailable`
14
+ - Trigger: `offline mode`
15
+ - Trigger: `circuit breaker`
16
+ - Trigger: `rate limit`
17
+ - Trigger: `budget exhaustion`
18
+ - Trigger: `approval race`
19
+ - Trigger: `policy failure`
20
+ - Trigger: `audit failure`
21
+ - Trigger: `stale data`
22
+ - Trigger: `corrupted state`
23
+ - Trigger: `tenant isolation`
24
+ - Trigger: `regulated flow`
25
+
26
+ ## Procedure
27
+
28
+ 1. Identify the task, acceptance criteria, impacted runtime surfaces, and the
29
+ user-visible or release-critical outcome that must survive failure.
30
+ 2. Classify each failure as one of:
31
+ - fail closed: security, approvals, regulated authority, secrets, PII/PHI,
32
+ payment, policy, tenant isolation, or destructive actions;
33
+ - degrade with recovery: optional enrichment, UI panels, advisory features,
34
+ non-critical telemetry, or external references;
35
+ - retry with bounds: transient provider/API, storage, webhook, or scheduler
36
+ failures with explicit timeout, backoff, and retry limits.
37
+ 3. Select deterministic scenarios before implementation. Prefer controlled
38
+ stubs, fake providers, injected stores, fixture corruption, and bounded
39
+ timeout simulation over random production-style fault injection.
40
+ 4. For each scenario, define:
41
+ - fault injected;
42
+ - expected behavior;
43
+ - expected user/operator message;
44
+ - expected audit/event/evidence output;
45
+ - recovery path;
46
+ - acceptance criteria covered.
47
+ 5. Validate at least the relevant categories:
48
+ - provider/model timeout or unavailable provider;
49
+ - external API/network unavailable;
50
+ - corrupted or partially written local state;
51
+ - stale reads or cache mismatch;
52
+ - concurrent update/approval race;
53
+ - budget/rate-limit exhaustion;
54
+ - policy engine denial or failure;
55
+ - audit/event write failure;
56
+ - offline mode with optional sources unavailable;
57
+ - tenant/regulatory boundary enforcement.
58
+ 6. Capture observable evidence. A passing command alone is not enough; prove the
59
+ final state, emitted event, user message, skipped activation, blocked gate, or
60
+ recovery artifact.
61
+ 7. Record unresolved resilience gaps with owner, severity, release impact, and
62
+ whether Product/Security/Compliance accepted the risk.
63
+
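Steps 2 and 4 of the procedure above describe a scenario record. A minimal TypeScript sketch of that record follows. The field names and the `missingFields` helper are assumptions for illustration, not the package's actual types.

```typescript
// The three failure classes from step 2 of the procedure.
type FailureClass = "fail_closed" | "degrade_with_recovery" | "retry_with_bounds";

// The six items from step 4, plus a name and class. Field names are hypothetical.
interface ChaosScenario {
  name: string;
  failureClass: FailureClass;
  faultInjected: string;
  expectedBehavior: string;
  expectedMessage: string;
  expectedEvidence: string;
  recoveryPath: string;
  acceptanceCriteria: string[];
}

// Returns the names of the step-4 fields that a scenario leaves empty.
function missingFields(scenario: ChaosScenario): string[] {
  const textFields = [
    "faultInjected",
    "expectedBehavior",
    "expectedMessage",
    "expectedEvidence",
    "recoveryPath",
  ] as const;
  const missing: string[] = textFields.filter(
    (field) => scenario[field].trim() === ""
  );
  if (scenario.acceptanceCriteria.length === 0) missing.push("acceptanceCriteria");
  return missing;
}
```

A check like this can run before implementation starts, so an incomplete scenario is caught while it is still cheap to fix.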
64
+ ## Stack Guidance
65
+
66
+ - Start with local deterministic faults: Node tests, fake providers, fake
67
+ storage/repositories, controlled timers, `AbortController`, injected clocks,
68
+ and fixture corruption.
69
+ - Use Playwright route stubs for web/API degraded states such as timeout, stale
70
+ data, malformed payload, empty response, or server error.
71
+ - Use Docker Compose, Toxiproxy, WireMock/MSW/Pact, k6, and OpenTelemetry only
72
+ when integration or SaaS boundaries require network/service-level evidence.
73
+ - Use Chaos Mesh or LitmusChaos only for future Kubernetes-managed services;
74
+ these are not npm package MVP dependencies.
75
+ - Keep stack details in backlog or architecture docs and load only the relevant
76
+ scenario guidance into task context.
77
+
78
+ ## Evidence Report Template
79
+
80
+ ```md
81
+ # Chaos / Resilience Evidence
82
+
83
+ Task:
84
+ Issue/User Story:
85
+ Environment:
86
+ Date:
87
+
88
+ ## Scenario Matrix
89
+
90
+ | Scenario | Fault | Expected behavior | Actual behavior | Evidence | Result |
91
+ | -------- | ----- | ----------------- | --------------- | -------- | ------ |
92
+
93
+ ## Acceptance Criteria Coverage
94
+
95
+ | AC | Scenario | Result | Notes |
96
+ | -- | -------- | ------ | ----- |
97
+
98
+ ## Recovery And Audit
99
+
100
+ | Scenario | Recovery path | Audit/event evidence | User/operator message |
101
+ | -------- | ------------- | -------------------- | --------------------- |
102
+
103
+ ## Gaps
104
+
105
+ | Gap | Severity | Owner | Release decision |
106
+ | --- | -------- | ----- | ---------------- |
107
+ ```
108
+
109
+ ## Acceptance Rules
110
+
111
+ - Security, compliance, tenant isolation, approval, regulated authority, secrets,
112
+ and payment-related failures must fail closed unless an explicit accepted risk
113
+ says otherwise.
114
+ - Optional enrichment and advisory features may degrade, but must expose clear
115
+ rationale and recovery guidance.
116
+ - Retries must be bounded by timeout, retry count, backoff, and budget policy.
117
+ - Chaos evidence must map back to acceptance criteria and release gates.
118
+ - A generated or automated reviewer cannot self-approve resilience gaps in
119
+ regulated or high-risk flows.
120
+
121
+ ## Evidence
122
+
123
+ - `command`
124
+ - `file`
125
+ - `log`
126
+ - `report`
127
+ - `trace`
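The acceptance rule that retries must be bounded by timeout, retry count, backoff, and budget policy can be sketched as a pure planning function. The `RetryPolicy` shape is hypothetical, and treating "budget" as a total time budget is an assumption; the source may mean a cost or token budget instead. The sketch uses worst-case accounting, assuming every attempt runs until its timeout.

```typescript
// Hypothetical policy shape; the source names the bounds but not their form.
interface RetryPolicy {
  maxRetries: number; // retry count bound
  baseDelayMs: number; // first backoff delay
  maxDelayMs: number; // cap on any single backoff delay
  attemptTimeoutMs: number; // per-attempt timeout
  budgetMs: number; // assumed total time budget across attempts and waits
}

// Returns the backoff delays that fit inside the policy. The length of the
// result is the number of retries that would be allowed.
function planBackoff(policy: RetryPolicy): number[] {
  const delays: number[] = [];
  // The first attempt always consumes one attempt timeout from the budget.
  let spentMs = policy.attemptTimeoutMs;
  for (let retry = 0; retry < policy.maxRetries; retry++) {
    // Exponential backoff, capped at maxDelayMs.
    const delay = Math.min(policy.baseDelayMs * Math.pow(2, retry), policy.maxDelayMs);
    const costMs = delay + policy.attemptTimeoutMs;
    // Stop as soon as the next wait plus attempt would exceed the budget.
    if (spentMs + costMs > policy.budgetMs) break;
    delays.push(delay);
    spentMs += costMs;
  }
  return delays;
}
```

Keeping the schedule pure makes it testable without timers; the async runner that sleeps and calls the provider can then consume the planned delays.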
package/skills/chaos-resilience-testing/manifest.json +61 -0
@@ -0,0 +1,61 @@
1
+ {
2
+ "id": "chaos-resilience-testing",
3
+ "name": "Chaos Resilience Testing",
4
+ "summary": "Design deterministic failure scenarios that prove workflows, APIs, providers, gates, budgets, and regulated flows degrade safely.",
5
+ "triggers": [
6
+ "chaos",
7
+ "resilience",
8
+ "fault injection",
9
+ "failure mode",
10
+ "provider timeout",
11
+ "provider unavailable",
12
+ "offline mode",
13
+ "circuit breaker",
14
+ "rate limit",
15
+ "budget exhaustion",
16
+ "approval race",
17
+ "policy failure",
18
+ "audit failure",
19
+ "stale data",
20
+ "corrupted state",
21
+ "tenant isolation",
22
+ "regulated flow"
23
+ ],
24
+ "roles": [
25
+ "qa",
26
+ "sdet",
27
+ "sre",
28
+ "security",
29
+ "architect",
30
+ "developer",
31
+ "devops",
32
+ "platform_engineer",
33
+ "release_manager"
34
+ ],
35
+ "capabilities": [
36
+ "resilience-testing",
37
+ "chaos-testing",
38
+ "failure-mode-analysis",
39
+ "operational-readiness"
40
+ ],
41
+ "riskAreas": [
42
+ "security",
43
+ "release",
44
+ "integration",
45
+ "governance",
46
+ "sre",
47
+ "devops",
48
+ "compliance",
49
+ "performance"
50
+ ],
51
+ "sourceGroups": [
52
+ "quality-security",
53
+ "devops-runtime",
54
+ "architecture",
55
+ "product-backlog",
56
+ "agent-memory"
57
+ ],
58
+ "evidence": ["command", "file", "log", "report", "trace"],
59
+ "loadBudget": "normal",
60
+ "entry": "skills/chaos-resilience-testing/SKILL.md"
61
+ }
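The `triggers` array in this manifest repeats the 17 entries of the "When To Load" list in the skill's `SKILL.md`. The package's own testing rules ask for a sync test when data is duplicated, so a drift check could look like the sketch below. Both function names are hypothetical, and whether the package already generates one list from the other is not visible in this diff.

```typescript
// Extracts trigger names from SKILL.md lines shaped like: - Trigger: `chaos`
function extractTriggers(skillMarkdown: string): string[] {
  const triggers: string[] = [];
  for (const line of skillMarkdown.split("\n")) {
    const match = /^- Trigger: `([^`]+)`\s*$/.exec(line);
    const captured = match === null ? undefined : match[1];
    if (captured !== undefined) triggers.push(captured);
  }
  return triggers;
}

// Reports triggers present in one copy but not in the other.
function triggerDrift(manifestTriggers: string[], skillMarkdown: string) {
  const documented = extractTriggers(skillMarkdown);
  return {
    missingInSkill: manifestTriggers.filter((t) => documented.indexOf(t) === -1),
    missingInManifest: documented.filter((t) => manifestTriggers.indexOf(t) === -1),
  };
}
```

A sync test would read both files and assert that both result arrays are empty.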
package/skills/collection-standards/SKILL.md +2 -0
@@ -9,6 +9,8 @@ operational tooling, and generated code.
9
9
 
10
10
  - Developer, QA/SDET, DevOps, Platform, SRE, or Performance work writes code,
11
11
  scripts, tests, generated options, or automation helpers.
12
+ - A module-boundary or god-file review finds repeated hardcoded values in
13
+ commands, controllers, services, tests, or generated option builders.
12
14
  - The task mentions hardcoded values, arrays, maps, key/value pairs, options,
13
15
  fixtures, selectors, command cases, provider lists, CI matrices, roles,
14
16
  statuses, validators, bulk/batch processing, O(n), N+1, nested loops, or
package/skills/diagram-export/SKILL.md +30 -0
@@ -14,12 +14,41 @@ Create, validate, and export architecture, workflow, and sequence diagrams.
14
14
 
15
15
  ## Procedure
16
16
 
17
+ - Load `docs/diagrams/diagram-master-prompt.md` as the canonical source-free
18
+ diagram prompt when detailed generation or validation guidance is needed.
17
19
  - Identify the diagram purpose and authoritative architecture sources before drawing.
20
+ - Classify the task before drawing: `semantic`, `inspired-by-reference`, or `recreation`.
21
+ - For `recreation`, acceptance is pixel-perfect source fidelity unless the user explicitly accepts an approximation. Structural similarity is not enough.
22
+ - For `recreation`, inventory every visible source element before drawing: containers, labels, icons, connectors, arrowheads, line styles, colors, borders, spacing, rotations, z-order, and page/canvas bounds.
18
23
  - Choose the diagram style from the decision matrix before drafting.
24
+ - When there is no source reference, create a diagram contract before drawing: purpose, audience, node inventory, groups, relationships, labels, annotations, expected reading flow, and planned connector endpoints/anchors.
19
25
  - Prefer text-native diagrams such as Mermaid unless the project requires another format.
26
+ - For recreated or high-fidelity diagrams, always perform a post-render visual QA pass against the source reference. Re-evaluate container sizing, text fit, spacing, connector bend points, and line/container overlaps after the diagram is rendered.
27
+ - After populating real text, subcards, chips, icons, and internal connectors, run a global layout reflow: grow parent containers when children need padding, then re-evaluate neighbors, connector routes, label lanes, and canvas bounds.
28
+ - Do not solve container overflow primarily by shrinking text. Prefer growing the parent container, repositioning children, or rerouting connectors unless the source reference requires tighter text.
29
+ - For `recreation`, record source-vs-output gaps by element ID or visual region after each iteration. If the chosen target cannot reach pixel-perfect fidelity, reclassify as approximation and document the reason.
30
+ - Avoid running connector lines over containers or important labels whenever practical. Add bend points, adjust spacing, or resize containers before treating the diagram as ready.
31
+ - Validate connector endpoint distance during the visual QA pass: every arrow must visibly leave the intended source edge and land on the intended target edge.
32
+ - Validate connector-label separation during the visual QA pass: labels must be placed in reserved whitespace or on readable label backgrounds, and must not touch connector strokes, arrowheads, or container borders.
33
+ - Validate element ordering during the visual QA pass: connectors and arrowheads must remain visible above the states or containers they connect, while accepted diagrams should remain visually stable across regenerations.
34
+ - Validate connector anchor aesthetics during the visual QA pass: choose source and target edge points that minimize bend count and unnecessary line travel without changing the intended relationship.
35
+ - Validate diagonal and crossing aesthetics during the visual QA pass: prefer orthogonal connectors and add line jumps or bridge arcs where unavoidable crossings remain.
36
+ - Validate layout simplification during the visual QA pass: before accepting a bent connector, check whether moving either connected element slightly creates a straighter route without breaking nearby spacing or semantics.
37
+ - Validate editable/rendered equivalence during the visual QA pass: draw.io XML and rendered SVG must describe the same moved elements, connectors, labels, and annotations.
38
+ - Validate annotation target clarity during the visual QA pass: every annotation arrow must visibly land on the exact element or line it describes, without obscuring target text or labels.
39
+ - For source-free diagrams, validate the rendered output against the diagram contract before handoff; correct and re-render when endpoints, labels, anchors, bend counts, or reading flow drift from the contract.
40
+ - Source-free diagrams still require a pixel-perfect quality pass against their own contract before delivery: text must fit, containers must be correctly sized, connectors must visibly attach to the intended source/target edges, arrowheads must remain visible, labels must not collide with lines or borders, and whitespace must be intentional.
41
+ - Never deliver the first render of a source-free diagram without re-evaluating sizes, line routing, connector anchors, text containment, z-order, and visual balance.
42
+ - After every correction, review the whole canvas again. Local fixes are incomplete until container containment, neighboring positions, connector routes, label lanes, z-order, and whitespace still pass globally.
43
+ - Before final handoff, perform diagram artifact hygiene:
44
+ - keep the accepted editable source, accepted render, prompt master or final prompt, and minimum QA evidence;
45
+ - archive or exclude intermediate previews, failed renders, temporary prompts, and one-off correction notes;
46
+ - do not publish source-specific prompt fragments into the prompt bank unless they have been rewritten as reusable rules;
47
+ - record where archived iterations or evidence can be found when traceability is required.
20
48
  - Run `orchestra diagrams lint --file <diagram.mmd>` for lint-only validation before sharing Mermaid diagrams.
21
49
  - Attach evidence with `orchestra diagrams lint --file <diagram.mmd> --task <task-id>` when the diagram supports workflow delivery.
22
50
  - If `mmdc` is missing, report the install guidance instead of pretending validation passed.
51
+ - Mermaid outputs can be accepted as semantic diagrams, but not as pixel-perfect recreations when exact layout, connectors, icons, rotations, or reference styling are acceptance criteria. Escalate those cases to draw.io XML or Lucid.
23
52
 
24
53
  ## Decision Matrix
25
54
 
@@ -33,3 +62,4 @@ Create, validate, and export architecture, workflow, and sequence diagrams.
33
62
 
34
63
  - `file`
35
64
  - `report`
65
+ - `screenshot`
package/skills/qa-evidence-pack/SKILL.md +110 -0
@@ -0,0 +1,110 @@
+ # QA Evidence Pack
+
+ Build reviewable QA evidence packs that prove observable outcomes against
+ acceptance criteria without loading a large QA playbook into every task.
+
+ ## When To Load
+
+ - Trigger: `qa evidence`
+ - Trigger: `test evidence`
+ - Trigger: `acceptance criteria coverage`
+ - Trigger: `playwright`
+ - Trigger: `e2e`
+ - Trigger: `screenshot`
+ - Trigger: `trace`
+ - Trigger: `video`
+ - Trigger: `visual diff`
+ - Trigger: `annotated screenshot`
+ - Trigger: `cli output`
+ - Trigger: `stdout`
+ - Trigger: `stderr`
+ - Trigger: `api contract`
+ - Trigger: `integration`
+ - Trigger: `webhook`
+
+ ## Procedure
+
+ 1. Identify the GitHub issue, user story, Orchestra task, and acceptance criteria.
+ 2. Create or update a compact evidence report instead of pasting raw logs into
+    the agent context.
+ 3. Map every acceptance criterion to one of: automated, manual, contract/mock,
+    external verification, deferred with owner and rationale.
+ 4. Capture observable outcomes, not only command execution:
+    - Web: visible state, key screenshots, Playwright trace, video for failures
+      or critical flows, viewport/device.
+    - CLI: command, exit code, stdout/stderr expectations, created/changed files,
+      emitted events, final state.
+    - API: request shape, response contract, error contract, idempotency when
+      relevant, side effects.
+    - Integration: sandbox/mock receiver result, webhook/event/log, correlation
+      ID, database/query evidence, or explicit deferral.
+    - Visual/UI/diagram: source or expected image, actual image, diff image when
+      practical, annotated image for defects.
+ 5. For visual bugs, create an annotated screenshot using concise overlays:
+    - red rectangle for clipped, overlapping, or incorrect element bounds;
+    - orange arrow for wrong connector, anchor, or flow direction;
+    - yellow translucent area for excess whitespace or spacing defect;
+    - blue guide line for expected alignment;
+    - short label naming the defect.
+ 6. Store large artifacts as files and reference paths from the report. Summarize
+    only the relevant finding in the handoff.
+ 7. Ask BA/Product to compare evidence against story and acceptance criteria, and
+    Architect to review technical coverage before release.
+
+ ## Evidence Report Template
+
+ ```md
+ # QA Evidence Report
+
+ Task:
+ Issue/User Story:
+ Commit:
+ Environment:
+ Date:
+
+ ## Acceptance Criteria Coverage
+
+ | AC | Test | Result | Evidence | Notes |
+ | --- | ---- | ------ | -------- | ----- |
+
+ ## Commands
+
+ | Command | Result | Output artifact |
+ | ------- | ------ | --------------- |
+
+ ## Visual Evidence
+
+ | Viewport/Source | Actual | Expected/Source | Diff | Annotated | Result |
+ | --------------- | ------ | --------------- | ---- | --------- | ------ |
+
+ ## External Verification
+
+ | System | Correlation ID | Evidence | Result |
+ | ------ | -------------- | -------- | ------ |
+
+ ## Risks / Gaps
+
+ | Gap | Owner | PO accepted? | Rationale |
+ | --- | ----- | ------------ | --------- |
+ ```
+
+ ## Acceptance Rules
+
+ - A passing test without observable-result validation is not sufficient QA
+   evidence.
+ - A report without acceptance-criteria mapping is incomplete for release.
+ - Visual defects need source/expected, actual, and annotated evidence unless the
+   defect is already self-evident in a single screenshot.
+ - External integrations need receiver-side evidence or explicit deferral.
+ - Deferred evidence needs owner, rationale, follow-up, and Product Owner
+   acceptance before release.
+
+ ## Evidence
+
+ - `command`
+ - `file`
+ - `screenshot`
+ - `trace`
+ - `video`
+ - `log`
+ - `report`
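
The coverage mapping in step 3 and the acceptance rules above could be checked mechanically before release. The following is a minimal sketch, not code from the package: the `Coverage`, `CriterionRow`, and `findGaps` names are hypothetical, and the rule set is only an illustrative reading of the skill text.

```typescript
// Hypothetical types for illustration; none of these names come from the package.
type Coverage =
  | {
      kind: "automated" | "manual" | "contract-mock" | "external";
      evidencePath: string; // path to a stored artifact, as in procedure step 6
    }
  | {
      kind: "deferred";
      owner: string;
      rationale: string;
      followUp: string;
      poAccepted: boolean; // Product Owner acceptance
    };

interface CriterionRow {
  ac: string; // acceptance criterion label, e.g. "AC1"
  coverage?: Coverage;
}

// Returns one message per rule violation; an empty array means no gaps were found.
function findGaps(rows: CriterionRow[]): string[] {
  const gaps: string[] = [];
  for (const row of rows) {
    const c = row.coverage;
    if (!c) {
      // A report without acceptance-criteria mapping is incomplete.
      gaps.push(`${row.ac}: no coverage mapping`);
    } else if (c.kind === "deferred") {
      // Deferred evidence needs owner, rationale, follow-up, and PO acceptance.
      if (!c.owner || !c.rationale || !c.followUp) {
        gaps.push(`${row.ac}: deferral missing owner, rationale, or follow-up`);
      }
      if (!c.poAccepted) {
        gaps.push(`${row.ac}: deferral not accepted by Product Owner`);
      }
    } else if (!c.evidencePath) {
      gaps.push(`${row.ac}: no evidence artifact referenced`);
    }
  }
  return gaps;
}
```

A release gate could call `findGaps` on the rows of the "Acceptance Criteria Coverage" table and block the handoff while the result is non-empty.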
@@ -0,0 +1,60 @@
+ {
+   "id": "qa-evidence-pack",
+   "name": "QA Evidence Pack",
+   "summary": "Create acceptance-criteria-mapped QA evidence packs with observable outcomes, artifacts, and annotated visual defect evidence.",
+   "triggers": [
+     "qa evidence",
+     "test evidence",
+     "acceptance criteria coverage",
+     "playwright",
+     "e2e",
+     "screenshot",
+     "trace",
+     "video",
+     "visual diff",
+     "annotated screenshot",
+     "cli output",
+     "stdout",
+     "stderr",
+     "api contract",
+     "integration",
+     "webhook"
+   ],
+   "roles": [
+     "qa",
+     "sdet",
+     "developer",
+     "frontend_specialist",
+     "backend_specialist",
+     "devops",
+     "sre",
+     "business_analyst",
+     "product_owner",
+     "architect",
+     "release_manager"
+   ],
+   "capabilities": [
+     "qa-evidence",
+     "acceptance-coverage",
+     "visual-annotation",
+     "external-verification"
+   ],
+   "riskAreas": ["quality", "release", "ux", "integration", "governance"],
+   "sourceGroups": [
+     "quality-security",
+     "product-backlog",
+     "codebase",
+     "agent-memory"
+   ],
+   "evidence": [
+     "command",
+     "file",
+     "screenshot",
+     "trace",
+     "video",
+     "log",
+     "report"
+   ],
+   "loadBudget": "normal",
+   "entry": "skills/qa-evidence-pack/SKILL.md"
+ }
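
The manifest above has a regular shape that can be described with a type. The sketch below mirrors the JSON keys in a hypothetical `SkillManifest` interface and adds a hypothetical `matchedTriggers` helper. The diff does not show how Orchestra matches triggers, so the case-insensitive substring rule here is an assumption for illustration only.

```typescript
// Hypothetical interface mirroring the keys of the manifest shown above.
interface SkillManifest {
  id: string;
  name: string;
  summary: string;
  triggers: string[];
  roles: string[];
  capabilities: string[];
  riskAreas: string[];
  sourceGroups: string[];
  evidence: string[];
  loadBudget: string;
  entry: string;
}

// ASSUMPTION: triggers match as case-insensitive substrings of the task text.
// The package's real matching logic is not visible in this diff.
function matchedTriggers(
  manifest: Pick<SkillManifest, "triggers">,
  taskText: string,
): string[] {
  const text = taskText.toLowerCase();
  return manifest.triggers.filter((trigger) =>
    text.includes(trigger.toLowerCase()),
  );
}
```

Under that assumption, a task described as "Collect QA evidence for the webhook receiver" would match the `qa evidence` and `webhook` triggers.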
@@ -1,61 +0,0 @@
- # setup-agents Bridge
-
- Open Orchestra can import optional `setup-agents` artifacts without making the
- runtime depend on Salesforce-specific setup flows.
-
- ## Command
-
- ```bash
- orchestra setup-agents import --source .setup-agents
- ```
-
- Use `--json` to get a machine-readable report with imported, skipped, and
- conflicted profiles, tasks, evidence references, and handoff references.
-
- ## Supported Inputs
-
- The importer reads:
-
- - `.setup-agents/open-orchestra/profiles.json`
- - `.setup-agents/tasks.json`
- - `.setup-agents/tasks.jsonl`
- - `.setup-agents/state/**/*.json`
- - `.setup-agents/state/**/*.jsonl`
-
- Task records may use either sparse legacy fields or enriched delivery fields.
- Supported task fields include:
-
- - `id`, `title`, `summary`
- - `owner`, `ownerRole`, `role`, `profile`, `profileId`
- - `backlogItem`, `backlog`
- - `userStory`, `goal`, `scope`
- - `acceptanceCriteria`
- - `definitionOfReady`, `definitionOfDone`
- - `dependencies`, `dependsOn`
- - `risks`, `assumptions`, `paths`, `files`
- - `testStrategy`
- - `contractVersion`
- - `acceptanceStatus`, `acceptedBy`
- - `evidenceIds`, `evidence`
- - `handoffIds`, `handoffs`
-
- ## Mapping Rules
-
- Profile role mappings are read from `profiles.json` when present. Setup role
- IDs such as `setup-agents:qa` are normalized to Orchestra role IDs such as
- `qa`. Unknown owner roles are treated as conflicts during profile import and
- fall back to `developer` for sparse task records.
-
- Imported tasks preserve setup metadata in `task.externalRefs.setupAgents`.
- Evidence and handoff IDs are stored as references there; the importer does not
- copy or mutate the original setup artifacts.
-
- ## Idempotency And Conflicts
-
- Re-running the import does not duplicate tasks. If an existing task has the
- same ID, title, and owner role, the task is reported as skipped. If the ID
- matches but title or owner differs, the importer reports a conflict and leaves
- the existing Orchestra task unchanged.
-
- Each import records a `SETUP_AGENTS_IMPORTED` event with summary counts so the
- story-to-evidence trail remains auditable.
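
The mapping and idempotency rules described in this removed document can be sketched as follows. This is an illustrative reading of the prose, not the importer's code: `TaskRef`, `KNOWN_ROLES`, `normalizeRole`, and `classify` are hypothetical names, and the known-role set is an arbitrary subset chosen for the example.

```typescript
// Hypothetical names throughout; this only restates the documented rules.
type Outcome = "imported" | "skipped" | "conflict";

interface TaskRef {
  id: string;
  title: string;
  ownerRole: string;
}

// Illustrative subset of Orchestra role IDs (an assumption for this sketch).
const KNOWN_ROLES = new Set(["qa", "developer", "architect"]);

// Strip the `setup-agents:` prefix; unknown roles fall back to `developer`,
// as the document describes for sparse task records.
function normalizeRole(raw: string | undefined): string {
  const role = (raw ?? "").replace(/^setup-agents:/, "");
  return KNOWN_ROLES.has(role) ? role : "developer";
}

// Same ID, title, and owner role: skipped. Same ID with a different title or
// owner: conflict (existing task stays unchanged). Unseen ID: imported.
function classify(existing: Map<string, TaskRef>, incoming: TaskRef): Outcome {
  const current = existing.get(incoming.id);
  if (!current) return "imported";
  const same =
    current.title === incoming.title &&
    current.ownerRole === incoming.ownerRole;
  return same ? "skipped" : "conflict";
}
```

Because `classify` never writes to `existing`, running it twice over the same input yields the same outcomes, which is the property the document calls idempotency.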