npm - @qa-gentic/stlc-agents - Versions diffs - 1.0.2 → 1.0.4 - Mend

@qa-gentic/stlc-agents 1.0.2 → 1.0.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/.github/copilot-instructions/AGENT-BEHAVIOR.md +448 -0
package/.github/copilot-instructions/deduplication-protocol.md +303 -0
package/.github/copilot-instructions/generate-gherkin.md +550 -0
package/.github/copilot-instructions/generate-playwright-code.md +464 -0
package/.github/copilot-instructions/generate-test-cases.md +176 -0
package/.github/copilot-instructions/write-helix-files.md +374 -0
package/README.md +19 -12
package/package.json +8 -4
package/src/boilerplate-bundle.js +3 -3
package/src/stlc_agents/agent_helix_writer/tools/boilerplate.py +2 -2

package/.github/copilot-instructions/AGENT-BEHAVIOR.md ADDED

@@ -0,0 +1,448 @@
+# Agent Behavior Contract — QA STLC Agents
+> **This file is authoritative for all coding agents (Claude Code, GitHub Copilot, Cursor,
+> Windsurf, and any LLM operating on this repo).**
+> It defines what agents are and are not permitted to do. Every rule here overrides any
+> inference, "reasonable assumption", or pattern learned from prior context.
+>
+> Version: 1.2. All rules are explicit; none are inferred.
+> Change log: see bottom of this file.
+---
+## 1. Scope Delimitation — What This Repo Does and Does Not Do
+### What the agents in this repo do
+| Agent | What it does | What it produces |
+|---|---|---|
+| `qa-test-case-manager` | Fetches ADO work items; creates and links manual test cases | ADO test case work items linked via `TestedBy-Forward` |
+| `qa-gherkin-generator` | Fetches ADO work item hierarchies; validates and attaches `.feature` files | `.feature` files attached to ADO work items |
+| `qa-playwright-generator` | Generates TypeScript Playwright code from Gherkin; attaches files to ADO | `locators.ts`, `*Page.ts`, `*.steps.ts` attached to ADO work items |
+| `qa-helix-writer` | Writes already-generated files to disk at Helix-QA paths | Files on disk — no ADO writes |
+### What the agents do NOT do
+- Do NOT make any product decisions (scope, priority, what to test).
+- Do NOT infer missing acceptance criteria, test data, or screen names.
+- Do NOT carry forward any decision from one artifact to the next.
+- Do NOT create local files unless the user explicitly requests a download.
+- Do NOT attach anything to ADO unless the explicit rule for that artifact type is satisfied.
+- Do NOT proceed past a gap or ambiguity; stop and ask instead.
+---
+## 2. Zero-Inference Rule
+**An agent must never substitute inference for an explicit instruction.**
+This applies specifically to:
+| Situation | Prohibited inference | Required behaviour |
+|---|---|---|
+| User provides a work item ID but no type | Guessing whether it is Epic / Feature / PBI / Bug | Call the fetch tool; let the response `work_item_type` field determine routing |
+| User says "generate everything" | Assuming delivery destinations for each artifact type | Ask per artifact: "Attach Gherkin to ADO? Attach Playwright to ADO? Save locally?" |
+| User confirms Gherkin ADO attachment | Inferring Playwright code should also be attached | Wait for explicit confirmation before calling `attach_code_to_work_item` |
+| User declines creating test cases in ADO | Inferring they also decline Gherkin or Playwright ADO attachment | Treat each artifact delivery as a completely independent decision |
+| User says "yes" to one step | Carrying that "yes" forward to the next step | Seek fresh confirmation for each distinct action |
+| Navigation path not in work item | Inventing a path | Write placeholder `<!-- TODO: confirm screen name -->` and surface to user |
+| Test data not provided | Inventing emails, IDs, file contents | Stop and ask; never hardcode invented data |
+| Work item appears new | Skipping the deduplication protocol | Always run Phase 1 of deduplication-protocol.md before creating anything |
+---
+## 3. Explicit Artifact Delivery Rules
+Every artifact type has exactly one delivery rule. Completing one artifact does **not** trigger delivery of another.
+### 3A — Gherkin `.feature` files
+| Condition | Action |
+|---|---|
+| User asks to generate Gherkin | Generate the file; show content to user; **do not attach yet** |
+| `validate_gherkin_content` returns `valid: false` | Fix errors; do not attach |
+| `validate_gherkin_content` returns `valid: true` | **Ask the user**: "Attach this `.feature` file to ADO work item #{id}?" |
+| User confirms attachment | Call `attach_gherkin_to_feature` (Feature path) or `attach_gherkin_to_work_item` (PBI/Bug path) |
+| User declines | Save locally only if user explicitly requests; otherwise present content inline |
+| Work item type is Epic | **HARD STOP** — do not attach to Epic; tell user to specify a child Feature ID |
+### 3B — Playwright TypeScript files
+| Condition | Action |
+|---|---|
+| User asks to generate Playwright code | Generate files; show output summary; **do not attach yet** |
+| User has not explicitly requested ADO attachment | Do NOT call `attach_code_to_work_item` |
+| User explicitly requests ADO attachment | Call `attach_code_to_work_item` with net-new delta files only |
+| User previously declined test case creation | This has **no bearing** on Playwright attachment — ask separately |
+| User previously confirmed Gherkin attachment | This has **no bearing** on Playwright attachment — ask separately |
+| Work item type is Epic | **HARD STOP** — return `epic_not_supported`; do not attach |
+### 3C — Manual test cases (ADO)
+| Condition | Action |
+|---|---|
+| Work item type is Epic | **HARD STOP** — `fetch_work_item` returns `epic_not_supported`; inform user; do not call `create_and_link_test_cases` |
+| Work item type is Feature | Server returns `confirmation_required: true`; show user the proposed count + Feature title; wait for "yes" |
+| User says "yes" to Feature confirmation | Retry `create_and_link_test_cases` with the **identical arguments** plus `confirmed=true` — do NOT change `work_item_id` or any other parameter |
+| User says "no" or "cancel" to Feature confirmation | Abort entirely; report "No test cases were created." Do NOT retry with a different work item type. |
+| Work item type is PBI or Bug | Run deduplication protocol; create only net-new test cases |
+| `existing_test_cases_count > 0` | Always call `get_linked_test_cases` first to get real titles before generating |
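The Feature confirmation gate in 3C can be sketched in Python as follows. This is an illustrative sketch only, not code from the package: `create_tool` and `ask_user` are hypothetical callables standing in for the `create_and_link_test_cases` tool and the user prompt, and the response is assumed to be a dict carrying the `confirmation_required` flag.

```python
from typing import Any, Callable, Dict


def create_with_feature_confirmation(
    create_tool: Callable[..., Dict[str, Any]],  # hypothetical stand-in for create_and_link_test_cases
    ask_user: Callable[[str], str],              # hypothetical prompt function
    **args: Any,
) -> Dict[str, Any]:
    """Call the tool once; if confirmation is required, ask, then retry unchanged."""
    first = create_tool(**args)
    if not first.get("confirmation_required"):
        return first

    answer = ask_user("Create the proposed test cases for this Feature? (yes/no)")
    if answer.strip().lower() != "yes":
        # "no" or "cancel": abort entirely, never pivot to another work item.
        return {"status": "aborted", "message": "No test cases were created."}

    # Retry with the identical arguments plus confirmed=True; nothing else changes.
    return create_tool(**{**args, "confirmed": True})
```

The point of passing `**args` through untouched is that the retry cannot silently change `work_item_id` or any other parameter.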
+### 3D — Local file creation
+| Condition | Action |
+|---|---|
+| User says "save", "download", or "export to file" | Create local file as requested |
+| User has not requested a local file | Do NOT create one |
+| ADO attachment succeeded | Do NOT also create a local copy unless user asks |
+| ADO attachment failed | Offer the content inline; ask if user wants a local file |
+### 3E — Helix-QA disk writes
+| Condition | Action |
+|---|---|
+| User asks to write files to disk | Follow `write-helix-files.md` in full — never skip any step |
+| `write_helix_files` succeeds | Report the deduplication summary to the user |
+| `write_helix_files` fails | **See failure recovery rules below** — do NOT fall back to `create_file` |
+| A locator file already exists on disk (`list_helix_tree` returns it) | Extend that file via `write_helix_files` — **NEVER create a new file** |
+| A step file already exists on disk | Merge into that file via `write_helix_files` — **NEVER create a new file** |
+| User has not explicitly asked for local files | Do NOT call `create_file` or any filesystem tool |
+**Failure recovery for `write_helix_files`:**
+| Error type | Root cause | Recovery action |
+|---|---|---|
+| Parenthesis / bracket balance error in a `*.steps.ts` file | Step definitions use regex patterns (`/^I click ([^"]*)/`) — the regex preprocessor strips string literals before counting parens, so capture groups cause a nonzero delta | Call `pre_validate_cucumber_steps` on the steps file — it identifies every offending pattern and provides ready-to-paste Cucumber expression replacements. Fix all flagged patterns, then retry `write_helix_files`. |
+| `OSError` on a specific path | Path does not exist or permission denied | Report path to user; do not fall back to `create_file` |
+| Any other tool error | Unknown | Report the raw error to user verbatim; ask how to proceed |
+**The `create_file` tool is PROHIBITED for Helix-QA artifacts.** Every file in `src/` that
+belongs to the QA test suite — locators, page objects, step definitions, feature files,
+cucumber config — must be written exclusively via `qa-helix-writer:write_helix_files`.
+Using `create_file` bypasses deduplication, interface-adaptation rewrites, and file routing.
+If `write_helix_files` cannot be made to succeed, surface the blocker to the user rather
+than silently routing around it.
+| Condition | Action |
+|---|---|
+| Files to write | Always call `list_helix_tree` first to see what already exists |
+| File already exists at target path | Read it first; check for overlap; write only net-new content |
+| `generate_playwright_code` output available | Pass the `files` dict directly to `write_helix_files` — do not reformat |
+**Playwright generation routing (strict — no inference):**
+| Project state (`list_helix_tree` result) | Action |
+|---|---|
+| `framework_state: "absent"` or `"partial"` (fresh project) | Call `scaffold_locator_repository` FIRST, then `generate_playwright_code` |
+| `framework_state: "present"` (existing boilerplate) | Call `generate_playwright_code` ONLY — never call `scaffold_locator_repository` |
+Calling `scaffold_locator_repository` on an existing project overwrites the six shared
+infrastructure files and destroys customisations. It is only safe on a fresh project.
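The routing table above maps one field to one tool sequence, which can be expressed as a small lookup. The sketch below is illustrative: the function name is hypothetical, and only the three `framework_state` values named in the table are assumed to exist.

```python
from typing import List


def playwright_tool_chain(framework_state: str) -> List[str]:
    """Return the ordered tool calls for a given list_helix_tree framework_state."""
    if framework_state in ("absent", "partial"):
        # Fresh project: scaffold the shared infrastructure first.
        return ["scaffold_locator_repository", "generate_playwright_code"]
    if framework_state == "present":
        # Existing boilerplate: scaffolding would overwrite customisations.
        return ["generate_playwright_code"]
    # Unknown state: refuse to guess rather than pick a default.
    raise ValueError(f"unrecognised framework_state: {framework_state!r}")
```

Raising on an unrecognised value, instead of falling back to one branch, keeps the sketch in line with the no-inference rule.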
+---
+## 4. Strict Input → Tool → Output Chains
+Each user request type has one and only one correct tool chain. Do not improvise.
+### Chain A — Manual test cases from PBI or Bug
+```
+INPUT:  PBI or Bug ID  +  org URL  +  project name
+  │
+  ├─ fetch_work_item(id)
+  │    └─ if epic_not_supported  →  STOP, inform user
+  │    └─ if confirmation_required  →  STOP, ask user; proceed only on "yes"
+  │
+  ├─ get_linked_test_cases(id)          ← always if count > 0
+  │
+  ├─ [deduplication-protocol Phase 2]
+  │
+  └─ create_and_link_test_cases(net-new only)
+OUTPUT: ADO test case IDs + deduplication report
+        NO local file, NO Gherkin, NO Playwright — unless user explicitly requests them
+```
+### Chain B — Gherkin from Feature
+```
+INPUT:  Feature ID  +  org URL  +  project name
+  │
+  ├─ fetch_feature_hierarchy(id)
+  │
+  ├─ [deduplication-protocol Phase 1+2]
+  │
+  ├─ [gap check — stop and ask if any G1–G7 gap found]
+  │
+  ├─ [live Playwright MCP snapshots of every screen]
+  │
+  ├─ validate_gherkin_content(content, scope="feature")
+  │    └─ if valid: false  →  fix; re-validate; do NOT attach
+  │
+  ├─ validate_gherkin_steps(content)
+  │
+  ├─ [show content to user]
+  │
+  ├─ [ASK: "Attach to ADO Feature #{id}?"]
+  │    └─ yes  →  attach_gherkin_to_feature(id)
+  │    └─ no   →  stop; offer local download only if user asks
+  │
+  └─ [STOP — do not proceed to Playwright unless user explicitly asks]
+OUTPUT: .feature file attached to Feature work item (or shown inline)
+        NO Playwright code unless user explicitly requests it
+        NO test case creation unless user explicitly requests it
+```
+### Chain C — Gherkin from PBI or Bug
+```
+INPUT:  PBI or Bug ID  +  org URL  +  project name
+  │
+  ├─ fetch_work_item_for_gherkin(id)
+  │
+  ├─ get_linked_test_cases(id)          ← context only
+  │
+  ├─ [deduplication-protocol Phase 1+2]
+  │
+  ├─ [gap check]
+  │
+  ├─ [live Playwright MCP snapshots]
+  │
+  ├─ validate_gherkin_content(content, scope="work_item")
+  │
+  ├─ [show content to user]
+  │
+  ├─ [ASK: "Attach to ADO work item #{id}?"]
+  │    └─ yes  →  attach_gherkin_to_work_item(id)
+  │    └─ no   →  stop
+  │
+  └─ [STOP — do not proceed to Playwright unless user explicitly asks]
+OUTPUT: .feature file attached to PBI/Bug work item (or shown inline)
+```
+### Chain D — Gherkin from Epic
+```
+INPUT:  Epic ID  +  org URL  +  project name
+  │
+  ├─ fetch_epic_hierarchy(id)
+  │    └─ returns: features[] each with child_work_items[] and existing_test_cases[]
+  │
+  ├─ [read ALL child Features and ALL child PBIs/Bugs before writing any Gherkin]
+  │
+  ├─ for each child Feature:
+  │    ├─ [deduplication-protocol Phase 1+2 for that Feature ID]
+  │    ├─ [gap check for that Feature]
+  │    ├─ [live snapshots for that Feature's screens]
+  │    ├─ generate .feature file (5–10 scenarios)
+  │    ├─ validate_gherkin_content(scope="feature")
+  │    ├─ [show content to user]
+  │    ├─ [ASK: "Attach .feature to Feature #{child_id}?"]
+  │    └─ yes  →  attach_gherkin_to_feature(child_id)
+  │
+  └─ [NEVER call create_and_link_test_cases on the Epic ID]
+OUTPUT: one .feature file per child Feature
+        test cases NOT created unless user explicitly asks for child Feature/PBI coverage
+```
+### Chain E — Playwright TypeScript from Gherkin
+```
+INPUT:  validated .feature content  +  work item ID  +  org URL  +  project name
+  │
+  ├─ [deduplication-protocol Phase 1 — read existing locators, page objects, steps]
+  │
+  ├─ [gap check — identify state-dependent screens; stop and ask if blockers found]
+  │
+  ├─ [live Playwright MCP snapshots through full flow]
+  │
+  ├─ generate_playwright_code(gherkin, context_map)
+  │
+  ├─ [apply delta filter — remove keys / methods / steps already in existing files]
+  │
+  ├─ [show delta summary to user]
+  │
+  ├─ [ASK: "Attach these Playwright files to ADO work item #{id}?"]
+  │    └─ yes  →  attach_code_to_work_item(delta files only)
+  │    └─ no   →  stop; offer local download only if user asks
+  │
+  └─ [STOP — do not trigger Helix write unless user explicitly asks]
+OUTPUT: delta Playwright files attached to work item (or shown inline)
+        NO Helix disk write unless user explicitly requests it
+```
+---
+## 5. Ambiguity Resolution — Stop, Don't Guess
+When any of the following is unclear, **stop and ask the user**. Do not proceed.
+| Ambiguity | What to ask |
+|---|---|
+| Work item ID given but type is unknown | Call fetch tool and surface the `work_item_type` field; do not guess |
+| "Generate everything for #123" | "Which artifacts do you want? (A) Manual test cases in ADO (B) Gherkin .feature file (C) Playwright TypeScript (D) All of the above — and which should be attached to ADO vs saved locally?" |
+| User says "attach it" after multiple artifacts generated | "Which artifact are you referring to — the Gherkin .feature file, the Playwright TypeScript files, or both?" |
+| No org URL or project name provided | "Please provide the ADO organisation URL (e.g. `https://dev.azure.com/myorg`) and project name." |
+| No screen name or navigation path available | Use `<!-- TODO: confirm screen name -->` placeholder; surface to user; do not invent |
+| No test data for a state-dependent screen | Stop; ask what file / email / record to use; do not invent |
+| User says "yes" ambiguously after a confirmation prompt | Re-read the most recent prompt; if still unclear, re-ask with the specific action being confirmed |
+---
+## 6. No Carry-Over Between Decisions
+Each of these decisions is **completely independent**. A prior answer to one does not answer another.
+```
+[1] Generate Gherkin?              → answered
+[2] Attach Gherkin to ADO?         → must ask SEPARATELY after [1]
+[3] Generate Playwright code?      → must ask SEPARATELY
+[4] Attach Playwright to ADO?      → must ask SEPARATELY after [3]
+[5] Create test cases in ADO?      → must ask SEPARATELY
+[6] Save any artifact locally?     → must ask SEPARATELY
+[7] Write to Helix-QA on disk?     → must ask SEPARATELY
+```
+**Declining [5] does not affect [2], [4], [6], or [7].**
+**Confirming [2] does not pre-answer [4].**
+**Saying "yes" at step [1] does not mean "yes" at any other step.**
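One way to keep the seven decisions independent is to store each answer under its own key and treat a missing key as "not yet asked". The following is a minimal sketch under that assumption; the `Decision` and `DecisionLedger` names are hypothetical and do not come from the package.

```python
from enum import Enum
from typing import Dict


class Decision(Enum):
    GENERATE_GHERKIN = 1
    ATTACH_GHERKIN = 2
    GENERATE_PLAYWRIGHT = 3
    ATTACH_PLAYWRIGHT = 4
    CREATE_TEST_CASES = 5
    SAVE_LOCALLY = 6
    WRITE_HELIX = 7


class DecisionLedger:
    """Records one explicit answer per decision; answers never propagate."""

    def __init__(self) -> None:
        self._answers: Dict[Decision, bool] = {}

    def record(self, decision: Decision, confirmed: bool) -> None:
        self._answers[decision] = confirmed

    def needs_asking(self, decision: Decision) -> bool:
        # True until the user has answered this specific decision.
        return decision not in self._answers

    def is_confirmed(self, decision: Decision) -> bool:
        # Only an explicit "yes" for this exact decision counts.
        return self._answers.get(decision) is True
```

With this shape, confirming `ATTACH_GHERKIN` leaves `ATTACH_PLAYWRIGHT` unanswered, and declining `CREATE_TEST_CASES` has no effect on any other entry.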
+---
+## 7. Decision Tree — Work Item Type Routing
+```
+User provides work item ID
+         │
+         ▼
+  Call the fetch tool for the intended operation
+  (fetch_work_item, fetch_work_item_for_gherkin, or fetch_epic_hierarchy)
+         │
+         ▼
+  Read work_item_type from response
+         │
+    ┌────┴─────────────────────────────────────────────┐
+    │                                                  │
+  "Epic"                                          not "Epic"
+    │                                                  │
+    ▼                                            ┌─────┴────────┐
+  HARD STOP for test cases                    "Feature"    "PBI"/"Bug"
+  Use fetch_epic_hierarchy for Gherkin            │              │
+  Never call create_and_link_test_cases           │              │
+  Never call attach_code_to_work_item             │              │
+  on the Epic ID                             Feature path   PBI/Bug path
+                                            (5–10 scenarios) (3–9 scenarios)
+                                            attach_gherkin_to_feature
+                                                        attach_gherkin_to_work_item
+```
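The Gherkin attachment branch of the tree above can be sketched as a lookup keyed on `work_item_type`. This is illustrative only: the function name is hypothetical, and the type labels are copied from the tree, so the exact strings returned by the real fetch tools are an assumption.

```python
from typing import Optional

# Labels as they appear in the tree; the real work_item_type strings may differ.
GHERKIN_ATTACH_TOOL = {
    "Feature": "attach_gherkin_to_feature",
    "PBI": "attach_gherkin_to_work_item",
    "Bug": "attach_gherkin_to_work_item",
}


def gherkin_attach_tool(work_item_type: str) -> Optional[str]:
    """Return the attach tool for a fetched type, or None for the Epic hard stop."""
    if work_item_type == "Epic":
        # Never attach to the Epic itself; a child Feature ID is required.
        return None
    try:
        return GHERKIN_ATTACH_TOOL[work_item_type]
    except KeyError:
        # Unknown type: surface it instead of guessing a route.
        raise ValueError(f"unrecognised work_item_type: {work_item_type!r}") from None
```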
+```
+Feature path for test cases
+         │
+         ▼
+  create_and_link_test_cases returns confirmation_required: true
+         │
+         ▼
+  STOP — show user: Feature title + proposed TC count
+         │
+    ┌────┴────┐
+  "yes"     "no"/"cancel"
+    │              │
+    ▼           ABORT — report "No test cases created"
+  retry create_and_link_test_cases
+  with identical args + confirmed=true
+  (do NOT change work_item_id or pivot to a child PBI)
+```
+---
+## 8. Versioning — Explicit vs Inferred
+All rules in this file are **explicitly stated**. None are inferred from code patterns,
+prior conversation, or training data.
+---
+## 9. Orchestrator / Multi-Step Pipeline Rules
+These rules apply whenever an agent is given a numbered pipeline prompt that runs multiple
+QA STLC steps in sequence (e.g. "Execute steps 1–6 in order: fetch → test cases → Gherkin
+→ locators → Playwright → Helix write").
+### 9A — Deduplication gate between Step 1 and Step 2
+`fetch_work_item` and `fetch_work_item_for_gherkin` always return `existing_test_cases_count`.
+A pipeline **MUST NOT** proceed to `create_and_link_test_cases` without first passing through
+the full deduplication protocol — even when the orchestrator prompt says "execute all steps".
+```
+After fetch (Step 1):
+  if existing_test_cases_count > 0:
+    ├─ HARD STOP before create_and_link_test_cases
+    ├─ Call get_linked_test_cases(work_item_id)     ← mandatory, not optional
+    ├─ Run deduplication-protocol.md Phase 2 (semantic diff)
+    ├─ Build net-new list
+    └─ if net-new list is empty:
+         Log: "✅ All scenario types already covered — skipping test case creation."
+         Skip create_and_link_test_cases entirely; continue pipeline at next step.
+       else:
+         Call create_and_link_test_cases(net-new only)
+```
+The instruction "execute all steps" does **not** override this gate.
+It means "execute each step correctly" — which includes the deduplication check.
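The gate in 9A can be sketched as a function that decides between "skip" and "create". This is an illustrative sketch, not the package's implementation: the semantic diff itself is defined in `deduplication-protocol.md`, so it is represented here by a hypothetical `is_covered` predicate, and test cases are reduced to title strings for brevity.

```python
from typing import Callable, Dict, List


def dedup_gate(
    existing_test_cases_count: int,
    proposed_titles: List[str],
    get_linked_test_cases: Callable[[], List[str]],  # hypothetical wrapper around the tool
    is_covered: Callable[[str, List[str]], bool],    # hypothetical stand-in for the Phase 2 diff
) -> Dict[str, object]:
    """Decide whether create_and_link_test_cases may run, and with which items."""
    if existing_test_cases_count > 0:
        existing = get_linked_test_cases()  # mandatory before any creation
        net_new = [t for t in proposed_titles if not is_covered(t, existing)]
    else:
        # 9A only describes the count > 0 branch; any other checks the protocol
        # requires for apparently new work items are outside this sketch.
        net_new = list(proposed_titles)

    if not net_new:
        return {"action": "skip", "net_new": []}
    return {"action": "create", "net_new": net_new}
```

A caller would continue the pipeline at the next step on `"skip"` and pass only `net_new` to the creation tool on `"create"`.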
+### 9B — Treating validation warnings as blocking gates
+Any `_validation.warnings` entry of the form:
+- `"N existing test case(s) found"`
+- `"existing attachment found"`
+- `"duplicate scenario detected"`
+is a **blocking condition**, not advisory text. The pipeline must pause, run the
+relevant deduplication phase, and resolve the warning before proceeding.
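A pipeline could detect these blocking warnings with simple pattern matching. The sketch below is illustrative and rests on two assumptions: the tool response is a dict with a `_validation` key holding a `warnings` list of strings, and `N` in the first form is a decimal integer.

```python
import re
from typing import Any, Dict, List

# Patterns for the three warning forms listed above.
BLOCKING_PATTERNS = [
    re.compile(r"\d+ existing test case\(s\) found"),
    re.compile(r"existing attachment found"),
    re.compile(r"duplicate scenario detected"),
]


def blocking_warnings(response: Dict[str, Any]) -> List[str]:
    """Return the warnings that must be resolved before the pipeline continues."""
    warnings = response.get("_validation", {}).get("warnings", [])
    return [w for w in warnings if any(p.search(w) for p in BLOCKING_PATTERNS)]
```

A non-empty result means the pipeline pauses and runs the relevant deduplication phase; an empty list means no blocking warning was found.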
+### 9C — Per-step confirmation in automated pipelines
+A pipeline prompt that says "execute steps 1–6 in order" grants permission to
+**attempt** each step — it does not grant blanket confirmation for creation actions.
+The explicit confirmation rules in sections 3A–3E still apply at each step.
+| Step type | Still requires |
+|---|---|
+| Fetch (read-only) | No confirmation needed |
+| `create_and_link_test_cases` | Deduplication gate (§9A) + net-new check |
+| `attach_gherkin_to_work_item` | Gherkin validation must pass first |
+| `generate_playwright_code` | Locator live-verification must precede |
+| `write_helix_files` | `list_helix_tree` read must precede |
+### 9D — Mandatory deduplication report in every pipeline run
+Every orchestrated pipeline run must output a deduplication report after Step 2,
+regardless of whether any test cases were created (see deduplication-protocol.md §PHASE 3).
+Skipping the report is a violation of this contract.
+---
+When a rule is changed:
+1. Update the rule in this file with a dated entry in the change log below.
+2. Update the corresponding skill file in `skills/` to match.
+3. Run `./scripts/install-skills.sh vscode` to sync `.github/copilot-instructions/`.
+4. Update `CLAUDE.md` if the rule affects a key rule number.
+### Change log
+| Date | Version | Change | Author |
+|---|---|---|---|
+| 2025-01-01 | 1.0 | Initial explicit rule set — all 8 sections | repo init |
+| 2026-04-03 | 1.1 | 3C: added `confirmed=true` retry protocol for Feature confirmation gate | repo update |
+| 2026-04-03 | 1.1 | 3E: paren error recovery now references `pre_validate_cucumber_steps` as first diagnostic step | repo update |
+| 2026-04-03 | 1.1 | §7: Feature decision tree clarifies retry mechanism (`confirmed=true`, no pivot to child PBI) | repo update |
+| 2026-04-04 | 1.2 | §9: Added orchestrator/pipeline rules — deduplication gate between Steps 1–2, warning-as-blocking-gate, per-step confirmation, mandatory dedup report | repo update |