npm - @qa-gentic/stlc-agents - Versions diffs - 1.0.1 → 1.0.3 - Mend

@qa-gentic/stlc-agents 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/.github/copilot-instructions/AGENT-BEHAVIOR.md +448 -0
package/.github/copilot-instructions/deduplication-protocol.md +303 -0
package/.github/copilot-instructions/generate-gherkin.md +550 -0
package/.github/copilot-instructions/generate-playwright-code.md +464 -0
package/.github/copilot-instructions/generate-test-cases.md +176 -0
package/.github/copilot-instructions/write-helix-files.md +374 -0
package/README.md +19 -12
package/package.json +8 -4
package/src/stlc_agents/agent_helix_writer/tools/helix_write.py +94 -20

package/.github/copilot-instructions/deduplication-protocol.md ADDED

@@ -0,0 +1,303 @@
+---
+name: deduplication-protocol
+description: >
+  MANDATORY pre-flight protocol for every QA STLC agent run. Must be read and applied BEFORE
+  creating any test cases, Gherkin feature files, Playwright locators, page objects, or step
+  definitions in Azure DevOps. Prevents duplicate test cases, duplicate feature file attachments,
+  and duplicate Playwright code being created across multiple runs or agents on the same work item.
+  Triggers on any task involving: ADO work items, test cases, feature files, Gherkin BDD, Playwright
+  automation, locators, page objects, step definitions, qa-gherkin-generator, qa-test-case-manager,
+  qa-playwright-generator tools.
+compatibility:
+  tools:
+    - qa-test-case-manager:get_linked_test_cases
+    - qa-test-case-manager:fetch_work_item
+    - qa-test-case-manager:create_and_link_test_cases
+    - qa-gherkin-generator:fetch_feature_hierarchy
+    - qa-gherkin-generator:attach_gherkin_to_feature
+    - qa-playwright-generator:generate_playwright_code
+    - qa-playwright-generator:attach_code_to_work_item
+    - qa-playwright-generator:validate_gherkin_steps
+---
+# QA Deduplication Protocol
+> **Read `AGENT-BEHAVIOR.md` before this protocol.**
+> This protocol is a pre-flight gate only — it does not authorise creating anything.
+> All creation decisions require separate explicit user confirmation per artifact type.
+> **This protocol is mandatory. No artifact may be created or attached to ADO without completing
+> the READ → DIFF → CREATE-ONLY-WHAT-IS-MISSING pipeline below.**
+>
+> Any agent that skips this protocol and creates duplicates has violated the QA STLC workflow.
+---
+## Why This Exists
+When QA STLC agents run more than once on the same work item — or when multiple agents run in
+sequence on the same item — they have no memory of what previous runs created. Without this
+protocol every run blindly creates new test cases, attaches new feature files, and generates new
+Playwright code that duplicates existing artifacts. This produces:
+- 2–5× the intended number of ADO test cases per work item
+- Multiple `.feature` files on the same work item covering identical scenarios
+- Multiple versions of `locators.ts` / `PageObject.ts` / `steps.ts` that diverge silently
+- Test suites that fail because the same step definition is registered twice
+This skill codifies the fix as a durable, enforceable protocol that any agent can read and follow.
+---
+## The Three-Phase Mandatory Workflow
+```
+┌───────────────────────────────────────────────────────────────┐
+│  PHASE 1 — READ (always first, no exceptions)                 │
+│  PHASE 2 — DIFF (semantic matching, not just string equality) │
+│  PHASE 3 — CREATE only what has no existing coverage          │
+└───────────────────────────────────────────────────────────────┘
+```
+---
+## PHASE 1 — READ Everything That Already Exists
+Before touching any artifact, make all of the following calls **in this order**.
+This protocol handles **any work item type** — PBI, Bug, or Feature — as follows:
+| Work item type passed | Step 1A | Step 1B |
+|---|---|---|
+| **PBI or Bug** | `fetch_work_item(id)` → get parent Feature id | `fetch_feature_hierarchy(parent_feature_id)` |
+| **Feature** | Skip `fetch_work_item` | `fetch_feature_hierarchy(id)` directly |
+### 1A — Fetch the work item (PBI or Bug only)
+```
+qa-test-case-manager:fetch_work_item(
+  organization_url, project_name, work_item_id
+)
+```
+Extract and store:
+- Title, description, acceptance_criteria
+- **Parent Feature ID** — used in Step 1B
+- Story points, priority, state
+Skip this step if the ID passed is already a Feature.
+### 1B — Fetch the parent Feature hierarchy
+```
+qa-gherkin-generator:fetch_feature_hierarchy(
+  organization_url, project_name, feature_id   ← parent Feature ID from 1A, or the ID itself if Feature
+)
+```
+Extract and store:
+- Feature title, description, acceptance criteria
+- **All child PBIs and Bugs** with their titles + acceptance criteria
+- All existing test cases across the whole feature
+- All `.feature` file attachments already on the feature
+- Build the **flow map** from sibling work items (see generate-gherkin.md Step 1D)
+> **Why sibling PBIs matter:** A PBI is one slice of a larger flow defined across multiple
+> work items. Siblings often represent prerequisite or downstream steps. Without reading all
+> of them the agent invents navigation steps and test data already defined elsewhere.
+### 1C — get_linked_test_cases on the specific work item
+```
+qa-test-case-manager:get_linked_test_cases(
+  organization_url, project_name, work_item_id   ← the original ID (PBI, Bug, or Feature)
+)
+```
+Extract and store:
+- All existing test case IDs, titles (normalised), priority values
+### 1D — Check for existing Playwright attachments
+From the feature hierarchy response, check for:
+- `locators.ts` — extract all locator keys already defined
+- `*Page.ts` — extract all method names already defined
+- `*.steps.ts` — extract all step definition strings already registered
+> If you cannot read an attachment's content, treat it as fully covering its domain and
+> produce NO new file of that type unless you can prove a gap.
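The read order in steps 1A to 1C can be sketched as a small routine. This is only an illustration: the three callables stand in for the MCP tools of the same name, and the `parent_feature_id` field name is an assumption, since the protocol does not specify the response shape.

```python
from typing import Any, Callable, Dict, Optional

Fetch = Callable[[int], Any]


def phase1_read(
    work_item_id: int,
    work_item_type: str,
    fetch_work_item: Fetch,
    fetch_feature_hierarchy: Fetch,
    get_linked_test_cases: Fetch,
) -> Dict[str, Any]:
    """Run the PHASE 1 reads in protocol order and return the raw results."""
    work_item: Optional[Dict[str, Any]] = None
    feature_id = work_item_id
    if work_item_type != "Feature":
        # 1A: only PBIs and Bugs need the extra hop to find the parent Feature.
        work_item = fetch_work_item(work_item_id)
        feature_id = work_item["parent_feature_id"]  # assumed field name
    # 1B: always read the Feature hierarchy (siblings, attachments, test cases).
    hierarchy = fetch_feature_hierarchy(feature_id)
    # 1C: linked test cases are read on the ORIGINAL id, not the parent.
    linked = get_linked_test_cases(work_item_id)
    return {
        "work_item": work_item,
        "hierarchy": hierarchy,
        "linked_test_cases": linked,
    }
```

Passing the tool calls in as parameters keeps the ordering logic testable without an Azure DevOps connection.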
+---
+## PHASE 2 — DIFF Using Semantic Matching
+String equality is not enough. Two test cases are **semantically equivalent** (duplicates)
+if they test the same condition on the same subject, even when worded differently.
+### Normalisation algorithm (apply before comparing)
+1. Lowercase the full title
+2. Strip tag prefixes: `[smoke]`, `[regression]`, `[a11y]`, `[negative]`, `[boundary]`, etc.
+3. Strip filler words: `the`, `a`, `an`, `is`, `are`, `should`, `will`, `when`, `after`, `during`
+4. Extract the **subject noun** (what is being tested: button, CSV, backend, upload, template…)
+5. Extract the **condition** (visible, absent, downloaded, cleared, rejected, correct…)
+6. If subject + condition match an existing case → **DUPLICATE**, do not create
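A minimal sketch of the six steps above, assuming a keyword-based extractor. The protocol does not say how subject and condition are extracted, so the `SUBJECTS` and `CONDITION_STEMS` vocabularies here are hypothetical, seeded only from the examples in the list.

```python
import re
from typing import Optional, Tuple

FILLER = {"the", "a", "an", "is", "are", "should", "will", "when", "after", "during"}
# Hypothetical vocabularies; a real agent would need a richer extractor.
SUBJECTS = {"button", "csv", "backend", "upload", "template"}
CONDITION_STEMS = ("visib", "absent", "download", "clear", "reject", "correct")

# One or more leading "[tag]" prefixes, e.g. "[smoke] [a11y] ".
TAG_PREFIX = re.compile(r"^\s*(?:\[[^\]]+\]\s*)+")


def normalise(title: str) -> Tuple[Optional[str], Optional[str]]:
    """Reduce a title to a (subject, condition) pair; None means 'not found'."""
    text = TAG_PREFIX.sub("", title.lower())                   # steps 1 and 2
    tokens = [t for t in re.findall(r"[a-z0-9]+", text) if t not in FILLER]  # step 3
    subject = next(                                            # step 4
        (t.rstrip("s") for t in tokens if t.rstrip("s") in SUBJECTS), None
    )
    condition = next(                                          # step 5
        (stem for t in tokens for stem in CONDITION_STEMS if t.startswith(stem)),
        None,
    )
    return subject, condition


def is_duplicate(proposed: str, existing: str) -> bool:
    """Step 6: duplicate only when both parts were found and both match."""
    key = normalise(proposed)
    return None not in key and key == normalise(existing)
```

Titles that fall outside the vocabularies normalise to `None` and are never reported as duplicates here, which errs toward creating rather than skipping; a stricter agent might flag them for manual review instead.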
+### Semantic coverage matrix
+```
+For each proposed test case:
+  normalise(proposed.title) → (subject, condition)
+  for each existing test case:
+    normalise(existing.title) → (subject, condition)
+    if subject matches AND condition matches → DUPLICATE → skip
+  if no match found → NET-NEW → add to creation list
+```
+Only pass the **net-new** list to `create_and_link_test_cases`.
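The matrix loop can be sketched as a partition function. The `key` parameter is a placeholder for whatever normaliser the agent uses; `demo_key` below is a deliberately simple stand-in, not the protocol's semantic normalisation.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def split_net_new(
    proposed: Iterable[str],
    existing: Iterable[str],
    key: Callable[[str], Hashable],
) -> Tuple[List[str], List[str]]:
    """Partition proposed titles into (net_new, duplicates)."""
    seen = {key(title) for title in existing}
    net_new: List[str] = []
    duplicates: List[str] = []
    for title in proposed:
        k = key(title)
        if k in seen:
            duplicates.append(title)
        else:
            net_new.append(title)
            # Remember the key so two proposed titles with the same
            # (subject, condition) pair are not both created.
            seen.add(k)
    return net_new, duplicates


def demo_key(title: str) -> str:
    """Stand-in normaliser: lowercase and collapse whitespace only."""
    return " ".join(title.lower().split())
```

Only the first list would be handed to `create_and_link_test_cases`; the second feeds the "Duplicates skipped" line of the report.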
+### Gherkin scenario deduplication
+For each proposed Gherkin scenario:
+1. Extract the scenario title
+2. Apply the same normalisation algorithm
+3. Compare against all scenario titles in existing `.feature` attachments AND existing ADO test case titles
+4. Skip every scenario that has a semantic match. If no net-new scenarios remain, do not attach anything; if some remain, attach a delta file containing only those scenarios.
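Step 1 needs the scenario titles out of an existing `.feature` attachment. A rough sketch, assuming the attachment content is available as plain text and uses the English Gherkin keywords; the function name is illustrative.

```python
import re
from typing import List

# Matches "Scenario:", "Scenario Outline:", "Scenario Template:" and "Example:"
# at the start of a line (after optional indentation) and captures the title.
SCENARIO_LINE = re.compile(
    r"^[ \t]*(?:Scenario(?: Outline| Template)?|Example):[ \t]*(.+?)[ \t]*$",
    re.MULTILINE,
)


def scenario_titles(feature_text: str) -> List[str]:
    """Return scenario titles in file order; the Feature line is ignored."""
    return SCENARIO_LINE.findall(feature_text)
```

The extracted titles can then go through the same normalisation as test case titles before comparison.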
+### Playwright code deduplication
+- `locators.ts`: only emit keys not already in existing file; if delta empty → skip; if non-empty → attach as `locators.delta.ts`
+- `*Page.ts`: only emit methods not already present; if delta empty → skip; if non-empty → attach as `*Page.delta.ts`
+- `*.steps.ts`: only emit step strings not already registered; NEVER re-register an existing step (this causes an `Ambiguous step definition` runtime error)
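The delta rules for `locators.ts` and `*Page.ts` come down to a set difference plus a naming convention. A small sketch, assuming locator keys and selectors have already been parsed into a dictionary; parsing the TypeScript itself is out of scope and both function names are illustrative.

```python
from typing import Dict, Iterable, Optional


def locator_delta(existing_keys: Iterable[str], proposed: Dict[str, str]) -> Dict[str, str]:
    """Keep only proposed locators whose key is not already defined."""
    known = set(existing_keys)
    return {key: selector for key, selector in proposed.items() if key not in known}


def delta_filename(original_name: str, delta: Dict[str, str]) -> Optional[str]:
    """Return e.g. 'locators.delta.ts', or None when there is nothing to attach."""
    if not delta:
        return None
    stem, dot, ext = original_name.rpartition(".")
    if not dot:  # no extension: just append the marker
        return f"{original_name}.delta"
    return f"{stem}.delta.{ext}"
```

The same pattern applies to page object methods; step definitions need the stricter check on step strings, because a repeated string fails at runtime rather than merely diverging.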
+---
+## PHASE 3 — CREATE Only What Is Missing
+| Diff result | Action |
+|---|---|
+| Zero net-new | **Skip.** Log: `✅ Already fully covered — nothing to create.` |
+| Some net-new, some duplicates | **Create only net-new.** Log which were skipped and why. |
+| All net-new (first run) | **Create all.** Normal flow. |
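The table maps two counts to one of three actions. As a sketch, with action labels that are placeholders rather than values defined by the protocol:

```python
def phase3_action(proposed: int, net_new: int) -> str:
    """Choose the PHASE 3 action from the PHASE 2 counts."""
    if not 0 <= net_new <= proposed:
        raise ValueError("net_new must be between 0 and proposed")
    if net_new == 0:
        return "skip"                  # already fully covered
    if net_new < proposed:
        return "create-net-new-only"   # log which titles were skipped and why
    return "create-all"                # nothing was a duplicate
```

With the counts from the examples later in this file, 14 proposed with 0 net-new yields `skip`, and 5 proposed with 2 net-new yields `create-net-new-only`.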
+### Mandatory deduplication report
+Output after every run regardless of whether anything was created:
+```
+## Deduplication Report — Work Item #<id>
+### Test Cases
+- Existing: <count> linked
+- Proposed: <count>
+- Duplicates skipped: <count> (<titles>)
+- Net-new created: <count> (<IDs if created>)
+### Gherkin Feature File
+- Existing attachments: <filenames or "none">
+- Proposed scenarios: <count>
+- Duplicate scenarios skipped: <count> (<titles>)
+- Net-new scenarios: <count>
+- Action: <"Skipped — fully covered" | "Attached delta" | "Attached full (first run)">
+### Playwright Code
+- Existing locators.ts: <"found — N keys" | "not found">
+- Net-new locator keys: <count> (<list or "none">)
+- Existing page object: <"found — N methods" | "not found">
+- Net-new page methods: <count> (<list or "none">)
+- Existing steps file: <"found — N steps" | "not found">
+- Net-new step definitions: <count> (<list or "none">)
+- Action per file: <"Skipped" | "Attached delta" | "Attached full (first run)">
+```
+---
+## Hard Rules — Never Violate These
+1. **NEVER call `create_and_link_test_cases` without first calling `get_linked_test_cases`** on the same work item.
+2. **NEVER attach a `.feature` file without first checking for existing feature file attachments.**
+3. **NEVER attach a `steps.ts` that re-registers an existing step string** — causes `Ambiguous step definition` at runtime.
+4. **NEVER treat tag differences as meaningful.** `[REGRESSION] Export List downloads CSV` is a duplicate of `[SMOKE] Clicking Export List downloads a valid CSV file`.
+5. **NEVER create more than one test case covering the same (subject, condition) pair.**
+6. **NEVER skip this protocol because the work item appears new.** A work item that shows `existing_test_cases_count: 0` in the hierarchy can still have linked cases that only `get_linked_test_cases` returns, so always call both.
+---
+## Integration With Other QA Skills
+This protocol is a **work-item-scoped pre-flight gate**.
+- **Each unique `work_item_id` gets exactly one PHASE 1 run.** Findings are cached and reused by every subsequent agent operating on the same item.
+- **A different `work_item_id` always triggers a fresh PHASE 1.** Cache from item A must never be used for item B.
+```
+work_item_id = 111  →  PHASE 1 runs in full → CACHE[111] populated
+  generate-gherkin on 111      →  reads CACHE[111], skips PHASE 1
+  generate-playwright on 111   →  reads CACHE[111], skips PHASE 1
+work_item_id = 222  →  PHASE 1 runs in full → CACHE[222] populated (CACHE[111] untouched)
+```
+Skills that delegate here:
+```
+skills/generate-gherkin.md          →  delegates here; reads work-item cache
+skills/generate-playwright-code.md  →  delegates here; reads work-item cache
+```
+### Work-Item Cache Schema
+```
+CACHE[work_item_id] = {
+  work_item:            { id, type, title, acceptance_criteria },  # PBI/Bug/Feature
+  parent_feature:       { id, title, description },
+  sibling_pbis:         [{ id, type, title, acceptance_criteria, state }],
+  flow_map:             <assembled string describing the full user journey>,
+  existing_test_cases:  [{ id, title, priority }],
+  existing_attachments: {
+    feature_files: [{ name, content }],
+    locators_ts:   { found: bool, keys: [] },
+    page_objects:  [{ name, methods: [] }],
+    steps_files:   [{ name, step_strings: [] }],
+  },
+  gap_check_completed:  bool,   # set true after generate-gherkin Step 2 passes
+  phase1_completed:     bool,   # set true once PHASE 1 has run in full
+}
+```
+**Before running PHASE 1**, check `CACHE[work_item_id].phase1_completed`:
+- If **true** → skip PHASE 1 entirely; use cached data for PHASE 2.
+- If **false / not set** → run PHASE 1 in full and populate the cache.
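The cache gate can be sketched as a single helper. The in-memory dictionary and the `run_phase1` callable are assumptions; the protocol does not say where the cache lives or how long it persists.

```python
from typing import Any, Callable, Dict

# Keyed by work_item_id so findings for item A are never reused for item B.
CACHE: Dict[int, Dict[str, Any]] = {}


def ensure_phase1(
    work_item_id: int,
    run_phase1: Callable[[int], Dict[str, Any]],
) -> Dict[str, Any]:
    """Return cached PHASE 1 findings, running PHASE 1 only on a cache miss."""
    entry = CACHE.get(work_item_id)
    if entry is not None and entry.get("phase1_completed"):
        return entry
    entry = dict(run_phase1(work_item_id))
    entry["phase1_completed"] = True
    CACHE[work_item_id] = entry
    return entry
```

Calling `ensure_phase1` from each skill means the second and later skills on the same work item reuse the first skill's reads instead of hitting ADO again.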
+---
+## Examples
+### Multiple work items in one session
+```
+► PBI #111 processed:
+  CACHE[111] not found → PHASE 1 in full → CACHE[111].phase1_completed = true
+  generate-gherkin on 111    → reads CACHE[111]
+  generate-playwright on 111 → reads CACHE[111], skips all ADO reads
+► Bug #222 processed:
+  CACHE[222] not found → PHASE 1 in full → CACHE[222].phase1_completed = true
+  (CACHE[111] untouched — completely independent)
+```
+### Second run on a fully covered item
+```
+PHASE 1: get_linked_test_cases(273440) → 35 cases; fetch_feature_hierarchy → .feature attached
+PHASE 2: 14 proposed → 14/14 duplicates; 0 net-new locators, methods, steps
+PHASE 3: Skip all
+REPORT:  ✅ Work item #273440 fully covered. Nothing to create.
+```
+### Partial gap run
+```
+PHASE 1: 3 cases found, no .feature, no Playwright attachments
+PHASE 2: 5 proposed → 3 duplicates → 2 net-new
+PHASE 3: create 2 test cases; attach full .feature (first run); attach full Playwright files
+```