@qa-gentic/stlc-agents (npm) 1.0.17 → 1.0.19

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/ORCHESTRATION_RULES.md +283 -0
package/README.md +250 -57
package/bin/postinstall.js +9 -1
package/package.json +15 -2
package/src/cli/cmd-init.js +19 -2
package/src/cli/cmd-mcp-config.js +10 -14
package/src/cli/cmd-skills.js +21 -4
package/src/stlc_agents/shared/install_hook.py +154 -0

package/ORCHESTRATION_RULES.md ADDED

@@ -0,0 +1,283 @@
+# Orchestration Rules — Multi-Step Pipeline Agents
+> **Universal rules file for coding agents (Claude, Copilot, Cursor, Windsurf, Gemini, etc.)**
+> Place this file in your project root or `.ai/` folder. Reference it in your prompt with:
+> `"Apply all rules from ORCHESTRATION_RULES.md before executing any step."`
+---
+## 1. Core Principle
+Every intermediate output is an **input contract** for the next step — not a done state.
+A step is only complete when its output has been validated against its spec and confirmed
+fit for downstream consumption. Never proceed on a "good enough" assumption.
+---
+## 2. Mandatory Behaviours
+### 2.1 Explicit Task Breakdown
+- Before executing, decompose the full task into named steps using a todo/task list tool
+  (e.g. `manage_todo_list`, GitHub Copilot Tasks, Cursor task panel).
+- Each step must have a declared **input**, **action**, and **output spec**.
+- Mark steps as `[ ] pending`, `[~] in-progress`, `[x] done`, `[!] blocked` — update in real time.
+- Never treat a step as in-progress and done simultaneously.
+### 2.2 No Skipping Intermediate Steps
+- If a step produces data that the next step consumes, that data must be:
+  1. **Extracted** from raw output (not left embedded)
+  2. **Structured** into the agreed schema
+  3. **Validated** against the checkpoint gate
+  4. **Explicitly handed off** as a named artefact
+- Do NOT jump ahead assuming the downstream step can infer missing data.
+- Do NOT proceed if a required input is absent or malformed.
+### 2.3 Checkpoint Gates Are Blocking — Pre-Flight Required
+Gates are not a post-generation reflection. They run **before** output is produced.
+**Before generating output for any step, you MUST:**
+1. Output the pre-flight checklist with your intended answers filled in.
+2. Only if all items are YES — proceed to generate the output.
+3. If any item is NO — stop, state what is missing, and wait for the user.
+This is not optional. Generating output first and checking after is a rule violation.
+Pre-flight format (required before every step):
+```
+PRE-FLIGHT: Step [N] — [Step Name]
+  [ ] Input artefact "[name]" received from Step [N-1]?
+  [ ] Input matches expected schema?
+  [ ] [Step-specific countable check, e.g. "11 scenarios in scenario_inventory?"]
+  [ ] [Any tool or selector availability check]
+→ PROCEED  /  → BLOCKED: [state what is missing]
+```
+### 2.4 Data Handoff Must Be Explicit
+- Output data in full — not summarised, not truncated.
+- Use a consistent schema (JSON, YAML, or named list) — do not change shape between steps.
+- Name the artefact (e.g. `context_map`, `test_case_list`, `scenario_inventory`).
+- The receiving step must reference the artefact by name, not re-derive it.
+### 2.5 No Placeholder or Stub Outputs
+This rule applies at generation time, not reflection time. The agent must not produce
+stubs and then acknowledge the violation — it must prevent them before generating.
+- Never produce output containing `TODO`, `placeholder`, `// implement later`,
+  `throw new Error('pending')`, or any empty method/step body.
+- Before generating a file, state the **expected item count** (e.g. number of step
+  definitions, number of test cases). The generated file must match that count exactly.
+- If a step cannot produce a complete output, declare it `[!] blocked` and stop.
+- Partial outputs passed downstream cause compounding failures and wasted tokens.
+**Countable verification pattern (required for code generation steps):**
+```
+Expected: [N] step definitions (from scenario_inventory)
+Generating: [N] step definitions
+Verify after: count implemented bodies — must equal [N], zero empty
+```
+### 2.6 Query-Driven Data Capture (Snapshot / Scraping Steps)
+- Navigation is NOT the deliverable — **structured data extraction** is.
+- For every screen or page visited, immediately extract all required fields before moving on.
+- Do not defer extraction to a later step.
+- Capture only what downstream steps need (defined by the step's output spec).
+- Validate coverage: every field required by downstream must be present in the captured data.
+### 2.7 Split Generation Steps to Prevent Silent Stubs
+For any step that generates code or structured output consumed by a subsequent step,
+split it into two sub-steps:
+- **[N]a — Signatures only:** generate method/step signatures (names, parameters) with no bodies.
+  Output as an inventory list. This makes the expected count explicit and visible.
+- **[N]b — Implement each signature:** implement every item from the [N]a inventory.
+  No body may be left empty. Reference `context_map` or equivalent for all selectors/data.
+This forces a visible count before implementation begins, eliminating silent stub generation.
+### 2.8 Token Efficiency
+- Avoid re-deriving data already produced in a prior step.
+- Reference prior artefacts by name; do not re-fetch or re-generate unless a gate failed.
+- If rework is needed, state which gate failed, what was missing, and what the corrected output is.
+---
+## 3. Error Handling
+| Situation | Required Action |
+|---|---|
+| Gate fails | STOP. Report failed items. Wait for user input or resolution. |
+| Required input missing | STOP. Name the missing input. Do not guess. |
+| Tool call returns empty | STOP. Report. Do not silently continue. |
+| Partial output produced | Mark step `[!] blocked`. Do not pass partial output downstream. |
+| Schema mismatch | STOP. Show expected vs actual schema. Do not transform silently. |
+| Ambiguous instruction | Ask one clarifying question before proceeding. Do not assume. |
+| Stub/TODO found in output | STOP. Do not accept the output. Regenerate from signatures. |
+| Count mismatch (generated vs expected) | STOP. List which items are missing. Do not proceed. |
+---
+## 4. Checkpoint Gate Template
+Run this **before** generating output — not after.
+```
+CHECKPOINT GATE [N] — [Step Name]
+---------------------------------------
+PRE-FLIGHT (run before generating):
+  [ ] Input artefact received and named
+  [ ] Input matches expected schema
+  [ ] Expected output count stated: [N items]
+  [ ] All required tools/selectors available
+POST-GENERATION (run before handing off):
+  [ ] Actual output count matches expected: [N of N]
+  [ ] No stubs, TODOs, or empty bodies in output
+  [ ] All items from upstream list are accounted for
+  [ ] Output is in agreed schema / format
+RESULT:  [ ] PASS — hand off artefact to Step [N+1]
+         [ ] FAIL — stop and report: [list what failed]
+```
+---
+## 5. Step Definition Template
+```
+STEP [N] — [Name]
+---------------------------------------
+Tool / Agent:   [name of tool, MCP server, or agent]
+Input (required):
+  - [Named artefact from Step N-1]
+  - [Any other required input]
+Pre-flight check:
+  - State expected output count before generating
+  - Confirm all inputs available and schema-valid
+Action:
+  [Precise description — not vague verbs like "process" or "handle"]
+  If code generation: split into [N]a (signatures) and [N]b (implementations)
+Output spec (the contract):
+  - Artefact name: [e.g. context_map]
+  - Format: [JSON | YAML | list | file | etc.]
+  - Required fields: [enumerate them]
+  - Coverage requirement: [e.g. one implementation per Gherkin step]
+Checkpoint Gate:  → run Gate [N] template above
+```
+---
+## 6. Anti-Patterns (Never Do These)
+| Anti-Pattern | Why It Fails | Correct Behaviour |
+|---|---|---|
+| Generating output then checking the gate | Gate runs after stubs already exist — violation acknowledged but not prevented | Run pre-flight checklist before generating |
+| Treating gate as a reflection step | Agent notices violation after the fact; output is already committed | Gate is a pre-condition, not a review |
+| Skipping data extraction after capture | Downstream step receives raw/unstructured input and must infer | Extract and structure data immediately after capture |
+| Jumping to generation without verified inputs | Output based on inference, not facts — stubs and errors result | Validate inputs at gate before calling the generator |
+| Treating "good enough" output as done | Errors compound; rework costs more tokens than doing it right | Validate against spec before marking a step complete |
+| Producing stubs with TODO | Downstream steps receive incomplete contracts and silently fail | Block the step; declare it incomplete; stop |
+| Re-deriving upstream data in a downstream step | Wasted tokens; divergence risk if re-derivation differs | Reference the named artefact from the prior step |
+| Proceeding past a failed gate | Snowballing failures requiring full rework | Stop at the gate; surface the gap; wait for resolution |
+| Single atomic generation step for code | No visible count before generation — stubs go undetected | Split into signatures ([N]a) then implementations ([N]b) |
+---
+## 7. Orchestration Health Checks
+Run at the start of any multi-step task:
+- [ ] Are all steps named and sequenced in the task list?
+- [ ] Does each step have a declared input and output spec?
+- [ ] Does each step have a defined pre-flight and post-generation gate?
+- [ ] Are code generation steps split into signatures + implementations?
+- [ ] Are all required tools / MCP servers available?
+- [ ] Are named artefacts from prior steps available as inputs?
+If any health check fails before execution begins, resolve it first.
+---
+## 8. Agent-Specific Integration Notes
+### Claude (claude.ai / API)
+- Reference this file in your system prompt or project instructions.
+- Use `manage_todo_list` for step tracking.
+- Attach this file as a project document so it persists across sessions.
+- Pre-flight checklists work reliably as Claude outputs reasoning before tool calls.
+### GitHub Copilot (VS Code / JetBrains)
+- Add to `.github/copilot-instructions.md` or reference in your workspace prompt.
+- **Critical:** Copilot treats rules as advisory context — gates are not enforced at runtime.
+  Mitigate by scoping rules to file types (e.g. `When generating *.steps.ts files, you must...`).
+- Always include the pre-flight checklist directly in your chat message for the current step,
+  not just in the rules file. Copilot applies in-message instructions more reliably than
+  file-level rules for generation constraints.
+- Use the countable verification pattern (section 2.5) explicitly in each chat prompt:
+  "There are 11 scenarios. Generate exactly 11 step definitions. State the count before writing."
+- For step definition files: add a file-type-scoped rule to `.github/copilot-instructions.md`:
+  ```
+  When generating Playwright step definition files (*.steps.ts):
+  1. Count Given/When/Then steps in the linked .feature file and state the count first.
+  2. Every step body must contain real implementation — no TODO, no throw pending, no empty bodies.
+  3. If a selector is missing from context_map, name the missing step and stop. Do not stub it.
+  ```
+### Cursor
+- Place in `.cursor/rules/` as `orchestration.mdc` (set scope: `always`).
+- Or add to `.cursorrules` in the project root.
+- Cursor applies project-level rules more consistently than Copilot for generation steps.
+- Use `@file` references in chat to explicitly pull the rules into context per step.
+### Windsurf (Codeium)
+- Place in `.windsurf/rules.md` or reference in the global rules panel.
+- Windsurf's Cascade agent picks up project-level markdown rules automatically.
+### Gemini CLI / Vertex AI Agent Builder
+- Reference via system instruction or as a grounding document.
+- Use the Step Definition Template when constructing task configs.
+---
+## 9. Quick Reference Card
+```
+BEFORE EACH STEP:
+  1. Output the pre-flight checklist — fill in all items.
+  2. If all YES → state expected output count → generate.
+  3. If any NO → stop and report.
+AFTER EACH STEP:
+  1. Run post-generation gate — count actual vs expected.
+  2. If PASS → hand off named artefact to next step.
+  3. If FAIL → stop, report, regenerate.
+FOR CODE GENERATION:
+  1. Generate signatures/names only first ([N]a).
+  2. State the count from [N]a.
+  3. Implement every item ([N]b) — zero empty bodies allowed.
+NEVER:
+  - Generate output then check the gate.
+  - Proceed past a failed gate.
+  - Pass unstructured or partial data downstream.
+  - Produce stubs and acknowledge them — prevent them.
+```
+---
+*Version 1.1 — Updated to add pre-flight gate enforcement, countable verification,
+split generation step pattern, and Copilot-specific stub prevention guidance.
+Root cause addressed: gates were post-generation reflections, not pre-generation blockers.*
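
The post-generation gate and the countable verification pattern in the added rules file (sections 2.5 and 4) amount to a count check plus a stub scan. The following is a minimal Python sketch of that idea and is not code from the package. It assumes generated items are held as a dict of name to body text; `GateResult`, `post_generation_gate`, and `STUB_MARKERS` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field

# Stub markers named in section 2.5 of the rules file.
STUB_MARKERS = (
    "TODO",
    "placeholder",
    "// implement later",
    "throw new Error('pending')",
)


@dataclass
class GateResult:
    passed: bool
    failures: list[str] = field(default_factory=list)


def post_generation_gate(expected: list[str], generated: dict[str, str]) -> GateResult:
    """Hypothetical helper: compare generated bodies with the [N]a inventory."""
    failures: list[str] = []

    # Count check: the generated count must equal the expected count exactly.
    if len(generated) != len(expected):
        failures.append(
            f"count mismatch: expected {len(expected)}, got {len(generated)}"
        )

    # Coverage check: list which upstream items are missing.
    for name in expected:
        if name not in generated:
            failures.append(f"missing item: {name}")

    # Stub check: no empty bodies and no stub markers.
    for name, body in generated.items():
        if not body.strip():
            failures.append(f"empty body: {name}")
        elif any(marker in body for marker in STUB_MARKERS):
            failures.append(f"stub marker in: {name}")

    return GateResult(passed=not failures, failures=failures)
```

A caller would stop and report `failures` on a failed gate rather than hand the artefact to the next step, which matches the "Count mismatch" and "Stub/TODO found" rows of the error-handling table.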

package/README.md CHANGED

@@ -1,11 +1,12 @@
 # @qa-gentic/stlc-agents
-> AI-powered QA STLC automation — from Azure DevOps **or Jira Cloud** work item to self-healing Playwright TypeScript in a Helix-QA project.
+> AI-powered QA STLC automation — from Azure DevOps **or Jira Cloud** work item state change to self-healing Playwright TypeScript in a Helix-QA project.
-Works with **GitHub Copilot** (VS Code Agent mode), **Claude Code**, **Cursor**, and **Windsurf**.
+Works with **GitHub Copilot** (VS Code Agent mode), **Claude Code**, **Cursor**, and **Windsurf**.
+Also runs fully headless via the **webhook bridge** — no human in the loop required.
-![npm version](https://img.shields.io/badge/npm-v1.0.10-blue)
-![PyPI version](https://img.shields.io/badge/pypi-v1.0.10-blue)
+![npm version](https://img.shields.io/badge/npm-v1.0.17-blue)
+![PyPI version](https://img.shields.io/badge/pypi-v1.0.17-blue)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
 [![Node.js >=18](https://img.shields.io/badge/node-%3E%3D18-brightgreen)](https://nodejs.org)
 [![Python >=3.10](https://img.shields.io/badge/python-%3E%3D3.10-blue)](https://python.org)
@@ -14,19 +15,103 @@ Works with **GitHub Copilot** (VS Code Agent mode), **Claude Code**, **Cursor**,
 ## What It Does
-Five Python MCP servers cover the full QA Software Test Life Cycle:
+Five Python MCP servers cover the full QA Software Test Life Cycle. A sixth server (Playwright MCP) drives a real browser during code generation.
-| Agent | Input | Output |
-|---|---|---|
-| `qa-test-case-manager` | ADO PBI / Bug / Feature ID | Manual test cases created & linked via TestedBy-Forward |
-| `qa-gherkin-generator` | ADO or Jira Epic / Feature / PBI / Bug ID | `.feature` file attached to work item or saved locally |
-| `qa-playwright-generator` | Gherkin + live browser AX tree | `locators.ts` + page objects + step defs |
-| `qa-helix-writer` | Generated `.ts` files + `helix_root` | Files written to Helix-QA directory layout on disk |
-| `qa-jira-manager` | Jira Story / Bug / Task / Epic | Test cases in Jira + full pipeline to Helix-QA |
+| Agent | Server name | Input | Output |
+|---|---|---|---|
+| Agent 1 | `qa-test-case-manager` | ADO PBI / Bug / Feature | Manual test cases created & linked (TestedBy-Forward), deduped on re-trigger |
+| Agent 2 | `qa-gherkin-generator` | ADO Feature / PBI / Bug | `.feature` file validated and attached to work item |
+| Agent 3 | `qa-playwright-generator` | Gherkin + optional AX-tree `context_map` | `locators.ts` + page objects + step defs (cached, retrieved via `get_generated_files`) |
+| Agent 4 | `qa-helix-writer` | Generated `.ts` files + `helix_root` | Files written to Helix-QA directory layout, never overwrites |
+| Agent 5 | `qa-jira-manager` | Jira Story / Bug / Task | Test cases created & linked in Jira, `.feature` attached to issue, deduped on re-trigger |
+---
+## End-to-End Flow
+### Webhook-triggered (headless)
+A work item state change in ADO or Jira fires a webhook POST to the bridge. The bridge normalises the payload and routes it through the pipeline automatically.
+```
+ADO Service Hook / Jira Webhook
+        │  POST
+        ▼
+webhook_bridge/server.py  (FastAPI — qa-stlc-serve)
+        │
+        ├─ parsers.py       raw payload → normalised event dict
+        ├─ state_router.py  event → STAGE_TEST_CASES | STAGE_FULL_PIPELINE | SKIP
+        │
+        └─ ci_runner/pipeline.py
+              ├─ run_test_cases()    → Agent 1 (ADO) or Agent 5 (Jira)
+              └─ run_post_done()     → Agent 2 → Agent 3 → Agent 4
+```
+### State → stage mapping
-A sixth server — **Playwright MCP** (`http://localhost:8931/mcp`) — drives a real browser during code generation, replacing hand-authored locators with accessibility-tree-derived, zero-hallucination selectors.
+| Platform | Work item state | Stage | Agents called |
+|---|---|---|---|
+| ADO | `Approved` / `Committed` | `STAGE_TEST_CASES` | Agent 1 only |
+| ADO | `Done` | `STAGE_FULL_PIPELINE` | Agents 2 → 3 → 4 |
+| Jira | `In Progress` / `Selected for Development` | `STAGE_TEST_CASES` | Agent 5 only |
+| Jira | `Done` | `STAGE_FULL_PIPELINE` | Agents 2 → 3 → 4 |
-> **ADO or Jira?** Both pipelines run the full STLC: fetch → analyse → test cases → Gherkin → Playwright → Helix-QA. ADO uses `qa-test-case-manager` as the entry point; Jira uses `qa-jira-manager`. Agents 2–4 (`qa-gherkin-generator`, `qa-playwright-generator`, `qa-helix-writer`) are shared by both pipelines.
+Any other state is silently dropped. State names are configurable in `state_router.py`.
+### STAGE_TEST_CASES — Agent 1 / Agent 5
+```
+fetch_work_item / fetch_jira_issue
+        ↓
+[LLM] generate_test_cases()        ← pipeline bridges fetch → LLM → create
+        ↓
+create_deduped_test_cases()        ← internally: get_linked → filter dupes → create_and_link
+```
+On re-trigger, titles already linked are skipped (case-insensitive, stop-word normalised). Only net-new test cases are created.
+### STAGE_FULL_PIPELINE — Agents 2 → 3 → 4
+First error stops the chain and surfaces in the result dict. Subsequent agents are not called.
+**Agent 2 — Gherkin**
+```
+ADO Feature : fetch_feature_hierarchy  →  [LLM] generate_gherkin
+ADO PBI/Bug : fetch_work_item_for_gherkin  →  [LLM] generate_gherkin
+Jira        : fetch_jira_issue (via Agent 5)  →  [LLM] generate_gherkin
+        ↓
+validate_gherkin_content()             ← structural check before attach
+        ↓  (if invalid → pipeline stops, returns validation errors)
+ADO Feature : attach_gherkin_to_feature()
+ADO PBI/Bug : attach_gherkin_to_work_item()
+Jira        : attach_gherkin_to_issue()    ← uploads via Jira attachment API
+```
+The `generate_and_attach_gherkin` composite tool wraps validate + attach for CI/headless callers.
+**Agent 3 — Playwright**
+```
+generate_playwright_code(gherkin_content, page_class_name)
+        ↓  returns cache_key (files held in-memory)
+get_generated_files(cache_key)
+        ↓  returns { "path/to/file.ts": "content", ... }
+```
+`page_class_name` is derived from `event["title"]` (ADO) or `event["summary"]` (Jira) — camel-cased, max 4 words. Without a `context_map`, locators are Gherkin-inferred (stability=0). Pass `app_url` to embed a snapshot hint comment in `locators.ts`.
+**Agent 4 — Helix writer**
+```
+inspect_helix_project(helix_root)
+        ↓  framework_state: absent | partial | present
+write_helix_files(helix_root, files, mode)
+        mode = "scaffold_and_tests"  if absent or partial
+        mode = "tests_only"          if present
+```
+Agent 4 handles all deduplication and conflict renaming internally. No file is ever overwritten.
 ---
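
The state → stage table added in the hunk above reads naturally as a lookup keyed on platform and work item state. The sketch below is illustrative only and is not the contents of `state_router.py`, which this diff does not show. The event keys `platform` and `state`, and the function names `route` and `agents_for`, are assumptions.

```python
STAGE_TEST_CASES = "STAGE_TEST_CASES"
STAGE_FULL_PIPELINE = "STAGE_FULL_PIPELINE"
SKIP = "SKIP"

# (platform, state) -> stage, copied from the README table.
STATE_TO_STAGE = {
    ("ado", "Approved"): STAGE_TEST_CASES,
    ("ado", "Committed"): STAGE_TEST_CASES,
    ("ado", "Done"): STAGE_FULL_PIPELINE,
    ("jira", "In Progress"): STAGE_TEST_CASES,
    ("jira", "Selected for Development"): STAGE_TEST_CASES,
    ("jira", "Done"): STAGE_FULL_PIPELINE,
}


def route(event: dict) -> str:
    """Map a normalised event to a stage; unknown states are dropped (SKIP)."""
    key = (str(event.get("platform", "")).lower(), event.get("state", ""))
    return STATE_TO_STAGE.get(key, SKIP)


def agents_for(event: dict) -> list[int]:
    """Return the agent numbers the README lists for the routed stage."""
    stage = route(event)
    if stage == STAGE_TEST_CASES:
        # Agent 1 handles ADO work items, Agent 5 handles Jira issues.
        return [1] if str(event.get("platform", "")).lower() == "ado" else [5]
    if stage == STAGE_FULL_PIPELINE:
        return [2, 3, 4]
    return []
```

Keeping the mapping in one dict makes the README's statement that state names are configurable easy to honour: changing a state name is a one-line edit to the table.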
@@ -34,11 +119,9 @@ A sixth server — **Playwright MCP** (`http://localhost:8931/mcp`) — drives a
 ```bash
 # 1. Install the CLI + npm package globally
-#    You will be prompted to choose your integration: ado / jira / both
+#    Prompted to choose integration: ado / jira / both
 npm install -g @qa-gentic/stlc-agents
-```
-```bash
 # 2. Bootstrap your project
 qa-stlc init --vscode --integration ado    # GitHub Copilot / VS Code — ADO
 qa-stlc init --vscode --integration jira   # GitHub Copilot / VS Code — Jira
@@ -49,16 +132,30 @@ qa-stlc init --integration ado
 # 3. Scaffold a new Playwright + Cucumber + TypeScript QA project
 qa-stlc scaffold --name my-qa-project
-# 4. Start the Playwright browser server (required for code generation)
+# 4. Start the Playwright browser server (required for live-locator generation)
 npx @playwright/mcp@latest --port 8931
 ```
-`qa-stlc init` does four things:
+### Cost Tracking Activation
+**npm install** (`npm install -g @qa-gentic/stlc-agents`): Cost tracking is activated automatically after the Python servers are installed. No manual step required.
+**pip install** (`pip install qa-gentic-stlc-agents`): Cost tracking is **not** activated automatically. You must run one of the following after pip install:
+```bash
+qa-stlc-apply-cost
+# or
+python -m stlc_agents.shared.install_hook
+```
+This patches all MCP servers to log tokens and cost for every tool call.
+`qa-stlc init` does five things:
-1. `pip install qa-gentic-stlc-agents` — installs all five Python MCP servers
+1. `pip install qa-gentic-stlc-agents` — installs all five Python MCP servers + rules files
 2. Copies skill files to `.github/copilot-instructions/` (and `.claude/` if not `--vscode`)
 3. Copies custom agent files to `.github/agents/`
-4. Writes `.vscode/mcp.json` with all six servers configured
+4. Copies `ORCHESTRATION_RULES.md` to project root for reference during multi-step tasks
+5. Writes `.vscode/mcp.json` (or `.mcp.json`) with all six servers configured
 ---
@@ -78,12 +175,15 @@ npx @playwright/mcp@latest --port 8931
 ADO_ORGANIZATION_URL=https://dev.azure.com/your-org
 ADO_PROJECT_NAME=YourProject
 ADO_PAT=your-personal-access-token
-APP_BASE_URL=your-app-base-url
-APP_EMAIL=your-test-email@example.com
-APP_PASSWORD=your-test-password
+APP_BASE_URL=https://your-app.example.com
+# LLM — pick one provider:
+AI_HEALING_PROVIDER=anthropic
+AI_HEALING_API_KEY=sk-ant-...
+# or: OPENAI_API_KEY / AZURE_OPENAI_API_KEY / GITHUB_TOKEN (Copilot)
 ```
-**Jira `.env` vars (additional):**
+**Jira additional `.env` vars:**
 ```env
 JIRA_CLIENT_ID=your-atlassian-oauth-client-id
@@ -91,6 +191,14 @@ JIRA_CLIENT_SECRET=your-atlassian-oauth-client-secret
 JIRA_CLOUD_ID=your-atlassian-cloud-id
 ```
+**Webhook bridge additional `.env` vars:**
+```env
+WEBHOOK_SECRET=your-shared-secret
+HELIX_PROJECT_ROOT=/path/to/helix-qa   # where Agent 4 writes files
+PLAYWRIGHT_MCP_URL=http://localhost:8931/mcp   # leave blank to skip Agent 3
+```
 ---
 ## CLI Commands
@@ -101,56 +209,83 @@ JIRA_CLOUD_ID=your-atlassian-cloud-id
 | `qa-stlc scaffold [--name n] [--dir path]` | Copy full Playwright + Cucumber + TypeScript boilerplate to a new project |
 | `qa-stlc skills [--target claude\|vscode\|cursor\|windsurf]` | Copy skill files to the correct AI coding agent directory |
 | `qa-stlc mcp-config [--vscode] [--print]` | Write `.vscode/mcp.json` or `.mcp.json` with all servers configured |
-| `qa-stlc verify` | Check that all six MCP servers are reachable |
+| `qa-stlc verify` | Check that all MCP servers are reachable and auth is cached |
+| `qa-stlc-serve [--host] [--port] [--reload]` | Start the webhook bridge (FastAPI) |
+---
+## Orchestration Rules
+When `qa-stlc init` is run, `ORCHESTRATION_RULES.md` is installed to your project root and placed in both npm (`node_modules/@qa-gentic/stlc-agents/`) and pip (`site-packages/stlc_agents/`) installations.
+Refer to this file in multi-step QA workflows to ensure:
+- **Step breakdown:** every task is decomposed into named steps with explicit inputs/outputs
+- **Pre-flight gates:** output validation runs *before* generation, not after
+- **No skipped steps:** intermediate data is structured and handed off explicitly, never inferred
+- **Countable verification:** code generation steps state expected item counts before implementation
+- **Zero stubs:** no partial outputs with TODOs or empty bodies passed downstream
+Integration per AI coding agent:
+- **GitHub Copilot (VS Code):** Add to `.github/copilot-instructions.md` or reference in your workspace prompt
+- **Claude Code:** Reference in `.claude/instructions.md` or as a project document
+- **Cursor:** Add to `.cursor/rules/orchestration.mdc` (scope: always)
+- **Windsurf:** Reference in `.windsurf/rules.md` or global rules panel
+See section 8 of `ORCHESTRATION_RULES.md` for agent-specific integration notes.
 ---
 ## Tool Reference
-### qa-test-case-manager _(Azure DevOps)_
+### Agent 1 — `qa-test-case-manager` _(Azure DevOps)_
 | Tool | Description |
 |---|---|
-| `fetch_work_item` | Fetch a PBI, Bug, or Feature with acceptance criteria, coverage hints, and existing TC count. Returns `epic_not_supported` for Epics; `confirmation_required: true` for Features. |
-| `get_linked_test_cases` | List all test cases already linked to a work item (deduplication). |
-| `create_and_link_test_cases` | Create structured manual test cases in ADO and link via TestedBy-Forward. |
+| `fetch_work_item` | Fetch a PBI, Bug, or Feature with acceptance criteria, coverage hints, and existing TC count. Returns `epic_not_supported` for Epics. |
+| `get_linked_test_cases` | List all test cases linked via TestedBy-Forward (used by dedup). |
+| `create_and_link_test_cases` | Create structured manual test cases and link to work item. Feature confirmation gate fires unless `confirmed=true`. |
+| `create_deduped_test_cases` | **Headless/webhook tool.** Internally calls `get_linked_test_cases`, filters duplicates (normalised title match), then calls `create_and_link_test_cases` on net-new only. Safe to call on every re-trigger. |
-### qa-gherkin-generator _(Azure DevOps)_
+### Agent 2 — `qa-gherkin-generator` _(Azure DevOps)_
 | Tool | Description |
 |---|---|
-| `fetch_feature_hierarchy` | Fetch a Feature and all child PBIs/Bugs with acceptance criteria and test case steps. |
-| `fetch_work_item_for_gherkin` | Fetch a PBI or Bug with parent Feature context and suggested file name. |
-| `attach_gherkin_to_feature` | Validate and attach a `.feature` file to a Feature work item. |
-| `attach_gherkin_to_work_item` | Validate and attach a `.feature` file to a PBI or Bug. |
-| `validate_gherkin_content` | Structural validation — returns `valid: bool` + `errors` + `warnings`. |
+| `fetch_feature_hierarchy` | Fetch a Feature + all child PBIs/Bugs + coverage hints. |
+| `fetch_work_item_for_gherkin` | Fetch a single PBI or Bug with parent Feature context. |
+| `validate_gherkin_content` | Structural check: @smoke/@regression tags, scenario count (5–10 feature / 3–9 work_item), every scenario has `When`, no duplicate titles. |
+| `attach_gherkin_to_feature` | Validate + upload + link `.feature` to a Feature work item. |
+| `attach_gherkin_to_work_item` | Validate + upload + link `.feature` to a PBI or Bug. |
+| `generate_and_attach_gherkin` | **Headless/webhook composite.** Accepts pre-generated `gherkin_content`, validates, attaches. Returns `status: validation_failed` with errors if invalid — pipeline must re-generate. |
-### qa-playwright-generator _(ADO + Jira)_
+### Agent 3 — `qa-playwright-generator` _(ADO + Jira)_
 | Tool | Description |
 |---|---|
-| `generate_playwright_code` | Generate `locators.ts`, `*Page.ts`, `*.steps.ts`, and `cucumber-profile.js` from Gherkin + live AX-tree `context_map`. Hard-blocks if `context_map` is absent. |
-| `scaffold_locator_repository` | Generate the five Helix-QA healing infrastructure files. |
+| `generate_playwright_code` | Generate `locators.ts`, `*Page.ts`, `*.steps.ts`, `cucumber-profile.js`. Returns `cache_key`. Optional `context_map` for AX-tree-verified locators; optional `app_url` embeds snapshot hint. |
+| `get_generated_files` | Retrieve full file content by `cache_key` from Agent 3's in-memory cache. |
+| `scaffold_locator_repository` | Generate the five Helix-QA healing infrastructure files (`LocatorHealer`, `TimingHealer`, `VisualIntentChecker`, `LocatorRepository`, `HealingDashboard`). Call once per project. |
 | `validate_gherkin_steps` | Check for duplicate step strings and missing `When` steps. |
-| `attach_code_to_work_item` | Attach delta Playwright TypeScript files to an ADO work item. |
+| `attach_code_to_work_item` | Attach generated TypeScript delta files to an ADO work item. |
-### qa-helix-writer _(ADO + Jira)_
+### Agent 4 — `qa-helix-writer` _(ADO + Jira)_
 | Tool | Description |
 |---|---|
-| `inspect_helix_project` | Returns `framework_state` (`present` / `partial` / `absent`) and `recommendation`. |
+| `inspect_helix_project` | Returns `framework_state`: `present` / `partial` / `absent`. Drives `mode` selection in pipeline. |
+| `write_helix_files` | Write files to Helix-QA layout. `mode=scaffold_and_tests` for new/partial projects; `mode=tests_only` for existing. Never overwrites; deduplicates and conflict-renames. |
 | `list_helix_tree` | Full directory listing of a Helix-QA project. |
 | `read_helix_file` | Read an existing file for overlap detection. |
-| `write_helix_files` | Write generated files to the correct Helix-QA paths. |
-### qa-jira-manager _(Jira Cloud)_
+### Agent 5 — `qa-jira-manager` _(Jira Cloud)_
 | Tool | Description |
 |---|---|
 | `fetch_jira_issue` | Fetch a Story, Bug, or Task with acceptance criteria and coverage hints. Returns `epic_use_hierarchy` for Epics. |
-| `fetch_jira_epic_hierarchy` | Fetch an Epic with all child issues (Stories, Tasks, Bugs, Sub-tasks). |
-| `create_and_link_test_cases` | Create test case issues in Jira (type `Test`, falls back to `Task`) and link via `is tested by`. Steps stored as ADF table — no Xray required. |
-| `get_linked_test_cases` | List all issues linked via `is tested by` / `Test` link type (deduplication). |
+| `get_linked_test_cases` | List all issues linked via `is tested by` / `Test` link type (used by dedup). |
+| `create_and_link_test_cases` | Create test case issues in Jira (type `Test`, falls back to `Task`) and link. Steps stored as ADF table — no Xray required. Epic confirmation gate fires unless `confirmed=true`. |
+| `create_deduped_test_cases` | **Headless/webhook tool.** Internally calls `get_linked_test_cases`, filters duplicates (normalised summary match), then calls `create_and_link_test_cases` on net-new only. Safe on every re-trigger. |
+| `attach_gherkin_to_issue` | Upload a `.feature` file as an attachment to a Jira issue via the Jira attachment API. File is named `{issue_key}_{summary_kebab}_regression.feature`. |
 ---
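
The `create_deduped_test_cases` entries above describe filtering by a normalised title (or summary) match, and the README elsewhere calls the match case-insensitive and stop-word normalised. A minimal Python sketch of that filter follows. It is not the package's implementation: the stop-word list, `normalise_title`, and `filter_net_new` are assumptions made for illustration.

```python
import re

# Assumed stop-word list; the package's actual list is not shown in this diff.
STOP_WORDS = {"a", "an", "the", "to", "of", "and", "is", "for", "in", "on"}


def normalise_title(title: str) -> str:
    """Lower-case, strip punctuation, and drop stop words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(word for word in words if word not in STOP_WORDS)


def filter_net_new(candidates: list[str], linked: list[str]) -> list[str]:
    """Keep only candidate titles whose normalised form is not already linked."""
    seen = {normalise_title(title) for title in linked}
    net_new = []
    for title in candidates:
        key = normalise_title(title)
        if key not in seen:
            seen.add(key)  # also drops duplicates within the candidate batch
            net_new.append(title)
    return net_new
```

On a re-trigger, `linked` would come from `get_linked_test_cases` and only the returned titles would be passed on to `create_and_link_test_cases`.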
@@ -166,25 +301,84 @@ Healed selectors persist in `LocatorRepository`. All AI suggestions require huma
 ---
-## Workflow Comparison: ADO vs Jira
+## Webhook / Auto-Trigger Setup
+### Local dev
+```bash
+# 1. Install webhook extras
+pip install "qa-gentic-stlc-agents[webhook]"
+# 2. Start the bridge
+make serve
+# 3. Expose via ngrok
+ngrok http 8080
+# 4. Register hooks
+make register-ado-hooks  BRIDGE_URL=https://xxxx.ngrok.io
+make register-jira-hooks BRIDGE_URL=https://xxxx.ngrok.io
+```
+### Azure Functions deploy
+```bash
+make deploy-azure FUNC_APP=stlc-webhook-bridge
+make register-ado-hooks  BRIDGE_URL=https://stlc-webhook-bridge.azurewebsites.net/api
+make register-jira-hooks BRIDGE_URL=https://stlc-webhook-bridge.azurewebsites.net/api
+```
+### Docker
+```bash
+make docker-build-webhook
+make docker-run-webhook
+```
+### LLM provider selection
+| Provider | Env var(s) | Default model |
+|---|---|---|
+| `anthropic` | `AI_HEALING_API_KEY` or `ANTHROPIC_API_KEY` | `claude-haiku-4-5-20251001` |
+| `copilot` | `GITHUB_TOKEN` | `gpt-4o` |
+| `openai` | `OPENAI_API_KEY` | `gpt-4o-mini` |
+| `azure-openai` | `AZURE_OPENAI_API_KEY` + `AZURE_OPENAI_ENDPOINT` | `gpt-4o-mini` |
+| `ollama` | `OLLAMA_HOST` (optional) | `llama3.2` |
+Override model at any time: `LLM_MODEL=claude-sonnet-4-6`
+### Token cost (headless, Haiku default)
+| Stage | Tokens / work item | Approx cost |
+|---|---|---|
+| Test case generation (state → Approved) | ~8,000–15,000 | $0.01–$0.03 |
+| Gherkin generation (state → Done) | ~10,000–18,000 | $0.01–$0.04 |
+| Playwright TS generation (state → Done) | ~20,000–35,000 | $0.03–$0.07 |
+---
+## ADO vs Jira — Side by Side
 | Step | Azure DevOps | Jira Cloud |
 |---|---|---|
+| Trigger | Service Hook: `workitem.updated` | Webhook: `jira:issue_updated` |
 | Fetch issue | `fetch_work_item` | `fetch_jira_issue` |
-| Fetch Epic hierarchy | Not supported — warn user | `fetch_jira_epic_hierarchy` |
-| Check for duplicates | `get_linked_test_cases` | `get_linked_test_cases` |
-| Create & link test cases | `create_and_link_test_cases` | `create_and_link_test_cases` |
-| Generate Gherkin | `fetch_work_item_for_gherkin` → `attach_gherkin_to_work_item` | `qa-gherkin-generator` with Jira AC |
-| Generate Playwright | `generate_playwright_code` (shared) | `generate_playwright_code` (shared) |
+| Check duplicates | `get_linked_test_cases` | `get_linked_test_cases` |
+| Create test cases (interactive) | `create_and_link_test_cases` | `create_and_link_test_cases` |
+| Create test cases (headless) | `create_deduped_test_cases` | `create_deduped_test_cases` |
+| Gherkin attach | `attach_gherkin_to_feature` / `attach_gherkin_to_work_item` | `attach_gherkin_to_issue` |
+| Headless Gherkin composite | `generate_and_attach_gherkin` | `generate_and_attach_gherkin` + `attach_gherkin_to_issue` |
+| Playwright generation | `generate_playwright_code` (shared) | `generate_playwright_code` (shared) |
 | Write to disk | `write_helix_files` (shared) | `write_helix_files` (shared) |
-| Authentication | MSAL silent + browser (`~/.msal-cache/`) | OAuth 2.0 (3LO) + browser (`~/.jira-cache/`) |
-| Link relation | `TestedBy-Forward` | `is tested by` / `Test` link type |
+| Link relation | `TestedBy-Forward` | `is tested by` / `Test` |
+| Auth | MSAL silent + browser (`~/.msal-cache/`) | OAuth 2.0 3LO + browser (`~/.jira-cache/`) |
 ---
 ## Run Tests
 ```bash
+# Full suite
 ENABLE_SELF_HEALING=true \
 HEALING_DASHBOARD_PORT=7890 \
 APP_BASE_URL=<your-app-base-url> \
@@ -204,8 +398,7 @@ cucumber-js --config=config/cucumber.js -p <feature_profile> --tags "@smoke"
 - [ARCHITECTURE-JIRA.md](ARCHITECTURE-JIRA.md) — Full technical architecture, Jira pipeline
 - [WALKTHROUGH-ADO.md](WALKTHROUGH-ADO.md) — End-to-end walkthrough, ADO pipeline
 - [WALKTHROUGH-JIRA.md](WALKTHROUGH-JIRA.md) — End-to-end walkthrough, Jira pipeline
-- [PEER-DEV-PRESENTATION.md](PEER-DEV-PRESENTATION.md) — Developer team overview
-- [MANAGEMENT-ROI.md](MANAGEMENT-ROI.md) — ROI, quality impact, and cost analysis
+- [WEBHOOK.md](WEBHOOK.md) — Webhook bridge setup, deployment, and state trigger customisation
 ---
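Note on the token cost table in the README diff above: the three per-stage ranges can be added up to get a rough per-work-item figure for the full headless pipeline. This is a back-of-the-envelope sketch only. The table values are approximate, and the names `STAGES`, `pipeline_range`, and `batch_cost_usd` are made up for illustration.

```python
# Per-stage ranges copied from the README table. Costs are kept in whole cents
# so the sums stay exact.
STAGES = {
    "test_cases": {"tokens": (8_000, 15_000), "cents": (1, 3)},
    "gherkin": {"tokens": (10_000, 18_000), "cents": (1, 4)},
    "playwright": {"tokens": (20_000, 35_000), "cents": (3, 7)},
}


def pipeline_range(field: str) -> tuple[int, int]:
    """Sum the low and high ends of one column across all three stages."""
    low = sum(stage[field][0] for stage in STAGES.values())
    high = sum(stage[field][1] for stage in STAGES.values())
    return low, high


def batch_cost_usd(work_items: int) -> tuple[float, float]:
    """Approximate low/high cost in dollars for a batch of work items."""
    low_cents, high_cents = pipeline_range("cents")
    return work_items * low_cents / 100, work_items * high_cents / 100
```

Under these figures a single work item that passes through all three stages uses roughly 38,000 to 68,000 tokens and costs about $0.05 to $0.14 with the Haiku default.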

package/bin/postinstall.js CHANGED

@@ -52,7 +52,15 @@ const info = (s) => console.log(`${C.cyan}→${C.reset}  ${s}`);
 const warn = (s) => console.log(`${C.yellow}⚠${C.reset}  ${s}`);
 const d    = (s) => `${C.dim}${s}${C.reset}`;
-console.log(`\n${b("QA STLC Agents")} v${pkg.version} — post-install\n`);
+console.log(`
+${b("QA STLC Agents")} v${pkg.version} — post-install
+${d("This npm package includes:")}
+  • Five Python MCP servers for Azure DevOps + Jira Cloud
+  • Skill files for AI coding agents (Claude Code, Copilot, Cursor, Windsurf)
+  • ORCHESTRATION_RULES.md — reference guide for multi-step QA workflows
+  • Command-line tools: qa-stlc init, qa-stlc scaffold, qa-stlc skills, etc.
+`);
 // ── 1. Find Python ────────────────────────────────────────────────────────────
 const pythonCandidates = ["python3", "python"];

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@qa-gentic/stlc-agents",
-  "version": "1.0.17",
+  "version": "1.0.19",
   "description": "QA STLC Agents — five MCP servers + skills for AI-powered test case, Gherkin, Playwright generation, and Helix-QA file writing against Azure DevOps and Jira Cloud. Full pipeline for both: fetch → test cases → Gherkin → Playwright → Helix-QA. Works with Claude Code, GitHub Copilot, Cursor, Windsurf.",
   "keywords": [
     "playwright",
@@ -37,11 +37,24 @@
     "src/",
     "skills/",
     ".github/agents/",
-    "README.md"
+    "README.md",
+    "ORCHESTRATION_RULES.md"
   ],
   "scripts": {
     "postinstall": "node ./bin/postinstall.js"
   },
+  "_comment": "Diff to apply to package.json — add the cost command to qa-stlc.js",
+  "diff": {
+    "bin/qa-stlc.js": {
+      "add_require": "const cmdCost = require('../src/cli/cmd-cost');",
+      "add_command": {
+        "after": "// ── scaffold (or any existing last command) ──────────────────────────────",
+        "insert": "\n// ── cost ─────────────────────────────────────────────────────────────────\nprogram\n  .command('cost')\n  .description('Show token usage and cost for the current or past sessions.\\n' +\n    'Reads session logs from ~/.qa-stlc/cost-*.jsonl\\n' +\n    'Written automatically by the MCP servers on every tool call.')\n  .option('--all', 'Show all sessions (default: last session only)')\n  .option('--session <id>', 'Show a specific session by ID')\n  .option('--json', 'Output raw JSON')\n  .action(cmdCost);\n"
+      }
+    }
+  },
+  "full_command_to_add_to_qa-stlc.js": "// ── cost ─────────────────────────────────────────────────────────────────\nprogram\n  .command('cost')\n  .description(\n    'Show token usage and cost for the current or past pipeline sessions.\\n' +\n    'Reads logs from ~/.qa-stlc/cost-*.jsonl written by the MCP servers.\\n' +\n    'Each MCP tool call logs tokens, cost, and latency automatically.'\n  )\n  .option('--all', 'Show all sessions (not just the last one)')\n  .option('--session <id>', 'Show a specific session by its ID')\n  .option('--json', 'Emit raw JSON instead of a formatted table')\n  .action(cmdCost);",
   "dependencies": {
     "commander": "^12.0.0",
     "which": "^4.0.0"

package/src/cli/cmd-init.js CHANGED

@@ -1,7 +1,7 @@
 /**
  * cmd-init.js — `qa-stlc init`
  *
- * Full bootstrap: install Python agents + skills + MCP config.
+ * Full bootstrap: install Python agents + skills + ORCHESTRATION_RULES.md + MCP config.
  * Accepts --integration <ado|jira|both>.  When omitted, reads
  * ~/.qa-stlc/integration (written by postinstall) or prompts interactively.
  */
@@ -89,7 +89,23 @@ module.exports = async function init(opts) {
   }
   ok("qa-gentic-stlc-agents installed.");
-  // ── 4. Install skills ─────────────────────────────────────────────────────
+  // ── 4. Copy ORCHESTRATION_RULES.md to project root ─────────────────────────
+  info("Installing ORCHESTRATION_RULES.md to project root…");
+  try {
+    const npmPkgDir = path.join(path.dirname(require.resolve("@qa-gentic/stlc-agents/package.json")));
+    const srcRules = path.join(npmPkgDir, "ORCHESTRATION_RULES.md");
+    const destRules = path.join(process.cwd(), "ORCHESTRATION_RULES.md");
+    if (fs.existsSync(srcRules)) {
+      fs.copyFileSync(srcRules, destRules);
+      ok("ORCHESTRATION_RULES.md copied to project root.");
+    } else {
+      warn("ORCHESTRATION_RULES.md not found in npm package (expected in development).");
+    }
+  } catch (e) {
+    // Silently skip if not available (common in dev environments before full build)
+  }
+  // ── 5. Install skills ─────────────────────────────────────────────────────
   info("Installing skills…");
   const skillTarget = opts.vscode ? "vscode" : "claude";
   await cmdSkills({ target: skillTarget, integration });
@@ -132,6 +148,7 @@ ${C.bold}Setup complete.${C.reset}
   ${C.dim}Integration:${C.reset} ${C.bold}${integration}${C.reset}
   ${C.dim}MCP config :${C.reset} ${mcpLocation}
   ${C.dim}Skills     :${C.reset} ${skillsLocation}
+  ${C.dim}Rules      :${C.reset} ORCHESTRATION_RULES.md ${C.dim}(project root — reference for multi-step workflows)${C.reset}
 ${C.bold}Start Playwright MCP${C.reset} ${C.dim}(keep running in a separate terminal):${C.reset}

package/src/cli/cmd-mcp-config.js CHANGED

@@ -169,7 +169,10 @@ function buildClaudeConfig(pythonBin, playwrightPort, integration) {
     }
   }
-  servers["playwright"] = { type: "url", url: `ws://localhost:${playwrightPort}` };
+  servers["playwright"] = {
+    command: "npx",
+    args: ["@playwright/mcp@latest", "--isolated"],
+  };
   return { config: { mcpServers: servers }, missing };
 }
@@ -190,14 +193,7 @@ function buildVscodeConfig(pythonBin, playwrightPort, integration) {
         type: "stdio",
         command: bin,
         args: [],
-        env: {
-          // Azure DevOps auth passthrough (original)
-          "AZURE_TENANT_ID":     "${env:AZURE_TENANT_ID}",
-          "AZURE_CLIENT_ID":     "${env:AZURE_CLIENT_ID}",
-          "AZURE_CLIENT_SECRET": "${env:AZURE_CLIENT_SECRET}",
-          // Cost tracking passthrough (new)
-          ...COST_ENV,
-        },
+        env: { ...COST_ENV },
       };
     } else {
       missing.push(name);
@@ -236,8 +232,9 @@ function buildVscodeConfig(pythonBin, playwrightPort, integration) {
   }
   servers["playwright"] = {
-    type: "http",
-    url: `http://localhost:${playwrightPort}/mcp`,
+    type: "stdio",
+    command: "npx",
+    args: ["@playwright/mcp@latest", "--isolated"],
   };
   return { config: { servers }, missing };
@@ -246,9 +243,8 @@ function buildVscodeConfig(pythonBin, playwrightPort, integration) {
 function printNextSteps(mode, playwrightPort) {
   const isVscode = mode === "vscode";
   console.log(`
-  ${C.dim}Start Playwright MCP before running generation workflows:${C.reset}
-  npx @playwright/mcp@latest --port ${playwrightPort}
-  ${C.dim}headless (CI): npx @playwright/mcp@latest --headless --port ${playwrightPort}${C.reset}
+  ${C.dim}Playwright MCP is auto-started by the MCP framework (--isolated, no manual start needed).
+  For CI/headless: set PLAYWRIGHT_MCP_URL or start manually with --headless --isolated --port ${playwrightPort}${C.reset}
   ${isVscode
     ? `Reload VS Code window — all MCP servers will appear in the MCP panel.`
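Note on the Playwright entry change above: both builders now emit a command-launched server instead of a URL that points at a separately started process. The sketch below reproduces only the shape of the resulting `playwright` fragment, based on the `+` lines in this diff. The helper names `playwright_entry` and `config_fragment` are hypothetical, and the real builders also add the Python MCP servers.

```python
import json

# Arguments taken from the added lines in buildClaudeConfig / buildVscodeConfig.
PLAYWRIGHT_ARGS = ["@playwright/mcp@latest", "--isolated"]


def playwright_entry(mode: str) -> dict:
    """Return the playwright server entry for 'claude' or 'vscode' mode."""
    entry = {"command": "npx", "args": list(PLAYWRIGHT_ARGS)}
    if mode == "vscode":
        # The VS Code variant in the diff carries an explicit transport type.
        entry = {"type": "stdio", **entry}
    return entry


def config_fragment(mode: str) -> dict:
    """Wrap the entry under the top-level key each builder returns."""
    key = "servers" if mode == "vscode" else "mcpServers"
    return {key: {"playwright": playwright_entry(mode)}}


if __name__ == "__main__":
    print(json.dumps(config_fragment("vscode"), indent=2))
```

Since the client launches `npx` itself, the `playwrightPort` value only matters for the manual headless start mentioned in `printNextSteps`.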

package/src/cli/cmd-skills.js CHANGED

@@ -9,6 +9,8 @@
  *   windsurf  →  .windsurf/rules/
  *   both      →  claude + vscode
  *   print     →  stdout only
+ *
+ * Also installs ORCHESTRATION_RULES.md to project root (multi-step workflow reference).
  */
 "use strict";
@@ -35,10 +37,11 @@ function readIntegrationPrefSk() {
 }
 // Resolve the skills bundled with this npm package
-const PKG_ROOT    = path.resolve(__dirname, "../..");
-const SKILLS_DIR  = path.join(PKG_ROOT, "skills");
-const BEHAVIOR_MD = path.join(SKILLS_DIR, "AGENT-BEHAVIOR.md");
-const AGENTS_DIR  = path.join(PKG_ROOT, ".github", "agents");
+const PKG_ROOT           = path.resolve(__dirname, "../..");
+const SKILLS_DIR         = path.join(PKG_ROOT, "skills");
+const BEHAVIOR_MD        = path.join(SKILLS_DIR, "AGENT-BEHAVIOR.md");
+const ORCHESTRATION_MD   = path.join(PKG_ROOT, "ORCHESTRATION_RULES.md");
+const AGENTS_DIR         = path.join(PKG_ROOT, ".github", "agents");
 /** Copy a file, creating parent dirs as needed. */
 function cp(src, dest) {
@@ -105,6 +108,15 @@ function installAgents(integration) {
   agentFiles(integration).forEach((f) => info(path.basename(f)));
 }
+/** Install ORCHESTRATION_RULES.md to project root for multi-step workflow reference. */
+function installOrchestrationRules() {
+  if (!fs.existsSync(ORCHESTRATION_MD)) return;
+  const dest = path.join(CWD, "ORCHESTRATION_RULES.md");
+  cp(ORCHESTRATION_MD, dest);
+  ok(`ORCHESTRATION_RULES.md installed → project root`);
+  info("Reference this file for multi-step QA workflow best practices");
+}
 function installClaude(integration) {
   const dest = path.join(CWD, ".claude", "skills");
   // Copy entire skill directory (preserves references/ subdirectory)
@@ -118,6 +130,7 @@ function installClaude(integration) {
   info("AGENT-BEHAVIOR.md → .claude/AGENT-BEHAVIOR.md");
   skillEntries(integration).forEach((e) => info(`${e.name}/SKILL.md`));
   installAgents(integration);
+  installOrchestrationRules();
   printPlaywrightHint();
 }
@@ -134,6 +147,7 @@ function installVscode(integration) {
   info("AGENT-BEHAVIOR.md → .github/copilot-instructions/AGENT-BEHAVIOR.md");
   skillEntries(integration).forEach((e) => info(`${e.name}/SKILL.md`));
   installAgents(integration);
+  installOrchestrationRules();
   printPlaywrightHint();
 }
@@ -146,6 +160,7 @@ function installCursor(integration) {
   cp(BEHAVIOR_MD, path.join(dest, "AGENT-BEHAVIOR.md"));
   ok(`Skills installed → .cursor/rules/`);
   installAgents(integration);
+  installOrchestrationRules();
   printPlaywrightHint();
 }
@@ -158,6 +173,7 @@ function installWindsurf(integration) {
   cp(BEHAVIOR_MD, path.join(dest, "AGENT-BEHAVIOR.md"));
   ok(`Skills installed → .windsurf/rules/`);
   installAgents(integration);
+  installOrchestrationRules();
   printPlaywrightHint();
 }
@@ -165,6 +181,7 @@ function printSkills() {
   console.log("\nAvailable skills:\n");
   skillEntries().forEach((e) => console.log(`  ${e.name}/SKILL.md`));
   console.log(`  AGENT-BEHAVIOR.md`);
+  console.log(`  ORCHESTRATION_RULES.md (workflow reference)`);
 }
 /** Recursively copy a directory tree. */

package/src/stlc_agents/shared/install_hook.py ADDED

@@ -0,0 +1,154 @@
+"""
+install_hook.py  —  stlc_agents.shared.install_hook
+─────────────────────────────────────────────────────
+Called automatically by postinstall.js after `pip install qa-gentic-stlc-agents`.
+Also callable manually: python -m stlc_agents.shared.install_hook
+Applies the cost tracking patch to all 5 MCP server files by importing
+and running the same logic as scripts/apply_cost_tracking.py, but resolved
+relative to the installed package location (works in site-packages, .venv, etc).
+"""
+from __future__ import annotations
+import re
+import sys
+from pathlib import Path
+SERVERS = [
+    ("agent_gherkin_generator",     "qa-gherkin-generator"),
+    ("agent_test_case_manager",     "qa-test-case-manager"),
+    ("agent_playwright_generator",  "qa-playwright-generator"),
+    ("agent_helix_writer",          "qa-helix-writer"),
+    ("agent_jira_manager",          "qa-jira-manager"),
+]
+IMPORT_MARKER = "from stlc_agents.shared.cost_tracker import track"
+TIME_IMPORT   = "import time"
+OLD_RETURN = (
+    'return [types.TextContent(type="text", '
+    'text=json.dumps(result, indent=2, ensure_ascii=False))]'
+)
+NEW_RETURN = "return track(result, tool_name=name, server={server!r}, t0=t0)"
+OLD_ERR_BLOCK = """\
+        return [types.TextContent(
+            type="text",
+            text=json.dumps({"error": str(exc), "tool": name}, indent=2),
+        )]"""
+NEW_ERR_BLOCK = """\
+        err_result = {{"error": str(exc), "tool": name}}
+        return track(err_result, tool_name=name, server={server!r}, t0=t0)"""
+def _root() -> Path:
+    """Resolve the stlc_agents package root regardless of install method."""
+    import stlc_agents
+    return Path(stlc_agents.__file__).parent
+def patch_server(agent_dir: str, server_name: str, root: Path) -> str:
+    """Patch one server file. Returns 'patched' | 'already_patched' | 'not_found' | 'no_change'."""
+    path = root / agent_dir / "server.py"
+    if not path.exists():
+        return "not_found"
+    src = path.read_text(encoding="utf-8")
+    if IMPORT_MARKER in src:
+        return "already_patched"
+    original = src
+    # 1. import time
+    if TIME_IMPORT not in src:
+        src = src.replace("import sys\n", "import sys\nimport time\n", 1)
+    # 2. cost_tracker import — after last `from stlc_agents...` line
+    last_match = None
+    for m in re.finditer(r"^from stlc_agents\..+\n", src, re.MULTILINE):
+        last_match = m
+    if last_match:
+        pos = last_match.end()
+        src = src[:pos] + "from stlc_agents.shared.cost_tracker import track\n" + src[pos:]
+    else:
+        src = src.replace(
+            "from dotenv import load_dotenv\n",
+            "from dotenv import load_dotenv\nfrom stlc_agents.shared.cost_tracker import track\n",
+            1,
+        )
+    # 3. t0 = time.monotonic() inside call_tool()
+    src = src.replace(
+        "@app.call_tool()\nasync def call_tool(name: str, arguments: dict)"
+        " -> list[types.TextContent]:\n    try:\n",
+        "@app.call_tool()\nasync def call_tool(name: str, arguments: dict)"
+        " -> list[types.TextContent]:\n    t0 = time.monotonic()\n    try:\n",
+        1,
+    )
+    # 4. Replace all result return lines
+    new_ret = NEW_RETURN.format(server=server_name)
+    for indent in ("        ", "            ", "                "):
+        src = src.replace(f"{indent}{OLD_RETURN}", f"{indent}{new_ret}")
+    # 5. Replace error-path block
+    src = src.replace(OLD_ERR_BLOCK, NEW_ERR_BLOCK.format(server=server_name))
+    if src == original:
+        return "no_change"
+    path.write_text(src, encoding="utf-8")
+    return "patched"
+def apply_cost_tracking() -> None:
+    """Entry point — called by postinstall.js and the console_script."""
+    root = _root()
+    ok   = "\x1b[32m✓\x1b[0m"
+    skip = "\x1b[33m–\x1b[0m"
+    err  = "\x1b[31m✗\x1b[0m"
+    print("\n  stlc-agents · Activating cost tracking on MCP servers...\n")
+    any_patched = False
+    for agent_dir, server_name in SERVERS:
+        status = patch_server(agent_dir, server_name, root)
+        if status == "patched":
+            print(f"  {ok}  {agent_dir}  ({server_name})")
+            any_patched = True
+        elif status == "already_patched":
+            print(f"  {skip}  {agent_dir}  — already active")
+        elif status == "not_found":
+            print(f"  {err}  {agent_dir}/server.py  — not found (skip)")
+        elif status == "no_change":
+            print(f"  {skip}  {agent_dir}  — no matching pattern (manual patch needed)")
+    print()
+    if any_patched:
+        print("  Cost tracking is now active.  On every MCP tool call you will see:")
+        print("  [stlc-cost] <server> · <tool>  ~<N>K tokens  $<cost>  (session: $<total>)")
+        print()
+        print("  Session logs:  ~/.qa-stlc/cost-<session-id>.jsonl")
+        print("  View report:   qa-stlc cost")
+        print("  View all:      qa-stlc cost --all")
+        print()
+        print("  Environment variables:")
+        print("    STLC_COST_TRACKING=false          disable output")
+        print("    STLC_CODING_AGENT_MODEL=<model>   set your agent's model for exact pricing")
+        print("      e.g. claude-sonnet-4-6 | claude-opus-4-6 | gpt-4o")
+        print("    STLC_COST_LOG_DIR=<path>          change log directory")
+    else:
+        print("  All servers already have cost tracking active.")
+    print()
+def main() -> None:
+    apply_cost_tracking()
+if __name__ == "__main__":
+    main()
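Note on `install_hook.py`: `patch_server` is idempotent because it checks for `IMPORT_MARKER` before rewriting anything. Separately, the final `else` branch in `apply_cost_tracking` prints "All servers already have cost tracking active." even when every server reported `not_found` or `no_change`. The sketch below illustrates both points on in-memory strings. It is a simplified stand-in, not the package's code: `patch_source` and `summarize` are hypothetical names, and only the import-insertion step is modelled.

```python
from collections import Counter

MARKER = "from stlc_agents.shared.cost_tracker import track"


def patch_source(src: str) -> tuple[str, str]:
    """Return (new_source, status) without touching the filesystem."""
    if MARKER in src:
        return src, "already_patched"  # marker present: a second run is a no-op
    anchor = "import sys\n"
    if anchor not in src:
        return src, "no_change"  # nothing to hook onto; manual patch needed
    return src.replace(anchor, anchor + MARKER + "\n", 1), "patched"


def summarize(statuses: list[str]) -> str:
    """Choose a closing message that reflects what actually happened."""
    counts = Counter(statuses)
    if counts["patched"]:
        return "Cost tracking is now active."
    if counts["not_found"] or counts["no_change"]:
        return "Some servers could not be patched; see the per-server lines above."
    return "All servers already have cost tracking active."
```

Separating the string transformation from the file write, as sketched here, would also make the patch logic testable without an installed package.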