npm - @qa-gentic/stlc-agents - Versions diffs - 1.0.16 → 1.0.18 - Mend

Files changed (45)

package/ORCHESTRATION_RULES.md ADDED

@@ -0,0 +1,283 @@
+# Orchestration Rules — Multi-Step Pipeline Agents
+> **Universal rules file for coding agents (Claude, Copilot, Cursor, Windsurf, Gemini, etc.)**
+> Place this file in your project root or `.ai/` folder. Reference it in your prompt with:
+> `"Apply all rules from ORCHESTRATION_RULES.md before executing any step."`
+---
+## 1. Core Principle
+Every intermediate output is an **input contract** for the next step — not a done state.
+A step is only complete when its output has been validated against its spec and confirmed
+fit for downstream consumption. Never proceed on a "good enough" assumption.
+---
+## 2. Mandatory Behaviours
+### 2.1 Explicit Task Breakdown
+- Before executing, decompose the full task into named steps using a todo/task list tool
+  (e.g. `manage_todo_list`, GitHub Copilot Tasks, Cursor task panel).
+- Each step must have a declared **input**, **action**, and **output spec**.
+- Mark steps as `[ ] pending`, `[~] in-progress`, `[x] done`, `[!] blocked` — update in real time.
+- Never treat a step as in-progress and done simultaneously.
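The task-list rules in section 2.1 can be sketched in Python as follows. This is a minimal illustration, not part of the package: the `Step` and `Status` names and the sample steps are hypothetical. Holding a single `status` value per step makes it impossible for a step to be in-progress and done at once.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "[ ]"
    IN_PROGRESS = "[~]"
    DONE = "[x]"
    BLOCKED = "[!]"


@dataclass
class Step:
    name: str
    inputs: str       # named artefact or raw input the step consumes
    action: str       # precise description of what the step does
    output_spec: str  # contract the output must satisfy
    status: Status = Status.PENDING  # exactly one marker at a time


def render(steps: list[Step]) -> str:
    """Render the task list using the markers from section 2.1."""
    return "\n".join(f"{s.status.value} {s.name}" for s in steps)


# Hypothetical two-step pipeline, updated as work progresses.
steps = [
    Step("Capture screens", "app URL", "Extract fields per screen", "context_map (JSON)"),
    Step("Generate test cases", "context_map", "Derive one case per scenario", "test_case_list"),
]
steps[0].status = Status.DONE
steps[1].status = Status.IN_PROGRESS
print(render(steps))
```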
+### 2.2 No Skipping Intermediate Steps
+- If a step produces data that the next step consumes, that data must be:
+  1. **Extracted** from raw output (not left embedded)
+  2. **Structured** into the agreed schema
+  3. **Validated** against the checkpoint gate
+  4. **Explicitly handed off** as a named artefact
+- Do NOT jump ahead assuming the downstream step can infer missing data.
+- Do NOT proceed if a required input is absent or malformed.
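The four handoff stages in section 2.2 (extract, structure, validate, hand off) might look like this in Python. It is a sketch under stated assumptions: the raw output is assumed to embed a JSON array, and the `REQUIRED_FIELDS` schema and all function names are hypothetical.

```python
import json

REQUIRED_FIELDS = {"screen", "selector"}  # hypothetical schema for illustration


def extract(raw: str) -> str:
    """Pull the JSON array out of raw tool output (assumed to sit between the first '[' and last ']')."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("BLOCKED: no JSON payload found in raw output")
    return raw[start : end + 1]


def structure(payload: str) -> list[dict]:
    """Parse the extracted payload into the agreed shape (a list of objects)."""
    return json.loads(payload)


def validate(items: list[dict]) -> list[dict]:
    """Checkpoint gate: every item must carry all required fields."""
    for i, item in enumerate(items):
        missing = REQUIRED_FIELDS - item.keys()
        if missing:
            raise ValueError(f"BLOCKED: item {i} is missing {sorted(missing)}")
    return items


def hand_off(name: str, items: list[dict]) -> dict:
    """Wrap the validated data as a named artefact."""
    return {"artefact": name, "items": items}


raw_output = 'Snapshot done. [{"screen": "login", "selector": "#username"}] End.'
context_map = hand_off("context_map", validate(structure(extract(raw_output))))
```

Each stage raises instead of guessing, so a malformed or missing input stops the pipeline before anything is passed downstream.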
+### 2.3 Checkpoint Gates Are Blocking — Pre-Flight Required
+A gate is not a post-generation reflection. Its pre-flight checks run **before** any output is produced.
+**Before generating output for any step, you MUST:**
+1. Output the pre-flight checklist with your intended answers filled in.
+2. Only if all items are YES — proceed to generate the output.
+3. If any item is NO — stop, state what is missing, and wait for the user.
+This is not optional. Generating output first and checking after is a rule violation.
+Pre-flight format (required before every step):
+```
+PRE-FLIGHT: Step [N] — [Step Name]
+  [ ] Input artefact "[name]" received from Step [N-1]?
+  [ ] Input matches expected schema?
+  [ ] [Step-specific countable check, e.g. "11 scenarios in scenario_inventory?"]
+  [ ] [Any tool or selector availability check]
+→ PROCEED  /  → BLOCKED: [state what is missing]
+```
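As an illustration, the pre-flight format above could be produced and enforced by a small helper. This is a hedged Python sketch: the `preflight` function, the sample checks, and the 11-item `scenario_inventory` are assumptions made for the example, not part of the package.

```python
def preflight(header: str, checks: dict[str, bool]) -> tuple[bool, str]:
    """Render the checklist and decide PROCEED or BLOCKED before any output is generated."""
    lines = [f"PRE-FLIGHT: {header}"]
    for question, ok in checks.items():
        lines.append(f"  [{'x' if ok else ' '}] {question}")
    failed = [question for question, ok in checks.items() if not ok]
    if failed:
        lines.append("→ BLOCKED: " + "; ".join(failed))
    else:
        lines.append("→ PROCEED")
    return (not failed, "\n".join(lines))


# Hypothetical inputs for a code generation step.
scenario_inventory = [f"scenario_{i}" for i in range(1, 12)]
ok, report = preflight(
    "Step 3 - Generate step definitions",
    {
        'Input artefact "scenario_inventory" received from Step 2?': bool(scenario_inventory),
        "11 scenarios in scenario_inventory?": len(scenario_inventory) == 11,
        "Selectors available in context_map?": False,  # simulated missing input
    },
)
print(report)
```

Generation would only start when `ok` is true; here the missing selectors block the step.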
+### 2.4 Data Handoff Must Be Explicit
+- Output data in full — not summarised, not truncated.
+- Use a consistent schema (JSON, YAML, or named list) — do not change shape between steps.
+- Name the artefact (e.g. `context_map`, `test_case_list`, `scenario_inventory`).
+- The receiving step must reference the artefact by name, not re-derive it.
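One way to make handoffs explicit is a registry keyed by artefact name. The following Python sketch assumes an in-memory store; `ArtefactStore` and its methods are hypothetical names for illustration.

```python
import copy


class ArtefactStore:
    """Hypothetical in-memory registry for named artefacts passed between steps."""

    def __init__(self) -> None:
        self._artefacts: dict[str, object] = {}

    def hand_off(self, name: str, data: object) -> None:
        """Store the full artefact under its name; refuse a second, possibly divergent copy."""
        if name in self._artefacts:
            raise KeyError(f"artefact {name!r} already handed off; do not re-derive it")
        self._artefacts[name] = copy.deepcopy(data)

    def receive(self, name: str) -> object:
        """Fetch an artefact by name; a missing artefact blocks the step."""
        if name not in self._artefacts:
            raise KeyError(f"BLOCKED: required artefact {name!r} is missing")
        return copy.deepcopy(self._artefacts[name])


store = ArtefactStore()
store.hand_off("scenario_inventory", ["login succeeds", "login fails"])
received = store.receive("scenario_inventory")
```

Deep copies keep a downstream step from silently changing the shape of an artefact that other steps still reference.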
+### 2.5 No Placeholder or Stub Outputs
+This rule applies at generation time, not reflection time. The agent must not produce
+stubs and then acknowledge the violation — it must prevent them before generating.
+- Never produce output containing `TODO`, `placeholder`, `// implement later`,
+  `throw new Error('pending')`, or any empty method/step body.
+- Before generating a file, state the **expected item count** (e.g. number of step
+  definitions, number of test cases). The generated file must match that count exactly.
+- If a step cannot produce a complete output, declare it `[!] blocked` and stop.
+- Partial outputs passed downstream cause compounding failures and wasted tokens.
+**Countable verification pattern (required for code generation steps):**
+```
+Expected: [N] step definitions (from scenario_inventory)
+Generating: [N] step definitions
+Verify after: count implemented bodies — must equal [N], zero empty
+```
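The countable verification pattern could be automated roughly as below. This Python sketch relies on regex heuristics rather than a real parser, and it assumes single-line Cucumber-style registrations such as `Given('...', async () => { ... });`. The function name and the sample source are hypothetical.

```python
import re

# Markers listed in section 2.5.
STUB_MARKERS = ("TODO", "placeholder", "// implement later", "throw new Error('pending')")
# Heuristics, assuming one step registration per line with an arrow-function body.
EMPTY_BODY = re.compile(r"=>\s*\{\s*\}")
STEP_DEF = re.compile(r"^\s*(Given|When|Then)\(", re.MULTILINE)


def verify_generated(source: str, expected: int) -> list[str]:
    """Return a list of problems; an empty list means the post-generation check passes."""
    problems = []
    actual = len(STEP_DEF.findall(source))
    if actual != expected:
        problems.append(f"count mismatch: expected {expected}, found {actual}")
    for marker in STUB_MARKERS:
        if marker in source:
            problems.append(f"stub marker found: {marker}")
    if EMPTY_BODY.search(source):
        problems.append("empty step body found")
    return problems


generated = """
Given('the user is on the login page', async () => { await page.goto('/login'); });
When('the user submits valid credentials', async () => { });
"""
problems = verify_generated(generated, expected=3)
print(problems)
```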
+### 2.6 Query-Driven Data Capture (Snapshot / Scraping Steps)
+- Navigation is NOT the deliverable — **structured data extraction** is.
+- For every screen or page visited, immediately extract all required fields before moving on.
+- Do not defer extraction to a later step.
+- Capture only what downstream steps need (defined by the step's output spec).
+- Validate coverage: every field required by downstream must be present in the captured data.
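Coverage validation for captured data can be as simple as a set comparison per screen. The Python sketch below is illustrative only: the field names and the `missing_fields` helper are assumptions.

```python
def missing_fields(captured: dict[str, dict], required: set[str]) -> dict[str, list[str]]:
    """Return, per screen, the required fields that are absent or empty."""
    gaps = {}
    for screen, fields in captured.items():
        missing = sorted(name for name in required if not fields.get(name))
        if missing:
            gaps[screen] = missing
    return gaps


# Hypothetical output spec and capture result.
required = {"url", "submit_selector"}
captured = {
    "login": {"url": "/login", "submit_selector": "#submit"},
    "profile": {"url": "/profile"},  # extraction was skipped for one field
}
gaps = missing_fields(captured, required)
print(gaps)
```

A non-empty result means the capture step is not complete and should not hand off its artefact.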
+### 2.7 Split Generation Steps to Prevent Silent Stubs
+For any step that generates code or structured output consumed by a subsequent step,
+split it into two sub-steps:
+- **[N]a — Signatures only:** generate method/step signatures (names, parameters) with no bodies.
+  Output as an inventory list. This makes the expected count explicit and visible.
+- **[N]b — Implement each signature:** implement every item from the [N]a inventory.
+  No body may be left empty. Reference `context_map` or equivalent for all selectors/data.
+This forces a visible count before implementation begins, so any stub or missing item shows up as a count mismatch instead of passing silently.
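A minimal Python sketch of the [N]a / [N]b split follows. The signature names and the bodies (kept as plain text here) are hypothetical; the point is only that the inventory from [N]a gives a fixed list to check [N]b against.

```python
# [N]a: signatures only, output as an inventory list.
signatures = [
    "open_login_page",
    "submit_credentials",
    "see_dashboard",
]

# [N]b: one body per signature (bodies shown as text for illustration).
implementations = {
    "open_login_page": "await page.goto('/login');",
    "submit_credentials": "await page.click('#submit');",
}


def unimplemented(names: list[str], bodies: dict[str, str]) -> list[str]:
    """List every signature whose body is missing or blank."""
    return [name for name in names if not bodies.get(name, "").strip()]


missing = unimplemented(signatures, implementations)
print(f"Expected: {len(signatures)}, implemented: {len(signatures) - len(missing)}, missing: {missing}")
```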
+### 2.8 Token Efficiency
+- Avoid re-deriving data already produced in a prior step.
+- Reference prior artefacts by name; do not re-fetch or re-generate unless a gate failed.
+- If rework is needed, state which gate failed, what was missing, and what the corrected output is.
+---
+## 3. Error Handling
+| Situation | Required Action |
+|---|---|
+| Gate fails | STOP. Report failed items. Wait for user input or resolution. |
+| Required input missing | STOP. Name the missing input. Do not guess. |
+| Tool call returns empty | STOP. Report. Do not silently continue. |
+| Partial output produced | Mark step `[!] blocked`. Do not pass partial output downstream. |
+| Schema mismatch | STOP. Show expected vs actual schema. Do not transform silently. |
+| Ambiguous instruction | Ask one clarifying question before proceeding. Do not assume. |
+| Stub/TODO found in output | STOP. Do not accept the output. Regenerate from signatures. |
+| Count mismatch (generated vs expected) | STOP. List which items are missing. Do not proceed. |
+---
+## 4. Checkpoint Gate Template
+Run the pre-flight items **before** generating output, not after. Run the post-generation items before handing off.
+```
+CHECKPOINT GATE [N] — [Step Name]
+---------------------------------------
+PRE-FLIGHT (run before generating):
+  [ ] Input artefact received and named
+  [ ] Input matches expected schema
+  [ ] Expected output count stated: [N items]
+  [ ] All required tools/selectors available
+POST-GENERATION (run before handing off):
+  [ ] Actual output count matches expected: [N of N]
+  [ ] No stubs, TODOs, or empty bodies in output
+  [ ] All items from upstream list are accounted for
+  [ ] Output is in agreed schema / format
+RESULT:  [ ] PASS — hand off artefact to Step [N+1]
+         [ ] FAIL — stop and report: [list what failed]
+```
+---
+## 5. Step Definition Template
+```
+STEP [N] — [Name]
+---------------------------------------
+Tool / Agent:   [name of tool, MCP server, or agent]
+Input (required):
+  - [Named artefact from Step N-1]
+  - [Any other required input]
+Pre-flight check:
+  - State expected output count before generating
+  - Confirm all inputs available and schema-valid
+Action:
+  [Precise description — not vague verbs like "process" or "handle"]
+  If code generation: split into [N]a (signatures) and [N]b (implementations)
+Output spec (the contract):
+  - Artefact name: [e.g. context_map]
+  - Format: [JSON | YAML | list | file | etc.]
+  - Required fields: [enumerate them]
+  - Coverage requirement: [e.g. one implementation per Gherkin step]
+Checkpoint Gate:  → run Gate [N] template above
+```
+---
+## 6. Anti-Patterns (Never Do These)
+| Anti-Pattern | Why It Fails | Correct Behaviour |
+|---|---|---|
+| Generating output then checking the gate | Gate runs after stubs already exist — violation acknowledged but not prevented | Run pre-flight checklist before generating |
+| Treating gate as a reflection step | Agent notices violation after the fact; output is already committed | Gate is a pre-condition, not a review |
+| Skipping data extraction after capture | Downstream step receives raw/unstructured input and must infer | Extract and structure data immediately after capture |
+| Jumping to generation without verified inputs | Output based on inference, not facts — stubs and errors result | Validate inputs at gate before calling the generator |
+| Treating "good enough" output as done | Errors compound; rework costs more tokens than doing it right | Validate against spec before marking a step complete |
+| Producing stubs with TODO | Downstream steps receive incomplete contracts and silently fail | Block the step; declare it incomplete; stop |
+| Re-deriving upstream data in a downstream step | Wasted tokens; divergence risk if re-derivation differs | Reference the named artefact from the prior step |
+| Proceeding past a failed gate | Snowballing failures requiring full rework | Stop at the gate; surface the gap; wait for resolution |
+| Single atomic generation step for code | No visible count before generation — stubs go undetected | Split into signatures ([N]a) then implementations ([N]b) |
+---
+## 7. Orchestration Health Checks
+Run at the start of any multi-step task:
+- [ ] Are all steps named and sequenced in the task list?
+- [ ] Does each step have a declared input and output spec?
+- [ ] Does each step have a defined pre-flight and post-generation gate?
+- [ ] Are code generation steps split into signatures + implementations?
+- [ ] Are all required tools / MCP servers available?
+- [ ] Are named artefacts from prior steps available as inputs?
+If any health check fails, resolve it before execution begins.
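These health checks could be run mechanically over a pipeline definition before the first step starts. The Python sketch below assumes each step is described by a plain dictionary; the key names and the `health_check` helper are hypothetical.

```python
REQUIRED_KEYS = ("name", "input", "output_spec", "preflight_gate", "post_generation_gate")


def health_check(steps: list[dict], available_tools: set[str]) -> list[str]:
    """Return every failed health check; an empty list means execution may begin."""
    failures = []
    for number, step in enumerate(steps, start=1):
        for key in REQUIRED_KEYS:
            if not step.get(key):
                failures.append(f"step {number}: missing {key}")
        tool = step.get("tool")
        if tool and tool not in available_tools:
            failures.append(f"step {number}: tool {tool!r} unavailable")
    return failures


# Hypothetical pipeline with one incomplete step.
pipeline = [
    {"name": "Capture", "input": "app URL", "output_spec": "context_map",
     "preflight_gate": True, "post_generation_gate": True, "tool": "browser"},
    {"name": "Generate", "input": "context_map", "output_spec": "",
     "preflight_gate": True, "post_generation_gate": True, "tool": "codegen"},
]
failures = health_check(pipeline, available_tools={"browser"})
print(failures)
```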
+---
+## 8. Agent-Specific Integration Notes
+### Claude (claude.ai / API)
+- Reference this file in your system prompt or project instructions.
+- Use a todo/task list tool (e.g. `manage_todo_list`, if it is available in your environment) for step tracking.
+- Attach this file as a project document so it persists across sessions.
+- Pre-flight checklists tend to work reliably because Claude usually outputs its reasoning before making tool calls.
+### GitHub Copilot (VS Code / JetBrains)
+- Add to `.github/copilot-instructions.md` or reference in your workspace prompt.
+- **Critical:** Copilot treats rules as advisory context — gates are not enforced at runtime.
+  Mitigate by scoping rules to file types (e.g. `When generating *.steps.ts files, you must...`).
+- Always include the pre-flight checklist directly in your chat message for the current step,
+  not just in the rules file. Copilot applies in-message instructions more reliably than
+  file-level rules for generation constraints.
+- Use the countable verification pattern (section 2.5) explicitly in each chat prompt:
+  "There are 11 scenarios. Generate exactly 11 step definitions. State the count before writing."
+- For step definition files: add a file-type-scoped rule to `.github/copilot-instructions.md`:
+  ```
+  When generating Playwright step definition files (*.steps.ts):
+  1. Count Given/When/Then steps in the linked .feature file and state the count first.
+  2. Every step body must contain real implementation — no TODO, no throw pending, no empty bodies.
+  3. If a selector is missing from context_map, name the missing step and stop. Do not stub it.
+  ```
+### Cursor
+- Place in `.cursor/rules/` as `orchestration.mdc` and mark it as always applied (`alwaysApply: true` in the rule's frontmatter).
+- Or add to `.cursorrules` in the project root.
+- Cursor tends to apply project-level rules more consistently than Copilot for generation steps.
+- Use `@file` references in chat to explicitly pull the rules into context per step.
+### Windsurf (Codeium)
+- Place in `.windsurf/rules.md` or reference in the global rules panel.
+- Windsurf's Cascade agent picks up project-level markdown rules automatically.
+### Gemini CLI / Vertex AI Agent Builder
+- Reference via system instruction or as a grounding document.
+- Use the Step Definition Template when constructing task configs.
+---
+## 9. Quick Reference Card
+```
+BEFORE EACH STEP:
+  1. Output the pre-flight checklist — fill in all items.
+  2. If all YES → state expected output count → generate.
+  3. If any NO → stop and report.
+AFTER EACH STEP:
+  1. Run post-generation gate — count actual vs expected.
+  2. If PASS → hand off named artefact to next step.
+  3. If FAIL → stop and report; regenerate only once the gap is resolved.
+FOR CODE GENERATION:
+  1. Generate signatures/names only first ([N]a).
+  2. State the count from [N]a.
+  3. Implement every item ([N]b) — zero empty bodies allowed.
+NEVER:
+  - Generate output then check the gate.
+  - Proceed past a failed gate.
+  - Pass unstructured or partial data downstream.
+  - Produce stubs and acknowledge them — prevent them.
+```
+---
+*Version 1.1 — Updated to add pre-flight gate enforcement, countable verification,
+split generation step pattern, and Copilot-specific stub prevention guidance.
+Root cause addressed: gates were post-generation reflections, not pre-generation blockers.*