@skyramp/mcp 0.0.64-rc.12 → 0.0.64-rc.13

Files changed (2)

package/build/prompts/testbot/testbot-prompts.js +42 -32
package/package.json +1 -1

package/build/prompts/testbot/testbot-prompts.js CHANGED

@@ -58,7 +58,7 @@ tests. The analyze tool uses PR comment history to avoid duplicates.
 ## Step 2: Decide — one action per affected test / endpoint
-Using the diff, the recommendations, and the health assessment, assign exactly one action to each item:
+Using the diff, the recommendations, and the health assessment, assign one or more actions to each item:
 ### For each **existing Skyramp test**:
 - **UPDATE** — the diff touches the endpoint this test covers AND adds/changes fields the test should assert (e.g. new response field, changed status code, renamed path). The test still runs but has a coverage gap or will break.
@@ -70,15 +70,26 @@ Using the diff, the recommendations, and the health assessment, assign exactly o
 - **ADD** — the diff introduced this route; generate a new test.
 - If the endpoint existed before this diff (only a model/field change touched it) — log as a coverage gap but do not generate a test.
-### Decision rules (apply in order, stop at first match):
-1. If the diff adds/removes/renames a field in a response this test asserts → **UPDATE** (not ADD).
-2. If the diff adds a **brand-new HTTP method on a resource path already covered by existing tests** → **UPDATE all existing test files for that resource** (integration, contract, UI, or any other type) to add coverage for the new method. Do NOT create a new file for the new method unless no existing test covers that resource path at all.
-3. If the diff adds a **brand-new route definition** for a resource path with **no existing test coverage** → **ADD** a new file.
-4. If the diff makes an **additive, non-breaking change** to an existing route (e.g. new optional query params, new optional request fields, new optional response fields) AND an existing test already covers that route → **UPDATE** that test to assert the new behavior. Do NOT create a new file.
-5. If an existing test covers the endpoint but the new behavior introduces a **new auth path, a new error/edge-case branch, or a fundamentally separate scenario** (not just a new HTTP method) → **ADD** alongside the existing test.
-6. If the test is unrelated to the diff → skip it entirely (no action, no report entry).
-7. Only use **ADD** for endpoints whose route was introduced in this diff AND no existing tests cover that resource path. An endpoint that existed before but now lacks a test is a pre-existing coverage gap — log it in \`additionalRecommendations\`, do NOT generate a test for it.
-8. Do NOT add a new test when an UPDATE to an existing test is the right fix.
+### Decision rules
+Assign actions using the categories below. Rules are not mutually exclusive — a single diff may trigger UPDATE on existing files AND ADD for a new scenario simultaneously.
+**UPDATE** an existing test when any of the following is true (all that apply, not just the first):
+- The diff adds, removes, or renames a field the test asserts.
+- The diff adds a new HTTP method on a resource path the test already covers — UPDATE **all** existing test files for that resource (contract, integration, UI). Follow the Enhance assertions guidelines when adding the new method's assertions.
+- The diff makes an additive change (new optional fields, new query params) to a route the test covers.
+**ADD** a net-new test only when:
+- The diff introduces a brand-new route that has **no existing test coverage at all**, OR
+- The diff introduces a new auth path, error branch, or fundamentally separate scenario that no existing test covers.
+- Never ADD for a resource that already has existing tests just because the HTTP method is new — UPDATE those files instead.
+- An endpoint that existed before this diff but lacks tests is a pre-existing coverage gap — log it in \`additionalRecommendations\`, do NOT add a test.
+**REGENERATE** when the endpoint was substantially restructured and the test is fundamentally broken.
+**DELETE** when the endpoint the test covers was removed entirely.
+**Skip** if the test is unrelated to the diff.
 Output your decision table:
 \`\`\`
@@ -99,11 +110,17 @@ Log each finding in \`issuesFound\` with a \`severity\` (critical/high/medium/lo
 ## Step 3: Act
-Execute the actions from Step 2.
-- **MANDATORY — use the pre-ranked GENERATE list as-is**: The Execution Plan's GENERATE section contains pre-scored, deterministically ranked test scenarios. You MUST generate exactly those scenarios in the exact order listed. Do NOT substitute, rename, or replace a GENERATE item with a different scenario you invented during source-code enrichment. If enrichment reveals a high-value insight, add it to \`additionalRecommendations\` — never displace a GENERATE item.
-  - **Exception — Rule 2 override**: Before executing any GENERATE item, cross-check its endpoint path against the Existing Tests list. If the resource path is already covered by an existing test file (regardless of HTTP method), do NOT create a new file — instead apply Rule 2: UPDATE all existing test files for that resource to add coverage for the new method. Count the UPDATE toward the generation budget. If the GENERATE item was the only budget item for that resource, spend the slot on the UPDATE.
-  - **Example**: If the plan says "#1 GENERATE: contract | new_endpoint | PATCH /orders/{id}" and \`orders_contract_test.py\` already exists for GET/DELETE \`/orders/{id}\`, do NOT create a new file — UPDATE \`orders_contract_test.py\` to add a PATCH test function instead.
-  - **Example**: If the plan says "#1 GENERATE: orders-patch-add-items-recalculate" and you discover a discount-calculation bug during enrichment, generate \`orders-patch-add-items-recalculate\` as #1 and add \`integration-order-discount-calculation\` to additionalRecommendations with priority=high.
+Execute the actions from Step 2 in two independent tracks:
+**Track A — UPDATE / DELETE / REGENERATE (unconditional):**
+Execute every UPDATE, DELETE, and REGENERATE decision from Step 2 regardless of the GENERATE list or budget. These are never skipped. When a new HTTP method is added, UPDATE covers all existing test files for that resource (contract, integration, UI) — scan the actual test directory on disk to find them, do not rely solely on what the analyze tool reports.
+**Track B — ADD new tests (follow GENERATE list):**
+- **MANDATORY — use the pre-ranked GENERATE list as-is**: The Execution Plan's GENERATE section governs ADD and REGENERATE actions only. You MUST generate exactly those scenarios in the exact order listed. Do NOT substitute, rename, or replace a GENERATE item. If enrichment reveals a high-value insight, add it to \`additionalRecommendations\` — never displace a GENERATE item.
+- Scenario JSON files are always new files — always generate them for new methods. Every generated scenario JSON must have a corresponding new integration test generated from it via \`skyramp_integration_test_generation\`.
+- Do NOT create a new contract or integration test file for a resource that already has existing tests — those are handled by Track A (UPDATE). If the GENERATE item names a scenario for a resource already covered by existing tests, convert it to a Track A UPDATE and count it toward the budget.
+- **Example**: If the plan says "GENERATE: <METHOD> /resource/{id}" and existing contract and integration tests already cover that resource path, do NOT create new files — generate the new scenario JSON, then UPDATE all existing test files for that resource to add the new method's test function.
+- **Example**: If the plan says "GENERATE: resource-method-add-items-recalculate" and you discover a bug during enrichment, generate the planned item and add the bug scenario to \`additionalRecommendations\`.
 - **Total generated**: Follow the **"Budget: N generate"** line in the Execution Plan (section "## Execution Plan", "Budget: N generate + M additional = T total"). Generate exactly the GENERATE-tagged items. Do NOT generate fewer or different items.
 - **UI test priority**: If the diff contains frontend/UI changes (e.g. \`.tsx\`, \`.jsx\`, \`.vue\`, \`.svelte\` files), you MUST attempt to generate at least one UI test. Use \`browser_navigate\` to the app's base URL — if the app responds, record a trace and generate the test. Only skip if the app is unreachable. This takes priority over generating additional backend-only tests.
 - **Always generate a test for critical bugs, even if it will fail.** When a GENERATE-tagged item targets a page or endpoint with a known bug, do NOT skip it because you expect the test to fail — a failing test that documents a bug is more valuable than a text-only description. This applies within the existing GENERATE budget; do not add extra tests beyond the plan.
@@ -120,7 +137,9 @@ Execute the actions from Step 2.
 Edit the existing test file directly:
 - Add missing assertions for new response fields (e.g. \`assert "archived" in resp\` or \`assert resp["archived"] >= 0\`).
 - Fix path/method changes in the test.
-- **When adding a new method to an existing resource:** Add the new method's test cases to the existing test file that covers that resource path (e.g. adding POST to a file that already tests GET /products).
+- **When adding a new method to an existing resource:** Add the new method's test cases to the existing test file that covers that resource path (e.g. adding POST to a file that already tests GET /products). This applies to ALL existing test files for that resource — contract, integration, UI — not just the one the GENERATE plan names.
+- **Happy path first (CRITICAL):** When adding a new HTTP method (PUT, PATCH, POST) to an existing test file, always include the happy path (2xx success) assertion. Do NOT add only an error-path test (e.g. 404, 422) for the new method — error cases may follow, but the happy path is mandatory.
+- **Test ordering within the file (CRITICAL):** Always place mutation tests (PATCH, PUT) BEFORE any DELETE test on the same resource. DELETE removes the resource — any PATCH/PUT after it will fail with 404. Insert new mutation test functions above the DELETE function and before the DELETE call in the \`if __name__ == "__main__"\` block (or equivalent runner entrypoint).
 - Do not regenerate — only apply the minimal change needed.
 ### REGENERATE
@@ -128,8 +147,7 @@ Call the appropriate generation tool to replace the existing test from scratch.
 Use the same filename so it overwrites the old file.
 ### ADD
-Generate a net-new test. Use a unique descriptive filename to avoid overwriting existing files.
-**Exception:** If the diff adds a new HTTP method to a resource path already covered by existing tests, UPDATE all existing test files for that resource instead (see rule 2 above). Do not let "distinct workflow" or "different test type" reasoning override this — new methods on covered resources always update existing files.
+Generate a net-new test only for resources with no existing test coverage. Use a unique descriptive filename. Do NOT create a new contract or integration test file for a resource that already has existing tests — use UPDATE instead (Track A).
 **Auth — determine ONCE, apply to EVERY tool call:**
 1. Start from the Execution Plan returned by \`skyramp_analyze_changes\` — it includes pre-resolved auth params.
@@ -196,20 +214,12 @@ Do NOT use \`page.waitForTimeout()\` with fixed delays — these are flaky in CI
 **After generation, you MUST do exactly these steps — nothing more, nothing less:**
 1. **Fix chaining**: replace hardcoded IDs with dynamic response values — path params like \`id = 'id'\` → \`skyramp.get_response_value(prev_response, "id")\`, and hardcoded IDs in request bodies → dynamic values from prior responses.
-2. **Enhance assertions**: for integration tests and contract provider tests, follow the assertion enhancement instructions returned in the tool output. This step is MANDATORY — do NOT skip it even if chaining is already correct.
-   - Apply assertions to **every** request call in the test — every step regardless of its position (first, middle, or last) or HTTP method.
-   - **Each step that creates or modifies a resource**: assert non-null IDs, echo-back values, value ranges, and any computed/derived fields whose inputs are available from earlier responses or the current request body.
-   - **Echo-back assertions must reflect what was sent in the current request**: the right-hand side of an echo-back assertion must match the value actually submitted in that step's request — not a value from an earlier step that may have been superseded.
-     Determine the sent value in priority order:
-     1. If the field is set via \`data_override\`, use that override source
-        (e.g. \`get_response_value(products2_POST_response, "product_id")\`).
-     2. If the field is a hardcoded literal in the request body with no \`data_override\`,
-        assert against that literal directly (e.g. \`== 2\`).
-   - **Cross-step computation assertions (MANDATORY)**: when a response contains a computed numeric field (e.g. \`total_amount\`, \`discount_amount\`, \`subtotal\`), trace its inputs back through ALL prior responses in the test chain — including product/item creation steps. If \`total_amount = price × quantity\` and \`price\` is available from an earlier product response, assert \`total_amount == get_response_value(product_response, "price") * quantity\` in addition to asserting only against other order-level fields.
-   - **Array index assertions must match the recorded response**: when asserting array fields (e.g. \`items.0\`, \`items.1\`), look at the scenario's recorded response body to determine the actual array length. Only assert indices that exist in the recorded response — never infer array size from the scenario name, request body, or endpoint semantics. If the scenario response shows \`items\` with 1 element, assert only \`items.0.*\`; do NOT assert \`items.1.*\`.
-   - **Each step that reads a resource**: assert chained values from prior responses and re-assert any computed/derived fields — do NOT reduce read-step assertions to null-checks only.
-   - No step is exempt because it is a "setup" or "teardown" step — treat every step as a first-class assertion target.
-   - **Assertion parity between contract and integration tests**: for every response field that is assertable from the inline request body or the response itself (non-null, echo-back, value ranges, computed/derived), the same assertion MUST appear in both the contract test and the integration test. A bug that breaks a response field must be detectable by either test type independently — if one test would catch it and the other would not, add the missing assertion to the weaker test.
+2. **Enhance assertions** (integration and contract tests — MANDATORY, no step exempt):
+   - **Every step**: assert non-null IDs, echo-back values, and value ranges. Echo-back must use the value sent in the *current* step's request — use the \`data_override\` source if set, otherwise the hardcoded literal.
+   - **Create/modify steps**: also assert computed/derived fields (e.g. \`total_amount\`, \`discount_amount\`) by tracing inputs through all prior responses. In integration tests, use dynamic chained values (e.g. \`get_response_value(product_response, "price") * quantity\`); in contract tests, use the hardcoded literals from the request body and expected response (e.g. \`assert total_amount == 19.99 * 2\`).
+   - **Read steps**: re-assert chained and computed fields — do not reduce to null-checks only.
+   - **Array fields**: only assert indices that exist in the recorded response body — never infer array length from the request or scenario name.
+   - **Parity**: every assertion derivable from the request body or response (non-null, echo-back, value ranges, computed) must appear in both the contract test and the integration test independently.
 3. **Enhance UI test assertions**: for UI tests, refer back to your business logic analysis from Step 2 (code review) and the \`issuesFound\` you logged. Add assertions that catch real user-facing bugs:
    - **Page renders after navigation**: after clicking a button that navigates (e.g. "Edit Order"), assert that the target page loaded its expected heading or key element. A blank page or missing heading means a rendering crash.
    - **No duplicate items**: after editing or deleting items in a list (e.g. order items, cart products), assert the expected item count. Duplicate entries indicate an accumulation bug.
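
Note on the Step 2 hunk above: the main behavioral change is that the decision rules no longer stop at the first match, so one diff can yield both UPDATE and ADD. The sketch below illustrates that reading in Python. It is not code from the package: `DiffFacts`, `decide_actions`, the `LOG_GAP` label, and the choice to let DELETE and REGENERATE take precedence over the other actions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiffFacts:
    """Hypothetical summary of how a diff relates to one resource path."""
    resource_has_tests: bool = False
    route_is_new: bool = False                # route definition introduced by this diff
    asserted_field_changed: bool = False      # field added, removed, or renamed
    new_method_on_covered_resource: bool = False
    additive_change: bool = False             # new optional fields or query params
    new_separate_scenario: bool = False       # new auth path or error branch, uncovered
    substantially_restructured: bool = False
    endpoint_removed: bool = False


def decide_actions(f: DiffFacts) -> set[str]:
    """Return every action that applies, not just the first match."""
    if f.resource_has_tests:
        # Assumption: removal or a restructure supersedes incremental edits.
        if f.endpoint_removed:
            return {"DELETE"}
        if f.substantially_restructured:
            return {"REGENERATE"}

    changed = (
        f.asserted_field_changed
        or f.new_method_on_covered_resource
        or f.additive_change
    )
    actions: set[str] = set()

    # UPDATE: any matching condition on a resource that already has tests.
    if f.resource_has_tests and changed:
        actions.add("UPDATE")

    # ADD: a brand-new route with no coverage, or a separate uncovered scenario.
    if (f.route_is_new and not f.resource_has_tests) or f.new_separate_scenario:
        actions.add("ADD")

    # Pre-existing endpoint without tests: log the gap, never generate a test.
    if not actions and changed and not f.resource_has_tests and not f.route_is_new:
        actions.add("LOG_GAP")

    return actions or {"SKIP"}
```

Under these assumptions, a new HTTP method on a covered resource combined with a new error branch returns both `UPDATE` and `ADD`, which the removed "stop at first match" wording would not have allowed.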

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@skyramp/mcp",
-  "version": "0.0.64-rc.12",
+  "version": "0.0.64-rc.13",
   "main": "build/index.js",
   "exports": {
     ".": "./build/index.js",