@skyramp/mcp 0.0.64-rc.12 → 0.0.64-rc.13
|
@@ -58,7 +58,7 @@ tests. The analyze tool uses PR comment history to avoid duplicates.
|
|
|
58
58
|
|
|
59
59
|
## Step 2: Decide — one action per affected test / endpoint
|
|
60
60
|
|
|
61
|
-
Using the diff, the recommendations, and the health assessment, assign
|
|
61
|
+
Using the diff, the recommendations, and the health assessment, assign one or more actions to each item:
|
|
62
62
|
|
|
63
63
|
### For each **existing Skyramp test**:
|
|
64
64
|
- **UPDATE** — the diff touches the endpoint this test covers AND adds/changes fields the test should assert (e.g. new response field, changed status code, renamed path). The test still runs but has a coverage gap or will break.
|
|
@@ -70,15 +70,26 @@ Using the diff, the recommendations, and the health assessment, assign exactly o
|
|
|
70
70
|
- **ADD** — the diff introduced this route; generate a new test.
|
|
71
71
|
- If the endpoint existed before this diff (only a model/field change touched it) — log as a coverage gap but do not generate a test.
|
|
72
72
|
|
|
73
|
-
### Decision rules
|
|
74
|
-
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
|
|
81
|
-
|
|
73
|
+
### Decision rules
|
|
74
|
+
|
|
75
|
+
Assign actions using the categories below. Rules are not mutually exclusive — a single diff may trigger UPDATE on existing files AND ADD for a new scenario simultaneously.
|
|
76
|
+
|
|
77
|
+
**UPDATE** an existing test when any of the following is true (all that apply, not just the first):
|
|
78
|
+
- The diff adds, removes, or renames a field the test asserts.
|
|
79
|
+
- The diff adds a new HTTP method on a resource path the test already covers — UPDATE **all** existing test files for that resource (contract, integration, UI). Follow the Enhance assertions guidelines when adding the new method's assertions.
|
|
80
|
+
- The diff makes an additive change (new optional fields, new query params) to a route the test covers.
|
|
81
|
+
|
|
82
|
+
**ADD** a net-new test only when:
|
|
83
|
+
- The diff introduces a brand-new route that has **no existing test coverage at all**, OR
|
|
84
|
+
- The diff introduces a new auth path, error branch, or fundamentally separate scenario that no existing test covers.
|
|
85
|
+
- Never ADD for a resource that already has existing tests just because the HTTP method is new — UPDATE those files instead.
|
|
86
|
+
- An endpoint that existed before this diff but lacks tests is a pre-existing coverage gap — log it in \`additionalRecommendations\`, do NOT add a test.
|
|
87
|
+
|
|
88
|
+
**REGENERATE** when the endpoint was substantially restructured and the test is fundamentally broken.
|
|
89
|
+
|
|
90
|
+
**DELETE** when the endpoint the test covers was removed entirely.
|
|
91
|
+
|
|
92
|
+
**Skip** if the test is unrelated to the diff.
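As a rough illustration of how the non-exclusive decision rules above can combine, the following Python sketch maps a summarized change to a set of actions. The `Change` dataclass, its field names, and `decide_actions` are hypothetical and not part of the Skyramp tooling; only the action names come from the rules.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """Hypothetical summary of how a diff affects one resource."""
    route_is_new: bool = False                # route introduced by this diff
    has_existing_tests: bool = False          # some test already covers the resource
    new_method_on_covered_path: bool = False  # new HTTP method on a covered path
    asserted_field_changed: bool = False      # field added, removed, or renamed
    additive_change: bool = False             # new optional field or query param
    new_uncovered_scenario: bool = False      # new auth path or error branch
    restructured: bool = False                # existing test is fundamentally broken
    endpoint_removed: bool = False            # endpoint deleted entirely


def decide_actions(change: Change) -> set[str]:
    """Return every action that applies; the rules are not mutually exclusive."""
    if change.endpoint_removed:
        return {"DELETE"} if change.has_existing_tests else {"SKIP"}

    actions: set[str] = set()
    if change.has_existing_tests:
        if change.restructured:
            actions.add("REGENERATE")
        elif (change.asserted_field_changed
              or change.new_method_on_covered_path
              or change.additive_change):
            # A new method on a covered resource updates existing files, never ADD.
            actions.add("UPDATE")
    elif change.route_is_new:
        actions.add("ADD")
    # A pre-existing endpoint without tests falls through: log the gap, no ADD.

    if change.new_uncovered_scenario:
        actions.add("ADD")
    return actions or {"SKIP"}
```

Under these assumptions, a diff that adds a new method to a covered resource and also introduces a new error branch yields both `UPDATE` and `ADD`.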
|
|
82
93
|
|
|
83
94
|
Output your decision table:
|
|
84
95
|
\`\`\`
|
|
@@ -99,11 +110,17 @@ Log each finding in \`issuesFound\` with a \`severity\` (critical/high/medium/lo
|
|
|
99
110
|
|
|
100
111
|
## Step 3: Act
|
|
101
112
|
|
|
102
|
-
Execute the actions from Step 2
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
|
|
113
|
+
Execute the actions from Step 2 in two independent tracks:
|
|
114
|
+
|
|
115
|
+
**Track A — UPDATE / DELETE / REGENERATE (unconditional):**
|
|
116
|
+
Execute every UPDATE, DELETE, and REGENERATE decision from Step 2 regardless of the GENERATE list or budget. These are never skipped. When a new HTTP method is added, UPDATE covers all existing test files for that resource (contract, integration, UI) — scan the actual test directory on disk to find them, do not rely solely on what the analyze tool reports.
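The on-disk scan that Track A asks for could look roughly like the sketch below. It assumes, purely for illustration, that test file names contain the resource name; the function name `find_tests_for_resource` and that naming convention are hypothetical and are not taken from the package.

```python
from pathlib import Path


def find_tests_for_resource(test_dir: str, resource: str) -> list[Path]:
    """Return files under test_dir whose name mentions the resource.

    Assumes (hypothetically) that test files embed the resource name,
    e.g. "products_contract_test.py" for the /products resource.
    """
    root = Path(test_dir)
    if not root.is_dir():
        return []
    needle = resource.strip("/").lower()
    return sorted(
        path
        for path in root.rglob("*")  # walk contract, integration, and UI folders
        if path.is_file() and needle in path.name.lower()
    )
```

A real project may need a different matching rule (for example, searching file contents for the route), so the result is best treated as a candidate list to confirm rather than a definitive answer.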
|
|
117
|
+
|
|
118
|
+
**Track B — ADD new tests (follow GENERATE list):**
|
|
119
|
+
- **MANDATORY — use the pre-ranked GENERATE list as-is**: The Execution Plan's GENERATE section governs ADD and REGENERATE actions only. You MUST generate exactly those scenarios in the exact order listed. Do NOT substitute, rename, or replace a GENERATE item. If enrichment reveals a high-value insight, add it to \`additionalRecommendations\` — never displace a GENERATE item.
|
|
120
|
+
- Scenario JSON files are always new files — always generate them for new methods. Every generated scenario JSON must have a corresponding new integration test generated from it via \`skyramp_integration_test_generation\`.
|
|
121
|
+
- Do NOT create a new contract or integration test file for a resource that already has existing tests — those are handled by Track A (UPDATE). If the GENERATE item names a scenario for a resource already covered by existing tests, convert it to a Track A UPDATE and count it toward the budget.
|
|
122
|
+
- **Example**: If the plan says "GENERATE: <METHOD> /resource/{id}" and existing contract and integration tests already cover that resource path, do NOT create new files — generate the new scenario JSON, then UPDATE all existing test files for that resource to add the new method's test function.
|
|
123
|
+
- **Example**: If the plan says "GENERATE: resource-method-add-items-recalculate" and you discover a bug during enrichment, generate the planned item and add the bug scenario to \`additionalRecommendations\`.
|
|
107
124
|
- **Total generated**: Follow the **"Budget: N generate"** line in the Execution Plan (section "## Execution Plan", "Budget: N generate + M additional = T total"). Generate exactly the GENERATE-tagged items. Do NOT generate fewer or different items.
|
|
108
125
|
- **UI test priority**: If the diff contains frontend/UI changes (e.g. \`.tsx\`, \`.jsx\`, \`.vue\`, \`.svelte\` files), you MUST attempt to generate at least one UI test. Use \`browser_navigate\` to the app's base URL — if the app responds, record a trace and generate the test. Only skip if the app is unreachable. This takes priority over generating additional backend-only tests.
|
|
109
126
|
- **Always generate a test for critical bugs, even if it will fail.** When a GENERATE-tagged item targets a page or endpoint with a known bug, do NOT skip it because you expect the test to fail — a failing test that documents a bug is more valuable than a text-only description. This applies within the existing GENERATE budget; do not add extra tests beyond the plan.
|
|
@@ -120,7 +137,9 @@ Execute the actions from Step 2.
|
|
|
120
137
|
Edit the existing test file directly:
|
|
121
138
|
- Add missing assertions for new response fields (e.g. \`assert "archived" in resp\` or \`assert resp["archived"] >= 0\`).
|
|
122
139
|
- Fix path/method changes in the test.
|
|
123
|
-
- **When adding a new method to an existing resource:** Add the new method's test cases to the existing test file that covers that resource path (e.g. adding POST to a file that already tests GET /products).
|
|
140
|
+
- **When adding a new method to an existing resource:** Add the new method's test cases to the existing test file that covers that resource path (e.g. adding POST to a file that already tests GET /products). This applies to ALL existing test files for that resource — contract, integration, UI — not just the one the GENERATE plan names.
|
|
141
|
+
- **Happy path first (CRITICAL):** When adding a new HTTP method (PUT, PATCH, POST) to an existing test file, always include the happy path (2xx success) assertion. Do NOT add only an error-path test (e.g. 404, 422) for the new method — error cases may follow, but the happy path is mandatory.
|
|
142
|
+
- **Test ordering within the file (CRITICAL):** Always place mutation tests (PATCH, PUT) BEFORE any DELETE test on the same resource. DELETE removes the resource — any PATCH/PUT after it will fail with 404. Insert new mutation test functions above the DELETE function and before the DELETE call in the \`if __name__ == "__main__"\` block (or equivalent runner entrypoint).
|
|
124
143
|
- Do not regenerate — only apply the minimal change needed.
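The happy-path and ordering rules above can be sketched with a small self-contained file. The in-memory `PRODUCTS` store and the `patch_product` / `delete_product` helpers are hypothetical stand-ins for the service under test, not Skyramp APIs; a generated test would call the real endpoint instead.

```python
# In-memory stand-in for the service under test (hypothetical, illustration only).
PRODUCTS = {1: {"id": 1, "name": "widget", "archived": False}}


def patch_product(product_id, fields):
    """Return (status, body), mimicking PATCH /products/{id}."""
    if product_id not in PRODUCTS:
        return 404, {}
    PRODUCTS[product_id].update(fields)
    return 200, PRODUCTS[product_id]


def delete_product(product_id):
    """Return a status code, mimicking DELETE /products/{id}."""
    return 204 if PRODUCTS.pop(product_id, None) is not None else 404


def test_patch_product():
    # Happy path first: the new method must assert its 2xx response.
    status, body = patch_product(1, {"archived": True})
    assert status == 200
    assert body["archived"] is True


def test_delete_product():
    assert delete_product(1) == 204


if __name__ == "__main__":
    # Mutation runs before DELETE; the reverse order would make PATCH return 404.
    test_patch_product()
    test_delete_product()
```

Swapping the two calls in the entrypoint makes `test_patch_product` fail with a 404, which is the failure mode the ordering rule is meant to prevent.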
|
|
125
144
|
|
|
126
145
|
### REGENERATE
|
|
@@ -128,8 +147,7 @@ Call the appropriate generation tool to replace the existing test from scratch.
|
|
|
128
147
|
Use the same filename so it overwrites the old file.
|
|
129
148
|
|
|
130
149
|
### ADD
|
|
131
|
-
Generate a net-new test. Use a unique descriptive filename
|
|
132
|
-
**Exception:** If the diff adds a new HTTP method to a resource path already covered by existing tests, UPDATE all existing test files for that resource instead (see rule 2 above). Do not let "distinct workflow" or "different test type" reasoning override this — new methods on covered resources always update existing files.
|
|
150
|
+
Generate a net-new test only for resources with no existing test coverage. Use a unique descriptive filename. Do NOT create a new contract or integration test file for a resource that already has existing tests — use UPDATE instead (Track A).
|
|
133
151
|
|
|
134
152
|
**Auth — determine ONCE, apply to EVERY tool call:**
|
|
135
153
|
1. Start from the Execution Plan returned by \`skyramp_analyze_changes\` — it includes pre-resolved auth params.
|
|
@@ -196,20 +214,12 @@ Do NOT use \`page.waitForTimeout()\` with fixed delays — these are flaky in CI
|
|
|
196
214
|
|
|
197
215
|
**After generation, you MUST do exactly these steps — nothing more, nothing less:**
|
|
198
216
|
1. **Fix chaining**: replace hardcoded IDs with dynamic response values — path params like \`id = 'id'\` → \`skyramp.get_response_value(prev_response, "id")\`, and hardcoded IDs in request bodies → dynamic values from prior responses.
|
|
199
|
-
2. **Enhance assertions
|
|
200
|
-
-
|
|
201
|
-
- **
|
|
202
|
-
- **
|
|
203
|
-
|
|
204
|
-
|
|
205
|
-
(e.g. \`get_response_value(products2_POST_response, "product_id")\`).
|
|
206
|
-
2. If the field is a hardcoded literal in the request body with no \`data_override\`,
|
|
207
|
-
assert against that literal directly (e.g. \`== 2\`).
|
|
208
|
-
- **Cross-step computation assertions (MANDATORY)**: when a response contains a computed numeric field (e.g. \`total_amount\`, \`discount_amount\`, \`subtotal\`), trace its inputs back through ALL prior responses in the test chain — including product/item creation steps. If \`total_amount = price × quantity\` and \`price\` is available from an earlier product response, assert \`total_amount == get_response_value(product_response, "price") * quantity\` in addition to asserting only against other order-level fields.
|
|
209
|
-
- **Array index assertions must match the recorded response**: when asserting array fields (e.g. \`items.0\`, \`items.1\`), look at the scenario's recorded response body to determine the actual array length. Only assert indices that exist in the recorded response — never infer array size from the scenario name, request body, or endpoint semantics. If the scenario response shows \`items\` with 1 element, assert only \`items.0.*\`; do NOT assert \`items.1.*\`.
|
|
210
|
-
- **Each step that reads a resource**: assert chained values from prior responses and re-assert any computed/derived fields — do NOT reduce read-step assertions to null-checks only.
|
|
211
|
-
- No step is exempt because it is a "setup" or "teardown" step — treat every step as a first-class assertion target.
|
|
212
|
-
- **Assertion parity between contract and integration tests**: for every response field that is assertable from the inline request body or the response itself (non-null, echo-back, value ranges, computed/derived), the same assertion MUST appear in both the contract test and the integration test. A bug that breaks a response field must be detectable by either test type independently — if one test would catch it and the other would not, add the missing assertion to the weaker test.
|
|
217
|
+
2. **Enhance assertions** (integration and contract tests — MANDATORY, no step exempt):
|
|
218
|
+
- **Every step**: assert non-null IDs, echo-back values, and value ranges. Echo-back must use the value sent in the *current* step's request — use the \`data_override\` source if set, otherwise the hardcoded literal.
|
|
219
|
+
- **Create/modify steps**: also assert computed/derived fields (e.g. \`total_amount\`, \`discount_amount\`) by tracing inputs through all prior responses. In integration tests, use dynamic chained values (e.g. \`get_response_value(product_response, "price") * quantity\`); in contract tests, use the hardcoded literals from the request body and expected response (e.g. \`assert total_amount == 19.99 * 2\`).
|
|
220
|
+
- **Read steps**: re-assert chained and computed fields — do not reduce to null-checks only.
|
|
221
|
+
- **Array fields**: only assert indices that exist in the recorded response body — never infer array length from the request or scenario name.
|
|
222
|
+
- **Parity**: every assertion derivable from the request body or response (non-null, echo-back, value ranges, computed) must appear in both the contract test and the integration test independently.
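A minimal sketch of the assertion bullets above, using plain dictionaries in place of recorded responses. All values are made up for illustration, and dictionary access stands in for `skyramp.get_response_value`. Comparing with `math.isclose` instead of `==` is a choice made in this sketch, not something the prompt prescribes; it avoids brittle exact comparisons on floating-point prices.

```python
import math

# Hypothetical recorded responses; field names follow the examples in the text.
product_response = {"product_id": 7, "price": 19.99}
order_request = {"product_id": 7, "quantity": 2}
order_response = {
    "order_id": 42,
    "total_amount": 39.98,
    "items": [{"product_id": 7, "quantity": 2}],
}

# Non-null ID and echo-back of the values sent in this step's request.
assert order_response["order_id"] is not None
assert order_response["items"][0]["quantity"] == order_request["quantity"]

# Computed field traced back to an earlier response (integration style).
expected_total = product_response["price"] * order_request["quantity"]
assert math.isclose(order_response["total_amount"], expected_total, rel_tol=1e-9)

# Array fields: assert only the indices present in the recorded response.
recorded_len = len(order_response["items"])
for index in range(recorded_len):
    item = order_response["items"][index]
    assert item["product_id"] == product_response["product_id"]
```

For parity, a contract test would carry the same checks with the literals from its own request body in place of the chained `product_response` values.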
|
|
213
223
|
3. **Enhance UI test assertions**: for UI tests, refer back to your business logic analysis from Step 2 (code review) and the \`issuesFound\` you logged. Add assertions that catch real user-facing bugs:
|
|
214
224
|
- **Page renders after navigation**: after clicking a button that navigates (e.g. "Edit Order"), assert that the target page loaded its expected heading or key element. A blank page or missing heading means a rendering crash.
|
|
215
225
|
- **No duplicate items**: after editing or deleting items in a list (e.g. order items, cart products), assert the expected item count. Duplicate entries indicate an accumulation bug.
|