@skyramp/mcp 0.1.3 → 0.1.4
- package/build/prompts/test-maintenance/driftAnalysisSections.js +2 -2
- package/build/prompts/test-recommendation/analysisOutputPrompt.js +2 -2
- package/build/prompts/test-recommendation/recommendationSections.js +42 -10
- package/build/prompts/test-recommendation/registerRecommendTestsPrompt.js +2 -5
- package/build/prompts/test-recommendation/test-recommendation-prompt.js +67 -152
- package/build/prompts/test-recommendation/test-recommendation-prompt.test.js +111 -18
- package/build/prompts/testbot/testbot-prompts.js +17 -9
- package/build/services/ScenarioGenerationService.js +2 -1
- package/build/tools/generate-tests/generateBatchScenarioRestTool.js +3 -4
- package/build/tools/generate-tests/generateBatchScenarioRestTool.test.js +9 -0
- package/build/tools/submitReportTool.js +4 -3
- package/build/tools/submitReportTool.test.js +16 -2
- package/build/tools/test-management/analyzeChangesTool.js +10 -5
- package/build/types/TestRecommendation.js +2 -0
- package/build/utils/featureFlags.js +25 -0
- package/build/utils/httpDefaults.js +12 -0
- package/build/utils/scenarioDrafting.js +116 -505
- package/build/utils/scenarioDrafting.test.js +260 -480
- package/package.json +1 -1
package/build/prompts/test-maintenance/driftAnalysisSections.js
@@ -143,8 +143,8 @@ When a diff adds a new HTTP method to a resource, UPDATE covers **all** existing
|
|
|
143
143
|
|
|
144
144
|
### PATCH/PUT with child collections (MANDATORY)
|
|
145
145
|
When updating a contract or integration test for a PATCH or PUT endpoint whose request/response includes a child collection array (e.g. \`items\`, \`products\`, \`line_items\`):
|
|
146
|
-
1. The request body MUST include the child array with at least one item containing the
|
|
147
|
-
2. Assert each item's
|
|
146
|
+
1. The request body MUST include the child array with at least one item containing the Foreign Key field (e.g. \`product_id\`) and a \`quantity\` field.
|
|
147
|
+
2. Assert each item's Foreign Key field and \`quantity\` match the sent values.
|
|
148
148
|
3. Assert the top-level computed total (e.g. \`total_amount\`) equals the expected math from the items.
|
|
149
149
|
A test that only sends/asserts metadata (discount, status, notes) without asserting the items array is INCOMPLETE and will produce false passes even when the items/total logic is broken.
|
|
150
150
|
|
|
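The three numbered requirements in this hunk can be sketched as a small assertion helper. This is a minimal illustration, not the code Skyramp generates: the helper name `assertChildCollection`, the `items` array name, the `unit_price` field, and the total formula (sum of price times quantity, with no discounts or taxes) are all assumptions borrowed from the prompt's own examples.

```javascript
// Hypothetical helper: compares a PATCH/PUT response against the child items that were sent.
// Field names (items, product_id, quantity, unit_price, total_amount) are illustrative assumptions.
function assertChildCollection(sentItems, response, fkField = "product_id") {
  const problems = [];
  const items = Array.isArray(response.items) ? response.items : [];

  // Requirement 1: the child array must actually be present with the sent items.
  if (items.length !== sentItems.length) {
    problems.push(`expected ${sentItems.length} items, got ${items.length}`);
  }

  // Requirement 2: each item's Foreign Key field and quantity match the sent values.
  sentItems.forEach((sent, i) => {
    const got = items[i] ?? {};
    if (got[fkField] !== sent[fkField]) problems.push(`items[${i}].${fkField} mismatch`);
    if (got.quantity !== sent.quantity) problems.push(`items[${i}].quantity mismatch`);
  });

  // Requirement 3: the computed total equals the expected math from the items.
  // Money is compared in integer cents to sidestep floating-point noise.
  const expectedCents = sentItems.reduce(
    (sum, it) => sum + Math.round(it.unit_price * 100) * it.quantity,
    0
  );
  const actualCents = Math.round(response.total_amount * 100);
  if (actualCents !== expectedCents) {
    problems.push(`total_amount mismatch: expected ${expectedCents} cents, got ${actualCents}`);
  }
  return problems;
}
```

A response that only echoes metadata (for example `{ status: "ok", total_amount: 0 }`) fails every check here, which is the false-pass case the prompt warns about.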
package/build/prompts/test-recommendation/analysisOutputPrompt.js
@@ -52,10 +52,10 @@ The ranked test recommendation catalog is pre-built and shown below (after the s
|
|
|
52
52
|
**Your only job is to present it.**
|
|
53
53
|
|
|
54
54
|
1. Fill in every \`<…from source>\` placeholder using the field names, computed formulas, and auth details you found in Steps 1–2.
|
|
55
|
-
2. Output the completed catalog **exactly as formatted
|
|
55
|
+
2. Output the completed catalog **exactly as formatted**, preserving whatever test-type section headings are already present in the catalog. Do NOT restructure, reorder, rename sections, invent missing sections, or generate a new format.
|
|
56
56
|
3. Do NOT call any Skyramp generation tools. The catalog shows ready-to-use tool calls that can be executed on demand.
|
|
57
57
|
|
|
58
|
-
**If** Steps 1–2 revealed additional scenarios the catalog does not cover (e.g. a computed formula or
|
|
58
|
+
**If** Steps 1–2 revealed additional scenarios the catalog does not cover (e.g. a computed formula or Foreign Key relationship that was missed), you may optionally call \`skyramp_recommend_tests\` with \`stateFile: "${p.stateFile ?? p.sessionId}"\` and \`enrichedScenarios\` to regenerate a more complete catalog — but only after presenting the current one.`;
|
|
59
59
|
const hasJavaFiles = p.candidateRouteFiles?.some(f => /\.(java|kt)$/.test(f)) ?? false;
|
|
60
60
|
const routeFilesSection = p.candidateRouteFiles && p.candidateRouteFiles.length > 0
|
|
61
61
|
? `\nRoute/controller files found by static scan (read these to discover endpoints — the regex-based catalog below may be incomplete for your framework):\n${p.candidateRouteFiles.map(f => `- ${f}`).join("\n")}\n`
|
|
package/build/prompts/test-recommendation/recommendationSections.js
@@ -1,4 +1,4 @@
|
|
|
1
|
-
import { isContractConsumerModeEnabled } from "../../utils/featureFlags.js";
|
|
1
|
+
import { isContractConsumerModeEnabled, resolveServiceDetailsRef } from "../../utils/featureFlags.js";
|
|
2
2
|
import { WorkspaceAuthType, getAuthScheme, isAuthorizationHeaderName, AUTH_MIDDLEWARE_PATTERNS_STR } from "../../utils/workspaceAuth.js";
|
|
3
3
|
// Cached at module-load — the flag is process-wide and cannot change per call.
|
|
4
4
|
const CONSUMER_MODE_ENABLED = isContractConsumerModeEnabled();
|
|
@@ -42,13 +42,45 @@ Before calling any tool, replace every \`<from source>\` placeholder in the tool
|
|
|
42
42
|
}
|
|
43
43
|
export function buildReasoningProtocol() {
|
|
44
44
|
return `<reasoning_protocol>
|
|
45
|
+
## Coverage Reasoning Block (MANDATORY — complete BEFORE your Budget Plan)
|
|
46
|
+
|
|
47
|
+
Before committing to a Budget Plan and test list, produce a <thinking> block that enumerates ALL testable surfaces introduced or affected by this PR. This prevents narrow focus on a single endpoint/method.
|
|
48
|
+
|
|
49
|
+
**For backend-only PRs**, your thinking MUST cover:
|
|
50
|
+
1. **All HTTP methods affected** — if a new validation/service method is added, trace ALL callers (not just createOne — also updateOne, updateMany, deleteOne). List every HTTP method × endpoint pair.
|
|
51
|
+
2. **Error paths per method** — for each endpoint-method, what error codes does the source code return? (400, 401, 403, 404, 409, 422). Each distinct error path is a potential test.
|
|
52
|
+
3. **Cross-service impact** — does the change affect other services that import the modified module? Those endpoints need coverage too.
|
|
53
|
+
4. **Data migrations** — if a migration exists, can its effect be verified via an API call? (e.g. backfill → GET should return the backfilled value)
|
|
54
|
+
|
|
55
|
+
**For frontend-only PRs**, your thinking MUST cover:
|
|
56
|
+
1. **Component integration** — which routes render the changed component? Each route is a test target.
|
|
57
|
+
2. **User interactions** — what actions can a user perform on the changed component? (click, type, select, drag). Each distinct action flow is a test.
|
|
58
|
+
3. **State variations** — what different states does the component render? (empty, loading, error, populated, edge values)
|
|
59
|
+
|
|
60
|
+
**For mixed (frontend + backend) PRs**, your thinking MUST cover:
|
|
61
|
+
1. All backend surfaces (methods 1–4 above)
|
|
62
|
+
2. All frontend surfaces (methods 1–3 above)
|
|
63
|
+
3. **E2E bridges** — which frontend components call the changed backend endpoints? Those are E2E test candidates that cover both layers in one test.
|
|
64
|
+
|
|
65
|
+
**Output format in your thinking block:**
|
|
66
|
+
\`\`\`
|
|
67
|
+
Testable surfaces:
|
|
68
|
+
- POST /permissions → happy path (201), invalid fields (422), missing collection (400)
|
|
69
|
+
- PATCH /permissions/:id → update with valid fields (200), update with invalid fields (422)
|
|
70
|
+
- GET /items/:collection?aggregate → with allowed fields (200), with forbidden fields (403)
|
|
71
|
+
- UI: permissions field selector → add field, remove field, wildcard toggle
|
|
72
|
+
Total distinct surfaces: N
|
|
73
|
+
\`\`\`
|
|
74
|
+
|
|
75
|
+
Your Budget Plan total MUST be ≥ the number of GENERATE slots and reflect the breadth of surfaces found. If you found 8 distinct surfaces but only budget 3 tests, you are under-covering the PR.
|
|
76
|
+
|
|
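The output format requested by the Coverage Reasoning Block can be mimicked with a few lines of code. The sketch below is only an illustration of the bookkeeping: `summarizeSurfaces` is a hypothetical name, and counting each listed outcome as one distinct surface is one possible reading of the prompt, not something the package defines.

```javascript
// Hypothetical bookkeeping for the "Testable surfaces" block.
// Assumption: every listed outcome (happy path, error code, UI action) counts as one surface.
function summarizeSurfaces(surfaces, budgetedTests) {
  const lines = surfaces.map((s) => `- ${s.target} \u2192 ${s.cases.join(", ")}`);
  const total = surfaces.reduce((n, s) => n + s.cases.length, 0);
  return {
    text: ["Testable surfaces:", ...lines, `Total distinct surfaces: ${total}`].join("\n"),
    total,
    // Flags the situation the prompt describes: many surfaces found, few tests budgeted.
    budgetBelowSurfaces: budgetedTests < total,
  };
}
```

Feeding it the four example lines from the prompt with a budget of 3 tests sets `budgetBelowSurfaces`, matching the "8 surfaces but only 3 tests" warning.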
45
77
|
## Parameter Grounding Rule
|
|
46
78
|
Before each GENERATE tool call, confirm WHERE each key value comes from:
|
|
47
79
|
|
|
48
80
|
- **requestBody / responseBody fields** → source code schema (Zod, Pydantic, DTO), enriched scenario, or OpenAPI spec. **The generation tool rejects empty \`{}\` request bodies for POST/PUT/PATCH** — read the source schema first if the fields are unknown.
|
|
49
81
|
- **endpointURL** → workspace \`baseUrl\` + endpoint path (both required — never path alone)
|
|
50
82
|
- **authHeader / authScheme** → workspace config or OpenAPI \`securitySchemes\`
|
|
51
|
-
- **
|
|
83
|
+
- **Foreign Key path params** → chained from a prior step's response (check the actual field name — it may be \`id\`, \`uuid\`, \`_id\`, or a resource-specific \`*_id\` field). The chaining source can be a response body (POST or GET), a response header (e.g. \`Location\`), or a cookie — not hardcoded
|
|
52
84
|
- **Names / string values** → realistic; append timestamp suffix to avoid re-run conflicts
|
|
53
85
|
|
|
54
86
|
## Ranking Rule
|
|
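The Foreign Key grounding rule in this hunk (chain the path param from a prior step instead of hardcoding it) can be sketched as follows. Both helpers are hypothetical and are not part of the package: `resolveChainedId` tries the field names the prompt lists (`id`, `uuid`, `_id`, a resource-specific `*_id`) and then falls back to a `Location` header, and `fillPath` assumes `{param}`-style path templates. Cookie-based chaining is left out for brevity.

```javascript
// Hypothetical resolver: picks an identifier out of a prior step's response.
function resolveChainedId(prior, resource) {
  const body = prior.body ?? {};
  const candidates = ["id", "uuid", "_id", `${resource}_id`];
  for (const key of candidates) {
    if (body[key] !== undefined && body[key] !== null) return String(body[key]);
  }
  // Fall back to the last path segment of a Location header, if one was returned.
  const location = prior.headers?.location ?? prior.headers?.Location;
  if (location) {
    const segments = location.split("/").filter(Boolean);
    return segments[segments.length - 1];
  }
  throw new Error(`no identifier found for ${resource}`);
}

// Hypothetical path filler: substitutes {name} placeholders with chained values.
function fillPath(template, params) {
  return template.replace(/\{(\w+)\}/g, (_, name) => {
    if (!(name in params)) throw new Error(`missing path param ${name}`);
    return encodeURIComponent(params[name]);
  });
}
```

Throwing when no identifier is found is a deliberate choice in this sketch: a silent fallback to a literal ID would reintroduce the hardcoding the rule forbids.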
@@ -110,11 +142,11 @@ export function buildTestPatternGuidelines() {
|
|
|
110
142
|
- **Middleware chains**: If auth/rate-limit/logging middleware exists, test the chain (e.g., rate limit hit → auth still checked → correct error returned)
|
|
111
143
|
- **N+1 query risk**: If list endpoints join related data (e.g., orders with products), test with large datasets
|
|
112
144
|
- **State machines**: If resources have status transitions (draft→published→archived), test invalid transitions (e.g., archived→draft should fail)
|
|
113
|
-
- **Cascade deletes**: Only recommend after reading source code to confirm which resource holds the
|
|
145
|
+
- **Cascade deletes**: Only recommend after reading source code to confirm which resource holds the Foreign Key. The resource with the Foreign Key is the child; the one it points to is the parent. Example: if orders.product_id references products, then products is the parent — deleting a product tests whether orders are protected or cascade-deleted. Getting this backwards (treating the child as the parent) produces a nonsensical test.
|
|
114
146
|
- **Race conditions**: If concurrent writes are possible (inventory deduction, counter increment), test concurrent requests
|
|
115
147
|
- **Computed fields**: If response contains derived values (total, average, count), verify computation with known inputs (e.g., total_cost = compute_seconds * rate + memory_mb * rate + external_cost)
|
|
116
148
|
- **Mutation with collection modification**: If PUT/PATCH endpoints accept arrays of child items (e.g., order line items, cart products, invoice entries), test adding/removing items and verify that derived totals (e.g., total_amount, subtotal, item_count) are recalculated correctly. This is the most common source of user-reported bugs — always prioritize it for GENERATE over simple field-update tests.
|
|
117
|
-
The PATCH/PUT request body should include the child collection array field(s) defined for that endpoint (e.g., "items" with
|
|
149
|
+
The PATCH/PUT request body should include the child collection array field(s) defined for that endpoint (e.g., "items" with Foreign Key references like "product_id" and a quantity field) chained from prior POST responses. A PATCH that only sends metadata fields (e.g., discount_type, status, notes) without modifying the child collection is NOT a valid mutation-recalc test — it will pass even when the item/total logic is broken. Before writing assertions, inspect the source code or OpenAPI spec to identify (1) the actual child collection field name and its Foreign Key/quantity/price sub-fields, and (2) how derived totals are calculated (including any discounts, taxes, or fees). Then assert: the child Foreign Key fields match chained IDs, quantities match sent values, and totals match the computation from the source code
|
|
118
150
|
- **Webhook/event side effects**: If endpoints trigger async operations, test that side effects occur (e.g., POST /orders triggers notification)
|
|
119
151
|
- **Cross-user isolation**: If resources are owned by users, test that user B cannot access/modify user A's resources (GET /users/{other_id}/data → 403 Forbidden)
|
|
120
152
|
- **Range/boundary invariants**: If business rules cap values (max retries, min balance, discount ≤ subtotal), test the boundary (e.g., set retries to max+1 → expect rejection)
|
|
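The parent/child rule in the Cascade deletes guideline is easy to get backwards, so a tiny sketch may help. The descriptor shape (`table`, `column`, `references`) and the function name `cascadeDeletePlan` are assumptions made for illustration; the package does not expose anything like this.

```javascript
// Hypothetical descriptor, e.g. { table: "orders", column: "product_id", references: "products" }.
// The table holding the Foreign Key is the child; the table it points to is the parent.
function cascadeDeletePlan(fk) {
  return {
    child: fk.table,
    parent: fk.references,
    // Delete the parent, then check what happened to the child rows.
    deleteTarget: fk.references,
    verifyTarget: fk.table,
  };
}
```

For `orders.product_id` referencing `products`, the plan deletes a product and then verifies whether the dependent orders were protected or cascade-deleted, as in the prompt's example.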
@@ -128,7 +160,7 @@ that step B depends on (e.g., create product → create order referencing that p
|
|
|
128
160
|
verify order contains correct product). Single-resource CRUD alone is not an integration test.
|
|
129
161
|
Use actual field names and values from the source code schema or OpenAPI schema (not \`{}\` or invented field names); verify response data, not just status codes.
|
|
130
162
|
When a PUT/PATCH updates a resource with child collections (e.g., order items), the request body
|
|
131
|
-
MUST include the child array with
|
|
163
|
+
MUST include the child array with Foreign Key references chained from prior steps — and assertions MUST
|
|
132
164
|
verify the actual child items in the response (product_id, quantity, unit_price), not just
|
|
133
165
|
top-level metadata like discount or status.
|
|
134
166
|
|
|
@@ -182,7 +214,7 @@ Before finalizing your output, verify:
|
|
|
182
214
|
6. **Real request shapes**: requestBody for POST/PUT/PATCH uses actual field names from source (not \`{}\`). GET search/filter uses \`queryParams\`, not \`requestBody\`.
|
|
183
215
|
7. **scenarioFile**: \`skyramp_integration_test_generation\` uses the exact \`filePath\` returned by \`skyramp_batch_scenario_test_generation\` — not a guessed or hardcoded filename.
|
|
184
216
|
8. **bugCatchingTarget**: Every GENERATE integration test that targets a business rule, formula, or constraint has a non-empty \`bugCatchingTarget\`.
|
|
185
|
-
9. **
|
|
217
|
+
9. **Foreign Key chaining**: In multi-step integration tests, path params sourced from a prior step's response (e.g. \`order_id\` from step 1) use \`chainsFrom\` — not hardcoded IDs.
|
|
186
218
|
10. **Concrete scenario names**: No GENERATE item uses a placeholder name ending in a numeric suffix (e.g. \`ui-test-for-changed-component-1\`, \`ui-test-from-trace-2\`). Derive the name from the actual changed component or flow: if the diff touches \`LinkCard.tsx\`, the scenario name should be \`link-card-pin-toggle\` or \`link-card-edit-description\`, not \`ui-test-for-changed-component-1\`. The changed file list is available above — use it.
|
|
187
219
|
</verification>`;
|
|
188
220
|
}
|
|
@@ -193,7 +225,7 @@ export function buildFewShotExamples() {
|
|
|
193
225
|
**Parameter grounding**:
|
|
194
226
|
- baseURL: "http://localhost:8000" (workspace api.baseUrl)
|
|
195
227
|
- steps[0].requestBody fields "name", "price": ProductCreate schema fields (src/models/product.py)
|
|
196
|
-
- steps[1].requestBody "product_id":
|
|
228
|
+
- steps[1].requestBody "product_id": Foreign Key to products — chained from step 0 response id
|
|
197
229
|
- steps[1].requestBody "quantity": OrderCreate schema field (src/models/order.py)
|
|
198
230
|
- responseBody "total_amount": 89.97 = 29.99 × 3 — from order total formula (src/services/order_service.py: total = sum(item.price * item.quantity))
|
|
199
231
|
- authHeader/authScheme: workspace config (Authorization / Bearer)
|
|
@@ -311,7 +343,7 @@ ${authGuidance}
|
|
|
311
343
|
**For multi-endpoint workflows (integration tests) — Batch Scenario → Integration pipeline:**
|
|
312
344
|
1. Call \`skyramp_batch_scenario_test_generation\` with ALL steps in a single call: \`scenarioName\`, \`destination\`,
|
|
313
345
|
\`baseURL\`, \`${authCallParams}\`, and a \`steps\` array where each element has \`method\`, \`path\`, \`requestBody\` OR \`queryParams\`, \`responseBody\`, \`statusCode\`.
|
|
314
|
-
\`statusCode\` is
|
|
346
|
+
\`statusCode\` is required — determine the expected status code from the source code for each step.
|
|
315
347
|
**OpenAPI spec is NOT required.** \`apiSchema\` is OPTIONAL — omit it if no spec exists.
|
|
316
348
|
**CRITICAL — Query params vs request body:**
|
|
317
349
|
- For **POST/PUT/PATCH**: use \`requestBody\` with realistic field values from source code schemas.
|
|
@@ -351,12 +383,12 @@ ${CONSUMER_MODE_ENABLED ? `**Contract test mode selection — set based on this
|
|
|
351
383
|
Only provider-side contract tests are supported. Pass \`providerMode: true\` for new or modified endpoints this codebase owns.`}
|
|
352
384
|
|
|
353
385
|
**For UI tests:**
|
|
354
|
-
1. \`browser_navigate\` to the target URL (from
|
|
386
|
+
1. \`browser_navigate\` to the target URL (from ${resolveServiceDetailsRef().baseUrlRef})
|
|
355
387
|
2. \`browser_snapshot\` to see the page (ARIA tree)
|
|
356
388
|
3. Interact using \`browser_click\`, \`browser_type\`, \`browser_fill_form\`, etc.
|
|
357
389
|
4. \`browser_snapshot\` after each interaction that changes the page
|
|
358
390
|
5. \`skyramp_export_zip\` with an **absolute** output path: \`<repositoryPath>/.skyramp/<test_name>_trace.zip\`
|
|
359
|
-
6. \`skyramp_ui_test_generation\` with \`playwrightInput\` = the **absolute** path of the exported zip, and \`outputDir\` =
|
|
391
|
+
6. \`skyramp_ui_test_generation\` with \`playwrightInput\` = the **absolute** path of the exported zip, and \`outputDir\` = ${resolveServiceDetailsRef().frontendTestDirRef} (e.g. \`frontend/tests\`). Do NOT use the backend service's testDirectory — UI tests must go in the frontend service's test directory.
|
|
360
392
|
|
|
361
393
|
Tips: For custom dropdowns (Radix, MUI): click combobox → snapshot → click option (NOT \`browser_select_option\`).
|
|
362
394
|
|
|
package/build/prompts/test-recommendation/registerRecommendTestsPrompt.js
@@ -4,6 +4,7 @@ import { logger } from "../../utils/logger.js";
|
|
|
4
4
|
import { buildRecommendationPrompt } from "./test-recommendation-prompt.js";
|
|
5
5
|
import { ScenarioSource, AnalysisScope } from "../../types/RepositoryAnalysis.js";
|
|
6
6
|
import { SCENARIO_CATEGORIES } from "../../types/TestRecommendation.js";
|
|
7
|
+
import { inferExpectedStatus } from "../../utils/httpDefaults.js";
|
|
7
8
|
export function mergeEnrichedScenarios(serverScenarios, raw) {
|
|
8
9
|
const rejectionNotes = [];
|
|
9
10
|
let parsed;
|
|
@@ -54,11 +55,7 @@ export function mergeEnrichedScenarios(serverScenarios, raw) {
|
|
|
54
55
|
requestBody: st.requestBody,
|
|
55
56
|
queryParams: st.queryParams,
|
|
56
57
|
responseBody: st.responseBody,
|
|
57
|
-
|
|
58
|
-
expectedStatusCode: st.expectedStatusCode ??
|
|
59
|
-
(String(st.method ?? "").toUpperCase() === "POST" ? 201
|
|
60
|
-
: String(st.method ?? "").toUpperCase() === "DELETE" ? 204
|
|
61
|
-
: 200),
|
|
58
|
+
expectedStatusCode: st.expectedStatusCode ?? inferExpectedStatus(String(st.method ?? "GET")),
|
|
62
59
|
expectedResponseFields: st.expectedResponseFields,
|
|
63
60
|
bodyMustInclude: st.bodyMustInclude,
|
|
64
61
|
chainsFrom: st.chainsFrom,
|
|
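The body of `inferExpectedStatus` in `httpDefaults.js` is not shown in this diff. Assuming it preserves the inline ternary it replaces (POST gives 201, DELETE gives 204, everything else 200), it could look roughly like the sketch below. The function is named `inferExpectedStatusSketch` to make clear that it is a guess at the behavior, not the package's code.

```javascript
// Assumed behavior, reconstructed from the removed inline ternary in this hunk.
// The real helper in utils/httpDefaults.js may handle more methods or differ in detail.
function inferExpectedStatusSketch(method) {
  switch (String(method ?? "GET").toUpperCase()) {
    case "POST":
      return 201; // resource created
    case "DELETE":
      return 204; // no content
    default:
      return 200; // GET, PUT, PATCH and anything unrecognised
  }
}
```

In the hunk above the helper only supplies a fallback: an explicit `expectedStatusCode` on the step still wins through the `??` operator.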
package/build/prompts/test-recommendation/test-recommendation-prompt.js
@@ -4,8 +4,9 @@ import { WorkspaceAuthType, getDefaultAuthHeader, AUTH_MIDDLEWARE_PATTERNS_STR }
|
|
|
4
4
|
import { logger } from "../../utils/logger.js";
|
|
5
5
|
import { extractResourceFromPath } from "../../utils/routeParsers.js";
|
|
6
6
|
import { buildArchitectPreamble, buildContextFetchingGuidance, buildReasoningProtocol, buildToolWorkflows, buildTestPatternGuidelines, buildTestQualityCriteria, buildFewShotExamples, buildVerificationChecklist, buildGenerationRules, getAuthSnippets, MAX_TESTS_TO_GENERATE, MAX_RECOMMENDATIONS, MAX_CRITICAL_TESTS, } from "./recommendationSections.js";
|
|
7
|
-
import { CATEGORY_PRIORITY, TEST_CATEGORIES } from "../../types/TestRecommendation.js";
|
|
7
|
+
import { CATEGORY_PRIORITY, PRIORITY_TIER_ORDER, TEST_CATEGORIES } from "../../types/TestRecommendation.js";
|
|
8
8
|
import { buildScopeAssessmentSection, isFrontendFile } from "./scopeAssessment.js";
|
|
9
|
+
import { resolveServiceDetailsRef } from "../../utils/featureFlags.js";
|
|
9
10
|
function formatTestLocations(locs) {
|
|
10
11
|
const entries = Object.entries(locs || {});
|
|
11
12
|
if (entries.length === 0)
|
|
@@ -26,8 +27,8 @@ function formatTestLocations(locs) {
|
|
|
26
27
|
// Categories map to HIGH / MEDIUM / LOW tiers.
|
|
27
28
|
// Within a tier, novelty (new > modified > existing) breaks ties,
|
|
28
29
|
// then cross-resource, step count, and finally the deterministic SHA-256 seed.
|
|
29
|
-
//
|
|
30
|
-
const PRIORITY_ORDER =
|
|
30
|
+
// Single source of truth for priority ordering — imported from types.
|
|
31
|
+
const PRIORITY_ORDER = PRIORITY_TIER_ORDER;
|
|
31
32
|
const NOVELTY_ORDER = { new: 3, modified: 2, existing: 1 };
|
|
32
33
|
function classifyNovelty(scenario, diffContext) {
|
|
33
34
|
if (!diffContext)
|
|
@@ -133,12 +134,6 @@ function buildExternalCoverageSet(testLocations) {
|
|
|
133
134
|
}
|
|
134
135
|
// ── Execution Plan (replaces pre-ranked + scenarios + heuristic sections) ──
|
|
135
136
|
function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, authSchemeSnippet, authTypeValue, isFrontendProject = false, isFrontendOnlyProject = false, externalCoverage = new Set()) {
|
|
136
|
-
// Full-repo mode only — percentage-based UI/E2E slot targets (15% each, floor 1).
|
|
137
|
-
const rawE2E = isFrontendProject ? Math.max(1, Math.round(topN * 0.15)) : 0;
|
|
138
|
-
const rawUI = isFrontendProject ? Math.max(1, Math.round(topN * 0.15)) : 0;
|
|
139
|
-
const slotsFloor = Math.floor(topN / 2);
|
|
140
|
-
const minE2ESlots = Math.min(rawE2E, slotsFloor);
|
|
141
|
-
const minUISlots = Math.min(rawUI, Math.max(0, topN - minE2ESlots));
|
|
142
137
|
const authRef = authHeaderValue
|
|
143
138
|
? `, authHeader: "${authHeaderValue}"${authSchemeSnippet}`
|
|
144
139
|
: `, authHeader: <check OpenAPI securitySchemes or auth middleware; "" if confirmed unauthenticated>`;
|
|
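For reference, the percentage-based allocation removed in this hunk can be isolated into a standalone function to see what it used to compute. The arithmetic is copied from the deleted lines; only the wrapper name `legacySlotTargets` and the returned object shape are introduced here for illustration.

```javascript
// Reproduces the removed 0.1.3 arithmetic: 15% of topN each for E2E and UI (floor 1),
// with E2E capped at half of topN and UI capped by whatever remains.
function legacySlotTargets(topN, isFrontendProject) {
  const rawE2E = isFrontendProject ? Math.max(1, Math.round(topN * 0.15)) : 0;
  const rawUI = isFrontendProject ? Math.max(1, Math.round(topN * 0.15)) : 0;
  const slotsFloor = Math.floor(topN / 2);
  const minE2ESlots = Math.min(rawE2E, slotsFloor);
  const minUISlots = Math.min(rawUI, Math.max(0, topN - minE2ESlots));
  return { minE2ESlots, minUISlots };
}
```

With `topN = 20` on a frontend project this reserves 3 E2E and 3 UI slots; with `topN = 1` the half-of-topN cap drops E2E to 0 and leaves a single UI slot. In 0.1.4 these fixed targets are gone and the split is left to the Budget Plan.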
@@ -167,11 +162,9 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
167
162
|
return true;
|
|
168
163
|
})
|
|
169
164
|
: scored;
|
|
170
|
-
//
|
|
171
|
-
|
|
172
|
-
|
|
173
|
-
: topN;
|
|
174
|
-
const allItems = scoredFiltered.slice(0, backendSlotCount);
|
|
165
|
+
// All backend slots — UI/E2E split is determined by the LLM's Budget Plan
|
|
166
|
+
// (via buildScopeAssessmentSection), not by hardcoded percentage allocation.
|
|
167
|
+
const allItems = scoredFiltered.slice(0, topN);
|
|
175
168
|
const byType = new Map();
|
|
176
169
|
for (const t of TYPE_ORDER)
|
|
177
170
|
byType.set(t, []);
|
|
@@ -195,7 +188,7 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
195
188
|
return [
|
|
196
189
|
`**${rank}. ${title}**`,
|
|
197
190
|
` ${s.description}`,
|
|
198
|
-
` ${step.method} ${step.path} \u2192 ${step.expectedStatusCode}`,
|
|
191
|
+
` ${step.method} ${step.path}${step.expectedStatusCode ? ` \u2192 ${step.expectedStatusCode}` : ""}`,
|
|
199
192
|
` Tool: \`skyramp_contract_test_generation({ endpointURL: "${endpointURL}", method: "${step.method}"${authRef}${dataParam} })\``,
|
|
200
193
|
` From source: fill in requestData field names and the specific production boundary this validates`,
|
|
201
194
|
].join("\n");
|
|
@@ -204,7 +197,7 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
204
197
|
const stepLines = s.steps.map(st => {
|
|
205
198
|
const isBody = ["POST", "PUT", "PATCH"].includes(st.method);
|
|
206
199
|
const bodyHint = isBody ? ` \u2014 body: <${st.method} ${st.path} required fields from source>` : "";
|
|
207
|
-
return ` ${st.order}. ${st.method} ${st.path} \u2192 ${st.expectedStatusCode}: ${st.description}${bodyHint}`;
|
|
200
|
+
return ` ${st.order}. ${st.method} ${st.path}${st.expectedStatusCode ? ` \u2192 ${st.expectedStatusCode}` : ""}: ${st.description}${bodyHint}`;
|
|
208
201
|
}).join("\n");
|
|
209
202
|
const isTraceBased = testType === "e2e" || testType === "ui";
|
|
210
203
|
let toolCallsBlock;
|
|
@@ -250,7 +243,7 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
250
243
|
dataParam = `, requestBody: <${st.method} ${st.path} required fields from source code>`;
|
|
251
244
|
}
|
|
252
245
|
}
|
|
253
|
-
return ` { method: "${st.method}", path: "${st.path}"
|
|
246
|
+
return ` { method: "${st.method}", path: "${st.path}"${st.expectedStatusCode ? `, statusCode: ${st.expectedStatusCode}` : ""}${dataParam} }`;
|
|
254
247
|
}).join(",\n");
|
|
255
248
|
toolCallsBlock = [
|
|
256
249
|
` skyramp_batch_scenario_test_generation({ scenarioName: "${s.scenarioName}", destination: "${destinationHost}", baseURL: "${baseUrl}"${scenarioAuthRef}, steps: [\n${batchSteps}\n ] })`,
|
|
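The last three hunks apply the same change in three places: the status code is rendered only when the step actually carries one. A minimal sketch of that pattern, using hypothetical helper names (`renderStepLine`, `renderBatchStep`) and omitting the body hints and `dataParam` handling of the real templates:

```javascript
// Renders "1. GET /orders → 200: description", dropping the arrow when no status is known.
function renderStepLine(st) {
  const status = st.expectedStatusCode ? ` \u2192 ${st.expectedStatusCode}` : "";
  return `${st.order}. ${st.method} ${st.path}${status}: ${st.description}`;
}

// Renders one element of the batch `steps` array, omitting statusCode when it is unset.
function renderBatchStep(st) {
  const status = st.expectedStatusCode ? `, statusCode: ${st.expectedStatusCode}` : "";
  return `{ method: "${st.method}", path: "${st.path}"${status} }`;
}
```

Without the guard, a step lacking `expectedStatusCode` would be printed with the literal text `undefined` after the arrow, which is presumably what the change avoids.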
@@ -282,55 +275,11 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
282
275
|
const entries = items.map((item, i) => renderItem(item, globalRank + i + 1)).join("\n\n");
|
|
283
276
|
return `### ${label} (${items.length})\n\n${entries}`;
|
|
284
277
|
});
|
|
285
|
-
|
|
286
|
-
const e2eSectionParts = [];
|
|
287
|
-
const uiSectionParts = [];
|
|
288
|
-
if (isFrontendProject) {
|
|
289
|
-
for (let i = 0; i < minE2ESlots; i++) {
|
|
290
|
-
const rank = i + 1;
|
|
291
|
-
e2eSectionParts.push(`**${rank}. E2E User Journey ${i + 1}**\n` +
|
|
292
|
-
` End-to-end test covering a complete user journey through the frontend and backend.\n` +
|
|
293
|
-
` To generate: record a browser trace, then call the generation tool.\n` +
|
|
294
|
-
` browser_navigate({ url: "${baseUrl}" }) \u2192 exercise key user flow \u2192 skyramp_export_zip({ outputPath: "<repo>/.skyramp/e2e_journey_${i + 1}.zip" })\n` +
|
|
295
|
-
` Tool: \`skyramp_e2e_test_generation({ playwrightInput: "<repo>/.skyramp/e2e_journey_${i + 1}.zip"${authHeaderOnlyRef} })\`\n` +
|
|
296
|
-
` From source: read frontend components and their API calls to identify the highest-value user journey`);
|
|
297
|
-
}
|
|
298
|
-
for (let i = 0; i < minUISlots; i++) {
|
|
299
|
-
const rank = minE2ESlots + i + 1;
|
|
300
|
-
uiSectionParts.push(`**${rank}. UI Component Test ${i + 1}**\n` +
|
|
301
|
-
` Test key UI component interactions and state changes.\n` +
|
|
302
|
-
` To generate: record a browser trace, then call the generation tool.\n` +
|
|
303
|
-
` browser_navigate({ url: "${baseUrl}" }) \u2192 interact with UI components \u2192 skyramp_export_zip({ outputPath: "<repo>/.skyramp/ui_component_${i + 1}.zip" })\n` +
|
|
304
|
-
` Tool: \`skyramp_ui_test_generation({ playwrightInput: "<repo>/.skyramp/ui_component_${i + 1}.zip"${authHeaderOnlyRef} })\`\n` +
|
|
305
|
-
` From source: read frontend component files to identify interactions, form submissions, and state transitions`);
|
|
306
|
-
}
|
|
307
|
-
// Offset backend section ranks by the number of E2E + UI placeholders
|
|
308
|
-
const offset = minE2ESlots + minUISlots;
|
|
309
|
-
backendSections.forEach((_, idx) => {
|
|
310
|
-
const t = TYPE_ORDER.filter(t => (byType.get(t) ?? []).length > 0)[idx];
|
|
311
|
-
if (!t)
|
|
312
|
-
return;
|
|
313
|
-
const items = byType.get(t);
|
|
314
|
-
const label = TYPE_LABEL[t];
|
|
315
|
-
let globalRank = offset;
|
|
316
|
-
for (const prev of TYPE_ORDER) {
|
|
317
|
-
if (prev === t)
|
|
318
|
-
break;
|
|
319
|
-
globalRank += (byType.get(prev) ?? []).length;
|
|
320
|
-
}
|
|
321
|
-
backendSections[idx] = `### ${label} (${items.length})\n\n${items.map((item, i) => renderItem(item, globalRank + i + 1)).join("\n\n")}`;
|
|
322
|
-
});
|
|
323
|
-
}
|
|
324
|
-
const allSections = [
|
|
325
|
-
...(e2eSectionParts.length > 0 ? [`### E2E (${e2eSectionParts.length})\n\n${e2eSectionParts.join("\n\n")}`] : []),
|
|
326
|
-
...(uiSectionParts.length > 0 ? [`### UI (${uiSectionParts.length})\n\n${uiSectionParts.join("\n\n")}`] : []),
|
|
327
|
-
...backendSections,
|
|
328
|
-
];
|
|
329
|
-
const sections = allSections.join("\n\n");
|
|
278
|
+
const sections = backendSections.join("\n\n");
|
|
330
279
|
const frontendTierNote = isFrontendOnlyProject
|
|
331
|
-
? `\n\n**Frontend repo:**
|
|
280
|
+
? `\n\n**Frontend repo:** add E2E and UI tests only — no integration or contract tests. The number of each is determined by the UI/E2E percentage you committed to in your Budget Plan above.`
|
|
332
281
|
: isFrontendProject
|
|
333
|
-
? `\n\n**Full-stack repo:**
|
|
282
|
+
? `\n\n**Full-stack repo:** add E2E and UI tests alongside backend tests. Fill your Budget Plan's UI/E2E percentage first, then use remaining slots for backend tests (Tiers 1-4 above).`
|
|
334
283
|
: "";
|
|
335
284
|
const repoSupplementNote = supplementCount > 0
|
|
336
285
|
? `
|
|
@@ -357,9 +306,9 @@ function buildFullRepoRecommendations(scored, topN, baseUrl, authHeaderValue, au
|
|
|
357
306
|
</supplement_guidance>`
|
|
358
307
|
: "";
|
|
359
308
|
const typeMixText = isFrontendOnlyProject
|
|
360
|
-
? `This is a frontend repo. Focus on E2E and UI tests only.
|
|
309
|
+
? `This is a frontend repo. Focus on E2E and UI tests only. Do NOT add integration or contract tests. Split between E2E and UI based on the percentage in your Budget Plan above.`
|
|
361
310
|
: isFrontendProject
|
|
362
|
-
? `This is a full-stack repo. Coverage ranking: E2E > UI > Integration > Contract.
|
|
311
|
+
? `This is a full-stack repo. Coverage ranking: E2E > UI > Integration > Contract. Use \`skyramp_e2e_test_generation\` for E2E and \`skyramp_ui_test_generation\` for UI tests. Split between frontend and backend tests based on the percentage in your Budget Plan above.`
|
|
363
312
|
: `Focus on integration and contract tests for all API endpoints.`;
|
|
364
313
|
return `## Test Recommendations — ${topN} total (grouped by test type)
|
|
365
314
|
|
|
@@ -388,7 +337,7 @@ Before filling in tool call parameters for each item, use the analysis data alre
|
|
|
388
337
|
- Computed/derived response fields and their formulas — assert exact values; read source for formula details not captured in the analysis
|
|
389
338
|
- Auth middleware — set authHeader/authScheme from the repository context above; FastAPI HTTPBearer → 403 not 401
|
|
390
339
|
- Storage backend — if Redis or schema-less, discard unique-constraint and cascade-delete scenarios
|
|
391
|
-
- Delete behavior — hard-delete
|
|
340
|
+
- Delete behavior — read the route handler to determine actual response code (hard-delete may use 204, soft-delete/cancel may use 200)
|
|
392
341
|
|
|
393
342
|
${buildTestQualityCriteria()}
|
|
394
343
|
|
|
@@ -402,16 +351,6 @@ ${buildTestQualityCriteria()}
|
|
|
402
351
|
</enrichment_notes>`;
|
|
403
352
|
}
|
|
404
353
|
function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, authSchemeSnippet, authTypeValue, seed, endpointCount, isUIOnlyPR, hasFrontendChanges = false, hasTraces = false, externalCoverage = new Set(), relevantExternalTestPaths = []) {
|
|
405
|
-
const frontendUrl = "<frontend_url>";
|
|
406
|
-
// Slot allocation:
|
|
407
|
-
// - UI-only PR: all GENERATE slots are UI placeholders (no pre-ranked backend scenarios)
|
|
408
|
-
// - Mixed PR: last GENERATE slot is a UI placeholder; remaining slots are backend
|
|
409
|
-
// - Backend-only PR: all GENERATE slots are backend scenarios
|
|
410
|
-
const backendGenerateCount = isUIOnlyPR
|
|
411
|
-
? 0
|
|
412
|
-
: hasFrontendChanges
|
|
413
|
-
? Math.max(0, maxGen - 1)
|
|
414
|
-
: maxGen;
|
|
415
354
|
// Filter out scenarios whose primary method + resource + test type is already covered by external tests.
|
|
416
355
|
// Method-aware: an external test covering GET /orders won't block PUT /orders scenarios.
|
|
417
356
|
// This is the programmatic complement to the prompt-level Step 0 dedup instructions.
|
|
@@ -425,8 +364,10 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
425
364
|
return true;
|
|
426
365
|
})
|
|
427
366
|
: scored;
|
|
428
|
-
|
|
429
|
-
|
|
367
|
+
// All pre-ranked backend scenarios go into GENERATE slots (up to maxGen).
|
|
368
|
+
// UI/E2E split is determined by the LLM's Budget Plan — not hardcoded here.
|
|
369
|
+
const generateItems = scoredAfterExternalDedup.slice(0, Math.min(maxGen, scoredAfterExternalDedup.length));
|
|
370
|
+
const rawAdditionalItems = scoredAfterExternalDedup.slice(maxGen, topN);
|
|
430
371
|
// Filter additional items whose primary resource + test type already appear in GENERATE
|
|
431
372
|
const generatedCoverage = new Set(generateItems.map(item => scenarioCoverageKey(item.scenario)));
|
|
432
373
|
const additionalItems = rawAdditionalItems.filter(item => !generatedCoverage.has(scenarioCoverageKey(item.scenario)));
|
|
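Reviewer note: the slot allocation added in this hunk can be sketched in isolation. This is a minimal illustration, not the package's code. `scenarioCoverageKey` below is a hypothetical stand-in (its real definition is not shown in this diff), `allocateSlots` is an invented wrapper name, and the sample data is made up.

```javascript
// Hypothetical stand-in: the real scenarioCoverageKey is not shown in this diff.
function scenarioCoverageKey(scenario) {
  const first = scenario.steps[0];
  return `${first.method} ${first.path} ${scenario.testType}`;
}

// Invented wrapper around the three statements added/kept in the hunk.
function allocateSlots(scored, maxGen, topN) {
  // The first maxGen ranked items fill the GENERATE slots.
  const generateItems = scored.slice(0, Math.min(maxGen, scored.length));
  // Items ranked maxGen..topN are candidates for ADDITIONAL.
  const rawAdditionalItems = scored.slice(maxGen, topN);
  // Drop ADDITIONAL candidates whose coverage key already appears in GENERATE.
  const generatedCoverage = new Set(generateItems.map(item => scenarioCoverageKey(item.scenario)));
  const additionalItems = rawAdditionalItems.filter(
    item => !generatedCoverage.has(scenarioCoverageKey(item.scenario))
  );
  return { generateItems, additionalItems };
}

// Made-up ranked list: the third item duplicates the first item's coverage key.
const scored = [
  { scenario: { testType: "contract", steps: [{ method: "POST", path: "/orders" }] } },
  { scenario: { testType: "contract", steps: [{ method: "GET", path: "/orders" }] } },
  { scenario: { testType: "contract", steps: [{ method: "POST", path: "/orders" }] } },
  { scenario: { testType: "integration", steps: [{ method: "PUT", path: "/orders/{id}" }] } },
];

const plan = allocateSlots(scored, 2, 4);
console.log(plan.generateItems.length, plan.additionalItems.length); // 2 1
```

With `maxGen = 2` and `topN = 4`, the duplicate `POST /orders` contract scenario is filtered out of ADDITIONAL, leaving only the `PUT` integration scenario there.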
@@ -439,47 +380,10 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
439
380
|
: authHeaderValue
|
|
440
381
|
? `, authHeader: "${authHeaderValue}"`
|
|
441
382
|
: `, authHeader: <check OpenAPI securitySchemes or auth middleware; "" if confirmed unauthenticated>`;
|
|
442
|
-
// UI-only:
|
|
443
|
-
|
|
444
|
-
|
|
445
|
-
|
|
446
|
-
const zipPath = `<repositoryPath>/.skyramp/ui_test_${rank}_trace.zip`;
|
|
447
|
-
return hasTraces
|
|
448
|
-
? (`**#${rank} — GENERATE** | ui | workflow | new\n` +
|
|
449
|
-
`Scenario: ui-test-from-trace-${rank} (rename from the actual changed component/flow)\n` +
|
|
450
|
-
`Validates: UI interactions for a changed frontend component or flow.\n\n` +
|
|
451
|
-
`**Tool**: \`skyramp_ui_test_generation({ playwrightInput: "<discovered_trace_file_path>", outputDir: "<frontend service testDirectory from workspace.yml e.g. frontend/tests>" })\``)
|
|
452
|
-
: (`**#${rank} — GENERATE** | ui | workflow | new\n` +
|
|
453
|
-
`Scenario: ui-test-for-changed-component-${rank} (rename from the actual changed component/flow)\n` +
|
|
454
|
-
`Validates: UI interactions for changed frontend component/flow ${rank}.\n\n` +
|
|
455
|
-
`**Tool workflow:**\n` +
|
|
456
|
-
` 1. \`browser_navigate({ url: "${frontendUrl}" })\`\n` +
|
|
457
|
-
` 2. Interact with the changed component (read the diff to identify which component changed and what interactions it supports)\n` +
|
|
458
|
-
` 3. \`browser_snapshot()\` after each key interaction\n` +
|
|
459
|
-
` 4. \`skyramp_export_zip({ outputPath: "${zipPath}" })\` — absolute path\n` +
|
|
460
|
-
` 5. \`skyramp_ui_test_generation({ playwrightInput: "${zipPath}", outputDir: "<frontend service testDirectory from workspace.yml e.g. frontend/tests>" })\`\n\n` +
|
|
461
|
-
`Each item must target a distinct changed component or user flow.`);
|
|
462
|
-
}).join("\n\n")
|
|
463
|
-
: "";
|
|
464
|
-
// Mixed PR: reserve the last GENERATE slot for a UI test for the changed frontend components.
|
|
465
|
-
// Guard: skip when maxGen=0 (caller explicitly requested no generation)
|
|
466
|
-
const uiRank = generateItems.length + 1;
|
|
467
|
-
const uiPlaceholderBlock = (hasFrontendChanges && !isUIOnlyPR && maxGen > 0)
|
|
468
|
-
? hasTraces
|
|
469
|
-
? (`**#${uiRank} — GENERATE** | ui | workflow | new\n` +
|
|
470
|
-
`Scenario: ui-test-for-changed-components (rename from the actual changed component/flow)\n` +
|
|
471
|
-
`Validates: UI interactions for the changed frontend components in this PR.\n\n` +
|
|
472
|
-
`**Tool**: \`skyramp_ui_test_generation({ playwrightInput: "<discovered_trace_file_path>", outputDir: "<frontend service testDirectory from workspace.yml e.g. frontend/tests>" })\``)
|
|
473
|
-
: (`**#${uiRank} — GENERATE** | ui | workflow | new\n` +
|
|
474
|
-
`Scenario: ui-test-for-changed-components (rename from the actual changed component/flow)\n` +
|
|
475
|
-
`Validates: UI interactions for the changed frontend components in this PR.\n\n` +
|
|
476
|
-
`**Tool workflow:**\n` +
|
|
477
|
-
` 1. \`browser_navigate({ url: "${frontendUrl}" })\`\n` +
|
|
478
|
-
` 2. Interact with the changed component (read the diff to identify which component changed and what interactions it supports)\n` +
|
|
479
|
-
` 3. \`browser_snapshot()\` after each key interaction\n` +
|
|
480
|
-
` 4. \`skyramp_export_zip({ outputPath: "<repositoryPath>/.skyramp/ui_mixed_pr_trace.zip" })\` — absolute path\n` +
|
|
481
|
-
` 5. \`skyramp_ui_test_generation({ playwrightInput: "<repositoryPath>/.skyramp/ui_mixed_pr_trace.zip", outputDir: "<frontend service testDirectory from workspace.yml e.g. frontend/tests>" })\`\n\n` +
|
|
482
|
-
`Derive scenario name and steps from the actual changed frontend files.`)
|
|
383
|
+
// UI-only PR: provide guidance template for the LLM to derive UI tests from changed files.
|
|
384
|
+
// The LLM's Budget Plan (100% UI for UI-only PRs) determines how many to generate.
|
|
385
|
+
const uiOnlyGenerateGuidance = isUIOnlyPR
|
|
386
|
+
? `**UI-only PR — derive ${maxGen} UI tests from changed frontend files.**\nEach test must target a distinct changed component or user flow. Use \`skyramp_ui_test_generation\` to generate each test.`
|
|
483
387
|
: "";
|
|
484
388
|
const generateBlocks = generateItems.map((item, i) => {
|
|
485
389
|
const rank = i + 1;
|
|
@@ -496,7 +400,7 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
496
400
|
? `\n authHeader: "${authHeaderValue}"${authSchemeSnippet}`
|
|
497
401
|
: `\n authHeader: <resolve from workspace or OpenAPI securitySchemes>; authScheme: <if Authorization>`;
|
|
498
402
|
return (`**#${rank} — GENERATE** | ${testType} | ${s.category} | ${item.novelty}\n` +
|
|
499
|
-
`${step.method} ${step.path} → ${step.expectedStatusCode}\n` +
|
|
403
|
+
`${step.method} ${step.path}${step.expectedStatusCode ? ` → ${step.expectedStatusCode}` : ""}\n` +
|
|
500
404
|
`Validates: ${s.description}\n\n` +
|
|
501
405
|
`**Context for generation**:\n` +
|
|
502
406
|
` Endpoint URL: ${endpointURL}${requestBodyData}${authContext}\n\n` +
|
|
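Reviewer note: the changed line in this hunk makes the status-code arrow conditional. A small sketch of the same expression follows. `formatStepTarget` is a hypothetical helper name, since the diff inlines the expression in template literals, and the sample steps are invented.

```javascript
// Hypothetical helper: wraps the conditional suffix used in the changed template literal.
function formatStepTarget(step) {
  // Render " → <code>" only when an expected status code is present.
  const suffix = step.expectedStatusCode ? ` → ${step.expectedStatusCode}` : "";
  return `${step.method} ${step.path}${suffix}`;
}

console.log(formatStepTarget({ method: "DELETE", path: "/orders/{id}", expectedStatusCode: 204 }));
// DELETE /orders/{id} → 204
console.log(formatStepTarget({ method: "DELETE", path: "/orders/{id}" }));
// DELETE /orders/{id}
```

The unconditional form on the removed line would interpolate a missing code as the string `undefined`. The conditional form omits the arrow instead.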
@@ -519,7 +423,7 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
519
423
|
const bodyData = st.requestBody && Object.keys(st.requestBody).length > 0
|
|
520
424
|
? ` [use requestBody: ${JSON.stringify(st.requestBody)} — pass as JSON string in tool call]`
|
|
521
425
|
: "";
|
|
522
|
-
return ` ${st.order}. ${st.method} ${st.path} → ${st.expectedStatusCode}: ${st.description}${chains}${bodyHint}${bodyData}${responseHint}`;
|
|
426
|
+
return ` ${st.order}. ${st.method} ${st.path}${st.expectedStatusCode ? ` → ${st.expectedStatusCode}` : ""}: ${st.description}${chains}${bodyHint}${bodyData}${responseHint}`;
|
|
523
427
|
}).join("\n");
|
|
524
428
|
let destinationHost = "localhost";
|
|
525
429
|
try {
|
|
@@ -530,9 +434,7 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
530
434
|
const authContext = authHeaderValue
|
|
531
435
|
? `authHeader: "${authHeaderValue}"${authSchemeSnippet}`
|
|
532
436
|
: "authHeader: <resolve from workspace or OpenAPI securitySchemes>; authScheme: <if Authorization>";
|
|
533
|
-
const prereqNote = s.
|
|
534
|
-
? `\n**Prerequisite discovery**: Check for FK fields (product_id, user_id, order_id) in the endpoint's request body. If found, prepend a step to create that prerequisite resource first, then chain its primary key field into the dependent step using template variable syntax. Check the actual field name from the response body (\`id\`, \`uuid\`, \`_id\`, etc.), response header (\`Location\`), or cookie — do not assume \`id\`.`
|
|
535
|
-
: "";
|
|
437
|
+
const prereqNote = `\n**Prerequisite discovery**: Check for Foreign Key fields (product_id, user_id, order_id) in the endpoint's request body. If found, prepend a step to create that prerequisite resource first, then chain its primary key field into the dependent step using template variable syntax. Check the actual field name from the response body (\`id\`, \`uuid\`, \`_id\`, etc.), response header (\`Location\`), or cookie — do not assume \`id\`.`;
|
|
536
438
|
const bugLine = s.bugCatchingTarget
|
|
537
439
|
? `**Bug to catch**: ${s.bugCatchingTarget}\n`
|
|
538
440
|
: "";
|
|
@@ -561,17 +463,16 @@ function buildExecutionPlan(scored, maxGen, topN, baseUrl, authHeaderValue, auth
|
|
|
561
463
|
const s = item.scenario;
|
|
562
464
|
const testType = s.testType ?? (s.steps.length === 1 ? "contract" : "integration");
|
|
563
465
|
const target = s.steps.length === 1
|
|
564
|
-
? `${s.steps[0].method} ${s.steps[0].path} → ${s.steps[0].expectedStatusCode}`
|
|
466
|
+
? `${s.steps[0].method} ${s.steps[0].path}${s.steps[0].expectedStatusCode ? ` → ${s.steps[0].expectedStatusCode}` : ""}`
|
|
565
467
|
: `Scenario: ${s.scenarioName} (${s.steps.map(st => `${st.method} ${st.path}`).join(" → ")})`;
|
|
566
468
|
return `#${rank} [ADDITIONAL] | ${testType} | ${s.category} | ${item.novelty}\n ${target}\n Validates: ${s.description}`;
|
|
567
469
|
}).join("\n\n");
|
|
568
|
-
// UI/E2E guidance — the LLM adds
|
|
569
|
-
//
|
|
570
|
-
|
|
571
|
-
|
|
572
|
-
**UI/E2E tests (add per your Budget Plan):** If your Budget Plan requires UI/E2E items beyond what is already in your GENERATE list, append an [ADDITIONAL] entry for each. If a UI test already occupies a GENERATE slot above, that slot satisfies your UI/E2E generate count — do NOT add it again to ADDITIONAL. Tool workflow for each new item:
|
|
470
|
+
// UI/E2E guidance — the LLM adds UI/E2E items as its Budget Plan dictates.
|
|
471
|
+
// Only rendered for non-UI-only PRs (UI-only PRs have dedicated guidance above).
|
|
472
|
+
const uiGuidance = (!isUIOnlyPR && hasFrontendChanges) ? `
|
|
473
|
+
**UI/E2E tests (add per your Budget Plan):** If your Budget Plan allocates UI/E2E slots, add them here. Tool workflow for each new item:
|
|
573
474
|
- **E2E**: ${hasTraces ? "Use discovered trace/recording files with `skyramp_e2e_test_generation`." : "Add to additionalRecommendations with a note that both a backend API trace (`skyramp_start_trace_collection` / `skyramp_stop_trace_collection`) and a browser Playwright recording must be collected in a live environment first. Do NOT attempt `skyramp_e2e_test_generation` without both traces present."}
|
|
574
|
-
- **UI**: ${hasTraces ? "Use an existing Playwright `.zip` trace with `skyramp_ui_test_generation`." :
|
|
475
|
+
- **UI**: ${hasTraces ? "Use an existing Playwright `.zip` trace with `skyramp_ui_test_generation`." : `Record a trace using \`browser_navigate\` + \`browser_snapshot\` + \`skyramp_export_zip\`, then call \`skyramp_ui_test_generation({ playwrightInput: "<zip_path>", outputDir: "<frontend_test_directory>" })\`. Resolve \`<frontend_test_directory>\` from ${resolveServiceDetailsRef().frontendTestDirRef}.`}
|
|
575
476
|
Derive scenario names and steps from the actual changed frontend files. If your Budget Plan calls for 0% UI/E2E, omit this entirely.` : "";
|
|
576
477
|
const supplementNote = `\n**If your Budget Plan total exceeds the pre-ranked items listed above:** draft additional tests from source-code enrichment (Step 1). For each new or changed endpoint, identify boundary or variation scenarios — formula parameters, search/filter constraints, required field validation. Only after exhausting PR-specific scenarios, add generic patterns (auth boundary → 401, non-existent ID → 404). Do NOT supplement with tests whose endpoint + test type match a GENERATE item.`;
|
|
577
478
|
// ── PR / branch-diff mode: execution plan ────────────────────────────────
|
|
@@ -598,7 +499,7 @@ ${externalTestFilesList}For every GENERATE item below, check its endpoint path a
|
|
|
598
499
|
|
|
599
500
|
**Step 1 — Source-Code Enrichment (before executing anything)**
|
|
600
501
|
Read the source code for ALL changed files. Before generating each recommendation, quote the relevant source code in a <source_evidence> block — include the route handler signature, request body schema fields, response shape, and any computed field formulas. Use these quotes to derive tool call parameters. Look for:
|
|
601
|
-
- **Auth middleware** — check for known signals (${AUTH_MIDDLEWARE_PATTERNS_STR}). If any match, override \`authHeader\` and \`authScheme\` even if
|
|
502
|
+
- **Auth middleware** — check for known signals (${AUTH_MIDDLEWARE_PATTERNS_STR}). If any match, override \`authHeader\` and \`authScheme\` even if ${resolveServiceDetailsRef().authSourceRef} says authType: none. **If no known signal matches but the diff shows security-adjacent code** (decorators like \`@requiresRole\`/\`@Protected\`, function names like \`validateToken\`/\`checkPermission\`/\`verifyHMAC\`, or imports from auth/security packages), read the relevant source file to determine the actual auth scheme before proceeding. Auth handling for \`skyramp_integration_test_generation\` with \`scenarioFile\` is covered in the Tool Workflows section below.
|
|
602
503
|
- Business rules and formulas (e.g. total_cost = compute * rate + memory * rate)
|
|
603
504
|
- State transitions and domain constraints (e.g. budget cannot drop below current spend)
|
|
604
505
|
- Validation logic (field constraints, cross-field dependencies)
|
|
@@ -633,7 +534,7 @@ If these conditions are not met, add it to ADDITIONAL only — do NOT displace a
|
|
|
633
534
|
When a qualifying candidate is inserted: place it HIGH before MEDIUM before LOW; within the same priority, source-code-derived candidates go BEFORE structural ones. Re-number ranks after insertion. The top ${maxGen} ranked items become GENERATE candidates.
|
|
634
535
|
|
|
635
536
|
**Source-code validation gates (apply during Step 1):**
|
|
636
|
-
- **Cascade vs referential integrity**: If both a cascade-delete and a delete-blocked scenario appear for the same resource pair, keep only the one matching the source
|
|
537
|
+
- **Cascade vs referential integrity**: If both a cascade-delete and a delete-blocked scenario appear for the same resource pair, keep only the one matching the source Foreign Key delete policy (ON DELETE CASCADE / cascade=True / onDelete: 'CASCADE' → keep cascade-delete; RESTRICT/PROTECT/no annotation → keep delete-blocked). Remove the inapplicable variant.
|
|
637
538
|
- **Unique constraints**: Unique-constraint scenarios (duplicate POST → 409) are pre-drafted for all resources. Confirm enforcement before keeping: SQL UNIQUE index, Mongoose unique: true, Prisma @unique, or explicit duplicate-check code. If the backend is Redis, schema-less, or has no explicit constraint in the changed files, move to ADDITIONAL with a note — do NOT generate.
|
|
638
539
|
|
|
639
540
|
**Step 2 — Diversity check (using enriched knowledge from Step 1)**
|
|
@@ -649,7 +550,7 @@ For each pair of GENERATE items, ask: same HTTP method + path + step sequence +
|
|
|
649
550
|
Same step sequence with only payload differences (e.g. 10% vs 5% discount both returning 200) = same code path = duplicate. Different scenario names do not make duplicate tests distinct.
|
|
650
551
|
|
|
651
552
|
**Step 3 — Execute merged plan in rank order**
|
|
652
|
-
Replace any scenario that pairs unrelated resources with one reflecting actual
|
|
553
|
+
Replace any scenario that pairs unrelated resources with one reflecting actual Foreign Key relationships in the codebase.
|
|
653
554
|
Use the field names and values from the \`<source_evidence>\` blocks you quoted in Step 1 to fill all tool call parameters. Prefer reusing Step 1 evidence when it already resolves a placeholder, but if a placeholder cannot be replaced with concrete values from files already read, you may read the specific schema, model, or handler file needed to resolve it. Assert response field values, not just status codes.
|
|
654
555
|
|
|
655
556
|
${buildTestQualityCriteria()}
|
|
@@ -665,10 +566,10 @@ ${buildGenerationRules(isUIOnlyPR)}
|
|
|
665
566
|
### GENERATE (process these EXACTLY as listed, in order — after completing Steps 0–2 above; if Step 0 converts an item to UPDATE, backfill the ADD slot from ADDITIONAL following the priority order in Step 0)
|
|
666
567
|
|
|
667
568
|
${isUIOnlyPR
|
|
668
|
-
? (
|
|
669
|
-
: (
|
|
569
|
+
? (uiOnlyGenerateGuidance || " (no UI generate items — derive scenarios from changed frontend files)")
|
|
570
|
+
: (generateBlocks || " (no pre-ranked generate items — draft your own based on endpoint analysis)")}
|
|
670
571
|
|
|
671
|
-
**
|
|
572
|
+
**VERIFICATION CHECK**: Before proceeding, verify your generate list covers the same endpoints and test types as the items above. Add genuinely new scenarios to ADDITIONAL instead. One retry on failure then skip to next item.
|
|
672
573
|
|
|
673
574
|
### ADDITIONAL (list in additionalRecommendations in this order after Step 1 insertion)
|
|
674
575
|
|
|
@@ -703,10 +604,17 @@ export function buildRecommendationPrompt(analysis, analysisScope = AnalysisScop
|
|
|
703
604
|
const hasFrontendChanges = isDiffScope && diffContext
|
|
704
605
|
? filteredChangedFiles.some(f => isFrontendFile(f))
|
|
705
606
|
: false;
|
|
607
|
+
// Backend changes detected if:
|
|
608
|
+
// 1. Endpoints directly matched from changed files (new/modified/removed), OR
|
|
609
|
+
// 2. Changed files are in backend service/model/middleware directories (affectedServices non-empty)
|
|
610
|
+
// but couldn't be mapped to specific endpoints (service-layer changes like services/items.ts)
|
|
706
611
|
const hasApiChanges = isDiffScope && diffContext
|
|
707
612
|
? (diffContext.newEndpoints.length > 0 || diffContext.modifiedEndpoints.length > 0 || (diffContext.removedEndpoints?.length ?? 0) > 0)
|
|
708
613
|
: false;
|
|
709
|
-
const
|
|
614
|
+
const hasBackendServiceChanges = isDiffScope && diffContext
|
|
615
|
+
? (diffContext.affectedServices.length > 0 && filteredChangedFiles.some(f => !isFrontendFile(f) && /\.(ts|js|py|java|go|rb|rs|cs)$/.test(f)))
|
|
616
|
+
: false;
|
|
617
|
+
const isUIOnlyPR = hasFrontendChanges && !hasApiChanges && !hasBackendServiceChanges;
|
|
710
618
|
const hasTraces = (analysis.artifacts?.traceFiles?.length ?? 0) > 0 ||
|
|
711
619
|
(analysis.artifacts?.playwrightRecordings?.length ?? 0) > 0;
|
|
712
620
|
// ── Mode preamble ──
|
|
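Reviewer note: the three flags in this hunk combine into `isUIOnlyPR` as sketched below. This is an illustration under assumptions. `isFrontendFile` here is a hypothetical stand-in (the real one is not shown in this diff), `classifyPR` is an invented wrapper, the `isDiffScope && diffContext` guard is omitted, and the file paths are made up.

```javascript
// Hypothetical stand-in: the real isFrontendFile is not shown in this diff.
function isFrontendFile(f) {
  return f.startsWith("frontend/") || /\.(tsx|jsx|vue|css)$/.test(f);
}

// Invented wrapper around the flag expressions from the hunk.
function classifyPR(changedFiles, diffContext) {
  const hasFrontendChanges = changedFiles.some(f => isFrontendFile(f));
  const hasApiChanges =
    diffContext.newEndpoints.length > 0 ||
    diffContext.modifiedEndpoints.length > 0 ||
    (diffContext.removedEndpoints?.length ?? 0) > 0;
  // Backend source files changed in an affected service, even with no endpoint match.
  const hasBackendServiceChanges =
    diffContext.affectedServices.length > 0 &&
    changedFiles.some(f => !isFrontendFile(f) && /\.(ts|js|py|java|go|rb|rs|cs)$/.test(f));
  const isUIOnlyPR = hasFrontendChanges && !hasApiChanges && !hasBackendServiceChanges;
  return { hasFrontendChanges, hasApiChanges, hasBackendServiceChanges, isUIOnlyPR };
}

// Service-layer change with no matched endpoints: not treated as UI-only.
const mixed = classifyPR(
  ["frontend/src/Cart.tsx", "services/items.ts"],
  { newEndpoints: [], modifiedEndpoints: [], affectedServices: ["items"] }
);
console.log(mixed.isUIOnlyPR); // false

// Frontend-only change: UI-only.
const uiOnly = classifyPR(
  ["frontend/src/Cart.tsx"],
  { newEndpoints: [], modifiedEndpoints: [], affectedServices: [] }
);
console.log(uiOnly.isUIOnlyPR); // true
```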
@@ -906,17 +814,11 @@ ${detailBlocks}
|
|
|
906
814
|
const na = NOVELTY_ORDER[a.novelty], nb = NOVELTY_ORDER[b.novelty];
|
|
907
815
|
if (nb !== na)
|
|
908
816
|
return nb - na;
|
|
909
|
-
|
|
910
|
-
|
|
911
|
-
|
|
912
|
-
|
|
913
|
-
|
|
914
|
-
return b.scenario.steps.length - a.scenario.steps.length;
|
|
915
|
-
const errorA = a.scenario.steps.some(s => s.interactionType === "error" || s.interactionType === "edge-case") ? 1 : 0;
|
|
916
|
-
const errorB = b.scenario.steps.some(s => s.interactionType === "error" || s.interactionType === "edge-case") ? 1 : 0;
|
|
917
|
-
if (errorB !== errorA)
|
|
918
|
-
return errorB - errorA;
|
|
919
|
-
// Use locale-independent comparison to avoid runtime-locale non-determinism
|
|
817
|
+
// Deterministic tiebreaker: when priority and novelty are equal, sort by
|
|
818
|
+
// scenario name then seeded hash. This ensures identical inputs always produce
|
|
819
|
+
// the same ordering regardless of runtime locale or JS sort stability — important
|
|
820
|
+
// because the LLM receives a ranked list and would otherwise produce inconsistent
|
|
821
|
+
// recommendations across runs for equally-ranked scenarios.
|
|
920
822
|
const nameA = a.scenario.scenarioName;
|
|
921
823
|
const nameB = b.scenario.scenarioName;
|
|
922
824
|
if (nameA < nameB)
|
|
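Reviewer note: the tiebreaker described in the new comment can be sketched as a comparator. This is a simplified illustration. The `NOVELTY_ORDER` values are hypothetical, the priority comparison that precedes this code and the seeded-hash step are omitted because they are not visible in this hunk, and the sample items are invented.

```javascript
// Hypothetical ranking values: the real NOVELTY_ORDER is not shown in this diff.
const NOVELTY_ORDER = { new: 2, modified: 1, existing: 0 };

function compareScored(a, b) {
  const na = NOVELTY_ORDER[a.novelty], nb = NOVELTY_ORDER[b.novelty];
  if (nb !== na)
    return nb - na; // higher novelty first
  // Code-unit comparison with < and > does not depend on the runtime locale,
  // unlike String.prototype.localeCompare.
  const nameA = a.scenario.scenarioName;
  const nameB = b.scenario.scenarioName;
  if (nameA < nameB)
    return -1;
  if (nameA > nameB)
    return 1;
  return 0;
}

const items = [
  { novelty: "existing", scenario: { scenarioName: "a-list-orders" } },
  { novelty: "new", scenario: { scenarioName: "b-update-order" } },
  { novelty: "new", scenario: { scenarioName: "a-create-order" } },
];

const ranked = [...items].sort(compareScored).map(i => i.scenario.scenarioName);
console.log(ranked); // [ 'a-create-order', 'b-update-order', 'a-list-orders' ]
```

Because every comparison is total over the sample names, the same input yields the same order in any input arrangement, which is the property the comment asks for.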
@@ -951,12 +853,23 @@ ${detailBlocks}
|
|
|
951
853
|
mainSection = buildExecutionPlan(scored, maxGen, topN, analysis.apiEndpoints.baseUrl, authHeaderValue, authSchemeSnippet, authTypeValue, seed, endpointCount, isUIOnlyPR, hasFrontendChanges, hasTraces, externalCoverage, analysis.existingTests.relevantExternalTestPaths ?? []);
|
|
952
854
|
}
|
|
953
855
|
else {
|
|
856
|
+
// Endpoint discovery hint: when backend service files changed but no endpoints were
|
|
857
|
+
// directly matched, guide the LLM to trace from service files → controllers → routes.
|
|
858
|
+
const endpointDiscoveryHint = hasBackendServiceChanges && diffContext
|
|
859
|
+
? `
|
|
860
|
+
**Endpoint Discovery Required** — the diff modifies backend service files (affected services: ${diffContext.affectedServices.join(", ")}) that don't directly define routes. You MUST:
|
|
861
|
+
1. Read the Routing entry-point files listed above
|
|
862
|
+
2. Trace which controllers/routers import the affected services
|
|
863
|
+
3. Identify the specific HTTP endpoints those controllers register
|
|
864
|
+
4. Use discovered endpoints as your GENERATE targets (contract + integration tests)
|
|
865
|
+
Do NOT default to UI-only tests — this PR has backend logic changes that require API-level testing.`
|
|
866
|
+
: "";
|
|
954
867
|
mainSection = `
|
|
955
868
|
## Draft Your Execution Plan
|
|
956
869
|
|
|
957
|
-
No pre-drafted scenarios available
|
|
870
|
+
No pre-drafted scenarios available.${endpointDiscoveryHint}
|
|
958
871
|
|
|
959
|
-
${buildScopeAssessmentSection(topN, maxGen)}
|
|
872
|
+
${buildScopeAssessmentSection(topN, maxGen, isUIOnlyPR)}
|
|
960
873
|
|
|
961
874
|
Draft tests from the endpoint interactions and source code above, following the same tool pipeline described in Tool Workflows below. Prioritize critical categories: security_boundary > data_integrity > business_rule > workflow > crud.
|
|
962
875
|
|
|
@@ -964,6 +877,8 @@ For each test: pick the highest-impact endpoint(s), draft a realistic scenario w
|
|
|
964
877
|
|
|
965
878
|
**Honor your Budget Plan: produce exactly the total you committed to (GENERATE + ADDITIONAL). No fewer, no padding with low-value tests.**
|
|
966
879
|
|
|
880
|
+
**Coverage breadth enforcement:** Your GENERATE items must span DIFFERENT HTTP methods or endpoints from your Coverage Reasoning surfaces. If you identified 5+ testable surfaces but all GENERATE items target the same method + path (e.g. all POST /permissions), you are violating diversity. Spread GENERATE slots across distinct surfaces; put remaining surfaces in ADDITIONAL recommendations.
|
|
881
|
+
|
|
967
882
|
## Recommendation Stability
|
|
968
883
|
- **Carry forward** previous additionalRecommendations that still apply — match by scenarioName (multi-step) or endpoint (single-endpoint). Re-derive category and priority from test content.
|
|
969
884
|
- **Only drop** a previous recommendation if its target endpoint was removed, its business logic changed, or it is now covered by a generated test.
|