@skyramp/mcp 0.1.0-rc.6 → 0.1.0

@@ -30,8 +30,10 @@ export async function registerPlaywrightTools(server, options) {
  'browser_snapshot',
  'browser_click',
  'browser_type',
+ 'browser_press_key',
  'browser_select_option',
  'browser_hover',
+ 'browser_drag',
  'browser_tabs',
  'browser_navigate_back',
  'browser_wait_for',
@@ -93,8 +93,18 @@ ${nextStep}`;
  ? `\n<diff>\n${p.diffContent}\n</diff>`
  : "";
  const step2 = isUIOnly
- ? `### Step 2: Identify consumed API endpoints
- UI-only PR — read changed components to find API calls (fetch, axios, hooks).`
+ ? `### Step 2: Identify consumed API endpoints and integration status
+ UI-only PR — perform two checks:
+ 1. Read changed frontend files to find API calls (fetch, axios, hooks).
+ 2. For each changed component file (skip CSS/HTML/style-only files — they have no exported component name to search for): check whether any production source file imports, re-exports, or renders it.
+ - Search for both the component's exported name AND its module path/filename to catch aliased and default imports (e.g. \`import Foo from './CartLine'\`).
+ - Derive the exported name from the file itself: use the default export name, a named exported PascalCase component, or the PascalCase file basename when no clearer name exists.
+ - Exclude test/story files from the search: ignore matches in \`*.test.*\`, \`*.spec.*\`, \`*.stories.*\`, and \`__tests__/\` directories — only production code imports count as integration.
+
+ If no production file imports, re-exports, or renders a changed component, mark it as **unintegrated** in the Execution Plan output.
+ Exception: if the same PR also adds a route/page file (e.g. under Next.js \`pages/\` or \`app/\`) that imports the component, the route IS the integration point — do NOT mark it as unintegrated.
+ Do NOT apply the unintegrated heuristic to route/entrypoint files themselves — those are always reachable by convention.
+ An unintegrated non-route component has no DOM node in the running app and cannot be browser-tested — it qualifies as a dead-code / unintegrated-component no-surface PR regardless of how complex the component logic is.`
  : p.diffContent
  ? `### Step 2: Extract new and modified API endpoints from the diff
  Read the \`<diff>\` above and identify every new or modified API endpoint — route registrations, handler methods, controller annotations. Then use the **Router Mounting / Nesting** section above to reconstruct the full URL path for each endpoint by chaining all parent router prefixes down to the handler (e.g. a handler in a file with \`prefix="/reviews"\` that is mounted at \`/{product_id}\` under a router mounted at \`/api/v1/products\` → full path \`/api/v1/products/{product_id}/reviews\`).
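
The integration check that the new Step 2 prompt describes is carried out by the agent, not by code in this package. As a rough illustration of what the rules amount to, here is a minimal sketch. The helper names (`isTestOrStoryFile`, `pascalCaseBasename`, `isIntegrated`) and the in-memory `sources` map are hypothetical and do not appear in `@skyramp/mcp`; the search is a naive substring match, not a real import resolver.

```typescript
// Hypothetical helpers for illustration only; not part of @skyramp/mcp.

/** True for paths the prompt excludes: *.test.*, *.spec.*, *.stories.*, __tests__/ */
function isTestOrStoryFile(path: string): boolean {
  const parts = path.replace(/\\/g, "/").split("/");
  if (parts.includes("__tests__")) return true;
  const base = parts[parts.length - 1] ?? "";
  return /\.(test|spec|stories)\./.test(base);
}

/** Fallback component name: PascalCase form of the file basename. */
function pascalCaseBasename(path: string): string {
  const file = path.replace(/\\/g, "/").split("/").pop() ?? "";
  const base = file.replace(/\.[^.]+$/, ""); // drop the final extension
  return base
    .split(/[^A-Za-z0-9]+/)
    .filter(Boolean)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join("");
}

/**
 * Naive check: a component counts as integrated when any production file
 * other than itself mentions its exported name or its module filename.
 * `sources` maps file path -> file contents (an assumption of this sketch).
 */
function isIntegrated(
  componentPath: string,
  exportedName: string,
  sources: Record<string, string>,
): boolean {
  const file = componentPath.replace(/\\/g, "/").split("/").pop() ?? "";
  const moduleName = file.replace(/\.[^.]+$/, "");
  return Object.keys(sources).some((path) => {
    if (path === componentPath || isTestOrStoryFile(path)) return false;
    const text = sources[path] ?? "";
    return text.includes(exportedName) || text.includes(moduleName);
  });
}
```

Searching for the module filename as well as the exported name is what lets an aliased default import such as `import Foo from './CartLine'` count as integration, which is the case the prompt calls out.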
@@ -51,7 +51,7 @@ Use those recommendations as your baseline. Only add or remove tests that the us
 
  1. Call \`skyramp_analyze_changes\` with \`repositoryPath\`: "${repositoryPath}", \`scope\`: "branch_diff", \`topN\`: ${maxRecommendations}, \`maxGenerate\`: ${maxGenerate}${baseBranch ? `, \`baseBranch\`: "${baseBranch}"` : ""}${prNumber ? `, \`prNumber\`: ${prNumber}` : ""}${stateOutputFile ? `, \`stateOutputFile\`: "${stateOutputFile}"` : ""} — discovers existing Skyramp tests, scans endpoints changed in the diff, loads workspace config, and returns ${maxRecommendations} ranked ADD recommendations (${maxGenerate} to generate, ${maxRecommendations - maxGenerate} as additional).${prNumber ? " Uses PR comment history to avoid re-recommending already-generated tests." : ""}
  **If \`skyramp_analyze_changes\` returns an error:** retry once only if the error is transient (timeout, network blip, temporary unavailability) — do NOT retry for permanent errors (invalid repository path, missing required parameter, authentication failure). If it fails again, call \`skyramp_submit_report\` with a minimal valid payload: leave all test arrays empty and add the error to \`issuesFound\`. Refer to the \`skyramp_submit_report\` schema for required fields. Do NOT attempt Task 2 without a valid stateFile.
- **If all changed files are non-application** (CI/CD, docs, lock files, config) → skip to Task 3 (Submit Report) with empty arrays.
+ **If all changed files are non-application** (CI/CD, docs, lock files, config) → skip to Task 3 (Submit Report) with empty arrays and a single \`issuesFound\` entry explaining why (same format as the zero-test path below).
 
  2. **Maintain existing tests** using the rules in \`<drift_analysis_rules>\` below. For each existing test reported by \`skyramp_analyze_changes\`, score it and choose the action exactly as directed by the Action Decision Matrix in \`<drift_analysis_rules>\`. Only read test files that require action per that matrix — do NOT read files that will be IGNORED. **Do NOT read source files (routers, models, CRUD, components) — all the information you need is in the \`skyramp_analyze_changes\` output and the diff.** When reading multiple test files, **read them all in a single parallel batch** — do NOT read them one at a time. Apply actions directly. Results go in \`testMaintenance\`.
 
@@ -98,7 +98,16 @@ ${userPrompt ? "" : "Drift-based maintenance (Task 1) is complete. This step onl
  Keep advancing until you have created exactly ${maxGenerate} new test files OR exhausted all candidates.
  - **Example**: If enrichment reveals that sending \`discount_value\` without \`discount_type\` silently orphans the value (a concrete bug), complete all planned GENERATE items first, then generate this discovered scenario as an extra test and report it in \`newTestsCreated\`.
  - **Total generated**: Follow the **"Budget: N generate"** line in the Execution Plan. Process every GENERATE-tagged item in order. Items that become UPDATEs (covered resource) do not count — backfill from ADDITIONAL candidates (highest-ranked first) until \`newTestsCreated\` reaches ${maxGenerate} or all candidates are exhausted.
- - **UI test priority**: If the diff contains frontend/UI changes (e.g. \`.tsx\`, \`.jsx\`, \`.vue\`, \`.svelte\` files), you MUST attempt to generate at least one UI test. Use \`browser_navigate\` to the app's base URL — if the app responds, record a trace and generate the test. Only skip if the app is unreachable. This takes priority over generating additional backend-only tests.
+ - **UI test priority**: If the diff contains frontend/UI changes (e.g. \`.tsx\`, \`.jsx\`, \`.vue\`, \`.svelte\` files), you MUST attempt to generate at least one UI test. Use \`browser_navigate\` to the app's base URL — if the app responds, record a trace and generate the test.
+ **Skip only if one of these conditions is met:**
+ - **(a) App is unreachable** — \`browser_navigate\` fails or connection is refused.
+ - **(b) Unintegrated non-route component** — the changed file is a leaf component (not a framework route/entrypoint) that has no integration point in the running app. To confirm:
+ 1. Grep for the component's exported name AND its module path/filename across all production source files (excluding \`*.test.*\`, \`*.spec.*\`, \`*.stories.*\`, \`__tests__/\` directories — only production code imports count).
+ 2. If no production file imports, re-exports, or renders it, the component has no DOM node in the running app → unintegrated.
+ 3. **Exception**: if the same PR also adds a route/page file (e.g. under Next.js \`pages/\` or \`app/\`) that imports the component, the route IS the integration point — test through it.
+ **Never** apply the unintegrated heuristic to framework route/entrypoint files themselves — those are always reachable by convention.
+ **Never** generate tests for unrelated pages as a substitute for an unintegrated component.
+ This rule takes priority over generating additional backend-only tests.
  - **Always generate a test for critical bugs, even if it will fail.** When a GENERATE-tagged item targets a page or endpoint with a known bug, do NOT skip it because you expect the test to fail — a failing test that documents a bug is more valuable than a text-only description. This applies within the existing GENERATE budget; do not add extra tests beyond the plan.
  - For UI rendering bugs: navigate to the broken page and add a \`browser_assert\` that verifies the page rendered its expected content (e.g. assert the page heading is visible). The assertion will fail on the broken page, which is the correct outcome — it documents the bug as a failing test.
  - The assertion MUST target the broken page itself, not a different page that works. If \`/orders/{id}/edit\` crashes, assert on \`/orders/{id}/edit\` (e.g. "Edit Order" heading visible), NOT on \`/orders\`.
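
The revised "UI test priority" rule replaces a single skip condition with two, plus an exception and a carve-out for route files. One way to read it is as a small decision function. The sketch below is an interpretation, not code from the package: the `UiChange` fields and the `decideUiTest` name are hypothetical, and the order in which the conditions are checked is an assumption, since the prompt does not specify one.

```typescript
// Hypothetical model of the skip rule; field and function names are illustrative.
type UiChange = {
  appReachable: boolean;           // did browser_navigate to the base URL succeed?
  isRouteOrEntrypoint: boolean;    // framework route/page file, always reachable by convention
  hasProductionImporter: boolean;  // outcome of the grep over production source files
  prAddsRouteImportingIt: boolean; // exception: same PR adds a route/page that imports it
};

type UiDecision =
  | { generate: true }
  | { generate: false; reason: "app_unreachable" | "unintegrated_component" };

function decideUiTest(change: UiChange): UiDecision {
  // Condition (a): nothing can be recorded if the app does not respond.
  if (!change.appReachable) {
    return { generate: false, reason: "app_unreachable" };
  }
  // The unintegrated heuristic is never applied to route/entrypoint files.
  if (change.isRouteOrEntrypoint) {
    return { generate: true };
  }
  // Integrated leaf component, or the exception where a new route imports it.
  if (change.hasProductionImporter || change.prAddsRouteImportingIt) {
    return { generate: true };
  }
  // Condition (b): leaf component with no integration point in the running app.
  return { generate: false, reason: "unintegrated_component" };
}
```

Returning a reason alongside the skip keeps the two cases distinguishable, which matters because the prompt also forbids substituting a test of an unrelated page when the component is unintegrated.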
@@ -197,8 +206,7 @@ If a test **generation** tool call fails:
  2. If it fails again, **skip** that candidate and move to the next ranked candidate.
  3. If all candidates in the GENERATE set fail, fall back to generating the **simplest possible test**: a single contract test for the highest-scored endpoint (GET → 200 or POST → 201).
  **Exception — frontend-only PRs**: If the diff modifies ONLY frontend files (\`.tsx\`, \`.jsx\`, \`.vue\`, \`.svelte\`, \`.css\`, \`.html\`) AND browser recording was not possible, do NOT generate a backend fallback contract test — it is irrelevant to the PR. Instead move ALL GENERATE candidates to \`additionalRecommendations\` and proceed to Task 3.
- 4. You MUST generate **at least 1 test** for any PR that touches application code. Zero generated tests is NOT acceptable — unless the frontend-only exception above applies.
- 5. Log skipped candidates in \`issuesFound\` with the error message.
+ 4. Log skipped candidates in \`issuesFound\` with the error message.
 
  If a test **execution** (\`skyramp_execute_test\`) fails for a newly generated test:
  1. Read the error output to diagnose the root cause (4xx on prereq step, assertion mismatch, floating-point precision, 500 from app bug, timeout, etc.).
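
With the "at least 1 test" requirement removed, the fallback in step 3 and its frontend-only exception are the only remaining rules for the case where every GENERATE candidate fails. A minimal sketch of that branch follows, assuming the changed files are available as a list of paths. `isFrontendOnly`, `fallbackWhenAllGenerateFail`, and the `Fallback` labels are hypothetical names; only the extension list is taken from the prompt text.

```typescript
// Extension list copied from the prompt; everything else here is illustrative.
const FRONTEND_EXTENSIONS = [".tsx", ".jsx", ".vue", ".svelte", ".css", ".html"];

/** True when the diff is non-empty and every changed file has a frontend extension. */
function isFrontendOnly(changedFiles: string[]): boolean {
  return (
    changedFiles.length > 0 &&
    changedFiles.every((file) =>
      FRONTEND_EXTENSIONS.some((ext) => file.toLowerCase().endsWith(ext)),
    )
  );
}

type Fallback = "contract_test" | "move_to_additional_recommendations";

/** What to do once every candidate in the GENERATE set has failed. */
function fallbackWhenAllGenerateFail(
  changedFiles: string[],
  browserRecordingPossible: boolean,
): Fallback {
  // Exception: a backend contract test is irrelevant to a frontend-only PR.
  if (isFrontendOnly(changedFiles) && !browserRecordingPossible) {
    return "move_to_additional_recommendations";
  }
  // Default: simplest possible contract test for the highest-scored endpoint.
  return "contract_test";
}
```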
@@ -235,9 +243,16 @@ Do not make any changes other than the assertion enhancements described above. F
  **Before calling \`skyramp_submit_report\` — mandatory count check:**
  If you skipped here due to non-application changes (per Task 1), submit with empty arrays — the count checks below do not apply.
 
- **If you generated zero new tests because the PR has no testable behavioral surface (cosmetic/docs/style/dependency-only):**
+ **If you generated zero new tests because the PR has no testable behavioral surface:**
+ This applies when the diff contains ONLY changes with no observable API or UI behavior change. Examples:
+ - Cosmetic/docs/style: JSDoc updates, CSS reformats, comment-only changes
+ - Dependency-only: version bumps with no API surface change
+ - Dead code / unintegrated utility or component: a new helper function, utility, or UI component added to the codebase but not imported, mounted, or rendered anywhere — use this classification only after confirming the new symbol does not appear as an import or render call in any other source file; do NOT classify as dead code based solely on the diff. For UI components specifically: an unintegrated component has no DOM node in the running app and cannot be browser-tested regardless of how complex its logic is
+ - Config-only: linter rules, build config, environment variable additions with no runtime behavior change
+
+ In these cases:
  - \`newTestsCreated\` must be \`[]\`
- - Add exactly one entry to \`issuesFound\`: \`"No testable behavioral surface detected: <brief reason, e.g. 'JSDoc-only changes with no endpoint modifications', 'CSS reformat with no logic changes', 'dependency version bump with no API surface change'>. Zero new tests generated by design."\`
+ - Add exactly one entry to \`issuesFound\`: \`"No testable behavioral surface detected: <brief reason, e.g. 'JSDoc-only changes with no endpoint modifications', 'CSS reformat with no logic changes', 'dependency version bump with no API surface change', 'utility function added but not integrated into any endpoint or component'>. Zero new tests generated by design."\`
  - \`businessCaseAnalysis\` must be a one-sentence summary of what the PR actually does (do NOT leave it blank)
  - \`additionalRecommendations\` must be \`[]\` — do NOT recommend tests for a no-surface PR
  - A blank \`issuesFound\` when tests were intentionally skipped will lose report quality points
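
The four field constraints for a no-surface PR can be pictured as a report fragment. The sketch below covers only the fields named in the prompt (`newTestsCreated`, `issuesFound`, `businessCaseAnalysis`, `additionalRecommendations`). The real `skyramp_submit_report` schema is not shown in this diff and may require more, and the element types of the two arrays are not stated, so they are typed as `unknown[]` here. `buildNoSurfaceFields` is a hypothetical helper.

```typescript
// Only the four fields named in the prompt; the full report schema is not shown in the diff.
interface NoSurfaceFields {
  newTestsCreated: unknown[];           // must be []
  issuesFound: string[];                // exactly one entry
  businessCaseAnalysis: string;         // one-sentence summary, never blank
  additionalRecommendations: unknown[]; // must be []
}

/** Builds the no-surface fields using the message format quoted in the prompt. */
function buildNoSurfaceFields(reason: string, summary: string): NoSurfaceFields {
  if (summary.trim() === "") {
    throw new Error("businessCaseAnalysis must not be blank");
  }
  return {
    newTestsCreated: [],
    issuesFound: [
      `No testable behavioral surface detected: ${reason}. Zero new tests generated by design.`,
    ],
    businessCaseAnalysis: summary,
    additionalRecommendations: [],
  };
}

// Example for the case this release adds: an unintegrated utility.
const fields = buildNoSurfaceFields(
  "utility function added but not integrated into any endpoint or component",
  "Adds a currency-formatting helper that no module imports yet.",
);
```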
@@ -54,7 +54,7 @@ describe("dockerImageExistsLocally", () => {
  });
  });
  describe("pullDockerImage", () => {
- const IMAGE = "skyramp/executor:v1.3.21";
+ const IMAGE = "skyramp/executor:v1.3.22";
  beforeEach(() => jest.clearAllMocks());
  describe("on amd64 host", () => {
  const originalArch = process.arch;
@@ -1,3 +1,3 @@
- export const SKYRAMP_IMAGE_VERSION = "v1.3.21";
+ export const SKYRAMP_IMAGE_VERSION = "v1.3.22";
  export const EXECUTOR_DOCKER_IMAGE = `skyramp/executor:${SKYRAMP_IMAGE_VERSION}`;
  export const WORKER_DOCKER_IMAGE = `skyramp/worker:${SKYRAMP_IMAGE_VERSION}`;