npm - @skyramp/mcp - Versions diffs - 0.0.58 → 0.0.59 - Mend

@skyramp/mcp 0.0.58 → 0.0.59

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/build/prompts/testbot/testbot-prompts.js +47 -12
package/build/tools/submitReportTool.js +10 -2
package/build/tools/test-recommendation/recommendTestsTool.js +27 -1
package/package.json +1 -1

package/build/prompts/testbot/testbot-prompts.js CHANGED

@@ -11,16 +11,30 @@ function getTestbotPrompt(prTitle, prDescription, diffFile, testDirectory, summa
 For all the following work, use the tools offered by Skyramp MCP server.
-First analyze the pull request title, description, and code changes to determine a business case
-justification for this code change.
 Then perform ALL of the following tasks. Every task is MANDATORY — do NOT skip any task based on your own judgment unless the task itself gives you an explicit condition to skip.
-## Task 1: Recommend New Tests (MANDATORY)
+## Task 1: Recommend New Tests (MANDATORY — but skip if no application code changed)
+Read the diff at \`${diffFile}\`. Classify EVERY changed file using these categories:
+**Non-application files (DO NOT generate tests for these):**
+- CI/CD workflow files (.github/workflows/*.yml, .gitlab-ci.yml, Jenkinsfile, etc.)
+- Markdown documentation (.md files, README, CHANGELOG, CONTRIBUTING, etc.)
+- Dependency lock files (package-lock.json, yarn.lock, Pipfile.lock, poetry.lock, Gemfile.lock, go.sum, etc.)
+- Configuration-only files (.gitignore, .editorconfig, .prettierrc, renovate.json, dependabot.yml, etc.)
+- License files (LICENSE, NOTICE, etc.)
-Read the diff at \`${diffFile}\`. Classify each changed file. A file is application source code if it is any of: a route/controller/handler, a model/schema/validator/serializer/DTO, business logic, middleware, service, utility, test helper, or has a source extension (.py, .ts, .js, .java, .go, .rb, .cs, .kt, .swift, etc.). When in doubt, treat the file as application source code.
+**Application source code (generate tests for these):**
+- Routes, controllers, handlers, API endpoints
+- Models, schemas, validators, serializers, DTOs
+- Business logic, services, middleware, utilities
+- Test helpers and test fixtures
+- Any file with a source extension (.py, .ts, .js, .java, .go, .rb, .cs, .kt, .swift, etc.) that is NOT in the non-application list above
-**DEFAULT: You MUST run steps 1–5 below.** The only exception is if you can confirm that EVERY changed file is exclusively a CI workflow YAML, markdown documentation, README, CHANGELOG, or a dependency lock file — and nothing else.
+**SKIP RULE — THIS IS MANDATORY:**
+If EVERY changed file in the diff falls into the "non-application files" category above, you MUST skip steps 1–6 entirely. Do NOT call \`skyramp_analyze_repository\`, do NOT call \`skyramp_map_tests\`, do NOT generate any tests. Instead, proceed directly to Task 2. In your report, state: "Task 1 skipped: PR contains only non-application changes (CI/docs/config)."
+**When in doubt:** If even ONE changed file looks like it could be application source code, run steps 1–6.
 1. Call \`skyramp_analyze_repository\` with:
    - \`repositoryPath\`: "${repositoryPath}"
@@ -29,21 +43,40 @@ Read the diff at \`${diffFile}\`. Classify each changed file. A file is applicat
 3. MANDATORY: Call \`skyramp_recommend_tests\` with the \`stateFile\` returned by \`skyramp_map_tests\`. Use the priority summary and the specific endpoints/files that changed to determine exactly what to test.
 4. Generate tests using the Skyramp MCP generate tools, in priority order (minimum 3 test types).
 5. Use Skyramp MCP to execute the generated tests and validate the results.
+6. **E2E / UI Test Generation from Trace Files**: Search the repository for existing Skyramp trace files that can be used for E2E or UI test generation. Look for:
+   - Backend trace files: files matching patterns like \`**/skyramp*trace*.json\`, \`**/skyramp-traces.json\`, or \`**/*trace*.json\` in test directories
+   - Playwright UI trace files: files matching patterns like \`**/skyramp*playwright*.zip\`, \`**/*playwright*.zip\`, or \`**/*ui*trace*.zip\`
+   Search in the test directory (\`${testDirectory}\`), the repository root, and any \`.skyramp/\` directories.
+   - If you find BOTH a backend trace file AND a Playwright trace ZIP, call \`skyramp_e2e_test_generation\` with both files to generate an E2E test.
+   - If you find ONLY a Playwright trace ZIP (no backend trace), call \`skyramp_ui_test_generation\` with the Playwright file to generate a UI test.
+   - When generating E2E/UI tests, use the same language and framework as other tests in the repository. Default to Python with pytest if no convention is detected.
+   - Execute any generated E2E/UI tests to validate them. Note: Playwright browsers are pre-installed in the CI environment.
 **IMPORTANT — Endpoint Renames:** If the diff shows an endpoint path was renamed (e.g. \`/products\` changed to \`/items\`) and existing tests already cover that endpoint under the old name, do NOT generate new tests for the renamed endpoint. The existing tests will be updated with the new path in Task 2 (Test Maintenance). Only generate new tests for genuinely new endpoints that have no existing test coverage under any name.
 ## Task 2: Existing Test Maintenance (MANDATORY)
-You MUST always run steps 1–4 below. Do NOT skip this task based on your own assessment of whether tests exist or are relevant — use the tools to determine that.
+You MUST always run the steps below. Do NOT skip this task based on your own assessment of whether tests exist or are relevant — use the tools to determine that.
 1. Call \`skyramp_discover_tests\` with \`repositoryPath\`: "${repositoryPath}" to find all existing Skyramp-generated tests.
-2. Call \`skyramp_analyze_test_drift\` with the \`stateFile\` returned by \`skyramp_discover_tests\`.
-3. Call \`skyramp_calculate_health_scores\` with the \`stateFile\` from the previous step.
-4. Call \`skyramp_actions\` with the updated \`stateFile\` to apply recommended updates.
+   You may skip the rest of this task ONLY if it explicitly returns zero Skyramp-generated tests.
+2. **Baseline — check for parallel CI first:**
+   a. Read the workflow files in \`.github/workflows/\` and check if any workflow (other than the Skyramp Testbot workflow) is triggered on \`pull_request\` AND runs tests against the test directory (look for commands like \`pytest\`, \`jest\`, \`npm test\`, \`go test\`, \`skyramp test\`, or similar test execution commands).
+   b. If such a workflow exists, run: \`gh run list --commit $(git rev-parse HEAD) --workflow <workflow-filename> --json status,conclusion --limit 1\` to check if it has completed for the current commit.
+   c. If the parallel workflow completed successfully — record beforeStatus as "Pass" for the discovered tests and note "baseline from CI workflow <workflow-name>" in beforeDetails. Skip to step 3.
+   d. If the parallel workflow completed with failure — record beforeStatus as "Fail" and capture the failure context in beforeDetails. Skip to step 3.
+   e. If no parallel test workflow exists, it hasn't completed yet, or the \`gh\` command fails for any reason (e.g. permissions, CLI not available) — execute ALL discovered tests AS-IS (before any modifications) using \`skyramp_execute_tests_batch\` or \`skyramp_execute_test\`. Record each test's status and details as the "before" results. In beforeDetails, describe the execution result (e.g. "Pass (10.8s)" or "Fail (404 Not Found)"). If you could not query CI, just note "unable to query existing CI pipeline" — do NOT expose internal details like authentication errors.
+3. Call \`skyramp_analyze_test_drift\` with the \`stateFile\` returned by \`skyramp_discover_tests\`.
+4. Call \`skyramp_calculate_health_scores\` with the \`stateFile\` from the previous step.
+5. Call \`skyramp_actions\` with the updated \`stateFile\`. This tool returns instructions describing what needs to change in each test file — it does NOT modify the files itself.
+6. **You MUST modify the existing test files in-place using your file editing tools.** Read the instructions from \`skyramp_actions\`, cross-reference with the code diff, and edit each test file directly.
    - If \`skyramp_actions\` returns endpoint rename mappings (old path → new path), apply them as simple find-and-replace on the test file URLs. Do NOT regenerate or restructure the test — only update the paths.
    - If \`skyramp_actions\` suggests file renames (e.g. \`products_smoke_test.py\` → \`items_smoke_test.py\`), rename the files using \`git mv\` after updating their content.
-5. Execute any updated or affected tests using Skyramp MCP and validate the results.
-6. You may skip this task ONLY if \`skyramp_discover_tests\` explicitly returns zero Skyramp-generated tests.
+   - The goal is to fix the discovered tests so they pass with the new code, preserving the original test structure and logic. Do NOT create new test files as a substitute for fixing existing ones.
+7. Execute the modified tests using Skyramp MCP and validate the results. This includes E2E and UI tests — Playwright browsers are pre-installed in the CI environment, so E2E/UI test execution is fully supported. Record each test's status and details as the "after" results.
+8. For each maintained test, report BOTH the before and after results in the \`testMaintenance\` array of the report (using the fileName, beforeStatus, beforeDetails, afterStatus, afterDetails fields), so the user has full visibility into whether the code change or the existing test was at fault.
 ## Task 3: Submit Report (MANDATORY)
@@ -59,6 +92,8 @@ Do NOT write the report to a file yourself. Do NOT skip this step. The skyramp_s
 ## Report Guidelines
+**businessCaseAnalysis:** Base this ONLY on facts from the PR title, description, and what the tools reported. If \`skyramp_analyze_repository\` reported 0 new endpoints, do NOT claim new endpoints were added — instead describe the change accurately (e.g. "frontend changes to consume existing API endpoints", "refactored service layer", "updated test configuration"). Never infer new backend endpoints from frontend fetch/API calls in the diff.
 When reporting test results, if you chose to skip executing a test, you MUST explain WHY you skipped it.
 NEVER use the phrase "CI timeout" or imply a timeout occurred unless a tool call actually timed out.
 Instead, set the status to "Skipped" and provide an honest reason in the details, for example:

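The skip rule added to Task 1 is given to the model as prose; nothing in this diff suggests the package ships a classifier for it. Purely as an illustration, the same rule could be sketched in JavaScript as follows. The names `isNonApplicationFile` and `shouldSkipTask1`, the exact regular expressions, and the handling of an empty file list are assumptions made for this sketch, chosen to mirror the example file names listed in the prompt.

```javascript
// Hypothetical sketch of the Task 1 skip rule; NOT part of @skyramp/mcp.
// Patterns mirror the "non-application files" examples listed in the prompt.
const NON_APPLICATION_PATTERNS = [
    /^\.github\/workflows\/.+\.ya?ml$/, // CI workflows (.yaml accepted as an assumption)
    /(^|\/)\.gitlab-ci\.yml$/,
    /(^|\/)Jenkinsfile$/,
    /\.md$/i, // Markdown documentation
    /(^|\/)(README|CHANGELOG|CONTRIBUTING)(\.[^/]*)?$/i,
    /(^|\/)(package-lock\.json|yarn\.lock|Pipfile\.lock|poetry\.lock|Gemfile\.lock|go\.sum)$/,
    /(^|\/)(\.gitignore|\.editorconfig|\.prettierrc|renovate\.json|dependabot\.yml)$/,
    /(^|\/)(LICENSE|NOTICE)(\.[^/]*)?$/,
];

// True when the path matches one of the non-application patterns above.
function isNonApplicationFile(path) {
    return NON_APPLICATION_PATTERNS.some((re) => re.test(path));
}

// Skip only when EVERY changed file is non-application. A single file that
// might be application code is enough to run the steps. An empty list is
// treated as "do not skip" (assumption: nothing confirms the skip condition).
function shouldSkipTask1(changedFiles) {
    return changedFiles.length > 0 && changedFiles.every(isNonApplicationFile);
}

console.log(shouldSkipTask1([".github/workflows/ci.yml", "README.md"])); // prints true
console.log(shouldSkipTask1(["README.md", "src/routes/items.js"])); // prints false
```

The sketch defaults to running the steps, in line with the prompt's "When in doubt" rule: any path that matches no pattern counts as possible application code.
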
package/build/tools/submitReportTool.js CHANGED

@@ -19,6 +19,14 @@ const newTestSchema = z.object({
 const descriptionSchema = z.object({
     description: z.string().describe("One-line description"),
 });
+const testMaintenanceSchema = z.object({
+    fileName: z.string().describe("Test file that was maintained, e.g. 'products_smoke_test.py'"),
+    description: z.string().describe("What was changed and why"),
+    beforeStatus: z.enum(["Pass", "Fail", "Error"]).describe("Test result BEFORE modification"),
+    beforeDetails: z.string().describe("Execution output/timing before modification, or 'baseline from CI workflow <name>' if a parallel workflow provided the baseline"),
+    afterStatus: z.enum(["Pass", "Fail", "Error", "Skipped"]).describe("Test result AFTER modification"),
+    afterDetails: z.string().describe("Execution output/timing after modification"),
+});
 export function registerSubmitReportTool(server) {
     server.registerTool(TOOL_NAME, {
         description: "Submit the final testbot report. Call this tool once after completing all test analysis, generation, and execution. " +
@@ -34,8 +42,8 @@ export function registerSubmitReportTool(server) {
                 .array(newTestSchema)
                 .describe("List of new tests created. Use empty array [] if none."),
             testMaintenance: z
-                .array(descriptionSchema)
-                .describe("List of existing test modifications. Use empty array [] if none."),
+                .array(testMaintenanceSchema)
+                .describe("List of existing test modifications with before/after execution results. Use empty array [] if none."),
             testResults: z
                 .array(testResultSchema)
                 .describe("List of ALL test execution results. One entry per test executed."),
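
For orientation, an entry in the new `testMaintenance` array might look like the object below. The values are invented sample data, and `checkMaintenanceEntry` is a hypothetical, dependency-free helper that mirrors the constraints of `testMaintenanceSchema`; the package itself validates with zod, as the hunk above shows.

```javascript
// Allowed values copied from testMaintenanceSchema in the diff above.
const BEFORE_STATUSES = ["Pass", "Fail", "Error"];
const AFTER_STATUSES = ["Pass", "Fail", "Error", "Skipped"];
const STRING_FIELDS = ["fileName", "description", "beforeDetails", "afterDetails"];

// Hypothetical helper: returns a list of problems, empty when the entry is valid.
function checkMaintenanceEntry(entry) {
    const problems = [];
    for (const field of STRING_FIELDS) {
        if (typeof entry[field] !== "string") {
            problems.push(`${field} must be a string`);
        }
    }
    if (!BEFORE_STATUSES.includes(entry.beforeStatus)) {
        problems.push("beforeStatus must be Pass, Fail, or Error");
    }
    if (!AFTER_STATUSES.includes(entry.afterStatus)) {
        problems.push("afterStatus must be Pass, Fail, Error, or Skipped");
    }
    return problems;
}

// Illustrative entry; the file name and details are made-up sample values.
const sampleEntry = {
    fileName: "products_smoke_test.py",
    description: "Updated endpoint path from /products to /items",
    beforeStatus: "Fail",
    beforeDetails: "Fail (404 Not Found)",
    afterStatus: "Pass",
    afterDetails: "Pass (10.8s)",
};

console.log(checkMaintenanceEntry(sampleEntry)); // prints []
```

Note the asymmetry taken from the schema: "Skipped" is accepted for `afterStatus` but not for `beforeStatus`, so a baseline result must always be Pass, Fail, or Error.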

package/build/tools/test-recommendation/recommendTestsTool.js CHANGED

@@ -137,6 +137,32 @@ ${diff.changedFiles.map((f) => `- \`${f}\``).join("\n")}
                 .join("\n");
             const highActions = buildActionList(mapping.summary.highPriority);
             const mediumActions = buildActionList(mapping.summary.mediumPriority);
+            // Check if E2E or UI tests are in the priority lists
+            const allPriority = [
+                ...mapping.summary.highPriority,
+                ...mapping.summary.mediumPriority,
+            ];
+            const hasE2EOrUI = allPriority.some((t) => t === TestType.E2E || t === TestType.UI);
+            const traceGuidance = hasE2EOrUI
+                ? `
+### Trace Files for E2E/UI Tests
+E2E and UI test generation requires pre-recorded trace files. Search the repository for:
+- Backend traces: \`**/skyramp*trace*.json\`, \`**/skyramp-traces.json\`
+- Playwright traces: \`**/skyramp*playwright*.zip\`, \`**/*playwright*.zip\`
+Look in the test directory, repository root, and \`.skyramp/\` directories.
+**IMPORTANT — Verify trace relevance before using it:**
+Before passing a trace file to a test generation tool, inspect its contents to confirm it actually exercises the UI components or pages affected by the PR. A trace recorded before the current changes will not cover new UI elements. If the trace does NOT cover the changed UI:
+- Do NOT use it for generating tests for the new changes.
+- Report in \`issuesFound\`: "A Playwright trace file was found (<filename>) but it does not cover the new UI changes in this PR. To generate UI tests for the new functionality, record a new trace that exercises the changed pages/components and commit it, then re-run the Testbot."
+- **Both found and relevant** → call \`skyramp_e2e_test_generation\` with both trace files
+- **Only Playwright ZIP found and relevant** → call \`skyramp_ui_test_generation\` with the Playwright file
+- **No traces found** → do NOT silently skip. Include in \`issuesFound\` when submitting your report: "E2E/UI tests were recommended but could not be generated because no Playwright trace file (.zip) was found in the repository. To enable E2E/UI test generation, record a Playwright trace and commit the .zip file, then re-run the Testbot."
+`
+                : "";
             const nextActionsSection = mapping.summary.highPriority.length > 0 ||
                 mapping.summary.mediumPriority.length > 0
                 ? `
@@ -148,7 +174,7 @@ Do NOT skip any. Do NOT just run existing tests — generate new ones.
 ### High Priority (call these first)
 ${highActions || "none"}
-${mediumActions ? `### Medium Priority (call after high)\n${mediumActions}\n` : ""}${isDiffScope && ((analysis?.branchDiffContext?.newEndpoints?.length ?? 0) + (analysis?.branchDiffContext?.modifiedEndpoints?.length ?? 0)) > 0 ? `\nTarget the changed endpoint(s) listed above for each generated test. Use the full URL (including base URL) as the \`endpointURL\` parameter when calling generate tools.` : ""}
+${mediumActions ? `### Medium Priority (call after high)\n${mediumActions}\n` : ""}${isDiffScope && ((analysis?.branchDiffContext?.newEndpoints?.length ?? 0) + (analysis?.branchDiffContext?.modifiedEndpoints?.length ?? 0)) > 0 ? `\nTarget the changed endpoint(s) listed above for each generated test. Use the full URL (including base URL) as the \`endpointURL\` parameter when calling generate tools.` : ""}${traceGuidance}
 `
                 : "";
             const output = `# Test Recommendations
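
The `traceGuidance` text added in this hunk describes a small decision table that the model is expected to follow. A minimal sketch of that routing is shown below, under stated assumptions: `chooseTraceAction` and its flag-style inputs are hypothetical, a single relevance flag stands in for inspecting the trace contents, and treating "backend trace only" the same as "no Playwright trace" is one reading of the guidance, not something the diff states outright.

```javascript
// Hypothetical sketch of the routing described in traceGuidance; NOT package code.
// Inputs are plain booleans; finding and inspecting trace files is out of scope.
function chooseTraceAction({ backendTrace, playwrightTrace, playwrightRelevant }) {
    if (!playwrightTrace) {
        // "No traces found" branch: do not skip silently, report via issuesFound.
        return { action: "report_issue", reason: "no Playwright trace (.zip) found" };
    }
    if (!playwrightRelevant) {
        // Trace exists but predates or misses the changed UI: report, do not use it.
        return { action: "report_issue", reason: "trace does not cover the changed UI" };
    }
    if (backendTrace) {
        // Both traces found and relevant: generate an E2E test.
        return { action: "call_tool", tool: "skyramp_e2e_test_generation" };
    }
    // Only a relevant Playwright ZIP: generate a UI test.
    return { action: "call_tool", tool: "skyramp_ui_test_generation" };
}

console.log(chooseTraceAction({ backendTrace: true, playwrightTrace: true, playwrightRelevant: true }).tool);
// prints skyramp_e2e_test_generation
```

The relevance check comes before the choice of tool, matching the guidance that a stale trace should be reported in `issuesFound` and not passed to a generation tool.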

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@skyramp/mcp",
-  "version": "0.0.58",
+  "version": "0.0.59",
   "main": "build/index.js",
   "type": "module",
   "bin": {