@qulib/mcp 0.8.2 → 0.9.0

Files changed (3)

package/README.md CHANGED

@@ -43,22 +43,48 @@ For verbose server-side stderr logs while troubleshooting host wiring, add:
 }
 ```
-## What it does
+## MCP tools
-Tools:
+| Tool | Purpose |
+|---|---|
+| **`qulib_score_confidence`** | **Flagship.** Fuses evidence from `analyze_app`, `qulib_score_automation`, and `qulib_score_api` into one verdict: **ship / caution / hold / block** with a 0–100 confidence score, L1–L5 level, per-source contributions, honesty notes, and recommended next checks. Pass `url` and/or `repoPath`. |
+| `analyze_app` | Live-app quality scan: release confidence (0–100), axe-core a11y, broken links, console errors, prioritized gaps. Default payload is summary-first; pass `includeFullReport: true` for all scenarios. Optional form-login / storage-state auth. |
+| `qulib_score_automation` | Score a local repo's test-automation maturity across six dimensions (test coverage breadth, framework adoption, test-id hygiene, CI integration, auth test coverage, component test ratio) — plus a conditional 7th dimension (API coverage) when API endpoints are detected. Returns overall 0–100, level (L1–L5), and top recommendations. Each dimension carries `applicability`; score normalizes over applicable dimensions only. |
+| `qulib_score_api` | Discover API endpoints in a repo and score their test coverage. Tier1=OpenAPI specs, Tier2=framework routes (Next.js, Express, Fastify, NestJS), Tier3=heuristic opt-in (tRPC). Returns an api-test-coverage dimension score with per-endpoint evidence. |
+| `qulib_scaffold_tests` | Generate a ready-to-run test scaffold (Cypress or Playwright config + spec files) by crawling a deployed URL. Returns `generatedTests` and `projectConfig` so an agent can write files directly. Pass `recipes` (e.g. `["auth","a11y"]`) to append proven test patterns. |
+| `explore_auth` | List all sign-in paths (OAuth, SSO, forms, magic link) and what the agent must collect before `analyze_app`. Prefer on unfamiliar apps. |
+| `detect_auth` | Single-pass auth pattern guess with a recommendation. Lighter than `explore_auth`. |
-- **`explore_auth(url, timeoutMs?)`** — list all sign-in paths (OAuth, unknown SSO heuristics, forms, magic link) and what the agent must collect before `analyze_app`. Prefer this on unfamiliar apps.
-- **`analyze_app`** — quality scan (optional form-login or storage-state auth). **Default payload is summary-first:** `summary`, `topGaps`, `costIntelligenceSummary`, `nextDeterministicChecks`, small previews. Set **`includeFullReport: true`** for the full `analyzeApp` result (all scenarios). Optional harness overrides: **`llmMaxOutputTokensPerCall`**, **`llmTokenBudget`** (legacy), **`testGenerationLimit`**, **`enableLlmScenarios`** (default true when omitted).
-- **`detect_auth(url, timeoutMs?)`** — single-pattern auth guess with a short recommendation (lighter than `explore_auth`).
-- **`qulib_score_automation(repoPath, includeFullDimensions?)`** — score a local automation repo across six dimensions (test coverage breadth, framework adoption, test-id hygiene, CI integration, auth test coverage, component test ratio). Returns an overall 0–100 score, maturity level (L1–L5), and top recommendations. Each dimension carries an **`applicability`** field (`applicable` / `not_applicable` / `unknown`); the overall score normalizes across applicable dimensions only so absent capabilities never get silent partial credit. **`repoPath`** must be an absolute path on the MCP host. Pass **`includeFullDimensions: true`** for per-dimension evidence and reasons.
+**Example — flagship confidence call:**
-Returns from `analyze_app`:
+```
+qulib_score_confidence({ url: "https://example.com", repoPath: "/path/to/repo" })
+```
+Returns a verdict like:
+```json
+{
+  "releaseConfidence": {
+    "verdict": "caution",
+    "confidenceScore": 54,
+    "level": 3,
+    "label": "Moderate confidence — proceed with known risks",
+    "topRisks": ["Low crawl coverage (2 pages)", "No CI integration detected"],
+    "recommendedNextChecks": ["Add CI pipeline", "Increase crawl depth"],
+    "honestyNotes": ["API coverage: not_applicable (no API endpoints found — excluded from score)"]
+  }
+}
+```
+### `analyze_app` detail
+- **Default payload:** `summary`, `topGaps`, `costIntelligenceSummary`, `nextDeterministicChecks`, small previews.
+- **`includeFullReport: true`** — full `gapAnalysis` (all scenarios) and full `repoInventory`.
+- **`agentSummary: true`** — compact gate-decision payload (`pass`/`warn`/`fail`) for CI orchestrators.
+- Optional harness overrides: **`llmMaxOutputTokensPerCall`**, **`llmTokenBudget`** (legacy), **`testGenerationLimit`**, **`enableLlmScenarios`**.
-- Release confidence score (0-100)
-- Accessibility violations (axe-core, WCAG 2 A/AA)
-- Broken links
-- Console errors and coverage warnings
-- Prioritized gaps with severity
+Returns: release confidence score (0–100), accessibility violations (axe-core, WCAG 2 A/AA), broken links, console errors and coverage warnings, prioritized gaps with severity.
 Supports optional form-login auth for scanning authenticated pages. If auth is required but not configured, the scan can stop early with `mode: auth-required` and guidance in `detectedAuth` / the decision log.
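
One way a CI wrapper might act on the verdict shown in the README example is sketched here. The `gateFromVerdict` helper and its exit-code policy are illustrative assumptions and are not part of `@qulib/mcp`; only the verdict values (`ship`, `caution`, `hold`, `block`) and the `releaseConfidence` field names come from the README text in this diff.

```javascript
// Hypothetical helper, not part of @qulib/mcp: maps a parsed
// qulib_score_confidence payload to a CI gate decision.
function gateFromVerdict(payload) {
  const rc = payload && payload.releaseConfidence;
  if (!rc || typeof rc.verdict !== 'string') {
    // Treat a missing or malformed verdict as a failure rather than a pass.
    return { exitCode: 1, reason: 'no verdict in payload' };
  }
  switch (rc.verdict) {
    case 'ship':
      return { exitCode: 0, reason: 'ship' };
    case 'caution':
      // Assumed policy: let the build pass but surface the listed risks.
      return { exitCode: 0, reason: `caution: ${(rc.topRisks ?? []).join('; ')}` };
    case 'hold':
    case 'block':
      return { exitCode: 1, reason: rc.verdict };
    default:
      return { exitCode: 1, reason: `unrecognized verdict: ${rc.verdict}` };
  }
}

// Sample payload trimmed from the README example above.
const sample = {
  releaseConfidence: {
    verdict: 'caution',
    confidenceScore: 54,
    level: 3,
    topRisks: ['Low crawl coverage (2 pages)', 'No CI integration detected'],
  },
};
console.log(gateFromVerdict(sample));
```

Whether `caution` should pass or fail a pipeline is a team decision; the sketch only shows where that policy would live.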

package/dist/index.js CHANGED

@@ -14,7 +14,8 @@ import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
 import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
 const requirePkg = createRequire(import.meta.url);
 const pkg = requirePkg('../package.json');
-import { analyzeApp, detectAuth, exploreAuth, scanRepo, computeAutomationMaturity, scaffoldTests, discoverApiSurfaceWithRepo, computeApiCoverage, } from '@qulib/core';
+import { analyzeApp, detectAuth, exploreAuth, scanRepo, computeAutomationMaturity, scaffoldTests, discoverApiSurfaceWithRepo, computeApiCoverage, computeReleaseConfidence, buildConfidenceInputFromQulib, } from '@qulib/core';
+import { RecipeIdSchema } from '@qulib/core';
 import { z } from 'zod';
 import { buildAnalyzeAppMcpPayload } from './analyze-app-mcp-payload.js';
 import { log } from './logger.js';
@@ -274,18 +275,31 @@ const ScaffoldTestsInputSchema = z.object({
         .max(20)
         .optional()
         .describe('Max pages to crawl when running analyze_app internally. Default: 10'),
+    recipes: z
+        .array(RecipeIdSchema)
+        .optional()
+        .describe('Optional list of reusable test-pattern recipes to append to the scaffold. ' +
+        'Each recipe adds proven NQ-2/CaseLoom-derived scenarios: ' +
+        '"auth" = login/logout/protected-route flows; ' +
+        '"a11y" = heading/landmark/title accessibility checks; ' +
+        '"nav" = deep-link/browser-back/404 handling; ' +
+        '"seed" = data-seeding/state-reset helpers. ' +
+        'Recipe scenarios are APPENDED to crawl-derived scenarios — they never replace them. ' +
+        'Example: ["auth", "a11y"] adds 6 ready-to-run test scenarios.'),
 });
 mcpServer.registerTool('qulib_scaffold_tests', {
-    description: 'Generate a ready-to-run test scaffold for a deployed web app. Crawls the URL, identifies quality gaps and user flows, then produces framework-specific test files (Cypress or Playwright) plus the project config (cypress.config.ts or playwright.config.ts) and package.json deps. Returns generatedTests (array of {filename, code, outputPath}) and projectConfig so an agent can write the files directly to a repo without any manual test-writing.',
+    description: 'Generate a ready-to-run test scaffold for a deployed web app. Crawls the URL, identifies quality gaps and user flows, then produces framework-specific test files (Cypress or Playwright) plus the project config (cypress.config.ts or playwright.config.ts) and package.json deps. Returns generatedTests (array of {filename, code, outputPath}) and projectConfig so an agent can write the files directly to a repo without any manual test-writing. Optionally pass recipes (e.g. ["auth","a11y"]) to append proven NQ-2/CaseLoom-derived test patterns for common flows — auth adds login/logout/protected-route tests, a11y adds heading/landmark/title checks, nav adds deep-link/404 tests, seed adds state-reset helpers.',
     inputSchema: ScaffoldTestsInputSchema,
-}, async ({ url, framework, maxPagesToScan }) => {
+}, async ({ url, framework, maxPagesToScan, recipes }) => {
     try {
-        log.info(`qulib_scaffold_tests url=${url} framework=${framework ?? 'cypress-e2e'} maxPagesToScan=${maxPagesToScan ?? 10}`);
+        const recipesLog = recipes && recipes.length > 0 ? ` recipes=[${recipes.join(',')}]` : '';
+        log.info(`qulib_scaffold_tests url=${url} framework=${framework ?? 'cypress-e2e'} maxPagesToScan=${maxPagesToScan ?? 10}${recipesLog}`);
         const result = await scaffoldTests(url, {
             framework: framework ?? 'cypress-e2e',
             maxPagesToScan: maxPagesToScan ?? 10,
             progressLog: mcpProgressLog,
             telemetry: telemetrySink,
+            ...(recipes && recipes.length > 0 && { recipes }),
         });
         return {
             content: [
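
The `...(recipes && recipes.length > 0 && { recipes })` line in the hunk above relies on standard JavaScript behavior: spreading `undefined` or `false` into an object literal adds no keys. A minimal sketch of that idiom follows, with a hypothetical `buildOptions` helper standing in for the options object the handler passes to `scaffoldTests`.

```javascript
// Hypothetical stand-in for the options object built inside the handler.
function buildOptions(framework, maxPagesToScan, recipes) {
  return {
    framework: framework ?? 'cypress-e2e',
    maxPagesToScan: maxPagesToScan ?? 10,
    // When recipes is undefined or empty, the && chain yields a falsy
    // value and the spread contributes no "recipes" key at all.
    ...(recipes && recipes.length > 0 && { recipes }),
  };
}

const withRecipes = buildOptions('cypress-e2e', 5, ['auth', 'a11y']);
const withoutRecipes = buildOptions(undefined, undefined, []);
console.log(withRecipes);
console.log('recipes' in withoutRecipes); // false
```

Omitting the key entirely, rather than passing `recipes: undefined` or an empty array, leaves the callee's own default handling in charge.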
@@ -359,5 +373,113 @@ mcpServer.registerTool('qulib_score_api', {
         return toolError('QULIB_API_SCORE_FAILED', msg, err instanceof Error ? err.stack : undefined);
     }
 });
+// ---------------------------------------------------------------------------
+// qulib_score_confidence — P3 Release Confidence Aggregator
+// Composes existing collectors (analyze_app, qulib_score_automation,
+// qulib_score_api) into one fused Release Confidence verdict. Honors the
+// tool-explosion guardrail by composing, not fanning out (index.ts lines 4–10).
+// ---------------------------------------------------------------------------
+const ScoreConfidenceInputSchema = z.object({
+    url: z.string().url().optional().describe('URL of the deployed app to analyze (runs analyze_app if provided)'),
+    repoPath: z
+        .string()
+        .optional()
+        .describe('Absolute path to the repository (runs qulib_score_automation + qulib_score_api if provided)'),
+    includeViews: z
+        .object({
+        replay: z.boolean().optional().describe('Include the Replay provenance trace in the response'),
+    })
+        .optional()
+        .describe('Optional projection flags — which views to include beyond the Release Confidence view'),
+    subject: z
+        .object({
+        kind: z.enum(['release', 'pr', 'deploy', 'app', 'repo']).optional(),
+        ref: z.string().optional(),
+        tenantId: z.string().optional(),
+    })
+        .optional()
+        .describe('Subject metadata for the confidence verdict; defaults are inferred from url/repoPath'),
+});
+mcpServer.registerTool('qulib_score_confidence', {
+    description: 'Compute a fused Release Confidence verdict by composing qulib evidence collectors. ' +
+        'Given a URL and/or repo path, runs analyze_app / qulib_score_automation / qulib_score_api as applicable, ' +
+        'then fuses the signals into one verdict (ship | caution | hold | block) with a 0–100 confidence score, ' +
+        'L1–L5 level, per-source contributions, honesty notes for any excluded/unknown source, and recommended next checks. ' +
+        'Returns the Release Confidence view. Pass includeViews.replay for the full provenance trace.',
+    inputSchema: ScoreConfidenceInputSchema,
+}, async ({ url, repoPath, includeViews, subject }) => {
+    try {
+        const subjectRef = subject?.ref ?? url ?? repoPath ?? 'unknown';
+        const subjectKind = subject?.kind ?? (url && repoPath ? 'release' : url ? 'app' : 'repo');
+        const tenantId = subject?.tenantId ?? 'default';
+        const confidenceSubject = { kind: subjectKind, ref: subjectRef, tenantId };
+        // Collect evidence from whichever collectors apply.
+        let analyzeResult;
+        let maturityResult;
+        let apiCoverageResult;
+        if (url) {
+            log.info(`qulib_score_confidence: running analyze_app url=${url}`);
+            const harnessConfig = {
+                maxPagesToScan: 10,
+                maxDepth: 3,
+                minPagesForConfidence: 3,
+                timeoutMs: 30000,
+                retryCount: 0,
+                llmTokenBudget: 4096,
+                testGenerationLimit: 5,
+                enableLlmScenarios: false,
+                readOnlyMode: true,
+                requireHumanReview: false,
+                failOnConsoleError: false,
+                explorer: 'playwright',
+                defaultAdapter: 'playwright',
+                adapters: ['playwright'],
+            };
+            analyzeResult = await analyzeApp({
+                url,
+                writeArtifacts: false,
+                config: harnessConfig,
+                progressLog: mcpProgressLog,
+                telemetry: telemetrySink,
+            });
+        }
+        if (repoPath) {
+            const abs = validateAbsoluteRepoPath(repoPath);
+            log.info(`qulib_score_confidence: running qulib_score_automation + qulib_score_api repoPath=${abs}`);
+            const repo = await scanRepo(abs);
+            maturityResult = computeAutomationMaturity(repo);
+            const apiSurface = await discoverApiSurfaceWithRepo(abs, repo, { enableTier3: false });
+            apiCoverageResult = computeApiCoverage(repo, apiSurface);
+        }
+        // Build the evidence bundle from qulib's own collectors.
+        const confidenceInput = buildConfidenceInputFromQulib({
+            analyze: analyzeResult,
+            maturity: maturityResult,
+            apiCoverage: apiCoverageResult,
+            subject: confidenceSubject,
+        });
+        // Run the pure scorer.
+        const rc = computeReleaseConfidence(confidenceInput);
+        // Build the response payload (Release Confidence view is always included).
+        const payload = { releaseConfidence: rc };
+        if (includeViews?.replay) {
+            const { buildReplay } = await import('@qulib/core');
+            payload['replay'] = buildReplay(confidenceInput, rc);
+        }
+        log.info(`qulib_score_confidence done verdict=${rc.verdict} confidenceScore=${rc.confidenceScore ?? 'null'} ` +
+            `level=${rc.level} evidenceSources=${confidenceInput.evidence.length}`);
+        return {
+            content: [{ type: 'text', text: JSON.stringify(payload, null, 2) }],
+        };
+    }
+    catch (err) {
+        const msg = err instanceof Error ? err.message : String(err);
+        if (msg.includes('repoPath must')) {
+            return toolError('QULIB_INPUT_INVALID', msg, undefined);
+        }
+        log.error(`qulib_score_confidence failed: ${msg}`);
+        return toolError('QULIB_CONFIDENCE_FAILED', msg, err instanceof Error ? err.stack : undefined);
+    }
+});
 const transport = new StdioServerTransport();
 await mcpServer.connect(transport);
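
The subject defaults in the `qulib_score_confidence` handler can be read as a small pure function. The sketch below restates that logic in isolation so its fallthrough cases are easier to see; `inferSubject` is a hypothetical name, and the handler in this diff computes these values inline.

```javascript
// Hypothetical extraction of the defaulting logic from the handler above.
function inferSubject({ url, repoPath, subject } = {}) {
  // Explicit subject fields win; otherwise fall back to url, then repoPath.
  const ref = subject?.ref ?? url ?? repoPath ?? 'unknown';
  // Both inputs imply a release; url alone an app; anything else a repo.
  const kind = subject?.kind ?? (url && repoPath ? 'release' : url ? 'app' : 'repo');
  const tenantId = subject?.tenantId ?? 'default';
  return { kind, ref, tenantId };
}

console.log(inferSubject({ url: 'https://example.com', repoPath: '/path/to/repo' }));
console.log(inferSubject({ url: 'https://example.com' }));
console.log(inferSubject({ repoPath: '/path/to/repo' }));
console.log(inferSubject({}));
```

When neither `url` nor `repoPath` is supplied, the ternary falls through to `kind: 'repo'` with `ref: 'unknown'`, since both schema fields are optional.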

package/package.json CHANGED

@@ -1,7 +1,7 @@
 {
   "name": "@qulib/mcp",
-  "version": "0.8.2",
-  "description": "MCP server for Qulib — AI-callable QA gap analysis",
+  "version": "0.9.0",
+  "description": "MCP server for Qulib — AI-callable release confidence. Seven tools: fused verdict, live-app scan, automation maturity, API coverage, test scaffold, and auth tools.",
   "license": "MIT",
   "author": "Tapesh Nagarwal",
   "homepage": "https://github.com/TapeshN/qulib#readme",
@@ -28,12 +28,13 @@
   ],
   "scripts": {
     "build": "npm --prefix ../.. run build -w @qulib/core && tsc && chmod +x dist/index.js",
+    "prepublishOnly": "npm run build",
     "dev": "tsx src/index.ts",
-    "test": "node --import tsx/esm --test src/__tests__/summarize-analyze-result.test.ts src/__tests__/analyze-app-mcp-payload.test.ts"
+    "test": "node --import tsx/esm --test src/__tests__/summarize-analyze-result.test.ts src/__tests__/analyze-app-mcp-payload.test.ts src/__tests__/score-confidence-mcp.test.ts"
   },
   "dependencies": {
     "@modelcontextprotocol/sdk": "^1.0.0",
-    "@qulib/core": "0.8.2",
+    "@qulib/core": "0.9.0",
     "zod": "^3.23.0"
   },
   "devDependencies": {
@@ -43,10 +44,12 @@
   },
   "keywords": [
     "mcp",
+    "release-confidence",
     "qa",
     "quality",
+    "ship-verdict",
+    "automation-maturity",
     "accessibility",
-    "release-confidence",
     "ai"
   ]
 }