@mnapoli/exspec 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (32)

package/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2026 Matthieu Napoli
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,100 @@
+# exspec
+**Executable specs** — run Gherkin feature files with an AI agent in the browser.
+exspec parses `.feature` files, launches a Claude agent restricted to browser-only interaction (Playwright, headless), and produces a test report. Feature files can be written in any language supported by Gherkin (English, French, German, Spanish, [70+ languages](https://cucumber.io/docs/gherkin/languages/)).
+## Install
+```bash
+npm install -D @mnapoli/exspec
+```
+## Prerequisites
+- [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated
+## Usage
+```bash
+# Run all feature files in features/
+npx exspec
+# Run a specific file
+npx exspec features/Auth/Login.feature
+# Run all features in a directory
+npx exspec features/Auth/
+# Filter by scenario name
+npx exspec features/Auth/Login.feature --filter "invalid password"
+# Stop at first failure
+npx exspec --fail-fast
+# Run with visible browser (for debugging)
+npx exspec --headed
+```
+## Configuration
+### `exspec.md`
+Create a `features/exspec.md` file. Its content is passed to the AI agent as context.
+```markdown
+# QA Configuration
+## Application
+URL: http://localhost:3000
+This is an e-commerce app. The user is a store manager. For detailed feature documentation, see the `docs/` directory.
+## Authentication
+Use the `test@example.com` / `password` credentials for authentication.
+## Browser
+Resolution: 1920x1080
+```
+The agent reads this file as context, so you can reference any project documentation here, or give it extra instructions.
+### Environment variables
+If your project has a `.env` file, exspec loads it automatically. You can then reference environment variables in `exspec.md` using `$VAR` or `${VAR}` syntax; they are resolved before the config is passed to the agent.
+```markdown
+URL: $APP_URL
+```
+This is useful for dynamic URLs across environments (e.g. with git worktrees). If a variable is not defined, the reference is left as-is.
+## How it works
+1. Loads `.env` (if present) and `exspec.md` (with variable expansion)
+2. Discovers and parses `.feature` files (supports all Gherkin languages)
+3. Groups scenarios by domain (subdirectory of `features/`)
+4. For each domain, invokes Claude CLI with:
+   - Only Playwright tools available (browser-only, no database or code access)
+   - Playwright in headless mode (or headed with `--headed`)
+   - Feature content + context docs + config as prompt
+5. Parses results (PASS/FAIL/SKIP) and writes them to `features/exspec/`
+## Results
+Results are written to `features/exspec/{YYYY-MM-DD-HHmm}.md` with failure screenshots in the corresponding directory.
+The CLI exits with code `1` if any tests fail (CI-friendly).
+## Agent restrictions
+The AI agent can ONLY use Playwright browser tools. It cannot:
+- Access the database
+- Read or modify source code
+- Execute shell commands
+If a scenario cannot be verified through the browser, it is marked as FAIL.

package/dist/cli.d.ts ADDED

@@ -0,0 +1,2 @@
+#!/usr/bin/env node
+export {};

package/dist/cli.js ADDED

@@ -0,0 +1,129 @@
+#!/usr/bin/env node
+import { readFileSync, existsSync } from "fs";
+import { resolve } from "path";
+import { loadDotenv, expandVars } from "./env.js";
+import { discoverFeatures } from "./discovery.js";
+import { parseFeature, filterScenarios, groupByDomain } from "./gherkin.js";
+import { buildPrompt } from "./prompt.js";
+import { runDomain } from "./runner.js";
+import { generateRunId, initResultsFile, appendDomainResults, appendSummary, } from "./reporter.js";
+const projectRoot = resolve(process.cwd());
+// Parse arguments
+const args = process.argv.slice(2);
+let target;
+let filter = null;
+let failFast = false;
+let headed = false;
+for (let i = 0; i < args.length; i++) {
+    if (args[i] === "--filter" && args[i + 1]) {
+        filter = args[++i];
+    }
+    else if (args[i] === "--fail-fast") {
+        failFast = true;
+    }
+    else if (args[i] === "--headed") {
+        headed = true;
+    }
+    else if (!args[i].startsWith("--")) {
+        target = args[i];
+    }
+}
+// Load .env if it exists (populates process.env)
+loadDotenv(projectRoot);
+// Load config
+const configPath = resolve(projectRoot, "features", "exspec.md");
+if (!existsSync(configPath)) {
+    console.error("features/exspec.md not found.");
+    console.error("Create a features/exspec.md file with your QA configuration.");
+    process.exit(1);
+}
+// Resolve $VAR and ${VAR} references in the config using process.env
+const configContent = expandVars(readFileSync(configPath, "utf-8"));
+// Discover and parse features
+let featureFiles;
+try {
+    featureFiles = discoverFeatures(projectRoot, target);
+}
+catch (error) {
+    console.error(error instanceof Error ? error.message : String(error));
+    process.exit(1);
+}
+if (featureFiles.length === 0) {
+    console.error("No .feature files found.");
+    process.exit(1);
+}
+let features = featureFiles.map((f) => parseFeature(f, projectRoot));
+if (filter) {
+    features = filterScenarios(features, filter);
+    if (features.length === 0) {
+        console.error(`No scenarios matching filter "${filter}".`);
+        process.exit(1);
+    }
+}
+const domains = groupByDomain(features);
+const totalScenarios = features.reduce((sum, f) => sum + f.scenarios.length, 0);
+// Display test plan
+console.log(`\nSuite: ${totalScenarios} scenario(s) in ${domains.size} domain(s)\n`);
+for (const [domain, domainFeatures] of domains) {
+    const count = domainFeatures.reduce((sum, f) => sum + f.scenarios.length, 0);
+    console.log(`  ${domain} (${count} scenarios)`);
+    for (const f of domainFeatures) {
+        for (const s of f.scenarios) {
+            console.log(`    · ${s.name}`);
+        }
+    }
+}
+console.log();
+// Initialize results
+const runId = generateRunId();
+const { resultsPath, screenshotsDir } = initResultsFile(projectRoot, runId);
+console.log(`Results: features/exspec/${runId}.md\n`);
+// Execute tests domain by domain
+const totals = { passed: 0, failed: 0, skipped: 0, errors: 0 };
+for (const [domain, domainFeatures] of domains) {
+    console.log(`▶ ${domain}...`);
+    const prompt = buildPrompt({
+        features: domainFeatures,
+        scenarioFilter: filter,
+        configContent,
+        screenshotsDir,
+    });
+    const result = await runDomain(prompt, domain, projectRoot, { headed });
+    appendDomainResults(resultsPath, result);
+    if (result.isError) {
+        totals.errors++;
+        console.log(`  ✗ ERROR`);
+    }
+    else {
+        for (const s of result.scenarios) {
+            if (s.status === "pass")
+                totals.passed++;
+            else if (s.status === "fail")
+                totals.failed++;
+            else
+                totals.skipped++;
+        }
+        const p = result.scenarios.filter((s) => s.status === "pass").length;
+        const f = result.scenarios.filter((s) => s.status === "fail").length;
+        console.log(`  ${p} passed, ${f} failed`);
+    }
+    if (result.cost) {
+        totals.cost = (totals.cost ?? 0) + result.cost;
+        console.log(`  Cost: $${result.cost.toFixed(4)}`);
+    }
+    console.log();
+    if (failFast &&
+        (result.isError || result.scenarios.some((s) => s.status === "fail"))) {
+        console.log("--fail-fast: stopping after first failure.");
+        break;
+    }
+}
+// Summary
+appendSummary(resultsPath, totals);
+console.log("─".repeat(40));
+console.log(`Total: ${totals.passed} passed, ${totals.failed} failed, ${totals.skipped} skipped, ${totals.errors} errors`);
+if (totals.cost) {
+    console.log(`Total cost: $${totals.cost.toFixed(4)}`);
+}
+console.log(`\nResults written to features/exspec/${runId}.md`);
+process.exit(totals.failed > 0 || totals.errors > 0 ? 1 : 0);
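
Note on the argument loop in `cli.js`: the behaviour is easier to see when the loop is lifted into a pure function. The sketch below is illustrative only; `parseArgs` and the shape of its return value are hypothetical names that do not appear in the package. It mirrors the same branch order as the loop above.

```javascript
// Hypothetical helper: mirrors the argument loop in cli.js as a pure function.
function parseArgs(args) {
    const options = { target: undefined, filter: null, failFast: false, headed: false };
    for (let i = 0; i < args.length; i++) {
        const arg = args[i];
        if (arg === "--filter" && args[i + 1]) {
            options.filter = args[++i]; // consume the value that follows --filter
        } else if (arg === "--fail-fast") {
            options.failFast = true;
        } else if (arg === "--headed") {
            options.headed = true;
        } else if (!arg.startsWith("--")) {
            options.target = arg; // the last positional argument wins
        }
        // Unknown --flags and a trailing --filter without a value are ignored.
    }
    return options;
}

console.log(parseArgs(["features/Auth/", "--filter", "invalid password", "--headed"]));
```

Two consequences of this branch order seem worth knowing: a misspelled flag such as `--failfast` is dropped without an error, and when several positional paths are given only the last one is kept as the target.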

package/dist/discovery.d.ts ADDED

@@ -0,0 +1,2 @@
+export declare function discoverFeatures(projectRoot: string, target?: string): string[];
+export declare function getDomain(featurePath: string, projectRoot: string): string;

package/dist/discovery.js ADDED

@@ -0,0 +1,26 @@
+import { readdirSync, statSync, existsSync } from "fs";
+import { resolve, relative } from "path";
+export function discoverFeatures(projectRoot, target) {
+    if (!target) {
+        return globFeatures(resolve(projectRoot, "features"));
+    }
+    const fullPath = resolve(projectRoot, target);
+    if (!existsSync(fullPath)) {
+        throw new Error(`Path not found: ${fullPath}`);
+    }
+    if (statSync(fullPath).isDirectory()) {
+        return globFeatures(fullPath);
+    }
+    return [fullPath];
+}
+function globFeatures(dir) {
+    return readdirSync(dir, { recursive: true, withFileTypes: true })
+        .filter((entry) => !entry.isDirectory() && entry.name.endsWith(".feature"))
+        .map((entry) => resolve(entry.parentPath, entry.name))
+        .sort();
+}
+export function getDomain(featurePath, projectRoot) {
+    const rel = relative(resolve(projectRoot, "features"), featurePath);
+    const parts = rel.split("/");
+    return parts.length > 1 ? parts[0] : "default";
+}
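
Note on `getDomain`: `path.relative` returns paths with the platform's separator, so on Windows the `rel.split("/")` call above would likely yield a single part and place every feature in the `default` domain. A minimal sketch of a separator-tolerant split follows. `domainFromRelativePath` is a hypothetical name, and it takes the already-relative path as input so it can run without the `path` module.

```javascript
// Hypothetical variant of the last two lines of getDomain: takes the path of a
// feature file relative to features/ and returns its top-level directory.
function domainFromRelativePath(rel) {
    // Split on both "/" and "\" so Windows-style relative paths work too.
    const parts = rel.split(/[\\/]/).filter((part) => part.length > 0);
    return parts.length > 1 ? parts[0] : "default";
}

console.log(domainFromRelativePath("Auth/Admin/users.feature"));
console.log(domainFromRelativePath("Auth\\login.feature"));
console.log(domainFromRelativePath("login.feature"));
```

The trade-off is that a backslash is a legal file-name character on POSIX systems, so this variant would misread a file literally named `a\b.feature`. That case is assumed to be rare enough to accept.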

package/dist/discovery.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/discovery.test.js ADDED

@@ -0,0 +1,14 @@
+import { describe, test, expect } from "vitest";
+import { getDomain } from "./discovery.js";
+describe("getDomain", () => {
+    const root = "/project";
+    test("extracts domain from subdirectory", () => {
+        expect(getDomain("/project/features/Auth/login.feature", root)).toBe("Auth");
+    });
+    test("extracts domain from nested subdirectory", () => {
+        expect(getDomain("/project/features/Auth/Admin/users.feature", root)).toBe("Auth");
+    });
+    test("returns 'default' for files directly in features/", () => {
+        expect(getDomain("/project/features/login.feature", root)).toBe("default");
+    });
+});

package/dist/env.d.ts ADDED

@@ -0,0 +1,2 @@
+export declare function loadDotenv(projectRoot: string): void;
+export declare function expandVars(text: string): string;

package/dist/env.js ADDED

@@ -0,0 +1,17 @@
+import { config } from "dotenv";
+import dotenvExpand from "dotenv-expand";
+import { existsSync } from "fs";
+import { resolve } from "path";
+export function loadDotenv(projectRoot) {
+    const envPath = resolve(projectRoot, ".env");
+    if (!existsSync(envPath))
+        return;
+    const env = config({ path: envPath });
+    dotenvExpand.expand(env);
+}
+export function expandVars(text) {
+    return text.replace(/\$\{(\w+)\}|\$(\w+)/g, (match, braced, bare) => {
+        const varName = braced ?? bare;
+        return process.env[varName] ?? match;
+    });
+}
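
Note on `expandVars`: the pattern matches `\w+` greedily, which has two side effects that are easy to miss. The sketch below reuses the same regular expression but takes the environment as a second parameter. `expandVarsWith` and that parameter are hypothetical, added only so the behaviour can be checked without touching `process.env`.

```javascript
// Same regular expression as env.js, with the environment passed in
// (hypothetical second parameter) so the behaviour can be checked in isolation.
function expandVarsWith(text, env) {
    return text.replace(/\$\{(\w+)\}|\$(\w+)/g, (match, braced, bare) => {
        const varName = braced ?? bare;
        return env[varName] ?? match;
    });
}

const env = { APP_URL: "http://localhost:3000" };

// Braces delimit the name, so text can follow the variable directly.
console.log(expandVarsWith("URL: ${APP_URL}/login", env));
// → URL: http://localhost:3000/login

// "$100" is looked up as a variable named "100"; "$APP_URL_HOME" is read as one
// long name. Neither is defined, so both are left untouched.
console.log(expandVarsWith("Price: $100, home: $APP_URL_HOME", env));
// → Price: $100, home: $APP_URL_HOME
```

In practice this suggests using the `${VAR}` form in `exspec.md` whenever a variable is followed by a letter, digit, or underscore.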

package/dist/env.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/env.test.js ADDED

@@ -0,0 +1,28 @@
+import { describe, test, expect, beforeEach, afterEach } from "vitest";
+import { expandVars } from "./env.js";
+describe("expandVars", () => {
+    const saved = { ...process.env };
+    beforeEach(() => {
+        process.env.APP_URL = "http://localhost:3000";
+        process.env.SECRET = "s3cret";
+    });
+    afterEach(() => {
+        process.env = { ...saved };
+    });
+    test("expands $VAR syntax", () => {
+        expect(expandVars("URL: $APP_URL")).toBe("URL: http://localhost:3000");
+    });
+    test("expands ${VAR} syntax", () => {
+        expect(expandVars("URL: ${APP_URL}")).toBe("URL: http://localhost:3000");
+    });
+    test("expands multiple variables", () => {
+        expect(expandVars("$APP_URL with $SECRET")).toBe("http://localhost:3000 with s3cret");
+    });
+    test("leaves undefined variables as-is", () => {
+        expect(expandVars("$UNDEFINED_VAR")).toBe("$UNDEFINED_VAR");
+        expect(expandVars("${UNDEFINED_VAR}")).toBe("${UNDEFINED_VAR}");
+    });
+    test("returns text without variables unchanged", () => {
+        expect(expandVars("no variables here")).toBe("no variables here");
+    });
+});

package/dist/gherkin.d.ts ADDED

@@ -0,0 +1,4 @@
+import type { ParsedFeature } from "./types.js";
+export declare function parseFeature(filePath: string, projectRoot: string): ParsedFeature;
+export declare function filterScenarios(features: ParsedFeature[], filter: string): ParsedFeature[];
+export declare function groupByDomain(features: ParsedFeature[]): Map<string, ParsedFeature[]>;

package/dist/gherkin.js ADDED

@@ -0,0 +1,41 @@
+import { AstBuilder, GherkinClassicTokenMatcher, Parser, } from "@cucumber/gherkin";
+import { IdGenerator } from "@cucumber/messages";
+import { readFileSync } from "fs";
+import { getDomain } from "./discovery.js";
+const uuidFn = IdGenerator.uuid();
+export function parseFeature(filePath, projectRoot) {
+    const rawContent = readFileSync(filePath, "utf-8");
+    const builder = new AstBuilder(uuidFn);
+    const matcher = new GherkinClassicTokenMatcher();
+    const parser = new Parser(builder, matcher);
+    const document = parser.parse(rawContent);
+    const feature = document.feature;
+    const scenarios = (feature?.children ?? [])
+        .filter((child) => child.scenario)
+        .map((child) => ({ name: child.scenario.name }));
+    return {
+        name: feature?.name ?? "Unknown",
+        filePath,
+        domain: getDomain(filePath, projectRoot),
+        rawContent,
+        scenarios,
+    };
+}
+export function filterScenarios(features, filter) {
+    const lowerFilter = filter.toLowerCase();
+    return features
+        .map((f) => ({
+        ...f,
+        scenarios: f.scenarios.filter((s) => s.name.toLowerCase().includes(lowerFilter)),
+    }))
+        .filter((f) => f.scenarios.length > 0);
+}
+export function groupByDomain(features) {
+    const groups = new Map();
+    for (const feature of features) {
+        const existing = groups.get(feature.domain) ?? [];
+        existing.push(feature);
+        groups.set(feature.domain, existing);
+    }
+    return groups;
+}
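
Note on `parseFeature`: only direct children of the feature that carry a `scenario` are collected. If a feature file groups scenarios under the Gherkin `Rule:` keyword, those scenarios would apparently be missing from the printed plan and could never match `--filter`, although the raw file content is still sent to the agent. The sketch below shows one way to include them. `collectScenarioNames` is a hypothetical helper, and the object shape (`child.rule.children[].scenario`) is an assumption about the parser's AST, exercised here on a hand-written object rather than real parser output.

```javascript
// Hypothetical helper: collects scenario names from a Gherkin feature AST,
// including scenarios nested one level down inside Rule blocks.
function collectScenarioNames(feature) {
    const names = [];
    for (const child of feature?.children ?? []) {
        if (child.scenario) {
            names.push(child.scenario.name);
        } else if (child.rule) {
            for (const ruleChild of child.rule.children ?? []) {
                if (ruleChild.scenario) names.push(ruleChild.scenario.name);
            }
        }
    }
    return names;
}

// Hand-written object in the shape the parser is assumed to produce.
const sampleFeature = {
    name: "Login",
    children: [
        { background: { name: "" } },
        { scenario: { name: "Valid login" } },
        { rule: { name: "Lockout", children: [{ scenario: { name: "Too many attempts" } }] } },
    ],
};

console.log(collectScenarioNames(sampleFeature));
```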

package/dist/gherkin.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/gherkin.test.js ADDED

@@ -0,0 +1,46 @@
+import { describe, test, expect } from "vitest";
+import { filterScenarios, groupByDomain } from "./gherkin.js";
+function feature(overrides = {}) {
+    return {
+        name: "Test Feature",
+        filePath: "/features/Test/test.feature",
+        domain: "Test",
+        rawContent: "",
+        scenarios: [{ name: "Scenario A" }, { name: "Scenario B" }],
+        ...overrides,
+    };
+}
+describe("filterScenarios", () => {
+    test("filters scenarios by name (case-insensitive)", () => {
+        const features = [feature()];
+        const result = filterScenarios(features, "scenario a");
+        expect(result).toHaveLength(1);
+        expect(result[0].scenarios).toEqual([{ name: "Scenario A" }]);
+    });
+    test("removes features with no matching scenarios", () => {
+        const features = [feature()];
+        const result = filterScenarios(features, "nonexistent");
+        expect(result).toHaveLength(0);
+    });
+    test("partial match works", () => {
+        const features = [feature({ scenarios: [{ name: "User can login" }] })];
+        const result = filterScenarios(features, "login");
+        expect(result).toHaveLength(1);
+    });
+});
+describe("groupByDomain", () => {
+    test("groups features by domain", () => {
+        const features = [
+            feature({ domain: "Auth" }),
+            feature({ domain: "Billing" }),
+            feature({ domain: "Auth", name: "Other" }),
+        ];
+        const groups = groupByDomain(features);
+        expect(groups.size).toBe(2);
+        expect(groups.get("Auth")).toHaveLength(2);
+        expect(groups.get("Billing")).toHaveLength(1);
+    });
+    test("returns empty map for no features", () => {
+        expect(groupByDomain([]).size).toBe(0);
+    });
+});

package/dist/prompt.d.ts ADDED

@@ -0,0 +1,7 @@
+import type { ParsedFeature } from "./types.js";
+export declare function buildPrompt(options: {
+    features: ParsedFeature[];
+    scenarioFilter: string | null;
+    configContent: string;
+    screenshotsDir: string;
+}): string;

package/dist/prompt.js ADDED

@@ -0,0 +1,23 @@
+import { readFileSync } from "fs";
+import { resolve, dirname } from "path";
+import { fileURLToPath } from "url";
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const templatePath = resolve(__dirname, "..", "prompt-template.md");
+export function buildPrompt(options) {
+    let template = readFileSync(templatePath, "utf-8");
+    const featureContent = options.features
+        .map((f) => f.rawContent)
+        .join("\n\n---\n\n");
+    const scenariosToExecute = options.scenarioFilter
+        ? options.features
+            .flatMap((f) => f.scenarios)
+            .map((s) => s.name)
+            .join(", ")
+        : "ALL";
+    template = template
+        .replaceAll("{FEATURE_CONTENT}", featureContent)
+        .replaceAll("{SCENARIOS_TO_EXECUTE}", scenariosToExecute)
+        .replaceAll("{CONFIG_CONTEXT}", options.configContent)
+        .replaceAll("{SCREENSHOTS_DIR}", options.screenshotsDir);
+    return template;
+}
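
Note on `buildPrompt`: when `replaceAll` is given a string as the replacement, sequences such as `$$` and `$&` inside that string are interpreted as replacement patterns. Because the replacements are chained, a placeholder that appears inside an earlier value (for example `{SCREENSHOTS_DIR}` written in a feature file) is also substituted by a later call. A minimal single-pass sketch that avoids both effects follows. `fillTemplate` and its `values` map are hypothetical names, not part of the package.

```javascript
// Hypothetical single-pass variant of the placeholder substitution.
function fillTemplate(template, values) {
    // A function replacer inserts its return value literally, so "$&" or "$$"
    // inside a value is not treated as a replacement pattern. Inserted text is
    // not rescanned, so placeholders inside a value are left alone.
    return template.replace(/\{([A-Z_]+)\}/g, (match, key) =>
        Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match);
}

const template = "Config:\n{CONFIG_CONTEXT}\nShots: {SCREENSHOTS_DIR} and {SCREENSHOTS_DIR}";
const filled = fillTemplate(template, {
    CONFIG_CONTEXT: "Price shown as $$5, see {SCREENSHOTS_DIR}",
    SCREENSHOTS_DIR: "/tmp/shots",
});

console.log(filled);
```

Unknown placeholders are left as they are, which keeps any other braces in the template intact.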

package/dist/prompt.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/prompt.test.js ADDED

@@ -0,0 +1,76 @@
+import { describe, test, expect } from "vitest";
+import { buildPrompt } from "./prompt.js";
+describe("buildPrompt", () => {
+    const feature = {
+        name: "Login",
+        filePath: "/features/Auth/login.feature",
+        domain: "Auth",
+        rawContent: "Feature: Login\n  Scenario: Valid login",
+        scenarios: [{ name: "Valid login" }],
+    };
+    test("includes feature content", () => {
+        const prompt = buildPrompt({
+            features: [feature],
+            scenarioFilter: null,
+            configContent: "URL: http://localhost",
+            screenshotsDir: "/tmp/screenshots",
+        });
+        expect(prompt).toContain("Feature: Login");
+    });
+    test("includes config content", () => {
+        const prompt = buildPrompt({
+            features: [feature],
+            scenarioFilter: null,
+            configContent: "URL: http://localhost",
+            screenshotsDir: "/tmp/screenshots",
+        });
+        expect(prompt).toContain("URL: http://localhost");
+    });
+    test("sets scenarios to ALL when no filter", () => {
+        const prompt = buildPrompt({
+            features: [feature],
+            scenarioFilter: null,
+            configContent: "",
+            screenshotsDir: "/tmp",
+        });
+        expect(prompt).toContain("`ALL`");
+    });
+    test("lists filtered scenario names", () => {
+        const prompt = buildPrompt({
+            features: [feature],
+            scenarioFilter: "login",
+            configContent: "",
+            screenshotsDir: "/tmp",
+        });
+        expect(prompt).toContain("Valid login");
+        expect(prompt).not.toContain("`ALL`");
+    });
+    test("replaces all occurrences of screenshots dir", () => {
+        const prompt = buildPrompt({
+            features: [feature],
+            scenarioFilter: null,
+            configContent: "",
+            screenshotsDir: "/tmp/shots",
+        });
+        // The template has {SCREENSHOTS_DIR} in two places
+        expect(prompt).not.toContain("{SCREENSHOTS_DIR}");
+        const count = (prompt.match(/\/tmp\/shots/g) ?? []).length;
+        expect(count).toBeGreaterThanOrEqual(2);
+    });
+    test("joins multiple features with separator", () => {
+        const feature2 = {
+            ...feature,
+            name: "Register",
+            rawContent: "Feature: Register\n  Scenario: New user",
+        };
+        const prompt = buildPrompt({
+            features: [feature, feature2],
+            scenarioFilter: null,
+            configContent: "",
+            screenshotsDir: "/tmp",
+        });
+        expect(prompt).toContain("Feature: Login");
+        expect(prompt).toContain("Feature: Register");
+        expect(prompt).toContain("---");
+    });
+});

package/dist/reporter.d.ts ADDED

@@ -0,0 +1,9 @@
+import type { DomainResult, RunTotals } from "./types.js";
+export declare function generateRunId(): string;
+export declare function formatTime(): string;
+export declare function initResultsFile(projectRoot: string, runId: string): {
+    resultsPath: string;
+    screenshotsDir: string;
+};
+export declare function appendDomainResults(resultsPath: string, result: DomainResult): void;
+export declare function appendSummary(resultsPath: string, totals: RunTotals): void;

package/dist/reporter.js ADDED

@@ -0,0 +1,71 @@
+import { appendFileSync, existsSync, mkdirSync, writeFileSync } from "fs";
+import { resolve, join } from "path";
+const pad = (n) => String(n).padStart(2, "0");
+export function generateRunId() {
+    const now = new Date();
+    return `${now.getFullYear()}-${pad(now.getMonth() + 1)}-${pad(now.getDate())}-${pad(now.getHours())}${pad(now.getMinutes())}`;
+}
+export function formatTime() {
+    const now = new Date();
+    return `${pad(now.getHours())}:${pad(now.getMinutes())}`;
+}
+export function initResultsFile(projectRoot, runId) {
+    const resultsDir = resolve(projectRoot, "features/exspec");
+    const screenshotsDir = resolve(resultsDir, runId);
+    const resultsPath = resolve(resultsDir, `${runId}.md`);
+    mkdirSync(screenshotsDir, { recursive: true });
+    // Create .gitignore on first run
+    const gitignorePath = join(resultsDir, ".gitignore");
+    if (!existsSync(gitignorePath)) {
+        writeFileSync(gitignorePath, "*\n!.gitignore\n");
+    }
+    writeFileSync(resultsPath, `# Test results — ${runId}\n\nStarted at ${formatTime()}\n`);
+    return { resultsPath, screenshotsDir };
+}
+export function appendDomainResults(resultsPath, result) {
+    const lines = [""];
+    if (result.isError) {
+        lines.push(`## ${result.domain} — ERROR`, "");
+        lines.push(`  Agent crashed or returned no results.`);
+        if (result.rawOutput) {
+            lines.push(`  Raw output: ${result.rawOutput.slice(0, 500)}`);
+        }
+    }
+    else {
+        const passed = result.scenarios.filter((s) => s.status === "pass").length;
+        const failed = result.scenarios.filter((s) => s.status === "fail").length;
+        const skipped = result.scenarios.filter((s) => s.status === "skip").length;
+        lines.push(`## ${result.domain} — ${passed} passed, ${failed} failed, ${skipped} skipped`, "");
+        for (const scenario of result.scenarios) {
+            if (scenario.status === "pass") {
+                lines.push(`  ✓ ${scenario.name}`);
+                if (scenario.details) {
+                    lines.push(`    ${scenario.details.split("\n")[0]}`);
+                }
+            }
+            else if (scenario.status === "fail") {
+                lines.push(`  ✗ ${scenario.name}`);
+                if (scenario.details) {
+                    lines.push(`    → ${scenario.details.split("\n").join("\n    ")}`);
+                }
+            }
+            else {
+                lines.push(`  ○ ${scenario.name}`);
+                if (scenario.details) {
+                    lines.push(`    → ${scenario.details.split("\n")[0]}`);
+                }
+            }
+            lines.push("");
+        }
+    }
+    appendFileSync(resultsPath, lines.join("\n"));
+}
+export function appendSummary(resultsPath, totals) {
+    const content = [
+        "---\n",
+        "## Summary\n",
+        `Total: ${totals.passed} passed, ${totals.failed} failed, ${totals.skipped} skipped, ${totals.errors} errors\n`,
+        `Finished at ${formatTime()}\n`,
+    ].join("\n");
+    appendFileSync(resultsPath, content);
+}
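
Note on `generateRunId`: the id has minute resolution, and `initResultsFile` creates the report with `writeFileSync`, so a second run started within the same minute would apparently overwrite the first report. The sketch below makes the collision visible by taking the date as a parameter. `runIdFor` is a hypothetical name, and the formatting copies the template string above.

```javascript
const pad = (n) => String(n).padStart(2, "0");

// Hypothetical variant of generateRunId that takes the date as a parameter.
function runIdFor(date) {
    return `${date.getFullYear()}-${pad(date.getMonth() + 1)}-${pad(date.getDate())}` +
        `-${pad(date.getHours())}${pad(date.getMinutes())}`;
}

// Two runs started 45 seconds apart, within the same minute, share an id.
const first = runIdFor(new Date(2025, 0, 15, 14, 30, 5));
const second = runIdFor(new Date(2025, 0, 15, 14, 30, 50));

console.log(first, first === second);
// → 2025-01-15-1430 true
```

Appending seconds to the id would be one way to reduce collisions, at the cost of changing the documented `{YYYY-MM-DD-HHmm}` file name.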

package/dist/reporter.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/reporter.test.js ADDED

@@ -0,0 +1,84 @@
+import { describe, test, expect, afterEach } from "vitest";
+import { readFileSync, mkdirSync, rmSync } from "fs";
+import { join } from "path";
+import { tmpdir } from "os";
+import { generateRunId, initResultsFile, appendDomainResults, appendSummary, } from "./reporter.js";
+describe("generateRunId", () => {
+    test("returns YYYY-MM-DD-HHmm format", () => {
+        expect(generateRunId()).toMatch(/^\d{4}-\d{2}-\d{2}-\d{4}$/);
+    });
+});
+describe("initResultsFile", () => {
+    const tmpRoot = join(tmpdir(), `exspec-test-${Date.now()}`);
+    afterEach(() => {
+        rmSync(tmpRoot, { recursive: true, force: true });
+    });
+    test("creates results file and screenshots dir", () => {
+        mkdirSync(tmpRoot, { recursive: true });
+        const { resultsPath, screenshotsDir } = initResultsFile(tmpRoot, "2025-01-15-1430");
+        const content = readFileSync(resultsPath, "utf-8");
+        expect(content).toContain("# Test results — 2025-01-15-1430");
+        expect(screenshotsDir).toContain("2025-01-15-1430");
+    });
+});
+describe("appendDomainResults", () => {
+    const tmpRoot = join(tmpdir(), `exspec-test-${Date.now()}`);
+    let resultsPath;
+    afterEach(() => {
+        rmSync(tmpRoot, { recursive: true, force: true });
+    });
+    function setup() {
+        mkdirSync(tmpRoot, { recursive: true });
+        const result = initResultsFile(tmpRoot, "test-run");
+        resultsPath = result.resultsPath;
+    }
+    test("writes passed/failed/skipped counts", () => {
+        setup();
+        const result = {
+            domain: "Auth",
+            scenarios: [
+                { name: "Login", status: "pass", details: "OK" },
+                { name: "Logout", status: "fail", details: "Button missing" },
+            ],
+            rawOutput: "",
+            isError: false,
+        };
+        appendDomainResults(resultsPath, result);
+        const content = readFileSync(resultsPath, "utf-8");
+        expect(content).toContain("Auth — 1 passed, 1 failed, 0 skipped");
+        expect(content).toContain("✓ Login");
+        expect(content).toContain("✗ Logout");
+    });
+    test("writes error domain", () => {
+        setup();
+        const result = {
+            domain: "Broken",
+            scenarios: [],
+            rawOutput: "some error output",
+            isError: true,
+        };
+        appendDomainResults(resultsPath, result);
+        const content = readFileSync(resultsPath, "utf-8");
+        expect(content).toContain("Broken — ERROR");
+        expect(content).toContain("some error output");
+    });
+});
+describe("appendSummary", () => {
+    const tmpRoot = join(tmpdir(), `exspec-test-${Date.now()}`);
+    afterEach(() => {
+        rmSync(tmpRoot, { recursive: true, force: true });
+    });
+    test("writes totals", () => {
+        mkdirSync(tmpRoot, { recursive: true });
+        const { resultsPath } = initResultsFile(tmpRoot, "test-run");
+        const totals = {
+            passed: 5,
+            failed: 2,
+            skipped: 1,
+            errors: 0,
+        };
+        appendSummary(resultsPath, totals);
+        const content = readFileSync(resultsPath, "utf-8");
+        expect(content).toContain("5 passed, 2 failed, 1 skipped, 0 errors");
+    });
+});

package/dist/runner.d.ts ADDED

@@ -0,0 +1,6 @@
+import type { DomainResult, ScenarioResult } from "./types.js";
+export interface RunOptions {
+    headed?: boolean;
+}
+export declare function runDomain(prompt: string, domain: string, projectRoot: string, options?: RunOptions): Promise<DomainResult>;
+export declare function parseScenarioResults(output: string): ScenarioResult[];

package/dist/runner.js ADDED

@@ -0,0 +1,170 @@
+import { spawn } from "child_process";
+import { writeFileSync, existsSync, readFileSync } from "fs";
+import { join, dirname } from "path";
+import { tmpdir } from "os";
+import { createRequire } from "module";
+const require = createRequire(import.meta.url);
+const playwrightBin = join(dirname(require.resolve("@playwright/mcp/package.json")), "cli.js");
+function getMcpConfigPath(headed) {
+    const config = {
+        mcpServers: {
+            playwright: {
+                type: "stdio",
+                command: playwrightBin,
+                args: headed ? [] : ["--headless"],
+            },
+        },
+    };
+    const suffix = headed ? "-headed" : "";
+    const configPath = join(tmpdir(), `exspec-mcp${suffix}.json`);
+    const json = JSON.stringify(config);
+    if (!existsSync(configPath) || readFileSync(configPath, "utf-8") !== json) {
+        writeFileSync(configPath, json);
+    }
+    return configPath;
+}
+export async function runDomain(prompt, domain, projectRoot, options = {}) {
+    const mcpConfigPath = getMcpConfigPath(options.headed ?? false);
+    try {
+        const { result, cost, duration } = await invokeClaude(prompt, projectRoot, mcpConfigPath);
+        const scenarios = parseScenarioResults(result);
+        return {
+            domain,
+            scenarios,
+            rawOutput: result,
+            isError: false,
+            cost,
+            duration,
+        };
+    }
+    catch (error) {
+        const message = error instanceof Error ? error.message : String(error);
+        return {
+            domain,
+            scenarios: [],
+            rawOutput: message.slice(0, 500),
+            isError: true,
+        };
+    }
+}
+function invokeClaude(prompt, cwd, mcpConfigPath) {
+    return new Promise((resolve, reject) => {
+        const child = spawn("claude", [
+            "-p",
+            prompt,
+            "--allowedTools",
+            "mcp__playwright__*",
+            "--output-format",
+            "stream-json",
+            "--model",
+            "sonnet",
+            "--mcp-config",
+            mcpConfigPath,
+        ], { cwd, stdio: ["ignore", "pipe", "pipe"] });
+        let buffer = "";
+        let resultText = "";
+        let cost;
+        let duration;
+        child.stdout.on("data", (data) => {
+            buffer += data.toString();
+            // Process complete JSON lines
+            const lines = buffer.split("\n");
+            buffer = lines.pop() ?? "";
+            for (const line of lines) {
+                if (!line.trim())
+                    continue;
+                try {
+                    const event = JSON.parse(line);
+                    handleStreamEvent(event);
+                }
+                catch {
+                    // Skip malformed lines
+                }
+            }
+        });
+        function handleStreamEvent(event) {
+            switch (event.type) {
+                case "assistant": {
+                    const message = event.message;
+                    const content = message?.content;
+                    if (content) {
+                        for (const block of content) {
+                            if (block.type === "text") {
+                                process.stderr.write(".");
+                            }
+                        }
+                    }
+                    break;
+                }
+                case "tool_use": {
+                    const toolName = event.tool_name;
+                    if (toolName) {
+                        const short = toolName.replace("mcp__playwright__browser_", "");
+                        process.stderr.write(`  [${short}]`);
+                    }
+                    break;
+                }
+                case "tool_result": {
+                    process.stderr.write(".");
+                    break;
+                }
+                case "result": {
+                    resultText = event.result ?? "";
+                    cost = event.cost_usd;
+                    duration = event.duration_ms;
+                    break;
+                }
+            }
+        }
+        let stderr = "";
+        child.stderr.on("data", (data) => {
+            stderr += data.toString();
+        });
+        child.on("close", (code) => {
+            // Process remaining buffer
+            if (buffer.trim()) {
+                try {
+                    const event = JSON.parse(buffer);
+                    handleStreamEvent(event);
+                }
+                catch {
+                    // ignore
+                }
+            }
+            process.stderr.write("\n");
+            if (code !== 0) {
+                reject(new Error(`claude exited with code ${code}${stderr ? `: ${stderr.slice(0, 500)}` : ""}`));
+            }
+            else {
+                resolve({ result: resultText, cost, duration });
+            }
+        });
+        child.on("error", (err) => {
+            reject(new Error(`Failed to spawn claude: ${err.message}`));
+        });
+    });
+}
+export function parseScenarioResults(output) {
+    const results = [];
+    const lines = output.split("\n");
+    for (let i = 0; i < lines.length; i++) {
+        const match = lines[i].match(/^### (PASS|FAIL|SKIP):\s*(.+)/);
+        if (match) {
+            const status = match[1].toLowerCase();
+            const details = collectDetails(lines, i + 1);
+            results.push({ name: match[2].trim(), status, details });
+        }
+    }
+    return results;
+}
+function collectDetails(lines, startIndex) {
+    const detailLines = [];
+    for (let i = startIndex; i < lines.length; i++) {
+        if (lines[i].match(/^### (PASS|FAIL|SKIP):/))
+            break;
+        if (lines[i].match(/^## /))
+            break;
+        detailLines.push(lines[i]);
+    }
+    return detailLines.join("\n").trim();
+}
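The stdout handler in `invokeClaude` relies on a line-buffering pattern: chunks may end mid-line, so the last (possibly incomplete) piece is held back until more data arrives or the process closes. The following is a minimal sketch of that pattern in isolation. The helper name `createJsonLineParser` and the sample events are hypothetical and are not part of the package.

```javascript
// Hypothetical helper (not in the package): splits a chunked text stream
// into newline-delimited JSON events, mirroring the buffering in invokeClaude.
function createJsonLineParser(onEvent) {
    let buffer = "";
    const parseLine = (line) => {
        if (!line.trim()) return; // ignore blank lines
        try {
            onEvent(JSON.parse(line));
        } catch {
            // Malformed lines are skipped, as in the original handler.
        }
    };
    return {
        // Feed one chunk; only complete lines are parsed.
        push(chunk) {
            buffer += chunk;
            const lines = buffer.split("\n");
            buffer = lines.pop() ?? ""; // keep the trailing partial line
            lines.forEach(parseLine);
        },
        // Call once the stream has ended to parse whatever is left.
        flush() {
            parseLine(buffer);
            buffer = "";
        },
    };
}

// Usage: the second event is split across two chunks.
const events = [];
const parser = createJsonLineParser((event) => events.push(event));
parser.push('{"type":"tool_result"}\n{"type":"res');
parser.push('ult","result":"### PASS: Login"}');
parser.flush();
```

Without the final `flush()`, an event that is not followed by a newline would be lost, which is presumably why the `close` handler above parses the remaining buffer.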

package/dist/runner.test.d.ts ADDED

@@ -0,0 +1 @@
+export {};

package/dist/runner.test.js ADDED

@@ -0,0 +1,81 @@
+import { describe, test, expect } from "vitest";
+import { parseScenarioResults } from "./runner.js";
+describe("parseScenarioResults", () => {
+    test("parses PASS scenarios", () => {
+        const output = `## Feature: Login
+### PASS: User can login
+Login succeeded with correct credentials.`;
+        const results = parseScenarioResults(output);
+        expect(results).toEqual([
+            {
+                name: "User can login",
+                status: "pass",
+                details: "Login succeeded with correct credentials.",
+            },
+        ]);
+    });
+    test("parses FAIL scenarios with details", () => {
+        const output = `### FAIL: User sees dashboard
+**Failed step**: Then I should see the dashboard
+**Error**: Element not found
+**Expected**: Dashboard page
+**Observed**: Login page`;
+        const results = parseScenarioResults(output);
+        expect(results).toHaveLength(1);
+        expect(results[0].status).toBe("fail");
+        expect(results[0].name).toBe("User sees dashboard");
+        expect(results[0].details).toContain("Element not found");
+    });
+    test("parses SKIP scenarios", () => {
+        const output = `### SKIP: Admin panel
+**Reason**: Setup step failed`;
+        const results = parseScenarioResults(output);
+        expect(results).toEqual([
+            {
+                name: "Admin panel",
+                status: "skip",
+                details: "**Reason**: Setup step failed",
+            },
+        ]);
+    });
+    test("parses mixed results", () => {
+        const output = `## Feature: Auth
+### PASS: Login
+OK
+### FAIL: Logout
+Button not found
+### SKIP: MFA
+Not configured`;
+        const results = parseScenarioResults(output);
+        expect(results).toHaveLength(3);
+        expect(results[0]).toMatchObject({ name: "Login", status: "pass" });
+        expect(results[1]).toMatchObject({ name: "Logout", status: "fail" });
+        expect(results[2]).toMatchObject({ name: "MFA", status: "skip" });
+    });
+    test("returns empty array for no matches", () => {
+        expect(parseScenarioResults("random text")).toEqual([]);
+    });
+    test("stops collecting details at next scenario header", () => {
+        const output = `### PASS: First
+Detail line 1
+Detail line 2
+### PASS: Second
+Other detail`;
+        const results = parseScenarioResults(output);
+        expect(results[0].details).toBe("Detail line 1\nDetail line 2");
+        expect(results[1].details).toBe("Other detail");
+    });
+    test("stops collecting details at feature header", () => {
+        const output = `### PASS: First
+Some detail
+## Feature: Other`;
+        const results = parseScenarioResults(output);
+        expect(results[0].details).toBe("Some detail");
+    });
+});
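The tests above cover well-formed reports. Since the header regex is anchored at the start of the line and is case-sensitive, a header that the agent indents or writes in lowercase would not be recognised as a scenario. The sketch below illustrates this with a condensed copy of the parsing logic; `parseReport` is a hypothetical name used only here, and the sample report is made up.

```javascript
// Condensed, illustrative copy of the parsing logic in runner.js.
const HEADER = /^### (PASS|FAIL|SKIP):\s*(.+)/;

function parseReport(output) {
    const lines = output.split("\n");
    const results = [];
    lines.forEach((line, i) => {
        const match = line.match(HEADER);
        if (!match) return;
        // Collect detail lines until the next scenario or feature header.
        const details = [];
        for (let j = i + 1; j < lines.length; j++) {
            if (/^### (PASS|FAIL|SKIP):/.test(lines[j]) || /^## /.test(lines[j])) break;
            details.push(lines[j]);
        }
        results.push({
            name: match[2].trim(),
            status: match[1].toLowerCase(),
            details: details.join("\n").trim(),
        });
    });
    return results;
}

// Made-up report: only the first header matches the strict format.
const report = [
    "### PASS: Login",
    "OK",
    "### pass: Lowercase status",
    "  ### FAIL: Indented header",
].join("\n");

const parsed = parseReport(report);
```

Here `parsed` contains a single scenario ("Login"), and the two malformed headers end up inside its `details` string rather than being reported as separate scenarios.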

package/dist/types.d.ts ADDED

@@ -0,0 +1,30 @@
+export interface ParsedFeature {
+    name: string;
+    filePath: string;
+    domain: string;
+    rawContent: string;
+    scenarios: ParsedScenario[];
+}
+export interface ParsedScenario {
+    name: string;
+}
+export interface ScenarioResult {
+    name: string;
+    status: "pass" | "fail" | "skip";
+    details?: string;
+}
+export interface DomainResult {
+    domain: string;
+    scenarios: ScenarioResult[];
+    rawOutput: string;
+    isError: boolean;
+    cost?: number;
+    duration?: number;
+}
+export interface RunTotals {
+    passed: number;
+    failed: number;
+    skipped: number;
+    errors: number;
+    cost?: number;
+}
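To show how these shapes relate, here is a rough sketch that folds a list of `DomainResult` objects into a `RunTotals` object. The function `computeTotals` is hypothetical, and the counting rules (one error per failed domain, cost summed only when present) are assumptions; the package's actual reporter may aggregate differently.

```javascript
// Hypothetical aggregation of DomainResult[] into RunTotals.
// Assumption: each domain with isError === true counts as one error.
function computeTotals(domainResults) {
    const totals = { passed: 0, failed: 0, skipped: 0, errors: 0 };
    for (const domain of domainResults) {
        if (domain.isError) totals.errors += 1;
        for (const scenario of domain.scenarios) {
            if (scenario.status === "pass") totals.passed += 1;
            else if (scenario.status === "fail") totals.failed += 1;
            else totals.skipped += 1;
        }
        // cost is optional, so it is only added to the totals when reported.
        if (domain.cost !== undefined) {
            totals.cost = (totals.cost ?? 0) + domain.cost;
        }
    }
    return totals;
}

// Made-up sample data matching the DomainResult shape.
const totals = computeTotals([
    {
        domain: "billing",
        scenarios: [
            { name: "Create invoice", status: "pass" },
            { name: "Refund invoice", status: "fail", details: "Button not found" },
        ],
        rawOutput: "",
        isError: false,
        cost: 0.25,
    },
    { domain: "auth", scenarios: [], rawOutput: "claude exited with code 1", isError: true },
    {
        domain: "search",
        scenarios: [{ name: "Filter results", status: "skip" }],
        rawOutput: "",
        isError: false,
        cost: 0.5,
    },
]);
```

Note that the error branch of `runDomain` returns no `cost`, so a run in which every domain errored would, under these assumptions, produce totals with no `cost` field at all.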

package/dist/types.js ADDED

@@ -0,0 +1 @@
+export {};

package/package.json ADDED

@@ -0,0 +1,67 @@
+{
+  "name": "@mnapoli/exspec",
+  "version": "0.1.0",
+  "description": "Executable specs — run Gherkin feature files with an AI agent in the browser",
+  "type": "module",
+  "bin": {
+    "exspec": "./dist/cli.js"
+  },
+  "files": [
+    "dist",
+    "prompt-template.md"
+  ],
+  "scripts": {
+    "build": "tsc && node scripts/add-shebang.js",
+    "prepublishOnly": "npm run build",
+    "dev": "tsx src/cli.ts",
+    "test": "vitest run",
+    "lint": "eslint src",
+    "format": "prettier --write src",
+    "format:check": "prettier --check src"
+  },
+  "keywords": [
+    "gherkin",
+    "bdd",
+    "testing",
+    "executable-specifications",
+    "playwright",
+    "ai",
+    "claude",
+    "browser-testing",
+    "feature-tests"
+  ],
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/mnapoli/exspec.git"
+  },
+  "homepage": "https://github.com/mnapoli/exspec",
+  "bugs": {
+    "url": "https://github.com/mnapoli/exspec/issues"
+  },
+  "author": "Matthieu Napoli",
+  "license": "MIT",
+  "engines": {
+    "node": ">=20"
+  },
+  "publishConfig": {
+    "access": "public"
+  },
+  "dependencies": {
+    "@cucumber/gherkin": "^29.0.0",
+    "@cucumber/messages": "^25.0.0",
+    "@playwright/mcp": "^0.0.68",
+    "dotenv": "^16.4.0",
+    "dotenv-expand": "^12.0.0"
+  },
+  "devDependencies": {
+    "@eslint/js": "^10.0.1",
+    "@types/node": "^25.5.0",
+    "eslint": "^10.0.3",
+    "eslint-config-prettier": "^10.1.8",
+    "prettier": "^3.8.1",
+    "tsx": "^4.0.0",
+    "typescript": "^5.2.0",
+    "typescript-eslint": "^8.57.1",
+    "vitest": "^4.1.0"
+  }
+}

package/prompt-template.md ADDED

@@ -0,0 +1,114 @@
+# Feature Scenario Executor
+You execute Gherkin scenarios by interacting with a web application through the browser. You are autonomous: read each step, understand the intent, and figure out how to perform it in the UI.
+## Input
+- **Feature file content**: `{FEATURE_CONTENT}`
+- **Scenarios to execute**: `{SCENARIOS_TO_EXECUTE}`
+## Context
+- **Screenshots directory**: {SCREENSHOTS_DIR}
+Read the configuration below for the application URL, authentication method, browser settings, and application context.
+## Configuration
+{CONFIG_CONTEXT}
+## Role
+You are a QA tester. You can only interact with the application through the browser. If a step cannot be accomplished through the browser UI, mark the scenario as FAIL.
+## How to interpret Gherkin steps
+Steps may be written in any language. Do NOT rely on hardcoded mappings — instead:
+1. **Read the step text** and understand what it describes (setup, action, or assertion)
+2. **Use the configuration** to understand the domain and how the app works
+3. **Explore the UI** to find the right page, button, or form to accomplish the step
+4. **For assertions with tables**, the table provides expected values — verify them in the UI
+### Step types
+- **Given** — Setup: create entities, navigate to a state, ensure preconditions
+- **When** — Action: perform a user action (click, fill, submit, navigate)
+- **Then / And** — Assertion: verify the UI shows expected data
+### Tables in steps
+Tables can appear after any step. They provide structured data — either input data or expected values depending on context. Read the step text to understand the table's role.
+## Process
+### 1. Authenticate
+1. Navigate to the application URL.
+2. Resize the browser to the configured resolution with `mcp__playwright__browser_resize`.
+3. Follow the authentication instructions from the Configuration section above.
+4. Take a snapshot to confirm successful login.
+### 2. Execute each scenario sequentially
+For each scenario:
+1. **Setup**: Execute all Given steps.
+2. **Actions**: Execute all When steps.
+3. **Assertions**: Verify all Then/And steps.
+4. **Record result**: PASS, FAIL, or SKIP.
+Between scenarios, start fresh if needed (create new test data).
+### 3. Navigating the UI
+- Use `mcp__playwright__browser_snapshot` to understand the current page.
+- Use `mcp__playwright__browser_click` to interact with elements.
+- Use `mcp__playwright__browser_fill_form` to fill forms.
+- If you get lost, navigate directly to a known URL.
+- Check dropdown menus and action bars for buttons.
+### 4. Error handling
+- If a step fails, take a screenshot and save it to `{SCREENSHOTS_DIR}/{scenario-slug}.png`. Use `mcp__playwright__browser_take_screenshot` with the full path.
+- Continue with subsequent steps in the same scenario if possible.
+- If a setup step fails, mark the whole scenario as SKIP.
+### 5. Error detection
+After each significant action, check the browser for error indicators:
+- Error pages (500, 404, etc.)
+- Error toasts or notification banners
+- Form validation messages
+## Output format
+Return your report using this EXACT format:
+```
+## Feature: {feature_name}
+### PASS: Scenario name
+Brief confirmation of what was verified, including actual values seen.
+### FAIL: Scenario name
+**Failed step**: The step that failed
+**Error**: What went wrong
+**Expected**: Expected values
+**Observed**: Actual values seen in the UI
+**Screenshot**: [description]
+### SKIP: Scenario name
+**Reason**: Why the scenario was skipped
+```
+## Rules
+- Execute ONLY the scenarios provided
+- Report EVERY scenario
+- Be autonomous: don't ask questions, figure it out
+- Take screenshots ONLY on failures
+- Close the browser with `mcp__playwright__browser_close` when done
+- When creating test data, use distinctive names (e.g. include a timestamp or random suffix)
+Begin testing now!
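The template mixes two kinds of braces: upper-case placeholders such as `{FEATURE_CONTENT}`, `{SCENARIOS_TO_EXECUTE}`, `{SCREENSHOTS_DIR}` and `{CONFIG_CONTEXT}`, which are presumably filled in before the prompt is sent, and lower-case ones such as `{feature_name}` and `{scenario-slug}`, which are left for the agent to interpret. The sketch below shows one way such a substitution could work. The function `renderPrompt` and the sample values are hypothetical; the package's real rendering code is not shown in this part of the diff.

```javascript
// Hypothetical renderer: replaces only UPPER_SNAKE_CASE placeholders that
// have a value, and leaves everything else (e.g. {scenario-slug}) untouched.
function renderPrompt(template, values) {
    return template.replace(/\{([A-Z_]+)\}/g, (placeholder, key) =>
        Object.prototype.hasOwnProperty.call(values, key) ? values[key] : placeholder
    );
}

// Made-up template fragment and values.
const rendered = renderPrompt(
    "Save it to `{SCREENSHOTS_DIR}/{scenario-slug}.png` for {feature_name}.",
    { SCREENSHOTS_DIR: "/tmp/shots" }
);
```

A replacer function is used instead of a replacement string so that characters like `$` in feature content are inserted literally rather than being treated as special replacement patterns.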