npm - @mnapoli/exspec - Versions diffs - 0.1.0 → 0.1.1 - Mend

@mnapoli/exspec 0.1.0 → 0.1.1


Files changed (7)

package/README.md CHANGED

@@ -1,8 +1,53 @@
-# exspec
+# Executable specs
-**Executable specs** — run Gherkin feature files with an AI agent in the browser.
+AI writes code. AI writes tests. But confidence comes from tests you actually read and write.
-exspec parses `.feature` files, launches a Claude agent restricted to browser-only interaction (Playwright, headless), and produces a test report. Feature files can be written in any language supported by Gherkin (English, French, German, Spanish, [70+ languages](https://cucumber.io/docs/gherkin/languages/)).
+**exspec runs plain-text specs in a real browser using AI.**
+No test code, no step definitions. Write specs as acceptance criteria, then let agents build and run exspec to check they pass.
+## Example
+```gherkin
+Feature: Order management
+  Scenario: Place an order and check it appears in the dashboard
+    Given I am logged in as a store manager
+    When I create a new order for customer "Alice Martin" with 2 items
+    Then the order should appear in the orders list with status "Pending"
+  Scenario: Cancel an order
+    Given I am logged in as a store manager
+    And there is at least one pending order
+    When I open the most recent order and cancel it
+    Then the order status should change to "Cancelled"
+    And the customer should see a cancellation notice
+```
+```bash
+$ npx exspec
+Suite: 2 scenario(s) in 1 domain(s)
+  orders (2 scenarios)
+    · Place an order and check it appears in the dashboard
+    · Cancel an order
+▶ orders...
+  2 passed, 0 failed
+  Cost: $0.0523
+────────────────────────────────────────
+Total: 2 passed, 0 failed, 0 skipped, 0 errors
+Total cost: $0.0523
+Results written to features/exspec/2026-03-20-1430.md
+```
+Unlike [Cucumber](https://github.com/cucumber/cucumber-js) or [Behat](https://github.com/Behat/Behat), there's **no glue code** — no step definitions, no page objects, no regex matchers to wire up. The AI agent reads your specs and navigates the app like a real user would. It figures out where to click, what to fill in, and what to check on screen.
+This also means specs aren't brittle. Traditional browser tests break when a CSS class changes or a button moves. The AI agent adapts to the actual UI — and if the UX is so broken that a human couldn't complete the task, the spec fails too. That's a feature, not a bug.
+Specs are written in [Gherkin](https://cucumber.io/docs/gherkin/reference/), a simple Given/When/Then format. You can write them in [70+ languages](https://cucumber.io/docs/gherkin/languages/) (English, French, German, Spanish, etc.).
 ## Install
@@ -10,24 +55,52 @@ exspec parses `.feature` files, launches a Claude agent restricted to browser-on
 npm install -D @mnapoli/exspec
 ```
-## Prerequisites
+### Prerequisites
 - [Claude Code CLI](https://docs.anthropic.com/en/docs/claude-code) installed and authenticated
-## Usage
+## Quick start
+1. Create a `features/exspec.md` configuration file:
+```markdown
+URL: http://localhost:3000
+Use the `test@example.com` / `password` credentials for authentication.
+```
+2. Write a feature file in `features/`:
+```gherkin
+Feature: Shopping cart
+  Scenario: Add a product to the cart
+    Given I am logged in
+    When I navigate to the product catalog
+    And I add the first product to my cart
+    Then the cart should show 1 item
+```
+3. Run:
 ```bash
-# Run all feature files in features/
 npx exspec
+```
+That's it. No step definitions to implement, no test code to write.
-# Run a specific file
-npx exspec features/Auth/Login.feature
+## Usage
+```bash
+# Run all feature files
+npx exspec
-# Run all features in a directory
-npx exspec features/Auth/
+# Run a specific file or directory
+npx exspec features/auth/login.feature
+npx exspec features/auth/
 # Filter by scenario name
-npx exspec features/Auth/Login.feature --filter "invalid password"
+npx exspec --filter "invalid password"
 # Stop at first failure
 npx exspec --fail-fast
@@ -38,18 +111,17 @@ npx exspec --headed
 ## Configuration
-### `exspec.md`
+### `features/exspec.md`
-Create an `features/exspec.md` file. Its content is passed to the AI agent as context.
+This file is passed to the AI agent as context. Describe your app, provide credentials, set the URL — anything the agent needs to know to test your application.
 ```markdown
-# QA Configuration
+URL: http://localhost:3000
 ## Application
-URL: http://localhost:3000
-This is an e-commerce app. The user is a store manager. For detailed feature documentation, see the `docs/` directory.
+This is an e-commerce app. The user is a store manager.
+For detailed feature documentation, see the `docs/` directory.
 ## Authentication
@@ -60,41 +132,25 @@ Use the `test@example.com` / `password` credentials for authentication.
 Resolution: 1920x1080
 ```
-The agent reads this file as context, so you can reference any project documentation here, or give it extra instructions.
 ### Environment variables
-If your project has a `.env` file, exspec loads it automatically. You can then reference environment variables in `exspec.md` using `$VAR` or `${VAR}` syntax, they are resolved before the config is passed to the agent.
+If your project has a `.env` file, exspec loads it automatically. You can reference variables in `exspec.md` with `$VAR` or `${VAR}` syntax:
 ```markdown
 URL: $APP_URL
 ```
-This is useful for dynamic URLs across environments (e.g. with git worktrees). If a variable is not defined, the reference is left as-is.
 ## How it works
-1. Loads `.env` (if present) and `exspec.md` (with variable expansion)
-2. Discovers and parses `.feature` files (supports all Gherkin languages)
-3. Groups scenarios by domain (subdirectory of `features/`)
-4. For each domain, invokes Claude CLI with:
-   - Only Playwright tools available (browser-only, no database or code access)
-   - Playwright in headless mode (or headed with `--headed`)
-   - Feature content + context docs + config as prompt
-5. Parses results (PASS/FAIL/SKIP) and writes them to `features/exspec/`
+1. Discovers `.feature` files in `features/` and groups them by subdirectory
+2. For each group, launches a Claude agent with only Playwright browser tools (no database, no code, no shell access)
+3. The agent reads your specs and interacts with the browser autonomously
+4. Results (PASS/FAIL/SKIP) are written to `features/exspec/`
-## Results
-Results are written to `features/exspec/{YYYY-MM-DD-HHmm}.md` with failure screenshots in the corresponding directory.
-The CLI exits with code `1` if any tests fail (CI-friendly).
+The agent is sandboxed to browser-only interaction. If a scenario can't be verified through the browser, it's marked as FAIL.
-## Agent restrictions
-The AI agent can ONLY use Playwright browser tools. It cannot:
+## Results
-- Access the database
-- Read or modify source code
-- Execute shell commands
+Results are written to `features/exspec/{YYYY-MM-DD-HHmm}.md` with failure screenshots.
-If a scenario cannot be verified through the browser, it is marked as FAIL.
+The CLI exits with code `1` on failures (CI-friendly).
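
The README's environment-variable section says that `$VAR` and `${VAR}` references in `exspec.md` are expanded from `.env`, and the 0.1.0 wording removed in this diff added that undefined references are left as-is. A minimal sketch of that kind of expansion follows, assuming the behaviour is unchanged in 0.1.1. `expandVars` is a hypothetical helper written for illustration, not the package's actual code.

```javascript
// Hypothetical helper for illustration; not exspec's actual implementation.
function expandVars(text, env) {
  // Matches ${NAME} or $NAME, where NAME is a shell-style identifier.
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)\}|\$([A-Za-z_][A-Za-z0-9_]*)/g;
  return text.replace(pattern, (match, braced, bare) => {
    const name = braced ?? bare;
    // Undefined variables are left untouched (assumption taken from the 0.1.0 README).
    return Object.prototype.hasOwnProperty.call(env, name) ? env[name] : match;
  });
}

const config = "URL: $APP_URL\nAPI: ${API_URL}\nToken: $MISSING";
const env = { APP_URL: "http://localhost:3000", API_URL: "http://localhost:4000" };
console.log(expandVars(config, env));
```

Here `$APP_URL` and `${API_URL}` are replaced, while `$MISSING` stays in the text because it has no value in `env`.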

package/dist/cli.js CHANGED

@@ -119,7 +119,7 @@ for (const [domain, domainFeatures] of domains) {
     }
 }
 // Summary
-appendSummary(resultsPath, totals);
+appendSummary(resultsPath, totals, screenshotsDir);
 console.log("─".repeat(40));
 console.log(`Total: ${totals.passed} passed, ${totals.failed} failed, ${totals.skipped} skipped, ${totals.errors} errors`);
 if (totals.cost) {

package/dist/reporter.d.ts CHANGED

@@ -6,4 +6,4 @@ export declare function initResultsFile(projectRoot: string, runId: string): {
     screenshotsDir: string;
 };
 export declare function appendDomainResults(resultsPath: string, result: DomainResult): void;
-export declare function appendSummary(resultsPath: string, totals: RunTotals): void;
+export declare function appendSummary(resultsPath: string, totals: RunTotals, screenshotsDir: string): void;

package/dist/reporter.js CHANGED

@@ -1,5 +1,6 @@
-import { appendFileSync, existsSync, mkdirSync, writeFileSync } from "fs";
+import { appendFileSync, existsSync, mkdirSync, readdirSync, rmSync, writeFileSync, } from "fs";
 import { resolve, join } from "path";
+const MAX_RUNS = 5;
 const pad = (n) => String(n).padStart(2, "0");
 export function generateRunId() {
     const now = new Date();
@@ -13,12 +14,13 @@ export function initResultsFile(projectRoot, runId) {
     const resultsDir = resolve(projectRoot, "features/exspec");
     const screenshotsDir = resolve(resultsDir, runId);
     const resultsPath = resolve(resultsDir, `${runId}.md`);
-    mkdirSync(screenshotsDir, { recursive: true });
+    mkdirSync(resultsDir, { recursive: true });
     // Create .gitignore on first run
     const gitignorePath = join(resultsDir, ".gitignore");
     if (!existsSync(gitignorePath)) {
         writeFileSync(gitignorePath, "*\n!.gitignore\n");
     }
+    pruneOldRuns(resultsDir);
     writeFileSync(resultsPath, `# Test results — ${runId}\n\nStarted at ${formatTime()}\n`);
     return { resultsPath, screenshotsDir };
 }
@@ -60,7 +62,7 @@ export function appendDomainResults(resultsPath, result) {
     }
     appendFileSync(resultsPath, lines.join("\n"));
 }
-export function appendSummary(resultsPath, totals) {
+export function appendSummary(resultsPath, totals, screenshotsDir) {
     const content = [
         "---\n",
         "## Summary\n",
@@ -68,4 +70,29 @@ export function appendSummary(resultsPath, totals) {
         `Finished at ${formatTime()}\n`,
     ].join("\n");
     appendFileSync(resultsPath, content);
+    cleanupEmptyDir(screenshotsDir);
+}
+function cleanupEmptyDir(dir) {
+    if (!existsSync(dir))
+        return;
+    const entries = readdirSync(dir);
+    if (entries.length === 0) {
+        rmSync(dir);
+    }
+}
+function pruneOldRuns(resultsDir) {
+    const entries = readdirSync(resultsDir);
+    const runIds = entries
+        .filter((e) => e.match(/^\d{4}-\d{2}-\d{2}-\d{4}\.md$/))
+        .map((e) => e.replace(/\.md$/, ""))
+        .sort();
+    while (runIds.length >= MAX_RUNS) {
+        const oldest = runIds.shift();
+        const mdPath = join(resultsDir, `${oldest}.md`);
+        const dirPath = join(resultsDir, oldest);
+        if (existsSync(mdPath))
+            rmSync(mdPath);
+        if (existsSync(dirPath))
+            rmSync(dirPath, { recursive: true });
+    }
 }
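
The retention rule added by `pruneOldRuns` can be easier to follow without the filesystem calls. The sketch below extracts only the selection logic from the hunk above into a pure function. `selectRunsToPrune` and the sample directory listing are hypothetical, and the extra `length > 0` guard is an addition so the helper terminates for any `maxRuns`.

```javascript
// Same filename pattern as in reporter.js: run files are named YYYY-MM-DD-HHmm.md.
const RUN_FILE = /^\d{4}-\d{2}-\d{2}-\d{4}\.md$/;

// Hypothetical pure helper: returns the run IDs the pruning loop would delete.
function selectRunsToPrune(entries, maxRuns) {
  const runIds = entries
    .filter((e) => RUN_FILE.test(e))
    .map((e) => e.replace(/\.md$/, ""))
    .sort(); // lexicographic order equals chronological order for this format
  const pruned = [];
  // Oldest runs go first until fewer than maxRuns remain.
  while (runIds.length > 0 && runIds.length >= maxRuns) {
    pruned.push(runIds.shift());
  }
  return pruned;
}

// Illustrative listing of features/exspec/ (made-up names).
const entries = [
  ".gitignore",
  "2026-03-20-1430.md",
  "2026-03-20-1430",
  "2026-03-18-0900.md",
  "2026-03-19-1015.md",
  "2026-03-21-0800.md",
  "2026-03-22-1200.md",
  "2026-03-23-0930.md",
  "test-run.md",
];
console.log(selectRunsToPrune(entries, 5));
```

With six matching run files and a limit of 5, the two oldest IDs are selected, leaving four. Because `initResultsFile` prunes before writing the new results file, the directory ends up with at most `MAX_RUNS` runs once the new one is added. Names that do not match the pattern, such as `test-run.md`, are never pruned. Separately, Node's `fs.rmSync` requires `recursive: true` to remove a directory, even an empty one, so the `rmSync(dir)` call in `cleanupEmptyDir` may throw when the screenshots directory exists and is empty; `rmdirSync(dir)` is the usual call for that case.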

package/dist/reporter.test.js CHANGED

@@ -70,14 +70,14 @@ describe("appendSummary", () => {
     });
     test("writes totals", () => {
         mkdirSync(tmpRoot, { recursive: true });
-        const { resultsPath } = initResultsFile(tmpRoot, "test-run");
+        const { resultsPath, screenshotsDir } = initResultsFile(tmpRoot, "test-run");
         const totals = {
             passed: 5,
             failed: 2,
             skipped: 1,
             errors: 0,
         };
-        appendSummary(resultsPath, totals);
+        appendSummary(resultsPath, totals, screenshotsDir);
         const content = readFileSync(resultsPath, "utf-8");
         expect(content).toContain("5 passed, 2 failed, 1 skipped, 0 errors");
     });

package/dist/runner.js CHANGED

@@ -56,6 +56,7 @@ function invokeClaude(prompt, cwd, mcpConfigPath) {
             "mcp__playwright__*",
             "--output-format",
             "stream-json",
+            "--verbose",
             "--model",
             "sonnet",
             "--mcp-config",
@@ -112,6 +113,9 @@ function invokeClaude(prompt, cwd, mcpConfigPath) {
                     resultText = event.result ?? "";
                     cost = event.cost_usd;
                     duration = event.duration_ms;
+                    if (event.is_error) {
+                        resultText = `Error: ${resultText}`;
+                    }
                     break;
                 }
             }
@@ -133,7 +137,8 @@ function invokeClaude(prompt, cwd, mcpConfigPath) {
             }
             process.stderr.write("\n");
             if (code !== 0) {
-                reject(new Error(`claude exited with code ${code}${stderr ? `: ${stderr.slice(0, 500)}` : ""}`));
+                const detail = resultText || stderr.slice(0, 500) || `exit code ${code}`;
+                reject(new Error(detail));
             }
             else {
                 resolve({ result: resultText, cost, duration });
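
The two error-handling changes in `invokeClaude` combine into a fallback chain for the rejection message. The sketch below restates them as standalone functions so the precedence is visible. `resultTextFromEvent` and `failureDetail` are hypothetical names, and the event objects are made-up samples that use only the fields shown in the hunks above.

```javascript
// Hypothetical helper mirroring the is_error handling for result events.
function resultTextFromEvent(event) {
  const text = event.result ?? "";
  return event.is_error ? `Error: ${text}` : text;
}

// Hypothetical helper mirroring the fallback chain used on a non-zero exit:
// agent result text first, then truncated stderr, then the bare exit code.
function failureDetail(resultText, stderr, code) {
  return resultText || stderr.slice(0, 500) || `exit code ${code}`;
}

const text = resultTextFromEvent({ result: "rate limited", is_error: true });
console.log(failureDetail(text, "some stderr output", 1)); // agent result wins
console.log(failureDetail("", "some stderr output", 1));   // falls back to stderr
console.log(failureDetail("", "", 2));                     // falls back to exit code
```

One consequence of the chain: an error event with an empty `result` still produces the non-empty string `Error: `, which takes precedence over stderr.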

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@mnapoli/exspec",
-  "version": "0.1.0",
+  "version": "0.1.1",
   "description": "Executable specs — run Gherkin feature files with an AI agent in the browser",
   "type": "module",
   "bin": {