npm - agent-gauntlet - Versions diffs - 0.12.0 → 0.13.1 - Mend

agent-gauntlet 0.12.0 → 0.13.1


Files changed (6)

package/dist/index.js +381 -305
package/dist/index.js.map +13 -13
package/dist/skill-templates/check-catalog.md +50 -1
package/dist/skill-templates/setup-ref-project-structure.md +153 -0
package/dist/skill-templates/setup-skill.md +66 -126
package/package.json +1 -1

package/dist/skill-templates/check-catalog.md CHANGED

@@ -346,11 +346,60 @@ entry_points:
       - docs-review
 ```
+### Wildcard Entry Points (Monorepos)
+Use a single-level wildcard (`*`) to expand one entry point into one job per changed subdirectory. This is ideal for monorepos where each package has the same toolchain:
+```yaml
+entry_points:
+  # Root: project-wide checks
+  - path: "."
+    checks:
+      - security-deps
+    reviews:
+      - code-quality
+  # Per-package: expands to one job per changed package
+  - path: "packages/*"
+    checks:
+      - build
+      - lint
+      - typecheck
+      - test
+```
+Check commands run with the working directory set to the matched package (e.g., `packages/api`), so a single `test.yml` works for all packages sharing the same test runner.
+### Split Project Entry Points
+For projects with distinct parts (e.g., frontend + backend) that may use different toolchains:
+```yaml
+entry_points:
+  - path: "frontend"
+    checks:
+      - build
+      - lint-frontend
+      - test-frontend
+    reviews:
+      - code-quality
+  - path: "backend"
+    checks:
+      - build-backend
+      - lint-backend
+      - test-backend
+    reviews:
+      - code-quality
+```
+When parts share the same command for a category (e.g., both run `npm test`), use one shared check file — the working directory is set per entry point at runtime. When they use different commands, create separate check files with a suffix (e.g., `test-frontend.yml`, `test-backend.yml`).
 ### Fields
 | Field | Type | Required | Description |
 |-------|------|----------|-------------|
-| `path` | string | Yes | Directory path to monitor. Relative to project root. |
+| `path` | string | Yes | Directory path to monitor. Relative to project root. Supports single-level wildcards (e.g., `packages/*`). |
 | `checks` | string[] | No | List of check names matching `.gauntlet/checks/<name>.yml` files. |
 | `reviews` | string[] | No | List of review names matching `.gauntlet/reviews/<name>.yml` or `.md` files. |
 | `exclude` | string[] | No | Glob patterns for files to exclude from change detection within this path. |
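
Note on the hunk above: the wildcard behaviour it documents (one job per changed subdirectory) can be pictured with a short sketch. This is not the package's implementation, which lives in `dist/index.js` and is not shown in this diff. `expand_entry_point` and its arguments are hypothetical names, and the sketch assumes changed files arrive as project-relative POSIX paths.

```python
from pathlib import PurePosixPath


def expand_entry_point(pattern: str, changed_files: list[str]) -> list[str]:
    """Return the directories that would each get a job for one entry point.

    Hypothetical helper: a trailing single-level wildcard ("packages/*")
    yields one directory per changed subdirectory; a plain path yields
    itself when anything under it changed.
    """
    paths = [PurePosixPath(f) for f in changed_files]

    if pattern == ".":
        # Root entry point: any change at all triggers it.
        return ["."] if paths else []

    if pattern.endswith("/*"):
        parent = PurePosixPath(pattern[:-2])
        depth = len(parent.parts)
        matched = set()
        for p in paths:
            # Require parent + one directory level + at least a file inside it.
            if len(p.parts) > depth + 1 and p.parts[:depth] == parent.parts:
                matched.add(str(PurePosixPath(*p.parts[: depth + 1])))
        return sorted(matched)

    base = PurePosixPath(pattern)
    hit = any(p == base or base in p.parents for p in paths)
    return [pattern] if hit else []


changed = ["packages/api/src/index.ts", "packages/web/package.json", "README.md"]
print(expand_entry_point("packages/*", changed))  # ['packages/api', 'packages/web']
print(expand_entry_point(".", changed))           # ['.']
```

Each returned directory would then serve as the working directory for that job's check commands. The `exclude` patterns from the Fields table are deliberately left out of this sketch.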

package/dist/skill-templates/setup-ref-project-structure.md ADDED

@@ -0,0 +1,153 @@
+# Multi-Project Entry Point Guide
+Reference for configuring entry points in monorepos and split projects. Only read this file if the project was classified as **monorepo** or **split project** in Step 3 of the setup skill.
+---
+## Entry Point Proposals
+### Monorepo
+Use a wildcard entry point for packages plus a root entry point for project-wide checks:
+```yaml
+entry_points:
+  # Root: project-wide checks (security audits, etc.)
+  - path: "."
+    checks:
+      - security-deps
+    reviews:
+      - code-quality
+  # Per-package: expands to one job per changed package
+  - path: "packages/*"    # or apps/*, services/*
+    checks:
+      - build
+      - lint
+      - typecheck
+      - test
+```
+**How wildcards work:** `packages/*` automatically expands at runtime to one job per changed package. Check commands run with the working directory set to the matched package (e.g., `packages/api`), so `npm test` runs inside that specific package. A single `test.yml` check file works for all packages sharing the same test runner.
+### Split project — different toolchains (e.g., frontend + backend)
+Separate entry point for each logical part:
+```yaml
+entry_points:
+  - path: "frontend"
+    checks:
+      - build
+      - lint
+      - test
+    reviews:
+      - code-quality
+  - path: "backend"
+    checks:
+      - build
+      - lint
+      - test
+    reviews:
+      - code-quality
+```
+Each entry point runs checks from within its own directory. If both parts use the same toolchain, they can share check files. If they use different toolchains, create separate check files (see "Check file naming" below).
+### Split project — same language (e.g., multiple apps or libs)
+When multiple apps or libraries share the same language and toolchain under a common parent, use a wildcard entry point:
+```yaml
+entry_points:
+  - path: "apps/*"
+    checks:
+      - build
+      - lint
+      - test
+    reviews:
+      - code-quality
+```
+This works the same as monorepo wildcards — each changed subdirectory gets its own job with the working directory set accordingly. Combine with a root entry point if there are project-wide checks (security, etc.).
+---
+## Proposing entry points
+Present the proposed layout and ask the user to confirm or adjust. They may want to:
+- Change directory paths
+- Add or remove entry points
+- Split a wildcard into individual entry points (or vice versa)
+- Add exclusion patterns
+---
+## Scanning for tooling
+**Split project:** Scan within each entry point's directory independently — different parts may use different tools.
+**Monorepo with wildcard:** Scan one representative package (the first found, or ask the user which). Also scan the project root for root-level tools (security auditing, etc.).
+---
+## Presenting findings
+Group findings by entry point:
+**Split project (different toolchains):**
+```
+Entry point: frontend/
+Category   | Tool       | Command            | Confidence
+-----------|------------|--------------------|----------
+Build      | Vite       | npm run build      | High
+Lint       | ESLint     | npx eslint .       | High
+Test       | Vitest     | npx vitest run     | High
+Entry point: backend/
+Category   | Tool       | Command            | Confidence
+-----------|------------|--------------------|----------
+Build      | Go         | go build ./...     | High
+Lint       | golangci   | golangci-lint run  | High
+Test       | Go         | go test ./...      | High
+```
+**Monorepo (shared toolchain):**
+```
+Entry point: . (root)
+Category        | Tool       | Command                          | Confidence
+----------------|------------|----------------------------------|-----------
+Security (deps) | npm audit  | npm audit --audit-level=moderate | Medium
+Entry point: packages/* (scanned: packages/core)
+Category   | Tool       | Command           | Confidence
+-----------|------------|-------------------|----------
+Build      | TypeScript | npm run build     | High
+Lint       | Biome      | npx biome check . | High
+Test       | Vitest     | npx vitest run    | High
+```
+When confirming, also ask the user whether the entry point assignments look correct.
+---
+## Check file naming
+**Shared** — Multiple entry points use the same command for a category (e.g., both run `npm test`). Create one check file (`test.yml`). The working directory is set per entry point at runtime.
+**Separate** — Entry points use different commands (e.g., frontend runs `npx vitest run`, backend runs `go test ./...`). Create suffixed files:
+- `test-frontend.yml` — `npx vitest run`
+- `test-backend.yml` — `go test ./...`
+Monorepo wildcard entry points typically use shared check files since all packages share the same toolchain.
+---
+## Updating entry_points
+For multi-entry-point fresh setups, attach `code-quality` review to the entry point covering primary source code (the wildcard or main project entry point, not the root).
+When adding checks/reviews to an existing multi-entry-point config, ask the user which entry point(s) to attach to.
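
Note on the added file above: its "Check file naming" rule (one shared file when commands match, suffixed files when they differ) reduces to a small decision. The sketch below is illustrative only; `check_file_names` is a hypothetical helper, and using the last path component of the entry point as the suffix is an assumption drawn from the `test-frontend.yml` / `test-backend.yml` example.

```python
def check_file_names(category: str, commands: dict[str, str]) -> dict[str, str]:
    """Map each entry point to a check name for one category.

    `commands` maps an entry point path (e.g. "frontend") to the command it
    runs for `category`. Hypothetical helper, for illustration only.
    """
    if len(set(commands.values())) <= 1:
        # Same command everywhere: one shared check file, e.g. test.yml.
        return {entry: category for entry in commands}
    # Different commands: suffix with the entry point's last path component.
    return {
        entry: f"{category}-{entry.rstrip('/').split('/')[-1]}"
        for entry in commands
    }


print(check_file_names("test", {"frontend": "npm test", "backend": "npm test"}))
# {'frontend': 'test', 'backend': 'test'}
print(check_file_names("test", {"frontend": "npx vitest run", "backend": "go test ./..."}))
# {'frontend': 'test-frontend', 'backend': 'test-backend'}
```

Each resulting name corresponds to a `.gauntlet/checks/<name>.yml` file and to the string listed under `checks` for that entry point.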

package/dist/skill-templates/setup-skill.md CHANGED

@@ -8,7 +8,7 @@ allowed-tools: Bash, Read, Glob, Grep, Write, Edit
 Scan the project to discover tooling and configure checks and reviews for agent-gauntlet.
-Before starting, read the `references/check-catalog.md` file for check category details, YAML schemas, and example configurations.
+Before starting, read `references/check-catalog.md` for check category details, YAML schemas, and example configurations.
 ## Step 1: Check config exists
@@ -18,14 +18,14 @@ Read `.gauntlet/config.yml`. If the file does not exist, tell the user to run `a
 Read the `entry_points` field from `.gauntlet/config.yml`.
-**If `entry_points` is empty (`[]`):** This is a fresh setup. Proceed to Step 3 (full scan).
+**If `entry_points` is empty (`[]`):** This is a fresh setup. Proceed to Step 3 (detect project structure).
 **If `entry_points` is populated:** Show the user a summary of the current configuration:
 - List each entry point with its `path`, `checks`, and `reviews`
 - Then ask the user which action to take:
   1. **Add checks** — Scan for tools not already configured. Proceed to Step 3, but filter out any checks that already appear in `entry_points`.
-  2. **Add custom** — User describes what they want to add. Skip to Step 6.
+  2. **Add custom** — User describes what they want to add. Skip to Step 7.
   3. **Reconfigure** — Start fresh. Back up existing files first:
      - Rename each `.gauntlet/checks/*.yml` file to `.yml.bak` (overwrite any previous `.bak` files)
      - Rename each custom `.gauntlet/reviews/*.md` file to `.md.bak` (overwrite any previous `.bak` files)
@@ -33,58 +33,62 @@ Read the `entry_points` field from `.gauntlet/config.yml`.
      - Clear `entry_points` to `[]` in `config.yml`
      - Proceed to Step 3
-## Step 3: Scan the project
+## Step 3: Detect project structure
-Scan the project for tooling signals across 6 check categories:
+Scan for signals to classify the project as **monorepo**, **split project**, or **single project**.
-### Categories to scan
+### Monorepo signals
-1. **Build** — Build scripts, compiled languages (npm run build, cargo build, go build, make, gradle build, mvn package, etc.)
-2. **Lint** — Linters, formatters (eslint, biome, prettier, ruff, golangci-lint, clippy, checkstyle, etc.)
-3. **Typecheck** — Static type checkers (tsc --noEmit, mypy, pyright, etc.)
-4. **Test** — Test runners, test directories (jest, vitest, pytest, go test, cargo test, mvn test, etc.)
-5. **Security (deps)** — Dependency audit tools (npm audit, pip-audit, cargo audit, etc.)
-6. **Security (code)** — Static analysis / SAST tools (semgrep, bandit, gosec, etc.)
+- `package.json` with a `workspaces` field
+- `pnpm-workspace.yaml`
+- `lerna.json`, `nx.json`, `turbo.json`
+- `Cargo.toml` with a `[workspace]` section
+- Multiple subdirectories under `packages/`, `apps/`, or `services/` each containing their own project manifest (`package.json`, `go.mod`, `Cargo.toml`, `pyproject.toml`)
-### Signals to look for
+### Split project signals
-Scan these files for tooling evidence:
-- `package.json` — Check `scripts` (build, lint, test, typecheck, format, etc.) and `devDependencies` (eslint, biome, jest, vitest, typescript, prettier, semgrep, etc.)
-- `Makefile`, `Taskfile.yml`, `justfile` — Look for targets matching check categories
-- `Cargo.toml` — Rust project (cargo build, cargo test, cargo clippy, cargo audit)
-- `pyproject.toml`, `setup.py`, `setup.cfg` — Python project; check for tool configs (ruff, mypy, pytest, bandit, pip-audit)
-- `go.mod` — Go project (go build, go test, golangci-lint, gosec)
-- `build.gradle`, `build.gradle.kts`, `pom.xml` — Java/Kotlin project (gradle build, mvn package)
-- Config files that confirm tool presence:
-  - `.eslintrc`, `.eslintrc.js`, `.eslintrc.json`, `.eslintrc.yml`, `eslint.config.js`, `eslint.config.mjs` — ESLint
-  - `biome.json`, `biome.jsonc` — Biome
-  - `ruff.toml`, `.ruff.toml` — Ruff
-  - `.golangci.yml`, `.golangci.yaml` — golangci-lint
-  - `tsconfig.json` — TypeScript (typecheck)
-  - `.prettierrc`, `.prettierrc.js`, `.prettierrc.json`, `.prettierrc.yml`, `prettier.config.js` — Prettier
-  - `jest.config.js`, `jest.config.ts`, `jest.config.mjs` — Jest
-  - `vitest.config.js`, `vitest.config.ts`, `vitest.config.mjs` — Vitest
-  - `pytest.ini`, `conftest.py` — Pytest
-  - `.semgrep.yml`, `.semgrep.yaml` — Semgrep
-- `.github/workflows/*.yml` — CI workflow files often reveal exact commands for build, lint, test, etc.
+- `frontend/` + `backend/` (or `client/` + `server/`, `web/` + `api/`) directories each containing source code and/or their own project manifest
+- Multiple apps or libraries of the same language under a common parent directory (e.g., `apps/web/`, `apps/api/`, `apps/worker/` each with their own source and config) — suggests a wildcard entry point like `apps/*`
-**For the "add checks" path:** After scanning, filter out any checks that are already configured in the current `entry_points`.
+### Single project signals
-**If no tools are discovered:** Inform the user that no tools were automatically detected and offer the custom addition flow (skip to Step 6). Still include `code-quality` review in `entry_points`.
+- `src/` or `lib/` as sole source directory, or source files at project root
+- No monorepo or split project signals found
-## Step 4: Present findings
+**If monorepo or split project:** Read `references/project-structure.md` for detailed multi-project entry point guidance, then follow it for Steps 4 through 8. The rest of this file covers the single-project flow.
+**If single project:** Tell the user what you detected and continue below.
+## Step 4: Determine entry point path
+Infer the source directory:
+- If `src/` exists and contains source code, suggest `src`
+- If `lib/` exists and contains source code, suggest `lib`
+- Otherwise suggest `.` (project root — safer default since it captures all changes)
+**Skip this step** if adding checks to an existing entry point that already has a path.
+## Step 5: Scan for tooling
+Scan the project for tooling signals across the 6 check categories listed in `references/check-catalog.md`.
+**For the "add checks" path:** Filter out checks already configured in `entry_points`.
+**If no tools discovered:** Offer the custom flow (skip to Step 7). Still include `code-quality` review.
+## Step 6: Present findings and confirm
 Show a table of discovered checks:
 ```
-Category        | Tool            | Command                         | Confidence
-----------------|-----------------|---------------------------------|-----------
-Build           | npm             | npm run build                   | High
-Lint            | ESLint          | npx eslint .                    | High
-Typecheck       | TypeScript      | npx tsc --noEmit                | High
-Test            | Jest            | npx jest                        | High
-Security (deps) | npm audit       | npm audit --audit-level=moderate| Medium
-Security (code) | Semgrep         | semgrep scan --config auto --error .| Medium
+Category        | Tool            | Command                              | Confidence
+----------------|-----------------|--------------------------------------|-----------
+Build           | npm             | npm run build                        | High
+Lint            | ESLint          | npx eslint .                         | High
+Typecheck       | TypeScript      | npx tsc --noEmit                     | High
+Test            | Jest            | npx jest                             | High
+Security (deps) | npm audit       | npm audit --audit-level=moderate     | Medium
+Security (code) | Semgrep         | semgrep scan --config auto --error . | Medium
 ```
 **Confidence levels:**
@@ -94,78 +98,32 @@ Security (code) | Semgrep         | semgrep scan --config auto --error .| Medium
 If a category has no discovered tool, show `(not found)` with `—` for command and confidence.
-## Step 5: Ask user to confirm
 Ask the user:
-1. Which of the discovered checks to enable (default: all)
-2. Whether any commands need adjustment (e.g., different flags, different paths)
+1. Which checks to enable (default: all)
+2. Whether any commands need adjustment
-If the user declines ALL discovered checks, still include the `code-quality` review in `entry_points` and offer the custom addition flow (proceed to Step 6).
+If the user declines ALL checks, still include `code-quality` review and offer the custom flow (Step 7).
 After confirmation, proceed to Step 8 (create files).
-## Step 6: Add custom
+## Step 7: Add custom
-Ask the user:
-- Is it a **check** (shell command that passes/fails) or a **review** (AI code review)?
+Ask the user: **check** (shell command) or **review** (AI code review)?
-**For checks:**
-- Ask: What command should be run?
-- Ask: What name for this check? (used as the filename, e.g., `my-check` creates `.gauntlet/checks/my-check.yml`)
-- Ask: Which entry point path should it be attached to?
-- Ask: Any special settings? (timeout, parallel, run_in_ci, run_locally — explain defaults)
+**For checks:** Ask for command, name, and optional settings (timeout, parallel, run_in_ci, run_locally).
-**For reviews:**
-- Ask: Use the built-in `code-quality` review or write a custom review prompt?
-- If built-in: What name? (creates `.gauntlet/reviews/<name>.yml` with `builtin: code-quality`)
-- If custom: What name? What should the review focus on? Write the review prompt.
-  - Creates `.gauntlet/reviews/<name>.md` with YAML frontmatter (`num_reviews: 1`) and the review prompt as Markdown content.
+**For reviews:** Built-in (`code-quality`) or custom prompt? Ask for name and write the review content.
-## Step 7: Determine source directory
+## Step 8: Create files and update config
-Ask the user for the source directory for the entry point `path` field (e.g., `src/`, `.`, `lib/`), or infer it from project structure:
-- If `src/` directory exists and contains source code, suggest `src`
-- If `lib/` directory exists and contains source code, suggest `lib`
-- Otherwise suggest `.` (project root)
+**Checks** — Create `.gauntlet/checks/<name>.yml` with `command`, `parallel: true`, `run_in_ci: true`, `run_locally: true`. Add optional fields only when specified. See `references/check-catalog.md` for schema.
-**Skip this step** if adding checks to an existing entry point that already has a path (the "add checks" or "add custom" paths with a pre-existing entry point).
+**Custom reviews** — Create `.gauntlet/reviews/<name>.md` with YAML frontmatter (`num_reviews: 1`) and review prompt.
-## Step 8: Create check/review files
+**Built-in reviews** — Create `.gauntlet/reviews/<name>.yml` with `builtin: code-quality` and `num_reviews: 1`.
-For each confirmed item, create the appropriate file:
+**Update entry_points** in `.gauntlet/config.yml`:
-**Checks** — Create `.gauntlet/checks/<name>.yml`:
-```yaml
-command: <the command>
-parallel: true
-run_in_ci: true
-run_locally: true
-```
-Add optional fields only when the user specified them (timeout, working_directory, rerun_command, etc.). Refer to `references/check-catalog.md` for the full schema.
-**Custom reviews** — Create `.gauntlet/reviews/<name>.md`:
-```markdown
----
-num_reviews: 1
----
-# <Review Name>
-<The review prompt content>
-```
-**Built-in reviews** — Create `.gauntlet/reviews/<name>.yml`:
-```yaml
-builtin: code-quality
-num_reviews: 1
-```
-## Step 9: Update entry_points
-Edit `.gauntlet/config.yml` to update the `entry_points` section:
-**Fresh setup (was `entry_points: []`):**
 ```yaml
 entry_points:
   - path: "<source_dir>"
@@ -176,34 +134,16 @@ entry_points:
       - code-quality
 ```
-Always include `code-quality` in the `reviews` list for fresh setups, regardless of what checks the user selected.
-**Add checks / Add custom (existing entry points):**
-- Append new check names to the appropriate entry point's `checks` list
-- Append new review names to the appropriate entry point's `reviews` list
-- If the check/review should go on a new entry point (different path), add a new entry point
-## Step 10: "Add something else?"
-Ask the user: "Would you like to add another check or review?"
-- If **yes**: loop back to Step 6 (add custom)
-- If **no**: proceed to Step 11
+Always include `code-quality` in `reviews` for fresh setups. For "add checks" / "add custom": append to the appropriate entry point's lists, or add a new entry point if needed.
-## Step 11: Validate
+## Step 9: "Add something else?"
-Run `agent-gauntlet validate` to verify the configuration is valid.
+Ask the user. If yes, loop to Step 7. If no, proceed.
-**If validation passes:** proceed to Step 12.
+## Step 10: Validate
-**If validation fails:**
-1. Display the validation errors to the user
-2. Apply one corrective attempt — fix the issue based on the error message (e.g., fix a typo in a YAML file, correct a missing field, fix an entry_points reference to a non-existent check)
-3. Run `agent-gauntlet validate` again
-4. If it still fails: **STOP** and ask the user for guidance. Do not attempt further automatic fixes.
+Run `agent-gauntlet validate`. If it fails, apply one corrective attempt and re-validate. If it still fails, **STOP** and ask the user.
-## Step 12: Suggest next steps
+## Step 11: Suggest next steps
-Tell the user:
-- Configuration is complete and validated
-- They can now run `/gauntlet-run` to execute the full verification suite
-- They can run `/gauntlet-setup` again at any time to add more checks or reconfigure
+Tell the user: configuration is complete. Run `/gauntlet-run` to execute, or `/gauntlet-setup` again to add more.
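
Note on the rewritten Steps 3 and 4 above: the signals are written as instructions for an agent, but they translate fairly directly into file-system checks. The following is a rough sketch under stated assumptions, not code from the package. `classify` and `suggest_entry_path` are hypothetical names, "contains source code" is simplified to "directory is non-empty", and because the monorepo and split-project signals overlap for `apps/*` layouts, the sketch arbitrarily checks monorepo signals first.

```python
import json
from pathlib import Path

MANIFESTS = ("package.json", "go.mod", "Cargo.toml", "pyproject.toml")
WORKSPACE_FILES = ("pnpm-workspace.yaml", "lerna.json", "nx.json", "turbo.json")
SPLIT_PAIRS = (("frontend", "backend"), ("client", "server"), ("web", "api"))


def _has_workspaces(root: Path) -> bool:
    """True if root/package.json parses and has a `workspaces` field."""
    pkg = root / "package.json"
    if not pkg.is_file():
        return False
    try:
        data = json.loads(pkg.read_text(encoding="utf-8"))
    except (ValueError, OSError):
        return False
    return isinstance(data, dict) and "workspaces" in data


def _manifest_dirs(parent: Path) -> int:
    """Count immediate subdirectories that carry their own project manifest."""
    if not parent.is_dir():
        return 0
    return sum(
        1
        for d in parent.iterdir()
        if d.is_dir() and any((d / m).is_file() for m in MANIFESTS)
    )


def classify(root: Path) -> str:
    """Classify a project as monorepo, split project, or single project."""
    cargo = root / "Cargo.toml"
    if (
        _has_workspaces(root)
        or any((root / f).is_file() for f in WORKSPACE_FILES)
        or (cargo.is_file() and "[workspace]" in cargo.read_text(encoding="utf-8"))
        or any(_manifest_dirs(root / d) > 1 for d in ("packages", "apps", "services"))
    ):
        return "monorepo"
    if any((root / a).is_dir() and (root / b).is_dir() for a, b in SPLIT_PAIRS):
        return "split project"
    return "single project"


def suggest_entry_path(root: Path) -> str:
    """Step 4 heuristic: prefer src, then lib, else the project root."""
    for candidate in ("src", "lib"):
        d = root / candidate
        if d.is_dir() and any(d.iterdir()):
            return candidate
    return "."


if __name__ == "__main__":
    here = Path(".")
    print(classify(here), suggest_entry_path(here))
```

In the template itself the final call stays with the user: the skill reports what it detected and asks for confirmation before writing `entry_points`.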

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "agent-gauntlet",
-  "version": "0.12.0",
+  "version": "0.13.1",
   "description": "A CLI tool for testing AI coding agents",
   "license": "Apache-2.0",
   "author": "Paul Caplan",