agent-gauntlet 1.2.0 → 1.2.2
- package/dist/index.js +195 -122
- package/dist/index.js.map +8 -8
- package/package.json +1 -1
- package/skills/gauntlet-commit/SKILL.md +7 -3
- package/skills/gauntlet-run/SKILL.md +15 -19
package/package.json
CHANGED
package/skills/gauntlet-commit/SKILL.md
CHANGED
@@ -16,11 +16,15 @@ Commit with optional gauntlet validation. Runs `agent-gauntlet detect` first, va
 Run `agent-gauntlet detect` using `Bash`:
 
 ```bash
-agent-gauntlet detect 2>&1
+agent-gauntlet detect 2>&1; echo "DETECT_EXIT:$?"
 ```
 
-
-
+Check the exit code from the `DETECT_EXIT:` line:
+
+- **Exit 0** → gates would run, continue to Step 2
+- **Exit 2** → no gates would run (no changes or no applicable gates), **skip to Step 4** (commit directly)
+- **Exit 1** → error, report the error to the user and stop
+- **Any other exit code** → treat as error, report output to the user, and stop
 
 ## Step 2 - Determine Validation Intent
 
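The exit-code handling added in the gauntlet-commit hunk can be sketched as a small shell helper. This is an illustrative sketch, not code from the package: `parse_detect_exit` and `route_detect_exit` are hypothetical names, and the labels they print are invented for the example. Only the `DETECT_EXIT:` marker and the meaning of exit codes 0, 1, and 2 come from the diff.

```bash
#!/usr/bin/env bash

# Hypothetical helper: pull the numeric code out of the last
# "DETECT_EXIT:<n>" marker found in captured command output.
parse_detect_exit() {
  printf '%s\n' "$1" | sed -n 's/.*DETECT_EXIT:\([0-9][0-9]*\)$/\1/p' | tail -n 1
}

# Hypothetical helper: map the detect exit code to the next action
# described in the skill. The printed labels are illustrative only.
route_detect_exit() {
  case "$1" in
    0) echo "continue-to-step-2" ;;  # gates would run
    2) echo "skip-to-step-4" ;;      # no gates would run, commit directly
    *) echo "error-stop" ;;          # exit 1, any other code, or no code found
  esac
}

# Usage (commented out so the sketch runs without the CLI installed):
#   output="$(agent-gauntlet detect 2>&1; echo "DETECT_EXIT:$?")"
#   route_detect_exit "$(parse_detect_exit "$output")"
```

Treating a missing marker the same as an error keeps the sketch aligned with the "any other exit code" rule: nothing proceeds unless exit 0 or exit 2 is seen explicitly.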
package/skills/gauntlet-run/SKILL.md
CHANGED
@@ -13,26 +13,22 @@ Fix issues you reasonably agree with or believe the human wants to be fixed. Ski
 
 ## Procedure
 
-### Step 1 -
-
-Run `agent-gauntlet clean` to archive any previous log files.
-
-### Step 2 - Run Gauntlet
+### Step 1 - Run Gauntlet
 
 If the caller requests a specific review to be enabled, append `--enable-review <name>` to the run command for each requested review.
 
 Run `agent-gauntlet run` using `Bash` with `timeout: 300000`. **ALWAYS wait for and read the full command output** before proceeding — the command typically takes 1-2 minutes. **Verify you can see a `Status:` line in the output before continuing.**
 
-### Step
+### Step 2 - Check Status
 
 **NEVER assume success** — you must see an explicit `Status:` line before continuing. Check it and route accordingly:
-- `Status: Passed` → Go to Step
-- `Status: Passed with warnings` → Go to Step
-- `Status: Failed` → Continue to Step
-- `Status: Retry limit exceeded` →
+- `Status: Passed` → Go to Step 8.
+- `Status: Passed with warnings` → Go to Step 8.
+- `Status: Failed` → Continue to Step 3. **You MUST continue — do not stop here.**
+- `Status: Retry limit exceeded` → Go to Step 8.
 - No status line visible → **Known issue:** Bun can drop all stdout/stderr when LLM review subprocesses run. Read the console log file to get the status: find the latest `console.*.log` in the gauntlet log directory (e.g., `gauntlet_logs/console.1.log`) and look for the `Status:` line there. If no console log is found there, also check `gauntlet_logs/previous/` for logs from the most recent archived run. If no console log exists in either location, the command may have timed out or failed to run — re-run with a longer timeout or investigate the error. Do NOT proceed as if it passed.
 
-### Step
+### Step 3 - Extract Failures
 
 Required when status is Failed:
 - Infer the log directory from the file paths in the console output (e.g., if output references `gauntlet_logs/check_._lint.1.log`, the log directory is `gauntlet_logs/`)
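The status routing and console-log fallback described in the Check Status step can be sketched in shell as follows. This is a hedged illustration, not the package's implementation: `route_status` and `find_console_log` are hypothetical names, the printed labels are invented, and "latest" is assumed here to mean the most recently modified `console.*.log` file.

```bash
#!/usr/bin/env bash

# Hypothetical helper: route on the last `Status:` line in the given text.
route_status() {
  status_line="$(printf '%s\n' "$1" | grep 'Status:' | tail -n 1)"
  case "$status_line" in
    *"Status: Failed"*) echo "step-3" ;;                # must continue to fixing
    *"Status: Passed"*) echo "step-8" ;;                # also covers "Passed with warnings"
    *"Status: Retry limit exceeded"*) echo "step-8" ;;
    *) echo "check-console-log" ;;                      # never assume success
  esac
}

# Hypothetical helper: look for a console log in the log directory first,
# then in its previous/ subdirectory. Assumes "latest" means newest by mtime.
find_console_log() {
  for dir in "$1" "$1/previous"; do
    latest="$(ls -t "$dir"/console.*.log 2>/dev/null | head -n 1)"
    if [ -n "$latest" ]; then
      echo "$latest"
      return 0
    fi
  done
  return 1  # no log in either location: the run may have timed out or failed
}
```

A caller would pass the captured run output to `route_status` first, and only when it prints `check-console-log` read the file returned by `find_console_log` and route on its contents.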
@@ -42,31 +38,31 @@ Required when status is Failed:
 b. **Subagent delegation**: If your environment supports delegating work to a subagent but not the Task tool, delegate the extract-prompt instructions with the log directory to a subagent for processing.
 c. **Inline fallback**: If no subagent capability is available, follow the extract-prompt instructions yourself to read the log files and produce the compact failure summary.
 
-### Step
+### Step 4 - Report Failures
 
-Print the compact failure summary returned from Step
+Print the compact failure summary returned from Step 3.
 
-### Step
+### Step 5 - Fix
 
 Apply the review guidance above to each failure and fix accordingly:
 - CHECK failures with Fix Skill: invoke the named skill
 - CHECK failures with Fix Instructions: follow the instructions
 - REVIEW violations: fix or skip per the review guidance above
 
-### Step
+### Step 6 - Update Review Decisions
 
 For REVIEW violations you addressed:
 - Read `update-prompt.md` from this skill's directory
-- **Update review decisions** using the first available strategy (same as Step
+- **Update review decisions** using the first available strategy (same as Step 3):
 a. **Task tool** (Claude Code): `Task` with `subagent_type="general-purpose"`, `model="haiku"`, `prompt=` update-prompt content + log directory + decisions list. **Task calls MUST be synchronous** — NEVER use `run_in_background: true`.
 b. **Subagent delegation**: Delegate the update-prompt instructions with the log directory and decisions to a subagent.
 c. **Inline fallback**: Follow the update-prompt instructions yourself to update the review JSON files.
 
-### Step
+### Step 7 - Re-run Verification
 
-**NEVER skip this step** — if the run failed, you MUST fix and re-run. Run the same command from Step
+**NEVER skip this step** — if the run failed, you MUST fix and re-run. Run the same command from Step 1 (including any `--enable-review` flags) again with `Bash` and `timeout: 300000`. The tool detects existing logs and automatically switches to verification mode. **Go back to Step 2** to check the status line and repeat.
 
-### Step
+### Step 8 - Summarize Session
 
 Provide a summary of the session:
 - Final Status: (Passed / Passed with warnings / Retry limit exceeded)
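Because the re-run step must repeat the exact command from the first run, including every `--enable-review` flag, it can help to build the command once and reuse it. The sketch below shows one way to do that; `build_run_command` is a hypothetical helper, and the review names in the usage comment are placeholders rather than reviews known to exist in any project.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build the run command once so the initial run and
# every verification re-run use identical flags.
# Arguments: zero or more review names requested by the caller.
build_run_command() {
  cmd="agent-gauntlet run"
  for review in "$@"; do
    cmd="$cmd --enable-review $review"
  done
  echo "$cmd"
}

# Usage (review names are placeholders):
#   run_cmd="$(build_run_command security performance)"
#   echo "$run_cmd"
#   # prints: agent-gauntlet run --enable-review security --enable-review performance
```

The `timeout: 300000` value from the skill is a parameter of the `Bash` tool call rather than a CLI flag, so it is deliberately left out of the command string.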