agenteval-cli 0.7.10 → 0.8.0
- package/README.md +120 -0
- package/package.json +1 -1
package/README.md
ADDED
@@ -0,0 +1,120 @@

# agenteval-cli

Your CLAUDE.md is untested. So are your AGENTS.md, your copilot-instructions.md, and your .cursorrules.

agenteval is a linter, benchmarker, and CI gate for AI coding instructions. It finds dead references, token bloat, contradictions, and stale instructions before your agent does.

[npm package](https://www.npmjs.com/package/agenteval-cli)
[CI workflow](https://github.com/lukasmetzler/agenteval/actions/workflows/ci.yml)
[License](https://github.com/lukasmetzler/agenteval/blob/main/LICENSE)

## Install

```bash
npm install -g agenteval-cli
```

Or run without installing:

```bash
npx agenteval-cli lint
```

The npm package installs a lightweight wrapper (~3 KB) that downloads the prebuilt native binary for your platform on first run. No Bun or Node runtime is needed at execution time.

### Supported Platforms

| Platform | Architecture |
|----------|-------------|
| Linux | x64 |
| macOS | ARM64 (Apple Silicon) |
| macOS | x64 (Intel) |

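
A script that wraps the install step may want to check the supported platforms before attempting a download. The following is a minimal sketch, not agenteval's own logic: `agenteval_platform_supported` is a hypothetical helper, and the mapping from `uname` output (`x86_64`, `arm64`) to the architecture names in the table is an assumption.

```bash
# Hypothetical helper (not part of agenteval): succeed only for the
# platform/architecture pairs listed in the Supported Platforms table.
# Arguments default to the current machine's `uname -s` and `uname -m`.
agenteval_platform_supported() {
  local os="${1:-$(uname -s)}" arch="${2:-$(uname -m)}"
  case "$os/$arch" in
    Linux/x86_64)  return 0 ;;  # Linux x64
    Darwin/arm64)  return 0 ;;  # macOS ARM64 (Apple Silicon)
    Darwin/x86_64) return 0 ;;  # macOS x64 (Intel)
    *)             return 1 ;;  # anything else is treated as unsupported
  esac
}
```

Called with no arguments, for example `agenteval_platform_supported || echo "unsupported platform" >&2`, it inspects the current machine.
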
## Usage

```bash
# Lint your instruction files (CLAUDE.md, AGENTS.md, etc.)
agenteval lint

# Show why each rule matters
agenteval lint --explain

# Preview AI commits in your git history
agenteval harvest --dry-run

# Run all eval tasks, fail on regressions (for CI)
agenteval ci

# Score history and trends
agenteval trends

# Self-update to the latest version
agenteval update
```

## What It Catches

- Dead references to files that don't exist in your repo
- Filler phrases that waste context tokens ("make sure to", "it is important that")
- Contradictions between instruction files
- Content overlap and duplication across multiple files
- Token budget overruns that crowd out code context
- Vague instructions without actionable specifics
- Stale instructions referencing code that was refactored weeks ago
- Broken markdown links and heading anchors
- Invalid skill metadata (per Anthropic spec)

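
To make the first item concrete, here is a rough sketch of what a dead-reference check could look like. It is an illustration only and does not reflect how agenteval implements the rule: `find_dead_refs` is a hypothetical helper, and it assumes that references appear as backtick-quoted paths containing a slash or a file extension.

```bash
# Hypothetical helper (not part of agenteval): print backtick-quoted paths
# in an instruction file that do not exist under the given repo root.
# Usage: find_dead_refs <instruction-file> [repo-root]
find_dead_refs() {
  local file="$1" root="${2:-.}" ref
  # Extract `inline code` spans, strip the backticks, and keep only spans
  # that look like paths (contain a slash or end in a file extension).
  grep -o '`[^`]*`' "$file" | tr -d '`' |
    grep -E '(/|\.[A-Za-z0-9]+$)' |
    sort -u |
    while IFS= read -r ref; do
      # Report the reference if nothing exists at that path.
      [ -e "$root/$ref" ] || printf '%s\n' "$ref"
    done
}
```

Running `find_dead_refs CLAUDE.md .` from a repo root prints one missing path per line and prints nothing when every reference resolves.
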
## Supported Instruction Formats

- `CLAUDE.md` (Claude Code)
- `AGENTS.md` (OpenAI Codex, generic agents)
- `.github/copilot-instructions.md` (GitHub Copilot)
- `.github/instructions/*.instructions.md` (scoped Copilot instructions)
- `.claude/skills/*/SKILL.md` (Anthropic skills)
- `.cursorrules` and `.cursor/rules/*.mdc` (Cursor)

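
To see which of these files a repository actually contains, a small script can walk the same paths. This is a sketch under assumptions: `list_instruction_files` is a hypothetical helper, and it checks only the exact locations listed above relative to one root, which may differ from how agenteval discovers files.

```bash
# Hypothetical helper (not part of agenteval): print the instruction files
# from the list above that exist under a repo root (default: current dir).
list_instruction_files() {
  local root="${1:-.}"
  (
    # Run in a subshell so the caller's working directory is untouched.
    cd "$root" || exit 1
    for f in CLAUDE.md AGENTS.md \
             .github/copilot-instructions.md \
             .github/instructions/*.instructions.md \
             .claude/skills/*/SKILL.md \
             .cursorrules .cursor/rules/*.mdc; do
      # Unmatched globs stay literal, so test each candidate explicitly.
      if [ -f "$f" ]; then
        printf '%s\n' "$f"
      fi
    done
  )
}
```

An empty result means none of the listed files were found at those locations.
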
## CI Integration

Use as a GitHub Action:

```yaml
- uses: lukasmetzler/agenteval@v0
  with:
    command: lint
```

Or run the CLI directly:

```bash
agenteval ci --min-score 0.7 --max-regression 0.05
```

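
The README does not spell out how the two thresholds combine. One plausible reading, assumed here purely for illustration, is that a run passes when the current score is at least `--min-score` and has dropped by no more than `--max-regression` from a baseline score. The `score_gate` function below is a hypothetical sketch of that reading, not agenteval's implementation.

```bash
# Hypothetical sketch (not agenteval's code) of a two-threshold score gate.
# Usage: score_gate <current> <baseline> <min-score> <max-regression>
# Succeeds when current >= min-score AND (baseline - current) <= max-regression.
score_gate() {
  # awk handles the floating-point comparison; its exit status is the result.
  awk -v s="$1" -v b="$2" -v min="$3" -v reg="$4" \
    'BEGIN { exit !((s >= min) && ((b - s) <= reg)) }'
}
```

Under this assumed reading, `score_gate 0.72 0.75 0.7 0.05` succeeds, because the score stays above 0.7 and the drop from 0.75 is smaller than 0.05.
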
## Configuration

agenteval works with zero configuration. To customize, create an `agenteval.yaml`:

```yaml
version: 1
model: claude-sonnet-4-6
contextBudget: 0.3
lint:
  maxTokensPerFile: 8000
  overlapThreshold: 0.3
  bloatThreshold: 0.5
```

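
For a feel of what a `maxTokensPerFile` limit of 8000 means in practice, a file's size can be estimated before linting. This is a rough sketch under stated assumptions: it uses the common approximation of about four bytes per token, which is not agenteval's tokenizer, and both `approx_tokens` and `over_token_budget` are hypothetical helpers.

```bash
# Hypothetical helpers (not part of agenteval). The 4-bytes-per-token ratio
# is a rough heuristic for English text, not a real tokenizer.

# Print an approximate token count for a file, rounding up.
approx_tokens() {
  local bytes
  bytes=$(wc -c < "$1")
  echo $(( (bytes + 3) / 4 ))
}

# Succeed when the file's estimate exceeds the limit (default 8000,
# mirroring maxTokensPerFile in the example config).
over_token_budget() {
  local max="${2:-8000}"
  [ "$(approx_tokens "$1")" -gt "$max" ]
}
```

For example, `over_token_budget CLAUDE.md && echo "CLAUDE.md may exceed the per-file limit"` gives an early warning, though the real count may differ noticeably from this estimate.
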
## Alternative Install Methods

| Method | Command |
|--------|---------|
| Homebrew | `brew tap lukasmetzler/agenteval && brew install agenteval` |
| Shell | `curl -fsSL https://raw.githubusercontent.com/lukasmetzler/agenteval/main/install.sh \| bash` |
| Binary | [GitHub Releases](https://github.com/lukasmetzler/agenteval/releases) |

## Documentation

Full docs, guides, and examples at [github.com/lukasmetzler/agenteval](https://github.com/lukasmetzler/agenteval).

## License

MIT