@twarc_net/groundtruth 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,5 +1,5 @@
1
1
  <p align="center">
2
- <img src="assets/hero.svg" alt="groundtruth — catch when your AI coding assistant claims work it didn't do" width="820">
2
+ <img src="assets/demo.svg" alt="groundtruth — catch when your AI coding assistant claims work it didn't do" width="820">
3
3
  </p>
4
4
 
5
5
  <p align="center">
@@ -11,8 +11,22 @@
11
11
  <a href="https://github.com/youcefzemmar/groundtruth/blob/main/CONTRIBUTING.md"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg" alt="PRs welcome"></a>
12
12
  </p>
13
13
 
14
+ <p align="center">
15
+ <b>English</b> ·
16
+ <a href="docs/i18n/README.zh-CN.md">简体中文</a> ·
17
+ <a href="docs/i18n/README.es.md">Español</a> ·
18
+ <a href="docs/i18n/README.pt-BR.md">Português</a> ·
19
+ <a href="docs/i18n/README.fr.md">Français</a> ·
20
+ <a href="docs/i18n/README.de.md">Deutsch</a> ·
21
+ <a href="docs/i18n/README.ja.md">日本語</a> ·
22
+ <a href="docs/i18n/README.ru.md">Русский</a> ·
23
+ <a href="docs/i18n/README.ar.md">العربية</a>
24
+ </p>
25
+
14
26
  # groundtruth
15
27
 
28
+ > **TL;DR** — Your AI says _"Done! I added X, fixed Y, wrote tests."_ groundtruth checks each claim against the real diff and flags the ones that never happened. One command: `npx @twarc_net/groundtruth install`.
29
+
16
30
  **Catch when your AI coding assistant claims work it didn't do.**
17
31
 
18
32
  Your agent ends a turn with _"Done! I added a `rateLimiter` middleware to `src/server.ts`, fixed the timeout bug, and added tests."_ You trust the summary, commit, and move on. Two weeks later production breaks — the rate limiter was never written. The summary lied (or hallucinated), and nothing checked it against the actual diff.
@@ -69,6 +83,13 @@ Restart Claude Code (or run `/hooks`) and groundtruth checks every turn automati
69
83
  npx @twarc_net/groundtruth verify
70
84
  ```
71
85
 
86
+ Prefer plugins? Add the marketplace and install in one step:
87
+
88
+ ```text
89
+ /plugin marketplace add youcefzemmar/groundtruth
90
+ /plugin install groundtruth
91
+ ```
92
+
72
93
  ## How it works
73
94
 
74
95
  ```text
@@ -119,6 +140,98 @@ By default the hook is **non-blocking**: it prints its report and gets out of th
119
140
 
120
141
  Full details in [`docs/claim-types.md`](docs/claim-types.md).
121
142
 
143
+ ## Use in CI (GitHub Action)
144
+
145
+ Post claim verdicts as a sticky comment on every PR — grading the **PR description against the diff**, so it works on any PR with zero agent setup:
146
+
147
+ ```yaml
148
+ # .github/workflows/groundtruth.yml
149
+ name: groundtruth
150
+ on: pull_request
151
+ permissions:
152
+ contents: read
153
+ pull-requests: write
154
+ jobs:
155
+ claim-check:
156
+ runs-on: ubuntu-latest
157
+ steps:
158
+ - uses: actions/checkout@v6
159
+ with: { fetch-depth: 0 }
160
+ - uses: youcefzemmar/groundtruth@v0.3.0
161
+ ```
162
+
163
+ Add `with: { strict: true }` to turn it into a merge gate. Full options in [docs/github-action.md](docs/github-action.md).
164
+
165
+ ### Locally, against your commit message
166
+
167
+ `--staged` checks a message against what's actually staged — drop this in `.git/hooks/commit-msg` (or a [lefthook](https://github.com/evilmartians/lefthook)/husky `commit-msg` hook):
168
+
169
+ ```sh
170
+ #!/bin/sh
171
+ # .git/hooks/commit-msg — verify the commit message against the staged diff
172
+ npx @twarc_net/groundtruth verify --summary "$1" --staged
173
+ # add --strict to abort the commit when a claim is unsupported
174
+ ```
175
+
176
+ ## Configuration
177
+
178
+ Optional — drop a `.groundtruthrc.json` in your project (or a `"groundtruth"` key in package.json):
179
+
180
+ ```json
181
+ {
182
+ "strict": false,
183
+ "failOn": ["unsupported"],
184
+ "shadow": false,
185
+ "ignore": ["CHANGELOG.md", "*.generated.ts"],
186
+ "ignoreKinds": ["command"],
187
+ "output": "terminal"
188
+ }
189
+ ```
190
+
191
+ - **`ignore`** — claim targets to skip (substring or `*` glob). Your escape hatch for any false positive.
192
+ - **`ignoreKinds`** — whole claim kinds to skip (`file`, `symbol`, `test`, `dependency`, `command`, `action`).
193
+ - **`strict`** / **`output`** — defaults for blocking and output format.
194
+ - **`failOn`** — which verdict levels count as a failure in strict mode (default `["unsupported"]`).
195
+ - **`shadow`** — record to the ledger but never print or block (for gradual rollout).
196
+
197
+ Install into more hook events for multi-agent workflows:
198
+
199
+ ```bash
200
+ npx @twarc_net/groundtruth install --events Stop,SubagentStop,SessionEnd
201
+ ```
202
+
203
+ `SubagentStop` checks each subagent's turn; `SessionEnd` prints a per-session digest.
204
+
205
+ ## Other agents
206
+
207
+ The Stop hook is Claude Code-specific, but `verify` reads other agents' transcripts too — the claim engine is agent-neutral:
208
+
209
+ ```bash
210
+ groundtruth verify --agent codex # OpenAI Codex CLI
211
+ groundtruth verify --agent gemini # Gemini CLI
212
+ groundtruth verify --agent cursor # Cursor (agent-transcripts)
213
+ groundtruth verify --agent opencode # OpenCode
214
+ groundtruth verify --agent aider # Aider (best-effort)
215
+ groundtruth verify --agent auto # pick the most recent across all
216
+ ```
217
+
218
+ Each adapter normalizes the agent's transcript into the same `{summary, toolUses}` shape. New adapters are a great contribution — see [CONTRIBUTING.md](CONTRIBUTING.md).
219
+
220
+ ## Stats & status bar
221
+
222
+ The hook keeps a privacy-safe local tally (counts only — never code or prompts, in `~/.groundtruth/ledger.jsonl`):
223
+
224
+ ```bash
225
+ groundtruth stats # this project: turns, verified, unsupported, to-review (7d/30d/all)
226
+ groundtruth stats --all # across every project
227
+ ```
228
+
229
+ Show a live count in the Claude Code status bar (`🔎 gt 3❌ ·7d`):
230
+
231
+ ```bash
232
+ npx @twarc_net/groundtruth install --statusline
233
+ ```
234
+
122
235
  ## Honest limitations
123
236
 
124
237
  - It verifies that claimed work **exists in the diff**, not that it is **correct**. _"Fixed the bug"_ can be confirmed to touch the right code; it cannot be confirmed to actually fix anything. That's what tests are for.
@@ -134,13 +247,34 @@ const report = runPipeline({ transcriptPath: "session.jsonl", cwd: process.cwd()
134
247
  console.log(renderMarkdown(report));
135
248
  ```
136
249
 
250
+ ## FAQ
251
+
252
+ **Does it send my code anywhere?**
253
+ No. It runs entirely locally — reads your transcript and `git`, writes nothing except when you run `install`. Zero network calls, zero runtime dependencies.
254
+
255
+ **Will it block my commits or get in the way?**
256
+ No. By default it just prints a report and exits cleanly. Blocking is strictly opt-in (`--strict`).
257
+
258
+ **Isn't this what tests are for?**
259
+ Tests catch code that's _wrong_. groundtruth catches code that was _never written_ but reported as done — there's nothing for a test to run. They're complementary.
260
+
261
+ **Does it work with Cursor / other agents?**
262
+ The engine is format-agnostic; today it ships a Claude Code transcript adapter. Adapters for other agents are a great first contribution — see [CONTRIBUTING.md](CONTRIBUTING.md).
263
+
264
+ **Will it falsely accuse me?**
265
+ It's tuned hard against that. A claim is only `unsupported` when it's concretely checkable and nothing supports it; everything fuzzy is shown as advisory, never a failure.
266
+
137
267
  ## Contributing
138
268
 
139
269
  Issues and PRs welcome — especially new claim patterns, agent adapters, and false-positive reports (those are gold). See [CONTRIBUTING.md](CONTRIBUTING.md) and the [Code of Conduct](CODE_OF_CONDUCT.md).
140
270
 
141
271
  ## Star history
142
272
 
143
- If groundtruth ever catches your agent in a lie, consider [starring the repo](https://github.com/youcefzemmar/groundtruth) — it helps other people find it.
273
+ If groundtruth ever catches your agent in a lie, a helps other people find it.
274
+
275
+ <a href="https://star-history.com/#youcefzemmar/groundtruth&Date">
276
+ <img src="https://api.star-history.com/svg?repos=youcefzemmar/groundtruth&type=Date" alt="Star History Chart" width="600">
277
+ </a>
144
278
 
145
279
  ## License
146
280