npm - @twarc_net/groundtruth - Versions diffs - 0.1.0 → 0.3.0 - Mend

@twarc_net/groundtruth 0.1.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (5) hide show

package/README.md CHANGED Viewed

@@ -1,5 +1,5 @@
 <p align="center">
-  <img src="assets/hero.svg" alt="groundtruth — catch when your AI coding assistant claims work it didn't do" width="820">
+  <img src="assets/demo.svg" alt="groundtruth — catch when your AI coding assistant claims work it didn't do" width="820">
 </p>
 <p align="center">
@@ -11,8 +11,22 @@
   <a href="https://github.com/youcefzemmar/groundtruth/blob/main/CONTRIBUTING.md"><img src="https://img.shields.io/badge/PRs-welcome-brightgreen.svg" alt="PRs welcome"></a>
 </p>
+<p align="center">
+  <b>English</b> ·
+  <a href="docs/i18n/README.zh-CN.md">简体中文</a> ·
+  <a href="docs/i18n/README.es.md">Español</a> ·
+  <a href="docs/i18n/README.pt-BR.md">Português</a> ·
+  <a href="docs/i18n/README.fr.md">Français</a> ·
+  <a href="docs/i18n/README.de.md">Deutsch</a> ·
+  <a href="docs/i18n/README.ja.md">日本語</a> ·
+  <a href="docs/i18n/README.ru.md">Русский</a> ·
+  <a href="docs/i18n/README.ar.md">العربية</a>
+</p>
 # groundtruth
+> **TL;DR** — Your AI says _"Done! I added X, fixed Y, wrote tests."_ groundtruth checks each claim against the real diff and flags the ones that never happened. One command: `npx @twarc_net/groundtruth install`.
 **Catch when your AI coding assistant claims work it didn't do.**
 Your agent ends a turn with _"Done! I added a `rateLimiter` middleware to `src/server.ts`, fixed the timeout bug, and added tests."_ You trust the summary, commit, and move on. Two weeks later production breaks — the rate limiter was never written. The summary lied (or hallucinated), and nothing checked it against the actual diff.
@@ -69,6 +83,13 @@ Restart Claude Code (or run `/hooks`) and groundtruth checks every turn automati
 npx @twarc_net/groundtruth verify
 ```
+Prefer plugins? Add the marketplace and install in one step:
+```text
+/plugin marketplace add youcefzemmar/groundtruth
+/plugin install groundtruth
+```
 ## How it works
 ```text
@@ -119,6 +140,98 @@ By default the hook is **non-blocking**: it prints its report and gets out of th
 Full details in [`docs/claim-types.md`](docs/claim-types.md).
+## Use in CI (GitHub Action)
+Post claim verdicts as a sticky comment on every PR — grading the **PR description against the diff**, so it works on any PR with zero agent setup:
+```yaml
+# .github/workflows/groundtruth.yml
+name: groundtruth
+on: pull_request
+permissions:
+  contents: read
+  pull-requests: write
+jobs:
+  claim-check:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v6
+        with: { fetch-depth: 0 }
+      - uses: youcefzemmar/groundtruth@v0.3.0
+```
+Add `with: { strict: true }` to turn it into a merge gate. Full options in [docs/github-action.md](docs/github-action.md).
+### Locally, against your commit message
+`--staged` checks a message against what's actually staged — drop this in `.git/hooks/commit-msg` (or a [lefthook](https://github.com/evilmartians/lefthook)/husky `commit-msg` hook):
+```sh
+#!/bin/sh
+# .git/hooks/commit-msg — verify the commit message against the staged diff
+npx @twarc_net/groundtruth verify --summary "$1" --staged
+# add --strict to abort the commit when a claim is unsupported
+```
+## Configuration
+Optional — drop a `.groundtruthrc.json` in your project (or a `"groundtruth"` key in package.json):
+```json
+{
+  "strict": false,
+  "failOn": ["unsupported"],
+  "shadow": false,
+  "ignore": ["CHANGELOG.md", "*.generated.ts"],
+  "ignoreKinds": ["command"],
+  "output": "terminal"
+}
+```
+- **`ignore`** — claim targets to skip (substring or `*` glob). Your escape hatch for any false positive.
+- **`ignoreKinds`** — whole claim kinds to skip (`file`, `symbol`, `test`, `dependency`, `command`, `action`).
+- **`strict`** / **`output`** — defaults for blocking and output format.
+- **`failOn`** — which verdict levels count as a failure in strict mode (default `["unsupported"]`).
+- **`shadow`** — record to the ledger but never print or block (for gradual rollout).
+Install into more hook events for multi-agent workflows:
+```bash
+npx @twarc_net/groundtruth install --events Stop,SubagentStop,SessionEnd
+```
+`SubagentStop` checks each subagent's turn; `SessionEnd` prints a per-session digest.
+## Other agents
+The Stop hook is Claude Code-specific, but `verify` reads other agents' transcripts too — the claim engine is agent-neutral:
+```bash
+groundtruth verify --agent codex     # OpenAI Codex CLI
+groundtruth verify --agent gemini    # Gemini CLI
+groundtruth verify --agent cursor    # Cursor (agent-transcripts)
+groundtruth verify --agent opencode  # OpenCode
+groundtruth verify --agent aider     # Aider (best-effort)
+groundtruth verify --agent auto      # pick the most recent across all
+```
+Each adapter normalizes the agent's transcript into the same `{summary, toolUses}` shape. New adapters are a great contribution — see [CONTRIBUTING.md](CONTRIBUTING.md).
+## Stats & status bar
+The hook keeps a privacy-safe local tally (counts only — never code or prompts, in `~/.groundtruth/ledger.jsonl`):
+```bash
+groundtruth stats          # this project: turns, verified, unsupported, to-review (7d/30d/all)
+groundtruth stats --all    # across every project
+```
+Show a live count in the Claude Code status bar (`🔎 gt 3❌ ·7d`):
+```bash
+npx @twarc_net/groundtruth install --statusline
+```
 ## Honest limitations
 - It verifies that claimed work **exists in the diff**, not that it is **correct**. _"Fixed the bug"_ can be confirmed to touch the right code; it cannot be confirmed to actually fix anything. That's what tests are for.
@@ -134,13 +247,34 @@ const report = runPipeline({ transcriptPath: "session.jsonl", cwd: process.cwd()
 console.log(renderMarkdown(report));
 ```
+## FAQ
+**Does it send my code anywhere?**
+No. It runs entirely locally — reads your transcript and `git`, writes nothing except when you run `install`. Zero network calls, zero runtime dependencies.
+**Will it block my commits or get in the way?**
+No. By default it just prints a report and exits cleanly. Blocking is strictly opt-in (`--strict`).
+**Isn't this what tests are for?**
+Tests catch code that's _wrong_. groundtruth catches code that was _never written_ but reported as done — there's nothing for a test to run. They're complementary.
+**Does it work with Cursor / other agents?**
+The engine is format-agnostic; today it ships a Claude Code transcript adapter. Adapters for other agents are a great first contribution — see [CONTRIBUTING.md](CONTRIBUTING.md).
+**Will it falsely accuse me?**
+It's tuned hard against that. A claim is only `unsupported` when it's concretely checkable and nothing supports it; everything fuzzy is shown as advisory, never a failure.
 ## Contributing
 Issues and PRs welcome — especially new claim patterns, agent adapters, and false-positive reports (those are gold). See [CONTRIBUTING.md](CONTRIBUTING.md) and the [Code of Conduct](CODE_OF_CONDUCT.md).
 ## Star history
-If groundtruth ever catches your agent in a lie, consider [starring the repo](https://github.com/youcefzemmar/groundtruth) — it helps other people find it.
+If groundtruth ever catches your agent in a lie, a ⭐ helps other people find it.
+<a href="https://star-history.com/#youcefzemmar/groundtruth&Date">
+  <img src="https://api.star-history.com/svg?repos=youcefzemmar/groundtruth&type=Date" alt="Star History Chart" width="600">
+</a>
 ## License