@motivation-labs/crosscheck 0.8.0-beta.9 → 0.9.0-beta.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -10,40 +10,18 @@
10
10
 
11
11
  # crosscheck
12
12
 
13
- **A lightweight orchestration layer that makes your AI coding agents review each other's work then fix it.**
13
+ **Automated AI code review for teams using Claude Code and Codex configured your way, zero new infrastructure.**
14
14
 
15
- When Claude Code opens a PR, Codex reviews it. When Codex opens a PR, Claude reviews it. If issues are found on a Claude-authored PR, Claude commits fixes and pushes them back. All of this runs on your laptop, against your existing subscriptions, with a single command.
16
-
17
- ```
18
- GitHub PR → crosscheck watch → AI review posted → fixes committed
19
- ```
20
-
21
- > Inspired by [Symphony](https://github.com/openai/symphony) — OpenAI's spec-driven multi-agent framework. Where Symphony coordinates agents at the product level, crosscheck stays at the level engineers already live in: pull requests, diffs, and code review. No new abstractions — just the CR + fix loop, automated.
22
-
23
- ---
24
-
25
- ## Why
26
-
27
- Coding agents ship fast. They also make confident, plausible mistakes. The fix isn't a human reviewer on every AI-authored PR — it's the *other* AI reviewing it. Claude and Codex have complementary blind spots; cross-vendor review catches more issues than either model alone, without adding human latency to every commit.
28
-
29
- crosscheck wires that loop:
30
-
31
- | | |
32
- |---|---|
33
- | **Local execution, always listening** | Runs on your machine. `crosscheck watch` opens a tunnel and keeps running. No cloud, no infra, no SaaS. |
34
- | **Subscription-funded** | Uses `claude` and `codex` CLIs against your existing Claude Pro/Max and ChatGPT Plus/Pro plans. No per-token API billing. Reviews are free once you're subscribed. |
35
- | **Cross-vendor by default** | Claude reviews Codex PRs; Codex reviews Claude PRs. Each model brings different training and different failure modes. The overlap is where bugs hide. |
36
- | **Self-improving** | `crosscheck diagnose` surfaces failure patterns from logs. `crosscheck optimize` feeds them to your AI and updates reviewer instructions automatically. |
15
+ When your AI agent opens a PR, the rival AI reviews it. If issues are found, the original agent fixes them and opens a follow-up PR. The whole loop runs against your existing subscriptions with a single command.
37
16
 
38
17
  ---
39
18
 
40
- ## Who this is for
19
+ ## Highlights
41
20
 
42
- **The AI-native developer** — You use Claude Code and Codex to ship features across multiple repos. crosscheck routes every PR to the rival AI, closes the loop with auto-fix PRs, and requires zero new infrastructure beyond your existing subscriptions.
43
-
44
- **The solo founder or indie hacker** — You're the only engineer and your agents run around the clock. Without crosscheck, no one reviews what the agent ships. With it, every PR gets a second opinion before it lands — automatically, while you sleep.
45
-
46
- **The team scaling AI** — Your org has adopted AI coding tools across multiple repos. crosscheck gives you consistent coverage out of the box, a self-improving instruction set via `crosscheck optimize`, and a quantified ROI story via `crosscheck impact --money` when leadership asks whether the AI investment is paying off.
21
+ - **Customizable review workflow** — configure the full pipeline: review-only, review + auto-fix, or review + fix + recheck. Per-step instructions let you tune what the reviewer focuses on without editing prompts manually.
22
+ - **Cross-vendor and single-vendor modes** — cross-vendor mode routes each PR to the rival AI for independent review. Single-vendor mode uses whichever AI you have. Switch with one config line.
23
+ - **Subscription-funded, not token-billed** — runs through the `claude` and `codex` CLIs against your Claude Pro/Max and ChatGPT Plus/Pro plans. No API keys, no per-review cost.
24
+ - **`watch` for personal use, `serve` for your team** — `crosscheck watch` runs on your laptop and opens a tunnel automatically, ideal for solo developers. `crosscheck serve` binds to a fixed port on a shared machine so the whole team gets coverage without anyone's laptop staying on.
47
25
 
48
26
  ---
49
27
 
@@ -56,68 +34,37 @@ npm install -g @anthropic-ai/claude-code && claude # Claude Pro/Max subsc
56
34
  npm install -g @openai/codex && codex login --device-auth # ChatGPT Plus/Pro subscription
57
35
  brew install gh && gh auth login # GitHub CLI
58
36
 
59
- # 2. Guided setup — auth, repos, review mode, workflow pipeline
37
+ # 2. Guided setup — repos, review mode, workflow pipeline
60
38
  crosscheck onboard
61
39
 
62
- # 3. Run continuously
63
- crosscheck watch
40
+ # 3. Start watching
41
+ crosscheck watch # personal laptop
42
+ crosscheck serve # always-on team server
64
43
  ```
65
44
 
66
- `crosscheck onboard` walks you through seven steps: environment check, deployment mode (personal vs team), repo selection, review mode (cross-vendor or single-vendor), workflow pipeline (review-only / review+fix / review+fix+recheck), connection type (localhost.run vs smee.io), and config write. `crosscheck watch` then opens a tunnel (localhost.run by default, or smee.io if you selected it), auto-registers the webhook, and starts listening.
45
+ `crosscheck onboard` walks you through repo selection, vendor mode, pipeline steps, and tunnel choice. After that, `watch` or `serve` is all you need.
67
46
 
68
47
  ---
69
48
 
70
- ## How it works
71
-
72
- ```
73
- ┌────────────────────────────────────────────────────────────────┐
74
- │ Your laptop │
75
- │ │
76
- │ crosscheck watch │
77
- │ ├── SSH tunnel (localhost.run) ◄──── GitHub webhook │
78
- │ ├── Webhook server (:7891) │
79
- │ └── PR handler │
80
- │ ├── detect origin (Claude Code? Codex? other?) │
81
- │ ├── clone PR branch │
82
- │ ├── run reviewer (cross-vendor assignment) │
83
- │ ├── post review comment │
84
- │ └── post-review auto-fix │
85
- │ ├── authoring vendor reads review comment │
86
- │ └── opens fix PR targeting original branch │
87
- └────────────────────────────────────────────────────────────────┘
88
- ```
89
-
90
- **Routing** uses a four-signal chain: PR body patterns → commit `Co-Authored-By:` trailers → branch prefix (`claude/` or `codex/`) → `author_routes` config fallback. `Generated with [Claude Code]` → Codex reviews. `Generated with [OpenAI Codex]` → Claude reviews. `allowed_authors` restricts reviews to your agent accounts.
91
-
92
- **Post-review auto-fix** runs after the review when issues are found. The vendor that authored the PR (`fixer: same-as-author`) reads its own review comment and opens a new fix PR targeting the original branch. You review and merge the fix PR — the original PR then updates automatically. Controlled by `post_review.auto_fix` in your config.
93
-
94
- **The feedback loop** closes via `crosscheck diagnose` → `crosscheck optimize`. Failure patterns and quality signals from `~/.crosscheck/logs/` feed back into improved reviewer instructions — no manual config editing required.
95
-
96
- ---
97
-
98
- ## watch output
49
+ ## What it looks like
99
50
 
100
51
  ```
101
52
  $ crosscheck watch
102
53
 
103
54
  "Move fast and review things."
104
55
 
105
- crosscheck watch
106
-
107
56
  profile personal · cross-vendor · balanced
108
57
  users your-github-login (5 repos)
109
58
  auto-fix on_issues · same-as-author · pull_request
110
- config ./crosscheck.config.yml ← edit to change above
59
+ config ./crosscheck.config.yml
111
60
 
112
- 9:18:14 AM ✓ tunnel ready: https://abc123.lhr.life
113
- 9:18:15 AM ✓ webhook registered for your-org/your-repo
114
- Waiting for PR events — Ctrl+C to stop.
61
+ ✓ tunnel ready: https://abc123.lhr.life
62
+ ✓ webhook registered for your-org/your-repo
63
+ Waiting for PR events — Ctrl+C to stop.
115
64
 
116
65
  PR #47 opened: add retry logic for flaky network calls
117
66
  origin=claude reviewer=codex
118
67
  codex reviewing... (12s)
119
- review complete (12s)
120
- posted → github.com/your-org/your-repo/pull/47
121
68
  NEEDS WORK
122
69
  auto-fix claude fixing...
123
70
  fix PR #48 opened → github.com/your-org/your-repo/pull/48
@@ -125,8 +72,6 @@ PR #47 opened: add retry logic for flaky network calls
125
72
  PR #49 opened: implement caching layer
126
73
  origin=codex reviewer=claude
127
74
  claude reviewing... (18s)
128
- review complete (18s)
129
- posted → github.com/your-org/your-repo/pull/49
130
75
  APPROVE
131
76
  ```
132
77
 
@@ -136,14 +81,18 @@ PR #49 opened: implement caching layer
136
81
 
137
82
  ```bash
138
83
  crosscheck init # check prerequisites, write starter config
139
- crosscheck onboard # guided setup — pick repos and set deployment mode
84
+ crosscheck onboard # guided setup — pick repos, mode, and pipeline
140
85
  crosscheck review <pr-url> # one-shot review of a specific PR
141
- crosscheck run <pr-url> # full workflow: review auto-fix recheck
142
- crosscheck watch # local devtunnel + auto-webhook + listening
143
- crosscheck serve # always-on fixed port, register webhook once
144
- crosscheck status # auth state, config, log summary, CLI versions
86
+ crosscheck watch # personal use tunnel + webhook + listening on your laptop
87
+ crosscheck serve # team usefixed port, register webhook once
88
+ crosscheck status # auth state, config summary, CLI versions
89
+ ```
90
+
91
+ **Continuous improvement** *(experimental)*
92
+
93
+ ```bash
145
94
  crosscheck diagnose # surface failure patterns from review logs
146
- crosscheck optimize [--apply] # update reviewer instructions from diagnose output
95
+ crosscheck optimize [--apply] # rewrite reviewer instructions based on diagnose output
147
96
  crosscheck impact [--money] # time saved, issues caught, code quality trends
148
97
  crosscheck issue # draft and file a bug report from recent error logs
149
98
  ```
@@ -152,159 +101,37 @@ crosscheck issue # draft and file a bug report from recent er
152
101
 
153
102
  ## Configuration
154
103
 
155
- Config lives at `~/.crosscheck/config.yml` by default persistent across all projects, no per-repo file needed. Run `crosscheck init` to generate it. Project-local `crosscheck.config.yml` overrides the global config when present.
104
+ Config lives at `~/.crosscheck/config.yml` one file covers all your repos. Run `crosscheck init` to generate it, or let `crosscheck onboard` write it for you.
156
105
 
157
106
  ```yaml
158
- # Which repos/orgs to watch (at least one required)
159
107
  orgs:
160
- - your-org # covers every repo in the org
108
+ - your-org
161
109
 
162
- # Only review PRs from these GitHub accounts
163
- # Auto-filled with your login by `crosscheck init` or first `crosscheck watch`
164
110
  routing:
165
111
  allowed_authors:
166
- - your-github-login # auto-detected from gh auth
112
+ - your-github-login
167
113
 
168
- # Review depth
169
114
  quality:
170
- tier: balanced # fast | balanced | thorough
115
+ tier: balanced # fast | balanced | thorough
171
116
 
172
- # Optional spend cap
173
- budget:
174
- per_review_usd: 2.0
175
- codex_monthly_usd: 50
176
-
177
- # Tunnel backend (watch mode only)
178
- # localhost.run — zero install, reconnects automatically (default)
179
- # smee — stable channel URL, queues events while offline
180
- tunnel:
181
- backend: localhost.run
182
-
183
- # Post-review auto-fix — opens a fix PR when issues are found
184
117
  post_review:
185
118
  auto_fix:
186
119
  enabled: true
187
- trigger: on_issues # on_issues | always | never
188
- min_severity: warning # error | warning | info
189
- fixer: same-as-author # the vendor that wrote the PR also fixes it
120
+ trigger: on_issues # on_issues | always | never
121
+ fixer: same-as-author
190
122
  delivery:
191
- mode: pull_request # pull_request | commit | comment
123
+ mode: pull_request
192
124
  ```
193
125
 
194
- Full configuration reference: [get-started.md](./get-started.md)
195
-
196
- ---
197
-
198
- ## Customization home
199
-
200
- Everything crosscheck learns and every choice you make persists in `~/.crosscheck/`. Back this directory up and a reinstall is instant — just run `crosscheck onboard` and press Enter through each step to confirm your previous settings.
201
-
202
- | File | Written by | Purpose |
203
- |---|---|---|
204
- | `config.yml` | `onboard`, `init` | Deployment, repos, mode, vendors, quality tier, tunnel, routing, budget |
205
- | `workflow.yml` | `onboard` | Global pipeline steps with per-step inline instructions. Written on first onboard, never overwritten — edit freely |
206
- | `webhook-secret` | auto-generated | HMAC secret for GitHub webhook verification. Reused across restarts |
207
- | `logs/YYYY-MM-DD.ndjson` | `watch`, `serve` | Structured review event log — feeds `diagnose`, `optimize`, `impact` |
208
-
209
- Per-project overrides (checked before the global files):
210
-
211
- | File | Purpose |
212
- |---|---|
213
- | `.crosscheck/workflow.yml` *(in repo)* | Per-project pipeline — takes priority over `~/.crosscheck/workflow.yml` |
214
- | `.crosscheck/AGENT.md` *(in repo)* | Per-project harness for `crosscheck optimize` — takes priority over bundled `AGENT.md` |
215
-
216
- ---
217
-
218
- ## Self-improving reviews
219
-
220
- Every review outcome is logged to `~/.crosscheck/logs/YYYY-MM-DD.ndjson`. Over time, patterns emerge — which commands the reviewer tries to run (and fails), verdict distributions, review duration trends.
221
-
222
- ```bash
223
- # See what's going wrong
224
- $ crosscheck diagnose
225
-
226
- crosscheck diagnose (2026-01-01 → 2026-05-08 · 3 log files)
227
-
228
- Reviews 47 total — 28 APPROVE 14 NEEDS WORK 5 BLOCK
229
- Failure rate codex 12% / claude 4%
230
-
231
- Suggestions
232
- ─────────────────────────────────────────────────────────────
233
- ✦ codex runs `npm test` during review (7 occurrences)
234
- → add to instructions: "Do not run npm, tsc, or test commands."
235
- ✦ 3 reviews timed out on large PRs (>400 lines changed)
236
- → consider quality.tier: fast for PRs above a size threshold
237
-
238
- # Apply the suggested fixes automatically
239
- $ crosscheck optimize --apply
240
- agent claude (lower failure rate: 4% vs codex 12%)
241
- writing ~/.crosscheck/workflow.yml (review step)
242
- + Do not run npm, tsc, jest, or any build/test commands.
243
- + Flag PRs over 400 lines changed as too large to review thoroughly.
244
- done
245
-
246
- # Measure the compounding value
247
- $ crosscheck impact --money
248
-
249
- crosscheck impact (all time · 47 reviews)
250
-
251
- Time saved
252
- ──────────────────────────────────────────────
253
- Reviews run 47
254
- Avg AI review time ~14 min
255
- Assumed human time 60 min ⓘ
256
- Total time saved ~43 h
257
-
258
- Issues caught
259
- ──────────────────────────────────────────────
260
- APPROVE 28 (60%)
261
- NEEDS WORK 14 (30%) ← actionable feedback
262
- BLOCK 5 (11%) ← potential bugs / breaking changes
263
- Total issues caught 19
264
-
265
- Estimated value: ~$8,450
266
- (43h × $150/hr + 19 issues × $150/issue)
267
- ```
126
+ Full reference: [get-started.md](./get-started.md)
268
127
 
269
128
  ---
270
129
 
271
130
  ## Deployment
272
131
 
273
- ### Laptop — `crosscheck watch`
274
-
275
- Zero configuration. SSH tunnel through `localhost.run` handles NAT — no port-forwarding, no cloud account. If the tunnel goes silent without exiting, the health check detects it within ~2 minutes and forces a reconnect + webhook re-registration.
276
-
277
- ```bash
278
- crosscheck watch
279
- # → opens tunnel, registers webhook, starts listening
280
- ```
281
-
282
- ### Server — `crosscheck serve`
283
-
284
- Bind to a fixed port on a machine with a public IP. Register the webhook once.
285
-
286
- ```bash
287
- crosscheck serve
288
- # → listens on :7891, you register https://your-server/webhook manually
289
- ```
290
-
291
- ### smee.io — stable relay (optional)
292
-
293
- `localhost.run` drops events if your laptop is offline when a PR opens. [smee.io](https://smee.io) queues them and replays on reconnect — useful when the reviewer machine isn't always on.
294
-
295
- ```bash
296
- npm install -g smee-client
297
- # Visit https://smee.io/new — copy the channel URL
298
- ```
299
-
300
- ```yaml
301
- # ~/.crosscheck/config.yml
302
- tunnel:
303
- backend: smee
304
- smee_channel: https://smee.io/your-channel-id
305
- ```
132
+ **Personal (`crosscheck watch`)** runs on your laptop. SSH tunnel through `localhost.run` handles everything — no port-forwarding, no cloud account. Health check reconnects automatically if the tunnel drops.
306
133
 
307
- crosscheck registers the webhook automatically on first `watch` start no manual webhook setup needed.
134
+ **Team (`crosscheck serve`)** bind to a fixed port on a machine with a public IP. Register the webhook once and the whole team is covered without anyone's laptop staying on.
308
135
 
309
136
  ---
310
137
 
@@ -317,7 +144,7 @@ crosscheck registers the webhook automatically on first `watch` start — no man
317
144
  | Codex CLI | latest — `npm install -g @openai/codex` |
318
145
  | GitHub CLI | 2.65+ — `brew install gh` |
319
146
 
320
- `GITHUB_TOKEN` is derived automatically when `gh auth login` has been run. No manual export required.
147
+ `GITHUB_TOKEN` is derived automatically from `gh auth login`. No manual export needed.
321
148
 
322
149
  ---
323
150
 
@@ -325,10 +152,9 @@ crosscheck registers the webhook automatically on first `watch` start — no man
325
152
 
326
153
  | | |
327
154
  |---|---|
328
- | **[get-started.md](./get-started.md)** | Full setup guide — prerequisites, all commands and flags, complete config reference, how it works, FAQ |
329
- | **[crosscheck.config.example.yml](./crosscheck.config.example.yml)** | Annotated config file with every option |
330
- | **[AGENT.md](./AGENT.md)** | Harness document used by `crosscheck optimize` — how the AI improves reviewer instructions |
331
- | **[CHANGELOG.md](./CHANGELOG.md)** | Release notes — what's new in each version |
155
+ | **[get-started.md](./get-started.md)** | Full setup guide — prerequisites, all flags, complete config reference, FAQ |
156
+ | **[crosscheck.config.example.yml](./crosscheck.config.example.yml)** | Annotated config with every option |
157
+ | **[CHANGELOG.md](./CHANGELOG.md)** | Release notes |
332
158
 
333
159
  ---
334
160