npm - job-forge - Versions diffs - 2.0.0 → 2.0.2 - Mend

job-forge 2.0.0 → 2.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13)

package/.cursor/rules/main.mdc CHANGED

@@ -15,8 +15,17 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
 6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `node merge-tracker.mjs` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `node merge-tracker.mjs` followed by `node verify-pipeline.mjs` before ending the session.
+7. **URLs passed to downstream subagents must come from a file, not from a prior subagent's prose.** When an orchestrator dispatches a subagent with a URL (for evaluation, apply, verification, etc.), the URL MUST originate from:
+   - `data/pipeline.md`
+   - `data/scan-history.tsv`
+   - `batch/scan-output-*.md` or similar structured output file
+   - A report file (`reports/{num}-*.md`) with an authoritative `**URL:**` header
-Everything below is context and rationale. These six numbers are the rules.
+   URLs mentioned in a subagent's return message are NOT trustworthy by default — they may be hallucinated or reconstructed. Before passing any URL from a subagent report to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
+   **Why**: a scan subagent once reported 30 plausible-looking Greenhouse IDs in its return message that did not exist in the Greenhouse API. The orchestrator dispatched 30 downstream subagents that all failed verification. Trusting prose-form URLs cost ~2 hours of wasted work and corrupted the tracker.
+Everything below is context and rationale. These seven numbers are the rules.
 ---
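The cross-check that rule 7 asks for can be sketched as a small filter. This is a minimal illustration, not code from the package: `extractUrls` and `partitionByProvenance` are hypothetical names, the sample `pipeline.md` line format is an assumption, and the caller is expected to read the authoritative files from disk and pass their contents in as strings.

```javascript
// Hypothetical helper: collect every http(s) URL found in a text blob.
// Assumption: URLs end at whitespace or at common Markdown/TSV delimiters.
function extractUrls(text) {
  return new Set(text.match(/https?:\/\/[^\s|)>\]"']+/g) ?? []);
}

// Hypothetical helper: split candidate URLs (e.g. from a subagent's return
// message) into those that appear verbatim in an authoritative file and
// those that do not.
function partitionByProvenance(candidateUrls, authoritativeTexts) {
  const known = new Set();
  for (const text of authoritativeTexts) {
    for (const url of extractUrls(text)) known.add(url);
  }
  const trusted = [];
  const rejected = [];
  for (const url of candidateUrls) {
    (known.has(url) ? trusted : rejected).push(url);
  }
  return { trusted, rejected };
}

// Assumed shape of a data/pipeline.md entry, for illustration only.
const pipelineMd = [
  "- [ ] https://boards.greenhouse.io/acme/jobs/111 | gh=acme/111",
  "- [x] https://jobs.lever.co/acme/abc",
].join("\n");

const result = partitionByProvenance(
  [
    "https://boards.greenhouse.io/acme/jobs/111",
    "https://boards.greenhouse.io/acme/jobs/1119",
  ],
  [pipelineMd]
);
console.log(result.trusted, result.rejected);
```

Matching whole URLs rather than substrings is deliberate: a fabricated ID that merely shares a prefix with a real one is rejected, and only the `trusted` list would be passed on to downstream subagents.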

package/AGENTS.md CHANGED

@@ -10,8 +10,17 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
 6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `node merge-tracker.mjs` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `node merge-tracker.mjs` followed by `node verify-pipeline.mjs` before ending the session.
+7. **URLs passed to downstream subagents must come from a file, not from a prior subagent's prose.** When an orchestrator dispatches a subagent with a URL (for evaluation, apply, verification, etc.), the URL MUST originate from:
+   - `data/pipeline.md`
+   - `data/scan-history.tsv`
+   - `batch/scan-output-*.md` or similar structured output file
+   - A report file (`reports/{num}-*.md`) with an authoritative `**URL:**` header
-Everything below is context and rationale. These six numbers are the rules.
+   URLs mentioned in a subagent's return message are NOT trustworthy by default — they may be hallucinated or reconstructed. Before passing any URL from a subagent report to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
+   **Why**: a scan subagent once reported 30 plausible-looking Greenhouse IDs in its return message that did not exist in the Greenhouse API. The orchestrator dispatched 30 downstream subagents that all failed verification. Trusting prose-form URLs cost ~2 hours of wasted work and corrupted the tracker.
+Everything below is context and rationale. These seven numbers are the rules.
 ---

package/CLAUDE.md CHANGED

@@ -10,8 +10,17 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
 6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `node merge-tracker.mjs` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `node merge-tracker.mjs` followed by `node verify-pipeline.mjs` before ending the session.
+7. **URLs passed to downstream subagents must come from a file, not from a prior subagent's prose.** When an orchestrator dispatches a subagent with a URL (for evaluation, apply, verification, etc.), the URL MUST originate from:
+   - `data/pipeline.md`
+   - `data/scan-history.tsv`
+   - `batch/scan-output-*.md` or similar structured output file
+   - A report file (`reports/{num}-*.md`) with an authoritative `**URL:**` header
-Everything below is context and rationale. These six numbers are the rules.
+   URLs mentioned in a subagent's return message are NOT trustworthy by default — they may be hallucinated or reconstructed. Before passing any URL from a subagent report to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
+   **Why**: a scan subagent once reported 30 plausible-looking Greenhouse IDs in its return message that did not exist in the Greenhouse API. The orchestrator dispatched 30 downstream subagents that all failed verification. Trusting prose-form URLs cost ~2 hours of wasted work and corrupted the tracker.
+Everything below is context and rationale. These seven numbers are the rules.
 ---

package/README.md CHANGED

@@ -19,17 +19,17 @@
 ## Quick Start
 ```bash
-npx github:razroo/JobForge create-job-forge my-job-search
+npx --package=job-forge create-job-forge my-job-search
 cd my-job-search
 npm install
 opencode
 ```
-The scaffolded `opencode.json` already has the Geometra MCP (browser automation + PDF) and Gmail MCP (reading replies) wired up — they launch automatically the first time opencode starts.
+The scaffolded `opencode.json` already has the Geometra MCP (browser automation + PDF) and Gmail MCP (reading replies) wired up — they launch automatically the first time opencode starts. `npm install` also materializes symlinks for every supported agent harness — OpenCode, Cursor, Claude Code, and Codex — so you can run `opencode`, `cursor`, `claude`, or `codex` in the same project and each picks up the shared MCP config and instructions.
 Then fill in `cv.md`, `config/profile.yml`, and `portals.yml` with your personal data, paste a job URL into opencode, and JobForge evaluates + tracks it.
-**Upgrade later:** `npm run update-harness` (pulls latest harness + fallback plugin, re-syncs symlinks, prints the resolved commit)
+**Upgrade later:** `npm run update-harness` (pulls latest `job-forge` from npm, re-syncs symlinks, prints the resolved version)
 Full setup guide and alternative install paths (including contributing to the harness itself): **[docs/SETUP.md](docs/SETUP.md)**.
@@ -125,61 +125,74 @@ You paste a job URL or description
 ## Project Structure
-**Your personal project** (after `npx create-job-forge my-search`):
+**Your personal project** (after `npx --package=job-forge create-job-forge my-search`):
 ```
 my-search/
-├── package.json                 # depends on "job-forge" (github:razroo/JobForge)
-├── opencode.json                # thin config — enables MCPs + states.yml
-├── cv.md                        # your CV (personal)
-├── article-digest.md            # your proof points (optional, personal)
-├── portals.yml                  # companies you want to scan (personal)
-├── config/profile.yml           # your identity, target roles (personal)
-├── data/                        # applications, pipeline, scan history (personal, gitignored)
-├── reports/                     # generated evaluation reports (personal, gitignored)
-├── batch/
-│   ├── batch-input.tsv          # URLs to batch-evaluate (personal)
-│   ├── batch-state.tsv          # resumable batch state (personal)
-│   ├── tracker-additions/       # TSVs waiting to merge (personal)
-│   ├── logs/                    # per-worker logs (personal, gitignored)
-│   ├── batch-prompt.md          # → symlink to node_modules/job-forge/
-│   └── batch-runner.sh          # → symlink to node_modules/job-forge/
-├── modes/                       # → symlink to node_modules/job-forge/modes/
-├── templates/                   # → symlink to node_modules/job-forge/templates/
-├── .opencode/skills/job-forge.md  # → symlink to node_modules/job-forge/
-└── node_modules/job-forge/      # the harness (fetched from github:razroo/JobForge)
+├── package.json                  # depends on "job-forge": "^2.0.0" (npm registry)
+├── opencode.json                 # thin config — enables MCPs + states.yml
+├── cv.md                         # your CV (personal)
+├── article-digest.md             # your proof points (optional, personal)
+├── portals.yml                   # companies to scan (personal)
+├── config/profile.yml            # your identity, target roles (personal)
+├── data/                         # applications, pipeline, scan history (personal, gitignored)
+├── reports/                      # generated evaluation reports (personal, gitignored)
+├── batch/{batch-input,batch-state}.tsv, tracker-additions/, logs/   # personal
+├── AGENTS.md                     # personal overrides (opencode + codex)
+├── CLAUDE.md                     # personal overrides (Claude Code), @-imports CLAUDE.harness.md
+│
+│ # ↓ symlinks into node_modules/job-forge/, regenerated by postinstall sync.mjs
+├── AGENTS.harness.md             # → harness instructions (loaded via opencode.json)
+├── CLAUDE.harness.md             # → harness instructions (imported from personal CLAUDE.md)
+├── .mcp.json                     # → Claude Code MCP config
+├── .codex/config.toml            # → Codex MCP config
+├── .cursor/mcp.json              # → Cursor MCP config
+├── .cursor/rules/main.mdc        # → Cursor always-apply rule
+├── .opencode/skills/job-forge.md # → skill router
+├── .opencode/agents/             # → @general-free, @general-paid, @glm-minimal
+├── modes/                        # → _shared.md + skill modes
+├── templates/                    # → states.yml, portals.example.yml, cv-template.html
+├── batch/batch-prompt.md         # → batch worker prompt
+├── batch/batch-runner.sh         # → parallel orchestrator
+│
+└── node_modules/job-forge/       # the harness (from npm: `job-forge@2.x`)
 ```
 Symlinks are regenerated on every `npm install` via the package's `postinstall` hook. You never have to know about harness internals — just edit `cv.md`, `portals.yml`, and `config/profile.yml`.
-**The harness itself** (this repo, what gets installed into `node_modules/job-forge/`):
+**The harness itself** (this repo, what gets published as `job-forge` on npm):
 ```
 JobForge/
-├── package.json                # bin: job-forge, create-job-forge
+├── iso/                          # ← SOURCE OF TRUTH for harness configuration
+│   ├── instructions.md           # → AGENTS.md + CLAUDE.md (Claude Code / Codex / Cursor)
+│   ├── mcp.json                  # → .mcp.json + .cursor/mcp.json + .codex/config.toml + opencode.json
+│   ├── agents/*.md               # → .opencode/agents/*.md (general-free, general-paid, glm-minimal)
+│   ├── commands/job-forge.md     # → .opencode/skills/job-forge.md
+│   └── config.json               # per-harness top-level extras (e.g. opencode `instructions` array)
+│
+├── package.json                  # bin: job-forge, create-job-forge; prepack runs iso-harness
 ├── bin/
-│   ├── job-forge.mjs           # CLI dispatcher (merge/verify/pdf/tokens/sync/...)
-│   ├── sync.mjs                # postinstall: creates symlinks in consumer project
-│   └── create-job-forge.mjs    # npx create-job-forge scaffolder
-├── .opencode/skills/job-forge.md  # the skill router
-├── modes/                      # _shared.md + 16 skill modes
-├── templates/                  # cv-template.html, portals.example.yml, states.yml
-├── config/profile.example.yml  # template for consumer's profile.yml
-├── batch/batch-prompt.md       # batch worker prompt template
-├── batch/batch-runner.sh       # parallel orchestrator
-├── scripts/token-usage-report.mjs   # opencode cost analyzer
-├── tracker-lib.mjs             # shared tracker read/write helper
-├── merge-tracker.mjs           # merge batch TSVs → tracker
-├── dedup-tracker.mjs           # remove dupes
-├── verify-pipeline.mjs         # pipeline integrity checks
-├── normalize-statuses.mjs      # canonicalize status values
-├── generate-pdf.mjs            # CV PDF generator
-├── cv-sync-check.mjs           # setup lint
-├── dashboard/                  # optional Go TUI
-├── fonts/                      # Space Grotesk + DM Sans (for PDF)
-└── docs/                       # architecture, setup, customization
+│   ├── job-forge.mjs             # CLI dispatcher (merge/verify/pdf/tokens/sync/...)
+│   ├── sync.mjs                  # postinstall: creates symlinks in consumer project
+│   └── create-job-forge.mjs      # scaffolder
+├── modes/                        # _shared.md + 16 skill modes
+├── templates/                    # cv-template.html, portals.example.yml, states.yml
+├── config/profile.example.yml    # template for consumer's profile.yml
+├── batch/{batch-prompt.md,batch-runner.sh}   # batch orchestrator
+├── scripts/
+│   ├── token-usage-report.mjs    # opencode cost analyzer
+│   └── release/check-source.mjs  # version gate for npm publish
+├── tracker-lib.mjs / merge-tracker.mjs / dedup-tracker.mjs / verify-pipeline.mjs
+├── normalize-statuses.mjs / generate-pdf.mjs / cv-sync-check.mjs
+├── dashboard/                    # optional Go TUI
+├── fonts/                        # Space Grotesk + DM Sans (for PDF)
+├── docs/                         # architecture, setup, customization
+└── .github/workflows/            # quality.yml + release.yml (CI publish to npm)
 ```
+All per-harness config trees (`.opencode/`, `.cursor/`, `.claude/`, `.codex/`, `CLAUDE.md`, `AGENTS.md`, `.mcp.json`, `opencode.json`) are **generated** from `iso/` by [`@razroo/iso-harness`](https://www.npmjs.com/package/@razroo/iso-harness) and gitignored in this repo. `npm run build:config` regenerates them locally; `prepack` regenerates them into the tarball at publish time so consumers get everything pre-baked.
 ## Documentation
 Index and cross-links: [docs/README.md](docs/README.md).

package/bin/create-job-forge.mjs CHANGED

@@ -117,7 +117,7 @@ const consumerPkg = {
     'update-harness': 'npm update job-forge @razroo/opencode-model-fallback @razroo/gmail-mcp @geometra/mcp && job-forge sync && node -e "console.log(\'✅ harness at\', require(\'./package-lock.json\').packages[\'node_modules/job-forge\'].resolved)"',
   },
   dependencies: {
-    'job-forge': 'github:razroo/JobForge',
+    'job-forge': '^2.0.0',
     // Model-fallback plugin: rotates agents through their fallback_models
     // chain on rate-limit / 5xx errors so a rate-limited free-tier model
     // doesn't wedge the whole flow. The chains live upstream in each
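The `update-harness` script in this hunk prints the `resolved` field of the `node_modules/job-forge` lockfile entry. As a hedged sketch of what that lookup returns once the dependency comes from the npm registry: in lockfile v2/v3 the entry under `packages` carries both a `version` and a `resolved` tarball URL. The helper name `describeHarness` and the sample lockfile contents below are illustrative assumptions, not part of the package.

```javascript
// Hypothetical helper: read the harness entry out of a parsed
// package-lock.json (lockfile v2/v3, which has a top-level `packages` map).
function describeHarness(lockfile, name = "job-forge") {
  const entry = lockfile.packages?.[`node_modules/${name}`];
  if (!entry) return null; // dependency not installed or lockfile too old
  return {
    version: entry.version ?? null,   // semver string, e.g. "2.0.2"
    resolved: entry.resolved ?? null, // where npm fetched the tarball from
  };
}

// Sample data for illustration only; a real lockfile has many more fields.
const sampleLock = {
  lockfileVersion: 3,
  packages: {
    "": { name: "my-job-search" },
    "node_modules/job-forge": {
      version: "2.0.2",
      resolved: "https://registry.npmjs.org/job-forge/-/job-forge-2.0.2.tgz",
    },
  },
};

const harness = describeHarness(sampleLock);
console.log("harness at", harness.version, "from", harness.resolved);
```

Reading `version` alongside `resolved` gives a plain version number to report, while `resolved` still shows which registry the tarball came from.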

package/docs/ARCHITECTURE.md CHANGED

@@ -2,31 +2,42 @@
 ## Package architecture (v2.0.0+)
-JobForge ships as an npm package. There are two kinds of repo involved:
+JobForge ships as an npm package at [`job-forge`](https://www.npmjs.com/package/job-forge). There are two kinds of repo involved:
-- **Harness** — this repo, `razroo/JobForge`. Installable via `github:razroo/JobForge` (no npm registry). Contains modes, scripts, skill router, templates, fonts, dashboard, and bin entries.
-- **Consumer project** — what users interact with day-to-day. Scaffolded via `npx create-job-forge <dir>`, or hand-authored with `job-forge` listed in `package.json` dependencies.
+- **Harness** — this repo, `razroo/JobForge`. Published to npm. Contains `iso/` (single source of truth), modes, scripts, skill router, templates, fonts, dashboard, and bin entries. Per-harness config trees are **generated** from `iso/` by [`@razroo/iso-harness`](https://www.npmjs.com/package/@razroo/iso-harness) — gitignored here, baked into the tarball by `prepack` at publish time, and landed in consumer projects via symlinks.
+- **Consumer project** — what users interact with day-to-day. Scaffolded via `npx --package=job-forge create-job-forge <dir>`, or hand-authored with `job-forge` listed in `package.json` dependencies.
-The consumer's project root contains only personal data:
+The consumer's project root contains personal data plus symlinks into `node_modules/job-forge/`:
 ```
 my-search/
-├── package.json                 # depends on "job-forge"
-├── opencode.json                # instructions: ["templates/states.yml"]
-├── cv.md                        # personal
-├── config/profile.yml           # personal
-├── portals.yml                  # personal
-├── data/                        # personal (gitignored)
-├── reports/                     # personal (gitignored)
-├── modes/                       # → symlink to node_modules/job-forge/modes/
-├── templates/                   # → symlink to node_modules/job-forge/templates/
-├── .opencode/skills/job-forge.md # → symlink
-├── batch/batch-prompt.md        # → symlink
-├── batch/batch-runner.sh        # → symlink
-└── node_modules/job-forge/      # harness, fetched from github
+├── package.json                      # depends on "job-forge": "^2.0.0"
+├── opencode.json                     # instructions: ["templates/states.yml"]
+├── cv.md                             # personal
+├── config/profile.yml                # personal
+├── portals.yml                       # personal
+├── data/                             # personal (gitignored)
+├── reports/                          # personal (gitignored)
+├── AGENTS.md                         # personal overrides (opencode + codex)
+├── CLAUDE.md                         # personal overrides (Claude Code); @-imports CLAUDE.harness.md
+│
+│ # ↓ symlinks regenerated on every `npm install` by bin/sync.mjs
+├── AGENTS.harness.md                 # → node_modules/job-forge/AGENTS.md
+├── CLAUDE.harness.md                 # → node_modules/job-forge/CLAUDE.md
+├── .mcp.json                         # → Claude Code MCP config
+├── .codex/config.toml                # → Codex MCP config
+├── .cursor/mcp.json                  # → Cursor MCP config
+├── .cursor/rules/main.mdc            # → Cursor always-apply rule
+├── .opencode/skills/job-forge.md     # → skill router
+├── .opencode/agents/                 # → @general-free, @general-paid, @glm-minimal
+├── modes/                            # → mode files
+├── templates/                        # → states.yml, portals.example.yml, cv-template.html
+├── batch/batch-prompt.md             # → batch worker prompt
+├── batch/batch-runner.sh             # → parallel orchestrator
+└── node_modules/job-forge/           # harness, installed from npm
 ```
-Symlinks are created by the harness's `postinstall` hook (`bin/sync.mjs`) on every `npm install`. They are gitignored in the scaffolder template. Real files at those paths are preserved — if a user locally customizes a mode file, the sync skips that symlink and warns.
+Symlinks are created by the harness's `postinstall` hook (`bin/sync.mjs`) on every `npm install`. Real files at those paths are preserved — if a user locally customizes a mode file, the sync skips that symlink and warns.
 The consumer's `opencode.json` loads a small set of stable files as always-present instructions: `AGENTS.harness.md` (harness operational rules), `templates/states.yml` (canonical application states), `modes/_shared.md` (scoring model), and `cv.md` (the candidate's CV). Caching these in the prefix means agents never Read them as tool calls. Churning content (score calibration anchors, specific mode files) stays out of `instructions` and is Read on demand.
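The preserve-real-files behavior described for `bin/sync.mjs` in this hunk can be pictured as a per-path decision. The sketch below is not the actual `sync.mjs` logic: `planSyncAction`, its action names, and the warning text are hypothetical, and the caller is assumed to have already inspected the target path (for example with `fs.lstatSync`) and reduced the result to one of four values.

```javascript
// Hypothetical decision function for one sync target.
// `existing` is what currently sits at the target path:
//   null (nothing there), "symlink", "file", or "directory".
function planSyncAction(targetPath, existing) {
  if (existing === null) {
    return { action: "link", targetPath };
  }
  if (existing === "symlink") {
    // A symlink from an earlier install is safe to replace.
    return { action: "relink", targetPath };
  }
  // A real file or directory is treated as a local customization:
  // leave it alone and surface a warning instead of overwriting.
  return {
    action: "skip",
    targetPath,
    warning: `${targetPath} is a real ${existing}; leaving it untouched`,
  };
}

const plans = [
  planSyncAction("templates", null),
  planSyncAction("modes", "symlink"),
  planSyncAction("batch/batch-prompt.md", "file"),
];
for (const plan of plans) {
  console.log(plan.action, plan.targetPath, plan.warning ?? "");
}
```

Keeping the decision separate from the filesystem calls makes the skip-and-warn rule easy to test without creating real symlinks.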
@@ -34,7 +45,9 @@ The skill router (`.opencode/skills/job-forge.md`) loads mode and data files on
 **Cost-tiered subagents** live in `.opencode/agents/` (`general-free`, `general-paid`, `glm-minimal`) — the orchestrator delegates procedural work to free-tier models and reserves paid models for quality-sensitive writing. See [MODEL-ROUTING.md](MODEL-ROUTING.md) for the routing architecture, why it exists, and how to customize.
-**Upgrading** the harness in a consumer project is `npm run update-harness` — fetches the latest harness (`github:razroo/JobForge`) and `@razroo/opencode-model-fallback` plugin, re-runs symlink sync, and prints the resolved commit SHA.
+**Multi-harness support.** Because `iso/` is the single source of truth, publishing ships config for OpenCode, Cursor, Claude Code, and Codex in one tarball. Consumers run any of `opencode`, `cursor`, `claude`, or `codex` in the project and each picks up the shared MCP config + instructions via the symlinks above.
+**Upgrading** the harness in a consumer project is `npm run update-harness` — pulls the latest `job-forge` from npm, refreshes the fallback plugin + pinned MCPs, re-runs symlink sync, and prints the resolved version.
 ## System Overview

package/docs/README.md CHANGED

@@ -4,12 +4,12 @@ Guides for installing JobForge, understanding how pieces fit together, and tailo
 ## Install paths
-JobForge ships as an installable package (v2.0.0+). Pick the path that matches your goal:
+JobForge ships on npm as [`job-forge`](https://www.npmjs.com/package/job-forge) (v2.0.0+). Pick the path that matches your goal:
 | Path | Who it's for | How |
 |------|--------------|-----|
-| **A — Scaffold a personal project** | Most users. You want a job search project with the harness in `node_modules`, updatable via `npm update job-forge`. | `npx github:razroo/JobForge create-job-forge my-search && cd my-search && npm install` |
-| **B — Clone the harness directly** | Contributors and hackers working on modes, scripts, or the scoring model. Personal files are gitignored. | `git clone https://github.com/razroo/JobForge.git && cd JobForge && npm install` |
+| **A — Scaffold a personal project** | Most users. You want a job search project with the harness in `node_modules`, updatable via `npm update job-forge`. | `npx --package=job-forge create-job-forge my-search && cd my-search && npm install` |
+| **B — Clone the harness directly** | Contributors and hackers working on `iso/`, modes, scripts, or the scoring model. Personal files are gitignored. | `git clone https://github.com/razroo/JobForge.git && cd JobForge && npm install && npm run build:config` |
 See [SETUP.md](SETUP.md) for both paths.

package/docs/SETUP.md CHANGED

@@ -10,16 +10,22 @@
 ### Path A — Scaffold a personal project (recommended)
-JobForge is distributed as an installable npm package. Use the scaffolder to create a new project that keeps only your personal data (CV, profile, portals, tracker) while the harness (modes, skills, scripts) lives in `node_modules/job-forge` and updates with one command.
+JobForge is published on npm as [`job-forge`](https://www.npmjs.com/package/job-forge). Use the scaffolder to create a new project that keeps only your personal data (CV, profile, portals, tracker) while the harness (modes, skills, scripts, per-harness configs) lives in `node_modules/job-forge` and updates with one command.
 ```bash
 # 1. Scaffold
-npx github:razroo/JobForge create-job-forge my-job-search
+npx --package=job-forge create-job-forge my-job-search
 cd my-job-search
-# 2. Install the harness (pulls razroo/JobForge from GitHub; postinstall
-#    creates symlinks for modes/, templates/, .opencode/skills/job-forge.md,
-#    and batch/{batch-prompt.md,batch-runner.sh,README.md})
+# 2. Install the harness. `npm install` fetches job-forge@^2.0.0 from npm;
+#    its postinstall hook creates symlinks into your project root for:
+#      .opencode/{skills/job-forge.md, agents/}
+#      .cursor/mcp.json, .cursor/rules/main.mdc
+#      .mcp.json                       (Claude Code MCP config)
+#      .codex/config.toml              (Codex MCP config)
+#      AGENTS.harness.md, CLAUDE.harness.md
+#      modes/, templates/
+#      batch/{batch-prompt.md, batch-runner.sh, README.md}
 npm install
 # 3. Fill in personal files
@@ -41,18 +47,25 @@ Paste a job URL or run `/job-forge` to see the command menu.
 To **upgrade the harness** later:
 ```bash
-npm update job-forge       # pulls latest razroo/JobForge
+npm update job-forge       # pulls latest job-forge from the npm registry
 npx job-forge sync         # refresh symlinks if anything drifted
 ```
+Or simpler, via the scaffolded script: `npm run update-harness` (also refreshes the fallback plugin + pinned MCPs, reprints the resolved version).
 ### Path B — Clone the harness directly
-Use this if you want to hack on the harness itself (add modes, tune the scoring model, contribute back). Personal files are gitignored.
+Use this if you want to hack on the harness itself (edit `iso/`, tune the scoring model, add modes, contribute back). Personal files are gitignored.
 ```bash
 git clone https://github.com/razroo/JobForge.git
 cd JobForge
 npm install
+npm run build:config   # regenerate per-harness trees from iso/ (CLAUDE.md,
+                       # AGENTS.md, .mcp.json, .codex/, .cursor/, .opencode/,
+                       # opencode.json) — these are gitignored but materialized
+                       # locally so OpenCode/Cursor/Claude Code/Codex can read
+                       # them while you develop
 # Add personal files the same way as Path A
 cp config/profile.example.yml config/profile.yml
@@ -60,7 +73,7 @@ cp templates/portals.example.yml portals.yml
 # Create cv.md in the project root
 ```
-When you're inside this repo, the `postinstall` symlink step is a no-op (detected and skipped). All npm scripts run the harness code directly. The repo's `opencode.json` at the project root registers the same Geometra + Gmail MCPs as the scaffolder ships to consumers.
+When you're inside this repo, the `postinstall` symlink step is a no-op (detected and skipped). All npm scripts run the harness code directly. The repo's generated `opencode.json` at the project root registers the same Geometra + Gmail MCPs as the scaffolder ships to consumers. Re-run `npm run build:config` any time you edit something under `iso/`; `prepack` runs the same build automatically at publish time so tarballs always match `iso/`.
 ## Personalization

package/iso/instructions.md CHANGED

@@ -10,8 +10,17 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
 6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `node merge-tracker.mjs` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `node merge-tracker.mjs` followed by `node verify-pipeline.mjs` before ending the session.
+7. **URLs passed to downstream subagents must come from a file, not from a prior subagent's prose.** When an orchestrator dispatches a subagent with a URL (for evaluation, apply, verification, etc.), the URL MUST originate from:
+   - `data/pipeline.md`
+   - `data/scan-history.tsv`
+   - `batch/scan-output-*.md` or similar structured output file
+   - A report file (`reports/{num}-*.md`) with an authoritative `**URL:**` header
-Everything below is context and rationale. These six numbers are the rules.
+   URLs mentioned in a subagent's return message are NOT trustworthy by default — they may be hallucinated or reconstructed. Before passing any URL from a subagent report to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
+   **Why**: a scan subagent once reported 30 plausible-looking Greenhouse IDs in its return message that did not exist in the Greenhouse API. The orchestrator dispatched 30 downstream subagents that all failed verification. Trusting prose-form URLs cost ~2 hours of wasted work and corrupted the tracker.
+Everything below is context and rationale. These seven numbers are the rules.
 ---

package/modes/auto-pipeline.md CHANGED

@@ -8,9 +8,12 @@ Fetch the JD content once. If the input is a **URL** (not pasted JD text), fetch
 **Pick exactly one method, in this priority order:**
-1. **Geometra MCP (preferred):** Most job portals (Lever, Ashby, Greenhouse, Workday) are SPAs. Use `geometra_connect` + `geometra_page_model` to render and read the JD. **If this returns non-empty JD text, STOP — do not WebFetch the same URL.**
-2. **WebFetch (only if Geometra is unavailable OR returned only a shell with no JD text):** For static pages (ZipRecruiter, WeLoveProduct, company career pages).
-3. **WebSearch (only if methods 1 AND 2 both failed):** Search for the role title + company on secondary portals that index the JD in static HTML.
+1. **Greenhouse JSON API (first try, if the URL is Greenhouse-backed):** If the pipeline.md entry carries `| gh={slug}/{id}` OR the URL host matches `*.greenhouse.io` / a known Greenhouse customer front-end (`*.pinterestcareers.com`, `okta.com/company/careers/opportunity/*`, `samsara.com/company/careers/roles/*`, `zoominfo.com/careers?gh_jid=*`, `collibra.com/.../?gh_jid=*`, `careers.toasttab.com/jobs?gh_jid=*`, `careers.airbnb.com/positions/*?gh_jid=*`, `coinbase.com/careers/positions/*?gh_jid=*`, `instacart.careers/job/?gh_jid=*`), extract `slug` and `id` and WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` is the authoritative JD. 404 = genuinely closed (mark CLOSED and stop). **If 200, STOP — do not fall back to Geometra or WebFetch of the front-end.** The API is faster, cheaper (no Geometra session), and never returns a bot-shell.
+2. **Geometra MCP:** Most non-Greenhouse job portals (Lever, Ashby, Workday) are SPAs. Use `geometra_connect` + `geometra_page_model` to render and read the JD. **If this returns non-empty JD text, STOP — do not WebFetch the same URL.**
+3. **WebFetch (only if Geometra is unavailable OR returned only a shell with no JD text):** For static pages (ZipRecruiter, WeLoveProduct, company career pages).
+4. **WebSearch (only if methods 1–3 all failed):** Search for the role title + company on secondary portals that index the JD in static HTML.
+**Do NOT mark a Greenhouse-sourced offer CLOSED based on a WebFetch shell or a 403 from a customer-skinned careers domain.** Pinterest, Okta, Samsara, ZoomInfo, Collibra, Toast, Airbnb, Coinbase, Instacart all serve bot-hostile fronts. The Greenhouse JSON API (step 1) is the ground truth for their offer state. A previous scan run fed 60 live Greenhouse URLs through WebFetch-only verification and 100% of them were wrongly marked CLOSED; if you see a high stale rate, you are skipping step 1.
 **Rule:** Each URL gets fetched at most once per session. If you already have the JD text in context — from Geometra, a previous WebFetch, or pasted by the candidate — do not fetch again.
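
Step 1 of the new priority order depends on deriving `slug` and `id` before anything is fetched. The sketch below shows one way to do that from a `pipeline.md` entry. It assumes numeric job IDs and handles only the `| gh={slug}/{id}` suffix and `*.greenhouse.io` hosts; a customer front-end URL carrying only `gh_jid` has no slug in it, so the sketch returns `null` for that case. The function names are hypothetical.

```javascript
// Hypothetical helpers, not part of job-forge.
function buildApiUrl(slug, id) {
  return `https://boards-api.greenhouse.io/v1/boards/${slug}/jobs/${id}`;
}

// Returns the Greenhouse JSON API URL for a pipeline.md entry, or null when
// slug and id cannot both be derived from the entry itself.
function greenhouseApiUrl(pipelineEntry) {
  // 1. Preferred: the explicit `| gh={slug}/{id}` suffix (numeric id assumed).
  const suffix = pipelineEntry.match(/\|\s*gh=([^\s/|]+)\/(\d+)\s*$/);
  if (suffix) return buildApiUrl(suffix[1], suffix[2]);

  // 2. Otherwise, look for a URL on a *.greenhouse.io host.
  const urlMatch = pipelineEntry.match(/https?:\/\/[^\s|]+/);
  if (!urlMatch) return null;
  let parsed;
  try {
    parsed = new URL(urlMatch[0]);
  } catch {
    return null; // not a parseable URL
  }
  if (!parsed.hostname.endsWith(".greenhouse.io")) return null;

  // Accept both /{slug}/jobs/{id} and /v1/boards/{slug}/jobs/{id} paths.
  const path = parsed.pathname.match(/^(?:\/v1\/boards)?\/([^/]+)\/jobs\/(\d+)/);
  return path ? buildApiUrl(path[1], path[2]) : null;
}

console.log(
  greenhouseApiUrl(
    "- [ ] https://job-boards.greenhouse.io/webflow/jobs/7689676 | Webflow | Lead AI Engineer | gh=webflow/7689676"
  )
);
```

A `null` result means the entry should proceed to step 2 (Geometra) rather than be treated as closed.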

package/modes/pipeline.md CHANGED

@@ -33,9 +33,10 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
 ## Detect JD From URL
-1. **Geometra MCP (preferred):** `geometra_connect` + `geometra_page_model`. Works with all SPAs, uses fewer tokens than raw DOM snapshots.
-2. **WebFetch (fallback):** For static pages or when Geometra is not available.
-3. **WebSearch (last resort):** Search on secondary portals that index the JD.
+1. **Greenhouse JSON API (FIRST, when the entry has `| gh={slug}/{id}` OR the host looks Greenhouse-backed):** WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` = LIVE, use it as the JD; 404 = genuinely CLOSED (mark `- [!]` and continue). Bot-hostile customer fronts (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`) MUST be verified via this API first — WebFetch/Geometra of those domains returns a shell or 403 and causes false CLOSED marks.
+2. **Geometra MCP:** `geometra_connect` + `geometra_page_model`. Works with non-Greenhouse SPAs (Lever, Ashby, Workday), uses fewer tokens than raw DOM snapshots.
+3. **WebFetch (fallback):** For static pages or when Geometra is not available.
+4. **WebSearch (last resort):** Search on secondary portals that index the JD.
 **Special cases:**
 - **LinkedIn**: May require login → mark `[!]` and ask the user to paste the text

package/modes/scan.md CHANGED

@@ -68,8 +68,12 @@ The levels are additive — all are executed, results are merged and deduplicate
 5. **Level 2 — Greenhouse APIs** (WebFetch can batch freely — it's cheap and doesn't use Geometra sessions):
    For each company in `tracked_companies` with `api:` defined and `enabled: true`:
    a. WebFetch the API URL → JSON with job list
-   b. For each job extract: `{title, url, company}`
-   c. Accumulate in candidates list (dedup with Level 1)
+   b. For each job extract: `{title, url, company, gh_slug, gh_id, updated_at}`
+      - **`url`**: ALWAYS record the canonical Greenhouse URL: `https://job-boards.greenhouse.io/{gh_slug}/jobs/{gh_id}`. Do **NOT** use `absolute_url` when it points to a customer-skinned front-end (e.g. `pinterestcareers.com/jobs/?gh_jid=N`, `okta.com/company/careers/opportunity/N`, `samsara.com/company/careers/roles/N`, `zoominfo.com/careers?gh_jid=N`, `collibra.com/.../?gh_jid=N`, `careers.toasttab.com/jobs?gh_jid=N`, `careers.airbnb.com/positions/N`, `coinbase.com/careers/positions/N`, `instacart.careers/job/?gh_jid=N`). These customer front-ends return shells or 403 to bots and cause downstream WebFetch-based verification to wrongly mark the role CLOSED.
+      - **`gh_slug`**: the Greenhouse board slug (from the API URL that was fetched).
+      - **`gh_id`**: `jobs[].id` from the API response.
+      - **`updated_at`**: `jobs[].updated_at` — record for staleness detection (skip if older than 90 days, flag if older than 30).
+   c. Accumulate in candidates list (dedup with Level 1). The pipeline.md entry MUST carry `| gh={gh_slug}/{gh_id}` at the end of the metadata so downstream evaluators can fall back to `https://boards-api.greenhouse.io/v1/boards/{gh_slug}/jobs/{gh_id}` when the canonical URL renders as a shell.
 6. **Level 3 — WebSearch queries** (WebSearch is parallel-safe; batch freely):
    For each query in `search_queries` with `enabled: true`:
@@ -102,7 +106,7 @@ The levels are additive — all are executed, results are merged and deduplicate
    - When a fuzzy match is found but the URL is new, log it as `skipped_repost` (not `skipped_dup`) with a note referencing the original entry number.
 8. **For each new offer that passes filters**:
-   a. Add to `pipeline.md` section "Pending": `- [ ] {url} | {company} | {title}`
+   a. Add to `pipeline.md` section "Pending": `- [ ] {url} | {company} | {title}` — append `| gh={gh_slug}/{gh_id}` when the offer came from the Greenhouse API (Level 2) so downstream verification can hit the JSON endpoint.
    b. Record in `scan-history.tsv`: `{url}\t{date}\t{query_name}\t{title}\t{company}\tadded`
 9. **Offers filtered by title**: record in `scan-history.tsv` with status `skipped_title`
@@ -137,6 +141,30 @@ https://...	2026-02-10	Greenhouse — SA	Junior Dev	BigCo	skipped_title
 https://...	2026-02-10	Ashby — AI PM	SA AI	OldCo	skipped_dup
 ```
+## Structured Output — Required for Downstream Dispatch
+Scan mode MUST write its ranked candidate list to a file, not just return it in prose. Downstream subagents (evaluators, appliers) must read URLs from this file, not from the scan subagent's return message. This prevents any hallucinated URL or ID from propagating.
+**File location**: `batch/scan-output-{YYYY-MM-DD}.md`
+**Format**: one markdown table per scan run, ordered by archetype-fit rank:
+| rank | company | role | gh_slug | gh_id | url | updated_at |
+|------|---------|------|---------|-------|-----|------------|
+| 1    | Webflow | Lead AI Engineer | webflow | 7689676 | https://job-boards.greenhouse.io/webflow/jobs/7689676 | 2026-04-14 |
+| ... | ... | ... | ... | ... | ... | ... |
+Every row MUST have:
+- `gh_slug` and `gh_id` copied verbatim from the Greenhouse API response (not reconstructed)
+- `url` in the canonical form `https://job-boards.greenhouse.io/{gh_slug}/jobs/{gh_id}` (matching the suffix in `data/pipeline.md`)
+- `updated_at` in `YYYY-MM-DD` form (the most recent `updated_at` in the API response)
+The scan subagent's return message MUST:
+- Reference the file path (so orchestrators know where to read)
+- Omit the ranked URL list from prose entirely (summary counts only)
+**Rationale**: in a prior run, a scan subagent returned correct IDs in `scan-history.tsv` but hallucinated plausible-looking fake IDs in its prose-form top-30 list. The orchestrator trusted prose and dispatched 30 downstream subagents against fake URLs. File-based handoff prevents this class of error.
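
The table format above can be produced mechanically from the fields collected in Level 2, which keeps the URL derived from `gh_slug` and `gh_id` rather than typed out. This is a minimal sketch under two assumptions: `updated_at` arrives as an ISO 8601 timestamp string, and the candidates are already sorted by rank. The name `renderScanOutput` and the candidate object shape are hypothetical.

```javascript
// Hypothetical helper, not part of job-forge.
// Escapes pipes so a company or role name cannot break the table layout.
function cell(value) {
  return String(value).replace(/\|/g, "\\|");
}

// `candidates` is assumed to be pre-sorted by archetype-fit rank.
function renderScanOutput(candidates) {
  const header = [
    "| rank | company | role | gh_slug | gh_id | url | updated_at |",
    "|------|---------|------|---------|-------|-----|------------|",
  ];
  const rows = candidates.map((c, index) => {
    // The URL is always built from gh_slug and gh_id, never copied from prose.
    const url = `https://job-boards.greenhouse.io/${c.gh_slug}/jobs/${c.gh_id}`;
    // Assumes an ISO 8601 timestamp; the first 10 characters are YYYY-MM-DD.
    const updatedAt = String(c.updated_at).slice(0, 10);
    return `| ${index + 1} | ${cell(c.company)} | ${cell(c.role)} | ${c.gh_slug} | ${c.gh_id} | ${url} | ${updatedAt} |`;
  });
  return [...header, ...rows].join("\n") + "\n";
}

console.log(
  renderScanOutput([
    {
      company: "Webflow",
      role: "Lead AI Engineer",
      gh_slug: "webflow",
      gh_id: 7689676,
      updated_at: "2026-04-14T09:30:00-04:00", // time of day is illustrative
    },
  ])
);
```

Writing the returned string to `batch/scan-output-{YYYY-MM-DD}.md` is left to the caller.
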
 ## Output Summary
 ```
@@ -148,12 +176,27 @@ Filtered by title: N relevant
 Duplicates: N (already evaluated or in pipeline)
 New added to pipeline.md: N
-  + {company} | {title} | {query_name}
-  ...
-→ Run /job-forge pipeline to evaluate the new offers.
+NEXT STEP RECOMMENDATION:
+- Structured candidate list written to: batch/scan-output-{YYYY-MM-DD}.md
+- Downstream subagents MUST read URLs from that file, not from this return message
+- Run /job-forge pipeline to evaluate the new offers.
 ```
+## Verify Before Marking CLOSED (downstream rule)
+**DO NOT mark a Greenhouse offer CLOSED based on a WebFetch/Geometra result alone.** Customer-skinned careers pages (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`, etc.) serve bot-hostile shells — a 403, a navbar-only response, or a client-side-only render. WebFetch sees "no JD" and mis-classifies as CLOSED.
+**Correct verification order for any Greenhouse-sourced URL** (identified by a `| gh={slug}/{id}` suffix in `pipeline.md` or a `boards-api.greenhouse.io` / `job-boards.greenhouse.io` / `boards.greenhouse.io` host):
+1. Try `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. This is the authoritative source.
+   - **200 + JSON with `title` and `content`** → offer is LIVE. Use the JSON content as the JD. Do not mark CLOSED.
+   - **404** → offer is genuinely closed. Mark CLOSED.
+   - **Other non-2xx** → treat as transient (network/rate-limit); retry once. If still failing, mark `**Verification: unconfirmed**` and continue evaluation from whatever text is available. Do NOT mark CLOSED.
+2. Only then fall back to WebFetch of the canonical `job-boards.greenhouse.io/{slug}/jobs/{id}` URL.
+3. Only then fall back to Geometra on the same canonical URL.
+**Rule of thumb:** Greenhouse postings with valid `gh_slug`/`gh_id` should be verified via the API first. A WebFetch failure on a customer-skinned domain is NOT evidence the role is closed.
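
Step 1 of the verification order above maps onto a small decision function plus a single retry. The sketch below assumes a runtime with a global `fetch` (Node 18 or later) and treats a 200 response that lacks `title` or `content` as unconfirmed, which the rules do not spell out. The names `classifyGreenhouseResponse` and `verifyOffer` are hypothetical.

```javascript
// Hypothetical helpers, not part of job-forge.
// status: HTTP status from the Greenhouse API; body: parsed JSON or null.
function classifyGreenhouseResponse(status, body) {
  const hasJd =
    body !== null &&
    typeof body === "object" &&
    typeof body.title === "string" &&
    typeof body.content === "string";
  if (status === 200 && hasJd) return "LIVE";
  if (status === 404) return "CLOSED";
  // Anything else is treated as transient and must never yield CLOSED.
  return "UNCONFIRMED";
}

// Tries the API at most twice (the initial attempt plus one retry).
// `fetchImpl` is injectable so the logic can be exercised without a network.
async function verifyOffer(slug, id, fetchImpl = fetch) {
  const apiUrl = `https://boards-api.greenhouse.io/v1/boards/${slug}/jobs/${id}`;
  for (let attempt = 0; attempt < 2; attempt++) {
    let status = 0;
    let body = null;
    try {
      const res = await fetchImpl(apiUrl);
      status = res.status;
      if (res.ok) body = await res.json();
    } catch {
      // Network error or invalid JSON: leave body null and fall through.
    }
    const verdict = classifyGreenhouseResponse(status, body);
    if (verdict !== "UNCONFIRMED") return verdict;
  }
  return "UNCONFIRMED";
}

console.log(classifyGreenhouseResponse(404, null));
console.log(classifyGreenhouseResponse(503, null));
```

Only an `UNCONFIRMED` result should lead on to steps 2 and 3; `LIVE` and `CLOSED` are final.
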
 ## Update careers_url
 Each company in `tracked_companies` MUST have a `careers_url` — the direct URL to its job listings page. The stored URL avoids searching for it every time.

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "job-forge",
-  "version": "2.0.0",
+  "version": "2.0.2",
   "description": "AI-powered job search pipeline built on opencode",
   "type": "module",
   "bin": {