job-forge 2.14.25 → 2.14.27

@@ -12,7 +12,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
 
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. Use `npx job-forge canon:key company-role --company "..." --role "..."` when deriving a stable duplicate key; do not invent slugs in prose. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
 
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
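The round structure that [H1] describes above (at most 2 dispatches per round, `ceil(N/2)` rounds, no new round until both outcomes are final) can be sketched as follows. This is an illustrative sketch only: `planRounds`, `runRounds`, and the `dispatch` callback are hypothetical names, not part of job-forge, and `dispatch` is assumed to resolve with a final outcome rather than a launch acknowledgement.

```javascript
// Hypothetical sketch of the [H1] round structure; not job-forge code.
// Split N jobs into ceil(N / maxParallel) rounds of at most maxParallel jobs.
function planRounds(jobs, maxParallel = 2) {
  const rounds = [];
  for (let i = 0; i < jobs.length; i += maxParallel) {
    rounds.push(jobs.slice(i, i + maxParallel));
  }
  return rounds;
}

// `dispatch` is an assumed async function that resolves only with a final
// outcome (for example "APPLIED" or "SKIP"), never with just a task id.
async function runRounds(jobs, dispatch) {
  const outcomes = [];
  for (const round of planRounds(jobs)) {
    // allSettled waits for every dispatch in the round, even if one rejects,
    // so the next round never starts while a subagent is still in flight.
    const settled = await Promise.allSettled(round.map((job) => dispatch(job)));
    outcomes.push(...settled);
  }
  return outcomes;
}

const rounds = planRounds(["job_1", "job_2", "job_3", "job_4", "job_5"]);
console.log(rounds.map((round) => round.length)); // [ 2, 2, 1 ]
```

`Promise.allSettled` is used rather than `Promise.all` because `Promise.all` rejects as soon as one dispatch fails, which would let the loop move on while the other subagent is still running.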
@@ -77,12 +77,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
+ why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
+
+ - [D15] Treat `templates/canon.json` as the source of truth for URL/company/role identity keys. Use `npx job-forge canon:key ...` or `npx job-forge canon:compare ...` before broad duplicate checks when a stable key or same/possible/different decision is useful.
+ why: `iso-canon` is not an MCP and adds no prompt/tool-schema tokens; it centralizes duplicate-key rules so agents do not repeatedly derive inconsistent slugs for aliases, suffixes, remote/location noise, or tracking URLs
+
  ## Procedure
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
@@ -78,6 +78,15 @@ Local artifact index (terminal, outside opencode):
  npx job-forge index:has --key "company-role:acme:staff-engineer"
  npx job-forge index:query "acme"
 
+ Identity keys (terminal, outside opencode):
+ npx job-forge canon:key company-role --company "Acme" --role "Staff Engineer"
+ npx job-forge canon:compare company "OpenAI, Inc." "Open AI"
+
+ Consumer migrations (terminal, outside opencode):
+ npx job-forge migrate:plan # preview package.json/.gitignore drift
+ npx job-forge migrate:apply # apply safe harness upgrade migrations
+ npx job-forge migrate:check # fail if migrations are pending
+
  Artifact contracts (terminal, outside opencode):
  npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json
  npx job-forge tracker-line ... --write # renders + validates tracker TSV locally
@@ -163,11 +172,13 @@ Step 1 — Enumerate candidates
  - Build ordered list: candidates = [job_1, job_2, ..., job_N]
 
  Step 2 — Dedup against already-applied
- - Run npx job-forge index:has --key "company-role:<company-slug>:<role-slug>"
- as a fast local artifact prefilter when company+role is known. It rebuilds
- .jobforge-index.json on demand from templates/index.json. A hit means the
- role has already appeared in tracker files or tracker TSVs and can be
- dropped before dispatch.
+ - Derive the stable key with npx job-forge canon:key company-role --company
+ "<company>" --role "<role>" when company+role is known.
+ - Run npx job-forge index:has --key "<canon-key>" as a fast local artifact
+ prefilter. It rebuilds .jobforge-index.json on demand from
+ templates/index.json and canonicalizes indexed company/role records through
+ templates/canon.json. A hit means the role has already appeared in tracker
+ files or tracker TSVs and can be dropped before dispatch.
  - If .jobforge-ledger/events.jsonl exists, use npx job-forge ledger:has as a
  fast prefilter for obvious company+role Applied duplicates. A ledger match
  can be dropped before dispatch without loading tracker files into context.
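The Step 2 prefilter in the hunk above (derive a stable key, then drop candidates whose key is already known) can be sketched as below. This is a minimal illustration, not the `iso-canon` or `iso-index` implementation: `slug`, `companyRoleKey`, and `prefilter` are hypothetical helpers, and the normalization shown is an assumption. Only the key shape `company-role:acme:staff-engineer` is taken from the diff; the actual rules live in `templates/canon.json`, and real keys should come from `npx job-forge canon:key`.

```javascript
// Hypothetical sketch: NOT the iso-canon implementation. Real keys should come
// from `npx job-forge canon:key`; the rules in templates/canon.json may differ.
function slug(text) {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse punctuation/whitespace runs to one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading/trailing hyphens
}

// Key shape follows the example in the diff: "company-role:acme:staff-engineer".
function companyRoleKey(company, role) {
  return `company-role:${slug(company)}:${slug(role)}`;
}

// Drop candidates whose key is already in a set of known keys (a stand-in for
// the index/ledger prefilter), preserving the original candidate order.
function prefilter(candidates, knownKeys) {
  return candidates.filter(
    (c) => !knownKeys.has(companyRoleKey(c.company, c.role))
  );
}

const known = new Set(["company-role:acme:staff-engineer"]);
const remaining = prefilter(
  [
    { company: "Acme", role: "Staff Engineer" },
    { company: "Globex", role: "Senior SWE" },
  ],
  known
);
console.log(remaining.map((c) => companyRoleKey(c.company, c.role)));
// [ 'company-role:globex:senior-swe' ]
```

Candidates that survive a prefilter like this still need the mandatory four-source grep from [H2]; the prefilter only removes obvious duplicates early.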
package/AGENTS.md CHANGED
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
 
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. Use `npx job-forge canon:key company-role --company "..." --role "..."` when deriving a stable duplicate key; do not invent slugs in prose. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
 
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -72,12 +72,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
+ why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
+
+ - [D15] Treat `templates/canon.json` as the source of truth for URL/company/role identity keys. Use `npx job-forge canon:key ...` or `npx job-forge canon:compare ...` before broad duplicate checks when a stable key or same/possible/different decision is useful.
+ why: `iso-canon` is not an MCP and adds no prompt/tool-schema tokens; it centralizes duplicate-key rules so agents do not repeatedly derive inconsistent slugs for aliases, suffixes, remote/location noise, or tracking URLs
+
  ## Procedure
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
package/CLAUDE.md CHANGED
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
 
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. Use `npx job-forge canon:key company-role --company "..." --role "..."` when deriving a stable duplicate key; do not invent slugs in prose. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
 
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -72,12 +72,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
+ why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
+
+ - [D15] Treat `templates/canon.json` as the source of truth for URL/company/role identity keys. Use `npx job-forge canon:key ...` or `npx job-forge canon:compare ...` before broad duplicate checks when a stable key or same/possible/different decision is useful.
+ why: `iso-canon` is not an MCP and adds no prompt/tool-schema tokens; it centralizes duplicate-key rules so agents do not repeatedly derive inconsistent slugs for aliases, suffixes, remote/location noise, or tracking URLs
+
  ## Procedure
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
package/README.md CHANGED
@@ -31,13 +31,13 @@ The scaffolded `opencode.json` already has three MCPs wired up — they launch a
  - **Gmail** — reads replies from recruiters
  - **state-trace** — typed working memory for cross-session context (resumed batches, recent decisions, repeated portal quirks). Install once with `python3 -m pip install "state-trace[mcp]"`; the MCP command is `state-trace-mcp`.
 
- JobForge also keeps MCP-free local workflow state: `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, `.jobforge-ledger/events.jsonl` records duplicate/status events via `@razroo/iso-ledger`, `.jobforge-cache/` stores reusable JD/artifact content via `@razroo/iso-cache`, and `.jobforge-index.json` indexes artifact source pointers via `@razroo/iso-index`. None of these add always-on prompt or tool-schema tokens.
+ JobForge also keeps MCP-free local workflow state: `templates/canon.json` defines URL/company/role identity keys via `@razroo/iso-canon`, `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, `templates/migrations.json` defines safe consumer-project upgrades via `@razroo/iso-migrate`, `.jobforge-ledger/events.jsonl` records duplicate/status events via `@razroo/iso-ledger`, `.jobforge-cache/` stores reusable JD/artifact content via `@razroo/iso-cache`, and `.jobforge-index.json` indexes artifact source pointers via `@razroo/iso-index`. None of these add always-on prompt or tool-schema tokens.
 
  `npm install` also materializes symlinks for every supported agent harness — OpenCode, Cursor, Claude Code, and Codex — so you can run `opencode`, `cursor`, `claude`, or `codex` in the same project and each picks up the shared MCP config and instructions.
 
  Then fill in `cv.md`, `config/profile.yml`, and `portals.yml` with your personal data, paste a job URL into opencode, and JobForge evaluates + tracks it.
 
- **Upgrade later:** `npm run update-harness` (pulls latest `job-forge` from npm, re-syncs symlinks, prints the resolved version)
+ **Upgrade later:** `npm run update-harness` (pulls latest `job-forge` from npm, re-syncs symlinks, applies safe consumer migrations, prints the resolved version)
 
  Full setup guide and alternative install paths (including contributing to the harness itself): **[docs/SETUP.md](docs/SETUP.md)**.
 
@@ -78,7 +78,7 @@ JobForge turns opencode into a full job search command center. Instead of manual
  | **Durable Batch Orchestration** | `batch-runner.sh` uses `@razroo/iso-orchestrator` for resumable bundle execution, bounded fan-out, mutexed state writes, and workflow records in `.jobforge-runs/`. |
  | **Pipeline Integrity** | Automated merge, dedup, status normalization, health checks |
  | **Cost-Aware Agent Routing** | Three subagents (`@general-free`, `@general-paid`, `@glm-minimal`) with per-task tool surfaces. On OpenCode, JobForge pins all tiers to `opencode-go/deepseek-v4-flash` so application runs avoid overloaded free-model pools. See [Subagent Routing in AGENTS.md](AGENTS.md) for the task-to-agent mapping. |
- | **Trace + Telemetry + Guard + Contract + Ledger + Capabilities + Context + Cache + Index** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, `job-forge context:*` plans mode/reference context bundles, `job-forge cache:*` reuses fetched JD/artifact content, and `job-forge index:*` queries compact source pointers without MCP/tool-schema overhead. |
+ | **Trace + Telemetry + Guard + Contract + Canon + Ledger + Capabilities + Context + Cache + Index + Migrate** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge canon:*` derives stable URL/company/role identity keys, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, `job-forge context:*` plans mode/reference context bundles, `job-forge cache:*` reuses fetched JD/artifact content, `job-forge index:*` queries compact source pointers, and `job-forge migrate:*` applies safe consumer-project upgrades without MCP/tool-schema overhead. |
  | **Token Cost Visibility** | `job-forge tokens --days 1` for per-session breakdown; `job-forge session-report --since-minutes 60 --log` to flag sessions over budget and append history to `data/token-usage.tsv`. Auto-logged after every batch run. |
 
  ## Usage
@@ -164,7 +164,7 @@ my-search/
  ├── .opencode/skills/job-forge.md # → skill router
  ├── .opencode/agents/ # → @general-free, @general-paid, @glm-minimal
  ├── modes/ # → _shared.md + skill modes
- ├── templates/ # → states.yml, portals.example.yml, cv-template.html, capabilities.json, context.json, index.json
+ ├── templates/ # → states.yml, portals.example.yml, cv-template.html, canon.json, capabilities.json, context.json, index.json, migrations.json
  ├── batch/batch-prompt.md # → batch worker prompt
  ├── batch/batch-runner.sh # → parallel orchestrator
 
@@ -190,7 +190,7 @@ JobForge/
  │ ├── sync.mjs # postinstall: creates symlinks in consumer project
  │ └── create-job-forge.mjs # scaffolder
  ├── modes/ # _shared.md + 16 skill modes
- ├── templates/ # cv-template.html, portals.example.yml, states.yml, capabilities.json, context.json
+ ├── templates/ # cv-template.html, portals.example.yml, states.yml, canon.json, capabilities.json, context.json, migrations.json
  ├── config/profile.example.yml # template for consumer's profile.yml
  ├── batch/{batch-prompt.md,batch-runner.sh} # batch orchestrator
  ├── scripts/
@@ -201,6 +201,8 @@ JobForge/
  │ ├── context.mjs # iso-context-backed context bundle CLI
  │ ├── cache.mjs # iso-cache-backed local artifact cache CLI
  │ ├── index.mjs # iso-index-backed artifact lookup CLI
+ │ ├── canon.mjs # iso-canon-backed identity normalization CLI
+ │ ├── migrate.mjs # iso-migrate-backed consumer-project migrations
  │ ├── token-usage-report.mjs # opencode cost analyzer
  │ └── release/check-source.mjs # version gate for npm publish
  ├── tracker-lib.mjs / merge-tracker.mjs / dedup-tracker.mjs / verify-pipeline.mjs
@@ -124,11 +124,37 @@ const consumerPkg = {
  'ledger:verify': 'job-forge ledger:verify',
  'ledger:has': 'job-forge ledger:has',
  'ledger:query': 'job-forge ledger:query',
+ 'capabilities:list': 'job-forge capabilities:list',
+ 'capabilities:explain': 'job-forge capabilities:explain',
+ 'capabilities:check': 'job-forge capabilities:check',
+ 'capabilities:render': 'job-forge capabilities:render',
+ 'context:list': 'job-forge context:list',
+ 'context:explain': 'job-forge context:explain',
+ 'context:plan': 'job-forge context:plan',
+ 'context:check': 'job-forge context:check',
+ 'context:render': 'job-forge context:render',
+ 'cache:key': 'job-forge cache:key',
+ 'cache:has': 'job-forge cache:has',
+ 'cache:get': 'job-forge cache:get',
+ 'cache:put': 'job-forge cache:put',
+ 'cache:status': 'job-forge cache:status',
+ 'cache:list': 'job-forge cache:list',
+ 'cache:verify': 'job-forge cache:verify',
+ 'cache:prune': 'job-forge cache:prune',
  'index:build': 'job-forge index:build',
  'index:status': 'job-forge index:status',
  'index:verify': 'job-forge index:verify',
  'index:has': 'job-forge index:has',
  'index:query': 'job-forge index:query',
+ 'index:explain': 'job-forge index:explain',
150
+ 'canon:normalize': 'job-forge canon:normalize',
151
+ 'canon:key': 'job-forge canon:key',
152
+ 'canon:compare': 'job-forge canon:compare',
153
+ 'canon:explain': 'job-forge canon:explain',
154
+ 'migrate:plan': 'job-forge migrate:plan',
155
+ 'migrate:apply': 'job-forge migrate:apply',
156
+ 'migrate:check': 'job-forge migrate:check',
157
+ 'migrate:explain': 'job-forge migrate:explain',
132
158
  // One command to pull the latest harness and any locally-pinned MCP
133
159
  // packages. npm update is a no-op on packages not in package.json, so
134
160
  // listing @razroo/gmail-mcp + @geometra/mcp is safe for consumers that
@@ -230,6 +256,8 @@ Before doing any work, remember where things live in *this* project:
  | Scanner dedup history | \`data/scan-history.tsv\` | Only touch in \`/job-forge scan\` |
  | Local workflow ledger | \`.jobforge-ledger/events.jsonl\` | Deterministic append-only state; use \`job-forge ledger:*\` |
  | Local artifact index | \`.jobforge-index.json\` | Deterministic file/line lookup; use \`job-forge index:*\` |
+ | Identity canonicalization | \`templates/canon.json\` | Stable URL/company/role keys; use \`job-forge canon:*\` |
+ | Consumer migrations | \`templates/migrations.json\` | Safe script/gitignore upgrades; use \`job-forge migrate:*\` |
  | Scanner config | \`portals.yml\` (project root) | Company configs |
  | Profile / identity | \`config/profile.yml\` | Candidate name, email, target roles |
  | CV | \`cv.md\` (project root) | Markdown, source of truth |
@@ -379,6 +407,8 @@ job-forge merge # merge batch/tracker-additions/*.tsv into the tracke
  job-forge verify # verify pipeline integrity
  job-forge ledger:status # local deterministic workflow ledger status
  job-forge index:status # local artifact index status
+ job-forge canon:key company-role --company "Acme, Inc." --role "Senior SWE"
+ job-forge migrate:check # verify consumer package scripts/gitignore are current
  job-forge pdf cv.md out.pdf
  job-forge tokens --days 1 # per-session opencode token usage
  \`\`\`
package/bin/job-forge.mjs CHANGED
@@ -25,6 +25,8 @@
  * context:* Query/render deterministic context bundles via iso-context
  * cache:* Reuse local deterministic artifacts via iso-cache
  * index:* Query local artifacts via iso-index
+ * canon:* Compute deterministic identity keys via iso-canon
+ * migrate:* Apply deterministic consumer-project migrations via iso-migrate
  * sync Re-run the harness symlink sync (bin/sync.mjs)
  * help, --help Show this message
  */
@@ -127,6 +129,22 @@ const indexAliases = {
    'index:path': 'path',
  };

+ const canonAliases = {
+   'canon:normalize': 'normalize',
+   'canon:key': 'key',
+   'canon:compare': 'compare',
+   'canon:explain': 'explain',
+   'canon:path': 'path',
+ };
+
+ const migrateAliases = {
+   'migrate:plan': 'plan',
+   'migrate:apply': 'apply',
+   'migrate:check': 'check',
+   'migrate:explain': 'explain',
+   'migrate:path': 'path',
+ };
+
  const [, , cmd, ...rest] = process.argv;

  function printHelp() {
@@ -177,6 +195,13 @@ Commands:
  index:has Check indexed URL/company-role/report facts without loading source files
  index:query Query indexed reports, tracker rows, TSVs, scan history, pipeline, and ledger
  index:verify Validate local artifact index integrity
+ canon:key Print stable URL/company/role/company-role keys
+ canon:compare Compare two identifiers as same/possible/different
+ canon:explain Show the active identity canonicalization policy
+ migrate:plan Preview deterministic consumer-project migrations
+ migrate:apply Apply deterministic consumer-project migrations
+ migrate:check Fail if migrations are pending
+ migrate:explain Show the active migration policy
  sync Re-create harness symlinks in the current project

  Deterministic helpers (prefer these over LLM-derived values):
@@ -215,6 +240,10 @@ Pass --help after a command to see its own flags, e.g.:
  job-forge cache:put --url https://example.test/jobs/123 --input @jds/example.md
  job-forge index:has --key "company-role:acme:staff-engineer"
  job-forge index:query "acme"
+ job-forge canon:key company-role --company "Acme, Inc." --role "Senior SWE - Remote US"
+ job-forge canon:compare company "OpenAI, Inc." "Open AI"
+ job-forge migrate:check
+ job-forge migrate:apply

  Project directory resolves to $JOB_FORGE_PROJECT or cwd.`);
  }
@@ -344,6 +373,36 @@ if (cmd === 'index' || indexAliases[cmd]) {
    process.exit(result.status ?? 1);
  }

+ if (cmd === 'canon' || canonAliases[cmd]) {
+   const canonArgs = cmd === 'canon'
+     ? (rest.length === 0 ? ['help'] : rest)
+     : [canonAliases[cmd], ...rest];
+
+   const scriptPath = join(PKG_ROOT, 'scripts/canon.mjs');
+   const result = spawnSync(process.execPath, [scriptPath, ...canonArgs], {
+     stdio: 'inherit',
+     cwd: PROJECT_DIR,
+     env: process.env,
+   });
+
+   process.exit(result.status ?? 1);
+ }
+
+ if (cmd === 'migrate' || migrateAliases[cmd]) {
+   const migrateArgs = cmd === 'migrate'
+     ? (rest.length === 0 ? ['help'] : rest)
+     : [migrateAliases[cmd], ...rest];
+
+   const scriptPath = join(PKG_ROOT, 'scripts/migrate.mjs');
+   const result = spawnSync(process.execPath, [scriptPath, ...migrateArgs], {
+     stdio: 'inherit',
+     cwd: PROJECT_DIR,
+     env: process.env,
+   });
+
+   process.exit(result.status ?? 1);
+ }
+
  const rel = commands[cmd];
  if (!rel) {
    console.error(`Unknown command: ${cmd}\n`);
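The `canon` and `migrate` branches in this hunk share one pattern: a bare group name forwards the remaining arguments (or `help` when there are none), while a colon alias such as `canon:key` is rewritten to its subcommand before the script is spawned. The sketch below isolates only that argument resolution. `resolveGroupArgs` is a hypothetical helper name that does not appear in the package, and the own-property check is an added precaution, not something the diff shows.

```javascript
// Alias table copied from the hunk above.
const canonAliases = {
  'canon:normalize': 'normalize',
  'canon:key': 'key',
  'canon:compare': 'compare',
  'canon:explain': 'explain',
  'canon:path': 'path',
};

// Hypothetical helper: returns the argv to pass to the group script,
// or null when `cmd` does not belong to this group.
function resolveGroupArgs(group, aliases, cmd, rest) {
  // Bare group name: forward the rest, defaulting to 'help'.
  if (cmd === group) return rest.length === 0 ? ['help'] : rest;
  // Colon alias: swap in the subcommand. The own-property check keeps
  // inherited keys such as 'constructor' from matching (an assumption
  // added here; the original uses a plain lookup).
  if (Object.prototype.hasOwnProperty.call(aliases, cmd)) {
    return [aliases[cmd], ...rest];
  }
  return null;
}

console.log(resolveGroupArgs('canon', canonAliases, 'canon', []));
console.log(resolveGroupArgs('canon', canonAliases, 'canon:key', ['company-role']));
```

Returning `null` for an unrelated command lets a caller fall through to the next group, which matches how the original chains its `if` blocks before reaching the `Unknown command` branch.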
package/bin/sync.mjs CHANGED
@@ -23,6 +23,11 @@
  import { existsSync, lstatSync, readlinkSync, symlinkSync, mkdirSync, readFileSync } from 'fs';
  import { dirname, join, resolve, relative } from 'path';
  import { fileURLToPath } from 'url';
+ import {
+   loadMigrationConfig,
+   parseJson,
+   runMigrations,
+ } from '@razroo/iso-migrate';

  const __dirname = dirname(fileURLToPath(import.meta.url));
  const PKG_ROOT = resolve(__dirname, '..');
@@ -149,4 +154,29 @@ for (const { src, dst } of links) {
    }
  }

+ try {
+   const migrationResult = applyConsumerMigrations();
+   if (migrationResult?.changeCount > 0) {
+     console.log(`\njob-forge migrate: applied ${migrationResult.changeCount} change(s)`);
+   }
+ } catch (error) {
+   console.warn(`\n warn: job-forge migrations skipped: ${error instanceof Error ? error.message : String(error)}`);
+   warned++;
+ }
+
  console.log(`\njob-forge sync: ${created} created, ${skipped} up-to-date, ${warned} warnings (project: ${PROJECT_DIR})`);
+
+ function applyConsumerMigrations() {
+   const skip = process.env.JOB_FORGE_SKIP_MIGRATIONS;
+   if (skip === '1' || skip === 'true') return null;
+   if (!existsSync(join(PROJECT_DIR, 'package.json'))) return null;
+
+   const configPath = process.env.JOB_FORGE_MIGRATIONS_CONFIG || join(PKG_ROOT, 'templates', 'migrations.json');
+   if (!existsSync(configPath)) return null;
+
+   const config = loadMigrationConfig(parseJson(readFileSync(configPath, 'utf8'), configPath));
+   return runMigrations(config, {
+     root: PROJECT_DIR,
+     dryRun: false,
+   });
+ }
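`applyConsumerMigrations()` returns `null` in three distinct situations before it ever loads a config, which makes a silent skip hard to tell apart from "nothing to do". The sketch below restates only that gating order as a pure function so each case can be checked in isolation. `migrationSkipReason` and the injected `exists` predicate are hypothetical names for illustration; the package itself calls `existsSync` directly and returns no reason string.

```javascript
// Hypothetical restatement of the early-return checks in
// applyConsumerMigrations(). `exists` stands in for fs.existsSync.
// Returns a reason string when migrations would be skipped, or null
// when the function would go on to load the config and run them.
function migrationSkipReason(env, projectDir, pkgRoot, exists) {
  const skip = env.JOB_FORGE_SKIP_MIGRATIONS;
  // Only the exact strings '1' and 'true' opt out.
  if (skip === '1' || skip === 'true') return 'JOB_FORGE_SKIP_MIGRATIONS is set';
  // A project without package.json is left untouched.
  if (!exists(`${projectDir}/package.json`)) return 'no package.json in project';
  // An explicit config path wins over the bundled template.
  const configPath = env.JOB_FORGE_MIGRATIONS_CONFIG || `${pkgRoot}/templates/migrations.json`;
  if (!exists(configPath)) return `config not found: ${configPath}`;
  return null;
}

// Example with an in-memory file set instead of the real filesystem.
const files = new Set(['/proj/package.json', '/pkg/templates/migrations.json']);
const exists = (p) => files.has(p);
console.log(migrationSkipReason({}, '/proj', '/pkg', exists));
console.log(migrationSkipReason({ JOB_FORGE_SKIP_MIGRATIONS: '1' }, '/proj', '/pkg', exists));
```

Note that values such as `yes` or `TRUE` do not match the comparison in the hunk, so under this reading they would not disable migrations.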
@@ -48,7 +48,7 @@ The skill router (`.opencode/skills/job-forge.md`) loads mode and data files on
48
48
 
49
49
  **Multi-harness support.** Because `iso/` is the single source of truth, publishing ships config for OpenCode, Cursor, Claude Code, and Codex in one tarball. Consumers run any of `opencode`, `cursor`, `claude`, or `codex` in the project and each picks up the shared MCP config + instructions via the symlinks above.
50
50
 
51
- **Upgrading** the harness in a consumer project is `npm run update-harness` — pulls the latest `job-forge` from npm, refreshes pinned MCPs, re-runs symlink sync, and prints the resolved version.
51
+ **Upgrading** the harness in a consumer project is `npm run update-harness` — pulls the latest `job-forge` from npm, refreshes pinned MCPs, re-runs symlink sync, applies safe consumer migrations, and prints the resolved version.
52
52
 
53
53
  ## System Overview
54
54
 
@@ -164,7 +164,9 @@ data/pipeline.md → Pending URLs and `local:jds/...` inbox (see modes/p
164
164
  .jobforge-index.json → Deterministic artifact lookup index built from templates/index.json
165
165
  jds/*.md → Saved job descriptions referenced from the pipeline (`local:jds/{file}`)
166
166
  templates/states.yml → Canonical status values
167
+ templates/canon.json → Canonical URL/company/role identity keys
167
168
  templates/context.json → Deterministic mode/reference context bundle policy
169
+ templates/migrations.json → Safe consumer-project upgrade policy
168
170
  templates/cv-template.html → PDF generation template
169
171
  examples/*.md → Fictional layouts only (not read by scripts; see examples/README.md)
170
172
  ```
@@ -178,6 +180,8 @@ Create `data/pipeline.md` when you start using the URL inbox (`/job-forge pipeli
178
180
  - Tracker TSVs: `batch/tracker-additions/{num}-{company-slug}.tsv` (one file per evaluation; merged files move under `batch/tracker-additions/merged/`; shape enforced by `templates/contracts.json`)
179
181
  - Ledger: `.jobforge-ledger/events.jsonl` (created by `job-forge ledger:rebuild`, `tracker-line --write`, or `merge`; gitignored personal state)
180
182
  - Index: `.jobforge-index.json` (created on demand by `job-forge index:*`; gitignored local lookup state)
183
+ - Canon: `templates/canon.json` (identity rules inspected with `job-forge canon:*`)
184
+ - Migrations: `templates/migrations.json` (applied by `job-forge sync` and inspectable with `job-forge migrate:*`)
181
185
  - Capabilities: `templates/capabilities.json` (role boundary policy inspected with `job-forge capabilities:*`)
182
186
  - Context: `templates/context.json` (mode/reference file bundles inspected with `job-forge context:*`)
183
187
 
@@ -223,9 +227,11 @@ Scripts maintain data consistency. In a consumer project they're invoked via the
223
227
  | `scripts/guard.mjs` | `npx job-forge guard:audit` / `guard:explain` | Deterministic `@razroo/iso-guard` policy audits over local OpenCode traces |
224
228
  | `scripts/ledger.mjs` | `npx job-forge ledger:status` / `ledger:has` / `ledger:rebuild` | Deterministic `@razroo/iso-ledger` state over tracker, TSV, and pipeline files |
225
229
  | `scripts/index.mjs` | `npx job-forge index:status` / `index:has` / `index:query` | Deterministic `@razroo/iso-index` lookup over reports, tracker rows, TSVs, pipeline, scan history, and ledger events |
230
+ | `scripts/canon.mjs` | `npx job-forge canon:normalize` / `canon:key` / `canon:compare` | Deterministic `@razroo/iso-canon` identity normalization for URLs, companies, roles, and company+role pairs |
226
231
  | `scripts/context.mjs` | `npx job-forge context:list` / `context:plan` / `context:check` / `context:render` | Deterministic `@razroo/iso-context` mode/reference context bundle planning and rendering |
232
+ | `scripts/migrate.mjs` | `npx job-forge migrate:plan` / `migrate:apply` / `migrate:check` | Deterministic `@razroo/iso-migrate` consumer-project upgrades for scripts and generated-artifact ignores |
227
233
  | `tracker-lib.mjs` | _(library)_ | Shared helpers for reading/writing day-based tracker files — imported by merge/dedup/verify/normalize |
228
- | `bin/sync.mjs` | `npx job-forge sync` | Creates the harness symlinks in a consumer project (also runs as `postinstall`) |
234
+ | `bin/sync.mjs` | `npx job-forge sync` | Creates the harness symlinks in a consumer project and applies safe migrations (also runs as `postinstall`) |
229
235
  | `bin/create-job-forge.mjs` | `npx create-job-forge <dir>` | Scaffolds a new personal project |
230
236
 
231
237
  All scripts resolve the consumer project dir via `process.env.JOB_FORGE_PROJECT || process.cwd()`, so running the CLI from anywhere in the consumer project Just Works.
@@ -152,7 +152,15 @@ Mode/reference context bundles live in `templates/context.json` and are planned
152
152
 
153
153
  ## JobForge artifact index
154
154
 
155
- Artifact lookup policy lives in `templates/index.json` and is built locally by `@razroo/iso-index`. Use `job-forge index:has --key "company-role:acme:staff-engineer"` as a cheap duplicate/source prefilter, `job-forge index:query "acme"` to get compact source path/line pointers, and `job-forge index:verify` to validate `.jobforge-index.json`. Query, has, and verify rebuild the index on demand, so scaffolded projects need no setup. This is not an MCP and does not add tool-schema tokens.
155
+ Artifact lookup policy lives in `templates/index.json` and is built locally by `@razroo/iso-index`. Use `job-forge index:has --key "company-role:acme:staff-engineer"` as a cheap duplicate/source prefilter, `job-forge index:query "acme"` to get compact source path/line pointers, and `job-forge index:verify` to validate `.jobforge-index.json`. Query, has, and verify rebuild the index on demand, so scaffolded projects need no setup. JobForge canonicalizes company/role and URL records through `templates/canon.json` before writing the index. This is not an MCP and does not add tool-schema tokens.
156
+
157
+ ## JobForge identity canonicalization
158
+
159
+ URL, company, role, and company+role identity rules live in `templates/canon.json` and are enforced locally by `@razroo/iso-canon`. Use `job-forge canon:key company-role --company "OpenAI, Inc." --role "Senior SWE, AI Platform"` to derive the same duplicate key used by ledger/index helpers, and `job-forge canon:compare company "OpenAI, Inc." "Open AI"` to explain whether two values resolve to the same entity. Custom forks can extend aliases, suffixes, stop words, and match thresholds in `templates/canon.json`. This is not an MCP and does not add prompt or tool-schema tokens.
160
+
161
+ ## JobForge consumer migrations
162
+
163
+ Consumer-project migrations live in `templates/migrations.json` and are applied locally by `@razroo/iso-migrate`. `job-forge sync` applies safe migrations automatically after refreshing symlinks; use `JOB_FORGE_SKIP_MIGRATIONS=1` to opt out. Use `job-forge migrate:plan`, `job-forge migrate:apply`, and `job-forge migrate:check` to inspect or enforce script/gitignore drift explicitly. This is not an MCP and does not add prompt or tool-schema tokens.
156
164
 
157
165
  ## JobForge guard audits
158
166
 
package/docs/README.md CHANGED
@@ -31,7 +31,7 @@ The harness exposes a single CLI (`job-forge`) installed as a `bin` entry. In a
31
31
 
32
32
  | What you need | Where to read |
33
33
  |---------------|---------------|
34
- | Full command list (`verify`, `merge`, `dedup`, `normalize`, `pdf`, `sync-check`, `tokens`, `trace`, `telemetry`, `guard`, `ledger`, `context`, `sync`). | [SETUP.md — Tracker and scripts (terminal)](SETUP.md#tracker-and-scripts-terminal). |
34
+ | Full command list (`verify`, `merge`, `dedup`, `normalize`, `pdf`, `sync-check`, `tokens`, `trace`, `telemetry`, `guard`, `ledger`, `canon`, `context`, `sync`). | [SETUP.md — Tracker and scripts (terminal)](SETUP.md#tracker-and-scripts-terminal). |
35
35
  | What each harness `.mjs` script does. | [ARCHITECTURE.md — Pipeline integrity](ARCHITECTURE.md#pipeline-integrity) and the scripts table underneath. |
36
36
  | Batch runner, TSV layout, and `batch/tracker-additions/` merge flow. | [batch/README.md](../batch/README.md). |
37
37
  | PR gate for harness contributions (`npm run verify` + `npm run build:dashboard`). | [CONTRIBUTING.md — Development](../CONTRIBUTING.md#development). |
package/docs/SETUP.md CHANGED
@@ -126,10 +126,13 @@ From your project root, these commands maintain the tracker and pipeline checks.
126
126
  | Pipeline health check | `npx job-forge verify` | `npm run verify` |
127
127
  | Merge `batch/tracker-additions/*.tsv` into the tracker | `npx job-forge merge` | `npm run merge` |
128
128
  | Inspect tracker row contract | `npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json` | _(none)_ |
129
+ | Derive canonical company/role key | `npx job-forge canon:key company-role --company "Acme" --role "Staff Engineer"` | `npm run canon:key -- company-role --company ...` |
130
+ | Compare identity values | `npx job-forge canon:compare company "OpenAI, Inc." "Open AI"` | `npm run canon:compare -- company ...` |
129
131
  | Inspect role capabilities | `npx job-forge capabilities:explain general-free` | `npm run capabilities:explain -- general-free` |
130
132
  | Inspect context bundle budget | `npx job-forge context:plan apply` | `npm run context:plan -- apply` |
131
133
  | Inspect local JD/artifact cache | `npx job-forge cache:status` | `npm run cache:status` |
132
134
  | Inspect local artifact index | `npx job-forge index:status` | `npm run index:status` |
135
+ | Inspect pending consumer migrations | `npx job-forge migrate:plan` | `npm run migrate:plan` |
133
136
  | Map status column to canonical labels | `npx job-forge normalize` | `npm run normalize` |
134
137
  | Merge duplicate company/role rows | `npx job-forge dedup` | `npm run dedup` |
135
138
  | Generate ATS-optimized CV PDF | `npx job-forge pdf` | `npm run pdf` |
@@ -148,6 +151,7 @@ From your project root, these commands maintain the tracker and pipeline checks.
148
151
  | Check duplicate/status event without loading tracker files | `npx job-forge ledger:has --company "Acme" --role "Staff Engineer" --status Applied` | `npm run ledger:has -- --company ...` |
149
152
  | Check/reuse cached JD content | `npx job-forge cache:has --url <url>` / `npx job-forge cache:get --url <url>` | `npm run cache:has -- --url ...` |
150
153
  | Query local artifact pointers | `npx job-forge index:query "Acme"` / `npx job-forge index:has --key company-role:acme:staff-engineer` | `npm run index:query -- Acme` |
154
+ | Apply safe consumer migrations | `npx job-forge migrate:apply` | `npm run migrate:apply` |
151
155
  | Re-create harness symlinks | `npx job-forge sync` | `npm run sync` |
152
156
  | Build optional dashboard TUI (Go on `PATH`) | `(cd node_modules/job-forge/dashboard && go build .)` | `npm run build:dashboard` (harness repo only) |
153
157
 
@@ -81,6 +81,15 @@ Local artifact index (terminal, outside opencode):
81
81
  npx job-forge index:has --key "company-role:acme:staff-engineer"
82
82
  npx job-forge index:query "acme"
83
83
 
84
+ Identity keys (terminal, outside opencode):
85
+ npx job-forge canon:key company-role --company "Acme" --role "Staff Engineer"
86
+ npx job-forge canon:compare company "OpenAI, Inc." "Open AI"
87
+
88
+ Consumer migrations (terminal, outside opencode):
89
+ npx job-forge migrate:plan # preview package.json/.gitignore drift
90
+ npx job-forge migrate:apply # apply safe harness upgrade migrations
91
+ npx job-forge migrate:check # fail if migrations are pending
92
+
84
93
  Artifact contracts (terminal, outside opencode):
85
94
  npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json
86
95
  npx job-forge tracker-line ... --write # renders + validates tracker TSV locally
@@ -166,11 +175,13 @@ Step 1 — Enumerate candidates
166
175
  - Build ordered list: candidates = [job_1, job_2, ..., job_N]
167
176
 
168
177
  Step 2 — Dedup against already-applied
169
- - Run npx job-forge index:has --key "company-role:<company-slug>:<role-slug>"
170
- as a fast local artifact prefilter when company+role is known. It rebuilds
171
- .jobforge-index.json on demand from templates/index.json. A hit means the
172
- role has already appeared in tracker files or tracker TSVs and can be
173
- dropped before dispatch.
178
+ - Derive the stable key with npx job-forge canon:key company-role --company
179
+ "<company>" --role "<role>" when company+role is known.
180
+ - Run npx job-forge index:has --key "<canon-key>" as a fast local artifact
181
+ prefilter. It rebuilds .jobforge-index.json on demand from
182
+ templates/index.json and canonicalizes indexed company/role records through
183
+ templates/canon.json. A hit means the role has already appeared in tracker
184
+ files or tracker TSVs and can be dropped before dispatch.
174
185
  - If .jobforge-ledger/events.jsonl exists, use npx job-forge ledger:has as a
175
186
  fast prefilter for obvious company+role Applied duplicates. A ledger match
176
187
  can be dropped before dispatch without loading tracker files into context.
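The revised Step 2 chains up to three CLI calls: derive the key, probe the index with it, and optionally probe the ledger. A minimal sketch of that ordering follows, built only from the flags quoted in this diff. `dedupPrefilterPlan` is a hypothetical helper, and `<canon-key>` stays a placeholder because the output format of `canon:key` is not shown here.

```javascript
// Hypothetical helper: lists the Step 2 prefilter commands in order as
// argv arrays, so values like "Acme, Inc." need no shell quoting.
function dedupPrefilterPlan({ company, role, ledgerExists }) {
  const plan = [
    // 1. Derive the stable key instead of hand-writing a slug.
    ['npx', 'job-forge', 'canon:key', 'company-role', '--company', company, '--role', role],
    // 2. Probe the index; '<canon-key>' stands for the value printed by step 1.
    ['npx', 'job-forge', 'index:has', '--key', '<canon-key>'],
  ];
  // 3. The ledger probe applies only when .jobforge-ledger/events.jsonl exists.
  if (ledgerExists) {
    plan.push(['npx', 'job-forge', 'ledger:has', '--company', company, '--role', role, '--status', 'Applied']);
  }
  return plan;
}

for (const argv of dedupPrefilterPlan({ company: 'Acme, Inc.', role: 'Senior SWE', ledgerExists: true })) {
  console.log(argv.join(' '));
}
```

Each array could be handed to `child_process.spawnSync(argv[0], argv.slice(1))`. A candidate that passes all of these probes still needs the four-source grep required by [H2].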
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
7
7
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
8
8
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
9
9
 
10
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
10
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. Use `npx job-forge canon:key company-role --company "..." --role "..."` when deriving a stable duplicate key; do not invent slugs in prose. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
11
11
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
12
12
 
13
13
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -72,12 +72,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
72
72
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
73
73
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
74
74
 
75
+ - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
76
+ why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
77
+
78
+ - [D15] Treat `templates/canon.json` as the source of truth for URL/company/role identity keys. Use `npx job-forge canon:key ...` or `npx job-forge canon:compare ...` before broad duplicate checks when a stable key or same/possible/different decision is useful.
79
+ why: `iso-canon` is not an MCP and adds no prompt/tool-schema tokens; it centralizes duplicate-key rules so agents do not repeatedly derive inconsistent slugs for aliases, suffixes, remote/location noise, or tracking URLs
80
+
75
81
  ## Procedure
76
82
 
77
83
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
78
84
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
79
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
80
- 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
85
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Decide inline vs delegated work [D1].
86
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
81
87
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
82
88
  6. Keep multi-job form-filling out of the orchestrator [H4].
83
89
  7. Cross-check subagent facts against authoritative files [H7].
@@ -10,6 +10,7 @@ import {
    resolveCacheDir,
    verifyCache,
  } from '@razroo/iso-cache';
+ import { canonicalizeJobForgeUrl } from './jobforge-canon.mjs';

  export const CACHE_DIR = '.jobforge-cache';
  export const JD_CACHE_NAMESPACE = 'jobforge.jd';
@@ -96,10 +97,14 @@ export function normalizeJobUrl(url) {
    const text = String(url || '').trim();
    if (!text) throw new Error('url is required');
    try {
-     const parsed = new URL(text);
-     parsed.hash = '';
-     return parsed.toString();
+     return canonicalizeJobForgeUrl(text).canonical;
    } catch {
-     return text;
+     try {
+       const parsed = new URL(text);
+       parsed.hash = '';
+       return parsed.toString();
+     } catch {
+       return text;
+     }
    }
  }
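After this change `normalizeJobUrl` has three tiers: the canon helper, then the previous hash-stripping behavior, then the raw trimmed text. The sketch below reproduces that control flow with an injectable canonicalizer so each tier can be exercised. `canonicalizeStub` is a hypothetical stand-in: the real `canonicalizeJobForgeUrl` follows rules in `templates/canon.json` that this diff does not show, and dropping `utm_*` parameters is only an assumed example of "tracking URL" cleanup.

```javascript
// Hypothetical stand-in for canonicalizeJobForgeUrl; the real rules are
// not visible in this diff. Drops the fragment and utm_* query params.
function canonicalizeStub(text) {
  const parsed = new URL(text); // throws TypeError on unparseable input
  parsed.hash = '';
  for (const key of [...parsed.searchParams.keys()]) {
    if (key.startsWith('utm_')) parsed.searchParams.delete(key);
  }
  return { canonical: parsed.toString() };
}

// Same control flow as the hunk above, with the canonicalizer injected.
function normalizeJobUrl(url, canonicalize = canonicalizeStub) {
  const text = String(url || '').trim();
  if (!text) throw new Error('url is required');
  try {
    return canonicalize(text).canonical; // tier 1: canon policy
  } catch {
    try {
      const parsed = new URL(text); // tier 2: legacy hash stripping
      parsed.hash = '';
      return parsed.toString();
    } catch {
      return text; // tier 3: unparseable input passes through trimmed
    }
  }
}

console.log(normalizeJobUrl('https://example.test/jobs/123?utm_source=x#apply'));
// → https://example.test/jobs/123
console.log(normalizeJobUrl('  not a url  '));
// → not a url
```

Keeping the old behavior as the middle tier means a failure inside the canon helper degrades to the 2.14.25 result instead of returning the URL with its fragment intact.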