job-forge 2.14.30 → 2.14.31

@@ -24,13 +24,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
 
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, `.jobforge-facts.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
 
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
 
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, source path/line pointers returned by `npx job-forge index:query ...`, and materialized fact records returned by `npx job-forge facts:query ...`.
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
 
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
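As a rough illustration of [H7], the sketch below keeps only those URLs reported in subagent prose that also appear in a file-backed source. The helper name `keepFileBackedUrls` and the assumption that the URL sits in the first tab-separated column are hypothetical and not taken from job-forge; the real `data/scan-history.tsv` layout may differ.

```javascript
// Hypothetical helper, not part of job-forge.
// Assumption: each TSV row carries the job URL in its first column.
function keepFileBackedUrls(claimedUrls, tsvText) {
  // Collect every non-empty first-column value from the file content.
  const known = new Set(
    tsvText
      .split('\n')
      .map((row) => row.split('\t')[0].trim())
      .filter((url) => url.length > 0)
  );
  const trusted = [];
  const rejected = [];
  for (const url of claimedUrls) {
    // A claimed URL is trusted only if the file also contains it.
    (known.has(url) ? trusted : rejected).push(url);
  }
  return { trusted, rejected };
}

// Example file content (made-up rows for illustration only).
const tsv = [
  'https://example.com/jobs/1\t2026-04-18\tscanned',
  'https://example.com/jobs/2\t2026-04-18\tscanned',
].join('\n');

const result = keepFileBackedUrls(
  ['https://example.com/jobs/1', 'https://example.com/jobs/999'],
  tsv
);
console.log(result.trusted);
console.log(result.rejected);
```

Only the entries in `trusted` would be passed to downstream subagents under this sketch; anything in `rejected` never round-tripped through a file.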
@@ -77,6 +77,9 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D13b] Use `job-forge facts:*` for deterministic source-backed fact materialization when available. `facts:has` and `facts:query` rebuild `.jobforge-facts.json` from `templates/facts.json` on demand, covering job URLs, scores, application statuses, tracker TSVs, preflight candidates, scan history, and ledger events with path/line provenance.
+ why: `iso-facts` is not an MCP and adds no prompt/tool-schema tokens; it turns authoritative files into compact queryable fact records so agents do not repeatedly reread broad artifact trees
+
  - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
  why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
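To make the "path/line provenance" idea in [D13b] concrete, here is a minimal sketch that turns the `**URL:**` / `**Score:**` report headers into fact records. The function name `extractReportFacts` and the record shape (`key`, `value`, `source`) are assumptions for illustration only; they are not the actual `.jobforge-facts.json` schema or the `iso-facts` API.

```javascript
// Hypothetical sketch: the record shape is an assumption,
// not the real `.jobforge-facts.json` schema.
function extractReportFacts(path, text) {
  const facts = [];
  // Matches lines such as "**URL:** https://..." or "**Score:** 4.2".
  const pattern = /^\*\*(URL|Score):\*\*\s*(.+)$/;
  text.split('\n').forEach((line, index) => {
    const match = pattern.exec(line.trim());
    if (match) {
      facts.push({
        key: match[1].toLowerCase(),
        value: match[2].trim(),
        source: { path, line: index + 1 }, // 1-based line number
      });
    }
  });
  return facts;
}

// Made-up report content for illustration only.
const report = [
  '# Example report',
  '**URL:** https://example.com/jobs/1',
  '**Score:** 4.2',
].join('\n');

const facts = extractReportFacts('reports/001-example.md', report);
console.log(JSON.stringify(facts, null, 2));
```

Each record carries the file path and line it came from, so a consumer can re-open the source instead of trusting the value on its own.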
 
@@ -96,8 +99,8 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use materialized facts when a fact query can answer the question [D13b]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/facts/ledger prefilter when useful [D8, D13, D13b, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b], then settle the round with postflight status [D17].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
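The round limits in step 5 and [H5] can be sketched as a small planner: at most two tasks per round, and never two tasks for the same company in one round. The function `planRounds` and the `company` / `url` fields are hypothetical names chosen for this example, and this is not how `job-forge preflight:*` is implemented.

```javascript
// Hypothetical sketch of round planning; `company` and `url` are assumed fields.
function planRounds(candidates, maxPerRound = 2) {
  const rounds = [];
  for (const candidate of candidates) {
    // Find the first round with room and no task for the same company.
    let round = rounds.find(
      (r) =>
        r.length < maxPerRound &&
        !r.some((c) => c.company === candidate.company)
    );
    if (!round) {
      round = [];
      rounds.push(round);
    }
    round.push(candidate);
  }
  return rounds;
}

// Made-up candidates for illustration only.
const rounds = planRounds([
  { company: 'acme', url: 'https://example.com/acme/1' },
  { company: 'acme', url: 'https://example.com/acme/2' },
  { company: 'globex', url: 'https://example.com/globex/1' },
]);
console.log(rounds.map((r) => r.map((c) => c.url)));
```

The sketch assumes rounds run strictly one after another and that each round is settled before the next starts, so a second task for the same company only begins after the first has returned.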
package/AGENTS.md CHANGED
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
 
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, `.jobforge-facts.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
 
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
 
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, source path/line pointers returned by `npx job-forge index:query ...`, and materialized fact records returned by `npx job-forge facts:query ...`.
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
 
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -72,6 +72,9 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D13b] Use `job-forge facts:*` for deterministic source-backed fact materialization when available. `facts:has` and `facts:query` rebuild `.jobforge-facts.json` from `templates/facts.json` on demand, covering job URLs, scores, application statuses, tracker TSVs, preflight candidates, scan history, and ledger events with path/line provenance.
+ why: `iso-facts` is not an MCP and adds no prompt/tool-schema tokens; it turns authoritative files into compact queryable fact records so agents do not repeatedly reread broad artifact trees
+
  - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
  why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
 
@@ -91,8 +94,8 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use materialized facts when a fact query can answer the question [D13b]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/facts/ledger prefilter when useful [D8, D13, D13b, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b], then settle the round with postflight status [D17].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
package/CLAUDE.md CHANGED
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
 
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, `.jobforge-facts.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
 
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
 
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, source path/line pointers returned by `npx job-forge index:query ...`, and materialized fact records returned by `npx job-forge facts:query ...`.
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
 
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -72,6 +72,9 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D13b] Use `job-forge facts:*` for deterministic source-backed fact materialization when available. `facts:has` and `facts:query` rebuild `.jobforge-facts.json` from `templates/facts.json` on demand, covering job URLs, scores, application statuses, tracker TSVs, preflight candidates, scan history, and ledger events with path/line provenance.
+ why: `iso-facts` is not an MCP and adds no prompt/tool-schema tokens; it turns authoritative files into compact queryable fact records so agents do not repeatedly reread broad artifact trees
+
  - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
  why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
 
@@ -91,8 +94,8 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use materialized facts when a fact query can answer the question [D13b]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/facts/ledger prefilter when useful [D8, D13, D13b, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b], then settle the round with postflight status [D17].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
package/README.md CHANGED
@@ -31,7 +31,7 @@ The scaffolded `opencode.json` already has three MCPs wired up — they launch a
  - **Gmail** — reads replies from recruiters
  - **state-trace** — typed working memory for cross-session context (resumed batches, recent decisions, repeated portal quirks). Install once with `python3 -m pip install "state-trace[mcp]"`; the MCP command is `state-trace-mcp`.
 
- JobForge also keeps MCP-free local workflow state: `templates/canon.json` defines URL/company/role identity keys via `@razroo/iso-canon`, `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, `templates/preflight.json` defines safe dispatch rounds/gates via `@razroo/iso-preflight`, `templates/postflight.json` defines safe dispatch settlement via `@razroo/iso-postflight`, `templates/redact.json` defines safe-export redaction rules via `@razroo/iso-redact`, `templates/migrations.json` defines safe consumer-project upgrades via `@razroo/iso-migrate`, `.jobforge-ledger/events.jsonl` records duplicate/status events via `@razroo/iso-ledger`, `.jobforge-cache/` stores reusable JD/artifact content via `@razroo/iso-cache`, and `.jobforge-index.json` indexes artifact source pointers via `@razroo/iso-index`. None of these add always-on prompt or tool-schema tokens.
+ JobForge also keeps MCP-free local workflow state: `templates/canon.json` defines URL/company/role identity keys via `@razroo/iso-canon`, `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, `templates/preflight.json` defines safe dispatch rounds/gates via `@razroo/iso-preflight`, `templates/postflight.json` defines safe dispatch settlement via `@razroo/iso-postflight`, `templates/redact.json` defines safe-export redaction rules via `@razroo/iso-redact`, `templates/migrations.json` defines safe consumer-project upgrades via `@razroo/iso-migrate`, `templates/facts.json` defines source-backed fact extraction via `@razroo/iso-facts`, `.jobforge-ledger/events.jsonl` records duplicate/status events via `@razroo/iso-ledger`, `.jobforge-cache/` stores reusable JD/artifact content via `@razroo/iso-cache`, `.jobforge-index.json` indexes artifact source pointers via `@razroo/iso-index`, and `.jobforge-facts.json` materializes queryable facts with provenance. None of these add always-on prompt or tool-schema tokens.
 
  `npm install` also materializes symlinks for every supported agent harness — OpenCode, Cursor, Claude Code, and Codex — so you can run `opencode`, `cursor`, `claude`, or `codex` in the same project and each picks up the shared MCP config and instructions.
 
@@ -78,7 +78,7 @@ JobForge turns opencode into a full job search command center. Instead of manual
  | **Durable Batch Orchestration** | `batch-runner.sh` uses `@razroo/iso-orchestrator` for resumable bundle execution, bounded fan-out, mutexed state writes, and workflow records in `.jobforge-runs/`. |
  | **Pipeline Integrity** | Automated merge, dedup, status normalization, health checks |
  | **Cost-Aware Agent Routing** | Three subagents (`@general-free`, `@general-paid`, `@glm-minimal`) with per-task tool surfaces. On OpenCode, JobForge pins all tiers to `opencode-go/deepseek-v4-flash` so application runs avoid overloaded free-model pools. See [Subagent Routing in AGENTS.md](AGENTS.md) for the task-to-agent mapping. |
- | **Trace + Telemetry + Guard + Contract + Canon + Ledger + Capabilities + Context + Cache + Index + Preflight + Postflight + Redact + Migrate** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge canon:*` derives stable URL/company/role identity keys, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, `job-forge context:*` plans mode/reference context bundles, `job-forge cache:*` reuses fetched JD/artifact content, `job-forge index:*` queries compact source pointers, `job-forge preflight:*` plans bounded apply dispatch rounds from file-backed candidate facts, `job-forge postflight:*` settles dispatch outcomes/artifacts/post-steps, `job-forge redact:*` sanitizes local exports, and `job-forge migrate:*` applies safe consumer-project upgrades without MCP/tool-schema overhead. |
+ | **Trace + Telemetry + Guard + Contract + Canon + Ledger + Capabilities + Context + Cache + Index + Facts + Preflight + Postflight + Redact + Migrate** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge canon:*` derives stable URL/company/role identity keys, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, `job-forge context:*` plans mode/reference context bundles, `job-forge cache:*` reuses fetched JD/artifact content, `job-forge index:*` queries compact source pointers, `job-forge facts:*` materializes source-backed job/application/candidate facts, `job-forge preflight:*` plans bounded apply dispatch rounds from file-backed candidate facts, `job-forge postflight:*` settles dispatch outcomes/artifacts/post-steps, `job-forge redact:*` sanitizes local exports, and `job-forge migrate:*` applies safe consumer-project upgrades without MCP/tool-schema overhead. |
  | **Token Cost Visibility** | `job-forge tokens --days 1` for per-session breakdown; `job-forge session-report --since-minutes 60 --log` to flag sessions over budget and append history to `data/token-usage.tsv`. Auto-logged after every batch run. |
 
  ## Usage
@@ -148,6 +148,7 @@ my-search/
  ├── .jobforge-ledger/ # append-only local workflow events (personal, gitignored)
  ├── .jobforge-cache/ # content-addressed local JD/artifact cache (personal, gitignored)
  ├── .jobforge-index.json # deterministic artifact lookup index (generated, gitignored)
+ ├── .jobforge-facts.json # deterministic fact set with provenance (generated, gitignored)
  ├── .jobforge-redacted/ # sanitized local exports (generated, gitignored)
  ├── reports/ # generated evaluation reports (personal, gitignored)
  ├── batch/{batch-input,batch-state}.tsv, tracker-additions/, logs/ # personal
@@ -165,7 +166,7 @@ my-search/
  ├── .opencode/skills/job-forge.md # → skill router
  ├── .opencode/agents/ # → @general-free, @general-paid, @glm-minimal
  ├── modes/ # → _shared.md + skill modes
- ├── templates/ # → states.yml, portals.example.yml, cv-template.html, canon.json, capabilities.json, context.json, index.json, preflight.json, postflight.json, redact.json, migrations.json
+ ├── templates/ # → states.yml, portals.example.yml, cv-template.html, canon.json, capabilities.json, context.json, index.json, facts.json, preflight.json, postflight.json, redact.json, migrations.json
  ├── batch/batch-prompt.md # → batch worker prompt
  ├── batch/batch-runner.sh # → parallel orchestrator
 
@@ -191,7 +192,7 @@ JobForge/
  │ ├── sync.mjs # postinstall: creates symlinks in consumer project
  │ └── create-job-forge.mjs # scaffolder
  ├── modes/ # _shared.md + 16 skill modes
- ├── templates/ # cv-template.html, portals.example.yml, states.yml, canon.json, capabilities.json, context.json, preflight.json, postflight.json, redact.json, migrations.json
+ ├── templates/ # cv-template.html, portals.example.yml, states.yml, canon.json, capabilities.json, context.json, facts.json, preflight.json, postflight.json, redact.json, migrations.json
  ├── config/profile.example.yml # template for consumer's profile.yml
  ├── batch/{batch-prompt.md,batch-runner.sh} # batch orchestrator
  ├── scripts/
@@ -202,6 +203,7 @@ JobForge/
  │ ├── context.mjs # iso-context-backed context bundle CLI
  │ ├── cache.mjs # iso-cache-backed local artifact cache CLI
  │ ├── index.mjs # iso-index-backed artifact lookup CLI
+ │ ├── facts.mjs # iso-facts-backed local fact materialization
  │ ├── canon.mjs # iso-canon-backed identity normalization CLI
  │ ├── preflight.mjs # iso-preflight-backed dispatch planning CLI
  │ ├── postflight.mjs # iso-postflight-backed dispatch settlement CLI
@@ -147,6 +147,13 @@ const consumerPkg = {
  'index:has': 'job-forge index:has',
  'index:query': 'job-forge index:query',
  'index:explain': 'job-forge index:explain',
+ 'facts:build': 'job-forge facts:build',
+ 'facts:status': 'job-forge facts:status',
+ 'facts:verify': 'job-forge facts:verify',
+ 'facts:check': 'job-forge facts:check',
+ 'facts:has': 'job-forge facts:has',
+ 'facts:query': 'job-forge facts:query',
+ 'facts:explain': 'job-forge facts:explain',
  'canon:normalize': 'job-forge canon:normalize',
  'canon:key': 'job-forge canon:key',
  'canon:compare': 'job-forge canon:compare',
@@ -262,6 +269,7 @@ Before doing any work, remember where things live in *this* project:
  | Scanner dedup history | \`data/scan-history.tsv\` | Only touch in \`/job-forge scan\` |
  | Local workflow ledger | \`.jobforge-ledger/events.jsonl\` | Deterministic append-only state; use \`job-forge ledger:*\` |
  | Local artifact index | \`.jobforge-index.json\` | Deterministic file/line lookup; use \`job-forge index:*\` |
+ | Local fact set | \`.jobforge-facts.json\` | Deterministic source-backed facts; use \`job-forge facts:*\` |
  | Identity canonicalization | \`templates/canon.json\` | Stable URL/company/role keys; use \`job-forge canon:*\` |
  | Dispatch preflight policy | \`templates/preflight.json\` | Safe apply rounds/gates; use \`job-forge preflight:*\` |
  | Dispatch postflight policy | \`templates/postflight.json\` | Safe apply settlement; use \`job-forge postflight:*\` |
@@ -356,6 +364,7 @@ data/token-usage.tsv
  .jobforge-ledger/
  .jobforge-cache/
  .jobforge-index.json
+ .jobforge-facts.json
  .jobforge-runs/
  reports/
  !reports/.gitkeep
@@ -418,6 +427,7 @@ job-forge merge # merge batch/tracker-additions/*.tsv into the tracke
  job-forge verify # verify pipeline integrity
  job-forge ledger:status # local deterministic workflow ledger status
  job-forge index:status # local artifact index status
+ job-forge facts:status # local source-backed fact status
  job-forge canon:key company-role --company "Acme, Inc." --role "Senior SWE"
  job-forge preflight:plan --candidates batch/preflight-candidates.json
  job-forge postflight:status --plan batch/preflight-plan.json --outcomes batch/postflight-outcomes.json
package/bin/job-forge.mjs CHANGED
@@ -25,6 +25,7 @@
  * context:* Query/render deterministic context bundles via iso-context
  * cache:* Reuse local deterministic artifacts via iso-cache
  * index:* Query local artifacts via iso-index
+ * facts:* Materialize source-backed local facts via iso-facts
  * canon:* Compute deterministic identity keys via iso-canon
  * preflight:* Plan safe dispatch rounds via iso-preflight
  * postflight:* Settle dispatch outcomes via iso-postflight
@@ -132,6 +133,17 @@ const indexAliases = {
    'index:path': 'path',
  };

+ const factsAliases = {
+   'facts:build': 'build',
+   'facts:status': 'status',
+   'facts:query': 'query',
+   'facts:has': 'has',
+   'facts:verify': 'verify',
+   'facts:check': 'check',
+   'facts:explain': 'explain',
+   'facts:path': 'path',
+ };
+
  const canonAliases = {
    'canon:normalize': 'normalize',
    'canon:key': 'key',
@@ -220,6 +232,12 @@ Commands:
  index:has Check indexed URL/company-role/report facts without loading source files
  index:query Query indexed reports, tracker rows, TSVs, scan history, pipeline, and ledger
  index:verify Validate local artifact index integrity
+ facts:status Show local materialized fact set status
+ facts:build Rebuild .jobforge-facts.json from templates/facts.json
+ facts:has Check source-backed job/application/candidate facts
+ facts:query Query materialized facts with source path/line provenance
+ facts:verify Validate local fact set integrity
+ facts:check Check configured fact requirements
  canon:key Print stable URL/company/role/company-role keys
  canon:compare Compare two identifiers as same/possible/different
  canon:explain Show the active identity canonicalization policy
@@ -275,6 +293,8 @@ Pass --help after a command to see its own flags, e.g.:
  job-forge cache:put --url https://example.test/jobs/123 --input @jds/example.md
  job-forge index:has --key "company-role:acme:staff-engineer"
  job-forge index:query "acme"
+ job-forge facts:has --fact application.status --key "company-role:acme:staff-engineer"
+ job-forge facts:query --fact job.url --tag report
  job-forge canon:key company-role --company "Acme, Inc." --role "Senior SWE - Remote US"
  job-forge canon:compare company "OpenAI, Inc." "Open AI"
  job-forge preflight:plan --candidates batch/preflight-candidates.json
@@ -414,6 +434,21 @@ if (cmd === 'index' || indexAliases[cmd]) {
    process.exit(result.status ?? 1);
  }

+ if (cmd === 'facts' || factsAliases[cmd]) {
+   const factsArgs = cmd === 'facts'
+     ? (rest.length === 0 ? ['help'] : rest)
+     : [factsAliases[cmd], ...rest];
+
+   const scriptPath = join(PKG_ROOT, 'scripts/facts.mjs');
+   const result = spawnSync(process.execPath, [scriptPath, ...factsArgs], {
+     stdio: 'inherit',
+     cwd: PROJECT_DIR,
+     env: process.env,
+   });
+
+   process.exit(result.status ?? 1);
+ }
+
  if (cmd === 'canon' || canonAliases[cmd]) {
    const canonArgs = cmd === 'canon'
      ? (rest.length === 0 ? ['help'] : rest)
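The `facts` dispatch block added in this hunk can be read as a small pure function. The sketch below is an illustration only, not code from the package: `resolveFactsArgs` is a hypothetical name, and the alias table is copied from the `factsAliases` hunk earlier in this diff. It shows how a bare `facts` command falls back to `help`, while an aliased command such as `facts:has` is rewritten to its subcommand before the arguments would be handed to `scripts/facts.mjs`. One deliberate difference, stated plainly: the sketch uses an own-property check where the diff uses a truthiness check on `factsAliases[cmd]`.

```javascript
// Alias table copied from the factsAliases hunk in this diff.
const factsAliases = {
  'facts:build': 'build',
  'facts:status': 'status',
  'facts:query': 'query',
  'facts:has': 'has',
  'facts:verify': 'verify',
  'facts:check': 'check',
  'facts:explain': 'explain',
  'facts:path': 'path',
};

// Hypothetical helper (not in the package): returns the argv that would be
// passed to scripts/facts.mjs, or null when cmd is not a facts command.
function resolveFactsArgs(cmd, rest = []) {
  if (cmd === 'facts') {
    // Bare "facts" with no subcommand shows help, as in the diff.
    return rest.length === 0 ? ['help'] : rest;
  }
  // Own-property check (an assumption of this sketch) so inherited names
  // such as "toString" are never treated as aliases.
  if (Object.prototype.hasOwnProperty.call(factsAliases, cmd)) {
    return [factsAliases[cmd], ...rest];
  }
  return null;
}

console.log(resolveFactsArgs('facts'));
console.log(resolveFactsArgs('facts:has', ['--fact', 'application.status']));
console.log(resolveFactsArgs('verify'));
```

Keeping the mapping in a function like this makes the alias behavior testable without spawning a child process.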
@@ -162,10 +162,12 @@ portals.yml → Scanner configuration
  data/pipeline.md → Pending URLs and `local:jds/...` inbox (see modes/pipeline.md)
  .jobforge-ledger/events.jsonl → Append-only workflow events for cheap local duplicate/status checks
  .jobforge-index.json → Deterministic artifact lookup index built from templates/index.json
+ .jobforge-facts.json → Deterministic fact set built from templates/facts.json
  jds/*.md → Saved job descriptions referenced from the pipeline (`local:jds/{file}`)
  templates/states.yml → Canonical status values
  templates/canon.json → Canonical URL/company/role identity keys
  templates/context.json → Deterministic mode/reference context bundle policy
+ templates/facts.json → Source-backed fact extraction policy
  templates/preflight.json → Safe apply dispatch rounds/gates policy
  templates/postflight.json → Safe apply dispatch settlement policy
  templates/migrations.json → Safe consumer-project upgrade policy
@@ -182,6 +184,7 @@ Create `data/pipeline.md` when you start using the URL inbox (`/job-forge pipeli
  - Tracker TSVs: `batch/tracker-additions/{num}-{company-slug}.tsv` (one file per evaluation; merged files move under `batch/tracker-additions/merged/`; shape enforced by `templates/contracts.json`)
  - Ledger: `.jobforge-ledger/events.jsonl` (created by `job-forge ledger:rebuild`, `tracker-line --write`, or `merge`; gitignored personal state)
  - Index: `.jobforge-index.json` (created on demand by `job-forge index:*`; gitignored local lookup state)
+ - Facts: `.jobforge-facts.json` (created on demand by `job-forge facts:*`; gitignored local fact state)
  - Canon: `templates/canon.json` (identity rules inspected with `job-forge canon:*`)
  - Preflight: `templates/preflight.json` (dispatch rounds/gates inspected with `job-forge preflight:*`)
  - Postflight: `templates/postflight.json` (dispatch outcomes/artifacts/post-steps inspected with `job-forge postflight:*`)
@@ -191,7 +194,7 @@ Create `data/pipeline.md` when you start using the URL inbox (`/job-forge pipeli
 
  ## Pipeline Integrity
 
- From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify-pipeline.mjs`. When a tracker file exists, it validates canonical statuses (using `templates/states.yml` when that file is present and parseable), validates every tracker row against `templates/contracts.json`, warns on probable duplicate company/role rows, checks that report column markdown links resolve to files in the repo, validates score column format (`X.X/5`, `N/A`, or `DUP`), rejects table rows with too few columns, flags markdown bold inside the score column, and warns if any `batch/tracker-additions/*.tsv` files are still waiting to be merged. If `.jobforge-ledger/events.jsonl` exists, verify also validates the append-only ledger. If `.jobforge-index.json` exists, verify validates the artifact index. It also compares state ids from `templates/states.yml` to an internal fallback list and warns when the two sets drift. **Fresh clone:** the command exits successfully when neither `data/applications.md` nor root `applications.md` exists yet; pending-TSV and states-drift checks still run so contributors see unmerged batch output early. Optional setup validation after you add `cv.md` and `config/profile.yml`: `npm run sync-check` (`cv-sync-check.mjs`).
+ From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify-pipeline.mjs`. When a tracker file exists, it validates canonical statuses (using `templates/states.yml` when that file is present and parseable), validates every tracker row against `templates/contracts.json`, warns on probable duplicate company/role rows, checks that report column markdown links resolve to files in the repo, validates score column format (`X.X/5`, `N/A`, or `DUP`), rejects table rows with too few columns, flags markdown bold inside the score column, and warns if any `batch/tracker-additions/*.tsv` files are still waiting to be merged. If `.jobforge-ledger/events.jsonl` exists, verify also validates the append-only ledger. If `.jobforge-index.json` exists, verify validates the artifact index. If `.jobforge-facts.json` exists, verify validates the materialized fact set. It also compares state ids from `templates/states.yml` to an internal fallback list and warns when the two sets drift. **Fresh clone:** the command exits successfully when neither `data/applications.md` nor root `applications.md` exists yet; pending-TSV and states-drift checks still run so contributors see unmerged batch output early. Optional setup validation after you add `cv.md` and `config/profile.yml`: `npm run sync-check` (`cv-sync-check.mjs`).
 
  **`verify-pipeline.mjs` checks (same order as the script header):**
 
@@ -206,6 +209,7 @@ From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify
  9. Warn when state ids in `templates/states.yml` drift from the script’s built-in fallback list (or when the file exists but ids failed to parse).
  10. Validate `.jobforge-ledger/events.jsonl` when present.
  11. Validate `.jobforge-index.json` when present.
+ 12. Validate `.jobforge-facts.json` when present.
 
  When the tracker file is missing, checks 1-6 and 8 are skipped; checks 7, 9, 10, and 11 still run when applicable.
 
@@ -231,6 +235,7 @@ Scripts maintain data consistency. In a consumer project they're invoked via the
  | `scripts/guard.mjs` | `npx job-forge guard:audit` / `guard:explain` | Deterministic `@razroo/iso-guard` policy audits over local OpenCode traces |
  | `scripts/ledger.mjs` | `npx job-forge ledger:status` / `ledger:has` / `ledger:rebuild` | Deterministic `@razroo/iso-ledger` state over tracker, TSV, and pipeline files |
  | `scripts/index.mjs` | `npx job-forge index:status` / `index:has` / `index:query` | Deterministic `@razroo/iso-index` lookup over reports, tracker rows, TSVs, pipeline, scan history, and ledger events |
+ | `scripts/facts.mjs` | `npx job-forge facts:status` / `facts:has` / `facts:query` | Deterministic `@razroo/iso-facts` materialization over job URLs, scores, application statuses, preflight candidates, scan history, and ledger events |
  | `scripts/canon.mjs` | `npx job-forge canon:normalize` / `canon:key` / `canon:compare` | Deterministic `@razroo/iso-canon` identity normalization for URLs, companies, roles, and company+role pairs |
  | `scripts/context.mjs` | `npx job-forge context:list` / `context:plan` / `context:check` / `context:render` | Deterministic `@razroo/iso-context` mode/reference context bundle planning and rendering |
  | `scripts/preflight.mjs` | `npx job-forge preflight:plan` / `preflight:check` / `preflight:explain` | Deterministic `@razroo/iso-preflight` dispatch planning for file-backed candidate facts and gates |
@@ -154,6 +154,10 @@ Mode/reference context bundles live in `templates/context.json` and are planned
 
  Artifact lookup policy lives in `templates/index.json` and is built locally by `@razroo/iso-index`. Use `job-forge index:has --key "company-role:acme:staff-engineer"` as a cheap duplicate/source prefilter, `job-forge index:query "acme"` to get compact source path/line pointers, and `job-forge index:verify` to validate `.jobforge-index.json`. Query, has, and verify rebuild the index on demand, so scaffolded projects need no setup. JobForge canonicalizes company/role and URL records through `templates/canon.json` before writing the index. This is not an MCP and does not add tool-schema tokens.
 
+ ## JobForge materialized facts
+
+ Fact extraction policy lives in `templates/facts.json` and is built locally by `@razroo/iso-facts`. Use `job-forge facts:query --fact job.url`, `job-forge facts:has --fact application.status --key "company-role:acme:staff-engineer"`, and `job-forge facts:verify` to work with compact source-backed facts instead of rereading reports, tracker day files, TSVs, candidate JSON, scan history, and ledger events. Query, has, verify, and check rebuild `.jobforge-facts.json` on demand, so scaffolded projects need no setup. JobForge canonicalizes company/role and URL fact keys through `templates/canon.json` before writing the fact set. This is not an MCP and does not add prompt or tool-schema tokens.
+
  ## JobForge identity canonicalization
 
  URL, company, role, and company+role identity rules live in `templates/canon.json` and are enforced locally by `@razroo/iso-canon`. Use `job-forge canon:key company-role --company "OpenAI, Inc." --role "Senior SWE, AI Platform"` to derive the same duplicate key used by ledger/index helpers, and `job-forge canon:compare company "OpenAI, Inc." "Open AI"` to explain whether two values resolve to the same entity. Custom forks can extend aliases, suffixes, stop words, and match thresholds in `templates/canon.json`. This is not an MCP and does not add prompt or tool-schema tokens.
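To make the `facts:has` and `facts:query` description added in this hunk more concrete, here is a minimal in-memory sketch. It is an illustration under stated assumptions, not the `@razroo/iso-facts` API: the record shape (`fact`, `key`, `value`, `source.path`, `source.line`) is inferred from the fields that the package's fact comparator reads later in this diff, `filterFacts` and `anyFact` are hypothetical local helpers, and the sample records (including the `url:` key format) are invented.

```javascript
// Hypothetical fact records; shape and values are assumptions for illustration.
const facts = [
  {
    fact: 'job.url',
    key: 'url:example.test/jobs/123',
    value: 'https://example.test/jobs/123',
    source: { path: 'reports/001-acme.md', line: 3 },
  },
  {
    fact: 'application.status',
    key: 'company-role:acme:staff-engineer',
    value: 'Applied',
    source: { path: 'data/applications/2026-04-25.md', line: 7 },
  },
];

// Keep records that match every field present in the query.
function filterFacts(records, query = {}) {
  return records.filter((r) =>
    (query.fact === undefined || r.fact === query.fact) &&
    (query.key === undefined || r.key === query.key));
}

// True when at least one record matches the query.
function anyFact(records, query = {}) {
  return filterFacts(records, query).length > 0;
}

// Roughly what a "query" answers: compact values with path/line provenance.
for (const r of filterFacts(facts, { fact: 'job.url' })) {
  console.log(`${r.fact} ${r.value} (${r.source.path}:${r.source.line})`);
}

// Roughly what a "has" answers: a yes/no check without rereading source files.
console.log(anyFact(facts, {
  fact: 'application.status',
  key: 'company-role:acme:staff-engineer',
}));
```

The point of the sketch is the provenance: each answer carries the file and line it came from, so a downstream consumer can re-check the value against the authoritative file.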
package/docs/README.md CHANGED
@@ -31,7 +31,7 @@ The harness exposes a single CLI (`job-forge`) installed as a `bin` entry. In a
 
  | What you need | Where to read |
  |---------------|---------------|
- | Full command list (`verify`, `merge`, `dedup`, `normalize`, `pdf`, `sync-check`, `tokens`, `trace`, `telemetry`, `guard`, `ledger`, `canon`, `context`, `preflight`, `postflight`, `redact`, `sync`). | [SETUP.md — Tracker and scripts (terminal)](SETUP.md#tracker-and-scripts-terminal). |
+ | Full command list (`verify`, `merge`, `dedup`, `normalize`, `pdf`, `sync-check`, `tokens`, `trace`, `telemetry`, `guard`, `ledger`, `canon`, `context`, `index`, `facts`, `preflight`, `postflight`, `redact`, `sync`). | [SETUP.md — Tracker and scripts (terminal)](SETUP.md#tracker-and-scripts-terminal). |
  | What each harness `.mjs` script does. | [ARCHITECTURE.md — Pipeline integrity](ARCHITECTURE.md#pipeline-integrity) and the scripts table underneath. |
  | Batch runner, TSV layout, and `batch/tracker-additions/` merge flow. | [batch/README.md](../batch/README.md). |
  | PR gate for harness contributions (`npm run verify` + `npm run build:dashboard`). | [CONTRIBUTING.md — Development](../CONTRIBUTING.md#development). |
package/docs/SETUP.md CHANGED
@@ -132,6 +132,7 @@ From your project root, these commands maintain the tracker and pipeline checks.
  | Inspect context bundle budget | `npx job-forge context:plan apply` | `npm run context:plan -- apply` |
  | Inspect local JD/artifact cache | `npx job-forge cache:status` | `npm run cache:status` |
  | Inspect local artifact index | `npx job-forge index:status` | `npm run index:status` |
+ | Inspect local materialized facts | `npx job-forge facts:status` | `npm run facts:status` |
  | Plan safe application dispatch rounds | `npx job-forge preflight:plan --candidates batch/preflight-candidates.json` | `npm run preflight:plan -- --candidates ...` |
  | Fail on blocked preflight candidates | `npx job-forge preflight:check --candidates batch/preflight-candidates.json` | `npm run preflight:check -- --candidates ...` |
  | Settle dispatch outcomes after a round | `npx job-forge postflight:status --plan batch/preflight-plan.json --outcomes batch/postflight-outcomes.json` | `npm run postflight:status -- --plan ... --outcomes ...` |
@@ -157,6 +158,7 @@ From your project root, these commands maintain the tracker and pipeline checks.
  | Check duplicate/status event without loading tracker files | `npx job-forge ledger:has --company "Acme" --role "Staff Engineer" --status Applied` | `npm run ledger:has -- --company ...` |
  | Check/reuse cached JD content | `npx job-forge cache:has --url <url>` / `npx job-forge cache:get --url <url>` | `npm run cache:has -- --url ...` |
  | Query local artifact pointers | `npx job-forge index:query "Acme"` / `npx job-forge index:has --key company-role:acme:staff-engineer` | `npm run index:query -- Acme` |
+ | Query local source-backed facts | `npx job-forge facts:query --fact job.url` / `npx job-forge facts:has --fact application.status --key company-role:acme:staff-engineer` | `npm run facts:query -- --fact job.url` |
  | Apply safe consumer migrations | `npx job-forge migrate:apply` | `npm run migrate:apply` |
  | Re-create harness symlinks | `npx job-forge sync` | `npm run sync` |
  | Build optional dashboard TUI (Go on `PATH`) | `(cd node_modules/job-forge/dashboard && go build .)` | `npm run build:dashboard` (harness repo only) |
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
 
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, `.jobforge-facts.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
 
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
 
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, source path/line pointers returned by `npx job-forge index:query ...`, and materialized fact records returned by `npx job-forge facts:query ...`.
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
 
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -72,6 +72,9 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
  - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
  why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
 
+ - [D13b] Use `job-forge facts:*` for deterministic source-backed fact materialization when available. `facts:has` and `facts:query` rebuild `.jobforge-facts.json` from `templates/facts.json` on demand, covering job URLs, scores, application statuses, tracker TSVs, preflight candidates, scan history, and ledger events with path/line provenance.
+ why: `iso-facts` is not an MCP and adds no prompt/tool-schema tokens; it turns authoritative files into compact queryable fact records so agents do not repeatedly reread broad artifact trees
+
  - [D14] Treat `templates/migrations.json` as the source of truth for consumer-project upgrades. Use `npx job-forge migrate:plan` or `npx job-forge migrate:check` when diagnosing harness drift; `job-forge sync` applies safe migrations automatically unless `JOB_FORGE_SKIP_MIGRATIONS=1` is set.
  why: `iso-migrate` is not an MCP and adds no prompt/tool-schema tokens; it prevents stale consumer scripts and generated-artifact ignores without asking agents to hand-edit package.json
 
@@ -91,8 +94,8 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
 
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
- 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
- 4. Prepare Geometra dispatches: cleanup [H3], canon/index/ledger prefilter when useful [D8, D13, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Use materialized facts when a fact query can answer the question [D13b]. Use canonical identity keys for duplicate checks [D15]. Use migration checks for harness drift [D14]. Use redaction checks before exporting local artifacts [D18]. Decide inline vs delegated work [D1].
+ 4. Prepare Geometra dispatches: cleanup [H3], canon/index/facts/ledger prefilter when useful [D8, D13, D13b, D15], dedupe [H2], location filter [D5], materialize candidate facts/gates and run preflight plan/check [D16], routing [D2, D10], proxy prompt hygiene [H8].
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b], then settle the round with postflight status [D17].
  6. Keep multi-job form-filling out of the orchestrator [H4].
  7. Cross-check subagent facts against authoritative files [H7].
@@ -0,0 +1,178 @@
1
+ import { existsSync, readFileSync, writeFileSync } from 'fs';
2
+ import { join } from 'path';
3
+ import {
4
+ buildFacts,
5
+ checkFactRequirements,
6
+ factId,
7
+ hasFact,
8
+ loadFactsConfig,
9
+ parseJson,
10
+ queryFacts,
11
+ verifyFactSet,
12
+ } from '@razroo/iso-facts';
13
+ import {
14
+ jobForgeCompanyRoleKey,
15
+ jobForgeUrlKey,
16
+ legacyCompanyRoleKey,
17
+ legacyUrlKey,
18
+ } from './jobforge-canon.mjs';
19
+
20
+ export const FACTS_FILE = '.jobforge-facts.json';
21
+ export const FACTS_CONFIG_FILE = 'templates/facts.json';
22
+
23
+ export function resolveProjectDir(projectDir = process.env.JOB_FORGE_PROJECT || process.cwd()) {
24
+ return projectDir;
25
+ }
26
+
27
+ export function jobForgeFactsPath(projectDir = resolveProjectDir()) {
28
+ return process.env.JOB_FORGE_FACTS || join(projectDir, FACTS_FILE);
29
+ }
30
+
31
+ export function jobForgeFactsConfigPath(projectDir = resolveProjectDir()) {
32
+ return process.env.JOB_FORGE_FACTS_CONFIG || join(projectDir, FACTS_CONFIG_FILE);
33
+ }
34
+
35
+ export function factsExist(projectDir = resolveProjectDir()) {
36
+ return existsSync(jobForgeFactsPath(projectDir));
37
+ }
38
+
39
+ export function readJobForgeFactsConfig(projectDir = resolveProjectDir()) {
40
+ const path = jobForgeFactsConfigPath(projectDir);
41
+ return loadFactsConfig(parseJson(readFileSync(path, 'utf8'), path));
42
+ }
43
+
44
+ export function buildJobForgeFacts(options = {}, projectDir = resolveProjectDir()) {
45
+ const config = readJobForgeFactsConfig(projectDir);
46
+ const factSet = canonicalizeJobForgeFacts(buildFacts(config, { root: projectDir }), projectDir);
47
+ const out = options.out || jobForgeFactsPath(projectDir);
48
+ if (options.write !== false) {
49
+ writeFileSync(out, `${JSON.stringify(factSet, null, 2)}\n`, 'utf8');
50
+ }
51
+ return { factSet, out };
52
+ }
53
+
54
+ export function readJobForgeFacts(projectDir = resolveProjectDir()) {
55
+ const path = jobForgeFactsPath(projectDir);
56
+ return parseJson(readFileSync(path, 'utf8'), path);
57
+ }
58
+
59
+ export function ensureJobForgeFacts(options = {}, projectDir = resolveProjectDir()) {
60
+ if (options.rebuild !== false || !factsExist(projectDir)) {
61
+ return buildJobForgeFacts({ out: options.out }, projectDir).factSet;
62
+ }
63
+ return readJobForgeFacts(projectDir);
64
+ }
65
+
66
+ export function queryJobForgeFacts(query = {}, options = {}, projectDir = resolveProjectDir()) {
67
+ return queryFacts(ensureJobForgeFacts(options, projectDir), query);
68
+ }
69
+
70
+ export function hasJobForgeFact(query = {}, options = {}, projectDir = resolveProjectDir()) {
71
+ return hasFact(ensureJobForgeFacts(options, projectDir), query);
72
+ }
73
+
74
+ export function verifyJobForgeFacts(options = {}, projectDir = resolveProjectDir()) {
75
+ const factSet = options.factSet || ensureJobForgeFacts(options, projectDir);
76
+ return verifyFactSet(factSet);
77
+ }
78
+
79
+ export function checkJobForgeFacts(options = {}, projectDir = resolveProjectDir()) {
80
+ const factSet = options.factSet || ensureJobForgeFacts(options, projectDir);
81
+ const config = readJobForgeFactsConfig(projectDir);
82
+ return checkFactRequirements(factSet, config.requirements || []);
83
+ }
84
+
85
+ export function jobForgeFactsSummary(projectDir = resolveProjectDir()) {
86
+ if (!factsExist(projectDir)) {
87
+ return {
88
+ path: jobForgeFactsPath(projectDir),
89
+ config: jobForgeFactsConfigPath(projectDir),
90
+ exists: false,
91
+ facts: 0,
92
+ files: 0,
93
+ sources: 0,
94
+ };
95
+ }
96
+ const factSet = readJobForgeFacts(projectDir);
97
+ return {
98
+ path: jobForgeFactsPath(projectDir),
99
+ config: jobForgeFactsConfigPath(projectDir),
100
+ exists: true,
101
+ facts: factSet.stats?.facts || 0,
102
+ files: factSet.stats?.files || 0,
103
+ sources: factSet.stats?.sources || 0,
104
+ configHash: factSet.configHash,
105
+ };
106
+ }
107
+
108
+ function canonicalizeJobForgeFacts(factSet, projectDir) {
109
+ const facts = (factSet.facts || []).map((fact) => canonicalizeJobForgeFact(fact, projectDir));
110
+ facts.sort(compareFacts);
111
+ return {
112
+ ...factSet,
113
+ facts,
114
+ stats: {
115
+ ...(factSet.stats || {}),
116
+ facts: facts.length,
117
+ },
118
+ };
119
+ }
120
+
121
+ function canonicalizeJobForgeFact(fact, projectDir) {
122
+ const key = canonicalFactKey(fact, projectDir);
123
+ if (key === fact.key) return fact;
124
+ const updated = { ...fact, key };
125
+ return { ...updated, id: factId(updated) };
126
+ }
127
+
128
+ function canonicalFactKey(fact, projectDir) {
129
+ if (isCompanyRoleFact(fact)) {
130
+ const { company, role } = companyRoleFields(fact);
131
+ if (company && role) return safeCompanyRoleKey(company, role, projectDir);
132
+ }
133
+ if (isUrlFact(fact)) {
134
+ const url = fact.fields?.url;
135
+ if (url) return safeUrlKey(url, projectDir);
136
+ }
137
+ return fact.key;
138
+ }
139
+
140
+ function isCompanyRoleFact(fact) {
141
+ return fact.key?.startsWith('company-role:') ||
142
+ fact.fact === 'application.status' ||
143
+ fact.fact === 'tracker.addition' ||
144
+ fact.fact === 'candidate.ready';
145
+ }
146
+
147
+ function companyRoleFields(fact) {
148
+ const fields = fact.fields || {};
149
+ return {
150
+ company: fields.company || fields.Company,
151
+ role: fields.role || fields.Role,
152
+ };
153
+ }
154
+
155
+ function isUrlFact(fact) {
156
+ return fact.key?.startsWith('url:') || fact.fact === 'job.url';
157
+ }
158
+
159
+ function safeCompanyRoleKey(company, role, projectDir) {
160
+ try {
161
+ return jobForgeCompanyRoleKey(company, role, projectDir);
162
+ } catch {
163
+ return legacyCompanyRoleKey(company, role);
164
+ }
165
+ }
166
+
167
+ function safeUrlKey(url, projectDir) {
168
+ try {
169
+ return jobForgeUrlKey(url, projectDir);
170
+ } catch {
171
+ return legacyUrlKey(url);
172
+ }
173
+ }
174
+
175
+ function compareFacts(a, b) {
176
+ return `${a.fact}\0${a.key || ''}\0${a.value || ''}\0${a.source?.path || ''}\0${a.source?.line || ''}\0${a.id}`
177
+ .localeCompare(`${b.fact}\0${b.key || ''}\0${b.value || ''}\0${b.source?.path || ''}\0${b.source?.line || ''}\0${b.id}`);
178
+ }
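Review note: the helpers added in this hunk re-key each fact to a canonical key when one can be derived, fall back to a legacy key if the canonical builder throws, and return the original object untouched when the key is already canonical. The following is a minimal standalone sketch of that pattern, not the package's implementation. `slugify`, `strictKey`, and `legacyKey` are hypothetical stand-ins for `jobForgeCompanyRoleKey` and `legacyCompanyRoleKey`, whose real behavior is not shown in this diff. The real code also recomputes the fact `id` after re-keying, which is omitted here.

```javascript
// Hypothetical slug helper; the package's real normalization is not shown in the diff.
function slugify(text) {
  return String(text).toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Hypothetical strict builder: throws when it cannot produce a usable slug.
function strictKey(company, role) {
  const c = slugify(company);
  const r = slugify(role);
  if (!c || !r) throw new Error('cannot canonicalize company/role');
  return `company-role:${c}:${r}`;
}

// Hypothetical legacy builder: never throws, keeps the raw text.
function legacyKey(company, role) {
  return `company-role:${company}:${role}`;
}

// Same shape as safeCompanyRoleKey in the diff: try strict, fall back to legacy.
function safeKey(company, role) {
  try {
    return strictKey(company, role);
  } catch {
    return legacyKey(company, role);
  }
}

// Same shape as canonicalizeJobForgeFact: only copy the fact when the key changes.
function canonicalize(fact) {
  const fields = fact.fields || {};
  const company = fields.company || fields.Company;
  const role = fields.role || fields.Role;
  const key = company && role ? safeKey(company, role) : fact.key;
  return key === fact.key ? fact : { ...fact, key };
}

const input = [
  { fact: 'tracker.addition', key: 'company-role:Acme Inc:Staff Engineer', fields: { company: 'Acme Inc', role: 'Staff Engineer' } },
  { fact: 'application.status', key: 'company-role:acme-inc:staff-engineer', fields: { Company: 'Acme Inc', Role: 'Staff Engineer' } },
  { fact: 'job.score', key: 'report:reports/001.md:score', fields: { score: '4.2/5' } },
];
const output = input.map(canonicalize);

console.log(output.map((f) => f.key));
console.log(safeKey('???', 'Engineer')); // strict builder throws, legacy key is used
```

Under these assumptions, the first two facts end up sharing one key even though their sources spell the fields differently, which appears to be the point of canonicalizing before the sort.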
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "job-forge",
3
- "version": "2.14.30",
3
+ "version": "2.14.31",
4
4
  "description": "AI-powered job search pipeline built on opencode",
5
5
  "type": "module",
6
6
  "bin": {
@@ -55,6 +55,13 @@
55
55
  "index:has": "node bin/job-forge.mjs index:has",
56
56
  "index:verify": "node bin/job-forge.mjs index:verify",
57
57
  "index:explain": "node bin/job-forge.mjs index:explain",
58
+ "facts:build": "node bin/job-forge.mjs facts:build",
59
+ "facts:status": "node bin/job-forge.mjs facts:status",
60
+ "facts:verify": "node bin/job-forge.mjs facts:verify",
61
+ "facts:check": "node bin/job-forge.mjs facts:check",
62
+ "facts:has": "node bin/job-forge.mjs facts:has",
63
+ "facts:query": "node bin/job-forge.mjs facts:query",
64
+ "facts:explain": "node bin/job-forge.mjs facts:explain",
58
65
  "canon:normalize": "node bin/job-forge.mjs canon:normalize",
59
66
  "canon:key": "node bin/job-forge.mjs canon:key",
60
67
  "canon:compare": "node bin/job-forge.mjs canon:compare",
@@ -144,6 +151,7 @@
144
151
  "@razroo/iso-capabilities": "^0.1.0",
145
152
  "@razroo/iso-context": "^0.1.0",
146
153
  "@razroo/iso-contract": "^0.1.0",
154
+ "@razroo/iso-facts": "^0.1.0",
147
155
  "@razroo/iso-guard": "^0.1.0",
148
156
  "@razroo/iso-index": "^0.1.0",
149
157
  "@razroo/iso-ledger": "^0.1.0",
@@ -0,0 +1,238 @@
1
+ #!/usr/bin/env node
2
+
3
+ import { relative } from 'path';
4
+ import {
5
+ formatBuildResult,
6
+ formatCheckResult,
7
+ formatConfigSummary,
8
+ formatFacts,
9
+ formatVerifyResult,
10
+ } from '@razroo/iso-facts';
11
+ import { PROJECT_DIR } from '../tracker-lib.mjs';
12
+ import {
13
+ buildJobForgeFacts,
14
+ checkJobForgeFacts,
15
+ factsExist,
16
+ hasJobForgeFact,
17
+ jobForgeFactsConfigPath,
18
+ jobForgeFactsPath,
19
+ jobForgeFactsSummary,
20
+ queryJobForgeFacts,
21
+ readJobForgeFactsConfig,
22
+ verifyJobForgeFacts,
23
+ } from '../lib/jobforge-facts.mjs';
24
+
25
+ const USAGE = `job-forge facts - local deterministic fact materialization
26
+
27
+ Usage:
28
+ job-forge facts:status [--json]
29
+ job-forge facts:build [--json]
30
+ job-forge facts:query [text] [--fact <fact>] [--key <key>] [--value <value>] [--source <path>] [--tag <tag>] [--limit N] [--no-rebuild] [--json]
31
+ job-forge facts:has [text] [--fact <fact>] [--key <key>] [--value <value>] [--source <path>] [--tag <tag>] [--no-rebuild] [--json]
32
+ job-forge facts:verify [--no-rebuild] [--json]
33
+ job-forge facts:check [--no-rebuild] [--json]
34
+ job-forge facts:explain [--json]
35
+ job-forge facts:path
36
+
37
+ Default config is templates/facts.json. Default output is .jobforge-facts.json.
38
+ Query, has, verify, and check rebuild facts by default so consumer projects need
39
+ no manual setup. Use --no-rebuild to inspect the existing fact set only.`;
40
+
41
+ const [cmd = 'help', ...rawArgs] = process.argv.slice(2);
42
+ const opts = parseArgs(rawArgs);
43
+
44
+ if (opts.help || cmd === 'help' || cmd === '--help' || cmd === '-h') {
45
+ console.log(USAGE);
46
+ process.exit(0);
47
+ }
48
+
49
+ try {
50
+ if (cmd === 'path') {
51
+ console.log(jobForgeFactsPath(PROJECT_DIR));
52
+ } else if (cmd === 'status') {
53
+ status(opts);
54
+ } else if (cmd === 'build') {
55
+ build(opts);
56
+ } else if (cmd === 'query') {
57
+ query(opts);
58
+ } else if (cmd === 'has') {
59
+ has(opts);
60
+ } else if (cmd === 'verify') {
61
+ verify(opts);
62
+ } else if (cmd === 'check') {
63
+ check(opts);
64
+ } else if (cmd === 'explain') {
65
+ explain(opts);
66
+ } else {
67
+ console.error(`unknown facts command "${cmd}"\n`);
68
+ console.error(USAGE);
69
+ process.exit(2);
70
+ }
71
+ } catch (error) {
72
+ console.error(error instanceof Error ? error.message : String(error));
73
+ process.exit(1);
74
+ }
75
+
76
+ function parseArgs(args) {
77
+ const opts = {
78
+ json: false,
79
+ help: false,
80
+ rebuild: true,
81
+ query: {},
82
+ text: [],
83
+ };
84
+
85
+ for (let i = 0; i < args.length; i++) {
86
+ const arg = args[i];
87
+ if (arg === '--json') {
88
+ opts.json = true;
89
+ } else if (arg === '--no-rebuild') {
90
+ opts.rebuild = false;
91
+ } else if (arg === '--rebuild') {
92
+ opts.rebuild = true;
93
+ } else if (arg === '--fact') {
94
+ opts.query.fact = valueAfter(args, ++i, '--fact');
95
+ } else if (arg.startsWith('--fact=')) {
96
+ opts.query.fact = arg.slice('--fact='.length);
97
+ } else if (arg === '--key') {
98
+ opts.query.key = valueAfter(args, ++i, '--key');
99
+ } else if (arg.startsWith('--key=')) {
100
+ opts.query.key = arg.slice('--key='.length);
101
+ } else if (arg === '--value') {
102
+ opts.query.value = valueAfter(args, ++i, '--value');
103
+ } else if (arg.startsWith('--value=')) {
104
+ opts.query.value = arg.slice('--value='.length);
105
+ } else if (arg === '--source') {
106
+ opts.query.source = valueAfter(args, ++i, '--source');
107
+ } else if (arg.startsWith('--source=')) {
108
+ opts.query.source = arg.slice('--source='.length);
109
+ } else if (arg === '--tag') {
110
+ opts.query.tag = valueAfter(args, ++i, '--tag');
111
+ } else if (arg.startsWith('--tag=')) {
112
+ opts.query.tag = arg.slice('--tag='.length);
113
+ } else if (arg === '--limit') {
114
+ opts.query.limit = parsePositiveInteger(valueAfter(args, ++i, '--limit'), '--limit');
115
+ } else if (arg.startsWith('--limit=')) {
116
+ opts.query.limit = parsePositiveInteger(arg.slice('--limit='.length), '--limit');
117
+ } else if (arg === '--help' || arg === '-h') {
118
+ opts.help = true;
119
+ } else if (arg.startsWith('--')) {
120
+ throw new Error(`unknown flag "${arg}"`);
121
+ } else {
122
+ opts.text.push(arg);
123
+ }
124
+ }
125
+
126
+ if (opts.text.length > 0) opts.query.text = opts.text.join(' ');
127
+ return opts;
128
+ }
129
+
130
+ function valueAfter(values, index, flag) {
131
+ const value = values[index];
132
+ if (!value || value.startsWith('--')) throw new Error(`${flag} requires a value`);
133
+ return value;
134
+ }
135
+
136
+ function parsePositiveInteger(value, flag) {
137
+ const parsed = Number(value);
138
+ if (!Number.isInteger(parsed) || parsed <= 0) throw new Error(`${flag} must be a positive integer`);
139
+ return parsed;
140
+ }
141
+
142
+ function status(opts) {
143
+ const summary = jobForgeFactsSummary(PROJECT_DIR);
144
+ if (opts.json) {
145
+ console.log(JSON.stringify(summary, null, 2));
146
+ return;
147
+ }
148
+ if (!summary.exists) {
149
+ console.log(`facts: missing (${relativePath(summary.path)})`);
150
+ console.log('run: job-forge facts:build');
151
+ return;
152
+ }
153
+ const result = verifyJobForgeFacts({ rebuild: false }, PROJECT_DIR);
154
+ console.log(`facts: ${relativePath(summary.path)}`);
155
+ console.log(`config: ${relativePath(summary.config)}`);
156
+ console.log(`sources: ${summary.sources}`);
157
+ console.log(`files: ${summary.files}`);
158
+ console.log(`facts: ${summary.facts}`);
159
+ console.log(`verify: ${result.ok ? 'PASS' : 'FAIL'} (${result.issues.length} issue(s))`);
160
+ }
161
+
162
+ function build(opts) {
163
+ const { factSet, out } = buildJobForgeFacts({}, PROJECT_DIR);
164
+ if (opts.json) {
165
+ console.log(JSON.stringify({ out, stats: factSet.stats }, null, 2));
166
+ return;
167
+ }
168
+ console.log(formatBuildResult(factSet, out));
169
+ }
170
+
171
+ function query(opts) {
172
+ const facts = queryJobForgeFacts(opts.query, { rebuild: opts.rebuild }, PROJECT_DIR);
173
+ if (opts.json) {
174
+ console.log(JSON.stringify(facts, null, 2));
175
+ return;
176
+ }
177
+ console.log(formatFacts(facts));
178
+ }
179
+
180
+ function has(opts) {
181
+ const hit = hasJobForgeFact(opts.query, { rebuild: opts.rebuild }, PROJECT_DIR);
182
+ if (opts.json) {
183
+ console.log(JSON.stringify({ hit, query: opts.query }, null, 2));
184
+ } else {
185
+ console.log(hit ? 'MATCH' : 'MISS');
186
+ }
187
+ process.exit(hit ? 0 : 1);
188
+ }
189
+
190
+ function verify(opts) {
191
+ if (!opts.rebuild && !factsExist(PROJECT_DIR)) {
192
+ if (opts.json) {
193
+ console.log(JSON.stringify({ ok: true, missing: true, path: jobForgeFactsPath(PROJECT_DIR) }, null, 2));
194
+ } else {
195
+ console.log(`facts: missing (${relativePath(jobForgeFactsPath(PROJECT_DIR))})`);
196
+ }
197
+ return;
198
+ }
199
+ const result = verifyJobForgeFacts({ rebuild: opts.rebuild }, PROJECT_DIR);
200
+ if (opts.json) {
201
+ console.log(JSON.stringify(result, null, 2));
202
+ } else {
203
+ console.log(formatVerifyResult(result));
204
+ }
205
+ process.exit(result.ok ? 0 : 1);
206
+ }
207
+
208
+ function check(opts) {
209
+ if (!opts.rebuild && !factsExist(PROJECT_DIR)) {
210
+ if (opts.json) {
211
+ console.log(JSON.stringify({ ok: true, missing: true, path: jobForgeFactsPath(PROJECT_DIR) }, null, 2));
212
+ } else {
213
+ console.log(`facts: missing (${relativePath(jobForgeFactsPath(PROJECT_DIR))})`);
214
+ }
215
+ return;
216
+ }
217
+ const result = checkJobForgeFacts({ rebuild: opts.rebuild }, PROJECT_DIR);
218
+ if (opts.json) {
219
+ console.log(JSON.stringify(result, null, 2));
220
+ } else {
221
+ console.log(formatCheckResult(result));
222
+ }
223
+ process.exit(result.ok ? 0 : 1);
224
+ }
225
+
226
+ function explain(opts) {
227
+ const config = readJobForgeFactsConfig(PROJECT_DIR);
228
+ if (opts.json) {
229
+ console.log(JSON.stringify(config, null, 2));
230
+ return;
231
+ }
232
+ console.log(`config: ${relativePath(jobForgeFactsConfigPath(PROJECT_DIR))}`);
233
+ console.log(formatConfigSummary(config));
234
+ }
235
+
236
+ function relativePath(path) {
237
+ return relative(PROJECT_DIR, path) || '.';
238
+ }
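Review note: `parseArgs` in this file accepts every query flag in two spellings (`--fact job.url` and `--fact=job.url`), rejects unknown `--` flags, and joins bare words into a free-text query. The sketch below shows the same behavior in a table-driven form, for illustration only. The `parseQueryArgs` name and the `VALUE_FLAGS` table are hypothetical and not part of the package, and `--limit`, `--json`, and `--no-rebuild` are left out to keep it short.

```javascript
// Flags that take a string value, mirroring the query flags in the USAGE text.
const VALUE_FLAGS = ['fact', 'key', 'value', 'source', 'tag'];

function parseQueryArgs(args) {
  const query = {};
  const text = [];

  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    // Match either the bare flag ("--fact") or the inline form ("--fact=...").
    const name = VALUE_FLAGS.find((flag) => arg === `--${flag}` || arg.startsWith(`--${flag}=`));

    if (name) {
      if (arg === `--${name}`) {
        // Space-separated form: the value is the next argument.
        const value = args[++i];
        if (!value || value.startsWith('--')) throw new Error(`--${name} requires a value`);
        query[name] = value;
      } else {
        // Inline form: the value follows the "=".
        query[name] = arg.slice(`--${name}=`.length);
      }
    } else if (arg.startsWith('--')) {
      throw new Error(`unknown flag "${arg}"`);
    } else {
      // Anything else is part of the free-text query.
      text.push(arg);
    }
  }

  if (text.length > 0) query.text = text.join(' ');
  return query;
}

const parsed = parseQueryArgs(['senior', 'engineer', '--fact', 'job.url', '--tag=report']);
console.log(parsed);
```

Both versions fail fast on a flag with a missing value, so `--key --json` is reported as an error rather than treating `--json` as the key.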
@@ -0,0 +1,200 @@
1
+ {
2
+ "version": 1,
3
+ "sources": [
4
+ {
5
+ "name": "reports",
6
+ "include": ["reports/*.md"],
7
+ "format": "text",
8
+ "rules": [
9
+ {
10
+ "fact": "job.url",
11
+ "pattern": "^\\*\\*URL:\\*\\*\\s*(?<url>https?://\\S+)",
12
+ "flags": "i",
13
+ "key": "url:{url}",
14
+ "value": "{url}",
15
+ "fields": {
16
+ "url": "{url}",
17
+ "report": "{source}"
18
+ },
19
+ "tags": ["report", "url"]
20
+ },
21
+ {
22
+ "fact": "job.score",
23
+ "pattern": "^\\*\\*Score:\\*\\*\\s*(?<score>[0-9.]+/5)",
24
+ "flags": "i",
25
+ "key": "report:{source}:score",
26
+ "value": "{score}",
27
+ "fields": {
28
+ "score": "{score}",
29
+ "report": "{source}"
30
+ },
31
+ "tags": ["report", "score"]
32
+ }
33
+ ]
34
+ },
35
+ {
36
+ "name": "application-day-files",
37
+ "include": ["data/applications/*.md"],
38
+ "format": "markdown-table",
39
+ "records": [
40
+ {
41
+ "fact": "application.status",
42
+ "key": "company-role:{Company|slug}:{Role|slug}",
43
+ "value": "{Status}",
44
+ "fields": {
45
+ "num": "{#}",
46
+ "date": "{Date}",
47
+ "company": "{Company}",
48
+ "role": "{Role}",
49
+ "score": "{Score}",
50
+ "status": "{Status}",
51
+ "pdf": "{PDF}",
52
+ "report": "{Report}",
53
+ "notes": "{Notes}"
54
+ },
55
+ "tags": ["tracker", "application"]
56
+ }
57
+ ]
58
+ },
59
+ {
60
+ "name": "application-single-file",
61
+ "include": ["data/applications.md", "applications.md"],
62
+ "format": "markdown-table",
63
+ "records": [
64
+ {
65
+ "fact": "application.status",
66
+ "key": "company-role:{Company|slug}:{Role|slug}",
67
+ "value": "{Status}",
68
+ "fields": {
69
+ "num": "{#}",
70
+ "date": "{Date}",
71
+ "company": "{Company}",
72
+ "role": "{Role}",
73
+ "score": "{Score}",
74
+ "status": "{Status}",
75
+ "pdf": "{PDF}",
76
+ "report": "{Report}",
77
+ "notes": "{Notes}"
78
+ },
79
+ "tags": ["tracker", "application"]
80
+ }
81
+ ]
82
+ },
83
+ {
84
+ "name": "tracker-additions",
85
+ "include": ["batch/tracker-additions/*.tsv", "batch/tracker-additions/merged/*.tsv"],
86
+ "format": "tsv",
87
+ "header": false,
88
+ "columns": ["num", "date", "company", "role", "statusOrScore", "scoreOrStatus", "pdf", "report", "notes"],
89
+ "records": [
90
+ {
91
+ "fact": "tracker.addition",
92
+ "key": "company-role:{company|slug}:{role|slug}",
93
+ "value": "{source}",
94
+ "fields": ["num", "date", "company", "role", "statusOrScore", "scoreOrStatus", "pdf", "report", "notes"],
95
+ "tags": ["tracker", "tsv"]
96
+ }
97
+ ]
98
+ },
99
+ {
100
+ "name": "pipeline",
101
+ "include": ["data/pipeline.md"],
102
+ "format": "text",
103
+ "rules": [
104
+ {
105
+ "fact": "job.url",
106
+ "pattern": "^\\s*-\\s*\\[(?<state>[ xX])\\]\\s+(?<url>https?://\\S+)",
107
+ "key": "url:{url}",
108
+ "value": "{state}",
109
+ "fields": {
110
+ "url": "{url}",
111
+ "state": "{state}",
112
+ "pipeline": "{source}"
113
+ },
114
+ "tags": ["pipeline", "url"]
115
+ }
116
+ ]
117
+ },
118
+ {
119
+ "name": "scan-history",
120
+ "include": ["data/scan-history.tsv"],
121
+ "format": "tsv",
122
+ "records": [
123
+ {
124
+ "fact": "job.url",
125
+ "key": "url:{url}",
126
+ "value": "{url}",
127
+ "fields": ["date", "company", "role", "url", "ats"],
128
+ "tags": ["scan", "url"]
129
+ }
130
+ ]
131
+ },
132
+ {
133
+ "name": "preflight-candidates-object",
134
+ "include": ["batch/preflight-candidates.json"],
135
+ "format": "json",
136
+ "records": [
137
+ {
138
+ "fact": "candidate.ready",
139
+ "path": "$.candidates[]",
140
+ "key": "{companyRoleKey}",
141
+ "value": "{url}",
142
+ "fields": {
143
+ "id": "{id}",
144
+ "company": "{company}",
145
+ "role": "{role}",
146
+ "companyRoleKey": "{companyRoleKey}",
147
+ "url": "{url}",
148
+ "score": "{score}",
149
+ "gateStatus": "{gate.status}",
150
+ "locationStatus": "{location.status}"
151
+ },
152
+ "tags": ["candidate", "preflight"]
153
+ }
154
+ ]
155
+ },
156
+ {
157
+ "name": "preflight-candidates-array",
158
+ "include": ["batch/preflight-candidates.json"],
159
+ "format": "json",
160
+ "records": [
161
+ {
162
+ "fact": "candidate.ready",
163
+ "path": "$[]",
164
+ "key": "{companyRoleKey}",
165
+ "value": "{url}",
166
+ "fields": {
167
+ "id": "{id}",
168
+ "company": "{company}",
169
+ "role": "{role}",
170
+ "companyRoleKey": "{companyRoleKey}",
171
+ "url": "{url}",
172
+ "score": "{score}",
173
+ "gateStatus": "{gate.status}",
174
+ "locationStatus": "{location.status}"
175
+ },
176
+ "tags": ["candidate", "preflight"]
177
+ }
178
+ ]
179
+ },
180
+ {
181
+ "name": "ledger",
182
+ "include": [".jobforge-ledger/*.jsonl"],
183
+ "format": "jsonl",
184
+ "records": [
185
+ {
186
+ "fact": "ledger.event",
187
+ "key": "{key}",
188
+ "value": "{type}",
189
+ "fields": {
190
+ "type": "{type}",
191
+ "key": "{key}",
192
+ "status": "{data.status}",
193
+ "source": "{source}"
194
+ },
195
+ "tags": ["ledger"]
196
+ }
197
+ ]
198
+ }
199
+ ]
200
+ }
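Review note: each `rules` entry in the config above pairs a regular expression with named capture groups and `{placeholder}` templates for `key`, `value`, and `fields`. The sketch below shows one plausible reading of how such a rule could turn a report line into a fact. It is an assumption, not the `@razroo/iso-facts` engine, which is not part of this diff. `fillTemplate` and `applyRule` are hypothetical names, line-by-line matching is assumed from the `^` anchors, and filters such as `{Company|slug}` are not handled.

```javascript
// The first "reports" rule from the config, transcribed into a JS object.
const rule = {
  fact: 'job.url',
  pattern: '^\\*\\*URL:\\*\\*\\s*(?<url>https?://\\S+)',
  flags: 'i',
  key: 'url:{url}',
  value: '{url}',
};

// Replace each {name} with the matching variable; unknown names become ''.
function fillTemplate(template, vars) {
  return template.replace(/\{([^}]+)\}/g, (_, name) => vars[name] ?? '');
}

// Assumed behavior: test the pattern against every line and emit one fact per match.
function applyRule(rule, text, source) {
  const regex = new RegExp(rule.pattern, rule.flags || '');
  const facts = [];
  text.split('\n').forEach((line, index) => {
    const match = regex.exec(line);
    if (!match) return;
    // Named groups plus the source path are available to the templates.
    const vars = { ...match.groups, source };
    facts.push({
      fact: rule.fact,
      key: fillTemplate(rule.key, vars),
      value: fillTemplate(rule.value, vars),
      source: { path: source, line: index + 1 },
    });
  });
  return facts;
}

// Made-up report content for illustration.
const report = '# Example report\n**URL:** https://example.com/jobs/42\n**Score:** 4.2/5';
const facts = applyRule(rule, report, 'reports/001-example.md');
console.log(facts);
```

Read this way, the `key` template is what lets facts from different sources (reports, `data/pipeline.md`, `data/scan-history.tsv`) collide on the same `url:` key and be queried together.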
@@ -33,6 +33,13 @@
33
33
  "index:has": "job-forge index:has",
34
34
  "index:query": "job-forge index:query",
35
35
  "index:explain": "job-forge index:explain",
36
+ "facts:build": "job-forge facts:build",
37
+ "facts:status": "job-forge facts:status",
38
+ "facts:verify": "job-forge facts:verify",
39
+ "facts:check": "job-forge facts:check",
40
+ "facts:has": "job-forge facts:has",
41
+ "facts:query": "job-forge facts:query",
42
+ "facts:explain": "job-forge facts:explain",
36
43
  "canon:normalize": "job-forge canon:normalize",
37
44
  "canon:key": "job-forge canon:key",
38
45
  "canon:compare": "job-forge canon:compare",
@@ -69,6 +76,7 @@
69
76
  ".jobforge-ledger/",
70
77
  ".jobforge-cache/",
71
78
  ".jobforge-index.json",
79
+ ".jobforge-facts.json",
72
80
  ".jobforge-redacted/",
73
81
  "batch/preflight-candidates.json",
74
82
  "batch/preflight-plan.json",
@@ -18,6 +18,7 @@
18
18
  * 9. Drift warning if states.yml ids differ from the built-in fallback list
19
19
  * 10. Ledger file verifies if .jobforge-ledger/events.jsonl exists
20
20
  * 11. Artifact index verifies if .jobforge-index.json exists
21
+ * 12. Fact set verifies if .jobforge-facts.json exists
21
22
  *
22
23
  * Run: node verify-pipeline.mjs (from repo root; same as npm run verify)
23
24
  */
@@ -31,6 +32,7 @@ import {
31
32
  } from './tracker-lib.mjs';
32
33
  import { jobForgeLedgerPath, ledgerExists, verifyJobForgeLedger } from './lib/jobforge-ledger.mjs';
33
34
  import { indexExists, jobForgeIndexPath, verifyJobForgeIndex } from './lib/jobforge-index.mjs';
35
+ import { factsExist, jobForgeFactsPath, verifyJobForgeFacts } from './lib/jobforge-facts.mjs';
34
36
  import {
35
37
  canonicalStatusValues,
36
38
  formatContractIssues,
@@ -171,6 +173,22 @@ function verifyIndexIfPresent() {
171
173
  }
172
174
  }
173
175
 
176
+ function verifyFactsIfPresent() {
177
+ if (!factsExist(PROJECT_DIR)) {
178
+ ok('Fact set not initialized');
179
+ return;
180
+ }
181
+ const result = verifyJobForgeFacts({ rebuild: false }, PROJECT_DIR);
182
+ for (const issue of result.issues) {
183
+ const msg = `facts: ${issue.kind}: ${issue.message}`;
184
+ if (issue.severity === 'error') error(msg);
185
+ else warn(msg);
186
+ }
187
+ if (result.ok) {
188
+ ok(`Fact set valid (${result.facts} facts at ${relative(PROJECT_DIR, jobForgeFactsPath(PROJECT_DIR))})`);
189
+ }
190
+ }
191
+
174
192
  // --- Read entries ---
175
193
  const { entries, source } = readAllEntries();
176
194
 
@@ -181,6 +199,7 @@ if (entries.length === 0) {
181
199
  verifyStatesYamlDrift();
182
200
  verifyLedgerIfPresent();
183
201
  verifyIndexIfPresent();
202
+ verifyFactsIfPresent();
184
203
  console.log('\n' + '='.repeat(50));
185
204
  console.log(`📊 Pipeline Health: ${errors} errors, ${warnings} warnings`);
186
205
  if (errors === 0 && warnings === 0) console.log('🟢 Pipeline is clean!');
@@ -317,6 +336,7 @@ if (boldScores === 0) ok('No bold in scores');
317
336
  verifyStatesYamlDrift();
318
337
  verifyLedgerIfPresent();
319
338
  verifyIndexIfPresent();
339
+ verifyFactsIfPresent();
320
340
 
321
341
  console.log('\n' + '='.repeat(50));
322
342
  console.log(`📊 Pipeline Health: ${errors} errors, ${warnings} warnings`);