job-forge 2.14.21 → 2.14.23

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -12,7 +12,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
12
12
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
13
13
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
14
14
 
15
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
15
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
16
16
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
17
17
 
18
18
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -24,13 +24,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
24
24
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
25
25
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
26
26
 
27
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, or `iso-trace`) rather than spawning a "check task status" subagent.
27
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
28
28
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
29
29
 
30
30
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
31
31
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
32
32
 
33
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`.
33
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
34
34
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
35
35
 
36
36
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -71,12 +71,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
71
71
  - [D11] Treat `templates/context.json` as the source of truth for mode/reference context bundles. Use `npx job-forge context:plan <mode>` or `npx job-forge context:check <mode>` when changing or validating what a mode loads; do not paste the full context matrix into prompts.
72
72
  why: deterministic context bundles prevent reference-file drift and accidental token bloat without adding MCP/tool-schema tokens
73
73
 
74
+ - [D12] Use `job-forge cache:*` for deterministic local artifact reuse when available. For URL inputs, check `npx job-forge cache:has --url "..."` / `cache:get` before browser or network JD fetches; after a successful fetch, store the exact JD text with `npx job-forge cache:put --url "..." --ttl 14d --input @file` when it is already on disk.
75
+ why: `iso-cache` is not an MCP and adds no prompt/tool-schema tokens; it avoids repeated JD fetch/render passes and lets future sessions reuse stable content from `.jobforge-cache/`
76
+
77
+ - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
78
+ why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
79
+
74
80
  ## Procedure
75
81
 
76
82
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
77
83
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
78
- 3. Read the active mode file [D3]; use context bundle checks when changing context loads [D11]; decide inline vs delegated work [D1].
79
- 4. Prepare Geometra dispatches: cleanup [H3], ledger prefilter when present [D8], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
84
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
85
+ 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
80
86
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
81
87
  6. Keep multi-job form-filling out of the orchestrator [H4].
82
88
  7. Cross-check subagent facts against authoritative files [H7].
@@ -73,6 +73,11 @@ Local workflow ledger (terminal, outside opencode):
73
73
  npx job-forge ledger:status # .jobforge-ledger/events.jsonl summary
74
74
  npx job-forge ledger:has --company "Acme" --role "Staff Engineer" --status Applied
75
75
 
76
+ Local artifact index (terminal, outside opencode):
77
+ npx job-forge index:status # .jobforge-index.json summary
78
+ npx job-forge index:has --key "company-role:acme:staff-engineer"
79
+ npx job-forge index:query "acme"
80
+
76
81
  Artifact contracts (terminal, outside opencode):
77
82
  npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json
78
83
  npx job-forge tracker-line ... --write # renders + validates tracker TSV locally
@@ -158,6 +163,11 @@ Step 1 — Enumerate candidates
158
163
  - Build ordered list: candidates = [job_1, job_2, ..., job_N]
159
164
 
160
165
  Step 2 — Dedup against already-applied
166
+ - Run npx job-forge index:has --key "company-role:<company-slug>:<role-slug>"
167
+ as a fast local artifact prefilter when company+role is known. It rebuilds
168
+ .jobforge-index.json on demand from templates/index.json. A hit means the
169
+ role has already appeared in tracker files or tracker TSVs and can be
170
+ dropped before dispatch.
161
171
  - If .jobforge-ledger/events.jsonl exists, use npx job-forge ledger:has as a
162
172
  fast prefilter for obvious company+role Applied duplicates. A ledger match
163
173
  can be dropped before dispatch without loading tracker files into context.
package/AGENTS.md CHANGED
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
7
7
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
8
8
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
9
9
 
10
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
10
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
11
11
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
12
12
 
13
13
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
19
19
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
20
20
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
21
21
 
22
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, or `iso-trace`) rather than spawning a "check task status" subagent.
22
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
23
23
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
24
24
 
25
25
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
26
26
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
27
27
 
28
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`.
28
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
29
29
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
30
30
 
31
31
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -66,12 +66,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
66
66
  - [D11] Treat `templates/context.json` as the source of truth for mode/reference context bundles. Use `npx job-forge context:plan <mode>` or `npx job-forge context:check <mode>` when changing or validating what a mode loads; do not paste the full context matrix into prompts.
67
67
  why: deterministic context bundles prevent reference-file drift and accidental token bloat without adding MCP/tool-schema tokens
68
68
 
69
+ - [D12] Use `job-forge cache:*` for deterministic local artifact reuse when available. For URL inputs, check `npx job-forge cache:has --url "..."` / `cache:get` before browser or network JD fetches; after a successful fetch, store the exact JD text with `npx job-forge cache:put --url "..." --ttl 14d --input @file` when it is already on disk.
70
+ why: `iso-cache` is not an MCP and adds no prompt/tool-schema tokens; it avoids repeated JD fetch/render passes and lets future sessions reuse stable content from `.jobforge-cache/`
71
+
72
+ - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
73
+ why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
74
+
69
75
  ## Procedure
70
76
 
71
77
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
72
78
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
73
- 3. Read the active mode file [D3]; use context bundle checks when changing context loads [D11]; decide inline vs delegated work [D1].
74
- 4. Prepare Geometra dispatches: cleanup [H3], ledger prefilter when present [D8], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
79
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
80
+ 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
75
81
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
76
82
  6. Keep multi-job form-filling out of the orchestrator [H4].
77
83
  7. Cross-check subagent facts against authoritative files [H7].
package/CLAUDE.md CHANGED
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
7
7
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
8
8
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
9
9
 
10
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
10
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
11
11
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
12
12
 
13
13
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
19
19
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
20
20
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
21
21
 
22
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, or `iso-trace`) rather than spawning a "check task status" subagent.
22
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
23
23
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
24
24
 
25
25
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
26
26
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
27
27
 
28
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`.
28
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
29
29
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
30
30
 
31
31
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -66,12 +66,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
66
66
  - [D11] Treat `templates/context.json` as the source of truth for mode/reference context bundles. Use `npx job-forge context:plan <mode>` or `npx job-forge context:check <mode>` when changing or validating what a mode loads; do not paste the full context matrix into prompts.
67
67
  why: deterministic context bundles prevent reference-file drift and accidental token bloat without adding MCP/tool-schema tokens
68
68
 
69
+ - [D12] Use `job-forge cache:*` for deterministic local artifact reuse when available. For URL inputs, check `npx job-forge cache:has --url "..."` / `cache:get` before browser or network JD fetches; after a successful fetch, store the exact JD text with `npx job-forge cache:put --url "..." --ttl 14d --input @file` when it is already on disk.
70
+ why: `iso-cache` is not an MCP and adds no prompt/tool-schema tokens; it avoids repeated JD fetch/render passes and lets future sessions reuse stable content from `.jobforge-cache/`
71
+
72
+ - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
73
+ why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
74
+
69
75
  ## Procedure
70
76
 
71
77
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
72
78
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
73
- 3. Read the active mode file [D3]; use context bundle checks when changing context loads [D11]; decide inline vs delegated work [D1].
74
- 4. Prepare Geometra dispatches: cleanup [H3], ledger prefilter when present [D8], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
79
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
80
+ 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
75
81
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
76
82
  6. Keep multi-job form-filling out of the orchestrator [H4].
77
83
  7. Cross-check subagent facts against authoritative files [H7].
package/README.md CHANGED
@@ -31,7 +31,7 @@ The scaffolded `opencode.json` already has three MCPs wired up — they launch a
31
31
  - **Gmail** — reads replies from recruiters
32
32
  - **state-trace** — typed working memory for cross-session context (resumed batches, recent decisions, repeated portal quirks). Install once with `python3 -m pip install "state-trace[mcp]"`; the MCP command is `state-trace-mcp`.
33
33
 
34
- JobForge also keeps MCP-free local workflow state: `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, and `.jobforge-ledger/events.jsonl` records deterministic duplicate/status events via `@razroo/iso-ledger`. None of these add always-on prompt or tool-schema tokens.
34
+ JobForge also keeps MCP-free local workflow state: `templates/contracts.json` defines tracker/apply artifact shapes via `@razroo/iso-contract`, `templates/capabilities.json` defines role capability boundaries via `@razroo/iso-capabilities`, `templates/context.json` defines deterministic mode/reference bundles via `@razroo/iso-context`, `.jobforge-ledger/events.jsonl` records duplicate/status events via `@razroo/iso-ledger`, `.jobforge-cache/` stores reusable JD/artifact content via `@razroo/iso-cache`, and `.jobforge-index.json` indexes artifact source pointers via `@razroo/iso-index`. None of these add always-on prompt or tool-schema tokens.
35
35
 
36
36
  `npm install` also materializes symlinks for every supported agent harness — OpenCode, Cursor, Claude Code, and Codex — so you can run `opencode`, `cursor`, `claude`, or `codex` in the same project and each picks up the shared MCP config and instructions.
37
37
 
@@ -78,7 +78,7 @@ JobForge turns opencode into a full job search command center. Instead of manual
78
78
  | **Durable Batch Orchestration** | `batch-runner.sh` uses `@razroo/iso-orchestrator` for resumable bundle execution, bounded fan-out, mutexed state writes, and workflow records in `.jobforge-runs/`. |
79
79
  | **Pipeline Integrity** | Automated merge, dedup, status normalization, health checks |
80
80
  | **Cost-Aware Agent Routing** | Three subagents (`@general-free`, `@general-paid`, `@glm-minimal`) with per-task tool surfaces. On OpenCode, JobForge pins all tiers to `opencode-go/deepseek-v4-flash` so application runs avoid overloaded free-model pools. See [Subagent Routing in AGENTS.md](AGENTS.md) for the task-to-agent mapping. |
81
- | **Trace + Telemetry + Guard + Contract + Ledger + Capabilities + Context** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, and `job-forge context:*` plans mode/reference context bundles without MCP/tool-schema overhead. |
81
+ | **Trace + Telemetry + Guard + Contract + Ledger + Capabilities + Context + Cache + Index** | `job-forge trace:*` exposes local OpenCode transcripts, `job-forge telemetry:*` summarizes runs, `job-forge guard:*` audits deterministic policy rules, `templates/contracts.json` enforces artifact shape with `iso-contract`, `job-forge ledger:*` queries append-only workflow state, `job-forge capabilities:*` checks role boundaries, `job-forge context:*` plans mode/reference context bundles, `job-forge cache:*` reuses fetched JD/artifact content, and `job-forge index:*` queries compact source pointers without MCP/tool-schema overhead. |
82
82
  | **Token Cost Visibility** | `job-forge tokens --days 1` for per-session breakdown; `job-forge session-report --since-minutes 60 --log` to flag sessions over budget and append history to `data/token-usage.tsv`. Auto-logged after every batch run. |
83
83
 
84
84
  ## Usage
@@ -146,6 +146,8 @@ my-search/
146
146
  ├── config/profile.yml # your identity, target roles (personal)
147
147
  ├── data/ # applications, pipeline, scan history (personal, gitignored)
148
148
  ├── .jobforge-ledger/ # append-only local workflow events (personal, gitignored)
149
+ ├── .jobforge-cache/ # content-addressed local JD/artifact cache (personal, gitignored)
150
+ ├── .jobforge-index.json # deterministic artifact lookup index (generated, gitignored)
149
151
  ├── reports/ # generated evaluation reports (personal, gitignored)
150
152
  ├── batch/{batch-input,batch-state}.tsv, tracker-additions/, logs/ # personal
151
153
  ├── .jobforge-runs/ # durable batch workflow records (generated)
@@ -162,7 +164,7 @@ my-search/
162
164
  ├── .opencode/skills/job-forge.md # → skill router
163
165
  ├── .opencode/agents/ # → @general-free, @general-paid, @glm-minimal
164
166
  ├── modes/ # → _shared.md + skill modes
165
- ├── templates/ # → states.yml, portals.example.yml, cv-template.html, capabilities.json, context.json
167
+ ├── templates/ # → states.yml, portals.example.yml, cv-template.html, capabilities.json, context.json, index.json
166
168
  ├── batch/batch-prompt.md # → batch worker prompt
167
169
  ├── batch/batch-runner.sh # → parallel orchestrator
168
170
 
@@ -197,6 +199,8 @@ JobForge/
197
199
  │ ├── ledger.mjs # iso-ledger-backed workflow-state CLI
198
200
  │ ├── capabilities.mjs # iso-capabilities-backed role policy CLI
199
201
  │ ├── context.mjs # iso-context-backed context bundle CLI
202
+ │ ├── cache.mjs # iso-cache-backed local artifact cache CLI
203
+ │ ├── index.mjs # iso-index-backed artifact lookup CLI
200
204
  │ ├── token-usage-report.mjs # opencode cost analyzer
201
205
  │ └── release/check-source.mjs # version gate for npm publish
202
206
  ├── tracker-lib.mjs / merge-tracker.mjs / dedup-tracker.mjs / verify-pipeline.mjs
@@ -124,6 +124,11 @@ const consumerPkg = {
124
124
  'ledger:verify': 'job-forge ledger:verify',
125
125
  'ledger:has': 'job-forge ledger:has',
126
126
  'ledger:query': 'job-forge ledger:query',
127
+ 'index:build': 'job-forge index:build',
128
+ 'index:status': 'job-forge index:status',
129
+ 'index:verify': 'job-forge index:verify',
130
+ 'index:has': 'job-forge index:has',
131
+ 'index:query': 'job-forge index:query',
127
132
  // One command to pull the latest harness and any locally-pinned MCP
128
133
  // packages. npm update is a no-op on packages not in package.json, so
129
134
  // listing @razroo/gmail-mcp + @geometra/mcp is safe for consumers that
@@ -224,6 +229,7 @@ Before doing any work, remember where things live in *this* project:
224
229
  | Inbox of pending URLs | \`data/pipeline.md\` | The queue for \`/job-forge pipeline\` |
225
230
  | Scanner dedup history | \`data/scan-history.tsv\` | Only touch in \`/job-forge scan\` |
226
231
  | Local workflow ledger | \`.jobforge-ledger/events.jsonl\` | Deterministic append-only state; use \`job-forge ledger:*\` |
232
+ | Local artifact index | \`.jobforge-index.json\` | Deterministic file/line lookup; use \`job-forge index:*\` |
227
233
  | Scanner config | \`portals.yml\` (project root) | Company configs |
228
234
  | Profile / identity | \`config/profile.yml\` | Candidate name, email, target roles |
229
235
  | CV | \`cv.md\` (project root) | Markdown, source of truth |
@@ -369,6 +375,7 @@ job-forge sync # re-run if symlinks drift
369
375
  job-forge merge # merge batch/tracker-additions/*.tsv into the tracker
370
376
  job-forge verify # verify pipeline integrity
371
377
  job-forge ledger:status # local deterministic workflow ledger status
378
+ job-forge index:status # local artifact index status
372
379
  job-forge pdf cv.md out.pdf
373
380
  job-forge tokens --days 1 # per-session opencode token usage
374
381
  \`\`\`
package/bin/job-forge.mjs CHANGED
@@ -23,6 +23,8 @@
23
23
  * ledger:* Query local deterministic workflow state via iso-ledger
24
24
  * capabilities:* Query role capability policy via iso-capabilities
25
25
  * context:* Query/render deterministic context bundles via iso-context
26
+ * cache:* Reuse local deterministic artifacts via iso-cache
27
+ * index:* Query local artifacts via iso-index
26
28
  * sync Re-run the harness symlink sync (bin/sync.mjs)
27
29
  * help, --help Show this message
28
30
  */
@@ -103,6 +105,28 @@ const contextAliases = {
103
105
  'context:path': 'path',
104
106
  };
105
107
 
108
+ const cacheAliases = {
109
+ 'cache:key': 'key',
110
+ 'cache:status': 'status',
111
+ 'cache:has': 'has',
112
+ 'cache:get': 'get',
113
+ 'cache:put': 'put',
114
+ 'cache:list': 'list',
115
+ 'cache:verify': 'verify',
116
+ 'cache:prune': 'prune',
117
+ 'cache:path': 'path',
118
+ };
119
+
120
+ const indexAliases = {
121
+ 'index:build': 'build',
122
+ 'index:status': 'status',
123
+ 'index:query': 'query',
124
+ 'index:has': 'has',
125
+ 'index:verify': 'verify',
126
+ 'index:explain': 'explain',
127
+ 'index:path': 'path',
128
+ };
129
+
106
130
  const [, , cmd, ...rest] = process.argv;
107
131
 
108
132
  function printHelp() {
@@ -142,6 +166,17 @@ Commands:
142
166
  context:plan Estimate files/tokens for one context bundle
143
167
  context:check Fail if a context bundle exceeds its budget
144
168
  context:render Render context bundle content as markdown/json
169
+ cache:status Show local artifact cache status
170
+ cache:key Print deterministic cache key for a job URL
171
+ cache:has Check whether a job URL or cache key is cached
172
+ cache:get Read cached JD/artifact content
173
+ cache:put Store JD/artifact content
174
+ cache:verify Validate local artifact cache integrity
175
+ index:status Show local artifact index status
176
+ index:build Rebuild .jobforge-index.json from templates/index.json
177
+ index:has Check indexed URL/company-role/report facts without loading source files
178
+ index:query Query indexed reports, tracker rows, TSVs, scan history, pipeline, and ledger
179
+ index:verify Validate local artifact index integrity
145
180
  sync Re-create harness symlinks in the current project
146
181
 
147
182
  Deterministic helpers (prefer these over LLM-derived values):
@@ -175,6 +210,11 @@ Pass --help after a command to see its own flags, e.g.:
175
210
  job-forge capabilities:check general-free --tool browser --mcp geometra --command "npx job-forge merge" --filesystem write
176
211
  job-forge context:plan apply
177
212
  job-forge context:check apply --budget 23000
213
+ job-forge cache:has --url https://example.test/jobs/123
214
+ job-forge cache:get --url https://example.test/jobs/123
215
+ job-forge cache:put --url https://example.test/jobs/123 --input @jds/example.md
216
+ job-forge index:has --key "company-role:acme:staff-engineer"
217
+ job-forge index:query "acme"
178
218
 
179
219
  Project directory resolves to $JOB_FORGE_PROJECT or cwd.`);
180
220
  }
@@ -274,6 +314,36 @@ if (cmd === 'context' || contextAliases[cmd]) {
274
314
  process.exit(result.status ?? 1);
275
315
  }
276
316
 
317
+ if (cmd === 'cache' || cacheAliases[cmd]) {
318
+ const cacheArgs = cmd === 'cache'
319
+ ? (rest.length === 0 ? ['help'] : rest)
320
+ : [cacheAliases[cmd], ...rest];
321
+
322
+ const scriptPath = join(PKG_ROOT, 'scripts/cache.mjs');
323
+ const result = spawnSync(process.execPath, [scriptPath, ...cacheArgs], {
324
+ stdio: 'inherit',
325
+ cwd: PROJECT_DIR,
326
+ env: process.env,
327
+ });
328
+
329
+ process.exit(result.status ?? 1);
330
+ }
331
+
332
+ if (cmd === 'index' || indexAliases[cmd]) {
333
+ const indexArgs = cmd === 'index'
334
+ ? (rest.length === 0 ? ['help'] : rest)
335
+ : [indexAliases[cmd], ...rest];
336
+
337
+ const scriptPath = join(PKG_ROOT, 'scripts/index.mjs');
338
+ const result = spawnSync(process.execPath, [scriptPath, ...indexArgs], {
339
+ stdio: 'inherit',
340
+ cwd: PROJECT_DIR,
341
+ env: process.env,
342
+ });
343
+
344
+ process.exit(result.status ?? 1);
345
+ }
346
+
277
347
  const rel = commands[cmd];
278
348
  if (!rel) {
279
349
  console.error(`Unknown command: ${cmd}\n`);
@@ -161,6 +161,7 @@ config/profile.yml → Candidate identity
161
161
  portals.yml → Scanner configuration
162
162
  data/pipeline.md → Pending URLs and `local:jds/...` inbox (see modes/pipeline.md)
163
163
  .jobforge-ledger/events.jsonl → Append-only workflow events for cheap local duplicate/status checks
164
+ .jobforge-index.json → Deterministic artifact lookup index built from templates/index.json
164
165
  jds/*.md → Saved job descriptions referenced from the pipeline (`local:jds/{file}`)
165
166
  templates/states.yml → Canonical status values
166
167
  templates/context.json → Deterministic mode/reference context bundle policy
@@ -176,12 +177,13 @@ Create `data/pipeline.md` when you start using the URL inbox (`/job-forge pipeli
176
177
  - PDFs: `cv-candidate-{company-slug}-{YYYY-MM-DD}.pdf`
177
178
  - Tracker TSVs: `batch/tracker-additions/{num}-{company-slug}.tsv` (one file per evaluation; merged files move under `batch/tracker-additions/merged/`; shape enforced by `templates/contracts.json`)
178
179
  - Ledger: `.jobforge-ledger/events.jsonl` (created by `job-forge ledger:rebuild`, `tracker-line --write`, or `merge`; gitignored personal state)
180
+ - Index: `.jobforge-index.json` (created on demand by `job-forge index:*`; gitignored local lookup state)
179
181
  - Capabilities: `templates/capabilities.json` (role boundary policy inspected with `job-forge capabilities:*`)
180
182
  - Context: `templates/context.json` (mode/reference file bundles inspected with `job-forge context:*`)
181
183
 
182
184
  ## Pipeline Integrity
183
185
 
184
- From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify-pipeline.mjs`. When a tracker file exists, it validates canonical statuses (using `templates/states.yml` when that file is present and parseable), validates every tracker row against `templates/contracts.json`, warns on probable duplicate company/role rows, checks that report column markdown links resolve to files in the repo, validates score column format (`X.X/5`, `N/A`, or `DUP`), rejects table rows with too few columns, flags markdown bold inside the score column, and warns if any `batch/tracker-additions/*.tsv` files are still waiting to be merged. If `.jobforge-ledger/events.jsonl` exists, verify also validates the append-only ledger. It also compares state ids from `templates/states.yml` to an internal fallback list and warns when the two sets drift. **Fresh clone:** the command exits successfully when neither `data/applications.md` nor root `applications.md` exists yet; pending-TSV and states-drift checks still run so contributors see unmerged batch output early. Optional setup validation after you add `cv.md` and `config/profile.yml`: `npm run sync-check` (`cv-sync-check.mjs`).
186
+ From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify-pipeline.mjs`. When a tracker file exists, it validates canonical statuses (using `templates/states.yml` when that file is present and parseable), validates every tracker row against `templates/contracts.json`, warns on probable duplicate company/role rows, checks that report column markdown links resolve to files in the repo, validates score column format (`X.X/5`, `N/A`, or `DUP`), rejects table rows with too few columns, flags markdown bold inside the score column, and warns if any `batch/tracker-additions/*.tsv` files are still waiting to be merged. If `.jobforge-ledger/events.jsonl` exists, verify also validates the append-only ledger. If `.jobforge-index.json` exists, verify validates the artifact index. It also compares state ids from `templates/states.yml` to an internal fallback list and warns when the two sets drift. **Fresh clone:** the command exits successfully when neither `data/applications.md` nor root `applications.md` exists yet; pending-TSV and states-drift checks still run so contributors see unmerged batch output early. Optional setup validation after you add `cv.md` and `config/profile.yml`: `npm run sync-check` (`cv-sync-check.mjs`).
185
187
 
186
188
  **`verify-pipeline.mjs` checks (same order as the script header):**
187
189
 
@@ -195,8 +197,9 @@ From the project root, `npx job-forge verify` (or `npm run verify`) runs `verify
195
197
  8. Score column has no markdown bold.
196
198
  9. Warn when state ids in `templates/states.yml` drift from the script’s built-in fallback list (or when the file exists but ids failed to parse).
197
199
  10. Validate `.jobforge-ledger/events.jsonl` when present.
200
+ 11. Validate `.jobforge-index.json` when present.
198
201
 
199
- When the tracker file is missing, checks 1-6 and 8 are skipped; checks 7, 9, and 10 still run when applicable.
202
+ When the tracker file is missing, checks 1-6 and 8 are skipped; checks 7, 9, 10, and 11 still run when applicable.
200
203
 
201
204
  ## Contributing touchpoints
202
205
 
@@ -219,6 +222,7 @@ Scripts maintain data consistency. In a consumer project they're invoked via the
219
222
  | `scripts/telemetry.mjs` | `npx job-forge telemetry:status` / `telemetry:show` | JobForge operational telemetry derived from OpenCode traces plus tracker TSV state |
220
223
  | `scripts/guard.mjs` | `npx job-forge guard:audit` / `guard:explain` | Deterministic `@razroo/iso-guard` policy audits over local OpenCode traces |
221
224
  | `scripts/ledger.mjs` | `npx job-forge ledger:status` / `ledger:has` / `ledger:rebuild` | Deterministic `@razroo/iso-ledger` state over tracker, TSV, and pipeline files |
225
+ | `scripts/index.mjs` | `npx job-forge index:status` / `index:has` / `index:query` | Deterministic `@razroo/iso-index` lookup over reports, tracker rows, TSVs, pipeline, scan history, and ledger events |
222
226
  | `scripts/context.mjs` | `npx job-forge context:list` / `context:plan` / `context:check` / `context:render` | Deterministic `@razroo/iso-context` mode/reference context bundle planning and rendering |
223
227
  | `tracker-lib.mjs` | _(library)_ | Shared helpers for reading/writing day-based tracker files — imported by merge/dedup/verify/normalize |
224
228
  | `bin/sync.mjs` | `npx job-forge sync` | Creates the harness symlinks in a consumer project (also runs as `postinstall`) |
@@ -150,6 +150,10 @@ Role capability boundaries live in `templates/capabilities.json` and are enforce
150
150
 
151
151
  Mode/reference context bundles live in `templates/context.json` and are planned locally by `@razroo/iso-context`. Use `job-forge context:plan <mode>` to see the files and estimated tokens, `job-forge context:check <mode>` to fail on budget drift, and `job-forge context:render <mode>` when you intentionally need a compact markdown or JSON context bundle. This is not an MCP and does not add tool-schema tokens; rendered context only consumes prompt tokens when a workflow deliberately asks for it.
152
152
 
153
+ ## JobForge artifact index
154
+
155
+ Artifact lookup policy lives in `templates/index.json` and is built locally by `@razroo/iso-index`. Use `job-forge index:has --key "company-role:acme:staff-engineer"` as a cheap duplicate/source prefilter, `job-forge index:query "acme"` to get compact source path/line pointers, and `job-forge index:verify` to validate `.jobforge-index.json`. Query, has, and verify rebuild the index on demand, so scaffolded projects need no setup. This is not an MCP and does not add tool-schema tokens.
156
+
153
157
  ## JobForge guard audits
154
158
 
155
159
  Guard audits run deterministic `@razroo/iso-guard` policies over the same local OpenCode traces. The default policy lives at `templates/guards/jobforge-baseline.yaml` and checks rules that are reliable from transcript data, including max two task dispatches per assistant message, no task-status polling via `task`, no raw proxy configuration in task prompts, and no child session task recursion.
package/docs/SETUP.md CHANGED
@@ -128,6 +128,8 @@ From your project root, these commands maintain the tracker and pipeline checks.
128
128
  | Inspect tracker row contract | `npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json` | _(none)_ |
129
129
  | Inspect role capabilities | `npx job-forge capabilities:explain general-free` | `npm run capabilities:explain -- general-free` |
130
130
  | Inspect context bundle budget | `npx job-forge context:plan apply` | `npm run context:plan -- apply` |
131
+ | Inspect local JD/artifact cache | `npx job-forge cache:status` | `npm run cache:status` |
132
+ | Inspect local artifact index | `npx job-forge index:status` | `npm run index:status` |
131
133
  | Map status column to canonical labels | `npx job-forge normalize` | `npm run normalize` |
132
134
  | Merge duplicate company/role rows | `npx job-forge dedup` | `npm run dedup` |
133
135
  | Generate ATS-optimized CV PDF | `npx job-forge pdf` | `npm run pdf` |
@@ -144,6 +146,8 @@ From your project root, these commands maintain the tracker and pipeline checks.
144
146
  | Show local workflow ledger status | `npx job-forge ledger:status` | `npm run ledger:status` |
145
147
  | Rebuild local workflow ledger from tracker/pipeline files | `npx job-forge ledger:rebuild` | `npm run ledger:rebuild` |
146
148
  | Check duplicate/status event without loading tracker files | `npx job-forge ledger:has --company "Acme" --role "Staff Engineer" --status Applied` | `npm run ledger:has -- --company ...` |
149
+ | Check/reuse cached JD content | `npx job-forge cache:has --url <url>` / `npx job-forge cache:get --url <url>` | `npm run cache:has -- --url ...` |
150
+ | Query local artifact pointers | `npx job-forge index:query "Acme"` / `npx job-forge index:has --key company-role:acme:staff-engineer` | `npm run index:query -- Acme` |
147
151
  | Re-create harness symlinks | `npx job-forge sync` | `npm run sync` |
148
152
  | Build optional dashboard TUI (Go on `PATH`) | `(cd node_modules/job-forge/dashboard && go build .)` | `npm run build:dashboard` (harness repo only) |
149
153
 
@@ -76,6 +76,11 @@ Local workflow ledger (terminal, outside opencode):
76
76
  npx job-forge ledger:status # .jobforge-ledger/events.jsonl summary
77
77
  npx job-forge ledger:has --company "Acme" --role "Staff Engineer" --status Applied
78
78
 
79
+ Local artifact index (terminal, outside opencode):
80
+ npx job-forge index:status # .jobforge-index.json summary
81
+ npx job-forge index:has --key "company-role:acme:staff-engineer"
82
+ npx job-forge index:query "acme"
83
+
79
84
  Artifact contracts (terminal, outside opencode):
80
85
  npx iso-contract explain jobforge.tracker-row --contracts templates/contracts.json
81
86
  npx job-forge tracker-line ... --write # renders + validates tracker TSV locally
@@ -161,6 +166,11 @@ Step 1 — Enumerate candidates
161
166
  - Build ordered list: candidates = [job_1, job_2, ..., job_N]
162
167
 
163
168
  Step 2 — Dedup against already-applied
169
+ - Run npx job-forge index:has --key "company-role:<company-slug>:<role-slug>"
170
+ as a fast local artifact prefilter when company+role is known. It rebuilds
171
+ .jobforge-index.json on demand from templates/index.json. A hit means the
172
+ role has already appeared in tracker files or tracker TSVs and can be
173
+ dropped before dispatch.
164
174
  - If .jobforge-ledger/events.jsonl exists, use npx job-forge ledger:has as a
165
175
  fast prefilter for obvious company+role Applied duplicates. A ledger match
166
176
  can be dropped before dispatch without loading tracker files into context.
@@ -7,7 +7,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
7
7
  - [H1] Max 2 parallel `task` dispatches per message. For N jobs, run `ceil(N/2)` sequential rounds of 2. A round is not complete until both subagents return a final outcome (`APPLIED`, `APPLY FAILED`, `SKIP`, `Discarded`, or a written TSV path). A `task` tool result that only gives a session id / title is a launch acknowledgement, not completion. Applies in all modes, for all user phrasings ("urgent", "apply to 10 jobs now").
8
8
  why: each subagent requires post-cleanup and racing more than 2 reliably loses at least one result. On 2026-04-25 the orchestrator launched round 2 while round 1 had only returned task ids, leaving four application subagents in flight and losing two provider recoveries
9
9
 
10
- - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
10
+ - [H2] Max 1 application per company+role. Before every `apply` dispatch, grep all four sources for the URL and for `company+role`: `data/pipeline.md`, all `data/applications/*.md` day files, `batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`. `npx job-forge index:has --key "company-role:..."` may be used first as a fast local artifact prefilter; it rebuilds `.jobforge-index.json` on demand from `templates/index.json`, and a company+role hit is enough to drop an obvious duplicate before dispatch. If `.jobforge-ledger/events.jsonl` exists, `npx job-forge ledger:has --company "..." --role "..." --status Applied` may also be used as a fast prefilter; a match is enough to drop that duplicate before dispatch. For candidates not rejected by the index or ledger, the four-source grep is still mandatory. If any source shows APPLIED / Applied, skip the dispatch and pick a replacement from the remaining candidate list. Do not count duplicates toward a requested "apply to N jobs" total, and do not delegate obvious duplicates just so a subagent can return SKIP.
11
11
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
12
12
 
13
13
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
@@ -19,13 +19,13 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
19
19
  - [H5] Re-dispatch the same company only AFTER the previous subagent returns. Never fire the same `task` twice while the first is still in flight.
20
20
  why: two in-flight subagents for the same URL race on Geometra sessions and on tracker TSV writes, corrupting state and sometimes double-submitting
21
21
 
22
- - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, or `iso-trace`) rather than spawning a "check task status" subagent.
22
+ - [H5b] Do not use `task` to poll task status. If OpenCode returns a task/session id without a final result, record the id, stop dispatching new rounds, and tell the user the round is still in flight. When the user asks to check later, inspect authoritative files (`batch/tracker-additions/*.tsv`, `batch/tracker-additions/merged/*.tsv`, day files, `.jobforge-ledger/events.jsonl`, `.jobforge-index.json`, or `iso-trace`) rather than spawning a "check task status" subagent.
23
23
  why: OpenCode status prompts can be delivered into the target subagent as a new user message; a 2026-04-25 trace caused a subagent to call `task` recursively instead of finishing the application
24
24
 
25
25
  - [H6] Application outcomes flow through `batch/tracker-additions/*.tsv`, not `data/pipeline.md`. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` then `npx job-forge verify` before ending the session.
26
26
  why: `pipeline.md` is the URL inbox (`[ ]` pending → `[x]` processed); `data/applications/YYYY-MM-DD.md` is the outcome log; the TSV pathway is the only safe bridge because `merge` handles column order and duplicate detection
27
27
 
28
- - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`.
28
+ - [H7] Load-bearing facts passed to downstream subagents must originate from a file, not from prior subagent prose. Authoritative sources: `data/pipeline.md`, `data/scan-history.tsv`, `batch/scan-output-*.md`, `reports/{num}-*.md` with `**URL:**` / `**Score:**` headers, `batch/tracker-additions/*.tsv`, cached JD content returned by `npx job-forge cache:get --url ...`, and source path/line pointers returned by `npx job-forge index:query ...`.
29
29
  why: 2026-04-18 scan subagent returned 30 fabricated Greenhouse IDs in prose (plausible-looking, non-existent); orchestrator dispatched 30 downstream subagents that all 404'd. Subagents can hallucinate IDs, scores, and confirmation text — round-trip through a file or don't trust the value
30
30
 
31
31
  - [H8] Never paste proxy values from `config/profile.yml` into `task` prompts, status text, or summaries. If a proxy is configured, tell the subagent exactly: "Proxy is configured; read `config/profile.yml` and pass its top-level `proxy:` object to every `geometra_connect` call." Do not transcribe `server`, `username`, `password`, or `bypass`, even if you just read them from disk.
@@ -66,12 +66,18 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
66
66
  - [D11] Treat `templates/context.json` as the source of truth for mode/reference context bundles. Use `npx job-forge context:plan <mode>` or `npx job-forge context:check <mode>` when changing or validating what a mode loads; do not paste the full context matrix into prompts.
67
67
  why: deterministic context bundles prevent reference-file drift and accidental token bloat without adding MCP/tool-schema tokens
68
68
 
69
+ - [D12] Use `job-forge cache:*` for deterministic local artifact reuse when available. For URL inputs, check `npx job-forge cache:has --url "..."` / `cache:get` before browser or network JD fetches; after a successful fetch, store the exact JD text with `npx job-forge cache:put --url "..." --ttl 14d --input @file` when it is already on disk.
70
+ why: `iso-cache` is not an MCP and adds no prompt/tool-schema tokens; it avoids repeated JD fetch/render passes and lets future sessions reuse stable content from `.jobforge-cache/`
71
+
72
+ - [D13] Use `job-forge index:*` for deterministic artifact lookup when available. `index:has` and `index:query` rebuild `.jobforge-index.json` from `templates/index.json` on demand, covering reports, tracker day files, tracker TSVs, pipeline URLs, scan history, and ledger events without loading those growing files into prompt context.
73
+ why: `iso-index` is not an MCP and adds no prompt/tool-schema tokens; it gives agents compact file/line pointers and duplicate prefilters before expensive reads or browser dispatches
74
+
69
75
  ## Procedure
70
76
 
71
77
  1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
72
78
  2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
73
- 3. Read the active mode file [D3]; use context bundle checks when changing context loads [D11]; decide inline vs delegated work [D1].
74
- 4. Prepare Geometra dispatches: cleanup [H3], ledger prefilter when present [D8], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
79
+ 3. Read the active mode file [D3]. Use context bundle checks when changing context loads [D11]. Check cached artifacts before URL/JD refetches [D12]. Use artifact index lookups before broad file reads when they can answer the question [D13]. Decide inline vs delegated work [D1].
80
+ 4. Prepare Geometra dispatches: cleanup [H3], index/ledger prefilter when useful [D8, D13], dedupe [H2], location filter [D5], routing [D2, D10], proxy prompt hygiene [H8].
75
81
  5. Dispatch at most 2 tasks per round [H1]; wait for final outcomes, not just task ids [H5b].
76
82
  6. Keep multi-job form-filling out of the orchestrator [H4].
77
83
  7. Cross-check subagent facts against authoritative files [H7].