job-forge 2.0.1 → 2.0.3
- package/.cursor/rules/main.mdc +20 -7
- package/.opencode/skills/job-forge.md +2 -2
- package/AGENTS.md +20 -7
- package/CLAUDE.md +20 -7
- package/batch/README.md +1 -1
- package/batch/batch-prompt.md +1 -1
- package/config/profile.example.yml +21 -0
- package/iso/commands/job-forge.md +2 -2
- package/iso/instructions.md +20 -7
- package/lib/canonical-states.mjs +116 -0
- package/merge-tracker.mjs +7 -22
- package/modes/_shared.md +2 -2
- package/modes/apply.md +71 -8
- package/modes/auto-pipeline.md +6 -3
- package/modes/pdf.md +1 -1
- package/modes/pipeline.md +7 -6
- package/modes/scan.md +59 -7
- package/normalize-statuses.mjs +5 -6
- package/package.json +2 -1
- package/templates/states.yml +6 -0
package/.cursor/rules/main.mdc
CHANGED
|
@@ -14,9 +14,19 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
|
|
|
14
14
|
3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
|
|
15
15
|
4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
|
|
16
16
|
5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
|
|
17
|
-
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `
|
|
17
|
+
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `npx job-forge merge` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` followed by `npx job-forge verify` before ending the session.
|
|
18
|
+
7. **Load-bearing facts passed to downstream subagents must come from a file, not from a prior subagent's prose.** A URL, score, email ID, confirmation page snippet, JD salary range, exact answer submitted to a form question, or any other specific value that a downstream subagent will act on MUST originate from one of:
|
|
19
|
+
- `data/pipeline.md` (URL inbox state)
|
|
20
|
+
- `data/scan-history.tsv` (scan provenance)
|
|
21
|
+
- `batch/scan-output-*.md` (scan-ranked candidates)
|
|
22
|
+
- A report file (`reports/{num}-*.md`) with authoritative headers (`**URL:**`, `**Score:**`, etc.)
|
|
23
|
+
- A TSV in `batch/tracker-additions/` (per-apply outcomes)
|
|
18
24
|
|
|
19
|
-
|
|
25
|
+
**Not trustworthy by default**: anything quoted from a subagent's return message, any ID or score the orchestrator "remembers" from prose, any page-content snippet reproduced from a subagent's narrative. Subagents can hallucinate plausible-looking IDs, scores, and confirmation text. Before passing any such fact to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
|
|
26
|
+
|
|
27
|
+
**Why**: on 2026-04-18, a scan subagent returned 30 fabricated Greenhouse IDs in prose (correct role titles, plausible-looking invented IDs that didn't exist in the API). The orchestrator dispatched 30 downstream subagents that all hit 404s. Verification rules downstream (Hard Limit #6, API-first verify) caught the symptom. This rule prevents the *shape* of the bug — hallucinations propagating through prose handoffs — across all quantitative / identifier / specific-fact claims, not just URLs.
|
|
28
|
+
|
|
29
|
+
Everything below is context and rationale. These seven numbers are the rules.
|
|
20
30
|
|
|
21
31
|
---
|
|
22
32
|
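As an illustration of Hard Limit #7 above, the sketch below reads a value from a report file's header instead of trusting a subagent's return message. The helper name `readReportHeader` and the assumed layout (`**Field:** value` on its own line) are hypothetical; the package may enforce the rule differently.

```javascript
// Hypothetical helper, not part of job-forge: pull an authoritative header
// value (e.g. URL or Score) out of report text that was read from disk.
function readReportHeader(reportText, field) {
  // Escape regex metacharacters in the field name before building the pattern.
  const escaped = field.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // Assumed layout: a line such as "**URL:** https://example.com/jobs/123".
  const pattern = new RegExp(`^\\*\\*${escaped}:\\*\\*[ \\t]*(.+)$`, 'm');
  const match = reportText.match(pattern);
  // null signals "not in the file", so the caller should not dispatch on it.
  return match ? match[1].trim() : null;
}

// Example report text (invented for illustration).
const sampleReport = [
  '# Report 042',
  '**Score:** 4.2/5',
  '**URL:** https://example.com/jobs/123',
].join('\n');

console.log(readReportHeader(sampleReport, 'URL'));
```

Returning `null` for a missing header gives the orchestrator an explicit signal to stop, rather than falling back to a value remembered from prose.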
|
|
@@ -36,6 +46,8 @@ Whenever the user says any variation of "apply to N jobs", "process the pipeline
|
|
|
36
46
|
|
|
37
47
|
**Exception:** evaluation-only or tracker-only work (no Geometra, no repeated tool calls) can proceed in a single session. The rule targets tool-heavy multi-step loops.
|
|
38
48
|
|
|
49
|
+
**Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md`** to exclude location-incompatible candidates. Catches the common case where an evaluated role has the right role-shape but a deal-breaking location that profile.yml already rules out.
|
|
50
|
+
|
|
39
51
|
---
|
|
40
52
|
|
|
41
53
|
## Subagent Routing — which agent for which task
|
|
@@ -462,7 +474,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
|
|
|
462
474
|
- JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
|
|
463
475
|
- Batch in `batch/` (gitignored except scripts and prompt)
|
|
464
476
|
- Report numbering: sequential 3-digit zero-padded, max existing + 1
|
|
465
|
-
- **RULE: After each batch of evaluations, run `
|
|
477
|
+
- **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
|
|
466
478
|
- **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
|
|
467
479
|
- **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
|
|
468
480
|
|
|
@@ -489,13 +501,13 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
489
501
|
|
|
490
502
|
### Pipeline Integrity
|
|
491
503
|
|
|
492
|
-
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `merge
|
|
504
|
+
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `npx job-forge merge` handles the merge.
|
|
493
505
|
2. **YES you can edit day files in `data/applications/` to UPDATE status/notes of existing entries.**
|
|
494
506
|
3. All reports MUST include `**URL:**` in the header (between Score and PDF).
|
|
495
507
|
4. All statuses MUST be canonical (see `templates/states.yml`).
|
|
496
|
-
5. Health check: `
|
|
497
|
-
6. Normalize statuses: `
|
|
498
|
-
7. Dedup: `
|
|
508
|
+
5. Health check: `npx job-forge verify`
|
|
509
|
+
6. Normalize statuses: `npx job-forge normalize`
|
|
510
|
+
7. Dedup: `npx job-forge dedup`
|
|
499
511
|
|
|
500
512
|
### Canonical States (applications day files)
|
|
501
513
|
|
|
@@ -511,6 +523,7 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
511
523
|
| `Offer` | Offer received |
|
|
512
524
|
| `Rejected` | Rejected by company |
|
|
513
525
|
| `Discarded` | Discarded by candidate or offer closed |
|
|
526
|
+
| `Failed` | Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry. |
|
|
514
527
|
| `SKIP` | Doesn't fit, don't apply |
|
|
515
528
|
|
|
516
529
|
**RULES:**
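The canonical-state table above, including the new `Failed` row, can be checked with a small lookup. This is a minimal sketch, assuming case-insensitive matching is acceptable; `toCanonicalState` is a hypothetical name, and the label list is copied from `DEFAULT_STATES` in `lib/canonical-states.mjs` as shown in this diff.

```javascript
// Labels copied from DEFAULT_STATES in this diff; keep in sync by hand.
const CANONICAL_STATES = [
  'Evaluated', 'Applied', 'Responded', 'Contacted', 'Interview',
  'Offer', 'Rejected', 'Discarded', 'Failed', 'SKIP',
];

// Hypothetical helper: map a raw status (e.g. "APPLIED" from a subagent)
// to its canonical label, or null when it is not a known state.
function toCanonicalState(raw) {
  const wanted = String(raw).trim().toLowerCase();
  return CANONICAL_STATES.find((s) => s.toLowerCase() === wanted) ?? null;
}

console.log(toCanonicalState('FAILED'));
```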
|
package/.opencode/skills/job-forge.md
CHANGED
|
@@ -163,8 +163,8 @@ Step 5 — Between rounds: clean sessions again
|
|
|
163
163
|
- geometra_disconnect({ closeBrowser: true })
|
|
164
164
|
|
|
165
165
|
Step 6 — After all rounds: reconcile outcomes (Hard Limit #6)
|
|
166
|
-
- bash:
|
|
167
|
-
- bash:
|
|
166
|
+
- bash: npx job-forge merge # consumes batch/tracker-additions/*.tsv into the day file
|
|
167
|
+
- bash: npx job-forge verify # validates URL/status consistency
|
|
168
168
|
- Review output; if verify-pipeline reports issues, fix them before ending.
|
|
169
169
|
|
|
170
170
|
Step 7 — Aggregate and report
|
package/AGENTS.md
CHANGED
|
@@ -9,9 +9,19 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
|
|
|
9
9
|
3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
|
|
10
10
|
4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
|
|
11
11
|
5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
|
|
12
|
-
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `
|
|
12
|
+
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `npx job-forge merge` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` followed by `npx job-forge verify` before ending the session.
|
|
13
|
+
7. **Load-bearing facts passed to downstream subagents must come from a file, not from a prior subagent's prose.** A URL, score, email ID, confirmation page snippet, JD salary range, exact answer submitted to a form question, or any other specific value that a downstream subagent will act on MUST originate from one of:
|
|
14
|
+
- `data/pipeline.md` (URL inbox state)
|
|
15
|
+
- `data/scan-history.tsv` (scan provenance)
|
|
16
|
+
- `batch/scan-output-*.md` (scan-ranked candidates)
|
|
17
|
+
- A report file (`reports/{num}-*.md`) with authoritative headers (`**URL:**`, `**Score:**`, etc.)
|
|
18
|
+
- A TSV in `batch/tracker-additions/` (per-apply outcomes)
|
|
13
19
|
|
|
14
|
-
|
|
20
|
+
**Not trustworthy by default**: anything quoted from a subagent's return message, any ID or score the orchestrator "remembers" from prose, any page-content snippet reproduced from a subagent's narrative. Subagents can hallucinate plausible-looking IDs, scores, and confirmation text. Before passing any such fact to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
|
|
21
|
+
|
|
22
|
+
**Why**: on 2026-04-18, a scan subagent returned 30 fabricated Greenhouse IDs in prose (correct role titles, plausible-looking invented IDs that didn't exist in the API). The orchestrator dispatched 30 downstream subagents that all hit 404s. Verification rules downstream (Hard Limit #6, API-first verify) caught the symptom. This rule prevents the *shape* of the bug — hallucinations propagating through prose handoffs — across all quantitative / identifier / specific-fact claims, not just URLs.
|
|
23
|
+
|
|
24
|
+
Everything below is context and rationale. These seven numbers are the rules.
|
|
15
25
|
|
|
16
26
|
---
|
|
17
27
|
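Hard Limit #6 above restricts `data/pipeline.md` to URL inbox state (`[ ]` pending, `[x]` processed). The sketch below shows one way to read that state. It assumes each entry is a Markdown task-list line (`- [ ] <url>`), which this diff does not confirm, and `summarizePipeline` is a hypothetical name.

```javascript
// Hypothetical helper: split pipeline.md text into pending and processed
// entries. Assumes Markdown task-list lines such as "- [ ] https://...".
function summarizePipeline(pipelineText) {
  const summary = { pending: [], processed: [] };
  for (const line of pipelineText.split(/\r?\n/)) {
    const match = line.match(/^\s*-\s*\[( |x|X)\]\s+(\S.*)$/);
    if (!match) continue; // headings, blank lines, and notes are ignored
    const bucket = match[1] === ' ' ? 'pending' : 'processed';
    summary[bucket].push(match[2].trim());
  }
  return summary;
}

// Example inbox text (invented for illustration).
const sampleInbox = [
  '## Inbox',
  '- [ ] https://a.example/jobs/1',
  '- [x] https://b.example/jobs/2',
].join('\n');

console.log(summarizePipeline(sampleInbox));
```

Nothing in this sketch records APPLIED or FAILED outcomes, which matches the rule that outcomes belong in the TSV pathway.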
|
|
@@ -31,6 +41,8 @@ Whenever the user says any variation of "apply to N jobs", "process the pipeline
|
|
|
31
41
|
|
|
32
42
|
**Exception:** evaluation-only or tracker-only work (no Geometra, no repeated tool calls) can proceed in a single session. The rule targets tool-heavy multi-step loops.
|
|
33
43
|
|
|
44
|
+
**Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md`** to exclude location-incompatible candidates. Catches the common case where an evaluated role has the right role-shape but a deal-breaking location that profile.yml already rules out.
|
|
45
|
+
|
|
34
46
|
---
|
|
35
47
|
|
|
36
48
|
## Subagent Routing — which agent for which task
|
|
@@ -457,7 +469,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
|
|
|
457
469
|
- JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
|
|
458
470
|
- Batch in `batch/` (gitignored except scripts and prompt)
|
|
459
471
|
- Report numbering: sequential 3-digit zero-padded, max existing + 1
|
|
460
|
-
- **RULE: After each batch of evaluations, run `
|
|
472
|
+
- **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
|
|
461
473
|
- **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
|
|
462
474
|
- **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
|
|
463
475
|
|
|
@@ -484,13 +496,13 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
484
496
|
|
|
485
497
|
### Pipeline Integrity
|
|
486
498
|
|
|
487
|
-
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `merge
|
|
499
|
+
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `npx job-forge merge` handles the merge.
|
|
488
500
|
2. **YES you can edit day files in `data/applications/` to UPDATE status/notes of existing entries.**
|
|
489
501
|
3. All reports MUST include `**URL:**` in the header (between Score and PDF).
|
|
490
502
|
4. All statuses MUST be canonical (see `templates/states.yml`).
|
|
491
|
-
5. Health check: `
|
|
492
|
-
6. Normalize statuses: `
|
|
493
|
-
7. Dedup: `
|
|
503
|
+
5. Health check: `npx job-forge verify`
|
|
504
|
+
6. Normalize statuses: `npx job-forge normalize`
|
|
505
|
+
7. Dedup: `npx job-forge dedup`
|
|
494
506
|
|
|
495
507
|
### Canonical States (applications day files)
|
|
496
508
|
|
|
@@ -506,6 +518,7 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
506
518
|
| `Offer` | Offer received |
|
|
507
519
|
| `Rejected` | Rejected by company |
|
|
508
520
|
| `Discarded` | Discarded by candidate or offer closed |
|
|
521
|
+
| `Failed` | Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry. |
|
|
509
522
|
| `SKIP` | Doesn't fit, don't apply |
|
|
510
523
|
|
|
511
524
|
**RULES:**
|
package/CLAUDE.md
CHANGED
|
@@ -9,9 +9,19 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
|
|
|
9
9
|
3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
|
|
10
10
|
4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
|
|
11
11
|
5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
|
|
12
|
-
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `
|
|
12
|
+
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `npx job-forge merge` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` followed by `npx job-forge verify` before ending the session.
|
|
13
|
+
7. **Load-bearing facts passed to downstream subagents must come from a file, not from a prior subagent's prose.** A URL, score, email ID, confirmation page snippet, JD salary range, exact answer submitted to a form question, or any other specific value that a downstream subagent will act on MUST originate from one of:
|
|
14
|
+
- `data/pipeline.md` (URL inbox state)
|
|
15
|
+
- `data/scan-history.tsv` (scan provenance)
|
|
16
|
+
- `batch/scan-output-*.md` (scan-ranked candidates)
|
|
17
|
+
- A report file (`reports/{num}-*.md`) with authoritative headers (`**URL:**`, `**Score:**`, etc.)
|
|
18
|
+
- A TSV in `batch/tracker-additions/` (per-apply outcomes)
|
|
13
19
|
|
|
14
|
-
|
|
20
|
+
**Not trustworthy by default**: anything quoted from a subagent's return message, any ID or score the orchestrator "remembers" from prose, any page-content snippet reproduced from a subagent's narrative. Subagents can hallucinate plausible-looking IDs, scores, and confirmation text. Before passing any such fact to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
|
|
21
|
+
|
|
22
|
+
**Why**: on 2026-04-18, a scan subagent returned 30 fabricated Greenhouse IDs in prose (correct role titles, plausible-looking invented IDs that didn't exist in the API). The orchestrator dispatched 30 downstream subagents that all hit 404s. Verification rules downstream (Hard Limit #6, API-first verify) caught the symptom. This rule prevents the *shape* of the bug — hallucinations propagating through prose handoffs — across all quantitative / identifier / specific-fact claims, not just URLs.
|
|
23
|
+
|
|
24
|
+
Everything below is context and rationale. These seven numbers are the rules.
|
|
15
25
|
|
|
16
26
|
---
|
|
17
27
|
|
|
@@ -31,6 +41,8 @@ Whenever the user says any variation of "apply to N jobs", "process the pipeline
|
|
|
31
41
|
|
|
32
42
|
**Exception:** evaluation-only or tracker-only work (no Geometra, no repeated tool calls) can proceed in a single session. The rule targets tool-heavy multi-step loops.
|
|
33
43
|
|
|
44
|
+
**Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md`** to exclude location-incompatible candidates. Catches the common case where an evaluated role has the right role-shape but a deal-breaking location that profile.yml already rules out.
|
|
45
|
+
|
|
34
46
|
---
|
|
35
47
|
|
|
36
48
|
## Subagent Routing — which agent for which task
|
|
@@ -457,7 +469,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
|
|
|
457
469
|
- JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
|
|
458
470
|
- Batch in `batch/` (gitignored except scripts and prompt)
|
|
459
471
|
- Report numbering: sequential 3-digit zero-padded, max existing + 1
|
|
460
|
-
- **RULE: After each batch of evaluations, run `
|
|
472
|
+
- **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
|
|
461
473
|
- **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
|
|
462
474
|
- **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
|
|
463
475
|
|
|
@@ -484,13 +496,13 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
484
496
|
|
|
485
497
|
### Pipeline Integrity
|
|
486
498
|
|
|
487
|
-
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `merge
|
|
499
|
+
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `npx job-forge merge` handles the merge.
|
|
488
500
|
2. **YES you can edit day files in `data/applications/` to UPDATE status/notes of existing entries.**
|
|
489
501
|
3. All reports MUST include `**URL:**` in the header (between Score and PDF).
|
|
490
502
|
4. All statuses MUST be canonical (see `templates/states.yml`).
|
|
491
|
-
5. Health check: `
|
|
492
|
-
6. Normalize statuses: `
|
|
493
|
-
7. Dedup: `
|
|
503
|
+
5. Health check: `npx job-forge verify`
|
|
504
|
+
6. Normalize statuses: `npx job-forge normalize`
|
|
505
|
+
7. Dedup: `npx job-forge dedup`
|
|
494
506
|
|
|
495
507
|
### Canonical States (applications day files)
|
|
496
508
|
|
|
@@ -506,6 +518,7 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
506
518
|
| `Offer` | Offer received |
|
|
507
519
|
| `Rejected` | Rejected by company |
|
|
508
520
|
| `Discarded` | Discarded by candidate or offer closed |
|
|
521
|
+
| `Failed` | Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry. |
|
|
509
522
|
| `SKIP` | Doesn't fit, don't apply |
|
|
510
523
|
|
|
511
524
|
**RULES:**
|
package/batch/README.md
CHANGED
|
@@ -51,7 +51,7 @@ npm run merge
|
|
|
51
51
|
npm run verify # optional: pipeline health after merge (report links, statuses, pending TSVs)
|
|
52
52
|
```
|
|
53
53
|
|
|
54
|
-
(`
|
|
54
|
+
(`npx job-forge merge` — same as `npm run merge`; see [CONTRIBUTING.md](../CONTRIBUTING.md#development).)
|
|
55
55
|
|
|
56
56
|
After a successful merge, each processed file is moved to **`batch/tracker-additions/merged/`** (created on first merge when the directory does not yet exist). `npm run verify` only looks for `*.tsv` files in the **top level** of `batch/tracker-additions/`, so rows already merged and archived under `merged/` do not trigger the “pending TSVs” warning.
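The top-level-only rule described above can be sketched as a filter over paths relative to `batch/tracker-additions/`. This is an illustration under that assumption, not the package's verify implementation; `pendingTsvs` is a hypothetical name.

```javascript
// Hypothetical helper: given paths relative to batch/tracker-additions/,
// keep only top-level *.tsv files. Anything under merged/ contains a path
// separator and is treated as already archived.
function pendingTsvs(relativePaths) {
  return relativePaths.filter((p) => p.endsWith('.tsv') && !p.includes('/'));
}

// Example listing (invented for illustration).
const listing = ['012-acme.tsv', 'merged/011-globex.tsv', 'notes.md'];
console.log(pendingTsvs(listing));
```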
|
|
57
57
|
|
package/batch/batch-prompt.md
CHANGED
|
@@ -243,7 +243,7 @@ Where `{company-slug}` is the company name in lowercase, no spaces, with hyphens
|
|
|
243
243
|
12. Write HTML to `/tmp/cv-candidate-{company-slug}.html`
|
|
244
244
|
13. Run:
|
|
245
245
|
```bash
|
|
246
|
-
|
|
246
|
+
npx job-forge pdf \
|
|
247
247
|
/tmp/cv-candidate-{company-slug}.html \
|
|
248
248
|
output/cv-candidate-{company-slug}-{{DATE}}.pdf \
|
|
249
249
|
--format={letter|a4}
|
package/config/profile.example.yml
CHANGED
|
@@ -65,3 +65,24 @@ location:
|
|
|
65
65
|
visa_status: "No sponsorship needed"
|
|
66
66
|
# For remote roles outside your country:
|
|
67
67
|
# onsite_availability: "1 week/month in any city"
|
|
68
|
+
|
|
69
|
+
# Structured location constraints — consumed by the Apply Preflight location
|
|
70
|
+
# filter in modes/apply.md. The prose fields above (compensation.location_flexibility,
|
|
71
|
+
# location.*) remain for human readability and LLM narrative context; the fields
|
|
72
|
+
# below govern automated, deterministic compatibility checks before dispatching
|
|
73
|
+
# an apply subagent.
|
|
74
|
+
#
|
|
75
|
+
# City names are lowercase, hyphenated (e.g. "san-francisco", "new-york").
|
|
76
|
+
# Country codes are ISO-3166 alpha-2 uppercase (e.g. "US", "CA", "GB").
|
|
77
|
+
location_constraints:
|
|
78
|
+
remote_us: true # open to US-remote roles
|
|
79
|
+
remote_global: false # open to non-US remote (visa / timezone permitting)
|
|
80
|
+
hybrid_cities: # cities where hybrid N-days-in-office is acceptable
|
|
81
|
+
- san-francisco
|
|
82
|
+
blocked_cities: # cities that are a hard No for relocation (even if hybrid)
|
|
83
|
+
- new-york
|
|
84
|
+
- london
|
|
85
|
+
authorized_countries: # countries where the candidate has right-to-work
|
|
86
|
+
- US
|
|
87
|
+
requires_visa_sponsorship: false # true → roles in non-authorized countries are blocked unless
|
|
88
|
+
# the JD explicitly mentions visa sponsorship
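To show how the `location_constraints` fields above might drive a deterministic check, here is a minimal sketch. The real Apply Preflight rules live in `modes/apply.md`; the `job` shape (`remote`, `city`, `country`, `mentionsSponsorship`), the verdict strings, and the order of the checks are all assumptions made for illustration.

```javascript
// Constraint values mirror the example profile in this diff.
const exampleConstraints = {
  remote_us: true,
  remote_global: false,
  hybrid_cities: ['san-francisco'],
  blocked_cities: ['new-york', 'london'],
  authorized_countries: ['US'],
  requires_visa_sponsorship: false,
};

// Hypothetical check: returns 'ok', 'blocked', or 'review'.
// job.remote is assumed to be 'us', 'global', or null for office-based roles.
function checkLocation(job, c) {
  if (job.remote === 'us') return c.remote_us ? 'ok' : 'blocked';
  if (job.remote === 'global') return c.remote_global ? 'ok' : 'blocked';
  if (c.blocked_cities.includes(job.city)) return 'blocked';
  const authorized = c.authorized_countries.includes(job.country);
  // Per the YAML comment: sponsorship-dependent candidates are blocked from
  // non-authorized countries unless the JD mentions visa sponsorship.
  if (c.requires_visa_sponsorship && !authorized && !job.mentionsSponsorship) {
    return 'blocked';
  }
  // Cities that are neither listed as hybrid nor blocked need a human decision.
  return c.hybrid_cities.includes(job.city) ? 'ok' : 'review';
}

console.log(checkLocation({ remote: null, city: 'new-york', country: 'US' }, exampleConstraints));
```

The third verdict, `review`, is a design choice of this sketch: the profile does not say what to do with unlisted cities, so the sketch declines to guess.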
|
package/iso/commands/job-forge.md
CHANGED
|
@@ -166,8 +166,8 @@ Step 5 — Between rounds: clean sessions again
|
|
|
166
166
|
- geometra_disconnect({ closeBrowser: true })
|
|
167
167
|
|
|
168
168
|
Step 6 — After all rounds: reconcile outcomes (Hard Limit #6)
|
|
169
|
-
- bash:
|
|
170
|
-
- bash:
|
|
169
|
+
- bash: npx job-forge merge # consumes batch/tracker-additions/*.tsv into the day file
|
|
170
|
+
- bash: npx job-forge verify # validates URL/status consistency
|
|
171
171
|
- Review output; if verify-pipeline reports issues, fix them before ending.
|
|
172
172
|
|
|
173
173
|
Step 7 — Aggregate and report
|
package/iso/instructions.md
CHANGED
|
@@ -9,9 +9,19 @@ The Hard Limits below are non-negotiable numeric rules. If you catch yourself ab
|
|
|
9
9
|
3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
|
|
10
10
|
4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
|
|
11
11
|
5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
|
|
12
|
-
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `
|
|
12
|
+
6. **Application outcomes flow through TSVs, not `data/pipeline.md`.** When a subagent returns APPLIED / FAILED / SKIP, the outcome goes to `batch/tracker-additions/{num}-{slug}.tsv`. `npx job-forge merge` then consumes the TSVs into the correct `data/applications/YYYY-MM-DD.md` day file. `data/pipeline.md` only tracks URL inbox state (`[ ]` pending → `[x]` processed). **NEVER append APPLIED / FAILED status lines to `pipeline.md`** — that's the day file's job, via the TSV pathway. After any multi-apply run, the orchestrator MUST run `npx job-forge merge` followed by `npx job-forge verify` before ending the session.
|
|
13
|
+
7. **Load-bearing facts passed to downstream subagents must come from a file, not from a prior subagent's prose.** A URL, score, email ID, confirmation page snippet, JD salary range, exact answer submitted to a form question, or any other specific value that a downstream subagent will act on MUST originate from one of:
|
|
14
|
+
- `data/pipeline.md` (URL inbox state)
|
|
15
|
+
- `data/scan-history.tsv` (scan provenance)
|
|
16
|
+
- `batch/scan-output-*.md` (scan-ranked candidates)
|
|
17
|
+
- A report file (`reports/{num}-*.md`) with authoritative headers (`**URL:**`, `**Score:**`, etc.)
|
|
18
|
+
- A TSV in `batch/tracker-additions/` (per-apply outcomes)
|
|
13
19
|
|
|
14
|
-
|
|
20
|
+
**Not trustworthy by default**: anything quoted from a subagent's return message, any ID or score the orchestrator "remembers" from prose, any page-content snippet reproduced from a subagent's narrative. Subagents can hallucinate plausible-looking IDs, scores, and confirmation text. Before passing any such fact to another subagent, cross-check it exists in one of the authoritative files above, OR instruct the dispatching subagent to write its output to a structured file and re-read from that file.
|
|
21
|
+
|
|
22
|
+
**Why**: on 2026-04-18, a scan subagent returned 30 fabricated Greenhouse IDs in prose (correct role titles, plausible-looking invented IDs that didn't exist in the API). The orchestrator dispatched 30 downstream subagents that all hit 404s. Verification rules downstream (Hard Limit #6, API-first verify) caught the symptom. This rule prevents the *shape* of the bug — hallucinations propagating through prose handoffs — across all quantitative / identifier / specific-fact claims, not just URLs.
|
|
23
|
+
|
|
24
|
+
Everything below is context and rationale. These seven numbers are the rules.
|
|
15
25
|
|
|
16
26
|
---
|
|
17
27
|
|
|
@@ -31,6 +41,8 @@ Whenever the user says any variation of "apply to N jobs", "process the pipeline
|
|
|
31
41
|
|
|
32
42
|
**Exception:** evaluation-only or tracker-only work (no Geometra, no repeated tool calls) can proceed in a single session. The rule targets tool-heavy multi-step loops.
|
|
33
43
|
|
|
44
|
+
**Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md`** to exclude location-incompatible candidates. Catches the common case where an evaluated role has the right role-shape but a deal-breaking location that profile.yml already rules out.
|
|
45
|
+
|
|
34
46
|
---
|
|
35
47
|
|
|
36
48
|
## Subagent Routing — which agent for which task
|
|
@@ -457,7 +469,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
|
|
|
457
469
|
- JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
|
|
458
470
|
- Batch in `batch/` (gitignored except scripts and prompt)
|
|
459
471
|
- Report numbering: sequential 3-digit zero-padded, max existing + 1
|
|
460
|
-
- **RULE: After each batch of evaluations, run `
|
|
472
|
+
- **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
|
|
461
473
|
- **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
|
|
462
474
|
- **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
|
|
463
475
|
|
|
@@ -484,13 +496,13 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
484
496
|
|
|
485
497
|
### Pipeline Integrity
|
|
486
498
|
|
|
487
|
-
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `merge
|
|
499
|
+
1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `npx job-forge merge` handles the merge.
|
|
488
500
|
2. **YES you can edit day files in `data/applications/` to UPDATE status/notes of existing entries.**
|
|
489
501
|
3. All reports MUST include `**URL:**` in the header (between Score and PDF).
|
|
490
502
|
4. All statuses MUST be canonical (see `templates/states.yml`).
|
|
491
|
-
5. Health check: `
|
|
492
|
-
6. Normalize statuses: `
|
|
493
|
-
7. Dedup: `
|
|
503
|
+
5. Health check: `npx job-forge verify`
|
|
504
|
+
6. Normalize statuses: `npx job-forge normalize`
|
|
505
|
+
7. Dedup: `npx job-forge dedup`
|
|
494
506
|
|
|
495
507
|
### Canonical States (applications day files)
|
|
496
508
|
|
|
@@ -506,6 +518,7 @@ Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slu
|
|
|
506
518
|
| `Offer` | Offer received |
|
|
507
519
|
| `Rejected` | Rejected by company |
|
|
508
520
|
| `Discarded` | Discarded by candidate or offer closed |
|
|
521
|
+
| `Failed` | Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry. |
|
|
509
522
|
| `SKIP` | Doesn't fit, don't apply |
|
|
510
523
|
|
|
511
524
|
**RULES:**
|
|
@@ -0,0 +1,116 @@
|
|
|
1
|
+
/**
|
|
2
|
+
* canonical-states.mjs — single source of truth for JobForge canonical states.
|
|
3
|
+
*
|
|
4
|
+
* `templates/states.yml` is the authoritative list. This module reads it
|
|
5
|
+
* (when available) and provides a hardcoded fallback that MUST stay in sync
|
|
6
|
+
* with the YAML for the belt-and-suspenders case where the file is missing.
|
|
7
|
+
*
|
|
8
|
+
* Consumers:
|
|
9
|
+
* - merge-tracker.mjs — validation + TSV column-swap heuristic
|
|
10
|
+
* - normalize-statuses.mjs — canonical list for direct matching
|
|
11
|
+
*
|
|
12
|
+
* The dashboard (Go) currently duplicates this list in
|
|
13
|
+
* dashboard/internal/ui/screens/pipeline.go (statusOptions, statusGroupOrder, statusLabel)
|
|
14
|
+
* dashboard/internal/data/career.go (NormalizeStatus, StatusPriority)
|
|
15
|
+
* Full codegen from YAML on the Go side is a follow-up; for now those
|
|
16
|
+
* copies carry KEEP IN SYNC comments.
|
|
17
|
+
*/
|
|
18
|
+
|
|
19
|
+
import { readFileSync, existsSync } from 'fs';
|
|
20
|
+
import { join } from 'path';
|
|
21
|
+
|
|
22
|
+
/**
|
|
23
|
+
* Fallback canonical labels, in display order matching templates/states.yml.
|
|
24
|
+
* Used when the YAML file can't be read. Keep in sync with the YAML.
|
|
25
|
+
*/
|
|
26
|
+
export const DEFAULT_STATES = [
|
|
27
|
+
'Evaluated',
|
|
28
|
+
'Applied',
|
|
29
|
+
'Responded',
|
|
30
|
+
'Contacted',
|
|
31
|
+
'Interview',
|
|
32
|
+
'Offer',
|
|
33
|
+
'Rejected',
|
|
34
|
+
'Discarded',
|
|
35
|
+
'Failed',
|
|
36
|
+
'SKIP',
|
|
37
|
+
];
|
|
38
|
+
|
|
39
|
+
/**
|
|
40
|
+
* Extra tokens the column-swap heuristic recognises as "this column looks
|
|
41
|
+
* like a status". Canonical labels plus historical aliases the tracker has
|
|
42
|
+
* been known to emit (duplicate/repost/hold). Kept here so that both
|
|
43
|
+
* merge-tracker.mjs and any future consumer see the same alias set.
|
|
44
|
+
*/
|
|
45
|
+
const STATUS_DETECT_EXTRAS = ['duplicate', 'repost', 'hold'];
|
|
46
|
+
|
|
47
|
+
/**
|
|
48
|
+
* Parse `templates/states.yml` and return the ordered list of canonical
|
|
49
|
+
* labels. Returns null when the file is missing or contains no labels,
|
|
50
|
+
* so callers can fall back to DEFAULT_STATES.
|
|
51
|
+
*
|
|
52
|
+
* The parser intentionally uses a line-regex rather than pulling in a
|
|
53
|
+
* YAML dependency — job-forge has no runtime YAML parser and we don't
|
|
54
|
+
* want to add one just for this.
|
|
55
|
+
*
|
|
56
|
+
* @param {string} repoRoot - repo root where `templates/states.yml` lives.
|
|
57
|
+
* Also checks `states.yml` at the root as a legacy fallback.
|
|
58
|
+
* @returns {string[] | null}
|
|
59
|
+
*/
|
|
60
|
+
export function loadCanonicalStates(repoRoot) {
|
|
61
|
+
const candidates = [
|
|
62
|
+
join(repoRoot, 'templates/states.yml'),
|
|
63
|
+
join(repoRoot, 'states.yml'),
|
|
64
|
+
];
|
|
65
|
+
|
|
66
|
+
for (const filePath of candidates) {
|
|
67
|
+
if (!existsSync(filePath)) continue;
|
|
68
|
+
let text;
|
|
69
|
+
try {
|
|
70
|
+
text = readFileSync(filePath, 'utf-8');
|
|
71
|
+
} catch {
|
|
72
|
+
continue;
|
|
73
|
+
}
|
|
74
|
+
const labels = [];
|
|
75
|
+
for (const line of text.split('\n')) {
|
|
76
|
+
const m = line.match(/^\s+label:\s*(.+)$/);
|
|
77
|
+
if (!m) continue;
|
|
78
|
+
const v = m[1].trim().replace(/^['"]|['"]$/g, '');
|
|
79
|
+
if (v) labels.push(v);
|
|
80
|
+
}
|
|
81
|
+
if (labels.length > 0) return labels;
|
|
82
|
+
}
|
|
83
|
+
return null;
|
|
84
|
+
}
|
|
85
|
+
|
|
86
|
+
/**
|
|
87
|
+
* Build the case-insensitive "does this column look like a status?" regex
|
|
88
|
+
* used by merge-tracker.mjs to detect swapped status/score columns in
|
|
89
|
+
* legacy TSVs.
|
|
90
|
+
*
|
|
91
|
+
* Matches at the start of the column text, case-insensitive. Includes the
|
|
92
|
+
* canonical labels plus alias tokens (duplicate/repost/hold) that have
|
|
93
|
+
* historically appeared in the status column.
|
|
94
|
+
*
|
|
95
|
+
* @param {string[]} states - canonical labels (typically the output of
|
|
96
|
+
* loadCanonicalStates, or DEFAULT_STATES).
|
|
97
|
+
* @returns {RegExp}
|
|
98
|
+
*/
|
|
99
|
+
export function buildStatusDetectionRegex(states) {
|
|
100
|
+
const tokens = [
|
|
101
|
+
...states.map((s) => s.toLowerCase()),
|
|
102
|
+
...STATUS_DETECT_EXTRAS,
|
|
103
|
+
];
|
|
104
|
+
// Dedupe while preserving order.
|
|
105
|
+
const seen = new Set();
|
|
106
|
+
const unique = [];
|
|
107
|
+
for (const t of tokens) {
|
|
108
|
+
if (!seen.has(t)) {
|
|
109
|
+
seen.add(t);
|
|
110
|
+
unique.push(t);
|
|
111
|
+
}
|
|
112
|
+
}
|
|
113
|
+
// Escape regex-special chars just in case a label ever contains one.
|
|
114
|
+
const escaped = unique.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
|
|
115
|
+
return new RegExp(`^(${escaped.join('|')})`, 'i');
|
|
116
|
+
}
|
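To see how the two exported helpers behave together, here is a minimal self-contained sketch. It restates the line-regex and the prefix-match regex from the module above instead of importing them. The sample YAML layout (a `states:` list with `id` and `label` keys) and the names `parseLabels` and `detectionRegex` are assumptions for illustration, since `templates/states.yml` is not fully shown in this diff.

```javascript
// Hypothetical sample; the real templates/states.yml may be laid out differently.
const sampleYaml = [
  'states:',
  '  - id: applied',
  '    label: Applied',
  '  - id: failed',
  "    label: 'Failed'",
  '  - id: skip',
  '    label: "SKIP"',
].join('\n');

// Same line-regex approach as loadCanonicalStates: only indented `label:` lines count.
function parseLabels(text) {
  const labels = [];
  for (const line of text.split('\n')) {
    const m = line.match(/^\s+label:\s*(.+)$/);
    if (!m) continue;
    const v = m[1].trim().replace(/^['"]|['"]$/g, '');
    if (v) labels.push(v);
  }
  return labels;
}

// Same shape as buildStatusDetectionRegex: anchored at the start, case-insensitive.
function detectionRegex(states, extras = ['duplicate', 'repost', 'hold']) {
  // A Set keeps insertion order, so this dedupes while preserving order.
  const tokens = [...new Set([...states.map((s) => s.toLowerCase()), ...extras])];
  const escaped = tokens.map((t) => t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
  return new RegExp(`^(${escaped.join('|')})`, 'i');
}

const labels = parseLabels(sampleYaml);
const re = detectionRegex(labels);

console.log(labels);                          // labels parsed from the sample
console.log(re.test('failed: captcha wall')); // status-like column
console.log(re.test('4.2/5'));                // score-like column
console.log(re.test('Holding company'));      // prefix match, no word boundary
```

Because the regex only anchors at the start and has no word boundary, any text that merely begins with a token (such as "Holding company") also counts as status-like. That is consistent with the module's description of it as a heuristic.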
package/merge-tracker.mjs
CHANGED
|
@@ -26,6 +26,9 @@ import {
|
|
|
26
26
|
usesDayFiles, ensureDayDir, getHeader, formatAppLine, parseAppLine,
|
|
27
27
|
readAllEntries, writeToDayFiles, listDayFiles, dayFilePath,
|
|
28
28
|
} from './tracker-lib.mjs';
|
|
29
|
+
import {
|
|
30
|
+
DEFAULT_STATES, loadCanonicalStates, buildStatusDetectionRegex,
|
|
31
|
+
} from './lib/canonical-states.mjs';
|
|
29
32
|
|
|
30
33
|
const ADDITIONS_DIR = join(PROJECT_DIR, 'batch/tracker-additions');
|
|
31
34
|
const MERGED_DIR = join(ADDITIONS_DIR, 'merged');
|
|
@@ -54,26 +57,8 @@ Run from the repository root.`);
|
|
|
54
57
|
process.exit(0);
|
|
55
58
|
}
|
|
56
59
|
|
|
57
|
-
const
|
|
58
|
-
|
|
59
|
-
: join(PROJECT_DIR, 'states.yml');
|
|
60
|
-
|
|
61
|
-
function loadCanonicalLabelsFromStatesYaml(filePath) {
|
|
62
|
-
if (!existsSync(filePath)) return null;
|
|
63
|
-
const text = readFileSync(filePath, 'utf-8');
|
|
64
|
-
const labels = [];
|
|
65
|
-
for (const line of text.split('\n')) {
|
|
66
|
-
const m = line.match(/^\s+label:\s*(.+)$/);
|
|
67
|
-
if (!m) continue;
|
|
68
|
-
let v = m[1].trim().replace(/^['"]|['"]$/g, '');
|
|
69
|
-
if (v) labels.push(v);
|
|
70
|
-
}
|
|
71
|
-
return labels.length > 0 ? labels : null;
|
|
72
|
-
}
|
|
73
|
-
|
|
74
|
-
const CANONICAL_STATES = loadCanonicalLabelsFromStatesYaml(STATES_FILE) || [
|
|
75
|
-
'Evaluated', 'Applied', 'Contacted', 'Responded', 'Interview', 'Offer', 'Rejected', 'Discarded', 'SKIP',
|
|
76
|
-
];
|
|
60
|
+
const CANONICAL_STATES = loadCanonicalStates(PROJECT_DIR) || DEFAULT_STATES;
|
|
61
|
+
const STATUS_DETECT_RE = buildStatusDetectionRegex(CANONICAL_STATES);
|
|
77
62
|
|
|
78
63
|
function validateStatus(status) {
|
|
79
64
|
const clean = status.replace(/\*\*/g, '').replace(/\s+\d{4}-\d{2}-\d{2}.*$/, '').trim();
|
|
@@ -156,8 +141,8 @@ function parseTsvContent(content, filename) {
|
|
|
156
141
|
const col5 = parts[5].trim();
|
|
157
142
|
const col4LooksLikeScore = /^\d+\.?\d*\/5$/.test(col4) || col4 === 'N/A' || col4 === 'DUP';
|
|
158
143
|
const col5LooksLikeScore = /^\d+\.?\d*\/5$/.test(col5) || col5 === 'N/A' || col5 === 'DUP';
|
|
159
|
-
const col4LooksLikeStatus =
|
|
160
|
-
const col5LooksLikeStatus =
|
|
144
|
+
const col4LooksLikeStatus = STATUS_DETECT_RE.test(col4);
|
|
145
|
+
const col5LooksLikeStatus = STATUS_DETECT_RE.test(col5);
|
|
161
146
|
|
|
162
147
|
let statusCol, scoreCol;
|
|
163
148
|
if (col4LooksLikeStatus && !col4LooksLikeScore) {
|
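The swap heuristic in this hunk tests columns 4 and 5 independently before deciding which holds the status and which the score. As a rough illustration, the sketch below combines the score test shown in the diff with a stand-in status regex. `classifyColumn` and the hard-coded token list are hypothetical. The branch body after `if (col4LooksLikeStatus && !col4LooksLikeScore)` is not visible in the diff, so it is deliberately not reproduced.

```javascript
// Stand-in for STATUS_DETECT_RE; the real regex is built from templates/states.yml.
const STATUS_RE = /^(evaluated|applied|responded|contacted|interview|offer|rejected|discarded|failed|skip|duplicate|repost|hold)/i;

// Score test as shown in the diff: "4/5", "4.5/5", "N/A", or "DUP".
const looksLikeScore = (col) =>
  /^\d+\.?\d*\/5$/.test(col) || col === 'N/A' || col === 'DUP';

// Hypothetical helper: label a single TSV column.
function classifyColumn(raw) {
  const col = raw.trim();
  if (looksLikeScore(col)) return 'score';
  if (STATUS_RE.test(col)) return 'status';
  return 'unknown';
}

console.log(classifyColumn('4.5/5'));
console.log(classifyColumn(' Applied '));
console.log(classifyColumn('DUP'));
console.log(classifyColumn('Acme Corp'));
```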
package/modes/_shared.md
CHANGED
|
@@ -247,7 +247,7 @@ If the candidate has a live demo/dashboard (check profile.yml), offer access in
|
|
|
247
247
|
|
|
248
248
|
0. **Cover letter:** If the form has an option to attach or write a cover letter, ALWAYS include one. Generate PDF with the same visual design as the CV. Content: JD quotes mapped to proof points, links to relevant case studies. 1 page max.
|
|
249
249
|
1. Read cv.md and article-digest.md (if exists) before evaluating any offer
|
|
250
|
-
1b. **First evaluation of each session:** Run `
|
|
250
|
+
1b. **First evaluation of each session:** Run `npx job-forge sync-check` with Bash. If it reports warnings, notify the candidate before continuing
|
|
251
251
|
2. Detect the role archetype and adapt framing
|
|
252
252
|
3. Cite exact lines from CV when matching
|
|
253
253
|
4. Use WebSearch for comp and company data
|
|
@@ -269,4 +269,4 @@ If the candidate has a live demo/dashboard (check profile.yml), offer access in
|
|
|
269
269
|
| Read | cv.md, article-digest.md, cv-template.html |
|
|
270
270
|
| Write | Temporary HTML for PDF, day files in `data/applications/YYYY-MM-DD.md`, reports .md |
|
|
271
271
|
| Edit | Update tracker |
|
|
272
|
-
| Bash | `
|
|
272
|
+
| Bash | `npx job-forge pdf` |
|
package/modes/apply.md
CHANGED
|
@@ -10,6 +10,54 @@ Interactive mode for when the candidate is filling out an application form in Ch
|
|
|
10
10
|
|
|
11
11
|
For a single application interactively, carry on in the current session — the rule targets multi-job loops.
|
|
12
12
|
|
|
13
|
+
## Apply Preflight — Location Filter (orchestrator runs before dispatch)
|
|
14
|
+
|
|
15
|
+
Before dispatching any batch of apply subagents, cross-check each candidate's location against `config/profile.yml`. **Prefer the structured `location_constraints` block** (deterministic match). Fall back to the prose `location.*` / `compensation.location_flexibility` fields only when `location_constraints` is absent (legacy profiles).
|
|
16
|
+
|
|
17
|
+
### Preferred path — structured `location_constraints` (deterministic)
|
|
18
|
+
|
|
19
|
+
1. Read `config/profile.yml → location_constraints`. If present, use the structured fields:
|
|
20
|
+
|
|
21
|
+
```yaml
|
|
22
|
+
location_constraints:
|
|
23
|
+
remote_us: true | false
|
|
24
|
+
remote_global: true | false
|
|
25
|
+
hybrid_cities: [san-francisco, ...]
|
|
26
|
+
blocked_cities: [new-york, ...]
|
|
27
|
+
authorized_countries: [US, ...] # ISO-3166 alpha-2
|
|
28
|
+
requires_visa_sponsorship: true | false
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
2. For each candidate, open its evaluation report (`reports/{num}-*.md`) and read the Location / Block A content. Extract: `mode ∈ {remote, hybrid, onsite}`, `city` (lowercase hyphenated), `country` (ISO-3166 alpha-2 when derivable).
|
|
32
|
+
|
|
33
|
+
3. Apply the filter (decision table):
|
|
34
|
+
|
|
35
|
+
| Role shape | Rule | Outcome |
|
|
36
|
+
|---|---|---|
|
|
37
|
+
| Remote, country ∈ authorized_countries (typically US) | `remote_us == true` → COMPATIBLE | dispatch |
|
|
38
|
+
| Remote, country ∉ authorized_countries | `remote_global == true` AND (`requires_visa_sponsorship == false` OR JD mentions sponsorship) → COMPATIBLE | dispatch / else skip |
|
|
39
|
+
| Hybrid, `city ∈ hybrid_cities` | COMPATIBLE | dispatch |
|
|
40
|
+
| Hybrid or Onsite, `city ∈ blocked_cities` | INCOMPATIBLE | mark `Discarded`, note `location mismatch: blocked_city=X` |
|
|
41
|
+
| Hybrid or Onsite, `city` not in `hybrid_cities` and not in `blocked_cities` | INCOMPATIBLE by default (hybrid is opt-in per city) | mark `Discarded`, note `location mismatch: city=X not in hybrid_cities` |
|
|
42
|
+
| Location unclear / ambiguous | dispatch with a prompt flag instructing the apply subagent to verify the JD location first and Discard early if confirmed incompatible | dispatch-with-flag |
|
|
43
|
+
|
|
44
|
+
4. Country/visa: if `requires_visa_sponsorship == false` AND `country ∉ authorized_countries` AND the JD does NOT explicitly offer sponsorship → INCOMPATIBLE, do NOT dispatch.
|
|
45
|
+
|
|
46
|
+
### Fallback path — prose fields (legacy profiles with no `location_constraints`)
|
|
47
|
+
|
|
48
|
+
When `location_constraints` is absent, use the prose fields:
|
|
49
|
+
|
|
50
|
+
1. Read `config/profile.yml` for `location` (country, city), `compensation.location_flexibility`, and `visa_status`.
|
|
51
|
+
2. For each candidate, open its evaluation report (`reports/{num}-*.md`) and read the Location / Block A content.
|
|
52
|
+
3. Apply the filter:
|
|
53
|
+
- If the report says "Remote (US)" / "Remote" / "fully remote" — COMPATIBLE, dispatch.
|
|
54
|
+
- If the report says "Hybrid N days in {city}" AND {city} matches `location.city` OR `location_flexibility` says "open to hybrid in {city}" — COMPATIBLE, dispatch.
|
|
55
|
+
- If the report says "Hybrid" or "Onsite" at a city NOT in the profile's location set AND `location_flexibility` says Remote-preferred — INCOMPATIBLE, do NOT dispatch. Mark the tracker entry `Discarded` directly with note `location mismatch: profile=X, role=Y`.
|
|
56
|
+
- If unclear or ambiguous — dispatch with a prompt flag telling the apply subagent to verify the JD location first and Discard early if confirmed incompatible.
|
|
57
|
+
4. Country/visa: if `visa_status: "No sponsorship needed"` and the role is outside the authorized country — INCOMPATIBLE, do NOT dispatch.
|
|
58
|
+
|
|
59
|
+
**Why**: on 2026-04-18, 5 of 7 candidates dispatched for apply turned out location-incompatible. Each burned an apply-subagent round. The prose-field path reached the right call but cost interpretation cycles per dispatch; the structured path is O(1) field lookup and removes LLM-interpretation risk.
|
|
60
|
+
|
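The structured decision table above can be sketched as a pure function. This is an illustrative reading, not the package's implementation: `locationDecision`, the `role` object shape, and the `jdOffersSponsorship` flag are hypothetical names. Three points are assumptions because the table does not spell them out: a remote role in an authorized country with `remote_us: false` is treated as skip, an onsite role in a `hybrid_cities` city is flagged for manual verification, and `blocked_cities` is checked before `hybrid_cities`. The separate country/visa rule (step 4) is left out, since how it interacts with the remote-global row is open to interpretation.

```javascript
// Hypothetical helper mirroring the decision table; all names are illustrative.
function locationDecision(constraints, role) {
  const { mode, city, country, jdOffersSponsorship = false } = role;
  const authorized = constraints.authorized_countries ?? [];
  const hybrid = constraints.hybrid_cities ?? [];
  const blocked = constraints.blocked_cities ?? [];

  // Location unclear or ambiguous: dispatch with a flag so the subagent verifies first.
  if (!mode || (mode === 'remote' && !country) || (mode !== 'remote' && !city)) {
    return { outcome: 'dispatch-with-flag' };
  }

  if (mode === 'remote') {
    if (authorized.includes(country)) {
      // Assumption: the table gives no "else" for this row; treated as skip here.
      return { outcome: constraints.remote_us ? 'dispatch' : 'skip' };
    }
    const visaOk =
      constraints.requires_visa_sponsorship === false || jdOffersSponsorship;
    return { outcome: constraints.remote_global && visaOk ? 'dispatch' : 'skip' };
  }

  // Hybrid or onsite from here on. Assumption: blocked wins over hybrid.
  if (blocked.includes(city)) {
    return { outcome: 'discard', note: `location mismatch: blocked_city=${city}` };
  }
  if (hybrid.includes(city)) {
    // Assumption: onsite in a hybrid city is not covered by the table, so flag it.
    return { outcome: mode === 'hybrid' ? 'dispatch' : 'dispatch-with-flag' };
  }
  return {
    outcome: 'discard',
    note: `location mismatch: city=${city} not in hybrid_cities`,
  };
}

const constraints = {
  remote_us: true,
  remote_global: false,
  hybrid_cities: ['san-francisco'],
  blocked_cities: ['new-york'],
  authorized_countries: ['US'],
  requires_visa_sponsorship: false,
};

console.log(locationDecision(constraints, { mode: 'remote', country: 'US' }));
console.log(locationDecision(constraints, { mode: 'hybrid', city: 'new-york', country: 'US' }));
```

A `discard` outcome corresponds to marking the tracker entry `Discarded` with the returned note. The function only makes the decision and writes nothing.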
|
13
61
|
### Run this multi-job apply runbook literally when N > 1
|
|
14
62
|
|
|
15
63
|
```
|
|
@@ -24,8 +72,8 @@ Step 4 — For round in ceil(N/2):
|
|
|
24
72
|
# WAIT for both returns. Do not proceed until both done.
|
|
25
73
|
Step 5 — Between rounds: geometra_list_sessions() + geometra_disconnect({closeBrowser: true})
|
|
26
74
|
Step 6 — Reconcile outcomes (Hard Limit #6):
|
|
27
|
-
bash:
|
|
28
|
-
bash:
|
|
75
|
+
bash: npx job-forge merge # TSVs → day file
|
|
76
|
+
bash: npx job-forge verify # validate
|
|
29
77
|
Step 7 — Summarize outcomes; do NOT auto-retry failures.
|
|
30
78
|
```
|
|
31
79
|
|
|
@@ -33,7 +81,7 @@ If a subagent fails, report it in the summary and let the user decide whether to
|
|
|
33
81
|
|
|
34
82
|
**Outcome routing (Hard Limit #6 in `AGENTS.md`):**
|
|
35
83
|
- Subagents write `batch/tracker-additions/{num}-{slug}.tsv` — one TSV per job.
|
|
36
|
-
- Orchestrator runs `
|
|
84
|
+
- Orchestrator runs `npx job-forge merge` once at the end to consume TSVs into the right day file.
|
|
37
85
|
- **Do NOT** append APPLIED / FAILED / SKIP lines to `data/pipeline.md` — that file is the URL inbox only.
|
|
38
86
|
|
|
39
87
|
## Verify these requirements
|
|
@@ -214,13 +262,28 @@ Specific portals — Workday "parse my resume", iCIMS multi-step, SAP SuccessFac
|
|
|
214
262
|
Check for an OTP gate after the candidate (or Geometra) submits — the major portals (Greenhouse, Workday, Lever, Ashby) gate submission behind an email verification code. When an OTP step appears, do this.
|
|
215
263
|
|
|
216
264
|
1. **Do NOT stop and ask the candidate to paste the code manually.** Use the Gmail MCP.
|
|
217
|
-
2.
|
|
218
|
-
3. `
|
|
219
|
-
4. `
|
|
265
|
+
2. **Pick the Gmail sender query from the ATS recorded at scan time.** The scan subagent records the ATS type in `batch/scan-output-{YYYY-MM-DD}.md` (`ats` column) and in `data/pipeline.md` (`| ats={type}` suffix). Read that value first — do NOT re-infer the ATS from the URL host when it's already recorded.
|
|
266
|
+
3. Map the `ats` value to the Gmail sender query (table below). Wait ~5-10 seconds for the email, then call `gmail_list_messages` with the matching query.
|
|
267
|
+
4. `gmail_get_message` on the most recent match, extract the code from the body.
|
|
268
|
+
5. `geometra_fill_otp` to enter it, then submit.
|
|
269
|
+
|
|
270
|
+
**ATS → Gmail sender query lookup** (use the `ats` value recorded at scan time):
|
|
271
|
+
|
|
272
|
+
| `ats` value | `q` for `gmail_list_messages` |
|
|
273
|
+
|-------------|-------------------------------|
|
|
274
|
+
| `greenhouse` | `from:greenhouse newer_than:10m` |
|
|
275
|
+
| `workday` | `from:myworkday newer_than:10m` |
|
|
276
|
+
| `lever` | `from:lever newer_than:10m` |
|
|
277
|
+
| `ashby` | `from:ashby newer_than:10m` |
|
|
278
|
+
| `workable` | `from:workable newer_than:10m` |
|
|
279
|
+
| `builtin` | `from:builtin newer_than:10m` |
|
|
280
|
+
| `custom` / `unknown` / missing | `newer_than:10m subject:(verify OR code OR confirm)` |
|
|
281
|
+
|
|
282
|
+
**Fallback when `ats` is missing** (legacy pipeline entries with no `| ats=` suffix, or scan-output without an `ats` column): infer from the URL host — `*.greenhouse.io` → `greenhouse`; `jobs.ashbyhq.com` → `ashby`; `jobs.lever.co` → `lever`; `*.myworkdayjobs.com` → `workday`; `apply.workable.com` / `jobs.workable.com` → `workable`; `builtin.com` → `builtin`; otherwise use the generic `verify OR code OR confirm` subject query.
|
|
220
283
|
|
|
221
284
|
**Before reporting the submission as failed, always check Gmail.** A "submit did nothing" outcome usually means a silent OTP step — not a real failure.
|
|
222
285
|
|
|
223
|
-
Full
|
|
286
|
+
Full OTP recipe and fallback patterns: see "OTP Handling via Gmail MCP" in `AGENTS.md`.
|
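The lookup and its URL-host fallback described above could be expressed roughly as follows. The function names `inferAtsFromUrl` and `otpQuery` are hypothetical, and accepting `www.builtin.com` alongside `builtin.com` is an assumption. The query strings are copied from the table.

```javascript
// Query strings copied from the ATS lookup table; keys are the recorded `ats` values.
const GMAIL_QUERY_BY_ATS = new Map([
  ['greenhouse', 'from:greenhouse newer_than:10m'],
  ['workday', 'from:myworkday newer_than:10m'],
  ['lever', 'from:lever newer_than:10m'],
  ['ashby', 'from:ashby newer_than:10m'],
  ['workable', 'from:workable newer_than:10m'],
  ['builtin', 'from:builtin newer_than:10m'],
]);
const GENERIC_QUERY = 'newer_than:10m subject:(verify OR code OR confirm)';

// Fallback for legacy entries with no recorded `ats`: infer it from the URL host.
function inferAtsFromUrl(url) {
  let host;
  try {
    host = new URL(url).hostname.toLowerCase();
  } catch {
    return 'unknown'; // not a parseable URL
  }
  if (host.endsWith('.greenhouse.io')) return 'greenhouse';
  if (host === 'jobs.ashbyhq.com') return 'ashby';
  if (host === 'jobs.lever.co') return 'lever';
  if (host.endsWith('.myworkdayjobs.com')) return 'workday';
  if (host === 'apply.workable.com' || host === 'jobs.workable.com') return 'workable';
  // Assumption: treat the www. host the same as the bare domain.
  if (host === 'builtin.com' || host === 'www.builtin.com') return 'builtin';
  return 'unknown';
}

// Prefer the value recorded at scan time; only infer when it is missing.
function otpQuery(recordedAts, url) {
  const ats = recordedAts || inferAtsFromUrl(url);
  return GMAIL_QUERY_BY_ATS.get(ats) ?? GENERIC_QUERY;
}

console.log(otpQuery('workday', 'https://example.invalid/job/1'));
console.log(otpQuery(undefined, 'https://job-boards.greenhouse.io/webflow/jobs/7689676'));
console.log(otpQuery('custom', 'https://jobs.lever.co/temporal/xyz'));
```

Note that a recorded `custom` or `unknown` value maps straight to the generic subject query and is not re-inferred from the host, matching the instruction to trust the recorded value when it exists.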
|
224
287
|
|
|
225
288
|
## Step 7 — Update outcomes after submission
|
|
226
289
|
|
|
@@ -240,7 +303,7 @@ The row exists. You are UPDATING an existing entry, which is allowed (Pipeline I
|
|
|
240
303
|
The row does NOT exist yet. You MUST go through the TSV pathway (Hard Limit #6 + Pipeline Integrity rule #1):
|
|
241
304
|
|
|
242
305
|
1. Write `batch/tracker-additions/{num}-{slug}.tsv` with the canonical 9-column format (see "TSV Format for Tracker Additions" in `AGENTS.md`)
|
|
243
|
-
2. At the end of the apply run, the orchestrator calls `
|
|
306
|
+
2. At the end of the apply run, the orchestrator calls `npx job-forge merge`, which inserts the row into today's day file
|
|
244
307
|
3. Do NOT manually add a row to the day file. Do NOT append an `APPLIED` line to `data/pipeline.md`.
|
|
245
308
|
|
|
246
309
|
### Apply to both cases
|
package/modes/auto-pipeline.md
CHANGED
|
@@ -8,9 +8,12 @@ Fetch the JD content once. If the input is a **URL** (not pasted JD text), fetch
|
|
|
8
8
|
|
|
9
9
|
**Pick exactly one method, in this priority order:**
|
|
10
10
|
|
|
11
|
-
1. **
|
|
12
|
-
2. **
|
|
13
|
-
3. **
|
|
11
|
+
1. **Greenhouse JSON API (first try, if the URL is Greenhouse-backed):** If the pipeline.md entry carries `| gh={slug}/{id}` OR the URL host matches `*.greenhouse.io` / a known Greenhouse customer front-end (`*.pinterestcareers.com`, `okta.com/company/careers/opportunity/*`, `samsara.com/company/careers/roles/*`, `zoominfo.com/careers?gh_jid=*`, `collibra.com/.../?gh_jid=*`, `careers.toasttab.com/jobs?gh_jid=*`, `careers.airbnb.com/positions/*?gh_jid=*`, `coinbase.com/careers/positions/*?gh_jid=*`, `instacart.careers/job/?gh_jid=*`), extract `slug` and `id` and WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` is the authoritative JD. 404 = genuinely closed (mark CLOSED and stop). **If 200, STOP — do not fall back to Geometra or WebFetch of the front-end.** The API is faster, cheaper (no Geometra session), and never returns a bot-shell.
|
|
12
|
+
2. **Geometra MCP:** Most non-Greenhouse job portals (Lever, Ashby, Workday) are SPAs. Use `geometra_connect` + `geometra_page_model` to render and read the JD. **If this returns non-empty JD text, STOP — do not WebFetch the same URL.**
|
|
13
|
+
3. **WebFetch (only if Geometra is unavailable OR returned only a shell with no JD text):** For static pages (ZipRecruiter, WeLoveProduct, company career pages).
|
|
14
|
+
4. **WebSearch (only if methods 1–3 all failed):** Search for the role title + company on secondary portals that index the JD in static HTML.
|
|
15
|
+
|
|
16
|
+
**Do NOT mark a Greenhouse-sourced offer CLOSED based on a WebFetch shell or a 403 from a customer-skinned careers domain.** Pinterest, Okta, Samsara, ZoomInfo, Collibra, Toast, Airbnb, Coinbase, Instacart all serve bot-hostile fronts. The Greenhouse JSON API (step 1) is the ground truth for their offer state. A previous scan run fed 60 live Greenhouse URLs through WebFetch-only verification and 100% of them were wrongly marked CLOSED; if you see a high stale rate, you are skipping step 1.
|
|
14
17
|
|
|
15
18
|
**Rule:** Each URL gets fetched at most once per session. If you already have the JD text in context — from Geometra, a previous WebFetch, or pasted by the candidate — do not fetch again.
|
|
16
19
|
|
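Step 1 of the priority order depends on reading `slug` and `id` from the `| gh={slug}/{id}` suffix of a `pipeline.md` entry. A minimal sketch of that extraction follows, assuming the entry layout `- [ ] {url} | {company} | {title} | ats={ats} | gh={slug}/{id}` documented in `modes/scan.md` elsewhere in this diff. `parsePipelineEntry` and `greenhouseApiUrl` are hypothetical helpers, and the parser assumes no field contains a literal `|`.

```javascript
// Hypothetical parser for one pipeline.md entry line.
function parsePipelineEntry(line) {
  // Checkbox mark is a single character, e.g. " " (pending) or "!" (flagged).
  const m = line.match(/^- \[(.)\] (.+)$/);
  if (!m) return null;

  const [url, company, title, ...meta] = m[2].split('|').map((f) => f.trim());
  const entry = { mark: m[1], url, company, title, ats: null, gh: null };

  // Remaining fields are key=value metadata such as ats=... and gh=slug/id.
  for (const field of meta) {
    const eq = field.indexOf('=');
    if (eq === -1) continue;
    const key = field.slice(0, eq);
    const value = field.slice(eq + 1);
    if (key === 'ats') entry.ats = value;
    if (key === 'gh') {
      const [slug, id] = value.split('/');
      if (slug && id) entry.gh = { slug, id };
    }
  }
  return entry;
}

// Endpoint pattern quoted in step 1.
function greenhouseApiUrl({ slug, id }) {
  return `https://boards-api.greenhouse.io/v1/boards/${slug}/jobs/${id}`;
}

const entry = parsePipelineEntry(
  '- [ ] https://job-boards.greenhouse.io/webflow/jobs/7689676 | Webflow | Lead AI Engineer | ats=greenhouse | gh=webflow/7689676'
);
console.log(entry.gh ? greenhouseApiUrl(entry.gh) : 'no gh suffix; use Geometra or WebFetch');
```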
package/modes/pdf.md
CHANGED
|
@@ -17,7 +17,7 @@
|
|
|
17
17
|
11. Inject keywords naturally into existing achievements (NEVER fabricate)
|
|
18
18
|
12. Generate complete HTML from template + personalized content
|
|
19
19
|
13. Write HTML to `/tmp/cv-candidate-{company}.html`
|
|
20
|
-
14. Run: `
|
|
20
|
+
14. Run: `npx job-forge pdf /tmp/cv-candidate-{company}.html output/cv-candidate-{company}-{YYYY-MM-DD}.pdf --format={letter|a4}`
|
|
21
21
|
15. Report: PDF path, page count, keyword coverage %
|
|
22
22
|
|
|
23
23
|
## Apply these ATS rules for clean parsing
|
package/modes/pipeline.md
CHANGED
|
@@ -33,9 +33,10 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
|
|
|
33
33
|
|
|
34
34
|
## Detect JD From URL
|
|
35
35
|
|
|
36
|
-
1. **
|
|
37
|
-
2. **
|
|
38
|
-
3. **
|
|
36
|
+
1. **Greenhouse JSON API (FIRST, when the entry has `| gh={slug}/{id}` OR the host looks Greenhouse-backed):** WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` = LIVE, use it as the JD; 404 = genuinely CLOSED (mark `- [!]` and continue). Bot-hostile customer fronts (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`) MUST be verified via this API first: WebFetch/Geometra of those domains returns a shell or 403 and causes false CLOSED marks.
|
|
37
|
+
2. **Geometra MCP:** `geometra_connect` + `geometra_page_model`. Works with non-Greenhouse SPAs (Lever, Ashby, Workday), uses fewer tokens than raw DOM snapshots.
|
|
38
|
+
3. **WebFetch (fallback):** For static pages or when Geometra is not available.
|
|
39
|
+
4. **WebSearch (last resort):** Search on secondary portals that index the JD.
|
|
39
40
|
|
|
40
41
|
**Special cases:**
|
|
41
42
|
- **LinkedIn**: May require login → mark `[!]` and ask the user to paste the text
|
|
@@ -52,7 +53,7 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
|
|
|
52
53
|
|
|
53
54
|
Before processing any URL, verify sync:
|
|
54
55
|
```bash
|
|
55
|
-
|
|
56
|
+
npx job-forge sync-check
|
|
56
57
|
```
|
|
57
58
|
If there is a desynchronization, warn the user before continuing.
|
|
58
59
|
|
|
@@ -71,8 +72,8 @@ Step 3 — For round in ceil(N/2):
|
|
|
71
72
|
# WAIT for both returns before the next round.
|
|
72
73
|
Step 4 — Between rounds: geometra_list_sessions() + geometra_disconnect({closeBrowser: true})
|
|
73
74
|
Step 5 — Reconcile outcomes (Hard Limit #6):
|
|
74
|
-
bash:
|
|
75
|
-
bash:
|
|
75
|
+
bash: npx job-forge merge # TSVs → correct day file
|
|
76
|
+
bash: npx job-forge verify # validate URL/status consistency
|
|
76
77
|
Step 6 — Display summary table; flag any verify-pipeline errors.
|
|
77
78
|
```
|
|
78
79
|
|
package/modes/scan.md
CHANGED
|
@@ -68,8 +68,12 @@ The levels are additive — all are executed, results are merged and deduplicate
|
|
|
68
68
|
5. **Level 2 — Greenhouse APIs** (WebFetch can batch freely — it's cheap and doesn't use Geometra sessions):
|
|
69
69
|
For each company in `tracked_companies` with `api:` defined and `enabled: true`:
|
|
70
70
|
a. WebFetch the API URL → JSON with job list
|
|
71
|
-
b. For each job extract: `{title, url, company}`
|
|
72
|
-
|
|
71
|
+
b. For each job extract: `{title, url, company, gh_slug, gh_id, updated_at}`
|
|
72
|
+
- **`url`**: ALWAYS record the canonical Greenhouse URL: `https://job-boards.greenhouse.io/{gh_slug}/jobs/{gh_id}`. Do **NOT** use `absolute_url` when it points to a customer-skinned front-end (e.g. `pinterestcareers.com/jobs/?gh_jid=N`, `okta.com/company/careers/opportunity/N`, `samsara.com/company/careers/roles/N`, `zoominfo.com/careers?gh_jid=N`, `collibra.com/.../?gh_jid=N`, `careers.toasttab.com/jobs?gh_jid=N`, `careers.airbnb.com/positions/N`, `coinbase.com/careers/positions/N`, `instacart.careers/job/?gh_jid=N`). These customer front-ends return shells or 403 to bots and cause downstream WebFetch-based verification to wrongly mark the role CLOSED.
|
|
73
|
+
- **`gh_slug`**: the Greenhouse board slug (from the API URL that was fetched).
|
|
74
|
+
- **`gh_id`**: `jobs[].id` from the API response.
|
|
75
|
+
- **`updated_at`**: `jobs[].updated_at` — record for staleness detection (skip if older than 90 days, flag if older than 30).
|
|
76
|
+
c. Accumulate in candidates list (dedup with Level 1). The pipeline.md entry MUST carry `| gh={gh_slug}/{gh_id}` at the end of the metadata so downstream evaluators can fall back to `https://boards-api.greenhouse.io/v1/boards/{gh_slug}/jobs/{gh_id}` when the canonical URL renders as a shell.
|
|
73
77
|
|
|
74
78
|
6. **Level 3 — WebSearch queries** (WebSearch is parallel-safe; batch freely):
|
|
75
79
|
For each query in `search_queries` with `enabled: true`:
|
|
@@ -102,7 +106,10 @@ The levels are additive — all are executed, results are merged and deduplicate
|
|
|
102
106
|
- When a fuzzy match is found but the URL is new, log it as `skipped_repost` (not `skipped_dup`) with a note referencing the original entry number.
|
|
103
107
|
|
|
104
108
|
8. **For each new offer that passes filters**:
|
|
105
|
-
a. Add to `pipeline.md` section "Pending": `- [ ] {url} | {company} | {title}`
|
|
109
|
+
a. Add to `pipeline.md` section "Pending": `- [ ] {url} | {company} | {title} | ats={ats}` — the `| ats={type}` suffix is REQUIRED for every entry (values: `greenhouse`, `ashby`, `workable`, `lever`, `workday`, `builtin`, `custom`, `unknown`). When the offer came from the Greenhouse API (Level 2), ALSO append `| gh={gh_slug}/{gh_id}` so downstream verification can hit the JSON endpoint. Example entries:
|
|
110
|
+
- `- [ ] https://job-boards.greenhouse.io/webflow/jobs/7689676 | Webflow | Lead AI Engineer | ats=greenhouse | gh=webflow/7689676`
|
|
111
|
+
- `- [ ] https://jobs.ashbyhq.com/everai/abc-123 | EverAI | Senior AI PM | ats=ashby`
|
|
112
|
+
- `- [ ] https://jobs.lever.co/temporal/xyz | Temporal | Product Manager - AI | ats=lever`
|
|
106
113
|
b. Record in `scan-history.tsv`: `{url}\t{date}\t{query_name}\t{title}\t{company}\tadded`
|
|
107
114
|
|
|
108
115
|
9. **Offers filtered by title**: record in `scan-history.tsv` with status `skipped_title`
|
|
@@ -137,6 +144,36 @@ https://... 2026-02-10 Greenhouse — SA Junior Dev BigCo skipped_title
|
|
|
137
144
|
https://... 2026-02-10 Ashby — AI PM SA AI OldCo skipped_dup
|
|
138
145
|
```
|
|
139
146
|
|
|
147
|
+
## Structured Output — Required for Downstream Dispatch
|
|
148
|
+
|
|
149
|
+
Scan mode MUST write its ranked candidate list to a file, not just return it in prose. Downstream subagents (evaluators, applyers) must read URLs from this file, not from the scan subagent's return message. This prevents any hallucinated URL or ID from propagating.
|
|
150
|
+
|
|
151
|
+
**File location**: `batch/scan-output-{YYYY-MM-DD}.md`
|
|
152
|
+
|
|
153
|
+
**Format**: one markdown table per scan run, ordered by archetype-fit rank:
|
|
154
|
+
|
|
155
|
+
| rank | company | ats | role | gh_slug | gh_id | url | updated_at |
|
|
156
|
+
|------|---------|-----|------|---------|-------|-----|------------|
|
|
157
|
+
| 1 | Webflow | greenhouse | Lead AI Engineer | webflow | 7689676 | https://job-boards.greenhouse.io/webflow/jobs/7689676 | 2026-04-14 |
|
|
158
|
+
| 2 | EverAI | ashby | Senior AI PM | - | - | https://jobs.ashbyhq.com/everai/abc-123 | 2026-04-15 |
|
|
159
|
+
| ... | ... | ... | ... | ... | ... | ... | ... |
|
|
160
|
+
|
|
161
|
+
**`ats` values** (one of): `greenhouse`, `ashby`, `workable`, `lever`, `workday`, `builtin`, `custom`, `unknown`. Every row MUST populate this column — it's what the apply subagent uses to pick the correct Gmail OTP sender query.
|
|
162
|
+
|
|
163
|
+
Every row MUST have:
|
|
164
|
+
- `ats` — the ATS platform hosting the posting. Inferred from the canonical URL host (e.g. `boards-api.greenhouse.io` / `job-boards.greenhouse.io` → `greenhouse`; `jobs.ashbyhq.com` → `ashby`; `jobs.lever.co` → `lever`; `myworkdayjobs.com` / `.wd5.myworkdayjobs.com` → `workday`; `apply.workable.com` / `jobs.workable.com` → `workable`; `builtin.com/jobs/` → `builtin`; company-own domains → `custom`; anything indeterminate → `unknown`).
|
|
165
|
+
- `url` in canonical form. For Greenhouse use `https://job-boards.greenhouse.io/{gh_slug}/jobs/{gh_id}` (matching the suffix in `data/pipeline.md`). For other ATSes use the platform's native URL (do not rewrite).
|
|
166
|
+
- `updated_at` in `YYYY-MM-DD` form (the most recent `updated_at` in the API response, or scan date when the source has no such field).
|
|
167
|
+
|
|
168
|
+
Additional columns — REQUIRED when available, `-` (dash) when not applicable:
|
|
169
|
+
- `gh_slug`, `gh_id` — Greenhouse-only. Copied verbatim from the Greenhouse API response (not reconstructed). For non-Greenhouse rows, emit `-` in both columns; `ats` + `url` are sufficient.
|
|
170
|
+
|
|
171
|
+
The scan subagent's return message MUST:
|
|
172
|
+
- Reference the file path (so orchestrators know where to read)
|
|
173
|
+
- Omit the ranked URL list from prose entirely (summary counts only)
|
|
174
|
+
|
|
175
|
+
**Rationale**: in a prior run, a scan subagent returned correct IDs in `scan-history.tsv` but hallucinated plausible-looking fake IDs in its prose-form top-30 list. The orchestrator trusted prose and dispatched 30 downstream subagents against fake URLs. File-based handoff prevents this class of error. Recording `ats` at scan time (rather than having the apply subagent infer it from the URL host) saves downstream re-parsing and keeps the OTP sender lookup deterministic.
|
|
176
|
+
|
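As a rough sketch of how one row of the structured output could be produced, the snippet below builds the canonical Greenhouse URL, applies the staleness bands from the Level 2 rule (skip if older than 90 days, flag if older than 30), and renders a table row. The helper names, the `fresh` label, and the camelCase candidate fields (`ghSlug`, `ghId`, `updatedAt`) are all hypothetical. Only the URL pattern, the thresholds, and the column order come from the text above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Canonical Greenhouse URL, as required for the `url` column.
function canonicalGreenhouseUrl(ghSlug, ghId) {
  return `https://job-boards.greenhouse.io/${ghSlug}/jobs/${ghId}`;
}

// Staleness bands: older than 90 days means skip, older than 30 means flag.
// Both arguments are ISO date or date-time strings; "fresh" is an illustrative label.
function staleness(updatedAt, today) {
  const ageDays = Math.floor((Date.parse(today) - Date.parse(updatedAt)) / DAY_MS);
  if (ageDays > 90) return 'skip';
  if (ageDays > 30) return 'flag';
  return 'fresh';
}

// Hypothetical row builder for batch/scan-output-{YYYY-MM-DD}.md.
// Column order: rank | company | ats | role | gh_slug | gh_id | url | updated_at
function scanOutputRow(rank, c) {
  const isGh = c.ats === 'greenhouse';
  const cells = [
    rank,
    c.company,
    c.ats,
    c.role,
    isGh ? c.ghSlug : '-',
    isGh ? c.ghId : '-',
    isGh ? canonicalGreenhouseUrl(c.ghSlug, c.ghId) : c.url,
    c.updatedAt.slice(0, 10), // keep only YYYY-MM-DD
  ];
  return `| ${cells.join(' | ')} |`;
}

console.log(scanOutputRow(1, {
  company: 'Webflow', ats: 'greenhouse', role: 'Lead AI Engineer',
  ghSlug: 'webflow', ghId: 7689676, updatedAt: '2026-04-14',
}));
console.log(staleness('2026-03-01', '2026-04-18'));
```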
|
140
177
|
## Output Summary
|
|
141
178
|
|
|
142
179
|
```
|
|
@@ -148,12 +185,27 @@ Filtered by title: N relevant
|
|
|
148
185
|
Duplicates: N (already evaluated or in pipeline)
|
|
149
186
|
New added to pipeline.md: N
|
|
150
187
|
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
|
|
154
|
-
|
|
188
|
+
NEXT STEP RECOMMENDATION:
|
|
189
|
+
- Structured candidate list written to: batch/scan-output-{YYYY-MM-DD}.md
|
|
190
|
+
- Downstream subagents MUST read URLs from that file, not from this return message
|
|
191
|
+
- Run /job-forge pipeline to evaluate the new offers.
|
|
155
192
|
```
|
|
156
193
|
|
|
194
|
+
## Verify Before Marking CLOSED (downstream rule)
|
|
195
|
+
|
|
196
|
+
**DO NOT mark a Greenhouse offer CLOSED based on a WebFetch/Geometra result alone.** Customer-skinned careers pages (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`, etc.) serve bot-hostile shells — a 403, a navbar-only response, or a client-side-only render. WebFetch sees "no JD" and mis-classifies as CLOSED.
|
|
197
|
+
|
|
198
|
+
**Correct verification order for any Greenhouse-sourced URL** (identified by a `| gh={slug}/{id}` suffix in `pipeline.md` or a `boards-api.greenhouse.io` / `job-boards.greenhouse.io` / `boards.greenhouse.io` host):
|
|
199
|
+
|
|
200
|
+
1. Try `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. This is the authoritative source.
|
|
201
|
+
- **200 + JSON with `title` and `content`** → offer is LIVE. Use the JSON content as the JD. Do not mark CLOSED.
|
|
202
|
+
- **404** → offer is genuinely closed. Mark CLOSED.
|
|
203
|
+
- **Other non-2xx** → treat as transient (network/rate-limit); retry once. If still failing, mark `**Verification: unconfirmed**` and continue evaluation from whatever text is available. Do NOT mark CLOSED.
|
|
204
|
+
2. Only then fall back to WebFetch of the canonical `job-boards.greenhouse.io/{slug}/jobs/{id}` URL.
|
|
205
|
+
3. Only then fall back to Geometra on the same canonical URL.
|
|
206
|
+
|
|
207
|
+
**Rule of thumb:** Greenhouse postings with valid `gh_slug`/`gh_id` should be verified via the API first. A WebFetch failure on a customer-skinned domain is NOT evidence the role is closed.
|
|
208
|
+
|
|
157
209
|
## Update careers_url
|
|
158
210
|
|
|
159
211
|
Each company in `tracked_companies` MUST have a `careers_url` — the direct URL to its job listings page. The stored URL avoids searching for it every time.
|
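The Greenhouse verification order added to `modes/scan.md` in the hunk above can be sketched as a small classifier. This is a minimal sketch, not job-forge's actual implementation: the names `classifyGreenhouseResponse` and `verifyGreenhouseOffer`, the returned labels, and the injected `fetchImpl` parameter are hypothetical. Only the endpoint URL and the 200 / 404 / other rules come from the text. Treating a 200 response that lacks `title` or `content` as unconfirmed is an assumption, since the text does not cover that case.

```javascript
// Hypothetical helper: map an API response to the outcomes described in the rule.
// 'UNCONFIRMED' is never CLOSED; the caller keeps evaluating from available text.
function classifyGreenhouseResponse(status, body) {
  if (status === 200 && body && body.title && body.content) return 'LIVE';
  if (status === 404) return 'CLOSED';
  // Any other result (403, 429, 5xx, or a 200 without the expected fields,
  // the last being an assumption) is treated as transient.
  return 'UNCONFIRMED';
}

// Hypothetical wrapper: query the boards API first, retrying once on a
// transient result. fetchImpl defaults to the global fetch (Node 18+).
async function verifyGreenhouseOffer(slug, id, fetchImpl = fetch) {
  const url = `https://boards-api.greenhouse.io/v1/boards/${slug}/jobs/${id}`;
  let verdict = 'UNCONFIRMED';
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      const res = await fetchImpl(url);
      const body = res.status === 200 ? await res.json().catch(() => null) : null;
      verdict = classifyGreenhouseResponse(res.status, body);
    } catch {
      verdict = 'UNCONFIRMED'; // network error: treat as transient
    }
    if (verdict !== 'UNCONFIRMED') break;
  }
  return { url, verdict };
}
```

Only a `CLOSED` verdict would justify marking the offer closed; on `UNCONFIRMED`, the rule says to fall back to WebFetch and then Geometra on the canonical URL.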
package/normalize-statuses.mjs
CHANGED
@@ -7,7 +7,7 @@
 * - Single-file: data/applications.md or applications.md (legacy)
 *
 * Maps all non-canonical statuses to canonical ones per templates/states.yml:
-* Evaluated, Applied, Responded, Contacted, Interview, Offer, Rejected, Discarded, SKIP
+* Evaluated, Applied, Responded, Contacted, Interview, Offer, Rejected, Discarded, Failed, SKIP
 *
 * Also strips markdown bold (**) and dates from the status field,
 * moving DUPLICADO info to the notes column.
@@ -23,6 +23,9 @@ import {
 usesDayFiles, ensureDayDir, parseAppLine, formatAppLine,
 readAllEntries, writeToDayFiles, listDayFiles,
 } from './tracker-lib.mjs';
+import { DEFAULT_STATES, loadCanonicalStates } from './lib/canonical-states.mjs';
+
+const CANONICAL_STATES = loadCanonicalStates(PROJECT_DIR) || DEFAULT_STATES;
 
 const DRY_RUN = process.argv.includes('--dry-run');
 
@@ -61,11 +64,7 @@ function normalizeStatus(raw) {
 
 if (s === '—' || s === '-' || s === '') return { status: 'Discarded' };
 
-const
-'Evaluated', 'Applied', 'Contacted', 'Responded', 'Interview',
-'Offer', 'Rejected', 'Discarded', 'SKIP',
-];
-for (const c of canonical) {
+for (const c of CANONICAL_STATES) {
 if (lower === c.toLowerCase()) return { status: c };
 }
 
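The change to `normalize-statuses.mjs` above replaces an inline array with a shared list that falls back to defaults when nothing can be loaded. A minimal sketch of that pattern follows. It assumes `DEFAULT_STATES` matches the list in the file's header comment (the real export in `lib/canonical-states.mjs` may differ), and `loadStatesOrNull` and `matchCanonical` are hypothetical stand-ins, not the package's functions.

```javascript
// Assumed fallback list, mirroring the header comment in normalize-statuses.mjs.
const DEFAULT_STATES = [
  'Evaluated', 'Applied', 'Responded', 'Contacted', 'Interview',
  'Offer', 'Rejected', 'Discarded', 'Failed', 'SKIP',
];

// Hypothetical stand-in for loadCanonicalStates(): returns null when nothing was loaded.
function loadStatesOrNull(loaded) {
  return Array.isArray(loaded) && loaded.length > 0 ? loaded : null;
}

// Case-insensitive exact match against the canonical labels, like the loop in the diff.
// Trimming the input is an assumption; the diff does not show how `lower` is derived.
function matchCanonical(raw, states) {
  const lower = raw.trim().toLowerCase();
  for (const c of states) {
    if (lower === c.toLowerCase()) return { status: c };
  }
  return null; // no exact match in this sketch
}

// Same fallback shape as the added line: use the loaded list, else the defaults.
const CANONICAL_STATES = loadStatesOrNull(null) || DEFAULT_STATES;
```

With one shared list, a state such as `Failed` only needs to be added in one place for every script that imports it to recognize it.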
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "job-forge",
-"version": "2.0.
+"version": "2.0.3",
 "description": "AI-powered job search pipeline built on opencode",
 "type": "module",
 "bin": {
@@ -43,6 +43,7 @@
 "batch/batch-runner.sh",
 "batch/README.md",
 "docs/",
+"lib/",
 "tracker-lib.mjs",
 "merge-tracker.mjs",
 "dedup-tracker.mjs",
package/templates/states.yml
CHANGED
@@ -55,6 +55,12 @@ states:
 description: Discarded by candidate or offer closed
 dashboard_group: discarded
 
+- id: failed
+label: Failed
+aliases: []
+description: Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry.
+dashboard_group: failed
+
 - id: skip
 label: SKIP
 aliases: [skip]
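Entries in `templates/states.yml` carry an `id`, a `label`, and a list of `aliases`. One way such entries could be turned into a lookup table is sketched below. This is an illustration only: `buildStatusIndex` is a hypothetical helper, the entries are copied by hand from the hunk above rather than parsed from YAML, and nothing here is confirmed to match what `lib/canonical-states.mjs` does.

```javascript
// Entries copied from the hunk above; only the fields needed for lookup are kept.
const states = [
  { id: 'failed', label: 'Failed', aliases: [] },
  { id: 'skip', label: 'SKIP', aliases: ['skip'] },
];

// Hypothetical helper: index every label and alias, lowercased, to its canonical label.
function buildStatusIndex(entries) {
  const index = new Map();
  for (const { label, aliases } of entries) {
    index.set(label.toLowerCase(), label);
    for (const alias of aliases) index.set(alias.toLowerCase(), label);
  }
  return index;
}

const statusIndex = buildStatusIndex(states);
```

Because `Failed` has an empty `aliases` list, only its own label resolves to it in this sketch.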