job-forge 2.1.0 → 2.2.0

@@ -10,7 +10,13 @@ alwaysApply: true
  The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 
  1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
- 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+ 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+ - `data/pipeline.md`
+ - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+ - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+ - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+
+ If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
  3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
  4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
  5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
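
Hard Limit 1 amounts to chunking the job list into rounds. The following is a minimal sketch of that arithmetic only; `planRounds` is a hypothetical helper, not part of job-forge, and each inner array stands for one message whose `task` calls must all return before the next round starts.

```javascript
// Hypothetical helper (not part of job-forge): split jobs into sequential
// rounds of at most `limit` entries, so N jobs yield ceil(N / limit) rounds.
function planRounds(jobs, limit = 2) {
  const rounds = [];
  for (let i = 0; i < jobs.length; i += limit) {
    rounds.push(jobs.slice(i, i + limit));
  }
  return rounds;
}

// Five jobs become three rounds: two pairs and a final single job.
const rounds = planRounds(['job-a', 'job-b', 'job-c', 'job-d', 'job-e']);
console.log(rounds.length); // 3
```

A request for 10 jobs therefore becomes 5 rounds of 2, never a single message with 10 dispatches.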
@@ -475,7 +481,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
  - Output in `output/` (gitignored), Reports in `reports/`
  - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
  - Batch in `batch/` (gitignored except scripts and prompt)
- - Report numbering: sequential 3-digit zero-padded, max existing + 1
+ - Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
  - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
  - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
  - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
package/AGENTS.md CHANGED
@@ -5,7 +5,13 @@
  The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 
  1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
- 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+ 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+ - `data/pipeline.md`
+ - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+ - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+ - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+
+ If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
  3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
  4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
  5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
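
The expanded Hard Limit 2 check can be sketched as a scan over the text of all four sources. This is an illustration under stated assumptions, not the project's implementation: `findPriorApply` is a hypothetical name, the caller is assumed to have read the files already, and each record is assumed to carry the job identity and its outcome on the same line. The sample rows are invented.

```javascript
// Hypothetical sketch: `sources` maps a file label to that file's text.
// Returns the labels of every source holding an Applied outcome for the job.
function findPriorApply(sources, job) {
  const url = (job.url || '').toLowerCase();
  const company = (job.company || '').toLowerCase();
  const role = (job.role || '').toLowerCase();
  const hits = new Set();
  for (const [label, text] of Object.entries(sources)) {
    for (const line of text.split('\n')) {
      const lower = line.toLowerCase();
      const byUrl = url !== '' && lower.includes(url);
      const byPair =
        company !== '' && role !== '' &&
        lower.includes(company) && lower.includes(role);
      // Case-insensitive, so both "APPLIED" and "Applied" count.
      if ((byUrl || byPair) && /\bapplied\b/.test(lower)) hits.add(label);
    }
  }
  return [...hits];
}

// Invented rows: a prior-day day file and a merged TSV both record the apply.
const sources = {
  'data/applications/2026-04-17.md':
    '| 741 | 2026-04-17 | Example Co | Staff Engineer | Applied |',
  'batch/tracker-additions/merged/756-example-co.tsv':
    '756\t2026-04-18\tExample Co\tStaff Engineer\tApplied',
};
const hits = findPriorApply(sources, {
  url: 'https://example.com/jobs/1',
  company: 'Example Co',
  role: 'Staff Engineer',
});
console.log(hits.length > 0 ? 'skip' : 'dispatch'); // skip
```

Returning the matching labels rather than a boolean makes it easy to report which source triggered the skip.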
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
  - Output in `output/` (gitignored), Reports in `reports/`
  - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
  - Batch in `batch/` (gitignored except scripts and prompt)
- - Report numbering: sequential 3-digit zero-padded, max existing + 1
+ - Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
  - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
  - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
  - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
package/CLAUDE.md CHANGED
@@ -5,7 +5,13 @@
  The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 
  1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
- 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+ 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+ - `data/pipeline.md`
+ - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+ - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+ - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+
+ If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
  3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
  4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
  5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
  - Output in `output/` (gitignored), Reports in `reports/`
  - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
  - Batch in `batch/` (gitignored except scripts and prompt)
- - Report numbering: sequential 3-digit zero-padded, max existing + 1
+ - Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
  - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
  - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
  - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
@@ -5,7 +5,13 @@
  The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 
  1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
- 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+ 2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+ - `data/pipeline.md`
+ - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+ - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+ - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+
+ If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
  3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
  4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
  5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
  - Output in `output/` (gitignored), Reports in `reports/`
  - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
  - Batch in `batch/` (gitignored except scripts and prompt)
- - Report numbering: sequential 3-digit zero-padded, max existing + 1
+ - Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
  - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
  - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
  - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
package/modes/pipeline.md CHANGED
@@ -6,7 +6,7 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
 
  1. **Read** `data/pipeline.md` → find `- [ ]` items in the "Pending" section
  2. **For each pending URL**:
- a. Calculate the next sequential `REPORT_NUM` (read `reports/`, take the highest number + 1)
+ a. Calculate the next sequential `REPORT_NUM` by running `npx job-forge next-num` (scans `reports/`, day file `#` columns, and `batch/tracker-additions/` — do NOT derive from `reports/` alone)
  b. **Extract JD** using Geometra MCP (geometra_connect + geometra_page_model) → WebFetch → WebSearch
  c. If the URL is not accessible → mark as `- [!]` with a note and continue
  d. **Run full auto-pipeline**: A-F Evaluation → Report .md → PDF (if score >= 3.0, per `_shared.md` thresholds) → Draft answers (if score >= 3.5) → Tracker
@@ -45,9 +45,13 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
 
  ## Automatic Numbering
 
- 1. List all files in `reports/`
- 2. Extract the number from the prefix (e.g., `142-medispend...` → 142)
- 3. New number = highest found + 1
+ Running `npx job-forge next-num` returns the next 3-digit zero-padded report number. The CLI scans:
+
+ 1. `reports/*.md` filename prefixes
+ 2. The `#` column of every `data/applications/*.md` day file
+ 3. The `{num}` prefix of every `batch/tracker-additions/*.tsv` (pending + merged)
+
+ The CLI takes the max across all three sources and adds 1. Do NOT derive from any single source: prior-day SKIPs and other non-report tracker entries advance the counter but never write to `reports/`, so `ls reports/` alone misses them.
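
The max-across-sources rule can be illustrated with a small pure function. This is a sketch, not the shipped CLI: `prefixNumber` and `nextReportNumber` are hypothetical names, the inputs are passed in rather than read from disk, and the file names are invented.

```javascript
// Hypothetical sketch of the numbering rule; nothing here touches the filesystem.
function prefixNumber(name) {
  // Leading "{num}-" prefix, e.g. "142-example-co.md" gives 142; 0 if absent.
  const m = name.match(/^(\d+)-/);
  return m ? parseInt(m[1], 10) : 0;
}

function nextReportNumber({ reportNames = [], dayFileNumbers = [], tsvNames = [] } = {}) {
  const max = Math.max(
    0,
    ...reportNames.map(prefixNumber),
    ...dayFileNumbers,
    ...tsvNames.map(prefixNumber)
  );
  return String(max + 1).padStart(3, '0');
}

// reports/ stops at 142, but a day-file row used 143 and a TSV used 144.
console.log(nextReportNumber({
  reportNames: ['142-example-co.md'],
  dayFileNumbers: [142, 143],
  tsvNames: ['144-example-co.tsv'],
})); // "145"
```

With `reports/` as the only input the same call would return "143", which is exactly the collision the rule is meant to prevent.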
 
  ## Source Synchronization
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "job-forge",
- "version": "2.1.0",
+ "version": "2.2.0",
  "description": "AI-powered job search pipeline built on opencode",
  "type": "module",
  "bin": {
@@ -2,23 +2,36 @@
  /**
  * next-num — print the next sequential report number (3-digit zero-padded).
  *
- * Reads reports/ and returns max(existing) + 1. Used by agents instead of
- * having the model figure this out by listing + parsing filenames.
+ * Scans three sources to find the max and returns max + 1:
+ * 1. reports/*.md — filename prefix `{num}-`
+ * 2. data/applications/*.md — `#` column of each table row
+ * 3. batch/tracker-additions/*.tsv — first tab-separated column (pending)
+ * batch/tracker-additions/merged/ — same, already consumed
+ *
+ * Why all three? Same-day batches can advance the counter without writing a
+ * report (e.g., SKIP entries skip PDF + report). Deriving from reports/ alone
+ * causes ID collisions when a later subagent picks a number already used in
+ * a tracker row or TSV. Scanning all three sources is O(N) on a small
+ * directory and eliminates the collision class.
  *
  * Usage:
  * job-forge next-num # prints e.g. "521"
- * job-forge next-num --padded # prints e.g. "521" (default, already padded)
  * job-forge next-num --raw # prints e.g. "521" without padding
  */
 
- import { readdirSync, existsSync } from 'fs';
+ import { readdirSync, readFileSync, existsSync, statSync } from 'fs';
  import { join } from 'path';
 
  const PROJECT_DIR = process.env.JOB_FORGE_PROJECT || process.cwd();
  const REPORTS_DIR = join(PROJECT_DIR, 'reports');
+ const APPS_DIR = join(PROJECT_DIR, 'data', 'applications');
+ const TSV_DIR = join(PROJECT_DIR, 'batch', 'tracker-additions');
+ const TSV_MERGED_DIR = join(TSV_DIR, 'merged');
  const RAW = process.argv.includes('--raw');
 
  let max = 0;
+
+ // 1. reports/*.md
  if (existsSync(REPORTS_DIR)) {
  for (const f of readdirSync(REPORTS_DIR)) {
  if (!f.endsWith('.md')) continue;
@@ -29,5 +42,47 @@ if (existsSync(REPORTS_DIR)) {
  }
  }
 
+ // 2. data/applications/*.md — first `|` column of each table row
+ if (existsSync(APPS_DIR)) {
+ for (const f of readdirSync(APPS_DIR)) {
+ if (!f.endsWith('.md')) continue;
+ const full = join(APPS_DIR, f);
+ if (!statSync(full).isFile()) continue;
+ const content = readFileSync(full, 'utf-8');
+ for (const line of content.split('\n')) {
+ // Match: "| 756 | 2026-04-18 | ..." — integer in first cell
+ const m = line.match(/^\|\s*(\d+)\s*\|/);
+ if (!m) continue;
+ const n = parseInt(m[1], 10);
+ if (n > max) max = n;
+ }
+ }
+ }
+
+ // 3. batch/tracker-additions/*.tsv (pending) + merged/*.tsv
+ for (const dir of [TSV_DIR, TSV_MERGED_DIR]) {
+ if (!existsSync(dir)) continue;
+ for (const f of readdirSync(dir)) {
+ if (!f.endsWith('.tsv')) continue;
+ const full = join(dir, f);
+ if (!statSync(full).isFile()) continue;
+ // Prefer the filename prefix (always present and canonical) over TSV
+ // contents — avoids reading the file for the common case.
+ const mName = f.match(/^(\d+)-/);
+ if (mName) {
+ const n = parseInt(mName[1], 10);
+ if (n > max) max = n;
+ continue;
+ }
+ // Fallback: parse first column of first non-empty line
+ const content = readFileSync(full, 'utf-8');
+ const firstLine = content.split('\n').find(l => l.trim().length > 0);
+ if (!firstLine) continue;
+ const cell = firstLine.split('\t')[0];
+ const n = parseInt(cell, 10);
+ if (!Number.isNaN(n) && n > max) max = n;
+ }
+ }
+
  const next = max + 1;
  console.log(RAW ? String(next) : String(next).padStart(3, '0'));
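
To see which day-file lines the scan above actually counts, the row pattern and the padding call can be exercised in isolation. A minimal sketch: the regular expression and the `padStart` call are copied from the script, while `rowNumber` and the sample header row are illustrative assumptions.

```javascript
// Same pattern the day-file scan uses: an integer in the first table cell.
const ROW_NUMBER = /^\|\s*(\d+)\s*\|/;

// Hypothetical wrapper for illustration; returns null when the line is not a data row.
function rowNumber(line) {
  const m = line.match(ROW_NUMBER);
  return m ? parseInt(m[1], 10) : null;
}

console.log(rowNumber('| 756 | 2026-04-18 | ...')); // 756
console.log(rowNumber('| # | Date | Company |'));   // null (header row)
console.log(rowNumber('|---|---|---|'));            // null (separator row)

// Padding only adds leading zeros; it never truncates longer numbers.
console.log(String(7).padStart(3, '0'));    // "007"
console.log(String(1000).padStart(3, '0')); // "1000"
```

Because the pattern is anchored at the start of the line, an indented table row would not be counted.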