npm - job-forge - Versions diffs - 2.1.0 → 2.2.0 - Mend

job-forge 2.1.0 → 2.2.0

This diff shows the contents of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.

Files changed (7)

package/.cursor/rules/main.mdc CHANGED

@@ -10,7 +10,13 @@ alwaysApply: true
 The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
-2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+   - `data/pipeline.md`
+   - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+   - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+   - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+   If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
 3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -475,7 +481,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
 - Output in `output/` (gitignored), Reports in `reports/`
 - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
 - Batch in `batch/` (gitignored except scripts and prompt)
-- Report numbering: sequential 3-digit zero-padded, max existing + 1
+- Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
 - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
 - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
 - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
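
The expanded Hard Limit 2 check in the hunk above can be sketched in code. This is a minimal illustration, not part of job-forge: `findPriorApplies`, the sample file names, and the row and column layouts are all assumptions made for the example. It shows why a same-day Apply that survives only in `merged/` is still caught when all four locations are searched.

```javascript
// Hypothetical helper, not part of job-forge. Each source is passed in as
// { path, text } so the sketch stays independent of the filesystem.
function findPriorApplies(sources, { url, company, role }) {
  const hits = [];
  const needleCompany = company.toLowerCase();
  const needleRole = role.toLowerCase();
  for (const { path, text } of sources) {
    for (const line of text.split('\n')) {
      const lower = line.toLowerCase();
      // Assumption: an applied outcome appears as the word "applied" (any case)
      // on the same line as the URL or the company + role pair.
      if (!/\bapplied\b/.test(lower)) continue;
      const sameUrl = url !== '' && line.includes(url);
      const sameRole = lower.includes(needleCompany) && lower.includes(needleRole);
      if (sameUrl || sameRole) {
        hits.push(path);
        break; // one hit per source is enough to skip the dispatch
      }
    }
  }
  return hits;
}

// Made-up sample data covering the four locations named in Hard Limit 2.
const sources = [
  { path: 'data/pipeline.md', text: '- [ ] https://example.com/jobs/42' },
  { path: 'data/applications/2026-04-18.md', text: '| 756 | 2026-04-18 | Acme | Staff Engineer | Evaluated |' },
  { path: 'batch/tracker-additions/757-acme.tsv', text: '757\t2026-04-18\tAcme\tPlatform Engineer\tSKIP' },
  { path: 'batch/tracker-additions/merged/758-acme.tsv', text: '758\t2026-04-18\tAcme\tStaff Engineer\tApplied' },
];

const hits = findPriorApplies(sources, {
  url: 'https://example.com/jobs/42',
  company: 'Acme',
  role: 'Staff Engineer',
});
console.log(hits.length > 0 ? `skip: ${hits.join(', ')}` : 'dispatch');
```

In this sample the day file and the pending TSV show no applied outcome, so a check limited to those would dispatch a second application; only the merged TSV reveals the earlier one.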

package/AGENTS.md CHANGED

@@ -5,7 +5,13 @@
 The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
-2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+   - `data/pipeline.md`
+   - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+   - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+   - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+   If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
 3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
 - Output in `output/` (gitignored), Reports in `reports/`
 - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
 - Batch in `batch/` (gitignored except scripts and prompt)
-- Report numbering: sequential 3-digit zero-padded, max existing + 1
+- Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
 - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
 - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
 - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).

package/CLAUDE.md CHANGED

@@ -5,7 +5,13 @@
 The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
-2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+   - `data/pipeline.md`
+   - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+   - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+   - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+   If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
 3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
 - Output in `output/` (gitignored), Reports in `reports/`
 - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
 - Batch in `batch/` (gitignored except scripts and prompt)
-- Report numbering: sequential 3-digit zero-padded, max existing + 1
+- Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
 - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
 - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
 - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).

package/iso/instructions.md CHANGED

@@ -5,7 +5,13 @@
 The Hard Limits below are non-negotiable numeric rules. If you catch yourself about to violate one, STOP and restructure.
 1. **Max parallel subagents: 2.** Never emit 3+ `task` tool calls in a single message. For N jobs, run `ceil(N/2)` sequential rounds of 2. No exceptions — not for "urgent", not for "the user asked for 10".
-2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep `data/pipeline.md` and today's `data/applications/*.md` for the URL and for `company+role`. If already APPLIED, skip that job and do not dispatch.
+2. **Max 1 application per company+role.** Before every `task` dispatch for `apply`, Grep ALL of the following for the URL and for `company+role`:
+   - `data/pipeline.md`
+   - all `data/applications/*.md` day files (not just today's — prior-day Applies count too)
+   - `batch/tracker-additions/*.tsv` (pending outcomes not yet merged)
+   - `batch/tracker-additions/merged/*.tsv` (outcomes already consumed into day files — catches same-day earlier-batch Applies that merge collapsed into an existing row)
+   If any source shows an APPLIED / Applied outcome for this URL or company+role, skip that job and do not dispatch. **Why merged/ matters**: when two batches in the same day target the same role, `npx job-forge merge` updates the existing day-file row instead of creating a new one — so `grep data/applications/*.md` for the higher report number misses the earlier apply. The merged TSV is the only place the newer attempt's breadcrumb remains.
 3. **Always clean Geometra sessions before dispatching.** Before every round of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round. The disconnect is a no-op when the pool is empty.
 4. **Orchestrator does NOT fill forms.** This session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` when handling a multi-job request. If you need to, it means you MUST have delegated — `task` out the remaining work instead.
 5. **Re-dispatch only AFTER the previous subagent returns.** Never fire the same company's `task` twice while the first is still in-flight. Wait for the return value, then decide if a retry is warranted.
@@ -470,7 +476,7 @@ To check or modify MCP settings, edit `opencode.json` in the project root.
 - Output in `output/` (gitignored), Reports in `reports/`
 - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
 - Batch in `batch/` (gitignored except scripts and prompt)
-- Report numbering: sequential 3-digit zero-padded, max existing + 1
+- Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
 - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplications.
 - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
 - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).

package/modes/pipeline.md CHANGED

@@ -6,7 +6,7 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
 1. **Read** `data/pipeline.md` → find `- [ ]` items in the "Pending" section
 2. **For each pending URL**:
-   a. Calculate the next sequential `REPORT_NUM` (read `reports/`, take the highest number + 1)
+   a. Calculate the next sequential `REPORT_NUM` by running `npx job-forge next-num` (scans `reports/`, day file `#` columns, and `batch/tracker-additions/` — do NOT derive from `reports/` alone)
    b. **Extract JD** using Geometra MCP (geometra_connect + geometra_page_model) → WebFetch → WebSearch
    c. If the URL is not accessible → mark as `- [!]` with a note and continue
    d. **Run full auto-pipeline**: A-F Evaluation → Report .md → PDF (if score >= 3.0, per `_shared.md` thresholds) → Draft answers (if score >= 3.5) → Tracker
@@ -45,9 +45,13 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
 ## Automatic Numbering
-1. List all files in `reports/`
-2. Extract the number from the prefix (e.g., `142-medispend...` → 142)
-3. New number = highest found + 1
+Run `npx job-forge next-num` — returns the next 3-digit zero-padded report number. The CLI scans:
+1. `reports/*.md` filename prefixes
+2. The `#` column of every `data/applications/*.md` day file
+3. The `{num}` prefix of every `batch/tracker-additions/*.tsv` (pending + merged)
+Takes the max across all three sources and adds 1. Do NOT derive from any single source — prior-day SKIPs and other non-report tracker entries advance the counter but never write to `reports/`, so `ls reports/` alone misses them.
 ## Source Synchronization
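
The "max across all three sources" rule from the Automatic Numbering hunk above can be sketched as a pure function. This is a rough sketch, not the shipped script: `nextReportNumber`, its helper names, and the sample file names and rows are hypothetical, and applying the `{num}-` prefix regex to report filenames is an assumption based on the comment in `next-num.mjs`.

```javascript
// Hypothetical pure-function version of the scan; inputs are plain arrays so
// it runs without a project directory.
function maxFromNamePrefixes(names) {
  let max = 0;
  for (const name of names) {
    const m = name.match(/^(\d+)-/); // "{num}-" filename prefix
    if (m) max = Math.max(max, parseInt(m[1], 10));
  }
  return max;
}

function maxFromDayFile(text) {
  let max = 0;
  for (const line of text.split('\n')) {
    const m = line.match(/^\|\s*(\d+)\s*\|/); // integer in the first table cell
    if (m) max = Math.max(max, parseInt(m[1], 10));
  }
  return max;
}

function nextReportNumber({ reportNames, dayFiles, tsvNames }) {
  const max = Math.max(
    maxFromNamePrefixes(reportNames),
    ...dayFiles.map(maxFromDayFile),
    maxFromNamePrefixes(tsvNames),
  );
  return String(max + 1).padStart(3, '0');
}

// Made-up data: report 143 was a SKIP, so it exists only as a day-file row,
// and 144 exists only as a pending TSV.
const reportNames = ['141-example.md', '142-example.md'];
const dayFiles = ['| # | Date | Company |\n|---|---|---|\n| 142 | 2026-04-18 | Example |\n| 143 | 2026-04-18 | Other |'];
const tsvNames = ['144-other.tsv'];

const fromAllSources = nextReportNumber({ reportNames, dayFiles, tsvNames });
const fromReportsOnly = nextReportNumber({ reportNames, dayFiles: [], tsvNames: [] });
console.log(fromAllSources);  // 145
console.log(fromReportsOnly); // 143, which collides with the existing row 143
```

With these inputs, deriving the number from `reports/` alone reuses 143, which is the collision the diff describes.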

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "job-forge",
-  "version": "2.1.0",
+  "version": "2.2.0",
   "description": "AI-powered job search pipeline built on opencode",
   "type": "module",
   "bin": {

package/scripts/next-num.mjs CHANGED

@@ -2,23 +2,36 @@
 /**
  * next-num — print the next sequential report number (3-digit zero-padded).
  *
- * Reads reports/ and returns max(existing) + 1. Used by agents instead of
- * having the model figure this out by listing + parsing filenames.
+ * Scans three sources to find the max and returns max + 1:
+ *   1. reports/*.md                       — filename prefix `{num}-`
+ *   2. data/applications/*.md             — `#` column of each table row
+ *   3. batch/tracker-additions/*.tsv      — first tab-separated column (pending)
+ *      batch/tracker-additions/merged/    — same, already consumed
+ *
+ * Why all three? Same-day batches can advance the counter without writing a
+ * report (e.g., SKIP entries skip PDF + report). Deriving from reports/ alone
+ * causes ID collisions when a later subagent picks a number already used in
+ * a tracker row or TSV. Scanning all three sources is O(N) on a small
+ * directory and eliminates the collision class.
  *
  * Usage:
  *   job-forge next-num              # prints e.g. "521"
- *   job-forge next-num --padded     # prints e.g. "521" (default, already padded)
  *   job-forge next-num --raw        # prints e.g. "521" without padding
  */
-import { readdirSync, existsSync } from 'fs';
+import { readdirSync, readFileSync, existsSync, statSync } from 'fs';
 import { join } from 'path';
 const PROJECT_DIR = process.env.JOB_FORGE_PROJECT || process.cwd();
 const REPORTS_DIR = join(PROJECT_DIR, 'reports');
+const APPS_DIR = join(PROJECT_DIR, 'data', 'applications');
+const TSV_DIR = join(PROJECT_DIR, 'batch', 'tracker-additions');
+const TSV_MERGED_DIR = join(TSV_DIR, 'merged');
 const RAW = process.argv.includes('--raw');
 let max = 0;
+// 1. reports/*.md
 if (existsSync(REPORTS_DIR)) {
   for (const f of readdirSync(REPORTS_DIR)) {
     if (!f.endsWith('.md')) continue;
@@ -29,5 +42,47 @@ if (existsSync(REPORTS_DIR)) {
   }
 }
+// 2. data/applications/*.md — first `|` column of each table row
+if (existsSync(APPS_DIR)) {
+  for (const f of readdirSync(APPS_DIR)) {
+    if (!f.endsWith('.md')) continue;
+    const full = join(APPS_DIR, f);
+    if (!statSync(full).isFile()) continue;
+    const content = readFileSync(full, 'utf-8');
+    for (const line of content.split('\n')) {
+      // Match: "| 756 | 2026-04-18 | ..." — integer in first cell
+      const m = line.match(/^\|\s*(\d+)\s*\|/);
+      if (!m) continue;
+      const n = parseInt(m[1], 10);
+      if (n > max) max = n;
+    }
+  }
+}
+// 3. batch/tracker-additions/*.tsv (pending) + merged/*.tsv
+for (const dir of [TSV_DIR, TSV_MERGED_DIR]) {
+  if (!existsSync(dir)) continue;
+  for (const f of readdirSync(dir)) {
+    if (!f.endsWith('.tsv')) continue;
+    const full = join(dir, f);
+    if (!statSync(full).isFile()) continue;
+    // Prefer the filename prefix (always present and canonical) over TSV
+    // contents — avoids reading the file for the common case.
+    const mName = f.match(/^(\d+)-/);
+    if (mName) {
+      const n = parseInt(mName[1], 10);
+      if (n > max) max = n;
+      continue;
+    }
+    // Fallback: parse first column of first non-empty line
+    const content = readFileSync(full, 'utf-8');
+    const firstLine = content.split('\n').find(l => l.trim().length > 0);
+    if (!firstLine) continue;
+    const cell = firstLine.split('\t')[0];
+    const n = parseInt(cell, 10);
+    if (!Number.isNaN(n) && n > max) max = n;
+  }
+}
 const next = max + 1;
 console.log(RAW ? String(next) : String(next).padStart(3, '0'));
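
Two details of the script above are easier to see in isolation: the TSV branch prefers the filename prefix and reads the file only as a fallback, and the output is padded to three digits unless `--raw` is passed. The sketch below restates both as small functions. The names `numberFromTsv` and `formatReportNumber` and the sample inputs are hypothetical, chosen only for illustration.

```javascript
// Hypothetical helper mirroring the TSV branch: filename prefix first,
// file contents only as a fallback.
function numberFromTsv(fileName, content) {
  const byName = fileName.match(/^(\d+)-/);
  if (byName) return parseInt(byName[1], 10);
  // Fallback: first tab-separated cell of the first non-empty line.
  const firstLine = content.split('\n').find((l) => l.trim().length > 0);
  if (!firstLine) return null;
  const n = parseInt(firstLine.split('\t')[0], 10);
  return Number.isNaN(n) ? null : n;
}

// Mirrors the final console.log: pad to 3 digits unless raw output is requested.
function formatReportNumber(n, raw = false) {
  return raw ? String(n) : String(n).padStart(3, '0');
}

console.log(numberFromTsv('520-acme.tsv', ''));                        // 520
console.log(numberFromTsv('pending.tsv', '\n519\tAcme'));              // 519
console.log(numberFromTsv('pending.tsv', 'num\tcompany\n519\tAcme'));  // null
console.log(formatReportNumber(7));       // 007
console.log(formatReportNumber(7, true)); // 7
console.log(formatReportNumber(1000));    // 1000
```

Because only the first non-empty line is parsed in the fallback, a TSV without a numeric filename prefix whose first line is a header contributes nothing to the maximum. Numbers of four or more digits pass through `padStart` unchanged.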