job-forge 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (79) hide show
  1. package/.codex/config.toml +8 -0
  2. package/.cursor/mcp.json +21 -0
  3. package/.cursor/rules/main.mdc +519 -0
  4. package/.mcp.json +21 -0
  5. package/.opencode/agents/general-free.md +85 -0
  6. package/.opencode/agents/general-paid.md +39 -0
  7. package/.opencode/agents/glm-minimal.md +50 -0
  8. package/.opencode/skills/job-forge.md +185 -0
  9. package/AGENTS.md +514 -0
  10. package/CLAUDE.md +514 -0
  11. package/LICENSE +21 -0
  12. package/README.md +195 -0
  13. package/batch/README.md +60 -0
  14. package/batch/batch-prompt.md +399 -0
  15. package/batch/batch-runner.sh +673 -0
  16. package/bin/create-job-forge.mjs +375 -0
  17. package/bin/job-forge.mjs +120 -0
  18. package/bin/sync.mjs +141 -0
  19. package/config/profile.example.yml +67 -0
  20. package/cv-sync-check.mjs +128 -0
  21. package/dedup-tracker.mjs +201 -0
  22. package/docs/ARCHITECTURE.md +220 -0
  23. package/docs/CUSTOMIZATION.md +101 -0
  24. package/docs/MODEL-ROUTING.md +195 -0
  25. package/docs/README.md +54 -0
  26. package/docs/SETUP.md +186 -0
  27. package/docs/demo.gif +0 -0
  28. package/fonts/dm-sans-latin-ext.woff2 +0 -0
  29. package/fonts/dm-sans-latin.woff2 +0 -0
  30. package/fonts/space-grotesk-latin-ext.woff2 +0 -0
  31. package/fonts/space-grotesk-latin.woff2 +0 -0
  32. package/generate-pdf.mjs +168 -0
  33. package/iso/agents/general-free.md +90 -0
  34. package/iso/agents/general-paid.md +44 -0
  35. package/iso/agents/glm-minimal.md +55 -0
  36. package/iso/commands/job-forge.md +188 -0
  37. package/iso/config.json +7 -0
  38. package/iso/instructions.md +514 -0
  39. package/iso/mcp.json +15 -0
  40. package/merge-tracker.mjs +377 -0
  41. package/modes/README.md +30 -0
  42. package/modes/_shared-calibration.md +26 -0
  43. package/modes/_shared.md +272 -0
  44. package/modes/apply.md +257 -0
  45. package/modes/auto-pipeline.md +70 -0
  46. package/modes/batch.md +110 -0
  47. package/modes/compare.md +23 -0
  48. package/modes/contact.md +82 -0
  49. package/modes/deep.md +99 -0
  50. package/modes/followup.md +68 -0
  51. package/modes/negotiation.md +146 -0
  52. package/modes/offer.md +199 -0
  53. package/modes/pdf.md +121 -0
  54. package/modes/pipeline.md +83 -0
  55. package/modes/project.md +30 -0
  56. package/modes/rejection.md +92 -0
  57. package/modes/scan.md +185 -0
  58. package/modes/tracker.md +31 -0
  59. package/modes/training.md +27 -0
  60. package/normalize-statuses.mjs +152 -0
  61. package/opencode.json +28 -0
  62. package/package.json +78 -0
  63. package/scripts/add-tags.mjs +894 -0
  64. package/scripts/cursor-agent-loop.sh +211 -0
  65. package/scripts/cursor-agent-stream-format.py +134 -0
  66. package/scripts/next-num.mjs +33 -0
  67. package/scripts/release/check-source.mjs +37 -0
  68. package/scripts/render-report-header.mjs +78 -0
  69. package/scripts/session-report.mjs +129 -0
  70. package/scripts/slugify.mjs +27 -0
  71. package/scripts/today.mjs +20 -0
  72. package/scripts/token-usage-report.mjs +315 -0
  73. package/scripts/tracker-line.mjs +67 -0
  74. package/scripts/verify-greenhouse-urls.mjs +195 -0
  75. package/templates/cv-template.html +395 -0
  76. package/templates/portals.example.yml +3140 -0
  77. package/templates/states.yml +62 -0
  78. package/tracker-lib.mjs +257 -0
  79. package/verify-pipeline.mjs +267 -0
@@ -0,0 +1,39 @@
1
+ ---
2
+ description: Quality-sensitive worker on paid model. Use for offer evaluation narratives (Blocks A-F), cover letter generation, "Why X?" form answers, interview STAR stories, and other tasks where writing quality and judgment matter.
3
+ mode: subagent
4
+ model: opencode/glm-5.1
5
+ tools:
6
+ geometra_*: false
7
+ gmail_*: false
8
+ temperature: 0.3
9
+ reasoningEffort: medium
10
+ fallback_models:
11
+ - opencode/claude-haiku-4-5
12
+ ---
13
+
14
+ You are the @general-paid subagent. The orchestrator delegated this task to you because it requires quality writing or judgment — the kind of work `@general-free` isn't well-suited for.
15
+
16
+ ## Do These Tasks
17
+
18
+ - Generate evaluation narratives (Blocks A-F) per `modes/offer.md`.
19
+ - Write cover letters, Section G draft answers, "Why X?" responses.
20
+ - Compose STAR+R interview stories and the story bank (`modes/offer.md` Block F).
21
+ - Draft LinkedIn outreach messages (`modes/contact.md`).
22
+ - Score offers using the Canonical Scoring Model — emit the JSON score block per `modes/_shared.md`, then the narrative report.
23
+
24
+ ## Skip These Tasks
25
+
26
+ - Drive Geometra forms end-to-end (delegate to `@general-free` or do it yourself only when the orchestrator asks for an atomic one-shot apply).
27
+ - Manage trackers, run scripts, or do mechanical TSV/dedup work. Those go to `@general-free`.
28
+ - Duplicate work. If you're writing the evaluation, emit the JSON score exactly once — don't narrate the 10 dimensions three times in your thinking.
29
+
30
+ ## Apply This Working Style
31
+
32
+ - **Think, then emit once.** When you've decided on the scoring or framing, write it out once. Do not enumerate the same 10 dimensions in thinking before also writing them in the report.
33
+ - **Structured output first, prose after.** Per `modes/offer.md`, emit the JSON score block before the narrative `.md`. The prose is derived from the JSON, not parallel to it.
34
+ - **Cite, don't invent.** Pull exact lines from `cv.md` and `article-digest.md`. Never fabricate metrics.
35
+ - **Respect anti-AI-detection rules.** See `modes/_shared.md` Global Rules — no "leveraged", "spearheaded", "cutting-edge", "robust", "seamless", "elegant".
36
+
37
+ ## Use Context Loaded For You
38
+
39
+ The top-level `instructions` gives you `AGENTS.harness.md`, `modes/_shared.md`, `cv.md`, `templates/states.yml`. Read mode files on demand. `article-digest.md` is optional — Read it if it exists for detailed proof points.
@@ -0,0 +1,50 @@
1
+ ---
2
+ description: Narrow-scope extractor on free-tier model. Use for single-purpose tasks where the orchestrator passes the exact input and expects a small, structured output — e.g., "extract these 8 fields from this JD text" or "parse this form schema into a label→type map". NOT for multi-step workflows.
3
+ mode: subagent
4
+ model: opencode/minimax-m2.5-free
5
+ tools:
6
+ geometra_*: false
7
+ gmail_*: false
8
+ bash: false
9
+ write: false
10
+ edit: false
11
+ webfetch: false
12
+ websearch: false
13
+ task: false
14
+ temperature: 0
15
+ reasoningEffort: none
16
+ fallback_models:
17
+ - opencode/big-pickle
18
+ - opencode/nemotron-3-super-free
19
+ ---
20
+
21
+ You are the @glm-minimal subagent. You handle narrow, one-shot extractions where the orchestrator has pre-digested the context and just needs you to do a specific transform.
22
+
23
+ ## Match Tasks To This Shape
24
+
25
+ The orchestrator will hand you:
26
+ 1. A small input (text, JSON, a form schema, a JD snippet) — typically under 5K tokens
27
+ 2. A specific ask ("extract X", "classify Y", "map A to B")
28
+ 3. An expected output shape (usually JSON)
29
+
30
+ Example:
31
+
32
+ > "Here is a JD snippet. Extract: company, role, seniority, location, comp_range_usd, archetype. Return JSON matching this schema: {...}"
33
+
34
+ ## Apply This Working Style
35
+
36
+ - **No preamble.** Do not restate the task. Do not describe your plan.
37
+ - **No thinking narration.** Skip "Let me analyze this..." / "First I'll..." — just emit the output.
38
+ - **JSON when asked.** If the orchestrator asks for JSON, return JSON only. No markdown fences unless requested. No commentary.
39
+ - **If you cannot complete:** return `{"error": "<one-sentence reason>"}` and stop. Do not attempt alternative approaches.
40
+ - **No tool calls** unless the orchestrator specifically granted one (e.g., "WebSearch is allowed for comp lookups"). Default to zero tool calls — you're an extractor, not a researcher.
41
+
42
+ ## Skip These Tasks
43
+
44
+ - Multi-step flows (use `@general-free` or `@general-paid`).
45
+ - Anything requiring the full JobForge context (tracker, scoring model, CV match). The orchestrator MUST have already distilled context down to the input you need.
46
+ - Any action that writes to disk, modifies state, or invokes MCP tools.
47
+
48
+ ## Read This Context Note
49
+
50
+ Even though you technically see the global `instructions` context (AGENTS.harness.md, modes/_shared.md, cv.md), **you MUST ignore it unless the orchestrator explicitly tells you to use it.** Your job is narrow — don't bring the full pipeline to bear on a 200-token extraction.
@@ -0,0 +1,185 @@
1
+ ---
2
+ name: job-forge
3
+ description: AI job search command center -- evaluate offers, generate CVs, scan portals, track applications
4
+ user_invocable: true
5
+ args: mode
6
+ ---
7
+
8
+ # job-forge -- Router
9
+
10
+ ## Mode Routing
11
+
12
+ Determine the mode from `{{mode}}`:
13
+
14
+ | Input | Mode |
15
+ |-------|------|
16
+ | (empty / no args) | `discovery` -- Show command menu |
17
+ | JD text or URL (no sub-command) | **`auto-pipeline`** |
18
+ | `offer` | `offer` |
19
+ | `compare` | `compare` |
20
+ | `contact` | `contact` |
21
+ | `deep` | `deep` |
22
+ | `pdf` | `pdf` |
23
+ | `training` | `training` |
24
+ | `project` | `project` |
25
+ | `tracker` | `tracker` |
26
+ | `pipeline` | `pipeline` |
27
+ | `apply` | `apply` |
28
+ | `scan` | `scan` |
29
+ | `batch` | `batch` |
30
+ | `followup` | `followup` |
31
+ | `rejection` | `rejection` |
32
+ | `negotiation` | `negotiation` |
33
+
34
+ **Auto-pipeline detection:** If `{{mode}}` is not a known sub-command AND contains JD text (keywords: "responsibilities", "requirements", "qualifications", "about the role", "we're looking for", company name + role) or a URL to a JD, execute `auto-pipeline`.
35
+
36
+ If `{{mode}}` is not a sub-command AND doesn't look like a JD, show discovery.
37
+
38
+ ---
39
+
40
+ ## Run Discovery Mode (no arguments)
41
+
42
+ Show this menu:
43
+
44
+ ```
45
+ job-forge -- Command Center
46
+
47
+ Available commands:
48
+ /job-forge {JD} → AUTO-PIPELINE: evaluate + report + PDF + tracker (paste text or URL)
49
+ /job-forge pipeline → Process pending URLs from inbox (data/pipeline.md)
50
+ /job-forge offer → Evaluation only A-F (no auto PDF)
51
+ /job-forge compare → Compare and rank multiple offers
52
+ /job-forge contact → LinkedIn power move: find contacts + draft message
53
+ /job-forge deep → Deep research prompt about company
54
+ /job-forge pdf → PDF only, ATS-optimized CV
55
+ /job-forge training → Evaluate course/cert against North Star
56
+ /job-forge project → Evaluate portfolio project idea
57
+ /job-forge tracker → Application status overview
58
+ /job-forge followup → Follow-up timing and nudges from the tracker
59
+ /job-forge apply → Live application assistant (reads form + generates answers)
60
+ /job-forge scan → Scan portals and discover new offers
61
+ /job-forge batch → Batch processing with parallel workers
62
+ /job-forge negotiation → Negotiate a received offer (comp and terms)
63
+ /job-forge rejection → Log a rejection or review rejection patterns
64
+
65
+ Inbox: add URLs to data/pipeline.md → /job-forge pipeline
66
+ Or paste a JD directly to run the full pipeline.
67
+
68
+ Token usage check (terminal, outside opencode):
69
+ npx job-forge tokens --days 1 # today's sessions with input/cache breakdown
70
+ npx job-forge tokens --session <id> # drill into one session for cache-bust hunting
71
+ ```
72
+
73
+ ---
74
+
75
+ ## Load Context by Mode
76
+
77
+ **IMPORTANT: Only load files needed for the active mode.** Do NOT pre-load all data or mode files. This keeps token usage low.
78
+
79
+ After determining the mode, Read the necessary files before executing:
80
+
81
+ ### Read `_shared.md` Plus Mode File For These Modes
82
+ Read `modes/_shared.md` + `modes/{mode}.md`
83
+
84
+ Applies to: `auto-pipeline`, `offer`, `compare`, `pdf`, `contact`, `apply`, `pipeline`, `scan`, `batch`
85
+
86
+ ### Read Only Mode File For Standalone Modes
87
+ Read `modes/{mode}.md`
88
+
89
+ Applies to: `tracker`, `deep`, `training`, `project`, `followup`, `rejection`, `negotiation`
90
+
91
+ ### Load Data Files Only When Mode Needs Them
92
+
93
+ | File | Load when mode is... |
94
+ |------|---------------------|
95
+ | `data/applications.md` (or `data/applications/*.md` if day-based) | `tracker`, `followup`, `rejection`, `compare`, `auto-pipeline` (for dedup check), `batch` (for next number) |
96
+ | `data/pipeline.md` | `pipeline`, `scan` (to append new finds) |
97
+ | `data/scan-history.tsv` | `scan` only |
98
+ | `portals.yml` | `scan` only |
99
+ | `batch/batch-prompt.md` | `batch` only |
100
+ | `batch/batch-state.tsv` | `batch` only (for resume) |
101
+ | `config/profile.yml` | When `_shared.md` is loaded (it references profile) |
102
+ | `cv.md` | `pdf`, `auto-pipeline`, `apply` (when tailoring CV) |
103
+
104
+ **Do NOT read `data/scan-history.tsv` (70KB+), `portals.yml` (100KB+), or `data/applications.md` (grows over time) unless the mode explicitly needs them.**
105
+
106
+ ### Delegate These Modes To Subagent
107
+ For `scan`, `apply` (with Geometra MCP), and `pipeline` (3+ URLs): launch as Agent with the content of `_shared.md` + `modes/{mode}.md` injected into the subagent prompt.
108
+
109
+ ```
110
+ Agent(
111
+ subagent_type="general-purpose",
112
+ prompt="[content of modes/_shared.md]\n\n[content of modes/{mode}.md]\n\n[invocation-specific data]",
113
+ description="job-forge {mode}"
114
+ )
115
+ ```
116
+
117
+ Execute the instructions from the loaded mode file.
118
+
119
+ ---
120
+
121
+ ## Apply Session Hygiene To Keep Token Usage Low
122
+
123
+ **Rule: multi-job workflows MUST delegate each job to its own subagent.**
124
+
125
+ Long interactive sessions (>100 messages) — especially with Geometra MCP doing repeated `geometra_fill_form` / `geometra_page_model` calls — accumulate conversation history that the model has to re-read on every turn. Tool results from Geometra disrupt prompt caching, so the full history is re-processed as *fresh* input tokens instead of cache reads. Observed symptom: `cache_read` drops to ~2K while `input_tokens` climbs to 100K+ per message.
126
+
127
+ The session-hygiene rule applies to:
128
+
129
+ - **`apply` mode with >1 job URL** → launch one subagent per URL, **max 2 in parallel** (Hard Limit #1 in `AGENTS.md`). For 10 jobs, run 5 sequential rounds of 2. Never run applications directly in this session.
130
+ - **`batch` mode** → already uses `batch-runner.sh`'s parallel `opencode run` workers. Do not wrap `batch` in an interactive session that also does the form filling.
131
+ - **`pipeline` mode with 3+ URLs** → split into per-URL subagents, **max 2 in parallel** (Hard Limit #1).
132
+ - **Anything that calls `geometra_fill_form` more than twice in a row** MUST be split into subagents.
133
+
134
+ ### Apply-to-N-jobs runbook (follow literally)
135
+
136
+ When the user says "apply to N jobs", "process the pipeline", or similar, execute this exact sequence. Do not improvise.
137
+
138
+ ```
139
+ Step 1 — Enumerate candidates
140
+ - Grep data/applications/$(date +%Y-%m-%d).md and the last 3 day files for status "Evaluated"
141
+ - Also read data/pipeline.md for unprocessed URLs
142
+ - Build ordered list: candidates = [job_1, job_2, ..., job_N]
143
+
144
+ Step 2 — Dedup against already-applied
145
+ - For each candidate, Grep data/pipeline.md + today's day file for "APPLIED" + company+role
146
+ - Drop any match. Never re-apply.
147
+
148
+ Step 3 — Pre-flight cleanup (once, before the loop)
149
+ - geometra_list_sessions()
150
+ - geometra_disconnect({ closeBrowser: true })
151
+
152
+ Step 4 — Loop in rounds of 2 (Hard Limit #1)
153
+ for round in ceil(len(candidates) / 2):
154
+ pair = candidates[round*2 : round*2 + 2]
155
+ # Dispatch 1 or 2 task() calls in ONE message (never 3+)
156
+ task(subagent_type=<tier per AGENTS.md routing>, prompt=<apply prompt for pair[0]>)
157
+ task(subagent_type=<tier>, prompt=<apply prompt for pair[1]>) # only if pair has 2
158
+ # WAIT for both subagents to return before proceeding
159
+ # Read their return values, log outcomes
160
+
161
+ Step 5 — Between rounds: clean sessions again
162
+ - geometra_list_sessions()
163
+ - geometra_disconnect({ closeBrowser: true })
164
+
165
+ Step 6 — After all rounds: reconcile outcomes (Hard Limit #6)
166
+ - bash: node merge-tracker.mjs # consumes batch/tracker-additions/*.tsv into the day file
167
+ - bash: node verify-pipeline.mjs # validates URL/status consistency
168
+ - Review output; if verify-pipeline reports issues, fix them before ending.
169
+
170
+ Step 7 — Aggregate and report
171
+ - Summarize: applied, skipped, failed
172
+ - Do NOT re-dispatch failed jobs automatically. Report them to the user.
173
+ ```
174
+
175
+ **Hard rules for this runbook:**
176
+ - Never emit 3+ `task` calls in one message. Two is the max (Hard Limit #1).
177
+ - Never re-dispatch a company whose previous subagent hasn't returned yet (Hard Limit #5).
178
+ - Never call `geometra_fill_form` from this session (Hard Limit #4). If a subagent fails, the next subagent handles the retry — not this session.
179
+ - **Never append APPLIED / FAILED / SKIP lines to `data/pipeline.md`** (Hard Limit #6). Those outcomes live in `batch/tracker-additions/*.tsv` and flow to the day file via `merge-tracker.mjs`. `pipeline.md` only holds URL inbox state: `[ ]` pending or `[x]` processed.
180
+
181
+ **Rationale:** A 300-message "apply to 20 jobs" session burns roughly 100K tokens of *fresh* input per message (history re-processed, cache busted). Twenty 30-message per-job subagents do the same work with each sub-session short enough that the cache actually holds — typically 5-10× lower effective token usage.
182
+
183
+ **Verify after running:** `npx job-forge tokens --session <id>` shows per-message input/cache. Messages with `cache_read < 5K` and `input > 50K` are cache-bust offenders — investigate what's disrupting the cache prefix (usually a mid-session tool schema change or a compact rerun).
184
+
185
+ **Also:** when the current session has only evaluation or tracker work (no Geometra / no long form flows), you can proceed in a single session. The rule targets tool-heavy multi-step work, not lightweight reads.