job-forge 2.14.1 → 2.14.3

@@ -1,7 +1,7 @@
  ---
  description: Procedural worker on free-tier model. Use for form filling via Geometra, tracker updates, TSV merges, scan dedup, OTP retrieval, and other mechanical/scripted tasks where quality-sensitive text generation is NOT required.
  mode: subagent
- model: openrouter/z-ai/glm-4.5-air:free
+ model: opencode/big-pickle
  tools:
  geometra_connect: true
  geometra_page_model: true
@@ -17,7 +17,7 @@ tools:
  temperature: 0.1
  reasoningEffort: minimal
  fallback_models:
- - openrouter/minimax/minimax-m2.5:free
+ - openrouter/z-ai/glm-4.5-air:free
  - openrouter/openai/gpt-oss-20b:free
  - openrouter/nvidia/nemotron-3-nano-30b-a3b:free
  - openrouter/qwen/qwen3-coder:free
@@ -8,10 +8,12 @@ tools:
  temperature: 0.3
  reasoningEffort: medium
  fallback_models:
- - openrouter/nvidia/nemotron-3-super-120b-a12b:free
  - openrouter/openai/gpt-oss-120b:free
+ - openrouter/nvidia/nemotron-3-super-120b-a12b:free
  - openrouter/z-ai/glm-4.5-air:free
  - openrouter/qwen/qwen3-coder:free
+ - openrouter/google/gemma-4-31b-it:free
+ - openrouter/meta-llama/llama-3.3-70b-instruct:free
  ---

  You are the @general-paid subagent. The orchestrator delegated this task to you because it requires quality writing or judgment — the kind of work `@general-free` isn't well-suited for.
@@ -9,9 +9,14 @@
  "openrouter/qwen/qwen3-next-80b-a3b-instruct:free"
  ],
  "retryable_error_patterns": [
- "(?i)\\bvenice\\b.*(?:insufficient|balance|credits?|diem|usd)",
- "(?i)insufficient\\s+(?:usd|diem|credits?|funds?|balance)",
- "(?i)credit.*balance.*too.*low",
- "(?i)(?:temporarily\\s+)?unavailable|overloaded|try\\s+again"
+ "\\bvenice\\b",
+ "insufficient\\s+usd",
+ "insufficient\\s+.*\\s+diem",
+ "diem\\s+balance",
+ "add\\s+credits",
+ "chutes",
+ "insufficient\\s+(?:credits?|funds?|balance)",
+ "credit.*balance.*too.*low",
+ "(?:temporarily\\s+)?unavailable|overloaded|try\\s+again"
  ]
  }
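The new `retryable_error_patterns` drop the inline `(?i)` prefix and split the Venice/diem cases into separate entries. Below is a minimal sketch of how such a list might be applied. It assumes, which this diff does not confirm, that the consumer compiles each entry with JavaScript `RegExp` and supplies the `i` flag itself. `isRetryable` and the sample messages are hypothetical.

```javascript
// Patterns copied from the 2.14.3 config; the doubled backslashes are
// ordinary string escapes, exactly as in the JSON file.
const retryableErrorPatterns = [
  "\\bvenice\\b",
  "insufficient\\s+usd",
  "insufficient\\s+.*\\s+diem",
  "diem\\s+balance",
  "add\\s+credits",
  "chutes",
  "insufficient\\s+(?:credits?|funds?|balance)",
  "credit.*balance.*too.*low",
  "(?:temporarily\\s+)?unavailable|overloaded|try\\s+again",
];

// Assumption: case-insensitivity comes from the caller's flag, not the pattern.
const compiled = retryableErrorPatterns.map((p) => new RegExp(p, "i"));

// Hypothetical helper: true when any pattern matches the error text.
function isRetryable(message) {
  return compiled.some((re) => re.test(message));
}

console.log(isRetryable("[Venice] Insufficient USD or Diem balance")); // true
console.log(isRetryable("Invalid API key")); // false
```

Because `"\\bvenice\\b"` and `"chutes"` now match on the provider name alone, any error that mentions those backends would count as retryable under this reading.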
package/README.md CHANGED
@@ -68,7 +68,7 @@ JobForge turns opencode into a full job search command center. Instead of manual
  | **Portal Scanner** | 45+ companies pre-configured with fuzzy dedup for reposts |
  | **Batch Processing** | Parallel evaluation with `opencode run` workers, with honest verification flagging |
  | **Pipeline Integrity** | Automated merge, dedup, status normalization, health checks |
- | **Cost-Aware Agent Routing** | Three subagents (`@general-free`, `@general-paid`, `@glm-minimal`) with per-task model tiers. On OpenCode, all three default to free OpenRouter models so the harness can run there without paid model spend. See [Subagent Routing in AGENTS.md](AGENTS.md) for the task-to-agent mapping. |
+ | **Cost-Aware Agent Routing** | Three subagents (`@general-free`, `@general-paid`, `@glm-minimal`) with per-task model tiers. On OpenCode, JobForge mixes native free models with free OpenRouter routes so the harness stays no-cost without forcing every task through the same provider. See [Subagent Routing in AGENTS.md](AGENTS.md) for the task-to-agent mapping. |
  | **Automatic Model Fallback** | When a model rate-limits or 5xx's, [`@razroo/opencode-model-fallback`](https://www.npmjs.com/package/@razroo/opencode-model-fallback) rotates the agent through a configured `fallback_models` chain and replays the request. JobForge's OpenCode defaults stay on free models for both primaries and fallbacks. |
  | **Token Cost Visibility** | `job-forge tokens --days 1` for per-session breakdown; `job-forge session-report --since-minutes 60 --log` to flag sessions over budget and append history to `data/token-usage.tsv`. Auto-logged after every batch run. |

@@ -196,6 +196,7 @@ const opencodeCfg = {
  'nvidia/nemotron-nano-9b-v2:free': {},
  'google/gemma-4-26b-a4b-it:free': {},
  'google/gemma-4-31b-it:free': {},
+ 'meta-llama/llama-3.3-70b-instruct:free': {},
  },
  },
  },
@@ -43,7 +43,7 @@ The consumer's `opencode.json` loads a small set of stable files as always-prese

  The skill router (`.opencode/skills/job-forge.md`) loads mode and data files on demand, keeping per-session input tokens low (~20-40K for most modes instead of ~130-170K when everything was force-loaded).

- **Cost-tiered subagents** live in `.opencode/agents/` (`general-free`, `general-paid`, `glm-minimal`). On OpenCode, all three now resolve to free OpenRouter models by default, with different quality/latency tiers per task shape. See [MODEL-ROUTING.md](MODEL-ROUTING.md) for the routing architecture, why it exists, and how to customize.
+ **Cost-tiered subagents** live in `.opencode/agents/` (`general-free`, `general-paid`, `glm-minimal`). On OpenCode, JobForge now uses a mix of native free models and free OpenRouter routes, with different quality/latency tiers per task shape. See [MODEL-ROUTING.md](MODEL-ROUTING.md) for the routing architecture, why it exists, and how to customize.

  **Multi-harness support.** Because `iso/` is the single source of truth, publishing ships config for OpenCode, Cursor, Claude Code, and Codex in one tarball. Consumers run any of `opencode`, `cursor`, `claude`, or `codex` in the project and each picks up the shared MCP config + instructions via the symlinks above.

@@ -7,8 +7,8 @@ JobForge routes each piece of work to the cheapest model that can do it well, in
  A two-day trace early in development showed `$48` in spend, with **84% coming from GLM 5.1** despite the majority of the work being procedural (form fills, tracker updates, OTP retrieval). The root cause:

  - **GLM 5.1's provider doesn't discount cache reads.** On Anthropic, a 10K-token cached prefix costs ~$0.03. On GLM 5.1 it bills near-full input rate (~$0.35). Every session that re-loads the prefix pays full price.
- - **Procedural work is the high-volume work.** 1000+ messages per day go to form filling, TSV merges, scan dedup. Running that on a paid model is unnecessary when current free OpenRouter models can handle the task.
- - **Current OpenRouter free models are strong enough to cover the whole OpenCode path.** JobForge now defaults every OpenCode role to a free model, including the quality-sensitive writer tier.
+ - **Procedural work is the high-volume work.** 1000+ messages per day go to form filling, TSV merges, scan dedup. Running that on a paid model is unnecessary when current free models can handle the task.
+ - **OpenCode no longer needs one provider for every role.** JobForge now pins the procedural `@general-free` worker to `opencode/big-pickle`, while the quality-sensitive writer tier stays on a free OpenRouter route.

  Conclusion: route procedural work to free tier, reserve paid models for tasks that actually need the quality.

@@ -18,7 +18,7 @@ Defined in `.opencode/agents/*.md` (shipped in the harness, symlinked into consu

  | Agent | Model | Reasoning | Use for |
  |-------|-------|-----------|---------|
- | `@general-free` | `openrouter/z-ai/glm-4.5-air:free` | `minimal` | Geometra form fills, tracker TSV merges, scan dedup, OTP retrieval via Gmail, scripted pipeline steps |
+ | `@general-free` | `opencode/big-pickle` | `minimal` | Geometra form fills, tracker TSV merges, scan dedup, OTP retrieval via Gmail, scripted pipeline steps |
  | `@general-paid` | `openrouter/qwen/qwen3-next-80b-a3b-instruct:free` | `medium` | Offer evaluation narratives (Blocks A-F), cover letters, "Why X?" answers, STAR+R interview stories, LinkedIn outreach prose |
  | `@glm-minimal` | `openrouter/openai/gpt-oss-20b:free` | `none` | Narrow one-shot transforms: "extract these 8 fields from this JD text → JSON", "classify this archetype" |

@@ -79,7 +79,10 @@ The `.opencode/agents/general-paid.md` file is a symlink into `node_modules/job-

  ### Swap the free tier

- Same idea — edit `.opencode/agents/general-free.md`'s `model:` field. If you run into quality issues on forms, swap to a different free OpenRouter model first before considering a paid tier.
+ The primary `@general-free` model is set in `models.yaml` via the `fast`
+ role's `targets.opencode` override. Change that if you want a different
+ default. The `fallback_models` list still lives in
+ `.opencode/agents/general-free.md` / `iso/agents/general-free.md`.

  ### Add a custom agent

@@ -139,8 +142,8 @@ Default chains ship upstream in each agent's YAML frontmatter (`node_modules/job

  | Agent | Primary | Fallback chain (in order) |
  |-------|---------|---------------------------|
- | `@general-free` | `openrouter/z-ai/glm-4.5-air:free` | `openrouter/minimax/minimax-m2.5:free` → `openrouter/openai/gpt-oss-20b:free` → `openrouter/nvidia/nemotron-3-nano-30b-a3b:free` → `openrouter/qwen/qwen3-coder:free` |
- | `@general-paid` | `openrouter/qwen/qwen3-next-80b-a3b-instruct:free` | `openrouter/nvidia/nemotron-3-super-120b-a12b:free` → `openrouter/openai/gpt-oss-120b:free` → `openrouter/z-ai/glm-4.5-air:free` → `openrouter/qwen/qwen3-coder:free` |
+ | `@general-free` | `opencode/big-pickle` | `openrouter/z-ai/glm-4.5-air:free` → `openrouter/openai/gpt-oss-20b:free` → `openrouter/nvidia/nemotron-3-nano-30b-a3b:free` → `openrouter/qwen/qwen3-coder:free` |
+ | `@general-paid` | `openrouter/qwen/qwen3-next-80b-a3b-instruct:free` | `openrouter/openai/gpt-oss-120b:free` → `openrouter/nvidia/nemotron-3-super-120b-a12b:free` → `openrouter/z-ai/glm-4.5-air:free` → `openrouter/qwen/qwen3-coder:free` → `openrouter/google/gemma-4-31b-it:free` → `openrouter/meta-llama/llama-3.3-70b-instruct:free` |
  | `@glm-minimal` | `openrouter/openai/gpt-oss-20b:free` | `openrouter/google/gemma-4-26b-a4b-it:free` → `openrouter/nvidia/nemotron-nano-9b-v2:free` → `openrouter/google/gemma-4-31b-it:free` → `openrouter/z-ai/glm-4.5-air:free` |

  These chains are deliberately free-only so the default OpenCode path never needs to pay. **Note:** OpenCode model IDs must use the provider prefix it expects (`openrouter/...`, `opencode/...`, etc.). The raw OpenRouter model slug by itself is not enough.
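
The table above gives each agent a primary model plus an ordered fallback chain. Below is a rough sketch of that rotation. It is not the `@razroo/opencode-model-fallback` implementation: `runWithFallback`, `fakeCall`, and the simulated failure are hypothetical, and only the model IDs come from the `@general-free` row.

```javascript
// Model IDs from the @general-free row; everything else is illustrative.
const primary = "opencode/big-pickle";
const fallbackModels = [
  "openrouter/z-ai/glm-4.5-air:free",
  "openrouter/openai/gpt-oss-20b:free",
  "openrouter/nvidia/nemotron-3-nano-30b-a3b:free",
  "openrouter/qwen/qwen3-coder:free",
];

// Hypothetical helper: try the primary, then each fallback in order,
// replaying the same request until one call succeeds.
function runWithFallback(request, callModel) {
  const tried = [];
  for (const model of [primary, ...fallbackModels]) {
    tried.push(model);
    try {
      return { model, tried, output: callModel(model, request) };
    } catch (err) {
      // A real plugin would first check whether the error is retryable.
    }
  }
  throw new Error("all models failed: " + tried.join(", "));
}

// Simulated provider: the native model is rate-limited, OpenRouter routes work.
function fakeCall(model, request) {
  if (!model.startsWith("openrouter/")) throw new Error("429 rate limited");
  return "ok: " + request;
}

const result = runWithFallback("fill form", fakeCall);
console.log(result.model); // openrouter/z-ai/glm-4.5-air:free
```

Every ID in the sketch carries its provider prefix, matching the note that a raw OpenRouter slug alone is not enough.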
@@ -10,11 +10,12 @@ targets:
  mode: subagent
  temperature: 0.1
  reasoningEffort: minimal
- # Primary (z-ai/glm-4.5-air:free) resolves from openrouter-free preset.
- # Fallback chain is ordered by decreasing likelihood of rate-limits,
- # staying within free models that can tool-call Geometra + Gmail MCPs.
+ # Primary comes from models.yaml: opencode/big-pickle on OpenCode.
+ # Fallback chain stays free-only and intentionally excludes
+ # openrouter/minimax/minimax-m2.5:free because recent traces showed
+ # repeated read({ path|file_path }) schema drift on that route.
  fallback_models:
- - openrouter/minimax/minimax-m2.5:free
+ - openrouter/z-ai/glm-4.5-air:free
  - openrouter/openai/gpt-oss-20b:free
  - openrouter/nvidia/nemotron-3-nano-30b-a3b:free
  - openrouter/qwen/qwen3-coder:free
@@ -12,13 +12,17 @@ targets:
  temperature: 0.3
  reasoningEffort: medium
  # Primary (qwen/qwen3-next-80b-a3b-instruct:free) resolves from the
- # openrouter-free preset. Fallback chain prioritizes models with
- # strong long-form writing judgment over raw size.
+ # openrouter-free preset. First fallbacks intentionally avoid another
+ # immediate hop through the same Venice/Qwen pool when OpenRouter
+ # returns "[Venice] insufficient …" — gpt-oss-120b + nemotron are
+ # usually different backends. Remaining picks stay free-only.
  fallback_models:
- - openrouter/nvidia/nemotron-3-super-120b-a12b:free
  - openrouter/openai/gpt-oss-120b:free
+ - openrouter/nvidia/nemotron-3-super-120b-a12b:free
  - openrouter/z-ai/glm-4.5-air:free
  - openrouter/qwen/qwen3-coder:free
+ - openrouter/google/gemma-4-31b-it:free
+ - openrouter/meta-llama/llama-3.3-70b-instruct:free
  tools:
  geometra_*: false
  gmail_*: false
package/iso/config.json CHANGED
@@ -10,10 +10,15 @@
  "openrouter/qwen/qwen3-next-80b-a3b-instruct:free"
  ],
  "retryable_error_patterns": [
- "(?i)\\bvenice\\b.*(?:insufficient|balance|credits?|diem|usd)",
- "(?i)insufficient\\s+(?:usd|diem|credits?|funds?|balance)",
- "(?i)credit.*balance.*too.*low",
- "(?i)(?:temporarily\\s+)?unavailable|overloaded|try\\s+again"
+ "\\bvenice\\b",
+ "insufficient\\s+usd",
+ "insufficient\\s+.*\\s+diem",
+ "diem\\s+balance",
+ "add\\s+credits",
+ "chutes",
+ "insufficient\\s+(?:credits?|funds?|balance)",
+ "credit.*balance.*too.*low",
+ "(?:temporarily\\s+)?unavailable|overloaded|try\\s+again"
  ]
  },
  "targets": {
@@ -34,7 +39,8 @@
  "nvidia/nemotron-3-nano-30b-a3b:free": {},
  "nvidia/nemotron-nano-9b-v2:free": {},
  "google/gemma-4-26b-a4b-it:free": {},
- "google/gemma-4-31b-it:free": {}
+ "google/gemma-4-31b-it:free": {},
+ "meta-llama/llama-3.3-70b-instruct:free": {}
  }
  }
  }
package/models.yaml CHANGED
@@ -9,7 +9,7 @@
  #
  # JobForge's subagents bind to preset roles via the `role:` field in
  # iso/agents/<slug>.md:
- # @general-free → role: fast (Haiku / OpenRouter GLM 4.5 Air free / gpt-5.4-mini)
+ # @general-free → role: fast (Haiku / OpenCode big-pickle / gpt-5.4-mini)
  # @general-paid → role: quality (Opus 4.7 / OpenRouter Qwen3 Next 80B free / gpt-5.4)
  # @glm-minimal → role: minimal (Haiku / OpenRouter GPT-OSS-20B free / gpt-5.4-nano)
  #
@@ -30,3 +30,10 @@
  # model: gpt-5.4

  extends: openrouter-free
+
+ roles:
+   fast:
+     targets:
+       opencode:
+         provider: opencode
+         model: opencode/big-pickle
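
The added `roles.fast.targets.opencode` block overrides what the `openrouter-free` preset would otherwise supply for the `fast` role. Below is a sketch of that precedence, assuming a simple "override wins" lookup. The `preset` object shape, its `provider` values, and `resolveModel` are hypothetical, and the real resolver may work differently. The model IDs are the ones shown in this diff.

```javascript
// Hypothetical shape for the openrouter-free preset; model IDs come from the diff.
const preset = {
  fast: { opencode: { provider: "openrouter", model: "openrouter/z-ai/glm-4.5-air:free" } },
  quality: { opencode: { provider: "openrouter", model: "openrouter/qwen/qwen3-next-80b-a3b-instruct:free" } },
};

// Mirrors the `roles:` block added to models.yaml in 2.14.3.
const overrides = {
  fast: { opencode: { provider: "opencode", model: "opencode/big-pickle" } },
};

// Hypothetical resolver: a per-role, per-target override wins over the preset.
function resolveModel(role, target) {
  const fromOverride = overrides[role]?.[target];
  const fromPreset = preset[role]?.[target];
  return (fromOverride ?? fromPreset)?.model;
}

console.log(resolveModel("fast", "opencode"));    // opencode/big-pickle
console.log(resolveModel("quality", "opencode")); // openrouter/qwen/qwen3-next-80b-a3b-instruct:free
```

Under this reading only the `fast` role changes. `quality` has no override, so it still takes the preset's value.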
package/opencode.json CHANGED
@@ -3,7 +3,7 @@
  "model": "openrouter/qwen/qwen3-coder:free",
  "agent": {
  "fast": {
- "model": "openrouter/z-ai/glm-4.5-air:free"
+ "model": "opencode/big-pickle"
  },
  "quality": {
  "model": "openrouter/qwen/qwen3-next-80b-a3b-instruct:free"
@@ -25,7 +25,8 @@
  "nvidia/nemotron-3-nano-30b-a3b:free": {},
  "nvidia/nemotron-nano-9b-v2:free": {},
  "google/gemma-4-26b-a4b-it:free": {},
- "google/gemma-4-31b-it:free": {}
+ "google/gemma-4-31b-it:free": {},
+ "meta-llama/llama-3.3-70b-instruct:free": {}
  }
  }
  },
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "job-forge",
- "version": "2.14.1",
+ "version": "2.14.3",
  "description": "AI-powered job search pipeline built on opencode",
  "type": "module",
  "bin": {