job-forge 2.14.6 → 2.14.8

package/CLAUDE.md CHANGED
@@ -11,7 +11,7 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
11
11
  why: 2026-04 same-day batch collision — when two batches target the same role, `npx job-forge merge` updates the existing day-file row rather than appending, so grepping day files alone misses earlier-batch applies; merged/*.tsv is the only place the breadcrumb remains
12
12
 
13
13
  - [H3] Before every batch of `task` dispatches that will use Geometra, call `geometra_list_sessions` then `geometra_disconnect({closeBrowser: true})`. Every round, no exceptions. Name this cleanup as an explicit "step 0" in your first-response plan for any multi-apply request — it is the most frequently skipped guardrail in practice, and skipping it produces cascade "Not connected" failures on the next dispatch.
14
- why: if any prior subagent aborted mid-flow, its Chromium session stays stuck in the MCP pool and the next `geometra_connect` fails with "Not connected"; the disconnect is a no-op when the pool is empty but a poison-cure when it isn't; vocalizing it up-front doubles the odds it actually runs
14
+ why: aborted subagents can leave Chromium sessions stuck in the MCP pool. Forced disconnect is a safe no-op on an empty pool and prevents the next connect from failing. Naming it up front improves compliance
15
15
 
16
16
  - [H4] In multi-job mode, the orchestrator session MUST NOT call `geometra_fill_form`, `geometra_run_actions`, `geometra_pick_listbox_option`, or `geometra_fill_otp` directly. Your first-response plan must name the `task` dispatches explicitly ("dispatch subagent for job 1, subagent for job 2, …") — do not describe the work in first person ("I'll visit each job, fill each form") when it will be delegated.
17
17
  why: repeated Geometra calls in the orchestrator bloat the cache prefix — this is the 2026-04 "apply to 20 jobs" 341-msg incident where each turn re-processed 100K+ fresh tokens instead of reading from cache; first-person narration is a leading indicator that the agent is mentally queueing work for itself rather than a subagent
@@ -33,13 +33,10 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
33
33
  - [D2] Route subagent work by cost tier. `@general-free`: procedural — form-fill, TSV merge, verify, OTP retrieval, portal scan metadata extraction, one-shot structured-field transforms. `@general-paid`: quality-sensitive — offer evaluation narrative Blocks A-F, cover letters, "Why X?" answers, STAR interview stories, LinkedIn outreach. `@glm-minimal`: narrow ≤5K-input one-shot extract/classify jobs that do not need context.
34
34
  why: GLM 5.1 doesn't discount cache reads so procedural work there costs ~10×; free-tier models handle procedural work fine empirically (`opencode/big-pickle` processed 1000+ messages at $0)
35
35
 
36
- - [D3] Upgrade `apply` routing to `@general-paid` when offer score ≥ 4.0/5, when user flags "top-tier / dream job / high-stakes", or when late-stage pipeline (post-screens).
37
- why: form-fill flows are 6+ steps; free-tier sometimes aborts mid-flow on large Greenhouse/Workday schemas; paid tier has more headroom
36
+ - [D3] Read the active mode file before dispatch. Mode files own score gates, provider fallback, portal runbooks, and output shape.
37
+ why: mode-specific rules change faster than global orchestration rules; keeping them out of the shared prefix preserves cache efficiency and prevents stale branches
38
38
 
39
- - [D3f] **Provider-failure downgrade on `apply` (all harnesses; OpenCode + OpenRouter especially):** If you dispatched `@general-paid` per [D3] and that subagent fails or exhausts retries with provider-side errors (copy mentioning Venice / Diem / Chutes, "insufficient" USD/credits/funds/balance, HTTP 402/429, overload / temporarily unavailable), re-dispatch the **same apply URL** once on `@general-free` before marking FAILED. Do not abandon the role solely because the upgraded tier hit a pool-specific limit.
40
- why: `@general-paid` on OpenCode still uses free OpenRouter model ids; Venice-style balance errors are a backend-route issue, not proof that procedural `@general-free` cannot complete the same Greenhouse-style flow after [D5]/[H2] gates pass
41
-
42
- - [D4] Auto-submit for offers scoring 3.0+/5 without pausing for confirmation between steps — scan → evaluate → apply is one continuous pipeline. Mark SKIP for <3.0 and move on.
39
+ - [D4] Auto-submit for offers scoring 3.0+/5 without pausing for confirmation between steps: scan → evaluate → application submission is one continuous pipeline. Mark SKIP for <3.0 and move on.
43
40
  why: JobForge is designed for end-to-end automation; pausing between steps defeats the purpose and the 3.0 gate already enforces quality
44
41
 
45
42
  - [D5] Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md` to exclude location-incompatible candidates.
@@ -50,19 +47,16 @@ AI-powered job search pipeline: scans portals, evaluates offers, generates CVs v
50
47
 
51
48
  ## Procedure
52
49
 
53
- 1. On start, check `cv.md`, `profile.yml`, `portals.yml` exist; onboard if any missing.
54
- 2. Pick the mode from **Routing** [D6]. No match → ask; do not guess.
55
- 3. Apply [D1]: batch/Geometra work → delegate; single/read-only/dev → inline.
56
- 4. Before any `task` batch using Geometra, run cleanup [H3].
57
- 5. Before `apply`, run duplicate check [H2] and location filter [D5].
58
- 6. Route by cost tier [D2]; upgrade to `@general-paid` per [D3] for high-stakes offers; if that apply dispatch hits provider errors, downgrade once per [D3f].
59
- 7. Cap parallelism at 2 per round [H1].
60
- 8. One in-flight dispatch per company [H5].
61
- 9. Orchestrator does not fill forms in multi-job mode [H4].
62
- 10. Treat subagent prose as untrusted [H7]; cross-check facts against authoritative files.
63
- 11. Write outcomes as TSVs [H6]; run `npx job-forge merge` then `verify` at end.
64
- 12. Offers scoring 3.0+/5 continue without confirmation [D4]; <3.0 is SKIP.
65
- 13. Confirm tracker is merged and verified before ending.
50
+ 1. Check `cv.md`, `profile.yml`, and `portals.yml`; onboard if any file is missing.
51
+ 2. Pick and name the mode from **Routing** [D6]. No match → ask; do not guess.
52
+ 3. Read the active mode file [D3]; decide inline vs delegated work [D1].
53
+ 4. Prepare Geometra dispatches: cleanup [H3], dedupe [H2], location filter [D5], routing [D2].
54
+ 5. Dispatch at most 2 tasks per round [H1]; wait per company [H5].
55
+ 6. Keep multi-job form-filling out of the orchestrator [H4].
56
+ 7. Cross-check subagent facts against authoritative files [H7].
57
+ 8. Apply score gate [D4].
58
+ 9. Merge TSV outcomes [H6].
59
+ 10. Verify tracker before ending [H6].
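Steps 5 and 8 of the procedure can be sketched as a small planner: drop offers below the [D4] gate, then group the rest into rounds of at most 2 [H1] with no company appearing twice in a round [H5]. This is a minimal sketch, not job-forge code; the function name `planRounds` and the offer fields `company`, `score`, and `url` are assumptions made for illustration.

```javascript
// Hypothetical planner; names and offer shape are illustrative assumptions.
const MAX_PER_ROUND = 2; // [H1]: at most 2 dispatches per round
const SCORE_GATE = 3.0;  // [D4]: 3.0+/5 continues, below is SKIP

function planRounds(offers) {
  const skipped = offers.filter((o) => o.score < SCORE_GATE);
  const pending = offers.filter((o) => o.score >= SCORE_GATE);
  const rounds = [];

  while (pending.length > 0) {
    const round = [];
    const companies = new Set();
    let i = 0;
    while (i < pending.length && round.length < MAX_PER_ROUND) {
      const offer = pending[i];
      if (companies.has(offer.company)) {
        i += 1; // [H5]: a sibling role at the same company waits for a later round
        continue;
      }
      companies.add(offer.company);
      round.push(offer);
      pending.splice(i, 1); // removed, so index i now points at the next offer
    }
    rounds.push(round);
  }
  return { rounds, skipped };
}
```

Each round would be dispatched only after the previous one has finished and the [H3] cleanup has run; the sketch covers ordering only, not dispatch.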
66
60
 
67
61
  ## Routing
68
62
 
@@ -94,658 +88,11 @@ Output shape is mode-dependent — see `modes/{mode}.md` for each mode's expecte
94
88
 
95
89
  # Reference
96
90
 
97
- Sections below are context, rationale, runbooks, and portal-specific empirical notes. The **Hard limits**, **Defaults**, **Procedure**, and **Routing** above are the contract; the material below is what the orchestrator and each mode consult during execution.
98
-
99
- ---
100
-
101
- ## Session Hygiene — ALWAYS enforce
102
-
103
- **Multi-job workflows MUST delegate each job to its own subagent.** This rule applies even when the user does NOT explicitly invoke `/job-forge`.
104
-
105
- Whenever the user says any variation of "apply to N jobs", "process the pipeline", "batch evaluate", or similar phrasing that implies more than one application/evaluation in sequence:
106
-
107
- 1. **Do not drive all N jobs from this session.** Repeated `geometra_fill_form` / `geometra_page_model` calls accumulate in conversation history and invalidate prompt caching — each new message ends up re-processing 100K+ tokens of fresh history instead of reading from cache.
108
- 2. **Launch one subagent per job, in parallel batches of ≤2** (see Hard Limits #1). Higher parallelism blows through free-tier rate limits and each subagent requires post-cleanup. Use the `task` tool / Agent with `subagent_type="general-purpose"`, passing the single URL and the relevant mode file content.
109
- 3. **This session acts as the orchestrator only**: plan, pick the jobs, dispatch subagents, aggregate results. No Geometra form-filling in this session unless it's a single one-off application.
110
-
111
- **Why:** observed on a real run — a 341-msg "apply to 20 jobs" session had `cache_read ~1.8K` on 5 messages where input ballooned to 100K-144K tokens. A 40-msg orchestrator session that delegates instead stays under 40K input max with cache reads at full 100K+. Same work, ~5× fewer effective tokens.
112
-
113
- **Verify after running:** `npx job-forge tokens --session <id>` — any message with `cache_read < 5K` and `input > 50K` is a cache-bust; next time split that work across subagents.
114
-
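The cache-bust rule above is a two-threshold check. A minimal sketch, assuming each message has already been parsed into an object with `cacheRead` and `input` token counts; that shape and the function names are assumptions, since the output format of `npx job-forge tokens` is not shown here.

```javascript
// Hypothetical message shape: { id, cacheRead, input }, all counts in tokens.
const CACHE_READ_FLOOR = 5_000; // cache_read < 5K
const INPUT_CEILING = 50_000;   // input > 50K

function isCacheBust(msg) {
  return msg.cacheRead < CACHE_READ_FLOOR && msg.input > INPUT_CEILING;
}

function findCacheBusts(messages) {
  return messages.filter(isCacheBust);
}
```

Both conditions must hold: a large input served mostly from cache is fine, and a small uncached input is cheap either way.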
115
- **Exception:** evaluation-only or tracker-only work (no Geometra, no repeated tool calls) can proceed in a single session. The rule targets tool-heavy multi-step loops.
116
-
117
- **Before any batch-apply dispatch, run the Apply Preflight location filter from `modes/apply.md`** to exclude location-incompatible candidates. Catches the common case where an evaluated role has the right role-shape but a deal-breaking location that profile.yml already rules out.
118
-
119
- ---
120
-
121
- ## Subagent Routing — which agent for which task
122
-
123
- The harness ships three subagents (see `.opencode/agents/`). The orchestrator MUST route work by cost tier, not pick the default for everything. **GLM 5.1 does not discount cache reads**, so running procedural work on it costs ~10× what it would on a cache-discounting model. Free-tier models handle procedural work fine (confirmed empirically: `opencode/big-pickle` processed 1000+ messages at $0 in prior runs).
124
-
125
- | Task type | Subagent | Why |
126
- |-----------|----------|-----|
127
- | Drive Geometra form-fill / submit (atomic `run_actions`) | `@general-free` | Procedural; label-driven; deterministic |
128
- | Merge TSVs, run `verify-pipeline.mjs`, dedup | `@general-free` | Script-driven; no writing quality needed |
129
- | OTP retrieval via Gmail MCP + `geometra_fill_otp` | `@general-free` | Fixed-shape lookup + input |
130
- | Scan portals, extract offer metadata, return structured records (see schema below) | `@general-free` | Structured output; no judgment |
131
- | Evaluation narrative — Blocks A-F per `modes/offer.md` | `@general-paid` | Judgment + writing quality |
132
- | Cover letter, "Why X?" answers, Section G drafts | `@general-paid` | Tone and specificity matter |
133
- | STAR+R interview stories, story-bank curation | `@general-paid` | Quality signals seniority |
134
- | LinkedIn outreach messages (`modes/contact.md`) | `@general-paid` | First impression |
135
- | "Extract N fields from this text → JSON" (≤5K input) | `@glm-minimal` | One-shot transform; no context needed |
136
- | "Classify this JD as archetype X/Y/Z" | `@glm-minimal` | Narrow, structured output |
137
-
138
- **Example JSON shape for the "extract / emit JSON" subagent rows above** (use this exact key set when delegating a portal-scan / extract task):
139
-
140
- ```json
141
- {
142
- "company": "Acme",
143
- "role": "Senior Backend Engineer",
144
- "location": "Remote (US)",
145
- "comp_range_usd": "180000-220000",
146
- "archetype": "backend-platform",
147
- "url": "https://..."
148
- }
149
- ```
150
-
151
- **Rule:** when you (the orchestrator) delegate a task, pick the cheapest agent that can do it well. Do NOT route every subagent through the same tier. Auto-pipeline mode MUST split a single job across `@general-paid` (evaluation) and `@general-free` (PDF gen + tracker + apply), not run it all on one model.
152
-
153
- **When to break this rule:** if the user explicitly asks for "quality over cost" or flags a high-stakes application (top-tier company, offer-stage negotiation, executive search), route everything through `@general-paid`. Document the exception in the session.
154
-
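The routing table and its high-stakes exception can be sketched as a lookup with one override. The task-type keys below are invented labels for the table rows, and `pickSubagent` is a hypothetical helper, not part of the harness.

```javascript
// Hypothetical routing helper; keys are illustrative labels for the table rows.
const TIER_BY_TASK = new Map([
  ["form-fill", "@general-free"],
  ["tsv-merge", "@general-free"],
  ["otp", "@general-free"],
  ["portal-scan", "@general-free"],
  ["evaluation", "@general-paid"],
  ["cover-letter", "@general-paid"],
  ["interview-story", "@general-paid"],
  ["outreach", "@general-paid"],
  ["extract-json", "@glm-minimal"],
  ["classify-jd", "@glm-minimal"],
]);

function pickSubagent(taskType, { highStakes = false } = {}) {
  // "Quality over cost" or a flagged high-stakes application overrides the table.
  if (highStakes) return "@general-paid";
  const agent = TIER_BY_TASK.get(taskType);
  if (agent === undefined) throw new Error(`unknown task type: ${taskType}`);
  return agent;
}
```

Throwing on an unknown task type, rather than falling back to a default tier, mirrors the rule that the orchestrator should not route everything through one agent by default.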
155
- ### When to delegate
156
-
157
- **Delegate (`task` out) when the work involves repeated tool-heavy steps that bloat the orchestrator's cache prefix.** The concrete failure mode this prevents: a 341-message "apply to 20 jobs" session where repeated `geometra_fill_form` / `geometra_page_model` calls accumulated in history, forcing each new message to re-process 100K+ tokens of fresh input instead of reading from cache.
158
-
159
- **Delegate when:**
160
- - Applying to N≥2 jobs (repeated Geometra form-fill — the original cache-bust scenario)
161
- - Batch portal scans hitting ≥3 companies (API loops + page-model reads stack up)
162
- - Any explicit "apply to... / process pipeline / batch evaluate" phrasing from the user (multi-job intent)
163
-
164
- **Do NOT delegate — orchestrate inline:**
165
- - Single-offer evaluation (text-heavy, not tool-heavy)
166
- - Development / bug-fix / file-editing tasks
167
- - `tracker` and other read-only modes
168
- - Single-company scan, single-URL check
169
- - One-shot questions — "what does this mean?", "read X and summarize", "what's my next report number?"
170
-
171
- **Detection signal:** if you're about to call `geometra_fill_form` for a second *different* job in the same session, STOP and delegate the remainder. For everything else, in-session execution is the expected default.
172
-
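The delegate/inline split above comes down to three triggers. A minimal sketch, assuming the orchestrator has already counted the jobs and companies in the request; the function and parameter names are hypothetical.

```javascript
// Hypothetical predicate; parameter names are illustrative assumptions.
function shouldDelegate({ applyCount = 0, scanCompanies = 0, multiJobPhrasing = false } = {}) {
  return (
    applyCount >= 2 ||    // applying to N>=2 jobs
    scanCompanies >= 3 || // batch portal scan hitting >=3 companies
    multiJobPhrasing      // "apply to... / process pipeline / batch evaluate"
  );
}
```

Anything that trips none of the three triggers stays inline, which matches the stated default of in-session execution.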
173
- ---
174
-
175
- ## What is JobForge
176
-
177
- AI-powered job search automation built on opencode: pipeline tracking, offer evaluation, CV generation, portal scanning, batch processing.
178
-
179
- **It will work out of the box, but it's designed to be made yours.** Ask if the archetypes don't match your career. Ask if the modes are in the wrong language. Ask if the scoring doesn't fit your priorities. You (opencode) can edit any file in this system. The user says "change the archetypes to data engineering roles" and you do it. Customization is the whole point.
180
-
181
- ### Main Files
182
-
183
- | File | Function |
184
- |------|----------|
185
- | `data/applications/` | Application tracker (day-based: `YYYY-MM-DD.md`) |
186
- | `data/pipeline.md` | Inbox of pending URLs |
187
- | `data/scan-history.tsv` | Scanner dedup history |
188
- | `portals.yml` | Query and company config |
189
- | `templates/cv-template.html` | HTML template for CVs |
190
- | `generate-pdf.mjs` | Geometra MCP (`geometra_generate_pdf`): HTML to PDF |
191
- | `article-digest.md` | Compact proof points from portfolio (optional) |
192
- | `interview-prep/story-bank.md` | Accumulated STAR+R stories across evaluations |
193
- | `reports/` | Evaluation reports (format: `{###}-{company-slug}-{YYYY-MM-DD}.md`) |
194
-
195
- ### First Run — Onboarding (IMPORTANT)
196
-
197
- **Before doing ANYTHING else, check if the system is set up.** Run these checks silently every time a session starts:
198
-
199
- 1. Does `cv.md` exist?
200
- 2. Does `config/profile.yml` exist (not just profile.example.yml)?
201
- 3. Does `portals.yml` exist (not just templates/portals.example.yml)?
202
-
203
- **If ANY of these is missing, enter onboarding mode.** Do NOT proceed with evaluations, scans, or any other mode until the basics are in place. Guide the user step by step:
204
-
205
- #### Step 1: CV (required)
206
- If `cv.md` is missing, ask:
207
- > "I don't have your CV yet. You can either:
208
- > 1. Paste your CV here and I'll convert it to markdown
209
- > 2. Paste your LinkedIn URL and I'll extract the key info
210
- > 3. Tell me about your experience and I'll draft a CV for you
211
- >
212
- > Which do you prefer?"
213
-
214
- Create `cv.md` from whatever they provide. Make it clean markdown with standard sections (Summary, Experience, Projects, Education, Skills).
215
-
216
- #### Step 2: Profile (required)
217
- If `config/profile.yml` is missing, copy from `config/profile.example.yml` and then ask:
218
- > "I need a few details to personalize the system:
219
- > - Your full name and email
220
- > - Your location and timezone
221
- > - What roles are you targeting? (e.g., 'Senior Backend Engineer', 'AI Product Manager')
222
- > - Your salary target range
223
- >
224
- > I'll set everything up for you."
225
-
226
- Fill in `config/profile.yml` with their answers. For archetypes, map their target roles to the closest matches and update `modes/_shared.md` when the existing archetypes do not cover their target roles.
227
-
228
- #### Step 3: Portals (recommended)
229
- If `portals.yml` is missing:
230
- > "I'll set up the job scanner with 45+ pre-configured companies. Want me to customize the search keywords for your target roles?"
231
-
232
- Copy `templates/portals.example.yml` → `portals.yml`. If they gave target roles in Step 2, update `title_filter.positive` to match.
233
-
234
- #### Step 4: Tracker
235
- If `data/applications/` directory doesn't exist, create it:
236
- ```bash
237
- mkdir -p data/applications
238
- ```
239
- The tracker stores entries in day-based files like `data/applications/2026-04-13.md`. Each file has the same table format:
240
- ```markdown
241
- # Applications Tracker
242
-
243
- | # | Date | Company | Role | Score | Status | PDF | Report | Notes |
244
- |---|------|---------|------|-------|--------|-----|--------|-------|
245
- ```
246
-
247
- #### Step 5: Ready
248
- Once all files exist, confirm:
249
- > "You're all set! You can now:
250
- > - Paste a job URL to evaluate it
251
- > - Run `/job-forge scan` to search portals
252
- > - Run `/job-forge` to see all commands
253
- >
254
- > Everything is customizable — just ask me to change anything.
255
- >
256
- "
257
-
258
- Then suggest automation:
259
- > "Want me to scan for new offers automatically? I can set up a recurring scan every few days so you don't miss anything. Just say 'scan every 3 days' and I'll configure it."
260
-
261
- If the user accepts, use the `/loop` or `/schedule` skill (if available) to set up a recurring `/job-forge scan`. If those aren't available, suggest adding a cron job or remind them to run `/job-forge scan` periodically.
262
-
263
- ### Personalization
264
-
265
- JobForge is designed to be customized by YOU (opencode). When the user asks you to change archetypes, translate modes, adjust scoring, add companies, or modify negotiation scripts -- do it directly. You read the same files you use, so you know exactly what to edit.
266
-
267
- **Common customization requests:**
268
- - "Change the archetypes to [backend/frontend/data/devops] roles" → edit `modes/_shared.md`
269
- - "Translate the modes to English" → edit all files in `modes/`
270
- - "Add these companies to my portals" → edit `portals.yml`
271
- - "Update my profile" → edit `config/profile.yml`
272
- - "Change the CV template design" → edit `templates/cv-template.html`
273
- - "Adjust the scoring weights" → edit `modes/_shared.md` and `batch/batch-prompt.md`
274
-
275
- ### Skill Modes
276
-
277
- Mode routing is specified in the top-level **## Routing** section. Each mode is implemented in `modes/{mode}.md` — consult those files for per-mode prompts, state, and expected outputs.
278
-
279
- ### CV Source of Truth
280
-
281
- - `cv.md` in project root is the canonical CV
282
- - `article-digest.md` has detailed proof points (optional)
283
- - **NEVER hardcode metrics** -- read them from these files at evaluation time
284
-
285
- ---
286
-
287
- ## Ethical Use -- CRITICAL
288
-
289
- **Applying to jobs via Geometra MCP is NOT impersonation.** The agent uses the user's credentials, CV, and information to apply on their behalf. This is the same as the user applying themselves — the agent is an extension of the user's will. Think of it as a browser automation tool that fills forms with the user's data.
290
-
291
- **This system is designed for quality, not quantity.** The goal is to help the user find and apply to roles where there is a genuine match -- not to spam companies with mass applications.
292
-
293
- - **Continuous pipeline for 3.0+ offers:** When scanning, evaluating, and applying — run the full pipeline end-to-end without pausing for confirmation. Scan → evaluate → fill form → submit for any offer scoring 3.0/5 or above. Do NOT stop between steps to ask "want me to continue?" — just do it.
294
- - **Auto-submit:** For offers scoring 3.0+/5, fill and submit the application automatically. For offers below 3.0/5, mark as SKIP and move on.
295
- - **Still respect quality:** Only apply where there is a genuine match (3.0+ ensures this). Auto-SKIP anything below 3.0.
296
- - **Respect recruiters' time.** Every application a human reads costs someone's attention. Only send what's worth reading.
297
-
298
- ---
299
-
300
- ## Offer Verification -- MANDATORY
301
-
302
- **Read local artifacts before the network.** If `reports/` already contains this posting URL (or company+role with a full JD in the body), **Read** that report for verification or evaluation instead of WebFetch/Geometra. If `data/pipeline.md` or `jds/` points at frozen JD text (`local:jds/{file}` or pasted blocks), **Read** that first. Reuse JD text already in the same conversation — do not fetch the same URL twice. (The JD extraction section at the top of `modes/auto-pipeline.md` and its "at most once per session" rule are the detailed contract.)
303
-
304
- **When Geometra MCP is available** (interactive sessions), ALWAYS use it to verify offers:
305
- 1. `geometra_connect` to the URL (via proxy)
306
- 2. `geometra_page_model` to read structured page content
307
- 3. Only footer/navbar without JD = closed. Title + description + Apply = active.
308
-
309
- **When Geometra MCP is NOT available** (batch workers via `opencode run`, headless environments):
310
- 1. Use WebFetch to retrieve the page content
311
- 2. Check for JD text, job title, and apply button/link in the response
312
- 3. If WebFetch returns only a shell/navbar (no JD content), mark the offer as `**Verification: unconfirmed**` in the report header
313
- 4. Do NOT skip the evaluation — proceed but flag the uncertainty so the user can verify manually before applying
314
-
315
- The goal is to never waste time on closed offers, but also never silently assume a role is active when verification was incomplete.
316
-
317
- ### Canonical MCP tools (quick reference)
318
-
319
- Pick tools by name directly — reduces unnecessary tool discovery:
320
-
321
- | Task | Preferred tools |
322
- |------|------------------|
323
- | JD from URL | Greenhouse boards API when the URL matches (see JD extraction in `modes/auto-pipeline.md`) → else `geometra_connect` + `geometra_page_model` → else WebFetch → WebSearch last |
324
- | Offer still live? | Same as JD when Geometra is available; else WebFetch per above |
325
- | One apply subagent (single job) | One `geometra_connect` per job URL; reuse `sessionId` through schema + fill; submit via atomic `geometra_run_actions` per `modes/apply.md` [H1]. Do **not** `geometra_disconnect` between `geometra_form_schema` and submit on the same form unless recovery requires it |
326
- | Chromium pool between orchestrator dispatch rounds | `geometra_list_sessions` + `geometra_disconnect({ closeBrowser: true })` per Hard limit [H3] — orchestrator-only; not a substitute for finishing the in-subagent form flow |
327
-
328
- ---
329
-
330
- ## OTP Handling via Gmail MCP -- REQUIRED
331
-
332
- When a form says "enter the code we sent to your email", you MUST retrieve the code from Gmail. NEVER ask the user to paste it. NEVER mark the application as failed without checking Gmail first.
333
-
334
- **You have exactly two Gmail tools.** There is NO `gmail_search_messages` and NO `gmail_read_message`. Use only these:
335
-
336
- | Tool | What it does | Key parameter |
337
- |------|-------------|---------------|
338
- | `gmail_list_messages` | Search emails. Returns message IDs + snippets. | `q` — Gmail search query string |
339
- | `gmail_get_message` | Read one email by ID. Returns full headers + body. | `id` — message ID from step 1 |
340
-
341
- **Step-by-step recipe (follow exactly):**
342
-
343
- 1. Reach the OTP step in the form. Do NOT close or abandon the session.
344
- 2. Wait ~5-10 seconds for the email to arrive.
345
- 3. Call `gmail_list_messages` with `q` set to the sender query from the Sender Lookup Table. Example:
346
- ```
347
- gmail_list_messages({ q: "from:greenhouse newer_than:10m", maxResults: 5 })
348
- ```
349
- 4. Take the `id` field from the first result. Call `gmail_get_message` with that `id`. Example:
350
- ```
351
- gmail_get_message({ id: "19d84d63a273c271" })
352
- ```
353
- 5. Find the code in the snippet or body. It is usually 6-8 characters near words like "security code" or "verification code".
354
- 6. Call `geometra_fill_otp` with the code. Example:
355
- ```
356
- geometra_fill_otp({ value: "ABC12345", sessionId: "..." })
357
- ```
358
- 7. Submit the form.
359
-
360
- **Sender Lookup Table:**
361
-
362
- | Portal | `q` value for `gmail_list_messages` |
363
- |--------|-------------------------------------|
364
- | Greenhouse | `from:greenhouse newer_than:10m` |
365
- | Workday | `from:myworkday newer_than:10m` |
366
- | Lever | `from:lever newer_than:10m` |
367
- | Ashby | `from:ashby newer_than:10m` |
368
- | SmartRecruiters | `from:smartrecruiters newer_than:10m` |
369
- | Toast (via ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m` |
370
- | Aggregator redirect (WeWorkRemotely / RemoteOK) | Detect the underlying ATS from the post-redirect URL, then use that row's sender query |
371
- | Unknown | `newer_than:10m subject:(verify OR code OR confirm)` |
372
-
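The lookup table and step 5 of the recipe can be sketched as two small helpers. Both are hypothetical: `otpQueryFor` covers only the five single-query rows (the Toast and aggregator rows need their own handling), and the regex in `extractOtp` is an assumed heuristic for "6-8 characters near 'security code' or 'verification code'" that additionally requires at least one digit, so it would miss an all-letter code.

```javascript
// Hypothetical helpers; not part of job-forge or the Gmail MCP.
const SENDER_QUERIES = new Map([
  ["greenhouse", "from:greenhouse newer_than:10m"],
  ["workday", "from:myworkday newer_than:10m"],
  ["lever", "from:lever newer_than:10m"],
  ["ashby", "from:ashby newer_than:10m"],
  ["smartrecruiters", "from:smartrecruiters newer_than:10m"],
]);
const UNKNOWN_QUERY = "newer_than:10m subject:(verify OR code OR confirm)";

function otpQueryFor(portal) {
  return SENDER_QUERIES.get(String(portal).toLowerCase()) ?? UNKNOWN_QUERY;
}

// Assumed heuristic: a 6-8 character alphanumeric token containing a digit,
// within 40 non-digit characters after the phrase.
const OTP_PATTERN =
  /(?:security|verification) code\D{0,40}?\b(?=[A-Za-z0-9]*\d)([A-Za-z0-9]{6,8})\b/i;

function extractOtp(body) {
  const match = OTP_PATTERN.exec(body);
  return match ? match[1] : null;
}
```

A `null` result should lead to reading the full message body by hand rather than to marking the application failed.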
373
- **Rules:**
374
- - ALWAYS check Gmail before reporting a submission as failed.
375
- - If "submit button did nothing", it usually means an OTP step appeared. Check Gmail.
376
- - If no email after 10 seconds, retry `gmail_list_messages` once more with `newer_than:5m`.
377
- - **Some Greenhouse tenants route OTP through third-party verification (Toast uses ClinchTalent).** If `from:greenhouse` returns empty after a Greenhouse submit, check the tenant-specific sender row above. Confirmed 2026-04-19: Toast Principal SWE #807 and Toast Senior FE #808.
378
-
379
- ---
380
-
381
- ## Geometra Form-Fill Patterns
382
-
383
- ### Validation State Lags Behind Actual Field State
384
-
385
- **This is a known issue across Greenhouse, Ashby, and similar ATS portals.** The frontend validation does not always update synchronously with field input. A field can be correctly filled but still show `invalid: true` or "This field is required" in the schema for 3-10 seconds — or even permanently until the user interacts with another field.
386
-
387
- **Common false-positive patterns:**
388
- - `set_checked` / `geometra_set_checked` sets a checkbox to `checked: true`, but the schema still shows `invalid: true` with "This field is required." A known lag affects privacy policy / acknowledgment checkboxes.
389
- - A dropdown/choice field is correctly picked, but the invalid flag persists.
390
- - A text field is filled correctly, but validation error text remains until the user tabs or blurs the field.
391
- - Combobox / autocomplete fields show stale "invalid" overlays after correct selection (Greenhouse, Ashby, Workday, Lever) but submit successfully.
392
-
393
- **Rule: Do NOT get stuck in a fill loop.** If a field value looks correct (e.g. `checked=true`, or a value of "No" / "Yes") but `invalidCount` is unchanged:
394
-
395
- 1. **Try Submit anyway.** The major portals (Greenhouse, Workday, Lever, Ashby) allow submission with stale validation errors as long as the underlying value is correct.
396
- 2. **If Submit is disabled**, try interacting with a nearby field (Tab, click another input) to force validation recalculation.
397
- 3. **If a checkbox still shows invalid after `set_checked`**, try clicking it directly by coordinates (`geometra_click` with x,y) instead of the label-based toggle.
398
- 4. **For combobox fields**, pick the option via `geometra_pick_listbox_option` (preferred) rather than typing — typing into comboboxes often creates a stale autocomplete overlay that blocks confirmation.
399
-
400
- **Decision tree for "field shows invalid after fill":**
401
-
402
- ```
403
- Is the visible value correct?
404
- ├── YES → Try Submit (preferred action)
405
- │ If Submit disabled → Tab away and back, then try Submit
406
- │ Still blocked → try clicking a nearby field to force recalc
407
- └── NO → Re-fill the field using the correct field id
408
- ```
409
-
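The decision tree above can be written as a pure function that returns the next step. A minimal sketch; the state flags and the returned action labels are hypothetical names, not Geometra tool names.

```javascript
// Hypothetical encoding of the decision tree; labels are illustrative only.
function nextAction({ valueCorrect, submitEnabled, tabbedAway = false }) {
  if (!valueCorrect) return "refill-field";      // NO branch: re-fill by field id
  if (submitEnabled) return "submit";            // YES branch: preferred action
  if (!tabbedAway) return "tab-away-and-back";   // Submit disabled: first recovery
  return "click-nearby-field";                   // still blocked: force a recalc
}
```

Note that `invalidCount` is deliberately absent from the inputs, in line with treating it as a heuristic and not as ground truth.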
410
- **The `invalidCount` from schema is a heuristic, not ground truth.** Always prefer direct observation of field values over the invalid count. If Submit becomes enabled, ignore any remaining invalid fields — the portal accepted the data.
411
-
412
- **Text-field specific fix — `imeFriendly: true`.** For text fills where the React-controlled input swallows programmatic value assignment (visible value correct, but `invalidCount` stays >0 and Submit is rejected with "flagged as possible spam" or "field required"), pass `imeFriendly: true` to `geometra_fill_fields`. This fires proper composition events (`compositionstart` / `input` / `compositionend`) that clear React's internal validity state. Confirmed fix on Ashby for Supabase (2026-04-19): first submit rejected despite clean fills; refill with `imeFriendly: true` succeeded on retry. Safe to use as default on all Ashby text fields — no cost if not needed.
413
-
414
- ### Ashby Anti-Bot Spam Filter — Two Failure Classes
415
-
416
- **Symptom:** after a form is filled cleanly (`invalidCount: 0`, all values correct) and Submit is clicked, Ashby returns: *"We couldn't submit your application. Your application submission was flagged as possible spam."*
417
-
418
- These blocks come from two distinct root causes and require different responses:
419
-
420
- | Class | Root cause | Recoverable in-session? | Fix |
421
- |---|---|---|---|
422
- | **A. React-validation lag** | programmatic text input didn't fire composition events; React marks required fields internally missing even though values look correct | Yes | Refill with `imeFriendly: true` and resubmit once. |
423
- | **B. Environment fingerprint** | datacenter IP / VPN / headless Chromium signatures / browser-extension tells detected server-side | No (in headless) | Mark `Failed` with note "Ashby env-fingerprint"; recommend manual submit from user's own browser. |
424
-
425
- **How to tell them apart:** if you saw `invalidCount > 0` and the "required field" error BEFORE submit, class A is likely — retry with `imeFriendly: true`. If the form filled perfectly clean (`invalidCount: 0` on every step) and the spam flag fires only on submit, class B is likely — Ashby's "Learn more" dialog cites VPN/proxy, ad blockers, shared/public network, which `imeFriendly` cannot influence.
426
-
427
- **Evidence (2026-04-19 session):**
428
- - Class A confirmed: Supabase #793 (rejected → refilled with `imeFriendly` → applied).
429
- - Class B confirmed: Unstructured #786 + ClickUp #787 — both filled cleanly with per-field `imeFriendly: true`, both still spam-flagged on submit with identical "VPN / ad blockers / shared network" messaging.
430
-
431
- **Rule — do NOT loop retrying a class B block.** One retry with `imeFriendly: true` is the correct test for class A. If the same spam message fires after a clean `imeFriendly` refill, stop, mark Failed, move on. Repeated retries waste subagent time and do not change the outcome.
432
-
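The class A / class B triage can be sketched as follows. This is one reading of the rules above, not a confirmed implementation: the input flags and action labels are hypothetical, and `imeFriendlyUsed` means the rejected submit already followed a fill made with `imeFriendly: true`.

```javascript
// Hypothetical triage for an Ashby "flagged as possible spam" rejection.
function triageAshbySpamBlock({ invalidBeforeSubmit, imeFriendlyUsed }) {
  if (imeFriendlyUsed) {
    // A clean imeFriendly fill was still flagged: treat as class B and stop.
    return { likelyClass: "B", action: "mark-failed", note: "Ashby env-fingerprint" };
  }
  // No imeFriendly attempt yet: one refill-and-resubmit is the test for class A.
  return {
    likelyClass: invalidBeforeSubmit ? "A" : "B",
    action: "refill-ime-friendly-and-resubmit-once",
    note: null,
  };
}
```

The function never returns a retry action twice for the same form, which is the point of the no-loop rule; a configured residential proxy would be handled before this step and is out of scope for the sketch.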
433
- **Class B fix — BYO residential proxy** (added 2026-04-20 via Geometra MCP v1.59.0). When the candidate has configured `proxy:` in `config/profile.yml`, every `geometra_connect` call threads that proxy through to Chromium, which flips the outbound IP from datacenter to residential/mobile and collapses most class-B failures. See the "BYO Residential Proxy" reference section below. Without a configured proxy, class B stays Failed.
434
-
435
- **Known-block Ashby tenants (2026-04-19 empirical observations).** These tenants fired class B on every attempted submit from a headless datacenter-IP proxy. Orchestrators planning apply dispatches should assume these tenants will Fail in headless — prioritize other portals, or skip same-tenant siblings after a confirmed class B to avoid burning subagent slots:
436
-
437
- - Vellum, Linear, Vanta, River Financial, Higharc, Trace Labs, Solace Health, Unstructured, ClickUp, Zapier, Deepgram, Ramp, WorkOS, Ashby (self-tenant), Perplexity, **Goody**, **Starbridge**, **Graphite**, **Prompt Health**, **Vantage**
438
-
439
- **Known class-A-compatible Ashby tenants (same observations).** These tenants accepted headless submits cleanly, often with `imeFriendly: true` making the difference on the text-field subset:
440
-
441
- - Supabase, LangChain, Poolside, Runway Financial, Sentry, Cognition
442
-
443
- **Base rate for untested Ashby tenants (5/5 tested 2026-04-19 cycle 4 = class B).** The current working prior is that ~80-90% of untested Ashby tenants fingerprint-block headless submits. Orchestrators should treat any tenant not on the class-A-compatible list as likely class B: still dispatch to collect the data point, but don't burn multiple sibling-role slots on the same Ashby tenant.
444
-
445
- The pattern is tenant configuration, not role or company size. Lists drift as tenants tune their anti-bot — treat as probabilistic priors, not hard rules.
446
-
447
- **Ashby choice-group with `optionCount: 1` and no labels (Sentry pattern).** Some Ashby tenants render Yes/No work-authorization questions as `role="button" name="Application"` pill toggles where the accessibility tree exposes neither `Yes` nor `No` labels. `fill_fields` with `choiceType: "group"` silently no-ops; `geometra_click` by `id` also fails to toggle. Fix: fall back to `geometra_click` with RAW x,y coordinates at the button centers (Yes is typically the left button, No is the right). Confirmed on Sentry Staff Platform #845, 2026-04-19.
448
-
449
- ### Other Portal Failure Classes
450
-
451
- **Typeform applications are Geometra-unsupported.** Some companies (Better Stack confirmed, 2026-04-19) route the Apply link to a Typeform wizard (`*.typeform.com/apply-*`). Typeform renders questions via a custom React/canvas layer that does NOT expose input fields to the accessibility tree — `geometra_form_schema` returns "No forms found", `geometra_query role=textbox` returns empty, blind `geometra_type` produces no semantic change. Mark `Failed` with reason "Typeform portal — Geometra unsupported" on detection; do not burn the 9-minute budget attempting blind input.
452
-
453
- **Avature multi-step wizards have a native-`<select>` validation lag (Bloomberg pattern).** Bloomberg's careers site redirects to `bloomberg.avature.net` with a 4-step wizard. On Step 2, native `<select>` elements ("Is Current Position? / No") accept the value but keep `invalid: true` persistently — neither Tab, re-submit, nor re-pick clears it. `imeFriendly` has no effect because the field is a native `<select>`, not React-controlled text. There is no documented recovery. Mark `Failed` with reason "Avature native-select validation lag"; account creation up to that point is preserved for any future manual path. Confirmed on Bloomberg Sr SWE Auth #828, 2026-04-19.
454
-
455
- **Cloudflare / ATS-vendor blocks on Dropbox-class portals.** Dropbox's real apply flow lives behind `happydance.website` (ATS vendor), which Cloudflare-fingerprints headless Chromium + datacenter IPs and returns "Sorry, you have been blocked". `job-boards.greenhouse.io/dropbox` does not mirror — there is no public Greenhouse fallback. Symptom-wise indistinguishable from Ashby class B but at a different layer. Mark `Failed` with reason "ATS vendor Cloudflare block (happydance.website or equivalent)". Confirmed on Dropbox Sr FS Product #831, 2026-04-19.
456
-
457
- **Greenhouse OTP-on-fill variant (Instacart pattern).** Most Greenhouse OTP flows fire on Submit. A minority (Instacart Staff FoodStorm #827, 2026-04-19) fire the 8-cell security-code gate mid-fill, BEFORE the user clicks Submit. Detection: watch for an 8-cell OTP input surfacing after resume upload or the first listbox commit. Fetch from Gmail (`from:greenhouse newer_than:10m`) immediately when it appears — do not wait for Submit.
458
-
459
- **`geometra_fill_otp` char-drop on first fill.** Occasionally `fill_otp` lands only the first character of an 8-char code (seen on Instacart, 2026-04-19). Recovery: click the first cell to focus, then re-issue `fill_otp` with `perCharDelayMs: 120`. The form usually auto-submits once all 8 cells are populated.
460
-
461
- **Breezy portal — tenant-dependent, native `<select>`, resume-auto-parse is primary.** A subset of companies (Avantos AI, Courted, Instinct Science confirmed 2026-04-19) host applications on `*.breezy.hr` or `applytojob.com`. Empirical rules:
462
-
463
- - **Class is per-tenant, not uniform.** Avantos (Failed 2026-04-19 #854) returned Breezy's own "It looks like maybe you've already applied to this job?" banner from IP fingerprinting, even on a first submit — distinct failure mode from Ashby's "flagged as possible spam". Courted (Applied 2026-04-19 #855) went through cleanly on the same session. Don't pre-skip Breezy; the outcome is tenant-specific.
464
- - **Native `<select>` elements, not React comboboxes.** `geometra_pick_listbox_option` sets the visible display but NOT the underlying form state — submit will fail with "A response is required" on every combobox. Use `geometra_select_option` with x,y + label value for every choice field on Breezy.
465
- - **Resume-auto-parse carries the signal.** After resume upload, Breezy auto-parses work history and education into structured rows. Do NOT Add/Delete position rows via Geometra — row mutations reshuffle fieldIds mid-flow, sequential `fill_fields` calls land in wrong rows, and upstream pollution corrupts earlier positions. Trust the parsed resume and fill only Personal Details + salary.
466
-
467
- **Mailto-apply portals — direct email via gmail-mcp `attachments`.** A subset of HN-listed companies (CoPlane, Gambit Robotics, Rinse, Digital Health Strategies confirmed 2026-04-19) don't host an ATS form — their careers page instructs sending resume by email to `founders@...` / `jobs@...` / `contact@...`. Detection: WebFetch the careers URL; if the Apply link resolves to `mailto:` or the copy reads "email your resume to …", skip Geometra entirely.
468
-
469
- Use `gmail_send_message` with the `attachments` parameter (available from `@razroo/gmail-mcp@1.8.0`):
470
-
471
- ```
472
- gmail_send_message({
473
- to: ["founders@example.com"],
474
- subject: "Application — Forward Deployed AI Engineer — Charlie Greenman (Austin)",
475
- body: "<Section G pitch, 4-8 short paragraphs>",
476
- attachments: [{ path: "/abs/path/to/Charlie-Greenman-CV.pdf" }]
477
- })
478
- ```
479
-
480
- The MCP reads the file from disk and builds multipart/mixed MIME server-side — do NOT manually base64-encode a PDF into the `raw` parameter (the inline blob exceeds tool-call argument limits for any real attachment). Subject is auto MIME-encoded for non-ASCII (em-dash, smart quotes) by the same version. For older gmail-mcp versions (< 1.8.0) the only path was a direct Gmail API POST with the stored OAuth token at `~/.gmail-mcp/credentials.json` — upgrade if you can.
481
-
482
- Mark Applied with note `mailto portal — sent via gmail_send_message; Gmail msgId {id}`. Before writing the TSV, verify via `gmail_get_message` that the attachment size on the sent message matches the file on disk.
483
-
484
- ### Greenhouse Bot-Detection Honeypots
485
-
486
- Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) inject a honeypot-style single-pick question on the application form, rendered as a listbox labeled something like "Which of the following best describes you?" with options resembling "I am a human being / I am a bot / I am a robot".
487
-
488
- **Rule:** pick the "I am a human being" option (or whichever option is the obvious human-authentic choice). Bots that pick other options are filtered before submit. This is NOT a validation check — the field will always read back clean — but the submit will be silently discarded if the wrong option is selected.
489
-
490
- If the honeypot question is absent, skip. If present, always pick the human option.
491
-
492
- ### Nested Scroll Containers (Greenhouse / Ashby)
493
-
494
- The major ATS portals (Greenhouse, Workday, Lever, Ashby) use nested scrollable regions. A field's `visibleBounds` may show it as off-screen even when it is actually visible within a child scroll container. Geometra's `scroll_to` operates on the outermost page scroll, so it cannot reach fields in inner scroll regions.
495
-
496
- **Signs you are dealing with nested scroll:**
497
- - `scroll_to` reports `revealed: false` with `maxSteps` exhausted, but you can see the field in the page model
498
- - A field's `y` coordinate in `bounds` is far outside the viewport, yet it is visible on screen
499
- - Wheel events at one `y` coordinate scroll a different region than expected
500
-
501
- **Workaround:**
502
- 1. Use `geometra_wheel` at a low `y` value (e.g., 360, near the top of the viewport) to scroll the outer container
503
- 2. Alternatively, click directly on the element using `geometra_click` with x,y coordinates derived from the element's `visibleBounds` center
504
- 3. Once in the correct scroll region, `scroll_to` within that region works correctly
505
-
506
- ### Corrupted Fields (Text Typed Into Listbox)
507
-
508
- Sometimes text typed into the wrong field (e.g., an essay pasted into a listbox search field) corrupts the field state. The listbox shows the typed text as a search query and refuses to clear.
509
-
510
- **Recovery:**
511
- 1. Find and click the "Clear selections" button (`role: "button"`, `name: "Clear selections"`) — this usually resets the field
512
- 2. After clearing, use `geometra_pick_listbox_option` to select the correct value
513
- 3. If "Clear selections" is not available, try pressing `Escape` multiple times or clicking outside the dropdown
514
-
515
- ### Parallel Form Submissions — Isolated Sessions Required
516
-
517
- When running multiple application forms in parallel, each `geometra_connect` MUST use `isolated: true`. Without this flag, sessions share the Chromium browser pool and contaminate each other's localStorage, cookies, and autocomplete state — one job's email address can leak into another job's form.
518
-
519
- **Correct parallel pattern:**
520
- ```javascript
521
- geometra_connect({ pageUrl: "https://...", isolated: true, headless: true, slowMo: 350 })
522
- ```
523
-
524
- **Wrong:** running `geometra_connect` without `isolated: true` when submitting multiple forms concurrently. The forms may share state and produce incorrect submissions.
525
-
526
- **With a configured proxy,** add `proxy: { server, username?, password?, bypass? }` to the same call — see "BYO Residential Proxy" below. The reusable-proxy pool is partitioned by proxy identity, so mixing direct and proxied sessions across parallel rounds is safe.
527
-
528
- ### Session Reuse — When Subagents Cannot Reach Existing Sessions
529
-
530
- Subagents launched via the `task` tool start with a fresh context and cannot automatically attach to Chromium sessions spawned by a previous orchestrator session. If you dispatch a subagent to fill a form in session `s16`, but `s16` was created by a previous opencode session, the subagent's MCP calls will silently fail (returning empty results) because the subagent's MCP server has no knowledge of `s16`.
531
-
532
- **Rule:** When resuming work on forms that were opened in a previous opencode session, drive them from the current orchestrator session directly — do not delegate to a subagent.
533
-
534
- **Session IDs persist** within the same opencode session. Within one orchestrator session, `geometra_list_sessions` correctly shows all active sessions (s16, s17, s18, and any other s-prefixed IDs from this session) and `geometra_fill_form`, `geometra_page_model`, and other tools work against those sessions. Subagents are only reliable for NEW form-fill sessions they open themselves.
535
-
536
- ### Stale Session Cleanup — MANDATORY
537
-
538
- **Problem in one sentence:** if any previous subagent aborted (ran out of context, timed out, hit tool error), the Chromium session it opened is STUCK in the Geometra MCP pool, and the NEXT `geometra_connect` will fail with `Not connected`.
539
-
540
- **Fix in one sentence:** ALWAYS run `geometra_list_sessions` + `geometra_disconnect` BEFORE `geometra_connect`. Every time. No exceptions except the one explicit exception below.
541
-
542
- ---
543
-
544
- #### Rule 1 — Orchestrator pre-dispatch cleanup (DO THIS EVERY TIME)
545
-
546
- Before dispatching ANY batch of subagents that will use Geometra (apply, scan, pipeline, batch, auto-pipeline), run these TWO tool calls IN ORDER, with these EXACT arguments:
547
-
548
- ```
549
- Step 1: geometra_list_sessions()
550
- Step 2: geometra_disconnect({ closeBrowser: true })
551
- ```
552
-
553
- **DO NOT** think about whether cleanup is needed. **DO NOT** check if sessions look "fine". **DO NOT** skip Step 2 if Step 1 returns an empty list. Just run both, every time, before `task` dispatch. It costs ~100 tokens and prevents cascade failures.
554
-
555
- **Then** dispatch your subagents.
556
-
557
- **Single exception:** if you (the orchestrator) opened a session earlier in THIS SAME conversation and want a subagent to attach to it, skip cleanup and pass the exact `sessionId` to the subagent. This applies to interactive single-application flows only.
558
-
559
- ---
560
-
561
- #### Rule 2 — Subagent pre-flight cleanup (DO THIS EVERY TIME)
562
-
563
- Every subagent that uses Geometra must run these THREE tool calls as its FIRST three tool calls, in this order, with these EXACT arguments:
564
-
565
- ```
566
- Step 1: geometra_list_sessions()
567
- Step 2: geometra_disconnect({ closeBrowser: true })
568
- Step 3: geometra_connect({ pageUrl: "<the URL the orchestrator gave you>", isolated: true, headless: true, slowMo: 350 })
569
- ```
570
-
571
- **If the orchestrator passed a `proxy` object in the task prompt** (sourced from `config/profile.yml`), add it to Step 3:
572
-
573
- ```
574
- Step 3: geometra_connect({
575
- pageUrl: "<URL>", isolated: true, headless: true, slowMo: 350,
576
- proxy: { server: "...", username: "...", password: "...", bypass: "..." }
577
- })
578
- ```
579
-
580
- Pass the proxy object through unchanged. Do NOT paraphrase or drop fields — `username`/`password`/`bypass` are optional, so only include what the orchestrator gave you. See the "BYO Residential Proxy" reference section for the why.
581
-
582
- **DO NOT** skip Step 1 or Step 2. **DO NOT** think about whether it's needed. **DO NOT** look at `geometra_list_sessions` output and reason about it — just always call `geometra_disconnect({ closeBrowser: true })` next. The disconnect is a no-op if the pool is empty, and a poison-cure if it isn't.
583
-
584
- **Single exception:** if the orchestrator's task prompt says literally "attach to sessionId X" or "use existing session X", skip Steps 1-3 and call `geometra_page_model({ sessionId: "X" })` directly.
585
-
586
- ---
587
-
588
- #### Rule 3 — Routing high-value applications
589
-
590
- When the orchestrator dispatches an `apply` (form-fill + submit), pick the subagent based on this table:
591
-
592
- | Offer score | Subagent |
593
- |-------------|----------|
594
- | 3.0-3.9/5 | `@general-free` |
595
- | 4.0+/5 | `@general-paid` |
596
- | User said "top-tier", "dream job", "high-stakes" | `@general-paid` |
597
- | Late-stage pipeline (already passed screens) | `@general-paid` |
598
-
599
- **Why:** form-fill flows are 6+ steps. Free-tier models have smaller context windows and sometimes abort mid-flow when the form schema is large (Greenhouse, Workday). Paid tier has more headroom. Evaluation and procedural non-apply work stay on `@general-free` — only the `apply` step gets upgraded.
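The routing table above maps onto a short function. This is a minimal sketch, assuming the orchestrator already has the numeric offer score and two boolean flags; the name `pickApplySubagent` and the parameter names `highStakes` and `lateStage` are hypothetical. Scores below 3.0 are not covered by the table, so the fallback to `@general-free` for them is an assumption of this sketch, not a documented rule.

```javascript
// Hypothetical helper (not part of JobForge): picks the subagent for an
// `apply` dispatch from the routing table.
function pickApplySubagent({ score, highStakes = false, lateStage = false }) {
  // "top-tier" / "dream job" / "high-stakes" and late-stage pipelines
  // are upgraded regardless of score.
  if (highStakes || lateStage) {
    return "@general-paid";
  }
  // 4.0+/5 goes to the paid tier; 3.0-3.9/5 stays on the free tier.
  // Assumption: anything below 3.0 also falls through to the free tier.
  return score >= 4.0 ? "@general-paid" : "@general-free";
}
```

Only the `apply` step would go through a function like this; evaluation and other non-apply work stay on `@general-free` as stated above.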
600
-
601
- ---
602
-
603
- ## BYO Residential Proxy — opt-in outbound-IP override
604
-
605
- **Problem:** on 2026-04-19 cycle 4, 5/5 untested Ashby tenants and 100% of Dropbox-class Cloudflare-fronted portals fingerprint-blocked headless Chromium from datacenter IPs. `imeFriendly: true` fixes class A (React validation lag) but has zero effect on class B (environment fingerprint). There is no in-session software-only fix for class B: the server decided the session is a bot before the form response was rendered.
606
-
607
- **Fix:** route the spawned Chromium through a residential or mobile proxy the candidate already pays for. Geometra MCP v1.59.0 added a `proxy: { server, username?, password?, bypass? }` parameter on `geometra_connect` and `geometra_prepare_browser` that forwards straight to Playwright's `chromium.launch({ proxy })`. The outbound IP becomes residential/mobile, and the fingerprint check that fired class B no longer trips.
608
-
609
- **Opt-in, BYO.** JobForge does NOT bundle or resell proxy bandwidth — the candidate brings their own provider (Bright Data, Oxylabs, SOAX, Smartproxy, mobile hotspot, self-hosted SOCKS). Without a configured proxy, JobForge behavior is unchanged from v2.11.0 and earlier.
610
-
611
- ### Where the proxy config lives
612
-
613
- `config/profile.yml` → top-level `proxy:` block:
614
-
615
- ```yaml
616
- proxy:
617
- server: "http://residential.example.com:8080" # http://, https://, or socks5://
618
- username: "your-proxy-username" # optional
619
- password: "your-proxy-password" # optional
620
- bypass: "*.internal,localhost" # optional
621
- ```
622
-
623
- See `config/profile.example.yml` for the commented-out template.
624
-
625
- ### How the orchestrator threads it through
626
-
627
- **Orchestrator responsibilities:**
628
-
629
- 1. On session start, read `config/profile.yml` once. If a `proxy:` block is present, capture it as the `PROXY_CONFIG` for the session.
630
- 2. When dispatching any subagent whose work involves a `geometra_connect` call, include `PROXY_CONFIG` verbatim in the task prompt. Example dispatch prompt line: "Pass `proxy: { server: ..., username: ..., password: ..., bypass: ... }` to every `geometra_connect` call you make."
631
- 3. When the orchestrator itself opens a Chromium session (single-application interactive flow), include the same `proxy` object in its own `geometra_connect` call.
632
- 4. If `proxy:` is absent from `profile.yml`, skip the param entirely. Do NOT invent a proxy URL or leave a stale placeholder.
633
-
634
- **Subagent responsibilities:**
635
-
636
- 1. If the task prompt includes a `proxy` object, pass it through to `geometra_connect` and any `geometra_prepare_browser` calls unchanged.
637
- 2. If the task prompt does NOT include a proxy object, run without one.
638
- 3. Never second-guess the proxy field — if the orchestrator sourced it from `profile.yml`, it's authoritative.
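The pass-through rules above can be sketched as a function that builds the `geometra_connect` arguments. This is an illustration only: `buildConnectArgs` is a hypothetical helper, and it assumes the `proxy:` block from `config/profile.yml` has already been parsed into a plain object (or is `undefined` when the block is absent). The non-proxy arguments mirror the connect call shown in the cleanup rules.

```javascript
// Hypothetical helper (not part of JobForge): assembles the argument object
// for a geometra_connect call, adding `proxy` only when one is configured.
function buildConnectArgs(pageUrl, proxyConfig) {
  const args = { pageUrl, isolated: true, headless: true, slowMo: 350 };

  // No proxy block in profile.yml: skip the param entirely.
  if (!proxyConfig || !proxyConfig.server) {
    return args;
  }

  // Pass the configured fields through unchanged; username, password and
  // bypass are optional, so only copy the ones that are actually present.
  const proxy = { server: proxyConfig.server };
  for (const key of ["username", "password", "bypass"]) {
    if (proxyConfig[key] !== undefined) {
      proxy[key] = proxyConfig[key];
    }
  }
  return { ...args, proxy };
}
```

Omitting the key rather than sending `proxy: undefined` or a placeholder keeps the direct and proxied cases cleanly separated.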
639
-
640
- ### When proxy use is load-bearing
641
-
642
- Apply these rules when deciding whether the proxy is worth waiting for:
643
-
644
- - **Required** for known-block Ashby tenants (see the class-B list in the Ashby section above), for `happydance.website` / Cloudflare-fronted ATSes, and for any Lever tenant that previously failed in the class-B pattern.
645
- - **Recommended** for any Ashby tenant NOT on the class-A-compatible list (base rate prior: ~80-90% block headless).
646
- - **Optional** for Greenhouse, Workday, Lever-clean tenants — these accept datacenter IPs today; using the proxy adds ~100ms per frame but no material downside.
647
- - **Not useful** for Typeform (Geometra-unsupported), Avature native-select lag (not a fingerprint issue), JazzHR+reCAPTCHA (reCAPTCHA scores unrelated to IP), Breezy (tenant-configured per-IP throttle — proxy may help or may hit a fresh throttle).
648
-
649
- ### Pool partitioning — why mixed runs are safe
650
-
651
- The Geometra MCP partitions its reusable-proxy pool by `(server, username, bypass)` — see `@geometra/mcp@1.59.0` release notes. A direct session and a proxied session NEVER share a Chromium instance, and two sessions with different proxy configs don't pool either. Practical consequence: flipping `proxy:` on or off in `profile.yml` mid-session is safe — the next `geometra_connect` just opens a fresh Chromium in its own pool partition.
652
-
653
- ### Troubleshooting
654
-
655
- | Symptom | Diagnosis |
656
- |---|---|
657
- | `Error: Failed to connect to proxy` immediately after `geometra_connect` | Proxy URL is wrong / unreachable. Verify the `server:` field hits the right host:port. |
658
- | `407 Proxy Authentication Required` | `username` or `password` is wrong or missing. Many residential providers require both. |
659
- | Class-B submit failure persists even with proxy set | (a) the proxy is a datacenter proxy, not residential; (b) the tenant has IP-banned your proxy's IP pool; (c) the tenant uses TLS or canvas fingerprinting rather than IP: switch to a fresh Chromium (`isolated: true`) and retry once, else mark Failed. |
660
- | Every `geometra_connect` is 3-5s slower than before | Expected — residential proxies add latency. Trade-off for higher submit-success rate. Do NOT revert unless the acceptance-rate lift is < 5%. |
661
-
662
- - Node.js (mjs modules), Geometra MCP (PDF + scraping + form filling), Gmail MCP (email), YAML (config), HTML/CSS (template), Markdown (data)
663
-
664
- ### MCP Configuration
665
-
666
- **Current MCP servers** (configured in `opencode.json`):
667
-
668
- | MCP | Package | Purpose |
669
- |-----|---------|---------|
670
- | `geometra` | `@geometra/mcp` | PDF generation, web scraping, form filling |
671
- | `gmail` | `@razroo/gmail-mcp` | Email integration (drafts, send, labels, threads) |
672
-
673
- ```json
674
- {
675
- "mcp": {
676
- "geometra": {
677
- "type": "stdio",
678
- "command": "npx",
679
- "args": ["-y", "@geometra/mcp"]
680
- },
681
- "gmail": {
682
- "type": "stdio",
683
- "command": "npx",
684
- "args": ["-y", "@razroo/gmail-mcp"]
685
- }
686
- }
687
- }
688
- ```
689
-
690
- To check or modify MCP settings, edit `opencode.json` in the project root.
691
- - Scripts in `.mjs`, configuration in YAML
692
- - Output in `output/` (gitignored), Reports in `reports/`
693
- - JDs in `jds/` (referenced as `local:jds/{file}` in pipeline.md)
694
- - Batch in `batch/` (gitignored except scripts and prompt)
695
- - Report numbering: sequential 3-digit zero-padded. **Always use `npx job-forge next-num` to get the next number** — do NOT derive it yourself from `ls reports/`. The CLI scans all sources: `reports/*.md`, the `#` column of every `data/applications/*.md` day file, and pending + merged `batch/tracker-additions/*.tsv`. Deriving from `reports/` alone misses numbers assigned by prior-day tracker additions that were never written as report files (e.g., `SKIP` entries), which causes ID collisions downstream.
696
- - **RULE: After each batch of evaluations, run `npx job-forge merge`** to merge tracker additions and avoid duplicates.
697
- - **RULE: NEVER create new entries in applications.md if company+role already exists.** Update the existing entry.
698
- - **RULE: NEVER attribute commits to opencode (no `Co-Authored-By: opencode` or similar).** All commits must be attributed solely to the person making the commit (e.g., CharlieGreenman).
699
-
700
- ### TSV Format for Tracker Additions
701
-
702
- Write one TSV file per evaluation to `batch/tracker-additions/{num}-{company-slug}.tsv`. Single line, 9 tab-separated columns:
703
-
704
- ```
705
- {num}\t{date}\t{company}\t{role}\t{status}\t{score}/5\t{pdf_emoji}\t[{num}](reports/{num}-{slug}-{date}.md)\t{note}
706
- ```
707
-
708
- **Column order (IMPORTANT -- status BEFORE score):**
709
- 1. `num` -- sequential number (integer)
710
- 2. `date` -- YYYY-MM-DD
711
- 3. `company` -- short company name
712
- 4. `role` -- job title
713
- 5. `status` -- canonical status (e.g., `Evaluated`)
714
- 6. `score` -- format `X.X/5` (e.g., `4.2/5`)
715
- 7. `pdf` -- `✅` or `❌`
716
- 8. `report` -- markdown link `[num](reports/...)`
717
- 9. `notes` -- one-line summary
718
-
719
- **Note:** In applications.md, score comes BEFORE status. The merge script handles this column swap automatically.
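The 9-column layout above can be sketched as a line builder. This is a minimal sketch, not the project's own code: `buildTrackerTsvLine` and its parameter names are hypothetical, `num` is assumed to be the value returned by `npx job-forge next-num` and is used as-is, and collapsing stray tabs or newlines inside a value is a choice made for this sketch so the output stays a single 9-column line.

```javascript
// Hypothetical helper (not part of JobForge): builds one tracker-addition
// line with status BEFORE score, as the TSV format requires.
function buildTrackerTsvLine({ num, date, company, role, status, score, hasPdf, slug, note }) {
  const columns = [
    String(num),                                   // 1. num
    date,                                          // 2. date (YYYY-MM-DD)
    company,                                       // 3. company
    role,                                          // 4. role
    status,                                        // 5. status (canonical)
    `${score.toFixed(1)}/5`,                       // 6. score as X.X/5
    hasPdf ? "✅" : "❌",                           // 7. pdf
    `[${num}](reports/${num}-${slug}-${date}.md)`, // 8. report link
    note,                                          // 9. notes
  ];
  // Sketch-level safeguard: a tab or newline inside a value would break
  // the single-line, 9-column shape, so collapse them to a space.
  return columns.map((c) => String(c).replace(/[\t\r\n]+/g, " ")).join("\t");
}
```

The resulting string would be written to `batch/tracker-additions/{num}-{company-slug}.tsv`, leaving the score/status swap for applications.md to the merge script.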
720
-
721
- ### Pipeline Integrity
722
-
723
- 1. **NEVER edit day files in `data/applications/` to ADD new entries** -- Write TSV in `batch/tracker-additions/` and `npx job-forge merge` handles the merge.
724
- 2. **YES you can edit day files in `data/applications/` to UPDATE status/notes of existing entries.**
725
- 3. All reports MUST include `**URL:**` in the header (between Score and PDF).
726
- 4. All statuses MUST be canonical (see `templates/states.yml`).
727
- 5. Health check: `npx job-forge verify`
728
- 6. Normalize statuses: `npx job-forge normalize`
729
- 7. Dedup: `npx job-forge dedup`
730
-
731
- ### Canonical States (applications day files)
732
-
733
- **Source of truth:** `templates/states.yml`
91
+ The sections above are the shared contract. Load detailed context on demand:
734
92
 
735
- | State | When to use |
736
- |-------|-------------|
737
- | `Evaluated` | Report completed, pending decision |
738
- | `Applied` | Application sent |
739
- | `Responded` | Company responded |
740
- | `Contacted` | Candidate proactively reached out (LinkedIn, email) — awaiting response |
741
- | `Interview` | In interview process |
742
- | `Offer` | Offer received |
743
- | `Rejected` | Rejected by company |
744
- | `Discarded` | Discarded by candidate or offer closed |
745
- | `Failed` | Submission attempted but blocked by portal (spam-filter, anti-bot, broken form). May be recoverable via manual retry. |
746
- | `SKIP` | Doesn't fit, don't apply |
93
+ - `modes/{mode}.md` for the active mode procedure, output shape, and mode-specific routing.
94
+ - `modes/reference-setup.md` for onboarding, tracker layout, states, and profile/CV setup.
95
+ - `modes/reference-portals.md` for OTP, residential proxy, and MCP configuration.
96
+ - `modes/reference-geometra.md` for form-fill patterns, portal failures, cleanup runbooks, and session recovery.
747
97
 
748
- **RULES:**
749
- - No markdown bold (`**`) in status field
750
- - No dates in status field (use the date column)
751
- - No extra text (use the notes column)
98
+ Do not pre-load all reference files. Read only the active mode file and the reference file needed for the current blocker.