job-forge 2.3.0 → 2.5.0

@@ -90,22 +90,23 @@ The harness ships three subagents (see `.opencode/agents/`). The orchestrator MU
90
90
 
91
91
  **When to break this rule:** if the user explicitly asks for "quality over cost" or flags a high-stakes application (top-tier company, offer-stage negotiation, executive search), route everything through `@general-paid`. Document the exception in the session.
92
92
 
93
- ### Pre-flight delegation (HARD RULE)
93
+ ### When to delegate
94
94
 
95
- For any task that will involve **more than one tool call** (i.e., anything beyond a one-shot answer), the orchestrator's **first tool call MUST be `task`** (dispatching to a subagent). Not `Read`, not `Bash`, not `geometra_connect`, not `Grep`. The orchestrator plans and dispatches; subagents execute.
95
+ **Delegate (`task` out) when the work involves repeated tool-heavy steps that bloat the orchestrator's cache prefix.** The concrete failure mode this prevents: a 341-message "apply to 20 jobs" session where repeated `geometra_fill_form` / `geometra_page_model` calls accumulated in history, forcing each new message to re-process 100K+ tokens of fresh input instead of reading from cache.
96
96
 
97
- **Why this is absolute:** every tool call in the orchestrator accumulates in the top-level session's history and pollutes the cache prefix. Once the orchestrator has read three files and made two Geometra calls, delegating to a subagent no longer helps — the subagent inherits the bloated context. The only way to keep the orchestrator lean is to delegate *before* doing anything else.
97
+ **Delegate when:**
98
+ - Applying to N≥2 jobs (repeated Geometra form-fill — the original cache-bust scenario)
99
+ - Batch portal scans hitting ≥3 companies (API loops + page-model reads stack up)
100
+ - Any explicit "apply to... / process pipeline / batch evaluate" phrasing from the user (multi-job intent)
98
101
 
99
- **What counts as "more than one tool call":**
100
- - Evaluating any offer (always ≥3 steps: fetch JD, score, write report)
101
- - Any `/job-forge` mode invocation except `tracker` (read-only)
102
- - Applying to a job
103
- - Scanning portals
104
- - Any batch operation
102
+ **Do NOT delegate (orchestrate inline):**
103
+ - Single-offer evaluation (text-heavy, not tool-heavy)
104
+ - Development / bug-fix / file-editing tasks
105
+ - `tracker` and other read-only modes
106
+ - Single-company scan, single-URL check
107
+ - One-shot questions — "what does this mean?", "read X and summarize", "what's my next report number?"
105
108
 
106
- **Explicit exception:** trivial one-shot answers ("what does this error mean?", "read this file and summarize", "what's my next report number?") can stay in the orchestrator. If the question can be answered in ≤1 tool call, do not delegate.
107
-
108
- **Detection signal:** if you (orchestrator) find yourself about to make your 2nd tool call in a session that wasn't a trivial one-shot, STOP. Instead, `task` out the remaining work as a single delegated job.
109
+ **Detection signal:** if you're about to call `geometra_fill_form` for a second *different* job in the same session, STOP and delegate the remainder. For everything else, in-session execution is the expected default.
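
The new delegation thresholds can be read as a small decision function. The sketch below is illustrative only: `TaskPlan`, its fields, and `should_delegate` are hypothetical names, not part of job-forge, and deciding whether a user request signals multi-job intent is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class TaskPlan:
    """Hypothetical summary of a planned task; not a job-forge type."""
    apply_job_count: int = 0        # jobs that need a Geometra form-fill
    scan_company_count: int = 0     # companies covered by a portal scan
    multi_job_intent: bool = False  # caller judged the request as batch phrasing


def should_delegate(plan: TaskPlan) -> bool:
    """Return True when the plan matches one of the 'Delegate when' bullets."""
    if plan.apply_job_count >= 2:       # applying to N>=2 jobs
        return True
    if plan.scan_company_count >= 3:    # batch scan hitting >=3 companies
        return True
    return plan.multi_job_intent        # explicit batch phrasing from the user
```

Everything that falls through (single-offer evaluation, file edits, read-only modes, one-shot questions) stays inline, which matches the stated default.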
109
110
 
110
111
  ---
111
112
 
@@ -307,6 +308,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
307
308
  | Lever | `from:lever newer_than:10m` |
308
309
  | Ashby | `from:ashby newer_than:10m` |
309
310
  | SmartRecruiters | `from:smartrecruiters newer_than:10m` |
311
+ | Toast (via ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m` |
310
312
  | Aggregator redirect (WeWorkRemotely / RemoteOK) | Detect the underlying ATS from the post-redirect URL, then use that row's sender query |
311
313
  | Unknown | `newer_than:10m subject:(verify OR code OR confirm)` |
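
As a rough illustration, the sender rows visible in this hunk could be held in a lookup that falls back to the `Unknown` query. The names `OTP_QUERIES`, `UNKNOWN_QUERY`, and `otp_queries` are hypothetical, and rows outside this hunk are deliberately omitted rather than guessed.

```python
# Gmail search queries copied from the table rows shown in this hunk.
# Toast has two alternatives, so every value is a list tried in order.
OTP_QUERIES = {
    "lever": ["from:lever newer_than:10m"],
    "ashby": ["from:ashby newer_than:10m"],
    "smartrecruiters": ["from:smartrecruiters newer_than:10m"],
    "toast": [
        "from:toast.mail.clinchtalent.com newer_than:15m",
        'subject:"verify your login at Toast" newer_than:15m',
    ],
}

# Fallback query from the "Unknown" row.
UNKNOWN_QUERY = "newer_than:10m subject:(verify OR code OR confirm)"


def otp_queries(sender_key: str) -> list[str]:
    """Return the queries to try for a sender key, or the Unknown fallback."""
    return OTP_QUERIES.get(sender_key.strip().lower(), [UNKNOWN_QUERY])
```

For aggregator redirects, the caller would first resolve the underlying ATS from the post-redirect URL and then pass that key.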
312
314
 
@@ -314,6 +316,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
314
316
  - ALWAYS check Gmail before reporting a submission as failed.
315
317
  - If "submit button did nothing", it usually means an OTP step appeared. Check Gmail.
316
318
  - If no email after 10 seconds, retry `gmail_list_messages` once more with `newer_than:5m`.
319
+ - **Some Greenhouse tenants route OTP through third-party verification (Toast uses ClinchTalent).** If `from:greenhouse` returns empty after a Greenhouse submit, check the tenant-specific sender row above. Confirmed 2026-04-19: Toast Principal SWE #807 and Toast Senior FE #808.
317
320
 
318
321
  ---
319
322
 
@@ -369,6 +372,38 @@ These blocks come from two distinct root causes and require different responses:
369
372
 
370
373
  **Rule — do NOT loop retrying a class B block.** One retry with `imeFriendly: true` is the correct test for class A. If the same spam message fires after a clean `imeFriendly` refill, stop, mark Failed, move on. Repeated retries waste subagent time and do not change the outcome.
371
374
 
375
+ **Known-block Ashby tenants (2026-04-19 empirical observations).** These tenants fired class B on every attempted submit from a headless datacenter-IP proxy. Orchestrators planning apply dispatches should assume these tenants will Fail in headless — prioritize other portals, or skip same-tenant siblings after a confirmed class B to avoid burning subagent slots:
376
+
377
+ - Vellum, Linear, Vanta, River Financial, Higharc, Trace Labs, Solace Health, Unstructured, ClickUp, Zapier, Deepgram, Ramp, WorkOS, **Ashby (self-tenant)**, **Perplexity**
378
+
379
+ **Known class-A-compatible Ashby tenants (same observations).** These tenants accepted headless submits cleanly, often with `imeFriendly: true` making the difference on the text-field subset:
380
+
381
+ - Supabase, LangChain, Poolside, Runway Financial, **Sentry**, **Cognition**
382
+
383
+ The pattern is tenant configuration, not role or company size. Lists drift as tenants tune their anti-bot — treat as probabilistic priors, not hard rules.
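
One way to apply these lists as priors when planning dispatches is a plain lookup, sketched below. `ashby_prior` and the returned labels are hypothetical; the tenant names are copied from the two lists above, and an unlisted tenant is reported as unknown rather than assumed safe.

```python
# Tenant names from the 2026-04-19 observations, lowercased for matching.
KNOWN_BLOCK = frozenset({
    "vellum", "linear", "vanta", "river financial", "higharc", "trace labs",
    "solace health", "unstructured", "clickup", "zapier", "deepgram", "ramp",
    "workos", "ashby", "perplexity",
})
KNOWN_COMPATIBLE = frozenset({
    "supabase", "langchain", "poolside", "runway financial", "sentry",
    "cognition",
})


def ashby_prior(tenant: str) -> str:
    """Return a planning prior for an Ashby tenant; a hint, not a hard rule."""
    name = tenant.strip().lower()
    if name in KNOWN_BLOCK:
        return "likely-class-b"   # deprioritize or skip in headless runs
    if name in KNOWN_COMPATIBLE:
        return "likely-ok"        # headless submit was accepted before
    return "unknown"              # no observation; attempt normally
```

Because the lists drift, a confirmed class B on an "unknown" or "likely-ok" tenant should still override the prior.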
384
+
385
+ **Ashby choice-group with `optionCount: 1` and no labels (Sentry pattern).** Some Ashby tenants render Yes/No work-authorization questions as `role="button" name="Application"` pill toggles where the accessibility tree exposes neither `Yes` nor `No` labels. `fill_fields` with `choiceType: "group"` silently no-ops; `geometra_click` by `id` also fails to toggle. Fix: fall back to `geometra_click` with RAW x,y coordinates at the button centers (Yes is typically the left button, No is the right). Confirmed on Sentry Staff Platform #845, 2026-04-19.
386
+
387
+ ### Other Portal Failure Classes
388
+
389
+ **Typeform applications are Geometra-unsupported.** Some companies (Better Stack confirmed, 2026-04-19) route the Apply link to a Typeform wizard (`*.typeform.com/apply-*`). Typeform renders questions via a custom React/canvas layer that does NOT expose input fields to the accessibility tree — `geometra_form_schema` returns "No forms found", `geometra_query role=textbox` returns empty, blind `geometra_type` produces no semantic change. Mark `Failed` with reason "Typeform portal — Geometra unsupported" on detection; do not burn the 9-minute budget attempting blind input.
390
+
391
+ **Avature multi-step wizards have a native-`<select>` validation lag (Bloomberg pattern).** Bloomberg's careers site redirects to `bloomberg.avature.net` with a 4-step wizard. On Step 2, native `<select>` elements ("Is Current Position? / No") accept the value but keep `invalid: true` persistently — neither Tab, re-submit, nor re-pick clears it. `imeFriendly` has no effect because the field is a native `<select>`, not React-controlled text. There is no documented recovery. Mark `Failed` with reason "Avature native-select validation lag"; account creation up to that point is preserved for any future manual path. Confirmed on Bloomberg Sr SWE Auth #828, 2026-04-19.
392
+
393
+ **Cloudflare / ATS-vendor blocks on Dropbox-class portals.** Dropbox's real apply flow lives behind `happydance.website` (ATS vendor), which Cloudflare-fingerprints headless Chromium + datacenter IPs and returns "Sorry, you have been blocked". `job-boards.greenhouse.io/dropbox` does not mirror — there is no public Greenhouse fallback. Symptom-wise indistinguishable from Ashby class B but at a different layer. Mark `Failed` with reason "ATS vendor Cloudflare block (happydance.website or equivalent)". Confirmed on Dropbox Sr FS Product #831, 2026-04-19.
394
+
395
+ **Greenhouse OTP-on-fill variant (Instacart pattern).** Most Greenhouse OTP flows fire on Submit. A minority (Instacart Staff FoodStorm #827, 2026-04-19) fire the 8-cell security-code gate mid-fill, BEFORE the user clicks Submit. Detection: watch for an 8-cell OTP input surfacing after resume upload or the first listbox commit. Fetch from Gmail (`from:greenhouse newer_than:10m`) immediately when it appears — do not wait for Submit.
396
+
397
+ **`geometra_fill_otp` char-drop on first fill.** Occasionally `fill_otp` lands only the first character of an 8-char code (seen on Instacart, 2026-04-19). Recovery: click the first cell to focus, then re-issue `fill_otp` with `perCharDelayMs: 120`. The form usually auto-submits once all 8 cells are populated.
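
The recovery sequence can be sketched with injected callables standing in for the Geometra tools. Everything here is an assumption made for illustration: `fill_otp` is modelled as a function returning how many cells ended up populated, `focus_first_cell` stands in for the click on the first cell, and the single retry is this sketch's choice rather than a documented limit.

```python
from typing import Callable, Optional


def fill_otp_with_retry(
    code: str,
    fill_otp: Callable[[str, Optional[int]], int],
    focus_first_cell: Callable[[], None],
    retry_delay_ms: int = 120,
) -> bool:
    """Fill an OTP; on a short fill, refocus and retry once with a per-char delay.

    fill_otp(code, per_char_delay_ms) is a hypothetical wrapper that returns
    the number of cells populated after the call.
    """
    if fill_otp(code, None) == len(code):
        return True                       # first attempt landed every character
    focus_first_cell()                    # click the first cell to focus
    return fill_otp(code, retry_delay_ms) == len(code)
```

Returning a boolean leaves the caller free to mark the application `Failed` if the slower retry also drops characters.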
398
+
399
+ ### Greenhouse Bot-Detection Honeypots
400
+
401
+ Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) inject a honeypot-style single-pick question on the application form, rendered as a listbox labeled something like "Which of the following best describes you?" with options resembling "I am a human being / I am a bot / I am a robot".
402
+
403
+ **Rule:** pick the "I am a human being" option (or whichever option is the obvious human-authentic choice). Bots that pick other options are filtered before submit. This is NOT a validation check — the field will always read back clean — but the submit will be silently discarded if the wrong option is selected.
404
+
405
+ If the honeypot question is absent, skip. If present, always pick the human option.
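
A minimal heuristic for choosing the human option from the listbox labels might look like the sketch below. `pick_human_option` is a hypothetical helper, the keyword matching is an assumption based only on the example wording above, and it returns `None` when the choice is not obvious so the caller can fall back to manual judgment.

```python
import re
from typing import Optional

_HUMAN = re.compile(r"\bhuman\b", re.IGNORECASE)
_BOT = re.compile(r"\b(bot|robot)\b", re.IGNORECASE)


def pick_human_option(options: list[str]) -> Optional[str]:
    """Return the single option that mentions 'human' and no bot wording."""
    candidates = [
        opt for opt in options
        if _HUMAN.search(opt) and not _BOT.search(opt)
    ]
    # Only answer when exactly one option qualifies; otherwise stay undecided.
    return candidates[0] if len(candidates) == 1 else None
```

Tenants may word the question differently, so an undecided result should not be treated as "honeypot absent".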
406
+
372
407
  ### Nested Scroll Containers (Greenhouse / Ashby)
373
408
 
374
409
  The major ATS portals (Greenhouse, Workday, Lever, Ashby) use nested scrollable regions. A field's `visibleBounds` may show it as off-screen even when it is actually visible within a child scroll container. Geometra's `scroll_to` operates on the outermost page scroll, so it cannot reach fields in inner scroll regions.
package/AGENTS.md CHANGED
@@ -85,22 +85,23 @@ The harness ships three subagents (see `.opencode/agents/`). The orchestrator MU
85
85
 
86
86
  **When to break this rule:** if the user explicitly asks for "quality over cost" or flags a high-stakes application (top-tier company, offer-stage negotiation, executive search), route everything through `@general-paid`. Document the exception in the session.
87
87
 
88
- ### Pre-flight delegation (HARD RULE)
88
+ ### When to delegate
89
89
 
90
- For any task that will involve **more than one tool call** (i.e., anything beyond a one-shot answer), the orchestrator's **first tool call MUST be `task`** (dispatching to a subagent). Not `Read`, not `Bash`, not `geometra_connect`, not `Grep`. The orchestrator plans and dispatches; subagents execute.
90
+ **Delegate (`task` out) when the work involves repeated tool-heavy steps that bloat the orchestrator's cache prefix.** The concrete failure mode this prevents: a 341-message "apply to 20 jobs" session where repeated `geometra_fill_form` / `geometra_page_model` calls accumulated in history, forcing each new message to re-process 100K+ tokens of fresh input instead of reading from cache.
91
91
 
92
- **Why this is absolute:** every tool call in the orchestrator accumulates in the top-level session's history and pollutes the cache prefix. Once the orchestrator has read three files and made two Geometra calls, delegating to a subagent no longer helps — the subagent inherits the bloated context. The only way to keep the orchestrator lean is to delegate *before* doing anything else.
92
+ **Delegate when:**
93
+ - Applying to N≥2 jobs (repeated Geometra form-fill — the original cache-bust scenario)
94
+ - Batch portal scans hitting ≥3 companies (API loops + page-model reads stack up)
95
+ - Any explicit "apply to... / process pipeline / batch evaluate" phrasing from the user (multi-job intent)
93
96
 
94
- **What counts as "more than one tool call":**
95
- - Evaluating any offer (always ≥3 steps: fetch JD, score, write report)
96
- - Any `/job-forge` mode invocation except `tracker` (read-only)
97
- - Applying to a job
98
- - Scanning portals
99
- - Any batch operation
97
+ **Do NOT delegate (orchestrate inline):**
98
+ - Single-offer evaluation (text-heavy, not tool-heavy)
99
+ - Development / bug-fix / file-editing tasks
100
+ - `tracker` and other read-only modes
101
+ - Single-company scan, single-URL check
102
+ - One-shot questions — "what does this mean?", "read X and summarize", "what's my next report number?"
100
103
 
101
- **Explicit exception:** trivial one-shot answers ("what does this error mean?", "read this file and summarize", "what's my next report number?") can stay in the orchestrator. If the question can be answered in ≤1 tool call, do not delegate.
102
-
103
- **Detection signal:** if you (orchestrator) find yourself about to make your 2nd tool call in a session that wasn't a trivial one-shot, STOP. Instead, `task` out the remaining work as a single delegated job.
104
+ **Detection signal:** if you're about to call `geometra_fill_form` for a second *different* job in the same session, STOP and delegate the remainder. For everything else, in-session execution is the expected default.
104
105
 
105
106
  ---
106
107
 
@@ -302,6 +303,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
302
303
  | Lever | `from:lever newer_than:10m` |
303
304
  | Ashby | `from:ashby newer_than:10m` |
304
305
  | SmartRecruiters | `from:smartrecruiters newer_than:10m` |
306
+ | Toast (via ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m` |
305
307
  | Aggregator redirect (WeWorkRemotely / RemoteOK) | Detect the underlying ATS from the post-redirect URL, then use that row's sender query |
306
308
  | Unknown | `newer_than:10m subject:(verify OR code OR confirm)` |
307
309
 
@@ -309,6 +311,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
309
311
  - ALWAYS check Gmail before reporting a submission as failed.
310
312
  - If "submit button did nothing", it usually means an OTP step appeared. Check Gmail.
311
313
  - If no email after 10 seconds, retry `gmail_list_messages` once more with `newer_than:5m`.
314
+ - **Some Greenhouse tenants route OTP through third-party verification (Toast uses ClinchTalent).** If `from:greenhouse` returns empty after a Greenhouse submit, check the tenant-specific sender row above. Confirmed 2026-04-19: Toast Principal SWE #807 and Toast Senior FE #808.
312
315
 
313
316
  ---
314
317
 
@@ -364,6 +367,38 @@ These blocks come from two distinct root causes and require different responses:
364
367
 
365
368
  **Rule — do NOT loop retrying a class B block.** One retry with `imeFriendly: true` is the correct test for class A. If the same spam message fires after a clean `imeFriendly` refill, stop, mark Failed, move on. Repeated retries waste subagent time and do not change the outcome.
366
369
 
370
+ **Known-block Ashby tenants (2026-04-19 empirical observations).** These tenants fired class B on every attempted submit from a headless datacenter-IP proxy. Orchestrators planning apply dispatches should assume these tenants will Fail in headless — prioritize other portals, or skip same-tenant siblings after a confirmed class B to avoid burning subagent slots:
371
+
372
+ - Vellum, Linear, Vanta, River Financial, Higharc, Trace Labs, Solace Health, Unstructured, ClickUp, Zapier, Deepgram, Ramp, WorkOS, **Ashby (self-tenant)**, **Perplexity**
373
+
374
+ **Known class-A-compatible Ashby tenants (same observations).** These tenants accepted headless submits cleanly, often with `imeFriendly: true` making the difference on the text-field subset:
375
+
376
+ - Supabase, LangChain, Poolside, Runway Financial, **Sentry**, **Cognition**
377
+
378
+ The pattern is tenant configuration, not role or company size. Lists drift as tenants tune their anti-bot — treat as probabilistic priors, not hard rules.
379
+
380
+ **Ashby choice-group with `optionCount: 1` and no labels (Sentry pattern).** Some Ashby tenants render Yes/No work-authorization questions as `role="button" name="Application"` pill toggles where the accessibility tree exposes neither `Yes` nor `No` labels. `fill_fields` with `choiceType: "group"` silently no-ops; `geometra_click` by `id` also fails to toggle. Fix: fall back to `geometra_click` with RAW x,y coordinates at the button centers (Yes is typically the left button, No is the right). Confirmed on Sentry Staff Platform #845, 2026-04-19.
381
+
382
+ ### Other Portal Failure Classes
383
+
384
+ **Typeform applications are Geometra-unsupported.** Some companies (Better Stack confirmed, 2026-04-19) route the Apply link to a Typeform wizard (`*.typeform.com/apply-*`). Typeform renders questions via a custom React/canvas layer that does NOT expose input fields to the accessibility tree — `geometra_form_schema` returns "No forms found", `geometra_query role=textbox` returns empty, blind `geometra_type` produces no semantic change. Mark `Failed` with reason "Typeform portal — Geometra unsupported" on detection; do not burn the 9-minute budget attempting blind input.
385
+
386
+ **Avature multi-step wizards have a native-`<select>` validation lag (Bloomberg pattern).** Bloomberg's careers site redirects to `bloomberg.avature.net` with a 4-step wizard. On Step 2, native `<select>` elements ("Is Current Position? / No") accept the value but keep `invalid: true` persistently — neither Tab, re-submit, nor re-pick clears it. `imeFriendly` has no effect because the field is a native `<select>`, not React-controlled text. There is no documented recovery. Mark `Failed` with reason "Avature native-select validation lag"; account creation up to that point is preserved for any future manual path. Confirmed on Bloomberg Sr SWE Auth #828, 2026-04-19.
387
+
388
+ **Cloudflare / ATS-vendor blocks on Dropbox-class portals.** Dropbox's real apply flow lives behind `happydance.website` (ATS vendor), which Cloudflare-fingerprints headless Chromium + datacenter IPs and returns "Sorry, you have been blocked". `job-boards.greenhouse.io/dropbox` does not mirror — there is no public Greenhouse fallback. Symptom-wise indistinguishable from Ashby class B but at a different layer. Mark `Failed` with reason "ATS vendor Cloudflare block (happydance.website or equivalent)". Confirmed on Dropbox Sr FS Product #831, 2026-04-19.
389
+
390
+ **Greenhouse OTP-on-fill variant (Instacart pattern).** Most Greenhouse OTP flows fire on Submit. A minority (Instacart Staff FoodStorm #827, 2026-04-19) fire the 8-cell security-code gate mid-fill, BEFORE the user clicks Submit. Detection: watch for an 8-cell OTP input surfacing after resume upload or the first listbox commit. Fetch from Gmail (`from:greenhouse newer_than:10m`) immediately when it appears — do not wait for Submit.
391
+
392
+ **`geometra_fill_otp` char-drop on first fill.** Occasionally `fill_otp` lands only the first character of an 8-char code (seen on Instacart, 2026-04-19). Recovery: click the first cell to focus, then re-issue `fill_otp` with `perCharDelayMs: 120`. The form usually auto-submits once all 8 cells are populated.
393
+
394
+ ### Greenhouse Bot-Detection Honeypots
395
+
396
+ Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) inject a honeypot-style single-pick question on the application form, rendered as a listbox labeled something like "Which of the following best describes you?" with options resembling "I am a human being / I am a bot / I am a robot".
397
+
398
+ **Rule:** pick the "I am a human being" option (or whichever option is the obvious human-authentic choice). Bots that pick other options are filtered before submit. This is NOT a validation check — the field will always read back clean — but the submit will be silently discarded if the wrong option is selected.
399
+
400
+ If the honeypot question is absent, skip. If present, always pick the human option.
401
+
367
402
  ### Nested Scroll Containers (Greenhouse / Ashby)
368
403
 
369
404
  The major ATS portals (Greenhouse, Workday, Lever, Ashby) use nested scrollable regions. A field's `visibleBounds` may show it as off-screen even when it is actually visible within a child scroll container. Geometra's `scroll_to` operates on the outermost page scroll, so it cannot reach fields in inner scroll regions.
package/CLAUDE.md CHANGED
@@ -85,22 +85,23 @@ The harness ships three subagents (see `.opencode/agents/`). The orchestrator MU
85
85
 
86
86
  **When to break this rule:** if the user explicitly asks for "quality over cost" or flags a high-stakes application (top-tier company, offer-stage negotiation, executive search), route everything through `@general-paid`. Document the exception in the session.
87
87
 
88
- ### Pre-flight delegation (HARD RULE)
88
+ ### When to delegate
89
89
 
90
- For any task that will involve **more than one tool call** (i.e., anything beyond a one-shot answer), the orchestrator's **first tool call MUST be `task`** (dispatching to a subagent). Not `Read`, not `Bash`, not `geometra_connect`, not `Grep`. The orchestrator plans and dispatches; subagents execute.
90
+ **Delegate (`task` out) when the work involves repeated tool-heavy steps that bloat the orchestrator's cache prefix.** The concrete failure mode this prevents: a 341-message "apply to 20 jobs" session where repeated `geometra_fill_form` / `geometra_page_model` calls accumulated in history, forcing each new message to re-process 100K+ tokens of fresh input instead of reading from cache.
91
91
 
92
- **Why this is absolute:** every tool call in the orchestrator accumulates in the top-level session's history and pollutes the cache prefix. Once the orchestrator has read three files and made two Geometra calls, delegating to a subagent no longer helps — the subagent inherits the bloated context. The only way to keep the orchestrator lean is to delegate *before* doing anything else.
92
+ **Delegate when:**
93
+ - Applying to N≥2 jobs (repeated Geometra form-fill — the original cache-bust scenario)
94
+ - Batch portal scans hitting ≥3 companies (API loops + page-model reads stack up)
95
+ - Any explicit "apply to... / process pipeline / batch evaluate" phrasing from the user (multi-job intent)
93
96
 
94
- **What counts as "more than one tool call":**
95
- - Evaluating any offer (always ≥3 steps: fetch JD, score, write report)
96
- - Any `/job-forge` mode invocation except `tracker` (read-only)
97
- - Applying to a job
98
- - Scanning portals
99
- - Any batch operation
97
+ **Do NOT delegate (orchestrate inline):**
98
+ - Single-offer evaluation (text-heavy, not tool-heavy)
99
+ - Development / bug-fix / file-editing tasks
100
+ - `tracker` and other read-only modes
101
+ - Single-company scan, single-URL check
102
+ - One-shot questions — "what does this mean?", "read X and summarize", "what's my next report number?"
100
103
 
101
- **Explicit exception:** trivial one-shot answers ("what does this error mean?", "read this file and summarize", "what's my next report number?") can stay in the orchestrator. If the question can be answered in ≤1 tool call, do not delegate.
102
-
103
- **Detection signal:** if you (orchestrator) find yourself about to make your 2nd tool call in a session that wasn't a trivial one-shot, STOP. Instead, `task` out the remaining work as a single delegated job.
104
+ **Detection signal:** if you're about to call `geometra_fill_form` for a second *different* job in the same session, STOP and delegate the remainder. For everything else, in-session execution is the expected default.
104
105
 
105
106
  ---
106
107
 
@@ -302,6 +303,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
302
303
  | Lever | `from:lever newer_than:10m` |
303
304
  | Ashby | `from:ashby newer_than:10m` |
304
305
  | SmartRecruiters | `from:smartrecruiters newer_than:10m` |
306
+ | Toast (via ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m` |
305
307
  | Aggregator redirect (WeWorkRemotely / RemoteOK) | Detect the underlying ATS from the post-redirect URL, then use that row's sender query |
306
308
  | Unknown | `newer_than:10m subject:(verify OR code OR confirm)` |
307
309
 
@@ -309,6 +311,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
309
311
  - ALWAYS check Gmail before reporting a submission as failed.
310
312
  - If "submit button did nothing", it usually means an OTP step appeared. Check Gmail.
311
313
  - If no email after 10 seconds, retry `gmail_list_messages` once more with `newer_than:5m`.
314
+ - **Some Greenhouse tenants route OTP through third-party verification (Toast uses ClinchTalent).** If `from:greenhouse` returns empty after a Greenhouse submit, check the tenant-specific sender row above. Confirmed 2026-04-19: Toast Principal SWE #807 and Toast Senior FE #808.
312
315
 
313
316
  ---
314
317
 
@@ -364,6 +367,38 @@ These blocks come from two distinct root causes and require different responses:
364
367
 
365
368
  **Rule — do NOT loop retrying a class B block.** One retry with `imeFriendly: true` is the correct test for class A. If the same spam message fires after a clean `imeFriendly` refill, stop, mark Failed, move on. Repeated retries waste subagent time and do not change the outcome.
366
369
 
370
+ **Known-block Ashby tenants (2026-04-19 empirical observations).** These tenants fired class B on every attempted submit from a headless datacenter-IP proxy. Orchestrators planning apply dispatches should assume these tenants will Fail in headless — prioritize other portals, or skip same-tenant siblings after a confirmed class B to avoid burning subagent slots:
371
+
372
+ - Vellum, Linear, Vanta, River Financial, Higharc, Trace Labs, Solace Health, Unstructured, ClickUp, Zapier, Deepgram, Ramp, WorkOS, **Ashby (self-tenant)**, **Perplexity**
373
+
374
+ **Known class-A-compatible Ashby tenants (same observations).** These tenants accepted headless submits cleanly, often with `imeFriendly: true` making the difference on the text-field subset:
375
+
376
+ - Supabase, LangChain, Poolside, Runway Financial, **Sentry**, **Cognition**
377
+
378
+ The pattern is tenant configuration, not role or company size. Lists drift as tenants tune their anti-bot — treat as probabilistic priors, not hard rules.
379
+
380
+ **Ashby choice-group with `optionCount: 1` and no labels (Sentry pattern).** Some Ashby tenants render Yes/No work-authorization questions as `role="button" name="Application"` pill toggles where the accessibility tree exposes neither `Yes` nor `No` labels. `fill_fields` with `choiceType: "group"` silently no-ops; `geometra_click` by `id` also fails to toggle. Fix: fall back to `geometra_click` with RAW x,y coordinates at the button centers (Yes is typically the left button, No is the right). Confirmed on Sentry Staff Platform #845, 2026-04-19.
381
+
382
+ ### Other Portal Failure Classes
383
+
384
+ **Typeform applications are Geometra-unsupported.** Some companies (Better Stack confirmed, 2026-04-19) route the Apply link to a Typeform wizard (`*.typeform.com/apply-*`). Typeform renders questions via a custom React/canvas layer that does NOT expose input fields to the accessibility tree — `geometra_form_schema` returns "No forms found", `geometra_query role=textbox` returns empty, blind `geometra_type` produces no semantic change. Mark `Failed` with reason "Typeform portal — Geometra unsupported" on detection; do not burn the 9-minute budget attempting blind input.
385
+
386
+ **Avature multi-step wizards have a native-`<select>` validation lag (Bloomberg pattern).** Bloomberg's careers site redirects to `bloomberg.avature.net` with a 4-step wizard. On Step 2, native `<select>` elements ("Is Current Position? / No") accept the value but keep `invalid: true` persistently — neither Tab, re-submit, nor re-pick clears it. `imeFriendly` has no effect because the field is a native `<select>`, not React-controlled text. There is no documented recovery. Mark `Failed` with reason "Avature native-select validation lag"; account creation up to that point is preserved for any future manual path. Confirmed on Bloomberg Sr SWE Auth #828, 2026-04-19.
387
+
388
+ **Cloudflare / ATS-vendor blocks on Dropbox-class portals.** Dropbox's real apply flow lives behind `happydance.website` (ATS vendor), which Cloudflare-fingerprints headless Chromium + datacenter IPs and returns "Sorry, you have been blocked". `job-boards.greenhouse.io/dropbox` does not mirror — there is no public Greenhouse fallback. Symptom-wise indistinguishable from Ashby class B but at a different layer. Mark `Failed` with reason "ATS vendor Cloudflare block (happydance.website or equivalent)". Confirmed on Dropbox Sr FS Product #831, 2026-04-19.
389
+
390
+ **Greenhouse OTP-on-fill variant (Instacart pattern).** Most Greenhouse OTP flows fire on Submit. A minority (Instacart Staff FoodStorm #827, 2026-04-19) fire the 8-cell security-code gate mid-fill, BEFORE the user clicks Submit. Detection: watch for an 8-cell OTP input surfacing after resume upload or the first listbox commit. Fetch from Gmail (`from:greenhouse newer_than:10m`) immediately when it appears — do not wait for Submit.
391
+
392
+ **`geometra_fill_otp` char-drop on first fill.** Occasionally `fill_otp` lands only the first character of an 8-char code (seen on Instacart, 2026-04-19). Recovery: click the first cell to focus, then re-issue `fill_otp` with `perCharDelayMs: 120`. The form usually auto-submits once all 8 cells are populated.
393
+
394
+ ### Greenhouse Bot-Detection Honeypots
395
+
396
+ Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) inject a honeypot-style single-pick question on the application form, rendered as a listbox labeled something like "Which of the following best describes you?" with options resembling "I am a human being / I am a bot / I am a robot".
397
+
398
+ **Rule:** pick the "I am a human being" option (or whichever option is the obvious human-authentic choice). Bots that pick other options are filtered before submit. This is NOT a validation check — the field will always read back clean — but the submit will be silently discarded if the wrong option is selected.
399
+
400
+ If the honeypot question is absent, skip. If present, always pick the human option.
401
+
367
402
  ### Nested Scroll Containers (Greenhouse / Ashby)
368
403
 
369
404
  The major ATS portals (Greenhouse, Workday, Lever, Ashby) use nested scrollable regions. A field's `visibleBounds` may show it as off-screen even when it is actually visible within a child scroll container. Geometra's `scroll_to` operates on the outermost page scroll, so it cannot reach fields in inner scroll regions.
@@ -85,22 +85,23 @@ The harness ships three subagents (see `.opencode/agents/`). The orchestrator MU
85
85
 
86
86
  **When to break this rule:** if the user explicitly asks for "quality over cost" or flags a high-stakes application (top-tier company, offer-stage negotiation, executive search), route everything through `@general-paid`. Document the exception in the session.
87
87
 
88
- ### Pre-flight delegation (HARD RULE)
88
+ ### When to delegate
89
89
 
90
- For any task that will involve **more than one tool call** (i.e., anything beyond a one-shot answer), the orchestrator's **first tool call MUST be `task`** (dispatching to a subagent). Not `Read`, not `Bash`, not `geometra_connect`, not `Grep`. The orchestrator plans and dispatches; subagents execute.
90
+ **Delegate (`task` out) when the work involves repeated tool-heavy steps that bloat the orchestrator's cache prefix.** The concrete failure mode this prevents: a 341-message "apply to 20 jobs" session where repeated `geometra_fill_form` / `geometra_page_model` calls accumulated in history, forcing each new message to re-process 100K+ tokens of fresh input instead of reading from cache.
91
91
 
92
- **Why this is absolute:** every tool call in the orchestrator accumulates in the top-level session's history and pollutes the cache prefix. Once the orchestrator has read three files and made two Geometra calls, delegating to a subagent no longer helps — the subagent inherits the bloated context. The only way to keep the orchestrator lean is to delegate *before* doing anything else.
92
+ **Delegate when:**
93
+ - Applying to N≥2 jobs (repeated Geometra form-fill — the original cache-bust scenario)
94
+ - Batch portal scans hitting ≥3 companies (API loops + page-model reads stack up)
95
+ - Any explicit "apply to... / process pipeline / batch evaluate" phrasing from the user (multi-job intent)
93
96
 
94
- **What counts as "more than one tool call":**
95
- - Evaluating any offer (always ≥3 steps: fetch JD, score, write report)
96
- - Any `/job-forge` mode invocation except `tracker` (read-only)
97
- - Applying to a job
98
- - Scanning portals
99
- - Any batch operation
97
+ **Do NOT delegate (orchestrate inline):**
98
+ - Single-offer evaluation (text-heavy, not tool-heavy)
99
+ - Development / bug-fix / file-editing tasks
100
+ - `tracker` and other read-only modes
101
+ - Single-company scan, single-URL check
102
+ - One-shot questions — "what does this mean?", "read X and summarize", "what's my next report number?"
100
103
 
101
- **Explicit exception:** trivial one-shot answers ("what does this error mean?", "read this file and summarize", "what's my next report number?") can stay in the orchestrator. If the question can be answered in ≤1 tool call, do not delegate.
102
-
103
- **Detection signal:** if you (orchestrator) find yourself about to make your 2nd tool call in a session that wasn't a trivial one-shot, STOP. Instead, `task` out the remaining work as a single delegated job.
104
+ **Detection signal:** if you're about to call `geometra_fill_form` for a second *different* job in the same session, STOP and delegate the remainder. For everything else, in-session execution is the expected default.
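
The delegation criteria above can be read as a small decision function. The sketch below is illustrative only: `shouldDelegate` and the fields on its `task` argument (`jobsToApply`, `companiesToScan`, `multiJobIntent`) are hypothetical names chosen for this example and are not part of job-forge.

```javascript
// Hypothetical helper encoding the "Delegate when" bullets.
// All field names on `task` are assumptions made for this sketch.
function shouldDelegate(task) {
  const { jobsToApply = 0, companiesToScan = 0, multiJobIntent = false } = task;
  if (jobsToApply >= 2) return true;      // repeated form-fill across jobs
  if (companiesToScan >= 3) return true;  // batch portal scan
  if (multiJobIntent) return true;        // explicit batch phrasing from the user
  return false;                           // in-session execution is the default
}

console.log(shouldDelegate({ jobsToApply: 20 }));    // true
console.log(shouldDelegate({ companiesToScan: 1 })); // false
```

Keeping the default branch as `false` matches the revised wording, where inline execution is the expected case and delegation is the exception.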
104
105
 
105
106
  ---
106
107
 
@@ -302,6 +303,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
302
303
  | Lever | `from:lever newer_than:10m` |
303
304
  | Ashby | `from:ashby newer_than:10m` |
304
305
  | SmartRecruiters | `from:smartrecruiters newer_than:10m` |
306
+ | Toast (via ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m` |
305
307
  | Aggregator redirect (WeWorkRemotely / RemoteOK) | Detect the underlying ATS from the post-redirect URL, then use that row's sender query |
306
308
  | Unknown | `newer_than:10m subject:(verify OR code OR confirm)` |
307
309
 
@@ -309,6 +311,7 @@ When a form says "enter the code we sent to your email", you MUST retrieve the c
309
311
  - ALWAYS check Gmail before reporting a submission as failed.
310
312
  - If "submit button did nothing", it usually means an OTP step appeared. Check Gmail.
311
313
  - If no email after 10 seconds, retry `gmail_list_messages` once more with `newer_than:5m`.
314
+ - **Some Greenhouse tenants route OTP through third-party verification (Toast uses ClinchTalent).** If `from:greenhouse` returns empty after a Greenhouse submit, check the tenant-specific sender row above. Confirmed 2026-04-19: Toast Principal SWE #807 and Toast Senior FE #808.
312
315
 
313
316
  ---
314
317
 
@@ -364,6 +367,38 @@ These blocks come from two distinct root causes and require different responses:
364
367
 
365
368
  **Rule — do NOT loop retrying a class B block.** One retry with `imeFriendly: true` is the correct test for class A. If the same spam message fires after a clean `imeFriendly` refill, stop, mark Failed, move on. Repeated retries waste subagent time and do not change the outcome.
366
369
 
370
+ **Known-block Ashby tenants (2026-04-19 empirical observations).** These tenants fired class B on every attempted submit from a headless datacenter-IP proxy. Orchestrators planning apply dispatches should assume these tenants will Fail in headless — prioritize other portals, or skip same-tenant siblings after a confirmed class B to avoid burning subagent slots:
371
+
372
+ - Vellum, Linear, Vanta, River Financial, Higharc, Trace Labs, Solace Health, Unstructured, ClickUp, Zapier, Deepgram, Ramp, WorkOS, **Ashby (self-tenant)**, **Perplexity**
373
+
374
+ **Known class-A-compatible Ashby tenants (same observations).** These tenants accepted headless submits cleanly, often with `imeFriendly: true` making the difference on the text-field subset:
375
+
376
+ - Supabase, LangChain, Poolside, Runway Financial, **Sentry**, **Cognition**
377
+
378
+ The pattern is tenant configuration, not role or company size. Lists drift as tenants tune their anti-bot — treat as probabilistic priors, not hard rules.
379
+
380
+ **Ashby choice-group with `optionCount: 1` and no labels (Sentry pattern).** Some Ashby tenants render Yes/No work-authorization questions as `role="button" name="Application"` pill toggles where the accessibility tree exposes neither `Yes` nor `No` labels. `fill_fields` with `choiceType: "group"` silently no-ops; `geometra_click` by `id` also fails to toggle. Fix: fall back to `geometra_click` with RAW x,y coordinates at the button centers (Yes is typically the left button, No is the right). Confirmed on Sentry Staff Platform #845, 2026-04-19.
381
+
382
+ ### Other Portal Failure Classes
383
+
384
+ **Typeform applications are Geometra-unsupported.** Some companies (Better Stack confirmed, 2026-04-19) route the Apply link to a Typeform wizard (`*.typeform.com/apply-*`). Typeform renders questions via a custom React/canvas layer that does NOT expose input fields to the accessibility tree — `geometra_form_schema` returns "No forms found", `geometra_query role=textbox` returns empty, blind `geometra_type` produces no semantic change. Mark `Failed` with reason "Typeform portal — Geometra unsupported" on detection; do not burn the 9-minute budget attempting blind input.
385
+
386
+ **Avature multi-step wizards have a native-`<select>` validation lag (Bloomberg pattern).** Bloomberg's careers site redirects to `bloomberg.avature.net` with a 4-step wizard. On Step 2, native `<select>` elements ("Is Current Position? / No") accept the value but keep `invalid: true` persistently — neither Tab, re-submit, nor re-pick clears it. `imeFriendly` has no effect because the field is a native `<select>`, not React-controlled text. There is no documented recovery. Mark `Failed` with reason "Avature native-select validation lag"; account creation up to that point is preserved for any future manual path. Confirmed on Bloomberg Sr SWE Auth #828, 2026-04-19.
387
+
388
+ **Cloudflare / ATS-vendor blocks on Dropbox-class portals.** Dropbox's real apply flow lives behind `happydance.website` (ATS vendor), which Cloudflare-fingerprints headless Chromium + datacenter IPs and returns "Sorry, you have been blocked". `job-boards.greenhouse.io/dropbox` does not mirror — there is no public Greenhouse fallback. Symptom-wise indistinguishable from Ashby class B but at a different layer. Mark `Failed` with reason "ATS vendor Cloudflare block (happydance.website or equivalent)". Confirmed on Dropbox Sr FS Product #831, 2026-04-19.
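
The three portal classes above are all recognizable from the apply URL before any form interaction. A minimal sketch of that pre-check follows. `knownPortalRisk` and its return keys are hypothetical; the host patterns come from the paragraphs above, and generalizing the Bloomberg observation to every `*.avature.net` host is an assumption.

```javascript
// Hypothetical pre-check: returns a short class key, or null if the URL
// matches none of the portal classes described above.
function knownPortalRisk(applyUrl) {
  const { hostname, pathname } = new URL(applyUrl);

  // Pattern from the text: *.typeform.com/apply-*
  if (hostname.endsWith(".typeform.com") && pathname.startsWith("/apply-")) {
    return "typeform";
  }
  // Assumption: any Avature tenant, not only bloomberg.avature.net.
  if (hostname.endsWith(".avature.net")) {
    return "avature";
  }
  if (hostname === "happydance.website" || hostname.endsWith(".happydance.website")) {
    return "ats-vendor-cloudflare";
  }
  return null;
}

console.log(knownPortalRisk("https://bloomberg.avature.net/careers")); // "avature"
```

Only the Typeform class is described as fail-on-detection; for the other two keys a caller would still attempt the flow and use the key to label a failure if one occurs.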
389
+
390
+ **Greenhouse OTP-on-fill variant (Instacart pattern).** Most Greenhouse OTP flows fire on Submit. A minority (Instacart Staff FoodStorm #827, 2026-04-19) fire the 8-cell security-code gate mid-fill, BEFORE the user clicks Submit. Detection: watch for an 8-cell OTP input surfacing after resume upload or the first listbox commit. Fetch from Gmail (`from:greenhouse newer_than:10m`) immediately when it appears — do not wait for Submit.
391
+
392
+ **`geometra_fill_otp` char-drop on first fill.** Occasionally `fill_otp` lands only the first character of an 8-char code (seen on Instacart, 2026-04-19). Recovery: click the first cell to focus, then re-issue `fill_otp` with `perCharDelayMs: 120`. The form usually auto-submits once all 8 cells are populated.
393
+
394
+ ### Greenhouse Bot-Detection Honeypots
395
+
396
+ Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) inject a honeypot-style single-pick question on the application form, rendered as a listbox labeled something like "Which of the following best describes you?" with options resembling "I am a human being / I am a bot / I am a robot".
397
+
398
+ **Rule:** pick the "I am a human being" option (or whichever option is the obvious human-authentic choice). Bots that pick other options are filtered before submit. This is NOT a validation check — the field will always read back clean — but the submit will be silently discarded if the wrong option is selected.
399
+
400
+ If the honeypot question is absent, skip. If present, always pick the human option.
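
As a rough illustration of the rule, the human option can be selected from the listbox labels with a simple text match. `pickHumanOption` is a hypothetical helper, and the matching heuristic (contains "human", does not contain "not") is an assumption; real tenants may word the options differently.

```javascript
// Hypothetical helper: returns the label of the human option, or null when
// no option looks like one (in which case the question is not a honeypot).
function pickHumanOption(labels) {
  const match = labels.find(
    (label) => /\bhuman\b/i.test(label) && !/\bnot\b/i.test(label)
  );
  return match ?? null;
}

console.log(pickHumanOption(["I am a bot", "I am a human being", "I am a robot"]));
// "I am a human being"
```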
401
+
367
402
  ### Nested Scroll Containers (Greenhouse / Ashby)
368
403
 
369
404
  The major ATS portals (Greenhouse, Workday, Lever, Ashby) use nested scrollable regions. A field's `visibleBounds` may show it as off-screen even when it is actually visible within a child scroll container. Geometra's `scroll_to` operates on the outermost page scroll, so it cannot reach fields in inner scroll regions.
package/merge-tracker.mjs CHANGED
@@ -60,6 +60,29 @@ Run from the repository root.`);
60
60
  const CANONICAL_STATES = loadCanonicalStates(PROJECT_DIR) || DEFAULT_STATES;
61
61
  const STATUS_DETECT_RE = buildStatusDetectionRegex(CANONICAL_STATES);
62
62
 
63
+ // Lifecycle precedence — higher value means the status represents a later
64
+ // stage of the application and should override an earlier stage on merge,
65
+ // independent of score. Evaluated (pure eval, no action) is the baseline;
66
+ // any action state outranks it. This fixes a historical bug where a higher-
67
+ // score Evaluated row would silently block an Applied/Failed/SKIP outcome
68
+ // from propagating because the merge considered score alone.
69
+ const STATUS_PRECEDENCE = {
70
+ 'Evaluated': 0,
71
+ 'SKIP': 1,
72
+ 'Discarded': 1,
73
+ 'Contacted': 2,
74
+ 'Failed': 2,
75
+ 'Applied': 3,
76
+ 'Responded': 4,
77
+ 'Rejected': 4,
78
+ 'Interview': 5,
79
+ 'Offer': 6,
80
+ };
81
+
82
+ function statusRank(s) {
83
+ return STATUS_PRECEDENCE[s] ?? 0;
84
+ }
85
+
63
86
  function validateStatus(status) {
64
87
  const clean = status.replace(/\*\*/g, '').replace(/\s+\d{4}-\d{2}-\d{2}.*$/, '').trim();
65
88
  const lower = clean.toLowerCase();
@@ -86,9 +109,31 @@ function normalizeCompany(name) {
86
109
  return name.toLowerCase().replace(/[^a-z0-9]/g, '');
87
110
  }
88
111
 
112
+ // Generic seniority + engineering words that appear across most SWE roles
113
+ // and carry no role-specialty signal. A "discriminator" is any remaining
114
+ // word longer than 3 chars (e.g. "Observability", "Telemetry", "Platform").
115
+ const ROLE_STOPWORDS = new Set([
116
+ 'staff', 'senior', 'principal', 'lead', 'junior',
117
+ 'software', 'engineer', 'engineering', 'developer',
118
+ 'backend', 'frontend', 'fullstack', 'full-stack', 'full', 'stack',
119
+ 'technical', 'applied',
120
+ ]);
121
+
89
122
  function roleFuzzyMatch(a, b) {
90
- const wordsA = a.toLowerCase().split(/\s+/).filter(w => w.length > 3);
91
- const wordsB = b.toLowerCase().split(/\s+/).filter(w => w.length > 3);
123
+ // Split on whitespace AND role punctuation (commas, colons, dashes, parens)
124
+ // so "Staff SWE, Observability K8s" tokenizes past the comma.
125
+ const split = (s) => s.toLowerCase()
126
+ .split(/[\s,:\-()\/]+/)
127
+ .map(w => w.trim())
128
+ .filter(w => w.length > 3 && !ROLE_STOPWORDS.has(w));
129
+
130
+ const wordsA = split(a);
131
+ const wordsB = split(b);
132
+
133
+ // Match on discriminator-word overlap only. Prevents "Staff Software
134
+ // Engineer, ML Observability" and "Staff Backend Engineer, Adaptive
135
+ // Telemetry" from colliding (same company, different specialty) while
136
+ // still collapsing re-evaluations of the same role (same discriminators).
92
137
  const overlap = wordsA.filter(w => wordsB.some(wb => wb.includes(w) || w.includes(wb)));
93
138
  return overlap.length >= 2;
94
139
  }
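
To see what the new tokenizer does with the titles named in the comment, the function can be exercised in isolation. The block below copies `ROLE_STOPWORDS` and `roleFuzzyMatch` from the diff as shown; the sample titles other than the two quoted in the comment are made up for illustration.

```javascript
const ROLE_STOPWORDS = new Set([
  'staff', 'senior', 'principal', 'lead', 'junior',
  'software', 'engineer', 'engineering', 'developer',
  'backend', 'frontend', 'fullstack', 'full-stack', 'full', 'stack',
  'technical', 'applied',
]);

function roleFuzzyMatch(a, b) {
  const split = (s) => s.toLowerCase()
    .split(/[\s,:\-()\/]+/)
    .map(w => w.trim())
    .filter(w => w.length > 3 && !ROLE_STOPWORDS.has(w));
  const wordsA = split(a);
  const wordsB = split(b);
  const overlap = wordsA.filter(w => wordsB.some(wb => wb.includes(w) || w.includes(wb)));
  return overlap.length >= 2;
}

// Different specialties at the same seniority: discriminators do not overlap.
console.log(roleFuzzyMatch(
  'Staff Software Engineer, ML Observability',
  'Staff Backend Engineer, Adaptive Telemetry')); // false

// Same two discriminators in a different order: treated as the same role.
console.log(roleFuzzyMatch(
  'Senior Engineer, Platform Observability',
  'Staff Software Engineer (Observability Platform)')); // true

// Only one discriminator survives filtering, so the >= 2 threshold is not met.
console.log(roleFuzzyMatch(
  'Staff Software Engineer, Observability',
  'Staff Software Engineer, Observability')); // false
```

The last case suggests that, taken on its own, this function will not collapse a re-evaluation of a title that has a single discriminator word; whether another code path in `merge-tracker.mjs` covers that case is not visible in this diff.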
@@ -274,25 +319,49 @@ for (const file of tsvFiles) {
274
319
  if (duplicate) {
275
320
  const newScore = parseScore(addition.score);
276
321
  const oldScore = parseScore(duplicate.score);
277
-
278
- if (newScore > oldScore) {
279
- console.log(`🔄 Update: #${duplicate.num} ${addition.company} — ${addition.role} (${oldScore}→${newScore})`);
322
+ const newRank = statusRank(addition.status);
323
+ const oldRank = statusRank(duplicate.status);
324
+
325
+ // Update if EITHER the lifecycle status advances (e.g. Evaluated → Applied)
326
+ // OR the score improves. Never regress the status (Applied → Evaluated is
327
+ // ignored). Same-rank same-score updates are skipped as no-op.
328
+ const statusAdvances = newRank > oldRank;
329
+ const statusRegresses = newRank < oldRank;
330
+ const scoreImproves = newScore > oldScore;
331
+
332
+ if (statusAdvances || (!statusRegresses && scoreImproves)) {
333
+ const newStatus = statusAdvances ? addition.status : duplicate.status;
334
+ const newPdf = statusAdvances ? addition.pdf : duplicate.pdf;
335
+ const reason = statusAdvances
336
+ ? `${duplicate.status}→${newStatus}`
337
+ : `${oldScore}→${newScore}`;
338
+ console.log(`🔄 Update: #${duplicate.num} ${addition.company} — ${addition.role} (${reason})`);
280
339
 
281
340
  if (layout === 'day') {
282
- // Update in existing entries list for later write
283
341
  duplicate.date = addition.date;
284
342
  duplicate.company = addition.company;
285
343
  duplicate.role = addition.role;
286
- duplicate.score = addition.score;
344
+ duplicate.score = scoreImproves ? addition.score : duplicate.score;
345
+ duplicate.status = newStatus;
346
+ duplicate.pdf = newPdf;
287
347
  duplicate.report = addition.report;
288
- duplicate.notes = `Re-eval ${addition.date} (${oldScore}→${newScore}). ${addition.notes}`;
348
+ duplicate.notes = statusAdvances
349
+ ? addition.notes
350
+ : `Re-eval ${addition.date} (${oldScore}→${newScore}). ${addition.notes}`;
289
351
  } else {
290
352
  const lineIdx = appLines.indexOf(duplicate.raw);
353
+ const outScore = scoreImproves ? addition.score : duplicate.score;
354
+ const noteText = statusAdvances
355
+ ? addition.notes
356
+ : `Re-eval ${addition.date} (${oldScore}→${newScore}). ${addition.notes}`;
291
357
  if (lineIdx >= 0) {
292
- appLines[lineIdx] = `| ${duplicate.num} | ${addition.date} | ${addition.company} | ${addition.role} | ${addition.score} | ${duplicate.status} | ${duplicate.pdf} | ${addition.report} | Re-eval ${addition.date} (${oldScore}→${newScore}). ${addition.notes} |`;
358
+ appLines[lineIdx] = `| ${duplicate.num} | ${addition.date} | ${addition.company} | ${addition.role} | ${outScore} | ${newStatus} | ${newPdf} | ${addition.report} | ${noteText} |`;
293
359
  }
294
360
  }
295
361
  updated++;
362
+ } else if (statusRegresses) {
363
+ console.log(`⏭️ Skip: ${addition.company} — ${addition.role} (existing #${duplicate.num} status ${duplicate.status} outranks new ${addition.status})`);
364
+ skipped++;
296
365
  } else {
297
366
  console.log(`⏭️ Skip: ${addition.company} — ${addition.role} (existing #${duplicate.num} ${oldScore} >= new ${newScore})`);
298
367
  skipped++;
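
The branch order in the merge loop above can be checked with a pure function that takes two rows and returns the outcome. This is a sketch, not the shipped code: `mergeDecision` is a hypothetical name, and scores are assumed to be numbers already (the real loop runs them through `parseScore`). The precedence table is copied from the diff.

```javascript
const STATUS_PRECEDENCE = {
  Evaluated: 0, SKIP: 1, Discarded: 1, Contacted: 2, Failed: 2,
  Applied: 3, Responded: 4, Rejected: 4, Interview: 5, Offer: 6,
};
const statusRank = (s) => STATUS_PRECEDENCE[s] ?? 0;

// Hypothetical helper mirroring the update/skip conditions in the diff.
function mergeDecision(existing, addition) {
  const newRank = statusRank(addition.status);
  const oldRank = statusRank(existing.status);
  const statusAdvances = newRank > oldRank;
  const statusRegresses = newRank < oldRank;
  const scoreImproves = addition.score > existing.score;

  if (statusAdvances || (!statusRegresses && scoreImproves)) {
    return {
      action: 'update',
      status: statusAdvances ? addition.status : existing.status,
      score: scoreImproves ? addition.score : existing.score,
    };
  }
  // Covers both the status-regression skip and the plain score skip.
  return { action: 'skip', status: existing.status, score: existing.score };
}

// A lower-scored Applied row now overrides a higher-scored Evaluated row,
// and the higher score is kept.
console.log(mergeDecision(
  { status: 'Evaluated', score: 4.5 },
  { status: 'Applied', score: 4.0 }));
// { action: 'update', status: 'Applied', score: 4.5 }
```

Note that under this table a `Failed` row (rank 2) arriving after `Applied` (rank 3) counts as a regression and is skipped.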
package/modes/apply.md CHANGED
@@ -260,6 +260,22 @@ If you've uploaded a file with a dedicated `geometra_run_actions` call (e.g., th
260
260
 
261
261
  Specific portals — Workday "parse my resume", iCIMS multi-step, SAP SuccessFactors — reveal additional fields ONLY after a file upload. In that case, use exactly two `run_actions` calls: (1) upload + wait_for, (2) fill+submit. After the first call, call `geometra_form_schema` **once** to discover the newly-revealed labels, then run the second call using labels. Never more than two phases.
262
262
 
263
+ ### Resume-upload silent-fail → chooser-strategy fallback (Greenhouse)
264
+
265
+ Some Greenhouse tenants (Grafana Labs confirmed, 2026-04-19) render the resume upload as a file input where the default `upload_files` action reads back as successful but the field stays empty; the "Resume/CV is required." error surfaces only after Submit is clicked.
266
+
267
+ **Fix:** if the resume field shows empty after an `upload_files` action (either by explicit readback or by a "Resume/CV is required" error post-submit), re-upload using `strategy: chooser` with x,y coordinates pulled from the upload button's `visibleBounds` center. Example:
268
+
269
+ ```
270
+ { type: "upload_files",
271
+ fieldLabel: "Resume/CV",
272
+ paths: ["/abs/path/cv.pdf"],
273
+ strategy: "chooser",
274
+ x: 314, y: 474 }
275
+ ```
276
+
277
+ The `chooser` strategy triggers the native file picker via click-at-coordinates, which bypasses the React-controlled input that silently drops programmatic assignments on some Greenhouse tenants. One retry is enough; if it still fails, mark Failed.
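
The x,y pair in the example above is the center of the upload button. A minimal sketch of deriving it is below. `chooserUploadAction` is a hypothetical helper, and the assumed `visibleBounds` shape (`{ x, y, width, height }` with `x`,`y` as the top-left corner) is a guess for illustration; check the actual Geometra output before relying on it.

```javascript
// Hypothetical helper: builds the retry action from a button's bounds.
// Assumption: visibleBounds = { x, y, width, height }, top-left origin.
function chooserUploadAction(fieldLabel, path, visibleBounds) {
  const { x, y, width, height } = visibleBounds;
  return {
    type: "upload_files",
    fieldLabel,
    paths: [path],
    strategy: "chooser",
    x: Math.round(x + width / 2),   // horizontal center of the button
    y: Math.round(y + height / 2),  // vertical center of the button
  };
}

const action = chooserUploadAction(
  "Resume/CV", "/abs/path/cv.pdf",
  { x: 264, y: 454, width: 100, height: 40 });
console.log(action.x, action.y); // 314 474
```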
278
+
263
279
  ## Step 6 — Resolve OTP verification (if prompted)
264
280
 
265
281
  Check for an OTP gate after the candidate (or Geometra) submits — the major portals (Greenhouse, Workday, Lever, Ashby) gate submission behind an email verification code. When an OTP step appears, do this.
@@ -282,6 +298,7 @@ Check for an OTP gate after the candidate (or Geometra) submits — the major po
282
298
  | `smartrecruiters` | `from:smartrecruiters newer_than:10m` |
283
299
  | `wwr` / `remoteok` | Follow the apply redirect to the underlying ATS, re-detect the host, then use that row's query. Aggregators do not send OTP emails themselves. |
284
300
  | `builtin` | `from:builtin newer_than:10m` |
301
+ | Toast (via Greenhouse + ClinchTalent) | `from:toast.mail.clinchtalent.com newer_than:15m` OR `subject:"verify your login at Toast" newer_than:15m`. Default `from:greenhouse` returns null — Toast routes OTP through ClinchTalent. |
285
302
  | `custom` / `unknown` / missing | `newer_than:10m subject:(verify OR code OR confirm)` |
286
303
 
287
304
  **Fallback when `ats` is missing** (legacy pipeline entries with no `| ats=` suffix, or scan-output without an `ats` column): infer from the URL host — `*.greenhouse.io` → `greenhouse`; `jobs.ashbyhq.com` → `ashby`; `jobs.lever.co` → `lever`; `*.myworkdayjobs.com` → `workday`; `apply.workable.com` / `jobs.workable.com` → `workable`; `api.smartrecruiters.com` / `jobs.smartrecruiters.com` → `smartrecruiters`; `weworkremotely.com` → `wwr`; `remoteok.com` → `remoteok`; `builtin.com` → `builtin`; otherwise use the generic `verify OR code OR confirm` subject query.
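
The host-to-`ats` fallback described above maps directly onto a lookup. The sketch below follows the listed rules one for one; `inferAts` is a hypothetical name, and stripping a leading `www.` before matching is an added assumption not stated in the text.

```javascript
// Hypothetical helper: infers the ats key from a job URL, or returns null
// so the caller can fall back to the generic subject query.
const GENERIC_OTP_QUERY = "newer_than:10m subject:(verify OR code OR confirm)";

function inferAts(jobUrl) {
  // Assumption: a leading "www." is ignored when matching hosts.
  const host = new URL(jobUrl).hostname.replace(/^www\./, "");

  if (host.endsWith(".greenhouse.io")) return "greenhouse";
  if (host === "jobs.ashbyhq.com") return "ashby";
  if (host === "jobs.lever.co") return "lever";
  if (host.endsWith(".myworkdayjobs.com")) return "workday";
  if (host === "apply.workable.com" || host === "jobs.workable.com") return "workable";
  if (host === "api.smartrecruiters.com" || host === "jobs.smartrecruiters.com") return "smartrecruiters";
  if (host === "weworkremotely.com") return "wwr";
  if (host === "remoteok.com") return "remoteok";
  if (host === "builtin.com") return "builtin";
  return null;
}

console.log(inferAts("https://jobs.ashbyhq.com/sentry/123"));       // "ashby"
console.log(inferAts("https://example.com/careers") ?? GENERIC_OTP_QUERY);
```

Tenant-specific senders such as the Toast row cannot be inferred from the host alone, so a caller would still need a per-tenant override on top of this.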
package/modes/scan.md CHANGED
@@ -45,7 +45,7 @@ Supported API shapes:
45
45
  - **Endpoint**: `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs`
46
46
  - **Method**: `GET` (plain, no auth)
47
47
  - **Shape**: `{ jobs: [{ id, title, absolute_url, updated_at, location: { name } }, ...] }`
48
- - **Canonical URL to record**: `https://job-boards.greenhouse.io/{slug}/jobs/{id}` — do NOT use `absolute_url` when it points to a customer-skinned front-end (see Verification section below).
48
+ - **Canonical URL to record**: `https://job-boards.greenhouse.io/{slug}/jobs/{id}` — do NOT use `absolute_url` when it points to a customer-skinned front-end (see **Verify Before Marking CLOSED** below).
49
49
  - **ats**: `greenhouse`
50
50
 
51
51
  #### Ashby (JSON, per-company board)
@@ -75,7 +75,7 @@ Supported API shapes:
75
75
  ```json
76
76
  {"appliedFacets": {}, "limit": 20, "offset": 0, "searchText": ""}
77
77
  ```
78
- - **Required headers**: `Content-Type: application/json`, `Accept: application/json`. Some tenants reject requests without a realistic `User-Agent` — set one if the response is 403.
78
+ - **Required headers**: `Content-Type: application/json`, `Accept: application/json`. If the response is 403, set a realistic `User-Agent` header and retry; Workday tenants selectively block data-center UAs.
79
79
  - **Shape**: `{ jobPostings: [{ title, externalPath, postedOn, locationsText, bulletFields }, ...], total }`
80
80
  - **Canonical URL to record**: `https://{subdomain}.{pod}.myworkdayjobs.com/{site}{externalPath}` (note: `externalPath` already starts with `/job/...` — do NOT prepend an extra `/`).
81
81
  - **Pagination**: increment `offset` by `limit` (20) until `jobPostings.length < limit` or `offset >= total`.
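
The canonical-URL and pagination bullets above reduce to two small pure functions. This is a sketch under stated assumptions: `workdayCanonicalUrl` and `nextOffset` are hypothetical names, and the tenant values in the usage lines are invented.

```javascript
// Hypothetical helper: externalPath already starts with "/job/...",
// so it is appended to the site segment with no extra slash.
function workdayCanonicalUrl({ subdomain, pod, site }, externalPath) {
  return `https://${subdomain}.${pod}.myworkdayjobs.com/${site}${externalPath}`;
}

// Hypothetical helper: returns the next offset to request, or null when
// the page was short or the next offset would reach `total`.
function nextOffset(offset, limit, pageLength, total) {
  const next = offset + limit;
  if (pageLength < limit || next >= total) return null;
  return next;
}

console.log(workdayCanonicalUrl(
  { subdomain: "acme", pod: "wd5", site: "Careers" },
  "/job/Remote/SWE_R1"));
// https://acme.wd5.myworkdayjobs.com/Careers/job/Remote/SWE_R1

console.log(nextOffset(0, 20, 20, 45));  // 20
console.log(nextOffset(40, 20, 5, 45));  // null
```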
@@ -287,7 +287,7 @@ NEXT STEP RECOMMENDATION:
287
287
 
288
288
  ## Verify Before Marking CLOSED (downstream rule)
289
289
 
290
- **DO NOT mark a Greenhouse offer CLOSED based on a WebFetch/Geometra result alone.** Customer-skinned careers pages (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`, etc.) serve bot-hostile shells: a 403, a navbar-only response, or a client-side-only render. WebFetch sees "no JD" and mis-classifies as CLOSED.
290
+ **DO NOT mark a Greenhouse offer CLOSED based on a WebFetch/Geometra result alone.** Customer-skinned careers pages serve bot-hostile shells — a 403, a navbar-only response, or a client-side-only render — and WebFetch sees "no JD" and mis-classifies as CLOSED. Known customer-skinned hosts: `pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`. Treat any host that is NOT `greenhouse.io` / `job-boards.greenhouse.io` / `boards-api.greenhouse.io` as customer-skinned.
291
291
 
292
292
  **Correct verification order for any Greenhouse-sourced URL** (identified by a `| gh={slug}/{id}` suffix in `pipeline.md` or a `boards-api.greenhouse.io` / `job-boards.greenhouse.io` / `boards.greenhouse.io` host):
293
293
 
@@ -298,7 +298,7 @@ NEXT STEP RECOMMENDATION:
298
298
  2. Only then fall back to WebFetch of the canonical `job-boards.greenhouse.io/{slug}/jobs/{id}` URL.
299
299
  3. Only then fall back to Geometra on the same canonical URL.
300
300
 
301
- **Rule of thumb:** Greenhouse postings with valid `gh_slug`/`gh_id` should be verified via the API first. A WebFetch failure on a customer-skinned domain is NOT evidence the role is closed.
301
+ **Rule:** Greenhouse postings with valid `gh_slug`/`gh_id` MUST be verified via the API first. A WebFetch failure on a customer-skinned domain is NOT evidence the role is closed.
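
The host test and the canonical URL used by this rule can be sketched as follows. `isCustomerSkinned` and `greenhouseCanonicalUrl` are hypothetical names. Treating every `*.greenhouse.io` subdomain as native (which also admits `boards.greenhouse.io`) is an assumption that slightly widens the three hosts listed in the text.

```javascript
// Hypothetical helper: true when the URL is NOT served from a
// Greenhouse-owned host and so may return a bot-hostile shell.
function isCustomerSkinned(url) {
  const host = new URL(url).hostname;
  // Assumption: any greenhouse.io subdomain counts as native.
  return !(host === "greenhouse.io" || host.endsWith(".greenhouse.io"));
}

// Canonical form from the scan rules: job-boards.greenhouse.io/{slug}/jobs/{id}
function greenhouseCanonicalUrl(slug, id) {
  return `https://job-boards.greenhouse.io/${slug}/jobs/${id}`;
}

console.log(isCustomerSkinned("https://careers.toasttab.com/jobs/807")); // true
console.log(isCustomerSkinned(greenhouseCanonicalUrl("toast", 807)));    // false
```

A `true` result only means a WebFetch failure on that URL carries no signal; the API check still comes first for any posting with a valid `gh_slug`/`gh_id`.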
302
302
 
303
303
  ## Update careers_url
304
304
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "job-forge",
3
- "version": "2.3.0",
3
+ "version": "2.5.0",
4
4
  "description": "AI-powered job search pipeline built on opencode",
5
5
  "type": "module",
6
6
  "bin": {
@@ -18,8 +18,10 @@
18
18
  "tokens": "node scripts/token-usage-report.mjs",
19
19
  "tokens:today": "node scripts/token-usage-report.mjs --days 1",
20
20
  "tokens:log": "node scripts/token-usage-report.mjs --days 1 --append",
21
- "build:config": "iso-harness build --source iso --out .",
22
- "prepack": "iso-harness build --source iso --out .",
21
+ "trace:list": "iso-trace list --since 7d --cwd .",
22
+ "trace:stats": "iso-trace stats --since 7d --cwd .",
23
+ "build:config": "iso build .",
24
+ "prepack": "iso build .",
23
25
  "release:check-source": "node ./scripts/release/check-source.mjs",
24
26
  "postinstall": "node bin/sync.mjs"
25
27
  },
@@ -74,6 +76,8 @@
74
76
  "playwright": "^1.58.1"
75
77
  },
76
78
  "devDependencies": {
77
- "@razroo/iso-harness": "^0.1.3"
79
+ "@razroo/iso": "^0.1.1",
80
+ "@razroo/iso-harness": "^0.1.3",
81
+ "@razroo/iso-trace": "^0.1.0"
78
82
  }
79
83
  }