job-forge 2.14.35 → 2.14.36

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -8,7 +8,7 @@ Fetch the JD content once. If the input is a **URL** (not pasted JD text), fetch
8
8
 
9
9
  **Pick exactly one method, in this priority order:**
10
10
 
11
- 1. **Greenhouse JSON API (first try, if the URL is Greenhouse-backed):** If the pipeline.md entry carries `| gh={slug}/{id}` OR the URL host matches `*.greenhouse.io` / a known Greenhouse customer front-end (`*.pinterestcareers.com`, `okta.com/company/careers/opportunity/*`, `samsara.com/company/careers/roles/*`, `zoominfo.com/careers?gh_jid=*`, `collibra.com/.../?gh_jid=*`, `careers.toasttab.com/jobs?gh_jid=*`, `careers.airbnb.com/positions/*?gh_jid=*`, `coinbase.com/careers/positions/*?gh_jid=*`, `instacart.careers/job/?gh_jid=*`), extract `slug` and `id` and WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` is the authoritative JD. 404 = genuinely closed (mark CLOSED and stop). **If 200, STOP — do not fall back to Geometra or WebFetch of the front-end.** The API is faster, cheaper (no Geometra session), and never returns a bot-shell.
11
+ 1. **Greenhouse JSON API (first try, if the URL is Greenhouse-backed):** If the pipeline.md entry carries `| gh={slug}/{id}` OR the URL host matches `*.greenhouse.io` / a known Greenhouse customer front-end (`*.pinterestcareers.com`, `okta.com/company/careers/opportunity/*`, `samsara.com/company/careers/roles/*`, `zoominfo.com/careers?gh_jid=*`, `collibra.com/.../?gh_jid=*`, `careers.toasttab.com/jobs?gh_jid=*`, `careers.airbnb.com/positions/*?gh_jid=*`, `coinbase.com/careers/positions/*?gh_jid=*`, `instacart.careers/job/?gh_jid=*`), extract `slug` and `id` and WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` is the authoritative JD. 404 = genuinely closed (mark CLOSED and stop). **OpenCode WebFetch compatibility:** do not pass `format: "json"`; omit `format` or use `format: "text"` and parse the returned JSON text. **If 200, STOP — do not fall back to Geometra or WebFetch of the front-end.** The API is faster, cheaper (no Geometra session), and never returns a bot-shell.
12
12
  2. **Geometra MCP:** Most non-Greenhouse job portals (Lever, Ashby, Workday) are SPAs. Use `geometra_connect` + `geometra_page_model` to render and read the JD. **If this returns non-empty JD text, STOP — do not WebFetch the same URL.**
13
13
  3. **WebFetch (only if Geometra is unavailable OR returned only a shell with no JD text):** For static pages (ZipRecruiter, WeLoveProduct, company career pages).
14
14
  4. **WebSearch (only if methods 1–3 all failed):** Search for the role title + company on secondary portals that index the JD in static HTML.
package/modes/pipeline.md CHANGED
@@ -33,7 +33,7 @@ Processes accumulated job offer URLs from `data/pipeline.md`. The user adds URLs
33
33
 
34
34
  ## Detect JD From URL
35
35
 
36
- 1. **Greenhouse JSON API (FIRST, when the entry has `| gh={slug}/{id}` OR the host looks Greenhouse-backed):** WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` = LIVE, use it as the JD; 404 = genuinely CLOSED (mark `- [!]` and continue). Bot-hostile customer fronts (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`, `careers.toasttab.com`) MUST be verified via this API first — WebFetch/Geometra of those domains returns a shell or 403 and causes false CLOSED marks.
36
+ 1. **Greenhouse JSON API (FIRST, when the entry has `| gh={slug}/{id}` OR the host looks Greenhouse-backed):** WebFetch `https://boards-api.greenhouse.io/v1/boards/{slug}/jobs/{id}`. 200 + JSON with `content` = LIVE, use it as the JD; 404 = genuinely CLOSED (mark `- [!]` and continue). **OpenCode WebFetch compatibility:** do not pass `format: "json"`; omit `format` or use `format: "text"` and parse the returned JSON text. Bot-hostile customer fronts (`pinterestcareers.com`, `okta.com`, `samsara.com`, `zoominfo.com`, `collibra.com`, `careers.toasttab.com`, `careers.airbnb.com`, `coinbase.com`, `instacart.careers`, `careers.toasttab.com`) MUST be verified via this API first — WebFetch/Geometra of those domains returns a shell or 403 and causes false CLOSED marks.
37
37
  2. **Geometra MCP:** `geometra_connect` + `geometra_page_model`. Works with non-Greenhouse SPAs (Lever, Ashby, Workday), uses fewer tokens than raw DOM snapshots.
38
38
  3. **WebFetch (fallback):** For static pages or when Geometra is not available.
39
39
  4. **WebSearch (last resort):** Search on secondary portals that index the JD.
package/modes/scan.md CHANGED
@@ -38,6 +38,8 @@ Read `portals.yml` which contains:
38
38
 
39
39
  For companies using an ATS or aggregator that exposes a public JSON/RSS API, fetch structured data directly. APIs are faster than Geometra and harder to hallucinate (the response is load-bearing — record IDs verbatim from the response, never reconstruct them). Use as a complement to Level 1.
40
40
 
41
+ **OpenCode WebFetch compatibility:** when fetching JSON/RSS API endpoints, do not pass `format: "json"`. OpenCode accepts `text`, `markdown`, or `html`; omit `format` or use `format: "text"` and parse JSON/RSS from the returned body.
42
+
41
43
  Supported API shapes:
42
44
 
43
45
  #### Greenhouse (JSON, per-company board)
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "job-forge",
3
- "version": "2.14.35",
3
+ "version": "2.14.36",
4
4
  "description": "AI-powered job search pipeline built on opencode",
5
5
  "type": "module",
6
6
  "bin": {