orangeslice 1.7.21 → 1.8.0

package/README.md CHANGED
@@ -1,164 +1,93 @@
  # orangeslice
 
- B2B LinkedIn database prospector. **1.15B profiles, 85M companies.**
+ Orangeslice provides a `services.*` API for B2B research, enrichment, scraping, and AI helpers.
+
+ ## Quick start
 
  ```bash
  npx orangeslice
  ```
 
- This installs documentation your AI agent needs to master the database. Point your agent (Claude Code, Cursor, etc.) to `./AGENTS.md` and it becomes a B2B prospecting expert.
-
- For maintainers: npm publishes docs from the canonical source at `../lib/vfs/sheet-chat` via `npm run sync-docs`, so the app docs and package docs stay in sync.
-
- ## What It Does
-
- Your AI agent gets:
- - Full database schema (40+ tables)
- - Query patterns and examples
- - Anti-patterns to avoid
- - Performance rules
- - **Parallelization patterns** — agents must run queries in parallel, never sequentially
- - **AI structured output** — extract structured data from text with `orangeslice.ai.generateObject()`
-
- ## 🚨 CRITICAL: Always Parallelize
+ The CLI copies docs to `./orangeslice-docs` and creates `./orangeslice-docs/AGENTS.md` so Claude Code/Cursor can use the docs as source of truth.
 
- **The #1 rule: NEVER run queries sequentially. ALWAYS use `Promise.all()`.**
-
- The API handles rate limiting automatically. Fire all queries at once.
-
- ```typescript
- // ❌ WRONG - Sequential (SLOW)
- const company = await orangeslice.b2b.sql("...");
- const funding = await orangeslice.b2b.sql("...");
- const jobs = await orangeslice.b2b.sql("...");
-
- // ✅ CORRECT - Parallel (FAST)
- const [company, funding, jobs] = await Promise.all([
- orangeslice.b2b.sql("..."),
- orangeslice.b2b.sql("..."),
- orangeslice.b2b.sql("..."),
- ]);
- ```
-
- ## Quick Example
-
- ```typescript
- import { orangeslice } from 'orangeslice';
-
- // Research a company - ALL queries in parallel
- const [company, funding, recentJobs, leadership] = await Promise.all([
- orangeslice.b2b.sql(`SELECT * FROM linkedin_company WHERE domain = 'stripe.com'`),
- orangeslice.b2b.sql(`SELECT * FROM linkedin_crunchbase_funding WHERE linkedin_company_id = 123`),
- orangeslice.b2b.sql(`SELECT * FROM linkedin_job WHERE linkedin_company_id = 123 LIMIT 10`),
- orangeslice.b2b.sql(`
- SELECT p.first_name, p.last_name, pos.title
- FROM linkedin_profile p
- JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = p.id
- WHERE pos.linkedin_company_id = 123 AND pos.end_date IS NULL
- LIMIT 10
- `),
- ]);
-
- // Research multiple companies - ALL in parallel
- const domains = ['stripe.com', 'openai.com', 'anthropic.com'];
- const companies = await Promise.all(
- domains.map(d => orangeslice.b2b.sql(`SELECT * FROM linkedin_company WHERE domain = '${d}'`))
- );
- ```
-
- ## Documentation
-
- After running `npx orangeslice`, you get:
-
- ```
- orangeslice-docs/
- ├── AGENTS.md # Agent instructions (includes parallelization rules)
- └── linkedin_data/
- ├── QUICK_REF.md # START HERE - Critical rules & patterns
- ├── tables/ # Full schema (denormalized + normalized)
- └── search_examples/ # Query patterns for people & companies
- ```
-
- **Read `linkedin_data/QUICK_REF.md` before writing any queries.**
-
- ## Installation
+ Install as a dependency when writing app code:
 
  ```bash
  npm install orangeslice
  ```
 
- ## API
-
- ### `orangeslice.b2b.sql<T>(query: string): Promise<T>`
-
- Execute SQL and return rows. **Always wrap multiple calls in `Promise.all()`.**
-
- ```typescript
- // Single query
- const companies = await orangeslice.b2b.sql<Company[]>(
- "SELECT * FROM linkedin_company WHERE employee_count > 1000 LIMIT 10"
- );
-
- // Multiple queries - ALWAYS parallel
- const [techCos, healthCos, financeCos] = await Promise.all([
- orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE industry_code = 4 LIMIT 10"),
- orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE industry_code = 14 LIMIT 10"),
- orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE industry_code = 43 LIMIT 10"),
+ ## Public API (services-first)
+
+ Use `services.*` as the primary API surface.
+
+ ```ts
+ import { services } from "orangeslice";
+
+ const [companies, searchPage, ai] = await Promise.all([
+ services.company.linkedin.search({
+ sql: "SELECT * FROM linkedin_company WHERE domain = 'stripe.com' LIMIT 5"
+ }),
+ services.web.search({ query: "site:linkedin.com/company stripe" }),
+ services.ai.generateObject({
+ prompt: "Extract company and founding year from: Stripe was founded in 2010.",
+ schema: {
+ type: "object",
+ properties: {
+ company: { type: "string" },
+ year: { type: "number" }
+ },
+ required: ["company", "year"]
+ }
+ })
  ]);
  ```
 
- ### `orangeslice.b2b.query<T>(query: string): Promise<QueryResult<T>>`
-
- Execute SQL and return full result with metadata.
+ ## Service map
 
- ```typescript
- const result = await orangeslice.b2b.query("SELECT * FROM linkedin_company LIMIT 10");
- // result.rows, result.rowCount, result.duration_ms
- ```
+ - `services.company.linkedin.search/findUrl/enrich`
+ - `services.person.linkedin.search/findUrl/enrich`
+ - `services.person.contact.get`
+ - `services.company.getEmployeesFromLinkedin`
+ - `services.web.search/batchSearch`
+ - `services.ai.generateObject`
+ - `services.scrape.website`
+ - `services.browser.execute`
+ - `services.apify.runActor`
+ - `services.googleMaps.scrape`
+ - `services.geo.parseAddress`
+ - `services.healthcare.npi`
+ - `services.builtWith.lookupDomain/relationships/searchByTech`
 
- ### `orangeslice.b2b.configure(options)`
+ ## How routing works today
 
- Configure rate limiting. Default settings handle parallelization automatically.
+ All service calls go through `post()` in `src/api.ts`.
 
- ```typescript
- orangeslice.b2b.configure({
- concurrency: 3, // default: 2 concurrent requests
- minDelayMs: 200, // default: 100ms between requests
- });
- ```
+ - Default path: `https://enrichly-production.up.railway.app/function`
+ - Direct path: `https://orangeslice.ai/api/function` (only when a call passes `direct: true`)
+ - Polling split for pending/`202`:
+ - `batch-native` (`b2b`, `batchserp`, `generateObject`) polls batch-service result endpoint.
+ - `bridge` functions poll `https://orangeslice.ai/api/function/bridge-result`.
+ - Inngest secrets are server-side only (Next.js route env vars), never in the npm package.
 
- ### `orangeslice.ai.generateObject<T>(options): Promise<T>`
-
- Generate structured data from text using AI.
-
- ```typescript
- const result = await orangeslice.ai.generateObject({
- prompt: "Extract company info: Stripe was founded in 2010 by Patrick Collison",
- schema: {
- type: "object",
- properties: {
- company: { type: "string" },
- year: { type: "number" },
- founder: { type: "string" }
- },
- required: ["company", "year"]
- }
- });
- // { company: "Stripe", year: 2010, founder: "Patrick Collison" }
- ```
+ The current endpoint-to-backend mapping is documented in `docs/runtime-routing.md`.
 
- ### `orangeslice.ai.extract<T>(text, schema, instructions?): Promise<T>`
+ ## Docs installed by CLI
 
- Convenience method to extract structured data from text.
+ After `npx orangeslice`, you should have:
 
- ```typescript
- const data = await orangeslice.ai.extract(
- "Apple Inc was founded in 1976 by Steve Jobs in Cupertino",
- { type: "object", properties: { company: { type: "string" }, year: { type: "number" } } },
- "Extract the company name and founding year"
- );
+ ```text
+ orangeslice-docs/
+ AGENTS.md
+ services/
+ prospecting/
+ ...other synced docs
  ```
 
+ ## Maintainer notes
+
+ - Canonical docs are synced from `../lib/vfs/sheet-chat` via `npm run sync-docs`.
+ - `prepublishOnly` runs docs sync and TypeScript build.
+
  ## Restrictions
 
  - No direct contact data (email/phone)
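
Reviewer sketch (not part of the package): the new README's "How routing works today" section describes two submit endpoints chosen by the `direct` flag. The snippet below is a minimal illustration of that choice, assuming the URL format used by `post()` in `package/dist/api.js`. The names `RouteOptions` and `resolveSubmitUrl` are hypothetical and do not appear in the package.

```typescript
// Endpoints quoted from the README's routing section.
const BASE_URL = "https://enrichly-production.up.railway.app/function";
const DIRECT_BASE_URL = "https://orangeslice.ai/api/function";

// Hypothetical option shape for this sketch; only `direct` is documented.
interface RouteOptions {
  direct?: boolean;
}

// Hypothetical helper: returns the URL a call would be submitted to.
// Mirrors the template used by post(): `${baseUrl}?functionId=${functionId}`.
function resolveSubmitUrl(functionId: string, options: RouteOptions = {}): string {
  const baseUrl = options.direct ? DIRECT_BASE_URL : BASE_URL;
  return `${baseUrl}?functionId=${functionId}`;
}
```

With these assumptions, `resolveSubmitUrl("b2b")` targets the Railway batch-service, and `resolveSubmitUrl("b2b", { direct: true })` targets the `orangeslice.ai` route.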
package/dist/api.d.ts CHANGED
@@ -1,5 +1,9 @@
+ export type FunctionType = "batch-native" | "bridge";
+ export declare const BATCH_NATIVE_FUNCTION_IDS: Set<string>;
  interface PostOptions {
  direct?: boolean;
  }
+ export declare function getFunctionType(functionId: string): FunctionType;
+ export declare function getBridgeResultEventName(functionId: string): string | undefined;
  export declare function post<T>(functionId: string, payload: Record<string, unknown>, options?: PostOptions): Promise<T>;
  export {};
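
Reviewer sketch (not part of the package): the declarations above add a function-type classifier and an event-name lookup. A typed stand-in, based on the compiled logic in `package/dist/api.js`, might look like the following. It uses a `Map` where the compiled code uses an object literal, and it lists only three of the bridge event names for brevity; both choices are assumptions of this sketch.

```typescript
type FunctionType = "batch-native" | "bridge";

// IDs taken from the compiled api.js in this diff.
const BATCH_NATIVE_FUNCTION_IDS: Set<string> = new Set(["b2b", "batchserp", "generateObject"]);

// Illustrative subset of the bridge event names; the package defines more.
// A Map is used here so unknown IDs cleanly yield undefined.
const BRIDGE_RESULT_EVENTS: Map<string, string> = new Map([
  ["firecrawl", "proxy/firecrawl.result"],
  ["kernel", "proxy/kernel.result"],
  ["apify", "proxy/apify.result"],
]);

// Anything not listed as batch-native is treated as a bridge function.
function getFunctionType(functionId: string): FunctionType {
  return BATCH_NATIVE_FUNCTION_IDS.has(functionId) ? "batch-native" : "bridge";
}

// Returns the result event name for a bridge function, or undefined if none is mapped.
function getBridgeResultEventName(functionId: string): string | undefined {
  return BRIDGE_RESULT_EVENTS.get(functionId);
}
```

Note that an unrecognized function ID falls into the `"bridge"` category by default, while its event-name lookup returns `undefined`.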
package/dist/api.js CHANGED
@@ -1,19 +1,49 @@
  "use strict";
  Object.defineProperty(exports, "__esModule", { value: true });
+ exports.BATCH_NATIVE_FUNCTION_IDS = void 0;
+ exports.getFunctionType = getFunctionType;
+ exports.getBridgeResultEventName = getBridgeResultEventName;
  exports.post = post;
+ const crypto_1 = require("crypto");
  /**
- * Runtime routing (hardcoded on purpose):
- * - Non-direct calls -> Railway batch-service /function
- * - Direct calls -> orangeslice.ai /api/function
- * - Inngest polling -> batch-service (configured there)
+ * Runtime routing:
+ * - Non-direct calls submit to Railway batch-service /function
+ * - Direct calls submit to orangeslice.ai /api/function
+ * - Pending polling splits by function type:
+ * - batch-native: poll batch-service /function/result
+ * - bridge: poll Next.js bridge-result route (server-side Inngest key)
  */
  const BASE_URL = "https://enrichly-production.up.railway.app/function";
  const DIRECT_BASE_URL = "https://orangeslice.ai/api/function";
+ const BRIDGE_POLL_URL = "https://orangeslice.ai/api/function/bridge-result";
  const POLL_TIMEOUT_MS = 120000;
  const DEFAULT_POLL_INTERVAL_MS = 1000;
+ const INTERNAL_CALLBACK_ID_KEY = "_orangesliceCallbackEventId";
+ exports.BATCH_NATIVE_FUNCTION_IDS = new Set(["b2b", "batchserp", "generateObject"]);
+ const BRIDGE_RESULT_EVENT_BY_FUNCTION_ID = {
+ firecrawl: "proxy/firecrawl.result",
+ kernel: "proxy/kernel.result",
+ apify: "proxy/apify.result",
+ contact: "proxy/invoke.result",
+ "linkedin/find-profile-url": "proxy/linkedin-profile-url.result",
+ "find-linkedin-company-url": "proxy/linkedin-company-url.result",
+ "b2b-get-employees-for-company": "proxy/invoke.result",
+ geo: "proxy/invoke.result",
+ npi: "proxy/invoke.result",
+ googleMaps: "proxy/invoke.result",
+ builtwithLookupDomain: "proxy/invoke.result",
+ builtwithRelationships: "proxy/invoke.result",
+ builtwithSearchByTech: "proxy/invoke.result",
+ };
  function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
  }
+ function getFunctionType(functionId) {
+ return exports.BATCH_NATIVE_FUNCTION_IDS.has(functionId) ? "batch-native" : "bridge";
+ }
+ function getBridgeResultEventName(functionId) {
+ return BRIDGE_RESULT_EVENT_BY_FUNCTION_ID[functionId];
+ }
  async function readResponseBody(res) {
  const text = await res.text();
  if (!text)
@@ -53,7 +83,7 @@ function isPendingResponse(body) {
  return false;
  return body.pending === true;
  }
- function resolvePollUrl(baseUrl, pending) {
+ function resolveBatchPollUrl(baseUrl, pending) {
  if (pending.pollUrl) {
  return new URL(pending.pollUrl, baseUrl).toString();
  }
@@ -76,8 +106,8 @@ async function fetchWithRedirect(url, init) {
  }
  return res;
  }
- async function pollUntilComplete(baseUrl, functionId, pending) {
- const pollUrl = resolvePollUrl(baseUrl, pending);
+ async function pollBatchUntilComplete(baseUrl, functionId, pending) {
+ const pollUrl = resolveBatchPollUrl(baseUrl, pending);
  const timeoutAt = Date.now() + POLL_TIMEOUT_MS;
  let pollAfterMs = typeof pending.pollAfterMs === "number" && pending.pollAfterMs > 0
  ? pending.pollAfterMs
@@ -103,10 +133,39 @@ async function pollUntilComplete(baseUrl, functionId, pending) {
  }
  throw new Error(`[orangeslice] ${functionId}: polling timed out after ${POLL_TIMEOUT_MS}ms`);
  }
+ async function pollBridgeUntilComplete(functionId, callbackEventId, eventName, firstPollAfterMs) {
+ const timeoutAt = Date.now() + POLL_TIMEOUT_MS;
+ let pollAfterMs = typeof firstPollAfterMs === "number" && firstPollAfterMs > 0 ? firstPollAfterMs : DEFAULT_POLL_INTERVAL_MS;
+ while (Date.now() < timeoutAt) {
+ await sleep(pollAfterMs);
+ const url = `${BRIDGE_POLL_URL}?callbackEventId=${encodeURIComponent(callbackEventId)}&eventName=${encodeURIComponent(eventName)}`;
+ const res = await fetch(url, { method: "GET", headers: { "Content-Type": "application/json" } });
+ const data = await readResponseBody(res);
+ if (isPendingResponse(data) || res.status === 202) {
+ const next = data.pollAfterMs;
+ pollAfterMs = typeof next === "number" && next > 0 ? next : DEFAULT_POLL_INTERVAL_MS;
+ continue;
+ }
+ if (!res.ok) {
+ const message = asErrorMessage(data) || JSON.stringify(data);
+ throw new Error(`[orangeslice] ${functionId}: ${res.status} ${message}`);
+ }
+ const message = asErrorMessage(data);
+ if (message) {
+ throw new Error(`[orangeslice] ${functionId}: ${message}`);
+ }
+ return data;
+ }
+ throw new Error(`[orangeslice] ${functionId}: bridge polling timed out after ${POLL_TIMEOUT_MS}ms`);
+ }
  async function post(functionId, payload, options = {}) {
  const baseUrl = options.direct ? DIRECT_BASE_URL : BASE_URL;
+ const functionType = options.direct ? "batch-native" : getFunctionType(functionId);
+ const bridgeEventName = getBridgeResultEventName(functionId);
+ const callbackEventId = functionType === "bridge" && !options.direct && bridgeEventName ? (0, crypto_1.randomUUID)() : undefined;
+ const requestPayload = callbackEventId && functionType === "bridge" ? { ...payload, [INTERNAL_CALLBACK_ID_KEY]: callbackEventId } : payload;
  const url = `${baseUrl}?functionId=${functionId}`;
- const body = JSON.stringify(payload);
+ const body = JSON.stringify(requestPayload);
  const res = await fetchWithRedirect(url, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
@@ -120,7 +179,16 @@ async function post(functionId, payload, options = {}) {
  }
  const data = await readResponseBody(res);
  if (isPendingResponse(data)) {
- return pollUntilComplete(baseUrl, functionId, data);
+ if (functionType === "bridge") {
+ const bridgePending = data;
+ const resolvedCallbackId = bridgePending.callbackEventId || callbackEventId;
+ const resolvedEventName = bridgePending.eventName || bridgeEventName;
+ if (!resolvedCallbackId || !resolvedEventName) {
+ throw new Error(`[orangeslice] ${functionId}: missing bridge callback metadata for polling`);
+ }
+ return pollBridgeUntilComplete(functionId, resolvedCallbackId, resolvedEventName, bridgePending.pollAfterMs);
+ }
+ return pollBatchUntilComplete(baseUrl, functionId, data);
  }
  const errorMessage = asErrorMessage(data);
  if (errorMessage) {
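
Reviewer sketch (not part of the package): the bridge path in `post()` does two things before polling. It tags the outgoing payload with a callback event ID, then polls the bridge-result route using that ID plus the event name. The snippet below isolates both steps as pure functions so they can be checked without network access. The helper names `withCallbackId` and `buildBridgePollUrl` are hypothetical; the constants and the query-string format are copied from the compiled code in this diff.

```typescript
// Constants copied from the compiled api.js in this diff.
const BRIDGE_POLL_URL = "https://orangeslice.ai/api/function/bridge-result";
const INTERNAL_CALLBACK_ID_KEY = "_orangesliceCallbackEventId";

// Hypothetical helper: attaches the callback ID to the payload when one exists,
// otherwise returns the payload unchanged (as post() does for non-bridge calls).
function withCallbackId(
  payload: Record<string, unknown>,
  callbackEventId: string | undefined
): Record<string, unknown> {
  return callbackEventId
    ? { ...payload, [INTERNAL_CALLBACK_ID_KEY]: callbackEventId }
    : payload;
}

// Hypothetical helper: builds the GET URL that the bridge polling loop requests.
// Both query values are URL-encoded, matching pollBridgeUntilComplete.
function buildBridgePollUrl(callbackEventId: string, eventName: string): string {
  const id = encodeURIComponent(callbackEventId);
  const name = encodeURIComponent(eventName);
  return `${BRIDGE_POLL_URL}?callbackEventId=${id}&eventName=${name}`;
}
```

Because event names such as `proxy/apify.result` contain a slash, the encoding step matters: the slash is sent as `%2F` in the query string.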
package/dist/cli.js CHANGED
@@ -34,9 +34,9 @@ var __importStar = (this && this.__importStar) || (function () {
  };
  })();
  Object.defineProperty(exports, "__esModule", { value: true });
+ const child_process_1 = require("child_process");
  const fs = __importStar(require("fs"));
  const path = __importStar(require("path"));
- const child_process_1 = require("child_process");
  const LEGACY_DOCS_DIR = path.join(__dirname, "..", "docs");
  const TARGET_DIR = path.join(process.cwd(), "orangeslice-docs");
  const AGENTS_FILE = path.join(TARGET_DIR, "AGENTS.md");
@@ -69,7 +69,12 @@ function copyDirSync(src, dest) {
  for (const entry of fs.readdirSync(src, { withFileTypes: true })) {
  const srcPath = path.join(src, entry.name);
  const destPath = path.join(dest, entry.name);
- entry.isDirectory() ? copyDirSync(srcPath, destPath) : fs.copyFileSync(srcPath, destPath);
+ if (entry.isDirectory()) {
+ copyDirSync(srcPath, destPath);
+ }
+ else {
+ fs.copyFileSync(srcPath, destPath);
+ }
  }
  }
  function writeAgentsGuide(destDir) {
@@ -95,7 +100,7 @@ Use the docs in this folder as the source of truth for all orangeslice operation
  fs.writeFileSync(AGENTS_FILE, content, "utf8");
  }
  async function main() {
- console.log("\n🍊 orangeslice\n");
+ console.log("\norangeslice\n");
  const docsDir = resolveDocsDir();
  // Copy docs
  if (fs.existsSync(TARGET_DIR))
@@ -110,15 +115,19 @@ async function main() {
  catch {
  // no package.json or npm not available
  }
- console.log("\n✅ Ready services-style API\n");
+ console.log("\nReady - services-style API\n");
  console.log(" import { services } from 'orangeslice';\n");
- console.log(" 🤖 Agent setup (do this first):");
+ console.log(" Agent setup (do this first):");
  console.log(" Ask your agent to read:");
  console.log(" 1) ./orangeslice-docs/AGENTS.md");
  console.log(" 2) ./orangeslice-docs/services/index.md");
- console.log(" Then tell it: \"Use these docs as source of truth for all orangeslice operations.\"\n");
+ console.log(' Then tell it: "Use these docs as source of truth for all orangeslice operations."\n');
+ console.log(" Routing note:");
+ console.log(" - LinkedIn SQL (functionId=b2b) is routed direct");
+ console.log(" - batch-native: b2b/batchserp/generateObject poll batch-service results");
+ console.log(" - bridge functions poll a server route (Inngest keys are server-side only)\n");
  console.log(" // LinkedIn B2B SQL");
- console.log(" const { rows } = await services.company.linkedin.search({ sql: \"SELECT * FROM linkedin_company LIMIT 10\" });\n");
+ console.log(' const { rows } = await services.company.linkedin.search({ sql: "SELECT * FROM linkedin_company LIMIT 10" });\n');
  console.log(" // Web search");
  console.log(" const page = await services.web.search({ query: 'best CRM software' });\n");
  console.log(" // Batched web search");
@@ -126,9 +135,9 @@ async function main() {
  console.log(" // AI structured output");
  console.log(" const { object } = await services.ai.generateObject({ prompt: '...', schema: {...} });\n");
  console.log(" // Browser automation (Kernel)");
- console.log(" const browser = await services.browser.execute({ code: \"return await page.title();\" });\n");
+ console.log(' const browser = await services.browser.execute({ code: "return await page.title();" });\n');
  console.log(" // Apify actor");
  console.log(" const actor = await services.apify.runActor({ actor: 'apify/web-scraper', input: {} });\n");
- console.log(" ⚠️ Always parallelize independent calls with Promise.all.\n");
+ console.log(" Always parallelize independent calls with Promise.all.\n");
  }
  main().catch(console.error);
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "orangeslice",
- "version": "1.7.21",
+ "version": "1.8.0",
  "description": "B2B LinkedIn database prospector - 1.15B profiles, 85M companies",
  "main": "dist/index.js",
  "types": "dist/index.d.ts",