orangeslice 1.0.0

package/README.md ADDED
@@ -0,0 +1,107 @@
1
+ # orangeslice
2
+
3
+ Rate-limited B2B API client for AI agents. Call `orangeslice.b2b.sql()` anywhere, anytime; concurrency and rate limiting are handled automatically.
4
+
5
+ ## Documentation
6
+
7
+ **Before writing queries, read the docs in [`./docs/`](./docs/):**
8
+
9
+ | Doc | What it covers |
10
+ |-----|----------------|
11
+ | [B2B_DATABASE.md](./docs/B2B_DATABASE.md) | Database overview, API endpoint, request format |
12
+ | [B2B_SCHEMA.md](./docs/B2B_SCHEMA.md) | All tables and columns |
13
+ | [B2B_EMPLOYEE_SEARCH.md](./docs/B2B_EMPLOYEE_SEARCH.md) | How to search for people/employees |
14
+ | [B2B_GENERALIZATION_RULES.md](./docs/B2B_GENERALIZATION_RULES.md) | Query patterns and best practices |
15
+ | [B2B_NLP_QUERY_MAPPINGS.md](./docs/B2B_NLP_QUERY_MAPPINGS.md) | Natural language to SQL mappings |
16
+ | [B2B_TABLE_INDICES.ts](./docs/B2B_TABLE_INDICES.ts) | TypeScript types for all tables |
17
+
18
+ Start with `B2B_DATABASE.md` and `B2B_SCHEMA.md` to understand the data model.
19
+
20
+ ## Installation
21
+
22
+ ```bash
23
+ npm install orangeslice
24
+ ```
25
+
26
+ ## Usage
27
+
28
+ ```typescript
29
+ import { orangeslice } from 'orangeslice';
30
+
31
+ // Query the B2B database - automatically rate-limited
32
+ const companies = await orangeslice.b2b.sql(`
33
+ SELECT company_name, domain, employee_count
34
+ FROM linkedin_company
35
+ WHERE universal_name = 'stripe'
36
+ `);
37
+
38
+ // Multiple parallel calls are queued (max 2 concurrent by default)
39
+ const results = await Promise.all([
40
+ orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'stripe'"),
41
+ orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'openai'"),
42
+ orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'meta'"),
43
+ orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'google'"),
44
+ orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'amazon'"),
45
+ ]);
46
+ // ^ Only 2 run at once, rest wait in queue automatically
47
+ ```
48
+
49
+ ## Configuration
50
+
51
+ ```typescript
52
+ // Optional - configure before use
53
+ orangeslice.b2b.configure({
54
+ proxyUrl: 'http://your-proxy-url:3000/query', // default: B2B_SQL_PROXY_URL env var
55
+ concurrency: 3, // default: 2
56
+ minDelayMs: 200, // default: 100ms between requests
57
+ });
58
+ ```
59
+
60
+ ## Environment Variables
61
+
62
+ ```bash
63
+ B2B_SQL_PROXY_URL=http://165.22.151.131:3000/query
64
+ ```
65
+
66
+ ## How It Works
67
+
68
+ ```
69
+ Agent calls: [1] [2] [3] [4] [5] [6]
70
+
71
+ Queue: [ 1, 2 running ] [ 3, 4, 5, 6 waiting ]
72
+
73
+ API: [1] [2] ← only 2 hit the API at once
74
+
75
+ [3] [4] ← when 1,2 finish, 3,4 start
76
+ ```
77
+
78
+ **Agent never has to:**
79
+ - Think about concurrency
80
+ - Add `await sleep()`
81
+ - Worry about rate limits
82
+ - Handle API throttling errors in the common case (a non-OK response from the proxy still throws an `Error`)
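
The queueing behaviour in the diagram can be reproduced in isolation. The sketch below is a typed restatement of the `createQueue` helper shipped in `dist/queue.js`. The names `sleep` and `measurePeak` are hypothetical and added only for illustration, and the 10 ms delay stands in for a network call. It pushes six fake calls through a queue of size 2 and records the peak number in flight.

```typescript
// Typed restatement of createQueue from dist/queue.js.
function createQueue(concurrency: number) {
  let active = 0;
  const pending: Array<() => void> = [];

  // Start the next waiting call if a slot is free.
  const next = (): void => {
    if (active < concurrency && pending.length > 0) {
      active++;
      const resolve = pending.shift()!;
      resolve();
    }
  };

  return async <T>(fn: () => Promise<T>): Promise<T> => {
    // Wait for a slot to open.
    await new Promise<void>((resolve) => {
      pending.push(resolve);
      next();
    });
    try {
      return await fn();
    } finally {
      active--;
      next();
    }
  };
}

// Hypothetical stand-in for a network call: resolves after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical demo: run six fake calls and report the peak number in flight.
async function measurePeak(): Promise<number> {
  const queue = createQueue(2);
  let inFlight = 0;
  let peak = 0;

  const task = async (): Promise<void> => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await sleep(10);
    inFlight--;
  };

  const calls: Promise<void>[] = [];
  for (let i = 0; i < 6; i++) {
    calls.push(queue(task));
  }
  await Promise.all(calls);
  return peak;
}

const peakPromise: Promise<number> = measurePeak();
```

`peakPromise` resolves to 2: at no point are more than two tasks running, even though all six were submitted at once.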
83
+
84
+ ## API Reference
85
+
86
+ ### `orangeslice.b2b.sql<T>(query: string): Promise<T>`
87
+
88
+ Execute SQL and return rows.
89
+
90
+ ```typescript
91
+ const companies = await orangeslice.b2b.sql<Company[]>(
92
+ "SELECT * FROM linkedin_company WHERE employee_count > 1000 LIMIT 10"
93
+ );
94
+ ```
95
+
96
+ ### `orangeslice.b2b.query<T>(query: string): Promise<QueryResult<T>>`
97
+
98
+ Execute SQL and return full result with metadata.
99
+
100
+ ```typescript
101
+ const result = await orangeslice.b2b.query("SELECT * FROM linkedin_company LIMIT 10");
102
+ // result.rows, result.rowCount, result.duration_ms
103
+ ```
104
+
105
+ ## Note on Concurrency
106
+
107
+ The rate limit is **per-process**. If you run multiple scripts simultaneously, each has its own queue. For most AI agent use cases (single script), this is fine.
package/dist/b2b.d.ts ADDED
@@ -0,0 +1,30 @@
1
+ /**
2
+ * Configure the B2B client
3
+ */
4
+ export declare function configure(options: {
5
+ proxyUrl?: string;
6
+ concurrency?: number;
7
+ minDelayMs?: number;
8
+ }): void;
9
+ export interface QueryResult<T = Record<string, unknown>> {
10
+ rows: T[];
11
+ rowCount: number;
12
+ duration_ms: number;
13
+ }
14
+ /**
15
+ * Execute a SQL query against the B2B database.
16
+ * Automatically rate-limited and concurrency-controlled.
17
+ *
18
+ * @example
19
+ * const companies = await b2b.sql<Company[]>("SELECT * FROM linkedin_company WHERE domain = 'stripe.com'");
20
+ */
21
+ export declare function sql<T = Record<string, unknown>[]>(query: string): Promise<T>;
22
+ /**
23
+ * Execute a SQL query and get full result with metadata
24
+ */
25
+ export declare function query<T = Record<string, unknown>>(sqlQuery: string): Promise<QueryResult<T>>;
26
+ export declare const b2b: {
27
+ sql: typeof sql;
28
+ query: typeof query;
29
+ configure: typeof configure;
30
+ };
package/dist/b2b.js ADDED
@@ -0,0 +1,89 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.b2b = void 0;
4
+ exports.configure = configure;
5
+ exports.sql = sql;
6
+ exports.query = query;
7
+ const queue_1 = require("./queue");
8
+ // Default config
9
+ let config = {
10
+ proxyUrl: process.env.B2B_SQL_PROXY_URL || "http://165.22.151.131:3000/query",
11
+ concurrency: 2,
12
+ minDelayMs: 100, // 100ms between requests = max 10/sec
13
+ };
14
+ // Create queue and rate limiter with defaults
15
+ let queue = (0, queue_1.createQueue)(config.concurrency);
16
+ let rateLimiter = (0, queue_1.createRateLimiter)(config.minDelayMs);
17
+ /**
18
+ * Configure the B2B client
19
+ */
20
+ function configure(options) {
21
+ if (options.proxyUrl)
22
+ config.proxyUrl = options.proxyUrl;
23
+ if (options.concurrency) {
24
+ config.concurrency = options.concurrency;
25
+ queue = (0, queue_1.createQueue)(options.concurrency);
26
+ }
27
+ if (options.minDelayMs !== undefined) {
28
+ config.minDelayMs = options.minDelayMs;
29
+ rateLimiter = (0, queue_1.createRateLimiter)(options.minDelayMs);
30
+ }
31
+ }
32
+ /**
33
+ * Execute a SQL query against the B2B database.
34
+ * Automatically rate-limited and concurrency-controlled.
35
+ *
36
+ * @example
37
+ * const companies = await b2b.sql<Company[]>("SELECT * FROM linkedin_company WHERE domain = 'stripe.com'");
38
+ */
39
+ async function sql(query) {
40
+ return queue(async () => {
41
+ return rateLimiter(async () => {
42
+ const response = await fetch(config.proxyUrl, {
43
+ method: "POST",
44
+ headers: { "Content-Type": "application/json" },
45
+ body: JSON.stringify({ sql: query }),
46
+ });
47
+ if (!response.ok) {
48
+ throw new Error(`B2B SQL request failed: ${response.status} ${response.statusText}`);
49
+ }
50
+ const data = (await response.json());
51
+ if (data.error) {
52
+ throw new Error(`B2B SQL error: ${data.error}`);
53
+ }
54
+ return (data.rows || []);
55
+ });
56
+ });
57
+ }
58
+ /**
59
+ * Execute a SQL query and get full result with metadata
60
+ */
61
+ async function query(sqlQuery) {
62
+ return queue(async () => {
63
+ return rateLimiter(async () => {
64
+ const response = await fetch(config.proxyUrl, {
65
+ method: "POST",
66
+ headers: { "Content-Type": "application/json" },
67
+ body: JSON.stringify({ sql: sqlQuery }),
68
+ });
69
+ if (!response.ok) {
70
+ throw new Error(`B2B SQL request failed: ${response.status} ${response.statusText}`);
71
+ }
72
+ const data = (await response.json());
73
+ if (data.error) {
74
+ throw new Error(`B2B SQL error: ${data.error}`);
75
+ }
76
+ return {
77
+ rows: (data.rows || []),
78
+ rowCount: data.rowCount || 0,
79
+ duration_ms: data.duration_ms || 0,
80
+ };
81
+ });
82
+ });
83
+ }
84
+ // Export as namespace
85
+ exports.b2b = {
86
+ sql,
87
+ query,
88
+ configure,
89
+ };
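
`sql()` and `query()` above repeat the same error check and defaulting on the parsed response. A minimal sketch of that shared step as a standalone function follows. `ProxyPayload` and `toQueryResult` are hypothetical names that are not part of the package, and the payload shape is inferred only from the fields `b2b.js` reads.

```typescript
// Shape returned by query(), as declared in dist/b2b.d.ts.
interface QueryResult<T = Record<string, unknown>> {
  rows: T[];
  rowCount: number;
  duration_ms: number;
}

// Hypothetical shape of the proxy's JSON payload (assumption: only these fields matter).
interface ProxyPayload<T> {
  rows?: T[];
  rowCount?: number;
  duration_ms?: number;
  error?: string;
}

// Hypothetical helper: same error check and defaults as sql() and query() in b2b.js.
function toQueryResult<T = Record<string, unknown>>(data: ProxyPayload<T>): QueryResult<T> {
  if (data.error) {
    throw new Error(`B2B SQL error: ${data.error}`);
  }
  return {
    rows: data.rows || [],
    rowCount: data.rowCount || 0,
    duration_ms: data.duration_ms || 0,
  };
}

// A well-formed payload passes through unchanged.
const ok = toQueryResult<{ id: number }>({ rows: [{ id: 1 }], rowCount: 1, duration_ms: 12 });

// Missing fields fall back to an empty result rather than undefined.
const empty = toQueryResult({});
```

With a helper like this, `sql()` would return `toQueryResult(data).rows` and `query()` would return the whole object, so the two functions could not drift apart.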
@@ -0,0 +1,29 @@
1
+ import { b2b } from "./b2b";
2
+ export { b2b };
3
+ /**
4
+ * Main orangeslice namespace
5
+ *
6
+ * @example
7
+ * import { orangeslice } from 'orangeslice';
8
+ *
9
+ * // Configure (optional)
10
+ * orangeslice.b2b.configure({ concurrency: 3 });
11
+ *
12
+ * // Query - automatically rate-limited
13
+ * const companies = await orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE domain = 'stripe.com'");
14
+ *
15
+ * // Multiple parallel calls are queued automatically
16
+ * const results = await Promise.all([
17
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'stripe'"),
18
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'openai'"),
19
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'meta'"),
20
+ * ]);
21
+ */
22
+ export declare const orangeslice: {
23
+ b2b: {
24
+ sql: typeof import("./b2b").sql;
25
+ query: typeof import("./b2b").query;
26
+ configure: typeof import("./b2b").configure;
27
+ };
28
+ };
29
+ export default orangeslice;
package/dist/index.js ADDED
@@ -0,0 +1,28 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.orangeslice = exports.b2b = void 0;
4
+ const b2b_1 = require("./b2b");
5
+ Object.defineProperty(exports, "b2b", { enumerable: true, get: function () { return b2b_1.b2b; } });
6
+ /**
7
+ * Main orangeslice namespace
8
+ *
9
+ * @example
10
+ * import { orangeslice } from 'orangeslice';
11
+ *
12
+ * // Configure (optional)
13
+ * orangeslice.b2b.configure({ concurrency: 3 });
14
+ *
15
+ * // Query - automatically rate-limited
16
+ * const companies = await orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE domain = 'stripe.com'");
17
+ *
18
+ * // Multiple parallel calls are queued automatically
19
+ * const results = await Promise.all([
20
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'stripe'"),
21
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'openai'"),
22
+ * orangeslice.b2b.sql("SELECT * FROM linkedin_company WHERE universal_name = 'meta'"),
23
+ * ]);
24
+ */
25
+ exports.orangeslice = {
26
+ b2b: b2b_1.b2b,
27
+ };
28
+ exports.default = exports.orangeslice;
@@ -0,0 +1,9 @@
1
+ /**
2
+ * Simple concurrency queue - limits how many async operations run at once.
3
+ * Any excess calls are queued and run when a slot opens.
4
+ */
5
+ export declare function createQueue(concurrency: number): <T>(fn: () => Promise<T>) => Promise<T>;
6
+ /**
7
+ * Rate limiter - ensures minimum delay between requests
8
+ */
9
+ export declare function createRateLimiter(minDelayMs: number): <T>(fn: () => Promise<T>) => Promise<T>;
package/dist/queue.js ADDED
@@ -0,0 +1,48 @@
1
+ "use strict";
2
+ Object.defineProperty(exports, "__esModule", { value: true });
3
+ exports.createQueue = createQueue;
4
+ exports.createRateLimiter = createRateLimiter;
5
+ /**
6
+ * Simple concurrency queue - limits how many async operations run at once.
7
+ * Any excess calls are queued and run when a slot opens.
8
+ */
9
+ function createQueue(concurrency) {
10
+ let active = 0;
11
+ const pending = [];
12
+ const next = () => {
13
+ if (active < concurrency && pending.length > 0) {
14
+ active++;
15
+ const resolve = pending.shift();
16
+ resolve();
17
+ }
18
+ };
19
+ return async (fn) => {
20
+ // Wait for a slot to open
21
+ await new Promise((resolve) => {
22
+ pending.push(resolve);
23
+ next();
24
+ });
25
+ try {
26
+ return await fn();
27
+ }
28
+ finally {
29
+ active--;
30
+ next();
31
+ }
32
+ };
33
+ }
34
+ /**
35
+ * Rate limiter - ensures minimum delay between requests
36
+ */
37
+ function createRateLimiter(minDelayMs) {
38
+ let lastRequest = 0;
39
+ return async (fn) => {
40
+ const now = Date.now();
41
+ const timeSinceLastRequest = now - lastRequest;
42
+ if (timeSinceLastRequest < minDelayMs) {
43
+ await new Promise((resolve) => setTimeout(resolve, minDelayMs - timeSinceLastRequest));
44
+ }
45
+ lastRequest = Date.now();
46
+ return fn();
47
+ };
48
+ }
@@ -0,0 +1,255 @@
1
+ # B2B Cross-Table Query Test Findings
2
+
3
+ Comprehensive performance comparison between normalized tables (`linkedin_profile`, `linkedin_company`) and denormalized views (`lkd_profile`, `lkd_company`) for cross-table queries.
4
+
5
+ ---
6
+
7
+ ## Executive Summary
8
+
9
+ | Pattern | Normalized | Denormalized | Winner | Speedup |
10
+ | ---------------------------------- | ---------- | ------------ | ------------ | ------- |
11
+ | **Company ID lookup → employees** | 48ms | 279ms | Normalized | 5.8x |
12
+ | **Company name (org) search** | 274ms | 8,600ms | Normalized | 31x |
13
+ | **GIN-indexed org ILIKE** | 430ms | 29,409ms | Normalized | 68x |
14
+ | **Title ILIKE (common term)** | 64ms | 313ms | Normalized | 4.9x |
15
+ | **updated_at filter** | 4ms | 14ms | Normalized | 3.5x |
16
+ | **Company ID direct lookup** | 4ms | 31ms | Normalized | 7.8x |
17
+ | **Headline (rare term)** | 2,530ms | 1,258ms | Denormalized | 2x |
18
+ | **Skill array search** | 216ms | 169ms | Denormalized | 1.3x |
19
+ | **Industry + employee_count** | 742ms | 202ms | Denormalized | 3.7x |
20
+ | **Headline + company size (JOIN)** | 20,205ms | 217ms | Denormalized | 93x |
21
+ | **Multi-skill + company size** | 28,173ms | 1,281ms | Denormalized | 22x |
22
+ | **Skill + company industry** | TIMEOUT | 3,553ms | Denormalized | ∞ |
23
+ | **Complex multi-filter + company** | TIMEOUT | 4,947ms | Denormalized | ∞ |
24
+ | **AI company + SF location** | TIMEOUT | 11,061ms | Denormalized | ∞ |
25
+
26
+ **Key Finding**: When combining profile text filters (headline, skills) with company constraints (employee_count, industry), **denormalized JOINs are roughly 20-93x faster** and are often the only option that completes within the timeout.
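
The speedup column is the slower timing divided by the faster one. As a quick check of three rows from the summary table (the `speedup` helper is a hypothetical name used only here):

```typescript
// Speedup = slower time / faster time, both in milliseconds.
function speedup(slowerMs: number, fasterMs: number): number {
  return slowerMs / fasterMs;
}

const headlineJoin = speedup(20205, 217); // headline + company size, denormalized wins
const multiSkill = speedup(28173, 1281);  // multi-skill + company size, denormalized wins
const orgSearch = speedup(8600, 274);     // company name (org) search, normalized wins

console.log(headlineJoin.toFixed(1), multiSkill.toFixed(1), orgSearch.toFixed(1));
// prints: 93.1 22.0 31.4
```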
27
+
28
+ ---
29
+
30
+ ## Critical Pattern: Profile + Company Combined Filters
31
+
32
+ The most important discovery: **cross-table queries with text filters perform dramatically better with denormalized tables**.
33
+
34
+ ### Normalized Multi-JOIN (Often Fails)
35
+
36
+ ```sql
37
+ -- ❌ TIMEOUT or 20+ seconds
38
+ SELECT lp.id, lp.first_name, lp.headline, lc.company_name
39
+ FROM linkedin_profile lp
40
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
41
+ JOIN linkedin_company lc ON lc.id = pos.linkedin_company_id
42
+ WHERE pos.end_date IS NULL
43
+ AND lp.headline ILIKE '%engineer%'
44
+ AND lc.employee_count > 1000
45
+ LIMIT 50
46
+ -- Result: 20,205ms
47
+ ```
48
+
49
+ ### Denormalized JOIN (Fast)
50
+
51
+ ```sql
52
+ -- ✅ 217ms - 93x faster
53
+ SELECT lkd.profile_id, lkd.first_name, lkd.headline, lkdc.name
54
+ FROM lkd_profile lkd
55
+ JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id
56
+ WHERE lkd.headline ILIKE '%engineer%'
57
+ AND lkdc.employee_count > 1000
58
+ LIMIT 50
59
+ -- Result: 217ms
60
+ ```
61
+
62
+ ---
63
+
64
+ ## Test Results by Category
65
+
66
+ ### A. Company-First Queries
67
+
68
+ | Test | Query | Normalized | Denormalized | Winner |
69
+ | ---- | ------------------------------- | ---------- | ------------ | ----------------- |
70
+ | A1 | Employees at company ID | **48ms** | 279ms | Normalized (5.8x) |
71
+ | A2 | Employees by company name (org) | **274ms** | 8,600ms | Normalized (31x) |
72
+ | A3 | Engineers at large companies | **96ms** | 234ms | Normalized (2.4x) |
73
+
74
+ **Conclusion**: For company-first queries, normalized tables win due to indexed lookups.
75
+
76
+ ### B. Profile-First Queries
77
+
78
+ | Test | Query | Normalized | Denormalized | Winner |
79
+ | ---- | -------------------------- | ---------- | ------------ | ------------------- |
80
+ | B1 | Python developers | 216ms | **169ms** | Denormalized (1.3x) |
81
+ | B2 | US Data Scientists | 644ms | **557ms** | Denormalized (1.2x) |
82
+ | B3 | Senior engineers + company | 4,535ms | **196ms** | Denormalized (23x) |
83
+
84
+ **Conclusion**: Simple profile queries are similar; profile + company queries favor denormalized.
85
+
86
+ ### C. Complex Prospecting Queries
87
+
88
+ | Test | Query | Normalized | Denormalized | Winner |
89
+ | ---- | ----------------------------------------- | ----------- | ------------ | ----------------- |
90
+ | C1 | Decision makers at funded startups | **1,198ms** | 3,124ms | Normalized (2.6x) |
91
+ | C2 | AI company employees in SF | TIMEOUT | **11,061ms** | Denormalized (∞) |
92
+ | C3 | Hybrid (normalized profile + lkd_company) | 9,631ms | - | - |
93
+
94
+ **Conclusion**: When the funding table is used (indexed JOIN), normalized wins. When text filters span tables, denormalized wins.
95
+
96
+ ### D. Company Lookups
97
+
98
+ | Test | Query | Normalized | Denormalized | Winner |
99
+ | ---- | -------------------------- | ---------- | ------------ | ------------------- |
100
+ | D1 | Company by ID | **4ms** | 31ms | Normalized (7.8x) |
101
+ | D2 | Industry + employee filter | 742ms | **202ms** | Denormalized (3.7x) |
102
+
103
+ ### E. Edge Cases
104
+
105
+ | Test | Query | Normalized | Denormalized | Winner |
106
+ | ---- | --------------------- | ---------- | ------------ | ------------------- |
107
+ | E1 | Headline (blockchain) | 713ms | **384ms** | Denormalized (1.9x) |
108
+ | E2 | Company description | 144ms | 152ms | Tie |
109
+
110
+ ### F. Verification Tests
111
+
112
+ | Test | Query | Normalized | Denormalized | Winner |
113
+ | ---- | ---------------------------- | ---------- | ------------ | ------------------- |
114
+ | F1 | Multi-skill + company size | 28,173ms | **1,281ms** | Denormalized (22x) |
115
+ | F2 | Country + org (GIN) | **990ms** | 4,594ms | Normalized (4.6x) |
116
+ | F3 | Title regex + company filter | 434ms | **227ms** | Denormalized (1.9x) |
117
+
118
+ ### G. Index Pattern Tests
119
+
120
+ | Test | Query | Normalized | Denormalized | Winner |
121
+ | ---- | ------------------------- | ---------- | ------------ | ----------------- |
122
+ | G1 | org ILIKE (GIN indexed) | **430ms** | 29,409ms | Normalized (68x) |
123
+ | G2 | headline ILIKE (no index) | 2,530ms | **1,258ms** | Denormalized (2x) |
124
+ | G3 | title ILIKE | **64ms** | 313ms | Normalized (4.9x) |
125
+ | G4 | updated_at filter | **4ms** | 14ms | Normalized (3.5x) |
126
+
127
+ ### H. Cross-Table JOIN Patterns
128
+
129
+ | Test | Query | Normalized | Denormalized | Winner |
130
+ | ---- | ------------------------- | ---------- | ------------ | ------------------ |
131
+ | H1 | Headline + employee_count | 20,205ms | **217ms** | Denormalized (93x) |
132
+ | H2 | Skill + company industry | TIMEOUT | **3,553ms** | Denormalized (∞) |
133
+ | H3 | Multi-filter + company | TIMEOUT | **4,947ms** | Denormalized (∞) |
134
+
135
+ ---
136
+
137
+ ## Decision Rules for Cross-Table Queries
138
+
139
+ ### Use Normalized (`linkedin_profile` + `linkedin_company` JOINs) When:
140
+
141
+ 1. **Company-first lookup** - Start with company ID, get employees
142
+ 2. **GIN-indexed field** - Searching `linkedin_profile.org` (company name)
143
+ 3. **Indexed lookups** - `updated_at`, company ID, profile ID
144
+ 4. **Title field search** - `linkedin_profile.title` is faster
145
+ 5. **Indexed JOIN tables** - `linkedin_crunchbase_funding`, `linkedin_profile_position3` by company
146
+
147
+ ### Use Denormalized (`lkd_profile` JOIN `lkd_company`) When:
148
+
149
+ 1. **Headline + company filter** - 93x faster
150
+ 2. **Skill + company constraint** - Normalized times out
151
+ 3. **Multi-filter combinations** - 22x faster
152
+ 4. **Industry + employee_count** - 3.7x faster
153
+ 5. **Text filter spanning profile + company** - Often only option
154
+
155
+ ### Never Use:
156
+
157
+ 1. `lkd_profile.company_name` ILIKE - Use `linkedin_profile.org` (68x faster)
158
+ 2. Normalized multi-JOIN with headline filter - Will time out or take 20s+
159
+
160
+ ---
161
+
162
+ ## Recommended Query Patterns
163
+
164
+ ### Pattern 1: Find Employees at Company by Name
165
+
166
+ ```sql
167
+ -- ✅ BEST: Use GIN-indexed org field
168
+ SELECT id, first_name, title, headline, org
169
+ FROM linkedin_profile
170
+ WHERE org ILIKE '%Google%'
171
+ LIMIT 50
172
+ -- Result: 274ms
173
+ ```
174
+
175
+ ### Pattern 2: Find Engineers at Large Companies
176
+
177
+ ```sql
178
+ -- ✅ BEST: Denormalized JOIN (93x faster)
179
+ SELECT lkd.profile_id, lkd.first_name, lkd.headline, lkdc.name, lkdc.employee_count
180
+ FROM lkd_profile lkd
181
+ JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id
182
+ WHERE lkd.headline ILIKE '%engineer%'
183
+ AND lkdc.employee_count > 1000
184
+ LIMIT 50
185
+ -- Result: 217ms
186
+ ```
187
+
188
+ ### Pattern 3: Find People with Skills at Specific Company Types
189
+
190
+ ```sql
191
+ -- ✅ BEST: Denormalized (normalized times out)
192
+ SELECT lkd.profile_id, lkd.first_name, lkd.headline, lkdc.name
193
+ FROM lkd_profile lkd
194
+ JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id
195
+ WHERE 'Python' = ANY(lkd.skills)
196
+ AND 'SQL' = ANY(lkd.skills)
197
+ AND lkdc.employee_count BETWEEN 100 AND 5000
198
+ LIMIT 50
199
+ -- Result: 1,281ms (normalized: 28,173ms)
200
+ ```
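
When the skill list and size bounds come from an agent instead of being hard-coded, the Pattern 3 query can be assembled by a small helper. The following is a sketch only. `sqlLiteral` and `buildSkillQuery` are hypothetical names, and the quoting assumes standard SQL string literals, where a single quote is escaped by doubling it. The source does not say whether the proxy accepts bound parameters, which would be preferable if it does.

```typescript
// Escape a value for use inside a single-quoted SQL string literal.
function sqlLiteral(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Hypothetical builder for the Pattern 3 query: every skill must match, and the
// company head count must fall within [minEmployees, maxEmployees].
function buildSkillQuery(
  skills: string[],
  minEmployees: number,
  maxEmployees: number,
  limit: number = 50
): string {
  if (skills.length === 0) {
    throw new Error("at least one skill is required");
  }
  // Numbers are interpolated directly, so reject anything that is not a non-negative integer.
  for (const n of [minEmployees, maxEmployees, limit]) {
    if (!isFinite(n) || Math.floor(n) !== n || n < 0) {
      throw new Error(`invalid numeric bound: ${n}`);
    }
  }
  const skillFilters = skills
    .map((s) => `${sqlLiteral(s)} = ANY(lkd.skills)`)
    .join("\n  AND ");
  return [
    "SELECT lkd.profile_id, lkd.first_name, lkd.headline, lkdc.name",
    "FROM lkd_profile lkd",
    "JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id",
    `WHERE ${skillFilters}`,
    `  AND lkdc.employee_count BETWEEN ${minEmployees} AND ${maxEmployees}`,
    `LIMIT ${limit}`,
  ].join("\n");
}

// Reproduces the Pattern 3 query text.
const pattern3 = buildSkillQuery(["Python", "SQL"], 100, 5000);
```

The resulting string can be passed to `orangeslice.b2b.sql()` as-is.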
201
+
202
+ ### Pattern 4: Prospecting Query (Profile Criteria + Company Criteria)
203
+
204
+ ```sql
205
+ -- ✅ BEST: Denormalized for multi-filter
206
+ SELECT lkd.profile_id, lkd.first_name, lkd.title, lkdc.name, lkdc.employee_count
207
+ FROM lkd_profile lkd
208
+ JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id
209
+ WHERE lkd.title ~* '(manager|director|lead)'
210
+ AND lkdc.employee_count BETWEEN 100 AND 1000
211
+ LIMIT 50
212
+ -- Result: 227ms (normalized: 434ms)
213
+ ```
214
+
215
+ ### Pattern 5: Decision Makers at Funded Startups
216
+
217
+ ```sql
218
+ -- ✅ BEST: Normalized when using indexed funding table
219
+ SELECT DISTINCT lp.id, lp.first_name, lp.title, lc.company_name
220
+ FROM linkedin_profile lp
221
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
222
+ JOIN linkedin_company lc ON lc.id = pos.linkedin_company_id
223
+ JOIN linkedin_crunchbase_funding cf ON cf.linkedin_company_id = lc.id
224
+ WHERE pos.end_date IS NULL
225
+ AND lp.title ~* '(CEO|CTO|VP|Director|Head)'
226
+ AND lc.employee_count BETWEEN 10 AND 500
227
+ LIMIT 50
228
+ -- Result: 1,198ms
229
+ ```
230
+
231
+ ---
232
+
233
+ ## Summary: The Cross-Table Golden Rules
234
+
235
+ 1. **Company name search** → Always use `linkedin_profile.org` (GIN indexed, 68x faster)
236
+ 2. **Headline/skill + company constraint** → Always use denormalized JOIN (20-93x faster, normalized often times out)
237
+ 3. **Company-first lookups** → Use normalized (5-8x faster)
238
+ 4. **Indexed table JOINs (funding, positions)** → Normalized is fine
239
+ 5. **Multi-filter profile + company** → Prefer denormalized; in tests H2 and H3 the normalized version timed out
240
+
241
+ ### Quick Decision:
242
+
243
+ ```
244
+ Need to search by company name?
245
+ └─ YES → Use linkedin_profile.org
246
+
247
+ Need profile text filter (headline/skills) + company constraint?
248
+ └─ YES → Use lkd_profile JOIN lkd_company
249
+
250
+ Need company ID lookup or indexed JOIN?
251
+ └─ YES → Use normalized tables
252
+
253
+ Default for prospecting queries:
254
+ └─ Use lkd_profile JOIN lkd_company
255
+ ```
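
The quick-decision tree can also be written as a small function, which may be handy when an agent picks tables programmatically. This is a sketch: `QueryNeeds`, its field names, and `chooseTables` are hypothetical, and it assumes the questions are checked in the order listed.

```typescript
type TableChoice =
  | "linkedin_profile.org"
  | "lkd_profile JOIN lkd_company"
  | "normalized tables";

// Hypothetical description of what a query needs; field names are illustrative.
interface QueryNeeds {
  searchesByCompanyName?: boolean;
  hasProfileTextFilter?: boolean;       // headline or skills
  hasCompanyConstraint?: boolean;       // employee_count, industry, ...
  usesCompanyIdOrIndexedJoin?: boolean; // company ID lookup, funding or position JOINs
}

// Walk the decision tree from top to bottom (assumed precedence).
function chooseTables(needs: QueryNeeds): TableChoice {
  if (needs.searchesByCompanyName) {
    return "linkedin_profile.org";
  }
  if (needs.hasProfileTextFilter && needs.hasCompanyConstraint) {
    return "lkd_profile JOIN lkd_company";
  }
  if (needs.usesCompanyIdOrIndexedJoin) {
    return "normalized tables";
  }
  // Default for prospecting queries.
  return "lkd_profile JOIN lkd_company";
}

const byName = chooseTables({ searchesByCompanyName: true });
const textPlusCompany = chooseTables({ hasProfileTextFilter: true, hasCompanyConstraint: true });
const byId = chooseTables({ usesCompanyIdOrIndexedJoin: true });
const fallback = chooseTables({});
```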