orangeslice 1.6.1 → 1.7.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/dist/apify.d.ts +57 -0
- package/dist/apify.js +126 -0
- package/dist/cli.js +18 -7
- package/dist/generateObject.d.ts +34 -0
- package/dist/generateObject.js +85 -0
- package/dist/geo.d.ts +50 -0
- package/dist/geo.js +91 -0
- package/dist/index.d.ts +32 -3
- package/dist/index.js +24 -3
- package/docs/AGENTS.md +94 -384
- package/docs/apify.md +133 -0
- package/docs/b2b.md +178 -0
- package/docs/browser.md +173 -0
- package/docs/serp.md +167 -0
- package/docs/strategies.md +250 -0
- package/package.json +2 -2
- package/docs/B2B_CROSS_TABLE_TEST_FINDINGS.md +0 -255
- package/docs/B2B_DATABASE.md +0 -314
- package/docs/B2B_DATABASE_TEST_FINDINGS.md +0 -476
- package/docs/B2B_EMPLOYEE_SEARCH.md +0 -697
- package/docs/B2B_GENERALIZATION_RULES.md +0 -220
- package/docs/B2B_NLP_QUERY_MAPPINGS.md +0 -240
- package/docs/B2B_NORMALIZED_VS_DENORMALIZED.md +0 -952
- package/docs/B2B_SCHEMA.md +0 -1042
- package/docs/B2B_SQL_COMPREHENSIVE_TEST_FINDINGS.md +0 -301
- package/docs/B2B_TABLE_INDICES.ts +0 -496
@@ -1,697 +0,0 @@

# B2B Employee Search Guide

How to find employees by title at a given company using the B2B database.

---

## Available Fields

When querying for employees, you can SELECT the following fields:

### From `linkedin_profile` (alias: `lp`)

| Field                   | Type    | Description                    |
| ----------------------- | ------- | ------------------------------ |
| `first_name`            | varchar | First name                     |
| `last_name`             | varchar | Last name                      |
| `formatted_name`        | varchar | Full name                      |
| `headline`              | varchar | LinkedIn headline              |
| `location_name`         | varchar | Full location string           |
| `location_city`         | text    | City                           |
| `location_region`       | text    | State/region                   |
| `location_country`      | text    | Country name                   |
| `location_country_code` | varchar | Country code (e.g., "US")      |
| `connections`           | integer | Number of LinkedIn connections |
| `public_profile_url`    | varchar | LinkedIn profile URL           |

### From `linkedin_profile_position3` (alias: `pos`)

| Field          | Type | Description                        |
| -------------- | ---- | ---------------------------------- |
| `title`        | text | Job title                          |
| `company_name` | text | Company name (denormalized)        |
| `start_date`   | date | Position start date                |
| `end_date`     | date | Position end date (NULL = current) |
| `summary`      | text | Role description/summary           |

---

## Quick Start

```sql
-- Find engineers at a company (fast: 10-60ms)
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371 -- Company ID (e.g., Stripe)
  AND pos.end_date IS NULL -- Current employees only
  AND pos.title ILIKE '%engineer%' -- Title filter
LIMIT 50;
```

---

## Key Tables

| Table                        | Purpose                                    | Size             |
| ---------------------------- | ------------------------------------------ | ---------------- |
| `linkedin_profile_position3` | Work experience records                    | **2.6 billion**  |
| `linkedin_profile`           | Profile details (name, headline, location) | **1.15 billion** |
| `linkedin_company`           | Company lookup                             | Millions         |

### Critical Indexes

| Index                                               | Column                | Use                     |
| --------------------------------------------------- | --------------------- | ----------------------- |
| `ix_linkedin_profile_position3_linkedin_company_id` | `linkedin_company_id` | **Fast company lookup** |
| `linkedin_profile_pkey`                             | `id`                  | Profile join            |

**Note:** There is NO index on `title` - title filtering happens after the company index scan.

---

## Finding the Company ID

Before searching for employees, you need the company's `linkedin_company_id`:

```sql
-- By universal_name (URL slug) - FASTEST
SELECT id, company_name, employee_count
FROM linkedin_company
WHERE universal_name = 'stripe';

-- By domain
SELECT id, company_name, employee_count
FROM linkedin_company
WHERE domain = 'stripe.com';

-- By slug with key64 (if using linkedin_company_slug)
SELECT lc.id, lc.company_name
FROM linkedin_company lc
JOIN linkedin_company_slug lcs ON lcs.linkedin_company_id = lc.id
WHERE lcs.slug_key64 = key64('stripe');
```

### Common Company IDs (for reference)

| Company | ID       | Employee Count |
| ------- | -------- | -------------- |
| Amazon  | 1586     | 770K           |
| Google  | 1441     | 330K           |
| Stripe  | 2135371  | ~9K            |
| OpenAI  | 11130470 | ~7K            |
| Ramp    | 1406226  | ~3.5K          |

---

## Performance by Company Size

### Tested Performance Results

| Company Size         | Simple Query | DISTINCT ON | Aggregations |
| -------------------- | ------------ | ----------- | ------------ |
| **Massive (100K+)**  | 15-65ms      | TIMEOUT     | TIMEOUT      |
| **Large (10K-100K)** | 10-40ms      | 500ms-1s    | 1-15s        |
| **Medium (1K-10K)**  | 10-30ms      | 100-200ms   | 100-500ms    |
| **Small (<1K)**      | 4-20ms       | 10-50ms     | 5-50ms       |

### What This Means

- **Amazon/Google/Microsoft**: Only use simple LIMIT queries
- **Stripe/Meta/Salesforce**: DISTINCT ON works, avoid heavy aggregations
- **Ramp/OpenAI**: Everything works reasonably fast
- **Startups (<1K)**: All queries work instantly
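
The size tiers above can be expressed as a small lookup. The following is a minimal sketch, assuming the employee count comes from `linkedin_company.employee_count`; `pickQueryStrategy` and the strategy names are hypothetical and not part of the package.

```typescript
// Hypothetical helper: thresholds mirror the size tiers in the performance table.
type QueryStrategy = "simple-limit-only" | "distinct-on-ok" | "aggregations-ok";

function pickQueryStrategy(employeeCount: number): QueryStrategy {
  if (employeeCount >= 100000) {
    // Massive: DISTINCT ON and aggregations timed out in the tests above.
    return "simple-limit-only";
  }
  if (employeeCount >= 10000) {
    // Large: DISTINCT ON is workable, aggregations can take 1-15s.
    return "distinct-on-ok";
  }
  // Medium and small: all query types completed quickly.
  return "aggregations-ok";
}

console.log(pickQueryStrategy(770000)); // Amazon-sized company
console.log(pickQueryStrategy(3500)); // Ramp-sized company
```

When the employee count is unknown, falling back to the simple LIMIT strategy is the conservative choice.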

---

## Query Patterns by Company Size

### For MASSIVE Companies (>100K employees)

**DO:**

```sql
-- Simple LIMIT query (15-65ms)
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 1586 -- Amazon
  AND pos.end_date IS NULL
  AND pos.title ILIKE '%account executive%'
LIMIT 50;
```

**DON'T:**

```sql
-- These will TIMEOUT (>30s):
SELECT DISTINCT ON (lp.id) ... -- NO
SELECT COUNT(*) ... -- NO
GROUP BY pos.title ... -- NO
```

### For LARGE Companies (10K-100K employees)

```sql
-- DISTINCT ON for deduplication (500ms-1s)
SELECT DISTINCT ON (lp.id)
    lp.first_name,
    lp.last_name,
    lp.formatted_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371 -- Stripe
  AND pos.end_date IS NULL
  AND pos.title ILIKE '%vp %'
ORDER BY lp.id, pos.start_date DESC
LIMIT 100;
```

### For MEDIUM/SMALL Companies (<10K employees)

```sql
-- Full aggregations work (100-500ms)
SELECT pos.title, COUNT(DISTINCT lp.id) as unique_count
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 1406226 -- Ramp
  AND pos.end_date IS NULL
GROUP BY pos.title
ORDER BY unique_count DESC
LIMIT 25;
```

---

## Batch Processing 200+ Companies (Guaranteed No Timeout)

When you need to find the same role type across many companies (including massive ones like Amazon/Google), use this pattern:

### The Golden Rule

**NEVER use DISTINCT ON or aggregations when batch processing companies of unknown size.**

Use this simple pattern instead - it works for ANY company size:

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id -- Parameterized
  AND pos.end_date IS NULL
  AND (/* your title filters here */)
LIMIT 25;
```
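
One way to fill the title-filter slot in the template above is to generate the `ILIKE` clause from a list of patterns. This is a sketch only: it assumes a database client that accepts named parameters in the same `:name` style as `:company_id`, and `buildTitleFilter` and the `title_N` parameter names are hypothetical.

```typescript
// Hypothetical helper: builds "(pos.title ILIKE :title_0 OR ...)" plus its parameter values.
interface TitleFilter {
  clause: string;
  params: Record<string, string>;
}

function buildTitleFilter(patterns: string[]): TitleFilter {
  if (patterns.length === 0) {
    // An empty "()" group would be invalid SQL, so fail early.
    throw new Error("At least one title pattern is required");
  }
  const params: Record<string, string> = {};
  const parts = patterns.map((pattern, i) => {
    const name = `title_${i}`;
    params[name] = pattern; // pattern values travel as parameters, never inlined into SQL
    return `pos.title ILIKE :${name}`;
  });
  return { clause: `(${parts.join(" OR ")})`, params };
}

const salesLeaders = buildTitleFilter(["%head of sales%", "%vp sales%", "%chief revenue%"]);
console.log(salesLeaders.clause);
console.log(salesLeaders.params);
```

Passing the patterns as parameters rather than concatenating them into the query text keeps the batch job safe when patterns come from user input.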

### Tested Performance (All Sizes)

| Query Type  | Amazon (770K) | Google (330K) | Stripe (9K) | Ramp (3.5K) | Cluely (86) |
| ----------- | ------------- | ------------- | ----------- | ----------- | ----------- |
| C-Suite     | **1.9s**      | **293ms**     | 1s          | 235ms       | 8ms         |
| Head of Eng | **2.1s**      | **51ms**      | 36ms        | 61ms        | -           |
| Founders    | **491ms**     | **433ms**     | 602ms       | 46ms        | 4ms         |

**All queries complete successfully - no timeouts even on 770K employee companies.**

### Ready-to-Use Query Templates

#### C-Suite at Any Company

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.formatted_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id
  AND pos.end_date IS NULL
  AND (pos.title ILIKE 'ceo%'
       OR pos.title ILIKE 'cto%'
       OR pos.title ILIKE 'cfo%'
       OR pos.title ILIKE 'coo%'
       OR pos.title ILIKE 'cmo%'
       OR pos.title ILIKE '%chief exec%'
       OR pos.title ILIKE '%chief tech%'
       OR pos.title ILIKE '%chief fin%'
       OR pos.title ILIKE '%chief oper%'
       OR pos.title ILIKE '%chief market%')
LIMIT 25;
```

#### Head of Engineering at Any Company

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%head of engineer%'
       OR pos.title ILIKE '%vp engineer%'
       OR pos.title ILIKE '%vp of engineer%'
       OR pos.title ILIKE '%director of engineer%'
       OR pos.title ILIKE '%engineering lead%')
LIMIT 25;
```

#### Founders at Any Company

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%founder%'
       OR pos.title ILIKE '%co-founder%'
       OR pos.title ILIKE '%cofounder%')
LIMIT 25;
```

#### Head of Sales at Any Company

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%head of sales%'
       OR pos.title ILIKE '%vp sales%'
       OR pos.title ILIKE '%vp of sales%'
       OR pos.title ILIKE '%chief revenue%'
       OR pos.title ILIKE '%sales director%')
LIMIT 25;
```

#### Recruiters at Any Company

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = :company_id
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%recruit%'
       OR pos.title ILIKE '%talent acq%'
       OR pos.title ILIKE '%sourcer%')
LIMIT 25;
```

### Handling Duplicates in Application Code

Since we can't use DISTINCT ON for massive companies, deduplicate in your application:

```typescript
// After fetching results (`results` is the array of rows returned by the query),
// keep the first row seen for each name + title combination.
const uniqueByName = new Map<string, (typeof results)[number]>();
for (const row of results) {
  const key = `${row.first_name}-${row.last_name}-${row.title}`;
  if (!uniqueByName.has(key)) {
    uniqueByName.set(key, row);
  }
}
const deduplicated = Array.from(uniqueByName.values());
```

### Why LIMIT 25?

- Most role searches return <25 unique people anyway
- Higher limits increase query time on massive companies
- You can increase to 50 if needed, but test first on Amazon/Google

---

## Title Search Patterns

### Role Categories

| Role                     | ILIKE Patterns                                                          | Notes                                                                              |
| ------------------------ | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
| **Recruiters**           | `%recruit%`, `%talent%`, `%sourcer%`                                    | Also catches "Technical Recruiter"                                                 |
| **C-Suite**              | `ceo%`, `cto%`, `cfo%`, `coo%`, `cmo%`, `%chief%`                       | Watch for "Chief of Staff" false positives                                         |
| **VPs**                  | `%vp %`, `%vice president%`                                             | Trailing space skips "VPN" but still matches "MVP "; misses titles ending in "VP"  |
| **Directors**            | `%director%`, `%head of%`                                               | Very broad - may need refinement                                                   |
| **Engineers**            | `%software engineer%`, `%engineer%`                                     | `%engineer%` catches all types                                                     |
| **Senior Engineers**     | `%senior%engineer%`, `%staff%engineer%`, `%principal%`                  | Senior IC roles                                                                    |
| **Engineering Managers** | `%engineering manager%`, `%eng manager%`                                | People managers                                                                    |
| **Sales**                | `%account executive%`, `%sales rep%`, `%ae %`                           | Trailing space after "ae"                                                          |
| **Sales Leadership**     | `%head of sales%`, `%sales director%`, `%vp sales%`                     | Sales leaders                                                                      |
| **SDRs/BDRs**            | `%sales development%`, `%business development%`, `%sdr%`, `%bdr%`       | Entry sales                                                                        |
| **Product Managers**     | `%product manager%`, `%product lead%`                                   | PM roles                                                                           |
| **Marketing**            | `%marketing%`, `%growth%`, `%brand%`                                    | Broad category                                                                     |
| **HR/People**            | `%hr %`, `%human resources%`, `%people ops%`, `%people partner%`        | HR team                                                                            |
| **Data/ML**              | `%data scientist%`, `%machine learning%`, `%ml engineer%`, `%research%` | Technical                                                                          |
| **Design**               | `%designer%`, `%ux%`, `%ui%`                                            | Design team; `%ui%` is very broad (it also matches "Recruiter")                    |
| **Finance**              | `%finance%`, `%accountant%`, `%controller%`, `%fp&a%`                   | Finance team                                                                       |
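
Because several of these patterns are broad, it can help to post-filter fetched rows in application code. The sketch below approximates `ILIKE` matching with a regular expression (only the `%` and `_` wildcards, with no escape handling) and drops known false positives such as "Chief of Staff". `ilikeToRegExp` and `matchesRole` are hypothetical helpers, not part of the package.

```typescript
// Convert a SQL ILIKE pattern into a case-insensitive RegExp.
// Assumption: only `%` (any run of characters) and `_` (one character) are special.
function ilikeToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("")
    .map((ch) => {
      if (ch === "%") return "[\\s\\S]*";
      if (ch === "_") return "[\\s\\S]";
      // Escape characters that are special in regular expressions.
      return ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  return new RegExp(`^${source}$`, "i");
}

// True when the title matches any include pattern and no exclude pattern.
function matchesRole(title: string, include: string[], exclude: string[] = []): boolean {
  const hit = (patterns: string[]): boolean =>
    patterns.some((p) => ilikeToRegExp(p).test(title));
  return hit(include) && !hit(exclude);
}

const cSuitePatterns = ["ceo%", "cto%", "cfo%", "coo%", "cmo%", "%chief%"];
const sampleTitles = ["Chief Technology Officer", "Chief of Staff", "CEO & Co-Founder"];
console.log(sampleTitles.filter((t) => matchesRole(t, cSuitePatterns, ["%chief of staff%"])));
```

The same exclusion could be written in SQL with `NOT ILIKE`; doing it client-side keeps the batch query identical across roles.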

### Example Queries by Role

**Find Recruiters:**

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.public_profile_url,
    pos.title,
    pos.company_name
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%recruit%'
       OR pos.title ILIKE '%talent acq%'
       OR pos.title ILIKE '%sourcer%')
LIMIT 50;
```

**Find C-Suite:**

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.formatted_name,
    lp.headline,
    lp.location_name,
    lp.connections,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
  AND (pos.title ILIKE 'ceo%'
       OR pos.title ILIKE 'cto%'
       OR pos.title ILIKE 'cfo%'
       OR pos.title ILIKE 'coo%'
       OR pos.title ILIKE '%chief exec%'
       OR pos.title ILIKE '%chief tech%'
       OR pos.title ILIKE '%chief fin%'
       OR pos.title ILIKE '%chief op%')
LIMIT 30;
```

**Find Senior Engineers:**

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_city,
    lp.location_region,
    lp.location_country,
    pos.title,
    pos.company_name,
    pos.start_date,
    pos.summary
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%senior%engineer%'
       OR pos.title ILIKE '%staff%engineer%'
       OR pos.title ILIKE '%principal%engineer%'
       OR pos.title ILIKE '%lead%engineer%')
LIMIT 50;
```

**Find Sales Team:**

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    lp.location_country_code,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
  AND (pos.title ILIKE '%account exec%'
       OR pos.title ILIKE '%sales rep%'
       OR pos.title ILIKE '%business development%'
       OR pos.title ILIKE '%ae %'
       OR pos.title ILIKE '% ae')
LIMIT 50;
```

---

## Current vs Former Employees

### Current Employees

Use `end_date IS NULL` (recommended - more inclusive):

```sql
WHERE pos.end_date IS NULL
```

Or use `is_current = TRUE` (more conservative):

```sql
WHERE pos.is_current = TRUE
```

**Difference:** `end_date IS NULL` typically returns ~20% more results than `is_current = TRUE`.

### Former Employees (Alumni)

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name,
    pos.start_date,
    pos.end_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NOT NULL
LIMIT 50;
```

---

## Handling Duplicates

People often have multiple position records at the same company due to:

- Title changes/promotions
- Multiple data sources
- Profile updates

### Deduplication with DISTINCT ON

```sql
-- Get most recent title per person (works for <100K employee companies)
SELECT DISTINCT ON (lp.id)
    lp.first_name,
    lp.last_name,
    lp.formatted_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name,
    pos.start_date
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
ORDER BY lp.id, pos.start_date DESC NULLS LAST
LIMIT 100;
```

### For Massive Companies (No DISTINCT)

For companies like Amazon/Google, deduplicate in application code:

```sql
-- Get raw results
SELECT
    lp.id,
    lp.first_name,
    lp.last_name,
    lp.headline,
    lp.location_name,
    pos.title,
    pos.company_name
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 1586
  AND pos.end_date IS NULL
  AND pos.title ILIKE '%engineer%'
LIMIT 200; -- Get more to account for duplicates
```

Then deduplicate by `lp.id` in your application.
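
A minimal sketch of that application-side step, assuming each returned row exposes the `lp.id` column as `id` (as selected in the query above). `dedupeByProfileId` is a hypothetical helper; it keeps the first row seen for each profile, so it does not guarantee the most recent title unless the rows are ordered first.

```typescript
// Hypothetical helper: keep one row per profile id, preserving first-seen order.
// The id may arrive as a number or a string depending on the database driver.
function dedupeByProfileId<T extends { id: number | string }>(rows: T[]): T[] {
  const seen = new Map<number | string, T>();
  for (const row of rows) {
    if (!seen.has(row.id)) {
      seen.set(row.id, row);
    }
  }
  return Array.from(seen.values());
}

const rawRows = [
  { id: 101, first_name: "Ada", last_name: "Example", title: "Software Engineer" },
  { id: 102, first_name: "Lin", last_name: "Sample", title: "Staff Engineer" },
  { id: 101, first_name: "Ada", last_name: "Example", title: "Senior Software Engineer" },
];
console.log(dedupeByProfileId(rawRows));
```

Keying on the profile id avoids merging two different people who happen to share a name and title, which a name-based key cannot rule out.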

---

## Full Employee Profile Query

Get complete employee information with all available fields:

```sql
SELECT
    lp.first_name,
    lp.last_name,
    lp.formatted_name,
    lp.headline,
    lp.location_name,
    lp.location_city,
    lp.location_region,
    lp.location_country,
    lp.location_country_code,
    lp.connections,
    lp.public_profile_url,
    pos.title,
    pos.company_name,
    pos.start_date,
    pos.end_date,
    pos.summary
FROM linkedin_profile lp
JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
WHERE pos.linkedin_company_id = 2135371
  AND pos.end_date IS NULL
  AND pos.title ILIKE '%engineer%'
LIMIT 50;
```

---

## Role Breakdown Query

Get title distribution at a company (works for <100K employee companies):

```sql
SELECT
    pos.title,
    COUNT(*) as position_count,
    COUNT(DISTINCT pos.linkedin_profile_id) as unique_people
FROM linkedin_profile_position3 pos
WHERE pos.linkedin_company_id = 2135371 -- Stripe
  AND pos.end_date IS NULL
  AND pos.title IS NOT NULL
GROUP BY pos.title
ORDER BY unique_people DESC
LIMIT 30;
```

---

## Common Gotchas

1. **Never use `linkedin_profile.linkedin_company_id`** - no index, will time out
2. **Always use `linkedin_profile_position3`** for company-based searches
3. **Use `end_date IS NULL`** not `is_current = TRUE` for better coverage
4. **Always include LIMIT** - especially for large companies
5. **Avoid DISTINCT ON for 100K+ employee companies** - will time out
6. **Avoid COUNT/GROUP BY for 100K+ employee companies** - will time out
7. **Title matching is case-insensitive** with ILIKE but watch for variations
8. **Some titles are NULL** - filter with `pos.title IS NOT NULL` if needed
9. **Duplicates are normal** - same person may have multiple position records
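
Several of these gotchas can be checked mechanically before a query is sent. The sketch below is a rough, regex-based pre-flight check, not a SQL parser; `lintEmployeeQuery` and its warning messages are hypothetical, and the 100K threshold is taken from the guidance above.

```typescript
// Hypothetical pre-flight check: returns human-readable warnings for a query string.
function lintEmployeeQuery(sql: string, employeeCount: number): string[] {
  const warnings: string[] = [];
  if (!/\blimit\s+\d+/i.test(sql)) {
    warnings.push("Missing LIMIT");
  }
  if (/\bis_current\s*=\s*true\b/i.test(sql)) {
    warnings.push("Prefer end_date IS NULL over is_current = TRUE for coverage");
  }
  if (employeeCount >= 100000) {
    if (/\bdistinct\s+on\b/i.test(sql)) {
      warnings.push("DISTINCT ON is likely to time out at 100K+ employees");
    }
    if (/\bgroup\s+by\b|\bcount\s*\(/i.test(sql)) {
      warnings.push("COUNT/GROUP BY is likely to time out at 100K+ employees");
    }
  }
  return warnings;
}

const riskyQuery =
  "SELECT COUNT(*) FROM linkedin_profile_position3 pos WHERE pos.linkedin_company_id = 1586";
console.log(lintEmployeeQuery(riskyQuery, 770000));
```

A check like this only catches textual patterns, so it complements rather than replaces testing against a large company.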

---

## Performance Tips

1. **Filter by company first** - uses the index
2. **Add title filter after** - applied as a filter on indexed results
3. **Use LIMIT early** - don't fetch more than needed
4. **Avoid ORDER BY** on large result sets (except when needed for DISTINCT ON)
5. **Check company size first** - adjust query strategy accordingly

### Query Performance Reference

| Query Type   | Small (<1K) | Medium (1K-10K) | Large (10K-100K) | Massive (100K+) |
| ------------ | ----------- | --------------- | ---------------- | --------------- |
| Simple LIMIT | 4-20ms      | 10-30ms         | 10-40ms          | 15-65ms         |
| DISTINCT ON  | 10-50ms     | 100-200ms       | 500ms-1s         | **TIMEOUT**     |
| GROUP BY     | 5-50ms      | 100-500ms       | 1-15s            | **TIMEOUT**     |
| COUNT(\*)    | 5-50ms      | 100-500ms       | 1-15s            | **TIMEOUT**     |

---

## Related Documentation

- **[B2B_SCHEMA.md](./B2B_SCHEMA.md)** - Full schema reference
- **[B2B_DATABASE.md](./B2B_DATABASE.md)** - Database overview and examples