orangeslice 1.7.2 → 1.7.3

@@ -0,0 +1,697 @@
1
+ # B2B Employee Search Guide
2
+
3
+ How to find employees by title at a given company using the B2B database.
4
+
5
+ ---
6
+
7
+ ## Available Fields
8
+
9
+ When querying for employees, you can SELECT the following fields:
10
+
11
+ ### From `linkedin_profile` (alias: `lp`)
12
+
13
+ | Field | Type | Description |
14
+ | ----------------------- | ------- | ------------------------------ |
15
+ | `first_name` | varchar | First name |
16
+ | `last_name` | varchar | Last name |
17
+ | `formatted_name` | varchar | Full name |
18
+ | `headline` | varchar | LinkedIn headline |
19
+ | `location_name` | varchar | Full location string |
20
+ | `location_city` | text | City |
21
+ | `location_region` | text | State/region |
22
+ | `location_country` | text | Country name |
23
+ | `location_country_code` | varchar | Country code (e.g., "US") |
24
+ | `connections` | integer | Number of LinkedIn connections |
25
+ | `public_profile_url` | varchar | LinkedIn profile URL |
26
+
27
+ ### From `linkedin_profile_position3` (alias: `pos`)
28
+
29
+ | Field | Type | Description |
30
+ | -------------- | ---- | ---------------------------------- |
31
+ | `title` | text | Job title |
32
+ | `company_name` | text | Company name (denormalized) |
33
+ | `start_date` | date | Position start date |
34
+ | `end_date` | date | Position end date (NULL = current) |
35
+ | `summary` | text | Role description/summary |
36
+
37
+ ---
38
+
39
+ ## Quick Start
40
+
41
+ ```sql
42
+ -- Find engineers at a company (fast: 10-60ms)
43
+ SELECT
44
+ lp.first_name,
45
+ lp.last_name,
46
+ lp.headline,
47
+ lp.location_name,
48
+ pos.title,
49
+ pos.company_name,
50
+ pos.start_date
51
+ FROM linkedin_profile lp
52
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
53
+ WHERE pos.linkedin_company_id = 2135371 -- Company ID (e.g., Stripe)
54
+ AND pos.end_date IS NULL -- Current employees only
55
+ AND pos.title ILIKE '%engineer%' -- Title filter
56
+ LIMIT 50;
57
+ ```
58
+
59
+ ---
60
+
61
+ ## Key Tables
62
+
63
+ | Table | Purpose | Size |
64
+ | ---------------------------- | ------------------------------------------ | ---------------- |
65
+ | `linkedin_profile_position3` | Work experience records | **2.6 billion** |
66
+ | `linkedin_profile` | Profile details (name, headline, location) | **1.15 billion** |
67
+ | `linkedin_company` | Company lookup | Millions |
68
+
69
+ ### Critical Indexes
70
+
71
+ | Index | Column | Use |
72
+ | --------------------------------------------------- | --------------------- | ----------------------- |
73
+ | `ix_linkedin_profile_position3_linkedin_company_id` | `linkedin_company_id` | **Fast company lookup** |
74
+ | `linkedin_profile_pkey` | `id` | Profile join |
75
+
76
+ **Note:** There is NO index on `title` - title filtering happens after the company index scan.
77
+
78
+ ---
79
+
80
+ ## Finding the Company ID
81
+
82
+ Before searching for employees, you need the company's `linkedin_company_id`:
83
+
84
+ ```sql
85
+ -- By universal_name (URL slug) - FASTEST
86
+ SELECT id, company_name, employee_count
87
+ FROM linkedin_company
88
+ WHERE universal_name = 'stripe';
89
+
90
+ -- By domain
91
+ SELECT id, company_name, employee_count
92
+ FROM linkedin_company
93
+ WHERE domain = 'stripe.com';
94
+
95
+ -- By slug with key64 (if using linkedin_company_slug)
96
+ SELECT lc.id, lc.company_name
97
+ FROM linkedin_company lc
98
+ JOIN linkedin_company_slug lcs ON lcs.linkedin_company_id = lc.id
99
+ WHERE lcs.slug_key64 = key64('stripe');
100
+ ```
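When the starting point is a LinkedIn company URL rather than a known slug, the `universal_name` can usually be read from the URL path. The helper below is a minimal sketch: `companySlugFromUrl` is a hypothetical name, and it assumes URLs of the form `linkedin.com/company/<slug>`, which is not confirmed by this guide.

```typescript
// Hypothetical helper: pull the `universal_name` slug out of a LinkedIn company URL.
// Assumes the slug is the path segment that follows "/company/".
function companySlugFromUrl(url: string): string | null {
  const match = /linkedin\.com\/company\/([^\/?#]+)/i.exec(url);
  const slug = match?.[1];
  // Lowercase so the value can be compared against `universal_name`.
  return slug ? slug.toLowerCase() : null;
}
```

The returned value would then be passed to the `universal_name` lookup above; a `null` result means the URL did not look like a company page.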
101
+
102
+ ### Common Company IDs (for reference)
103
+
104
+ | Company | ID | Employee Count |
105
+ | ------- | -------- | -------------- |
106
+ | Amazon | 1586 | 770K |
107
+ | Google | 1441 | 330K |
108
+ | Stripe | 2135371 | ~9K |
109
+ | OpenAI | 11130470 | ~7K |
110
+ | Ramp | 1406226 | ~3.5K |
111
+
112
+ ---
113
+
114
+ ## Performance by Company Size
115
+
116
+ ### Tested Performance Results
117
+
118
+ | Company Size | Simple Query | DISTINCT ON | Aggregations |
119
+ | -------------------- | ------------ | ----------- | ------------ |
120
+ | **Massive (100K+)** | 15-65ms | TIMEOUT | TIMEOUT |
121
+ | **Large (10K-100K)** | 10-40ms | 500ms-1s | 1-15s |
122
+ | **Medium (1K-10K)** | 10-30ms | 100-200ms | 100-500ms |
123
+ | **Small (<1K)** | 4-20ms | 10-50ms | 5-50ms |
124
+
125
+ ### What This Means
126
+
127
+ - **Amazon/Google/Microsoft**: Only use simple LIMIT queries
128
+ - **Stripe/Meta/Salesforce**: DISTINCT ON works, avoid heavy aggregations (Stripe, at ~9K, sits just under the 10K line but is treated as large in this guide)
129
+ - **Ramp/OpenAI**: Everything works reasonably fast
130
+ - **Startups (<1K)**: All queries work instantly
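The tiers above can be turned into a small decision helper that runs before any employee query. This is a sketch only: `strategyFor` and its types are hypothetical names, the thresholds are copied from the performance table, and an unknown `employee_count` is treated as massive so that only simple LIMIT queries are attempted.

```typescript
type SizeTier = "small" | "medium" | "large" | "massive";

interface QueryStrategy {
  tier: SizeTier;
  allowDistinctOn: boolean;
  allowAggregations: boolean;
}

// Maps `linkedin_company.employee_count` to the query features this guide
// reports as safe for that company size.
function strategyFor(employeeCount: number | null): QueryStrategy {
  // Unknown or 100K+: simple LIMIT queries only.
  if (employeeCount === null || employeeCount >= 100000) {
    return { tier: "massive", allowDistinctOn: false, allowAggregations: false };
  }
  // 10K-100K: DISTINCT ON is fine, heavy aggregations are not.
  if (employeeCount >= 10000) {
    return { tier: "large", allowDistinctOn: true, allowAggregations: false };
  }
  if (employeeCount >= 1000) {
    return { tier: "medium", allowDistinctOn: true, allowAggregations: true };
  }
  return { tier: "small", allowDistinctOn: true, allowAggregations: true };
}
```

Defaulting to the most restrictive tier is a deliberate choice here: a wrong guess costs a few duplicate rows rather than a timeout.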
131
+
132
+ ---
133
+
134
+ ## Query Patterns by Company Size
135
+
136
+ ### For MASSIVE Companies (>100K employees)
137
+
138
+ **DO:**
139
+
140
+ ```sql
141
+ -- Simple LIMIT query (15-65ms)
142
+ SELECT
143
+ lp.first_name,
144
+ lp.last_name,
145
+ lp.headline,
146
+ lp.location_name,
147
+ pos.title,
148
+ pos.company_name
149
+ FROM linkedin_profile lp
150
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
151
+ WHERE pos.linkedin_company_id = 1586 -- Amazon
152
+ AND pos.end_date IS NULL
153
+ AND pos.title ILIKE '%account executive%'
154
+ LIMIT 50;
155
+ ```
156
+
157
+ **DON'T:**
158
+
159
+ ```sql
160
+ -- These will TIMEOUT (>30s):
161
+ SELECT DISTINCT ON (lp.id) ... -- NO
162
+ SELECT COUNT(*) ... -- NO
163
+ GROUP BY pos.title ... -- NO
164
+ ```
165
+
166
+ ### For LARGE Companies (10K-100K employees)
167
+
168
+ ```sql
169
+ -- DISTINCT ON for deduplication (500ms-1s)
170
+ SELECT DISTINCT ON (lp.id)
171
+ lp.first_name,
172
+ lp.last_name,
173
+ lp.formatted_name,
174
+ lp.headline,
175
+ lp.location_name,
176
+ pos.title,
177
+ pos.company_name,
178
+ pos.start_date
179
+ FROM linkedin_profile lp
180
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
181
+ WHERE pos.linkedin_company_id = 2135371 -- Stripe
182
+ AND pos.end_date IS NULL
183
+ AND pos.title ILIKE '%vp %'
184
+ ORDER BY lp.id, pos.start_date DESC NULLS LAST
185
+ LIMIT 100;
186
+ ```
187
+
188
+ ### For MEDIUM/SMALL Companies (<10K employees)
189
+
190
+ ```sql
191
+ -- Full aggregations work (100-500ms)
192
+ SELECT pos.title, COUNT(DISTINCT lp.id) as unique_count
193
+ FROM linkedin_profile lp
194
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
195
+ WHERE pos.linkedin_company_id = 1406226 -- Ramp
196
+ AND pos.end_date IS NULL
197
+ GROUP BY pos.title
198
+ ORDER BY unique_count DESC
199
+ LIMIT 25;
200
+ ```
201
+
202
+ ---
203
+
204
+ ## Batch Processing 200+ Companies (Guaranteed No Timeout)
205
+
206
+ When you need to find the same role type across many companies (including massive ones like Amazon/Google), use this pattern:
207
+
208
+ ### The Golden Rule
209
+
210
+ **NEVER use DISTINCT ON or aggregations when batch processing companies of unknown size.**
211
+
212
+ Use this simple pattern instead - it works for ANY company size:
213
+
214
+ ```sql
215
+ SELECT
216
+ lp.first_name,
217
+ lp.last_name,
218
+ lp.headline,
219
+ lp.location_name,
220
+ lp.public_profile_url,
221
+ pos.title,
222
+ pos.company_name,
223
+ pos.start_date
224
+ FROM linkedin_profile lp
225
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
226
+ WHERE pos.linkedin_company_id = :company_id -- Parameterized
227
+ AND pos.end_date IS NULL
228
+ AND (/* your title filters here */)
229
+ LIMIT 25;
230
+ ```
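One way to apply this template across a list of companies is to expand it into one parameterized query per company ID and hand each to your database client. The sketch below only builds the query objects; `buildBatchQueries` and `BatchQuery` are hypothetical names, and the `:company_id` named-parameter style is taken from the template above rather than from any specific client library.

```typescript
interface BatchQuery {
  sql: string;
  params: { company_id: number };
}

// Expands the simple LIMIT template into one query per company.
// `titleFilter` must be a trusted, hard-coded SQL fragment such as
// "pos.title ILIKE '%founder%'"; never build it from user input.
function buildBatchQueries(companyIds: number[], titleFilter: string): BatchQuery[] {
  const sql = [
    "SELECT lp.first_name, lp.last_name, lp.headline, lp.location_name,",
    "       lp.public_profile_url, pos.title, pos.company_name, pos.start_date",
    "FROM linkedin_profile lp",
    "JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id",
    "WHERE pos.linkedin_company_id = :company_id",
    "  AND pos.end_date IS NULL",
    `  AND (${titleFilter})`,
    "LIMIT 25",
  ].join("\n");
  // Same SQL text for every company; only the bound ID changes.
  return companyIds.map((id) => ({ sql, params: { company_id: id } }));
}
```

Because the SQL text is identical for every company and contains no deduplication or aggregation, the same statement is safe to run regardless of company size.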
231
+
232
+ ### Tested Performance (All Sizes)
233
+
234
+ | Query Type | Amazon (770K) | Google (330K) | Stripe (9K) | Ramp (3.5K) | Cluely (86) |
235
+ | ----------- | ------------- | ------------- | ----------- | ----------- | ----------- |
236
+ | C-Suite | **1.9s** | **293ms** | 1s | 235ms | 8ms |
237
+ | Head of Eng | **2.1s** | **51ms** | 36ms | 61ms | - |
238
+ | Founders | **491ms** | **433ms** | 602ms | 46ms | 4ms |
239
+
240
+ **All queries complete successfully - no timeouts even on 770K employee companies.**
241
+
242
+ ### Ready-to-Use Query Templates
243
+
244
+ #### C-Suite at Any Company
245
+
246
+ ```sql
247
+ SELECT
248
+ lp.first_name,
249
+ lp.last_name,
250
+ lp.formatted_name,
251
+ lp.headline,
252
+ lp.location_name,
253
+ lp.public_profile_url,
254
+ pos.title,
255
+ pos.company_name,
256
+ pos.start_date
257
+ FROM linkedin_profile lp
258
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
259
+ WHERE pos.linkedin_company_id = :company_id
260
+ AND pos.end_date IS NULL
261
+ AND (pos.title ILIKE 'ceo%'
262
+ OR pos.title ILIKE 'cto%'
263
+ OR pos.title ILIKE 'cfo%'
264
+ OR pos.title ILIKE 'coo%'
265
+ OR pos.title ILIKE 'cmo%'
266
+ OR pos.title ILIKE '%chief exec%'
267
+ OR pos.title ILIKE '%chief tech%'
268
+ OR pos.title ILIKE '%chief fin%'
269
+ OR pos.title ILIKE '%chief oper%'
270
+ OR pos.title ILIKE '%chief market%')
271
+ LIMIT 25;
272
+ ```
273
+
274
+ #### Head of Engineering at Any Company
275
+
276
+ ```sql
277
+ SELECT
278
+ lp.first_name,
279
+ lp.last_name,
280
+ lp.headline,
281
+ lp.location_name,
282
+ lp.public_profile_url,
283
+ pos.title,
284
+ pos.company_name,
285
+ pos.start_date
286
+ FROM linkedin_profile lp
287
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
288
+ WHERE pos.linkedin_company_id = :company_id
289
+ AND pos.end_date IS NULL
290
+ AND (pos.title ILIKE '%head of engineer%'
291
+ OR pos.title ILIKE '%vp engineer%'
292
+ OR pos.title ILIKE '%vp of engineer%'
293
+ OR pos.title ILIKE '%director of engineer%'
294
+ OR pos.title ILIKE '%engineering lead%')
295
+ LIMIT 25;
296
+ ```
297
+
298
+ #### Founders at Any Company
299
+
300
+ ```sql
301
+ SELECT
302
+ lp.first_name,
303
+ lp.last_name,
304
+ lp.headline,
305
+ lp.location_name,
306
+ lp.public_profile_url,
307
+ pos.title,
308
+ pos.company_name,
309
+ pos.start_date
310
+ FROM linkedin_profile lp
311
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
312
+ WHERE pos.linkedin_company_id = :company_id
313
+ AND pos.end_date IS NULL
314
+ AND pos.title ILIKE '%founder%'
315
+ -- '%founder%' already matches "co-founder" and "cofounder",
316
+ -- so no extra OR branches are needed
317
+ LIMIT 25;
318
+ ```
319
+
320
+ #### Head of Sales at Any Company
321
+
322
+ ```sql
323
+ SELECT
324
+ lp.first_name,
325
+ lp.last_name,
326
+ lp.headline,
327
+ lp.location_name,
328
+ lp.public_profile_url,
329
+ pos.title,
330
+ pos.company_name,
331
+ pos.start_date
332
+ FROM linkedin_profile lp
333
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
334
+ WHERE pos.linkedin_company_id = :company_id
335
+ AND pos.end_date IS NULL
336
+ AND (pos.title ILIKE '%head of sales%'
337
+ OR pos.title ILIKE '%vp sales%'
338
+ OR pos.title ILIKE '%vp of sales%'
339
+ OR pos.title ILIKE '%chief revenue%'
340
+ OR pos.title ILIKE '%sales director%')
341
+ LIMIT 25;
342
+ ```
343
+
344
+ #### Recruiters at Any Company
345
+
346
+ ```sql
347
+ SELECT
348
+ lp.first_name,
349
+ lp.last_name,
350
+ lp.headline,
351
+ lp.location_name,
352
+ lp.public_profile_url,
353
+ pos.title,
354
+ pos.company_name,
355
+ pos.start_date
356
+ FROM linkedin_profile lp
357
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
358
+ WHERE pos.linkedin_company_id = :company_id
359
+ AND pos.end_date IS NULL
360
+ AND (pos.title ILIKE '%recruit%'
361
+ OR pos.title ILIKE '%talent acq%'
362
+ OR pos.title ILIKE '%sourcer%')
363
+ LIMIT 25;
364
+ ```
365
+
366
+ ### Handling Duplicates in Application Code
367
+
368
+ Since we can't use DISTINCT ON for massive companies, deduplicate in your application:
369
+
370
+ ```typescript
371
+ // After fetching results: keep the first row per name + title (distinct people who share both are merged)
372
+ const uniqueByName = new Map();
373
+ for (const row of results) {
374
+   const key = `${row.first_name}-${row.last_name}-${row.title}`;
375
+   if (!uniqueByName.has(key)) {
376
+     uniqueByName.set(key, row);
377
+   }
378
+ }
379
+ const deduplicated = Array.from(uniqueByName.values());
380
+ ```
381
+
382
+ ### Why LIMIT 25?
383
+
384
+ - Most role searches return <25 unique people anyway
385
+ - Higher limits increase query time on massive companies
386
+ - You can increase to 50 if needed, but test first on Amazon/Google
387
+
388
+ ---
389
+
390
+ ## Title Search Patterns
391
+
392
+ ### Role Categories
393
+
394
+ | Role | ILIKE Patterns | Notes |
395
+ | ------------------------ | ----------------------------------------------------------------------- | ------------------------------------------ |
396
+ | **Recruiters** | `%recruit%`, `%talent%`, `%sourcer%` | Also catches "Technical Recruiter" |
397
+ | **C-Suite** | `ceo%`, `cto%`, `cfo%`, `coo%`, `cmo%`, `%chief%` | Watch for "Chief of Staff" false positives |
398
+ | **VPs** | `%vp %`, `%vice president%` | Trailing space skips words like "VPN", but still matches "MVP " and misses titles ending in "VP" |
399
+ | **Directors** | `%director%`, `%head of%` | Very broad - may need refinement |
400
+ | **Engineers** | `%software engineer%`, `%engineer%` | `%engineer%` catches all types |
401
+ | **Senior Engineers** | `%senior%engineer%`, `%staff%engineer%`, `%principal%` | Senior IC roles |
402
+ | **Engineering Managers** | `%engineering manager%`, `%eng manager%` | People managers |
403
+ | **Sales** | `%account executive%`, `%sales rep%`, `%ae %` | Space after "ae" |
404
+ | **Sales Leadership** | `%head of sales%`, `%sales director%`, `%vp sales%` | Sales leaders |
405
+ | **SDRs/BDRs** | `%sales development%`, `%business development%`, `%sdr%`, `%bdr%` | Entry sales |
406
+ | **Product Managers** | `%product manager%`, `%product lead%` | PM roles |
407
+ | **Marketing** | `%marketing%`, `%growth%`, `%brand%` | Broad category |
408
+ | **HR/People** | `%hr %`, `%human resources%`, `%people ops%`, `%people partner%` | HR team |
409
+ | **Data/ML** | `%data scientist%`, `%machine learning%`, `%ml engineer%`, `%research%` | Technical |
410
+ | **Design** | `%designer%`, `%ux%`, `%ui%` | Design team; `%ui%` also matches "recruiter" and "acquisition", so review results |
411
+ | **Finance** | `%finance%`, `%accountant%`, `%controller%`, `%fp&a%` | Finance team |
412
+
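The pattern lists in this table can be turned into a bound `OR` clause instead of being pasted into SQL by hand. The following is a sketch under assumptions: `buildTitleFilter` is a hypothetical helper, and it emits PostgreSQL-style positional placeholders (`$2`, `$3`, ...) on the assumption that `$1` carries the company ID; adapt the placeholder style to whatever your client expects.

```typescript
interface TitleFilter {
  clause: string;   // e.g. "(pos.title ILIKE $2 OR pos.title ILIKE $3)"
  values: string[]; // patterns, in placeholder order
}

// Builds an OR-chain of ILIKE tests with one placeholder per pattern.
// `firstIndex` is the number of the first free placeholder.
function buildTitleFilter(patterns: string[], firstIndex = 2): TitleFilter {
  if (patterns.length === 0) {
    throw new Error("At least one ILIKE pattern is required");
  }
  const parts = patterns.map((_, i) => `pos.title ILIKE $${firstIndex + i}`);
  return { clause: `(${parts.join(" OR ")})`, values: [...patterns] };
}
```

For the Recruiters row, `buildTitleFilter(["%recruit%", "%talent acq%", "%sourcer%"])` yields a three-way `OR` clause plus the three patterns to bind after the company ID.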
413
+ ### Example Queries by Role
414
+
415
+ **Find Recruiters:**
416
+
417
+ ```sql
418
+ SELECT
419
+ lp.first_name,
420
+ lp.last_name,
421
+ lp.headline,
422
+ lp.location_name,
423
+ lp.public_profile_url,
424
+ pos.title,
425
+ pos.company_name
426
+ FROM linkedin_profile lp
427
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
428
+ WHERE pos.linkedin_company_id = 2135371
429
+ AND pos.end_date IS NULL
430
+ AND (pos.title ILIKE '%recruit%'
431
+ OR pos.title ILIKE '%talent acq%'
432
+ OR pos.title ILIKE '%sourcer%')
433
+ LIMIT 50;
434
+ ```
435
+
436
+ **Find C-Suite:**
437
+
438
+ ```sql
439
+ SELECT
440
+ lp.first_name,
441
+ lp.last_name,
442
+ lp.formatted_name,
443
+ lp.headline,
444
+ lp.location_name,
445
+ lp.connections,
446
+ pos.title,
447
+ pos.company_name,
448
+ pos.start_date
449
+ FROM linkedin_profile lp
450
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
451
+ WHERE pos.linkedin_company_id = 2135371
452
+ AND pos.end_date IS NULL
453
+ AND (pos.title ILIKE 'ceo%'
454
+ OR pos.title ILIKE 'cto%'
455
+ OR pos.title ILIKE 'cfo%'
456
+ OR pos.title ILIKE 'coo%'
457
+ OR pos.title ILIKE '%chief exec%'
458
+ OR pos.title ILIKE '%chief tech%'
459
+ OR pos.title ILIKE '%chief fin%'
460
+ OR pos.title ILIKE '%chief op%')
461
+ LIMIT 30;
462
+ ```
463
+
464
+ **Find Senior Engineers:**
465
+
466
+ ```sql
467
+ SELECT
468
+ lp.first_name,
469
+ lp.last_name,
470
+ lp.headline,
471
+ lp.location_city,
472
+ lp.location_region,
473
+ lp.location_country,
474
+ pos.title,
475
+ pos.company_name,
476
+ pos.start_date,
477
+ pos.summary
478
+ FROM linkedin_profile lp
479
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
480
+ WHERE pos.linkedin_company_id = 2135371
481
+ AND pos.end_date IS NULL
482
+ AND (pos.title ILIKE '%senior%engineer%'
483
+ OR pos.title ILIKE '%staff%engineer%'
484
+ OR pos.title ILIKE '%principal%engineer%'
485
+ OR pos.title ILIKE '%lead%engineer%')
486
+ LIMIT 50;
487
+ ```
488
+
489
+ **Find Sales Team:**
490
+
491
+ ```sql
492
+ SELECT
493
+ lp.first_name,
494
+ lp.last_name,
495
+ lp.headline,
496
+ lp.location_name,
497
+ lp.location_country_code,
498
+ pos.title,
499
+ pos.company_name,
500
+ pos.start_date
501
+ FROM linkedin_profile lp
502
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
503
+ WHERE pos.linkedin_company_id = 2135371
504
+ AND pos.end_date IS NULL
505
+ AND (pos.title ILIKE '%account exec%'
506
+ OR pos.title ILIKE '%sales rep%'
507
+ OR pos.title ILIKE '%business development%'
508
+ OR pos.title ILIKE '%ae %'
509
+ OR pos.title ILIKE '% ae')
510
+ LIMIT 50;
511
+ ```
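Short abbreviations such as "ae" and "vp" are hard to match precisely with ILIKE alone, because the space-based patterns both over-match and under-match. One option is to keep the SQL filter broad and tighten the results in application code with a whole-word check. The helper below is a sketch; `hasWholeWord` is a hypothetical name and is not part of this package.

```typescript
// Returns true when `word` appears in `title` as a whole word, ignoring case.
// Regex metacharacters in `word` are escaped before the pattern is built.
function hasWholeWord(title: string, word: string): boolean {
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`\\b${escaped}\\b`, "i").test(title);
}
```

A typical use is `rows.filter((r) => hasWholeWord(r.title ?? "", "ae"))`. Note that a whole-word test for "vp" will not match "SVP" or "EVP", so those need their own entries if they matter.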
512
+
513
+ ---
514
+
515
+ ## Current vs Former Employees
516
+
517
+ ### Current Employees
518
+
519
+ Use `end_date IS NULL` (recommended - more inclusive):
520
+
521
+ ```sql
522
+ WHERE pos.end_date IS NULL
523
+ ```
524
+
525
+ Or use `is_current = TRUE` (more conservative):
526
+
527
+ ```sql
528
+ WHERE pos.is_current = TRUE
529
+ ```
530
+
531
+ **Difference:** `end_date IS NULL` typically returns ~20% more results than `is_current = TRUE`.
532
+
533
+ ### Former Employees (Alumni)
534
+
535
+ ```sql
536
+ SELECT
537
+ lp.first_name,
538
+ lp.last_name,
539
+ lp.headline,
540
+ lp.location_name,
541
+ pos.title,
542
+ pos.company_name,
543
+ pos.start_date,
544
+ pos.end_date
545
+ FROM linkedin_profile lp
546
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
547
+ WHERE pos.linkedin_company_id = 2135371
548
+ AND pos.end_date IS NOT NULL -- Ended positions only; may include people still at the company in a newer role
549
+ LIMIT 50;
550
+ ```
551
+
552
+ ---
553
+
554
+ ## Handling Duplicates
555
+
556
+ People often have multiple position records at the same company due to:
557
+
558
+ - Title changes/promotions
559
+ - Multiple data sources
560
+ - Profile updates
561
+
562
+ ### Deduplication with DISTINCT ON
563
+
564
+ ```sql
565
+ -- Get most recent title per person (works for <100K employee companies)
566
+ SELECT DISTINCT ON (lp.id)
567
+ lp.first_name,
568
+ lp.last_name,
569
+ lp.formatted_name,
570
+ lp.headline,
571
+ lp.location_name,
572
+ pos.title,
573
+ pos.company_name,
574
+ pos.start_date
575
+ FROM linkedin_profile lp
576
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
577
+ WHERE pos.linkedin_company_id = 2135371
578
+ AND pos.end_date IS NULL
579
+ ORDER BY lp.id, pos.start_date DESC NULLS LAST
580
+ LIMIT 100;
581
+ ```
582
+
583
+ ### For Massive Companies (No DISTINCT)
584
+
585
+ For companies like Amazon/Google, deduplicate in application code:
586
+
587
+ ```sql
588
+ -- Get raw results
589
+ SELECT
590
+ lp.id,
591
+ lp.first_name,
592
+ lp.last_name,
593
+ lp.headline,
594
+ lp.location_name,
595
+ pos.title,
596
+ pos.company_name
597
+ FROM linkedin_profile lp
598
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
599
+ WHERE pos.linkedin_company_id = 1586
600
+ AND pos.end_date IS NULL
601
+ AND pos.title ILIKE '%engineer%'
602
+ LIMIT 200; -- Get more to account for duplicates
603
+ ```
604
+
605
+ Then deduplicate by `lp.id` in your application.
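A minimal sketch of that step follows, assuming each row carries the `id` column selected above. `dedupeByProfileId` is a hypothetical helper; the `id` type allows strings because some database clients return large integer columns as strings.

```typescript
// Keeps the first row seen for each profile id and preserves input order.
function dedupeByProfileId<T extends { id: number | string }>(rows: T[]): T[] {
  const seen = new Map<number | string, T>();
  for (const row of rows) {
    if (!seen.has(row.id)) {
      seen.set(row.id, row);
    }
  }
  return Array.from(seen.values());
}
```

Keying on the profile `id` avoids merging two different people who happen to share a name and title, which a name-based key cannot rule out.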
606
+
607
+ ---
608
+
609
+ ## Full Employee Profile Query
610
+
611
+ Get complete employee information with all available fields:
612
+
613
+ ```sql
614
+ SELECT
615
+ lp.first_name,
616
+ lp.last_name,
617
+ lp.formatted_name,
618
+ lp.headline,
619
+ lp.location_name,
620
+ lp.location_city,
621
+ lp.location_region,
622
+ lp.location_country,
623
+ lp.location_country_code,
624
+ lp.connections,
625
+ lp.public_profile_url,
626
+ pos.title,
627
+ pos.company_name,
628
+ pos.start_date,
629
+ pos.end_date,
630
+ pos.summary
631
+ FROM linkedin_profile lp
632
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
633
+ WHERE pos.linkedin_company_id = 2135371
634
+ AND pos.end_date IS NULL
635
+ AND pos.title ILIKE '%engineer%'
636
+ LIMIT 50;
637
+ ```
638
+
639
+ ---
640
+
641
+ ## Role Breakdown Query
642
+
643
+ Get title distribution at a company (works for <100K employee companies):
644
+
645
+ ```sql
646
+ SELECT
647
+ pos.title,
648
+ COUNT(*) as position_count,
649
+ COUNT(DISTINCT pos.linkedin_profile_id) as unique_people
650
+ FROM linkedin_profile_position3 pos
651
+ WHERE pos.linkedin_company_id = 2135371 -- Stripe
652
+ AND pos.end_date IS NULL
653
+ AND pos.title IS NOT NULL
654
+ GROUP BY pos.title
655
+ ORDER BY unique_people DESC
656
+ LIMIT 30;
657
+ ```
658
+
659
+ ---
660
+
661
+ ## Common Gotchas
662
+
663
+ 1. **Never use `linkedin_profile.linkedin_company_id`** - no index, will time out
664
+ 2. **Always use `linkedin_profile_position3`** for company-based searches
665
+ 3. **Use `end_date IS NULL`**, not `is_current = TRUE`, for better coverage
666
+ 4. **Always include LIMIT** - especially for large companies
667
+ 5. **Avoid DISTINCT ON for 100K+ employee companies** - will time out
668
+ 6. **Avoid COUNT/GROUP BY for 100K+ employee companies** - will time out
669
+ 7. **Title matching is case-insensitive** with ILIKE but watch for variations
670
+ 8. **Some titles are NULL** - filter with `pos.title IS NOT NULL` if needed
671
+ 9. **Duplicates are normal** - same person may have multiple position records
672
+
673
+ ---
674
+
675
+ ## Performance Tips
676
+
677
+ 1. **Filter by company first** - uses the index
678
+ 2. **Add title filter after** - applied as a filter on indexed results
679
+ 3. **Keep LIMIT small** - don't fetch more rows than you need
680
+ 4. **Avoid ORDER BY** on large result sets (except when needed for DISTINCT ON)
681
+ 5. **Check company size first** - adjust query strategy accordingly
682
+
683
+ ### Query Performance Reference
684
+
685
+ | Query Type | Small (<1K) | Medium (1K-10K) | Large (10K-100K) | Massive (100K+) |
686
+ | ------------ | ----------- | --------------- | ---------------- | --------------- |
687
+ | Simple LIMIT | 4-20ms | 10-30ms | 10-40ms | 15-65ms |
688
+ | DISTINCT ON | 10-50ms | 100-200ms | 500ms-1s | **TIMEOUT** |
689
+ | GROUP BY | 5-50ms | 100-500ms | 1-15s | **TIMEOUT** |
690
+ | COUNT(\*) | 5-50ms | 100-500ms | 1-15s | **TIMEOUT** |
691
+
692
+ ---
693
+
694
+ ## Related Documentation
695
+
696
+ - **[B2B_SCHEMA.md](./B2B_SCHEMA.md)** - Full schema reference
697
+ - **[B2B_DATABASE.md](./B2B_DATABASE.md)** - Database overview and examples