orangeslice 1.0.0

@@ -0,0 +1,952 @@
1
+ # B2B Database: Normalized vs Denormalized Table Selection Guide
2
+
3
+ Comprehensive test results and decision matrix for choosing between normalized tables (`linkedin_profile`, `linkedin_company`) and denormalized views (`lkd_profile`, `lkd_company`).
4
+
5
+ ---
6
+
7
+ ## Executive Summary
8
+
9
+ ### Profile Tables (`linkedin_profile` vs `lkd_profile`)
10
+
11
+ | Scenario | Winner | Speed Advantage |
12
+ | ---------------------------------------------------------------- | ------------ | ---------------- |
13
+ | **Simple PK/slug lookups** | Normalized | 3-6x faster |
14
+ | **Simple text/ILIKE (common terms)** | Normalized | 2-20x faster |
15
+ | **Batch PK lookups (≤20 IDs)** | Normalized | 3x faster |
16
+ | **Large result sets (5000+ rows)** | Normalized | 10-17x faster |
17
+ | **Indexed filters (updated_at, jobs_count, industry_id)** | Normalized | 3-8x faster |
18
+ | **Org/company name search (GIN indexed)** | Normalized | 20x faster |
19
+ | **Country filter + headline search** | Denormalized | Works vs TIMEOUT |
20
+ | **Multi-filter combinations (2+ filters)** | Denormalized | 2-3x faster |
21
+ | **Rare/uncommon headline terms** | Denormalized | 2-2.5x faster |
22
+ | **Multiple skills array searches** | Denormalized | 2-3x faster |
23
+ | **Numeric + text combos (follower + headline)** | Denormalized | 1.7-2.5x faster |
24
+ | **Complex regex patterns** | Denormalized | 1.4-2x faster |
25
+ | **Summary text searches** | Denormalized | 1.6x faster |
26
+ | **Name searches (first + last)** | Denormalized | 1.5x faster |
27
+ | **Getting enriched nested data (seniority, job_function, etc.)** | Denormalized | Only option |
28
+
29
+ ### Company Tables (`linkedin_company` vs `lkd_company`) - Updated Jan 2026
30
+
31
+ | Scenario | Winner | Speed Advantage |
32
+ | ---------------------------------------- | ------------ | ------------------------- |
33
+ | **Slug lookups via `key64()`** | Normalized | 5-12x faster (4ms) |
34
+ | **Domain/ticker/universal_name lookups** | Normalized | Only option (indexed) |
35
+ | **Single country filter** | Normalized | 20x faster (9ms vs 228ms) |
36
+ | **Industry via industry_code** | Normalized | 136x faster (2ms) |
37
+ | **Nested data via JOINs (1 ID)** | Normalized | 7-19x faster |
38
+ | **Aggregations (GROUP BY)** | Normalized | Works vs TIMEOUT |
39
+ | **Simple slug lookup (no key64)** | Denormalized | Only option (21-47ms) |
40
+ | **Compound queries (2+ filters)** | Denormalized | 1.5-2.4x faster |
41
+ | **Funding + employee compound** ⚠️ | Denormalized | **17x faster** (70ms) |
42
+ | **Country + description combinations** | Denormalized | 2.4x faster |
43
+ | **Description ILIKE (rare terms)** | Denormalized | 1.5x faster |
44
+ | **Specialties filtering** | Denormalized | 2.2x faster |
45
+ | **Pre-formatted nested JSON** | Denormalized | Convenience |
46
+
47
+ **Key Insights** (from 40+ comprehensive tests):
48
+
49
+ 1. **Text Selectivity Crossover**: Common terms (CEO, engineer) favor normalized (2.4-5.7x faster); rare terms (kubernetes, blockchain) favor denormalized (2-2.8x faster)
50
+ 2. **Filter Count Scaling**: 1 filter → normalized usually wins; 3+ filters → denormalized wins (2-4x faster)
51
+ 3. **Filter Location**: Text filter on company side → normalized (company table smaller); text filter on profile side → denormalized (profile table huge)
52
+ 4. **Cross-Table Pattern**: Profile text + company constraint → denormalized (20-93x faster, normalized often times out)
53
+ 5. **Companies**: Use `key64()` for slug lookups; `lkd_company` lacks domain/ticker columns
54
+
55
+ See [B2B_GENERALIZATION_RULES.md](./B2B_GENERALIZATION_RULES.md) for the complete decision matrix.
56
+
57
+ ---
58
+
59
+ ## Critical Finding: Country Filtering
60
+
61
+ The most significant discovery is that **country filtering behaves completely differently** between the two tables:
62
+
63
+ | Query Pattern | `linkedin_profile` | `lkd_profile` | Result |
64
+ | -------------------------------------------------- | ------------------ | ------------- | ----------------------- |
65
+ | `location_country_code = 'US'` + headline ILIKE | **TIMEOUT (30s+)** | - | Fails |
66
+ | `country_iso = 'US'` + headline ILIKE | - | **16-22s** | Works |
67
+ | `location_country_code = 'US'` + common term (CTO) | 47ms | 104ms | Both work |
68
+ | `country_iso = 'US'` + rare term (iOS developer) | TIMEOUT | **16s** | Only denormalized works |
69
+
70
+ **Rule**: For country-filtered searches with uncommon headline terms, you **must** use `lkd_profile` with `country_iso`.
71
+
72
+ ```sql
73
+ -- ✅ WORKS: US-based iOS developers
74
+ SELECT profile_id, first_name, headline, locality
75
+ FROM lkd_profile
76
+ WHERE country_iso = 'US'
77
+ AND headline ~* '(\miOS\M|\mSwift\M).*(developer|engineer)'
78
+ LIMIT 100
79
+
80
+ -- ❌ TIMEOUT: Comparable search on normalized table (plain ILIKE instead of regex)
81
+ SELECT id, first_name, headline, location_name
82
+ FROM linkedin_profile
83
+ WHERE location_country_code = 'US'
84
+ AND headline ILIKE '%iOS developer%'
85
+ LIMIT 100
86
+ ```
87
+
88
+ ---
89
+
90
+ ## Compound Query Performance
91
+
92
+ The key pattern: **compound queries scale better on denormalized views**.
93
+
94
+ | # of Filters | Normalized | Denormalized | Winner |
95
+ | ------------------- | ---------- | ------------ | ------------------- |
96
+ | 1 filter (indexed) | **Faster** | Slower | Normalized |
97
+ | 1 filter (headline, common term) | Faster | Slower | Normalized |
98
+ | 2 filters | Similar | Similar | Tie |
99
+ | 3+ filters | **Slower** | **Faster** | Denormalized (2-3x) |
100
+
101
+ ### Example: Triple Filter Performance
102
+
103
+ ```sql
104
+ -- lkd_profile: 5.8s ✅
105
+ SELECT first_name, headline, connection_count
106
+ FROM lkd_profile
107
+ WHERE connection_count > 500
108
+ AND 'Python' = ANY(skills)
109
+ AND headline ILIKE '%engineer%'
110
+ LIMIT 100
111
+
112
+ -- linkedin_profile: 18.2s (3.1x slower)
113
+ SELECT first_name, headline, connections
114
+ FROM linkedin_profile
115
+ WHERE connections > 500
116
+ AND 'Python' = ANY(skills)
117
+ AND headline ILIKE '%engineer%'
118
+ LIMIT 100
119
+ ```
120
+
121
+ ---
122
+
123
+ ## Test Results: Profile Queries
124
+
125
+ ### 1. Primary Key Lookups
126
+
127
+ | Query Type | Normalized | Denormalized | Winner |
128
+ | ----------------------- | ---------- | ------------ | ----------------- |
129
+ | Single ID lookup | **0.12ms** | 0.66ms | Normalized (5.5x) |
130
+ | Batch 5 IDs | 1.84ms | 1.15ms | Tie |
131
+ | Batch 20 IDs | **3.46ms** | 11.44ms | Normalized (3.3x) |
132
+ | Slug lookup (with JOIN) | **0.18ms** | timeout | Normalized |
133
+
134
+ ```sql
135
+ -- RECOMMENDED: Normalized PK lookup
136
+ SELECT id, formatted_name, title, org, headline
137
+ FROM linkedin_profile WHERE id = ?
138
+
139
+ -- AVOID: Denormalized for simple lookups
140
+ SELECT profile_id, name, title, company_name, headline
141
+ FROM lkd_profile WHERE profile_id = ?
142
+ ```
143
+
144
+ ### 2. Text/ILIKE Searches
145
+
146
+ | Query Type | Normalized | Denormalized | Winner |
147
+ | ---------------------------------- | ---------- | ------------ | ----------------- |
148
+ | Title ILIKE (LIMIT 10) | **24.5ms** | 66.7ms | Normalized (2.7x) |
149
+ | Headline multi-term | **99.2ms** | 116.4ms | Normalized (1.2x) |
150
+ | Org/company_name ILIKE | **86.7ms** | 1,710ms | Normalized (20x) |
151
+ | Compound filter (location + title) | timeout | 283.7ms | Denormalized\* |
152
+
153
+ \*Note: Normalized query timed out on this particular compound filter
154
+
155
+ ```sql
156
+ -- RECOMMENDED: Normalized text search
157
+ SELECT id, formatted_name, title
158
+ FROM linkedin_profile
159
+ WHERE title ILIKE '%software engineer%' LIMIT 10
160
+
161
+ -- AVOID for search: Denormalized
162
+ SELECT profile_id, name, title
163
+ FROM lkd_profile
164
+ WHERE title ILIKE '%software engineer%' LIMIT 10
165
+ ```
166
+
167
+ ### 3. Getting Nested Data (Experience, Education)
168
+
169
+ | Query Type | Normalized (with JOINs) | Denormalized | Winner |
170
+ | ------------------------------------------ | ----------------------- | ------------ | ------------------- |
171
+ | Profile + positions (1 ID) | **0.33ms** | 6.42ms | Normalized (19x) |
172
+ | Profile + education (1 ID) | 1.22ms | **0.66ms** | Denormalized (1.8x) |
173
+ | Profile + positions + education (1 ID) | **0.27ms** | 4.58ms | Normalized (17x) |
174
+ | Batch 5 IDs + positions | **0.85ms** | 6.36ms | Normalized (7.5x) |
175
+ | Full profile (positions, education, certs) | **0.32ms** | 6.28ms | Normalized (20x) |
176
+
177
+ ```sql
178
+ -- RECOMMENDED: Normalized with JOINs for nested data
179
+ SELECT lp.id, lp.formatted_name,
180
+ pp.title as job_title, pp.company_name, pp.start_date
181
+ FROM linkedin_profile lp
182
+ LEFT JOIN linkedin_profile_position3 pp ON pp.linkedin_profile_id = lp.id
183
+ WHERE lp.id = ?
184
+ ORDER BY pp.start_date DESC NULLS LAST
185
+
186
+ -- USE ONLY when you need enriched fields (seniority, job_function, etc.)
187
+ SELECT profile_id, name, experience, education
188
+ FROM lkd_profile WHERE profile_id = ?
189
+ ```
190
+
191
+ ### 4. Filtering on Nested Data
192
+
193
+ | Query Type | Normalized | Denormalized | Winner |
194
+ | ----------------------------------------- | ---------- | ------------ | --------------------- |
195
+ | EXISTS check on positions (known profile) | 1,106.6ms | **2.84ms** | Denormalized (390x)\* |
196
+ | Filter positions by title | **17.6ms** | 1,582ms | Normalized (90x) |
197
+ | COUNT positions > 5 (HAVING) | **0.99ms** | 94ms | Normalized (95x) |
198
+ | jsonb_array_length filter | N/A | 94ms | Normalized preferred |
199
+
200
+ \*Note: Denormalized wins when starting from a known `profile_id` and checking JSON fields, but loses badly when scanning or searching within JSON arrays.
201
+
202
+ ```sql
203
+ -- RECOMMENDED: Normalized for filtering on nested data
204
+ SELECT DISTINCT pp.linkedin_profile_id
205
+ FROM linkedin_profile_position3 pp
206
+ WHERE pp.title ILIKE '%software engineer%'
207
+ LIMIT 10
208
+
209
+ -- AVOID: JSON array filtering is very slow
210
+ SELECT profile_id, name
211
+ FROM lkd_profile
212
+ WHERE EXISTS (
213
+ SELECT 1 FROM jsonb_array_elements(experience::jsonb) e
214
+ WHERE (e->>'title') ILIKE '%software engineer%'
215
+ ) LIMIT 10
216
+ ```
217
+
218
+ ### 5. Extended Comparison Tests (2025)
219
+
220
+ Comprehensive tests across 23 query patterns:
221
+
222
+ | Test | Query Pattern | `lkd_profile` | `linkedin_profile` | Winner | Speedup |
223
+ | -------------------------- | ----------------------------- | ------------- | ------------------ | ------------ | ------- |
224
+ | Slug lookup | slug = 'x' | TIMEOUT | **7ms** | Normalized | ∞ |
225
+ | Profile ID lookup | id = X | 82ms | **33ms** | Normalized | 2.5x |
226
+ | updated_at filter | updated_at > date | 91ms | **12ms** | Normalized | 7.5x |
227
+ | jobs_count filter | jobs_count > 10 | 228ms | **53ms** | Normalized | 4.3x |
228
+ | Large results (5000) | headline ILIKE | 5.5s | **324ms** | Normalized | 17x |
229
+ | **Name search** | first + last | **2.3s** | 3.5s | Denormalized | 1.5x |
230
+ | **Follower + headline** | num > X AND headline | **1.8s** | 3.1s | Denormalized | 1.7x |
231
+ | **Multiple skills (3)** | skill1 AND skill2 AND skill3 | **2.5s** | 7s | Denormalized | 2.8x |
232
+ | **Skills + headline** | skill AND headline ILIKE | **7s** | 9.7s | Denormalized | 1.4x |
233
+ | **Rare term (blockchain)** | headline ILIKE '%blockchain%' | **1.4s** | 3.4s | Denormalized | 2.4x |
234
+ | **Rare term (kubernetes)** | headline ILIKE '%kubernetes%' | **1.6s** | 3.1s | Denormalized | 1.9x |
235
+ | **Summary search** | summary ILIKE '%startup%' | **612ms** | 966ms | Denormalized | 1.6x |
236
+ | **Connection + skills** | connections > X AND skill | **1.6s** | 3.8s | Denormalized | 2.4x |
237
+ | **Complex regex** | headline ~\* '(a\|b\|c\|d)' | **335ms** | 465ms | Denormalized | 1.4x |
238
+ | **Triple filter** | conn + skill + headline | **5.8s** | 18.2s | Denormalized | 3.1x |
239
+ | **Rare term + follower** | tensorflow + follower > X | **9.2s** | 23s | Denormalized | 2.5x |
240
+ | **Word boundary regex** | headline ~\* '\mAI\M' | **227ms** | 480ms | Denormalized | 2.1x |
241
+ | **Headline + locality** | headline + city ILIKE | **746ms** | 1.95s | Denormalized | 2.6x |
242
+
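The Speedup column is the slower timing divided by the faster one. A minimal Python sketch that reproduces a few rows from the table above; the helper names `to_ms` and `speedup` are illustrative and not part of the package:

```python
def to_ms(duration: str) -> float:
    """Convert a timing string such as '324ms' or '5.5s' to milliseconds."""
    text = duration.strip().lower()
    if text.endswith("ms"):
        return float(text[:-2])
    if text.endswith("s"):
        return float(text[:-1]) * 1000.0
    raise ValueError(f"unrecognized duration: {duration!r}")


def speedup(slower: str, faster: str) -> float:
    """Ratio of the slower timing to the faster one, rounded to one decimal."""
    return round(to_ms(slower) / to_ms(faster), 1)


# Timings copied from the table above.
print(speedup("5.5s", "324ms"))   # large results (5000 rows)
print(speedup("18.2s", "5.8s"))   # triple filter
print(speedup("1.95s", "746ms"))  # headline + locality
```

Rows marked TIMEOUT have no finite ratio, which is why the table shows ∞ for the slug lookup.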
243
+ ---
244
+
245
+ ## Test Results: Company Queries (January 2026)
246
+
247
+ > **Note**: `lkd_company` is a VIEW (not a materialized view or table), so it has no indexes and relies on the underlying table indexes.
248
+
249
+ ### 5. Primary Key & Slug Lookups
250
+
251
+ | Query Type | Normalized | Denormalized | Winner |
252
+ | ------------------------------- | ---------- | ------------ | ------------------- |
253
+ | Single ID lookup | **3-4ms** | 4-19ms | Normalized (varies) |
254
+ | Batch 5 IDs | 4ms | 4ms | Tie |
255
+ | Batch 20 IDs | 5ms | 4ms | Tie |
256
+ | Slug via raw text | TIMEOUT | TIMEOUT | Both fail |
257
+ | Slug via `key64()` + subquery | **4ms** | N/A | Normalized |
258
+ | Slug direct on lkd_company | N/A | **21-47ms** | Denormalized only |
259
+ | Domain lookup (indexed) | **3ms** | N/A | Normalized only |
260
+ | universal_name lookup (indexed) | **2ms** | N/A | Normalized only |
261
+ | Ticker lookup (indexed) | **5ms** | N/A | Normalized only |
262
+
263
+ **Critical: Slug Lookup Strategy**
264
+
265
+ ```sql
266
+ -- ✅ FASTEST: Use key64() for indexed slug lookup
267
+ SELECT id, company_name FROM linkedin_company
268
+ WHERE id = (
269
+ SELECT linkedin_company_id FROM linkedin_company_slug
270
+ WHERE slug_key64 = key64('google') LIMIT 1
271
+ )
272
+ -- Result: 4ms
273
+
274
+ -- 🟡 WORKS: Denormalized direct (slower but simple)
275
+ SELECT linkedin_company_id, name FROM lkd_company WHERE slug = 'google'
276
+ -- Result: 21-47ms
277
+
278
+ -- ❌ TIMEOUT: Raw slug lookup without key64()
279
+ SELECT linkedin_company_id FROM linkedin_company_slug WHERE slug = 'google'
280
+ -- Result: 30s+ TIMEOUT
281
+ ```
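When the slug comes from user input, it is safer to pass it as a bound parameter than to paste it into the SQL text. The sketch below builds the two working queries above as `(sql, params)` pairs. The function name `company_slug_query` and the `%s` placeholder style (as used by psycopg-style drivers) are assumptions; adjust the placeholder to whatever your client expects.

```python
def company_slug_query(slug: str, use_key64: bool = True) -> tuple[str, tuple[str, ...]]:
    """Return (sql, params) for a company slug lookup.

    use_key64=True  -> indexed lookup through linkedin_company_slug (fastest).
    use_key64=False -> direct lookup on the lkd_company view (simpler, slower).
    The raw-text lookup on linkedin_company_slug.slug is deliberately not
    offered, because the guide reports that it times out.
    """
    if use_key64:
        sql = (
            "SELECT id, company_name FROM linkedin_company "
            "WHERE id = ("
            "SELECT linkedin_company_id FROM linkedin_company_slug "
            "WHERE slug_key64 = key64(%s) LIMIT 1)"
        )
    else:
        sql = "SELECT linkedin_company_id, name FROM lkd_company WHERE slug = %s"
    return sql, (slug,)


sql, params = company_slug_query("google")
print(sql)
print(params)
```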
282
+
283
+ ### 6. Text Searches
284
+
285
+ | Query Type | Normalized | Denormalized | Winner |
286
+ | -------------------------------- | ---------- | ------------ | ------------------- |
287
+ | Company name ILIKE | **691ms** | 872ms | Normalized (1.3x) |
288
+ | Description ILIKE (AI) | 396ms | **337ms** | Denormalized (1.2x) |
289
+ | Description ILIKE (blockchain) | 380ms | **372ms** | Tie |
290
+ | Description ILIKE (ML) | 413ms | **276ms** | Denormalized (1.5x) |
291
+ | Company headline ILIKE | **36ms** | 42ms | Tie |
292
+ | Regex multiple keywords (AI\|ML) | **128ms** | 158ms | Normalized (1.2x) |
293
+ | Specialties array contains | 25.9s | **11.6s** | Denormalized (2.2x) |
294
+
295
+ ### 7. Getting Nested Data (Funding, Locations, Industries)
296
+
297
+ | Query Type | Normalized (JOINs) | Denormalized | Winner |
298
+ | -------------------------- | ------------------ | ------------ | ----------------- |
299
+ | Company + funding (1 ID) | **9ms** | 67ms | Normalized (7.4x) |
300
+ | Company + locations (1 ID) | **6ms** | 11ms | Normalized (1.8x) |
301
+ | Company + industry (1 ID) | **3ms** | 50ms | Normalized (17x) |
302
+ | Full company all nested | **3ms** | 58ms | Normalized (19x) |
303
+ | Batch 5 IDs + industry | **3ms** | 53ms | Normalized (18x) |
304
+
305
+ ### 8. Filtering on Nested Company Data
306
+
307
+ | Query Type | Normalized | Denormalized | Winner |
308
+ | ------------------------------- | ---------- | ------------ | --------------------- |
309
+ | EXISTS funding check | **111ms** | 191ms | Normalized (1.7x) |
310
+ | Funding + Employee compound | 1,183ms | **70ms** | Denormalized (17x) ⚠️ |
311
+ | Industry name via JSON ILIKE | N/A | 272ms | Denormalized only |
312
+ | Industry via industry_code JOIN | **2ms** | N/A | Normalized only |
313
+
314
+ **⚠️ Critical Finding**: Funding + Employee compound queries are **17x faster** on denormalized.
315
+
316
+ ### 9. Numeric Filters
317
+
318
+ | Query Type | Normalized | Denormalized | Winner |
319
+ | ----------------------------- | ---------- | ------------ | ----------------- |
320
+ | employee_count > 1000 | 105ms | **81ms** | Denormalized |
321
+ | employee_count BETWEEN 50-200 | **10ms** | 11ms | Tie |
322
+ | follower_count > 100000 | **209ms** | 283ms | Normalized (1.4x) |
323
+ | founded > 2020 | 7ms | 6ms | Tie |
324
+ | founded + employee compound | 82ms | 89ms | Tie |
325
+ | updated_at > date (indexed) | **3ms** | 4ms | Normalized |
326
+
327
+ ### 10. Country Filtering
328
+
329
+ | Query Type | Normalized | Denormalized | Winner |
330
+ | ------------------------- | ---------- | ------------ | ------------------- |
331
+ | country_iso only | **9-11ms** | 228ms | Normalized (20x) |
332
+ | country_code only | **9ms** | N/A | Normalized only |
333
+ | country_iso + employee | 250ms | **158ms** | Denormalized (1.6x) |
334
+ | country_iso + description | 322ms | **135ms** | Denormalized (2.4x) |
335
+
336
+ ### 11. Compound Query Performance
337
+
338
+ | # of Filters | Normalized | Denormalized | Winner |
339
+ | ----------------------------------- | ---------- | ------------ | ------------------- |
340
+ | 1 filter (indexed: country, domain) | **Faster** | Slower | Normalized (9-20x) |
341
+ | 1 filter (text: description) | Similar | Similar | Tie |
342
+ | 2 filters (employee + name) | 243ms | **229ms** | Tie |
343
+ | 2 filters (employee + description) | 162ms | **84ms** | Denormalized (1.9x) |
344
+ | 3 filters (country+emp+desc) | 797ms | **372ms** | Denormalized (2.1x) |
345
+ | 4 filters (country+emp+foll+desc) | 158ms | **108ms** | Denormalized (1.5x) |
346
+ | Double text (name + description) | 176ms | **148ms** | Denormalized (1.2x) |
347
+ | Locality + description | **357ms** | 434ms | Normalized (1.2x) |
348
+
349
+ ### 12. Large Result Sets & Aggregations
350
+
351
+ | Query Type | Normalized | Denormalized | Winner |
352
+ | ---------------------------- | ---------- | ------------ | ------------------ |
353
+ | 1,000 rows (employee filter) | 16ms | 17ms | Tie |
354
+ | 5,000 rows (employee filter) | **20ms** | 27ms | Normalized (1.35x) |
355
+ | 10,000 rows (no filter) | 30ms | 29ms | Tie |
356
+ | COUNT with filter | **20s** | 21s | Both slow |
357
+ | GROUP BY country | **26s** | TIMEOUT | Normalized |
358
+
359
+ ---
360
+
361
+ ## Decision Matrix
362
+
363
+ ### When to Use `linkedin_profile` (Normalized)
364
+
365
+ ✅ **Always use for:**
366
+
367
+ - Slug lookups (via `key64()` function) - **7ms vs TIMEOUT**
368
+ - Profile ID lookups - **2.5x faster**
369
+ - `updated_at` filtering - **7.5x faster** (indexed)
370
+ - `jobs_count` filtering - **4.3x faster**
371
+ - Org/company name search - **20x faster** (GIN indexed)
372
+ - Title field searches - **2.7x faster** (24.5ms vs 66.7ms, LIMIT 10)
373
+ - Large result sets (5000+ rows) - **17x faster**
374
+ - JOINs with position/education tables for filtering
375
+ - COUNT, GROUP BY, aggregations
376
+ - EXISTS checks across many profiles
377
+
378
+ ### When to Use `lkd_profile` (Denormalized)
379
+
380
+ ✅ **Use for:**
381
+
382
+ - **Country filtering with uncommon terms** - `country_iso = 'US'` works where `location_country_code` times out
383
+ - **Multi-filter combinations (2+ filters)** - 2-3x faster
384
+ - **Multiple skills array searches** - 2.8x faster
385
+ - **Rare/uncommon headline terms** (blockchain, kubernetes, tensorflow) - 2-2.5x faster
386
+ - **Numeric + text combos** (follower count + headline) - 1.7-2.5x faster
387
+ - **Complex regex patterns** (multi-option, word boundaries) - 1.4-2x faster
388
+ - **Summary text searches** - 1.6x faster
389
+ - **Name searches** (first + last name) - 1.5x faster
390
+ - **Headline + locality combinations** - 2.6x faster
391
+ - Enriched experience data (seniority, job_function, employment_type, academic_qualification, inferred_location)
392
+ - Building API responses that need complete profile with all nested arrays
393
+ - You have a known profile_id and want all nested data in one query
394
+ - Checking JSON fields on a single known entity
395
+
396
+ ❌ **Never use for:**
397
+
398
+ - Slug lookups (use normalized with `key64()`)
399
+ - Searching/filtering within JSON arrays (experience titles, etc.)
400
+ - Aggregations or counting
401
+ - Large result sets without filters
402
+
403
+ ### When to Use `linkedin_company` (Normalized)
404
+
405
+ ✅ **Always use for:**
406
+
407
+ - **Slug lookups via `key64()`** - 4ms vs 21-47ms (5-12x faster)
408
+ - **Domain lookups** - only option (indexed, 3ms)
409
+ - **Ticker lookups** - only option (indexed, 5ms)
410
+ - **Universal name lookups** - only option (indexed, 2ms)
411
+ - **Single country filter** - 9ms vs 228ms (20x faster)
412
+ - **Updated_at filtering** - indexed (3ms)
413
+ - **Industry filtering via industry_code** - 2ms vs 272ms (136x faster)
414
+ - **Aggregations (GROUP BY)** - 26s vs TIMEOUT
415
+ - **Complex JOINs** (company → positions → profiles)
416
+ - **Getting nested data by known ID** - JOINs are 7-19x faster
417
+ - **EXISTS checks on funding** - 1.7x faster
418
+
419
+ ### When to Use `lkd_company` (Denormalized)
420
+
421
+ ✅ **Use for:**
422
+
423
+ - **Compound queries (2+ filters)** - 1.5-2.4x faster
424
+ - **Funding + employee compound** - 17x faster (70ms vs 1,183ms) ⚠️
425
+ - **Country + description combinations** - 2.4x faster
426
+ - **Country + employee combinations** - 1.6x faster
427
+ - **Description searches (rare terms)** - 1.5x faster for ML/AI terms
428
+ - **Specialties filtering** - 2.2x faster (11.6s vs 25.9s)
429
+ - **Simple slug lookups** without `key64()` - works (21-47ms) vs TIMEOUT
430
+ - Pre-formatted `industries`, `locations`, `crunchbase_funding` JSON
431
+ - Building API responses with complete company data in one query
432
+
433
+ ❌ **Never use for:**
434
+
435
+ - Domain/ticker/universal_name lookups (columns don't exist)
436
+ - Single indexed filter lookups (country_iso, industry_code) - 20x slower
437
+ - Aggregations (COUNT, GROUP BY) - GROUP BY times out; COUNT is slow on both (20s vs 21s)
438
+ - Complex JOINs with other tables
439
+
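The two company lists above can be condensed into a small selector. This is a sketch under stated assumptions: `CompanyQuery` and its fields are hypothetical names, and checking the "never use" cases before the compound-filter rule is one reading of the lists, not something the guide specifies.

```python
from typing import NamedTuple, Optional

# Lookup columns the guide says do not exist on lkd_company.
NORMALIZED_ONLY_LOOKUPS = {"domain", "ticker", "universal_name"}


class CompanyQuery(NamedTuple):
    """Hypothetical query description; the field names are illustrative."""
    lookup_column: Optional[str] = None  # e.g. "slug", "domain", "ticker"
    filter_count: int = 0                # number of WHERE conditions combined
    aggregation: bool = False            # COUNT / GROUP BY


def choose_company_table(q: CompanyQuery) -> str:
    """Pick a table; 'never use lkd_company' cases are checked first (assumption)."""
    if q.aggregation:
        return "linkedin_company"
    if q.lookup_column in NORMALIZED_ONLY_LOOKUPS:
        return "linkedin_company"
    if q.lookup_column == "slug":
        return "linkedin_company"  # via key64(); lkd_company also works, but slower
    if q.filter_count >= 2:
        return "lkd_company"
    return "linkedin_company"


print(choose_company_table(CompanyQuery(lookup_column="domain")))
print(choose_company_table(CompanyQuery(filter_count=3)))
```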
440
+ ---
441
+
442
+ ## Cross-Table Query Performance (Profile + Company)
443
+
444
+ **Critical Finding**: When combining profile text filters (headline, skills) with company constraints, **denormalized JOINs are 20-93x faster** and often the only option that completes.
445
+
446
+ ### Performance Comparison
447
+
448
+ | Query Pattern | Normalized Multi-JOIN | Denormalized JOIN | Winner |
449
+ | -------------------------- | --------------------- | ----------------- | ------------------ |
450
+ | Headline + company size | 20,205ms | **217ms** | Denormalized (93x) |
451
+ | Multi-skill + company size | 28,173ms | **1,281ms** | Denormalized (22x) |
452
+ | Skill + company industry | TIMEOUT | **3,553ms** | Denormalized (∞) |
453
+ | Senior engineers + company | 4,535ms | **196ms** | Denormalized (23x) |
454
+ | Company ID → employees | **48ms** | 279ms | Normalized (5.8x) |
455
+ | Company name (org) search | **274ms** | 8,600ms | Normalized (31x) |
456
+
457
+ ### Rules for Cross-Table Queries
458
+
459
+ 1. **Company name search** → Always use `linkedin_profile.org` (GIN indexed, 68x faster)
460
+ 2. **Headline/skill + company constraint** → Always use `lkd_profile JOIN lkd_company` (normalized times out)
461
+ 3. **Company-first lookups** → Use normalized (5-8x faster)
462
+ 4. **Multi-filter profile + company** → Denormalized is often the only option
463
+
464
+ ### Example: Engineers at Large Companies
465
+
466
+ ```sql
467
+ -- ❌ SLOW: Normalized multi-JOIN (20 seconds)
468
+ SELECT lp.id, lp.headline, lc.company_name
469
+ FROM linkedin_profile lp
470
+ JOIN linkedin_profile_position3 pos ON pos.linkedin_profile_id = lp.id
471
+ JOIN linkedin_company lc ON lc.id = pos.linkedin_company_id
472
+ WHERE pos.end_date IS NULL
473
+ AND lp.headline ILIKE '%engineer%'
474
+ AND lc.employee_count > 1000
475
+ LIMIT 50
476
+
477
+ -- ✅ FAST: Denormalized JOIN (217ms - 93x faster)
478
+ SELECT lkd.profile_id, lkd.headline, lkdc.name
479
+ FROM lkd_profile lkd
480
+ JOIN lkd_company lkdc ON lkdc.linkedin_company_id = lkd.linkedin_company_id
481
+ WHERE lkd.headline ILIKE '%engineer%'
482
+ AND lkdc.employee_count > 1000
483
+ LIMIT 50
484
+ ```
485
+
486
+ See [B2B_CROSS_TABLE_TEST_FINDINGS.md](./B2B_CROSS_TABLE_TEST_FINDINGS.md) for full test results.
487
+
488
+ ---
489
+
490
+ ## Query Patterns & Recommendations
491
+
492
+ ### Pattern 1: Find Profiles with Specific Experience
493
+
494
+ ```sql
495
+ -- ✅ CORRECT: Use normalized tables
496
+ SELECT DISTINCT lp.id, lp.formatted_name, pp.title, pp.company_name
497
+ FROM linkedin_profile lp
498
+ JOIN linkedin_profile_position3 pp ON pp.linkedin_profile_id = lp.id
499
+ WHERE pp.title ILIKE '%product manager%'
500
+ AND pp.linkedin_company_id IN (SELECT id FROM linkedin_company WHERE employee_count > 1000)
501
+ LIMIT 100
502
+
503
+ -- ❌ WRONG: Don't search JSON arrays
504
+ SELECT profile_id, name FROM lkd_profile
505
+ WHERE EXISTS (SELECT 1 FROM jsonb_array_elements(experience::jsonb) e
506
+ WHERE (e->>'title') ILIKE '%product manager%')
507
+ ```
508
+
509
+ ### Pattern 2: Get Complete Profile for Display
510
+
511
+ ```sql
512
+ -- ✅ Option A: Denormalized (when you need enriched fields)
513
+ SELECT profile_id, name, title, company_name, headline,
514
+ experience, education, certifications, skills
515
+ FROM lkd_profile WHERE profile_id = ?
516
+
517
+ -- ✅ Option B: Normalized with multiple queries (faster total time)
518
+ SELECT * FROM linkedin_profile WHERE id = ?;
519
+ SELECT * FROM linkedin_profile_position3 WHERE linkedin_profile_id = ? ORDER BY start_date DESC;
520
+ SELECT * FROM linkedin_profile_education2 WHERE linkedin_profile_id = ?;
521
+ ```
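With Option B the nesting that `lkd_profile` provides has to be rebuilt in application code. A minimal sketch, assuming each query result has already been fetched as a list of plain dictionaries; `assemble_profile` is a hypothetical helper and the sample rows are made up for illustration:

```python
from typing import Any

Row = dict[str, Any]


def assemble_profile(profile: Row, positions: list[Row], education: list[Row]) -> Row:
    """Combine the rows from the three Option B queries into one nested record.

    Positions keep the order returned by the query (ORDER BY start_date DESC).
    The input dictionaries are not modified.
    """
    result = dict(profile)
    result["positions"] = list(positions)
    result["education"] = list(education)
    return result


# Made-up sample rows; only the column names come from the queries above.
record = assemble_profile(
    {"id": 1, "formatted_name": "Example Person"},
    [{"linkedin_profile_id": 1, "title": "Engineer", "start_date": "2023-01-01"}],
    [{"linkedin_profile_id": 1}],
)
print(record["positions"][0]["title"])
```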
522
+
523
+ ### Pattern 3: Find Companies with Funding
524
+
525
+ ```sql
526
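+ -- ✅ Normalized with EXISTS (1.7x faster for the funding check alone; section 8 measured funding + employee compounds 17x faster on lkd_company)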
527
+ SELECT lc.id, lc.company_name, lc.employee_count
528
+ FROM linkedin_company lc
529
+ WHERE EXISTS (SELECT 1 FROM linkedin_crunchbase_funding cf WHERE cf.linkedin_company_id = lc.id)
530
+ AND lc.employee_count BETWEEN 50 AND 500
531
+ LIMIT 100
532
+
533
+ -- ❌ WRONG: Don't filter on JSON
534
+ SELECT linkedin_company_id, name FROM lkd_company
535
+ WHERE crunchbase_funding IS NOT NULL AND crunchbase_funding::text != 'null'
536
+ ```
537
+
538
+ ### Pattern 4: Company Lookup with Full Data
539
+
540
+ ```sql
541
+ -- ✅ CORRECT: Denormalized for complete company
542
+ SELECT linkedin_company_id, name, description, employee_count,
543
+ industries, locations, crunchbase_funding
544
+ FROM lkd_company WHERE linkedin_company_id = ?
545
+
546
+ -- OR: Normalized with JOINs (faster but more code)
547
+ SELECT lc.*,
548
+ array_agg(DISTINCT ca.address) as addresses,
549
+ array_agg(DISTINCT cf.round_name) as funding_rounds
550
+ FROM linkedin_company lc
551
+ LEFT JOIN linkedin_company_address2 ca ON ca.linkedin_company_id = lc.id
552
+ LEFT JOIN linkedin_crunchbase_funding cf ON cf.linkedin_company_id = lc.id
553
+ WHERE lc.id = ?
554
+ GROUP BY lc.id
555
+ ```
556
+
557
+ ---
558
+
559
+ ## Summary: The Golden Rules
560
+
561
+ ### Profile Rules
562
+
563
+ 1. **For indexed lookups (slug, ID, updated_at)**: Always use normalized tables
564
+ 2. **For country + headline filtering**: Use `lkd_profile` with `country_iso` (normalized times out)
565
+ 3. **For single filters on common terms**: Use normalized tables (faster)
566
+ 4. **For multi-filter combinations (2+ filters)**: Use denormalized views (2-3x faster)
567
+ 5. **For rare/uncommon headline terms**: Use denormalized views (2-2.5x faster)
568
+ 6. **For multiple skills searches**: Use denormalized views (2.8x faster)
569
+ 7. **For nested data**: Normalized with JOINs unless you need enriched fields
570
+ 8. **For aggregations**: Always normalized
571
+ 9. **Never**: Filter on JSON array contents in denormalized views
572
+
573
+ ### Company Rules (Updated Jan 2026)
574
+
575
+ 1. **For slug lookups**: Use `key64()` with normalized (4ms) or denormalized direct (21-47ms)
576
+ 2. **For domain/ticker/universal_name lookups**: Normalized only (indexed, 2-5ms)
577
+ 3. **For single indexed filters (country_iso, industry_code)**: Normalized (20-136x faster)
578
+ 4. **For compound queries (2+ filters)**: Denormalized (1.5-2.4x faster)
579
+ 5. **For funding + other filters**: Denormalized (17x faster) ⚠️
580
+ 6. **For nested data by known ID**: Normalized JOINs (7-19x faster)
581
+ 7. **For aggregations (COUNT, GROUP BY)**: Normalized only (GROUP BY times out on denormalized; COUNT is slow on both)
582
+ 8. **For pre-formatted JSON response**: Denormalized (convenience)
583
+
584
+ ### Cross-Table Rules (Profile + Company Queries)
585
+
586
+ 1. **Company name search**: Always use `linkedin_profile.org` (GIN indexed, 68x faster)
587
+ 2. **Headline/skill + company constraint**: Use `lkd_profile JOIN lkd_company` (20-93x faster)
588
+ 3. **Company-first lookup → employees**: Use normalized (5-8x faster)
589
+ 4. **Multi-filter profile + company**: Denormalized JOIN is often the only option that works
590
+
591
+ ### Quick Decision Flowchart
592
+
593
+ ```
594
+ Query has indexed lookup (slug, ID, updated_at)?
595
+ └─ YES → Use linkedin_profile (normalized)
596
+
597
+ Query needs profile + company data together?
598
+ └─ YES:
599
+ Searching by company name?
600
+ └─ YES → Use linkedin_profile.org (GIN indexed, 68x faster)
601
+ Profile text filter (headline/skill) + company constraint?
602
+ └─ YES → Use lkd_profile JOIN lkd_company (20-93x faster)
603
+ Company-first lookup?
604
+ └─ YES → Use normalized JOINs (5-8x faster)
605
+
606
+ Query has country filter + uncommon headline term?
607
+ └─ YES → Use lkd_profile (denormalized) - normalized will TIMEOUT
608
+
609
+ Query has 3+ combined filters?
610
+ └─ YES → Use lkd_profile (denormalized) - 2-3x faster
611
+
612
+ Query searches rare headline terms (iOS, kubernetes, blockchain)?
613
+ └─ YES → Use lkd_profile (denormalized) - 2-2.5x faster
614
+
615
+ Query has multiple skills?
616
+ └─ YES → Use lkd_profile (denormalized) - 2.8x faster
617
+
618
+ Query needs large result set (5000+)?
619
+ └─ YES → Use linkedin_profile (normalized) - 17x faster
620
+
621
+ Default:
622
+ └─ Use linkedin_profile (normalized)
623
+ ```
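The flowchart can also be read as a function that returns the first matching branch. The sketch below is one interpretation: `ProfileQuery`, its field names, and the top-to-bottom precedence are assumptions made for illustration, and "multiple skills" is taken to mean two or more.

```python
from dataclasses import dataclass


@dataclass
class ProfileQuery:
    """Hypothetical description of a query; every field name is illustrative."""
    indexed_lookup: bool = False            # slug, ID, or updated_at
    needs_company_data: bool = False        # profile + company together
    company_name_search: bool = False
    profile_text_plus_company: bool = False
    company_first: bool = False
    country_filter: bool = False
    uncommon_headline_term: bool = False
    filter_count: int = 1
    skill_count: int = 0
    expected_rows: int = 100


def choose_profile_source(q: ProfileQuery) -> str:
    """Return the first matching branch, reading the flowchart top to bottom."""
    if q.indexed_lookup:
        return "linkedin_profile"
    if q.needs_company_data:
        if q.company_name_search:
            return "linkedin_profile.org"
        if q.profile_text_plus_company:
            return "lkd_profile JOIN lkd_company"
        if q.company_first:
            return "normalized JOINs"
    if q.country_filter and q.uncommon_headline_term:
        return "lkd_profile"
    if q.filter_count >= 3:
        return "lkd_profile"
    if q.uncommon_headline_term:
        return "lkd_profile"
    if q.skill_count >= 2:
        return "lkd_profile"
    if q.expected_rows >= 5000:
        return "linkedin_profile"
    return "linkedin_profile"  # default branch


print(choose_profile_source(ProfileQuery(country_filter=True, uncommon_headline_term=True)))
```

Because the first match wins, an indexed lookup always routes to `linkedin_profile` even when other conditions also hold; change the order of the checks if your workload calls for a different priority.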
624
+
625
+ ---
626
+
627
+ ## Common Query Patterns
628
+
629
+ ### Pattern 5: Country-Filtered Headline Search (NEW)
630
+
631
+ ```sql
632
+ -- ✅ CORRECT: Use lkd_profile for country + uncommon term
633
+ SELECT profile_id, first_name, last_name, headline, locality, url
634
+ FROM lkd_profile
635
+ WHERE country_iso = 'US'
636
+ AND headline ~* '(\miOS\M|\mSwift\M).*(developer|engineer)'
637
+ LIMIT 100
638
+ -- Result: 20s, 100 rows ✅
639
+
640
+ -- ❌ WRONG: linkedin_profile times out
641
+ SELECT id, first_name, last_name, headline, location_name
642
+ FROM linkedin_profile
643
+ WHERE location_country_code = 'US'
644
+ AND headline ILIKE '%iOS developer%'
645
+ LIMIT 100
646
+ -- Result: TIMEOUT after 30s ❌
647
+ ```
648
+
649
+ ### Pattern 6: Multi-Skill Search
650
+
651
+ ```sql
652
+ -- ✅ CORRECT: Use lkd_profile for multiple skills (2.8x faster)
653
+ SELECT profile_id, first_name, headline, skills[1:5]
654
+ FROM lkd_profile
655
+ WHERE 'Python' = ANY(skills)
656
+ AND 'SQL' = ANY(skills)
657
+ AND 'Data Analysis' = ANY(skills)
658
+ LIMIT 50
659
+ -- Result: 2.5s
660
+
661
+ -- 🟡 WORKS but slower: linkedin_profile
662
+ SELECT id, first_name, headline, skills[1:5]
663
+ FROM linkedin_profile
664
+ WHERE 'Python' = ANY(skills)
665
+ AND 'SQL' = ANY(skills)
666
+ AND 'Data Analysis' = ANY(skills)
667
+ LIMIT 50
668
+ -- Result: 7s (2.8x slower)
669
+ ```
670
+
671
+ ### Pattern 7: Rare Headline Term Search
672
+
673
+ ```sql
674
+ -- ✅ FASTER: Use lkd_profile for rare terms (2.4x faster)
675
+ SELECT profile_id, first_name, headline
676
+ FROM lkd_profile
677
+ WHERE headline ILIKE '%blockchain%'
678
+ LIMIT 100
679
+ -- Result: 1.4s
680
+
681
+ -- 🟡 WORKS but slower: linkedin_profile
682
+ SELECT id, first_name, headline
683
+ FROM linkedin_profile
684
+ WHERE headline ILIKE '%blockchain%'
685
+ LIMIT 100
686
+ -- Result: 3.4s (2.4x slower)
687
+ ```
688
+
689
+ ### Pattern 8: Triple Filter Combination
690
+
691
+ ```sql
692
+ -- ✅ MUCH FASTER: Use lkd_profile for 3+ filters (3.1x faster)
693
+ SELECT profile_id, first_name, headline, connection_count
694
+ FROM lkd_profile
695
+ WHERE connection_count > 500
696
+ AND 'Python' = ANY(skills)
697
+ AND headline ILIKE '%engineer%'
698
+ LIMIT 100
699
+ -- Result: 5.8s
700
+
701
+ -- 🟡 WORKS but much slower: linkedin_profile
702
+ SELECT id, first_name, headline, connections
703
+ FROM linkedin_profile
704
+ WHERE connections > 500
705
+ AND 'Python' = ANY(skills)
706
+ AND headline ILIKE '%engineer%'
707
+ LIMIT 100
708
+ -- Result: 18.2s (3.1x slower)
709
+ ```
710
+
711
+ ---
712
+
713
+ ## Company Query Patterns (Updated Jan 2026)
714
+
715
+ ### Pattern 9: Company Slug Lookup
716
+
717
+ ```sql
718
+ -- ✅ FASTEST: Use key64() for indexed lookup (4ms)
719
+ SELECT id, company_name, employee_count
720
+ FROM linkedin_company
721
+ WHERE id = (
722
+ SELECT linkedin_company_id FROM linkedin_company_slug
723
+ WHERE slug_key64 = key64('google') LIMIT 1
724
+ )
725
+
726
+ -- 🟡 WORKS: Denormalized direct (21-47ms) - simpler but 5-12x slower
727
+ SELECT linkedin_company_id, name, employee_count
728
+ FROM lkd_company WHERE slug = 'google'
729
+
730
+ -- ❌ TIMEOUT: Raw slug without key64()
731
+ SELECT linkedin_company_id FROM linkedin_company_slug WHERE slug = 'google'
732
+ -- Result: 30s+ TIMEOUT
733
+ ```
734
+
735
+ ### Pattern 10: Company Domain/Ticker Lookup
736
+
737
+ ```sql
738
+ -- ✅ CORRECT: Only normalized has these indexed columns
739
+ SELECT id, company_name, domain FROM linkedin_company WHERE domain = 'google.com'
740
+ -- Result: 3ms
741
+
742
+ SELECT id, company_name, ticker FROM linkedin_company WHERE ticker = 'GOOGL'
743
+ -- Result: 5ms
744
+
745
+ SELECT id, company_name FROM linkedin_company WHERE universal_name = 'google'
746
+ -- Result: 2ms
747
+
748
+ -- ❌ WRONG: lkd_company has no domain or universal_name column, and its ticker column (see Company Column Comparison) is unindexed
749
+ ```
750
+
751
+ ### Pattern 11: Company Compound Filters (2+ filters)
752
+
753
+ ```sql
754
+ -- ✅ FASTER: Use lkd_company for compound queries (1.9-2.4x faster)
755
+ SELECT linkedin_company_id, name
756
+ FROM lkd_company
757
+ WHERE country_iso = 'US'
758
+ AND employee_count > 100
759
+ AND description ILIKE '%software%'
760
+ LIMIT 50
761
+ -- Result: 135-372ms
762
+
763
+ -- 🟡 WORKS but slower: linkedin_company
764
+ SELECT id, company_name
765
+ FROM linkedin_company
766
+ WHERE country_iso = 'US'
767
+ AND employee_count > 100
768
+ AND description ILIKE '%software%'
769
+ LIMIT 50
770
+ -- Result: 247-797ms (2.1x slower)
771
+ ```
772
+
773
+ ### Pattern 12: Companies with Funding + Filters ⚠️
774
+
775
+ ```sql
776
+ -- ✅ MUCH FASTER: Use lkd_company for funding + other filters (17x faster!)
777
+ SELECT linkedin_company_id, name, crunchbase_funding
778
+ FROM lkd_company
779
+ WHERE employee_count > 50
780
+ AND crunchbase_funding IS NOT NULL
781
+ AND crunchbase_funding::text != 'null'
782
+ AND crunchbase_funding::text != '[]'
783
+ LIMIT 50
784
+ -- Result: 70ms
785
+
786
+ -- ❌ VERY SLOW: Normalized with EXISTS
787
+ SELECT lc.id, lc.company_name
788
+ FROM linkedin_company lc
789
+ WHERE lc.employee_count > 50
790
+ AND EXISTS (SELECT 1 FROM linkedin_crunchbase_funding cf WHERE cf.linkedin_company_id = lc.id)
791
+ LIMIT 50
792
+ -- Result: 1,183ms (17x slower)
793
+ ```
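The three conditions on `crunchbase_funding` in the denormalized query each exclude a different "empty" case: a SQL NULL, a JSON `null`, and an empty JSON array. The Python sketch below mirrors that logic on the column's text form, for example when filtering rows client-side. It assumes the value arrives as a JSON string or `None`; `has_funding` is a hypothetical helper name.

```python
import json
from typing import Optional


def has_funding(raw: Optional[str]) -> bool:
    """Mirror the three SQL checks on crunchbase_funding."""
    if raw is None:        # SQL NULL (crunchbase_funding IS NOT NULL)
        return False
    value = json.loads(raw)
    if value is None:      # JSON null (crunchbase_funding::text != 'null')
        return False
    if value == []:        # empty array (crunchbase_funding::text != '[]')
        return False
    return True
```

Because this parses the JSON instead of comparing text, it also treats whitespace variants such as `[ ]` as empty, which the exact text comparison in the SQL would not.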
794
+
795
+ ### Pattern 13: Company Industry Lookup
796
+
797
+ ```sql
798
+ -- ✅ FASTEST: Use normalized with industry_code (2ms)
799
+ SELECT id, company_name, industry_code
800
+ FROM linkedin_company
801
+ WHERE industry_code = 4
802
+ LIMIT 100
803
+
804
+ -- 🟡 ALTERNATIVE: Denormalized JSON search (272ms) - 136x slower
805
+ SELECT linkedin_company_id, name
806
+ FROM lkd_company
807
+ WHERE industries::text ILIKE '%Software Development%'
808
+ LIMIT 100
809
+ ```
810
+
811
+ ### Pattern 14: Company Aggregations
812
+
813
+ ```sql
814
+ -- ✅ CORRECT: Only use normalized for aggregations
815
+ SELECT country_iso, COUNT(*)
816
+ FROM linkedin_company
817
+ WHERE country_iso IS NOT NULL
818
+ GROUP BY country_iso
819
+ ORDER BY COUNT(*) DESC
820
+ LIMIT 10
821
+ -- Result: 26s (slow but works)
822
+
823
+ -- ❌ TIMEOUT: Denormalized for aggregations
824
+ SELECT country_iso, COUNT(*) FROM lkd_company ...
825
+ -- Result: TIMEOUT (30s+)
826
+ ```
827
+
828
+ ---
829
+
830
+ ## Appendix: Fields Only Available in Denormalized Views
831
+
832
+ ### `lkd_profile` Exclusive Fields
833
+
834
+ **Key columns for filtering:**
835
+
836
+ | Column | Type | Notes |
837
+ | --------------- | ---- | ---------------------------------------------------------------- |
838
+ | `country_iso` | text | **Fast country filter** (use instead of `location_country_code`) |
839
+ | `country_name` | text | Full country name |
840
+ | `industry_name` | text | Denormalized industry name |
841
+ | `url` | text | Full profile URL |
842
+
843
+ **JSON fields with enriched data:**
844
+
845
+ - `experience` - JSON array with enriched fields per position:
846
+ - `seniority[]` - Array of {id, seniority} objects
847
+ - `job_function[]` - Array of {id, job_function} objects
848
+ - `employment_type[]` - Array of {id, job_employment_type} objects
849
+ - `academic_qualification[]` - Array of {id, academic_qualification} objects
850
+ - `inferred_location` - Geocoded {latitude, longitude, formatted_address, country_iso, admin_district, locality}
851
+ - `education` - JSON array of education records
852
+ - `certifications` - JSON array
853
+ - `courses` - JSON array
854
+ - `projects` - JSON array
855
+ - `volunteering` - JSON array
856
+ - `patents` - JSON array
857
+ - `awards` - JSON array
858
+ - `publications` - JSON array
859
+ - `languages` - JSON array
860
+ - `recommendations` - JSON array
861
+ - `test_scores` - JSON array
862
+ - `articles` - JSON array
863
+
864
+ ### `lkd_company` Exclusive Fields
865
+
866
+ - `industries[]` - Pre-joined industry data with {id, name, primary}
867
+ - `locations[]` - Pre-joined with inferred_location geocoding
868
+ - `crunchbase_funding[]` - Formatted funding rounds with URLs
869
+ - `naics_codes[]` - Pre-joined NAICS code data
870
+ - `inferred_location` - Geocoded primary location
871
+
872
+ ---
873
+
874
+ ## Appendix: Column Comparison
875
+
876
+ ### Key Filtering Columns
877
+
878
+ | Purpose | `linkedin_profile` | `lkd_profile` | Notes |
879
+ | --------------- | ----------------------- | ------------------------------- | ------------------------------------------- |
880
+ | Profile ID | `id` | `profile_id` | Both indexed |
881
+ | Country code | `location_country_code` | `country_iso` | **lkd_profile works with headline filters** |
882
+ | Location string | `location_name` | `locality` | Similar performance |
883
+ | Current company | `org` | `company_name` | `org` has GIN index |
884
+ | Current title | `title` | `title` | Similar |
885
+ | Headline | `headline` | `headline` | Similar |
886
+ | Skills | `skills` (array) | `skills` (array) | lkd_profile faster for multi-skill |
887
+ | Connections | `connections` | `connection_count` | Similar |
888
+ | Followers | `num_followers` | `follower_count` | Similar |
889
+ | Industry | `linkedin_industry_id` | `industry_id` + `industry_name` | lkd_profile has name |
890
+ | Updated | `updated_at` | `updated_at` | linkedin_profile indexed |
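When porting a query from `linkedin_profile` to `lkd_profile`, the renames in the table above can be kept in a lookup. The following is a minimal Python sketch; `PROFILE_COLUMN_MAP` and `to_lkd_columns` are hypothetical names, and the mapping contains only the rows listed in the table (the industry row maps to `industry_id`, with `industry_name` available as an extra column).

```python
# linkedin_profile column -> lkd_profile column, copied from the table above.
PROFILE_COLUMN_MAP = {
    "id": "profile_id",
    "location_country_code": "country_iso",
    "location_name": "locality",
    "org": "company_name",
    "title": "title",
    "headline": "headline",
    "skills": "skills",
    "connections": "connection_count",
    "num_followers": "follower_count",
    "linkedin_industry_id": "industry_id",
    "updated_at": "updated_at",
}


def to_lkd_columns(columns):
    """Translate linkedin_profile column names to their lkd_profile names."""
    unknown = [c for c in columns if c not in PROFILE_COLUMN_MAP]
    if unknown:
        # Refuse to guess names for columns the table does not list.
        raise KeyError(f"no lkd_profile mapping listed for: {unknown}")
    return [PROFILE_COLUMN_MAP[c] for c in columns]
```

Raising on unlisted columns is a deliberate choice here: silently passing a name through could produce a query that fails only at run time.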
891
+
892
+ ### Indexes Available
893
+
894
+ **`linkedin_profile` indexes:**
895
+
896
+ - `linkedin_profile_pkey` - Primary key on `id`
897
+ - `ix_linkedin_profile_org_tsv` - GIN on `org` (full-text)
898
+ - `linkedin_profile_updated_at_idx` - on `updated_at`
899
+ - `ix_linkedin_profile_linkedin_user_id` - on `linkedin_user_id`
900
+
901
+ **`lkd_profile`:** No indexes (denormalized view), but optimized for compound queries
902
+
903
+ ### Company Column Comparison
904
+
905
+ | Purpose | `linkedin_company` | `lkd_company` | Notes |
906
+ | -------------- | -------------------- | --------------------- | ----------------------------------- |
907
+ | Company ID | `id` | `linkedin_company_id` | Both fast for lookups |
908
+ | Name | `company_name` | `name` | Similar |
909
+ | Slug | N/A (use slug table) | `slug` | **lkd_company has it directly** |
910
+ | Domain | `domain` | N/A | **Only normalized has domain** |
911
+ | Ticker | `ticker` | `ticker` | Both have it |
912
+ | Universal name | `universal_name` | N/A | **Only normalized (indexed)** |
913
+ | Country | `country_iso` | `country_iso` | Normalized faster for single filter |
914
+ | Locality | `locality` | `locality` | Similar |
915
+ | Employee count | `employee_count` | `employee_count` | Similar |
916
+ | Follower count | `follower_count` | `follower_count` | Similar |
917
+ | Founded | `founded` | `founded_year` | Similar |
918
+ | Industry | `industry_code` | `industries` (JSON) | Normalized indexed (136x faster) |
919
+ | Funding | N/A (use JOIN) | `crunchbase_funding` | JSON pre-formatted |
920
+ | Locations | N/A (use JOIN) | `locations` (JSON) | JSON pre-formatted |
921
+
922
+ ### Company Indexes Available
923
+
924
+ **`linkedin_company` indexes:**
925
+
926
+ - `linkedin_company_pkey` - Primary key on `id`
927
+ - `ix_linkedin_company_domain` - on `domain`
928
+ - `ix_linkedin_company_ticker` - on `ticker`
929
+ - `linkedin_company_universal_name_ix` - on `universal_name`
930
+ - `ix_linkedin_company_tsv` - GIN on company_name + universal_name (full-text)
931
+ - `ix_linkedin_company_company_id` - on `company_id`
932
+ - `ix_linkedin_company_max_snapshot_id` - on `max_snapshot_id`
933
+
934
+ **`linkedin_company_slug` indexes:**
935
+
936
+ - `linkedin_company_slug_pk` - Primary key
937
+ - `linkedin_company_slug_slug_key64_uniq` - on `slug_key64` (**use with key64() function**)
938
+ - `linkedin_company_slug_linkedin_company_id_ix` - on `linkedin_company_id`
939
+
940
+ **`lkd_company`:** No indexes (it's a VIEW, not a materialized view)
941
+
942
+ ---
943
+
944
+ ## Test Methodology (Jan 2026)
945
+
946
+ Tests were conducted against a live B2B database via the `http://165.22.151.131:3000/query` endpoint.
947
+
948
+ - Each query was run multiple times to verify that timings were consistent
949
+ - Timings include network latency (remote server)
950
+ - Results show `duration_ms` as reported by the database
951
+ - TIMEOUT = 30+ seconds (query cancelled)
952
+ - Tests covered: ID lookups, slug lookups, text searches, compound filters, nested data, aggregations
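The "Nx faster" figures quoted throughout are ratios of two durations. A minimal Python sketch of that arithmetic follows. How repeated runs were aggregated is not stated above, so taking the median is an assumption, and `speedup` is a hypothetical helper name.

```python
from statistics import median


def speedup(slow_ms, fast_ms):
    """Ratio of median durations, rounded to one decimal place."""
    return round(median(slow_ms) / median(fast_ms), 1)
```

With the Pattern 8 timings, `speedup([18200], [5800])` reproduces the 3.1x figure quoted there.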