npm - @opendirectory.dev/skills - Versions diffs - 0.1.33 → 0.1.34 - Mend

@opendirectory.dev/skills 0.1.33 → 0.1.34

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (8)

package/package.json +1 -1
package/registry.json +10 -0
package/skills/vc-finder/.env.example +18 -0
package/skills/vc-finder/README.md +113 -0
package/skills/vc-finder/SKILL.md +663 -0
package/skills/vc-finder/evals/evals.json +125 -0
package/skills/vc-finder/references/stage-signals.md +98 -0
package/skills/vc-finder/references/vc-outreach-guide.md +142 -0

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@opendirectory.dev/skills",
-  "version": "0.1.33",
+  "version": "0.1.34",
   "main": "dist/index.js",
   "types": "dist/index.d.ts",
   "bin": {

package/registry.json CHANGED

@@ -289,6 +289,16 @@
     "version": "0.0.1",
     "path": "skills/twitter-GTM-find-skill"
   },
+  {
+    "name": "vc-finder",
+    "description": "Takes a startup product URL or description, detects the industry and funding stage, identifies 5 comparable funded companies, searches who invested...",
+    "tags": [
+      "SEO"
+    ],
+    "author": "opendirectory",
+    "version": "0.0.1",
+    "path": "skills/vc-finder"
+  },
   {
     "name": "yc-intent-radar-skill",
     "description": "Scrape daily job listings from YCombinator's Workatastartup platform without duplicates.",

package/skills/vc-finder/.env.example ADDED

@@ -0,0 +1,18 @@
+# vc-finder: Environment Variables
+# ===================================
+# Gemini and Tavily are required. Firecrawl is recommended.
+# Required: Google Gemini API key for product analysis and VC synthesis
+# Get it: aistudio.google.com > Get API key
+GEMINI_API_KEY=your_gemini_api_key_here
+# Required: Tavily API key for VC investment research (Track A and Track B searches)
+# Get it: app.tavily.com > API Keys
+# Free tier: 1000 credits/month (~125 full runs at 8 searches per run)
+TAVILY_API_KEY=your_tavily_api_key_here
+# Recommended: Firecrawl API key for fetching JS-rendered product pages
+# Get it: firecrawl.dev > Get API key
+# Free tier: 500 credits/month
+# If not set: Tavily extract is used as fallback (may miss content on React/Next.js sites)
+FIRECRAWL_API_KEY=your_firecrawl_api_key_here

package/skills/vc-finder/README.md ADDED

@@ -0,0 +1,113 @@
+# vc-finder
+Give the skill a product URL or description. It detects the industry and funding stage, identifies 5 comparable funded companies, searches who backed those companies (Track A), finds VCs who publish investment theses about this space (Track B), and returns a ranked sourced investor list with deep-dives and outreach hooks.
+## Install
+```bash
+npx "@opendirectory.dev/skills" install vc-finder --target claude
+```
+### Video Tutorial
+Watch this quick video to see how it's done:
+https://github.com/user-attachments/assets/ee98a1b5-ebc4-452f-bbfb-c434f2935067
+### Step 1: Download the skill from GitHub
+1. Click the **Code** button on this repo's GitHub page.
+2. Select **Download ZIP** to download the repository.
+3. Extract the ZIP file on your computer.
+### Step 2: Install the Skill in Claude
+1. Open your **Claude desktop app**.
+2. Go to the sidebar on the left side and click on the **Customize** section.
+3. Click on the **Skills** tab, then click on the **+** (plus) icon button to create a new skill.
+4. Choose the option to **Upload a skill**, and drag and drop the `.zip` file (or you can extract it and drop the folder, both work).
+> **Note:** Make sure you are uploading the folder that contains the `SKILL.md` file!
+## What It Does
+- Fetches the product URL via Firecrawl (handles JS-rendered SPAs) or Tavily extract as fallback
+- Detects funding stage from CTA signals on the page (waitlist, free trial, pricing, sales CTAs)
+- Uses Gemini to map a 3-level industry taxonomy (L1 > L2 > L3) and identify 5 comparable funded companies
+- Track A: 5 Tavily searches to find who invested in each comparable company
+- Track B: 3 Tavily searches to find VCs who publish investment theses about this specific niche
+- Gemini synthesizes and ranks all found VCs by stage fit and space fit (1-10 scores)
+- Produces top 5 deep-dives with fund overview, portfolio evidence, how-to-approach, and outreach hook
+- Generates 3 product-specific outreach hooks (not generic advice)
+- Saves output to `docs/vc-intel/[product]-[date].md`
+## Requirements
+| Requirement | Purpose | How to Set Up |
+|---|---|---|
+| Gemini API key | Product analysis and VC synthesis | aistudio.google.com, Get API key |
+| Tavily API key | VC investment research (Track A and Track B) | app.tavily.com, free tier: 1000 credits/month |
+| Firecrawl API key | Fetching JS-rendered product pages | firecrawl.dev, free tier: 500 credits/month |
+Gemini and Tavily are required. Firecrawl is recommended -- without it, Tavily extract is used as fallback (may miss JS-rendered content).
+## Setup
+```bash
+cp .env.example .env
+# Add GEMINI_API_KEY and TAVILY_API_KEY (required)
+# Add FIRECRAWL_API_KEY (recommended)
+```
+## How to Use
+```
+"Find VCs for my startup: https://example.com"
+"Who invests in developer tools at seed stage?"
+"Build me a VC target list for https://example.com"
+"Which funds should I pitch? https://example.com"
+"Find investors for my product: [paste description]"
+"Who backed companies like mine? https://example.com"
+```
+Or paste a product description directly if the URL is behind a login or returns no readable content.
+## Why Two Tracks
+**Track A (portfolio mapping):** VCs who already wrote a check in your space. These investors have proven they understand the category, the risks, and the buyer. They need less convincing than a generalist fund.
+**Track B (thesis matching):** VCs who are actively publishing about your space. An investor who wrote a 2,000-word blog post about why they want to invest in CI/CD tooling is actively looking for deals. Your cold email lands in a much warmer inbox.
+Generic "VCs in B2B SaaS" lists skip both signals. This skill produces only VCs with named evidence for each entry.
+## Output
+Each run produces:
+1. **Product analysis**: detected industry taxonomy, stage, ICP, comparable companies used
+2. **Track A table**: VCs who backed comparable companies (with evidence)
+3. **Track B table**: VCs with published theses about this space (with source)
+4. **Top 5 deep-dives**: fund overview, why it fits, portfolio in space, how to approach, outreach hook
+5. **3 outreach hooks**: product-specific openers for cold outreach
+## Cost per Run
+- Firecrawl: ~$0.001 per fetch
+- Tavily: 8 searches at ~$0.01 each = ~$0.08
+- Gemini: 2 calls at ~$0.015 each = ~$0.03
+- Total: ~$0.11 per run
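The per-run figures can be checked with a few lines of arithmetic. This is a sketch that uses only the approximate unit prices listed above; the constant and function names are illustrative, and the free-tier estimate assumes one Tavily credit per search, which is what the `.env.example` figure of about 125 runs implies.

```python
# Approximate unit costs quoted in the README (USD); names are illustrative.
FIRECRAWL_PER_FETCH = 0.001
TAVILY_PER_SEARCH = 0.01
GEMINI_PER_CALL = 0.015

TAVILY_SEARCHES = 5 + 3   # Track A (5 searches) + Track B (3 searches)
GEMINI_CALLS = 2          # product analysis + synthesis


def cost_per_run() -> float:
    """Sum the approximate per-run cost across the three services."""
    return (FIRECRAWL_PER_FETCH
            + TAVILY_SEARCHES * TAVILY_PER_SEARCH
            + GEMINI_CALLS * GEMINI_PER_CALL)


def free_tier_runs(monthly_credits: int = 1000, credits_per_search: int = 1) -> int:
    """Full runs per month on the Tavily free tier (assumption: 1 credit per search)."""
    return monthly_credits // (TAVILY_SEARCHES * credits_per_search)


print(f"Cost per run: ~${cost_per_run():.3f}")
print(f"Free-tier runs per month: {free_tier_runs()}")
```

If a search costs more than one credit on your plan, pass a different `credits_per_search` to see how the monthly run count changes.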
+## Project Structure
+```
+vc-finder/
+├── SKILL.md
+├── README.md
+├── .env.example
+├── evals/
+│   └── evals.json
+└── references/
+    ├── stage-signals.md
+    └── vc-outreach-guide.md
+```
+## License
+MIT

package/skills/vc-finder/SKILL.md ADDED

@@ -0,0 +1,663 @@
+---
+name: vc-finder
+description: Takes a startup product URL or description, detects the industry and funding stage, identifies 5 comparable funded companies, searches who invested in those companies (Track A), finds VCs who publish investment theses about this space (Track B), and returns a ranked sourced list of relevant investors with deep-dives and outreach hooks. Use when asked to find investors for a startup, identify which VCs fund products like mine, research who backs companies in my space, build a VC target list, or find investor-market fit. Trigger when a user says "find VCs for my startup", "who invests in my space", "build me a VC list", "which funds should I pitch", "find investors for my product", "who backed companies like mine", or "help me find venture capital".
+compatibility: [claude-code, gemini-cli, github-copilot]
+---
+# VC Finder
+Take a product URL or description. Detect industry and stage. Find 5 comparable funded companies. Run two research tracks: who invested in those comparables (Track A), and which VCs publish theses about this space (Track B). Return a sourced, ranked investor list with outreach hooks.
+---
+**Critical rule:** Every VC in Track A must include the specific comparable company they backed as evidence. Every VC in Track B must include the exact article or post title where they stated their thesis. If a VC name did not appear in Tavily search results, do not include them. No hallucinated fund names.
+---
+## Common Mistakes
+| The agent will want to... | Why that's wrong |
+|---|---|
+| Add a16z or Sequoia because they are famous | A famous VC without evidence is noise. Only include VCs that appear in Tavily search results for this specific product. Name-dropping wastes the founder's time. |
+| Continue when all 5 Track A searches return 0 results | Zero Track A results means the comparables were wrong or too obscure. Stop, regenerate comparables with broader known names, and retry. Continuing produces an evidence-free list. |
+| Include a Track B VC without citing the article or post | Thesis without a source is indistinguishable from hallucination. The founder cannot verify it and the list loses all credibility. |
+| Detect stage from website aesthetics ("site looks polished") | Stage must come from the specific CTA signals detected in Step 4. Aesthetic guessing sends founders to wrong-stage investors. |
+| Write generic outreach hooks like "highlight your traction" | Every outreach hook must name this specific product's differentiator and a specific VC portfolio signal. Generic hooks are removed by the QA step. |
+| Skip the URL fetch when the user also provides a description | Always fetch the URL. The live page often reveals stage signals (pricing CTAs, customer logos, job openings) that the user's description omits. |
+---
+## Step 1: Setup Check
+```bash
+echo "GEMINI_API_KEY: ${GEMINI_API_KEY:+set}"
+echo "TAVILY_API_KEY: ${TAVILY_API_KEY:+set}"
+[ -n "$FIRECRAWL_API_KEY" ] && echo "FIRECRAWL_API_KEY: set" || echo "FIRECRAWL_API_KEY: not set, Tavily extract will be used as fallback"
+```
+**If GEMINI_API_KEY is missing:** Stop. Tell the user: "GEMINI_API_KEY is required for product analysis and VC synthesis. Get it at aistudio.google.com. Add it to your .env file."
+**If TAVILY_API_KEY is missing:** Stop. Tell the user: "TAVILY_API_KEY is required to research VC investments and theses. There is no fallback for this. Get it at app.tavily.com. Free tier: 1000 credits/month (about 125 full runs). Add it to your .env file."
+**If only FIRECRAWL_API_KEY is missing:** Continue silently. Tavily extract will be used for the URL fetch.
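The same three-way decision (stop on a missing required key, continue on a missing optional key) can be sketched in Python. This is an illustrative helper, not part of the skill; `check_keys` and its return shape are assumptions. It reports only whether each key is present and never prints a key's value.

```python
import os

REQUIRED = ("GEMINI_API_KEY", "TAVILY_API_KEY")
OPTIONAL = ("FIRECRAWL_API_KEY",)


def check_keys(env):
    """Return (missing_required, missing_optional) without exposing key values."""
    missing_required = [k for k in REQUIRED if not env.get(k)]
    missing_optional = [k for k in OPTIONAL if not env.get(k)]
    return missing_required, missing_optional


missing, optional = check_keys(os.environ)
for key in missing:
    print(f"{key}: missing (required, stop here)")
for key in optional:
    print(f"{key}: not set, Tavily extract will be used as fallback")
if not missing:
    print("Required keys present.")
```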
+---
+## Step 2: Gather Input
+You need:
+- Product URL (required, unless user pastes a product description directly)
+- Optional: target stage hint (pre-seed, seed, series-a, series-b) -- if provided, use it and skip stage detection
+- Optional: geography preference (US, Europe, global) -- defaults to US if not specified
+**If the user provides only a pasted description (no URL):** Skip Steps 3-4. Go directly to Step 5 with the pasted text as `product_content`. Set `stage_source` to `user_description`.
+**If neither URL nor description is provided:** Ask: "What is the URL of your product or startup? Or paste a short description: what it does, who it is for, and what stage you are at (pre-seed, seed, Series A)."
+Derive product slug from URL for the output filename:
+```bash
+PRODUCT_SLUG=$(python3 -c "
+from urllib.parse import urlparse
+url = 'URL_HERE'
+host = urlparse(url).netloc.replace('www.', '')
+print(host.split('.')[0])
+")
+```
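The slug derivation above can also be written as a small function that is easier to test. This is a sketch under a few assumptions not stated in the skill: `product_slug` is a hypothetical name, a missing scheme is treated as `https://`, and only a leading `www.` is stripped.

```python
from urllib.parse import urlparse


def product_slug(url: str) -> str:
    """Return the first host label of a URL, ignoring a leading 'www.'."""
    # urlparse only fills netloc when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "https://" + url
    host = urlparse(url).netloc.lower()
    host = host.split(":")[0]            # drop an explicit port such as :8080
    if host.startswith("www."):
        host = host[len("www."):]
    return host.split(".")[0]


print(product_slug("https://www.example.com/pricing"))
print(product_slug("example.com"))
```

Like the original one-liner, this returns the first label of the host, so `https://app.example.com` yields `app` rather than `example`; pass the marketing-site URL if the filename should carry the product name.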
+---
+## Step 3: Fetch Product Page
+**Primary: Firecrawl (if FIRECRAWL_API_KEY is set)**
+```bash
+curl -s -X POST https://api.firecrawl.dev/v1/scrape \
+  -H "Authorization: Bearer $FIRECRAWL_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"url": "URL_HERE", "formats": ["markdown"], "onlyMainContent": true}' \
+  | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+content = d.get('data', {}).get('markdown', '') or d.get('markdown', '')
+print(f'Fetched: {len(content)} characters')
+open('/tmp/vc-product-raw.md', 'w').write(content)
+"
+```
+**Fallback: Tavily extract (if FIRECRAWL_API_KEY is not set)**
+```bash
+curl -s -X POST https://api.tavily.com/extract \
+  -H "Content-Type: application/json" \
+  -d "{\"api_key\": \"$TAVILY_API_KEY\", \"urls\": [\"URL_HERE\"]}" \
+  | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+content = (d.get('results') or [{}])[0].get('raw_content') or ''  # empty results list must not raise IndexError
+print(f'Fetched via Tavily extract: {len(content)} characters')
+open('/tmp/vc-product-raw.md', 'w').write(content)
+"
+```
+**Step-level checkpoint:**
+```bash
+python3 -c "
+content = open('/tmp/vc-product-raw.md').read()
+if len(content) < 200:
+    print('ERROR: Page returned fewer than 200 characters.')
+else:
+    print(f'Content OK: {len(content)} characters')
+"
+```
+**If content < 200 characters:** Stop fetching. Tell the user: "The product page returned no readable content. This usually means the site is JavaScript-rendered and requires a browser. Please paste your product description directly: what it does, who it is for, and what stage you are at."
+Proceed to Step 5 using the pasted description as `product_content`.
+---
+## Step 4: Detect Stage Signals Locally (No API)
+Parse the fetched markdown with regex before any API call. This gives Gemini anchored evidence rather than asking it to guess from aesthetics.
+```bash
+python3 << 'PYEOF'
+import re, json
+content = open('/tmp/vc-product-raw.md').read().lower()
+stage_signals = []
+# Pre-seed signals
+if re.search(r'join\s+(the\s+)?waitlist|sign\s+up\s+for\s+beta|early\s+access|request\s+(an?\s+)?invite|get\s+notified', content):
+    stage_signals.append({'signal': 'waitlist or beta CTA', 'stage_hint': 'pre-seed'})
+# Seed signals
+if re.search(r'start\s+(your\s+)?free\s+trial|try\s+(it\s+)?for\s+free|request\s+(a\s+)?demo|book\s+(a\s+)?demo|schedule\s+(a\s+)?demo', content):
+    stage_signals.append({'signal': 'free trial or demo CTA', 'stage_hint': 'seed'})
+# Series A signals
+if re.search(r'contact\s+sales|talk\s+to\s+(our\s+)?sales|see\s+pricing|view\s+pricing|plans\s+and\s+pricing', content):
+    stage_signals.append({'signal': 'pricing or sales CTA', 'stage_hint': 'series-a'})
+if re.search(r'case\s+stud(y|ies)|customer\s+stor(y|ies)|trusted\s+by\s+[\d,]+|used\s+by\s+[\d,]+', content):
+    stage_signals.append({'signal': 'case studies or customer count', 'stage_hint': 'series-a'})
+# Series A/B signals
+if re.search(r'enterprise\s+(plan|pricing|tier)|we.?re\s+hiring|join\s+our\s+team|open\s+positions', content):
+    stage_signals.append({'signal': 'enterprise tier or job openings', 'stage_hint': 'series-a-or-b'})
+# Funding announcement -- extract directly if present
+funding_match = re.search(
+    r'raised\s+\$[\d,.]+\s*[mk]?|series\s+[abc]\s+round|seed\s+round|(\$[\d,.]+\s*[mk]?\s+(?:seed|series\s+[abc]))',
+    content
+)
+if funding_match:
+    stage_signals.append({'signal': f'funding text: {funding_match.group(0).strip()}', 'stage_hint': 'announced'})
+# Determine dominant stage
+if not stage_signals:
+    dominant = 'unknown'
+elif any(s['stage_hint'] == 'announced' for s in stage_signals):
+    dominant = 'announced'
+elif any(s['stage_hint'] == 'series-a-or-b' for s in stage_signals):
+    dominant = 'series-a'
+elif any(s['stage_hint'] == 'series-a' for s in stage_signals):
+    dominant = 'series-a'
+elif any(s['stage_hint'] == 'seed' for s in stage_signals):
+    dominant = 'seed'
+else:
+    dominant = 'pre-seed'
+confidence = 'high' if len(stage_signals) >= 2 else ('medium' if len(stage_signals) == 1 else 'low')
+result = {'signals': stage_signals, 'dominant_stage': dominant, 'confidence': confidence}
+json.dump(result, open('/tmp/vc-stage-signals.json', 'w'), indent=2)
+print(f'Stage: {dominant} ({confidence} confidence) from {len(stage_signals)} signal(s)')
+for s in stage_signals:
+    print(f'  - {s["signal"]} -> {s["stage_hint"]}')
+PYEOF
+```
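The precedence used to pick the dominant stage can be isolated from the regex matching and exercised on its own. The sketch below mirrors the `if/elif` chain in Step 4 for the five hint values that step emits; `dominant_stage` and `confidence` are hypothetical helper names, and the sample hints are made up for illustration.

```python
# Ordered from strongest to weakest hint, mirroring the if/elif chain in Step 4.
PRECEDENCE = [
    ("announced", "announced"),
    ("series-a-or-b", "series-a"),
    ("series-a", "series-a"),
    ("seed", "seed"),
    ("pre-seed", "pre-seed"),
]


def dominant_stage(hints):
    """Return the stage implied by the strongest hint, or 'unknown' if there are none."""
    for hint, stage in PRECEDENCE:
        if hint in hints:
            return stage
    return "unknown"


def confidence(hints):
    """Two or more signals count as high confidence, one as medium, none as low."""
    if len(hints) >= 2:
        return "high"
    return "medium" if len(hints) == 1 else "low"


# A page with both a waitlist CTA and a free-trial CTA (illustrative input).
hints = ["pre-seed", "seed"]
print(dominant_stage(hints), confidence(hints))
```

Note that confidence counts signals rather than agreement between them, so two conflicting hints still report high confidence; the stage evidence sentence requested from Gemini in Step 5 is where such a conflict would surface.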
+---
+## Step 5: Product Analysis with Gemini
+```bash
+python3 << 'PYEOF'
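+import json, os
+# Steps 3-4 are skipped when the user pastes a description. In that case, write the
+# pasted text to /tmp/vc-product-raw.md first; the stage-signal file may not exist.
+product_content = open('/tmp/vc-product-raw.md').read()[:6000]
+signals_path = '/tmp/vc-stage-signals.json'
+if os.path.exists(signals_path):
+    stage_signals = json.load(open(signals_path))
+else:
+    stage_signals = {'signals': [], 'dominant_stage': 'unknown', 'confidence': 'low'}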
+request = {
+    "system_instruction": {
+        "parts": [{
+            "text": "You are a venture capital analyst. Analyze a product page and return structured JSON only. No commentary. No em dashes. Vague category labels like 'technology' or 'software' alone are not acceptable at L2 or L3 -- be specific. Comparable companies must be real funded companies with public funding records, well-known enough to appear in press coverage."
+        }]
+    },
+    "contents": [{
+        "parts": [{
+            "text": f"""Analyze this product page and return a JSON object with exactly these keys:
+1. product_name: string
+2. one_line_description: string -- what it does, for whom, core value prop. Under 20 words. No marketing language.
+3. industry_taxonomy: object with:
+   - l1: top-level (e.g. "software", "fintech", "healthtech", "consumer", "hardware")
+   - l2: sector (e.g. "developer tools", "sales technology", "edtech", "logistics software")
+   - l3: specific niche (e.g. "CI/CD automation", "outbound prospecting", "last-mile routing")
+4. icp: object with:
+   - buyer_persona: job title (e.g. "VP Engineering", "founder", "sales ops manager")
+   - company_type: (e.g. "B2B SaaS", "e-commerce brand", "enterprise IT team")
+   - company_size: (e.g. "5-50 employees", "50-500 employees", "enterprise")
+5. detected_stage: one of: pre-seed, seed, series-a, series-b, unknown
+6. stage_confidence: one of: high, medium, low
+7. stage_evidence: one sentence citing exactly which CTA or text on the page drove this classification. Write "no clear signals found" if unknown.
+8. comparable_companies: array of exactly 5 objects, each with:
+   - name: real company name (must have public VC funding records)
+   - similarity_reason: one sentence why this company is comparable to the product
+   - estimated_stage: their funding stage as of your knowledge cutoff
+9. geography_bias: one of: US, Europe, global, unclear -- infer from page text
+Stage signals detected from the page (use as input to your stage classification):
+{json.dumps(stage_signals, indent=2)}
+Product page content:
+{product_content}"""
+        }]
+    }],
+    "generationConfig": {
+        "temperature": 0.2,
+        "maxOutputTokens": 3000
+    }
+}
+json.dump(request, open('/tmp/vc-analysis-request.json', 'w'))
+PYEOF
+curl -s -X POST \
+  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d @/tmp/vc-analysis-request.json \
+  | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+text = d['candidates'][0]['content']['parts'][0]['text'].strip()
+if text.startswith('\`\`\`'):
+    text = '\n'.join(text.split('\n')[1:-1])
+analysis = json.loads(text)
+json.dump(analysis, open('/tmp/vc-product-analysis.json', 'w'), indent=2)
+print('Product analysis complete.')
+print('Product:', analysis['product_name'])
+print('Industry:', analysis['industry_taxonomy']['l1'], '>', analysis['industry_taxonomy']['l2'], '>', analysis['industry_taxonomy']['l3'])
+print('Stage:', analysis['detected_stage'], '(' + analysis['stage_confidence'] + ' confidence)')
+print('Comparables:', ', '.join(c['name'] for c in analysis['comparable_companies']))
+"
+```
+**If Gemini returns empty or JSON parsing fails:** Retry once with `maxOutputTokens` reduced to 2000. If retry also fails: Stop. Tell the user: "Product analysis failed. Please paste a direct description (3-5 sentences: what it does, who it is for, current stage) and run again."
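Deciding whether a retry is needed is easier when parsing and validation live in one function that either returns the analysis or raises. The following is a minimal sketch, not the skill's own code: `parse_analysis` is a hypothetical helper, the key set is copied from the Step 5 prompt, and the sample reply is fabricated placeholder data.

```python
import json

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block

REQUIRED_KEYS = {
    "product_name", "one_line_description", "industry_taxonomy", "icp",
    "detected_stage", "stage_confidence", "stage_evidence",
    "comparable_companies", "geography_bias",
}


def parse_analysis(text: str) -> dict:
    """Strip an optional Markdown fence, parse JSON, and check the expected keys."""
    text = text.strip()
    if text.startswith(FENCE):
        lines = text.split("\n")[1:]               # drop the opening fence line
        if lines and lines[-1].strip() == FENCE:   # drop the closing fence if present
            lines = lines[:-1]
        text = "\n".join(lines)
    data = json.loads(text)                        # raises json.JSONDecodeError on bad JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if len(data["comparable_companies"]) != 5:
        raise ValueError("expected exactly 5 comparable companies")
    return data


# Hypothetical model reply, wrapped in a fence the way chat models often do.
sample = {key: "placeholder" for key in REQUIRED_KEYS}
sample["comparable_companies"] = [{"name": f"Company {i}"} for i in range(5)]
reply = FENCE + "json\n" + json.dumps(sample) + "\n" + FENCE
print(parse_analysis(reply)["comparable_companies"][0]["name"])
```

Any exception from this function (`json.JSONDecodeError` is a subclass of `ValueError`) can be treated as the "parsing fails" case that triggers the single retry.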
+---
+## Step 6: Track A -- Who Invested in Comparable Companies
+Run 5 Tavily searches, one per comparable. Save all results to a single file.
+```bash
+python3 << 'PYEOF'
+import json, os, urllib.request
+analysis = json.load(open('/tmp/vc-product-analysis.json'))
+comparables = analysis['comparable_companies']
+tavily_key = os.environ.get('TAVILY_API_KEY', '')
+all_track_a = []
+for comp in comparables:
+    company = comp['name']
+    query = f'"{company}" investors funding venture capital backed seed series'
+    payload = json.dumps({
+        "api_key": tavily_key,
+        "query": query,
+        "search_depth": "advanced",
+        "max_results": 5,
+        "include_answer": True
+    }).encode()
+    req = urllib.request.Request(
+        'https://api.tavily.com/search',
+        data=payload,
+        headers={'Content-Type': 'application/json'},
+        method='POST'
+    )
+    try:
+        with urllib.request.urlopen(req, timeout=30) as resp:
+            result = json.loads(resp.read())
+            all_track_a.append({
+                'comparable_company': company,
+                'similarity_reason': comp['similarity_reason'],
+                'query': query,
+                'answer': result.get('answer', ''),
+                'results': result.get('results', [])
+            })
+            print(f'Track A - {company}: {len(result.get("results", []))} results')
+    except Exception as e:
+        print(f'Track A - {company}: FAILED ({e})')
+        all_track_a.append({
+            'comparable_company': company,
+            'similarity_reason': comp['similarity_reason'],
+            'query': query,
+            'answer': '',
+            'results': [],
+            'error': str(e)
+        })
+json.dump(all_track_a, open('/tmp/vc-tracka-results.json', 'w'), indent=2)
+print(f'Track A complete. Comparables with results: {sum(1 for r in all_track_a if r.get("results"))}')
+PYEOF
+```
+**If all 5 Track A searches return 0 results:** Tell the user: "No funding data found for the comparable companies. This usually means the comparables are too early-stage or obscure for public press coverage. I will retry with broader comparable names." Then re-run Step 5 with a note to Gemini to choose "well-funded companies with significant press coverage" and retry Step 6.
+If the retry also returns 0 results: proceed to Track B only, and flag this in `data_quality_flags`.
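The retry rule above reduces to a small decision over the saved Track A results. This sketch assumes the list shape written to `/tmp/vc-tracka-results.json` in Step 6; the function name and the returned labels are hypothetical.

```python
def track_a_next_step(track_a, already_retried):
    """Decide what to do after Step 6 based on how many comparables returned results."""
    hits = sum(1 for item in track_a if item.get("results"))
    if hits > 0:
        return "continue"
    # Zero hits: regenerate comparables once, then fall back to Track B only.
    return "track_b_only" if already_retried else "regenerate_comparables"


# Illustrative data: every search came back empty on the first pass.
empty_run = [{"comparable_company": f"Company {i}", "results": []} for i in range(5)]
print(track_a_next_step(empty_run, already_retried=False))
print(track_a_next_step(empty_run, already_retried=True))
```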
+---
+## Step 7: Track B -- VCs With Investment Theses About This Space
+Run 3 Tavily searches using the L2 and L3 taxonomy from Step 5.
+```bash
+python3 << 'PYEOF'
+import json, os, urllib.request
+analysis = json.load(open('/tmp/vc-product-analysis.json'))
+l2 = analysis['industry_taxonomy']['l2']
+l3 = analysis['industry_taxonomy']['l3']
+stage = analysis['detected_stage']
+tavily_key = os.environ.get('TAVILY_API_KEY', '')
+queries = [
+    {
+        'name': 'thesis_l3',
+        'query': f'venture capital investment thesis "{l3}" investing 2023 OR 2024 OR 2025'
+    },
+    {
+        'name': 'thesis_l2',
+        'query': f'VC fund "{l2}" investment thesis portfolio companies'
+    },
+    {
+        'name': 'stage_space',
+        'query': f'{stage} investors "{l3}" startup venture capital fund'
+    }
+]
+all_track_b = []
+for q in queries:
+    payload = json.dumps({
+        "api_key": tavily_key,
+        "query": q['query'],
+        "search_depth": "advanced",
+        "max_results": 7,
+        "include_answer": True
+    }).encode()
+    req = urllib.request.Request(
+        'https://api.tavily.com/search',
+        data=payload,
+        headers={'Content-Type': 'application/json'},
+        method='POST'
+    )
+    try:
+        with urllib.request.urlopen(req, timeout=30) as resp:
+            result = json.loads(resp.read())
+            all_track_b.append({
+                'query_name': q['name'],
+                'query': q['query'],
+                'answer': result.get('answer', ''),
+                'results': result.get('results', [])
+            })
+            print(f"Track B - {q['name']}: {len(result.get('results', []))} results")
+    except Exception as e:
+        print(f"Track B - {q['name']}: FAILED ({e})")
+        all_track_b.append({
+            'query_name': q['name'],
+            'query': q['query'],
+            'answer': '',
+            'results': [],
+            'error': str(e)
+        })
+json.dump(all_track_b, open('/tmp/vc-trackb-results.json', 'w'), indent=2)
+PYEOF
+```
+**If all 3 Track B searches return 0 results:** Proceed with Track A results only. Note in `data_quality_flags`: "No thesis-led investors found via public search. Try checking Substack manually for VC newsletters covering this niche."
+---
+## Step 8: Gemini Synthesis -- Rank and Score All VCs
+```bash
+python3 << 'PYEOF'
+import json
+analysis = json.load(open('/tmp/vc-product-analysis.json'))
+track_a = json.load(open('/tmp/vc-tracka-results.json'))
+track_b = json.load(open('/tmp/vc-trackb-results.json'))
+# Compress results to stay within token limits
+track_a_summary = []
+for item in track_a:
+    snippets = [{'title': r.get('title',''), 'url': r.get('url',''), 'content': r.get('content','')[:400]}
+                for r in item.get('results', [])[:3]]
+    track_a_summary.append({
+        'comparable_company': item['comparable_company'],
+        'similarity_reason': item['similarity_reason'],
+        'answer': item.get('answer', '')[:500],
+        'top_results': snippets
+    })
+track_b_summary = []
+for item in track_b:
+    snippets = [{'title': r.get('title',''), 'url': r.get('url',''), 'content': r.get('content','')[:400]}
+                for r in item.get('results', [])[:4]]
+    track_b_summary.append({
+        'query_name': item['query_name'],
+        'answer': item.get('answer', '')[:500],
+        'top_results': snippets
+    })
+context = {
+    'product': {
+        'name': analysis['product_name'],
+        'description': analysis['one_line_description'],
+        'industry': analysis['industry_taxonomy'],
+        'icp': analysis['icp'],
+        'stage': analysis['detected_stage'],
+        'stage_confidence': analysis['stage_confidence'],
+        'geography': analysis['geography_bias']
+    },
+    'track_a_research': track_a_summary,
+    'track_b_research': track_b_summary
+}
+request = {
+    "system_instruction": {
+        "parts": [{
+            "text": """You are a venture capital research analyst. Synthesize investor research into a sourced, ranked list. Follow these rules exactly:
+1. Only include VCs whose names appear in the provided Tavily search results. Do not add VCs not mentioned in the data.
+2. Every Track A VC must have evidence_company: the specific comparable company they backed (required -- omit the VC if you cannot confirm this).
+3. Every Track B VC must have thesis_source_title: the exact article or page title where they stated their thesis (required -- omit the VC if you cannot confirm this).
+4. stage_fit_score 1-10: penalize 3 points if the VC's typical stage does not match the product's detected stage.
+5. space_fit_score 1-10: only give 9-10 if the VC backed 2+ companies in this specific L3 niche.
+6. check_size: use ranges from search result data only. If not found, write "not in search data".
+7. approach_method: one of -- cold email, warm intro required, AngelList, application form, Twitter/X DM. Infer from what is publicly known about this fund's intake process.
+8. outreach_hook: must reference this specific product's differentiator and a named VC portfolio signal or thesis quote. Generic hooks like 'highlight your traction' are not acceptable.
+9. No em dashes anywhere in output.
+10. No marketing language."""
+        }]
+    },
+    "contents": [{
+        "parts": [{
+            "text": f"""Synthesize this VC research for the product below. Return a JSON object with exactly these keys:
+1. product_summary: object with name, one_line_description, industry_l1, industry_l2, industry_l3, detected_stage, comparable_companies_used (array of names)
+2. track_a_vcs: array of VC objects from Track A research. Each object:
+   - fund_name, evidence_company (REQUIRED), evidence_source_url, stage_focus, check_size, thesis_summary (1-2 sentences), stage_fit_score (1-10), space_fit_score (1-10), approach_method
+3. track_b_vcs: array of VC objects from Track B research. Each object:
+   - fund_name, thesis_source_title (REQUIRED), thesis_source_url, stage_focus, check_size, thesis_summary (1-2 sentences), stage_fit_score (1-10), space_fit_score (1-10), approach_method
+4. top_5_deep_dives: array of exactly 5 objects (the 5 highest combined score VCs across both tracks). Each:
+   - fund_name, track ("A" or "B"), fund_overview (2-3 sentences), why_fit (2-3 sentences specific to this product's L3 niche), portfolio_in_space (array of 1-3 names from search data only), how_to_approach (specific steps, min 30 chars), outreach_hook (2-3 sentences, product-specific)
+5. outreach_hooks: array of exactly 3 objects:
+   - hook_type (e.g. "portfolio overlap angle", "thesis language mirror", "comparable exit angle"), hook_text (2-3 sentences a founder would actually send), best_for (which VC type this works for)
+6. data_quality_flags: array of strings noting any gaps or low-confidence areas
+Research data:
+{json.dumps(context, indent=2)}"""
+        }]
+    }],
+    "generationConfig": {
+        "temperature": 0.3,
+        "maxOutputTokens": 6000
+    }
+}
+json.dump(request, open('/tmp/vc-synthesis-request.json', 'w'))
+print('Synthesis request prepared.')
+PYEOF
+curl -s -X POST \
+  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=$GEMINI_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d @/tmp/vc-synthesis-request.json \
+  | python3 -c "
+import sys, json
+d = json.load(sys.stdin)
+text = d['candidates'][0]['content']['parts'][0]['text'].strip()
+if text.startswith('\`\`\`'):
+    text = '\n'.join(text.split('\n')[1:-1])
+result = json.loads(text)
+json.dump(result, open('/tmp/vc-final-list.json', 'w'), indent=2)
+print(f'Synthesis complete. Track A: {len(result.get(\"track_a_vcs\", []))} VCs. Track B: {len(result.get(\"track_b_vcs\", []))} VCs.')
+"
+```
+**If Gemini returns empty or JSON parsing fails:** Retry once with `maxOutputTokens` reduced to 4000. If retry also fails: present whatever partial JSON was returned, mark missing sections `[INCOMPLETE]`, and tell the user: "Synthesis incomplete. The research data may have been too large. Try running again."
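Because the top 5 selection is delegated to the model, it can be useful to recompute the ranking locally and compare. The sketch below assumes "combined score" means `stage_fit_score + space_fit_score`, which the prompt does not define precisely; `top_five` is a hypothetical helper and the fund names are placeholders.

```python
def top_five(track_a_vcs, track_b_vcs):
    """Rank VCs from both tracks by summed fit scores, keeping one entry per fund."""
    tagged = ([dict(v, track="A") for v in track_a_vcs]
              + [dict(v, track="B") for v in track_b_vcs])
    best = {}
    for vc in tagged:
        # Assumption: combined score = stage fit + space fit.
        score = vc.get("stage_fit_score", 0) + vc.get("space_fit_score", 0)
        key = vc["fund_name"].strip().lower()
        if key not in best or score > best[key][0]:
            best[key] = (score, vc)
    ranked = sorted(best.values(), key=lambda pair: pair[0], reverse=True)
    return [(vc["fund_name"], vc["track"], score) for score, vc in ranked[:5]]


track_a = [{"fund_name": "Fund Alpha", "stage_fit_score": 8, "space_fit_score": 9}]
track_b = [
    {"fund_name": "Fund Beta", "stage_fit_score": 7, "space_fit_score": 6},
    {"fund_name": "fund alpha", "stage_fit_score": 5, "space_fit_score": 5},
]
for name, track, score in top_five(track_a, track_b):
    print(name, track, score)
```

A fund that appears in both tracks is kept once, under whichever track gave it the higher score.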
+---
+## Step 9: Self-QA
+Run before presenting. Remove non-evidenced VCs structurally.
+```bash
+python3 << 'PYEOF'
+import json
+result = json.load(open('/tmp/vc-final-list.json'))
+failures = []
+# Remove Track A VCs missing evidence_company
+original_a = len(result.get('track_a_vcs', []))
+result['track_a_vcs'] = [v for v in result.get('track_a_vcs', []) if v.get('evidence_company')]
+removed_a = original_a - len(result['track_a_vcs'])
+if removed_a > 0:
+    failures.append(f'Removed {removed_a} Track A VC(s) missing evidence_company')
+# Remove Track B VCs missing thesis_source_title
+original_b = len(result.get('track_b_vcs', []))
+result['track_b_vcs'] = [v for v in result.get('track_b_vcs', []) if v.get('thesis_source_title')]
+removed_b = original_b - len(result['track_b_vcs'])
+if removed_b > 0:
+    failures.append(f'Removed {removed_b} Track B VC(s) missing thesis_source_title')
+# Check top 5 deep dives
+dives = result.get('top_5_deep_dives', [])
+if len(dives) < 5:
+    failures.append(f'Only {len(dives)} deep dives (expected 5) -- insufficient search data')
+for dd in dives:
+    if not dd.get('how_to_approach') or len(dd.get('how_to_approach', '')) < 30:
+        dd['how_to_approach'] = 'Approach method not determinable from search data. Check the fund website directly for application instructions.'
+        failures.append(f"Fixed: '{dd.get('fund_name')}' had missing how_to_approach")
+# Check outreach hooks count
+if len(result.get('outreach_hooks', [])) != 3:
+    failures.append(f"Expected 3 outreach hooks, got {len(result.get('outreach_hooks', []))}")
+# Check for em dashes (U+2014); ensure_ascii=False keeps them as literal characters in the dump
+result_str = json.dumps(result, ensure_ascii=False)
+if '\u2014' in result_str:
+    result = json.loads(result_str.replace('\u2014', ':'))
+    failures.append('Fixed: em dash characters removed from output')
+# Check for forbidden words
+forbidden = ['powerful', 'robust', 'seamless', 'innovative', 'game-changing', 'streamline', 'leverage', 'transform']
+full_text = json.dumps(result).lower()
+for word in forbidden:
+    if word in full_text:
+        failures.append(f"Warning: forbidden word '{word}' found in output -- review before presenting")
+# Ensure data_quality_flags exists
+if 'data_quality_flags' not in result:
+    result['data_quality_flags'] = []
+result['data_quality_flags'].extend(failures)
+json.dump(result, open('/tmp/vc-final-list.json', 'w'), indent=2)
+print(f'QA complete. Issues addressed: {len(failures)}')
+for f in failures:
+    print(f'  - {f}')
+if not failures:
+    print('All QA checks passed.')
+PYEOF
+```
+---
+## Step 10: Save and Present Output
+```bash
+DATE=$(date +%Y-%m-%d)
+OUTPUT_FILE="docs/vc-intel/${PRODUCT_SLUG}-${DATE}.md"
+mkdir -p docs/vc-intel
+```
+Write the final output to `$OUTPUT_FILE`, then present it:
+```
+## VC Finder: [product_name]
+Date: [today] | Stage: [detected_stage] ([stage_confidence] confidence) | Geography: [geography_bias]
+---
+### Product Analysis
+What it does: [one_line_description]
+Industry: [l1] > [l2] > [l3]
+Buyer: [buyer_persona] at [company_type], [company_size]
+Comparable companies used for research: [comma-separated list]
+---
+### Track A: VCs Who Backed Similar Companies
+*These investors have already written a check in this space.*
+| Fund | Backed Comparable | Stage Focus | Check Size | Fit Score | Approach |
+|---|---|---|---|---|---|
+[one row per Track A VC, sorted by space_fit_score descending]
+---
+### Track B: Thesis-Led Investors
+*These investors are actively publishing about this space.*
+| Fund | Thesis Source | Stage Focus | Check Size | Fit Score | Approach |
+|---|---|---|---|---|---|
+[one row per Track B VC, sorted by space_fit_score descending]
+---
+### Top 5 Deep Dives
+#### [N]. [Fund Name] (Track [A/B])
+Overview: [fund_overview]
+Why it fits: [why_fit]
+Portfolio in this space: [names, or "Not found in search data"]
+How to approach: [how_to_approach]
+Outreach hook: "[outreach_hook]"
+[repeat for all 5]
+---
+### 3 Outreach Hooks for This Product Type
+**1. [hook_type]**
+[hook_text]
+Best for: [best_for]
+[repeat for all 3]
+---
+Data quality notes: [data_quality_flags, or "None"]
+Saved to: docs/vc-intel/[PRODUCT_SLUG]-[DATE].md
+```
+Clean up temp files:
+```bash
+rm -f /tmp/vc-product-raw.md /tmp/vc-stage-signals.json /tmp/vc-analysis-request.json \
+      /tmp/vc-product-analysis.json /tmp/vc-tracka-results.json /tmp/vc-trackb-results.json \
+      /tmp/vc-synthesis-request.json /tmp/vc-final-list.json /tmp/vc-qa-result.json
+```
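As a reading aid for Step 10, here is a minimal Python sketch of how the output path and the Track A table could be produced. The helper names `output_path` and `track_a_table` and the keys `stage_focus` and `check_size` are assumptions for illustration and do not appear in the skill. `evidence_company` and `space_fit_score` come from the steps above; `fund_name` and `how_to_approach` are shown there only on deep dives, so using them on list rows is also an assumption.

```python
from datetime import date
from pathlib import Path


def output_path(product_slug: str, today: date) -> Path:
    """Mirror OUTPUT_FILE from Step 10: docs/vc-intel/<slug>-<YYYY-MM-DD>.md."""
    return Path("docs/vc-intel") / f"{product_slug}-{today.isoformat()}.md"


def track_a_table(vcs: list) -> str:
    """Render the Track A table with the highest space_fit_score first."""
    header = [
        "| Fund | Backed Comparable | Stage Focus | Check Size | Fit Score | Approach |",
        "|---|---|---|---|---|---|",
    ]
    # A missing score sorts last because it is treated as 0.
    ranked = sorted(vcs, key=lambda v: v.get("space_fit_score", 0), reverse=True)
    rows = []
    for vc in ranked:
        cells = [
            str(vc.get("fund_name", "")),
            str(vc.get("evidence_company", "")),
            str(vc.get("stage_focus", "")),      # hypothetical key
            str(vc.get("check_size", "")),       # hypothetical key
            str(vc.get("space_fit_score", "")),
            str(vc.get("how_to_approach", "")),  # assumed to exist on list rows
        ]
        # Escape pipes so a cell value cannot break the Markdown table.
        rows.append("| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |")
    return "\n".join(header + rows)
```

The Track B table would follow the same shape with `thesis_source_title` in the second column.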

package/skills/vc-finder/evals/evals.json ADDED

@@ -0,0 +1,125 @@
+[
+  {
+    "id": "eval_001",
+    "name": "B2B SaaS product URL: full two-track output with correct stage detection",
+    "description": "A B2B SaaS developer tool with a demo CTA. Validates the full 10-step workflow, stage detection from page signals, two-track VC research, and complete output.",
+    "input": {
+      "prompt": "Find VCs for my startup: https://linear.app",
+      "env": {
+        "GEMINI_API_KEY": "set",
+        "TAVILY_API_KEY": "set",
+        "FIRECRAWL_API_KEY": "set"
+      }
+    },
+    "expected_behavior": [
+      "Fetches URL via Firecrawl",
+      "Step 4 detects 'free trial or demo CTA' signal from page content",
+      "Stage outputs 'seed' or 'series-a' with medium or high confidence",
+      "Industry taxonomy: software > developer tools > issue tracking or project management",
+      "Generates 5 comparable companies (e.g. Jira, Shortcut, Height, Plane, Asana)",
+      "Runs 5 Track A Tavily searches using quoted company names",
+      "Runs 3 Track B Tavily searches using L2 and L3 taxonomy terms",
+      "Every Track A VC in the output has evidence_company field populated",
+      "Every Track B VC in the output has thesis_source_title field populated",
+      "Self-QA step removes any VCs missing required evidence fields",
+      "Output includes exactly 5 deep dives and exactly 3 outreach hooks",
+      "Output saved to docs/vc-intel/linear-[date].md"
+    ],
+    "expected_output": "Full two-track VC list with sourced Track A evidence and Track B thesis citations, 5 deep dives with product-specific outreach hooks, saved to docs/vc-intel/"
+  },
+  {
+    "id": "eval_002",
+    "name": "Consumer app: different stage signals, consumer-focused VCs not B2B enterprise funds",
+    "description": "A consumer mobile app. Validates that the skill detects consumer-specific signals and returns consumer-focused investors, not enterprise SaaS funds.",
+    "input": {
+      "prompt": "Find investors for this consumer app: https://www.bereal.com",
+      "env": {
+        "GEMINI_API_KEY": "set",
+        "TAVILY_API_KEY": "set",
+        "FIRECRAWL_API_KEY": "set"
+      }
+    },
+    "expected_behavior": [
+      "Fetches URL successfully",
+      "Industry taxonomy maps to consumer or social media, not B2B SaaS",
+      "Comparable companies are consumer social apps, not enterprise tools",
+      "Track A searches target consumer social investors (who backed TikTok, Snapchat, Instagram at early stage)",
+      "Track B searches use consumer social or UGC thesis terms",
+      "VCs in output are consumer-focused funds, not B2B SaaS investors",
+      "Outreach hooks are specific to consumer social or photo product type",
+      "No B2B enterprise-only investors appear in the output"
+    ],
+    "expected_output": "VC list with consumer-focused investors, stage-appropriate alignment, no B2B enterprise fund entries"
+  },
+  {
+    "id": "eval_003",
+    "name": "Description-only input: skips fetch, goes direct to Gemini analysis",
+    "description": "User pastes a product description with no URL. Validates that Steps 3 and 4 are skipped and the skill proceeds directly to Gemini analysis.",
+    "input": {
+      "prompt": "Find VCs for my startup. Here's what we do: We build an AI-powered legal contract review tool for in-house legal teams at mid-market companies. Our tool flags risky clauses and suggests standard language. We're pre-seed, 3 months post-launch, no pricing page yet. US-focused.",
+      "env": {
+        "GEMINI_API_KEY": "set",
+        "TAVILY_API_KEY": "set",
+        "FIRECRAWL_API_KEY": "not set"
+      }
+    },
+    "expected_behavior": [
+      "Detects no URL in the input",
+      "Skips Step 3: no fetch attempt is made",
+      "Skips Step 4: no regex stage detection from page",
+      "Passes the pasted description directly to Gemini in Step 5",
+      "Stage set to 'pre-seed' with high confidence (user stated it explicitly)",
+      "Industry taxonomy: software > legaltech > contract review automation",
+      "Generates 5 comparable legaltech companies",
+      "Runs full Track A and Track B searches using description-derived comparables and taxonomy",
+      "Output note: 'Stage: pre-seed (stated by user)'",
+      "data_quality_flags includes note about stage coming from user description not page signals"
+    ],
+    "expected_output": "Full VC list using description-derived analysis, legaltech-specific investors, stage labeled as user-stated"
+  },
+  {
+    "id": "eval_004",
+    "name": "GEMINI_API_KEY missing: immediate stop at Step 1",
+    "description": "Validates that the skill stops at Step 1 with exact setup instructions when Gemini key is absent.",
+    "input": {
+      "prompt": "Find VCs for https://stripe.com",
+      "env": {
+        "GEMINI_API_KEY": "not set",
+        "TAVILY_API_KEY": "set",
+        "FIRECRAWL_API_KEY": "set"
+      }
+    },
+    "expected_behavior": [
+      "Step 1 detects GEMINI_API_KEY is missing",
+      "Stops immediately at Step 1",
+      "Tells the user: 'GEMINI_API_KEY is required for product analysis and VC synthesis. Get it at aistudio.google.com. Add it to your .env file.'",
+      "Does NOT fetch the URL",
+      "Does NOT run any Tavily searches",
+      "Does NOT attempt any analysis"
+    ],
+    "expected_output": "Immediate stop at Step 1 with exact error message including the URL to get the key. No partial output generated."
+  },
+  {
+    "id": "eval_005",
+    "name": "TAVILY_API_KEY missing: immediate stop at Step 1, no fallback",
+    "description": "Validates that the skill stops at Step 1 when Tavily key is absent, with no fallback path suggested.",
+    "input": {
+      "prompt": "Who invests in CI/CD automation startups? https://buildkite.com",
+      "env": {
+        "GEMINI_API_KEY": "set",
+        "TAVILY_API_KEY": "not set",
+        "FIRECRAWL_API_KEY": "set"
+      }
+    },
+    "expected_behavior": [
+      "Step 1 detects GEMINI_API_KEY is set",
+      "Step 1 detects TAVILY_API_KEY is missing",
+      "Stops immediately at Step 1",
+      "Tells the user: 'TAVILY_API_KEY is required to research VC investments and theses. There is no fallback for this. Get it at app.tavily.com. Free tier: 1000 credits/month (about 125 full runs). Add it to your .env file.'",
+      "Does NOT fetch the URL",
+      "Does NOT run any Gemini analysis",
+      "Does NOT suggest any workaround"
+    ],
+    "expected_output": "Immediate stop at Step 1 with exact error message. Does not proceed past setup check."
+  }
+]
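The eval file above has a regular shape, so a small checker can catch malformed cases before they are run. The sketch below is illustrative only: `validate_evals` is a hypothetical helper, not part of the package, and the rule that `env` values are limited to `"set"` and `"not set"` is inferred from the five cases shown.

```python
REQUIRED_KEYS = {
    "id", "name", "description", "input", "expected_behavior", "expected_output",
}
ENV_STATES = {"set", "not set"}  # inferred from the cases in evals.json


def validate_evals(evals: list) -> list:
    """Return a list of human-readable problems; an empty list means the shape is fine."""
    problems = []
    seen_ids = set()
    for index, case in enumerate(evals):
        label = case.get("id", f"index {index}")
        missing = REQUIRED_KEYS - case.keys()
        if missing:
            problems.append(f"{label}: missing keys {sorted(missing)}")
            continue  # the remaining checks need the full set of keys
        if label in seen_ids:
            problems.append(f"{label}: duplicate id")
        seen_ids.add(label)
        if not case["input"].get("prompt"):
            problems.append(f"{label}: empty prompt")
        for key, state in case["input"].get("env", {}).items():
            if state not in ENV_STATES:
                problems.append(f"{label}: env {key} has unexpected value {state!r}")
        if not case["expected_behavior"]:
            problems.append(f"{label}: no expected_behavior entries")
    return problems
```

To use it, load the file with `json.load` and pass the resulting list to `validate_evals`.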

package/skills/vc-finder/references/stage-signals.md ADDED

@@ -0,0 +1,98 @@
+# Stage Signals Reference
+Used by SKILL.md Step 4 to detect funding stage from product page content before any API call.
+---
+## Signal Detection Table
+| Signal Pattern (regex) | Stage Hint | Example Page Text |
+|---|---|---|
+| `join (the )?waitlist` | pre-seed | "Join the waitlist for early access" |
+| `sign up for beta` | pre-seed | "Sign up for beta access" |
+| `early access` | pre-seed | "Request early access" |
+| `request (an? )?invite` | pre-seed | "Request an invite" |
+| `get notified` | pre-seed | "Get notified when we launch" |
+| `start (your )?free trial` | seed | "Start your free trial today" |
+| `try (it )?for free` | seed | "Try for free, no credit card required" |
+| `request a? demo` | seed | "Request a demo" |
+| `book a? demo` | seed | "Book a demo with our team" |
+| `schedule a? demo` | seed | "Schedule a demo today" |
+| `contact sales` | series-a | "Contact sales for enterprise pricing" |
+| `talk to (our )?sales` | series-a | "Talk to our sales team" |
+| `see pricing` / `view pricing` | series-a | "See pricing" |
+| `plans and pricing` | series-a | "Plans and Pricing" |
+| `case stud(y\|ies)` | series-a | "Read our case studies" |
+| `customer stor(y\|ies)` | series-a | "Read our customer stories" |
+| `trusted by \d+` | series-a | "Trusted by 2,000+ teams" |
+| `enterprise (plan\|pricing\|tier)` | series-a-or-b | "Enterprise plan available" |
+| `we.?re hiring` | series-a-or-b | "We're hiring -- see open roles" |
+| `join our team` | series-a-or-b | "Join our team of 50+" |
+| `raised \$[\d,.]+[mk]?` | announced | "We raised $8M in Series A" |
+| `series [abc] round` | announced | "Series B round closed" |
+| `seed round` | announced | "Seed round led by X" |
+---
+## Signal Confidence Rules
+| Signals Found | Confidence |
+|---|---|
+| 2 or more matching signals | high |
+| Exactly 1 matching signal | medium |
+| 0 signals (stage estimated from content alone) | low |
+| Funding announcement text found directly | high (overrides other signals) |
+---
+## Handling Conflicting Signals
+Some pages show signals from multiple stages simultaneously (e.g. a pricing page AND a waitlist). Use this resolution order:
+1. Funding announcement text wins over all other signals (the stage is known, not inferred)
+2. If both "pricing page" (Series A) and "free trial" (seed) signals appear: call it seed-to-series-a transition, output `series-a` with medium confidence
+3. If both "waitlist" (pre-seed) and "demo request" (seed) signals appear: output `seed` -- the product is likely further along than the waitlist implies
+4. If no signals found at all: output `unknown` with low confidence, pass this to Gemini with a note to infer from product maturity and content
+---
+## Common Misdetections
+**"Request demo" from a mature Series B company:** Some large companies keep demo CTAs even after raising Series B. Override the seed hint if the page also shows enterprise logo bars, "trusted by Fortune 500", or an explicit Series B announcement.
+**Open-source projects:** Often show no stage signals (no pricing, no CTA). Output `unknown`. In the Gemini analysis step, note that the product appears open-source and ask Gemini to infer whether there is a commercial entity behind it.
+**Startup landing pages with no product live:** A site with only a headline, a value prop paragraph, and an email capture is almost certainly pre-seed. Even without explicit waitlist language, if the page has no product demo and no pricing, output `pre-seed` with medium confidence.
+---
+## Stage-to-Tavily Query Modifiers
+These modifiers are added to Track B search queries based on detected stage:
+| Detected Stage | Query Modifier |
+|---|---|
+| pre-seed | "pre-seed micro VC angel" |
+| seed | "seed fund" |
+| series-a | "series A lead investor" |
+| series-b | "growth stage VC" |
+| unknown | "early stage" (default) |
+Example Track B query with modifier:
+- L3 = "CI/CD automation", stage = seed
+- Query: `seed fund "CI/CD automation" investment thesis portfolio companies`
+---
+## Stage-to-VC-Check-Size Reference
+Use this to validate stage fit scores in the synthesis step:
+| Stage | Typical Check Size |
+|---|---|
+| Pre-seed | $25K-$500K |
+| Seed | $500K-$3M |
+| Series A | $3M-$15M |
+| Series B | $15M-$50M |
+A VC whose typical check size is $20M-$50M has a stage fit score of 2 or lower for a pre-seed product, regardless of how well their thesis matches.
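The Signal Detection Table, the confidence rules, and the conflict resolution order in this file can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the skill's Step 4 code: it uses only a subset of the table's patterns, `detect_stage` is a hypothetical name, rules 2 and 3 are generalized to any seed or series-a hint, and picking the most advanced hint when no rule applies is an assumption. A funding announcement returns the placeholder `announced`; extracting the actual round is left out.

```python
import re

# Subset of the Signal Detection Table: (regex, stage hint).
SIGNALS = [
    (r"join (the )?waitlist", "pre-seed"),
    (r"early access", "pre-seed"),
    (r"try (it )?for free", "seed"),
    (r"contact sales", "series-a"),
    (r"plans and pricing", "series-a"),
    (r"trusted by \d+", "series-a"),
    (r"we.?re hiring", "series-a-or-b"),
    (r"raised \$[\d,.]+[mk]?", "announced"),
    (r"seed round", "announced"),
]
# Least to most advanced; used only when no explicit rule applies (assumption).
ORDER = ["pre-seed", "seed", "series-a", "series-a-or-b"]


def detect_stage(page_text: str) -> tuple:
    """Return (stage, confidence) for the given page text."""
    hits = [hint for pattern, hint in SIGNALS
            if re.search(pattern, page_text, re.IGNORECASE)]
    if not hits:
        return "unknown", "low"       # rule 4: no signals at all
    if "announced" in hits:
        return "announced", "high"    # rule 1: announcement overrides everything
    hints = set(hits)
    if {"seed", "series-a"} <= hints:
        return "series-a", "medium"   # rule 2: seed-to-series-a transition
    confidence = "high" if len(hits) >= 2 else "medium"
    if {"pre-seed", "seed"} <= hints:
        return "seed", confidence     # rule 3: further along than the waitlist implies
    return max(hints, key=ORDER.index), confidence
```

For example, `detect_stage("Try for free. Contact sales for enterprise pricing")` takes the rule 2 branch, while text with no matching pattern falls through to `unknown` with low confidence.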

package/skills/vc-finder/references/vc-outreach-guide.md ADDED

@@ -0,0 +1,142 @@
+# VC Outreach Guide
+How to approach the different investor archetypes the skill surfaces. The right approach depends on the VC type, not on the founder's preference.
+---
+## 5 Investor Archetypes
+### Archetype 1: Thesis-First Writers (Track B investors)
+These VCs publish blog posts, newsletters, or Twitter threads about why they want to invest in a specific space. They are signaling active deal interest.
+**How to approach:**
+1. Read the article or post that surfaced them in Track B before reaching out
+2. Open with a direct reference to their specific thesis language, not a paraphrase: "You wrote in [post title] that you believe [exact phrase from post]. We built exactly that."
+3. Keep the first message under 100 words. The thesis reference does the work -- do not over-explain.
+4. Cold email works here. These VCs publish specifically to attract inbound -- they expect cold emails from founders who read their work.
+**What to avoid:** Do not write "I came across your firm" or "I've been following your work." These are signals that you did not actually read the thesis. Quote it or do not mention it.
+---
+### Archetype 2: Portfolio-Pattern Investors (Track A investors)
+These VCs backed a company comparable to yours. They already understand the space, the buyer, and the risk profile.
+**How to approach:**
+1. Name the specific portfolio company in your opening: "You backed [Company X], which means you understand [specific pain]."
+2. Do not position yourself as a competitor to their portfolio company. Instead, find the adjacent or complementary angle: "We're solving the [different part of the workflow] problem that [Company X] doesn't address."
+3. Cold email or LinkedIn works. Warm intro from a mutual connection in the portfolio company works best.
+**What to avoid:** Do not say "just like [portfolio company] but better." This forces the investor to choose between two bets and they will default to protecting the existing one.
+---
+### Archetype 3: Operator-Turned-Investor
+Former founders or operators who moved into investing. Typically reachable via warm intro from their portfolio company founders.
+**How to approach:**
+1. Find a founder in their portfolio via LinkedIn or the fund's website
+2. Get a warm intro from that founder: "I'm building in the same space you were in at [company] -- would you be willing to introduce me to [investor name]?"
+3. Operator-investors respond to founder-to-founder intros at 3-5x the rate of cold outreach
+**What to avoid:** Cold email works less well with this archetype. They source deals heavily from their personal networks. Do not skip the warm path attempt.
+---
+### Archetype 4: Multi-Stage Generalists with Sector Coverage
+Large funds with dedicated sector teams. Partners focus on different verticals. Do not email the wrong person.
+**How to approach:**
+1. Identify the specific partner who covers your sector from the fund website or LinkedIn. Look for their recent investments in your space.
+2. Email the sector partner, not the managing partner or the fund's general inbox.
+3. If you cannot identify the right partner, email a principal or associate first and ask who covers your space. They are more responsive and will route you correctly.
+**What to avoid:** Do not cold email a managing partner at a large fund with no warm intro. Response rates are near zero. The sector partner path is 10x more effective.
+---
+### Archetype 5: Scout Program Participants
+Many top-tier funds (a16z, First Round, Bessemer, etc.) run scout programs -- operators, founders, and angels who refer deals in exchange for carry. Scouts have much higher response rates than GPs at the same fund.
+**How to approach:**
+1. Find scouts on Twitter/X or LinkedIn (search "[fund name] scout")
+2. Scouts are often active founders themselves -- they respond well to peer-to-peer founder outreach
+3. A positive scout referral carries significant weight internally at the fund
+**What to avoid:** Do not treat a scout as a gatekeeper to route around. A scout intro is often more valuable than a cold email to the GP.
+---
+## Cold Email vs Warm Intro
+| Method | Typical Response Rate | When to Use |
+|---|---|---|
+| Cold email (generic) | 1-3% | Only if no warm path exists |
+| Cold email (thesis-referenced) | 8-15% | Track B investors who published their thesis |
+| Cold email (portfolio-referenced) | 5-10% | Track A investors with named portfolio evidence |
+| Warm intro from portfolio founder | 30-50% | Always attempt first for Archetypes 3 and 4 |
+| Warm intro from mutual angel/advisor | 20-35% | Use your existing cap table to find bridge connections |
+The warm intro advantage is real. Before sending a cold email to a Track A or Track B investor, spend 10 minutes on LinkedIn finding a second-degree connection through their portfolio companies. One warm intro is worth 10 cold emails.
+---
+## Application-Form Funds
+Some funds operate structured intake processes. These require filling out a form, not cold emailing.
+| Fund Type | How to Engage |
+|---|---|
+| YC | Apply during batch cycle at ycombinator.com/apply |
+| First Round Capital | Submit via firstround.com/funding |
+| Andreessen Horowitz | Public application at a16z.com -- reference their exact investment criteria from their published content |
+| Sequoia | Warm intro strongly preferred; Arc program for early stage |
+| Techstars | Apply to accelerator program for access to their network |
+For application-form funds: use the exact language from their published investment criteria in your application. They score applications against stated criteria. Generic applications score low regardless of product quality.
+---
+## Outreach Message Structure
+For cold email (any archetype):
+```
+Subject: [Specific signal reference, under 8 words]
+[Sentence about THEM -- their portfolio company, their thesis, their recent investment]
+[One sentence about what you build and who it is for]
+[One specific data point: metric, customer, or signal]
+[One-line ask: 20-minute call, or specific question]
+[Name]
+```
+Hard limits:
+- Email body: under 100 words
+- Subject line: under 8 words
+- No attachments in first message
+- No pitch deck link in first message (send on request only)
+- No "I hope this finds you well" or "I wanted to reach out"
+---
+## Red Flags to Check Before Outreach
+Before emailing any VC from the output list, check:
+1. **Stale fund vintage:** A fund that raised in 2019 and has not announced a new fund since is likely fully deployed and not writing new checks. Check Crunchbase or their website for their most recent fund announcement.
+2. **Portfolio company failure in your space:** If a VC backed a direct competitor that failed, they may be reluctant to re-invest in the same category. Frame your differentiation clearly.
+3. **Check size mismatch:** If the VC's minimum check is $5M and you are raising a $1M seed, the round economics do not work. Do not waste either side's time.
+4. **Geography restriction:** Some funds explicitly invest only in US companies or only in European companies. Check their website before outreach.