@opendirectory.dev/skills 0.1.35 → 0.1.36

---
name: npm-downloads-to-leads
description: Takes a list of npm package names (yours or competitors'), fetches 12 weeks of daily download data from the npm API, computes a breakout velocity score per package to identify hockey-stick growth, fetches maintainer profiles from the npm registry and GitHub API, and outputs a ranked lead brief for each breakout package with who built it, how to reach them, and what to say. Use when asked to find evangelists before they are famous, track competitor package momentum, identify breakout npm packages, map npm maintainers to Twitter or GitHub, or find DevTools leads from package growth signals. Trigger when a user says "find leads from npm packages", "who maintains these breakout packages", "track npm download trends", "find evangelists before they are famous", or "map npm maintainers to Twitter".
compatibility: [claude-code, gemini-cli, github-copilot]
---

# npm Downloads to Leads

Take a list of npm packages. Fetch 12 weeks of download data. Compute breakout velocity. Enrich maintainer profiles. Output a ranked lead brief per breakout package with contact signals and an outreach message.

---

**Critical rule:** Every package download figure in the output must come from the npm API response. Every maintainer GitHub handle or Twitter username must come from the GitHub API response -- not guessed from the npm username. If the GitHub API did not return a `twitter_username` field, write "not found on GitHub" -- do not invent one.

---

## Common Mistakes

| The agent will want to... | Why that's wrong |
|---|---|
| Fetch GitHub profiles for every package in the list | The rate limit is 60 req/hr without a token. Enriching steady or declining packages wastes the budget before reaching breakout ones. Only fetch profiles for breakout and watching packages. |
| Rank packages by raw weekly downloads | Raw downloads favor React and lodash, which are not leads. A package going from 1K to 8K/week is more actionable than React at 50M/week. Velocity score is the signal. |
| Skip URL-encoding for scoped packages | `@org/pkg` without encoding causes a 404 from the npm API. Encode `@` as `%40` and `/` as `%2F` for every scoped package name. |
| Stop the skill when the GitHub rate limit is hit | Degrade gracefully. Present the velocity leaderboard from npm data, skip remaining GitHub enrichments, and add a flag to `data_quality_flags`. Do not abort. |
| Write outreach messages without naming the specific package | Generic "I saw your project" messages go unanswered. Every outreach message must name the package, its growth numbers, and a specific connection to the context the user provided. |
| Include packages below 500 weekly downloads as leads | Below 500/week is noise. The maintainer has no meaningful audience yet. Flag as "too early" but do not present as a lead. |

---

## Step 1: Setup Check

```bash
echo "GITHUB_TOKEN: ${GITHUB_TOKEN:-not set, unauthenticated rate limit applies (60 req/hr -- enough for ~10 packages)}"
```

**If GITHUB_TOKEN is not set:** Continue. Inform the user: "GITHUB_TOKEN is not set. GitHub enrichment is limited to ~10 packages before hitting the rate limit. Add a token at github.com/settings/tokens (no scopes needed)."

No required keys. The npm Downloads API and npm registry are fully public with no authentication.

---

## Step 2: Gather Input

Collect from the conversation:
- One or more npm package names (unscoped like `esbuild`, or scoped like `@hono/hono`)
- Optional: a short product context string (used to personalize outreach messages)

If the user gives an npmjs.com URL, extract just the package name. Preserve the full scoped name, including the `@` and the org prefix -- URL encoding is handled in Step 3.
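
A minimal sketch of that extraction, assuming npmjs.com's `/package/<name>` (and `/package/<name>/v/<version>`) path layout; the helper name `npm_package_from_url` is hypothetical:

```python
# Pull the package name out of an npmjs.com URL, keeping the scope intact.
from urllib.parse import urlparse

def npm_package_from_url(url: str) -> str:
    path = urlparse(url).path
    # Package pages look like /package/<name> or /package/@scope/<name>
    if "/package/" in path:
        path = path.split("/package/", 1)[1]
    # Drop any trailing /v/<version> segment
    return path.strip("/").split("/v/")[0]

print(npm_package_from_url("https://www.npmjs.com/package/@hono/hono"))  # @hono/hono
print(npm_package_from_url("https://www.npmjs.com/package/zod/v/3.23.8"))  # zod
```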

**If no packages are provided:** Ask: "Which npm packages would you like to analyze? Provide your own, competitors', or a mix. Example: esbuild, @hono/hono, zod, valibot"

```bash
python3 << 'PYEOF'
import json, sys

packages_raw = "PACKAGES_HERE"    # comma or newline separated
product_context = "CONTEXT_HERE"  # optional, can be empty string

packages = [p.strip() for p in packages_raw.replace("\n", ",").split(",") if p.strip()]
if not packages:
    print("ERROR: No packages provided.")
    sys.exit(1)

print(f"Packages to analyze: {len(packages)}")
for p in packages:
    print(f"  {p}")

with open("/tmp/npl-input.json", "w") as f:
    json.dump({"packages": packages, "product_context": product_context}, f)
PYEOF
```

---

## Step 3: Fetch 12-Week Download Data

**Use the standalone script if available -- it handles Steps 3, 4, and 5 in one call, so you do not need to run the inline code blocks below.**

```bash
# Check if the script exists
ls scripts/fetch.py 2>/dev/null && echo "script available" || echo "script not found"
```

**If the script is available**, run it directly and skip to Step 6:

```bash
python3 scripts/fetch.py PACKAGES_HERE --context "CONTEXT_HERE" --output /tmp/npl-script-out.json
```

Then load the output into the enriched format Step 6 expects:

```bash
python3 << 'PYEOF'
import json

out = json.load(open("/tmp/npl-script-out.json"))
# The script output has a results array -- split it into scored and enriched for Steps 6-8
enriched = [r for r in out["results"] if "profile" in r]
scored = out["results"]
json.dump(scored, open("/tmp/npl-scored.json", "w"), indent=2)
json.dump(enriched, open("/tmp/npl-enriched.json", "w"), indent=2)
json.dump({"packages": [r["package"] for r in scored], "product_context": out.get("product_context", "")},
          open("/tmp/npl-input.json", "w"), indent=2)
print(f"Loaded {len(scored)} packages | {out['breakout_count']} breakout | {out['watching_count']} watching")
PYEOF
```

**If the script is not available**, run the inline code below.

Fetch daily download data for each package from the npm Downloads API, then aggregate the daily counts into weekly buckets.

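For orientation, the request URL and the daily-to-weekly rollup can be sketched against a tiny synthetic response. The endpoint and response shape are the real npm Downloads API; the package name and numbers are invented:

```python
import urllib.parse
from collections import defaultdict
from datetime import datetime

pkg = "@hono/hono"  # illustrative scoped package
encoded = urllib.parse.quote(pkg, safe="")  # "@" -> %40, "/" -> %2F
url = f"https://api.npmjs.org/downloads/range/2024-01-01:2024-01-14/{encoded}"
print(url)

# The API responds with {"downloads": [{"day": "YYYY-MM-DD", "downloads": N}, ...]}
fake_response = {"downloads": [
    {"day": "2024-01-01", "downloads": 100},  # ISO week (2024, 1)
    {"day": "2024-01-02", "downloads": 150},  # ISO week (2024, 1)
    {"day": "2024-01-08", "downloads": 400},  # ISO week (2024, 2)
]}

# Sum daily counts into ISO-week buckets, oldest week first
weekly = defaultdict(int)
for entry in fake_response["downloads"]:
    week_key = datetime.strptime(entry["day"], "%Y-%m-%d").isocalendar()[:2]
    weekly[week_key] += entry["downloads"]

print([v for _, v in sorted(weekly.items())])  # [250, 400]
```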
```bash
python3 << 'PYEOF'
import json, time
import urllib.request, urllib.error, urllib.parse
from datetime import datetime, timedelta, timezone
from collections import defaultdict

data = json.load(open("/tmp/npl-input.json"))
packages = data["packages"]

end_date = datetime.now(tz=timezone.utc)
start_date = end_date - timedelta(weeks=13)  # extra week buffer for partial weeks
start_str = start_date.strftime("%Y-%m-%d")
end_str = end_date.strftime("%Y-%m-%d")

results = []
failed = []

for pkg in packages:
    # URL-encode scoped packages: @ -> %40, / -> %2F
    encoded = urllib.parse.quote(pkg, safe="")
    url = f"https://api.npmjs.org/downloads/range/{start_str}:{end_str}/{encoded}"

    try:
        req = urllib.request.Request(url, headers={"User-Agent": "npm-downloads-to-leads/1.0"})
        with urllib.request.urlopen(req, timeout=20) as resp:
            raw = json.loads(resp.read())

        # Aggregate daily to weekly by ISO week
        weekly = defaultdict(int)
        for entry in raw.get("downloads", []):
            day = datetime.strptime(entry["day"], "%Y-%m-%d")
            week_key = day.isocalendar()[:2]  # (year, week_num)
            weekly[week_key] += entry["downloads"]

        weeks = [v for k, v in sorted(weekly.items())]
        # Take the last 12 complete weekly buckets
        weeks = weeks[-12:]

        results.append({
            "package": pkg,
            "weeks": weeks,
            "total_weeks": len(weeks),
            "current_weekly": weeks[-1] if weeks else 0,
            "status": "ok"
        })
        latest = weeks[-1] if weeks else 0
        print(f"  {pkg}: {len(weeks)} weeks, latest week {latest:,} downloads")

    except urllib.error.HTTPError as e:
        failed.append(pkg)
        status = "not_found" if e.code == 404 else f"error_{e.code}"
        results.append({"package": pkg, "weeks": [], "total_weeks": 0, "current_weekly": 0, "status": status})
        if e.code == 404:
            print(f"  {pkg}: NOT FOUND (404) -- will be skipped")
        else:
            print(f"  {pkg}: HTTP {e.code} error")
    except Exception as e:
        failed.append(pkg)
        results.append({"package": pkg, "weeks": [], "total_weeks": 0, "current_weekly": 0, "status": "error"})
        print(f"  {pkg}: fetch failed ({e})")

    time.sleep(0.2)  # gentle rate limiting

json.dump(results, open("/tmp/npl-download-data.json", "w"), indent=2)
print(f"\nFetch complete. OK: {len(results) - len(failed)} | Failed/Not found: {len(failed)}")
if failed:
    print(f"Skipped: {', '.join(failed)}")
PYEOF
```

**If all packages return 404 or errors:** Stop. Tell the user: "No download data could be fetched. Check that the package names are correct and exist on npmjs.com. Scoped packages must include the full name: @org/package."

---

## Step 4: Compute Velocity Scores

No API call -- pure Python. Compute a velocity score and growth ratio for each package, then classify it into a tier.

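As a sanity check, here is the scoring arithmetic worked through on one hypothetical 8-week series (the download numbers are invented for illustration):

```python
# Hypothetical package: a ~1.1K/week plateau breaking out toward 8K/week
weeks = [1000, 1000, 1100, 1200, 2000, 3500, 6000, 8000]

recent_4 = sum(weeks[-4:]) / 4    # 4875.0
prior_4 = sum(weeks[-8:-4]) / 4   # 1075.0
recent_2 = sum(weeks[-2:]) / 2    # 7000.0
mid_2 = sum(weeks[-4:-2]) / 2     # 2750.0

growth_ratio = recent_4 / max(prior_4, 1)  # ~4.53
acceleration = recent_2 / max(mid_2, 1)    # ~2.55, still speeding up
noise_factor = 1.0                         # inside the 500-500K sweet spot
velocity_score = round(growth_ratio * acceleration * noise_factor * 100, 1)

print(velocity_score)  # 1154.3 -- far above the 80-point breakout threshold
assert velocity_score > 80 and growth_ratio >= 1.5  # classifies as "breakout"
```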
```bash
python3 << 'PYEOF'
import json

raw_results = json.load(open("/tmp/npl-download-data.json"))
scored = []

for item in raw_results:
    pkg = item["package"]
    weeks = item["weeks"]
    status = item["status"]

    if status != "ok" or len(weeks) < 4:
        scored.append({**item, "velocity_score": 0, "growth_pct": 0, "tier": "insufficient_data",
                       "recent_4_avg": 0, "prior_4_avg": 0})
        continue

    recent_4 = sum(weeks[-4:]) / 4
    # Prior 4-week average; with fewer than 8 weeks, fall back to the earliest weeks
    if len(weeks) >= 8:
        prior_4 = sum(weeks[-8:-4]) / 4
    else:
        prior_4 = sum(weeks[:4]) / max(len(weeks[:4]), 1)
    recent_2 = sum(weeks[-2:]) / 2
    mid_2 = sum(weeks[-4:-2]) / 2

    growth_ratio = recent_4 / max(prior_4, 1)
    acceleration = recent_2 / max(mid_2, 1)
    growth_pct = round((growth_ratio - 1) * 100, 1)

    # Sweet spot multiplier: 500-500K weekly downloads
    if recent_4 < 500:
        noise_factor = max(recent_4 / 500, 0.1)
    elif recent_4 > 500_000:
        noise_factor = max(500_000 / recent_4, 0.1)
    else:
        noise_factor = 1.0

    velocity_score = round(growth_ratio * acceleration * noise_factor * 100, 1)

    # Classify
    if velocity_score > 80 and 500 < recent_4 < 500_000 and growth_ratio >= 1.5:
        tier = "breakout"
    elif velocity_score > 40 and recent_4 >= 500 and growth_ratio >= 1.2:
        tier = "watching"
    elif recent_4 < 500:
        tier = "too_early"
    elif recent_4 >= 500_000:
        tier = "established"
    else:
        tier = "steady"

    scored.append({
        **item,
        "velocity_score": velocity_score,
        "growth_pct": growth_pct,
        "recent_4_avg": round(recent_4),
        "prior_4_avg": round(prior_4),
        "tier": tier
    })

# Sort by velocity_score descending
scored.sort(key=lambda x: x["velocity_score"], reverse=True)

json.dump(scored, open("/tmp/npl-scored.json", "w"), indent=2)

breakout = [p for p in scored if p["tier"] == "breakout"]
watching = [p for p in scored if p["tier"] == "watching"]
too_early = [p for p in scored if p["tier"] == "too_early"]

print("Velocity scoring complete:")
print(f"  BREAKOUT: {len(breakout)}")
print(f"  WATCHING: {len(watching)}")
print(f"  STEADY/ESTABLISHED: {len([p for p in scored if p['tier'] in ('steady', 'established')])}")
print(f"  TOO EARLY (<500/week): {len(too_early)}")
print()
for p in scored[:10]:
    print(f"  {p['tier'].upper():12} {p['package']:30} score={p['velocity_score']:6.1f} "
          f"{p['recent_4_avg']:>8,}/wk growth={p['growth_pct']:+.0f}%")

# Stop if nothing worth analyzing
if not breakout and not watching:
    all_too_early = all(p["tier"] in ("too_early", "insufficient_data") for p in scored)
    if all_too_early:
        print("\nERROR: All packages are below the 500 weekly downloads threshold for reliable velocity analysis.")
        print("Try packages with more community adoption.")
        import sys; sys.exit(1)
PYEOF
```

**If all packages are below 500/week:** Stop with the message above.

---

## Step 5: Fetch Maintainer Profiles

Only for breakout and watching packages. Fetch npm registry metadata, then GitHub user profiles.

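The registry's `repository` field carries its URL in several shapes; the enrichment step pulls owner and repo out with a regex. A sketch of that pattern on a few common formats (the URLs are examples, nothing is fetched):

```python
import re

# Common shapes of the npm registry "repository" URL field
examples = [
    "git+https://github.com/honojs/hono.git",
    "git@github.com:colinhacks/zod.git",
    "github.com/fabian-hiller/valibot",
]

# Same pattern as the enrichment step; note [^/.]+ stops the repo capture
# before the ".git" suffix, so no extra stripping is needed
pattern = r"github\.com[/:]([^/]+)/([^/.]+)"

for repo_url in examples:
    m = re.search(pattern, repo_url)
    print(m.group(1), m.group(2))
# -> honojs hono
#    colinhacks zod
#    fabian-hiller valibot
```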
```bash
python3 << 'PYEOF'
import json, re, os, time
import urllib.request, urllib.error

scored = json.load(open("/tmp/npl-scored.json"))
token = os.environ.get("GITHUB_TOKEN", "")

gh_headers = {"Accept": "application/vnd.github+json", "User-Agent": "npm-downloads-to-leads/1.0"}
if token:
    gh_headers["Authorization"] = f"Bearer {token}"

target_packages = [p for p in scored if p["tier"] in ("breakout", "watching")]
print(f"Fetching profiles for {len(target_packages)} packages (breakout + watching)...")

gh_rate_remaining = 999
enriched = []

for item in target_packages:
    pkg = item["package"]
    profile = {"package": pkg, "npm_maintainers": [], "description": "", "keywords": [],
               "github_owner": None, "github_repo": None, "github_users": [], "npm_homepage": ""}

    # --- npm registry ---
    encoded = pkg.replace("@", "%40").replace("/", "%2F")
    reg_url = f"https://registry.npmjs.org/{encoded}"
    try:
        req = urllib.request.Request(reg_url, headers={"User-Agent": "npm-downloads-to-leads/1.0"})
        with urllib.request.urlopen(req, timeout=20) as resp:
            reg = json.loads(resp.read())

        profile["description"] = reg.get("description", "")
        profile["keywords"] = (reg.get("keywords") or [])[:6]
        profile["npm_homepage"] = reg.get("homepage", "")
        profile["npm_maintainers"] = [m.get("name", "") for m in reg.get("maintainers", []) if m.get("name")]

        # Extract GitHub owner from the repository URL
        repo_field = reg.get("repository") or {}
        if isinstance(repo_field, dict):
            repo_url = repo_field.get("url", "")
        else:
            repo_url = str(repo_field)
        gh_match = re.search(r"github\.com[/:]([^/]+)/([^/.]+)", repo_url)
        if gh_match:
            profile["github_owner"] = gh_match.group(1)
            profile["github_repo"] = gh_match.group(2)  # pattern already stops before ".git"

        print(f"  {pkg}: registry OK | maintainers={profile['npm_maintainers'][:3]} | "
              f"github_owner={profile['github_owner']}")
    except Exception as e:
        print(f"  {pkg}: registry fetch failed ({e})")

    time.sleep(0.1)

    # --- GitHub user profiles ---
    candidates = []
    if profile["github_owner"]:
        candidates.append(profile["github_owner"])
    # Also try npm maintainer usernames (often match GitHub)
    for m in profile["npm_maintainers"][:2]:
        if m and m not in candidates:
            candidates.append(m)

    for username in candidates[:3]:
        if gh_rate_remaining <= 5:
            print(f"  GitHub rate limit low ({gh_rate_remaining} remaining) -- skipping {username}")
            break

        gh_url = f"https://api.github.com/users/{username}"
        req = urllib.request.Request(gh_url, headers=gh_headers)
        try:
            with urllib.request.urlopen(req, timeout=15) as resp:
                gh_rate_remaining = int(resp.headers.get("X-RateLimit-Remaining", 999))
                gh_data = json.loads(resp.read())

            profile["github_users"].append({
                "username": username,
                "name": gh_data.get("name") or username,
                "twitter_username": gh_data.get("twitter_username") or "not found on GitHub",
                "bio": gh_data.get("bio") or "",
                "blog": gh_data.get("blog") or "",
                "company": gh_data.get("company") or "",
                "followers": gh_data.get("followers", 0),
                "public_repos": gh_data.get("public_repos", 0),
                "github_url": gh_data.get("html_url", f"https://github.com/{username}")
            })
            print(f"    GitHub @{username}: {gh_data.get('followers', 0)} followers | "
                  f"twitter={gh_data.get('twitter_username') or 'none'} | rate_remaining={gh_rate_remaining}")
        except urllib.error.HTTPError as e:
            if e.code == 404:
                print(f"    GitHub @{username}: not found")
            else:
                print(f"    GitHub @{username}: HTTP {e.code}")
        except Exception as e:
            print(f"    GitHub @{username}: failed ({e})")

        time.sleep(0.2)

    enriched.append({**item, "profile": profile})

json.dump(enriched, open("/tmp/npl-enriched.json", "w"), indent=2)
json.dump(scored, open("/tmp/npl-scored.json", "w"), indent=2)
print(f"\nEnrichment complete. Profiles fetched: {len(enriched)}")
print(f"GitHub rate limit remaining: {gh_rate_remaining}")
PYEOF
```

---

## Step 6: Generate Lead Briefs

Print the enriched breakout and watching packages, then generate lead briefs and outreach messages.

```bash
python3 << 'PYEOF'
import json

enriched = json.load(open("/tmp/npl-enriched.json"))
input_data = json.load(open("/tmp/npl-input.json"))
product_context = input_data.get("product_context", "")

breakout = [p for p in enriched if p["tier"] == "breakout"]
watching = [p for p in enriched if p["tier"] == "watching"]

print("=== DATA FOR LEAD BRIEF GENERATION ===")
print(f"Product context: {product_context or '(none provided)'}")
print()

for item in breakout + watching:
    pkg = item["package"]
    prof = item.get("profile", {})
    gh_users = prof.get("github_users", [])
    primary_gh = gh_users[0] if gh_users else {}

    print(f"PACKAGE: {pkg} ({item['tier'].upper()})")
    print(f"  Velocity score: {item['velocity_score']} | Growth: {item['growth_pct']:+.0f}%")
    print(f"  Recent 4-week avg: {item['recent_4_avg']:,}/week | Prior 4-week avg: {item['prior_4_avg']:,}/week")
    print(f"  Weekly trend (last 8): {item['weeks'][-8:]}")
    print(f"  Description: {prof.get('description', 'none')}")
    print(f"  Keywords: {', '.join(prof.get('keywords', []))}")
    print(f"  npm maintainers: {', '.join(prof.get('npm_maintainers', []))}")
    if primary_gh:
        print(f"  GitHub: @{primary_gh.get('username')} | {primary_gh.get('followers')} followers | "
              f"{primary_gh.get('public_repos')} repos")
        print(f"  Twitter: {primary_gh.get('twitter_username')}")
        print(f"  Bio: {primary_gh.get('bio')}")
        print(f"  Company: {primary_gh.get('company')}")
    else:
        print("  GitHub: no profile found")
    print()
PYEOF
```

Using the package data printed above, generate a lead brief for each BREAKOUT and WATCHING package.

Rules:
- Every growth number in the brief must come from the printed data -- do not round or modify
- Every GitHub handle and Twitter username must come from the printed data -- write "not found on GitHub" if the field says that
- "Why reach out now" must reference the specific growth inflection (weeks, numbers) from the data
- "Suggested first message" must name the package and its growth, and if product_context was provided, connect it specifically to that context
- No em dashes. No forbidden words: powerful, robust, seamless, innovative, game-changing, streamline, leverage, transform

Write your lead briefs to `/tmp/npl-briefs.json` with this exact structure:

```json
{
  "lead_briefs": [
    {
      "package": "pkg-name",
      "tier": "breakout",
      "growth_summary": "1-sentence summary of the growth numbers",
      "maintainer_handle": "@github_handle or npm username if no GitHub found",
      "twitter": "@handle or not found on GitHub",
      "github_followers": 0,
      "why_now": "2-3 sentences specific to this package's inflection point",
      "suggested_message": "2-4 sentences. Names the package, the growth, and connects to product_context if provided."
    }
  ]
}
```

After writing the file, confirm with:

```bash
python3 -c "
import json
d = json.load(open('/tmp/npl-briefs.json'))
print(f'Lead briefs generated: {len(d.get(\"lead_briefs\", []))}')
for b in d['lead_briefs']:
    print(f'  {b[\"package\"]} ({b[\"tier\"]}): maintainer={b[\"maintainer_handle\"]}')
"
```

---

## Step 7: Self-QA

```bash
python3 << 'PYEOF'
import json

scored = json.load(open("/tmp/npl-scored.json"))
enriched = json.load(open("/tmp/npl-enriched.json"))
briefs = json.load(open("/tmp/npl-briefs.json"))

failures = []

# Verify: every brief has a real package name from the scored list
real_packages = {p["package"] for p in scored}
for brief in briefs.get("lead_briefs", []):
    if brief.get("package") not in real_packages:
        failures.append(f"Brief for unknown package '{brief.get('package')}' -- removed")

briefs["lead_briefs"] = [b for b in briefs.get("lead_briefs", []) if b.get("package") in real_packages]

# Verify: the scored list is sorted by velocity_score descending
if any(scored[i]["velocity_score"] < scored[i + 1]["velocity_score"] for i in range(len(scored) - 1)):
    failures.append("Scored list not sorted by velocity_score -- re-sorted")
    scored.sort(key=lambda x: x["velocity_score"], reverse=True)

# Verify: no GitHub/Twitter handles in briefs that weren't in GitHub API responses
enriched_gh = {}
for item in enriched:
    for gh_user in item.get("profile", {}).get("github_users", []):
        enriched_gh[gh_user["username"]] = gh_user.get("twitter_username", "not found on GitHub")

for brief in briefs.get("lead_briefs", []):
    twitter = brief.get("twitter", "")
    if twitter and not twitter.startswith("not found"):
        # Verify it came from the API
        found = any(twitter.lstrip("@") == v.lstrip("@") for v in enriched_gh.values() if v != "not found on GitHub")
        if not found:
            failures.append(f"Warning: Twitter handle '{twitter}' for {brief['package']} not verified in GitHub API data")

# Check required fields
for brief in briefs.get("lead_briefs", []):
    for field in ["package", "tier", "growth_summary", "maintainer_handle", "twitter", "why_now", "suggested_message"]:
        if not brief.get(field):
            failures.append(f"Missing field '{field}' in brief for {brief.get('package', '?')}")

# Check for em dashes
briefs_str = json.dumps(briefs)
if "\u2014" in briefs_str:
    briefs_str = briefs_str.replace("\u2014", " - ")
    briefs = json.loads(briefs_str)
    failures.append("Fixed: em dash characters removed from briefs")

# Check for forbidden words
forbidden = ["powerful", "robust", "seamless", "innovative", "game-changing", "streamline", "leverage", "transform"]
full_text = json.dumps(briefs).lower()
for word in forbidden:
    if word in full_text:
        failures.append(f"Warning: forbidden word '{word}' found in briefs -- review before presenting")

output = {
    "scored": scored,
    "enriched": enriched,
    "briefs": briefs,
    "data_quality_flags": failures
}

json.dump(output, open("/tmp/npl-output.json", "w"), indent=2)
print(f"QA complete. Issues: {len(failures)}")
for f in failures:
    print(f"  - {f}")
if not failures:
    print("All QA checks passed.")
PYEOF
```

---

## Step 8: Save and Present Output

```bash
python3 << 'PYEOF'
import json, os
from datetime import datetime, timezone

output = json.load(open("/tmp/npl-output.json"))
scored = output["scored"]
enriched_map = {e["package"]: e for e in output["enriched"]}
briefs_map = {b["package"]: b for b in output["briefs"].get("lead_briefs", [])}
flags = output["data_quality_flags"]
date_str = datetime.now(tz=timezone.utc).strftime("%Y-%m-%d")

breakout = [p for p in scored if p["tier"] == "breakout"]
watching = [p for p in scored if p["tier"] == "watching"]
too_early = [p for p in scored if p["tier"] == "too_early"]
established = [p for p in scored if p["tier"] == "established"]

lines = [
    "## npm Breakout Report",
    f"Packages analyzed: {len(scored)} | Breakout: {len(breakout)} | Watching: {len(watching)} | Date: {date_str}",
    "",
    "---",
    "",
    "### Velocity Leaderboard",
    "",
    "| Rank | Package | Weekly Downloads | 8-Week Growth | Velocity Score | Status |",
    "|---|---|---|---|---|---|",
]

for i, pkg in enumerate(scored[:15], 1):
    status_label = {"breakout": "BREAKOUT", "watching": "WATCHING", "steady": "steady",
                    "established": "established", "too_early": "too early",
                    "insufficient_data": "no data"}.get(pkg["tier"], pkg["tier"])
    growth_str = f"{pkg['growth_pct']:+.0f}%" if pkg.get("growth_pct") else "n/a"
    lines.append(
        f"| {i} | {pkg['package']} | {pkg['recent_4_avg']:,} | {growth_str} | "
        f"{pkg['velocity_score']} | {status_label} |"
    )

lines += ["", "---", ""]

if breakout or watching:
    lines += ["### Lead Briefs", ""]

for item in breakout + watching:
    pkg = item["package"]
    brief = briefs_map.get(pkg, {})
    profile = enriched_map.get(pkg, {}).get("profile", {})
    gh_users = profile.get("github_users", [])
    primary_gh = gh_users[0] if gh_users else {}

    lines.append(f"#### {pkg} ({item['tier'].upper()})")
    lines.append(f"Weekly downloads: {item['recent_4_avg']:,}/week (was {item['prior_4_avg']:,} -- {item['growth_pct']:+.0f}% growth over 8 weeks)")
    if profile.get("description"):
        lines.append(f"What it does: {profile['description']}")
    if profile.get("keywords"):
        lines.append(f"Keywords: {', '.join(profile['keywords'])}")
    lines.append("")

    if primary_gh:
        lines.append(f"**Maintainer: @{primary_gh.get('username')}**")
        lines.append(f"- GitHub: {primary_gh.get('followers', 0):,} followers | {primary_gh.get('public_repos', 0)} public repos")
        lines.append(f"- Twitter: {primary_gh.get('twitter_username', 'not found on GitHub')}")
        if primary_gh.get("bio"):
            lines.append(f"- Bio: \"{primary_gh['bio']}\"")
        if primary_gh.get("company"):
            lines.append(f"- Company: {primary_gh['company']}")
        if primary_gh.get("blog"):
            lines.append(f"- Website: {primary_gh['blog']}")
    elif profile.get("npm_maintainers"):
        lines.append(f"**Maintainer (npm only):** {', '.join(profile['npm_maintainers'][:3])}")
        lines.append("- GitHub profile: not found")

    lines.append("")
    if brief.get("why_now"):
        lines.append(f"**Why reach out now:** {brief['why_now']}")
    if brief.get("suggested_message"):
        lines.append("\n**Suggested first message:**")
        lines.append(f"> {brief['suggested_message']}")
    lines.append("")
    lines.append("---")
    lines.append("")

if too_early:
    lines += [f"### Too Early ({len(too_early)} packages below 500 weekly downloads)", ""]
    for p in too_early:
        lines.append(f"- {p['package']}: ~{p['recent_4_avg']:,}/week -- revisit when above 500/week")
    lines.append("")

if established:
    lines += ["### Established Packages (above 500K/week, velocity less meaningful)", ""]
    for p in established:
        lines.append(f"- {p['package']}: ~{p['recent_4_avg']:,}/week")
    lines.append("")

lines += ["---", ""]
lines.append(f"Data quality notes: {'; '.join(flags) if flags else 'None'}")

output_path = f"docs/npm-leads/{date_str}.md"
os.makedirs("docs/npm-leads", exist_ok=True)
open(output_path, "w").write("\n".join(lines))

print("\n".join(lines))
print(f"\nSaved to: {output_path}")
PYEOF
```

Clean up temp files:

```bash
rm -f /tmp/npl-input.json /tmp/npl-download-data.json /tmp/npl-scored.json \
      /tmp/npl-enriched.json /tmp/npl-briefs.json /tmp/npl-output.json
```