@pencil-agent/nano-pencil 2.0.0-beta.8 → 2.0.0

Files changed (241)
  1. package/README.md +267 -267
  2. package/dist/build-meta.json +3 -3
  3. package/dist/core/export-html/AGENT.md +11 -11
  4. package/dist/core/export-html/template.css +971 -971
  5. package/dist/core/export-html/template.html +54 -54
  6. package/dist/core/extensions-host/index.d.ts +1 -1
  7. package/dist/core/extensions-host/loader.js +1 -1
  8. package/dist/core/extensions-host/runner.d.ts +1 -0
  9. package/dist/core/extensions-host/runner.js +2 -2
  10. package/dist/core/extensions-host/types.d.ts +17 -22
  11. package/dist/core/lib/ai/src/types.d.ts +12 -2
  12. package/dist/core/persona/persona-manager.js +5 -2
  13. package/dist/core/runtime/agent-session.js +3 -3
  14. package/dist/core/runtime/extension-core-bindings.d.ts +1 -0
  15. package/dist/core/runtime/extension-core-bindings.js +2 -2
  16. package/dist/extensions/builtin/AGENT.md +115 -115
  17. package/dist/extensions/builtin/browser/AGENT.md +17 -17
  18. package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
  19. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
  20. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
  21. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
  22. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
  23. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
  24. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
  25. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
  26. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
  27. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
  28. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
  29. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
  30. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
  31. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
  32. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
  33. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
  34. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
  35. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
  36. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
  37. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
  38. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
  39. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
  40. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
  41. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
  42. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
  43. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
  44. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
  45. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
  46. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
  47. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
  48. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
  49. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
  50. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
  51. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
  52. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
  53. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
  54. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
  55. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
  56. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
  57. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
  58. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
  59. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
  60. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
  61. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
  62. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
  63. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
  64. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
  65. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
  66. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
  67. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
  68. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
  69. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
  70. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
  71. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
  72. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
  73. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
  74. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
  75. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
  76. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
  77. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
  78. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
  79. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
  80. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
  81. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
  82. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
  83. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
  84. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
  85. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
  86. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
  87. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
  88. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
  89. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
  90. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
  91. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
  92. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
  93. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
  94. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
  95. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
  96. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
  97. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
  98. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
  99. package/dist/extensions/builtin/browser/browser.md +73 -73
  100. package/dist/extensions/builtin/browser/install.md +142 -142
  101. package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
  102. package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
  103. package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
  104. package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
  105. package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
  106. package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
  107. package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
  108. package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
  109. package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
  110. package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
  111. package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
  112. package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
  113. package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
  114. package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
  115. package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
  116. package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
  117. package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
  118. package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
  119. package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
  120. package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
  121. package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
  122. package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
  123. package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
  124. package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
  125. package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
  126. package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
  127. package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
  128. package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
  129. package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
  130. package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
  131. package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
  132. package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
  133. package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
  134. package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
  135. package/dist/extensions/builtin/goal/README.md +67 -67
  136. package/dist/extensions/builtin/goal/goal-controller.d.ts +39 -10
  137. package/dist/extensions/builtin/goal/goal-controller.js +1 -1
  138. package/dist/extensions/builtin/goal/goal-format.js +1 -1
  139. package/dist/extensions/builtin/goal/goal-prompts.d.ts +2 -0
  140. package/dist/extensions/builtin/goal/goal-prompts.js +5 -4
  141. package/dist/extensions/builtin/goal/goal-store.js +1 -1
  142. package/dist/extensions/builtin/goal/index.d.ts +1 -1
  143. package/dist/extensions/builtin/goal/index.js +10 -7
  144. package/dist/extensions/builtin/grub/README.md +112 -112
  145. package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
  146. package/dist/extensions/builtin/link-world/index.js +6 -6
  147. package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
  148. package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
  149. package/dist/extensions/builtin/link-world/linkworld.md +313 -313
  150. package/dist/extensions/builtin/link-world/{network-routing.md → network-routing/network-routing.md} +67 -67
  151. package/dist/extensions/builtin/loop/README.md +92 -92
  152. package/dist/extensions/builtin/mcp/figma-design.md +68 -68
  153. package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
  154. package/dist/extensions/builtin/plan/index.js +1 -1
  155. package/dist/extensions/builtin/recap/AGENT.md +15 -15
  156. package/dist/extensions/builtin/sal/README.md +72 -72
  157. package/dist/extensions/builtin/security-audit/README.md +289 -289
  158. package/dist/extensions/builtin/task/task-store.d.ts +4 -0
  159. package/dist/extensions/builtin/task/task-store.js +1 -1
  160. package/dist/extensions/builtin/team/AGENT.md +112 -112
  161. package/dist/extensions/builtin/team/TESTING.md +299 -299
  162. package/dist/extensions/builtin/token-save/README.md +56 -56
  163. package/dist/extensions/optional/AGENT.md +10 -10
  164. package/dist/index.d.ts +5 -30
  165. package/dist/index.js +1 -1
  166. package/dist/models.d.ts +7 -0
  167. package/dist/models.js +1 -0
  168. package/dist/modes/interactive/components/footer.js +1 -1
  169. package/dist/modes/interactive/components/task-status-panel.d.ts +36 -0
  170. package/dist/modes/interactive/components/task-status-panel.js +1 -0
  171. package/dist/modes/interactive/controllers/stream-render-controller.d.ts +7 -0
  172. package/dist/modes/interactive/controllers/stream-render-controller.js +2 -2
  173. package/dist/modes/interactive/interactive-mode.js +40 -40
  174. package/dist/modes/interactive/state/interactive-state.d.ts +2 -0
  175. package/dist/modes/interactive/state/interactive-state.js +1 -1
  176. package/dist/modes/interactive/theme/dark.json +85 -85
  177. package/dist/modes/interactive/theme/light.json +84 -84
  178. package/dist/modes/interactive/theme/theme-schema.json +335 -335
  179. package/dist/modes/interactive/theme/warm.json +81 -81
  180. package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
  181. package/dist/node_modules/@pencil-agent/ai/dist/models.generated.js +1 -1
  182. package/dist/node_modules/@pencil-agent/ai/dist/providers/anthropic.js +2 -2
  183. package/dist/node_modules/@pencil-agent/ai/dist/providers/openai-completions.js +5 -5
  184. package/dist/node_modules/@pencil-agent/ai/dist/providers/openai-responses.js +1 -1
  185. package/dist/node_modules/@pencil-agent/ai/dist/stream.js +1 -1
  186. package/dist/packages/protocol/src/commands.d.ts +33 -0
  187. package/dist/packages/protocol/src/flags.d.ts +20 -0
  188. package/dist/packages/protocol/src/hooks.d.ts +17 -0
  189. package/dist/packages/protocol/src/hooks.js +0 -0
  190. package/dist/packages/{extension-sdk → protocol}/src/index.d.ts +7 -4
  191. package/dist/packages/protocol/src/index.js +1 -0
  192. package/dist/packages/{extension-sdk → protocol}/src/lifecycle.d.ts +15 -27
  193. package/dist/packages/protocol/src/lifecycle.js +0 -0
  194. package/dist/packages/{extension-sdk → protocol}/src/tools.d.ts +1 -1
  195. package/dist/packages/protocol/src/tools.js +0 -0
  196. package/dist/public-config.d.ts +12 -0
  197. package/dist/public-config.js +1 -0
  198. package/dist/runtime.d.ts +9 -0
  199. package/dist/runtime.js +1 -0
  200. package/dist/session-compaction.d.ts +7 -0
  201. package/dist/session-compaction.js +1 -0
  202. package/dist/session.d.ts +7 -0
  203. package/dist/session.js +1 -0
  204. package/dist/skills.d.ts +7 -0
  205. package/dist/skills.js +1 -0
  206. package/dist/tools.d.ts +7 -0
  207. package/dist/tools.js +1 -0
  208. package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +851 -0
  209. package/docs/SDK-TESTING.md +364 -0
  210. package/docs/codex-goal-command-impl.md +1055 -1055
  211. package/docs/codex-goal-vs-grub.md +500 -500
  212. package/docs/custom-provider.md +27 -27
  213. package/docs/extensions.md +27 -27
  214. package/docs/keybindings.md +27 -27
  215. package/docs/loop 重构完成总结.md +250 -250
  216. package/docs/loop 重构完成报告.md +122 -122
  217. package/docs/loop 重构方案.md +1222 -1222
  218. package/docs/loop 重构方案实现报告.md +158 -158
  219. package/docs/loop 重构方案对比分析.md +128 -128
  220. package/docs/loop 重构计划.md +320 -320
  221. package/docs/loop-usage-examples.md +214 -214
  222. package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +593 -0
  223. package/docs/models.md +27 -27
  224. package/docs/packages.md +27 -27
  225. package/docs/pi-design-philosophy.md +457 -457
  226. package/docs/planmode.md +1987 -1987
  227. package/docs/prompt-templates.md +27 -27
  228. package/docs/providers.md +27 -27
  229. package/docs/sdk.md +27 -27
  230. package/docs/skills.md +27 -27
  231. package/docs/startup-performance-optimization.md +301 -0
  232. package/docs/themes.md +27 -27
  233. package/docs/tui.md +27 -27
  234. package/docs/认知地图.md +47 -0
  235. package/package.json +190 -162
  236. package/dist/packages/extension-sdk/src/index.js +0 -1
  237. package/docs/cc-agent-design.md +0 -1297
  238. package/docs/cc-tui-design.md +0 -1333
  239. package/docs//345/257/271/346/240/207Claude-Code.md +0 -1775
  240. /package/dist/packages/{extension-sdk/src/lifecycle.js → protocol/src/commands.js} +0 -0
  241. /package/dist/packages/{extension-sdk/src/tools.js → protocol/src/flags.js} +0 -0
@@ -1,478 +1,478 @@
1
- # npm & PyPI — Package Registry Data Extraction
2
-
3
- `https://registry.npmjs.org` · `https://api.npmjs.org` · `https://pypi.org` · `https://pypistats.org`
4
-
5
- Both registries expose full JSON APIs with no auth required. Never use a browser — every data point is available over HTTP.
6
-
7
- Tested 2026-04-18 with `uv run python` + `http_get`.
8
-
9
- ---
10
-
11
- ## Latency reference (measured)
12
-
13
- | Endpoint | Latency |
14
- |----------|---------|
15
- | PyPI package JSON | ~80ms |
16
- | npm downloads point | ~110ms |
17
- | npm registry full doc (react = 6.3MB) | ~280ms |
18
- | npm registry search | ~330ms |
19
- | pypistats.org recent | ~480ms |
20
-
21
- ---
22
-
23
- ## npm Registry
24
-
25
- ### Package metadata
26
-
27
- Two endpoints — pick based on what you need:
28
-
29
- **Full registry document** — includes all version history, time map, author, bugs, homepage, keywords, README (when present). Large for popular packages (react = 6.3MB).
30
-
31
- ```python
32
- import json
33
- data = json.loads(http_get("https://registry.npmjs.org/react"))
34
-
35
- # Top-level keys: _id, name, dist-tags, versions, time, bugs, author,
36
- # license, homepage, keywords, repository, description,
37
- # contributors, maintainers, readme, readmeFilename, users
38
- print(data['name']) # 'react'
39
- print(data['dist-tags']['latest']) # '19.2.5'
40
- print(data['time']['created']) # '2011-10-26T17:46:21.942Z'
41
- print(data['time']['modified']) # '2026-04-18T00:57:09.913Z'
42
-
43
- latest = data['dist-tags']['latest']
44
- v = data['versions'][latest]
45
- # Version object keys: name, version, description, license, keywords,
46
- # homepage, bugs, repository, engines, exports, main, scripts,
47
- # dependencies, devDependencies, peerDependencies, dist, maintainers,
48
- # _npmUser, _nodeVersion, _npmVersion
49
- print(v['description']) # 'React is a JavaScript library...'
50
- print(v['license']) # 'MIT'
51
- print(list(v.get('dependencies', {}).keys())) # [] (react 19 has no runtime deps)
52
- print(v.get('homepage')) # 'https://react.dev/'
53
- print(len(data['versions'])) # 2785 — all published versions
54
- ```
55
-
56
- **Single version endpoint** — 1–2KB instead of megabytes. Use when you only need one version's data.
57
-
58
- ```python
59
- import json
60
- # Fetch a specific version
61
- v = json.loads(http_get("https://registry.npmjs.org/react/19.2.5"))
62
- print(v['name'], v['version'], v['description'])
63
-
64
- # Fetch latest directly (no need to resolve dist-tags first)
65
- v = json.loads(http_get("https://registry.npmjs.org/react/latest"))
66
- print(v['version']) # '19.2.5'
67
- ```
68
-
69
- **Abbreviated document** — skips time map and (in theory) README; versions dict still present. Use `Accept` header.
70
-
71
- ```python
72
- import json, urllib.request, gzip
73
-
74
- req = urllib.request.Request(
75
-     "https://registry.npmjs.org/react",
76
-     headers={
77
-         "Accept": "application/vnd.npm.install-v1+json",
78
-         "Accept-Encoding": "gzip"
79
-     }
80
- )
81
- with urllib.request.urlopen(req, timeout=20) as r:
82
-     raw = r.read()
83
-     if r.headers.get("Content-Encoding") == "gzip":
84
-         raw = gzip.decompress(raw)
85
- data = json.loads(raw)
86
- # Keys: name, dist-tags, versions, modified (no time map, no readme)
87
- print(data['dist-tags']['latest']) # '19.2.5' (react, the package requested above)
88
- ```
89
-
90
- Note: abbreviated is still large (react: 2.7MB) — use single-version endpoint when possible.
91
-
92
- ### Scoped packages
93
-
94
- Scoped packages (`@scope/name`) work with a direct path — no encoding needed:
95
-
96
- ```python
97
- import json
98
- data = json.loads(http_get("https://registry.npmjs.org/@playwright/test"))
99
- print(data['name']) # '@playwright/test'
100
- print(data['dist-tags']['latest']) # '1.59.1'
101
- print(len(data['versions'])) # 3148
102
- ```
103
-
104
- If constructing URLs dynamically, either form works:
105
- ```python
106
- # Direct path (preferred)
107
- url = f"https://registry.npmjs.org/{pkg}" # '@playwright/test'
108
- # URL-encoded slash
109
- url = f"https://registry.npmjs.org/{pkg.replace('/', '%2F')}"
110
- ```
111
-
112
- ### Download statistics
113
-
114
- The npm downloads API is separate from the registry and very fast (~110ms).
115
-
116
- **Point query** — single number for a period:
117
-
118
- ```python
119
- import json
120
-
121
- # Supported periods: last-day, last-week, last-month, last-year
122
- # Also accepts ISO date ranges: YYYY-MM-DD:YYYY-MM-DD
123
-
124
- stats = json.loads(http_get("https://api.npmjs.org/downloads/point/last-week/react"))
125
- print(stats['downloads']) # 123302510
126
- print(stats['start']) # '2026-04-11'
127
- print(stats['end']) # '2026-04-17'
128
- print(stats['package']) # 'react'
129
-
130
- # Confirmed values (2026-04-18):
131
- # last-day: 19,411,762
132
- # last-week: 123,302,510
133
- # last-month: 502,719,511
134
- # last-year: 3,000,644,845
135
- ```
136
-
137
- **Bulk point query** — up to ~128 packages in one call, comma-separated:
138
-
139
- ```python
140
- import json
141
-
142
- bulk = json.loads(http_get(
143
-     "https://api.npmjs.org/downloads/point/last-week/"
144
-     "react,vue,angular,webpack,typescript,eslint,jest,prettier,rollup,babel"
145
- ))
146
- # Returns dict keyed by package name
147
- for pkg, info in bulk.items():
148
-     print(f"{pkg}: {info['downloads']:,}")
149
- # react: 123,302,510
150
- # vue: 11,042,359
151
- # angular: 524,366
152
- # webpack: 44,425,549
153
- # typescript: 180,054,359
154
- # eslint: 126,113,686
155
- # jest: 43,394,412
156
- # prettier: 87,551,734
157
- # rollup: 103,431,439
158
- # babel: 139,207
159
- ```
160
-
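Review note: the ~128-package ceiling on bulk point queries means a longer package list has to be split across several calls. The sketch below shows one way to do that batching, taking the limit from the text above. `bulk_point_urls` is a hypothetical helper name, not part of the skill file, and sending scoped packages to individual point queries rests on the assumption that the bulk endpoint does not accept them.

```python
from typing import Iterable, List

BASE = "https://api.npmjs.org/downloads/point"
BULK_LIMIT = 128  # approximate ceiling quoted in the notes above


def bulk_point_urls(packages: Iterable[str], period: str = "last-week",
                    limit: int = BULK_LIMIT) -> List[str]:
    """Build the download-count URLs needed to cover every package.

    Unscoped names are grouped into comma-separated bulk calls of at most
    `limit` names. Scoped names (@scope/name) each get their own URL
    (assumption: the bulk endpoint rejects scoped names).
    """
    names = list(dict.fromkeys(packages))  # de-duplicate, keep first-seen order
    unscoped = [n for n in names if not n.startswith("@")]
    scoped = [n for n in names if n.startswith("@")]

    urls = [
        f"{BASE}/{period}/{','.join(unscoped[i:i + limit])}"
        for i in range(0, len(unscoped), limit)
    ]
    urls += [f"{BASE}/{period}/{name}" for name in scoped]
    return urls
```

A batch that ends up holding a single name is the same URL as a point query, so its response has the flat shape (`downloads`, `start`, `end`, `package`) rather than a dict keyed by package name.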
161
- **Range query** — downloads per day over a period:
162
-
163
- ```python
164
- import json
165
-
166
- resp = json.loads(http_get(
167
-     "https://api.npmjs.org/downloads/range/2025-01-01:2025-01-07/react"
168
- ))
169
- # resp['downloads'] is a list of {downloads, day} objects
170
- for entry in resp['downloads']:
171
-     print(entry['day'], entry['downloads'])
172
- # 2025-01-01 1336801
173
- # 2025-01-02 3288088
174
- # 2025-01-03 3381680
175
- # ...
176
- ```
177
-
178
- ### Search
179
-
180
- ```python
181
- import json
182
-
183
- # Fields: text, size (max ~250), from (offset), quality, popularity, maintenance weights
184
- data = json.loads(http_get(
185
-     "https://registry.npmjs.org/-/v1/search?text=browser+automation&size=5"
186
- ))
187
- print(data['total']) # total results matching the query
188
-
189
- for obj in data['objects']:
190
-     p = obj['package']
191
-     s = obj['score']
192
-     # p keys: name, version, description, keywords, date, links, publisher, maintainers
193
-     # s keys: final, detail.quality, detail.popularity, detail.maintenance
194
-     print(
195
-         p['name'],
196
-         p['version'],
197
-         f"{s['final']:.2f}",
198
-         p.get('description', '')[:60]
199
-     )
200
- # agent-browser 0.26.0 462.28 Browser automation CLI for AI agents
201
- # nightmare 3.0.2 306.64 A high-level browser automation library.
202
- ```
203
-
204
- Score breakdown (all three are 0–1 floats):
205
- - `quality` — code quality signals (tests, lint, TypeScript types)
206
- - `popularity` — download counts normalized
207
- - `maintenance` — release frequency, open issues
208
-
209
- `final` is a weighted combination and can exceed 1.0 for extremely popular packages.
210
-
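Review note: the `size` and `from` fields listed in the search example can be combined to walk a result set larger than one page. The sketch below only builds the page URLs. `search_page_urls` is a hypothetical helper, and the 250 ceiling is the approximate figure quoted in the example's comment, not a verified limit.

```python
from typing import Iterator
from urllib.parse import urlencode

SEARCH = "https://registry.npmjs.org/-/v1/search"
MAX_SIZE = 250  # approximate page-size ceiling quoted in the notes above


def search_page_urls(text: str, total: int, size: int = MAX_SIZE) -> Iterator[str]:
    """Yield one search URL per page until `total` results are covered."""
    size = max(1, min(size, MAX_SIZE))  # clamp to a usable page size
    for offset in range(0, total, size):
        # urlencode turns spaces into '+', matching the query style used above
        query = urlencode({"text": text, "size": size, "from": offset})
        yield f"{SEARCH}?{query}"
```

Feeding `data['total']` from the first response into `total` gives the remaining pages; whether the registry serves arbitrarily deep offsets is not established here.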
211
- ### Error handling
212
-
213
- ```python
214
- import json, urllib.error
215
-
216
- try:
217
-     data = json.loads(http_get("https://registry.npmjs.org/nonexistent-pkg-xyz"))
218
- except urllib.error.HTTPError as e:
219
-     # 404 for missing packages
220
-     print(e.code) # 404
221
-     print(json.loads(e.read())) # {'error': 'Not found'}
222
- ```
223
-
224
- ---
225
-
226
- ## PyPI
227
-
228
- ### Package metadata
229
-
230
- ```python
231
- import json
232
-
233
- # Latest version metadata
234
- data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
235
- info = data['info']
236
-
237
- # info keys (selected):
238
- print(info['name']) # 'requests'
239
- print(info['version']) # '2.33.1'
240
- print(info['summary']) # 'Python HTTP for Humans.'
241
- print(info['license']) # 'Apache-2.0'
242
- print(info['author']) # None (sometimes empty — check author_email)
243
- print(info['author_email']) # '"Kenneth Reitz" <me@kennethreitz.org>'
244
- print(info['requires_python']) # '>=3.10'
245
- print(info['home_page']) # None (may be empty — check project_urls)
246
- print(info['project_urls'])
247
- # {'Documentation': 'https://requests.readthedocs.io',
248
- # 'Source': 'https://github.com/psf/requests'}
249
-
250
- requires = info.get('requires_dist') or []
251
- print(requires[:5])
252
- # ['charset_normalizer<4,>=2', 'idna<4,>=2.5', 'urllib3<3,>=1.26',
253
- # 'certifi>=2023.5.7', 'PySocks!=1.5.7,>=1.5.6; extra == "socks"']
254
-
255
- print(info.get('classifiers', [])[:3])
256
- # ['Development Status :: 5 - Production/Stable',
257
- # 'Intended Audience :: Developers',
258
- # 'License :: OSI Approved :: Apache Software License']
259
-
260
- # data['urls'] — list of dist files for the latest version
261
- for f in data['urls']:
262
-     # keys: filename, packagetype, python_version, size, digests, url,
263
-     # upload_time, requires_python, yanked, yanked_reason
264
-     print(f['packagetype'], f['python_version'], f['filename'], f['size'])
265
- # bdist_wheel py3 requests-2.33.1-py3-none-any.whl 64947
266
- # sdist source requests-2.33.1.tar.gz 134120
267
- ```
268
-
269
- ### Specific version
270
-
271
- ```python
272
- import json
273
-
274
- # Fetch a pinned version (not just latest)
275
- data = json.loads(http_get("https://pypi.org/pypi/requests/2.32.3/json"))
276
- print(data['info']['version']) # '2.32.3'
277
- # Same structure as the latest endpoint
278
- ```
279
-
280
- ### Version history and yanked releases
281
-
282
- ```python
283
- import json
284
-
285
- data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
286
-
287
- # data['releases'] is a dict: version_string -> list of file objects
288
- versions = list(data['releases'].keys())
289
- print("Total versions:", len(versions)) # 159
290
- # Versions are insertion-ordered (chronological, oldest first)
291
- # dict key order is stable
292
-
293
- # Find yanked versions
294
- yanked = [
295
-     (ver, files[0]['yanked_reason'])
296
-     for ver, files in data['releases'].items()
297
-     if files and files[0].get('yanked')
298
- ]
299
- print(yanked[:2])
300
- # [('2.32.0', 'Yanked due to conflicts with CVE-2024-35195 mitigation'),
301
- # ('2.32.1', 'Yanked due to conflicts with CVE-2024-35195 mitigation ')]
302
-
303
- # info.yanked is True only if the LATEST version is yanked
304
- print(data['info']['yanked']) # False
305
- print(data['info']['yanked_reason']) # None
306
- ```
307
-
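Review note: the example above relies on the key order of `data['releases']` and inspects only `files[0]`. The sketch below orders versions by their upload timestamps and checks every file for the yanked flag. Both function names are hypothetical, and it assumes the file objects under `releases` carry the same `upload_time` and `yanked` keys listed for `data['urls']`.

```python
from typing import Dict, List, Tuple


def releases_by_upload_time(releases: Dict[str, List[dict]]) -> List[Tuple[str, str]]:
    """Return (version, first_upload_time) pairs, oldest first.

    Versions with no files (empty lists) are skipped because they carry
    no upload timestamp.
    """
    dated = [
        (version, min(f["upload_time"] for f in files))
        for version, files in releases.items()
        if files
    ]
    # ISO-8601 timestamps in one format sort correctly as plain strings.
    return sorted(dated, key=lambda pair: pair[1])


def fully_yanked(releases: Dict[str, List[dict]]) -> List[str]:
    """Versions where every file is yanked, not just the first one."""
    return [
        version
        for version, files in releases.items()
        if files and all(f.get("yanked") for f in files)
    ]
```

Usage would be `releases_by_upload_time(data['releases'])` on the response fetched above.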
308
- ### Download statistics (pypistats.org)
309
-
310
- PyPI does not expose download counts in its own JSON API. Use pypistats.org.
311
-
312
- ```python
313
- import json
314
-
315
- # Recent (last day/week/month) — fastest, single call
316
- stats = json.loads(http_get("https://pypistats.org/api/packages/requests/recent"))
317
- d = stats['data']
318
- print(d['last_day']) # 52969887
319
- print(d['last_week']) # 356556988
320
- print(d['last_month']) # 1385411770
321
-
322
- # Historical daily totals (overall, going back ~6 months)
323
- overall = json.loads(http_get("https://pypistats.org/api/packages/requests/overall"))
324
- # overall['data'] is list of {category, date, downloads}
325
- # category is 'with_mirrors' or 'without_mirrors'
326
- for row in overall['data'][:3]:
327
-     print(row['date'], row['category'], row['downloads'])
328
- # 2025-10-19 with_mirrors 21916634
329
- # 2025-10-19 without_mirrors 21882953
330
-
331
- # Without mirrors (pip installs only, more accurate for real usage):
332
- clean = json.loads(http_get(
333
-     "https://pypistats.org/api/packages/requests/overall?mirrors=false"
334
- ))
335
-
336
- # By Python major version
337
- by_python = json.loads(http_get(
338
- "https://pypistats.org/api/packages/requests/python_major"
339
- ))
340
- # data rows: {category: '3', date: '...', downloads: N}
341
-
342
- # By OS
343
- by_sys = json.loads(http_get(
344
- "https://pypistats.org/api/packages/requests/system"
345
- ))
346
- # data rows: {category: 'Darwin'|'Linux'|'Windows'|'other'|'null', date, downloads}
347
-
348
- # By Python minor version
349
- by_minor = json.loads(http_get(
350
- "https://pypistats.org/api/packages/requests/python_minor"
351
- ))
352
- ```
353
-
354
- ### Parallel fetch for multiple packages
355
-
356
- ```python
357
- import json
358
- from concurrent.futures import ThreadPoolExecutor
359
-
360
- packages = ['numpy', 'pandas', 'scikit-learn', 'torch', 'tensorflow']
361
-
362
- def get_pypi_info(pkg):
363
-     d = json.loads(http_get(f"https://pypi.org/pypi/{pkg}/json"))
364
-     return {
365
-         'name': pkg,
366
-         'version': d['info']['version'],
367
-         'summary': d['info']['summary'],
368
-         'requires_python': d['info']['requires_python'],
369
-     }
370
-
371
- with ThreadPoolExecutor(max_workers=5) as ex:
372
-     results = list(ex.map(get_pypi_info, packages))
373
-
374
- for r in results:
375
-     print(r['name'], r['version'], r['summary'][:50])
376
- # numpy 2.4.4 Fundamental package for array computing in Python
377
- # pandas 3.0.2 Powerful data structures for data analysis, time s
378
- # scikit-learn 1.8.0 A set of python modules for machine learning and d
379
- # torch 2.11.0 Tensors and Dynamic neural networks in Python with
380
- # tensorflow 2.21.0 TensorFlow is an open source machine learning fram
381
- ```
382
-
383
- ### Error handling
384
-
385
- ```python
386
- import json, urllib.error
387
-
388
- try:
389
-     data = json.loads(http_get("https://pypi.org/pypi/nonexistent-xyz-abc/json"))
390
- except urllib.error.HTTPError as e:
391
-     print(e.code) # 404
392
-     # Body is HTML, not JSON; don't try to parse it
393
- ```
394
-
395
- ---
396
-
397
- ## Parallel fetch patterns
398
-
399
- ### Mixed registry + stats in one shot
400
-
401
- ```python
402
- import json
403
- from concurrent.futures import ThreadPoolExecutor
404
-
405
- def npm_info(pkg):
406
-     # Use single-version endpoint (1-2KB) not full registry doc (MB)
407
-     v = json.loads(http_get(f"https://registry.npmjs.org/{pkg}/latest"))
408
-     s = json.loads(http_get(f"https://api.npmjs.org/downloads/point/last-month/{pkg}"))
409
-     return {'name': pkg, 'version': v['version'], 'downloads': s['downloads']}
410
-
411
- pkgs = ['react', 'vue', 'svelte', 'solid-js', 'preact']
412
- with ThreadPoolExecutor(max_workers=5) as ex:
413
-     results = list(ex.map(npm_info, pkgs))
414
- for r in results:
415
-     print(r['name'], r['version'], f"{r['downloads']:,}")
416
- ```
417
-
418
- ### npm bulk downloads (most efficient for many packages)
419
-
420
- ```python
421
- import json
422
-
423
- # Up to ~128 packages in one HTTP call
424
- pkgs = ['react', 'vue', 'angular', 'svelte']
425
- bulk = json.loads(http_get(
426
- f"https://api.npmjs.org/downloads/point/last-week/{','.join(pkgs)}"
427
- ))
428
- # Returns: {pkg_name: {'downloads': N, 'start': '...', 'end': '...', 'package': '...'}, ...}
429
- sorted_pkgs = sorted(bulk.items(), key=lambda x: x[1]['downloads'], reverse=True)
430
- for name, info in sorted_pkgs:
431
-     print(f"{name}: {info['downloads']:,}")
432
- ```
433
-
434
- ---
435
-
436
- ## Rate limits
437
-
438
- No rate limits encountered across rapid bursts of 10 sequential calls per endpoint (2026-04-18 testing):
439
-
440
- | API | Observed limit |
441
- |-----|----------------|
442
- | npm registry (`registry.npmjs.org`) | None observed |
443
- | npm downloads (`api.npmjs.org`) | None observed |
444
- | npm search | None observed |
445
- | PyPI JSON (`pypi.org`) | None observed |
446
- | pypistats.org | None observed |
447
-
448
- npm's official documentation mentions soft rate limits at very high volumes, but normal task-level usage (dozens of calls) is unaffected. If building a large scraper, add a short sleep between batches as a precaution.
449
-
450
- ---
451
-
452
- ## Gotchas
453
-
454
- - **Full npm registry doc is huge** — `registry.npmjs.org/react` is 6.3MB (2785 versions). When you only need the latest version metadata, fetch `registry.npmjs.org/react/latest` (~1.8KB) instead. Similarly for any specific version.
455
-
456
- - **npm `versions` dict keys are ordered oldest-first** — The last key is NOT necessarily the latest release; it may be a canary/experimental build. Always use `dist-tags.latest` to identify the stable latest version.
457
-
458
- - **PyPI `author` field is often `None`** — Many packages set `author_email` instead (often in `"Name" <email>` format). Fall back: `info['author'] or info['author_email']`.
459
-
460
- - **PyPI `home_page` is frequently empty** — Check `info['project_urls']` for `Homepage`, `Source`, `Documentation` links instead.
461
-
462
- - **PyPI `requires_dist` can be `None`** — Not an empty list — `None`. Always guard: `info.get('requires_dist') or []`.
463
-
464
- - **PyPI XML-RPC API is dead** — `https://pypi.org/pypi` (XML-RPC) returns a fault for most methods including `package_releases`. Use JSON API only.
465
-
466
- - **pypistats.org `total` field is `None`** — The `total` key in response JSON is null; compute sums from `data` list yourself.
467
-
468
- - **pypistats.org data goes back ~6 months** — The `overall` endpoint returns daily rows for roughly the past 180 days, not full history.
469
-
470
- - **PyPI yanked versions** — `data['releases'][ver][0]['yanked']` is `True` for yanked versions. `data['info']['yanked']` is only `True` if the latest version itself is yanked. Both `yanked` and `yanked_reason` fields exist on each file object.
471
-
472
- - **npm scoped packages** — Both `registry.npmjs.org/@scope/name` (direct path) and `registry.npmjs.org/@scope%2Fname` (URL-encoded) work. Use the direct path form.
473
-
474
- - **npm downloads bulk response is a dict** — When you request multiple packages, the response is `{pkg_name: {...}}`, not a list. Single-package response is a flat object with `downloads`, `start`, `end`, `package` directly.
475
-
476
- - **`http_get` handles gzip transparently** — The helper already decompresses gzip responses. No manual decompression needed.
477
-
478
- - **Never use a browser for either registry** — All data is JSON over HTTP. `http_get` calls take 80–480ms; a browser navigation would take 3–8 seconds with no benefit.
1
+ # npm & PyPI — Package Registry Data Extraction
2
+
3
+ `https://registry.npmjs.org` · `https://api.npmjs.org` · `https://pypi.org` · `https://pypistats.org`
4
+
5
+ Both registries expose full JSON APIs with no auth required. Never use a browser — every data point is available over HTTP.
6
+
7
+ Tested 2026-04-18 with `uv run python` + `http_get`.
8
+
9
+ ---
10
+
11
+ ## Latency reference (measured)
12
+
13
+ | Endpoint | Latency |
14
+ |----------|---------|
15
+ | PyPI package JSON | ~80ms |
16
+ | npm downloads point | ~110ms |
17
+ | npm registry full doc (react = 6.3MB) | ~280ms |
18
+ | npm registry search | ~330ms |
19
+ | pypistats.org recent | ~480ms |
20
+
21
+ ---
22
+
23
+ ## npm Registry
24
+
25
+ ### Package metadata
26
+
27
+ Three request forms are available; pick based on what you need:
28
+
29
+ **Full registry document** — includes all version history, time map, author, bugs, homepage, keywords, README (when present). Large for popular packages (react = 6.3MB).
30
+
31
+ ```python
32
+ import json
33
+ data = json.loads(http_get("https://registry.npmjs.org/react"))
34
+
35
+ # Top-level keys: _id, name, dist-tags, versions, time, bugs, author,
36
+ # license, homepage, keywords, repository, description,
37
+ # contributors, maintainers, readme, readmeFilename, users
38
+ print(data['name']) # 'react'
39
+ print(data['dist-tags']['latest']) # '19.2.5'
40
+ print(data['time']['created']) # '2011-10-26T17:46:21.942Z'
41
+ print(data['time']['modified']) # '2026-04-18T00:57:09.913Z'
42
+
43
+ latest = data['dist-tags']['latest']
44
+ v = data['versions'][latest]
45
+ # Version object keys: name, version, description, license, keywords,
46
+ # homepage, bugs, repository, engines, exports, main, scripts,
47
+ # dependencies, devDependencies, peerDependencies, dist, maintainers,
48
+ # _npmUser, _nodeVersion, _npmVersion
49
+ print(v['description']) # 'React is a JavaScript library...'
50
+ print(v['license']) # 'MIT'
51
+ print(list(v.get('dependencies', {}).keys())) # [] (react 19 has no runtime deps)
52
+ print(v.get('homepage')) # 'https://react.dev/'
53
+ print(len(data['versions'])) # 2785 — all published versions
54
+ ```
55
+
56
+ **Single version endpoint** — 1–2KB instead of megabytes. Use when you only need one version's data.
57
+
58
+ ```python
59
+ import json
60
+ # Fetch a specific version
61
+ v = json.loads(http_get("https://registry.npmjs.org/react/19.2.5"))
62
+ print(v['name'], v['version'], v['description'])
63
+
64
+ # Fetch latest directly (no need to resolve dist-tags first)
65
+ v = json.loads(http_get("https://registry.npmjs.org/react/latest"))
66
+ print(v['version']) # '19.2.5'
67
+ ```
68
+
69
+ **Abbreviated document** — skips time map and (in theory) README; versions dict still present. Use `Accept` header.
70
+
71
+ ```python
72
+ import json, urllib.request, gzip
73
+
74
+ req = urllib.request.Request(
75
+     "https://registry.npmjs.org/react",
76
+     headers={
77
+         "Accept": "application/vnd.npm.install-v1+json",
78
+         "Accept-Encoding": "gzip"
79
+     }
80
+ )
81
+ with urllib.request.urlopen(req, timeout=20) as r:
82
+     raw = r.read()
83
+     if r.headers.get("Content-Encoding") == "gzip":
84
+         raw = gzip.decompress(raw)
85
+ data = json.loads(raw)
86
+ # Keys: name, dist-tags, versions, modified (no time map, no readme)
87
+ print(data['dist-tags']['latest']) # '19.2.5' for react (same value as the full document)
88
+ ```
89
+
90
+ Note: abbreviated is still large (react: 2.7MB) — use single-version endpoint when possible.
91
+
92
+ ### Scoped packages
93
+
94
+ Scoped packages (`@scope/name`) work with a direct path — no encoding needed:
95
+
96
+ ```python
97
+ import json
98
+ data = json.loads(http_get("https://registry.npmjs.org/@playwright/test"))
99
+ print(data['name']) # '@playwright/test'
100
+ print(data['dist-tags']['latest']) # '1.59.1'
101
+ print(len(data['versions'])) # 3148
102
+ ```
103
+
104
+ If constructing URLs dynamically, either form works:
105
+ ```python
106
+ # Direct path (preferred)
107
+ url = f"https://registry.npmjs.org/{pkg}" # '@playwright/test'
108
+ # URL-encoded slash
109
+ url = f"https://registry.npmjs.org/{pkg.replace('/', '%2F')}"
110
+ ```
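The URL construction above can be wrapped in one small function. This is a minimal sketch: `npm_registry_url` and its parameters are hypothetical names, not part of the original helpers. It relies only on the standard library's `urllib.parse.quote` to percent-encode the slash when the encoded form is requested.

```python
from urllib.parse import quote

REGISTRY = "https://registry.npmjs.org"

def npm_registry_url(pkg, version=None, encode_slash=False):
    """Build a registry URL for a plain or scoped package (hypothetical helper)."""
    # quote() with safe="@" keeps the scope marker but turns "/" into "%2F"
    name = quote(pkg, safe="@") if encode_slash else pkg
    url = f"{REGISTRY}/{name}"
    # Append a version or a dist-tag such as "latest" to get the small single-version document
    return f"{url}/{version}" if version else url

print(npm_registry_url("@playwright/test"))
print(npm_registry_url("@playwright/test", encode_slash=True))
print(npm_registry_url("react", "latest"))
```

Building the URL in one place keeps the direct-path form as the default, with the encoded form available for clients that reject a raw slash.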
111
+
112
+ ### Download statistics
113
+
114
+ The npm downloads API is separate from the registry and very fast (~110ms).
115
+
116
+ **Point query** — single number for a period:
117
+
118
+ ```python
119
+ import json
120
+
121
+ # Supported periods: last-day, last-week, last-month, last-year
122
+ # Also accepts ISO date ranges: YYYY-MM-DD:YYYY-MM-DD
123
+
124
+ stats = json.loads(http_get("https://api.npmjs.org/downloads/point/last-week/react"))
125
+ print(stats['downloads']) # 123302510
126
+ print(stats['start']) # '2026-04-11'
127
+ print(stats['end']) # '2026-04-17'
128
+ print(stats['package']) # 'react'
129
+
130
+ # Confirmed values (2026-04-18):
131
+ # last-day: 19,411,762
132
+ # last-week: 123,302,510
133
+ # last-month: 502,719,511
134
+ # last-year: 3,000,644,845
135
+ ```
136
+
137
+ **Bulk point query** — up to ~128 packages in one call, comma-separated:
138
+
139
+ ```python
140
+ import json
141
+
142
+ bulk = json.loads(http_get(
143
+ "https://api.npmjs.org/downloads/point/last-week/"
144
+ "react,vue,angular,webpack,typescript,eslint,jest,prettier,rollup,babel"
145
+ ))
146
+ # Returns dict keyed by package name
147
+ for pkg, info in bulk.items():
148
+     print(f"{pkg}: {info['downloads']:,}")
149
+ # react: 123,302,510
150
+ # vue: 11,042,359
151
+ # angular: 524,366
152
+ # webpack: 44,425,549
153
+ # typescript: 180,054,359
154
+ # eslint: 126,113,686
155
+ # jest: 43,394,412
156
+ # prettier: 87,551,734
157
+ # rollup: 103,431,439
158
+ # babel: 139,207
159
+ ```
160
+
161
+ **Range query** — downloads per day over a period:
162
+
163
+ ```python
164
+ import json
165
+
166
+ resp = json.loads(http_get(
167
+ "https://api.npmjs.org/downloads/range/2025-01-01:2025-01-07/react"
168
+ ))
169
+ # resp['downloads'] is a list of {downloads, day} objects
170
+ for entry in resp['downloads']:
171
+     print(entry['day'], entry['downloads'])
172
+ # 2025-01-01 1336801
173
+ # 2025-01-02 3288088
174
+ # 2025-01-03 3381680
175
+ # ...
176
+ ```
177
+
178
+ ### Search
179
+
180
+ ```python
181
+ import json
182
+
183
+ # Query parameters: text, size (max ~250), from (offset), plus quality, popularity, maintenance weights
184
+ data = json.loads(http_get(
185
+ "https://registry.npmjs.org/-/v1/search?text=browser+automation&size=5"
186
+ ))
187
+ print(data['total']) # total results matching the query
188
+
189
+ for obj in data['objects']:
190
+     p = obj['package']
191
+     s = obj['score']
192
+     # p keys: name, version, description, keywords, date, links, publisher, maintainers
193
+     # s keys: final, detail.quality, detail.popularity, detail.maintenance
194
+     print(
195
+         p['name'],
196
+         p['version'],
197
+         f"{s['final']:.2f}",
198
+         p.get('description', '')[:60]
199
+     )
200
+ # agent-browser 0.26.0 462.28 Browser automation CLI for AI agents
201
+ # nightmare 3.0.2 306.64 A high-level browser automation library.
202
+ ```
203
+
204
+ Score breakdown (all three are 0–1 floats):
205
+ - `quality` — code quality signals (tests, lint, TypeScript types)
206
+ - `popularity` — download counts normalized
207
+ - `maintenance` — release frequency, open issues
208
+
209
+ `final` is a weighted combination and can exceed 1.0 for extremely popular packages.
210
+
211
+ ### Error handling
212
+
213
+ ```python
214
+ import json, urllib.error
215
+
216
+ try:
217
+     data = json.loads(http_get("https://registry.npmjs.org/nonexistent-pkg-xyz"))
218
+ except urllib.error.HTTPError as e:
219
+     # 404 for missing packages
220
+     print(e.code) # 404
221
+     print(json.loads(e.read())) # {'error': 'Not found'}
222
+ ```
223
+
224
+ ---
225
+
226
+ ## PyPI
227
+
228
+ ### Package metadata
229
+
230
+ ```python
231
+ import json
232
+
233
+ # Latest version metadata
234
+ data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
235
+ info = data['info']
236
+
237
+ # info keys (selected):
238
+ print(info['name']) # 'requests'
239
+ print(info['version']) # '2.33.1'
240
+ print(info['summary']) # 'Python HTTP for Humans.'
241
+ print(info['license']) # 'Apache-2.0'
242
+ print(info['author']) # None (sometimes empty — check author_email)
243
+ print(info['author_email']) # '"Kenneth Reitz" <me@kennethreitz.org>'
244
+ print(info['requires_python']) # '>=3.10'
245
+ print(info['home_page']) # None (may be empty — check project_urls)
246
+ print(info['project_urls'])
247
+ # {'Documentation': 'https://requests.readthedocs.io',
248
+ # 'Source': 'https://github.com/psf/requests'}
249
+
250
+ requires = info.get('requires_dist') or []
251
+ print(requires[:5])
252
+ # ['charset_normalizer<4,>=2', 'idna<4,>=2.5', 'urllib3<3,>=1.26',
253
+ # 'certifi>=2023.5.7', 'PySocks!=1.5.7,>=1.5.6; extra == "socks"']
254
+
255
+ print(info.get('classifiers', [])[:3])
256
+ # ['Development Status :: 5 - Production/Stable',
257
+ # 'Intended Audience :: Developers',
258
+ # 'License :: OSI Approved :: Apache Software License']
259
+
260
+ # data['urls'] — list of dist files for the latest version
261
+ for f in data['urls']:
262
+     # keys: filename, packagetype, python_version, size, digests, url,
263
+     # upload_time, requires_python, yanked, yanked_reason
264
+     print(f['packagetype'], f['python_version'], f['filename'], f['size'])
265
+ # bdist_wheel py3 requests-2.33.1-py3-none-any.whl 64947
266
+ # sdist source requests-2.33.1.tar.gz 134120
267
+ ```
268
+
269
+ ### Specific version
270
+
271
+ ```python
272
+ import json
273
+
274
+ # Fetch a pinned version (not just latest)
275
+ data = json.loads(http_get("https://pypi.org/pypi/requests/2.32.3/json"))
276
+ print(data['info']['version']) # '2.32.3'
277
+ # Same structure as the latest endpoint
278
+ ```
279
+
280
+ ### Version history and yanked releases
281
+
282
+ ```python
283
+ import json
284
+
285
+ data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
286
+
287
+ # data['releases'] is a dict: version_string -> list of file objects
288
+ versions = list(data['releases'].keys())
289
+ print("Total versions:", len(versions)) # 159
290
+ # Keys were observed oldest-first here, but treat that order as incidental:
291
+ # sort on each file's upload_time when chronological order matters
292
+
293
+ # Find yanked versions
294
+ yanked = [
295
+ (ver, files[0]['yanked_reason'])
296
+ for ver, files in data['releases'].items()
297
+ if files and files[0].get('yanked')
298
+ ]
299
+ print(yanked[:2])
300
+ # [('2.32.0', 'Yanked due to conflicts with CVE-2024-35195 mitigation'),
301
+ # ('2.32.1', 'Yanked due to conflicts with CVE-2024-35195 mitigation ')]
302
+
303
+ # info.yanked is True only if the LATEST version is yanked
304
+ print(data['info']['yanked']) # False
305
+ print(data['info']['yanked_reason']) # None
306
+ ```
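When the goal is a clean version list rather than a list of yanked releases, the same `releases` structure can be filtered and ordered explicitly. The sketch below is an assumption-laden illustration: `non_yanked_versions` is a hypothetical helper, and the sample dictionary uses made-up upload times shaped like the file objects described above (only `yanked` and `upload_time` are used).

```python
def non_yanked_versions(releases):
    """Return versions that still have a non-yanked file, ordered by first upload time."""
    kept = []
    for ver, files in releases.items():
        if not files:
            continue  # a release can exist with no uploaded files
        if all(f.get("yanked") for f in files):
            continue  # every file of this release was yanked
        # ISO-8601 timestamps sort correctly as plain strings
        first_upload = min(f["upload_time"] for f in files)
        kept.append((first_upload, ver))
    return [ver for _, ver in sorted(kept)]

# Illustrative sample only; timestamps are invented for the example
sample_releases = {
    "2.32.1": [{"yanked": True, "upload_time": "2024-05-20T17:00:00"}],
    "2.32.3": [{"yanked": False, "upload_time": "2024-05-29T15:00:00"}],
    "2.31.0": [{"yanked": False, "upload_time": "2023-05-22T15:00:00"}],
    "0.0.1": [],
}
print(non_yanked_versions(sample_releases))  # ['2.31.0', '2.32.3']
```

Sorting on `upload_time` avoids depending on the key order of the JSON response.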
307
+
308
+ ### Download statistics (pypistats.org)
309
+
310
+ PyPI does not expose download counts in its own JSON API. Use pypistats.org.
311
+
312
+ ```python
313
+ import json
314
+
315
+ # Recent (last day/week/month) — fastest, single call
316
+ stats = json.loads(http_get("https://pypistats.org/api/packages/requests/recent"))
317
+ d = stats['data']
318
+ print(d['last_day']) # 52969887
319
+ print(d['last_week']) # 356556988
320
+ print(d['last_month']) # 1385411770
321
+
322
+ # Historical daily totals (overall, going back ~6 months)
323
+ overall = json.loads(http_get("https://pypistats.org/api/packages/requests/overall"))
324
+ # overall['data'] is list of {category, date, downloads}
325
+ # category is 'with_mirrors' or 'without_mirrors'
326
+ for row in overall['data'][:3]:
327
+     print(row['date'], row['category'], row['downloads'])
328
+ # 2025-10-19 with_mirrors 21916634
329
+ # 2025-10-19 without_mirrors 21882953
330
+
331
+ # Without mirrors (excludes downloads attributed to mirror tools; closer to real usage):
332
+ clean = json.loads(http_get(
333
+ "https://pypistats.org/api/packages/requests/overall?mirrors=false"
334
+ ))
335
+
336
+ # By Python major version
337
+ by_python = json.loads(http_get(
338
+ "https://pypistats.org/api/packages/requests/python_major"
339
+ ))
340
+ # data rows: {category: '3', date: '...', downloads: N}
341
+
342
+ # By OS
343
+ by_sys = json.loads(http_get(
344
+ "https://pypistats.org/api/packages/requests/system"
345
+ ))
346
+ # data rows: {category: 'Darwin'|'Linux'|'Windows'|'other'|'null', date, downloads}
347
+
348
+ # By Python minor version
349
+ by_minor = json.loads(http_get(
350
+ "https://pypistats.org/api/packages/requests/python_minor"
351
+ ))
352
+ ```
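Because the daily rows carry one entry per category and date, totals have to be summed client-side. A minimal sketch, assuming rows shaped like the `overall['data']` entries shown above; `total_by_category` is a hypothetical helper and the third sample row is invented for illustration.

```python
from collections import defaultdict

def total_by_category(rows):
    """Sum the 'downloads' field of pypistats-style rows per category."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["category"]] += row["downloads"]
    return dict(totals)

# First two rows mirror the sample output above; the third is made up
sample_rows = [
    {"category": "with_mirrors", "date": "2025-10-19", "downloads": 21916634},
    {"category": "without_mirrors", "date": "2025-10-19", "downloads": 21882953},
    {"category": "with_mirrors", "date": "2025-10-20", "downloads": 100},
]
print(total_by_category(sample_rows))
```

The same function works unchanged for the `python_major`, `python_minor`, and `system` endpoints, since their rows use the same three keys.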
353
+
354
+ ### Parallel fetch for multiple packages
355
+
356
+ ```python
357
+ import json
358
+ from concurrent.futures import ThreadPoolExecutor
359
+
360
+ packages = ['numpy', 'pandas', 'scikit-learn', 'torch', 'tensorflow']
361
+
362
+ def get_pypi_info(pkg):
363
+     d = json.loads(http_get(f"https://pypi.org/pypi/{pkg}/json"))
364
+     return {
365
+         'name': pkg,
366
+         'version': d['info']['version'],
367
+         'summary': d['info']['summary'],
368
+         'requires_python': d['info']['requires_python'],
369
+     }
370
+
371
+ with ThreadPoolExecutor(max_workers=5) as ex:
372
+     results = list(ex.map(get_pypi_info, packages))
373
+
374
+ for r in results:
375
+     print(r['name'], r['version'], r['summary'][:50])
376
+ # numpy 2.4.4 Fundamental package for array computing in Python
377
+ # pandas 3.0.2 Powerful data structures for data analysis, time s
378
+ # scikit-learn 1.8.0 A set of python modules for machine learning and d
379
+ # torch 2.11.0 Tensors and Dynamic neural networks in Python with
380
+ # tensorflow 2.21.0 TensorFlow is an open source machine learning fram
381
+ ```
382
+
383
+ ### Error handling
384
+
385
+ ```python
386
+ import json, urllib.error
387
+
388
+ try:
389
+     data = json.loads(http_get("https://pypi.org/pypi/nonexistent-xyz-abc/json"))
390
+ except urllib.error.HTTPError as e:
391
+     print(e.code) # 404
392
+     # Body is HTML, not JSON; don't try to parse it
393
+ ```
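In batch jobs it is often more convenient to turn a missing package into `None` than to let the exception propagate. The following is a sketch under stated assumptions: `fetch_json_or_none` is a hypothetical wrapper, and the fetch function is passed in as a parameter (in practice this would be `http_get`) so the example runs without network access. The two `fake_*` functions exist only for the demonstration.

```python
import json
import urllib.error

def fetch_json_or_none(url, fetch):
    """Return parsed JSON from fetch(url), or None when the server answers 404."""
    try:
        return json.loads(fetch(url))
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None  # missing package: the body is not parsed
        raise  # other HTTP errors are real failures

# Stand-ins for http_get, used only to demonstrate both paths offline
def fake_ok(url):
    return '{"info": {"name": "requests"}}'

def fake_missing(url):
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)

print(fetch_json_or_none("https://pypi.org/pypi/requests/json", fake_ok))
print(fetch_json_or_none("https://pypi.org/pypi/nonexistent-xyz-abc/json", fake_missing))
```

Only the 404 case is swallowed, so timeouts and server errors still surface to the caller.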
394
+
395
+ ---
396
+
397
+ ## Parallel fetch patterns
398
+
399
+ ### Mixed registry + stats in one shot
400
+
401
+ ```python
402
+ import json
403
+ from concurrent.futures import ThreadPoolExecutor
404
+
405
+ def npm_info(pkg):
406
+     # Use single-version endpoint (1-2KB) not full registry doc (MB)
407
+     v = json.loads(http_get(f"https://registry.npmjs.org/{pkg}/latest"))
408
+     s = json.loads(http_get(f"https://api.npmjs.org/downloads/point/last-month/{pkg}"))
409
+     return {'name': pkg, 'version': v['version'], 'downloads': s['downloads']}
410
+
411
+ pkgs = ['react', 'vue', 'svelte', 'solid-js', 'preact']
412
+ with ThreadPoolExecutor(max_workers=5) as ex:
413
+     results = list(ex.map(npm_info, pkgs))
414
+ for r in results:
415
+     print(r['name'], r['version'], f"{r['downloads']:,}")
416
+ ```
417
+
418
+ ### npm bulk downloads (most efficient for many packages)
419
+
420
+ ```python
421
+ import json
422
+
423
+ # Up to ~128 packages in one HTTP call
424
+ pkgs = ['react', 'vue', 'angular', 'svelte']
425
+ bulk = json.loads(http_get(
426
+ f"https://api.npmjs.org/downloads/point/last-week/{','.join(pkgs)}"
427
+ ))
428
+ # Returns: {pkg_name: {'downloads': N, 'start': '...', 'end': '...', 'package': '...'}, ...}
429
+ sorted_pkgs = sorted(bulk.items(), key=lambda x: x[1]['downloads'], reverse=True)
430
+ for name, info in sorted_pkgs:
431
+     print(f"{name}: {info['downloads']:,}")
432
+ ```
433
+
434
+ ---
435
+
436
+ ## Rate limits
437
+
438
+ No rate limits encountered across rapid bursts of 10 sequential calls per endpoint (2026-04-18 testing):
439
+
440
+ | API | Observed limit |
441
+ |-----|----------------|
442
+ | npm registry (`registry.npmjs.org`) | None observed |
443
+ | npm downloads (`api.npmjs.org`) | None observed |
444
+ | npm search | None observed |
445
+ | PyPI JSON (`pypi.org`) | None observed |
446
+ | pypistats.org | None observed |
447
+
448
+ npm's official documentation mentions soft rate limits at very high volumes, but normal task-level usage (dozens of calls) is unaffected. If building a large scraper, add a short sleep between batches as a precaution.
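The precaution above can be sketched as chunked bulk URLs plus a pause between calls. Everything named here is hypothetical (`batched`, `bulk_download_urls`, `fetch_in_batches`, the 0.5 s default pause); the 128-package batch size follows the approximate bulk limit mentioned earlier, and the fetch function is injected so the sketch runs offline.

```python
import time

def batched(items, size):
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def bulk_download_urls(pkgs, period="last-week", batch_size=128):
    """Build one npm bulk-downloads URL per batch of package names."""
    base = "https://api.npmjs.org/downloads/point"
    return [f"{base}/{period}/{','.join(chunk)}" for chunk in batched(pkgs, batch_size)]

def fetch_in_batches(urls, fetch, pause=0.5):
    """Call fetch(url) for each URL, sleeping `pause` seconds between calls."""
    results = []
    for i, url in enumerate(urls):
        if i:
            time.sleep(pause)  # no sleep before the first request
        results.append(fetch(url))
    return results

names = [f"pkg{i}" for i in range(300)]  # placeholder package names
urls = bulk_download_urls(names)
print(len(urls))  # 3
```

In real use, `fetch` would be something like `lambda u: json.loads(http_get(u))`, and the per-batch dicts would be merged afterwards.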
449
+
450
+ ---
451
+
452
+ ## Gotchas
453
+
454
+ - **Full npm registry doc is huge** — `registry.npmjs.org/react` is 6.3MB (2785 versions). When you only need the latest version metadata, fetch `registry.npmjs.org/react/latest` (~1.8KB) instead. Similarly for any specific version.
455
+
456
+ - **npm `versions` dict keys are ordered oldest-first** — The last key is NOT necessarily the latest release; it may be a canary/experimental build. Always use `dist-tags.latest` to identify the stable latest version.
457
+
458
+ - **PyPI `author` field is often `None`** — Many packages set `author_email` instead (often in `"Name" <email>` format). Fall back: `info['author'] or info['author_email']`.
459
+
460
+ - **PyPI `home_page` is frequently empty** — Check `info['project_urls']` for `Homepage`, `Source`, `Documentation` links instead.
461
+
462
+ - **PyPI `requires_dist` can be `None`** — Not an empty list — `None`. Always guard: `info.get('requires_dist') or []`.
463
+
464
+ - **PyPI XML-RPC API is dead** — `https://pypi.org/pypi` (XML-RPC) returns a fault for most methods including `package_releases`. Use JSON API only.
465
+
466
+ - **pypistats.org `total` field is `None`** — The `total` key in response JSON is null; compute sums from `data` list yourself.
467
+
468
+ - **pypistats.org data goes back ~6 months** — The `overall` endpoint returns daily rows for roughly the past 180 days, not full history.
469
+
470
+ - **PyPI yanked versions** — `data['releases'][ver][0]['yanked']` is `True` for yanked versions. `data['info']['yanked']` is only `True` if the latest version itself is yanked. Both `yanked` and `yanked_reason` fields exist on each file object.
471
+
472
+ - **npm scoped packages** — Both `registry.npmjs.org/@scope/name` (direct path) and `registry.npmjs.org/@scope%2Fname` (URL-encoded) work. Use the direct path form.
473
+
474
+ - **npm downloads bulk response is a dict** — When you request multiple packages, the response is `{pkg_name: {...}}`, not a list. Single-package response is a flat object with `downloads`, `start`, `end`, `package` directly.
475
+
476
+ - **`http_get` handles gzip transparently** — The helper already decompresses gzip responses. No manual decompression needed.
477
+
478
+ - **Never use a browser for either registry** — All data is JSON over HTTP. `http_get` calls take 80–480ms; a browser navigation would take 3–8 seconds with no benefit.
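The PyPI fallbacks listed above can be folded into a single normalization step. This is a minimal sketch, not the original author's code: `normalize_pypi_info` is a hypothetical helper, the order in which `project_urls` keys are tried is an assumption, and the sample `info` dict reuses the `requests` values shown earlier in this document.

```python
def normalize_pypi_info(info):
    """Apply the author, homepage, and requires_dist fallbacks to a PyPI 'info' dict."""
    urls = info.get("project_urls") or {}
    homepage = (
        info.get("home_page")
        or urls.get("Homepage")
        or urls.get("Source")
        or urls.get("Documentation")
    )
    return {
        "author": info.get("author") or info.get("author_email"),
        "homepage": homepage,
        "requires_dist": info.get("requires_dist") or [],  # None becomes []
    }

sample_info = {
    "author": None,
    "author_email": '"Kenneth Reitz" <me@kennethreitz.org>',
    "home_page": None,
    "project_urls": {
        "Documentation": "https://requests.readthedocs.io",
        "Source": "https://github.com/psf/requests",
    },
    "requires_dist": None,
}
print(normalize_pypi_info(sample_info))
```

Downstream code can then iterate over `requires_dist` and read `author` or `homepage` without further `None` checks on the list.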