@pencil-agent/nano-pencil 2.0.1 → 2.0.3

Files changed (188)
  1. package/README.md +267 -267
  2. package/dist/build-meta.json +3 -3
  3. package/dist/core/export-html/AGENT.md +11 -11
  4. package/dist/core/export-html/template.css +971 -971
  5. package/dist/core/export-html/template.html +54 -54
  6. package/dist/core/model/custom-providers.js +1 -1
  7. package/dist/core/model-registry.js +5 -5
  8. package/dist/extensions/builtin/AGENT.md +115 -115
  9. package/dist/extensions/builtin/browser/AGENT.md +17 -17
  10. package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
  11. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
  12. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
  13. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
  14. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
  15. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
  16. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
  17. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
  18. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
  19. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
  20. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
  21. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
  22. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
  23. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
  24. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
  25. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
  26. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
  27. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
  28. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
  29. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
  30. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
  31. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
  32. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
  33. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
  34. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
  35. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
  36. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
  37. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
  38. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
  39. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
  40. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
  41. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
  42. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
  43. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
  44. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
  45. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
  46. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
  47. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
  48. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
  49. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
  50. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
  51. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
  52. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
  53. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
  54. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
  55. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
  56. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
  57. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
  58. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
  59. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
  60. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
  61. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
  62. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
  63. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
  64. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
  65. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
  66. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
  67. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
  68. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
  69. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
  70. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
  71. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
  72. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
  73. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
  74. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
  75. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
  76. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
  77. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
  78. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
  79. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
  80. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
  81. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
  82. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
  83. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
  84. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
  85. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
  86. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
  87. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
  88. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
  89. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
  90. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
  91. package/dist/extensions/builtin/browser/browser.md +73 -73
  92. package/dist/extensions/builtin/browser/install.md +142 -142
  93. package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
  94. package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
  95. package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
  96. package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
  97. package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
  98. package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
  99. package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
  100. package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
  101. package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
  102. package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
  103. package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
  104. package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
  105. package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
  106. package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
  107. package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
  108. package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
  109. package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
  110. package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
  111. package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
  112. package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
  113. package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
  114. package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
  115. package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
  116. package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
  117. package/dist/extensions/builtin/debug/index.js +9 -9
  118. package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
  119. package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
  120. package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
  121. package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
  122. package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
  123. package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
  124. package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
  125. package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
  126. package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
  127. package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
  128. package/dist/extensions/builtin/goal/README.md +67 -67
  129. package/dist/extensions/builtin/goal/index.js +6 -6
  130. package/dist/extensions/builtin/grub/README.md +112 -112
  131. package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
  132. package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
  133. package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
  134. package/dist/extensions/builtin/link-world/linkworld.md +313 -313
  135. package/dist/extensions/builtin/link-world/network-routing/network-routing.md +67 -67
  136. package/dist/extensions/builtin/loop/README.md +92 -92
  137. package/dist/extensions/builtin/mcp/figma-design.md +68 -68
  138. package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
  139. package/dist/extensions/builtin/recap/AGENT.md +15 -15
  140. package/dist/extensions/builtin/sal/README.md +72 -72
  141. package/dist/extensions/builtin/security-audit/README.md +289 -289
  142. package/dist/extensions/builtin/team/AGENT.md +112 -112
  143. package/dist/extensions/builtin/team/TESTING.md +299 -299
  144. package/dist/extensions/builtin/token-save/README.md +56 -56
  145. package/dist/extensions/optional/AGENT.md +10 -10
  146. package/dist/modes/interactive/controllers/input-submit-controller.js +2 -2
  147. package/dist/modes/interactive/controllers/stream-render-controller.js +2 -2
  148. package/dist/modes/interactive/interactive-mode.js +19 -19
  149. package/dist/modes/interactive/theme/dark.json +85 -85
  150. package/dist/modes/interactive/theme/light.json +84 -84
  151. package/dist/modes/interactive/theme/theme-schema.json +335 -335
  152. package/dist/modes/interactive/theme/warm.json +81 -81
  153. package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
  154. package/dist/node_modules/@pencil-agent/ai/dist/models.generated.js +1 -1
  155. package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +851 -0
  156. package/docs/SDK-TESTING.md +364 -0
  157. package/docs/codex-goal-command-impl.md +1055 -1055
  158. package/docs/codex-goal-vs-grub.md +500 -500
  159. package/docs/custom-provider.md +27 -27
  160. package/docs/extensions.md +27 -27
  161. package/docs/keybindings.md +27 -27
  162. package/docs/loop 重构完成总结.md +250 -250
  163. package/docs/loop 重构完成报告.md +122 -122
  164. package/docs/loop 重构方案.md +1222 -1222
  165. package/docs/loop 重构方案实现报告.md +158 -158
  166. package/docs/loop 重构方案对比分析.md +128 -128
  167. package/docs/loop 重构计划.md +320 -320
  168. package/docs/loop-usage-examples.md +214 -214
  169. package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +593 -0
  170. package/docs/models.md +27 -27
  171. package/docs/packages.md +27 -27
  172. package/docs/pi-design-philosophy.md +457 -457
  173. package/docs/planmode.md +1987 -1987
  174. package/docs/prompt-templates.md +27 -27
  175. package/docs/providers.md +27 -27
  176. package/docs/sdk.md +27 -27
  177. package/docs/skills.md +27 -27
  178. package/docs/startup-performance-optimization.md +301 -0
  179. package/docs/themes.md +27 -27
  180. package/docs/tui.md +27 -27
  181. package/docs/认知地图.md +47 -0
  182. package/package.json +190 -190
  183. package/docs/cc-agent-design.md +0 -1297
  184. package/docs/cc-tui-design.md +0 -1333
  185. package/docs/nanoPencil-学习计划.md +0 -170
  186. package/docs/scan-report.md +0 -3820
  187. package/docs//345/257/271/346/240/207Claude-Code.md +0 -1775
  188. package/docs//351/230/277/351/207/214/345/267/264/345/267/264/350/264/242/346/212/245/345/210/206/346/236/220/344/271/246.md +0 -261
@@ -1,478 +1,478 @@
1
- # npm & PyPI — Package Registry Data Extraction
2
-
3
- `https://registry.npmjs.org` · `https://api.npmjs.org` · `https://pypi.org` · `https://pypistats.org`
4
-
5
- Both registries expose full JSON APIs with no auth required. Never use a browser — every data point is available over HTTP.
6
-
7
- Tested 2026-04-18 with `uv run python` + `http_get`.
8
-
9
- ---
10
-
11
- ## Latency reference (measured)
12
-
13
- | Endpoint | Latency |
14
- |----------|---------|
15
- | PyPI package JSON | ~80ms |
16
- | npm downloads point | ~110ms |
17
- | npm registry full doc (react = 6.3MB) | ~280ms |
18
- | npm registry search | ~330ms |
19
- | pypistats.org recent | ~480ms |
20
-
21
- ---
22
-
23
- ## npm Registry
24
-
25
- ### Package metadata
26
-
27
- Two endpoints — pick based on what you need:
28
-
29
- **Full registry document** — includes all version history, time map, author, bugs, homepage, keywords, README (when present). Large for popular packages (react = 6.3MB).
30
-
31
- ```python
32
- import json
33
- data = json.loads(http_get("https://registry.npmjs.org/react"))
34
-
35
- # Top-level keys: _id, name, dist-tags, versions, time, bugs, author,
36
- # license, homepage, keywords, repository, description,
37
- # contributors, maintainers, readme, readmeFilename, users
38
- print(data['name']) # 'react'
39
- print(data['dist-tags']['latest']) # '19.2.5'
40
- print(data['time']['created']) # '2011-10-26T17:46:21.942Z'
41
- print(data['time']['modified']) # '2026-04-18T00:57:09.913Z'
42
-
43
- latest = data['dist-tags']['latest']
44
- v = data['versions'][latest]
45
- # Version object keys: name, version, description, license, keywords,
46
- # homepage, bugs, repository, engines, exports, main, scripts,
47
- # dependencies, devDependencies, peerDependencies, dist, maintainers,
48
- # _npmUser, _nodeVersion, _npmVersion
49
- print(v['description']) # 'React is a JavaScript library...'
50
- print(v['license']) # 'MIT'
51
- print(list(v.get('dependencies', {}).keys())) # [] (react 19 has no runtime deps)
52
- print(v.get('homepage')) # 'https://react.dev/'
53
- print(len(data['versions'])) # 2785 — all published versions
54
- ```
55
-
56
- **Single version endpoint** — 1–2KB instead of megabytes. Use when you only need one version's data.
57
-
58
- ```python
59
- import json
60
- # Fetch a specific version
61
- v = json.loads(http_get("https://registry.npmjs.org/react/19.2.5"))
62
- print(v['name'], v['version'], v['description'])
63
-
64
- # Fetch latest directly (no need to resolve dist-tags first)
65
- v = json.loads(http_get("https://registry.npmjs.org/react/latest"))
66
- print(v['version']) # '19.2.5'
67
- ```
68
-
69
- **Abbreviated document** — skips time map and (in theory) README; versions dict still present. Use `Accept` header.
70
-
71
- ```python
72
- import json, urllib.request, gzip
73
-
74
- req = urllib.request.Request(
75
-     "https://registry.npmjs.org/react",
76
-     headers={
77
-         "Accept": "application/vnd.npm.install-v1+json",
78
-         "Accept-Encoding": "gzip"
79
-     }
80
- )
81
- with urllib.request.urlopen(req, timeout=20) as r:
82
-     raw = r.read()
83
-     if r.headers.get("Content-Encoding") == "gzip":
84
-         raw = gzip.decompress(raw)
85
- data = json.loads(raw)
86
- # Keys: name, dist-tags, versions, modified (no time map, no readme)
87
- print(data['dist-tags']['latest']) # '19.2.5' for react (the same request for lodash returned '4.18.1')
88
- ```
89
-
90
- Note: abbreviated is still large (react: 2.7MB) — use single-version endpoint when possible.
91
-
92
- ### Scoped packages
93
-
94
- Scoped packages (`@scope/name`) work with a direct path — no encoding needed:
95
-
96
- ```python
97
- import json
98
- data = json.loads(http_get("https://registry.npmjs.org/@playwright/test"))
99
- print(data['name']) # '@playwright/test'
100
- print(data['dist-tags']['latest']) # '1.59.1'
101
- print(len(data['versions'])) # 3148
102
- ```
103
-
104
- If constructing URLs dynamically, either form works:
105
- ```python
106
- # Direct path (preferred)
107
- url = f"https://registry.npmjs.org/{pkg}" # '@playwright/test'
108
- # URL-encoded slash
109
- url = f"https://registry.npmjs.org/{pkg.replace('/', '%2F')}"
110
- ```
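When package names arrive from user input, it can help to build the URL in one place. The following is a minimal sketch, not part of the original skill file: `npm_registry_url` is a hypothetical helper name, and it assumes the URL-encoded-slash form described above is acceptable for every scoped package.

```python
from urllib.parse import quote

REGISTRY = "https://registry.npmjs.org"  # base URL used throughout this page


def npm_registry_url(pkg, version=None):
    """Build a registry URL for a plain or scoped package (hypothetical helper)."""
    # Keep '@' readable but percent-encode the scope separator '/'.
    encoded = quote(pkg, safe="@")
    url = f"{REGISTRY}/{encoded}"
    if version is not None:
        # 'latest' or a concrete version selects the single-version document.
        url += f"/{quote(version, safe='')}"
    return url


print(npm_registry_url("react", "latest"))   # https://registry.npmjs.org/react/latest
print(npm_registry_url("@playwright/test"))  # https://registry.npmjs.org/@playwright%2Ftest
```

Encoding the slash keeps the package name a single path segment, so a version suffix can be appended without ambiguity.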
111
-
112
- ### Download statistics
113
-
114
- The npm downloads API is separate from the registry and very fast (~110ms).
115
-
116
- **Point query** — single number for a period:
117
-
118
- ```python
119
- import json
120
-
121
- # Supported periods: last-day, last-week, last-month, last-year
122
- # Also accepts ISO date ranges: YYYY-MM-DD:YYYY-MM-DD
123
-
124
- stats = json.loads(http_get("https://api.npmjs.org/downloads/point/last-week/react"))
125
- print(stats['downloads']) # 123302510
126
- print(stats['start']) # '2026-04-11'
127
- print(stats['end']) # '2026-04-17'
128
- print(stats['package']) # 'react'
129
-
130
- # Confirmed values (2026-04-18):
131
- # last-day: 19,411,762
132
- # last-week: 123,302,510
133
- # last-month: 502,719,511
134
- # last-year: 3,000,644,845
135
- ```
136
-
137
- **Bulk point query**: up to 128 packages in one call, comma-separated. Scoped packages are not accepted in bulk queries, so fetch those one at a time:
138
-
139
- ```python
140
- import json
141
-
142
- bulk = json.loads(http_get(
143
-     "https://api.npmjs.org/downloads/point/last-week/"
144
-     "react,vue,angular,webpack,typescript,eslint,jest,prettier,rollup,babel"
145
- ))
146
- # Returns dict keyed by package name
147
- for pkg, info in bulk.items():
148
-     print(f"{pkg}: {info['downloads']:,}")
149
- # react: 123,302,510
150
- # vue: 11,042,359
151
- # angular: 524,366
152
- # webpack: 44,425,549
153
- # typescript: 180,054,359
154
- # eslint: 126,113,686
155
- # jest: 43,394,412
156
- # prettier: 87,551,734
157
- # rollup: 103,431,439
158
- # babel: 139,207
159
- ```
160
-
161
- **Range query** — downloads per day over a period:
162
-
163
- ```python
164
- import json
165
-
166
- resp = json.loads(http_get(
167
-     "https://api.npmjs.org/downloads/range/2025-01-01:2025-01-07/react"
168
- ))
169
- # resp['downloads'] is a list of {downloads, day} objects
170
- for entry in resp['downloads']:
171
-     print(entry['day'], entry['downloads'])
172
- # 2025-01-01 1336801
173
- # 2025-01-02 3288088
174
- # 2025-01-03 3381680
175
- # ...
176
- ```
177
-
178
- ### Search
179
-
180
- ```python
181
- import json
182
-
183
- # Query parameters: text, size (max ~250), from (offset), plus quality, popularity, maintenance weights
184
- data = json.loads(http_get(
185
-     "https://registry.npmjs.org/-/v1/search?text=browser+automation&size=5"
186
- ))
187
- print(data['total']) # total results matching the query
188
-
189
- for obj in data['objects']:
190
-     p = obj['package']
191
-     s = obj['score']
192
-     # p keys: name, version, description, keywords, date, links, publisher, maintainers
193
-     # s keys: final, detail.quality, detail.popularity, detail.maintenance
194
-     print(
195
-         p['name'],
196
-         p['version'],
197
-         f"{s['final']:.2f}",
198
-         p.get('description', '')[:60]
199
-     )
200
- # agent-browser 0.26.0 462.28 Browser automation CLI for AI agents
201
- # nightmare 3.0.2 306.64 A high-level browser automation library.
202
- ```
203
-
204
- Score breakdown (all three are 0–1 floats):
205
- - `quality` — code quality signals (tests, lint, TypeScript types)
206
- - `popularity` — download counts normalized
207
- - `maintenance` — release frequency, open issues
208
-
209
- `final` is a weighted combination and is not bounded to 0–1: the search above returned values in the hundreds (462.28 for `agent-browser`).
210
-
211
- ### Error handling
212
-
213
- ```python
214
- import json, urllib.error
215
-
216
- try:
217
-     data = json.loads(http_get("https://registry.npmjs.org/nonexistent-pkg-xyz"))
218
- except urllib.error.HTTPError as e:
219
-     # 404 for missing packages
220
-     print(e.code) # 404
221
-     print(json.loads(e.read())) # {'error': 'Not found'}
222
- ```
223
-
224
- ---
225
-
226
- ## PyPI
227
-
228
- ### Package metadata
229
-
230
- ```python
231
- import json
232
-
233
- # Latest version metadata
234
- data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
235
- info = data['info']
236
-
237
- # info keys (selected):
238
- print(info['name']) # 'requests'
239
- print(info['version']) # '2.33.1'
240
- print(info['summary']) # 'Python HTTP for Humans.'
241
- print(info['license']) # 'Apache-2.0'
242
- print(info['author']) # None (sometimes empty — check author_email)
243
- print(info['author_email']) # '"Kenneth Reitz" <me@kennethreitz.org>'
244
- print(info['requires_python']) # '>=3.10'
245
- print(info['home_page']) # None (may be empty — check project_urls)
246
- print(info['project_urls'])
247
- # {'Documentation': 'https://requests.readthedocs.io',
248
- # 'Source': 'https://github.com/psf/requests'}
249
-
250
- requires = info.get('requires_dist') or []
251
- print(requires[:5])
252
- # ['charset_normalizer<4,>=2', 'idna<4,>=2.5', 'urllib3<3,>=1.26',
253
- # 'certifi>=2023.5.7', 'PySocks!=1.5.7,>=1.5.6; extra == "socks"']
254
-
255
- print(info.get('classifiers', [])[:3])
256
- # ['Development Status :: 5 - Production/Stable',
257
- # 'Intended Audience :: Developers',
258
- # 'License :: OSI Approved :: Apache Software License']
259
-
260
- # data['urls'] — list of dist files for the latest version
261
- for f in data['urls']:
262
-     # keys: filename, packagetype, python_version, size, digests, url,
263
-     # upload_time, requires_python, yanked, yanked_reason
264
-     print(f['packagetype'], f['python_version'], f['filename'], f['size'])
265
- # bdist_wheel py3 requests-2.33.1-py3-none-any.whl 64947
266
- # sdist source requests-2.33.1.tar.gz 134120
267
- ```
268
-
269
- ### Specific version
270
-
271
- ```python
272
- import json
273
-
274
- # Fetch a pinned version (not just latest)
275
- data = json.loads(http_get("https://pypi.org/pypi/requests/2.32.3/json"))
276
- print(data['info']['version']) # '2.32.3'
277
- # Same structure as the latest endpoint
278
- ```
279
-
280
- ### Version history and yanked releases
281
-
282
- ```python
283
- import json
284
-
285
- data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
286
-
287
- # data['releases'] is a dict: version_string -> list of file objects
288
- versions = list(data['releases'].keys())
289
- print("Total versions:", len(versions)) # 159
290
- # Versions are insertion-ordered (chronological, oldest first)
291
- # dict key order is stable
292
-
293
- # Find yanked versions
294
- yanked = [
295
-     (ver, files[0]['yanked_reason'])
296
-     for ver, files in data['releases'].items()
297
-     if files and files[0].get('yanked')
298
- ]
299
- print(yanked[:2])
300
- # [('2.32.0', 'Yanked due to conflicts with CVE-2024-35195 mitigation'),
301
- # ('2.32.1', 'Yanked due to conflicts with CVE-2024-35195 mitigation ')]
302
-
303
- # info.yanked is True only if the LATEST version is yanked
304
- print(data['info']['yanked']) # False
305
- print(data['info']['yanked_reason']) # None
306
- ```
307
-
308
- ### Download statistics (pypistats.org)
309
-
310
- PyPI does not expose download counts in its own JSON API. Use pypistats.org.
311
-
312
- ```python
313
- import json
314
-
315
- # Recent (last day/week/month) — fastest, single call
316
- stats = json.loads(http_get("https://pypistats.org/api/packages/requests/recent"))
317
- d = stats['data']
318
- print(d['last_day']) # 52969887
319
- print(d['last_week']) # 356556988
320
- print(d['last_month']) # 1385411770
321
-
322
- # Historical daily totals (overall, going back ~6 months)
323
- overall = json.loads(http_get("https://pypistats.org/api/packages/requests/overall"))
324
- # overall['data'] is list of {category, date, downloads}
325
- # category is 'with_mirrors' or 'without_mirrors'
326
- for row in overall['data'][:3]:
327
-     print(row['date'], row['category'], row['downloads'])
328
- # 2025-10-19 with_mirrors 21916634
329
- # 2025-10-19 without_mirrors 21882953
330
-
331
- # Without mirrors (pip installs only, more accurate for real usage):
332
- clean = json.loads(http_get(
333
-     "https://pypistats.org/api/packages/requests/overall?mirrors=false"
334
- ))
335
-
336
- # By Python major version
337
- by_python = json.loads(http_get(
338
-     "https://pypistats.org/api/packages/requests/python_major"
339
- ))
340
- # data rows: {category: '3', date: '...', downloads: N}
341
-
342
- # By OS
343
- by_sys = json.loads(http_get(
344
-     "https://pypistats.org/api/packages/requests/system"
345
- ))
346
- # data rows: {category: 'Darwin'|'Linux'|'Windows'|'other'|'null', date, downloads}
347
-
348
- # By Python minor version
349
- by_minor = json.loads(http_get(
350
-     "https://pypistats.org/api/packages/requests/python_minor"
351
- ))
352
- ```
353
-
354
- ### Parallel fetch for multiple packages
355
-
356
- ```python
357
- import json
358
- from concurrent.futures import ThreadPoolExecutor
359
-
360
- packages = ['numpy', 'pandas', 'scikit-learn', 'torch', 'tensorflow']
361
-
362
- def get_pypi_info(pkg):
363
-     d = json.loads(http_get(f"https://pypi.org/pypi/{pkg}/json"))
364
-     return {
365
-         'name': pkg,
366
-         'version': d['info']['version'],
367
-         'summary': d['info']['summary'],
368
-         'requires_python': d['info']['requires_python'],
369
-     }
370
-
371
- with ThreadPoolExecutor(max_workers=5) as ex:
372
-     results = list(ex.map(get_pypi_info, packages))
373
-
374
- for r in results:
375
-     print(r['name'], r['version'], r['summary'][:50])
376
- # numpy 2.4.4 Fundamental package for array computing in Python
377
- # pandas 3.0.2 Powerful data structures for data analysis, time s
378
- # scikit-learn 1.8.0 A set of python modules for machine learning and d
379
- # torch 2.11.0 Tensors and Dynamic neural networks in Python with
380
- # tensorflow 2.21.0 TensorFlow is an open source machine learning fram
381
- ```
382
-
383
- ### Error handling
384
-
385
- ```python
386
- import json, urllib.error
387
-
388
- try:
389
-     data = json.loads(http_get("https://pypi.org/pypi/nonexistent-xyz-abc/json"))
390
- except urllib.error.HTTPError as e:
391
-     print(e.code) # 404
392
-     # Body is HTML, not JSON; don't try to parse it
393
- ```
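Both registries answer a missing package with a 404, but only npm returns a JSON body. One way to treat the two uniformly is sketched below. `fetch_json_or_none` is a hypothetical wrapper, and it assumes the fetch callable (for example the `http_get` helper used above) raises `urllib.error.HTTPError` on non-2xx responses.

```python
import json
import urllib.error


def fetch_json_or_none(url, fetch):
    """Return parsed JSON, or None when the registry answers 404.

    `fetch` is any callable that takes a URL and returns the body text.
    It is assumed to raise urllib.error.HTTPError on non-2xx responses.
    """
    try:
        return json.loads(fetch(url))
    except urllib.error.HTTPError as e:
        if e.code == 404:
            # Missing package; the body may be HTML, so it is not parsed.
            return None
        raise  # anything else is unexpected and should surface
```

Passing the fetch function as an argument keeps the wrapper independent of how requests are made, and makes it easy to exercise with a stub.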
394
-
395
- ---
396
-
397
- ## Parallel fetch patterns
398
-
399
- ### Mixed registry + stats in one shot
400
-
401
- ```python
402
- import json
403
- from concurrent.futures import ThreadPoolExecutor
404
-
405
- def npm_info(pkg):
406
-     # Use single-version endpoint (1-2KB) not full registry doc (MB)
407
-     v = json.loads(http_get(f"https://registry.npmjs.org/{pkg}/latest"))
408
-     s = json.loads(http_get(f"https://api.npmjs.org/downloads/point/last-month/{pkg}"))
409
-     return {'name': pkg, 'version': v['version'], 'downloads': s['downloads']}
410
-
411
- pkgs = ['react', 'vue', 'svelte', 'solid-js', 'preact']
412
- with ThreadPoolExecutor(max_workers=5) as ex:
413
-     results = list(ex.map(npm_info, pkgs))
414
- for r in results:
415
-     print(r['name'], r['version'], f"{r['downloads']:,}")
416
- ```
417
-
418
- ### npm bulk downloads (most efficient for many packages)
419
-
420
- ```python
421
- import json
422
-
423
- # Up to ~128 packages in one HTTP call
424
- pkgs = ['react', 'vue', 'angular', 'svelte']
425
- bulk = json.loads(http_get(
426
-     f"https://api.npmjs.org/downloads/point/last-week/{','.join(pkgs)}"
427
- ))
428
- # Returns: {pkg_name: {'downloads': N, 'start': '...', 'end': '...', 'package': '...'}, ...}
429
- sorted_pkgs = sorted(bulk.items(), key=lambda x: x[1]['downloads'], reverse=True)
430
- for name, info in sorted_pkgs:
431
-     print(f"{name}: {info['downloads']:,}")
432
- ```
433
-
434
- ---
435
-
436
- ## Rate limits
437
-
438
- No rate limits encountered across rapid bursts of 10 sequential calls per endpoint (2026-04-18 testing):
439
-
440
- | API | Observed limit |
441
- |-----|----------------|
442
- | npm registry (`registry.npmjs.org`) | None observed |
443
- | npm downloads (`api.npmjs.org`) | None observed |
444
- | npm search | None observed |
445
- | PyPI JSON (`pypi.org`) | None observed |
446
- | pypistats.org | None observed |
447
-
448
- npm's official documentation mentions soft rate limits at very high volumes, but normal task-level usage (dozens of calls) is unaffected. If building a large scraper, add a short sleep between batches as a precaution.
449
-
450
- ---
451
-
452
- ## Gotchas
453
-
454
- - **Full npm registry doc is huge** — `registry.npmjs.org/react` is 6.3MB (2785 versions). When you only need the latest version metadata, fetch `registry.npmjs.org/react/latest` (~1.8KB) instead. Similarly for any specific version.
455
-
456
- - **npm `versions` dict keys are ordered oldest-first** — The last key is NOT necessarily the latest release; it may be a canary/experimental build. Always use `dist-tags.latest` to identify the stable latest version.
457
-
458
- - **PyPI `author` field is often `None`** — Many packages set `author_email` instead (often in `"Name" <email>` format). Fall back: `info['author'] or info['author_email']`.
459
-
460
- - **PyPI `home_page` is frequently empty** — Check `info['project_urls']` for `Homepage`, `Source`, `Documentation` links instead.
461
-
462
- - **PyPI `requires_dist` can be `None`** — Not an empty list — `None`. Always guard: `info.get('requires_dist') or []`.
463
-
464
- - **PyPI XML-RPC API is dead** — `https://pypi.org/pypi` (XML-RPC) returns a fault for most methods including `package_releases`. Use JSON API only.
465
-
466
- - **pypistats.org `total` field is `None`** — The `total` key in response JSON is null; compute sums from `data` list yourself.
467
-
468
- - **pypistats.org data goes back ~6 months** — The `overall` endpoint returns daily rows for roughly the past 180 days, not full history.
469
-
470
- - **PyPI yanked versions** — `data['releases'][ver][0]['yanked']` is `True` for yanked versions. `data['info']['yanked']` is only `True` if the latest version itself is yanked. Both `yanked` and `yanked_reason` fields exist on each file object.
471
-
472
- - **npm scoped packages** — Both `registry.npmjs.org/@scope/name` (direct path) and `registry.npmjs.org/@scope%2Fname` (URL-encoded) work. Use the direct path form.
473
-
474
- - **npm downloads bulk response is a dict** — When you request multiple packages, the response is `{pkg_name: {...}}`, not a list. Single-package response is a flat object with `downloads`, `start`, `end`, `package` directly.
475
-
476
- - **`http_get` handles gzip transparently** — The helper already decompresses gzip responses. No manual decompression needed.
477
-
478
- - **Never use a browser for either registry** — All data is JSON over HTTP. `http_get` calls take 80–480ms; a browser navigation would take 3–8 seconds with no benefit.
1
+ # npm & PyPI — Package Registry Data Extraction
2
+
3
+ `https://registry.npmjs.org` · `https://api.npmjs.org` · `https://pypi.org` · `https://pypistats.org`
4
+
5
+ Both registries expose full JSON APIs with no auth required. Never use a browser — every data point is available over HTTP.
6
+
7
+ Tested 2026-04-18 with `uv run python` + `http_get`.
8
+
9
+ ---
10
+
11
+ ## Latency reference (measured)
12
+
13
+ | Endpoint | Latency |
14
+ |----------|---------|
15
+ | PyPI package JSON | ~80ms |
16
+ | npm downloads point | ~110ms |
17
+ | npm registry full doc (react = 6.3MB) | ~280ms |
18
+ | npm registry search | ~330ms |
19
+ | pypistats.org recent | ~480ms |
20
+
21
+ ---
22
+
23
+ ## npm Registry
24
+
25
+ ### Package metadata
26
+
27
+ Two endpoints — pick based on what you need:
28
+
29
+ **Full registry document** — includes all version history, time map, author, bugs, homepage, keywords, README (when present). Large for popular packages (react = 6.3MB).
30
+
31
+ ```python
32
+ import json
33
+ data = json.loads(http_get("https://registry.npmjs.org/react"))
34
+
35
+ # Top-level keys: _id, name, dist-tags, versions, time, bugs, author,
36
+ # license, homepage, keywords, repository, description,
37
+ # contributors, maintainers, readme, readmeFilename, users
38
+ print(data['name']) # 'react'
39
+ print(data['dist-tags']['latest']) # '19.2.5'
40
+ print(data['time']['created']) # '2011-10-26T17:46:21.942Z'
41
+ print(data['time']['modified']) # '2026-04-18T00:57:09.913Z'
42
+
43
+ latest = data['dist-tags']['latest']
44
+ v = data['versions'][latest]
45
+ # Version object keys: name, version, description, license, keywords,
46
+ # homepage, bugs, repository, engines, exports, main, scripts,
47
+ # dependencies, devDependencies, peerDependencies, dist, maintainers,
48
+ # _npmUser, _nodeVersion, _npmVersion
49
+ print(v['description']) # 'React is a JavaScript library...'
50
+ print(v['license']) # 'MIT'
51
+ print(list(v.get('dependencies', {}).keys())) # [] (react 19 has no runtime deps)
52
+ print(v.get('homepage')) # 'https://react.dev/'
53
+ print(len(data['versions'])) # 2785 — all published versions
54
+ ```
55
+
56
+ **Single version endpoint** — 1–2KB instead of megabytes. Use when you only need one version's data.
57
+
58
+ ```python
59
+ import json
60
+ # Fetch a specific version
61
+ v = json.loads(http_get("https://registry.npmjs.org/react/19.2.5"))
62
+ print(v['name'], v['version'], v['description'])
63
+
64
+ # Fetch latest directly (no need to resolve dist-tags first)
65
+ v = json.loads(http_get("https://registry.npmjs.org/react/latest"))
66
+ print(v['version']) # '19.2.5'
67
+ ```
68
+
69
+ **Abbreviated document** — skips time map and (in theory) README; versions dict still present. Use `Accept` header.
70
+
71
+ ```python
72
+ import json, urllib.request, gzip
73
+
74
+ req = urllib.request.Request(
75
+     "https://registry.npmjs.org/react",
76
+     headers={
77
+         "Accept": "application/vnd.npm.install-v1+json",
78
+         "Accept-Encoding": "gzip"
79
+     }
80
+ )
81
+ with urllib.request.urlopen(req, timeout=20) as r:
82
+     raw = r.read()
83
+     if r.headers.get("Content-Encoding") == "gzip":
84
+         raw = gzip.decompress(raw)
85
+ data = json.loads(raw)
86
+ # Keys: name, dist-tags, versions, modified (no time map, no readme)
87
+ print(data['dist-tags']['latest']) # '19.2.5' (react, matching the full document above)
88
+ ```
89
+
90
+ Note: the abbreviated document is still large (react: 2.7MB), so use the single-version endpoint when possible.
91
+
92
+ ### Scoped packages
93
+
94
+ Scoped packages (`@scope/name`) work with a direct path — no encoding needed:
95
+
96
+ ```python
97
+ import json
98
+ data = json.loads(http_get("https://registry.npmjs.org/@playwright/test"))
99
+ print(data['name']) # '@playwright/test'
100
+ print(data['dist-tags']['latest']) # '1.59.1'
101
+ print(len(data['versions'])) # 3148
102
+ ```
103
+
104
+ If constructing URLs dynamically, either form works:
105
+ ```python
106
+ # Direct path (preferred)
107
+ url = f"https://registry.npmjs.org/{pkg}" # '@playwright/test'
108
+ # URL-encoded slash
109
+ url = f"https://registry.npmjs.org/{pkg.replace('/', '%2F')}"
110
+ ```
111
+
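The two URL forms above can be wrapped in one small builder. This is a minimal sketch, not part of the original skill: `npm_registry_url` and its `encode_slash` flag are hypothetical names, and appending a version to a scoped name is an assumption (the source only demonstrates the version suffix on `react`).

```python
from urllib.parse import quote

REGISTRY = "https://registry.npmjs.org"

def npm_registry_url(pkg, version=None, encode_slash=False):
    """Build a registry URL for a plain or scoped package name (hypothetical helper)."""
    # quote(..., safe="@") keeps the leading '@' and turns '/' into '%2F'
    name = quote(pkg, safe="@") if encode_slash else pkg
    url = f"{REGISTRY}/{name}"
    if version:
        url += f"/{version}"  # e.g. 'latest' or '19.2.5'
    return url

print(npm_registry_url("@playwright/test"))
print(npm_registry_url("@playwright/test", encode_slash=True))
print(npm_registry_url("react", "latest"))
```

The direct-path form stays the default because the document prefers it; the encoded form is only there for callers that must pass the name as a single path segment.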
112
+ ### Download statistics
113
+
114
+ The npm downloads API is separate from the registry and very fast (~110ms).
115
+
116
+ **Point query** — single number for a period:
117
+
118
+ ```python
119
+ import json
120
+
121
+ # Supported periods: last-day, last-week, last-month, last-year
122
+ # Also accepts ISO date ranges: YYYY-MM-DD:YYYY-MM-DD
123
+
124
+ stats = json.loads(http_get("https://api.npmjs.org/downloads/point/last-week/react"))
125
+ print(stats['downloads']) # 123302510
126
+ print(stats['start']) # '2026-04-11'
127
+ print(stats['end']) # '2026-04-17'
128
+ print(stats['package']) # 'react'
129
+
130
+ # Confirmed values (2026-04-18):
131
+ # last-day: 19,411,762
132
+ # last-week: 123,302,510
133
+ # last-month: 502,719,511
134
+ # last-year: 3,000,644,845
135
+ ```
136
+
137
+ **Bulk point query** — up to ~128 packages in one call, comma-separated:
138
+
139
+ ```python
140
+ import json
141
+
142
+ bulk = json.loads(http_get(
143
+     "https://api.npmjs.org/downloads/point/last-week/"
144
+     "react,vue,angular,webpack,typescript,eslint,jest,prettier,rollup,babel"
145
+ ))
146
+ # Returns dict keyed by package name
147
+ for pkg, info in bulk.items():
148
+     print(f"{pkg}: {info['downloads']:,}")
149
+ # react: 123,302,510
150
+ # vue: 11,042,359
151
+ # angular: 524,366
152
+ # webpack: 44,425,549
153
+ # typescript: 180,054,359
154
+ # eslint: 126,113,686
155
+ # jest: 43,394,412
156
+ # prettier: 87,551,734
157
+ # rollup: 103,431,439
158
+ # babel: 139,207
159
+ ```
160
+
161
+ **Range query** — downloads per day over a period:
162
+
163
+ ```python
164
+ import json
165
+
166
+ resp = json.loads(http_get(
167
+     "https://api.npmjs.org/downloads/range/2025-01-01:2025-01-07/react"
168
+ ))
169
+ # resp['downloads'] is a list of {downloads, day} objects
170
+ for entry in resp['downloads']:
171
+     print(entry['day'], entry['downloads'])
172
+ # 2025-01-01 1336801
173
+ # 2025-01-02 3288088
174
+ # 2025-01-03 3381680
175
+ # ...
176
+ ```
177
+
178
+ ### Search
179
+
180
+ ```python
181
+ import json
182
+
183
+ # Query parameters: text, size (max ~250), from (offset), plus quality, popularity, maintenance weights
184
+ data = json.loads(http_get(
185
+     "https://registry.npmjs.org/-/v1/search?text=browser+automation&size=5"
186
+ ))
187
+ print(data['total']) # total results matching the query
188
+
189
+ for obj in data['objects']:
190
+     p = obj['package']
191
+     s = obj['score']
192
+     # p keys: name, version, description, keywords, date, links, publisher, maintainers
193
+     # s keys: final, detail.quality, detail.popularity, detail.maintenance
194
+     print(
195
+         p['name'],
196
+         p['version'],
197
+         f"{s['final']:.2f}",
198
+         p.get('description', '')[:60]
199
+     )
200
+ # agent-browser 0.26.0 462.28 Browser automation CLI for AI agents
201
+ # nightmare 3.0.2 306.64 A high-level browser automation library.
202
+ ```
203
+
204
+ Score breakdown (all three are 0–1 floats):
205
+ - `quality` — code quality signals (tests, lint, TypeScript types)
206
+ - `popularity` — download counts normalized
207
+ - `maintenance` — release frequency, open issues
208
+
209
+ `final` is a weighted combination and is not confined to 0–1: the sample output above shows values in the hundreds (462.28, 306.64).
210
+
211
+ ### Error handling
212
+
213
+ ```python
214
+ import json, urllib.error
215
+
216
+ try:
217
+     data = json.loads(http_get("https://registry.npmjs.org/nonexistent-pkg-xyz"))
218
+ except urllib.error.HTTPError as e:
219
+     # 404 for missing packages
220
+     print(e.code) # 404
221
+     print(json.loads(e.read())) # {'error': 'Not found'}
222
+ ```
223
+
224
+ ---
225
+
226
+ ## PyPI
227
+
228
+ ### Package metadata
229
+
230
+ ```python
231
+ import json
232
+
233
+ # Latest version metadata
234
+ data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
235
+ info = data['info']
236
+
237
+ # info keys (selected):
238
+ print(info['name']) # 'requests'
239
+ print(info['version']) # '2.33.1'
240
+ print(info['summary']) # 'Python HTTP for Humans.'
241
+ print(info['license']) # 'Apache-2.0'
242
+ print(info['author']) # None (sometimes empty — check author_email)
243
+ print(info['author_email']) # '"Kenneth Reitz" <me@kennethreitz.org>'
244
+ print(info['requires_python']) # '>=3.10'
245
+ print(info['home_page']) # None (may be empty — check project_urls)
246
+ print(info['project_urls'])
247
+ # {'Documentation': 'https://requests.readthedocs.io',
248
+ # 'Source': 'https://github.com/psf/requests'}
249
+
250
+ requires = info.get('requires_dist') or []
251
+ print(requires[:5])
252
+ # ['charset_normalizer<4,>=2', 'idna<4,>=2.5', 'urllib3<3,>=1.26',
253
+ # 'certifi>=2023.5.7', 'PySocks!=1.5.7,>=1.5.6; extra == "socks"']
254
+
255
+ print(info.get('classifiers', [])[:3])
256
+ # ['Development Status :: 5 - Production/Stable',
257
+ # 'Intended Audience :: Developers',
258
+ # 'License :: OSI Approved :: Apache Software License']
259
+
260
+ # data['urls'] — list of dist files for the latest version
261
+ for f in data['urls']:
262
+     # keys: filename, packagetype, python_version, size, digests, url,
263
+     # upload_time, requires_python, yanked, yanked_reason
264
+     print(f['packagetype'], f['python_version'], f['filename'], f['size'])
265
+ # bdist_wheel py3 requests-2.33.1-py3-none-any.whl 64947
266
+ # sdist source requests-2.33.1.tar.gz 134120
267
+ ```
268
+
269
+ ### Specific version
270
+
271
+ ```python
272
+ import json
273
+
274
+ # Fetch a pinned version (not just latest)
275
+ data = json.loads(http_get("https://pypi.org/pypi/requests/2.32.3/json"))
276
+ print(data['info']['version']) # '2.32.3'
277
+ # Same structure as the latest endpoint
278
+ ```
279
+
280
+ ### Version history and yanked releases
281
+
282
+ ```python
283
+ import json
284
+
285
+ data = json.loads(http_get("https://pypi.org/pypi/requests/json"))
286
+
287
+ # data['releases'] is a dict: version_string -> list of file objects
288
+ versions = list(data['releases'].keys())
289
+ print("Total versions:", len(versions)) # 159
290
+ # Versions are insertion-ordered (chronological, oldest first)
291
+ # dict key order is stable
292
+
293
+ # Find yanked versions
294
+ yanked = [
295
+     (ver, files[0]['yanked_reason'])
296
+     for ver, files in data['releases'].items()
297
+     if files and files[0].get('yanked')
298
+ ]
299
+ print(yanked[:2])
300
+ # [('2.32.0', 'Yanked due to conflicts with CVE-2024-35195 mitigation'),
301
+ # ('2.32.1', 'Yanked due to conflicts with CVE-2024-35195 mitigation ')]
302
+
303
+ # info.yanked is True only if the LATEST version is yanked
304
+ print(data['info']['yanked']) # False
305
+ print(data['info']['yanked_reason']) # None
306
+ ```
307
+
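When picking a version to install or compare against, it can help to separate usable releases from yanked ones in a single pass. The sketch below assumes the `releases` shape described above (version string mapped to a list of file objects); `split_yanked` is a hypothetical helper and the `sample` dict is a hand-written stand-in, not live PyPI data.

```python
def split_yanked(releases):
    """Partition a PyPI `releases` mapping into usable versions and yanked ones."""
    usable, yanked = [], {}
    for ver, files in releases.items():
        if not files:
            continue  # version with no uploaded files: nothing to install
        if all(f.get("yanked") for f in files):
            yanked[ver] = files[0].get("yanked_reason")
        else:
            usable.append(ver)
    return usable, yanked

# Hand-written stand-in for data['releases'] (illustrative only)
sample = {
    "0.0.1": [],
    "2.31.0": [{"yanked": False, "yanked_reason": None}],
    "2.32.0": [{"yanked": True,
                "yanked_reason": "Yanked due to conflicts with CVE-2024-35195 mitigation"}],
    "2.32.3": [{"yanked": False, "yanked_reason": None}],
}
usable, yanked = split_yanked(sample)
print(usable)        # ['2.31.0', '2.32.3']
print(list(yanked))  # ['2.32.0']
```

Checking every file rather than only `files[0]` is a defensive choice in this sketch; the original snippet inspects the first file only.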
308
+ ### Download statistics (pypistats.org)
309
+
310
+ PyPI does not expose download counts in its own JSON API. Use pypistats.org.
311
+
312
+ ```python
313
+ import json
314
+
315
+ # Recent (last day/week/month) — fastest, single call
316
+ stats = json.loads(http_get("https://pypistats.org/api/packages/requests/recent"))
317
+ d = stats['data']
318
+ print(d['last_day']) # 52969887
319
+ print(d['last_week']) # 356556988
320
+ print(d['last_month']) # 1385411770
321
+
322
+ # Historical daily totals (overall, going back ~6 months)
323
+ overall = json.loads(http_get("https://pypistats.org/api/packages/requests/overall"))
324
+ # overall['data'] is list of {category, date, downloads}
325
+ # category is 'with_mirrors' or 'without_mirrors'
326
+ for row in overall['data'][:3]:
327
+     print(row['date'], row['category'], row['downloads'])
328
+ # 2025-10-19 with_mirrors 21916634
329
+ # 2025-10-19 without_mirrors 21882953
330
+
331
+ # Without mirrors (pip installs only, more accurate for real usage):
332
+ clean = json.loads(http_get(
333
+ "https://pypistats.org/api/packages/requests/overall?mirrors=false"
334
+ ))
335
+
336
+ # By Python major version
337
+ by_python = json.loads(http_get(
338
+ "https://pypistats.org/api/packages/requests/python_major"
339
+ ))
340
+ # data rows: {category: '3', date: '...', downloads: N}
341
+
342
+ # By OS
343
+ by_sys = json.loads(http_get(
344
+ "https://pypistats.org/api/packages/requests/system"
345
+ ))
346
+ # data rows: {category: 'Darwin'|'Linux'|'Windows'|'other'|'null', date, downloads}
347
+
348
+ # By Python minor version
349
+ by_minor = json.loads(http_get(
350
+ "https://pypistats.org/api/packages/requests/python_minor"
351
+ ))
352
+ ```
353
+
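Because the `total` key in pypistats responses is null (see Gotchas), totals have to be summed from the `data` rows. A minimal sketch, assuming rows shaped like the `overall` output above; `total_by_category` is a hypothetical name, and the second day's rows are made-up figures added only to show the accumulation.

```python
from collections import defaultdict

def total_by_category(rows):
    """Sum `downloads` per `category` across pypistats data rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["category"]] += row["downloads"]
    return dict(totals)

rows = [
    # First two rows mirror the sample output above
    {"category": "with_mirrors", "date": "2025-10-19", "downloads": 21916634},
    {"category": "without_mirrors", "date": "2025-10-19", "downloads": 21882953},
    # Made-up rows for a second day (illustrative only)
    {"category": "with_mirrors", "date": "2025-10-20", "downloads": 100},
    {"category": "without_mirrors", "date": "2025-10-20", "downloads": 90},
]
print(total_by_category(rows))
# {'with_mirrors': 21916734, 'without_mirrors': 21883043}
```

The same function works for the `python_major`, `python_minor`, and `system` endpoints, since their rows use the same `category`/`downloads` keys.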
354
+ ### Parallel fetch for multiple packages
355
+
356
+ ```python
357
+ import json
358
+ from concurrent.futures import ThreadPoolExecutor
359
+
360
+ packages = ['numpy', 'pandas', 'scikit-learn', 'torch', 'tensorflow']
361
+
362
+ def get_pypi_info(pkg):
363
+     d = json.loads(http_get(f"https://pypi.org/pypi/{pkg}/json"))
364
+     return {
365
+         'name': pkg,
366
+         'version': d['info']['version'],
367
+         'summary': d['info']['summary'],
368
+         'requires_python': d['info']['requires_python'],
369
+     }
370
+
371
+ with ThreadPoolExecutor(max_workers=5) as ex:
372
+     results = list(ex.map(get_pypi_info, packages))
373
+
374
+ for r in results:
375
+     print(r['name'], r['version'], r['summary'][:50])
376
+ # numpy 2.4.4 Fundamental package for array computing in Python
377
+ # pandas 3.0.2 Powerful data structures for data analysis, time s
378
+ # scikit-learn 1.8.0 A set of python modules for machine learning and d
379
+ # torch 2.11.0 Tensors and Dynamic neural networks in Python with
380
+ # tensorflow 2.21.0 TensorFlow is an open source machine learning fram
381
+ ```
382
+
383
+ ### Error handling
384
+
385
+ ```python
386
+ import json, urllib.error
387
+
388
+ try:
389
+     data = json.loads(http_get("https://pypi.org/pypi/nonexistent-xyz-abc/json"))
390
+ except urllib.error.HTTPError as e:
391
+     print(e.code) # 404
392
+     # Body is HTML, not JSON — don't try to parse it
393
+ ```
394
+
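In batch jobs it is often more convenient to get `None` for a missing package than to handle the exception at every call site. The wrapper below is a sketch under assumptions: `fetch_json_or_none` is a hypothetical name, and the fetch function is passed in as a parameter (in practice that would be `http_get`; here a `fake_fetch` stand-in simulates the 404 so the example runs offline).

```python
import io
import json
import urllib.error

def fetch_json_or_none(fetch, url):
    """Return parsed JSON, or None when the server answers 404."""
    try:
        return json.loads(fetch(url))
    except urllib.error.HTTPError as e:
        if e.code == 404:
            return None  # missing package; PyPI's error body is HTML, so it is not parsed
        raise  # anything else (5xx, 403, ...) is still surfaced to the caller

def fake_fetch(url):
    # Offline stand-in for http_get (illustrative only)
    if "nonexistent" in url:
        raise urllib.error.HTTPError(url, 404, "Not Found", None, io.BytesIO(b"<html></html>"))
    return '{"info": {"name": "requests"}}'

print(fetch_json_or_none(fake_fetch, "https://pypi.org/pypi/nonexistent-xyz-abc/json"))
print(fetch_json_or_none(fake_fetch, "https://pypi.org/pypi/requests/json"))
```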
395
+ ---
396
+
397
+ ## Parallel fetch patterns
398
+
399
+ ### Mixed registry + stats in one shot
400
+
401
+ ```python
402
+ import json
403
+ from concurrent.futures import ThreadPoolExecutor
404
+
405
+ def npm_info(pkg):
406
+     # Use single-version endpoint (1-2KB) not full registry doc (MB)
407
+     v = json.loads(http_get(f"https://registry.npmjs.org/{pkg}/latest"))
408
+     s = json.loads(http_get(f"https://api.npmjs.org/downloads/point/last-month/{pkg}"))
409
+     return {'name': pkg, 'version': v['version'], 'downloads': s['downloads']}
410
+
411
+ pkgs = ['react', 'vue', 'svelte', 'solid-js', 'preact']
412
+ with ThreadPoolExecutor(max_workers=5) as ex:
413
+     results = list(ex.map(npm_info, pkgs))
414
+ for r in results:
415
+     print(r['name'], r['version'], f"{r['downloads']:,}")
416
+ ```
417
+
418
+ ### npm bulk downloads (most efficient for many packages)
419
+
420
+ ```python
421
+ import json
422
+
423
+ # Up to ~128 packages in one HTTP call
424
+ pkgs = ['react', 'vue', 'angular', 'svelte']
425
+ bulk = json.loads(http_get(
426
+     f"https://api.npmjs.org/downloads/point/last-week/{','.join(pkgs)}"
427
+ ))
428
+ # Returns: {pkg_name: {'downloads': N, 'start': '...', 'end': '...', 'package': '...'}, ...}
429
+ sorted_pkgs = sorted(bulk.items(), key=lambda x: x[1]['downloads'], reverse=True)
430
+ for name, info in sorted_pkgs:
431
+     print(f"{name}: {info['downloads']:,}")
432
+ ```
433
+
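For lists longer than the ~128-package ceiling noted above, the names can be split into chunks with one bulk URL per chunk. A minimal sketch: `bulk_download_urls` is a hypothetical helper, the default `chunk_size=128` simply reuses the approximate limit from this document, and the package names are placeholders.

```python
def bulk_download_urls(pkgs, period="last-week", chunk_size=128):
    """Return one bulk point-query URL per chunk of package names."""
    base = "https://api.npmjs.org/downloads/point"
    return [
        f"{base}/{period}/{','.join(pkgs[i:i + chunk_size])}"
        for i in range(0, len(pkgs), chunk_size)
    ]

names = [f"pkg-{n}" for n in range(300)]  # placeholder names, not real packages
urls = bulk_download_urls(names)
print(len(urls))  # 3 (chunks of 128, 128, 44)
```

Each URL can then be fetched with `http_get` and the resulting dicts merged with `dict.update`, since every bulk response is keyed by package name.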
434
+ ---
435
+
436
+ ## Rate limits
437
+
438
+ No rate limits encountered across rapid bursts of 10 sequential calls per endpoint (2026-04-18 testing):
439
+
440
+ | API | Observed limit |
441
+ |-----|----------------|
442
+ | npm registry (`registry.npmjs.org`) | None observed |
443
+ | npm downloads (`api.npmjs.org`) | None observed |
444
+ | npm search | None observed |
445
+ | PyPI JSON (`pypi.org`) | None observed |
446
+ | pypistats.org | None observed |
447
+
448
+ npm's official documentation mentions soft rate limits at very high volumes, but normal task-level usage (dozens of calls) is unaffected. If building a large scraper, add a short sleep between batches as a precaution.
449
+
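The "short sleep between batches" precaution could look like the sketch below. It is an illustration, not a measured recommendation: `map_in_batches`, the batch size, and the pause length are all assumptions, and `str.upper` stands in for a real fetch function such as `npm_info` so the example runs offline.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def map_in_batches(fn, items, batch_size=5, pause=0.5):
    """Apply fn to items in parallel batches, sleeping between batches."""
    results = []
    for i in range(0, len(items), batch_size):
        batch = items[i:i + batch_size]
        with ThreadPoolExecutor(max_workers=batch_size) as ex:
            results.extend(ex.map(fn, batch))  # ex.map preserves input order
        if i + batch_size < len(items):
            time.sleep(pause)  # no sleep after the final batch
    return results

print(map_in_batches(str.upper, ["react", "vue", "svelte"], batch_size=2, pause=0.01))
# ['REACT', 'VUE', 'SVELTE']
```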
450
+ ---
451
+
452
+ ## Gotchas
453
+
454
+ - **Full npm registry doc is huge** — `registry.npmjs.org/react` is 6.3MB (2785 versions). When you only need the latest version metadata, fetch `registry.npmjs.org/react/latest` (~1.8KB) instead. Similarly for any specific version.
455
+
456
+ - **npm `versions` dict keys are ordered oldest-first** — The last key is NOT necessarily the latest release; it may be a canary/experimental build. Always use `dist-tags.latest` to identify the stable latest version.
457
+
458
+ - **PyPI `author` field is often `None`** — Many packages set `author_email` instead (often in `"Name" <email>` format). Fall back: `info['author'] or info['author_email']`.
459
+
460
+ - **PyPI `home_page` is frequently empty** — Check `info['project_urls']` for `Homepage`, `Source`, `Documentation` links instead.
461
+
462
+ - **PyPI `requires_dist` can be `None`** — Not an empty list — `None`. Always guard: `info.get('requires_dist') or []`.
463
+
464
+ - **PyPI XML-RPC API is dead** — `https://pypi.org/pypi` (XML-RPC) returns a fault for most methods including `package_releases`. Use JSON API only.
465
+
466
+ - **pypistats.org `total` field is `None`** — The `total` key in response JSON is null; compute sums from `data` list yourself.
467
+
468
+ - **pypistats.org data goes back ~6 months** — The `overall` endpoint returns daily rows for roughly the past 180 days, not full history.
469
+
470
+ - **PyPI yanked versions** — `data['releases'][ver][0]['yanked']` is `True` for yanked versions. `data['info']['yanked']` is only `True` if the latest version itself is yanked. Both `yanked` and `yanked_reason` fields exist on each file object.
471
+
472
+ - **npm scoped packages** — Both `registry.npmjs.org/@scope/name` (direct path) and `registry.npmjs.org/@scope%2Fname` (URL-encoded) work. Use the direct path form.
473
+
474
+ - **npm downloads bulk response is a dict** — When you request multiple packages, the response is `{pkg_name: {...}}`, not a list. Single-package response is a flat object with `downloads`, `start`, `end`, `package` directly.
475
+
476
+ - **`http_get` handles gzip transparently** — The helper already decompresses gzip responses. No manual decompression needed.
477
+
478
+ - **Never use a browser for either registry** — All data is JSON over HTTP. `http_get` calls take 80–480ms; a browser navigation would take 3–8 seconds with no benefit.
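The three PyPI fallbacks listed above (`author`, `home_page`, `requires_dist`) can be applied in one place. This is a sketch, assuming the `info` shape shown in the PyPI metadata section; `normalize_pypi_info` is a hypothetical helper, and the order in which `project_urls` keys are tried is a choice made here, not something the API prescribes.

```python
def normalize_pypi_info(info):
    """Apply the None-safe fallbacks for author, homepage, and requires_dist."""
    urls = info.get("project_urls") or {}
    return {
        "author": info.get("author") or info.get("author_email"),
        "homepage": (info.get("home_page")
                     or urls.get("Homepage")
                     or urls.get("Source")
                     or urls.get("Documentation")),
        "requires_dist": info.get("requires_dist") or [],
    }

# Values mirror the `requests` example earlier in this document,
# except requires_dist, which is set to None to exercise the guard
sample_info = {
    "author": None,
    "author_email": '"Kenneth Reitz" <me@kennethreitz.org>',
    "home_page": None,
    "project_urls": {
        "Documentation": "https://requests.readthedocs.io",
        "Source": "https://github.com/psf/requests",
    },
    "requires_dist": None,
}
print(normalize_pypi_info(sample_info))
```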