@pencil-agent/nano-pencil 2.0.0-beta.9 → 2.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (207)
  1. package/README.md +267 -267
  2. package/dist/build-meta.json +3 -3
  3. package/dist/core/export-html/AGENT.md +11 -11
  4. package/dist/core/export-html/template.css +971 -971
  5. package/dist/core/export-html/template.html +54 -54
  6. package/dist/core/extensions-host/index.d.ts +1 -1
  7. package/dist/core/extensions-host/types.d.ts +5 -8
  8. package/dist/extensions/builtin/AGENT.md +115 -115
  9. package/dist/extensions/builtin/browser/AGENT.md +17 -17
  10. package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
  11. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
  12. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
  13. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
  14. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
  15. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
  16. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
  17. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
  18. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
  19. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
  20. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
  21. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
  22. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
  23. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
  24. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
  25. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
  26. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
  27. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
  28. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
  29. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
  30. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
  31. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
  32. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
  33. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
  34. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
  35. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
  36. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
  37. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
  38. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
  39. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
  40. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
  41. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
  42. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
  43. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
  44. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
  45. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
  46. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
  47. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
  48. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
  49. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
  50. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
  51. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
  52. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
  53. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
  54. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
  55. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
  56. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
  57. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
  58. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
  59. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
  60. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
  61. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
  62. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
  63. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
  64. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
  65. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
  66. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
  67. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
  68. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
  69. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
  70. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
  71. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
  72. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
  73. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
  74. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
  75. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
  76. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
  77. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
  78. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
  79. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
  80. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
  81. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
  82. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
  83. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
  84. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
  85. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
  86. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
  87. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
  88. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
  89. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
  90. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
  91. package/dist/extensions/builtin/browser/browser.md +73 -73
  92. package/dist/extensions/builtin/browser/install.md +142 -142
  93. package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
  94. package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
  95. package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
  96. package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
  97. package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
  98. package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
  99. package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
  100. package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
  101. package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
  102. package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
  103. package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
  104. package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
  105. package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
  106. package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
  107. package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
  108. package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
  109. package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
  110. package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
  111. package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
  112. package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
  113. package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
  114. package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
  115. package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
  116. package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
  117. package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
  118. package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
  119. package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
  120. package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
  121. package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
  122. package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
  123. package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
  124. package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
  125. package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
  126. package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
  127. package/dist/extensions/builtin/goal/README.md +67 -67
  128. package/dist/extensions/builtin/goal/goal-controller.js +1 -1
  129. package/dist/extensions/builtin/goal/goal-prompts.js +4 -4
  130. package/dist/extensions/builtin/grub/README.md +112 -112
  131. package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
  132. package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
  133. package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
  134. package/dist/extensions/builtin/link-world/linkworld.md +313 -313
  135. package/dist/extensions/builtin/link-world/network-routing/network-routing.md +67 -67
  136. package/dist/extensions/builtin/loop/README.md +92 -92
  137. package/dist/extensions/builtin/mcp/figma-design.md +68 -68
  138. package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
  139. package/dist/extensions/builtin/recap/AGENT.md +15 -15
  140. package/dist/extensions/builtin/sal/README.md +72 -72
  141. package/dist/extensions/builtin/security-audit/README.md +289 -289
  142. package/dist/extensions/builtin/team/AGENT.md +112 -112
  143. package/dist/extensions/builtin/team/TESTING.md +299 -299
  144. package/dist/extensions/builtin/token-save/README.md +56 -56
  145. package/dist/extensions/optional/AGENT.md +10 -10
  146. package/dist/index.d.ts +5 -30
  147. package/dist/index.js +1 -1
  148. package/dist/models.d.ts +7 -0
  149. package/dist/models.js +1 -0
  150. package/dist/modes/interactive/theme/dark.json +85 -85
  151. package/dist/modes/interactive/theme/light.json +84 -84
  152. package/dist/modes/interactive/theme/theme-schema.json +335 -335
  153. package/dist/modes/interactive/theme/warm.json +81 -81
  154. package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
  155. package/dist/packages/protocol/src/flags.d.ts +20 -0
  156. package/dist/packages/protocol/src/flags.js +0 -0
  157. package/dist/packages/protocol/src/hooks.d.ts +17 -0
  158. package/dist/packages/protocol/src/hooks.js +0 -0
  159. package/dist/packages/protocol/src/index.d.ts +4 -2
  160. package/dist/packages/protocol/src/index.js +1 -1
  161. package/dist/packages/protocol/src/lifecycle.d.ts +11 -21
  162. package/dist/public-config.d.ts +12 -0
  163. package/dist/public-config.js +1 -0
  164. package/dist/runtime.d.ts +9 -0
  165. package/dist/runtime.js +1 -0
  166. package/dist/session-compaction.d.ts +7 -0
  167. package/dist/session-compaction.js +1 -0
  168. package/dist/session.d.ts +7 -0
  169. package/dist/session.js +1 -0
  170. package/dist/skills.d.ts +7 -0
  171. package/dist/skills.js +1 -0
  172. package/dist/tools.d.ts +7 -0
  173. package/dist/tools.js +1 -0
  174. package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +851 -0
  175. package/docs/SDK-TESTING.md +364 -0
  176. package/docs/codex-goal-command-impl.md +1055 -1055
  177. package/docs/codex-goal-vs-grub.md +500 -500
  178. package/docs/custom-provider.md +27 -27
  179. package/docs/extensions.md +27 -27
  180. package/docs/keybindings.md +27 -27
  181. package/docs/loop 重构完成总结.md +250 -250
  182. package/docs/loop 重构完成报告.md +122 -122
  183. package/docs/loop 重构方案.md +1222 -1222
  184. package/docs/loop 重构方案实现报告.md +158 -158
  185. package/docs/loop 重构方案对比分析.md +128 -128
  186. package/docs/loop 重构计划.md +320 -320
  187. package/docs/loop-usage-examples.md +214 -214
  188. package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +593 -0
  189. package/docs/models.md +27 -27
  190. package/docs/packages.md +27 -27
  191. package/docs/pi-design-philosophy.md +457 -457
  192. package/docs/planmode.md +1987 -1987
  193. package/docs/prompt-templates.md +27 -27
  194. package/docs/providers.md +27 -27
  195. package/docs/sdk.md +27 -27
  196. package/docs/skills.md +27 -27
  197. package/docs/startup-performance-optimization.md +301 -0
  198. package/docs/themes.md +27 -27
  199. package/docs/tui.md +27 -27
  200. package/docs/认知地图.md +47 -0
  201. package/package.json +190 -162
  202. package/docs/cc-agent-design.md +0 -1297
  203. package/docs/cc-tui-design.md +0 -1333
  204. package/docs/nanoPencil-学习计划.md +0 -170
  205. package/docs/scan-report.md +0 -3820
  206. package/docs//345/257/271/346/240/207Claude-Code.md +0 -1775
  207. package/docs//351/230/277/351/207/214/345/267/264/345/267/264/350/264/242/346/212/245/345/210/206/346/236/220/344/271/246.md +0 -261
@@ -1,339 +1,339 @@
1
- # NASA APIs — Scraping & Data Extraction
2
-
3
- `https://api.nasa.gov` — open NASA data APIs. **Never use the browser.** All endpoints return JSON via `http_get`. DEMO_KEY works for low-volume use; register for a free personal key at https://api.nasa.gov/ to raise limits.
4
-
5
- ## Do this first
6
-
7
- **All `api.nasa.gov` endpoints share the same rate-limit pool under DEMO_KEY. EPIC and Exoplanet Archive are on separate domains with no rate limit.**
8
-
9
- ```python
10
- import json
11
- from helpers import http_get
12
-
13
- # Simplest call: today's Astronomy Picture of the Day
14
- apod = json.loads(http_get("https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY"))
15
- print(apod['date'], apod['title'], apod['media_type'])
16
- # Confirmed output (2026-04-18): 2026-04-18 PanSTARRS and Planets image
17
- ```
18
-
19
- Use DEMO_KEY for exploration. Switch to a personal key for any bulk work — DEMO_KEY hits its limit at ~10 req/hour/IP (daily budget around 50; `retry-after` header will show ~22 hours when exhausted).
20
-
21
- ## Rate limits
22
-
23
- | Key type | Limit | Resets |
24
- |---|---|---|
25
- | `DEMO_KEY` | 10 req/hour, ~50/day per IP | Hourly window; daily hard stop with `retry-after` ~22h |
26
- | Personal key (free) | 1,000 req/hour | Hourly window |
27
-
28
- Rate limit headers on every `api.nasa.gov` response:
29
- - `X-Ratelimit-Limit` — your current window limit (e.g. `10`)
30
- - `X-Ratelimit-Remaining` — calls left this window
31
- - `Retry-After` — seconds until next window (only on 429)
32
-
33
- **EPIC (`epic.gsfc.nasa.gov`) and Exoplanet Archive (`exoplanetarchive.ipac.caltech.edu`) share no rate-limit pool with `api.nasa.gov`.**
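The rate-limit headers listed above can drive a simple back-off decision. The following is a minimal sketch, not part of the package: `seconds_to_wait` is a hypothetical helper, it assumes the caller already has the response status and headers as a plain dict (the source does not say whether `http_get` exposes them), and the one-hour fallback is an assumption based on the hourly window described in the table.

```python
def seconds_to_wait(status: int, headers: dict) -> int:
    """Return how long to pause before the next api.nasa.gov call."""
    # HTTP header names are case-insensitive, so normalize the keys first.
    h = {k.lower(): v for k, v in headers.items()}
    if status == 429:
        # Retry-After is only sent on 429 responses; fall back to one hour
        # (an assumption) if it is missing.
        return int(h.get("retry-after", 3600))
    remaining = int(h.get("x-ratelimit-remaining", 1))
    # Assumption: with no calls left, wait out the rest of the hourly window.
    return 3600 if remaining <= 0 else 0
```

A caller would sleep for the returned number of seconds before retrying; a return value of `0` means the next request can go out immediately.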
34
-
35
- ## Common workflows
36
-
37
- ### APOD — single day
38
-
39
- ```python
40
- import json
41
- from helpers import http_get
42
-
43
- apod = json.loads(http_get("https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY"))
44
- print(apod['date']) # '2026-04-18'
45
- print(apod['title']) # 'PanSTARRS and Planets'
46
- print(apod['media_type']) # 'image' or 'video'
47
- print(apod['url']) # full-res or YouTube embed URL
48
- print(apod.get('hdurl')) # HD image URL (None when media_type='video')
49
- print(apod.get('copyright')) # None if public domain
50
- # Confirmed output (2026-04-18):
51
- # url: https://apod.nasa.gov/apod/image/2604/PanstarrsPlanetsPerrotLab1024.jpg
52
- # hdurl: https://apod.nasa.gov/apod/image/2604/PanstarrsPlanetsPerrot.jpg
53
- # copyright: Luc Perrot
54
- ```
55
-
56
- ### APOD — date range (array response)
57
-
58
- ```python
59
- import json
60
- from helpers import http_get
61
-
62
- apods = json.loads(http_get(
63
- "https://api.nasa.gov/planetary/apod"
64
- "?start_date=2024-01-01&end_date=2024-01-07&api_key=DEMO_KEY"
65
- ))
66
- # Returns a list of 7 dicts — same schema as single-day response
67
- for a in apods:
68
-     print(a['date'], a['media_type'], a['title'][:50])
69
- # Confirmed output (7 items):
70
- # 2024-01-01 image NGC 1232: A Grand Design Spiral Galaxy
71
- # 2024-01-02 image Rocket Transits Rippling Moon
72
- # 2024-01-03 image A SAR Arc from New Zealand
73
- # 2024-01-04 image Zeta Oph: Runaway Star
74
- # 2024-01-05 image Trapezium: At the Heart of Orion
75
- # 2024-01-06 video The Snows of Churyumov-Gerasimenko
76
- # 2024-01-07 image The Cat's Eye Nebula in Optical and X-ray
77
- ```
78
-
79
- Optional params: `date=YYYY-MM-DD` (specific day), `count=N` (N random entries), `thumbs=true` (include `thumbnail_url` for video entries).
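Because video entries carry no `hdurl`, picking a displayable image from an APOD entry needs a small branch. The sketch below is illustrative only: `best_image_url` is a hypothetical helper, and it relies on the fields described above (`media_type`, `hdurl`, `url`, and `thumbnail_url` when `thumbs=true` was requested).

```python
from typing import Optional


def best_image_url(entry: dict) -> Optional[str]:
    """Pick the most useful still-image URL from one APOD entry."""
    if entry.get("media_type") == "image":
        # Prefer the HD version, fall back to the standard URL.
        return entry.get("hdurl") or entry.get("url")
    # Video entries have no hdurl; thumbnail_url is only present
    # when the request included thumbs=true.
    return entry.get("thumbnail_url")
```

The function returns `None` for a video entry fetched without `thumbs=true`, so callers can skip such entries instead of hitting a `KeyError`.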
80
-
81
- ### APOD — random sample
82
-
83
- ```python
84
- import json
85
- from helpers import http_get
86
-
87
- apods = json.loads(http_get(
88
- "https://api.nasa.gov/planetary/apod?count=5&api_key=DEMO_KEY"
89
- ))
90
- for a in apods:
91
-     print(a['date'], a['title'][:40])
92
- # Returns 5 random APOD entries — dates can be any day since 1995-06-16
93
- ```
94
-
95
- ### NEO — Near Earth Objects feed
96
-
97
- ```python
98
- import json
99
- from helpers import http_get
100
-
101
- data = json.loads(http_get(
102
- "https://api.nasa.gov/neo/rest/v1/feed"
103
- "?start_date=2024-01-01&end_date=2024-01-02&api_key=DEMO_KEY"
104
- ))
105
- print(data['element_count']) # 32 (total NEOs across both days)
106
- neos = data['near_earth_objects'] # dict keyed by date string
107
- for date, objects in sorted(neos.items()):
108
-     for neo in objects:
109
-         ca = neo['close_approach_data'][0]
110
-         print(
111
-             neo['name'],
112
-             'hazardous:', neo['is_potentially_hazardous_asteroid'],
113
-             'miss km:', ca['miss_distance']['kilometers'][:12],
114
-             'vel kph:', ca['relative_velocity']['kilometers_per_hour'][:10]
115
-         )
116
- # Confirmed output (2 days, 32 total NEOs):
117
- # 415949 (2001 XY10) hazardous: False miss km: 50452409.34 vel kph: 57205.8951
118
- # (22+ more objects per day)
119
- ```
120
-
121
- NEO object fields:
122
- - `id`, `name`, `nasa_jpl_url` — identity
123
- - `estimated_diameter` — dict with `kilometers`, `meters`, `miles`, `feet` sub-dicts, each with `min`/`max`
124
- - `is_potentially_hazardous_asteroid` — bool
125
- - `close_approach_data[0]` — `close_approach_date`, `miss_distance` (au/lunar/km/mi), `relative_velocity` (km/s, km/h, mph), `orbiting_body`
126
-
127
- Date range is capped at **7 days per request**. For longer ranges, paginate with `start_date` / `end_date` in 7-day steps. `links.next` in the response gives the next 7-day window URL.
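One way to paginate a longer range is to precompute the windows before issuing any requests. This is a sketch under an assumption: each window covers seven calendar days inclusive (start plus six days), which stays within the cap however the API counts the span. `seven_day_windows` is a hypothetical helper name.

```python
from datetime import date, timedelta


def seven_day_windows(start: date, end: date):
    """Yield (start_date, end_date) ISO strings covering start..end inclusive."""
    cursor = start
    while cursor <= end:
        # Seven calendar days per window, clipped to the requested end date.
        window_end = min(cursor + timedelta(days=6), end)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)
```

Each yielded pair can be substituted into the feed URL's `start_date` and `end_date` parameters. Every window costs one request against the shared rate-limit pool.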
128
-
129
- ### NEO — single asteroid lookup
130
-
131
- ```python
132
- import json
133
- from helpers import http_get
134
-
135
- # Asteroid ID comes from the feed's `id` field
136
- neo = json.loads(http_get(
137
- "https://api.nasa.gov/neo/rest/v1/neo/2415949?api_key=DEMO_KEY"
138
- ))
139
- print(neo['name'])
140
- print(neo['orbital_data']['orbit_class']['orbit_class_description'])
141
- # Full orbital history + all close approaches are in `close_approach_data` (long list)
142
- ```
143
-
144
- ### Mars Rover photos — Curiosity by sol
145
-
146
- ```python
147
- import json
148
- from helpers import http_get
149
-
150
- # sol = Martian solar day since landing
151
- data = json.loads(http_get(
152
- "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/photos"
153
- "?sol=1000&api_key=DEMO_KEY"
154
- ))
155
- photos = data['photos']
156
- print(f"Photos on sol 1000: {len(photos)}")
157
- p = photos[0]
158
- print(p['earth_date']) # '2015-05-30'
159
- print(p['img_src']) # direct JPEG URL
160
- print(p['camera']['name']) # 'FHAZ'
161
- print(p['camera']['full_name']) # 'Front Hazard Avoidance Camera'
162
- print(p['rover']['name']) # 'Curiosity'
163
- print(p['rover']['status']) # 'active'
164
- print(p['rover']['max_sol']) # highest sol with photos
165
-
166
- # Filter by camera
167
- data = json.loads(http_get(
168
- "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/photos"
169
- "?sol=1000&camera=navcam&api_key=DEMO_KEY"
170
- ))
171
- ```
172
-
173
- Available cameras for Curiosity: `fhaz`, `rhaz`, `mast`, `chemcam`, `mahli`, `mardi`, `navcam`. Other rovers: `opportunity`, `spirit`, `perseverance`.
174
-
175
- Use `latest_photos` to get the most recent available:
176
- ```python
177
- data = json.loads(http_get(
178
- "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/latest_photos"
179
- "?api_key=DEMO_KEY"
180
- ))
181
- photos = data['latest_photos']
182
- ```
183
-
184
- Add `&page=N` for pagination (25 photos/page by default).
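Since the response carries no total count, pagination has to continue until a page comes back empty. The sketch below shows that loop. `fetch_all_photos` is a hypothetical helper, `fetch` stands in for `http_get` (any callable that takes a URL and returns the response body as text), and `max_pages` is an assumed safety cap that is not part of the API.

```python
import json
from typing import Callable, List


def fetch_all_photos(base_url: str, fetch: Callable[[str], str],
                     max_pages: int = 100) -> List[dict]:
    """Collect photos page by page until an empty `photos` list is returned."""
    photos: List[dict] = []
    for page in range(1, max_pages + 1):
        # base_url must already contain a query string (e.g. ?sol=1000&api_key=...).
        batch = json.loads(fetch(f"{base_url}&page={page}"))["photos"]
        if not batch:
            break  # an empty page marks the end of the results
        photos.extend(batch)
    return photos
```

Each page is a separate request, so a large sol can use up the DEMO_KEY hourly budget quickly. A personal key is the safer choice for this pattern.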
185
-
186
- ### EPIC — Earth Polychromatic Imaging Camera
187
-
188
- EPIC images are served from `epic.gsfc.nasa.gov` — **no `api_key` required, no rate limit.**
189
-
190
- ```python
191
- import json
192
- from helpers import http_get
193
-
194
- # Latest available images (natural color)
195
- images = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural"))
196
- print(f"Latest batch: {len(images)} images") # Confirmed: 4 images on 2026-04-18
197
-
198
- img = images[0]
199
- print(img['identifier']) # '20260416162050'
200
- print(img['image']) # 'epic_1b_20260416162050'
201
- print(img['date']) # '2026-04-16 16:16:01'
202
- print(img['centroid_coordinates']) # {'lat': 13.25, 'lon': -75.59}
203
-
204
- # Construct PNG URL from image name + date
205
- date_str = img['date'].split(' ')[0] # '2026-04-16'
206
- year, month, day = date_str.split('-')
207
- png_url = f"https://epic.gsfc.nasa.gov/archive/natural/{year}/{month}/{day}/png/{img['image']}.png"
208
- jpg_thumb = f"https://epic.gsfc.nasa.gov/archive/natural/{year}/{month}/{day}/thumbs/{img['image']}.jpg"
209
- print(png_url)
210
- # Confirmed: https://epic.gsfc.nasa.gov/archive/natural/2026/04/16/png/epic_1b_20260416162050.png
211
- ```
212
-
213
- ```python
214
- # Images for a specific date
215
- images = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural/date/2024-01-15"))
216
- print(len(images)) # 14 images on 2024-01-15
217
-
218
- # Enhanced (color-corrected) images — same API, different path
219
- enhanced = json.loads(http_get("https://epic.gsfc.nasa.gov/api/enhanced/date/2024-01-15"))
220
- # Enhanced image URL pattern uses 'enhanced' and 'epic_RGB_' prefix:
221
- img = enhanced[0]
222
- date_str = img['date'].split(' ')[0]
223
- year, month, day = date_str.split('-')
224
- url = f"https://epic.gsfc.nasa.gov/archive/enhanced/{year}/{month}/{day}/png/{img['image']}.png"
225
- # e.g. .../archive/enhanced/2024/01/15/png/epic_RGB_20240115005515.png
226
-
227
- # All available dates
228
- all_dates = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural/all"))
229
- print(f"Available dates: {len(all_dates)}") # 3477 dates (2015-06-13 to present)
230
- print(all_dates[0]) # {'date': '2026-04-16'} (newest first)
231
- print(all_dates[-1]) # {'date': '2015-06-13'} (oldest)
232
- ```
233
-
234
- ### Exoplanet Archive — TAP/ADQL queries
235
-
236
- No API key or rate limit. SQL-like ADQL queries over the full archive.
237
-
238
- ```python
239
- import json
240
- from helpers import http_get
241
-
242
- # Short-period planets (orbital period under 10 days)
243
- planets = json.loads(http_get(
244
- "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"
245
- "?query=select+pl_name,hostname,pl_orbper+from+ps+where+pl_orbper+%3C+10"
246
- "&format=json"
247
- ))
248
- print(f"Rows: {len(planets)}") # 17675 (table 'ps' includes duplicate measurements)
249
- print(planets[0])
250
- # {'pl_name': 'GJ 1214 b', 'hostname': 'GJ 1214', 'pl_orbper': 1.58040482}
251
- ```
252
-
253
- ```python
254
- # Use 'pscomppars' for one row per planet (composite best-estimate params)
255
- planets = json.loads(http_get(
256
- "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"
257
- "?query=select+pl_name,hostname,disc_year,discoverymethod,pl_orbper,pl_rade,pl_masse,pl_eqt,sy_dist"
258
- "+from+pscomppars+where+disc_year+%3E+2020+and+pl_rade+is+not+null"
259
- "+order+by+disc_year+desc"
260
- "&format=json&maxrec=5"
261
- ))
262
- for p in planets:
263
-     print(p['pl_name'], p['disc_year'], p['discoverymethod'], f"r={p['pl_rade']}Re")
264
- # Confirmed output:
265
- # KMT-2024-BLG-1870L b 2026 Microlensing r=13.8Re
266
- # LHS 1903 b 2026 Transit r=1.382Re
267
- # TOI-375 d 2026 Radial Velocity r=13.6Re
268
- ```
269
-
270
- Key tables:
271
- - `ps` — all measurements per planet (multiple rows per planet, all sources)
272
- - `pscomppars` — one row per confirmed planet (best composite parameters)
273
-
274
- Key columns: `pl_name`, `hostname`, `disc_year`, `discoverymethod`, `pl_orbper` (orbital period, days), `pl_rade` (radius in Earth radii), `pl_masse` (mass in Earth masses), `pl_eqt` (equilibrium temp K), `sy_dist` (distance in parsec).
275
-
276
- URL-encode operators: `<` = `%3C`, `>` = `%3E`, spaces = `+`.
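Hand-encoding operators is error-prone, so the query string can instead be built with the standard library. This is a minimal sketch: `tap_url` is a hypothetical helper, and only the endpoint and the `query`, `format`, and `maxrec` parameters come from the examples above.

```python
from urllib.parse import urlencode

TAP_SYNC = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def tap_url(adql: str, maxrec=None) -> str:
    """Build a TAP sync URL from a plain ADQL string."""
    params = {"query": adql, "format": "json"}
    if maxrec is not None:
        params["maxrec"] = maxrec
    # urlencode uses quote_plus by default: spaces become '+',
    # '<' becomes %3C, '>' becomes %3E, ',' becomes %2C.
    return f"{TAP_SYNC}?{urlencode(params)}"
```

For example, `tap_url("select pl_name from ps where pl_orbper < 10")` produces a URL whose query part reads `query=select+pl_name+from+ps+where+pl_orbper+%3C+10&format=json`. Commas are percent-encoded as `%2C`, which is an equivalent encoding of the same query.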
277
-
278
- ## URL reference
279
-
280
- ### api.nasa.gov endpoints
281
-
282
- | Endpoint | URL pattern |
283
- |---|---|
284
- | APOD today | `https://api.nasa.gov/planetary/apod?api_key=KEY` |
285
- | APOD by date | `...&date=YYYY-MM-DD` |
286
- | APOD range | `...&start_date=YYYY-MM-DD&end_date=YYYY-MM-DD` |
287
- | APOD random N | `...&count=N` |
288
- | NEO feed | `https://api.nasa.gov/neo/rest/v1/feed?start_date=...&end_date=...&api_key=KEY` |
289
- | NEO by ID | `https://api.nasa.gov/neo/rest/v1/neo/{id}?api_key=KEY` |
290
- | Mars photos by sol | `https://api.nasa.gov/mars-photos/api/v1/rovers/{rover}/photos?sol=N&api_key=KEY` |
291
- | Mars photos by date | `...?earth_date=YYYY-MM-DD&api_key=KEY` |
292
- | Mars latest | `https://api.nasa.gov/mars-photos/api/v1/rovers/{rover}/latest_photos?api_key=KEY` |
293
-
294
- ### EPIC (epic.gsfc.nasa.gov — no key, no rate limit)
295
-
296
- | Endpoint | URL |
297
- |---|---|
298
- | Latest natural images | `https://epic.gsfc.nasa.gov/api/natural` |
299
- | Natural by date | `https://epic.gsfc.nasa.gov/api/natural/date/YYYY-MM-DD` |
300
- | Enhanced latest | `https://epic.gsfc.nasa.gov/api/enhanced` |
301
- | Enhanced by date | `https://epic.gsfc.nasa.gov/api/enhanced/date/YYYY-MM-DD` |
302
- | All available dates | `https://epic.gsfc.nasa.gov/api/natural/all` |
303
- | PNG image | `https://epic.gsfc.nasa.gov/archive/natural/YYYY/MM/DD/png/{image}.png` |
304
- | Thumbnail (JPEG) | `https://epic.gsfc.nasa.gov/archive/natural/YYYY/MM/DD/thumbs/{image}.jpg` |
305
- | Enhanced PNG | `https://epic.gsfc.nasa.gov/archive/enhanced/YYYY/MM/DD/png/{image}.png` |
306
-
307
- ### Exoplanet Archive (no key, no rate limit)
308
-
309
- ```
310
- https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=<ADQL>&format=json&maxrec=<N>
311
- ```
312
-
313
- ## Gotchas
314
-
315
- - **DEMO_KEY limit is effectively 10/hour per IP, not 30** — The `X-Ratelimit-Limit` header shows `10` in practice. When the daily budget (~50 req) is exhausted, the `retry-after` header is set to ~80,000 seconds (about 22 hours). Register a free personal key at https://api.nasa.gov/ to get 1,000/hour.
316
-
317
- - **All `api.nasa.gov` paths share one rate-limit pool** — APOD, NEO, Mars Rover, and all other `api.nasa.gov` paths draw from the same DEMO_KEY bucket. Calling any one of them depletes the limit for all others.
318
-
319
- - **EPIC and Exoplanet Archive are fully free** — `epic.gsfc.nasa.gov` returns no rate-limit headers and is not throttled. `exoplanetarchive.ipac.caltech.edu/TAP/sync` is similarly unrestricted. Use these freely without fear of exhausting DEMO_KEY.
320
-
321
- - **NEO date range max is 7 days** — Requests spanning more than 7 days return HTTP 400. Paginate with 7-day windows and use `links.next` from the response to get the next URL.
322
-
323
- - **APOD earliest date is 1995-06-16** — Requesting `date` before `1995-06-16` returns HTTP 400 with an error message. No upper date bound other than today.
324
-
325
- - **APOD `hdurl` is absent for video entries** — When `media_type` is `video`, the response has `url` (a YouTube embed URL) but no `hdurl`. Always check `media_type` before accessing `hdurl`.
326
-
327
- - **Mars Rover `sol` vs `earth_date`** — Both are valid filter params. `sol` is the Martian solar day since rover landing. `earth_date` uses `YYYY-MM-DD`. You cannot mix them in one request.
328
-
329
- - **Mars Rover pagination defaults to 25 photos/page** — Large sols (Curiosity sol 1000 has many photos) require `&page=2`, `&page=3`, etc. There is no total count in the response; keep paginating until you get an empty `photos` list.
330
-
331
- - **EPIC image name encodes type in the prefix** — Natural images use `epic_1b_` prefix; enhanced color-corrected images use `epic_RGB_` prefix. The API returns the correct filename in `img['image']`; don't guess the prefix.
332
-
333
- - **EPIC `/api/natural/all` returns newest-first** — The list of 3,477+ available dates starts from today and goes back to 2015-06-13. Not all days have images (gaps during spacecraft maintenance).
334
-
335
- - **Exoplanet `ps` table has multiple rows per planet** — Different publications report different measurements for the same planet. Use `pscomppars` for one-row-per-planet composite parameters. `ps` is useful when you need all reported values or want to filter by specific reference.
336
-
337
- - **Exoplanet null values come back as `None` in JSON** — Many fields like `pl_masse` are `null` for planets without mass measurements. Always guard with `if row['pl_masse'] is not None`.
338
-
339
- - **`http_get` in helpers.py uses stdlib `urllib`** — On some macOS Python 3.11 installs, SSL certificate verification fails (`CERTIFICATE_VERIFY_FAILED`). If you hit this, run `curl` via `subprocess` as a fallback, or install certifi and patch the default SSL context. The harness's browser CDP connection is not affected; only `http_get` is.
1
+ # NASA APIs — Scraping & Data Extraction
2
+
3
+ `https://api.nasa.gov` — open NASA data APIs. **Never use the browser.** All endpoints return JSON via `http_get`. DEMO_KEY works for low-volume use; register for a free personal key at https://api.nasa.gov/ to raise limits.
4
+
5
+ ## Do this first
6
+
7
+ **All `api.nasa.gov` endpoints share the same rate-limit pool under DEMO_KEY. EPIC and Exoplanet Archive are on separate domains with no rate limit.**
8
+
9
+ ```python
10
+ import json
11
+ from helpers import http_get
12
+
13
+ # Simplest call: today's Astronomy Picture of the Day
14
+ apod = json.loads(http_get("https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY"))
15
+ print(apod['date'], apod['title'], apod['media_type'])
16
+ # Confirmed output (2026-04-18): 2026-04-18 PanSTARRS and Planets image
17
+ ```
18
+
19
+ Use DEMO_KEY for exploration. Switch to a personal key for any bulk work — DEMO_KEY hits its limit at ~10 req/hour/IP (daily budget around 50; `retry-after` header will show ~22 hours when exhausted).
20
+
21
+ ## Rate limits
22
+
23
+ | Key type | Limit | Resets |
24
+ |---|---|---|
25
+ | `DEMO_KEY` | 10 req/hour, ~50/day per IP | Hourly window; daily hard stop with `retry-after` ~22h |
26
+ | Personal key (free) | 1,000 req/hour | Hourly window |
27
+
28
+ Rate limit headers on every `api.nasa.gov` response:
29
+ - `X-Ratelimit-Limit` — your current window limit (e.g. `10`)
30
+ - `X-Ratelimit-Remaining` — calls left this window
31
+ - `Retry-After` — seconds until next window (only on 429)
32
+
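A minimal sketch of reading the headers listed above, assuming you already have the response headers as a plain mapping (for example `dict(resp.headers)` from `urllib.request.urlopen`). Whether `http_get` exposes headers is not stated in this document, so `read_rate_limit` is a hypothetical helper and not part of the harness.

```python
def read_rate_limit(headers):
    """Extract rate-limit info from a header mapping (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise the keys first.
    lowered = {k.lower(): v for k, v in headers.items()}

    def as_int(name):
        value = lowered.get(name)
        try:
            return int(value) if value is not None else None
        except ValueError:
            # e.g. a Retry-After given as an HTTP date instead of seconds
            return None

    return {
        "limit": as_int("x-ratelimit-limit"),
        "remaining": as_int("x-ratelimit-remaining"),
        "retry_after_s": as_int("retry-after"),
    }


info = read_rate_limit({"X-Ratelimit-Limit": "10", "X-Ratelimit-Remaining": "3"})
print(info)  # {'limit': 10, 'remaining': 3, 'retry_after_s': None}
```

Checking `remaining` before a batch of calls is one way to stop early instead of running into the daily hard stop.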
33
+ **EPIC (`epic.gsfc.nasa.gov`) and Exoplanet Archive (`exoplanetarchive.ipac.caltech.edu`) share no rate-limit pool with `api.nasa.gov`.**
34
+
35
+ ## Common workflows
36
+
37
+ ### APOD — single day
38
+
39
+ ```python
40
+ import json
41
+ from helpers import http_get
42
+
43
+ apod = json.loads(http_get("https://api.nasa.gov/planetary/apod?api_key=DEMO_KEY"))
44
+ print(apod['date']) # '2026-04-18'
45
+ print(apod['title']) # 'PanSTARRS and Planets'
46
+ print(apod['media_type']) # 'image' or 'video'
47
+ print(apod['url'])           # standard-res image URL, or YouTube embed URL for video
48
+ print(apod.get('hdurl'))     # HD image URL; None for video entries (key is absent)
49
+ print(apod.get('copyright')) # None if public domain
50
+ # Confirmed output (2026-04-18):
51
+ # url: https://apod.nasa.gov/apod/image/2604/PanstarrsPlanetsPerrotLab1024.jpg
52
+ # hdurl: https://apod.nasa.gov/apod/image/2604/PanstarrsPlanetsPerrot.jpg
53
+ # copyright: Luc Perrot
54
+ ```
55
+
56
+ ### APOD — date range (array response)
57
+
58
+ ```python
59
+ import json
60
+ from helpers import http_get
61
+
62
+ apods = json.loads(http_get(
63
+ "https://api.nasa.gov/planetary/apod"
64
+ "?start_date=2024-01-01&end_date=2024-01-07&api_key=DEMO_KEY"
65
+ ))
66
+ # Returns a list of 7 dicts — same schema as single-day response
67
+ for a in apods:
68
+     print(a['date'], a['media_type'], a['title'][:50])
69
+ # Confirmed output (7 items):
70
+ # 2024-01-01 image NGC 1232: A Grand Design Spiral Galaxy
71
+ # 2024-01-02 image Rocket Transits Rippling Moon
72
+ # 2024-01-03 image A SAR Arc from New Zealand
73
+ # 2024-01-04 image Zeta Oph: Runaway Star
74
+ # 2024-01-05 image Trapezium: At the Heart of Orion
75
+ # 2024-01-06 video The Snows of Churyumov-Gerasimenko
76
+ # 2024-01-07 image The Cat's Eye Nebula in Optical and X-ray
77
+ ```
78
+
79
+ Optional params: `date=YYYY-MM-DD` (specific day), `count=N` (N random entries), `thumbs=true` (include `thumbnail_url` for video entries).
80
+
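Because range and random responses can mix image and video entries, it can help to pick a still-image URL defensively. The sketch below is one possible approach; `pick_image_url` is a hypothetical helper, and the sample dicts are illustrative placeholders rather than real API output.

```python
def pick_image_url(apod):
    """Return the best still-image URL for an APOD entry, or None."""
    if apod.get("media_type") == "image":
        # Prefer the HD image, fall back to the standard-resolution one.
        return apod.get("hdurl") or apod.get("url")
    # Video entries have no `hdurl`; `thumbnail_url` only appears with thumbs=true.
    return apod.get("thumbnail_url")


# Illustrative entries (placeholder URLs, not real responses)
image_entry = {"media_type": "image", "url": "std.jpg", "hdurl": "hd.jpg"}
video_entry = {"media_type": "video", "url": "embed-url"}

print(pick_image_url(image_entry))  # hd.jpg
print(pick_image_url(video_entry))  # None
```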
81
+ ### APOD — random sample
82
+
83
+ ```python
84
+ import json
85
+ from helpers import http_get
86
+
87
+ apods = json.loads(http_get(
88
+ "https://api.nasa.gov/planetary/apod?count=5&api_key=DEMO_KEY"
89
+ ))
90
+ for a in apods:
91
+     print(a['date'], a['title'][:40])
92
+ # Returns 5 random APOD entries — dates can be any day since 1995-06-16
93
+ ```
94
+
95
+ ### NEO — Near Earth Objects feed
96
+
97
+ ```python
98
+ import json
99
+ from helpers import http_get
100
+
101
+ data = json.loads(http_get(
102
+ "https://api.nasa.gov/neo/rest/v1/feed"
103
+ "?start_date=2024-01-01&end_date=2024-01-02&api_key=DEMO_KEY"
104
+ ))
105
+ print(data['element_count']) # 32 (total NEOs across both days)
106
+ neos = data['near_earth_objects'] # dict keyed by date string
107
+ for date, objects in sorted(neos.items()):
108
+     for neo in objects:
109
+         ca = neo['close_approach_data'][0]
110
+         print(
111
+             neo['name'],
112
+             'hazardous:', neo['is_potentially_hazardous_asteroid'],
113
+             'miss km:', ca['miss_distance']['kilometers'][:12],
114
+             'vel kph:', ca['relative_velocity']['kilometers_per_hour'][:10]
115
+         )
116
+ # Confirmed output (2 days, 32 total NEOs):
117
+ # 415949 (2001 XY10) hazardous: False miss km: 50452409.34 vel kph: 57205.8951
118
+ # (22+ more objects per day)
119
+ ```
120
+
121
+ NEO object fields:
122
+ - `id`, `name`, `nasa_jpl_url` — identity
123
+ - `estimated_diameter` — dict with `kilometers`, `meters`, `miles`, `feet` sub-dicts, each with `min`/`max`
124
+ - `is_potentially_hazardous_asteroid` — bool
125
+ - `close_approach_data[0]` — `close_approach_date`, `miss_distance` (au/lunar/km/mi), `relative_velocity` (km/s, km/h, mph), `orbiting_body`
126
+
127
+ Date range is capped at **7 days per request**. For longer ranges, paginate with `start_date` / `end_date` in 7-day steps. `links.next` in the response gives the next 7-day window URL.
128
+
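If you prefer to compute the windows yourself rather than follow `links.next`, the stepping can be sketched as below. This assumes, conservatively, that the 7-day cap counts both endpoints, so each window spans at most 7 calendar days; `neo_windows` is a hypothetical helper name.

```python
from datetime import date, timedelta


def neo_windows(start, end, span_days=7):
    """Yield (start_date, end_date) ISO strings covering [start, end]."""
    step = timedelta(days=span_days - 1)  # inclusive window of `span_days` days
    cursor = start
    while cursor <= end:
        window_end = min(cursor + step, end)
        yield cursor.isoformat(), window_end.isoformat()
        cursor = window_end + timedelta(days=1)


windows = list(neo_windows(date(2024, 1, 1), date(2024, 1, 20)))
for s, e in windows:
    print(f"https://api.nasa.gov/neo/rest/v1/feed?start_date={s}&end_date={e}&api_key=DEMO_KEY")
# Three windows: 01-01..01-07, 01-08..01-14, 01-15..01-20
```

Each window is one request against the shared `api.nasa.gov` rate-limit pool, so a long range under DEMO_KEY will exhaust the hourly limit quickly.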
129
+ ### NEO — single asteroid lookup
130
+
131
+ ```python
132
+ import json
133
+ from helpers import http_get
134
+
135
+ # Asteroid ID comes from the feed's `id` field
136
+ neo = json.loads(http_get(
137
+ "https://api.nasa.gov/neo/rest/v1/neo/2415949?api_key=DEMO_KEY"
138
+ ))
139
+ print(neo['name'])
140
+ print(neo['orbital_data']['orbit_class']['orbit_class_description'])
141
+ # Orbit details are in `orbital_data`; every close approach is in `close_approach_data` (long list)
142
+ ```
143
+
144
+ ### Mars Rover photos — Curiosity by sol
145
+
146
+ ```python
147
+ import json
148
+ from helpers import http_get
149
+
150
+ # sol = Martian solar day since landing
151
+ data = json.loads(http_get(
152
+ "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/photos"
153
+ "?sol=1000&api_key=DEMO_KEY"
154
+ ))
155
+ photos = data['photos']
156
+ print(f"Photos on sol 1000: {len(photos)}")
157
+ p = photos[0]
158
+ print(p['earth_date']) # '2015-05-30'
159
+ print(p['img_src']) # direct JPEG URL
160
+ print(p['camera']['name']) # 'FHAZ'
161
+ print(p['camera']['full_name']) # 'Front Hazard Avoidance Camera'
162
+ print(p['rover']['name']) # 'Curiosity'
163
+ print(p['rover']['status']) # 'active'
164
+ print(p['rover']['max_sol']) # highest sol with photos
165
+
166
+ # Filter by camera
167
+ data = json.loads(http_get(
168
+ "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/photos"
169
+ "?sol=1000&camera=navcam&api_key=DEMO_KEY"
170
+ ))
171
+ ```
172
+
173
+ Available cameras for Curiosity: `fhaz`, `rhaz`, `mast`, `chemcam`, `mahli`, `mardi`, `navcam`. Other rovers: `opportunity`, `spirit`, `perseverance`.
174
+
175
+ Use `latest_photos` to get the most recent available:
176
+ ```python
177
+ data = json.loads(http_get(
178
+ "https://api.nasa.gov/mars-photos/api/v1/rovers/curiosity/latest_photos"
179
+ "?api_key=DEMO_KEY"
180
+ ))
181
+ photos = data['latest_photos']
182
+ ```
183
+
184
+ Add `&page=N` for pagination (25 photos/page by default).
185
+
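Since the response carries no total count, one way to collect every photo for a sol is to keep requesting pages until an empty one comes back. The sketch below takes the page fetcher as a parameter so the loop can be shown without network access; `iter_rover_photos`, `fetch_page`, and the `max_pages` safety cap are hypothetical names introduced for illustration.

```python
def iter_rover_photos(fetch_page, max_pages=100):
    """Yield photos page by page until an empty page is returned."""
    for page in range(1, max_pages + 1):
        photos = fetch_page(page)
        if not photos:
            break
        yield from photos


# Stand-in for a real fetch. In practice fetch_page(n) would call http_get on
# .../rovers/curiosity/photos?sol=1000&page=n&api_key=KEY and return data['photos'].
fake_pages = {1: [{"id": 1}, {"id": 2}], 2: [{"id": 3}]}
photos = list(iter_rover_photos(lambda n: fake_pages.get(n, [])))
print(len(photos))  # 3
```

Every page is a separate request, so with DEMO_KEY a photo-heavy sol can use up the hourly limit on its own.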
186
+ ### EPIC — Earth Polychromatic Imaging Camera
187
+
188
+ EPIC images are served from `epic.gsfc.nasa.gov` — **no `api_key` required, no rate limit.**
189
+
190
+ ```python
191
+ import json
192
+ from helpers import http_get
193
+
194
+ # Latest available images (natural color)
195
+ images = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural"))
196
+ print(f"Latest batch: {len(images)} images") # Confirmed: 4 images on 2026-04-18
197
+
198
+ img = images[0]
199
+ print(img['identifier']) # '20260416162050'
200
+ print(img['image']) # 'epic_1b_20260416162050'
201
+ print(img['date']) # '2026-04-16 16:16:01'
202
+ print(img['centroid_coordinates']) # {'lat': 13.25, 'lon': -75.59}
203
+
204
+ # Construct PNG URL from image name + date
205
+ date_str = img['date'].split(' ')[0] # '2026-04-16'
206
+ year, month, day = date_str.split('-')
207
+ png_url = f"https://epic.gsfc.nasa.gov/archive/natural/{year}/{month}/{day}/png/{img['image']}.png"
208
+ jpg_thumb = f"https://epic.gsfc.nasa.gov/archive/natural/{year}/{month}/{day}/thumbs/{img['image']}.jpg"
209
+ print(png_url)
210
+ # Confirmed: https://epic.gsfc.nasa.gov/archive/natural/2026/04/16/png/epic_1b_20260416162050.png
211
+ ```
212
+
213
+ ```python
214
+ # Images for a specific date
215
+ images = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural/date/2024-01-15"))
216
+ print(len(images)) # 14 images on 2024-01-15
217
+
218
+ # Enhanced (color-corrected) images — same API, different path
219
+ enhanced = json.loads(http_get("https://epic.gsfc.nasa.gov/api/enhanced/date/2024-01-15"))
220
+ # Enhanced image URL pattern uses 'enhanced' and 'epic_RGB_' prefix:
221
+ img = enhanced[0]
222
+ date_str = img['date'].split(' ')[0]
223
+ year, month, day = date_str.split('-')
224
+ url = f"https://epic.gsfc.nasa.gov/archive/enhanced/{year}/{month}/{day}/png/{img['image']}.png"
225
+ # e.g. .../archive/enhanced/2024/01/15/png/epic_RGB_20240115005515.png
226
+
227
+ # All available dates
228
+ all_dates = json.loads(http_get("https://epic.gsfc.nasa.gov/api/natural/all"))
229
+ print(f"Available dates: {len(all_dates)}") # 3477 dates (2015-06-13 to present)
230
+ print(all_dates[0]) # {'date': '2026-04-16'} (newest first)
231
+ print(all_dates[-1]) # {'date': '2015-06-13'} (oldest)
232
+ ```
233
+
234
+ ### Exoplanet Archive — TAP/ADQL queries
235
+
236
+ No API key or rate limit. SQL-like ADQL queries over the full archive.
237
+
238
+ ```python
239
+ import json
240
+ from helpers import http_get
241
+
242
+ # Short-period planets (orbital period under 10 days)
243
+ planets = json.loads(http_get(
244
+ "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"
245
+ "?query=select+pl_name,hostname,pl_orbper+from+ps+where+pl_orbper+%3C+10"
246
+ "&format=json"
247
+ ))
248
+ print(f"Rows: {len(planets)}") # 17675 (table 'ps' includes duplicate measurements)
249
+ print(planets[0])
250
+ # {'pl_name': 'GJ 1214 b', 'hostname': 'GJ 1214', 'pl_orbper': 1.58040482}
251
+ ```
252
+
253
+ ```python
254
+ # Use 'pscomppars' for one row per planet (composite best-estimate params)
255
+ planets = json.loads(http_get(
256
+ "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"
257
+ "?query=select+pl_name,hostname,disc_year,discoverymethod,pl_orbper,pl_rade,pl_masse,pl_eqt,sy_dist"
258
+ "+from+pscomppars+where+disc_year+%3E+2020+and+pl_rade+is+not+null"
259
+ "+order+by+disc_year+desc"
260
+ "&format=json&maxrec=5"
261
+ ))
262
+ for p in planets:
263
+     print(p['pl_name'], p['disc_year'], p['discoverymethod'], f"r={p['pl_rade']}Re")
264
+ # Confirmed output:
265
+ # KMT-2024-BLG-1870L b 2026 Microlensing r=13.8Re
266
+ # LHS 1903 b 2026 Transit r=1.382Re
267
+ # TOI-375 d 2026 Radial Velocity r=13.6Re
268
+ ```
269
+
270
+ Key tables:
271
+ - `ps` — all measurements per planet (multiple rows per planet, all sources)
272
+ - `pscomppars` — one row per confirmed planet (best composite parameters)
273
+
274
+ Key columns: `pl_name`, `hostname`, `disc_year`, `discoverymethod`, `pl_orbper` (orbital period, days), `pl_rade` (radius in Earth radii), `pl_masse` (mass in Earth masses), `pl_eqt` (equilibrium temp K), `sy_dist` (distance in parsec).
275
+
276
+ URL-encode operators: `<` = `%3C`, `>` = `%3E`, spaces = `+`.
277
+
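Rather than encoding each operator by hand, the query string can be built with the standard library. This is a sketch using `urllib.parse.urlencode`; `tap_url` is a hypothetical helper. Note that `urlencode` also encodes commas as `%2C`, which is ordinary percent-encoding and equivalent to the literal commas used in the examples above.

```python
from urllib.parse import urlencode

TAP_SYNC = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def tap_url(adql, maxrec=None):
    """Build a TAP sync URL for an ADQL query, returning JSON."""
    params = {"query": adql, "format": "json"}
    if maxrec is not None:
        params["maxrec"] = maxrec
    # urlencode uses quote_plus: spaces become '+', '<' becomes '%3C'
    return f"{TAP_SYNC}?{urlencode(params)}"


url = tap_url("select pl_name,pl_orbper from ps where pl_orbper < 10", maxrec=5)
print(url)
```

The resulting string can be passed to `http_get` like the hand-written URLs.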
278
+ ## URL reference
279
+
280
+ ### api.nasa.gov endpoints
281
+
282
+ | Endpoint | URL pattern |
283
+ |---|---|
284
+ | APOD today | `https://api.nasa.gov/planetary/apod?api_key=KEY` |
285
+ | APOD by date | `...&date=YYYY-MM-DD` |
286
+ | APOD range | `...&start_date=YYYY-MM-DD&end_date=YYYY-MM-DD` |
287
+ | APOD random N | `...&count=N` |
288
+ | NEO feed | `https://api.nasa.gov/neo/rest/v1/feed?start_date=...&end_date=...&api_key=KEY` |
289
+ | NEO by ID | `https://api.nasa.gov/neo/rest/v1/neo/{id}?api_key=KEY` |
290
+ | Mars photos by sol | `https://api.nasa.gov/mars-photos/api/v1/rovers/{rover}/photos?sol=N&api_key=KEY` |
291
+ | Mars photos by date | `...?earth_date=YYYY-MM-DD&api_key=KEY` |
292
+ | Mars latest | `https://api.nasa.gov/mars-photos/api/v1/rovers/{rover}/latest_photos?api_key=KEY` |
293
+
294
+ ### EPIC (epic.gsfc.nasa.gov — no key, no rate limit)
295
+
296
+ | Endpoint | URL |
297
+ |---|---|
298
+ | Latest natural images | `https://epic.gsfc.nasa.gov/api/natural` |
299
+ | Natural by date | `https://epic.gsfc.nasa.gov/api/natural/date/YYYY-MM-DD` |
300
+ | Enhanced latest | `https://epic.gsfc.nasa.gov/api/enhanced` |
301
+ | Enhanced by date | `https://epic.gsfc.nasa.gov/api/enhanced/date/YYYY-MM-DD` |
302
+ | All available dates | `https://epic.gsfc.nasa.gov/api/natural/all` |
303
+ | PNG image | `https://epic.gsfc.nasa.gov/archive/natural/YYYY/MM/DD/png/{image}.png` |
304
+ | Thumbnail (JPEG) | `https://epic.gsfc.nasa.gov/archive/natural/YYYY/MM/DD/thumbs/{image}.jpg` |
305
+ | Enhanced PNG | `https://epic.gsfc.nasa.gov/archive/enhanced/YYYY/MM/DD/png/{image}.png` |
306
+
307
+ ### Exoplanet Archive (no key, no rate limit)
308
+
309
+ ```
310
+ https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=<ADQL>&format=json&maxrec=<N>
311
+ ```
312
+
313
+ ## Gotchas
314
+
315
+ - **DEMO_KEY limit is effectively 10/hour per IP, not 30** — The `X-Ratelimit-Limit` header shows `10` in practice. When the daily budget (~50 req) is exhausted, the `retry-after` header is set to ~80,000 seconds (about 22 hours). Register a free personal key at https://api.nasa.gov/ to get 1,000/hour.
316
+
317
+ - **All `api.nasa.gov` paths share one rate-limit pool** — APOD, NEO, Mars Rover, and all other `api.nasa.gov` paths draw from the same DEMO_KEY bucket. Calling any one of them depletes the limit for all others.
318
+
319
+ - **EPIC and Exoplanet Archive are fully free** — `epic.gsfc.nasa.gov` returns no rate-limit headers and is not throttled. `exoplanetarchive.ipac.caltech.edu/TAP/sync` is similarly unrestricted. Use these freely without fear of exhausting DEMO_KEY.
320
+
321
+ - **NEO date range max is 7 days** — Requests spanning more than 7 days return HTTP 400. Paginate with 7-day windows and use `links.next` from the response to get the next URL.
322
+
323
+ - **APOD earliest date is 1995-06-16** — Requesting `date` before `1995-06-16` returns HTTP 400 with an error message. No upper date bound other than today.
324
+
325
+ - **APOD `hdurl` is absent for video entries** — When `media_type` is `video`, the response has `url` (a YouTube embed URL) but no `hdurl`. Always check `media_type` before accessing `hdurl`.
326
+
327
+ - **Mars Rover `sol` vs `earth_date`** — Both are valid filter params. `sol` is the Martian solar day since rover landing. `earth_date` uses `YYYY-MM-DD`. You cannot mix them in one request.
328
+
329
+ - **Mars Rover pagination defaults to 25 photos/page** — Large sols (Curiosity sol 1000 has many photos) require `&page=2`, `&page=3`, etc. There is no total count in the response; keep paginating until you get an empty `photos` list.
330
+
331
+ - **EPIC image name encodes type in the prefix** — Natural images use `epic_1b_` prefix; enhanced color-corrected images use `epic_RGB_` prefix. The API returns the correct filename in `img['image']`; don't guess the prefix.
332
+
333
+ - **EPIC `/api/natural/all` returns newest-first** — The list of 3,477+ available dates starts from the most recent date with images (2026-04-16 when checked on 2026-04-18, so it can lag today) and goes back to 2015-06-13. Not all days have images (gaps during spacecraft maintenance).
334
+
335
+ - **Exoplanet `ps` table has multiple rows per planet** — Different publications report different measurements for the same planet. Use `pscomppars` for one-row-per-planet composite parameters. `ps` is useful when you need all reported values or want to filter by specific reference.
336
+
337
+ - **Exoplanet null values come back as `null` in JSON (`None` after `json.loads`)** — Many fields like `pl_masse` are `null` for planets without mass measurements. Always guard with `if row['pl_masse'] is not None`.
338
+
339
+ - **`http_get` in helpers.py uses stdlib `urllib`** — On some macOS Python 3.11 installs, SSL certificate verification fails (`CERTIFICATE_VERIFY_FAILED`). If you hit this, run `curl` via `subprocess` as a fallback, or install certifi and patch the default SSL context. The harness's browser CDP connection is not affected; only `http_get` is.
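As a small illustration of the null-value gotcha above, the sketch below averages `pl_masse` while skipping missing values. The rows are made-up placeholders, not real archive output.

```python
# Illustrative rows (placeholder names and values, not real archive data)
rows = [
    {"pl_name": "Planet A", "pl_masse": 5.0},
    {"pl_name": "Planet B", "pl_masse": None},  # JSON null -> Python None
    {"pl_name": "Planet C", "pl_masse": 1.0},
]

# Keep only rows that actually have a mass measurement
masses = [r["pl_masse"] for r in rows if r["pl_masse"] is not None]
mean_mass = sum(masses) / len(masses) if masses else None
print(mean_mass)  # 3.0
```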