@pencil-agent/nano-pencil 2.0.0-beta.8 → 2.0.0

Files changed (241)
  1. package/README.md +267 -267
  2. package/dist/build-meta.json +3 -3
  3. package/dist/core/export-html/AGENT.md +11 -11
  4. package/dist/core/export-html/template.css +971 -971
  5. package/dist/core/export-html/template.html +54 -54
  6. package/dist/core/extensions-host/index.d.ts +1 -1
  7. package/dist/core/extensions-host/loader.js +1 -1
  8. package/dist/core/extensions-host/runner.d.ts +1 -0
  9. package/dist/core/extensions-host/runner.js +2 -2
  10. package/dist/core/extensions-host/types.d.ts +17 -22
  11. package/dist/core/lib/ai/src/types.d.ts +12 -2
  12. package/dist/core/persona/persona-manager.js +5 -2
  13. package/dist/core/runtime/agent-session.js +3 -3
  14. package/dist/core/runtime/extension-core-bindings.d.ts +1 -0
  15. package/dist/core/runtime/extension-core-bindings.js +2 -2
  16. package/dist/extensions/builtin/AGENT.md +115 -115
  17. package/dist/extensions/builtin/browser/AGENT.md +17 -17
  18. package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
  19. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
  20. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
  21. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
  22. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
  23. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
  24. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
  25. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
  26. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
  27. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
  28. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
  29. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
  30. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
  31. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
  32. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
  33. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
  34. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
  35. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
  36. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
  37. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
  38. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
  39. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
  40. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
  41. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
  42. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
  43. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
  44. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
  45. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
  46. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
  47. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
  48. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
  49. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
  50. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
  51. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
  52. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
  53. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
  54. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
  55. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
  56. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
  57. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
  58. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
  59. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
  60. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
  61. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
  62. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
  63. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
  64. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
  65. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
  66. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
  67. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
  68. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
  69. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
  70. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
  71. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
  72. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
  73. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
  74. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
  75. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
  76. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
  77. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
  78. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
  79. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
  80. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
  81. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
  82. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
  83. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
  84. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
  85. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
  86. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
  87. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
  88. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
  89. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
  90. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
  91. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
  92. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
  93. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
  94. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
  95. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
  96. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
  97. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
  98. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
  99. package/dist/extensions/builtin/browser/browser.md +73 -73
  100. package/dist/extensions/builtin/browser/install.md +142 -142
  101. package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
  102. package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
  103. package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
  104. package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
  105. package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
  106. package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
  107. package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
  108. package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
  109. package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
  110. package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
  111. package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
  112. package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
  113. package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
  114. package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
  115. package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
  116. package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
  117. package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
  118. package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
  119. package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
  120. package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
  121. package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
  122. package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
  123. package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
  124. package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
  125. package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
  126. package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
  127. package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
  128. package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
  129. package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
  130. package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
  131. package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
  132. package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
  133. package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
  134. package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
  135. package/dist/extensions/builtin/goal/README.md +67 -67
  136. package/dist/extensions/builtin/goal/goal-controller.d.ts +39 -10
  137. package/dist/extensions/builtin/goal/goal-controller.js +1 -1
  138. package/dist/extensions/builtin/goal/goal-format.js +1 -1
  139. package/dist/extensions/builtin/goal/goal-prompts.d.ts +2 -0
  140. package/dist/extensions/builtin/goal/goal-prompts.js +5 -4
  141. package/dist/extensions/builtin/goal/goal-store.js +1 -1
  142. package/dist/extensions/builtin/goal/index.d.ts +1 -1
  143. package/dist/extensions/builtin/goal/index.js +10 -7
  144. package/dist/extensions/builtin/grub/README.md +112 -112
  145. package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
  146. package/dist/extensions/builtin/link-world/index.js +6 -6
  147. package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
  148. package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
  149. package/dist/extensions/builtin/link-world/linkworld.md +313 -313
  150. package/dist/extensions/builtin/link-world/{network-routing.md → network-routing/network-routing.md} +67 -67
  151. package/dist/extensions/builtin/loop/README.md +92 -92
  152. package/dist/extensions/builtin/mcp/figma-design.md +68 -68
  153. package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
  154. package/dist/extensions/builtin/plan/index.js +1 -1
  155. package/dist/extensions/builtin/recap/AGENT.md +15 -15
  156. package/dist/extensions/builtin/sal/README.md +72 -72
  157. package/dist/extensions/builtin/security-audit/README.md +289 -289
  158. package/dist/extensions/builtin/task/task-store.d.ts +4 -0
  159. package/dist/extensions/builtin/task/task-store.js +1 -1
  160. package/dist/extensions/builtin/team/AGENT.md +112 -112
  161. package/dist/extensions/builtin/team/TESTING.md +299 -299
  162. package/dist/extensions/builtin/token-save/README.md +56 -56
  163. package/dist/extensions/optional/AGENT.md +10 -10
  164. package/dist/index.d.ts +5 -30
  165. package/dist/index.js +1 -1
  166. package/dist/models.d.ts +7 -0
  167. package/dist/models.js +1 -0
  168. package/dist/modes/interactive/components/footer.js +1 -1
  169. package/dist/modes/interactive/components/task-status-panel.d.ts +36 -0
  170. package/dist/modes/interactive/components/task-status-panel.js +1 -0
  171. package/dist/modes/interactive/controllers/stream-render-controller.d.ts +7 -0
  172. package/dist/modes/interactive/controllers/stream-render-controller.js +2 -2
  173. package/dist/modes/interactive/interactive-mode.js +40 -40
  174. package/dist/modes/interactive/state/interactive-state.d.ts +2 -0
  175. package/dist/modes/interactive/state/interactive-state.js +1 -1
  176. package/dist/modes/interactive/theme/dark.json +85 -85
  177. package/dist/modes/interactive/theme/light.json +84 -84
  178. package/dist/modes/interactive/theme/theme-schema.json +335 -335
  179. package/dist/modes/interactive/theme/warm.json +81 -81
  180. package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
  181. package/dist/node_modules/@pencil-agent/ai/dist/models.generated.js +1 -1
  182. package/dist/node_modules/@pencil-agent/ai/dist/providers/anthropic.js +2 -2
  183. package/dist/node_modules/@pencil-agent/ai/dist/providers/openai-completions.js +5 -5
  184. package/dist/node_modules/@pencil-agent/ai/dist/providers/openai-responses.js +1 -1
  185. package/dist/node_modules/@pencil-agent/ai/dist/stream.js +1 -1
  186. package/dist/packages/protocol/src/commands.d.ts +33 -0
  187. package/dist/packages/protocol/src/flags.d.ts +20 -0
  188. package/dist/packages/protocol/src/hooks.d.ts +17 -0
  189. package/dist/packages/protocol/src/hooks.js +0 -0
  190. package/dist/packages/{extension-sdk → protocol}/src/index.d.ts +7 -4
  191. package/dist/packages/protocol/src/index.js +1 -0
  192. package/dist/packages/{extension-sdk → protocol}/src/lifecycle.d.ts +15 -27
  193. package/dist/packages/protocol/src/lifecycle.js +0 -0
  194. package/dist/packages/{extension-sdk → protocol}/src/tools.d.ts +1 -1
  195. package/dist/packages/protocol/src/tools.js +0 -0
  196. package/dist/public-config.d.ts +12 -0
  197. package/dist/public-config.js +1 -0
  198. package/dist/runtime.d.ts +9 -0
  199. package/dist/runtime.js +1 -0
  200. package/dist/session-compaction.d.ts +7 -0
  201. package/dist/session-compaction.js +1 -0
  202. package/dist/session.d.ts +7 -0
  203. package/dist/session.js +1 -0
  204. package/dist/skills.d.ts +7 -0
  205. package/dist/skills.js +1 -0
  206. package/dist/tools.d.ts +7 -0
  207. package/dist/tools.js +1 -0
  208. package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +851 -0
  209. package/docs/SDK-TESTING.md +364 -0
  210. package/docs/codex-goal-command-impl.md +1055 -1055
  211. package/docs/codex-goal-vs-grub.md +500 -500
  212. package/docs/custom-provider.md +27 -27
  213. package/docs/extensions.md +27 -27
  214. package/docs/keybindings.md +27 -27
  215. package/docs/loop 重构完成总结.md +250 -250
  216. package/docs/loop 重构完成报告.md +122 -122
  217. package/docs/loop 重构方案.md +1222 -1222
  218. package/docs/loop 重构方案实现报告.md +158 -158
  219. package/docs/loop 重构方案对比分析.md +128 -128
  220. package/docs/loop 重构计划.md +320 -320
  221. package/docs/loop-usage-examples.md +214 -214
  222. package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +593 -0
  223. package/docs/models.md +27 -27
  224. package/docs/packages.md +27 -27
  225. package/docs/pi-design-philosophy.md +457 -457
  226. package/docs/planmode.md +1987 -1987
  227. package/docs/prompt-templates.md +27 -27
  228. package/docs/providers.md +27 -27
  229. package/docs/sdk.md +27 -27
  230. package/docs/skills.md +27 -27
  231. package/docs/startup-performance-optimization.md +301 -0
  232. package/docs/themes.md +27 -27
  233. package/docs/tui.md +27 -27
  234. package/docs/认知地图.md +47 -0
  235. package/package.json +190 -162
  236. package/dist/packages/extension-sdk/src/index.js +0 -1
  237. package/docs/cc-agent-design.md +0 -1297
  238. package/docs/cc-tui-design.md +0 -1333
  239. package/docs//345/257/271/346/240/207Claude-Code.md +0 -1775
  240. /package/dist/packages/{extension-sdk/src/lifecycle.js → protocol/src/commands.js} +0 -0
  241. /package/dist/packages/{extension-sdk/src/tools.js → protocol/src/flags.js} +0 -0
@@ -1,65 +1,65 @@
1
- # GitHub — Repo actions (star, unstar, watch)
2
-
3
- `https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
4
-
5
- ## Do this first
6
-
7
- ```python
8
- # Precondition: user is logged in
9
- if not js('!!document.querySelector("meta[name=user-login]")'):
10
-     raise RuntimeError("not logged in to GitHub")
11
-
12
- # Star the current repo
13
- js("""
14
- (()=>{
15
- const f = document.querySelector('form[action$="/star"]');
16
- if (!f) return 'already-starred-or-missing';
17
- f.submit();
18
- return 'submitted';
19
- })()
20
- """)
21
- wait(2)
22
- wait_for_load()
23
-
24
- # Verify — the toggle swaps which form is present
25
- starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
26
- ```
27
-
28
- Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
29
-
30
- ## Why not click the button
31
-
32
- The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
33
-
34
- - **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
35
- - **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
36
-
37
- `form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
38
-
39
- ## Watch / Unwatch
40
-
41
- The subscription form uses a shared endpoint with a `_method` override:
42
-
43
- ```python
44
- # Watch (all activity)
45
- js("""
46
- (()=>{
47
- const f = document.querySelector('form[action$="/subscription"]');
48
- if (!f) return 'missing';
49
- f.submit();
50
- return 'submitted';
51
- })()
52
- """)
53
- ```
54
-
55
- GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
56
-
57
- ## Gotchas
58
-
59
- - **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
60
-
61
- - **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
62
-
63
- - **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
64
-
65
- - **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
1
+ # GitHub — Repo actions (star, unstar, watch)
2
+
3
+ `https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
4
+
5
+ ## Do this first
6
+
7
+ ```python
8
+ # Precondition: user is logged in
9
+ if not js('!!document.querySelector("meta[name=user-login]")'):
10
+     raise RuntimeError("not logged in to GitHub")
11
+
12
+ # Star the current repo
13
+ js("""
14
+ (()=>{
15
+ const f = document.querySelector('form[action$="/star"]');
16
+ if (!f) return 'already-starred-or-missing';
17
+ f.submit();
18
+ return 'submitted';
19
+ })()
20
+ """)
21
+ wait(2)
22
+ wait_for_load()
23
+
24
+ # Verify — the toggle swaps which form is present
25
+ starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
26
+ ```
27
+
28
+ Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
29
+
30
+ ## Why not click the button
31
+
32
+ The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
33
+
34
+ - **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
35
+ - **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
36
+
37
+ `form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
38
+
39
+ ## Watch / Unwatch
40
+
41
+ The subscription form uses a shared endpoint with a `_method` override:
42
+
43
+ ```python
44
+ # Watch (all activity)
45
+ js("""
46
+ (()=>{
47
+ const f = document.querySelector('form[action$="/subscription"]');
48
+ if (!f) return 'missing';
49
+ f.submit();
50
+ return 'submitted';
51
+ })()
52
+ """)
53
+ ```
54
+
55
+ GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
56
+
57
+ ## Gotchas
58
+
59
+ - **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
60
+
61
+ - **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
62
+
63
+ - **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
64
+
65
+ - **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
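The form-submit pattern described in `repo-actions.md` could be wrapped in a pair of small helpers. This is a minimal sketch, not part of the package: `submit_form_script` and `star_state` are hypothetical names, and the `js` argument stands in for the harness's `js()` helper so the logic can be exercised without a browser. The selectors are the ones quoted in the file above.

```python
import json
from typing import Callable


def submit_form_script(action_suffix: str, use_request_submit: bool = False) -> str:
    """Build a JS snippet that submits the form whose action ends in action_suffix.

    use_request_submit switches from form.submit() to form.requestSubmit(),
    the fallback the document suggests if submit() ever stops working.
    """
    # json.dumps yields a correctly escaped JS string literal for the selector.
    selector = json.dumps(f'form[action$="{action_suffix}"]')
    method = "requestSubmit" if use_request_submit else "submit"
    return (
        "(()=>{"
        f"const f = document.querySelector({selector});"
        "if (!f) return 'missing';"
        f"f.{method}();"
        "return 'submitted';"
        "})()"
    )


def star_state(js: Callable[[str], object]) -> str:
    """Return 'starred', 'unstarred', or 'unknown' based on which form is rendered."""
    if js('!!document.querySelector(\'form[action$="/unstar"]\')'):
        return "starred"
    if js('!!document.querySelector(\'form[action$="/star"]\')'):
        return "unstarred"
    # Neither form present: most likely not logged in, or the markup changed.
    return "unknown"
```

Inside the harness, usage would presumably look like `js(submit_form_script("/star"))` followed by `wait_for_load()` and `star_state(js)`. Re-querying the DOM in `star_state` rather than caching a form reference follows the document's advice to re-read the form after every toggle.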
@@ -1,184 +1,184 @@
1
- # GitHub — Scraping & Data Extraction
2
-
3
- `https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
4
-
5
- ## Do this first
6
-
7
- **Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
8
-
9
- ```python
10
- import json
11
- data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
12
- # Key fields: stargazers_count, forks_count, description, language, topics,
13
- # open_issues_count, created_at, updated_at, pushed_at,
14
- # watchers_count, subscribers_count, network_count,
15
- # default_branch, license, homepage, visibility
16
- ```
17
-
18
- Use `raw.githubusercontent.com` for file contents — no rate limit, no auth, no base64 decode:
19
-
20
- ```python
21
- readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
22
- content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
23
- ```
24
-
25
- Use the browser **only** for the trending page — it's server-side rendered HTML, no API equivalent.
26
-
27
- ## Common workflows
28
-
29
- ### Repo metadata (API)
30
-
31
- ```python
32
- import json
33
- data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
34
- print(data['stargazers_count'], data['forks_count'], data['description'])
35
- # returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
36
- ```
37
-
38
- ### User / org profile (API)
39
-
40
- ```python
41
- import json
42
- user = json.loads(http_get("https://api.github.com/users/browser-use"))
43
- print(user['type'], user['followers'], user['public_repos'], user['blog'])
44
- # returns: 'Organization' 3046 39 'https://browser-use.com'
45
- ```
46
-
47
- ### Trending page (browser required)
48
-
49
- The trending page is JS-rendered. `article.Box-row` selector confirmed working (15 results for today/all-languages, 12 for filtered). All fields work in a single JS call — **must navigate and wait in the same script run**, as each run is a separate exec context.
50
-
51
- ```python
52
- import json
53
- goto_url("https://github.com/trending") # or /trending/python?since=weekly
54
- wait_for_load()
55
- wait(2) # extra 2s — React hydration completes after readyState
56
-
57
- result = js("""
58
- (function(){
59
- var rows = Array.from(document.querySelectorAll('article.Box-row'));
60
- return JSON.stringify(rows.map(function(el){
61
- var h2link = el.querySelector('h2 a');
62
- var starLink = el.querySelector('a[href*="/stargazers"]');
63
- var forkLink = el.querySelector('a[href*="/forks"]');
64
- var langEl = el.querySelector('[itemprop="programmingLanguage"]');
65
- var todayEl = el.querySelector('.d-inline-block.float-sm-right');
66
- var descEl = el.querySelector('p');
67
- return {
68
- name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
69
- url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
70
- stars_total: starLink ? starLink.innerText.trim() : null,
71
- stars_period: todayEl ? todayEl.innerText.trim() : null,
72
- forks: forkLink ? forkLink.innerText.trim() : null,
73
- language: langEl ? langEl.innerText.trim() : null,
74
- desc: descEl ? descEl.innerText.trim() : null
75
- };
76
- }));
77
- })()
78
- """)
79
- repos = json.loads(result)
80
- # stars_period text is e.g. "737 stars today" or "47,053 stars this week"
81
- ```
82
-
83
- Supported URL params:
84
- - `/trending` — all languages, today
85
- - `/trending/python` — filtered to Python
86
- - `/trending?since=weekly` or `?since=monthly`
87
- - `/trending/python?since=weekly` — combined
88
-
89
- ### Search repositories (API)
90
-
91
- ```python
92
- import json
93
- results = json.loads(http_get(
94
- "https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
95
- ))
96
- print(results['total_count']) # e.g. 3250
97
- for r in results['items']:
98
-     print(r['full_name'], r['stargazers_count'])
99
- ```
100
-
101
- Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
102
-
103
- ### Commits, releases, issues (API)
104
-
105
- ```python
106
- import json
107
- # Commits
108
- commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
109
- # Fields: sha, commit.message, commit.author.date, author.login
110
-
111
- # Releases
112
- releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
113
- # Fields: tag_name, name, published_at, body, assets
114
-
115
- # Issues
116
- issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
117
- # Fields: number, title, labels, state, created_at, user.login
118
-
119
- # Contributors
120
- contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
121
- # Fields: login, contributions
122
- ```
123
-
124
- ### File contents via API (base64)
125
-
126
- ```python
127
- import json, base64
128
- resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
129
- content = base64.b64decode(resp['content']).decode()
130
- # resp also has: size, sha, html_url
131
- # Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
132
- ```
133
-
134
- ### Parallel fetching (multiple repos)
135
-
136
- ```python
137
- import json
138
- from concurrent.futures import ThreadPoolExecutor
139
-
140
- def fetch_repo(name):
141
- data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
142
- return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
143
-
144
- repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
145
- with ThreadPoolExecutor(max_workers=3) as ex:
146
- results = list(ex.map(fetch_repo, repos))
147
- # Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
148
- ```
149
-
150
- ## Gotchas
151
-
152
- - **Rate limits are per IP, unauthenticated** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check the `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, the core limit rises to 5,000 req/hour and the search limit to 30 req/min.
153
-
154
- - **Token header format** — Use `Authorization: Bearer <token>` (not `token <token>`), plus `X-GitHub-Api-Version: 2022-11-28`:
155
- ```python
156
- import json, os
157
- token = os.environ.get('GITHUB_TOKEN', '')
158
- headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
159
- data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
160
- ```
161
-
162
- - **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
163
- ```python
164
- try:
165
- data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
166
- except Exception as e:
167
- print("Not found or rate limited:", e)
168
- ```
169
-
170
- - **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
171
-
172
- - **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
173
-
174
- - **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
175
-
176
- - **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
177
-
178
- - **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"`; keep only the leading token and remove the commas if you want just the number.
179
-
180
- - **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because GitHub uses React with server-side hydration and component IDs change. The REST API returns all the same data cleanly.
181
-
182
- - **raw.githubusercontent.com needs no auth and does not count against the REST API quota** — Use it for any public file. It serves the raw bytes, with no JSON wrapping or base64.
183
-
184
- - **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
1
+ # GitHub — Scraping & Data Extraction
2
+
3
+ `https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
4
+
5
+ ## Do this first
6
+
7
+ **Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
8
+
9
+ ```python
10
+ import json
11
+ data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
12
+ # Key fields: stargazers_count, forks_count, description, language, topics,
13
+ # open_issues_count, created_at, updated_at, pushed_at,
14
+ # watchers_count, subscribers_count, network_count,
15
+ # default_branch, license, homepage, visibility
16
+ ```
17
+
18
+ Use `raw.githubusercontent.com` for file contents (no REST API quota hit, no auth, no base64 decode):
19
+
20
+ ```python
21
+ readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
22
+ content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
23
+ ```
24
+
25
+ Use the browser **only** for the trending page — it's server-side rendered HTML, no API equivalent.
26
+
27
+ ## Common workflows
28
+
29
+ ### Repo metadata (API)
30
+
31
+ ```python
32
+ import json
33
+ data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
34
+ print(data['stargazers_count'], data['forks_count'], data['description'])
35
+ # returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
36
+ ```
37
+
38
+ ### User / org profile (API)
39
+
40
+ ```python
41
+ import json
42
+ user = json.loads(http_get("https://api.github.com/users/browser-use"))
43
+ print(user['type'], user['followers'], user['public_repos'], user['blog'])
44
+ # returns: 'Organization' 3046 39 'https://browser-use.com'
45
+ ```
46
+
47
+ ### Trending page (browser required)
48
+
49
+ The trending page needs a real browser session. The `article.Box-row` selector is confirmed working (15 results for today/all-languages, 12 for filtered), and all fields can be extracted in a single JS call. **Navigate and wait in the same script run**, because each run is a separate exec context.
50
+
51
+ ```python
52
+ import json
53
+ goto_url("https://github.com/trending") # or /trending/python?since=weekly
54
+ wait_for_load()
55
+ wait(2) # extra 2s — React hydration completes after readyState
56
+
57
+ result = js("""
58
+ (function(){
59
+ var rows = Array.from(document.querySelectorAll('article.Box-row'));
60
+ return JSON.stringify(rows.map(function(el){
61
+ var h2link = el.querySelector('h2 a');
62
+ var starLink = el.querySelector('a[href*="/stargazers"]');
63
+ var forkLink = el.querySelector('a[href*="/forks"]');
64
+ var langEl = el.querySelector('[itemprop="programmingLanguage"]');
65
+ var todayEl = el.querySelector('.d-inline-block.float-sm-right');
66
+ var descEl = el.querySelector('p');
67
+ return {
68
+ name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
69
+ url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
70
+ stars_total: starLink ? starLink.innerText.trim() : null,
71
+ stars_period: todayEl ? todayEl.innerText.trim() : null,
72
+ forks: forkLink ? forkLink.innerText.trim() : null,
73
+ language: langEl ? langEl.innerText.trim() : null,
74
+ desc: descEl ? descEl.innerText.trim() : null
75
+ };
76
+ }));
77
+ })()
78
+ """)
79
+ repos = json.loads(result)
80
+ # stars_period text is e.g. "737 stars today" or "47,053 stars this week"
81
+ ```
82
+
83
+ Supported URL params:
84
+ - `/trending` — all languages, today
85
+ - `/trending/python` — filtered to Python
86
+ - `/trending?since=weekly` or `?since=monthly`
87
+ - `/trending/python?since=weekly` — combined
88
+
89
+ ### Search repositories (API)
90
+
91
+ ```python
92
+ import json
93
+ results = json.loads(http_get(
94
+ "https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
95
+ ))
96
+ print(results['total_count']) # e.g. 3250
97
+ for r in results['items']:
98
+ print(r['full_name'], r['stargazers_count'])
99
+ ```
100
+
101
+ Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
102
+
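Because the unauthenticated search pool allows only 10 requests per minute, a loop over queries benefits from pacing. The following is a minimal sketch, not part of the original skill file: `MinIntervalPacer` and `search_pacer` are hypothetical names, and the clock and sleep functions are injectable only so the behaviour can be checked without real waiting.

```python
import time


class MinIntervalPacer:
    """Space successive calls at least `interval` seconds apart (hypothetical helper)."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first call

    def wait(self):
        """Block until at least `interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# 10 requests per minute means at most one request every 6 seconds.
search_pacer = MinIntervalPacer(interval=60 / 10)
```

Calling `search_pacer.wait()` immediately before each search `http_get(...)` keeps a loop under the per-minute limit, at the cost of roughly six seconds per query.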
103
+ ### Commits, releases, issues (API)
104
+
105
+ ```python
106
+ import json
107
+ # Commits
108
+ commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
109
+ # Fields: sha, commit.message, commit.author.date, author.login
110
+
111
+ # Releases
112
+ releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
113
+ # Fields: tag_name, name, published_at, body, assets
114
+
115
+ # Issues
116
+ issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
117
+ # Fields: number, title, labels, state, created_at, user.login
118
+
119
+ # Contributors
120
+ contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
121
+ # Fields: login, contributions
122
+ ```
123
+
124
+ ### File contents via API (base64)
125
+
126
+ ```python
127
+ import json, base64
128
+ resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
129
+ content = base64.b64decode(resp['content']).decode()
130
+ # resp also has: size, sha, html_url
131
+ # Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
132
+ ```
133
+
134
+ ### Parallel fetching (multiple repos)
135
+
136
+ ```python
137
+ import json
138
+ from concurrent.futures import ThreadPoolExecutor
139
+
140
+ def fetch_repo(name):
141
+ data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
142
+ return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
143
+
144
+ repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
145
+ with ThreadPoolExecutor(max_workers=3) as ex:
146
+ results = list(ex.map(fetch_repo, repos))
147
+ # Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
148
+ ```
149
+
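The parallel example above spends one core-pool call per repo, so it can help to check the remaining budget first. This is a hedged sketch, assuming the `/rate_limit` response exposes `resources.<pool>.remaining` and `resources.<pool>.reset` (epoch seconds); `budget` is a hypothetical helper and the sample payload is invented for illustration.

```python
import time


def budget(rate_limit, resource="core", now=None):
    """Return (remaining_calls, seconds_until_reset) for one rate-limit pool."""
    pool = rate_limit["resources"][resource]
    now = time.time() if now is None else now
    return pool["remaining"], max(0, pool["reset"] - now)


# Invented sample shaped like a /rate_limit response (values are illustrative).
sample = {
    "resources": {
        "core": {"limit": 60, "remaining": 3, "reset": 1700003600, "used": 57},
        "search": {"limit": 10, "remaining": 10, "reset": 1700000060, "used": 0},
    }
}

remaining, wait_s = budget(sample, "core", now=1700000000)
```

In a real run the payload would come from `json.loads(http_get("https://api.github.com/rate_limit"))`; if `remaining` is smaller than the number of repos to fetch, either shrink the batch or wait `wait_s` seconds.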
150
+ ## Gotchas
151
+
152
+ - **Rate limits are per IP, unauthenticated** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check the `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, the core limit rises to 5,000 req/hour and the search limit to 30 req/min.
153
+
154
+ - **Token header format** — Use `Authorization: Bearer <token>` (not `token <token>`), plus `X-GitHub-Api-Version: 2022-11-28`:
155
+ ```python
156
+ import json, os
157
+ token = os.environ.get('GITHUB_TOKEN', '')
158
+ headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
159
+ data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
160
+ ```
161
+
162
+ - **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
163
+ ```python
164
+ try:
165
+ data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
166
+ except Exception as e:
167
+ print("Not found or rate limited:", e)
168
+ ```
169
+
170
+ - **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
171
+
172
+ - **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
173
+
174
+ - **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
175
+
176
+ - **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
177
+
178
+ - **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"`; keep only the leading token and remove the commas if you want just the number.
179
+
180
+ - **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because the page is server-rendered and then hydrated by React, and component IDs change. The REST API returns the same data cleanly.
181
+
182
+ - **raw.githubusercontent.com needs no auth and does not count against the REST API quota** — Use it for any public file. It serves the raw bytes, with no JSON wrapping or base64.
183
+
184
+ - **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
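The two string-format gotchas above (`stars_total` with commas, `stars_period` with trailing words) can be handled by one parser. This is a minimal sketch, assuming the field names produced by the trending extraction script; `parse_count` and `normalize` are hypothetical helpers and the sample rows are invented.

```python
import re


def parse_count(text):
    """Return the first integer in `text`, ignoring thousands separators; None if absent."""
    if not text:
        return None
    match = re.search(r"\d[\d,]*", text)
    return int(match.group(0).replace(",", "")) if match else None


def normalize(repo):
    """Copy a trending row, converting its count fields from strings to ints."""
    out = dict(repo)
    for field in ("stars_total", "forks", "stars_period"):
        out[field] = parse_count(repo.get(field))
    return out


# Invented rows in the shape returned by the trending script.
sample = [
    {"name": "a / b", "stars_total": "4,548", "forks": "312",
     "stars_period": "737 stars today"},
    {"name": "c / d", "stars_total": "120,001", "forks": None,
     "stars_period": "47,053 stars this week"},
]

# Sort by total stars, treating a missing value as 0.
ranked = sorted((normalize(r) for r in sample),
                key=lambda r: r["stars_total"] or 0, reverse=True)
```

Missing fields stay `None` instead of raising, which matches the extraction script's habit of returning `null` when a selector finds nothing.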