@pencil-agent/nano-pencil 2.0.0 → 2.0.1

Files changed (195)
  1. package/README.md +267 -267
  2. package/dist/build-meta.json +3 -3
  3. package/dist/core/export-html/AGENT.md +11 -11
  4. package/dist/core/export-html/template.css +971 -971
  5. package/dist/core/export-html/template.html +54 -54
  6. package/dist/core/mcp/mcp-client.d.ts +3 -1
  7. package/dist/core/mcp/mcp-client.js +6 -6
  8. package/dist/core/mcp/mcp-config.d.ts +3 -3
  9. package/dist/core/mcp/mcp-config.js +1 -1
  10. package/dist/core/mcp/mcp-manager.d.ts +5 -1
  11. package/dist/core/mcp/mcp-manager.js +1 -1
  12. package/dist/core/platform/config/resource-loader.d.ts +2 -0
  13. package/dist/core/platform/config/resource-loader.js +2 -2
  14. package/dist/core/runtime/agent-session.d.ts +12 -0
  15. package/dist/core/runtime/agent-session.js +8 -8
  16. package/dist/core/runtime/sdk.d.ts +8 -0
  17. package/dist/core/runtime/sdk.js +1 -1
  18. package/dist/extensions/builtin/AGENT.md +115 -115
  19. package/dist/extensions/builtin/browser/AGENT.md +17 -17
  20. package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
  21. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
  22. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
  23. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
  24. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
  25. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
  26. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
  27. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
  28. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
  29. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
  30. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
  31. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
  32. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
  33. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
  34. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
  35. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
  36. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
  37. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
  38. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
  39. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
  40. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
  41. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
  42. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
  43. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
  44. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
  45. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
  46. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
  47. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
  48. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
  49. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
  50. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
  51. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
  52. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
  53. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
  54. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
  55. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
  56. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
  57. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
  58. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
  59. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
  60. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
  61. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
  62. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
  63. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
  64. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
  65. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
  66. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
  67. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
  68. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
  69. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
  70. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
  71. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
  72. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
  73. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
  74. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
  75. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
  76. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
  77. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
  78. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
  79. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
  80. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
  81. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
  82. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
  83. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
  84. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
  85. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
  86. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
  87. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
  88. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
  89. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
  90. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
  91. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
  92. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
  93. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
  94. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
  95. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
  96. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
  97. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
  98. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
  99. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
  100. package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
  101. package/dist/extensions/builtin/browser/browser.md +73 -73
  102. package/dist/extensions/builtin/browser/install.md +142 -142
  103. package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
  104. package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
  105. package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
  106. package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
  107. package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
  108. package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
  109. package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
  110. package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
  111. package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
  112. package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
  113. package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
  114. package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
  115. package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
  116. package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
  117. package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
  118. package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
  119. package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
  120. package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
  121. package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
  122. package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
  123. package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
  124. package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
  125. package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
  126. package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
  127. package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
  128. package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
  129. package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
  130. package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
  131. package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
  132. package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
  133. package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
  134. package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
  135. package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
  136. package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
  137. package/dist/extensions/builtin/goal/README.md +67 -67
  138. package/dist/extensions/builtin/grub/README.md +112 -112
  139. package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
  140. package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
  141. package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
  142. package/dist/extensions/builtin/link-world/linkworld.md +313 -313
  143. package/dist/extensions/builtin/link-world/network-routing/network-routing.md +67 -67
  144. package/dist/extensions/builtin/loop/README.md +92 -92
  145. package/dist/extensions/builtin/mcp/figma-design.md +68 -68
  146. package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
  147. package/dist/extensions/builtin/recap/AGENT.md +15 -15
  148. package/dist/extensions/builtin/sal/README.md +72 -72
  149. package/dist/extensions/builtin/security-audit/README.md +289 -289
  150. package/dist/extensions/builtin/team/AGENT.md +112 -112
  151. package/dist/extensions/builtin/team/TESTING.md +299 -299
  152. package/dist/extensions/builtin/token-save/README.md +56 -56
  153. package/dist/extensions/optional/AGENT.md +10 -10
  154. package/dist/modes/interactive/interactive-mode.js +36 -36
  155. package/dist/modes/interactive/theme/dark.json +85 -85
  156. package/dist/modes/interactive/theme/light.json +84 -84
  157. package/dist/modes/interactive/theme/theme-schema.json +335 -335
  158. package/dist/modes/interactive/theme/warm.json +81 -81
  159. package/dist/node_modules/@pencil-agent/agent-core/dist/agent-loop.js +3 -2
  160. package/dist/node_modules/@pencil-agent/agent-core/dist/structured-adaptive-agent-loop.js +2 -1
  161. package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
  162. package/docs/cc-agent-design.md +1297 -0
  163. package/docs/cc-tui-design.md +1333 -0
  164. package/docs/codex-goal-command-impl.md +1055 -1055
  165. package/docs/codex-goal-vs-grub.md +500 -500
  166. package/docs/custom-provider.md +27 -27
  167. package/docs/extensions.md +27 -27
  168. package/docs/keybindings.md +27 -27
  169. package/docs/loop 重构完成总结.md +250 -250
  170. package/docs/loop 重构完成报告.md +122 -122
  171. package/docs/loop 重构方案.md +1222 -1222
  172. package/docs/loop 重构方案实现报告.md +158 -158
  173. package/docs/loop 重构方案对比分析.md +128 -128
  174. package/docs/loop 重构计划.md +320 -320
  175. package/docs/loop-usage-examples.md +214 -214
  176. package/docs/models.md +27 -27
  177. package/docs/nanoPencil-学习计划.md +170 -0
  178. package/docs/packages.md +27 -27
  179. package/docs/pi-design-philosophy.md +457 -457
  180. package/docs/planmode.md +1987 -1987
  181. package/docs/prompt-templates.md +27 -27
  182. package/docs/providers.md +27 -27
  183. package/docs/scan-report.md +3820 -0
  184. package/docs/sdk.md +27 -27
  185. package/docs/skills.md +27 -27
  186. package/docs/themes.md +27 -27
  187. package/docs/tui.md +27 -27
  188. package/docs//345/257/271/346/240/207Claude-Code.md +1775 -0
  189. package/docs//351/230/277/351/207/214/345/267/264/345/267/264/350/264/242/346/212/245/345/210/206/346/236/220/344/271/246.md +261 -0
  190. package/package.json +190 -190
  191. package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +0 -851
  192. package/docs/SDK-TESTING.md +0 -364
  193. package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +0 -593
  194. package/docs/startup-performance-optimization.md +0 -301
  195. package/docs/认知地图.md +0 -47
@@ -1,65 +1,65 @@
1
- # GitHub — Repo actions (star, unstar, watch)
2
-
3
- `https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
4
-
5
- ## Do this first
6
-
7
- ```python
8
- # Precondition: user is logged in
9
- if not js('!!document.querySelector("meta[name=user-login]")'):
10
-     raise RuntimeError("not logged in to GitHub")
11
-
12
- # Star the current repo
13
- js("""
14
- (()=>{
15
- const f = document.querySelector('form[action$="/star"]');
16
- if (!f) return 'already-starred-or-missing';
17
- f.submit();
18
- return 'submitted';
19
- })()
20
- """)
21
- wait(2)
22
- wait_for_load()
23
-
24
- # Verify — the toggle swaps which form is present
25
- starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
26
- ```
27
-
28
- Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
29
-
30
- ## Why not click the button
31
-
32
- The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
33
-
34
- - **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
35
- - **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
36
-
37
- `form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
38
-
39
- ## Watch / Unwatch
40
-
41
- The subscription form uses a shared endpoint with a `_method` override:
42
-
43
- ```python
44
- # Watch (all activity)
45
- js("""
46
- (()=>{
47
- const f = document.querySelector('form[action$="/subscription"]');
48
- if (!f) return 'missing';
49
- f.submit();
50
- return 'submitted';
51
- })()
52
- """)
53
- ```
54
-
55
- GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
56
-
57
- ## Gotchas
58
-
59
- - **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
60
-
61
- - **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
62
-
63
- - **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
64
-
65
- - **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
1
+ # GitHub — Repo actions (star, unstar, watch)
2
+
3
+ `https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
4
+
5
+ ## Do this first
6
+
7
+ ```python
8
+ # Precondition: user is logged in
9
+ if not js('!!document.querySelector("meta[name=user-login]")'):
10
+     raise RuntimeError("not logged in to GitHub")
11
+
12
+ # Star the current repo
13
+ js("""
14
+ (()=>{
15
+ const f = document.querySelector('form[action$="/star"]');
16
+ if (!f) return 'already-starred-or-missing';
17
+ f.submit();
18
+ return 'submitted';
19
+ })()
20
+ """)
21
+ wait(2)
22
+ wait_for_load()
23
+
24
+ # Verify — the toggle swaps which form is present
25
+ starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
26
+ ```
27
+
28
+ Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
29
+
30
+ ## Why not click the button
31
+
32
+ The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
33
+
34
+ - **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
35
+ - **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
36
+
37
+ `form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
38
+
39
+ ## Watch / Unwatch
40
+
41
+ The subscription form uses a shared endpoint with a `_method` override:
42
+
43
+ ```python
44
+ # Watch (all activity)
45
+ js("""
46
+ (()=>{
47
+ const f = document.querySelector('form[action$="/subscription"]');
48
+ if (!f) return 'missing';
49
+ f.submit();
50
+ return 'submitted';
51
+ })()
52
+ """)
53
+ ```
54
+
55
+ GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
56
+
57
+ ## Gotchas
58
+
59
+ - **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
60
+
61
+ - **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
62
+
63
+ - **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
64
+
65
+ - **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
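The verification rule stated in the Gotchas of this file (a form whose action ends in `/star` means the repo is unstarred, one ending in `/unstar` means it is starred, and no forms are rendered when logged out) can be sketched as a small pure function. This is a minimal illustration, not part of the package: `star_state` and its list-of-action-strings input are hypothetical, and in practice the list would have to be collected from the page with the harness's `js(...)` helper.

```python
from typing import Iterable


def star_state(form_actions: Iterable[str]) -> str:
    """Classify star state from the `action` attributes of the forms on the page.

    Hypothetical helper: the input is assumed to be the list of form action
    URLs read from the repo header after the page has reloaded.
    """
    actions = list(form_actions)
    # "/unstar" does not end with "/star", so the two suffix checks never overlap.
    if any(a.endswith("/unstar") for a in actions):
        return "starred"
    if any(a.endswith("/star") for a in actions):
        return "unstarred"
    # Neither form present: for example, the user is not logged in.
    return "unknown"


if __name__ == "__main__":
    print(star_state(["/owner/repo/unstar"]))
    print(star_state(["/owner/repo/star"]))
    print(star_state([]))
```

Keeping the classification separate from the DOM query makes the rule easy to check without a browser; only the step that gathers the action URLs depends on the live page.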
@@ -1,184 +1,184 @@
1
- # GitHub — Scraping & Data Extraction
2
-
3
- `https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
4
-
5
- ## Do this first
6
-
7
- **Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
8
-
9
- ```python
10
- import json
11
- data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
12
- # Key fields: stargazers_count, forks_count, description, language, topics,
13
- # open_issues_count, created_at, updated_at, pushed_at,
14
- # watchers_count, subscribers_count, network_count,
15
- # default_branch, license, homepage, visibility
16
- ```
17
-
18
- Use `raw.githubusercontent.com` for file contents — no rate limit, no auth, no base64 decode:
19
-
20
- ```python
21
- readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
22
- content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
23
- ```
24
-
25
- Use the browser **only** for the trending page — it's server-side rendered HTML, no API equivalent.
26
-
27
- ## Common workflows
28
-
29
- ### Repo metadata (API)
30
-
31
- ```python
32
- import json
33
- data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
34
- print(data['stargazers_count'], data['forks_count'], data['description'])
35
- # returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
36
- ```
37
-
38
- ### User / org profile (API)
39
-
40
- ```python
41
- import json
42
- user = json.loads(http_get("https://api.github.com/users/browser-use"))
43
- print(user['type'], user['followers'], user['public_repos'], user['blog'])
44
- # returns: 'Organization' 3046 39 'https://browser-use.com'
45
- ```
46
-
47
- ### Trending page (browser required)
48
-
49
- The trending page is JS-rendered. `article.Box-row` selector confirmed working (15 results for today/all-languages, 12 for filtered). All fields work in a single JS call — **must navigate and wait in the same script run**, as each run is a separate exec context.
50
-
51
- ```python
52
- import json
53
- goto_url("https://github.com/trending") # or /trending/python?since=weekly
54
- wait_for_load()
55
- wait(2) # extra 2s — React hydration completes after readyState
56
-
57
- result = js("""
58
- (function(){
59
- var rows = Array.from(document.querySelectorAll('article.Box-row'));
60
- return JSON.stringify(rows.map(function(el){
61
- var h2link = el.querySelector('h2 a');
62
- var starLink = el.querySelector('a[href*="/stargazers"]');
63
- var forkLink = el.querySelector('a[href*="/forks"]');
64
- var langEl = el.querySelector('[itemprop="programmingLanguage"]');
65
- var todayEl = el.querySelector('.d-inline-block.float-sm-right');
66
- var descEl = el.querySelector('p');
67
- return {
68
- name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
69
- url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
70
- stars_total: starLink ? starLink.innerText.trim() : null,
71
- stars_period: todayEl ? todayEl.innerText.trim() : null,
72
- forks: forkLink ? forkLink.innerText.trim() : null,
73
- language: langEl ? langEl.innerText.trim() : null,
74
- desc: descEl ? descEl.innerText.trim() : null
75
- };
76
- }));
77
- })()
78
- """)
79
- repos = json.loads(result)
80
- # stars_period text is e.g. "737 stars today" or "47,053 stars this week"
81
- ```
82
-
83
- Supported URL params:
84
- - `/trending` — all languages, today
85
- - `/trending/python` — filtered to Python
86
- - `/trending?since=weekly` or `?since=monthly`
87
- - `/trending/python?since=weekly` — combined
88
-
89
- ### Search repositories (API)
90
-
91
- ```python
92
- import json
93
- results = json.loads(http_get(
94
- "https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
95
- ))
96
- print(results['total_count']) # e.g. 3250
97
- for r in results['items']:
98
-     print(r['full_name'], r['stargazers_count'])
99
- ```
100
-
101
- Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
102
-
103
- ### Commits, releases, issues (API)
104
-
105
- ```python
106
- import json
107
- # Commits
108
- commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
109
- # Fields: sha, commit.message, commit.author.date, author.login
110
-
111
- # Releases
112
- releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
113
- # Fields: tag_name, name, published_at, body, assets
114
-
115
- # Issues
116
- issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
117
- # Fields: number, title, labels, state, created_at, user.login
118
-
119
- # Contributors
120
- contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
121
- # Fields: login, contributions
122
- ```
123
-
124
- ### File contents via API (base64)
125
-
126
- ```python
127
- import json, base64
128
- resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
129
- content = base64.b64decode(resp['content']).decode()
130
- # resp also has: size, sha, html_url
131
- # Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
132
- ```
133
-
134
- ### Parallel fetching (multiple repos)
135
-
136
- ```python
137
- import json
138
- from concurrent.futures import ThreadPoolExecutor
139
-
140
- def fetch_repo(name):
141
-     data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
142
-     return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
143
-
144
- repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
145
- with ThreadPoolExecutor(max_workers=3) as ex:
146
-     results = list(ex.map(fetch_repo, repos))
147
- # Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
148
- ```
149
-
150
- ## Gotchas
151
-
152
- - **Rate limits are per IP, unauthenticated** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, both limits increase to 5,000/hour.
153
-
154
- - **Token header format** — Use `Authorization: Bearer <token>` (not `token <token>`), plus `X-GitHub-Api-Version: 2022-11-28`:
155
- ```python
156
- import os
157
- token = os.environ.get('GITHUB_TOKEN', '')
158
- headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
159
- data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
160
- ```
161
-
162
- - **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
163
- ```python
164
- try:
165
-     data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
166
- except Exception as e:
167
-     print("Not found or rate limited:", e)
168
- ```
169
-
170
- - **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
171
-
172
- - **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
173
-
174
- - **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
175
-
176
- - **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
177
-
178
- - **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"` — strip the trailing word if you want just the number.
179
-
180
- - **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because GitHub uses React with server-side hydration and component IDs change. The REST API returns all the same data cleanly.
181
-
182
- - **raw.githubusercontent.com needs no auth and does not count against the API rate limit** — Use it for any public file. It serves the raw bytes, with no JSON wrapping or base64.
183
-
184
- - **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
1
+ # GitHub — Scraping & Data Extraction
2
+
3
+ `https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
4
+
5
+ ## Do this first
6
+
7
+ **Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
8
+
9
+ ```python
10
+ import json
11
+ data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
12
+ # Key fields: stargazers_count, forks_count, description, language, topics,
13
+ # open_issues_count, created_at, updated_at, pushed_at,
14
+ # watchers_count, subscribers_count, network_count,
15
+ # default_branch, license, homepage, visibility
16
+ ```
17
+
18
+ Use `raw.githubusercontent.com` for file contents (no API quota used, no auth, no base64 decode):
19
+
20
+ ```python
21
+ readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
22
+ content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
23
+ ```
24
+
25
+ Use the browser **only** for the trending page, which has no API equivalent.
26
+
27
+ ## Common workflows
28
+
29
+ ### Repo metadata (API)
30
+
31
+ ```python
32
+ import json
33
+ data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
34
+ print(data['stargazers_count'], data['forks_count'], data['description'])
35
+ # returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
36
+ ```
37
+
38
+ ### User / org profile (API)
39
+
40
+ ```python
41
+ import json
42
+ user = json.loads(http_get("https://api.github.com/users/browser-use"))
43
+ print(user['type'], user['followers'], user['public_repos'], user['blog'])
44
+ # returns: 'Organization' 3046 39 'https://browser-use.com'
45
+ ```
46
+
47
+ ### Trending page (browser required)
48
+
49
+ The trending page is JS-rendered. The `article.Box-row` selector is confirmed working (15 results for today/all-languages, 12 for a filtered view). All fields can be extracted in a single JS call, but you **must navigate and wait in the same script run**, because each run is a separate exec context.
50
+
51
+ ```python
52
+ import json
53
+ goto_url("https://github.com/trending") # or /trending/python?since=weekly
54
+ wait_for_load()
55
+ wait(2) # extra 2s — React hydration completes after readyState
56
+
57
+ result = js("""
58
+ (function(){
59
+ var rows = Array.from(document.querySelectorAll('article.Box-row'));
60
+ return JSON.stringify(rows.map(function(el){
61
+ var h2link = el.querySelector('h2 a');
62
+ var starLink = el.querySelector('a[href*="/stargazers"]');
63
+ var forkLink = el.querySelector('a[href*="/forks"]');
64
+ var langEl = el.querySelector('[itemprop="programmingLanguage"]');
65
+ var todayEl = el.querySelector('.d-inline-block.float-sm-right');
66
+ var descEl = el.querySelector('p');
67
+ return {
68
+ name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
69
+ url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
70
+ stars_total: starLink ? starLink.innerText.trim() : null,
71
+ stars_period: todayEl ? todayEl.innerText.trim() : null,
72
+ forks: forkLink ? forkLink.innerText.trim() : null,
73
+ language: langEl ? langEl.innerText.trim() : null,
74
+ desc: descEl ? descEl.innerText.trim() : null
75
+ };
76
+ }));
77
+ })()
78
+ """)
79
+ repos = json.loads(result)
80
+ # stars_period text is e.g. "737 stars today" or "47,053 stars this week"
81
+ ```
82
+
83
+ Supported URL params:
84
+ - `/trending` — all languages, today
85
+ - `/trending/python` — filtered to Python
86
+ - `/trending?since=weekly` or `?since=monthly`
87
+ - `/trending/python?since=weekly` — combined
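
The URL patterns above can be assembled programmatically. This is a minimal sketch, not part of the harness: `trending_url` is a hypothetical helper name, and it only accepts the two `since` values listed above.

```python
from urllib.parse import quote, urlencode

def trending_url(language=None, since=None):
    """Build a github.com/trending URL from the optional filters (hypothetical helper)."""
    url = "https://github.com/trending"
    if language:
        # The language filter goes in the path, e.g. /trending/python
        url += "/" + quote(language)
    if since:
        # Assumption: only the values named in the list above are allowed here
        if since not in ("weekly", "monthly"):
            raise ValueError(f"unsupported since value: {since!r}")
        url += "?" + urlencode({"since": since})
    return url

print(trending_url())
print(trending_url("python", "weekly"))
```

The result can be passed straight to `goto_url()` in the trending script.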
88
+
89
+ ### Search repositories (API)
90
+
91
+ ```python
92
+ import json
93
+ results = json.loads(http_get(
94
+ "https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
95
+ ))
96
+ print(results['total_count']) # e.g. 3250
97
+ for r in results['items']:
98
+     print(r['full_name'], r['stargazers_count'])
99
+ ```
100
+
101
+ Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
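
One way to stay under the 10 req/min search limit in a loop is to space calls evenly. The sketch below is an illustration under that assumption; `MinIntervalThrottle` and `search_throttle` are hypothetical names, and the clock and sleep functions are injectable only to make the class easy to test.

```python
import time

class MinIntervalThrottle:
    """Space successive calls at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call; None before the first one

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous call: sleep off the difference
                self._sleep(remaining)
                now = self._clock()
        self._last = now

# 10 requests per minute means at most one request every 6 seconds
search_throttle = MinIntervalThrottle(interval=60 / 10)
```

Call `search_throttle.wait()` immediately before each search `http_get(...)`; the first call returns at once and later calls block only as long as needed.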
102
+
103
+ ### Commits, releases, issues (API)
104
+
105
+ ```python
106
+ import json
107
+ # Commits
108
+ commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
109
+ # Fields: sha, commit.message, commit.author.date, author.login
110
+
111
+ # Releases
112
+ releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
113
+ # Fields: tag_name, name, published_at, body, assets
114
+
115
+ # Issues
116
+ issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
117
+ # Fields: number, title, labels, state, created_at, user.login
118
+
119
+ # Contributors
120
+ contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
121
+ # Fields: login, contributions
122
+ ```
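
Note that GitHub's issues endpoint also returns pull requests, which carry a `pull_request` key. A small sketch for dropping them follows; `only_issues` is a hypothetical helper and the sample rows are made up, with most fields omitted.

```python
def only_issues(items):
    """Drop pull requests from a /repos/{owner}/{repo}/issues response list."""
    return [item for item in items if "pull_request" not in item]

# Made-up sample shaped like the API response (most fields omitted)
sample = [
    {"number": 12, "title": "Crash on start"},
    {"number": 13, "title": "Fix crash",
     "pull_request": {"html_url": "https://github.com/owner/repo/pull/13"}},
]
print([i["number"] for i in only_issues(sample)])  # prints [12]
```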
123
+
124
+ ### File contents via API (base64)
125
+
126
+ ```python
127
+ import json, base64
128
+ resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
129
+ content = base64.b64decode(resp['content']).decode()
130
+ # resp also has: size, sha, html_url
131
+ # Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
132
+ ```
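
The decode step can be made a little more defensive. This sketch assumes the response carries an `encoding` field set to `base64` for regular files and that `content` may be empty for files the API will not inline; `decode_contents` is a hypothetical helper and the sample payload is made up.

```python
import base64

def decode_contents(resp):
    """Return the file text from a contents-API response dict, or None if not inlined."""
    if resp.get("encoding") != "base64" or not resp.get("content"):
        return None
    # b64decode skips the newlines that appear inside the payload
    return base64.b64decode(resp["content"]).decode("utf-8")

# Made-up payload: base64 of "print('hi')" split across two lines
sample = {"encoding": "base64", "content": "cHJpbnQo\nJ2hpJyk=\n"}
print(decode_contents(sample))  # prints print('hi')
```

When the helper returns `None`, falling back to `raw.githubusercontent.com` is the simpler path.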
133
+
134
+ ### Parallel fetching (multiple repos)
135
+
136
+ ```python
137
+ import json
138
+ from concurrent.futures import ThreadPoolExecutor
139
+
140
+ def fetch_repo(name):
141
+     data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
142
+     return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
143
+
144
+ repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
145
+ with ThreadPoolExecutor(max_workers=3) as ex:
146
+     results = list(ex.map(fetch_repo, repos))
147
+ # Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
148
+ ```
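
Before fanning out many calls, it may be worth checking how much quota is left. The sketch below reads the `resources` section of a `/rate_limit` response; `remaining_quota` is a hypothetical helper, and the sample payload is made up and trimmed to the two pools discussed here.

```python
import time

def remaining_quota(rate_limit, resource="core", now=None):
    """Return (remaining_calls, seconds_until_reset) for one resource pool."""
    pool = rate_limit["resources"][resource]
    now = time.time() if now is None else now
    # `reset` is a Unix timestamp; clamp at zero if it has already passed
    return pool["remaining"], max(0, int(pool["reset"] - now))

# Made-up payload shaped like the /rate_limit response (other pools omitted)
sample = {"resources": {
    "core":   {"limit": 60, "remaining": 3, "reset": 1_700_000_600},
    "search": {"limit": 10, "remaining": 10, "reset": 1_700_000_060},
}}
print(remaining_quota(sample, "core", now=1_700_000_000))  # prints (3, 600)
```

In a real run the input would be `json.loads(http_get("https://api.github.com/rate_limit"))`, and the thread pool would be skipped or shrunk when the remaining count is lower than the number of repos.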
149
+
150
+ ## Gotchas
151
+
152
+ - **Unauthenticated rate limits are per IP** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check the `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, the core limit rises to 5,000 req/hour and the search limit to 30 req/min.
153
+
154
+ - **Token header format** — Use `Authorization: Bearer <token>` plus `X-GitHub-Api-Version: 2022-11-28` (GitHub also accepts the older `token <token>` form for personal access tokens):
155
+ ```python
156
+ import json, os
157
+ token = os.environ.get('GITHUB_TOKEN', '')
158
+ headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
159
+ data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
160
+ ```
161
+
162
+ - **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
163
+ ```python
164
+ try:
165
+     data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
166
+ except Exception as e:
167
+     print("Not found or rate limited:", e)
168
+ ```
169
+
170
+ - **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
171
+
172
+ - **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
173
+
174
+ - **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
175
+
176
+ - **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
177
+
178
+ - **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"` — strip the trailing word if you want just the number.
179
+
180
+ - **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because GitHub uses React with server-side hydration and component IDs change. The REST API returns all the same data cleanly.
181
+
182
+ - **raw.githubusercontent.com needs no auth and does not count against the API rate limit** — Use it for any public file. It serves the raw bytes, with no JSON wrapping or base64.
183
+
184
+ - **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
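
The two string-format gotchas for trending stars can be handled with one parser. This is a minimal sketch: `parse_count` is a hypothetical helper, and the rows are made-up data shaped like the output of the trending extraction script.

```python
import re

def parse_count(text):
    """Turn '4,548' or '737 stars today' into an int; None if no number is found."""
    match = re.search(r"\d[\d,]*", text or "")
    return int(match.group().replace(",", "")) if match else None

# Made-up rows shaped like the trending extraction output
rows = [
    {"name": "a / one", "stars_total": "4,548", "stars_period": "737 stars today"},
    {"name": "b / two", "stars_total": "88,349", "stars_period": "47,053 stars this week"},
    {"name": "c / three", "stars_total": None, "stars_period": None},
]
# Rows with a missing field sort last by treating None as 0
ranked = sorted(rows, key=lambda r: parse_count(r["stars_period"]) or 0, reverse=True)
print([r["name"] for r in ranked])  # prints ['b / two', 'a / one', 'c / three']
```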