@pencil-agent/nano-pencil 2.0.0 → 2.0.1
- package/README.md +267 -267
- package/dist/build-meta.json +3 -3
- package/dist/core/export-html/AGENT.md +11 -11
- package/dist/core/export-html/template.css +971 -971
- package/dist/core/export-html/template.html +54 -54
- package/dist/core/mcp/mcp-client.d.ts +3 -1
- package/dist/core/mcp/mcp-client.js +6 -6
- package/dist/core/mcp/mcp-config.d.ts +3 -3
- package/dist/core/mcp/mcp-config.js +1 -1
- package/dist/core/mcp/mcp-manager.d.ts +5 -1
- package/dist/core/mcp/mcp-manager.js +1 -1
- package/dist/core/platform/config/resource-loader.d.ts +2 -0
- package/dist/core/platform/config/resource-loader.js +2 -2
- package/dist/core/runtime/agent-session.d.ts +12 -0
- package/dist/core/runtime/agent-session.js +8 -8
- package/dist/core/runtime/sdk.d.ts +8 -0
- package/dist/core/runtime/sdk.js +1 -1
- package/dist/extensions/builtin/AGENT.md +115 -115
- package/dist/extensions/builtin/browser/AGENT.md +17 -17
- package/dist/extensions/builtin/browser/agent-workspace/agent_helpers.py +12 -12
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/amazon/product-search.md +198 -198
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/archive-org/scraping.md +341 -341
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv/scraping.md +311 -311
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/arxiv-bulk/scraping.md +333 -333
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/atlas/overview.md +70 -70
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/booking-com/scraping.md +578 -578
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/capterra/scraping.md +440 -440
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/centilebrain/generate-estimates.md +110 -110
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coingecko/scraping.md +325 -325
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coinmarketcap/scraping.md +463 -463
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/coursera/scraping.md +360 -360
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/craigslist/scraping.md +390 -390
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/crossref/scraping.md +568 -568
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/dev-to/scraping.md +323 -323
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/duckduckgo/scraping.md +349 -349
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/ebay/scraping.md +435 -435
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/etsy/scraping.md +506 -506
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/eventbrite/scraping.md +363 -363
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/expedia/automation.md +168 -168
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/groups.md +236 -236
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/facebook/pages.md +295 -295
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/framer/editor.md +108 -108
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/fred/scraping.md +493 -493
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/g2/scraping.md +580 -580
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/genius/scraping.md +511 -511
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md +65 -65
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/scraping.md +184 -184
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/glassdoor/scraping.md +543 -543
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gmail/compose.md +122 -122
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/goodreads/scraping.md +461 -461
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/gutenberg/scraping.md +383 -383
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/hackernews/scraping.md +243 -243
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/howlongtobeat/scraping.md +473 -473
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/imdb/scraping.md +271 -271
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/itch-io/scraping.md +436 -436
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/job-boards/indeed-glassdoor.md +1021 -1021
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/letterboxd/scraping.md +349 -349
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/linkedin/invitation-manager.md +109 -109
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/loom/folder-enumeration.md +170 -170
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/macrotrends/scraping.md +537 -537
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/article-hydration.md +120 -120
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/medium/scraping.md +414 -414
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/metacritic/scraping.md +477 -477
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/musicbrainz/scraping.md +478 -478
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/nasa/scraping.md +339 -339
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/news-aggregation/multi-source.md +205 -205
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/open-library/scraping.md +472 -472
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openalex/scraping.md +470 -470
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/openstreetmap/scraping.md +490 -490
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/package-registries/npm-pypi.md +478 -478
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/polymarket/scraping.md +234 -234
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/producthunt/scraping.md +307 -307
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/pubmed/scraping.md +421 -421
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/quora/scraping.md +364 -364
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rawg/scraping.md +352 -352
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/reddit/scraping.md +124 -124
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/rest-countries/scraping.md +233 -233
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/sec-edgar/scraping.md +361 -361
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/README.md +36 -36
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/embedded-apps.md +72 -72
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/knowledge-base.md +109 -109
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/shopify-admin/polaris-inputs.md +137 -137
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/soundcloud/scraping.md +362 -362
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/spotify/scraping.md +339 -339
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/stackoverflow/scraping.md +435 -435
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/steam/scraping.md +575 -575
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/substack/scraping.md +338 -338
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/thetechgeeks/pricing.md +52 -52
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tiktok/upload.md +107 -107
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/tradingview/scraping.md +309 -309
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trello/boards-and-lists.md +88 -88
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/trustpilot/scraping.md +375 -375
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/walmart/scraping.md +444 -444
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wayback-machine/scraping.md +306 -306
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/weather/scraping.md +398 -398
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/wellfound/scraping.md +596 -596
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/world-bank/scraping.md +356 -356
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/xiaohongshu/scraping.md +84 -84
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/youtube/scraping.md +418 -418
- package/dist/extensions/builtin/browser/agent-workspace/domain-skills/zillow/scraping.md +433 -433
- package/dist/extensions/builtin/browser/browser.md +73 -73
- package/dist/extensions/builtin/browser/install.md +142 -142
- package/dist/extensions/builtin/browser/interaction-skills/connection.md +48 -48
- package/dist/extensions/builtin/browser/interaction-skills/cookies.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/cross-origin-iframes.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/dialogs.md +64 -64
- package/dist/extensions/builtin/browser/interaction-skills/downloads.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/drag-and-drop.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/dropdowns.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/iframes.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/network-requests.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/print-as-pdf.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/profile-sync.md +90 -90
- package/dist/extensions/builtin/browser/interaction-skills/screenshots.md +17 -17
- package/dist/extensions/builtin/browser/interaction-skills/scrolling.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/shadow-dom.md +3 -3
- package/dist/extensions/builtin/browser/interaction-skills/tabs.md +69 -69
- package/dist/extensions/builtin/browser/interaction-skills/uploads.md +1 -1
- package/dist/extensions/builtin/browser/interaction-skills/viewport.md +3 -3
- package/dist/extensions/builtin/browser/src/browser_harness/AGENT.md +15 -15
- package/dist/extensions/builtin/browser/src/browser_harness/__init__.py +8 -8
- package/dist/extensions/builtin/browser/src/browser_harness/_ipc.py +90 -90
- package/dist/extensions/builtin/browser/src/browser_harness/admin.py +722 -722
- package/dist/extensions/builtin/browser/src/browser_harness/daemon.py +328 -328
- package/dist/extensions/builtin/browser/src/browser_harness/helpers.py +396 -396
- package/dist/extensions/builtin/browser/src/browser_harness/run.py +103 -103
- package/dist/extensions/builtin/discipline/skills/brainstorming/SKILL.md +33 -33
- package/dist/extensions/builtin/discipline/skills/executing-plans/SKILL.md +25 -25
- package/dist/extensions/builtin/discipline/skills/finishing-development-branch/SKILL.md +25 -25
- package/dist/extensions/builtin/discipline/skills/receiving-code-review/SKILL.md +22 -22
- package/dist/extensions/builtin/discipline/skills/requesting-code-review/SKILL.md +31 -31
- package/dist/extensions/builtin/discipline/skills/systematic-debugging/SKILL.md +28 -28
- package/dist/extensions/builtin/discipline/skills/test-driven-development/SKILL.md +32 -32
- package/dist/extensions/builtin/discipline/skills/using-git-worktrees/SKILL.md +25 -25
- package/dist/extensions/builtin/discipline/skills/verification-before-completion/SKILL.md +27 -27
- package/dist/extensions/builtin/discipline/skills/writing-plans/SKILL.md +26 -26
- package/dist/extensions/builtin/goal/README.md +67 -67
- package/dist/extensions/builtin/grub/README.md +112 -112
- package/dist/extensions/builtin/link-world/agent-workspace/README.md +16 -16
- package/dist/extensions/builtin/link-world/internet-search/internet-search.md +65 -65
- package/dist/extensions/builtin/link-world/link-world-agent.md +82 -82
- package/dist/extensions/builtin/link-world/linkworld.md +313 -313
- package/dist/extensions/builtin/link-world/network-routing/network-routing.md +67 -67
- package/dist/extensions/builtin/loop/README.md +92 -92
- package/dist/extensions/builtin/mcp/figma-design.md +68 -68
- package/dist/extensions/builtin/mcp/mcp-management.md +85 -85
- package/dist/extensions/builtin/recap/AGENT.md +15 -15
- package/dist/extensions/builtin/sal/README.md +72 -72
- package/dist/extensions/builtin/security-audit/README.md +289 -289
- package/dist/extensions/builtin/team/AGENT.md +112 -112
- package/dist/extensions/builtin/team/TESTING.md +299 -299
- package/dist/extensions/builtin/token-save/README.md +56 -56
- package/dist/extensions/optional/AGENT.md +10 -10
- package/dist/modes/interactive/interactive-mode.js +36 -36
- package/dist/modes/interactive/theme/dark.json +85 -85
- package/dist/modes/interactive/theme/light.json +84 -84
- package/dist/modes/interactive/theme/theme-schema.json +335 -335
- package/dist/modes/interactive/theme/warm.json +81 -81
- package/dist/node_modules/@pencil-agent/agent-core/dist/agent-loop.js +3 -2
- package/dist/node_modules/@pencil-agent/agent-core/dist/structured-adaptive-agent-loop.js +2 -1
- package/dist/node_modules/@pencil-agent/ai/dist/cli.js +0 -0
- package/docs/cc-agent-design.md +1297 -0
- package/docs/cc-tui-design.md +1333 -0
- package/docs/codex-goal-command-impl.md +1055 -1055
- package/docs/codex-goal-vs-grub.md +500 -500
- package/docs/custom-provider.md +27 -27
- package/docs/extensions.md +27 -27
- package/docs/keybindings.md +27 -27
- package/docs/loop 重构完成总结.md +250 -250
- package/docs/loop 重构完成报告.md +122 -122
- package/docs/loop 重构方案.md +1222 -1222
- package/docs/loop 重构方案实现报告.md +158 -158
- package/docs/loop 重构方案对比分析.md +128 -128
- package/docs/loop 重构计划.md +320 -320
- package/docs/loop-usage-examples.md +214 -214
- package/docs/models.md +27 -27
- package/docs/nanoPencil-学习计划.md +170 -0
- package/docs/packages.md +27 -27
- package/docs/pi-design-philosophy.md +457 -457
- package/docs/planmode.md +1987 -1987
- package/docs/prompt-templates.md +27 -27
- package/docs/providers.md +27 -27
- package/docs/scan-report.md +3820 -0
- package/docs/sdk.md +27 -27
- package/docs/skills.md +27 -27
- package/docs/themes.md +27 -27
- package/docs/tui.md +27 -27
- package/docs//345/257/271/346/240/207Claude-Code.md +1775 -0
- package/docs//351/230/277/351/207/214/345/267/264/345/267/264/350/264/242/346/212/245/345/210/206/346/236/220/344/271/246.md +261 -0
- package/package.json +190 -190
- package/docs/ACP/345/215/217/350/256/256/351/233/206/346/210/220/345/274/200/345/217/221/346/226/207/346/241/243.md +0 -851
- package/docs/SDK-TESTING.md +0 -364
- package/docs/mem-core/346/212/200/346/234/257/346/226/207/346/241/243.md +0 -593
- package/docs/startup-performance-optimization.md +0 -301
- package/docs/认知地图.md +0 -47
package/dist/extensions/builtin/browser/agent-workspace/domain-skills/github/repo-actions.md
CHANGED
@@ -1,65 +1,65 @@
-# GitHub — Repo actions (star, unstar, watch)
-
-`https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
-
-## Do this first
-
-```python
-# Precondition: user is logged in
-if not js('!!document.querySelector("meta[name=user-login]")'):
-    raise RuntimeError("not logged in to GitHub")
-
-# Star the current repo
-js("""
-(()=>{
-  const f = document.querySelector('form[action$="/star"]');
-  if (!f) return 'already-starred-or-missing';
-  f.submit();
-  return 'submitted';
-})()
-""")
-wait(2)
-wait_for_load()
-
-# Verify — the toggle swaps which form is present
-starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
-```
-
-Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
-
-## Why not click the button
-
-The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
-
-- **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
-- **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
-
-`form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
-
-## Watch / Unwatch
-
-The subscription form uses a shared endpoint with a `_method` override:
-
-```python
-# Watch (all activity)
-js("""
-(()=>{
-  const f = document.querySelector('form[action$="/subscription"]');
-  if (!f) return 'missing';
-  f.submit();
-  return 'submitted';
-})()
-""")
-```
-
-GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
-
-## Gotchas
-
-- **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
-
-- **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
-
-- **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
-
-- **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
+# GitHub — Repo actions (star, unstar, watch)
+
+`https://github.com/{owner}/{repo}` — user-triggered actions on the repo header (Star, Unstar, Watch, Unwatch) are HTML forms that POST back to GitHub with the session's CSRF token already rendered inline. **Submit the form — do not click the button.**
+
+## Do this first
+
+```python
+# Precondition: user is logged in
+if not js('!!document.querySelector("meta[name=user-login]")'):
+    raise RuntimeError("not logged in to GitHub")
+
+# Star the current repo
+js("""
+(()=>{
+  const f = document.querySelector('form[action$="/star"]');
+  if (!f) return 'already-starred-or-missing';
+  f.submit();
+  return 'submitted';
+})()
+""")
+wait(2)
+wait_for_load()
+
+# Verify — the toggle swaps which form is present
+starred = js('!!document.querySelector(\'form[action$="/unstar"]\')')
+```
+
+Same pattern for the reverse (`form[action$="/unstar"]`) and for watch/unwatch (`form[action$="/subscription"]` + a hidden `_method` field, see below).
+
+## Why not click the button
+
+The visible Star button looks like `button[aria-label^="Star "]`, but that selector has two gotchas on the modern repo header:
+
+- **There are two matching buttons.** The first one `querySelector` returns is a hidden fallback inside the sticky sub-header form with `getBoundingClientRect() == {x:0, y:0, w:0, h:0}`. Coordinate-clicking it does nothing because it has no geometry.
+- **Synthetic `.click()` on the visible React button does not persist the star.** The click fires, `aria-label` stays `Star ...`, network tab shows no POST. GitHub's component swallows the synthetic event somewhere in its React fiber handler.
+
+`form.submit()` sidesteps both problems — it bypasses React entirely and goes straight to the HTML form's POST. The authenticity token is already in a hidden input inside the form, so there's nothing extra to fetch.
+
+## Watch / Unwatch
+
+The subscription form uses a shared endpoint with a `_method` override:
+
+```python
+# Watch (all activity)
+js("""
+(()=>{
+  const f = document.querySelector('form[action$="/subscription"]');
+  if (!f) return 'missing';
+  f.submit();
+  return 'submitted';
+})()
+""")
+```
+
+GitHub renders different form attributes (different `_method` hidden input values) depending on the current state. Re-read the form after every toggle rather than caching a reference.
+
+## Gotchas
+
+- **Star count in the rendered button lags the true count by a hydration tick.** The durable signal that "this worked" is which form is on the page after reload: `form[action$="/star"]` present means unstarred, `form[action$="/unstar"]` means starred. The visible aria-label is reliable once you scroll to the top and wait ~1s after submit; the count inside the button updates on soft navigation and is not a good assertion target.
+
+- **`form.submit()` bypasses the form's `submit` event listeners** — fine for GitHub's case (the handler is a full navigation), but if a future change wires in `e.preventDefault()` to do an XHR, `form.requestSubmit()` is the safer alternative. Worth trying first if `form.submit()` stops working.
+
+- **If the user is not logged in the forms are not rendered at all.** `meta[name="user-login"]` is the cheapest pre-check.
+
+- **For read-only star counts, don't touch the DOM — use the API.** `http_get("https://api.github.com/repos/{owner}/{repo}")` returns `stargazers_count` without any browser interaction. See `scraping.md`. Only use the form-submit pattern when you actually need to *change* state on behalf of the logged-in user.
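The star, unstar, and watch snippets in repo-actions.md differ only in the `action` suffix of the form selector, so the JavaScript could be generated instead of copied. The following is a minimal sketch under that assumption and is not part of the package: `build_submit_js` and `star_state` are hypothetical helper names, and the returned string is meant to be passed to the harness's `js(...)` helper that appears in the diff. The `use_request_submit` flag switches to `form.requestSubmit()`, which the Gotchas section names as the fallback if `form.submit()` stops working.

```python
import json

# Action suffixes taken from the selectors quoted in repo-actions.md.
ALLOWED_SUFFIXES = ("/star", "/unstar", "/subscription")


def build_submit_js(action_suffix: str, use_request_submit: bool = False) -> str:
    """Return a JS snippet that submits the repo-header form ending in action_suffix.

    Hypothetical helper: the result is intended for the harness's js() call.
    """
    if action_suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported action suffix: {action_suffix!r}")
    selector = f'form[action$="{action_suffix}"]'
    # requestSubmit() fires the form's submit listeners; submit() skips them.
    call = "f.requestSubmit()" if use_request_submit else "f.submit()"
    return (
        "(()=>{"
        # json.dumps yields a correctly escaped JS string literal for the selector.
        f"const f = document.querySelector({json.dumps(selector)});"
        "if (!f) return 'missing';"
        f"{call};"
        "return 'submitted';"
        "})()"
    )


def star_state(has_star_form: bool, has_unstar_form: bool) -> str:
    """Map which form is present after reload to a state label."""
    if has_unstar_form and not has_star_form:
        return "starred"
    if has_star_form and not has_unstar_form:
        return "unstarred"
    # Neither form (e.g. logged out) or both at once: do not guess.
    return "unknown"
```

In a harness script this would be used as `js(build_submit_js("/star"))` followed by `wait_for_load()`, with the two booleans for `star_state` read back from fresh `querySelector` checks rather than from a cached form reference.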
@@ -1,184 +1,184 @@
|
|
|
1
|
-
# GitHub — Scraping & Data Extraction
|
|
2
|
-
|
|
3
|
-
`https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
|
|
4
|
-
|
|
5
|
-
## Do this first
|
|
6
|
-
|
|
7
|
-
**Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
|
|
8
|
-
|
|
9
|
-
```python
|
|
10
|
-
import json
|
|
11
|
-
data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
|
|
12
|
-
# Key fields: stargazers_count, forks_count, description, language, topics,
|
|
13
|
-
# open_issues_count, created_at, updated_at, pushed_at,
|
|
14
|
-
# watchers_count, subscribers_count, network_count,
|
|
15
|
-
# default_branch, license, homepage, visibility
|
|
16
|
-
```
|
|
17
|
-
|
|
18
|
-
Use `raw.githubusercontent.com` for file contents — no rate limit, no auth, no base64 decode:
|
|
19
|
-
|
|
20
|
-
```python
|
|
21
|
-
readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
|
|
22
|
-
content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
|
|
23
|
-
```
|
|
24
|
-
|
|
25
|
-
Use the browser **only** for the trending page — it's server-side rendered HTML, no API equivalent.
|
|
26
|
-
|
|
27
|
-
## Common workflows
|
|
28
|
-
|
|
29
|
-
### Repo metadata (API)
|
|
30
|
-
|
|
31
|
-
```python
|
|
32
|
-
import json
|
|
33
|
-
data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
|
|
34
|
-
print(data['stargazers_count'], data['forks_count'], data['description'])
|
|
35
|
-
# returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
|
|
36
|
-
```
|
|
37
|
-
|
|
38
|
-
### User / org profile (API)
|
|
39
|
-
|
|
40
|
-
```python
|
|
41
|
-
import json
|
|
42
|
-
user = json.loads(http_get("https://api.github.com/users/browser-use"))
|
|
43
|
-
print(user['type'], user['followers'], user['public_repos'], user['blog'])
|
|
44
|
-
# returns: 'Organization' 3046 39 'https://browser-use.com'
|
|
45
|
-
```
|
|
46
|
-
|
|
47
|
-
### Trending page (browser required)
|
|
48
|
-
|
|
49
|
-
The trending page is JS-rendered. `article.Box-row` selector confirmed working (15 results for today/all-languages, 12 for filtered). All fields work in a single JS call — **must navigate and wait in the same script run**, as each run is a separate exec context.
|
|
50
|
-
|
|
51
|
-
```python
|
|
52
|
-
import json
|
|
53
|
-
goto_url("https://github.com/trending") # or /trending/python?since=weekly
|
|
54
|
-
wait_for_load()
|
|
55
|
-
wait(2) # extra 2s — React hydration completes after readyState
|
|
56
|
-
|
|
57
|
-
result = js("""
|
|
58
|
-
(function(){
|
|
59
|
-
var rows = Array.from(document.querySelectorAll('article.Box-row'));
|
|
60
|
-
return JSON.stringify(rows.map(function(el){
|
|
61
|
-
var h2link = el.querySelector('h2 a');
|
|
62
|
-
var starLink = el.querySelector('a[href*="/stargazers"]');
|
|
63
|
-
var forkLink = el.querySelector('a[href*="/forks"]');
|
|
64
|
-
var langEl = el.querySelector('[itemprop="programmingLanguage"]');
|
|
65
|
-
var todayEl = el.querySelector('.d-inline-block.float-sm-right');
|
|
66
|
-
var descEl = el.querySelector('p');
|
|
67
|
-
return {
|
|
68
|
-
name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
|
|
69
|
-
url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
|
|
70
|
-
stars_total: starLink ? starLink.innerText.trim() : null,
|
|
71
|
-
stars_period: todayEl ? todayEl.innerText.trim() : null,
|
|
72
|
-
forks: forkLink ? forkLink.innerText.trim() : null,
|
|
73
|
-
language: langEl ? langEl.innerText.trim() : null,
|
|
74
|
-
desc: descEl ? descEl.innerText.trim() : null
|
|
75
|
-
};
|
|
76
|
-
}));
|
|
77
|
-
})()
|
|
78
|
-
""")
|
|
79
|
-
repos = json.loads(result)
|
|
80
|
-
# stars_period text is e.g. "737 stars today" or "47,053 stars this week"
|
|
81
|
-
```
|
|
82
|
-
|
|
83
|
-
Supported URL params:
|
|
84
|
-
- `/trending` — all languages, today
|
|
85
|
-
- `/trending/python` — filtered to Python
|
|
86
|
-
- `/trending?since=weekly` or `?since=monthly`
|
|
87
|
-
- `/trending/python?since=weekly` — combined
|
|
88
|
-
|
|
89
|
-
### Search repositories (API)
|
|
90
|
-
|
|
91
|
-
```python
|
|
92
|
-
import json
|
|
93
|
-
results = json.loads(http_get(
|
|
94
|
-
"https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
|
|
95
|
-
))
|
|
96
|
-
print(results['total_count']) # e.g. 3250
|
|
97
|
-
for r in results['items']:
|
|
98
|
-
print(r['full_name'], r['stargazers_count'])
|
|
99
|
-
```
|
|
100
|
-
|
|
101
|
-
Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
|
|
102
|
-
|
|
103
|
-
### Commits, releases, issues (API)
|
|
104
|
-
|
|
105
|
-
```python
|
|
106
|
-
import json
|
|
107
|
-
# Commits
|
|
108
|
-
commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
|
|
109
|
-
# Fields: sha, commit.message, commit.author.date, author.login
|
|
110
|
-
|
|
111
|
-
# Releases
|
|
112
|
-
releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
|
|
113
|
-
# Fields: tag_name, name, published_at, body, assets
|
|
114
|
-
|
|
115
|
-
# Issues
|
|
116
|
-
issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
|
|
117
|
-
# Fields: number, title, labels, state, created_at, user.login
|
|
118
|
-
|
|
119
|
-
# Contributors
|
|
120
|
-
contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
|
|
121
|
-
# Fields: login, contributions
|
|
122
|
-
```
|
|
123
|
-
|
|
124
|
-
### File contents via API (base64)
|
|
125
|
-
|
|
126
|
-
```python
|
|
127
|
-
import json, base64
|
|
128
|
-
resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
|
|
129
|
-
content = base64.b64decode(resp['content']).decode()
|
|
130
|
-
# resp also has: size, sha, html_url
|
|
131
|
-
# Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
|
|
132
|
-
```
|
|
133
|
-
|
|
134
|
-
### Parallel fetching (multiple repos)
|
|
135
|
-
|
|
136
|
-
```python
|
|
137
|
-
import json
|
|
138
|
-
from concurrent.futures import ThreadPoolExecutor
|
|
139
|
-
|
|
140
|
-
def fetch_repo(name):
|
|
141
|
-
data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
|
|
142
|
-
return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
|
|
143
|
-
|
|
144
|
-
repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
|
|
145
|
-
with ThreadPoolExecutor(max_workers=3) as ex:
|
|
146
|
-
results = list(ex.map(fetch_repo, repos))
|
|
147
|
-
# Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
|
|
148
|
-
```
|
|
149
|
-
|
-## Gotchas
-
-- **Rate limits are per IP, unauthenticated** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check the `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, the core limit rises to 5,000 req/hour and the search limit to 30 req/min.
-
-- **Token header format** — Use `Authorization: Bearer <token>` (not `token <token>`), plus `X-GitHub-Api-Version: 2022-11-28`:
-```python
-import json, os
-token = os.environ.get('GITHUB_TOKEN', '')
-headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
-data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
-```
-
-- **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
-```python
-try:
-    data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
-except Exception as e:
-    print("Not found or rate limited:", e)
-```
-
-- **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
-
-- **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
-
-- **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
-
-- **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
-
-- **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"` — strip the trailing words if you want just the number.
-
-- **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because GitHub uses React with server-side hydration and component IDs change. The REST API returns all the same data cleanly.
-
-- **raw.githubusercontent.com needs no auth and does not count against the API rate limit** — Use it for any public file. It serves the raw bytes, no JSON wrapping or base64.
-
-- **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
+# GitHub — Scraping & Data Extraction
+
+`https://github.com` — public data, mix of REST API (fast, rate-limited) and browser (trending page only).
+
+## Do this first
+
+**Use the REST API for repo/user/release data — it's one call, no browser, fully parsed JSON.**
+
+```python
+import json
+data = json.loads(http_get("https://api.github.com/repos/{owner}/{repo}"))
+# Key fields: stargazers_count, forks_count, description, language, topics,
+#             open_issues_count, created_at, updated_at, pushed_at,
+#             watchers_count, subscribers_count, network_count,
+#             default_branch, license, homepage, visibility
+```
+
+Use `raw.githubusercontent.com` for file contents — no API rate-limit cost, no auth, no base64 decode:
+
+```python
+readme = http_get("https://raw.githubusercontent.com/owner/repo/main/README.md")
+content = http_get("https://raw.githubusercontent.com/owner/repo/main/pyproject.toml")
+```
+
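As a rough sketch of the raw-file shortcut above, a small helper can assemble the URL from its parts. The name `raw_url` and its parameters are illustrative assumptions, not part of the harness. The branch is passed in explicitly because it is safer to read it from the repo's `default_branch` field than to assume `main`.

```python
from urllib.parse import quote

def raw_url(owner, repo, branch, path):
    """Build a raw.githubusercontent.com URL for a file in a public repo."""
    # quote() leaves "/" untouched by default, so nested paths survive;
    # characters such as spaces are percent-encoded.
    return "https://raw.githubusercontent.com/{}/{}/{}/{}".format(
        owner, repo, quote(branch), quote(path.lstrip("/"))
    )

print(raw_url("owner", "repo", "main", "README.md"))
```

The result can be handed straight to `http_get(...)`, which returns the file text with no JSON wrapper.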
+Use the browser **only** for the trending page — it's server-side rendered HTML, no API equivalent.
+
+## Common workflows
+
+### Repo metadata (API)
+
+```python
+import json
+data = json.loads(http_get("https://api.github.com/repos/browser-use/browser-use"))
+print(data['stargazers_count'], data['forks_count'], data['description'])
+# returns: 88349 10136 '🌐 Make websites accessible for AI agents.'
+```
+
+### User / org profile (API)
+
+```python
+import json
+user = json.loads(http_get("https://api.github.com/users/browser-use"))
+print(user['type'], user['followers'], user['public_repos'], user['blog'])
+# returns: 'Organization' 3046 39 'https://browser-use.com'
+```
+
+### Trending page (browser required)
+
+The trending page is JS-rendered. `article.Box-row` selector confirmed working (15 results for today/all-languages, 12 for filtered). All fields work in a single JS call — **must navigate and wait in the same script run**, as each run is a separate exec context.
+
+```python
+import json
+goto_url("https://github.com/trending")   # or /trending/python?since=weekly
+wait_for_load()
+wait(2)  # extra 2s — React hydration completes after readyState
+
+result = js("""
+(function(){
+  var rows = Array.from(document.querySelectorAll('article.Box-row'));
+  return JSON.stringify(rows.map(function(el){
+    var h2link = el.querySelector('h2 a');
+    var starLink = el.querySelector('a[href*="/stargazers"]');
+    var forkLink = el.querySelector('a[href*="/forks"]');
+    var langEl = el.querySelector('[itemprop="programmingLanguage"]');
+    var todayEl = el.querySelector('.d-inline-block.float-sm-right');
+    var descEl = el.querySelector('p');
+    return {
+      name: h2link ? h2link.innerText.trim().replace(/\\s+/g,' ') : null,
+      url: h2link ? 'https://github.com' + h2link.getAttribute('href') : null,
+      stars_total: starLink ? starLink.innerText.trim() : null,
+      stars_period: todayEl ? todayEl.innerText.trim() : null,
+      forks: forkLink ? forkLink.innerText.trim() : null,
+      language: langEl ? langEl.innerText.trim() : null,
+      desc: descEl ? descEl.innerText.trim() : null
+    };
+  }));
+})()
+""")
+repos = json.loads(result)
+# stars_period text is e.g. "737 stars today" or "47,053 stars this week"
+```
+
+Supported URL params:
+- `/trending` — all languages, today
+- `/trending/python` — filtered to Python
+- `/trending?since=weekly` or `?since=monthly`
+- `/trending/python?since=weekly` — combined
+
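The URL variants listed above can be generated rather than hand-typed. The following is a minimal sketch; `trending_url` is a hypothetical helper name, and it only accepts the `since` values this document has actually exercised (`weekly`, `monthly`, or none for today).

```python
from urllib.parse import quote, urlencode

def trending_url(language=None, since=None):
    """Build a github.com/trending URL from an optional language and period."""
    if since not in (None, "weekly", "monthly"):
        raise ValueError("since must be None, 'weekly' or 'monthly'")
    url = "https://github.com/trending"
    if language:
        url += "/" + quote(language)          # e.g. /trending/python
    if since:
        url += "?" + urlencode({"since": since})
    return url

print(trending_url())
print(trending_url("python", "weekly"))
```

Pass the result to `goto_url(...)` in the same script run as the wait calls and the selector query.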
+### Search repositories (API)
+
+```python
+import json
+results = json.loads(http_get(
+    "https://api.github.com/search/repositories?q=browser+automation+language:python&sort=stars&per_page=10"
+))
+print(results['total_count'])   # e.g. 3250
+for r in results['items']:
+    print(r['full_name'], r['stargazers_count'])
+```
+
+Search API rate limit is **10 req/min** unauthenticated (separate from the 60/hour core limit). Runs out fast if called in a loop.
+
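Because the search and core pools are counted separately, it can help to check the relevant pool before looping. The sketch below reads a `/rate_limit` response, which groups each pool under `resources`. The helper name `pool_status` and the sample numbers are made up for illustration so the example runs offline.

```python
import json

def pool_status(payload, pool, now):
    """Return (remaining, seconds_until_reset) for one rate-limit pool."""
    info = payload["resources"][pool]
    # "reset" is a Unix timestamp in seconds; clamp so a past reset gives 0.
    wait = max(0, info["reset"] - int(now))
    return info["remaining"], wait

# Hypothetical payload shaped like the /rate_limit response.
sample = json.loads("""
{"resources": {
  "core":   {"limit": 60, "remaining": 12, "reset": 1700000600},
  "search": {"limit": 10, "remaining": 0,  "reset": 1700000030}
}}
""")

print(pool_status(sample, "core", now=1700000000))    # (12, 600)
print(pool_status(sample, "search", now=1700000000))  # (0, 30)
```

With live data, the payload would come from `json.loads(http_get("https://api.github.com/rate_limit"))` and `now` from `time.time()`.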
+### Commits, releases, issues (API)
+
+```python
+import json
+# Commits
+commits = json.loads(http_get("https://api.github.com/repos/owner/repo/commits?per_page=10"))
+# Fields: sha, commit.message, commit.author.date, author.login
+
+# Releases
+releases = json.loads(http_get("https://api.github.com/repos/owner/repo/releases?per_page=5"))
+# Fields: tag_name, name, published_at, body, assets
+
+# Issues
+issues = json.loads(http_get("https://api.github.com/repos/owner/repo/issues?state=open&per_page=10"))
+# Fields: number, title, labels, state, created_at, user.login
+
+# Contributors
+contribs = json.loads(http_get("https://api.github.com/repos/owner/repo/contributors?per_page=10"))
+# Fields: login, contributions
+```
+
+### File contents via API (base64)
+
+```python
+import json, base64
+resp = json.loads(http_get("https://api.github.com/repos/owner/repo/contents/path/to/file.py"))
+content = base64.b64decode(resp['content']).decode()
+# resp also has: size, sha, html_url
+# Prefer raw.githubusercontent.com for large files — no base64, no rate limit hit
+```
+
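A slightly more defensive version of the decode step might look like the sketch below. It checks the `encoding` field before decoding and runs against a made-up response dict, so nothing here touches the network. `decode_contents` is a hypothetical helper name.

```python
import base64

def decode_contents(resp):
    """Decode the `content` field of a contents-API response to text."""
    if resp.get("encoding") != "base64":
        raise ValueError("unexpected encoding: {!r}".format(resp.get("encoding")))
    # The payload is wrapped across lines; b64decode skips the newlines.
    return base64.b64decode(resp["content"]).decode("utf-8")

# Made-up response shaped like the API's, with a line break inside the payload.
sample = {"encoding": "base64", "content": "cHJpbnQoImhl\nbGxvIikK", "size": 15}
print(decode_contents(sample))
```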
+### Parallel fetching (multiple repos)
+
+```python
+import json
+from concurrent.futures import ThreadPoolExecutor
+
+def fetch_repo(name):
+    data = json.loads(http_get(f"https://api.github.com/repos/{name}"))
+    return {"name": name, "stars": data['stargazers_count'], "lang": data['language']}
+
+repos = ["owner/repo1", "owner/repo2", "owner/repo3"]
+with ThreadPoolExecutor(max_workers=3) as ex:
+    results = list(ex.map(fetch_repo, repos))
+# Confirmed working; watch rate limit — 60 unauthenticated calls/hour total
+```
+
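One possible refinement of the parallel pattern above: with plain `ex.map`, a single failing repo raises when its result is consumed and the remaining results are lost. The sketch below wraps each call so failures are collected instead. `fetch_all` and the stand-in `fake_fetch` are illustrative names; the stand-in keeps the example runnable offline and would be replaced by a real API call.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_all(names, fetch, max_workers=3):
    """Run fetch(name) for each name in parallel, collecting errors per item."""
    def safe(name):
        try:
            return {"name": name, "ok": True, "data": fetch(name)}
        except Exception as exc:  # e.g. an HTTP error for a missing repo
            return {"name": name, "ok": False, "error": str(exc)}
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        return list(ex.map(safe, names))  # map() preserves input order

# Stand-in fetcher so the sketch runs without network access.
def fake_fetch(name):
    if name.endswith("missing"):
        raise RuntimeError("HTTP 404")
    return {"stars": len(name)}

results = fetch_all(["owner/repo1", "owner/missing"], fake_fetch)
print([r["ok"] for r in results])  # [True, False]
```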
+## Gotchas
+
+- **Rate limits are per IP, unauthenticated** — Core API: 60 req/hour. Search API: 10 req/min. These are separate pools. Check the `/rate_limit` endpoint: `http_get("https://api.github.com/rate_limit")`. With a `GITHUB_TOKEN`, the core limit rises to 5,000 req/hour and the search limit to 30 req/min.
+
+- **Token header format** — Use `Authorization: Bearer <token>` (not `token <token>`), plus `X-GitHub-Api-Version: 2022-11-28`:
+```python
+import json, os
+token = os.environ.get('GITHUB_TOKEN', '')
+headers = {"Authorization": f"Bearer {token}", "X-GitHub-Api-Version": "2022-11-28"} if token else {}
+data = json.loads(http_get("https://api.github.com/repos/owner/repo", headers=headers))
+```
+
+- **404 raises HTTPError, not a JSON error** — Wrap API calls for missing repos:
+```python
+try:
+    data = json.loads(http_get("https://api.github.com/repos/owner/repo"))
+except Exception as e:
+    print("Not found or rate limited:", e)
+```
+
+- **Code search requires auth** — `GET /search/code` returns HTTP 401 without a token. Repo/user/issues search works unauthenticated.
+
+- **Trending page selectors only work if navigation is in the same script run** — Each `uv run browser-harness` exec is fresh. Selectors that returned 0 results were run in a separate invocation after the page had navigated away. Always include `goto_url()` + `wait_for_load()` + `wait(2)` in the same script.
+
+- **wait(2) after wait_for_load() on trending** — `document.readyState == 'complete'` fires before React finishes painting repo cards. Without the extra 2s sleep, `article.Box-row` count was 0 even though the DOM technically loaded.
+
+- **Trending stars field is a string with commas** — `stars_total` comes back as `"4,548"` not `4548`. Parse with `int(r['stars_total'].replace(',', ''))` if you need to sort.
+
+- **stars_period text includes the period** — Value is `"737 stars today"` or `"47,053 stars this week"` — strip the trailing words if you want just the number.
+
+- **Repo page DOM is React-heavy, API is better** — Extracting star counts from the repo HTML page (`github.com/owner/repo`) is unreliable because GitHub uses React with server-side hydration and component IDs change. The REST API returns all the same data cleanly.
+
+- **raw.githubusercontent.com needs no auth and does not count against the API rate limit** — Use it for any public file. It serves the raw bytes, no JSON wrapping or base64.
+
+- **Trending page article count varies** — Today filter returned 15 articles, weekly Python filter returned 12. Don't assume 25 results; iterate `document.querySelectorAll('article.Box-row')` and take what's there.
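The two string-parsing gotchas above (comma-separated totals and the `stars_period` text) can be handled with a pair of small helpers. This is a sketch under the assumption that the text always follows the `"<number> stars <period>"` shape quoted above; the function names are illustrative.

```python
import re

def parse_count(text):
    """Turn '4,548' into 4548."""
    return int(text.replace(",", "").strip())

def parse_stars_period(text):
    """Split '47,053 stars this week' into (47053, 'this week')."""
    m = re.match(r"\s*([\d,]+)\s+stars?\s+(.*\S)\s*$", text)
    if not m:
        raise ValueError("unrecognised stars_period text: {!r}".format(text))
    return parse_count(m.group(1)), m.group(2)

print(parse_count("4,548"))                          # 4548
print(parse_stars_period("737 stars today"))         # (737, 'today')
print(parse_stars_period("47,053 stars this week"))  # (47053, 'this week')
```

Sorting the scraped rows then becomes `sorted(repos, key=lambda r: parse_count(r['stars_total']), reverse=True)`, provided no row has a `None` total.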