hatch3r 1.7.5 → 1.8.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +2 -2
- package/agents/hatch3r-context-rules.md +22 -6
- package/agents/hatch3r-creator.md +2 -1
- package/agents/hatch3r-handoff-loader.md +1 -1
- package/agents/hatch3r-implementer.md +8 -0
- package/agents/hatch3r-learnings-loader.md +1 -1
- package/agents/hatch3r-reviewer.md +2 -0
- package/agents/shared/user-content-templates.md +31 -1
- package/commands/hatch3r-agent-customize.md +4 -0
- package/commands/hatch3r-api-spec.md +7 -0
- package/commands/hatch3r-benchmark.md +7 -0
- package/commands/hatch3r-board-fill.md +7 -0
- package/commands/hatch3r-board-groom.md +4 -0
- package/commands/hatch3r-board-init.md +51 -0
- package/commands/hatch3r-board-pickup.md +8 -0
- package/commands/hatch3r-board-refresh.md +4 -0
- package/commands/hatch3r-board-shared.md +6 -6
- package/commands/hatch3r-bug-plan.md +7 -0
- package/commands/hatch3r-codebase-map.md +8 -0
- package/commands/hatch3r-command-customize.md +4 -0
- package/commands/hatch3r-context-health.md +5 -0
- package/commands/hatch3r-create.md +57 -4
- package/commands/hatch3r-debug.md +7 -0
- package/commands/hatch3r-dep-audit.md +4 -0
- package/commands/hatch3r-feature-plan.md +7 -0
- package/commands/hatch3r-handoff.md +7 -0
- package/commands/hatch3r-healthcheck.md +4 -0
- package/commands/hatch3r-hooks.md +4 -0
- package/commands/hatch3r-learn.md +16 -0
- package/commands/hatch3r-migration-plan.md +7 -0
- package/commands/hatch3r-onboard.md +7 -0
- package/commands/hatch3r-pr-resolve.md +8 -1
- package/commands/hatch3r-project-spec.md +8 -0
- package/commands/hatch3r-quick-change.md +7 -0
- package/commands/hatch3r-recipe.md +4 -0
- package/commands/hatch3r-refactor-plan.md +7 -0
- package/commands/hatch3r-release.md +5 -0
- package/commands/hatch3r-revision.md +7 -0
- package/commands/hatch3r-roadmap.md +8 -0
- package/commands/hatch3r-rule-customize.md +4 -0
- package/commands/hatch3r-security-audit.md +4 -0
- package/commands/hatch3r-skill-customize.md +4 -0
- package/commands/hatch3r-test-plan.md +7 -0
- package/commands/hatch3r-workflow.md +9 -1
- package/dist/cli/index.js +2600 -777
- package/dist/cli/index.js.map +1 -1
- package/package.json +8 -5
- package/rules/hatch3r-agent-orchestration-detail.md +3 -0
- package/rules/hatch3r-agent-orchestration-detail.mdc +3 -0
- package/rules/hatch3r-agent-orchestration.md +25 -2
- package/rules/hatch3r-agent-orchestration.mdc +25 -2
- package/rules/hatch3r-iteration-summary.md +2 -0
- package/rules/hatch3r-iteration-summary.mdc +2 -0
- package/rules/hatch3r-observability-tracing-detail.md +7 -148
- package/rules/hatch3r-observability-tracing-detail.mdc +6 -148
- package/rules/hatch3r-observability-tracing.md +154 -6
- package/rules/hatch3r-observability-tracing.mdc +154 -6
- package/skills/hatch3r-agent-customize/SKILL.md +10 -0
- package/skills/hatch3r-ai-feature/SKILL.md +2 -0
- package/skills/hatch3r-api-spec/SKILL.md +68 -0
- package/skills/hatch3r-cli-csvkit/SKILL.md +2 -2
- package/skills/hatch3r-cli-duckdb/SKILL.md +3 -3
- package/skills/hatch3r-cli-jq/SKILL.md +4 -0
- package/skills/hatch3r-cli-miller/SKILL.md +2 -2
- package/skills/hatch3r-cli-overview/SKILL.md +1 -1
- package/skills/{hatch3r-cli-xsv → hatch3r-cli-qsv}/SKILL.md +20 -18
- package/skills/hatch3r-cli-stagehand/SKILL.md +48 -16
- package/skills/hatch3r-command-customize/SKILL.md +10 -0
- package/skills/hatch3r-customize/SKILL.md +3 -0
- package/skills/hatch3r-design-system-detect/SKILL.md +2 -0
- package/skills/hatch3r-observability-verify/SKILL.md +4 -3
- package/skills/hatch3r-reliability-verify/SKILL.md +2 -0
- package/skills/hatch3r-rule-customize/SKILL.md +10 -0
- package/skills/hatch3r-skill-customize/SKILL.md +10 -0
- package/skills/hatch3r-ui-ux-verify/SKILL.md +2 -0
@@ -19,48 +19,71 @@ Browserbase Stagehand — AI-driven browser automation
 
 ## When to Use
 
-Reach for `stagehand` when the task is in the **browser** category and the agent would otherwise call an MCP tool or read large outputs into context.
+Reach for `stagehand` when the task is in the **browser** category and the agent would otherwise call an MCP tool or read large outputs into context. v3 (released 2025-10-29) operates directly on the Chrome DevTools Protocol — choose Stagehand when the target page changes shape often enough that hand-written selectors break, or when a prompt is the most compact spec of intent.
 
 ## Token Cost
 
 CLI tools return structured stdout that fits in <1KB for typical queries; equivalent MCP calls regularly exceed 10KB.
 Reference: Anthropic engineering (Nov 4 2025) — code-execution-over-MCP yields 98.7% token reduction.
 
+## v3 Driver Model
+
+v3 dropped the hard Playwright dependency and exposes a modular driver layer. Pick the driver that matches the host environment:
+
+- **CDP-native (default):** Stagehand talks Chrome DevTools Protocol directly — no test-runner dependency, smallest install, Bun-compatible.
+- **Playwright peer:** install `playwright-core` alongside Stagehand to reuse existing Playwright fixtures, traces, or `@playwright/test` reporters.
+- **Puppeteer peer:** install `puppeteer-core` to share a launcher with existing Puppeteer scripts.
+- **Patchright peer:** install `patchright-core` for stealth-patched CDP profiles.
+
+`playwright-core`, `puppeteer-core`, and `patchright-core` are peer dependencies in v3 — install only the driver you use.
+
 ## Recipes
 
 ```bash
-npx
+npx create-browser-app
 ```
-Scaffold a Stagehand project with
+Scaffold a v3 Stagehand project with TypeScript wiring, a `stagehand.config.ts`, and an example `act`/`extract`/`observe` script. Replaces the v2 `npx stagehand init` workflow.
 
 ```bash
-
+node scripts/login.ts
 ```
-Execute an AI-driven action script
+Execute an AI-driven action script. The script imports `Stagehand` from `@browserbasehq/stagehand`, calls `stagehand.act("click the login button")`, and Stagehand resolves the action at runtime via CDP — no test runner required.
 
 ```bash
-npx
+npx browse get markdown https://example.com
 ```
-
+One-shot page extraction via `browse-cli` (v0.6+). Returns structured Markdown the agent can consume directly; cheaper than spawning a full Stagehand session for a single read.
 
 ```bash
-npx
+npx browse cdp wss://browser.example.com
+```
+Attach to an existing CDP endpoint (Browserbase managed session, local Chrome, or a custom launcher). Useful when the script delegates browser lifecycle to another supervisor.
+
+```typescript
+// scripts/observe.ts — observe primitive returns actions without executing
+import { Stagehand } from "@browserbasehq/stagehand";
+const stagehand = new Stagehand({ env: "LOCAL" });
+await stagehand.init();
+const actions = await stagehand.observe("find the login form");
+console.log(JSON.stringify(actions, null, 2));
+await stagehand.close();
 ```
-
+Dry-run agent loop: `observe` returns the candidate action set without performing it, so a caller can route the decision (execute, ask the user, or reject).
 
 ## Wrong Choice When
 
-- **
-- **
-- **
+- **High-volume scraping at scale:** Stagehand's per-action LLM round-trip is cost-prohibitive past a few hundred pages — use the Browserbase managed-browser product, raw CDP with cached locators (v3's `deepLocator`), or Stagehand's action cache once a workflow is recorded as a deterministic script.
+- **Headless CI in air-gapped environments:** Stagehand requires outbound LLM API access for selector resolution; offline environments fail the `act`/`extract`/`observe` calls. Pre-record actions with v3's automatic action cache, then replay the cached deterministic script in the air-gapped runner.
+- **Workflows already covered by a stable test suite:** if Playwright tests with hand-tuned locators already pass green, Stagehand adds an LLM round-trip per step with no behavioural gain. Use `hatch3r-cli-playwright` (tier 2) for the test surface; reserve Stagehand for the agent-driven exploratory flows.
 
 ## Alternatives
 
 | Tool | When to prefer |
 |------|----------------|
-| `hatch3r-cli-playwright` (tier 2) |
-| Browserbase managed browsers | Production scale, session recording, anti-bot evasion |
-|
+| `hatch3r-cli-playwright` (tier 2) | Existing test fixtures, deterministic CI, no LLM round-trips needed |
+| Browserbase managed browsers | Production scale, session recording, anti-bot evasion, CAPTCHA solving |
+| Stagehand action cache (built into v3) | Same workflow re-run many times — record once, replay deterministically |
+| Skyvern / Browser-Use | Workflow-style automation with embedded LLM agents and built-in task loops |
 
 ## Detection / Install
 
@@ -72,8 +95,17 @@ command -v stagehand
 Install (mac):
 
 ```bash
-# npm
+# npm — v3 (Oct 29 2025); drivers are peer deps, install only what you use
 npm install -g @browserbasehq/stagehand
+# Add a driver only if you need Playwright/Puppeteer/Patchright interop:
+# npm install -g playwright-core # OR
+# npm install -g puppeteer-core # OR
+# npm install -g patchright-core
 ```
 
+References:
+- v3 release announcement (2025-10-29): https://www.browserbase.com/blog/stagehand-v3
+- Latest npm releases: https://github.com/browserbase/stagehand/releases
+- v3 docs: https://docs.stagehand.dev/v3/get_started/introduction
+
 Homepage: https://github.com/browserbase/stagehand
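Reviewer sketch for the dry-run recipe added in the Stagehand skill: the new text says a caller can route each `observe` result to execute, ask the user, or reject. One way that routing step might look is shown below. The `CandidateAction` shape, the `routeAction` helper, and the keyword policy are all hypothetical and written for illustration; the objects Stagehand actually returns may carry different fields, and the sketch does not import Stagehand at all.

```typescript
// Hypothetical, simplified shape of one observed action (an assumption,
// not the library's type definition).
interface CandidateAction {
  description: string;
  method: string; // e.g. "click" or "fill"
  selector: string;
}

type Decision = "execute" | "ask" | "reject";

// Illustrative policy: interaction methods treated as low risk.
const LOW_RISK_METHODS = new Set<string>(["click", "hover", "scroll"]);
// Illustrative policy: descriptions containing these words are never run.
const BLOCKED_KEYWORDS = ["delete", "purchase", "checkout"];

// Decide what to do with one candidate action without performing it.
function routeAction(action: CandidateAction): Decision {
  const text = action.description.toLowerCase();
  if (BLOCKED_KEYWORDS.some((word) => text.includes(word))) return "reject";
  if (LOW_RISK_METHODS.has(action.method)) return "execute";
  return "ask"; // anything else is escalated to the user
}

// Made-up candidates standing in for the output of an observe call.
const candidates: CandidateAction[] = [
  { description: "Click the login button", method: "click", selector: "#login" },
  { description: "Fill the password field", method: "fill", selector: "#password" },
  { description: "Delete the account", method: "click", selector: "#delete" },
];

for (const action of candidates) {
  console.log(`${routeAction(action)}: ${action.description}`);
}
```

Keeping the decision in a pure function means the policy can be unit-tested without a browser or an LLM call; only actions routed to `execute` would then be handed back to Stagehand.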
@@ -5,9 +5,19 @@ tags: [customize]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
+redirect_to: hatch3r-customize
 ---
 # Command Customization
 
 > **This skill has been consolidated.** Use the `hatch3r-customize` skill with `type: command`.
 
 For command-specific reference (YAML schema, examples), see the `hatch3r-command-customize` command.
+
+## Rejected Merge Alternative (D16.3 add-vs-remove bias)
+
+Per `governance/audit/domains/D16-compound-system.md` SA 16.3, the default recommendation on functional overlap is MERGE rather than removal. Full deletion of this redirect file was rejected for two reasons:
+
+1. **Preserves UX entry points.** Users typed `/h4tcher-command-customize` or referenced the id `hatch3r-command-customize` (per `commands/hatch3r-command-customize.md:2` and sibling redirects) before consolidation. Deleting the id breaks those entry points without a redirect target.
+2. **Signals umbrella canonicality.** The `redirect_to: hatch3r-customize` frontmatter field marks `hatch3r-customize` as the single source of truth — tooling, audit scans, and adapters can resolve any redirect to the canonical without re-reading body prose.
+
+The 13-LOC redirect cost is paid once per type; the umbrella body lives in `skills/hatch3r-customize/SKILL.md`.
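Reviewer sketch for the `redirect_to` field added above: the new text claims tooling can resolve a redirect to the canonical skill from frontmatter alone. A minimal illustration of that lookup follows, assuming frontmatter is delimited by `---` lines and that `redirect_to` holds a single skill id. The helper names (`readRedirect`, `resolveCanonical`), the `id:` field, and the in-memory fixture are hypothetical; this is not code from the hatch3r CLI.

```typescript
// Return the `redirect_to` value from a skill file's frontmatter, if present.
// Assumes the file starts with a block delimited by `---` lines.
function readRedirect(skillFile: string): string | undefined {
  const frontmatter = /^---\n([\s\S]*?)\n---/.exec(skillFile);
  if (!frontmatter) return undefined;
  for (const line of frontmatter[1].split("\n")) {
    const field = /^redirect_to:\s*(\S+)\s*$/.exec(line);
    if (field) return field[1];
  }
  return undefined;
}

// Follow redirects until a skill without `redirect_to` is reached.
// Unknown ids resolve to themselves; cycles raise an error.
function resolveCanonical(id: string, skills: Map<string, string>): string {
  const seen = new Set<string>();
  let current = id;
  while (!seen.has(current)) {
    seen.add(current);
    const file = skills.get(current);
    const target = file === undefined ? undefined : readRedirect(file);
    if (target === undefined) return current;
    current = target;
  }
  throw new Error(`redirect cycle detected at ${current}`);
}

// Made-up fixture mirroring the two skills touched by this diff.
const skillFiles = new Map<string, string>([
  [
    "hatch3r-command-customize",
    "---\nid: hatch3r-command-customize\nredirect_to: hatch3r-customize\n---\n# Command Customization\n",
  ],
  ["hatch3r-customize", "---\nid: hatch3r-customize\n---\n# Artifact Customization Management\n"],
]);

console.log(resolveCanonical("hatch3r-command-customize", skillFiles));
```

The resolver reads only the frontmatter block, so body prose such as the "Rejected Merge Alternative" section never needs to be parsed.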
@@ -5,9 +5,12 @@ tags: [customize]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
+canonical_for: [hatch3r-agent-customize, hatch3r-command-customize, hatch3r-rule-customize, hatch3r-skill-customize]
 ---
 # Artifact Customization Management
 
+> **Canonical entry point.** Four type-specific skills (`hatch3r-agent-customize`, `hatch3r-command-customize`, `hatch3r-rule-customize`, `hatch3r-skill-customize`) redirect here via `redirect_to: hatch3r-customize` frontmatter. Their body documents the rejected-merge alternative per `governance/audit/domains/D16-compound-system.md` SA 16.3.
+
 ## Quick Start
 
 ```
@@ -4,6 +4,8 @@ type: skill
 description: Detect existing design tokens, component library, and theming convention in a project before authoring new UI primitives — output a concise inventory for downstream implementers
 tags: [ui, design-system, frontend]
 quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+cache_friendly: true
 ---
 # Design System Detection Workflow
 
@@ -4,6 +4,8 @@ type: skill
 description: Verification gate before declaring an agent-produced service done — OTel span coverage on request path, structured-log + trace-id correlation, SLO definition, error-tracking integration, GenAI semconv on AI features
 tags: [review, performance, devops]
 quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+cache_friendly: true
 ---
 # Observability Verification Gate
 
@@ -79,7 +81,7 @@ Never under-fan-out to save tokens. Token cost is dominated by quality and compl
 Applies only when the feature calls an LLM or runs an agent:
 
 - GenAI semconv span on every LLM call carrying `gen_ai.system`, `gen_ai.request.model`, `gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`, `gen_ai.response.finish_reasons`. Cache-hit flag emitted as a span attribute when the provider returns one.
-- Tools invoked by the agent emit `tool.{name}.execute` spans per `rules/hatch3r-observability-tracing
+- Tools invoked by the agent emit `tool.{name}.execute` spans per `rules/hatch3r-observability-tracing.md` § "AI Agent Instrumentation". Each tool span carries `tool.name`, `tool.input_hash`, `tool.output_status`, `tool.duration_ms`.
 - Cost telemetry per request: a metric counter `gen_ai.tokens_total{direction, model, agent_name}` and a histogram `gen_ai.request_duration_ms`.
 - GenAI spans sampled at 50-100% in production — higher than general spans because volume is low and per-call cost is high.
 
@@ -119,8 +121,7 @@ The orchestrator running this skill emits a single-line verdict per gate (`GATE_
 - `rules/hatch3r-observability.md`
 - `rules/hatch3r-observability-logging.md`
 - `rules/hatch3r-observability-metrics.md`
-- `rules/hatch3r-observability-tracing.md`
-- `rules/hatch3r-observability-tracing-detail.md`
+- `rules/hatch3r-observability-tracing.md` (includes AI agent instrumentation; was previously split as `-detail`)
 
 ## References
 
@@ -4,6 +4,8 @@ type: skill
 description: Reliability verification gate before declaring an agent-produced service done — SLO defined, kill switch, timeouts, retries, probes, runbook, staged rollout
 tags: [review, devops]
 quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+cache_friendly: true
 ---
 # Reliability Verification Gate
 
@@ -5,9 +5,19 @@ tags: [customize]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
+redirect_to: hatch3r-customize
 ---
 # Rule Customization
 
 > **This skill has been consolidated.** Use the `hatch3r-customize` skill with `type: rule`.
 
 For rule-specific reference (scope overrides, YAML schema), see the `hatch3r-rule-customize` command.
+
+## Rejected Merge Alternative (D16.3 add-vs-remove bias)
+
+Per `governance/audit/domains/D16-compound-system.md` SA 16.3, the default recommendation on functional overlap is MERGE rather than removal. Full deletion of this redirect file was rejected for two reasons:
+
+1. **Preserves UX entry points.** Users typed `/h4tcher-rule-customize` or referenced the id `hatch3r-rule-customize` (per `rules/hatch3r-browser-verification.md:57` and sibling cross-references) before consolidation. Deleting the id breaks those entry points without a redirect target.
+2. **Signals umbrella canonicality.** The `redirect_to: hatch3r-customize` frontmatter field marks `hatch3r-customize` as the single source of truth — tooling, audit scans, and adapters can resolve any redirect to the canonical without re-reading body prose.
+
+The 13-LOC redirect cost is paid once per type; the umbrella body lives in `skills/hatch3r-customize/SKILL.md`.
@@ -5,9 +5,19 @@ tags: [customize]
 quality_charter: agents/shared/quality-charter.md
 efficiency_patterns: agents/shared/efficiency-patterns.md
 cache_friendly: true
+redirect_to: hatch3r-customize
 ---
 # Skill Customization
 
 > **This skill has been consolidated.** Use the `hatch3r-customize` skill with `type: skill`.
 
 For skill-specific reference (YAML schema, examples), see the `hatch3r-skill-customize` command.
+
+## Rejected Merge Alternative (D16.3 add-vs-remove bias)
+
+Per `governance/audit/domains/D16-compound-system.md` SA 16.3, the default recommendation on functional overlap is MERGE rather than removal. Full deletion of this redirect file was rejected for two reasons:
+
+1. **Preserves UX entry points.** Users typed `/h4tcher-skill-customize` or referenced the id `hatch3r-skill-customize` (per `rules/hatch3r-browser-verification.md:58` and sibling cross-references) before consolidation. Deleting the id breaks those entry points without a redirect target.
+2. **Signals umbrella canonicality.** The `redirect_to: hatch3r-customize` frontmatter field marks `hatch3r-customize` as the single source of truth — tooling, audit scans, and adapters can resolve any redirect to the canonical without re-reading body prose.
+
+The 13-LOC redirect cost is paid once per type; the umbrella body lives in `skills/hatch3r-customize/SKILL.md`.
@@ -4,6 +4,8 @@ type: skill
 description: UI/UX verification gate before declaring a feature done — axe-core, scripted keyboard trace, accessibility-tree snapshot, four-state coverage, visual-regression baseline, one human screen-reader pass per release
 tags: [ui, ux, a11y]
 quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+cache_friendly: true
 ---
 # UI/UX Verification Gate
 