opengstack 0.13.10 → 0.14.2
- package/AGENTS.md +4 -4
- package/CLAUDE.md +127 -110
- package/README.md +10 -5
- package/SKILL.md +500 -70
- package/bin/opengstack.js +69 -69
- package/{skills/land-and-deploy/SKILL.md → commands/autoplan.md} +7 -25
- package/{skills/benchmark/SKILL.md → commands/benchmark.md} +84 -108
- package/{skills/browse/SKILL.md → commands/browse.md} +60 -81
- package/{skills/ship/SKILL.md → commands/canary.md} +7 -27
- package/{skills/careful/SKILL.md → commands/careful.md} +2 -22
- package/{skills/canary/SKILL.md → commands/codex.md} +7 -26
- package/{skills/connect-chrome/SKILL.md → commands/connect-chrome.md} +7 -24
- package/commands/cso.md +70 -0
- package/commands/design-consultation.md +70 -0
- package/commands/design-review.md +70 -0
- package/commands/design-shotgun.md +70 -0
- package/commands/document-release.md +70 -0
- package/{skills/freeze/SKILL.md → commands/freeze.md} +3 -29
- package/{skills/guard/SKILL.md → commands/guard.md} +4 -35
- package/commands/investigate.md +70 -0
- package/commands/land-and-deploy.md +70 -0
- package/commands/office-hours.md +70 -0
- package/{skills/gstack-upgrade/SKILL.md → commands/opengstack-upgrade.md} +64 -79
- package/commands/plan-ceo-review.md +70 -0
- package/commands/plan-design-review.md +70 -0
- package/commands/plan-eng-review.md +70 -0
- package/commands/qa-only.md +70 -0
- package/commands/qa.md +70 -0
- package/commands/retro.md +70 -0
- package/commands/review.md +70 -0
- package/{skills/setup-browser-cookies/SKILL.md → commands/setup-browser-cookies.md} +22 -40
- package/commands/setup-deploy.md +70 -0
- package/commands/ship.md +70 -0
- package/commands/unfreeze.md +25 -0
- package/docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md +9 -9
- package/docs/designs/CONDUCTOR_CHROME_SIDEBAR_INTEGRATION.md +2 -2
- package/docs/designs/CONDUCTOR_SESSION_API.md +16 -16
- package/docs/designs/DESIGN_SHOTGUN.md +74 -74
- package/docs/designs/DESIGN_TOOLS_V1.md +111 -111
- package/docs/skills.md +483 -202
- package/package.json +42 -43
- package/scripts/analytics.ts +188 -0
- package/scripts/dev-skill.ts +83 -0
- package/scripts/discover-skills.ts +39 -0
- package/scripts/eval-compare.ts +97 -0
- package/scripts/eval-list.ts +117 -0
- package/scripts/eval-select.ts +86 -0
- package/scripts/eval-summary.ts +188 -0
- package/scripts/eval-watch.ts +172 -0
- package/scripts/gen-skill-docs.ts +473 -0
- package/scripts/resolvers/browse.ts +129 -0
- package/scripts/resolvers/codex-helpers.ts +133 -0
- package/scripts/resolvers/composition.ts +48 -0
- package/scripts/resolvers/confidence.ts +37 -0
- package/scripts/resolvers/constants.ts +50 -0
- package/scripts/resolvers/design.ts +950 -0
- package/scripts/resolvers/index.ts +59 -0
- package/scripts/resolvers/learnings.ts +96 -0
- package/scripts/resolvers/preamble.ts +505 -0
- package/scripts/resolvers/review.ts +884 -0
- package/scripts/resolvers/testing.ts +573 -0
- package/scripts/resolvers/types.ts +45 -0
- package/scripts/resolvers/utility.ts +421 -0
- package/scripts/skill-check.ts +190 -0
- package/scripts/cleanup.py +0 -100
- package/scripts/filter-skills.sh +0 -114
- package/scripts/filter_skills.py +0 -164
- package/scripts/install-skills.js +0 -60
- package/skills/autoplan/SKILL.md +0 -96
- package/skills/autoplan/SKILL.md.tmpl +0 -694
- package/skills/benchmark/SKILL.md.tmpl +0 -222
- package/skills/browse/SKILL.md.tmpl +0 -131
- package/skills/browse/bin/find-browse +0 -21
- package/skills/browse/bin/remote-slug +0 -14
- package/skills/browse/scripts/build-node-server.sh +0 -48
- package/skills/browse/src/activity.ts +0 -208
- package/skills/browse/src/browser-manager.ts +0 -959
- package/skills/browse/src/buffers.ts +0 -137
- package/skills/browse/src/bun-polyfill.cjs +0 -109
- package/skills/browse/src/cli.ts +0 -678
- package/skills/browse/src/commands.ts +0 -128
- package/skills/browse/src/config.ts +0 -150
- package/skills/browse/src/cookie-import-browser.ts +0 -625
- package/skills/browse/src/cookie-picker-routes.ts +0 -230
- package/skills/browse/src/cookie-picker-ui.ts +0 -688
- package/skills/browse/src/find-browse.ts +0 -61
- package/skills/browse/src/meta-commands.ts +0 -550
- package/skills/browse/src/platform.ts +0 -17
- package/skills/browse/src/read-commands.ts +0 -358
- package/skills/browse/src/server.ts +0 -1192
- package/skills/browse/src/sidebar-agent.ts +0 -280
- package/skills/browse/src/sidebar-utils.ts +0 -21
- package/skills/browse/src/snapshot.ts +0 -407
- package/skills/browse/src/url-validation.ts +0 -95
- package/skills/browse/src/write-commands.ts +0 -364
- package/skills/browse/test/activity.test.ts +0 -120
- package/skills/browse/test/adversarial-security.test.ts +0 -32
- package/skills/browse/test/browser-manager-unit.test.ts +0 -17
- package/skills/browse/test/bun-polyfill.test.ts +0 -72
- package/skills/browse/test/commands.test.ts +0 -2075
- package/skills/browse/test/compare-board.test.ts +0 -342
- package/skills/browse/test/config.test.ts +0 -316
- package/skills/browse/test/cookie-import-browser.test.ts +0 -519
- package/skills/browse/test/cookie-picker-routes.test.ts +0 -260
- package/skills/browse/test/file-drop.test.ts +0 -271
- package/skills/browse/test/find-browse.test.ts +0 -50
- package/skills/browse/test/findport.test.ts +0 -191
- package/skills/browse/test/fixtures/basic.html +0 -33
- package/skills/browse/test/fixtures/cursor-interactive.html +0 -22
- package/skills/browse/test/fixtures/dialog.html +0 -15
- package/skills/browse/test/fixtures/empty.html +0 -2
- package/skills/browse/test/fixtures/forms.html +0 -55
- package/skills/browse/test/fixtures/iframe.html +0 -30
- package/skills/browse/test/fixtures/network-idle.html +0 -30
- package/skills/browse/test/fixtures/qa-eval-checkout.html +0 -108
- package/skills/browse/test/fixtures/qa-eval-spa.html +0 -98
- package/skills/browse/test/fixtures/qa-eval.html +0 -51
- package/skills/browse/test/fixtures/responsive.html +0 -49
- package/skills/browse/test/fixtures/snapshot.html +0 -55
- package/skills/browse/test/fixtures/spa.html +0 -24
- package/skills/browse/test/fixtures/states.html +0 -17
- package/skills/browse/test/fixtures/upload.html +0 -25
- package/skills/browse/test/gstack-config.test.ts +0 -138
- package/skills/browse/test/gstack-update-check.test.ts +0 -514
- package/skills/browse/test/handoff.test.ts +0 -235
- package/skills/browse/test/path-validation.test.ts +0 -91
- package/skills/browse/test/platform.test.ts +0 -37
- package/skills/browse/test/server-auth.test.ts +0 -65
- package/skills/browse/test/sidebar-agent-roundtrip.test.ts +0 -226
- package/skills/browse/test/sidebar-agent.test.ts +0 -199
- package/skills/browse/test/sidebar-integration.test.ts +0 -320
- package/skills/browse/test/sidebar-unit.test.ts +0 -96
- package/skills/browse/test/snapshot.test.ts +0 -467
- package/skills/browse/test/state-ttl.test.ts +0 -35
- package/skills/browse/test/test-server.ts +0 -57
- package/skills/browse/test/url-validation.test.ts +0 -72
- package/skills/browse/test/watch.test.ts +0 -129
- package/skills/canary/SKILL.md.tmpl +0 -212
- package/skills/careful/SKILL.md.tmpl +0 -56
- package/skills/careful/bin/check-careful.sh +0 -112
- package/skills/codex/SKILL.md +0 -90
- package/skills/codex/SKILL.md.tmpl +0 -417
- package/skills/connect-chrome/SKILL.md.tmpl +0 -195
- package/skills/cso/ACKNOWLEDGEMENTS.md +0 -14
- package/skills/cso/SKILL.md +0 -93
- package/skills/cso/SKILL.md.tmpl +0 -606
- package/skills/design-consultation/SKILL.md +0 -94
- package/skills/design-consultation/SKILL.md.tmpl +0 -415
- package/skills/design-review/SKILL.md +0 -94
- package/skills/design-review/SKILL.md.tmpl +0 -290
- package/skills/design-shotgun/SKILL.md +0 -91
- package/skills/design-shotgun/SKILL.md.tmpl +0 -285
- package/skills/document-release/SKILL.md +0 -91
- package/skills/document-release/SKILL.md.tmpl +0 -359
- package/skills/freeze/SKILL.md.tmpl +0 -77
- package/skills/freeze/bin/check-freeze.sh +0 -79
- package/skills/gstack-upgrade/SKILL.md.tmpl +0 -222
- package/skills/guard/SKILL.md.tmpl +0 -77
- package/skills/investigate/SKILL.md +0 -105
- package/skills/investigate/SKILL.md.tmpl +0 -194
- package/skills/land-and-deploy/SKILL.md.tmpl +0 -881
- package/skills/office-hours/SKILL.md +0 -96
- package/skills/office-hours/SKILL.md.tmpl +0 -645
- package/skills/plan-ceo-review/SKILL.md +0 -94
- package/skills/plan-ceo-review/SKILL.md.tmpl +0 -811
- package/skills/plan-design-review/SKILL.md +0 -92
- package/skills/plan-design-review/SKILL.md.tmpl +0 -446
- package/skills/plan-eng-review/SKILL.md +0 -93
- package/skills/plan-eng-review/SKILL.md.tmpl +0 -303
- package/skills/qa/SKILL.md +0 -95
- package/skills/qa/SKILL.md.tmpl +0 -316
- package/skills/qa/references/issue-taxonomy.md +0 -85
- package/skills/qa/templates/qa-report-template.md +0 -126
- package/skills/qa-only/SKILL.md +0 -89
- package/skills/qa-only/SKILL.md.tmpl +0 -101
- package/skills/retro/SKILL.md +0 -89
- package/skills/retro/SKILL.md.tmpl +0 -820
- package/skills/review/SKILL.md +0 -92
- package/skills/review/SKILL.md.tmpl +0 -281
- package/skills/review/TODOS-format.md +0 -62
- package/skills/review/checklist.md +0 -220
- package/skills/review/design-checklist.md +0 -132
- package/skills/review/greptile-triage.md +0 -220
- package/skills/setup-browser-cookies/SKILL.md.tmpl +0 -81
- package/skills/setup-deploy/SKILL.md +0 -92
- package/skills/setup-deploy/SKILL.md.tmpl +0 -215
- package/skills/ship/SKILL.md.tmpl +0 -636
- package/skills/unfreeze/SKILL.md +0 -37
- package/skills/unfreeze/SKILL.md.tmpl +0 -36
package/docs/skills.md CHANGED
@@ -1,6 +1,6 @@
 # Skill Deep Dives
 
-Detailed guides for every
+Detailed guides for every opengstack skill — philosophy, workflow, and examples.
 
 | Skill | Your specialist | What they do |
 |-------|----------------|--------------|
@@ -12,14 +12,21 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples.
 | [`/review`](#review) | **Staff Engineer** | Find the bugs that pass CI but blow up in production. Auto-fixes the obvious ones. Flags completeness gaps. |
 | [`/investigate`](#investigate) | **Debugger** | Systematic root-cause debugging. Iron Law: no fixes without investigation. Traces data flow, tests hypotheses, stops after 3 failed fixes. |
 | [`/design-review`](#design-review) | **Designer Who Codes** | Live-site visual audit + fix loop. 80-item audit, then fixes what it finds. Atomic commits, before/after screenshots. |
+| [`/design-shotgun`](#design-shotgun) | **Design Explorer** | Generate multiple AI design variants, open a comparison board in your browser, and iterate until you approve a direction. Taste memory biases toward your preferences. |
+| [`/design-html`](#design-html) | **Design Engineer** | Takes an approved mockup from `/design-shotgun` and generates production-quality Pretext-native HTML. Text reflows on resize, heights adjust to content. Smart API routing per design type. Framework detection for React/Svelte/Vue. |
 | [`/qa`](#qa) | **QA Lead** | Test your app, find bugs, fix them with atomic commits, re-verify. Auto-generates regression tests for every fix. |
 | [`/qa-only`](#qa) | **QA Reporter** | Same methodology as /qa but report only. Use when you want a pure bug report without code changes. |
 | [`/ship`](#ship) | **Release Engineer** | Sync main, run tests, audit coverage, push, open PR. Bootstraps test frameworks if you don't have one. One command. |
+| [`/land-and-deploy`](#land-and-deploy) | **Release Engineer** | Merge the PR, wait for CI and deploy, verify production health. One command from "approved" to "verified in production." |
+| [`/canary`](#canary) | **SRE** | Post-deploy monitoring loop. Watches for console errors, performance regressions, and page failures using the browse daemon. |
+| [`/benchmark`](#benchmark) | **Performance Engineer** | Baseline page load times, Core Web Vitals, and resource sizes. Compare before/after on every PR. Track trends over time. |
 | [`/cso`](#cso) | **Chief Security Officer** | OWASP Top 10 + STRIDE threat modeling security audit. Scans for injection, auth, crypto, and access control issues. |
 | [`/document-release`](#document-release) | **Technical Writer** | Update all project docs to match what you just shipped. Catches stale READMEs automatically. |
 | [`/retro`](#retro) | **Eng Manager** | Team-aware weekly retro. Per-person breakdowns, shipping streaks, test health trends, growth opportunities. |
 | [`/browse`](#browse) | **QA Engineer** | Give the agent eyes. Real Chromium browser, real clicks, real screenshots. ~100ms per command. |
 | [`/setup-browser-cookies`](#setup-browser-cookies) | **Session Manager** | Import cookies from your real browser (Chrome, Arc, Brave, Edge) into the headless session. Test authenticated pages. |
+| [`/autoplan`](#autoplan) | **Review Pipeline** | One command, fully reviewed plan. Runs CEO → design → eng review automatically with encoded decision principles. Surfaces only taste decisions for your approval. |
+| [`/learn`](#learn) | **Memory** | Manage what opengstack learned across sessions. Review, search, prune, and export project-specific patterns and preferences. |
 | | | |
 | **Multi-AI** | | |
 | [`/codex`](#codex) | **Second Opinion** | Independent review from OpenAI Codex CLI. Three modes: code review (pass/fail gate), adversarial challenge, and open consultation with session continuity. Cross-model analysis when both `/review` and `/codex` have run. |
@@ -29,7 +36,9 @@ Detailed guides for every gstack skill — philosophy, workflow, and examples.
 | [`/freeze`](#safety--guardrails) | **Edit Lock** | Restrict all file edits to a single directory. Blocks Edit and Write outside the boundary. Accident prevention for debugging. |
 | [`/guard`](#safety--guardrails) | **Full Safety** | Combines /careful + /freeze in one command. Maximum safety for prod work. |
 | [`/unfreeze`](#safety--guardrails) | **Unlock** | Remove the /freeze boundary, allowing edits everywhere again. |
-| [`/
+| [`/connect-chrome`](#connect-chrome) | **Chrome Controller** | Launch your real Chrome controlled by opengstack with the Side Panel extension. Watch every action live. |
+| [`/setup-deploy`](#setup-deploy) | **Deploy Configurator** | One-time setup for `/land-and-deploy`. Detects your platform, production URL, and deploy commands. |
+| [`/opengstack-upgrade`](#opengstack-upgrade) | **Self-Updater** | Upgrade opengstack to the latest version. Detects global vs vendored install, syncs both, shows what changed. |
 
 ---
 
@@ -84,7 +93,7 @@ Recommends A because you learn from real usage. CRM data comes naturally in week
 
 ### The design doc
 
-Both modes end with a design doc written to `~/.
+Both modes end with a design doc written to `~/.opengstack/projects/` — and that doc feeds directly into `/plan-ceo-review` and `/plan-eng-review`. The full lifecycle is now: `office-hours → plan → implement → review → QA → ship → retro`.
 
 After the design doc is approved, `/office-hours` reflects on what it noticed about how you think — not generic praise, but specific callbacks to things you said during the session. The observations appear in the design doc too, so you re-encounter them when you re-read later.
 
@@ -138,7 +147,7 @@ It asks, **"what is the 10-star product hiding inside this request?"**
 - **HOLD SCOPE** — maximum rigor on the existing plan. No expansions surfaced.
 - **SCOPE REDUCTION** — find the minimum viable version. Cut everything else.
 
-Visions and decisions are persisted to `~/.
+Visions and decisions are persisted to `~/.opengstack/projects/` so they survive beyond the conversation. Exceptional visions can be promoted to `docs/designs/` in your repo for the team.
 
 ---
 
@@ -203,23 +212,23 @@ Every review (CEO, Eng, Design) logs its result. At the end of each review, you
 
 ```
 +====================================================================+
-|
+| REVIEW READINESS DASHBOARD |
 +====================================================================+
-| Review
+| Review | Runs | Last Run | Status | Required |
 |-----------------|------|---------------------|-----------|----------|
-| Eng Review
-| CEO Review
-| Design Review
+| Eng Review | 1 | 2026-03-16 15:00 | CLEAR | YES |
+| CEO Review | 1 | 2026-03-16 14:30 | CLEAR | no |
+| Design Review | 0 | — | — | no |
 +--------------------------------------------------------------------+
-| VERDICT: CLEARED — Eng Review passed
+| VERDICT: CLEARED — Eng Review passed |
 +====================================================================+
 ```
 
-Eng Review is the only required gate (disable with `
+Eng Review is the only required gate (disable with `opengstack-config set skip_eng_review true`). CEO and Design are informational — recommended for product and UI changes respectively.
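
The gating rule described above can be sketched in TypeScript. This is an illustration under stated assumptions, not opengstack's implementation: the `ReviewRow` type, the `verdict` function, the `ISSUES` status, and the `skipEngReview` parameter (standing in for the `skip_eng_review` config value) are hypothetical names. The text does not say what the verdict reads when the gate has not passed, so the `NOT CLEARED` label is also an assumption.

```typescript
// Hypothetical model of one dashboard row; field names are assumptions.
type ReviewRow = {
  name: "Eng Review" | "CEO Review" | "Design Review";
  runs: number;
  status: "CLEAR" | "ISSUES" | null; // null = never run
};

// Eng Review is the only required gate; CEO and Design are informational,
// so their rows never affect the verdict in this sketch.
function verdict(rows: ReviewRow[], skipEngReview = false): string {
  if (skipEngReview) return "CLEARED (Eng Review gate disabled)";
  const eng = rows.find((r) => r.name === "Eng Review");
  const passed = eng !== undefined && eng.runs > 0 && eng.status === "CLEAR";
  return passed ? "CLEARED" : "NOT CLEARED";
}

// Rows mirror the dashboard shown above.
const rows: ReviewRow[] = [
  { name: "Eng Review", runs: 1, status: "CLEAR" },
  { name: "CEO Review", runs: 1, status: "CLEAR" },
  { name: "Design Review", runs: 0, status: null },
];
console.log(verdict(rows));
```

Keeping the informational reviews out of the verdict function mirrors the "Required" column: only a row marked YES can block.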
 
 ### Plan-to-QA flow
 
-When `/plan-eng-review` finishes the test review section, it writes a test plan artifact to `~/.
+When `/plan-eng-review` finishes the test review section, it writes a test plan artifact to `~/.opengstack/projects/`. When you later run `/qa`, it picks up that test plan automatically — your engineering review feeds directly into QA testing with no manual copy-paste.
 
 ---
 
@@ -238,34 +247,34 @@ Seven passes over the plan: information architecture, interaction state coverage
 ### Example
 
 ```
-You:
+You: /plan-design-review
 
 Claude: Initial Design Rating: 4/10
 
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+"This plan describes a user dashboard but never specifies
+what the user sees first. It says 'cards with icons' —
+which looks like every SaaS template. It mentions zero
+loading states, zero empty states, and no mobile behavior."
+
+Pass 1 (Info Architecture): 3/10
+"A 10 would define primary/secondary/tertiary content
+hierarchy for every screen."
+→ Added information hierarchy section to plan
+
+Pass 2 (Interaction States): 2/10
+"The plan has 4 UI features but specifies 0 out of 20
+interaction states (4 features × 5 states each)."
+→ Added interaction state table to plan
+
+Pass 4 (AI Slop): 4/10
+"The plan says 'clean, modern UI with cards and icons'
+and 'hero section with gradient'. These are the top 2
+AI-generated-looking patterns."
+→ Rewrote UI descriptions with specific, intentional alternatives
+
+Overall: 4/10 → 8/10 after fixes
+"Plan is design-complete. Run /design-review after
+implementation for visual QA."
 ```
 
 When you re-run it, sections already at 8+ get a quick pass. Sections below 8 get full treatment. For live-site visual audits post-implementation, use `/design-review`.
@@ -293,66 +302,66 @@ Then it writes `DESIGN.md` to your repo root — your project's design source of
 ### Example
 
 ```
-You:
+You: /design-consultation
 
 Claude: From the README, this looks like a developer analytics dashboard
-
-
-
+for engineering teams. Sound right? Tell me about the feeling
+you want — and would you like me to look at what's out there
+first so we can see the landscape?
 
-You:
+You: Yes, it's for eng managers. Show me what's out there.
 
 Claude: [browses Linear, Vercel, Datadog, Posthog — takes screenshots]
 
-
-
-
+Here's the landscape. They all converge on clean sans-serifs
+and muted blue-gray palettes. Linear stands out with dark-first
+and tight spacing. The rest feel interchangeable.
 
-
+Here's my proposal — and where I'd play it safe vs. take risks:
 
-
-
-
-
-
-
+AESTHETIC: Industrial/Utilitarian — data-first, every pixel earns its place
+TYPOGRAPHY: Geist (body + UI), Instrument Serif (display), JetBrains Mono (data)
+COLOR: Restrained — slate neutrals, single teal accent (#0D9488)
+SPACING: 4px base, compact density
+LAYOUT: Grid-disciplined, 12-column
+MOTION: Minimal-functional — only transitions that aid comprehension
 
-
-
-
-
+SAFE CHOICES (match category expectations):
+Geist for body — your users already know this font from Vercel
+Slate neutrals — warm grays feel right for data-heavy UIs
+Grid-disciplined layout — dashboards need predictable alignment
 
-
-
-
-
-
-
-
-
-
+RISKS (where you'd stand out):
+Instrument Serif for display headings — nobody in dev tools
+uses a serif. It creates instant hierarchy contrast and
+signals "we care about craft" in a space full of sans-serifs.
+Single teal accent instead of blue — everyone else in this space uses blue.
+Teal is close enough to feel trustworthy, different enough
+to be recognizable.
+4px compact density — tighter than Linear, says "we respect
+your screen real estate."
 
-
-
+The safe choices keep you literate. The risks are where your
+product gets its own face. Which risks appeal to you?
 
-You:
+You: Love the serif risk. Let's see it.
 
 Claude: Preview opened in your browser.
-
+[/tmp/design-consultation-preview-1710000000.html]
 
-
-
-
-
-
+The page shows a realistic dashboard: sidebar nav, header with
+avatar, stat cards, data table — all in your design system.
+Instrument Serif at hero scale, Geist in body text, JetBrains
+Mono in the data table. Teal accent on buttons and active states.
+Light and dark mode toggle in the corner.
 
-
+How does it feel? Want to adjust anything before I write DESIGN.md?
 
-You:
+You: Ship it.
 
 Claude: Wrote DESIGN.md (typography, color, spacing, layout, motion).
-
-
+Updated CLAUDE.md with design system guidelines.
+All future sessions will respect these choices.
 ```
 
 ---
@@ -370,32 +379,134 @@ The self-regulation heuristic is tuned for design work — CSS-only changes get
 ### Example
 
 ```
-You:
+You: /design-review https://myapp.com
 
 Claude: [Runs full 80-item visual audit on the live site]
-
-
+Design Score: C | AI Slop Score: D
+12 findings (4 high, 5 medium, 3 polish)
+
+Fixing 9 design issues...
+
+style(design): FINDING-001 — replace 3-column icon grid with asymmetric layout
+style(design): FINDING-002 — add heading scale 48/32/24/18/16
+style(design): FINDING-003 — remove gradient hero, use bold typography
+style(design): FINDING-004 — add second font for headings
+style(design): FINDING-005 — vary border-radius by element role
+style(design): FINDING-006 — left-align body text, reserve center for headings
+style(design): FINDING-007 — add hover/focus states to all interactive elements
+style(design): FINDING-008 — add prefers-reduced-motion media query
+style(design): FINDING-009 — set max content width to 680px for body text
+
+Final audit:
+Design Score: C → B+ | AI Slop Score: D → A
+9 fixes applied (8 verified, 1 best-effort). 3 deferred.
+
+[Report with before/after screenshots saved to .OpenGStack/design-reports/]
+```
+
+Nine commits, each touching one concern. The AI Slop score went from D to A because the three most recognizable patterns (gradient hero, 3-column grid, uniform radius) are gone.
+
+---
+
+## `/design-shotgun`
+
+This is my **design exploration mode**.
+
+You know the feeling. You have a feature, a page, a landing screen... and you're not sure what it should look like. You could describe it to Claude and get one answer. But one answer means one perspective, and design is a taste game. You need to see options.
+
+`/design-shotgun` generates 3 visual design variants using the GPT Image API, opens a comparison board in your browser, and waits for your feedback. You pick a direction, request changes, or ask for entirely new variants. The board supports remix, regenerate, and approval actions.
+
+### The loop
+
+1. You describe what you want (or point at an existing page)
+2. The skill reads your `DESIGN.md` for brand constraints (if it exists)
+3. It generates 3 distinct design variants as PNGs
+4. A comparison board opens in your browser with all 3 side-by-side
+5. You click "Approve" on the one you like, or give feedback for another round
+6. The approved variant saves to `~/.opengstack/projects/$SLUG/designs/` with an `approved.json`
+
+That `approved.json` is what `/design-html` reads. The design pipeline chains: shotgun picks the direction, design-html renders it as working code.
+
+### Taste memory
 
-
+The skill remembers your preferences across sessions. If you consistently prefer minimal designs over busy ones, it biases future generations. This isn't a setting you configure... it emerges from your approvals.
 
-
-
-
-
-
-
-
-
-
+### Example
+
+```
+You: /design-shotgun — hero section for a developer tools landing page
+
+Claude: [Generates 3 variants]
+Variant A: Bold typography, dark background, code snippet hero
+Variant B: Split layout, product screenshot left, copy right
+Variant C: Minimal, centered headline, gradient accent
+
+[Opens comparison board at localhost:PORT]
 
-
-Design Score: C → B+ | AI Slop Score: D → A
-9 fixes applied (8 verified, 1 best-effort). 3 deferred.
+You: [Clicks "Approve" on Variant A in the browser]
 
-
+Claude: Approved Variant A. Saved to ~/.opengstack/projects/myapp/designs/
+Next: run /design-html to generate production HTML from this mockup.
 ```
 
-
+---
+
+## `/design-html`
+
+This is my **design-to-code mode**.
+
+Every AI code generation tool produces static CSS. Hardcoded heights. Text that overflows on resize. Breakpoints that snap instead of flowing. The output looks right at exactly one viewport size and breaks at every other.
+
+`/design-html` fixes this. It takes the approved mockup from `/design-shotgun` and generates HTML using [Pretext](https://github.com/chenglou/pretext) by Cheng Lou (ex-React core, Midjourney frontend). Pretext is a 15KB library that computes text layout without DOM measurement. Text reflows. Heights adjust to content. Cards size themselves. Chat bubbles shrinkwrap. All sub-millisecond, all dynamic.
+
+### Smart API routing
+
+Not every page needs the full Pretext engine. The skill reads the design and picks the right tools:
+
+- **Simple layouts** (landing, marketing): `prepare()` + `layout()` for resize-aware heights
+- **Card grids** (dashboard, listing): `prepare()` + `layout()` for self-sizing cards
+- **Chat UIs**: `walkLineRanges()` for tight-fit bubbles with zero wasted pixels
+- **Editorial layouts**: `layoutNextLine()` for text flowing around obstacles
+- **Complex editorial**: Full engine with `layoutWithLines()` for manual line rendering
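
The routing list above amounts to a lookup from design type to Pretext entry points. A minimal TypeScript sketch follows, assuming the hypothetical names `DesignType`, `PRETEXT_ROUTES`, and `routeFor`; how the skill actually classifies a design is not described here. The sketch only returns the function names quoted in the list and does not call Pretext, whose signatures are not shown in this document.

```typescript
// Design categories taken from the list above; the type name is hypothetical.
type DesignType =
  | "simple"
  | "card-grid"
  | "chat"
  | "editorial"
  | "complex-editorial";

// Maps each design type to the Pretext function names the text pairs it with.
// These are names only; no Pretext code is invoked.
const PRETEXT_ROUTES: Record<DesignType, string[]> = {
  simple: ["prepare", "layout"],
  "card-grid": ["prepare", "layout"],
  chat: ["walkLineRanges"],
  editorial: ["layoutNextLine"],
  "complex-editorial": ["layoutWithLines"],
};

function routeFor(design: DesignType): string[] {
  return PRETEXT_ROUTES[design];
}

console.log(routeFor("chat"));
```

Using a `Record` keyed by the union type means the compiler flags any design type that lacks a route.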
+
+### The refinement loop
+
+1. Reads the approved mockup from `approved.json`
+2. Uses GPT-4o vision to extract implementation spec (colors, typography, layout)
+3. Generates self-contained HTML with Pretext inlined (15KB, zero network dependency)
+4. Spins up a live-reload server so you see changes instantly
+5. Screenshots at 3 viewports (mobile, tablet, desktop) to verify layout
+6. AskUserQuestion: what needs to change?
+7. Surgical edits via the Edit tool (not full regeneration)
+8. Repeat until you say "done"
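
Steps 5 through 8 of the loop above can be sketched as a small control loop. This is an illustration only: the `Hooks` type, the `refine` function, and the `maxRounds` safety cap are hypothetical, and each hook stands in for work the skill performs with its own tools. The viewport widths are the ones quoted in the example transcript for this skill (375px, 768px, 1440px).

```typescript
// Viewport widths from the example transcript: mobile, tablet, desktop.
const VIEWPORTS = [375, 768, 1440];

// Hypothetical hooks; each stands in for one step of the loop.
type Hooks = {
  screenshot: (width: number) => void; // step 5: capture one viewport
  ask: () => string; // step 6: returns feedback, or "done"
  edit: (feedback: string) => void; // step 7: targeted edit, not regeneration
};

// Repeats steps 5-8 until the user answers "done".
// Returns the number of edit rounds that were applied.
function refine(hooks: Hooks, maxRounds = 20): number {
  let rounds = 0;
  while (rounds < maxRounds) {
    VIEWPORTS.forEach((w) => hooks.screenshot(w));
    const feedback = hooks.ask();
    if (feedback.trim().toLowerCase() === "done") break;
    hooks.edit(feedback);
    rounds++;
  }
  return rounds;
}
```

Passing the steps in as hooks keeps the loop testable without a browser or a model call; the cap on rounds is an added safeguard, not something the text specifies.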
+
+### Framework detection
+
+If your project uses React, Svelte, or Vue (detected from `package.json`), the skill offers to generate a framework component instead of vanilla HTML. Framework output uses `npm install @chenglou/pretext` instead of inline vendoring.
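
A detection step like the one described can be sketched as a lookup over `package.json` dependencies. The names `Framework`, `PackageJson`, and `detectFramework` are hypothetical. Checking `devDependencies` as well as `dependencies`, and preferring React, then Svelte, then Vue when several are present, are assumptions of this sketch rather than documented behavior.

```typescript
type Framework = "react" | "svelte" | "vue";

// Minimal shape of the package.json fields this sketch inspects.
type PackageJson = {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

// Returns the first of React, Svelte, or Vue listed as a dependency,
// or null when none is found (the vanilla HTML path).
function detectFramework(pkg: PackageJson): Framework | null {
  const deps = { ...pkg.dependencies, ...pkg.devDependencies };
  const candidates: Framework[] = ["react", "svelte", "vue"];
  return candidates.find((name) => name in deps) ?? null;
}

// Illustrative input; the version string is made up for the example.
const pkg: PackageJson = JSON.parse('{"devDependencies":{"svelte":"^4.0.0"}}');
console.log(detectFramework(pkg));
```

A `null` result corresponds to the default output described in the text: self-contained HTML with Pretext inlined.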
+
+### Example
+
+```
+You: /design-html
+
+Claude: Found approved mockup: variant-A.png (hero section)
+Design type: simple layout → using prepare() + layout()
+
+[Generates HTML with Pretext, opens live-reload server]
+[Screenshots at 375px, 768px, 1440px — all verified]
+
+The HTML is live at localhost:3456. Resize your browser.
+Text reflows dynamically. What needs to change?
+
+You: Make the headline bigger and add more spacing below the CTA
+
+Claude: [Edits font-size and margin, browser auto-reloads]
+Updated. Two changes applied.
+
+You: Done
+
+Claude: Saved to ~/.opengstack/projects/myapp/designs/hero-20260330/finalized.html
+```
 
 ---
 
@@ -481,18 +592,18 @@ When `/qa` fixes a bug and verifies it, it automatically generates a regression
|
|
|
481
592
|
### Example
|
|
482
593
|
|
|
483
594
|
```
|
|
484
|
-
You:
|
|
595
|
+
You: /qa https://staging.myapp.com
|
|
485
596
|
|
|
486
597
|
Claude: [Explores 12 pages, fills 3 forms, tests 2 flows]
|
|
487
598
|
|
|
488
|
-
|
|
599
|
+
QA Report: staging.myapp.com — Health Score: 72/100
|
|
489
600
|
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
|
|
493
|
-
|
|
601
|
+
Top 3 Issues:
|
|
602
|
+
1. CRITICAL: Checkout form submits with empty required fields
|
|
603
|
+
2. HIGH: Mobile nav menu doesn't close after selecting an item
|
|
604
|
+
3. MEDIUM: Dashboard chart overlaps sidebar below 1024px
|
|
494
605
|
|
|
495
|
-
|
|
606
|
+
[Full report with screenshots saved to .OpenGStack/qa-reports/]
|
|
496
607
|
```
|
|
497
608
|
|
|
498
609
|
**Testing authenticated pages:** Use `/setup-browser-cookies` first to import your real browser sessions, then `/qa` can test pages behind login.
|
|
@@ -525,6 +636,82 @@ A lot of branches die when the interesting work is done and only the boring rele
|
|
|
525
636
|
|
|
526
637
|
---
|
|
527
638
|
|
|
639
|
+
## `/land-and-deploy`
|
|
640
|
+
|
|
641
|
+
This is my **deploy pipeline mode**.
|
|
642
|
+
|
|
643
|
+
`/ship` creates the PR. `/land-and-deploy` finishes the job: merge, deploy, verify.
|
|
644
|
+
|
|
645
|
+
It merges the PR, waits for CI, waits for the deploy to finish, then runs canary checks against production. One command from "approved" to "verified in production." If the deploy breaks, it tells you what failed and whether to roll back.
|
|
646
|
+
|
|
647
|
+
First run on a new project triggers a dry-run walk-through so you can verify the pipeline before it does anything irreversible. After that, it trusts the config and runs straight through.
|
|
648
|
+
|
|
649
|
+
### Setup
|
|
650
|
+
|
|
651
|
+
Run `/setup-deploy` first. It detects your platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, or custom), discovers your production URL and health check endpoints, and writes the config to CLAUDE.md. One-time, 60 seconds.
|
|
652
|
+
|
|
653
|
+
### Example
|
|
654
|
+
|
|
655
|
+
```
|
|
656
|
+
You: /land-and-deploy
|
|
657
|
+
|
|
658
|
+
Claude: Merging PR #42...
|
|
659
|
+
CI: 3/3 checks passed
|
|
660
|
+
Deploy: Fly.io — deploying v2.1.0...
|
|
661
|
+
Health check: https://myapp.fly.dev/health → 200 OK
|
|
662
|
+
Canary: 5 pages checked, 0 console errors, p95 < 800ms
|
|
663
|
+
|
|
664
|
+
Production verified. v2.1.0 is live.
|
|
665
|
+
```
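
The merge, CI, deploy, verify sequence shown above is a fail-fast pipeline: each stage runs only if the previous one succeeded. A minimal sketch of that control flow follows, assuming each stage is a plain callable that returns `True` on success. The stage names and `run_pipeline` are illustrative, not the skill's API.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order; stop at the first failure and report what ran."""
    log: List[str] = []
    for name, step in stages:
        ok = step()
        log.append(f"{name}: {'ok' if ok else 'FAILED'}")
        if not ok:
            # Later stages never run, so a failed CI cannot reach deploy.
            return False, log
    return True, log
```

The returned log names the stage that failed, which matches the "tells you what failed" behavior described for this command.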
|
|
666
|
+
|
|
667
|
+
---
|
|
668
|
+
|
|
669
|
+
## `/canary`
|
|
670
|
+
|
|
671
|
+
This is my **post-deploy monitoring mode**.
|
|
672
|
+
|
|
673
|
+
After deploy, `/canary` watches the live site for trouble. It loops through your key pages using the browse daemon, checking for console errors, performance regressions, page failures, and visual anomalies. It also takes periodic screenshots and compares them against pre-deploy baselines.
|
|
674
|
+
|
|
675
|
+
Use it right after `/land-and-deploy`, or schedule it to run periodically after a risky deploy.
|
|
676
|
+
|
|
677
|
+
```
|
|
678
|
+
You: /canary https://myapp.com
|
|
679
|
+
|
|
680
|
+
Claude: Monitoring 8 pages every 2 minutes...
|
|
681
|
+
|
|
682
|
+
Cycle 1: ✓ All pages healthy. p95: 340ms. 0 console errors.
|
|
683
|
+
Cycle 2: ✓ All pages healthy. p95: 380ms. 0 console errors.
|
|
684
|
+
Cycle 3: ⚠ /dashboard — new console error: "TypeError: Cannot read
|
|
685
|
+
property 'map' of undefined" at dashboard.js:142
|
|
686
|
+
Screenshot saved.
|
|
687
|
+
|
|
688
|
+
Alert: 1 new console error after 3 monitoring cycles.
|
|
689
|
+
```
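
The "new console error" alert in the transcript above amounts to a per-page set difference against the pre-deploy baseline. A minimal sketch, assuming errors are compared as exact strings keyed by page path; the function name and data shapes are hypothetical.

```python
from typing import Dict, Set


def new_console_errors(
    baseline: Dict[str, Set[str]], current: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Per page, return errors seen now that were absent from the baseline."""
    fresh: Dict[str, Set[str]] = {}
    for page, errors in current.items():
        added = errors - baseline.get(page, set())
        if added:
            fresh[page] = added
    return fresh
```

Errors that already existed before the deploy are ignored, so only regressions introduced by the deploy trigger an alert.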
|
|
690
|
+
|
|
691
|
+
---
|
|
692
|
+
|
|
693
|
+
## `/benchmark`
|
|
694
|
+
|
|
695
|
+
This is my **performance engineer mode**.
|
|
696
|
+
|
|
697
|
+
`/benchmark` establishes performance baselines for your pages: load time, Core Web Vitals (LCP, CLS, INP), resource counts, and total transfer size. Run it before and after a PR to catch regressions.
|
|
698
|
+
|
|
699
|
+
It uses the browse daemon for real Chromium measurements, not synthetic estimates. Multiple runs averaged. Results persist so you can track trends across PRs.
|
|
700
|
+
|
|
701
|
+
```
|
|
702
|
+
You: /benchmark https://myapp.com
|
|
703
|
+
|
|
704
|
+
Claude: Benchmarking 5 pages (3 runs each)...
|
|
705
|
+
|
|
706
|
+
/ load: 1.2s LCP: 0.9s CLS: 0.01 resources: 24 (890KB)
|
|
707
|
+
/dashboard load: 2.1s LCP: 1.8s CLS: 0.03 resources: 31 (1.4MB)
|
|
708
|
+
/settings load: 0.8s LCP: 0.6s CLS: 0.00 resources: 18 (420KB)
|
|
709
|
+
|
|
710
|
+
Baseline saved. Run again after changes to compare.
|
|
711
|
+
```
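
Averaging several runs and then comparing against a saved baseline can be sketched as below. The metric names, the dict shape, and the 10% tolerance are assumptions made for illustration; the text above does not state what threshold `/benchmark` uses.

```python
from statistics import mean
from typing import Dict, List


def average_runs(runs: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across runs; every run must report the same metrics."""
    return {metric: mean(run[metric] for run in runs) for metric in runs[0]}


def regressions(
    baseline: Dict[str, float], current: Dict[str, float], tolerance: float = 0.10
) -> Dict[str, float]:
    """Return metrics that grew by more than `tolerance` relative to the baseline."""
    flagged: Dict[str, float] = {}
    for metric, before in baseline.items():
        after = current.get(metric)
        if after is None or before == 0:
            continue  # nothing to compare, or a relative change is undefined
        change = (after - before) / before
        if change > tolerance:
            flagged[metric] = change
    return flagged
```

Every metric here is treated as "lower is better", which holds for load time, LCP, CLS, and transfer size.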
|
|
712
|
+
|
|
713
|
+
---
|
|
714
|
+
|
|
528
715
|
## `/cso`
|
|
529
716
|
|
|
530
717
|
This is my **Chief Security Officer**.
|
|
@@ -532,16 +719,16 @@ This is my **Chief Security Officer**.
|
|
|
532
719
|
Run `/cso` on any codebase and it performs an OWASP Top 10 + STRIDE threat model audit. It scans for injection vulnerabilities, broken authentication, sensitive data exposure, XML external entities, broken access control, security misconfiguration, XSS, insecure deserialization, known-vulnerable components, and insufficient logging. Each finding includes severity, evidence, and a recommended fix.
|
|
533
720
|
|
|
534
721
|
```
|
|
535
|
-
You:
|
|
722
|
+
You: /cso
|
|
536
723
|
|
|
537
724
|
Claude: Running OWASP Top 10 + STRIDE security audit...
|
|
538
725
|
|
|
539
|
-
|
|
540
|
-
|
|
541
|
-
|
|
542
|
-
|
|
726
|
+
CRITICAL: SQL injection in user search (app/models/user.rb:47)
|
|
727
|
+
HIGH: Session tokens stored in localStorage (app/frontend/auth.ts:12)
|
|
728
|
+
MEDIUM: Missing rate limiting on /api/login endpoint
|
|
729
|
+
LOW: X-Frame-Options header not set
|
|
543
730
|
|
|
544
|
-
|
|
731
|
+
4 findings across 12 files scanned. 1 critical, 1 high.
|
|
545
732
|
```
|
|
546
733
|
|
|
547
734
|
---
|
|
@@ -553,16 +740,16 @@ This is my **technical writer mode**.
|
|
|
553
740
|
After `/ship` creates the PR but before it merges, `/document-release` reads every documentation file in the project and cross-references it against the diff. It updates file paths, command lists, project structure trees, and anything else that drifted. Risky or subjective changes get surfaced as questions — everything else is handled automatically.
|
|
554
741
|
|
|
555
742
|
```
|
|
556
|
-
You:
|
|
743
|
+
You: /document-release
|
|
557
744
|
|
|
558
745
|
Claude: Analyzing 21 files changed across 3 commits. Found 8 documentation files.
|
|
559
746
|
|
|
560
|
-
|
|
561
|
-
|
|
562
|
-
|
|
563
|
-
|
|
747
|
+
README.md: updated skill count from 9 to 10, added new skill to table
|
|
748
|
+
CLAUDE.md: added new directory to project structure
|
|
749
|
+
CONTRIBUTING.md: current — no changes needed
|
|
750
|
+
TODOS.md: marked 2 items complete, added 1 new item
|
|
564
751
|
|
|
565
|
-
|
|
752
|
+
All docs updated and committed. PR body updated with doc diff.
|
|
566
753
|
```
|
|
567
754
|
|
|
568
755
|
It also polishes CHANGELOG voice (without ever overwriting entries), cleans up completed TODOS, checks cross-doc consistency, and asks about VERSION bumps only when appropriate.
|
|
@@ -582,27 +769,27 @@ It also tracks test health: total test files, tests added this period, regressio
|
|
|
582
769
|
### Example
|
|
583
770
|
|
|
584
771
|
```
|
|
585
|
-
You:
|
|
772
|
+
You: /retro
|
|
586
773
|
|
|
587
774
|
Claude: Week of Mar 1: 47 commits (3 contributors), 3.2k LOC, 38% tests, 12 PRs, peak: 10pm | Streak: 47d
|
|
588
775
|
|
|
589
|
-
|
|
590
|
-
|
|
591
|
-
|
|
592
|
-
|
|
593
|
-
|
|
776
|
+
## Your Week
|
|
777
|
+
32 commits, +2.4k LOC, 41% tests. Peak hours: 9-11pm.
|
|
778
|
+
Biggest ship: cookie import system (browser decryption + picker UI).
|
|
779
|
+
What you did well: shipped a complete feature with encryption, UI, and
|
|
780
|
+
18 unit tests in one focused push...
|
|
594
781
|
|
|
595
|
-
|
|
782
|
+
## Team Breakdown
|
|
596
783
|
|
|
597
|
-
|
|
598
|
-
|
|
599
|
-
|
|
784
|
+
### Alice
|
|
785
|
+
12 commits focused on app/services/. Every PR under 200 LOC — disciplined.
|
|
786
|
+
Opportunity: test ratio at 12% — worth investing before payment gets more complex.
|
|
600
787
|
|
|
601
|
-
|
|
602
|
-
|
|
603
|
-
|
|
788
|
+
### Bob
|
|
789
|
+
3 commits — fixed the N+1 query on dashboard. Small but high-impact.
|
|
790
|
+
Opportunity: only 1 active day this week — check if blocked on anything.
|
|
604
791
|
|
|
605
|
-
|
|
792
|
+
[Top 3 team wins, 3 things to improve, 3 habits for next week]
|
|
606
793
|
```
|
|
607
794
|
|
|
608
795
|
It saves a JSON snapshot to `.context/retros/` so the next run can show trends.
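
Showing trends from saved snapshots comes down to diffing two of them. The sketch below assumes each snapshot is a flat JSON object with numeric fields such as `commits` and `prs`; these field names are hypothetical, since the real snapshot schema is not shown here.

```python
from numbers import Number
from typing import Dict


def snapshot_trend(previous: Dict, current: Dict) -> Dict:
    """Return current minus previous for every numeric field present in both."""
    return {
        key: current[key] - previous[key]
        for key in current
        if key in previous
        and isinstance(current[key], Number)
        and isinstance(previous[key], Number)
    }
```

Non-numeric fields, such as a week label, are skipped rather than compared.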
|
|
@@ -620,30 +807,30 @@ It is a compiled binary that talks to a persistent Chromium daemon — built on
|
|
|
620
807
|
### Example
|
|
621
808
|
|
|
622
809
|
```
|
|
623
|
-
You:
|
|
624
|
-
|
|
810
|
+
You: /browse staging.myapp.com — log in, test the signup flow, and check
|
|
811
|
+
every page I changed in this branch
|
|
625
812
|
|
|
626
813
|
Claude: [18 tool calls, ~60 seconds]
|
|
627
814
|
|
|
628
|
-
|
|
629
|
-
|
|
630
|
-
|
|
631
|
-
|
|
632
|
-
|
|
633
|
-
|
|
634
|
-
|
|
815
|
+
> browse goto https://staging.myapp.com/signup
|
|
816
|
+
> browse snapshot -i
|
|
817
|
+
> browse fill @e2 "$TEST_EMAIL"
|
|
818
|
+
> browse fill @e3 "$TEST_PASSWORD"
|
|
819
|
+
> browse click @e5 (Submit)
|
|
820
|
+
> browse screenshot /tmp/signup.png
|
|
821
|
+
> Read /tmp/signup.png
|
|
635
822
|
|
|
636
|
-
|
|
823
|
+
Signup works. Redirected to onboarding. Now checking changed pages.
|
|
637
824
|
|
|
638
|
-
|
|
639
|
-
|
|
640
|
-
|
|
641
|
-
|
|
825
|
+
> browse goto https://staging.myapp.com/dashboard
|
|
826
|
+
> browse screenshot /tmp/dashboard.png
|
|
827
|
+
> Read /tmp/dashboard.png
|
|
828
|
+
> browse console
|
|
642
829
|
|
|
643
|
-
|
|
830
|
+
Dashboard loads. No console errors. Charts render with sample data.
|
|
644
831
|
|
|
645
|
-
|
|
646
|
-
|
|
832
|
+
All 4 pages load correctly. No console errors. No broken layouts.
|
|
833
|
+
Signup → onboarding → dashboard flow works end to end.
|
|
647
834
|
```
|
|
648
835
|
|
|
649
836
|
18 tool calls, about a minute. Full QA pass. No browser opened.
|
|
@@ -657,19 +844,19 @@ When the headless browser gets stuck — CAPTCHA, MFA, complex auth — hand off
|
|
|
657
844
|
|
|
658
845
|
```
|
|
659
846
|
Claude: I'm stuck on a CAPTCHA at the login page. Opening a visible
|
|
660
|
-
|
|
847
|
+
Chrome so you can solve it.
|
|
661
848
|
|
|
662
|
-
|
|
849
|
+
> browse handoff "Stuck on CAPTCHA at login page"
|
|
663
850
|
|
|
664
|
-
|
|
665
|
-
|
|
666
|
-
|
|
851
|
+
Chrome opened at https://app.example.com/login with all your
|
|
852
|
+
cookies and tabs intact. Solve the CAPTCHA and tell me when
|
|
853
|
+
you're done.
|
|
667
854
|
|
|
668
|
-
You:
|
|
855
|
+
You: done
|
|
669
856
|
|
|
670
857
|
Claude: > browse resume
|
|
671
858
|
|
|
672
|
-
|
|
859
|
+
Got a fresh snapshot. Logged in successfully. Continuing QA.
|
|
673
860
|
```
|
|
674
861
|
|
|
675
862
|
The browser preserves all state (cookies, localStorage, tabs) across the handoff. After `resume`, the agent gets a fresh snapshot of wherever you left off. If the browse tool fails 3 times in a row, it automatically suggests using `handoff`.
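
The "3 times in a row" rule is a consecutive-failure counter that resets on any success. A minimal sketch, assuming a hypothetical `FailureTracker` class; how the browse tool tracks this internally is not documented here.

```python
class FailureTracker:
    """Counts consecutive failures; any success resets the streak."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.streak = 0

    def record(self, succeeded: bool) -> bool:
        """Record one result; return True when a handoff should be suggested."""
        self.streak = 0 if succeeded else self.streak + 1
        return self.streak >= self.threshold
```

Because the streak resets on success, two failures followed by a success and two more failures do not trigger the suggestion.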
|
|
@@ -689,14 +876,14 @@ Before `/qa` or `/browse` can test authenticated pages, they need cookies. Inste
|
|
|
689
876
|
It auto-detects installed Chromium browsers (Comet, Chrome, Arc, Brave, Edge), decrypts cookies via the macOS Keychain, and loads them into the Playwright session. An interactive picker UI lets you choose exactly which domains to import — no cookie values are ever displayed.
|
|
690
877
|
|
|
691
878
|
```
|
|
692
|
-
You:
|
|
879
|
+
You: /setup-browser-cookies
|
|
693
880
|
|
|
694
881
|
Claude: Cookie picker opened — select the domains you want to import
|
|
695
|
-
|
|
882
|
+
in your browser, then tell me when you're done.
|
|
696
883
|
|
|
697
|
-
|
|
884
|
+
[You pick github.com, myapp.com in the browser UI]
|
|
698
885
|
|
|
699
|
-
You:
|
|
886
|
+
You: done
|
|
700
887
|
|
|
701
888
|
Claude: Imported 2 domains (47 cookies). Session is ready.
|
|
702
889
|
```
|
|
@@ -704,13 +891,107 @@ Claude: Imported 2 domains (47 cookies). Session is ready.
|
|
|
704
891
|
Or skip the UI entirely:
|
|
705
892
|
|
|
706
893
|
```
|
|
707
|
-
You:
|
|
894
|
+
You: /setup-browser-cookies github.com
|
|
708
895
|
|
|
709
896
|
Claude: Imported 12 cookies for github.com from Comet.
|
|
710
897
|
```
|
|
711
898
|
|
|
712
899
|
---
|
|
713
900
|
|
|
901
|
+
## `/autoplan`
|
|
902
|
+
|
|
903
|
+
This is my **review autopilot mode**.
|
|
904
|
+
|
|
905
|
+
Running `/plan-ceo-review`, then `/plan-design-review`, then `/plan-eng-review` individually means answering 15-30 intermediate questions. Each question is valuable, but sometimes you want the gauntlet to run without stopping for every decision.
|
|
906
|
+
|
|
907
|
+
`/autoplan` reads all three review skills from disk and runs them sequentially: CEO → Design → Eng. It makes decisions automatically using six encoded principles (prefer completeness, match existing patterns, choose reversible options, prefer the option the user chose for similar past decisions, defer ambiguous items, and escalate security). Taste decisions (close approaches, borderline scope expansions, cross-model disagreements) get saved and presented at a final approval gate.
|
|
908
|
+
|
|
909
|
+
One command, fully reviewed plan out.
|
|
910
|
+
|
|
911
|
+
```
|
|
912
|
+
You: /autoplan
|
|
913
|
+
|
|
914
|
+
Claude: Running CEO review... [4 scope decisions auto-resolved]
|
|
915
|
+
Running design review... [3 design dimensions auto-scored]
|
|
916
|
+
Running eng review... [2 architecture decisions auto-resolved]
|
|
917
|
+
|
|
918
|
+
TASTE DECISIONS (need your input):
|
|
919
|
+
1. Scope: Codex suggested adding search — borderline expansion. Add?
|
|
920
|
+
2. Design: Two approaches scored within 1 point. Which feels right?
|
|
921
|
+
|
|
922
|
+
[Shows both options with context]
|
|
923
|
+
|
|
924
|
+
You: 1) Yes, add search. 2) Option A.
|
|
925
|
+
|
|
926
|
+
Claude: Plan complete. 9 decisions auto-resolved, 2 taste decisions approved.
|
|
927
|
+
```
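
One way to picture the split between auto-resolved and taste decisions is a score margin: if the top two options are close, the decision is held for the approval gate. This is a sketch under stated assumptions, not the skill's logic. The `scores` shape, the function name, and the 1-point margin (borrowed from the transcript above) are all illustrative.

```python
from typing import Dict, List, Tuple


def split_decisions(
    decisions: List[Dict], margin: float = 1.0
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split decisions into (auto, taste) lists of (decision name, leading option)."""
    auto: List[Tuple[str, str]] = []
    taste: List[Tuple[str, str]] = []
    for decision in decisions:
        ranked = sorted(decision["scores"].items(), key=lambda kv: kv[1], reverse=True)
        # "Close" means the runner-up is within `margin` points of the leader.
        close = len(ranked) > 1 and ranked[0][1] - ranked[1][1] <= margin
        (taste if close else auto).append((decision["name"], ranked[0][0]))
    return auto, taste
```

Only the `taste` list would be presented to the user; everything in `auto` is applied without a prompt.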
|
|
928
|
+
|
|
929
|
+
---
|
|
930
|
+
|
|
931
|
+
## `/learn`
|
|
932
|
+
|
|
933
|
+
This is my **institutional memory mode**.
|
|
934
|
+
|
|
935
|
+
opengstack learns from every session. Patterns, pitfalls, preferences, architectural decisions... they accumulate in `~/.opengstack/projects/$SLUG/learnings.jsonl`. Each learning has a confidence score, source attribution, and the files it references.
|
|
936
|
+
|
|
937
|
+
`/learn` lets you see what opengstack has absorbed, search for specific patterns, prune stale entries (when referenced files no longer exist), and export learnings for team sharing. The real magic is in other skills... they automatically search learnings before making recommendations, and display "Prior learning applied" when a past insight is relevant.
|
|
938
|
+
|
|
939
|
+
```
|
|
940
|
+
You: /learn
|
|
941
|
+
|
|
942
|
+
Claude: 23 learnings for this project (14 high confidence, 6 medium, 3 low)
|
|
943
|
+
|
|
944
|
+
Top patterns:
|
|
945
|
+
- [9/10] API responses always wrapped in { data, error } envelope
|
|
946
|
+
- [8/10] Tests use factory helpers in test/support/factories.ts
|
|
947
|
+
- [8/10] All DB queries go through repository pattern, never direct
|
|
948
|
+
|
|
949
|
+
3 potentially stale (referenced files deleted):
|
|
950
|
+
- "auth middleware uses JWT" — auth/middleware.ts was deleted
|
|
951
|
+
[Prune these? Y/N]
|
|
952
|
+
```
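
Finding stale entries in a JSONL file can be sketched as follows. The field names (`insight`, `confidence`, `files`) are hypothetical, because the real schema of `learnings.jsonl` is not shown here. The sketch also assumes an entry is stale as soon as any file it references is missing.

```python
import json
from pathlib import Path
from typing import Dict, List


def stale_learnings(jsonl_text: str, project_root: Path) -> List[Dict]:
    """Return entries that reference at least one file missing under project_root."""
    stale: List[Dict] = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between records
        entry = json.loads(line)
        missing = [f for f in entry.get("files", []) if not (project_root / f).exists()]
        if missing:
            stale.append(entry)
    return stale
```

Entries with no referenced files are never flagged, since there is nothing to check them against.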
|
|
953
|
+
|
|
954
|
+
---
|
|
955
|
+
|
|
956
|
+
## `/connect-chrome`
|
|
957
|
+
|
|
958
|
+
This is my **co-presence mode**.
|
|
959
|
+
|
|
960
|
+
`/browse` runs headless by default. You don't see what the agent sees. `/connect-chrome` changes that. It launches your actual Chrome browser controlled by Playwright, with the opengstack Side Panel extension auto-loaded. You watch every action in real time... same screen, same window.
|
|
961
|
+
|
|
962
|
+
A subtle green shimmer at the top edge tells you which Chrome window opengstack controls. All existing browse commands work unchanged. The Side Panel shows a live activity feed of every command and a chat sidebar where you can direct Claude with natural language instructions.
|
|
963
|
+
|
|
964
|
+
```
|
|
965
|
+
You: /connect-chrome
|
|
966
|
+
|
|
967
|
+
Claude: Launched Chrome with Side Panel extension.
|
|
968
|
+
Green shimmer indicates the controlled window.
|
|
969
|
+
All $B commands now run in headed mode.
|
|
970
|
+
Type in the Side Panel to direct the browser agent.
|
|
971
|
+
```
|
|
972
|
+
|
|
973
|
+
---
|
|
974
|
+
|
|
975
|
+
## `/setup-deploy`
|
|
976
|
+
|
|
977
|
+
One-time deploy configuration. Run this before your first `/land-and-deploy`.
|
|
978
|
+
|
|
979
|
+
It auto-detects your deploy platform (Fly.io, Render, Vercel, Netlify, Heroku, GitHub Actions, or custom) and discovers your production URL, health check endpoints, and deploy status commands. It writes everything to CLAUDE.md so all future deploys are automatic.
|
|
980
|
+
|
|
981
|
+
```
|
|
982
|
+
You: /setup-deploy
|
|
983
|
+
|
|
984
|
+
Claude: Detected: Fly.io (fly.toml found)
|
|
985
|
+
Production URL: https://myapp.fly.dev
|
|
986
|
+
Health check: /health → expects 200
|
|
987
|
+
Deploy command: fly deploy
|
|
988
|
+
Status command: fly status
|
|
989
|
+
|
|
990
|
+
Written to CLAUDE.md. Run /land-and-deploy when ready.
|
|
991
|
+
```
|
|
992
|
+
|
|
993
|
+
---
|
|
994
|
+
|
|
714
995
|
## `/codex`
|
|
715
996
|
|
|
716
997
|
This is my **second opinion mode**.
|
|
@@ -730,20 +1011,20 @@ When `/review` catches bugs from Claude's perspective, `/codex` brings a complet
|
|
|
730
1011
|
When both `/review` (Claude) and `/codex` (OpenAI) have reviewed the same branch, you get a cross-model comparison: which findings overlap (high confidence), which are unique to Codex (different perspective), and which are unique to Claude. This is the "two doctors, same patient" approach to code review.
|
|
731
1012
|
|
|
732
1013
|
```
|
|
733
|
-
You:
|
|
1014
|
+
You: /codex review
|
|
734
1015
|
|
|
735
1016
|
Claude: Running independent Codex review...
|
|
736
1017
|
|
|
737
|
-
|
|
738
|
-
|
|
739
|
-
|
|
740
|
-
|
|
741
|
-
|
|
1018
|
+
CODEX REVIEW: PASS (3 findings)
|
|
1019
|
+
[P2] Race condition in payment handler — concurrent charges
|
|
1020
|
+
can double-debit without advisory lock
|
|
1021
|
+
[P3] Missing null check on user.email before downcase
|
|
1022
|
+
[P3] Token comparison not using constant-time compare
|
|
742
1023
|
|
|
743
|
-
|
|
744
|
-
|
|
745
|
-
|
|
746
|
-
|
|
1024
|
+
Cross-model analysis (vs /review):
|
|
1025
|
+
OVERLAP: Race condition in payment handler (both caught it)
|
|
1026
|
+
UNIQUE TO CODEX: Token comparison timing attack
|
|
1027
|
+
UNIQUE TO CLAUDE: N+1 query in listing photos
|
|
747
1028
|
```
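
The overlap and unique-to-each buckets in the cross-model analysis are plain set operations once findings are reduced to comparable keys. A minimal sketch, assuming each finding has already been normalized to a short string. Matching real findings across two models would need something fuzzier, and `compare_findings` is a hypothetical name.

```python
from typing import Dict, Set


def compare_findings(claude: Set[str], codex: Set[str]) -> Dict[str, Set[str]]:
    """Bucket findings into overlap, unique to Codex, and unique to Claude."""
    return {
        "overlap": claude & codex,
        "unique_to_codex": codex - claude,
        "unique_to_claude": claude - codex,
    }
```

Findings in `overlap` are the high-confidence ones, since both reviewers flagged them independently.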
|
|
748
1029
|
|
|
749
1030
|
---
|
|
@@ -775,14 +1056,14 @@ Restrict all file edits to a single directory. When you're debugging a billing b
|
|
|
775
1056
|
`/investigate` activates this automatically — it detects the module being debugged and freezes edits to that directory.
|
|
776
1057
|
|
|
777
1058
|
```
|
|
778
|
-
You:
|
|
1059
|
+
You: /freeze src/billing
|
|
779
1060
|
|
|
780
1061
|
Claude: Edits restricted to src/billing/. Run /unfreeze to remove.
|
|
781
1062
|
|
|
782
|
-
|
|
1063
|
+
[Later, Claude tries to edit src/auth/middleware.ts]
|
|
783
1064
|
|
|
784
1065
|
Claude: BLOCKED — Edit outside freeze boundary (src/billing/).
|
|
785
|
-
|
|
1066
|
+
Skipping this change.
|
|
786
1067
|
```
|
|
787
1068
|
|
|
788
1069
|
Note: this blocks Edit and Write tools only. Bash commands like `sed` can still modify files outside the boundary — it's accident prevention, not a security sandbox.
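
The boundary check itself can be sketched with `pathlib`. This is an illustration of the idea, not the hook's actual code, and `is_within_boundary` is a hypothetical name. Paths are resolved first so that `..` segments cannot step outside the frozen directory.

```python
from pathlib import Path


def is_within_boundary(target: str, boundary: str) -> bool:
    """Return True if target is the boundary directory or lies inside it."""
    resolved_target = Path(target).resolve()
    resolved_boundary = Path(boundary).resolve()
    # Comparing whole path components avoids treating src/billing-old
    # as if it were inside src/billing.
    return (
        resolved_target == resolved_boundary
        or resolved_boundary in resolved_target.parents
    )
```

A plain string prefix test would wrongly accept sibling directories that share a prefix, which is why the sketch compares path components instead.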
|
|
@@ -797,27 +1078,27 @@ Remove the `/freeze` boundary, allowing edits everywhere again. The hooks stay r
|
|
|
797
1078
|
|
|
798
1079
|
---
|
|
799
1080
|
|
|
800
|
-
## `/
|
|
1081
|
+
## `/opengstack-upgrade`
|
|
801
1082
|
|
|
802
|
-
Keep
|
|
1083
|
+
Keep opengstack current with one command. It detects your install type (global at `~/.claude/skills/opengstack` vs vendored in your project at `.claude/skills/opengstack`), runs the upgrade, syncs both copies if you have dual installs, and shows you what changed.
|
|
803
1084
|
|
|
804
1085
|
```
|
|
805
|
-
You:
|
|
1086
|
+
You: /opengstack-upgrade
|
|
806
1087
|
|
|
807
1088
|
Claude: Current version: 0.7.4
|
|
808
|
-
|
|
1089
|
+
Latest version: 0.8.2
|
|
809
1090
|
|
|
810
|
-
|
|
811
|
-
|
|
812
|
-
|
|
813
|
-
|
|
814
|
-
|
|
815
|
-
|
|
1091
|
+
What's new:
|
|
1092
|
+
- Browse handoff for CAPTCHAs and auth walls
|
|
1093
|
+
- /codex multi-AI second opinion
|
|
1094
|
+
- /qa always uses browser now
|
|
1095
|
+
- Safety skills: /careful, /freeze, /guard
|
|
1096
|
+
- Proactive skill suggestions
|
|
816
1097
|
|
|
817
|
-
|
|
1098
|
+
Upgraded to 0.8.2. Both global and project installs synced.
|
|
818
1099
|
```
|
|
819
1100
|
|
|
820
|
-
Set `auto_upgrade: true` in `~/.
|
|
1101
|
+
Set `auto_upgrade: true` in `~/.opengstack/config.yaml` to skip the prompt entirely — opengstack upgrades silently at the start of each session when a new version is available.
|
|
821
1102
|
|
|
822
1103
|
---
|
|
823
1104
|
|
|
@@ -827,13 +1108,13 @@ Set `auto_upgrade: true` in `~/.gstack/config.yaml` to skip the prompt entirely
|
|
|
827
1108
|
|
|
828
1109
|
### Setup
|
|
829
1110
|
|
|
830
|
-
Install Greptile on your GitHub repo at [greptile.com](https://greptile.com) — it takes about 30 seconds. Once it's reviewing your PRs,
|
|
1111
|
+
Install Greptile on your GitHub repo at [greptile.com](https://greptile.com) — it takes about 30 seconds. Once it's reviewing your PRs, opengstack picks up its comments automatically. No additional configuration.
|
|
831
1112
|
|
|
832
1113
|
### How it works
|
|
833
1114
|
|
|
834
1115
|
The problem with any automated reviewer is triage. Greptile is good, but not every comment is a real issue. Some are false positives. Some flag things you already fixed three commits ago. Without a triage layer, the comments pile up and you start ignoring them — which defeats the purpose.
|
|
835
1116
|
|
|
836
|
-
|
|
1117
|
+
opengstack solves this. `/review` and `/ship` are now Greptile-aware. They read Greptile's comments, classify each one, and take action:
|
|
837
1118
|
|
|
838
1119
|
- **Valid issues** get added to the critical findings and fixed before shipping
|
|
839
1120
|
- **Already-fixed issues** get an auto-reply acknowledging the catch
|
|
@@ -843,38 +1124,38 @@ The result is a two-layer review: Greptile catches things asynchronously on the
|
|
|
843
1124
|
|
|
844
1125
|
### Learning from history
|
|
845
1126
|
|
|
846
|
-
Every false positive you confirm gets saved to `~/.
|
|
1127
|
+
Every false positive you confirm gets saved to `~/.opengstack/greptile-history.md`. Future runs auto-skip known FP patterns for your codebase. And `/retro` tracks Greptile's batting average over time — so you can see whether the signal-to-noise ratio is improving.
|
|
847
1128
|
|
|
848
1129
|
### Example
|
|
849
1130
|
|
|
850
1131
|
```
|
|
851
|
-
You:
|
|
1132
|
+
You: /ship
|
|
852
1133
|
|
|
853
1134
|
Claude: [syncs main, runs tests, pre-landing review...]
|
|
854
1135
|
|
|
855
|
-
|
|
1136
|
+
Greptile found 3 comments on this PR:
|
|
856
1137
|
|
|
857
|
-
|
|
858
|
-
|
|
859
|
-
|
|
1138
|
+
[VALID] app/services/payment_service.rb:47 — Race condition:
|
|
1139
|
+
concurrent charges can double-debit. Recommend DB-level advisory
|
|
1140
|
+
lock around the charge block.
|
|
860
1141
|
|
|
861
|
-
|
|
862
|
-
|
|
863
|
-
|
|
1142
|
+
[ALREADY FIXED] app/models/user.rb:12 — Missing null check
|
|
1143
|
+
on email before downcase.
|
|
1144
|
+
→ Already fixed in commit abc1234. Auto-replying.
|
|
864
1145
|
|
|
865
|
-
|
|
866
|
-
|
|
867
|
-
|
|
1146
|
+
[FALSE POSITIVE] lib/auth.rb:88 — "Token comparison should
|
|
1147
|
+
use constant-time comparison." We already use
|
|
1148
|
+
ActiveSupport::SecurityUtils.secure_compare here.
|
|
868
1149
|
|
|
869
|
-
|
|
1150
|
+
Fixing the race condition in payment_service.rb...
|
|
870
1151
|
|
|
871
|
-
|
|
872
|
-
|
|
1152
|
+
The auth token comment is a false positive.
|
|
1153
|
+
Options: A) Reply to Greptile explaining B) Fix anyway C) Ignore
|
|
873
1154
|
|
|
874
|
-
You:
|
|
1155
|
+
You: A
|
|
875
1156
|
|
|
876
1157
|
Claude: Replied to Greptile. All tests pass.
|
|
877
|
-
|
|
1158
|
+
PR: github.com/you/app/pull/42
|
|
878
1159
|
```
|
|
879
1160
|
|
|
880
1161
|
Three Greptile comments. One real fix. One auto-acknowledged. One false positive pushed back with a reply. Total extra time: about 30 seconds.
|