opengstack 0.13.10 → 0.14.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/{skills/land-and-deploy/SKILL.md → commands/autoplan.md} +0 -16
- package/{skills/benchmark/SKILL.md → commands/benchmark.md} +0 -17
- package/{skills/browse/SKILL.md → commands/browse.md} +0 -17
- package/{skills/ship/SKILL.md → commands/canary.md} +0 -18
- package/{skills/careful/SKILL.md → commands/careful.md} +0 -20
- package/{skills/canary/SKILL.md → commands/codex.md} +0 -17
- package/{skills/connect-chrome/SKILL.md → commands/connect-chrome.md} +0 -15
- package/commands/cso.md +72 -0
- package/commands/design-consultation.md +72 -0
- package/commands/design-review.md +72 -0
- package/commands/design-shotgun.md +72 -0
- package/commands/document-release.md +72 -0
- package/{skills/freeze/SKILL.md → commands/freeze.md} +0 -26
- package/{skills/gstack-upgrade/SKILL.md → commands/gstack-upgrade.md} +0 -14
- package/{skills/guard/SKILL.md → commands/guard.md} +0 -31
- package/commands/investigate.md +72 -0
- package/commands/land-and-deploy.md +72 -0
- package/commands/office-hours.md +72 -0
- package/commands/plan-ceo-review.md +72 -0
- package/commands/plan-design-review.md +72 -0
- package/commands/plan-eng-review.md +72 -0
- package/commands/qa-only.md +72 -0
- package/commands/qa.md +72 -0
- package/commands/retro.md +72 -0
- package/commands/review.md +72 -0
- package/{skills/setup-browser-cookies/SKILL.md → commands/setup-browser-cookies.md} +0 -14
- package/commands/setup-deploy.md +72 -0
- package/commands/ship.md +72 -0
- package/{skills/unfreeze/SKILL.md → commands/unfreeze.md} +0 -12
- package/package.json +4 -4
- package/scripts/install-commands.js +45 -0
- package/skills/autoplan/SKILL.md +0 -96
- package/skills/autoplan/SKILL.md.tmpl +0 -694
- package/skills/benchmark/SKILL.md.tmpl +0 -222
- package/skills/browse/SKILL.md.tmpl +0 -131
- package/skills/browse/bin/find-browse +0 -21
- package/skills/browse/bin/remote-slug +0 -14
- package/skills/browse/scripts/build-node-server.sh +0 -48
- package/skills/browse/src/activity.ts +0 -208
- package/skills/browse/src/browser-manager.ts +0 -959
- package/skills/browse/src/buffers.ts +0 -137
- package/skills/browse/src/bun-polyfill.cjs +0 -109
- package/skills/browse/src/cli.ts +0 -678
- package/skills/browse/src/commands.ts +0 -128
- package/skills/browse/src/config.ts +0 -150
- package/skills/browse/src/cookie-import-browser.ts +0 -625
- package/skills/browse/src/cookie-picker-routes.ts +0 -230
- package/skills/browse/src/cookie-picker-ui.ts +0 -688
- package/skills/browse/src/find-browse.ts +0 -61
- package/skills/browse/src/meta-commands.ts +0 -550
- package/skills/browse/src/platform.ts +0 -17
- package/skills/browse/src/read-commands.ts +0 -358
- package/skills/browse/src/server.ts +0 -1192
- package/skills/browse/src/sidebar-agent.ts +0 -280
- package/skills/browse/src/sidebar-utils.ts +0 -21
- package/skills/browse/src/snapshot.ts +0 -407
- package/skills/browse/src/url-validation.ts +0 -95
- package/skills/browse/src/write-commands.ts +0 -364
- package/skills/browse/test/activity.test.ts +0 -120
- package/skills/browse/test/adversarial-security.test.ts +0 -32
- package/skills/browse/test/browser-manager-unit.test.ts +0 -17
- package/skills/browse/test/bun-polyfill.test.ts +0 -72
- package/skills/browse/test/commands.test.ts +0 -2075
- package/skills/browse/test/compare-board.test.ts +0 -342
- package/skills/browse/test/config.test.ts +0 -316
- package/skills/browse/test/cookie-import-browser.test.ts +0 -519
- package/skills/browse/test/cookie-picker-routes.test.ts +0 -260
- package/skills/browse/test/file-drop.test.ts +0 -271
- package/skills/browse/test/find-browse.test.ts +0 -50
- package/skills/browse/test/findport.test.ts +0 -191
- package/skills/browse/test/fixtures/basic.html +0 -33
- package/skills/browse/test/fixtures/cursor-interactive.html +0 -22
- package/skills/browse/test/fixtures/dialog.html +0 -15
- package/skills/browse/test/fixtures/empty.html +0 -2
- package/skills/browse/test/fixtures/forms.html +0 -55
- package/skills/browse/test/fixtures/iframe.html +0 -30
- package/skills/browse/test/fixtures/network-idle.html +0 -30
- package/skills/browse/test/fixtures/qa-eval-checkout.html +0 -108
- package/skills/browse/test/fixtures/qa-eval-spa.html +0 -98
- package/skills/browse/test/fixtures/qa-eval.html +0 -51
- package/skills/browse/test/fixtures/responsive.html +0 -49
- package/skills/browse/test/fixtures/snapshot.html +0 -55
- package/skills/browse/test/fixtures/spa.html +0 -24
- package/skills/browse/test/fixtures/states.html +0 -17
- package/skills/browse/test/fixtures/upload.html +0 -25
- package/skills/browse/test/gstack-config.test.ts +0 -138
- package/skills/browse/test/gstack-update-check.test.ts +0 -514
- package/skills/browse/test/handoff.test.ts +0 -235
- package/skills/browse/test/path-validation.test.ts +0 -91
- package/skills/browse/test/platform.test.ts +0 -37
- package/skills/browse/test/server-auth.test.ts +0 -65
- package/skills/browse/test/sidebar-agent-roundtrip.test.ts +0 -226
- package/skills/browse/test/sidebar-agent.test.ts +0 -199
- package/skills/browse/test/sidebar-integration.test.ts +0 -320
- package/skills/browse/test/sidebar-unit.test.ts +0 -96
- package/skills/browse/test/snapshot.test.ts +0 -467
- package/skills/browse/test/state-ttl.test.ts +0 -35
- package/skills/browse/test/test-server.ts +0 -57
- package/skills/browse/test/url-validation.test.ts +0 -72
- package/skills/browse/test/watch.test.ts +0 -129
- package/skills/canary/SKILL.md.tmpl +0 -212
- package/skills/careful/SKILL.md.tmpl +0 -56
- package/skills/careful/bin/check-careful.sh +0 -112
- package/skills/codex/SKILL.md +0 -90
- package/skills/codex/SKILL.md.tmpl +0 -417
- package/skills/connect-chrome/SKILL.md.tmpl +0 -195
- package/skills/cso/ACKNOWLEDGEMENTS.md +0 -14
- package/skills/cso/SKILL.md +0 -93
- package/skills/cso/SKILL.md.tmpl +0 -606
- package/skills/design-consultation/SKILL.md +0 -94
- package/skills/design-consultation/SKILL.md.tmpl +0 -415
- package/skills/design-review/SKILL.md +0 -94
- package/skills/design-review/SKILL.md.tmpl +0 -290
- package/skills/design-shotgun/SKILL.md +0 -91
- package/skills/design-shotgun/SKILL.md.tmpl +0 -285
- package/skills/document-release/SKILL.md +0 -91
- package/skills/document-release/SKILL.md.tmpl +0 -359
- package/skills/freeze/SKILL.md.tmpl +0 -77
- package/skills/freeze/bin/check-freeze.sh +0 -79
- package/skills/gstack-upgrade/SKILL.md.tmpl +0 -222
- package/skills/guard/SKILL.md.tmpl +0 -77
- package/skills/investigate/SKILL.md +0 -105
- package/skills/investigate/SKILL.md.tmpl +0 -194
- package/skills/land-and-deploy/SKILL.md.tmpl +0 -881
- package/skills/office-hours/SKILL.md +0 -96
- package/skills/office-hours/SKILL.md.tmpl +0 -645
- package/skills/plan-ceo-review/SKILL.md +0 -94
- package/skills/plan-ceo-review/SKILL.md.tmpl +0 -811
- package/skills/plan-design-review/SKILL.md +0 -92
- package/skills/plan-design-review/SKILL.md.tmpl +0 -446
- package/skills/plan-eng-review/SKILL.md +0 -93
- package/skills/plan-eng-review/SKILL.md.tmpl +0 -303
- package/skills/qa/SKILL.md +0 -95
- package/skills/qa/SKILL.md.tmpl +0 -316
- package/skills/qa/references/issue-taxonomy.md +0 -85
- package/skills/qa/templates/qa-report-template.md +0 -126
- package/skills/qa-only/SKILL.md +0 -89
- package/skills/qa-only/SKILL.md.tmpl +0 -101
- package/skills/retro/SKILL.md +0 -89
- package/skills/retro/SKILL.md.tmpl +0 -820
- package/skills/review/SKILL.md +0 -92
- package/skills/review/SKILL.md.tmpl +0 -281
- package/skills/review/TODOS-format.md +0 -62
- package/skills/review/checklist.md +0 -220
- package/skills/review/design-checklist.md +0 -132
- package/skills/review/greptile-triage.md +0 -220
- package/skills/setup-browser-cookies/SKILL.md.tmpl +0 -81
- package/skills/setup-deploy/SKILL.md +0 -92
- package/skills/setup-deploy/SKILL.md.tmpl +0 -215
- package/skills/ship/SKILL.md.tmpl +0 -636
- package/skills/unfreeze/SKILL.md.tmpl +0 -36
|
@@ -1,129 +0,0 @@
|
|
|
1
|
-
/**
|
|
2
|
-
* Tests for watch mode state machine in BrowserManager.
|
|
3
|
-
*
|
|
4
|
-
* Pure unit tests — no browser needed. Just instantiate BrowserManager
|
|
5
|
-
* and test the watch state methods (startWatch, stopWatch, addWatchSnapshot,
|
|
6
|
-
* isWatching).
|
|
7
|
-
*/
|
|
8
|
-
|
|
9
|
-
import { describe, test, expect } from 'bun:test';
|
|
10
|
-
import { BrowserManager } from '../src/browser-manager';
|
|
11
|
-
|
|
12
|
-
describe('watch mode — state machine', () => {
|
|
13
|
-
test('isWatching returns false by default', () => {
|
|
14
|
-
const bm = new BrowserManager();
|
|
15
|
-
expect(bm.isWatching()).toBe(false);
|
|
16
|
-
});
|
|
17
|
-
|
|
18
|
-
test('startWatch sets isWatching to true', () => {
|
|
19
|
-
const bm = new BrowserManager();
|
|
20
|
-
bm.startWatch();
|
|
21
|
-
expect(bm.isWatching()).toBe(true);
|
|
22
|
-
});
|
|
23
|
-
|
|
24
|
-
test('stopWatch clears isWatching and returns snapshots', () => {
|
|
25
|
-
const bm = new BrowserManager();
|
|
26
|
-
bm.startWatch();
|
|
27
|
-
bm.addWatchSnapshot('snapshot-1');
|
|
28
|
-
bm.addWatchSnapshot('snapshot-2');
|
|
29
|
-
|
|
30
|
-
const result = bm.stopWatch();
|
|
31
|
-
expect(bm.isWatching()).toBe(false);
|
|
32
|
-
expect(result.snapshots).toEqual(['snapshot-1', 'snapshot-2']);
|
|
33
|
-
expect(result.snapshots.length).toBe(2);
|
|
34
|
-
});
|
|
35
|
-
|
|
36
|
-
test('stopWatch returns correct duration (approximately)', async () => {
|
|
37
|
-
const bm = new BrowserManager();
|
|
38
|
-
bm.startWatch();
|
|
39
|
-
|
|
40
|
-
// Wait ~50ms to get a measurable duration
|
|
41
|
-
await new Promise(resolve => setTimeout(resolve, 50));
|
|
42
|
-
|
|
43
|
-
const result = bm.stopWatch();
|
|
44
|
-
// Duration should be at least 40ms (allowing for timer imprecision)
|
|
45
|
-
expect(result.duration).toBeGreaterThanOrEqual(40);
|
|
46
|
-
// And less than 5 seconds (sanity check)
|
|
47
|
-
expect(result.duration).toBeLessThan(5000);
|
|
48
|
-
});
|
|
49
|
-
|
|
50
|
-
test('addWatchSnapshot stores snapshots', () => {
|
|
51
|
-
const bm = new BrowserManager();
|
|
52
|
-
bm.startWatch();
|
|
53
|
-
|
|
54
|
-
bm.addWatchSnapshot('page A content');
|
|
55
|
-
bm.addWatchSnapshot('page B content');
|
|
56
|
-
bm.addWatchSnapshot('page C content');
|
|
57
|
-
|
|
58
|
-
const result = bm.stopWatch();
|
|
59
|
-
expect(result.snapshots.length).toBe(3);
|
|
60
|
-
expect(result.snapshots[0]).toBe('page A content');
|
|
61
|
-
expect(result.snapshots[1]).toBe('page B content');
|
|
62
|
-
expect(result.snapshots[2]).toBe('page C content');
|
|
63
|
-
});
|
|
64
|
-
|
|
65
|
-
test('stopWatch resets snapshots for next cycle', () => {
|
|
66
|
-
const bm = new BrowserManager();
|
|
67
|
-
|
|
68
|
-
// First cycle
|
|
69
|
-
bm.startWatch();
|
|
70
|
-
bm.addWatchSnapshot('first-cycle-snapshot');
|
|
71
|
-
const result1 = bm.stopWatch();
|
|
72
|
-
expect(result1.snapshots.length).toBe(1);
|
|
73
|
-
|
|
74
|
-
// Second cycle — should start fresh
|
|
75
|
-
bm.startWatch();
|
|
76
|
-
const result2 = bm.stopWatch();
|
|
77
|
-
expect(result2.snapshots.length).toBe(0);
|
|
78
|
-
});
|
|
79
|
-
|
|
80
|
-
test('multiple start/stop cycles work correctly', () => {
|
|
81
|
-
const bm = new BrowserManager();
|
|
82
|
-
|
|
83
|
-
// Cycle 1
|
|
84
|
-
bm.startWatch();
|
|
85
|
-
expect(bm.isWatching()).toBe(true);
|
|
86
|
-
bm.addWatchSnapshot('snap-1');
|
|
87
|
-
const r1 = bm.stopWatch();
|
|
88
|
-
expect(bm.isWatching()).toBe(false);
|
|
89
|
-
expect(r1.snapshots).toEqual(['snap-1']);
|
|
90
|
-
|
|
91
|
-
// Cycle 2
|
|
92
|
-
bm.startWatch();
|
|
93
|
-
expect(bm.isWatching()).toBe(true);
|
|
94
|
-
bm.addWatchSnapshot('snap-2a');
|
|
95
|
-
bm.addWatchSnapshot('snap-2b');
|
|
96
|
-
const r2 = bm.stopWatch();
|
|
97
|
-
expect(bm.isWatching()).toBe(false);
|
|
98
|
-
expect(r2.snapshots).toEqual(['snap-2a', 'snap-2b']);
|
|
99
|
-
|
|
100
|
-
// Cycle 3 — no snapshots added
|
|
101
|
-
bm.startWatch();
|
|
102
|
-
expect(bm.isWatching()).toBe(true);
|
|
103
|
-
const r3 = bm.stopWatch();
|
|
104
|
-
expect(bm.isWatching()).toBe(false);
|
|
105
|
-
expect(r3.snapshots).toEqual([]);
|
|
106
|
-
});
|
|
107
|
-
|
|
108
|
-
test('stopWatch clears watchInterval if set', () => {
|
|
109
|
-
const bm = new BrowserManager();
|
|
110
|
-
bm.startWatch();
|
|
111
|
-
|
|
112
|
-
// Simulate an interval being set (as the server does)
|
|
113
|
-
bm.watchInterval = setInterval(() => {}, 100000);
|
|
114
|
-
expect(bm.watchInterval).not.toBeNull();
|
|
115
|
-
|
|
116
|
-
bm.stopWatch();
|
|
117
|
-
expect(bm.watchInterval).toBeNull();
|
|
118
|
-
});
|
|
119
|
-
|
|
120
|
-
test('stopWatch without startWatch returns empty results', () => {
|
|
121
|
-
const bm = new BrowserManager();
|
|
122
|
-
|
|
123
|
-
// Calling stopWatch without startWatch should not throw
|
|
124
|
-
const result = bm.stopWatch();
|
|
125
|
-
expect(result.snapshots).toEqual([]);
|
|
126
|
-
expect(result.duration).toBeLessThanOrEqual(Date.now()); // duration = now - 0
|
|
127
|
-
expect(bm.isWatching()).toBe(false);
|
|
128
|
-
});
|
|
129
|
-
});
|
|
@@ -1,212 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: canary
|
|
3
|
-
preamble-tier: 2
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
description: |
|
|
6
|
-
Post-deploy canary monitoring. Watches the live app for console errors,
|
|
7
|
-
performance regressions, and page failures using the browse daemon. Takes
|
|
8
|
-
periodic screenshots, compares against pre-deploy baselines, and alerts
|
|
9
|
-
on anomalies. Use when: "monitor deploy", "canary", "post-deploy check",
|
|
10
|
-
"watch production", "verify deploy".
|
|
11
|
-
allowed-tools:
|
|
12
|
-
- Bash
|
|
13
|
-
- Read
|
|
14
|
-
- Write
|
|
15
|
-
- Glob
|
|
16
|
-
- AskUserQuestion
|
|
17
|
-
---
|
|
18
|
-
|
|
19
|
-
{{PREAMBLE}}
|
|
20
|
-
|
|
21
|
-
{{BROWSE_SETUP}}
|
|
22
|
-
|
|
23
|
-
{{BASE_BRANCH_DETECT}}
|
|
24
|
-
|
|
25
|
-
# /canary — Post-Deploy Visual Monitor
|
|
26
|
-
|
|
27
|
-
You are a **Release Reliability Engineer** watching production after a deploy. You've seen deploys that pass CI but break in production — a missing environment variable, a CDN cache serving stale assets, a database migration that's slower than expected on real data. Your job is to catch these in the first 10 minutes, not 10 hours.
|
|
28
|
-
|
|
29
|
-
You use the browse daemon to watch the live app, take screenshots, check console errors, and compare against baselines. You are the safety net between "shipped" and "verified."
|
|
30
|
-
|
|
31
|
-
## User-invocable
|
|
32
|
-
When the user types `/canary`, run this skill.
|
|
33
|
-
|
|
34
|
-
## Arguments
|
|
35
|
-
- `/canary <url>` — monitor a URL for 10 minutes after deploy
|
|
36
|
-
- `/canary <url> --duration 5m` — custom monitoring duration (1m to 30m)
|
|
37
|
-
- `/canary <url> --baseline` — capture baseline screenshots (run BEFORE deploying)
|
|
38
|
-
- `/canary <url> --pages /,/dashboard,/settings` — specify pages to monitor
|
|
39
|
-
- `/canary <url> --quick` — single-pass health check (no continuous monitoring)
|
|
40
|
-
|
|
41
|
-
## Instructions
|
|
42
|
-
|
|
43
|
-
### Phase 1: Setup
|
|
44
|
-
|
|
45
|
-
```bash
|
|
46
|
-
eval "$(~/.claude/skills/opengstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
|
|
47
|
-
mkdir -p .gstack/canary-reports
|
|
48
|
-
mkdir -p .gstack/canary-reports/baselines
|
|
49
|
-
mkdir -p .gstack/canary-reports/screenshots
|
|
50
|
-
|
|
51
|
-
Parse the user's arguments. Default duration is 10 minutes. Default pages: auto-discover from the app's navigation.
|
|
52
|
-
|
|
53
|
-
### Phase 2: Baseline Capture (--baseline mode)
|
|
54
|
-
|
|
55
|
-
If the user passed `--baseline`, capture the current state BEFORE deploying.
|
|
56
|
-
|
|
57
|
-
For each page (either from `--pages` or the homepage):
|
|
58
|
-
|
|
59
|
-
```bash
|
|
60
|
-
$B goto <page-url>
|
|
61
|
-
$B snapshot -i -a -o ".gstack/canary-reports/baselines/<page-name>.png"
|
|
62
|
-
$B console --errors
|
|
63
|
-
$B perf
|
|
64
|
-
$B text
|
|
65
|
-
|
|
66
|
-
Collect for each page: screenshot path, console error count, page load time from `perf`, and a text content snapshot.
|
|
67
|
-
|
|
68
|
-
Save the baseline manifest to `.gstack/canary-reports/baseline.json`:
|
|
69
|
-
|
|
70
|
-
```json
|
|
71
|
-
{
|
|
72
|
-
"url": "<url>",
|
|
73
|
-
"timestamp": "<ISO>",
|
|
74
|
-
"branch": "<current branch>",
|
|
75
|
-
"pages": {
|
|
76
|
-
"/": {
|
|
77
|
-
"screenshot": "baselines/home.png",
|
|
78
|
-
"console_errors": 0,
|
|
79
|
-
"load_time_ms": 450
|
|
80
|
-
}
|
|
81
|
-
}
|
|
82
|
-
}
|
|
83
|
-
|
|
84
|
-
Then STOP and tell the user: "Baseline captured. Deploy your changes, then run `/canary <url>` to monitor."
|
|
85
|
-
|
|
86
|
-
### Phase 3: Page Discovery
|
|
87
|
-
|
|
88
|
-
If no `--pages` were specified, auto-discover pages to monitor:
|
|
89
|
-
|
|
90
|
-
```bash
|
|
91
|
-
$B goto <url>
|
|
92
|
-
$B links
|
|
93
|
-
$B snapshot -i
|
|
94
|
-
|
|
95
|
-
Extract the top 5 internal navigation links from the `links` output. Always include the homepage. Present the page list via AskUserQuestion:
|
|
96
|
-
|
|
97
|
-
- **Context:** Monitoring the production site at the given URL after a deploy.
|
|
98
|
-
- **Question:** Which pages should the canary monitor?
|
|
99
|
-
- **RECOMMENDATION:** Choose A — these are the main navigation targets.
|
|
100
|
-
- A) Monitor these pages: [list the discovered pages]
|
|
101
|
-
- B) Add more pages (user specifies)
|
|
102
|
-
- C) Monitor homepage only (quick check)
|
|
103
|
-
|
|
104
|
-
### Phase 4: Pre-Deploy Snapshot (if no baseline exists)
|
|
105
|
-
|
|
106
|
-
If no `baseline.json` exists, take a quick snapshot now as a reference point.
|
|
107
|
-
|
|
108
|
-
For each page to monitor:
|
|
109
|
-
|
|
110
|
-
```bash
|
|
111
|
-
$B goto <page-url>
|
|
112
|
-
$B snapshot -i -a -o ".gstack/canary-reports/screenshots/pre-<page-name>.png"
|
|
113
|
-
$B console --errors
|
|
114
|
-
$B perf
|
|
115
|
-
|
|
116
|
-
Record the console error count and load time for each page. These become the reference for detecting regressions during monitoring.
|
|
117
|
-
|
|
118
|
-
### Phase 5: Continuous Monitoring Loop
|
|
119
|
-
|
|
120
|
-
Monitor for the specified duration. Every 60 seconds, check each page:
|
|
121
|
-
|
|
122
|
-
```bash
|
|
123
|
-
$B goto <page-url>
|
|
124
|
-
$B snapshot -i -a -o ".gstack/canary-reports/screenshots/<page-name>-<check-number>.png"
|
|
125
|
-
$B console --errors
|
|
126
|
-
$B perf
|
|
127
|
-
|
|
128
|
-
After each check, compare results against the baseline (or pre-deploy snapshot):
|
|
129
|
-
|
|
130
|
-
1. **Page load failure** — `goto` returns error or timeout → CRITICAL ALERT
|
|
131
|
-
2. **New console errors** — errors not present in baseline → HIGH ALERT
|
|
132
|
-
3. **Performance regression** — load time exceeds 2x baseline → MEDIUM ALERT
|
|
133
|
-
4. **Broken links** — new 404s not in baseline → LOW ALERT
|
|
134
|
-
|
|
135
|
-
**Alert on changes, not absolutes.** A page with 3 console errors in the baseline is fine if it still has 3. One NEW error is an alert.
|
|
136
|
-
|
|
137
|
-
**Don't cry wolf.** Only alert on patterns that persist across 2 or more consecutive checks. A single transient network blip is not an alert.
|
|
138
|
-
|
|
139
|
-
**If a CRITICAL or HIGH alert is detected**, immediately notify the user via AskUserQuestion:
|
|
140
|
-
|
|
141
|
-
|
|
142
|
-
CANARY ALERT
|
|
143
|
-
════════════
|
|
144
|
-
Time: [timestamp, e.g., check #3 at 180s]
|
|
145
|
-
Page: [page URL]
|
|
146
|
-
Type: [CRITICAL / HIGH / MEDIUM]
|
|
147
|
-
Finding: [what changed — be specific]
|
|
148
|
-
Evidence: [screenshot path]
|
|
149
|
-
Baseline: [baseline value]
|
|
150
|
-
Current: [current value]
|
|
151
|
-
|
|
152
|
-
- **Context:** Canary monitoring detected an issue on [page] after [duration].
|
|
153
|
-
- **RECOMMENDATION:** Choose based on severity — A for critical, B for transient.
|
|
154
|
-
- A) Investigate now — stop monitoring, focus on this issue
|
|
155
|
-
- B) Continue monitoring — this might be transient (wait for next check)
|
|
156
|
-
- C) Rollback — revert the deploy immediately
|
|
157
|
-
- D) Dismiss — false positive, continue monitoring
|
|
158
|
-
|
|
159
|
-
### Phase 6: Health Report
|
|
160
|
-
|
|
161
|
-
After monitoring completes (or if the user stops early), produce a summary:
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
CANARY REPORT — [url]
|
|
165
|
-
═════════════════════
|
|
166
|
-
Duration: [X minutes]
|
|
167
|
-
Pages: [N pages monitored]
|
|
168
|
-
Checks: [N total checks performed]
|
|
169
|
-
Status: [HEALTHY / DEGRADED / BROKEN]
|
|
170
|
-
|
|
171
|
-
Per-Page Results:
|
|
172
|
-
─────────────────────────────────────────────────────
|
|
173
|
-
Page Status Errors Avg Load
|
|
174
|
-
/ HEALTHY 0 450ms
|
|
175
|
-
/dashboard DEGRADED 2 new 1200ms (was 400ms)
|
|
176
|
-
/settings HEALTHY 0 380ms
|
|
177
|
-
|
|
178
|
-
Alerts Fired: [N] (X critical, Y high, Z medium)
|
|
179
|
-
Screenshots: .gstack/canary-reports/screenshots/
|
|
180
|
-
|
|
181
|
-
VERDICT: [DEPLOY IS HEALTHY / DEPLOY HAS ISSUES — details above]
|
|
182
|
-
|
|
183
|
-
Save report to `.gstack/canary-reports/{date}-canary.md` and `.gstack/canary-reports/{date}-canary.json`.
|
|
184
|
-
|
|
185
|
-
Log the result for the review dashboard:
|
|
186
|
-
|
|
187
|
-
```bash
|
|
188
|
-
{{SLUG_EVAL}}
|
|
189
|
-
mkdir -p ~/.gstack/projects/$SLUG
|
|
190
|
-
|
|
191
|
-
Write a JSONL entry: `{"skill":"canary","timestamp":"<ISO>","status":"<HEALTHY/DEGRADED/BROKEN>","url":"<url>","duration_min":<N>,"alerts":<N>}`
|
|
192
|
-
|
|
193
|
-
### Phase 7: Baseline Update
|
|
194
|
-
|
|
195
|
-
If the deploy is healthy, offer to update the baseline:
|
|
196
|
-
|
|
197
|
-
- **Context:** Canary monitoring completed. The deploy is healthy.
|
|
198
|
-
- **RECOMMENDATION:** Choose A — deploy is healthy, new baseline reflects current production.
|
|
199
|
-
- A) Update baseline with current screenshots
|
|
200
|
-
- B) Keep old baseline
|
|
201
|
-
|
|
202
|
-
If the user chooses A, copy the latest screenshots to the baselines directory and update `baseline.json`.
|
|
203
|
-
|
|
204
|
-
## Important Rules
|
|
205
|
-
|
|
206
|
-
- **Speed matters.** Start monitoring within 30 seconds of invocation. Don't over-analyze before monitoring.
|
|
207
|
-
- **Alert on changes, not absolutes.** Compare against baseline, not industry standards.
|
|
208
|
-
- **Screenshots are evidence.** Every alert includes a screenshot path. No exceptions.
|
|
209
|
-
- **Transient tolerance.** Only alert on patterns that persist across 2+ consecutive checks.
|
|
210
|
-
- **Baseline is king.** Without a baseline, canary is a health check. Encourage `--baseline` before deploying.
|
|
211
|
-
- **Performance thresholds are relative.** 2x baseline is a regression. 1.5x might be normal variance.
|
|
212
|
-
- **Read-only.** Observe and report. Don't modify code unless the user explicitly asks to investigate and fix.
|
|
@@ -1,56 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: careful
|
|
3
|
-
version: 0.1.0
|
|
4
|
-
description: |
|
|
5
|
-
Safety guardrails for destructive commands. Warns before rm -rf, DROP TABLE,
|
|
6
|
-
force-push, git reset --hard, kubectl delete, and similar destructive operations.
|
|
7
|
-
User can override each warning. Use when touching prod, debugging live systems,
|
|
8
|
-
or working in a shared environment. Use when asked to "be careful", "safety mode",
|
|
9
|
-
"prod mode", or "careful mode".
|
|
10
|
-
allowed-tools:
|
|
11
|
-
- Bash
|
|
12
|
-
- Read
|
|
13
|
-
hooks:
|
|
14
|
-
PreToolUse:
|
|
15
|
-
- matcher: "Bash"
|
|
16
|
-
hooks:
|
|
17
|
-
- type: command
|
|
18
|
-
command: "bash ${CLAUDE_SKILL_DIR}/bin/check-careful.sh"
|
|
19
|
-
statusMessage: "Checking for destructive commands..."
|
|
20
|
-
---
|
|
21
|
-
|
|
22
|
-
# /careful — Destructive Command Guardrails
|
|
23
|
-
|
|
24
|
-
Safety mode is now **active**. Every bash command will be checked for destructive
|
|
25
|
-
patterns before running. If a destructive command is detected, you'll be warned
|
|
26
|
-
and can choose to proceed or cancel.
|
|
27
|
-
|
|
28
|
-
```bash
|
|
29
|
-
mkdir -p ~/.gstack/analytics
|
|
30
|
-
echo '{"skill":"careful","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
|
|
31
|
-
|
|
32
|
-
## What's protected
|
|
33
|
-
|
|
34
|
-
| Pattern | Example | Risk |
|
|
35
|
-
|---------|---------|------|
|
|
36
|
-
| `rm -rf` / `rm -r` / `rm --recursive` | `rm -rf /var/data` | Recursive delete |
|
|
37
|
-
| `DROP TABLE` / `DROP DATABASE` | `DROP TABLE users;` | Data loss |
|
|
38
|
-
| `TRUNCATE` | `TRUNCATE orders;` | Data loss |
|
|
39
|
-
| `git push --force` / `-f` | `git push -f origin main` | History rewrite |
|
|
40
|
-
| `git reset --hard` | `git reset --hard HEAD~3` | Uncommitted work loss |
|
|
41
|
-
| `git checkout .` / `git restore .` | `git checkout .` | Uncommitted work loss |
|
|
42
|
-
| `kubectl delete` | `kubectl delete pod` | Production impact |
|
|
43
|
-
| `docker rm -f` / `docker system prune` | `docker system prune -a` | Container/image loss |
|
|
44
|
-
|
|
45
|
-
## Safe exceptions
|
|
46
|
-
|
|
47
|
-
These patterns are allowed without warning:
|
|
48
|
-
- `rm -rf node_modules` / `.next` / `dist` / `__pycache__` / `.cache` / `build` / `.turbo` / `coverage`
|
|
49
|
-
|
|
50
|
-
## How it works
|
|
51
|
-
|
|
52
|
-
The hook reads the command from the tool input JSON, checks it against the
|
|
53
|
-
patterns above, and returns `permissionDecision: "ask"` with a warning message
|
|
54
|
-
if a match is found. You can always override the warning and proceed.
|
|
55
|
-
|
|
56
|
-
To deactivate, end the conversation or start a new one. Hooks are session-scoped.
|
|
@@ -1,112 +0,0 @@
|
|
|
1
|
-
#!/usr/bin/env bash
|
|
2
|
-
# check-careful.sh — PreToolUse hook for /careful skill
|
|
3
|
-
# Reads JSON from stdin, checks Bash command for destructive patterns.
|
|
4
|
-
# Returns {"permissionDecision":"ask","message":"..."} to warn, or {} to allow.
|
|
5
|
-
set -euo pipefail
|
|
6
|
-
|
|
7
|
-
# Read stdin (JSON with tool_input)
|
|
8
|
-
INPUT=$(cat)
|
|
9
|
-
|
|
10
|
-
# Extract the "command" field value from tool_input
|
|
11
|
-
# Try grep/sed first (handles 99% of cases), fall back to Python for escaped quotes
|
|
12
|
-
CMD=$(printf '%s' "$INPUT" | grep -o '"command"[[:space:]]*:[[:space:]]*"[^"]*"' | head -1 | sed 's/.*:[[:space:]]*"//;s/"$//' || true)
|
|
13
|
-
|
|
14
|
-
# Python fallback if grep returned empty (e.g., escaped quotes in command)
|
|
15
|
-
if [ -z "$CMD" ]; then
|
|
16
|
-
CMD=$(printf '%s' "$INPUT" | python3 -c 'import sys,json; print(json.loads(sys.stdin.read()).get("tool_input",{}).get("command",""))' 2>/dev/null || true)
|
|
17
|
-
fi
|
|
18
|
-
|
|
19
|
-
# If we still couldn't extract a command, allow
|
|
20
|
-
if [ -z "$CMD" ]; then
|
|
21
|
-
echo '{}'
|
|
22
|
-
exit 0
|
|
23
|
-
fi
|
|
24
|
-
|
|
25
|
-
# Normalize: lowercase for case-insensitive SQL matching
|
|
26
|
-
CMD_LOWER=$(printf '%s' "$CMD" | tr '[:upper:]' '[:lower:]')
|
|
27
|
-
|
|
28
|
-
# --- Check for safe exceptions (rm -rf of build artifacts) ---
|
|
29
|
-
if printf '%s' "$CMD" | grep -qE 'rm\s+(-[a-zA-Z]*r[a-zA-Z]*\s+|--recursive\s+)' 2>/dev/null; then
|
|
30
|
-
SAFE_ONLY=true
|
|
31
|
-
RM_ARGS=$(printf '%s' "$CMD" | sed -E 's/.*rm\s+(-[a-zA-Z]+\s+)*//;s/--recursive\s*//')
|
|
32
|
-
for target in $RM_ARGS; do
|
|
33
|
-
case "$target" in
|
|
34
|
-
*/node_modules|node_modules|*/\.next|\.next|*/dist|dist|*/__pycache__|__pycache__|*/\.cache|\.cache|*/build|build|*/\.turbo|\.turbo|*/coverage|coverage)
|
|
35
|
-
;; # safe target
|
|
36
|
-
-*)
|
|
37
|
-
;; # flag, skip
|
|
38
|
-
*)
|
|
39
|
-
SAFE_ONLY=false
|
|
40
|
-
break
|
|
41
|
-
;;
|
|
42
|
-
esac
|
|
43
|
-
done
|
|
44
|
-
if [ "$SAFE_ONLY" = true ]; then
|
|
45
|
-
echo '{}'
|
|
46
|
-
exit 0
|
|
47
|
-
fi
|
|
48
|
-
fi
|
|
49
|
-
|
|
50
|
-
# --- Destructive pattern checks ---
|
|
51
|
-
WARN=""
|
|
52
|
-
PATTERN=""
|
|
53
|
-
|
|
54
|
-
# rm -rf / rm -r / rm --recursive
|
|
55
|
-
if printf '%s' "$CMD" | grep -qE 'rm\s+(-[a-zA-Z]*r|--recursive)' 2>/dev/null; then
|
|
56
|
-
WARN="Destructive: recursive delete (rm -r). This permanently removes files."
|
|
57
|
-
PATTERN="rm_recursive"
|
|
58
|
-
fi
|
|
59
|
-
|
|
60
|
-
# DROP TABLE / DROP DATABASE
|
|
61
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD_LOWER" | grep -qE 'drop\s+(table|database)' 2>/dev/null; then
|
|
62
|
-
WARN="Destructive: SQL DROP detected. This permanently deletes database objects."
|
|
63
|
-
PATTERN="drop_table"
|
|
64
|
-
fi
|
|
65
|
-
|
|
66
|
-
# TRUNCATE
|
|
67
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD_LOWER" | grep -qE '\btruncate\b' 2>/dev/null; then
|
|
68
|
-
WARN="Destructive: SQL TRUNCATE detected. This deletes all rows from a table."
|
|
69
|
-
PATTERN="truncate"
|
|
70
|
-
fi
|
|
71
|
-
|
|
72
|
-
# git push --force / git push -f
|
|
73
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD" | grep -qE 'git\s+push\s+.*(-f\b|--force)' 2>/dev/null; then
|
|
74
|
-
WARN="Destructive: git force-push rewrites remote history. Other contributors may lose work."
|
|
75
|
-
PATTERN="git_force_push"
|
|
76
|
-
fi
|
|
77
|
-
|
|
78
|
-
# git reset --hard
|
|
79
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD" | grep -qE 'git\s+reset\s+--hard' 2>/dev/null; then
|
|
80
|
-
WARN="Destructive: git reset --hard discards all uncommitted changes."
|
|
81
|
-
PATTERN="git_reset_hard"
|
|
82
|
-
fi
|
|
83
|
-
|
|
84
|
-
# git checkout . / git restore .
|
|
85
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD" | grep -qE 'git\s+(checkout|restore)\s+\.' 2>/dev/null; then
|
|
86
|
-
WARN="Destructive: discards all uncommitted changes in the working tree."
|
|
87
|
-
PATTERN="git_discard"
|
|
88
|
-
fi
|
|
89
|
-
|
|
90
|
-
# kubectl delete
|
|
91
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD" | grep -qE 'kubectl\s+delete' 2>/dev/null; then
|
|
92
|
-
WARN="Destructive: kubectl delete removes Kubernetes resources. May impact production."
|
|
93
|
-
PATTERN="kubectl_delete"
|
|
94
|
-
fi
|
|
95
|
-
|
|
96
|
-
# docker rm -f / docker system prune
|
|
97
|
-
if [ -z "$WARN" ] && printf '%s' "$CMD" | grep -qE 'docker\s+(rm\s+-f|system\s+prune)' 2>/dev/null; then
|
|
98
|
-
WARN="Destructive: Docker force-remove or prune. May delete running containers or cached images."
|
|
99
|
-
PATTERN="docker_destructive"
|
|
100
|
-
fi
|
|
101
|
-
|
|
102
|
-
# --- Output ---
|
|
103
|
-
if [ -n "$WARN" ]; then
|
|
104
|
-
# Log hook fire event (pattern name only, never command content)
|
|
105
|
-
mkdir -p ~/.gstack/analytics 2>/dev/null || true
|
|
106
|
-
echo '{"event":"hook_fire","skill":"careful","pattern":"'"$PATTERN"'","ts":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","repo":"'$(basename "$(git rev-parse --show-toplevel 2>/dev/null)" 2>/dev/null || echo "unknown")'"}' >> ~/.gstack/analytics/skill-usage.jsonl 2>/dev/null || true
|
|
107
|
-
|
|
108
|
-
WARN_ESCAPED=$(printf '%s' "$WARN" | sed 's/"/\\"/g')
|
|
109
|
-
printf '{"permissionDecision":"ask","message":"[careful] %s"}\n' "$WARN_ESCAPED"
|
|
110
|
-
else
|
|
111
|
-
echo '{}'
|
|
112
|
-
fi
|
package/skills/codex/SKILL.md
DELETED
|
@@ -1,90 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: codex
|
|
3
|
-
preamble-tier: 3
|
|
4
|
-
version: 1.0.0
|
|
5
|
-
description: |
|
|
6
|
-
OpenAI Codex CLI wrapper — three modes. Code review: independent diff review via
|
|
7
|
-
codex review with pass/fail gate. Challenge: adversarial mode that tries to break
|
|
8
|
-
your code. Consult: ask codex anything with session continuity for follow-ups.
|
|
9
|
-
The "200 IQ autistic developer" second opinion. Use when asked to "codex review",
|
|
10
|
-
"codex challenge", "ask codex", "second opinion", or "consult codex".
|
|
11
|
-
allowed-tools:
|
|
12
|
-
- Bash
|
|
13
|
-
- Read
|
|
14
|
-
- Write
|
|
15
|
-
- Glob
|
|
16
|
-
- Grep
|
|
17
|
-
- AskUserQuestion
|
|
18
|
-
---
|
|
19
|
-
<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
|
|
20
|
-
<!-- Regenerate: bun run gen:skill-docs -->
|
|
21
|
-
|
|
22
|
-
## Preamble (run first)
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
If `PROACTIVE` is `"false"`, do not proactively suggest gstack skills AND do not
|
|
26
|
-
auto-invoke skills based on conversation context. Only run skills the user explicitly
|
|
27
|
-
types (e.g., /qa, /ship). If you would have auto-invoked a skill, instead briefly say:
|
|
28
|
-
"I think /skillname might help here — want me to run it?" and wait for confirmation.
|
|
29
|
-
The user opted out of proactive behavior.
|
|
30
|
-
|
|
31
|
-
If `SKILL_PREFIX` is `"true"`, the user has namespaced skill names. When suggesting
|
|
32
|
-
or invoking other gstack skills, use the `/gstack-` prefix (e.g., `/gstack-qa` instead
|
|
33
|
-
of `/qa`, `/gstack-ship` instead of `/ship`). Disk paths are unaffected — always use
|
|
34
|
-
`~/.claude/skills/opengstack/[skill-name]/SKILL.md` for reading skill files.
|
|
35
|
-
|
|
36
|
-
If `LAKE_INTRO` is `no`: Before continuing, introduce the Completeness Principle.
|
|
37
|
-
Then offer to open the essay in their default browser:
|
|
38
|
-
|
|
39
|
-
```bash
|
|
40
|
-
touch ~/.gstack/.completeness-intro-seen
|
|
41
|
-
|
|
42
|
-
Only run `open` if the user says yes. Always run `touch` to mark as seen. This only happens once.
|
|
43
|
-
|
|
44
|
-
If `PROACTIVE_PROMPTED` is `no` AND `TEL_PROMPTED` is `yes`: After telemetry is handled,
|
|
45
|
-
ask the user about proactive behavior. Use AskUserQuestion:
|
|
46
|
-
|
|
47
|
-
> gstack can proactively figure out when you might need a skill while you work —
|
|
48
|
-
> like suggesting /qa when you say "does this work?" or /investigate when you hit
|
|
49
|
-
> a bug. We recommend keeping this on — it speeds up every part of your workflow.
|
|
50
|
-
|
|
51
|
-
Options:
|
|
52
|
-
- A) Keep it on (recommended)
|
|
53
|
-
- B) Turn it off — I'll type /commands myself
|
|
54
|
-
|
|
55
|
-
If A: run `echo set proactive true`
|
|
56
|
-
If B: run `echo set proactive false`
|
|
57
|
-
|
|
58
|
-
Always run:
|
|
59
|
-
```bash
|
|
60
|
-
touch ~/.gstack/.proactive-prompted
|
|
61
|
-
|
|
62
|
-
This only happens once. If `PROACTIVE_PROMPTED` is `yes`, skip this entirely.
|
|
63
|
-
|
|
64
|
-
## Voice
|
|
65
|
-
|
|
66
|
-
You are OpenGStack, an open source AI builder framework
|
|
67
|
-
|
|
68
|
-
Lead with the point. Say what it does, why it matters, and what changes for the builder. Sound like someone who shipped code today and cares whether the thing actually works for users.
|
|
69
|
-
|
|
70
|
-
**Core belief:** there is no one at the wheel. Much of the world is made up. That is not scary. That is the opportunity. Builders get to make new things real. Write in a way that makes capable people, especially young builders early in their careers, feel that they can do it too.
|
|
71
|
-
|
|
72
|
-
We are here to make something people want. Building is not the performance of building. It is not tech for tech's sake. It becomes real when it ships and solves a real problem for a real person. Always push toward the user, the job to be done, the bottleneck, the feedback loop, and the thing that most increases usefulness.
|
|
73
|
-
|
|
74
|
-
Start from lived experience. For product, start with the user. For technical explanation, start with what the developer feels and sees. Then explain the mechanism, the tradeoff, and why we chose it.
|
|
75
|
-
|
|
76
|
-
Respect craft. Hate silos. Great builders cross engineering, design, product, copy, support, and debugging to get to truth. Trust experts, then verify. If something smells wrong, inspect the mechanism.
|
|
77
|
-
|
|
78
|
-
Quality matters. Bugs matter. Do not normalize sloppy software. Do not hand-wave away the last 1% or 5% of defects as acceptable. Great product aims at zero defects and takes edge cases seriously. Fix the whole thing, not just the demo path.
|
|
79
|
-
|
|
80
|
-
**Tone:** direct, concrete, sharp, encouraging, serious about craft, occasionally funny, never corporate, never academic, never PR, never hype. Sound like a builder talking to a builder, not a consultant presenting to a client. Match the context:
|
|
81
|
-
|
|
82
|
-
**Humor:** dry observations about the absurdity of software. "This is a 200-line config file to print hello world." "The test suite takes longer than the feature it tests." Never forced, never self-referential about being AI.
|
|
83
|
-
|
|
84
|
-
**Concreteness is the standard.** Name the file, the function, the line number. Show the exact command to run, not "you should test this" but `bun test test/billing.test.ts`. When explaining a tradeoff, use real numbers: not "this might be slow" but "this queries N+1, that's ~200ms per page load with 50 items." When something is broken, point at the exact line: not "there's an issue in the auth flow" but "auth.ts:47, the token check returns undefined when the session expires."
|
|
85
|
-
|
|
86
|
-
**Connect to user outcomes.** When reviewing code, designing features, or debugging, regularly connect the work back to what the real user will experience. "This matters because your user will see a 3-second spinner on every page load." "The edge case you're skipping is the one that loses the customer's data." Make the user's user real.
|
|
87
|
-
|
|
88
|
-
**User sovereignty.** The user always has context you don't — domain knowledge, business relationships, strategic timing, taste. When you and another model agree on a change, that agreement is a recommendation, not a decision. Present it. The user decides. Never say "the outside voice is right" and act. Say "the outside voice recommends X — do you want to proceed?"
|
|
89
|
-
|
|
90
|
-
When a user shows unusually strong product instinct, deep user empathy, sharp insight, or surprising synthesis across domains, recognize it plainly. For exceptional cases only, say that
|