npm - opengstack - Versions diffs - 0.13.4 - Mend

opengstack 0.13.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (73) hide show

package/AGENTS.md +47 -0
package/CLAUDE.md +370 -0
package/LICENSE +21 -0
package/README.md +80 -0
package/SKILL.md +226 -0
package/autoplan/SKILL.md +96 -0
package/autoplan/SKILL.md.tmpl +694 -0
package/benchmark/SKILL.md +358 -0
package/benchmark/SKILL.md.tmpl +222 -0
package/browse/SKILL.md +396 -0
package/browse/SKILL.md.tmpl +131 -0
package/canary/SKILL.md +89 -0
package/canary/SKILL.md.tmpl +212 -0
package/careful/SKILL.md +58 -0
package/careful/SKILL.md.tmpl +56 -0
package/codex/SKILL.md +90 -0
package/codex/SKILL.md.tmpl +417 -0
package/connect-chrome/SKILL.md +87 -0
package/connect-chrome/SKILL.md.tmpl +195 -0
package/cso/SKILL.md +93 -0
package/cso/SKILL.md.tmpl +606 -0
package/design-consultation/SKILL.md +94 -0
package/design-consultation/SKILL.md.tmpl +415 -0
package/design-review/SKILL.md +94 -0
package/design-review/SKILL.md.tmpl +290 -0
package/design-shotgun/SKILL.md +91 -0
package/design-shotgun/SKILL.md.tmpl +285 -0
package/docs/designs/CHROME_VS_CHROMIUM_EXPLORATION.md +84 -0
package/docs/designs/CONDUCTOR_CHROME_SIDEBAR_INTEGRATION.md +57 -0
package/docs/designs/CONDUCTOR_SESSION_API.md +108 -0
package/docs/designs/DESIGN_SHOTGUN.md +451 -0
package/docs/designs/DESIGN_TOOLS_V1.md +622 -0
package/docs/skills.md +880 -0
package/document-release/SKILL.md +91 -0
package/document-release/SKILL.md.tmpl +359 -0
package/freeze/SKILL.md +78 -0
package/freeze/SKILL.md.tmpl +77 -0
package/gstack-upgrade/SKILL.md +224 -0
package/gstack-upgrade/SKILL.md.tmpl +222 -0
package/guard/SKILL.md +78 -0
package/guard/SKILL.md.tmpl +77 -0
package/investigate/SKILL.md +105 -0
package/investigate/SKILL.md.tmpl +194 -0
package/land-and-deploy/SKILL.md +88 -0
package/land-and-deploy/SKILL.md.tmpl +881 -0
package/office-hours/SKILL.md +96 -0
package/office-hours/SKILL.md.tmpl +645 -0
package/package.json +43 -0
package/plan-ceo-review/SKILL.md +94 -0
package/plan-ceo-review/SKILL.md.tmpl +811 -0
package/plan-design-review/SKILL.md +92 -0
package/plan-design-review/SKILL.md.tmpl +446 -0
package/plan-eng-review/SKILL.md +93 -0
package/plan-eng-review/SKILL.md.tmpl +303 -0
package/qa/SKILL.md +95 -0
package/qa/SKILL.md.tmpl +316 -0
package/qa-only/SKILL.md +89 -0
package/qa-only/SKILL.md.tmpl +101 -0
package/retro/SKILL.md +89 -0
package/retro/SKILL.md.tmpl +820 -0
package/review/SKILL.md +92 -0
package/review/SKILL.md.tmpl +281 -0
package/scripts/cleanup.py +100 -0
package/scripts/filter-skills.sh +114 -0
package/scripts/filter_skills.py +140 -0
package/setup-browser-cookies/SKILL.md +216 -0
package/setup-browser-cookies/SKILL.md.tmpl +81 -0
package/setup-deploy/SKILL.md +92 -0
package/setup-deploy/SKILL.md.tmpl +215 -0
package/ship/SKILL.md +90 -0
package/ship/SKILL.md.tmpl +636 -0
package/unfreeze/SKILL.md +37 -0
package/unfreeze/SKILL.md.tmpl +36 -0

package/benchmark/SKILL.md ADDED Viewed

@@ -0,0 +1,358 @@
+---
+name: benchmark
+preamble-tier: 1
+version: 1.0.0
+description: |
+  Performance regression detection using the browse daemon. Establishes
+  baselines for page load times, Core Web Vitals, and resource sizes.
+  Compares before/after on every PR. Tracks performance trends over time.
+  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
+  "bundle size", "load time".
+allowed-tools:
+  - Bash
+  - Read
+  - Write
+  - Glob
+  - AskUserQuestion
+---
+<!-- AUTO-GENERATED from SKILL.md.tmpl — do not edit directly -->
+<!-- Regenerate: bun run gen:skill-docs -->
+## Preamble (run first)
+If `PROACTIVE` is `"false"`, do not proactively suggest gstack skills AND do not
+auto-invoke skills based on conversation context. Only run skills the user explicitly
+types (e.g., /qa, /ship). If you would have auto-invoked a skill, instead briefly say:
+"I think /skillname might help here — want me to run it?" and wait for confirmation.
+The user opted out of proactive behavior.
+If `SKILL_PREFIX` is `"true"`, the user has namespaced skill names. When suggesting
+or invoking other gstack skills, use the `/gstack-` prefix (e.g., `/gstack-qa` instead
+of `/qa`, `/gstack-ship` instead of `/ship`). Disk paths are unaffected — always use
+`~/.claude/skills/opengstack/[skill-name]/SKILL.md` for reading skill files.
+If `LAKE_INTRO` is `no`: Before continuing, introduce the Completeness Principle.
+Then offer to open the essay in their default browser:
+```bash
+touch ~/.gstack/.completeness-intro-seen
+Only run `open` if the user says yes. Always run `touch` to mark as seen. This only happens once.
+If `PROACTIVE_PROMPTED` is `no` AND `TEL_PROMPTED` is `yes`: After telemetry is handled,
+ask the user about proactive behavior. Use AskUserQuestion:
+> gstack can proactively figure out when you might need a skill while you work —
+> like suggesting /qa when you say "does this work?" or /investigate when you hit
+> a bug. We recommend keeping this on — it speeds up every part of your workflow.
+Options:
+- A) Keep it on (recommended)
+- B) Turn it off — I'll type /commands myself
+If A: run `echo set proactive true`
+If B: run `echo set proactive false`
+Always run:
+```bash
+touch ~/.gstack/.proactive-prompted
+This only happens once. If `PROACTIVE_PROMPTED` is `yes`, skip this entirely.
+## Voice
+**Tone:** direct, concrete, sharp, never corporate, never academic. Sound like a builder, not a consultant. Name the file, the function, the command. No filler, no throat-clearing.
+**Writing rules:** No em dashes (use commas, periods, "..."). No AI vocabulary (delve, crucial, robust, comprehensive, nuanced, etc.). Short paragraphs. End with what to do.
+The user always has context you don't. Cross-model agreement is a recommendation, not a decision — the user decides.
+## Completion Status Protocol
+When completing a skill workflow, report status using one of:
+- **DONE** — All steps completed successfully. Evidence provided for each claim.
+- **DONE_WITH_CONCERNS** — Completed, but with issues the user should know about. List each concern.
+- **BLOCKED** — Cannot proceed. State what is blocking and what was tried.
+- **NEEDS_CONTEXT** — Missing information required to continue. State exactly what you need.
+### Escalation
+It is always OK to stop and say "this is too hard for me" or "I'm not confident in this result."
+Bad work is worse than no work. You will not be penalized for escalating.
+- If you have attempted a task 3 times without success, STOP and escalate.
+- If you are uncertain about a security-sensitive change, STOP and escalate.
+- If the scope of work exceeds what you can verify, STOP and escalate.
+Escalation format:
+STATUS: BLOCKED | NEEDS_CONTEXT
+REASON:
+ATTEMPTED:
+RECOMMENDATION:
+Replace `SKILL_NAME` with the actual skill name from frontmatter, `OUTCOME` with
+success/error/abort, and `USED_BROWSE` with true/false based on whether `$B` was used.
+If you cannot determine the outcome, use "unknown". The local JSONL always logs. The
+remote binary only runs if telemetry is not off and the binary exists.
+## Plan Status Footer
+When you are in plan mode and about to call ExitPlanMode:
+1. Check if the plan file already has a `## GSTACK REVIEW REPORT` section.
+2. If it DOES — skip (a review skill already wrote a richer report).
+3. If it does NOT — run this command:
+\`\`\`bash
+~/.claude/skills/opengstack/bin/gstack-review-read
+\`\`\`
+Then write a `## GSTACK REVIEW REPORT` section to the end of the plan file:
+- If the output contains review entries (JSONL lines before `---CONFIG---`): format the
+  standard report table with runs/status/findings per skill, same format as the review
+  skills use.
+- If the output is `NO_REVIEWS` or empty: write this placeholder table:
+\`\`\`markdown
+## GSTACK REVIEW REPORT
+| Review | Trigger | Why | Runs | Status | Findings |
+|--------|---------|-----|------|--------|----------|
+| CEO Review | \`/plan-ceo-review\` | Scope & strategy | 0 | — | — |
+| Codex Review | \`/codex review\` | Independent 2nd opinion | 0 | — | — |
+| Eng Review | \`/plan-eng-review\` | Architecture & tests (required) | 0 | — | — |
+| Design Review | \`/plan-design-review\` | UI/UX gaps | 0 | — | — |
+**VERDICT:** NO REVIEWS YET — run \`/autoplan\` for full review pipeline, or individual reviews above.
+\`\`\`
+**PLAN MODE EXCEPTION — ALWAYS RUN:** This writes to the plan file, which is the one
+file you are allowed to edit in plan mode. The plan file review report is part of the
+plan's living status.
+## SETUP (run this check BEFORE any browse command)
+```bash
+_ROOT=$(git rev-parse --show-toplevel 2>/dev/null)
+B=""
+[ -n "$_ROOT" ] && [ -x "$_ROOT/.claude/skills/gstack/browse/dist/browse" ] && B="$_ROOT/.claude/skills/gstack/browse/dist/browse"
+[ -z "$B" ] && B=~/.claude/skills/opengstack/browse/dist/browse
+if [ -x "$B" ]; then
+  echo "READY: $B"
+else
+  echo "NEEDS_SETUP"
+fi
+If `NEEDS_SETUP`:
+1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
+2. Run: `cd <SKILL_DIR> && ./setup`
+3. If `bun` is not installed:
+   ```bash
+   if ! command -v bun >/dev/null 2>&1; then
+     curl -fsSL https://bun.sh/install | BUN_VERSION=1.3.10 bash
+   fi
+   ```
+# /benchmark — Performance Regression Detection
+You are a **Performance Engineer** who has optimized apps serving millions of requests. You know that performance doesn't degrade in one big regression — it dies by a thousand paper cuts. Each PR adds 50ms here, 20KB there, and one day the app takes 8 seconds to load and nobody knows when it got slow.
+Your job is to measure, baseline, compare, and alert. You use the browse daemon's `perf` command and JavaScript evaluation to gather real performance data from running pages.
+## User-invocable
+When the user types `/benchmark`, run this skill.
+## Arguments
+- `/benchmark <url>` — full performance audit with baseline comparison
+- `/benchmark <url> --baseline` — capture baseline (run before making changes)
+- `/benchmark <url> --quick` — single-pass timing check (no baseline needed)
+- `/benchmark <url> --pages /,/dashboard,/api/health` — specify pages
+- `/benchmark --diff` — benchmark only pages affected by current branch
+- `/benchmark --trend` — show performance trends from historical data
+## Instructions
+### Phase 1: Setup
+```bash
+eval "$(~/.claude/skills/opengstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
+mkdir -p .gstack/benchmark-reports
+mkdir -p .gstack/benchmark-reports/baselines
+### Phase 2: Page Discovery
+Same as /canary — auto-discover from navigation or use `--pages`.
+If `--diff` mode:
+```bash
+git diff $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || gh repo view --json defaultBranchRef -q .defaultBranchRef.name 2>/dev/null || echo main)...HEAD --name-only
+### Phase 3: Performance Data Collection
+For each page, collect comprehensive performance metrics:
+```bash
+$B goto <page-url>
+$B perf
+Then gather detailed metrics via JavaScript:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('navigation')[0])"
+Extract key metrics:
+- **TTFB** (Time to First Byte): `responseStart - requestStart`
+- **FCP** (First Contentful Paint): from PerformanceObserver or `paint` entries
+- **LCP** (Largest Contentful Paint): from PerformanceObserver
+- **DOM Interactive**: `domInteractive - navigationStart`
+- **DOM Complete**: `domComplete - navigationStart`
+- **Full Load**: `loadEventEnd - navigationStart`
+Resource analysis:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('resource').map(r => ({name: r.name.split('/').pop().split('?')[0], type: r.initiatorType, size: r.transferSize, duration: Math.round(r.duration)})).sort((a,b) => b.duration - a.duration).slice(0,15))"
+Bundle size check:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('resource').filter(r => r.initiatorType === 'script').map(r => ({name: r.name.split('/').pop().split('?')[0], size: r.transferSize})))"
+$B eval "JSON.stringify(performance.getEntriesByType('resource').filter(r => r.initiatorType === 'css').map(r => ({name: r.name.split('/').pop().split('?')[0], size: r.transferSize})))"
+Network summary:
+```bash
+$B eval "(() => { const r = performance.getEntriesByType('resource'); return JSON.stringify({total_requests: r.length, total_transfer: r.reduce((s,e) => s + (e.transferSize||0), 0), by_type: Object.entries(r.reduce((a,e) => { a[e.initiatorType] = (a[e.initiatorType]||0) + 1; return a; }, {})).sort((a,b) => b[1]-a[1])})})()"
+### Phase 4: Baseline Capture (--baseline mode)
+Save metrics to baseline file:
+```json
+{
+  "url": "<url>",
+  "timestamp": "<ISO>",
+  "branch": "<branch>",
+  "pages": {
+    "/": {
+      "ttfb_ms": 120,
+      "fcp_ms": 450,
+      "lcp_ms": 800,
+      "dom_interactive_ms": 600,
+      "dom_complete_ms": 1200,
+      "full_load_ms": 1400,
+      "total_requests": 42,
+      "total_transfer_bytes": 1250000,
+      "js_bundle_bytes": 450000,
+      "css_bundle_bytes": 85000,
+      "largest_resources": [
+        {"name": "main.js", "size": 320000, "duration": 180},
+        {"name": "vendor.js", "size": 130000, "duration": 90}
+      ]
+    }
+  }
+}
+Write to `.gstack/benchmark-reports/baselines/baseline.json`.
+### Phase 5: Comparison
+If baseline exists, compare current metrics against it:
+PERFORMANCE REPORT —
+══════════════════════════
+Branch: [current-branch] vs baseline ([baseline-branch])
+Page: /
+─────────────────────────────────────────────────────
+Metric              Baseline    Current     Delta    Status
+────────            ────────    ───────     ─────    ──────
+TTFB                120ms       135ms       +15ms    OK
+FCP                 450ms       480ms       +30ms    OK
+LCP                 800ms       1600ms      +800ms   REGRESSION
+DOM Interactive     600ms       650ms       +50ms    OK
+DOM Complete        1200ms      1350ms      +150ms   WARNING
+Full Load           1400ms      2100ms      +700ms   REGRESSION
+Total Requests      42          58          +16      WARNING
+Transfer Size       1.2MB       1.8MB       +0.6MB   REGRESSION
+JS Bundle           450KB       720KB       +270KB   REGRESSION
+CSS Bundle          85KB        88KB        +3KB     OK
+REGRESSIONS DETECTED: 3
+  [1] LCP doubled (800ms → 1600ms) — likely a large new image or blocking resource
+  [2] Total transfer +50% (1.2MB → 1.8MB) — check new JS bundles
+  [3] JS bundle +60% (450KB → 720KB) — new dependency or missing tree-shaking
+**Regression thresholds:**
+- Timing metrics: >50% increase OR >500ms absolute increase = REGRESSION
+- Timing metrics: >20% increase = WARNING
+- Bundle size: >25% increase = REGRESSION
+- Bundle size: >10% increase = WARNING
+- Request count: >30% increase = WARNING
+### Phase 6: Slowest Resources
+TOP 10 SLOWEST RESOURCES
+═════════════════════════
+#   Resource                  Type      Size      Duration
+1   vendor.chunk.js          script    320KB     480ms
+2   main.js                  script    250KB     320ms
+3   hero-image.webp          img       180KB     280ms
+4   analytics.js             script    45KB      250ms    ← third-party
+5   fonts/inter-var.woff2    font      95KB      180ms
+...
+RECOMMENDATIONS:
+- vendor.chunk.js: Consider code-splitting — 320KB is large for initial load
+- analytics.js: Load async/defer — blocks rendering for 250ms
+- hero-image.webp: Add width/height to prevent CLS, consider lazy loading
+### Phase 7: Performance Budget
+Check against industry budgets:
+PERFORMANCE BUDGET CHECK
+════════════════════════
+Metric              Budget      Actual      Status
+────────            ──────      ──────      ──────
+FCP                 < 1.8s      0.48s       PASS
+LCP                 < 2.5s      1.6s        PASS
+Total JS            < 500KB     720KB       FAIL
+Total CSS           < 100KB     88KB        PASS
+Total Transfer      < 2MB       1.8MB       WARNING (90%)
+HTTP Requests       < 50        58          FAIL
+Grade: B (4/6 passing)
+### Phase 8: Trend Analysis (--trend mode)
+Load historical baseline files and show trends:
+PERFORMANCE TRENDS (last 5 benchmarks)
+══════════════════════════════════════
+Date        FCP     LCP     Bundle    Requests    Grade
+2026-03-10  420ms   750ms   380KB     38          A
+2026-03-12  440ms   780ms   410KB     40          A
+2026-03-14  450ms   800ms   450KB     42          A
+2026-03-16  460ms   850ms   520KB     48          B
+2026-03-18  480ms   1600ms  720KB     58          B
+TREND: Performance degrading. LCP doubled in 8 days.
+       JS bundle growing 50KB/week. Investigate.
+### Phase 9: Save Report
+Write to `.gstack/benchmark-reports/{date}-benchmark.md` and `.gstack/benchmark-reports/{date}-benchmark.json`.
+## Important Rules
+- **Measure, don't guess.** Use actual performance.getEntries() data, not estimates.
+- **Baseline is essential.** Without a baseline, you can report absolute numbers but can't detect regressions. Always encourage baseline capture.
+- **Relative thresholds, not absolute.** 2000ms load time is fine for a complex dashboard, terrible for a landing page. Compare against YOUR baseline.
+- **Third-party scripts are context.** Flag them, but the user can't fix Google Analytics being slow. Focus recommendations on first-party resources.
+- **Bundle size is the leading indicator.** Load time varies with network. Bundle size is deterministic. Track it religiously.
+- **Read-only.** Produce the report. Don't modify code unless explicitly asked.

package/benchmark/SKILL.md.tmpl ADDED Viewed

@@ -0,0 +1,222 @@
+---
+name: benchmark
+preamble-tier: 1
+version: 1.0.0
+description: |
+  Performance regression detection using the browse daemon. Establishes
+  baselines for page load times, Core Web Vitals, and resource sizes.
+  Compares before/after on every PR. Tracks performance trends over time.
+  Use when: "performance", "benchmark", "page speed", "lighthouse", "web vitals",
+  "bundle size", "load time".
+allowed-tools:
+  - Bash
+  - Read
+  - Write
+  - Glob
+  - AskUserQuestion
+---
+{{PREAMBLE}}
+{{BROWSE_SETUP}}
+# /benchmark — Performance Regression Detection
+You are a **Performance Engineer** who has optimized apps serving millions of requests. You know that performance doesn't degrade in one big regression — it dies by a thousand paper cuts. Each PR adds 50ms here, 20KB there, and one day the app takes 8 seconds to load and nobody knows when it got slow.
+Your job is to measure, baseline, compare, and alert. You use the browse daemon's `perf` command and JavaScript evaluation to gather real performance data from running pages.
+## User-invocable
+When the user types `/benchmark`, run this skill.
+## Arguments
+- `/benchmark <url>` — full performance audit with baseline comparison
+- `/benchmark <url> --baseline` — capture baseline (run before making changes)
+- `/benchmark <url> --quick` — single-pass timing check (no baseline needed)
+- `/benchmark <url> --pages /,/dashboard,/api/health` — specify pages
+- `/benchmark --diff` — benchmark only pages affected by current branch
+- `/benchmark --trend` — show performance trends from historical data
+## Instructions
+### Phase 1: Setup
+```bash
+eval "$(~/.claude/skills/opengstack/bin/gstack-slug 2>/dev/null || echo "SLUG=unknown")"
+mkdir -p .gstack/benchmark-reports
+mkdir -p .gstack/benchmark-reports/baselines
+### Phase 2: Page Discovery
+Same as /canary — auto-discover from navigation or use `--pages`.
+If `--diff` mode:
+```bash
+git diff $(gh pr view --json baseRefName -q .baseRefName 2>/dev/null || gh repo view --json defaultBranchRef -q .defaultBranchRef.name 2>/dev/null || echo main)...HEAD --name-only
+### Phase 3: Performance Data Collection
+For each page, collect comprehensive performance metrics:
+```bash
+$B goto <page-url>
+$B perf
+Then gather detailed metrics via JavaScript:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('navigation')[0])"
+Extract key metrics:
+- **TTFB** (Time to First Byte): `responseStart - requestStart`
+- **FCP** (First Contentful Paint): from PerformanceObserver or `paint` entries
+- **LCP** (Largest Contentful Paint): from PerformanceObserver
+- **DOM Interactive**: `domInteractive - navigationStart`
+- **DOM Complete**: `domComplete - navigationStart`
+- **Full Load**: `loadEventEnd - navigationStart`
+Resource analysis:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('resource').map(r => ({name: r.name.split('/').pop().split('?')[0], type: r.initiatorType, size: r.transferSize, duration: Math.round(r.duration)})).sort((a,b) => b.duration - a.duration).slice(0,15))"
+Bundle size check:
+```bash
+$B eval "JSON.stringify(performance.getEntriesByType('resource').filter(r => r.initiatorType === 'script').map(r => ({name: r.name.split('/').pop().split('?')[0], size: r.transferSize})))"
+$B eval "JSON.stringify(performance.getEntriesByType('resource').filter(r => r.initiatorType === 'css').map(r => ({name: r.name.split('/').pop().split('?')[0], size: r.transferSize})))"
+Network summary:
+```bash
+$B eval "(() => { const r = performance.getEntriesByType('resource'); return JSON.stringify({total_requests: r.length, total_transfer: r.reduce((s,e) => s + (e.transferSize||0), 0), by_type: Object.entries(r.reduce((a,e) => { a[e.initiatorType] = (a[e.initiatorType]||0) + 1; return a; }, {})).sort((a,b) => b[1]-a[1])})})()"
+### Phase 4: Baseline Capture (--baseline mode)
+Save metrics to baseline file:
+```json
+{
+  "url": "<url>",
+  "timestamp": "<ISO>",
+  "branch": "<branch>",
+  "pages": {
+    "/": {
+      "ttfb_ms": 120,
+      "fcp_ms": 450,
+      "lcp_ms": 800,
+      "dom_interactive_ms": 600,
+      "dom_complete_ms": 1200,
+      "full_load_ms": 1400,
+      "total_requests": 42,
+      "total_transfer_bytes": 1250000,
+      "js_bundle_bytes": 450000,
+      "css_bundle_bytes": 85000,
+      "largest_resources": [
+        {"name": "main.js", "size": 320000, "duration": 180},
+        {"name": "vendor.js", "size": 130000, "duration": 90}
+      ]
+    }
+  }
+}
+Write to `.gstack/benchmark-reports/baselines/baseline.json`.
+### Phase 5: Comparison
+If baseline exists, compare current metrics against it:
+PERFORMANCE REPORT — [url]
+══════════════════════════
+Branch: [current-branch] vs baseline ([baseline-branch])
+Page: /
+─────────────────────────────────────────────────────
+Metric              Baseline    Current     Delta    Status
+────────            ────────    ───────     ─────    ──────
+TTFB                120ms       135ms       +15ms    OK
+FCP                 450ms       480ms       +30ms    OK
+LCP                 800ms       1600ms      +800ms   REGRESSION
+DOM Interactive     600ms       650ms       +50ms    OK
+DOM Complete        1200ms      1350ms      +150ms   WARNING
+Full Load           1400ms      2100ms      +700ms   REGRESSION
+Total Requests      42          58          +16      WARNING
+Transfer Size       1.2MB       1.8MB       +0.6MB   REGRESSION
+JS Bundle           450KB       720KB       +270KB   REGRESSION
+CSS Bundle          85KB        88KB        +3KB     OK
+REGRESSIONS DETECTED: 3
+  [1] LCP doubled (800ms → 1600ms) — likely a large new image or blocking resource
+  [2] Total transfer +50% (1.2MB → 1.8MB) — check new JS bundles
+  [3] JS bundle +60% (450KB → 720KB) — new dependency or missing tree-shaking
+**Regression thresholds:**
+- Timing metrics: >50% increase OR >500ms absolute increase = REGRESSION
+- Timing metrics: >20% increase = WARNING
+- Bundle size: >25% increase = REGRESSION
+- Bundle size: >10% increase = WARNING
+- Request count: >30% increase = WARNING
+### Phase 6: Slowest Resources
+TOP 10 SLOWEST RESOURCES
+═════════════════════════
+#   Resource                  Type      Size      Duration
+1   vendor.chunk.js          script    320KB     480ms
+2   main.js                  script    250KB     320ms
+3   hero-image.webp          img       180KB     280ms
+4   analytics.js             script    45KB      250ms    ← third-party
+5   fonts/inter-var.woff2    font      95KB      180ms
+...
+RECOMMENDATIONS:
+- vendor.chunk.js: Consider code-splitting — 320KB is large for initial load
+- analytics.js: Load async/defer — blocks rendering for 250ms
+- hero-image.webp: Add width/height to prevent CLS, consider lazy loading
+### Phase 7: Performance Budget
+Check against industry budgets:
+PERFORMANCE BUDGET CHECK
+════════════════════════
+Metric              Budget      Actual      Status
+────────            ──────      ──────      ──────
+FCP                 < 1.8s      0.48s       PASS
+LCP                 < 2.5s      1.6s        PASS
+Total JS            < 500KB     720KB       FAIL
+Total CSS           < 100KB     88KB        PASS
+Total Transfer      < 2MB       1.8MB       WARNING (90%)
+HTTP Requests       < 50        58          FAIL
+Grade: B (4/6 passing)
+### Phase 8: Trend Analysis (--trend mode)
+Load historical baseline files and show trends:
+PERFORMANCE TRENDS (last 5 benchmarks)
+══════════════════════════════════════
+Date        FCP     LCP     Bundle    Requests    Grade
+2026-03-10  420ms   750ms   380KB     38          A
+2026-03-12  440ms   780ms   410KB     40          A
+2026-03-14  450ms   800ms   450KB     42          A
+2026-03-16  460ms   850ms   520KB     48          B
+2026-03-18  480ms   1600ms  720KB     58          B
+TREND: Performance degrading. LCP doubled in 8 days.
+       JS bundle growing 50KB/week. Investigate.
+### Phase 9: Save Report
+Write to `.gstack/benchmark-reports/{date}-benchmark.md` and `.gstack/benchmark-reports/{date}-benchmark.json`.
+## Important Rules
+- **Measure, don't guess.** Use actual performance.getEntries() data, not estimates.
+- **Baseline is essential.** Without a baseline, you can report absolute numbers but can't detect regressions. Always encourage baseline capture.
+- **Relative thresholds, not absolute.** 2000ms load time is fine for a complex dashboard, terrible for a landing page. Compare against YOUR baseline.
+- **Third-party scripts are context.** Flag them, but the user can't fix Google Analytics being slow. Focus recommendations on first-party resources.
+- **Bundle size is the leading indicator.** Load time varies with network. Bundle size is deterministic. Track it religiously.
+- **Read-only.** Produce the report. Don't modify code unless explicitly asked.