opencastle 0.7.0 → 0.8.1

Files changed (93)
  1. package/README.md +30 -3
  2. package/bin/cli.mjs +2 -0
  3. package/dist/cli/adapters/claude-code.d.ts +2 -5
  4. package/dist/cli/adapters/claude-code.d.ts.map +1 -1
  5. package/dist/cli/adapters/claude-code.js +12 -251
  6. package/dist/cli/adapters/claude-code.js.map +1 -1
  7. package/dist/cli/adapters/cursor.d.ts.map +1 -1
  8. package/dist/cli/adapters/cursor.js +3 -17
  9. package/dist/cli/adapters/cursor.js.map +1 -1
  10. package/dist/cli/adapters/frontmatter.d.ts +26 -0
  11. package/dist/cli/adapters/frontmatter.d.ts.map +1 -0
  12. package/dist/cli/adapters/frontmatter.js +40 -0
  13. package/dist/cli/adapters/frontmatter.js.map +1 -0
  14. package/dist/cli/adapters/index.d.ts +5 -0
  15. package/dist/cli/adapters/index.d.ts.map +1 -0
  16. package/dist/cli/adapters/index.js +9 -0
  17. package/dist/cli/adapters/index.js.map +1 -0
  18. package/dist/cli/adapters/opencode.d.ts +2 -5
  19. package/dist/cli/adapters/opencode.d.ts.map +1 -1
  20. package/dist/cli/adapters/opencode.js +12 -250
  21. package/dist/cli/adapters/opencode.js.map +1 -1
  22. package/dist/cli/adapters/single-file-base.d.ts +40 -0
  23. package/dist/cli/adapters/single-file-base.d.ts.map +1 -0
  24. package/dist/cli/adapters/single-file-base.js +246 -0
  25. package/dist/cli/adapters/single-file-base.js.map +1 -0
  26. package/dist/cli/dashboard.d.ts.map +1 -1
  27. package/dist/cli/dashboard.js +3 -2
  28. package/dist/cli/dashboard.js.map +1 -1
  29. package/dist/cli/detect.d.ts.map +1 -1
  30. package/dist/cli/detect.js +13 -11
  31. package/dist/cli/detect.js.map +1 -1
  32. package/dist/cli/doctor.d.ts +3 -0
  33. package/dist/cli/doctor.d.ts.map +1 -0
  34. package/dist/cli/doctor.js +205 -0
  35. package/dist/cli/doctor.js.map +1 -0
  36. package/dist/cli/init.d.ts.map +1 -1
  37. package/dist/cli/init.js +31 -19
  38. package/dist/cli/init.js.map +1 -1
  39. package/dist/cli/run/schema.d.ts +1 -5
  40. package/dist/cli/run/schema.d.ts.map +1 -1
  41. package/dist/cli/run/schema.js +6 -330
  42. package/dist/cli/run/schema.js.map +1 -1
  43. package/dist/cli/run.d.ts.map +1 -1
  44. package/dist/cli/run.js +14 -1
  45. package/dist/cli/run.js.map +1 -1
  46. package/dist/cli/types.d.ts +0 -5
  47. package/dist/cli/types.d.ts.map +1 -1
  48. package/dist/cli/update.d.ts.map +1 -1
  49. package/dist/cli/update.js +4 -17
  50. package/dist/cli/update.js.map +1 -1
  51. package/package.json +7 -2
  52. package/src/cli/adapters/claude-code.ts +13 -304
  53. package/src/cli/adapters/cursor.ts +3 -23
  54. package/src/cli/adapters/frontmatter.ts +47 -0
  55. package/src/cli/adapters/index.ts +13 -0
  56. package/src/cli/adapters/opencode.ts +12 -301
  57. package/src/cli/adapters/single-file-base.ts +320 -0
  58. package/src/cli/dashboard.ts +3 -2
  59. package/src/cli/detect.ts +19 -15
  60. package/src/cli/doctor.ts +235 -0
  61. package/src/cli/init.ts +31 -24
  62. package/src/cli/run/schema.ts +7 -365
  63. package/src/cli/run.ts +17 -1
  64. package/src/cli/types.ts +0 -6
  65. package/src/cli/update.ts +5 -23
  66. package/src/dashboard/dist/_astro/{index.CWVzbF4T.css → index.Bnq19_1M.css} +1 -1
  67. package/src/dashboard/dist/index.html +170 -11
  68. package/src/dashboard/node_modules/.vite/deps/_metadata.json +6 -6
  69. package/src/dashboard/seed-data/reviews.ndjson +6 -0
  70. package/src/dashboard/src/pages/index.astro +213 -10
  71. package/src/dashboard/src/styles/dashboard.css +196 -0
  72. package/src/orchestrator/agent-workflows/bug-fix.md +2 -2
  73. package/src/orchestrator/agent-workflows/data-pipeline.md +8 -8
  74. package/src/orchestrator/agent-workflows/database-migration.md +2 -2
  75. package/src/orchestrator/agent-workflows/feature-implementation.md +12 -5
  76. package/src/orchestrator/agent-workflows/performance-optimization.md +2 -2
  77. package/src/orchestrator/agent-workflows/refactoring.md +2 -2
  78. package/src/orchestrator/agent-workflows/schema-changes.md +2 -2
  79. package/src/orchestrator/agent-workflows/security-audit.md +2 -2
  80. package/src/orchestrator/agents/data-expert.agent.md +2 -2
  81. package/src/orchestrator/agents/researcher.agent.md +0 -16
  82. package/src/orchestrator/agents/team-lead.agent.md +17 -6
  83. package/src/orchestrator/customizations/AGENT-PERFORMANCE.md +1 -3
  84. package/src/orchestrator/prompts/bootstrap-customizations.prompt.md +1 -1
  85. package/src/orchestrator/prompts/bug-fix.prompt.md +11 -6
  86. package/src/orchestrator/prompts/implement-feature.prompt.md +9 -4
  87. package/src/orchestrator/prompts/quick-refinement.prompt.md +9 -5
  88. package/src/orchestrator/prompts/resolve-pr-comments.prompt.md +18 -4
  89. package/src/orchestrator/skills/agent-hooks/SKILL.md +4 -2
  90. package/src/orchestrator/skills/fast-review/SKILL.md +15 -4
  91. package/src/orchestrator/skills/self-improvement/SKILL.md +1 -1
  92. package/src/orchestrator/skills/validation-gates/SKILL.md +152 -15
  93. package/src/orchestrator/prompts/metrics-report.prompt.md +0 -144
package/src/orchestrator/skills/fast-review/SKILL.md +15 -4
@@ -123,7 +123,9 @@ CONFIDENCE: low | medium | high
123
123
  **Auto-PASS conditions (skip reviewer):**
124
124
  - The delegation was pure research/exploration with no code changes
125
125
  - The delegation only modified documentation files (`.md`)
126
- - All deterministic gates already passed AND the change is ≤10 lines across ≤2 files
126
+ - All deterministic gates already passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see validation-gates Gate 3 sensitive file list)
127
+
128
+ > **Sensitive file override:** Changes to auth/middleware files, database migrations, RLS policies, security headers, CSP configuration, environment variable schemas, or CI/CD configuration **always** require a reviewer — even for 1-line changes. Auto-PASS never applies to these files.
127
129
 
128
130
  ### Step 4: Handle Verdict
129
131
 
@@ -247,14 +249,23 @@ Fast review sits between the agent's output and the Team Lead's acceptance:
247
249
  Agent completes work
248
250
 
249
251
 
250
- Deterministic checks (lint, test, build) ← validation-gates Gate 1
252
+ Secret Scanning ← validation-gates Gate 1
253
+
254
+
255
+ Deterministic checks (lint, test, build) ← validation-gates Gate 2
256
+
257
+
258
+ Blast Radius Check ← validation-gates Gate 3
259
+
260
+
261
+ Dependency Audit (if packages changed) ← validation-gates Gate 4
251
262
 
252
263
 
253
- Fast Review (this skill) ← validation-gates Gate 1.5
264
+ Fast Review (this skill) ← validation-gates Gate 5
254
265
 
255
266
  ├── PASS → Accept, move to next task
256
267
  ├── FAIL → Retry loop (up to 2x)
257
- └── 3x FAIL → Escalate to Panel (Gate 5)
268
+ └── 3x FAIL → Escalate to Panel (Gate 9)
258
269
  ```
259
270
 
260
271
  ### Relationship to on-post-delegate Hook
package/src/orchestrator/skills/self-improvement/SKILL.md +1 -1
@@ -50,7 +50,7 @@ A lesson MUST be written when **any** of these triggers occur:
50
50
  echo '{"timestamp":"'$(date -u +%Y-%m-%dT%H:%M:%SZ)'","agent":"Agent Name","model":"model-id","task":"Short description","outcome":"success","files_changed":N,"retries":0}' >> .github/customizations/logs/sessions.ndjson
51
51
  ```
52
52
 
53
- This is **mandatory** — session logging fuels the metrics dashboard (`metrics-report` prompt).
53
+ This is **mandatory** — session logging fuels the observability dashboard (`npx opencastle dashboard`).
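As a rough illustration of the logging step, the same record could be built and appended from Python rather than with `echo`. This is a sketch only: the helper names `session_line` and `append_session` are hypothetical and not part of OpenCastle; the field names and the log path are taken from the command above.

```python
import json
from datetime import datetime, timezone

# Path taken from the echo command above.
SESSIONS_LOG = ".github/customizations/logs/sessions.ndjson"


def session_line(agent, model, task, outcome, files_changed, retries=0):
    """Build one NDJSON line (a JSON object followed by a newline)."""
    record = {
        # Same UTC format as `date -u +%Y-%m-%dT%H:%M:%SZ`.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "agent": agent,
        "model": model,
        "task": task,
        "outcome": outcome,
        "files_changed": files_changed,
        "retries": retries,
    }
    return json.dumps(record) + "\n"


def append_session(line, path=SESSIONS_LOG):
    """Append the line to the session log; existing entries are left intact."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(line)
```

Building the line with `json.dumps` sidesteps the shell quoting in the one-liner and guarantees that `files_changed` is written as a number.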
54
54
 
55
55
  ## How to Write a Lesson
56
56
 
package/src/orchestrator/skills/validation-gates/SKILL.md +152 -15
@@ -1,6 +1,6 @@
1
1
  ---
2
2
  name: validation-gates
3
- description: "Shared validation gates for all orchestration workflows — deterministic checks, browser testing, cache management, regression checks. Referenced by prompt templates to maintain single source of truth."
3
+ description: "Shared validation gates for all orchestration workflows — secret scanning, deterministic checks, blast radius analysis, dependency auditing, browser testing, cache management, regression checks, and final smoke tests. Referenced by prompt templates to maintain single source of truth."
4
4
  ---
5
5
 
6
6
  <!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .github/customizations/ directory instead. -->
@@ -9,7 +9,57 @@ description: "Shared validation gates for all orchestration workflows — determ
9
9
 
10
10
  Canonical reference for validation gates shared across all orchestration workflows. Prompt templates reference this skill to avoid duplication.
11
11
 
12
- ## Gate 1: Deterministic Checks
12
+ **Gate summary:**
13
+
14
+ | Gate | Name | Runs When |
15
+ |------|------|-----------|
16
+ | 1 | Secret Scanning | Every delegation |
17
+ | 2 | Deterministic Checks | Every delegation |
18
+ | 3 | Blast Radius Check | Every delegation |
19
+ | 4 | Dependency Audit | When `package.json` or lockfiles change |
20
+ | 5 | Fast Review | Every delegation (with auto-PASS exceptions) |
21
+ | 6 | Cache Clearing | Before browser testing |
22
+ | 7 | Browser Testing | UI changes |
23
+ | 8 | Regression Testing | Every delegation |
24
+ | 9 | Panel Review | High-stakes changes only |
25
+ | 10 | Final Smoke Test | Feature completion (after all tasks Done) |
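One way to read the "Runs When" column is as a selection function over the properties of a delegation. The sketch below is illustrative and not OpenCastle code: `applicable_gates` and its boolean parameters are hypothetical names, and it assumes cache clearing (Gate 6) runs only when browser testing (Gate 7) does.

```python
def applicable_gates(packages_changed, ui_changed, high_stakes, feature_complete):
    """Return the gate numbers that apply, in execution order."""
    gates = [1, 2, 3]            # secret scan, deterministic checks, blast radius
    if packages_changed:
        gates.append(4)          # dependency audit
    gates.append(5)              # fast review (auto-PASS is decided inside the gate)
    if ui_changed:
        gates.extend([6, 7])     # cache clearing, then browser testing
    gates.append(8)              # regression testing
    if high_stakes:
        gates.append(9)          # panel review
    if feature_complete:
        gates.append(10)         # final smoke test, once per feature
    return gates
```

For a small backend-only delegation this yields `[1, 2, 3, 5, 8]`.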
26
+
27
+ ---
28
+
29
+ ## Gate 1: Secret Scanning
30
+
31
+ > **HARD GATE — Constitution rule #1.** No tokens, keys, passwords, or connection strings in code, logs, commits, or terminal output.
32
+
33
+ Scan every diff **before** any other gate. A secret leak caught after merge is exponentially more expensive than one caught at review time.
34
+
35
+ ### What to scan
36
+
37
+ Run a regex scan of all changed files for patterns that match common secret formats:
38
+
39
+ ```bash
40
+ # Scan staged/changed files for common secret patterns
41
+ grep -rn -E '(AKIA[0-9A-Z]{16}|sk-[a-zA-Z0-9]{20,}|ghp_[a-zA-Z0-9]{36}|glpat-[a-zA-Z0-9\-]{20}|xox[bpors]-[a-zA-Z0-9\-]+|eyJ[a-zA-Z0-9]{10,}\.[a-zA-Z0-9]{10,}|-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----|mongodb(\+srv)?://[^[:space:]]+|postgres(ql)?://[^[:space:]]+|mysql://[^[:space:]]+|redis://[^[:space:]]+)' <changed-files>
42
+ ```
43
+
44
+ Also check for:
45
+ - Hardcoded `password`, `secret`, `api_key`, `apiKey`, `token` assignments (not just references)
46
+ - `.env` file contents copied into source files
47
+ - Base64-encoded secrets (common obfuscation attempt)
48
+
49
+ ### On detection
50
+
51
+ - **BLOCK immediately** — do not proceed to Gate 2
52
+ - Flag the specific file and line number
53
+ - Re-delegate to the agent with explicit instruction to use environment variables instead
54
+ - If a secret was already committed, **rotate it immediately** — git history is permanent
55
+
56
+ ### Exceptions
57
+
58
+ - Test fixtures with obviously fake values (e.g., `sk-test-1234567890`)
59
+ - Documentation examples with placeholder values (e.g., `YOUR_API_KEY_HERE`)
60
+ - Pattern matches inside comments that are clearly explanatory
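A minimal sketch of the scan in Python, assuming the contents of each changed file are already loaded as text. The pattern set is a subset of the grep expression above; `SECRET_PATTERNS`, `scan_text`, and the label names are hypothetical. The exceptions listed here (fake fixtures, placeholders, explanatory comments) would still need human judgment on top of any match.

```python
import re

# Subset of the grep expression; keys are illustrative labels only.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "sk_token": re.compile(r"sk-[a-zA-Z0-9]{20,}"),
    "github_pat": re.compile(r"ghp_[a-zA-Z0-9]{36}"),
    "private_key": re.compile(r"-----BEGIN (RSA |EC |DSA )?PRIVATE KEY-----"),
    "db_url": re.compile(r"(mongodb(\+srv)?|postgres(ql)?|mysql|redis)://\S+"),
}


def scan_text(path, text):
    """Return (path, line_number, label) for every line that matches a pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((path, lineno, label))
    return findings
```

Any non-empty result would correspond to the BLOCK outcome: the tuple already carries the file and line number that the gate asks to flag.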
61
+
62
+ ## Gate 2: Deterministic Checks
13
63
 
14
64
  Run for every affected project (resolve exact commands via the **codebase-tool** skill):
15
65
 
@@ -19,31 +69,84 @@ Run for every affected project (resolve exact commands via the **codebase-tool**
19
69
 
20
70
  All must pass with zero errors. Run for **every** project that consumed modified files, not just the primary project.
21
71
 
22
- ## Gate 1.5: Fast Review (MANDATORY)
72
+ ## Gate 3: Blast Radius Check
73
+
74
+ Assess the scope of changes to catch scope creep and ensure reviewers can evaluate the diff effectively.
75
+
76
+ ### Thresholds
77
+
78
+ | Metric | Normal | Warning | Escalate |
79
+ |--------|--------|---------|----------|
80
+ | Lines changed | ≤200 | 201–500 | >500 |
81
+ | Files changed | ≤5 | 6–10 | >10 |
82
+ | Projects affected | ≤1 | 2 | >2 |
83
+
84
+ ### Actions
85
+
86
+ - **Normal** — proceed to Gate 4
87
+ - **Warning** — log a note in the delegation record. Ask: *"Was this scope expected?"* If yes, proceed. If unexpected, investigate whether the agent drifted from the partition
88
+ - **Escalate** — **STOP.** The Team Lead must review the diff before proceeding:
89
+ 1. Verify all changed files are within the agent's assigned partition
90
+ 2. Check whether the task should have been split into smaller subtasks
91
+ 3. If scope creep: revert extra changes, re-delegate with tighter scope
92
+ 4. If legitimately large: proceed, but **always run fast review** (no auto-PASS) and consider panel review
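The thresholds table can be sketched as a small classifier. The table does not say how to combine the three metrics, so this example assumes the most severe one wins; `classify_blast_radius` is a hypothetical name, not an OpenCastle function.

```python
def classify_blast_radius(lines_changed, files_changed, projects_affected):
    """Map diff size to 'normal', 'warning', or 'escalate'.

    Assumption: the worst of the three metrics decides the level.
    """
    if lines_changed > 500 or files_changed > 10 or projects_affected > 2:
        return "escalate"
    if lines_changed > 200 or files_changed > 5 or projects_affected > 1:
        return "warning"
    return "normal"
```

For example, a 50-line change spread over 11 files classifies as `escalate` because the file count alone crosses its threshold.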
93
+
94
+ ### Sensitive files
95
+
96
+ Changes to these file categories always trigger Warning regardless of line count:
97
+
98
+ - Auth/middleware files (e.g., `middleware.ts`, `auth.ts`, `**/auth/**`)
99
+ - Database migrations, RLS policies
100
+ - Security headers, CSP configuration (`next.config.*`, `vercel.json`)
101
+ - Environment variable schemas (`.env.example`, `env.ts`)
102
+ - CI/CD configuration (`.github/workflows/**`)
103
+ - Package manager configs (`package.json`, lockfiles) — also triggers Gate 4
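A possible way to detect these categories is glob matching on the changed paths. The sketch below is an assumption-heavy illustration: the function names are hypothetical, only the patterns spelled out in the list are encoded (plus the lockfile names mentioned under Gate 4), and migrations and RLS policies are left out because the list gives no path pattern for them.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Matched against the file name only.
SENSITIVE_NAMES = [
    "middleware.ts", "auth.ts", "next.config.*", "vercel.json",
    ".env.example", "env.ts", "package.json",
    "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
]
# Matched against the whole repo-relative path ('*' also spans '/').
SENSITIVE_PATHS = ["auth/*", "*/auth/*", ".github/workflows/*"]


def is_sensitive(path):
    """True if a repo-relative POSIX path falls into a sensitive category."""
    name = PurePosixPath(path).name
    if any(fnmatchcase(name, pattern) for pattern in SENSITIVE_NAMES):
        return True
    return any(fnmatchcase(path, pattern) for pattern in SENSITIVE_PATHS)


def apply_sensitive_override(level, changed_paths):
    """Raise a 'normal' blast-radius level to 'warning' if a sensitive file changed."""
    if level == "normal" and any(is_sensitive(p) for p in changed_paths):
        return "warning"
    return level
```

The override only raises `normal` to `warning`; a level that is already `warning` or `escalate` is returned unchanged.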
104
+
105
+ ## Gate 4: Dependency Audit
106
+
107
+ > Runs only when `package.json`, `yarn.lock`, `package-lock.json`, `pnpm-lock.yaml`, or similar lockfiles are modified.
108
+
109
+ When agents add, remove, or update npm packages, verify:
110
+
111
+ 1. **Vulnerability scan** — Run `npm audit` (or the project's equivalent). No new `high` or `critical` vulnerabilities
112
+ 2. **License compatibility** — New packages must use MIT, Apache-2.0, BSD-2-Clause, BSD-3-Clause, or ISC licenses. Flag any copyleft (GPL, LGPL, AGPL) or proprietary licenses for human review
113
+ 3. **Bundle size impact** — For frontend packages, note the minified + gzipped size. Flag packages >50KB gzipped that have lighter alternatives
114
+ 4. **Duplicate functionality** — Check whether the new dependency overlaps with an existing one (e.g., adding `moment` when `date-fns` is already installed)
115
+ 5. **Maintenance health** — Flag packages with no updates in >2 years or <100 weekly downloads
116
+
117
+ ### On failure
118
+
119
+ - **Vulnerability:** BLOCK. Re-delegate with instruction to use a patched version or alternative package
120
+ - **License concern:** Flag for human review. Do not block, but document in the PR description
121
+ - **Size/duplicate:** Flag as SHOULD-FIX in the fast review. Not blocking unless egregious (>200KB)
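The failure handling above can be sketched as a triage function over facts gathered elsewhere (audit output, license metadata, bundle size). Everything here is illustrative: `triage_dependency` and its parameters are hypothetical, the size rule treats anything over 200KB gzipped as blocking, and whether a lighter alternative exists is left for the reviewer to judge.

```python
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-2-Clause", "BSD-3-Clause", "ISC"}


def triage_dependency(license_id, new_high_or_critical, gzipped_kb):
    """Return (blocking, notes) for one newly added package."""
    blocking = False
    notes = []
    if new_high_or_critical > 0:
        blocking = True
        notes.append("BLOCK: new high/critical vulnerability")
    if license_id not in ALLOWED_LICENSES:
        # Not blocking; documented in the PR description for human review.
        notes.append("HUMAN-REVIEW: license " + license_id)
    if gzipped_kb > 200:
        blocking = True
        notes.append("BLOCK: egregious bundle size")
    elif gzipped_kb > 50:
        notes.append("SHOULD-FIX: check for a lighter alternative")
    return blocking, notes
```

A GPL-licensed but otherwise clean package therefore comes back as non-blocking with a single human-review note.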
122
+
123
+ ## Gate 5: Fast Review (MANDATORY)
23
124
 
24
125
  > **HARD GATE:** Every agent delegation output must pass fast review before acceptance. This is non-negotiable — even for overnight/unattended runs. Load the **fast-review** skill for the full procedure.
25
126
 
26
- After deterministic checks (Gate 1) pass:
127
+ After gates 1–4 pass:
27
128
 
28
129
  1. **Spawn a single reviewer sub-agent** with the review prompt from the fast-review skill
29
130
  2. **On PASS** — proceed to remaining gates
30
131
  3. **On FAIL** — re-delegate to the same agent with reviewer feedback (up to 2 retries)
31
- 4. **On 3x FAIL** — escalate to panel review (Gate 5)
132
+ 4. **On 3x FAIL** — escalate to panel review (Gate 9)
32
133
 
33
134
  The reviewer validates: acceptance criteria met, file partition respected, no regressions, type safety, error handling, security basics, and edge cases.
34
135
 
35
136
  **Auto-PASS conditions** (skip the reviewer sub-agent):
36
137
  - Pure research/exploration with no code changes
37
138
  - Only `.md` files were modified
38
- - All deterministic gates passed AND the change is ≤10 lines across ≤2 files
139
+ - All deterministic gates passed AND the change is ≤10 lines across ≤2 files AND **no sensitive files were touched** (see Gate 3 sensitive file list)
39
140
 
40
- ## Gate 2: Cache Clearing (BEFORE Browser Testing)
141
+ > **Sensitive file override:** If any changed file falls into the sensitive file categories listed in Gate 3 (auth, migrations, security headers, env schemas, CI/CD), auto-PASS is **never** applied — even for 1-line changes. These files always get a human-quality review.
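The auto-PASS rule and the retry limit can be sketched together. This is a hedged illustration, not the skill's implementation: `can_auto_pass`, `run_fast_review`, and the callables passed in are hypothetical, and the sensitive-file flag is assumed to be computed elsewhere from the Gate 3 list.

```python
def can_auto_pass(code_changed, only_markdown, gates_passed,
                  lines_changed, files_changed, touches_sensitive):
    """Decide whether the reviewer sub-agent may be skipped."""
    if touches_sensitive:
        return False                      # override: sensitive files always get a review
    if not code_changed or only_markdown:
        return True                       # research-only or documentation-only work
    return gates_passed and lines_changed <= 10 and files_changed <= 2


def run_fast_review(review, redelegate, max_retries=2):
    """Run review() up to 1 + max_retries times, re-delegating after each FAIL.

    review() returns 'PASS' or 'FAIL'; redelegate() sends the feedback back
    to the same agent. Three FAILs in a row escalate to panel review.
    """
    for attempt in range(max_retries + 1):
        if review() == "PASS":
            return "accepted"
        if attempt < max_retries:
            redelegate()
    return "escalate_to_panel"
```

With the default of two retries, a delegation is reviewed at most three times and re-delegated at most twice before the panel takes over.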
142
+
143
+ ## Gate 6: Cache Clearing (BEFORE Browser Testing)
41
144
 
42
145
  **Always clear before testing.** Testing stale code wastes time and produces false results.
43
146
 
44
147
  Clear framework caches and task runner caches before starting the dev server for browser testing. See the **codebase-tool** skill for cache-clearing commands.
45
148
 
46
- ## Gate 3: Browser Testing (MANDATORY for UI Changes)
149
+ ## Gate 7: Browser Testing (MANDATORY for UI Changes)
47
150
 
48
151
  > **HARD GATE:** A task with UI changes is NOT done until you have screenshots in Chrome proving the feature works. "The code looks correct" is not proof. "Tests pass" is not proof. Only a screenshot of the working UI in Chrome is proof.
49
152
 
@@ -59,7 +162,7 @@ Clear framework caches and task runner caches before starting the dev server for
59
162
 
60
163
  Load the **browser-testing** skill for Chrome MCP commands, breakpoint details, and reporting format.
61
164
 
62
- ## Gate 4: Regression Testing
165
+ ## Gate 8: Regression Testing
63
166
 
64
167
  New features must not break existing functionality:
65
168
 
@@ -68,7 +171,7 @@ New features must not break existing functionality:
68
171
  3. **Verify navigation** — Ensure routing, links, and back-button behavior still work
69
172
  4. **Check shared components** — If a component from a shared library was modified, test it in all apps that consume it
70
173
 
71
- ## Gate 5: Panel Review (High-Stakes Only)
174
+ ## Gate 9: Panel Review (High-Stakes Only)
72
175
 
73
176
  Use the **panel-majority-vote** skill for:
74
177
 
@@ -79,16 +182,50 @@ Use the **panel-majority-vote** skill for:
79
182
 
80
183
  If the panel returns BLOCK, extract MUST-FIX items, re-delegate to the same agent, and re-run the panel. Never skip, never halt. Max 3 attempts, then escalate to Architect.
81
184
 
185
+ ## Gate 10: Final Smoke Test (Feature-Level)
186
+
187
+ > Runs once after ALL tasks in a feature are Done — not per-task.
188
+
189
+ Individual tasks pass gates 1–9 independently. But the combined result may have integration issues that per-task testing misses. This gate verifies the feature as a cohesive unit.
190
+
191
+ ### Steps
192
+
193
+ 1. **Full build** — Build all affected projects from clean state (not incremental)
194
+ 2. **Full test suite** — Run tests across all projects that consumed any changed files
195
+ 3. **End-to-end browser walkthrough** — Navigate the complete user flow from start to finish:
196
+ - Verify all states: loading, empty, populated, error, partial
197
+ - Test every state transition end-to-end (not just individual screens)
198
+ - Confirm data flows correctly between pages/components
199
+ - Test the happy path AND at least one error path
200
+ 4. **Cross-task integration check** — Verify that outputs from different tasks (e.g., DB migration + component + page) compose correctly
201
+ 5. **Smoke test at all breakpoints** — If the feature has UI, one final responsive sweep
202
+
203
+ ### When to skip
204
+
205
+ - Non-UI features with comprehensive test coverage (e.g., pure backend/data pipeline work where tests verify integration)
206
+ - Single-task features (Gate 8 already covers regression)
207
+
208
+ ### On failure
209
+
210
+ Re-delegate the specific failing integration point to the agent responsible for that layer. Do NOT re-run the entire feature implementation.
211
+
212
+ ---
213
+
82
214
  ## Universal Completion Checklist
83
215
 
84
216
  Use this checklist for any orchestration workflow:
85
217
 
86
- - [ ] Lint, test, and build pass for all affected projects
87
- - [ ] **Fast review passed** (mandatory load **fast-review** skill)
88
- - [ ] Dev server started with **clean cache** (clear framework + task runner caches — see the **codebase-tool** skill)
89
- - [ ] UI changes verified in Chrome with screenshots at all breakpoints
218
+ - [ ] **No secrets in diff** (Gate 1)
219
+ - [ ] Lint, test, and build pass for all affected projects (Gate 2)
220
+ - [ ] Blast radius assessed; scope is expected (Gate 3)
221
+ - [ ] Dependency audit passed if packages changed (Gate 4)
222
+ - [ ] **Fast review passed** (mandatory — load **fast-review** skill) (Gate 5)
223
+ - [ ] Dev server started with **clean cache** (Gate 6)
224
+ - [ ] UI changes verified in Chrome with screenshots at all breakpoints (Gate 7)
90
225
  - [ ] Every acceptance criteria item visually confirmed — not just "page loads"
91
- - [ ] No regressions in adjacent functionality
226
+ - [ ] No regressions in adjacent functionality (Gate 8)
227
+ - [ ] Panel review passed for high-stakes changes (Gate 9)
228
+ - [ ] **Final smoke test passed** for multi-task features (Gate 10)
92
229
  - [ ] Shared code changes tested across all consuming apps
93
230
  - [ ] No duplicated code — shared logic extracted to libraries
94
231
  - [ ] Lessons learned captured if any retries occurred
package/src/orchestrator/prompts/metrics-report.prompt.md +0 -144
@@ -1,144 +0,0 @@
1
- ---
2
- description: 'Collect and report metrics from agent logs, GitHub PRs, tracker issues, and deployments'
3
- agent: Researcher
4
- ---
5
-
6
- <!-- ⚠️ This file is managed by OpenCastle. Edits will be overwritten on update. Customize in the .github/customizations/ directory instead. -->
7
-
8
- # Metrics Report
9
-
10
- Generate a comprehensive metrics dashboard from all project data sources.
11
-
12
- ## Data Sources
13
-
14
- Collect data from ALL of these sources. Run collections in parallel where possible.
15
-
16
- ### 1. Agent Session Logs (local)
17
-
18
- Read `.github/customizations/logs/sessions.ndjson` and `.github/customizations/logs/delegations.ndjson`.
19
-
20
- Compute:
21
- - **Total sessions** and **sessions per agent**
22
- - **Success rate** — `outcome` field breakdown (success / partial / failed)
23
- - **Retries per session** — average and total
24
- - **Lessons added** — count and which agents contribute most
25
- - **Delegation stats** — mechanism (sub-agent vs background), tier distribution, success rate per agent
26
- - **Model usage** — which models used how often
27
- - **Activity timeline** — sessions per day/week
28
-
29
- ### 2. GitHub PRs and Commits
30
-
31
- Use `gh` CLI commands (always prefix with `GH_PAGER=cat`):
32
-
33
- ```bash
34
- # All PRs (open + closed + merged)
35
- GH_PAGER=cat gh pr list --state all --limit 100 --json number,title,state,createdAt,mergedAt,closedAt,author,additions,deletions,changedFiles,labels,headRefName
36
-
37
- # Recent commits on main
38
- GH_PAGER=cat gh api repos/{owner}/{repo}/commits --paginate -q '.[0:50] | .[] | {sha: .sha[0:7], date: .commit.author.date, message: .commit.message}' 2>/dev/null || git --no-pager log main --oneline -50
39
- ```
40
-
41
- Compute:
42
- - **PR count** — total, open, merged, closed-without-merge
43
- - **Merge rate** — merged / (merged + closed-without-merge)
44
- - **Time to merge** — median and average (createdAt → mergedAt)
45
- - **PR size** — average additions, deletions, changedFiles
46
- - **Commit frequency** — commits per day/week on main
47
- - **Bogus/closed PRs** — PRs closed without merge (potential failed agent work)
48
-
49
- ### 3. Tracker Issues
50
-
51
- Use tracker MCP tools (`list_issues`, `search_issues`):
52
-
53
- ```
54
- list_issues with status filter for each state: Backlog, Todo, In Progress, Done, Cancelled
55
- ```
56
-
57
- Compute:
58
- - **Issue count by status** — Backlog, Todo, In Progress, Done, Cancelled
59
- - **Completion rate** — Done / (Done + Cancelled + In Progress + Todo)
60
- - **Issues by label** — which areas have the most work
61
- - **Issues by priority** — distribution across Urgent/High/Medium/Low
62
- - **Cycle time** — average time from In Progress → Done (if dates available)
63
- - **Stale issues** — In Progress for >7 days without updates
64
-
65
- ### 4. Deployments
66
-
67
- Use deployment platform tools (if available via MCP or CLI):
68
-
69
- Query deployments for all configured apps (see `project.instructions.md` for the app inventory).
70
-
71
- Compute:
72
- - **Total deployments** — count over last 30 days
73
- - **Deployment success rate** — ready / (ready + error + cancelled)
74
- - **Failure rate** — error / total
75
- - **Build times** — average, median, p95
76
- - **Deployments per day** — activity timeline
77
- - **Failed deployment details** — which commits/branches failed and why
78
-
79
- ### 5. Panel Reviews (local)
80
-
81
- Read `.github/customizations/logs/panels.ndjson`.
82
-
83
- Compute:
84
- - **Total reviews** — count of panel runs
85
- - **Pass rate** — pass / total
86
- - **Must-fix vs should-fix** — average counts per review
87
- - **Retry rate** — reviews with attempt > 1
88
- - **Model usage** — which reviewer models used
89
- - **Reviews by panel key** — what gets reviewed most
90
-
91
- ### 6. Agent Failures (DLQ)
92
-
93
- Read `.github/customizations/AGENT-FAILURES.md`.
94
-
95
- Compute:
96
- - **Total failures** — count of DLQ entries
97
- - **Failures by agent** — which agents fail most
98
- - **Failure status** — pending vs resolved
99
- - **Common root causes** — categorize failure reasons
100
-
101
- ## Report Format
102
-
103
- Present the report as a structured markdown summary with these sections:
104
-
105
- ```markdown
106
- # Project Metrics Dashboard
107
- > Generated: {date} | Period: Last 30 days
108
-
109
- ## Executive Summary
110
- - X agent sessions, Y% success rate
111
- - Z PRs merged, W% merge rate
112
- - N deployments, M% success rate
113
- - P tracker issues completed
114
-
115
- ## Agent Activity
116
- {sessions table, success rates, model usage}
117
-
118
- ## Delegation Performance
119
- {per-agent delegation stats, tier distribution}
120
-
121
- ## GitHub
122
- {PR stats, merge rates, commit frequency}
123
-
124
- ## Task Board
125
- {issue distribution, completion rate, stale issues}
126
-
127
- ## Deployments
128
- {success rate, failure rate, build times}
129
-
130
- ## Panel Reviews
131
- {pass rate, retry rate, must-fix/should-fix stats}
132
-
133
- ## Agent Failures (DLQ)
134
- {failure count, pending items}
135
-
136
- ## Trends & Recommendations
137
- {observations, areas for improvement}
138
- ```
139
-
140
- ## Usage
141
-
142
- Run this prompt periodically (weekly recommended) to track project health. Compare with previous reports to identify trends.
143
-
144
- If session logs are empty (no data yet), still collect GitHub/tracker/deployment data and note that agent logging has just been enabled.