@xcraftmind/mastermind 0.28.1 → 0.28.2
- package/README.md +1 -1
- package/package.json +9 -9
- package/share/agents/mastermind-auditor.md +76 -2
- package/share/agents/mastermind-critic.md +1 -0
- package/share/agents/mastermind-investigator.md +168 -0
- package/share/agents/mastermind-prompt-refiner.md +29 -10
- package/share/agents/mastermind-researcher.md +23 -4
- package/share/agents/mastermind-task-executor.md +29 -0
- package/share/skills/mastermind-prompt-refiner/SKILL.md +61 -8
- package/share/skills/mastermind-task-planning/SKILL.md +105 -3
- package/share/skills/mastermind-task-planning/references/design-review-packet.md +120 -0
- package/share/skills/mastermind-task-planning/references/spec-template.md +84 -4
- package/share/agents/mastermind-release.md +0 -442
- package/share/commands/api-shape-explorer.md +0 -107
- package/share/skills/doc-stub-sync/SKILL.md +0 -187
- package/share/skills/doc-stub-sync/references/error-handling.md +0 -79
- package/share/skills/doc-stub-sync/references/url-patterns.md +0 -83
- package/share/skills/doc-stub-sync/scripts/doc_update.py +0 -285
- package/share/skills/doc-stub-sync/scripts/requirements.txt +0 -2
- package/share/skills/flaky-finder/SKILL.md +0 -75
- package/share/skills/mastermind-incident-response/SKILL.md +0 -157
- package/share/skills/mastermind-incident-response/references/investigation-playbook.md +0 -174
- package/share/skills/mastermind-incident-response/references/postmortem-template.md +0 -184
- package/share/skills/mastermind-incident-response/references/triage-checklist.md +0 -118
- package/share/skills/pr-review/SKILL.md +0 -89
@@ -1,118 +0,0 @@
# Triage checklist — first 5 minutes

Reference for the [`mastermind-incident-response`](../SKILL.md) skill, Phase 1. Run through these questions / commands in parallel with talking to the user. Goal: enough information to choose a mitigation in Phase 2.

---

## What to ask the user (priority order)

1. **Symptom — what's observable?**
   - "What error are you / users seeing?"
   - Paste actual error message, not a paraphrase
   - Single sentence, ideally
   - Example: `❌ "the app is slow"` → `✓ "/api/messages returning 500 with 'connection refused' since 14:32"`

2. **Scope**
   - "How many users / what % of traffic / which surfaces?"
   - Distinguish: 1 user vs 1 region vs everyone
   - If unclear → ask: do we know yet?

3. **Severity classification** (pick one; if unclear, default to the next more severe level, e.g. sev2 → sev1):
   - **sev0** — total outage / safety / data loss; immediate
   - **sev1** — major surface broken; act within minutes
   - **sev2** — partial degradation; act within hours
   - **sev3** — cosmetic / single user; act within days
   - Severity drives whether to ask "should we page?" before doing anything else

4. **Timeline — when did this start?**
   - "What's the first timestamp you have?"
   - Convert to UTC immediately, write it down
   - This is what you'll correlate against deploys / git log

5. **What's been tried?**
   - Avoid duplicate effort
   - If user already rolled back / restarted / flipped a flag — note it
   - **Critical:** if a previous action made things WORSE, the next action shouldn't be the same kind
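
Item 4 asks for the first timestamp to be converted to UTC right away. A minimal sketch of that conversion, assuming GNU `date` (the `-d` flag behaves differently on BSD/macOS); the helper name `to_utc` is hypothetical and not part of the skill:

```bash
# Hypothetical helper: normalize a user-reported timestamp to UTC.
# Assumes GNU date; the input should carry its own offset or zone.
to_utc() {
  date -u -d "$1" +'%Y-%m-%d %H:%M UTC'
}

to_utc '2024-05-01 16:32 +0200'   # prints: 2024-05-01 14:32 UTC
```

Writing the converted value into the timeline straight away avoids mixing local times from different reporters later.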

---

## What to gather in parallel (mmcg / git / files)

Run these via `mastermind-researcher` subagent or directly — should take < 1 minute:

```bash
# What committed recently (might have shipped the issue)
git log --since='2 hours ago' --oneline

# What's deployed (if there's a deploy marker file or git tag)
git tag --sort=-creatordate | head -5
git log -10 --oneline

# What specs were finished recently (each task is a folder containing spec.md)
ls -lt .mastermind/tasks/ | head -10
ls -lt .mastermind/tasks/*/spec.md 2>/dev/null | head -10

# Are there in-progress specs that might be related?
git status -s

# Is the index reachable (am I working from stale info?)
mmcg_status

# Quick scan for the symptom — has this been seen before?
grep -i "<error string>" CONTEXT.md 2>/dev/null
```
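
The `--since='2 hours ago'` query is relative to the current time. Once a first-failure timestamp is known, the lookback can be anchored to it instead. A sketch under assumptions: GNU `date` is available, `window_start` is a hypothetical helper, and the two-hour default is arbitrary:

```bash
# Hypothetical helper: print an ISO 8601 UTC timestamp N hours (default 2)
# before the given start time. Assumes GNU date (-d and @epoch input).
window_start() {
  start_epoch=$(date -u -d "$1" +%s) || return 1
  date -u -d "@$((start_epoch - ${2:-2} * 3600))" +'%Y-%m-%dT%H:%M:%SZ'
}

window_start '2024-05-01 14:32 UTC'   # prints: 2024-05-01T12:32:00Z
```

The result could then feed the same `git log` query, for example `git log --since="$(window_start '2024-05-01 14:32 UTC')" --until='2024-05-01T14:32:00Z' --oneline`, which lists only commits that landed shortly before the incident began.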

---

## Severity-driven branching

After triage, the severity determines what comes next:

| Severity | What you do next |
|---|---|
| **sev0** | Ask user: "should we page on-call before continuing?" Then go to Phase 2 with the most conservative mitigation. |
| **sev1** | Go directly to Phase 2. Watch the clock — if no improvement in 10 min, escalate. |
| **sev2** | Phase 2 with more deliberation — rollback is still preferred if available. |
| **sev3** | Skip Phase 2's "stop bleeding" urgency. Go to Phase 3 investigation. The postmortem (Phase 4) may even be lightweight (a CONTEXT.md gotcha entry, not a full doc). |
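
The table reads naturally as a lookup. A rough sketch of it as shell functions, where `next_phase` and `escalate` are hypothetical names and the output strings are paraphrases of the table rows, not text the skill emits:

```bash
# Hypothetical helper: map a severity label to the next step in the table.
next_phase() {
  case "$1" in
    sev0) echo "ask about paging, then Phase 2 (most conservative mitigation)" ;;
    sev1) echo "Phase 2 now; escalate if no improvement in 10 min" ;;
    sev2) echo "Phase 2 with deliberation; prefer rollback" ;;
    sev3) echo "Phase 3 investigation; lightweight postmortem is fine" ;;
    *)    echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: move to the next more severe level when unsure.
escalate() {
  case "$1" in
    sev3) echo sev2 ;;
    sev2) echo sev1 ;;
    sev1|sev0) echo sev0 ;;
    *) return 1 ;;
  esac
}
```

For an incident that looks like sev2 but is not yet well understood, `next_phase "$(escalate sev2)"` would return the sev1 row.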

---

## What to write down (timeline starts now)

For the postmortem later, you'll need a timeline. Start it during triage:

```
14:32 UTC — first failure observed (per <source>)
14:36 UTC — user reported in #ops
14:38 UTC — incident response engaged
14:39 UTC — triage complete; sev1 declared; investigating recent deploys
...
```

Even rough timestamps are useful. The postmortem can refine them from logs later, but having any timeline beats reconstructing from memory.
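
One way to keep entries consistent is to format each line with a small helper. This is only a sketch: `timeline_entry` is a hypothetical name, GNU `date` is assumed, and the separator here is a plain hyphen:

```bash
# Hypothetical helper: format one timeline line from a timestamp and a note.
# Assumes GNU date; pass "now" as the first argument to stamp the current time.
timeline_entry() {
  stamp=$(date -u -d "$1" +'%H:%M UTC') || return 1
  shift
  printf '%s - %s\n' "$stamp" "$*"
}

timeline_entry '2024-05-01 16:39 +0200' 'triage complete; sev1 declared'
# prints: 14:39 UTC - triage complete; sev1 declared
```

During an incident, `timeline_entry now 'rolled back last deploy' >> timeline.txt` would append a line as each action happens (`timeline.txt` is an illustrative file name).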

---

## Anti-patterns during triage

- **Don't speculate about root cause yet.** Triage answers WHAT and WHEN, not WHY. WHY comes in Phase 3.
- **Don't write a fix yet.** Even if you're sure you know what's wrong, Phase 2 prefers rollback over hot patches.
- **Don't change scope unilaterally.** "While I'm here let me also fix X" is exactly the wrong instinct under time pressure.
- **Don't blame the deploy author** — the system allowed the bad deploy; that's the systemic issue, not the human.

---

## When to call this phase done

You're done with triage when you can answer all five:
1. ✓ What's the symptom? (concrete)
2. ✓ Who/what is affected? (scope)
3. ✓ Severity?
4. ✓ When did it start?
5. ✓ What's been tried?

…and you have either:
- A candidate mitigation in mind, OR
- An explicit "I don't know yet, going to Phase 3 first" decision

Move to Phase 2.
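
The five-answer exit condition can be sketched as a simple gate. `triage_done` is a hypothetical helper; it only checks that each answer is non-empty, not that it is accurate:

```bash
# Hypothetical check: succeed only when all five triage answers are present.
# Arguments, in order: symptom, scope, severity, start time, actions tried.
triage_done() {
  if [ "$#" -ne 5 ]; then
    echo "expected 5 answers, got $#" >&2
    return 1
  fi
  for answer in "$@"; do
    [ -n "$answer" ] || { echo "at least one answer is still empty" >&2; return 1; }
  done
}
```

"Nothing yet" is a valid answer to the fifth question; an empty string is not.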

@@ -1,89 +0,0 @@
---
name: pr-review
description: Review a pull request for correctness, security, design issues, and operational risk — staff-engineer style. Use when the user says "review my PR", "audit this diff", "check before merge", or pastes a PR URL.
metadata:
  version: 0.1.0
  authors:
    - mastermind
  tags:
    - code-review
model: opus
requires:
  - gh CLI (for fetching PR diffs from GitHub)
---

# PR Review

Reviews a pull request the way a staff engineer would: operational correctness and blast radius first, line-level style last. The goal is a short, prioritized list of issues, not a wall of nitpicks.

## When to use

- User says "review my PR", "audit this diff", "code review", "check before merge"
- User pastes a GitHub PR URL or a unified diff
- User asks "what's wrong with this change?"
- Do NOT use for design review of a system that doesn't exist yet — that's a different kind of review (one that doesn't yet have a paired skill in this repo).

## Prerequisites

- `gh` CLI installed and authenticated (only if reviewing from a PR URL)
- Repo checked out locally (for cross-file context)

## Steps

1. **Get the diff.** If given a URL: `gh pr diff <number>`. If given raw diff: use it as-is.
2. **Read the PR description.** What is the author trying to do? If unclear, ask — don't guess.
3. **Sort changed files by blast radius.** Migrations, auth, billing, public APIs → top. Tests, docs, internal helpers → bottom.
4. **For each file (high-blast first), check in order:**
   - **Correctness** — does it do what the description claims?
   - **Operational risk** — what happens at 10x scale? What if the network is slow? What if this runs concurrently?
   - **Security** — input validation, authz, secret handling, SQL/command injection.
   - **Error paths** — what's caught, what's swallowed, what propagates?
   - **Design** — is this the right place for this code? Does it duplicate something?
   - **Style** — only flag if it actually hurts readability.
5. **Compress findings.** Three high-confidence issues beat fifteen maybes. Drop anything you're <70% sure about.
6. **Write the report** in the format below.
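
Step 3 can be approximated with a path-based heuristic. The sketch below is an assumption-heavy illustration: the glob patterns, the three ranks, and the names `blast_rank` and `sort_by_blast_radius` are all hypothetical, and a real repository would need its own patterns:

```bash
# Hypothetical ranking: 0 = high blast radius, 1 = default, 2 = low.
# Low-risk patterns are checked first so that tests/auth_test.go ranks low.
blast_rank() {
  case "$1" in
    *test*|*docs/*|*.md) echo 2 ;;
    *migration*|*auth*|*billing*|*api/*) echo 0 ;;
    *) echo 1 ;;
  esac
}

# Read file paths on stdin, print them ordered high-blast first.
# sort -s keeps the input order within each rank.
sort_by_blast_radius() {
  while IFS= read -r path; do
    printf '%s\t%s\n' "$(blast_rank "$path")" "$path"
  done | sort -s -n -k1,1 | cut -f2-
}
```

With a recent `gh`, the changed paths could be piped in via `gh pr diff <number> --name-only | sort_by_blast_radius`. The ordering is only a starting point; path names say nothing about how widely a helper is called.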

## Outputs

A markdown report:

```markdown
## PR Review — <PR title>

### Must fix (blocks merge)
- **<file:line>** — <issue>. <Why it matters.> <Suggested fix in 1 sentence.>

### Should fix (before merge if possible)
- **<file:line>** — <issue>. <Why it matters.>

### Consider
- **<file:line>** — <smaller suggestion>.

### What looks good
- <1-2 specific things, not generic praise>
```

If there are no "Must fix" items, say so explicitly — silence reads as "I didn't check."
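
The "say so explicitly" rule is easy to enforce if sections are rendered by a helper that never prints an empty list. A minimal sketch, with `render_section` and the "None found." wording as assumptions not taken from the skill:

```bash
# Hypothetical helper: print one report section. An empty section is stated
# explicitly instead of being left blank.
render_section() {
  title="$1"
  shift
  printf '### %s\n' "$title"
  if [ "$#" -eq 0 ]; then
    printf '%s\n' '- None found.'
  else
    for item in "$@"; do
      printf '%s\n' "- $item"
    done
  fi
}

render_section 'Must fix (blocks merge)'
```

Called with no findings, as above, it prints the heading followed by `- None found.`, so a reader can tell the category was checked.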

## Examples

**Input:** `gh pr diff 1247` — a change to the rate limiter

**Output:**
```markdown
## PR Review — Add per-tenant rate limiting

### Must fix
- **src/limiter.go:88** — Counter is incremented before the limit check, so a request that exceeds the limit still counts toward the bucket. This makes the limit effectively `N-1`. Move the increment inside the `if !exceeded` branch.
- **src/limiter.go:142** — Redis call has no timeout. If Redis is slow, every request blocks. Add a 50ms context timeout.

### Should fix
- **src/limiter.go:55** — `tenantID` is read from a header without authentication. A client can spoof another tenant's ID and consume their bucket. Pull the tenant from the authenticated session instead.

### Consider
- **tests/limiter_test.go** — No test for the concurrent-increment race. Worth adding a `t.Parallel()` test with 100 goroutines.

### What looks good
- Clean separation between the policy (limits) and the mechanism (Redis ops).
- The metrics emission at `limiter.go:201` is exactly what oncall will want.
```