gossipcat 0.1.0
- package/LICENSE +21 -0
- package/README.md +720 -0
- package/dist-dashboard/assets/banner.png +0 -0
- package/dist-dashboard/assets/gossip-mini.png +0 -0
- package/dist-dashboard/assets/gossipcat.png +0 -0
- package/dist-dashboard/assets/index-BvqFkH-m.css +1 -0
- package/dist-dashboard/assets/index-Dsv-K6u_.js +65 -0
- package/dist-dashboard/favicon.png +0 -0
- package/dist-dashboard/index.html +31 -0
- package/dist-mcp/default-rules/gossipcat-rules.md +135 -0
- package/dist-mcp/default-skills/api-design.md +32 -0
- package/dist-mcp/default-skills/catalog.json +101 -0
- package/dist-mcp/default-skills/ci-cd.md +32 -0
- package/dist-mcp/default-skills/code-review.md +40 -0
- package/dist-mcp/default-skills/debugging.md +42 -0
- package/dist-mcp/default-skills/documentation.md +31 -0
- package/dist-mcp/default-skills/frontend.md +32 -0
- package/dist-mcp/default-skills/implementation.md +39 -0
- package/dist-mcp/default-skills/infrastructure.md +32 -0
- package/dist-mcp/default-skills/memory-retrieval.md +43 -0
- package/dist-mcp/default-skills/research.md +44 -0
- package/dist-mcp/default-skills/security-audit.md +47 -0
- package/dist-mcp/default-skills/system-design.md +42 -0
- package/dist-mcp/default-skills/testing.md +38 -0
- package/dist-mcp/default-skills/typescript.md +32 -0
- package/dist-mcp/default-skills/ui-design.md +33 -0
- package/dist-mcp/default-skills/verification.md +49 -0
- package/dist-mcp/mcp-server.js +40878 -0
- package/package.json +50 -0
- package/scripts/postinstall.js +63 -0
Binary file
@@ -0,0 +1,31 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="UTF-8" />
+<meta name="viewport" content="width=device-width, initial-scale=1.0" />
+<title>Gossipcat — Multi-Agent AI Orchestration Dashboard</title>
+<meta name="description" content="Monitor and orchestrate your AI agent team. Real-time consensus reviews, performance signals, agent memory, and live task tracking — all in one place." />
+<meta name="keywords" content="AI agents, multi-agent orchestration, LLM, Claude, Gemini, code review, consensus, gossipcat" />
+<meta name="author" content="Gossipcat" />
+<meta name="robots" content="noindex, nofollow" />
+
+<!-- Open Graph -->
+<meta property="og:type" content="website" />
+<meta property="og:title" content="Gossipcat — Multi-Agent AI Orchestration Dashboard" />
+<meta property="og:description" content="Monitor and orchestrate your AI agent team. Real-time consensus reviews, performance signals, agent memory, and live task tracking." />
+<meta property="og:image" content="/dashboard/assets/gossip-mini.png" />
+
+<!-- Theme -->
+<meta name="theme-color" content="#8b5cf6" />
+<meta name="color-scheme" content="dark" />
+
+<link rel="icon" type="image/png" href="/dashboard/favicon.png" />
+<link rel="preconnect" href="https://fonts.googleapis.com" />
+<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700&family=JetBrains+Mono:wght@400;500;600;700&display=swap" rel="stylesheet" />
+<script type="module" crossorigin src="/dashboard/assets/index-Dsv-K6u_.js"></script>
+<link rel="stylesheet" crossorigin href="/dashboard/assets/index-BvqFkH-m.css">
+</head>
+<body>
+<div id="root"></div>
+</body>
+</html>
@@ -0,0 +1,135 @@
+# Gossipcat — Multi-Agent Orchestration
+
+The orchestrator role and dispatch rule (with exceptions) is loaded dynamically via `gossip_status()` — see the "## Your Role" section in its output. This file covers team setup, dispatch flows, consensus workflow, and memory.
+
+## Team Setup
+When the user asks to set up agents, review code with multiple agents, or build with a team, use the gossipcat MCP tools.
+
+### Creating agents
+Use `gossip_setup` with an agents array. Each agent can be:
+- **type: "native"** — Creates a Claude Code subagent (.claude/agents/*.md) that ALSO connects to the gossipcat relay. Works both as a native Agent() and via gossip_dispatch(). Supports consensus cross-review.
+- **type: "custom"** — Any provider (anthropic, openai, google, local). Only accessible via gossip_dispatch().
+
+**Native agent requirements:** Native agents need TWO files to work fully:
+1. `.gossip/config.json` entry — with explicit `skills` array and `"native": true`
+2. `.claude/agents/<id>.md` — with frontmatter (name, model, description, tools) and prompt
+
+`gossip_setup` creates both automatically. Mid-session agent changes require `/mcp` reconnect.
+
+### Dispatching work
+
+**Single-agent tasks** (default):
+```
+gossip_run(agent_id: "<id>", task: "Implement X")
+```
+`gossip_run` is the preferred dispatch. Do NOT use raw Agent() for gossipcat tasks.
+
+**Active polling — do NOT wait passively for notifications:**
+After dispatching background native agents, wait ~60-90 seconds, then actively check their status:
+1. Call `gossip_progress(task_ids: [...])` to see live completion state
+2. If complete, read the output file directly (path returned in Agent() response as `output_file`)
+3. Then call `gossip_relay(task_id, result)` immediately
+
+Do NOT sit idle waiting for task-notification events — the notification system can lag 5-10 minutes. Always poll proactively after a short wait.
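The polling rule above can be pictured as one pass over the outstanding task IDs. This is only an illustrative sketch, not gossipcat's implementation: `pollOnce`, `TaskState`, and the `getState` callback are hypothetical names, with `getState` standing in for whatever `gossip_progress` reports.

```typescript
// Hypothetical task states; the real shape of gossip_progress output is not shown in this diff.
type TaskState = "running" | "complete";

interface PollResult {
  readyToRelay: string[]; // finished tasks: read the output file, then call gossip_relay
  stillRunning: string[]; // check these again after another short wait
}

// One polling pass: split the task IDs by their current state.
function pollOnce(taskIds: string[], getState: (id: string) => TaskState): PollResult {
  return {
    readyToRelay: taskIds.filter((id) => getState(id) === "complete"),
    stillRunning: taskIds.filter((id) => getState(id) !== "complete"),
  };
}
```

A caller would repeat `pollOnce` after each short wait until `stillRunning` is empty, relaying each finished task as soon as it shows up in `readyToRelay`.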
+
+**Write modes:** `gossip_run(agent_id, task, write_mode: "scoped", scope: "./src")`
+**Parallel:** `gossip_dispatch(mode:"parallel", tasks) → gossip_collect(task_ids)`
+**Plan → Execute:** `gossip_plan(task) → gossip_dispatch(mode:"parallel", tasks) → gossip_collect(ids)`
+
+**Available agents and dispatch decision rules** are loaded dynamically from `gossip_status()` — call it for the live team roster, performance scores, and the multi-agent vs single-agent decision table. Do not duplicate that content here.
+
+## Consensus Workflow — The Complete Flow
+
+### Step 1: Dispatch
+```
+gossip_dispatch(mode: "consensus", tasks: [
+{ agent_id: "<reviewer>", task: "Review X for security" },
+{ agent_id: "<researcher>", task: "Review X for architecture" },
+{ agent_id: "<tester>", task: "Review X for test coverage" },
+])
+```
+
+### Step 2: Execute native agents, then relay results
+`gossip_relay(task_id: "<id>", result: "<agent output>")`
+
+### Step 3: Collect with cross-review
+`gossip_collect(task_ids, consensus: true, timeout_ms: 300000)`
+Returns: CONFIRMED, DISPUTED, UNIQUE, UNVERIFIED, NEW tagged findings.
+
+### Step 4: Verify and record signals IMMEDIATELY
+For EACH finding, read the actual code. Record signals AS YOU VERIFY:
+```
+gossip_signals(signals: [
+{ signal: "unique_confirmed", agent_id: "reviewer", finding: "XSS in template", finding_id: "<consensus_id>:reviewer:f1" },
+{ signal: "hallucination_caught", agent_id: "reviewer", finding: "Claimed X but code shows Y", finding_id: "<consensus_id>:reviewer:f2", evidence: "code at file.ts:42 shows Y not X" },
+{ signal: "agreement", agent_id: "reviewer", counterpart_id: "researcher", finding: "Both found it", finding_id: "<consensus_id>:reviewer:f3" },
+])
+```
+**CRITICAL:** Record `hallucination_caught` IMMEDIATELY when a finding is wrong. Don't batch — record inline as you verify. This keeps agent scores accurate.
+
+### Step 5: Verify ALL UNVERIFIED findings.
+UNVERIFIED does not mean "skip." It means the cross-reviewer couldn't check it — YOU can.
+For each UNVERIFIED finding: grep/read the cited code or identifiers, then record the signal.
+Do NOT present raw consensus results with unverified findings to the user.
+
+### Step 6: Fix confirmed issues (only after all signals recorded).
+
+⛔ **CHECKPOINT — do not proceed to fixes until signals are recorded.**
+If you find yourself writing code or editing files before calling `gossip_signals`, STOP.
+Signal recording is not optional cleanup — it is part of the verification step, not after it.
+The correct order is always: verify finding → record signal → next finding → ... → then fix.
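The checkpoint above amounts to a simple gate: no fix starts while any finding is still waiting for its signal. The sketch below is an assumption-laden illustration; `FindingStatus`, `makeFindingId`, and `canStartFixes` are hypothetical names, and only the `<consensus_id>:<agent_id>:f<n>` ID pattern is taken from the examples in Step 4.

```typescript
// Hypothetical bookkeeping for the verify -> record signal -> fix order.
interface FindingStatus {
  findingId: string;       // e.g. "<consensus_id>:reviewer:f1", as in the Step 4 examples
  signalRecorded: boolean; // true once gossip_signals has been called for this finding
}

// Build an ID in the "<consensus_id>:<agent_id>:f<n>" pattern used by the examples.
function makeFindingId(consensusId: string, agentId: string, index: number): string {
  return `${consensusId}:${agentId}:f${index}`;
}

// Fixes may start only when no finding is still waiting for its signal.
function canStartFixes(findings: FindingStatus[]): boolean {
  return findings.every((f) => f.signalRecorded);
}
```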
+
+## Performance Signals & Agent Scores
+
+Call `gossip_scores()` to see: accuracy (0-1), uniqueness (0-1), dispatchWeight (0.5-1.5).
+- High-accuracy agents → solo tasks, primary reviewers
+- High-uniqueness, low-accuracy → always use in consensus, never solo
+- Check scores periodically to track improvement
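The two routing bullets above could be turned into a small helper. This is a sketch under stated assumptions: the source gives score ranges but no thresholds, so the `0.7` cut-off, the `Role` labels, and `suggestRole` itself are all hypothetical.

```typescript
interface AgentScore {
  accuracy: number;   // 0-1, as reported by gossip_scores()
  uniqueness: number; // 0-1
}

type Role = "solo-or-primary" | "consensus-only" | "unclassified";

// Assumed cut-off for "high"; the rules file does not define one.
const HIGH_SCORE = 0.7;

function suggestRole(score: AgentScore): Role {
  if (score.accuracy >= HIGH_SCORE) return "solo-or-primary";  // high accuracy: solo tasks, primary reviewer
  if (score.uniqueness >= HIGH_SCORE) return "consensus-only"; // high uniqueness, low accuracy: never solo
  return "unclassified";
}
```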
+
+## Memory System
+
+Memory persists across sessions automatically:
+- `.gossip/agents/<id>/memory/knowledge/*.md` — cognitive summaries
+- `.gossip/agents/_project/memory/knowledge/` — shared cross-agent context
+- `.gossip/next-session.md` — session continuity priorities
+
+**Call `gossip_session_save()` before ending your session.** Without it, the next session starts cold.
+
+## Dashboard
+
+Use `gossip_status()` for URL and key. Tabs: Overview, Agents, Consensus, Skills, Memory.
+
+## Subagent Override (IMPORTANT)
+
+When ANY skill or workflow (including subagent-driven-development, executing-plans, or
+any custom skill) instructs you to dispatch work via the Agent() tool, use `gossip_run`
+instead. This ensures tasks appear in the dashboard, agent memory is written, and
+performance signals are recorded.
+
+**Flow:** `gossip_run(agent_id, task)` → returns Agent() instructions for native agents →
+execute the Agent() → `gossip_relay(task_id, result)` to close the loop.
+
+**Exception:** `gossip_dispatch(mode:"consensus")` already handles its own native Agent() calls —
+don't double-wrap those.
+
+**Why:** Raw Agent() bypasses the gossipcat pipeline. Tasks won't appear in the activity
+feed, no memory is written, no signals recorded. The agent effectively works off-grid.
+
+## Native Agent Relay Rule
+
+When dispatching native agents: gossip_dispatch → Agent() → gossip_relay. Never skip the relay call.
+
+## Permissions
+
+Auto-allow writes: `{ "permissions": { "allow": ["Edit", "Write", "Bash(npm *)"] } }`
+
+## Sandbox Enforcement
+
+`write_mode: "scoped"` and `write_mode: "worktree"` are **advisory** at the Claude Code harness layer. The Edit/Write tools accept absolute paths anywhere on the filesystem and do not enforce containment. Until that ships, gossipcat adds soft enforcement via two mitigations:
+
+1. **Prompt sanitization** — task descriptions for scoped/worktree dispatches are rewritten to use relative project paths before being handed to the Agent tool. Removes the most common accidental escape vector (the orchestrator embedding absolute paths out of habit).
+2. **Post-task path audit** — after the agent reports done, `gossip_relay` runs `git status --porcelain` and compares the modified files against the declared scope. Violations are recorded as `boundary_escape` entries in `.gossip/boundary-escapes.jsonl` and emit a `disagreement` signal with `category: "trust_boundaries"`.
+
+Configure via `sandboxEnforcement` in `.gossip/config.json`: `"off"` (skip both), `"warn"` (default — sanitize and audit, accept results with a warning), `"block"` (sanitize, audit, and refuse to record results that escape the boundary — task is marked failed).
+
+Both mitigations are best-effort. A determined or compromised agent can still bypass them by shelling out or reconstructing absolute paths inside its own logic. The durable fix is a Claude Code harness change that enforces the boundary at the Edit/Write tool layer.
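To make the post-task path audit concrete, here is a minimal sketch of comparing `git status --porcelain` output against a declared scope. It is not the code shipped in `mcp-server.js`; `changedPaths` and `findEscapes` are hypothetical helpers, the scope is assumed to be a subdirectory such as `./src`, and quoted paths (names with unusual characters) are deliberately left unhandled.

```typescript
// Extract paths from porcelain v1 output: two status characters, a space, then the path.
// Renames appear as "old -> new"; only the new path is kept.
function changedPaths(porcelain: string): string[] {
  return porcelain
    .split("\n")
    .filter((line) => line.length > 3)
    .map((line) => {
      const path = line.slice(3);
      const arrow = path.indexOf(" -> ");
      return arrow === -1 ? path : path.slice(arrow + 4);
    });
}

// Return every changed path that falls outside the declared scope directory.
function findEscapes(porcelain: string, scope: string): string[] {
  // Normalize "./src" or "src/" to the prefix "src/".
  const prefix = scope.replace(/^\.\//, "").replace(/\/+$/, "") + "/";
  return changedPaths(porcelain).filter((p) => p.indexOf(prefix) !== 0);
}
```

In `"warn"` mode a non-empty result would be logged and accepted; in `"block"` mode it would cause the task to be marked failed, as described above.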
@@ -0,0 +1,32 @@
+# API Design
+
+> Design REST APIs that are consistent, predictable, and easy to consume.
+
+## What You Do
+- Apply REST conventions correctly: resources, HTTP verbs, status codes
+- Design error responses that tell clients what went wrong and how to recover
+- Define pagination, filtering, and versioning before the first endpoint ships
+- Keep the API surface minimal — add fields when needed, removal is a breaking change
+- Think from the client perspective: would a new developer understand this without docs?
+
+## Approach
+1. **Resources** — use nouns, not verbs: `/users`, not `/getUsers`
+2. **HTTP verbs** — GET (read), POST (create), PUT (full replace), PATCH (partial update), DELETE
+3. **Status codes** — 200 (ok), 201 (created), 204 (no content), 400 (bad request), 401 (unauth), 403 (forbidden), 404 (not found), 409 (conflict), 422 (validation), 429 (rate limited), 500 (server error)
+4. **Error shape** — always return `{ error: { code: string, message: string, details?: unknown } }`
+5. **Pagination** — cursor-based for large/live datasets; offset for simple admin UIs
+6. **Versioning** — URL prefix (`/v1/`) for major versions; additive changes are non-breaking
+7. **Validation** — reject invalid input at the boundary with a 422 and field-level error details
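Items 4 and 7 above combine naturally: a validation failure is a 422 whose body uses the standard error shape, with field-level messages in `details`. The following is a framework-free sketch; `validateCreateUser`, the `HttpResponse` wrapper, and the `validation_failed` code are illustrative assumptions rather than anything defined by the skill file.

```typescript
// The error shape from item 4.
interface ApiError {
  error: { code: string; message: string; details?: unknown };
}

interface HttpResponse<T> {
  status: number;
  body: T;
}

// Hypothetical validator for a "create user" payload.
// Returns null for valid input, otherwise a 422 response with per-field details.
function validateCreateUser(input: { email?: string; name?: string }): HttpResponse<ApiError> | null {
  const fields: { [field: string]: string } = {};
  if (!input.email || input.email.indexOf("@") === -1) fields["email"] = "must be a valid email address";
  if (!input.name) fields["name"] = "is required";
  if (Object.keys(fields).length === 0) return null;
  return {
    status: 422,
    body: { error: { code: "validation_failed", message: "Request validation failed", details: fields } },
  };
}
```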
+
+## Output Format
+When designing or reviewing an API:
+- List each endpoint: `METHOD /path — purpose`
+- Flag any inconsistencies with REST conventions
+- Note any missing error cases
+
+## Don't
+- Don't return 200 for errors — ever
+- Don't use query params for actions (use POST with a body)
+- Don't expose internal IDs, database types, or implementation details in responses
+- Don't add breaking changes to existing endpoints — add a new version
+- Don't return unbounded arrays — always paginate lists
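The last rule, together with the cursor-based recommendation in the Approach list, can be illustrated with an in-memory sketch. Everything here is an assumption made for the example: `Page`, `paginate`, and the choice of "id of the last returned item" as the cursor. A real endpoint would typically push the same logic into a database query.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

// Hypothetical cursor pagination over an ordered list.
// The cursor is the id of the last item the client has already seen.
function paginate<T extends { id: string }>(all: T[], limit: number, cursor?: string): Page<T> {
  const size = Math.max(1, limit); // guard against a zero or negative page size
  // An unknown cursor makes indexOf return -1, so the listing restarts from the beginning.
  const start = cursor === undefined ? 0 : all.map((item) => item.id).indexOf(cursor) + 1;
  const items = all.slice(start, start + size);
  const hasMore = start + size < all.length;
  return { items, nextCursor: hasMore ? items[items.length - 1].id : null };
}
```

The client passes `nextCursor` back unchanged to fetch the following page and stops when it receives `null`.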
@@ -0,0 +1,101 @@
+{
+"version": 1,
+"skills": [
+{
+"name": "security_audit",
+"description": "OWASP Top 10, injection, auth, secrets, path traversal, error leakage, DoS, resource exhaustion",
+"keywords": ["security", "vulnerability", "injection", "auth", "owasp", "secrets", "dos", "rate-limit"],
+"categories": ["review", "security"]
+},
+{
+"name": "dos_resilience",
+"description": "DoS vectors, rate limiting, resource exhaustion, backpressure, payload limits, connection caps",
+"keywords": ["dos", "rate-limit", "resource", "exhaustion", "websocket", "payload", "memory", "connection"],
+"categories": ["review", "security"]
+},
+{
+"name": "code_review",
+"description": "Bug finding, edge cases, naming, structure, error handling",
+"keywords": ["review", "bugs", "quality", "patterns", "logic"],
+"categories": ["review"]
+},
+{
+"name": "testing",
+"description": "AAA pattern, unit/integration/e2e, mocking, deterministic tests, behavior-focused",
+"keywords": ["test", "unit", "integration", "e2e", "mock", "coverage"],
+"categories": ["implementation", "testing"]
+},
+{
+"name": "typescript",
+"description": "Strict typing, interface-first, discriminated unions, readonly, type safety",
+"keywords": ["typescript", "types", "generics", "interfaces", "strict"],
+"categories": ["implementation"]
+},
+{
+"name": "implementation",
+"description": "TDD, small functions, error handling, test coverage, <300 line files",
+"keywords": ["implement", "build", "feature", "tdd", "code"],
+"categories": ["implementation"]
+},
+{
+"name": "debugging",
+"description": "Reproduce, isolate, hypothesize, test, fix, verify with regression tests",
+"keywords": ["debug", "bug", "error", "trace", "root-cause", "reproduce"],
+"categories": ["investigation"]
+},
+{
+"name": "research",
+"description": "Source prioritization, triangulation, conflicting info, gaps analysis, BLUF answers",
+"keywords": ["research", "docs", "compare", "analyze", "summarize"],
+"categories": ["investigation"]
+},
+{
+"name": "documentation",
+"description": "API docs, guides, ADRs, README, changelog, stale doc detection",
+"keywords": ["docs", "readme", "changelog", "adr", "guide"],
+"categories": ["documentation"]
+},
+{
+"name": "api_design",
+"description": "REST conventions, HTTP verbs, status codes, error shapes, pagination, versioning",
+"keywords": ["api", "rest", "endpoint", "http", "pagination", "versioning"],
+"categories": ["design"]
+},
+{
+"name": "system_design",
+"description": "Components, data flow, failure modes, trade-offs, scale, graceful degradation",
+"keywords": ["architecture", "design", "scale", "components", "trade-offs"],
+"categories": ["design"]
+},
+{
+"name": "verification",
+"description": "Evidence-based analysis, quote exact code, read before cite, no hallucination",
+"keywords": ["verify", "evidence", "quote", "cite", "audit"],
+"categories": ["review"]
+},
+{
+"name": "ui_design",
+"description": "Component hierarchy, layout systems, accessibility, responsive design, visual consistency",
+"keywords": ["ui", "ux", "design", "accessibility", "responsive", "layout", "components"],
+"categories": ["design"]
+},
+{
+"name": "frontend",
+"description": "React/Vue/Svelte components, state management, rendering optimization, client-side routing",
+"keywords": ["frontend", "react", "components", "hooks", "rendering", "css", "tailwind"],
+"categories": ["implementation"]
+},
+{
+"name": "ci_cd",
+"description": "Build pipelines, GitHub Actions, caching, deployment strategies, secret management",
+"keywords": ["ci", "cd", "pipeline", "deploy", "github-actions", "docker", "build"],
+"categories": ["infrastructure"]
+},
+{
+"name": "infrastructure",
+"description": "IaC, containers, networking, monitoring, alerting, scaling, disaster recovery",
+"keywords": ["infra", "terraform", "kubernetes", "docker", "monitoring", "cloud", "networking"],
+"categories": ["infrastructure"]
+}
+]
+}
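The catalog pairs each skill with keywords and categories, which suggests keyword-based lookup. How gossipcat actually consumes the file is not visible in this diff, so the sketch below is purely hypothetical: `Skill` mirrors the JSON entries, while `matchSkills` and its substring-counting rule are assumptions chosen for illustration.

```typescript
// Shape of one entry in the catalog's "skills" array.
interface Skill {
  name: string;
  description: string;
  keywords: string[];
  categories: string[];
}

// Hypothetical matcher: rank skills by how many of their keywords occur in the task text.
// Matching is by plain substring, so short keywords such as "ci" can match inside longer words.
function matchSkills(task: string, skills: Skill[]): string[] {
  const text = task.toLowerCase();
  return skills
    .map((s) => ({ name: s.name, hits: s.keywords.filter((k) => text.indexOf(k) !== -1).length }))
    .filter((m) => m.hits > 0)
    .sort((a, b) => b.hits - a.hits)
    .map((m) => m.name);
}
```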
@@ -0,0 +1,32 @@
+# CI/CD
+
+> Design and maintain pipelines that are fast, reliable, and secure.
+
+## What You Do
+- Configure build, test, and deploy pipelines (GitHub Actions, GitLab CI, etc.)
+- Optimize pipeline speed: caching, parallelism, conditional steps
+- Manage environment variables, secrets, and deployment credentials
+- Set up staging, preview, and production environments
+- Monitor deployment health and rollback strategies
+
+## Approach
+1. Every push should trigger lint + typecheck + tests — no exceptions
+2. Cache dependencies and build artifacts aggressively
+3. Run expensive steps (e2e, security scans) only on PR and main branch
+4. Deploy with zero-downtime strategies (rolling, blue-green, canary)
+5. Every secret is injected at runtime — never committed, never logged
+
+## Review Checklist
+- [ ] Pipeline runs in under 5 minutes for the common case
+- [ ] Secrets are in the CI provider's vault — not in env files or code
+- [ ] Failed steps produce actionable error messages, not just exit codes
+- [ ] Build artifacts are deterministic — same input produces same output
+- [ ] Rollback is a single action, not a multi-step manual process
+- [ ] Branch protection rules enforce CI pass before merge
+
+## Don't
+- Don't allow deploys that skip tests
+- Don't use `latest` tags for base images — pin versions
+- Don't store secrets in pipeline config files
+- Don't make pipelines that only the author understands — document non-obvious steps
+- Don't run the entire test suite on every commit if you can scope by changed files
@@ -0,0 +1,40 @@
+# Code Review
+
+> Perform thorough, opinionated code reviews with clear severity levels and actionable feedback.
+
+## What You Do
+- Identify bugs, logic errors, and security issues before they reach production
+- Enforce consistency with the existing codebase style and patterns
+- Flag code that is correct but will be hard to maintain
+- Praise what is done well — reviews should be balanced
+- Prioritize findings so the author knows what must change vs. what is optional
+
+## Approach
+1. Read the full diff before commenting on any single line
+2. Check correctness first: does it do what it claims?
+3. Check edge cases: null, empty, concurrent, large input
+4. Check error handling: are errors surfaced or silently swallowed?
+5. Check test coverage: are the new paths tested?
+6. Check naming and structure: will this make sense in 6 months?
+7. Summarize findings at the top before inline comments
+
+## Output Format
+```
+## Summary
+[1-2 sentence overview of the change and overall assessment]
+
+## Findings
+- [critical] <file>:<line> — <issue and fix>
+- [warning] <file>:<line> — <issue and suggestion>
+- [style] <file>:<line> — <optional improvement>
+
+## Positives
+- [what was done well]
+```
+
+## Don't
+- Don't nitpick style issues that a linter should catch
+- Don't leave vague comments like "this could be better" — say how
+- Don't approve PRs with unresolved critical findings
+- Don't comment on every line — group related issues
+- Don't skip reading the tests
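The three severity tags and the "no approval with unresolved critical findings" rule lend themselves to a small data model. This is a sketch only: the `resolved` flag and both helper functions are hypothetical additions, and only the `critical`, `warning`, and `style` tags come from the output format above.

```typescript
type Severity = "critical" | "warning" | "style";

interface Finding {
  severity: Severity;
  file: string;
  line: number;
  resolved: boolean; // assumed flag: has the author addressed this finding?
}

// Tally findings per severity, e.g. for the summary at the top of a review.
function countBySeverity(findings: Finding[]): { [S in Severity]: number } {
  const counts = { critical: 0, warning: 0, style: 0 };
  findings.forEach((f) => {
    counts[f.severity] += 1;
  });
  return counts;
}

// A review can be approved only when no critical finding is still open.
function canApprove(findings: Finding[]): boolean {
  return !findings.some((f) => f.severity === "critical" && !f.resolved);
}
```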
@@ -0,0 +1,42 @@
+# Debugging
+
+> Systematically find and fix bugs without guessing.
+
+## What You Do
+- Follow a repeatable process: reproduce → isolate → hypothesize → test → fix → verify
+- Form one hypothesis at a time and test it before moving to the next
+- Distinguish between symptoms and root causes
+- Document findings so the bug cannot silently recur
+- Add a regression test after every fix
+
+## Approach
+1. **Reproduce** — get a minimal, consistent reproduction; if it's flaky, find the trigger
+2. **Isolate** — bisect to the smallest unit that shows the failure (binary search the call stack)
+3. **Read the error** — read the full stack trace; the actual error is often not the first line
+4. **Hypothesize** — form one specific, falsifiable hypothesis about the cause
+5. **Test** — add a failing test or log that confirms/refutes the hypothesis
+6. **Fix** — change the smallest amount of code that resolves the root cause
+7. **Verify** — run the reproduction again; confirm no regression in related tests
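The "Isolate" step describes bisection, which is a binary search over an ordered sequence (commits, call-stack frames, input slices) for the first element that shows the failure. A minimal sketch, assuming the usual bisect precondition that everything before the culprit is good and everything from it onward is bad; `firstBad` is a hypothetical helper name.

```typescript
// Find the first index where `isBad` is true, assuming all items before it are good
// and all items from it onward are bad. Returns -1 when no item is bad.
function firstBad<T>(items: T[], isBad: (item: T) => boolean): number {
  let lo = 0;
  let hi = items.length; // invariant: items before lo are good, items from hi onward are bad
  while (lo < hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (isBad(items[mid])) {
      hi = mid; // the culprit is at mid or earlier
    } else {
      lo = mid + 1; // the culprit is after mid
    }
  }
  return lo < items.length ? lo : -1;
}
```

Each probe halves the search range, so the number of reproductions needed grows with the logarithm of the sequence length rather than with the length itself.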
+
+## Output Format
+When reporting a debugging session:
+```
+## Root Cause
+[One sentence: what was wrong and why]
+
+## How It Was Found
+[The reproduction steps and the hypothesis that proved correct]
+
+## Fix Applied
+[File and change summary]
+
+## Regression Test Added
+[Yes/No and location]
+```
+
+## Don't
+- Don't make multiple changes at once — you won't know which one fixed it
+- Don't trust logs that could be stale or cached — add fresh instrumentation
+- Don't fix the symptom when the root cause is elsewhere
+- Don't skip the regression test — the bug will return
+- Don't assume the bug is in the code you just changed
@@ -0,0 +1,31 @@
+# Documentation
+
+> Write documentation that answers real questions without duplicating the code.
+
+## What You Do
+- Document the why, not the what — code shows what; docs explain intent
+- Write for the next developer, not the current one
+- Keep docs close to the code so they rot at the same rate
+- Flag when existing docs are wrong or stale
+- Distinguish between API docs, guides, and architecture decision records (ADRs)
+
+## Approach
+1. **Public API** — every exported function needs a JSDoc comment: purpose, params, return, throws
+2. **Non-obvious logic** — add an inline comment when the code is correct but the reason isn't obvious
+3. **Architecture** — write an ADR when a significant technical decision is made
+4. **README** — cover: what this is, how to install, how to run, how to test
+5. **Changelogs** — use conventional commits so changelogs can be generated automatically
+6. **Diagrams** — prefer text-based (Mermaid) over image files so they stay in version control
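As an illustration of item 1, here is what a JSDoc comment covering purpose, params, return, and throws might look like. `parsePort` is a made-up function chosen only to have something to document; in a real module it would carry the `export` keyword.

```typescript
/**
 * Parses a TCP port number from user-supplied text.
 *
 * @param text - Decimal digits, for example "8080"; surrounding whitespace is ignored.
 * @returns The port as an integer between 1 and 65535.
 * @throws {RangeError} If the text is not a whole number or falls outside that range.
 */
function parsePort(text: string): number {
  const trimmed = text.trim();
  if (!/^\d+$/.test(trimmed)) throw new RangeError(`not a whole number: "${text}"`);
  const port = parseInt(trimmed, 10);
  if (port < 1 || port > 65535) throw new RangeError(`port out of range: ${port}`);
  return port;
}
```

The comment states the contract (accepted input, result range, failure mode) and leaves the mechanics to the code, in line with "document the why, not the what".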
+
+## Output Format
+When writing or auditing documentation:
+- List what is documented, what is missing, and what is stale
+- Flag public APIs with no docs as **[missing]**
+- Flag docs that contradict the code as **[stale]**
+
+## Don't
+- Don't document what the code already says clearly — no `// increment i by 1`
+- Don't write a guide for something that should be a better API
+- Don't let README setup steps drift from the actual process — test them
+- Don't add `TODO:` comments without a ticket or a date — they become permanent
+- Don't document internal implementation details as public API
@@ -0,0 +1,32 @@
+# Frontend
+
+> Build frontend code that is performant, accessible, and maintainable.
+
+## What You Do
+- Implement UI components with correct state management and lifecycle
+- Structure component trees for reuse and testability
+- Handle client-side routing, data fetching, and caching
+- Optimize rendering: avoid unnecessary re-renders, lazy load where appropriate
+- Write component tests that test behavior, not implementation details
+
+## Approach
+1. Define the component's props/interface before writing JSX/HTML
+2. Separate data fetching from presentation — container vs. display components
+3. Use controlled components for forms, uncontrolled only when performance demands it
+4. Handle all async states: idle, loading, success, error
+5. Test user interactions, not internal state changes
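Item 4 ("handle all async states") is commonly modeled as a discriminated union so the compiler can check that no state is forgotten. The sketch below is framework-agnostic and uses hypothetical names (`RemoteData`, `statusText`); it is one way to express the idea, not something the skill file prescribes.

```typescript
// One variant per async state; `data` and `message` exist only where they make sense.
type RemoteData<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string };

// Hypothetical view helper that turns a state into display text.
function statusText<T>(state: RemoteData<T>, render: (data: T) => string): string {
  switch (state.status) {
    case "idle":
      return "Nothing requested yet";
    case "loading":
      return "Loading...";
    case "success":
      return render(state.data);
    case "error":
      return `Failed: ${state.message}`;
    default: {
      // If a new variant is added to RemoteData, this assignment stops compiling.
      const unreachable: never = state;
      return unreachable;
    }
  }
}
```

Because `data` is only reachable in the `success` branch, a component cannot accidentally render stale or missing data while a request is loading or has failed.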
+
+## Review Checklist
+- [ ] Components accept props with clear types — no `any`
+- [ ] Side effects are in `useEffect` (or equivalent) with proper cleanup
+- [ ] Lists have stable, unique keys — not array indices
+- [ ] Event handlers don't create new closures on every render unnecessarily
+- [ ] CSS/styles follow the project's approach (modules, Tailwind, styled-components)
+- [ ] Bundle impact considered — no heavy library for a simple task
+
+## Don't
+- Don't put business logic in components — extract to hooks or utilities
+- Don't fetch data in deeply nested children — lift to the nearest boundary
+- Don't use `dangerouslySetInnerHTML` without sanitization
+- Don't ignore the existing component library and rebuild from scratch
+- Don't inline styles for things that should be in the design system
@@ -0,0 +1,39 @@
+# Implementation
+
+> Write clean, correct, testable code on the first attempt.
+
+## What You Do
+- Implement features with clarity and correctness as the primary goals
+- Write tests before or alongside implementation (TDD preferred)
+- Keep functions small, named for what they do, not how they do it
+- Handle errors explicitly — don't let failures silently propagate
+- Respect the existing codebase conventions before inventing new ones
+
+## Approach
+1. Understand the requirement fully before writing a line of code
+2. Define the interface (types, function signatures) before the body
+3. Write the happy path first, then error cases
+4. Add tests that cover: normal input, edge cases, error conditions
+5. Check file length — if over 300 lines, split responsibilities
+6. Read the diff before marking done: would you approve this in a review?
+
+## Before Submitting Checklist
+- [ ] All new paths have tests
+- [ ] No `console.log` or debug artifacts left in
+- [ ] Error messages are human-readable, not internal noise
+- [ ] No code is commented out — delete it
+- [ ] Imports are used — no dead imports
+- [ ] Function names describe the action, not the mechanism
+
+## Output Format
+When reporting implementation work:
+- List files created or modified
+- Note any assumptions made about requirements
+- Flag anything that should be followed up (tech debt, deferred edge cases)
+
+## Don't
+- Don't copy-paste large blocks — extract a shared function
+- Don't return `null` and `undefined` from the same function
+- Don't write multi-line comments explaining what the code does — write clearer code
+- Don't add unused parameters "for future use"
+- Don't implement more than was asked
|
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# Infrastructure
|
|
2
|
+
|
|
3
|
+
> Manage infrastructure that is reproducible, observable, and resilient.
|
|
4
|
+
|
|
5
|
+
## What You Do
|
|
6
|
+
- Define infrastructure as code (Terraform, Pulumi, Docker, Kubernetes)
|
|
7
|
+
- Configure networking, load balancing, DNS, and TLS
|
|
8
|
+
- Set up monitoring, alerting, and logging pipelines
|
|
9
|
+
- Manage database provisioning, backups, and disaster recovery
|
|
10
|
+
- Review resource sizing, cost optimization, and scaling policies
|
|
11
|
+
|
|
12
|
+
## Approach
|
|
13
|
+
1. All infrastructure is defined in code — no manual console changes
|
|
14
|
+
2. Environments (dev, staging, prod) share the same templates with different variables
|
|
15
|
+
3. Every service has health checks, readiness probes, and resource limits
|
|
16
|
+
4. Logs are structured (JSON), metrics are labeled, traces are correlated
|
|
17
|
+
5. Plan for failure: what happens when this component goes down?
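Step 4 above asks for structured JSON logs with correlated traces. A minimal sketch of what one log line could look like follows; the `formatLog` helper and the field names (`service`, `traceId`) are assumptions for illustration, so adapt them to the schema your logging pipeline expects.

```typescript
// Hypothetical field names; the skill file does not prescribe a schema.
interface LogFields {
  service: string;
  traceId: string;
  [key: string]: string | number | boolean;
}

type LogLevel = "info" | "warn" | "error";

// Builds one JSON log line. The timestamp is injectable so output is testable.
function formatLog(
  level: LogLevel,
  message: string,
  fields: LogFields,
  now: Date = new Date(),
): string {
  // Caller-supplied fields are spread first so they cannot overwrite
  // the reserved keys (timestamp, level, message).
  return JSON.stringify({
    ...fields,
    timestamp: now.toISOString(),
    level,
    message,
  });
}
```

Carrying the same `traceId` on every line emitted for one request is what lets logs be joined with traces later.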
|
|
18
|
+
|
|
19
|
+
## Review Checklist
|
|
20
|
+
- [ ] Infrastructure changes have a plan/preview before apply
|
|
21
|
+
- [ ] Containers have resource limits (CPU, memory) — no unbounded growth
|
|
22
|
+
- [ ] Health checks are meaningful — not just "port is open"
|
|
23
|
+
- [ ] Backups are tested — restore has been verified at least once
|
|
24
|
+
- [ ] Network policies follow least-privilege — no open-to-all rules
|
|
25
|
+
- [ ] Costs are tagged and attributable to a team or service
|
|
26
|
+
|
|
27
|
+
## Don't
|
|
28
|
+
- Don't make infrastructure changes through the cloud console
|
|
29
|
+
- Don't share credentials between environments
|
|
30
|
+
- Don't skip the plan step — always review before apply
|
|
31
|
+
- Don't set auto-scaling without understanding the cost ceiling
|
|
32
|
+
- Don't ignore alerts — if it's noisy, fix the threshold, don't mute it
|
|
@@ -0,0 +1,43 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: memory-retrieval
|
|
3
|
+
mode: permanent
|
|
4
|
+
description: Call gossip_remember BEFORE reviewing — recall prior findings on the same code so you don't re-discover or contradict yourself
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# STEP 0 — DO THIS BEFORE READING ANY CODE
|
|
8
|
+
|
|
9
|
+
Call your memory-recall tool with the most specific identifier in the task: a file path, function name, module, or commit hash. The tool name depends on your runtime, which is listed in the `## Identity` block at the top of your system prompt:
|
|
10
|
+
|
|
11
|
+
- **`runtime: native`** → call `gossip_remember(agent_id, query)` (fully qualified `mcp__gossipcat__gossip_remember`). Pass your own `agent_id` from the Identity block.
|
|
12
|
+
- **`runtime: relay`** → call `memory_query(query)`. Your identity is inferred from the relay envelope; do NOT pass agent_id.
|
|
13
|
+
|
|
14
|
+
If the Identity block is missing or you need to re-check after a context summary, call `self_identity()` to get a JSON record of your agent_id, runtime, provider, and model. Both memory tools hit the same backend and return the same markdown shape. Make the memory-recall call your first action, before any file_read and before any analysis. It searches YOUR OWN archived findings, task summaries, and consensus signals from prior sessions on this project.
|
|
15
|
+
|
|
16
|
+
Skipping this step means you re-discover bugs you already filed, contradict your own prior verdict, or miss context that would change a finding's severity. Past-you already did the work — use it.
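The runtime-dependent tool choice in Step 0 can be sketched as follows. The tool names and their arguments come from the text above; the `Identity` and `RecallCall` types and the `buildRecallCall` helper are hypothetical, introduced only to make the branching explicit.

```typescript
// Hypothetical shapes: the skill file names the tools and their arguments
// but does not define these types.
type Identity = { agentId: string; runtime: "native" | "relay" };

type RecallCall =
  | { tool: "gossip_remember"; args: { agent_id: string; query: string } }
  | { tool: "memory_query"; args: { query: string } };

function buildRecallCall(identity: Identity, query: string): RecallCall {
  if (identity.runtime === "native") {
    // Native agents pass their own agent_id explicitly.
    return {
      tool: "gossip_remember",
      args: { agent_id: identity.agentId, query },
    };
  }
  // Relay agents omit agent_id; identity is inferred from the relay envelope.
  return { tool: "memory_query", args: { query } };
}
```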
|
|
17
|
+
|
|
18
|
+
## Mandatory triggers — call gossip_remember NOW if any apply
|
|
19
|
+
|
|
20
|
+
- The task names a specific file, function, class, or module → query that name
|
|
21
|
+
- The task references a commit hash, PR number, or finding ID → query it
|
|
22
|
+
- You recognize the area of the code from prior work → query the module name
|
|
23
|
+
- You are about to emit a finding that feels familiar → query its key term BEFORE writing it
|
|
24
|
+
|
|
25
|
+
## Skip only when ALL of these hold
|
|
26
|
+
|
|
27
|
+
- The task is greenfield (code that does not yet exist)
|
|
28
|
+
- No file/function/module is named in the prompt
|
|
29
|
+
- You have already called gossip_remember once this turn
|
|
30
|
+
|
|
31
|
+
One call per task is the floor, not the ceiling — call again if a new identifier surfaces mid-review.
|
|
32
|
+
|
|
33
|
+
## How to query
|
|
34
|
+
|
|
35
|
+
- USE concrete identifiers: `gossip_remember("collect.ts runOneRelayCrossReview")`, `gossip_remember("performance-reader getCountersSince")`
|
|
36
|
+
- DO NOT use vague terms: `gossip_remember("review")`, `gossip_remember("bug")` — these waste the call
|
|
37
|
+
- One query, two to five words, focused on a name you can grep
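The query guidelines above can be approximated with a small pre-flight check. This is a sketch under stated assumptions: `isFocusedQuery` is a hypothetical helper, and its vague-term list contains only the two examples given above, not an exhaustive set.

```typescript
// Assumption: only the two vague terms named in the guidelines are listed here.
const VAGUE_TERMS = new Set(["review", "bug"]);

// Returns true when a query is two to five words long and is not made up
// entirely of vague terms.
function isFocusedQuery(query: string): boolean {
  const words = query
    .trim()
    .split(/\s+/)
    .filter((word) => word.length > 0);

  if (words.length < 2 || words.length > 5) {
    return false;
  }
  return !words.every((word) => VAGUE_TERMS.has(word.toLowerCase()));
}
```

A check like this cannot tell whether a term is actually greppable in the codebase; it only filters out queries that are obviously too short, too long, or too generic.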
|
|
38
|
+
|
|
39
|
+
## How to use the result
|
|
40
|
+
|
|
41
|
+
If the search returns relevant findings, cite them inline: `per gossip_remember finding <finding_id>`. Peers and the orchestrator use this to trace your reasoning back to prior consensus rounds.
|
|
42
|
+
|
|
43
|
+
If the search returns nothing relevant, stay silent: do not announce "I checked memory and found nothing." An empty lookup must not pollute your findings.
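The two rules in this section (cite relevant findings inline, say nothing otherwise) can be sketched as one function. The `Finding` shape, its `relevant` flag, and the `citeFindings` helper are hypothetical; the actual structure returned by the memory tools is not shown in this file.

```typescript
// Hypothetical shape for a recalled finding.
interface Finding {
  id: string;
  relevant: boolean;
}

// Produces inline citations for relevant findings, or an empty string when
// there is nothing relevant to cite.
function citeFindings(findings: readonly Finding[]): string {
  return findings
    .filter((finding) => finding.relevant)
    .map((finding) => `per gossip_remember finding ${finding.id}`)
    .join("; ");
}
```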
|
|
@@ -0,0 +1,44 @@
|
|
|
1
|
+
# Research
|
|
2
|
+
|
|
3
|
+
> Investigate unknowns systematically and deliver a clear, sourced conclusion.
|
|
4
|
+
|
|
5
|
+
## What You Do
|
|
6
|
+
- Gather information from authoritative sources before forming conclusions
|
|
7
|
+
- Distinguish between facts, reasonable inferences, and speculation
|
|
8
|
+
- Surface trade-offs and conflicting information rather than hiding them
|
|
9
|
+
- Deliver a bottom-line-up-front answer before the supporting detail
|
|
10
|
+
- Know when to stop — research has diminishing returns; say so
|
|
11
|
+
|
|
12
|
+
## Approach
|
|
13
|
+
1. **Define the question** — restate the question precisely before searching
|
|
14
|
+
2. **Source priority** — official docs > peer-reviewed or primary sources > reputable articles > forums
|
|
15
|
+
3. **Triangulate** — verify key facts across at least two independent sources
|
|
16
|
+
4. **Identify gaps** — note what could not be verified and why
|
|
17
|
+
5. **Form a conclusion** — state the best answer given available evidence
|
|
18
|
+
6. **Recommend next steps** — if further information would change the answer, say what and how to get it
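Steps 2 and 3 above (source priority and triangulation) can be made concrete with a short sketch. The `SourceKind` labels, the `origin` field, and both helper functions are assumptions for illustration; in particular, treating "independent" as "different origin" is a simplification.

```typescript
// Hypothetical labels mirroring the priority order in step 2.
type SourceKind = "official-docs" | "primary" | "article" | "forum";

const PRIORITY: Record<SourceKind, number> = {
  "official-docs": 0,
  primary: 1,
  article: 2,
  forum: 3,
};

interface Source {
  kind: SourceKind;
  origin: string; // e.g. a publisher or domain; assumed to identify independence
}

// Orders sources from most to least authoritative without mutating the input.
function rankSources(sources: readonly Source[]): Source[] {
  return [...sources].sort((a, b) => PRIORITY[a.kind] - PRIORITY[b.kind]);
}

// A fact counts as triangulated when at least two distinct origins support it.
function isTriangulated(sources: readonly Source[]): boolean {
  return new Set(sources.map((source) => source.origin)).size >= 2;
}
```

Two articles that both paraphrase the same upstream page would pass this check while not being truly independent, so judgment is still needed.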
|
|
19
|
+
|
|
20
|
+
## Output Format
|
|
21
|
+
```
|
|
22
|
+
## Bottom Line
|
|
23
|
+
[One paragraph: the answer to the question, stated directly]
|
|
24
|
+
|
|
25
|
+
## Evidence
|
|
26
|
+
- [source/type] <finding that supports the conclusion>
|
|
27
|
+
- [source/type] <finding that supports the conclusion>
|
|
28
|
+
|
|
29
|
+
## Conflicting Information
|
|
30
|
+
- [what contradicts the conclusion and why it was discounted]
|
|
31
|
+
|
|
32
|
+
## Gaps
|
|
33
|
+
- [what is unknown and what would resolve it]
|
|
34
|
+
|
|
35
|
+
## Sources
|
|
36
|
+
- [URL or citation]
|
|
37
|
+
```
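For readers who generate this report programmatically, here is a minimal sketch that renders the template above from structured data. The `ResearchReport` interface and `renderReport` function are hypothetical names; only the section headings and their order come from the template.

```typescript
// Hypothetical container for the five sections of the template.
interface ResearchReport {
  bottomLine: string;
  evidence: string[];
  conflicts: string[];
  gaps: string[];
  sources: string[];
}

// Renders the report as markdown, with the bottom line first.
function renderReport(report: ResearchReport): string {
  const bullets = (items: string[]): string =>
    items.map((item) => `- ${item}`).join("\n");

  return [
    "## Bottom Line",
    report.bottomLine,
    "",
    "## Evidence",
    bullets(report.evidence),
    "",
    "## Conflicting Information",
    bullets(report.conflicts),
    "",
    "## Gaps",
    bullets(report.gaps),
    "",
    "## Sources",
    bullets(report.sources),
  ].join("\n");
}
```

Fixing the section order in code makes it hard to bury the answer at the end by accident.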
|
|
38
|
+
|
|
39
|
+
## Don't
|
|
40
|
+
- Don't present a conclusion without evidence
|
|
41
|
+
- Don't bury the answer at the end — lead with it
|
|
42
|
+
- Don't report every tangential finding — stay on the question
|
|
43
|
+
- Don't treat Stack Overflow answers as authoritative — trace them to official docs
|
|
44
|
+
- Don't stop researching when the first result confirms a prior belief
|