mindforge-cc 10.0.2 → 10.7.0
- package/.mindforge/config.json +73 -2
- package/.mindforge/engine/autonomous/cross-iteration-bridge.md +96 -0
- package/.mindforge/engine/cost-tracking/budget-enforcer.md +68 -0
- package/.mindforge/engine/cost-tracking/router.md +58 -0
- package/.mindforge/engine/cost-tracking/token-ledger.md +77 -0
- package/.mindforge/engine/council/council-protocol.md +96 -0
- package/.mindforge/engine/council/council-templates.md +85 -0
- package/.mindforge/engine/council/synthesis-engine.md +71 -0
- package/.mindforge/engine/cross-model-eval.md +74 -0
- package/.mindforge/engine/instincts/capture-engine.md +63 -0
- package/.mindforge/engine/instincts/instinct-schema.md +76 -0
- package/.mindforge/engine/instincts/promotion-engine.md +77 -0
- package/.mindforge/engine/proactive/signal-detector.md +60 -0
- package/.mindforge/engine/proactive/suggestion-engine.md +100 -0
- package/.mindforge/engine/skills/composition.md +83 -0
- package/.mindforge/engine/skills/loader.md +16 -0
- package/.mindforge/personas/agent-architect.md +57 -0
- package/.mindforge/personas/agent-evaluator.md +162 -0
- package/.mindforge/personas/agent-memory-designer.md +157 -0
- package/.mindforge/personas/agent-ops-engineer.md +120 -0
- package/.mindforge/personas/agent-orchestrator.md +112 -0
- package/.mindforge/personas/ai-economist.md +57 -0
- package/.mindforge/personas/ai-safety-engineer.md +57 -0
- package/.mindforge/personas/analytics-engineer.md +57 -0
- package/.mindforge/personas/anti-pattern-hunter.md +61 -0
- package/.mindforge/personas/api-gateway-designer.md +132 -0
- package/.mindforge/personas/auth-engineer.md +112 -0
- package/.mindforge/personas/build-engineer.md +57 -0
- package/.mindforge/personas/business-analyst.md +56 -0
- package/.mindforge/personas/cache-architect.md +100 -0
- package/.mindforge/personas/causal-scientist.md +57 -0
- package/.mindforge/personas/cdn-architect.md +118 -0
- package/.mindforge/personas/change-agent.md +104 -0
- package/.mindforge/personas/code-narrator.md +52 -0
- package/.mindforge/personas/codegen-specialist.md +68 -0
- package/.mindforge/personas/communication-architect.md +102 -0
- package/.mindforge/personas/compliance-engineer.md +96 -0
- package/.mindforge/personas/consensus-engineer.md +116 -0
- package/.mindforge/personas/contract-tester.md +60 -192
- package/.mindforge/personas/cost-optimizer.md +71 -0
- package/.mindforge/personas/council-architect.md +66 -0
- package/.mindforge/personas/council-critic.md +67 -0
- package/.mindforge/personas/council-pragmatist.md +71 -0
- package/.mindforge/personas/council-skeptic.md +73 -0
- package/.mindforge/personas/data-architect.md +108 -0
- package/.mindforge/personas/data-mesh-architect.md +57 -0
- package/.mindforge/personas/data-pipeline-architect.md +120 -0
- package/.mindforge/personas/de-sloppifier.md +60 -0
- package/.mindforge/personas/debt-manager.md +66 -0
- package/.mindforge/personas/decision-architect.md +82 -51
- package/.mindforge/personas/deployment-captain.md +74 -0
- package/.mindforge/personas/design-system-lead.md +112 -0
- package/.mindforge/personas/dmux-orchestrator.md +75 -0
- package/.mindforge/personas/doc-auditor.md +84 -0
- package/.mindforge/personas/dx-engineer.md +96 -0
- package/.mindforge/personas/ecommerce-engineer.md +57 -0
- package/.mindforge/personas/edge-engineer.md +94 -0
- package/.mindforge/personas/edtech-architect.md +106 -0
- package/.mindforge/personas/embedding-architect.md +57 -0
- package/.mindforge/personas/environment-engineer.md +57 -0
- package/.mindforge/personas/eval-judge.md +55 -0
- package/.mindforge/personas/event-architect.md +102 -0
- package/.mindforge/personas/experiment-designer.md +138 -0
- package/.mindforge/personas/feature-store-engineer.md +57 -0
- package/.mindforge/personas/finops-analyst.md +66 -0
- package/.mindforge/personas/fintech-architect.md +57 -0
- package/.mindforge/personas/flutter-engineer.md +104 -0
- package/.mindforge/personas/gaming-engineer.md +57 -0
- package/.mindforge/personas/graphql-designer.md +73 -0
- package/.mindforge/personas/healthcare-engineer.md +57 -0
- package/.mindforge/personas/hiring-strategist.md +105 -0
- package/.mindforge/personas/hitl-architect.md +165 -0
- package/.mindforge/personas/i18n-architect.md +69 -0
- package/.mindforge/personas/instinct-curator.md +83 -0
- package/.mindforge/personas/iot-architect.md +105 -0
- package/.mindforge/personas/knowledge-curator.md +139 -0
- package/.mindforge/personas/knowledge-engineer.md +57 -0
- package/.mindforge/personas/lakehouse-architect.md +57 -0
- package/.mindforge/personas/llm-orchestrator.md +57 -0
- package/.mindforge/personas/logistics-architect.md +106 -0
- package/.mindforge/personas/market-analyst.md +53 -0
- package/.mindforge/personas/marketplace-engineer.md +105 -0
- package/.mindforge/personas/mcp-designer.md +54 -0
- package/.mindforge/personas/meeting-designer.md +104 -0
- package/.mindforge/personas/mentorship-lead.md +106 -0
- package/.mindforge/personas/migration-architect.md +57 -0
- package/.mindforge/personas/ml-ops-engineer.md +101 -0
- package/.mindforge/personas/mobile-architect.md +105 -0
- package/.mindforge/personas/mobile-security-engineer.md +106 -0
- package/.mindforge/personas/multi-model-bridge.md +86 -0
- package/.mindforge/personas/multi-tenancy-architect.md +71 -0
- package/.mindforge/personas/multimodal-engineer.md +57 -0
- package/.mindforge/personas/offline-specialist.md +105 -0
- package/.mindforge/personas/onboarding-navigator.md +63 -0
- package/.mindforge/personas/payments-engineer.md +135 -0
- package/.mindforge/personas/pipeline-engineer.md +115 -0
- package/.mindforge/personas/platform-engineer.md +97 -0
- package/.mindforge/personas/platform-lead.md +57 -0
- package/.mindforge/personas/privacy-engineer.md +57 -0
- package/.mindforge/personas/product-owner.md +56 -0
- package/.mindforge/personas/productivity-analyst.md +57 -0
- package/.mindforge/personas/prompt-architect.md +101 -0
- package/.mindforge/personas/proofreader.md +53 -0
- package/.mindforge/personas/pwa-architect.md +105 -0
- package/.mindforge/personas/quality-scorer.md +63 -0
- package/.mindforge/personas/react-native-engineer.md +106 -0
- package/.mindforge/personas/resilience-engineer.md +69 -0
- package/.mindforge/personas/rfc-architect.md +64 -0
- package/.mindforge/personas/saga-orchestrator.md +80 -0
- package/.mindforge/personas/secrets-engineer.md +57 -0
- package/.mindforge/personas/skill-smith.md +79 -0
- package/.mindforge/personas/sre-lead.md +107 -0
- package/.mindforge/personas/stream-engineer.md +57 -0
- package/.mindforge/personas/streaming-engineer.md +64 -0
- package/.mindforge/personas/swarm-templates.json +695 -38
- package/.mindforge/personas/system-designer.md +57 -0
- package/.mindforge/personas/team-coach.md +120 -0
- package/.mindforge/personas/tech-lead-coach.md +103 -0
- package/.mindforge/personas/technical-writer-lead.md +111 -0
- package/.mindforge/personas/threat-modeler.md +82 -0
- package/.mindforge/personas/vibe-checker.md +75 -0
- package/.mindforge/personas/worktree-manager.md +56 -0
- package/.mindforge/personas/zero-trust-engineer.md +113 -0
- package/.mindforge/skills/a11y-testing/SKILL.md +143 -0
- package/.mindforge/skills/agent-evaluation-framework/SKILL.md +227 -0
- package/.mindforge/skills/agent-introspection-debugging/SKILL.md +88 -0
- package/.mindforge/skills/agent-loops/SKILL.md +84 -0
- package/.mindforge/skills/agent-memory-design/SKILL.md +199 -0
- package/.mindforge/skills/agent-orchestration-patterns/SKILL.md +129 -0
- package/.mindforge/skills/agent-tool-selection/SKILL.md +204 -0
- package/.mindforge/skills/ai-agent-deployment/SKILL.md +176 -0
- package/.mindforge/skills/ai-cost-management/SKILL.md +57 -0
- package/.mindforge/skills/ai-safety-alignment/SKILL.md +53 -0
- package/.mindforge/skills/analytics-instrumentation/SKILL.md +172 -0
- package/.mindforge/skills/api-gateway-patterns/SKILL.md +177 -0
- package/.mindforge/skills/api-marketplace/SKILL.md +56 -0
- package/.mindforge/skills/api-versioning/SKILL.md +100 -0
- package/.mindforge/skills/app-store-deployment/SKILL.md +44 -0
- package/.mindforge/skills/architecture-tradeoff-analysis/SKILL.md +97 -0
- package/.mindforge/skills/audit-logging/SKILL.md +140 -0
- package/.mindforge/skills/auth-patterns/SKILL.md +148 -0
- package/.mindforge/skills/autonomous-agent-harness/SKILL.md +218 -0
- package/.mindforge/skills/autonomous-agents/SKILL.md +59 -0
- package/.mindforge/skills/autonomous-loops/SKILL.md +105 -0
- package/.mindforge/skills/build-system-optimization/SKILL.md +54 -0
- package/.mindforge/skills/build-vs-buy/SKILL.md +80 -0
- package/.mindforge/skills/bundle-optimization/SKILL.md +174 -0
- package/.mindforge/skills/business-analyst/SKILL.md +82 -0
- package/.mindforge/skills/caching-strategies/SKILL.md +132 -0
- package/.mindforge/skills/capacity-planning/SKILL.md +96 -0
- package/.mindforge/skills/causal-inference/SKILL.md +42 -0
- package/.mindforge/skills/cdn-optimization/SKILL.md +212 -0
- package/.mindforge/skills/change-management/SKILL.md +106 -0
- package/.mindforge/skills/chaos-engineering/SKILL.md +99 -0
- package/.mindforge/skills/ci-cd-pipeline/SKILL.md +118 -0
- package/.mindforge/skills/cli-design/SKILL.md +118 -0
- package/.mindforge/skills/code-generation-patterns/SKILL.md +92 -0
- package/.mindforge/skills/code-review-methodology/SKILL.md +180 -0
- package/.mindforge/skills/code-tour/SKILL.md +145 -0
- package/.mindforge/skills/codebase-onboarding/SKILL.md +95 -0
- package/.mindforge/skills/compliance-as-code/SKILL.md +195 -0
- package/.mindforge/skills/conflict-resolution/SKILL.md +87 -0
- package/.mindforge/skills/connection-pooling/SKILL.md +151 -0
- package/.mindforge/skills/container-security/SKILL.md +151 -0
- package/.mindforge/skills/context-engineering/SKILL.md +114 -0
- package/.mindforge/skills/continuous-learning/SKILL.md +84 -0
- package/.mindforge/skills/contract-testing/SKILL.md +85 -0
- package/.mindforge/skills/cost-aware-routing/SKILL.md +83 -0
- package/.mindforge/skills/cost-estimation/SKILL.md +82 -0
- package/.mindforge/skills/council/SKILL.md +68 -0
- package/.mindforge/skills/cqrs-event-sourcing/SKILL.md +95 -0
- package/.mindforge/skills/cross-platform-testing/SKILL.md +43 -0
- package/.mindforge/skills/data-governance/SKILL.md +42 -0
- package/.mindforge/skills/data-lakehouse/SKILL.md +42 -0
- package/.mindforge/skills/data-mesh/SKILL.md +42 -0
- package/.mindforge/skills/data-modeling/SKILL.md +107 -0
- package/.mindforge/skills/data-pipeline-design/SKILL.md +171 -0
- package/.mindforge/skills/data-privacy-engineering/SKILL.md +42 -0
- package/.mindforge/skills/database-performance/SKILL.md +174 -0
- package/.mindforge/skills/database-sharding-advanced/SKILL.md +206 -0
- package/.mindforge/skills/de-sloppify/SKILL.md +120 -0
- package/.mindforge/skills/defense-in-depth/SKILL.md +84 -0
- package/.mindforge/skills/delegation-patterns/SKILL.md +123 -0
- package/.mindforge/skills/dependency-management/SKILL.md +94 -0
- package/.mindforge/skills/deployment-workflow/SKILL.md +135 -0
- package/.mindforge/skills/design-system/SKILL.md +113 -0
- package/.mindforge/skills/developer-onboarding/SKILL.md +99 -0
- package/.mindforge/skills/developer-productivity-metrics/SKILL.md +59 -0
- package/.mindforge/skills/distributed-consensus/SKILL.md +141 -0
- package/.mindforge/skills/dmux-workflows/SKILL.md +141 -0
- package/.mindforge/skills/dns-architecture/SKILL.md +167 -0
- package/.mindforge/skills/doc-health-audit/SKILL.md +102 -0
- package/.mindforge/skills/ecommerce-architecture/SKILL.md +41 -0
- package/.mindforge/skills/edge-computing/SKILL.md +91 -0
- package/.mindforge/skills/edtech-platform/SKILL.md +41 -0
- package/.mindforge/skills/email-deliverability/SKILL.md +177 -0
- package/.mindforge/skills/embedding-systems/SKILL.md +55 -0
- package/.mindforge/skills/environment-management/SKILL.md +54 -0
- package/.mindforge/skills/error-handling-architecture/SKILL.md +118 -0
- package/.mindforge/skills/estimation-techniques/SKILL.md +113 -0
- package/.mindforge/skills/eval-harness/SKILL.md +180 -0
- package/.mindforge/skills/event-driven-architecture/SKILL.md +162 -0
- package/.mindforge/skills/experiment-design/SKILL.md +139 -0
- package/.mindforge/skills/experiment-platform/SKILL.md +43 -0
- package/.mindforge/skills/feature-engineering/SKILL.md +42 -0
- package/.mindforge/skills/feature-flag-management/SKILL.md +183 -0
- package/.mindforge/skills/fine-tuning-workflow/SKILL.md +189 -0
- package/.mindforge/skills/fintech-patterns/SKILL.md +41 -0
- package/.mindforge/skills/flutter-architecture/SKILL.md +42 -0
- package/.mindforge/skills/gaming-backend/SKILL.md +41 -0
- package/.mindforge/skills/git-workflow-design/SKILL.md +129 -0
- package/.mindforge/skills/graceful-degradation/SKILL.md +95 -0
- package/.mindforge/skills/graphql-patterns/SKILL.md +243 -0
- package/.mindforge/skills/guardrails-and-safety/SKILL.md +137 -0
- package/.mindforge/skills/healthcare-systems/SKILL.md +40 -0
- package/.mindforge/skills/hiring-engineering/SKILL.md +119 -0
- package/.mindforge/skills/human-in-the-loop-design/SKILL.md +234 -0
- package/.mindforge/skills/i18n-architecture/SKILL.md +147 -0
- package/.mindforge/skills/idempotency-patterns/SKILL.md +84 -0
- package/.mindforge/skills/incident-communication/SKILL.md +96 -0
- package/.mindforge/skills/incident-management/SKILL.md +97 -0
- package/.mindforge/skills/infrastructure-as-code/SKILL.md +98 -0
- package/.mindforge/skills/instinct-clustering/SKILL.md +190 -0
- package/.mindforge/skills/internal-developer-platform/SKILL.md +51 -0
- package/.mindforge/skills/iot-platform/SKILL.md +41 -0
- package/.mindforge/skills/k8s-deployment/SKILL.md +358 -0
- package/.mindforge/skills/knowledge-graphs/SKILL.md +56 -0
- package/.mindforge/skills/knowledge-sharing-systems/SKILL.md +112 -0
- package/.mindforge/skills/llm-cost-optimization/SKILL.md +198 -0
- package/.mindforge/skills/llm-orchestration/SKILL.md +56 -0
- package/.mindforge/skills/load-testing/SKILL.md +84 -0
- package/.mindforge/skills/logistics-optimization/SKILL.md +40 -0
- package/.mindforge/skills/market-researcher/SKILL.md +99 -0
- package/.mindforge/skills/marketplace-trust/SKILL.md +40 -0
- package/.mindforge/skills/mcp-server-patterns/SKILL.md +264 -0
- package/.mindforge/skills/media-streaming/SKILL.md +41 -0
- package/.mindforge/skills/meeting-architecture/SKILL.md +146 -0
- package/.mindforge/skills/mentoring-patterns/SKILL.md +77 -0
- package/.mindforge/skills/microservices-patterns/SKILL.md +83 -0
- package/.mindforge/skills/migration-platform/SKILL.md +61 -0
- package/.mindforge/skills/migration-strategies/SKILL.md +129 -0
- package/.mindforge/skills/ml-feature-store/SKILL.md +56 -0
- package/.mindforge/skills/ml-monitoring/SKILL.md +42 -0
- package/.mindforge/skills/mobile-performance/SKILL.md +44 -0
- package/.mindforge/skills/mobile-security/SKILL.md +45 -0
- package/.mindforge/skills/model-evaluation/SKILL.md +53 -0
- package/.mindforge/skills/monorepo-management/SKILL.md +100 -0
- package/.mindforge/skills/multi-llm-consult/SKILL.md +75 -0
- package/.mindforge/skills/multi-tenancy-patterns/SKILL.md +145 -0
- package/.mindforge/skills/multi-turn-conversation-design/SKILL.md +206 -0
- package/.mindforge/skills/multimodal-ai/SKILL.md +51 -0
- package/.mindforge/skills/mutation-testing/SKILL.md +97 -0
- package/.mindforge/skills/notification-system-design/SKILL.md +168 -0
- package/.mindforge/skills/observability-stack/SKILL.md +136 -0
- package/.mindforge/skills/offline-first-design/SKILL.md +43 -0
- package/.mindforge/skills/on-call-design/SKILL.md +111 -0
- package/.mindforge/skills/pagination-patterns/SKILL.md +230 -0
- package/.mindforge/skills/payment-integration/SKILL.md +176 -0
- package/.mindforge/skills/performance-reviews/SKILL.md +140 -0
- package/.mindforge/skills/platform-observability/SKILL.md +58 -0
- package/.mindforge/skills/platform-reliability/SKILL.md +52 -0
- package/.mindforge/skills/post-incident-learning/SKILL.md +96 -0
- package/.mindforge/skills/product-manager/SKILL.md +104 -0
- package/.mindforge/skills/progressive-web-app/SKILL.md +44 -0
- package/.mindforge/skills/prompt-engineering/SKILL.md +94 -0
- package/.mindforge/skills/proofreader/SKILL.md +158 -0
- package/.mindforge/skills/push-notification-architecture/SKILL.md +45 -0
- package/.mindforge/skills/python-performance/SKILL.md +183 -0
- package/.mindforge/skills/quality-audit/SKILL.md +171 -0
- package/.mindforge/skills/queue-design/SKILL.md +85 -0
- package/.mindforge/skills/rag-architecture/SKILL.md +176 -0
- package/.mindforge/skills/rate-limiting-design/SKILL.md +94 -0
- package/.mindforge/skills/react-native-patterns/SKILL.md +42 -0
- package/.mindforge/skills/react-performance/SKILL.md +229 -0
- package/.mindforge/skills/real-time-analytics/SKILL.md +42 -0
- package/.mindforge/skills/real-time-sync/SKILL.md +83 -0
- package/.mindforge/skills/responsive-native/SKILL.md +44 -0
- package/.mindforge/skills/responsive-patterns/SKILL.md +141 -0
- package/.mindforge/skills/rfc-pipeline/SKILL.md +114 -0
- package/.mindforge/skills/saas-multi-tenant/SKILL.md +41 -0
- package/.mindforge/skills/santa-method/SKILL.md +134 -0
- package/.mindforge/skills/search-implementation/SKILL.md +98 -0
- package/.mindforge/skills/secrets-platform/SKILL.md +56 -0
- package/.mindforge/skills/secrets-rotation/SKILL.md +173 -0
- package/.mindforge/skills/self-serve-infrastructure/SKILL.md +51 -0
- package/.mindforge/skills/serverless-patterns/SKILL.md +119 -0
- package/.mindforge/skills/skill-creator-meta/SKILL.md +146 -0
- package/.mindforge/skills/sprint-retrospective-facilitation/SKILL.md +112 -0
- package/.mindforge/skills/stakeholder-communication/SKILL.md +85 -0
- package/.mindforge/skills/state-management/SKILL.md +104 -0
- package/.mindforge/skills/stream-processing/SKILL.md +43 -0
- package/.mindforge/skills/streaming-architecture/SKILL.md +81 -0
- package/.mindforge/skills/supply-chain-security/SKILL.md +145 -0
- package/.mindforge/skills/synthetic-data-generation/SKILL.md +52 -0
- package/.mindforge/skills/system-design/SKILL.md +88 -0
- package/.mindforge/skills/team-topology-design/SKILL.md +107 -0
- package/.mindforge/skills/technical-debt-management/SKILL.md +86 -0
- package/.mindforge/skills/technical-interview-design/SKILL.md +98 -0
- package/.mindforge/skills/technical-leadership/SKILL.md +75 -0
- package/.mindforge/skills/technical-writing/SKILL.md +237 -0
- package/.mindforge/skills/technology-radar/SKILL.md +88 -0
- package/.mindforge/skills/testing-anti-patterns/SKILL.md +288 -0
- package/.mindforge/skills/threat-modeling/SKILL.md +109 -0
- package/.mindforge/skills/tool-design/SKILL.md +138 -0
- package/.mindforge/skills/typescript-advanced/SKILL.md +198 -0
- package/.mindforge/skills/using-git-worktrees/SKILL.md +139 -0
- package/.mindforge/skills/verification-loop/SKILL.md +97 -0
- package/.mindforge/skills/vibe-security/SKILL.md +165 -0
- package/.mindforge/skills/visual-regression-testing/SKILL.md +97 -0
- package/.mindforge/skills/websocket-patterns/SKILL.md +203 -0
- package/.mindforge/skills/writing-plans/SKILL.md +170 -0
- package/.mindforge/skills/writing-skills/SKILL.md +216 -0
- package/.mindforge/skills/zero-trust-architecture/SKILL.md +166 -0
- package/CHANGELOG.md +195 -0
- package/MINDFORGE.md +4 -4
- package/README.md +2 -2
- package/RELEASENOTES.md +66 -0
- package/bin/installer-core.js +1 -1
- package/bin/wizard/theme.js +2 -2
- package/docs/commands-reference.md +18 -1
- package/package.json +2 -2
- package/.mindforge/personas/data-privacy-engineer.md +0 -187
@@ -0,0 +1,234 @@
+---
+name: human-in-the-loop-design
+version: 1.0.0
+min_mindforge_version: 10.0.4
+status: stable
+triggers: human in the loop, escalation gate, approval threshold, confidence threshold, explanation quality, trust calibration, user override, hitl pattern, agent handoff, supervision design, human review trigger, autonomous boundary
+compose: guardrails-and-safety
+---
+
+# Skill — Human-in-the-Loop Design (Escalation & Supervision Architecture)
+
+## When this skill activates
+When designing agent autonomy boundaries, building escalation gates, calibrating
+confidence thresholds, or implementing approval workflows. Use for any system where
+an AI agent must decide between acting autonomously and requesting human guidance.
+
+Core principle: **Maximum VALUE, not maximum autonomy** — the goal is not to minimize
+human involvement. The goal is to maximize the value delivered. Sometimes the highest-value
+action is asking the human. The art is knowing WHEN.
+
+## Mandatory actions when this skill is active
+
+### Action Classification (Reversibility x Impact Matrix)
+
+1. **Classify every agent action:**
+```
+| Impact \ Reversibility | Easily Reversible      | Hard to Reverse        | Irreversible           |
+|------------------------|------------------------|------------------------|------------------------|
+| Low Impact             | AUTONOMOUS             | AUTONOMOUS             | CONFIRM                |
+| Medium Impact          | AUTONOMOUS             | CONFIRM                | APPROVE                |
+| High Impact            | CONFIRM                | APPROVE                | APPROVE + WAIT         |
+
+Levels:
+- AUTONOMOUS: Agent acts without asking (log for audit)
+- CONFIRM: Agent acts but shows what it did (user can undo)
+- APPROVE: Agent proposes, human approves before execution
+- APPROVE + WAIT: Agent proposes, human approves, agent waits for explicit "go"
+```
+
+2. **Per-action classification examples:**
+```
+AUTONOMOUS (act freely):
+- Reading files
+- Running read-only queries
+- Searching codebases
+- Generating suggestions (not applying them)
+
+CONFIRM (act, show, allow undo):
+- Editing existing files
+- Creating new files in expected locations
+- Running tests
+- Installing dev dependencies
+
+APPROVE (propose, wait for yes):
+- Deleting files
+- Modifying configuration
+- Running destructive commands
+- Changing auth/security code
+- Making API calls with side effects
+
+APPROVE + WAIT (high ceremony):
+- Deploying to production
+- Modifying database schema
+- Changing payment logic
+- Force-pushing to shared branches
+- Deleting user data
+```
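The Reversibility x Impact matrix above can be read as a plain lookup table. The following is a minimal sketch of that reading, not code shipped in the package: the names `Impact`, `Reversibility`, `Level`, `MATRIX`, and `classify` are hypothetical and chosen only for illustration.

```python
from enum import Enum


class Impact(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


class Reversibility(Enum):
    EASY = "easily_reversible"
    HARD = "hard_to_reverse"
    IRREVERSIBLE = "irreversible"


class Level(Enum):
    AUTONOMOUS = "AUTONOMOUS"
    CONFIRM = "CONFIRM"
    APPROVE = "APPROVE"
    APPROVE_AND_WAIT = "APPROVE + WAIT"


# Direct transcription of the matrix rows (hypothetical encoding).
MATRIX = {
    (Impact.LOW, Reversibility.EASY): Level.AUTONOMOUS,
    (Impact.LOW, Reversibility.HARD): Level.AUTONOMOUS,
    (Impact.LOW, Reversibility.IRREVERSIBLE): Level.CONFIRM,
    (Impact.MEDIUM, Reversibility.EASY): Level.AUTONOMOUS,
    (Impact.MEDIUM, Reversibility.HARD): Level.CONFIRM,
    (Impact.MEDIUM, Reversibility.IRREVERSIBLE): Level.APPROVE,
    (Impact.HIGH, Reversibility.EASY): Level.CONFIRM,
    (Impact.HIGH, Reversibility.HARD): Level.APPROVE,
    (Impact.HIGH, Reversibility.IRREVERSIBLE): Level.APPROVE_AND_WAIT,
}


def classify(impact: Impact, reversibility: Reversibility) -> Level:
    """Return the supervision level for one (impact, reversibility) cell."""
    return MATRIX[(impact, reversibility)]
```

Keeping the matrix as data rather than nested conditionals makes it easy to review against the table and to adjust per project.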
+
+### Escalation Triggers
+
+3. **When to escalate (confidence-based):**
+```
+Always escalate when:
+- Confidence < 0.7 on the correct approach
+- Action is irreversible AND high-impact
+- Multiple valid approaches exist with no clear winner
+- Task contradicts prior user guidance
+- Security-sensitive code is being modified
+- User's intent is ambiguous
+
+Escalation format:
+"I need your input on [X].
+Context: [what I understand about the situation]
+Options: [A, B, C with tradeoffs]
+My recommendation: [preferred option + why]
+What I'm uncertain about: [specific uncertainty]"
+```
+
+Rules:
+- ALWAYS explain WHY you're escalating (don't just say "I'm not sure")
+- ALWAYS provide a recommendation (even when uncertain)
+- ALWAYS state what additional context would resolve the uncertainty
+- Never escalate without having done research first (don't be lazy)
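As an illustration of the "always escalate when" list, the sketch below collects every trigger that fires so the escalation can state its reasons, in line with the rule to explain WHY. `ActionContext`, its field names, and `escalation_reasons` are hypothetical; only the 0.7 floor and the six conditions come from the skill text.

```python
from dataclasses import dataclass

CONFIDENCE_FLOOR = 0.7  # threshold taken from the trigger list


@dataclass(frozen=True)
class ActionContext:
    """Hypothetical snapshot of what the agent knows before acting."""
    confidence: float
    irreversible: bool = False
    high_impact: bool = False
    no_clear_winner: bool = False
    contradicts_prior_guidance: bool = False
    touches_security_code: bool = False
    intent_ambiguous: bool = False


def escalation_reasons(ctx: ActionContext) -> list[str]:
    """Return every trigger that fires; an empty list means no escalation."""
    reasons = []
    if ctx.confidence < CONFIDENCE_FLOOR:
        reasons.append(f"confidence {ctx.confidence:.2f} is below {CONFIDENCE_FLOOR}")
    if ctx.irreversible and ctx.high_impact:
        reasons.append("action is irreversible and high-impact")
    if ctx.no_clear_winner:
        reasons.append("multiple valid approaches with no clear winner")
    if ctx.contradicts_prior_guidance:
        reasons.append("task contradicts prior user guidance")
    if ctx.touches_security_code:
        reasons.append("security-sensitive code is being modified")
    if ctx.intent_ambiguous:
        reasons.append("user intent is ambiguous")
    return reasons
```

Returning the reasons instead of a bare boolean gives the escalation message something concrete to quote.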
+
+### Approval Gate Design
+
+4. **Designing low-friction approval UX:**
+```
+Principles:
+- Fast: approval should take <5 seconds for clear cases
+- Informative: show WHAT will happen, not just ask "ok?"
+- Defaulted: suggest the likely answer (approve/reject)
+- Skippable: allow bulk-approve for repetitive low-risk items
+- Auditable: log every approval decision with timestamp and rationale
+
+Good approval request:
+"I'd like to add an index on users.email (migration file ready).
+This will lock the table for ~2 seconds during deploy.
+[Approve] [Reject] [Show migration SQL first]"
+
+Bad approval request:
+"Can I make a database change?"
+```
+
+Rules:
+- Show the EFFECT of the action, not just the action itself
+- Provide enough context to decide without further research
+- Offer a way to see more detail (for cautious reviewers)
+- Default to the safe option (reject) for high-impact actions
+- Time-box approvals: if no response in X hours, remind or escalate
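A rough sketch of how an approval request might carry its effect, a default answer, and a time-box follows. Everything here is an assumption for illustration: the `ApprovalRequest` class, its fields, the 4-hour placeholder for the unspecified "X hours", and the choice to default low-impact requests to approve (the text only fixes reject as the default for high-impact actions).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ApprovalRequest:
    """Hypothetical approval request: the action plus its visible effect."""
    action: str
    effect: str
    high_impact: bool
    requested_at: datetime
    # "X hours" is left open in the skill text; 4 hours is a placeholder.
    timebox: timedelta = timedelta(hours=4)

    @property
    def default_choice(self) -> str:
        # High-impact actions default to the safe option (reject).
        # Defaulting everything else to approve is an assumption.
        return "reject" if self.high_impact else "approve"

    def is_overdue(self, now: datetime) -> bool:
        """True once the time-box has elapsed and a reminder is due."""
        return now - self.requested_at >= self.timebox

    def render(self) -> str:
        """Show the action, its effect, and the choices with the default marked."""
        buttons = ["Approve", "Reject", "Show details"]
        default = self.default_choice.capitalize()
        labels = [f"[{b} (default)]" if b == default else f"[{b}]" for b in buttons]
        return f"{self.action}\n{self.effect}\n{' '.join(labels)}"
```

Passing `now` in explicitly keeps the time-box check deterministic and easy to test.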
+
+### Confidence Calibration
+
+5. **Ensuring confidence scores are meaningful:**
+```
+Calibration goal:
+When the agent says "I'm 90% confident" → it should be correct 90% of the time
+When the agent says "I'm 50% confident" → it should be correct 50% of the time
+
+Measuring calibration:
+- Collect (confidence, actual_outcome) pairs from eval runs
+- Plot calibration curve (expected accuracy vs actual accuracy)
+- Perfect calibration = diagonal line
+- Overconfident = curve below diagonal (says 90%, is right 70%)
+- Underconfident = curve above diagonal (says 50%, is right 80%)
+
+Fixing miscalibration:
+- Overconfident: lower confidence thresholds (escalate more)
+- Underconfident: raise thresholds (escalate less, trust yourself)
+- Recalibrate after major model/prompt changes
+```
+
+Rules:
+- Calibrate quarterly (or after any major agent change)
+- If overconfident: the agent is making unescalated mistakes → tighten boundaries
+- If underconfident: the agent is annoying users with unnecessary escalations → loosen
+- Track calibration as a first-class metric (alongside accuracy)
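The measurement steps above can be sketched with equal-width confidence bins. This is one possible approach, not the package's method: the function names, the 10-bin default, the 0.05 tolerance, and the use of expected calibration error as a single summary number are all assumptions introduced here.

```python
def calibration_bins(pairs, n_bins=10):
    """Group (confidence, correct) pairs into equal-width confidence bins.

    Returns (mean_confidence, accuracy, count) per non-empty bin, i.e. the
    points of a calibration curve.
    """
    buckets = [[] for _ in range(n_bins)]
    for confidence, correct in pairs:
        idx = min(int(confidence * n_bins), n_bins - 1)  # 1.0 goes in the last bin
        buckets[idx].append((confidence, correct))
    rows = []
    for bucket in buckets:
        if not bucket:
            continue
        mean_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        rows.append((mean_conf, accuracy, len(bucket)))
    return rows


def expected_calibration_error(pairs, n_bins=10):
    """Count-weighted mean of |confidence - accuracy| across bins."""
    rows = calibration_bins(pairs, n_bins)
    total = sum(n for _, _, n in rows)
    if total == 0:
        raise ValueError("no (confidence, outcome) pairs supplied")
    return sum(abs(conf - acc) * n for conf, acc, n in rows) / total


def diagnose(pairs, tolerance=0.05, n_bins=10):
    """Label the agent by the signed, count-weighted confidence-accuracy gap."""
    rows = calibration_bins(pairs, n_bins)
    total = sum(n for _, _, n in rows)
    if total == 0:
        raise ValueError("no (confidence, outcome) pairs supplied")
    gap = sum((conf - acc) * n for conf, acc, n in rows) / total
    if gap > tolerance:
        return "overconfident"   # curve below the diagonal
    if gap < -tolerance:
        return "underconfident"  # curve above the diagonal
    return "calibrated"
```

The per-bin rows are what a calibration plot would show; the signed gap only summarizes which side of the diagonal the curve sits on overall.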
+
+### Explanation Quality
+
+6. **How to explain escalations effectively:**
+```
+Explanation structure:
+1. WHAT: what you're asking about (specific, concrete)
+2. WHY: why you can't decide autonomously (the uncertainty)
+3. OPTIONS: what the choices are (with tradeoffs)
+4. RECOMMENDATION: what you'd do if forced to decide
+5. CONTEXT_GAP: what information would let you decide next time
+
+Good explanation:
+"I found two approaches to implement caching (Redis vs in-memory).
+Redis is more robust but adds infrastructure cost.
+In-memory is simpler but won't survive restarts.
+I'd lean toward Redis for production, but I don't know your infra budget.
+If you tell me the acceptable monthly cost, I can decide this autonomously next time."
+
+Bad explanation:
+"Should I use Redis or in-memory caching?"
+```
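One way to keep explanations from degrading into the "bad" form is to make the five parts required fields. The sketch below assumes a hypothetical `Escalation` record whose `render` method refuses to produce output when any part is missing; the class and its layout are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Escalation:
    """Hypothetical container for the five-part explanation structure."""
    what: str
    why: str
    options: tuple[tuple[str, str], ...]  # (option name, tradeoff) pairs
    recommendation: str
    context_gap: str

    def render(self) -> str:
        """Render all five parts, or raise if any part is empty."""
        missing = [
            name
            for name in ("what", "why", "recommendation", "context_gap")
            if not getattr(self, name).strip()
        ]
        if not self.options:
            missing.append("options")
        if missing:
            raise ValueError(f"incomplete escalation, missing: {', '.join(missing)}")
        lines = [f"WHAT: {self.what}", f"WHY: {self.why}", "OPTIONS:"]
        lines += [f"  - {name}: {tradeoff}" for name, tradeoff in self.options]
        lines += [
            f"RECOMMENDATION: {self.recommendation}",
            f"CONTEXT_GAP: {self.context_gap}",
        ]
        return "\n".join(lines)
```

With this shape, the Redis example would carry two options with their tradeoffs, and the bare question "Should I use Redis or in-memory caching?" could not be rendered at all.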
+
+### Trust Building (Progressive Autonomy)
+
+7. **Earning autonomy over time:**
+```
+Trust levels:
+Level 1 — New agent (restrictive):
+- APPROVE for any write operation
+- CONFIRM for most read operations
+- Escalation rate: high (~30% of actions)
+
+Level 2 — Established (standard):
+- AUTONOMOUS for reads and standard writes
+- CONFIRM for destructive operations
+- APPROVE for irreversible high-impact actions
+- Escalation rate: moderate (~10%)
+
+Level 3 — Trusted (permissive):
+- AUTONOMOUS for most operations
+- CONFIRM for irreversible actions
+- APPROVE only for production deploys and security changes
+- Escalation rate: low (~3%)
+
+Level transitions:
+- Promote: 20 consecutive successful autonomous actions without user correction
+- Demote: 1 autonomous action that user explicitly reverses or flags as wrong
+- Demotion is faster than promotion (trust is earned slowly, lost quickly)
+```
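The level transitions can be sketched as a small state machine. The promote-after-20 and demote-after-1 numbers come from the text; the `TrustTracker` name, demoting by exactly one level, and resetting the streak after a promotion are assumptions made for this illustration.

```python
from dataclasses import dataclass

PROMOTE_AFTER = 20          # consecutive uncorrected autonomous actions
MIN_LEVEL, MAX_LEVEL = 1, 3


@dataclass
class TrustTracker:
    """Hypothetical tracker for the three trust levels."""
    level: int = MIN_LEVEL
    streak: int = 0

    def record(self, user_corrected: bool) -> int:
        """Record one autonomous action and return the resulting level."""
        if user_corrected:
            # A single reversed or flagged action demotes immediately.
            # Dropping exactly one level is an assumption.
            self.level = max(MIN_LEVEL, self.level - 1)
            self.streak = 0
        else:
            self.streak += 1
            if self.streak >= PROMOTE_AFTER and self.level < MAX_LEVEL:
                self.level += 1
                self.streak = 0  # the next promotion needs a fresh streak
        return self.level
```

The asymmetry in the text (slow to earn, quick to lose) shows up as the difference between the 20-action streak and the single-correction demotion.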
+
+### Monitoring Escalation Health
+
+8. **Tracking escalation quality:**
+```
+Metrics to monitor:
+- Escalation rate: % of actions that require human input
+- False escalation rate: % of escalations where human says "just do it"
+- Missed escalation rate: % of autonomous actions that were wrong
+- Approval latency: time between escalation and human response
+- Rubber-stamp rate: % of approvals decided in <2 seconds (too fast = not reading)
+
+Healthy ranges:
+- Escalation rate: 5-15% (too low = risky, too high = annoying)
+- False escalation rate: <20% (too high = boundaries too tight)
+- Missed escalation rate: <2% (too high = boundaries too loose)
+- Rubber-stamp rate: <30% (too high = approval fatigue, redesign needed)
+```
+
+Rules:
+- If rubber-stamp rate is high: reduce approval friction or widen autonomy
+- If missed escalation rate is high: tighten boundaries immediately
+- Review escalation metrics weekly (don't let them drift)
+- Treat high rubber-stamp rate as a UX bug (users are being annoyed, not helped)
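The four rate metrics with stated healthy ranges can be computed from simple counters. The sketch below assumes a hypothetical `EscalationStats` record and `health_report` function; the thresholds are the ones listed above, and approval latency is left out because the text gives no healthy range for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EscalationStats:
    """Hypothetical counters gathered over one review period."""
    total_actions: int
    escalations: int
    false_escalations: int   # human answered "just do it"
    autonomous_wrong: int    # autonomous actions later found wrong
    approvals: int
    fast_approvals: int      # approvals decided in under 2 seconds


def health_report(s: EscalationStats) -> dict[str, tuple[float, bool]]:
    """Return {metric: (rate, is_healthy)} using the ranges from the text."""
    if s.total_actions <= 0:
        raise ValueError("total_actions must be positive")
    autonomous = s.total_actions - s.escalations
    escalation = s.escalations / s.total_actions
    false_esc = s.false_escalations / s.escalations if s.escalations else 0.0
    missed = s.autonomous_wrong / autonomous if autonomous else 0.0
    rubber = s.fast_approvals / s.approvals if s.approvals else 0.0
    return {
        "escalation_rate": (escalation, 0.05 <= escalation <= 0.15),
        "false_escalation_rate": (false_esc, false_esc < 0.20),
        "missed_escalation_rate": (missed, missed < 0.02),
        "rubber_stamp_rate": (rubber, rubber < 0.30),
    }
```

A weekly review could then print only the entries whose second element is `False`.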
+
+## Self-check before task completion
+
+Before marking a task done when this skill was active:
+
+- [ ] Are actions classified by reversibility x impact (autonomous/confirm/approve)?
+- [ ] Do escalation triggers include: low confidence, irreversible actions, ambiguity?
+- [ ] Is the approval UX low-friction (fast, informative, defaulted)?
+- [ ] Are explanations structured (what, why, options, recommendation, context gap)?
+- [ ] Is progressive autonomy designed (trust levels with promotion/demotion)?
+- [ ] Are escalation health metrics defined (false escalation rate, rubber-stamp rate)?
+- [ ] Is confidence calibration measured (says 90% → right 90%)?
+- [ ] Does the guardrails-and-safety skill co-activate for safety-critical boundaries?
@@ -0,0 +1,147 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: i18n-architecture
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 0.3.0
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: i18n architecture, message catalog, pluralization rule, ICU message format, RTL layout, locale detection, translation loading, internationalization setup, language fallback, number formatting, date locale, translation management
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Internationalization Architecture
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
Any task involving multi-language support, locale handling, message catalogs,
|
|
13
|
+
RTL layouts, number/date formatting, or translation infrastructure.
|
|
14
|
+
|
|
15
|
+
## Mandatory actions when this skill is active
|
|
16
|
+
|
|
17
|
+
### Before implementing i18n
|
|
18
|
+
1. Audit all user-facing strings in the codebase.
|
|
19
|
+
2. Define the locale detection strategy.
|
|
20
|
+
3. Choose a message format that handles plurals and gender.
|
|
21
|
+
|
|
22
|
+
### Message format (ICU MessageFormat)
|
|
23
|
+
|
|
24
|
+
**Why ICU:**
|
|
25
|
+
- Handles plurals correctly across languages (some have 6 plural forms).
|
|
26
|
+
- Handles gender agreement.
|
|
27
|
+
- Handles select/choice patterns.
|
|
28
|
+
- Industry standard supported by most i18n libraries.
|
|
29
|
+
|
|
30
|
+
**Examples:**
|
|
31
|
+
```
{count, plural,
  =0 {No items}
  one {# item}
  other {# items}
}

{gender, select,
  male {He liked your post}
  female {She liked your post}
  other {They liked your post}
}
```
|
|
44
|
+
|
|
45
|
+
**Critical rule:** NEVER concatenate strings for messages.
|
|
46
|
+
- BAD: `"Hello " + name + ", you have " + count + " messages"`
|
|
47
|
+
- GOOD: `"Hello {name}, you have {count, plural, one {# message} other {# messages}}"`
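
To illustrate why plural selection belongs in the message format rather than in string concatenation, here is a minimal sketch built on the standard `Intl.PluralRules` API. The `formatPlural` helper and the shape of the `forms` object are hypothetical and only mimic a single ICU `plural` argument; a real ICU MessageFormat library parses the full syntax for you.

```javascript
// Hypothetical helper: picks a plural branch the way an ICU `plural` argument would.
// Exact matches such as "=0" win over CLDR categories ("one", "few", "other", ...).
function formatPlural(locale, count, forms) {
  const exact = forms[`=${count}`];
  const category = new Intl.PluralRules(locale).select(count);
  const template = exact ?? forms[category] ?? forms.other;
  // "#" stands for the locale-formatted number, as in ICU MessageFormat.
  return template.replace("#", new Intl.NumberFormat(locale).format(count));
}

const itemForms = { "=0": "No items", one: "# item", other: "# items" };

console.log(formatPlural("en", 0, itemForms)); // No items
console.log(formatPlural("en", 1, itemForms)); // 1 item
console.log(formatPlural("en", 5, itemForms)); // 5 items
```

Because the category comes from the locale's plural rules, the same call works for languages with more than two plural forms, provided the translator supplies the extra branches.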
|
|
48
|
+
|
|
49
|
+
### Catalog structure
|
|
50
|
+
|
|
51
|
+
**One file per locale, namespaced by feature:**
|
|
52
|
+
```
locales/
  en/
    common.json
    auth.json
    dashboard.json
  fr/
    common.json
    auth.json
    dashboard.json
```
|
|
63
|
+
|
|
64
|
+
**Rules:**
|
|
65
|
+
- Keys are semantic, not the English text (`auth.loginButton` not `"Log in"`).
|
|
66
|
+
- Flat keys with dot notation or nested objects — pick one, be consistent.
|
|
67
|
+
- Never store HTML in translation strings (use interpolation components).
|
|
68
|
+
- Keep a "base" locale (usually en) as the source of truth.
|
|
69
|
+
|
|
70
|
+
### Loading strategy
|
|
71
|
+
|
|
72
|
+
**Lazy-load per route/namespace:**
|
|
73
|
+
- Do NOT load all locales upfront — only the active locale.
|
|
74
|
+
- Do NOT load all namespaces — only what the current route needs.
|
|
75
|
+
- Prefetch the next likely namespace on navigation intent.
|
|
76
|
+
|
|
77
|
+
**Implementation:**
|
|
78
|
+
```javascript
// Load only when needed. Assumes a bundler that resolves JSON modules;
// the parsed catalog is exposed as the module's default export.
const { default: messages } = await import(`./locales/${locale}/${namespace}.json`);
```
|
|
82
|
+
|
|
83
|
+
**Fallback chain:**
|
|
84
|
+
- Specific locale (fr-CA) → base locale (fr) → default locale (en).
|
|
85
|
+
- Missing key in active locale → fall back, log warning in development.
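
The fallback chain above can be sketched as follows. This is an illustrative sketch, not the behavior of any specific i18n library: `fallbackChain`, `resolveMessage`, and the `{ [locale]: { [key]: message } }` catalog shape are all assumptions.

```javascript
// Builds the lookup order for a locale tag: "fr-CA" -> ["fr-CA", "fr", "en"].
function fallbackChain(locale, defaultLocale = "en") {
  const chain = [];
  const parts = locale.split("-");
  // Drop one subtag at a time: "zh-Hant-TW" -> "zh-Hant" -> "zh".
  while (parts.length > 0) {
    chain.push(parts.join("-"));
    parts.pop();
  }
  if (!chain.includes(defaultLocale)) chain.push(defaultLocale);
  return chain;
}

// Hypothetical catalog shape: { [locale]: { [key]: message } }.
function resolveMessage(catalogs, locale, key, { dev = false } = {}) {
  for (const candidate of fallbackChain(locale)) {
    const message = catalogs[candidate]?.[key];
    if (message !== undefined) {
      if (dev && candidate !== locale) {
        console.warn(`Missing "${key}" in ${locale}; fell back to ${candidate}`);
      }
      return message;
    }
  }
  return key; // Last resort: show the key so the gap is visible.
}
```

Returning the key itself when nothing matches is one possible design choice; it keeps the UI rendering while making the missing translation obvious.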
|
|
86
|
+
|
|
87
|
+
### Locale detection
|
|
88
|
+
|
|
89
|
+
**Priority order:**
|
|
90
|
+
1. User explicit preference (stored in profile/cookie).
|
|
91
|
+
2. Accept-Language header (server-side).
|
|
92
|
+
3. `navigator.language` (client-side).
|
|
93
|
+
4. Geo-IP lookup (least reliable).
|
|
94
|
+
5. Default locale (en).
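
As a rough sketch of this priority order, the function below walks the signals from most to least trusted and returns the first supported match. The `signals` object shape, `detectLocale`, and `matchSupported` are hypothetical names, and the `Accept-Language` parsing is deliberately simplified (it handles `q` weights but not every edge case of the header grammar).

```javascript
// Parses an Accept-Language header into tags ordered by descending q-value.
function parseAcceptLanguage(header) {
  return header
    .split(",")
    .map((part) => {
      const [tag, ...params] = part.trim().split(";");
      const q = params.map((p) => p.trim()).find((p) => p.startsWith("q="));
      return { tag: tag.trim(), q: q ? parseFloat(q.slice(2)) : 1 };
    })
    .filter(({ tag, q }) => tag && tag !== "*" && q > 0)
    .sort((a, b) => b.q - a.q)
    .map(({ tag }) => tag);
}

// Matches a requested tag against supported locales: exact match first, then language only.
function matchSupported(tag, supported) {
  const wanted = tag.toLowerCase();
  const exact = supported.find((s) => s.toLowerCase() === wanted);
  if (exact) return exact;
  const language = wanted.split("-")[0];
  return supported.find((s) => s.toLowerCase() === language);
}

// Walks the signals in priority order and returns the first supported locale.
function detectLocale(signals, supported, defaultLocale = "en") {
  const candidates = [
    signals.userPreference,
    ...parseAcceptLanguage(signals.acceptLanguage ?? ""),
    signals.navigatorLanguage,
    signals.geoLocale,
  ];
  for (const candidate of candidates) {
    const match = candidate && matchSupported(candidate, supported);
    if (match) return match;
  }
  return defaultLocale;
}
```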
|
|
95
|
+
|
|
96
|
+
**Rules:**
|
|
97
|
+
- Let users override detected locale at any time.
|
|
98
|
+
- Persist user choice across sessions.
|
|
99
|
+
- URL strategy: subdomain (fr.app.com) or path prefix (/fr/dashboard).
|
|
100
|
+
|
|
101
|
+
### RTL layout support
|
|
102
|
+
|
|
103
|
+
**CSS logical properties (mandatory):**
|
|
104
|
+
- Use `margin-inline-start` not `margin-left`.
|
|
105
|
+
- Use `padding-inline-end` not `padding-right`.
|
|
106
|
+
- Use `inset-inline-start` not `left`.
|
|
107
|
+
- Use `border-inline-start` not `border-left`.
|
|
108
|
+
|
|
109
|
+
**HTML:**
|
|
110
|
+
- Set `dir="rtl"` on the `<html>` element based on locale.
|
|
111
|
+
- Use `dir="auto"` on user-generated content.
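
One way to derive the `dir` value from the active locale is sketched below. The `RTL_LANGUAGES` set is an assumed, hand-maintained list covering only a few common right-to-left languages, and `directionFor` is a hypothetical helper name.

```javascript
// Assumption: a hand-maintained list of right-to-left language subtags.
const RTL_LANGUAGES = new Set(["ar", "he", "fa", "ur"]);

function directionFor(locale) {
  // Intl.Locale normalizes the tag, so "AR-eg" and "ar-EG" behave the same.
  const { language } = new Intl.Locale(locale);
  return RTL_LANGUAGES.has(language) ? "rtl" : "ltr";
}

// In a browser this would be applied once whenever the locale changes:
// document.documentElement.dir = directionFor(activeLocale);
// document.documentElement.lang = activeLocale;
```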
|
|
112
|
+
|
|
113
|
+
**Icons and images:**
|
|
114
|
+
- Mirror directional icons (arrows, progress indicators) in RTL.
|
|
115
|
+
- Do NOT mirror: logos, clocks, phone icons, checkmarks.
|
|
116
|
+
|
|
117
|
+
### Number and date formatting
|
|
118
|
+
|
|
119
|
+
**Always use Intl APIs:**
|
|
120
|
+
```javascript
// Numbers
new Intl.NumberFormat(locale, { style: 'currency', currency: 'USD' }).format(amount);

// Dates
new Intl.DateTimeFormat(locale, { dateStyle: 'long', timeStyle: 'short' }).format(date);

// Relative time
new Intl.RelativeTimeFormat(locale, { numeric: 'auto' }).format(-1, 'day');
```
|
|
130
|
+
|
|
131
|
+
**Rules:**
|
|
132
|
+
- Never manually format numbers or dates with string templates.
|
|
133
|
+
- Store dates in UTC; display them in the user's time zone.
|
|
134
|
+
- Currency display must respect locale (symbol position, separator).
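
The rules above can be seen in a small runnable sketch: the same amount is formatted for two locales, and a single stored UTC instant is displayed in two time zones. The sample values and the `hourIn` helper are illustrative assumptions.

```javascript
const amount = 1234.5;

// Same value, same currency, different locale conventions.
const us = new Intl.NumberFormat("en-US", { style: "currency", currency: "USD" }).format(amount);
const de = new Intl.NumberFormat("de-DE", { style: "currency", currency: "USD" }).format(amount);

// Store the instant in UTC; choose the zone only at display time.
const storedUtc = new Date("2024-01-15T12:00:00Z");

function hourIn(timeZone) {
  const parts = new Intl.DateTimeFormat("en-GB", {
    hour: "2-digit",
    hourCycle: "h23",
    timeZone,
  }).formatToParts(storedUtc);
  return parts.find((part) => part.type === "hour").value;
}

console.log(us); // $1,234.50
console.log(de); // the symbol moves after the number and the separators swap
console.log(hourIn("UTC"), hourIn("Asia/Tokyo"));
```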
|
|
135
|
+
|
|
136
|
+
### Translation management
|
|
137
|
+
|
|
138
|
+
- Use a translation management system (Crowdin, Lokalise, Phrase) for professional translations.
|
|
139
|
+
- Extract new keys automatically from code (i18next-parser, formatjs extract).
|
|
140
|
+
- CI check: fail if base locale has keys missing from other locales.
|
|
141
|
+
- Pseudo-localization in development to catch hardcoded strings and layout overflow.
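
The CI check for missing keys could look roughly like the sketch below. It assumes catalogs have already been loaded into a `{ [locale]: catalogObject }` map; `flattenKeys` and `findMissingKeys` are hypothetical helpers, not part of i18next-parser or formatjs.

```javascript
// Flattens nested catalogs to dot-notation keys: { auth: { loginButton } } -> "auth.loginButton".
function flattenKeys(catalog, prefix = "") {
  return Object.entries(catalog).flatMap(([key, value]) => {
    const path = prefix ? `${prefix}.${key}` : key;
    return value !== null && typeof value === "object"
      ? flattenKeys(value, path)
      : [path];
  });
}

// Returns, per locale, the base-locale keys that have no translation yet.
function findMissingKeys(catalogs, baseLocale = "en") {
  const baseKeys = flattenKeys(catalogs[baseLocale]);
  const missing = {};
  for (const [locale, catalog] of Object.entries(catalogs)) {
    if (locale === baseLocale) continue;
    const present = new Set(flattenKeys(catalog));
    const gaps = baseKeys.filter((key) => !present.has(key));
    if (gaps.length > 0) missing[locale] = gaps;
  }
  return missing;
}

// In CI (hypothetical wiring): fail the job when any locale has gaps, e.g.
// if (Object.keys(findMissingKeys(catalogs)).length > 0) process.exitCode = 1;
```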
|
|
142
|
+
|
|
143
|
+
## Self-check before task completion
|
|
144
|
+
- [ ] Did I follow the mandatory actions for this skill?
|
|
145
|
+
- [ ] Did I apply the patterns appropriate to the context?
|
|
146
|
+
- [ ] Did I verify the implementation meets the criteria above?
|
|
147
|
+
- [ ] Did I document decisions and trade-offs made?
|
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: idempotency-patterns
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.9
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: idempotency pattern, idempotency key, exactly once semantic, deduplication strategy, replay safety, idempotent consumer, idempotent api, request deduplication, operation retry safety, duplicate detection, idempotent write, at-most-once processing
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Skill — Idempotency Patterns
|
|
10
|
+
|
|
11
|
+
## When this skill activates
|
|
12
|
+
Any task involving idempotent API design, idempotency keys, exactly-once semantics,
|
|
13
|
+
deduplication, replay safety, or retry-safe operations.
|
|
14
|
+
|
|
15
|
+
## Mandatory actions when this skill is active
|
|
16
|
+
|
|
17
|
+
### Before writing any code
|
|
18
|
+
1. Identify which operations MUST be idempotent (any retryable operation).
|
|
19
|
+
2. Determine key strategy (client-generated UUID in header).
|
|
20
|
+
3. Choose storage backend (Redis for short-lived, DB for permanent records).
|
|
21
|
+
|
|
22
|
+
### During implementation
|
|
23
|
+
- Accept keys via `Idempotency-Key` header on all POST and PATCH endpoints.
|
|
24
|
+
- Store complete response with the key (status + body, not just success flag).
|
|
25
|
+
- Set appropriate TTL on idempotency records.
|
|
26
|
+
- Handle concurrent duplicates (lock or 409 Conflict).
|
|
27
|
+
|
|
28
|
+
### After implementation
|
|
29
|
+
- Test concurrent duplicate requests (race condition safety).
|
|
30
|
+
- Verify partial failures are NOT cached (only complete operations).
|
|
31
|
+
- Document which endpoints are idempotent and key requirements.
|
|
32
|
+
|
|
33
|
+
## Core Flow
|
|
34
|
+
|
|
35
|
+
```
1. Receive request with Idempotency-Key
2. Key exists in store? → YES: return cached response. NO: continue.
3. Atomically acquire a lock on the key (if it is already held, another request is in flight: wait or return 409 Conflict)
4. Execute operation
5. Store: key → {status_code, body, created_at}
6. Release lock, return response
```
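
The flow above can be sketched with in-memory stand-ins for the store and the lock. Everything here (`handleIdempotent`, the `responses` map, the `inFlight` set) is a hypothetical illustration. In a single Node.js process the check and the lock are effectively atomic because no `await` sits between them; across processes the same guarantee would have to come from the store itself, for example a Redis `SET` with the `NX` option.

```javascript
// In-memory stand-ins for the store and lock; a real service would use Redis or a database.
const responses = new Map(); // scopedKey -> { statusCode, body, createdAt }
const inFlight = new Set();  // scopedKeys currently being processed

async function handleIdempotent(scopedKey, operation) {
  // Step 2: replay the cached response if this key already completed.
  const cached = responses.get(scopedKey);
  if (cached) return { ...cached, replayed: true };

  // Step 3: take the lock; a concurrent duplicate gets 409 instead of running twice.
  if (inFlight.has(scopedKey)) {
    return { statusCode: 409, body: { error: "request in progress" }, replayed: false };
  }
  inFlight.add(scopedKey);

  try {
    // Step 4: run the operation while holding the lock.
    const { statusCode, body } = await operation();
    // Step 5: cache the outcome unless it is a 5xx, which may succeed on retry.
    if (statusCode < 500) {
      responses.set(scopedKey, { statusCode, body, createdAt: Date.now() });
    }
    return { statusCode, body, replayed: false };
  } finally {
    // Step 6: always release the lock, even if the operation threw.
    inFlight.delete(scopedKey);
  }
}
```

A caller would set the `Idempotent-Replayed: true` response header whenever `replayed` is true.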
|
|
43
|
+
|
|
44
|
+
## Idempotency Key Design
|
|
45
|
+
- Client-generated UUID v4 or ULID. Same key = same intended operation.
|
|
46
|
+
- Scope per-endpoint AND per-user: `idempotency:{user}:{endpoint}:{key}`.
|
|
47
|
+
- Max length: 255 chars. Stable across retries of same business action.
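
A small sketch of the scoping rule follows. `buildScopedKey` is a hypothetical helper, and applying the 255-character limit to the client-supplied key (rather than to the full storage key) is an assumption.

```javascript
const MAX_KEY_LENGTH = 255;

// Builds the storage key `idempotency:{user}:{endpoint}:{key}` from the client-supplied key.
function buildScopedKey(userId, endpoint, clientKey) {
  if (typeof clientKey !== "string" || clientKey.length === 0) {
    throw new Error("Idempotency-Key header is required");
  }
  if (clientKey.length > MAX_KEY_LENGTH) {
    throw new Error(`Idempotency-Key exceeds ${MAX_KEY_LENGTH} characters`);
  }
  return `idempotency:${userId}:${endpoint}:${clientKey}`;
}
```

Scoping by user and endpoint means two clients that happen to send the same key never see each other's cached responses.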
|
|
48
|
+
|
|
49
|
+
## Storage Options
|
|
50
|
+
|
|
51
|
+
- **Redis**: fast, TTL built-in, atomic via Lua. Use for API requests (24h TTL).
|
|
52
|
+
- **Database**: durable, queryable. Use for financial/audit operations. Needs cleanup job.
|
|
53
|
+
|
|
54
|
+
## Database Patterns
|
|
55
|
+
|
|
56
|
+
- **INSERT ... ON CONFLICT DO NOTHING**: safe insert; an empty RETURNING result means the row already existed (duplicate).
|
|
57
|
+
- **Conditional UPDATE**: `WHERE version = $expected` — 0 rows = stale (reject).
|
|
58
|
+
- **Transactional Outbox**: atomically persist state + event, poll and publish.
|
|
59
|
+
|
|
60
|
+
## Consumer Idempotency
|
|
61
|
+
- Dedup table: `processed_events(event_id PK, processed_at)`.
|
|
62
|
+
- Flow: check if processed → BEGIN → process → INSERT dedup → COMMIT → ACK.
|
|
63
|
+
- TTL: retain dedup records for broker retention + buffer (e.g., 14 days).
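
The consumer flow can be sketched with a `Map` standing in for the `processed_events` table. The `consume` function, the `event.id` field, and the `handler`/`ack` callbacks are hypothetical; in a real consumer the handler's writes and the dedup insert would share one database transaction.

```javascript
// In-memory stand-in for the processed_events table (event_id is the primary key).
const processedEvents = new Map(); // eventId -> processedAt

async function consume(event, handler, ack) {
  // Check if processed: a redelivered event is acknowledged without re-running the handler.
  if (processedEvents.has(event.id)) {
    await ack(event);
    return "duplicate";
  }
  // In a database, the next two steps share one transaction (BEGIN ... COMMIT),
  // so the side effect and the dedup row are committed together or not at all.
  await handler(event);
  processedEvents.set(event.id, new Date());
  // ACK only after the commit; a crash before this line causes a safe redelivery.
  await ack(event);
  return "processed";
}
```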
|
|
64
|
+
|
|
65
|
+
## API Design
|
|
66
|
+
|
|
67
|
+
- **GET/PUT/DELETE**: naturally idempotent (safe to retry without keys).
|
|
68
|
+
- **POST/PATCH**: require explicit `Idempotency-Key` header.
|
|
69
|
+
- Response headers: `Idempotent-Replayed: true` for cached responses.
|
|
70
|
+
|
|
71
|
+
## Error Handling
|
|
72
|
+
- 4xx client errors: cache (client should not retry same bad input).
|
|
73
|
+
- 5xx server errors: do NOT cache (may succeed on retry).
|
|
74
|
+
- Partial completion: do NOT cache (delete record, retry from scratch).
|
|
75
|
+
|
|
76
|
+
## Self-check before task completion
|
|
77
|
+
|
|
78
|
+
- [ ] Are idempotency keys accepted on all non-idempotent endpoints?
|
|
79
|
+
- [ ] Is the complete response cached (status + body)?
|
|
80
|
+
- [ ] Are concurrent duplicates handled safely (lock or 409)?
|
|
81
|
+
- [ ] Is TTL set appropriately on idempotency records?
|
|
82
|
+
- [ ] Are server errors excluded from caching?
|
|
83
|
+
- [ ] Are DB writes using conflict-safe patterns?
|
|
84
|
+
- [ ] Are message consumers deduplicating by event ID?
|
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: incident-communication
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.3.0
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: incident communication, war room coordination, customer incident message, postmortem facilitation, blameless culture, incident status page, outage communication, stakeholder notification incident, incident timeline, root cause communication, severity classification, incident bridge
|
|
7
|
+
compose:
|
|
8
|
+
- incident-management
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Incident Communication
|
|
12
|
+
|
|
13
|
+
## When this skill activates
|
|
14
|
+
|
|
15
|
+
This skill activates during production incidents when coordinating war rooms, writing customer-facing incident messages, updating status pages, communicating to stakeholders, facilitating postmortems, or establishing blameless culture. It applies to on-call engineers, incident commanders, and engineering managers responsible for incident response and communication.
|
|
16
|
+
|
|
17
|
+
## Mandatory actions when this skill is active
|
|
18
|
+
|
|
19
|
+
### Before incident communication begins
|
|
20
|
+
|
|
21
|
+
1. **Classify severity immediately** — SEV1 (business-critical service down), SEV2 (degraded performance, partial outage), SEV3 (minor issue, workaround available), SEV4 (cosmetic issue, no user impact). Severity determines communication cadence and escalation.
|
|
22
|
+
2. **Assign roles explicitly** — Incident Commander (coordinates response), Communications Lead (stakeholder updates), Tech Lead (drives technical resolution), Scribe (documents timeline). Ambiguous roles cause chaos.
|
|
23
|
+
3. **Establish communication channels** — War room (Slack/Teams/Zoom for internal), status page (for customers), stakeholder channel (for execs/product). Don't mix internal and external comms.
|
|
24
|
+
4. **Start the incident timeline immediately** — Document: when did it start, when was it detected, who's working on it, what's been tried, what's the current status. Real-time notes prevent memory loss.
|
|
25
|
+
|
|
26
|
+
### During incident response
|
|
27
|
+
|
|
28
|
+
#### War Room Coordination
|
|
29
|
+
|
|
30
|
+
- **Run structured check-ins every 15-30 minutes** — Incident Commander asks: (1) What's the current status? (2) What are we trying next? (3) Do we need more people? Prevents chaos and ensures everyone is aligned.
|
|
31
|
+
- **Use threaded communication** — Main channel for status updates only. Threads for technical debugging. Don't pollute the main channel with noisy debugging logs.
|
|
32
|
+
- **Limit the war room to essential people** — Too many cooks slow resolution. Core team: 3-5 people. Observers can follow in a read-only channel.
|
|
33
|
+
- **Escalate when stuck** — If the team is stuck for >30 minutes, escalate. Call in the architect, the team that owns the upstream service, or the engineer who built the system. Ego has no place in incidents.
|
|
34
|
+
- **Declare resolution criteria upfront** — What does "fixed" mean? Service is healthy? All users can transact? Error rate below threshold? Prevents premature "all clear" declarations.
|
|
35
|
+
|
|
36
|
+
#### Customer-Facing Communication
|
|
37
|
+
|
|
38
|
+
- **Acknowledge fast, diagnose later** — Within 15 minutes of detection, post to status page: "We are investigating reports of [service] being unavailable. Updates to follow." Don't wait for root cause.
|
|
39
|
+
- **Use the 3-part update structure** — (1) Current status (what's broken), (2) Customer impact (what can't they do), (3) Next steps (when's the next update). No jargon, no excuses.
|
|
40
|
+
- **Update cadence by severity** — SEV1: every 30 minutes. SEV2: every 60 minutes. SEV3: every 2 hours. Customers hate silence more than bad news.
|
|
41
|
+
- **Avoid over-promising** — Don't say "Fixed in 10 minutes" if you're unsure. Say "Actively working on resolution. Next update in 30 minutes."
|
|
42
|
+
- **Post resolution message** — Once resolved, post: what broke, how long it lasted, how many users were impacted, what we did to fix it, what we're doing to prevent recurrence. Transparency builds trust.
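
For teams that automate status page posts, the cadence and the 3-part structure above could be encoded roughly as follows. This is a sketch under assumptions: the function names, the input object shape, and the UTC time formatting are all hypothetical, and only the intervals come from the cadence listed above.

```javascript
// Minutes between customer-facing updates, per the cadence above.
const UPDATE_INTERVAL_MINUTES = { SEV1: 30, SEV2: 60, SEV3: 120 };

function nextUpdateDue(severity, lastUpdate) {
  const minutes = UPDATE_INTERVAL_MINUTES[severity];
  if (minutes === undefined) throw new Error(`No cadence defined for ${severity}`);
  return new Date(lastUpdate.getTime() + minutes * 60 * 1000);
}

// Three-part update: current status, customer impact, next steps.
function formatStatusUpdate({ status, impact, severity, postedAt }) {
  const next = nextUpdateDue(severity, postedAt);
  return [
    `Current status: ${status}`,
    `Customer impact: ${impact}`,
    `Next update by ${next.toISOString().slice(11, 16)} UTC.`,
  ].join("\n");
}
```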
|
|
43
|
+
|
|
44
|
+
#### Stakeholder Notification
|
|
45
|
+
|
|
46
|
+
- **Notify executives immediately for SEV1** — Execs need to know ASAP, especially if customers are complaining or revenue is impacted. One-line summary: "Service X is down. Y% of users impacted. We're on it."
|
|
47
|
+
- **Use BLUF (Bottom Line Up Front)** — Don't bury the lede. "The payment service is down" comes before "We suspect a database connection pool exhaustion."
|
|
48
|
+
- **Provide ETA cautiously** — If you estimate 2 hours to fix, tell stakeholders 3-4 hours. Better to resolve early than overpromise and underdeliver.
|
|
49
|
+
- **Summarize after resolution** — Once the incident is resolved, send a concise executive summary: what broke, how long, customer impact, resolution, next steps. Save the detailed RCA for the postmortem.
|
|
50
|
+
|
|
51
|
+
#### Incident Timeline Documentation
|
|
52
|
+
|
|
53
|
+
- **Scribe logs everything in real-time** — Timestamp every key event: detection, escalation, hypothesis tested, mitigation applied, resolution. Future-you will thank present-you.
|
|
54
|
+
- **Capture decisions and reasoning** — Don't just log "Rolled back to v1.3." Log "Rolled back to v1.3 because v1.4 introduced N+1 query causing DB saturation."
|
|
55
|
+
- **Include dead ends** — Document failed attempts: "Restarted service at 10:15am. Did not resolve issue." Prevents repeating failed approaches.
|
|
56
|
+
- **Link to relevant artifacts** — Logs, dashboards, PRs, alerts. Timeline should be a navigation hub, not a standalone document.
|
|
57
|
+
|
|
58
|
+
#### Root Cause Communication
|
|
59
|
+
|
|
60
|
+
- **Distinguish proximate cause from root cause** — Proximate: "The database ran out of connections." Root: "We didn't set a connection pool limit, so a spike in traffic exhausted connections." Root cause prevents recurrence.
|
|
61
|
+
- **Use the Five Whys** — Why did X happen? Because Y. Why did Y happen? Because Z. Repeat 5 times until you reach the systemic failure, not the surface symptom.
|
|
62
|
+
- **Avoid blame in RCA communication** — Don't say "Engineer A deployed bad code." Say "A deployment bypassed our automated testing due to a gap in CI pipeline." Focus on systems, not individuals.
|
|
63
|
+
- **Define prevention actions** — Every RCA must end with: what are we doing to prevent this from happening again? Action items with owners and deadlines.
|
|
64
|
+
|
|
65
|
+
#### Blameless Culture
|
|
66
|
+
|
|
67
|
+
- **Assume good intent** — Engineers don't cause incidents on purpose. Treat incidents as learning opportunities, not witch hunts.
|
|
68
|
+
- **Focus on systems, not people** — If one engineer's mistake caused an outage, the real failure is the system that allowed a single mistake to cascade. Fix the system.
|
|
69
|
+
- **Celebrate transparency** — When someone admits a mistake, praise their honesty. If people fear punishment, they hide mistakes until they explode.
|
|
70
|
+
- **Conduct blameless postmortems** — Postmortem facilitator enforces: no blame, no naming individuals (unless volunteering credit), focus on system improvements. If someone says "X person messed up," redirect: "What system gap allowed this to happen?"
|
|
71
|
+
|
|
72
|
+
#### Postmortem Facilitation
|
|
73
|
+
|
|
74
|
+
- **Schedule postmortem within 3-5 days** — Too soon: emotions are high, data is incomplete. Too late: memory fades, urgency disappears.
|
|
75
|
+
- **Invite all incident responders + key stakeholders** — Engineers who responded, on-call rotation, product/exec if customer-facing.
|
|
76
|
+
- **Use a structured template** — Summary, Timeline, Root Cause Analysis, What Went Well, What Went Poorly, Action Items (with owners and deadlines). Don't free-form it.
|
|
77
|
+
- **Timebox to 1 hour** — Longer meetings lose focus. If you can't cover it in 1 hour, schedule a follow-up.
|
|
78
|
+
- **Publish postmortem widely** — Share in engineering all-hands, team wikis, or public blog (if appropriate). Transparency accelerates learning across the org.
|
|
79
|
+
|
|
80
|
+
### After incident resolution
|
|
81
|
+
|
|
82
|
+
- **Close the loop with customers** — If customers were impacted, follow up: apologize, explain what broke, what you're doing to prevent recurrence. Consider service credits if appropriate.
|
|
83
|
+
- **Track action items to completion** — 70% of postmortem action items don't get done. Assign a DRI (Directly Responsible Individual) and review progress in sprint planning.
|
|
84
|
+
- **Measure incident response effectiveness** — Track: detection time, resolution time, communication cadence, customer satisfaction. Improve the metrics that matter.
|
|
85
|
+
- **Update runbooks** — Every incident reveals gaps in runbooks. After resolution, update the runbook so future responders have better context.
|
|
86
|
+
|
|
87
|
+
## Self-check before task completion
|
|
88
|
+
|
|
89
|
+
- [ ] Severity is classified (SEV1-4) and roles are assigned (IC, Comms Lead, Tech Lead, Scribe)
|
|
90
|
+
- [ ] War room check-ins happen every 15-30 minutes with structured updates
|
|
91
|
+
- [ ] Customer-facing communication acknowledges the issue within 15 minutes
|
|
92
|
+
- [ ] Status page updates use 3-part structure (status, impact, next steps) with no jargon
|
|
93
|
+
- [ ] Incident timeline is documented in real-time with timestamps and decisions
|
|
94
|
+
- [ ] Root cause distinguishes proximate cause from systemic root cause
|
|
95
|
+
- [ ] Postmortem is scheduled within 3-5 days and uses blameless facilitation
|
|
96
|
+
- [ ] Action items from postmortem have owners, deadlines, and are tracked to completion
|
|
@@ -0,0 +1,97 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: incident-management
|
|
3
|
+
version: 1.0.0
|
|
4
|
+
min_mindforge_version: 10.0.7
|
|
5
|
+
status: stable
|
|
6
|
+
triggers: incident management process, runbook authoring, severity classification framework, communication template design, on-call rotation design, escalation path design, blameless postmortem facilitation, incident timeline reconstruction, incident playbook, war room methodology, incident retrospective framework, pager rotation scheduling
|
|
7
|
+
compose:
|
|
8
|
+
- observability-stack
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Incident Management
|
|
12
|
+
|
|
13
|
+
## When this skill activates
|
|
14
|
+
|
|
15
|
+
This skill activates when the user is designing, implementing, or improving incident
|
|
16
|
+
management processes. This includes severity classification frameworks, runbook
|
|
17
|
+
authoring, on-call rotation scheduling, escalation path design, blameless postmortem
|
|
18
|
+
facilitation, war room methodology, communication templates for stakeholders, and
|
|
19
|
+
incident timeline reconstruction for retrospectives.
|
|
20
|
+
|
|
21
|
+
## Mandatory actions
|
|
22
|
+
|
|
23
|
+
### Before
|
|
24
|
+
|
|
25
|
+
1. Identify the current incident management maturity (ad-hoc, documented, measured, optimized).
|
|
26
|
+
2. Determine the team size, timezone coverage, and existing on-call tooling (PagerDuty, Opsgenie, etc.).
|
|
27
|
+
3. Assess existing runbook coverage and documentation gaps.
|
|
28
|
+
4. Review past incident frequency and mean-time-to-recovery (MTTR) trends.
|
|
29
|
+
5. Confirm observability stack integration (alerts feed into incident workflow).
|
|
30
|
+
|
|
31
|
+
### During
|
|
32
|
+
|
|
33
|
+
**Severity Framework:**
|
|
34
|
+
- **SEV1 (Critical):** Total service outage affecting all users. Revenue impact. All-hands response. Target acknowledgment: 5 minutes. Target resolution: 1 hour.
|
|
35
|
+
- **SEV2 (Major):** Significant degradation affecting many users. Key features unavailable. On-call + escalation. Target acknowledgment: 15 minutes. Target resolution: 4 hours.
|
|
36
|
+
- **SEV3 (Minor):** Partial degradation affecting a subset of users. Workarounds exist. On-call handles. Target acknowledgment: 30 minutes. Target resolution: 24 hours.
|
|
37
|
+
- **SEV4 (Low):** Minor issue with minimal user impact. Handled during business hours. Target resolution: 72 hours.
|
|
38
|
+
- Severity is determined by impact (users affected) x urgency (revenue/safety/compliance risk).
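
The targets in this framework can be captured as data so that tooling and reports use the same numbers. The sketch below is illustrative only: `SEVERITY_TARGETS` mirrors the values listed above, while `missedTargets` and its input shape are hypothetical.

```javascript
// Targets from the severity framework above, in minutes. SEV4 lists no acknowledgment target.
const SEVERITY_TARGETS = {
  SEV1: { acknowledgeMinutes: 5, resolveMinutes: 60 },
  SEV2: { acknowledgeMinutes: 15, resolveMinutes: 4 * 60 },
  SEV3: { acknowledgeMinutes: 30, resolveMinutes: 24 * 60 },
  SEV4: { acknowledgeMinutes: null, resolveMinutes: 72 * 60 },
};

// Reports which targets an incident missed, given elapsed minutes since detection.
function missedTargets(severity, { acknowledgedAfter, resolvedAfter }) {
  const targets = SEVERITY_TARGETS[severity];
  if (!targets) throw new Error(`Unknown severity: ${severity}`);
  const missed = [];
  if (targets.acknowledgeMinutes !== null && acknowledgedAfter > targets.acknowledgeMinutes) {
    missed.push("acknowledgment");
  }
  if (resolvedAfter > targets.resolveMinutes) missed.push("resolution");
  return missed;
}
```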
|
|
39
|
+
|
|
40
|
+
**Runbook Template:**
|
|
41
|
+
- **Trigger Conditions:** What alert or symptom initiates this runbook.
|
|
42
|
+
- **Investigation Steps:** Ordered diagnostic commands and checks (copy-pasteable).
|
|
43
|
+
- **Mitigation Actions:** Immediate steps to restore service (rollback, failover, scale).
|
|
44
|
+
- **Escalation Criteria:** When to involve additional teams or bump severity.
|
|
45
|
+
- **Communication Templates:** Pre-written status page updates and stakeholder messages.
|
|
46
|
+
- **Verification:** How to confirm the incident is resolved.
|
|
47
|
+
- Keep runbooks in version control, review quarterly, update after every incident.
|
|
48
|
+
|
|
49
|
+
**On-Call Rotation:**
|
|
50
|
+
- Follow-the-sun model for distributed teams (no one owns overnight).
|
|
51
|
+
- Maximum rotation length: 1 week. Longer causes burnout.
|
|
52
|
+
- Provide compensation (extra pay, time off, or both).
|
|
53
|
+
- Escalation after 15 minutes of no acknowledgment.
|
|
54
|
+
- Secondary on-call as backup for every primary.
|
|
55
|
+
- On-call handoff includes open incidents, recent changes, and known risks.
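
Timeout-based escalation can be sketched as a pure function. The 15-minute timeout comes from the list above; the tiered policy (each further 15 unacknowledged minutes adds the next tier) and the `pagingTargets` name are assumptions made for illustration, not how PagerDuty or Opsgenie configure escalation policies.

```javascript
const ESCALATION_TIMEOUT_MINUTES = 15;

// Returns who should have been paged for an alert open for `minutesOpen` minutes.
// Hypothetical policy: each additional 15 unacknowledged minutes adds the next tier.
function pagingTargets(tiers, minutesOpen, acknowledged) {
  if (acknowledged) return [];
  const tiersReached = Math.floor(minutesOpen / ESCALATION_TIMEOUT_MINUTES) + 1;
  return tiers.slice(0, Math.min(tiersReached, tiers.length));
}
```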
|
|
56
|
+
|
|
57
|
+
**War Room Methodology:**
|
|
58
|
+
- Designate an Incident Commander (IC) who coordinates, does not debug.
|
|
59
|
+
- IC assigns roles: Communications Lead, Technical Lead, Scribe.
|
|
60
|
+
- Communication cadence: status page updates every 15 minutes during SEV1/SEV2.
|
|
61
|
+
- Use a shared channel (Slack/Teams) with pinned timeline.
|
|
62
|
+
- No blame during active incident — focus on mitigation only.
|
|
63
|
+
|
|
64
|
+
**Postmortem Structure:**
|
|
65
|
+
- **Timeline:** Minute-by-minute reconstruction of events.
|
|
66
|
+
- **Impact:** Users affected, duration, revenue/SLA impact.
|
|
67
|
+
- **Root Cause:** The systemic failure that allowed the incident.
|
|
68
|
+
- **Contributing Factors:** What made detection/resolution harder.
|
|
69
|
+
- **Action Items:** Specific, assigned, time-boxed improvements.
|
|
70
|
+
- Blameless: focus on systems and processes, never individuals.
|
|
71
|
+
- Share postmortems broadly — learning is organizational.
|
|
72
|
+
- Track action item completion rate as a metric.
|
|
73
|
+
|
|
74
|
+
**Communication:**
|
|
75
|
+
- Status page updates every 15 minutes during active incidents.
|
|
76
|
+
- Internal stakeholder updates via designated channel.
|
|
77
|
+
- Customer-facing communication: acknowledge → investigate → mitigate → resolve.
|
|
78
|
+
- Post-resolution: summary email with impact and next steps.
|
|
79
|
+
|
|
80
|
+
### After
|
|
81
|
+
|
|
82
|
+
1. Verify runbooks are tested (tabletop exercises quarterly).
|
|
83
|
+
2. Confirm escalation paths are current and contact information is valid.
|
|
84
|
+
3. Validate on-call schedule has no coverage gaps.
|
|
85
|
+
4. Review postmortem action item completion from previous incidents.
|
|
86
|
+
5. Measure MTTR trends and identify systemic improvement opportunities.
|
|
87
|
+
|
|
88
|
+
## Self-check before task completion
|
|
89
|
+
|
|
90
|
+
- [ ] Severity levels are clearly defined with response time targets.
|
|
91
|
+
- [ ] Runbooks follow the template and are actionable (copy-pasteable commands).
|
|
92
|
+
- [ ] On-call rotation is sustainable (max 1 week, compensation, follow-the-sun).
|
|
93
|
+
- [ ] Escalation paths have timeout-based auto-escalation.
|
|
94
|
+
- [ ] Postmortem template is blameless and includes action items.
|
|
95
|
+
- [ ] Communication cadence is defined for each severity level.
|
|
96
|
+
- [ ] All processes integrate with existing alerting and observability tooling.
|
|
97
|
+
- [ ] Quarterly review cadence is established for runbook freshness.
|