mindforge-cc 10.0.3 → 11.0.0

Files changed (333)
  1. package/.mindforge/MINDFORGE-V2-SCHEMA.json +43 -10
  2. package/.mindforge/config.json +30 -2
  3. package/.mindforge/engine/cross-model-eval.md +74 -0
  4. package/.mindforge/engine/proactive/signal-detector.md +60 -0
  5. package/.mindforge/engine/proactive/suggestion-engine.md +100 -0
  6. package/.mindforge/personas/agent-architect.md +57 -0
  7. package/.mindforge/personas/agent-evaluator.md +162 -0
  8. package/.mindforge/personas/agent-memory-designer.md +157 -0
  9. package/.mindforge/personas/agent-ops-engineer.md +120 -0
  10. package/.mindforge/personas/agent-orchestrator.md +112 -0
  11. package/.mindforge/personas/ai-economist.md +57 -0
  12. package/.mindforge/personas/ai-safety-engineer.md +57 -0
  13. package/.mindforge/personas/analytics-engineer.md +57 -0
  14. package/.mindforge/personas/anti-pattern-hunter.md +61 -0
  15. package/.mindforge/personas/api-gateway-designer.md +132 -0
  16. package/.mindforge/personas/auth-engineer.md +112 -0
  17. package/.mindforge/personas/build-engineer.md +57 -0
  18. package/.mindforge/personas/business-analyst.md +56 -0
  19. package/.mindforge/personas/cache-architect.md +100 -0
  20. package/.mindforge/personas/causal-scientist.md +57 -0
  21. package/.mindforge/personas/cdn-architect.md +118 -0
  22. package/.mindforge/personas/change-agent.md +104 -0
  23. package/.mindforge/personas/code-narrator.md +52 -0
  24. package/.mindforge/personas/codegen-specialist.md +68 -0
  25. package/.mindforge/personas/communication-architect.md +102 -0
  26. package/.mindforge/personas/compliance-engineer.md +96 -0
  27. package/.mindforge/personas/consensus-engineer.md +116 -0
  28. package/.mindforge/personas/contract-tester.md +60 -192
  29. package/.mindforge/personas/data-architect.md +108 -0
  30. package/.mindforge/personas/data-mesh-architect.md +57 -0
  31. package/.mindforge/personas/data-pipeline-architect.md +120 -0
  32. package/.mindforge/personas/de-sloppifier.md +60 -0
  33. package/.mindforge/personas/debt-manager.md +66 -0
  34. package/.mindforge/personas/decision-architect.md +82 -51
  35. package/.mindforge/personas/deployment-captain.md +74 -0
  36. package/.mindforge/personas/design-system-lead.md +112 -0
  37. package/.mindforge/personas/dmux-orchestrator.md +75 -0
  38. package/.mindforge/personas/dx-engineer.md +96 -0
  39. package/.mindforge/personas/ecommerce-engineer.md +57 -0
  40. package/.mindforge/personas/edge-engineer.md +94 -0
  41. package/.mindforge/personas/edtech-architect.md +106 -0
  42. package/.mindforge/personas/embedding-architect.md +57 -0
  43. package/.mindforge/personas/environment-engineer.md +57 -0
  44. package/.mindforge/personas/eval-judge.md +55 -0
  45. package/.mindforge/personas/event-architect.md +102 -0
  46. package/.mindforge/personas/experiment-designer.md +138 -0
  47. package/.mindforge/personas/feature-store-engineer.md +57 -0
  48. package/.mindforge/personas/finops-analyst.md +66 -0
  49. package/.mindforge/personas/fintech-architect.md +57 -0
  50. package/.mindforge/personas/flutter-engineer.md +104 -0
  51. package/.mindforge/personas/gaming-engineer.md +57 -0
  52. package/.mindforge/personas/graphql-designer.md +73 -0
  53. package/.mindforge/personas/healthcare-engineer.md +57 -0
  54. package/.mindforge/personas/hiring-strategist.md +105 -0
  55. package/.mindforge/personas/hitl-architect.md +165 -0
  56. package/.mindforge/personas/i18n-architect.md +69 -0
  57. package/.mindforge/personas/iot-architect.md +105 -0
  58. package/.mindforge/personas/knowledge-curator.md +139 -0
  59. package/.mindforge/personas/knowledge-engineer.md +57 -0
  60. package/.mindforge/personas/lakehouse-architect.md +57 -0
  61. package/.mindforge/personas/llm-orchestrator.md +57 -0
  62. package/.mindforge/personas/logistics-architect.md +106 -0
  63. package/.mindforge/personas/market-analyst.md +53 -0
  64. package/.mindforge/personas/marketplace-engineer.md +105 -0
  65. package/.mindforge/personas/mcp-designer.md +54 -0
  66. package/.mindforge/personas/meeting-designer.md +104 -0
  67. package/.mindforge/personas/mentorship-lead.md +106 -0
  68. package/.mindforge/personas/migration-architect.md +57 -0
  69. package/.mindforge/personas/ml-ops-engineer.md +101 -0
  70. package/.mindforge/personas/mobile-architect.md +105 -0
  71. package/.mindforge/personas/mobile-security-engineer.md +106 -0
  72. package/.mindforge/personas/multi-tenancy-architect.md +71 -0
  73. package/.mindforge/personas/multimodal-engineer.md +57 -0
  74. package/.mindforge/personas/offline-specialist.md +105 -0
  75. package/.mindforge/personas/onboarding-navigator.md +63 -0
  76. package/.mindforge/personas/payments-engineer.md +135 -0
  77. package/.mindforge/personas/pipeline-engineer.md +115 -0
  78. package/.mindforge/personas/platform-engineer.md +97 -0
  79. package/.mindforge/personas/platform-lead.md +57 -0
  80. package/.mindforge/personas/privacy-engineer.md +57 -0
  81. package/.mindforge/personas/product-owner.md +56 -0
  82. package/.mindforge/personas/productivity-analyst.md +57 -0
  83. package/.mindforge/personas/prompt-architect.md +101 -0
  84. package/.mindforge/personas/proofreader.md +53 -0
  85. package/.mindforge/personas/pwa-architect.md +105 -0
  86. package/.mindforge/personas/quality-scorer.md +63 -0
  87. package/.mindforge/personas/react-native-engineer.md +106 -0
  88. package/.mindforge/personas/resilience-engineer.md +69 -0
  89. package/.mindforge/personas/rfc-architect.md +64 -0
  90. package/.mindforge/personas/saga-orchestrator.md +80 -0
  91. package/.mindforge/personas/secrets-engineer.md +57 -0
  92. package/.mindforge/personas/skill-smith.md +79 -0
  93. package/.mindforge/personas/sre-lead.md +107 -0
  94. package/.mindforge/personas/stream-engineer.md +57 -0
  95. package/.mindforge/personas/streaming-engineer.md +64 -0
  96. package/.mindforge/personas/swarm-templates.json +674 -44
  97. package/.mindforge/personas/system-designer.md +57 -0
  98. package/.mindforge/personas/team-coach.md +120 -0
  99. package/.mindforge/personas/tech-lead-coach.md +103 -0
  100. package/.mindforge/personas/technical-writer-lead.md +111 -0
  101. package/.mindforge/personas/vibe-checker.md +75 -0
  102. package/.mindforge/personas/worktree-manager.md +56 -0
  103. package/.mindforge/personas/zero-trust-engineer.md +113 -0
  104. package/.mindforge/skills/a11y-testing/SKILL.md +143 -0
  105. package/.mindforge/skills/agent-evaluation-framework/SKILL.md +227 -0
  106. package/.mindforge/skills/agent-memory-design/SKILL.md +199 -0
  107. package/.mindforge/skills/agent-orchestration-patterns/SKILL.md +129 -0
  108. package/.mindforge/skills/agent-tool-selection/SKILL.md +204 -0
  109. package/.mindforge/skills/ai-agent-deployment/SKILL.md +176 -0
  110. package/.mindforge/skills/ai-cost-management/SKILL.md +57 -0
  111. package/.mindforge/skills/ai-safety-alignment/SKILL.md +53 -0
  112. package/.mindforge/skills/analytics-instrumentation/SKILL.md +172 -0
  113. package/.mindforge/skills/api-gateway-patterns/SKILL.md +177 -0
  114. package/.mindforge/skills/api-marketplace/SKILL.md +56 -0
  115. package/.mindforge/skills/api-versioning/SKILL.md +100 -0
  116. package/.mindforge/skills/app-store-deployment/SKILL.md +44 -0
  117. package/.mindforge/skills/architecture-tradeoff-analysis/SKILL.md +97 -0
  118. package/.mindforge/skills/audit-logging/SKILL.md +140 -0
  119. package/.mindforge/skills/auth-patterns/SKILL.md +148 -0
  120. package/.mindforge/skills/autonomous-agent-harness/SKILL.md +218 -0
  121. package/.mindforge/skills/autonomous-agents/SKILL.md +59 -0
  122. package/.mindforge/skills/build-system-optimization/SKILL.md +54 -0
  123. package/.mindforge/skills/build-vs-buy/SKILL.md +80 -0
  124. package/.mindforge/skills/bundle-optimization/SKILL.md +174 -0
  125. package/.mindforge/skills/business-analyst/SKILL.md +82 -0
  126. package/.mindforge/skills/caching-strategies/SKILL.md +132 -0
  127. package/.mindforge/skills/capacity-planning/SKILL.md +96 -0
  128. package/.mindforge/skills/causal-inference/SKILL.md +42 -0
  129. package/.mindforge/skills/cdn-optimization/SKILL.md +212 -0
  130. package/.mindforge/skills/change-management/SKILL.md +106 -0
  131. package/.mindforge/skills/chaos-engineering/SKILL.md +99 -0
  132. package/.mindforge/skills/ci-cd-pipeline/SKILL.md +118 -0
  133. package/.mindforge/skills/cli-design/SKILL.md +118 -0
  134. package/.mindforge/skills/code-generation-patterns/SKILL.md +92 -0
  135. package/.mindforge/skills/code-review-methodology/SKILL.md +180 -0
  136. package/.mindforge/skills/code-tour/SKILL.md +145 -0
  137. package/.mindforge/skills/codebase-onboarding/SKILL.md +95 -0
  138. package/.mindforge/skills/compliance-as-code/SKILL.md +195 -0
  139. package/.mindforge/skills/conflict-resolution/SKILL.md +87 -0
  140. package/.mindforge/skills/connection-pooling/SKILL.md +151 -0
  141. package/.mindforge/skills/container-security/SKILL.md +151 -0
  142. package/.mindforge/skills/context-engineering/SKILL.md +114 -0
  143. package/.mindforge/skills/contract-testing/SKILL.md +85 -0
  144. package/.mindforge/skills/cost-estimation/SKILL.md +82 -0
  145. package/.mindforge/skills/cqrs-event-sourcing/SKILL.md +95 -0
  146. package/.mindforge/skills/cross-platform-testing/SKILL.md +43 -0
  147. package/.mindforge/skills/data-governance/SKILL.md +42 -0
  148. package/.mindforge/skills/data-lakehouse/SKILL.md +42 -0
  149. package/.mindforge/skills/data-mesh/SKILL.md +42 -0
  150. package/.mindforge/skills/data-modeling/SKILL.md +107 -0
  151. package/.mindforge/skills/data-pipeline-design/SKILL.md +171 -0
  152. package/.mindforge/skills/data-privacy-engineering/SKILL.md +42 -0
  153. package/.mindforge/skills/database-performance/SKILL.md +174 -0
  154. package/.mindforge/skills/database-sharding-advanced/SKILL.md +206 -0
  155. package/.mindforge/skills/de-sloppify/SKILL.md +120 -0
  156. package/.mindforge/skills/defense-in-depth/SKILL.md +84 -0
  157. package/.mindforge/skills/delegation-patterns/SKILL.md +123 -0
  158. package/.mindforge/skills/dependency-management/SKILL.md +94 -0
  159. package/.mindforge/skills/deployment-workflow/SKILL.md +135 -0
  160. package/.mindforge/skills/design-system/SKILL.md +113 -0
  161. package/.mindforge/skills/developer-onboarding/SKILL.md +99 -0
  162. package/.mindforge/skills/developer-productivity-metrics/SKILL.md +59 -0
  163. package/.mindforge/skills/distributed-consensus/SKILL.md +141 -0
  164. package/.mindforge/skills/dmux-workflows/SKILL.md +141 -0
  165. package/.mindforge/skills/dns-architecture/SKILL.md +167 -0
  166. package/.mindforge/skills/ecommerce-architecture/SKILL.md +41 -0
  167. package/.mindforge/skills/edge-computing/SKILL.md +91 -0
  168. package/.mindforge/skills/edtech-platform/SKILL.md +41 -0
  169. package/.mindforge/skills/email-deliverability/SKILL.md +177 -0
  170. package/.mindforge/skills/embedding-systems/SKILL.md +55 -0
  171. package/.mindforge/skills/environment-management/SKILL.md +54 -0
  172. package/.mindforge/skills/error-handling-architecture/SKILL.md +118 -0
  173. package/.mindforge/skills/estimation-techniques/SKILL.md +113 -0
  174. package/.mindforge/skills/eval-harness/SKILL.md +180 -0
  175. package/.mindforge/skills/event-driven-architecture/SKILL.md +162 -0
  176. package/.mindforge/skills/experiment-design/SKILL.md +139 -0
  177. package/.mindforge/skills/experiment-platform/SKILL.md +43 -0
  178. package/.mindforge/skills/feature-engineering/SKILL.md +42 -0
  179. package/.mindforge/skills/feature-flag-management/SKILL.md +183 -0
  180. package/.mindforge/skills/fine-tuning-workflow/SKILL.md +189 -0
  181. package/.mindforge/skills/fintech-patterns/SKILL.md +41 -0
  182. package/.mindforge/skills/flutter-architecture/SKILL.md +42 -0
  183. package/.mindforge/skills/gaming-backend/SKILL.md +41 -0
  184. package/.mindforge/skills/git-workflow-design/SKILL.md +129 -0
  185. package/.mindforge/skills/graceful-degradation/SKILL.md +95 -0
  186. package/.mindforge/skills/graphql-patterns/SKILL.md +243 -0
  187. package/.mindforge/skills/guardrails-and-safety/SKILL.md +137 -0
  188. package/.mindforge/skills/healthcare-systems/SKILL.md +40 -0
  189. package/.mindforge/skills/hiring-engineering/SKILL.md +119 -0
  190. package/.mindforge/skills/human-in-the-loop-design/SKILL.md +234 -0
  191. package/.mindforge/skills/i18n-architecture/SKILL.md +147 -0
  192. package/.mindforge/skills/idempotency-patterns/SKILL.md +84 -0
  193. package/.mindforge/skills/incident-communication/SKILL.md +96 -0
  194. package/.mindforge/skills/incident-management/SKILL.md +97 -0
  195. package/.mindforge/skills/infrastructure-as-code/SKILL.md +98 -0
  196. package/.mindforge/skills/instinct-clustering/SKILL.md +190 -0
  197. package/.mindforge/skills/internal-developer-platform/SKILL.md +51 -0
  198. package/.mindforge/skills/iot-platform/SKILL.md +41 -0
  199. package/.mindforge/skills/k8s-deployment/SKILL.md +358 -0
  200. package/.mindforge/skills/knowledge-graphs/SKILL.md +56 -0
  201. package/.mindforge/skills/knowledge-sharing-systems/SKILL.md +112 -0
  202. package/.mindforge/skills/llm-cost-optimization/SKILL.md +198 -0
  203. package/.mindforge/skills/llm-orchestration/SKILL.md +56 -0
  204. package/.mindforge/skills/load-testing/SKILL.md +84 -0
  205. package/.mindforge/skills/logistics-optimization/SKILL.md +40 -0
  206. package/.mindforge/skills/market-researcher/SKILL.md +99 -0
  207. package/.mindforge/skills/marketplace-trust/SKILL.md +40 -0
  208. package/.mindforge/skills/mcp-server-patterns/SKILL.md +264 -0
  209. package/.mindforge/skills/media-streaming/SKILL.md +41 -0
  210. package/.mindforge/skills/meeting-architecture/SKILL.md +146 -0
  211. package/.mindforge/skills/mentoring-patterns/SKILL.md +77 -0
  212. package/.mindforge/skills/microservices-patterns/SKILL.md +83 -0
  213. package/.mindforge/skills/migration-platform/SKILL.md +61 -0
  214. package/.mindforge/skills/migration-strategies/SKILL.md +129 -0
  215. package/.mindforge/skills/ml-feature-store/SKILL.md +56 -0
  216. package/.mindforge/skills/ml-monitoring/SKILL.md +42 -0
  217. package/.mindforge/skills/mobile-performance/SKILL.md +44 -0
  218. package/.mindforge/skills/mobile-security/SKILL.md +45 -0
  219. package/.mindforge/skills/model-evaluation/SKILL.md +53 -0
  220. package/.mindforge/skills/monorepo-management/SKILL.md +100 -0
  221. package/.mindforge/skills/multi-tenancy-patterns/SKILL.md +145 -0
  222. package/.mindforge/skills/multi-turn-conversation-design/SKILL.md +206 -0
  223. package/.mindforge/skills/multimodal-ai/SKILL.md +51 -0
  224. package/.mindforge/skills/mutation-testing/SKILL.md +97 -0
  225. package/.mindforge/skills/notification-system-design/SKILL.md +168 -0
  226. package/.mindforge/skills/observability-stack/SKILL.md +136 -0
  227. package/.mindforge/skills/offline-first-design/SKILL.md +43 -0
  228. package/.mindforge/skills/on-call-design/SKILL.md +111 -0
  229. package/.mindforge/skills/pagination-patterns/SKILL.md +230 -0
  230. package/.mindforge/skills/payment-integration/SKILL.md +176 -0
  231. package/.mindforge/skills/performance-reviews/SKILL.md +140 -0
  232. package/.mindforge/skills/platform-observability/SKILL.md +58 -0
  233. package/.mindforge/skills/platform-reliability/SKILL.md +52 -0
  234. package/.mindforge/skills/post-incident-learning/SKILL.md +96 -0
  235. package/.mindforge/skills/product-manager/SKILL.md +104 -0
  236. package/.mindforge/skills/progressive-web-app/SKILL.md +44 -0
  237. package/.mindforge/skills/prompt-engineering/SKILL.md +94 -0
  238. package/.mindforge/skills/proofreader/SKILL.md +158 -0
  239. package/.mindforge/skills/push-notification-architecture/SKILL.md +45 -0
  240. package/.mindforge/skills/python-performance/SKILL.md +183 -0
  241. package/.mindforge/skills/quality-audit/SKILL.md +171 -0
  242. package/.mindforge/skills/queue-design/SKILL.md +85 -0
  243. package/.mindforge/skills/rag-architecture/SKILL.md +176 -0
  244. package/.mindforge/skills/rate-limiting-design/SKILL.md +94 -0
  245. package/.mindforge/skills/react-native-patterns/SKILL.md +42 -0
  246. package/.mindforge/skills/react-performance/SKILL.md +229 -0
  247. package/.mindforge/skills/real-time-analytics/SKILL.md +42 -0
  248. package/.mindforge/skills/real-time-sync/SKILL.md +83 -0
  249. package/.mindforge/skills/responsive-native/SKILL.md +44 -0
  250. package/.mindforge/skills/responsive-patterns/SKILL.md +141 -0
  251. package/.mindforge/skills/rfc-pipeline/SKILL.md +114 -0
  252. package/.mindforge/skills/saas-multi-tenant/SKILL.md +41 -0
  253. package/.mindforge/skills/santa-method/SKILL.md +134 -0
  254. package/.mindforge/skills/search-implementation/SKILL.md +98 -0
  255. package/.mindforge/skills/secrets-platform/SKILL.md +56 -0
  256. package/.mindforge/skills/secrets-rotation/SKILL.md +173 -0
  257. package/.mindforge/skills/self-serve-infrastructure/SKILL.md +51 -0
  258. package/.mindforge/skills/serverless-patterns/SKILL.md +119 -0
  259. package/.mindforge/skills/skill-creator-meta/SKILL.md +146 -0
  260. package/.mindforge/skills/sprint-retrospective-facilitation/SKILL.md +112 -0
  261. package/.mindforge/skills/stakeholder-communication/SKILL.md +85 -0
  262. package/.mindforge/skills/state-management/SKILL.md +104 -0
  263. package/.mindforge/skills/stream-processing/SKILL.md +43 -0
  264. package/.mindforge/skills/streaming-architecture/SKILL.md +81 -0
  265. package/.mindforge/skills/supply-chain-security/SKILL.md +145 -0
  266. package/.mindforge/skills/synthetic-data-generation/SKILL.md +52 -0
  267. package/.mindforge/skills/system-design/SKILL.md +88 -0
  268. package/.mindforge/skills/team-topology-design/SKILL.md +107 -0
  269. package/.mindforge/skills/technical-debt-management/SKILL.md +86 -0
  270. package/.mindforge/skills/technical-interview-design/SKILL.md +98 -0
  271. package/.mindforge/skills/technical-leadership/SKILL.md +75 -0
  272. package/.mindforge/skills/technical-writing/SKILL.md +237 -0
  273. package/.mindforge/skills/technology-radar/SKILL.md +88 -0
  274. package/.mindforge/skills/testing-anti-patterns/SKILL.md +288 -0
  275. package/.mindforge/skills/tool-design/SKILL.md +138 -0
  276. package/.mindforge/skills/typescript-advanced/SKILL.md +198 -0
  277. package/.mindforge/skills/using-git-worktrees/SKILL.md +139 -0
  278. package/.mindforge/skills/verification-loop/SKILL.md +13 -1
  279. package/.mindforge/skills/vibe-security/SKILL.md +165 -0
  280. package/.mindforge/skills/visual-regression-testing/SKILL.md +97 -0
  281. package/.mindforge/skills/websocket-patterns/SKILL.md +203 -0
  282. package/.mindforge/skills/writing-plans/SKILL.md +170 -0
  283. package/.mindforge/skills/writing-skills/SKILL.md +216 -0
  284. package/.mindforge/skills/zero-trust-architecture/SKILL.md +166 -0
  285. package/CHANGELOG.md +240 -0
  286. package/MINDFORGE.md +4 -4
  287. package/README.md +49 -4
  288. package/RELEASENOTES.md +80 -0
  289. package/SECURITY.md +20 -8
  290. package/bin/autonomous/audit-writer.js +13 -0
  291. package/bin/autonomous/auto-runner.js +74 -16
  292. package/bin/autonomous/context-refactorer.js +26 -11
  293. package/bin/autonomous/state-manager.js +62 -6
  294. package/bin/autonomous/stuck-monitor.js +46 -7
  295. package/bin/autonomous/wave-executor.js +66 -25
  296. package/bin/dashboard/api-router.js +43 -0
  297. package/bin/dashboard/metrics-aggregator.js +28 -1
  298. package/bin/dashboard/server.js +67 -4
  299. package/bin/dashboard/sse-bridge.js +4 -4
  300. package/bin/engine/feedback-loop.js +8 -0
  301. package/bin/engine/intelligence-interlock.js +32 -15
  302. package/bin/engine/logic-drift-detector.js +2 -1
  303. package/bin/engine/nexus-tracer.js +3 -2
  304. package/bin/engine/remediation-engine.js +155 -32
  305. package/bin/engine/self-corrective-synthesizer.js +84 -10
  306. package/bin/engine/sre-manager.js +12 -4
  307. package/bin/engine/temporal-hub.js +131 -34
  308. package/bin/governance/approve.js +41 -5
  309. package/bin/governance/impact-analyzer.js +28 -0
  310. package/bin/governance/policy-engine.js +10 -3
  311. package/bin/governance/quantum-crypto.js +32 -19
  312. package/bin/governance/rbac-manager.js +74 -2
  313. package/bin/governance/ztai-manager.js +49 -7
  314. package/bin/hindsight-injector.js +3 -3
  315. package/bin/memory/eis-client.js +71 -34
  316. package/bin/memory/embedding-engine.js +61 -0
  317. package/bin/memory/knowledge-graph.js +58 -5
  318. package/bin/memory/knowledge-indexer.js +53 -6
  319. package/bin/memory/knowledge-store.js +22 -0
  320. package/bin/migrations/10.7.0-to-11.0.0.js +110 -0
  321. package/bin/migrations/schema-versions.js +13 -0
  322. package/bin/models/anthropic-provider.js +45 -0
  323. package/bin/models/cloud-broker.js +68 -20
  324. package/bin/models/gemini-provider.js +51 -0
  325. package/bin/models/model-client.js +20 -0
  326. package/bin/models/model-router.js +28 -8
  327. package/bin/models/openai-provider.js +44 -0
  328. package/bin/utils/file-io.js +63 -1
  329. package/bin/utils/index.js +58 -0
  330. package/docs/getting-started.md +1 -1
  331. package/docs/user-guide.md +2 -2
  332. package/package.json +2 -2
  333. package/.mindforge/personas/data-privacy-engineer.md +0 -187
@@ -0,0 +1,176 @@
1
+ ---
2
+ name: payment-integration
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.0.4
5
+ status: stable
6
+ triggers: payment integration, stripe architecture, payment webhook, idempotent charge, refund flow, PCI scope, payment state machine, subscription billing, payment retry, payment reconciliation, checkout flow, payment method tokenization
7
+ compose: security-review
8
+ ---
9
+
10
+ # Skill — Payment Integration (Idempotent Payment Architecture)
11
+
12
+ ## When this skill activates
13
+ When building or modifying payment flows, integrating payment providers (Stripe,
14
+ PayPal, Braintree), handling subscriptions, processing refunds, or dealing with
15
+ any money movement in the system. Also activates for PCI compliance considerations.
16
+
17
+ Core principle: **Idempotency is life** — every payment operation must be safe to
18
+ retry without charging the customer twice. When in doubt, err on the side of NOT
19
+ charging.
20
+
21
+ ## Mandatory actions when this skill is active
22
+
23
+ ### Payment State Machine
24
+
25
+ 1. **Every payment has a well-defined state machine:**
26
+ ```
27
+ States:
28
+ created → processing → succeeded
29
+ → failed → (retry) → processing
30
+ succeeded → refund_pending → refunded
31
+ succeeded → disputed → dispute_won (funds returned)
32
+ → dispute_lost (funds lost)
33
+
34
+ State transitions:
35
+ - created → processing: charge initiated with provider
36
+ - processing → succeeded: provider confirms capture
37
+ - processing → failed: provider declines or errors
38
+ - succeeded → refund_pending: refund initiated
39
+ - refund_pending → refunded: provider confirms refund
40
+ ```
41
+
42
+ Rules:
43
+ - State transitions are APPEND-ONLY (never delete payment records)
44
+ - Every transition logged with timestamp, actor, and reason
45
+ - Failed payments can retry (max 3 attempts with exponential backoff)
46
+ - Terminal states: refunded, dispute_won, dispute_lost, and failed once retries are exhausted (succeeded is final only if no refund or dispute follows)
47
+
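Reviewer sketch (not part of the package): the state diagram and rules above could be enforced with a small transition table. This is a minimal illustration under assumptions; `TRANSITIONS`, `newPayment`, and `transition` are hypothetical names, and the in-memory `history` array stands in for an append-only audit table.

```javascript
// Allowed transitions, copied from the state diagram in the skill text.
const TRANSITIONS = {
  created: ["processing"],
  processing: ["succeeded", "failed"],
  failed: ["processing"], // retry path
  succeeded: ["refund_pending", "disputed"],
  refund_pending: ["refunded"],
  disputed: ["dispute_won", "dispute_lost"],
  refunded: [],
  dispute_won: [],
  dispute_lost: [],
};

const MAX_ATTEMPTS = 3; // "max 3 attempts" from the rules

// Hypothetical constructor for an in-memory payment record.
function newPayment(id) {
  return { id, state: "created", attempts: 0, history: [] };
}

// Returns a NEW record; the old one is never mutated and history only grows.
function transition(payment, to, actor, reason, now = new Date()) {
  const allowed = TRANSITIONS[payment.state] || [];
  if (!allowed.includes(to)) {
    throw new Error(`illegal transition ${payment.state} -> ${to}`);
  }
  const isRetry = payment.state === "failed" && to === "processing";
  if (isRetry && payment.attempts >= MAX_ATTEMPTS) {
    throw new Error("retry limit reached");
  }
  const entry = { from: payment.state, to, actor, reason, at: now.toISOString() };
  return {
    ...payment,
    state: to,
    attempts: to === "processing" ? payment.attempts + 1 : payment.attempts,
    history: [...payment.history, entry],
  };
}
```

Exponential backoff between retries is left out here; a scheduler would decide when `transition(p, "processing", ...)` is called again.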
48
+ ### Idempotency
49
+
50
+ 2. **Idempotency key on every charge call:**
51
+ ```
52
+ Idempotency key format: [user_id]-[order_id]-[attempt_number]
53
+ Example: usr_abc123-ord_xyz789-1
54
+
55
+ Rules:
56
+ - Generate idempotency key BEFORE calling payment provider
57
+ - Store key in database alongside payment intent
58
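+ - If retrying after a definitive failure (e.g. a decline): increment attempt number, generate new key; if the outcome is unknown (timeout, unclear response): retry with the SAME key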
59
+ - Provider stores result by key — retrying same key returns same result
60
+ - Key expiry: 24 hours (Stripe default) — don't retry after that
61
+ ```
62
+
63
+ Critical: If the client retries (network timeout, unclear response), the
64
+ idempotency key ensures no double charge. This is non-negotiable.
65
+
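Reviewer sketch (not part of the package): the key format and the "same key returns same result" behaviour described above, shown against a fake in-memory provider. `idempotencyKey` and `FakeProvider` are hypothetical names for illustration only; a real provider SDK takes the key through its own request options.

```javascript
// Builds the key in the [user_id]-[order_id]-[attempt_number] format.
function idempotencyKey(userId, orderId, attempt) {
  if (!Number.isInteger(attempt) || attempt < 1) {
    throw new Error("attempt must be a positive integer");
  }
  return `${userId}-${orderId}-${attempt}`;
}

// Hypothetical stand-in for a payment provider that stores results by key.
class FakeProvider {
  constructor() {
    this.results = new Map();
    this.chargeCount = 0;
  }

  charge(key, amountCents) {
    // A repeated key returns the stored result instead of charging again.
    if (this.results.has(key)) return this.results.get(key);
    this.chargeCount += 1;
    const result = { id: `ch_${this.chargeCount}`, amountCents, status: "succeeded" };
    this.results.set(key, result);
    return result;
  }
}
```

Calling `charge` twice with `idempotencyKey("usr_abc123", "ord_xyz789", 1)` leaves `chargeCount` at 1, which is the property the skill treats as non-negotiable.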
66
+ ### Webhook Processing
67
+
68
+ 3. **Webhook handler requirements:**
69
+ ```
70
+ 1. Verify signature FIRST (reject if invalid — no processing)
71
+ 2. Respond 200 immediately (within 5 seconds)
72
+ 3. Process the event ASYNCHRONOUSLY (queue for background processing)
73
+ 4. Process IDEMPOTENTLY (same webhook delivered twice = same outcome)
74
+ 5. Handle OUT-OF-ORDER delivery (payment_intent.succeeded before payment_intent.created)
75
+ ```
76
+
77
+ Implementation:
78
+ ```
79
+ POST /webhooks/stripe
80
+ 1. Verify: stripe.webhooks.constructEvent(body, sig, secret)
81
+ 2. Dedup: check event.id against processed_events table
82
+ 3. If already processed: return 200 (idempotent)
83
+ 4. Queue: enqueue event for async processing
84
+ 5. Return 200
85
+ 6. [Async worker]: process event, update payment state, mark as processed
86
+ ```
87
+
88
+ Rules:
89
+ - NEVER do business logic synchronously in the webhook handler
90
+ - Store raw webhook payload for debugging/replay
91
+ - Implement webhook replay for missed events (fetch from provider API)
92
+ - Monitor webhook lag (time between event creation and processing)
93
+
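Reviewer sketch (not part of the package): the verify, dedup, queue, respond flow above with injected dependencies. The HMAC check is a simplified stand-in for `stripe.webhooks.constructEvent` and does NOT reproduce Stripe's signature header format; `handleWebhook`, `drain`, the `processed` Set, and the `queue` array are hypothetical placeholders for a real table and job queue.

```javascript
const crypto = require("crypto");

// Simplified signature check: hex HMAC-SHA256 over the raw body (an assumption,
// not the provider's actual scheme).
function verifySignature(rawBody, signatureHex, secret) {
  const expected = crypto.createHmac("sha256", secret).update(rawBody).digest();
  const given = Buffer.from(signatureHex, "hex");
  return given.length === expected.length && crypto.timingSafeEqual(given, expected);
}

// Synchronous part of the handler: no business logic, only verify/dedup/enqueue.
function handleWebhook({ rawBody, signature, secret, processed, queue }) {
  if (!verifySignature(rawBody, signature, secret)) return { status: 400 };
  const event = JSON.parse(rawBody);
  if (processed.has(event.id)) return { status: 200, duplicate: true };
  queue.push({ event, rawBody }); // raw payload kept for debugging/replay
  return { status: 200, duplicate: false };
}

// Async worker step: apply each event once, then mark it as processed.
function drain(queue, processed, apply) {
  while (queue.length > 0) {
    const { event } = queue.shift();
    if (processed.has(event.id)) continue; // delivered twice before processing
    apply(event);
    processed.add(event.id);
  }
}
```

The worker repeats the dedup check because two deliveries of one event can both be queued before either is processed.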
94
+ ### PCI Scope Minimization
95
+
96
+ 4. **Never touch raw card numbers:**
97
+ ```
98
+ Client-side tokenization flow:
99
+ 1. User enters card → Stripe.js/Elements captures it
100
+ 2. Card data goes DIRECTLY to Stripe (never touches your server)
101
+ 3. Stripe returns a token/PaymentMethod ID
102
+ 4. Your server uses the token to create charges
103
+
104
+ Your server NEVER sees: full card number, CVV (it may still see last4 and expiry via the PaymentMethod object)
105
+ Your PCI scope: SAQ-A (lowest level — just a questionnaire)
106
+ ```
107
+
108
+ Rules:
109
+ - Use Stripe Elements, PayPal JS SDK, or equivalent client-side tokenization
110
+ - Never log request bodies that might contain card data
111
+ - Never store card data in your database (only token references)
112
+ - If using iframes: ensure they're from the payment provider's domain
113
+ - An on-site PCI-DSS audit is typically not required at SAQ-A level (self-assessment), though an acquirer can still require one for high-volume merchants
114
+
115
+ ### Subscription Billing
116
+
117
+ 5. **Subscription lifecycle:**
118
+ ```
119
+ States: trial → active → past_due → canceled → expired
120
+
121
+ trial → active: trial period ends, first charge succeeds
122
+ active → past_due: renewal charge fails
123
+ past_due → active: retry succeeds
124
+ past_due → canceled: all retries exhausted + grace period ended
125
+ canceled → active: user resubscribes (new subscription)
126
+ ```
127
+
128
+ Dunning (failed payment recovery):
129
+ ```
130
+ Day 0: Charge fails → retry immediately
131
+ Day 1: Second retry
132
+ Day 3: Third retry + email notification ("update payment method")
133
+ Day 7: Final retry + urgent email + in-app banner
134
+ Day 14: Cancel subscription + final email ("your subscription has ended")
135
+ ```
136
+
137
+ Rules:
138
+ - Dunning schedule is configurable per plan tier
139
+ - Always give users a way to update payment method without re-subscribing
140
+ - Prorate upgrades/downgrades (charge difference immediately or credit)
141
+ - Webhook: handle invoice.payment_failed for dunning triggers
142
+
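Reviewer sketch (not part of the package): the dunning timeline above expressed as data, which is one way to make it "configurable per plan tier". `DEFAULT_DUNNING` and `dueSteps` are hypothetical names, and the notification labels are placeholders for whatever channels a system actually has.

```javascript
// The Day 0/1/3/7/14 schedule from the skill text, as plain configuration.
const DEFAULT_DUNNING = [
  { day: 0, action: "retry", notify: [] },
  { day: 1, action: "retry", notify: [] },
  { day: 3, action: "retry", notify: ["email"] },
  { day: 7, action: "retry", notify: ["urgent_email", "in_app_banner"] },
  { day: 14, action: "cancel", notify: ["final_email"] },
];

// Returns the steps that are due and not yet executed.
// `completedDays` is a Set of `day` values already handled for this subscription.
function dueSteps(schedule, daysSinceFailure, completedDays = new Set()) {
  return schedule.filter(
    (step) => step.day <= daysSinceFailure && !completedDays.has(step.day)
  );
}
```

A different plan tier would pass its own array to `dueSteps`; the function itself does not change.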
143
+ ### Reconciliation
144
+
145
+ 6. **Daily reconciliation process:**
146
+ ```
147
+ Every 24 hours:
148
+ 1. Fetch all payments from provider API (last 48 hours, overlap for safety)
149
+ 2. Match against internal payment records
150
+ 3. Flag discrepancies:
151
+ - Payment in provider but not in our DB (missed webhook)
152
+ - Payment in our DB but not in provider (ghost record)
153
+ - Amount mismatch (partial capture, currency conversion)
154
+ - Status mismatch (we say succeeded, provider says failed)
155
+ 4. Auto-resolve simple cases (missed webhook → replay)
156
+ 5. Alert on unresolvable discrepancies (requires human review)
157
+ ```
158
+
159
+ Rules:
160
+ - Reconciliation runs daily minimum, hourly for high-volume systems
161
+ - Use 48-hour overlap window to catch delayed settlements
162
+ - Discrepancy alerts go to finance + engineering
163
+ - Never auto-resolve amount mismatches (always flag for human)
164
+
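Reviewer sketch (not part of the package): the matching step of the reconciliation process above. Record shapes (`id`, `amountCents`, `status`) and the `reconcile` name are assumptions. Only the missed-webhook case is marked auto-resolvable, as the skill states; treating status mismatches and ghost records as manual is an assumption of this sketch.

```javascript
// Compares provider records with internal records and lists discrepancies.
function reconcile(providerPayments, internalPayments) {
  const internalById = new Map(internalPayments.map((p) => [p.id, p]));
  const providerIds = new Set(providerPayments.map((p) => p.id));
  const discrepancies = [];

  for (const pp of providerPayments) {
    const ip = internalById.get(pp.id);
    if (!ip) {
      // In provider but not in our DB: missed webhook, safe to replay.
      discrepancies.push({ id: pp.id, type: "missing_internal", autoResolve: true });
      continue;
    }
    if (ip.amountCents !== pp.amountCents) {
      // Amount mismatches are never auto-resolved.
      discrepancies.push({ id: pp.id, type: "amount_mismatch", autoResolve: false });
    }
    if (ip.status !== pp.status) {
      discrepancies.push({ id: pp.id, type: "status_mismatch", autoResolve: false });
    }
  }

  for (const ip of internalPayments) {
    if (!providerIds.has(ip.id)) {
      // In our DB but not in provider: ghost record.
      discrepancies.push({ id: ip.id, type: "missing_provider", autoResolve: false });
    }
  }
  return discrepancies;
}
```

Entries with `autoResolve: false` would feed the finance and engineering alert; the fetch over the 48-hour window happens before this function is called.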
165
+ ## Self-check before task completion
166
+
167
+ Before marking a task done when this skill was active:
168
+
169
+ - [ ] Is there a well-defined state machine for payment lifecycle?
170
+ - [ ] Does every charge call include an idempotency key?
171
+ - [ ] Are webhooks verified (signature), deduplicated, and processed async?
172
+ - [ ] Is PCI scope minimized (client-side tokenization, no raw card data on server)?
173
+ - [ ] For subscriptions: is the dunning sequence defined with escalating notifications?
174
+ - [ ] Is daily reconciliation implemented (provider vs internal records)?
175
+ - [ ] Are all payment state transitions logged with timestamp and reason?
176
+ - [ ] Has the security-review skill been co-activated for this payment code?
@@ -0,0 +1,140 @@
1
+ ---
2
+ name: performance-reviews
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.3.0
5
+ status: stable
6
+ triggers: performance review engineering, promotion case writing, feedback framework, calibration session, engineering evaluation criteria, performance improvement plan, impact documentation, promotion packet, peer feedback engineering, engineering levels, growth assessment, performance calibration
7
+ ---
8
+
9
+ # Performance Reviews
10
+
11
+ ## When this skill activates
12
+
13
+ This skill activates when conducting engineering performance evaluations, writing promotion cases, designing feedback frameworks, participating in calibration sessions, creating performance improvement plans, or assessing growth against engineering levels. It applies to engineering managers, tech leads, and senior engineers involved in performance management.
14
+
15
+ ## Mandatory actions when this skill is active
16
+
17
+ ### Before performance reviews
18
+
19
+ 1. **Define evaluation criteria explicitly** — What does success look like at each engineering level? Common dimensions: technical execution, system design, code quality, communication, collaboration, ownership, impact, mentorship. Map criteria to levels (junior, mid, senior, staff, principal).
20
+ 2. **Collect evidence throughout the cycle** — Don't rely on memory. Keep a running doc of: projects shipped, PRs reviewed, incidents handled, design docs written, mentoring moments. Real-time logging prevents recency bias.
21
+ 3. **Gather 360-degree feedback** — Ask peers, cross-functional partners, and direct reports (if applicable) for input. Single-source feedback is incomplete. Use structured prompts: "What does [Engineer] do well?" "Where could they grow?"
22
+ 4. **Review the engineer's self-assessment** — Ask them to evaluate their own performance before writing your review. Gaps between self-assessment and manager assessment are learning opportunities.
23
+
24
+ ### During performance evaluation
25
+
26
+ #### Engineering Level Expectations
27
+
28
+ Use a competency matrix to define clear expectations at each level. Example dimensions:
29
+
30
+ | Dimension | Junior | Mid-Level | Senior | Staff | Principal |
31
+ |-----------|--------|-----------|--------|-------|-----------|
32
+ | Scope of Work | Well-defined tasks | Small features | Full features/services | Multi-team projects | Org-wide initiatives |
33
+ | Technical Complexity | Low complexity | Medium complexity | High complexity | Architectural decisions | Strategic direction |
34
+ | Autonomy | Needs guidance | Some autonomy | Fully autonomous | Defines direction | Sets vision |
35
+ | Code Quality | Learns best practices | Applies best practices | Role models best practices | Defines standards | Elevates org quality |
36
+ | Design | Implements designs | Designs small features | Designs systems | Designs platforms | Shapes architecture |
37
+ | Mentorship | Learns from others | Helps peers | Mentors 1-2 juniors | Mentors team | Mentors org |
38
+ | Communication | Within team | Cross-team (technical) | Cross-functional | Executives + external | Industry thought leader |
39
+ | Impact | Individual tasks | Team features | Service/product | Organization | Company/industry |
40
+
41
+ #### Evaluation Process
42
+
43
+ - **Rate performance on each dimension** — Use a 1-5 scale: 1 = Below expectations, 2 = Partially meets, 3 = Meets, 4 = Exceeds, 5 = Greatly exceeds.
44
+ - **Provide specific examples** — Don't say "Strong communicator." Say "Led design review for Payment Service rewrite, clearly articulated tradeoffs, and incorporated feedback from 5 engineers."
45
+ - **Distinguish between performance at level vs readiness for next level** — Meeting expectations at Senior level doesn't automatically mean ready for Staff. Promotion requires sustained performance at the next level.
46
+ - **Identify growth areas** — Every engineer has gaps. Name them specifically and provide actionable guidance: "To reach Staff, you need to mentor 2-3 engineers and lead a cross-team project."
47
+
48
+ #### Feedback Framework: SBI + Coaching
49
+
50
+ Use **Situation-Behavior-Impact (SBI)** for developmental feedback:
51
+ - **Situation**: "In last week's design review..."
52
+ - **Behavior**: "...you dismissed Sarah's concern about edge cases without discussing it..."
53
+ - **Impact**: "...which made the team hesitant to raise concerns in future reviews."
54
+
55
+ Follow SBI with **Coaching**:
56
+ - "Next time, try acknowledging the concern and discussing it openly. Even if you ultimately disagree, demonstrating openness builds trust."
57
+
58
+ #### Writing Performance Reviews
59
+
60
+ **Structure:**
61
+ 1. **Summary** — Overall performance (meets/exceeds expectations), 2-3 key strengths, 1-2 growth areas.
62
+ 2. **Key Accomplishments** — 3-5 most impactful projects or contributions with specific outcomes (metrics, launches, quality improvements).
63
+ 3. **Dimension-by-Dimension Assessment** — Rate and provide examples for each competency (technical execution, collaboration, ownership, etc.).
64
+ 4. **Growth Areas** — 1-3 specific areas for development with actionable suggestions.
65
+ 5. **Career Development** — If promotion-track, outline path to next level. If not promotion-track, outline how to grow within current level.
66
+
67
+ **Tone:**
68
+ - Be direct but supportive. Sugarcoating developmental feedback doesn't help.
69
+ - Use "I observed" not "you are." Focus on behavior, not identity.
70
+ - Balance positive and developmental. If someone is strong, say so. If they have gaps, name them.
71
+
72
+ #### Promotion Case Writing
73
+
74
+ **Promotion Readiness Criteria:**
75
+ - **Sustained performance at the next level** — For 6-12 months, not just one stellar project. Consistency matters.
76
+ - **Demonstrated scope expansion** — Taking on bigger, more complex, more ambiguous work.
77
+ - **Organizational impact** — Contributed beyond their immediate team (mentoring, tooling, process improvements).
78
+
79
+ **Promotion Packet Structure:**
80
+ 1. **Summary** — Candidate name, current level, target level, tenure, manager endorsement.
81
+ 2. **Case for Promotion** — Why are they ready? Use the competency matrix. Show where they meet or exceed next-level expectations.
82
+ 3. **Key Accomplishments** — 3-5 high-impact projects with measurable outcomes. Align each to next-level competencies.
83
+ 4. **Peer Feedback** — 3-5 quotes from peers, cross-functional partners, or senior engineers. Shows they operate at the next level already.
84
+ 5. **Growth Areas** — Even strong candidates have gaps. Acknowledge them but show they're manageable.
85
+ 6. **Comparison to Peers** — How does this candidate compare to others at the target level? Calibration context matters.
86
+
87
+ **Pitfall:** Nominating someone for promotion because they've been around a long time, not because they perform at the next level. Tenure is not a promotion criterion.
88
+
89
+ #### Performance Improvement Plans (PIPs)
90
+
91
+ **When to Use PIPs:**
92
+ - Performance is significantly below expectations for 2+ months.
93
+ - Specific, documented performance issues that haven't improved despite feedback.
94
+ - The PIP does not come as a surprise. It should be the culmination of ongoing feedback, not a sudden shock.
95
+
96
+ **PIP Structure:**
97
+ 1. **Performance Gaps** — Specific areas where performance is below expectations. Use examples.
98
+ 2. **Success Criteria** — What does improvement look like? Measurable, time-bound goals (e.g., "Ship 2 features with <2 rounds of rework within 60 days").
99
+ 3. **Support Provided** — What will the manager, mentor, or team do to support improvement? (e.g., weekly 1:1s, pairing sessions, dedicated mentor).
100
+ 4. **Timeline** — Typically 30-60 days. Clear checkpoints (15 days, 30 days).
101
+ 5. **Consequences** — If performance doesn't improve, what happens? (Usually termination or role change.)
102
+
103
+ **Facilitation:**
104
+ - Weekly check-ins during the PIP period. Don't wait until the end to give feedback.
105
+ - Document everything. Notes from 1:1s, progress on goals, feedback given.
106
+ - Be honest but supportive. The goal is improvement, not punishment.
107
+
108
+ #### Calibration Sessions
109
+
110
+ **Purpose:** Ensure consistency in performance ratings across managers. Prevents rating inflation or deflation.
111
+
112
+ **Process:**
113
+ 1. **Managers submit preliminary ratings** — Each manager rates their team members on the 1-5 scale.
114
+ 2. **Group discussion** — Managers present outlier cases (all 5s, any 1s or 2s). Justify ratings with evidence.
115
+ 3. **Identify inconsistencies** — If Manager A rates their team higher than Manager B for similar performance, probe why. Normalize.
116
+ 4. **Adjust ratings** — Based on discussion, managers adjust ratings to reflect consistent standards.
117
+
118
+ **Best Practices:**
119
+ - Come prepared with evidence. Don't rely on vague impressions.
120
+ - Challenge inflation. If someone rates their entire team 4s and 5s, that's not a high-performing team; that's grade inflation.
121
+ - Use the competency matrix as the source of truth. Ratings should map to observable behaviors at each level.
122
+
123
+ ### After performance reviews
124
+
125
+ - **Deliver feedback in 1:1s** — Don't just send the written review. Walk through it together. Give the engineer space to ask questions or disagree.
126
+ - **Create a growth plan** — Based on the review, define 30-60-90 day goals tied to growth areas. Make it concrete and actionable.
127
+ - **Follow up regularly** — Check progress on growth goals in 1:1s. Don't wait for the next review cycle to give feedback.
128
+ - **Track promotion pipeline** — Identify engineers who are on a promotion track. Ensure they get the projects and visibility needed to demonstrate readiness.
129
+ - **Document outcomes** — If performance improves or worsens, note it. Builds a longitudinal record that's useful for calibration and promotion discussions.
130
+
131
+ ## Self-check before task completion
132
+
133
+ - [ ] Evaluation criteria are explicitly defined with clear expectations at each engineering level
134
+ - [ ] Evidence is collected throughout the review cycle (projects, PRs, incidents, mentoring)
135
+ - [ ] 360-degree feedback is gathered from peers, cross-functional partners, and direct reports (if applicable)
136
+ - [ ] Performance review includes specific examples for each competency, not vague statements
137
+ - [ ] Developmental feedback uses SBI framework (Situation-Behavior-Impact) with coaching
138
+ - [ ] Promotion case demonstrates sustained performance at the next level for 6-12 months
139
+ - [ ] Promotion packet includes key accomplishments, peer feedback, and calibration context
140
+ - [ ] PIPs include specific performance gaps, measurable success criteria, and support plan
@@ -0,0 +1,58 @@
1
+ ---
2
+ name: platform-observability
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.7.0
5
+ status: stable
6
+ triggers: platform observability design, unified observability stack, trace correlation platform, log aggregation architecture, metrics cardinality management, observability platform, telemetry pipeline, distributed tracing platform, observability cost, observability data model, observability self-service, alert routing platform
7
+ compose: observability-stack
8
+ ---
9
+
10
+ # Skill — Platform Observability
11
+
12
+ ## When this skill activates
13
+
14
+ This skill activates when the user is designing or implementing platform observability capabilities. This includes unified observability stacks, trace correlation, log aggregation architecture, metrics cardinality management, telemetry pipelines, distributed tracing platforms, observability cost optimization, observability data models, self-service observability, and alert routing platforms.
15
+
16
+ ## Mandatory actions when this skill is active
17
+
18
+ ### Before writing any code
19
+
20
+ 1. Audit current observability tooling: metrics (Prometheus, Datadog), logs (Elasticsearch, Splunk), traces (Jaeger, Honeycomb). Identify gaps and redundancies.
21
+ 2. Assess observability cost: cost per service, cost per metric, cost per log line, cost per trace. Identify high-cardinality metrics and expensive log patterns.
22
+ 3. Define observability requirements per service tier: critical services need full traces, non-critical services need sampled traces.
23
+ 4. Map existing alert sprawl: how many alerts fire per week, what percentage are actionable. Target: reduce noise by 70-90%.
24
+ 5. Establish observability SLOs: query latency (p95 < 3 seconds), data freshness (under 30 seconds), retention (metrics 30 days, logs 7 days, traces 7 days).
25
+
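The per-tier tracing requirement in step 3 could be expressed as a deterministic head-based sampler. The following is a minimal sketch, not the skill's prescribed implementation: the tier names, the rates in `TIER_SAMPLE_RATES`, and the `should_sample` helper are all hypothetical. OpenTelemetry SDKs provide trace-ID ratio samplers that would normally fill this role.

```python
import hashlib

# Hypothetical per-tier sampling rates; tier names and values are assumptions.
TIER_SAMPLE_RATES = {"critical": 1.0, "standard": 0.10, "low": 0.01}


def should_sample(trace_id: str, tier: str) -> bool:
    """Decide at trace start whether to keep a trace, deterministically per trace ID."""
    rate = TIER_SAMPLE_RATES.get(tier, 0.01)
    if rate >= 1.0:
        return True  # full tracing for critical services
    # Hash the trace ID to a stable number in [0, 1] so that every service
    # reaches the same decision for the same trace.
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate
```

Keying the decision on the trace ID rather than on a random draw keeps a trace either fully kept or fully dropped across services.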
26
+ ### During implementation
27
+
28
+ - **Unified Observability Stack:** Use OpenTelemetry for instrumentation (vendor-neutral). Collect metrics, logs, and traces via a single SDK. Export to backend(s) of choice (Prometheus, Loki, Tempo). Avoids vendor lock-in.
29
+ - **Trace Correlation:** Link traces, logs, and metrics via trace ID. Every log line should include trace ID and span ID. Enables root-cause analysis by jumping from metric spike → trace → logs.
30
+ - **Log Aggregation Architecture:** Centralized log storage (Elasticsearch, Loki, CloudWatch). Use structured logging (JSON) with consistent schema. Index on: service, environment, level, trace_id. Retain logs for 7-30 days (compliance may require longer).
31
+ - **Metrics Cardinality Management:** High-cardinality labels (user_id, request_id) explode metric storage cost. Use exemplars (link to trace) instead. Limit labels to: service, environment, region, status_code. Target: under 10,000 time series per service.
32
+ - **Telemetry Pipeline:** Decouple collection from storage. Use OpenTelemetry Collector as aggregation layer. Enables: sampling, filtering, enrichment, multi-backend export. Pipeline should handle 100k+ events/second.
33
+ - **Distributed Tracing:** Instrument all services with OpenTelemetry. Use head-based sampling (sample 1-10% of traces) or tail-based sampling (sample slow/error traces at 100%, fast traces at 1%). Traces should include: service name, operation, duration, status, attributes (http.method, db.statement).
34
+ - **Observability Cost Optimization:** Sample aggressively (1-10% for most services). Drop high-volume, low-value logs (health checks, debug logs in prod). Use tiered storage (hot: 7 days, warm: 30 days, cold: 90 days). Target: observability cost under 5% of infrastructure cost.
35
+ - **Observability Data Model:** Use semantic conventions (OpenTelemetry) for consistent attribute naming. Define standard labels: service.name, deployment.environment, service.version. Enables cross-service queries and dashboards.
36
+ - **Self-Service Observability:** Developers provision dashboards and alerts via IaC (Terraform, Jsonnet). Pre-built dashboard templates for common patterns (RED, USE, Golden Signals). Query language accessible to non-SREs (LogQL, PromQL with examples).
37
+ - **Alert Routing:** Route alerts to appropriate team via PagerDuty, Opsgenie, or Slack. Use severity levels: SEV1 (page), SEV2 (urgent Slack), SEV3 (non-urgent Slack). Alerts should include: runbook link, suggested queries, recent changes.
38
+
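The trace correlation and structured logging points above can be illustrated with the standard library alone. This is a sketch under assumptions: the `JsonFormatter` and `build_logger` names, the service name, and the example IDs are hypothetical, and a real service would read the trace and span IDs from the active span context rather than pass them by hand.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with a fixed set of fields."""

    def __init__(self, service: str, environment: str) -> None:
        super().__init__()
        self.service = service
        self.environment = environment

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "service": self.service,
            "environment": self.environment,
            "level": record.levelname,
            "message": record.getMessage(),
            # trace_id / span_id are supplied by the caller via `extra=`.
            "trace_id": getattr(record, "trace_id", None),
            "span_id": getattr(record, "span_id", None),
        })


def build_logger(service: str, environment: str) -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(service)
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter(service, environment))
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = build_logger("checkout", "prod")  # hypothetical service
logger.info(
    "payment authorized",
    extra={"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "span_id": "00f067aa0ba902b7"},
)
```

Because every line carries the same keys, the log store can index on `service`, `environment`, `level`, and `trace_id` without per-service parsing rules.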
39
+ ### After implementation
40
+
41
+ - Verify all services emit metrics, logs, and traces via OpenTelemetry.
42
+ - Confirm trace IDs are propagated and linked across logs, metrics, and traces.
43
+ - Validate metrics cardinality is under 10,000 time series per service.
44
+ - Ensure telemetry pipeline handles 100k+ events/second with sampling and filtering.
45
+ - Check that observability cost is under 5% of infrastructure cost.
46
+
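The cardinality check above amounts to counting distinct label combinations per service. A minimal sketch, assuming samples arrive as `(metric_name, labels)` pairs with a `service` label. The input shape and the helper names are hypothetical, and a real check would query the metrics backend instead.

```python
from collections import defaultdict

MAX_SERIES_PER_SERVICE = 10_000  # target stated in this skill


def series_per_service(samples):
    """Count distinct (metric name, label set) pairs for each service."""
    seen = defaultdict(set)
    for metric_name, labels in samples:
        series_key = (metric_name, frozenset(labels.items()))
        seen[labels.get("service", "unknown")].add(series_key)
    return {service: len(keys) for service, keys in seen.items()}


def over_budget(samples, limit=MAX_SERIES_PER_SERVICE):
    """Return only the services whose series count exceeds the limit."""
    return {s: n for s, n in series_per_service(samples).items() if n > limit}
```

A label such as `user_id` multiplies the number of distinct label sets, which is why it shows up quickly in a count like this.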
47
+ ## Self-check before task completion
48
+
49
+ - [ ] Unified observability stack uses OpenTelemetry for vendor-neutral instrumentation.
50
+ - [ ] Trace correlation links metrics, logs, and traces via trace ID and span ID.
51
+ - [ ] Log aggregation uses structured logging (JSON) with consistent schema.
52
+ - [ ] Metrics cardinality is managed: under 10,000 time series per service.
53
+ - [ ] Telemetry pipeline decouples collection from storage and handles 100k+ events/second.
54
+ - [ ] Distributed tracing uses sampling (head or tail-based) to control costs.
55
+ - [ ] Observability cost is under 5% of infrastructure cost via sampling and tiered storage.
56
+ - [ ] Observability data model uses OpenTelemetry semantic conventions.
57
+ - [ ] Self-service observability enables developers to provision dashboards and alerts via IaC.
58
+ - [ ] Alert routing uses severity levels and includes runbook links.
@@ -0,0 +1,52 @@
1
+ ---
2
+ name: platform-reliability
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.7.0
5
+ status: stable
6
+ triggers: platform reliability engineering, SLO management platform, error budget policy, platform availability design, capacity management platform, platform SLA, reliability target, platform health metric, platform uptime, error budget spending, toil reduction platform, platform incident prevention
7
+ compose: incident-management
8
+ ---
9
+
10
+ # Skill — Platform Reliability
11
+
12
+ ## When this skill activates
13
+
14
+ This skill activates when the user is designing or implementing platform reliability capabilities. This includes SLO management systems, error budget policies, platform availability design, capacity management, platform health metrics, reliability targets, uptime monitoring, error budget tracking, toil reduction initiatives, and platform incident prevention strategies.
15
+
16
+ ## Mandatory actions when this skill is active
17
+
18
+ ### Before writing any code
19
+
20
+ 1. Inventory all platform services and their current reliability posture (uptime, error rates, latency, throughput).
21
+ 2. Define SLOs for each platform capability (e.g., 99.9% for APIs, 99.5% for batch jobs, 99.99% for critical path services).
22
+ 3. Establish error budget policy: what happens when budget is exhausted (freeze launches, prioritize reliability work).
23
+ 4. Identify toil sources (manual escalations, runbook execution, repetitive debugging) and quantify hours spent per week.
24
+ 5. Map platform dependencies and identify single points of failure that require redundancy.
25
+
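To make the SLO targets in step 2 concrete, it can help to translate each percentage into tolerated downtime. A small sketch, assuming a 30-day window and treating the SLO as a pure availability target. The function name is illustrative.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of full unavailability an availability SLO tolerates in the window."""
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in (0, 100]")
    window_minutes = window_days * 24 * 60
    return window_minutes * (1 - slo_percent / 100)


for target in (99.5, 99.9, 99.99):
    print(f"{target}% over 30 days -> {allowed_downtime_minutes(target):.1f} min")
```

Over 30 days this works out to roughly 216 minutes at 99.5%, 43.2 minutes at 99.9%, and 4.3 minutes at 99.99%, which shows how much each extra nine costs.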
26
+ ### During implementation
27
+
28
+ - **SLO Management:** Define SLOs as percentiles over rolling windows (e.g., 99th percentile latency < 200ms over 28 days). Avoid averages (they hide outliers). Each SLO should have: objective, measurement window, error budget calculation, and owner.
29
+ - **Error Budget Policy:** If error budget is exhausted, automatically freeze non-critical deployments and redirect engineering time to reliability improvements. Budget resets monthly or quarterly. Include exemptions for security patches.
30
+ - **Platform Availability Design:** Use multi-region active-active for critical path services. Implement circuit breakers, rate limiting, and graceful degradation. Platform should survive single availability zone failure with zero downtime.
31
+ - **Capacity Management:** Track resource utilization (CPU, memory, disk, network) and predict exhaustion 30-90 days in advance. Automate horizontal scaling for stateless services. Capacity alerts should fire before user-visible impact.
32
+ - **Platform Health Metrics:** Track: request rate, error rate, latency (p50, p95, p99), saturation, and throughput. Use RED (Rate, Errors, Duration) for services and USE (Utilization, Saturation, Errors) for infrastructure. Dashboards should load in under 3 seconds.
33
+ - **Toil Reduction:** Automate repetitive tasks that consume more than 2 hours per week. Toil reduction should free up 30-50% of on-call time within 6 months. Track toil hours saved as a platform metric.
34
+ - **Incident Prevention:** Use chaos engineering to validate failure modes (kill instances, partition networks, inject latency). Run game days quarterly. Each incident should produce at least one actionable prevention task.
35
+
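The error budget policy above can be sketched as a small gate that a deployment pipeline might call. This assumes a request-based SLO. The `SloWindow` type and both function names are hypothetical, and the counts would come from the metrics backend.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    slo_target: float     # e.g. 0.999 for 99.9%
    total_requests: int   # requests observed in the rolling window
    failed_requests: int  # requests that violated the SLO


def budget_remaining(window: SloWindow) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = (1 - window.slo_target) * window.total_requests
    if allowed_failures == 0:
        return 1.0 if window.failed_requests == 0 else float("-inf")
    return 1 - window.failed_requests / allowed_failures


def deploy_allowed(window: SloWindow, is_security_patch: bool = False) -> bool:
    """Freeze non-critical deploys once the budget is spent; security patches are exempt."""
    return is_security_patch or budget_remaining(window) > 0
```

For example, a 99.9% target over 1,000,000 requests allows about 1,000 failures, so 400 failures leave roughly 60% of the budget and deployments continue.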
36
+ ### After implementation
37
+
38
+ - Verify each platform service has defined SLOs, error budgets, and dashboards tracking compliance.
39
+ - Confirm error budget policy is enforced automatically (deployment freezes when budget exhausted).
40
+ - Validate multi-region failover works via chaos engineering tests (kill a region, verify zero downtime).
41
+ - Ensure capacity management predicts exhaustion 30-90 days in advance with alerts.
42
+ - Check that toil reduction initiatives have freed up measurable on-call time (tracked weekly).
43
+
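One simple way to check the 30-90 day capacity prediction is a straight-line fit over recent daily usage. This is a sketch under the assumption that growth is roughly linear. The function name and the sample data are hypothetical, and seasonal workloads would need a better model.

```python
from typing import Optional, Sequence


def days_until_exhaustion(daily_usage: Sequence[float], capacity: float) -> Optional[float]:
    """Fit a least-squares line to daily usage and extrapolate to `capacity`.

    Returns days from the last sample until the trend reaches capacity,
    or None if usage is flat or shrinking.
    """
    n = len(daily_usage)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(daily_usage))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    day_at_capacity = (capacity - intercept) / slope
    return max(0.0, day_at_capacity - (n - 1))


# Hypothetical disk usage in GB over five days, against a 1000 GB volume.
remaining = days_until_exhaustion([500, 510, 520, 530, 540], capacity=1000)
should_alert = remaining is not None and remaining <= 90
```

With usage growing 10 GB per day from 540 GB, the trend reaches 1000 GB in 46 days, which falls inside the 90-day alerting horizon.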
44
+ ## Self-check before task completion
45
+
46
+ - [ ] Each platform service has SLOs defined as percentiles over rolling windows.
47
+ - [ ] Error budget policy automatically freezes non-critical deployments when budget exhausted.
48
+ - [ ] Platform survives single availability zone failure with zero user-visible downtime.
49
+ - [ ] Capacity management predicts resource exhaustion 30-90 days in advance.
50
+ - [ ] Platform health metrics use RED for services and USE for infrastructure.
51
+ - [ ] Toil reduction initiatives free up 30-50% of on-call time within 6 months.
52
+ - [ ] Chaos engineering validates failure modes quarterly via game days.
@@ -0,0 +1,96 @@
1
+ ---
2
+ name: post-incident-learning
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.1.0
5
+ status: stable
6
+ triggers: post-incident learning, systemic pattern, defense mechanism, recurrence prevention, incident class, contributing factor analysis, improvement measurement, learning review, incident pattern, failure class prevention, defense layer, systemic fix
7
+ compose: incident-management
8
+ ---
9
+
10
+ # Post-Incident Learning
11
+
12
+ ## When this skill activates
13
+
14
+ This skill activates after an incident has been resolved and the team needs to extract
15
+ lasting organizational learning. It goes beyond traditional postmortems (which document
16
+ what happened) to identify failure classes, build defense mechanisms, and measure
17
+ whether the organization is actually getting better at preventing recurrence.
18
+
19
+ ## Mandatory actions when this skill is active
20
+
21
+ ### Before
22
+
23
+ 1. **Gather incident data** — Timeline, logs, communications, actions taken, resolution
24
+ steps. Ensure the raw facts are documented before memory fades.
25
+ 2. **Identify participants** — Who was involved in detection, response, and resolution?
26
+ Schedule the learning review within 72 hours of resolution.
27
+ 3. **Set the frame** — This is a learning exercise, not a blame exercise. Establish
28
+ psychological safety explicitly. No individual fault-finding.
29
+
30
+ ### During
31
+
32
+ 4. **Distinguish postmortem from learning:**
33
+ - Postmortem = What happened? (timeline, root cause, impact)
34
+ - Learning = What patterns do we now defend against? (systemic improvement)
35
+ - This skill focuses on the LEARNING phase that follows the postmortem.
36
+
37
+ 5. **Contributing factor analysis (go beyond root cause):**
38
+ - **Proximate cause** — The immediate trigger (e.g., bad deploy, config change).
39
+ - **Contributing factors** — Conditions that allowed the trigger to cause harm
40
+ (e.g., missing validation, no canary, insufficient monitoring).
41
+ - **Systemic conditions** — Organizational patterns that created the contributing
42
+ factors (e.g., time pressure, unclear ownership, technical debt tolerance).
43
+ - Map all three levels. Fixes at only the proximate level guarantee recurrence.
44
+
45
+ 6. **Identify the incident CLASS:**
46
+ - This is not just one incident — what CLASS of failure does it represent?
47
+ - Examples: "deploy without validation," "silent dependency failure," "config
48
+ drift between environments," "cascading timeout."
49
+ - Name the class explicitly. Search history for past incidents of the same class.
50
+ - If this class has occurred before, the previous fixes were insufficient.
51
+
52
+ 7. **Design defense mechanisms (layered):**
53
+ - **Layer 1: Automated prevention** — Make the failure impossible through code,
54
+ infrastructure, or tooling changes (strongest defense).
55
+ - **Layer 2: Automated detection** — If prevention is impossible, detect instantly
56
+ and auto-remediate or alert within seconds.
57
+ - **Layer 3: Process improvement** — Checklists, review steps, approval gates
58
+ (weakest defense — humans forget).
59
+ - Never accept "be more careful" as a defense mechanism.
60
+ - Always prefer Layer 1 over Layer 2, and Layer 2 over Layer 3.
61
+
62
+ 8. **Define improvement measurements:**
63
+ - **Mean time between incidents of same class** — Trending up means defenses work.
64
+ - **Detection time** — Time from failure occurrence to human awareness.
65
+ - **Recovery time** — Time from awareness to full resolution.
66
+ - **Blast radius** — Users/systems affected (should shrink over time).
67
+ - Set specific targets for each metric.
68
+
69
+ 9. **Create action items with teeth:**
70
+ - Each action item must have: owner, deadline, definition of done, and verification
71
+ method.
72
+ - Classify priority: P0 (fix this week), P1 (fix this sprint), P2 (fix this quarter).
73
+ - Track completion publicly. Incomplete incident actions are organizational debt.
74
+
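The "mean time between incidents of same class" metric from step 8 can be computed from a plain incident log. A minimal sketch, assuming each record is an `(incident_class, occurred_at)` pair. The record shape and function name are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def mean_days_between_incidents(incidents):
    """Mean gap in days between consecutive incidents of each class.

    Classes with only one recorded incident are omitted, since no gap exists.
    """
    by_class = defaultdict(list)
    for incident_class, occurred_at in incidents:
        by_class[incident_class].append(occurred_at)

    result = {}
    for incident_class, times in by_class.items():
        times.sort()
        gaps = [(b - a).total_seconds() / 86400 for a, b in zip(times, times[1:])]
        if gaps:
            result[incident_class] = mean(gaps)
    return result
```

Computing this per quarter and comparing the values shows whether the gap for a given class is trending up, which is the signal that the defenses are working.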
75
+ ### After
76
+
77
+ 10. **Share the learning broadly** — Publish findings to the engineering organization.
78
+ Other teams may have the same class of vulnerability.
79
+ 11. **Update runbooks and alerts** — Ensure the detection and response improvements are
80
+ codified in operational documentation.
81
+ 12. **Schedule verification** — In 30 days, verify that defense mechanisms are in place
82
+ and metrics show improvement. If not, escalate.
83
+ 13. **Feed into incident class tracker** — Maintain an organizational record of incident
84
+ classes, their defenses, and recurrence rates.
85
+
86
+ ## Self-check before task completion
87
+
88
+ - [ ] Contributing factors analyzed at all three levels (proximate, contributing, systemic)
89
+ - [ ] Incident class identified and named (not just this one incident)
90
+ - [ ] Historical incidents of same class searched and referenced
91
+ - [ ] Defense mechanisms designed with preference for automation over process
92
+ - [ ] No action item says "be more careful" or equivalent
93
+ - [ ] Metrics defined with specific improvement targets
94
+ - [ ] All action items have owner, deadline, and verification method
95
+ - [ ] Learning shared beyond the immediate team
96
+ - [ ] 30-day verification scheduled
@@ -0,0 +1,104 @@
1
+ ---
2
+ name: product-manager
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.0.6
5
+ status: stable
6
+ triggers: user story, PRD, product requirements, backlog prioritization, RICE score, MoSCoW, jobs to be done, feature scoring, sprint planning, product backlog, acceptance criteria, user journey
7
+ ---
8
+
9
+ # Skill — Product Manager
10
+
11
+ ## When this skill activates
12
+ Any task involving product requirements, user story writing, backlog prioritization,
13
+ feature scoring, sprint planning, PRD creation, or user journey mapping.
14
+
15
+ ## Mandatory actions when this skill is active
16
+
17
+ ### Before
18
+
19
+ 1. **Define the problem** — Articulate the user problem with evidence (tickets, data, interviews). No solutions yet.
20
+ 2. **Identify personas** — 1-3 specific personas with context, goals, and frustrations.
21
+ 3. **State success metrics** — Define how to measure success BEFORE designing the solution.
22
+
23
+ ### During
24
+
25
+ #### PRD structure (6 mandatory sections)
26
+ 1. **Problem Statement** — data-backed, who has it, how we know
27
+ 2. **User Personas** — context, goal, frustration per persona
28
+ 3. **Requirements** — functional (numbered, prioritized) + non-functional (perf, a11y)
29
+ 4. **Success Metrics** — current baseline, target, measurement method per metric
30
+ 5. **Scope + Timeline** — phases with explicit out-of-scope items
31
+ 6. **Risks + Mitigations** — risk, probability, impact, mitigation plan
32
+
33
+ #### User story format
34
+ ```
35
+ As a [persona], I want [action] so that [outcome/value].
36
+
37
+ Rules:
38
+ - One story = one testable behavior
39
+ - Always include "so that" (forces value articulation)
40
+ - Completable in one sprint (split if larger)
41
+ - Every story has acceptance criteria attached
42
+ ```
43
+
44
+ #### Acceptance criteria (Given/When/Then)
45
+ ```gherkin
46
+ Scenario: [descriptive name]
47
+ Given [precondition/context]
48
+ When [action taken]
49
+ Then [observable outcome]
50
+ And [additional assertions]
51
+ ```
52
+ Cover: happy path, edge cases, error states, boundary conditions.
53
+
54
+ #### RICE scoring
55
+ ```
56
+ Score = (Reach * Impact * Confidence) / Effort
57
+ Reach: users affected per quarter
58
+ Impact: 3=massive, 2=high, 1=medium, 0.5=low, 0.25=minimal
59
+ Confidence: 100%=high, 80%=medium, 50%=low
60
+ Effort: person-weeks
61
+ ```
62
+ Show the math. Rank by score. Communicate rationale for top picks.
63
+
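As an illustration of "show the math", the RICE formula above can be applied to a small backlog. The feature names and numbers below are hypothetical, and the `Feature` class is only a sketch of one way to hold the inputs.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    reach: float       # users affected per quarter
    impact: float      # 3, 2, 1, 0.5, or 0.25
    confidence: float  # 1.0, 0.8, or 0.5
    effort: float      # person-weeks

    @property
    def rice(self) -> float:
        if self.effort <= 0:
            raise ValueError("effort must be positive")
        return self.reach * self.impact * self.confidence / self.effort


# Hypothetical backlog items, for illustration only.
backlog = [
    Feature("Bulk export", reach=2000, impact=1, confidence=0.8, effort=4),
    Feature("SSO login", reach=500, impact=3, confidence=1.0, effort=6),
    Feature("Dark mode", reach=4000, impact=0.5, confidence=0.5, effort=2),
]

ranked = sorted(backlog, key=lambda f: f.rice, reverse=True)
for f in ranked:
    print(f"{f.name}: ({f.reach} * {f.impact} * {f.confidence}) / {f.effort} = {f.rice:.0f}")
```

In this example the scores are 500, 400, and 250, so "Dark mode" ranks first despite its low impact because its reach is high and its effort is small.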
64
+ #### MoSCoW prioritization
65
+ - Must Have: non-negotiable for launch (failure without these)
66
+ - Should Have: expected, but launch survives without them
67
+ - Could Have: nice-to-have if time permits
68
+ - Won't Have: explicitly deferred (prevents scope creep)
69
+
70
+ #### Jobs-to-be-Done framework
71
+ ```
72
+ Interview structure (45-60 min):
73
+ 1. First Thought — trigger that started the search
74
+ 2. Passive Looking — alternatives considered
75
+ 3. Active Looking — event that forced action NOW
76
+ 4. Decision — why this solution, what almost stopped them
77
+ 5. Satisfaction — does it deliver, what would cause switching
78
+
79
+ Output: "When [situation], I want to [motivation], so I can [outcome]."
80
+ ```
81
+
82
+ #### User journey mapping
83
+ ```
84
+ Stages: Awareness → Consideration → Setup → First Value → Expansion
85
+ Per stage: Actions, Touchpoints, Emotions, Pain Points, Opportunities, Metrics
86
+ ```
87
+ Identify the critical drop-off points and design interventions for each.
88
+
89
+ ### After
90
+
91
+ 1. **Validate with users** — Show PRD to 2-3 target users. Confirm problem resonates.
92
+ 2. **Engineering feasibility** — Tech lead confirms effort estimates and constraints.
93
+ 3. **Stakeholder sign-off** — Explicit agreement on v1 scope vs deferred.
94
+ 4. **Define done** — What must be true (metrics hit, not just code deployed).
95
+
96
+ ## Self-check before task completion
97
+ - [ ] Problem statement evidence-backed (data, quotes, ticket volume)
98
+ - [ ] Personas specific with context, goals, and frustrations
99
+ - [ ] Success metrics have baseline, target, and measurement method
100
+ - [ ] Stories follow "As a... I want... So that..." with acceptance criteria
101
+ - [ ] Backlog prioritized with visible math (RICE/MoSCoW/WSJF)
102
+ - [ ] Scope states what is OUT as well as IN
103
+ - [ ] User journey maps full experience from awareness to expansion
104
+ - [ ] Engineering validated feasibility and effort