mindforge-cc 10.0.3 → 11.0.0

Files changed (333)
  1. package/.mindforge/MINDFORGE-V2-SCHEMA.json +43 -10
  2. package/.mindforge/config.json +30 -2
  3. package/.mindforge/engine/cross-model-eval.md +74 -0
  4. package/.mindforge/engine/proactive/signal-detector.md +60 -0
  5. package/.mindforge/engine/proactive/suggestion-engine.md +100 -0
  6. package/.mindforge/personas/agent-architect.md +57 -0
  7. package/.mindforge/personas/agent-evaluator.md +162 -0
  8. package/.mindforge/personas/agent-memory-designer.md +157 -0
  9. package/.mindforge/personas/agent-ops-engineer.md +120 -0
  10. package/.mindforge/personas/agent-orchestrator.md +112 -0
  11. package/.mindforge/personas/ai-economist.md +57 -0
  12. package/.mindforge/personas/ai-safety-engineer.md +57 -0
  13. package/.mindforge/personas/analytics-engineer.md +57 -0
  14. package/.mindforge/personas/anti-pattern-hunter.md +61 -0
  15. package/.mindforge/personas/api-gateway-designer.md +132 -0
  16. package/.mindforge/personas/auth-engineer.md +112 -0
  17. package/.mindforge/personas/build-engineer.md +57 -0
  18. package/.mindforge/personas/business-analyst.md +56 -0
  19. package/.mindforge/personas/cache-architect.md +100 -0
  20. package/.mindforge/personas/causal-scientist.md +57 -0
  21. package/.mindforge/personas/cdn-architect.md +118 -0
  22. package/.mindforge/personas/change-agent.md +104 -0
  23. package/.mindforge/personas/code-narrator.md +52 -0
  24. package/.mindforge/personas/codegen-specialist.md +68 -0
  25. package/.mindforge/personas/communication-architect.md +102 -0
  26. package/.mindforge/personas/compliance-engineer.md +96 -0
  27. package/.mindforge/personas/consensus-engineer.md +116 -0
  28. package/.mindforge/personas/contract-tester.md +60 -192
  29. package/.mindforge/personas/data-architect.md +108 -0
  30. package/.mindforge/personas/data-mesh-architect.md +57 -0
  31. package/.mindforge/personas/data-pipeline-architect.md +120 -0
  32. package/.mindforge/personas/de-sloppifier.md +60 -0
  33. package/.mindforge/personas/debt-manager.md +66 -0
  34. package/.mindforge/personas/decision-architect.md +82 -51
  35. package/.mindforge/personas/deployment-captain.md +74 -0
  36. package/.mindforge/personas/design-system-lead.md +112 -0
  37. package/.mindforge/personas/dmux-orchestrator.md +75 -0
  38. package/.mindforge/personas/dx-engineer.md +96 -0
  39. package/.mindforge/personas/ecommerce-engineer.md +57 -0
  40. package/.mindforge/personas/edge-engineer.md +94 -0
  41. package/.mindforge/personas/edtech-architect.md +106 -0
  42. package/.mindforge/personas/embedding-architect.md +57 -0
  43. package/.mindforge/personas/environment-engineer.md +57 -0
  44. package/.mindforge/personas/eval-judge.md +55 -0
  45. package/.mindforge/personas/event-architect.md +102 -0
  46. package/.mindforge/personas/experiment-designer.md +138 -0
  47. package/.mindforge/personas/feature-store-engineer.md +57 -0
  48. package/.mindforge/personas/finops-analyst.md +66 -0
  49. package/.mindforge/personas/fintech-architect.md +57 -0
  50. package/.mindforge/personas/flutter-engineer.md +104 -0
  51. package/.mindforge/personas/gaming-engineer.md +57 -0
  52. package/.mindforge/personas/graphql-designer.md +73 -0
  53. package/.mindforge/personas/healthcare-engineer.md +57 -0
  54. package/.mindforge/personas/hiring-strategist.md +105 -0
  55. package/.mindforge/personas/hitl-architect.md +165 -0
  56. package/.mindforge/personas/i18n-architect.md +69 -0
  57. package/.mindforge/personas/iot-architect.md +105 -0
  58. package/.mindforge/personas/knowledge-curator.md +139 -0
  59. package/.mindforge/personas/knowledge-engineer.md +57 -0
  60. package/.mindforge/personas/lakehouse-architect.md +57 -0
  61. package/.mindforge/personas/llm-orchestrator.md +57 -0
  62. package/.mindforge/personas/logistics-architect.md +106 -0
  63. package/.mindforge/personas/market-analyst.md +53 -0
  64. package/.mindforge/personas/marketplace-engineer.md +105 -0
  65. package/.mindforge/personas/mcp-designer.md +54 -0
  66. package/.mindforge/personas/meeting-designer.md +104 -0
  67. package/.mindforge/personas/mentorship-lead.md +106 -0
  68. package/.mindforge/personas/migration-architect.md +57 -0
  69. package/.mindforge/personas/ml-ops-engineer.md +101 -0
  70. package/.mindforge/personas/mobile-architect.md +105 -0
  71. package/.mindforge/personas/mobile-security-engineer.md +106 -0
  72. package/.mindforge/personas/multi-tenancy-architect.md +71 -0
  73. package/.mindforge/personas/multimodal-engineer.md +57 -0
  74. package/.mindforge/personas/offline-specialist.md +105 -0
  75. package/.mindforge/personas/onboarding-navigator.md +63 -0
  76. package/.mindforge/personas/payments-engineer.md +135 -0
  77. package/.mindforge/personas/pipeline-engineer.md +115 -0
  78. package/.mindforge/personas/platform-engineer.md +97 -0
  79. package/.mindforge/personas/platform-lead.md +57 -0
  80. package/.mindforge/personas/privacy-engineer.md +57 -0
  81. package/.mindforge/personas/product-owner.md +56 -0
  82. package/.mindforge/personas/productivity-analyst.md +57 -0
  83. package/.mindforge/personas/prompt-architect.md +101 -0
  84. package/.mindforge/personas/proofreader.md +53 -0
  85. package/.mindforge/personas/pwa-architect.md +105 -0
  86. package/.mindforge/personas/quality-scorer.md +63 -0
  87. package/.mindforge/personas/react-native-engineer.md +106 -0
  88. package/.mindforge/personas/resilience-engineer.md +69 -0
  89. package/.mindforge/personas/rfc-architect.md +64 -0
  90. package/.mindforge/personas/saga-orchestrator.md +80 -0
  91. package/.mindforge/personas/secrets-engineer.md +57 -0
  92. package/.mindforge/personas/skill-smith.md +79 -0
  93. package/.mindforge/personas/sre-lead.md +107 -0
  94. package/.mindforge/personas/stream-engineer.md +57 -0
  95. package/.mindforge/personas/streaming-engineer.md +64 -0
  96. package/.mindforge/personas/swarm-templates.json +674 -44
  97. package/.mindforge/personas/system-designer.md +57 -0
  98. package/.mindforge/personas/team-coach.md +120 -0
  99. package/.mindforge/personas/tech-lead-coach.md +103 -0
  100. package/.mindforge/personas/technical-writer-lead.md +111 -0
  101. package/.mindforge/personas/vibe-checker.md +75 -0
  102. package/.mindforge/personas/worktree-manager.md +56 -0
  103. package/.mindforge/personas/zero-trust-engineer.md +113 -0
  104. package/.mindforge/skills/a11y-testing/SKILL.md +143 -0
  105. package/.mindforge/skills/agent-evaluation-framework/SKILL.md +227 -0
  106. package/.mindforge/skills/agent-memory-design/SKILL.md +199 -0
  107. package/.mindforge/skills/agent-orchestration-patterns/SKILL.md +129 -0
  108. package/.mindforge/skills/agent-tool-selection/SKILL.md +204 -0
  109. package/.mindforge/skills/ai-agent-deployment/SKILL.md +176 -0
  110. package/.mindforge/skills/ai-cost-management/SKILL.md +57 -0
  111. package/.mindforge/skills/ai-safety-alignment/SKILL.md +53 -0
  112. package/.mindforge/skills/analytics-instrumentation/SKILL.md +172 -0
  113. package/.mindforge/skills/api-gateway-patterns/SKILL.md +177 -0
  114. package/.mindforge/skills/api-marketplace/SKILL.md +56 -0
  115. package/.mindforge/skills/api-versioning/SKILL.md +100 -0
  116. package/.mindforge/skills/app-store-deployment/SKILL.md +44 -0
  117. package/.mindforge/skills/architecture-tradeoff-analysis/SKILL.md +97 -0
  118. package/.mindforge/skills/audit-logging/SKILL.md +140 -0
  119. package/.mindforge/skills/auth-patterns/SKILL.md +148 -0
  120. package/.mindforge/skills/autonomous-agent-harness/SKILL.md +218 -0
  121. package/.mindforge/skills/autonomous-agents/SKILL.md +59 -0
  122. package/.mindforge/skills/build-system-optimization/SKILL.md +54 -0
  123. package/.mindforge/skills/build-vs-buy/SKILL.md +80 -0
  124. package/.mindforge/skills/bundle-optimization/SKILL.md +174 -0
  125. package/.mindforge/skills/business-analyst/SKILL.md +82 -0
  126. package/.mindforge/skills/caching-strategies/SKILL.md +132 -0
  127. package/.mindforge/skills/capacity-planning/SKILL.md +96 -0
  128. package/.mindforge/skills/causal-inference/SKILL.md +42 -0
  129. package/.mindforge/skills/cdn-optimization/SKILL.md +212 -0
  130. package/.mindforge/skills/change-management/SKILL.md +106 -0
  131. package/.mindforge/skills/chaos-engineering/SKILL.md +99 -0
  132. package/.mindforge/skills/ci-cd-pipeline/SKILL.md +118 -0
  133. package/.mindforge/skills/cli-design/SKILL.md +118 -0
  134. package/.mindforge/skills/code-generation-patterns/SKILL.md +92 -0
  135. package/.mindforge/skills/code-review-methodology/SKILL.md +180 -0
  136. package/.mindforge/skills/code-tour/SKILL.md +145 -0
  137. package/.mindforge/skills/codebase-onboarding/SKILL.md +95 -0
  138. package/.mindforge/skills/compliance-as-code/SKILL.md +195 -0
  139. package/.mindforge/skills/conflict-resolution/SKILL.md +87 -0
  140. package/.mindforge/skills/connection-pooling/SKILL.md +151 -0
  141. package/.mindforge/skills/container-security/SKILL.md +151 -0
  142. package/.mindforge/skills/context-engineering/SKILL.md +114 -0
  143. package/.mindforge/skills/contract-testing/SKILL.md +85 -0
  144. package/.mindforge/skills/cost-estimation/SKILL.md +82 -0
  145. package/.mindforge/skills/cqrs-event-sourcing/SKILL.md +95 -0
  146. package/.mindforge/skills/cross-platform-testing/SKILL.md +43 -0
  147. package/.mindforge/skills/data-governance/SKILL.md +42 -0
  148. package/.mindforge/skills/data-lakehouse/SKILL.md +42 -0
  149. package/.mindforge/skills/data-mesh/SKILL.md +42 -0
  150. package/.mindforge/skills/data-modeling/SKILL.md +107 -0
  151. package/.mindforge/skills/data-pipeline-design/SKILL.md +171 -0
  152. package/.mindforge/skills/data-privacy-engineering/SKILL.md +42 -0
  153. package/.mindforge/skills/database-performance/SKILL.md +174 -0
  154. package/.mindforge/skills/database-sharding-advanced/SKILL.md +206 -0
  155. package/.mindforge/skills/de-sloppify/SKILL.md +120 -0
  156. package/.mindforge/skills/defense-in-depth/SKILL.md +84 -0
  157. package/.mindforge/skills/delegation-patterns/SKILL.md +123 -0
  158. package/.mindforge/skills/dependency-management/SKILL.md +94 -0
  159. package/.mindforge/skills/deployment-workflow/SKILL.md +135 -0
  160. package/.mindforge/skills/design-system/SKILL.md +113 -0
  161. package/.mindforge/skills/developer-onboarding/SKILL.md +99 -0
  162. package/.mindforge/skills/developer-productivity-metrics/SKILL.md +59 -0
  163. package/.mindforge/skills/distributed-consensus/SKILL.md +141 -0
  164. package/.mindforge/skills/dmux-workflows/SKILL.md +141 -0
  165. package/.mindforge/skills/dns-architecture/SKILL.md +167 -0
  166. package/.mindforge/skills/ecommerce-architecture/SKILL.md +41 -0
  167. package/.mindforge/skills/edge-computing/SKILL.md +91 -0
  168. package/.mindforge/skills/edtech-platform/SKILL.md +41 -0
  169. package/.mindforge/skills/email-deliverability/SKILL.md +177 -0
  170. package/.mindforge/skills/embedding-systems/SKILL.md +55 -0
  171. package/.mindforge/skills/environment-management/SKILL.md +54 -0
  172. package/.mindforge/skills/error-handling-architecture/SKILL.md +118 -0
  173. package/.mindforge/skills/estimation-techniques/SKILL.md +113 -0
  174. package/.mindforge/skills/eval-harness/SKILL.md +180 -0
  175. package/.mindforge/skills/event-driven-architecture/SKILL.md +162 -0
  176. package/.mindforge/skills/experiment-design/SKILL.md +139 -0
  177. package/.mindforge/skills/experiment-platform/SKILL.md +43 -0
  178. package/.mindforge/skills/feature-engineering/SKILL.md +42 -0
  179. package/.mindforge/skills/feature-flag-management/SKILL.md +183 -0
  180. package/.mindforge/skills/fine-tuning-workflow/SKILL.md +189 -0
  181. package/.mindforge/skills/fintech-patterns/SKILL.md +41 -0
  182. package/.mindforge/skills/flutter-architecture/SKILL.md +42 -0
  183. package/.mindforge/skills/gaming-backend/SKILL.md +41 -0
  184. package/.mindforge/skills/git-workflow-design/SKILL.md +129 -0
  185. package/.mindforge/skills/graceful-degradation/SKILL.md +95 -0
  186. package/.mindforge/skills/graphql-patterns/SKILL.md +243 -0
  187. package/.mindforge/skills/guardrails-and-safety/SKILL.md +137 -0
  188. package/.mindforge/skills/healthcare-systems/SKILL.md +40 -0
  189. package/.mindforge/skills/hiring-engineering/SKILL.md +119 -0
  190. package/.mindforge/skills/human-in-the-loop-design/SKILL.md +234 -0
  191. package/.mindforge/skills/i18n-architecture/SKILL.md +147 -0
  192. package/.mindforge/skills/idempotency-patterns/SKILL.md +84 -0
  193. package/.mindforge/skills/incident-communication/SKILL.md +96 -0
  194. package/.mindforge/skills/incident-management/SKILL.md +97 -0
  195. package/.mindforge/skills/infrastructure-as-code/SKILL.md +98 -0
  196. package/.mindforge/skills/instinct-clustering/SKILL.md +190 -0
  197. package/.mindforge/skills/internal-developer-platform/SKILL.md +51 -0
  198. package/.mindforge/skills/iot-platform/SKILL.md +41 -0
  199. package/.mindforge/skills/k8s-deployment/SKILL.md +358 -0
  200. package/.mindforge/skills/knowledge-graphs/SKILL.md +56 -0
  201. package/.mindforge/skills/knowledge-sharing-systems/SKILL.md +112 -0
  202. package/.mindforge/skills/llm-cost-optimization/SKILL.md +198 -0
  203. package/.mindforge/skills/llm-orchestration/SKILL.md +56 -0
  204. package/.mindforge/skills/load-testing/SKILL.md +84 -0
  205. package/.mindforge/skills/logistics-optimization/SKILL.md +40 -0
  206. package/.mindforge/skills/market-researcher/SKILL.md +99 -0
  207. package/.mindforge/skills/marketplace-trust/SKILL.md +40 -0
  208. package/.mindforge/skills/mcp-server-patterns/SKILL.md +264 -0
  209. package/.mindforge/skills/media-streaming/SKILL.md +41 -0
  210. package/.mindforge/skills/meeting-architecture/SKILL.md +146 -0
  211. package/.mindforge/skills/mentoring-patterns/SKILL.md +77 -0
  212. package/.mindforge/skills/microservices-patterns/SKILL.md +83 -0
  213. package/.mindforge/skills/migration-platform/SKILL.md +61 -0
  214. package/.mindforge/skills/migration-strategies/SKILL.md +129 -0
  215. package/.mindforge/skills/ml-feature-store/SKILL.md +56 -0
  216. package/.mindforge/skills/ml-monitoring/SKILL.md +42 -0
  217. package/.mindforge/skills/mobile-performance/SKILL.md +44 -0
  218. package/.mindforge/skills/mobile-security/SKILL.md +45 -0
  219. package/.mindforge/skills/model-evaluation/SKILL.md +53 -0
  220. package/.mindforge/skills/monorepo-management/SKILL.md +100 -0
  221. package/.mindforge/skills/multi-tenancy-patterns/SKILL.md +145 -0
  222. package/.mindforge/skills/multi-turn-conversation-design/SKILL.md +206 -0
  223. package/.mindforge/skills/multimodal-ai/SKILL.md +51 -0
  224. package/.mindforge/skills/mutation-testing/SKILL.md +97 -0
  225. package/.mindforge/skills/notification-system-design/SKILL.md +168 -0
  226. package/.mindforge/skills/observability-stack/SKILL.md +136 -0
  227. package/.mindforge/skills/offline-first-design/SKILL.md +43 -0
  228. package/.mindforge/skills/on-call-design/SKILL.md +111 -0
  229. package/.mindforge/skills/pagination-patterns/SKILL.md +230 -0
  230. package/.mindforge/skills/payment-integration/SKILL.md +176 -0
  231. package/.mindforge/skills/performance-reviews/SKILL.md +140 -0
  232. package/.mindforge/skills/platform-observability/SKILL.md +58 -0
  233. package/.mindforge/skills/platform-reliability/SKILL.md +52 -0
  234. package/.mindforge/skills/post-incident-learning/SKILL.md +96 -0
  235. package/.mindforge/skills/product-manager/SKILL.md +104 -0
  236. package/.mindforge/skills/progressive-web-app/SKILL.md +44 -0
  237. package/.mindforge/skills/prompt-engineering/SKILL.md +94 -0
  238. package/.mindforge/skills/proofreader/SKILL.md +158 -0
  239. package/.mindforge/skills/push-notification-architecture/SKILL.md +45 -0
  240. package/.mindforge/skills/python-performance/SKILL.md +183 -0
  241. package/.mindforge/skills/quality-audit/SKILL.md +171 -0
  242. package/.mindforge/skills/queue-design/SKILL.md +85 -0
  243. package/.mindforge/skills/rag-architecture/SKILL.md +176 -0
  244. package/.mindforge/skills/rate-limiting-design/SKILL.md +94 -0
  245. package/.mindforge/skills/react-native-patterns/SKILL.md +42 -0
  246. package/.mindforge/skills/react-performance/SKILL.md +229 -0
  247. package/.mindforge/skills/real-time-analytics/SKILL.md +42 -0
  248. package/.mindforge/skills/real-time-sync/SKILL.md +83 -0
  249. package/.mindforge/skills/responsive-native/SKILL.md +44 -0
  250. package/.mindforge/skills/responsive-patterns/SKILL.md +141 -0
  251. package/.mindforge/skills/rfc-pipeline/SKILL.md +114 -0
  252. package/.mindforge/skills/saas-multi-tenant/SKILL.md +41 -0
  253. package/.mindforge/skills/santa-method/SKILL.md +134 -0
  254. package/.mindforge/skills/search-implementation/SKILL.md +98 -0
  255. package/.mindforge/skills/secrets-platform/SKILL.md +56 -0
  256. package/.mindforge/skills/secrets-rotation/SKILL.md +173 -0
  257. package/.mindforge/skills/self-serve-infrastructure/SKILL.md +51 -0
  258. package/.mindforge/skills/serverless-patterns/SKILL.md +119 -0
  259. package/.mindforge/skills/skill-creator-meta/SKILL.md +146 -0
  260. package/.mindforge/skills/sprint-retrospective-facilitation/SKILL.md +112 -0
  261. package/.mindforge/skills/stakeholder-communication/SKILL.md +85 -0
  262. package/.mindforge/skills/state-management/SKILL.md +104 -0
  263. package/.mindforge/skills/stream-processing/SKILL.md +43 -0
  264. package/.mindforge/skills/streaming-architecture/SKILL.md +81 -0
  265. package/.mindforge/skills/supply-chain-security/SKILL.md +145 -0
  266. package/.mindforge/skills/synthetic-data-generation/SKILL.md +52 -0
  267. package/.mindforge/skills/system-design/SKILL.md +88 -0
  268. package/.mindforge/skills/team-topology-design/SKILL.md +107 -0
  269. package/.mindforge/skills/technical-debt-management/SKILL.md +86 -0
  270. package/.mindforge/skills/technical-interview-design/SKILL.md +98 -0
  271. package/.mindforge/skills/technical-leadership/SKILL.md +75 -0
  272. package/.mindforge/skills/technical-writing/SKILL.md +237 -0
  273. package/.mindforge/skills/technology-radar/SKILL.md +88 -0
  274. package/.mindforge/skills/testing-anti-patterns/SKILL.md +288 -0
  275. package/.mindforge/skills/tool-design/SKILL.md +138 -0
  276. package/.mindforge/skills/typescript-advanced/SKILL.md +198 -0
  277. package/.mindforge/skills/using-git-worktrees/SKILL.md +139 -0
  278. package/.mindforge/skills/verification-loop/SKILL.md +13 -1
  279. package/.mindforge/skills/vibe-security/SKILL.md +165 -0
  280. package/.mindforge/skills/visual-regression-testing/SKILL.md +97 -0
  281. package/.mindforge/skills/websocket-patterns/SKILL.md +203 -0
  282. package/.mindforge/skills/writing-plans/SKILL.md +170 -0
  283. package/.mindforge/skills/writing-skills/SKILL.md +216 -0
  284. package/.mindforge/skills/zero-trust-architecture/SKILL.md +166 -0
  285. package/CHANGELOG.md +240 -0
  286. package/MINDFORGE.md +4 -4
  287. package/README.md +49 -4
  288. package/RELEASENOTES.md +80 -0
  289. package/SECURITY.md +20 -8
  290. package/bin/autonomous/audit-writer.js +13 -0
  291. package/bin/autonomous/auto-runner.js +74 -16
  292. package/bin/autonomous/context-refactorer.js +26 -11
  293. package/bin/autonomous/state-manager.js +62 -6
  294. package/bin/autonomous/stuck-monitor.js +46 -7
  295. package/bin/autonomous/wave-executor.js +66 -25
  296. package/bin/dashboard/api-router.js +43 -0
  297. package/bin/dashboard/metrics-aggregator.js +28 -1
  298. package/bin/dashboard/server.js +67 -4
  299. package/bin/dashboard/sse-bridge.js +4 -4
  300. package/bin/engine/feedback-loop.js +8 -0
  301. package/bin/engine/intelligence-interlock.js +32 -15
  302. package/bin/engine/logic-drift-detector.js +2 -1
  303. package/bin/engine/nexus-tracer.js +3 -2
  304. package/bin/engine/remediation-engine.js +155 -32
  305. package/bin/engine/self-corrective-synthesizer.js +84 -10
  306. package/bin/engine/sre-manager.js +12 -4
  307. package/bin/engine/temporal-hub.js +131 -34
  308. package/bin/governance/approve.js +41 -5
  309. package/bin/governance/impact-analyzer.js +28 -0
  310. package/bin/governance/policy-engine.js +10 -3
  311. package/bin/governance/quantum-crypto.js +32 -19
  312. package/bin/governance/rbac-manager.js +74 -2
  313. package/bin/governance/ztai-manager.js +49 -7
  314. package/bin/hindsight-injector.js +3 -3
  315. package/bin/memory/eis-client.js +71 -34
  316. package/bin/memory/embedding-engine.js +61 -0
  317. package/bin/memory/knowledge-graph.js +58 -5
  318. package/bin/memory/knowledge-indexer.js +53 -6
  319. package/bin/memory/knowledge-store.js +22 -0
  320. package/bin/migrations/10.7.0-to-11.0.0.js +110 -0
  321. package/bin/migrations/schema-versions.js +13 -0
  322. package/bin/models/anthropic-provider.js +45 -0
  323. package/bin/models/cloud-broker.js +68 -20
  324. package/bin/models/gemini-provider.js +51 -0
  325. package/bin/models/model-client.js +20 -0
  326. package/bin/models/model-router.js +28 -8
  327. package/bin/models/openai-provider.js +44 -0
  328. package/bin/utils/file-io.js +63 -1
  329. package/bin/utils/index.js +58 -0
  330. package/docs/getting-started.md +1 -1
  331. package/docs/user-guide.md +2 -2
  332. package/package.json +2 -2
  333. package/.mindforge/personas/data-privacy-engineer.md +0 -187
@@ -0,0 +1,135 @@
+ ---
+ name: deployment-workflow
+ version: 1.0.0
+ min_mindforge_version: 10.0.5
+ status: stable
+ triggers: deploy, deployment, staged rollout, canary, feature flag rollout, rollback plan, production release, release pipeline, ship to production, go live, launch pipeline, release workflow
+ ---
+
+ # Skill — Deployment Workflow (Staged Production Release Pipeline)
+
+ ## When this skill activates
+
+ When deploying code to staging or production, planning a release, setting up
+ feature flags, or managing rollback strategies. Use for any transition from
+ "code is ready" to "code is live and monitored." Covers the full lifecycle from
+ pre-deploy verification through post-deploy health confirmation.
+
+ Core principle: **Never deploy blind** — every release has a rollback plan,
+ success thresholds, and monitoring before it is considered complete.
+
+ ## Mandatory actions when this skill is active
+
+ ### Before deployment begins
+
+ 1. **Select depth level based on risk:**
+
+ | Level | Time | When to use | Scope |
+ |-------|------|-------------|-------|
+ | Quick | 5-10 min | Hotfix, docs, config | Git clean + tests only |
+ | Standard | 15-30 min | Feature, refactor | Full pipeline |
+ | Deep | 30-60 min | Breaking change, infra | Backup + load test + canary |
+
+ 2. **Pre-deploy checklist (all levels):**
+ - [ ] Working directory is clean (`git status` shows no uncommitted changes)
+ - [ ] All tests pass (`npm test` / `pytest` / equivalent)
+ - [ ] No lint errors or type errors
+ - [ ] CHANGELOG updated with version and summary
+ - [ ] Branch is rebased on latest main
+ - [ ] PR approved (if applicable)
+
+ 3. **Risk assessment:**
+ - Does this touch auth, payments, or PII? (triggers security review)
+ - Does this require a database migration? (adds rollback complexity)
+ - Does this change public API contracts? (requires versioning)
+ - What is the blast radius if this fails? (single user vs all users)
+
+ ### During deployment
+
+ **Phase 1 — Build & Bundle:**
+ - Compile/transpile all source
+ - Run type checker (zero errors required)
+ - Bundle analysis: warn if any chunk exceeds 250KB
+ - Generate source maps for production debugging
+ - Tag commit with version: `git tag v[X.Y.Z]`
+
+ **Phase 2 — Staging:**
+ - Deploy to staging environment
+ - Run health check endpoint (HTTP 200 within 30s)
+ - Execute smoke test suite (critical paths only)
+ - Verify environment variables are set correctly
+ - Check database connectivity and migration status
+
+ **Phase 3 — Production (select strategy):**
+
+ | Strategy | Use when | Process |
+ |----------|----------|---------|
+ | Quick | Low risk, fast rollback | Deploy all at once, monitor 5 min |
+ | Rolling | Medium risk | Replace instances 25% at a time, 2 min between |
+ | Canary | High risk, new feature | 5% → 25% → 50% → 100%, monitor at each step |
+
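The three Phase 3 strategies in the table above can be read as step schedules. The sketch below is only an illustration of that reading and is not part of the package: `RolloutStep`, `rollout_plan`, and the `canary_wait_minutes` default are hypothetical names and values, since the skill does not say how long to monitor at each canary step.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RolloutStep:
    percent: int       # cumulative share of instances/traffic on the new version
    wait_minutes: int  # pause or monitoring window after reaching this step


def rollout_plan(strategy: str, canary_wait_minutes: int = 10) -> list[RolloutStep]:
    """Return the step schedule for a strategy named in the Phase 3 table.

    canary_wait_minutes is an assumed default; the skill only says
    "monitor at each step" without giving a duration.
    """
    if strategy == "quick":
        # Deploy all at once, then monitor for 5 minutes.
        return [RolloutStep(100, 5)]
    if strategy == "rolling":
        # Replace 25% of instances at a time, 2 minutes between batches.
        return [RolloutStep(p, 2 if p < 100 else 0) for p in (25, 50, 75, 100)]
    if strategy == "canary":
        # 5% -> 25% -> 50% -> 100%, monitoring at each step.
        return [RolloutStep(p, canary_wait_minutes) for p in (5, 25, 50, 100)]
    raise ValueError(f"unknown strategy: {strategy!r}")


if __name__ == "__main__":
    for step in rollout_plan("canary"):
        print(f"{step.percent:>3}% then wait {step.wait_minutes} min")
```

Whatever the strategy, the Phase 4 monitoring that follows still applies after the final step.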
+ **Phase 4 — Monitoring (mandatory for all strategies):**
+ - Watch error rates for 15 minutes post-deploy
+ - Compare P95 latency to pre-deploy baseline
+ - Check business metrics (conversion, signup, core actions)
+ - Verify no new error types in exception tracker
+
+ **Decision thresholds:**
+ ```
+ GREEN (proceed): Error rate < 10% above baseline
+ YELLOW (investigate): Error rate 10-100% above baseline
+ RED (rollback): Error rate > 2x baseline OR P95 > 3x baseline
+ ```
+
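The decision thresholds above can be expressed as a small classifier. This is a minimal sketch, not code shipped in the package: the function name `classify_deploy` and its parameters are assumptions, and it assumes both baselines are positive, because the skill does not say how to treat a zero baseline.

```python
def classify_deploy(
    error_rate: float,
    baseline_error_rate: float,
    p95_ms: float,
    baseline_p95_ms: float,
) -> str:
    """Map post-deploy metrics to GREEN / YELLOW / RED.

    RED:    error rate > 2x baseline OR P95 > 3x baseline
    YELLOW: error rate 10-100% above baseline
    GREEN:  error rate < 10% above baseline
    """
    if baseline_error_rate <= 0 or baseline_p95_ms <= 0:
        # Assumption: relative thresholds are undefined without a positive baseline.
        raise ValueError("baselines must be positive")

    error_ratio = error_rate / baseline_error_rate
    p95_ratio = p95_ms / baseline_p95_ms

    if error_ratio > 2.0 or p95_ratio > 3.0:
        return "RED"
    if error_ratio >= 1.10:
        return "YELLOW"
    return "GREEN"


if __name__ == "__main__":
    print(classify_deploy(0.015, 0.010, 210.0, 200.0))
```

Note that latency alone can trigger RED: an unchanged error rate with P95 above three times the baseline still calls for a rollback.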
+ **Feature flag lifecycle (when applicable):**
+ ```
+ DEPLOY (flag OFF) → ENABLE (team only) → CANARY (5% users) →
+ GRADUAL (25% → 50% → 75%) → FULL (100%) → CLEANUP (remove flag)
+ ```
+
+ - Each stage requires minimum 1 hour soak time (24 hours for GRADUAL steps)
+ - Monitor flag-specific metrics at each stage
+ - Rollback = disable flag (instant, < 1 minute)
+
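One way to read the flag lifecycle and soak-time rule above is as an ordered list of stages with a minimum dwell time per stage. The sketch below is an illustration under that reading: the stage identifiers (for example `GRADUAL_25`) and the helper names are hypothetical, and it assumes the 1-hour minimum applies to every non-GRADUAL stage before CLEANUP.

```python
# Ordered stages; the three GRADUAL steps are split out so each gets its own soak.
STAGES = [
    "DEPLOY",      # flag OFF
    "ENABLE",      # team only
    "CANARY",      # 5% of users
    "GRADUAL_25",
    "GRADUAL_50",
    "GRADUAL_75",
    "FULL",        # 100% of users
    "CLEANUP",     # flag removed; terminal stage
]


def min_soak_hours(stage: str) -> int:
    """Minimum soak time: 24 hours for GRADUAL steps, 1 hour otherwise."""
    return 24 if stage.startswith("GRADUAL") else 1


def advance(stage: str, hours_in_stage: float) -> str:
    """Return the next stage if the soak time is met, else the current stage."""
    idx = STAGES.index(stage)  # raises ValueError for an unknown stage
    if idx == len(STAGES) - 1 or hours_in_stage < min_soak_hours(stage):
        return stage
    return STAGES[idx + 1]


def min_hours_to_reach(target: str) -> int:
    """Sum of the minimum soak times of every stage before the target."""
    return sum(min_soak_hours(s) for s in STAGES[: STAGES.index(target)])


if __name__ == "__main__":
    print(advance("CANARY", 0.5))      # soak not met, stays in CANARY
    print(min_hours_to_reach("FULL"))  # 1 + 1 + 1 + 3 * 24 hours
```

Under these assumptions a flag cannot reach FULL in less than about three days, which is why the cleanup step is worth scheduling up front.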
+ ### After deployment
+
+ 1. **Health verification:**
+ - [ ] All health endpoints return 200
+ - [ ] Error rate < 1% (absolute, not relative)
+ - [ ] No new exception types in monitoring
+ - [ ] P95 latency within acceptable range
+ - [ ] Business metrics stable (no cliff)
+
+ 2. **Rollback procedures (know these BEFORE deploying):**
+
+ | Method | Time | When to use |
+ |--------|------|-------------|
+ | Feature flag disable | ~1 min | Flag-guarded changes |
+ | Git revert + redeploy | ~5 min | Code-only changes |
+ | Database rollback | ~15 min | Migration-dependent changes |
+
+ 3. **Post-deploy documentation:**
+ - Update DEPLOYMENT.md with:
+ - Phase timestamps (start, staging, production, verified)
+ - Commit hash deployed
+ - Strategy used
+ - Any issues encountered and resolution
+ - Final status: SUCCESS / ROLLED_BACK / PARTIAL
+
+ 4. **Metrics baseline:**
+ - Record current performance metrics as new baseline
+ - Archive previous baseline for comparison
+ - Update alerting thresholds if performance improved
+
+ ## Self-check before task completion
+
+ Before marking a deployment task done:
+
+ - [ ] Did I verify all pre-deploy checks pass (tests, lint, types)?
+ - [ ] Did I document the rollback plan BEFORE deploying?
+ - [ ] Did I set up monitoring and define success thresholds?
+ - [ ] Did I wait the minimum soak time before declaring success?
+ - [ ] Did I record the deployment in DEPLOYMENT.md with timestamps?
+ - [ ] Is the error rate below 1% post-deploy?
+ - [ ] Did I update the metrics baseline?
+ - [ ] If using feature flags: is the cleanup step scheduled?
@@ -0,0 +1,113 @@
+ ---
+ name: design-system
+ version: 1.0.0
+ min_mindforge_version: 10.0.7
+ status: stable
+ triggers: design system architecture, component library design, design token system, storybook documentation, variant pattern design, theme architecture, atomic design pattern, component documentation standard, design system versioning, token based theming, component catalog, ui component library
+ ---
+
+ # Design System
+
+ ## When this skill activates
+
+ This skill activates when the user is designing, building, or maintaining a design
+ system or component library. This includes design token architecture, atomic design
+ methodology, component API design, Storybook documentation, variant patterns, theme
+ architecture (dark/light/custom), design system versioning strategy, contribution
+ models, and component catalog organization.
+
+ ## Mandatory actions
+
+ ### Before
+
+ 1. Identify the target frameworks and platforms (React, Vue, Svelte, iOS, Android, web components).
+ 2. Determine the team size and contribution model (centralized team vs federated).
+ 3. Assess existing UI components and visual inconsistencies to address.
+ 4. Review brand guidelines and design specifications (Figma, Sketch, or equivalent).
+ 5. Identify theming requirements (dark mode, white-label, multi-brand).
+
+ ### During
+
+ **Design Tokens:**
+ - **Primitive tokens:** Raw values with no semantic meaning (colors: `blue-500: #3B82F6`, spacing: `space-4: 16px`, typography: `font-size-lg: 18px`).
+ - **Semantic tokens:** Meaningful aliases that reference primitives (`color-primary: {blue-500}`, `color-text-default: {gray-900}`, `spacing-inline-md: {space-4}`).
+ - **Component tokens:** Scoped to specific components (`button-padding-x: {spacing-inline-md}`, `button-bg-primary: {color-primary}`).
+ - Store tokens in a tool-agnostic format (JSON/YAML) and generate platform-specific outputs (CSS custom properties, Swift/Kotlin constants, SCSS variables).
+ - Use Style Dictionary or Tokens Studio for token transformation pipelines.
+ - Token naming convention: `{category}-{property}-{variant}-{state}` (e.g., `color-text-primary-hover`).
+
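The three-tier token hierarchy above amounts to alias resolution: component tokens point at semantic tokens, which point at primitives. The sketch below illustrates that chain with the token names from the list. It is a hand-rolled example, not the Style Dictionary or Tokens Studio API; `resolve`, `to_css`, and the flat `TOKENS` dictionary are hypothetical.

```python
import re

# Flat token table using the names from the skill text; "{name}" marks an alias.
TOKENS = {
    # primitive
    "blue-500": "#3B82F6",
    "space-4": "16px",
    # semantic
    "color-primary": "{blue-500}",
    "spacing-inline-md": "{space-4}",
    # component
    "button-bg-primary": "{color-primary}",
    "button-padding-x": "{spacing-inline-md}",
}

_ALIAS = re.compile(r"^\{([^{}]+)\}$")


def resolve(name: str, tokens: dict, _seen: tuple = ()) -> str:
    """Follow alias references until a raw value is reached."""
    if name in _seen:
        raise ValueError("alias cycle: " + " -> ".join(_seen + (name,)))
    if name not in tokens:
        raise KeyError(f"unknown token: {name}")
    match = _ALIAS.match(tokens[name])
    if match is None:
        return tokens[name]  # raw value, resolution ends here
    return resolve(match.group(1), tokens, _seen + (name,))


def to_css(tokens: dict) -> str:
    """Emit every token as a resolved CSS custom property on :root."""
    lines = [f"  --{name}: {resolve(name, tokens)};" for name in sorted(tokens)]
    return ":root {\n" + "\n".join(lines) + "\n}"


if __name__ == "__main__":
    print(to_css(TOKENS))
```

A production pipeline would more likely emit `var(--color-primary)` references for semantic and component tokens so that themes can override one tier at runtime; fully resolving values, as here, is the simpler case.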
+ **Atomic Design Methodology:**
+ - **Atoms:** Smallest indivisible elements (Button, Input, Icon, Label, Badge).
+ - **Molecules:** Groups of atoms functioning together (SearchBar = Input + Button, FormField = Label + Input + ErrorMessage).
+ - **Organisms:** Complex sections composed of molecules (Header = Logo + Nav + SearchBar, Card = Image + Title + Description + Actions).
+ - **Templates:** Page layouts without real content (defines the skeleton/grid).
+ - **Pages:** Templates populated with real content (final implementation).
+ - Not every component fits neatly — use atomic levels as guidance, not strict rules.
+
+ **Component API Design:**
+ - Props interface is the public contract. Design it carefully.
+ - Use variants via props, not CSS class names (`<Button variant="primary">` not `<Button className="btn-primary">`).
+ - Prefer composition over configuration (slot-based patterns, children, render props).
+ - Limit prop count: if > 8 props, consider splitting into sub-components.
+ - Use TypeScript/PropTypes for full type safety on component interfaces.
+ - Default props to the most common use case (progressive disclosure).
+ - Forward refs and spread remaining props for extensibility.
+
+ **Storybook Documentation:**
+ - One story file per component, co-located with the component source.
+ - Stories for every meaningful state: default, hover, focus, disabled, loading, error, empty.
+ - Use args/controls for interactive prop exploration.
+ - Include the a11y addon and verify each story passes accessibility checks.
+ - Write MDX documentation pages for usage guidelines and do/don't examples.
+ - Chromatic or Percy for visual regression testing on every PR.
+
+ **Versioning:**
+ - Semantic versioning (semver) for the component library package.
+ - **Major:** Breaking API changes (prop removal, renamed component, changed behavior).
+ - **Minor:** New components, new variants, additive features.
+ - **Patch:** Bug fixes, accessibility improvements, style corrections.
+ - Maintain a changelog (auto-generated from conventional commits).
+ - Support at least one previous major version during migration period.
+ - Pin design system version in consuming applications.
+
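The versioning rules above can be sketched as a small function that picks the next version from the kinds of change in a release. This is only an illustration: the `Change` enum and `next_version` are hypothetical names, not part of any real tooling, and a real pipeline would likely derive the change kinds from conventional commits.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical change kinds, mirroring the major/minor/patch rules."""
    BREAKING = "breaking"  # prop removal, renamed component, changed behavior
    FEATURE = "feature"    # new component, new variant, additive feature
    FIX = "fix"            # bug fix, accessibility improvement, style correction


def next_version(current: str, changes: list) -> str:
    """Return the next semver string for a release containing `changes`."""
    major, minor, patch = (int(part) for part in current.split("."))
    if Change.BREAKING in changes:
        return f"{major + 1}.0.0"           # major bump resets minor and patch
    if Change.FEATURE in changes:
        return f"{major}.{minor + 1}.0"     # minor bump resets patch
    if Change.FIX in changes:
        return f"{major}.{minor}.{patch + 1}"
    return current                          # nothing releasable changed
```

The most significant change in a release decides the bump, so a release mixing fixes and one breaking change is still a major release.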
73
+ **Theming:**
74
+ - CSS custom properties for runtime theme switching (no rebuild required).
75
+ - Theme object structure mirrors semantic token hierarchy.
76
+ - Support: light, dark, and system-preference (prefers-color-scheme).
77
+ - White-label: allow consumers to override semantic tokens without touching components.
78
+ - Test all components in all supported themes (visual regression).
79
+ - Provide a ThemeProvider component for React/Vue context-based theming.
80
+
81
+ **Contribution Model:**
82
+ - RFC process for new components (proposal → review → build → document → release).
83
+ - Review criteria: accessibility, responsiveness, theme support, documentation, tests.
84
+ - Component readiness checklist before merging into the system.
85
+ - Deprecation policy: announce → mark deprecated → migration guide → remove after N versions.
86
+ - Office hours or design system team review for proposed additions.
87
+
88
+ **Accessibility (Built-in):**
89
+ - Every component must meet WCAG 2.1 AA as a minimum.
90
+ - Keyboard navigation, focus management, and ARIA attributes are non-negotiable.
91
+ - Color contrast ratios enforced via tokens (4.5:1 for text, 3:1 for UI elements).
92
+ - Screen reader testing as part of component QA.
93
+ - Document accessibility patterns in component documentation.
94
+
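As a rough sketch of how the contrast thresholds above could be enforced on token values, the following computes the WCAG 2.x contrast ratio between two hex colors. The function names are illustrative assumptions; the 4.5:1 and 3:1 thresholds are the ones stated above.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio (lighter + 0.05) / (darker + 0.05), from 1.0 to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground: str, background: str, *, is_text: bool = True) -> bool:
    """Check against 4.5:1 for text or 3:1 for UI elements."""
    return contrast_ratio(foreground, background) >= (4.5 if is_text else 3.0)
```

A check like this could run in the token build so that a failing semantic color pair blocks the release rather than surfacing later in component QA.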
95
+ ### After
96
+
97
+ 1. Verify token pipeline generates correct outputs for all target platforms.
98
+ 2. Confirm Storybook coverage includes all component states and variants.
99
+ 3. Validate theme switching works across all components without visual regressions.
100
+ 4. Check accessibility audit passes for every component in the catalog.
101
+ 5. Ensure versioning and changelog are current and accurate.
102
+ 6. Validate contribution documentation is clear for new contributors.
103
+
104
+ ## Self-check before task completion
105
+
106
+ - [ ] Design tokens follow the three-tier hierarchy (primitive → semantic → component).
107
+ - [ ] Components follow atomic design principles with clear categorization.
108
+ - [ ] Component APIs use props for variants and composition for flexibility.
109
+ - [ ] Storybook documents all states with interactive controls and accessibility checks.
110
+ - [ ] Versioning follows semver with a maintained changelog.
111
+ - [ ] Theming supports light/dark/system with CSS custom properties.
112
+ - [ ] Contribution model includes RFC process and readiness checklist.
113
+ - [ ] Accessibility meets WCAG 2.1 AA across all components and themes.
@@ -0,0 +1,99 @@
1
+ ---
2
+ name: developer-onboarding
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.0.8
5
+ status: stable
6
+ triggers: developer onboarding, first day script, time to first commit, starter task, onboarding checklist, readme driven, local setup, onboarding buddy, knowledge transfer, ramp up plan, new hire guide, dev environment setup
7
+ ---
8
+
9
+ # Developer Onboarding
10
+
11
+ ## When this skill activates
12
+
13
+ This skill activates when creating or improving developer onboarding processes, writing setup scripts, curating starter tasks, designing ramp-up plans, or reducing time-to-first-commit for new team members. It applies to both new hire onboarding and existing developers joining a new project.
14
+
15
+ ## Mandatory actions when this skill is active
16
+
17
+ ### Before
18
+
19
+ 1. Identify the target developer profile (junior/senior, frontend/backend/fullstack, intern/FTE).
20
+ 2. Measure current time-to-first-commit (if unknown, estimate from last onboarding).
21
+ 3. Inventory existing onboarding materials (README, wiki, setup scripts, recorded sessions).
22
+ 4. Identify the most common setup failures (from Slack history, onboarding feedback).
23
+ 5. Determine who will act as onboarding buddy for new joiners.
24
+
25
+ ### During
26
+
27
+ **North Star Metric: Time-to-First-Commit**
28
+ - Target: < 4 hours from laptop opening to merged PR.
29
+ - Measure: first commit to main branch (not just local setup complete).
30
+ - Track this metric for every new joiner. If it degrades, fix the onboarding immediately.
31
+ - This metric reveals every broken doc, missing tool, and outdated step.
32
+
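One possible way to track this metric is shown below. It assumes you can obtain two timestamps per joiner, the moment they started and the moment their first PR was merged to the main branch; how those are collected is left open, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Target from the guideline above: under 4 hours to a merged PR.
TARGET = timedelta(hours=4)


def time_to_first_commit(started_at: datetime, merged_at: datetime) -> timedelta:
    """Elapsed time from starting onboarding to the first merge on main."""
    if merged_at < started_at:
        raise ValueError("merge time precedes start time")
    return merged_at - started_at


def within_target(started_at: datetime, merged_at: datetime,
                  target: timedelta = TARGET) -> bool:
    """True when the joiner's first merge landed inside the target window."""
    return time_to_first_commit(started_at, merged_at) < target
```

Recording the result for every joiner makes a regression visible as soon as one onboarding misses the target.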
33
+ **First-Day Setup Script:**
34
+ - One command should set up everything: `make setup` or `./scripts/setup.sh`.
35
+ - Script must handle: dependency installation, environment variables, database setup, seed data, test run.
36
+ - Script must be idempotent (safe to run multiple times).
37
+ - Script must detect and report issues clearly (missing tool versions, port conflicts).
38
+ - Script must work on all team platforms (macOS, Linux; Windows via WSL if needed).
39
+ - Include a final verification step: "Setup complete. Running tests... All 47 tests passed."
40
+
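The "detect and report issues clearly" requirement could be met with a preflight check along these lines. This is a sketch, not a complete setup script: the helper names are hypothetical, and the version strings would in practice come from running each tool with its version flag.

```python
import re
from typing import Optional


def parse_version(text: str) -> tuple:
    """Extract the first dotted number from tool output such as 'v20.11.0'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(found: str, minimum: str) -> bool:
    """Compare versions numerically, padding the shorter one with zeros."""
    f, m = parse_version(found), parse_version(minimum)
    width = max(len(f), len(m))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(f) >= pad(m)


def report_problems(found: dict, required: dict) -> list:
    """Return one readable message per missing or outdated tool.

    `found` maps tool name to its reported version string, or None if absent.
    """
    problems = []
    for tool, minimum in required.items():
        version: Optional[str] = found.get(tool)
        if version is None:
            problems.append(f"{tool}: not found (need {minimum}+)")
        elif not meets_minimum(version, minimum):
            problems.append(f"{tool}: found {version}, need {minimum}+")
    return problems
```

Because the check only reads state and returns messages, it is safe to run repeatedly, which fits the idempotency requirement above.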
41
+ **README-Driven Development:**
42
+ - If it is not in the README, it does not exist for new developers.
43
+ - README must answer: How do I run it? How do I test it? How do I deploy it? Where do I ask questions?
44
+ - Keep setup instructions in the README (not a wiki that gets stale).
45
+ - Test the README quarterly: have someone follow it literally on a fresh machine.
46
+ - Mark prerequisites explicitly: Node 20+, Docker, PostgreSQL 15, etc.
47
+
48
+ **Starter Tasks (Good First Issues):**
49
+ - Label issues explicitly: `good-first-issue`, `starter-task`, `onboarding`.
50
+ - Scope: achievable in 2-4 hours, touches 1-3 files, has clear acceptance criteria.
51
+ - Include: link to relevant code, example of similar completed work, who to ask.
52
+ - Types that work well: fix a typo in UI, add a missing validation, write a test for uncovered code, update a dependency.
53
+ - Avoid: tasks requiring deep domain knowledge, cross-service changes, or ambiguous requirements.
54
+
55
+ **Onboarding Buddy System:**
56
+ - Assign a buddy for the first 2 weeks (not the manager, a peer).
57
+ - Buddy responsibilities: pair on first PR, answer "dumb" questions, review first 3 PRs same-day.
58
+ - Buddy should proactively check in (don't wait for the new dev to ask).
59
+ - Rotate buddy duty across team (prevents knowledge silos, spreads empathy).
60
+
61
+ **Knowledge Transfer:**
62
+ - Record short Loom videos of key workflows (deploy process, debugging common issues, release flow).
63
+ - Keep recordings under 10 minutes (people don't watch long videos).
64
+ - Store in a discoverable location (project wiki, onboarding channel pinned messages).
65
+ - Update when workflows change (stale videos are worse than no videos).
66
+ - Architecture diagram: one page showing services, databases, and data flow.
67
+
68
+ **Ramp-Up Plan (4-Week Template):**
69
+ - **Week 1**: Environment setup, explore codebase, first starter task merged, meet team.
70
+ - **Week 2**: First bug fix (real issue from backlog), attend architecture walkthrough, pair with buddy.
71
+ - **Week 3**: Small feature (scoped, designed by senior, implemented independently), first code review given.
72
+ - **Week 4**: Independent work on medium-scoped task, contribute to team process (improve a doc, fix flaky test).
73
+ - Checkpoint at end of each week: buddy + manager assess progress, adjust plan if needed.
74
+
75
+ **Common Pitfalls to Prevent:**
76
+ - Setup docs reference tools that are no longer used.
77
+ - Env vars documented in one place but actual required set has grown.
78
+ - "Ask Bob" for access — Bob is on vacation. Document the process instead.
79
+ - Tests pass on CI but fail locally due to missing env vars or services.
80
+ - New dev changes something they shouldn't because boundaries aren't documented.
81
+
82
+ ### After
83
+
84
+ 1. New developer has a merged PR within the target timeframe.
85
+ 2. Setup script runs without manual intervention on a fresh machine.
86
+ 3. All blockers encountered are fixed in the onboarding materials immediately.
87
+ 4. Feedback collected from new developer (what was confusing, what was missing).
88
+ 5. Onboarding materials updated based on feedback.
89
+
90
+ ## Self-check before task completion
91
+
92
+ - [ ] Setup script is one command and runs successfully on a clean machine.
93
+ - [ ] README contains all information needed to run, test, and deploy locally.
94
+ - [ ] At least 3 starter tasks are labeled and ready for the next new joiner.
95
+ - [ ] Ramp-up plan exists with weekly milestones and checkpoints.
96
+ - [ ] Onboarding buddy is assigned and knows their responsibilities.
97
+ - [ ] Architecture diagram is current and accessible.
98
+ - [ ] Time-to-first-commit metric is tracked and below the 4-hour target.
99
+ - [ ] Feedback loop exists: every new joiner improves the process for the next one.
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: developer-productivity-metrics
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.7.0
5
+ status: stable
6
+ triggers: developer productivity metrics, DORA metrics implementation, developer experience survey, flow metric, cognitive load measurement, engineering effectiveness, developer satisfaction, deployment frequency, lead time metric, change failure rate, developer velocity, engineering metric dashboard
7
+ ---
8
+
9
+ # Skill — Developer Productivity Metrics
10
+
11
+ ## When this skill activates
12
+
13
+ This skill activates when the user is designing or implementing developer productivity measurement systems. This includes DORA metrics (deployment frequency, lead time, change failure rate, MTTR), developer experience surveys, flow metrics, cognitive load measurement, engineering effectiveness tracking, developer satisfaction scoring, velocity metrics, and engineering metric dashboards.
14
+
15
+ ## Mandatory actions when this skill is active
16
+
17
+ ### Before writing any code
18
+
19
+ 1. Identify the goal of measurement: improve developer experience, justify platform investment, identify bottlenecks, or benchmark against industry.
20
+ 2. Survey developers to understand perceived friction points before instrumenting metrics (avoid measuring the wrong thing).
21
+ 3. Establish a baseline for each metric category (DORA, flow, satisfaction) to track improvement over time.
22
+ 4. Define metric ownership: who monitors, who acts on findings, and what decision-making authority they have.
23
+ 5. Commit to "metrics for improvement, not punishment" — avoid individual developer tracking or performance reviews based on these metrics.
24
+
25
+ ### During implementation
26
+
27
+ - **DORA Metrics (Four Keys):**
28
+ - **Deployment Frequency:** Count deployments to production per day/week. Elite: multiple per day. High: once per day to once per week.
29
+ - **Lead Time for Changes:** Time from commit to production deploy. Elite: under 1 hour. High: under 1 day.
30
+ - **Change Failure Rate:** Percentage of deployments causing incidents. Elite: 0-15%. High: 16-30%.
31
+ - **Mean Time to Restore (MTTR):** Time from incident detection to resolution. Elite: under 1 hour. High: under 1 day.
32
+ - **Flow Metrics (SPACE Framework):**
33
+ - **Satisfaction:** Developer experience survey (quarterly, 10-15 questions, NPS). Track: tooling satisfaction, cognitive load, collaboration quality.
34
+ - **Performance:** DORA metrics + throughput (PRs merged, features shipped per sprint).
35
+ - **Activity:** Commits, PRs, code reviews. Use as context, not performance indicator.
36
+ - **Communication:** PR review time, unblocking time, documentation quality.
37
+ - **Efficiency:** Time in flow state (4+ hours uninterrupted), meeting load, context switches per day.
38
+ - **Cognitive Load Measurement:** Survey-based (scale 1-10): "How many systems do you need to understand to complete typical tasks?" "How often do you context switch?" "How easy is it to find information?"
39
+ - **Engineering Effectiveness Dashboard:** Single dashboard showing DORA metrics, flow metrics, developer satisfaction, and trend lines. Refresh daily. Include comparisons to industry benchmarks (DORA State of DevOps Report).
40
+ - **Avoid Vanity Metrics:** Lines of code written, commits per day, hours logged. These measure activity, not impact. Focus on outcomes (features shipped, incidents prevented, time saved).
41
+ - **Privacy Guardrails:** Aggregate metrics at team level (minimum 5 developers). Never track individual developer productivity. Anonymize survey responses.
42
+
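The four DORA metrics listed above can be derived from deployment and incident records roughly as follows. The `Deployment` record and its field names are assumptions made for this sketch; a real implementation would pull the same data from CI/CD and incident management systems.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record for one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    incident_detected_at: Optional[datetime] = None  # set if the deploy caused an incident
    incident_resolved_at: Optional[datetime] = None


def dora_summary(deployments: list, window_days: int) -> dict:
    """Team-level DORA metrics over a reporting window of `window_days` days."""
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failed = [d for d in deployments if d.incident_detected_at is not None]
    restore_times = [
        d.incident_resolved_at - d.incident_detected_at
        for d in failed
        if d.incident_resolved_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(lead_times) if lead_times else None,
        "change_failure_rate": len(failed) / len(deployments) if deployments else None,
        "mean_time_to_restore": (
            sum(restore_times, timedelta()) / len(restore_times)
            if restore_times else None
        ),
    }
```

The records carry no author field on purpose, which keeps the output at team level in line with the privacy guardrails above.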
43
+ ### After implementation
44
+
45
+ - Verify DORA metrics are auto-collected from CI/CD pipelines and incident management systems (no manual entry).
46
+ - Confirm developer experience surveys run quarterly with at least 70% response rate.
47
+ - Validate that engineering effectiveness dashboard refreshes daily and includes trend lines.
48
+ - Ensure metrics are reviewed in team retrospectives with action items to address bottlenecks.
49
+ - Check that metrics are never used for individual performance reviews (enforce via policy).
50
+
51
+ ## Self-check before task completion
52
+
53
+ - [ ] DORA four keys are auto-collected and tracked (deployment frequency, lead time, change failure rate, MTTR).
54
+ - [ ] Developer experience survey runs quarterly with 70%+ response rate.
55
+ - [ ] Flow metrics cover satisfaction, performance, activity, communication, and efficiency.
56
+ - [ ] Cognitive load is measured via survey and tracked over time.
57
+ - [ ] Engineering effectiveness dashboard refreshes daily and shows trend lines.
58
+ - [ ] Metrics are aggregated at team level (minimum 5 developers) to protect individual privacy.
59
+ - [ ] Metrics are reviewed in retrospectives with action items to improve developer experience.
@@ -0,0 +1,141 @@
1
+ ---
2
+ name: distributed-consensus
3
+ version: 1.0.0
4
+ min_mindforge_version: 10.1.1
5
+ status: stable
6
+ triggers: distributed consensus, raft algorithm, paxos, leader election, split-brain prevention, quorum read, quorum write, consensus protocol, distributed lock, linearizability, strong consistency, consensus group
7
+ ---
8
+
9
+ # Skill — Distributed Consensus
10
+
11
+ ## When this skill activates
12
+ Any task involving strong consistency requirements in distributed systems,
13
+ leader election, distributed locks, quorum-based reads/writes,
14
+ or preventing split-brain scenarios in clustered systems.
15
+
16
+ ## Mandatory actions when this skill is active
17
+
18
+ ### Before writing any code
19
+ 1. Confirm consensus is actually needed (not everything requires strong consistency).
20
+ 2. Identify what data requires linearizability vs what can be eventually consistent.
21
+ 3. Choose an existing implementation (etcd, ZooKeeper, Consul); never build your own.
22
+ 4. Size the cluster (odd numbers only: 3 for most, 5 for critical).
23
+
24
+ ### During implementation
25
+ - Use established consensus systems (etcd/ZooKeeper/Consul) — do NOT implement Raft/Paxos yourself.
26
+ - Implement fencing tokens for all distributed locks.
27
+ - Handle network partitions explicitly (what happens when consensus is lost?).
28
+ - Set appropriate timeouts for leader election (too short causes leader flapping; too long prolongs unavailability after a leader failure).
29
+ - Use consensus ONLY for metadata/coordination, never for a high-throughput data plane.
30
+
31
+ ### After implementation
32
+ - Test split-brain scenarios (network partition between nodes).
33
+ - Verify leader election completes within acceptable time.
34
+ - Confirm fencing tokens prevent stale operations.
35
+ - Load test to ensure consensus doesn't become a bottleneck.
36
+ - Monitor cluster health (leader stability, replication lag).
37
+
38
+ ## Raft Consensus (Most Common)
39
+
40
+ ### How It Works
41
+ 1. **Leader Election**: Nodes start as followers. If no heartbeat from leader within timeout, a follower becomes a candidate and requests votes.
42
+ 2. **Log Replication**: Leader receives writes, appends to log, replicates to followers.
43
+ 3. **Commitment**: Entry is committed when majority (quorum) acknowledges.
44
+ 4. **Safety**: Only one leader per term. Committed entries are never lost.
45
+
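To make the commitment rule concrete, here is a simplified sketch of how a leader could work out the highest committed log index from what each node has stored. It is for understanding only, in keeping with the advice not to implement Raft yourself; `commit_index` is an illustrative name, and real Raft additionally requires that the entry at that index belongs to the leader's current term.

```python
def majority(node_count: int) -> int:
    """Smallest number of nodes that forms a quorum."""
    return node_count // 2 + 1


def commit_index(match_index: list) -> int:
    """Highest log index stored on a majority of nodes.

    `match_index` holds, for every node including the leader, the highest
    log index known to be replicated on that node.
    """
    ranked = sorted(match_index, reverse=True)
    # The quorum-th highest value is present on at least a majority of nodes.
    return ranked[majority(len(ranked)) - 1]
```

With five nodes reporting `[5, 3, 4, 2, 5]`, three of them hold index 4 or higher, so index 4 is the newest entry that counts as committed.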
46
+ ### Key Properties
47
+ - Strong leader (all writes go through leader).
48
+ - Leader elected by majority vote.
49
+ - Log entries committed when replicated to majority.
50
+ - A cluster of N nodes tolerates ⌊(N-1)/2⌋ failures (3 nodes tolerate 1, 5 tolerate 2).
51
+
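The arithmetic behind the failure-tolerance rule, and behind the preference for odd cluster sizes, can be checked with a few lines. The function names are illustrative.

```python
def majority(node_count: int) -> int:
    """Votes or acknowledgements needed for a quorum."""
    return node_count // 2 + 1


def fault_tolerance(node_count: int) -> int:
    """Nodes that can fail while a majority remains: floor((N - 1) / 2)."""
    return (node_count - 1) // 2


# Compare a few cluster sizes; note that the even sizes add a node
# without adding any tolerance.
for n in (3, 4, 5, 6):
    print(f"N={n}: quorum={majority(n)}, tolerates={fault_tolerance(n)} failure(s)")
```

Going from 3 to 4 nodes raises the quorum from 2 to 3 but still tolerates only one failure, which is why even sizes are usually avoided.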
52
+ ## Quorum Mathematics
53
+
54
+ ### Formulas
55
+ ```
56
+ N = total nodes
57
+ W = write quorum (nodes that must acknowledge a write)
58
+ R = read quorum (nodes that must respond to a read)
59
+
60
+ Strong consistency: R + W > N
61
+ Write availability: W ≤ N (can tolerate N-W failures for writes)
62
+ Read availability: R ≤ N (can tolerate N-R failures for reads)
63
+ ```
64
+
65
+ ### Common Configurations
66
+ | N | W | R | Consistency | Write Tolerance | Read Tolerance |
67
+ |---|---|---|-------------|-----------------|----------------|
68
+ | 3 | 2 | 2 | Strong | 1 failure | 1 failure |
69
+ | 5 | 3 | 3 | Strong | 2 failures | 2 failures |
70
+ | 5 | 3 | 1 | Eventual reads | 2 failures | 4 failures |
71
+
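The formulas and the table above can be verified with a small helper. This is a sketch with hypothetical names that simply applies `R + W > N` and the two tolerance expressions.

```python
def describe_quorum(n: int, w: int, r: int) -> dict:
    """Evaluate a quorum configuration against the formulas above."""
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("W and R must each be between 1 and N")
    return {
        # Read and write sets must overlap in at least one node.
        "strong_consistency": r + w > n,
        "write_tolerance": n - w,  # failures survivable for writes
        "read_tolerance": n - r,   # failures survivable for reads
    }
```

For `N=5, W=3, R=1` the sum `R + W` is 4, which does not exceed 5, so a read may land on a node that missed the latest write; that is why the third table row lists eventual reads.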
72
+ ## Split-Brain Prevention
73
+
74
+ ### The Problem
75
+ A network partition can split the cluster into two groups, each believing it contains the leader.
76
+
77
+ ### Solutions
78
+ 1. **Majority quorum**: Only the partition with majority can elect a leader.
79
+ 2. **Fencing tokens**: Monotonically increasing token with every lock acquisition. Storage rejects operations with stale tokens.
80
+ 3. **Epoch numbers**: Leader increments epoch on election. Older epochs are rejected.
81
+ 4. **External witness**: Third-party arbiter breaks ties (but introduces dependency).
82
+
83
+ ### Fencing Token Pattern
84
+ ```
85
+ Client A acquires lock → token=42
86
+ Client A pauses (GC, network)
87
+ Client B acquires lock → token=43
88
+ Client A resumes, sends write with token=42
89
+ Storage rejects: 42 < 43 (stale token)
90
+ ```
91
+
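The sequence above can be reproduced with a minimal in-memory sketch. `LockService` and `FencedStore` are hypothetical stand-ins for a real lock service (such as one built on etcd or ZooKeeper) and a storage layer that checks tokens; neither reflects any real library's API.

```python
class LockService:
    """Hands out a strictly increasing fencing token on each acquisition."""

    def __init__(self, start: int = 0) -> None:
        self._token = start

    def acquire(self) -> int:
        self._token += 1
        return self._token


class FencedStore:
    """Storage that rejects writes carrying a token older than one it has seen."""

    def __init__(self) -> None:
        self._highest_token = 0
        self._data = {}

    def write(self, key: str, value: str, token: int) -> bool:
        if token < self._highest_token:
            return False  # stale token: the lock has since been granted to someone else
        self._highest_token = token
        self._data[key] = value
        return True

    def read(self, key: str):
        return self._data.get(key)


locks = LockService(start=41)
store = FencedStore()

token_a = locks.acquire()   # client A holds the lock, then pauses
token_b = locks.acquire()   # lock expires and client B acquires it
accepted_b = store.write("config", "from B", token_b)
accepted_a = store.write("config", "from A", token_a)  # A resumes with its old token
```

The rejection only works because the storage layer itself compares tokens; in this sketch it has already seen client B's newer token by the time client A resumes.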
92
+ ## When to Use Consensus
93
+
94
+ ### Good Use Cases
95
+ - Leader election for worker coordination.
96
+ - Distributed configuration management.
97
+ - Service discovery and membership.
98
+ - Distributed locks (with fencing tokens).
99
+ - Metadata storage (small, infrequently written).
100
+
101
+ ### Anti-Patterns (DON'T use consensus for)
102
+ - High-throughput data writes (consensus becomes the bottleneck, roughly around 10K writes/sec).
103
+ - Large data storage (consensus stores small metadata, not big data).
104
+ - Read-heavy workloads (use eventual consistency + caching instead).
105
+ - Every database write (use consensus for critical metadata only).
106
+
107
+ ## Practical Systems
108
+
109
+ ### etcd
110
+ - Raft-based, Kubernetes uses it for cluster state.
111
+ - Key-value store with watch capabilities.
112
+ - Best for: service discovery, config, leader election.
113
+
114
+ ### ZooKeeper
115
+ - ZAB protocol (similar to Raft).
116
+ - Hierarchical namespace with ephemeral nodes.
117
+ - Best for: distributed locks, barriers, leader election.
118
+
119
+ ### Consul
120
+ - Raft-based, service mesh integration.
121
+ - Service discovery + health checking + KV store.
122
+ - Best for: service mesh, multi-datacenter coordination.
123
+
124
+ ## Failure Scenarios to Test
125
+
126
+ 1. **Leader crash**: New leader elected within timeout. No committed data lost.
127
+ 2. **Network partition (minority isolated)**: Majority continues. Minority becomes read-only or unavailable.
128
+ 3. **Network partition (even split)**: Neither side has majority → cluster unavailable until partition heals.
129
+ 4. **Slow node**: Doesn't affect consensus (majority can proceed without it).
130
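+ 5. **Clock skew**: Raft orders events by term and log index rather than wall-clock time, so physical clock skew should not affect safety (election timeouts depend only on locally measured elapsed time).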
131
+ 6. **Disk full on leader**: Leader steps down, new election.
132
+
133
+ ## Self-check
134
+ - [ ] Consensus is genuinely needed (not over-engineering eventual consistency).
135
+ - [ ] Using an established system (etcd/ZooKeeper/Consul), not a custom implementation.
136
+ - [ ] Cluster size is odd (3 or 5).
137
+ - [ ] Fencing tokens implemented for distributed locks.
138
+ - [ ] Network partition behavior tested and documented.
139
+ - [ ] Consensus used only for coordination/metadata (not data plane).
140
+ - [ ] Leader election timeout tuned (not too short, not too long).
141
+ - [ ] Monitoring: leader stability, replication lag, cluster health.